MSCP-A: examples

MSCP-A: examples of the supercompilation

How to understand the examples?

A Refal function definition is a list of rewrite rules. The rewrite rules use the three types of variables: e-variables can be associated with an arbitrary expression; t-variables can be associated with a symbol or an expression (e.x), where e.x is arbitrary; s-variables can be associated only with a symbol.
The associative concatenation ++ is built-in, thus i.e. we understand the pattern e.x s.y as e.x ++ s.y. A call of the function F on the argument Expr is written as <F Expr>.
We say that the supercompiler proves a statement if the input program computes some predicate and in the residual program all the outputs have the same truth value. That means the semantic property of the input program becomes the syntactic property of the residual one.

Example list:

Fabc
Fabc (finite alphabet)
SqSum
Fibonacci
Prefix
Prefix2
Prefix of A's
Prefix plus AB
Equations as a condition
Correct protocol with a loop
Incorrect protocol with a loop
Pushing equations while generalizing
More pushing equations while generalizing
Proving properties of Fibonacci words
Generalization of negative restrictions
Correct exhaustive deletion v.1
Correct exhaustive deletion v.2
CheckForB
CheckForB variant
WordEq
InfCall
PalAB
Not only 'A'-s
Trinh et al test
Abdulla et al test

Open vars test
Non-quadratic equation
Incorrect sanitization
Correct sanitization – ip
Correct sanitization – brackets
Correct sanitization – ip and no blanks
Correct but silly sanitization
Guarnieri et al test
Square subword
Naive matching to KMP
Commutativity of the associative addition
Specializing an Refal interpreter written in Refal w.r.t. Zip function
Word equation interpreter specialized w.r.t. xAy=yBx
Word equation interpreter specialized w.r.t. ABxy = xyBA
Word equation interpreter specialized w.r.t. xAy = zzz
Word equation interpreter specialized w.r.t. zAByx = yzxBA
.............................................................................................................................
SMT-1
SMT-2
SMT-3
SMT-4
SMT-5
SMT-6
SMT-7
.............................................................................................................................

Follow this link for the experiments on specialization of the word equation interpreters by the supercompiler SCP4.

The results of supercompilation of more examples of string SMT models are given there.

Example #1 (Fabc):

The input and the residual program

A simple test: merging two loops in a one.

Example #2 (Fabc (finite alphabet)):

The input and the residual program

A simple test: merging two loops in a one in the alphabet {A,B,C}.

Example #3 (SqSum):

The input and the residual program

The function computing a square sum in the Peano arithmetic takes a partially defined value as an argument. The definitions of the sum and the multiplication use associativity of the language. The residual program is given for the strengthened generalization mode (~UnAmb mode), but the distictions in the both modes are minor.

Example #4 (Fibonacci):

The input and the residual program

The function computing the n-th Fibonacci number takes a partially defined value as an argument.

Example #5 (Prefix):

The input and the residual program

The test of the equivalence of the two concatenation definitions.

Example #6 (Prefix2):

The input and the residual program (in both modes)

The test of the equivalence of the two other concatenation definitions.

Example #7 (Prefix of A's):

The input and the residual program

In what cases the string x is a prefix of 'A'++x++y? Appending is done by the built-in concatenation.

Example #8 (Prefix Plus AB):

The input and the residual program

In what cases the string x is a prefix of 'AB'++x++y? Appending of y is done in a loop processing a string from its end.

Example #9 (Equations as a condition):

The input and the residual program

A word equation is given explicitly. It is used during the unfolding.

Example #10 (Correct protocol with a loop):

The input and the residual program

Verification of a safe protocol taken from the paper D. Beyer, M. Dangl, P. Wendler. Boosting k-Induction with Continuously-Refined Invariants // In: Kroening D., Pasareanu C. (eds) Computer Aided Verification. CAV 2015. / LNCS, Vol. 9206, pp: 622-640. Springer, 2015.

Example #11 (Incorrect protocol with a loop):

The input and the residual program

Finding a counterexample to an unsafe protocol taken from the paper D. Beyer, M. Dangl, P. Wendler. Boosting k-Induction with Continuously-Refined Invariants // In: Kroening D., Pasareanu C. (eds) Computer Aided Verification. CAV 2015. / LNCS, Vol. 9206, pp: 622-640. Springer, 2015.

Example #12 (Pushing equations while generalizing):

The input and the residual program

A test on using the word equations in constructing generalizations of expressions.

Example #13 (More using equations while generalizing ):

The input and the residual program

Another test on using word equations in generalization.

Example #14 (Proving properties of Fibonacci words):

The input program and the residual program of the second iteration

After two iterations of the supercompilation, a property of the Fibonacci words is proven.

Example #15 (Generalization of negative restrictions ):

The input and the residual program

A negative restriction on a symbol is generalized to a negative restriction on an expression.

Example #16 (Exhaustive deletion 1 ):

The input and the residual program

A correct version of the exhaustive deletion algorithm.

Example #17 (Exhaustive deletion 2 ):

The input and the residual program

Another correct version of the exhaustive deletion.

Example #18 (CheckForB)

The input and the residual program

Given a function generating a string in the language 'A'*, the task is to check the last letter of the string is never 'B'.

Example #19 (CheckForB2)

The input and the residual program

Given a function generating a string in the language 'A'*, the task is to check the last letter of the string never contains 'B'.

Example #20 (Word Equations)

The input and the residual program

Existence of the multiple t-variables in the pattern may lead to an equation generation.

Example #21 (InfCall)

The input and the residual program

The normal computation order on the associative data may lead to non-trivial function order in a stack.

Example #22 (PalAB)

The input and the residual program

If we append the prefix 'AB' to the string and then append its inversion, we can never get a palindrome.

Example #23 (Not only 'A'-s):

The input and the residual program

A simple test on removing refuted hypotheses after a generalization. First, the hypothesis of belonging to the language A* is generated. When the next generalization attempt is done, it is refuted.

Example #24 (Thinh et al test):

The input and the residual program

The test from the paper: M.T Trinh, D.H Chu, J. Jaffar, Progressive reasoning over recursively-defined strings / International Conference on Computer Aided Verification, 2016, 218-240.

Example #25 (Abdulla et al test):

The input and the residual program

The test from the paper: Parosh Aziz Abdulla, Mohamed Faouzi Atig, Yu-Fang Chen, Bui Phi Diep, Lukas Holik, Ahmed Rezine, Philipp Rummer. Flatten and conquer: a framework for efficient analysis of string constraints, Programming Language Design and Implementation (PLDI), 2017, Barcelona, Spain, ACM, pages 602-617.

Example #26 (Open variables test):

The input and the residual program

A test on processing open variables in patterns. The residual program is given for the strengthened generalization mode (~UnAmb mode), but the distictions in the both modes are minor.

Example #27 (Non-quadratic equations test):

The input and the residual program

Given a non-quadratic equation, the supercompiler manages to solve it combining elementary actions.

Example #28 (Incorrect sanitization):

The input and the residual program

A simple sanitization algorithm is shown to be incorrect.

Example #29 (Correct sanitization – ip):

The input and the residual program

The test is taken from the paper: M.T Trinh, D.H Chu, J. Jaffar, Progressive reasoning over recursively-defined strings / International Conference on Computer Aided Verification, 2016, 218-240. A sanitization algorithm is shown to be correct in respect with the given conditions.

Example #30 (Correct sanitization – brackets):

The input and the residual program

The test is taken from the paper: M.T Trinh, D.H Chu, J. Jaffar, Progressive reasoning over recursively-defined strings / International Conference on Computer Aided Verification, 2016, 218-240. A sanitization algorithm is also shown to be correct in respect with the given conditions.

Example #31 (Correct sanitization – ip and no blanks):

The input and the residual program

A modified version of the test #29.

Example #32 (Correct but silly sanitization):

The input and the residual program

A simple sanitization algorithm is shown to be correct.

Example #33 (Guarnieri et al test):

The input and the residual program

The test is taken from the paper Guarnieri, M. Pistoia, O. Tripp, Ju. Dolby, S. Teilhet, R. Berg. Saving the World Wide Webfrom Vulnerable JavaScript // In: Proceedings of the 2011 International Symposium on Software Testing and Analysis (ISSTA '11), pp: 177-187, 2011 A sanitization of the innertext method.

Example #34 (Square subword):

The input program and the residual program of the second iteration

Proving that every word in the alphabet {'A', 'B'} has a square subword. The proof is performed by supercompiling the residual program after the first MSCP-A run.

Example #35 (Naive matching to KMP):

The input and the residual program

A version of the classical supercompiler test for the associative data type.

Example #36 (Commutativity of the addition):

The input and the residual program

The commutativity of the associative addition (on strings representing unary numbers). This example is motivated by the example of the manual supercompilation proving commutativity of the addition operation given on p.213 (Example 6) in: F. Turchin. The Language REFAL, the Theory of Compilation, and Metasystem Analysis. Courant Institute Report #20, New York, 1980, 245 p.

Example #37 (Specializing an Refal interpreter written in Refal w.r.t. Zip function):

The input and the residual program

The specialization task is as follows, where %Zip-Source stands for the Zip source code.

<Scp ............................................... >

        <Interpreter       #e.data   ................. 
          >

                      
          <Zip   !     >   %Zip-Source           ;

The input program is an interpreter of a subset of Refal language. The residual program is a version of the Zip function definition.

Example #38 (Word equations interpreter – xAy = yBx):

The input and the residual program

The specialization task is as follows, where %Eq-Source stands for the Eq source code.

<Scp ......................................... >

<Eq #e.Path (x A y)(y B x)> %Eq-Source ;
The input program is an interpreter Eq of the paths solving word equations. The residual program represents a graph of the solutions of the given word equation x++A++y = y++B++x. The program does not contain outputs with the value True; thus we have proved the fact that the equation x++A++y = y++B++x has no solutions.

Example #39 (Word equations interpreter – ABxy = xyBA):

The input and the residual program

The specialization task is as follows, where %Eq-Source stands for the Eq source code.

<Scp ............................................. >

<Eq #e.Path (A B x y)(x y B A)> %Eq-Source ;
The input program is an interpreter of the paths solving word equations. The residual program represents a graph of the solutions of the given word equation A++B++x++y = x++y++B++A.

Example #40 (Word equations interpreter – xAy = zzz):

The input and the residual program

The specialization task is as follows, where %Eq-Source stands for the Eq source code.

<Scp ......................................... >

<Eq #e.Path (x A y)(z z z)> %Eq-Source ;
The input program is an interpreter of the paths solving word equations. The residual program represents a graph of the solutions of the given word equation x++A++y = z++z++z. The graph is finite.

Example #41 (Word equations interpreter – zAByx = yzxBA):

The input and the residual program

The specialization task is as follows, where %Eq-Source stands for the Eq source code.

  <Scp   ................................................. 
          >

        
          <Eq #e.Path (z A B y x)(y z x B A)> %Eq-Source       ;

The input program is an interpreter of the paths solving word equations. The residual program represents a graph of the solutions of the given word equation z++A++B++y++x = y++z++x++B++A.

Example #42 (Empty SMT string model):

The input and the residual program

This example contains a description of the empty smt string model.

(declare-fun i () String)

(declare-fun x () String)

(assert (not (str.contains x "A")))

(assert (= i (str.replace_all x "A" "B" ) ))

(assert (not (= i x)))

The residual program contains no True branches, which proves the emptiness.

Example #43 (Empty SMT string model 2):

The input and the residual program

This example contains a description of the empty smt string model.

(declare-fun x () String)

(assert (str.contains (str.replace_all x "AB" "") "CA"))

(assert (not (str.contains x "CAB")) )

(assert (not (str.contains x "CA")) )

The residual program contains no True branches, which proves the emptiness.

Example #44 (Empty SMT string model 3):

The input and the residual program

This example contains a description of the empty smt string model.

(declare-fun x () String)

(assert (not (str.contains (str.replace_all x "bbb" "cb") "a"))

(assert (str.contains x "ca") )
The residual program contains no True branches, which proves the emptiness.

Example #45 (Empty SMT string model 4):

The input and the residual program

This example contains a description of the empty smt string model.

(declare-fun x () String)

(assert (= (str.replace_all x "a" "b") (str.replace_all x "ba" "cc")))

(assert (str.contains x "ac") )
The residual program contains no True branches, which proves the emptiness.

Example #46 (Empty SMT string model 5):

The input and the residual program

This example contains a description of the empty smt string model.

(declare-fun v () String)

(declare-fun a () String)

(declare-fun q () String)

(assert (= q (str.replace_all v "ac" "cabba")) )

(assert (= q (str.++ v a)) )

(assert (str.contains a "b") )

The residual program contains no True branches, which proves the emptiness.

Example #47 (Empty SMT string model 6):

The input and the residual program (~UnAmb mode)

This example contains a description of the empty smt string model.

(declare-fun b () String)

(declare-fun y () String)

(declare-fun z () String)

(assert (= y (str.replace_all b "bca" "b" ) ))

(assert (= z (str.replace_all y "abba" "aa" ) ))

(assert (not (str.contains z "c")))

(assert (= b (str.++ z "ac") ))

The residual program after supercompilation with strengthened generalization (~UnAmb mode) contains no True branches, which proves the emptiness. In the standard mode, the True branch is present in the residual program due to over-generalizarion.

Example #48 (Empty SMT string model 7):

The input and the residual program

This example contains a description of the empty smt string model.

(declare-fun o () String)

(declare-fun v () String)

(declare-fun z () String)

(assert (= o (str.replace_all z "cb" "b" ) ))

(assert (= o (str.++ v v) ))

(assert (= v (str.++ "a" z z) ))

The residual program contains no True branches, which proves the emptiness.