Is this TLA+ specification correct? - formal-methods

So far, i have done a little bit of code for my project, but don't know whether its true or false. Can u guys see my code?? First at all, i should post the requirement for better understanding..
In computer science, mutual exclusion refers to the requirement of ensuring that no two processes or threads are in their critical section at the same time. A critical section refers to a period of time when the process accesses a shared resource, such as shared memory. The problem of mutual exclusion was first identified and solved by Edsger W. Dijkstra in 1965 in his paper titled: Solution of a problem in concurrent programming control.
For visual description of a problem see Figure. In the linked list the removal of a node is done by changing the “next” pointer of the preceding node to point to the subsequent node (e.g., if node i is being removed then the “next” pointer of node i-1 will be changed to point to node i+1). In an execution where such a linked list is being shared between multiple processes, two processes may attempt to remove two different nodes simultaneously resulting in the following problem: let nodes i and i+1 be the nodes to be removed; furthermore, let neither of them be the head nor the tail; the next pointer of node i-1 will be changed to point to node i+1 and the next pointer of node i will be changed to point to node i+2. Although both removal operations complete successfully, node i+1 remains in the list since i-1 was made to point to i+1 skipping node i (which was the node that reflected the removal of i+1 by having it's next pointer set to i+2). This problem can be avoided using mutual exclusion to ensure that simultaneous updates to the same part of the list cannot occur.
This is my code :
EXTENDS Naturals
CONSTANT Unlocked, Locked
VARIABLE P1,P2
TypeInvariant == /\ P1 \in {Unlocked,Locked}
/\ P2 \in {Unlocked,Locked}
Init == /\ TypeInvariant
/\ P1 = Unlocked
Remove1 == P2' = Locked
Remove2 == P1' = Locked
Next == Remove1 \/ Remove2
Spec == Init /\ [][Next]_<<P1,P2>>
THEOREM Spec => []TypeInvariant

First, what are you trying to model? It seems the only thing you are interested in proving as a theorem is the type invariant. I would have thought you want to prove something about mutual exclusion and not just the possible values of P1 and P2. For example, do you want to prove that P1 and P2 are never both locked at the same time?
MutualExclusion == ~(P1 = Locked /\ P2 = Locked)
THEOREM MutualExclusionTheorem == Spec => []MutualExclusion
Also, I wouldn't put TypeInvariant in the definition of Init. Your Init should contain the starting value(s) of your variables P1 and P2. Then you have a theorem,
THEOREM InitTypeInvariant == Init => TypeInvariant
and prove it. This theorem will be used in the proof of the theorem Spec.

Related

Without unwinding, translate a simple while loop iteration into SMT-LIB formula to prove correctness

Consider proving correctness of the following while loop, i.e. I want show that given the loop condition holds to start with, it will eventually terminate and result in the final assertion being true.
int x = 0;
while(x>=0 && x<10){
x = x + 1;
}
assert x==10;
What would be the correct translation into SMT-LIB for checking the correctness, without using loop unwinding?
Hoare logic and loop-invariants
Typical proof of such a statement would be done via the classic Hoare logic, which I assume you're already familiar with. If not, see: https://en.wikipedia.org/wiki/Hoare_logic
The idea is to come up with an invariant for your loop. This invariant must be true before the loop starts, it must be maintained by the loop body, and it must imply the final result when the loop condition is no longer true. Additionally, you also need to prove that the loop will eventually terminate, by means of a measure function. (More on that later.)
You can convince yourself why this would be sufficient: An invariant is something that's "always" true. And if it implies your final result, then your proof is complete. The proof steps I outlined above ensure that the invariant is indeed an invariant, i.e., its truth is always maintained by your program.
Coming up with the invariant
What would be a good invariant for your loop here? Let's give this invariant the name I. A moment of thinking reveals a good choice for I is:
I = x >= 0 && x <= 10
Note how similar (but not exactly the same!) this is to your loop-condition, and this is not by accident. Loop-invariants are not unique, and coming up with a good one can be really difficult. It's an active area of research (since 60's) to synthesize loop-invariants automatically. See the plethora of research out there. https://en.wikipedia.org/wiki/Loop_invariant is a good starting point.
Proof using SMT
Now that we "magically" came up with the loop invariant, let's use SMT to prove that it is indeed correct. Instead of writing SMTLib (which is verbose and mostly intended for machines only), I'll use z3-python interface as a close enough substitute. To finish the proof, I need to show 4 things:
The invariant holds before the loop starts
The invariant is maintained by the loop body
The invariant and the negation of the loop-condition implies the desired post-condition
The loop terminates
Let's look at each in turn.
(0) Preliminaries
Since we'll use z3's python interface, we'll have to do a little bit of leg-work to get us started. Here's the skeleton we need:
from z3 import *
def C(p):
return And(p >= 0, p < 10)
def I(p):
return And(p >= 0, p <= 10)
x = Int('x')
Note that we parameterized the loop-condition (C) and the invariant (I) with a parameter so it's easy to call them with different arguments. This is a common trick in programming, abstracting away the control from the data. This way of coding will simplify our life later on.
(1) The invariant holds before the loop starts
This one is easy. Right before the loop, we know that x = 0. So we need to ask the SMT solver if x == 0 implies our invariant:
>>> prove (Implies(x == 0, I(x)))
proved
Voila! If you want to see the SMTLib for the proof obligation, you can ask z3 to print it for you:
>>> print(Implies(x == 0, I(x)).sexpr())
(=> (= x 0) (and (>= x 0) (<= x 10)))
(2) The invariant is maintained by the loop-body
The loop body is only run when the loop condition (C) is true. The body increments x by one. So, what we need to show is that if our invariant (I) is true, if the loop condition (C) is true, and if I increment x by one, then I remains true. Let's ask z3 exactly that:
>>> prove(Implies(And(I(x), C(x)), I(x+1)))
proved
Almost too easy!
(3) The invariant implies the result when loop condition is false
This time, all we need to ask the solver is to prove the required conclusion when I holds, but C doesn't:
>>> prove(Implies(And(I(x), Not(C(x))), x == 10))
proved
And we have now completed what's known as the partial-correctness claim. That is, if the loop terminates, then x will indeed be 10 at the end. This is what you were trying to prove to start with.
(4) The loop terminates
What we've done so far is known as partial-correctness. It says if the loop terminates, then your post-condition (i.e., x == 10) holds. But it does not make any guarantees that the loop will always terminate.
To get a full-proof, we have to prove termination. This is done by coming up with a measure function: A measure function is a function that assigns (typically) a numeric value to the set of program variables, which is bounded from below. Then we show that it goes down in each iteration and has an initial value that's above its lower-bound. Then we know that the loop cannot continue forever: The measure has to go down in each iteration, but it cannot do so since it's bounded below.
Termination proofs are usually harder, and coming up with a good measure can be tricky. But in this case, it's easy to come up with it:
def M(x):
return 10-x
The claim is that the measure is always non-negative in this case. Let's prove that before the loop starts, i.e., when x == 0:
>>> prove (Implies(x == 0, M(x) >= 0))
proved
It goes down in each iteration:
>>> prove (Implies(C(x), M(x) > M(x+1)))
proved
And finally, it's always positive if the loop executes:
>>> prove (Implies(C(x), M(x) >= 0))
proved
Now we know that the loop will terminate, so our proof is complete.
But wait!
You might wonder if I pulled a rabbit out of a hat here. How do we know that the above steps are sufficient? Or that I didn't make a mistake in my coding as I waved my hand over your program and magically translated it to z3-python?
For the first question: There's established research that for traditional imperative program semantics, Hoare-logic style reasoning is sound. Here's a good slide deck to start with: https://www.cl.cam.ac.uk/teaching/1617/HLog+ModC/slides/lecture2.pdf
For the second question: This is where the rubber hits the road. You have to put my argument to peer-review, possibly using an established theorem prover to code the whole thing up and trust that the mechanization is correct. Why3 (https://why3.lri.fr) is a good-platform to get started for this style of reasoning.
Picking the invariant
The trickiest part of this proof is coming up with the right invariant. A "good" invariant is one that's not only true, but one that allows you to prove the result you want. For instance, consider the following invariant:
def I(p):
return True
This invariant is manifestly true for all programs as well! But if you attempt to run the proofs we had with this version of I, you'll see that it won't go through and you'll get a counter-example. (It's quite instructive to do so.) In general, you can:
Pick an "invariant" that's not really enforced by your program, i.e., it doesn't stay true at all times as described above. Hopefully the counter-example you get from the solver will be helpful to identify what goes wrong.
Or, and this is way more likely, the invariant you picked is indeed an invariant of the program, but it is not strong enough to prove the result you want. In this case the counter-example will be less useful, and for complicated programs it can be hard to track down the reason why.
An invariant that allows you to prove the final result is called an "inductive invariant." The process of "improving" the invariant to get to a proof is known as "strengthening the invariant." There's a plethora of research in all of these topics, especially in the realm of model-checking. A good paper to read in these topics is Bradley's "Understanding IC3:" https://theory.stanford.edu/~arbrad/papers/Understanding_IC3.pdf.
Summary
The strategy outlined here is a "meta"-level proof: It's equivalent to a paper-proof which identified the proof goals, and shipped them to an SMT solver (z3 in this case), to finish the job. This is common practice in modern day proofs, i.e., coming up with sub-goals and using an automated-solver to discharge them. Theorem-provers like ACL2, Isabelle, Coq, etc. mechanize the "coming up with subgoals" part to a large extent, making sure the whole proof is sound with respect to a trusted (but typically very small) set of core-axioms. (This is the so called LCF methodology, see https://www.cl.cam.ac.uk/~jrh13/slides/manchester-12sep01/slides.pdf for a nice slide-deck on it.)
Hopefully this is a detailed-enough level answer for you to get you started in program verification with SMT-solvers. Perhaps it's more than what you asked for; but the rule-of-thumb is there is no free lunch in verification. It is a lot of work! However, you get pretty close to push-button reasoning these days (at least for certain kinds of programs) with the advances in automated theorem provers, SMT-solvers, and other frameworks that many people built over the years. Best of luck, but be warned that program-verification remains the holy-grail of computer science after almost 7-decades of work on it. Things always get better/easier, but there's much more work to be done in the field.

How do I remember the root of a binary search tree in Haskell

I am new to Functional programming.
The challenge I have is regarding the mental map of how a binary search tree works in Haskell.
In other programs (C,C++) we have something called root. We store it in a variable. We insert elements into it and do balancing etc..
The program takes a break does other things (may be process user inputs, create threads) and then figures out it needs to insert a new element in the already created tree. It knows the root (stored as a variable) and invokes the insert function with the root and the new value.
So far so good in other languages. But how do I mimic such a thing in Haskell, i.e.
I see functions implementing converting a list to a Binary Tree, inserting a value etc.. That's all good
I want this functionality to be part of a bigger program and so i need to know what the root is so that i can use it to insert it again. Is that possible? If so how?
Note: Is it not possible at all because data structures are immutable and so we cannot use the root at all to insert something. in such a case how is the above situation handled in Haskell?
It all happens in the same way, really, except that instead of mutating the existing tree variable we derive a new tree from it and remember that new tree instead of the old one.
For example, a sketch in C++ of the process you describe might look like:
int main(void) {
Tree<string> root;
while (true) {
string next;
cin >> next;
if (next == "quit") exit(0);
root.insert(next);
doSomethingWith(root);
}
}
A variable, a read action, and loop with a mutate step. In haskell, we do the same thing, but using recursion for looping and a recursion variable instead of mutating a local.
main = loop Empty
where loop t = do
next <- getLine
when (next /= "quit") $ do
let t' = insert next t
doSomethingWith t'
loop t'
If you need doSomethingWith to be able to "mutate" t as well as read it, you can lift your program into State:
main = loop Empty
where loop t = do
next <- getLine
when (next /= "quit") $ do
loop (execState doSomethingWith (insert next t))
Writing an example with a BST would take too much time but I give you an analogous example using lists.
Let's invent a updateListN which updates the n-th element in a list.
updateListN :: Int -> a -> [a] -> [a]
updateListN i n l = take (i - 1) l ++ n : drop i l
Now for our program:
list = [1,2,3,4,5,6,7,8,9,10] -- The big data structure we might want to use multiple times
main = do
-- only for shows
print $ updateListN 3 30 list -- [1,2,30,4,5,6,7,8,9,10]
print $ updateListN 8 80 list -- [1,2,3,4,5,6,7,80,9,10]
-- now some illustrative complicated processing
let list' = foldr (\i l -> updateListN i (i*10) l) list list
-- list' = [10,20,30,40,50,60,70,80,90,100]
-- Our crazily complicated illustrative algorithm still needs `list`
print $ zipWith (-) list' list
-- [9,18,27,36,45,54,63,72,81,90]
See how we "updated" list but it was still available? Most data structures in Haskell are persistent, so updates are non-destructive. As long as we have a reference of the old data around we can use it.
As for your comment:
My program is trying the following a) Convert a list to a Binary Search Tree b) do some I/O operation c) Ask for a user input to insert a new value in the created Binary Search Tree d) Insert it into the already created list. This is what the program intends to do. Not sure how to get this done in Haskell (or) is am i stuck in the old mindset. Any ideas/hints welcome.
We can sketch a program:
data BST
readInt :: IO Int; readInt = undefined
toBST :: [Int] -> BST; toBST = undefined
printBST :: BST -> IO (); printBST = undefined
loop :: [Int] -> IO ()
loop list = do
int <- readInt
let newList = int : list
let bst = toBST newList
printBST bst
loop newList
main = loop []
"do balancing" ... "It knows the root" nope. After re-balancing the root is new. The function balance_bst must return the new root.
Same in Haskell, but also with insert_bst. It too will return the new root, and you will use that new root from that point forward.
Even if the new root's value is the same, in Haskell it's a new root, since one of its children has changed.
See ''How to "think functional"'' here.
Even in C++ (or other imperative languages), it would usually be considered a poor idea to have a single global variable holding the root of the binary search tree.
Instead code that needs access to a tree should normally be parameterised on the particular tree it operates on. That's a fancy way of saying: it should be a function/method/procedure that takes the tree as an argument.
So if you're doing that, then it doesn't take much imagination to figure out how several different sections of code (or one section, on several occasions) could get access to different versions of an immutable tree. Instead of passing the same tree to each of these functions (with modifications in between), you just pass a different tree each time.
It's only a little more work to imagine what your code needs to do to "modify" an immutable tree. Obviously you won't produce a new version of the tree by directly mutating it, you'll instead produce a new value (probably by calling methods on the class implementing the tree for you, but if necessary by manually assembling new nodes yourself), and then you'll return it so your caller can pass it on - by returning it to its own caller, by giving it to another function, or even calling you again.
Putting that all together, you can have your whole program manipulate (successive versions of) this binary tree without ever having it stored in a global variable that is "the" tree. An early function (possibly even main) creates the first version of the tree, passes it to the first thing that uses it, gets back a new version of the tree and passes it to the next user, and so on. And each user of the tree can call other subfunctions as needed, with possibly many of new versions of the tree produced internally before it gets returned to the top level.
Note that I haven't actually described any special features of Haskell here. You can do all of this in just about any programming language, including C++. This is what people mean when they say that learning other types of programming makes them better programmers even in imperative languages they already knew. You can see that your habits of thought are drastically more limited than they need to be; you could not imagine how you could deal with a structure "changing" over the course of your program without having a single variable holding a structure that is mutated, when in fact that is just a small part of the tools that even C++ gives you for approaching the problem. If you can only imagine this one way of dealing with it then you'll never notice when other ways would be more helpful.
Haskell also has a variety of tools it can bring to this problem that are less common in imperative languages, such as (but not limited to):
Using the State monad to automate and hide much of the boilerplate of passing around successive versions of the tree.
Function arguments allow a function to be given an unknown "tree-consumer" function, to which it can give a tree, without any one place both having the tree and knowing which function it's passing it to.
Lazy evaluation sometimes negates the need to even have successive versions of the tree; if the modifications are expanding branches of the tree as you discover they are needed (like a move-tree for a game, say), then you could alternatively generate "the whole tree" up front even if it's infinite, and rely on lazy evaluation to limit how much work is done generating the tree to exactly the amount you need to look at.
Haskell does in fact have mutable variables, it just doesn't have functions that can access mutable variables without exposing in their type that they might have side effects. So if you really want to structure your program exactly as you would in C++ you can; it just won't really "feel like" you're writing Haskell, won't help you learn Haskell properly, and won't allow you to benefit from many of the useful features of Haskell's type system.

BFS bad complexity

I am using adjacency lists to represent graph in OCaml. Then I made the following implementation of a BFS in OCaml starting at the node s.
let bfs graph s=
let size = Array.length graph in
let seen = Array.make size false and next = [s] in
let rec aux = function
|[] -> ()
|t::q -> if not seen.(t) then begin seen.(t) <- true; aux (q#graph.(t)) end else aux q
in aux next
size represents the number of nodes of the graph. seen is an array where seen.(t) = true if we've seen the node t, and next is a list of the node we need to see.
The thing is that normally the time complexity for BFS is linear (O( V +E)) yet I feel like my implementation doesn't have this complexity. If I am not mistaken the complexity of q#graph.(t) is quite big since it's O(| q |). So my complexity is quite bad since at each step I am concatenating two lists and this is heavy in time.
Thus I am wondering how can I adapt this code to make an efficient BFS? The problem (I think) comes from the implementation of a Queue using lists. Does the complexity of the Queue module in OCaml takes O(1) to add an element? In this case how can I use this module to make my bfs work, since I can't do pattern matching with Queue just as easily as list?
the complexity of q#graph.(t) is quite big since it's O(| q |). So my complexity is quite bad since at each step I am concatenating two lists and this is heavy in time.
You are absolutely right – this is the bottleneck of your BFS. You should be happily able to use the Queue module, because according to https://ocaml.org/learn/tutorials/comparison_of_standard_containers.html operation of insertion and taking elements is O(1).
One of the differences between queues and lists in OCaml is that queues are mutable structures, so you will need to use non pure functions like add, take and top that respectively insert element in-place, pop element from the front and return first element.
If I am not mistaken the complexity of q#graph.(t) is quite big since it's O(| q |).
That is indeed the problem. What you should be using is graph.(t) # q. The complexity of that is O(| graph.(t) |).
You might ask: What difference does that make?
The difference is that |q| can be anything from 0 to V * E. graph.(t) on the other hand you can work with. You visit every vertex in the graph at most once so overall the complexity will be
O(\Sum_V |grahp.(v))
The sum of all edges of each vertex in the graph. Or in other words: E.
That brings you to the overall complexity of O(V + E).

Gain maximization on trees

Consider a tree in which each node is associated with a system state and contains a sequence of actions that are performed on the system.
The root is an empty node associated with the original state of the system. The state associated with a node n is obtained by applying the sequence of actions contained in n to the original system state.
The sequence of actions of a node n is obtained by queuing a new action to the parent's sequence of actions.
Moving from a node to another (i.e., adding a new action to the sequence of actions) produces a gain, which is attached to the edge connecting the two nodes.
Some "math":
each system state S is associated with a value U(S)
the gain achieved by a node n associated with the state S cannot be greater than U(S) and smaller than 0
If n and m are nodes in the tree and n is the parent of m, U(n) - U(m) = g(n,m), i.e., the gain on the edge between n and m represents the reduction of U from n to m
See the figure for an example.
My objective is the one of finding the path in the tree that guarantees the highest gain (where the gain of a path is computed by summing all the gains of the edges on the path):
Path* = arg max_{path} (sum g(n,m), for each adjacent n,m in path)
Notice that the tree is NOT known at the beginning, and thus a solution that does not require to visit the entire tree (discarding those paths that for sure do not bring to the optimal solution) to find the optimal solution would be the best option.
NOTE: I obtained an answer here and here for a similar problem in offline mode, i.e., when the graph was known. However, in this context the tree is not known and thus algorithms such as Bellman-Ford would perform no better than a brute-fore approach (as suggested). Instead, I would like to build something that resembles backtracking without building the entire tree to find the best solution (branch and bound?).
EDIT: U(S) becomes smaller and smaller as depth increases.
As you have noticed, a branch and bound can be used to solve your problem. Just expand the nodes that seem the most promising until you find complete solutions, while keeping track of the best known solution. If a node has a U(S) lower than the best known solution during the process, just skip it. When you have no more node, you are done.
Here is an algorithm :
pending_nodes <- (root)
best_solution <- nothing
while pending_nodes is not empty
Drop the node n from pending_nodes having the highest U(n) + gain(n)
if n is a leaf
if best_solution = nothing
best_solution <- n
else if gain( best_solution ) < gain( n )
best_solution <- n
end if
else
if best_solution ≠ nothing
if U(n) + gain(n) < gain(best_solution)
stop. best_solution is the best
end if
end if
append the children of n to pending_nodes
end if
end while

Some questions on DLR call site caching

Kindly correct me if my understanding about DLR operation resolving methodology is wrong
DLR has something call as call site caching. Given "a+b" (where a and b are both integers), the first time it will not be able to understand the +(plus) operator as wellas the operand. So it will resolve the operands and then the operators and will cache in level 1 after building a proper rule.
Next time onwards, if a similar kind of match is found (where the operand are of type integers and operator is of type +), it will look into the cache and will return the result and henceforth the performance will be enhanced.
If this understanding of mine is correct(if you fell that I am incorrect in the basic understanding kindly rectify that), I have the below questions for which I am looking answers
a) What kind of operations happen in level 1 , level 2
and level 3 call site caching in general
( i mean plus, minus..what kind)
b) DLR caches those results. Where it stores..i mean the cache location
c) How it makes the rules.. any algorithm.. if so could you please specify(any link or your own answer)
d) Sometime back i read somewhere that in level 1 caching , DLR has 10 rules,
level 2 it has 20 or something(may be) while level 3 has 100.
If a new operation comes, then I makes a new rule. Are these rules(level 1 , 2 ,3) predefined?
e) Where these rules are kept?
f) Instead of passing two integers (a and b in the example),
if we pass two strings or one string and one integer,
whether it will form a new rule?
Thanks
a) If I understand this correctly currently all of the levels work effectively the same - they run a delegate which tests the arguments which matter for the rule and then either performs an operation or notes the failure (the failure is noted by either setting a value on the call site or doing a tail call to the update method). Therefore every rule works like:
public object Rule(object x, object y) {
if(x is int && y is int) {
return (int)x + (int)y;
}
CallSiteOps.SetNotMatched(site);
return null;
}
And a delegate to this method is used in the L0, L1, and L2 caches. But the behavior here could change (and did change many times during development). For example at one point in time the L2 cache was a tree based upon the type of the arguments.
b) The L0 cache and the L1 caches are stored on the CallSite object. The L0 cache is a single delegate which is always the 1st thing to run. Initially this is set to a delegate which just updates the site for the 1st time. Additional calls attempt to perform the last operation the call site saw.
The L1 cache includes the last 10 actions the call site saw. If the L0 cache fails all the delegates in the L1 cache will be tried.
The L2 cache lives on the CallSiteBinder. The CallSiteBinder should be shared across multiple call sites. For example there should generally be one and only one additiona call site binder for the language assuming all additions are the same. If the L0 and L1 all of the available rules in the L2 cache will be searched. Currently the upper limit for the L2 cache is 128.
c) Rules can ultimately be produced in 2 ways. The general solution is to create a DynamicMetaObject which includes both the expression tree of the operation to be performed as well as the restrictions (the test) which determines if it's applicable. This is something like:
public DynamicMetaObject FallbackBinaryOperation(DynamicMetaObject target, DynamicMetaObject arg) {
return new DynamicMetaObject(
Expression.Add(Expression.Convert(target, typeof(int)), Expression.Convert(arg, typeof(int))),
BindingRestrictions.GetTypeRestriction(target, typeof(int)).Merge(
BindingRestrictions.GetTypeRestriction(arg, typeof(int))
)
);
}
This creates the rule which adds two integers. This isn't really correct because if you get something other than integers it'll loop infinitely - so usually you do a bunch of type checks, produce the addition for the things you can handle, and then produce an error if you can't handle the addition.
The other way to make a rule is provide the delegate directly. This is more advanced and enables some advanced optimizations to avoid compiling code all of the time. To do this you override BindDelegate on CallSiteBinder, inspect the arguments, and provide a delegate which looks like the Rule method I typed above.
d) Answered in b)
e) I believe this is the same question as b)
f) If you put the appropriate restrictions on then yes (which you're forced to do for the standard binders). Because the restrictions will fail because you don't have two ints the DLR will probe the caches and when it doesn't find a rule it'll call the binder to produce a new rule. That new rule will come back w/ a new set of restrictions, will be installed in the L0, L1, and L2 caches, and then will perform the operation for strings from then on out.