Why is the existential necessary in strongest postconditions? - verification

Every formulation of the strongest postcondition predicate transformer I have seen presents the assignment rule as follows:
sp(X:=E, P) = ∃v. (X=E[v/X] ∧ P[v/X])
I am wondering, why is the existential (and thus the existentially quantified variable "v") necessary in the above rule? It seems to me the strongest postconditions predicate transformer is almost identical to symbolic evaluation, in that you maintain a state (a mapping from variables to values) and a path condition (a predicate that must be true at a particular point in the program). Yet, symbolic evaluation does not rely on an existential quantifier.
So, I think I must be missing something here. Any help is appreciated!

I will give some intuitive description, since you have some knowledge in symbolic evaluation
If you have an arbitrary map to variables, you can not say anything about future state changes in the program before looking at them during the analysis.
Symbolical evaluation remembers each chosen path[as state space seperation], so it does not need to be contained in the evaluation formula to solve.
Here however you argue about every possible path and thus need an arbitrary formula to describe the behavior.
Assuming you would keep the variable in the formula, then you would argue about only 1 path of the possible runs. If you know that your variable does not induce other paths, then you can simplify this behavior.
Having however weakest liberal precondition, you know from which possible path you start and wrap all paths together to proof properties about your system.

Related

Amortized runtime for insertion in scapegoat tree

I am working on the following problem, from a problem set for a course I am self studying.
I have solved the first part. I'm stuck on the second. These are my thoughts so far. I think that the proper way to rebuild the subtree rooted at v would be to traverse it once to copy the values into an array in sorted order, and then, traverse it once again to build it into a balanced binary tree. Thus, this would be linear in v.size. However, I don't see where the potential and the constant can turn this into a O(1), let alone how such a constant could depend upon alpha. As I thought the rebuild operation was independent of alpha, and alpha simply affects how often you have to rebuild? So would the alpha come out of the potential function? And then the c just serves to cancel the alpha? If so, could I have some guidance as to how to rewrite the potential function?
You don't need to rewrite the potential function. The way that c and alpha interact is in the part of (2) in which "a subtree that is not alpha-balanced". That should help you derive a lower bound on the potential of that subtree. Part (1) helps you derive an upper bound on the potential of that subtree after the rebuilding. The resulting difference in potential should help you pay for the rebuilding.
In particular, the lower bound will be something like f(c,alpha) * m for some function f. This problem wants you to find an expression for c in terms of alpha so that f(c,alpha) >= 1.

translation from Datalog to SQL

I am still thinking on how to translate the recursivity of a Datalog program into SQL, such as
P(x,y) <- Q(x,y).
Q(x,y) <- P(x,z), A(y).
where A/1 is an EDB predicate. This, there is a co-dependency between P and Q. For longer queries, how to solve this problem?
Moreover, is there any system completely implement the translation? If there is, may I know what system or which paper may I refer?
If you adopt an approach of "tabling" previous conclusions and forward-chain reasoning on these to infer new conclusions, no recursive "depth" is required.
Bear in mind that Datalog requires some restrictions on rules and variable that assure finite termination and hence finitely many conclusions. Variables must have a finite range of possible values, for example.
Let's assume your example refers to constants rather than to variables:
P(x,y) <- Q(x,y).
Q(x,y) <- P(x,z), A(y).
One wrinkle is that you want A/1 to be implemented as an extended stored procedure or external code. For that I would propose tabling all the results of calling A on all possible arguments (finitely many). These are after all among the conclusions (provable statements) of your system.
Once that is done the forward-chaining inference proceeds iteratively rather than recursively. At each step consider each rule, applying it with premises (right-hand sides) that are previously obtained (tabled) conclusions if it produces a new conclusion. If no rule produces a new conclusion in the current step, halt. The proof procedure is complete.
In your example the proofs stop after all the A facts are adduced, because there are no conclusions sufficient to apply either rule to get new conclusions.
A possible approach is to use recursive CTEs in SQL, which provide the power of transitive closure. Relational algebra + transitive closure = Datalog.
Logica does something like this. It translates a datalog-like language into SQL for Google BigQuery, PostgreSQL and SQLite.

Optimization of Function Calls in Haskell

Not sure what exactly to google for this question, so I'll post it directly to SO:
Variables in Haskell are immutable
Pure functions should result in same values for same arguments
From these two points it's possible to deduce that if you call somePureFunc somevar1 somevar2 in your code twice, it only makes sense to compute the value during the first call. The resulting value can be stored in some sort of a giant hash table (or something like that) and looked up during subsequent calls to the function. I have two questions:
Does GHC actually do this kind of optimization?
If it does, what is the behaviour in the case when it's actually cheaper to repeat the computation than to look up the results?
Thanks.
GHC doesn't do automatic memoization. See the GHC FAQ on Common Subexpression Elimination (not exactly the same thing, but my guess is that the reasoning is the same) and the answer to this question.
If you want to do memoization yourself, then have a look at Data.MemoCombinators.
Another way of looking at memoization is to use laziness to take advantage of memoization. For example, you can define a list in terms of itself. The definition below is an infinite list of all the Fibonacci numbers (taken from the Haskell Wiki)
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
Because the list is realized lazily it's similar to having precomputed (memoized) previous values. e.g. fibs !! 10 will create the first ten elements such that fibs 11 is much faster.
Saving every function call result (cf. hash consing) is valid but can be a giant space leak and in general also slows your program down a lot. It often costs more to check if you have something in the table than to actually compute it.

Grammatically correct double-noun identifiers, plural versions

Consider compounds of two nouns, which in natural English would most often appear in the form "noun of noun", e.g. "direction of light", "output of a filter". When programming, we usually write "LightDirection" and "FilterOutput".
Now, I have a problem with plural nouns. There are two cases:
1) singular of plural
e.g. "union of (two) sets", "intersection of (two) segments"
Which is correct, SetUnion and SegmentIntersection or SetsUnion and SegmentsIntersection?
2) plural of plural
There are two subcases:
(a) Many elements, each having many related elements, e.g. "outputs of filters"
(b) Many elements, each having single related element, e.g. "directions of vectors"
Shall I use FilterOutputs and VectorDirections or FiltersOutputs and VectorsDirections?
I suspect correct is the first version (FilterOutupts, VectorDirections), but I think it may lead to ambiguities, e.g.
FilterOutputs - many outputs of a single filter or many outputs of many filters?
LineSegmentProjections - projections of many segments or many projections of a single segment?
What are the general rules, I should follow?
There's a grammatical misunderstanding lying behind this question. When we turn a phrase of form:
1. X of Y
into
2. Y X
the Y changes grammatical role from a noun in the possessive (1) to an adjective in the attributive (2). So while one may pluralise both X and Y in (1), one may only pluralise X in (2), because Y in (2) is an adjective, and adjectives do not have grammatical number.
Hence, e.g., SetsUnion is not in accordance with English. You're free to use it if it suits you, but you are courting unreadability, and I advise against it.
Postscript
In particular, consider two other possessive constructions, first the old-fashioned construction using the possessive pronoun "its", singular:
3a. Y, its X
the equivalent plural:
4a. Ys, their X
and their contractions, with 4b much less common than 3b:
3b. Y's X
4b. Ys' X
Here, SetsUnion suggests it is a rendering of the singular possessive type (3) Set's Union (=Set, its Union), where you intended to communicate the plural possessive (4) Sets, their Union (contracted to the less common Sets' Union).
So it's actively misleading.
Unless you're getting hamstrung by a convention driven system (ruby on rails, cakePHP etc), why not use OutputsOfFilters, UnionOfSets etc? They may not be conventional but they may be clearer.
For example its pretty clear that ProjectionOfLineSegments and ProjectionsOfLineSegment are different things or even ProjectionsOfLineSegments....
Using plural forms of nouns can make them more difficult to read.
When you have a number of things, they are usually stored in a datastructure - an array, a list, a map, set, etc.. generically called a collection or abstract data type. The interface to a collection of items is typically part of the programming environment (e.g. Collections in java and .net, STL in C++) and is well understood by developers to involve quantities of items.
You can avoid pluralizing your nouns, and make the fact that you are dealing with multiple quantities explicit, and indicate how they are accessed by incorporating the name of the collection. For example,
VectorDirectionList - the vectors and their directions are listed, e.g. some kind of Pair type. Works particularly well if you have a VectorDirection, combining a Vector and a Direction.
VectorDirectionMap - if the vector directions are mapped from vector.
Because it's a collection type, dealing with multiple objects is understood as it is endemic to a collection type. It then puts it in the same class as SetUnion - a union always involves at least 2 sets, and a VectorDirectionList makes it clear there can be more than one VectorDirection.
I agree about avoiding homonyms where the word has more than one word class, e.g. Filter, (and actually, Set, although to my mind Set would not really be used in a class name as a verb, so I interpret it as a noun.) I originally wrote this using FilterOutput as an example, but it didn't read well. Using a compound for Filter may help disambiguate - e.g. ImageFilterOutputs (or applying my own adivce, this would be ImageFilterOutputList.)
Avoiding plural forms with class names seems natural when you consider that an instance of a class is itself always one item - "an instance". If we use a plural name, then we get a mismatch - an instance trying to imply that it is multiple things - it itself is just one thing, even if it references multiple other things. The collection naming above builds on this - you have an instance which is a list, a map etc so there is no mismatch.
I'm assuming you are talking about programming language constructs, although the same thinking applies to tables/views. These are understood to involve quantities of items and table names are consequently often singlular (Customer, Order, Item) even though they store multiple rows. Many-to-Many Mapping tables are usually compounds of the entities being related, e.g. relating orders to items - OrderItem. In my experience, using plurals for table names makes the SQL difficult to read.
To sum up, I would avoid plural froms as they make reading harder. There are sure to be cases where they are unavoidable - where using the plural form is more readable than creating a huge name of nested entities and collections, but these are the exception than the rule.
What are the general rules, I should follow?
Make it Clear -- for both visual and aural thinkers.
Make it Specific but Accurate.
Make it pass the "crowded room" or "emergency phone call" test.
To illustrate with the SetsUnion example:
"SetsUnion" is right out; It's easily confused for a typo and speaking it (even in your head) will confuse it for "Set's Union" (Or worse).
The plural is also implied, so the 2nd 's' is redundant.
SetUnion is better but still ambiguous.
UnionOfSets is clearer and should be the bare minimum standard.
But all of these, so far, are uselessly vague (unless you are working with pure mathematical theory).
The term really should be specific. For example, "Red cars", "Programmers who spent too much time on esoterica", etc.
These are all unions of sets, but they tell you something useful. ;-)
.
Finally, Phil Factor had the right of it. To paraphrase:
Can you shout a (term) out across a crowded room and have it keyed in, and successfully (used), by a listener at the other side?
Try yelling, "SetsUnion," or even, "UnionOfSets," across a packed Irish bar. ;-)
1) i would use SetUnion and SegmentIntersection because i think in this case the plurality is implied anyway and it just looks nicer that way.
2) again, i would use FilterOutputs and VectorDirections, for the same reason. you could always use MultipleFilterOutputs if you want to be more specific.
but ultimately it's entirely down to your personal preference.
I think that while general naming conventions and consistency are important, but in a very very tight/tricky algorithm, clarity should trump convention. If it helps, use veryLongAndDescriptiveIdentifiers.
What's wrong with Union()?
Moreover, "union of sets" turns into "sets' union" (the two sets' union is ...); I'm sure I'm not the only person who's okay with CamelCase but not CamelsCaseMinusApostrophes. If it needs an apostrophe to make sense, don't use it. Set.Union() reads exactly like "union of set(s)".
Mathematations will also say "the (set) union of A and B", or rarely "A and B's (set) union". "The sets' union of A and B" makes no sense!
Most people will also see Vector[] vectors and Directions[] vectorDirections and assume that vectors[i] corresponds to vectorDirections[i]. If things really get ambiguous, I use something like vector_by_index and vectorDirection_by_index. Then you can have Map<Filter,Output> output_by_filter or Map<Filter,Output[]> outputs_by_filter, which makes it very obvious what the key is (this is very important in Objective-C where it's completely non-obvious what type the keys or values are).
If you really want, you can add an s and get vectors_by_index, but then consistency gives you the silly outputss_by_filter.
The right thing is, of course, something like struct FilterState { Filter filter; Output[] outputs; }; FilterState[] filterStates;.
I'd suggest singular for the first word: SetUnion, VectorDirections, etc.
Do a quick class search in your IDE, for: Strings*, Sets*, Vectors*, Collections*
Anyway, whatever you choose, be consistent throughout the whole application.

Appropriate operators for assignment semantics in a non-pure declarative language

I'm designing a declarative language for defining signal networks. I want to use variable bindings to represent groups of nodes in the network. It occurred to me that there are two types of "assignment" I wish to do for these variables.
On the one hand, a variable should represent the output of a specific group of signal operators. This output can then be attached to another input. This is important for directing different outputs to different places, for example:
a, b, c = (SignalA with three outputs)
(SignalB a)
(SignalC c)
(SignalD a)
In this case there would be a SignalA with three outputs, where the first and third outputs get linked to SignalB and SignalC respectively, and SignalD also gets linked to the first output of SignalA. There is only one instance of SignalA.
On the other hand, a variable should represent a common pattern of signal operations, so that it's easy to reproduce a common configuration:
a = (SignalA (SignalB))
(SignalC a)
(SignalD a)
In this case, I'd like a to represent the composition of SignalA and SignalB, and this is reproduced as the input for SignalC and SignalD. There are two instances of SignalA here.
So my question is, in functional/declarative programming, are there common terms for these two assignment semantics? And in my language, which one should get '=', and what would be a common operator for the other? (perhaps := ?)
I realized of course that if each Signal really represented a pure function, then both of these would be the same, but in my case it's possible for side effects to occur when the signal is processed, so I need to differentiate these two cases.
It's past my bed time, so I may not be reading carefully enough. But is the second case similar to an anonymous function? Your syntax looks lisp-like already, so I wonder if lisp's shortcut syntax for the lambda function might be what you want.
a = '(SignalA (SignalB))
If your usage is not actually similar in meaning to lambda, then it will probably cause more confusion.
BTW, in the first case, you could follow Perl's idea for the left side of a list assignment:
(a, b, c) = (SignalA with three outputs)
No idea if this will be helpful; I'm not that experienced outside of imperative languages like perl and C.