Can different program implementations have the same program semantics? - semantics

So for any given language, if we implement the same program (i.e. same output for any given input) twice, using different syntax (e.g. using i++ instead of i+1), will the two programs have the same semantics? Why?
Does the same apply in case where we use different constructs (i.e. Arrays vs Arraylists)?
Thanks

Yes. Depending on the programming language, there can be (combinations of) different syntax constructs with identical semantics.
For example, we can define a programming language with 3 constructs: A and B, both of which are semantically equivalent, and composition (e.g. XY for any X and Y, where each of these can be A, B or any composition thereof). Hence program A is equivalent to program B. Also AA is equivalent to AB, BA and BB etc.
Further, if we extend the language with C which is semantically equivalent to AA, then, for example, BC is equivalent to AAA etc.

So for any given language, if we implement the same program(i.e same output for any given input) twice, using different syntax (i.e. using i++ instead of i+1) will the two programs have the same semantics?
That question is a tautology. The answer is yes. Obviously.
If two different programs produce the same results for all possible input sets, then they do have the same semantics. By definition (see footnote 1).
Why?
Because that is what "same semantics" means!
Does the same apply in case where we use different constructs (i.e. Arrays vs Arraylists)?
Yes.
(One data structure might use more memory, and that might cause an OOME for one version and not the other ... for certain input datasets. But then I would argue that the programs DO NOT produce the same results for all possible inputs.)
Note that this applies to all practical programming languages. Any programming language where there are programs that can only be written one way ... is probably too restrictive to be usable.
1 - OK, so anyone who has studied programming semantics would probably have a fit when they read that. But I am trying to provide an intuitive explanation rather than one that has a decent mathematical foundation. Horses for courses ... as they say.
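As a rough illustration of the answer above, here is a minimal Python sketch. Python has no i++, so `i += 1` and `i = i + 1` stand in for the question's two spellings, and a list versus a typed array stands in for Arrays vs ArrayLists. All function names are made up for this example. Note that comparing on a handful of inputs only suggests equivalence; it does not prove it for all inputs.

```python
from array import array


def count_up_augmented(n):
    """Count to n using the augmented form `i += 1`."""
    i = 0
    while i < n:
        i += 1
    return i


def count_up_explicit(n):
    """Count to n using the explicit form `i = i + 1`."""
    i = 0
    while i < n:
        i = i + 1
    return i


def total_with_list(values):
    """Sum the values after storing them in a growable list."""
    store = []
    for v in values:
        store.append(v)
    return sum(store)


def total_with_array(values):
    """Sum the same values after storing them in a typed array."""
    store = array("q", values)  # "q": signed 64-bit integers
    return sum(store)


def agree_on(f, g, inputs):
    """True if f and g return equal results on every sample input."""
    return all(f(x) == g(x) for x in inputs)
```

The typed array also shows the caveat about resources: it rejects integers that do not fit in 64 bits, so on such inputs the two versions would no longer produce the same result.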


How to display a sequence of pairs in alloy?

I would like to know, how can I define a pair and a sequence of pair in alloy?
For example, in the Z notation, we can have a variable definition like c as a sequence of pairs, ie, "c: seq (A \cross B)". Is there any equivalent to this definition in the alloy language?
Alloy is pretty expressive, and often you can translate directly from Z into Alloy. In this case, for example, you could declare a signature representing the pairs
sig Pair {first, second: X}
and then define a field as a sequence of pairs
s: seq Pair
But usually there's a better way of doing it. For example, maybe having two sequences is better; maybe the sequences can be represented as orderings; or maybe you don't need sequences at all and sets will do. Generally this is what people find when modeling with Alloy: that making things simpler for analysis makes things easier to understand and express too. Good luck!

Can antlr do type-dependent parsing?

Let me ask whether antlr3 accepts the following example grammar.
for an input , x + y * z ,
it is parsed as x+(y*z) if each in {x,y,z} is a number;
it is parsed as (x+y)*z if each in {x,y,z} is an object of a particular type T;
And let me ask whether such grammars are used sometimes or very rarely for computer languages.
Thank you very much.
In general, parsers (produced by parser generators) only check syntax.
A parser (produced by any means) that can explore multiple parses (I believe ANTLR does this by backtracking; other parsing engines [GLR, Earley] do it by parallel exploration of possible parses), if augmented with semantic checking information, could reject parses that didn't meet semantic constraints.
People tend not to build such parsers in my experience, partly because it is hard to explain to users. If they don't get it, your parser isn't successful; your example is especially bad IMHO in terms of explainability. They also tend not to do this because they need that type information, and that's not always convenient to collect as you parse. The GCC parsers famously do just this to parse statements such as
X*T;
and the parser is a bit of a mess because of the need to parse and collect this type information as it goes.
I suspect ANTLR can check semantic predicates. How easy it is to get type information of the kind you discuss to those semantic checks is another question; I have no experience here.
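To make the question's example concrete, here is a small hand-written sketch in Python. It is not ANTLR and does not use semantic predicates; it is a plain precedence-climbing parser that picks its precedence table from a symbol table. The table names, the type name "T" and the assumption that all operands share one type are taken from the question or invented for the illustration.

```python
# Hypothetical precedence tables: a higher number binds tighter.
NUMERIC_PRECEDENCE = {"+": 1, "*": 2}   # usual arithmetic: x + (y * z)
TYPE_T_PRECEDENCE = {"+": 2, "*": 1}    # assumed rule for type T: (x + y) * z


def parse(tokens, symbol_types):
    """Parse a flat token list into nested tuples, picking the
    precedence table from the declared type of the first operand."""
    if symbol_types[tokens[0]] == "T":
        table = TYPE_T_PRECEDENCE
    else:
        table = NUMERIC_PRECEDENCE
    pos = 0

    def parse_expr(min_prec):
        nonlocal pos
        left = tokens[pos]          # an operand
        pos += 1
        while pos < len(tokens) and table[tokens[pos]] >= min_prec:
            op = tokens[pos]
            pos += 1
            right = parse_expr(table[op] + 1)   # left-associative
            left = (op, left, right)
        return left

    return parse_expr(1)
```

With `["x", "+", "y", "*", "z"]` and every name declared as a number, this returns `("+", "x", ("*", "y", "z"))`; with every name declared as "T" it returns `("*", ("+", "x", "y"), "z")`. The catch described above is visible here too: the symbol table has to be complete before parsing starts.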
The GLR parsing engine used by our DMS Software Reengineering Toolkit does have "semantic" predicates. It isn't particularly easy to get real semantic type information to those predicates by architectural design; we wanted such predicates to be driven off of "syntax". But then, everything (including type inference) is driven off syntax. So we stick to information purely local to the reduction being proposed. This is particularly handy in (not) recognizing as separate types of parses, the following peculiar FORTRAN construct for nested-do-termination vs. shared-do-termination:
DO 10 I=1,10,1
DO 10 J=1,10,1
A(I,J)=0
10 CONTINUE
20 CONTINUE
vs.
DO 20 I=1,10,1
DO 10 J=1,10,1
A(I,J)=0
10 CONTINUE
20 CONTINUE
To the parser, at the pure syntax level, both of these look like:
DO <INT> <VAR>=...
DO <INT> <VAR>=...
<STMTS>
<INT> CONTINUE
<INT> CONTINUE
How can one determine which CONTINUE statement belongs to which DO construct with only this information? You can't.
The DMS FORTRAN parser does exactly this by having two sets of rules for DO loops, one for unshared continues, and one for shared continues. They differentiate using semantic predicates that check that the CONTINUE statement label matches the DO loop designated label. And thus the DMS FORTRAN parser gets the loop nesting right as it parses. AFAIK, all the other FORTRAN compilers parse the statements individually, and then stitch the DO loop nests together in a post pass.
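The label check itself is simple, and a rough Python sketch may help. This is not how DMS does it (it uses predicates inside a GLR parser); the sketch is a plain stack over an already-split statement list, closer to the post-pass stitching mentioned above. The tuple layout and the function name are assumptions made for the example.

```python
def nest_do_loops(statements):
    """Pair each labelled CONTINUE with the DO loops it terminates.

    `statements` is a list of (label, kind, target) tuples:
      (None, "DO", 10)        stands for  DO 10 ...
      (10, "CONTINUE", None)  stands for  10 CONTINUE
      (None, "STMT", None)    stands for  any other statement
    Returns a list of (do_index, continue_index) pairs.
    """
    open_loops = []   # stack of (statement index, terminating label)
    pairs = []
    for index, (label, kind, target) in enumerate(statements):
        if kind == "DO":
            open_loops.append((index, target))
        elif kind == "CONTINUE":
            # The check: does this label match the innermost open DO?
            # A shared CONTINUE closes every enclosing DO that names it.
            while open_loops and open_loops[-1][1] == label:
                do_index, _ = open_loops.pop()
                pairs.append((do_index, index))
    return pairs
```

For the first listing (both DOs name label 10) both loops pair with the `10 CONTINUE` line and `20 CONTINUE` is left as an ordinary statement; for the second listing the inner loop pairs with `10 CONTINUE` and the outer one with `20 CONTINUE`.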
And yes, while FORTRAN has this (confusing) construct, no other modern language that I know copied it.

Grammatically correct double-noun identifiers, plural versions

Consider compounds of two nouns, which in natural English would most often appear in the form "noun of noun", e.g. "direction of light", "output of a filter". When programming, we usually write "LightDirection" and "FilterOutput".
Now, I have a problem with plural nouns. There are two cases:
1) singular of plural
e.g. "union of (two) sets", "intersection of (two) segments"
Which is correct, SetUnion and SegmentIntersection or SetsUnion and SegmentsIntersection?
2) plural of plural
There are two subcases:
(a) Many elements, each having many related elements, e.g. "outputs of filters"
(b) Many elements, each having single related element, e.g. "directions of vectors"
Shall I use FilterOutputs and VectorDirections or FiltersOutputs and VectorsDirections?
I suspect the first version is correct (FilterOutputs, VectorDirections), but I think it may lead to ambiguities, e.g.
FilterOutputs - many outputs of a single filter or many outputs of many filters?
LineSegmentProjections - projections of many segments or many projections of a single segment?
What are the general rules, I should follow?
There's a grammatical misunderstanding lying behind this question. When we turn a phrase of form:
1. X of Y
into
2. Y X
the Y changes grammatical role from a noun in the possessive (1) to an adjective in the attributive (2). So while one may pluralise both X and Y in (1), one may only pluralise X in (2), because Y in (2) is an adjective, and adjectives do not have grammatical number.
Hence, e.g., SetsUnion is not in accordance with English. You're free to use it if it suits you, but you are courting unreadability, and I advise against it.
Postscript
In particular, consider two other possessive constructions, first the old-fashioned construction using the possessive pronoun "its", singular:
3a. Y, its X
the equivalent plural:
4a. Ys, their X
and their contractions, with 4b much less common than 3b:
3b. Y's X
4b. Ys' X
Here, SetsUnion suggests it is a rendering of the singular possessive type (3) Set's Union (=Set, its Union), where you intended to communicate the plural possessive (4) Sets, their Union (contracted to the less common Sets' Union).
So it's actively misleading.
Unless you're getting hamstrung by a convention driven system (ruby on rails, cakePHP etc), why not use OutputsOfFilters, UnionOfSets etc? They may not be conventional but they may be clearer.
For example it's pretty clear that ProjectionOfLineSegments and ProjectionsOfLineSegment are different things, or even ProjectionsOfLineSegments.
Using plural forms of nouns can make them more difficult to read.
When you have a number of things, they are usually stored in a data structure - an array, a list, a map, a set, etc. - generically called a collection or abstract data type. The interface to a collection of items is typically part of the programming environment (e.g. Collections in Java and .NET, STL in C++) and is well understood by developers to involve quantities of items.
You can avoid pluralizing your nouns, and make the fact that you are dealing with multiple quantities explicit, and indicate how they are accessed by incorporating the name of the collection. For example,
VectorDirectionList - the vectors and their directions are listed, e.g. some kind of Pair type. Works particularly well if you have a VectorDirection, combining a Vector and a Direction.
VectorDirectionMap - if the vector directions are mapped from vector.
Because it's a collection type, dealing with multiple objects is understood as it is endemic to a collection type. It then puts it in the same class as SetUnion - a union always involves at least 2 sets, and a VectorDirectionList makes it clear there can be more than one VectorDirection.
I agree about avoiding homonyms where the word has more than one word class, e.g. Filter, (and actually, Set, although to my mind Set would not really be used in a class name as a verb, so I interpret it as a noun.) I originally wrote this using FilterOutput as an example, but it didn't read well. Using a compound for Filter may help disambiguate - e.g. ImageFilterOutputs (or applying my own advice, this would be ImageFilterOutputList.)
Avoiding plural forms with class names seems natural when you consider that an instance of a class is itself always one item - "an instance". If we use a plural name, then we get a mismatch - an instance trying to imply that it is multiple things - it itself is just one thing, even if it references multiple other things. The collection naming above builds on this - you have an instance which is a list, a map etc so there is no mismatch.
I'm assuming you are talking about programming language constructs, although the same thinking applies to tables/views. These are understood to involve quantities of items and table names are consequently often singular (Customer, Order, Item) even though they store multiple rows. Many-to-Many Mapping tables are usually compounds of the entities being related, e.g. relating orders to items - OrderItem. In my experience, using plurals for table names makes the SQL difficult to read.
To sum up, I would avoid plural forms as they make reading harder. There are sure to be cases where they are unavoidable - where using the plural form is more readable than creating a huge name of nested entities and collections, but these are the exception rather than the rule.
What are the general rules, I should follow?
Make it Clear -- for both visual and aural thinkers.
Make it Specific but Accurate.
Make it pass the "crowded room" or "emergency phone call" test.
To illustrate with the SetsUnion example:
"SetsUnion" is right out; It's easily confused for a typo and speaking it (even in your head) will confuse it for "Set's Union" (Or worse).
The plural is also implied, so the 2nd 's' is redundant.
SetUnion is better but still ambiguous.
UnionOfSets is clearer and should be the bare minimum standard.
But all of these, so far, are uselessly vague (unless you are working with pure mathematical theory).
The term really should be specific. For example, "Red cars", "Programmers who spent too much time on esoterica", etc.
These are all unions of sets, but they tell you something useful. ;-)
Finally, Phil Factor had the right of it. To paraphrase:
Can you shout a (term) out across a crowded room and have it keyed in, and successfully (used), by a listener at the other side?
Try yelling, "SetsUnion," or even, "UnionOfSets," across a packed Irish bar. ;-)
1) I would use SetUnion and SegmentIntersection because I think in this case the plurality is implied anyway and it just looks nicer that way.
2) Again, I would use FilterOutputs and VectorDirections, for the same reason. You could always use MultipleFilterOutputs if you want to be more specific.
But ultimately it's entirely down to your personal preference.
I think that while general naming conventions and consistency are important, in a very tight or tricky algorithm clarity should trump convention. If it helps, use veryLongAndDescriptiveIdentifiers.
What's wrong with Union()?
Moreover, "union of sets" turns into "sets' union" (the two sets' union is ...); I'm sure I'm not the only person who's okay with CamelCase but not CamelsCaseMinusApostrophes. If it needs an apostrophe to make sense, don't use it. Set.Union() reads exactly like "union of set(s)".
Mathematicians will also say "the (set) union of A and B", or rarely "A and B's (set) union". "The sets' union of A and B" makes no sense!
Most people will also see Vector[] vectors and Directions[] vectorDirections and assume that vectors[i] corresponds to vectorDirections[i]. If things really get ambiguous, I use something like vector_by_index and vectorDirection_by_index. Then you can have Map<Filter,Output> output_by_filter or Map<Filter,Output[]> outputs_by_filter, which makes it very obvious what the key is (this is very important in Objective-C where it's completely non-obvious what type the keys or values are).
If you really want, you can add an s and get vectors_by_index, but then consistency gives you the silly outputss_by_filter.
The right thing is, of course, something like struct FilterState { Filter filter; Output[] outputs; }; FilterState[] filterStates;.
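Here is roughly what the struct-plus-map naming above could look like in Python. The filter names and file names are invented for the example, and `filter_name` is a string only to keep the sketch short.

```python
from dataclasses import dataclass, field


@dataclass
class FilterState:
    """One filter together with all of its outputs."""
    filter_name: str
    outputs: list = field(default_factory=list)


def outputs_by_filter(filter_states):
    """Build a map whose name says what the key is: filter -> outputs."""
    return {state.filter_name: state.outputs for state in filter_states}


# Hypothetical data: one filter with a single output, one with two.
filter_states = [
    FilterState("blur", ["blurred.png"]),
    FilterState("edges", ["edges_x.png", "edges_y.png"]),
]
```

Reading `outputs_by_filter(filter_states)["edges"]` leaves little doubt that the key is a filter and the value is several outputs, which is the ambiguity the question was worried about.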
I'd suggest singular for the first word: SetUnion, VectorDirections, etc.
Do a quick class search in your IDE, for: Strings*, Sets*, Vectors*, Collections*
Anyway, whatever you choose, be consistent throughout the whole application.

Appropriate operators for assignment semantics in a non-pure declarative language

I'm designing a declarative language for defining signal networks. I want to use variable bindings to represent groups of nodes in the network. It occurred to me that there are two types of "assignment" I wish to do for these variables.
On the one hand, a variable should represent the output of a specific group of signal operators. This output can then be attached to another input. This is important for directing different outputs to different places, for example:
a, b, c = (SignalA with three outputs)
(SignalB a)
(SignalC c)
(SignalD a)
In this case there would be a SignalA with three outputs, where the first and third outputs get linked to SignalB and SignalC respectively, and SignalD also gets linked to the first output of SignalA. There is only one instance of SignalA.
On the other hand, a variable should represent a common pattern of signal operations, so that it's easy to reproduce a common configuration:
a = (SignalA (SignalB))
(SignalC a)
(SignalD a)
In this case, I'd like a to represent the composition of SignalA and SignalB, and this is reproduced as the input for SignalC and SignalD. There are two instances of SignalA here.
So my question is, in functional/declarative programming, are there common terms for these two assignment semantics? And in my language, which one should get '=', and what would be a common operator for the other? (perhaps := ?)
I realized of course that if each Signal really represented a pure function, then both of these would be the same, but in my case it's possible for side effects to occur when the signal is processed, so I need to differentiate these two cases.
It's past my bed time, so I may not be reading carefully enough. But is the second case similar to an anonymous function? Your syntax looks lisp-like already, so I wonder if lisp's shortcut syntax for the lambda function might be what you want.
a = '(SignalA (SignalB))
If your usage is not actually similar in meaning to lambda, then it will probably cause more confusion.
BTW, in the first case, you could follow Perl's idea for the left side of a list assignment:
(a, b, c) = (SignalA with three outputs)
No idea if this will be helpful; I'm not that experienced outside of imperative languages like perl and C.
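The difference between the two bindings in the question can be sketched in Python. This is only an illustration of the semantics, not a proposal for the language's syntax: the `Signal` class and its instance counter are made up, and the second case binds the name to a zero-argument function, in the spirit of the lambda suggestion above.

```python
class Signal:
    """A node in the network; creating one may have side effects."""
    instances_created = 0

    def __init__(self, name, *inputs):
        Signal.instances_created += 1
        self.name = name
        self.inputs = inputs


# Case 1: the name is bound to one node, and every use shares it.
shared = Signal("SignalA")
b = Signal("SignalB", shared)
d = Signal("SignalD", shared)


# Case 2: the name is bound to a recipe, and every use builds a fresh copy.
def pattern():
    return Signal("SignalA", Signal("SignalB"))


c2 = Signal("SignalC", pattern())
d2 = Signal("SignalD", pattern())
```

In the first case `b` and `d` hold the very same input object, so any side effect of SignalA happens once. In the second case `c2` and `d2` each get their own SignalA and SignalB, so the side effects happen twice.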

Three value variables, max, min, actual

A long while ago I developed systems using Egeria, an expert system language. It had a really useful feature where variables had three values: a min, a max and a current. In this way the probability of a partly known value could be calculated, with the results ending up as a range. I can't remember the syntax, but it was something like this:
A.Min = 1;
A.Max = 5;
A.Current= 4;
B.Min = 2;
B.Max = 4;
B.Current= 4;
A * B = {2, 20, 16}
My question is this, what is this approach called, and do any current languages implement it?
Multi-valued variables like the ones you describe may be used in constraint-based programming. For a recent paper see Radul and Sussman, "The Art of the Propagator".
Mr. Radul presented at ILC 2009 last week. He gave an example of (what one might consider) multi-valued variables that represent a probabilistic approximation to "truth". (I apologize in advance for any misrepresentation, I don't have notes.)
Consider a system that must reconcile readings from two thermal sensors. Suppose further that each sensor's readings come with some degree of uncertainty: sensor A says the temp is between A1 and A2, sensor B says temp is between B1 and B2. Should the system fail in the attempt to compute the temperature? Perhaps the "truth" can be expressed in terms of the range where the readings overlap.
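The overlap idea for the two sensors can be written down in a few lines of Python. This is not taken from the propagator paper; the function name and the readings used with it are invented for the illustration.

```python
def overlap(range_a, range_b):
    """Return the (low, high) interval where two readings agree, or None."""
    low = max(range_a[0], range_b[0])
    high = min(range_a[1], range_b[1])
    return (low, high) if low <= high else None
```

For example, readings of (18.0, 22.0) and (20.0, 25.0) reconcile to (20.0, 22.0), while readings that do not overlap at all return None, which is the case where the system would have to report a contradiction.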
It sounds like, as an "approach", it may be a species of fuzzy logic. Especially when you describe it being used probabilistically.
Appendix C of the original paper on Yacc (published in Volume 2 of the UNIX Programmer's Manual for Version 7; the paper is dated 1978-07-31) described 'a desk calculator that does floating point interval arithmetic'. It used intervals with the notation '( min, max )' and implemented range-based arithmetic. What you describe is an extension of that with the 'current' value too.
Most object oriented languages could do this fairly easily using classes.
In C++, in particular, it would be very easy to make a templated class that handled this for you for any base type, for example.
I don't know of any languages that support this as part of the core language, though.
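As a sketch of the class-based approach, here is a minimal Python version that reproduces the A * B example from the question. The class name `Ranged` and the field names are my own, and only multiplication is implemented; I don't know how Egeria actually defined its operators, so the rule used here (ordinary interval multiplication for the bounds, plain multiplication for the current value) is an assumption.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Ranged:
    """A value with a lower bound, an upper bound and a current estimate."""
    low: float
    high: float
    current: float

    def __mul__(self, other):
        # Interval multiplication: the bounds are the extreme endpoint products.
        corners = [a * b for a, b in product((self.low, self.high),
                                             (other.low, other.high))]
        return Ranged(min(corners), max(corners), self.current * other.current)
```

With `a = Ranged(1, 5, 4)` and `b = Ranged(2, 4, 4)`, `a * b` gives `Ranged(low=2, high=20, current=16)`, matching the `{2, 20, 16}` in the question. Taking the min and max over all four endpoint products keeps the bounds right when negative values are involved.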