JVM spec: grammar for generic signatures requires lookahead

In the spec for generic signatures, ClassSignature has the form
ClassSignature:
[TypeParameters] SuperclassSignature {SuperinterfaceSignature}
TypeParameters:
< TypeParameter {TypeParameter} >
TypeParameter:
Identifier ClassBound {InterfaceBound}
ClassBound:
: [ReferenceTypeSignature]
InterfaceBound:
: ReferenceTypeSignature
So the class bound of a type parameter can be omitted (some examples here).
If I have a class declaration public class A<T, LFooBar>, the Java compiler generates the signature <T:Ljava/lang/Object;LFooBar:Ljava/lang/Object;>Ljava/lang/Object;.
IIUC, the class bound could be omitted, in which case the signature would be <T:LFooBar:>Ljava/lang/Object;.
Parsing that short signature requires looking ahead to the second : in order to know that T:LFooBar: are two type parameters, and not one type parameter T with class bound FooBar.
Maybe in practice the class bound is only omitted if there's an interface bound? For public class A<T extends Comparable<? super T>>, javac produces the signature <T::Ljava/lang/Comparable<-TT;>;>Ljava/lang/Object;. But I guess I cannot rely on this assumption.
Did I misunderstand something?

If you look closely, the only things that can follow an omitted ReferenceTypeSignature are an Identifier or a >. A ReferenceTypeSignature must either begin with [ or end with ;, and identifiers can't contain those characters; an identifier, in turn, must be followed by a :, which can't appear inside a type signature. So there is no ambiguity between those options.
Note that identifiers can start with >, so you need to look ahead for a colon to determine whether you are at the end of TypeParameters or not. But that's a separate issue.
I'm not sure how the JVM implements it, but one possible approach is this:
Examine the first character. If it is [, you have a type signature. If it is >, scan ahead for the first [, ;, or :. If the first one you see is :, you have an identifier; otherwise you have reached the end of the type parameters.
Otherwise, scan ahead for the first ; or :. If it is :, you have an identifier; otherwise you have a class bound.
Edit: Identifiers in signatures cannot contain >, so ignore that bit. (They also can't contain :, another potential source of ambiguity)
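To make the lookahead concrete, here is a minimal sketch in Java of that classification step (SignatureLookahead, Next and classify are made-up names, not part of the JVM spec or any real parser; the case of a : that immediately starts an interface bound is included for completeness):
final class SignatureLookahead {
    enum Next { TYPE_SIGNATURE, INTERFACE_BOUND, IDENTIFIER, END_OF_TYPE_PARAMETERS }

    // Classify what starts at position pos, i.e. right after the ':' of a ClassBound
    // whose optional ReferenceTypeSignature may have been omitted:
    // - a ReferenceTypeSignature begins with '[' or ends with ';'
    // - an identifier cannot contain '[', ';', ':' or '>' and is always followed by ':'
    // - a further ':' here can only start an InterfaceBound
    // - '>' here can only close the TypeParameters list
    static Next classify(String sig, int pos) {
        char first = sig.charAt(pos);
        if (first == '[') return Next.TYPE_SIGNATURE;          // array type signature (may not end with ';')
        if (first == ':') return Next.INTERFACE_BOUND;         // class bound omitted, interface bound follows
        if (first == '>') return Next.END_OF_TYPE_PARAMETERS;  // identifiers cannot contain '>'
        for (int i = pos; i < sig.length(); i++) {
            char c = sig.charAt(i);
            if (c == ':') return Next.IDENTIFIER;      // the ':' that ends the next type parameter's name
            if (c == ';') return Next.TYPE_SIGNATURE;  // we were inside a class or type-variable signature
        }
        throw new IllegalArgumentException("truncated signature: " + sig);
    }
}
On <T:LFooBar:>Ljava/lang/Object;, called just after T's colon, the scan meets the : after LFooBar before any ;, so LFooBar is read as a second type parameter; on the javac-generated <T:Ljava/lang/Object;LFooBar:...> the ; comes first and the same position is read as a class bound.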

Related

The use of ">>" in Pharo/Smalltalk

I am implementing futures in Pharo. I came across this website: http://onsmalltalk.com/smalltalk-concurrency-playing-with-futures. I am following this example and trying to replicate it in Pharo. However, I get to the last step and I have no idea what ">>" means. This symbol is also not listed as part of Smalltalk syntax at http://rigaux.org/language-study/syntax-across-languages-per-language/Smalltalk.html.
BlockClosure>>future
^ SFuture new value: self fixTemps
I can see that future is not a variable or a method implemented by BlockClosure. What should I do with this part of the code to make the promises/futures work as indicated at http://onsmalltalk.com/smalltalk-concurrency-playing-with-futures? I cannot add it in the Playground or as a method to my Promise class as it is, or am I missing something?
After adding the future method to BlockClosure, this is the code I try in the Playground.
value1 := [200 timesRepeat:[Transcript show: '.']. 6] future.
value2 := [200 timesRepeat:[Transcript show: '+']. 6] future.
Transcript show: 'other work'.
Transcript show: (value1 + value2).
Date today
The Transcript displays the errors below instead of the expected value of 12.
UndefinedObject>>DoIt (value1 is Undeclared)
UndefinedObject>>DoIt (value2 is Undeclared)
For some reason (it would be nice to learn which), there is a traditional notation in Smalltalk for referring to the method with selector, say, m, in class C: C>>m. For example, BlockClosure>>future denotes the method of BlockClosure with selector #future. Interestingly enough, this notation is not an evaluable Smalltalk expression. It is just a succinct way of saying "what comes below is the source code of method m in class C". Just that.
In Smalltalk, however, methods are objects too. In fact, they are instances of CompiledMethod. This means that they can be retrieved by sending a message. In this case, the message is methodAt:. The receiver of the message is the class which implements the method and the argument is the selector (respectively, C and #m, or BlockClosure and #future in your example).
Most dialects, therefore, implement a synonym of methodAt: named >>. This is easily done in this way:
>> aSymbol
^self methodAt: aSymbol
This brings the Smalltalk syntax much closer to the traditional notation, because now BlockClosure>>future looks like an expression that sends the message >> to BlockClosure with argument future. However, future is not a Symbol unless we prefix it with #: the literal Symbol #future is a valid Smalltalk object. Now the expression
BlockClosure >> #future
becomes a message send, and its result, after evaluation, is the CompiledMethod with selector #future in the class BlockClosure.
In sum, BlockClosure>>future is a notation, not a valid Smalltalk expression. However, by tweaking it to be BlockClosure >> #future, it becomes an evaluable expression of the language that returns the method the notation referred to.
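For instance (assuming your dialect implements >> as shown above), once you have added the method, evaluating BlockClosure >> #future in a Playground answers the corresponding CompiledMethod, and (BlockClosure >> #future) selector answers #future.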

How does `Yacc` identify function calls?

I am trying to figure out how yacc identifies function calls in C code. For example: if there is a function call like my_fun(a,b); then which rules does this statement reduce to?
I am using the C grammar present in: C Grammar
Following the grammar given there manually, I figured out that we only have two choices in a translation unit: everything has to be either a function definition or a declaration. Now, all declarations start with type_specifier, storage_class_specifier, etc., but none of them starts with IDENTIFIER.
In the case of a function call, the name would be an IDENTIFIER. This leaves me unclear as to how it will be parsed and which rules will be used exactly.
According to the official yacc specification (here: yacc), lexical analysis is handled by user-supplied routines. When you have a function call, the name is of course an IDENTIFIER, and it is recognized by those user-defined routines. According to the specification, the user can describe the input in terms of individual input characters, or in terms of higher-level constructs such as names and numbers. The user-supplied routine may also handle idiomatic features such as comment and continuation conventions, which typically defy easy grammatical specification.
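For instance, a typical lex rule such as [a-zA-Z_][a-zA-Z0-9_]* { return IDENTIFIER; } (a generic illustration, not necessarily the exact rule shipped with that grammar) is what turns my_fun into an IDENTIFIER token before the parser applies any grammar rule.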
Do have a look. By the way, you are supposed to do thorough research before posting questions here.

What kind of meta argument is the first argument of predicate_property/2?

In other words, should it be 0 or : or something else? The Prolog systems SICStus, YAP, and SWI all indicate this as :. Is this appropriate? Shouldn't it rather be 0, which means a term that can be called by call/1?
To check your system, type:
| ?- predicate_property(predicate_property(_,_),P).
P = (meta_predicate predicate_property(:,?)) ? ;
P = built_in ? ;
P = jitted ? ;
no
I should add that meta-arguments — at least in the form used here — cannot guarantee the same algebraic properties we expect from pure relations:
?- S=user:false, predicate_property(S,built_in).
S = user:false.
?- predicate_property(S,built_in), S=user:false.
false.
Here is the relevant part from ISO/IEC 13211-2:
7.2.2 predicate_property/2
7.2.2.1 Description
predicate_property(Prototype, Property) is true in the calling context of a module M iff the procedure
associated with the argument Prototype has predicate property
Property.
...
7.2.2.2 Template and modes
predicate_property(+prototype, ?predicate_property)
7.2.2.3 Errors
a) Prototype is a variable — instantiation_error.
...
c) Prototype is neither a variable nor a callable term —
type_error(callable, Prototype).
...
7.2.2.4 Examples
Goals attempted in the context of the module bar.
predicate_property(q(X), exported).
succeeds, X is not instantiated.
...
That's an interesting question. First, I think there are two kinds of predicate_property/2 predicates. The first kind takes a callable and is intended to work smoothly with, for example, vanilla interpreters and built-ins such as write/1, nl/0, etc., i.e.:
solve((A,B)) :- !, solve(A), solve(B).
solve(A) :- predicate_property(A, built_in), !, A.
solve(A) :- clause(A,B), solve(B).
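With these clauses, a query like solve((write(hello), nl)) splits the conjunction via the first clause and then runs write(hello) and nl directly through the built_in clause; a user-defined goal would instead be unfolded via clause/2.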
For the first kind, I guess the 0 meta argument specifier
would work fine. The second kind of predicate_property/2
predicates works with predicate indicators. Callable
and predicate indicators are both notions already defined
in the ISO core standard.
A predicate indicator has the form F/N, where F is an atom
and N is an integer. Matters get a little bit more complicated
if modules are present, especially because of the operator
precedence of (:)/2 versus (/)/2. If predicate_property/2 works with predicate indicators, we can still code the vanilla interpreter:
solve((A,B)) :- !, solve(A), solve(B).
solve(A) :- functor(A,F,N), predicate_property(F/N, built_in), !, A.
solve(A) :- clause(A,B), solve(B).
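For example, for a goal write(hello) the call functor(write(hello), F, N) yields F = write and N = 1, so it is the bare indicator write/1, stripped of any module information, that gets passed on to predicate_property/2.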
Here we lose the connection of a possible meta argument 0, for example of solve/1, with predicate_property/2, because functor/3 usually has no meta_predicate declaration. Transferring module information via functor/3 to predicate_property/2 is also impossible: functor/3 is agnostic to modules, and it usually has no realization that can deal with arguments that contain module qualification.
There are now two issues:
1) Can we, and should we, give typing to predicates such as functor/3?
2) Can we extend functor/3 so that it can convey module qualification?
Here are my thoughts:
1) This would need a more elaborate type system, one that would allow overloading of predicates with multiple types. For example, functor/3 could have two types:
:- meta_predicate functor(?,?,?).
:- meta_predicate functor(0,?,?).
The real power of overloading multiple types would only
shine in predicates such as (=)/2. Here we would have:
:- meta_predicate =(?,?).
:- meta_predicate =(0,0).
This would allow for more type inference: if one side of (=)/2 is a goal, we could deduce that the other side is also a goal.
But matters are not so simple; it would possibly also make sense to have a form of type cast, or some other mechanism to restrict the overloading, something which is not covered by merely introducing a meta_predicate directive. This would require further constructs inside the terms and goals.
Learning from lambda Prolog or some dependent type system could be advantageous. For example, (=)/2 can be viewed as parametrized by a type A, i.e.:
:- meta_predicate =(A,A).
2) For Jekejeke Prolog I have provided an alternative functor/3 realization, the predicate sys_modfunc_site/2. It works bidirectionally like functor/3, but returns and accepts the predicate indicator as one whole term. Here are some example runs:
?- sys_modfunc_site(a:b(x,y), X).
X = a:b/2
?- sys_modfunc_site(X, a:b/2).
X = a:b(_A,_B)
The result of the predicate could be called a generalized predicate indicator. It is what SWI-Prolog already understands, for example, in listing/1. So it could have the same meta argument specification as listing/1 has, which is currently : in SWI-Prolog. So we would have the following, and subsequently predicate_property/2 would take : in its first argument:
:- meta_predicate sys_modfunc_site(?,?).
:- meta_predicate sys_modfunc_site(0,:).
The vanilla interpreter that can also deal with modules then reads as follows. Unfortunately a further predicate is needed, sys_indicator_colon/2, which compresses a qualified predicate indicator into an ordinary predicate indicator, since our predicate_property/2 does not understand generalized predicate indicators, for efficiency reasons:
solve((A,B)) :- !, solve(A), solve(B).
solve(A) :-
sys_modfunc_site(A,I),
sys_indicator_colon(J,I),
predicate_property(J, built_in), !, A.
solve(A) :- clause(A,B), solve(B).
The above implements a local semantics of the colon (:)/2, compared to the rather far-reaching semantics of the colon (:)/2 described in the ISO module standard. The far-reaching semantics imputes a module name on all the literals of a query; the local semantics only expects a qualified literal and applies the module name to that literal alone.
Jekejeke only implements the local semantics, with the further provision that the call-site is not changed. So under the hood, sys_modfunc_site/2 and sys_indicator_colon/2 also have to transfer the call-site, so that predicate_property/2 makes the right decision for unqualified predicates, i.e. resolving the predicate name by respecting imports etc.
Finally, a little epilogue: the call-site transfer of Jekejeke Prolog is a pure runtime thing and doesn't need any compile-time manipulation, in particular no ad hoc adding of module qualifiers at compile time. As a result, certain algebraic properties are preserved. For example, assume we have the following clause:
?- [user].
foo:bar.
^D
Then the following things work fine, since not only sys_modfunc_site/2
is bidirectional, but also sys_indicator_colon/2:
?- S = foo:bar/0, sys_indicator_colon(R,S), predicate_property(R,static).
S = foo:bar/0,
R = 'foo%bar'/0
?- predicate_property(R,static), sys_indicator_colon(R,S), S = foo:bar/0.
R = 'foo%bar'/0,
S = foo:bar/0
And of course predicate_property/2 works with different input and output modes. But I guess the SWI-Prolog phenomenon first has the issue that a bare variable is prefixed with the current module. And since false is not in user, but in system, it will not show false. In output mode it will not show predicates which are equal by resolution.
Check it out in SWI-Prolog:
?- predicate_property(X, built_in), write(X), nl, fail; true.
portray(_G2778)
ignore(_G2778)
...
?- predicate_property(user:X, built_in), write(X), nl, fail; true.
prolog_load_file(_G71,_G72)
portray(_G71)
...
?- predicate_property(system:X, built_in), write(X), nl, fail; true.
...
false
...
But even if the SWI-Prolog predicate_property/2 predicate allowed bare variables, i.e. output goals, we would see less commutativity in the far-reaching semantics than in the local semantics. In the far-reaching semantics, M:G means interpreting G inside the module M, i.e. respecting the imports of the module M, which might transpose the functor considerably.
The far-reaching semantics is the reason that user:false means system:false. On the other hand, in the local semantics, where M:G means M:G and nothing else, we have the algebraic property more often. In the local semantics, user:false would never mean system:false.
Bye

Exclamation operator?

I'm learning D and have seen a lot of code like this:
ushort x = to!ushort(args[1]);
I assume this casts args[1] to ushort, but what's the difference between this and cast(ushort)?
EDIT: And what other uses does the exclamation mark operator have?
In D,
to!ushort(args[1])
is shorthand for the template instantiation
to!(ushort)(args[1])
and is similar to
to<ushort>(args[1])
in languages like C++/Java/C#.
The exclamation point signals that it's not a regular argument, but a template argument.
The notation does not use angle brackets because those are ridiculously difficult to parse correctly for a compiler (they make the grammar very context-sensitive), which makes it that much more difficult to implement a correct compiler. See here for more info.
The only other use I know about is just the unary 'not' operation (e.g. false == !true)... I can't think of any other uses at the moment.
Regarding the cast:
cast(ushort) is an unchecked cast, so it won't throw an exception if the value is out of range.
to!ushort() is a checked cast, so it throws an exception if the value is out of range.
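For example (assuming args[1] holds the string "70000"), to!ushort("70000") throws because 70000 does not fit into 16 bits, whereas applying cast(ushort) to the integer 70000 silently truncates it to 4464.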
The exclamation mark here is not an operator, it is just a token part of the explicit template instantiation syntax (described in detail here).
std.conv.to (docs) is a function template for converting between arbitrary types. It is implemented entirely in the library and has no special support in the language. It has a broader and different scope compared to the cast operator.
The to template takes two type parameters: a "to" type and a "from" type, in that order. In your example, the template is explicitly instantiated with the single type argument ushort for the "to" parameter, and the second type argument string (assuming args comes from the first parameter to main) is automatically inferred from the regular function argument passed to the function (args[1]) as the "from" parameter.
The resulting function takes a string parameter and returns a ushort parsed from that string, or throws an exception if it failed. The cast operator will not attempt this kind of high-level conversion.
Note that if there is more than one explicit template parameter, or that parameter has more than one token in it (ushort is a single keyword token), you must wrap the template parameter list in parentheses:
ushort result;
result = to!(typeof(result))(args[1]);
In this example, typeof, (, result and ) are four separate tokens and the parentheses are thus required.
To answer your last question, the ! token is also used for the unary not operator, unrelated to template instantiations:
bool yes = true;
bool no = !yes; // 'no' is false
You already got two excellent answers from jA_cOp and Merhdad. I just want to answer the OP's question directly (what's the difference between this and cast(ushort)?): the difference is that cast(ushort)args[1] will not work (you cannot cast from a string to a ushort just like that), while the to!(type)(param) template knows what to do with the string and how to convert it to the primitive type.

What does the number 16 in the DLL symbol _FooBar@16 represent?

What does the number 16 in the DLL symbol _FooBar@16 represent?
It means that _FooBar is a __stdcall function that takes 16 bytes of parameters.
32-bit calling conventions on x86 are described here: http://blogs.msdn.com/oldnewthing/archive/2004/01/08/48616.aspx
This is general name mangling and depends upon the calling convention of the function.
The various calling conventions and name mangling applied to functions is documented as Argument Passing and Naming Conventions. You will have to click the individual links to see the exact mangling applied.
In your case, you have a __stdcall convention which uses the following naming convention:
An underscore (_) is prefixed to the name. The name is followed by the at sign (@) followed by the number of bytes (in decimal) in the argument list. Therefore, the function declared as int func( int a, double b ) is decorated as follows: _func@12
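Applying the same rule to the symbol in the question: _FooBar@16 indicates 16 bytes of arguments in total, so one possible (hypothetical) declaration would be int __stdcall FooBar(int a, int b, int c, int d), i.e. four 4-byte parameters on 32-bit x86.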