Are there any libraries for predicate argument context constraints? - wordnet

I'm trying to do word sense disambiguation with WordNet's verb frames, but they lack constraints on the predicate arguments, so there is no way to disambiguate using just the verb frames.
I could add the constraints manually, but I wanted to see if someone had done so already. Are there publicly available collections of verb frames marked with predicate argument context constraints? I'm using Java and GATE to call WordNet, if the programming language and environment matter.

Why is adding methods to a type different than adding a sub or an operator in perl6?

Making subs/procedures available for reuse is one core function of modules, and I would argue that it is the fundamental way a language becomes composable and therefore efficient with programmer time: if you create a type in your module, I can create my own module that adds a sub operating on your type. I do not have to extend your module to do that.
# your module
class Foo {
    has $.id;
    has $.name;
}

# my module
sub foo-str(Foo:D $f) is export {
    return "[{$f.id}-{$f.name}]";
}

# someone else using yours and mine together for profit
my $f = Foo.new(:id(1234), :name("brclge"));
say foo-str($f);
As seen in Overloading operators for a class, this composability of modules works equally well for operators, which makes sense to me since operators are just a kind of syntactic sugar for subs anyway (in my head at least). Note that the definition of such an operator does not cause any surprising change in the behavior of existing code; you need to import it into your code explicitly to get access to it, just like the sub above.
Given this, I find it very odd that we do not have a similar mechanism for methods (see e.g. the discussion at How do you add a method to an existing class in Perl 6?), especially since Perl 6 is such a method-happy language. If I want to extend the usage of an existing type, I would want to do that in the same style as the original module was written in. If there is a .is-prime on Int, it must be possible for me to add a .is-semi-prime as well, right?
I read the discussion at the link above, but don't quite buy the "action at a distance" argument: how is that different from me exporting another multi sub from a module? For example, the Rust way of making this a lexical change (trait + impl ... for) seems quite hygienic to me, and would be very much in line with the operator approach above.
More interesting (to me at least) than the technicalities is the question of language design: isn't the ability to provide new verbs (subs, operators, methods) for existing nouns (types) a core design goal for a language like Perl 6? If it is, why would it treat methods differently? And if it does treat them differently for a good reason, does that not mean we are using way too many non-composable methods where we should be using subs instead?
From a language design perspective, it all comes down to a simple question: which language are we speaking? In Perl 6, this is a question about which we always try to be very clear.
The notion of one's current language in Perl 6 is defined entirely in terms of lexical scope. Sub declarations are lexically scoped. When we import symbols from a module, including extra multi candidates, those are lexically scoped. When we perform language tweaks - such as introducing new operators - those are lexically scoped. Verbs in our current language - that is, subroutine calls - are those with a lexical definition. (Operators are simply sub calls with more interesting parsing.) Since lexical scopes are closed at the end of compile time, the compiler has a complete view of the current language. That's why sub calls to non-existent subs, or references to undeclared variables, are detected and reported at compile time, as well as some basic compile-time type checking; future Perl 6 versions are likely to extend the set of compile-time checks that can be expected. The current language is the static, early-bound, part of Perl 6.
By contrast, a method call is a verb to be interpreted in the target object's language. This is the dynamic, late-bound, part of Perl 6. While the most immediate result of that is the typical polymorphism found in various forms in implementations of OO, thanks to meta-programming even the manner in which a verb is interpreted is up for grabs. For example, a monitor will acquire a lock while it interprets the verb and release it afterwards. Other objects might have been constructed based on things other than Perl 6 code, and so the interpretation of a verb doesn't mean invoking code written as a Perl 6 method. Or the code might be somewhere over the network. Who knows? Well, certainly not the caller, and that's the point, and the power, and the risk, of late binding.
The Perl 6 answer to "I want to extend the range of verbs I can use with this object in my current language" is very simple: use language features that relate to extending the current language! There's even a special syntax, $obj.&foo, that allows for a verb foo to be defined in the current language - by writing a sub - and then invoked much as if it's a method on the object. However, the small syntactic distinction makes it clear to the reader - and to the compiler - what is going on, and which language is getting to define that verb.
Through the use of augment it is possible to extend the language defined by some type of objects. However, it's rarely the best way to do things, given that it will have global effect, and also scatter the definition of the language of the object.
Much of what we do in programming is about building languages. By that I don't mean new syntax; most of our new languages - even in a language as open to mutation as Perl 6 - are just nouns and verbs defined using standard language features. However, in any non-trivial program, we can't keep every detail of every language in mind at once. When I go to the restaurant and order a schnitzel, I don't know how the order will be transported to the kitchen, what the kitchen looks like, whether the schnitzel is hammered out, breaded, and cooked on demand, or just served from a (hopefully not too stale) cache of prepared schnitzels. The kitchen and I have just enough shared meaning to make the right kind of thing happen, but I don't know how they'll precisely react to my request and they need not know what I'll do in the meantime. This kind of thinking is acknowledged by OO itself - at least when we fully embrace it - and at a larger scale by concepts such as bounded contexts, as found in Domain Driven Design.
In summary, Perl 6 tries to help us keep our languages straight: to know what is in our current language, and what we express with only limited understanding. That distinction is encoded by the sub/method distinction, which also turns out to be a sensible place to hang a static/dynamic distinction too.

When is it okay to use switch statements in an OO language supporting polymorphism? [duplicate]

I recently learned that switch statements are considered bad in OOP, particularly from "Clean Code" (pp. 37-39) by Robert Martin.
But consider this scenario: I'm writing a game server that receives messages from clients, each containing an integer that indicates the player's action, such as move, attack, pick up item, etc.; there will be more than 30 different actions. When I write code to handle these messages, no matter what solution I think of, I end up needing a switch somewhere. What pattern should I use if not a switch statement?
A switch is like any other control structure. There are places where it's the best/cleanest solution, and many more places where it's completely inappropriate. It's just abused way more than other control structures.
In OO design, it's generally considered preferable in a situation like yours to use different message types/classes that inherit from a common message class, and then use overloaded methods to "automatically" differentiate between the different types.
In a case like yours, you could use an enumeration that maps to your action codes, then attach an attribute to each enumerated value that lets you use generics or type-building to construct the different Action subclass objects, so that method overloading will work.
But that's a real pain.
Evaluate whether there's a design option such as the enumeration that is feasible in your solution. If not, just use the switch.
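Here is one way that enumeration idea might look, sketched in Python (the action codes and class names are invented; the original answer presumably had C#/Java attributes and generics in mind):
from enum import Enum

class MoveAction:
    def perform(self):
        print("move")

class AttackAction:
    def perform(self):
        print("attack")

class ActionCode(Enum):
    # each enumerated value carries the Action subclass to instantiate
    MOVE = (1, MoveAction)
    ATTACK = (2, AttackAction)

    def __init__(self, code, action_cls):
        self.code = code
        self.action_cls = action_cls

    @classmethod
    def from_code(cls, code):
        return next(m for m in cls if m.code == code)

ActionCode.from_code(2).action_cls().perform()   # prints "attack"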
'Bad' switch statements are often those switching on object type (or something that could be an object type in another design); in other words, hardcoding something that might be better handled by polymorphism. Other kinds of switch statements may well be fine.
You will need a switch statement, but only one. When you receive the message, call a Factory object to return an object of the appropriate Message subclass (Move, Attack, etc.), then call a message->doit() method to do the work.
That means if you add more message types, only the factory object has to change.
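A minimal sketch of that idea in Python (the action codes and Message subclasses are made up for illustration):
class Message:
    def doit(self):
        raise NotImplementedError

class Move(Message):
    def doit(self):
        print("moving")

class Attack(Message):
    def doit(self):
        print("attacking")

def make_message(action_code):
    # the single remaining switch: the only place that sees the raw integer
    if action_code == 1:
        return Move()
    elif action_code == 2:
        return Attack()
    raise ValueError("unknown action code: %d" % action_code)

make_message(2).doit()   # prints "attacking"
Adding a new action then means adding a subclass and one branch here; nothing else changes.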
The Strategy pattern comes to mind.
The strategy pattern is intended to provide a means to define a family of algorithms, encapsulate each one as an object, and make them interchangeable. The strategy pattern lets the algorithms vary independently from clients that use them.
In this case, the "family of algorithms" are your different actions.
As for switch statements - in "Clean Code", Robert Martin says that he tries to limit himself to one switch statement per type. Not eliminate them altogether.
The reason is that switch statements do not adhere to the Open/Closed Principle (OCP).
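As a hedged sketch of the Strategy idea in Python (names invented), the "family of algorithms" becomes interchangeable objects or callables that are injected rather than switched on:
class Player:
    def __init__(self, action_strategy):
        # the interchangeable algorithm is injected, not selected by a switch
        self.action_strategy = action_strategy

    def act(self):
        self.action_strategy(self)

def move(player):
    print("player moves")

def attack(player):
    print("player attacks")

Player(move).act()     # prints "player moves"
Player(attack).act()   # prints "player attacks"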
I'd put the messages in an array and then match the incoming item to its key to select and display the right message.
From the design patterns perspective you can use the Command Pattern for your given scenario. (See http://en.wikipedia.org/wiki/Command_pattern).
If you find yourself repeatedly using switch statements in the OOP paradigm, it is an indication that your classes may not be well designed. With a proper design of super- and subclasses and a fair amount of polymorphism, the logic behind those switch statements can be handled by the subclasses.
For more information on how to remove these switch statements and introduce proper subclasses, I recommend reading the first chapter of Refactoring by Martin Fowler. You can also find similar slides here: http://www1.informatik.uni-wuerzburg.de/database/courses/pi2_ss03_dir/RefactoringExampleSlides.pdf (slide 44).
IMO switch statements are not bad, but should be avoided if possible. One solution would be to use a Map where the keys are the commands, and the values Command objects with an execute() method. Or a List if your commands are numeric and have no gaps.
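A minimal sketch of that Map idea in Python (a dict standing in for the Map; the command classes are invented):
class MoveCommand:
    def execute(self):
        print("moving")

class AttackCommand:
    def execute(self):
        print("attacking")

# keys are the command ids, values are Command objects with execute()
COMMANDS = {1: MoveCommand(), 2: AttackCommand()}

def handle(command_id):
    COMMANDS[command_id].execute()   # no switch at the call site

handle(1)   # prints "moving"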
However, you would usually still use switch statements when implementing design patterns; one example would be using a Chain of Responsibility pattern to handle commands given any command "id" or "value". (The Strategy pattern was also mentioned.) In your case, you might also look into the Command pattern.
Basically, in OOP, you'll try to use solutions other than relying on switch blocks, which belong to the procedural programming paradigm. However, when and how to use either is somewhat your decision. I personally often use switch blocks when implementing the Factory pattern, etc.
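The Chain of Responsibility mentioned above might look roughly like this in Python (a sketch with invented handlers): each handler either handles the command id or passes it along.
class Handler:
    def __init__(self, successor=None):
        self.successor = successor

    def handle(self, command_id):
        # default behavior: pass the command along the chain
        if self.successor:
            self.successor.handle(command_id)

class MoveHandler(Handler):
    def handle(self, command_id):
        if command_id == 1:
            print("move")
        else:
            super().handle(command_id)

class AttackHandler(Handler):
    def handle(self, command_id):
        if command_id == 2:
            print("attack")
        else:
            super().handle(command_id)

chain = MoveHandler(AttackHandler())
chain.handle(2)   # prints "attack"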
A definition of code organisation is:
a package is a group of classes with a coherent API (ex: the Collection API in many frameworks)
a class is a set of coherent functionalities (ex: a Math class)
a method is a functionality; it should do one thing and one thing only. (ex: adding an item to a list may require enlarging that list, in which case the add method will rely on other methods to do that rather than performing the operation itself, because that's not its contract.)
Therefore, if your switch statement performs different kinds of operations, you are "violating" that definition; whereas using a design pattern does not, as each operation is defined in its own class (its own set of functionalities).
I don't buy it. These OOP zealots seem to have machines with infinite RAM and amazing performance. Obviously with infinite RAM you don't have to worry about RAM fragmentation and the performance impact it has when you continuously create and destroy small helper classes. To paraphrase a quote from the 'Beautiful Code' book: "Every problem in Computer Science can be solved with another level of abstraction".
Use a switch if you need it. Compilers are pretty good at generating code for them.
Use commands. Wrap the action in an object and let polymorphism do the switch for you. In C++ (shared_ptr is essentially a pointer, or a reference in Java terms; it allows for dynamic dispatch):
void GameServer::perform_action(shared_ptr<Action> op) {
    op->execute();
}
Clients pick an action to perform, and once they do they send that action itself to the server so the server doesn't need to do any parsing:
void BlueClient::play() {
    shared_ptr<Action> a;
    if( should_move() )        a = make_shared<Move>(this, NORTHWEST);
    else if( should_attack() ) a = make_shared<Attack>(this, EAST);
    else                       a = make_shared<Wait>(this);
    server.perform_action(a);
}

AST representation of lambda functions

I'm developing an Abstract Syntax Tree meta-model for a Smalltalk, and right now I'm having trouble modeling blocks. They are a sort of literal, but on the other hand they are behavioral entities like methods. Blocks are essentially lambda functions, so maybe someone has established practice for working with them.
I'll be thankful for any advice.
The Refactoring Browser has a very nice AST, have a look at its implementation.
Regarding your question: The Refactoring Browser extracts the shared parts of blocks and methods into a separate node type called SequenceNode. The sequence node models the temps and the sequence of statements. The block node then wraps the sequence node, adds the arguments, and inherits the shared behaviour of value nodes. The method node wraps the sequence node and adds method name, arguments, pragmas, etc.
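In Python-like pseudostructure, the layout described above might be sketched as follows (the field names are my guesses, not the Refactoring Browser's actual API):
class SequenceNode:
    def __init__(self, temporaries, statements):
        self.temporaries = temporaries   # temp variable declarations
        self.statements = statements     # ordered list of statement nodes

class BlockNode:                         # behaves like a literal/value node
    def __init__(self, arguments, body):
        self.arguments = arguments       # block arguments
        self.body = body                 # a shared SequenceNode

class MethodNode:
    def __init__(self, selector, arguments, pragmas, body):
        self.selector = selector         # method name
        self.arguments = arguments
        self.pragmas = pragmas
        self.body = body                 # the same SequenceNode type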

Preserve Whole Object VS Don't Look For Things

I was reading Fowler's Refactoring Book and saw Preserve Whole Object. A different, newer opinion says that this refactoring is the exact opposite of what you should do: The Clean Code Talks - Don't Look For Things!.
Fowler does mention that you should look to see if the method can just be moved to the class which uses the large list of arguments. I think that would be the only reasonable alternative. This refactoring seems like a band-aid for a poorly defined method.
The Fowler source material is a bit dated. Is the prevailing wisdom to let this technique go the way of the dodo or is there actually a case when you'd want to do this kind of refactoring? Or have I misunderstood the test-driven style because those examples deal with object construction, not message sending?
There are many concepts in Object Oriented Design, such as patterns, principles and practices, that may seem either similar or contradictory at first. In fact, most of them are neither similar nor contradictory, and the thing that makes them both distinct and consistent is their intent.
The seeming contradiction between the Preserve Whole Object refactoring and the Service Locator pattern mentioned in The Clean Code Talks video occurs when they are treated as the same concept, although they are different in their intent and their essence.
The Preserve Whole Object refactoring is simply a technique used to make code easier to read, understand and maintain by reducing the number of arguments to a function. The Service Locator, on the other hand, is a design pattern used to manage dependencies between different components in a system using the Inversion of Control concept. Unlike the Preserve Whole Object refactoring, which has a local effect, applied to a small part of the system (a function), the Service Locator pattern has a global effect on the system and addresses a bigger architectural problem (dependency management).
When to use the Preserve Whole Object refactoring?
Use the Preserve Whole Object refactoring when you have two or more arguments to a function that are basically the properties of one object; pass the object instead.
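A small Python sketch of that rule (the names are invented):
class TempRange:
    def __init__(self, low, high):
        self.low = low
        self.high = high

# Before: the caller tears the object apart into separate arguments.
def within_range_before(low, high, temp):
    return low <= temp <= high

# After Preserve Whole Object: pass the object whose properties were used.
def within_range(temp_range, temp):
    return temp_range.low <= temp <= temp_range.high

room_range = TempRange(18, 24)
print(within_range(room_range, 21))   # True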
There is a similar concept called Parameter Object (aka Argument Object); see the Introduce Parameter Object refactoring. It states that if you have a group of parameters that are not the properties of one object but are conceptually related to each other or naturally go together, you should wrap them in a class of their own and pass an instance of that class instead. It is mainly used when sending a message to an object.
A quote from Clean Code, Chapter 3: Functions, Function Arguments, page 43 (Robert C. Martin):
Argument Objects
When a function seems to need more than two or three arguments, it is likely that some of those arguments ought to be wrapped into a class of their own. Consider, for example, the difference between the two following declarations:
Circle makeCircle(double x, double y, double radius);
Circle makeCircle(Point center, double radius);
Reducing the number of arguments by creating objects out of them may seem like cheating, but it's not. When groups of variables are passed together, the way x and y are in the example above, they are likely part of a concept that deserves a name of its own.
When to use the Service Locator pattern?
Use the Service Locator pattern when your class has dependencies that are not conceptually related and you do not want your class to depend on concrete implementations. Actually, this is when you would want to use any of the dependency management approaches. One alternative is the Dependency Injection approach, which explicitly specifies all the dependencies as separate arguments to the constructor, whereas the Service Locator passes all the dependencies in a single container object. In fact, it is this very similarity between the Service Locator pattern and the Preserve Whole Object refactoring, combining the arguments into a single object, that serves as a source of confusion. The dependency management techniques are mainly used in object construction.
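To make the contrast concrete, here is a hedged Python sketch (the class and service names are invented):
class Logger:
    def log(self, message):
        print(message)

class Database:
    def query(self, sql):
        return []

# Dependency Injection: every dependency is a separate, explicit argument.
class ReportService:
    def __init__(self, logger, database):
        self.logger = logger
        self.database = database

# Service Locator: all dependencies arrive in one container object.
class ServiceLocator:
    def __init__(self):
        self._services = {}

    def register(self, name, service):
        self._services[name] = service

    def get(self, name):
        return self._services[name]

class ReportServiceViaLocator:
    def __init__(self, locator):
        self.logger = locator.get("logger")
        self.database = locator.get("database")

locator = ServiceLocator()
locator.register("logger", Logger())
locator.register("database", Database())
reports = ReportServiceViaLocator(locator)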
There are pros and cons to both approaches of the Dependency Management which are discussed in the Inversion of Control Containers and the Dependency Injection pattern article by Martin Fowler.
When to use both?
Sometimes there will be situations where your class has two or more dependencies that are conceptually related, and you might want to combine them into a single object and pass it as a dependency using the Service Locator. So, as you can see, these two concepts are not mutually exclusive.
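For example (a hypothetical sketch in the same spirit as the one above), two related dependencies can be wrapped into one object and located as a single entry:
class SmtpTransport:
    def send(self, payload): pass

class JsonSerializer:
    def dumps(self, obj): return "{}"

# two conceptually related dependencies combined into one object
class MessagingConfig:
    def __init__(self, transport, serializer):
        self.transport = transport
        self.serializer = serializer

class ServiceLocator:
    def __init__(self, services):
        self._services = services

    def get(self, name):
        return self._services[name]

class MessagingService:
    def __init__(self, locator):
        self.config = locator.get("messaging_config")   # one lookup, related parts

locator = ServiceLocator({"messaging_config": MessagingConfig(SmtpTransport(), JsonSerializer())})
service = MessagingService(locator)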

What is open recursion?

What is open recursion? Is it specific to OOP?
(I came across this term in this tweet by Daniel Spiewak.)
Just quoting http://www.comlab.ox.ac.uk/people/ralf.hinze/talks/Open.pdf:
"Open recursion. Another handy feature offered by most languages with objects and classes is the ability for one method body to invoke another method of the same object via a special variable called self or, in some languages, this. The special behavior of self is that it is late-bound, allowing a method defined in one class to invoke another method that is defined later, in some subclass of the first."
This paper analyzes the possibility of adding OO to ML, with regard to expressivity and complexity. It has the following excerpt on objects, which seems to make the term relatively clear:
3.3. Objects
The simplest form of object is just a record of functions that share a common closure environment that carries the object state (we can call these simple objects). The function members of the record may or may not be defined as mutually recursive. However, if one wants to support inheritance with overriding, the structure of objects becomes more complicated. To enable open recursion, the call-graph of the method functions cannot be hard-wired, but needs to be implemented indirectly, via object self-reference. Object self-reference can be achieved either by construction, making each object a recursive, self-referential value (the fixed-point model), or dynamically, by passing the object as an extra argument on each method call (the self-application or self-passing model). In either case, we will call these self-referential objects.
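To make the self-application (self-passing) model concrete, here is a toy Python illustration (my own, not from the paper): the object is a record of functions, each receiving the object itself, so overriding one member changes what the others do.
def make_point(x):
    # a record of functions sharing access to the object state via 'self'
    return {
        "x": x,
        "get":    lambda self: self["x"],
        "double": lambda self: self["get"](self) * 2,   # open recursion
    }

def send(obj, name, *args):
    return obj[name](obj, *args)    # self-passing: the object is an extra argument

p = make_point(21)
print(send(p, "double"))            # 42

# Overriding "get" changes the behavior of "double" without touching it:
p["get"] = lambda self: self["x"] + 1
print(send(p, "double"))            # 44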
The name "open recursion" is a bit misleading at first, because it has nothing to do with the recursion that normally is used (a function calling itself); and to that extent, there is no closed recursion.
It basically means, that a thing is referring to itself. I can only guess, but I do think that the term "open" comes from open as in "open for extension".
In that sense an object is open to extension, but still referring to itself.
Perhaps a small example can shed some light on the concept.
Imagine you write a Python class like this one:
class SuperClass:
    def method1(self):
        self.method2()   # late-bound: resolved against the actual object's class

    def method2(self):
        print("SuperClass")
If you run this:
s = SuperClass()
s.method1()
It will print "SuperClass".
Now we create a subclass from SuperClass and override method2:
class SubClass(SuperClass):
    def method2(self):
        print("SubClass")
and run it:
sub = SubClass()
sub.method1()
Now "SubClass" will be printed.
Still, we only call method1() as before. Inside method1(), method2() is called, but both are bound to the same reference (self in Python, this in Java). When SuperClass is subclassed, method2() is overridden, which means that an object of SubClass refers to a different version of this method.
That is open recursion.
In most cases, you override methods and call the overridden methods directly. The scheme here instead uses an indirection through the self-reference.
P.S.: I don't think this was invented so much as discovered and then explained.
Open recursion allows one method of an object to invoke other methods of the same object from within, through a special variable like this or self.
In short, open recursion is about something not actually specific to OOP, but more general.
The relation to OOP comes from the fact that many typical "OOP" languages have this property, but it is essentially not tied to any distinguishing feature of OOP.
So there are different meanings, even in the same "OOP" language. I will illustrate this later.
Etymology
As mentioned here, the term was likely coined in the famous TAPL (Types and Programming Languages) by Benjamin C. Pierce, which illustrates the meaning using concrete OOP languages.
TAPL does not define "open recursion" formally. Instead, it points out the "special behavior of self (or this) is that it is late-bound, allowing a method defined in one class to invoke another method that is defined later, in some subclass of the first".
Nevertheless, neither "open" nor "recursion" comes from the OOP basis of a language. (Actually, it also has nothing to do with static types.) So the interpretation (or the informal definition, if any) in that source is overspecified in nature.
Ambiguity
The mention in TAPL clearly shows that the "recursion" is about "method invocation". However, it is not that simple in real languages, which usually do not have primitive semantic rules for the recursive invocation itself. Real languages (including the ones considered OOP languages) usually specify the semantics of such invocation through the notation of method calls. As syntactic devices, such calls are subject to the evaluation of some kind of expression, which relies on the evaluation of its subexpressions. These evaluations imply the resolution of the method name under some independent rules. Specifically, such rules are about name resolution, i.e. determining the denotation of a name (typically a symbol, an identifier, or some "qualified" name expression) in the subexpression. Name resolution usually respects scoping rules.
OTOH, the "late-bound" property emphasizes how the target implementation of the named method is found. This is a shortcut in the evaluation of specific call expressions, but it is not general enough, because entities other than methods can also have such "special" behavior, which can even make the behavior not special at all.
A notable ambiguity comes from this insufficient treatment: what does a "binding" mean? Traditionally, a binding can be modeled as a pair of a (scoped) name and its bound value, i.e. a variable binding. In the special treatment of "late-bound" entities, the set of allowed entities is smaller: methods instead of all named entities. Besides considerably undermining the abstraction power of the language rules at the meta level (in the language specification), this does not remove the need for the traditional meaning of a binding (because there are other, non-method entities), and is hence confusing. The use of "late-bound" is at least an instance of bad naming. Instead of "binding", a more proper word would be "dispatching".
Worse, the use in TAPL directly mixes the two meanings when dealing with "recursion". The "recursion" behavior is all about finding the entity denoted by some name, and is not specific to method invocation (even in those OOP languages).
The title of the chapter (Case Study: Imperative Objects) also suggests some inconsistency. Obviously, the so-called late binding of method invocation has nothing to do with imperative state, because the resolution of the dispatching does not require mutable metadata for the invocation. (In the popular implementation approach, the virtual method table need not be modifiable.)
Openness
The use of "open" here looks like mimic to open (lambda) terms. An open term has some names not bound yet, so the reduction of such a term must do some name resolution (to compute the value of the expression), or the term is not normalized (never terminate in evaluation). There is no difference between "late" or "early" for the original calculi because they are pure, and they have the Church-Rosser property, so whether "late" or not does not alter the result (if it is normalized).
This is not the same in the language with potentially different paths of dispatching. Even that the implicit evaluation implied by the dispatching itself is pure, it is sensitive to the order among other evaluations with side effects which may have dependency on the concrete invocation target (for example, one overrider may mutate some global state while another can not). Of course in a strictly pure language there can be no observable differences even for any radically different invocation targets, a language rules all of them out is just useless.
Then there is another problem: why is this OOP-specific (as in TAPL)? Given that the openness qualifies "binding" rather than "dispatching of method invocation", there are certainly other ways to get the openness.
One notable instance is the evaluation of a procedure body in traditional Lisp dialects. There can be unbound symbols in the body, and they are only resolved when the procedure is called (rather than when it is defined). Since Lisps are significant in PL history and close to the lambda calculi, attributing "open" specifically to OOP languages (instead of Lisps) is rather strange from the PL tradition. (This is also a case of the "making them not special at all" mentioned above: every name in a function body is just "open" by default.)
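Python, incidentally, behaves the same way, which makes the point easy to demonstrate:
def greet():
    return helper()   # 'helper' is unbound ("open") when greet is defined

def helper():
    return "hello"

print(greet())        # works: 'helper' is resolved at call time, not definition time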
It is also arguable that the OOP style of self/this parameter is equivalent to the result of a closure conversion of the (implicit) environment of the procedure. It is questionable to treat such features as primitive in the language semantics.
(It may also be worth noting that the special treatment of function calls, separate from symbol resolution in other expressions, was pioneered by Lisp-2 dialects, not by any typical OOP language.)
More cases
As mentioned above, different meanings of "open recursion" may coexist in the same "OOP" language.
C++ is the first instance here, because there are sufficient reasons for them to coexist.
In C++, name resolution is entirely static; normatively, it is name lookup. The rules of name lookup vary across scopes. Most of them are consistent with the identifier lookup rules in C (except that C allows implicit declarations while C++ does not): you must first declare a name, and only then can the name be looked up later (lexically) in the source code; otherwise the program is ill-formed (and the language implementation is required to issue an error). The strictness of this dependency between names is considerably "closed": there is no later chance to recover from the error, so you cannot directly have names mutually referencing each other across different declarations.
To work around the limitation, there can be additional declarations whose sole duty is to break the cyclic dependency. Such declarations are called "forward" declarations. Using forward declarations still does not require "open" recursion, because every well-formed use must statically see a previous declaration of that name, so no name lookup requires additional "late" binding.
However, C++ classes have special name lookup rules: some entities in class scope can be referred to in contexts prior to their declaration. This makes mutually recursive use of names across different declarations possible without any additional "forward" declarations to break the cycle. This is exactly the "open recursion" in the TAPL sense, except that it is not about method invocation.
Moreover, C++ does have "open recursion" as per the description in TAPL: the this pointer and virtual functions. The rules that determine the target (the final overrider) of a virtual function are independent of the name lookup rules. A non-static member defined in a derived class generally just hides entities with the same name in the base classes. The dispatching rules kick in only on virtual function calls, after the name lookup (the order is guaranteed since the evaluation of C++ function calls is strict, i.e. applicative). It is also easy to re-introduce a base-class name with a using-declaration without worrying about the kind of the entity.
Such a design can be seen as an instance of separation of concerns. The name lookup rules allow generic static analysis in the language implementation without special treatment of function calls.
OTOH, Java has more complex rules that mix name lookup with other rules, including those identifying overriders. Name shadowing in Java subclasses is specific to the kind of entity, and it is more complicated to distinguish overriding from overloading/shadowing/hiding/obscuring for the different kinds. There is also no analogue of C++'s using-declarations in the definition of subclasses. Such complexity does not make Java more or less "OOP" than C++, anyway.
Other consequences
Collapsing the bindings of name resolution and the dispatching of method invocation leads not only to ambiguity, complexity and confusion, but also to more difficulties at the meta level. Here "meta" refers to the fact that name bindings can expose properties available not only in the source language semantics, but also in the meta languages: either the formal semantics of the language or its implementation (say, the code implementing an interpreter or a compiler).
For example, as in traditional Lisps, binding time can be distinguished from evaluation time, because program properties revealed at binding time (value bindings in the immediate contexts) are closer to meta-properties than evaluation-time properties (like the concrete values of arbitrary objects). An optimizing compiler can deploy code generation based on binding-time analysis either statically at compile time (when the body is to be evaluated more than once) or deferred to runtime (when compilation is too expensive). There is no such option in languages that blindly assume all resolutions in closed recursion are faster than open ones (and that even make them syntactically different in the first place). In this sense, OOP-specific open recursion is not just less handy than advertised in TAPL; it is a premature optimization, giving up metacompilation too early, not in the language implementation, but in the language design.