Why are Associations Magnitudes in Smalltalk? - smalltalk

I haven't checked many dialects yet (in Pharo Association is a subclass of LookupKey, which is a subclass of Magnitude) but I presume this is fairly common.
Isn't this definition counterintuitive? Associations usually take part in unordered collections, and I don't think a Smalltalker ever takes into account that they could be sent #<=. What I would like to know is whether this is something we inherited from old implementations of Smalltalk and never bothered to challenge, or whether it is just me who is missing something. Bottom line: has anyone ever used this feature?

I don't think that Dictionary needs that; all it needs are = and hash.
However, you often want to get a list of associations and sort them later (e.g. to show them in some sorted list). Then it is nice to have an order defined already.
And the cost is only a "<" method in Association (or LookupKey, if that is the superclass), so it comes almost for free by inheriting from Magnitude instead of Object.
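The same idea can be sketched outside Smalltalk. The Python class below is a hypothetical analogue, not Pharo's Association: it defines only equality and "<" on the key and lets functools.total_ordering derive the remaining comparisons, much as Magnitude does. Comparing by key only is a simplifying assumption made for this sketch.

```python
from functools import total_ordering


@total_ordering
class Association:
    """Hypothetical key/value pair that is ordered by its key only."""

    def __init__(self, key, value):
        self.key = key
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Association):
            return NotImplemented
        return self.key == other.key

    def __lt__(self, other):
        # The single comparison we write by hand; <=, > and >= are derived.
        if not isinstance(other, Association):
            return NotImplemented
        return self.key < other.key

    def __hash__(self):
        # Kept consistent with __eq__, which is all a dictionary needs.
        return hash(self.key)

    def __repr__(self):
        return f"{self.key!r}->{self.value!r}"


pairs = [Association("pear", 3), Association("apple", 1), Association("fig", 2)]
sorted_pairs = sorted(pairs)  # no key function needed: the order is built in
print(sorted_pairs)
```

Sorting a list of pairs then needs no extra comparator, which is the convenience the answer describes.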

Related

Cocoa. Object equality and hashing clarification

I'm currently studying Cocoa collections, and my research has brought me to Mike Ash's post on object equality and hashing.
Here's an excerpt from the post:
Because of the semantics of hash, if you override isEqual: then you must override hash. If you don't, then you risk having two objects which are equal but which don't have the same hash. If you use these objects in a dictionary, set, or something else which uses a hash table, then hilarity will ensue.
Unfortunately, the author doesn't go into detail about what kind of hilarity will ensue, and my curiosity won't let me leave it without digging deeper. So the question is: what exactly will happen if I have two equal objects with different hash values and I put them into one collection? What sort of problem will I run into?
The answer is in this section from Mike's post
A hash table is basically a big array with special indexing. Objects are placed into an array with an index that corresponds to their hash. The hash is essentially a pseudorandom number generated from the object's properties. The idea is to make the index random enough to make it unlikely for two objects to have the same hash, but have it be fully reproducible. When an object is inserted, the hash is used to determine where it goes. When an object is looked up, its hash is used to determine where to look.
In more formal terms, the hash of an object is defined such that two objects have an identical hash if they are equal. Note that the reverse is not true, and can't be: two objects can have an identical hash and not be equal. You want to try to avoid this as much as possible, because when two unequal objects have the same hash (called a collision) then the hash table has to take special measures to handle this, which is slow. However, it's provably impossible to avoid it completely.
What it means is that you will have your 2 objects which claim to be equal. You add the first as the key in a dictionary with some value. Then you try to extract that value using the other object as the key. And it doesn't work. It should, because your objects are equal. But the initial hash lookup failed.
To be clear, this might not happen. It might work fine for some objects and fail for others. The point is, if you don't implement both methods, you don't know what's going to happen.
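The failure described above can be reproduced in a few lines. This is a Python sketch rather than Cocoa code, and BrokenKey and FixedKey are hypothetical classes invented for the illustration: the first breaks the rule on purpose by hashing on a per-instance serial number, the second derives its hash from the same field that equality uses.

```python
import itertools

_serial = itertools.count()


class BrokenKey:
    """Equal by name, but hashed by a per-instance serial (deliberately wrong)."""

    def __init__(self, name):
        self.name = name
        self._serial = next(_serial)

    def __eq__(self, other):
        return isinstance(other, BrokenKey) and self.name == other.name

    def __hash__(self):
        return self._serial  # equal objects end up with different hashes


class FixedKey:
    """Equal by name and hashed by name, so equal objects share a hash."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, FixedKey) and self.name == other.name

    def __hash__(self):
        return hash(self.name)


a, b = BrokenKey("calculus"), BrokenKey("calculus")
broken_table = {a: "room 101"}
broken_found = broken_table.get(b)  # looked up via b's hash, so a's entry is missed

c, d = FixedKey("calculus"), FixedKey("calculus")
fixed_table = {c: "room 101"}
fixed_found = fixed_table.get(d)  # same hash, then == confirms the match

print(a == b, broken_found, fixed_found)
```

With the broken class, a set built from a and b would also keep both objects even though they compare equal.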
Putting aside the desire to know "why", you should just look at Apple's documentation.
http://developer.apple.com/library/mac/#documentation/Cocoa/Reference/Foundation/Protocols/NSObject_Protocol/Reference/NSObject.html%23//apple_ref/occ/intfm/NSObject/isKindOfClass:
If two objects are equal, they must have the same hash value.
All other discussion is interesting from an academic perspective, but fundamentally, whether you agree with Apple's rules or not, you must abide by them if you want to use the Foundation frameworks.
What Mike and the above poster say seems to be true for the current incarnation of NSDictionary, but there is no guarantee that the same implementation will remain in place in future releases. However, whatever Apple might replace it with, it will (probably) retain all of the same guarantees and restrictions.

Transferring items from NSArray to NSSet for NSManagedObject

Two related questions:
When you use [NSSet setWithArray:], does it remove duplicate object for you automatically?
How can you tell NSSet exactly what you want "duplicate" to mean? I.e. if you have a bunch of "College course" objects, each with a name and section number, and you wanted to transfer to an NSSet, keeping only one of each college course for a given name (for example, if you had three sections of Calculus, how would you tell it to only keep one section of calculus, even if their section numbers are different, so they're not perceived as identical by default).
Thanks! Let me know if that question was unclear at all. I was having trouble figuring out a way to word it.
Edit: This question is specific to NSManagedObjects, whose isEqual: method cannot be overridden.
From the documentation:
If the same object appears more than once in array, it is added only once to the returned set.
Equality is determined here, as throughout Cocoa, with the -isEqual: method (and the -hash method). If you want two custom objects to be considered equal, you should override these appropriately, and you must override both. These are generally used so that objects that really are equivalent and generally interchangeable (but are separate objects) can be seen as such. In your example, it sounds like the college course objects really are "different" (i.e., they represent different classes, even if they might share the same overall "calculus" topic), so it seems problematic to call those object instances "equal" if this is a large-scale project/code base. In your case, you might consider adding the objects to the set one by one and doing your own comparisons as you add, to make sure you get one of each "topic".
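The "add one by one with your own comparison" suggestion can be sketched as follows. This is Python rather than Objective-C, and Course and unique_by are hypothetical names made up for the illustration. The point is that the objects' own equality is left alone; the caller supplies the notion of "duplicate" as a key function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Course:
    """Hypothetical course record; equality still includes the section."""
    name: str
    section: int


def unique_by(items, key):
    """Keep the first item seen for each distinct key(item)."""
    kept = {}
    for item in items:
        kept.setdefault(key(item), item)  # later duplicates are ignored
    return list(kept.values())


courses = [
    Course("Calculus", 1),
    Course("Calculus", 2),
    Course("Physics", 1),
    Course("Calculus", 3),
]
one_per_name = unique_by(courses, key=lambda c: c.name)
print(one_per_name)
```

A plain set of the same objects would keep all four, because the sections differ.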

OOP - How to choose a possible object candidate?

I'm wondering which techniques I should use to choose the right objects in OOP.
Is there any must-read book about OOP in terms of how to choose objects?
Best,
Just write something that gets the job done, even if it's ugly, then refactor continuously:
eliminate duplicate code (don't repeat yourself)
increase cohesion
reduce coupling
But:
don't over-engineer; keep it simple
don't write stuff you ain't gonna need
It's not a precise recipe, just some general guidelines. Keep practicing.
P.S.
Code objects are not related to tangible real-life objects; they are just constructs that hold related information together.
Don't believe what the Java books/schools teach about objects; they're lying.
You probably mean "the right class", rather than "the right object". :-)
There are a few techniques, such as text analysis (a.k.a. underlining the nouns) and Class Responsibility Collaborator (CRC).
With "underlining the nouns", you basically start with a written, natural language (i.e. plain English) description of the problem you want to solve and underline the nouns. That gives you a list of candidate classes. You will need to perform several passes to refine it into a list of classes to implement.
For CRC, check out the Wikipedia.
I suggest The OPEN Toolbox of Techniques for full reference.
Hope it helps.
I am assuming an understanding of what a struct, type, class, set, state, alphabet, scalar, vector, and relationship are.
Object is a noun, method is a verb. Object members can represent identity, state or scalar value per field. Relationships between objects usually are represented with references, where references are members of objects. In cases, when relationships are complex, multidirectional, have arity greater than 2, represent some sort of grouping or containment, then relationships can be expressed as objects.
For other, broader technical reasons objects are most likely the only way to represent any form of information in OOP languages.
I am adding a second answer due to demian's comment:
Sometimes the class is so obvious because it's tangible, but other times the concept of the object is too abstract, like a db connector.
That is true. My preferred approach is to perform a behavioural analysis of the system (using use cases, for example), and then derive system operations. Once you have a stable list of system operations (such as PrintDocument, SaveDocument, SpellCheck, MergeMail, etc. for a word processor), you need to assign each of them to a class. If you have developed a list of candidate classes with some of the techniques that I mentioned earlier, you will be able to allocate some of the operations. But some will remain unallocated. These signal the need for more abstract or unintuitive classes, which you will have to make up using your good judgment.
The whole method is documented in a white paper at www.openmetis.com.
You should check out Domain-Driven Design, by Eric Evans. It provides very useful concepts for thinking about the objects in your model, what their functions are in the domain, and how they could be organized to work together. It's not a cookbook, and probably not a beginner book, but I have read it at different stages of my career, and every time I found something valuable in it.

Naming a dictionary structure that stores keys in a predictable order?

Note: Although my particular context is Objective-C, my question actually transcends programming language choice. Also, I tagged it as "subjective" since someone is bound to complain otherwise, but I personally think it's almost entirely objective. Also, I'm aware of this related SO question, but since this was a bigger issue, I thought it better to make this a separate question. Please don't criticize the question without reading and understanding it fully. Thanks!
Most of us are familiar with the dictionary abstract data type that stores key-value associations, whether we call it a map, dictionary, associative array, hash, etc. depending on our language of choice. A simple definition of a dictionary can be summarized by three properties:
Values are accessed by key (as opposed to by index, like an array).
Each key is associated with a value.
Each key must be unique.
Any other properties are arguably conveniences or specializations for a particular purpose. For example, some languages (especially scripting languages such as PHP and Python) blur the line between dictionaries and arrays and do provide ordering for dictionaries. As useful as this can be, such additions are not a fundamental characteristic of a dictionary. In a pure sense, the actual implementation details of a dictionary are irrelevant.
For my question, the most important observation is that the order in which keys are enumerated is not defined — a dictionary may provide keys in whatever order it finds most convenient, and it is up to the client to organize them as desired.
I've created custom dictionaries that impose specific key orderings, including natural sorted order (based on object comparisons) and insertion order. It's obvious to name the former some variant on SortedDictionary (which I've actually already implemented), but the latter is more problematic. I've seen LinkedHashMap and LinkedMap (Java), OrderedDictionary (.NET), OrderedDictionary (Flash), OrderedDict (Python), and OrderedDictionary (Objective-C). Some of these are more mature, some are more proof-of-concept.
LinkedHashMap is named according to implementation, in the tradition of Java collections: "linked" because it uses a doubly-linked list to track insertion order, and "hash" because it subclasses HashMap. Besides the fact that the user shouldn't need to worry about that, the class name doesn't really even indicate what it does. Using "ordered" seems like the consensus among existing code, but web searches on this topic also revealed understandable confusion between "ordered" and "sorted", and I feel the same. The .NET implementation even has a comment about the apparent misnomer, and suggests that it should be "IndexedDictionary" instead, owing to the fact that you can retrieve and insert objects at a specific point in the ordering.
I'm designing a framework and APIs and I want to name the class as intelligently as possible. From my standpoint, indexed would probably work (depending on how people interpret it, and based on the advertised functionality of the dictionary), ordered is imprecise and has too much potential for confusion, and linked "is right out" (apologies to Monty Python). ;-)
As a user, what name would make the most sense to you? Is there a particular name that says exactly what the class does? (I'm not averse to using slightly longer names like InsertionOrderDictionary if appropriate.)
Edit: Another strong possibility (discussed in my answer below) is IndexedDictionary. I don't really like "insertion order" because it doesn't make sense if you allow the user to insert keys at a specific index, reorder the keys, etc.
I vote OrderedDictionary, for the following reasons:
"Indexed" is never used in Cocoa class names, and "index" almost always appears as a noun (NSIndexSet, NSIndexPath, objectAtIndex:, etc.). There is only one instance where it appears as a verb, which is NSPropertyDescription's "indexed" property: isIndexed and setIndexed. NSPropertyDescription is roughly analogous to a table column in a database, where "indexing" refers to optimizing to speed up search times. Since NSPropertyDescription is part of the Core Data framework, it makes sense that "isIndexed" and "setIndexed" would correspond to an index in a SQL database. Therefore, to call it "IndexedDictionary" would seem redundant, since indices in databases are created to speed up lookup time, but a dictionary already has O(1) lookup time. However, to call it "IndexDictionary" would also be a misnomer, since an "index" in Cocoa refers to position, not order. The two are semantically different.
I understand your concern over "OrderedDictionary", but the precedent has already been set in Cocoa. When users want to maintain a specific sequence, they use "ordered": -[NSApplication orderedDocuments], -[NSWindow orderedIndex], -[NSApplication orderedWindows], etc. So, John Pirie has mostly the right idea.
However, you don't want to make insertion into the dictionary a burden on your users. They'll want to create a dictionary once and then have it maintain an appropriate order. They won't even want to request objects in a specific order. Order specification should be done during initialization.
Therefore, I recommend making OrderedDictionary a class cluster, with private subclasses InsertionOrderDictionary, NaturalOrderDictionary, and CustomOrderDictionary. Then, the user simply creates an OrderedDictionary like so:
OrderedDictionary * dict = [[OrderedDictionary alloc] initWithOrder:kInsertionOrder];
//or kNaturalOrder, etc
For a CustomOrderDictionary, you could have them give you a comparison selector, or even (if they're running 10.6) a block. I think this would provide the most flexibility for future expansion while still maintaining an appropriate name.
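To make the "choose the order at initialization" idea concrete, here is a minimal Python sketch. It is not a class cluster and not the Objective-C API proposed above: the order names and the key parameter are assumptions made for the illustration, with a key function standing in for the comparison selector or block.

```python
class OrderedDictionary:
    """Hypothetical dictionary whose key order is fixed at construction."""

    def __init__(self, order="insertion", key=None):
        if order not in ("insertion", "natural", "custom"):
            raise ValueError(f"unknown order: {order!r}")
        if order == "custom" and key is None:
            raise ValueError("custom order needs a key function")
        self._order = order
        self._key = key
        self._items = {}  # Python dicts remember insertion order (3.7+)

    def __setitem__(self, k, value):
        self._items[k] = value

    def __getitem__(self, k):
        return self._items[k]

    def keys(self):
        """Return the keys in the order chosen at construction."""
        if self._order == "insertion":
            return list(self._items)
        if self._order == "natural":
            return sorted(self._items)
        return sorted(self._items, key=self._key)


def filled(order, key=None):
    """Build a small example dictionary with the given ordering policy."""
    d = OrderedDictionary(order, key=key)
    for word in ["pear", "fig", "apple"]:
        d[word] = len(word)
    return d


insertion_keys = filled("insertion").keys()
natural_keys = filled("natural").keys()
custom_keys = filled("custom", key=len).keys()
print(insertion_keys, natural_keys, custom_keys)
```

Callers never pass an order when reading; they only ask for keys(), which matches the goal of specifying the order once, up front.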
I vote for InsertionOrderDictionary. You nailed it.
Strong vote for OrderedDictionary.
The word "ordered" means exactly what you are advertising: that in iterating through a list of items, there is a defined order to selection of those items. "Indexed" is an implementation word -- it talks more to how the ordering is achieved. Index, linked list, tree... the user doesn't care; that aspect of the data structure should be hidden. "Ordered" is the exact word for the additional feature you are offering, regardless of how you get it done.
Further, it seems like the choice of ordering could be at the user's option. Any reason why you couldn't create methods on your datatype that allow the user to switch from, say, alphabetical ordering to insertion-time ordering? In the default case, a user would choose a particular ordering and stick with it, in which case implementation would be no less efficient than if you created specialized subclasses for each ordering method. And in some less-used cases, the developer might actually wish to use any of a number of different orderings for the same data, depending on app context. (I can think of specific projects I've worked on where I would have loved to have such a data structure available.)
Call it OrderedDictionary, because that's precisely what it is. (Frankly, I have more of a problem with the use of the word "Dictionary", because that word heavily implies ordering, where popular implementations of such don't provide it, but that's my pet peeve. You really should just be able to say "Dictionary" and know that the ordering is alphabetical -- because that's what a dictionary IS -- but that argument is too late for existing implementations in the popular languages.) And allow the user to access in what order he chooses.
Since posting this question, I'm starting to lean towards something like IndexedDictionary or IndexableDictionary. While it is useful to be able to maintain arbitrary key ordering, limiting that to insertion ordering only seems like a needless restriction. Plus, my class already supports indexOfKey: and keyAtIndex:, which are (purposefully) analogous to NSArray's indexOfObject: and objectAtIndex:. I'm strongly considering adding insertObject:forKey:atIndex:, which matches up with NSMutableArray's insertObject:atIndex:.
Everyone knows that inserting in the middle of an array is inefficient, but that doesn't mean we shouldn't be allowed to on the rare occasions that it's truly useful. (Besides, the implementation could secretly use a doubly-linked list or any other suitable structure for tracking the ordering if needed...)
The big question: is "indexed" or "indexable" as vague or potentially confusing as "ordered"? Would people think of database indexes, or book indexes, etc.? Would it be detrimental if they assumed it was implemented with an array, or might that simplify user understanding of the functionality?
Edit: This name makes even more sense given the fact that I'm considering adding methods that work with an NSIndexSet in the future. (NSArray has -objectsAtIndexes: as well as methods for adding/removing observers for objects at given indexes.)
What about KeyedArray?
As you said in your last paragraph, I think that InsertionOrder(ed)Dict(ionary) is pretty unambiguous; I don't see how it could be interpreted in any way other than that the keys would be returned in the order they were inserted.
By decoupling the indexed order from the insertion order, doesn't this simply boil down to keeping an array and Dictionary in a single object? I guess my vote for this type of object is IndexedKeyDictionary
In C#:
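using System.Collections.Generic;

public class IndexedKeyDictionary<TKey, TValue> {
    // Keys in their indexed order; values are looked up through the dictionary.
    private readonly List<TKey> _keys = new List<TKey>();
    private readonly Dictionary<TKey, TValue> _dictionary = new Dictionary<TKey, TValue>();
    // ...
    public TValue GetValueAtIndex(int index) {
        return _dictionary[_keys[index]];
    }
    public void Insert(TKey key, TValue val, int index) {
        // Add throws if the key already exists; index must be in 0.._keys.Count.
        _dictionary.Add(key, val);
        // List<T>.Insert shifts later keys to make room instead of overwriting one.
        _keys.Insert(index, key);
    }
    public void SwapKeyIndexes(TKey k1, TKey k2) {
        // swap the positions of k1 and k2, assuming both exist in _keys
        int i1 = _keys.IndexOf(k1);
        int i2 = _keys.IndexOf(k2);
        TKey tmp = _keys[i1];
        _keys[i1] = _keys[i2];
        _keys[i2] = tmp;
    }
}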
What would be really cool is indexed values...so we have a way to sort the values and get the new key order. Like if the values were graph coordinates, and we could read the keys (bin names) as we move up/down along the coordinate plane. What would you call that data structure? An IndexedValueDictionary?
At first glance I'm with the first reply -- InsertionOrderDictionary, though it's a bit ambiguous as to what "InsertionOrder" means at first glance.
What you're describing sounds to me almost exactly like a C++ STL map. From what I understand, a map is a dictionary that has additional rules, including ordering. The STL simply calls it "map", which I think is fairly apt. The trick with map is you can't really give the inheritance a nod without making it redundant -- i.e. "MapDictionary". That's just too redundant. "Map" is a bit too basic and leaves a lot of room for misinterpretation.
Though "CHMap" might not be a bad choice after looking at your documentation link.
Maybe "CHMappedDictionary"? =)
Best of luck.
Edit: Thanks for the clarification, you learn something new every day. =)
Is the only difference that allKeys returns keys in a specific order? If so, I would simply add allKeysSorted and allKeysOrderedByInsertion methods to the standard NSDictionary API.
What is the goal of this insertion order dictionary? What benefits does it give the programmer vs. an array?

What do you call a method of an object that changes its class?

Let's say you have a Person object and it has a method on it, promote(), that transforms it into a Captain object. What do you call this type of method/interaction?
It also feels like an inversion of:
myCaptain = new Captain(myPerson);
Edit: Thanks for all the replies. The reason I'm coming across this pattern (in Perl, but relevant anywhere) is purely for convenience. Without knowing any implementation details, you could say the Captain class "has a" Person (I realize this may not be the best example, but be assured it isn't a subclass).
Implementation I assumed:
// this definition only matches example A
Person.promote() {
    return new Captain(this)
}
personable = new Person;
// A. this is what i'm actually coding
myCaptain = personable.promote();
// B. this is what my original post was implying
personable.promote(); // is magically now a captain?
So, literally, it's just a convenience method for the construction of a Captain. I was merely wondering if this pattern has been seen in the wild and if it had a name. And I guess yeah, it doesn't really change the class so much as it returns a different one. But it theoretically could, since I don't really care about the original.
Ken++, I like how you point out a use case. Sometimes it really would be awesome to change something in place, in say, a memory sensitive environment.
A method of an object shouldn't change its class. You should either have a member which returns a new instance:
myCaptain = myPerson->ToCaptain();
Or use a constructor, as in your example:
myCaptain = new Captain(myPerson);
I would call it a conversion, or even a cast, depending on how you use the object. If you have a value object:
Person person;
You can use the constructor method to implicitly cast:
Captain captain = person;
(This is assuming C++.)
A simpler solution might be making rank a property of person. I don't know your data structure or requirements, but if you need to do something that breaks the basics of a language, it's likely that there is a better way to do it.
You might want to consider the "State Pattern", also sometimes called the "Objects for States" pattern. It is defined in the book Design Patterns, but you could easily find a lot about it on Google.
A characteristic of the pattern is that "the object will appear to change its class."
Here are some links:
Objects for States
Pattern: State
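As a rough illustration of the State pattern mentioned above, here is a minimal Python sketch. The class names (CivilianRank, CaptainRank) are hypothetical and not taken from the question's code. The Person object keeps its class and identity; only the state object it delegates to is swapped, which is why it "appears to change its class".

```python
class CivilianRank:
    """State object used before promotion."""

    def describe(self, name):
        return f"{name}, civilian"


class CaptainRank:
    """State object used after promotion."""

    def describe(self, name):
        return f"Captain {name}"


class Person:
    def __init__(self, name):
        self.name = name
        self._rank = CivilianRank()  # current state

    def promote(self):
        # Swap the state object; the Person instance itself is unchanged.
        self._rank = CaptainRank()

    def describe(self):
        # Behaviour is delegated to whichever state is current.
        return self._rank.describe(self.name)


person = Person("Kirk")
alias = person  # a second reference to the same object
before = person.describe()
person.promote()
after = alias.describe()  # the alias sees the new behaviour too
print(before, "->", after)
```

Because nothing is replaced, every existing reference to the person observes the promotion.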
Everybody seems to be assuming a C++/Java-like object system, possibly because of the syntax used in the question, but it is quite possible to change the class of an instance at runtime in other languages.
Lisp's CLOS allows changing the class of an instance at any time, and it's a well-defined and efficient transformation. (The terminology and structure is slightly different: methods don't "belong" to classes in CLOS.)
I've never heard a name for this specific type of transformation, though. The function which does this is simply called change-class.
Richard Gabriel seems to call it the "change-class protocol", after Kiczales' AMOP, which formalized as "protocols" many of the internals of CLOS for metaprogramming.
People wonder why you'd want to do this; I see two big advantages over simply creating a new instance:
faster: changing class can be as simple as updating a pointer, and updating any slots that differ; if the classes are very similar, this can be done with no new memory allocations
simpler: if a dozen places already have a reference to the old object, creating a new instance won't change what they point to; if you need to update each one yourself, that could add a lot of complexity for what should be a simple operation (2 words, in Lisp)
That's not to say it's always the right answer, but it's nice to have the ability to do this when you want it. "Change an instance's class" and "make a new instance that's similar to that one" are very different operations, and I like being able to say exactly what I mean.
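The second advantage (existing references keep working) can be shown with a rough analogue in Python, which allows assigning to an instance's __class__ when the two classes have compatible layouts, as two plain classes without __slots__ do. This is not CLOS and says nothing about how change-class is implemented; Person and Captain are hypothetical classes for the illustration.

```python
class Person:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{self.name}, civilian"


class Captain:
    # Deliberately not a subclass of Person, as in the question.
    def describe(self):
        return f"Captain {self.name}"


kirk = Person("Kirk")
crew = [kirk]              # other places that already hold a reference
roster = {"bridge": kirk}

kirk.__class__ = Captain   # change the class of the existing instance in place

print(crew[0].describe(), roster["bridge"].describe())
```

No reference had to be updated: the list and the dictionary still point at the same object, which now behaves as a Captain and keeps its existing attributes.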
The first interesting part would be to know: why do you want or need an object to change its class at runtime?
There are various options:
You want it to respond differently to some methods for a given state of the application.
You might want it to have new functionality that the original class doesn't have.
Others...
Statically typed languages such as Java and C# don't allow this to happen, because the type of the object should be known at compile time.
Other programming languages such as Python and Ruby may allow this (I don't know for sure, but I know they can add methods at runtime).
For the first option, the answer given by Charlie Flowers is correct: using the State pattern would allow an object to behave differently while keeping the same interface.
For the second option, you would need to change the object type anyway and assign it to a new reference with the extra functionality. So you will need to create another distinct object and you'll end up with two different objects.