Standard Structure for Object-Value Pairs - oop

I'm attempting to create a class, A, that has a collection of objects, X[].
Each element in X will contain a reference to another class, B, and associate a Boolean value, U, with that reference.
In this way, I'll be able to create an instance of an object and poll whether its relationship with X[i] is true, false, or none.
Is there a standard practice for doing this?
The particular problem I'm trying to solve is that I have an array of cells, each of which is defined by a positive or negative relationship to its bounding surfaces.
I want to loop through the cells and find the path length of a ray that traverses a series of them.

Don't restrict your thinking to objects or data structures. Think dynamically. If the Boolean value you want to associate with every class can be deduced from some logical rules, as is likely the case, implement a message that will return that value. Then enumerate the classes and collect the Boolean values by sending the (same) message to all of them.
Then think dynamically again and apply the very same concept to calculate the collection of classes: do not hardcode them in a list or array, implement the message that will select them based on the logic that dictates such selection.
Of course, the ability to do all of this depends on the language of your choice as it will have to support classes as first-class objects. But hey, if you have a problem that can be better expressed in some language, different from the one you are currently using, take the opportunity to give it a try.
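
As a rough illustration of the idea above, here is a minimal Python sketch. The names Plane, Cell, sense, relationship and contains are hypothetical and do not come from the question, and the surfaces are assumed to be simple planes on one axis. Each cell stores the sense it requires for each bounding surface (the object-value pairs), while the actual sense of a point is computed by sending the same message to every surface.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Plane:
    """Hypothetical bounding surface: the plane x = offset."""
    offset: float

    def sense(self, x: float) -> bool:
        # True on the positive side of the surface, False on the negative side.
        return x > self.offset


@dataclass
class Cell:
    """A cell defined by the sense it requires for each bounding surface."""
    name: str
    senses: dict = field(default_factory=dict)  # maps Plane -> bool

    def relationship(self, surface):
        # True or False for a bounding surface, None if the surface is unrelated.
        return self.senses.get(surface)

    def contains(self, x: float) -> bool:
        # Inside when every bounding surface reports the required sense.
        return all(s.sense(x) == want for s, want in self.senses.items())


left, right, far = Plane(0.0), Plane(2.0), Plane(5.0)
cell = Cell("slab", {left: True, right: False})
```

With this layout, cell.relationship(far) gives None for a surface that does not bound the cell, which covers the "true, false, or none" case from the question.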

Related

What is the purpose of the type system in light of CLOS (Common Lisp)?

It is my understanding that the memory layout of a Common Lisp object (bitwise tagging) is defined by CLOS (classes).
I understand that every class has a corresponding type, but not every type has a corresponding class, because types can be compound (lists). I think that types are like logical constraints, as opposed to classes that are concrete "types" with a tagging scheme.
If this is correct, does the type system serve any other purpose other than being a logical constraint (such as specifying that an integer must be within a certain range, or that an array contains a particular type)?
If this is not correct, what purpose does the type system actually serve in light of CLOS? Thanks.
An object has only one class at a time, whereas it can satisfy multiple types.
The type system is a lattice, where you can compute a least-upper-bound and greatest lower bound of two types (using resp. or, and), and which admits a top type (T) and a bottom type (the NIL type, which is not the same as the NULL type).
An implementation of Common Lisp must be able to determine if a value belongs to a type, and that starts with atomic type specifiers, like character or integer, and grows with compound type specifiers (which can be defined by the user).
But whether this is done using tags or by static analysis is left to the implementation; in practice, CL is such that there are cases where you cannot statically determine the type of an object precisely (other than T), simply because an object can be redefined at a later point: you cannot assume its type is fixed (say: a function; that's why inlining or global declarations may help with type inference).
But if you have a scope in which a type can be guaranteed to be invariant, then the compiler is free to use unboxed data types to store values. Then you don't have tagged data. That is the case for local declarations of types for variables, but also for specialized arrays: once an array is built, its element type does not change over time, and in some cases knowing that an array contains only (integer 0 15) elements can be used to pack data more efficiently.
CLOS was added to CL fairly late in the game (and it was not the only object system designed for CL).
Even with CLOS, the type system can be used by the compiler for optimizations and by users to reason about their code.
I think it's important to get away from the implementation of things, and instead concentrate on how the language thinks about them. Clearly the implementation needs to have enough information to know what sort of thing a given object is, and it's going to do that with some kind of 'tag' (which may or may not be some extra bits attached to the object -- some of it might be the leading bits of the address for instance). Below I've called this the 'representational type'. But you really have almost no access to that implementation detail from the language. It's tempting to think that type-of tells you something which maps 1-1 onto the representational type, but that's not true: (type-of (cons 1 2)) is permitted to return (cons integer integer) for instance, and I think it is probably allowed to return (cons integer number) or (cons (integer 1 1) (integer 2 2)). It's unlikely that there are distinct representational types for all of these: indeed there can't be since (type-of 1) can return (integer m n) for an infinite number of values of m & n.
So here's a take on how the language thinks about things, and the differences between classes and types, in CL.
Both the type system and the class system consist of a bounded lattice of types / classes. Being a lattice means that for any pair of objects there is a unique supremum (so, for types, a unique type of which both types are subtypes, and which has no subtypes for which that is true) and infimum (the reverse). Being bounded means there is a top & a bottom type / class.
Classes
Classes are first-class objects (you can store a class in a variable for instance).
All objects (including classes) belong to a class, and there is a well-defined operator to find the immediate class to which any object belongs.
There are a finite number of classes.
The class of an object corresponds fairly closely to its representational type, but not completely (there may be specialized array types which do not have corresponding classes for instance).
Classes can serve as types: (typep 1 (class-of 1)) works, as does (subtypep (class-of 1) '(integer 0 1)) (the answers being t and nil, t respectively).
Types
Types are ways to denote collections of objects with common properties, but they are not themselves objects: they are, if anything, just names for collections of things -- the language specification calls these 'type specifiers'. In particular there are an infinite number of types: think of the type (integer m n) for instance. A small number of this infinitude of types correspond to representational types -- the actual information that tells the system what sort of thing something is -- but obviously most of them do not. There may be representational types which do not have corresponding types.
Types in practice serve three purposes I think.
Type information can tell the system about what representational types to use which can help it check that things are the right representational type and optimise things.
Type information can let the system make inferences which can help things significantly.
Type information can let programmers talk about what sort of things they are dealing with, even when that information is not helpful to the system. The system can treat such declarations as assertions about types which can make programs safer & easier to debug. This is an important reason for types: even if the system does not check them, it is useful for the person reading your code to know that it expects, say, an integer in [0, 30], i.e. an (integer 0 30). Indeed, even if the system does not automatically check declarations you can force checks with, say, (check-type x (integer 0 30) ...); note that check-type does not evaluate its type specifier, so it is not quoted.
The second case is interesting. Let's say I have something which I have told the system is of type (double-float 0.0d0). This is very unlikely to be more useful in terms of representational type than double-float would be. But if I take the square root of this thing then knowing this type might be very useful indeed: the system can know that the result is a double-float, rather than a (complex double-float), and those types are extremely unlikely to be representationally the same. So the system can use my type declaration to make inferences in this way (and these inferences can cascade through the program). Note that classes can't do this (at least CL's classes can't), and neither can the representational type of an object: you need more information than that.
So yes, types serve a number of very useful purposes which aren't satisfied by classes.
A type is a set of values.
A type specifier is some way to succinctly represent a type.
Implementations may do all kinds of markings and registering in order to help them sort out the types of things, but that is not inherent to the concept of types.
A class is an object describing a set of other objects. Since having a succinct name for such a set (type) is quite useful, Common Lisp registers the class name as a type specifier for the corresponding set of objects. That is the whole relation of types to classes.
The type system defines different kinds of objects that do different things. CLOS is used more for methods that define special behaviors for types, in a way that some programmers find more logical. Coming from Java, I found CLOS more logical and systematic, so it has a role for some programmers. I like to think of CLOS as being like a class in Java, such as the Integer class, and of the type system as being similar to primitives in Java. In my opinion, CLOS simply helps you extend your objects with methods in a more systematic way than creating a structure.

Why is Multimap not camel cased?

This one really annoys me (and my colleague).
It's not
Hashmap
Treemap
org.apache.commons.collections.Multimap
etc.
So why didn't anyone notice this naming convention flaw or is there an intention behind this typo?
It's not a typo. Guava's Multimap is not a Map, i.e. it does not extend the Map interface (and it shouldn't). See Guava's wiki page on this topic:
Multimap.get(key) always returns a non-null, possibly empty collection. This doesn't imply that the multimap spends any memory associated with the key, but instead, the returned collection is a view that allows you to add associations with the key if you like.
If you prefer the more Map-like behavior of returning null for keys that aren't in the multimap, use the asMap() view to get a Map<K, Collection<V>>. (Or, to get a Map<K,List<V>> from a ListMultimap, use the static Multimaps.asMap() method. Similar methods exist for SetMultimap and SortedSetMultimap.)
Multimap.containsKey(key) is true if and only if there are any elements associated with the specified key. In particular, if a key k was previously associated with one or more values which have since been removed from the multimap, Multimap.containsKey(k) will return false.
Multimap.entries() returns all entries for all keys in the Multimap. If you want all key-collection entries, use asMap().entrySet().
Multimap.size() returns the number of entries in the entire multimap, not the number of distinct keys. Use Multimap.keySet().size() instead to get the number of distinct keys.
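
To make these semantics concrete, here is a small Python sketch of a toy list-backed multimap. It is not Guava's implementation or API: the class name ListMultimap and the methods put, get, remove, contains_key, size and key_count are hypothetical names chosen for illustration, and unlike Guava, get returns a snapshot rather than a live view.

```python
from collections import defaultdict


class ListMultimap:
    """Toy multimap: each key maps to a list of values."""

    def __init__(self):
        self._data = defaultdict(list)

    def put(self, key, value):
        self._data[key].append(value)

    def get(self, key):
        # Never None: an absent key yields an empty tuple and stores nothing.
        return tuple(self._data.get(key, ()))

    def remove(self, key, value):
        values = self._data.get(key)
        if values and value in values:
            values.remove(value)
            if not values:
                del self._data[key]  # no values left, so the key disappears

    def contains_key(self, key):
        # True only while at least one value is associated with the key.
        return key in self._data

    def size(self):
        # Counts key-value entries, not distinct keys.
        return sum(len(values) for values in self._data.values())

    def key_count(self):
        return len(self._data)


m = ListMultimap()
m.put("a", 1)
m.put("a", 2)
m.put("b", 3)
```

Here m.size() counts three entries while m.key_count() counts two keys, which is exactly where a multimap departs from what a Map of the same name would suggest.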
On the other hand, Apache Commons Collections' MultiMap (note the capital "M" in "Map") extends Map, but it's a bit awkward to use. The Apache developers also came to the conclusion that extending Map and mimicking map-like behavior in a multimap is not what users want, so they deprecated MultiMap (yes, you should not use the old MultiMap interface and its implementations in new code!) and now recommend using MultiValuedMap instead; it does not extend Map and has an API quite similar to the Guava equivalent.
The word "multimap" (one word, entirely lowercase) refers to a specific data structure. It's different from a "map", which is another data structure. So since they're different data structures, they have different names.
The Map interface you usually use in Java is the chosen name for an associative array, also known as "map", "dictionary", "hash", etc. Likewise, Guava's Multimap interface is their representation of the multimap data structure.

Fast, efficient method of assigning large array of data to array of clusters?

I'm looking for a faster, more efficient method of assigning data gathered from a DAQ to its proper location in a large cluster containing arrays of subclusters.
My current method relies heavily on the OpenG cluster manipulation tools, but with a large data set the performance is far too slow.
The array and cluster location of each element of data from the DAQ is determined during an initialization phase and doesn't change during acquisition.
Because the data element origin and end points are the same throughout acquisition, I would think an array of memory locations could be created and the data directly assigned to its proper place. I'm just not sure how to implement such a thing.
The following code does what you want:
For each of your cluster elements (AMC, ANLG_PM and PA) you should add a case in the string case structure; for the elements AMC and PA you will need to place a second case structure.
This is really more of a comment, but I do not have the reputation to leave those yet, so here it is:
Regarding adding cases for every possible value of Array name, is there any reason why you cannot use an enum here? Since you are placing it into a cluster anyway, I would suggest making a type-defined enum of your possible array names. That way, when you want to add or remove one, you only have to do it in one place.
You will still need to right-click on your case structures that use this enum and select Add item for every value if you are adding a value, or manually delete the obsolete value if you are removing one. I suppose some maintenance is required either way...

How does one keep track of many objects in OO design?

I'm struggling with an issue regarding basic object-oriented design that I don't really have the vocabulary to describe. I'm a first-year computer systems student, so my software development education so far is focused on the very basics.
In a large OO project, like for instance a role-playing game, you can end up with many objects whose names are not known at compile time. In a game example, you might have a database file containing details of different enemy encounters. A particular encounter might have the player face three goblin warriors and a troll berserker. These different enemies might all be instances of class Combatant, with different creation arguments specifying their powers and equipment.
However, when instantiating these objects, what names do we give them? Stated differently: I use Python, and when you instantiate an object, you need to give it an identifying name in order to refer to it later; but how can I name a thing if the variable name itself is not literally typed into my code?
next_combatant = load_from_file()
??? = Combatant(next_combatant)
In other words, what do I put in place of the ??? above?
The solution I'm currently using is to use lists (arrays), appending each new object to the list. This way each object does not strictly have a name in the sense of 'goblin_003', but I can refer to objects by using indices of the list, and I can also do other nice things like count how many enemies there are, etc.
My question, then: is this how it is handled in industry? Do programmers typically use arrays to keep all their objects organised? Or is there some clever trick that allows me to retrieve a variable name from file?
(I realise this question is poorly worded, so if anyone needs clarification, just ask.)
EDIT: Is there a name for using collections in this way?
Of course. Collections serve this purpose well enough. With many collections, you don't need to specify an identifier for a stored item. However, if you would like to, a dictionary is the way to go. I don't know much about Python, but there should be some collection where you specify a key and a value for an item you want to store in it.
Just imagine that you have to create millions of warriors in one game. It would not make sense to have warrior_999999 and warrior_1000000.
That's correct, you don't have variable names for all monsters and objects in your game or in other software where you need dynamic content. And you don't need any variable names really.
But the correct way to store those objects depends on your needs: what you want to do with them and what their purpose is. You can use arrays, lists, trees, or whatever else makes sense in your use case.
If you need to identify a specific dynamic object or monster in your game or software, add an identifier variable to your object.
The identifier variable in your object can be just an integer value, but you have to make sure that no two objects have the same value.
For example, suppose you have 3 monster objects in an array and each monster object has an int id; variable:
Monster1, id = 1
Monster2, id = 2
Monster3, id = 3... and so on.
In OpenGL, if you use picking, a good identifier is an RGBA value:
Monster1, id = RGBA(0, 0, 0, 1);
Monster2, id = RGBA(0, 0, 0, 2);
Monster256, id = RGBA(0, 0, 1, 0); and so on.
Which identifier is right depends, again, on your needs.
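
A minimal Python sketch of the approach described in these answers is shown below. The Combatant fields, the records list standing in for the data file, and the _next_id counter are all assumptions made for illustration. The objects get no individual variable names: they live in a list, and a dictionary keyed by a unique integer id gives direct lookup when a specific one is needed.

```python
from dataclasses import dataclass, field
from itertools import count

_next_id = count(1)  # hands out 1, 2, 3, ... so no two objects share an id


@dataclass
class Combatant:
    kind: str
    hit_points: int
    # Each new instance takes the next unused identifier.
    id: int = field(default_factory=lambda: next(_next_id))


# Stand-in for records that would be read from a data file.
records = [
    ("goblin warrior", 7),
    ("goblin warrior", 7),
    ("goblin warrior", 7),
    ("troll berserker", 40),
]

enemies = [Combatant(kind, hp) for kind, hp in records]  # ordered collection
by_id = {c.id: c for c in enemies}                       # lookup by identifier

goblin_count = sum(1 for c in enemies if c.kind == "goblin warrior")
```

The list answers questions such as "how many enemies are there", while by_id plays the role that a name like goblin_003 would otherwise have played.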

Overriding CompareTo when there are multiple ways to compare two objects of the same type?

What's a sound approach to override the CompareTo() method in a custom class with multiple approaches to comparing the data contained in the class? I'm attempting to implement IComparable(Of T) just so I have a few of the baseline interfaces implemented. Not planning on doing any sorting yet, but this will save me down the road if I need to.
MSDN mostly states that we have to return 0 if the objects are equal, -1 if obj1 is less than obj2, or 1 if obj1 is greater than obj2. But that's rather simplistic.
Consider an IPv4 address (which is what I'm implementing in my class). There are two main numbers to consider -- the IP address itself, and the CIDR. An IPv4 address by itself is assumed to have a CIDR of /32, so in that case, in a CompareTo method, I can just compare the addresses directly to determine if one is greater or less than the other. But when the CIDRs are different, then things get tricky.
Assume obj1 is 10.0.0.0/8 and obj2 is 192.168.75.0/24. I could compare these two addresses a number of ways. I could just ignore the CIDR, and still regard obj2 as being greater than obj1. I could compare them based on their CIDR, which would be comparing the size of the network (and a /8 will trump a /24 quite easily). I could compare them on both their numerical address AND their CIDR, on the off chance obj2 was actually an address inside the network defined by obj1.
What's the kind of approach used to handle situations like this? Can I define two CompareTo methods, overloaded, such that one would evaluate one address relative to another address, and the second would evaluate the size of the overall network? How would the .NET framework be told which one to use depending on how one might want to sort an array? Or do some other function that relies on CompareTo()?
For CompareTo, you should use a comparison that represents the default, normal sort order for a particular type. For example, in the example you gave, I would probably expect it to sort on the address first, then on the subnet size.
But for the case where there is no obvious "default" sort order, or when there are multiple ways to compare (such as case sensitive or not when comparing strings), the recommended approach is to use an IComparer<T>. This would be a separate object that is able to compare two instances of your type, for example an AddressComparer or a SubnetComparer. You could even make them static properties of a class, which is what StringComparer does.
Just about all methods that take IComparable types should also have an overload that allows you to specify an IComparer to use instead. You don't have to implement both, but if it makes sense, do it. That way you can specify a particular comparer when needed or use the default built-in IComparable logic of your type.
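
The same separation can be sketched outside .NET. The Python example below is only an analogue, not .NET code: key functions passed to sorted play the role that separate IComparer<T> objects would play, and the names by_address and by_network_size are hypothetical. It uses the standard ipaddress module and the two networks from the question plus one extra, made-up network.

```python
import ipaddress

networks = [
    ipaddress.ip_network("192.168.75.0/24"),
    ipaddress.ip_network("172.16.0.0/30"),
    ipaddress.ip_network("10.0.0.0/8"),
]


def by_address(net):
    # "Default" order: numeric network address first, then prefix length.
    return (int(net.network_address), net.prefixlen)


def by_network_size(net):
    # Alternative order: larger networks (shorter prefixes) first.
    return (net.prefixlen, int(net.network_address))


address_order = [str(n) for n in sorted(networks, key=by_address)]
size_order = [str(n) for n in sorted(networks, key=by_network_size)]
```

The caller picks the ordering at the call site, so the type itself only has to commit to one default comparison.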