Best way to extend Pharo Smalltalk class behavior? - smalltalk

I want to extend the String class with a method to create a url slug out of a string. I found a link here that shows how you can move extensions to their own package:
Smalltalk Daily 07/13/10: Extending Behavior II.
However, I can't find any "move to package" option in Pharo Smalltalk. Is it ok to just extend the core class with the new method, or is there a better way?

In Pharo or Squeak put the extension methods for MyPackage in a method category called *mypackage (or if you want to be more descriptive *mypackage-slug).
The methods in these categories automatically belong to the MyPackage package (at least from Monticello's point of view).

"Is it ok to just extend the core class with the new method, or is there a better way?"
There are tradeoffs to this decision. In fact, Pharo had String>>asUrl until very recently, when it was removed as part of cleaning the system. On one hand, it is considered bad style by some (see Kent Beck's Best Practices) to have conversion methods between objects that do not have similar protocols (are semantically similar). Additionally, this leads to bloated core classes (like String and Object). However, in your own application, there may be a good reason that balances these factors, and since you are packaging it with your app, and not with the system, rock out.

In Pharo 7, the * prefix is no longer allowed.
A message tells you to tick the extension checkbox in the method editing pane.
If you do so, you can choose your package.

Related

extending objects at run-time via categories?

Objective-C’s objects are pretty flexible when compared to similar languages like C++ and can be extended at runtime via Categories or through runtime functions.
Any idea what this sentence means? I am relatively new to Objective-C
While technically true, it may be confusing to the reader to call category extension "at runtime." As Justin Meiners explains, categories allow you to add additional methods to an existing class without requiring access to the existing class's source code. The use of categories is fairly common in Objective-C, though there are some dangers. If two different categories add the same method to the same class, then the behavior is undefined. Since you cannot know whether some other part of the system (perhaps even a system library) adds a category method, you typically must add a prefix to prevent collisions (for example rather than swappedString, a better name would likely be something like rnc_swappedString if this were part of RNCryptor for instance.)
As I said, it is technically true that categories are added at runtime, but from the programmer's point of view, categories are written as though just part of the class, so most people think of them as being a compile-time choice. It is very rare to decide at runtime whether to add a category method or not.
As a beginner, you should be aware of categories, but slow to create new ones. Creating categories is a somewhat intermediate-level skill. It's not something to avoid, but not something you'll use every day. It's very easy to overuse them. See Justin's link for more information.
On the other hand, "runtime functions" really do add new functionality to existing classes or even specific objects at runtime, and are completely under the control of code. You can, at runtime, modify a class such that it responds to a method it didn't previously respond to. You can even generate entirely new classes at runtime that did not exist when the program was compiled, and you can change the class of existing objects. (This is exactly how Key-Value Observation is implemented.)
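The three capabilities described above (making a class respond to a new method, generating a class that did not exist at compile time, and changing the class of an existing object) can be sketched in Python as a rough analogy. This is not the Objective-C runtime API; the Account and ObservedAccount names and the observer list are made up for illustration, and the dynamic-subclass trick only loosely mirrors how Key-Value Observation is described here.

```python
class Account:
    def __init__(self, balance):
        self.balance = balance


acct = Account(10)
responds_before = hasattr(acct, "describe")  # the class has no such method yet


def describe(self):
    return f"Account({self.balance})"


# 1. Add a method to an existing class at runtime; existing instances see it.
Account.describe = describe
described = acct.describe()

# 2. Generate a new subclass at runtime that records every attribute write.
changes = []


def observed_setattr(self, name, value):
    changes.append((name, value))
    object.__setattr__(self, name, value)


ObservedAccount = type("ObservedAccount", (Account,), {"__setattr__": observed_setattr})

# 3. Change the class of an already existing object to the generated subclass.
acct.__class__ = ObservedAccount
acct.balance = 25
```

After the last line, the write to balance has been recorded in changes even though acct was created as a plain Account.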
Modifying classes and objects using the runtime is an advanced skill. You should not even consider using these techniques in production code until you have significant experience. And when you have that experience, it will tell you that you very seldom want to do this anyway. You will know the runtime functions because they are C-based, with names like method_exchangeImplementations. You won't mistake them for normal ObjC (and you generally have to import objc/runtime.h to get to them.)
There is a middle ground that bleeds into runtime manipulation called message forwarding and dynamic message resolution. This is often used for proxy objects, and is implemented with -forwardingTargetForSelector:, +resolveInstanceMethod:, and some similar methods. These are tools that allow classes to modify themselves at runtime, and are much less dangerous than modifying other classes (i.e. "swizzling").
It's also important to consider how all of this translates to Swift. In general, Swift has discouraged and restricted the use of runtime class manipulation, but it embraces (and improves) category-like extensions. By the time you're experienced enough to dig into the runtime, you will likely find it an even more obscure skill than it is today. But you will use extensions (Swift's version of categories) in every program.
A category allows you to add functionality to an existing class that you do not have access to source code for (System frameworks, 3rd party APIs etc). This functionality is possible by adding methods to a class at runtime.
For example, let's say I wanted to add a method to NSString, called -swappedString, that swapped uppercase and lowercase letters. In static languages (such as C++), extending classes like this is more difficult. I would have to create a subclass of NSString (or a helper function). While my own code could take advantage of my subclass, any instance created in a library would not use my subclass and would not have my method.
Using categories, I can extend any class, such as NSString, by adding a -swappedString method, and then use it transparently on any instance of that class: [anyString swappedString];.
You can learn more details from Apple's Docs

What are Smalltalk pragmas conceptually?

I have used pragmas in Pharo Smalltalk and have an idea about how they work and have seen examples for what they are used in Pharo.
My questions are:
what are pragmas conceptually,
to what construct do they compare in other languages,
when should I introduce a pragma?
I already found an interesting article about their history: The history of VW Pragmas.
Think of them as annotations attached to a CompiledMethod, or, if you prefer, as additional properties.
Then, thanks to reflection, some tools can walk over the compiled methods, collect those with certain annotations (properties) and apply some special handling, like constructing a menu, a list of preferences, or other UI, or invoking every class method marked as #initializer; or some mechanism could walk back up the stack until it reaches a method marked as an #exceptionHandler ...
There are many possibilities, up to you to invent your own meta-property...
EDIT
For the second point, I don't know, it must be a language that can enumerate the methods, and can attach properties to them.
The third point is also hard to answer. In practice, I would say you would use some already existing annotations, but very rarely create a new one, unless you're trying to create a new framework for exception handling, or a new framework for GUI (you want to register some known events or some handlers...). The main usage I would see is for extending, composing an application with unrelated parts, like a main menu. It seems like a relatively unintrusive way to introduce DECLARATIVE hooks, compared to the very intrusive way of overriding a well-known method such as TheWorld>>mainMenu. It's also a bit lighter than registering/unregistering IMPERATIVELY via traditional message sends at class initialization/unloading. On the other hand, the magic is a bit more hidden.
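To make the "annotate, then collect by reflection" idea concrete, here is a minimal sketch in Python rather than Smalltalk. It is only an analogy, not Pharo's pragma API: the pragma decorator, the World class, the menu_item keyword and the _pragmas attribute are all hypothetical names chosen for illustration.

```python
import inspect


def pragma(keyword, **arguments):
    """Attach a (keyword, arguments) annotation to a function, loosely like a pragma."""
    def decorate(function):
        existing = getattr(function, "_pragmas", [])
        function._pragmas = existing + [(keyword, arguments)]
        return function
    return decorate


class World:
    @pragma("menu_item", label="Open browser", order=2)
    def open_browser(self):
        return "browser opened"

    @pragma("menu_item", label="Save image", order=1)
    def save_image(self):
        return "image saved"

    def helper(self):
        # No annotation, so the menu builder never sees this method.
        return "not in the menu"


def collect(cls, keyword):
    """Walk the functions of a class and return (arguments, function) pairs tagged with keyword."""
    found = []
    for _name, function in inspect.getmembers(cls, inspect.isfunction):
        for kw, arguments in getattr(function, "_pragmas", []):
            if kw == keyword:
                found.append((arguments, function))
    return found


def build_menu(cls):
    """Build the menu declaratively: entries sorted by the 'order' annotation argument."""
    entries = sorted(collect(cls, "menu_item"), key=lambda pair: pair[0]["order"])
    return [(arguments["label"], function) for arguments, function in entries]


menu = build_menu(World)
labels = [label for label, _function in menu]
```

Nothing in World registers itself with the menu; a new entry appears simply by adding another annotated method, which is the declarative hook described above.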

How does the organisation of classes in categories and packages work in different versions of Pharo?

Can someone explain how the organisation of classes in Pharo works in different versions of Pharo?
All Classes are part of the Smalltalk global (have always been, seem to stay like this?)
Classes can have a Category, but that's only a kind of tag? (has always been, seems to stay like this? But the categories are somehow mapped to packages sometimes?)
There are different kinds of Packages in different Versions of Pharo
MCPackages representing Monticello Packages
PackageInfo
RPackage (Pharo 1.4)?
In addition there is SystemNavigation which somehow helps navigating classes and methods based on some of the above mentioned constructs?
Classes
The fact that classes are keys in the Smalltalk global is an implementation detail. As long as there is a single global namespace for class names, it is likely that the implementation will stay the same.
Class Categories
The class category is very much like a tag. A class can only be in one category at a time. Originally the class category was used by the Browser for organizing the classes in the system.
When Monticello was created, the class category was overloaded to also indicate membership in a Monticello package. The MCPackage and PackageInfo classes were created to manage this mapping.
PackageInfo does all the heavy lifting: finding the classes and loose methods that belong to a package.
MCPackage is a Monticello-specific wrapper for PackageInfo that adds some protocol that wasn't necessarily appropriate for the more general PackageInfo.
Packages
Overloading the class category for package membership was a neat trick to ease the adoption of Monticello (existing development tools didn't need to be taught Monticello), however, it is still a trick. Not to mention the fact that the implementation of PackageInfo was not very efficient.
RPackage was created to address the performance problems of PackageInfo and to be used as part of the next generation of development tools.
Both package implementations will continue to exist until PackageInfo can be phased out.
SystemNavigation
As Frank says,
SystemNavigation is a class that, as its name suggests, permits easy
querying of a number of different things: the classes in the image,
senders-of, implementors-of, information about packages loaded in the
image and so on.
Classes are, at the moment at least, the keys in the Smalltalk dictionary.
PackageInfo contains information about a grouping of classes and extensions to other packages.
A Monticello package contains a deployable unit of code. Usually one of these will correspond to a PackageInfo instance. (Hitting the "+Package" button in a Monticello Browser will create one of these, for instance.) A Monticello package may contain pre-load and post-load scripts, so the two classes perform separate, if related, functions.
SystemNavigation is a class that, as its name suggests, permits easy querying of a number of different things: the classes in the image, senders-of, implementors-of, information about packages loaded in the image and so on.

Are tag (or "marker") interfaces obsolete?

I'm trying to help a coworker come to terms with OO, and I'm finding that for some cases, it's hard to find solid real-world examples for the concept of a tag (or marker) interface. (An interface that contains no methods; it is used as a tag or marker or label only). While it really shouldn't matter for the sakes of our discussions, we're using PHP as the platform behind the discussions (because it's a common language between us). I'm probably not the best person to teach OO since most of my background is highly theoretical and about 15 years old, but I'm what he's got.
In any case, the dearth of discussions I've found regarding tag interfaces leads me to believe it's not even being used enough to warrant discussion. Am I wrong here?
Tag interfaces are used in Java (Serializable being the obvious example). C# and even Java seem to be moving away from this though in favor of attributes, which can accomplish the same thing but also do much more.
I still think there's a place for them in other languages that don't have the attribute concept that .NET and Java have.
ETA:
You would typically use this when you have an interface that implies an implementation, but you don't want the class that implements the interface to actually have to provide that implementation.
Some real world examples:
Serializable is a good example - it implies that there is an implementation (somewhere) that can serialize the object data, but since a generic implementation for that is available, there is no need to actually have the object implement that functionality itself.
Another example might be a web page caching system. Say you have a "Page" object and a "RequestHandler" object. The RequestHandler takes a request for a page, locates/creates the corresponding Page object, calls a Render() method on the Page object, and sends the results to the browser.
Now, say you wanted to implement caching for rendered pages. But the hitch is that some pages are dynamic, so they can't be cached. One way to implement this would be to have the cacheable Page objects implement an ICacheable "tag" interface (Or vice-versa, you could have an INotCacheable interface). The RequestHandler would then check if the Page implemented ICacheable, and if it did, it would cache the results after calling Render() and serve up those cached results on subsequent requests for that page.
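The caching scenario can be sketched as follows. This is a hedged illustration in Python rather than PHP or C#: Cacheable stands in for the ICacheable tag interface, and StaticPage, DynamicPage and the dictionary cache are assumptions made up for the example.

```python
class Cacheable:
    """Marker (tag) class: declares no methods, it only labels a page as safe to cache."""


class Page:
    def __init__(self, path):
        self.path = path
        self.render_count = 0

    def render(self):
        self.render_count += 1
        return f"<html>{self.path} #{self.render_count}</html>"


class StaticPage(Page, Cacheable):
    """Tagged as cacheable; it adds no behavior of its own."""


class DynamicPage(Page):
    """Not tagged, so it is rendered on every request."""


class RequestHandler:
    def __init__(self):
        self._cache = {}

    def handle(self, page):
        # The handler only asks whether the tag is present.
        if isinstance(page, Cacheable):
            if page.path not in self._cache:
                self._cache[page.path] = page.render()
            return self._cache[page.path]
        return page.render()


handler = RequestHandler()
static_page = StaticPage("/about")
dynamic_page = DynamicPage("/now")

first_static = handler.handle(static_page)
second_static = handler.handle(static_page)
first_dynamic = handler.handle(dynamic_page)
second_dynamic = handler.handle(dynamic_page)
```

The page classes never implement any caching themselves; the tag just tells the handler that its generic caching may be applied.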
In .NET, tag interfaces can be great for use with reflection and extension methods. Tag interfaces are typically interfaces without any methods. They allow you to see if an object is of a certain type without paying the penalty of reflecting over your objects.
An example in the .NET Framework is INamingContainer, which is part of ASP.NET.
I'd call myself an OO programmer, and I've never heard of a tag interface.
I think tag interfaces are worth discussing, because they're an interesting corner case of the concept of an interface. Their infrequent use is also worth noting, though!
I've used tag interfaces a couple of times in an object model representing a SQL database. In these instances, it's a subtype of the root interface for particular types of objects. It's easier to check for a tag interface than an attribute ('obj is IInterface' rather than using reflection)
The .NET Style guide says to use Attributes rather than tag/marker interfaces.
source: http://www.informit.com/articles/article.aspx?p=423349&seqNum=6
or any number of other exposure points of Cwalina's recommendations, like the book.
I've used a tag interface twice in the past month. They can cut through some nasty problems when refactoring to make a function more generic.
That said, another thing I just discovered is using a tag interface as a parent class to a bunch of related interfaces with methods. The object can be passed around, and various preprocessors can check to see if they need to deal with a particular object. A way of keeping the processing in a separate object where it belongs, but the implementation details of processed objects in their definitions where they belong.

What is the best way to solve an Objective-C namespace collision?

Objective-C has no namespaces; it's much like C, where everything is within one global namespace. Common practice is to prefix classes with initials, e.g. if you are working at IBM, you could prefix them with "IBM"; if you work for Microsoft, you could use "MS"; and so on. Sometimes the initials refer to the project, e.g. Adium prefixes classes with "AI" (as there is no company behind it whose initials you could take). Apple prefixes classes with NS and says this prefix is reserved for Apple only.
So far, so good. But prepending two to four letters to a class name gives a very, very limited namespace. E.g. MS or AI could have entirely different meanings (AI could be Artificial Intelligence, for example) and some other developer might decide to use them and create an identically named class. Bang, namespace collision.
Okay, if this is a collision between one of your own classes and one of an external framework you are using, you can easily change the naming of your class, no big deal. But what if you use two external frameworks, both frameworks that you don't have the source to and that you can't change? Your application links with both of them and you get name conflicts. How would you go about solving these? What is the best way to work around them in such a way that you can still use both classes?
In C you can work around these by not linking directly to the library, instead you load the library at runtime, using dlopen(), then find the symbol you are looking for using dlsym() and assign it to a global symbol (that you can name any way you like) and then access it through this global symbol. E.g. if you have a conflict because some C library has a function named open(), you could define a variable named myOpen and have it point to the open() function of the library, thus when you want to use the system open(), you just use open() and when you want to use the other one, you access it via the myOpen identifier.
Is something similar possible in Objective-C and, if not, is there any other clever, tricky solution you can use to resolve namespace conflicts? Any ideas?
Update:
Just to clarify this: answers that suggest how to avoid namespace collisions in advance or how to create a better namespace are certainly welcome; however, I will not accept them as the answer since they don't solve my problem. I have two libraries and their class names collide. I can't change them; I don't have the source of either one. The collision is already there and tips on how it could have been avoided in advance won't help anymore. I can forward them to the developers of these frameworks and hope they choose a better namespace in the future, but for the time being I'm searching for a solution to work with the frameworks right now within a single application. Any solutions to make this possible?
Prefixing your classes with a unique prefix is fundamentally the only option, but there are several ways to make this less onerous and ugly. There is a long discussion of options here. My favorite is the @compatibility_alias Objective-C compiler directive (described here). You can use @compatibility_alias to "rename" a class, allowing you to name your class using FQDN or some such prefix:
@interface COM_WHATEVER_ClassName : NSObject
@end
@compatibility_alias ClassName COM_WHATEVER_ClassName;
// now ClassName is an alias for COM_WHATEVER_ClassName
@implementation ClassName //OK
//blah
@end
ClassName *myClass; //OK
As part of a complete strategy, you could prefix all your classes with a unique prefix such as the FQDN and then create a header with all the @compatibility_alias directives (I would imagine you could auto-generate said header).
The downside of prefixing like this is that you have to enter the true class name (e.g. COM_WHATEVER_ClassName above) in anything that needs the class name as a string, i.e. anywhere other than code seen by the compiler. Notably, @compatibility_alias is a compiler directive, not a runtime function, so NSClassFromString(@"ClassName") will fail (return nil)--you'll have to use NSClassFromString(@"COM_WHATEVER_ClassName"). You can use ibtool via a build phase to modify class names in an Interface Builder nib/xib so that you don't have to write the full COM_WHATEVER_... in Interface Builder.
Final caveat: because this is a compiler directive (and an obscure one at that), it may not be portable across compilers. In particular, I don't know if it works with the Clang frontend from the LLVM project, though it should work with LLVM-GCC (LLVM using the GCC frontend).
If you do not need to use classes from both frameworks at the same time, and you are targeting platforms which support NSBundle unloading (OS X 10.4 or later, no GNUStep support), and performance really isn't an issue for you, I believe that you could load one framework every time you need to use a class from it, and then unload it and load the other one when you need to use the other framework.
My initial idea was to use NSBundle to load one of the frameworks, then copy or rename the classes inside that framework, and then load the other framework. There are two problems with this. First, I couldn't find a function to rename or copy a class. Second, any other classes in that first framework which reference the renamed class would now reference the class from the other framework.
You wouldn't need to copy or rename a class if there were a way to copy the data pointed to by an IMP. You could create a new class and then copy over ivars, methods, properties and categories. Much more work, but it is possible. However, you would still have a problem with the other classes in the framework referencing the wrong class.
EDIT: The fundamental difference between the C and Objective-C runtimes is, as I understand it, that when libraries are loaded, the functions in C libraries contain pointers to any symbols they reference, whereas in Objective-C, they contain string representations of the names of those symbols. Thus, in your example, you can use dlsym to get the symbol's address in memory and attach it to another symbol. The other code in the library still works because you're not changing the address of the original symbol. Objective-C uses a lookup table to map class names to addresses, and it's a 1-1 mapping, so you can't have two classes with the same name. Thus, to load both classes, one of them must have its name changed. However, when other classes need to access one of the classes with that name, they will ask the lookup table for its address, and the lookup table will never return the address of the renamed class given the original class's name.
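The argument above can be illustrated with a toy model in Python. It is not the Objective-C runtime: ClassTable, its register/rename/lookup methods, the two Parser classes and the choice to raise an error on a duplicate name are all assumptions made for the sketch.

```python
class ClassTable:
    """Toy stand-in for a runtime lookup table that maps each class name to one class."""

    def __init__(self):
        self._by_name = {}

    def register(self, name, cls):
        if name in self._by_name:
            raise ValueError(f"duplicate class name: {name}")
        self._by_name[name] = cls

    def rename(self, old, new):
        self._by_name[new] = self._by_name.pop(old)

    def lookup(self, name):
        return self._by_name[name]


class ParserFromA:
    origin = "framework A"


class ParserFromB:
    origin = "framework B"


table = ClassTable()
table.register("Parser", ParserFromA)

# Loading the second framework collides, because names must be unique.
try:
    table.register("Parser", ParserFromB)
    collided = False
except ValueError:
    collided = True

# Workaround: rename A's class, then load B under the original name.
table.rename("Parser", "A_Parser")
table.register("Parser", ParserFromB)

# Code inside framework A still asks for its class by the string "Parser"...
seen_by_framework_a = table.lookup("Parser").origin

# ...whereas a reference bound to the object itself (the C-style case) is unaffected.
bound_by_address = ParserFromA
seen_through_pointer = bound_by_address.origin
```

In this model, framework A's by-name lookup now silently resolves to framework B's class, which is the failure mode described in the paragraph above.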
Several people have already shared some tricky and clever code that might help solve the problem. Some of the suggestions may work, but all of them are less than ideal, and some of them are downright nasty to implement. (Sometimes ugly hacks are unavoidable, but I try to avoid them whenever I can.) From a practical standpoint, here are my suggestions.
In any case, inform the developers of both frameworks of the conflict, and make it clear that their failure to avoid and/or deal with it is causing you real business problems, which could translate into lost business revenue if unresolved. Emphasize that while resolving existing conflicts on a per-class basis is a less intrusive fix, changing their prefix entirely (or using one if they're not currently, and shame on them!) is the best way to ensure that they won't see the same problem again.
If the naming conflicts are limited to a reasonably small set of classes, see if you can work around just those classes, especially if one of the conflicting classes isn't being used by your code, directly or indirectly. If so, see whether the vendor will provide a custom version of the framework that doesn't include the conflicting classes. If not, be frank about the fact that their inflexibility is reducing your ROI from using their framework. Don't feel bad about being pushy within reason — the customer is always right. ;-)
If one framework is more "dispensable", you might consider replacing it with another framework (or combination of code), either third-party or homebrew. (The latter is the undesirable worst-case, since it will certainly incur additional business costs, both for development and maintenance.) If you do, inform the vendor of that framework exactly why you decided to not use their framework.
If both frameworks are deemed equally indispensable to your application, explore ways to factor out usage of one of them to one or more separate processes, perhaps communicating via DO as Louis Gerbarg suggested. Depending on the degree of communication, this may not be as bad as you might expect. Several programs (including QuickTime, I believe) use this approach to get the more granular security offered by Seatbelt sandbox profiles in Leopard, such that only a specific subset of your code is permitted to perform critical or sensitive operations. Performance will be a tradeoff, but this may be your only option.
I'm guessing that licensing fees, terms, and durations may prevent instant action on any of these points. Hopefully you'll be able to resolve the conflict as soon as possible. Good luck!
This is gross, but you could use distributed objects in order to keep one of the classes only in a subordinate program's address space and RPC to it. That will get messy if you are passing a ton of stuff back and forth (and may not be possible if both classes are directly manipulating views, etc).
There are other potential solutions, but a lot of them depend on the exact situation. In particular, are you using the modern or legacy runtimes, are you fat or single architecture, 32 or 64 bit, what OS releases are you targeting, are you dynamically linking, statically linking, or do you have a choice, and is it potentially okay to do something that might require maintenance for new software updates.
If you are really desperate, what you could do is:
Not link against one of the libraries directly
Implement an alternate version of the objc runtime routines that changes the name at load time (check out the objc4 project; what exactly you need to do depends on a number of the questions I asked above, but it should be possible no matter what the answers are).
Use something like mach_override to inject your new implementation
Load the new library using normal methods, it will go through the patched linker routine and get its className changed
The above is going to be pretty labor intensive, and if you need to implement it against multiple archs and different runtime versions it will be very unpleasant, but it can definitely be made to work.
Have you considered using the runtime functions (/usr/include/objc/runtime.h) to clone one of the conflicting classes to a non-colliding class, and then loading the colliding class framework? (this would require the colliding frameworks to be loaded at different times to work.)
You can inspect the classes ivars, methods (with names and implementation addresses) and names with the runtime, and create your own as well dynamically to have the same ivar layout, methods names/implementation addresses, and only differ by name (to avoid the collision)
Desperate situations call for desperate measures. Have you considered hacking the object code (or library file) of one of the libraries, changing the colliding symbol to an alternative name with a different spelling but, as a strong recommendation, the same length? Inherently nasty.
It isn't clear if your code is directly calling the two functions with the same name but different implementations or whether the conflict is indirect (nor is it clear whether it makes any difference). However, there's at least an outside chance that renaming would work. It might be an idea, too, to minimize the difference in the spellings, so that if the symbols are in a sorted order in a table, the renaming doesn't move things out of order. Things like binary search get upset if the array they're searching isn't in sorted order as expected.
@compatibility_alias will be able to solve class namespace conflicts, e.g.
@compatibility_alias NewAliasClass OriginalClass;
However, this will not resolve any of the enum, typedef, or protocol namespace collisions. Furthermore, it does not play well with @class forward declarations of the original class. Since most frameworks come with non-class things like typedefs, you would likely not be able to fix the namespacing problem with just @compatibility_alias.
I looked at a similar problem to yours, but I had access to source and was building the frameworks.
The best solution I found for this was using @compatibility_alias conditionally with #defines to support the enums/typedefs/protocols/etc. You can do this conditionally on the compile unit for the header in question to minimize the risk of expanding stuff in the other colliding framework.
It seems that the issue is that you can't reference header files from both systems in the same translation unit (source file). If you create Objective-C wrappers around the libraries (making them more usable in the process), and only #include the headers for each library in the implementation of the wrapper classes, that would effectively separate name collisions.
I don't have enough experience with this in Objective-C (just getting started), but I believe that is what I would do in C.
Prefixing the files is the simplest solution I am aware of.
Cocoadev has a namespace page which is a community effort to avoid namespace collisions.
Feel free to add your own to this list, I believe that is what it is for.
http://www.cocoadev.com/index.pl?ChooseYourOwnPrefix
If you have a collision, I would suggest you think hard about how you might refactor one of the frameworks out of your application. Having a collision suggests that the two are doing similar things as it is, and you likely could get around using an extra framework simply by refactoring your application. Not only would this solve your namespace problem, but it would make your code more robust, easier to maintain, and more efficient.
Over a more technical solution, if I were in your position this would be my choice.
If the collision is only at the static link level then you can choose which library is used to resolve symbols:
cc foo.o -ldog bar.o -lcat
If foo.o and bar.o both reference the symbol rat then libdog will resolve foo.o's rat and libcat will resolve bar.o's rat.
Just a thought, not tested or proven, and it could be way off the mark, but have you considered writing an adapter for the classes you use from the simpler of the frameworks, or at least for their interfaces?
If you were to write a wrapper around the simpler of the frameworks (or the one whose interfaces you access the least), would it not be possible to compile that wrapper into a library? Given that the library is precompiled and only its headers need be distributed, you'd effectively be hiding the underlying framework and would be free to combine it with the second framework without clashing.
I appreciate of course that there are likely to be times when you need to use classes from both frameworks at the same time; however, you could provide factories for further class adapters of that framework. On the back of that point, I guess you'd need a bit of refactoring to extract the interfaces you are using from both frameworks, which should provide a nice starting point for you to build your wrapper.
You could build upon the library as and when you need further functionality from the wrapped framework, and simply recompile when it changes.
Again, in no way proven, but I felt like adding a perspective. Hope it helps :)
If you have two frameworks that have the same function name, you could try dynamically loading the frameworks. It'll be inelegant, but possible. How to do it with Objective-C classes, I don't know. I'm guessing the NSBundle class will have methods that'll load a specific class.