How does find-by-example work in the Pharo Finder?

One of the things I was most impressed with when digging into Pharo was that the Finder could do find-by-example. I'd previously only seen this in languages like Haskell, where it's possible to know for certain that a function has no side effects. How does Pharo manage to implement this in a way that is safe, performant, and side-effect free?

Magic :)
Actually... although I've been dreaming about creating the list from the tests automatically, the reality is that we manually maintain a list of safe messages (obviously error-prone; I seriously doubt it's 100% accurate). See MethodFinder>>#initialize.
So a trick, but not exactly magic ;)
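To illustrate the principle only (this is not Pharo's code, and the whitelist below is invented), a find-by-example tool can simply try every operation from a hand-maintained list of safe ones against the example arguments and report which of them produce the expected result. A toy Python sketch:

import operator

# Hand-maintained whitelist of "safe" operations, playing the role of the
# selector list in MethodFinder>>#initialize (these entries are invented).
SAFE_OPERATIONS = {
    "max": max,
    "min": min,
    "add": operator.add,
    "mul": operator.mul,
    "reverse": lambda s: s[::-1],
    "upper": str.upper,
}

def find_by_example(args, expected):
    """Return the names of whitelisted operations that map args to expected."""
    matches = []
    for name, operation in SAFE_OPERATIONS.items():
        try:
            if operation(*args) == expected:
                matches.append(name)
        except Exception:
            # Wrong arity or wrong argument types: just skip this candidate.
            pass
    return matches

print(find_by_example((3, 4), 12))       # ['mul']
print(find_by_example(("abc",), "cba"))  # ['reverse']

Because only whitelisted operations are ever tried, the search stays side-effect free exactly to the extent that the curated list really is safe, which is the error-prone part mentioned above.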

Related

Avoid using new language features because they are unfamiliar to most programmers?

While reading "Python Scripting for Computational Science" I came across the following text in the section discussing generators:
Whether to rapidly write a generator or to implement the class methods __iter__ and __next__ depends on the application, personal taste, readability, and complexity of the iterator. Since generators are very compact and unfamiliar to most programmers, the code often becomes less readable than a corresponding version using __iter__ and __next__.
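For concreteness, here is a small example (invented here, not taken from the book) of the two styles being compared: a countdown written as a generator and as a class implementing __iter__ and __next__:

def countdown_gen(n):
    # Generator version: compact, state is kept implicitly between yields.
    while n > 0:
        yield n
        n -= 1

class Countdown:
    # Class-based iterator: more verbose, but every piece of state is explicit.
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        value = self.n
        self.n -= 1
        return value

print(list(countdown_gen(3)))  # [3, 2, 1]
print(list(Countdown(3)))      # [3, 2, 1]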
This led me to wonder whether unfamiliarity (of other programmers) is a good reason NOT to use relatively new and powerful features of a language (like Python generators).
If you don't use it, how can it ever become popular and familiar?
So, my question: is unfamiliarity sometimes a good reason not to use new language features?
Your own unfamiliarity with a language feature may be a good reason to tread lightly. For example, in C#, if you aren't certain about the difference between object y = func1() ?? func2(); and object y = func1() != null ? func1() : func2(); (hint: the second form evaluates func1() twice), then maybe you are better off writing the corresponding if statement, just because it's clearer what is actually going on. Someone who knows the nuances of the language better may very well come around and refactor later, and in the meantime the cost is usually low.
However, if you know how to use a language feature, I see little reason to avoid it simply because others may find it difficult to understand. If you really feel the need to, add a comment (such as "?? is the null coalescing operator") to help fellow developers know what to look for, in case they can't figure out from the code alone what it is doing and have to work it out on their own.
This, mind you, is about production code. Experimenting certainly has its place, but its place is not necessarily in the mainline codebase. I always keep a "scratch" project handy for when I want to try something out without risking impact to anything else. There, I often take liberties far beyond those I take in production or to-be-production code.
I wouldn't say that unfamiliarity is a good reason not to use new language features. Or, for that matter, new languages.
Lack of support for a new feature across tool vendors could be a reason if you have any concerns about working with multiple vendors.
Since the question is subjective, I'll express the contrary opinion.
If you work where there are code reviews, you'll find out soon enough what your co-workers consider "unfamiliar".
Since they also have to maintain the code, you can try and help them become familiar with the "unfamiliar" code. But, it's ultimately a judgment call, and sometimes, what you think is clear code, isn't.

Do they always hide implementation files?

I am getting started with CoreAnimation and want to know more about some of the implementations. Where (if possible) can I find the implementation files?
This is my first go with a proprietary framework (I come from the web world), so I guess it makes sense for them to be off-limits.
I mean, the .h and docs are OK, but it would be nice to see what's actually happening!
This seems like such a stupid question, but I just want to make sure I'm not missing something painfully obvious: I'm not supposed to see how they actually do things, am I? It's just... what if I want to override something and can't see how it's done?
Very few of Apple's frameworks are open-source. For the most part, if something is meant to be overridden, the documentation will contain the information you need to do it. There are a few rough edges, but that's generally the way it is. You shouldn't, ideally, need to see the implementation details — they are subject to change anyway. It's a sign of bad library design if you actually need to read a method's implementation to use it (not to say that it wouldn't be nice to have the source, but clients of the library shouldn't need it).
If you really like having the source to everything, Linux or BSD might interest you. (I personally don't feel the tradeoff in usability is worth it, but your mileage may vary.)

Libraries vs Original Code?

I'm working on an LGPL game engine library, and I prefer to code without dependencies. So far I have windowing code using Xlib and OpenGL code, but I'm worried that eventually I'll need to use libraries anyway. That may well be the case: I can write my own image loading and much more, but I can't write audio or networking code.
Now, I'm wondering, is it best to do it all myself for the learning experience? I'm sure I could figure it out, but what I'm really worried about is having bugs in my code that libraries have solved.
Then again, if I do use libraries, it seems pointless to write original code at all; I might as well just use libraries for everything.
I'm sorry if this is a hard question, but I have OCD, and for me it has to be either one or the other, or some kind of compromise like writing original code and offering libraries as alternatives (since everything is abstracted anyway).
"...if I do use libraries, it seems pointless to write original code at all; I might as well just use libraries for everything."
Right.
Notice that everyone seems to use libraries of other people's code.
Download a few dozen large, sophisticated open-source projects and look at the dependencies.
You can climb higher by standing on the shoulders of giants.
Use other people's code early and often. The "No Dependencies" lifestyle can't exist unless you write your own OS and language.
"...but I have OCD and..."
Doesn't matter. Keep your personal issues to yourself. Seriously. If you refuse to make a technical decision based on the technology, consider another line of work.
Personally, I write original code before I use a library to do it for me. I like to know how it works and I learn best by actually doing it. Some people can understand it better by simply reading through the libraries. It depends on what suits you best.
I would definitely use libraries for large projects to help avoid bugs though.
The libraries are there for us to use. There is no point in stressing yourself out to do something that is already done.
If there's something the libraries aren't offering, then you write your own code to satisfy that specific need. It is much faster and more efficient to do it this way.

How do you write good highly useful general purpose libraries?

I asked this question about Microsoft .NET Libraries and the complexity of its source code. From what I'm reading, writing general purpose libraries and writing applications can be two different things. When writing libraries, you have to think about the client who could literally be everyone (supposing I release the library for use in the general public).
What kinds of practices, theories, or techniques are useful when learning to write libraries? Where do you learn to write code like that in the .NET libraries? It looks like a "black art" that I don't know much about.
That's a pretty subjective question, but here's one objective answer. The Framework Design Guidelines book (be sure to get the 2nd edition) is a very good book about how to write effective class libraries. The content is very good, and the often dissenting annotations are thought-provoking. Every shop should have a copy of this book available.
You definitely need to watch Josh Bloch in his presentation How to Design a Good API & Why it Matters (1h 9m long). He is a Java guru but library design and object orientation are universal.
One piece of advice often ignored by library authors is to internalize costs. If something is hard to do, the library should do it. Too often I've seen the authors of a library push something hard onto the consumers of the API rather than solving it themselves. Instead, look for the hardest things and make sure the library does them or at least makes them very easy.
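As a hypothetical illustration (the functions and data below are invented, not from any real library), compare a design that pushes pagination onto every caller with one that absorbs it once inside the library:

# Pretend backing store, split into "pages" of at most two items each.
_FAKE_PAGES = [["a", "b"], ["c", "d"], ["e"]]

# Hard on the consumer: every caller must understand page tokens and
# write their own paging loop around this function.
def fetch_page(page_token=0):
    items = _FAKE_PAGES[page_token]
    next_token = page_token + 1 if page_token + 1 < len(_FAKE_PAGES) else None
    return items, next_token

# Cost internalized: the library owns the paging loop, and callers
# simply iterate over the results.
def fetch_all():
    token = 0
    while token is not None:
        items, token = fetch_page(token)
        yield from items

print(list(fetch_all()))  # ['a', 'b', 'c', 'd', 'e']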
I will be paraphrasing from Effective C++ by Scott Meyers, which I have found to be the best advice I got:
Adhere to the principle of least astonishment: strive to provide classes whose operators and functions have a natural syntax and an intuitive semantics. Preserve consistency with the behavior of the built-in types: when in doubt, do as the ints do.
Recognize that anything somebody can do, they will do. They'll throw exceptions, they'll assign objects to themselves, they'll use objects before giving them values, they'll give objects values and never use them, they'll give them huge values, they'll give them tiny values, they'll give them null values. In general, if it will compile, somebody will do it. As a result, make your classes easy to use correctly and hard to use incorrectly. Accept that clients will make mistakes, and design your classes so you can prevent, detect, or correct such errors.
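A minimal Python sketch of that last point (not from the book; the class is invented for illustration): validate at the boundary so misuse fails immediately and loudly instead of corrupting state somewhere downstream:

class Percentage:
    """A value guaranteed to be a number between 0 and 100."""

    def __init__(self, value):
        if not isinstance(value, (int, float)):
            raise TypeError(f"expected a number, got {type(value).__name__}")
        if not 0 <= value <= 100:
            raise ValueError(f"percentage must be in [0, 100], got {value}")
        self.value = float(value)

Percentage(42)       # fine
try:
    Percentage(250)  # rejected at the point of misuse
except ValueError as error:
    print(error)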
Strive for portable code. It's not much harder to write portable programs than to write unportable ones, and only rarely will the difference in performance be significant enough to justify unportable constructs.
Even programs designed for custom hardware often end up being ported, because stock hardware generally achieves an equivalent level of performance within a few years. Writing portable code allows you to switch platforms easily, to enlarge your client base, and to brag about supporting open systems. It also makes it easier to recover if you bet wrong in the operating system sweepstakes.
Design your code so that when changes are necessary, the impact is localized. Encapsulate as much as you can; make implementation details private.
Edit: I just noticed I very nearly duplicated what cherouvim had posted; sorry about that! But it turns out we're linking to different speeches by Bloch, even if the subject is exactly the same. (cherouvim linked to a December 2005 talk, I to a January 2007 one.) Well, I'll leave this answer here; you're probably best off watching both and seeing how his message and his way of presenting it have evolved :)
FWIW, I'd like to point to this Google Tech Talk by Joshua Bloch, who is a greatly respected guy in the Java world, and someone who has given speeches and written extensively on API design. (Oh, and designed some exceptionally good general purpose libraries, like the Java Collections Framework!)
Joshua Bloch, Google Tech Talks, January 24, 2007:
"How To Design A Good API and Why it Matters" (the video is about 1 hour long)
You can also read many of the same ideas in his article Bumper-Sticker API Design (but I still recommend watching the presentation!)
(Seeing you come from the .NET side, I hope you don't let his Java background get in the way too much :-) This really is not Java-specific for the most part.)
Edit: Here's another 1½-minute bit of wisdom from Josh Bloch on why writing libraries is hard, and why it's still worth putting effort into it (economies of scale), in response to a question wondering, basically, "how hard can it be?" (It's part of a presentation about the Google Collections library, which is also well worth watching, but more Java-centric.)
Krzysztof Cwalina's blog is a good starting place. His book, Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries, is probably the definitive work for .NET library design best practices.
http://blogs.msdn.com/kcwalina/
The number one rule is to treat API design just like UI design: gather information about how your users really use your UI/API, what they find helpful and what gets in their way. Use that information to improve the design. Start with users who can put up with API churn and gradually stabilize the API as it matures.
I wrote a few notes about what I've learned about API design here: http://www.natpryce.com/articles/000732.html
I'd start looking more into design patterns. You probably won't find much use for some of them, but as you get deeper into your library design, the patterns will become more applicable. I'd also pick up a copy of NDepend, a great code-metrics utility that may help you decouple things better. You can use the .NET libraries as an example, but personally I don't find them to be great design examples, mostly due to their complexity. Also, start looking at some open source projects to see how they're layered and structured.
A couple of separate points:
The .NET Framework isn't a class library. It's a Framework. It's a set of types meant to not only provide functionality, but to be extended by your own code. For instance, it does provide you with the Stream abstract class, and with concrete implementations like the NetworkStream class, but it also provides you the WebRequest class and the means to extend it, so that WebRequest.Create("myschema://host/more") can produce an instance of your own class deriving from WebRequest, which can have its own GetResponse method returning its own class derived from WebResponse, such that calling GetResponseStream will return your own class derived from Stream!
And your callers will not need to know this is going on behind the scenes!
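That extensibility pattern is easy to sketch in Python (hypothetical names; this is only an analogy, not the actual .NET mechanism): a factory keeps a registry of scheme handlers, so new subclasses plug in without callers of the factory ever knowing:

class Request:
    _handlers = {}

    @classmethod
    def register(cls, scheme, handler_cls):
        cls._handlers[scheme] = handler_cls

    @classmethod
    def create(cls, url):
        # Pick the subclass registered for the URL's scheme.
        scheme = url.split("://", 1)[0]
        return cls._handlers[scheme](url)

    def __init__(self, url):
        self.url = url

    def get_response(self):
        raise NotImplementedError

class HttpRequest(Request):
    def get_response(self):
        return f"HTTP response for {self.url}"

class MySchemaRequest(Request):
    # A user-defined extension: callers of Request.create never need to
    # know this class exists.
    def get_response(self):
        return f"custom response for {self.url}"

Request.register("http", HttpRequest)
Request.register("myschema", MySchemaRequest)

print(Request.create("myschema://host/more").get_response())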
A separate point is that for most developers, creating a reusable library is not, and should not be, the goal. The goal should be to write the code necessary to meet requirements. In the process, reusable code may be found. In that case, it should be refactored out into a separate library, where it can be reused in the future.
I go further than that (when permitted). I will usually wait until I find two pieces of code that actually do the same thing, or which overlap. Presumably both pieces of code have passed all their unit tests. I will then factor out the common code into a separate class library and run all the unit tests again. Assuming that they still pass, I've begun the creation of some reusable code that works (since the unit tests still pass).
This is in contrast to a lesson I learned in school, when the result of an entire project was a beautiful reusable library - with no code to reuse it.
(Of course, I'm sure it would have worked if any code had used it...)

To monkey-patch or not to?

This is a more general question than a language-specific one, although I bumped into the problem while playing with the Python ncurses module. I needed to display locale characters and have them recognized as characters, so I quickly monkey-patched a few functions / methods from the curses module.
This was what I call a fast and ugly solution, even though it works. The changes were relatively small, so I can hope I haven't messed anything up. My plan was to find another solution, but seeing that it works, and works well, you know how it is: I moved on to the other problems I had to deal with, and I'm sure that as long as no bug turns up in this, I'll never make it better.
The more general question occurred to me, though: obviously some languages allow us to monkey-patch large chunks of code inside classes. If it's code I only use myself, or the change is small, that's OK. But what if another developer takes my code, sees that I use some well-known module, and assumes it works the way it usually does? Then a method suddenly behaves differently than it should.
So, a very subjective question: should we use monkey patching, and if so, when and how? How should we document it?
edit: for #guerda:
Monkey-patching is the ability to dynamically change the behavior of some piece of code at runtime, without altering its source code.
A small example in Python:
import os

def ld(name):
    print("The directory won't be listed here, it's a feature!")

os.listdir = ld

# now what happens if we call os.listdir("/home/")?
os.listdir("/home/")
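If you do patch something like this, it's worth at least keeping a reference to the original so the patch can be undone; a slightly safer variant of the same toy example:

_original_listdir = os.listdir      # keep the original around

os.listdir = ld                     # apply the patch
try:
    os.listdir("/home/")            # patched behaviour
finally:
    os.listdir = _original_listdir  # always restore the original, even on error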
Don't!
Especially with free software, you have every opportunity to get your changes into the main distribution. But if you have a weakly documented hack in your local copy, you'll never be able to ship the product, and upgrading to the next version of curses (security updates, anyone?) will be very costly.
See this answer for a glimpse into what is possible on foreign code bases. The linked screencast is really worth a watch. Suddenly a dirty hack turns into a valuable contribution.
If you really cannot get the patch upstream for whatever reason, at least create a local (git) repo to track upstream and have your changes in a separate branch.
Recently I've come across a point where I had to accept monkey-patching as a last resort: Puppet is a "run-everywhere" piece of Ruby code. Since the agent has to run on (potentially certified) systems, it cannot require a specific Ruby version. Some of those versions have bugs that can be worked around by monkey-patching select methods in the runtime. These patches are version-specific, contained, and the target is frozen. I see no alternative there.
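In Python terms, such a contained, version-guarded patch might look roughly like this (everything below is hypothetical, and the "buggy" module is faked in-file so the sketch runs standalone):

import sys
import types

# Stand-in for a third-party module whose helper misbehaves on some versions.
buggy_module = types.SimpleNamespace(helper=lambda value: value)

def _patched_helper(value):
    # Hypothetical corrected behaviour for the affected versions.
    return value.strip()

# Apply the workaround only where it is known to be needed, so the patch is
# contained and disappears naturally once those versions are out of support.
if sys.version_info < (3, 9):
    buggy_module.helper = _patched_helper

print(buggy_module.helper("  hello  "))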
I would say don't.
Each monkey patch should be an exception and marked (for example with a //HACK comment) as such so they are easy to track back.
As we all know, it is all too easy to leave the ugly code in place because it works, so why spend any more time on it? As a result, the ugly code will be there for a long time.
I agree with David in that monkey patching production code is usually not a good idea.
However, I believe that for languages that support it, monkey patching is a very valuable tool for unit testing. It allows you to isolate the piece of code you need to test even when it has complex dependencies - for instance with system calls that cannot be Dependency Injected.
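For example, with Python's standard unittest.mock, a system call can be patched just for the duration of a test and is restored automatically afterwards (the function under test here is invented for illustration):

import os
import unittest
from unittest import mock

def count_entries(path):
    # Function under test: depends on a real system call.
    return len(os.listdir(path))

class CountEntriesTest(unittest.TestCase):
    def test_count_entries(self):
        # os.listdir is patched only inside this block.
        with mock.patch("os.listdir", return_value=["a.txt", "b.txt"]):
            self.assertEqual(count_entries("/irrelevant"), 2)

if __name__ == "__main__":
    unittest.main()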
I think the question can't be addressed with a single definitive yes-no/good-bad answer - the differences between languages and their implementations have to be considered.
In Python, one needs to consider whether a class can be monkey-patched at all (see this SO question for discussion), which relates to Python's slightly less-OO implementation. So I'd be cautious and inclined to expend some effort looking for alternatives before monkey-patching.
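One commonly cited limitation, as a quick illustration: classes written in Python are open for patching, while C-implemented built-in types refuse it outright:

class Greeter:
    def hello(self):
        return "hello"

# Pure-Python classes are open: replacing a method works.
Greeter.hello = lambda self: "patched hello"
print(Greeter().hello())  # patched hello

# Built-in (C-implemented) types are closed: this raises TypeError.
try:
    str.shout = lambda self: self.upper() + "!"
except TypeError as error:
    print(error)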
In Ruby, OTOH, which was built to be OO down into the interpreter, classes can be modified irrespective of whether they're implemented in C or Ruby. Even Object (pretty much the base class of everything) is open to modification. So monkey-patching is rather more enthusiastically adopted as a technique in that community.