I don't know Smalltalk well, but I know some Objective-C, and I'm very interested in Smalltalk.
Their syntax is very different, but the essential runtime structures (which is to say, the features) are very similar, and those features are supported by the runtime.
I thought the two languages were very similar in that sense, but there are many features in Smalltalk that are absent from the Objective-C runtime. For example, thisContext, which manipulates the call stack. Or non-local return, which unwinds block execution. Or blocks themselves: they used to exist only in Smalltalk, though now they are implemented in Objective-C too.
Because I'm not a Smalltalk expert, I don't know about those sorts of features, especially the ones aimed at advanced users. What features are available only in Smalltalk? Essentially, I want to know Smalltalk's advanced features, so it's fine to include features that are already implemented in Objective-C, like blocks.
While I'm reasonably experienced with Objective-C, I'm not as deeply versed in Smalltalk as many are, but I've done a bit of it.
It would be difficult to really enumerate a list of which language has which features for a couple of reasons.
First, what is a "language feature" at all? In Objective-C, even blocks are really built in conjunction with the Foundation APIs, and things like the for(... in ...) syntax require conformance to a relatively high-level protocol. Can you really talk about a language any more without also considering the features of its most important API(s)? The same goes for Smalltalk.
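For instance, here is a small sketch of what that looks like in practice; the for(... in ...) loop below only works because NSArray adopts the NSFastEnumeration protocol:

#import <Foundation/Foundation.h>

int main(void) {
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];

    NSArray *fruits = [NSArray arrayWithObjects:@"apple", @"pear", @"plum", nil];

    // The "language feature" leans on the API: this loop works because
    // NSArray conforms to the NSFastEnumeration protocol.
    for (NSString *fruit in fruits) {
        NSLog(@"%@", fruit);
    }

    [pool drain];
    return 0;
}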
Secondly, the two are very similar in terms of how messaging works and how inheritance is implemented, but they are also very different in how code goes from a thought in your head to running on your machine. Conceptually different to the point that it makes a feature-by-feature comparison between the two difficult.
The key difference between the two really comes down to the foundation upon which they are built. Objective-C is built on top of C and thus inherits all the strengths (speed, portability, flexibility, etc.) and weaknesses (effectively a macro assembler, goofy call ABI, lack of any kind of safety net) of C and compiled-to-the-metal languages. While Objective-C layers on a bunch of relatively high-level OO features, both compile-time and runtime, there are limits because of the nature of C.
Smalltalk, on the other hand, takes a much more top-to-bottom pure-OO model; everything, down to the representation of a bit, is an object. Even the call stack, exceptions, the interfaces... everything is an object. And Smalltalk runs on a virtual machine which is typically, in and of itself, a relatively small native bytecode interpreter that consumes a stream of Smalltalk bytecode implementing the higher-level functionality. In Smalltalk, it is much less about creating a standalone application and much more about configuring the virtual machine with a set of state and functionality that renders the features you need (wherein that configuration can effectively be snapshotted and distributed like an app).
All of this means that you always -- outside of locked down modes -- have a very high level shell to interact with the virtual machine. That shell is really also typically your IDE. Instead of edit-compile-fix-compile-run, you are generally writing code in an environment where the code is immediately live once it is syntactically sound. The lines between debugger, editor, runtime, and program are blurred.
Not a language feature, but the nil-eating behaviour of most Objective-C frameworks gives a very different development experience than the pop-up-a-debugger, fix-and-continue workflow of Smalltalk.
Even though Objective-C now supports blocks, the extremely ugly syntax is unlikely to lead to much use. In Smalltalk blocks are used a lot.
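For reference, here is roughly what that syntax looks like, as a small sketch using NSArray's sortedArrayUsingComparator: (which takes a block); compare the caret-and-brace noise to Smalltalk's [:a :b | ...] form:

#import <Foundation/Foundation.h>

int main(void) {
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];

    NSArray *names = [NSArray arrayWithObjects:@"Charlie", @"alice", @"Bob", nil];

    // An Objective-C block literal: ^ return-type (arguments) { body }
    NSArray *sorted = [names sortedArrayUsingComparator:^NSComparisonResult(id a, id b) {
        return [a caseInsensitiveCompare:b];
    }];

    NSLog(@"%@", sorted);
    [pool drain];
    return 0;
}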
Objective-C 2.0 supports blocks.
It also has non-local returns in the form of return, but perhaps you particularly meant non-local returns within blocks passed as parameters to other functions.
thisContext isn't universally supported, as far as I'm aware. Certainly there are Smalltalks that don't permit the use of continuations, for instance. That's something provided by the VM anyway, so I can conceive of an Objective-C runtime providing such a facility.
One thing Objective-C doesn't have is become: (which atomically swaps two object pointers). Again, that's something that's provided by the VM.
Otherwise I'd have to say that, like bbum points out, the major difference is probably (a) the tooling/environment and hence (b) the rapid feedback you get from the REPL-like environment. It really does feel very different, working in a Smalltalk environment and working in, say, Xcode. (I've done both.)
I got into a conversation with someone about OOP, who said that OOP costs too much in performance. Now I know that in some cases it might, but as I see it, it would depend on different things.
Language execution. In languages using an interpreter, I can see that it could be a possibility. But what about a compiled language like C++, or a half-compiled one like Java? In any case it would just slow down the compilation vs. C, but as native or byte code I would think that the compilers would have optimized it to a point where this is not a problem.
Language structure. If we take PHP as an example, it is quite a flexible language with few rules. Java, on the other hand, uses strict naming schemes, strict file structure rules and is strict about data types. This speeds up lookup quite a bit. What if we used the same rules in PHP? Made it 100% OOP and adopted the same rules as Java has, would this not speed up PHP?
I found a really great OOP example, but this example does not prove the upside of OOP, but rather the upside of overview and structure. It's no problem using procedural programming to do the same, at least not in PHP.
OOP is a very moot term and that's why your question is moot as well.
On the most generic level, OOP is about objects (let's not dive into what they are) encapsulating some state and passing each other messages to enquire or change that state. As you can see, these objects might be processes running on separate network-connected machines and message passing might be done quite literally—by passing messages of some application-level protocol over that network; this is the one extreme. The opposite edge of this spectrum is, say, C or C++ or Object Pascal etc which are compiled down to machine instructions and in which objects are just memory regions. I reckon the only "interesting" topic is a language on this side of the OOP spectrum, right?
At this down-to-the-machine level, the only relevant slowdown I perceive is dynamic dispatch, which is what is typically used to implement implementation inheritance (class Bar extends class Foo, as in PHP) and which allows you to pass objects of derived classes to code expecting objects of their base class. This typically requires a lookup through a table of methods at runtime to select the relevant method.
Note that this is not somehow inherent to the concept of OOP. For instance, dynamic lookup like this has been routinely used in plain C code even before C++ came into existence, and C is not an OOP language.
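As a rough sketch of what that kind of hand-rolled dynamic dispatch looks like in plain C (all names invented for illustration), the indirect call through a function-pointer slot is exactly the runtime cost in question:

#include <stdio.h>

/* A hand-rolled dispatch table in plain C; it is the same mechanism a C++
   vtable or a message-send lookup automates for you. */
typedef struct Shape Shape;
struct Shape {
    double (*area)(const Shape *self);   /* "virtual" method slot */
};

typedef struct { Shape base; double r; } Circle;
typedef struct { Shape base; double w, h; } Rect;

static double circle_area(const Shape *s) {
    const Circle *c = (const Circle *)s;
    return 3.141592653589793 * c->r * c->r;
}

static double rect_area(const Shape *s) {
    const Rect *r = (const Rect *)s;
    return r->w * r->h;
}

int main(void) {
    Circle c = { { circle_area }, 2.0 };
    Rect   r = { { rect_area }, 3.0, 4.0 };
    const Shape *shapes[] = { &c.base, &r.base };

    for (int i = 0; i < 2; i++) {
        /* Which function runs is only known at runtime: */
        printf("area = %f\n", shapes[i]->area(shapes[i]));
    }
    return 0;
}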
What I'm leading you to is that some ways to access data and code cost more than others in terms of performance but provide powerful programming tools. Picking such an algorithm while considering the resulting trade-off is not at all peculiar to implementing OOP concepts and happens in any computing field and any computing paradigm or combination of them.
In the end, I would say that the most visible slowdowns will come not from the code running on a CPU but rather from the runtime system. For instance, PHP is known for its ability to dynamically load code at runtime. Does this count as a feature of it being an OOP-enabled language? On the one hand, in these days of heavy frameworks, when PHP loads something it usually loads definitions of classes. On the other hand, if these frameworks were, say, purely procedural, the same performance cost would be incurred (as most of the loading time is spent waiting on I/O). Interpreted or JIT-compiled languages have to interpret or compile the code they execute, and this incurs performance hits. Does this depend on some of these languages implementing OOP concepts? Unlikely, IMO.
I'm currently working on a project for my class. I'm building a compiler with Flex (lex), Bison (YACC) and C. I have done just a little bit of semantic and syntax analysis, but I have been thinking about how I'm going to implement the object-oriented part. That is, how can I handle classes, overloading, polymorphism and inheritance?
I can't seem to find anything useful on Google, and the dragon book is just too low-level, I mean too focused on building a compiler from scratch. So I was hoping that someone could point me to a good book, tutorial or example; something that can help me clear up my doubts.
Thanks in advance for the help, and I'm sorry if someone thinks this is asking to have my homework done.
I agree with the first comment that this question is far too broad to be answered. But I'll try anyway.
There are several aspects to your question:
What are the semantics of the commonly used concepts of object-oriented programming?
How can they be implemented in a compiler?
How are they usually implemented in other compilers?
What are good resources for further studies?
Semantics
Varies widely between languages, and there also is quite a bit of confusion/controversy about what OOP actually means (a nice presentation on that topic: http://www.infoq.com/presentations/It-Is-Possible-to-Do-OOP-in-Java which also has some examples of implementing OOP features). Just pick one model and look up a reference that defines the semantics, such as a language specification or a scientific paper on the model.
JavaScript probably is the easiest model to implement, as it maps very directly to the implementation without much of a surrounding framework being necessary in the compiler. A static version of the Java model (compile-time class compilation instead of runtime classloading) shouldn't be too hard either. More sophisticated models would be C++ (which allows multiple inheritance) and Smalltalk or Common Lisp/CLOS (with meta-object protocols).
Possible Implementations
Again a wide range of choices. As the semantics are fixed and mostly rather straightforward, the implementation effort most strongly depends on the performance you want to achieve and the existing infrastructure of your compiler. Storing everything in lists and scanning them for the first entry that satisfies the rules is probably the easiest implementation.
Usual Implementation
Most programming languages in the Java/C#/C++ area do static compile-time name/signature lookups to find the definitions of the things referred to, and use a virtual method table (http://en.wikipedia.org/wiki/Virtual_method_table) to resolve polymorphic calls. They also use the vtable pointer for instanceof checks and for checking down-casts.
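Here is a sketch of that layout, in illustrative C rather than any particular compiler's real structures: every object starts with a pointer to its class descriptor, the vtable slots hang off that descriptor, and an instanceof check just walks the parent chain.

#include <stdio.h>
#include <stdbool.h>

/* Illustrative only. */
typedef struct Class Class;
struct Class {
    const char  *name;
    const Class *parent;
    /* method (vtable) slots would follow here */
};

typedef struct {
    const Class *isa;   /* class pointer stored at the start of every instance */
} Object;

static const Class AnimalClass = { "Animal", NULL };
static const Class DogClass    = { "Dog", &AnimalClass };

static bool instance_of(const Object *obj, const Class *cls) {
    const Class *c;
    for (c = obj->isa; c != NULL; c = c->parent) {
        if (c == cls) return true;
    }
    return false;
}

int main(void) {
    Object dog = { &DogClass };
    printf("dog instanceof Dog:    %d\n", instance_of(&dog, &DogClass));    /* 1 */
    printf("dog instanceof Animal: %d\n", instance_of(&dog, &AnimalClass)); /* 1 */
    return 0;
}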
Resources
While only 30 pages are directly concerned with objects, I still think Lisp in Small Pieces (LiSP) is a great book for learning to work at that level within a compiler. It focuses on implementing language features, trade-offs in implementations and fitting the pieces together. (if (you can get over the syntax used) (it's great)).
I'd like to learn Objective-C and Cocoa. I want to ask if you can recommend anything for learning that language and the Cocoa framework (for Mac OS X development). I currently know PHP. Will it be difficult to learn Obj-C coming from PHP?
PS: English is not my first language, though I have a quite good level. Would it be that difficult to learn with my knowledge of English?
Excellent! It's inspiring to hear you want to learn some new programming languages.
Objective-C is an interesting language because it is a superset of ANSI C that adds message passing. You may want to consider learning C first, because it will help you learn some computer science fundamentals that don't come up in PHP, and once you know C, Objective-C is much easier to understand.
Also, I find that when learning a new language, it helps very much to understand some of the differences between them. (Forgive me if you already understand the following information or if it is too basic!)
PHP is an interpreted language. Thus, every time you run a PHP script, the PHP binary (or CGI executable) decides what to do with each function call or statement you make in the script. On the other hand, C is a compiled language. This means that first you write C code, then "compile" it into assembly language (which is a written-language representation of machine code, more or less), and then you assemble it down to machine code (1's and 0's).
Thankfully, you don't have to do these steps yourself! The compiler and assembler do these. The point is that C code ultimately is transformed into a binary application that runs right on the computer's processor without being interpreted.
You'll need to learn how to manage memory and data structures on your own. In PHP, memory for variables and structures is automatically allocated for you. In C or Objective-C, your application will need to do this using a function call or message. In addition, you will need to dispose of the memory when your application no longer needs the variable or data structure.
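To make that concrete, here is a small sketch, shown in pre-ARC manual retain/release style since that is what this answer assumes, of what "doing it yourself" looks like in C and in Objective-C:

#import <Foundation/Foundation.h>
#include <stdlib.h>

int main(void) {
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];

    /* Plain C: you ask for memory and you must give it back yourself. */
    int *numbers = malloc(10 * sizeof(int));
    if (numbers != NULL) {
        numbers[0] = 42;
        free(numbers);              /* forget this and you leak */
    }

    /* Objective-C with manual reference counting: alloc gives you an
       object you own; release it when you are done with it. */
    NSMutableArray *list = [[NSMutableArray alloc] init];
    [list addObject:@"hello"];
    [list release];                 /* balances the alloc */

    [pool drain];
    return 0;
}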
PHP is what is called a "loosely" or "dynamically" typed language, meaning that checking a type of a variable (for the purposes of converting one type to another) is done while the script runs.
On the other hand, C and Objective-C are (mostly) statically typed, which means that type conversions are checked when the application is compiled.
Finally, Objective-C also has message passing, which is similar to a function call, although a message is always sent to an object.
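A quick illustration of the difference:

#import <Foundation/Foundation.h>
#include <string.h>

int main(void) {
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];

    /* A C function call: which code runs is fixed at compile/link time. */
    size_t cLength = strlen("hello");

    /* An Objective-C message send: the receiving object decides at runtime
       which method implementation handles the "length" message. */
    NSString *greeting = @"hello";
    NSUInteger objcLength = [greeting length];

    NSLog(@"%lu and %lu", (unsigned long)cLength, (unsigned long)objcLength);

    [pool drain];
    return 0;
}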
There are many other differences as well, but these are some of the main ones. Feel free to comment with questions.
Also, to others, feel free to point out any errors or things I may have missed.
In addition to Tom's answer, I would say that you need a good understanding of memory allocation and pointers. These are new concepts coming from PHP.
I would recommend learning and practicing the different layers of memory management from simple to complex, and I would use other languages as a bridge from PHP to Objective-C:
stack allocation (C)
raw heap allocation: malloc() / free() (C)
smart heap allocation: C++ new / delete (C++)
automatic memory management based on references (C#)
reference counting and garbage collection (Objective-C)
And the tools to handle that memory:
value types, pointers and arrays (C)
pointers to objects (C++)
references (C++ and C#)
This will help you understand the difference between a block of memory and a pointer or reference that points to that block of memory.
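As a small illustration of that last point, in manual Objective-C retain/release style (the variable names are just for the example): two pointers, one block of memory, and the object stays alive until every owner has released it.

#import <Foundation/Foundation.h>

int main(void) {
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];

    NSMutableString *a = [[NSMutableString alloc] initWithString:@"shared"]; // retain count 1
    NSMutableString *b = [a retain];   // a second pointer to the SAME block of memory
    [a release];                       // a gives up ownership; b keeps the object alive
    [b appendString:@" data"];
    NSLog(@"%@", b);                   // prints "shared data"
    [b release];                       // last owner lets go; the object is deallocated

    [pool drain];
    return 0;
}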
Good Luck
Of the object-oriented languages I know, pretty much all but C++ and Objective-C compile to bytecode running on some sort of virtual machine. Why have so many different languages settled on compiling to bytecode, as opposed to machine code? Is it possible in principle to have a high-level memory-managed OOP language that compiles to machine code?
Edit: I'm aware that multiplatform support is often advanced as an advantage of this approach. However, it's quite possible to compile natively on multiple platforms, without making a new compiler per platform. One can, for example, emit C code and then compile that with GCC.
There's no reason, in fact; it is a kind of coincidence. OOP is now the leading concept in "big" programming, and so are virtual machines.
Also note that there are 2 distinct parts of traditional virtual machines - the garbage collector and the bytecode interpreter/JIT compiler - and these parts can exist separately. For example, the Common Lisp implementation SBCL compiles programs to native code, but heavily uses garbage collection at runtime.
This is done to allow a VM or JIT compiler the chance to compile the code on demand optimally for the architecture on which the code is executed. Also, it allows for cross-platform bytecode to be created once and then executed on multiple hardware architectures. This allows for hardware specific optimizations to be placed into the compiled code.
Since byte code is not limited to a microarchitecture, it can be smaller than machine code. Complex operations can be represented as single instructions, versus the much more primitive instructions available in modern-day CPUs, since the constraints in the design of CPU instructions are very different from the constraints in designing a bytecode architecture.
Then there's the issue of security. The bytecode can be verified and analyzed prior to execution (i.e., no buffer overflows, variables of a certain type being accessed as something they are not), etc...
Java uses bytecode because two of its initial design goals were portability and compactness. Those both came from the initial vision of a language for embedded devices, where fragments of code could be downloaded on the fly.
Python, Ruby, Smalltalk, JavaScript, awk and so on use bytecode because writing a native compiler is a lot of work, but a textual interpreter is too slow - bytecode hits a sweet spot of being fairly easy to write, but also satisfactorily quick to run.
I have no idea why the Microsoft languages use bytecode, since for them, neither portability nor compactness is a big deal. A lot of the thinking behind the CLR came out of computer scientists in Cambridge, so I imagine considerations like ease of program analysis and verification were involved.
Note that as well as C++ and Objective C, Eiffel, Ada 9X, Vala and Go are OO languages (of varying vintage) that are compiled straight to native code.
All in all, I'd say that OO and bytecode do not go hand in hand. Rather, we have a coincidental convergence of several streams of development: the traditional bytecoded interpreters of scripting languages like Python and Ruby, the mad Gosling masterplan of Java, and whatever Microsoft's motives are.
The biggest reason why most interpreted languages (not specifically OO languages) are compiled to bytecode is for performance. The most expensive part of interpreting code is transforming text source to an intermediate representation. For instance, to perform something like:
foo + bar;
The interpreter would have to scan 10 characters, transform them into 4 tokens, build an AST for the operation, resolve three symbols (+ is a symbol, which depends on the types of foo and bar), all before it can perform any action that actually depends on the run-time state of the program. None of this can change from run to run, and so many languages try to store some form of intermediate representation.
Bytecode, rather than a stored AST, has a few advantages. For one, bytecodes are easy to serialize, so the IR can be written to disk and reused at the next invocation, further reducing interpretation time. Another reason is that bytecode often takes up less actual RAM. Most significantly, bytecode representations are often easy to just-in-time compile, because they are often structurally similar to typical machine code.
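To make that concrete, here is a toy sketch, with invented opcodes rather than any real VM's format, of the inner loop an interpreter runs once something like foo + bar has already been reduced to bytecode; no scanning, parsing or symbol resolution is left to do at run time:

#include <stdio.h>

/* Invented opcodes for a toy stack machine, not any real VM's format. */
enum { OP_PUSH, OP_ADD, OP_PRINT, OP_HALT };

int main(void) {
    /* "2 + 3" already compiled to bytecode: */
    int code[] = { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PRINT, OP_HALT };
    int stack[16];
    int sp = 0, pc = 0;

    for (;;) {
        switch (code[pc++]) {
        case OP_PUSH:  stack[sp++] = code[pc++]; break;
        case OP_ADD:   sp--; stack[sp - 1] += stack[sp]; break;
        case OP_PRINT: printf("%d\n", stack[sp - 1]); break;   /* prints 5 */
        case OP_HALT:  return 0;
        }
    }
}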
As another data point, the D programming language is GC'ed, OO, and a lot higher level than C++ while still being compiled to native code.
Bytecode is a significantly more flexible medium than machine code. First, it provides the basis for platform portability without the need for a compiler or shipping source code. So a developer can distribute a single version of the application without needing to give up the source, require complex developer tools, or anticipate potential target platforms. While the latter is not always practical, it does happen, especially with developer libraries: say I distribute a library that I've only tested on Windows, but someone else uses it on Linux or Android. It happens quite frequently actually, and most of the time it works as expected.
Byte code is also generally more optimized than what an interpreter works with, because it's closer to machine instructions and therefore faster to translate to machine instructions. Not all OO languages are compiled. Ruby, Python, and even JavaScript are interpreted, so they aren't compiled to anything; the Ruby interpreter has to take a very flexible language and turn it into instructions, and that flexibility comes at a price paid at runtime: parsing text, generating an AST, translating the AST to machine code, etc. It's also easy to do optimizations like JIT, where byte code is translated to machine code directly, and this even gives the possibility of creating optimizations for specific hardware.
Finally, just because one language compiles to bytecode doesn't preclude other languages from taking advantage of that byte code. Now any optimization using that byte code can be applied to these other languages that might know how to translate themselves to that byte code. That makes the byte code a very important layer for reusability for other languages.
OO and byte code compilation go back to the 70s with Smalltalk, and I'm sure someone will say LISP as early as the 50s/60s. But it really wasn't until the 90s that it started to be used in production systems on a large scale.
Native compilation sounds like the optimal path, and is probably why our industry spent 20 years or more thinking that was THE ANSWER to all our problems, but over the last 15 years we've seen byte code compilation take the stage, and it's been a significant advantage over what we did before. Looking back, we realize how much time we wasted natively compiling everything, mostly by hand.
I agree with Chubbard's answer, and I'd add that in OO languages type information can be very important for enabling optimizations by virtual machines or last-level compilers.
It is easier to develop an interpreter than a compiler.
Effort in development of...:
interpreter < bytecode-interpreter < bytecode-jit-compiler < compiler-to-platform-independent-language < compiler-to-multiple-machine-dependent-assembler.
It is a general trend to stop development at the JIT-compiler stage because of platform independence. Only the languages preferred for performance and for research in theoretical computer science are, and will be, developed in ALL possible directions, including new bytecode interpreters, even while there are good and advanced compilers to platform-independent languages and to different machine-dependent assemblers.
The research in OOP languages is pretty... let's say dull, compared to functional languages, because really new language and compiler technologies are more easily expressed with/in/using mathematical category theory and mathematical descriptions of Turing-complete type systems. In other words: it is nearly functional in itself, while imperative languages are nearly only assembler front-ends with some syntactic sugar. OOP languages tend to be imperative languages, because functional languages already have closures and lambda. There are other ways to implement Java-like "interfaces" in functional languages, and there is just no need for additional object-oriented features.
In Haskell, for instance, adding the feature of OOP-like programming would probably be more than only a few steps back in technology; there would be no point in using it. (And that is not only IMHO... have you ever heard of GADTs or multi-parameter type classes?) There might even be better ways to dynamically create objects with interfaces to communicate with OOP languages than changing the language itself. But there are other functional languages, too, that explicitly combine functional and OOP aspects. There is just more science with mainly functional languages than with non-functional OO languages.
OO languages cannot easily be compiled to other OO languages if they are in some way more "advanced". Usually, they have features like stack protectors, advanced debugging abilities, abstract and inspectable multi-threading, dynamic object loading from files or from the internet... Many of these features are not, or not easily, realisable with C or C++ as a compiler backend. The functional language LISP (which is 50 years old!) was, AFAIK, the first with a garbage collector. As a compiler backend, LISP used a hacked version of the language C, because plain C did not allow some of the things that assembler does, e.g. proper tail calls or tables-next-to-code. C-- allows that.
Another aspect: imperative languages are intended to run on a specific architecture, i.e. C and C++ programs run only on those architectures they are programmed for. Java is more extreme: it runs only on a single architecture, a virtual one, which itself runs on others.
Functional languages are usually by design pretty architecture-independent: LISP was developed to be so immensely architecture-unspecific that it could be compiled to genetic code in some distant future. Yes, like programs running in living biological cells.
With the bytecode for the LLVM, functional languages will most likely be compiled to bytecode in the future, too. Most imperative languages will most likely still have the same inherited problems as they have now from not abstracting far enough. Well, I'm not that sure about clang and D, but those two are not "the most" anyway.
I've become very comfortable in the world of pointer-free, garbage-collected programming languages. Now I have to write a small Mac component. I've been learning Objective-C, but as I confront the possibility of dangling pointers and the need to manage retain counts, I feel disheartened.
I know that Objective-C now has garbage collection but this only works with Leopard. My component must work with Tiger, too.
I need to access some Cocoa libraries not available to Java, so that rules out my usual weapon of choice.
What are my alternatives? Especially with no explicit pointers and automatic garbage collection.
What do you mean by "component?" Do you mean a chunk of code or a library you are going to hand to other people to link into their apps? If so then it is not realistic to use any of the bridged languages at this time. While a lot of the bridges are very nice, they almost always have complications and issues that most app developers will not be willing to deal with to use a single component, especially if it involves bringing in a substantial runtime.
The bridges are most valuable to bridge other language libraries into your Objective C app. While you can write fairly complete apps using them, doing so often requires a better understanding of Objective C than simply writing an Objective C application, since you need to understand and cope with the language, object model, threading, and memory allocation impedance mismatches that occur.
This is also why many people argue that even if you are quite familiar with a language, trying to learn Cocoa using that language through a bridge is generally more difficult than learning it using Objective-C.
Finally, much of the recent support for bridged languages was due to "BridgeSupport," a feature that was added in Leopard. Even bridges that predate it have been migrating towards it, sometimes in such a way that using the bridged language on Tiger and Leopard can have substantial differences. Also, there is currently no bridge support for iPhone, and most bridged languages will not work on it, if that is an issue.
Ultimately, if you are writing a library that is going to be linked into other apps, you need to run on Tiger and Leopard, and you need to access Cocoa-only APIs, I think you will find using any non-Objective-C solution quite difficult.
You can try PyObjC to write Cocoa apps in Python, or MacRuby if you are interested in Ruby.
You shouldn't be intimidated by Cocoa's retain/release reference counting. It's much, much easier in practice than GC fans would have you believe. The Cocoa memory management rules are dead simple, they only affect a tiny amount of your code, and even that code can be generated automagically.
Here's the trick: you encapsulate your MM code in accessor methods, and always use accessors. Xcode has built-in scripts to generate the appropriate accessors, or if you need more flexibility there are 3rd-party apps like Accessorizer.
This isn't an intrusive approach - you only need to worry about retaining an object if you're going to need to keep it for later use, and if you're going to do that you'll need an instance variable in which to keep it anyway. And, if you're using KVO and bindings, you'll need to use accessors to make sure the appropriate observer notifications are fired. Basically, if you're using good OOP and Cocoa practices, there's practically no additional thought or effort involved with memory management.
Most folks who have difficulty with Cocoa's "manual" memory management do so as a result of misusing it. The most common mistake is to scatter the relevant code all over the place. That means that a missing retain, extra release, etc. will be difficult to find.
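For anyone who hasn't seen the pattern, here is a minimal sketch of the accessor approach described above (pre-ARC, written by hand rather than generated by Xcode; the class and property names are made up):

#import <Foundation/Foundation.h>

@interface Person : NSObject {
    NSString *_name;
}
- (NSString *)name;
- (void)setName:(NSString *)newName;
@end

@implementation Person

- (NSString *)name {
    return _name;
}

- (void)setName:(NSString *)newName {
    if (newName != _name) {
        [_name release];            // let go of the old value
        _name = [newName retain];   // take ownership of the new one
    }
}

- (void)dealloc {
    [_name release];                // the only other place retain/release appears
    [super dealloc];
}

@end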
Try any of the Cocoa bridges listed here: http://www.cocoadev.com/index.pl?CocoaBridges
You can also try F-Script - a Smalltalk dialect that is written specifically for Mac OS X/Cocoa.
RubyCocoa is getting steadily more impressive, and I've seen lots of successful implementations using it. That is, of course, if Ruby's your cup of tea...
You can always use REALbasic (www.realsoftware.com). It's real easy and fun to use, though not free. You cannot make dylibs (or DLLs) using it, but you can use dylibs and DLLs in your code. And you can use Cocoa libraries as well.
Don't forget that you can use Java as well, and I don't mean the Java-Cocoa bridge, I mean actual Java.
There's also a package from Apple that provides you with access to a couple of OS X features as well.
Also, to comment on Shem's point, if you're targeting OS X 10.5 and above, you can take advantage of garbage collection.
If you want Lisp syntax, then Nu is a Lisp implemented on top of Objective-C: http://www.programming.nu/
Also, FreePascal can generate native Carbon apps (Cocoa support is a work in progress).
Look at Python and wxPython (the wxWidgets in Python).
The wxWidgets have a very elegant App-Doc-View application design pattern that's very, very nice. It's not used enough, IMO. I haven't found any wxPython examples of this App-Doc-View example, so you have to use the C examples to reason out how it would work in Python.
I'd post examples, but I haven't got it all working yet, either.
.NET via Mono mono-project.com
See the NObjective bridge to Cocoa (http://code.google.com/p/objcmapper/). It provides more features than others with less overhead.
I'm looking into Mono too. Objective-C is a little too bizarre for me at this point. Too many years doing C/C++, Java, C#, Perl etc. I suppose. All these seem pretty easy to float between. Not so for Objective-C. Love my Mac but afraid it will take too much precious time to master the language.