When implementing an interpreter, is it a good or a bad idea to piggyback on the host language's garbage collector? - language-design

Let's say you are implementing an interpreter for a GCed language in a language that is GCed.
It seems to me you'd get garbage collection for free as long as you are reasonably careful about your design.
Is this generally how it's done? Are there good reasons not to do this?

Language and runtime are two different things. They are not really related, IMHO.
Therefore, if your existing runtime already offers a GC, you would need a good reason to extend that runtime with another GC. In the good old days, when memory allocations in the OS were slow and expensive, apps brought their own heap managers, which were more efficient at dealing with small chunks of data. That was one reason for adding another layer of memory management to an existing runtime (or OS). But if you're talking Java, .NET or so - those should be good and efficient enough for most tasks at hand.
However, you may want to create a proper interface/API for memory and object management tasks (and others), so that your language ("guest") runtime could be implemented on top of another host runtime later on.
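To make that concrete, here is a minimal sketch of such an interface in D (itself a GC'd language; the names are purely illustrative, not from any real runtime):

    // Hypothetical abstraction layer: the guest runtime allocates only
    // through this interface, so the backing strategy (host GC today,
    // a custom heap later) can be swapped without touching the interpreter.
    interface GuestHeap
    {
        Object allocate();   // hand out a fresh guest object
        void collect();      // request a collection, where that is meaningful
    }

    // Default implementation: simply lean on the host GC.
    class HostGcHeap : GuestHeap
    {
        Object allocate() { return new Object(); }
        void collect()
        {
            import core.memory : GC;
            GC.collect();    // delegate to the host collector
        }
    }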

For an interpreter, there should be no problem with using the host GC, IMHO, particularly at first. As always, the goal should be to get something working, then make it work right, then make it fast. This is particularly true for Domain Specific Languages (DSLs), where the goal is a small language. For these, implementing a full GC would be overkill.
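As a concrete illustration of piggybacking, here is a minimal sketch in D (a GC'd host language; the guest-value classes are hypothetical). Guest values are represented as ordinary host objects, so anything the interpreter no longer references is reclaimed by the host GC without any collector code of your own:

    import std.stdio;

    // Guest-language values are plain host (D) class instances, so the
    // host garbage collector manages their lifetimes for us.
    class Value {}
    class Num : Value
    {
        double n;
        this(double n) { this.n = n; }
    }
    class Cons : Value
    {
        Value head, tail;
        this(Value head, Value tail) { this.head = head; this.tail = tail; }
    }

    void main()
    {
        // Build the guest list (1 2). Once nothing in the interpreter
        // references it, the host GC may reclaim the whole chain.
        Value list = new Cons(new Num(1), new Cons(new Num(2), null));
        writeln("guest list allocated; its lifetime is the host GC's problem");
    }

The "reasonably careful" part of the question is where this breaks down: guest features that the host GC does not map onto directly, such as guest-visible finalizers or weak references, need extra machinery.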

Related

Pros and cons of using object-oriented programming for Progress OpenEdge

I understand the pros and cons of using object-oriented programming as a concept. What I'm looking for are the pros and cons of using OO in Progress/OpenEdge specifically. Are there challenges that I need to take into account? Are there parts of the language that don't mesh well with OO? Stuff like that.
Edit: using 10.2B
I'll give you my opinion, but be forewarned that I'm probably the biggest Progress hater out there. ;) That said, I have written several medium-sized projects in OOABL so I have some experience in the area. These are some things I wrote, just so you know I'm not talking out of my hat:
STOMP protocol framework for clients and servers
a simple ORM mimicking ActiveRecord
an ABL compiler interface for the organization I was at (backend and frontend)
a library for building up Excel and Word documents (serializing them using the MS Office 2003 XML schemas; none of that silly COM stuff)
an email client that could send emails using multiple strategies
OOABL Pros:
If you absolutely must write Progress code, it is a great option for creating reusable code.
Great way to clean up an existing procedural codebase
OOABL Cons:
Class hierarchies are limited; you can't create inherited (sub-)interfaces in 10.2B (I think this was going to be added in 11). Older versions of OpenEdge have other limitations, like the lack of abstract classes. This limits your ability to create a clean OO design and will hurt you when you start building non-trivial things.
Error handling sucks - CATCH/THROW doesn't let you throw your custom errors and force callers to catch them. Backwards compatibility prevents this from evolving further, so I doubt it will ever improve.
Object memory footprint is large, and there are no AVM debugging tools to track down why (gotta love these closed systems!).
Garbage collection was nonexistent until 10.2A, and it still has some bugs even in 11 (see the official OE forum for some examples).
Network programming (with sockets) is a PITA - you have to run a separate persistent procedure to manage the socket. I think evented programming in OOABL was a PITA in general; I remember getting a lot of errors about "windowed environments" or something to that effect when trying to use it. PUBLISH/SUBSCRIBE didn't work either, if memory serves.
Depending on your environment, code reviews may be difficult, as most Progress developers don't do OOABL and so may not understand your code.
If the above point is true, you may face active resistance from entrenched developers who feel threatened by having to learn new things.
OO is all about building small, reusable pieces that can be combined to make a greater whole. A big problem with OOABL is that the “ABL” part drags you down with its coarse data structures and lack of enumerators, which prevent you from really being able to build truly beautiful things with it. Unfortunately, since it is a closed language you can’t just sidestep the hand you’re dealt and create your own new data or control structures for it externally.
Now, it is theoretically possible to try and build some of these things using MEMPTRs, fixed arrays (EXTENT), and maybe WORK-TABLEs. However, I had attempted this in 10.1C and the design fell apart due to the lack of interface inheritance and abstract classes, and as I expected, performance was quite bad. The latter part may just be due to my poor ability, but I suspect it's an implementation limitation that would be nigh impossible to surmount.
The bottom line is use OOABL if you absolutely must be coding in OpenEdge - it’s better than procedural ABL and the rough edges get slightly smoother after each iterative release of OpenEdge. However, it will never be a beautiful language (OO or otherwise).
If you want to learn proper object-oriented programming and aren't constrained to ABL, I would highly recommend looking at a language that treats objects as first-class citizens, such as Ruby or Smalltalk.
In the last four years I have worked 80% of the time with OOABL (starting with 10.1C).
I definitely recommend using OOABL, but I think it is very important to realize that using OOABL the same way as other OO languages is fraught with problems.
By "the same way" I mean the design patterns and implementation practices that are common in the OO world. Also, some types of applications, especially technical frameworks (e.g. ORMs), are hard to build with OpenEdge.
The causes are performance problems with OOABL and missing OO features in the language.
If you are programming in C# or Java, for example, the memory footprint and instantiation time of objects are not a big issue in many cases. Using ABL, this becomes a big issue much more often.
This leads to other design decisions and prevents the implementation of some patterns and frameworks.
Some missing or bad OO features:
No class library; none of the data structures needed for OO
No package visibility as in Java (internal in C#)
This becomes relevant especially in larger applications
No generics
Really terrible exception handling
Very limited reflection capabilities (improved in OE 11)
So if you are familiar with OO programming in other languages and start working with OOABL, you will likely reach a point where you miss a lot of things you expect to be there, and get frustrated trying to implement such things in ABL.
If your application only has to run on Windows, it is also possible to implement new OO code in C# and call it from your existing Progress code via the CLR bridge, which works very smoothly.
Only one thing - "Error handling sucks" - it does suck, but not because you cannot make your own error classes and catch them in the caller's block: that works, and I'm using it. What sucks is the mix of the old NO-ERROR / ERROR-HANDLE approach with Progress.Lang.Error / CATCH blocks and ROUTINE-LEVEL ON ERROR UNDO, THROW. That is a big problem when the team has no convention for which error handling will be used, and how.

Smalltalk runtime features absent from Objective-C?

I don't know Smalltalk well, but I know some Objective-C, and I'm very interested in Smalltalk.
Their syntaxes are very different, but their essential runtime structures (that is, their features) are very similar, and runtime features are supported by the runtime.
I thought the two languages were very similar in that sense, but there are many Smalltalk features that are absent from the Objective-C runtime. For example, thisContext, which manipulates the call stack. Or non-local return, which unwinds block execution. Or blocks themselves - originally Smalltalk-only, though now implemented in Objective-C too.
Because I'm not an expert on Smalltalk, I don't know features of that sort, especially the ones for advanced users. What features are available only in Smalltalk? Essentially, I want to know the advanced features of Smalltalk, so features already implemented in Objective-C, like blocks, are fine too.
While I'm reasonably experienced with Objective-C, I'm not as deeply versed in Smalltalk as many, though I've done a bit of it.
It would be difficult to really enumerate a list of which language has which features for a couple of reasons.
First, what is a "language feature" at all? In Objective-C, even blocks are really built in conjunction with the Foundation APIs, and things like the for(... in ...) syntax requires conformance to a relatively high-level protocol. Can you really talk about a language any more without also considering the features of its most important API(s)? The same goes for Smalltalk.
Secondly, the two are very similar in terms of how messaging works and how inheritance is implemented, but they are also very different in how code goes from a thought in your head to running on your machine. Conceptually different to the point that it makes a feature-by-feature comparison between the two difficult.
The key difference between the two really comes down to the foundation upon which they are built. Objective-C is built on top of C and, thus, inherits all the strengths (speed, portability, flexibility, etc..) and weaknesses (effectively a macro assembler, goofy call ABI, lack of any kind of safety net) of C & compiled-to-the-metal languages. While Objective-C layers on a bunch of relatively high level OO features, both compile time and runtime, there are limits because of the nature of C.
Smalltalk, on the other hand, takes a much more top-to-bottom, pure-OO model; everything, down to the representation of a bit, is an object. Even the call stack, exceptions, the interfaces, ...everything... is an object. And Smalltalk runs on a virtual machine which is typically, in and of itself, a relatively small native byte-code interpreter that consumes a stream of Smalltalk byte code implementing the higher-level functionality. In Smalltalk, it is much less about creating a standalone application and much more about configuring the virtual machine with a set of state and functionality that renders the features you need (wherein that configuration can effectively be snapshotted and distributed like an app).
All of this means that you always -- outside of locked down modes -- have a very high level shell to interact with the virtual machine. That shell is really also typically your IDE. Instead of edit-compile-fix-compile-run, you are generally writing code in an environment where the code is immediately live once it is syntactically sound. The lines between debugger, editor, runtime, and program are blurred.
Not a language feature, but the nil-eating behaviour of most Objective-C frameworks gives a very different development experience than the pop-up-a-debugger, fix-and-continue flow of Smalltalk.
Even though Objective-C now supports blocks, the extremely ugly syntax is unlikely to lead to much use. In Smalltalk blocks are used a lot.
Objective-C 2.0 supports blocks.
It also has non-local returns in the form of return, but perhaps you particularly meant non-local returns within blocks passed as parameters to other functions.
thisContext isn't universally supported, as far as I'm aware. Certainly there are Smalltalks that don't permit the use of continuations, for instance. That's something provided by the VM anyway, so I can conceive of an Objective-C runtime providing such a facility.
One thing Objective-C doesn't have is become: (which atomically swaps two object pointers). Again, that's something that's provided by the VM.
Otherwise I'd have to say that, like bbum points out, the major difference is probably (a) the tooling/environment and hence (b) the rapid feedback you get from the REPL-like environment. It really does feel very different, working in a Smalltalk environment and working in, say, Xcode. (I've done both.)

Is garbage collection used in production quality Cocoa apps?

I'm mainly wondering about the effect that garbage collection would have on performance. Is the use of garbage collection frowned upon for release apps?
Another concern that I can think of is that using garbage collection could lead to sloppier programming.
Do you use garbage collection in your apps?
Garbage Collection is used in many production quality applications. Xcode, Automator, System Preferences and several other system applications are GC'd, and you can expect that trend to continue over time.
As well, many developers have embraced GC and are using it exclusively in their applications. Intuit's new versions of Quicken and QuickBooks for the Mac are garbage collected, for example.
There are a number of advantages of GC, too. Off the top of my head and from personal experience:
it makes multithreading easier; a simple assignment is an atomic declaration of ownership
it offloads a bunch of memory management to other cores; the collector is naturally concurrent and offloads a bunch of computation from the main thread (or computation threads)
in many cases, allocation and deallocation can happen entirely within the context of a thread, thus eliminating any need for global synchronization or locking
the collector has gotten faster with each release of Mac OS X and that trend will continue (just as it has with the rest of the system). By offloading more of your app's computational load to the system provided frameworks, your app will gain more and more from optimizations to the underlying system.
because the collector has intimate knowledge of the object graph -- of the pointers between objects -- in memory, it makes analysis and debugging significantly easier. Instead of "where did this dangling pointer come from?", the question is now "Give me a list of reasons why this object is sticking around longer than I think it should?".
This isn't to say that there isn't work to do to make your application work optimally under GC. There certainly are such tasks!
Garbage collection has been around since the 1960s and is used in many released applications. All .NET apps use garbage collection. Apple uses libauto in Xcode.
Garbage collection generally leads to better-quality apps in Cocoa, since the developer is freed from the memory management burden. There are tons of Cocoa apps that leak! (though they may not leak a significant amount of memory)
I tend to use GC since I can turn around my apps faster and don't have to worry about messaging zombie objects!
I use GC whenever I can, because the best code of all is the code you don't have to write in the first place. Also, as Bbum pointed out above, running under GC means that you have far more information available for performance analysis, should you need to debug any bottlenecks.
Garbage collection is recommended for any new Cocoa applications, and Apple eats its own dog food by using it in Xcode. Performance is an interesting situation, because while you're most likely going to be consuming more CPU cycles overall, the application may actually end up faster in some areas due to multithreading of the collector and the simplification of accessor methods.
Computers are made to do work for us. Cocoa's reference counting is usually easy to manage, but garbage collection is one more thing it can do now--let the machine do the work so you can focus on things that matter!
Like the others, I would strongly recommend using GC. The performance overhead usually is negligible! I don't need to repeat the benefits as stated by other users.
However, I would strongly encourage writing libraries, as opposed to applications, to run in non-GC mode as well. Some environments cannot run GC code, the iPhone being the notable one; so if you create an internal library that you envision reusing later for an iPhone app, I would recommend designing it so it works in a non-GC environment as well.
Converting GC code to non-GC code is much more difficult than the other way around!

Getting Embedded with D (the programming language)

I like a lot of what I've read about D.
Unified Documentation (That would make my job a lot easier.)
Testing capability built in to the language.
Debug code support in the language.
Forward Declarations. (I always thought it was stupid to declare the same function twice.)
Built in features to replace the Preprocessor.
Modules
Typedef used for proper type checking instead of aliasing.
Nested functions. (Cough PASCAL Cough)
In and Out Parameters. (How obvious is that!)
Supports low level programming - Embedded systems, oh yeah!
However:
Can D support an embedded system that is not going to be running an OS?
Does the outright declaration that it doesn't support 16-bit processors preclude it entirely from embedded applications running on such machines? Sometimes you don't need a hammer to solve your problem.
Garbage collection is great on Windows or Linux, but unfortunately embedded applications sometimes must do explicit memory management.
Array bounds checking - you love it, you hate it. Great for design assurance, but not always permissible for performance reasons.
What are the implications of multithreading support on an embedded system that is not running an OS? We have a customer that doesn't even like interrupts, much less an OS or multithreading.
Is there a D-Lite for embedded systems?
So basically: is D suitable for embedded systems with only a few megabytes of memory (sometimes less than a megabyte), not running an OS, where maximum memory usage must be known at compile time (per requirements), and possibly on something smaller than a 32-bit processor?
I'm very interested in some of the features, but I get the impression it's aimed at desktop application developers.
What specifically makes it unsuitable for a 16-bit implementation? (Assuming the 16-bit architecture could address sufficient amounts of memory to hold the runtime, either in flash memory or RAM.) 32-bit values could still be calculated using library code, albeit more slowly and with more operations than 16-bit values.
I have to say that the short answer to this question is "No".
If your machines are 16 bit, you'll have big problems fitting D into them - it is explicitly not designed for that.
D is not a light language in itself; it generates a lot of runtime type info that is normally linked into your app and that is also needed for typesafe variadics (and thus the standard formatting features, be it Tango or Phobos). This means that even the smallest applications are surprisingly large, which may disqualify D from systems with little RAM. Also, D with its runtime as a shared lib (which could alleviate some of these issues) has been little tested.
All current D libraries require a C standard library below them, and thus typically also an OS, so even that works against using D. However, there do exist experimental kernels in D, so it is not impossible per se. There just wouldn't be any libraries for it, as of today.
I would personally like to see you succeed, but doubt that it will be easy work.
First and foremost, read larsivi's answer. He has worked on the D runtime and knows what he's talking about.
I just wanted to add: some of what you asked about is already possible. It won't get you all the way, and a miss is as good as a mile here, but still, FYI:
Garbage collection is great on Windoze or Linux, but unfortunately embedded apps sometimes must do explicit memory management.
You can turn garbage collection off. The various experimental D OSes out there do it. See the std.gc module, in particular std.gc.disable. Note also that you do not need to allocate memory with new: you can use malloc and free. Even arrays can be allocated this way; you just need to wrap a D slice around the allocated memory.
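A short sketch of both points, using the current D2 spelling core.memory.GC (the successor of the old std.gc module mentioned above):

    import core.memory : GC;                 // D2 successor of std.gc
    import core.stdc.stdlib : malloc, free;

    void main()
    {
        GC.disable();                        // no automatic collections from here on

        // Allocate on the C heap, then wrap a D slice around the memory
        // so ordinary array syntax works without touching the GC heap.
        enum n = 16;
        int* p = cast(int*) malloc(n * int.sizeof);
        int[] arr = p[0 .. n];
        foreach (i, ref x; arr)
            x = cast(int) i;

        free(p);                             // manual deallocation, as on bare metal
    }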
Array bounds checking - you love it, you hate it. Great for design assurance, but not always permissible for performance reasons.
The specification for arrays specifically requires that compilers allow bounds checking to be turned off (see the "Implementation Note"). gdc provides -fno-bounds-check, and in dmd using -release should disable it.
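A small sketch of what the switch changes (note that current dmd also has the finer-grained -boundscheck=off flag, and -release keeps the checks in @safe code, so consult your compiler's documentation):

    void main()
    {
        int[4] a;
        size_t i = 4;   // one past the end, hidden behind a variable
        // Compiled normally, this access throws a RangeError at run time.
        // With bounds checks disabled (gdc -fno-bounds-check, or
        // dmd -boundscheck=off), the check is elided and the read goes
        // out of bounds exactly as it would in C.
        int x = a[i];
    }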
What are the implications of multithreading support on an embedded system that is not running an OS? We have a customer that doesn't even like interrupts, much less an OS or multithreading.
This I'm less clear on, but given that most C runtimes allow turning off multithreading, it seems likely one could get the D runtime to disable it as well. Whether that's easy or possible right now, though, I can't tell you.
The answers to this question are outdated:
Can D support an embedded system that is not going to be running an OS?
D can be cross-compiled for ARM Linux and for ARM Cortex-M. Some projects aim at creating libraries for Cortex-M architectures like MiniLibD for the STM32 or this project which uses a generic library for the STM32. (You could implement your own minimalistic OS in D on ARM Cortex-M.)
Does the outright declaration that it doesn't support 16-bit processors preclude it entirely from embedded applications running on such machines? Sometimes you don't need a hammer to solve your problem.
No, see the answer above... (But I would not expect architectures "smaller" than Cortex-M to be supported in the near future.)
Garbage collection is great on Windows or Linux, but unfortunately embedded applications sometimes must do explicit memory management.
You can write garbage-collection-free code. (The D Foundation seems to be aiming at a "GC free compliant" standard library, Phobos, but that is a work in progress.)
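D can also enforce this statically: the @nogc attribute (introduced in D 2.066) makes the compiler reject any operation in a function that could allocate from the GC. A small sketch:

    // @nogc guarantees this function performs no GC allocation; `new`,
    // array concatenation, closures and the like become compile errors.
    @nogc nothrow void fillBuffer(scope int[] buf)
    {
        foreach (i, ref x; buf)
            x = cast(int) i;
    }

    void main()
    {
        int[32] buf;        // fixed-size array on the stack, no GC involved
        fillBuffer(buf[]);
    }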
Array bounds checking - you love it, you hate it. Great for design assurance, but not always permissible for performance reasons.
(As you said, this depends on your "personal taste" and design decisions. But I would assume an acceptable performance overhead for bounds checking, given the background of the D compiler developers and D's design aims.)
What are the implications of multithreading support on an embedded system that is not running an OS? We have a customer that doesn't even like interrupts, much less an OS or multithreading.
(What is the question? One could implement multithreading using D's language capabilities, e.g. as explained in this question. BTW: if you want to use interrupts, consider this "hello world" project for a Cortex-M3.)
Is there a D-Lite for embedded systems?
The SafeD subset of D targets the embedded domain.

Do I still need to learn about managing memory now that Objective-C/Cocoa has Garbage collection?

So I finally dusted off my Objective-C/Cocoa books... it turns out they are nearly 7 years old! With Objective-C 2.0 now having garbage collection, how important are the chapters on Memory Management?
How much of a difference has Garbage Collection made?
Memory management is still really important to understand. If you're targeting an older OS, you need to do memory management. If you're using an older library, you need to do memory management. If you drop down to the Core Foundation level, you (may or may not) need to do memory management.
If you're programming for iPhone, you need to do memory management.
The garbage collector in Objective-C is outstanding - and if you can be using it, you most definitely should be - but it just doesn't cover every programming situation yet.
Some Cocoa technologies, such as Distributed Objects, PyObjC (the Python-Objective-C bridge) plugins and CoreImage (at least last I heard; this may have been fixed) don't play well with Garbage Collection. If you're using these technologies, you will still have to use traditional memory management. Of course, as others have said, you still need to use traditional Cocoa memory management (reference counting) if you're supporting OS X 10.4, or the iPhone in your code.
On the other hand, the new GC can be very nice. It's not a free lunch however; you still need to understand the semantics of the GC system, its patterns and its limitations...just as you do with any technology.
Since many third-party frameworks may not yet support GC, it's probably best to still understand the reference counting system. If you follow the simple rules for object ownership given in Apple's memory management guide, you should always be OK.
If you're programming the iPhone platform, you need to know retain/release, because Cocoa Touch does not have GC.
If you're going to use Core Foundation, Core Graphics, most of Core Services, or any other CF-based API, you need to know retain/release, because CF-derived objects are not GC'd by default (and you must explicitly put them out for pick-up, anyway).
If you're going to use any of the POSIX APIs or any of the rest of Core Services, you need to know alloc/free memory management. You don't even get reference-counting. (Exception: Icon Services, which also has reference-counting. APIs born from Carbon are a mess.)
So, in a word: Yes.
It depends. If you're planning on ignoring 10.4 users, then you might not have to worry about it; but Objective-C 2.0 isn't available on 10.4 and below, so you still have to worry about memory management on those platforms.
That said, memory management is always a useful skill, and it's not that hard in Cocoa anyway, so it's not a bad skill to pick up.
It is probably worth learning the concepts that underpin Cocoa memory management, as they are still useful in certain situations. The iPhone OS, for example, does not support garbage collection. There may be other situations where it is advantageous to use manual memory management, and it's useful to have the ability to make that choice.
Understanding Cocoa's excellent memory management concepts will help you with the concept of memory management in general. I've copied the autorelease concept into a few C++ projects and it worked great. Apache and Subversion are examples of other software that also uses autorelease.
Personally I find retain/release/autorelease to be just the right level of abstraction for me. It's not magic, so if I really need to do something weird, it's easy to do so. On the other hand, the rules are so simple that it becomes second nature, to the point where you eventually just don't think about memory management anymore, it just works.
Add to this the fact that, as mentioned above, not all of Cocoa supports garbage collection, and what you are writing is still C, so any code you write and/or use that isn't Cocoa will need to be managed manually. This includes CoreAudio, CoreGraphics, and so on.
(Yes, CF objects work with GC, but only if you explicitly enable it for each object, and I found it hard to learn the GC-CF rules)
In summary: I never use the garbage collector myself (and the only time I did so it was very painful, as I had some C++ and CG in the mix), and as far as I know, most Cocoa coders are very used to retain/release/autorelease and use that.