Possible to share information between an add-on to an existing program and a standalone application? [duplicate] - objective-c

I'm looking at building a Cocoa application on the Mac with a back-end daemon process (really just a mostly-headless Cocoa app, probably), along with 0 or more "client" applications running locally (although if possible I'd like to support remote clients as well; the remote clients would only ever be other Macs or iPhone OS devices).
The data being communicated will be fairly trivial, mostly just text and commands (which I guess can be represented as text anyway), and maybe the occasional small file (an image possibly).
I've looked at a few methods for doing this but I'm not sure which is "best" for the task at hand. Things I've considered:
Reading and writing to a file (…yes), very basic but not very scalable.
Pure sockets (I have no experience with sockets but I seem to think I can use them to send data locally and over a network. Though it seems cumbersome if doing everything in Cocoa
Distributed Objects: seems rather inelegant for a task like this
NSConnection: I can't really figure out what this class even does, but I've read of it in some IPC search results
I'm sure there are things I'm missing, but I was surprised to find a lack of resources on this topic.

I am currently looking into the same questions. For me the possibility of adding Windows clients later makes the situation more complicated; in your case the answer seems to be simpler.
About the options you have considered:
Control files: While it is possible to communicate via control files, you have to keep in mind that the files need to be communicated via a network file system among the machines involved. So the network file system serves as an abstraction of the actual network infrastructure, but does not offer the full power and flexibility the network normally has. Implementation: Practically, you will need to have at least two files for each pair of client/servers: a file the server uses to send a request to the client(s) and a file for the responses. If each process can communicate both ways, you need to duplicate this. Furthermore, both the client(s) and the server(s) work on a "pull" basis, i.e., they need to revisit the control files frequently and see if something new has been delivered.
The advantage of this solution is that it minimizes the need for learning new techniques. The big disadvantage is that it has huge demands on the program logic; a lot of things need to be taken care of by you (Will the files be written in one piece or can it happen that any party picks up inconsistent files? How frequently should checks be implemented? Do I need to worry about the file system, like caching, etc? Can I add encryption later without toying around with things outside of my program code? ...)
If portability was an issue (which, as far as I understood from your question is not the case) then this solution would be easy to port to different systems and even different programming languages. However, I don't know of any network files ystem for iPhone OS, but I am not familiar with this.
Sockets: The programming interface is certainly different; depending on your experience with socket programming it may mean that you have more work learning it first and debugging it later. Implementation: Practically, you will need a similar logic as before, i.e., client(s) and server(s) communicating via the network. A definite plus of this approach is that the processes can work on a "push" basis, i.e., they can listen on a socket until a message arrives which is superior to checking control files regularly. Network corruption and inconsistencies are also not your concern. Furthermore, you (may) have more control over the way the connections are established rather than relying on things outside of your program's control (again, this is important if you decide to add encryption later on).
The advantage is that a lot of things are taken off your shoulders that would bother an implementation in 1. The disadvantage is that you still need to change your program logic substantially in order to make sure that you send and receive the correct information (file types etc.).
In my experience portability (i.e., ease of transitioning to different systems and even programming languages) is very good since anything even remotely compatible to POSIX works.
[EDIT: In particular, as soon as you communicate binary numbers endianess becomes an issue and you have to take care of this problem manually - this is a common (!) special case of the "correct information" issue I mentioned above. It will bite you e.g. when you have a PowerPC talking to an Intel Mac. This special case disappears with the solution 3.+4. together will all of the other "correct information" issues.]
+4. Distributed objects: The NSProxy class cluster is used to implement distributed objects. NSConnection is responsible for setting up remote connections as a prerequisite for sending information around, so once you understand how to use this system, you also understand distributed objects. ;^)
The idea is that your high-level program logic does not need to be changed (i.e., your objects communicate via messages and receive results and the messages together with the return types are identical to what you are used to from your local implementation) without having to bother about the particulars of the network infrastructure. Well, at least in theory. Implementation: I am also working on this right now, so my understanding is still limited. As far as I understand, you do need to setup a certain structure, i.e., you still have to decide which processes (local and/or remote) can receive which messages; this is what NSConnection does. At this point, you implicitly define a client/server architecture, but you do not need to worry about the problems mentioned in 2.
There is an introduction with two explicit examples at the Gnustep project server; it illustrates how the technology works and is a good starting point for experimenting:
http://www.gnustep.org/resources/documentation/Developer/Base/ProgrammingManual/manual_7.html
Unfortunately, the disadvantages are a total loss of compatibility (although you will still do fine with the setup you mentioned of Macs and iPhone/iPad only) with other systems and loss of portability to other languages. Gnustep with Objective-C is at best code-compatible, but there is no way to communicate between Gnustep and Cocoa, see my edit to question number 2 here: CORBA on Mac OS X (Cocoa)
[EDIT: I just came across another piece of information that I was unaware of. While I have checked that NSProxy is available on the iPhone, I did not check whether the other parts of the distributed objects mechanism are. According to this link: http://www.cocoabuilder.com/archive/cocoa/224358-big-picture-relationships-between-nsconnection-nsinputstream-nsoutputstream-etc.html (search the page for the phrase "iPhone OS") they are not. This would exclude this solution if you demand to use iPhone/iPad at this moment.]
So to conclude, there is a trade-off between effort of learning (and implementing and debugging) new technologies on the one hand and hand-coding lower-level communication logic on the other. While the distributed object approach takes most load of your shoulders and incurs the smallest changes in program logic, it is the hardest to learn and also (unfortunately) the least portable.

Disclaimer: Distributed Objects are not available on iPhone.
Why do you find distributed objects inelegant? They sounds like a good match here:
transparent marshalling of fundamental types and Objective-C classes
it doesn't really matter wether clients are local or remote
not much additional work for Cocoa-based applications
The documentation might make it sound like more work then it actually is, but all you basically have to do is to use protocols cleanly and export, or respectively connect to, the servers root object.
The rest should happen automagically behind the scenes for you in the given scenario.

We are using ThoMoNetworking and it works fine and is fast to setup. Basically it allows you to send NSCoding compliant objects in the local network, but of course also works if client and server are on he same machine. As a wrapper around the foundation classes it takes care of pairing, reconnections, etc..

Related

How does Smalltalk image handle IO?

I just started to learn Smalltalk, went through its syntax, but hasn't done any real coding with it. While reading some introductory articles and some SO questions like:
What gives Smalltalk the ability to do image persistence, and why
can't languages like Ruby/Python serialize
themselves?
What is a Smalltalk “image”?
One question always comes into my mind: How does Smalltalk image handle IO?
A smalltalk program can resume from where it exits, using information stored in the image. Say I have some opened TCP connections(not to mention all sorts of buffer), how do they get recovered? There seems to be no way other than reopening them(confirmed by this answer). And if Smalltalk does reopen those connections, isn't it going against the idea of "resume execution of the program at a later time exactly from where you left off"? Or is there some magic behind it?
I don't mind if the answer is specific to certain dialects, say Pharo.
Also would be interested to know some resources to learn more about this topic.
As you have noted some resources are not part of the memory heap and therefore will not be recovered just by loading the image back in memory. In particular this applies to all kinds of resources managed by the operating system, and cross-platform Smalltalks where you can copy the image from one OS to another and restart the image even have to restore such resources differently than they were before.
The trick in the Smalltalks I have seen is that all classes receive a message immediately after the image resumed. By implementing a method for that message they can restore any transient resources (sockets, connections, foreign handles, ...) that their instances might need. To find all instances some Smalltalks provide messages such as allInstances, or you must maintain a registry of the relevant objects yourself.
And if Smalltalk does reopen those connections, isn't it going against the idea of "resume execution of the program at a later time exactly from where you left off"?
From a user perspective, after that reinitialization and reallocation of resources, everything still looks like "exactly where you left off", even though some technical details have changed under the hood. Of course this won't be the case if it is impossible to restore the resources (no network, for example). Some limits cannot be overcome by Smalltalk magic.
How does the Smalltalk image handle IO?
To make that resumption described above possible, all external resources are wrapped and represented as some kind of Smalltalk object. The wrapper objects will be persisted in the image, although they will be disconnected from the outside world when Smalltalk is shut down. The wrappers can then restore the external resources after the image has been started up again.
It might be useful to add a small history lesson: Smalltalk was created in the 1970s at Xerox's Palo Alto Research Center (PARC). In the same time, at the same place, Personal Computing was invented. Also in the same time at the same place, the Ethernet was invented.
Smalltalk was a single integrated system, it was at the same time the IDE, the GUI, the shell, the kernel, the OS, even the microcode for the CPU was part of the Smalltalk System. Smalltalk didn't have to deal with non-Smalltalk resources from outside the image, because for all intents and purposes, there was no "outside". It was possible to re-create the exact machine state, since there wasn't really any boundary between the Virtual Machine and the machine. (Almost all the system was implemented in Smalltalk. There were only a couple of tiny bits of microcode, assembly, and Mesa. Even what we would consider device drivers nowadays were written in Smalltalk.)
There was no need to persist network connections to other computers, because nobody outside of a few labs had networks. Heck, almost no organization even had more than one computer. There was no need to interact with the host OS because Smalltalk machines didn't have an OS; Smalltalk was the OS. (You probably know the famous quote from Dan Ingalls' Design Principles Behind Smalltalk: "An operating system is a collection of things that don't fit into a language. There shouldn't be one.") Because Smalltalk was the OS, there was no need for a filesystem, all data was simply objects.
Smalltalk cannot control what is outside of Smalltalk. This is a general property that is not unique to Smalltalk. You can break encapsulation in Java by editing the compiled bytecode. You can break type-safety in Haskell by editing the compiled machine code. You can create a memory leak in Rust by editing the compiled machine code.
So, all the guarantees, features, and properties of Smalltalk are only available as long as you don't leave Smalltalk.
Here's an even simpler example that does not involve networking or moving the image to a different machine: open a file in the host filesystem. Suspend the image. Delete the file. Resume the image. There is no possible way to resume the image in the same state.
All Smalltalk can do, is approximate the state of external resources as good as it possibly can. It can attempt to re-open the file. If the file is gone, it can maybe attempt to create one with the same name. It can try to resume a network connection. If that fails, it can try to re-establish the connection, create a new connection to the same address.
But ultimately, everything outside the image is outside of the control of Smalltalk, and there is nothing Smalltalk can do about it.
Note that this impedance mismatch between the inside of the image and the "outside world" is one of the major criticisms that is typically leveled at Smalltalk. And if you look at Smalltalk systems that try to integrate deeply with the outside world, they often have to compromise. E.g. GNU Smalltalk, which is specifically designed to integrate deeply into a Unix system, actually gives up on the image and persistence.
I'll add one more angle to the nice answers of Joerg and JayK.
What is important to understand is the context of the time and age Smalltalk was created. (Joerg already pointed out important aspect of everything being Smalltalk). We are talking about time right after ARPANET.
I think they were not expecting the collaboration and interconnection we have nowadays. The image was meant as a record of a session of a single programmer without any external communication. Times changed and now you naturally ask the IO question. As JayK noted you can re-init the image so you will get image similar to the point you ended your session.
The real issue, the reason I've decided to add my 2c, is the collaboration among multiple developers. This is where the image, the original idea, is, in my opinion, outlived. There is no way to share an image among multiple developers so they could develop at the same time and share the code.
Imagine wouldn't it be great if you could have one central image and the developers would have only diffs and open their environment where they ended with everyone's new code incorporated? Sound familiar? This is kind of VCS we have like mercurial, git etc. without the image, only code. Sadly, to say such image re-construction does not exist.
Smalltalk is trying to catch up with the std. versioning tooling we use nowadays.
(Side note: Smalltalk had their own "versioning" systems but they rather lack in many ways compared to the current VCS. The ones used Monticello (Pharo), ENVY (VA Smalltalk), and Store (VisualWorks).)
Pharo is now trying to catch the train and implement the git functionality via iceberg. The Smalltalk/X-jv branch has integrated decent mercurial support. The Squeak has Squot for git (for now) [thank you #JayK].
Now "only" (that is BIG only) to add support for central/diff images :).
For some practical code dealing with image startUp and shutDown, take a look at the class side of ZnServer. SessionManager is the class providing all the functionality you need to deal with giving up system resources on shutdown and re-aquiring them again on startup.
Need to chime in on the source control discussion a bit.
The solution I have seen with VS(E)/VA is that you work with Envy/PVCS and share this repository with the developers.
Every developer has his/her own image with all the pros and cons of images.
One company I was working for was discussing whether it wouldnt make sense to build up the development image egain from scratch every couple of weeks in order to get rid of everything that might dilute the code quality (open Filehandles, global variables, pool dictionary entries, you name it, you will get it and it will crash your code during run-time).
When it comes to building a "run-time", you take the plain tiny standard image, and all your code comes from bind files and SLLs.

Distributed Objects + Grand Central Dispatch

Not a specific question as such, I'm more trying to test the waters. I like distributed objects, and I like grand central dispatch; How about I try to combine the two?
Does that even make sense? Has anyone played around in these waters? Would I be able to use GCD to help synchronize object access across machines? Or would it be better to stick to synchronizing local objects only? What should I look out for? What design patterns are helpful and what should I avoid?
as an example, I use GCD queues to synchronize accesses to a shared resource of some kind. What can I expect to happen if I make this resource public via distributed objects? Questions like: How nicely to blocks play with distributed objects? Can I expect to use the everything as normal across machines? If not, can I wrangle it to do so? What difficulties can I expect?
I very much doubt this will work well. GCD objects are not Cocoa objects, so you can't reference them remotely. GCD synchronization primitives don't work across process boundaries.
While blocks are objects, they do not support NSCoding, so they can't be transmitted across process boundaries. (If you think about it, they are not much more than function pointers. The pointed-to function must have been compiled into the executable. So, it doesn't make sense that two different programs would share a block.)
Also, Distributed Objects depends on the connection being scheduled in a given run loop. Since you don't manage the threads used by GCD, you are not entitled to add run loop sources except temporarily.
Frankly, I'm not even sure how you envision it even theoretically working. What do you hope to do? How do you anticipate it working?
Running across machines -- as in a LAN, MAN, or WAN?
In a LAN, distributed objects will probably work okay as long as the server you are connecting to is operational. However, most programmers you meet will probably raise an eyebrow and just ask you, "Why didn't you just use a web server on the LAN and just build your own wrapper class that makes it 'feel' like Distributed Objects?" I mean, for one thing, there are well-established tools for troubleshooting web servers, and it's easier and often cheaper to hire someone to build a web service for you rather than a distributed object server.
On a MAN or WAN, however, this would be slow and a very bad idea for most uses. For that type of communication, you're better off using what everyone else uses -- REST-like APIs with HTTPS/HTTP, sending either XML, JSON, or key/value data back and forth. So, you could make a class wrapper that makes this "feel" sort of like distributed objects. And my gut feeling tells me that you'll need to use tricks to speed this up, such as caching chunks of data locally on the client so that you don't have to keep fetching from the server, or even caching on the server so that it doesn't have to interact with a database as often.
GCD, Distributed Objects, Mach Ports, XPC, POSIX Message Queues, Named Pipes, Shared Memory, and many other IPC mechanisms really only make the most sense on local, application to application communication on the same computer. And they have the added advantage of privilege elevation if you want to utilize that. (Note, I said POSIX Message Queues, which are workstation-specific. You can still use a 'message queue service' on a LAN, MAN, or WAN -- there are many products available for that.)

MPI vs. Microsoft WCF vs. Microsoft TPL

I have a scientific program written in F# which I want to parallelize and run on 1 server with multiple processors (64) and for the future also in the cloud (Windows Azure?). The program will have a simple 1-1 communication between the nodes (no broadcast etc.).
If I used WCF, would it be as fast as MPI? What has MPI that WCF does not? There exists Pure MPI .NET written for WCF which puzzles me even more. I do not know if to use WCF or MPI.NET or Pure Mpi running on WCF.
PS: I guess that TPL is out of the game for 64 processors and more, right?
It is difficult to give a concrete answer, because it all depends on the specific aspects of your application, its current architecture (I suppose you already have some app) etc.
As you mention MPI and WCF, I assume that the application is written as several components that communicate with each other. The best way to structure this kind of application is to use F# agents.
As far as I understand, you want to run the application on a single server first. If you write it using agents, the agents can just communicate directly with each other (so you don't need MPI or WCF).
TPL should work well on a single-server (with lots of CPUs), but it will not scale to the distributed setting - you cannot run Task on another machine. However, you can use it inside individual components (e.g. agents) that will be distributed.
Regarding MPI vs. WCF - I don't have enough experience to answer that. However, if you use agent-based architecture, it should be easy to try various options. You may also check out fracture and related projects, which aims to implement high-performance sockets for F# (and possibly distributed agents in the future).
If you're doing it on 1 server you could just execute one process and execute the code in parallel. That way you could share memory more easily and faster than doing it through messages like MPI and WCF. Although the overhead of communication might not be that much, depending on your problem + solution.
Also the changes to your code would be much less that way, F# can usually be turned into prallel code with little effort. Going to MPI/WCF would require you to rewrite large portions.
Googling for F# + parallel gives plenty useful info that you should read first, like this for a good start:
http://blogs.msdn.com/b/dsyme/archive/2010/01/09/async-and-parallel-design-patterns-in-f-parallelizing-cpu-and-i-o-computations.aspx
So on 1 server, I woudl use the parallel features of F#, it's designed to prallelize easily.
Later when you want to go for cloud, that would be turning it into cleint-server. That's a different problem then parallization. I would treat and solve them seperately.
On the MPI vs WCF. WCF is designed as a RPC technology, i.e. you call remote procedures and get answers. If you want to use it for parallel programming with separate processes, you would have to create the boilerplate code for that. (Keep track of subsribed clients etc.)
MPI was designed to run that kind of architecture and handles it much more easily. (the first process gets number 0 and is the master, the other are slaves get numbered incrementally etc.)
Howver I don't think MPI will be very good to go cloud, since that invloves http, protocols, security etc. Not sure how well MPI works for those kind of things, WCF will handle that very well indeed.
The fact that there is an MPI.NET for WCF is because MPI is about a certain style of parallizing code that a lot of people are familiar with. So you can use the programming concepts and use them on the .NET platform leveraging WCF for the communications.
Something else you might want to look into if you need to exchange a lot of data over the wire is protocol-buffers (see protobuf-net for instance). That can easily be combined with WCF for communication and is very lean in serializing structured data so you can send over the wire efficiently.
Gert-Jan
WCF and MPI are different concepts. WCF is like a person A asks a person B to do something where as MPI is like a person A creates clones of himself (all clone have same ability/logic) and then these clones work on specific parts of the problem to be solved and once done they combine their results.
So choosing between which one fits your specific application depends on the problem your application is trying to solve. It may even be a combination of both WCF and MPI. Where your client application asks the WCF to do some task and the WCF create clones of the "problem solver" using MPI and when the clone are done with solving the problem (in parallel) they return the aggregated result back to the WCF and then that result is sent to client application.
You might also want to take at the 'mbrace' product, which provides a cloud monad (http://blogs.msdn.com/b/dsyme/archive/2011/08/23/m-brace-f-in-the-cloud.aspx). It's still at a fairly early stage though. I'm no expert but it may be that you can run an mbrace-based solution as effectively a private cloud on your 64-processor setup. When you outgrow that, a move to Azure would be seamless.

Why use AMQP/ZeroMQ/RabbitMQ

as opposed to writing your own library.
We're working on a project here that will be a self-dividing server pool, if one section grows too heavy, the manager would divide it and put it on another machine as a separate process. It would also alert all connected clients this affects to connect to the new server.
I am curious about using ZeroMQ for inter-server and inter-process communication. My partner would prefer to roll his own. I'm looking to the community to answer this question.
I'm a fairly novice programmer myself and just learned about messaging queues. As i've googled and read, it seems everyone is using messaging queues for all sorts of things, but why? What makes them better than writing your own library? Why are they so common and why are there so many?
what makes them better than writing your own library?
When rolling out the first version of your app, probably nothing: your needs are well defined and you will develop a messaging system that will fit your needs: small feature list, small source code etc.
Those tools are very useful after the first release, when you actually have to extend your application and add more features to it.
Let me give you a few use cases:
your app will have to talk to a big endian machine (sparc/powerpc) from a little endian machine (x86, intel/amd). Your messaging system had some endian ordering assumption: go and fix it
you designed your app so it is not a binary protocol/messaging system and now it is very slow because you spend most of your time parsing it (the number of messages increased and parsing became a bottleneck): adapt it so it can transport binary/fixed encoding
at the beginning you had 3 machine inside a lan, no noticeable delays everything gets to every machine. your client/boss/pointy-haired-devil-boss shows up and tell you that you will install the app on WAN you do not manage - and then you start having connection failures, bad latency etc. you need to store message and retry sending them later on: go back to the code and plug this stuff in (and enjoy)
messages sent need to have replies, but not all of them: you send some parameters in and expect a spreadsheet as a result instead of just sending and acknowledges, go back to code and plug this stuff in (and enjoy.)
some messages are critical and there reception/sending needs proper backup/persistence/. Why you ask ? auditing purposes
And many other use cases that I forgot ...
You can implement it yourself, but do not spend much time doing so: you will probably replace it later on anyway.
That's very much like asking: why use a database when you can write your own?
The answer is that using a tool that has been around for a while and is well understood in lots of different use cases, pays off more and more over time and as your requirements evolve. This is especially true if more than one developer is involved in a project. Do you want to become support staff for a queueing system if you change to a new project? Using a tool prevents that from happening. It becomes someone else's problem.
Case in point: persistence. Writing a tool to store one message on disk is easy. Writing a persistor that scales and performs well and stably, in many different use cases, and is manageable, and cheap to support, is hard. If you want to see someone complaining about how hard it is then look at this: http://www.lshift.net/blog/2009/12/07/rabbitmq-at-the-skills-matter-functional-programming-exchange
Anyway, I hope this helps. By all means write your own tool. Many many people have done so. Whatever solves your problem, is good.
I'm considering using ZeroMQ myself - hence I stumbled across this question.
Let's assume for the moment that you have the ability to implement a message queuing system that meets all of your requirements. Why would you adopt ZeroMQ (or other third party library) over the roll-your-own approach? Simple - cost.
Let's assume for a moment that ZeroMQ already meets all of your requirements. All that needs to be done is integrating it into your build, read some doco and then start using it. That's got to be far less effort than rolling your own. Plus, the maintenance burden has been shifted to another company. Since ZeroMQ is free, it's like you've just grown your development team to include (part of) the ZeroMQ team.
If you ran a Software Development business, then I think that you would balance the cost/risk of using third party libraries against rolling your own, and in this case, using ZeroMQ would win hands down.
Perhaps you (or rather, your partner) suffer, as so many developers do, from the "Not Invented Here" syndrome? If so, adjust your attitude and reassess the use of ZeroMQ. Personally, I much prefer the benefits of Proudly Found Elsewhere attitude. I'm hoping I can proud of finding ZeroMQ... time will tell.
EDIT: I came across this video from the ZeroMQ developers that talks about why you should use ZeroMQ.
what makes them better than writing your own library?
Message queuing systems are transactional, which is conceptually easy to use as a client, but hard to get right as an implementor, especially considering persistent queues. You might think you can get away with writing a quick messaging library, but without transactions and persistence, you'd not have the full benefits of a messaging system.
Persistence in this context means that the messaging middleware keeps unhandled messages in permanent storage (on disk) in case the server goes down; after a restart, the messages can be handled and no retransmit is necessary (the sender does not even know there was a problem). Transactional means that you can read messages from different queues and write messages to different queues in a transactional manner, meaning that either all reads and writes succeed or (if one or more fail) none succeeds. This is not really much different from the transactionality known from interfacing with databases and has the same benefits (it simplifies error handling; without transactions, you would have to assure that each individual read/write succeeds, and if one or more fail, you have to roll back those changes that did succeed).
Before writing your own library, read the 0MQ Guide here: http://zguide.zeromq.org/page:all
Chances are that you will either decide to install RabbitMQ, or else you will make your library on top of ZeroMQ since they have already done all the hard parts.
If you have a little time give it a try and roll out your own implemntation! The learnings of this excercise will convince you about the wisdom of using an already tested library.

How you test your applications for reliability under badly behaving i/o

Almost every application out there performs i/o operations, either with disk or over network.
As my applications work fine under the development-time environment, I want to be sure they will still do when the Internet connection is slow or unstable, or when the user attempts to read data from badly-written CD.
What tools would you recommend to simulate:
slow i/o (opening files, closing files, reading and writing, enumeration of directory items)
occasional i/o errors
occasional 'access denied' responses
packet loss in tcp/ip
etc...
EDIT:
Windows:
The closest solution to do the job as described seems to be holodeck, commercial software (>$900).
Linux:
Open solution wasn't found by now, but the same effect
can be achived as specified by smcameron and krosenvold.
Decorator pattern is a good idea.
It would require to wrap my i/o classes, but resulting in a testing framework.
The only remaining untested code would be in 3rd party libraries.
Yet I decided not to go this way, but leave my code as it is and simulate i/o errors from outside.
I now know that what I need is called 'fault injection'.
I thought it was a common production-line part with plenty of solutions I just didn't know.
(By the way, another similar good idea is 'fuzz testing', thanks to Lennart)
On my mind, the problem is still not worth $900.
I'm going to implement my own open-source tool based on hooks (targeting win32).
I'll update this post when I'm done with it. Come back in 3 or 4 weeks or so...
What you need is a fault injecting testing system. James Whittaker's 'How to break software' is a good read on this subject and includes a CD with many of the tools needed.
If you're on linux you can do tons of magic with iptables;
iptables -I OUTPUT -p tcp --dport 7991 -j DROP
Can simulate connections up/down as well. There's lots of tutorials out there.
Check out "Fuzz testing": http://en.wikipedia.org/wiki/Fuzzing
At a programming level many frameworks will let you wrap the IO stream classes and delegate calls to the wrapped instance. I'd do this and add in a couple of wait calls in the key methods (writing bytes, closing the stream, throwing IO exceptions, etc). You could write a few of these with different failure or issue type and use the decorator pattern to combine as needed.
This should give you quite a lot of flexibility with tweaking which operations would be slowed down, inserting "random" errors every so often etc.
The other advantage is that you could develop it in the same code as your software so maintenance wouldn't require any new skills.
You don't say what OS, but if it's linux or unix-ish, you can wrap open(), read(), write(), or any library or system call etc, with an LD_PRELOAD-able library to inject faults.
Along these lines:
http://scaryreasoner.wordpress.com/2007/11/17/using-ld_preload-libraries-and-glibc-backtrace-function-for-debugging/
I didn't go writing my own file system filter, as I initially thought, because there's a simpler solution.
1. Network i/o
I've found at least 2 ways to simulate i/o errors here.
a) Running a virtual machine (such as vmware) allows to configure bandwidth and packet loss rate. Vmware supports on-machine debugging.
b) Running a proxy on the local machine and tunneling all the traffic through it. For the case of upd/tcp communications a proxifier (e.g. widecap) can be used.
2. File i/o
I've managed to deduce this scenario to the previous one by mapping a drive letter to a network share which resides inside the virtual machine. The file i/o will be slow.
A cheaper alternative exists: to set up a local ftp server (e.g. FileZilla), configure speeds and use Novell's NetDrive to access it.
You'll wanna setup a test lab for this. What type of application are you building anyway? Are you really expecting the application be fed corrupt data?
A test technique I know the Microsoft Exchange Server people tried was sending noise to the server. Basically feeding every possible input with seemingly random data. They managed to crash the server quite often this way.
But still, if you can't trust input that hasn't been signed then general rules apply. Track every operation which could potentially be untrusted (result of corrupt data) and you should be able to handle most problems gracefully.
Just test your application behavior on random input, that should catch most problems but you'll never be able to fully protect your self from corrupt data. That's just not possible, as the data could be part of some internal buffer being handed off within the application itself.
Be mindful of when and how you decode data. That is all.
The first thing you'll need to do is define what "correct" means under these circumstances. You can only test against a definition of what behaviour is intended.
The tactics of testing will depend on technology. In the context of automated unit testing, I have found it very useful, in OO languages such as Java, to use various flavors of "mocking" or "stubbing" to pass e.g. misbehaving InputStreams to parts of my code that used file I/O.
Consider holodeck for some of the fault injection, if you have access to spare hardware you can simulate network impairment using Netem or a commercial product based on it the Mini-Maxwell, which is much more expensive than free but possibly easier to use.