I've implemented a few ExternalStructures (as part of an "FFI effort"), and for some of them I want to implement finalization to reclaim the external memory.
I'm trying to write some tests for that, and thought a good way to know if #finalize is called is to change the behaviour for the particular instance I'm using for testing. I'd rather not pollute the implementation with code for supporting tests if possible.
I believe mocking specific methods and changing specific instance behavior is in general a good tool for testing.
I know it's possible in other dialects, and I've implemented it myself in the past in Squeak using #doesNotUnderstand, but I'd like to know if there's a cleaner way, possibly supported by the VM.
Is there a way to change how a particular instance answers a particular message in Cuis/Squeak/Pharo?
Luciano gave this wonderful example:
EllipseMorph copy compile: 'defaultColor ^Color red'; new :: openInWorld
The mail thread is here:
http://cuis-smalltalk.org/pipermail/cuis-dev_cuis-smalltalk.org/2016-March/000458.html
After dealing with the problem I decided to go for an end-to-end test, actually verifying that the resource (memory in my case) is returned to the system. In the end I didn't use instance-specific behavior, though Luciano's and Juan's solution (in a comment) is very interesting. Here's the code I'm using for testing:
testFinalizationReleasesExternalMemory
    " WeakArray restartFinalizationProcess "
    | handles |
    handles := (1 to: 11) collect: [:i |
        Smalltalk garbageCollect.
        APIStatus create getHandle].
    self assert: (handles asSet size) < 11.
In the example, #create uses an FFI call to an external function that allocates memory and returns a pointer (the name create comes from the external API):
create
    | answer |
    answer := ExternalAPI current createStatus.
    self finalizationRegistry add: answer.
    ^ answer
ExternalAPI here is the FFI interface, #createStatus is the API call that allocates the memory for an APIStatus and returns a pointer to it.
On finalization I call the API which restores the memory:
delete
    self finalizationRegistry remove: self ifAbsent: [].
    self library deleteStatus: self.
    handle := nil.
Where #deleteStatus: is again the API call which frees the memory.
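For illustration, here is a minimal C sketch of what the external side of such an API might look like; the names createStatus/deleteStatus mirror the selectors above, and the struct contents and everything else are assumptions, not the real library:

#include <stdlib.h>

/* Hypothetical external structure handled through FFI. */
typedef struct APIStatus {
    int  code;
    char message[128];
} APIStatus;

/* Allocates a status block; the returned pointer becomes the handle
 * stored by the ExternalStructure in the image. */
APIStatus *createStatus(void) {
    return calloc(1, sizeof(APIStatus));
}

/* Frees the block; this is what finalization must eventually trigger. */
void deleteStatus(APIStatus *status) {
    free(status);
}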
The test assumes that the external library reuses the memory once it's free, especially when the newly allocated block has the same size as the previous one. This is correct in most cases today, but I'd like to see this test fail if it isn't, if only to learn something new.
The test allocates 11 external structures, saves their pointers, lets the finalization mechanism free the memory of each one before allocating the next, and then checks whether any of the pointers is repeated. I'm not sure why I decided that many pointers was a good number; just 2 should be enough, but memory allocation algorithms are sometimes tricky.
Related
I've got a chunk of memory in a Buf I want to pass in to a C library, but the library will be using the memory beyond the lifetime of a single call.
I understand that can be problematic since the Garbage Collector can move memory around.
For passing in a Str, the NativeCall docs say "If the C function requires the lifetime of a string to exceed the function call, the argument must be manually encoded and passed as CArray[uint8]" and have an example of doing that, essentially:
my $array = CArray[uint8].new($string.encode.list);
My question is: must I do the same thing for a Buf, in case it gets moved by the GC? Or will the GC leave my Buf where it sits? For a short string that isn't a big deal, but for a large memory buffer it could potentially be an expensive operation. (See, for example, Archive::Libarchive, to which you can pass a Buf containing a tar file. Is that code problematic?)
multi method open(Buf $data!) {
    my $res = archive_read_open_memory $!archive, $data, $data.bytes;
    ...
Is there (could there be? should there be?) some sort of trait on a Buf that tells the GC not to move it around? I know that could be trouble if I add more data to the Buf, but I promise not to do that. What about for a Blob that is immutable?
You'll get away with this on MoarVM, at least at the moment, provided that you keep a reference to the Blob or Buf alive in Perl 6 for as long as the native code needs it and (in the case of Buf) you don't do a write to it that could cause a resize.
MoarVM allocates the Blob/Buf object inside of the nursery, and will move it during GC runs. However, that object does not hold the data; rather, it holds the size and a pointer to a block of memory holding the values. That block of memory is not allocated using the GC, and so will not move.
+------------------------+
| GC-managed Blob object |
+------------------------+      +------------------------+
| Elements               |----->| Non-GC-managed memory  |
+------------------------+      | (this bit is passed to |
| Size                   |      | native code)           |
+------------------------+      +------------------------+
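In C terms, a rough sketch of that split might look like this (purely illustrative; these are not MoarVM's actual definitions):

#include <stdint.h>
#include <stdlib.h>

/* Illustrative only: the small header is the part the GC manages and may
 * move; the element storage is plain malloc'd memory, so its address is
 * stable and can safely be handed to native code. */
typedef struct {
    size_t   size;      /* number of elements                */
    uint8_t *elements;  /* pointer to non-GC-managed storage */
} blob_header;

blob_header *blob_new(size_t size) {
    blob_header *b = malloc(sizeof *b);  /* stand-in for a GC-managed allocation */
    b->size = size;
    b->elements = calloc(size, 1);       /* this is the block passed to C        */
    return b;
}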
Whether you should rely on this is a trickier question. Some considerations:
So far as I can tell, things could go rather less well if running on the JVM. I don't know about the JavaScript backend. You could legitimately decide that, due to adoption levels, you're only going to worry about running on MoarVM for now.
Depending on implementation details of MoarVM is OK if you just need the speed in your own code, but if working on a module you expect to be widely adopted, you might want to think if it's worth it. A lot of work is put in by both the Rakudo and MoarVM teams to not regress working code in the module ecosystem, even in cases where it can be well argued that it depended on bugs or undefined behavior. However, that can block improvements. Alternatively, on occasion, the breakage is considered worth it. Either way, it's time consuming, and falls on a team of volunteers. Of course, when module authors are responsive and can apply provided patches, it's somewhat less of a problem.
The problem with "put a trait on it" is that the decision - at least on the JVM - seems to need to be made up front at the time that the memory holding the data is allocated. In which case, a portable solution probably can't allow an existing Buf/Blob to be marked up as such. Perhaps a better way will be for I/O-ish things to be asked to give something CArray-like instead, so that zero-copy can be achieved by having the data in the "right kind of memory" in the first place. That's probably a reasonable feature request.
I would like to save an objective-c block to a file (or any other storage e.g. FTP server) and later load it from there and execute it.
From the Blocks Programming Guide > Using Blocks > Copying Blocks, I know that blocks can be stored in the heap. Because anything stored there can be modified, I think that it is possible to read and write arbitrary content from/to the heap and treat the data as a block.
My problem is: how do you save a block to a file? I don't even know what its structure is or how many bytes it covers. I highly doubt that doing a sizeof() and then reading/writing that many bytes is sufficient. Please help me find a starting point for reading and writing blocks to/from memory and for understanding how they are composed.
Let's start from this code:
void (^myBlock)(void) = ^{ printf("Hello, I'm a Block\n"); };
printf("block size: %lu\n", sizeof(myBlock));
myBlock();
Output:
block size: 4
Hello, I'm a Block
As you can imagine, if this works, a long list of fascinating concepts could be implemented in iOS. Just to name a few:
Downloading executable code (as a block) from the web on the fly, storing it in the heap, and executing it, thus making dynamically linked libraries possible in iOS. From this idea many more possibilities arise, too many to list here.
Compiling code in-app and executing it immediately, thus enabling any kind of natively executed scripting language in iOS apps.
Manipulating code at runtime on the machine level in iOS. This is an important topic for AI and evolutionary/random algorithms.
A block object can be stored in the heap. But a block object itself, like other objects, does not contain executable code -- it only contains captured variables, some metadata, and a pointer to the underlying function that is executed. Even if you could hypothetically serialize block objects, you could only unserialize them on a system that has implemented the same block, i.e. has the same executable code.
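Roughly speaking, and only as an illustration based on the publicly documented Clang blocks ABI (the real definitions live in the compiler and runtime), a block literal is laid out like this:

/* Approximate layout of a block object per the Clang blocks ABI (simplified). */
struct Block_descriptor {
    unsigned long reserved;
    unsigned long size;          /* size of the Block_literal below            */
    /* optional copy/dispose helpers and a signature follow, depending on flags */
};

struct Block_literal {
    void *isa;                   /* _NSConcreteStackBlock, _NSConcreteMallocBlock, ... */
    int   flags;
    int   reserved;
    void (*invoke)(void *, ...); /* pointer to the compiled function body       */
    struct Block_descriptor *descriptor;
    /* captured ("closed over") variables are laid out after this point         */
};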
To make an analogy, what you are saying applies equally with a normal Objective-C object -- Objective-C objects exist on the heap, you can serialize many Objective-C objects, and Objective-C objects contain executable "methods" that you can call on them. Does that mean you can "download executable code (as an object) from the web on the fly, storing it in the heap, and call methods on it, thus making dynamically linked libraries possible in iOS."? Of course not. You can only potentially unserialize objects on a system that has the same class.
It is not possible:
when you copy a block to the heap you are copying the block object and its captured state, not the code of the block itself.
Moreover, the ability to run code that has not been compiled and signed goes against the sandbox model, and it would open up the possibility of running malicious code in your app, breaking its security.
You could implement a custom language interpreter in your app to run interpreted code, but that would be against Apple's policy and it would be rejected during the review process.
Reading through the other questions that are similar to mine, I see that most people want to know why you would need to know the size of an instance, so I'll go ahead and tell you although it's not really central to the problem. I'm working on a project that requires allocating thousands to hundreds of thousands of very small objects, and the default allocation pattern for objects simply doesn't cut it. I've already worked around this issue by creating an object pool class, that allows a tremendous amount of objects to be allocated and initialized all at once; deallocation works flawlessly as well (objects are returned to the pool).
It actually works perfectly and isn't my issue, but I noticed class_getInstanceSize was returning unusually large sizes. For instance, a class that stores one size_t and two (including isA) Class instance variables is reported to be 40-52 bytes in size. I give a range because calling class_getInstanceSize multiple times, even in a row, has no guarantee of returning the same size. In fact, every object but NSObject seemingly reports random sizes that are far from what they should be.
As a test, I tried:
printf("Instance Size: %zu\n", class_getInstanceSize(objc_getClass("MyClassName"));
That line of code always returns a value that corresponds to the size that I've calculated by hand to be correct. For instance, the earlier example comes out to 12 bytes (32-bit) and 24 bytes (64-bit).
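For what it's worth, the kind of cross-check I'm describing looks roughly like this (MyClassName is just a placeholder and the helper is only a sketch):

#include <objc/runtime.h>
#include <stdio.h>
#include <stdlib.h>

/* Dumps each ivar's offset plus the size the runtime reports, so a
 * hand-calculated layout can be compared against class_getInstanceSize. */
static void dump_layout(const char *name) {
    Class cls = objc_getClass(name);
    unsigned int count = 0;
    Ivar *ivars = class_copyIvarList(cls, &count);
    for (unsigned int i = 0; i < count; i++)
        printf("%-20s offset %td\n", ivar_getName(ivars[i]), ivar_getOffset(ivars[i]));
    free(ivars);
    printf("class_getInstanceSize: %zu\n", class_getInstanceSize(cls));
}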
Thinking that the runtime may be doing something behind the scenes that requires more memory, I watched the actual memory use of each object. For the example given, the only memory read from or written to is in that 12/24 byte block that I've calculated to be the expected size.
class_getInstanceSize acts like this on both the Apple & GNU 2.0 runtime. So is there a known bug with class_getInstanceSize that causes this behavior, or am I doing something fundamentally wrong? Before you blame my object pool; I've tried this same test in a brand new project using both the traditional alloc class method and by allocating the object using class_createInstance(self, 0); in a custom class method.
Two things I forgot to mention before: I'm almost entirely testing this on my own custom classes, so I know the trickery isn't down to the class actually being a class cluster or any of that nonsense; second, class_getInstanceSize([MyClassName class]) and class_getInstanceSize(self) // run inside a class method rarely produce the same result, despite both simply referencing isA. Again, this happens in both runtimes.
I think I've solved the problem and it was due to possibly the dumbest reason ever.
I use a profiling/debugging library that is old; in fact, I don't know its actual name (the library is libcsuomm; the header for it has no identifying info). All I know about it is that it was a library available on the computers in the compsci labs (I did a year of Comp-Sci before switching to a Geology major, graduating and never looking back).
Anyway, the point of the library is that it provides a number of profiling and debugging facilities; the one I use it most for is memory leak detection, since it actually tracks allocations per object, unlike my other favorite memory-leak library (MSS, now unsupported), which is C-based and not aware of objects beyond raw allocations.
Because I use it so much when debugging, I always set it up by default without even thinking about it. So even when creating my test projects to try and pinpoint the bug, I set it up without even putting any thought into it. Well, it turns out that the library works by pulling some runtime trickery, so it can properly track objects. Things seem to work correctly now that I've disabled it, so I believe that it was the source of my problems.
Now I feel bad about jumping to conclusions about it being a bug, but at the time I couldn't see anything in my own code that could possibly cause that problem.
I am using Keil's ARM-MDK 4.11. I have a statically allocated block of memory that is used only at startup. It is used before the scheduler is initialised and due to the way RL-RTX takes control of the heap-management, cannot be dynamically allocated (else subsequent allocations after the scheduler starts cause a hard-fault).
I would like to add this static block as a free-block to the system heap after the scheduler is initialised. It would seem that __Heap_ProvideMemory() might provide the answer, this is called during initialisation to create the initial heap. However that would require knowledge of the heap descriptor address, and I can find no documented method of obtaining that.
Any ideas?
I have raised a support request with ARM/Keil for this, but they are more interested in questioning why I would want to do this, and offering alternative solutions. I am well aware of the alternatives, but in this case if this could be done it would be the cleanest solution.
We use the Rowley Crossworks compiler but had a similar issue - the heap was being set up in the compiler CRT startup code. Unfortunately the SDRAM wasn't initialised till the start of main() and so the heap wasn't set up properly. I worked around it by reinitialising the heap at the start of main(), after the SDRAM was initialised.
I looked at the assembler code that the compiler uses at startup to work out the structure - it wasn't hard. Subsequently I have also obtained the malloc/free source code from Rowley - perhaps you could ask Keil for their version?
One method I've used is to incorporate my own simple heap routines and take over the malloc()/calloc()/free() functions from the library.
The simple, custom heap routines had an interface that allowed adding blocks of memory to the heap.
The drawback to this (at least in my case) was that the custom heap routines were far less sophisticated than the built-in library routines and were probably more prone to fragmentation than the built-in routines. That wasn't a serious issue in that particular application. If you want the capabilities of the built-in library routines, you could probably have your malloc() defer to the built-in heap routines until it returns a failure, then try to allocate from your custom heap.
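As a rough sketch of the approach (names and details are made up for illustration, and splitting/coalescing are omitted for brevity): a single free list to which extra memory regions can be added at runtime, with malloc()/free()-style wrappers layered on top.

#include <stddef.h>

typedef struct free_block {
    size_t size;               /* usable bytes after the header */
    struct free_block *next;
} free_block;

static free_block *free_list = NULL;

/* The key extra capability: hand any static region over to the heap. */
void heap_add_region(void *base, size_t size) {
    free_block *b = (free_block *)base;
    b->size = size - sizeof(free_block);
    b->next = free_list;
    free_list = b;
}

void *my_malloc(size_t size) {
    free_block **prev = &free_list;
    for (free_block *b = free_list; b; prev = &b->next, b = b->next) {
        if (b->size >= size) {      /* first fit; no splitting in this sketch */
            *prev = b->next;
            return (void *)(b + 1);
        }
    }
    return NULL;                    /* could fall back to the library heap here */
}

void my_free(void *p) {
    if (!p) return;
    free_block *b = (free_block *)p - 1;
    b->next = free_list;            /* no coalescing in this sketch */
    free_list = b;
}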
Another drawback is that I found it much more painful to make sure the custom routines were bug-free than I thought it would be at first glance, even though I wasn't trying to do anything too fancy (just a simple list of free blocks that could be split on allocation and coalesced when freed).
The one benefit to this technique is that it's pretty portable (as long as your custom routines are portable) and doesn't break if the toolchain changes its internals. The only part that requires porting is taking over the malloc()/free() interface and making sure you get initialized early enough.
Maybe my question is stupid, but I would like to get it cleared up. We know that functions are loaded into memory only once, and when you create new objects only instance variables get created; functions are never duplicated. My question is: suppose there is a server and all clients access a method named createCustomer(). Suppose all the clients do something which fires createCustomer on the server. So if the method is in the middle of execution and a new client fires it, will the new request be put on wait? Or will the new request also start executing the method? How does it all get managed when there is only one copy of the function in memory? No book mentions answers to this type of question, so I am posting here where I am bound to get answers :).
Functions are code which is then executed in a memory context. The code can be run many times in parallel (literally in parallel on a multi-processor machine), but each of those calls will execute in a different memory context (from the point of view of local variables and such). At a low level this works because the functions will reference local variables as offsets into memory on something called a "stack" which is pointed to by a processor register called the "stack pointer" (or in some interpreted languages, an analog of that register at a higher level), and the value of this register will be different for different calls to the function. So the x local variable in one call to function foo is in a different location in memory than the x local variable in another call to foo, regardless of whether those calls happen simultaneously.
Instance variables are different, they're referenced via a reference (pointer) to the memory allocated to the instance of an object. Two running copies of the same function might access the same instance variable at exactly the same time; similarly, two different functions might do so. This is why we get into "threading" or concurrency issues, synchronization, locks, race conditions, etc. But it's also one reason things can be highly efficient.
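To make this concrete, here is a small C sketch (using POSIX threads; the names are illustrative) of the distinction: each call gets its own local x on its own stack, while the shared counter plays the role of an instance variable that both calls can touch at the same time.

#include <pthread.h>
#include <stdio.h>

int shared = 0;                    /* like an instance variable: one copy, visible to all calls */

void *worker(void *arg) {
    int x = *(int *)arg;           /* like a local: lives on this thread's own stack */
    x = x * 10;
    shared += x;                   /* unsynchronized: two threads may race here */
    printf("local x = %d\n", x);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    int a = 1, b = 2;
    pthread_create(&t1, NULL, worker, &a);
    pthread_create(&t2, NULL, worker, &b);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("shared = %d (may be wrong without synchronization)\n", shared);
    return 0;
}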
It's called "multi-threading". If each request has its own thread, and the object contains mutable data, each client will have the opportunity to modify the state of the object as they see fit. If the person who wrote the object isn't mindful of thread safety you could end up with an object that's in an inconsistent state between requests.
This is a basic threading issue, you can look it up at http://en.wikipedia.org/wiki/Thread_(computer_science).
Instead of thinking in terms of code that is executed, try to think of memory context of a thread that is changed. It does not matter where and what the actual code happens to be, and if it is the same code or a duplicate or something else.
Basically, it can happen that the function is called again while an earlier call is still executing. The two calls are independent and may even run in parallel (on a multicore machine). This independence is achieved by giving each thread its own stack (and each process its own virtual address space).
There are ways to synchronize calls, so additional callers have to wait until the first call finishes. This is also explained in the above link.
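For example, a minimal sketch of serializing the calls with a POSIX mutex (the function name is only illustrative):

#include <pthread.h>

/* One lock guarding the shared state used by create_customer(). */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void create_customer(void) {
    pthread_mutex_lock(&lock);     /* later callers block here until the holder is done */
    /* ... touch shared data: customer list, ID counter, ... */
    pthread_mutex_unlock(&lock);
}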