Create a const char ** in Visual Works image - smalltalk

How should I create a (const char **) to pass to a C function?
Let's say my const char ** is named prompts. Then:
user := 'User:' copyToHeap: #malloc:.
pwd := 'Password:' copyToHeap: #malloc:.
prompts := (ByteArray new: 64) copyToHeap: #malloc:.
prompts copyAt: 0 from: (user referentAddress asByteArraySize: 32) size: 4 startingAt: 1.
prompts copyAt: 31 from: (pwd referentAddress asByteArraySize: 32) size: 4 startingAt: 1.
So prompts is a 64-bit array in which the first 32 bits are a pointer to user and the second 32 bits are a pointer to pwd.
But the C function is not working.
In GemStone it works fine with:
prompts := CByteArray gcMalloc: 16.
user := CByteArray withAll: 'User:'.
pwd := CByteArray withAll: 'Password:'.
prompts uint64At: 0 put: user memoryAddress.
prompts uint64At: 8 put: pwd memoryAddress.

DLLCC offers some API very close to C.
You need an array of two char pointers.
prompts := CIntegerType char pointerType gcMalloc: 2.
Then you can populate this array like this:
prompts at: 0 put: user.
prompts at: 1 put: pwd.
Note that the indices mimic C like prompts[0]=user; prompts[1]=pwd;.
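For comparison, here is roughly what the C side sees. This is a minimal sketch in plain C, not DLLCC code; total_prompt_length and prompt_slot_offset are hypothetical names standing in for whatever function actually receives the const char **. It also shows that each slot is sizeof(char *) bytes wide (8 on a typical 64-bit platform), which is consistent with the working GemStone version writing at byte offsets 0 and 8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callee: receives an array of `count` C strings. */
size_t total_prompt_length(const char **prompts, size_t count)
{
    size_t total = 0;
    assert(prompts != NULL);
    for (size_t i = 0; i < count; i++) {
        total += strlen(prompts[i]);   /* each element is a char pointer */
    }
    return total;
}

/* Byte offset of slot `index` inside the pointer array. */
size_t prompt_slot_offset(size_t index)
{
    return index * sizeof(const char *);
}
```

A caller would declare const char *prompts[2], assign the two strings to prompts[0] and prompts[1], and pass prompts together with the count 2.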
One last thing: everything you malloc you must also free, otherwise you will leak memory.
That means you should protect all this code with something like
["your protected code here"]
ensure: [prompts free. user free. pwd free]
...or worse...
["your protected code here"]
ensure:
[prompts isNil ifFalse: [prompts free].
"etc..."].
In early development, I suggest you use gcMalloc and gcMalloc: instead.
AFTER THOUGHTS
gcMalloc is perhaps not such a good idea for user and pwd.
This is because prompts gets a copy of the memory addresses held by the user and pwd objects: it points to the same memory zone, but not to the Smalltalk objects...
gcMalloc only tracks the garbage collection of the Smalltalk objects. Hence, if the Smalltalk objects are no longer used, the C heap memory might get freed prematurely even though other objects still point to it...
Example:
fillPrompts
| user pwd prompts |
user := 'User:' copyToHeap: #gcMalloc:.
pwd := 'Password:' copyToHeap: #gcMalloc:.
prompts := CIntegerType char pointerType gcMalloc: 2.
prompts at: 0 put: user.
prompts at: 1 put: pwd.
^prompts
copyToHeap: creates a CPointer object. As long as the method is active, its context points to those objects (through slots on the stack).
But after the method returns, no object points to the CPointer objects any more.
If a garbage collection occurs, the C heap memory they refer to will be freed.
But prompts still contains references to that already freed memory (so-called dangling pointers).
DLLCC being very close to C, one must take the same care as when writing C code... And double pointers are a source of bugs for the vast majority of C programmers.
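The same rule, stated in C terms, is that the pointer table must never outlive the strings it points to. One way to enforce that is to give both a single owner. The sketch below is plain C with hypothetical names (prompt_set, prompt_set_create, prompt_set_free); it is not DLLCC API, only an illustration of the ownership idea.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical owner: keeps the strings and the pointer table together. */
struct prompt_set {
    char *strings[2];        /* heap copies of the prompt texts */
    const char *table[2];    /* what gets passed as const char ** */
};

static char *copy_string(const char *text)
{
    size_t size = strlen(text) + 1;
    char *copy = malloc(size);
    if (copy != NULL) {
        memcpy(copy, text, size);
    }
    return copy;
}

struct prompt_set *prompt_set_create(const char *user, const char *pwd)
{
    struct prompt_set *set;
    assert(user != NULL && pwd != NULL);
    set = calloc(1, sizeof *set);
    if (set == NULL) {
        return NULL;
    }
    set->strings[0] = copy_string(user);
    set->strings[1] = copy_string(pwd);
    if (set->strings[0] == NULL || set->strings[1] == NULL) {
        free(set->strings[0]);
        free(set->strings[1]);
        free(set);
        return NULL;
    }
    set->table[0] = set->strings[0];
    set->table[1] = set->strings[1];
    return set;
}

/* Frees the strings and the table in one place, so neither dangles. */
void prompt_set_free(struct prompt_set *set)
{
    if (set == NULL) {
        return;
    }
    free(set->strings[0]);
    free(set->strings[1]);
    free(set);
}
```

In the Smalltalk example, the analogous precaution would be to keep user and pwd referenced for as long as prompts is in use.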

You shouldn't work directly on the bytes. That doesn't even make sense in C.
Create a struct with the two char* members; that's easier to declare, to create and to handle.
Use #gcCalloc or #gcCopyToHeap to allocate memory on the heap that is still freed automatically. Typically it's safe to use these methods because you only need that memory inside a single method to transfer it to C. The assumption is that the C function copies this memory itself in case it needs it later.
You can use #memberAt:put: to assign members of the struct.

Related

Create a CByteArray from a CPointer in Visual Works Smalltalk

Some C function returns a CPointer to a C struct.
The C struct is known.
Now I want to put the C struct into a ByteArray, basically copying the contents of the struct to a ByteArray.
In GemStone/S this can be done with:
CByteArray fromCPointer: aCPointer numBytes: 120.
"this creates a CByteArray with the contents of the struct referenced by aCPointer (copying only 120 bytes)"
Is there something similar in Visual Works?
I have not found it yet.
It would be possible to replicate the C struct at the Visual Works level, but it is only one struct, so handling it at a low level is fine.
There's only the rather ugly #copyAt:to:size:startingAt: that you can send to a pointer. You need to allocate a ByteArray yourself (make sure it's big enough).
answer := ByteArray new: size.
pointer
copyAt: 0
to: answer
size: size
startingAt: 1.
The other way (ByteArray -> Pointer) would be done using #copyAt:from:size:startingAt:.
This method works for both ByteArray and UninterpretedBytes. If you want to read data from the bytes, UninterpretedBytes may be more helpful as you can send things like #longAt: to read a long from an offset.
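In C terms, the pointer-to-bytes copy is a bounded memcpy into a buffer the caller has already allocated. The following is only a sketch of that idea; sample_record and copy_record_bytes are hypothetical names and do not correspond to any Visual Works or GemStone API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical struct standing in for the one the C function returns. */
struct sample_record {
    int id;
    char label[12];
};

/* Copies at most `capacity` bytes of the struct into the caller's buffer
   and returns the number of bytes copied. */
size_t copy_record_bytes(const struct sample_record *record,
                         unsigned char *buffer, size_t capacity)
{
    size_t size = sizeof *record;
    assert(record != NULL && buffer != NULL);
    if (size > capacity) {
        size = capacity;       /* never write past the end of the buffer */
    }
    memcpy(buffer, record, size);
    return size;
}
```

As in the Smalltalk version, the caller is responsible for making the destination big enough; here the capacity argument at least prevents an overrun.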
If aCPointer points to a struct of char * for example:
struct Names
{char * name;
char * longname;} name;
Then:
(aCPointer at: 0) copyCStringFromHeap. "answer [name]"
(aCPointer at: 1) copyCStringFromHeap. "answer [longname]"
For structs of char * this works nicely; I have not tested it with other C types.

Memory ownership in PKCS #11 C_FindObjects where ulMaxObjectCount != 1

The authors of PKCS #11 v2.40 utilize a common pattern when an API returns a variable length list of items. In APIs such as C_GetSlotList and C_GetMechanismList, the application is expected to call the APIs twice. In the first invocation, a pointer to a CK_ULONG is set to the number of items that will be returned on the next invocation. This allows the application to allocate enough memory and invoke the API again to retrieve the results.
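The two-call pattern described here can be sketched as follows. This is not the PKCS #11 API: mock_get_list and mock_fetch_all are hypothetical functions over a fixed in-memory list, written only to show the shape of "ask for the count, allocate, ask again".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list source. With list == NULL it only reports the item
   count; otherwise it fills the caller's buffer.
   Returns 0 on success, -1 if the buffer is too small. */
int mock_get_list(unsigned long *list, unsigned long *count)
{
    static const unsigned long items[] = { 11, 22, 33 };
    const unsigned long available = sizeof items / sizeof items[0];
    assert(count != NULL);
    if (list == NULL) {
        *count = available;
        return 0;
    }
    if (*count < available) {
        *count = available;    /* tell the caller how much is needed */
        return -1;
    }
    for (unsigned long i = 0; i < available; i++) {
        list[i] = items[i];
    }
    *count = available;
    return 0;
}

/* Two-call pattern: ask for the size, allocate, then fetch.
   The caller owns (and must free) the returned buffer. */
unsigned long *mock_fetch_all(unsigned long *count)
{
    unsigned long *list;
    if (mock_get_list(NULL, count) != 0) {
        return NULL;
    }
    list = malloc(*count * sizeof *list);
    if (list == NULL) {
        return NULL;
    }
    if (mock_get_list(list, count) != 0) {
        free(list);
        return NULL;
    }
    return list;
}
```

Note that all memory is allocated and freed by the application, never by the list source.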
The C_FindObjects call also returns a variable number of items, but it uses a different paradigm. The parameter CK_OBJECT_HANDLE_PTR phObject is set to the head of the result list. The parameter CK_ULONG_PTR pulObjectCount is set to the number of items returned, which is guaranteed not to exceed CK_ULONG ulMaxObjectCount.
The standard does not explicitly say that phObject must be a valid pointer to a block of memory large enough to hold ulMaxObjectCount CK_OBJECT_HANDLEs.
One could interpret the standard as meaning that the application must pessimistically allocate enough memory for ulMaxObjectCount objects. Alternatively, one could interpret the standard as meaning that the PKCS #11 implementation will allocate pulObjectCount CK_OBJECT_HANDLEs and it is then the application's responsibility to free that memory. This latter interpretation seems suspect, however, as nowhere else in the standard does a PKCS #11 implementation ever allocate memory.
The passage is:
C_FindObjects continues a search for token and session objects that
match a template, obtaining additional object handles. hSession is
the session’s handle; phObject points to the location that receives
the list (array) of additional object handles; ulMaxObjectCount is
the maximum number of object handles to be returned; pulObjectCount
points to the location that receives the actual number of object
handles returned.
If there are no more objects matching the template, then the location
that pulObjectCount points to receives the value 0.
The search MUST have been initialized with C_FindObjectsInit.
The non-normative example is not very helpful, as it sets ulMaxObjectCount to 1. It does, however, allocate the memory for that one entry, which seems to indicate that the application must pessimistically pre-allocate the memory.
CK_SESSION_HANDLE hSession;
CK_OBJECT_HANDLE hObject;
CK_ULONG ulObjectCount;
CK_RV rv;
.
.
rv = C_FindObjectsInit(hSession, NULL_PTR, 0);
assert(rv == CKR_OK);
while (1) {
rv = C_FindObjects(hSession, &hObject, 1, &ulObjectCount);
if (rv != CKR_OK || ulObjectCount == 0)
break;
.
.
}
rv = C_FindObjectsFinal(hSession);
assert(rv == CKR_OK);
Specification Link: http://docs.oasis-open.org/pkcs11/pkcs11-base/v2.40/pkcs11-base-v2.40.pdf
Yes, it would appear that the application is responsible for allocating space for the object handles returned by C_FindObjects(). The example code does this, even though it only requests a single object handle at a time, and so should you.
You could just as well rewrite the example code to request multiple object handles, e.g. like this:
#define MAX_OBJECT_COUNT 100 /* arbitrary value */
CK_SESSION_HANDLE hSession;
CK_OBJECT_HANDLE hObjects[MAX_OBJECT_COUNT];
CK_ULONG ulObjectCount, i;
CK_RV rv;
rv = C_FindObjectsInit(hSession, NULL_PTR, 0);
assert(rv == CKR_OK);
while (1) {
rv = C_FindObjects(hSession, hObjects, MAX_OBJECT_COUNT, &ulObjectCount);
if (rv != CKR_OK || ulObjectCount == 0) break;
for (i = 0; i < ulObjectCount; i++) {
/* do something with hObjects[i] here */
}
}
rv = C_FindObjectsFinal(hSession);
assert(rv == CKR_OK);
Presumably, the ability to request multiple object handles in a single C_FindObjects() call is intended as a performance optimization.
FWIW, this is pretty much exactly how many C standard library functions like fread() work as well. It'd be extremely inefficient to read data from a file one byte at a time with fgetc(), so the fread() function lets you allocate an arbitrarily large buffer and read as much data as will fit into it.
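The caller-allocated, batch-at-a-time paradigm can be tried out without a token. The sketch below uses a hypothetical mock_search and mock_find (not part of PKCS #11) that behave the way the answer assumes C_FindObjects does: write at most max_count handles into the caller's buffer and report how many were written.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a search session; NOT the PKCS #11 API. */
struct mock_search {
    unsigned long next;    /* next handle to hand out */
    unsigned long total;   /* number of matching objects */
};

/* Writes at most `max_count` handles into the caller's buffer and stores
   the number written in *count (0 once the search is exhausted). */
void mock_find(struct mock_search *search, unsigned long *handles,
               unsigned long max_count, unsigned long *count)
{
    unsigned long n = 0;
    assert(search != NULL && handles != NULL && count != NULL);
    while (n < max_count && search->next < search->total) {
        handles[n++] = search->next++;
    }
    *count = n;
}

/* Drains the search in batches of up to 4 handles and returns how many
   handles were seen in total. */
unsigned long mock_count_all(struct mock_search *search)
{
    unsigned long batch[4];
    unsigned long count = 0;
    unsigned long seen = 0;
    for (;;) {
        mock_find(search, batch, 4, &count);
        if (count == 0) {
            break;
        }
        seen += count;
    }
    return seen;
}
```

With ten matching objects and a batch size of 4, the loop makes four calls: three that return handles (4, 4 and 2) and a final one that returns 0.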

Storing things in isa

The 64-bit runtime took away the ability to directly access the isa field of an object, something Clang engineers had been warning us about for a while. Direct access has been replaced by a rather inventive (and magic) set of ever-changing ABI rules about which sections of the newly christened isa header contain information about the object, or even other state (in the case of NSNumber/NSString). There seems to be a loophole, in that you can opt out of the new "magic" isa and use one of your own (a raw isa) at the expense of taking the slow road through certain runtime code paths.
My question is twofold, then:
If it's possible to opt out and object_setClass() an arbitrary class into an object in +allocWithZone:, is it also possible to put anything up there in the extra space with the class, or will the runtime try to read it through the fast paths?
What exactly in the isa header is tagged to let the runtime differentiate it from a normal isa?
If it's possible to opt out and object_setClass() an arbitrary class into an object in +allocWithZone:
According to this article by Greg Parker
If you override +allocWithZone:, you may initialize your object's isa field to a "raw" isa pointer. If you do, no extra data will be stored in that isa field and you may suffer the slow path through code like retain/release. To enable these optimizations, instead set the isa field to zero (if it is not already) and then call object_setClass().
So yes, you can opt out and manually set a raw isa pointer. To inform the runtime about this, you have to set the least significant bit of the isa to 0 (see below).
Also, there's an environment variable that you can set, named OBJC_DISABLE_NONPOINTER_ISA, which is pretty self-explanatory.
is it also possible to put anything up there in the extra space with the class, or will the runtime try to read it through the fast paths?
The extra space is not being wasted. It's used by the runtime for useful in-place information about the object, such as the current state and - most importantly - its retain count (this is a big improvement since it used to be fetched every time from an external hash table).
So no, you cannot use the extra space for your own purposes, unless you opt out (as discussed above). In that case the runtime will go through the long path, ignoring the information contained in the extra bits.
Still according to Greg Parker's article, here is the new layout of the isa (note that this is very likely to change over time, so don't rely on it):
(LSB)
1 bit | indexed | 0 is raw isa, 1 is non-pointer isa.
1 bit | has_assoc | Object has or once had an associated reference. Object with no associated references can deallocate faster.
1 bit | has_cxx_dtor | Object has a C++ or ARC destructor. Objects with no destructor can deallocate faster.
30 bits | shiftcls | Class pointer's non-zero bits.
9 bits | magic | Equals 0xd2. Used by the debugger to distinguish real objects from uninitialized junk.
1 bit | weakly_referenced | Object is or once was pointed to by an ARC weak variable. Objects not weakly referenced can deallocate faster.
1 bit | deallocating | Object is currently deallocating.
1 bit | has_sidetable_rc | Object's retain count is too large to store inline.
19 bits | extra_rc | Object's retain count above 1. (For example, if extra_rc is 5 then the object's real retain count is 6.)
(MSB)
What exactly in the isa header is tagged to let the runtime differentiate it from a normal isa?
As anticipated above, you can discriminate between a raw isa and a new rich isa by looking at the least significant bit.
To wrap up: while it looks feasible to opt out and start messing with the extra bits available on a 64-bit architecture, I personally discourage it. The new isa layout is carefully crafted to optimize runtime performance, and it is far from guaranteed to stay the same over time.
Apple may also decide in the future to drop backward compatibility with the raw isa representation, preventing opt-out. Any code assuming the isa to be a pointer would then break.
You can't safely do this, since if (when, really) the usable address space expands beyond 33 bits, the layout will presumably need to change again. Currently though, the bottom bit of the isa controls whether it's treated as having extra info or not.
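To make the bit positions in the table concrete, here is a small decoding sketch in plain C. The field positions are taken only from the table quoted above, and the macro and function names are hypothetical; the real layout is private to the runtime and may differ between architectures and releases, so treat this as an illustration of the masking, not as something to ship.

```c
#include <assert.h>
#include <stdint.h>

/* Positions as listed in the table: bit 0 is the indexed flag, bits 3..32
   hold shiftcls, bits 45..63 hold extra_rc. Assumed, not guaranteed. */
#define ISA_INDEXED_MASK   UINT64_C(0x1)
#define ISA_SHIFTCLS_SHIFT 3
#define ISA_SHIFTCLS_MASK  ((UINT64_C(1) << 30) - 1)
#define ISA_EXTRA_RC_SHIFT 45

/* 1 for a non-pointer isa, 0 for a raw isa. */
int isa_is_nonpointer(uint64_t isa)
{
    return (isa & ISA_INDEXED_MASK) != 0;
}

/* Recovers the class pointer bits by dropping the flags below shiftcls
   and everything above it. */
uint64_t isa_class_bits(uint64_t isa)
{
    return ((isa >> ISA_SHIFTCLS_SHIFT) & ISA_SHIFTCLS_MASK)
           << ISA_SHIFTCLS_SHIFT;
}

/* Inline retain count above 1. */
uint64_t isa_extra_rc(uint64_t isa)
{
    return isa >> ISA_EXTRA_RC_SHIFT;
}
```

The class pointer survives the round trip only because it is 8-byte aligned and fits in 33 bits, which is exactly the assumption the last answer warns may not hold forever.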

Variable sized arrays in Objective-C?

Okay, so apparently this works:
void foo(size_t s) {
int myArray[s];
// ... use myArray...
}
Is this really legal? I mean, it must be, because it compiles (where the C compiler would reject it as non-constant). The first part of my question is: how does this work? I assume it's allocating it on the stack? Is this different from using alloca()?
Practically, I found some code that does this:
void bar(size_t chunkSize) {
CFReadStreamRef foo = NULL;
// ...some stuff to init foo...
while (stuffToDo) {
UInt8 buffer[chunkSize];
// ...read some data from stream into buffer
// using CFReadStreamRead()...
}
}
This works. However, when I move the buffer allocation from inside the loop to the first line of the function (directly before foo is declared), the function... stops working. In the debugger it gets to the first access of local variables and then just... exits. I don't see any exceptions being thrown and it doesn't crash; the program just carries on running (in reality the function returns a string, and that return value is NULL, which is what the return variable is initialized to). I'm not sure what's going on. The second part of my question is: in light of the first part, what the heck is going on?
It is legal in C99, although dangerous, and yes, it is like alloca().
Because it is like alloca(), you want reasonably sized arrays when allocating on the stack. I am not sure whether a length of zero is defined, but you could definitely cause a stack overflow if the array is large enough.
As for what is going on: pulling it out of the loop should make no difference if the sizes are reasonable. I suspect you are seeing undefined behavior because the parameter value is too large (or perhaps 0), so you should validate the chunkSize parameter. The assembly will tell you why pulling it out of the loop makes a difference (assuming everything else in the program is well-formed).
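A minimal sketch of the suggested validation follows. C99 requires the size expression of a variable-length array to evaluate to a value greater than zero, and there is no way to detect a stack overflow after the fact, so the check has to happen before the declaration. MAX_CHUNK and sum_with_vla are hypothetical names, and the cap of 4096 is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHUNK 4096   /* arbitrary cap, chosen for illustration */

/* Returns 0 + 1 + ... + (count - 1), computed through a VLA, or -1 if
   count is zero or larger than MAX_CHUNK. */
long sum_with_vla(size_t count)
{
    if (count == 0 || count > MAX_CHUNK) {
        return -1;               /* refuse sizes a VLA cannot safely hold */
    }
    int values[count];           /* C99 variable-length array, on the stack */
    long sum = 0;
    for (size_t i = 0; i < count; i++) {
        values[i] = (int)i;
        sum += values[i];
    }
    return sum;
}
```

Keep in mind that C11 made variable-length arrays an optional feature, so a heap allocation is the more portable choice when the size is not tightly bounded.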

passing primitive or struct type as function argument

I'm trying to write some reasonably generic networking code. I have several kinds of packets, each represented by a different struct. The function where all my sending occurs looks like:
- (void)sendUpdatePacket:(MyPacketType)packet{
for(NSNetService *service in _services)
for(NSData *address in [service addresses])
sendto(_socket, &packet, sizeof(packet), 0, [address bytes], [address length]);
}
I would really like to be able to send this function ANY kind of packet, not just MyPacketType packets.
I thought maybe if the function def was:
- (void)sendUpdatePacket:(void*)packetRef
I could pass in any kind of pointer to a packet. But, without knowing the type of packet, I can't dereference the pointer.
How do I write a function to accept any kind of primitive/struct as its argument?
What you are trying to achieve is polymorphism, which is an OO concept.
So while this would be quite easy to implement in C++ (or other OO languages), it's a bit more challenging in C.
One way around it is to create a generic "packet" structure such as this:
typedef struct {
void* messageHandler;
int messageLength;
int* messageData;
} packet;
Where the messageHandler member is a function pointer to a callback routine which can process the message type, and the messageLength and messageData members are fairly self-explanatory.
The idea is that the method which you pass the packet struct to would use the Tell, Don't Ask principle to invoke the specific message handler pointed to by messageHandler, passing in the messageLength and messageData without interpreting them.
The dispatch function (pointed to by messageHandler) would be message-specific and will be able to cast the messageData to the appropriate meaningful type, and then the meaningful fields can be extracted from it and processed, etc.
Of course, this is all much easier and more elegant in C++ with inheritance, virtual methods and the like.
Edit:
In response to the comment:
I'm a little unclear how "able to cast
the messageData to the appropriate
meaningful type, and then the
meaningful fields can be extracted
from it and processed, etc." would be
accomplished.
You would implement a handler for a specific message type, and set the messageHandler member to be a function pointer to this handler. For example:
void messageAlphaHandler(int messageLength, int* messageData)
{
MessageAlpha* myMessage = (MessageAlpha*)messageData;
// Can now use MessageAlpha members...
int messageField = myMessage->field1;
// etc...
}
You would define messageAlphaHandler() in such a way to allow any class to get a function pointer to it easily. You could do this on startup of the application so that the message handlers are registered from the beginning.
Note that for this system to work, all message handlers would need to share the same function signature (i.e. return type and parameters).
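Putting the pieces together, a sketch of the dispatch might look like the following. It departs from the struct shown earlier in one respect: the handler member is declared as a typed function pointer rather than void*, since converting between void* and function pointers is not guaranteed to be portable in C. All names here (message_handler, typed_packet, message_alpha, dispatch_packet) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One signature shared by every handler. */
typedef int (*message_handler)(size_t length, const void *data);

typedef struct {
    message_handler handler;   /* typed function pointer instead of void* */
    size_t length;
    const void *data;
} typed_packet;

/* Hypothetical message type and its handler. */
typedef struct {
    int field1;
    int field2;
} message_alpha;

int message_alpha_handler(size_t length, const void *data)
{
    const message_alpha *message = data;
    if (length < sizeof *message) {
        return -1;                         /* truncated message */
    }
    return message->field1 + message->field2;
}

/* Tell, don't ask: the dispatcher never inspects the payload. */
int dispatch_packet(const typed_packet *packet)
{
    assert(packet != NULL && packet->handler != NULL);
    return packet->handler(packet->length, packet->data);
}
```

The sender fills a typed_packet with the handler for its message type, the payload size and a pointer to the payload; the receiving code only ever calls dispatch_packet.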
Or for that matter, how messageData
would be created in the first place
from my struct.
How are you getting your packet data? Are you creating it manually, or reading it off a socket? Either way, you need to encode it somewhere as a string of bytes. The int* member (messageData) is merely a pointer to the start of the encoded data. The messageLength member is the length of this encoded data.
In your message handler callback, you probably don't want to continue to manipulate the data as raw binary/hex data, but instead interpret the information in a meaningful fashion according to the message type.
Casting it to a struct essentially maps the raw binary information on to a meaningful set of attributes matching to the protocol of the message you are processing.
The key is that you must realize that everything in a computer is just an array of bytes (or, words, or double words).
ZEN MASTER MUSTARD is sitting at his desk, staring at a complex pattern of seemingly random characters on his monitor. A STUDENT approaches.
Student: Master? May I interrupt?
Zen Master Mustard: You have answered your own inquiry, my son.
S: What?
ZMM: By asking your question about interrupting me, you have interrupted me.
S: Oh, sorry. I have a question about moving structures of varying size from place to place.
ZMM: If that is true, then you should consult a master who excels at such things. I suggest you pay a visit to Master DotPuft, who has great knowledge in moving large metal structures, such as tracking radars, from place to place. Master DotPuft can also cause the slightest elements of a feather-weight strain gage to move with the force of a dove's breath. Turn right, then turn left when you reach the door of the hi-bay. There dwells Master DotPuft.
S: No, I mean moving large structures of varying sizes from place to place in the memory of a computer.
ZMM: I may assist you in that endeavor, if you wish. Describe your problem.
S: Specifically, I have a C function that I want to accept several different types of structs (they will represent different types of packets). So my struct packets will be passed to my function as void*. But without knowing the type, I can't cast them, or really do much of anything. I know this is a solvable problem, because sendto() from socket.h does exactly that:
ssize_t sendto(int socket, const void *message, size_t length, int flags, const struct sockaddr *dest_addr,socklen_t dest_len);
where sendto would be called like:
sendto(socketAddress, &myPacket, sizeof(myPacket), Other args....);
ZMM: Did you describe your problem to Zen Master MANTAR! ?
S: Yeah, he said, "It's just a pointer. Everything in C is a pointer." When I asked him to explain, he said, "Bok, bok, get the hell out of my office."
ZMM: Truly, you have spoken to the master. Did this not help you?
S: Um, er, no. Then I asked Zen Master Max.
ZMM: Wise is he. Was his advice to you useful?
S: No. When I asked him about sendto(), he just swirled his fists in the air and said, "It's just an array of bytes."
ZMM: Indeed, Zen Master Max has tau.
S: Yeah, he has tau, but how do I deal with function arguments of type void*?
ZMM: To learn, you must first unlearn. The key is that you must realize that everything in a computer is just an array of bytes (or words, or double words). Once you have a pointer to the beginning of a buffer, and the length of the buffer, you can send it anywhere without needing to know the type of data placed in the buffer.
S: OK.
ZMM: Consider a string of man-readable text. "You plan a tower that will pierce the clouds? Lay first the foundation of humility." It is 83 bytes long. Or, perhaps, 166 if the evil Unicode is used. Guard yourself against the lies of Unicode! I can submit this text to sendto() by providing a pointer to the beginning of the buffer that contains the string, and the length of the buffer, like so:
char characterBuffer[300]; // 300 bytes
strcpy(characterBuffer, "You plan a tower that will pierce the clouds? Lay first the foundation of humility.");
// note that sizeof(characterBuffer) evaluates to 300 bytes.
sendto(socketAddress, &characterBuffer, sizeof(characterBuffer));
ZMM: Note well that the number of bytes of the character buffer is automatically calculated by the compiler. The result of sizeof has a type called "size_t". It is an unsigned integer type, likely equivalent to "unsigned long" or "unsigned int", but it is compiler dependent.
S: Well, what if I want to send a struct?
ZMM: Let us send a struct, then.
struct myStruct
{
int integerField; // 4 bytes
char characterField[300]; // 300 bytes
float floatField; // 4 bytes
} myStruct;
myStruct.integerField = 8765309;
strcpy(myStruct.characterField, "Jenny, I got your number.");
myStruct.floatField = 876.5309;
// sizeof(myStruct) evaluates to 4 + 300 + 4 = 308 bytes
sendto(socketAddress, &myStruct, sizeof(myStruct));
S: Yeah, that's great at transmitting things over TCP/IP sockets. But what about the poor receiving function? How can it tell if I am sending a character array or a struct?
ZMM: One way is to enumerate the different types of data that may be sent, and then send the type of data along with the data. Zen Masters refer to this as "metadata", that is to say, "data about the data". Your receiving function must examine the metadata to determine what kind of data (struct, float, character array) is being sent, and then use this information to cast the data back into its original type. First, consider the transmitting function:
typedef enum
{
INTEGER_IN_THE_PACKET = 0,
STRING_IN_THE_PACKET = 1,
STRUCT_IN_THE_PACKET = 2
} typeBeingSent;
typedef struct
{
typeBeingSent dataType;
char data[4096];
} Packet_struct;
Packet_struct myPacket;
myPacket.dataType = STRING_IN_THE_PACKET;
strcpy(myPacket.data, "Nothing great is ever achieved without much enduring.");
// pass the address of the packet; the remaining sendto() arguments are omitted
sendto(socketAddress, &myPacket, sizeof(Packet_struct));
myPacket.dataType = STRUCT_IN_THE_PACKET;
memcpy(myPacket.data, &myStruct, sizeof(myStruct));
sendto(socketAddress, &myPacket, sizeof(Packet_struct));
S: All right.
ZMM: Now, let us walk along with the receiving function. It must query the type of the data that was sent and then copy the data into a variable declared with that type. Forgive me, but I forget the exact form of the recvfrom() function.
char receivedString[300];
struct myStruct receivedStruct;
// the remaining recvfrom() arguments are omitted
recvfrom(socketDescriptor, &myPacket, sizeof(myPacket));
switch(myPacket.dataType)
{
case STRING_IN_THE_PACKET:
// treat the generic bytes as characters and copy them out, always terminating the string
strncpy(receivedString, myPacket.data, sizeof(receivedString) - 1);
receivedString[sizeof(receivedString) - 1] = '\0';
printf("The string in the packet was \"%s\".\n", receivedString);
break;
case STRUCT_IN_THE_PACKET:
// copy the generic bytes into a variable of the struct type
memcpy(&receivedStruct, myPacket.data, sizeof(receivedStruct));
break;
}
ZMM: Have you achieved enlightenment? First, one asks the compiler for the size of the data (a.k.a. the number of bytes) to be submitted to sendto(). The type of the original data is sent along as well. The receiver then queries for the type of the original data, and uses it to convert from the generic bytes (a "pointer to void") back to the type of the original data (int, char[], a struct, etc.).
S: Well, I'll give it a try.
ZMM: Go in peace.
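The master's scheme can be exercised without a socket by filling a packet and reading it back in the same process. The sketch below is one possible arrangement with hypothetical names (tagged_packet, packet_fill, packet_read_struct); it adds an explicit length field, which the dialogue does not have, so the receiver can reject a payload of the wrong size. It assumes sender and receiver share the same struct layout and byte order, which is not guaranteed across different machines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { STRING_PAYLOAD = 1, STRUCT_PAYLOAD = 2 } payload_type;

typedef struct {
    payload_type type;         /* metadata: what the bytes mean */
    size_t length;             /* number of valid bytes in data */
    unsigned char data[512];
} tagged_packet;

typedef struct {
    int number;
    float ratio;
} sample_struct;

/* Fills a packet; returns 0 on success, -1 if the payload does not fit. */
int packet_fill(tagged_packet *packet, payload_type type,
                const void *payload, size_t length)
{
    assert(packet != NULL && payload != NULL);
    if (length > sizeof packet->data) {
        return -1;
    }
    packet->type = type;
    packet->length = length;
    memcpy(packet->data, payload, length);
    return 0;
}

/* Copies the payload out as a sample_struct; returns -1 when the tag or
   the size says the packet holds something else. */
int packet_read_struct(const tagged_packet *packet, sample_struct *out)
{
    assert(packet != NULL && out != NULL);
    if (packet->type != STRUCT_PAYLOAD || packet->length != sizeof *out) {
        return -1;
    }
    memcpy(out, packet->data, sizeof *out);
    return 0;
}
```

Copying with memcpy rather than casting the data pointer also sidesteps alignment problems, since the byte array inside the packet is not guaranteed to be aligned for the struct.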