Malloc'ed string contains garbage values - objective-c

I just converted an Objective-C library to a C library in the hopes of making it cross-platform. Everything appears to work until I send the result off to be processed.
At that point I get an error.
Looking back a few revisions, I noticed something in the debugger.
Right after a malloc'd string like so:
char *theString = malloc(SOME_SIZE * sizeof(char));
I would see that theString is \x03 and *theString is "3 '\003'".
I assumed at first that this was just weird memory since I haven't done a strcat or anything to it, but that odd starting character carries through, and it recurs at every other point where I perform a similar malloc.
In terms of normal processing, this is fine. Unfortunately, I don't understand what it is, otherwise, I'd just do something drastic like cutting off that first character or something.
Can someone explain to me what that is and how I deal with it if I want to convert it to an NSString safely?

The memory returned by malloc is not guaranteed to be set to any specific value. It's only guaranteed to point to memory you own that is at least as long as you specified. If you want the memory initialized to some value, you'll need to do it yourself. Alternatively, use calloc, which will zero out the memory.
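A minimal C sketch of both options. SOME_SIZE is given a made-up value of 16 here, since the question doesn't say what it really is, and the function names are just for illustration:

```c
#include <stdlib.h>
#include <string.h>

#define SOME_SIZE 16 /* hypothetical size; the question does not give one */

/* Option 1: malloc, then turn the buffer into a valid empty C string. */
char *make_empty_string_malloc(void) {
    char *s = malloc(SOME_SIZE * sizeof(char)); /* contents are indeterminate here */
    if (s != NULL) {
        s[0] = '\0'; /* now strlen/strcat have a terminator to find */
    }
    return s;
}

/* Option 2: calloc zero-fills every byte of the allocation. */
char *make_empty_string_calloc(void) {
    return calloc(SOME_SIZE, sizeof(char));
}
```

Either way the buffer starts out as an empty string, so there is no stray first character to cut off before handing it to something like stringWithUTF8String:.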

Related

how to Transpose NSData in objective-c?

I have an NSString which I have to convert into NSData. After converting it into NSData, I need to transpose the NSData in Objective-C.
How do I transpose NSData in Objective-C?
Thanks
It seems you wish to reverse the order of the bytes, here is an outline algorithm:
Use NSData's length to get the number of bytes
Allocate a C-array of type Byte to hold the bytes. Use one of the malloc() family.
Use getBytes:length: to get a copy of the bytes
Set a C Byte * variable, say frontPtr, to point at the first byte; set another, say rearPtr to point at the last byte.
Now iterate, exchanging the bytes referenced by the two pointers, then increment frontPtr, decrement rearPtr, and keep iterating while rearPtr > frontPtr.
Create a new NSData from the working buffer using NSData's + dataWithBytesNoCopy:length:freeWhenDone: passing YES as the last argument - this will take ownership of the malloc'ed buffer so there is no need for you to free it.
The algorithm simply moves two pointers from the ends of the buffer towards the middle, exchanging bytes as it goes. The termination condition makes it work for even and odd lengths (in the latter case the middle byte doesn't need to be swapped).
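The C part of that outline might look roughly like the sketch below. The function name reversed_copy is made up for illustration, and the NSData steps (getBytes:length: on the way in, dataWithBytesNoCopy:length:freeWhenDone: on the way out) are left to the caller:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Returns a malloc'ed copy of src with its bytes in reverse order, or NULL
   if length is 0 or the allocation fails. The caller owns the result and
   either frees it or hands ownership to something else. */
unsigned char *reversed_copy(const unsigned char *src, size_t length) {
    if (length == 0) return NULL;
    unsigned char *buf = malloc(length);
    if (buf == NULL) return NULL;
    memcpy(buf, src, length);

    unsigned char *frontPtr = buf;              /* first byte */
    unsigned char *rearPtr = buf + length - 1;  /* last byte */
    while (rearPtr > frontPtr) {
        unsigned char tmp = *frontPtr;
        *frontPtr++ = *rearPtr;
        *rearPtr-- = tmp;
    }
    return buf;
}
```

The length check up front matters: with an empty buffer, buf + length - 1 would point before the start of the allocation.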
If, on the other hand, you didn't wish to reverse the order of the bytes but instead want to reverse the order of the bits in each byte, Google "C bit reverse" and follow the same general structure as the above algorithm, but do the bit reversing in the loop.
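For the bit-reversing variant, one simple (not the fastest) way is to shift the bits out of one byte and into another. This is only a sketch with made-up function names; lookup-table versions are common too:

```c
#include <stddef.h>
#include <stdint.h>

/* Reverses the bit order within one byte: 0000 0001 becomes 1000 0000. */
uint8_t reverse_bits(uint8_t b) {
    uint8_t result = 0;
    for (int i = 0; i < 8; i++) {
        result = (uint8_t)((result << 1) | (b & 1u)); /* take the lowest bit of b */
        b >>= 1;
    }
    return result;
}

/* Applies reverse_bits to every byte of a buffer; byte order is unchanged. */
void reverse_bits_in_place(uint8_t *buf, size_t length) {
    for (size_t i = 0; i < length; i++) {
        buf[i] = reverse_bits(buf[i]);
    }
}
```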
If after coding the above you have a problem ask a new question, include your code, and explain your issue. Someone will undoubtedly help you.
HTH

Can I use memchr safely with an internal UTF-8 char * returned from an NSString?

I'd like to use memchr instead of strlen to find the length of a C string potentially used as the backing string of an NSString. Is this safe to do, or do I risk reading memory that I don't own, causing a crash? Let's assume that the NSString will not be released before I'm done with the internal buffer.
memchr(s, 0, XXX) and strlen(s) should pretty much behave identically, save for memchr()'s ability to terminate after XXX bytes. But strnlen() can do that, too.
And that behavior is probably exactly what you don't want.
Neither function accounts for any kind of Unicode encoding. Thus, the returned length will be the length in bytes and not the number of characters.
Use -length on the NSString if you want the string length. Beyond that, what are you trying to do?
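To make the bytes-versus-characters point concrete, here is a small C sketch. Both function names are made up. Note that -length on an NSString counts UTF-16 code units, so it can differ from both of these numbers:

```c
#include <stddef.h>
#include <string.h>

/* Length in bytes, never looking past max_bytes. Returns max_bytes if no
   terminating NUL was found inside that window. */
size_t bounded_byte_length(const char *s, size_t max_bytes) {
    const char *nul = memchr(s, 0, max_bytes);
    return nul ? (size_t)(nul - s) : max_bytes;
}

/* Number of UTF-8 code points: count every byte that is not a
   continuation byte (continuation bytes have the form 10xxxxxx). */
size_t utf8_code_points(const char *s) {
    size_t count = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80) count++;
    }
    return count;
}
```

For the UTF-8 bytes of "héllo" (where é takes two bytes), the byte length is 6 while the code point count is 5.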

Understanding pointers?

As the title suggests, I'm having trouble understanding exactly what a pointer is and why they're used. I've searched around a bit but still don't really understand. I'm working in Objective-C mainly, but from what I've read this is really more of a C topic (so I added both tags).
From what I understand, a variable with an asterisk in front points to an address in memory? I don't quite understand why you'd use a pointer to a value instead of just using the value itself.
For example:
NSString *stringVar = @"This is a test.";
When calling methods on this string, why is it a pointer instead of just using the string directly? Why wouldn't you use pointers to integers and other basic data types?
Somewhat off topic, but did I tag this correctly? As I was writing it I thought that it was more of a programming concept rather than something language specific but it does focus specifically on Objective-C so I tagged it with objective-c and c.
I don't quite understand why you'd use a pointer to a value instead of
just using the value itself.
You use a pointer when you want to refer to a specific instance of a value instead of a copy of that value. Say you want me to double some value. You've got two options:
You can tell me what the value is: "5": "Please double 5 for me." That's called passing by value. I can tell you that the answer is 10, but if you had 5 written down somewhere that 5 will still be there. Anyone else who refers to that paper will still see the 5.
You can tell me where the value is: "Please erase the number I've written down here and write twice that number in its place." That's called passing by reference. When I'm done, the original 5 is gone and there's a 10 in its place. Anyone else who refers to that paper will now see 10.
Pointers are used to refer to some piece of memory rather than copying some piece of memory. When you pass by reference, you pass a pointer to the memory that you're talking about.
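In plain C the two options look something like this (function names invented for the example):

```c
/* Pass by value: the function gets its own copy of the number. */
int doubled(int value) {
    value = value * 2; /* changes only the local copy */
    return value;
}

/* Pass by reference: the function is told where the number lives. */
void double_in_place(int *where) {
    *where = *where * 2; /* overwrites the caller's variable */
}
```

Calling doubled(paper) leaves paper at 5 and hands back 10; calling double_in_place(&paper) changes paper itself to 10.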
When calling methods on this string, why is it a pointer instead of just using the string directly?
In Objective-C, we always use pointers to refer to objects. The technical reason for that is that objects are usually allocated dynamically in the heap, so in order to deal with one you need its address. A more practical way to think about it is that an object, by definition, is a particular instance of some class. If you pass an object to some method, and that method modifies the object, then you'd expect the object you passed in to be changed afterward, and to do that we need to pass the object by reference rather than by value.
Why wouldn't you use pointers to integers and other basic data types?
Sometimes we do use pointers to integers and other basic data types. In general, though, we pass those types by value because it's faster. If I want to convey some small piece of data to you, it's faster for me to just give you the data directly than it is to tell you where you can find the information. If the data is large, though, the opposite is true: it's much faster for me to tell you that there's a dictionary in the living room than it is for me to recite the contents of the dictionary.
I think maybe you have got a bit confused between the declaration of a pointer variable, and the use of a pointer.
Any data type with an asterisk after it is the address of a value of the data type.
So, in C, you could write:
char c;
and that means the value of c is a single character. But
char *p;
is the address of a char.
The '*' after the type name, means the value of the variable is the address of a thing of that type.
Let's put a value into c:
c = 'H';
So
char *p;
means the value of p is the address of a character. p doesn't contain a character, it contains the address of a character.
The C operator & yields the address of a value, so
p = &c;
means put the address of the variable c into p. We say 'p points at c'.
Now here is the slightly odd part. The address of the first character in a string is also the address of the start of the string.
So for
const char *p = "Hello World. I hope you are all safe, sound, and healthy";
p contains the address of the 'H', and implicitly, because the characters are contiguous, p contains the address of the start of the string.
To get at the character at the start of the string, the 'H', use the 'get at the thing pointed to' operator, which is '*'.
So *p is 'H'
p = &c;
if (*p == c) { ... is true ... }
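Pulling those fragments together, a small sketch (the two function names are made up) that uses '*' to get at the thing pointed to and then walks along the contiguous characters of a string:

```c
#include <stddef.h>

/* Returns the character the pointer refers to (the 'H' of "Hello"). */
char first_char(const char *p) {
    return *p;
}

/* Steps through the contiguous characters starting at p until it reaches
   the terminating '\0', counting as it goes. */
size_t count_chars(const char *p) {
    size_t n = 0;
    while (*p != '\0') {
        p++; /* move the pointer to the next character */
        n++;
    }
    return n;
}
```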
When a function or method is called to use the string of characters, only the start address of the string (typically 4 or 8 bytes) need be handed to the function, and not the entire string. This is both efficient, and also means the function can act upon the string, and change it, which may be useful. It also means that the string can be shared.
A pointer is a special variable that holds the memory location of another variable.
So what is a pointer… look at the definition mentioned above. Let's do this one step at a time in the three-step process below:
A pointer is a special variable that holds the memory location of
another variable.
So a pointer is nothing but a variable… it's a special variable. Why is it special? Because… read point 2.
A pointer is a special variable that holds the memory location of
another variable.
It holds the memory location of another variable. By memory location I mean that it does not contain the value of another variable, but it stores the memory address number (so to speak) of another variable. What is this other variable? Read point 3.
A pointer is a special variable that holds the memory location of
another variable.
Another variable could be anything… it could be a float, int, char, double, etc. As long as it's a variable, the memory location at which it is created can be assigned to a pointer variable.
To answer each of your questions:
(1) From what I understand, a variable with an asterisk in front points
to an address in memory?
You can see it that way more or less. The asterisk is a dereference operator, which takes a pointer and returns the value at the address contained in the pointer.
(2) I don't quite understand why you'd use a pointer to a value instead of
just using the value itself.
Because pointers allow different sections of code to share information, which is better than copying the value here and there, and they also allow the pointed-to variables or objects to be modified by a called function. Further, pointers enable complex linked data structures. Read this short tutorial: Pointers and Memory.
(3) Why wouldn't you use pointers to integers and other basic data types?
A string is a pointer, unlike int or char. A string is a pointer that points to the starting address of the data that contains the string, and it yields all the values from that starting address up to a terminating byte.
A string is a more complex data type than char or int, for example. In fact, don't think of a string as a type like int or char. A string is a pointer that points to a chunk of memory. Due to its complexity, having a class like NSString to provide useful functions to work with them becomes very meaningful. See NSString.
When you use NSString, you do not create a string; you create an object that contains a pointer to the starting address of the string, and in addition, a collection of methods that allows you to manipulate the output of the data.
I have heard the analogy that an object is like a balloon, and the string you're holding it with is the pointer. Typically, code is executed like so:
MyClass *someObj = [[MyClass alloc] init];
The alloc call will allocate the memory for the object, and the init will initialize it with a defined set of default properties depending on the class. You can override init.
Pointers allow references to be passed to a single object in memory to multiple objects. If we worked with values without pointers, you wouldn't be able to reference the same object in memory in two different places.
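The same idea in plain C, with a made-up Balloon struct standing in for an object. Two pointers that hold the same address are two strings tied to one balloon:

```c
/* Hypothetical object type standing in for an Objective-C object. */
typedef struct {
    int air;
} Balloon;

/* Receives the balloon's address, so it changes the caller's balloon
   rather than a copy of it. */
void inflate(Balloon *balloon) {
    balloon->air += 1;
}
```

If first and second both point at the same Balloon, then after inflate(first) the change is visible through second as well.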
NSString *stringVar = @"This is a test.";
When calling methods on this string, why is it a pointer instead of just using the string directly?
This is a fairly existential question. I would posit it this way: what is the string if not its location? How would you implement a system if you can't refer to objects somehow? By name, sure... But what is that name? How does the machine understand it?
The CPU has instructions that work with operands. "Add x to y and store the result here." Those operands can be registers (say, for a 32-bit integer, like that i in the proverbial for loop might be stored), but those are limited in number. I probably don't need to convince you that some form of memory is needed. I would then argue, how do you tell the CPU where to find those things in memory if not for pointers?
You: "Add x to y and store it in memory."
CPU: OK. Where?
You: Uh, I dunno, like, wherever ...
At the lowest levels, it doesn't work like this last line. You need to be a bit more specific for the CPU to work. :-)
So really, all the pointer does is say "the string at X", where X is an integer. Because in order to do something you need to know where you're working. In the same way that when you have an array of integers a, and you need to take a[i], i is meaningful to you somehow. How is that meaningful? Well it depends on your program. But you can't argue that this i shouldn't exist.
In reality in those other languages, you're working with pointers as well. You're just not aware of it. Some people would say that they prefer it that way. But ultimately, when you go down through all the layers of abstraction, you're going to need to tell the CPU what part of memory to look at. :-) So I would argue that pointers are necessary no matter what abstractions you end up building.

How does a NSMutableString exactly work?

I know all instances of NSString are immutable. If you assign a new value to a string, new memory is addressed and the old string will be lost.
But if you use NSMutableString, the string will always keep its same address in memory, no matter what you do.
I wonder how this exactly works. With methods like replaceCharactersInRange I can even add more characters to a string so I need more memory for my string. What happens to the objects in memory that follow the string? Are they all relocated and put somewhere else in memory? I don't think so. But what is really going on?
I know all instances of NSString are
immutable. If you assign a new value
to a string, new memory is addressed
and the old string will be lost.
That isn't how mutability works, nor how references to NSStrings work. Nor how pointers work.
A pointer to an object -- NSString *a; declares a variable a that is a pointer to an object -- merely holds the address in memory of the object. The actual object is [generally] an allocation on the heap of memory that contains the actual object itself.
In those terms, there is really no difference at runtime between:
NSString *a;
NSMutableString *b;
Both are references to -- addresses of -- some allocation in memory. The only difference is at compile time: b will be treated differently than a, and the compiler will not complain if, say, you call NSMutableString methods on b (but would if you called them on a).
As far as how NSMutableString works, it contains a buffer (or several buffers -- implementation detail) internally that contains the string data. When you call one of the methods that mutate the string's contents, the mutable string will re-allocate its internal storage as necessary to contain the new data.
Objects do not move in memory. Once allocated, an allocation will never move -- the address of the object or allocation will never change. The only semi-exception is when you use something like realloc(), which might return a different address. However, when that happens it is conceptually just a sequence of malloc(); memcpy(); free(); (a new block is allocated, the old contents are copied over, and the old block is released).
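To illustrate the split between a fixed object and its movable internal storage, here is a rough C sketch. MutableString and the mstr_ functions are invented for this example and say nothing about how NSMutableString is actually implemented:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical mutable string: the struct is "the object" and keeps its
   address; chars is internal storage that may move when it grows. */
typedef struct {
    char *chars;     /* NUL-terminated contents */
    size_t length;   /* bytes in use, excluding the NUL */
    size_t capacity; /* bytes allocated for chars */
} MutableString;

MutableString *mstr_create(void) {
    MutableString *s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->capacity = 8;
    s->length = 0;
    s->chars = malloc(s->capacity);
    if (s->chars == NULL) { free(s); return NULL; }
    s->chars[0] = '\0';
    return s;
}

/* Appends text, growing the internal buffer if needed.
   Returns 0 on success, -1 if memory could not be obtained. */
int mstr_append(MutableString *s, const char *text) {
    size_t extra = strlen(text);
    size_t needed = s->length + extra + 1;
    if (needed > s->capacity) {
        size_t new_capacity = s->capacity;
        while (new_capacity < needed) new_capacity *= 2;
        char *moved = realloc(s->chars, new_capacity); /* may return a new address */
        if (moved == NULL) return -1;
        s->chars = moved;
        s->capacity = new_capacity;
    }
    memcpy(s->chars + s->length, text, extra + 1); /* copies the NUL too */
    s->length += extra;
    return 0;
}

void mstr_destroy(MutableString *s) {
    if (s != NULL) { free(s->chars); free(s); }
}
```

Anything holding a MutableString * keeps a valid pointer across appends; only the private chars pointer inside may change, so nothing that follows the object in memory has to be relocated.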
I'd suggest you revisit the Objective-C Programming Guide or, possibly, a C programming manual.
NSMutableString works just like the C++ std::string does. I don't know exactly how they work, but there are two popular approaches:
Concatenating
You create a struct with two members: one char and one pointer.
When a new char (or more) is added, you create a new instance of the struct and store its address in the last struct instance of the string. This way you have a chain of structs, each with a pointer directing to the next struct.
Copy & add
The way most newbies will go. Not the worst, but maybe the slowest.
You save a "normal" immutable string. If a new char is added, you allocate an area in memory with the size of the old string + 1, copy the old string, and concatenate the new char. That's a very simple approach, isn't it?
A slightly more advanced version would be to create the new string with a size of +50 and just add the chars and a new null at the end; don't forget to overwrite the old null. This way it's more efficient for strings with a lot of changes.
As I said before, I don't know how std::string or NSMutableString approaches this issue, but these are the most common ways.

Does NSNumber add any extra bytes to the number it holds?

I'm working with Objective-C and I need to add ints from an NSArray to an NSMutableData (I'm preparing to send the data over a connection). If I wrap the ints with NSNumber and then add them to NSMutableData, how would I find out how many bytes are in the NSNumber int? Would it be possible to use sizeof() since, according to the Apple documentation, "NSNumber is a subclass of NSValue that offers a value as any C scalar (numeric) type."?
Example:
NSNumber *numero = [[NSNumber alloc] initWithInt:5];
NSMutableData *data = [[NSMutableData alloc] initWithCapacity:0];
[data appendBytes:numero length:sizeof(numero)];
numero is not a numeric value, it is a pointer to an object representing a numeric value. What you are trying to do won't work: the size will always be equal to that of a pointer (4 bytes on 32-bit platforms and 8 on 64-bit), and you will append some garbage pointer value to your data as opposed to the number.
Even if you were to try to dereference it, you cannot directly access the bytes backing an NSNumber and expect it to work. What is going on is an internal implementation detail, and may vary from release to release, or even between different configurations of the same release (32 bit vs 64 bit, iPhone vs Mac OS X, arm vs i386 vs PPC). Just packing up the bytes and sending them over the wire may result in something that does not deserialize properly on the other side, even if you managed to get to the actual data.
You really need to come up with an encoding of an integer you can put into your data and then pack and unpack the NSNumbers into that. Something like:
NSNumber *myNumber = ... //(get a value somehow)
int32_t myInteger = [myNumber intValue]; // get the 32-bit int value out of the number
uint32_t networkInteger = htonl((uint32_t)myInteger); // convert the integer to network byte order
[data appendBytes:&networkInteger length:sizeof(networkInteger)]; // stuff it into the data
On the receiving side you then grab out the integer and recreate an NSNumber with numberWithInteger: after using ntohl to convert it to native host format.
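If it helps to see what "network endian" means at the byte level, here is a plain C sketch of the round trip. It uses hand-written shifts rather than htonl/ntohl so it needs no platform header, but it produces the same big-endian byte order for a 32-bit value; the function names are made up:

```c
#include <stdint.h>

/* Writes value into out[0..3], most significant byte first (network order). */
void pack_be32(int32_t value, unsigned char out[4]) {
    uint32_t u = (uint32_t)value;
    out[0] = (unsigned char)(u >> 24);
    out[1] = (unsigned char)(u >> 16);
    out[2] = (unsigned char)(u >> 8);
    out[3] = (unsigned char)u;
}

/* Rebuilds the integer from four bytes in network order. The final cast
   assumes the usual two's complement int32_t. */
int32_t unpack_be32(const unsigned char in[4]) {
    uint32_t u = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
               | ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    return (int32_t)u;
}
```

The sender appends the four packed bytes to its data; the receiver unpacks them and wraps the result back into an NSNumber. Both sides get the same value regardless of their native byte order.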
It may require a bit more work if you are trying to send minimal representations, etc.
The other option is to use an NSCoder subclass and tell the NSNumber to encode itself using your coder, since that will be platform neutral, but it may be overkill for what you are trying to do.
First, NSNumber *numero is "A pointer to an NSNumber type", and the NSNumber type is an Objective-C object. In general, unless specifically stated somewhere in the documentation, the rule of thumb in object-oriented programming is that "The internal details of how an object chooses to represent its internal state are private to the object's implementation, and should be treated as a black box." Again, unless the documentation says you can do otherwise, you can't assume that NSNumber is using a C primitive type of int to store the int value you gave it.
The following is a rough approximation of what's going on 'behind the scenes' when you appendBytes:numero:
typedef struct {
Class isa;
double dbl;
long long ll;
} NSNumber;
NSNumber *numero = malloc(sizeof(NSNumber));
memset(numero, 0, sizeof(NSNumber));
numero->isa = objc_getClass("NSNumber");
void *bytes = malloc(1024);
memcpy(bytes, numero, sizeof(numero)); // sizeof(numero) == sizeof(void *)
This makes it a bit clearer that what you're appending to the NSMutableData object data is the first sizeof(numero) bytes (four on a 32-bit platform, eight on a 64-bit one) of whatever numero is pointing to (which, for an object in Obj-C, is always isa, the object's class). I suspect what you "wanted" to do was copy the pointer to the instantiated object (the value of numero), in which case you should have used &numero. This is a problem if you're using GC, as the buffer used by NSMutableData is not scanned (i.e., the GC system will no longer "see" the object and will reclaim it, which is pretty much a guarantee for a random crash at some later point).
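The sizeof confusion can be shown in isolation with a stand-in struct. FakeNumber is invented for the example; the real NSNumber layout is private:

```c
#include <stddef.h>

/* Stand-in for an object's layout; not the real NSNumber layout. */
typedef struct {
    void *isa;
    double dbl;
    long long ll;
} FakeNumber;

/* Size of the pointer variable itself: the same as sizeof(void *) on
   common platforms, no matter what it points to. */
size_t size_of_pointer(void) {
    FakeNumber *numero = NULL;
    return sizeof(numero);
}

/* Size of the thing pointed to. sizeof does not evaluate its operand,
   so using *numero here never dereferences the NULL pointer. */
size_t size_of_object(void) {
    FakeNumber *numero = NULL;
    return sizeof(*numero);
}
```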
It's hopefully obvious that even if you put the pointer to the instantiated NSNumber object in to data, that pointer only has meaning in the context of the process that created it. A pointer to that object is even less meaningful if you send that pointer to another computer- the receiving computer has no (practical, trivial) way to read the memory that the pointer points to in the sending computer.
Since you seem to be having problems with this part of the process, let me make a recommendation that will save you countless hours of debugging some extremely difficult implementation bugs you're bound to run into:
Abandon this entire idea of trying to send raw binary data between machines and just send simple ASCII/UTF-8 formatted information between them.
If you think that this is somehow going to be slow or inefficient, then let me recommend that you bring everything up using a simplified ASCII/UTF-8 stringified version first. Trust me, debugging raw binary data is no fun, and the ability to just NSLog(@"I got: %@", dataString) is worth its weight in gold when you're debugging your inevitable problems. Then, once everything has gelled, and you're confident that you don't need to make any more changes to what it is you need to exchange, "port" (for lack of a better word) that implementation to a binary-only version if, and only if, profiling with Shark.app identifies it as a problem area. As a point of reference, these days I can scp a file between machines and saturate a gigabit link with the transfer. scp probably has to do about five thousand times as much processing per byte to compress and encrypt the data as this simple stringification, all while transferring 80MB/sec. Yet on modern hardware this is barely enough to budge the CPU meter running in my menu bar.
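As a rough idea of how little code the text approach needs, here is a C sketch that sends each integer as a line of decimal digits. The one-number-per-line format and the function names are assumptions made for this example, not something the question specified:

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Formats value as decimal text plus a newline, e.g. 5 becomes "5\n".
   Returns the number of characters written, or -1 if buf is too small. */
int encode_int_line(int value, char *buf, size_t buf_size) {
    int n = snprintf(buf, buf_size, "%d\n", value);
    return (n < 0 || (size_t)n >= buf_size) ? -1 : n;
}

/* Parses one decimal line back into an int.
   Returns 0 on success, -1 if the text is not a valid int. */
int decode_int_line(const char *line, int *out) {
    char *end;
    errno = 0;
    long v = strtol(line, &end, 10);
    if (end == line || errno == ERANGE || v < INT_MIN || v > INT_MAX) return -1;
    if (*end != '\n' && *end != '\0') return -1; /* trailing junk */
    *out = (int)v;
    return 0;
}
```

The bytes on the wire are readable as-is in a log or a packet dump, and byte order never comes into it.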