What does class_addIvar's alignment do in Objective-C?

Someone has asked the same question before: Objective-C Runtime: What to put for size & alignment for class_addIvar?
But it wasn't fully resolved.
The function's declaration is as follows:
BOOL class_addIvar(Class cls, const char *name, size_t size, uint8_t alignment, const char *types)
Which is used to add an instance variable to a dynamically created class in Objective-C.
The fourth argument, uint8_t alignment, is described in Apple's documentation:
The instance variable's minimum alignment in bytes is 1<<align. The minimum alignment of an instance variable depends on the ivar's type and the machine architecture. For variables of any pointer type, pass log2(sizeof(pointer_type)).
Some tutorials just claim that if the ivar is a pointer type, I should use log2(sizeof(pointer_type)), and if the ivar is a value type, I should use sizeof(value_type). But why? Can someone explain this in detail?

If you really want to learn where these values come from, you'll need to look at architecture-specific ABI references; for OS X and iOS, they can be found here: OS X, iOS.
Each of those documents should have a section titled 'Data Types and Data Alignment', which helps to explain those values for the specific architecture.
In practice, since C11, you can use the _Alignof operator to have the compiler give you the correct value for a specific type (it already needs to know this in order to generate proper machine code), so you can write a class_addIvar call that looks something like this:
class_addIvar(myClass, "someIvar", sizeof(int), log2(_Alignof(int)), @encode(int))
Which should take care of all those gory details of the underlying type for you.
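For context, here is a minimal, hedged sketch of how that call fits into creating a class at runtime (the class and ivar names are invented for illustration):
#import <Foundation/Foundation.h>
#import <objc/runtime.h>
#include <math.h>

void makeDynamicClass(void) {
    // Ivars must be added after allocating the class pair but before registering it.
    Class myClass = objc_allocateClassPair([NSObject class], "MyDynamicClass", 0);
    class_addIvar(myClass, "someIvar",
                  sizeof(int),
                  (uint8_t)log2(_Alignof(int)), // 2 for a 4-byte int, since 1 << 2 == 4
                  @encode(int));
    objc_registerClassPair(myClass);
}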

Related

Isn't pointer type checking disabled in DLL/C-Connect, and is that OK?

After this somehow related question Why can't I pass an UninterpretedBytes to a void* thru DLL/C-Connect? where we saw that I could not pass a Smalltalk array of bits to a void * parameter, I further analyzed the method responsible for checking the compatibility of formal pointer description with effective object passed as argument, and I think that I discovered another questionable piece:
CPointerType>>coerceForArgument: anObject
    ...snip...
    (anObject isKindOf: self defaultDatumClass)
        ifTrue: [
            (referentType = anObject type referentType
                or: [(referentType isVoid
                        and: [anObject type referentType isConstant not])
                or: [anObject type isArray not
                        or: [anObject type baseArrayType = referentType]]])
                ifTrue: [^anObject asPointer]].
    ...snip...
It means the following:
It first checks whether the argument is a CDatum (a proxy to some C-formatted raw data and its associated CType).
If so, it checks whether the type is the same as the formal definition in the external method prototype (self).
If not, it could be that the argument is void *, in which case any kind of pointer is accepted (the code that I snipped already checked that it is a pointer), except if it is a pointer to a const thing.
There is a first discrepancy: it should check whether the formal definition is const void * and accept any pointer to const in that case... But that does not matter much; we rarely have actual arguments declared const.
If not, it checks that the argument is either not an array (for example, int foo[2]) or an array whose type matches (same base type and dimension).
So, if the formal definition is, for example, struct {int a; char *b;} *foo, and I pass a double *bar, the type does not match, there is no const qualifier mismatch, and the parameter is not an array. Conclusion: we can safely pass it without any further checking!
That's a kind of pointer aliasing. We do not have an optimizing compiler making any speculation about the absence of such aliasing in Smalltalk, so that won't be the source of undefined behaviour. It could be that we deliberately want to force this sort of dirty reinterpret_cast for obscure reasons (though since we can explicitly cast a CDatum, I would prefer the explicit way).
BUT, it might be that we completely messed up and passed the wrong object, with the wrong type and the wrong dimension, and the address foo->b in my example above will either contain some re-interpreted garbage if the pointer is 32-bit aligned, or be completely undefined on a 64-bit machine (because it lies beyond the sizeof a double).
A C compiler would warn me for sure about the aliasing, and -Wall -Werror would prevent production of the artifact.
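For comparison, a hedged C sketch of the equivalent call (the names are invented); clang and gcc both diagnose the mismatched pointer type:
struct Foo { int a; char *b; };
void takesFoo(struct Foo *foo);

void caller(void) {
    double *bar = 0;
    takesFoo(bar); /* warning: incompatible pointer types passing 'double *'
                      to parameter of type 'struct Foo *'; fatal with -Werror */
}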
What troubles me here is that I do not even get a warning...
Does it sound correct?
Short answer: it's not OK to correct this behavior, because some low-level user interface stuff depends on it (the event loop). We can't even introduce a warning or anything.
Longer story: I tried to rewrite the whole method with double dispatching (asking anObject whether it is compatible with the formal CPointerType, rather than testing every possible object class with repeated isKindOf:).
But when omitting the inelegant pointer-aliasing tolerance, it invariably screwed up my MacOSX 8.3 image with tons of blank windows opening and a blocked, uninterruptible UI...
After instrumenting, it appears that the event loop relies on it, and passes aString asNSString (which is transformed into UTF-16, but stored in a ByteArray and thus declared unsigned char *) to an Objective-C method expecting an unsigned short *.
It's a case where the pointer aliasing is benign, as long as we pass the good bytes.
If I try and fix asNSString with a proper cast to unsigned short *, then the UI blocks (I don't know why; that would require debugging at the VM level).
Conclusion: it's true that some distinctions such as (unsigned char *) vs (char *) can be germane and had better not be completely prohibited (whether char is signed or not is platform-dependent, and not all libraries have cleanly defined APIs). The same goes for the platform-dependent wide character: we have conversion methods producing the good bytes, but not the good types. We could eventually make an exception for char * like we did for void * (before void * was introduced, char * was the way to do it anyway)... Right now, I have no good solution for this because of the event loop.

Why does the C compiler not warn about malloc size errors?

I made a member struct that I assigned in the viewDidLoad of my iOS app. I used malloc to allocate space for this struct, which was then used throughout my class. Like this:
self.myData = malloc(sizeof(MyData));
Except what I really did was this:
self.myData = malloc(sizeof(MyOtherStruct));
I accidentally set sizeof() in the malloc call to be a different struct (that isn't the same size). I didn't notice this mistake for a very long time because the app only rarely crashed. An update to the OS caused the crash to happen more frequently.
My question is, why can compilers not warn about this sort of thing? Is it something compilers don't know about, or is it a design choice to allow users to malloc whatever size they please?
"How can I find this error faster?"
There are a bunch of ways to find the error faster.
Solution #1
The static analyzer catches this error. Press Command-Shift-B in Xcode. For example, take the following code:
#include <stdlib.h>

struct x { double x; };
struct y { char y; };

int main(int argc, char **argv) {
    struct x *p = malloc(sizeof(struct y));
    p->x = 1.0;
    return 0;
}
Running the analyzer produces this error for me:
Result of 'malloc' is converted to a pointer of type 'struct x' which is incompatible with sizeof operand type 'struct y'
Solution #2
It is recommended to write the code this way instead:
self.myData = malloc(sizeof(*self.myData));
Just do it this way in the future. This is not only less error-prone, but also easier to remember.
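The same idiom carries over to a plain C pointer; a minimal sketch, assuming MyData is the struct from the question:
MyData *myData = malloc(sizeof(*myData)); // the size automatically tracks the pointer's declared type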
Solution #3
Use a language like Swift or C++ where the language's type system helps you avoid this kind of error. C is less forgiving in many ways. It was invented in the early 1970s; you just kind of have to accept that if you want to use it, and these kinds of errors are a major part of the reason why C++ and Swift even exist in the first place.
Solution #4
Use a run-time memory bounds checker, like the address sanitizer. This will detect the error when memory is accessed, not when it is allocated, but it will still give you stack traces for both access and allocation (and free, if the memory has been freed). Anyone writing C these days should familiarize themselves with the address sanitizer and its friends, tsan, ubsan, etc.
Valgrind also achieves the same effect but the address sanitizer has a better user experience for common use cases.
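If you build from the command line, enabling the address sanitizer is typically a single compiler flag; a sketch using clang (gcc is analogous):
clang -g -fsanitize=address -fno-omit-frame-pointer myprogram.c -o myprogram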
Question as asked
The compiler only really gives you errors and warnings for type errors. This isn't a type error; it's a runtime error. There are a few "likely" runtime errors that the compiler can detect, but they are very few in number. Things like forgetting to use the return value of malloc()... e.g.,
void f(void) {
    malloc(1); // warning
}
The compiler isn't much better than that.
Again, this is the impetus for newer languages like C++ and Swift, which have type systems which allow you to generate errors when you allocate things incorrectly, and this is also the impetus for static analysis (which is a tough problem).
That happens because it is not ARC's responsibility to deal with malloc() and free().
ARC only handles objects allocated with something like [Object alloc].
In your case, when you do self.myData = malloc(sizeof(MyOtherStruct));, that can be interpreted, for example, as something like this:
self.myData = malloc(N*sizeof(MyData));
// which can represent self.myData[0]..self.myData[N-1]
Finally, remember that when you use sizeof(), it tells you the size of the type you pass as a parameter, calculated at compile time.
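A hedged illustration of that compile-time behavior:
size_t n = sizeof(MyData);        // a compile-time constant: the size of the type
size_t m = sizeof(*self.myData);  // same value, derived from the pointer's declared type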
You can check this link for more information about object allocation
And also check the Apple documentation about memory allocation.

Can you create an NSValue from a C struct with bitfields?

I'm trying to do the following, but NSValue's creation method returns nil.
Are C bitfields in structs not supported?
struct MyThingType {
    BOOL isActive:1;
    uint count:7;
} myThing = {
    .isActive = YES,
    .count = 3,
};
NSValue *value = [NSValue valueWithBytes:&myThing objCType:@encode(struct MyThingType)];
// value is nil here
First and foremost, claptrap makes a very good point in his comment: why bother using bit-field specifiers (which are mainly used either for micro-optimization or to manually add padding bits where you need them), only to then wrap it all up in an instance of NSValue?
It's like buying a castle, but then living in the kitchen so as not to wear out the carpets...
I don't think it is; a quick canter through the Apple dev docs came up with this: there are indeed several issues to take into account when it comes to bit fields.
I've also just found this, which explains why bit-fields + NSValue don't really play well together.
Especially in cases where the sizeof a struct can lead to NSValue reading the data in a... shall we say erratic manner:
The struct you've created is padded to 8 bits. Now these bits could be read as 2 ints, or 1 long, or something else... From what I've read on the linked page, it's not unlikely that this is what is happening.
So, basically, NSValue is incapable of determining the actual types when you're using bit fields. In case of ambiguity, an int (width 4 in most cases) is assumed, under/overflow occurs, and you have a mess on your hands.
Since the compiler still has some liberty as to where each member is actually stored, it doesn't quite suffice to pass the stringified typedef sort of thing (objCType: @encode(struct YourStruct)), because there is a good chance that you won't be able to make sense of the actual struct itself, owing to compiler optimizations and such...
I'd suggest you simply drop the bit field specifiers, because structs should be supported... at least, last time I tried, a struct with simple primitive types worked just fine.
You can solve this with a union. Simply put the structure into a union that has another member with a type supported by NSValue and whose size is larger than your structure. In your case the obvious choice is long.
union _bitfield_word_union
{
    yourstructuretype bitfield;
    long plain;
};
You can make it more robust against resizing of the structure by using an array whose size is calculated at compile time. (Remember that sizeof() is a compile-time operator, too.)
char plain[sizeof(yourstructuretype)/sizeof(char)];
Then you can store the structure with the bitfield into the union and read the plain member out.
union _bitfield_word_union converter = { .bitfield = yourstructuretypevalue };
long plain = converter.plain;
Use this value for the NSValue instance creation. Reading out, you have to go the inverse way.
I'm pretty sure that through a technical corrigendum of C99 this became standard conforming (so-called type punning), because you can expect that reading out a member's value (bitfield) through another member's value (plain) and storing it back is defined, if the member being read is at least as big as the member being written. (There might be undefined bits 9-31/63 in plain, but you do not have to care about them.) In any case, it is real-world conforming.
Dirty hack? Maybe. One might call it C99. However, using bit fields in combination with NSValue sounds like using dirty hacks anyway.
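Putting the pieces together, a hedged sketch using the struct from the question (the union name is invented, and whether BOOL and other narrow bit-field types are accepted is compiler-dependent):
struct MyThingType {
    BOOL isActive:1;
    unsigned int count:7;
};

union bitfield_word_union {
    struct MyThingType bitfield;
    long plain;
};

union bitfield_word_union converter = { .bitfield = { .isActive = YES, .count = 3 } };
NSValue *value = [NSValue valueWithBytes:&converter.plain objCType:@encode(long)];

// Reading out goes the inverse way:
union bitfield_word_union reader = { .plain = 0 };
[value getValue:&reader.plain];
// reader.bitfield.count is 3 again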

Pointer to specified number of values

How can I specify that a method should take as parameter a pointer to a location in memory that can hold a specified number of values? For example, if I have:
- (void)doSomethingWith:(int *)values;
I'd like to make it clear that the int * passed in should point to an allocated space in memory that's able to hold 10 such values.
To directly answer your question, use an array argument with a bounds, e.g.:
- (void)takeTenInts:(int[10])array
Which specifies that the method takes an array of 10 integers.
The only problem is that the C family of languages does not do bounds checking, so the following is valid:
int a[10], b[5];
[self takeTenInts:a]; // ok
[self takeTenInts:b]; // oops, also ok according to the compiler
So while you are specifying the size, as you wish to do, that specification is not being enforced.
If you wish to enforce the size you can use a struct:
typedef struct
{
    int items[10];
} TenInts;
- (void)takeTenInts:(TenInts)wrappedArray
Now this doesn't actually enforce the size at all[*], but it's as close as you can get with the C family (to which the word "enforcement" is anathema).
If you just wish to know the size, either pass it as an additional argument or use NSArray.
[*] It is not uncommon to see structures in C following the pattern:
typedef struct
{
    // some fields
    int data[0];
} someStruct;
Such structures are dynamically allocated based on their size (sizeof(someStruct)) plus enough additional space to store sufficient integers (e.g. n * sizeof(int)).
In other words, specifying an array as the last field of a structure does not enforce in any way that there is space for exactly that number of integers; there may be space for more, or fewer... See the sketch below.
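A hedged sketch of that allocation pattern, where n is the desired element count:
someStruct *s = malloc(sizeof(someStruct) + n * sizeof(int)); // room for n ints in s->data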
Why use "(int *)" when you have the power (and "count") of "NSArray" to work with?
But anyway, looking at this potentially related question, couldn't you just do a sizeof(values) to get the size of a statically/globally allocated pointer?
If that doesn't work (which would be the case for a dynamically allocated array), you really would probably need some kind of "count:" parameter in your "doSomethingWith:" method declaration.
There are several ways. You could just name the method appropriately:
- (void)doSomethingWithTenInts:(int *)tenInts;
Or you could use a struct:
typedef struct {
    int values[10];
} TenInts;
- (void)doSomethingWithTenInts:(TenInts *)tenInts;
Or you could make the user tell you how many ints he is giving you:
- (void)doSomethingWithInts:(int *)ints count:(int)count;
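For the last variant, a call site might look like this (a sketch; "obj" is a placeholder receiver):
int values[10] = {0};
[obj doSomethingWithInts:values count:10];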

Understanding pointers?

As the title suggests, I'm having trouble understanding exactly what a pointer is and why they're used. I've searched around a bit but still don't really understand. I'm working in Objective-C mainly, but from what I've read this is really more of a C topic (so I added both tags).
From what I understand, a variable with an asterisk in front points to an address in memory? I don't quite understand why you'd use a pointer to a value instead of just using the value itself.
For example:
NSString *stringVar = @"This is a test.";
When calling methods on this string, why is it a pointer instead of just using the string directly? Why wouldn't you use pointers to integers and other basic data types?
Somewhat off topic, but did I tag this correctly? As I was writing it I thought that it was more of a programming concept rather than something language specific but it does focus specifically on Objective-C so I tagged it with objective-c and c.
I don't quite understand why you'd use a pointer to a value instead of
just using the value itself.
You use a pointer when you want to refer to a specific instance of a value instead of a copy of that value. Say you want me to double some value. You've got two options:
You can tell me what the value is, "5": "Please double 5 for me." That's called passing by value. I can tell you that the answer is 10, but if you had 5 written down somewhere, that 5 will still be there. Anyone else who refers to that paper will still see the 5.
You can tell me where the value is: "Please erase the number I've written down here and write twice that number in its place." That's called passing by reference. When I'm done, the original 5 is gone and there's a 10 in its place. Anyone else who refers to that paper will now see 10.
Pointers are used to refer to some piece of memory rather than copying some piece of memory. When you pass by reference, you pass a pointer to the memory that you're talking about.
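A minimal C sketch of the second option (the function name is invented):
#include <stdio.h>

void doubleInPlace(int *n) {
    *n *= 2; // modify the caller's value through its address
}

int main(void) {
    int x = 5;
    doubleInPlace(&x); // pass by reference: x is now 10
    printf("%d\n", x);
    return 0;
}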
When calling methods on this string, why is it a pointer instead of just using the string directly?
In Objective-C, we always use pointers to refer to objects. The technical reason for that is that objects are usually allocated dynamically in the heap, so in order to deal with one you need its address. A more practical way to think about it is that an object, by definition, is a particular instance of some class. If you pass an object to some method, and that method modifies the object, then you'd expect the object you passed in to be changed afterward, and to do that we need to pass the object by reference rather than by value.
Why wouldn't you use pointers to integers and other basic data types?
Sometimes we do use pointers to integers and other basic data types. In general, though, we pass those types by value because it's faster. If I want to convey some small piece of data to you, it's faster for me to just give you the data directly than it is to tell you where you can find the information. If the data is large, though, the opposite is true: it's much faster for me to tell you that there's a dictionary in the living room than it is for me to recite the contents of the dictionary.
I think maybe you have got a bit confused between the declaration of a pointer variable, and the use of a pointer.
A variable whose type has an asterisk after it holds the address of a value of that data type.
So, in C, you could write:
char c;
and that means the value of c is a single character. But
char *p;
is the address of a char.
The '*' after the type name means the value of the variable is the address of a thing of that type.
Let's put a value into c:
c = 'H';
So
char *p;
means the value of p is the address of a character. p doesn't contain a character, it contains the address of a character.
The C operator & yields the address of a value, so
p = &c;
means put the address of the variable c into p. We say 'p points at c'.
Now here is the slightly odd part. The address of the first character in a string is also the address of the start of the string.
So for
char *p = "Hello World. I hope you are all who safe, sound, and healthy";
p contains the address of the 'H', and implicitly, because the characters are contiguous, p contains the address of the start of the string.
To get at the character at the start of the string, the 'H', use the 'get at the thing pointed to' operator, which is '*'.
So *p is 'H'
p = &c;
if (*p == c) { ... is true ... }
When a function or method is called, to use the string of characters only the start address of the string (typically 4 or 8 bytes) need be handed to the function, not the entire string. This is both efficient and means the function can act upon the string and change it, which may be useful. It also means that the string can be shared.
A pointer is a special variable that holds the memory location of another variable.
So what is a pointer… look at the definition mentioned above. Let's take it one step at a time, in the three-step process below:
A pointer is a special variable that holds the memory location of
another variable.
So a pointer is nothing but a variable… it's a special variable. Why is it special? Because… read point 2.
A pointer is a special variable that holds the memory location of
another variable.
It holds the memory location of another variable. By memory location I mean that it does not contain the value of another variable; it stores the memory address number (so to speak) of another variable. What is this other variable? Read point 3.
A pointer is a special variable that holds the memory location of
another variable.
Another variable could be anything… it could be a float, int, char, double, etc. As long as it's a variable, the memory location at which it is created can be assigned to a pointer variable.
To answer each of your questions:
(1) From what I understand, a variable with an asterisk in front points
to an address in memory?
You can see it that way more or less. The asterisk is a dereference operator, which takes a pointer and returns the value at the address contained in the pointer.
(2) I don't quite understand why you'd use a pointer to a value instead of
just using the value itself.
Because pointers allow different sections of code to share information better than copying values here and there, and they also allow pointed-to variables or objects to be modified by the called function. Further, pointers enable complex linked data structures. Read the short tutorial Pointers and Memory.
(3) Why wouldn't you use pointers to integers and other basic data types?
A string, unlike an int or char, is handled through a pointer: the pointer holds the starting address of the data that contains the string, and the string runs from that starting address until a terminating byte.
A string is a more complex data type than a char or an int. In fact, don't think of string as a type like int or char; a string is a pointer that points to a chunk of memory. Because of that complexity, having a class like NSString to provide useful functions for working with strings becomes very meaningful. See NSString.
When you use NSString, you do not create a string; you create an object that contains a pointer to the starting address of the string, and in addition, a collection of methods that allows you to manipulate the output of the data.
I have heard the analogy that an object is like a balloon, and the string you're holding it with is the pointer. Typically, code is executed like so:
MyClass *someObj = [[MyClass alloc] init];
The alloc call will allocate the memory for the object, and the init will instantiate it with a defined set of default properties depending on the class. You can override init.
Pointers allow references to a single object in memory to be passed around to multiple places. If we worked with values rather than pointers, we wouldn't be able to reference the same object in memory from two different places.
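A hedged sketch of that sharing:
NSMutableString *a = [NSMutableString stringWithString:@"Hi"];
NSMutableString *b = a;   // b holds the same address; no copy is made
[b appendString:@"!"];    // the change is visible through a as well: @"Hi!"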
NSString *stringVar = @"This is a test.";
When calling methods on this string, why is it a pointer instead of just using the string directly?
This is a fairly existential question. I would posit it this way: what is the string if not its location? How would you implement a system if you can't refer to objects somehow? By name, sure... But what is that name? How does the machine understand it?
The CPU has instructions that work with operands. "Add x to y and store the result here." Those operands can be registers (say, for a 32-bit integer, like that i in the proverbial for loop might be stored), but those are limited in number. I probably don't need to convince you that some form of memory is needed. I would then argue, how do you tell the CPU where to find those things in memory if not for pointers?
You: "Add x to y and store it in memory."
CPU: OK. Where?
You: Uh, I dunno, like, where ever ...
At the lowest levels, it doesn't work like this last line. You need to be a bit more specific for the CPU to work. :-)
So really, all the pointer does is say "the string at X", where X is an integer. Because in order to do something you need to know where you're working. In the same way that when you have an array of integers a, and you need to take a[i], i is meaningful to you somehow. How is that meaningful? Well it depends on your program. But you can't argue that this i shouldn't exist.
In reality in those other languages, you're working with pointers as well. You're just not aware of it. Some people would say that they prefer it that way. But ultimately, when you go down through all the layers of abstraction, you're going to need to tell the CPU what part of memory to look at. :-) So I would argue that pointers are necessary no matter what abstractions you end up building.