Why are C variable sizes implementation specific? - cross-platform

Wouldn't it make more sense (for example) for an int to always be 4 bytes?
How do I ensure my C programs are cross platform if variable sizes are implementation specific?

The types' sizes aren't defined by c because c code needs to be able to compile on embedded systems as well as your average x86 processor and future processors.
You can include stdint.h and then use types like:
int32_t (32 bit integer type)
and
uint32_t (32 bit unsigned integer type)

C is often used to write low-level code that's specific to the CPU architecture it runs on. The size of an int, or of a pointer type, is supposed to map to the native types supported by the processor. On a 32-bit CPU, 32-bit ints make sense, but they won't fit on the smaller CPUs which were common in the early 1970s, or on the microcomputers which followed a decade later. Nowadays, if your processor has native support for 64-bit integer types, why would you want to be limited to 32-bit ints?
Of course, this makes writing truly portable programs more challenging, as you've suggested. The best way to ensure that you don't build accidental dependencies on a particular architecture's types into your programs is to port them to a variety of architectures early on, and to build and test them on all those architectures often. Over time, you'll become familiar with the portability issues, and you'll tend to write your code more carefully.
Becoming aware that not all processors have integer types the same width as their pointer types, or that not all computers use twos-complement arithmetic for signed integers, will help you to recognize these assumptions in your own code.

You need to check the size of int in your implementation. Don't assume it is always 4 bytes. Use
sizeof(int)

Related

Skills needed in 8-bit, 16-bit, 32-bit

Are there any specific skillsets required with 8-bit, 16-bit and 32-bit processing for embedded developers?
Yes, there are specific skills expected and differences between 8bit and 32bit processors. (Ignoring 16 bit, since there's so few of them available)
8 bit processors and tools are vastly different than the 32bit variants (even excluding Linux based systems).
Processor architecture
Memory availability
Peripheral complexity
An 8051 is a strange beast and plopping your average CS in front of one and asking them to make a product is asking for something that only mostly works. It's multiple memory spaces, lack of stack, constrained register file, and constrained memory really make "modern" computer science difficult.
Even an AVR, which is less of a strange beast, still has constraints that a 32 bit processor just doesn't have, particularly memory
And all of these are very different than writing code on an embedded linux platform.
In general processors and microcontrollers using 32 bit architecture tend to be more complex and used in more complex applications. As such, someone with only 8 bit device experience may not process the skills or experience necessary for more complex projects.
So it is not specifically the bit-width that is the issue, but it is used simply as a shorthand or proxy for complexity of systems. It is a very crude measure in any event since architectures differ widely even withing the bit-width classification; AVR, PIC and x51 for example are very different, as are 68K, ARM and x86. Even within the ARM family a Cortex-M device is very different from an A-class device.
Beware of any job spec that uses such broad skill classifications - something for you to challenge perhaps in the interview.

Should NSInteger really be used everywhere?

Now there is an iPhone coming with 64-bit architecture. long becomes 64 bits (while int remains 32-bit) , and everywhere NSInteger has been used is now a long and so 64-bits not 32. And twitter has quite a few people saying "I'm glad I've used NSInteger everywhere not int".
If you need to store a value that doesn't exceed 32 bits (for example in a loop which only loops 25 times), why should a long be used, because the 32 (at least) upper bits are going to be empty.
If the program has worked on 32-bit integers, then what benefit does using 64-bits for integers provide, when it uses up more memory?
Also there will be situations where using a 64-bit integer gives a different result to using a 32-bit integer. So if you use NSInteger then something may work on an iPhone 5S but not on an older device, whereas if int or long is explicitly used then the result will be the same on any device.
If you need to store a value that doesn't exceed 32 bits... why should a long be used?
If you can really make that guarantee, then there is absolutely no reason to value a 64 bit type over a 32 bit one. For simple operations like bounded loops, counters, and general arithmetic, 32-bit integers suffice. But for more complex operations, especially those required of high-performance applications - such as those that perform audio or image processing - the increase in the amount of data the processor can handle in 64 bit modes is significant.
If the program has worked on 32-bit integers, then what benefit does using 64-bits for integers provide, when it uses up more memory?
You make using more memory seem like a bad thing. By doubling the size of some data types, they can be addressed to more places in memory, and the more memory that can be addressed, the less time the OS spends loading code. In addition, having twice the amount of lanes for data in a processor bus equates to an order of magnitude more values that can be processed in a single go, and the increase in register size means an order of magnitude more data can be kept around in one register. This equates to, in simplest terms, a nearly automatic doubling of the speed of most applications.
Also there will be situations where using a 64-bit integer gives a different result to using a 32-bit integer? ...
Yes, but not in the way you'd think. 32-bit data types and operations as well as 64-bit operations (most simulated in software, or by special hardware or opcodes in 32-bit hosts) are "relatively stable" in terms of their sizes. You cannot make nearly as many guarantees on a 64-bit architecture because different compilers implement different versions of 64-bit data types (see LP64, SILP64, and LLP64). Practically, this means casting a 64 bit type to a 32-bit type - say a pointer to an int - is guaranteed to lead to information loss, but casting between two data types that are guaranteed to be 64 bits - a pointer and a long on LP64 - is acceptable. ARM is usually compiled using LP64 (all ints are 32-bit, all longs are 64-bit). Again, most developers should not be affected by the switch, but when you start dealing with arbitrarily large numbers that you try to store in integers, then precision becomes an issue.
For that reason, I'd recommend using NSUInteger and NSInteger in public interfaces, and APIs where there is no inherent bounds checking or overflow guards. For example, a TableView requests an NSUInteger amount of data not because it's worried about 32 and 64 bit data structures, but because it can make no guarantees about the architecture upon which it's compiled. Apple's attempt to make architecture-independent data types is actually a bit of a luxury, considering how little work you have to do to get your code to compile and "just work" in both architectures.
The internal storage for NSInteger can be one of many different backing types, which is why you can use it everywhere and do not need to worry about it, which is the whole point of it.
Apple takes care for backward compatibility if your app is running on a 32 or 64 bit engine and will convert your variable behind the scenes to a proper data type using the __LP64__ macro.
#if __LP64__
typedef long NSInteger;
typedef unsigned long NSUInteger;
#else
typedef int NSInteger;
typedef unsigned int NSUInteger;
#endif

Objective C - Why 64bit means different variable types

I have heard that people use types such as NSInteger or CGFloat rather than int or float is because of something to do with 64bit systems. I still don't understand why that is necessary, even though I do it throughout my own code. Basically, why would I need a larger integer for a 64bit system?
People also say that it is not necessary in iOS at the moment, although it may be necessary in the future with 64bit iphones and such.
It is all explained here:
Introduction to 64-Bit Transition Guide
In the section Major 64-Bit Changes:
Data Type Size and Alignment
OS X uses two data models: ILP32 (in
which integers, long integers, and pointers are 32-bit quantities) and
LP64 (in which integers are 32-bit quantities, and long integers and
pointers are 64-bit quantities). Other types are equivalent to their
32-bit counterparts (except for size_t and a few others that are
defined based on the size of long integers or pointers).
While almost all UNIX and Linux implementations use LP64, other
operating systems use various data models. Windows, for example, uses
LLP64, in which long long variables and pointers are 64-bit
quantities, while long integers are 32-bit quantities. Cray, by
contrast, uses ILP64, in which int variables are also 64-bit
quantities.
In OS X, the default alignment used for data structure layout is
natural alignment (with a few exceptions noted below). Natural
alignment means that data elements within a structure are aligned at
intervals corresponding to the width of the underlying data type. For
example, an int variable, which is 4 bytes wide, would be aligned on a
4-byte boundary.
There is a lot more that you can read in this document. It is very well written. I strongly recommend it to you.
Essentially it boils down to this: If you use CGFloat/NSInteger/etc, Apple can make backwards-incompatible changes and you can mostly update your app by just recompiling your code. You really don't want to be going through your app, checking every use of int and double.
What backwards-incompatible changes? Plenty!
M68K to PowerPC
64-bit PowerPC
PowerPC to x86
x86-64 (I'm not sure if this came before or after iOS.)
iOS
CGFloat means "the floating-point type that CoreGraphics" uses: double on OS X and float on iOS. If you use CGFloat, your code will work on both platforms without unnecessarily losing performance (on iOS) or precision (on OS X).
NSInteger and NSUInteger are less clear-cut, but they're used approximately where you might use ssize_t or size_t in standard C. int or unsigned int simply isn't big enough on 64-bit OS X, where you might have a list with more than ~2 billion items. (The unsignedness doesn't increase it to 4 billion due to the way NSNotFound is defined.)

Should I care about bit-endianness when making cross-platform data storing code?

When I save some binary data on disk (or memory), I should care about byte-endianness if the data must be portable across multiple platforms. But how about bit-endianness? Is it fine to ignore?
I think this is fine to ignore, but there can be some pitfalls, so I like to hear other opinions.
Bits are always arranged the same way in a byte, though there are (were) some exotic architectures where a byte was not 8 bit. In modern computing you can safely assume an 8-bit byte.
What varies (as you properly noted) is a byte arrangement - that one you should take care of, indeed.
Big endian vs. little endian mattered more when PowerPC was prevalent (due to its use by Macs). Now that major OS (Windows, OS X, iOS, Android) and hardware platforms (x86, x86-64, ARM) are all little-endian, it is not much of a concern.

iPhone/iPad double precision math

The accepted answer to this Community Wiki question: What are best practices that you use when writing Objective-C and Cocoa? says that iPhones can't do double precision math (or rather, they can, but only emulated in software.) Unfortunately the 2009 link it provides as a reference: Double vs float on the iPhone directly contradicts that statement. Both of these are old, being written when the 3GS was just coming out.
So what's the story with the latest arm7 architecture? Do I need to worry about the 'thumb' compiler flag the second link references? Can I safely use double variables? I am having nasty flashbacks to dark days of 386SXs and DXs and 'math coprocessors.' Tell me it's 2012 and we've moved on.
In all of the iPhones, there is no reason double precision shouldn't be supported. They all use Cortex-A8 architecture with the cp15 coprocessor (which supports IEEE float and double calculations in hardware).
http://www.arm.com/products/processors/technologies/vector-floating-point.php
So yes, you can safely use doubles and it shouldn't be software emulated on the iPhones.
Although this is done in hardware, it may still take more cycles to perform double mathemtic arithmatic vs single (float). In addition to using double, I would check to make sure the precision is appropriate for your application.
As a side note, if the processor supports NEON instruction set, doubles and floats may be calculated faster than using the coprocessor.
http://pandorawiki.org/Floating_Point_Optimization#Compiler_Support
EDIT: Though VFP and Neon are optional extensions to the ARM Core, most of the cortex-A8 have them and all of them ones used in the iPhone and iPad do as well.