Hi, I am new to embedded systems. I do not know the real reason we classify microprocessors as 8-bit, 16-bit, or 32-bit.
In a document I read, it was explained that it is because of the number of bits used to number the address of a register. But I think that is not true, because if we needed 32 bits to number the register addresses of a processor, we would have to have more than 2^32 registers. That seems like nonsense; it is far too many registers. So I think maybe it depends on the size of the registers, or maybe the size of the bus, or the number of bits the microprocessor can work with at a time.
Please help me clarify this issue.
It is clear that you have either misunderstood your reference, or it is poorly worded. It should presumably state:
... number of the bits used for an address register
This means that the address range of the processor is then 2^n, so perhaps your reference is referring to memory locations rather than registers.
i.e. it refers to the bit-width of a register, not the enumeration of a register.
However, I would suggest that data path width is the more common and useful measure of processor architecture by "bit width". For example, 8-bit processors commonly have 16-bit address buses and 16-bit address registers. And the 16-bit 8086 uses two 16-bit registers (32 bits) to represent a 20-bit address, but it is neither a 20-bit nor a 32-bit processor. 32- and 64-bit processors tend to have equal address and data register widths, which may be the source of this erroneous statement.
As described here, the natural size of an integer (i.e. the integer size that single machine instructions take as operands) is the usual method of classification in this context.
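To make that concrete, here is a rough C probe; a sketch, not a definitive test, since it reports the compiler's notion of integer and pointer width, which usually tracks the register width on hosted systems:

#include <stdio.h>
#include <limits.h>

int main(void)
{
    /* Natural integer operand width, in bits. */
    printf("int   : %zu bits\n", sizeof(int) * CHAR_BIT);
    /* Pointer (address) width, in bits; the two need not match,
       just as bus and register widths need not match. */
    printf("void *: %zu bits\n", sizeof(void *) * CHAR_BIT);
    return 0;
}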
It isn't the address of a register but the width of the register.
In the 8086 microprocessor, we segment the memory into segments of 64K each because of the 16-bit registers (since a 20-bit address cannot be stored in a 16-bit register). These segments are categorized as the code segment, data segment, stack segment and extra segment. This structure is similar to the one created by a process in an operating system. Does that mean each process takes up memory equivalent to 4 segments, which would be 4*64K in the case of the 8086? And if this is true, then by doing some more math we can say that only 4 processes can be handled by an 8086 microprocessor at a time (i.e. one process in the running state and the others in the blocked or ready state), since a maximum of 16 segments is possible (total memory size / size of each segment = 1MB/64K = 16).
I have just started studying this and saw this equivalence between processes and segments. Does any such connection between the segments of the memory and the memory structure of a process actually exist, or is it just my imagination?
A little history helps. Early UNIX(tm) ran on the Digital PDP minicomputer family. The first widely circulated versions were V6 and V7, which were exclusive to the PDP-11 family. That family could support a whopping 256K of RAM, but the general-purpose register set (used for address formation) was only 16 bits wide. There was a limited memory-protection scheme in the processor, which permitted the kernel (supervisor) to have a separate address space from user mode (user), and instructions (addresses generated by the PC) to be separate from data (addresses generated by other means). This will probably get edited into the dust by PDP-11 fanbois.
At around this time, Intel was developing what was to become the 8086. The 8-bit CPUs of the day were already straining at a 64K address-space limitation, and were using a concept called bank switching to get around it. In bank switching, some sub-ranges of the 64K address space could be re-pointed into a larger memory bank, so with careful programming you could address much more memory. The Hitachi 64180 was one of the CPUs that incorporated this into its silicon; most used external memory controllers.
The 8086 addressing scheme was an amalgamation of these notions. You could produce an operating system that supported dynamically relocated processes and shared text, with up to 64K of instructions + 64K of data per process. The general idea was to take the segment registers out of the programming model: if the OS has to relocate a process, it knows the process has no saved copy of the old segment value. The commercial OS QNX 1.x/2.x provided this as a model, the latter using the 286 extensions to protect against programs that played with the segment registers.
For programs that didn't care about such subtleties (Lotus 123, ...), you could use the segment registers to effectively create a 2^20 address space on the 8086. It is an ugly programming model because address formation is A = Seg*16 + Base, so Seg=1,Base=0 and Seg=0,Base=16 resolve to the same address.
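A minimal C sketch of that address formation (the function name phys_addr is mine, purely for illustration) shows the aliasing directly:

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

/* 8086 real mode: physical address A = Seg*16 + Base, 20 bits wide. */
static uint32_t phys_addr(uint16_t seg, uint16_t base)
{
    return ((uint32_t)seg << 4) + base;
}

int main(void)
{
    printf("%05" PRIX32 "\n", phys_addr(1, 0));   /* prints 00010 */
    printf("%05" PRIX32 "\n", phys_addr(0, 16));  /* prints 00010 too */
    return 0;
}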
So, you aren't hallucinating, it was quite intentional, if more than a little half-arsed.
Given that the word size of a CPU allows it to address every single byte in memory,
and given that, via PAE, CPUs can even use more bits than their word size for addressing:
what is the reason that a CPU cannot read an unaligned word in one step?
For example, on a 32-bit machine you can read the 4-byte chunk starting at position 0, but not the one starting at position 1 (well, you can, but it takes several steps).
Why can CPUs not do that?
The problem is not the ability of the CPU to address any single byte in memory; it is that the memory does not have the same granularity.
Like Oli said, this is very architecture-specific, but memory chips are often addressed by the width of their data bus, meaning that a given address represents a full "word" of their data bus.
Let's take the example of a 32-bit CPU with a 32-bit-wide data bus connected to a memory device. When the CPU wants to access the word at address 0x00000000, it really wants to access bytes 0, 1, 2 and 3. For the memory chip, however, this is represented by the single address 0x00000000.
Now when the CPU wants to access the word at byte address 0x00000001, it really wants to access bytes 1, 2, 3 and 4. For the memory chip, however, this is a piece of the word at its address 0x00000000 and a piece of the word at its address 0x00000001.
Hence the need for two bus cycles.
EDIT: Adding some wiring illustration
To illustrate this, here are the two addressing schemes side by side:
Notice the bit shift in the addresses of the RAM chip.
Addresses will look like this:
// From the RAM point of view
#0x00000000: Bytes 0x00000000 to 0x00000003
#0x00000001: Bytes 0x00000004 to 0x00000007
To access the 4-byte value starting at CPU byte address 0x00000001 (bytes 1 to 4), you can see that no direct addressing is possible. You need to ask the RAM chip for both of its words, at addresses 0x00000000 and 0x00000001.
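Here is a hedged C sketch of what the controller effectively has to do, assuming a little-endian 32-bit bus; ram_word is a hypothetical accessor that can only fetch aligned 32-bit words:

#include <stdint.h>

/* Hypothetical aligned-only RAM: returns the 32-bit word at RAM
   address i, i.e. bytes 4*i .. 4*i+3 (little-endian). */
extern uint32_t ram_word(uint32_t i);

/* Read 32 bits starting at an arbitrary CPU byte address. */
uint32_t read32(uint32_t byte_addr)
{
    uint32_t i   = byte_addr >> 2;   /* RAM word index        */
    uint32_t off = byte_addr & 3u;   /* byte offset into word */

    if (off == 0)
        return ram_word(i);          /* aligned: one bus cycle */

    /* Unaligned: two bus cycles, then stitch the pieces together. */
    uint32_t lo = ram_word(i)     >> (8 * off);
    uint32_t hi = ram_word(i + 1) << (8 * (4 - off));
    return lo | hi;
}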
The simple answer is they can't because they are designed not to.
The main reason that they are designed this way is for performance and scalability. We would lose way too many incredibly important features to support this.
A simple analogy: the humble shipping container. Before the days of the shipping container, freight of many different shapes and sizes was packed as efficiently as possible into the hulls of ships. Because of the infinitely variable sizes of the freight, ranging from crates to bags of coffee to bales of hay and cotton, the capacity of these ships was horribly and inefficiently utilized.
The shipping container changed all of that, now if you want to ship something internationally it must be in a standard-sized shipping container. It isn't that you can't just ship your bag of cat food to your friend in Hong Kong on a container ship, it's that it is just so incredibly inefficient to do that it just isn't done.
You want to get that cat food to your friend quickly, without buying a whole shipping container? Well, you can pay an express shipping company like FedEx to fly it over on a 747, but you sure as hell are going to pay for that ability.
I was wondering what the reason is behind branding an MCU as 32-bit or 64-bit. In simple architectures like the Harvard or von Neumann architecture, it used to be the width of the data bus. But in the market I have seen MCUs that have 64-bit data buses and yet are marketed as 32-bit MCUs. Can somebody explain?
It is not true that the bit width of a processor is defined by the data bus width. The Intel 8088 (used in the original IBM PC) was a 16-bit device with an 8-bit data bus, and the Motorola 68008 (Sinclair QL) was a 32-bit device with an 8-bit bus.
It is primarily defined by the nature of the instruction set (width of operands) and the register width (necessarily the same).
When most devices had matching bus and instruction/register widths (i.e. prior to about 1980), there was no need for a distinction, and the ambiguity over whether "bit width" referred to bus or register/instruction width was of little consequence. When narrow-bus versions of wide instruction/register devices were introduced, it presented a marketing dilemma: the QL was widely advertised as having a 32-bit processor despite its 8-bit bus, while the 8088 was sometimes referred to as an 8/16-bit part. The 68008 could trivially perform 32-bit operations in a single instruction; the fact that it took 4 bus cycles to get the operand was transparent to software, and the total number of instruction and data fetch cycles was still far fewer than an 8-bit processor would need to perform the same 32-bit operation.
Another interesting architecture in this context is ARM architecture v4, which supports a 16-bit instruction set known as "Thumb" in addition to the 32-bit ARM instruction set. In Thumb mode the instructions are 16 bits wide (the registers remain 32-bit, though most Thumb instructions can access only a subset of them), which gives higher code density than ARM mode. Where an external memory interface is used, most ARM v4 parts support either a 16- or 32-bit external bus; either ARM or Thumb may be used with either, but when a 16-bit bus is implemented, Thumb code generally runs more efficiently than the 32-bit instruction set, thanks to the single bus cycle per instruction or operand fetch.
Given the increasing variety of architectures, instruction/register sets and bus widths, it makes sense now to characterise an architecture by its instruction/register set.
I just discovered that the ARM I'm writing code for (Cortex-M0) doesn't support unaligned memory access.
Now, my code uses a lot of packed structures, and I never got any warnings or hard faults, so how can the Cortex access members of these structures when it doesn't allow unaligned access?
Compilers such as gcc understand alignment and will issue the correct instructions to get around alignment issues. If you have a packed structure, you have told the compiler about it, so it knows ahead of time how to handle the alignment.
Let's say you're on a 32-bit architecture but have a struct that is packed like this:
struct __attribute__((packed)) foo {
    unsigned char bar;   /* offset 0 */
    int baz;             /* offset 1: misaligned on a 32-bit machine */
};
When an access to baz is made, the compiler emits memory loads on 32-bit boundaries and shifts all the bits into position.
In this case it will probably do a 32-bit load at the address of bar and a 32-bit load at the address of bar + 4. Then it will apply a sequence of logical operations, such as shifts and logical and/or, to end up with the correct value of baz in a 32-bit register.
Have a look at the assembly output to see how this works. You'll notice that unaligned accesses will be less efficient than aligned accesses on these architectures.
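On a core like the Cortex-M0, the compiler typically falls back to byte-sized loads for packed members. Here is a minimal sketch of roughly equivalent C (little-endian assumed; load_baz is my name, for illustration only):

#include <stdint.h>

struct __attribute__((packed)) foo {
    unsigned char bar;
    int baz;                 /* offset 1: unaligned */
};

/* Fetch baz one byte at a time and shift into place, much as the
   compiler does when the hardware forbids unaligned word loads. */
int load_baz(const struct foo *f)
{
    const unsigned char *p = (const unsigned char *)f + 1;
    return (int)((uint32_t)p[0]
               | ((uint32_t)p[1] << 8)
               | ((uint32_t)p[2] << 16)
               | ((uint32_t)p[3] << 24));
}

The portable way to write such an access by hand is memcpy(&value, p, sizeof value), which compilers optimize into the same kind of sequence.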
On many older 8-bit microprocessors, there were instructions to load (and store) registers that were wider than the memory bus. Such an operation was performed by loading half of the register from one address and the other half from the next higher address. Even on systems where the memory bus is wider than 8 bits (say, 16 bits), it is often useful to regard memory as an addressable collection of bytes. Loading a byte from any address will cause the processor to read half of a 16-bit memory location and ignore the other half. Reading a 16-bit value from an even address will cause the processor to read an entire 16-bit memory location and use the whole thing; the value will be the same as if one read two consecutive byte addresses and concatenated the results, but it is read in one operation rather than two.
On some such systems, if one attempts to read a 16-bit value from an odd address, the processor will read two consecutive locations, using half of one value and half of the other, as though one had performed two single-byte reads and combined the results. This is called an unaligned memory access. On other systems, such an operation results in a bus fault, which generally triggers some form of interrupt that may or may not be able to do something useful about it. Hardware to support unaligned accesses is rather complicated, and designing code to avoid unaligned accesses is generally not overly difficult. Thus, such hardware generally exists only on processors that are already very complicated, or on processors that will run code designed for processors that assembled multi-byte registers from single-byte reads (e.g. on the 8088, every 16-bit read required two 8-bit memory fetches, and a lot of 8088 code was run on later Intel processors).
I have a small point of confusion.
When we talk about a 32-bit architecture and a 64-bit architecture, what do we actually mean? Do we mean that a 32-bit architecture has 32-bit registers, or a 32-bit address bus, or a 32-bit data bus?
What is generally implied?
I would say that usually, this would mean that a 64-bit system has 64-bit address registers. In modern systems, data registers are usually at least as large as the address registers, so the data registers and data bus would likely be equivalently sized.
A 64-bit system, however, usually does not have a 64-bit address bus. There's no point, since there hasn't been enough RAM manufactured in the history of the planet to need a full 64 bit physical address bus. A given system will have a maximum amount of physical RAM that it can address, based on the width of its address bus.
We mean that we have 64 bits of address space for programs.
This usually means that we have 64-bit registers in the CPU (it makes sense for the registers to be pointer-sized) and so on...
A 32-bit architecture means that the ALU is capable of computing on 32-bit words. The data bus width and the registers are included in this definition, as well as addressing.
It means that the registers and the stack (!) have a width of 32/64 bits. Address spaces are often much smaller; see here:
In principle, a 64-bit microprocessor can address 16 exabytes of memory. In practice, it is less than that.
For example, the AMD64 architecture as of 2011 allows 52 bits for physical memory and 48 bits for virtual memory.
(from Wikipedia)
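For a sense of scale, those limits work out as follows (just the arithmetic from the quote, written as a C check):

#include <stdio.h>

int main(void)
{
    unsigned long long virt = 1ULL << 48;  /* 48-bit virtual  */
    unsigned long long phys = 1ULL << 52;  /* 52-bit physical */
    printf("virtual : %llu bytes (256 TiB)\n", virt);
    printf("physical: %llu bytes (4 PiB)\n", phys);
    return 0;
}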
Well! Thanks a lot for your inputs.
After reading through a lot of articles and online material, I think my confusion is now cleared up.
So I would like to briefly summarize.
n-bit CPU:
An n-bit CPU only means that it has n-bit registers, which implies an n-bit word size. Don't give a second thought to the address/data bus size.
As an example, consider the Motorola 68000 processor: it is called a 32-bit processor because it has 32-bit registers, yet it has a 16-bit data bus and a 24-bit address bus. Due to its 24-bit address bus, it can address only 2^24 bytes, i.e. 16 MB of RAM.
The address bus only tells you how much RAM can be addressed, whereas the data bus tells you how much data can be transferred in one bus cycle.
The 68000 can thus transfer only 2 bytes of data per cycle, due to its 16-bit data bus.
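A quick C check of that arithmetic (nothing more than the two formulas above):

#include <stdio.h>

int main(void)
{
    unsigned long bytes = 1UL << 24;  /* 24-bit address bus */
    printf("%lu bytes = %lu MB addressable\n", bytes, bytes >> 20); /* 16 MB */
    printf("%d bytes per bus cycle\n", 16 / 8);  /* 16-bit data bus */
    return 0;
}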