Is there a hardware unit called "2's complement"? - hardware

I understand that in order to do a subtraction you should do a 2's complement transformation on the second number.
Is there dedicated hardware that checks the MSB and, if it is found to be 1, does the transformation?
Also, is this system used for subtraction of floating-point numbers?

The Two's Complement operation is implemented in most languages with the unary - operator. It is only used with signed integer types. It can be implemented in an ALU as either a distinct negation (e.g. NEG) instruction or rolled into another operation, for example when you use a subtract (e.g. SUB) instruction instead of an add (e.g. ADD) instruction.
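As a minimal sketch in C (assuming a two's-complement machine, which every mainstream CPU is for signed integers), negation and invert-plus-one produce the same result:
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int32_t x = 42;
    int32_t neg = ~x + 1;        /* two's complement: invert all bits, then add 1 */
    printf("%d %d\n", -x, neg);  /* prints: -42 -42 */
    return 0;
}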
Your first question is unclear because "the last bit" could refer to either the most-significant bit (MSB) or least significant bit (LSB). In a signed integer, the MSB indicates sign; checking for a negative is usually implemented as the N bit in the condition code register, which is updated from the result of the last instruction executed (though several instructions do not change the condition code register). Computing the two's complement only if the original number is negative is the absolute value (e.g. ABS) operation. Checking the LSB just tells you if the integer is even or odd.
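A small C sketch of that conditional negation (my illustration of the idea, not any particular CPU's ABS implementation):
#include <stdio.h>
#include <stdint.h>

static int32_t my_abs(int32_t v)
{
    /* if the sign bit (MSB) is 1, take the two's complement; otherwise pass through */
    return v < 0 ? ~v + 1 : v;
}

int main(void)
{
    printf("%d %d\n", my_abs(-7), my_abs(7));  /* prints: 7 7 */
    return 0;
}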
Floating point numbers use a separate sign bit, so 0 and -0 are distinct values. Two's complement does not work with floating-point values; a different approach must be used.
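A short C sketch to see that separate sign bit (assuming 32-bit IEEE-754 floats, which is near-universal):
#include <stdio.h>
#include <string.h>
#include <stdint.h>

int main(void)
{
    float pz = 0.0f, nz = -0.0f;
    uint32_t pbits, nbits;
    memcpy(&pbits, &pz, sizeof pbits);   /* inspect the raw bit patterns */
    memcpy(&nbits, &nz, sizeof nbits);
    printf("%08x %08x\n", (unsigned)pbits, (unsigned)nbits); /* 00000000 80000000: only the sign bit differs */
    printf("%d\n", pz == nz);            /* 1: they still compare equal */
    return 0;
}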
EDIT: An example. Consider the following C code:
#include <stdlib.h>

int do_math(int a, int b)
{
    return a - b;
}

int main(int argc, char* const argv[])
{
    if (argc < 3)   /* need two arguments */
        return 0;
    return do_math(atoi(argv[1]), atoi(argv[2]));
}
This can be run with:
$ gcc -O0 foo.c -o foo
$ ./foo 20 10; echo $?
10
On x86_64, the function do_math() contains the following code:
_do_math:
        pushq   %rbp
        movq    %rsp, %rbp
        movl    %edi, -4(%rbp)
        movl    %esi, -8(%rbp)
        movl    -8(%rbp), %edx
        movl    -4(%rbp), %eax
        subl    %edx, %eax
        leave
        ret
The first two lines are the preamble, setting up the stack frame for the function. The next four lines store the input parameters (which arrive in registers %edi and %esi under the x86_64 calling convention) to the stack, then reload them into %edx and %eax; with optimization disabled, the compiler round-trips through memory instead of using the incoming registers directly.
Then the key instruction: subl, which subtracts the second parameter (%edx, the x86's Extended DX register, 32 bits in size) from the first parameter (%eax, the x86's Extended AX register, also 32 bits in size), storing the result back into %eax. In the ALU, the subl instruction takes the first parameter as-is and adds the two's complement of the second parameter. It calculates the two's complement by inverting the second parameter's bits (similar to the ~ operator in C) and then using a dedicated adder to add 1. This step could be pipelined, it could be optimized so both it and the final addition complete in one cycle, or the designers could go a step further and roll the two's-complement logic into the ALU's adder chain.
The last two lines clean up the stack and return. (The x86_64 calling conventions store the result in %eax.)
EDIT 2: Use the -S option to gcc to generate an assembly file (same name as the input file, except the .c suffix is replaced with .s). For example: gcc -O0 foo.c -S (Had I not turned off the optimizer with -O0, the entire do_math() function could have been inlined into main(), making it much harder to see.)
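As a follow-up to the subl discussion, you can check the identity a - b == a + (~b + 1) from plain C yourself. A sketch using unsigned arithmetic, where wraparound is well defined:
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t a = 20, b = 10;
    /* subtraction is addition of the two's complement, modulo 2^32 */
    printf("%u %u\n", (unsigned)(a - b), (unsigned)(a + (~b + 1u)));  /* prints: 10 10 */
    return 0;
}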

You don't have to check the number at all. If a number is negative, it is already stored in two's-complement form in memory, and when you use that number the CPU handles the representation for you. You don't need to check anything; you just perform the operations.

No and no.
The transformation is done by code running on the CPU.

Related

Raku operator for 2's complement arithmetic?

I sometimes use this:
$ perl -e "printf \"%d\", ((~18446744073709551592)+1)"
24
I can't seem to do it with Raku. The best I could get is:
$ raku -e "say +^18446744073709551592"
-18446744073709551593
So: how can I make Raku give me the same answer as Perl ?
Gotta go with (my variant¹ of) Liz's custom op (in her comment below).
sub prefix:<²^>(uint $a) { (+^ $a) + 1 }
say ²^ 18446744073709551592; # 24
My original "semi-educated wild guess"² that turned out to be acceptable to #zentrunix and the basis for Liz's op:
say (+^ my uint $ = 18446744073709551592) + 1; # 24
\o/ It works!³
Footnotes
¹ I flipped the two-character op because I wanted to follow the +^ form, have it sub-vocalize as "two's complement", and avoid it looking like ^2.
² One line of thinking was about the particular integer. I saw that 18446744073709551592 is close to 2**64. Another was that integers are limited precision in Perl unless you do something to make them otherwise, whereas in Raku they are arbitrary precision unless you do something to make them otherwise. A third line of thinking came from reading the doc for prefix +^ which says "converts the number to binary using as many bytes as needed" which I interpreted as meaning that the representation is somehow important. Hmm. What if I try an int variable? Overflow. (Of course.) uint? Bingo.
³ I've no idea if this solution is right for the wrong reasons. Or even worse. One thing that's concerning is that uint in Raku is defined to correspond to the largest native unsigned integer size supported by the Raku compiler used to compile the Raku code. (Iirc.) In practice today this means Rakudo and whatever underlying platform is being targeted, and I think that almost certainly means C's uint64_t in almost all cases. I imagine perl has some similar platform dependent definition. So my solution, if it is a reasonable one, is presumably only portable to the degree that the Raku compiler (which in practice today means Rakudo) agrees with the perl binary (which in practice today means P5P's perl) when run on some platform. See also #p6steve's comment below.
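For comparison, here is what the fixed 64-bit arithmetic looks like in C (a sketch of what Perl's native integers are effectively doing; note that 18446744073709551592 is 2^64 - 24):
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint64_t x = 18446744073709551592ULL;            /* 2^64 - 24 */
    /* ~x + 1 is the two's complement, i.e. 2^64 - x, modulo 2^64 */
    printf("%llu\n", (unsigned long long)(~x + 1));  /* prints: 24 */
    return 0;
}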
'Long-hand' answer:
raku -e 'put ( (18446744073709551592.base(2) - 0b1).comb.map({!$_.Int+0}).join.parse-base(2));'
OR
raku -e 'say 18446744073709551592.base(2).comb.map({!$_.Int+0}).join.parse-base(2) + 1;'
Sample Output: 24
The answers above (should?) implement "Two's-Complement" encoding directly. Neither uses Raku's +^ twos-complement operator. The first one subtracts one from the binary representation, then inverts. The second one inverts first, then adds one. Neither answer feels truly correct, yet the same answer as Perl5 is obtained (24).
Looking at the Raku Docs page, one would conclude that the "twos-complement" of a positive number would be negative, hence it's not clear what the Perl (and now Raku) answers represent. Hopefully the foregoing is somewhat useful.
https://docs.raku.org/routine/+$CIRCUMFLEX_ACCENT

Trying to replicate a CRC made with ielftool in srec_cat

So I'm trying to figure out a way to calculate a CRC with srec_cat before putting the code on a microcontroller. Right now, my post-build script uses the ielftool from IAR to do the calculation and insert it into the correct spot in the hex file.
I'm wondering how I can produce the same CRC with srec_cat, using the same hex file of course.
Here is the ielftool command that produces the CRC32 that I want to replicate:
--checksum APP_SYS_ApplicationCrc:4,crc32:1mi,0xffffffff;0x08060000-0x081fffff
APP_SYS_ApplicationCrc is the symbol where the checksum will be stored, with a 4-byte offset added
crc32 is the algorithm
1 specifies one’s complement
m reverses the input bytes and the final checksum
i initializes the checksum value with the start value
0xffffffff is the start value
And finally, 0x08060000-0x081fffff is the memory range for which the checksum will be calculated
I've tried a lot of things, but this, I think, is the closest I've gotten to the same command so far with srec_cat:
-crop 0x08060000 0x081ffffc -Bit_Reverse -crc32_b_e 0x081ffffc -CCITT -Bit_Reverse
-crop 0x08060000 0x081ffffc In a way specifies the memory range for which the CRC will be calculated
-Bit_Reverse should do the same thing as m in the ielftool when put in the right spot
-crc32_b_e is the algorithm. (I'm not sure yet if I need big endian _b_e or little endian _l_e)
0x081ffffc is the location in memory to place the CRC
-CCITT The initial seed (start value in ielftool) is all one bits (it's the default, but I figured I'd throw it in there)
Does anyone have ideas of how I can replicate the ielftool's CRC? Or am I just trying in vain?
I'm new to CRCs and don't know much more than the basics. Does it even matter anyway if I have exactly the same algorithm? Won't the CRC still work when I put the code on a board?
Note: I'm currently using ielftool 10.8.3.1326 and srec_cat 1.63
After many days of trying to figure out how to get the CRCs from each tool to match (and to make sure I was giving both tools the same data), I finally found a solution.
Based on Mark Adler's comment above, I was trying to figure out how to get the CRC of a small amount of data such as an unsigned int. I finally had a lightbulb moment this morning and realized that I simply needed to put a uint32_t with the value 123456789 in the code for the project I was already working on. Then I placed the variable at a specific location in memory using:
#pragma location=0x08060188
__root const uint32_t CRC_Data_Test = 123456789; //IAR specific pragma and keyword
This way I knew the variable's location and length, so I could tell ielftool and srec_cat to calculate the CRC only over the area of that variable in memory.
I then took the elf file from the compiled project and created an Intel hex file from it, so I could more easily check that the correct variable data was at the correct address.
Next I sent the elf file through ielftool with this command:
ielftool proj.elf --checksum APP_SYS_ApplicationCrc:4,crc32:1mi,0xffffffff;0x08060188-0x0806018b proj.elf
And I sent the hex file through srec_cat with this command:
srec_cat proj.hex -intel -crop 0x08060188 0x0806018c -crc32_b_e 0x081ffffc -o proj_srec.hex -intel
After converting the elf with the CRC to a hex file and comparing the two hex files, I saw that the CRCs were very similar. The only difference was the endianness. Changing -crc32_b_e to -crc32_l_e got both tools to give me 9E 6C DF 18 as the CRC.
I then changed the memory address ranges for the CRC calculation to what they originally were (see the question) and I once again got the same CRC with both ielftool and srec_cat.
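In case a third, independent reference is useful when chasing this kind of mismatch, here is a plain C sketch of the standard reflected CRC-32 (polynomial 0xEDB88320, initial value 0xFFFFFFFF, final inversion). Whether it matches the ielftool 1mi variant byte-for-byte is something to verify against the test word, not something I am asserting:
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Bitwise reflected CRC-32, as used by zlib/Ethernet. */
static uint32_t crc32_ref(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int b = 0; b < 8; b++)
            crc = (crc & 1) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
    }
    return ~crc;
}

int main(void)
{
    uint32_t v = 123456789;  /* the test word from above, in host byte order */
    printf("%08x\n", (unsigned)crc32_ref((const uint8_t *)&v, sizeof v));
    return 0;
}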

what's the best way for awk to check arbitrary integer precision

from GNU gawk's page
https://www.gnu.org/software/gawk/manual/html_node/Checking-for-MPFR.html
they have a formula to check arbitrary precision
function adequate_math_precision(n) { return (1 != (1+(1/(2^(n-1))))) }
My question is : wouldn't it be more efficient by staying within integer math domain with a formula such as
( 2^abs(n) - 1 ) % 2 # note 2^(n-1) vs. 2^|n| - 1
Since any power of 2 must be even, subtracting 1 must always give an odd number, so its modulo (%) over 2 becomes an indicator function for is_odd() for n >= 0, while abs(n) handles the cases where n is negative.
Or does the modulo necessitate a cast to floating point, thus nullifying any gains?
Good question. Let's tackle it.
The proposed snippet aims at checking whether gawk was invoked with the -M option.
I'll attach some digression on that option at the bottom.
The argument n of the function is the floating-point precision needed for whatever operation you'll have to perform. So, say your script is in a library somewhere and will get called, but you have no control over it. You'll run that function at the beginning of the script to promptly throw an exception and bail out, signalling that the end result would be wrong due to a lack of bits to store the numbers.
Your code stays in the integer realm: a power of two of an integer is an integer. There is no need to use abs(n) here, because there is no point in specifying how many bits you'll need as a negative number in the first place.
Then you subtract one from an even integer. Now, unless n=0, in which case 2^0=1 and your code reads (1 - 1) % 2 = 0, your snippet will always return 1, because the remainder (%) of an odd number divided by two is 1.
Problem is: you are trying to calculate a potentially stupidly large number inside a function whose whole purpose is to check whether you are able to do so in the first place.
Since any power of 2 must also be even, then subtracting 1 must always
be odd, then its modulo (%) over 2 becomes indicator function for
is_odd() for n >= 0, while the abs(n) handles the cases where it's
negative.
Except when n=0, as we discussed above, you are right. The snippet will tell you that any power of 2 is even, and that any power of 2, minus 1, is odd. We were discussing another subject entirely, though.
Let's analyze the other function instead:
return (1 != (1+(1/(2^(n-1)))))
Remember that booleans in awk work like this: 0 is false and non-zero is true. So, while 1+x, where x is a very small number (the reciprocal of a large power of two, 1/2^122 in the example page), is mathematically guaranteed to be != 1, in the digital world that's not the case. At some point, the floating-point computation reaches its precision rock bottom, the sum is rounded down, and x is suddenly treated as 0. At that point, the arbitrary-precision check returns 0, i.e. false: 1 is equal to 1.
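You can watch that rock bottom with ordinary IEEE-754 doubles. A C sketch (a double has a 53-bit significand):
#include <stdio.h>
#include <math.h>

int main(void)
{
    /* 1 + 2^-52 is one ulp above 1, so the comparison holds;
       1 + 2^-53 rounds back to exactly 1, so it fails. */
    printf("%d\n", 1.0 != 1.0 + pow(2.0, -52));  /* prints: 1 */
    printf("%d\n", 1.0 != 1.0 + pow(2.0, -53));  /* prints: 0 */
    return 0;
}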
A larger discussion on types and data representation
The page you link explains precision for gawk invoked with the -M option. This sounds like technoblahblah; let's decipher it.
At some point, your OS and architecture have to decide how to store data, that is, how to represent it in memory so that it can be accessed again and displayed. Terms like integer, float, double, and unsigned integer are examples of data representations. Here we are addressing integer representation: how is an integer stored in memory?
A 32-bit system will use 4 bytes to represent an integer, which in turn determines how large the integer can be. The 32 bits are read from most significant (MSB) to least significant (LSB), and if the type is signed, one bit (typically the MSB) represents the sign, drastically reducing the maximum size of the integer.
If asked to compute a large number, a machine will try to fit it into the largest type available. If the end result is larger than that, you get an overflow and end up with a wrong result or an error. Many online challenges ask you to write code for arbitrarily long loops or large sums, then test it with inputs that break the 64-bit barrier, to see if you master the proper types for indexes.
AWK is not a strongly typed language. Meaning, any variable can store data regardless of its type. The data type can change, and it is determined at runtime by the interpreter, so the developer doesn't need to care. For instance:
$ awk 'BEGIN {a="this is text"; print a; a=2; print a; print a+3.0*2}'
-| this is text
-| 2
-| 8
In the example, a holds text, then an integer, which can be summed with a floating-point number and printed as an integer without any special type handling.
The Arbitrary Precision Page presents the following snippet:
$ gawk -M 'BEGIN {
> s = 2.0
> for (i = 1; i <= 7; i++)
> s = s * (s - 1) + 1
> print s
> }'
-| 113423713055421845118910464
There is some math voodoo behind it, which we will skip. Since s is interpreted as a floating-point number, the end result is computed in floating point.
Try to input that number into the Windows calculator as decimal and it will fail, although you can compute it in binary. You'll need the programmer mode, and up to 53 bits, to fit it as an unsigned integer.
53 is a magic number here: with the -M option, gawk uses arbitrary precision for numbers. In other words, it decides how many bits are necessary, tracks them, and breaks free of the native OS architecture. The default is that gawk allocates 53 bits for any given arbitrary-precision number. Fun fact: the actual result of that snippet is wrong, and it would take up to 100 bits to compute it correctly.
To implement handling of arbitrarily large numbers, gawk relies on an external library called MPFR. Provided with an arbitrarily large number, MPFR will handle the memory allocation and bit requisition to store it. However, the interface between gawk and MPFR is not perfect, and gawk can't always control the type that MPFR will use. In the case of integers, that's not an issue. For floating-point numbers, it will result in rounding errors.
This brings us back to the snippet at the beginning: if gawk was called with the -M option, numbers up to 2^53 can be stored as integers. Floating-point values will be smaller than that (you need to make the decimal point disappear somehow, or rather represent it by spending some of the bits allocated for that number, just like the sign). Following the example on the page, and asking for an arbitrary precision larger than 32, the snippet will return TRUE only if the -M option was passed; otherwise 1/2^(n-1) will be rounded down to 0.
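The 2^53 limit is easy to demonstrate in any language with IEEE-754 doubles. A C sketch:
#include <stdio.h>

int main(void)
{
    double big = 9007199254740992.0;   /* 2^53 */
    /* 2^53 + 1 is not representable as a double, so the sum rounds back */
    printf("%d\n", big + 1.0 == big);  /* prints: 1 */
    return 0;
}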

how a hex file is converted into binary in a microcontroller

I am new to embedded programming. I am using a compiler to convert source code into hex, which I will burn into a microcontroller. My question is: a microcontroller (like all ICs) supports binary numbers only (0 & 1), so how does it work with a hex file?
The software that loads the program/data into the flash reads whatever format it supports, which may be Intel hex, Motorola S-record, ELF, COFF, a raw binary, or something else, and then does the right thing to program the flash with just the relevant ones and zeros.
First of all, the PC you are using right now has a processor inside, which works just like any other microcontroller. You are using it to browse the internet, although it's all "1s and 0s on the inside". And I am presuming your actual firmware doesn't come even close to running what your PC is running at this moment.
microcontroller will support binary numbers only (0 & 1)
Your idea that a "microcontroller only supports binary numbers (0 & 1)" is a misconception. At its very lowest level, yes, a microcontroller contains a bunch of transistors, and each of them can store only two states of information (a bit).
But that is simply because this is a practical way to physically store one small chunk of data.
If you check the assembly instruction manual for your uC architecture, you will see a large number of instructions operating on different data widths (bits grouped into chunks of 8, 16 or more). If your controller is, say, 16-bit, then this will be the basic word size for most instructions, and the one that will be the most efficient. When programming in C, this will also be the size of the "special" int type which all smaller integral types get promoted to.
In other words, bits are just building blocks of your hardware, and most of the time shouldn't even concern you at the firmware level, let alone higher application levels. Compare it to a human life form: human body is made of cells, but is also capable of doing more than a single-cell organism, isn't it?
i am using compiler to convert source code into hex
Actually, you are using the compiler to create machine code for your particular microcontroller architecture. "Hex", or more precisely the Intel Hex file format, is just one of several file formats used for storing machine code in a file, and it is conveniently a plain-text ASCII file which you can easily open in Notepad.
To clarify, let's say you wrote a simple line of C code like this:
a = b + c;
Your compiler needs to know which architecture you are targeting, in order to convert this to machine code. For a fictional uC architecture, this will first get compiled to the following fictional assembly language:
// compiler decides that a,b,c will be stored at addresses 0x1000, 1004, 1008
mov ax, (0x1004) // move value from address 0x1004 to accumulator
add ax, (0x1008) // add value from address 0x1008 to accumulator
mov (0x1000), ax // move value from accumulator to address 0x1000
Each of these instructions has its own instruction opcode, which can be found in the assembly instruction manual. If the instruction operates on one or more parameters, the uC will know that the bytes following the instruction are data bytes:
// mov ax, (addr) --> opcode 0x10
// add ax, (addr) --> opcode 0x20
// mov (addr), ax --> opcode 0x30
mov ax, (0x1004) // 0x10 (0x10 0x04)
add ax, (0x1008) // 0x20 (0x10 0x08)
mov (0x1000), ax // 0x30 (0x10 0x00)
Now you've got your machine-code, which, written as hex values, becomes:
10 10 04 20 10 08 30 10 00
And converted to binary becomes:
000100000001000000000100...
To transfer this to your controller, you will use a file format which your flash uploader knows how to read, which is what Intel Hex is most commonly used for.
Once transferred to your microcontroller, it will be stored as a bunch of bits in its flash memory, but the controller is designed to read these bits in chunks of 8 or more bits, and evaluate them as instruction opcodes or data, depending on the context. For the example above, it will read first 8 bits, and seeing that it's an instruction opcode 0x10 (which takes an additional address parameter), it will read the next two bytes to form the address 0x1004. It will then execute the instruction and advance the instruction pointer.
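To make the fetch-decode cycle concrete, here is a toy interpreter in C for the fictional machine code above (the opcodes and two-byte big-endian addresses are this example's inventions, not a real ISA):
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

int main(void)
{
    /* the nine bytes of "machine code" from the example */
    const uint8_t code[] = {0x10,0x10,0x04, 0x20,0x10,0x08, 0x30,0x10,0x00};
    static int mem[0x2000];
    mem[0x1004] = 2;   /* b */
    mem[0x1008] = 3;   /* c */
    int ax = 0;
    for (size_t pc = 0; pc < sizeof code; ) {
        uint8_t op = code[pc++];
        int addr = (code[pc] << 8) | code[pc + 1];  /* two address bytes follow */
        pc += 2;
        switch (op) {
        case 0x10: ax = mem[addr];  break;  /* mov ax, (addr) */
        case 0x20: ax += mem[addr]; break;  /* add ax, (addr) */
        case 0x30: mem[addr] = ax;  break;  /* mov (addr), ax */
        }
    }
    printf("a = %d\n", mem[0x1000]);  /* a = 5 */
    return 0;
}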
Hex, Decimal, Binary, they are all just ways of representing a number.
AA in hex is the same as 170 in decimal and 10101010 in binary (and 252 in octal).
The reason the hex representation is used is that it is very convenient when working with microcontrollers, as one hex character fits into one nibble. Hence F is 1111, FF is 1111 1111, and so forth.
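A one-liner in C makes the point that these are just notations for the same value:
#include <stdio.h>

int main(void)
{
    /* hex, decimal and octal literals for the same number */
    printf("%d %d %d\n", 0xAA, 170, 0252);  /* prints: 170 170 170 */
    return 0;
}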

Converting some assembly to VB.NET - SHR operator working differently?

Well, a simple question here
I am studying some assembly, and converting some assembly routines back to VB.NET
Now, there is a specific line of code I am having trouble with. In assembly, assume the following:
EBX = F0D04080
Then the following line gets executed
SHR EBX, 4
Which gives me the following:
EBX = 0F0D0408
Now, in VB.NET, I do the following
variable = variable >> 4
Which SHOULD give me the same... but it differs a SLIGHT bit: instead of the value 0F0D0408, I get FF0D0408
So what is happening here?
From the documentation of the >> operator:
In an arithmetic right shift, the bits shifted beyond the rightmost bit position are discarded, and the leftmost (sign) bit is propagated into the bit positions vacated at the left. This means that if pattern has a negative value, the vacated positions are set to one; otherwise they are set to zero.
If you are using a signed data type, F0D04080 has a negative sign (bit 1 at the start), which is copied to the vacated positions on the left.
This is not something specific to VB.NET, by the way: variable >> 4 is translated to the IL instruction shr, which is an "arithmetic shift" and preserves the sign, in contrast to the x86 assembly instruction SHR, which is an unsigned shift. To do an arithmetic shift in x86 assembler, SAR can be used.
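The same contrast can be seen in C. A sketch (note that right-shifting a negative signed value is implementation-defined in C, but mainstream compilers perform an arithmetic shift):
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int32_t  s = (int32_t)0xF0D04080;      /* negative as a signed value */
    uint32_t u = 0xF0D04080u;
    printf("%08X\n", (unsigned)(s >> 4));  /* FF0D0408: sign bit shifted in */
    printf("%08X\n", (unsigned)(u >> 4));  /* 0F0D0408: zeros shifted in   */
    return 0;
}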
To use an unsigned shift in VB.NET, you need to use an unsigned variable:
Dim variable As UInteger = &HF0D04080UI
The UI type character at the end of F0D04080 tells VB.NET that the literal is an unsigned integer (otherwise, it would be interpreted as a negative signed integer and the assignment would result in a compile-time error).
VB's >> operator does an arithmetic shift, which shifts in the sign bit rather than 0's.
variable = (variable >> shift_amt) And Not (Integer.MinValue >> (shift_amt - 1))
should give you an equivalent value, even if it is a bit long. Alternatively, you could use an unsigned integer (UInteger or UInt32), as there's no sign bit to shift.