What is the decimal value of the sum of the following 5-bit two's complement numbers?

Can someone explain this question?
What is the decimal value of the sum of the following 5-bit two's complement numbers? 10010+10101

Two's complement numbers are added together by doing binary arithmetic.
10010 +
10101 =
00111
As with decimal addition, you carry a 1 into the next column whenever a column adds up to two or more.
To interpret two's complement numbers, read the bits from right to left: the first (rightmost) bit has a value of 2^0, the second 2^1, the third 2^2, the fourth 2^3. This pattern extends naturally to 32- and 64-bit numbers. The final (leftmost) bit carries a negative weight; in 5-bit two's complement it represents -2^4.
Multiplying these values with the bits we came up with we have:
-0*2^4 + 0*2^3 + 1*2^2 + 1 *2^1 + 1*2^0
This value is 4 + 2 + 1 = 7. If we look at the decimal value of 10010, we see it's equal to -2^4 + 2^1 = -16 + 2 = -14. 10101 comes out to -2^4 + 2^2 + 2^0 = -11.
So the computer is saying that the sum of (-11) + (-14) is 7. This is overflow: the true sum, -25, lies outside the 5-bit range of -16 to 15, and we ignored the fact that the final bit should have carried a 1 into the next column. Given a finite representation, this is the best we can do.
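The arithmetic above can be checked with a short Python sketch. The helper name `to_signed` and the constant `BITS` are made up for this illustration; Python integers are unbounded, so the 5-bit wraparound is emulated with a mask.

```python
BITS = 5  # width used in the question

def to_signed(value, bits=BITS):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1        # keep only the low bits (drops any carry out)
    if value & (1 << (bits - 1)):   # sign bit set: the top bit weighs -2**(bits-1)
        value -= 1 << bits
    return value

a = int("10010", 2)
b = int("10101", 2)
raw_sum = a + b                         # six bits wide; the top bit is the carry out
wrapped = raw_sum & ((1 << BITS) - 1)   # what a 5-bit register would keep

print(to_signed(a), to_signed(b))       # -14 -11
print(format(raw_sum, "b"))             # 100111
print(format(wrapped, "05b"))           # 00111
print(to_signed(raw_sum))               # 7
```

The true sum, -25, differs from the printed 7 by exactly 2^5 = 32, which is the weight of the discarded carry.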
The overflow is characterized by a bunch of neat properties since the representation we have is an Abelian group, a mathematical construct. It's outside the scope of the question but you should certainly understand them. Just google overflow.
Also, most answers are going to be curt since it's a basic topic that google could solve and StackOverflow gets enough questions as is. Make sure to check google and search StackOverflow before asking questions!

Read as plain (unsigned) binary:
10010=2^4+2^1=16+2=18
10101=2^4+2^2+2^0=16+4+1=21
21+18=39
^=power


Is there anything special about the number 308? [duplicate]

This question already has answers here:
Biggest integer that can be stored in a double
(10 answers)
Closed 1 year ago.
So one day I was experimenting (like any other good coder does), and I came up across this:
>>> 1e308
1e+308
>>> 1e309
inf
What is going on? First of all, 308’s factors are 2, 2, 7, and 11.
Further investigation yields:
>>> 1.7976931348623158075e308 # No, I didn’t copy it incorrectly
1.7976931348623157e+308
>>> 1.79769313486231581e+308
inf
So what is going on? There doesn’t seem to be any relationship between an absurdly big number and an equally weird number with over 10 decimal places.
Also, all of this was using the repl python console, so others might be different.
A double-precision floating point number in IEEE-754 has an 11-bit exponent and a 53-bit mantissa (52 bits stored plus one implicit leading bit). The 11-bit exponent lets normal numbers range from 2**(-1022) up to just under 2**1024. 2**1024 is about 1.8 * 10**308, which is where the 308 comes from.
You get about 3.32 bits per decimal digit (log2(10)). A 53-bit mantissa gives you about 16 digits of precision.
The largest number that fits in a double is about 1.7976931348623157E+308; the longer literal you typed, 1.7976931348623158075E+308, rounds down to it.
I recommend https://en.wikipedia.org/wiki/IEEE_754 .
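These limits can be inspected directly from Python. This is a small sketch that assumes the usual case where Python's float is an IEEE-754 double (true for CPython on common platforms); it uses only `sys.float_info` and `math` from the standard library.

```python
import math
import sys

largest = sys.float_info.max
print(largest)                    # 1.7976931348623157e+308
print(sys.float_info.max_exp)     # 1024
print(sys.float_info.mant_dig)    # 53

# The largest double is (2 - 2**-52) * 2**1023, just under 2**1024.
print(largest == (2 - 2**-52) * 2.0**1023)       # True

# Bits per decimal digit, and the resulting number of decimal digits.
print(math.log2(10))
print(sys.float_info.mant_dig / math.log2(10))

# The long literal from the question rounds down to the largest double,
# while a slightly larger literal rounds up past it and becomes inf.
print(1.7976931348623158075e308 == largest)      # True
print(math.isinf(1.79769313486231581e308))       # True
```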

Mantissa and Exponent - Negative number with decimal (beyond .5)

Here is my question. I am doing some work and am seeing two different answers. I was using a calculator (online) to check my answer and it is clashing with the answer I am supposed to get and I need to see which one is correct.
The problem is: -6.25
I worked this out for 6.25 and then took the twos complement.
6.25 --> 0110.001
Mantissa --> 0.11000100000 Exponent--> 0011
My Answer: Two's Complement 1.00111100000 Exponent--> 0011
The answer I should be getting says: Mantissa --> 1.11000100000 Exponent --> 0011
It doesn't seem to make sense that all you do is add a 1 in front of the positive Mantissa. I know that if the sign bit is a 0 it is a positive number and a 1 is a negative number. Could you please let me know which one is correct or if either of these are correct please? Thanks. Just want to make sure I am doing it right before I continue.
I'm not sure the number you want to convert is written correctly.
In my opinion:
6.25 --> 110.010 (fixed point), or
6.125 --> 110.001 (fixed point)
You can then transform the fixed-point form into exponent form. The two's complement of -6.125 is 1_001.111, which in exponent form is 1.001111×2^3.
So I think your answer is correct; the reference answer is just the sign-magnitude form of a negative binary number.
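The fixed-point patterns in this answer can be reproduced with a short Python sketch. The function name `fixed_twos_complement` and the layout (7 bits total: 1 sign bit, 3 integer bits, 3 fraction bits) are assumptions chosen to match the 1_001.111 pattern above, not something the question specifies.

```python
FRAC_BITS = 3    # bits after the binary point (assumed)
TOTAL_BITS = 7   # sign bit + 3 integer bits + 3 fraction bits (assumed)

def fixed_twos_complement(x):
    """Return x as a two's complement fixed-point bit string."""
    scaled = round(x * 2**FRAC_BITS)             # e.g. -6.125 * 8 = -49
    pattern = scaled & ((1 << TOTAL_BITS) - 1)   # wrap negatives into 7 bits
    bits = format(pattern, f"0{TOTAL_BITS}b")
    return bits[:-FRAC_BITS] + "." + bits[-FRAC_BITS:]

print(fixed_twos_complement(6.125))    # 0110.001
print(fixed_twos_complement(-6.125))   # 1001.111
print(fixed_twos_complement(6.25))     # 0110.010
print(fixed_twos_complement(-6.25))    # 1001.110
```

Note that the bit string 0110.001 from the question corresponds to 6.125, while 6.25 is 0110.010.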

Why is there one more negative int than positive int?

The upper limit for any int data type (excluding tinyint) is always one less than the absolute value of the lower limit.
For example, the upper limit for an int is 2,147,483,647 and ABS(lower limit) = 2,147,483,648.
Is there a reason why there is always one more negative int than positive int?
EDIT: Changed since question isn't directly related to DB's
The types you provided are signed integers. Let's see one byte(8-bit) example. With 1 byte you have 2^8 combinations which gives you 256 possible numbers to store.
Now you want to have the same number of positive and negative numbers (each group should have 128).
The point is 0 doesn't have +0 and -0. There is only one 0.
So you end up with range -128..-1..0..1..127.
The same logic works for 16/32/64-bit.
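The counting argument can be verified by enumerating all 256 byte patterns. This sketch uses Python's `int.from_bytes` with `signed=True`, which interprets the byte as two's complement.

```python
# Decode every possible 8-bit pattern as a signed (two's complement) value.
values = [int.from_bytes(bytes([p]), "big", signed=True) for p in range(256)]

negatives = sum(v < 0 for v in values)
zeros = values.count(0)
positives = sum(v > 0 for v in values)

print(min(values), max(values))       # -128 127
print(negatives, zeros, positives)    # 128 1 127
```

Zero takes one of the 128 patterns whose sign bit is clear, which leaves 127 for the positive numbers.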
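EDIT:
Why is the range -128 to 127?
It depends on how you represent signed integers:
Signed magnitude representation (-127 to 127, with both +0 and -0)
Ones' complement (-127 to 127, with both +0 and -0)
Two's complement (-128 to 127, with a single 0)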
This question isn't really related to databases.
As lad2025 points out, there are an even number of values. So, by including 0, there would be one more positive or negative value. The question you are asking seems to be: "Why is there one more negative value than positive value?"
Basically, the reason is the sign-bit. One possible implementation of negative numbers is to use n - 1 bits for the absolute value and then 0 and 1 for the sign bit. The problem with this approach is that it permits +0 and -0. That is not desirable.
To fix this, computer scientists devised the two's-complement representation for signed integers. (Wikipedia explains this in more detail.) Basically, this representation maintains the concept of a sign bit that can be tested, but it changes how negative values are encoded. If +1 is represented as 001, then -1 is represented as 111. That is, the negative value is the bit-wise complement of (the positive value minus one): subtract 1, then invert every bit.
The issue is then the value 100 (a 1 followed by any number of zeros). The sign bit is set, so it is negative. However, when you subtract 1 and invert, it becomes itself again (100 - 1 = 011, inverted --> 100). There is an argument for calling this "infinity" or "not a number". Instead it is assigned the smallest possible negative number.
Let's say you have a 4-byte (32-bit) integer. The range defined by C++ is -2^31 to 2^31-1.
So we end up with a range -2^31.....0......2^31-1.
We can think of this as having 2^31 non-negative integers (note 0 is included) and 2^31 negative integers.
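The "subtract 1 and invert" rule, and the one pattern that maps back to itself, can be tried out in Python. The helper `negate` is a name invented for this sketch; a mask stands in for the fixed register width because Python integers are unbounded.

```python
def negate(pattern, bits):
    """Two's complement negation of a bit pattern: subtract 1, then invert."""
    mask = (1 << bits) - 1
    return ~(pattern - 1) & mask

print(format(negate(0b001, 3), "03b"))   # 111  (+1 becomes -1)
print(format(negate(0b111, 3), "03b"))   # 001  (-1 becomes +1)
print(format(negate(0b100, 3), "03b"))   # 100  (maps to itself)

# The same thing happens at 32 bits: only 0 and 1000...0 are their own negation.
print(negate(1 << 31, 32) == 1 << 31)    # True
```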

Objective-C floating point addition error [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Trouble with floats in Objective-C
I have broken this problem down to about as simple as I can get it. Feel free to try the same thing and tell me if you get the same error and what solution you might have. I have already tried it on several computers.
float total = 200000.0f + 154196.8f;
NSLog(@"total: %f", total);
The output is:
total: 354196.812500
If anyone has any sort of logical explanation, feel free to share it.
I'd suggest you brush up on your floats
http://www.altdevblogaday.com/2012/05/20/thats-not-normalthe-performance-of-odd-floats/
If you need higher precision use a double.
Additionally http://randomascii.wordpress.com/2012/03/08/float-precisionfrom-zero-to-100-digits-2/
See What Every Programmer Should Know About Floating-Point Arithmetic for all the deep understanding. The short answer is that all floating point representations have limitations on their precision, and that things that can be expressed in a small number of digits in decimal may not be expressible in a small number of digits in binary (and specifically not in floating point formats).
Note that while double can improve things, it is no panacea. It is quite common to have small rounding errors, even with double. You may easily get 1.99999999 when you expect 2.
Hint:
long double total = 200000.0 + 154196.8;
NSLog(@"total: %Lf", total);
On my machine prints the correct value.
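A 32-bit float has a 24-bit significand (23 bits stored), and a value this large leaves only 5 of those bits for the fraction, so the closest representable fraction to 0.8 is 0.5+0.25+0.0625 = 0.8125.
You should use more bits to get a closer representation.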
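The rounding can be reproduced outside Objective-C. The sketch below uses Python's `struct` module to round a value to the nearest 32-bit float; the helper name `to_float32` is invented for this illustration, and it emulates single-precision arithmetic rather than running the original code.

```python
import struct

def to_float32(x):
    """Round a Python float (a double) to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]

a = to_float32(200000.0)
b = to_float32(154196.8)     # already inexact as a 32-bit float
total = to_float32(a + b)    # the sum is rounded to 32 bits again

print(b)                     # 154196.796875
print(total)                 # 354196.8125

# Spacing between neighbouring 32-bit floats near 354196 is 2**-5.
print(2**-5)                 # 0.03125
```

With a spacing of 0.03125, the representable values nearest to 354196.8 are 354196.78125 and 354196.8125, which is why the output ends in .8125.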

binary search middle value calculation

The following is the pseudocode I got from a TopCoder tutorial about binary search
binary_search(A, target):
    lo = 1, hi = size(A)
    while lo <= hi:
        mid = lo + (hi-lo)/2
        if A[mid] == target:
            return mid
        else if A[mid] < target:
            lo = mid+1
        else:
            hi = mid-1

    // target was not found
Why do we calculate the middle value as mid = lo + (hi - lo) / 2? What's wrong with (hi + lo) / 2?
I have a slight idea that it might be to prevent overflows but I'm not sure, perhaps someone can explain it to me and if there are other reasons behind this.
Although this question is 5 years old, there is a great article on the Google blog that explains the problem and the solution in detail and is worth sharing.
It's worth mentioning that the current implementation of binary search in Java does not use the mid = lo + (hi - lo) / 2 calculation; instead it uses a faster and clearer alternative based on the zero-fill right shift operator:
int mid = (low + high) >>> 1;
Yes, (hi + lo) / 2 may overflow. This was an actual bug in Java binary search implementation.
No, there are no other reasons for this.
From later on in the same tutorial:
"You may also wonder as to why mid is calculated using mid = lo + (hi-lo)/2 instead of the usual mid = (lo+hi)/2. This is to avoid another potential rounding bug: in the first case, we want the division to always round down, towards the lower bound. But division truncates, so when lo+hi would be negative, it would start rounding towards the higher bound. Coding the calculation this way ensures that the number divided is always positive and hence always rounds as we want it to. Although the bug doesn't surface when the search space consists only of positive integers or real numbers, I've decided to code it this way throughout the article for consistency."
It is indeed possible for (hi+lo) to overflow integer. In the improved version, it may seem that subtracting lo from hi and then adding it again is pointless, but there is a reason: performing this operation will not overflow integer and it will result in a number with the same parity as hi+lo, so that the remainder of (hi+lo)/2 will be the same as (hi-lo)/2. lo can then be safely added after the division to reach the same result.
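Both the parity argument and the tutorial's rounding remark can be checked with a few lines of Python. Python's `//` rounds toward negative infinity, so the hypothetical helper `trunc_div2` below emulates the truncating division of C or Java instead.

```python
def trunc_div2(n):
    """Halve n the way C or Java integer division does: truncate toward zero."""
    return int(n / 2)

# Non-negative bounds: hi+lo and hi-lo have the same parity,
# so both formulas give the same midpoint.
for lo in range(0, 20):
    for hi in range(lo, 20):
        assert trunc_div2(lo + hi) == lo + trunc_div2(hi - lo)

# Negative sum: the two formulas round in different directions.
lo, hi = -3, 0
print(trunc_div2(lo + hi))         # -1 (rounds toward hi)
print(lo + trunc_div2(hi - lo))    # -2 (rounds toward lo)
```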
Let us assume that the array we're searching in is of length INT_MAX.
Hence initially:
high = INT_MAX
low = 0
In the first iteration, we notice that the target element is greater than the middle element, so we move the start index past mid:
low = mid + 1
In the next iteration, mid is calculated as (high + low)/2,
which essentially translates to
(INT_MAX + low) / 2, where low is half of INT_MAX plus 1
Now, the first part of this operation, i.e. (high + low), leads to an overflow since the sum goes past the maximum int value, INT_MAX.
Because Go has no unsigned right shift operator, you can avoid integer overflow when calculating the middle value in Go like this:
mid := int(uint(lo+hi) >> 1)
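The overflow scenario and the unsigned-shift fix discussed above can be simulated in Python. Python integers never overflow, so the hypothetical helper `wrap_int32` emulates a 32-bit signed register; the values of `low` and `high` follow the INT_MAX example from the earlier answer.

```python
INT_MAX = 2**31 - 1

def wrap_int32(n):
    """Emulate 32-bit signed wraparound on an unbounded Python int."""
    n &= 0xFFFFFFFF
    return n - 2**32 if n >= 2**31 else n

low = INT_MAX // 2 + 1   # start index after the first iteration
high = INT_MAX

naive_sum = wrap_int32(low + high)               # what a 32-bit int would hold
safe_mid = low + (high - low) // 2               # never exceeds high
shift_mid = ((low + high) & 0xFFFFFFFF) >> 1     # reinterpret as unsigned, then shift

print(naive_sum)    # -1073741825
print(safe_mid)     # 1610612735
print(shift_mid)    # 1610612735
```

The wrapped sum is negative, so dividing it by 2 would give a negative index, while the other two forms agree on the correct midpoint. The shift trick works because the wrapped bit pattern, read as unsigned, still holds the true sum as long as low and high are both non-negative.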
The "why" has been answered, but it is not easy to see why the solution works.
So let's assume high is 10 and low is 5, and that 10 is the highest value an integer can hold (10 + 1 would cause overflow).
So instead of doing (10+5)/2 ≈ 7 (which overflows, because 10 plus anything positive overflows),
we do 5+(10-5)/2 => 5 + 2.5 ≈ 7.