When we say 4K in hardware it is equal to the value 4096 which is 11 bits. What would be the value for 2G and how many bits represent this value?
Thanks
Often in CS we deal with numbers that are necessarily powers of two (all addressable quantities, for example).
In this context it is more useful to have prefixes that are powers of two rather than powers of ten like the decimal K = 10^3, M = 10^6, G = 10^9.
Since the power of two closest to 1000 (the decimal K) is 1024 = 2^10, we can make the analogy that in CS K is 1024 instead of 1000.
This is rather confusing, as some quantities (like disk sizes or transmission channel parameters) are not bound to be powers of two and can be given with either the decimal K or the CS K.
To avoid further confusion, CS now uses dedicated binary prefixes; for example, the CS K is now written Ki.
So, just as the decimal G is 10^9 = (10^3)^3, which you can think of as K^3, G in binary (better called Gi) is Ki^3 = (2^10)^3 = 2^30.
To represent 4Ki quantities you need 12 bits as log2(4Ki) = log2(2^2 * 2^10) = 12.
To represent 2Gi quantities you need log2(2Gi) = log2(2 * 2^30) = 31 bits.
Note that I used the phrase "To represent 4Ki quantities" rather than "To represent the 4Ki quantity"; the latter is different and needs one more bit. This is analogous to saying that to represent 1000 quantities we need 3 decimal digits (from 000 to 999), but to represent the number 1000 itself we need 4 digits (1, 0, 0 and 0).
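The difference between counting quantities and representing a value can be checked with a small sketch. The helper names `bits_to_index` and `bits_to_represent` are hypothetical, introduced here only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bits needed to give each of `count` quantities its own pattern
   (patterns 0 .. count-1): the smallest n with 2^n >= count. */
unsigned bits_to_index(uint64_t count)
{
    assert(count >= 1);
    unsigned n = 0;
    while (n < 64 && ((uint64_t)1 << n) < count)
        n++;
    return n;
}

/* Bits needed to write the number `value` itself in binary. */
unsigned bits_to_represent(uint64_t value)
{
    unsigned n = 1;               /* even 0 takes one bit */
    while (value >>= 1)
        n++;
    return n;
}
```

With these, `bits_to_index(4096)` gives 12 while `bits_to_represent(4096)` gives 13, which is the one-bit difference described above.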
I'm trying to implement Algorithm D from Knuth's "The Art of Computer Programming, Vol 2" in Rust, although I'm having trouble understanding how to implement the very last step of unnormalizing. My natural numbers are a class where each number is a vector of u64, in base u64::MAX. Addition, subtraction, and multiplication have been implemented.
Knuth's Algorithm D is a Euclidean division algorithm which takes two natural numbers x and y and returns (q,r) where q = x / y (integer division) and r = x % y, the remainder. The algorithm depends on an approximation method which only works if the first digit of y is greater than b/2, where b is the base you're representing the numbers in. Since not all numbers are of this form, it uses a "normalizing trick": for example (if we were in base 10), instead of doing 200 / 23, we calculate a normalizer d and do (200 * d) / (23 * d) so that 23 * d has a first digit greater than b/2.
So when we use the approximation method, we end up with the desired q, but the remainder is multiplied by a factor of d. So the last step is to divide r by d so that we can get the q and r we want. My problem is, I'm a bit confused about how we're supposed to do this last step, as it requires division and the method it's part of is trying to implement division.
(Maybe helpful?):
The way that d is calculated is just by taking the integer floor of b-1 divided by the first digit of y. However, Knuth suggests that it's possible to make d a power of 2, as long as d * the first digit of y is greater than b / 2. I think he makes this suggestion so that instead of dividing, we can just do a binary shift for this last step. Although I don't think I can do that given that my numbers are represented as vectors of u64 values, instead of binary.
Any suggestions?
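On the power-of-two idea mentioned above: if d = 2^s with s smaller than the word size, dividing the remainder by d amounts to a right shift carried across adjacent words. A minimal sketch, written in C rather than Rust, assuming the base is 2^64 and the limbs are stored least significant first; the function name `shift_right_limbs` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shift a multi-limb number right by s bits in place (0 <= s < 64).
   limbs[0] is the least significant limb (an assumption of this sketch). */
void shift_right_limbs(uint64_t *limbs, size_t n, unsigned s)
{
    assert(s < 64);
    if (s == 0)
        return;                   /* avoids an undefined shift by 64 below */
    for (size_t i = 0; i < n; i++) {
        /* Bits that fall off the next higher limb enter this one from the top. */
        uint64_t next = (i + 1 < n) ? limbs[i + 1] : 0;
        limbs[i] = (limbs[i] >> s) | (next << (64 - s));
    }
}
```

Because the normalized remainder is an exact multiple of d, no information is lost by the shift in that setting.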
First of all I see the number of strings as the following:
1 (epsilon 0 length string) + 3 (pick one letter) + 9 (3 options for first letter, 3 options for second)
For a total of 13 strings. Now, as far as I know, a language can pick any combination of these, for example l1 = {ab, a, ac} or l2 = {c}.
I'm not sure how to calculate the total number of languages there could be here. Any Help?
So you have a set with 13 elements. A particular language could be any subset of this set. How many subsets does this set have?
This is called the power set of that set, and it has 2^13 = 8192 elements.
Cardinality of character set, say d = 3.
Total words possible of length (<= k), say w = (d^(k+1) - 1)/(d-1) = 13.
Total languages possible = Power Set {Each word can be included or not} = 2^w = 8192.
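The count above can be reproduced with a short sketch. The function names `count_words` and `count_languages` are hypothetical, and the alphabet size d = 3 with maximum length k = 2 is taken from the question.

```c
#include <assert.h>
#include <stdint.h>

/* Number of words of length 0..k over an alphabet of d symbols:
   1 + d + d^2 + ... + d^k. */
uint64_t count_words(uint64_t d, unsigned k)
{
    uint64_t total = 0, power = 1;    /* power holds d^i */
    for (unsigned i = 0; i <= k; i++) {
        total += power;
        power *= d;
    }
    return total;
}

/* Number of languages = number of subsets of the word set = 2^w. */
uint64_t count_languages(uint64_t d, unsigned k)
{
    uint64_t w = count_words(d, k);
    assert(w < 64);                   /* 2^w must fit in 64 bits */
    return (uint64_t)1 << w;
}
```

For d = 3 and k = 2 this gives 13 words and 8192 languages.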
There is this question that I'm having a bit of difficulty answering.
Here it is:
An n-bit register can hold 2^n distinct bit patterns. As such,
it can only be used to address a memory whose number of addressable units
(typically, bytes) is less than or equal to 2^n. In this question, register
sizes need not be a power of two. K = 2^10
a) What is the minimum size of an address register for a computer
with 5 TB of memory?
b) What is the minimum size of an address register for a computer
with 7 TBs of memory?
c) What is the minimum size of an address register for a computer
with 2.5 PBs of memory?
From the conversion, I know that:
1KB = $2^{10}$ bytes
1MB = $2^{20}$ bytes
1GB = $2^{30}$ bytes
1TB = $2^{40}$ bytes
If I convert 5TB into bytes we get 5,497,558,138,880 bytes
What would be the next step though? I know that 1 byte = 8 bits
This is how I would proceed:
1 TB = 2^40 bytes
Calculate the number of bytes in 5 TB = 5,497,558,138,880 bytes (assume this number is n);
The minimum size of the address register is log2(n), which in this case is 42.321928095 bits. A register has a whole number of bits, so I would round up to 43 bits.
Same logic for the other questions.
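The round-up step can also be done without floating point. A minimal sketch, assuming byte-addressable memory and 1 PB = 2^50 bytes (the question only lists units up to TB); the names `TB`, `PB` and `address_bits` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TB ((uint64_t)1 << 40)    /* 1 TB = 2^40 bytes, as in the question */
#define PB ((uint64_t)1 << 50)    /* 1 PB = 2^50 bytes (assumed, same convention) */

/* Smallest n such that 2^n >= bytes, i.e. ceil(log2(bytes)). */
unsigned address_bits(uint64_t bytes)
{
    assert(bytes >= 1);
    unsigned n = 0;
    while (n < 64 && ((uint64_t)1 << n) < bytes)
        n++;
    return n;
}
```

Under these assumptions, `address_bits(5 * TB)` and `address_bits(7 * TB)` both give 43, and `address_bits(5 * PB / 2)` (2.5 PB) gives 52.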
I suggest you divide by 8.
5,497,558,138,880/8 = 687194767360
Using logarithms, 2^n = 687194767360 therefore log2(687194767360) = n
Therefore n = 39.321928095
The same steps can be used to achieve part b and c
I am programming a fixed-point speech enhancement algorithm on a 16-bit processor. At some point I need to do 32-bit fractional multiplication. I have read other posts about doing 32-bit multiplication byte by byte and I see why this works for Q0.31 formats. But I use different Q formats with varying number of fractional bits.
So I have found out that for fractional bits less than 16, this works:
(low*low >> N) + low*high + high*low + (high*high << N)
where N is the number of fractional bits. I have read that the low*low result should be unsigned as well as the low bytes themselves. In general this gives exactly the result I want in any Q format with less than 16 fractional bits.
Now it gets tricky when the fractional bits are more than 16. I have tried out several numbers of shifts and different shifts for low*low and high*high, and I have tried to put it on paper, but I can't figure it out.
I know it may be very simple but the whole idea eludes me and I would be grateful for some comments or guidelines!
It's the same formula. For N > 16, the shifts just mean you throw out a whole 16-bit word which would have over- or underflowed. low*low >> N means just shift the high word of the 32-bit multiply result right by N-16 bits and add it to the low word of the result. high*high << N means just use the low word of the multiply result, shifted left by N-16 bits, and add it to the high word of the result.
There are a few ideas at play.
First, multiplication of 2 shorter integers to produce a longer product. Consider unsigned multiplication of 2 32-bit integers via multiplications of their 16-bit "halves", each of which produces a 32-bit product and the total product is 64-bit:
a * b = (a_hi * 2^16 + a_lo) * (b_hi * 2^16 + b_lo) =
a_hi * b_hi * 2^32 + (a_hi * b_lo + a_lo * b_hi) * 2^16 + a_lo * b_lo.
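A minimal sketch of this decomposition in C. The function names are hypothetical, and a 64-bit type is used only to sum the partial products; on a real 16-bit target the sum would be kept in separate words with explicit carries.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned 32x32 -> 64 multiply from four 16x16 -> 32 partial products. */
uint64_t mul32x32_u(uint32_t a, uint32_t b)
{
    uint32_t a_hi = a >> 16, a_lo = a & 0xFFFFu;
    uint32_t b_hi = b >> 16, b_lo = b & 0xFFFFu;

    uint32_t hh = a_hi * b_hi;    /* weight 2^32 */
    uint32_t hl = a_hi * b_lo;    /* weight 2^16 */
    uint32_t lh = a_lo * b_hi;    /* weight 2^16 */
    uint32_t ll = a_lo * b_lo;    /* weight 2^0  */

    return ((uint64_t)hh << 32) + ((uint64_t)hl << 16)
         + ((uint64_t)lh << 16) + ll;
}

/* Usage check against the compiler's own 64-bit multiply. */
void check_mul32x32_u(uint32_t a, uint32_t b)
{
    assert(mul32x32_u(a, b) == (uint64_t)a * b);
}
```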
Now, if you need a signed multiplication, you can construct it from unsigned multiplication (e.g. from the above).
Supposing a < 0 and b >= 0, a *signed b must be equal
2^64 - ((-a) *unsigned b), where
-a = 2^32 - a (because this is 2's complement)
IOW,
a *signed b =
2^64 - ((2^32 - a) *unsigned b) =
2^64 + (a *unsigned b) - (b * 2^32), where 2^64 can be discarded since we're using 64 bits only.
In exactly the same way you can calculate a *signed b for a >= 0 and b < 0 and must get a symmetric result:
(a *unsigned b) - (a * 2^32)
You can similarly show that for a < 0 and b < 0 the signed multiplication can be built on top of the unsigned multiplication this way:
(a *unsigned b) - ((a + b) * 2^32)
So, you multiply a and b as unsigned first, then if a < 0, you subtract b from the top 32 bits of the product and if b < 0, you subtract a from the top 32 bits of the product, done.
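The correction rule just described can be sketched as follows. The function names are hypothetical, and for brevity the unsigned product uses the compiler's 64-bit multiply rather than 16-bit halves.

```c
#include <assert.h>
#include <stdint.h>

/* Signed 32x32 -> 64 multiply built on an unsigned multiply plus the
   two corrections to the top 32 bits. */
int64_t mul32x32_s(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;  /* two's complement bit patterns */
    uint64_t p = (uint64_t)ua * ub;               /* unsigned product */

    if (a < 0) p -= (uint64_t)ub << 32;           /* subtract b from the top 32 bits */
    if (b < 0) p -= (uint64_t)ua << 32;           /* subtract a from the top 32 bits */

    /* Converting an out-of-range uint64_t to int64_t is implementation-defined
       before C23; this sketch assumes the usual two's complement wraparound. */
    return (int64_t)p;
}

/* Usage check against the compiler's own signed 64-bit multiply. */
void check_mul32x32_s(int32_t a, int32_t b)
{
    assert(mul32x32_s(a, b) == (int64_t)a * b);
}
```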
Now that we can multiply 32-bit signed integers and get 64-bit signed products, we can finally turn to the fractional stuff.
Suppose now that out of those 32 bits in a and b N bits are used for the fractional part. That means that if you look at a and b as at plain integers, they are going to be 2^N times greater than what they really represent, e.g. 1.0 is going to look like 2^N (or 1 << N).
So, if you multiply two such integers the product is going to be 2^N * 2^N = 2^(2*N) times greater than what it should represent, e.g. 1.0 * 1.0 is going to look like 2^(2*N) (or 1 << (2*N)). IOW, plain integer multiplication is going to double the number of fractional bits. If you want the product to have the same number of fractional bits as in the multiplicands, what do you do? You divide the product by 2^N (or shift it arithmetically N positions right). Simple.
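A minimal sketch of this rescaling, assuming both operands use the same number N of fractional bits and that the rescaled product fits in 32 bits. The name `q_mul` is hypothetical. The sketch divides by 2^N, which rounds toward zero; an arithmetic right shift would round toward negative infinity instead, so the two differ by one unit in the last place for inexact negative products.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply two fixed-point values with n fractional bits each and
   return the product with n fractional bits. */
int32_t q_mul(int32_t a, int32_t b, unsigned n)
{
    assert(n < 32);
    int64_t p = (int64_t)a * b;               /* product has 2*n fractional bits */
    int64_t r = p / ((int64_t)1 << n);        /* back to n fractional bits */
    assert(r >= INT32_MIN && r <= INT32_MAX); /* result must fit (assumption) */
    return (int32_t)r;
}
```

For example, with N = 16, `q_mul(1 << 15, 1 << 15, 16)` multiplies 0.5 by 0.5 and returns `1 << 14`, i.e. 0.25.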
A few words of caution, just in case...
In C (and C++) you cannot legally shift a variable left or right by the same or greater number of bits contained in the variable. The code will compile, but not work as you may expect it to. So, if you want to shift a 32-bit variable, you can shift it by 0 through 31 positions left or right (31 is the max, not 32).
If you shift signed integers left, you cannot overflow the result legally. All signed overflows result in undefined behavior. So, you may want to stick to unsigned.
Right shifts of negative signed integers are implementation-defined. They can do either an arithmetic shift or a logical shift; which one depends on the compiler. So, if you need one of the two, you need to either ensure that your compiler supports it directly or implement it in some other way.
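One way to get an arithmetic right shift without relying on the compiler's choice is to shift only non-negative values. A minimal sketch; the name `asr32` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic (sign-propagating) right shift by s bits, 0 <= s < 32,
   using only right shifts of non-negative values. */
int32_t asr32(int32_t x, unsigned s)
{
    assert(s < 32);               /* shifting by 32 or more is undefined */
    if (x >= 0)
        return x >> s;
    /* For negative x, ~x equals -x - 1 and is non-negative, so shifting it
       is well defined; complementing again restores the sign and yields
       floor(x / 2^s). */
    return ~(~x >> s);
}
```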
I am currently trying to figure out how to multiply two numbers in fixed point representation.
Say my number representation is as follows:
[SIGN][2^0].[2^-1][2^-2]..[2^-14]
In my case, the number 10.01000000000000 = -0.25.
How would I for example do 0.25x0.25 or -0.25x0.25 etc?
Hope you can help!
You should use 2's complement representation instead of a separate sign bit. It's much easier to do maths on that; no special handling is required. The range is also improved because there's no wasted bit pattern for negative 0. To multiply, just do a normal fixed-point multiplication. The Q2.14 format stores the value x/2^14 for the bit pattern x, therefore if we have A and B then
(A/2^14) * (B/2^14) = (A*B/2^14) / 2^14
So you just need to multiply A and B directly, then divide the product by 2^14 to get the result back into the form x/2^14, like this
AxB = ((int32_t)A*B) >> 14;
A rounding step is needed to get the nearest value. You can find the way to do it in the "Math operations" section of the Q number format article. The simplest way to round to nearest is to add back the bit that was last shifted out (i.e. the first fractional bit), like this
AxB = (int32_t)A*B;
AxB = (AxB >> 14) + ((AxB >> 13) & 1);
You might also want to read these
Fixed-point arithmetic.
Emulated Fixed Point Division/Multiplication
Fixed point math in c#?
With 2 bits you can represent the integer range of [-2, 1]. So using Q2.14 format, -0.25 would be stored as 11.11000000000000. Using 1 sign bit you can only represent -1, 0, 1, and it makes calculations more complex because you need to split the sign bit then combine it back at the end.
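The products asked about in the question can be worked through with a small sketch of the rounding multiply above. The names `Q14_ONE` and `q14_mul` are hypothetical, and the sketch assumes that `>>` on a negative `int32_t` is an arithmetic shift, which the C standard leaves implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

#define Q14_ONE (1 << 14)         /* 1.0 in Q2.14 */

/* Multiply two Q2.14 values in two's complement and round by adding
   back the last bit shifted out. */
int16_t q14_mul(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * b;                   /* 28 fractional bits */
    int32_t r = (p >> 14) + ((p >> 13) & 1);      /* back to 14, rounded */
    assert(r >= INT16_MIN && r <= INT16_MAX);     /* e.g. -2.0 * -2.0 does not fit */
    return (int16_t)r;
}
```

Here 0.25 is stored as 4096, so `q14_mul(4096, 4096)` returns 1024 (0.0625) and, under the arithmetic-shift assumption, `q14_mul(-4096, 4096)` returns -1024 (-0.0625).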
Multiply into a larger sized variable, and then right shift by the number of bits of fixed point precision.
Here's a simple example in C:
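#include <stdint.h>
#include <stdio.h>
int main(void) {
    int32_t a = 0.25 * (1 << 16);
    int32_t b = -0.25 * (1 << 16);
    int32_t c = (int32_t)(((int64_t)a * b) >> 16); /* 64-bit product, then shift back */
    printf("%.2f * %.2f = %.2f\n", a / 65536.0, b / 65536.0, c / 65536.0);
}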
You basically multiply everything by a constant to bring the fractional parts up into the integer range, then multiply the two factors, then (optionally) divide by one of the constants to return the product to the standard range for use in future calculations. It's like multiplying prices expressed in fractional dollars by 100 and then working in cents (i.e. $1.95 * 100 cents/dollar = 195 cents).
Be careful not to overflow the range of the variable you are multiplying into. Your constant might need to be smaller to avoid overflow, like using 1 << 8 instead of 1 << 16 in the example above.