Size of buffer to hold base58 encoded data - bitcoin

When trying to understand how base58check works, I looked at the reference implementation used by Bitcoin. When calculating the size needed to hold a base58-encoded string, it uses the following formula:
// https://github.com/bitcoin/libbase58/blob/master/base58.c#L155
size = (binsz - zcount) * 138 / 100 + 1;
where binsz is the size of the input buffer to encode, and zcount is the number of leading zero bytes in the buffer. Where do the 138 and 100 come from, and why?

tl;dr
It’s a formula to approximate the output size during base58 <-> base256 conversion.
i.e. the encoding/decoding parts where you’re multiplying and mod’ing by 256 and 58
Encoding output is ~138% of the input size (+1/rounded up):
n * log(256) / log(58) + 1
(n * 138 / 100 + 1)
Decoding output is ~73% of the input size (+1/rounded up):
n * log(58) / log(256) + 1
(n * 733 / 1000 + 1)
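As a quick sanity check of those constants (a standalone C snippet, not part of libbase58): log(256)/log(58) ≈ 1.3657, so 138/100 = 1.38 always over-allocates slightly for encoding, and log(58)/log(256) ≈ 0.7322, so 733/1000 does the same for decoding.
#include <math.h>
#include <stdio.h>

int main(void)
{
    /* bytes -> base58 expansion factor: ~1.3657, safely below 138/100 = 1.38 */
    printf("encode factor: %f\n", log(256.0) / log(58.0));
    /* base58 -> bytes factor: ~0.7322, safely below 733/1000 = 0.733 */
    printf("decode factor: %f\n", log(58.0) / log(256.0));
    return 0;
}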

Related

Modulo arithmetic in Bigquery. Compute `x % y`, where `x` is a 128-bit number

Taking the MD5 of a string as a 128-bit representation of an integer x, how do I compute x % y in Google Bigquery, where y will typically be relatively small (approx 1000)?
Bigquery has an MD5 function, returning a result of type BYTES with 16 bytes (i.e. 128 bits).
(Background: this is to compute deterministic pseudo random numbers. However, for legacy and compatibility reasons, I have no flexibility in the algorithm! Even though we know it has a (very slight) bias.)
This needs to be done millions/billions of times every day for different input strings and different moduli, so hopefully it can be done efficiently. As a fallback, I can compute it externally in another language and upload the results to Bigquery afterwards, but it would be great if I could do this directly in Bigquery.
I have studied a lot of number theory, so maybe we can use some mathematical tricks. However, I'm still stuck on more basic BigQuery issues:
How do I convert a bytes array to some sort of "big integer" type?
Can I access a subrange of the bytes from a BYTES array?
Given one byte (or maybe four bytes?), can I convert it to an integer type on which I can apply arithmetic operations?
With the power of math and a longish SQL function:
CREATE TEMP FUNCTION modulo_md5(str ANY TYPE, m ANY TYPE) AS ((
SELECT MOD(MOD(MOD(MOD(MOD(MOD(MOD(MOD(MOD(MOD(MOD(MOD(MOD(MOD(MOD(MOD(0
* 256 + num[OFFSET(0)], m )
* 256 + num[OFFSET(1)], m )
* 256 + num[OFFSET(2)], m )
* 256 + num[OFFSET(3)], m )
* 256 + num[OFFSET(4)], m )
* 256 + num[OFFSET(5)], m )
* 256 + num[OFFSET(6)], m )
* 256 + num[OFFSET(7)], m )
* 256 + num[OFFSET(8)], m )
* 256 + num[OFFSET(9)], m )
* 256 + num[OFFSET(10)], m )
* 256 + num[OFFSET(11)], m )
* 256 + num[OFFSET(12)], m )
* 256 + num[OFFSET(13)], m )
* 256 + num[OFFSET(14)], m )
* 256 + num[OFFSET(15)], m )
FROM (SELECT TO_CODE_POINTS(MD5(str)) num)
));
SELECT title, modulo_md5(title, 177) result, TO_HEX(MD5(title)) md5
FROM `fh-bigquery.wikipedia_v3.pageviews_2019`
WHERE wiki='en'
LIMIT 100000
And now you can use it as a persistent shared UDF:
SELECT fhoffa.x.modulo_md5("any string", 177) result
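What the nested MODs compute is Horner's method: the 16 MD5 bytes are treated as the digits of a base-256 number, which is reduced modulo m one digit at a time so that no intermediate value ever gets large. A rough equivalent outside of BigQuery (illustrative C; mod_bytes is a made-up helper name, and it assumes m * 256 fits in 64 bits, which easily covers m ≈ 1000):
#include <stdint.h>
#include <stddef.h>

/* Reduce a big-endian byte string modulo m with Horner's method,
   mirroring the chain of MOD(... * 256 + num[OFFSET(i)], m) calls above. */
static uint64_t mod_bytes(const uint8_t *bytes, size_t len, uint64_t m)
{
    uint64_t r = 0;
    for (size_t i = 0; i < len; i++)
        r = (r * 256 + bytes[i]) % m;   /* r stays below m after every step */
    return r;
}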

Convert 8 bytes into a Double in VB.net

I'm reading an ancient data file that is basically a flattened object store with type flags - for instance, 1=Int16, 2=Int32. To read the Int16s, for instance, I read 2 bytes out of the stream and then did this:
If B.Length >= 2 + Offset Then
    Ans = Convert.ToUInt16(B(1 + Offset) * 256 + B(0 + Offset))
End If
Now I am at a bit of a loss as to how to do the 3=Double. These are 8-byte values, IEEE-754 I assume. There is a Convert.ToDouble(Byte), but that's not the same thing: it just returns a Double containing a value from 0 to 255. Likewise, Convert.ToDouble(Int64) basically just casts the value to Double.
So what's the trick here? I found threads for doing it in VB6 and C, but not VB.net.
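For what it's worth, the underlying idea is reinterpretation rather than numeric conversion: the 8 bytes are the bit pattern of an IEEE-754 double, so they have to be copied into a Double unchanged (in .NET, BitConverter.ToDouble reads a double that way from a byte array). A minimal sketch of the idea in C, assuming the file's byte order matches the host's:
#include <stdint.h>
#include <string.h>

/* Reinterpret 8 raw bytes as an IEEE-754 double without altering the bits. */
static double bytes_to_double(const uint8_t b[8])
{
    double d;
    memcpy(&d, b, sizeof d);   /* copies the bit pattern; no numeric conversion */
    return d;
}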

Precision of div in SQL

select 15000000.0000000000000 / 6060802.6136561442650
gives 2.47491973525125848
How can I get 2.4749197352512584803724193507358?
Thanks a lot
You can't, because of the result rules for determining precision and scale. In fact, your scale is so large that there's no way to shift the result (i.e., not even by specifying no scale for the left operand).
First...
The decimal data type supports precision up to 38 digits
... but "precision" here means the total number of digits. Which, yes, your result should fit, but the engine won't shift things for you. The relevant rule is:
Operation   Result precision                      Result scale *
e1 / e2     p1 - s1 + s2 + max(6, s1 + p2 + 1)    max(6, s1 + p2 + 1)

* The result precision and scale have an absolute maximum of 38. When a
  result precision is greater than 38, the corresponding scale is reduced
  to prevent the integral part of a result from being truncated.
.... you're running afoul of the last note there. Here, let's run the numbers.
Your operands have precisions (total digits) of 21 and 20 (p1 and p2, respectively)
Your operands have scales (digits after the decimal) of 13 (s1 and s2)
So:
21 - 13 + 13 + max(6, 13 + 20 + 1) <- The bit in max is the scale, too
21 + max(6, 34)
21 + 34
= 55, with a scale of 34
... except 55 > 38. So the number of digits needs to be reduced, and because the least-significant digits matter least, they are dropped from the scale (which also reduces the precision):
55 - 38 = 17 <- difference
55 - 17 = 38 <- final precision
34 - 17 = 17 <- final scale
Now, if we count the number of digits from the answer it gives you, .47491973525125848, you'll get 17 digits.
SQL Server can store decimal numbers with a maximum precision of 38.
SELECT CONVERT(decimal(38,37), 15000000.0000000000000 / 6060802.6136561442650) AS TestValue
returns 2.4749197352512584800000000000000000000.
If there is a pattern in the first operand, you may save some precision with a reformulation such as
select 1000000 * (15 / 6060802.6136561442650)
I can't test it in SQL Server; I only have Oracle available, and there I get
2,47491973525125848037241935073575410941

Explanation of Processing Image to Byte Array

Can someone explain to me how an image is converted to a byte array?
I need the theory.
I want to use the image for AES encryption (VB.NET), so after I use the OpenFileDialog, my app will load the image and then process it into a byte array, but I need an explanation of that process (how pixels turn into a byte array).
Thanks for the answer and sorry for the beginner question.
A reference link is accepted :)
When you read the bytes from the image file via File.ReadAllBytes(), their meaning depends on the image's file format.
The image file format (e.g. Bitmap, PNG, JPEG2000) defines how pixel values are converted to bytes, and conversely, how you get pixel values back from bytes.
The PNG and JPEG formats are compressed formats, so it would be difficult for you to write code to do that. For Bitmaps, it would be rather easy because it's a simple format. (See Wikipedia.)
But it's much simpler. You can just use .NET's Bitmap class to load any common image file into memory and then use Bitmap.GetPixel() to access pixels via their x,y coordinates.
Bitmap.GetPixel() is slow for larger images, though. To speed this up, you'll want to access the raw representation of the pixels directly in memory. No matter what kind of image you load with the Bitmap class, it always creates a bitmap representation of it in memory, whose exact layout depends on Bitmap.PixelFormat. You can access it using a pattern like the following. The workflow would be:
Copy memory bitmap to byte array using Bitmap.LockBits() and Marshal.Copy().
Extract the R, G, B values from the byte array, e.g. with this formula in the case of PixelFormat.Format24bppRgb:
// Access the pixel at (x, y); bytes[] holds the data copied from bitmapData.Scan0
B = bytes[y * bitmapData.Stride + x * 3 + 0]
G = bytes[y * bitmapData.Stride + x * 3 + 1]
R = bytes[y * bitmapData.Stride + x * 3 + 2]
Or for PixelFormat.Format32bppArgb:
// Access the pixel at (x, y); bytes[] holds the data copied from bitmapData.Scan0
B = bytes[y * bitmapData.Stride + x * 4 + 0]
G = bytes[y * bitmapData.Stride + x * 4 + 1]
R = bytes[y * bitmapData.Stride + x * 4 + 2]
A = bytes[y * bitmapData.Stride + x * 4 + 3]
Each pixel is made up of 3 or 4 bytes, depending on the image's format. Some formats use 3 bytes per pixel (Red, Green, and Blue); others require 4 bytes (an Alpha channel plus R, G, and B).
You may use something like:
Dim NewByteArray As Byte() = File.ReadAllBytes("c:\folder\image")
NewByteArray will be filled with every byte of the image file, and you can feed those bytes to AES regardless of their position or meaning.

32-bit fractional multiplication with cross-multiplication method (no 64-bit intermediate result)

I am programming a fixed-point speech enhancement algorithm on a 16-bit processor. At some point I need to do 32-bit fractional multiplication. I have read other posts about doing 32-bit multiplication byte by byte and I see why this works for Q0.31 formats. But I use different Q formats with varying number of fractional bits.
So I have found out that for fractional bits less than 16, this works:
(low*low >> N) + low*high + high*low + (high*high << N)
where N is the number of fractional bits. I have read that the low*low result should be unsigned, as should the low parts themselves. In general this gives exactly the result I want in any Q format with fewer than 16 fractional bits.
Now it gets tricky when there are more than 16 fractional bits. I have tried several shift amounts, including different shifts for low*low and high*high, and I have tried to work it out on paper, but I can't figure it out.
I know it may be very simple but the whole idea eludes me and I would be grateful for some comments or guidelines!
It's the same formula. For N > 16, the shifts just mean you throw out a whole 16-bit word that would have over- or underflowed. low*low >> N means: take the high word of the 32-bit product, shift it right by N - 16, and add it to the low word of the result. high*high << N means: take the low word of the 32-bit product, shift it left by N - 16, and add it to the high word of the result.
There are a few ideas at play.
First, multiplication of 2 shorter integers to produce a longer product. Consider unsigned multiplication of 2 32-bit integers via multiplications of their 16-bit "halves", each of which produces a 32-bit product and the total product is 64-bit:
a * b = (a_hi * 2^16 + a_lo) * (b_hi * 2^16 + b_lo) =
a_hi * b_hi * 2^32 + (a_hi * b_lo + a_lo * b_hi) * 2^16 + a_lo * b_lo.
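A minimal C sketch of that cross-multiplication, producing the 64-bit product as two 32-bit halves using only 16x16 -> 32 multiplies (umul32_wide is a made-up name for illustration):
#include <stdint.h>

static void umul32_wide(uint32_t a, uint32_t b, uint32_t *hi, uint32_t *lo)
{
    uint16_t a_lo = (uint16_t)a, a_hi = (uint16_t)(a >> 16);
    uint16_t b_lo = (uint16_t)b, b_hi = (uint16_t)(b >> 16);

    uint32_t ll = (uint32_t)a_lo * b_lo;   /* bits  0..31 of the product */
    uint32_t lh = (uint32_t)a_lo * b_hi;   /* bits 16..47 */
    uint32_t hl = (uint32_t)a_hi * b_lo;   /* bits 16..47 */
    uint32_t hh = (uint32_t)a_hi * b_hi;   /* bits 32..63 */

    /* Sum the middle terms together with the high half of ll (the carries). */
    uint32_t mid = (ll >> 16) + (lh & 0xFFFF) + (hl & 0xFFFF);

    *lo = (ll & 0xFFFF) | (mid << 16);
    *hi = hh + (lh >> 16) + (hl >> 16) + (mid >> 16);
}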
Now, if you need a signed multiplication, you can construct it from unsigned multiplication (e.g. from the above).
Supposing a < 0 and b >= 0, a *signed b must equal
2^64 - ((-a) *unsigned b), where
-a = 2^32 - a (because this is 2's complement)
IOW,
a *signed b =
2^64 - ((2^32 - a) *unsigned b) =
2^64 + (a *unsigned b) - (b * 2^32), where the 2^64 can be discarded since we're using 64 bits only.
In exactly the same way you can calculate a *signed b for a >= 0 and b < 0 and must get a symmetric result:
(a *unsigned b) - (a * 2^32)
You can similarly show that for a < 0 and b < 0 the signed multiplication can be built on top of the unsigned multiplication this way:
(a *unsigned b) - ((a + b) * 2^32)
So, you multiply a and b as unsigned first, then if a < 0, you subtract b from the top 32 bits of the product and if b < 0, you subtract a from the top 32 bits of the product, done.
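In code, that correction is just two conditional subtractions on the high word of the product (a sketch building on the umul32_wide helper above):
/* Signed 32x32 -> 64 multiply built on top of the unsigned umul32_wide(). */
static void smul32_wide(int32_t a, int32_t b, uint32_t *hi, uint32_t *lo)
{
    umul32_wide((uint32_t)a, (uint32_t)b, hi, lo);
    if (a < 0) *hi -= (uint32_t)b;   /* subtract b from the top 32 bits */
    if (b < 0) *hi -= (uint32_t)a;   /* subtract a from the top 32 bits */
}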
Now that we can multiply 32-bit signed integers and get 64-bit signed products, we can finally turn to the fractional stuff.
Suppose now that out of those 32 bits in a and b, N bits are used for the fractional part. That means that if you look at a and b as plain integers, they are going to be 2^N times greater than what they really represent, e.g. 1.0 is going to look like 2^N (or 1 << N).
So, if you multiply two such integers, the product is going to be 2^N * 2^N = 2^(2*N) times greater than what it should represent, e.g. 1.0 * 1.0 is going to look like 2^(2*N) (or 1 << (2*N)). IOW, plain integer multiplication is going to double the number of fractional bits. If you want the product to have the same number of fractional bits as the multiplicands, what do you do? You divide the product by 2^N (or shift it arithmetically N positions to the right). Simple.
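Putting it together for an N-fractional-bit format, the final shift is done across the two 32-bit halves of the product (a sketch assuming the helpers above and 0 < N < 32):
/* QN * QN -> QN multiply using the 64-bit product held in (hi, lo). */
static int32_t qmul(int32_t a, int32_t b, unsigned N)
{
    uint32_t hi, lo;
    smul32_wide(a, b, &hi, &lo);
    /* Shift the 64-bit product right by N: drop the low N bits of lo and
       bring in the low N bits of hi at the top. Valid for 0 < N < 32. */
    return (int32_t)((lo >> N) | (hi << (32 - N)));
}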
A few words of caution, just in case...
In C (and C++) you cannot legally shift a variable left or right by the same or greater number of bits contained in the variable. The code will compile, but not work as you may expect it to. So, if you want to shift a 32-bit variable, you can shift it by 0 through 31 positions left or right (31 is the max, not 32).
If you shift signed integers left, you cannot overflow the result legally. All signed overflows result in undefined behavior. So, you may want to stick to unsigned.
Right shifts of negative signed integers are implementation-defined. They can do either an arithmetic shift or a logical shift, and which one you get depends on the compiler. So, if you need one of the two, you need to either ensure that your compiler supports it directly or implement it in some other way.