Building FLOAT64 out of the FLOAT32 IEEE 754 hex representation in Bigquery - google-bigquery

I would like to build a FLOAT64 out of a FLOAT32 IEEE 754 hex representation in BigQuery.
Here's what I've done so far. Is there a better performing, more integrated, safer alternative?
WITH
T1 AS (
SELECT
0x443dd04f float32_repr -- hex repr of 759.25482177734375
),
T2 AS (
SELECT
IF (float32_repr>> 31=0, 1, -1) my_sign,
(float32_repr& 0x7f800000) >> 23 my_exponent,
float32_repr& 0x007fffff my_mantissa,
FROM
T1 )
SELECT
my_sign*POW(2,my_exponent- 127)* (1+my_mantissa/(1<<23)) my_value
FROM
T2
-- returns 759.25482177734375
I would also like to know how to do it for FLOAT16 and FLOAT64 representations.

Using the JavaScript implementation at the link below, I think you can define a BigQuery UDF to parse the FLOAT32 IEEE 754 hex representation.
https://gist.github.com/laerciobernardo/498f7ba1c269208799498ea8805d8c30
CREATE TEMP FUNCTION parseFloat(str STRING) RETURNS FLOAT64
LANGUAGE js AS r"""
var float = 0, sign, order, mantissa, exp,
int = 0, multi = 1;
if (/^0x/.exec(str)) {
int = parseInt(str,16);
}else{
for (var i = str.length -1; i >=0; i -= 1) {
if (str.charCodeAt(i)>255) {
console.log('Wrong string parametr');
return false;
}
int += str.charCodeAt(i) * multi;
multi *= 256;
}
}
sign = (int>>>31)?-1:1;
exp = (int >>> 23 & 0xff) - 127;
mantissa = ((int & 0x7fffff) + 0x800000).toString(2);
for (i=0; i<mantissa.length; i+=1){
float += parseInt(mantissa[i])? Math.pow(2,exp):0;
exp--;
}
return float*sign;
""";
SELECT parseFloat('0x443dd04f');
+-----+--------------------+
| Row | f0_ |
+-----+--------------------+
| 1 | 759.25482177734375 |
+-----+--------------------+
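For the FLOAT16 and FLOAT64 part of the question, here is a minimal sketch outside BigQuery, in Python, that can be used to cross-check results. It assumes big-endian hex strings; the helper name decode_ieee754 and its bits parameter are made up for this illustration. Unlike the sign/exponent/mantissa formula in the question, struct also handles the special exponent values (zero, subnormals, infinities, NaN).

```python
import struct

# Bit width -> struct format code (big-endian): half, single, double precision.
_FORMATS = {16: ">e", 32: ">f", 64: ">d"}


def decode_ieee754(hex_repr: str, bits: int = 32) -> float:
    """Decode a hex string such as '0x443dd04f' into a Python float.

    decode_ieee754 is a hypothetical helper name, not a BigQuery function.
    """
    digits = hex_repr[2:] if hex_repr.lower().startswith("0x") else hex_repr
    # Left-pad with zeros so the string always covers the full bit width.
    raw = bytes.fromhex(digits.zfill(bits // 4))
    return struct.unpack(_FORMATS[bits], raw)[0]


print(decode_ieee754("0x443dd04f"))            # the FLOAT32 value from the question
print(decode_ieee754("3c00", bits=16))         # a FLOAT16 pattern
print(decode_ieee754("3ff0000000000000", 64))  # a FLOAT64 pattern
```

Python floats are already 64-bit, so the decoded FLOAT32 value is widened exactly, which is the same thing the FLOAT64 result in BigQuery represents.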

Related

Different FFT results from Matlab fft and Objective-c fft

Here is my code in matlab:
x = [1 2 3 4];
result = fft(x);
a = real(result);
b = imag(result);
Result from matlab:
a = [10,-2,-2,-2]
b = [ 0, 2, 0,-2]
And my runnable code in objective-c:
int length = 4;
float* x = (float *)malloc(sizeof(float) * length);
x[0] = 1;
x[1] = 2;
x[2] = 3;
x[3] = 4;
// Setup the length
vDSP_Length log2n = log2f(length);
// Calculate the weights array. This is a one-off operation.
FFTSetup fftSetup = vDSP_create_fftsetup(log2n, FFT_RADIX2);
// For an FFT, numSamples must be a power of 2, i.e. is always even
int nOver2 = length/2;
// Define complex buffer
COMPLEX_SPLIT A;
A.realp = (float *) malloc(nOver2*sizeof(float));
A.imagp = (float *) malloc(nOver2*sizeof(float));
// Generate a split complex vector from the sample data
vDSP_ctoz((COMPLEX*)x, 2, &A, 1, nOver2);
// Perform a forward FFT using fftSetup and A
vDSP_fft_zrip(fftSetup, &A, 1, log2n, FFT_FORWARD);
//Take the fft and scale appropriately
Float32 mFFTNormFactor = 0.5;
vDSP_vsmul(A.realp, 1, &mFFTNormFactor, A.realp, 1, nOver2);
vDSP_vsmul(A.imagp, 1, &mFFTNormFactor, A.imagp, 1, nOver2);
printf("After FFT: \n");
printf("%.2f | %.2f \n",A.realp[0], 0.0);
for (int i = 1; i< nOver2; i++) {
printf("%.2f | %.2f \n",A.realp[i], A.imagp[i]);
}
printf("%.2f | %.2f \n",A.imagp[0], 0.0);
The output from objective c:
After FFT:
10.0 | 0.0
-2.0 | 2.0
The results are so close. I wonder where the rest is? I know I missed something but I don't know what it is.
Updated: I found another answer here. I updated the output:
After FFT:
10.0 | 0.0
-2.0 | 2.0
-2.0 | 0.0
but even then there's still 1 element missing: -2.0 | -2.0
Performing a FFT delivers a right hand spectrum and a left hand spectrum.
If you have N samples the frequencies you will return are:
( -f(N/2), -f(N/2-1), ..., -f(1), f(0), f(1), f(2), ..., f(N/2-1) )
If A(f(i)) is the complex amplitude A of the frequency component f(i) the following relation is true:
Real{A(f(i))} = Real{A(-f(i))} and Imag{A(f(i))} = -Imag{A(-f(i))}
This means, the information of the right hand spectrum and the left hand spectrum is the same. However, the sign of the imaginary part is different.
Matlab returns the frequency in a different order.
Matlab order is:
( f(0), f(1), f(2), ..., f(N/2-1), -f(N/2), -f(N/2-1), ..., -f(1) )
To get the upper order use the Matlab function fftshift().
In the case of 4 Samples you have got in Matlab:
a = [10,-2,-2,-2]
b = [ 0, 2, 0,-2]
This means:
A(f(0)) = 10 (DC value)
A(f(1)) = -2 + 2i (first frequency component of the right hand spectrum)
A(-f(2)) = -2 (second frequency component of the left hand spectrum)
A(-f(1)) = -2 - 2i (first frequency component of the left hand spectrum)
I do not understand your Objective-C code.
However, it seems to me that the program returns the right hand spectrum only.
So everything is fine.
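To make the symmetry concrete, here is a small sketch in Python (not Objective-C) that computes the full 4-point transform with a naive DFT. The function name dft is just for illustration. The last bin is the complex conjugate of bin 1, which is why a real-input FFT routine can skip it without losing information.

```python
import cmath


def dft(samples):
    """Naive O(N^2) discrete Fourier transform, bins in MATLAB's fft() order."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


spectrum = dft([1, 2, 3, 4])
for k, value in enumerate(spectrum):
    # Rounding only hides floating point noise in the printout.
    print(k, round(value.real, 6), round(value.imag, 6))
```

The missing element -2.0 | -2.0 can therefore be rebuilt from the bin -2.0 | 2.0 by flipping the sign of the imaginary part.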

Randomize float using arc4random?

I have a float and I am trying to get a random number between 1.5 - 2. I have seen tutorials on the web but all of them are doing the randomization for 0 to a number instead of 1.5 in my case. I know it is possible but I have been scratching my head on how to actually accomplish this. Can anyone help me?
Edit1: I found the following method on the web but I do not want all these decimal places. I only want things like 5.2 or 7.4 etc...
How would I adjust this method to do that?
-(float)randomFloatBetween:(float)num1 andLargerFloat:(float)num2
{
int startVal = num1*10000;
int endVal = num2*10000;
int randomValue = startVal + (arc4random() % (endVal - startVal));
float a = randomValue;
return (a / 10000.0);
}
Edit2: Ok so now my method is like this:
-(float)randomFloatBetween:(float)num1 andLargerFloat:(float)num2
{
float range = num2 - num1;
float val = ((float)arc4random() / ARC4RANDOM_MAX) * range + num1;
return val;
}
Will this produce numbers like 1.624566 etc..? Because I only want say 1.5,1.6,1.7,1.8,1.9, and 2.0.
You can just produce a random float from 0 to 0.5 and add 1.5.
EDIT:
You're on the right track. I would use the maximum random value possible as your divisor in order to get the smallest intervals you can between possible values, rather than this arbitrary division by 10,000 thing you have going on. So, define the maximum value of arc4random() as a macro (I just found this online):
#define ARC4RANDOM_MAX 0x100000000
Then to get a value between 1.5 and 2.0:
float range = num2 - num1;
float val = ((float)arc4random() / ARC4RANDOM_MAX) * range + num1;
return val;
This will also give you double precision if you want it (just replace float with double.)
EDIT AGAIN:
Yes, of course this will give you values with more than one decimal place. If you want only one, just produce a random integer from 15 to 20 and divide by 10. Or you could just hack off the extra places afterward:
float range = num2 - num1;
float val = ((float)arc4random() / ARC4RANDOM_MAX) * range + num1;
int val1 = val * 10;
float val2= (float)val1 / 10.0f;
return val2;
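The "random integer from 15 to 20 divided by 10" idea can be sketched as follows, in Python for brevity. The name random_tenth and its parameters are invented for this example. One reason to prefer it over truncating afterwards: with truncation the upper bound 2.0 almost never appears, while here every step from 1.5 to 2.0 is equally likely.

```python
import random


def random_tenth(low_tenths: int, high_tenths: int) -> float:
    """Return a uniformly chosen multiple of 0.1 between the bounds, inclusive.

    Bounds are given in tenths, so (15, 20) means 1.5 to 2.0.
    """
    return random.randint(low_tenths, high_tenths) / 10


values = {random_tenth(15, 20) for _ in range(1000)}
print(sorted(values))
```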
arc4random is a 32-bit generator. It generates Uint32's. The maximum value of arc4random() is UINT_MAX. (Do not use ULONG_MAX!)
The simplest way to do this is:
// Generates a random float between 0 and 1
inline float randFloat()
{
return (float)arc4random() / UINT_MAX ;
}
// Generates a random float between imin and imax
inline float randFloat( float imin, float imax )
{
return imin + (imax-imin)*randFloat() ;
}
// between low and (high-1)
inline int randInt( int low, int high )
{
return low + arc4random() % (high-low) ; // Do not talk to me
// about "modulo bias" unless you're writing a casino generator
// or if the "range" between high and low is around 1 million.
}
This should work for you:
float mon_rand() {
const u_int32_t r = arc4random();
const double Min = 1.5;
if (0 != r) {
const double rUInt32Max = 1.0 / UINT32_MAX;
const double dr = (double)r;
/* 0...1 */
const double nr = dr * rUInt32Max;
/* 0...0.5 */
const double h = nr * 0.5;
const double result = Min + h;
return (float)result;
}
else {
return (float)Min;
}
}
That was the simplest I could think of, when I had the same "problem" and it worked for me:
// For values from 0.0 to 1.0
float n;
n = (float)((arc4random() % 11) * 0.1);
And in your case, from 1.5 to 2.0:
float n;
n = (float)((arc4random() % 6) * 0.1);
n += 15 * 0.1;
For anybody who wants more digits:
If you just want float, instead of arc4random(3) it would be easier if you use rand48(3):
// Seed (only once)
srand48(arc4random()); // or time(NULL) as seed
double x = drand48();
The drand48() and erand48() functions return non-negative, double-precision, floating-point values, uniformly distributed over the interval [0.0 , 1.0].
Taken from this answer.

Divide int's and round up in Objective-C

I have 2 int's. How do I divide one by the other and then round up afterwards?
If your ints are A and B and you want to have ceil(A/B) just calculate (A+B-1)/B. This assumes A is non-negative and B is positive.
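A quick sketch of that formula, written in Python where // is integer division. The name ceil_div is just for illustration, and the inputs are assumed to satisfy a >= 0 and b > 0.

```python
def ceil_div(a: int, b: int) -> int:
    """Ceiling of a / b for a >= 0 and b > 0, using only integer arithmetic."""
    # Adding b - 1 pushes any non-zero remainder over the next multiple of b.
    return (a + b - 1) // b


for a, b in [(7, 2), (8, 2), (1, 3), (0, 5)]:
    print(a, b, ceil_div(a, b))
```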
What about:
float A, B; // these variables have to be floats!
int roundedDown = floor(A/B); // rounded down
int roundedUp = ceil(A/B); // rounded up
-(NSInteger)divideAndRoundUp:(NSInteger)a with:(NSInteger)b
{
if( a % b != 0 )
{
return a / b + 1;
}
return a / b;
}
As in C, you can cast both to float and then round the result using a rounding function that takes a float as input.
int a = 1;
int b = 2;
float result = (float)a / (float)b;
int rounded = (int)(result+0.5f);
If you are looking to round up, e.g. 2.1 to 3:
// Divide by 3.0 so the division happens in floating point before ceil().
double rows = ceil(_datas.count / 3.0);

Divide integer by 16 without using division or cast

OKAY... let me rephrase this question...
How can I obtain x 16ths of an integer without using division or casting to double....
int res = (ref * frac) >> 4
(but worry a bit about overflow. How big can ref and frac get? If the product could overflow, cast to a wider integer type first)
In any operation of this kind it makes sense to multiply first, then divide. Now, if your operands are integers and you are using a compiled language (e.g. C), use shr 4 instead of /16 - this will save some processor cycles.
Assuming everything here are ints, any optimizing compiler worth its salt will notice 16 is a power of two, and shift frac accordingly -- so long as optimizations are turned on. Worry more about major optimizations the compiler can't do for you.
If anything, you should bracket ref * frac and then have the divide, as any value of frac less than 16 will result in 0, whether by shift or divide.
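Here is a small sketch of why the order matters, in Python (both function names are made up for the example). Python integers do not overflow, so the overflow concern mentioned above does not show up here; in C the product would need a wide enough type.

```python
def sixteenths(ref: int, frac: int) -> int:
    """Return frac/16 of ref, truncated, multiplying before the shift."""
    return (ref * frac) >> 4


def sixteenths_shift_first(ref: int, frac: int) -> int:
    """Same idea with the shift applied to frac first: loses the low 4 bits."""
    return ref * (frac >> 4)


print(sixteenths(100, 4))              # 4/16 of 100
print(sixteenths_shift_first(100, 4))  # frac >> 4 is already 0 here
```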
You can use left shift or right shift:
public static final long divisionUsingMultiplication(int a, int b) {
int temp = b;
int counter = 0;
while (temp <= a) {
temp = temp<<1;
counter++;
}
a -= b<<(counter-1);
long result = (long)Math.pow(2, counter-1);
if (b <= a) result += divisionUsingMultiplication(a,b);
return result;
}
public static final long divisionUsingShift(int a, int b) {
int absA = Math.abs(a);
int absB = Math.abs(b);
int x, y, counter;
long result = 0L;
while (absA >= absB) {
x = absA >> 1;
y = absB;
counter = 1;
while (x >= y) {
y <<= 1;
counter <<= 1;
}
absA -= y;
result += counter;
}
return (a>0&&b>0 || a<0&&b<0)?result:-result;
}
I don't understand the constraint, but this pseudo code rounds up (?):
res = 0
ref= 10
frac = 2
denominator = 16
temp = frac * ref
while temp > 0
temp -= denominator
res += 1
repeat
echo res

What is the best way to add two numbers without using the + operator?

A friend and I are going back and forth with brain-teasers and I have no idea how to solve this one. My assumption is that it's possible with some bitwise operators, but not sure.
In C, with bitwise operators:
#include<stdio.h>
int add(int x, int y) {
int a, b;
do {
a = x & y;
b = x ^ y;
x = a << 1;
y = b;
} while (a);
return b;
}
int main( void ){
printf( "2 + 3 = %d", add(2,3));
return 0;
}
XOR (x ^ y) is addition without carry. (x & y) is the carry-out from each bit. (x & y) << 1 is the carry-in to each bit.
The loop keeps adding the carries until the carry is zero for all bits.
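To watch the carries propagate, here is a sketch of the same loop in Python that records every intermediate (sum, carry) pair. It assumes non-negative inputs, because Python integers are unbounded and the loop would not terminate for mixed signs. The name add_trace is only for illustration.

```python
def add_trace(x: int, y: int):
    """Add two non-negative ints with XOR/AND, recording each (sum, carry) step."""
    steps = []
    while y:
        # XOR adds without carry; AND shifted left is the carry into the next bit.
        x, y = x ^ y, (x & y) << 1
        steps.append((format(x, "04b"), format(y, "04b")))
    return x, steps


total, steps = add_trace(7, 3)
print(total)
for partial_sum, carry in steps:
    print(partial_sum, carry)
```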
int add(int a, int b) {
const char *c=0;
return &(&c[a])[b];
}
No + right?
int add(int a, int b)
{
return -(-a) - (-b);
}
CMS's add() function is beautiful. It should not be sullied by unary negation (a non-bitwise operation, tantamount to using addition: -y==(~y)+1). So here's a subtraction function using the same bitwise-only design:
int sub(int x, int y) {
unsigned a, b;
do {
a = ~x & y;
b = x ^ y;
x = b;
y = a << 1;
} while (a);
return b;
}
Define "best". Here's a python version:
len(range(x)+range(y))
The + performs list concatenation, not addition.
Java solution with bitwise operators:
// Recursive solution
public static int addR(int x, int y) {
if (y == 0) return x;
int sum = x ^ y; //SUM of two integer is X XOR Y
int carry = (x & y) << 1; //CARRY of two integer is X AND Y
return addR(sum, carry);
}
//Iterative solution
public static int addI(int x, int y) {
while (y != 0) {
int carry = (x & y); //CARRY is AND of two bits
x = x ^ y; //SUM of two bits is X XOR Y
y = carry << 1; //shifts carry to 1 bit to calculate sum
}
return x;
}
Cheat. You could negate the number and subtract it from the first :)
Failing that, look up how a binary adder works. :)
EDIT: Ah, saw your comment after I posted.
Details of binary addition are here.
Note, this would be for an adder known as a ripple-carry adder, which works, but does not perform optimally. Most binary adders built into hardware are a form of fast adder such as a carry-look-ahead adder.
My ripple-carry adder works for both unsigned and 2's complement integers if you set carry_in to 0, and 1's complement integers if carry_in is set to 1. I also added flags to show underflow or overflow on the addition.
#define BIT_LEN 32
#define ADD_OK 0
#define ADD_UNDERFLOW 1
#define ADD_OVERFLOW 2
int ripple_add(int a, int b, char carry_in, char* flags) {
int result = 0;
int current_bit_position = 0;
char a_bit = 0, b_bit = 0, result_bit = 0;
while ((a || b || carry_in) && current_bit_position < BIT_LEN) {
a_bit = a & 1;
b_bit = b & 1;
result_bit = (a_bit ^ b_bit ^ carry_in);
result |= result_bit << current_bit_position++;
carry_in = (a_bit & b_bit) | (a_bit & carry_in) | (b_bit & carry_in);
a >>= 1;
b >>= 1;
}
if (current_bit_position < BIT_LEN) {
*flags = ADD_OK;
}
else if (a_bit & b_bit & ~result_bit) {
*flags = ADD_UNDERFLOW;
}
else if (~a_bit & ~b_bit & result_bit) {
*flags = ADD_OVERFLOW;
}
else {
*flags = ADD_OK;
}
return result;
}
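A compact sketch of the ripple-carry idea in Python, for unsigned values of a fixed width. It walks over every bit position instead of stopping when the operands run out, so a final carry is never dropped, and it returns the carry-out instead of setting flags. The bits parameter and the tuple return value are choices made for this example.

```python
def ripple_add(a: int, b: int, carry_in: int = 0, bits: int = 32):
    """Add two unsigned `bits`-wide values one bit at a time.

    Returns (sum modulo 2**bits, carry_out).
    """
    result = 0
    carry = carry_in
    for position in range(bits):
        a_bit = (a >> position) & 1
        b_bit = (b >> position) & 1
        # Full adder: the sum bit is the XOR of the three inputs.
        result |= (a_bit ^ b_bit ^ carry) << position
        # The carry is set when at least two of the three inputs are 1.
        carry = (a_bit & b_bit) | (a_bit & carry) | (b_bit & carry)
    return result, carry


print(ripple_add(1, 1))
print(ripple_add(0xFFFFFFFF, 1))
```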
Go based solution
func add(a int, b int) int {
for {
carry := (a & b) << 1
a = a ^ b
b = carry
if b == 0 {
break
}
}
return a
}
The same solution can be implemented in Python as follows, but Python integers are not limited to 32 bits, so we use a mask to keep only the lowest 32 bits.
E.g. if we don't use the mask we won't get a result for the inputs (-1, 1), because the carry keeps shifting left forever.
def add(a,b):
mask = 0xffffffff
while b & mask:
carry = a & b
a = a ^ b
b = carry << 1
return (a & mask)
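Note that masking alone returns negative sums as large unsigned numbers (for example 0xFFFFFFFE instead of -2). A possible variation, assuming you want 32-bit two's complement behaviour, converts the result back to a signed value. The name add32 is made up here.

```python
MASK = 0xFFFFFFFF      # keep only the low 32 bits
MAX_INT32 = 0x7FFFFFFF  # largest positive 32-bit signed value


def add32(a: int, b: int) -> int:
    """Add two ints as 32-bit two's complement values without using +."""
    a &= MASK
    b &= MASK
    while b:
        a, b = (a ^ b) & MASK, ((a & b) << 1) & MASK
    # Patterns above MAX_INT32 represent negative numbers: a - 2**32,
    # written here with bitwise operators only.
    return a if a <= MAX_INT32 else ~(a ^ MASK)


print(add32(-1, 1))
print(add32(-3, 1))
```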
Why not just increment the first number as many times as the value of the second number?
The reason ADD is implemented in assembler as a single instruction, rather than as some combination of bitwise operations, is that it is hard to do. You have to worry about the carries from a given low order bit to the next higher order bit. This is stuff that the machines do in hardware fast, but that even with C, you can't do in software fast.
Here's a portable one-line ternary and recursive solution.
int add(int x, int y) {
return y == 0 ? x : add(x ^ y, (x & y) << 1);
}
I saw this as problem 18.1 in the coding interview.
My python solution:
def foo(a, b):
"""iterate through a and b, count iteration via a list, check len"""
x = []
for i in range(a):
x.append(a)
for i in range(b):
x.append(b)
print len(x)
This method uses iteration, so the time complexity isn't optimal.
I believe the best way is to work at a lower level with bitwise operations.
In python using bitwise operators:
def sum_no_arithmetic_operators(x,y):
while True:
carry = x & y
x = x ^ y
y = carry << 1
if y == 0:
break
return x
Adding two integers is not that difficult; there are many examples of binary addition online.
A more challenging problem is floating point numbers! There's an example at http://pages.cs.wisc.edu/~smoler/x86text/lect.notes/arith.flpt.html
Was working on this problem myself in C# and couldn't get all test cases to pass. I then ran across this.
Here is an implementation in C# 6:
public int Sum(int a, int b) => b != 0 ? Sum(a ^ b, (a & b) << 1) : a;
Implemented in same way as we might do binary addition on paper.
int add(int x, int y)
{
int t1_set, t2_set;
int carry = 0;
int result = 0;
int mask = 0x1;
while (mask != 0) {
t1_set = x & mask;
t2_set = y & mask;
if (carry) {
if (!t1_set && !t2_set) {
carry = 0;
result |= mask;
} else if (t1_set && t2_set) {
result |= mask;
}
} else {
if ((t1_set && !t2_set) || (!t1_set && t2_set)) {
result |= mask;
} else if (t1_set && t2_set) {
carry = 1;
}
}
mask <<= 1;
}
return (result);
}
A version improved for speed is below:
int add_better (int x, int y)
{
int b1_set, b2_set;
int mask = 0x1;
int result = 0;
int carry = 0;
while (mask != 0) {
b1_set = x & mask ? 1 : 0;
b2_set = y & mask ? 1 : 0;
if ( (b1_set ^ b2_set) ^ carry)
result |= mask;
carry = (b1_set & b2_set) | (b1_set & carry) | (b2_set & carry);
mask <<= 1;
}
return (result);
}
This is my implementation in Python. It works well when we know the number of bytes (or bits).
def summ(a, b):
#for 4 bytes(or 4*8 bits)
max_num = 0xFFFFFFFF
while a != 0:
a, b = ((a & b) << 1), (a ^ b)
if a > max_num:
b = (b&max_num)
break
return b
You can do it using bit-shifting and the AND operation.
#include <stdio.h>
int main()
{
unsigned int x = 3, y = 1, sum, carry;
sum = x ^ y; // XOR x and y
carry = x & y; // AND x and y
while (carry != 0) {
carry = carry << 1; // left shift the carry
x = sum; // initialize x as sum
y = carry; // initialize y as carry
sum = x ^ y; // sum is calculated
carry = x & y; /* carry is calculated, the loop condition is
evaluated and the process is repeated until
carry is equal to 0.
*/
}
printf("%d\n", sum); // the program will print 4
return 0;
}
The most voted answer will not work if the inputs are of opposite sign. The following however will. I have cheated at one place, but only to keep the code a bit clean. Any suggestions for improvement welcome
def add(x, y):
if (x >= 0 and y >= 0) or (x < 0 and y < 0):
return _add(x, y)
else:
return __add(x, y)
def _add(x, y):
if y == 0:
return x
else:
return _add((x ^ y), ((x & y) << 1))
def __add(x, y):
if x < 0 < y:
x = _add(~x, 1)
if x > y:
diff = -sub(x, y)
else:
diff = sub(y, x)
return diff
elif y < 0 < x:
y = _add(~y, 1)
if y > x:
diff = -sub(y, x)
else:
diff = sub(x, y)
return diff
else:
raise ValueError("Invalid Input")
def sub(x, y):
if y > x:
raise ValueError('y must be less than x')
while y > 0:
b = ~x & y
x ^= y
y = b << 1
return x
Here is the solution in C++, you can find it on my github here: https://github.com/CrispenGari/Add-Without-Integers-without-operators/blob/master/main.cpp
int add(int a, int b){
while(b!=0){
int sum = a^b; // add without carrying
int carry = (a&b)<<1; // carrying without adding
a= sum;
b= carry;
}
return a;
}
// the function can also be written recursively as follows:
int add(int a, int b){
if(b==0){
return a; // any number plus 0 = that number simple!
}
int sum = a ^ b;// adding without carrying;
int carry = (a & b)<<1; // carry, without adding
return add(sum, carry);
}
This can be done using a half adder.
A half adder is a method for finding the sum of two single-bit numbers.
A  B  SUM  CARRY  A & B  A ^ B
0  0  0    0      0      0
0  1  1    0      0      1
1  0  1    0      0      1
1  1  0    1      1      0
We can observe here that SUM = A ^ B and CARRY = A & B.
We know the CARRY is always added one position to the left of where it was generated,
so now add (CARRY << 1) to SUM, and repeat this process until the carry is 0.
int Addition(int a, int b)
{
if (b == 0)
return a;
return Addition(a ^ b, (a & b) << 1);
}
let's add 7 (0111) and 3 (0011) answer will be 10 (1010)
A = 0100 and B = 0110
A = 0010 and B = 1000
A = 1010 and B = 0000
final answer is A.
I implemented this in Swift; I am sure someone will benefit from it.
var a = 3
var b = 5
var sum = 0
var carry = 0
while (b != 0) {
sum = a ^ b
carry = a & b
a = sum
b = carry << 1
}
print (sum)
You can do it iteratively or recursively. Recursive:-
public int getSum(int a, int b) {
return (b==0) ? a : getSum(a^b, (a&b)<<1);
}
Iterative:-
public int getSum(int a, int b) {
int c=0;
while(b!=0) {
c=a&b;
a=a^b;
b=c<<1;
}
return a;
}
time complexity - O(log b)
space complexity - O(1)
For further clarification, refer to the LeetCode or GeeksforGeeks explanations.
I'll interpret this question as forbidding the +,-,* operators but not ++ or -- since the question specified operator and not character (and also because that's more interesting).
A reasonable solution using the increment operator is as follows:
int add(int a, int b) {
if (b == 0)
return a;
if (b > 0)
return add(++a, --b);
else
return add(--a, ++b);
}
This function recursively nudges b towards 0, while giving a the same amount to keep the sum the same.
As an additional challenge, let's get rid of the second if block to avoid a conditional jump. This time we'll need to use some bitwise operators:
int add(int a, int b) {
if(!b)
return a;
int gt = (b > 0);
int m = -1 << (gt << 4) << (gt << 4);
return (++a & --b & 0)
| add( (~m & a--) | (m & --a),
(~m & b++) | (m & ++b)
);
}
The function trace is identical; a and b are nudged between each add call just like before.
However, some bitwise magic is employed to drop the if statement while continuing to not use +,-,*:
A mask m is set to 0x00000000 if b is positive, or 0xFFFFFFFF (-1 in signed decimal) if b is negative.
The reason for shifting the mask left by 16 twice instead of a single shift left by 32 is that shifting by >= the width of the value is undefined behavior.
The final return takes a bit of thought to fully appreciate:
Consider this technique to avoid a branch when deciding between two values. Of the values, one is multiplied by the boolean while the other is multiplied by the inverse, and the results are summed like so:
double naiveFoodPrice(int ownPetBool) {
if(ownPetBool)
return 23.75;
else
return 10.50;
}
double conditionlessFoodPrice(int ownPetBool) {
return ownPetBool*23.75 + (!ownPetBool)*10.50;
}
This technique works great in most cases. For us, the addition operator can easily be substituted for the bitwise or | operator without changing the behavior.
The multiplication operator is also not allowed for this problem. This is the reason for our earlier mask value - a bitwise and & with the mask will achieve the same effect as multiplying by the original boolean.
The nature of the unary increment and decrement operators halts our progress.
Normally, we would easily be able to choose between an a which was incremented by 1 and an a which was decremented by 1.
However, because the increment and decrement operators modify their operand, our conditionless code will end up always performing both operations - meaning that the values of a and b will be tainted before we finish using them.
One way around this is to simply create new variables which each contain the original values of a and b, allowing a clean slate for each operation. I consider this boring, so instead we will adjust a and b in a way that does not affect the rest of the code (++a & --b & 0) in order to make full use of the differences between x++ and ++x.
We can now get both possible values for a and b, as the unary operators modifying the operands' values now works in our favor. Our techniques from earlier help us choose the correct versions of each, and we now have a working add function. :)
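The mask-based selection described above can be tried in isolation with a short Python sketch. The function name select is hypothetical, and the mask is assumed to be either 0 or -1 (all bits set); Python's unbounded integers behave like two's complement for bitwise operators, so the same expression works for negative values.

```python
def select(mask: int, if_clear: int, if_set: int) -> int:
    """Pick if_set when mask is -1 (all ones) and if_clear when mask is 0.

    No branch, no multiplication, no addition: just AND, OR and NOT.
    """
    return (~mask & if_clear) | (mask & if_set)


print(select(0, 7, 9))    # mask clear: first value
print(select(-1, 7, 9))   # mask set: second value
```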
Python solutions:
(1) Using a lambda with the '-' operator:
add = lambda a,b : -(-a)-(-b)
(2) Counting the elements of a range (only valid when a + b >= 0):
add= lambda a,b : len(list(map(lambda x:x,(i for i in range(-a,b)))))