Converting some assembly to VB.NET - SHR operator working differently? - vb.net

Well, a simple question here
I am studying some assembly, and converting some assembly routines back to VB.NET
Now, there is a specific line of code I am having trouble with. In assembly, assume the following:
EBX = F0D04080
Then the following line gets executed
SHR EBX, 4
Which gives me the following:
EBX = 0F0D0408
Now, in VB.NET, I do the following
variable = variable >> 4
Which SHOULD give me the same... But it differs a SLIGHT bit: instead of the value 0F0D0408, I get FF0D0408
So what is happening here?

From the documentation of the >> operator:
In an arithmetic right shift, the bits shifted beyond the rightmost bit position are discarded, and the leftmost (sign) bit is propagated into the bit positions vacated at the left. This means that if pattern has a negative value, the vacated positions are set to one; otherwise they are set to zero.
If you are using a signed data type, F0D04080 has a negative value (its leading bit, the sign bit, is 1), and that sign bit is copied into the vacated positions on the left.
This is not something specific to VB.NET, by the way: variable >> 4 is translated to the IL instruction shr, which is an "arithmetic shift" and preserves the sign, in contrast to the x86 assembly instruction SHR, which is an unsigned shift. To do an arithmetic shift in x86 assembler, SAR can be used.
To use an unsigned shift in VB.NET, you need to use an unsigned variable:
Dim variable As UInteger = &HF0D04080UI
The UI type character at the end of F0D04080 tells VB.NET that the literal is an unsigned integer (otherwise, it would be interpreted as a negative signed integer and the assignment would result in a compile-time error).

VB's >> operator does an arithmetic shift, which shifts in the sign bit rather than 0's.
variable = (variable >> shift_amt) And Not (Integer.MinValue >> (shift_amt - 1))
should give you an equivalent value, even if it is a bit long. Alternatively, you could use an unsigned integer (UInteger or UInt32), as there's no sign bit to shift.
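For reference, here is a small sketch of the same distinction in C rather than VB.NET (hedged accordingly: storing F0D04080 in a signed 32-bit variable and right-shifting a negative value are technically implementation-defined in C, but typical two's-complement compilers behave as commented):

#include <stdio.h>
#include <stdint.h>

int main(void) {
    int32_t  s = (int32_t)0xF0D04080;   /* negative when read as signed */
    uint32_t u = 0xF0D04080u;

    printf("%08X\n", (unsigned)(s >> 4));   /* FF0D0408: arithmetic shift, like VB's >> */
    printf("%08X\n", (unsigned)(u >> 4));   /* 0F0D0408: logical shift, like x86 SHR */

    /* The masking trick from the answer above, transcribed to C: */
    printf("%08X\n", (unsigned)((s >> 4) & ~(INT32_MIN >> (4 - 1))));   /* 0F0D0408 */
    return 0;
}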


What's the best way for awk to check arbitrary integer precision

From GNU gawk's page
https://www.gnu.org/software/gawk/manual/html_node/Checking-for-MPFR.html
they have a formula to check arbitrary precision:
function adequate_math_precision(n) { return (1 != (1+(1/(2^(n-1))))) }
My question is: wouldn't it be more efficient to stay within the integer math domain with a formula such as
( 2^abs(n) - 1 ) % 2 # note 2^(n-1) vs. 2^|n| - 1
Since any power of 2 must also be even, subtracting 1 must always give an odd number, and its modulo (%) over 2 then becomes an indicator function for is_odd() for n >= 0, while abs(n) handles the cases where n is negative.
Or does the modulo necessitate a cast to floating point, thus nullifying any gains?
Good question. Let's tackle it.
The proposed snippet aims at checking whether gawk was invoked with the -M option.
I'll attach some digression on that option at the bottom.
The argument n of the function is the floating-point precision needed for whatever operation you'll have to perform. So, say your script is in a library somewhere and will get called, but you have no control over how. You'll run that function at the beginning of the script to promptly throw an error and bail out, signalling that the end result would be wrong due to a lack of bits to store the numbers.
Your code stays in the integer realm: two raised to a (non-negative) integer is an integer. There is no need for abs(n) here, because there is no point in specifying how many bits you'll need as a negative number in the first place.
Then you subtract one from an even integer. Unless n=0 (in which case 2^0=1 and your code reads (1 - 1) % 2 = 0), your snippet will always return 1, because the remainder (%) of an odd number divided by two is 1.
Problem is: you are trying to calculate a potentially stupidly large number in a function that should check if you are able to do so in the first place.
Since any power of 2 must also be even, subtracting 1 must always give an odd number, and its modulo (%) over 2 then becomes an indicator function for is_odd() for n >= 0, while abs(n) handles the cases where n is negative.
Except when n=0, as discussed above, you are right. The snippet will tell you that any power of 2 is even, and any power of 2, minus 1, is odd. But we were discussing another subject entirely.
Let's analyze the other function instead:
return (1 != (1+(1/(2^(n-1)))))
Remember that booleans in awk work like this: 0 = false and nonzero = true. So while 1+x, where x is a very small number (the reciprocal of a large power of two, 1/2^122 on the example page), is mathematically guaranteed to be != 1, in the digital world that's not the case. At some point the floating-point computation reaches its precision rock bottom, x gets rounded down to 0, and suddenly 1+x is just 1. At that point the precision-check function returns 0, i.e. false: 1 is equal to 1.
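To make the rounding concrete, here is a small probe written in C rather than awk (assuming IEEE-754 doubles, which gawk also uses when -M is not given); it runs the same 1 != 1 + 1/2^(n-1) test around the 53-bit mantissa limit:

#include <stdio.h>
#include <math.h>

int main(void) {
    /* With 53-bit IEEE-754 doubles, 1 + 2^-(n-1) is still distinguishable
       from 1 at n = 53 (one ulp of 1.0), but rounds back to exactly 1.0
       from n = 54 onward under round-to-nearest-even. */
    for (int n = 50; n <= 56; n++) {
        double probe = 1.0 + 1.0 / pow(2.0, n - 1);
        printf("n = %2d: (1 != 1 + 1/2^(n-1)) -> %d\n", n, probe != 1.0);
    }
    return 0;
}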
A larger discussion on types and data representation
The page you link explains precision for gawk invoked with the -M option. This sounds like technoblahblah, let's decipher it.
At one point, your OS architecture has to decide how to store data and how to represent it in memory so that it can be accessed again and displayed. Terms like Integer, Float, Double, and Unsigned Integer are examples of data representation. Here we are addressing integer representation: how is an integer stored in memory?
A 32-bit system will use 4 bytes to represent an integer, which in turn determines how large the integer can be. The 32 bits are read from most significant (MSB) to least significant (LSB), and if the type is signed, one bit (typically the MSB) represents the sign, drastically reducing the maximum size of the integer.
If asked to compute a large number, a machine will try to fit it into the largest type available. If the end result is larger than that, you have an overflow and end up with a wrong result or an error. Many online challenges ask you to write code for arbitrarily long loops or large sums, then test it with inputs that break the 64-bit barrier, to see if you master the proper types for indexes.
AWK is not a strongly typed language. Meaning, any variable can store any kind of data. The data type can change, and it is determined at runtime by the interpreter, so the developer doesn't need to care. For instance:
$ awk 'BEGIN {a="this is text"; print a; a=2; print a; print a+3.0*2}'
-| this is text
-| 2
-| 8
In the example, a is text, then an integer, and it can be added to a floating-point number and printed as an integer without any special type handling.
The Arbitrary Precision Page presents the following snippet:
$ gawk -M 'BEGIN {
> s = 2.0
> for (i = 1; i <= 7; i++)
> s = s * (s - 1) + 1
> print s
> }'
-| 113423713055421845118910464
There is some math voodoo behind it; we will skip that. Since s is interpreted as a floating-point number, the end result is computed in floating point.
Try to input that number into the Windows calculator as a decimal and it will fail, although you can compute it as binary: you'll need the programmer setting and up to 53 bits to be able to fit it as an unsigned integer.
53 is a magic number here: with the -M option, gawk uses arbitrary precision for numbers. In other words, it decides how many bits are necessary, tracks them, and breaks free of the native OS architecture. The default is that gawk allocates 53 bits for any given arbitrary-precision number. Fun fact: the actual result of that snippet is wrong, and it would take up to 100 bits to compute it correctly.
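That 53 is the same 53 as the mantissa width of an IEEE-754 double, which is why integers are exact only up to 2^53. A quick illustration of that boundary, sketched in C (assuming standard 64-bit doubles):

#include <stdio.h>

int main(void) {
    double below = 9007199254740992.0;   /* 2^53: still exactly representable */
    double above = 9007199254740993.0;   /* 2^53 + 1: no longer representable */
    printf("%d\n", below == above);      /* prints 1: both literals collapse into
                                            the same double */
    printf("%.0f\n", above);             /* prints 9007199254740992 */
    return 0;
}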
To implement the handling of arbitrarily large numbers, gawk relies on an external library called MPFR. Provided with an arbitrarily large number, MPFR handles the memory allocation and bit requisition to store it. However, the interface between gawk and MPFR is not perfect, and gawk can't always control the type that MPFR will use. In the case of integers, that's not an issue. For floating-point numbers, it will result in rounding errors.
This brings us back to the snippet at the beginning: if gawk was called with the -M option, numbers up to 2^53 can be stored as integers. Floating-point numbers will be smaller than that (you'll need to make the decimal point disappear somehow, or rather represent it by spending some of the bits allocated for that number, just like the sign). Following the example on the page, and asking for an arbitrary precision larger than 32, the snippet will return TRUE only if the -M option was passed; otherwise 1/2^(n-1) will be rounded down to 0.

Bitwise negation up to the first positive bit

I was working on a project and I can't use bitwise negation with U32 (unsigned 32-bit) values, because when I tried the negation operator on, for example, 1, the result (according to that function) was the biggest number possible in a U32, and I expected zero. My idea is to work with a binary number like 110010 and negate only the bits from the first 1-bit down (giving 001101). Is there a way to do that in LabVIEW?
This computes the value you are looking for.
1110 --> 0001 (aka, 1)
1010 --> 0101 (aka, 101)
111 --> 000 (aka, 0) [indeed, all patterns that are all "1" will become 0]
0 --> 0 [because there are no bits to negate... maybe you want to special-case this as "1"?]
Note: This is a VI Snippet. Save the .png file to your disk then drag the image from your OS into LabVIEW and it will generate the block diagram (I wrote it in LV 2016, so it works for 2016 or later). Sometimes dragging directly from browser to diagram works, but most browsers seem to strip out the EXIF data that makes that work.
Here's an alternative solution without a loop. It formats the input into its string representation (without leading zeros) to figure out how many bits to negate - call this n - and then XOR's the input with 2^n - 1.
Note that this version will return an output of 1 for an input of 0.
Using the string functions feels a bit hacky... but it doesn't use a loop!!
Obviously we could instead try to get the 'bit length' of the input using its base-2 log, but I haven't sat down and worked out how to ensure there are no rounding issues when the input has only its most significant bit set: in that case the base-2 log should be exactly an integer, but it might come out a fraction too small.
Here's a solution without strings/loops using the conversion to float (a common method of computing floor(log_2(x))). This won't work on unsigned types.
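For anyone reading this outside LabVIEW, here is a rough text-based sketch of the same XOR-with-(2^n - 1) idea in C (the helper name is made up; it builds the mask with a loop instead of strings or a float conversion):

#include <stdio.h>
#include <stdint.h>

/* Flip every bit at or below the highest set bit of x.
   For x == 0 there is nothing to flip, so 0 comes back
   (special-case it as 1 yourself if that is what you need). */
static uint32_t negate_up_to_top_bit(uint32_t x) {
    uint32_t mask = 0;
    for (uint32_t t = x; t != 0; t >>= 1)
        mask = (mask << 1) | 1;     /* ends up as 2^bitlength(x) - 1 */
    return x ^ mask;
}

int main(void) {
    printf("%u\n", negate_up_to_top_bit(0xE));  /* 1110 -> 0001, prints 1 */
    printf("%u\n", negate_up_to_top_bit(0xA));  /* 1010 -> 0101, prints 5 */
    printf("%u\n", negate_up_to_top_bit(0x7));  /* 111  -> 000,  prints 0 */
    printf("%u\n", negate_up_to_top_bit(0));    /* prints 0 */
    return 0;
}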

How do I define -0 as an integer and -0 < 0 in Objective-C?

This might sound crazy, but I'm working with the floor numbers of a building which has -0 as a floor, plus -0A, -0B, -0C, and so on.
My user is entering floor data randomly, and in the end I'm supposed to sort the array of these floor numbers. What I found is that even if I enter -0 as a floor number and try to sort it, it sorts as 0, because for the computer -0 is still 0.
How do I define -0 so that -1 < -0 < 0?
By definition, you are no longer working with integers as -0 != 0 makes no sense in the realm of integers.
So, yes, you're going to have to define your own type and implement your own sorting rules. Simply storing them as strings and then implementing a sorting block to sort an array of them is straightforward, though.
You could go down the path of using floats so you could have floor 0 and floor -0.1, then round for display. But that sort of shenanigans will lead the maintainer after the maintainer after you to call you unpleasant names (which is sometimes OK). :)
All answers so far are suggesting your own type/functions and probably strings.
Strings will work but you can take an idea from floating-point and store them as sign (boolean or even bit flag) and magnitude (unsigned integer type, 8 or 16 bits should be sufficient).
Comparison is simple: compare the signs first, then compare the magnitudes if required.
You could use a struct for such a type which would give you the same value semantics as integer and real types and avoid object allocation.
If there is also a letter ("-0 as floor and -0A, B, and C so and so") that can be a third field in the struct, probably a char, and you could still have value semantics.
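As a rough illustration of that struct idea in plain C (the names and field sizes are made up for the sketch; the same layout works as a struct in Objective-C):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    bool     negative;    /* sign kept separately, so -0 is distinct from 0 */
    uint16_t magnitude;
    char     letter;      /* optional suffix, e.g. 'A' in "-0A"; '\0' if none */
} FloorNumber;

/* Negative floors sort below non-negative ones; among negative floors a
   larger magnitude means a lower floor, so -1 < -0 < 0. */
static int floorCompare(FloorNumber a, FloorNumber b) {
    if (a.negative != b.negative)
        return a.negative ? -1 : 1;
    if (a.magnitude != b.magnitude) {
        int byMagnitude = (a.magnitude < b.magnitude) ? -1 : 1;
        return a.negative ? -byMagnitude : byMagnitude;
    }
    return (a.letter > b.letter) - (a.letter < b.letter);
}

int main(void) {
    FloorNumber minusOne  = { true,  1, '\0' };
    FloorNumber minusZero = { true,  0, '\0' };
    FloorNumber zero      = { false, 0, '\0' };
    printf("%d %d\n", floorCompare(minusOne, minusZero),   /* -1 */
                      floorCompare(minusZero, zero));      /* -1 */
    return 0;
}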
HTH

Read input in NASM, and store it whole into a variable

What is the method by which I can read the user's input, say "500",
and then store this number in a variable?
The only method I know of would be to store it character by character, possibly with the need for register offsets.
Is there any other way, preferably storing the number directly?
i.e. something like:
mov var1, inbuffer
Details on environment:
32-bit assembly w/ DJGPP
Thank you.
Ahhh... DJGPP, that'd be DOS I guess. Look into int 21h/0Ah (0Ah in AH). Or you might be better off with the read-file subfunction (3Fh ???) on stdin. Look it up in Ralf Brown's Interrupt List.
In any case, what you're going to get is the characters '5', '0', and '0' - 35h, 30h, 30h. It will take some processing to get the number 500 out of this. If you're reading numbers from left to right, zero up a register to use as "result so far". Read a character from your input buffer. If it's a valid decimal digit, subtract '0' to convert character to number, multiply "result so far" by ten, and add in your new number. Repeat until you run out of characters.
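The digit-by-digit conversion described above, written out in C for clarity (the assembly version is the same loop with a register holding the running result):

#include <stdio.h>

int main(void) {
    /* "500" arrives as the bytes 35h 30h 30h; accumulate left to right. */
    const char *input = "500";
    int result = 0;
    for (const char *p = input; *p >= '0' && *p <= '9'; p++)
        result = result * 10 + (*p - '0');   /* multiply by ten, add new digit */
    printf("%d\n", result);                  /* prints 500 */
    return 0;
}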

Showing decimals of a variable with sprintf in MATLAB

I don't understand the following thing that happens when using the sprintf command.
>> vpa(exp(1),53)
ans =
2.7182818284590455348848081484902650117874145507812500
>> e = 2.7182818284590455348848081484902650117874145507812500
e =
2.7183
>> sprintf('%0.53f', e)
ans =
2.71828182845904550000000000000000000000000000000000000
Why does sprintf show me the number e rounded, instead of the number that I typed in the first place?
Variables are double precision by default in MATLAB, so the variable e that you create is limited to the precision of a double, which is about 16 digits. Even though you entered more digits, a double doesn't have the precision to accurately represent all those extra digits and rounds off to the nearest number it can represent.
EDIT: As explained in more detail by Andrew Janke in his answer to this follow-up question I posted, the number you chose for e just happens to be an exact decimal expansion of the binary value. In other words, it's the exactly-representable value that a nearby floating-point number would get rounded to. However, in this case anything more than approximately 16 digits past the decimal point is not considered significant since it can't really be represented accurately by a double-precision type. Therefore, functions like SPRINTF will automatically ignore these small values, printing zeroes instead.
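The same roughly-16-digit limit is easy to see outside MATLAB. Here is a small check sketched in C (assuming IEEE-754 doubles): the extra digits typed past that point simply round to the same stored value.

#include <stdio.h>

int main(void) {
    /* Both literals round to the same 64-bit double: everything past
       roughly the 16th significant digit cannot be stored. */
    double full      = 2.7182818284590455348848081484902650117874145507812500;
    double truncated = 2.7182818284590455;
    printf("%d\n", full == truncated);   /* prints 1 */
    return 0;
}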