Advantage of Unsigned Multiplier over signed multiplier? - multiplication

I have implemented an FFT. Because of my design, the multiplier only ever operates on unsigned numbers. Is there any advantage to using an unsigned multiplier instead of a signed multiplier?

Related

What is the difference between normalized, scaled and integer VkFormats?

Let's take the following 6 VkFormats for example:
VK_FORMAT_R8_UNORM
VK_FORMAT_R8_SNORM
VK_FORMAT_R8_USCALED
VK_FORMAT_R8_SSCALED
VK_FORMAT_R8_UINT
VK_FORMAT_R8_SINT
All of these specify a one-component 8-bit format that has a single 8-bit R component.
The formats differ in whether they are (a) normalized, (b) scaled, or (c) integer. What does that mean? What are the differences between those three things? Where is that specified?
Are all 256 possible values of 8 bits meaningful and valid in all six formats?
(They also differ in whether they are signed or unsigned. I assume this means their underlying types are like the C types int8_t or uint8_t?)
Refer to Identification of Formats and Conversion from Normalized Fixed-Point to Floating-Point in the specification.
UNORM is a float in the range of [0, 1].
SNORM is the same but in the range of [-1, 1]
USCALED is the unsigned integer value converted to float
SSCALED is the signed integer value converted to float
UINT is an unsigned integer
SINT is a signed integer
I.e. for the VK_FORMAT_R8_*:
for UNORM raw 0 would give 0.0f, raw 255 would give 1.0f
for SNORM raw -127 (resp. 129) would give -1.0f, raw 127 would give 1.0f
USCALED raw 0 would give 0.0f, raw 255 would give 255.0f
SSCALED raw -128 (resp. 128) would give -128.0f, raw 127 would give 127.0f
-128 (-2^(n-1)) is not meaningful in SNORM, and simply clamps to -1.0f.
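The 8-bit conversions listed above can be sketched in C as follows. This only illustrates the arithmetic and is not Vulkan API code; the function names are made up for this example.

```c
#include <stdint.h>

/* Hypothetical helpers: the names are illustrative and not part of Vulkan. */

/* UNORM: unsigned raw value mapped linearly onto [0.0, 1.0]. */
float unorm8_to_float(uint8_t raw) { return (float)raw / 255.0f; }

/* SNORM: signed raw value mapped onto [-1.0, 1.0]; -128 clamps to -1.0. */
float snorm8_to_float(int8_t raw) {
    float f = (float)raw / 127.0f;
    return f < -1.0f ? -1.0f : f;
}

/* USCALED / SSCALED: the integer value itself, converted to float. */
float uscaled8_to_float(uint8_t raw) { return (float)raw; }
float sscaled8_to_float(int8_t raw) { return (float)raw; }
```

With these helpers, both -127 and -128 read as SNORM give -1.0f, while the same two values read as SSCALED stay distinct (-127.0f and -128.0f).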

gpu optimization when multiplying by powers of 2 in a shader

Do modern GPUs optimize multiplication by powers of 2 by doing a bit shift? For example suppose I do the following in a shader:
float t = 0;
t *= 16;
t *= 17;
Is it possible the first multiplication will run faster than the second?
Floating-point multiplication cannot be done with a bit shift. However, in theory, floating-point multiplication by a power-of-2 constant can be optimized. A floating-point value is normally stored in the form S * M * 2^E, where S is the sign, M is the mantissa and E is the exponent. Multiplying by a power-of-2 constant can be done by adding to or subtracting from the exponent part of the float, without modifying the other parts. In practice, though, I would bet that GPUs always use a generic multiply instruction.
While studying the disassembly output of PVRShaderEditor (PowerVR GPUs), I made an interesting observation about power-of-2 constants. A certain range of them ([2^(-16), 2^10] in my case) use a special notation, e.g. C65, implying that they are predefined, whereas arbitrary constants such as 3.0 or 2.3 use shared register notation (e.g. SH12), which implies they are stored as a uniform and probably incur some setup cost. Thus using power-of-2 constants may yield some optimization benefit, at least on some hardware.
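To make the exponent trick concrete, here is a minimal CPU-side sketch in C. It assumes float is a 32-bit IEEE 754 value, and the function name scale_by_pow2 is hypothetical. It says nothing about what a GPU compiler actually emits; it only shows that adding to the exponent field gives the same result as multiplying by a power of 2.

```c
#include <stdint.h>
#include <string.h>

/* Multiply x by 2^k by adjusting the exponent field directly.
 * Assumes float is IEEE 754 binary32 and that k is small (say |k| < 300). */
float scale_by_pow2(float x, int k) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);              /* reinterpret the float as raw bits */

    int exponent = (int)((bits >> 23) & 0xFFu);  /* 8-bit biased exponent */
    int scaled = exponent + k;

    /* Fast path: input and result are both normal numbers. */
    if (exponent != 0 && exponent != 0xFF && scaled > 0 && scaled < 0xFF) {
        bits = (bits & ~(0xFFu << 23)) | ((uint32_t)scaled << 23);
        memcpy(&x, &bits, sizeof x);
        return x;
    }

    /* Zero, denormal, infinity, NaN, or a result that leaves the normal
     * range: fall back to ordinary multiplications. */
    for (; k > 0; k--) x *= 2.0f;
    for (; k < 0; k++) x *= 0.5f;
    return x;
}
```

For example, scale_by_pow2(3.0f, 4) yields 48.0f, the same as 3.0f * 16. There is no equivalent shortcut for a factor such as 17, because that changes the mantissa as well.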

DataType equivalent in Protobuf

I know that the data types supported by protobuf-c are restricted to the ones mentioned here, but what would be a good protobuf-c equivalent for the following data types in C?
time_t,
int8_t,
int16_t,
uint8_t,
uint16_t,
ushort
For time_t, use uint64 (uint64_t in the generated C code).
For all the others, use sint32 (often negative), int32 (rarely negative), or uint32 (never negative). Protobuf uses a variable-width encoding for integers that avoids using more space on the wire than is really needed. For instance, non-negative numbers less than 128 are encoded in 1 byte by int32.
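The reason behind the "often negative" versus "rarely negative" advice can be seen by computing the encoded size of a value. The sketch below assumes the standard protobuf wire format (base-128 varints, with ZigZag mapping for the sint32 field type). The helper names are hypothetical and not part of protobuf-c, and the sizes exclude the field tag.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed by a base-128 varint for v (7 payload bits per byte). */
size_t varint_size(uint64_t v) {
    size_t n = 1;
    while (v >= 0x80u) {
        v >>= 7;
        n++;
    }
    return n;
}

/* ZigZag mapping used by sint32: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ... */
uint32_t zigzag32(int32_t n) {
    return ((uint32_t)n << 1) ^ (n < 0 ? 0xFFFFFFFFu : 0u);
}

/* int32 field: negative values are sign-extended to 64 bits before encoding. */
size_t int32_wire_size(int32_t n) {
    return varint_size((uint64_t)(int64_t)n);
}

/* sint32 field: ZigZag first, then varint. */
size_t sint32_wire_size(int32_t n) {
    return varint_size(zigzag32(n));
}
```

Under these assumptions, -1 costs 10 bytes as int32 but only 1 byte as sint32, while 64 costs 1 byte as int32 and 2 bytes as sint32.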

Units conversion on a PIC 18F2431

I have the following conversion between pounds per square inch (PSI) and megapascals (MPa):
psi = MPa*1.45038;
I need the lowest value possible after conversion to be 1 PSI. An example of what I am looking for is:
psi = ((long)MPa*145)/100
Is there any way to optimize this for memory and speed by not using float or long? I will be implementing this conversion on a microcontroller (PIC 18F2431).
You should divide by a power of 2 instead, which is far cheaper than dividing by any other value. And if the value can't be negative, use an unsigned type. Depending on the type of MPa and its maximum value, you can choose a different denominator to suit your needs. There is no need to cast to a wider type if the multiplication won't overflow.
For example, if MPa is of type uint16_t you can do psi = MPa*95052UL/(1UL << 16); (95052/65536 ≈ 1.450378). The product must fit in a 32-bit unsigned long, so this holds for MPa values up to 45185.
If MPa is not larger than 1024 (2^10), then you can scale by 2^21 without overflowing 32 bits, which increases the numerator and denominator for more precision:
psi = MPa*3041667UL/(1UL << 21);
Edit:
On the PIC 18F2431 int is a 16-bit type. That means 95052 will be of type long and MPa will be promoted to long in the expression. If you don't need much precision then change the scaling to fit in an int/int16_t to avoid dealing with long
In case MPa is not larger than 20, you can divide by 2048, which is the largest power of 2 that is less than or equal to 2^16/20.
psi = MPa*2970U/(1U << 11);
Note that * and / have equal precedence, so the expression is evaluated from left to right and is identical to
psi = (MPa*2970U)/2048; // = MPa*1.4501953125
There is no need for such excessive parentheses.
Edit 2:
Unfortunately, if the range of MPa is [0, 2000], then you can only multiply it by 32 without overflowing a 16-bit unsigned int. The closest ratio you can achieve is 46/32 = 1.4375, so if you need more precision there's no way around using long. Anyway, integer math with long is still a lot faster than floating-point math on the PIC MCU, and costs significantly less code space.
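As a rough sketch, two of the scalings above (2970/2^11 and 95052/2^16) could be wrapped in small functions like these. The function names are hypothetical, and the stated input limits are the points beyond which the intermediate product no longer fits in 16 or 32 bits.

```c
#include <stdint.h>

/* 16-bit-only variant, valid for mpa <= 20: 2970/2048 ~= 1.4502.
 * 20 * 2970 = 59400 still fits in an unsigned 16-bit value. */
uint16_t mpa_to_psi_u16(uint8_t mpa) {
    return (uint16_t)(((uint16_t)mpa * 2970u) >> 11);
}

/* 32-bit variant, valid for mpa <= 45185: 95052/65536 ~= 1.450378.
 * Beyond that limit a 32-bit product mpa * 95052 would wrap around. */
uint16_t mpa_to_psi_u32(uint16_t mpa) {
    return (uint16_t)(((uint32_t)mpa * 95052UL) >> 16);
}
```

The unsigned suffixes matter on a target with a 16-bit int: without them the multiplication could be carried out in a signed type and overflow earlier. Both variants return 1 for an input of 1 MPa.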
Calculate the largest N such that MPa*1.45038*2^N < 2^32
Calculate the constant value of K = floor(1.45038*2^N) once
Calculate the value of psi = (MPa*K)>>N for every value of MPa
Since 0 <= MPa <= 2000, you must choose N such that 2000*1.45038*2^N < 2^32:
2^N < 2^32/(2000*1.45038)
N < log2(2^32/(2000*1.45038))
N < 20.497
N = 20
Therefore, K = floor(1.45038*2^N) = floor(1.45038*2^20) = 1520833.
So for every value of MPa, you can calculate psi = (MPa*1520833)>>20.
You'll need to make sure that MPa is unsigned (or cast it accordingly).
Using this method will allow you to avoid floating-point operations.
For better accuracy, you can use round instead of floor, giving you K = 1520834.
In this specific case it will work out fine, because 2000*1520834 is smaller than 2^32.
But with a different maximum value of MPa or a different conversion scalar, it might not.
In any case, the difference in the outcome of psi for each value of K is negligible.
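The derivation of N and K above can be double-checked with a few lines of C. The first two functions are meant to run once on a development machine (they use double), and only the last one would run on the target. All names here are hypothetical.

```c
#include <stdint.h>

/* Largest n with max_input * factor * 2^n < 2^32.
 * Host-side helper; assumes max_input * factor is itself below 2^32. */
unsigned largest_shift(uint32_t max_input, double factor) {
    unsigned n = 0;
    while (n < 31 &&
           (double)max_input * factor * (double)(1ULL << (n + 1)) < 4294967296.0)
        n++;
    return n;
}

/* K = floor(factor * 2^n), the integer multiplier for the chosen shift. */
uint32_t scale_constant(double factor, unsigned n) {
    return (uint32_t)(factor * (double)(1ULL << n));
}

/* Target-side conversion: one 32-bit multiply and one shift.
 * Valid for 0 <= mpa <= 2000 with K = 1520833 and N = 20. */
uint16_t mpa_to_psi(uint16_t mpa) {
    return (uint16_t)(((uint32_t)mpa * 1520833UL) >> 20);
}
```

For a maximum of 2000 and a factor of 1.45038, the helpers reproduce N = 20 and K = 1520833, and mpa_to_psi(2000) gives 2900.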
BTW, if you don't mind the additional memory, then you can use a look-up table instead:
Add a pre-calculated global variable const unsigned short lut[2001] = {...}
For every value of MPa, calculate psi = lut[MPa] instead of psi = (MPa*K)>>N
So instead of mul + shift operations, your program will perform a load operation
Please note, however, that whether or not this is more efficient in terms of runtime performance depends on several things, such as the accessibility of the memory segment in which you allocate the look-up table, the architecture of the processor at hand, runtime caching heuristics, etc.
So you will need to apply some profiling on your program in order to decide which approach is better.
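A look-up table variant might look like the sketch below. The fill function is intended as a host-side generator for the table contents, and both function names are hypothetical. A table of 2001 two-byte entries takes about 4 KB, so on a small microcontroller it would most likely have to be a const array placed in program memory rather than in RAM.

```c
#include <stdint.h>

/* Host-side generator: fills lut[0..max_mpa] using the multiply-and-shift
 * formula with K = 1520833 and N = 20 (valid for max_mpa <= 2000). */
void fill_psi_lut(uint16_t *lut, uint16_t max_mpa) {
    for (uint32_t mpa = 0; mpa <= max_mpa; mpa++)
        lut[mpa] = (uint16_t)((mpa * 1520833UL) >> 20);
}

/* Target-side conversion: a single load. The caller must make sure that
 * mpa does not exceed the largest index the table was built for. */
uint16_t lookup_psi(const uint16_t *lut, uint16_t mpa) {
    return lut[mpa];
}
```

Whether the single load beats the multiply and shift still has to be measured on the actual device.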

Multiplication of bits in twos complement form

Please help me with the following two's complement multiplication logic.
              Multiplicand   Multiplier   Actual product   Cropped product
Unsigned      5 [101]        3 [011]      15 [001111]      7 [111]
Two’s comp.   −3 [101]       3 [011]      −9 [110111]      −1 [111]
I can't understand how the actual multiplication differs between unsigned and two's complement when the bits of both operands are the same.
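Multiplication of signed and unsigned integers follows different rules when the full double-width product is kept (unlike addition and subtraction, for example): unsigned operands are zero-extended, while two's complement operands are sign-extended before multiplying. The low n bits of the product are the same either way, which is why both cropped results above share the bit pattern 111.
The same bits can represent different data; the actual interpretation depends on the type.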
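The 3-bit example from the question can be reproduced with a small C sketch. The helper names are hypothetical. Each operand is widened according to its interpretation, multiplied, and the result is masked to 6 bits.

```c
/* Read the low 3 bits of v as an unsigned number (zero extension). */
int as_unsigned3(unsigned v) { return (int)(v & 0x7u); }

/* Read the low 3 bits of v as two's complement (sign extension). */
int as_signed3(unsigned v) {
    int x = (int)(v & 0x7u);
    return (x & 0x4) ? x - 8 : x;
}

/* 6-bit product pattern when both 3-bit operands are read as unsigned. */
unsigned mul3_unsigned(unsigned a, unsigned b) {
    return (unsigned)(as_unsigned3(a) * as_unsigned3(b)) & 0x3Fu;
}

/* 6-bit product pattern when both 3-bit operands are read as signed. */
unsigned mul3_signed(unsigned a, unsigned b) {
    return (unsigned)(as_signed3(a) * as_signed3(b)) & 0x3Fu;
}
```

For the operand bits 101 and 011, mul3_unsigned gives 001111 (15) and mul3_signed gives 110111 (the 6-bit pattern of -9). The two results differ only in the upper three bits, and the low three bits are 111 in both cases.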