I need to generate the following number sequences. The actual usage is to find the bin in a histogram with increasingly larger bins. The first column is simply a logarithmic sequence, such that the histogram's bins grow in powers of two. The next column (2) is a further subdivision of the first column, such that each larger bin is subdivided into two sub-bins. I have a reference discrete solution, but I am looking to further reduce the number of cycles, perhaps using floating-point tricks.
Index: (1) (2) (3) (4) (5) (6) (7)
1: 1 1 1 1 1 1 1
2: 2 2 2 2 2 2 2
3: 2 3 3 3 3 3 3
4: 3 4 4 4 4 4 4
5: 3 4 5 5 5 5 5
6: 3 5 6 6 6 6 6
7: 3 5 7 7 7 7 7
8: 4 6 8 8 8 8 8
9: 4 6 8 9 9 9 9
10: 4 6 9 10 10 10 10
11: 4 6 9 11 11 11 11
12: 4 7 10 12 12 12 12
13: 4 7 10 13 13 13 13
14: 4 7 11 14 14 14 14
15: 4 7 11 15 15 15 15
16: 5 8 12 16 16 16 16
17: 5 8 12 16 17 17 17
18: 5 8 12 17 18 18 18
19: 5 8 12 17 19 19 19
20: 5 8 13 18 20 20 20
21: 5 8 13 18 21 21 21
22: 5 8 13 19 22 22 22
23: 5 8 13 19 23 23 23
24: 5 9 14 20 24 24 24
25: 5 9 14 20 25 25 25
26: 5 9 14 21 26 26 26
27: 5 9 14 21 27 27 27
28: 5 9 15 22 28 28 28
29: 5 9 15 22 29 29 29
30: 5 9 15 23 30 30 30
31: 5 9 15 23 31 31 31
32: 6 10 16 24 32 32 32
33: 6 10 16 24 32 33 33
34: 6 10 16 24 33 34 34
35: 6 10 16 24 33 35 35
36: 6 10 16 25 34 36 36
37: 6 10 16 25 34 37 37
38: 6 10 16 25 35 38 38
39: 6 10 16 25 35 39 39
40: 6 10 17 26 36 40 40
41: 6 10 17 26 36 41 41
42: 6 10 17 26 37 42 42
43: 6 10 17 26 37 43 43
44: 6 10 17 27 38 44 44
45: 6 10 17 27 38 45 45
46: 6 10 17 27 39 46 46
47: 6 10 17 27 39 47 47
48: 6 11 18 28 40 48 48
49: 6 11 18 28 40 49 49
50: 6 11 18 28 41 50 50
51: 6 11 18 28 41 51 51
52: 6 11 18 29 42 52 52
53: 6 11 18 29 42 53 53
54: 6 11 18 29 43 54 54
55: 6 11 18 29 43 55 55
56: 6 11 19 30 44 56 56
57: 6 11 19 30 44 57 57
58: 6 11 19 30 45 58 58
59: 6 11 19 30 45 59 59
60: 6 11 19 31 46 60 60
61: 6 11 19 31 46 61 61
62: 6 11 19 31 47 62 62
63: 6 11 19 31 47 63 63
The columns show a semi-logarithmic mapping of values. This means that while the major switchover points proceed in logarithmic progression, intermediate points proceed in linear fashion. This is strongly reminiscent of IEEE-754 floating-point encodings, where the exponent field expresses the binary logarithm of the represented quantity, while the significand field provides a linear progression between powers of two. While use of IEEE-754 floating-point formats is extremely widespread, it is not universal, so this approach can only be considered semi-portable.
One idea for an efficient implementation on most modern processors, CPUs and GPUs alike, is therefore to convert the input Index into an IEEE-754 binary32 number, represented as float at the C or C++ source code level. One then extracts the appropriate bits (the exponent bits plus some leading bits of the significand) from the binary representation of that float, where the number of included significand bits increases by one with each output column, i.e. the column number acts as a granularity factor of the mapping.
There are various ways of implementing the details of the process outlined above. The most straightforward implementation is to scale the smallest input into the lowest binade of the binary32 format, either via ldexpf() or via multiplication with an appropriate power of two produced by exp2f().
The biggest caveat with this approach is that the lowest binade (biased exponent of 0) contains subnormal numbers, which are not available on platforms operating in FTZ (flush-to-zero) mode. This may be the case for both x86 SIMD and NVIDIA GPUs, for example. Another caveat is that ldexpf() or exp2f() may not be implemented efficiently, that is, via (almost) direct hardware support. Lastly, the accuracy of these functions may be insufficient. For example, with CUDA 11.8 I found that exp2f() with a negative integer argument does not always deliver the correct power-of-two result (specifically, exp2f(-127) is off by one ulp), making the variant using exp2f() fail.
Alternate approaches convert Index into a floating-point number without scaling, i.e. starting the mapping in a binade near unity. This raises the issue that for column j > 0, the first 2^j entries incorrectly have the logarithmic mapping applied to them. This can be solved by manually enforcing a linear mapping for these entries, so that the result equals Index for the first 2^j entries. The IEEE-754 exponent bias of the logarithmic portion of the computed values can be removed before or after bit-field extraction, with the preference depending on specifics of the instruction set architecture.
The four design variants described above are enumerated in the exemplary code below via the macro VARIANT, which can take values 0 through 3. From some initial experiments, it seems that when compiling with modern compilers at high optimization levels (I tried gcc, clang, and icx), coding at the assembly level may not be necessary. On platforms without IEEE-754 floating-point arithmetic, a quick simulation of the integer-to-floating-point conversion based on a CLZ (count leading zeros) instruction may be helpful.
#include <cstdio>
#include <cstdlib>
#include <cstdint>
#include <cstring>
#include <cmath>

#define VARIANT (3)

uint32_t float_as_uint32 (float a)
{
    uint32_t r;
    memcpy (&r, &a, sizeof r);
    return r;
}

// provide semi-logarithmic mapping of i to output based on granularity parameter j
int semilogmap (int i, int j)
{
    const int FP32_MANT_BITS = 23;
    const int FP32_EXPO_BIAS = 127;
#if VARIANT == 0 // this requires subnormal support and will break when using FTZ (flush to zero)!
    return float_as_uint32 ((float)i * exp2f (1 - FP32_EXPO_BIAS - j)) >> (FP32_MANT_BITS - j);
#elif VARIANT == 1 // this requires subnormal support and will break when using FTZ (flush to zero)!
    return float_as_uint32 (ldexpf ((float)i, 1 - FP32_EXPO_BIAS - j)) >> (FP32_MANT_BITS - j);
#elif VARIANT == 2
    return (i < (1 << j)) ? i :
        ((float_as_uint32 ((float)i) - ((FP32_EXPO_BIAS - 1 + j) << FP32_MANT_BITS)) >> (FP32_MANT_BITS - j));
#elif VARIANT == 3
    return (i < (1 << j)) ? i :
        ((float_as_uint32 ((float)i) >> (FP32_MANT_BITS - j)) - ((FP32_EXPO_BIAS - 1 + j) << j));
#else
#error unsupported VARIANT
#endif // VARIANT
}

int main (void)
{
    int col [64][7];
    for (int i = 1; i <= 63; i++) {
        for (int j = 0; j < 7; j++) {
            col[i][j] = semilogmap (i, j);
        }
    }
    for (int i = 1; i <= 63; i++) {
        printf ("%2d: ", i);
        for (int j = 0; j < 7; j++) {
            printf (" %2d", col[i][j]);
        }
        printf ("\n");
    }
    return EXIT_SUCCESS;
}
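To illustrate the CLZ-based fallback mentioned above, here is a minimal integer-only sketch. It assumes a compiler that provides __builtin_clz (gcc/clang; this is an assumption about the toolchain, not part of the code above), and it matches the float-based variants only while the int-to-float conversion is exact, i.e. for i < 2^24:

```c
#include <stdint.h>

/* integer-only semi-logarithmic mapping via count-leading-zeros (sketch);
   matches the float-based variants for 1 <= i < 2^24 */
int semilogmap_clz (uint32_t i, int j)
{
    if (i < (1u << j)) return (int)i;                    /* enforced linear portion */
    int e = 31 - __builtin_clz (i);                      /* e = floor(log2(i)) >= j here */
    int frac = (int)((i >> (e - j)) & ((1u << j) - 1));  /* top j bits below the leading 1 */
    return ((e + 1 - j) << j) + frac;                    /* unbias and combine */
}
```

Since the guard handles all i < 2^j, the shift amount e - j is always non-negative in the logarithmic branch.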
In terms of the number of instructions generated, it might be instructive to look at a CUDA version of variant 0 for execution on NVIDIA GPUs. I had to implement my own version of exp2f() to achieve the necessary accuracy.
/* compute 2**x with a maximum error of 2.055 ulp */
__forceinline__ __device__ float my_exp2f (float x)
{
    float r, t;
    t = x;
    if (x < -126.0f) t = t + 24.0f;
    asm ("ex2.approx.ftz.f32 %0,%1;" : "=f"(r) : "f"(t));
    if (x < -126.0f) r = r * 5.9604644775390625e-8f; // 0x1.0p-24
    return r;
}

/* semi-logarithmic mapping of i to output based on granularity parameter j */
__device__ int semilogmap (int i, int j)
{
    const int FP32_MANT_BITS = 23;
    const int FP32_EXPO_BIAS = 127;
    return __float_as_int ((float)i * my_exp2f (1 - FP32_EXPO_BIAS - j)) >> (FP32_MANT_BITS - j);
}
Compiled with nvcc -c -rdc=true -arch=sm_75 -o semilogmap.obj semilogmap.cu using the toolchain from CUDA 11.8, the following code (comprising 11 instructions, including the function return) is generated for semilogmap():
code for sm_75
Function : _Z10semilogmapii
.headerflags #"EF_CUDA_SM75 EF_CUDA_PTX_SM(EF_CUDA_SM75)"
/*0000*/ IADD3 R0, -R5.reuse, -0x7e, RZ ;
/*0010*/ I2F R4, R4 ;
/*0020*/ ISETP.GT.AND P0, PT, R5.reuse, RZ, PT ;
/*0030*/ IADD3 R6, -R5, 0x17, RZ ;
/*0040*/ I2F R3, R0 ;
/*0050*/ @P0 FADD R3, R3, 24 ;
/*0060*/ MUFU.EX2 R7, R3 ;
/*0070*/ @P0 FMUL R7, R7, 5.9604644775390625e-08 ;
/*0080*/ FMUL R7, R4, R7 ;
/*0090*/ SHF.R.S32.HI R4, RZ, R6, R7 ;
/*00a0*/ RET.ABS.NODEC R20 0x0 ;
I wish to plot the second, third, and fourth columns counted from the rightmost column using gnuplot. In awk, one can use $(NF-1), but in gnuplot I am not sure how to designate a column counted from the right with 'using'.
Is it possible to call awk from gnuplot to plot the 3rd-from-right column vs. the 4th-from-right column, or does this require a shell script?
I have a lot of long text files to plot, so rewriting each file with awk first and then running gnuplot would be too time-consuming. I want gnuplot itself to plot the 2nd, 3rd, and 4th columns from the right.
No need for awk. If you run stats, you can limit it to a single row with every ::0::0 just to determine the number of columns; that should be pretty fast. Try the following complete example:
Code:
### plotting columns from right
reset session
$Data <<EOD
11 21 31 41 51 61 71
12 22 32 42 52 62 72
13 23 33 43 53 63 73
14 24 34 44 54 64 74
15 25 35 45 55 65 75
16 26 36 46 56 66 76
17 27 37 47 57 67 77
EOD
stats $Data u 0 every ::0::0 nooutput
ColMax = STATS_columns
ColFromRight(col) = column(ColMax-col+1)
plot $Data u (ColFromRight(3)):(ColFromRight(4)) w lp pt 7
### end of code
Result: a plot of the 3rd-from-right column versus the 4th-from-right column.
You can use STATS_columns for the number of columns and use it in your plot, e.g.:
stats "data.dat" nooutput
nf = int(STATS_columns)
plot "data.dat" using 1:nf-4
I have a data set of:
32
33
34
35
34
32
29
28
27
25
29
32
34
35
36
28
27
28
28
I would like to find out how many numbers in a row are above 32. For example, an output like:
5
4
where 5 is the length of the first run of values above 32, and 4 is the length of the second run. I have been trying to do this in awk, but so far all I get is the collective count, i.e. 9 for all qualifying values combined.
Any help would be much appreciated.
awk to the rescue! I think your expected output is not consistent with the input, or I misunderstood the problem. This computes the length of each chain of values > 31:
$ awk '$1>31{c++; next} c{print c; c=0} END{if(c) print c}' file
6
4
The END block is required for the case where the last chain extends to the last element.
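If the same chain-length logic is needed outside awk, it can be sketched in C as follows (run_lengths is just an illustrative name, not an existing function):

```c
#include <stddef.h>

/* collect the lengths of consecutive runs of values strictly greater
   than a threshold; returns the number of runs found */
int run_lengths (const int *v, size_t n, int threshold, int *out)
{
    int nruns = 0, len = 0;
    for (size_t k = 0; k < n; k++) {
        if (v[k] > threshold) {
            len++;                    /* extend the current run */
        } else if (len > 0) {
            out[nruns++] = len;       /* run just ended: record its length */
            len = 0;
        }
    }
    if (len > 0) out[nruns++] = len;  /* flush a run ending at the last element */
    return nruns;
}
```

With threshold 31 and the data above this reproduces the awk output, 6 followed by 4.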
I want to learn Vernam encryption.
First of all, can you confirm that the algorithm is the same for encoding and decoding?
I have read an exercise which says to decode this message with Pi:
01237 55235 31127 12189 87479 1592
I have tried the py_vernam Python package, like this:
py_vernam.vernam('01237552353112712189874791592','3.141592653589793238462643383')
or
py_vernam.vernam('01237552353112712189874791592','31415926535897932384626433832')
But it does not give me a readable message...
Thanks
"can you confirm me that the algorithm is the same for encoding and decoding?"
-> first of all, we are talking about encryption not encoding (at the first step) ... the notable difference, to tell those two apart, is that there is a key involved here...
depending on which variant of vernam you are handling, encryption and decryption may be the same or not ... for the binary variant it surely is the same operation, a simple XOR
if you happen to have got your fingers on the "let's do this by hand" or schoolbok variant, it is not, basically because it handles values mod 10 and not mod 2, leading to encryption is + ... decryption is - ...
the notation in blocks of 5 is an indication for the mod 10 variant, since with the mod 2 variant you usually just handle binary data
01237552353112712189874791592 Ciphertext
31415926535897932384626433832 Key
========================================
70822636828325880805258368760 Text (encoded)
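The digit-wise subtraction shown above can be sketched in C (a minimal illustration; vernam10_decrypt is a made-up name, not part of any package):

```c
#include <string.h>

/* mod-10 Vernam decryption: each output digit is
   (ciphertext digit - key digit) mod 10 */
void vernam10_decrypt (const char *cipher, const char *key, char *out)
{
    size_t n = strlen (cipher);
    for (size_t k = 0; k < n; k++) {
        int c = cipher[k] - '0';   /* ciphertext digit */
        int p = key[k]    - '0';   /* key digit (here: digits of pi) */
        out[k] = (char)('0' + (c - p + 10) % 10);  /* +10 avoids negative operands */
    }
    out[n] = '\0';
}
```

Encryption is the same loop with (c + p) % 10 instead.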
so finally we have to read the encoded digits back into characters ... (second step)
it is up to the user of the cipher to give these numbers a meaning ... when searching for your example message, you can find a French page showing a substitution table ... so ... let's have a look ...
Substitution table:
Plain  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Cipher 6 38 32 4 8 30 36 34 39 31 78 72 70 76 9 79 71 58 2 0 52 50 56 54 1 59
Result:
70 8 2 2 6 36 8 2 8 32 58 8 0 8 0 52 58 36 8 76 0
M E S S A G E S E C R E T E T U R G E N T
This is not clear to me: what exactly do I have to send as the command? Does it mean I have to send it in two steps? (It is a USB hardware device connected to a PC, to which I can send commands to get information.)
1) Does the first one have to be
00 A4 08 0c 04 df 01 40 31 {Lc?? what does it mean}?
2) Does the second one have to be
00 b0 xx yy zz {what exactly do xx, yy and zz have to be??}
Reference saying as below:
To select those files, you have to use:
00 A4 08 0c 04 df 01 40 xx Lc
00 : ISO CLAss (~standard ISO)
A4 : ISO file_select INStruction
08 0c : P1 P2 for this operation (otherwise you get an error...)
04 : length of the 'path'
df 01 : directory for Id data
40 xx : the files id
31 : owner's identity (names, birth,...)
33 : owner's address
35 : owner's picture
Lc is the expected answer length (?)
Once the file is successfully selected (SW1:SW2 == 90:00),
one can 'binary_read' the content with
00 b0 xx yy zz
00 : ISO CLAss (~standard ISO APDU)
b0 : ISO read_binary INStruction
xx yy : offset in the file (256*xx)+yy
zz : number of bytes to read
Your questions are clearly answered by the comments in the specification.
Lc is the expected length of the response. (Strictly speaking, ISO 7816 calls this trailing byte Le; Lc denotes the length of the command data, the 04 in this example.)
In the 'binary_read', xx and yy are respectively the MSB and LSB of a 16-bit file offset, and zz is the number of bytes to read.
If you want to access offset n, then:
xx = n / 256 ;
yy = n % 256 ;
or alternatively:
xx = (n >> 8) & 0xff ;
yy = n & 0xff ;
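Putting the offset arithmetic together, here is a minimal sketch of building the 5-byte READ BINARY command (make_read_binary is an illustrative name, not an API from any library):

```c
#include <stdint.h>

/* fill in a 5-byte ISO 7816 READ BINARY APDU for a given offset and length */
void make_read_binary (uint16_t offset, uint8_t len, uint8_t apdu[5])
{
    apdu[0] = 0x00;                      /* CLA: standard ISO class          */
    apdu[1] = 0xB0;                      /* INS: READ BINARY                 */
    apdu[2] = (uint8_t)(offset >> 8);    /* P1: offset MSB (the xx byte)     */
    apdu[3] = (uint8_t)(offset & 0xFF);  /* P2: offset LSB (the yy byte)     */
    apdu[4] = len;                       /* number of bytes to read (zz)     */
}
```

For example, offset 1000 yields xx = 0x03 and yy = 0xE8, since 1000 = 3*256 + 232.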