How is the matrix created using the isomorphic transform function and the isomorphic inverse transform function? [closed] - cryptography

I am trying to implement the AES S-box and inverse S-box as a combinational circuit. The S-box consists of two operations: the multiplicative inverse and the affine transform. For the affine transform, the finite field is converted into a composite field using an isomorphic transform, and I have no idea how that is done. I need help deriving the matrix δ shown in the image (attached to the question) from the irreducible polynomial p(x).

The matrices in the question are used for the inversion (1/x) step. The affine transformation is a separate step and normally involves a matrix multiply followed by a column XOR, as specified by the AES algorithm. See the wiki article linked below; note that it puts the least significant bit at the top, while the article you reference and other articles put the most significant bit at the top.
https://en.wikipedia.org/wiki/Rijndael_S-box
Getting back to how those matrices are created: I found a few articles, but they not only fail to explain how the matrices are created, they also omit key information, such as the primitive element chosen for GF(2^8) based on the polynomial x^8 + x^4 + x^3 + x + 1 (0x11b) with 1-bit coefficients. That polynomial is irreducible but not primitive, meaning that x (0x02) is not a primitive element of the resulting field.
GF(2^8) is mapped to GF(((2^2)^2)^2). From the question's information, GF(2^2) uses x^2 + x + 1 (hex 7) with 1-bit coefficients to produce a 2-bit field with primitive element x = 0x2. GF((2^2)^2) uses x^2 + x + 2 (hex 16) with 2-bit coefficients from GF(2^2) to produce a 4-bit field with primitive element x = 0x4. GF(((2^2)^2)^2) uses x^2 + x + c (hex 11c) with 4-bit coefficients from GF((2^2)^2) to produce an 8-bit field with primitive element x = 0x10.
For GF(2^8) there are 128 possible primitive elements: {0x03, 0x05, 0x06, ... , 0xff}. The matrix δ can be used to identify which primitive element was chosen for GF(2^8), in this case x^4 + x^3 + x^2 + x + 1 (hex 1f).
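The count of primitive elements can be reproduced with a short brute-force sketch. This is only an illustration and not the original author's code; the helper names `gf256_mul`, `order` and `discrete_log` are made up for this example. It assumes the AES polynomial 0x11b and ordinary shift-and-xor multiplication.

```python
def gf256_mul(a, b, poly=0x11B):
    """Multiply two elements of GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:          # reduce as soon as the degree reaches 8
            a ^= poly
        b >>= 1
    return result


def order(g):
    """Multiplicative order of a non-zero element g."""
    n, acc = 1, g
    while acc != 1:
        acc = gf256_mul(acc, g)
        n += 1
    return n


def discrete_log(base, value):
    """Smallest e with base**e == value, or None if value is not a power of base."""
    acc = 1
    for e in range(255):
        if acc == value:
            return e
        acc = gf256_mul(acc, base)
    return None


# An element is primitive when its powers visit all 255 non-zero elements.
primitive = [g for g in range(1, 256) if order(g) == 255]
print(len(primitive), [hex(g) for g in primitive[:3]])

# x (0x02) has a shorter cycle, which is why it cannot serve as alpha here.
print("order of 0x02:", order(0x02))

# The answer reports 0xa0 for this logarithm; printed here for comparison.
print("log_1f(0x02):", discrete_log(0x1F, 0x02))
```

The same `discrete_log` helper can be pointed at the other powers of two (0x04, 0x08, ...) to compare against the log values listed for each column.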
The columns of the matrix δ correspond to the mapping from GF(2^8) to GF(((2^2)^2)^2) by bit: 1st column maps 0x80, 2nd column maps 0x40, ..., 7th column maps 0x02, 8th column maps 0x01. The columns are powers of 0x10 in GF(((2^2)^2)^2). For example the 7th column is 0x5f, which is 0x10^0xa0 in GF(((2^2)^2)^2). Since the 7th column is used to map 0x02 in GF(2^8), this means GF(2^8)log??(0x02) = 0xa0, and that the chosen primitive element is 0x1f, since GF(2^8)log1f(0x02) = 0xa0. The 6th column is 0x7c, which is 0x10^0x41 in GF(((2^2)^2)^2), and GF(2^8)log1f(0x04) = 0x41.
The table below shows the data for all 8 columns.
GF(2^8) log1f(0x80) = 0x64, GF(((2^2)^2)^2) 0x10^0x64 = 0xfc (1st column of matrix)
GF(2^8) log1f(0x40) = 0xc3, GF(((2^2)^2)^2) 0x10^0xc3 = 0x4b (2nd column of matrix)
GF(2^8) log1f(0x20) = 0x23, GF(((2^2)^2)^2) 0x10^0x23 = 0xb0 (3rd column of matrix)
GF(2^8) log1f(0x10) = 0x82, GF(((2^2)^2)^2) 0x10^0x82 = 0x46 (4th column of matrix)
GF(2^8) log1f(0x08) = 0xe1, GF(((2^2)^2)^2) 0x10^0xe1 = 0x74 (5th column of matrix)
GF(2^8) log1f(0x04) = 0x41, GF(((2^2)^2)^2) 0x10^0x41 = 0x7c (6th column of matrix)
GF(2^8) log1f(0x02) = 0xa0, GF(((2^2)^2)^2) 0x10^0xa0 = 0x5f (7th column of matrix)
GF(2^8) log1f(0x01) = 0x00, GF(((2^2)^2)^2) 0x10^0x00 = 0x01 (8th column of matrix)
The inverse mapping matrix can use the same logic:
GF(((2^2)^2)^2) log10(0x80) = 0x67, GF(2^8) 0x1f^0x67 = 0x84 (1st column of matrix)
GF(((2^2)^2)^2) log10(0x40) = 0xbc, GF(2^8) 0x1f^0xbc = 0xf1 (2nd column of matrix)
GF(((2^2)^2)^2) log10(0x20) = 0xab, GF(2^8) 0x1f^0xab = 0xbb (3rd column of matrix)
GF(((2^2)^2)^2) log10(0x10) = 0x01, GF(2^8) 0x1f^0x01 = 0x1f (4th column of matrix)
GF(((2^2)^2)^2) log10(0x08) = 0x66, GF(2^8) 0x1f^0x66 = 0x0c (5th column of matrix)
GF(((2^2)^2)^2) log10(0x04) = 0xbb, GF(2^8) 0x1f^0xbb = 0x5d (6th column of matrix)
GF(((2^2)^2)^2) log10(0x02) = 0xaa, GF(2^8) 0x1f^0xaa = 0xbc (7th column of matrix)
GF(((2^2)^2)^2) log10(0x01) = 0x00, GF(2^8) 0x1f^0x00 = 0x01 (8th column of matrix)
Note that in the question's image, the 1st and 6th columns of the inverse matrix have the least significant bit flipped. The pdf file linked to below has the proper matrices.
https://github.com/bpdegnan/aes/blob/master/aes-sbox/documentation/aessbox.pdf
I created a small pdf file that explains how the mapping matrices seen on page 4, matrix (8) and page 5, matrix (10) are generated and the logic behind them.
https://github.com/jeffareid/finite-field/blob/master/Composite%20Field%20Mapping%20Example.pdf
For sub-field (aka composite field) math to work, there are two main requirements. Using map() to represent the mapping from GF(2^8) to GF(((2^2)^2)^2), the following must hold while operating in GF(((2^2)^2)^2):
map(a + b) = map(a) + map(b) // addition (xor) is isomorphic
map(a · b) = map(a) · map(b) // multiplication is isomorphic
This can also be restated as: using α to represent the primitive element for GF(2^8) and β to represent the primitive element for GF(((2^2)^2)^2).
if α^i + α^j = α^k, then β^i + β^j = β^k // addition (xor) is isomorphic
if α^i · α^j = α^k, then β^i · β^j = β^k // multiplication is isomorphic
Normally β = 10000_2 (0x10), and a brute force search is done for 3 constants α, φ, δ, that result in compatible mapping and minimizing gate count, where α is the primitive element for GF(2^8), φ is the constant term for GF((2^2)^2) = x^2 + x + φ, and δ is the constant term for GF(((2^2)^2)^2) = x^2 + x + δ. In this case, α = 11111_2 (0x1f), φ = 10_2 (0x2), δ = 1100_2 (0xc).
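The two requirements can be checked mechanically. The sketch below is an illustration under stated assumptions, not the procedure from the referenced papers: it implements the tower field with the constants quoted above (1, φ = 0x2, δ = 0xc), finds every element of the tower that is a root of the AES polynomial, and uses the powers of one root as the images of the bits 0x01, 0x02, ..., 0x80. The function names are hypothetical, and the bit ordering inside a byte (high half is the coefficient of x) is an assumption that may differ from the paper's convention.

```python
def gf256_mul(a, b, poly=0x11B):
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


# Constant term c of x^2 + x + c at each level (element width in bits).
LEVEL_CONST = {2: 0x1, 4: 0x2, 8: 0xC}


def tower_mul(a, b, bits=8):
    """Multiply in GF(((2^2)^2)^2); each level is quadratic over the one below."""
    if bits == 1:
        return a & b
    half = bits // 2
    mask = (1 << half) - 1
    a1, a0 = a >> half, a & mask
    b1, b0 = b >> half, b & mask
    hh = tower_mul(a1, b1, half)
    # (a1*x + a0)(b1*x + b0) with x^2 replaced by x + c
    hi = hh ^ tower_mul(a1, b0, half) ^ tower_mul(a0, b1, half)
    lo = tower_mul(a0, b0, half) ^ tower_mul(hh, LEVEL_CONST[bits], half)
    return (hi << half) | lo


def tower_pow(a, e):
    result = 1
    for _ in range(e):
        result = tower_mul(result, a)
    return result


def aes_poly_roots():
    """Tower-field elements r with r^8 + r^4 + r^3 + r + 1 == 0."""
    return [r for r in range(256)
            if (tower_pow(r, 8) ^ tower_pow(r, 4) ^ tower_pow(r, 3) ^ r ^ 1) == 0]


def make_map(root):
    """Columns are the images of bits 0..7; the map xors the selected columns."""
    columns = [tower_pow(root, i) for i in range(8)]

    def mapping(value):
        out = 0
        for i in range(8):
            if (value >> i) & 1:
                out ^= columns[i]
        return out

    return columns, mapping


roots = aes_poly_roots()
columns, to_tower = make_map(roots[0])
print("roots:", [hex(r) for r in roots])
# Printed from the image of 0x80 down to the image of 0x01.
print("columns:", [hex(c) for c in reversed(columns)])
```

Each root gives one valid mapping matrix, so there are several candidates. Picking among them by gate count is the brute-force search described above; the image of 0x02 in the answer's table (0x5f) would correspond to one particular choice of root if the conventions match.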

Related

Eigenvalues do not match between Eigen, Numpy, LAPACKE (Ubuntu), Intel MKL

I have the following matrix M of doubles:
55.774375 61.0225 62.805625
-122.045 -125.61125 -122.045
62.805625 61.0225 55.774375
(the matrix is part of an algorithm to estimate the parameters of an ellipse from a 2D point cloud, see https://autotrace.sourceforge.net/WSCG98.pdf for reference).
And now comes the interesting part. Determining the eigenvalues and (right) eigenvectors of the matrix with different packages leads to different results for the (right) eigenvectors. Eigenvalues are for every package the same:
[-5.09420041e-13 -7.03125000e+00 -7.03125000e+00]
For numpy with python:
import numpy as np

M = np.array([[55.774375, 61.0225, 62.805625],
              [-122.045, -125.61125, -122.045],
              [62.805625, 61.0225, 55.774375]])
# eig returns the eigenvalues and the right eigenvectors (one per column)
evals, evecs = np.linalg.eig(M)
I get:
[[ 0.41608575 0.37443021 -0.80942954]
[-0.80854518 -0.82367703 0.34119147]
[ 0.41608575 0.42586167 0.47792489]]
With Eigen C++, the code looks as follows
Eigen::Matrix3d M;
M << 55.774375, 61.0225, 62.805625,
-122.045, -125.61125, -122.045,
62.805625, 61.0225, 55.774375;
Eigen::EigenSolver<Eigen::MatrixXd> solver;
solver.compute(M);
I get for the eigenvectors
0.416086 0.376456 -0.462421
-0.808545 -0.823758 0.820878
0.416086 0.423914 -0.335151
With LAPACKE (apt install liblapack-dev lapacke lapacke-dev)
double Marr[]{55.774375, 61.0225, 62.805625,
-122.045, -125.61125, -122.045,
62.805625, 61.0225, 55.774375};
char jobvl = 'N';
char jobvr = 'V';
int n=3;
int lda = n;
int ldvl = n;
int ldvr = n;
int lwork = -1;
double wr[n], wi[n], vl[ldvl*n], vr[ldvr*n];
// LAPACKE_dgeev returns the status code; store it so the check below tests a defined value
int info = LAPACKE_dgeev( LAPACK_ROW_MAJOR, 'V', 'V', n, Marr, lda, wr, wi,
vl, ldvl, vr, ldvr );
if( info > 0 ) {
printf( "The algorithm failed to compute eigenvalues.\n" );
exit( 1 );
}
I get for the eigenvectors
0.416086 0.376456 -0.788993
-0.808545 -0.823758 0.565975
0.416086 0.423914 0.239087
Similar are the results for Intel MKL.
I checked the determinant of M and it is close to zero (-4.0031989907207254e-05).
What I would like to understand is
Why do the eigenvectors for the same eigenvalues differ so much between the libraries? Is this because of the different numerical methods used to approximate them?
I understand that an eigenvalue d has many associated eigenvectors, i.e. if v is an eigenvector for d then q * v (q being a scalar) is also an eigenvector for d. Since the second and the third eigenvalue are the same, I would assume that there is some scalar that transforms one into the other, but this doesn't seem to be the case.
My algorithm fails in C++ (in python it is working) due to the different eigenvectors. Is there a way out?
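One way to investigate the discrepancy is to test whether the differing vectors are all valid eigenvectors and whether they span the same subspace. The sketch below is not part of the original post; it copies the printed eigenvector columns for the repeated eigenvalue from the NumPy and Eigen outputs above and uses loose tolerances because those printouts carry only 6 to 8 digits. If M - λI has rank 1, the eigenvalue has a two-dimensional eigenspace, and any two independent vectors in that plane are an acceptable answer, so libraries are free to return different pairs.

```python
import numpy as np

M = np.array([[55.774375, 61.0225, 62.805625],
              [-122.045, -125.61125, -122.045],
              [62.805625, 61.0225, 55.774375]])

lam = -7.03125  # the repeated eigenvalue reported in the question

# Columns 2 and 3 of the eigenvector matrices, copied from the outputs above.
numpy_vecs = np.array([[0.37443021, -0.80942954],
                       [-0.82367703, 0.34119147],
                       [0.42586167, 0.47792489]])
eigen_vecs = np.array([[0.376456, -0.462421],
                       [-0.823758, 0.820878],
                       [0.423914, -0.335151]])


def residual(vecs):
    """Largest entry of M v - lam v over the given columns."""
    return np.abs(M @ vecs - lam * vecs).max()


def same_span(a, b, tol=1e-4):
    """True when the columns of a and b together still span only a plane."""
    return np.linalg.matrix_rank(np.hstack([a, b]), tol=tol) == 2


# Dimension of the eigenspace is 3 minus this rank.
rank_shifted = np.linalg.matrix_rank(M - lam * np.eye(3), tol=1e-9)

print("rank of M - lam*I:", rank_shifted)
print("residuals:", residual(numpy_vecs), residual(eigen_vecs))
print("same plane:", same_span(numpy_vecs, eigen_vecs))
```

If downstream code depends on a specific vector within that plane, it would need an extra rule to pick one (for example a constraint from the ellipse-fitting problem), since the eigen-decomposition alone does not determine it.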

List all X coordinates by given a Y coordinate in a line profiles by DM scripting

For a line profile (curve), I want to list all X coordinates that correspond to a given Y coordinate, and then get the minimum and maximum of these X coordinates. Suppose I want to list all the X coordinates corresponding to y=8: is the script below correct, or is there a better way? Thanks.
Number minx, maxx
Image front=:getfrontimage()
GetSize( front, xsize, ysize )
for (i=0; i<xsize; i++)
{
x= getpixel(front, i, 8)
minx=min(x)
maxx=max(x)
}
Your script goes wrong where you use min and max, because you cannot get a minimum/maximum of a single value (or rather, it is always that value). What you want to do is likely:
image spec := RealImage("Test",4,100)
spec = trunc(Random()*10)
number v = 8
ClearResults()
number nCh = spec.ImageGetDimensionSize(0)
for( number i=0; i<nCh; i++)
{
if( v == sum(spec[i,0]) )
Result("\n Value "+ v +" # " + i )
}
(The sum() is needed here as a trick to convert an image-expression to a single value.)
However, going pixel-by-pixel in script can be slow. Whenever possible, try to code with image-expressions, because they are much faster (for big images).
I therefore often utilize a trick: I threshold an image for the value I search for, and then iterate over that mask as long as it is not all-zero. The max(img,x,y) command will return the first maximum if there are multiple, so I get an ordered list.
image spec := RealImage("Test",4,100)
spec = trunc(Random()*10)
spec.ShowImage()
number v = 8
image mask = (spec==v)?1:0
ClearResults()
while( 0<sum(mask) )
{
number x,y
max(mask,x,y)
Result("\n Value " + v +" # " + x )
mask[x,0]=0
}
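For readers who want to prototype the same threshold-then-collect idea outside DigitalMicrograph, here is a minimal NumPy sketch. It is an analogue only, not DM-script; the function name `positions_of_value` and the sample profile are made up for illustration.

```python
import numpy as np


def positions_of_value(profile, value):
    """Return the ordered channel indices where the profile equals value."""
    mask = (np.asarray(profile) == value)   # same role as the mask image above
    return np.flatnonzero(mask)


profile = np.array([3, 8, 1, 8, 8, 0, 5])
xs = positions_of_value(profile, 8)
print("positions:", xs.tolist())
if xs.size:                                  # min/max are undefined for an empty result
    print("min x:", int(xs.min()), "max x:", int(xs.max()))
```

The minimum and maximum are taken over the whole list of positions after the search, which is the step the original loop was missing.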
Edit: Answering the question of the comment below.
This is how one gets the ZLP maximum (position and value) from a line-profile in calibrated values.
Precursor: DM contains all data as simple arrays and values (real or integer). These are the real data and unrelated to any calibrations. You see these values if you toggle the "calibration" checkbox off in the Image Status palette:
These are the values all script commands etc. will use, i.e. positions are always indices (starting from 0) and values are the raw numeric values stored.
These images or spectra are calibrated by defining an origin and scale (and unit) for each dimensional axis as well as the intensity (=value). These triplets of values can be found in the image display info of data:
Only when the "Show calibrated values" checkbox is checked, is the data displayed in calibrated values. However, the real values remain unchanged. Just the scale/origin values are used to convert the numbers.
If you want a script to work with calibrated values, then you have to perform the same conversions in your script yourself.
Here is the example:
image spectrum := GetFrontImage()
number xScale = spectrum.ImageGetDimensionScale(0) // 0 for X dimension
number xOrigin = spectrum.ImageGetDimensionOrigin(0)
string xUnit = spectrum.ImageGetDimensionUnitString(0)
number iScale = spectrum.ImageGetIntensityScale()
number iOrigin = spectrum.ImageGetIntensityOrigin()
string iUnit = spectrum.ImageGetIntensityUnitString()
string info = "\n"
info += "Image ["+spectrum.ImageGetLabel()+"]:"
info += "\n\t Dimension calibration: nCh * " + xScale + " + " + xOrigin + " [" + xUnit + "]"
info += "\n\t Intensity calibration: (value - " + iOrigin + ") * " + iScale +" [" + iUnit + "]"
Result(info)
// Find ZLP maximum (uncalibrated values)
number maxP_ch, dummy, maxV_raw
maxV_raw = max(spectrum,maxP_ch,dummy)
info = "\n"
info += "\n\t The maximum position is at channel index: " + maxP_ch
info += "\n\t The maximum Value at maximum position is: " + maxV_raw
Result(info)
number maxP_cal = xOrigin + xScale * maxP_ch
number maxV_cal = (maxV_raw - iOrigin) * iScale
info = "\n"
info += "\n\t The maximum position is at : " + maxP_cal
info += "\n\t The maximum Value is : " + maxV_cal
Result(info)
Note the different calibration formulas between dimensional calibration and intensity calibration!
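The two conversions used in the script can be written side by side to make the difference explicit. This is a small sketch that mirrors the formulas in the script above (position = origin + scale * index, intensity = (raw - origin) * scale); the function names and the sample numbers are illustrative only.

```python
def calibrate_position(channel, origin, scale):
    """Dimension calibration: the origin is added after scaling the index."""
    return origin + scale * channel


def calibrate_intensity(raw, origin, scale):
    """Intensity calibration: the origin is subtracted before scaling."""
    return (raw - origin) * scale


def uncalibrate_position(position, origin, scale):
    """Inverse of calibrate_position, giving a (possibly fractional) channel."""
    return (position - origin) / scale


# Example numbers chosen only for illustration.
print(calibrate_position(10, origin=-5.0, scale=0.5))
print(calibrate_intensity(110.0, origin=10.0, scale=2.0))
```

Mixing the two up (for example applying the intensity form to a position) shifts every result by a constant, which is easy to miss when the origin is close to zero.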

Convert Notes to Hertz (iOS)

I have tried to write a function that takes in notes in MIDI form (C2,A4,Bb6) and returns their respective frequencies in hertz. I'm not sure what the best method of doing this should be. I am torn between two approaches. 1) a list based one where I can switch on an input and return hard-coded frequency values given that I may only have to do this for 88 notes (in the grand piano case). 2) a simple mathematical approach however my math skills are a limitation as well as converting the input string into a numerical value. Ultimately I've been working on this for a while and could use some direction.
You can use a function based on this formula:
The basic formula for the frequencies of the notes of the equal tempered scale is given by
f_n = f_0 * a^n
where
f_0 = the frequency of one fixed note which must be defined. A common choice is setting the A above middle C (A4) at f_0 = 440 Hz.
n = the number of half steps away from the fixed note you are. If you are at a higher note, n is positive. If you are on a lower note, n is negative.
f_n = the frequency of the note n half steps away.
a = 2^(1/12) = the twelfth root of 2 = the number which when multiplied by itself 12 times equals 2 = 1.059463094359...
http://www.phy.mtu.edu/~suits/NoteFreqCalcs.html
In Objective-C, this would be:
+ (double)frequencyForNote:(Note)note withModifier:(Modifier)modifier inOctave:(int)octave {
int halfStepsFromA4 = note - A;
halfStepsFromA4 += 12 * (octave - 4);
halfStepsFromA4 += modifier;
double frequencyOfA4 = 440.0;
double a = 1.059463094359;
return frequencyOfA4 * pow(a, halfStepsFromA4);
}
With the following enums defined:
typedef enum : int {
C = 0,
D = 2,
E = 4,
F = 5,
G = 7,
A = 9,
B = 11,
} Note;
typedef enum : int {
None = 0,
Sharp = 1,
Flat = -1,
} Modifier;
https://gist.github.com/NickEntin/32c37e3d31724b229696
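The question also asked about turning a string such as C2, A4 or Bb6 into a number. A possible sketch of that parsing step combined with the formula above is shown here in Python for brevity; the regular expression, the accepted accidentals (`#` and `b`) and the function name `note_to_hz` are assumptions of this example rather than part of the answer's code.

```python
import re

# Half steps above C within one octave, matching the Note enum above.
SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
ACCIDENTALS = {"": 0, "#": 1, "b": -1}


def note_to_hz(name, a4=440.0):
    """Convert a name like 'A4' or 'Bb6' to its equal-tempered frequency."""
    match = re.fullmatch(r"([A-Ga-g])([#b]?)(-?\d+)", name)
    if match is None:
        raise ValueError(f"unrecognised note name: {name!r}")
    letter, accidental, octave = match.groups()
    half_steps = SEMITONES[letter.upper()] - SEMITONES["A"]
    half_steps += 12 * (int(octave) - 4)
    half_steps += ACCIDENTALS[accidental]
    # f_n = f_0 * a^n with a = 2^(1/12)
    return a4 * 2.0 ** (half_steps / 12.0)


for name in ("C2", "A4", "Bb6"):
    print(name, round(note_to_hz(name), 3))
```

Computing the ratio as `2 ** (n / 12)` avoids carrying a rounded constant for the twelfth root of 2, so octaves come out as exact doublings.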
Why don't you use a MIDI pitch? The standard conversion is
f = 440 * 2^((d - 69) / 12)
where f is the frequency, and d the MIDI data.
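A minimal sketch of the MIDI route, assuming the usual convention that MIDI note number 69 is A4 at 440 Hz (that convention is an assumption of this example; the function names are illustrative):

```python
import math


def midi_to_hz(note_number, a4=440.0):
    """Frequency of a MIDI note number, with note 69 tuned to a4."""
    return a4 * 2.0 ** ((note_number - 69) / 12.0)


def hz_to_midi(frequency, a4=440.0):
    """Nearest MIDI note number for a frequency in Hz."""
    return round(69 + 12.0 * math.log2(frequency / a4))


print(midi_to_hz(69), midi_to_hz(57), round(midi_to_hz(60), 3))
print(hz_to_midi(440.0))
```

With this representation the 88 piano keys are simply the integers 21 to 108, so no lookup table is needed.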

accelerate framework cepstrum peak find

I'm trying to find the peak values of a cepstrum analysis with the Accelerate framework. I always get the peak at the end or at the beginning of the frame. I'm analysing audio from the microphone in real time. What is wrong with my code? It is below:
OSStatus microphoneInputCallback (void *inRefCon,
AudioUnitRenderActionFlags *ioActionFlags,
const AudioTimeStamp *inTimeStamp,
UInt32 inBusNumber,
UInt32 inNumberFrames,
AudioBufferList *ioData){
// get reference of test app we need for test app attributes
TestApp *this = (TestApp *)inRefCon;
COMPLEX_SPLIT complexArray = this->fftA;
void *dataBuffer = this->dataBuffer;
float *outputBuffer = this->outputBuffer;
FFTSetup fftSetup = this->fftSetup;
uint32_t log2n = this->fftLog2n;
uint32_t n = this->fftN; // 4096
uint32_t nOver2 = this->fftNOver2;
uint32_t stride = 1;
int bufferCapacity = this->fftBufferCapacity; // 4096
SInt16 index = this->fftIndex;
OSStatus renderErr;
// observation objects
float *observerBufferRef = this->observerBuffer;
int observationCountRef = this->observationCount;
renderErr = AudioUnitRender(rioUnit, ioActionFlags,
inTimeStamp, bus1, inNumberFrames, this->bufferList);
if (renderErr < 0) {
return renderErr;
}
// Fill the buffer with our sampled data. If we fill our buffer, run the
// fft.
int read = bufferCapacity - index;
if (read > inNumberFrames) {
memcpy((SInt16 *)dataBuffer + index, this->bufferList->mBuffers[0].mData, inNumberFrames*sizeof(SInt16));
this->fftIndex += inNumberFrames;
} else {
// If we enter this conditional, our buffer will be filled and we should PERFORM FFT.
memcpy((SInt16 *)dataBuffer + index, this->bufferList->mBuffers[0].mData, read*sizeof(SInt16));
// Reset the index.
this->fftIndex = 0;
/*************** FFT ***************/
//multiply by window
vDSP_vmul((SInt16 *)dataBuffer, 1, this->window, 1, this->outputBuffer, 1, n);
// We want to deal with only floating point values here.
vDSP_vflt16((SInt16 *) dataBuffer, stride, (float *) outputBuffer, stride, bufferCapacity );
/**
Look at the real signal as an interleaved complex vector by casting it.
Then call the transformation function vDSP_ctoz to get a split complex
vector, which for a real signal, divides into an even-odd configuration.
*/
vDSP_ctoz((COMPLEX*)outputBuffer, 2, &complexArray, 1, nOver2);
// Carry out a Forward FFT transform.
vDSP_fft_zrip(fftSetup, &complexArray, stride, log2n, FFT_FORWARD);
vDSP_ztoc(&complexArray, 1, (COMPLEX *)outputBuffer, 2, nOver2);
complexArray.imagp[0] = 0.0f;
vDSP_zvmags(&complexArray, 1, complexArray.realp, 1, nOver2);
bzero(complexArray.imagp, (nOver2) * sizeof(float));
// scale
float scale = 1.0f / (2.0f*(float)n);
vDSP_vsmul(complexArray.realp, 1, &scale, complexArray.realp, 1, nOver2);
// step 2 get log for cepstrum
float *logmag = malloc(sizeof(float)*nOver2);
for (int i=0; i < nOver2; i++)
logmag[i] = logf(sqrtf(complexArray.realp[i]));
// configure float array into acceptable input array format (interleaved)
vDSP_ctoz((COMPLEX*)logmag, 2, &complexArray, 1, nOver2);
// create cepstrum
vDSP_fft_zrip(fftSetup, &complexArray, stride, log2n-1, FFT_INVERSE);
//convert interleaved to real
float *displayData = malloc(sizeof(float)*n);
vDSP_ztoc(&complexArray, 1, (COMPLEX*)displayData, 2, nOver2);
float dominantFrequency = 0;
int currentBin = 0;
float dominantFrequencyAmp = 0;
// find peak of cepstrum
for (int i=0; i < nOver2; i++){
//get current frequency magnitude
if (displayData[i] > dominantFrequencyAmp) {
// DLog("Bufferer filled %f", displayData[i]);
dominantFrequencyAmp = displayData[i];
currentBin = i;
}
}
DLog("currentBin : %i amplitude: %f", currentBin, dominantFrequencyAmp);
}
return noErr;
}
I haven't worked with the Accelerate Framework, but your code appears to be taking the proper steps to calculate the Cepstrum.
The Cepstrum of real acoustic signals tends to have a very large DC component, a large peak at and near zero quefrency [sic]. Just ignore the near-DC portion of the Cepstrum and look for peaks above 20 Hz frequency (above quefrency of Cepstrum_Width/20Hz).
If the input signal contains a series of very closely spaced overtones, the Cepstrum will also have a large peak at the high quefrency end.
For example, the plot below shows the Cepstrum of a Dirichlet Kernel of N=128 and Width=4096, the spectrum of which is a series of very closely spaced overtones.
You may want to use a static synthetic signal to test and debug your code. A good choice for a test signal is any sinusoid with a fundamental F and several overtones at exact integer multiples of F.
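A reference implementation on a synthetic signal can help separate algorithm problems from buffer-handling problems. The sketch below is in Python with NumPy rather than Accelerate, and it is only an illustration under several assumptions: the test signal is a pulse train (a fundamental plus all of its integer harmonics at equal amplitude), the sample rate and search band are arbitrary choices, and a quefrency index q is converted to Hz as sample_rate / q.

```python
import numpy as np


def cepstral_pitch(signal, sample_rate, fmin=50.0, fmax=1000.0, eps=1e-6):
    """Estimate the fundamental from the largest cepstral peak inside a band."""
    n = len(signal)
    # Periodic Hann window, applied to the float samples before the first FFT.
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * np.arange(n) / n)
    spectrum = np.fft.rfft(signal * window)
    log_mag = np.log(np.abs(spectrum) + eps)     # eps avoids log(0)
    cepstrum = np.fft.irfft(log_mag, n=n)        # index = quefrency in samples

    # Skip the large near-zero-quefrency region by searching only a pitch band.
    q_lo = int(np.ceil(sample_rate / fmax))
    q_hi = min(int(sample_rate / fmin), n // 2)
    peak = q_lo + int(np.argmax(cepstrum[q_lo:q_hi + 1]))
    return sample_rate / peak


sample_rate = 44100
n, period = 4096, 128                 # one pulse every 128 samples
test_signal = np.zeros(n)
test_signal[::period] = 1.0

estimate = cepstral_pitch(test_signal, sample_rate)
print("true f0:", sample_rate / period, "estimate:", estimate)
```

Restricting the search to a band is what keeps the estimate away from the first and last bins, which is where an unrestricted maximum search tends to land.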
Your Cepstra should look something like the following examples.
First a synthetic signal.
The plot below shows the Cepstrum of a synthetic steady-state E2 note, synthesized using a typical near-DC component, a fundamental at 82.4 Hz, and 8 harmonics at integer multiples of 82.4 Hz. The synthetic sinusoid was programmed to generate 4096 samples.
Observe the prominent non-DC peak at 12.36. The Cepstrum width is 1024 (the output of the second FFT), therefore the peak corresponds to 1024/12.36 = 82.8 Hz which is very close to 82.4 Hz the true fundamental frequency.
Now a real acoustical signal.
The plot below shows the Cepstrum of a real acoustic guitar's E2 note. The signal was not windowed prior to the first FFT. Observe the prominent non-DC peak at 542.9. The Cepstrum width is 32768 (the output of the second FFT), therefore the peak corresponds to 32768/542.9 = 60.4 Hz which is fairly far from 82.4 Hz the true fundamental frequency.
The plot below shows the Cepstrum of the same real acoustic guitar's E2 note, but this time the signal was Hann windowed prior to the first FFT. Observe the prominent non-DC peak at 268.46. The Cepstrum width is 32768 (the output of the second FFT), therefore the peak corresponds to 32768/268.46 = 122.1 Hz which is even farther from 82.4 Hz the true fundamental frequency.
The acoustic guitar's E2 note used for this analysis was sampled at 44.1 kHz with a high-quality microphone under studio conditions. It contains essentially zero background noise, no other instruments or voices, and no post-processing.
References:
Real audio signal data, synthetic signal generation, plots, FFT, and Cepstral analysis were done here: Musical instrument cepstrum

Optimizing Vector elements swaps using CUDA

Since I am new to cuda .. I need your kind help
I have this long vector, for each group of 24 elements, I need to do the following:
for the first 12 elements, the even numbered elements are multiplied by -1,
for the second 12 elements, the odd numbered elements are multiplied by -1 then the following swap takes place:
Graph: because I don't yet have enough points, I couldn't post the image so here it is:
http://www.freeimagehosting.net/image.php?e4b88fb666.png
I have written this piece of code, and wonder if you could help me further optimize it to solve for divergence or bank conflicts ..
//subvector is a multiple of 24, Mds and Nds are shared memory
__shared__ double Mds[subVector];
__shared__ double Nds[subVector];
int tx = threadIdx.x;
int tx_mod = tx ^ 0x0001;
int basex = __umul24(blockDim.x, blockIdx.x);
Mds[tx] = M.elements[basex + tx];
__syncthreads();
// flip the signs
if (tx < (tx/24)*24 + 12)
{
//if < 12 and even
if ((tx & 0x0001)==0)
Mds[tx] = -Mds[tx];
}
else
if (tx < (tx/24)*24 + 24)
{
//if >12 and < 24 and odd
if ((tx & 0x0001)==1)
Mds[tx] = -Mds[tx];
}
__syncthreads();
if (tx < (tx/24)*24 + 6)
{
//for the first 6 elements .. swap with last six in the 24elements group (see graph)
Nds[tx] = Mds[tx_mod + 18];
Mds [tx_mod + 18] = Mds [tx];
Mds[tx] = Nds[tx];
}
else
if (tx < (tx/24)*24 + 12)
{
// for the second 6 elements .. swp with next adjacent group (see graph)
Nds[tx] = Mds[tx_mod + 6];
Mds [tx_mod + 6] = Mds [tx];
Mds[tx] = Nds[tx];
}
__syncthreads();
Thanks in advance ..
Paul gave you pretty good starting points in your previous questions.
A couple of things to watch out for: you are doing division by a number that is not a power of two, which is expensive.
Instead, try to utilize the multidimensional nature of the thread block. For example, make the x-dimension of size 24, which eliminates the need for division.
In general, try to fit the thread block dimensions to reflect your data dimensions.
Simplify the sign flipping: for example, if you do not want to flip the sign, you can still multiply by the identity 1. Figure out how to map even/odd numbers to 1 and -1 using just arithmetic: for example sign = (even*2+1) - 2, where even is either 1 or 0.
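The arithmetic sign idea can be prototyped on the host before moving it into the kernel. The sketch below is in Python with NumPy, not CUDA, and is only an illustration: it assumes the 0-based even/odd convention of the posted kernel (`tx & 1`), and the function names are made up. It compares a branchy reference against a branch-free formula built from the position's parity and which half of the 24-element group it falls in.

```python
import numpy as np


def signs_branchy(n):
    """Reference: follow the if/else logic of the posted kernel."""
    out = np.ones(n, dtype=int)
    for i in range(n):
        p = i % 24                       # position inside the 24-element group
        if p < 12 and p % 2 == 0:        # first half: even positions flip
            out[i] = -1
        elif p >= 12 and p % 2 == 1:     # second half: odd positions flip
            out[i] = -1
    return out


def signs_arithmetic(n):
    """Branch-free: sign is -1 exactly when parity equals the half index."""
    p = np.arange(n) % 24
    half = p // 12                       # 0 for the first 12, 1 for the second 12
    parity = p & 1
    return 2 * (parity ^ half) - 1


print(signs_arithmetic(24).tolist())
```

Inside a kernel the same expression would let every thread execute one multiply, with no divergent branch; with a block x-dimension of 24 the `% 24` disappears as well, since `threadIdx.x` is already the position in the group.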