Finding Coefficients of LFSR - cryptography

I am studying cryptography from Christof Paar's book. There is a question about LFSRs that I have trouble with; there is one point I just can't understand. The question is this:
We want to perform an attack on another LFSR-based stream cipher. In order
to process letters, each of the 26 uppercase letters and the numbers 0, 1, 2, 3, 4, 5
are represented by a 5-bit vector according to the following mapping:
A -> 0 = 00000
.
.
.
Z -> 25 = 11001
0 -> 26 = 11010
.
.
.
5 -> 31 = 11111
(binary)
We happen to know the following facts about the system:
-The degree of the LFSR is m = 6.
-Every message starts with the header WPI
We observe now on the channel the following message (the fourth letter is a zero): j5a0edj2b
What are the feedback coefficients of the LFSR? (This one!)
Solution:
I can't understand the matrix in this solution. Where do these numbers come from?

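Using the header WPI, we know the plaintext begins with
P=>(10110)(01111)(01000)
Using j5a0edj2b, we have the ciphertext
C=>(01001)(11111)(00000)(11010)(00100)(00011)(01001)............
Adding P and C mod 2 then gives the first 15 bits of the key stream:
S=>(11111)(10000)(01000)....
We build the matrix from the key stream bits
s0=1,s1=1,s2=1,s3=1,s4=1,s5=1,s6=0,s7=0,s8=0,s9=0,s10=0,s11=1 etc.
Each row of the matrix is a window of six consecutive key stream bits:
first row.... (s0,s1,s2,s3,s4,s5)
second row....(s1,s2,s3,s4,s5,s6)
third row.....(s2,s3,s4,s5,s6,s7)
4th row (s3,s4,s5,s6,s7,s8)
5th row (s4,s5,s6,s7,s8,s9)
last row (s5,s6,s7,s8,s9,s10)
The rows come from the LFSR recurrence: row i multiplied by the coefficient vector (p0,...,p5) must equal the next key stream bit s(i+6) mod 2, so the right-hand side of the system is (s6,s7,s8,s9,s10,s11).
These calculations are given in detail in the material on LFSRs.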
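To make the origin of the matrix concrete, here is a minimal Python sketch that rebuilds the key stream from the known header and solves the 6x6 system over GF(2). It assumes the recurrence s(i+6) = p0*s(i) + ... + p5*s(i+5) mod 2; the helper names `to_bits` and `solve_gf2` are made up for this illustration.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ012345"  # A..Z -> 0..25, 0..5 -> 26..31

def to_bits(text):
    """Encode each symbol as 5 bits, most significant bit first."""
    bits = []
    for ch in text.upper():
        value = ALPHABET.index(ch)
        bits.extend((value >> shift) & 1 for shift in range(4, -1, -1))
    return bits

def solve_gf2(matrix, rhs):
    """Gauss-Jordan elimination mod 2; assumes the matrix is invertible."""
    n = len(matrix)
    rows = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] == 1)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col and rows[r][col] == 1:
                rows[r] = [x ^ y for x, y in zip(rows[r], rows[col])]
    return [row[n] for row in rows]

m = 6
plain = to_bits("WPI")
cipher = to_bits("j5a0edj2b")
# XOR of known plaintext and ciphertext gives the first 15 key stream bits
keystream = [p ^ c for p, c in zip(plain, cipher)]

matrix = [keystream[i:i + m] for i in range(m)]  # rows (s_i, ..., s_{i+5})
rhs = keystream[m:2 * m]                         # (s6, ..., s11)
coefficients = solve_gf2(matrix, rhs)
print(coefficients)  # [1, 1, 0, 0, 0, 0], i.e. p0 = p1 = 1
```

Under that convention the remaining known bits s12, s13, s14 are also reproduced by the recurrence, which is a useful sanity check on the result.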

Related

Store non-binary values into a unique integer

In 8 bits we can store 8 values that are each 0 or 1. Put another way, we can store 8 different pieces of data in a single number in the range 0 to 255.
0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 → 8 different pieces of data
↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓
[bit] [bit] [bit] [bit] [bit] [bit] [bit] [bit] → 8 bits total
If instead of storing 0 or 1 we need to store 0, 1 or 2, then each value takes more bits, of course. With 2 bits per value we could actually store 0, 1, 2 or 3; that is more than I need, but it uses the same amount of space.
0/1/2 0/1/2 0/1/2 0/1/2 → 4 different pieces of data (what I need)
0/1/2/3 0/1/2/3 0/1/2/3 0/1/2/3 → 4 different pieces of data (more than I need)
↓ ↓ ↓ ↓
[2bits] [2bits] [2bits] [2bits] → 8 bits total
We can agree that we are "losing" storage capacity. The solution is to do a radix conversion (I don't know if that's the right term) to base 3. This way we store exclusively 0, 1 or 2, and the same 8 bits hold more information (to be honest, I don't know how the bits organize themselves for this to work).
Examples:
00000₃ → int 0
01212₃ → int 50
11111₃ → int 121
12121₃ → int 151
22222₃ → int 242 (max)
In this conversion, we were able to store 5 pieces of information composed of 0, 1 or 2, instead of just 4. And there's still a "space" left, and that's what I'd like to talk about.
I was wondering if there is any way to store mixed-base data instead of fixed-base data (e.g. base 3, as in the example above). To be clearer, suppose I wanted to store two pieces of data that are each 0, 1 or 2, and one piece of data that is a number between 0 and 8. In binary, and still limited to 256, we can do it as follows:
0/1/2 0/1/2 0/1/2/3/4/5/6/7/8 → 3 different pieces of data
↓ ↓ ↓
[2bits] [2bits] [ 4 bits ] → 8 bits total
But it seems to me that we have empty "spaces" left again, since this doesn't seem to use the full capacity of 8 bits, and maybe it's still possible to add more data if we use number base conversion instead.
The problem is: I don't know how to handle the data this way, so that I can reliably merge it and split it again.
In the real world: I need to transfer a lot of data that carries very little information (e.g. numbers 0 to 2, 0 to 5, 0 to 10, etc.). I'm currently packing this data bit by bit. In my case, any byte saved is a very nice gain (if this is really possible, maybe there is a 20~40% data saving).
I understand that rebasing, merging and splitting might cost more processing, but that won't be an issue, because this processing will be done on the client side (merge when sending and split when receiving), and the server handles the same data already optimized (no additional processing).
--
Possible solutions:
Let's imagine that I have a set of three numbers that can each be 0, 1 or 2, and another set of three numbers that can each go from 0 to 8 (so 9 possibilities each). The goal is to merge all these numbers into a single integer (which takes about 2 bytes with my best current method).
Solution 1: each number being stored in a single complete byte:
byte A from 0 to 2
byte B from 0 to 2
byte C from 0 to 2
byte D from 0 to 8
byte E from 0 to 8
byte F from 0 to 8
Problem: it will be necessary to consume 6 bytes to store these 6 numbers.
Solution 2: storage through bits:
2 bits to store A
2 bits to store B
2 bits to store C
4 bits to store D
4 bits to store E
4 bits to store F
6 unused bits to complete the last byte
Problems: in addition to 2 bits being "more than necessary" to store numbers from 0 to 2 (A, B and C), and same for 4 bits from 0 to 8, we will use a total of 3 bytes and still have 6 "dead" (unused) bits at end to complete the last byte.
Solution 3: Separate into two sets and convert their respective bases individually:
A, B and C to base 3 consumes 1 byte
D, E and F to base 9 consumes 2 bytes
Problem: this is a good solution for now, and it's what I'm using (the example above aside, in more complex situations it can be more compact than solution 2). Still, I believe there's a lot of "left over" space in this union, and maybe it's possible to squeeze everything into only 2 bytes.
For example, the conversion of 222₃ consumes up to 5 bits, which means that in this byte we still have 3 unused bits. The 888₉ conversion consumes 10 bits. This makes me see the possibility of using only 15 bits with only 1 unused bit (2 bytes).
Then we come to the next solution.
Solution 4: move the bits to further optimize space:
higher bits store A, B and C
lower bits store D, E and F
Example:
[5 bits of numbers of base 3] +
[10 bits of numbers of base 9] +
[1 unused bit]
Problem: this solution is better than the one I am currently using. However, I still see room to improve in situations where more information could be packed in through number base conversion and union.
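To estimate how much a single combined integer could save, one can count the combinations and take the bit length. The following is a small Python sketch; `bits_needed` is a name invented for this illustration.

```python
import math

def bits_needed(radices):
    """Bits required to hold one value per radix in a single combined integer."""
    combinations = math.prod(radices)        # number of distinct states
    return (combinations - 1).bit_length()   # bits of the largest packed value

# Solution 2: fixed bit fields (2 bits per 0..2 value, 4 bits per 0..8 value)
fixed_fields = 3 * 2 + 3 * 4
# Solution 4: base 3 and base 9 groups converted separately
separate_groups = bits_needed([3, 3, 3]) + bits_needed([9, 9, 9])
# Everything combined into one mixed-base integer
combined = bits_needed([3, 3, 3, 9, 9, 9])

print(fixed_fields, separate_groups, combined)  # 18 15 15
```

For this particular set of values the combined integer needs the same 15 bits as Solution 4, so any gain depends on the ranges involved. For example, `bits_needed([3, 3, 9])` is 7, against 8 bits for the 2+2+4 bit-field layout.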
You can pack your values in the following way.
Example:
possible values for
a = [0,2] -> 3 states
b = [0,2] -> 3 states
c = [0,8] -> 9 states
Let's assume you have the following values:
a := 1
b := 2
c := 8
Then you can calculate the final number in the following way:
int number = 0;
int multi = 1;          // place value of the next field
number += multi * a;    // a has 3 states
multi *= 3;
number += multi * b;    // b has 3 states
multi *= 3;
number += multi * c;    // c has 9 states
number now holds your packed values (79 for the values above).
To unpack, you just need to do:
a = number % 3;
number = number / 3;    // integer division
b = number % 3;
number = number / 3;
c = number % 9;
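The same steps generalize to any list of ranges. Below is a minimal Python sketch of that generalization; the function names `pack` and `unpack` and the `radices` parameter are illustrative, not part of the answer above.

```python
def pack(values, radices):
    """Merge values into one integer; values[i] must lie in range(radices[i])."""
    number = 0
    multi = 1  # place value of the current field
    for value, radix in zip(values, radices):
        if not 0 <= value < radix:
            raise ValueError(f"{value} does not fit in radix {radix}")
        number += multi * value
        multi *= radix
    return number

def unpack(number, radices):
    """Split a packed integer back into its fields, in the same order."""
    values = []
    for radix in radices:
        values.append(number % radix)
        number //= radix
    return values

radices = [3, 3, 9]
packed = pack([1, 2, 8], radices)
print(packed)                   # 79
print(unpack(packed, radices))  # [1, 2, 8]
```

Both sides must agree on the order and size of the radices, since the packed integer itself carries no description of its layout.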

Understanding Pandas Series Data Structure

I am trying to get my head around the Pandas module and started learning about the Series data structure.
I have created the following Series in Spyder:
import pandas as pd
songs = pd.Series(data=[145, 142, 38, 13], name="Count")
I can obtain information about the Series index using the code:
songs.index
The output of the above code is as follows:
RangeIndex(start=0, stop=4, step=1)
My question is: where it states start=0 and stop=4, what are these referring to?
I have interpreted start=0 as meaning that the first element in the Series is in row 0.
But I am not sure what the stop value refers to, as there is no element in row 4 of the Series.
Can someone explain?
Thank you.
This concept, already explained adequately in the comments (the last index is the count of items minus one), is prevalent in many places.
For instance, take the list data structure:
z = songs.to_list()
[145, 142, 38, 13]
len(z)
4 # length is four
# however, indexing stops at position i-1, where i is the length/count of items in the list
z[4] # this will raise an IndexError
# you have to start at index 0 and go only as far as index 3 (i.e. 4 items)
z[0], z[1], z[2], z[-1] # notice how -1 can be used to directly access the last element
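The start/stop/step values follow the same half-open convention as Python's built-in `range`, where stop is excluded. A small sketch using plain `range` (not pandas itself) with the values reported for the index:

```python
# Same start/stop/step as the index reported for the Series
index_like = range(0, 4, 1)

positions = list(index_like)
print(positions)        # [0, 1, 2, 3]
print(len(index_like))  # 4 labels, one per element of the Series
print(4 in index_like)  # False: stop marks the end and is not itself a label
```

So stop=4 means "stop before 4", which is why the last row is 3.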

Find length of key in vigenere cipher

I am new to cryptography. Kindly help me solve the following Vigenere cipher problem with well-defined steps.
Assume you are given a 300-character message encrypted with the Vigenere cryptosystem, in which you know the plaintext word CRYPTOGRAPHY occurs exactly two times, and you know that the ciphertext sequence TICRMQUIRTJR is the encryption of CRYPTOGRAPHY. The first occurrence starts at character position 10 and the second at character position 241 (we start counting from 1). What is the length of the key used for encryption?
Answer is 7
Solution: To estimate the period we use the Kasiski test. The distance between the two given occurrences is
241 − 10 = 231 = 3 · 7 · 11
positions.
Possible periods are thus 3, 7 and 11. Because the plaintext is known, we can immediately find the
corresponding shifts: at position 10 the shift is
T − C = 19 − 2 = 17 = r.
Similar computations for the other positions give the shift keys
rrectcorrect
We now see that this is not periodic with period 3 or 11, while period 7 is possible. The keyword
of length 7 starts at position 15; hence the keyword is
correct.
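The shift computation and the period check can be reproduced with a short Python sketch. The helper names `shifts` and `is_periodic` are invented for this illustration, and only the candidate periods 3, 7 and 11 from the solution are tested.

```python
def shifts(plaintext, ciphertext):
    """Key letter at each position: (cipher - plain) mod 26, as a lowercase letter."""
    return "".join(
        chr((ord(c) - ord(p)) % 26 + ord("a"))
        for p, c in zip(plaintext, ciphertext)
    )

def is_periodic(key, period):
    """True if the key fragment repeats with the given period."""
    return all(key[i] == key[i + period] for i in range(len(key) - period))

key_fragment = shifts("CRYPTOGRAPHY", "TICRMQUIRTJR")
print(key_fragment)  # rrectcorrect

candidates = [p for p in (3, 7, 11) if is_periodic(key_fragment, p)]
print(candidates)    # [7]

# Align the fragment (which starts at position 10, counting from 1) to the key
start, period = 10, candidates[0]
letters = [""] * period
for offset, letter in enumerate(key_fragment):
    letters[(start + offset - 1) % period] = letter
keyword = "".join(letters)
print(keyword)       # correct
```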

Regular expression (0+1)*1(0+1)*0 DFA

I'm trying to understand the regular expression (0+1)*1(0+1)*0. Could you provide examples that match this pattern?
Let me explain:
1 - (0+1) means one or more 0s, then a 1
2 - (0+1)* means the previous line any number of times (can be 0 times)
3 - (0+1)*1 means the previous line and then a 1
4 - (0+1)*0 means line 2 and then a 0
10 works: 0 times (0+1), then a 1, then 0 times (0+1), then a 0.
00000000000100000000000110 works: eleven 0s and a 1, twice (this is (0+1)*). Then a 1. Then no (0+1), and the last 0. A few other examples:
10
00001000010000110000100001000010
01010110
0110
I hope you understood (I'm not a native English speaker, sorry).
EDIT : There are a lot of websites that can help you with regular expressions, whether it is learning or testing regex.
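The examples can be checked with Python's `re` module, which reads `0+` as "one or more 0s", as the explanation above does. Note that in formal-language notation (the kind usually used alongside DFAs) `+` denotes union, so `(0+1)` would mean "0 or 1"; the sketch below writes that reading as `(0|1)` and tests both. The string `0010` is an extra example added here to show where the two readings differ.

```python
import re

# Reading used in the explanation above: "0+" is one or more 0s
engine_reading = re.compile(r"(0+1)*1(0+1)*0")
# Formal-language reading: "+" is union, written "|" in Python
formal_reading = re.compile(r"(0|1)*1(0|1)*0")

examples = [
    "10",
    "00000000000100000000000110",
    "00001000010000110000100001000010",
    "01010110",
    "0110",
]

for text in examples:
    # Every example from the answer matches under both readings
    print(text, bool(engine_reading.fullmatch(text)), bool(formal_reading.fullmatch(text)))

# Under the union reading any string that contains a 1 and ends in 0 matches
print(bool(engine_reading.fullmatch("0010")))  # False
print(bool(formal_reading.fullmatch("0010")))  # True
```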

Discrete Binary Search Main Theory

I have read this: https://www.topcoder.com/community/competitive-programming/tutorials/binary-search.
I can't understand some parts:
What we can call the main theorem states that binary search can be
used if and only if for all x in S, p(x) implies p(y) for all y > x.
This property is what we use when we discard the second half of the
search space. It is equivalent to saying that ¬p(x) implies ¬p(y) for
all y < x (the symbol ¬ denotes the logical not operator), which is
what we use when we discard the first half of the search space.
But I think this condition does not hold when we want to find an element in an array (checking for equality only). It only holds when we test an inequality, for example when we search for an element greater than or equal to our target value.
Example: we are looking for 5 in this array.
indexes=0 1 2 3 4 5 6 7 8
1 3 4 4 5 6 7 8 9
We define p(x) =>
if(a[x]==5) return true else return false
Step one => middle index = (0+8)/2 = 4 ==> a[4]=5
so p(4) is true, and from the main theorem the result is that
p(x+1) ........ p(n) are all true, but they are not.
So what is the problem?
We CAN use that theorem when looking for an exact value, because we
only use it when discarding one half. If we are looking for, say, 5,
and we find, say, 6 in the middle, then we can discard the upper half,
because we now know (due to the theorem) that all items in there are > 5.
Also notice that if we have a sorted sequence and want to find any element
that satisfies an inequality, looking at the end elements is enough.
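One common way to make this concrete is to search with the monotone predicate p(x) = (a[x] >= target), which does satisfy the theorem on a sorted array, and to test for equality only at the end. A minimal Python sketch follows; `first_true` and `find` are names chosen for this illustration.

```python
def first_true(lo, hi, p):
    """Smallest x in [lo, hi) with p(x) true, or hi if there is none.

    Assumes p is monotone: once true, it stays true for all larger x.
    """
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if p(mid):
            hi = mid        # mid may be the answer; discard everything above it
        else:
            lo = mid + 1    # p(mid) is false, so p is false for all x <= mid
    return lo

def find(a, target):
    """Index of the first occurrence of target in sorted list a, or -1."""
    x = first_true(0, len(a), lambda i: a[i] >= target)
    return x if x < len(a) and a[x] == target else -1

a = [1, 3, 4, 4, 5, 6, 7, 8, 9]
print(find(a, 5))  # 4
print(find(a, 2))  # -1
```

The equality predicate from the question is not monotone, which is why it cannot be plugged into the theorem directly; the inequality version is, and the final comparison recovers the exact match.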