I am trying to implement the SHA-2 hash algorithm on a gated quantum computer (actually, a simulator), and I am having some trouble understanding the theory behind it. The papers I read start with the ring of polynomials given by $\mathbb{F}_2[x]/(1 + x + x^3 + x^6 + x^8)$. What is the significance of the polynomial $1 + x + x^3 + x^6 + x^8$?
We know that O(n) + O(n) = O(n), and even that O(n) + O(n) + ... + O(n) = O(n^2).
But what happens with O(n) + O(n^2)?
Is it O(n) or O(n^2)?
The Big O notation (https://en.wikipedia.org/wiki/Big_O_notation) is used to describe the limiting behaviour of an algorithm, that is, how fast its complexity grows. Therefore, when an algorithm has both a linear and a quadratic component, only the quadratic component remains in Big O notation.
As you can see from the attached image, the quadratic curve grows much faster (along the y-axis) than the linear curve, so the overall complexity of the algorithm is determined by the quadratic curve alone, hence O(n^2).
The case O(n) + O(n) = O(n) is due to the fact that any constant factor can be discarded in Big O notation: the curves y = n and y = 2n have the same order of growth (although with a different slope).
The case O(n) + ... + O(n) = O(n^2) is not generally true! With k such terms the complexity is O(k*n), which is still O(n) when k is a constant. Only if the number of terms k equals the size of your input n do you end up with the quadratic case.
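The two points above can be checked numerically. The following is only a rough sketch; the function names `linear_plus_quadratic` and `repeated_linear` are made up for illustration and stand in for step counts of a hypothetical algorithm.

```python
def linear_plus_quadratic(n):
    """Step count of a hypothetical algorithm doing n + n^2 units of work."""
    return n + n * n


def repeated_linear(n, k):
    """Step count of k passes that each cost n units of work."""
    return k * n


for n in (10, 1_000, 100_000):
    # The ratio to n^2 approaches 1, so the linear term stops mattering.
    ratio = linear_plus_quadratic(n) / (n * n)
    print(n, ratio)

# A fixed number of passes stays linear; n passes becomes quadratic.
print(repeated_linear(1_000, 3))      # 3 * n
print(repeated_linear(1_000, 1_000))  # n * n
```

The ratio printed in the loop is exactly 1 + 1/n, which is why the linear term can be dropped as n grows.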
I have a few algorithm complexities whose Big O notation I'm not entirely sure of.
i) ((n-1)(n-1) * ... * 2 * 1)/2
ii) 56 + 2n + n^2 + 3n^3
iii) 2n(lg n) + 1001
iv) n^2 * n^3 + 2^n
I believe ii) and iii) are pretty straightforward with the Big O of ii) being O(n^3) and the Big O of iii) being O(n log n) but let me know if these are wrong.
It's mostly i) and iv) I'm a bit confused about. For i) I assumed it followed the same idea as 1+2+3+4+...+n, which is O(n^2), so that's what I put. For iv) I put O(n^5), but I'm not sure whether the 2^n affects the Big O notation in this case. Which term gets priority here, or do I just include them both?
Any help would be much appreciated, I'm not that experienced in Big O notation so any advice would be really helpful as well.
Thanks in advance
Since problem i) is multiplying (not adding) the terms from 1 to n, that should be O(n!).
You're right on ii) n^3 is the dominant term, so it's O(n^3), and on iii) both constants 2 and 1001 can be ignored leaving you with O(n log n).
On iv) you were right to combine the first two terms to get n^5, but even that will eventually be surpassed by the 2^n term, so the answer is O(2^n).
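To see where 2^n overtakes n^5, a small brute-force check is enough. This is just an illustrative sketch; the helper name `last_n_where_poly_wins` and the search limit of 200 are arbitrary choices.

```python
def last_n_where_poly_wins(limit=200):
    """Return the largest n in [2, limit) for which n^5 >= 2^n."""
    last = None
    for n in range(2, limit):
        if n ** 5 >= 2 ** n:
            last = n
    return last


crossover = last_n_where_poly_wins()
print(crossover)                                   # 22
print((crossover + 1) ** 5, 2 ** (crossover + 1))  # 6436343 8388608
```

From n = 23 onward the exponential term is the larger one, and the gap only widens after that, which is why iv) is O(2^n).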
For my Crypto course, I'm given two polynomials in compact form and an irreducible polynomial, and am asked to perform the 4 basic arithmetic operations in GF(2^8). Having accomplished addition and multiplication, I'm now wondering how to approach subtraction and division. For convenience, let's assume the inputs are always bit sequences of 8 bits:
1st bit sequence: 11001100
2nd bit sequence: 11110000
irreducible polynomial(fixed): x^8 + x^4 + x^3 + x^1
How do I perform subtraction/division?
The polynomial x^8 + x^4 + x^3 + x^1 is not irreducible: x is obviously a factor! My bet is on a confusion with x^8 + x^4 + x^3 + x + 1, which is the lexicographically first irreducible polynomial of degree 8.
After we correct the polynomial, GF(2^8) is a field in which every element is its own opposite. This implies subtraction is the same as addition.
Multiplication * in that field, less zero, forms a group of 255 elements. Hence for any non-zero B, it holds B^255 = 1. Hence the multiplicative inverse of such B is B^254.
Hence one can compute A / B as B^254 * A when B is not zero. If it is taken that division by zero yields zero, the formula works without special case.
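As a quick illustration with the two bit sequences from the question: addition of polynomials over GF(2) is a bitwise XOR of the coefficient bytes, so subtraction is the very same XOR. The names `gf_add` and `gf_sub` below are just labels chosen for this sketch.

```python
def gf_add(x, y):
    """Add two GF(2^8) elements: coefficient-wise addition mod 2 is XOR."""
    return x ^ y


# Every element is its own opposite, so subtraction is the same operation.
gf_sub = gf_add

a = 0b11001100  # 1st bit sequence from the question
b = 0b11110000  # 2nd bit sequence from the question

diff = gf_sub(a, b)
print(format(diff, "08b"))             # 00111100
print(format(gf_add(diff, b), "08b"))  # 11001100, back to a
```

No reduction by the irreducible polynomial is needed here, since XOR never raises the degree.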
B^254 can be computed using 13 multiplications by the standard binary exponentiation method (square and multiply), successively raising B to the 2nd, 3rd, 6th, 7th, 14th, 15th, 30th, 31st, 62nd, 63rd, 126th, 127th and 254th powers. C code in this answer on crypto.SE. It is possible to get down to 11 multiplications, and build a whole inverse table with 255 multiplications; Try It Online!.
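A minimal Python sketch of this approach, assuming the corrected polynomial x^8 + x^4 + x^3 + x + 1 (0x11B), might look as follows. The function names are chosen for this example, and this Python version makes no attempt at being constant-time; it only illustrates the arithmetic.

```python
IRREDUCIBLE = 0x11B  # x^8 + x^4 + x^3 + x + 1


def gf_mul(a, b):
    """Multiply two GF(2^8) elements by shift-and-add, reducing as we go."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a       # add the current multiple of a
        b >>= 1
        a <<= 1               # multiply a by x
        if a & 0x100:
            a ^= IRREDUCIBLE  # reduce modulo the irreducible polynomial
    return result


def gf_inv(b):
    """Return b^254 using 13 multiplications (square and multiply)."""
    power = b                         # b^1
    for _ in range(6):
        power = gf_mul(power, power)  # exponents 2, 6, 14, 30, 62, 126
        power = gf_mul(power, b)      # exponents 3, 7, 15, 31, 63, 127
    return gf_mul(power, power)       # exponent 254


def gf_div(a, b):
    """Compute a / b as b^254 * a; division by zero yields zero."""
    return gf_mul(gf_inv(b), a)


quotient = gf_div(0b11001100, 0b11110000)
print(format(quotient, "08b"))
print(gf_mul(quotient, 0b11110000) == 0b11001100)  # True
```

Multiplying the quotient back by the divisor recovers the dividend, which is a convenient sanity check for any implementation of the division.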
Other, slightly faster ways to obtain (one) modular inverse include the Extended Euclidean GCD algorithm and log/antilog tables; see this other answer on crypto.SE. However:
When speed is an issue, we can pre-tabulate the modular inverses.
That's more complex.
Computation of the modular inverse as B^254 is constant-time (as desirable in cryptography to prevent side-channel timing attacks) under the main condition that multiplication is, whereas that's next to impossible to ensure with other methods, including tables, on modern hardware and in most computer languages, save perhaps assembly.
I'm having trouble understanding how a CNN filter is able to give a higher value to perfectly fitting patches when you have grayscale images with big white zones.
For example, imagine that I have the following 3x3 filter:
0-1-0
0-1-0
0-1-0
And this filter is applied to an image with big, completely white zones. For example, I could have a patch of that image like this:
255-255-255
255-255-255
255-255-255
and for this patch, the kernel would return (0*255 + 0*255 + 0*255) + (1*255 + 1*255 + 1*255) + (0*255 + 0*255 + 0*255) = 765
and if I apply the same filter to this patch image:
0-255-0
0-255-0
0-255-0
I would get the same value: (0*0 + 0*0 + 0*0) + (1*255 + 1*255 + 1*255) + (0*0 + 0*0 + 0*0) = 765
But the last image patch should get a much better value from the kernel, so I am going crazy trying to understand how this really works.
Thanks in advance!
Well, after a few days of thinking about it, I have found the answer to my question: use negative values in the kernel. After seeing so many kernel examples with 1's and 0's, I didn't think that the values could be negative too.
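Here is a small sketch of the idea. The signed kernel below is just one example I picked for illustration (2 in the middle column, -1 on the sides, so each row sums to zero); a trained CNN would learn its own values.

```python
def apply_kernel(kernel, patch):
    """Sum of element-wise products of a 3x3 kernel and a 3x3 patch."""
    return sum(
        k * p
        for k_row, p_row in zip(kernel, patch)
        for k, p in zip(k_row, p_row)
    )


original_kernel = [[0, 1, 0]] * 3   # kernel from the question
signed_kernel = [[-1, 2, -1]] * 3   # illustrative kernel with negatives

white_patch = [[255, 255, 255]] * 3
line_patch = [[0, 255, 0]] * 3

print(apply_kernel(original_kernel, white_patch))  # 765
print(apply_kernel(original_kernel, line_patch))   # 765
print(apply_kernel(signed_kernel, white_patch))    # 0
print(apply_kernel(signed_kernel, line_patch))     # 1530
```

With the negative weights, the uniform white patch cancels out to 0 while the vertical line still scores high, so the two patches are no longer indistinguishable.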
I'm new to the gensim package and vector space models in general, and I'm unsure of what exactly I should do with my LSA output.
To give a brief overview of my goal, I'd like to enhance a Naive Bayes classifier using topic modeling to improve classification of reviews (positive or negative). Here's a great paper I've been reading that has shaped my ideas but left me still somewhat confused about implementation.
I've already got working code for Naive Bayes--currently, I'm just using unigram bag of words as my features and labels are either positive or negative.
Here's my gensim code
from pprint import pprint # pretty printer
import gensim as gs
# tutorial sample documents
docs = ["Human machine interface for lab abc computer applications",
        "A survey of user opinion of computer system response time",
        "The EPS user interface management system",
        "System and human system engineering testing of EPS",
        "Relation of user perceived response time to error measurement",
        "The generation of random binary unordered trees",
        "The intersection graph of paths in trees",
        "Graph minors IV Widths of trees and well quasi ordering",
        "Graph minors A survey"]
# stoplist removal, tokenization
stoplist = set('for a of the and to in'.split())
# for each document: lowercase document, split by whitespace, and add all its words not in stoplist to texts
texts = [[word for word in doc.lower().split() if word not in stoplist] for doc in docs]
# create dict
dict = gs.corpora.Dictionary(texts)
# create corpus
corpus = [dict.doc2bow(text) for text in texts]
# tf-idf
tfidf = gs.models.TfidfModel(corpus)
corpus_tfidf = tfidf[corpus]
# latent semantic indexing with 10 topics
lsi = gs.models.LsiModel(corpus_tfidf, id2word=dict, num_topics=10)
for i in lsi.print_topics():
    print(i)
Here's the output
0.400*"system" + 0.318*"survey" + 0.290*"user" + 0.274*"eps" + 0.236*"management" + 0.236*"opinion" + 0.235*"response" + 0.235*"time" + 0.224*"interface" + 0.224*"computer"
0.421*"minors" + 0.420*"graph" + 0.293*"survey" + 0.239*"trees" + 0.226*"paths" + 0.226*"intersection" + -0.204*"system" + -0.196*"eps" + 0.189*"widths" + 0.189*"quasi"
-0.318*"time" + -0.318*"response" + -0.261*"error" + -0.261*"measurement" + -0.261*"perceived" + -0.261*"relation" + 0.248*"eps" + -0.203*"opinion" + 0.195*"human" + 0.190*"testing"
0.416*"random" + 0.416*"binary" + 0.416*"generation" + 0.416*"unordered" + 0.256*"trees" + -0.225*"minors" + -0.177*"survey" + 0.161*"paths" + 0.161*"intersection" + 0.119*"error"
-0.398*"abc" + -0.398*"lab" + -0.398*"machine" + -0.398*"applications" + -0.301*"computer" + 0.242*"system" + 0.237*"eps" + 0.180*"testing" + 0.180*"engineering" + 0.166*"management"
Any suggestions or general comments would be appreciated.
I just started working on the same problem, but with SVM instead. AFAIK, after training your model you need to do something like this:
new_text = 'here is some document'
text_bow = dict.doc2bow(new_text.lower().split())  # doc2bow expects a list of tokens, not a raw string
vector = lsi[text_bow]
Here vector is the topic representation of your document: a list of (topic id, weight) pairs with length up to the number of topics you chose for training, 10 in your case.
So you need to represent all your documents as topic vectors and then feed them to the classification algorithm.
P.S. I know it's kind of an old question, but I keep seeing it in Google results every time I search )
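Most classifiers want a fixed-length feature vector, while the LSI output is a sparse list of (topic id, weight) pairs that may leave out some topics. A rough sketch of the conversion, using a hypothetical helper `to_dense` and a made-up example vector (not real model output), could be:

```python
def to_dense(sparse_vec, num_topics):
    """Turn [(topic_id, weight), ...] into a fixed-length list of floats."""
    dense = [0.0] * num_topics
    for topic_id, weight in sparse_vec:
        dense[topic_id] = weight
    return dense


# Made-up vector in the same shape as an LSI result, for illustration only.
example_vector = [(0, 0.4), (2, -0.3)]

features = to_dense(example_vector, num_topics=5)
print(features)  # [0.4, 0.0, -0.3, 0.0, 0.0]
```

As far as I know, gensim also ships `gensim.matutils.sparse2full(doc, length)`, which does the same job and returns a NumPy array, so the helper above is only there to show what happens.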