Generate a merkle tree consistency proof - cryptography

I read the documentation about merkle tree here: Certificate Transparency Version 2.0, but I still do not quite understand the algorithm that generate the consistency proof.
The example in the documentation:
old version
new version
The consistency proof between old version and new version is PROOF(3, D[7]) = [c, d, g, l]. c, g are used to verify hash0, and d, l are additionally used to show hash is consistent with hash0.
Can someone show me how to use the algorithm to generate these proof?

Related

Pointwise function equality proof [duplicate]

I'm just starting playing with idris and theorem proving in general. I can follow most of the examples of proofs of basic facts on the internet, so I wanted to try something arbitrary by my own. So, I want to write a proof term for the following basic property of map:
map : (a -> b) -> List a -> List b
prf : map id = id
Intuitively, I can imagine how the proof should work: Take an arbitrary list l and analyze the possibilities for map id l. When l is empty, it's obvious; when
l is non-empty it's based on the concept that function application preserves equality.
So, I can do something like this:
prf' : (l : List a) -> map id l = id l
It's like a for all statement. How can I turn it into a proof of the equality of the functions involved?
You can't. Idris's type theory (like Coq's and Agda's) does not support general extensionality. Given two functions f and g that "act the same", you will never be able to prove Not (f = g), but you will only be able to prove f = g if f and g are defined the same, up to alpha and eta equivalence or so. Unfortunately, things only get worse when you consider higher-order functions; there's a theorem about such in the Coq standard library, but I can't seem to find or remember it right now.

Generating Artificial Test Clusters

Most authors generated their test clusters using Milligan's algorithm (well cited paper,Milligan, G.: An algorithm for creating artificial test clusters. Psychometric 50 (1985) 123–127). And A public domain implementation of this algorithm is available from Dave Dubin (Dubin, D.: clusgen.c. http://alexia.lis.uiuc.edu/ ̃dubin/ (1996)). However, this link is not available. Could you please tell me if there is any other implementation of this algorithm or other ways to generate artificial test clusters? Thanks in advance!
You may want to consider using MixSim in R or CARP in C which is considered to be some sort of a modern gold standard for assessing clustering. CARP appears to be more flexible. You can get the R package from CRAN, and the C package from MLOSS
Hope this helps!

Mathematica can't solve DSolve[{f[0] ==d,f'[0]==v0,f''[t] == -g*m2/f[t]^2}, f, t]?

In[11]:= $Version
Out[11]= 9.0 for Linux x86 (32-bit) (November 20, 2012)
In[12]:= DSolve[{f[0] == d, f'[0] == v0, f''[t] == g*m2/f[t]^2}, f, t]
DSolve::bvimp: General solution contains implicit solutions. In the boundary
value problem these solutions will be ignored, so some of the solutions will
be lost.
Out[12]= {}
The code above pretty much says it all. I get the same error if I replace g*m2 with 1.
This seems like a really simple DFQ to solve. I'd like to tell DSolve to assume all variables are real and that d, g, and m2 are all greater than 0, but there's unfortunately no way to do that.
Thoughts?
You are trying for a symbolic solution. And unfortunately, symbolic integration is hard (while symbolic differentiation is easy).
The way this integration works is to obtain the energy functional by integrating once
E = 1/2*f'[t]^2 + C/f[t]
and then to isolate f'[t]. The resulting integral is not easy to solve and leads to the mentioned implicit solutions.
Did you really want to get the symbolic solution or only some function table to plot the solutions or compute other related quantities?
Since it was clarified that the requested quantity is the maximum of certain solutions: This can be computed by setting v=0 in the energy equation
C/x = E = 1/2*v0^2 + C/x0
or
x = C*x0/(C + 1/2*v0^2*x0 )
One would have to analyze the time aspect to make sure that this extremum is reached before passing again at the initial point x0.

Detecting duplicate or similar PDFs using elasticsearch

I'm trying to find a good way to identify whether I have duplicate/highly similar PDFs in a system if they're not exactly alike (ie checksums are different because the pages in the pdf have been either been rearranged, deleted, or merged with other pages).
So a simple example would be:
Original PDF contains pages (A, B, C, D)
New PDF entering system contains pages (D, B, C, A, E, F) or (D, G, H, I, B) or any other iteration where some of the content is resides somewhere else in the original PDF.
Any suggestions for robust methods to determine match/similarity thresholds? Such as identifying that a new PDF is 80% similar to the original.
We are using elasticsearch for search purposes in our system but I haven't found a good way to query or use the score to come up with a useful percentage/number to use as a success threshold.
Any thoughts/ideas/suggestions would be most appreciated.

Generator G's requirement to be a primitive root modulo p in the Diffie Hellman algorithm

Having searched, I've found myself confused by the use of P and G in the Diffie Hellman algorithm. There is requirementy that P is prime, and G is a primitive root of P.
I understand the security is based on the difficulty of factoring the result of two very large prime numbers, so I have no issue with that. However, there appears to be little available information on the purpose of G being a primitive root of P. Can anyone answer why this requirement exists (with references if possible)? Does it just increase the security? Given that shared keys can be created with apparently any combination of p and g, even ones that are not prime, I find this intriguing. It can surely only be for security? If so, how does it increase it?
Thanks in advance
Daniel
If g is not a primitive root of p, g will only generate a subgroup of GFp. This has consequences for the security properties of the system: the security of the system will only be proportional to the order of g in GFp instead of proportional to the full order of GFp.
To take a small example: select p=13 and g=3.
The order of 3 in GF_13 is 3 (3^1=3, 3^2=9, 3^3=1).
Following the usual steps of Diffie-Hellman, Alice and Bob should each select integers a, b between 1 and p-1 and calculate resp. A = ga and B = gb. To brute force this, an attacker should expect to try all possible values of a (or b) between 1 and p-1 until he finds a value that yields A (or B). But since g was not a primitive root modulo p, he only need to try the values 1, 2 and 3 in order to find a solution a' so that A = ga'. And the secret is s = gab = (ga)b =(ga')b = ga'b = (gb)a' = Ba', which the attacker can now calculate.
There is no requirement that the generator g used for Diffie-Hellman is a primitive root nor is this even a common choice. Much more popular is to choose g such that it generates a prime order subgroup. I.e. the order of g is a prime q, which is a large prime factor of p-1.
For example the Diffie-Hellman groups proposed for IKE have been chosen such that p is a safe prime and g generates the subgroup of order (p-1)/2.
One motivation for choosing g as a generator of a big prime order subgroup is that this allows to make
the decisional Diffie-Hellman assumption. This assumption does not hold if g is a primitive root, and that makes the analysis of the implemented protocols somewhat harder.
The security of Diffie-Hellman is not based on the difficulty of factoring. It is based on the (assumed) difficulty of calculating general discrete logarithms.
g must be a primitive root of p for the algorithm to be correct and useable. It ensures that for every number 0 <= x < p, there is a distinct value of gx mod p. That is, it ensures that g can "generate" every value in the finite field.