bitcoin block solving, all nonces used but no hit - block

I'm trying to understand how bitcoin block solving attempts work.
I see a nonce is a 32-bit number, so around 4 billion values to try.
Also, I saw a famous mining pool having 500 Ph/s power at hand. And I found there one particular block solved in 40 minutes.
So, that is (40 x 60) x (500 x 10^15) = 1.2 x 10^21 hashes calculated
on that pool, to solve one block.
That means the nonces have been "cycled" about 279 billion times during those 40 minutes.
So I'm wondering what those 279 billion other things are that get done after each nonce cycle? ("1 cycle of nonces" is going from 0 to 4294967295.)
I see that we can change the timestamp within certain limits, and the Merkle root hash also.
Aren't Merkle hashes and timestamps more strictly constrained than nonces?
Are those 279 billion things changes of the timestamp and Merkle root only? Can we have as many unique Merkle hashes regenerated and timestamp changes as needed?
Can you give me examples? Sorry if my view is a bit biased, I'm starting with this.

Apparently, I've found that when the nonces have cycled (overflowed), an extraNonce value is incremented, and that requires the Merkle root to be recalculated based on that extraNonce value.
a link here
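To make the extraNonce mechanism concrete, here is a minimal Python sketch, not real mining code. It assumes a toy coinbase transaction (a real one carries the extraNonce inside its coinbase input), fixed version and bits fields, and a deliberately easy target; `mine`, `build_header` and the other names are hypothetical. It shows the 80-byte header layout, the inner loop over the nonce, and the outer loop that bumps the extraNonce and recomputes the Merkle root once the nonce range is used up.

```python
import hashlib
import struct


def dsha256(data: bytes) -> bytes:
    """Bitcoin's double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(txids):
    """Pairwise-hash 32-byte txids up to a single root.

    On a level with an odd count, the last hash is paired with itself.
    """
    level = list(txids)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [dsha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def build_header(prev_hash, root, timestamp, nonce, version=1, bits=0x1D00FFFF):
    """Pack the 80 bytes: version, prev hash, Merkle root, time, bits, nonce."""
    return (struct.pack("<I", version) + prev_hash + root
            + struct.pack("<III", timestamp, bits, nonce))


def mine(prev_hash, other_txids, target, timestamp=0,
         max_extra_nonce=16, nonce_count=1 << 32):
    """Hypothetical helper: try every nonce, then roll the extraNonce."""
    for extra_nonce in range(max_extra_nonce):
        # Toy coinbase txid (assumption): only the extraNonce varies.
        coinbase_txid = dsha256(b"toy-coinbase" + struct.pack("<Q", extra_nonce))
        # A new coinbase txid changes the Merkle root, hence the whole header.
        root = merkle_root([coinbase_txid] + list(other_txids))
        for nonce in range(nonce_count):
            digest = dsha256(build_header(prev_hash, root, timestamp, nonce))
            if int.from_bytes(digest, "little") < target:
                return extra_nonce, nonce, digest
    return None  # a real miner would also adjust the timestamp and keep going
```

Each extraNonce value opens a fresh range of 2^32 nonces, so the number of distinct headers is not limited by the 32-bit nonce field.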

Related

Crack sha256 when you know the pass form

Is it possible to write a code that can crack the sha256 hash when you know the form of password? For example the password form is *-********** which is 12-13 characters long and:
The first char is one number from 1 to 25
Second one is hyphen
In each char from the third one to the end, you can put a...z, A...Z and 0...9
After guessing each password, the code computes its SHA-256 hash, checks whether the result equals our hash, and prints the correct password when it matches.
I know the number of possibilities is big, (26+26+10)^10 for the last ten characters alone, but I want to know:
Is it possible to write such code?
If yes, is it possible to run whole code in less than one day (because I think it takes a lot of time to complete the whole code)?
Since I can't ask you to write a code for me, how and where can I ask for this code?
You cannot "crack" a SHA256 hash no matter how much information you know about the plaintext (assuming by crack you mean derive the plaintext from the hash). Even if you knew the password you could not determine any procedure for reversing the hash. In technical terms, there is no known way to perform a preimage attack on a SHA256 hash.
That means you have to resort to guessing or brute forcing the password:
You have a prefix, which can be any value in [1-25]- and 10 additional characters in [a-zA-Z0-9]. That means the total number of possible passwords is: 25 * 62^10 or 20,982,484,146,708,505,600.
If you were able to compute and check a billion passwords per second it would take you 20,982,484,146 seconds to generate every possible hash. If you start now you'll be finished in about 665 years.
If you are able to leverage some more computing power and generate a trillion hashes per second it would only take a bit more than half a year. The good news is that computing hashes can be done in parallel, so it is easy to utilize multiple machines. The bad news is that kind of computing power isn't going to be cheap.
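The arithmetic above can be checked with a few lines of Python. This is only a sketch; the helper names are made up for illustration, and a year is taken as 365.25 days.

```python
SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY

# 25 possible prefixes ("1-" .. "25-") times 62 choices for each of 10 characters.
KEYSPACE = 25 * (26 + 26 + 10) ** 10


def years_to_exhaust(hashes_per_second):
    """Worst-case time to try every candidate, in years."""
    return KEYSPACE / hashes_per_second / SECONDS_PER_YEAR


def rate_for_deadline(seconds):
    """Hash rate needed to try every candidate within `seconds`."""
    return KEYSPACE / seconds


print(KEYSPACE)                                      # 20982484146708505600
print(round(years_to_exhaust(1e9)))                  # 665
print(round(years_to_exhaust(1e12), 2))              # 0.66
print("%.1e" % rate_for_deadline(SECONDS_PER_DAY))   # 2.4e+14
```

On average the password turns up after about half the keyspace has been tried, so these are worst-case figures.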
To answer your questions:
Is it possible to write such code? It is possible to write a program that will iterate over the entire range of possible passwords and check it against the hash(es) you want to determine the plaintext for.
If yes, is it possible to run whole code in less than one day. Yes, if you can compute and check around 10^15 hashes per second.
How and where can I ask for this code? This is the least of your problems.
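For what it's worth, the iteration itself is short. Below is a minimal single-threaded Python sketch, assuming the password is hashed as plain ASCII with no salt; `candidates` and `crack` are hypothetical names. The parameters let you shrink the search space, because the full space is far too large for this loop.

```python
import hashlib
import itertools
import string

ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits


def candidates(suffix_len=10, prefixes=range(1, 26), alphabet=ALPHABET):
    """Yield every password of the form <1..25>-<suffix_len characters>."""
    for number in prefixes:
        for chars in itertools.product(alphabet, repeat=suffix_len):
            yield "%d-%s" % (number, "".join(chars))


def crack(target_hex, **search_space):
    """Return the first candidate whose SHA-256 hex digest matches, else None."""
    for password in candidates(**search_space):
        digest = hashlib.sha256(password.encode("ascii")).hexdigest()
        if digest == target_hex:
            return password
    return None


# Tiny demonstration: 2-character suffix over a 3-letter alphabet.
demo_target = hashlib.sha256(b"7-ab").hexdigest()
print(crack(demo_target, suffix_len=2, alphabet="abc"))  # 7-ab
```

Splitting the work across machines is a matter of giving each one a different slice of `prefixes` or of the first suffix character.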
Fortunately, since bitcoin uses sha256, it is pretty easy to find rough numbers on the amount of computing power it takes to generate the number of hashes you need.
If the numbers in this article are correct a Raspberry Pi can generate 2*10^5 hashes per second. I believe the newer Raspberry Pis are more powerful than that so I'm going to double that to 4*10^5. You need to generate about 10^15 hashes per second to be done in less than a day.
At 4*10^5 hashes per second each, you're going to need about 2,500,000,000 Raspberry Pis.

Is there any option to use redis.expire more elastically?

I have a quick, simple question.
Assume that if the server receives 10 messages from a user within 10 minutes, the server sends a push email.
At first I thought this would be very simple using Redis:
incr("foo"), expire("foo",60*10)
and in Java, handle the occurrence count like below:
if (Integer.parseInt(jedis.get("foo")) >= 10) { sendEmail(); jedis.del("foo"); }
But imagine the user sends one message in the first minute and 8 messages in the 10th minute.
Then the key expires, and the user sends 3 more messages in the next minute.
The Redis key will be created again with value 3, which will not trigger sendEmail() even though the user actually sent 11 messages in 2 minutes.
We're going to use Redis and we don't want to put receive-time values into Redis.
Is there any solution?
So, there are two ways of solving this: one optimizes for space and the other for speed (though really the speed difference should be marginal).
Optimizing for Space:
Keep up to 9 different counters; foo1 ... foo9. Basically, we'll keep one counter for each of the possible up to 9 different messages before we email the user, and let each one expire as it hits the 10 minute mark. This will work like a circular queue. Now do this (in Python for simplicity, assuming we have a connection to Redis called r):
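new_created = False
for i in range(1, 10):
    var_name = 'foo%d' % i
    # Start one new counter per message, in the first free slot.
    if not (new_created or r.exists(var_name)):
        r.set(var_name, 0)
        r.expire(var_name, 600)  # each counter lives for 10 minutes
        new_created = True
    if not r.exists(var_name):
        continue
    # Every live counter counts this message; incr returns the new value.
    if r.incr(var_name, 1) >= 10:
        send_email(user)
        r.delete(var_name)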
If you go with this approach, put the above logic in a Lua script instead of the example Python, and it should be quite fast. Since you'll at most be storing 9 counters per user, it'll also be quite space efficient.
Optimizing for speed:
Keep one Redis Sorted Set per user. Every time a user sends a message, add an entry to his sorted set with a score equal to the timestamp and an arbitrary unique member. Then just do a ZCOUNT(now - 10 minutes, now) and send an email if that's at least 10. Then ZREMRANGEBYSCORE(-inf, now - 10 minutes) to drop the entries that have aged out of the window. I know you said you didn't want to keep timestamps in Redis, but IMO this is a better solution, and you're going to have to hold some variant on timestamps somewhere no matter what.
Personally I'd go with the latter approach because the space differences are probably not that big, and the code can be done quickly in pure Redis, but up to you.
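Here is a rough Python sketch of the sorted-set approach. It assumes `r` is a redis-py client (`redis.Redis`); the key name, the `record_message` function and the reset-after-email behaviour are my own choices for illustration, not something the question specifies. Trimming happens before counting so that the count only covers the last 10 minutes.

```python
import time
import uuid

WINDOW_SECONDS = 600   # 10 minutes, from the question
THRESHOLD = 10         # messages that trigger the email


def record_message(r, user_id, now=None):
    """Record one message; return True when the email should be sent.

    `r` is assumed to be a redis-py client. The key layout is hypothetical.
    """
    now = time.time() if now is None else now
    key = "msgs:%s" % user_id
    cutoff = now - WINDOW_SECONDS

    # Score = timestamp; the member must be unique, otherwise ZADD would
    # only update the score of an existing entry.
    r.zadd(key, {uuid.uuid4().hex: now})
    # Drop entries that have fallen out of the 10-minute window.
    r.zremrangebyscore(key, "-inf", cutoff)
    # Let Redis clean up the whole key for users who go quiet.
    r.expire(key, WINDOW_SECONDS)

    if r.zcount(key, cutoff, now) >= THRESHOLD:
        r.delete(key)  # reset so the next email needs 10 fresh messages
        return True
    return False
```

With the scenario from the question (1 message, then 8 around the 10-minute mark, then 3 more a minute later) this returns True on the second of the last three messages. To make the three commands atomic you would wrap them in a MULTI/EXEC pipeline or a Lua script.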

Developing Rainbow Tables

I am currently working on a parallel computing project where I am trying to crack passwords using rainbow tables.
The first step that I have thought of is to implement a very small version of it that cracks passwords of length 5 or 6 (only numeric passwords to begin with). To begin with, I have some questions about the configuration settings.
1 - What table size should I start with? My first guess is to start with a table of 1000 (initial, final) pairs. Is this a good size to start with?
2 - Chain length - I really found no information online about what the length of a chain should be.
3 - Reduction function - If someone can give me any information about how I should go about building one.
Also, if anyone has any information or any example, it will be really helpful.
There is already a wealth of rainbow tables available online. Calculating rainbow tables simply moves the computation burden from when the attack is being run, to the pre-computation.
http://www.freerainbowtables.com/en/tables/
http://www.renderlab.net/projects/WPA-tables/
http://ophcrack.sourceforge.net/tables.php
http://www.codinghorror.com/blog/2007/09/rainbow-hash-cracking.html
It's a time-space tradeoff. The longer the chains are, the fewer of them you need, so the less space the table takes up, but the longer cracking each password will take.
So, the answer is always to build the biggest table you can in the space that you have available. This will determine your chain length and number of chains.
As for choosing the reduction function, it should be fast and behave pseudo-randomly. For your proposed plaintext set, you could just pick 20 bits from the hash and interpret them as a decimal number (choosing a different set of 20 bits at each step in the chain).
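A toy version for 6-digit numeric passwords might look like the following Python sketch. The choices here are assumptions for illustration only: SHA-256 as the hash, a 20-bit window whose position depends on the chain step, and a final `% 10**6` because 20 bits slightly overshoot the 6-digit range (which makes the lowest values a little more likely). All function names are hypothetical.

```python
import hashlib

DIGITS = 6                 # numeric passwords "000000".."999999"
SPACE = 10 ** DIGITS


def hash_pw(pw: str) -> bytes:
    return hashlib.sha256(pw.encode("ascii")).digest()


def reduce_hash(digest: bytes, step: int) -> str:
    """Map a digest back into the password space.

    Takes a 20-bit window whose position depends on the chain step, so
    each column of the table uses a different reduction.
    """
    value = int.from_bytes(digest, "big")
    offset = step % (len(digest) * 8 - 20)
    bits = (value >> offset) & 0xFFFFF
    return "%0*d" % (DIGITS, bits % SPACE)


def chain_end(start: str, length: int) -> str:
    pw = start
    for step in range(length):
        pw = reduce_hash(hash_pw(pw), step)
    return pw


def build_table(starts, length):
    """Store only end -> start; chain interiors are recomputed on demand."""
    return {chain_end(s, length): s for s in starts}


def lookup(table, target: bytes, length: int):
    """Return a password hashing to `target`, or None if no chain covers it."""
    for pos in range(length - 1, -1, -1):
        # Assume the target is the hash at column `pos`; walk to the chain end.
        pw = reduce_hash(target, pos)
        for step in range(pos + 1, length):
            pw = reduce_hash(hash_pw(pw), step)
        start = table.get(pw)
        if start is None:
            continue
        # Replay the chain from its start to find the actual preimage
        # (a matching end can also be a false alarm, so keep looking).
        candidate = start
        for step in range(length):
            if hash_pw(candidate) == target:
                return candidate
            candidate = reduce_hash(hash_pw(candidate), step)
    return None
```

Building the table parallelizes trivially, since each chain is independent of the others: hand each worker its own list of starting passwords and merge the resulting dictionaries.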

Cost of Preimage attack

I need to know the cost of succeeding with a Preimage attack ("In cryptography, a preimage attack on a cryptographic hash is an attempt to find a message that has a specific hash value.", Wikipedia).
The message I want to hash consists of six digits (the date of birth), then four random digits. This is a social security number.
Is there also a possibility to hash something using a specific password? This would introduce another layer of security, as one would have to know the password in order to produce the same hash values for a message.
I am thinking about using SHA-2.
If you want to know how expensive it is to find a preimage for the string you're describing, you need to figure out how many possible strings there are. Since the first 6 digits are a date of birth, their value is even more restricted than the naive assumption of 10^6 - we have an upper bound of 366*100 (every day of the year, plus the two digit year).
The remaining 4 'random' digits permit another 10^4 possibilities, giving a total number of distinct hashes of 366 * 100 * 10^4 = 366,000,000 hashes.
With that few possibilities, it ought to be possible to find a preimage in a fraction of a second on a modern computer - or, for that matter, to build a lookup table for every possible hash.
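As a rough illustration, here is a Python sketch of that brute-force search. It assumes the six date digits are laid out as DDMMYY and that the number is hashed with plain unsalted SHA-256; both are guesses, since the question states neither. The function names are hypothetical, and the `years` and `serials` parameters exist only so a small slice can be tried quickly.

```python
import hashlib
from datetime import date, timedelta


def day_month_codes():
    """All 366 DDMM combinations, taken from a leap year."""
    day = date(2000, 1, 1)
    codes = []
    while day.year == 2000:
        codes.append(day.strftime("%d%m"))
        day += timedelta(days=1)
    return codes


def candidates(years=range(100), serials=range(10 ** 4)):
    """Yield every DDMMYY + four-digit serial in the given ranges."""
    codes = day_month_codes()
    for yy in years:
        for ddmm in codes:
            for serial in serials:
                yield "%s%02d%04d" % (ddmm, yy, serial)


def find_preimage(target_hex, **ranges):
    """Return the number whose SHA-256 hex digest matches, else None."""
    for number in candidates(**ranges):
        if hashlib.sha256(number.encode("ascii")).hexdigest() == target_hex:
            return number
    return None


print(len(day_month_codes()) * 100 * 10 ** 4)  # 366000000
```

Calling `find_preimage(target_hex)` with no extra arguments walks the full range of 366,000,000 candidates.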
Using a salt, as Tom suggests, will make a lookup table impractical, but with such a restricted range of valid values, a brute force attack is still eminently practical, so it alone is not sufficient to make the attack impractical.
One way to make things more expensive is to use iterative hashing - that is, hash the hash, and hash that, repeatedly. You have to do a lot less hashing than your attacker does, so increases in cost affect them more than they do you. This is still likely to be only a stopgap given the small search space, however.
As far as "using a password" goes, it sounds like you're looking for an HMAC - a construction that uses a hash, but can only be verified if you have the key. If you can keep the key secret - no easy task if you're assuming the hashes can only be obtained if your system is compromised in the first place - this is a practical system.
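Both ideas are available in the Python standard library; the sketch below is one way to wire them up, not the only one. For the iterative hashing it uses PBKDF2-HMAC-SHA256, a standardized form of "hash it repeatedly" rather than a literal hash-of-a-hash loop, and the iteration count is an arbitrary example value. The function names are hypothetical.

```python
import hashlib
import hmac
import os


def stretched_hash(value: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Salted, iterated hash: PBKDF2-HMAC-SHA256 from the standard library."""
    return hashlib.pbkdf2_hmac("sha256", value.encode("utf-8"), salt, iterations)


def keyed_hash(value: str, key: bytes) -> bytes:
    """HMAC-SHA256: cannot be recomputed without the secret key."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()


def verify_keyed(value: str, key: bytes, expected: bytes) -> bool:
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(keyed_hash(value, key), expected)


salt = os.urandom(16)   # stored next to the hash
key = os.urandom(32)    # kept secret, ideally outside the database
```

With the key stored separately from the hashes, someone who obtains only the hashes cannot run the brute-force search at all.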
Edit: Okay, so 'fractions of a second' may have been a slight exaggeration, at least with my trivial Python test. It's still perfectly tractable to bruteforce on a single computer in a short timeframe, however.
SHA-2, salts, preimage attacks, brute forcing a restricted, 6-digit number - man, it would be awesome if we had a dial we could turn that would let us adjust the security. Something like this:
Time to compute a hash of an input:

SHA-2, salted                                  Better security!
     |                                                |
    \|/                                              \|/
|-----------------------------------------------------|
.01 seconds                                    3 seconds
If we could do this, your application, when verifying that the user entered data matches what you have hashed, would in fact be a few seconds slower.
But imagine being the attacker!
Awesome, he's hashing stuff using a salt, but there's only 366,000,000 possible hashes, I'm gonna blaze through this at 10,000 a second and finish in ~10 hours!
Wait, what's going on! I can only do 1 every 2.5 seconds?! This is going to take me 29 years!!
That would be awesome, wouldn't it?
Sure would.
I present unto you: scrypt and bcrypt. They give you that dial. Want to spend a whole minute hashing a password? They can do that. (Just make sure you remember the salt!)
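To show what turning the dial looks like, here is a small Python sketch using `hashlib.scrypt` from the standard library (it needs a Python built against OpenSSL 1.1 or newer; bcrypt would need a third-party package). The parameter values and the function names are illustrative assumptions, not recommendations.

```python
import hashlib
import hmac
import os
import time


def slow_hash(secret: str, salt: bytes, n: int = 2 ** 14) -> bytes:
    """scrypt with cost parameter n (a power of two); larger n = slower."""
    return hashlib.scrypt(secret.encode("utf-8"), salt=salt, n=n, r=8, p=1,
                          maxmem=128 * 1024 * 1024, dklen=32)


def verify(secret: str, salt: bytes, expected: bytes, n: int = 2 ** 14) -> bool:
    """Recompute with the stored salt and cost, then compare in constant time."""
    return hmac.compare_digest(slow_hash(secret, salt, n), expected)


if __name__ == "__main__":
    salt = os.urandom(16)
    # Each step up in n costs more time and memory per guess.
    for n in (2 ** 12, 2 ** 14, 2 ** 16):
        started = time.perf_counter()
        slow_hash("0101851234", salt, n)
        print("n=%d: %.3f s" % (n, time.perf_counter() - started))
```

The salt and the cost parameters have to be stored alongside the hash, since verification recomputes it with exactly the same settings.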
I'm unsure what your question is exactly, but to make your encrypted value more secure, use salt values.
Edit: I think you are sort of describing salt values in your question.

Storage algorithm question - verify sequential data with little memory

I found this on an "interview questions" site and have been pondering it for a couple of days. I will keep churning, but am interested what you guys think
"10 Gbytes of 32-bit numbers on a magnetic tape, all there from 0 to 10G in random order. You have 64 32 bit words of memory available: design an algorithm to check that each number from 0 to 10G occurs once and only once on the tape, with minimum passes of the tape by a read head connected to your algorithm."
32-bit numbers can take 4G = 2^32 different values. There are 2.5*2^32 numbers on the tape in total. So, by the pigeonhole principle, once you have read more than 2^32 numbers at least one value is guaranteed to have repeated. Only if there were <= 2^32 numbers on the tape would both cases be possible: all numbers different, or at least one repeated.
It's a trick question, as Michael Anderson and I have figured out. You can't store 10G 32b numbers on a 10G tape. The interviewer (a) is messing with you and (b) is trying to find out how much you think about a problem before you start solving it.
The utterly naive algorithm, which takes as many passes as there are numbers to check, would be to walk through and verify that the lowest number is there. Then do it again checking that the next lowest is there. And so on.
This requires one word of storage to keep track of where you are - you could cut down the number of passes by a factor of 64 by using all 64 words to keep track of where you're up to in several different locations in the search space - checking all of your current ones on each pass. Still O(n) passes, of course.
You could probably cut it down even more by using portions of the words - given that your search space for each segment is smaller, you won't need to keep track of the full 32-bit range.
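Taking "portions of the words" to the extreme gives one bit per value, so the 64 words act as a 2048-bit bitmap and each pass verifies a window of 2048 consecutive values. A Python sketch of that variant follows. Here `read_tape` is a hypothetical callable that returns a fresh iterator for each pass, and the sketch ignores the few extra words that the loop state itself would need.

```python
def check_permutation(read_tape, n, words=64, word_bits=32):
    """Check that the tape holds each of 0..n-1 exactly once.

    read_tape() is a hypothetical callable: each call starts a new pass
    and returns an iterator over the numbers. Returns (ok, passes_used).
    """
    window = words * word_bits               # values verified per pass
    passes = 0
    for low in range(0, n, window):
        high = min(low + window, n)
        bitmap = [0] * words                 # one bit per value in [low, high)
        passes += 1
        for value in read_tape():
            if not 0 <= value < n:
                return False, passes         # out-of-range number
            if low <= value < high:
                word, bit = divmod(value - low, word_bits)
                mask = 1 << bit
                if bitmap[word] & mask:
                    return False, passes     # duplicate
                bitmap[word] |= mask
        seen = sum(bin(w).count("1") for w in bitmap)
        if seen != high - low:
            return False, passes             # something in the window is missing
    return True, passes
```

That brings the pass count down to about n / 2048, which is still O(n) passes but with a much smaller constant.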
Perform an in-place mergesort or quicksort, using tape for storage? Then iterate through the numbers in sequence, tracking to see that each number = previous+1.
Requires cleverly implemented sort, and is fairly slow, but achieves the goal I believe.
Edit: oh bugger, it's never specified you can write.
Here's a second approach: scan through trying to build up to 30-ish ranges of contiguous numbers, i.e. 1,2,3,4,5 would be one range, 8,9,10,11,12 would be another, etc. If ranges overlap with existing ones, then they are merged. I think you only need to make a limited number of passes to either get the complete range or prove there are gaps... much fewer than just scanning through in blocks of a couple thousand to see if all numbers are present.
It'll take me a bit to prove or disprove the limits for this though.
Do 2 reduces on the numbers, a sum and a bitwise XOR.
The sum should be (10G + 1) * 10G / 2
The XOR should be ... something
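For reference, the XOR of 0..n has a closed form that depends only on n mod 4, so both reductions can be checked in a single pass. The Python sketch below uses made-up function names and takes the range as 0..n inclusive, matching the sum formula above. Note that passing both checks is necessary but not sufficient.

```python
def xor_upto(n):
    """XOR of all integers 0..n, using the pattern that repeats every 4."""
    return (n, 1, n + 1, 0)[n % 4]


def reductions_match(numbers, n):
    """One pass, three values of state: count, sum and XOR against 0..n."""
    count = total = acc = 0
    for value in numbers:
        count += 1
        total += value
        acc ^= value
    return (count == n + 1
            and total == n * (n + 1) // 2
            and acc == xor_upto(n))


print(reductions_match([3, 1, 0, 2], 3))   # True
print(reductions_match([0, 1, 1, 3], 3))   # False
print(reductions_match([0, 0, 3, 3], 3))   # True, yet not a permutation
```

The last line is the catch: [0, 0, 3, 3] has the same count, sum and XOR as [0, 1, 2, 3], so this test can reject a bad tape but cannot prove a good one. The running sum also needs more than one 32-bit word at this scale.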
It looks like there is a catch in the question that no one has talked about so far: the interviewer has only asked the interviewee to write a program that CHECKS
(i) whether each number that makes up the 10G is present once and only once. What should the interviewee do if numbers in the given list are present multiple times? Should he assume that he should stop executing the program and throw an exception, or should he assume that he should correct the mistake by removing the repeated number and replacing it with another (this may actually be a costly exercise, as it involves a complete reshuffle of the number set)? Correcting this is required to perform the second step in the question, i.e. to verify that the data is stored in the best possible way, so that it requires the fewest possible passes.
(ii) When the interviewee was asked only to check whether the 10G data set of numbers is stored in such a way that it requires the fewest passes to access any of those numbers,
what should the interviewee do? Should he stop and throw an exception the moment he finds an issue in the way they were stored, or correct the mistake and continue until all the elements are sorted in the order of fewest possible passes?
If the intention of the interviewer is to ask the interviewee to write an algorithm that finds the best combination of numbers that can be stored in 10GB, given 64 32-bit registers, and also to write an algorithm to save this chosen set of numbers in the best possible way that requires the fewest passes to access each, he should have asked this directly, shouldn't he?
I suppose the intention of the interviewer may be only to see how the interviewee approaches the problem rather than to actually extract a working solution from the interviewee; would anyone buy this notion?
Regards,
Samba