How many combinations does SHA-256 have?

By using an online tool and Wikipedia I found out that every SHA-256 hash string is 64 chars long, containing numbers and letters. Hence I assumed that there are 34^36 combinations (2^216, simplified by an algebra calculator).
After doing some research I found that most people say there are 2^256 combinations. Could someone explain? To make the context clear: I am writing a paper about cryptocurrencies and trying to explain how many different combinations there are, how long guessing one could take (i.e. how many guesses it could take), and compare this to the total number of atoms in the universe (roughly 10^85).

SHA-256 produces 256 bits, which is 32 bytes, not characters; each byte has 256 possible values.
There are 256 bits and each bit has 2 values (0 or 1), thus 2^256.
There are 32 bytes and each byte has 256 values, thus 256^32.
Note: 2^256 == 256^32 ~= 10^77.
The 32 bytes can be encoded in many ways: in hexadecimal they are 64 characters, in Base64 they are 44 characters.
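
If you want to verify these sizes yourself, here is a quick Python sketch (hashlib and base64 are standard-library modules; the input "hello" is just an arbitrary example):

import hashlib, base64

digest = hashlib.sha256(b"hello").digest()    # 32 raw bytes
print(len(digest))                            # 32 bytes
print(len(digest.hex()))                      # 64 hex characters
print(len(base64.b64encode(digest)))          # 44 Base64 characters (including padding)
print(2**256 == 256**32)                      # True: the number of possible digests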

The total number of SHA-256 combinations is
115,792,089,237,316,195,423,570,985,008,687,907,853,269,984,665,640,564,039,457,584,007,913,129,639,936

A SHA-256 hash has 64 characters, which is 32 byte values, because each byte takes 2 hex characters.
3a 7b d3 e2 36 0a 3d 29 ee a4 36 fc fb 7e 44 c7 35 d1 17 c4 2d 1c 18 35 42 0b 6b 99 42 dd 4f 1b
Above is a hash with the byte values separated so you can count 32 of them.
There are 16 characters available in hex (0-9 and a-f), so a pair of hex characters gives 16^2 = 256 combinations.
With 32 byte slots in a SHA-256 hash, you compute 256^32 to get:
115792089237316195423570985008687907853269984665640564039457584007913129639936
available SHA-256 hashes.
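The same counting argument as a short Python sketch, using the example hash from above joined into one string:

h = "3a7bd3e2360a3d29eea436fcfb7e44c735d117c42d1c1835420b6b9942dd4f1b"
pairs = [h[i:i+2] for i in range(0, len(h), 2)]   # split into byte values
print(len(pairs))                                 # 32 slots
print(256**32 == 2**256)                          # True: same count either way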

How hexadecimal representation matches word size?

What does the statement "hexadecimal matches cleanly with modulo 8 word sizes, such as 8, 16, 32, and 64 bits" mean?
Since a single hex digit can represent exactly 4 bits of binary data, any word size that's a multiple of 4 can be exactly represented with a fixed number of hex digits.
And every word size that's a multiple of 8 (i.e. the common ones) can be represented with a number of digits that's a multiple of 2:
8 bits can store values from 00 to FF
16 bits can store values from 0000 to FFFF
32 bits can store values from 00000000 to FFFFFFFF
...
All 2-digit hex numbers can be represented in 8 bits, and all 8-bit values can be represented in 2 hex digits. If a hex editor displays some value as CA FE BA BE you can easily grasp that it's 4 bytes and thus 32 bits. Getting that information from the decimal 3405691582 is not quite as trivial (no matter how you group the digits, there are no nice "byte boundaries" in that representation).
If you compare this with decimal, the same isn't true. For example, 8 bits can represent values from 0 to 255 (decimal). So you need up to 3 digits in decimal to represent 8-bit values. But there are 3-digit decimal values that you can't represent in 8 bits: 256 (or anything higher than that) doesn't map onto 8 bits. So the mapping isn't perfect for decimal numbers.
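A small Python illustration of this clean mapping, using the CA FE BA BE example from above:

value = 0xCAFEBABE
print(hex(value))                       # 0xcafebabe -- 8 hex digits, so exactly 4 bytes / 32 bits
print(value.to_bytes(4, "big").hex())   # cafebabe   -- each byte is exactly two hex digits
print(value)                            # 3405691582 -- the decimal form hides the byte boundaries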

How do I run md5() on a bigint in Presto?

select md5(15)
returns
Query failed (#20160818_193909_00287_8zejd): line 1:8:
Unexpected parameters (bigint) for function md5. Expected: md5(varbinary)
How do I hash 15 and get back a string? I'd like to select 1 in 16 items at random, e.g. where md5(id) like '%3'.
FYI, I might be on version 0.147; I don't know how to tell.
FYI, I found this PR. md5 would be cross-platform, which is nice, but I'd also take a Presto-specific hash function that spreads ids relatively uniformly. I suppose I could implement my own linear formula, but that seems awkward.
The best thing I could come up with was to cast the integer to a varchar, turn that into a varbinary via to_utf8, and then apply md5 to the varbinary:
presto> select md5(to_utf8(cast(15 as varchar)));
_col0
-------------------------------------------------
9b f3 1c 7f f0 62 93 6a 96 d3 c8 bd 1f 8f 2f f3
(1 row)
If this is not the result you get, you can always turn it into a hex string manually:
presto> select to_hex(md5(to_utf8(cast(15 as varchar))));
_col0
----------------------------------
9BF31C7FF062936A96D3C8BD1F8F2FF3
(1 row)
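As a sanity check outside Presto, the same digest can be reproduced in Python, since md5(to_utf8(cast(15 as varchar))) should simply be MD5 over the UTF-8 bytes of the string "15":

import hashlib

digest = hashlib.md5(b"15").hexdigest()
print(digest)                # 9bf31c7ff062936a96d3c8bd1f8f2ff3, matching the Presto output above
print(digest.endswith("3"))  # True -- the "1 in 16" sampling keeps ids whose hash ends in '3'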

wrong output when decoding base64 string

I always seem to get incorrect output when decoding this Base64 string in VB.NET (I think it's Base64? It really looks like it).
I'm using the FromBase64String function, and I did it like this:
Dim b64str = "0DDQQL3uAikQBgAAc4cqK4WnSQBg4SAgExEAAF3BAmAILYojRgkBhUrBAgEDRw=="
Dim i As String = System.Text.Encoding.Unicode.GetString(Convert.FromBase64String(b64str))
MsgBox(i)
but I always get this output:
バ䃐⤂ؐ
That doesn't seem right.
0DDQQL3uAikQBgAAc4cqK4WnSQBg4SAgExEAAF3BAmAILYojRgkBhUrBAgEDRw==
It looks like Base64: the length is a correct size, the characters belong to the Base64 character set, and the trailing "==" is reasonable. Of course, it might not actually be a Base64 encoding.
Base64 decoding results in:
D0 30 D0 40 BD EE 02 29 10 06 00 00 73 87 2A 2B 85 A7 49 00 60 E1 20 20 13 11 00 00 5D C1 02 60 08 2D 8A 23 46 09 01 85 4A C1 02 01 03 47
Now the problem: this is not a character string, it is an array of 8-bit bytes, so it cannot simply be displayed as characters. The 0x00 bytes will signal end-of-string to the print method, and the non-representable bytes may be ignored, shown as replacement characters, or combined so that several bytes display as one multi-byte Unicode character. The only guaranteed and usual display is in hexadecimal, as above.
That string could be virtually anything. It might be the result of an encryption algorithm or a hash such as SHA-*. Your mistake is assuming it must be Base64 just because it might be.
It was a valid observation that it might be Base64, so running that function was perfectly reasonable, but it is up to you to determine, from the results and from logic particular to your data (which was not described in the question), whether it really is Base64 or something else.
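To see this for yourself, here is a minimal Python sketch that decodes the string and prints the raw bytes in hex (it should reproduce the dump above):

import base64

b64 = "0DDQQL3uAikQBgAAc4cqK4WnSQBg4SAgExEAAF3BAmAILYojRgkBhUrBAgEDRw=="
raw = base64.b64decode(b64)
print(len(raw))    # 46 bytes
print(raw.hex())   # d030d040bdee0229... -- bytes, not printable text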

What is wrong with this LDAP filter packet?

I am trying to port a program which queries an LDAP server from Perl to Go, and with the Go version I am receiving a response that the filter is malformed:
00000057: LdapErr: DSID-0C0C0968, comment: The server was unable to decode a search request filter, data 0, v1db1\x00
I have used tcpdump to capture the data transmitted to the server with both the Perl and Go versions of my program, and have found that they are sending slightly different filter packets. This question is not about any possible bugs in the Go program, but simply about understanding the contents of the LDAP filter packets.
The encoded filter is:
(objectClass=*)
And the Perl-generated packet (which the server likes) looks like this:
ASCII . . o b j e c t C l a s s
Hex 87 0b 6f 62 6a 65 63 74 43 6c 61 73 73
Byte# 0 1 2 3 4 5 6 7 8 9 10 11 12
The Go-generated packet (which the server doesn't like) looks like this:
ASCII . . . . o b j e c t C l a s s
Hex a7 0d 04 0b 6f 62 6a 65 63 74 43 6c 61 73 73
Byte# 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
This is my own breakdown of the packets:
Byte 0: Tag
When I dissect Byte 0 from both packets, I see they are identical, except for the Primitive/Constructed bit, which is set to Primitive in the Perl version, and Constructed in the Go version. See DER encoding for details.
Bit# 87 6 54321
Perl 10 0 00111
Go 10 1 00111
Bits 87: In both packets, 10 = Context Specific
Bit 6: In the Perl version 0 = Primitive, in the Go version 1 = Constructed
Bits 54321: 00111 = 7 = Object descriptor
Byte 1: Length
11 bytes for the Perl version, 13 for the Go version
Bytes 2-3 for the Go version
Byte 2: Tag 04: Substring Filter (See section 4.5.1 of RFC 4511)
Byte 3: Length of 11 bytes
Remainder: Payload
For both packets this is simply the ASCII text objectClass
My reading of RFC 4511 section 4.5.1 suggests that the Go version is "more" correct, yet the Perl version is the one that works with the server. What gives?
Wireshark is able to parse both packets, and interprets them both equally.
The Perl version is correct, and the Go version is incorrect.
As you point out, RFC 4511 section 4.5.1 specifies encoding for the filter elements, like:
Filter ::= CHOICE {
     and             [0] SET SIZE (1..MAX) OF filter Filter,
     or              [1] SET SIZE (1..MAX) OF filter Filter,
     not             [2] Filter,
     equalityMatch   [3] AttributeValueAssertion,
     substrings      [4] SubstringFilter,
     greaterOrEqual  [5] AttributeValueAssertion,
     lessOrEqual     [6] AttributeValueAssertion,
     present         [7] AttributeDescription,
     approxMatch     [8] AttributeValueAssertion,
     extensibleMatch [9] MatchingRuleAssertion,
     ... }
And in this case, the relevant portion is:
present [7] AttributeDescription,
The AttributeDescription element is defined in section 4.1.4 of the same specification:
AttributeDescription ::= LDAPString
-- Constrained to <attributedescription>
-- [RFC4512]
And from section 4.1.2:
LDAPString ::= OCTET STRING -- UTF-8 encoded,
-- [ISO10646] characters
So this means that the present filter component is an octet string, which is a primitive element. Go is incorrectly converting it to a constructed element, and the directory server is correctly rejecting that malformed request.
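For illustration, here is a minimal Python sketch that builds both encodings by hand from the bytes shown in the question (no LDAP or ASN.1 library is assumed):

attr = b"objectClass"
# Correct "present" filter: context-specific, primitive, tag 7 -> 0x80 | 0x07 = 0x87
perl_style = bytes([0x87, len(attr)]) + attr
# What the broken request contains: context-specific, constructed, tag 7 (0xA7)
# wrapping a universal OCTET STRING (0x04)
go_style = bytes([0xA7, len(attr) + 2, 0x04, len(attr)]) + attr
print(perl_style.hex())   # 870b6f626a656374436c617373
print(go_style.hex())     # a70d040b6f626a656374436c617373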

BER Encoding of a "Choice"

I am trying to parse an LDAP bind request using the Apache Harmony ASN.1/BER classes (could use another library, I just chose that as it has an Apache License).
My question is on the encoding specifically of a "CHOICE" in ASN.1. The RFC that defines the LDAP ASN.1 schema (http://www.rfc-editor.org/rfc/rfc2251.txt) gives the following as part a bind request:
BindRequest ::= [APPLICATION 0] SEQUENCE {
     version        INTEGER (1 .. 127),
     name           LDAPDN,
     authentication AuthenticationChoice }

AuthenticationChoice ::= CHOICE {
     simple [0] OCTET STRING,
            -- 1 and 2 reserved
     sasl   [3] SaslCredentials }

SaslCredentials ::= SEQUENCE {
     mechanism   LDAPString,
     credentials OCTET STRING OPTIONAL }
How is that CHOICE there actually encoded?
I generated a sample bind request using JXplorer and captured the raw data that was sent. It looks like this:
00000000 30 31 02 01 01 60 2c 02 01 03 04 1b 75 69 64 3d |01...`,.....uid=|
00000010 74 65 73 74 75 73 65 72 2c 64 63 3d 74 65 73 74 |testuser,dc=test|
00000020 2c 64 63 3d 63 6f 6d 80 0a 74 65 73 74 69 6e 67 |,dc=com..testing|
00000030 31 32 33 |123|
The 80 there (at offset 0x27) seems to represent that choice. Fair enough - and I get that (per http://en.wikipedia.org/wiki/Basic_Encoding_Rules#BER_encoding) the last bit is set in order to indicate that it's "context specific" (i.e. defined by this application/protocol) But how would I know if this is a "simple" or "sasl" auth? What indicates which option of the choice is being used? In this case it looks like the next byte (0x0a) is the length of the string - so this could be an OctetString or something of the sort - but I don't see anything here that indicates what the actual is other than 0x80...
I'm also not sure what the [0] and [3] mean in the CHOICE section above. Is that saying there are four options but only options numbered 0 and 3 are in use?
Below you can see the output of the openssl asn1parse command. The CHOICE members are encoded using so-called context-specific tags, which means the normal tag value is replaced with the one specified in the ASN.1 definition for the respective item in the CHOICE. The tag has value 0, which indicates that the first item in the CHOICE is selected. The first choice item is of type OCTET STRING, so the value 0 of the context-specific tag tells you the value's type; if there were no context tag, the normal OCTET STRING tag would be used.
0:d=0 hl=2 l= 49 cons: SEQUENCE
2:d=1 hl=2 l= 1 prim: INTEGER :01
5:d=1 hl=2 l= 44 cons: appl [ 0 ]
7:d=2 hl=2 l= 1 prim: INTEGER :03
10:d=2 hl=2 l= 27 prim: OCTET STRING :uid=testuser,dc=test,dc=com
39:d=2 hl=2 l= 10 prim: cont [ 0 ]
The '80'H in the encoded message above is called the "identifier octets" (in general it may be more than one octet). This value of the identifier octet(s) indicates that the selected alternative of the CHOICE is "simple", because the five low-order bits of '80'H are '00000'B, which matches the tag number of the tag of "simple" ([0]).
If the sender had selected the "sasl" alternative, the identifier octet would be 'A3'H instead of '80'H. The '3'H in 'A3'H (the five low-order bits) is the tag number of the tag of "sasl" ([3]). The two highest-order bits of the identifier octet are set to '10'B for both alternatives because both [0] and [3] are "context-specific" tags (this just means that these tags don't contain the APPLICATION keyword or the PRIVATE keyword). The next bit of the identifier octet (the "constructed" bit) is set to '0' for "simple" but is set to '1' for "sasl", because the encoding of "sasl" contains nested tags whereas the encoding of "simple" does not contain any nested tags.
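
A small Python helper that does the same bit arithmetic on an identifier octet (just an illustration for tag numbers below 31, not a full BER parser):

def describe_identifier(octet):
    tag_class = (octet >> 6) & 0b11     # 0=universal, 1=application, 2=context-specific, 3=private
    constructed = (octet >> 5) & 0b1    # 0=primitive, 1=constructed
    tag_number = octet & 0b11111        # low five bits
    return tag_class, constructed, tag_number

print(describe_identifier(0x80))   # (2, 0, 0) -> context-specific, primitive, tag [0] = "simple"
print(describe_identifier(0xA3))   # (2, 1, 3) -> context-specific, constructed, tag [3] = "sasl"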