Convert "emailAddress=<email-address>" found in Subject field of X.509 SSL certificate to hexadecimal - ssl

I have the 'Subject' of an SSL X.509 certificate given as
Subject: C=XX, ST=XX, L=XX, O=XX, OU=XX, emailAddress=admin@adobe.pw, CN=trustasia.asia
and I want to convert this to the binary stream found in the SSL certificate when it is sent on the wire. I know the Subject field is defined in RFC 5280 in ASN.1 notation, and that the DER encoding rules given in X.690 are used to convert this field to its binary representation. With these two documents, and with a little help from code (which gave hexadecimal representations of OIDs such as id-at-countryName:2.5.4.6:{0x55, 0x04, 0x06}), I was able to convert all the RDNs (RelativeDistinguishedNames) to their binary representation, but I am stuck with the emailAddress field.
I found its OID, 1.2.840.113549.1.9.1, but don't know its hexadecimal representation.
Can you please guide me on how to convert this to its binary representation?

I suspect that you are talking about OID encoding using ASN.1 Distinguished Encoding Rules (DER). I would suggest checking this article to get detailed information about OBJECT IDENTIFIER encoding rules: OBJECT IDENTIFIER
Converting the OID string value to ASN.1 DER results in:
06 09 2A 86 48 86 F7 0D 01 09 01
where 0x06 is the OBJECT IDENTIFIER tag identifier, 0x09 is the encoded OID value length in bytes, and the remaining bytes (2A 86 48 86 F7 0D 01 09 01) represent the OID's binary form.
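The encoding rules above can be sketched in a few lines of Python: the first two arcs share a single byte (40·X + Y), and each later arc is written base-128, big-endian, with the high bit set on every octet except the last. This is a minimal sketch that handles short-form lengths only (OID bodies under 128 bytes):

```python
def encode_oid(oid: str) -> bytes:
    """DER-encode a dotted-decimal OID as an OBJECT IDENTIFIER TLV."""
    arcs = [int(a) for a in oid.split(".")]
    body = bytearray([40 * arcs[0] + arcs[1]])  # first two arcs share one byte
    for arc in arcs[2:]:
        chunk = [arc & 0x7F]                    # base-128, low 7 bits per octet
        arc >>= 7
        while arc:
            chunk.append((arc & 0x7F) | 0x80)   # high bit marks "more octets follow"
            arc >>= 7
        body.extend(reversed(chunk))
    return bytes([0x06, len(body)]) + bytes(body)  # tag 0x06, length, value

print(encode_oid("1.2.840.113549.1.9.1").hex(" "))
# 06 09 2a 86 48 86 f7 0d 01 09 01
```

It also reproduces the countryName example from the question: `encode_oid("2.5.4.6")` gives `06 03 55 04 06`.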

emailAddress is of type IA5String, so it appears in the certificate in the same form as shown in the subject line: 'admin@adobe.pw'.
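Putting the pieces together, the emailAddress AttributeTypeAndValue is the OID followed by the IA5String, wrapped in a DER SEQUENCE; in the full Name it would additionally sit inside a SET for the RDN, which this sketch omits. Short-form lengths are assumed throughout:

```python
def der_email_atv(email: str) -> bytes:
    # AttributeTypeAndValue for emailAddress; short-form lengths only,
    # and the RDN's outer SET is omitted for brevity
    oid = bytes.fromhex("06092a864886f70d010901")  # 1.2.840.113549.1.9.1
    value = email.encode("ascii")
    ia5 = bytes([0x16, len(value)]) + value         # IA5String tag is 0x16
    body = oid + ia5
    return bytes([0x30, len(body)]) + body          # SEQUENCE tag is 0x30

print(der_email_atv("admin@adobe.pw").hex(" "))
```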

Related

Why do DKIM public keys always end with 'IDAQAB'?

I have noticed that all the DKIM public keys generated always end with the string 'IDAQAB'.
Is there a reason for it, or are there cases where DKIM public keys will not always end with the same string?
DKIM public keys are encoded in the binary DER format and shared as Base64 in the DNS. RSA public keys consist of a modulus and an exponent. The exponent is typically 65537, which is 01 00 01 in hexadecimal. DER prefixes this value with 02 for the integer type and 03 for the length of the exponent in bytes. The Base64 encoding of 02 03 01 00 01 is IDAQAB (when these bytes land at the right offset in the Base64 stream).
Before the modulus, which is unique for each RSA public key, there are nested length prefixes and an object identifier. This information is identical for RSA keys of the same length, which is why you find many DKIM public keys which also share the same prefix, such as MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA.
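The alignment argument can be checked directly: any DER blob whose length is a multiple of 3 and which ends with the exponent bytes 02 03 01 00 01 Base64-encodes to something ending in IDAQAB. The 0x22 below is an arbitrary stand-in for the last byte of a modulus:

```python
import base64

tail = b"\x22\x02\x03\x01\x00\x01"  # stand-in modulus byte + DER INTEGER 65537
encoded = base64.b64encode(tail).decode()
print(encoded)  # IgIDAQAB
```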

How to track down Invalid utf8 character string

Running a search in PHPMyAdmin for an ip address to unblock from a WordPress plug in, I get this on one of the tables:
Warning: #1300 Invalid utf8 character string: '\x8B\x08\x00\x00\x00\x00\x00\x00\x03\x14\xD6y8\x15\xEF\x17\x0...'
Warning: #1300 Invalid utf8 character string: '\x8B\x08\x00\x00\x00\x00\x00\x00\x03\x00\x1E\x80\xE1\x7Fa:2:{...'
I tried to search for part of the strings, but cannot find where they are in the db.
These look suspicious to me, I've had some SQL injection compromises in the past and I'm fearing that's what it may indicate.
How do I track down where these strings actually are in the db if I cannot find by the PHPMyAdmin search?
Thank you.
Those look like gzip headers which are missing their leading \x1f. I expect it's there but not part of the warning because \x1f is a valid UTF-8 character but \x8b is not.
1F  2-byte magic number of a gzip file
8B  |
08  compression method (08 = deflate)
00  1-byte header flags (00 = no optional flags set)
00  4-byte timestamp
00  |
00  |
00  |
00  extra flags
03  operating system (03 = Unix)
After that, data begins.
Something is trying to read gzipped text as UTF-8.
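A quick way to confirm the diagnosis is to build a gzip blob and look at its first bytes; the payload below is a hypothetical stand-in for whatever the plugin actually serialized:

```python
import gzip

payload = b'a:2:{i:0;s:4:"data";}'  # hypothetical PHP-serialized value
blob = gzip.compress(payload)

assert blob[:2] == b"\x1f\x8b"  # the 2-byte magic number from the table above
assert blob[2] == 0x08          # compression method: deflate

# a UTF-8 reader chokes on 0x8b; decompress first, then treat the result as text
assert gzip.decompress(blob) == payload
```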

Converting control characters to XML?

The JPEG-2000 standard says that the file type box contains a series of compatibility list fields. It says this about each field:
This field is encoded as a four-byte string of ISO/IEC 646 characters.
Here is a compatibility list field from one JPEG-2000 file (hex bytes):
00 00 00 B5
How is that four-byte encoding of a string represented in XML, given that hex 00 (nul) is not a valid character in XML?

No padding for AES cipher in Java Card

In JavaCard 2.2.2 API, I can see that some symmetric ciphers are implemented with a padding mode, for example:
Cipher algorithm ALG_DES_CBC_ISO9797_M1 provides a cipher using DES in
CBC mode or triple DES in outer CBC mode, and pads input data
according to the ISO 9797 method 1 scheme.
But for the AES cipher, no padding mode is available (only ALG_AES_BLOCK_128_ECB_NOPAD and ALG_AES_BLOCK_128_CBC_NOPAD).
So how can it be explained that padding isn't supported for this algorithm?
Are these padding methods vulnerable to known attacks when used with AES?
Whether other padding modes are available depends on the Java Card API version you are using, as well as on the implementation details of the specific Java Card.
Later APIs have:
a new getInstance method which can be used with PAD_PKCS5;
additional constants such as ALG_AES_CBC_PKCS5.
The special getInstance method was added because of the explosion of modes and padding methods.
Older API implementations may indeed not have these methods, but please check availability again.
AES itself is a block cipher. The different modes of operation such as CBC use a block cipher and a padding method - so CBC_AES_PKCS7PADDING would be more logical in some sense. As a block cipher, AES itself is therefore not vulnerable to padding oracle attacks.
CBC, on the other hand, is vulnerable to padding oracle attacks - and other plaintext oracle attacks. So you should protect your IV and ciphertext with e.g. an AES-CMAC authentication tag if you need protection against these attacks.
That is, however, not the reason why the padding modes were not included initially. The different padding modes are certainly present now.
Not necessarily - it only means that this algorithm does not automatically pad input data. You have to do it yourself (probably pad to multiples of 16 bytes, because this is what AES needs).
So how can it be explained that padding isn't supported for this algorithm?
I don't know for sure, but note that there are several ways of doing this, and maybe the author decided that you should choose the padding style most suitable for you.
In case you want to know more about padding, consider this example:
You have to encrypt a word "overflow" with AES.
First, you have to convert it to byte form, because this is what AES operates on.
The ASCII-encoded string "overflow" is
"6F 76 65 72 66 6C 6F 77 00"
(the last byte is the string terminator, a.k.a. \0 or the null byte).
Unfortunately, this is still insufficient for the pure AES algorithm, because it can ONLY operate on whole blocks of data - 16-byte blocks, for example.
This means you need 16-9=7 more bytes of data. So you pad your encoded string to a full 16 bytes, for example with null bytes. The result is
"6F 76 65 72 66 6C 6F 77 00 00 00 00 00 00 00 00"
Now you choose your encryption key and encrypt data.
After you decrypt your data, you receive again
"6F 76 65 72 66 6C 6F 77 00 00 00 00 00 00 00 00"
And now the crux of the matter: how do you know which bytes were originally in your string, and which are padding bytes?
In the case of en/decrypting strings this is very simple, because a string (almost) always ends with a null byte and never has multiple consecutive null bytes at the end. So it's easy to determine where to cut your data.
More information about styles of "crypto-padding" can be found here: https://en.wikipedia.org/wiki/Padding_%28cryptography%29#Byte_padding
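The null-byte scheme in the example above (essentially ISO 9797 method 1) can be sketched as follows; this is what you would do by hand before handing data to one of the NOPAD ciphers:

```python
BLOCK = 16  # AES block size in bytes

def zero_pad(data: bytes) -> bytes:
    # pad with 0x00 up to the next multiple of BLOCK (no-op if already aligned)
    return data + b"\x00" * (-len(data) % BLOCK)

padded = zero_pad(b"overflow\x00")  # 9 bytes incl. the C-style terminator
assert len(padded) == 16

# stripping the padding works here only because the plaintext is a
# NUL-terminated string, as the answer notes
assert padded.split(b"\x00", 1)[0] == b"overflow"
```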

Base64url encoded representation puzzle

I'm writing a cookie authentication library that replicates that of an existing system. I'm able to create authentication tokens that work. However testing with a token with known value, created by the existing system, I encountered the following puzzle.
The original encoded string purports to be base64url encoded. And, in fact, using any of several base64url code modules and online tools, the decoded value is the expected result.
However base64url encoding the decoded value (again using any of several tools) doesn't reproduce the original string. Both encoded strings decode to the expected results, so apparently both representations are valid.
How? What's the difference?
How can I replicate the original encoded results?
original encoded string: YWRtaW46NTVGRDZDRUE6vtRbQoEXD9O6R4MYd8ro2o6Rzrc
my base64url decode: admin:55FD6CEA:[encrypted hash]
Encoding doesn't match original but the decoded strings match.
my base64url encode: YWRtaW46NTVGRDZDRUE677-977-9W0Lvv70XD9O6R--_vRh377-977-92o7vv73Otw
my base64url decode: admin:55FD6CEA:[encrypted hash]
(Sorry, SSE won't let me show the unicode representation of the hash. I assure you, they do match.)
This string:
YWRtaW46NTVGRDZDRUE6vtRbQoEXD9O6R4MYd8ro2o6Rzrc
is not exactly valid Base64. Valid Base64 consists of a sequence of characters among uppercase letters, lowercase letters, digits, '/' and '+'; it must also have a length which is a multiple of 4; 1 or 2 final '=' signs may appear as padding so that the length is indeed a multiple of 4. This string contains only Base64-valid characters, but only 47 of them, and 47 is not a multiple of 4. With an extra '=' sign at the end, this becomes valid Base64.
That string:
YWRtaW46NTVGRDZDRUE677-977-9W0Lvv70XD9O6R--_vRh377-977-92o7vv73Otw
is not valid Base64. It contains several '-' and one '_' sign, neither of which should appear in a Base64 string. If some tool is decoding that string into the "same" result as the previous string, then the tool is not implementing Base64 at all, but something else (and weird).
I suppose that your strings got garbled at some point through some copy&paste mishap, maybe related to a bad interpretation of bytes as characters. This is the important point: bytes are NOT characters.
It so happens that, traditionally, computers got into the habit of using so-called "code pages", which were direct mappings of characters onto bytes, with each character being encoded as exactly one byte. Thus came into existence some tools (such as Windows' notepad.exe) that purport to do the inverse, i.e. show the contents of a file (nominally, some bytes) as their character counterparts. This, however, fails when the bytes are not "printable characters" (while a code page such as "Windows-1252" maps each character to a byte value, there can be byte values that are not the mapping of any printable character). This began to fail even more when people finally realized that there are only 256 possible byte values, and a lot more possible characters, especially when considering Chinese.
Unicode is an evolving standard that maps characters to code points (i.e. numbers), with a bit more than 100000 currently defined. Then some encoding rules (there are several of them, the most frequent being UTF-8) encode the characters into bytes. Crucially, one character can be encoded over several bytes.
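A quick illustration of the multi-byte point, using two sample characters:

```python
# one character may encode to several bytes in UTF-8
assert "é".encode("utf-8") == b"\xc3\xa9"       # 1 character, 2 bytes
assert "中".encode("utf-8") == b"\xe4\xb8\xad"   # 1 character, 3 bytes
assert len("中") == 1 and len("中".encode("utf-8")) == 3  # bytes are NOT characters
```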
In any case, a hash value (or whatever you call an "encrypted hash", which is probably a confusion, because hashing and encrypting are two distinct things) is a sequence of bytes, not characters, and thus is never guaranteed to be the encoding of a sequence of characters in any code page.
Armed with this knowledge, you may try to put some order into your strings and your question.
Edit: thanks to @marfarma for pointing out the URL-safe Base64 encoding where the '+' and '/' characters are replaced by '-' and '_'. This makes the situation clearer. When adding the needed '=' signs, the first string then decodes to:
00000000 61 64 6d 69 6e 3a 35 35 46 44 36 43 45 41 3a be |admin:55FD6CEA:.|
00000010 d4 5b 42 81 17 0f d3 ba 47 83 18 77 ca e8 da 8e |.[B.....G..w....|
00000020 91 ce b7 |...|
while the second becomes:
00000000 61 64 6d 69 6e 3a 35 35 46 44 36 43 45 41 3a ef |admin:55FD6CEA:.|
00000010 bf bd ef bf bd 5b 42 ef bf bd 17 0f d3 ba 47 ef |.....[B.......G.|
00000020 bf bd 18 77 ef bf bd ef bf bd da 8e ef bf bd ce |...w............|
00000030 b7 |.|
We now see what happened: the first string was decoded to bytes, but someone fed these bytes to some display system or editor that expected UTF-8. Some of these bytes were not the valid UTF-8 encoding of anything, so they were replaced with the Unicode code point U+FFFD REPLACEMENT CHARACTER. The characters were then reencoded as UTF-8, each U+FFFD yielding the EF BF BD sequence of three bytes.
Therefore, the hash value was badly mangled, but every invalid byte was replaced with the same placeholder character, so the two strings decode to text that looks identical on the screen. Hence no visible difference.
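The whole round trip can be reproduced in a few lines: decode the original base64url string (restoring the stripped '=' padding), force the bytes through a lossy UTF-8 decode with replacement, and re-encode. The result is exactly the second string from the question:

```python
import base64

def b64url_decode(s: str) -> bytes:
    # restore the '=' padding that base64url representations commonly strip
    return base64.urlsafe_b64decode(s + "=" * (-len(s) % 4))

original = "YWRtaW46NTVGRDZDRUE6vtRbQoEXD9O6R4MYd8ro2o6Rzrc"
raw = b64url_decode(original)                      # the true token bytes

# simulate the mishap: invalid UTF-8 bytes become U+FFFD, then re-encode
mangled = raw.decode("utf-8", errors="replace").encode("utf-8")
remade = base64.urlsafe_b64encode(mangled).rstrip(b"=").decode()

print(remade)
# YWRtaW46NTVGRDZDRUE677-977-9W0Lvv70XD9O6R--_vRh377-977-92o7vv73Otw
```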