BER Encoding of a "Choice" - ldap

I am trying to parse an LDAP bind request using the Apache Harmony ASN.1/BER classes (could use another library, I just chose that as it has an Apache License).
My question is on the encoding specifically of a "CHOICE" in ASN.1. The RFC that defines the LDAP ASN.1 schema (http://www.rfc-editor.org/rfc/rfc2251.txt) gives the following as part a bind request:
BindRequest ::= [APPLICATION 0] SEQUENCE {
version INTEGER (1 .. 127),
name LDAPDN,
authentication AuthenticationChoice }
AuthenticationChoice ::= CHOICE {
simple [0] OCTET STRING,
-- 1 and 2 reserved
sasl [3] SaslCredentials }
SaslCredentials ::= SEQUENCE {
mechanism LDAPString,
credentials OCTET STRING OPTIONAL }
How is that CHOICE there actually encoded?
I generated a sample bind request using JXplorer and captured the raw data that was sent. It looks like this:
00000000 30 31 02 01 01 60 2c 02 01 03 04 1b 75 69 64 3d |01...`,.....uid=|
00000010 74 65 73 74 75 73 65 72 2c 64 63 3d 74 65 73 74 |testuser,dc=test|
00000020 2c 64 63 3d 63 6f 6d 80 0a 74 65 73 74 69 6e 67 |,dc=com..testing|
00000030 31 32 33 |123|
The 80 there (at offset 0x27) seems to represent that choice. Fair enough - and I get that (per http://en.wikipedia.org/wiki/Basic_Encoding_Rules#BER_encoding) the last bit is set in order to indicate that it's "context specific" (i.e. defined by this application/protocol) But how would I know if this is a "simple" or "sasl" auth? What indicates which option of the choice is being used? In this case it looks like the next byte (0x0a) is the length of the string - so this could be an OctetString or something of the sort - but I don't see anything here that indicates what the actual is other than 0x80...
I'm also not sure what the [0] and [3] mean in the CHOICE section above. Is that saying there are four options but only options numbered 0 and 3 are in use?

Below you can see output of openssl asn1parse command. The CHOICE members are encoded using so called context specific tags - which means normal tag value is replaced with the one specified in ASN.1 definition for respective item in the CHOICE. The tag has value 0 which implicates the first item in CHOICE is selected. The first choice item is of type OCTET STRING. The value 0 of context specific tag gives you the information about the value type. If there was no context tag, normal OCTET STRING tag would be used.
0:d=0 hl=2 l= 49 cons: SEQUENCE
2:d=1 hl=2 l= 1 prim: INTEGER :01
5:d=1 hl=2 l= 44 cons: appl [ 0 ]
7:d=2 hl=2 l= 1 prim: INTEGER :03
10:d=2 hl=2 l= 27 prim: OCTET STRING :uid=testuser,dc=test,dc=com
39:d=2 hl=2 l= 10 prim: cont [ 0 ]

The '80'H in the encoded message above is called the "identifier octets" (in general it may be more than one octet). This value of the identifier octet(s) indicates that the selected alternative of the CHOICE is "simple", because the five low-order bits of '80'H are '00000'B, which matches the tag number of the tag of "simple" ([0]).
If the sender had selected the "sasl" alternative, the identifier octet would be 'A3'H instead of '80'H. The '3'H in 'A3'H (the five low-order bits) is the tag number of the tag of "sasl" ([3]). The two highest-order bits of the identifier octet are set to '10'B for both alternatives because both [0] and [3] are "context-specific" tags (this just means that these tags don't contain the APPLICATION keyword or the PRIVATE keyword). The next bit of the identifier octet (the "constructed" bit) is set to '0' for "simple" but is set to '1' for "sasl", because the encoding of "sasl" contains nested tags whereas the encoding of "simple" does not contain any nested tags.

Related

How many combinations does SHA-256 have?

By using an online tool and wikipedia I found out that every sha-256 encrypted string is 64 chars longs containing numbers and characters. Hence I assumed that there are 34^36 combinations ( 2^216 simplified by an algebra calculator ).
After doing some research I found out that most people said there are 2^256 combinations. Could someone explain ? To make the context clear, I write a paper about cryptocurrencies and try to explain how many different combinations there are to encrypt and how long this could take ( therefore how many guesses it could take) and compare this to the amount of total atoms in the universe (roughly 10^85).
SHA-256 produces 256 bits which is 32 bytes, not characters, each byte has 256 possible values.
There are 256 bits and each bit has 2 values (0 or 1), thus 2^256.
There are 32 bytes and each byte has 256 values, thus 256^32.
Note: 2^256 == 256^32 ~= 10^77.
The 32 bytes can be encoded many ways, in hexadecimal it would be 64 characters, in Base64 it would be 44 characters.
Total combinations of SHA-256 is
115,792,089,237,316,195,423,570,985,008,687,907,853,269,984,665,640,564,039,457,584,007,913,129,639,936
A sha-256 hash has 64 characters, 32 hex combinations, because a hex has 2 characters.
3a 7b d3 e2 36 0a 3d 29 ee a4 36 fc fb 7e 44 c7 35 d1 17 c4 2d 1c 18 35 42 0b 6b 99 42 dd 4f 1b
Above is a hash where the hex combinations are separated so you can count 32.
There are 16 characters available to hex 0-9&a-f and 16^2 or 256 combinations in hex.
With 32 slots for a hex in a sha-256 you use 256^32 to get:
115792089237316195423570985008687907853269984665640564039457584007913129639936
Available sha-256 hashes.

u-sql: filtering out empty// Null strings (microsoft academic graph)

I am new to u-sql of azure datalake analytics.
I want to do what I think is a very simple operations but ran into trouble.
Basically: I want to create a query which ignore empty string.
using it in select works, but not in WHERE statement.
Below the statement I am making and the cryptic error I get
JOB
#xsel_res_1 =
EXTRACT
x_paper_id long,
x_Rank uint,
x_doi string,
x_doc_type string,
x_paper_title string,
x_original_title string,
x_book_title string,
x_paper_year int,
x_paper_date DateTime?,
x_publisher string,
x_journal_id long?,
x_conference_series_id long?,
x_conference_instance_id long?,
x_volume string,
x_issue string,
x_first_page string,
x_last_page string,
x_reference_count long,
x_citation_count long?,
x_estimated_citation int?
FROM #"adl://xmag.azuredatalakestore.net/graph/2018-02-02/Papers.txt"
USING Extractors.Tsv()
;
#xsel_res_2 =
SELECT
x_paper_id AS x_paper_id,
x_doi.ToLower() AS x_doi,
x_doi.Length AS x_doi_length
FROM #xsel_res_1
WHERE NOT string.IsNullOrEmpty(x_doi)
;
#xsel_res_3 =
SELECT
*
FROM #xsel_res_2
SAMPLE ANY (5)
;
OUTPUT #xsel_res_3
TO #"/graph/2018-02-02/x_output/x_papers_x6.tsv"
USING Outputters.Tsv();
THE ERROR
Vertex failed
Vertex failure triggered quick job abort. Vertex failed: SV1_Extract[0][1] with error: Vertex user code error.
VertexFailedFast: Vertex failed with a fail-fast error
E_RUNTIME_USER_EXTRACT_ROW_ERROR: Error occurred while extracting row after processing 10 record(s) in the vertex' input split. Column index: 5, column name: 'x_original_title'.
E_RUNTIME_USER_EXTRACT_EXTRACT_INVALID_CHARACTER_AFTER_QUOTED_FIELD: Invalid character following the ending quote character in a quoted field.
Row selected
Component
RUNTIME
Message
Invalid character following the ending quote character in a quoted field.
Resolution
Column should be fully surrounded with double-quotes and double-quotes within the field escaped as two double-quotes.
Description
Invalid character is detected following the ending quote character in a quoted field. A column delimiter, row delimiter or EOF is expected. This error can occur if double-quotes within the field are not correctly escaped as two double-quotes.
Details
Row Delimiter: 0x0
Column Delimiter: 0x9
HEX: 61 76 6E 69 20 74 65 72 6D 69 6E 20 75 20 70 6F 76 61 6C 6A 73 6B 6F 6A 20 6C 69 73 74 69 6E 69 20 69 20 6E 61 74 70 69 73 75 20 67 20 31 31 38 35 09 22 50 6F 20 6B 6F 6E 63 75 22 ### 20 28 73 74 61 72 69 20 68 72
UPDATE
BY the way, the operations work on other datasets, so the problem is not the syntax as far as I can tell
//Define schema of file, must map all columns
#searchlog =
EXTRACT UserId int,
Start DateTime,
Region string,
Query string,
Duration int,
Urls string,
ClickedUrls string
FROM #"/Samples/Data/SearchLog.tsv"
USING Extractors.Tsv();
#searchlog_1 =
SELECT * FROM #searchlog
WHERE NOT string.IsNullOrEmpty(ClickedUrls );
OUTPUT #searchlog_1
TO #"/Samples/Output/SearchLog_output_x1.tsv"
USING Outputters.Tsv();
This is an unfortunate error display for this case.
Assuming text is utf-8, you can use a site like www.hexutf8.com to convert the hex to:
avni termin u povaljskoj listini natpisu g 1185 "Po koncu" (Stari hr
It looks like the input row contains at least one " character that is not properly escaped. It should look like this:
avni termin u povaljskoj listini natpisu g 1185 ""Po koncu"" (Stari hr
#Saveenr's answer assumes that the values in your file are all quoted. Alternatively, if they are not quoted (and do not contain your column separator as values), then setting Extractors.Tsv(quoting:false) could help as well.

wrong output when decoding base64 string

i seem to always get incorrect output when decoding this base64 string in vb.net ( i think its base64? it really looks like it )
im using the frombase64string function
and i did it like this
Dim b64str = "0DDQQL3uAikQBgAAc4cqK4WnSQBg4SAgExEAAF3BAmAILYojRgkBhUrBAgEDRw=="
Dim i As String = System.Text.Encoding.Unicode.GetString(Convert.FromBase64String(b64str))
MsgBox(i)
but i always get this output
バ䃐⤂ؐ
that doesn't seem right
0DDQQL3uAikQBgAAc4cqK4WnSQBg4SAgExEAAF3BAmAILYojRgkBhUrBAgEDRw==
It looks like Base64, the length is a correct size, the characters belong to the Base64 character set and the trailing "==" is reasonable. Of course it might not be a Base64 encoding.
Base64 decoding results in:
D0 30 D0 40 BD EE 02 29 10 06 00 00 73 87 2A 2B 85 A7 49 00 60 E1 20 20 13 11 00 00 5D C1 02 60 08 2D 8A 23 46 09 01 85 4A C1 02 01 03 47
Now the problem, this is not a character string, it is an array of 8-bit bytes. Thus it can not be displayed as characters. The 0x00 bytes will signal the end of a string to the print method and the no-representable characters may be ignored, displayed with special characters or multiple bytes may display as must-byte unicode characters. The only guaranteed and usual display is in hexadecimal as above.
That String can be virtually anything. It might be the result of an encryption algorithm, like sha*. Your mistake is that you assume that it must be base64 because it might be.
It is a valid observation that it might be base64, so it was a perfectly valid thing to run that function, but it is you who has to determine whether based on the results it is base64 or something else, based on particular logic, which was not described in the question.

What is wrong with this LDAP filter packet?

I am trying to port a program which queries an LDAP server from Perl to Go, and with the Go version I am receiving a response that the filter is malformed:
00000057: LdapErr: DSID-0C0C0968, comment: The server was unable to decode a search request filter, data 0, v1db1\x00
I have used tcpdump to capture the data transmitted to the server with both the Perl and Go versions of my program, and have found that they are sending slightly different filter packets. This question is not about any possible bugs in the Go program, but simply about understanding the contents of the LDAP filter packets.
The encoded filter is:
(objectClass=*)
And the Perl-generated packet (which the server likes) looks like this:
ASCII . . o b j e c t C l a s s
Hex 87 0b 6f 62 6a 65 63 74 43 6c 61 73 73
Byte# 0 1 2 3 4 5 6 7 8 9 10 11 12
The Go-generated packet (which the server doesn't like) looks like this:
ASCII . . . . o b j e c t C l a s s
Hex a7 0d 04 0b 6f 62 6a 65 63 74 43 6c 61 73 73
Byte# 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
This is my own breakdown of the packets:
##Byte 0: Tag
When I dissect Byte 0 from both packets, I see they are identical, except for the Primitive/Constructed bit, which is set to Primitive in the Perl version, and Constructed in the Go version. See DER encoding for details.
Bit# 87 6 54321
Perl 10 0 00111
Go 10 1 00111
Bits 87: In both packets, 10 = Context Specific
Bit 6: In the Perl version 0 = Primitive, in the Go version 1 = Constructed
Bits 54321: 00111 = 7 = Object descriptor
##Byte 1: Length
11 bytes for the Perl version, 13 for the Go version
##Bytes 2-3 for the Go version
Byte 2: Tag 04: Substring Filter (See section 4.5.1 of RFC 4511)
Byte 3: Length of 11 bytes
##Remainder: Payload
For both packets this is simply the ASCII text objectClass
My reading of RFC 4511 section 4.5.1 suggests that the Go version is "more" correct, yet the Perl version is the one that works with the server. What gives?
Wireshark is able to parse both packets, and interprets them both equally.
The Perl version is correct, and the Go version is incorrect.
As you point out, RFC 4511 section 4.5.1 specifies encoding for the filter elements, like:
Filter ::= CHOICE {
and [0] SET SIZE (1..MAX) OF filter Filter,
or [1] SET SIZE (1..MAX) OF filter Filter,
not [2] Filter,
equalityMatch [3] AttributeValueAssertion,
substrings [4] SubstringFilter,
greaterOrEqual [5] AttributeValueAssertion,
lessOrEqual [6] AttributeValueAssertion,
present [7] AttributeDescription,
approxMatch [8] AttributeValueAssertion,
extensibleMatch [9] MatchingRuleAssertion,
... }
And in this case, the relevant portion is:
present [7] AttributeDescription,
The AttributeDescription element is defined in section 4.1.4 of the same specification:
AttributeDescription ::= LDAPString
-- Constrained to <attributedescription>
-- [RFC4512]
And from section 4.1.2:
LDAPString ::= OCTET STRING -- UTF-8 encoded,
-- [ISO10646] characters
So this means that the present filter component is an octet string, which is a primitive element. Go is incorrectly converting it to a constructed element, and the directory server is correctly rejecting that malformed request.

Understanding ISO 8583 messaging log

I read about ISO 8583 messaging at WIKI and Code Project; I understood ISO 8583 messages can basically be divided in 3 parts:
MTI (Message Type Indicator)
1.1. Version
1.2. Message Class
1.3. Message Function
1.4. Message Origin
Bitmap
Indicate which data elements are present.
DataElement
The essence of the whole ISO message, contain information about the transaction such as:
transaction type,
amount,
customerid, etc.
So, after reading these two web references, I want to make divide my ISO messaging log as MTI, bitmap, and Data Element.
For example:
(0800 2020000000800000 000000 000001 3239313130303031)
MTI: 0800 (1987 version, Network Management Message, Request, Acquirer)
Bitmap: 20 20 00 00 00 80 00 00 (eg. 20 = 0010 0000 ,so position 3 is on)
DataElement:(by seeing Bitmap , we can defined data element as follow)
field 03:000000 (Processing Code)
field 11:000001 (Systems trace audit number)
field 41:3239313130303031 (Card acceptor terminal idenfication)
But my challenge is that I already have ISO 8583 messaging log from my ATM Machine.
This actual output messaging log is not very clear like the one in the snippet above.
So I cannot divide this message to MTI, Bitmap and Data element like upper example.
00000:00 5B 30 31 31 30 30 30 30 30 30 30 30 38 32 30 80 38 00 00 [.[01100000000820.8..]
00020:00 81 00 00 04 00 00 00 00 00 00 00 33 36 32 39 31 30 31 30 [............36291010]
00040:32 39 35 37 31 30 33 31 31 30 30 30 30 30 30 35 30 33 31 53 [2957103110000005031S]
00060:55 32 30 31 31 31 30 33 31 31 30 32 39 35 37 32 30 31 31 31 [U2011103110295720111]
00080:30 33 31 31 30 32 39 35 37 33 30 30 31 [0311029573001 ]
I have no previous experience in ISO 8583 message and welcome suggestions.
Got it,
This message is divided as follows:
First 2 bytes are the message length 00 5B = 91
Followed by 14 bytes of header = 01100000000820
Followed somehow by BMP as follows:
Primary BMP = 80 38 00 00 00 81 00 00 = Fields {1, 11,12, 13, 41, 48} exist
Field 1 means secondary BMP exist
Secondary BMP = 04 00 00 00 00 00 00 00 = Field 70 exist
I am not sure where is MTI, will it be the 0820 at the trailer of the header? since it is in ASCII and usually it comes in numeric value of 08 20 but this might be part of the specs. 0820 means network management advice
anyways, the fields from the decoded BMP as follows:
DE 11 = 362910
System Trace Audit Number
DE 12 = 102957
Local transaction time hh(24)mmss
DE 13 = 1031
Local transaction date MMDD
DE 41 = 10000005
terminal ID
DE 48 = (031) SU20111031102957201110311029573
notice the 3 digits length field preceding the remaining data in this field. which is a generic (future/private use) field
DE 70 = 001
network management information code <001 = sign on>
From DE 70 value 001 this is a sign on message, which must be a 0800 MTI.
To get more information about the location of the MTI and the meaning of DE 48, you should read the manual (technical specs) of this device to get more information.
while sending ISO 8583 message we are converting it in BCD/HEX form ,use Wireshark tool to
track communication between IP and Ports.
A good online bitmap analysis tool is https://neapay.com/online-tools/bitmap-fields-decoder.html.
Sometime helps the https://codebeautify.org/hex-string-converter tool.
For a better understanding of the ISO 8583 message format, it is useful to analyse each field manually. However, each field can have length and value subfields in different formats (BCD, EBCDIC, ASCII ...). And some fields may have inner fields, for example BMP 48 or 60 often used as containers for nested field trees. These inner fields may have tag, length and value. And nested fields of these fields may have different formats again :). For example the https://github.com/credibledoc/credible-doc/blob/master/iso-8583-packer/doc/ebcdic/ebcdic-decimal-tag-packer.md page describes a field with EBCDIC tag and BCD value.
You can use the https://github.com/credibledoc/credible-doc/tree/master/iso-8583-packer Java library (I am the author) for building ISO 8583 messages. The example of ISO message above can be unpacked and packed with the iso-8583-packer library.
Message data:
<f name="Root" lenHex="005B">
<f name="Header" val="0110000000" valHex="30313130303030303030"/>
<f name="MTI" val="0820" valHex="30383230"/>
<f name="Bitmap" bitmapHex="80380000008100000400000000000000" bitSet="{1, 11, 12, 13, 41, 48, 70}">
<f name="SystemTraceAuditNumber" fieldNum="11" val="362910" valHex="333632393130"/>
<f name="LocalTransactionTimeHHMMSS" fieldNum="12" val="102957" valHex="313032393537"/>
<f name="LocalTransactionDateMMDD" fieldNum="13" val="1031" valHex="31303331"/>
<f name="TerminalId" fieldNum="41" val="10000005" valHex="3130303030303035"/>
<f name="PrivateData_48" fieldNum="48" val="SU20111031102957201110311029573" lenHex="303331" valHex="53553230...39353733"/>
<f name="NetworkManagementInformationCode" fieldNum="70" val="001" valHex="303031"/>
</f>
</f>
Message Structure:
<f type="LEN_VAL" name="Root" lengthPacker="BinaryLengthPacker" bodyPacker="AsciiBodyPacker">
<f type="VAL" name="Header" bodyPacker="AsciiBodyPacker" len="10"/>
<f type="VAL" name="MTI" bodyPacker="AsciiBodyPacker" len="4"/>
<f type="BIT_SET" name="Bitmap" bitMapPacker="IfbBitmapPacker">
<f type="VAL" fieldNum="11" name="SystemTraceAuditNumber" bodyPacker="AsciiBodyPacker" len="6"/>
<f type="VAL" fieldNum="12" name="LocalTransactionTimeHHMMSS" bodyPacker="AsciiBodyPacker" len="6"/>
<f type="VAL" fieldNum="13" name="LocalTransactionDateMMDD" bodyPacker="AsciiBodyPacker" len="4"/>
<f type="VAL" fieldNum="41" name="TerminalId" bodyPacker="AsciiBodyPacker" len="8"/>
<f type="LEN_VAL" fieldNum="48" name="PrivateData_48" lengthPacker="AsciiLengthPacker" bodyPacker="AsciiBodyPacker"/>
<f type="VAL" fieldNum="70" name="NetworkManagementInformationCode" bodyPacker="AsciiBodyPacker" len="3"/>
</f>
</f>
The example above can be found on GitHub https://github.com/credibledoc/credible-doc/blob/master/iso-8583-packer/src/test/java/com/credibledoc/iso8583packer/examples/UnderstandingIso8583MessageLogTest.java
BMP 55 often contains TLV EMV data. The https://paymentcardtools.com/emv-tlv-parser tool is useful in the case.