Finding binary representation of a number at any bit position - optimization

For the past week I have been struggling with this problem, and it seems I can't solve it on my own.
Given an arbitrary 64-bit unsigned integer, the number is valid if it contains the binary pattern of 31 (0b11111) at any bit position, regardless of how the other bits are set; otherwise it is invalid.
E.g.:
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001 1111 valid
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0011 1110 valid
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0111 1100 valid
0000 0000 0000 0000 0000 0000 0000 0000 0000 1111 1000 0000 0000 0000 0000 0000 valid
0000 0000 0011 1110 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0011 1110 valid etc...
also:
0000 0000 0000 1100 0000 0100 0000 0000 1100 1100 0000 0000 0100 0000 0001 1111 valid
1110 0000 0000 0100 0000 0000 0011 0000 0000 0000 0000 0000 0000 0000 0011 1110 valid
0000 0000 1000 0010 0000 0010 0000 0000 0000 0000 0010 0000 0000 0000 0111 1100 valid
0000 0010 0000 0110 0000 0000 0000 0000 0000 1111 1000 0000 0000 0100 0110 0000 valid
0000 0000 0011 1110 0000 0000 0011 0000 0000 1000 0000 0000 0000 0000 0011 1110 valid etc...
but:
0000 0000 0000 1100 0000 0100 0000 0000 1100 1100 0000 0000 0100 0000 0000 1111 invalid
1110 0000 0000 0100 0000 0000 0011 0000 0000 0000 0000 0000 0000 0000 0011 1100 invalid
0000 0000 1000 0010 0000 0010 0000 0000 0000 0000 0010 0000 0000 0000 0101 1100 invalid
0000 0010 0000 0110 0000 0000 0000 0000 0000 1111 0000 0000 0000 0100 0110 0000 invalid
0000 0000 0011 1010 0000 0000 0011 0000 0000 1000 0000 0000 0000 0000 0001 1110 invalid etc...
You get the point...
But that is just the first half of the problem. The second half is that it needs to be implemented without loops or branches (which was done already) for speed, using a single check built from arithmetic, logical and bit-manipulation operations.
The closest I can get is a modified version of the Bit Twiddling Hacks "Determine if a word has a zero byte" trick ( https://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord ) that checks for five-bit blocks of zeros (the negated 11111). But it can still only check fixed blocks of bits (bits 0 to 4, bits 5 to 9, etc.), not any bit position (as in the examples above).
Any help would be greatly appreciated, since I'm totally exhausted.
Sz

Implementation
Let me restate your goal in a slightly different formulation:
I want to check whether an integer contains 5 consecutive set (1) bits.
From this formulation the following solution explains itself. It is written in C++.
bool contains11111(uint64_t i) {
    return i & (i << 1) & (i << 2) & (i << 3) & (i << 4);
}
This approach also works for any other pattern. For instance, to check for 010 you would use ~i & (i<<1) & ~(i<<2) & ~UINT64_C(3); the final mask discards the two lowest result bits, where the zeros shifted in from the right would otherwise be counted as pattern bits (without it, i = 1 would match). In your example the pattern's length is a prime number, but for composite lengths, and especially for powers of two, you can optimize this even further. For instance, when searching for 1111 1111 you could use i&=i<<1; i&=i<<2; i&=i<<4; return i.
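To make the two ideas in the paragraph above concrete, here is a minimal Python sketch (Python is used only for illustration; the answer itself is C++). The function names contains_run and contains_pattern are hypothetical and not part of the original answer. contains_run generalizes the shift-doubling trick by letting the last step overlap, which suggests that a run of 5 can also be found in three shift-and-AND steps. contains_pattern shifts right and masks the result to the positions where the whole pattern fits, so the zeros shifted in at the edge are never mistaken for pattern bits.

```python
MASK64 = (1 << 64) - 1  # emulate a 64-bit unsigned integer


def contains_run(x: int, n: int) -> bool:
    """Return True if the 64-bit value x has n consecutive 1 bits (n >= 1)."""
    x &= MASK64
    have = 1  # invariant: bit p of x is set iff `have` input bits ending at p are all 1
    while have < n and x:
        step = min(have, n - have)  # overlap on the last step if needed
        x &= (x << step) & MASK64
        have += step
    return x != 0


def contains_pattern(x: int, pattern: str) -> bool:
    """Return True if the bit string `pattern` (e.g. "010") occurs in the 64-bit x."""
    x &= MASK64
    n = len(pattern)
    acc = MASK64 >> (n - 1)  # keep only start positions where the pattern fits
    for j, bit in enumerate(reversed(pattern)):  # j = 0 is the pattern's lowest bit
        shifted = x >> j
        acc &= shifted if bit == "1" else ~shifted
    return acc != 0


if __name__ == "__main__":
    print(contains_run(0x000C0400CC00401F, 5))  # value taken from the valid examples
    print(contains_run(0x000C0400CC00400F, 5))  # value taken from the invalid examples
    print(contains_pattern(1, "010"), contains_pattern(2, "010"))
```

For n = 5 the loop performs steps of 1, 2 and 1, which is one shift-and-AND pair fewer than the four used in contains11111.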
Testing
To test this on your examples I used the following program. The literals inside testcases[] were generated by running your examples through the bash command ...
{ echo 'ibase=2'; tr -dc '01\n' < fileWithYourExamples; } |
bc | sed 's/.*/UINT64_C(&),/'
#include <cinttypes>
#include <cstdio>
bool contains11111(uint64_t i) {
    return i & (i << 1) & (i << 2) & (i << 3) & (i << 4);
}
int main() {
    uint64_t testcases[] = {
        // valid
        UINT64_C(31),
        UINT64_C(62),
        UINT64_C(124),
        UINT64_C(260046848),
        UINT64_C(17451448556060734),
        UINT64_C(3382101189607455),
        UINT64_C(16142027170561130558),
        UINT64_C(36593945997738108),
        UINT64_C(145804038196167776),
        UINT64_C(17451654848708670),
        // invalid
        UINT64_C(3382101189607439),
        UINT64_C(16142027170561130556),
        UINT64_C(36593945997738076),
        UINT64_C(145804038187779168),
        UINT64_C(16325754941866014),
    };
    for (uint64_t i : testcases) {
        std::printf("%d <- %016" PRIx64 "\n", contains11111(i), i);
    }
}
This prints
1 <- 000000000000001f
1 <- 000000000000003e
1 <- 000000000000007c
1 <- 000000000f800000
1 <- 003e00000000003e
1 <- 000c0400cc00401f
1 <- e00400300000003e
1 <- 008202000020007c
1 <- 020600000f800460
1 <- 003e00300800003e
0 <- 000c0400cc00400f
0 <- e00400300000003c
0 <- 008202000020005c
0 <- 020600000f000460
0 <- 003a00300800001e

Related

Understand pdf structure with flatedecode

Good day!
I have read the PDF documentation, but I still have some fundamental problems.
https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf
I need to get the xref table from a PDF file that uses cross-reference streams.
This is the PDF file:
https://ufile.io/q77el
Part of the PDF file:
startxref
22827515
%%EOF
This is the part it points to:
6628 0 obj
<<
/W [1 4 1]
/Info 1 0 R
/Root 2 0 R
/Size 6629
/Type /XRef
/Filter /FlateDecode
/Length 3996
/DecodeParms <<
/Columns 6
/Predictor 12
>>
>>
stream
xÚí]{|ŽåŸç=ïÝf6­LNIŒ³ŒeHŽ;ÙæÜÁ!D¥ƒèWé...
endstream
I found this stream data, ran it through the gzuncompress function, and got this:
$a = gzuncompress(substr($match[2][0],1,-1));
0200 0000 0000 ff02 0200 0000 0301 02ff
0000 000c 0002 0000 000f 7e00 0201 0000
f176 0102 ff00 0000 c2ff 0201 0000 003e
0202 0000 0000 0001 0200 0000 0000 0102
0000 0000 0001 0200 0000 0000 0102 0000
0000 0001 0200 0000 0000 0102 ff00 000d
3bf8 0201 0000 f3c5 0902 0000 0000 0001
0200 0000 0000 0102 0000 0000 0001 0200
0000 0000 0102 0000 0000 0001 0200 0000
0000 0102 0000 0000 0001 0200 0000 0000
But what does this mean?
I see that /W [1 4 1] means I need to split the data into 3 fields: 1 byte, 4 bytes, 1 byte:
02 00000000 00
ff 02020000 00
03 0102ff00 00
00 0c000200 00
But this does not work.
Please tell me what my next step should be. Thank you!
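Answer: the predictor information.
/Columns 6 means that each row holds 6 data bytes and is preceded by 1 predictor tag byte, so split the decompressed data into rows of n+1 = 7 bytes.
/Predictor 12 means that the PNG Up prediction algorithm was applied, so each data byte must be added (modulo 256) to the byte above it in the previous decoded row before the /W fields are read.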
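As a rough illustration of that answer, the following Python sketch (the thread uses PHP, and the helper names undo_png_up and split_fields are hypothetical) reverses the PNG Up predictor on the first 28 bytes of the dump above and then splits each row according to /W [1 4 1]. It assumes every row carries the tag byte 2 (Up), which is the case for the rows shown, and it is not a full implementation of the PNG predictors.

```python
def undo_png_up(data: bytes, columns: int) -> list:
    """Reverse PNG 'Up' prediction: each row is 1 tag byte + `columns` data bytes."""
    row_len = columns + 1
    prev = bytes(columns)  # the row above the first row is treated as all zeros
    rows = []
    for start in range(0, len(data) - row_len + 1, row_len):
        tag = data[start]
        if tag != 2:  # assumption: only the Up filter (tag 2) is handled here
            raise ValueError(f"unsupported predictor tag {tag}")
        raw = data[start + 1:start + row_len]
        cur = bytes((a + b) & 0xFF for a, b in zip(raw, prev))
        rows.append(cur)
        prev = cur
    return rows


def split_fields(row: bytes, widths) -> tuple:
    """Split one decoded row into big-endian integers with the /W field widths."""
    fields, pos = [], 0
    for w in widths:
        fields.append(int.from_bytes(row[pos:pos + w], "big"))
        pos += w
    return tuple(fields)


# First four 7-byte rows of the decompressed dump shown in the question.
dump = bytes.fromhex(
    "020000000000ff" "02020000000301" "02ff0000000c00" "020000000f7e00"
)
entries = [split_fields(r, (1, 4, 1)) for r in undo_png_up(dump, columns=6)]
for entry in entries:
    print(entry)
```

In a cross-reference stream the first field is the entry type: 0 is a free object, 1 is an object stored at the byte offset given in the second field, and 2 is an object kept inside the object stream whose number is in the second field, at the index given in the third field.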

Impala CONV function not consistently converting BASE-16 to BASE-2

I have hex strings that I need to convert to Base-2 binary strings, but I cannot get Impala to perform consistently.
E.g.
I would expect this statement:
select conv('0020008000',16,2) union
select conv('000006040A',16,2);
To return:
0000 0000 0010 0000 0000 0000 1000 0000 0000 0000
0000 0000 0000 0000 0000 0110 0000 0100 0000 1010
However, instead it's returning:
0000 0000 0010 0000 0000 0000 1000 0000 0000 0000
1100 0000 1000 0001 010
The 1st HEX value is converted correctly, but the 2nd is missing the first 21 digits (all zeros).
Can anyone explain why this is happening and how I can fix this behaviour?
Impala/Hive treats leading zeros as redundant and trims them. I'm not sure whether this behavior can be toggled. I worked around it using the lpad function.
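The padding logic behind that workaround can be sketched in Python (shown only to illustrate the arithmetic; hex_to_bits and group4 are hypothetical helper names). Each hex digit corresponds to exactly 4 binary digits, so the target width is 4 times the length of the hex string. In Impala the same idea would presumably look like lpad(conv(s, 16, 2), length(s) * 4, '0'), though that exact expression is an assumption and was not given in the answer.

```python
def hex_to_bits(hex_string: str) -> str:
    """Convert a hex string to binary, keeping 4 bits per hex digit."""
    width = 4 * len(hex_string)
    return format(int(hex_string, 16), "b").zfill(width)


def group4(bits: str) -> str:
    """Insert a space after every 4 digits, as in the expected output."""
    return " ".join(bits[i:i + 4] for i in range(0, len(bits), 4))


print(group4(hex_to_bits("0020008000")))
print(group4(hex_to_bits("000006040A")))
```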

First time Magenta generated midi doesn’t play

I'm trying to get a first-time Magenta installation to generate a playable MIDI file.
After upgrading bazel on OSX to ‘Build label: 0.2.3’, Magenta
works with this ‘example.mid’ input MIDI file placed in a subdirectory.
tmp3/example.mid
4d54 6864 0000 0006 0001 0002 00dc 4d54 726b 0000 0019 00ff 5804
0402 1808 00ff 5103 03d0 9000 ff59 0200 0001 ff2f 004d 5472
6b00 0000 4000 c000 0090 3c64 8151 3c00 0b3e 6481 513e 000b 4064
8151 4000 0b41 6481 5141 000b 4364 8151 4300 0b45 6481 5145
000b 4764 8151 4700 0b48 6481 5148 0001 ff2f 00
after running
bazel build magenta:convert_midi_dir_to_note_sequences
then
mkdir out3
touch out3/newexample.mid
./bazel-bin/magenta/convert_midi_dir_to_note_sequences \
--midi_dir=/Users/user/Downloads/magenta-master/tmp3 \
--output_file=/Users/user/Downloads/magenta-master/out3/newexample.mid \
--recursive
you get
out3/newexample.mid
2101 0000 0000 0000 072b 7cb0 0a36 2f69 642f 6d69 6469 2f74 6d70
332f 3364 3864 3537 3835 6634 3838 6666 6438 3837 3566 3130
6131 3238 3538 6336 6636 6332 3135 3230 3638 120b 6578 616d 706c
652e 6d69 641a 0474 6d70 3320 dc01 2a04 1004 1804 3200 3a09
1100 0000 0000 006e 4042 0d08 3c10 6421 6666 6666 6666 ce3f 4216
083e 1064 1900 0000 0000 00d0 3f21 3333 3333 3333 df3f 4216
0840 1064 1900 0000 0000 00e0 3f21 9999 9999 9999 e73f 4216
0841 1064 1900 0000 0000 00e8 3f21 9999 9999 9999 ef3f 4216
0843 1064 1900 0000 0000 00f0 3f21 cdcc cccc cccc f33f 4216
0845 1064 1900 0000 0000 00f4 3f21 cccc cccc cccc f73f 4216
0847 1064 1900 0000 0000 00f8 3f21 cccc cccc cccc fb3f 4216
0848 1064 1900 0000 0000 00fc 3f21 cccc cccc cccc ff3f 49cc
cccc cccc ccff 3f13 5fbf 34
but the music file doesn’t play, even if you add ‘ff2f 00’ (a common suggestion) to the end.
How can you make this resulting file play in a player such as Quicktime 7? Any ideas?
We've recently added a model that you can train to generate new sequences. Have a look at https://github.com/tensorflow/magenta/blob/master/magenta/models/basic_rnn/README.md.
Thanks!
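One quick check that can be made on the two hex dumps in the question: a Standard MIDI File begins with the four bytes 4d 54 68 64 ("MThd"), as example.mid does, whereas newexample.mid begins with 21 01 00 00. That suggests the output is not MIDI data at all (the tool's name indicates it writes note sequences), so appending ff2f 00 would not make it playable. A minimal Python sketch of the check, with looks_like_midi as a hypothetical helper name:

```python
MIDI_MAGIC = b"MThd"  # header chunk ID at the start of every Standard MIDI File


def looks_like_midi(data: bytes) -> bool:
    """Return True if the data starts with the MIDI header chunk ID."""
    return data[:4] == MIDI_MAGIC


# Leading bytes copied from the two hex dumps in the question.
example_mid = bytes.fromhex("4d546864000000060001000200dc")
newexample_mid = bytes.fromhex("2101000000000000072b7cb0")

print(looks_like_midi(example_mid))
print(looks_like_midi(newexample_mid))
```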

Scapy - srp1 does not see the response frame on L2, but tcpdump see it just fine

I am sending the Scapy packet below over eth3:
ans, unans = srp1(REQUEST, iface=self.iface)
print ans, unans
The call never returns. I tried srp too (and send/sendp/sniff as well): either the response is None or the call just hangs.
However, I can see the request and the response in tcpdump just fine:
listening on eth3, link-type EN10MB (Ethernet), capture size 65535 bytes
16:52:52.565683 00:26:55:27:1c:a2 (oui Unknown) > Broadcast, ethertype Unknown (0x88f8), length 34:
0x0000: ffff ffff ffff 0026 5527 1ca2 88f8 0001
0x0010: 000b 1500 0000 0000 0000 0000 0000 ffff
0x0020: eaf4
16:52:52.576476 00:04:25:1c:a0:02 (oui Unknown) > Broadcast, ethertype Unknown (0x88f8), length 76:
0x0000: ffff ffff ffff 0004 251c a002 88f8 0001
0x0010: 000b 9500 0028 0000 0000 0000 0000 0000
0x0020: 0000 f1f0 f100 0000 0000 0000 0000 0000
0x0030: 0000 0000 0000 0803 0087 1634 XXXX XXXX
0x0040: XXXX 0000 XXXX XXXX XXXX ffff
I figured it out and am sharing it:
If you define a new protocol with custom payloads, the answer layer should implement the "answers" method, e.g.:
class MyAnswer(Packet):
    name = "MyAnswer"
    fields_desc = [ByteEnumField("isOk", 0, BooleanFields)]
    def answers(self, other):
        return isinstance(other, MyRequest)

How can I extract the hex values from the text file using SQL Server?

I need to extract the hexadecimal values from a text file using a SQL Server stored procedure.
I wrote the procedure so that the input (hex values - 068F 015A 0000 01A7 69 019A 6B 00F1 6A) is given manually, but I need to pass in a filename as input and extract the content from that file.
My file content looks something like this :
V( 068F 015A 0000 01A7 69 019A 6B 00F1 6A )
V( 0665 0158 0000 01A8 68 0186 6B 00EE 6A )
V( 0687 017A 0000 01C3 67 018A 69 00F9 69 )
V( 067F 0171 0000 01AF 66 01A4 68 00F6 67 )
V( 06C2 0162 0000 01AA 64 0191 66 0150 65 )
V( 07E7 0163 0000 01B3 62 0195 64 0213 64 )
V( 0876 0166 0000 01CA 60 0214 62 01EF 62 )
V( 0BA1 015F 0000 021B 5E 039C 60 024B 60 )
V( 0DC9 014D 0007 01A2 5B 0426 5C 0407 5D )
V( 0E30 0140 000A 013E 5B 04A2 5B 043A 5C )
V( 0E6B 0130 000B 013C 5A 04B4 5A 046D 5B )
V( 0DDC 0150 0011 015A 58 052A 59 0399 5A )
V( 0C1C 0164 0001 013E 55 0456 56 03A4 56 )
V( 0CF0 02EA 0000 01B0 55 0534 56 0338 57 )
V( 0B86 03A1 0000 01D3 57 0461 58 02D6 59 )
V( 0950 0236 0000 013F 59 03A1 5A 0226 5A )
V( 0847 0279 0000 00CC 59 03A7 59 01E1 5B )
V( 0734 0203 0000 0078 5B 037D 5B 0156 5D )
V( 075D 038B 0000 00DD 5E 0306 5E 01E3 60 )
V( 073E 02C6 0000 0117 61 028A 62 0191 63 )
V( 0606 0183 0000 0095 62 01C8 63 01A7 64 )
V( 0793 0310 0000 00D4 5F 02CF 5F 020A 61 )
V( 07C2 03C0 0000 011D 5D 0301 5D 0211 5E )
V( 080A 043B 0000 0170 5E 031C 5E 01E6 60 )
V( 06FD 041C 0000 0129 60 02B1 60 01D5 61 )
V( 05D2 03A3 0000 0139 62 014E 62 0238 63 )
V( 06CC 046E 0000 0153 62 0205 62 0240 63 )
Please help me.
I tried it like this for a single line of hex values:
set @string='068F 015A 0000 01A7 69 019A 6B 00F1 6A'
set @string=replace(@string,' ',',')
SET @Delimiter = ','
SET @string = @string + @Delimiter
SET @Pos = charindex(@Delimiter,@string)
while(@Pos <> 0)
begin
    select @mid ='0x'+substring(@string,1,@Pos - 1)
    set @query='select convert(int,'+@mid+')'
    insert #word execute (@query)
    select @value=value from #word
    SET @string = substring(@string,@Pos+1,len(@string))
    SET @Pos = charindex(@Delimiter,@string)
end
Thank you,
I found a way to read the content from the textfile.
alter PROCEDURE ReadFromTextFile
    @FileName VARCHAR (1024)
AS
DECLARE @OLEResult INT
DECLARE @FS INT
DECLARE @FileID INT
DECLARE @Message VARCHAR (8000)
drop table hex_temp
create table hex_temp (id int identity(1,1),value varchar(max))
-- Create an instance of the file system object
EXECUTE @OLEResult = sp_OACreate 'Scripting.FileSystemObject', @FS OUT
IF @OLEResult <> 0
BEGIN
    PRINT 'Scripting.FileSystemObject'
    PRINT 'Error code: ' + CONVERT (VARCHAR, @OLEResult)
END
-- Open the text file for reading
EXEC @OLEResult = sp_OAMethod @FS, 'OpenTextFile', @FileID OUT, @FileName, 1, 1
IF @OLEResult <> 0
BEGIN
    PRINT 'OpenTextFile'
    PRINT 'Error code: ' + CONVERT (VARCHAR, @OLEResult)
END
-- Read the first line into the @Message variable
EXECUTE @OLEResult = sp_OAMethod @FileID, 'ReadLine', @Message OUT
-- Keep looping through until the @OLEResult variable is < 0; this indicates that the end of the file has been reached.
WHILE @OLEResult >= 0
BEGIN
    insert into hex_temp(value) values (@Message)
    EXECUTE @OLEResult = sp_OAMethod @FileID, 'ReadLine', @Message OUT
END
EXECUTE @OLEResult = sp_OADestroy @FileID
EXECUTE @OLEResult = sp_OADestroy @FS
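Once the lines are in hex_temp, each one still has to be split into its hex values. The parsing step itself is small; as an illustration outside SQL Server, here is a minimal Python sketch (parse_line is a hypothetical name) that takes one line in the V( ... ) format shown in the question and returns the values as integers. It assumes every token between the parentheses is a valid hexadecimal number.

```python
import re

# Matches the text between "V(" and ")" on a line such as "V( 068F 015A ... 6A )".
LINE_RE = re.compile(r"V\(\s*([0-9A-Fa-f\s]+?)\s*\)")


def parse_line(line: str) -> list:
    """Return the hex tokens of one 'V( ... )' line as integers ([] if no match)."""
    match = LINE_RE.search(line)
    if match is None:
        return []
    return [int(token, 16) for token in match.group(1).split()]


sample = "V( 068F 015A 0000 01A7 69 019A 6B 00F1 6A )"
print(parse_line(sample))
```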