Efficient bit field extraction from bytes array being interpreted as a bitstream, possibly using Intel's BMI Set 1 - vb.net

To extract bit fields from a byte array interpreted as a bitstream, I am aiming to devise an efficient bit field extraction function, optimized for Intel's modern CPUs (ideally making use of the new BEXTR instruction) and MS Visual Studio 2017 (I'm developing in VB.NET).
Inputs: Bitstream left to right (MSB first)
Pos=0...7, Len=1...8
Output: bitfield at Pos in length Len (LSB right-aligned)
As an example for Pos=0...7 and Len=3 (masking omitted):
Byte0    Byte1    Shifts
01234567 01234567
xxx               >> 5
 xxx              >> 4
  xxx             >> 3
   xxx            >> 2
    xxx           >> 1
     xxx          n/a
      xx x        << 1 | >> 7
       x xx       << 2 | >> 6
From the example, a possibly naive implementation in pseudo-code would be:
Extract(Pos, Len, ByteAddr):=
  if 8-Pos-Len > 0
    Res:=[ByteAddr+0] >> (8-Pos-Len)
    Res:=Res & (2^Len-1)
  elseif 8-Pos-Len < 0
    Res:=[ByteAddr+0] << (Pos+Len-8)
    Res:=Res & (2^Len-1)
    Res:=Res | ([ByteAddr+1] >> (16-Pos-Len))
  else
    Res:=[ByteAddr+0] & (2^Len-1)
  fi
Testing this on "paper" (well, notepad.exe these days) for Len=4 and Pos=0...7 shows that the algorithm is likely to work:
Byte0 Byte1 B0>> B0<< &(2^Len-1) B1>> B0|B1
8-Pos-Len Pos+Len-8 16-Pos-Len
01234567 01234567 01234567 01234567 01234567 01234567 01234567
xxxx.... | 0000xxxx | 0000xxxx | |
.xxxx... | 000.xxxx | 0000xxxx | |
..xxxx.. | 00..xxxx | 0000xxxx | |
...xxxx. | 0...xxxx | 0000xxxx | |
....xxxx | | 0000xxxx | |
.....xxx y....... ....xxx0 0000xxx0 0000000y 0000xxxy
......xx yy...... ....xx00 0000xx00 000000yy 0000xxyy
.......x yyy..... ....x000 0000x000 00000yyy 0000xyyy
Questions:
(1) For efficiency reasons, should I use a lookup table instead of 2^Len-1, or can I rely on the compiler to reliably optimize powers of 2? (Of course, I could also use (1<<Len)-1. Does the compiler do this?)
(2) In VB.NET, how would I proceed to instruct the compiler to pretty please make use of the new BEXTR instruction?
(3) Should I do this completely differently, i.e. package everything in lookup tables? (After all, it's just 8 x 8 possibilities. However, it would not truly be expandable.)
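For comparison, here is a minimal Python sketch of a branch-free variant. It is not the pseudo-code above: instead of three cases, it loads two bytes into a 16-bit big-endian window and does one shift and one mask. The function names and the zero padding past the end of the data are assumptions made for illustration.

```python
def extract_bits(data: bytes, byte_addr: int, pos: int, length: int) -> int:
    """Return `length` bits starting at bit `pos` of data[byte_addr].

    Bits are numbered MSB first (pos 0 is the top bit of the byte);
    the result is right-aligned.
    """
    if not 0 <= pos <= 7 or not 1 <= length <= 8:
        raise ValueError("expected pos in 0..7 and length in 1..8")
    first = data[byte_addr]
    # Assumption: a missing second byte (field at the very end of the
    # data) is padded with zero.
    second = data[byte_addr + 1] if byte_addr + 1 < len(data) else 0
    window = (first << 8) | second      # 16-bit big-endian window
    shift = 16 - pos - length           # always >= 1 for valid inputs
    return (window >> shift) & ((1 << length) - 1)


def extract_bits_reference(data: bytes, byte_addr: int, pos: int, length: int) -> int:
    """Slow reference used only for checking: slice a string of bits."""
    chunk = data[byte_addr:byte_addr + 2]
    bits = "".join(f"{b:08b}" for b in chunk).ljust(16, "0")
    return int(bits[pos:pos + length], 2)
```

Because the shift amount is always at least 1, no case distinction is needed. The same idea carries over to wider windows (for example 32 or 64 bits) if longer fields are required.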

Related

How to select half precision (BFLOAT16 vs FLOAT16) for your trained model?

How do you decide which precision works best for your inference model? Both BF16 and F16 take two bytes, but they use a different number of bits for the fraction and the exponent.
The range will be different, but I am trying to understand why one would choose one over the other.
Thank you
|--------+------+----------+----------|
| Format | Bits | Exponent | Fraction |
|--------+------+----------+----------|
| FP32 | 32 | 8 | 23 |
| FP16 | 16 | 5 | 10 |
| BF16 | 16 | 8 | 7 |
|--------+------+----------+----------|
Range
bfloat16: ~1.18e-38 … ~3.40e38 with 2 to 3 significant decimal digits.
float16: ~5.96e−8 (smallest subnormal; smallest normal is ~6.10e−5) … 65504 with 3 to 4 significant decimal digits.
bfloat16 is generally easier to use, because it works as a drop-in replacement for float32. If your code doesn't create nan/inf numbers or turn a non-0 into a 0 with float32, then it shouldn't do it with bfloat16 either, roughly speaking. So, if your hardware supports it, I'd pick that.
Check out automatic mixed precision (AMP) if you choose float16.
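To make the range versus precision trade-off concrete, here is a small Python sketch that rounds a value to each format using only the standard library. The helper names are made up for this example, and the bfloat16 conversion simply truncates the low 16 bits of a float32, which is an assumption: real hardware typically rounds to nearest even.

```python
import math
import struct


def to_float16(x: float) -> float:
    """Round x to IEEE half precision and back (overflow becomes inf)."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses values beyond the float16 range; treat as inf.
        return math.copysign(math.inf, x)


def to_bfloat16(x: float) -> float:
    """Emulate bfloat16 by keeping only the top 16 bits of a float32.

    Assumption: truncation (round toward zero) instead of the
    round-to-nearest-even that hardware usually implements.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


for value in (1.001, 70000.0, 1e-8):
    print(value, "float16:", to_float16(value), "bfloat16:", to_bfloat16(value))
```

With these helpers, 1.001 keeps more detail in float16 (1.0009765625) than in bfloat16 (1.0), while 70000.0 and 1e-8 fall outside the float16 range (inf and 0.0) but survive in bfloat16 with reduced precision.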

Implementation of an AND chip in HDL

I'm working through this book http://nand2tetris.org/book.php that teaches fundamental concepts of CS and I got stuck where I'm asked to code an AND chip and test it in provided testing software.
This is what I've got so far:
/**
 * And gate:
 * out = 1 if (a == 1 and b == 1)
 *       0 otherwise
 */
CHIP And {
    IN a, b;
    OUT out;
    PARTS:
    // Put your code here:
    Not(in=a, out=nota);
    Not(in=b, out=notb);
    And(a=a, b=b, out=out);
    Or(a=nota, b=b, out=nota);
    Or(a=a, b=notb, out=notb);
}
Problem is I'm getting this error:
...
at Hack.Gates.CompositeGateClass.readParts(Unknown Source)
at Hack.Gates.CompositeGateClass.<init>(Unknown Source)
at Hack.Gates.GateClass.readHDL(Unknown Source)
at Hack.Gates.GateClass.getGateClass(Unknown Source)
at Hack.Gates.CompositeGateClass.readParts(Unknown Source)
at Hack.Gates.CompositeGateClass.<init>(Unknown Source)
at Hack.Gates.GateClass.readHDL(Unknown Source)
...
And I don't know if I'm getting this error because the testing program is malfunctioning or because my code is wrong and the software can't load it up.
It may be helpful to examine the truth tables for Nand and And:
Nand
a | b | out
0 | 0 | 1
0 | 1 | 1
1 | 0 | 1
1 | 1 | 0
And
a | b | out
0 | 0 | 0
0 | 1 | 0
1 | 0 | 0
1 | 1 | 1
And is the inverse of Nand, meaning that for every combination of inputs, And will give the opposite output of Nand. Another way to think of the "opposite" of a binary value is "not" that value.
If you send 2 inputs through a Nand gate and then send its output through a Not gate, you will have Not(Nand(a, b)), which is equivalent to And(a, b).
Your problem is that you are trying to use parts (Not, And and Or) that haven't been defined yet (and you are trying to use an And gate in your definition of an And gate).
At each point in the course, you can only use parts that you have previously built. If memory serves, at this point the only part you have available is a Nand gate.
You should be able to construct an And gate using only Nand gates.
You're overthinking the problem significantly.
If you are given NAND, i.e. (not) AND, then AND can be constructed as (not) NAND, since (not)(not) AND = AND.
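The construction described in these answers can be checked with a few lines of Python. This is only a model of the logic, not HDL for the course's simulator; the function names are made up for the example.

```python
def nand(a: int, b: int) -> int:
    """The only primitive: 0 when both inputs are 1, otherwise 1."""
    return 0 if (a == 1 and b == 1) else 1


def not_(a: int) -> int:
    """Not built from Nand by feeding the same signal to both inputs."""
    return nand(a, a)


def and_(a: int, b: int) -> int:
    """And = Not(Nand(a, b)), so it needs two Nand gates in total."""
    return not_(nand(a, b))


# Print the truth table for And.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, and_(a, b))
```

The printed table matches the And truth table above, which suggests that two Nand parts are enough for the chip.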

Multiple Font Style Combinations in vb.net

If I want to create a font with multiple style combinations, like bold AND underline, I have to place the 'or' statement between it, like in the example below:
lblArt.Font = New Font("Tahoma", 18, FontStyle.Bold Or FontStyle.Underline)
If you place 'and' between bold and underline, it won't work, and you only get 1 of the 2 (which is how I would expect the 'or' statement to behave), even though 'and' would be the logical way to do it. What is the reason behind this?
Boolean logic works a bit differently than the way we use the terms in English. What's happening here is that the enumerated FontStyle values are actually bit flags, and in order to manipulate bit flags, you use bitwise operations.
To combine two bit flags, you OR them together. An OR operation combines the two values. So imagine that FontStyle.Bold was 2 and FontStyle.Underline was 4. When you OR them together, you get 6: you've combined them. In Boolean logic, you can think of an OR operation as returning "true" (i.e., setting that bit in the result) if either of the bits in the two operands is set, and "false" if neither is set.
You can write a truth table for such an operation as follows:
| A | B | A OR B |
|---|---|--------|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 1 |
Notice that the results more closely mirror what we, in informal English, would call "and". If either one has it set, then the result has it set, too.
In contrast to OR, a bitwise AND operation only returns "true" (i.e., sets that bit in the result) if both of the bits in the two operands are set. Otherwise, the result is "false". Again, a truth table can be written:
| A | B | A AND B |
|---|---|---------|
| 0 | 0 | 0 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 1 |
Assuming again that FontStyle.Bold has the value 2 and FontStyle.Underline has the value 4, if you AND them together, you get 0. This is because the two values have no set bits in common (binary 010 and 100), so every bit of the result is 0. The net result is that you don't get any font styles, which is precisely why it doesn't work when you write FontStyle.Bold And FontStyle.Underline.
In VB, a bitwise OR operation is performed using the Or operator. The And operator performs a bitwise AND operation. So in order to do a bitwise inclusion of values, which is how you combine bit flags, you use the Or operator.
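The same behaviour can be reproduced outside VB with a small Python sketch. The enum below is a stand-in written for this example, not the .NET type; its values follow what System.Drawing.FontStyle uses as far as I know (Bold = 1, Italic = 2, Underline = 4), rather than the hypothetical 2 and 4 used in the explanation above.

```python
from enum import IntFlag


class FontStyle(IntFlag):
    """Stand-in for a bit-flag enumeration: each style owns one bit."""
    REGULAR = 0
    BOLD = 1
    ITALIC = 2
    UNDERLINE = 4
    STRIKEOUT = 8


combined = FontStyle.BOLD | FontStyle.UNDERLINE   # sets both bits
common = FontStyle.BOLD & FontStyle.UNDERLINE     # keeps shared bits only

print(int(combined))                    # numeric value of both flags together
print(int(common))                      # no shared bits remain
print(FontStyle.BOLD in combined)       # membership test on the combination
```

OR yields a value with both bits set, while AND of two different single-bit flags always yields 0 because they share no bits.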
Try this:
lblArt.Font = New Drawing.Font("Tahoma", _
                               18, _
                               FontStyle.Bold Or FontStyle.Italic)
Use "New Drawing.Font" instead of Font alone.

Luke reveals unknown term values for numeric fields in index

We use Lucene.net for indexing. One of the fields that we index, is a numeric field with the values 1 to 6 and 9999 for not set.
When using Luke to explore the index, we see terms that we do not recognize. The index contains a total of 38673 documents, and Luke shows the following top ranked terms for this field:
Term | Rank | Field | Text | Text (decoded as numeric-int)
1 | 38673 | Axis | x | 0
2 | 38673 | Axis | p | 0
3 | 38673 | Axis | t | 0
4 | 38673 | Axis | | | 0
5 | 19421 | Axis | l | 0
6 | 19421 | Axis | h | 0
7 | 19421 | Axis | d# | 0
8 | 19252 | Axis | ` N | 9999
9 | 19252 | Axis | l | 8192
10 | 19252 | Axis | h ' | 9984
11 | 19252 | Axis | d# p | 9984
12 | 18209 | Axis | ` | 4
13 | 950 | Axis | ` | 1
14 | 116 | Axis | ` | 5
15 | 102 | Axis | ` | 6
16 | 26 | Axis | ` | 3
17 | 18 | Axis | ` | 2
We find the same pattern for other numeric fields.
Where do the unknown values come from?
NumericFields are indexed using a trie structure. The terms you see are part of it, but will not return results if you query for them.
Try indexing your NumericField with a precision step of Int32.MaxValue and the extra terms will go away.
NumericField documentation
... Within Lucene, each numeric value is indexed as a trie structure, where each term is logically assigned to larger and larger pre-defined brackets (which are simply lower-precision representations of the value). The step size between each successive bracket is called the precisionStep, measured in bits. Smaller precisionStep values result in larger number of brackets, which consumes more disk space in the index but may result in faster range search performance. The default value, 4, was selected for a reasonable tradeoff of disk space consumption versus performance. You can use the expert constructor NumericField(String,int,Field.Store,boolean) if you'd like to change the value. Note that you must also specify a congruent value when creating NumericRangeQuery or NumericRangeFilter. For low cardinality fields larger precision steps are good. If the cardinality is < 100, it is fair to use Integer.MAX_VALUE, which produces one term per value. ...
More details on the precision step available in the NumericRangeQuery documentation:
Good values for precisionStep are depending on usage and data type:
• The default for all data types is 4, which is used, when no precisionStep is given.
• Ideal value in most cases for 64 bit data types (long, double) is 6 or 8.
• Ideal value in most cases for 32 bit data types (int, float) is 4.
• For low cardinality fields larger precision steps are good. If the cardinality is < 100, it is fair to use Integer.MAX_VALUE (see below).
• Steps ≥64 for long/double and ≥32 for int/float produces one token per value in the index and querying is as slow as a conventional TermRangeQuery. But it can be used to produce fields, that are solely used for sorting (in this case simply use Integer.MAX_VALUE as precisionStep). Using NumericFields for sorting is ideal, because building the field cache is much faster than with text-only numbers. These fields have one term per value and therefore also work with term enumeration for building distinct lists (e.g. facets / preselected values to search for). Sorting is also possible with range query optimized fields using one of the above precisionSteps.
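The "lower-precision brackets" can be illustrated with a short Python sketch. It only computes the decoded value of each bracket for a non-negative 32-bit integer; it does not reproduce Lucene's actual term encoding, and the function name is made up for the example.

```python
def trie_prefix_values(value: int, precision_step: int = 4, bits: int = 32):
    """Return (shift, value with the lowest `shift` bits cleared) pairs.

    Assumption: non-negative values only; the real term bytes produced
    by Lucene are not modelled here.
    """
    return [(shift, (value >> shift) << shift)
            for shift in range(0, bits, precision_step)]


for shift, bracket in trie_prefix_values(9999):
    print(f"shift {shift:2d}: {bracket}")
```

For 9999 and a precision step of 4 this gives 9999, 9984, 9984, 8192 and then 0 for the remaining shifts, which lines up with the decoded values Luke shows in the question. With a step of 32 or more, only the shift-0 term (the value itself) remains.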
EDIT
A little sample: the index produced by this will show terms with values 8192, 9984, 1792, etc. in Luke, but a range query that would include them doesn't return any results:
NumericField number = new NumericField("number", Field.Store.YES, true);
Field regular = new Field("normal", "", Field.Store.YES, Field.Index.ANALYZED);
IndexWriter iw = new IndexWriter(FSDirectory.GetDirectory("C:\\temp\\testnum"), new StandardAnalyzer(), true);
Document doc = new Document();
doc.Add(number);
doc.Add(regular);
number.SetIntValue(1);
regular.SetValue("one");
iw.AddDocument(doc);
number.SetIntValue(2);
regular.SetValue("one");
iw.AddDocument(doc);
number.SetIntValue(13);
regular.SetValue("one");
iw.AddDocument(doc);
number.SetIntValue(2000);
regular.SetValue("one");
iw.AddDocument(doc);
number.SetIntValue(9999);
regular.SetValue("one");
iw.AddDocument(doc);
iw.Commit();
IndexSearcher searcher = new IndexSearcher(iw.GetReader());
NumericRangeQuery rangeQ = NumericRangeQuery.NewIntRange("number", 1, 2, true, true);
var docs = searcher.Search(rangeQ);
Console.WriteLine(docs.Length().ToString()); // prints 2
rangeQ = NumericRangeQuery.NewIntRange("number", 13, 13, true, true);
docs = searcher.Search(rangeQ);
Console.WriteLine(docs.Length().ToString()); // prints 1
rangeQ = NumericRangeQuery.NewIntRange("number", 9000, 9998, true, true);
docs = searcher.Search(rangeQ);
Console.WriteLine(docs.Length().ToString()); // prints 0
Console.ReadLine();

How to represent and insert into an ordered list in SQL?

I want to represent the list "hi", "hello", "goodbye", "good day", "howdy" (with that order), in a SQL table:
pk | i | val
------------
1 | 0 | hi
0 | 2 | hello
2 | 3 | goodbye
3 | 4 | good day
5 | 6 | howdy
'pk' is the primary key column. Disregard its values.
'i' is the "index" that defines that order of the values in the 'val' column. It is only used to establish the order and the values are otherwise unimportant.
The problem I'm having is with inserting values into the list while maintaining the order. For example, if I want to insert "hey" and I want it to appear between "hello" and "goodbye", then I have to shift the 'i' values of "goodbye" and "good day" (but preferably not "howdy") to make room for the new entry.
So, is there a standard SQL pattern to do the shift operation, but only shift the elements that are necessary? (Note that a simple "UPDATE table SET i=i+1 WHERE i>=3" doesn't work, because it violates the uniqueness constraint on 'i', and also it updates the "howdy" row unnecessarily.)
Or, is there a better way to represent the ordered list? I suppose you could make 'i' a floating point value and choose values between, but then you have to have a separate rebalancing operation when no such value exists.
Or, is there some standard algorithm for generating string values between arbitrary other strings, if I were to make 'i' a varchar?
Or should I just represent it as a linked list? I was avoiding that because I'd like to also be able to do a SELECT .. ORDER BY to get all the elements in order.
As I read your post, I kept thinking 'linked list', and at the end, I still think that's the way to go.
If you are using Oracle, and the linked list is a separate table (or even the same table with a self-referencing id, which I would avoid), then you can use a CONNECT BY query and the pseudo-column LEVEL to determine sort order.
You can easily achieve this by using a cascading trigger that updates any 'index' entry equal to the new one on the insert/update operation to the index value +1. This will cascade through all rows until the first gap stops the cascade - see the second example in this blog entry for a PostgreSQL implementation.
This approach should work independent of the RDBMS used, provided it offers support for triggers to fire before an update/insert. It basically does what you'd do if you implemented your desired behavior in code (increase all following index values until you encounter a gap), but in a simpler and more effective way.
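The cascade described here (shift only until the first gap) can be sketched in plain Python, independent of any trigger syntax. The function name and the use of a set of integer index values are assumptions for illustration.

```python
def rows_to_shift(used: set, new_i: int) -> dict:
    """Map each occupied index that must move to its new index.

    Starting at new_i, every consecutive occupied index moves up by
    one; the walk stops at the first free index (the gap).
    """
    moves = {}
    i = new_i
    while i in used:
        moves[i] = i + 1
        i += 1
    return moves


# Index values from the question: hi=0, hello=2, goodbye=3, good day=4, howdy=6
print(rows_to_shift({0, 2, 3, 4, 6}, 3))
```

For the question's data, inserting at 3 moves only 3 and 4 and leaves 6 untouched. When applying such moves as individual UPDATE statements, going from the highest index down avoids transient violations of a uniqueness constraint on 'i'.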
Alternatively, if you can live with a restriction to SQL Server, check the hierarchyid type. While mainly geared at defining nested hierarchies, you can use it for flat ordering as well. It somewhat resembles your approach using floats, as it allows insertion between two positions by assigning fractional values, thus avoiding the need to update other entries.
If you don't use numbers, but Strings, you may have a table:
pk | i | val
------------
1 | a0 | hi
0 | a2 | hello
2 | a3 | goodbye
3 | b | good day
5 | b1 | howdy
You may insert a4 between a3 and b, a21 between a2 and a3, a1 between a0 and a2, and so on. You would need a clever function to generate an i for a new value v between its predecessor p and successor n. The index values can become longer and longer, or you need a big rebalancing from time to time.
Another approach could be, to implement a (double-)linked-list in the table, where you don't save indexes, but links to previous and next, which would mean, that you normally have to update 1-2 elements:
pk | prev | val
------------
1 | 0 | hi
0 | 1 | hello
2 | 0 | goodbye
3 | 2 | good day
5 | 3 | howdy
Inserting "hey" between "hello" and "goodbye":
"hey" gets pk 6:
pk | prev | val
------------
1 | 0 | hi
0 | 1 | hello
6 | 0 | hey <- ins
2 | 6 | goodbye <- upd
3 | 2 | good day
5 | 3 | howdy
The previous element of "hey" is "hello" (pk=0), and "goodbye", which used to link to "hello", now has to link to "hey".
But I don't know whether an 'order by' mechanism for this is available in many DB implementations.
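On databases that support recursive common table expressions, the list can be read back in order by walking the prev links. Below is a minimal sketch using Python's built-in sqlite3 module. The table name items is made up, and, unlike the tables above, the first element stores NULL in prev so that it cannot be confused with the row whose pk is 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (pk INTEGER PRIMARY KEY, prev INTEGER, val TEXT)")
conn.executemany(
    "INSERT INTO items (pk, prev, val) VALUES (?, ?, ?)",
    [
        (1, None, "hi"),       # head of the list: no predecessor
        (0, 1, "hello"),
        (6, 0, "hey"),         # inserted after "hello"
        (2, 6, "goodbye"),     # re-linked from "hello" to "hey"
        (3, 2, "good day"),
        (5, 3, "howdy"),
    ],
)

# Walk the chain from the head, counting the depth to sort by it.
query = """
WITH RECURSIVE chain(pk, val, depth) AS (
    SELECT pk, val, 0 FROM items WHERE prev IS NULL
    UNION ALL
    SELECT items.pk, items.val, chain.depth + 1
    FROM items JOIN chain ON items.prev = chain.pk
)
SELECT val FROM chain ORDER BY depth
"""
ordered = [row[0] for row in conn.execute(query)]
print(ordered)
```

The trade-off is that every ordered read walks the whole chain, whereas an insert touches only the new row and its successor.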
Since I had a similar problem, here is a very simple solution:
Make your i column floats, but insert integer values for the initial data:
pk | i | val
------------
1 | 0.0 | hi
0 | 2.0 | hello
2 | 3.0 | goodbye
3 | 4.0 | good day
5 | 6.0 | howdy
Then, if you want to insert something in between, just compute a float value in the middle between the two surrounding values:
pk | i | val
------------
1 | 0.0 | hi
0 | 2.0 | hello
2 | 3.0 | goodbye
3 | 4.0 | good day
5 | 6.0 | howdy
6 | 2.5 | hey
This way, the number of inserts between the same two values is limited by the resolution of float values, but for almost all cases that should be more than sufficient.
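A minimal sketch of this midpoint approach with Python's built-in sqlite3 module follows. The table name items and the helper insert_between are made up for the example, and looking up the neighbours by their val is a simplification.

```python
import sqlite3


def insert_between(conn, val, before_val, after_val):
    """Insert val with an index halfway between its two neighbours."""
    lo = conn.execute("SELECT i FROM items WHERE val = ?", (before_val,)).fetchone()[0]
    hi = conn.execute("SELECT i FROM items WHERE val = ?", (after_val,)).fetchone()[0]
    conn.execute("INSERT INTO items (i, val) VALUES (?, ?)", ((lo + hi) / 2.0, val))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (pk INTEGER PRIMARY KEY, i REAL UNIQUE NOT NULL, val TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO items (i, val) VALUES (?, ?)",
    [(0.0, "hi"), (2.0, "hello"), (3.0, "goodbye"), (4.0, "good day"), (6.0, "howdy")],
)

insert_between(conn, "hey", "hello", "goodbye")   # gets i = 2.5
ordered = [row[0] for row in conn.execute("SELECT val FROM items ORDER BY i")]
print(ordered)
```

No existing row is updated, and a plain ORDER BY i returns the list in order. With double-precision values, repeatedly inserting next to the same neighbour runs out of distinct midpoints after roughly 50 halvings, at which point a rebalancing pass would be needed.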