How to select half precision (BFLOAT16 vs FLOAT16) for your trained model? - tensorflow

How do you decide which precision works best for your inference model? Both BF16 and FP16 take two bytes, but they allocate different numbers of bits to the fraction and the exponent.
The ranges differ, and I am trying to understand why one would choose one over the other.
Thank you
|--------+------+----------+----------|
| Format | Bits | Exponent | Fraction |
|--------+------+----------+----------|
| FP32 | 32 | 8 | 23 |
| FP16 | 16 | 5 | 10 |
| BF16 | 16 | 8 | 7 |
|--------+------+----------+----------|
Range
bfloat16: ~1.18e-38 … ~3.40e38, with about 3 significant decimal digits.
float16: ~5.96e-8 (smallest subnormal; smallest normal is ~6.10e-5) … 65504, with about 4 significant decimal digits of precision.

bfloat16 is generally easier to use, because it works as a drop-in replacement for float32. If your code doesn't create nan/inf values or turn a non-zero into a zero with float32, then, roughly speaking, it shouldn't do so with bfloat16 either. So, if your hardware supports it, I'd pick bfloat16.
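A quick way to see the range difference in practice (a sketch assuming TensorFlow, since both dtypes are available there):

import tensorflow as tf

x = tf.constant(70000.0)       # fine in float32
print(tf.cast(x, tf.float16))  # inf: 70000 exceeds float16's max of 65504
print(tf.cast(x, tf.bfloat16)) # ~7.0e4: bfloat16 keeps float32's exponent range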
If you choose float16, check out automatic mixed precision (AMP).


Deflate: code lengths of > 7 bits for top-level HCLEN?

RFC 1951 specifies that the first level of encoding in a block contains HCLEN 3-bit values, which encode the lengths of the next level of Huffman codes. Since these are 3-bit values, it follows that no code for the next level can be longer than 7 bits (111 in binary).
However, there seem to be corner cases which (at least with the "classical" algorithm for building Huffman codes, using a priority queue) generate 8-bit codes, which of course cannot be encoded.
An example I came up with is the following (this represents the 19 possible symbols resulting from the RLE encoding, 0-15 plus 16, 17 and 18):
symbol | frequency
-------+----------
0 | 15
1 | 14
2 | 6
3 | 2
4 | 18
5 | 5
6 | 12
7 | 26
8 | 3
9 | 20
10 | 79
11 | 94
12 | 17
13 | 7
14 | 8
15 | 4
16 | 16
17 | 1
18 | 13
According to various online calculators (e.g. https://people.ok.ubc.ca/ylucet/DS/Huffman.html), and also building the tree by hand, some symbols in the above table (namely 3 and 17) produce 8-bit Huffman codes. The resulting tree looks OK to me, with 19 leaf nodes and 18 internal nodes.
So, is there a special way to calculate Huffman codes for use in DEFLATE?
Yes. deflate uses length-limited Huffman codes. You need either a modified Huffman algorithm that limits the length, or an algorithm that shortens a Huffman code that has exceeded the length. (zlib does the latter.)
In addition to the code lengths code being limited to seven bits, the literal/length and distance codes are limited to 15 bits. It is not at all uncommon to exceed those limits when applying Huffman's algorithm to sets of frequencies encountered during compression.
Though your example is not a valid or possible set of frequencies for that code. Here is a valid example that results in a 9-bit Huffman code, which would then need to be squashed down to seven bits:
3 0 0 5 5 1 9 31 58 73 59 28 9 1 2 0 6 0 0
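To see the limit being exceeded, here is a small Python sketch (my own, not zlib's trees.c) that computes plain, unrestricted Huffman code lengths with a priority queue; on the valid frequency set above, the longest code comes out at 9 bits, which a DEFLATE encoder must then squash down to 7:

import heapq

def huffman_lengths(freqs):
    """Unrestricted Huffman code lengths via a priority queue:
    each merge adds one bit to every symbol in the merged subtree."""
    lengths = [0] * len(freqs)
    heap = [(f, [s]) for s, f in enumerate(freqs) if f > 0]
    heapq.heapify(heap)
    while len(heap) > 1:
        fa, a = heapq.heappop(heap)
        fb, b = heapq.heappop(heap)
        for s in a + b:
            lengths[s] += 1
        heapq.heappush(heap, (fa + fb, a + b))
    return lengths

freqs = [3, 0, 0, 5, 5, 1, 9, 31, 58, 73, 59, 28, 9, 1, 2, 0, 6, 0, 0]
print(max(huffman_lengths(freqs)))  # 9 -- exceeds the 7-bit limit for the code lengths code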

Why does two's-complement multiplication need to do sign extension?

In the book Computer Systems: A Programmer's Perspective (2.3.5), the method for two's-complement multiplication is described as follows:
Signed multiplication in C generally is performed by truncating the 2w-bit product to w bits.
Truncating a two's-complement number to w bits is equivalent to first computing its value modulo 2^w and then converting from unsigned to two's complement.
Thus, for identical bit-level operands, why is unsigned multiplication different from two's-complement multiplication? Why does two's-complement multiplication need sign extension?
To get the same bit-level representation for unsigned and two's-complement addition, we can convert the two's-complement arguments to unsigned, perform unsigned addition, and convert the result back to two's complement.
Since multiplication consists of repeated additions, why do the full-width results of unsigned and two's-complement multiplication differ?
Figure 2.27 demonstrates the example below:
+------------------+----------+---------+-------------+-----------------+
| Mode | x | y | x · y | Truncated x · y |
+------------------+----------+---------+-------------+-----------------+
| Unsigned | 5 [101] | 3 [011] | 15 [001111] | 7 [111] |
| Two's complement | –3 [101] | 3 [011] | –9 [110111] | –1 [111] |
+------------------+----------+---------+-------------+-----------------+
If you multiply 101 by 011 as unsigned numbers, you get 1111 (i.e. 001111). How did they get 110111 for the two's-complement case, then?
The catch is that to get a correct 6-bit two's-complement product, you need to multiply 6-bit two's-complement numbers. So first sign-extend -3 and 3 to 6-bit two's-complement representation: -3 = 111101, 3 = 000011, and only then multiply them: 111101 × 000011 = 10110111. Truncating the result to 6 bits then yields the 110111 from the table above.
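A small Python sketch (my own illustration) reproducing the table: the low w bits of both products agree, but the full 2w-bit patterns only match the signed result if the operands are sign-extended first:

def trunc(x, w):
    """Keep the low w bits of x (works for negative x too)."""
    return x & ((1 << w) - 1)

def to_signed(x, w):
    """Interpret a w-bit pattern as two's complement."""
    return x - (1 << w) if x & (1 << (w - 1)) else x

w = 3
x, y = 0b101, 0b011                                            # -3 and 3 in 3 bits
unsigned_full = x * y                                          # 5*3 = 15 = 001111
signed_full = trunc(to_signed(x, w) * to_signed(y, w), 2 * w)  # -9 -> 110111
print(format(unsigned_full, '06b'), format(signed_full, '06b'))
assert trunc(unsigned_full, w) == trunc(signed_full, w) == 0b111  # truncated results agree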

Efficient bit field extraction from a byte array being interpreted as a bitstream, possibly using Intel's BMI Set 1

In order to extract bit fields from a byte array being interpreted as a bitstream, I am aiming to devise an efficient bit-field extraction function, optimized for Intel's modern CPUs (ideally making use of the new BEXTR instruction) and MS Visual Studio 2017 (I'm developing in VB.NET).
Inputs: Bitstream left to right (MSB first)
Pos=0...7, Len=1...8
Output: bitfield at Pos in length Len (LSB right-aligned)
As an example for Pos=0...7 and Len=3 (masking omitted):
Byte0    Byte1    Shifts
01234567 01234567
xxx.....          >> 5
.xxx....          >> 4
..xxx...          >> 3
...xxx..          >> 2
....xxx.          >> 1
.....xxx          n/a
......xx x....... << 1 | >> 7
.......x xx...... << 2 | >> 6
From the example, a possibly naive implementation in pseudo-code would be:
Extract(Pos, Len, ByteAddr) :=
    if 8-Pos-Len > 0
        Res := [ByteAddr+0] >> (8-Pos-Len)
        Res := Res & (2^Len - 1)
    elseif 8-Pos-Len < 0
        Res := [ByteAddr+0] << (Pos+Len-8)
        Res := Res & (2^Len - 1)
        Res := Res | ([ByteAddr+1] >> (16-Pos-Len))
    else
        Res := [ByteAddr+0] & (2^Len - 1)
    fi
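A direct Python transcription of the pseudo-code (a sketch of my own, with masking and precedence made explicit) for experimenting:

def extract(data, pos, length):
    """Read `length` bits (1..8) starting at bit `pos` (0..7) of data[0],
    MSB-first, returning the field right-aligned (LSB)."""
    mask = (1 << length) - 1
    shift = 8 - pos - length
    if shift > 0:                      # field sits inside data[0]
        return (data[0] >> shift) & mask
    elif shift < 0:                    # field straddles data[0] and data[1]
        hi = (data[0] << -shift) & mask
        lo = data[1] >> (16 - pos - length)
        return hi | lo
    else:                              # field ends exactly at bit 7
        return data[0] & mask

assert extract(bytes([0b00000111, 0b10000000]), 5, 4) == 0b1111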
Testing this on "paper" (well, notepad.exe these days) for Len=4 and Pos=0...7 shows that the algorithm is likely to work:
Byte0    Byte1     B0>>       B0<<       &(2^Len-1)  B1>>        B0|B1
                   8-Pos-Len  Pos+Len-8              16-Pos-Len
01234567 01234567  01234567   01234567   01234567    01234567    01234567
xxxx....           0000xxxx              0000xxxx
.xxxx...           000.xxxx              0000xxxx
..xxxx..           00..xxxx              0000xxxx
...xxxx.           0...xxxx              0000xxxx
....xxxx                                 0000xxxx
.....xxx y.......             ....xxx0   0000xxx0    0000000y    0000xxxy
......xx yy......             ....xx00   0000xx00    000000yy    0000xxyy
.......x yyy.....             ....x000   0000x000    00000yyy    0000xyyy
Questions:
(1) For efficiency reasons, should I use a lookup table instead of 2^Len-1, or can I rely on the compiler to reliably optimize powers of 2? (Of course, I could also use (1<<Len)-1. Does the compiler do this?)
(2) In VB.NET, how would I proceed to instruct the compiler to pretty please make use of the new BEXTR instruction?
(3) Should I do this completely differently, i.e. package everything in lookup tables? (After all, it's just 8 x 8 possibilities. However, it would not truly be expandable.)

CodeChef C_HOLIC2 Solution: Find the smallest N whose factorial produces P Trailing Zeroes

For CodeChef problem C_HOLIC2, I tried iterating over the elements 5, 10, 15, 20, 25, ... and, for each number, checking the number of trailing zeros using the efficient technique specified over here, but got TLE.
What is the fastest way to solve this using the formula method?
Here is the Problem Link
As we know, the trick used for counting the number of trailing zeros in the factorial of a number is:
The number of multiples of 5 that are less than or equal to 500 is 500÷5=100.
Then, the number of multiples of 25 is 500÷25=20.
Then, the number of multiples of 125 is 500÷125=4.
The next power of 5 is 625, which is greater than 500.
Therefore, the number of trailing zeros of 500! is 100+20+4=124.
For a detailed explanation check this page.
Thus, this count can be represented as:
count = floor(N/5) + floor(N/25) + floor(N/125) + ...   (summing floor(N/5^i) while 5^i ≤ N)
Using this trick, given a number N, you can determine the number of trailing zeros, count, in its factorial. Codechef Problem Link
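In code, that trick is a short loop (a Python sketch; repeatedly dividing by 5 accumulates exactly the floor(N/5^i) terms):

def trailing_zeros(n):
    """Trailing zeros of n!: the power of 5 dividing n! (Legendre's formula),
    i.e. floor(n/5) + floor(n/25) + floor(n/125) + ..."""
    z = 0
    while n:
        n //= 5
        z += n
    return z

print(trailing_zeros(500))  # 124 = 100 + 20 + 4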
Now, suppose we are given the number of trailing zeros, count, and we are asked for the smallest number N whose factorial has count trailing zeros. Codechef Problem Link
Here the question is: how can we split count into this representation?
This is harder than it looks, as the following examples show:
count jumps even though N increases in constant steps of 5.
As you can see from the following table, count jumps at values of N whose factorials gain higher powers of 5 as factors, e.g. 25, 50, ..., 125, ...
+-------+-----+
| count | N |
+-------+-----+
| 1 | 5 |
+-------+-----+
| 2 | 10 |
+-------+-----+
| 3 | 15 |
+-------+-----+
| 4 | 20 |
+-------+-----+
| 6 | 25 |
+-------+-----+
| 7 | 30 |
+-------+-----+
| 8 | 35 |
+-------+-----+
| 9 | 40 |
+-------+-----+
| 10 | 45 |
+-------+-----+
| 12 | 50 |
+-------+-----+
| ... | ... |
+-------+-----+
| 28 | 120 |
+-------+-----+
| 31 | 125 |
+-------+-----+
| 32 | 130 |
+-------+-----+
| ... | ... |
+-------+-----+
You can see from any brute-force program for this task that these jumps occur regularly, i.e. at count = 6, 12, 18, 24 for numbers whose factorials contain a factor of 25 (interval = 6 = 1×5+1).
After count=31, factorials will also contain a factor of 125. The jumps corresponding to 25 still occur with the same spacing, i.e. at 31, 37, 43, ...
Now the next jump corresponding to 125 will be at 31+31, which is at 62. Thus jumps corresponding to 125 occur at count = 31, 62, 93, 124 (interval = 31 = 6×5+1).
The first jump corresponding to 625 occurs at count = 31×5+1 = 156.
So there is a pattern, and we need a formula for it to proceed.
The series formed is 1, 6, 31, 156, ...,
which is 1, 1+5, 1+5+5^2, 1+5+5^2+5^3, ...
Thus the nth term t_n is the sum of the first n terms of a G.P. with a = 1 and r = 5, i.e. t_n = (5^n - 1)/4.
Thus, count decomposes into something like 31+31+6+1+1, etc.
We need to find the t_n that is closest to count without exceeding it, i.e.:
Say count=35; then t_n=31 is the closest. For count=63 we again get t_n=31, but note that here 31 can be subtracted twice from count=63. So we keep finding this n and subtracting t_n from count until count becomes 0.
The algorithm used is:
count = read_no()
N = 0
while count != 0:
    n = floor(log(4*count + 1, 5))           # largest n with t_n <= count, since t_n = (5^n - 1)/4
    baseSum = (5**n - 1) // 4                # t_n
    baseOffset = (5**n) * (count // baseSum) # integer division
    count = count % baseSum
    N += baseOffset
print(N)
Here, 5**n means 5^n.
Let's try working this out for an example. Say count = 70:
Iteration 1: n = floor(log5(4×70+1)) = floor(log5(281)) = 3, baseSum = (5^3-1)/4 = 31, baseOffset = 125×(70 div 31) = 250, count = 70 mod 31 = 8, N = 250.
Iteration 2: n = floor(log5(33)) = 2, baseSum = 6, baseOffset = 25×(8 div 6) = 25, count = 8 mod 6 = 2, N = 275.
Iteration 3: n = floor(log5(9)) = 1, baseSum = 1, baseOffset = 5×(2 div 1) = 10, count = 0, N = 285.
So the smallest N whose factorial has 70 trailing zeros is 285 (check: 57+11+2 = 70).
Take another example. Say count=124, which is the one discussed at the beginning of this page:
Iteration 1: n = floor(log5(497)) = 3, baseSum = 31, baseOffset = 125×(124 div 31) = 500, count = 124 mod 31 = 0, N = 500.
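A runnable Python version of the whole thing (a sketch: I find n with an integer loop instead of floating-point log, which can round the wrong way at exact powers of 5, and verify with the trailing_zeros helper sketched earlier):

def smallest_n_with_trailing_zeros(count):
    """Greedily peel off the largest t_n = (5**n - 1) // 4 that fits in count;
    each unit of t_n consumed adds a step of 5**n to N."""
    N = 0
    while count:
        p = 5                                # find the largest 5**n with t_n <= count
        while (p * 5 - 1) // 4 <= count:
            p *= 5
        base_sum = (p - 1) // 4              # t_n
        N += p * (count // base_sum)
        count %= base_sum
    return N

print(smallest_n_with_trailing_zeros(70))   # 285
print(smallest_n_with_trailing_zeros(124))  # 500
# For a count no factorial attains (e.g. 5 or 11), the result will not check out,
# so verify with trailing_zeros(N) == count before trusting it.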

Luke reveals unknown term values for numeric fields in index

We use Lucene.net for indexing. One of the fields that we index is a numeric field with the values 1 to 6, and 9999 for "not set".
When using Luke to explore the index, we see terms that we do not recognize. The index contains a total of 38673 documents, and Luke shows the following top ranked terms for this field:
Term | Rank | Field | Text | Text (decoded as numeric-int)
1 | 38673 | Axis | x | 0
2 | 38673 | Axis | p | 0
3 | 38673 | Axis | t | 0
4 | 38673 | Axis | | | 0
5 | 19421 | Axis | l | 0
6 | 19421 | Axis | h | 0
7 | 19421 | Axis | d# | 0
8 | 19252 | Axis | ` N | 9999
9 | 19252 | Axis | l | 8192
10 | 19252 | Axis | h ' | 9984
11 | 19252 | Axis | d# p | 9984
12 | 18209 | Axis | ` | 4
13 | 950 | Axis | ` | 1
14 | 116 | Axis | ` | 5
15 | 102 | Axis | ` | 6
16 | 26 | Axis | ` | 3
17 | 18 | Axis | ` | 2
We find the same pattern for other numeric fields.
Where do the unknown values come from?
NumericFields are indexed using a trie structure. The terms you see are part of it, but will not return results if you query for them.
Try indexing your NumericField with a precision step of Int32.MaxValue and the values will go away.
NumericField documentation
... Within Lucene, each numeric value is indexed as a trie structure, where each term is logically assigned to larger and larger pre-defined brackets (which are simply lower-precision representations of the value). The step size between each successive bracket is called the precisionStep, measured in bits. Smaller precisionStep values result in larger number of brackets, which consumes more disk space in the index but may result in faster range search performance. The default value, 4, was selected for a reasonable tradeoff of disk space consumption versus performance. You can use the expert constructor NumericField(String,int,Field.Store,boolean) if you'd like to change the value. Note that you must also specify a congruent value when creating NumericRangeQuery or NumericRangeFilter. For low cardinality fields larger precision steps are good. If the cardinality is < 100, it is fair to use Integer.MAX_VALUE, which produces one term per value. ...
More details on the precision step available in the NumericRangeQuery documentation:
Good values for precisionStep are depending on usage and data type:
• The default for all data types is 4, which is used when no precisionStep is given.
• Ideal value in most cases for 64 bit data types (long, double) is 6 or 8.
• Ideal value in most cases for 32 bit data types (int, float) is 4.
• For low cardinality fields larger precision steps are good. If the cardinality is < 100, it is fair to use Integer.MAX_VALUE (see below).
• Steps ≥64 for long/double and ≥32 for int/float produces one token per value in the index and querying is as slow as a conventional TermRangeQuery. But it can be used to produce fields, that are solely used for sorting (in this case simply use Integer.MAX_VALUE as precisionStep). Using NumericFields for sorting is ideal, because building the field cache is much faster than with text-only numbers. These fields have one term per value and therefore also work with term enumeration for building distinct lists (e.g. facets / preselected values to search for). Sorting is also possible with range query optimized fields using one of the above precisionSteps.
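To connect this back to the terms in the question: here's a rough Python sketch of the bracketing idea (my own illustration, not Lucene's exact byte-level term encoding). With the default precisionStep of 4, each value is indexed once per precision level with progressively more low bits cleared; for 9999 this reproduces the decoded values 9984, 8192 and 0 that Luke shows:

def trie_terms(value, precision_step=4, bits=32):
    """One indexed term per precision level: the value with its lowest
    `shift` bits cleared, i.e. a lower-precision bracket of the value."""
    return [(shift, (value >> shift) << shift)
            for shift in range(0, bits, precision_step)]

print(trie_terms(9999)[:5])
# [(0, 9999), (4, 9984), (8, 9984), (12, 8192), (16, 0)]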
EDIT
A little sample: the index produced by the code below will show terms with values 8192, 9984, 1792, etc. in Luke, but querying a range that would include them doesn't produce results:
NumericField number = new NumericField("number", Field.Store.YES, true);
Field regular = new Field("normal", "", Field.Store.YES, Field.Index.ANALYZED);
IndexWriter iw = new IndexWriter(FSDirectory.GetDirectory("C:\\temp\\testnum"), new StandardAnalyzer(), true);
Document doc = new Document();
doc.Add(number);
doc.Add(regular);
number.SetIntValue(1);
regular.SetValue("one");
iw.AddDocument(doc);
number.SetIntValue(2);
regular.SetValue("one");
iw.AddDocument(doc);
number.SetIntValue(13);
regular.SetValue("one");
iw.AddDocument(doc);
number.SetIntValue(2000);
regular.SetValue("one");
iw.AddDocument(doc);
number.SetIntValue(9999);
regular.SetValue("one");
iw.AddDocument(doc);
iw.Commit();
IndexSearcher searcher = new IndexSearcher(iw.GetReader());
NumericRangeQuery rangeQ = NumericRangeQuery.NewIntRange("number", 1, 2, true, true);
var docs = searcher.Search(rangeQ);
Console.WriteLine(docs.Length().ToString()); // prints 2
rangeQ = NumericRangeQuery.NewIntRange("number", 13, 13, true, true);
docs = searcher.Search(rangeQ);
Console.WriteLine(docs.Length().ToString()); // prints 1
rangeQ = NumericRangeQuery.NewIntRange("number", 9000, 9998, true, true);
docs = searcher.Search(rangeQ);
Console.WriteLine(docs.Length().ToString()); // prints 0
Console.ReadLine();