Parsing HEVC stream for non-IDR frames

I parsed the HEVC stream with this code (after converting the bytes to a hex string):
string[] NALunit_string = Regex.Split(fsStringASHex, @"000001|00000001");
but after looking at the NAL unit types, I found some with reserved types. Is this normal?
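For reference, here is a minimal Python sketch of the same start-code split done on the raw bytes instead of a hex string (the file name is made up, and an Annex B elementary stream is assumed); the HEVC nal_unit_type sits in bits 1-6 of the first NAL unit header byte:

import re

# Hypothetical Annex B elementary stream; adjust the path to your own file.
with open("stream.hevc", "rb") as f:
    data = f.read()

# Split on 4-byte or 3-byte start codes (4-byte first, so it isn't treated as 0x00 plus a 3-byte code).
nal_units = [u for u in re.split(b"\x00\x00\x00\x01|\x00\x00\x01", data) if u]

for nal in nal_units:
    nal_unit_type = (nal[0] >> 1) & 0x3F  # bits 1-6 of the first NAL unit header byte
    print(nal_unit_type)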

Related

How to extract QP per macroblock or slice in HEVC?

How can I get the QP value per macroblock or slice from an encoded frame (encoded by an HEVC hardware encoder)? I have tried HEVC bitstream parsers such as hevcesbrowser (https://github.com/virinext/hevcesbrowser), but it doesn't give access to CTUs or parse the slice body.
You can decode the bitstream using an open-source decoder, then modify that decoder to dump the information you need during its parsing. I'd recommend the HEVC Test Model (HM).

Why is 32768 used as a constant to normalize the wav data in VGGish?

I'm trying to follow along with what the code is doing for VGGish and I came across a piece that I don't really understand. In vggish_input.py there is this:
def wavfile_to_examples(wav_file):
  """Convenience wrapper around waveform_to_examples() for a common WAV format.

  Args:
    wav_file: String path to a file, or a file-like object. The file
      is assumed to contain WAV audio data with signed 16-bit PCM samples.

  Returns:
    See waveform_to_examples.
  """
  wav_data, sr = wav_read(wav_file)
  assert wav_data.dtype == np.int16, 'Bad sample type: %r' % wav_data.dtype
  samples = wav_data / 32768.0  # Convert to [-1.0, +1.0]
  return waveform_to_examples(samples, sr)
Where does the constant 32768 come from, and how does dividing by it convert the data to samples?
I found this for converting to -1 and +1, but I'm not sure how to bridge that with 32768.
https://stats.stackexchange.com/questions/178626/how-to-normalize-data-between-1-and-1
32768 is 2^15. int16 has a range of -32768 to +32767. If you have int16 as input and divide it by 2^15, you get a number between -1 and +1.
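For illustration, a minimal NumPy sketch of that scaling (the sample values here are made up):

import numpy as np

wav_data = np.array([-32768, -16384, 0, 16384, 32767], dtype=np.int16)
samples = wav_data / 32768.0  # promotes to float64, values in [-1.0, +1.0)
print(samples)                # [-1.  -0.5  0.   0.5  0.99996948]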

TensorFlow Lite: Is tensor string buffer format ASCII or UTF-8?

Are the strings stored in a .tflite tensor buffer in ASCII or UTF-8 format?
The few TensorFlow Lite ops that deal with strings can handle UTF-8 and adhere to the format described in https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/string_util.h#L17
That is:
4 bytes specifying the number of strings in the tensor
a section describing the length and location (offset) of each string inside the buffer, and the length of the buffer itself.
a section containing the actual strings.
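As a rough Python sketch of one plausible reading of that layout (it assumes little-endian int32 fields and offsets measured from the start of the buffer; string_util.h remains the authoritative reference, and the function names here are my own):

import struct

def pack_tflite_strings(strings):
    """Pack strings as: a 4-byte count, one 4-byte offset per string plus one
    final offset equal to the total buffer length, then the raw UTF-8 bytes."""
    payloads = [s.encode("utf-8") for s in strings]
    n = len(payloads)
    offsets, pos = [], 4 + 4 * (n + 1)  # string data starts right after count + offsets
    for p in payloads:
        offsets.append(pos)
        pos += len(p)
    offsets.append(pos)  # the last offset doubles as the buffer length
    return struct.pack("<%di" % (n + 2), n, *offsets) + b"".join(payloads)

def unpack_tflite_strings(buf):
    """Inverse of pack_tflite_strings: recover the list of strings."""
    (n,) = struct.unpack_from("<i", buf, 0)
    offsets = struct.unpack_from("<%di" % (n + 1), buf, 4)
    return [buf[offsets[i]:offsets[i + 1]].decode("utf-8") for i in range(n)]

print(unpack_tflite_strings(pack_tflite_strings(["hi", "héllo"])))  # ['hi', 'héllo']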

PDF stream encoded using FlateDecode with predictor and not enough data

Is it allowed that a stream that is encoded using FlateDecode with a PNG predictor has a last predictor row that doesn't have the same column width as all the other rows, i.e. that is missing some data?
Imagine, for example, a stream that has already been decoded using the Flate algorithm, resulting in 105 bytes, and a predictor with the parameters <</Predictor 15 /Columns 10>>.
Since the stream has 105 bytes, the predictor can decode 10 full rows containing 10 columns each, and one row with only 5 columns, i.e. data for 5 columns is missing. Should the last row be decoded as a row with only 5 columns, or should the last 5 bytes be discarded, or is the stream as a whole just invalid?
I didn't find anything in the PDF specification but I came across two PDF files in the wild that have such streams.
It is up to you to decide how to deal with invalid streams; the PDF specification does not say how to handle invalid data.
For example, we take all the data that can be decoded and pad the rest with 0.
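As a rough Python sketch of that strategy, assuming 8-bit samples with one component per column (so each encoded row is one PNG filter-type byte followed by Columns data bytes) and zero-padding a truncated final row; the function name is my own:

import zlib

def png_unpredict(data, columns, pad_short_row=True):
    """Undo a per-row PNG predictor (Predictor >= 10) on Flate-decoded data,
    zero-padding a short last row instead of rejecting the stream."""
    row_len = columns + 1        # filter-type byte + column bytes
    prev = bytearray(columns)    # the row above the first row is all zeros
    out = bytearray()
    for start in range(0, len(data), row_len):
        row = bytearray(data[start:start + row_len])
        if len(row) < row_len:
            if not pad_short_row:
                raise ValueError("truncated predictor row")
            row.extend(b"\x00" * (row_len - len(row)))
        ftype, cur = row[0], row[1:]
        for i in range(columns):
            a = cur[i - 1] if i > 0 else 0   # left neighbour
            b = prev[i]                      # byte above
            c = prev[i - 1] if i > 0 else 0  # upper-left neighbour
            if ftype == 1:                   # Sub
                cur[i] = (cur[i] + a) & 0xFF
            elif ftype == 2:                 # Up
                cur[i] = (cur[i] + b) & 0xFF
            elif ftype == 3:                 # Average
                cur[i] = (cur[i] + (a + b) // 2) & 0xFF
            elif ftype == 4:                 # Paeth
                p = a + b - c
                pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
                pred = a if pa <= pb and pa <= pc else (b if pb <= pc else c)
                cur[i] = (cur[i] + pred) & 0xFF
        out.extend(cur)
        prev = cur
    return bytes(out)

# decoded = zlib.decompress(stream_bytes)    # the FlateDecode step
# rows = png_unpredict(decoded, columns=10)  # then undo the predictor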

Read in 4-byte words from binary file in Julia

I have a simple binary file that contains 32-bit floats adjacent to each other.
Using Julia, I would like to read each number (i.e. each 32-bit word) and put them sequentially into an array of Float32 values.
I've tried a few different things from looking at the documentation, but all have yielded impossible values (I am using a binary file with known values as dummy input). It appears that:
Julia is reading the binary file one byte at a time.
Julia is putting each byte into a Uint8 array.
For example, readbytes(f, 4) gives a 4-element array of unsigned 8-bit integers. read(f, Float32, DIM) also gives strange values.
Anyone have any idea how I should proceed?
I'm not sure of the best way of reading it in as Float32 directly, but given an array of 4*n Uint8s, I'd turn it into an array of n Float32s using reinterpret:
raw = rand(Uint8, 4*10) # i.e. a vector of Uint8 aka bytes
floats = reinterpret(Float32, raw) # now a vector of 10 Float32s
With output:
julia> raw = rand(Uint8, 4*2)
8-element Array{Uint8,1}:
0xc8
0xa3
0xac
0x12
0xcd
0xa2
0xd3
0x51
julia> floats = reinterpret(Float32, raw)
2-element Array{Float32,1}:
1.08951e-27
1.13621e11
(EDIT 2020: Outdated, see newest answer.) I found the issue. The correct way of importing binary data in single precision floating point format is read(f, Float32, NUM_VALS), where f is the file stream, Float32 is the data type, and NUM_VALS is the number of words (values or data points) in the binary data file.
It turns out that every time you call read(f, [...]) the data pointer iterates to the next item in the binary file.
This allows the data to be read in one value at a time:
f = open("my_file.bin")
first_item = read(f, Float32)
second_item = read(f, Float32)
# etc ...
However, I wanted to load in all the data in one line of code. As I was debugging, I had used read() on the same file pointer several times without re-declaring the file pointer. As a result, when I experimented with the correct operation, namely read(f, Float32, NUM_VALS), I got an unexpected value.
The Julia language has changed a lot in the five years since. read() no longer has an API for specifying the type and length simultaneously, and reinterpret() creates a view of a binary array rather than an array of the desired type. It seems that the best way to do this now is to pre-allocate the desired array and fill it with read!:
io = open("my_file.bin")              # any readable IO stream works here
data = Array{Float32, 1}(undef, 128)  # pre-allocate room for 128 Float32 values
read!(io, data)
This fills data with the desired float values.