extracting motion-compensated frames during HEVC encoding

I am trying to analyze H.265 coding performance. Is there a way to export the predicted frames for H.265/HEVC encoding? Specifically, how should I obtain reconstructed frames after compensating with the motion vectors, but before applying the residual? Is there a way to do this with ffmpeg, or any other codec analysis tool?

Yes, you can do this with the HM decoder.
Find the line in TDecCu.cpp where the two pointers piResi and piPred are accessed and added together to reconstruct the block. At that point you can print piPred on its own, which gives you the motion-compensated prediction before the residual is applied.

Related

Converting an executable file into an analog waveform signal

I have been trying to convert a binary file (.exe) into a waveform so that I can listen to the resulting audio. I have been looking for software or open-source code to help with this, without success.
My ultimate goal is to represent the .exe file as a spectrogram to analyse the behaviour of the frequencies in the executable file. My understanding is that I have to identify the range of frequencies first, which could be done by plotting the waveform.
Any reference would be appreciated.
Edit:
I have a collection of binary files and I need to classify them according to the statistical features of their sound (frequency behaviour). My plan was to get the waveform of the actual binary file (by treating each byte of the file as one signed 8-bit sample), then convert the waveform into a spectrogram image and apply the kind of deep learning analysis used for voice recognition.
So the depth of each sample will be 8 bits, and the sampling rate will be either 8 kHz or 16 kHz. But I am confused about how to determine the frequencies present in the executable file.
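
The plan described above (one byte per 8-bit sample at a chosen sampling rate) can be sketched with Python's standard wave module. This is a minimal sketch, not a recommended feature pipeline: the function names bytes_to_wav and max_frequency_hz are hypothetical, and the default of 8000 Hz is taken from the question. Two facts shape it: 8-bit WAV samples are unsigned, so signed bytes are shifted by 128, and the sampling rate fixes the frequency range, since a spectrogram can only show frequencies from 0 up to half the sampling rate.

```python
import io
import wave


def bytes_to_wav(data: bytes, sample_rate: int = 8000, signed: bool = True) -> bytes:
    """Wrap raw bytes as a mono 8-bit PCM WAV file and return the file contents."""
    if signed:
        # 8-bit WAV is unsigned (128 = silence). Reading each byte as a signed
        # two's-complement sample and adding 128 is the same as flipping the top bit.
        data = bytes(b ^ 0x80 for b in data)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(1)            # 1 byte per sample
        wav.setframerate(sample_rate)
        wav.writeframes(data)
    return buf.getvalue()


def max_frequency_hz(sample_rate: int) -> float:
    """Highest frequency representable at this sampling rate (Nyquist limit)."""
    return sample_rate / 2
```

To use it on a real file, read the file with open(path, "rb").read() and write the returned bytes to a .wav file. The frequencies are not a property of the executable itself; they follow from the sampling rate you pick, so the same file played at 16 kHz has every spectral feature at twice the frequency it has at 8 kHz.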

ImageDeserializer mean file with variable sized images?

When you have variable-sized input images and use the ImageDeserializer to resize them, how are you supposed to deal with a mean file? Computing the mean file is easy when the input images are all the same size. Wouldn't it be better if the ImageDeserializer could compute the mean itself?

By default, the image pre-processing steps are executed in this order:
Take a crop that has the desired image size
Subtract the mean
Hence, your mean file only has to contain a mean for the desired input size.
For computing the mean yourself, you will have to repeat these steps, at least to some degree. If you're on .NET, then you may want to have a look at this post where .NET image pre-processing is discussed.
I agree that it would be helpful if there were a tool to compute the mean file. I can understand, though, why the image deserializer does not do it automatically: you need to transform your training and test data with the same mean file. If the mean were subtracted from the training data automatically, you would have no way of repeating the same operation on the test data. On top of that, randomization could make it messy in some cases.
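
The crop-then-subtract order described above can be sketched in plain Python. This is a sketch under stated assumptions, not the deserializer's actual behaviour: images are grayscale 2-D lists at least as large as the target size, the crop is a centre crop, and the function names center_crop, mean_image and subtract_mean are hypothetical. It also does not write the mean file format the reader expects; it only shows that the mean needs to exist at the cropped size.

```python
def center_crop(img, size):
    """Return the centred size x size crop of a 2-D list of pixel values."""
    top = (len(img) - size) // 2
    left = (len(img[0]) - size) // 2
    return [row[left:left + size] for row in img[top:top + size]]


def mean_image(images, size):
    """Per-pixel mean over the centre crops of all images."""
    crops = [center_crop(img, size) for img in images]
    n = len(crops)
    return [[sum(c[y][x] for c in crops) / n for x in range(size)]
            for y in range(size)]


def subtract_mean(img, mean, size):
    """Crop one image, then subtract the precomputed mean from it."""
    crop = center_crop(img, size)
    return [[crop[y][x] - mean[y][x] for x in range(size)]
            for y in range(size)]
```

The mean is computed once from the training images and then passed unchanged to subtract_mean for both training and test images, which is the reason given above for keeping it in a separate file.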

Parsing HEVC for Motion Information

I parsed the HEVC stream by simply identifying the start codes (0x000001 or 0x00000001), and now I am looking for the motion information in the NAL payload. My goal is to calculate the percentage of the stream that is motion information. Any ideas?

Your best bet is to start with the HM reference software (get it here: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/trunk/) and add some debug output as the different kinds of data are read from the bitstream. This is likely much easier than writing a bitstream decoder from scratch.
Check out the debug options that are already built into the software, for example RExt__DECODER_DEBUG_BIT_STATISTICS or DEBUG_CABAC_BINS. These may already do what you want; if not, they will be pretty close. I think information about bit usage can best be collected in source/Lib/TLibDecoder/TDecBinCoderCABAC.cpp during decoding.
If you need to speed this up, you can of course skip the actual decode steps :)

On the decoder side, the motion vector information is present in the bitstream as motion vector differences (MVDs), so you have to run the decoding process to recover the motion information. This requires an understanding of the inter prediction process in HEVC.
Thank you!
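
For reference, the start-code splitting that the question describes can be sketched as below. It is only a sketch of the first step and does not measure motion bits: motion data is entropy-coded inside the slice data, so start codes alone can only tell you how the bytes divide between NAL unit types. The function names split_nal_units, nal_unit_type and vcl_byte_share are hypothetical; the header layout (a 6-bit type after one forbidden bit) and the rule that types 0 to 31 are VCL units come from the HEVC specification.

```python
def split_nal_units(stream: bytes):
    """Return the NAL units of an Annex B byte stream, start codes removed."""
    starts = []
    i = stream.find(b"\x00\x00\x01")
    while i != -1:
        starts.append(i + 3)                      # first byte after the start code
        i = stream.find(b"\x00\x00\x01", i + 3)
    units = []
    for n, begin in enumerate(starts):
        end = starts[n + 1] - 3 if n + 1 < len(starts) else len(stream)
        # Trailing zeros belong to the next (4-byte) start code or to padding;
        # a NAL unit itself never ends in 0x00.
        nal = stream[begin:end].rstrip(b"\x00")
        if nal:
            units.append(nal)
    return units


def nal_unit_type(nal: bytes) -> int:
    """6-bit nal_unit_type from the first byte of the 2-byte NAL header."""
    return (nal[0] >> 1) & 0x3F


def vcl_byte_share(stream: bytes) -> float:
    """Fraction of NAL bytes carried by VCL (coded slice) units, types 0..31."""
    units = split_nal_units(stream)
    total = sum(len(u) for u in units)
    vcl = sum(len(u) for u in units if nal_unit_type(u) < 32)
    return vcl / total if total else 0.0
```

The share returned here is an upper bound on the motion share, because the slice units also contain residual coefficients, prediction modes and headers. Splitting those apart requires CABAC decoding, which is why the answers point to HM.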

create a geotiff file from undocumented tif image

I have an undocumented TIFF image which I need to use with software that can read only GeoTIFF files. My simplest idea was to pretend the image is at 0N, 0W with a pixel size of 0.00000899928° (1 m) in both directions.
I have read the thread here but I was unable to reproduce the answer.
Thanks for helping. I am a dummy in geodesy, GIS and the like.

You are attempting to georeference a raster, which is often a difficult task, with multiple techniques. It's not possible to provide an answer for your question given the information you have supplied. Also, never assume that lengths in degrees can be converted to lengths in metres (the Earth isn't flat).
Search around GIS.SE for ideas, e.g. using the [georeferencing] tag. There are tools available in QGIS to help you manually georeference rasters against other geospatial data.
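
To see why a length in degrees does not map to a fixed length in metres, here is a small worked calculation. It is a rough sketch that assumes a spherical Earth with the mean radius 6371008.8 m; the function name metres_per_degree is hypothetical, and a real workflow would use a proper projection instead.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; spherical approximation


def metres_per_degree(lat_deg: float):
    """Return (metres per degree of latitude, metres per degree of longitude)."""
    per_deg_lat = math.pi * EARTH_RADIUS_M / 180.0
    # Circles of latitude shrink towards the poles by a factor cos(latitude).
    per_deg_lon = per_deg_lat * math.cos(math.radians(lat_deg))
    return per_deg_lat, per_deg_lon
```

At the equator both values are close to 111 km, so the pixel size from the question comes out at about 1 m there. At 60° latitude a degree of longitude is only half as long, so the same pixel would be about 1 m tall but only about 0.5 m wide.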

Good library for Digital watermarking

Can somebody help me find a library, or a detailed description of an algorithm, that can embed a digital watermark (an invisible watermark, a kind of steganography) in a JPEG/PNG file? The quality of the algorithm should be high: it should be possible to extract the mark after rotation and, if possible, expansion of the image.
The mark is just a 32-byte key.
I found a good site, but its algorithms are made for the NetPBM format, which is dead...
I know about the LSB method, but it is not robust to expansion. Is there something better?
Changing the metadata is not suitable, because that is a visible change.

This may not really be an answer, as I don't think there is an easy, precise answer to this question. Watermarking is complex, and the best way to do it is to implement it yourself: this makes things harder for an attacker trying to reverse engineer your code. Someone could even read your question here, guess which library you used, and attack your system more easily.
Making steganography resistant to expansion in JPEG images is also very hard, because JPEG compression is reapplied after the expansion. There are in fact many JPEG steganography algorithms. Which one you should use depends on what exactly you require:
Data confidentiality ?
Message presence confidentiality ?
Message coherence after JPEG changes ?
Resistance to "Known Cover" attacks (when attackers try to find the message, based on the steganographic system) ?
Resistance to "Known Message" attacks (when attackers try to find the steganographic system used, based on the message) ?
From what I know, algorithms that survive JPEG changes (picture recompression) are often much easier to attack, whereas algorithms that run the "encode" stage during JPEG compression (after the lossy DCT and quantization step, and before the lossless Huffman coding) tend to be harder to attack.
Another key factor in steganography is scale: if you have only 32 bytes of data to encode in, say, a 256*256 px image, don't use an algorithm that can encode 512 bytes of data in the same size. Either use a scalable algorithm, or use an algorithm at the scale where it is efficient.
Also, the best way to do good steganography is to know its limitations and to know how steganalyzers work. Try these tools, so you can understand what attackers will do to your picture.
Now, I cannot tell you which steganographic system will be best for you, but I can give you some pointers:
jSteg - Quite old, I don't think it will survive JPEG changes
OutGuess - Quite old too, but one of the best algorithms
F5 (and F3/F4) - More recent, good algorithm, backed by scientific research
stegHide
I think all of these are LSB based: the encoding is done during JPEG compression, after the DCT and quantization. The only non-LSB-based steganography system I have heard of was mentioned in this research paper; however, I have not read it to the end yet, so I cannot tell whether it will meet your needs.
However, I'm not sure there is a real steganography algorithm that survives JPEG compression, resizing and rotation while also resisting visual and statistical attacks. If there is, I'm not aware of it.
Sorry for the lack of a precise answer; I tried to give you what I know on the subject, as it's always better to be more informed. Sorry also for the lack of proper English, I'm French, nobody's perfect :)

Pistache is right in what he told you about watermarking algorithms. I will try to help by describing one algorithm for the given requirements.
Before explaining the algorithm, I think the distinction between the JPEG and PNG formats should be made.
JPEG is a lossy format, i.e. images are subject to compression that could remove the watermark. When you open an image for manipulation and then save it, the write step recompresses it: the DCT coefficients are quantized, which removes some important components of the image.
PNG, on the other hand, is lossless, which means that images are not subject to this kind of compression when stored after manipulation.
In fact, JPEG compression is used as an attack on watermarking schemes, because an attacker who recompresses the image may remove the watermark.
Now that you know the difference between the two formats, I can suggest an algorithm suited to the attacks that you mentioned.
To embed a watermark message in PNG files you can use the histogram embedding method. This method changes the histogram by moving pixels between neighbouring bins. For example, imagine that you have a grayscale PNG image.
You then have only one channel for embedding, which means one histogram with 256 bins. Select the neighbouring bins x and x+1 and change their counts by moving pixels with brightness x to x+1 or the other way around, so that h(x)/h(x+1) > T for embedding a '1' or h(x+1)/h(x) > T for embedding a '0', where h(x) is the number of pixels in bin x and T is a chosen threshold.
You can repeat the same procedure along the whole histogram, so in the best case you can embed up to 128 bits (one bit per pair of bins). However, this payload is less than what you asked for. I therefore suggest splitting the image into parts, for example blocks: if you split one image into 4 components you would be able to embed, in the best case, up to 512 bits, which means 64 bytes.
This method is, however, very susceptible to filtering and compression if applied directly in the spatial domain. I therefore suggest that you first compute the DWT of the image and use its low-frequency sub-band. This will give you better transparency and increased robustness against warping, resizing and similar attacks, as well as against compression and filtering.
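
The bin-pair rule can be sketched in a few lines of plain Python. This is a minimal sketch of the spatial-domain version only, without the DWT step: pixels is a flat list of grayscale values, the threshold default of 2.0 is an arbitrary assumption, and the function names embed_bit and extract_bit are hypothetical. It also assumes the chosen pair of bins is not empty, otherwise there is nothing to move.

```python
def embed_bit(pixels, x, bit, threshold=2.0):
    """Return a copy of pixels in which bins x and x+1 encode one bit.

    bit 1: count(x)   > threshold * count(x+1)
    bit 0: count(x+1) > threshold * count(x)
    """
    out = list(pixels)
    # Pixels are moved from the bin that must shrink to the bin that must grow.
    src, dst = (x + 1, x) if bit else (x, x + 1)
    n_src = out.count(src)
    n_dst = out.count(dst)
    for i, value in enumerate(out):
        if n_dst > threshold * n_src:
            break                      # ratio condition already satisfied
        if value == src:
            out[i] = dst               # brightness changes by exactly one level
            n_src -= 1
            n_dst += 1
    return out


def extract_bit(pixels, x):
    """Read the bit back by comparing the two bin counts."""
    return 1 if pixels.count(x) > pixels.count(x + 1) else 0
```

Each moved pixel changes by a single gray level, which is what keeps the mark invisible. Extraction only compares the two counts, so it does not need the threshold; the threshold is the safety margin that lets the bit survive small changes to the histogram.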
There are other approaches, such as LPM (Log Polar Maps), but they are very complex to implement, and I think this approach would be fine for your case.
I can suggest two papers. The first is:
Watermarking digital image and video data. A state-of-the-art overview
This paper will give you some basic notions of watermarking and explains the LSB algorithm in more detail. The second paper is:
Real-Time Compressed-Domain Video Watermarking Resistance to Geometric Distortions
This paper describes the algorithm that I have just explained.
Cheers,

I do not know whether you are considering approaches other than steganography. Instead of hiding the data in the pixel data, you could create a new data block in the JPEG file and store encrypted data in it.
Take a look at the JPEG file structure on Wikipedia.
You can create an application-specific data block using the marker 0xFF 0xEn. That way, changes to the image pixels do not change the information stored in the file. Moreover, many image editing programs respect custom data blocks and keep them even after image manipulation.
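
Adding such a block can be sketched as follows. This is a minimal sketch with some assumptions: the function name add_app_segment and the default n = 11 (marker 0xFF 0xEB) are hypothetical choices, the new segment is placed after any APPn segments that already follow the start-of-image marker, and encrypting the payload is left out. The segment layout itself (marker, then a 2-byte big-endian length that counts the length field plus the payload) is the standard JPEG one.

```python
import struct


def add_app_segment(jpeg: bytes, payload: bytes, n: int = 11) -> bytes:
    """Return a copy of the JPEG data with an extra APPn segment holding payload."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    if not 0 <= n <= 15:
        raise ValueError("n must be in 0..15")
    if len(payload) > 65533:
        raise ValueError("payload too long for a single segment")
    # Skip APPn segments already present (e.g. JFIF or Exif) so they stay first.
    pos = 2
    while (pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF
           and 0xE0 <= jpeg[pos + 1] <= 0xEF):
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        pos += 2 + length
    # The length field counts its own two bytes plus the payload.
    segment = bytes([0xFF, 0xE0 + n]) + struct.pack(">H", len(payload) + 2) + payload
    return jpeg[:pos] + segment + jpeg[pos:]
```

The pixel data is untouched, so the image decodes exactly as before. The trade-off against true watermarking is that the block is easy to find and strip, and it is lost if a tool rewrites the file without preserving unknown segments.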
