WebRTC's Statistics API - estimatedPlayoutTimestamp giving incorrect values

I'm currently investigating the WebRTC Statistics API, specifically the identifiers lastPacketReceivedTimestamp and estimatedPlayoutTimestamp. My aim is to determine when exactly the WebRTC API receives an RTP packet of video data and when exactly that packet is used to render a frame of video.
I can convert the lastPacketReceivedTimestamp values from High Resolution Time to a human-readable format, but I am struggling to do the same with the estimatedPlayoutTimestamp values.
Example outputs for lastPacketReceivedTimestamp are 1648396983645, 1648396984656, 1648396985657, 1648396986656 - these convert correctly on https://www.epochconverter.com/.
Example outputs for estimatedPlayoutTimestamp are 3857385783571, 3857385784570, 3857385785580, 3857385786570 - these do not convert correctly, instead coming out many years in the future.
Am I misunderstanding what the values of estimatedPlayoutTimestamp are? I thought they would just be the timestamp of when each packet is used in a render, but this does not appear to be the case. How should I go about finding when exactly each packet is used to render a frame of WebRTC video?
Thanks in advance!

estimatedPlayoutTimestamp is defined as an NTP timestamp in the specification. NTP timestamps count from January 1st 1900, so the value appears roughly 70 years in the future if you interpret it as being based on the Unix epoch.
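For reference: the two epochs differ by 2,208,988,800 seconds, so subtracting that offset (in milliseconds) from estimatedPlayoutTimestamp lines it up with the Unix epoch. A minimal sketch, assuming a recent browser that reports the stat on the inbound-rtp entry (pc is your RTCPeerConnection; the name is assumed, and the loop runs inside an async function):

// NTP epoch (1900-01-01) leads the Unix epoch (1970-01-01) by 2,208,988,800 s.
const NTP_TO_UNIX_EPOCH_MS = 2_208_988_800_000;

const stats = await pc.getStats();
stats.forEach((report) => {
  if (report.type === "inbound-rtp" && report.kind === "video") {
    // Both stats are millisecond timestamps; only the epoch differs.
    const playoutUnixMs = report.estimatedPlayoutTimestamp - NTP_TO_UNIX_EPOCH_MS;
    console.log("estimated playout:", new Date(playoutUnixMs).toISOString());
    console.log("last packet received:", new Date(report.lastPacketReceivedTimestamp).toISOString());
  }
});

Checking against your first sample: 3857385783571 - 2208988800000 = 1648396983571, which lands right next to your lastPacketReceivedTimestamp values.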
Note that chrome://webrtc-internals currently does the calculation wrong as well...

Related

React Native, iBeacons, RSSI value to distance conversion

How to stabilize the RSSI (Received Signal Strength Indicator) of low energy Bluetooth beacons (BLE) for more accurate distance calculation?
We are trying to develop an indoor navigation system and came across the problem that the RSSI fluctuates so much that the distance estimation is nowhere near the correct value. We tried using an advanced averaging calculator, but to no avail.
The device is constantly receiving RSSI values. How do I filter them and compute a meaningful average? I am completely lost; please help.
Can anyone suggest an npm library or point me in the right direction? I have been searching for many days but have not gotten anywhere.
Front end: React Native. Back end: Node.js.
In addition to the answer from @davidgyoung, we would like to point out that any filtering method is a compromise between the quality of the noise reduction and the time lag introduced by the filtering (depending on the characteristic filtering time your method uses). As @davidgyoung pointed out, if you take a characteristic filtering period T, you will get an average time lag of about T/2.
Thus, I think the best approach to solve your problem is not to try to find the best filtering method but to make changes on the transmitter’s end itself.
First, you can increase the number of signals the transmitter sends per second (most modern beacons allow this through the manufacturer's applications and APIs).
Secondly, you can increase the beacon's power (also usually one of the beacon's settings), which generally improves the signal-to-noise ratio.
Finally, you can compare beacons from different vendors. At Navigine we experimented with and tested lots of beacons from multiple manufacturers, and it appears that the signal-to-noise ratio can vary significantly among them. For our part, we recommend taking a look at kontakt.io beacons (https://kontakt.io/) as one of the recognized leaders with 5+ years of experience in the area.
It is unlikely that you will find a pre-built package that does what you want, as your needs are pretty specific. You will most likely have to write your own filtering code.
A key challenge is deciding the parameters of your filtering, as indoor nav use cases are often hurt by time lag. If you average RSSI over 30 seconds, for example, the output of your filter effectively gives you the RSSI of where a moving object was, on average, 15 seconds ago. That may be inappropriate for your use case if you are dealing with moving objects. Reducing the averaging interval to 5 seconds helps, but still introduces time lag while reducing the smoothing of noise. A filter called an auto-regressive moving average (ARMA) filter might be a good choice, but I only have an implementation in Java, so you would need to translate it to JavaScript (a simple variant is sketched below).
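As a starting point, here is a minimal TypeScript sketch of the simplest member of that family, a single-pole exponential smoother, together with the standard log-distance conversion from RSSI to metres. All constants (alpha, txPower, n) are illustrative assumptions, not tuned values, and must be calibrated per beacon and environment:

// Exponentially weighted moving average: a simple auto-regressive smoother.
// Smaller alpha = smoother output but more time lag (see the T/2 note above).
class RssiFilter {
  private smoothed: number | null = null;
  constructor(private alpha = 0.1) {}

  update(rssi: number): number {
    this.smoothed = this.smoothed === null
      ? rssi
      : this.smoothed + this.alpha * (rssi - this.smoothed);
    return this.smoothed;
  }
}

// Log-distance path-loss model: d = 10 ^ ((txPower - rssi) / (10 * n)).
// txPower is the calibrated RSSI at 1 m (often near -59 dBm for iBeacons);
// the path-loss exponent n is ~2 in free space, roughly 2.7-4 indoors.
function estimateDistanceMeters(rssi: number, txPower = -59, n = 2.5): number {
  return Math.pow(10, (txPower - rssi) / (10 * n));
}

// Usage: feed raw readings in arrival order, then convert the smoothed value.
const filter = new RssiFilter(0.1);
for (const raw of [-68, -75, -63, -71, -70]) {
  console.log(estimateDistanceMeters(filter.update(raw)).toFixed(2), "m");
}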
Finally, do not expect a filter to solve all your problems. Even if you smooth out the noise on the RSSI, you may find that the distance estimates are still not accurate enough for your use case. Make sure you understand the limits of what is possible with this technology. I wrote a deep dive on this topic here.

QPSK works in simulation but not with SDR

I'm going to start off by saying that I'm very new to SDR and GNU Radio. This may be a dumb question, but I have been googling and testing things for about two months now trying to get this to work without success. Any help or pointers would be appreciated!
I'm attempting to use GNU Radio 3.8 to transfer a file using differential QPSK. I've tried to follow the tutorials on the wiki as well as several similar academic papers I found on the internet (which also seem to be based on the wiki tutorial). None of them worked on its own, but by combining what actually works from each one, I managed to create a flowgraph sans hardware that does indeed send and receive the data from a file. Here's the Flowgraph and here is a screenshot of the results. The results show the four constellation points, and the data from the file source matches up perfectly with the data having gone through the entire transmit+receive chain. In the simulation I have a throttle block and a channel model block where the LimeSDR Source and LimeSDR Sink blocks would be. So far so good (at least as far as I can tell).
When I actually start transmitting this signal with the SDR, the received data no longer matches what is transmitted. Here's the flowgraph I've been using for the transmission. I added a protocol formatter and some FEC blocks that I could have removed for this illustration, but the point is that simply comparing the bits going into the modulator with the bits being recovered, the two do not match up. The constellation looks good (as far as I can tell) but the bits are all wrong. Here's a screenshot showing the bits being transmitted. You'll notice in the screenshot of the transmitted signal that the signal has a repeating series of three flat-top "1"s surrounded on both sides by a period of "0"s (at times 1.5 ms and 3.5 ms). This is a screenshot of the received bits. At times 1 ms and 3 ms you can see that it has significantly more transitions between 1 and 0 than it should.
So at this point I'm stumped. The simulation works but the real-world test does not. I've messed around with the RRC filter properties a significant amount. I have no clue if the values I have chosen are correct, as I have not found a tutorial or explanation of how to choose them. I just looked at some of the example flowgraphs, made some guesses as to how those values were derived, and applied those guesses to my use case. It worked well in the simulation, so I thought it would be fine in the real-world test. I've tried a variety of samples-per-symbol values, but my goal is a 4800 bit per second transfer speed, and using different samples per symbol didn't help anyway. What should I change in order to get this to work?
Bonus question: The constellation object has QPSK and DQPSK, and the constellation modulator has a differential checkbox. What is the best practice combination of selections to get a differential QPSK modulation?

How do I simulate GPS satellite data?

I am trying to understand the algorithms used in GIS/GPS, and writing a program for that. However, the issue is with test data, which needs to simulate the reception from satellites on any random day.
The test data should be:
- a continuous stream from several sources,
- with the number of sources changing over time (more satellites means more accuracy in the calculation),
- broadcast with unique timestamps, but not necessarily received in timestamp order.
How should I go about it? Is there a software pattern for this? Any toolkits that do this sort of live "broadcasting"?
[PS: I would be using either gcc or go, so something specific to those will be more useful]
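To make the requirements concrete, something like the following is the shape I have in mind: independent per-satellite generators stamping each message at broadcast time, with random delivery jitter so that arrival order differs from timestamp order. TypeScript purely for illustration (the same producer/consumer pattern ports directly to Go channels), and all names are invented:

// One message per visible satellite per tick; jittered delivery means
// reception order != broadcast-timestamp order, as required above.
interface SatMessage {
  satelliteId: number;
  broadcastTime: number; // unique per-message timestamp
  payload: string;       // stand-in for ephemeris/pseudorange data
}

function simulate(onReceive: (m: SatMessage) => void): void {
  let visible = 4; // varies over time: more satellites, better accuracy
  setInterval(() => {
    // Random walk over the number of visible satellites, clamped to [3, 10].
    if (Math.random() < 0.2) visible += Math.random() < 0.5 ? -1 : 1;
    visible = Math.max(3, Math.min(10, visible));
    const now = Date.now();
    for (let id = 0; id < visible; id++) {
      const msg: SatMessage = {
        satelliteId: id,
        broadcastTime: now + id / 1000, // keep timestamps unique
        payload: `sat-${id}`,
      };
      // Delivery jitter of up to 300 ms scrambles the arrival order.
      setTimeout(() => onReceive(msg), Math.random() * 300);
    }
  }, 1000);
}

simulate((m) => console.log(m.satelliteId, m.broadcastTime));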

How to detect silence and cut mp3 file without re-encoding using NAudio and .NET

I've been looking for an answer everywhere and I was only able to find some bits and pieces. What I want to do is to load multiple mp3 files (kind of temporarily merge them) and then cut them into pieces using silence detection.
My understanding is that I can use Mp3FileReader for this, but the questions are:
1. How do I read, say, 20 seconds of audio from an mp3 file? Do I need to read 20 × reader.WaveFormat.AverageBytesPerSecond bytes? Or maybe keep reading frames until the sum of Mp3Frame.SampleCount / Mp3Frame.SampleRate exceeds 20 seconds?
2. How do I actually detect the silence? I would look at an appropriate number of consecutive samples to check whether they are all below some threshold. But how do I access the samples regardless of whether they are 8- or 16-bit, mono or stereo, etc.? Can I directly decode an MP3 frame?
3. After I have detected silence at, say, sample 10465, how do I map that back to the mp3 frame index to perform the cut without re-encoding?
Here's the approach I'd recommend (which does involve re-encoding):
1. Use AudioFileReader to get your MP3 as floating-point samples directly in the Read method.
2. Find an open-source noise gate algorithm, port it to C#, and use that to detect silence (i.e. when the noise gate is closed, you have silence; you'll want to tweak the threshold and attack/release times).
3. Create a derived ISampleProvider that uses the noise gate and, in its Read method, does not return samples that are in silence.
4. Either pass the output into WaveFileWriter to create a WAV file and then encode the WAV file to MP3, or use NAudio.Lame to encode directly without a WAV step (you'll probably need to go from the sample provider back down to a 16-bit wave provider first).
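The noise-gate logic in step 2 is language-agnostic, so here is a simplified sketch of the core idea (TypeScript for brevity; in C# you would run the same check over the floats AudioFileReader returns). It replaces proper attack/release smoothing with a plain hold window, and the threshold and window length are illustrative, not tuned values:

// A run of samples counts as "silence" once every sample stays under the
// threshold for at least minSilenceMs. Samples are floats in [-1, 1],
// interleaved across channels, as a float-sample reader hands them to you.
function findSilenceStarts(
  samples: Float32Array,
  sampleRate: number,
  channels: number,
  threshold = 0.02,   // roughly -34 dBFS; tune per material
  minSilenceMs = 300,
): number[] {
  const minRun = (sampleRate * channels * minSilenceMs) / 1000;
  const starts: number[] = [];
  let runStart = -1;
  for (let i = 0; i < samples.length; i++) {
    if (Math.abs(samples[i]) < threshold) {
      if (runStart < 0) runStart = i;
    } else {
      if (runStart >= 0 && i - runStart >= minRun) {
        starts.push(Math.floor(runStart / channels)); // per-channel index
      }
      runStart = -1;
    }
  }
  if (runStart >= 0 && samples.length - runStart >= minRun) {
    starts.push(Math.floor(runStart / channels));
  }
  return starts;
}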
BEFORE READING BELOW: Mark's answer is far easier to implement, and you'll almost certainly be happy with the results. This answer is for those who are willing to spend an inordinate amount of time on it.
So with that said, cutting an MP3 file based on silence without re-encoding or full decoding is actually possible... Basically, you can look at each frame's side info and each granule's gain & Huffman data to "estimate" the silence.
- Find the silence.
- Copy all the frames from before the silence to a new file.
- Now it gets tricky...
- Pull the audio data from the frames after the silence, keeping track of which frame header goes with which audio data.
- Start writing the second new file, but as you write out the frames, update the main_data_begin field so the bit reservoir stays in sync with where the audio data really is.
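To make that last step concrete: in MPEG-1 Layer III, main_data_begin is the first 9 bits of the side info, and it says how many bytes before this frame's header the frame's audio data actually begins. A frame with main_data_begin == 0 does not reach back into the bit reservoir, so such frames are safe places to start the second file without patching. A sketch of reading the field (assuming MPEG-1 Layer III; the field width and side-info offset differ for MPEG-2):

// Read main_data_begin from a frame. `frame` holds the bytes starting at
// the 4-byte frame header.
function mainDataBegin(frame: Uint8Array): number {
  const protectionBit = frame[1] & 0x01; // 0 means a 16-bit CRC follows
  const sideInfoOffset = 4 + (protectionBit === 0 ? 2 : 0);
  // main_data_begin is the first 9 bits of the side info.
  return (frame[sideInfoOffset] << 1) | (frame[sideInfoOffset + 1] >> 7);
}

// Frames that don't use the bit reservoir can start a new file unpatched.
const isSafeCutPoint = (frame: Uint8Array) => mainDataBegin(frame) === 0;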
MP3 is a compressed audio format. You can't just cut bits out and expect the remainder to still be a valid MP3 file. In fact, since it's a DCT-based transform, the bits are in the frequency domain instead of the time domain. There simply are no bits for sample 10465. There's a frame which contains sample 10465, and there's a set of bits describing all frequencies in that frame.
Plain cutting the audio at sample 10465 and continuing with some random other sample probably causes a discontinuity, which means the number of frequencies present in the resulting frame skyrockets. So that definitely means a full recode. The better way is to smooth the transition, but that's not a trivial operation. And the result is of course slightly different than the input, so it still means a recode.
I don't understand why you'd want to read 20 seconds of audio anyway. Where's that number coming from? You usually want to read everything.
Sound is a wave; it's entirely expected that it crosses zero, so a sample being close to zero isn't special. For a 20 Hz wave (the threshold of hearing), zero crossings happen 40 times per second, but each time you'll have multiple samples near zero. So you basically need multiple samples that are all close to zero, on both sides. Values like 5, 6, 7 aren't much for 16-bit sound, but they might very well be part of a wave that peaks at 10000. You really should check for at least 0.05 seconds to catch those 20 Hz sounds - at a 44.1 kHz sample rate, that's about 2205 consecutive samples.
Since you detected silence over a 50 millisecond interval, you have a "position" that's approximately several hundred samples wide. With any bit of luck, there's a frame boundary in there. Cut there. Otherwise, it's time for re-encoding.

MPEG-2 seeking, where to start?

I'd like to be able to seek to an arbitrary frame in an MPEG-2 file (from DVD; I guess it's called MPEG-2 Program Stream). So far I had been using OpenCV 2.1 to access those frames, but that only worked on a frame-after-frame basis (only forward seeking). When I installed OpenCV 2.3.1, that possibility was lost, i.e. limited to AVI. Anyway, I'd like to do it without OpenCV. I've managed to seek to keyframes (I suppose), or to every so-and-so frame (e.g. every 12th frame). Now, looking at VirtualDub, frame-accurate seeking is possible. It says: "parsing interleaved MPEG-2 file". What exactly does that mean, and where would I have to start to do the same? Is it even legal? I remember reading something about that somewhere, but can't really remember where. I'm programming in C++ using DirectShow. As far as I know, DirectShow won't do it. Then I was looking into CBaseFilter, the StreamTime method, etc., but before I dive into that complex topic I'd like to know if that's the right way to go. Looking forward to your answers, thanks!
@Geraint: code snippet of the filter graph:
// Build the graph manually: add an MPEG-2 demultiplexer and a video decoder.
CoCreateInstance(CLSID_FilterGraph, NULL, CLSCTX_INPROC, IID_IGraphBuilder, (LPVOID *)&pGraphBuilder);
CoCreateInstance(CLSID_MPEG2Demultiplexer, NULL, CLSCTX_INPROC, IID_IBaseFilter, (LPVOID *)&pib);
CoCreateInstance(CLSID_CMPEG2VidDecoderDS, NULL, CLSCTX_INPROC, IID_IBaseFilter, (LPVOID *)&pib2);
pGraphBuilder->AddFilter(pib, L"Sample Splitter");
pGraphBuilder->AddFilter(pib2, L"Sample Decoder");
// Media type for the demux video output pin (prepared here; used when connecting pins).
ZeroMemory(&am_media_type, sizeof(am_media_type));
am_media_type.majortype = MEDIATYPE_Video;
am_media_type.subtype = MEDIASUBTYPE_MPEG2_VIDEO;
am_media_type.formattype = FORMAT_MPEG2Video;
// Query the graph for control, seeking, frame-stepping, and event interfaces.
pGraphBuilder->QueryInterface(IID_IMediaControl, (LPVOID *)&pMediaControl);
pGraphBuilder->QueryInterface(IID_IMediaSeeking, (void **)&pMediaSeeking);
pGraphBuilder->QueryInterface(__uuidof(IVideoFrameStep), (PVOID *)&fst);
pGraphBuilder->QueryInterface(IID_IMediaEvent, (void **)&imev);
pGraphBuilder->QueryInterface(IID_IBasicVideo, (LPVOID *)&ibv);
// Let RenderFile complete the graph for the given file (HRESULT checks omitted).
pGraphBuilder->RenderFile(FILENAME, 0);
and then I use IMediaSeeking for seeking the video. I've also tried frame stepping (hence the references above).
DirectShow is capable of delivering frame-accurate seeking. However, without an index, this is based on a time offset from file start, not a frame count.
Use IMediaSeeking to set the start time. The demux will begin delivery of compressed frames some way before that. The decoder will start decoding at the previous key frame but will discard any frames that are before your chosen start point.
G