WebRTC Blank or Black RenderView in Android - webrtc

Lately some problems have suddenly started appearing in our video call scenario... Could this be a Janus server problem?
Here is the scenario.
Client A makes a video call to Client B. In the starting phase, both Client A and Client B render the local video from the front-camera capturer and create a PeerConnection to the Janus server, and everything is fine.
Then Client B receives the video call request and joins the call, building a PeerConnection to the Janus server and also a PeerConnection to Client A.
Client B works fine: it correctly renders the local video from the front camera as well as the remote video from Client A.
But sometimes Client A fails to render even though the remote data has been received... I captured the log, and EglRenderer reports that no frames are received, as shown below, yet I can still hear the audio from Client B.
2022-05-24 16:12:07.118 20211-20285/? I/-org.webrtc.Logging-: 0-EglRenderer: videoRenderDuration: 4006 ms. Frames received: 0. Dropped: 0. Rendered: 0. Render fps: .0. Average render time: NA. Average swapBuffer time: NA.
2022-05-24 16:12:07.352 20211-20329/? I/-org.webrtc.Logging-: 0-EglRenderer: videoRenderDuration: 4008 ms. Frames received: 0. Dropped: 0. Rendered: 0. Render fps: .0. Average render time: NA. Average swapBuffer time: NA.
2022-05-24 16:12:07.353 20211-20329/? I/-org.webrtc.Logging-: 0-EglRenderer: videoRenderDuration: 4008 ms. Frames received: 0. Dropped: 0. Rendered: 0. Render fps: .0. Average render time: NA. Average swapBuffer time: NA.
2022-05-24 16:12:09.972 20211-20327/? I/-org.webrtc.Logging-: 0-EglRenderer: videoRenderDuration: 4007 ms. Frames received: 0. Dropped: 0. Rendered: 0. Render fps: .0. Average render time: NA. Average swapBuffer time: NA.
2022-05-24 16:12:10.999 20211-20324/? I/-org.webrtc.Logging-: 0-EglRenderer: videoRenderDuration: 4006 ms. Frames received: 0. Dropped: 0. Rendered: 0. Render fps: .0. Average render time: NA. Average swapBuffer time: NA.
2022-05-24 16:12:11.123 20211-20322/? I/-org.webrtc.Logging-: 0-EglRenderer: videoRenderDuration: 4006 ms. Frames received: 0. Dropped: 0. Rendered: 0. Render fps: .0. Average render time: NA. Average swapBuffer time: NA.
2022-05-24 16:12:11.124 20211-20322/? I/-org.webrtc.Logging-: 0-EglRenderer: videoRenderDuration: 4006 ms. Frames received: 0. Dropped: 0. Rendered: 0. Render fps: .0. Average render time: NA. Average swapBuffer time: NA.
2022-05-24 16:12:11.362 20211-20290/? I/-org.webrtc.Logging-: 0-EglRenderer: videoRenderDuration: 4007 ms. Frames received: 0. Dropped: 0. Rendered: 0. Render fps: .0.
Then I suspected there was probably a problem in the network data transmission, so I collected the RTCStats from the different PeerConnections (Client A to Janus and Client A to Client B), and I found a weird log on Client A.
As far as I know, a PeerConnection is supposed to have only one inbound video stream and one inbound audio stream, and only one outbound video stream and one outbound audio stream, except in the multi-stream case (and I am not using multi-stream here). Yet there is a second inbound video stream whose bytesReceived is always zero.
Is it possible that the EglRenderer is attached as a sink to this second video stream and that this is what gets passed to native? But that doesn't quite make sense, because the PeerConnection also has two different inbound audio streams, yet audio works fine.
I have also checked Client B's log, and it looks normal: each PeerConnection has just one inbound video stream and one inbound audio stream.
So why does one PeerConnection have two different inbound video streams? Could this be what leads to the issue?
-----inbound
2022-05-24 16:36:42.911 20211-20326/? D/-ConnectionStatusHelper-: 0-
userName:69b8cd9238004cf9bb33d36c4f2f
OriginData:ConnectionStatus(key='RTCInboundRTPAudioStream_3474935701', id='RTCInboundRTPAudioStream_3474935701', type='inbound-rtp', timeStamp=1653381402904495, info=AudioInboundConnectionStatus(packetsReceived=4897, bytesReceived=320554, packetsLost=0, lastPacketReceivedTimestamp=242214.814, jitter=0.002, fractionLost=0.0, packetsReceivedDiff=0, bytesReceivedDiff=0, packetsLostDiff=0), BaseConnectionStatus(ssrc=3474935701, isRemote=false, mediaType='audio', kind='audio', transportId='RTCTransport_audio_1', trackId='', codecId='RTCCodec_audio_Inbound_111', updateTime=1653381402910), isValid=true)
2022-05-24 16:36:42.912 20211-20291/? D/-ConnectionStatusHelper-: 0-
userName:69b8cd9238004cf9bb33d36c4f2f
OriginData:ConnectionStatus(key='RTCInboundRTPAudioStream_4186376904', id='RTCInboundRTPAudioStream_4186376904', type='inbound-rtp', timeStamp=1653381402904495, info=AudioInboundConnectionStatus(packetsReceived=0, bytesReceived=0, packetsLost=0, lastPacketReceivedTimestamp=0.0, jitter=0.0, fractionLost=0.0, packetsReceivedDiff=0, bytesReceivedDiff=0, packetsLostDiff=0), BaseConnectionStatus(ssrc=4186376904, isRemote=false, mediaType='audio', kind='audio', transportId='RTCTransport_audio_1', trackId='RTCMediaStreamTrack_receiver_17', codecId='', updateTime=1653381402910), isValid=true)
2022-05-24 16:36:42.913 20211-20322/? D/-ConnectionStatusHelper-: 0-
userName:69b8cd9238004cf9bb33d36c4f2f
OriginData:ConnectionStatus(key='RTCInboundRTPVideoStream_1827282291', id='RTCInboundRTPVideoStream_1827282291', type='inbound-rtp', timeStamp=1653381402904495, info=VideoInboundConnectionStatus(firCount=0, pliCount=3, nackCount=24, sliCount=0, packetsReceived=19182, bytesReceived=21350593, jitter=0.0, packetsLost=4, fractionLost=0.0, framesDecoded=2763, packetsReceivedDiff=0, bytesReceivedDiff=0, packetsLostDiff=0), BaseConnectionStatus(ssrc=1827282291, isRemote=false, mediaType='video', kind='video', transportId='RTCTransport_audio_1', trackId='', codecId='RTCCodec_video_Inbound_96', updateTime=1653381402910), isValid=true)
2022-05-24 16:36:42.915 20211-20326/? D/-ConnectionStatusHelper-: 0-
userName:69b8cd9238004cf9bb33d36c4f2f
OriginData:ConnectionStatus(key='RTCInboundRTPVideoStream_391733382', id='RTCInboundRTPVideoStream_391733382', type='inbound-rtp', timeStamp=1653381402904495, info=VideoInboundConnectionStatus(firCount=0, pliCount=0, nackCount=0, sliCount=0, packetsReceived=0, bytesReceived=0, jitter=0.0, packetsLost=0, fractionLost=0.0, framesDecoded=0, packetsReceivedDiff=0, bytesReceivedDiff=0, packetsLostDiff=0), BaseConnectionStatus(ssrc=391733382, isRemote=false, mediaType='video', kind='video', transportId='RTCTransport_audio_1', trackId='RTCMediaStreamTrack_receiver_18', codecId='', updateTime=1653381402910), isValid=true)
-----outbound
2022-05-24 16:36:42.915 20211-20283/? D/-ConnectionStatusHelper-: 0-
userName:69b8cd9238004cf9bb33d36c4f2f
OriginData:ConnectionStatus(key='RTCOutboundRTPAudioStream_3823193211', id='RTCOutboundRTPAudioStream_3823193211', type='outbound-rtp', timeStamp=1653381402904495, info=AudioOutboundConnectionStatus(packetsSent=0, retransmittedPacketsSent=0, bytesSent=0, retransmittedBytesSent=0, packetsSentDiff=0, bytesSentDiffer=0), BaseConnectionStatus(ssrc=3823193211, isRemote=false, mediaType='audio', kind='audio', transportId='RTCTransport_audio_1', trackId='RTCMediaStreamTrack_sender_35', codecId='RTCCodec_audio_Outbound_111', updateTime=1653381402910), isValid=true)
2022-05-24 16:36:42.916 20211-20326/? D/-ConnectionStatusHelper-: 0-
userName:69b8cd9238004cf9bb33d36c4f2f
OriginData:ConnectionStatus(key='RTCOutboundRTPVideoStream_584071059', id='RTCOutboundRTPVideoStream_584071059', type='outbound-rtp', timeStamp=1653381402904495, info=VideoOutboundConnectionStatus(firCount=0, pliCount=0, nackCount=0, sliCount=0, qpSum=0, packetsSent=0, retransmittedPacketsSent=0, bytesSent=0, retransmittedBytesSent=0, framesEncoded=0, totalEncodeTime=0.0, packetsSentDiff=0, bytesSentDiffer=0), BaseConnectionStatus(ssrc=584071059, isRemote=false, mediaType='video', kind='video', transportId='RTCTransport_audio_1', trackId='RTCMediaStreamTrack_sender_36', codecId='RTCCodec_video_Outbound_96', updateTime=1653381402910), isValid=true)

Related

Training with Roboflow-Train-YOLOv5 stops with a '^C'

Running Roboflow's notebook, 'Roboflow-Train-YOLOv5', stops after completing the epochs loop.
Instead of reporting the results, I get the following lines, with a ^C at the end of the 3rd line
from the end.
I would like to know the reason for the failure, and whether there is a way to fix it.
10 epochs completed in 0.191 hours.
Optimizer stripped from runs/train/yolov5s_results2/weights/last.pt, 14.9MB
Optimizer stripped from runs/train/yolov5s_results2/weights/best.pt, 14.9MB
Validating runs/train/yolov5s_results2/weights/best.pt...
Fusing layers...
my_YOLOv5s summary: 213 layers, 7015519 parameters, 0 gradients, 15.8 GFLOPs
Class Images Labels P R mAP@.5 mAP@.5:.95: 20% 1/5 [00:01<00:04, 1.03s/it]^C
CPU times: user 7.01 s, sys: 830 ms, total: 7.84 s
Wall time: 12min 31s
My Colab plan is Colab Pro, so I guess it is not a problem of resources.

How to Increase Flask REST API Concurrent Request Performance

I'm using "gunicorn" and "gevent" to serve flask APIs with 4 workers. When I do AB (apache-benchmark) test to a single API response time per request is 60ms. However, when I call the second API from the first one response time per request is going up to 358ms. As I share the code sample of APIs, they only return a "Hi there! " response. What could be the reason for this response time increase?
First API
import requests
from flask import Flask, request

api_url = f'http://127.0.0.1:4000/'
app = Flask(__name__)

@app.route('/', methods=['GET'])
def index():
    return 'Hi there! '

@app.route('/test', methods=['GET'])
def test():
    resp = requests.get(f'{api_url}')
    response = resp.text
    return response
Second API
from flask import Flask, request

app = Flask(__name__)

@app.route('/', methods=['GET'])
def index():
    return 'Hi there!'
ab test on "/" path with 1000 concurrent 10000 request
Benchmarking 0.0.0.0 (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests
Server Software: gunicorn
Server Hostname: 0.0.0.0
Server Port: 3000
Document Path: /
Document Length: 10 bytes
Concurrency Level: 1000
Time taken for tests: 0.602 seconds
Complete requests: 10000
Failed requests: 0
Total transferred: 1630000 bytes
HTML transferred: 100000 bytes
Requests per second: 16604.12 [#/sec] (mean)
Time per request: 60.226 [ms] (mean)
Time per request: 0.060 [ms] (mean, across all concurrent requests)
Transfer rate: 2643.04 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 1 3.2 0 18
Processing: 2 55 11.7 59 60
Waiting: 1 55 11.7 59 60
Total: 8 56 9.1 59 67
Percentage of the requests served within a certain time (ms)
50% 59
66% 59
75% 59
80% 59
90% 60
95% 60
98% 60
99% 60
100% 67 (longest request)
ab test on "/test" path by calling the second API with 1000 concurrent 10000 request response result per request -> 358.406 ms
Benchmarking 0.0.0.0 (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests
Server Software: gunicorn
Server Hostname: 0.0.0.0
Server Port: 3000
Document Path: /test
Document Length: 9 bytes
Concurrency Level: 1000
Time taken for tests: 3.584 seconds
Complete requests: 10000
Failed requests: 0
Total transferred: 1610000 bytes
HTML transferred: 90000 bytes
Requests per second: 2790.13 [#/sec] (mean)
Time per request: 358.406 [ms] (mean)
Time per request: 0.358 [ms] (mean, across all concurrent requests)
Transfer rate: 438.68 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 1 3.1 0 13
Processing: 16 339 54.2 358 446
Waiting: 3 339 54.2 358 446
Total: 16 340 53.4 359 458
Percentage of the requests served within a certain time (ms)
50% 359
66% 364
75% 367
80% 369
90% 372
95% 376
98% 379
99% 401
100% 458 (longest request)
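To narrow down where the extra ~300 ms per request goes, one thing worth trying (my own suggestion, not part of the original setup) is to time the internal hop inside the /test handler and log it; a minimal sketch based on the first API above:
import time

import requests
from flask import Flask

api_url = 'http://127.0.0.1:4000/'  # assumed address of the second API, as in the question
app = Flask(__name__)

@app.route('/test', methods=['GET'])
def test():
    t0 = time.time()
    resp = requests.get(api_url)  # internal hop to the second API
    hop_ms = (time.time() - t0) * 1000
    app.logger.info('internal hop took %.1f ms', hop_ms)  # diagnostic logging, my addition
    return resp.text
Comparing the logged hop times with ab's per-request times under load shows how much of the latency comes from the extra round trip through the same gunicorn/gevent pool rather than from the handlers themselves.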

tf.subtract cost too long time for large array

TensorFlow's tf.subtract takes too long for a large array.
My workstation configuration:
CPU: Xeon E5 2699 v3
Mem: 384 GB
GPU: NVIDIA K80
CUDA: 8.5
CUDNN: 5.1
Tensorflow: 1.1.0, GPU version
The following is the test code and result.
import tensorflow as tf
import numpy as np
import time

W = 3000
H = 4000
in_a = tf.placeholder(tf.float32, (W, H))
in_b = tf.placeholder(tf.float32, (W, H))

def test_sub(number):
    sess = tf.Session()
    out = tf.subtract(in_a, in_b)
    for i in range(number):
        a = np.random.rand(W, H)
        b = np.random.rand(W, H)
        feed_dict = {in_a: a,
                     in_b: b}
        t0 = time.time()
        out_ = sess.run(out, feed_dict=feed_dict)
        t_ = (time.time() - t0) * 1000
        print "index:", str(i), " total time:", str(t_), " ms"

test_sub(20)
Results:
index: 0 total time: 338.145017624 ms
index: 1 total time: 137.024879456 ms
index: 2 total time: 132.538080215 ms
index: 3 total time: 133.152961731 ms
index: 4 total time: 132.885932922 ms
index: 5 total time: 135.06102562 ms
index: 6 total time: 136.723041534 ms
index: 7 total time: 137.926101685 ms
index: 8 total time: 133.605003357 ms
index: 9 total time: 133.143901825 ms
index: 10 total time: 136.317968369 ms
index: 11 total time: 137.830018997 ms
index: 12 total time: 135.458946228 ms
index: 13 total time: 132.793903351 ms
index: 14 total time: 144.603967667 ms
index: 15 total time: 134.593963623 ms
index: 16 total time: 135.535001755 ms
index: 17 total time: 133.697032928 ms
index: 18 total time: 136.134147644 ms
index: 19 total time: 133.810043335 ms
The test results show that tf.subtract takes more than 130 ms to perform a 3000x4000 subtraction, which is obviously too long, especially on an NVIDIA K80 GPU.
Can anyone suggest some methods to optimize tf.subtract?
Thanks in advance.
You're measuring not only the execution time of tf.subtract but also the time required to transfer the input data from CPU memory to GPU memory: that is your bottleneck.
To avoid it, don't use placeholders to feed the data. Generate it with TensorFlow (if it has to be randomly generated) or, if you have to read it, use the TensorFlow input pipeline (which creates threads that read the input for you ahead of time and then feed the graph without ever leaving the TensorFlow graph).
It's important to do as many operations as possible within the TensorFlow graph in order to remove the data transfer bottleneck.
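As a concrete illustration of the first suggestion (a sketch of mine, not code from the answer): generate the random operands inside the graph with tf.random_uniform so that nothing is fed from host memory; the timing then covers the subtraction itself rather than the per-step transfer of the inputs.
import time
import tensorflow as tf

W = 3000
H = 4000

# Generate the operands on the device instead of feeding them through
# placeholders, so sess.run() does not copy the two 3000x4000 float32
# inputs from host memory on every step.
a = tf.random_uniform((W, H), dtype=tf.float32)
b = tf.random_uniform((W, H), dtype=tf.float32)
out = tf.subtract(a, b)

with tf.Session() as sess:
    for i in range(20):
        t0 = time.time()
        sess.run(out)
        print("index: %d total time: %.3f ms" % (i, (time.time() - t0) * 1000))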
It sounds reasonable that the time I measured includes the data transfer time from CPU memory to GPU memory.
Since I have to read the input data (the inputs are images generated by a mobile phone and sent to TensorFlow one by one), does that mean TensorFlow placeholders must be used?
In the situation mentioned above (images generated by a mobile phone and sent to TensorFlow one by one), if two images are not generated at the same time (i.e., the second image arrives long after the first one), how can the input pipeline threads read the input data ahead of time (the second image has not been generated yet while TensorFlow is still processing the first one)? So, could you give me a simple example to explain the TensorFlow input pipeline?
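For what it's worth, here is a very small sketch of the queue-based input pipeline the answer refers to (TF 1.x style; the file names and the PNG decoding are placeholders of mine, not part of the original discussion). The queue-runner threads read and decode upcoming files in the background while the main thread runs the graph:
import tensorflow as tf

# Placeholder file list; in practice this would be the images you receive.
filename_queue = tf.train.string_input_producer(['img0.png', 'img1.png'])
reader = tf.WholeFileReader()
_, raw = reader.read(filename_queue)
image = tf.cast(tf.image.decode_png(raw, channels=3), tf.float32)

with tf.Session() as sess:
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    for _ in range(2):
        sess.run(image)  # reader threads prefetch the next file while this runs
    coord.request_stop()
    coord.join(threads)
For images that arrive one at a time with long gaps between them, the prefetching benefit is limited, which is exactly the tension raised in the follow-up question above.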

Safari progress bar incomplete on HLS stream

I'm transcoding videos to HLS and am able to validate the stream via:
$mediastreamvalidator <url-to-master-playlist.m3u8>
mediastreamvalidator reports no errors and all segments seem OK.
However, on many videos the progress bar contains a small white/unloaded segment that never changes even though the video segments fully load. Does this have to do with my encoding properties?
Here is the mediastreamvalidator output for one of the errant segments:
Processed 3 out of 3 segments
Average segment duration: 3.686918
Total segment bitrates (all discontinuities): average: 2293.51 kb/s, max: 2562.78 kb/s
Playlist max bitrate: 2374.000000 kb/s
Audio Group ID: AUDIO
Discontinuity: sequence: 0, parsed segment count: 3 of 3, duration: 11.061 sec, average: 2293.51 kb/s, max: 2562.78 kb/s
Track ID: 1
Audio Codec: AAC-LC
Audio sample rate: 44100 Hz
Audio channel layout: Stereo (L R)
Track ID: 2
Video Codec: avc1
Video profile: Main
Video level: 3.1
Video resolution: 480x360
Video average IDR interval: 3.409137, Standard deviation: 0.000005
Video frame rate: 26.400

G1 garbage collector having one slow worker

I'm having an issue with a GC pause (~400 ms) that I'm trying to reduce. I noticed that I always have one worker that is a lot slower than the others:
2013-06-03T17:24:51.606+0200: 605364.503: [GC pause (mixed)
Desired survivor size 109051904 bytes, new threshold 1 (max 1)
- age 1: 47105856 bytes, 47105856 total
, 0.47251300 secs]
[Parallel Time: 458.8 ms]
[GC Worker Start (ms): 605364503.9 605364503.9 605364503.9 605364503.9 605364503.9 605364504.0
Avg: 605364503.9, Min: 605364503.9, Max: 605364504.0, Diff: 0.1]
--> [**Ext Root Scanning (ms)**: **356.4** 3.1 3.7 3.6 3.2 3.0
Avg: 62.2, **Min: 3.0, Max: 356.4, Diff: 353.4**] <---
[Update RS (ms): 0.0 22.4 33.6 21.8 22.3 22.3
Avg: 20.4, Min: 0.0,
As you can see, one worker took 356 ms while the others took only 3 ms!
Does anyone have an idea about what might cause this, or is this considered normal?
[I'd rather post this as a comment, but I still lack the necessary points to do so]
No idea as to whether it is normal, but I've come across the same problem:
2014-01-16T13:52:56.433+0100: 59577.871: [GC pause (young), 2.55099911 secs]
[Parallel Time: 2486.5 ms]
[GC Worker Start (ms): 59577871.3 59577871.4 59577871.4 59577871.4 59577871.4 59577871.5 59577871.5 59577871.5
Avg: 59577871.4, Min: 59577871.3, Max: 59577871.5, Diff: 0.2]
[Ext Root Scanning (ms): 152.0 164.5 159.0 183.7 1807.0 117.4 113.8 138.2
Avg: 354.5, Min: 113.8, Max: 1807.0, Diff: 1693.2]
I've been unable to find much about the subject, but here is a relevant explanation from http://mail.openjdk.java.net/pipermail/hotspot-gc-use/2013-February/001484.html:
Basically, as you surmise, one of the GC worker threads is being held up
when processing a single root. I've seen a similar issue that's caused
by filling up the code cache (where JIT compiled methods are held).
The code cache is treated as a single root and so is claimed in its
entirety by a single GC worker thread. As the code cache fills up,
the thread that claims the code cache to scan starts getting held up.
A full GC clears the issue because that's where G1 currently does
class unloading: the full GC unloads a whole bunch of classes, allowing
the compiled code of any of the unloaded classes' methods to be
freed by the nmethod sweeper. So after a full GC the number of
compiled methods in the code cache is lower.
It could also be just the sheer number of loaded classes, as the
system dictionary is also treated as a single claimable root.
I think I'll try enabling code cache flushing and let you know. If you eventually manage to solve this problem, please let me know; I'm trying to get to the bottom of it as well.
Kind regards