gstreamer custom plugin for nvidia gpu - gpu

I want to develop a gstreamer plugin that can use the acceleration provided by the GPU of the graphics card (NVIDIA RTX2xxx). The objective is to have a fast gstreamer pipeline that process a video stream including on it a custom filter.
After two days googling, I can not find any example or hint.
One of the best alternatives found is use "nvivafilter", passing a cuda module as argument. However, no where explains how to install this plugin or provides an example. Worst, it seems that it could be specific for Nvidia Jetson hardware.
Another alternative seems use gstreamer inside an opencv python script. But that means a mix that I do not known how impacts performance.
This gstreamer tutorial talks about several libraries. But seems outdated and not provides details.
RidgeRun seems to have something similar to "nvivafilter", but not FOSS.
Has someone any example or suggestion about how to proceed.

I suggest you start with installing DS 5.0 and explore the examples and the apps provided. It's built on Gstreamer. Deepstream Instalation guide
The installation is straight forward. You will find custom parsers built.
You will need to install the following: Ubuntu 18.04, GStreamer 1.14.1, NVIDIA driver 440or later,CUDA 10.2,TensorRT 7.0 or later.
Here is an example of running an app with 4 streams. deepstream-app -c /opt/nvidia/deepstream/deepstream-5.0/samples/configs/deepstream-app/source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt
The advantage of DS is that all the video pipeline is optimized on GPU including decoding and preprocessing. You can always run Gstreamer along opencv only, in my experience it's not an efficient implementation.
Building custom parsers:
The parsers are required to convert the raw Tensor data from the inference to (x,y) location of bounding boxes around the detected object. This post-processing algorithm will vary based on the detection architecture.
If using Deepstream 4.0, Transfer Learning Toolkit 1.0 and TensorRT 6.0: follow the instructions in the repository https://github.com/NVIDIA-AI-IOT/deepstream_4.x_apps
If using Deepstream 5.0, Transfer Learning Toolkit 2.0 and TensorRT 7.0: keep following the instructions from https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps
Resources:
Starting page: https://developer.nvidia.com/deepstream-sdk
Deepstream download and resources: https://developer.nvidia.com/deepstream-getting-started
Quick start manual: https://docs.nvidia.com/metropolis/deepstream/dev-guide/index.html
Integrate TLT model with Deepstream SDK: https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps
Deepstream Devblog: https://devblogs.nvidia.com/building-iva-apps-using-deepstream-5.0/
Plugin manual: https://docs.nvidia.com/metropolis/deepstream/plugin-manual/index.html
Deepstream 5.0 release notes: https://docs.nvidia.com/metropolis/deepstream/DeepStream_5.0_Release_Notes.pdf
Transfer Learning Toolkit v2.0 Release Notes: https://docs.nvidia.com/metropolis/TLT/tlt-release-notes/index.html
Transfer Learning Toolkit v2.0 Getting Started Guide: https://docs.nvidia.com/metropolis/TLT/tlt-getting-started-guide/index.html
Metropolis documentation: https://docs.nvidia.com/metropolis/
TensorRT: https://developer.nvidia.com/tensorrt
TensorRT documentation: https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html
TensorRT Devblog: https://devblogs.nvidia.com/speeding-up-deep-learning-inference-using-tensorrt/
TensorRT Open Source Software: https://github.com/NVIDIA/TensorRT
https://gstreamer.freedesktop.org/documentation/base/gstbasetransform.html?gi-language=cGood luck.

Related

How can I run Mozilla TTS/Coqui TTS training with CUDA on a Windows system in 2023

there is a post How can I run Mozilla TTS/Coqui TTS training with CUDA on a Windows system? answered, by GuyPaddock, but I have RTX a5000 graphic card, running Windows 10. I'm not a programmer, but I think it needs CUDA version 11.x for this card. Will there be someone good who would write step by step what I should install to be able to run it and train models? (kidna RETARD guide) It's best not to mess with the webUI from AUTOMATIC1111, which requires python 3.10.6. Thanks in advance.
Trying to install it from the link above and also from youtube. I am trying to install this on python 3.10.8 because stable diffusion needs python 3.10.6, And version 3.10.8 is from October like CUDA 11.8. If possible, I'd like a step by step explanation of what I need to do to make it work?

TensorFlow Lite Arduino IDE Library

I am working on using TF Lite to get a trained TF model onto my board via the Arduino IDE. I am using the Circuit Playground Bluefruit board (listed as supported on the TF website). When I try to run the hello-world from the cloned library, I get an "Error in compiling for baord" message.
Adafruit mentions I need only the NON pre-compiled library, but it seems the library was removed from the native library manager a few years ago, making it difficult to find the pre-compiled library. I tried to install using:
git clone https://github.com/tensorflow/tflite-micro-arduino-examples Arduino_TensorFlowLite
which, of course, gets a pre-compiled version. I think this is what is behind the aforementioned error message. Any guidance would be so appreciated!!

Enable XNNPack in TFLite v2.3 for ARM64

The TFLite team recently announced XNNPack support in TF v2.3 (https://blog.tensorflow.org/2020/07/accelerating-tensorflow-lite-xnnpack-integration.html). This should provide some pretty impressive speedups on float operations on ARM v8 cores.
Does anyone know how to enable XNNPack for ARM64 builds of TFLite? The benchmarking application in particular would be a good place to test out this new functionality on target hardware. iOS and Android support is enabled by passing a flag to Bazel when compiling. Unfortunately, no guidance is given for building for ARM64 boards. The build instructions (see below) don't provide any updated guidance, and inspecting download_dependencies.sh doesn't show XNNPack being downloaded from anywhere.
https://www.tensorflow.org/lite/guide/build_arm64
XNNPACK is not yet supported via Makefile-based builds. We have recently added experimental support for cross-compilation to ARM64 (via --config=elinux_aarch64 in the bazel build command), which should allow build-time opt-in to XNNPACK by also adding --define tflite_with_xnnpack=true in your build command. Expect some improvements in documentation for cross-compilation to ARM64 in the next TF 2.4 release, where we'll also be looking into enabling XNNPACK by default for as many platforms as possible.

How to integrate tensorflow into QNX operating system

I want to use the tensorflow in a QNX operating system? The very first step is to integrate the tensorflow into QNX. Any suggestions?
There is an issue on that on GitHub, unfortunately w/o a result but it's a starting point: https://github.com/tensorflow/tensorflow/issues/14753
Depending on your objective, NVIDIA's TensorRT can load TensorFlow models and provides binaries for QNX, see for example https://docs.nvidia.com/deeplearning/sdk/pdf/TensorRT-Release-Notes.pdf

What is SYCL 1.2?

I am trying to install tensorflow
Please specify the location where ComputeCpp for SYCL 1.2 is installed. [Default is /usr/local/computecpp]:
Invalid SYCL 1.2 library path. /usr/local/computecpp/lib/libComputeCpp.so cannot be found
What should I do?What is SYCL 1.2?
SYCL is a C++ abstraction layer for OpenCL. TensorFlow's experimental support for OpenCL uses SYCL, in conjunction with a SYCL-aware C++ compiler.
As Yaroslav pointed out in his comment, SYCL is only required if you are building TensorFlow with OpenCL support. The following question during the execution of ./configure asks about OpenCL support:
Do you wish to build TensorFlow with OpenCL support? [y/N]
If you answer N, you will not have to supply a SYCL path.
It is optional step so you can skip if you want.
OpenCL (Open Computing Language) is a framework for writing programs that execute across heterogeneous platforms consisting of central processing units (CPUs), graphics processing units (GPUs), digital signal processors (DSPs), field-programmable gate arrays (FPGAs) and other processors or hardware accelerators.
so if you want to intall you have to Setting Up TensorFlow With OpenCL Using SYCL bellow link provide step by step information about it
https://developer.codeplay.com/computecppce/latest/getting-started-with-tensorflow