SkiaSharp and GPU acceleration - rendering

I am evaluating the SkiaSharp library (from NuGet, version 1.59.3). Everything renders well, but it seems Skia is not using the GPU for accelerated rendering: Windows 10 Task Manager doesn't detect any GPU usage by my test application. I am using the following code to create an SKCanvas:
using (SKBitmap bitmap = new SKBitmap(gWidth, gHeight, SKColorType.Bgra8888, alphaType))
using (SKCanvas canvas = new SKCanvas(bitmap))
{ ... }
Does GPU acceleration require some special way of initializing SkiaSharp?
I have tried the following:
GRContext context = GRContext.Create(GRBackend.OpenGL);
but it returns null.

Ah, GPU.
You need to be in an existing OpenGL/ANGLE context.
I am doing this:
https://github.com/mono/SkiaSharp/blob/master/source/SkiaSharp.Views/SkiaSharp.Views.UWP/SKSwapChainPanel.cs
But, before I actually initialize SkiaSharp, I first have to manually create the ANGLE context:
https://github.com/mono/SkiaSharp/blob/master/source/SkiaSharp.Views/SkiaSharp.Views.UWP/AngleSwapChainPanel.cs
This is the same for all platforms, first set up the OpenGL/ANGLE context, and then when it is the current context, start the SkiaSharp GRContext.
This is something that I do for my unit tests as well:
https://github.com/mono/SkiaSharp/blob/master/tests/Tests/GlContexts/Wgl/WglContext.cs
This is not always the nicest code to write if you aren't a fan of all this setup code, but you can use another library to handle the boilerplate. As long as there is a valid OpenGL/ANGLE context current, you are good.
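To make that order of operations concrete, here is a rough C# sketch, assuming an OpenGL context has already been created and made current via OpenTK, ANGLE, or similar. (Hedged: the surface-creation API shown is the 1.59-era shape and has changed in later SkiaSharp releases.)
using (GRContext context = GRContext.Create(GRBackend.OpenGL))
{
    // GRContext.Create returns null precisely when no GL context is current.
    var desc = new GRBackendRenderTargetDesc
    {
        Width = gWidth,
        Height = gHeight,
        Config = GRPixelConfig.Rgba8888,
        Origin = GRSurfaceOrigin.BottomLeft,
        SampleCount = 0,
        StencilBits = 8,
        RenderTargetHandle = IntPtr.Zero, // 0 = the context's default framebuffer
    };
    using (SKSurface surface = SKSurface.Create(context, desc))
    {
        SKCanvas canvas = surface.Canvas;
        // ... draw exactly as with the bitmap-backed canvas ...
        canvas.Flush();
    }
}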

Related

Metal assertion `A command encoder is already encoding to this command buffer`

I am using Metal in my project, and I have encapsulated some of the kernels as functions, roughly following the pattern MetalPerformanceShaders uses.
So each of my Metal kernels is wrapped in an Objective-C class with the method:
- (void)encodeToCommandBuffer:(id<MTLCommandBuffer>)cmdBuffer
                 inputTexture:(id<MTLTexture>)inputTexture
                outputTexture:(id<MTLTexture>)outputTexture
                    inputSize:(TextureSize)inputSize
                   outputSize:(TextureSize)outputSize
{
    id<MTLComputeCommandEncoder> enc = [cmdBuffer computeCommandEncoder];
    [enc setComputePipelineState:_state];
    // set arguments on the encoder
    [enc dispatchThreadgroups:_threadgroupsPerGrid threadsPerThreadgroup:_threadsPerThreadgroup];
    [enc endEncoding];
}
The problem is that my code crashes with the assertion:
failed assertion A command encoder is already encoding to this command buffer
The issue is random and happens in different functions. The error description is self-explanatory, but what puzzles me is that the crashes happen in my encodeToCommandBuffer methods. In the pipeline I also use image-processing functions from MetalPerformanceShaders, which are likewise called through an encodeToCommandBuffer method, and those do not crash.
So it is clear that my understanding of how an encodeToCommandBuffer method should be written is wrong. How do I need to modify the code? Do I need to check the cmdBuffer state somehow, to verify that it is ready to produce a new encoder? And what if it's not? Do I need some sort of loop that waits until the buffer is ready?
OK, sorted it out. My pipeline processes multiple instances in parallel, and I made a mistake in the code: the pipeline tried to push all instances through the same command buffer, which was not intended. Since a command buffer can only have one active encoder at a time, concurrent encodeToCommandBuffer calls on a shared buffer trip the assertion; giving each parallel instance its own command buffer fixes it.
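For illustration, a minimal sketch of the corrected scheme (queue, kernelA/kernelB, and the textures are placeholders). Within one buffer the wrapped kernels may be encoded back to back, because each encodeToCommandBuffer: ends its encoder before returning; it is only sharing one buffer across parallel instances that breaks.
// One command buffer per parallel instance.
id<MTLCommandBuffer> cmdBuffer = [queue commandBuffer];
[kernelA encodeToCommandBuffer:cmdBuffer
                  inputTexture:inTex
                 outputTexture:tmpTex
                     inputSize:inSize
                    outputSize:tmpSize];
[kernelB encodeToCommandBuffer:cmdBuffer
                  inputTexture:tmpTex
                 outputTexture:outTex
                     inputSize:tmpSize
                    outputSize:outSize];
[cmdBuffer commit];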

cupy.RawModule using name_expressions and nvcc and/or path

I am using CuPy for testing the CUDA kernels of a library. More specifically, I use cupy.RawModule to exercise the kernels from Python. However, the kernels are templated and enclosed in a namespace. Before the name_expressions parameter to RawModule arrived in CuPy 8.0.0, I had to manually copy the C++-mangled names into the get_function() method of the RawModule. With name_expressions I thought this would no longer be necessary; nevertheless, it requires the code to be compiled from source using the code parameter in combination with backend='nvrtc'.
Should it be possible to enable (any of the below)?:
name_expressions in conjunction with path
name_expressions in conjunction with backend='nvcc'
The answer is no for both questions.
The name_expressions feature requires the source code for just-in-time (JIT) compilation of your C++ template kernels using NVRTC, whereas the path argument is for loading external cubin, fatbin, or ptx code. If you want to compile external source code, you can do so by loading it in Python first and then passing it as the code argument:
with open('my_cuda_cpp_code.cu') as f:
    code = f.read()
mod = cp.RawModule(code=code, name_expressions=(...), ...)
Unfortunately, unlike NVRTC, NVCC does not provide an API to return mangled names, so using NVCC is not possible. If you pass backend='nvcc' to RawModule with name_expressions set, it will raise an error.
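For completeness, a small sketch of the full round trip (my_ns::my_kernel is a placeholder template kernel; the strings passed to name_expressions must be repeated verbatim in get_function()):
import cupy as cp

with open('my_cuda_cpp_code.cu') as f:
    code = f.read()

# NVRTC instantiates the listed template specializations during JIT
# compilation and records their mangled names internally.
mod = cp.RawModule(code=code,
                   name_expressions=('my_ns::my_kernel<float>',
                                     'my_ns::my_kernel<double>'))

# get_function() then accepts the unmangled C++ name.
kern = mod.get_function('my_ns::my_kernel<float>')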

Problem with type FreeRV while adding new distribution

I'm trying to add a new discrete distribution to PyMC3 (a Wallenius non-central hypergeometric) by wrapping Agner Fog's C++ version of it (https://www.agner.org/random/).
I have successfully put the relevant functions in a C++ extension and added broadcasting so that it behaves like scipy's distributions. (For now broadcasting is done in Python; I will later try the xtensor-python bindings for more performant vectorization in C++.)
I'm running into the following problem: when I instantiate an RV of the new distribution in a model context, I get a "TypeError: an integer is required (got type FreeRV)" from where value is passed to the logp() function of the new distribution.
I understand that PyMC3 might need to connect RVs to the functions, but I find no way to cast them into something my new functions can work with.
Any hints on how to resolve this or general info on adding new distributions to PyMC3 or the internal workings of distributions would be extremely helpful.
Thanks in advance!
Jan
EDIT: I noticed that FreeRV inherits from Theano's TensorVariable, so I tried calling .eval(). This leads to another error, along the lines of no input being connected. (I don't have the exact error message right now.)
One thing which puzzles me is why logp is called at instantiation of the variable when setting up the model ...
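No answer was given above, but for reference, the usual route around this is to wrap the compiled function in a Theano Op: PyMC3 then hands the Op symbolic variables, and Theano only calls into the C++ code with concrete numpy arrays at evaluation time. A minimal sketch, assuming a hypothetical wallenius_logpmf Python wrapper around the C++ extension:
import numpy as np
import theano
import theano.tensor as tt

class WalleniusLogp(theano.Op):
    # PyMC3 passes symbolic tensors, but perform() only ever sees numpy
    # arrays, so the C++ extension never meets a FreeRV.
    itypes = [tt.lvector, tt.lvector, tt.lvector, tt.dvector]
    otypes = [tt.dvector]

    def perform(self, node, inputs, output_storage):
        x, n, m, odds = inputs
        # wallenius_logpmf is the (hypothetical) Python wrapper around
        # Agner Fog's C++ implementation.
        output_storage[0][0] = np.asarray(wallenius_logpmf(x, n, m, odds))
The distribution's logp() would then return WalleniusLogp()(value, n, m, odds) instead of calling the wrapper directly. This also relates to the last question: PyMC3 builds the symbolic graph (and evaluates logp on test values) when the variable is created, which is why logp is already called at model setup.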

Best way to modify a built-in TensorFlow kernel

I would like to learn the best way to modify TensorFlow built-in operator kernels.
For example, I want to modify the value of static const double A in tensorflow/core/kernels/resize_bicubic_op.cc. I have come up with two possible ways:
Modify it directly and recompile the whole TensorFlow library. The problems of this solution are: A. This influences all the functions which use bicubic interpolation. B. This requires me to recompile the entire library and does not work when installing from a binary.
Define it as a custom op. The problem is that in the source code there is no REGISTER_OP() call. I don't know how to write the REGISTER_OP() for this bicubic function, or whether other modifications need to be made.
Are there other better ways?
Thanks.
The best way to approach this problem is to build a custom op. See this tutorial for more details about how to add custom ops in general. The REGISTER_OP call for the tf.image.resize_bicubic() op is in tensorflow/core/ops/image_ops.cc.
Another alternative is to re-use the same op registration, and register a new kernel with the alternative implementation. This would enable you to use the (experimental) Graph.kernel_label_map() API to select an alternative implementation for the "ResizeBicubic" op. For example, you could do the following in your Python program:
_ = tf.load_op_library(...)  # Load the .so containing your implementation.

with tf.get_default_graph().kernel_label_map({"ResizeBicubic": "my_impl"}):
    images = tf.image.resize_bicubic(...)  # Will use your implementation.
...and add a kernel registration that specifies the label "my_impl" with your C++ code:
template <typename Device, typename T>
class MyResizeBicubicOp : public OpKernel {
  // Custom implementation goes here...
};

#define REGISTER_KERNEL(T)                              \
  REGISTER_KERNEL_BUILDER(Name("ResizeBicubic")         \
                              .Device(DEVICE_CPU)       \
                              .Label("my_impl")         \
                              .TypeConstraint<T>("T")   \
                              .HostMemory("size"),      \
                          MyResizeBicubicOp<CPUDevice, T>);
TF_CALL_REAL_NUMBER_TYPES(REGISTER_KERNEL);

How do I scan/enumerate vst plugin dlls?

I'm trying to build a small program that hosts VST effects, and I would like to scan a folder for plugin DLLs.
I know how to find all the DLLs, but now I have the following questions:
What is the best way to determine whether a given DLL is a VST plugin?
I tried simply checking whether the DLL exports the proper function. This works fine for plugins made with more recent versions of the VST SDK, since they export a function called "VSTPluginMain", but older versions export a rather generic "main" function.
How do I determine if the plugin is an effect or an instrument?
How do I scan vst shell plugins?
Shell plugins are basically DLLs that somehow contain multiple effects. Examples are the plugins made by Waves Audio (http://www.waves.com/).
PS: If there is a library that can do all of this for me, please let me know.
How to determine a VST plugin?
Once you've found main/VSTPluginMain... call it!
If what's returned is NULL, it's not a VST.
If what's returned is a pointer to a structure beginning with the bytes "VstP" (see VstInt32 magic; ///< must be #kEffectMagic ('VstP') in aeffect.h), then you have a VST.
The VSTPluginMain returns a pointer to an AEffect structure. You will need to look at this structure.
Effect or instrument? Test AEffect::flags & effFlagsIsSynth (effFlagsIsSynth = 1 << 8).
Shell VSTs are more complex:
Category will be kPlugCategShell
Support the "shellCategory" canDo.
Use effShellGetNextPlugin to enumerate.
To instantiate one, respond to audioMasterCurrentId in your host callback with the ID you want (see the sketch below).
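To make the detection steps above concrete, here is a rough Win32 sketch (error handling trimmed; assumes the VST2 SDK's aeffect.h and a host callback like the one sketched further below):
#include <windows.h>
#include "aeffect.h"

typedef AEffect* (*VstEntryProc)(audioMasterCallback audioMaster);

extern VstIntPtr VSTCALLBACK hostCallback(AEffect*, VstInt32, VstInt32,
                                          VstIntPtr, void*, float);

bool IsVstPlugin(const char* path) {
  HMODULE lib = LoadLibraryA(path);
  if (!lib) return false;
  // Newer SDKs export "VSTPluginMain"; older ones export a plain "main".
  VstEntryProc entry = (VstEntryProc)GetProcAddress(lib, "VSTPluginMain");
  if (!entry) entry = (VstEntryProc)GetProcAddress(lib, "main");
  bool isVst = false;
  if (entry) {
    AEffect* effect = entry(hostCallback);
    isVst = (effect != NULL && effect->magic == kEffectMagic);
  }
  FreeLibrary(lib);
  return isVst;
}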
@Dave Gamble nailed it, but I wanted to add a few things on VST shell plugins, since they are a bit tricky to work with.
To determine if a VST is a shell plugin, send the effGetPlugCategory opcode to the plugin's dispatcher. If it returns kPlugCategShell, then it's a shell plugin. To get the list of sub-plugins in the shell, you basically call effShellGetNextPlugin until it returns 0. Example code snippet (adapted from a working VST host):
// All this stuff should probably be set up far earlier in your code...
// This assumes that you have already loaded the plugin DLL and called its
// VSTPluginMain(), which returned the AEffect* stored in `plugin`.
typedef VstIntPtr (*Vst2xPluginDispatcherFunc)(AEffect *effect, VstInt32 opCode, VstInt32 index, VstIntPtr value, void *ptr, float opt);
AEffect* plugin;
Vst2xPluginDispatcherFunc dispatcher = (Vst2xPluginDispatcherFunc)plugin->dispatcher;

char nameBuffer[40];
while(true) {
  memset(nameBuffer, 0, sizeof(nameBuffer));
  // Each call yields the next sub-plugin: its unique ID is returned and its
  // name is written into nameBuffer; 0 marks the end of the list.
  VstInt32 shellPluginId = (VstInt32)dispatcher(plugin, effShellGetNextPlugin, 0, 0, nameBuffer, 0.0f);
  if(shellPluginId == 0 || nameBuffer[0] == '\0') {
    break;
  }
  else {
    // Do something with the name and ID
  }
}
If you actually want to load a plugin in a VST shell, it's a bit trickier. First, your host needs to handle the audioMasterCurrentId opcode in the host callback. When you call the VST's VSTPluginMain() method to instantiate the plugin, it will call the host callback with this opcode and ask for the unique ID which should be loaded.
Because this callback is made before the main function returns (and hence before it delivers an AEffect* to your host), you will probably need to store the shell plugin ID to load in a global variable, since you will not be able to save a pointer to any meaningful data in the void* user field of the AEffect struct in time for it to be passed back to you in the host callback.
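A sketch of the relevant part of such a host callback (gShellPluginIdToLoad is the global mentioned above, set just before calling the shell's VSTPluginMain()):
static VstInt32 gShellPluginIdToLoad = 0;

VstIntPtr VSTCALLBACK hostCallback(AEffect *effect, VstInt32 opcode,
                                   VstInt32 index, VstIntPtr value,
                                   void *ptr, float opt) {
  switch (opcode) {
    case audioMasterCurrentId:
      // The shell asks which sub-plugin to instantiate; answer with the
      // ID previously obtained from effShellGetNextPlugin.
      return gShellPluginIdToLoad;
    case audioMasterVersion:
      return 2400; // report a VST 2.4 host
    default:
      return 0;
  }
}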
If you want to develop your VST host application in .NET, take a look at VST.NET.