Vulkan ShaderViewportIndexLayerEXT support

I'm in the process of porting some rendering code to Vulkan. I've been using the SPIR-V cross-compiler to avoid the requirement of rewriting all my shaders, which has been working well in most cases, but now I've hit an issue I just can't get past.
I have a vertex shader being compiled to SPIR-V that uses SV_RenderTargetArrayIndex. This maps to ShaderViewportIndexLayerEXT in SPIR-V. The device I'm running on (an NVIDIA 3090 with the latest drivers) supports the VK_EXT_shader_viewport_index_layer extension (which is core in 1.2), and I'm explicitly enabling both shaderOutputViewportIndex and shaderOutputLayer in the VkPhysicalDeviceVulkan12Features struct (chained off the VkPhysicalDeviceFeatures2, which is in turn chained on pNext for the VkDeviceCreateInfo struct).
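For reference, the feature chain I'm describing looks roughly like this (a minimal sketch; queue setup, extension list and error handling omitted):

#include <vulkan/vulkan.h>

VkDevice create_device(VkPhysicalDevice physicalDevice)
{
    VkPhysicalDeviceVulkan12Features features12{};
    features12.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VULKAN_1_2_FEATURES;
    features12.shaderOutputViewportIndex = VK_TRUE;
    features12.shaderOutputLayer = VK_TRUE;

    VkPhysicalDeviceFeatures2 features2{};
    features2.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2;
    features2.pNext = &features12;

    VkDeviceCreateInfo deviceInfo{};
    deviceInfo.sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO;
    deviceInfo.pNext = &features2; // pEnabledFeatures stays null when VkPhysicalDeviceFeatures2 is chained
    // ... queue create infos and enabled extensions omitted ...

    VkDevice device = VK_NULL_HANDLE;
    vkCreateDevice(physicalDevice, &deviceInfo, nullptr, &device); // result check omitted in this sketch
    return device;
}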
I've also added the line:
[[vk::ext_extension("SPV_EXT_shader_viewport_index_layer")]]
to the .hlsl source file, and verified that the SPIR-V being output contains the extension reference.
The validation layer is whining with:
Validation Error: [ VUID-VkShaderModuleCreateInfo-pCode-01091 ] Object 0: handle = 0x1e66c477eb0, type = VK_OBJECT_TYPE_DEVICE; | MessageID = 0xa7bb8db6 | vkCreateShaderModule(): The SPIR-V Capability (ShaderViewportIndexLayerEXT) was declared, but none of the requirements were met to use it. The Vulkan spec states: If pCode declares any of the capabilities listed in the SPIR-V Environment appendix, one of the corresponding requirements must be satisfied
Is there something else I need to do to enable this feature on the device? Any help/insight would be greatly appreciated.
(I can always drop the requirement to use this feature - realistically in my use case on DX12 it's not gaining me much if anything, but I'd rather figure this out!)

Related

Can neon registers be indexed?

Consider a neon register such as:
uint16x8_t foo;
To access an individual lane, one is supposed to use vgetq_lane_u16(foo, 3). However, one might be tempted to write foo[3] instead, given the intuition that foo is an array of shorts. When doing so, gcc (10) compiles it without warnings, but it is not clear that it does what was intended.
The gcc documentation does not specifically mention the indexed access, but it says that operations behave like C++ valarrays. Those do support indexed access in the intuitive way.
Regardless of what foo[3] evaluates to, doing so seems to be faster than vgetq_lane_u16(foo, 3), so probably they're different or we wouldn't need both.
So what exactly does foo[3] mean? Is its behavior defined at all? If not, why does gcc happily compile it?
The foo[3] form is the GCC vector-extensions form, as you have found and linked documentation for; it behaves as follows:
Vectors can be subscripted as if the vector were an array with the same number of elements and base type. Out of bound accesses invoke undefined behavior at run time. Warnings for out of bound accesses for vector subscription can be enabled with -Warray-bounds.
This can have surprising results when used on big-endian systems, so Arm's C Language Extensions (ACLE) recommend using vget_lane if you are using other Neon intrinsics like vld1 in the same code path.
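For illustration, the two access forms side by side (a minimal sketch; the function names are only for the example):

#include <arm_neon.h>
#include <stdint.h>

uint16_t lane_by_intrinsic(uint16x8_t foo) {
    return vgetq_lane_u16(foo, 3); // architectural lane 3, regardless of endianness
}

uint16_t lane_by_subscript(uint16x8_t foo) {
    return foo[3]; // GCC vector-extension subscript; follows the in-memory element
                   // order, which need not match lane numbering on big-endian targets
}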

Why is Aws::String not extending std::string

We previously built aws-sdk-cpp locally and designed a library around it. Now I'm upgrading to use Conan provided aws-sdk-cpp instead of a locally built one, but I'm getting errors from our library.
I call AWS functions such as the following,
std::string src_bucket;
Aws::S3::Model::DeleteObjectsRequest obj;
// ...
obj.WithBucket(src_bucket).WithDelete(std::move(del));
but I get errors like this.
error: no matching function for call to 'Aws::S3::Model::DeleteObjectsRequest::WithBucket(const string&)'.
Has this function changed so that it no longer accepts std::string? Is there a way to make AWS methods accept std::string?
Is there a way to make AWS methods accept std::string?
Yes. These functions accept std::string if custom memory management is disabled:
If the compile-time constant is enabled (on), the types resolve to STL types with a custom allocator connected to the AWS memory system.
If the compile-time constant is disabled (off), all Aws::* types resolve to the corresponding default std::* type.
It looks like the former is what you get, and the latter is what you expect - perhaps you've switched from linking the SDK statically (default off) to linking the SDK dynamically (default on)?
In any case, you'll either have to somehow build the SDK with custom memory management disabled, use types like Aws::String yourself, or convert between Aws::String and std::string as needed.
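For the conversion route, a small sketch (the helper names are hypothetical; the header path assumes the usual aws-sdk-cpp layout):

#include <aws/core/utils/memory/stl/AWSString.h>
#include <string>

Aws::String to_aws(const std::string& s)
{
    // Copies the characters, so it works whichever way the SDK was built.
    return Aws::String(s.c_str(), s.size());
}

std::string to_std(const Aws::String& s)
{
    return std::string(s.c_str(), s.size());
}

The call in the question then becomes obj.WithBucket(to_aws(src_bucket)).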

Does the ABI persist any more error information than an HRESULT?

While porting a regular C++ class to a Windows Runtime class, I hit a fairly significant roadblock. My C++ class reports certain error conditions by throwing custom error objects. This allows clients to conveniently filter on the exceptions documented in the public interface.
I cannot seem to find a reliable way to pass enough information across the ABI to replicate the same fidelity1 using the Windows Runtime. Under the assumption that an HRESULT is the only generalized error-reporting information, I have evaluated the following options:
The 'obvious' choice: Map the exception condition to any of the predefined HRESULT values. While this technically works (presumably), there is no way at the call site to distinguish between errors originating from the implementation, and errors originating from callees of the implementation.
Invent custom HRESULTs. If this layout still applies to the Windows Runtime, I could easily set the Customer bit and go crazy with my 27 bits worth of error code representation. This works, until someone else does the same. I'm not aware of any way to attribute an HRESULT to an interface, which would solve this ambiguity.
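As a sketch of what that second option could look like (all names are hypothetical):

#include <Windows.h>

// Severity bit (31) and Customer bit (29) set, leaving 27 bits of error-code space.
constexpr HRESULT MakeCustomHresult(unsigned long code) noexcept
{
    return static_cast<HRESULT>(0xA0000000u | (code & 0x07FFFFFFu));
}

constexpr HRESULT E_MYLIB_WIDGET_JAMMED = MakeCustomHresult(0x0001);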
Even if either of the above could be made to work as intended, throwing hresult_errors as prescribed, the call site would still be at the mercy of the language projection. While C# seemingly allows passing any System.Exception(-derived) error object across the ABI and having it re-thrown at the call site, C++/WinRT only supports some 14 distinct exception types (see throw_hresult).
With neither of these options allowing sufficiently complete error information to cross the ABI, it seems that an HRESULT simply may not be enough. Does the Windows Runtime have any provision to allow additional (arbitrary) error information to cross the ABI?
1 I'm not strictly interested in passing actual C++ exceptions across. Instead, I'm looking for a way to allow clients to uniquely identify documented error conditions, in a natural way. Passing custom Windows Runtime error types would be fine.
There are a few options here. Our general API guidance for Windows Runtime APIs that have well-defined, expected failure modes is that failure information should be part of the normal parameters and return value. We would normally create a TryDoSomething API in this situation and provide extended error information via either a return or out parameter. This works best for us because there's no consistent way to map exceptions across all languages. This is a topic we hope to revisit more in xlang in the future.
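As a rough sketch of that shape (types and names are purely hypothetical, not an actual Windows Runtime API):

#include <string>

enum class CopyStatus { Succeeded, SourceMissing, DestinationFull, AccessDenied };

struct FileCopier
{
    // Returns true on success; on a documented, expected failure the status
    // out-parameter identifies the condition, so callers can branch on it
    // without catching exceptions across the ABI.
    bool TryCopy(std::wstring const& source, std::wstring const& destination,
                 CopyStatus& status);
};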
HRESULTs are usable with a caveat. HRESULT values can be a nuisance in anything but C++, where you need to redefine them locally because you can't just use the header. They will generate exceptions in most languages, so if this is common, you'll be creating debugger noise for your components' clients.
The last option allows you to transit a language-specific exception stored in a COM object across the ABI boundary (and up the COM logical stack, including across marshalled calls). In practice it will only be usable by C++ code compiled with the same compiler, settings, and type definitions as the component itself. E.g. passing it from a component compiled with VC to a component compiled with Clang could potentially lead to memory corruption.
Assuming I haven't scared you off, you'll want to look at RoOriginateLanguageException. It allows you to wrap the exception in a COM object and store it with other winrt error data in the TLS. We use this in projections to enable exceptions thrown within a callback to propagate to the outer code using the same projection in a controlled way that unwinds safely through other code potentially written using other languages or tools. This is how the support in C# and other languages is implemented.
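A minimal sketch of that call (the helper and the wrapper object are placeholders; RoOriginateLanguageException itself is declared in roerrorapi.h):

#include <windows.h>
#include <roerrorapi.h>

void ReportFailure(HRESULT hr, HSTRING message, IUnknown* wrapper)
{
    // Associates the wrapped language exception with the error info for this
    // thread, so a projection on the other side of the ABI can recover it.
    RoOriginateLanguageException(hr, message, wrapper);
}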
Thanks,
Ben

GL commands after eglTerminate

The question is regarding OpenGL ES 2.0 and EGL 1.4.
I'm trying to understand whether there is a spec requirement on the behavior of GL commands after eglTerminate has been called. I mean, is a GL error generated, or can it be an exception/crash?
Is there any definition of an expected behavior in this case, or should GL commands not be influenced by EGL commands at all?
Thanks
Calling eglTerminate flags for deletion all EGL resources associated with the EGLDisplay you are terminating. This includes any surfaces and contexts, which would certainly affect the behaviour of an OpenGL ES context in your case.
Regarding expected behaviour, the spec wording you're after is the following (from http://www.khronos.org/registry/egl/specs/eglspec.1.5.pdf - eglTerminate, page 17):
Use of bound contexts and surfaces (that is, continuing to issue commands to a bound client API context) will not result in interruption or termination of applications, but rendering results are undefined, and client APIs may generate errors.
i.e. if your context is still current when you terminate the display, any subsequent OpenGL ES calls made on that context are undefined - they may raise OpenGL ES errors, or result in incorrect rendering, but should not cause an exception.
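For completeness, a defensive teardown order that avoids the situation entirely (a sketch; display, surface and context stand for whatever handles your code already holds):

#include <EGL/egl.h>

void shutdown_egl(EGLDisplay display, EGLSurface surface, EGLContext context)
{
    // Unbind first so no context is current when the display is terminated.
    eglMakeCurrent(display, EGL_NO_SURFACE, EGL_NO_SURFACE, EGL_NO_CONTEXT);
    eglDestroyContext(display, context);
    eglDestroySurface(display, surface);
    eglTerminate(display);
    // Issuing GL ES commands on the old context after this point is exactly the
    // undefined case the spec quote above describes.
}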

Why does Math.sin() delegate to StrictMath.sin()?

I started wondering why Math.sin(double) delegates to StrictMath.sin(double) when I came across the issue in a Reddit thread. The mentioned code fragment looks like this (JDK 7u25):
Math.java:
public static double sin(double a) {
    return StrictMath.sin(a); // default impl. delegates to StrictMath
}
StrictMath.java:
public static native double sin(double a);
The second declaration is native, which is reasonable to me. The documentation of Math states that:
Code generators are encouraged to use platform-specific native libraries or microprocessor instructions, where available (...)
And the question is: isn't the native library that implements StrictMath platform-specific enough? What more can the JIT know about the platform than an installed JRE (please concentrate only on this very case)? In other words, why isn't Math.sin() native already?
I'll try to wrap up the entire discussion in a single post.
Generally, Math delegates to StrictMath. Obviously, the call can be inlined so this is not a performance issue.
StrictMath is a final class with native methods backed by native libraries. One might think that native means optimal, but this doesn't necessarily have to be the case. Looking through the StrictMath javadoc, one can read the following:
(...) the definitions of some of the numeric functions in this package require that they produce the same results as certain published algorithms. These algorithms are available from the well-known network library netlib as the package "Freely Distributable Math Library," fdlibm. These algorithms, which are written in the C programming language, are then to be understood as executed with all floating-point operations following the rules of Java floating-point arithmetic.
The way I understand this doc is that the native library implementing StrictMath is written in terms of the fdlibm library, which is multi-platform and known to produce predictable results. Because it's multi-platform, it can't be expected to be an optimal implementation on every platform, and I believe that this is the place where a smart JIT can fine-tune the actual performance, e.g. by statistical analysis of input ranges and adjusting the algorithms/implementation accordingly.
Digging deeper into the implementation, it quickly turns out that the native library backing StrictMath actually uses fdlibm:
StrictMath.c source in OpenJDK 7 looks like this:
#include "fdlibm.h"
...
JNIEXPORT jdouble JNICALL
Java_java_lang_StrictMath_sin(JNIEnv *env, jclass unused, jdouble d)
{
    return (jdouble) jsin((double)d);
}
and the sine function is defined in fdlibm/src/s_sin.c, referring in a few places to the __kernel_sin function that comes directly from the header fdlibm.h.
While I'm temporarily accepting my own answer, I'd be glad to accept a more competent one when it comes up.
Why does Math.sin() delegate to StrictMath.sin()?
The JIT compiler should be able to inline the StrictMath.sin(a) call. So there's little point creating an extra native method for the Math.sin() case ... and adding extra JIT compiler smarts to optimize the calling sequence, etcetera.
In the light of that, your objection really boils down to an "elegance" issue. But the "pragmatic" viewpoint is more persuasive:
Fewer native calls makes the JVM core and JIT easier to maintain, less fragile, etcetera.
If it ain't broken, don't fix it.
At least, that's how I imagine how the Java team would view this.
The question assumes that the JVM actually runs the delegation code. On many JVMs it won't: calls to Math.sin(), etc. will potentially be replaced by the JIT with intrinsic function code (if suitable), transparently. This will typically be done in a way that is unobservable to the end user. This is a common trick for JVM implementers where interesting specializations can happen (even if the method is not tagged as native).
Note, however, that most platforms can't simply drop in a single processor instruction for sin, because of accuracy requirements over the full input range (e.g. see the Intel discussion).
The Math API permits non-strict but better-performing implementations of its methods, but does not require them, and by default Math simply uses the StrictMath implementation.