C++/CLI method calls native method to modify int - need pin_ptr?

I have a C++/CLI method, ManagedMethod, with one output argument that will be modified by a native method as such:
// file: test.cpp
#pragma unmanaged
void NativeMethod(int& n)
{
n = 123;
}
#pragma managed
void ManagedMethod([System::Runtime::InteropServices::Out] int% n)
{
pin_ptr<int> pinned = &n;
NativeMethod(*pinned);
}
int main()
{
int n = 0;
ManagedMethod(n);
// n is now modified
}
Once ManagedMethod returns, the value of n has been modified as I would expect. So far, the only way I've been able to get this to compile is to use a pin_ptr inside ManagedMethod, so is pinning in fact the correct/only way to do this? Or is there a more elegant way of passing n to NativeMethod?

Yes, this is the correct way to do it. It is very highly optimized inside the CLR: the variable gets the [pinned] attribute, so the CLR knows that it stores an interior pointer to an object that must not be moved. Unlike GCHandle::Alloc(), pin_ptr<> does this without creating another handle. The pinned variable is recorded in the table that the jitter generates when it compiles the method, and the GC uses that table to know where to look for object roots.
The pin only ever matters when a garbage collection occurs while NativeMethod() is running. That doesn't happen very often in practice; you'd have to use threads in the program. YMMV.
There is another way to do it that doesn't require pinning but needs a wee bit more machine code:
void ManagedMethod(int% n)
{
int copy = n;
NativeMethod(copy);
n = copy;
}
This works because local variables have stack storage and thus won't be moved by the garbage collector. It does not win any points for elegance, but it is what I normally use myself, because estimating the side effects of pinning is not that easy. But, really, don't fear pin_ptr<>.

Check whether function called through function-pointer has a return statement

We have a plugin system that calls functions in dlls (user-generated plugins) by dlopening/LoadLibrarying the dll/so/dylib and then dlsyming/GetProcAddressing the function, and then storing that result in a function pointer.
Unfortunately, due to some bad example code being copy-pasted, some of these dlls in the wild do not have the correct function signature, and do not contain a return statement.
A dll might contain this:
extern "C" void Foo() { stuffWithNoReturn(); } // copy-paste from bad code
or it might contain this:
extern "C" int Foo() { doStuff(); return 1; } // good code
The application that loads the dll relies on the return value, but there are a nontrivial number of dlls out there that don't have the return statement. I am trying to detect this situation, and warn the user about the problem with his plugin.
This naive code should explain what I'm trying to do:
typedef int (*Foo_f)(void);
Foo_f func = (Foo_f)getFromDll(); // does dlsym or GetProcAddress depending on platform
int canary = 0x42424242;
canary = (*func)();
if (canary == 0x42424242)
printf("You idiot, this is the wrong signature!!!\n");
else
real_return_value = canary;
This unfortunately does not work: canary contains a random value after calling a dll that has the known defect. I naively assumed that calling a function with no return statement would leave the canary intact, but it doesn't.
My next idea was to write a little bit of inline assembler to call the function and check the eax register upon return, but Visual Studio 2015 doesn't allow __asm in x64 code.
I know there is no standards-conforming solution to this, as casting the function pointer to the wrong type is of course undefined behavior. But if someone has a solution that works at least on 64-bit Windows with Visual C++, or a solution that works with clang on macOS, I would be most delighted.
@Lorinczy Zsigmond is right in that the contents of the register are undefined if the function does something but returns nothing.
We found, however, that in practice the plugins that return nothing almost always have empty functions that compile to a retn 0x0 and leave the return register untouched. We can detect this case by spraying the rax register with a known value (0xdeadbeef) and checking for that.

Confusion regarding reentrant functions

My understanding of "reentrant function" is that it's a function that can be interrupted (e.g by an ISR or a recursive call) and later resumed such that the overall output of the function isn't affected in any way by the interruption.
Following is an example of a reentrant function from Wikipedia https://en.wikipedia.org/wiki/Reentrancy_(computing)
int t;
void swap(int *x, int *y)
{
int s;
s = t; // save global variable
t = *x;
*x = *y;
// hardware interrupt might invoke isr() here!
*y = t;
t = s; // restore global variable
}
void isr()
{
int x = 1, y = 2;
swap(&x, &y);
}
I was thinking, what if we modify the ISR like this:
void isr()
{
t=0;
}
And let's say, then, that the main function calls the swap function, but then suddenly an interrupt occurs, then the output would surely get distorted as the swap wouldn't be proper, which in my mind makes this function non-reentrant.
Is my thinking right or wrong? Is there some mistake in my understanding of reentrancy?
The answer to your question:
that the main function calls the swap function, but then suddenly an interrupt occurs, then the output would surely get distorted as the swap wouldn't be proper, which in my mind makes this function non-reentrant.
is no, it does not, because re-entrancy is, by definition, defined with respect to the function itself. If isr calls swap, the interrupted swap is still safe. swap is, however, not thread-safe.
The correct way of thinking depends on the precise definitions of re-entrancy and thread-safety (see, for example, Threadsafe vs re-entrant).
Wikipedia, the source of the code in question, selected the definition of reentrant function to be "if it can be interrupted in the middle of its execution and then safely called again ("re-entered") before its previous invocations complete execution".
I have never heard the term re-entrancy used in the context of interrupt service routines. It is generally the responsibility of the ISR (and/or the operating system) to maintain consistency - application code should not need to know anything about what an interrupt might do.
That a function is re-entrant usually means that it can be called from multiple threads simultaneously - or by itself recursively (either directly or through a more elaborate call chain) - and still maintain internal consistency.
For functions to be re-entrant they must generally avoid using static variables and of course avoid calls to other functions that are not themselves re-entrant.
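As a rough illustration of the point about static variables, the sketch below simulates the interrupt with an ordinary function pointer that is called in the middle of each swap. All names (g_tmp, g_hook, swap_global, swap_local, reenter_once) are made up for this example and do not come from the question. One swap keeps its scratch value in a global without saving it; the other keeps it on the stack.

```cpp
// Simulated "interrupt": a hook invoked in the middle of each swap.
// All names here are hypothetical and exist only for this illustration.
static int g_tmp = 0;                 // shared scratch variable
static void (*g_hook)() = nullptr;    // stands in for an ISR firing mid-swap

// Not reentrant: the scratch value lives in a global and is never saved.
void swap_global(int* x, int* y) {
    g_tmp = *x;
    *x = *y;
    if (g_hook) g_hook();             // "interrupt" happens here
    *y = g_tmp;                       // g_tmp may have been overwritten
}

// Reentrant: the scratch value lives in this call's own stack frame.
void swap_local(int* x, int* y) {
    int tmp = *x;
    *x = *y;
    if (g_hook) g_hook();             // "interrupt" happens here
    *y = tmp;                         // unaffected by any nested call
}

// The simulated ISR: disarms itself, then re-enters swap_global.
void reenter_once() {
    g_hook = nullptr;                 // fire only once to avoid endless recursion
    int a = 10, b = 20;
    swap_global(&a, &b);              // overwrites g_tmp with 10
}
```

With the hook armed, calling swap_global on x = 1, y = 2 leaves y holding the nested call's scratch value (10), while swap_local produces the expected 2 and 1. The Wikipedia version quoted in the question avoids the problem differently, by saving and restoring t around its use.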

Compile-time information in CUDA

I'm optimizing a very time-critical CUDA kernel. My application accepts a wide range of switches that affect the behavior (for instance, whether to use 3rd or 5th order derivative). Consider as an approximation a set of 50 switches, where every switch is an integer variable (a bool sometimes, or a float, but this case is not so relevant for this question).
All these switches are constant during the execution of the application. Most of these switches are run-time and I store them in constant memory, so to exploit the caching mechanism. Some other switches can be compile-time and the customer is fine with having to re-compile the application if he wants to change the value in the switch. A very simple example could be:
__global__ void mykernel(const float* in, float *out)
{
for ( /* many many times */ )
if (compile_time_switch)
do_this(in, out);
else
do_that(in, out);
}
Assume that do_this and do_that are compute-bound and very cheap, that I optimize the for loop so that its overhead is negligible, that I have to place the if inside the iteration. If the compiler recognizes that compile_time_switch is static information it can optimize out the call to the "wrong" function and create code that is just as optimized as if the if weren't there. Now the real question:
In which ways can I provide the compiler with the static value of this switch? I see two such ways, listed below, but none of them work for me. What other possibilities remain?
Template parameters
Providing a template parameter enables this static optimization.
template<int compile_time_switch>
__global__ void mykernel(const float* in, float *out)
{
for ( /* many many times */ )
if (compile_time_switch)
do_this(in, out);
else
do_that(in, out);
}
This simple solution does not work for me, since I don't have direct access to the code that calls the kernel.
Static members
Consider the following struct:
struct GlobalParameters
{
static const bool compile_time_switch = true;
};
Now GlobalParameters::compile_time_switch contains the static information as I want it, and that compiler would be able to optimize the kernel. Unfortunately, CUDA does not support such static members.
EDIT: The last statement is apparently wrong. The definition of the struct is of course legal, and you are able to use the static member GlobalParameters::compile_time_switch in device code. The compiler inlines the variable, so the final code directly contains the value rather than a run-time variable access, which is the behavior you would expect from an optimizing compiler. So the second option is actually suitable.
I consider my problem solved both thanks to this fact and to kronos' answer. However, I'm still looking for other alternative methods to provide compile-time information to the compiler.
Your third option is preprocessor definitions:
#define compile_time_switch 1
__global__ void mykernel(const float* in, float *out)
{
for ( /* many many times */ )
if (compile_time_switch)
do_this(in, out);
else
do_that(in, out);
}
The preprocessor replaces compile_time_switch with the literal 1, so the compiler sees if (1) and its dead code elimination pass removes the else branch completely. If you want the preprocessor itself to discard the unused branch, so that the compiler never sees it, use #if / #else / #endif instead of a run-time if.
Furthermore, you can specify the definition with the -D command-line switch; nvcc accepts -D, as (I think) does any host compiler supported by NVIDIA (MSVC's own spelling is /D).
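A minimal sketch of the preprocessor variant, written as plain host C++ so that it compiles anywhere; the assumption is that the same pattern carries over unchanged into the body of a __global__ kernel. The macro name COMPILE_TIME_SWITCH, the function process and the bodies of do_this and do_that are placeholders invented for this example. The #ifndef guard lets a -D option on the command line override the default without a redefinition warning.

```cpp
// Default value; override on the command line with -DCOMPILE_TIME_SWITCH=0.
// (Hypothetical macro name, chosen for this sketch.)
#ifndef COMPILE_TIME_SWITCH
#define COMPILE_TIME_SWITCH 1
#endif

// Placeholder work functions; the real do_this/do_that are not shown in the question.
inline float do_this(float v) { return v * 2.0f; }
inline float do_that(float v) { return v + 1.0f; }

// Host-side stand-in for the kernel body.
void process(const float* in, float* out, int n) {
    for (int i = 0; i < n; ++i) {
#if COMPILE_TIME_SWITCH
        out[i] = do_this(in[i]);   // only this branch reaches the compiler
#else
        out[i] = do_that(in[i]);   // removed by the preprocessor when the switch is 1
#endif
    }
}
```

Because the selection happens in the preprocessor, the unused branch does not even have to compile, which can be handy when one variant depends on code that is not always available.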

Will code written in this style be optimized out by RVO in C++11?

I grew up in the days when passing around structures was bad mojo because they are often large, so pointers were always the way to go. Now that C++11 has quite good RVO (right value optimization), I'm wondering if code like the following will be efficient.
As you can see, my class has a bunch of vector structures (not pointers to them). The constructor accepts value structures and stores them away.
My -hope- is that the compiler will use move semantics so that there really is no copying of data going on; the constructor will (when possible) just assume ownership of the values passed in.
Does anyone know if this is true, and happens automagically, or do I need a move constructor with the && syntax and so on?
// ParticleVertex
//
// Class that represents the particle vertices
class ParticleVertex : public Vertex
{
public:
D3DXVECTOR4 _vertexPosition;
D3DXVECTOR2 _vertexTextureCoordinate;
D3DXVECTOR3 _vertexDirection;
D3DXVECTOR3 _vertexColorMultipler;
ParticleVertex(D3DXVECTOR4 vertexPosition,
D3DXVECTOR2 vertexTextureCoordinate,
D3DXVECTOR3 vertexDirection,
D3DXVECTOR3 vertexColorMultipler)
{
_vertexPosition = vertexPosition;
_vertexTextureCoordinate = vertexTextureCoordinate;
_vertexDirection = vertexDirection;
_vertexColorMultipler = vertexColorMultipler;
}
virtual const D3DVERTEXELEMENT9 * GetVertexDeclaration() const
{
return particleVertexDeclarations;
}
};
Yes, indeed you should trust the compiler to optimally "move" the structures:
Want Speed? Pass By Value
Guideline: Don’t copy your function arguments. Instead, pass them by value and let the compiler do the copying
In this case, you'd move the arguments into the constructor call:
ParticleVertex myPV(std::move(pos),
std::move(textureCoordinate),
std::move(direction),
std::move(colorMultipler));
In many contexts, the std::move will be implicit, e.g.
D3DXVECTOR4 getFooPosition() {
D3DXVECTOR4 result;
// bla
return result; // NRVO, std::move only required with MSVC
}
ParticleVertex myPV(getFooPosition(), // implicit rvalue-reference moved
RVO means Return Value Optimization, not right value optimization.
RVO is an optimization performed by the compiler when a function returns by value and it's clear that the code returns a temporary object created in the body, so the copy can be avoided. The function constructs the returned object directly in the caller's storage.
What C++11 introduces is move semantics. Move semantics allow us to "move" a resource from a temporary to a target object.
But a move implies that the object the resource comes from can no longer be relied on to hold its old contents after the move. This is (I think) not what you want in your class, because the vertex data is used by the class whether or not the user calls this function.
So, use the common return by const reference to avoid copies.
On the other hand, DirectX provides handles to the resources (pointers), not the real resources. Pointers are basic types and copying them is cheap, so don't worry about performance. In your case, you are using 2D/3D vectors; copying them is cheap too.
Personally, I think that returning a pointer to an internal resource is always a very bad idea. I think that in this case the best approach is to return by const reference.
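To see what "pass by value, then move" actually costs, here is a small sketch that counts copy and move constructions. Tracked, Holder and makeTracked are hypothetical names for this illustration; Tracked stands in for a type that owns heap data, unlike the small D3DX vectors in the question, for which a move is the same as a copy.

```cpp
#include <utility>
#include <vector>

// Hypothetical type that counts how often it is copied or moved.
struct Tracked {
    static int copies;
    static int moves;
    std::vector<float> data;

    Tracked() = default;
    Tracked(const Tracked& other) : data(other.data) { ++copies; }
    Tracked(Tracked&& other) noexcept : data(std::move(other.data)) { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Takes its argument by value and moves it into the member,
// using the member initializer list rather than assignment in the body.
class Holder {
public:
    explicit Holder(Tracked t) : t_(std::move(t)) {}
    const Tracked& get() const { return t_; }   // const reference, no copy
private:
    Tracked t_;
};

// Returns a local by value; the compiler may elide or move, never copy.
Tracked makeTracked() {
    Tracked result;
    result.data.assign(3, 1.0f);
    return result;
}
```

Constructing a Holder from makeTracked() performs no copy at all, while constructing it from a named variable performs exactly one copy (into the parameter) followed by a move into the member.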

std::unique_ptr and pointer-to-pointer

I've been using std::unique_ptr to store some COM resources, and provided a custom deleter function. However, many of the COM functions want pointer-to-pointer. Right now, I'm using the implementation detail of _Myptr, in my compiler. Is it going to break unique_ptr to be accessing this data member directly, or should I store a gajillion temporary pointers to construct unique_ptr rvalues from?
COM objects are reference-countable by their nature, so you shouldn't use anything except reference-counting smart pointers like ATL::CComPtr or _com_ptr_t even if it seems inappropriate for your usecase (I fully understand your concerns, I just think you assign too much weight to them). Both classes are designed to be used in all valid scenarios that arise when COM objects are used, including obtaining the pointer-to-pointer. Yes, that's a bit too much functionality, but if you don't expect any specific negative consequences you can't tolerate you should just use those classes - they are designed exactly for this purpose.
I've had to tackle the same problem not too long ago, and I came up with two different solutions:
The first was a simple wrapper that encapsulated a 'writeable' pointer and could be std::moved into my smart pointer. This is just a little more convenient than using the temp pointers you are mentioning, since you cannot define the type directly at the call-site.
Therefore, I didn't stick with that. So what I did was a Retrieve helper-function that would get the COM function and return my smart-pointer (and do all the temporary pointer stuff internally). Now this trivially works with free-functions that only have a single T** parameter. If you want to use this on something more complex, you can just pass in the call via std::bind and only leave the pointer-to-be-returned free.
I know that this is not directly what you're asking, but I think it's a neat solution to the problem you're having.
As a side note, I'd prefer boost's intrusive_ptr instead of std::unique_ptr, but that's a matter of taste, as always.
Edit: Here's some sample code that's transferred from my version using boost::intrusive_ptr (so it might not work out-of-the box with unique_ptr)
template <class T, class PtrType, class PtrDel>
HRESULT retrieve(T func, std::unique_ptr<PtrType, PtrDel>& ptr)
{
PtrType* raw_ptr=nullptr;
HRESULT result = func(&raw_ptr);
ptr.reset(raw_ptr);
return result;
}
For example, it can be used like this:
std::unique_ptr<IFileDialog, ComDeleter> FileDialog;
/*...*/
using std::bind;
using namespace std::placeholders;
std::unique_ptr<IShellItem, ComDeleter> ShellItem;
HRESULT status = retrieve(bind(&IFileDialog::GetResult, FileDialog.get(), _1), ShellItem);
For bonus points, you can even let retrieve return the unique_ptr instead of taking it by reference. The functor that bind generates should have signature typedefs to derive the pointer type. You can then throw an exception if you get a bad HRESULT.
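A rough sketch of that "return the smart pointer and throw on failure" variant, written without Windows headers so it can be tried anywhere. Widget, ReleaseDeleter, Status, CreateWidget, FailWidget and retrieve_or_throw are all stand-ins invented for this example: Status plays the role of HRESULT (negative meaning failure), and Widget plays the role of a COM interface with a Release() method. Here the caller names the pointer type explicitly instead of deriving it from the functor.

```cpp
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for a reference-counted COM interface.
struct Widget {
    static int live;          // number of Widgets currently alive
    int refs = 1;
    Widget() { ++live; }
    ~Widget() { --live; }
    void Release() { if (--refs == 0) delete this; }
};
int Widget::live = 0;

// Deleter that releases the object instead of deleting it.
struct ReleaseDeleter {
    template <class T>
    void operator()(T* p) const { if (p) p->Release(); }
};

using Status = long;          // stand-in for HRESULT; negative means failure

// Hypothetical factory functions with a COM-style out parameter.
Status CreateWidget(Widget** out) { *out = new Widget; return 0; }
Status FailWidget(Widget** out) { *out = nullptr; return -1; }

// Calls func with a temporary raw pointer and hands ownership to a unique_ptr.
template <class PtrType, class Func>
std::unique_ptr<PtrType, ReleaseDeleter> retrieve_or_throw(Func func) {
    PtrType* raw = nullptr;
    Status status = func(&raw);
    std::unique_ptr<PtrType, ReleaseDeleter> owned(raw);  // own it before any throw
    if (status < 0) throw std::runtime_error("retrieve failed");
    return owned;
}
```

A call then looks like auto w = retrieve_or_throw<Widget>(CreateWidget); and a bound member function can be passed in place of CreateWidget in the same way as in the example above.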
C++0x smart pointers have a portable way to get at the contained raw pointer with .get(), or to release it entirely with .release(). You could also always use &(*ptr), but that is less idiomatic.
If you want to use smart pointers to manage the lifetime of an object, but still need raw pointers to use a library which doesn't support smart pointers (including standard c library) you can use those functions to most conveniently get at the raw pointers.
Remember, you still need to keep the smart pointer around for the duration you want the object to live (so be aware of its lifetime).
Something like:
call_com_function( my_uniq_ptr.get() ); // will work fine: my_uniq_ptr outlives the call
return my_localscope_uniq_ptr.get(); // will not: the object dies with the local unique_ptr
return my_member_uniq_ptr.get(); // might, if *this will be around for the duration, etc.
Note: this is just a general answer to your question. How to best use COM is a separate issue and sharptooth may very well be correct.
Use a helper function like this.
template< class T >
T*& getPointerRef ( std::unique_ptr<T> & ptr )
{
struct Twin : public std::unique_ptr<T>::_Mybase {};
Twin * twin = (Twin*)( &ptr );
return twin->_Myptr;
}
Check the implementation:
int wmain ( int argc, wchar_t* argv[] )
{
std::unique_ptr<char> charPtr ( new char ); // single object, so the plain delete below matches
delete getPointerRef(charPtr);
getPointerRef(charPtr) = 0;
return charPtr.get() != 0;
}