Marshalling simple and complex datatypes to/from Object^% / void* - c++-cli

I guess this will be simple for C++/CLI gurus.
I am creating a wrapper which will expose high-performance C++ native classes to a C# WinForms application.
Everything went fine with simple known objects, and I could also wrap a callback function as a delegate. But now I am a bit confused.
The native C++ class has the following method:
int GetProperty(int propId, void* propInOut)
At first I thought I could use void* as IntPtr, but then I found out that I need to access it from C#. So I thought about a wrapper method:
int GetProperty(int propId, Object^ propInOut)
but as I looked through the C++ source, I found out that the method needs to modify the objects. So obviously I need:
int GetProperty(int propId, Object^% propInOut)
Now I cannot pass Objects to native methods so I need to know how to treat them in the wrapper. As the caller should always know what kind of data he/she is passing/receiving, I declared a wrapper:
int GetProperty(int propId, int dataType, Object^% propInOut)
I guess, I can use it to pass reference and value types, for example, an int like this:
Object count = 100; // yeah, I know boxing is bad but this will not be real-time call anyway
myWrapper.GetProperty(Registry.PROP_SMTH, DATA_TYPE_INT, ref count);
I just added a bunch of dataType constants for all the data types I need:
DATA_TYPE_INT, DATA_TYPE_FLOAT, DATA_TYPE_STRING, DATA_TYPE_DESCRIPTOR, DATA_TYPE_BYTE_ARRAY
(DATA_TYPE_DESCRIPTOR is a simple struct with two fields: int Id and wstring Description - this type will be wrapped too, so I guess marshaling will be simple copying data back and forth; all the native strings are Unicode).
Now, the question is - how to implement the wrapper method for all these 5 types?
When can I just cast Object^% to something and pass it to the native method (is that safe for int and float?), when do I need to use pin_ptr, and when do I need more complex marshaling to native and back?
int GetProperty(int propId, int dataType, Object^% propInOut)
{
    if (dataType == DATA_TYPE_INT)
    {
        int* marshaledPropInOut = ???
        int result = nativeObject->GetProperty(propId, (void*)marshaledPropInOut);
        // need to do anything more?
        return result;
    }
    else if (dataType == DATA_TYPE_FLOAT)
    {
        float* marshaledPropInOut = ???
        int result = nativeObject->GetProperty(propId, (void*)marshaledPropInOut);
        // need to do anything more?
        return result;
    }
    else if (dataType == DATA_TYPE_STRING)
    {
        // will pin_ptr be needed or is the tracking reference in the declaration enough?
        // the pointers won't get stored anywhere in C++ later so I don't need AllocHGlobal
        int result = nativeObject->GetProperty(propId, (void*)marshaledPropInOut);
        // need to do anything more?
        return result;
    }
    else if (dataType == DATA_TYPE_BYTE_ARRAY)
    {
        // need to convert from managed byte[] to native char[] and back;
        // user has already allocated byte[] so I can get the size of array somehow
        return result;
    }
    else if (dataType == DATA_TYPE_DESCRIPTOR)
    {
        // I guess I'll have to do a dumb copying between native and managed struct,
        // the only problem is pinning of the string again before passing to the native
        return result;
    }
    return -1;
}
P.S. Maybe there is a more elegant solution for wrapping this void* method with many possible datatypes?

It doesn't necessarily make sense to equate a C# object to a void*. There isn't any way to marshal arbitrary data. Even with an object, C# still knows what type it is underneath, and for marshaling to take place -- meaning a conversion from the C++ world to C# or vice-versa -- the type of data needs to be known. A void* is just a pointer to memory of a completely unknown type, so how would you convert it to an object, where the type has to be known?
If you have a limited number of types, as you describe, that could be passed in from the C# world, it is best to write several overloads in your C++/CLI code, each of which takes one of those types. Each overload can then pin the argument (if necessary), convert it to a void*, pass that to the C++ function that takes a void*, and marshal the result back as appropriate for the type.
You could implement a case statement as you listed, but then what do you do if you can't handle the type that was passed in? The person calling the function from C# has no way to know what types are acceptable and the compiler can't help you figure out that you did something wrong.
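To make the overload idea concrete, here is a minimal sketch in standard C++ rather than C++/CLI, so it can be compiled anywhere. NativeObject, the PROP_* IDs, the "0 means success" convention and the assumption that the string property takes a std::wstring* are all made up for illustration; none of them come from the question. In a real C++/CLI wrapper the parameters would be tracking references (int%, String^%) instead of native references, and the string overload would need real marshaling.

```cpp
#include <cassert>
#include <string>

// Hypothetical property IDs; the real ones are not shown in the question.
constexpr int PROP_COUNT = 1;
constexpr int PROP_RATIO = 2;
constexpr int PROP_NAME  = 3;

// Hypothetical stand-in for the native class. Assumed contract: the caller
// passes a pointer to the type that matches propId, and 0 means success.
struct NativeObject {
    int GetProperty(int propId, void* propInOut) {
        switch (propId) {
        case PROP_COUNT: *static_cast<int*>(propInOut) += 1;               return 0;
        case PROP_RATIO: *static_cast<float*>(propInOut) = 0.5f;           return 0;
        case PROP_NAME:  *static_cast<std::wstring*>(propInOut) = L"demo"; return 0;
        default:         return -1;
        }
    }
};

class Wrapper {
public:
    // One overload per supported type: the compiler rejects anything else.
    int GetProperty(int propId, int& value)          { return Call(propId, value); }
    int GetProperty(int propId, float& value)        { return Call(propId, value); }
    int GetProperty(int propId, std::wstring& value) { return Call(propId, value); }

private:
    template <typename T>
    int Call(int propId, T& value) {
        T local = value;                                   // copy in
        int result = native_.GetProperty(propId, &local);  // native code sees only the local
        if (result == 0) value = local;                    // copy back on success
        return result;
    }
    NativeObject native_;
};

// Usage check for the sketch above.
inline void DemoOverloads() {
    Wrapper w;
    int count = 100;
    assert(w.GetProperty(PROP_COUNT, count) == 0 && count == 101);
}
```

Note that the overloads only restrict the set of types. Nothing stops a caller from pairing a property ID with the wrong type, so a real wrapper would probably want one named method per property, or a check of propId against the expected type.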

Related

I am about to use dlopen() to open a shared object. Do I need to include the corresponding headers of the shared object?

I have to use dlopen() and access functions from shared object in my code. Do I need to include headers of corresponding functions of shared object ?
Because of the way dlopen() and dlsym() operate, I don't see how that would accomplish anything. Very roughly speaking, dlopen() maps the library binary into your program's address space and makes its exported symbols (i.e. global functions and variables) available for lookup by name.
Because the library was not linked to your program at compile-time, there's no way your code could possibly know the instruction addresses of these new functions tacked on at run-time. The only way to access a run-time dynamically linked symbol is via a pointer obtained from dlsym().
You have to create a function pointer for each and every library definition that you want to use. If you want to call them like regular functions, in C-language you can manually typedef type definitions for the function pointers, specifying their parameters and return values, then you can call the pointers just like regular functions. But note that you have to define all of these manually. Including the library header doesn't help.
In C++ I think there are issues with storing dlsym() output in a typedef'd pointer due to stricter standards, but this should work in C:
addlib.c (libaddlib.dylib):
int add(int x, int y) {
return x+y;
}
myprogram.c:
#include <stdio.h>
#include <dlfcn.h>
typedef int (*add_t)(int, int);
int main(void) {
    void *lib_handle;
    add_t add; // call this anything you want...it's a pointer, it doesn't care
    lib_handle = dlopen("libaddlib.dylib", RTLD_NOW);
    if (lib_handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }
    add = (add_t)dlsym(lib_handle, "add");
    if (add == NULL) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(lib_handle);
        return 1;
    }
    printf("Sum is %d\n", add(17, 23));
    dlclose(lib_handle); // remove library from address space
    return 0;
}
(Update: I compiled the dylib and myprogram...it works as expected.)
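On the C++ concern raised above: as far as I know the usual approach is an explicit reinterpret_cast on the result of dlsym(). Here is a minimal sketch, assuming a dynamically linked POSIX program (older glibc versions may also need -ldl at link time). To avoid depending on a separate library it looks up strlen from the already-loaded C library; RuntimeStrlen and strlen_t are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <dlfcn.h>

// Hand-written pointer type; it must match the real prototype of strlen.
using strlen_t = std::size_t (*)(const char*);

// Resolves strlen at run time and calls it through the typed pointer.
// Returns static_cast<std::size_t>(-1) if the lookup fails.
std::size_t RuntimeStrlen(const char* s) {
    // A null path yields a handle for the running program; symbols from the
    // libraries it already loaded (libc here) can be searched through it.
    void* handle = dlopen(nullptr, RTLD_NOW);
    if (handle == nullptr) return static_cast<std::size_t>(-1);

    // dlsym returns void*; C++ needs an explicit reinterpret_cast to get a
    // function pointer. The conversion is only conditionally supported by the
    // C++ standard, but POSIX platforms rely on it working.
    strlen_t fn = reinterpret_cast<strlen_t>(dlsym(handle, "strlen"));
    std::size_t length = (fn != nullptr) ? fn(s) : static_cast<std::size_t>(-1);

    dlclose(handle);
    return length;
}

// Usage check for the sketch above.
inline void DemoRuntimeStrlen() {
    assert(RuntimeStrlen("hello") == 5);
}
```

If the pointer type does not match the function that the symbol really names, the call is undefined behaviour and the compiler cannot warn about it, which is the price of doing the linking by hand.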

How can I use a 'native' pointer in a reference class in C++/CLI?

I am trying to write a small library which will use DirectShow. This library is to be utilised by a .NET application so I thought it would be best to write it in C++/CLI.
I am having trouble with this line however:
HRESULT hr = CoCreateInstance( CLSID_FilterGraph,
NULL,
CLSCTX_INPROC_SERVER,
IID_IGraphBuilder,
(void**)(&graphBuilder) ); //error C2440:
Where graphBuilder is declared:
public ref class VideoPlayer
{
public:
VideoPlayer();
void Load(String^ filename);
IGraphBuilder* graphBuilder;
};
If I am understanding this page correctly, I can use */& as usual to denote 'native' pointers to unmanaged memory in my C++/CLI library; ^ is used to denote a pointer to a managed object. However, this code produces:
error C2440: 'type cast' : cannot convert from 'cli::interior_ptr<Type>' to 'void **'
The error suggests that graphBuilder is considered to be a 'cli::interior_ptr<Type>'. That is a pointer/handle to managed memory, isn't it? But it is a pure native pointer. (I am not trying to pass the pointer to a method expecting a handle or vice versa; I simply want to store it in my managed class.) If so, how do I say graphBuilder is to be a 'traditional' pointer?
(This question is similar but the answer, to use a pin_ptr, I do not see helping me, as it cannot be a member of my class)
The error message is a bit cryptic, but the compiler is trying to remind you that you cannot pass a pointer to a member of a managed class to unmanaged code. That cannot work by design: disaster strikes when the garbage collector kicks in while the function is executing and moves the managed object, invalidating the pointer to the member and causing the native code to spray bytes into the GC heap at the wrong address.
The workaround is simple, just declare a local variable and pass a pointer to it instead. Variables on the stack can't be moved. Like this:
void init() {
    IGraphBuilder* builder; // Local variable, okay to pass its address
    HRESULT hr = CoCreateInstance(CLSID_FilterGraph,
                                  NULL,
                                  CLSCTX_INPROC_SERVER,
                                  IID_IGraphBuilder,
                                  (void**)(&builder) );
    if (SUCCEEDED(hr)) {
        graphBuilder = builder;
        // etc...
    }
}

What kind of pointer is returned if I use "&" to get the address of a value type in C++/CLI?

Suppose I write the following code:
public ref class Data
{
public:
    Data()
    {
    }
    Int32 Age;
    Int32 Year;
};
void Test()
{
    int age = 30;
    Int32 year = 2010;
    int* pAge = &age;
    int* pYear = &year;
    Data^ data = gcnew Data();
    int* pDataYear = &data->Year; // &data->Year is an interior pointer, so the compiler reports an error
}
If you compile the program, the compiler reports an error:
error C2440: 'initializing' : cannot convert from 'cli::interior_ptr<Type>' to 'int *'
So I learned that "&data->Year" is an interior pointer.
UPDATES: I tried to use "&(data->Year)", same error.
But how about pAge and pYear?
Are they native pointers, interior pointers or pinned pointers??
If I want to use them in the following native function:
void ChangeNumber(int* pNum);
Will it be safe to pass either pAge or pYear?
They (pAge and pYear) are native pointers, and passing them to a native function is safe. Stack variables (locals with automatic storage lifetime) are not subject to being rearranged by the garbage collector, so pinning is not necessary.
Copying managed data to the stack, then passing it to native functions, solves the gc-moving-managed-data-around problem in many cases (of course, don't use it in conjunction with callbacks that expect the original variable to be updated before your wrapper has a chance to copy the value back).
To get a native pointer to managed data, you have to use a pinning pointer. This can be slower than the method of copying the value to the stack, so use it for large values or when you really need the function to operate directly on the same variable (e.g. the variable is used in callbacks or multi-threading).
Something like:
pin_ptr<int> p = &mgd_obj.field;
See also the MSDN documentation

Creating a global "null" struct for re-use in C program?

Not sure what I'm doing wrong here. I have a struct that is used heavily through my program.
typedef struct _MyStruct {
// ... handful of non-trivial fields ...
} MyStruct;
I expect (read, intend) for lots of parts of the program to return one of these structs, but many of them should be able to return a "null" struct, which is a singleton/global. The exact use case is for the implementing function to say "I can't find what you asked me to return".
I assumed this would be a simple case of defining a variable in a header file, and initializing it in the .c file.
// MyStruct.h
// ... Snip ...
MyStruct NotFoundStruct;
-
// MyStruct.c
NotFoundStruct.x = 0;
NotFoundStruct.y = 0;
// etc etc
But the compiler complains that the initialization is not constant.
Since I don't care about what this global actually references in memory, I only care that everything uses the same global, I tried just removing the initialization and simply leaving the definition in the header.
But when I do this:
MyStruct thing = give_me_a_struct(some_input);
if (thing == NotFoundStruct) {
// ... do something special
}
The compiler complains that the operands to the binary operator "==" (or "!=") are invalid.
How does one define such a globally re-usable (always the same memory address) struct?
This doesn't directly answer your question, but it won't fit in a comment...
If you have a function that may need to return something or return nothing, there are several options that are better than returning a "null struct" or "sentinel struct," especially since structs are not equality comparable in C.
One option is to return a pointer, so that you can actually return NULL to indicate that you are really returning nothing; this has the disadvantage of having significant memory management implications, namely who owns the pointer? and do you have to create an object on the heap that doesn't already exist on the heap to do this?
A better option is to take a pointer to a struct as an "out" parameter, use that pointer to store the actual result, then return an int status code indicating success or failure (or a bool if you have a C99 compiler). This would look something like:
int give_me_a_struct(MyStruct*);
MyStruct result;
if (give_me_a_struct(&result)) {
    // yay! we got a result!
}
else {
    // boo! we didn't get a result!
}
If give_me_a_struct returns zero, it indicates that it did not find the result and the result object was not populated. If it returns nonzero, it indicates that it did find the result and the result object was populated.
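C doesn't allow assignment statements at file scope; outside a function you can only declare variables, optionally with a constant initializer. So the assignments must go in a function:
void init() {
    NotFoundStruct.x = 0;
    NotFoundStruct.y = 0;
}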
As for the comparison, C doesn't know how to apply a == operator to a struct. You can overload (redefine) the operator in C++, but not in C.
So to see if a return value is empty, your options are to:
1. Have each function return a boolean value to indicate found or not, and return the struct's values via pointers through the argument list. (eg. bool found = give_me_a_struct(some_input, &thing);)
2. Return a pointer to a struct, which can be NULL if nothing exists. (eg. MyStruct* thing = give_me_a_struct(some_input);)
3. Add an additional field to the struct that indicates whether the object is valid.
The third option is the most generic for other cases, but requires more data to be stored. The best bet for your specific question is the first option.
// MyStruct.h
typedef struct _MyStruct {
// fields
} MyStruct;
extern MyStruct NotFoundStruct;
// MyStruct.c
#include "MyStruct.h"
MyStruct NotFoundStruct = {0};
But since you can't use the == operator, you will have to find another way to distinguish it. One (not ideal) way is to have a bool flag reserved to indicate validity. That way, only that must be checked to determine if it's a valid instance.
But I think you should consider James's proposed solution instead
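For completeness, here is a rough sketch of the validity-flag idea mentioned above, written so that it compiles as C++. The valid field, the lookup table and the body of give_me_a_struct are hypothetical; only the x and y fields and the NotFoundStruct name come from the question.

```cpp
#include <cassert>

struct MyStruct {
    bool valid;  // hypothetical flag: false only in the sentinel
    int x;
    int y;
};

// In a multi-file project this would be declared `extern` in the header and
// defined once in a source file, as in the answer above.
const MyStruct NotFoundStruct = {false, 0, 0};

// Hypothetical lookup: returns the entry whose x equals key, else the sentinel.
MyStruct give_me_a_struct(int key) {
    static const MyStruct table[] = {{true, 1, 10}, {true, 2, 20}};
    for (const MyStruct& entry : table) {
        if (entry.x == key) return entry;
    }
    return NotFoundStruct;
}

// Usage check for the sketch above.
inline void DemoSentinel() {
    MyStruct thing = give_me_a_struct(2);
    assert(thing.valid && thing.y == 20);
    assert(!give_me_a_struct(7).valid);
}
```

The caller tests thing.valid instead of comparing whole structs, which sidesteps the missing == operator at the cost of one extra field per instance.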
In the header:
// Structure definition then
extern MyStruct myStruct;
In the .c that contains global data
MyStruct myStruct =
{
    initialize field 1,
    initialize field 2,
    // etc...
};

Tracking reference in C++/CLI

Can someone please explain the following code snippet to me?
value struct ValueStruct {
int x;
};
void SetValueOne(ValueStruct% ref) {
ref.x = 1;
}
void SetValueTwo(ValueStruct ref) {
ref.x = 2;
}
void SetValueThree(ValueStruct^ ref) {
ref->x = 3;
}
ValueStruct^ first = gcnew ValueStruct;
first->x = 0;
SetValueOne(*first);
ValueStruct second;
second.x = 0;
SetValueTwo(second); // am I creating a copy or what? is this copy Disposable even though value types don't have destructors?
ValueStruct^ third = gcnew ValueStruct;
third->x = 0;
SetValueThree(third); // same as the first ?
And my second question is: is there any reason to have something like that?:
ref struct RefStruct {
int x;
};
RefStruct% ref = *gcnew RefStruct;
// rather than:
// RefStruct^ ref = gcnew RefStruct;
// can I retrieve my handle from ref?
// RefStruct^ myref = ???
What is more: I see no difference between a value type and a ref type, since both can be pointed to by a handle ;(
Remember that the primary use of C++/CLI is for developing class libraries for consumption by GUIs / web services built in other .NET languages. So C++/CLI has to support both reference and value types because other .NET languages do.
Furthermore, C# can have ref parameters that are value typed as well, this isn't unique to C++/CLI and it doesn't in any way make value types equivalent to reference types.
To answer the questions in your code comments:
am I creating a copy or what?
Yes, SetValueTwo takes its parameter by value, so a copy is made.
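The copy is easy to observe. Below is a small sketch in native C++, where & plays the role that the tracking reference % plays in the question's code; as far as I can tell the by-value behaviour is the same in C++/CLI. DemoCopySemantics is just a name made up for this check.

```cpp
#include <cassert>

struct ValueStruct { int x; };

void SetValueOne(ValueStruct& ref) { ref.x = 1; }   // writes to the caller's object
void SetValueTwo(ValueStruct copy) { copy.x = 2; }  // writes to a copy that is then discarded

// Usage check for the sketch above.
inline void DemoCopySemantics() {
    ValueStruct second{0};
    SetValueTwo(second);
    assert(second.x == 0);  // caller's value is unchanged
    SetValueOne(second);
    assert(second.x == 1);  // caller's value was modified through the reference
}
```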
is this copy Disposable even though value types don't have destructors?
Incorrect. Value types can have destructors. Value types cannot have finalizers. Since this particular value type has a trivial destructor, the C++/CLI compiler will not cause it to implement IDisposable. In any case, if a parameter is an IDisposable value type, the C++/CLI compiler will ensure that Dispose is called when the variable goes out of scope, just like stack semantics for local variables. This includes abnormal termination (thrown exception), and allows managed types to be used with RAII.
Both
ValueStruct% ref = *gcnew ValueStruct;
and
ValueStruct^ ref = gcnew ValueStruct;
are allowed, and put a boxed value type instance on the managed heap (which isn't a heap in the data-structure sense: objects are allocated sequentially and compacted by the garbage collector, but Microsoft chooses to call it a heap like the native memory area for dynamic allocation).
Unlike C#, C++/CLI can keep typed handles to boxed objects.
If a tracking reference is to a value type instance on the stack or embedded in another object, then the value type content has to be boxed in the process of forming the reference.
Tracking references can also be used with reference types, and the syntax to obtain a handle is the same:
RefClass^ newinst = gcnew RefClass();
RefClass% reftoinst = *newinst;
RefClass^% reftohandle = newinst;
RefClass stacksem;
RefClass^ ssh = %stacksem;
One thing that I can never seem to remember completely is that the syntax isn't 100% consistent compared to native C++.
Declare a reference:
int& ri = i; // native
DateTime% dtr = dt; // managed tracking reference
Declare a pointer:
int* pi; // native
Stream^ sh; // tracking handle
Form a pointer:
int* pi = &ri; // address-of native object
DateTime^ dth = %dtr; // address-of managed object
Note that the unary address-of operator is the same as the reference notation in both standard C++ and C++/CLI. This seems to contradict the MSDN statement that "a tracking reference cannot be used as a unary take-address operator", which I'll get back to in a second.
First though, the inconsistency:
Form a reference from a pointer:
int& iref = *pi;
DateTime% dtref = *dth;
Note that the unary dereference operator is always *. It is the same as the pointer notation only in the native world, which is completely opposite of address-of which, as mentioned above, are always the same symbol as the reference notation.
Compilable example:
DateTime^ dth = gcnew DateTime();
DateTime% dtr = *dth;
DateTime dt = DateTime::Now;
DateTime^ dtbox = %dt;
FileInfo fi("temp.txt");
// FileInfo^ fih = &fi; causes error C3072
FileInfo^ fih = %fi;
Now, about unary address-of:
First, the MSDN article is wrong when it says:
The following sample shows that a tracking reference cannot be used as a unary take-address operator.
The correct statement is:
% is the address-of operator for creation of a tracking handle. However its use is limited as follows:
A tracking handle must point to an object on the managed heap. Reference types always exist on the managed heap so there is no problem. However, value types and native types may be on the stack (for local variables) or embedded within another object (member variables of value type). Attempts to form a tracking handle will form a handle to a boxed copy of the variable: the handle is not linked to the original variable. As a consequence of the boxing process, which requires metadata which does not exist for native types, it is never possible to have a tracking handle to an instance of a native type.
Example code:
int i = 5;
// int^ ih = %i; causes error C3071
System::Int32 si = 5;
// System::Int32^ sih = %si; causes error C3071
// error C3071: operator '%' can only be applied to an instance
// of a ref class or a value-type
If System::Int32 isn't a value type then I don't know what is. Let's try System::DateTime which is a non-primitive value type:
DateTime dt = DateTime::Now;
DateTime^ dtbox = %dt;
This works!
As a further unfortunate restriction, primitive types which have dual identity (e.g. native int and managed value type System::Int32) are not handled correctly: the % (form tracking handle) operator cannot perform boxing even when the .NET name for the type is given.