Copy unmanaged data into managed array - c++-cli

I need to copy native (i.e. unmanaged) data (byte*) to a managed byte array with C++/CLI (array<byte>).
I tried Marshal::Copy (data is pointed to by const void* data and is dataSize bytes)
array<byte>^ _Data=gcnew array<byte>(dataSize);
System::Runtime::InteropServices::Marshal::Copy((byte*)data, _Data, 0, dataSize);
This gives error C2665: none of the 16 overloads can convert all parameters. Then I tried
System::Runtime::InteropServices::Marshal::Copy(new IntPtr(data), _Data, 0, dataSize);
which produces error C2664: parameter 1 cannot be converted from "const void*" to "__w64 int".
So how can it be done and is Marshal::Copy indeed the "best" (simplest/fastest) way to do so?

As you've noted, Marshal::Copy (and .NET in general) is not const-safe.
However, the usual C and C++ functions are. You can write either:
array<byte>^ data_array =gcnew array<byte>(dataSize);
pin_ptr<byte> data_array_start = &data_array[0];
memcpy(data_array_start, data, dataSize);
or to avoid pinning:
array<byte>^ data_array =gcnew array<byte>(dataSize);
for( int i = 0; i < data_array->Length; ++i )
data_array[i] = static_cast<const byte*>(data)[i]; // data is a const void*, so cast before indexing
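For comparison, here is a minimal sketch of the same memcpy idea in standard C++, with a std::vector standing in for the managed array. The name copy_bytes is hypothetical and not part of any answer above. It only illustrates that memcpy accepts a const void* source directly, and that the empty case is worth guarding before taking the address of the first element.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copies dataSize bytes from an untyped, read-only
// source into a freshly allocated buffer.
std::vector<unsigned char> copy_bytes(const void* data, std::size_t dataSize) {
    assert(data != nullptr || dataSize == 0);
    std::vector<unsigned char> out(dataSize);
    if (dataSize != 0) {
        // memcpy takes a const void* source, so no const_cast is needed.
        std::memcpy(out.data(), data, dataSize);
    }
    return out;
}
```

The guard on dataSize matters because there is no first element to point at when the buffer is empty.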

"IntPtr" is just a wrapper around a "void *". You shouldn't need the new syntax, just use of the explicit conversion operator.
System::Runtime::InteropServices::Marshal::Copy( IntPtr( ( void * ) data ), _Data, 0, dataSize );
Should work.

All these answers dance around the real misunderstanding in the original question. The essential mistake is that this code:
System::Runtime::InteropServices::Marshal::Copy(new IntPtr(data),
_Data,
0,
dataSize)
is incorrect. You don't new (or gcnew) an IntPtr; it's a value type. One of the answers shows this, but it doesn't point out the original misunderstanding. The correct code can be expressed this way:
System::Runtime::InteropServices::Marshal::Copy(IntPtr((void *)data),
_Data,
0,
dataSize)
This confused me too when I first started using these constructs.
IntPtr is a struct in C#, i.e. a value type.

The C++/CLI compiler is a bit obtuse about this. The formal definition of IntPtr is "native integer", it is not a pointer type. The C++ language however only allows conversion of void* to a pointer type. The CLI supports pointer types but there are very few framework methods that accept them. Marshal::Copy() doesn't. One of the three IntPtr constructors does.
You have to whack the compiler over the head with a cast or by using the IntPtr constructor. It is anybody's guess if this will still work on a 128-bit operating system, I'm not going to worry about it for a while.
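The "native integer" idea has a rough counterpart in standard C++: std::intptr_t is an integer type wide enough to hold a pointer, and converting to it also needs an explicit cast. The sketch below is only an analogy, not how IntPtr is implemented, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

static_assert(sizeof(std::intptr_t) >= sizeof(void*),
              "intptr_t must be able to hold a pointer value");

// Converts a read-only pointer to a pointer-sized integer.
std::intptr_t to_native_int(const void* p) {
    return reinterpret_cast<std::intptr_t>(p);
}

// Converts the integer back to the pointer it came from.
const void* from_native_int(std::intptr_t value) {
    return reinterpret_cast<const void*>(value);
}
```

Because the integer carries no constness, the round trip does not need to cast away const, which is the step the C++/CLI IntPtr constructor forces on you.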

System::Runtime::InteropServices::Marshal::Copy(new
IntPtr((void*)data), _Data, 0, dataSize);
Pay attention to (void*), which casts away the const from const void* so that the IntPtr constructor can take it as an argument.

Related

How to declare native array of fixed size in Perl 6?

I am trying to declare the following C struct in Perl 6:
struct myStruct
{
int A[2]; //<---NEED to declare this
int B;
int C;
};
My problem is that I don't know how to declare the int A[2]; part using the built-in NativeCall API.
So far what I have is:
class myStruct is repr('CStruct') {
has CArray[int32] $.A;
has int32 $.B;
has int32 $.C;
};
However, I know that the has CArray[int32] $.A; part is wrong as it does not declare a part in my struct that takes up ONLY 2 int32 sizes.
Update 2: It turned out that this didn't work at the time I first posted this answer, hence the comments. I still haven't tested it but it must surely work per Tobias's answer to Passing an inlined CArray in a CStruct to a shared library using NativeCall. \o/
I haven't tested this, but this should work when using Rakudo compiler release 2018.05:
use NativeCall;
class myStruct is repr('CStruct') {
HAS int32 @.A[2] is CArray;
has int32 $.B;
has int32 $.C;
}
HAS instead of has causes the attribute to be inline rather than a pointer;
int32 instead of int is because the Perl 6 int type isn't the same as C's int type but is instead platform specific (and usually 64 bit);
@ instead of $ marks the attribute as being Positional ("supports looking up values by index") instead of scalar (which gets treated as a single thing);
[2] "shapes" the Positional data to have 2 elements;
is CArray binds a CArray as the Positional data's container logic;
This commit from April this year wired up the is repr('CStruct') to use the declared attribute information to appropriately allocate memory.
Fwiw I found out about this feature from a search of the #perl6 logs for CArray and found out it had landed in master and 2018.05 from a search of Rakudo commits for the commit message title.
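To make the target layout concrete, here is a small C++ sketch of the struct from the question, using int32_t so the sizes match the int32 attributes above. The helper function names are hypothetical. It shows what "inline" means on the C side: the two elements of A occupy the first eight bytes of the struct itself, and B follows directly after them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mirrors the C struct from the question: A is stored inline, not as a pointer.
struct MyStruct {
    std::int32_t A[2];
    std::int32_t B;
    std::int32_t C;
};

// Byte offset of B: it comes right after the two inline elements of A.
std::size_t offset_of_B() { return offsetof(MyStruct, B); }

// Total size: four int32 values.
std::size_t size_of_struct() { return sizeof(MyStruct); }
```

A binding that stores A as a pointer instead would place a memory address where the C library expects the two integers.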
See Declaring an array inside a Perl 6 NativeCall CStruct
There are other ways, but the easiest is instead of an array, just declare each individual item.
class myStruct is repr('CStruct') {
has int32 $.A0;
has int32 $.A1;
... as many items as you need for your array ...
has int32 $.B;
has int32 $.C;
};
So I've done some experimentation on this and taken a look at the docs and it looks like the CArray type doesn't handle shaping the same way as Perl6 Arrays.
The closest thing you've got is the allocate constructor, which preallocates space in the array but doesn't enforce the size, so you can still add more things.
Your class definition is fine but you'd want to allocate the array in the BUILD submethod.
https://docs.raku.org/language/nativecall#Arrays
(Further thought)
You could have two objects. One internal and one for the struct.
The struct has a CArray[int32] array. The internal data object has a shaped int32 array, my int32 @a[2]. Then you just need to copy between the two.
The getters and setter live on the main object and you use the struct object just when you want to talk to the lib?
This does not really declare an array of fixed size, but puts a constraint on the size of its value: you can try to use where to constrain the size of the array. CArray is not a Positional (and thus cannot be declared with the @ sigil) but it does have the elems method.
use NativeCall;
my CArray[int32] $A where .elems < 2
That is, at least, syntactically correct. Whether that breaks the program in some other place remains to be seen. Can you try that?

Can [in] parameters be used for returning data?

I'm not quite sure how [in] and [out] interact with the pass-by-value and pass-by-reference concepts. The MSDN documentation clearly states that [in] means data flows from caller to callee, and [out] is required for data to flow from callee to caller.
However someone suggested to me that I use [in] parameters for objects where the caller can retrieve the results.
Example method definition in IDL:
HRESULT _stdcall a_method( [in] long *arg1, [in] BSTR arg2, [in] IAnObject *arg3 );
In my server's implementation of this method (using C++), I can write:
*arg1 = 20;
arg2[0] = L'X'; // after checking length of string is not 0
arg3->set_value(50);
In the client code, using C++:
long val1 = 10;
BSTR val2 = SysAllocString(L"hello");
IAnObject *val3 = AnObject_Factory::Create();
ptr->a_method(&val1, val2, val3);
When I tried this out (using my object via in-process server), all three changes from the server were propagated to the client, i.e. val1 == 20, val2 was "Xello", and val3->get_value() got 50.
My question is: Is this guaranteed behaviour, i.e. if I am using out-of-process server, or DCOM to another machine, will it see the same changes in val1, val2, and val3 ?
I previously thought that [in] indicated to the underlying RPC that the argument only had to be marshaled in one direction; it didn't have to try and send changes back to the caller. But now I am not so sure.
I am intending that my object is Automation-compatible (i.e. usable from VB6, Java etc. - no custom marshaling required), and that it ought to be able to be used via DCOM instead of in-process, without any changes required in the client code.
You shouldn't change the contents of [in] arguments, so the following code is wrong:
*arg1 = 20;
arg2[0] = L'X'; // after checking length of string is not 0
You're seeing the changes being reflected because you're making calls in the same apartment, where marshaling isn't happening. The proper way to return values is with [out] or [in, out] arguments.
However, you may access its contents and call its methods (for interface pointers), so the following code is right:
arg3->set_value(50);
EDIT: Further answering your questions.
Marshaling can occur both ways, and the [in] and [out] attributes tell the way(s).
For automation, I recommend you don't return more than the typical [out, retval] argument, to support scripting languages. If you must return multiple values, return an IDispatch with properties. Take a look at this blog post as a good starting point if you're taking scriptable automation seriously.
To expand upon @Paulo-madeira's answer, I can guarantee that if a proxy is involved, that
*arg1 = 20;
arg2[0] = L'X'; // after checking length of string is not 0
will at best be ignored, and at worst will corrupt the heap.
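The difference between the in-process result and the marshaled case can be imitated in plain C++. This is only a simulation under simplifying assumptions, not real COM: the function names are hypothetical, and a real proxy/stub pair does far more than copy one value.

```cpp
#include <cassert>

// Stand-in for the server method: it (wrongly) writes through an [in] pointer.
void server_method(long* arg1) {
    *arg1 = 20;
}

// Same-apartment call: the server receives the caller's own pointer,
// so the write lands in the caller's variable.
long call_direct(long value) {
    server_method(&value);
    return value;
}

// Simulated proxy/stub call: the [in] argument is copied to the server
// side and nothing is copied back to the caller.
long call_through_proxy(long value) {
    long marshaled_copy = value;     // what the stub hands to the server
    server_method(&marshaled_copy);  // the write lands in the copy only
    return value;                    // the caller's variable is unchanged
}
```

With a starting value of 10, the direct call yields 20 while the simulated proxy call still yields 10, which is why the question's test only appeared to work.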

Which Marshal::Copy overload is used?

Please consider the following C++/CLI code:
typedef unsigned __int8 uint8_t;
...
uint8_t unmanaged_buf[MAVLINK_MAX_PACKET_LEN];
array<uint8_t>^ Buffer;
...
Marshal::Copy((IntPtr)unmanaged_buf, Buffer, 0, len);
Is the following the Marshal::Copy() method that is used?
Marshal::Copy Method (IntPtr, array<Byte>, Int32, Int32)
PS: The MSDN URL for the above method is at: http://msdn.microsoft.com/en-us/library/ms146631.aspx
If it is, is it because Byte is the type that is closest to unsigned __int8? Specifically, how does the Visual C++ compiler determine which method overload to use?
From MSDN documentation about __int8:
The types __int8, __int16, and __int32 are synonyms for the ANSI types that have the same size, and are useful for writing portable code that behaves identically across multiple platforms. The __int8 data type is synonymous with type char, …
This doesn't say anything about the unsigned versions of the types, but I think it makes sense to assume that unsigned __int8 is synonymous with unsigned char.
And from .NET Framework Equivalents to C++ Native Types:
The following table shows the keywords for built-in Visual C++ types, which are aliases of predefined types in the System namespace.
unsigned char: System.Byte
Putting this together, unsigned __int8 is synonymous to an alias of System.Byte, which means it is the same as System.Byte in C++/CLI code.
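The same reasoning can be tried out in standard C++ with a hypothetical overload set loosely modelled on the array element types Marshal::Copy accepts. The sketch assumes std::uint8_t is a typedef for unsigned char, which holds on the mainstream compilers but is not promised by the standard. If it were a distinct type, the call would not compile.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical overloads, one per element type. These are not the real
// Marshal::Copy signatures.
std::string pick(const unsigned char*) { return "Byte"; }
std::string pick(const short*)         { return "Int16"; }
std::string pick(const int*)           { return "Int32"; }
std::string pick(const double*)        { return "Double"; }

// A uint8_t buffer decays to a pointer to unsigned char, so only the
// first overload is viable.
std::string overload_for_uint8_buffer() {
    std::uint8_t buf[4] = {0, 1, 2, 3};
    return pick(buf);
}
```

Since a typedef does not create a new type, the compiler never has to pick a "closest" match here: there is exactly one overload whose parameter type is the same type.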

ComBSTR assignment

I'm confused about COM string assignments. Which of the following string assignments is correct, and why?
CComBSTR str;
.
.
Obj->str = L"" //Option1
OR should it be
Obj->str = CComBSTR(L"") //Option2
What is the reason?
A real BSTR is:
temporarily allocated from the COM heap (via SysAllocString() and family)
a data structure in which the string data is preceded by its length, stored in a 32-bit value.
passed as a pointer to the fifth byte of that data structure, where the string data resides.
See the documentation:
MSDN: BSTR
Most functions which accept a BSTR will not crash when passed a BSTR created by simple assignment. This leads to confusion as people observe what seems to be working code from which they infer that a BSTR can be initialized just like any WCHAR *. That inference is incorrect.
Only real BSTRs can be passed to OLE Automation interfaces.
By using the CComBSTR() constructor, which calls SysAllocString(), your code will create a real BSTR. The CComBSTR() destructor will take care of returning the allocated storage to the system via SysFreeString().
If you pass the CComBSTR() to an API which takes ownership, be sure to call the .Detach() method to ensure the BSTR is not freed. BSTRs are not reference counted (unlike COM objects, which are), and therefore an attempt to free a BSTR more than once will crash.
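The layout described above can be sketched in portable C++. The helpers below are hypothetical imitations, not SysAllocString() or any other real API: they use malloc instead of the COM allocator and char16_t instead of OLECHAR. They show the 4-byte prefix (which holds the byte count of the string data, excluding the terminator) and why the pointer handed out is the address of the fifth byte.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Allocates [4-byte length][string data][null terminator] and returns a
// pointer to the string data, i.e. just past the prefix.
char16_t* alloc_prefixed(const char16_t* src, std::uint32_t chars) {
    std::uint32_t bytes = chars * sizeof(char16_t);
    unsigned char* block = static_cast<unsigned char*>(
        std::malloc(sizeof(std::uint32_t) + bytes + sizeof(char16_t)));
    if (block == nullptr) return nullptr;
    std::memcpy(block, &bytes, sizeof bytes);          // length prefix
    std::memcpy(block + sizeof bytes, src, bytes);     // string data
    std::memset(block + sizeof bytes + bytes, 0, sizeof(char16_t));
    return reinterpret_cast<char16_t*>(block + sizeof bytes);
}

// Reads the length back from the four bytes in front of the string data.
std::uint32_t prefixed_length(const char16_t* s) {
    std::uint32_t bytes;
    std::memcpy(&bytes,
                reinterpret_cast<const unsigned char*>(s) - sizeof bytes,
                sizeof bytes);
    return bytes / sizeof(char16_t);
}

// Frees the whole block, starting at the prefix rather than at the data.
void free_prefixed(char16_t* s) {
    if (s != nullptr) {
        std::free(reinterpret_cast<unsigned char*>(s) - sizeof(std::uint32_t));
    }
}
```

A plain string literal has no such prefix, so any code that reads the length from in front of the pointer would read unrelated memory.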
If you use str = CComBSTR(L"") you use the constructor:
CComBSTR( LPCOLESTR pSrc );
If you use str = L"" you use the assignment operator:
CComBSTR& operator =(LPCOLESTR pSrc);
They both would initialize the CComBSTR object correctly.
Personally, I'd prefer option 1, because that doesn't require constructing a new CComBSTR object. (Whether their code does so behind the scenes is a different story, of course.)
Option 1 is preferred because it does only one allocation for the string, whereas option 2 does two (notwithstanding the creation of a new temporary object for no particular reason). Unlike the _bstr_t type in VC++, the ATL one does not do reference-counted strings, so it will copy the entire string across.

P/Invoke with [Out] StringBuilder / LPTSTR and multibyte chars: Garbled text?

I'm trying to use P/Invoke to fetch a string (among other things) from an unmanaged DLL, but the string comes out garbled, no matter what I try.
I'm not a native Windows coder, so I'm unsure about the character encoding bits. The DLL is set to use "Multi-Byte Character Set", which I can't change (because that would break other projects). I'm trying to add a wrapper function to extract some data from some existing classes. The string in question currently exists as a CString, and I'm trying to copy it to an LPTSTR, hoping to get it into a managed StringBuilder.
This is what I have done that I believe is the closest to being correct (I have removed the irrelevant bits, obviously):
// unmanaged function
DLLEXPORT void Test(LPTSTR result)
{
// eval->result is a CString
_tcscpy(result, (LPCTSTR)eval->result);
}
// in managed code
[DllImport("Test.dll", CharSet = CharSet.Auto)]
static extern void Test([Out] StringBuilder result);
// using it in managed code
StringBuilder result = new StringBuilder();
Test(result);
// contents in result garbled at this point
// just for comparison, this unmanaged consumer of the same function works
LPTSTR result = new TCHAR[100];
Test(result);
Really appreciate any tips! Thanks!!!
One problem is using CharSet.Auto.
On an NT-based system this will assume that the result parameter in the native DLL will be using Unicode. Change that to CharSet.Ansi and see if you get better results.
You also need to size the buffer of the StringBuilder that you're passing in:
StringBuilder result = new StringBuilder(100); // problem if more than 100 characters are returned
Also - the native C code is using 'TCHAR' types and macros - this means that it could be built for Unicode. If this might happen it complicates the CharSet situation in the DllImportAttribute somewhat - especially if you don't use the TestA()/TestW() naming convention for the native export.
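The fixed capacity of 100 is a problem because the native function has no way to know how large the caller's buffer is. One common way around this is to pass the capacity along with the buffer. The sketch below is a hypothetical variant, not the asker's actual export: the name copy_result and its parameters are assumptions, and it uses plain char to match an MBCS build.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical bounded copy: writes at most capacity - 1 characters plus a
// terminator. Returns the number of characters written, excluding the
// terminator.
std::size_t copy_result(char* result, std::size_t capacity, const char* source) {
    assert(source != nullptr);
    if (result == nullptr || capacity == 0) return 0;
    std::size_t n = std::strlen(source);
    if (n >= capacity) n = capacity - 1;  // truncate, leaving room for '\0'
    std::memcpy(result, source, n);
    result[n] = '\0';
    return n;
}
```

On the managed side, the capacity argument would be the value passed to the StringBuilder constructor.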
Don't use the [Out] attribute on the parameter, as you are not allocating the buffer in the C function.
[DllImport("Test.dll", CharSet = CharSet.Auto)]
static extern void Test(StringBuilder result);
StringBuilder result = new StringBuilder(100);
Test(result);
This should work for you
You didn't describe what your garbled string looks like. I suspect you are mixing up some MBCS strings and UCS-2 strings (using 2-byte wchar_ts). If every other byte is 0, then you are looking at a UCS-2 string (and possibly misusing it as an MBCS string). If every other byte is not 0, then you are probably looking at an MBCS string (and possibly misusing it as a Unicode string).
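The "every other byte is 0" check from the paragraph above can be written down as a small heuristic. This is a rough diagnostic sketch with a hypothetical function name. It only recognises little-endian 2-byte text whose characters are all in the ASCII range, so a negative result does not prove the buffer is MBCS.

```cpp
#include <cassert>
#include <cstddef>

// Returns true when every odd-indexed byte is 0, which is what 2-byte
// little-endian text looks like if it contains only ASCII-range characters.
bool looks_like_utf16le_ascii(const unsigned char* bytes, std::size_t size) {
    assert(bytes != nullptr || size == 0);
    if (size < 2 || size % 2 != 0) return false;
    for (std::size_t i = 1; i < size; i += 2) {
        if (bytes[i] != 0) return false;
    }
    return true;
}
```

Dumping the first few bytes of the returned buffer and running a check like this is often quicker than guessing at CharSet settings.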
In general, I would recommend not using TCHARs (or LPTSTRs). They use macro magic to switch between char (1 byte) and wchar_t (2 bytes), depending on whether _UNICODE is #defined. I prefer to explicitly use char and wchar_t to make the code's intent very clear. However, you will then need to call the -A or -W forms of any Win32 APIs that use TCHAR parameters, e.g. MessageBoxA() or MessageBoxW() instead of MessageBox() (which is a macro that checks whether _UNICODE is #defined).
Then you should change CharSet = CharSet.Auto to either CharSet = CharSet.Ansi (if both caller and callee are using MBCS) or CharSet = CharSet.Unicode (if both caller and callee are using UCS-2 Unicode). But it sounds like your DLL is using MBCS, not Unicode.
pinvoke.net is a great wiki reference with many examples of P/Invoke function signatures for Win32 APIs: