Convert System::String to std::string in UTF-8, which is later converted to char* via c_str (C++/CLI)

I have a System::String^ variable in C++ code. This variable should be converted to std::string which is later converted to const char* via c_str.
// original string
System::String^ path = ...;
// convert to std::string
msclr::interop::marshal_context context;
std::string filename(context.marshal_as<std::string>(path));
// call API function that internally connects to sqlite3 using sqlite3_open as
// sqlite3_open(filename.c_str())
// https://www.sqlite.org/c3ref/open.html -
// const char *filename, /* Database filename (UTF-8) */
doCalculation(filename);
It works well with ASCII paths, but fails if the path contains non-Latin characters.
So somehow I need to convert the marshalled std::string from its current encoding (ASCII?) to UTF-8.
I've tried
std::wstring dbPath(context.marshal_as<std::wstring>(path));
std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>, wchar_t> convert;
std::string dbPathU8 = convert.to_bytes(dbPath);
but it does not work.

What you want to do is use the .NET methods to convert directly to UTF-8.
The available methods in the Encoding class aren't exactly what you're looking for (direct from managed String to unmanaged string or byte array), so we'll need an intermediary and some manual copying.
String^ path = ...;
// First, convert to a managed array of the bytes you want.
array<Byte>^ bytes = Encoding::UTF8->GetBytes(path);
// Then, copy those bytes from the managed byte array to an unmanaged string.
std::string str;
str.resize(bytes->Length);
// &str[0] is a writable char* in every standard version; data() is const before C++17.
Marshal::Copy(bytes, 0, IntPtr(&str[0]), bytes->Length);
// OR, copy directly to the char* you want eventually.
char* chars = new char[bytes->Length + 1]; // or malloc(), or whatever.
Marshal::Copy(bytes, 0, IntPtr(chars), bytes->Length);
chars[bytes->Length] = '\0'; // null terminate.
// don't forget to free the buffer when you're done with it!
There are several GetBytes variants available, but the parameters to them seem to either be both managed, or both unmanaged. (String^ and array^, or char* and byte*, but not String^ and byte*.) Therefore, we have the Encoding class create a managed byte array, then we use the Marshal::Copy method to copy those bytes either to the unmanaged string object, or directly to a char*.
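To make the target encoding concrete, here is a minimal sketch in plain C++ (no /clr) that turns UTF-16 code units, which is what a System::String holds, into UTF-8 bytes. The name utf16_to_utf8 and the choice to replace unpaired surrogates with U+FFFD are assumptions of this sketch; it only illustrates the byte layout that a UTF-8 filename parameter expects and is not a description of how Encoding::UTF8 is implemented.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: encodes UTF-16 code units as UTF-8 bytes.
// Assumption of this sketch: unpaired surrogates become U+FFFD.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate, if any.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // low surrogate without a preceding high surrogate
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);                          // 1 byte
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));            // 2 bytes
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));           // 3 bytes
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));           // 4 bytes
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

ASCII characters map to the same single byte, which is consistent with ASCII-only paths working regardless of the conversion used, while anything else needs two to four bytes.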

Related

C++/CLI Conversion of byte* to Managed Byte[]

I'm in a C++/CLI project, and I have a byte* variable that I want to fully convert into a managed array<byte> as efficiently as possible.
Currently, the only way that I've seen is to manually create the managed array<byte> object, and then copy individual bytes from the byte* variable, as shown below:
void Foo(byte* source, int bytesCount)
{
auto buffer = gcnew array<byte>(bytesCount);
for (int i = 0; i < bytesCount; ++i)
{
buffer[i] = source[i];
}
}
Is there any other way to do this more efficiently? Ideally, to not have to copy the memory at all.
If not, is there any way to do this more cleanly?
You can't create a managed array from an unmanaged buffer without copying.
However, you don't need to copy the individual bytes in a loop. You can pin the managed array (see pin_ptr) and then copy the bytes directly from the unmanaged buffer into the memory of the managed array, such as with memcpy() or equivalent.
void Foo(byte* source, int bytesCount)
{
auto buffer = gcnew array<byte>(bytesCount);
{
pin_ptr<byte> p = &buffer[0];
byte *cp = p;
memcpy(cp, source, bytesCount);
}
// use buffer as needed...
}
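The same bulk-copy idea can be sketched in plain C++, with a std::vector standing in for the managed array (an assumption made so the example compiles without /clr; copy_bytes is a hypothetical name). The zero-length guard matters in the C++/CLI version too, since taking &buffer[0] of an empty managed array is expected to throw rather than yield a pointer.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copies `count` bytes from an unmanaged buffer in one call.
// The std::vector is a stand-in for the managed array<byte> in the answer above.
std::vector<unsigned char> copy_bytes(const unsigned char* source, std::size_t count) {
    std::vector<unsigned char> buffer(count);
    if (count != 0) {
        // One block copy instead of a byte-by-byte loop.
        std::memcpy(buffer.data(), source, count);
    }
    return buffer;
}
```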

Correct way to convert array<unsigned char>^ to std::string

I wanted to know what is the correct way to convert a managed array<unsigned char>^ to an unmanaged std::string. What I do right now is this:
array<unsigned char>^ const content = GetArray();
auto enc = System::Text::Encoding::ASCII;
auto const source = enc->GetString(content);
std::string s = msclr::interop::marshal_as<std::string>(source);
Is there a way to marshal the content in one step to a std::string without converting to a String^?
I tried:
array<unsigned char>^ const content = GetArray();
std::string s = msclr::interop::marshal_as<std::string>(content);
but this gave me following errors:
Error C4996 'msclr::interop::error_reporting_helper<_To_Type,cli::array<unsigned char,1> ^,false>::marshal_as':
This conversion is not supported by the library or the header file needed for this conversion is not included.
Please refer to the documentation on 'How to: Extend the Marshaling Library' for adding your own marshaling method.
Error C2065 '_This_conversion_is_not_supported': undeclared identifier
If the array is of plain bytes, then it's already ASCII encoded (or whatever narrow characters you're using). Converting to a managed UTF-16 String^ is an unnecessary detour.
Just construct a narrow-characters string from the byte array. Pass a pointer to the first byte and the length.
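array<unsigned char>^ const content = GetArray();
pin_ptr<unsigned char> contentPtr = &content[0];
unsigned char* raw = contentPtr; // the pin_ptr converts to unsigned char*, not to char*
std::string s(reinterpret_cast<const char*>(raw), content->Length);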
I'm not at a compiler right now, there may be trivial syntax errors.
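The pointer-plus-length constructor is the part that carries over to plain C++. As a minimal sketch (bytes_to_string is a hypothetical name, and a raw unsigned char pointer stands in for the pinned array), note that passing the length explicitly keeps embedded zero bytes, which a constructor taking only a char* would not:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: builds a std::string from raw bytes without any
// intermediate wide-character string. The explicit length preserves
// embedded '\0' bytes.
std::string bytes_to_string(const unsigned char* data, std::size_t length) {
    return std::string(reinterpret_cast<const char*>(data), length);
}
```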

StringToCoTaskMemUni or StringToCoTaskMemAnsi methods can cause hang?

I have the code below in C++/CLI and am observing a hang while converting the .NET string to char* using StringToCoTaskMemAnsi.
const char* CDICashInStringStore::CDIGetStringVal( void )
{
unsigned int identifier = (unsigned int)_id;
debug(" cashincdistores--routing call to .Net for CDI String %d", identifier);
NCR::APTRA::INDCDataAccess::IStringValue^ stringValueProvider = (NCR::APTRA::INDCDataAccess::IStringValue^)GetStringProvider()->GetProvider();
String^ strValue = stringValueProvider->GetStringValue(identifier);
debug(" cashincdistores-- going to call StringToCoTaskMemAnsi");
IntPtr iPtr = Marshal::StringToCoTaskMemAnsi(strValue);
debug(" cashincdistores-- StringToCoTaskMemAnsi called");
// use a local (retVal is not needed)
const char * ansiStr = strdup((const char *) iPtr.ToPointer());
Marshal::FreeCoTaskMem(iPtr);
debug(" cashincdistores--got results %d %s",identifier,ansiStr);
// The returned memory will be free() 'ed by the user
return ansiStr;
}
In our logging I can see "cashincdistores-- going to call StringToCoTaskMemAnsi", so I suspect a hang after calling the StringToCoTaskMemAnsi method.
Is there a chance of a hang in the StringToCoTaskMemAnsi marshalling method? What could cause it?
Why are you using COM in the first place? You don't need any COM in that code.
Disclaimer: You should probably not be returning a const char * someone else will have to free from your function. That's a very easy way to produce memory leaks or multiple free errors.
Ignoring the disclaimer above, you have a couple of possibilities:
First way:
#include <msclr/marshal.h>
msclr::interop::marshal_context context;
const char* strValueAsCString = context.marshal_as<const char*>(strValue);
// Probably bad
const char* ansiStr = strdup(strValueAsCString);
The strValueAsCString pointer will remain valid as long as context is in scope.
Another way:
#include <string>
#include <msclr/marshal_cppstd.h>
std::string strValueAsStdString = msclr::interop::marshal_as<std::string>(strValue);
// Probably bad
const char* ansiStr = strdup(strValueAsStdString.c_str());
Here, the std::string manages the lifetime of the string.
See Overview of Marshaling for reference.
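If the function really must hand out a buffer that the caller releases with free(), one way to make that ownership explicit is sketched below in plain C++. The names FreeDeleter, CStringPtr and duplicate_c_string are hypothetical and not part of the marshaling library; the sketch assumes the text has already been marshaled into a std::string.

```cpp
#include <cstdlib>
#include <cstring>
#include <memory>
#include <new>
#include <string>

// Deleter that releases memory obtained from std::malloc.
struct FreeDeleter {
    void operator()(char* p) const noexcept { std::free(p); }
};

// Hypothetical alias: an owning pointer to a malloc'ed C string.
using CStringPtr = std::unique_ptr<char, FreeDeleter>;

// Copies the text, including its terminator, into a malloc'ed buffer.
CStringPtr duplicate_c_string(const std::string& value) {
    const std::size_t size = value.size() + 1;
    char* copy = static_cast<char*>(std::malloc(size));
    if (copy == nullptr) {
        throw std::bad_alloc();
    }
    std::memcpy(copy, value.c_str(), size);
    return CStringPtr(copy);
}
```

Inside the module the buffer is freed automatically; calling release() at the API boundary hands the raw pointer to a caller who is then responsible for free(), which keeps the existing contract while making the hand-off visible.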

C/C++ DLL: Converting a const uint8_t * to a String

I haven't seen C++ code in more than 10 years, and now I need to develop a very small DLL that uses the Ping class (System::Net::NetworkInformation) to ping some remoteAddress.
The argument in which I receive the remoteAddress is a FREObject, which then needs to be transformed into a const uint8_t *. This is mandatory and I can't change any of it. The remoteAddress has to be received as a FREObject and later be transformed into a const uint8_t *.
The problem I'm having is that I have to pass a String^ to the Ping class and not a const uint8_t *, and I have no clue how to convert my const uint8_t * to a String^. Do you have any ideas?
Here is part of my code:
// argv[ARG_IP_ADDRESS_ARGUMENT] holds the remoteAddress value.
uint32_t nativeCharArrayLength = 0;
const uint8_t * nativeCharArray = NULL;
FREResult status = FREGetObjectAsUTF8(argv[ARG_IP_ADDRESS_ARGUMENT], &nativeCharArrayLength, &nativeCharArray);
Basically, the FREGetObjectAsUTF8 function fills the nativeCharArray array with the value of argv[ARG_IP_ADDRESS_ARGUMENT] and returns the array's length in nativeCharArrayLength. Also, the string is UTF-8 encoded and terminated with a null character.
My next problem would be to convert a String^ back to a const uint8_t *. If you can help with this as well, I would really appreciate it.
As I said before, none of this is changeable, and I have no idea how to convert nativeCharArray to a String^. Any advice will help.
PS: The purpose of this DLL is to use it as an ANE (AIR Native Extension) for my Adobe AIR app.
You'll need UTF8Encoding to convert the bytes to characters. It has methods that take pointers, and you'll want to take advantage of that. You first need to count the number of characters in the converted string, then allocate an array to store the converted characters, and then you can turn it into a System::String. Like this:
auto converter = gcnew System::Text::UTF8Encoding;
auto chars = converter->GetCharCount((Byte*)nativeCharArray, nativeCharArrayLength-1);
auto buffer = gcnew array<Char>(chars);
pin_ptr<Char> pbuffer = &buffer[0];
converter->GetChars((Byte*)nativeCharArray, nativeCharArrayLength-1, pbuffer, chars);
String^ result = gcnew String(buffer);
Note that the -1 on nativeCharArrayLength compensates for the zero terminator being included in the value.
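Because nativeCharArrayLength is unsigned, subtracting 1 from a length of 0 would wrap around to a very large value. A defensive way to compute the byte count to convert is sketched below in plain C++; payload_length is a hypothetical helper, and it only assumes that a trailing zero byte, if present within the reported length, is the terminator.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns the number of bytes to convert, dropping a
// trailing '\0' only when the reported length actually includes one.
// Never underflows, even when reported_length is 0.
std::size_t payload_length(const std::uint8_t* bytes, std::size_t reported_length) {
    if (reported_length > 0 && bytes[reported_length - 1] == 0) {
        return reported_length - 1;
    }
    return reported_length;
}
```

The result could then be passed in place of nativeCharArrayLength-1 in both the GetCharCount and GetChars calls.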

C++/CLI, I have a Byte[] and I need a char* to the first and last elements

I've been handed a Byte[] that contains a file. I need to pass this to another method that is expecting two parameters, a char* to the beginning of the file and a char* to the end of the file.
I'm assuming I need to pin the array first so it doesn't get collected. I don't imagine I can then just cast the first and last elements, right?
Old question, but I just found out that you can create a pin_ptr<unsigned char> from such an array and then reinterpret_cast the result.
pin_ptr<unsigned char> pinned = &buffer[0];
unsigned char* unsignedBufferPtr = pinned;
char* bufferPtr = reinterpret_cast<char*>(unsignedBufferPtr);
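The reinterpret_cast is applied to the plain unsigned char* obtained from the pin_ptr, and the resulting char* is valid only while pinned remains in scope.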
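The question also asks for a pointer to the end of the data. Once a raw pointer to the first element is available, the second pointer is plain pointer arithmetic. The sketch below is plain C++ with hypothetical names (CharRange, as_char_range), and it assumes "end" means one past the last byte, as in the usual C++ iterator convention; if the called method expects a pointer to the last byte itself, subtract one for a non-empty buffer.

```cpp
#include <cstddef>

// Hypothetical pair of pointers delimiting a byte buffer.
struct CharRange {
    char* first;  // points at the first byte
    char* last;   // assumption: one past the final byte
};

// Hypothetical helper: reinterprets an unsigned byte buffer as a char range.
CharRange as_char_range(unsigned char* data, std::size_t length) {
    char* first = reinterpret_cast<char*>(data);
    return CharRange{first, first + length};
}
```

In the C++/CLI case, data would be the unsigned char* taken from the pin_ptr and length would be buffer->Length, and both pointers should be used only while the pin_ptr is still in scope.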