Converting c++ project to clr safe project - c++-cli

I need to work on converting a very huge c++ project to clr safe. The current c++ project has a lot of stuff from c++ like templates, generics, pointers, storage/stream, ole apis, zlib compression apis, inlines etc. Where can I find the datiled document for this type of conversion? Can you suggest some good book to refer to? If anyone of you have done such conversion, can I get some analysis from you?

I'll just cough up the MSDN Library article titled "How to: Migrate to /clr:safe
Visual C++ can generate verifiable components with using /clr:safe, which causes the compiler to generate errors for each non-verifiable code construct.
The following issues generate verifiability errors:
Native types. Even if it isn't used, the declaration of native classes, structures, pointers, or arrays will prevent compilation.
Global variables
Function calls into any unmanaged library, including common language runtime function calls
A verifiable function cannot contain a static_cast Operator for down-casting. The static_cast operator can be used for casting between primitive types, but for down-casting, safe_cast or a C-Style cast (which is implemented as a safe_cast) must be used.
A verifiable function cannot contain a reinterpret_cast operator (or any C-style cast equivalent).
A verifiable function cannot perform arithmetic on an interior_ptr. It may only assign to it and dereference it.
A verifiable function can only throw or catch pointers to reference types, so value types must be boxed before throwing.
A verifiable function can only call verifiable functions (such that calls to the common language runtime are not allowed, include AtEntry/AtExit, and so global constructors are disallowed).
A verifiable class cannot use Explicit.
If building an EXE, a main function cannot declare any parameters, so GetCommandLineArgs must be used to retrieve command-line arguments.
Making a non-virtual call to a virtual function.
Also, the following keywords cannot be used in verifiable code:
unmanaged and pack pragmas
naked and align __declspec modifiers
__asm
__based
__try and __except
I reckon that will keep you busy for a while. There is no magic wand to wave to turn native C++ into verifiable code. Are you sure this is worth the investment?

The vast majority of native C++ is entirely valid C++/CLI, including templates, inlines, etc, except the CLR STL is rather slow compared to the BCL. Also, native C++ doesn't have generics, only templates.
The reality of compiling as C++/CLI is to check the switch and push compile, and wait for it to throw errors.

Rewriting native C++ into safe C++/CLI will result in a code that is syntactically different, but semantically same as C#. If that is the case, why not rewrite directly in C#?
If you want to avoid what is essentially a complete rewrite, consider the following alternatives:
P/Invoke. Unfortunately, I'm unfamiliar whether this would isolate safe from unsafe code. Even if it can perform the isolation, you'll need to wrap your existing C++ code into procedural, C-like API, so it can be consumed by P/Invoke. On a plus side, unless your API is excessively chatty, you get to keep (most of) your native performance.
Wrapping your C++ into out-of-process COM server and using COM Interop to consume it from the manged code. This way, your managed code is completely protected from any corruption that might happen at C++ end and can remain "safe". The downside is a performance hit that you'll get for out-of-process marshaling and the implementation effort you'll need to expend to correctly implement the COM.

Related

C++/CLI: Console::WriteLine() or cout?

I am going back to school where we have to take a C++ class.
I am familiar with the language but there's a few things I have never heard of...
Generally, my teacher said that plain C++ is "unsafe". It generates "unsafe code" (whatever that means). That's why we have to use C++/CLI which is supposed to make "safe" code.
Now... isn't CLI just a Microsoft .NET extension?
He is also telling us to use Console::WriteLine() instead of cout. Since Console::WriteLine() is "safe" and cout is "unsafe".
All this seems weird to me... Can anyone clarify this?
Thanks!
To put it very blunt and simple.
Safe
By "safe code" you teacher probably means managed code. That is code where you don't have to "care" about pointers and memory, you have a garbagecollector that takes care of this for your. You are dealing with refrences. Examples of languages built like this is java and c#. Code is compiled to a "fictional" opcodes(intermediate language, IL for C#), and compiles and run realtime(JIT, just in time compilation). The IL generated code, will have to be converted to real native platform based opcodes, in java this is one of things the jvm does. You may easily disassemble code from languages like these. And they may run on several platforms without a recompilation.
Unsafe
By "unsafe code" the teacher means ordinary native c++ unmanaged code, where all memory and resource management is handled by you. This makes room for human error, and memory leaks, resource leaks and other memory errors, you don't usually deal with in managed languages. It also compiles to pure bytecode (native assembly opcodes), which means that you have to compile your code for each platform you intend to target. You will encounter that you will have to make a lot of code specific for each platform, depending on what you are going to code. It's nice to see that simple things such as threading, which where platform dependent, now is a part of the c++ standard.
Then you have c++/CLI, which basicly is a mix. You may use managed code from the .net framework in c++, and it may be used as a bridge, and be used to make wrappers.
Console::WriteLine() is managed .net code, safe.
cout is standard iso c++ from <iostream>, unsafe
You find a related post here, with a broader answer here and here :)
Edit
As pointed out by Deduplicator below this is also of interest for you
Hope it helps.
Cheers
In the world of .NET, "safe" is synonymous with "verifiable" type safety. In Visual C++, it's enabled by /clr:safe.
/clr:safe will prevent you from using std::cout or any other function or type implemented in native code, because the metadata needed by .NET's verifier does not exist for native functions. MSIL which Stigandr mentioned can be used for just-in-time compilation, but even when compilation to native code is performed ahead of time, the MSIL is provided alongside the compiled native code and serves as a proof of its type safety which the verifier inspects.
Standard (native / unmanaged) C++ does check type safety during compilation. But that can be disabled by casts, and without runtime type checks, which C++ does not provide as part of the language, pointer arithmetic (e.g. array index out of bounds) can also violate type safety, as can using pointers to freed objects. C++ isn't just a language though, it is also a standard library, where you find smart pointers and smart collections that do the necessary runtime checks, so it can be just as type-safe as any managed framework.

Is C++/CLI an extension of Standard ISO C++?

Is Microsofts C++/CLI built on top of the C++ Standard (C++98 or C++11) or is it only "similar" and has deviations?
Or, specifically, is every ISO standard conforming C++ program (either C++98 or C++11), also a conforming C++/CLI program?
Note: I interpret the Wikipedia article above only comparing C++/CLI to MC++, not to ISO Standard C++.
Sure, it is an extension to C++03 and can compile any compliant C++03 program that doesn't conflict with the added keywords. The only thing it doesn't support are some of the Microsoft extensions to C++, the kind that are fundamentally incompatible with managed code execution like __fastcall and __try. MC++ was their first attempt at it, kept compatible by prefixing all added keywords with underscores. The syntax was rather forced and not well received by their customers, C++/CLI dropped the practice and has a much more intuitive syntax. Stanley Lippman of C++ Primer fame was heavily involved btw.
The compiler can be switched between managed and native code generation on-the-fly with #pragma managed, the product is a .NET mixed-mode assembly that contains both MSIL and native machine code. The MSIL produced from native C++ source is not exactly equivalent to the kind produced by, say, the C# or VB.NET compilers. It doesn't magically become verifiable and doesn't get the garbage collector love, you can corrupt the heap or blow the stack just as easily. And no optimizer love either, the MSIL gets translated to machine code at runtime and is optimized just like normal managed code with the time restrictions inherent in a jitter. Getting too much native C++ code translated to MSIL is a very common mistake, the compiler hides it too well.
C++/CLI is notable for introducing syntax that got later adopted into C++11. Like nullptr, override, final and enum class. Bit of a problem, actually, it begat __nullptr to be able to distinguish between a managed and a native null pointer. They never found a great solution for enum class, you have to declare it public to get a managed enum type. Some C++11 extensions work, few beyond the ones it already had, auto is fine but no lambda expressions, quite a loss in .NET programming. The language has been frozen since 2005.
The C++/CX language extension is notable as well, one that makes writing C++ code for Store and Phone apps palatable. The syntax resembles C++/CLI a great deal, including the ref class and hats in the syntax. But with objects allocated with ref new instead of gcnew, the latter would have been too misleading. Otherwise very different from C++/CLI at runtime, you get pure native code out of C++/CX. The language extension hides the COM interop code that's underneath, automatically reference-counting objects, translating error codes into exceptions and mapping generics. The resemblance to C++/CLI syntax is no accident, they basically perform the same role. Mapping C++-like syntax to a foreign type system.
CLI is a set of extensions for standard C++. CLI has full support of standard C++ and adds something more. So every C++ program will compile with enabled CLI, except you are using a CLI reserved word and this is the weakness of the extension, because it does not respect the double underscore rule for extensions (such reserved words has to begin with __).
You can deactivate those extensions in the GUI by:
Configuration Properties -> General -> Common Language Runtime Support
Even Bjarne Stroustrup calls CLI an extension:
On the difficult and controversial question of what the CLI binding/extensions to C++ is to be called, I prefer C++/CLI as a shorthand for "The CLI extensions to ISO C++". Keeping C++ as part of the name reminds people what is the base language and will help keep C++ a proper subset of C++ with the C++/CLI extension
Language extensions could always be called deviations from the standard, because it will not compile with a compiler without CLI support (e.g. the ^ pointer).

C++/CLI: are local managed variables allowed in unmanaged class methods?

I know gcroot is used for holding a refrence to a managed object in a native class, but what about using managed objects as local variables inside an unmanaged class function ?
The compiler does not seem to generate an error on this, but is it "proper" ? Does it hurt performence?
There is no such thing as a local managed object. All managed objects are stored on the heap, required to let the garbage collector do its job. You could only have a reference as a local variable. A pointer at runtime.
Using an managed object reference in an unmanaged function is possible if you compile that code with /clr or #pragma managed in effect. Such code will be translated to IL and gets just-in-time compiled at runtime, just like normal managed code. It won't otherwise have the characteristics of managed code, there is no verification and you'll suffer all the normal pointer bugs. And yes, it can hurt performance because such code doesn't get the normal optimizer love. The optimizer built in the jitter isn't as effective because it works under time constraints.
Compile native code without the /clr option or use #pragma unmanaged inside your code to switch the compiler on-the-fly.
Managed objects, both values of managed value types (value struct, value class, enum class) and handles to managed reference types (ref struct, ref class) can be used inside code which is compiled to MSIL.
And code that is compiled to MSIL can be part of unmanaged objects (for example, a virtual member function of a standard C++ type can be compiled to MSIL, and the Visual C++ compiler "It Just Works" technology will make sure that the v-table is set up correctly). This is extremely useful when forwarding events and callbacks produced by standard C++ code into the managed GUI world. But it is also applicable if you have an algorithm implemented in managed code (perhaps C#) that you want to call from C++.
As Hans mentions, there are performance implications of switching between MSIL and machine code generation for a particular function. But if you are sitting on a native-managed boundary, then compilation to MSIL and use of the "It Just Works" a/k/a "C++ interop" is by far the highest performance alternative.

Tutorial needed on invoking unmanaged DLL from C#.NET

I have a DLL from a vendor that I Need to invoke from C#. I know that C# data classes are not directly compatible with C++ data types.
So, given that I have a function that receives data and returns a "string".
(like this)
string answer = CreateCode2(int, string1, uint32, string2, uint16);
What must I do to make the input parameters compatible, and then make the result string compatible?
Please - I have never done this: Don't give answers like "Use P/Invoke" or "Use Marshal" I need a tutorial with examples.
All the P/Invoke examples I have seen are from .NET Framework 1.1, and Marshall (without a tutorial) is totally confusing me.
Also, I have seen some examples that tell me when I create my extern function to replace all the datatypes with void*. This is makes my IDE demand that I use "unsafe".
This isn't quite a tutorial but it's got a lot of good information on using P/Invoke
Calling Win32 DLLs in C# with P/Invoke
It'll give you an idea of the terminology, the basic concepts, how to use DllImport and should be enough to get you going.
There's a tutorial on MSDN: Platform Invoke Tutorial.
But it's pretty short and to be honest the one I've mentioned above is a much better source of information, but there's a lot of it on there.
Also useful is the PInvoke Signature Toolkit, described here .
And downloadable here.
It lets you paste in an unmanaged method signature, or struct definition and it'll give you the .NET P/Invoke equivalent. It's not 100% perfect but it gets you going much quicker than trying to figure everything out yourself.
With regards to Marshalling specifically, I would say start simple.
If you've got something that's some sort of pointer, rather than trying to convert it directly to some .NET type in the method signature using Marshal it can sometimes be easier to just treat it as an IntPtr and then use Marshal.Copy, .PtrToString, .PtrToStructure and the other similar methods to get the data into a .NET type.
Then when you've gotten to grips with the whole thing you can move on to direct conversions using the Marshal attribute.
There's a good 3 part set of articles on marshalling here, here and here.

C++ Interop: embedding an array in a UDT

I have an application that involves a lot of communication between managed (C#) and unmanaged (C++) code. We are using Visual Studio 2005 (!), and we use the interop assembly generated automatically by tlbimp.
We have fairly good luck passing simple structs back and forth as function arguments. And because our objects are fairly simple, we can pack them into SAFEARRAYs using the IRecordInfo interface. Passing these arrays as arguments to COM methods seems to work properly.
We would like to be able to embed variable-length arrays in our UDTs, but this fails badly. I don't think I have been able to find a single piece of documentation showing how someone has accomplished this. Nor have I found documentation that says it can't be done.
1) Naive approach: Simply declare a safearray in the managed code:
struct MyUdt {
int member1;
BSTR member2;
SAFEARRAY *m3;
};
The C++ compiler is happy with this, but the generated IDL confounds tblimp.exe. It reports that it is unable to convert the signature for member m3, and the signature for member tagSAFEARRAY.rgsabound. These are only warnings, but they are meaningful, the resulting assembly is not usable.
Using LPSAFEARRAY, oddly enough, fails in different ways, but for the same reason, tblimp just can't deal with it.
2) Trickier: Pack it into a variant:
struct MyUdt {
int member1;
BSTR member2;
VARIANT m3;
};
We have code that builds safearrays of UDTs, and it never gives us any trouble. It's basically copied from MSDN. Using that code to create a safearray, then:
pVal->m3.vt = VT_SAFEARRAY | VT_RECORD;
pval->parray = p;
Fails in odd ways. It always breaks, some variations produce an OutOFMemoryException... odd, others fail in different ways. (I'm not sure if a pRecInfo pointer is required here or not, but it fails the same way, present or not.)
The Google search space for this is badly polluted with answers to questions that I am not asking:
How do you pass UDT/structs from unmanaged code.
How do you pass a SAFEARRAY of structs? (We're doing this fine.)
How do you use p/invoke or customer marshalling to pass UDTs.
And many answers describing how to define things from the managed side, not the unmanaged side.
And then there are a couple of Microsoft KBs describing problems with VT_RECORD in early versions of .NET. I don't think these are germane - VT_RECORD types work with VARIANT and with SAFEARRAY. (But maybe not with the UDT marshallling...)
If this won't ever work, it would be nice to at least know why.
Mark