I am going back to school where we have to take a C++ class.
I am familiar with the language but there's a few things I have never heard of...
Generally, my teacher said that plain C++ is "unsafe". It generates "unsafe code" (whatever that means). That's why we have to use C++/CLI which is supposed to make "safe" code.
Now... isn't CLI just a Microsoft .NET extension?
He is also telling us to use Console::WriteLine() instead of cout. Since Console::WriteLine() is "safe" and cout is "unsafe".
All this seems weird to me... Can anyone clarify this?
Thanks!
To put it very blunt and simple.
Safe
By "safe code" you teacher probably means managed code. That is code where you don't have to "care" about pointers and memory, you have a garbagecollector that takes care of this for your. You are dealing with refrences. Examples of languages built like this is java and c#. Code is compiled to a "fictional" opcodes(intermediate language, IL for C#), and compiles and run realtime(JIT, just in time compilation). The IL generated code, will have to be converted to real native platform based opcodes, in java this is one of things the jvm does. You may easily disassemble code from languages like these. And they may run on several platforms without a recompilation.
Unsafe
By "unsafe code" the teacher means ordinary native c++ unmanaged code, where all memory and resource management is handled by you. This makes room for human error, and memory leaks, resource leaks and other memory errors, you don't usually deal with in managed languages. It also compiles to pure bytecode (native assembly opcodes), which means that you have to compile your code for each platform you intend to target. You will encounter that you will have to make a lot of code specific for each platform, depending on what you are going to code. It's nice to see that simple things such as threading, which where platform dependent, now is a part of the c++ standard.
Then you have c++/CLI, which basicly is a mix. You may use managed code from the .net framework in c++, and it may be used as a bridge, and be used to make wrappers.
Console::WriteLine() is managed .net code, safe.
cout is standard iso c++ from <iostream>, unsafe
You find a related post here, with a broader answer here and here :)
Edit
As pointed out by Deduplicator below this is also of interest for you
Hope it helps.
Cheers
In the world of .NET, "safe" is synonymous with "verifiable" type safety. In Visual C++, it's enabled by /clr:safe.
/clr:safe will prevent you from using std::cout or any other function or type implemented in native code, because the metadata needed by .NET's verifier does not exist for native functions. MSIL which Stigandr mentioned can be used for just-in-time compilation, but even when compilation to native code is performed ahead of time, the MSIL is provided alongside the compiled native code and serves as a proof of its type safety which the verifier inspects.
Standard (native / unmanaged) C++ does check type safety during compilation. But that can be disabled by casts, and without runtime type checks, which C++ does not provide as part of the language, pointer arithmetic (e.g. array index out of bounds) can also violate type safety, as can using pointers to freed objects. C++ isn't just a language though, it is also a standard library, where you find smart pointers and smart collections that do the necessary runtime checks, so it can be just as type-safe as any managed framework.
Related
Is Microsofts C++/CLI built on top of the C++ Standard (C++98 or C++11) or is it only "similar" and has deviations?
Or, specifically, is every ISO standard conforming C++ program (either C++98 or C++11), also a conforming C++/CLI program?
Note: I interpret the Wikipedia article above only comparing C++/CLI to MC++, not to ISO Standard C++.
Sure, it is an extension to C++03 and can compile any compliant C++03 program that doesn't conflict with the added keywords. The only thing it doesn't support are some of the Microsoft extensions to C++, the kind that are fundamentally incompatible with managed code execution like __fastcall and __try. MC++ was their first attempt at it, kept compatible by prefixing all added keywords with underscores. The syntax was rather forced and not well received by their customers, C++/CLI dropped the practice and has a much more intuitive syntax. Stanley Lippman of C++ Primer fame was heavily involved btw.
The compiler can be switched between managed and native code generation on-the-fly with #pragma managed, the product is a .NET mixed-mode assembly that contains both MSIL and native machine code. The MSIL produced from native C++ source is not exactly equivalent to the kind produced by, say, the C# or VB.NET compilers. It doesn't magically become verifiable and doesn't get the garbage collector love, you can corrupt the heap or blow the stack just as easily. And no optimizer love either, the MSIL gets translated to machine code at runtime and is optimized just like normal managed code with the time restrictions inherent in a jitter. Getting too much native C++ code translated to MSIL is a very common mistake, the compiler hides it too well.
C++/CLI is notable for introducing syntax that got later adopted into C++11. Like nullptr, override, final and enum class. Bit of a problem, actually, it begat __nullptr to be able to distinguish between a managed and a native null pointer. They never found a great solution for enum class, you have to declare it public to get a managed enum type. Some C++11 extensions work, few beyond the ones it already had, auto is fine but no lambda expressions, quite a loss in .NET programming. The language has been frozen since 2005.
The C++/CX language extension is notable as well, one that makes writing C++ code for Store and Phone apps palatable. The syntax resembles C++/CLI a great deal, including the ref class and hats in the syntax. But with objects allocated with ref new instead of gcnew, the latter would have been too misleading. Otherwise very different from C++/CLI at runtime, you get pure native code out of C++/CX. The language extension hides the COM interop code that's underneath, automatically reference-counting objects, translating error codes into exceptions and mapping generics. The resemblance to C++/CLI syntax is no accident, they basically perform the same role. Mapping C++-like syntax to a foreign type system.
CLI is a set of extensions for standard C++. CLI has full support of standard C++ and adds something more. So every C++ program will compile with enabled CLI, except you are using a CLI reserved word and this is the weakness of the extension, because it does not respect the double underscore rule for extensions (such reserved words has to begin with __).
You can deactivate those extensions in the GUI by:
Configuration Properties -> General -> Common Language Runtime Support
Even Bjarne Stroustrup calls CLI an extension:
On the difficult and controversial question of what the CLI binding/extensions to C++ is to be called, I prefer C++/CLI as a shorthand for "The CLI extensions to ISO C++". Keeping C++ as part of the name reminds people what is the base language and will help keep C++ a proper subset of C++ with the C++/CLI extension
Language extensions could always be called deviations from the standard, because it will not compile with a compiler without CLI support (e.g. the ^ pointer).
I'm aware this might be a broad question (there's no specific code for you to look at), but I'm hoping I'd get some insights as to what to do, or how to approach the problem.
To keep things simple, suppose the compiler that I'm writing performs these three steps:
parse (and bind all variables)
typecheck
codegen
Also the language that I'm building the compiler for wants to support late-analysis/late-binding (ie., it has a function that takes a String, which is to be compiled and executed as a piece of source-code during runtime).
Now during parse-phase, I have a piece of context that I need to keep around till run-time for the sole benefit of the aforementioned function (because it needs to parse and typecheck its argument in that context).
So the question, how should I do this? What do other compilers do?
Should I just serialise the context object to disk (codegen for it) and resurrect it during run-time or something?
Thanks
Yes, you'll need to emit the type information (or other context, you weren't very specific) in your object/executable files, so that your eval can read it at runtime. You might look at Java's .class file format for inspiration; Java doesn't have eval as such, but you can dynamically spin new bytecode at runtime that must be linked in a type-safe manner. David Conrad's comment is spot-on: this information can also be used to implement reflection, if your language has such a feature.
That's as much as I can help you without more specifics.
I know gcroot is used for holding a refrence to a managed object in a native class, but what about using managed objects as local variables inside an unmanaged class function ?
The compiler does not seem to generate an error on this, but is it "proper" ? Does it hurt performence?
There is no such thing as a local managed object. All managed objects are stored on the heap, required to let the garbage collector do its job. You could only have a reference as a local variable. A pointer at runtime.
Using an managed object reference in an unmanaged function is possible if you compile that code with /clr or #pragma managed in effect. Such code will be translated to IL and gets just-in-time compiled at runtime, just like normal managed code. It won't otherwise have the characteristics of managed code, there is no verification and you'll suffer all the normal pointer bugs. And yes, it can hurt performance because such code doesn't get the normal optimizer love. The optimizer built in the jitter isn't as effective because it works under time constraints.
Compile native code without the /clr option or use #pragma unmanaged inside your code to switch the compiler on-the-fly.
Managed objects, both values of managed value types (value struct, value class, enum class) and handles to managed reference types (ref struct, ref class) can be used inside code which is compiled to MSIL.
And code that is compiled to MSIL can be part of unmanaged objects (for example, a virtual member function of a standard C++ type can be compiled to MSIL, and the Visual C++ compiler "It Just Works" technology will make sure that the v-table is set up correctly). This is extremely useful when forwarding events and callbacks produced by standard C++ code into the managed GUI world. But it is also applicable if you have an algorithm implemented in managed code (perhaps C#) that you want to call from C++.
As Hans mentions, there are performance implications of switching between MSIL and machine code generation for a particular function. But if you are sitting on a native-managed boundary, then compilation to MSIL and use of the "It Just Works" a/k/a "C++ interop" is by far the highest performance alternative.
In the documentation for com it says that it works literally with every language. Do you need to have a specific API for that language so it can interface with com, or can any language literally just use it out of box? Also do you need a special compiler? Sorry if this is a stupid question but I have never used it before, and I have been trying to find this answer. When I look at demos of com examples it all seems to access the objects in a c style syntax, are their bindings and apis for other languages (literally all)?
The key thing about COM is that it is a "binary standard": which is to say that it doesn't care what the language used is, so long as the bits and bytes in memory end up in the right place.
COM basically specifies that all COM objects must have a specific layout in memory: the interface pointer points to a pointer that in turn points to a table of function pointers, which has at least three members, the first three of which are pointers to the IUnknown functions (AddRef, Release, QueryInterface), and the remainder are pointers to the other functions in the interface. COM also specifies how arguments are passed to these functions - so that the caller and callee agree on how the stack is used, and who pops off the values.
This requirement happily matches how C++ just happens to work on Windows; so most C++ classes that implement IUnknown will just happen to ends up as being valid COM classes: this is because Microsoft's implementation of C++ happens to use an object layout that matches what COM requires: the C++ object vtable pointer is the same as the COM pointer-to-table-of-function-pointers, the C++ table of function pointers is exactly what COM requires for its table-of-function-pointers, and so on. (This isn't entirely just a happy coincidence: COM was likely designed to take advantage of the most common way that C++ objects are implemented in memory which is the technique that MS's compiler uses. Note that C++ the language specification doesn't actually specify any particular object layout - so you could have a 3rd party C++ compiler that implemented C++ in a way that gave you classes that are not usable by COM. But no compiler vendor in their right mind would do that, since they would appear to be broken compared to the others!)
In plain C, you can create a COM object by creating suitable structs-containing-pointers manually. This works because C essentially allows you to specify binary-level memory layout for structs manually; you can create structs that you know will have the appropriate layout that COM is expecting.
In other languages, especially those that don't allow the user to specify memory layout explicitly, you need support from the language to allow for COM support. All the .Net languages - C#, VB.Net, and so on - use support in the .Net runtime that understands what COM expects, and produces the appropriate wrappers as needed to allow the interop to work.
So, long story short, it's not the case that any language under the sun will automatically work with COM; it's really the case that a couple of languages - namely C and C++ - are already aligned with COM's requirements; and most other languages will need some compiler support to make it work.
I need to work on converting a very huge c++ project to clr safe. The current c++ project has a lot of stuff from c++ like templates, generics, pointers, storage/stream, ole apis, zlib compression apis, inlines etc. Where can I find the datiled document for this type of conversion? Can you suggest some good book to refer to? If anyone of you have done such conversion, can I get some analysis from you?
I'll just cough up the MSDN Library article titled "How to: Migrate to /clr:safe
Visual C++ can generate verifiable components with using /clr:safe, which causes the compiler to generate errors for each non-verifiable code construct.
The following issues generate verifiability errors:
Native types. Even if it isn't used, the declaration of native classes, structures, pointers, or arrays will prevent compilation.
Global variables
Function calls into any unmanaged library, including common language runtime function calls
A verifiable function cannot contain a static_cast Operator for down-casting. The static_cast operator can be used for casting between primitive types, but for down-casting, safe_cast or a C-Style cast (which is implemented as a safe_cast) must be used.
A verifiable function cannot contain a reinterpret_cast operator (or any C-style cast equivalent).
A verifiable function cannot perform arithmetic on an interior_ptr. It may only assign to it and dereference it.
A verifiable function can only throw or catch pointers to reference types, so value types must be boxed before throwing.
A verifiable function can only call verifiable functions (such that calls to the common language runtime are not allowed, include AtEntry/AtExit, and so global constructors are disallowed).
A verifiable class cannot use Explicit.
If building an EXE, a main function cannot declare any parameters, so GetCommandLineArgs must be used to retrieve command-line arguments.
Making a non-virtual call to a virtual function.
Also, the following keywords cannot be used in verifiable code:
unmanaged and pack pragmas
naked and align __declspec modifiers
__asm
__based
__try and __except
I reckon that will keep you busy for a while. There is no magic wand to wave to turn native C++ into verifiable code. Are you sure this is worth the investment?
The vast majority of native C++ is entirely valid C++/CLI, including templates, inlines, etc, except the CLR STL is rather slow compared to the BCL. Also, native C++ doesn't have generics, only templates.
The reality of compiling as C++/CLI is to check the switch and push compile, and wait for it to throw errors.
Rewriting native C++ into safe C++/CLI will result in a code that is syntactically different, but semantically same as C#. If that is the case, why not rewrite directly in C#?
If you want to avoid what is essentially a complete rewrite, consider the following alternatives:
P/Invoke. Unfortunately, I'm unfamiliar whether this would isolate safe from unsafe code. Even if it can perform the isolation, you'll need to wrap your existing C++ code into procedural, C-like API, so it can be consumed by P/Invoke. On a plus side, unless your API is excessively chatty, you get to keep (most of) your native performance.
Wrapping your C++ into out-of-process COM server and using COM Interop to consume it from the manged code. This way, your managed code is completely protected from any corruption that might happen at C++ end and can remain "safe". The downside is a performance hit that you'll get for out-of-process marshaling and the implementation effort you'll need to expend to correctly implement the COM.