Is Microsoft's C++/CLI built on top of the C++ Standard (C++98 or C++11), or is it only "similar" with deviations?
Or, specifically, is every ISO standard conforming C++ program (either C++98 or C++11), also a conforming C++/CLI program?
Note: I interpret the Wikipedia article above as only comparing C++/CLI to MC++, not to ISO Standard C++.
Sure, it is an extension to C++03 and can compile any compliant C++03 program that doesn't conflict with the added keywords. The only things it doesn't support are some of the Microsoft extensions to C++, the kind that are fundamentally incompatible with managed code execution, like __fastcall and __try. MC++ was their first attempt at it, kept compatible by prefixing all added keywords with underscores. The syntax was rather forced and not well received by their customers; C++/CLI dropped the practice and has a much more intuitive syntax. Stanley Lippman of C++ Primer fame was heavily involved, btw.
The compiler can be switched between managed and native code generation on-the-fly with #pragma managed; the product is a .NET mixed-mode assembly that contains both MSIL and native machine code. The MSIL produced from native C++ source is not exactly equivalent to the kind produced by, say, the C# or VB.NET compilers. It doesn't magically become verifiable and doesn't get the garbage collector love; you can corrupt the heap or blow the stack just as easily. And no optimizer love either: the MSIL gets translated to machine code at runtime and is optimized just like normal managed code, with the time restrictions inherent in a jitter. Getting too much native C++ code translated to MSIL is a very common mistake, because the compiler hides it too well.
C++/CLI is notable for introducing syntax that was later adopted into C++11, like nullptr, override, final and enum class. Bit of a problem, actually: it begat __nullptr to be able to distinguish between a managed and a native null pointer. They never found a great solution for enum class; you have to declare it public to get a managed enum type. Some C++11 extensions work, though few beyond the ones it already had: auto is fine but there are no lambda expressions, quite a loss in .NET programming. The language has been frozen since 2005.
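For reference, here is roughly what those four features look like in plain ISO C++11, with no C++/CLI involved. This is only a minimal sketch; the names Color, Shape, Circle, which and demo_cpp11_features are made up for illustration.

```cpp
#include <cassert>
#include <string>

// Scoped enumeration: the enumerators stay inside Color:: and do not
// convert implicitly to int.
enum class Color { Red, Green };

struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
};

// 'final' forbids deriving from Circle; 'override' makes the compiler
// check that name() really overrides a virtual function of Shape.
struct Circle final : Shape {
    std::string name() const override { return "circle"; }
};

// Two overloads to show that nullptr selects the pointer overload,
// whereas the literal 0 selects the int overload.
inline const char* which(int)   { return "int"; }
inline const char* which(void*) { return "pointer"; }

// Hypothetical usage example.
void demo_cpp11_features() {
    Circle c;
    const Shape& s = c;
    assert(s.name() == "circle");
    assert(std::string(which(nullptr)) == "pointer");
    assert(std::string(which(0)) == "int");
    assert(static_cast<int>(Color::Green) == 1);
}
```

Calling demo_cpp11_features() from main exercises all four features; any C++11 compiler should accept it.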
The C++/CX language extension is notable as well, one that makes writing C++ code for Store and Phone apps palatable. The syntax resembles C++/CLI a great deal, including the ref class and hats in the syntax, but objects are allocated with ref new instead of gcnew, since the latter would have been too misleading. Otherwise it is very different from C++/CLI at runtime: you get pure native code out of C++/CX. The language extension hides the COM interop code that's underneath, automatically reference-counting objects, translating error codes into exceptions and mapping generics. The resemblance to C++/CLI syntax is no accident; they basically perform the same role, mapping C++-like syntax to a foreign type system.
C++/CLI is a set of extensions for standard C++. It has full support for standard C++ and adds something more. So every C++ program will compile with the CLI extensions enabled, unless it uses a CLI reserved word as an identifier. This is the weakness of the extension: it does not respect the double underscore rule for extensions (such reserved words have to begin with __).
You can deactivate those extensions in the GUI by:
Configuration Properties -> General -> Common Language Runtime Support
Even Bjarne Stroustrup calls CLI an extension:
On the difficult and controversial question of what the CLI binding/extensions to C++ is to be called, I prefer C++/CLI as a shorthand for "The CLI extensions to ISO C++". Keeping C++ as part of the name reminds people what is the base language and will help keep C++ a proper subset of C++ with the C++/CLI extension
Language extensions could always be called deviations from the standard, because code that uses them will not compile with a compiler that lacks CLI support (e.g. the ^ handle syntax).
I am going back to school where we have to take a C++ class.
I am familiar with the language but there are a few things I have never heard of...
Generally, my teacher said that plain C++ is "unsafe". It generates "unsafe code" (whatever that means). That's why we have to use C++/CLI which is supposed to make "safe" code.
Now... isn't CLI just a Microsoft .NET extension?
He is also telling us to use Console::WriteLine() instead of cout. Since Console::WriteLine() is "safe" and cout is "unsafe".
All this seems weird to me... Can anyone clarify this?
Thanks!
To put it very bluntly and simply:
Safe
By "safe code" your teacher probably means managed code. That is code where you don't have to "care" about pointers and memory; you have a garbage collector that takes care of this for you. You are dealing with references. Examples of languages built like this are Java and C#. Code is compiled to "fictional" opcodes (an intermediate language, IL for C#), which are then compiled and run at runtime (JIT, just-in-time compilation). The generated IL has to be converted to real native, platform-specific opcodes; in Java this is one of the things the JVM does. You may easily disassemble code from languages like these, and they may run on several platforms without recompilation.
Unsafe
By "unsafe code" the teacher means ordinary native, unmanaged C++ code, where all memory and resource management is handled by you. This makes room for human error: memory leaks, resource leaks and other memory errors that you don't usually deal with in managed languages. It also compiles to native machine code (not bytecode), which means that you have to compile your code for each platform you intend to target. You will find that you have to write a lot of platform-specific code, depending on what you are building. It's nice to see that simple things such as threading, which were platform dependent, are now part of the C++ standard.
Then you have C++/CLI, which basically is a mix. You may use managed code from the .NET framework in C++, and it may be used as a bridge to write wrappers.
Console::WriteLine() is managed .NET code, safe.
cout is standard ISO C++ from <iostream>, unsafe.
You find a related post here, with a broader answer here and here :)
Edit
As pointed out by Deduplicator below this is also of interest for you
Hope it helps.
Cheers
In the world of .NET, "safe" is synonymous with "verifiable" type safety. In Visual C++, it's enabled by /clr:safe.
/clr:safe will prevent you from using std::cout or any other function or type implemented in native code, because the metadata needed by .NET's verifier does not exist for native functions. MSIL, which Stigandr mentioned, can be used for just-in-time compilation, but even when compilation to native code is performed ahead of time, the MSIL is provided alongside the compiled native code and serves as a proof of its type safety, which the verifier inspects.
Standard (native / unmanaged) C++ does check type safety during compilation. But that can be disabled by casts, and without runtime type checks, which C++ does not provide as part of the language, pointer arithmetic (e.g. array index out of bounds) can also violate type safety, as can using pointers to freed objects. C++ isn't just a language though, it is also a standard library, where you find smart pointers and smart collections that do the necessary runtime checks, so it can be just as type-safe as any managed framework.
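To make the last point concrete, here is a minimal sketch of two runtime checks the standard library offers: std::vector::at throws std::out_of_range instead of reading past the end, and std::weak_ptr reports that its object is gone instead of handing out a dangling pointer. The helper names (at_detects_out_of_range, weak_ptr_detects_expiry, demo_runtime_checks) are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Returns true when the checked accessor rejects the index.
// v[i] would be undefined behaviour for i >= v.size(); v.at(i) throws.
bool at_detects_out_of_range(const std::vector<int>& v, std::size_t i) {
    try {
        (void)v.at(i);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Returns true when a weak_ptr notices that the object it observed
// has already been destroyed.
bool weak_ptr_detects_expiry() {
    std::weak_ptr<int> observer;
    {
        auto owner = std::make_shared<int>(42);
        observer = owner;
    }  // owner goes out of scope here and the int is freed
    return observer.lock() == nullptr;
}

// Hypothetical usage example.
void demo_runtime_checks() {
    std::vector<int> v{1, 2, 3};
    assert(!at_detects_out_of_range(v, 2));  // last valid index
    assert(at_detects_out_of_range(v, 3));   // one past the end
    assert(weak_ptr_detects_expiry());
}
```

Note that these checks are opt-in: operator[] and raw pointers remain unchecked, which is the difference from verifiable managed code.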
If I understand correctly, besides the fact that the Objective-C language is a strict superset of "clean" C, the added OOP paradigm is simulated by a set of functions partially described in the Objective-C Runtime Reference.
Therefore, I'm expecting a possibility to somehow compile Objective-C code into an intermediate C/C++ file (maybe with some asm inserts).
Is it generally possible?
You could use the clang rewriter to convert to C++. Not aware of a way to go to C though.
The rewriter is available via the "-rewrite-objc" command line option.
As far as I know, there is no software that preprocesses Objective-C code into intermediate C code.
But you could write your Objective-C program entirely in C by calling directly into the Objective-C runtime. The trouble is just that the code might vary between implementations or even different versions of the same runtime.
The question is, is it actually worth the trouble?
In the documentation for COM it says that it works with literally every language. Do you need a specific API for a language so it can interface with COM, or can any language literally just use it out of the box? Also, do you need a special compiler? Sorry if this is a stupid question, but I have never used it before and I have been trying to find this answer. When I look at demos of COM examples, they all seem to access the objects in a C-style syntax; are there bindings and APIs for other languages (literally all)?
The key thing about COM is that it is a "binary standard": which is to say that it doesn't care what the language used is, so long as the bits and bytes in memory end up in the right place.
COM basically specifies that all COM objects must have a specific layout in memory: the interface pointer points to a pointer that in turn points to a table of function pointers. That table has at least three members: the first three are pointers to the IUnknown functions (QueryInterface, AddRef, Release, in that order), and the remainder are pointers to the other functions in the interface. COM also specifies how arguments are passed to these functions, so that the caller and callee agree on how the stack is used and who pops off the values.
This requirement happily matches how C++ just happens to work on Windows, so most C++ classes that implement IUnknown will just happen to end up being valid COM classes. This is because Microsoft's implementation of C++ uses an object layout that matches what COM requires: the C++ object vtable pointer is the same as the COM pointer-to-table-of-function-pointers, the C++ table of function pointers is exactly what COM requires for its table-of-function-pointers, and so on. (This isn't entirely just a happy coincidence: COM was likely designed to take advantage of the most common way that C++ objects are implemented in memory, which is the technique that MS's compiler uses. Note that the C++ language specification doesn't actually specify any particular object layout, so you could have a 3rd party C++ compiler that implemented C++ in a way that gave you classes that are not usable by COM. But no compiler vendor in their right mind would do that, since they would appear to be broken compared to the others!)
In plain C, you can create a COM object by creating suitable structs-containing-pointers manually. This works because C essentially allows you to specify binary-level memory layout for structs manually; you can create structs that you know will have the appropriate layout that COM is expecting.
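The struct-of-function-pointers idea can be sketched like this. It is written in the C-compatible subset of C++ style (plain structs, no virtual functions) and only mirrors the memory layout: real COM additionally uses HRESULT return values, GUIDs for interface IDs and the __stdcall calling convention, all of which are simplified away here. Every name (Counter, CounterVtbl, CounterImpl, create_counter, the int interface id) is hypothetical.

```cpp
#include <cassert>

struct Counter;  // hypothetical interface; clients only ever hold a Counter*

// The table of function pointers. As in IUnknown, the first three slots
// are QueryInterface, AddRef and Release; interface methods follow.
struct CounterVtbl {
    long (*QueryInterface)(Counter* self, int iid, void** out);
    unsigned long (*AddRef)(Counter* self);
    unsigned long (*Release)(Counter* self);
    int (*Next)(Counter* self);
};

// The interface pointer points at this: a pointer to the table.
struct Counter {
    const CounterVtbl* vtbl;
};

// The implementation keeps the interface as its first member, so a
// CounterImpl* and the Counter* handed to clients share one address.
struct CounterImpl {
    Counter iface;
    unsigned long refs;
    int value;
};

inline CounterImpl* impl_of(Counter* self) {
    return reinterpret_cast<CounterImpl*>(self);
}

unsigned long counter_add_ref(Counter* self) { return ++impl_of(self)->refs; }

unsigned long counter_release(Counter* self) {
    CounterImpl* impl = impl_of(self);
    unsigned long left = --impl->refs;
    if (left == 0) delete impl;  // last reference gone: object frees itself
    return left;
}

// Simplified: interface ids are ints (0 = "IUnknown", 1 = "ICounter"),
// 0 means success and -1 means "no such interface".
long counter_query_interface(Counter* self, int iid, void** out) {
    if (iid == 0 || iid == 1) {
        *out = self;
        counter_add_ref(self);
        return 0;
    }
    *out = nullptr;
    return -1;
}

int counter_next(Counter* self) { return ++impl_of(self)->value; }

const CounterVtbl kCounterVtbl = {
    counter_query_interface, counter_add_ref, counter_release, counter_next};

// Factory: returns an interface pointer holding one reference.
Counter* create_counter() {
    CounterImpl* impl = new CounterImpl{{&kCounterVtbl}, 1UL, 0};
    return &impl->iface;
}
```

A client calls through the table, e.g. c->vtbl->Next(c), and calls c->vtbl->Release(c) when done. That double indirection (object, then table, then function) is all the caller needs to agree on, which is why the implementing language does not matter.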
In other languages, especially those that don't allow the user to specify memory layout explicitly, you need support from the language to allow for COM support. All the .Net languages - C#, VB.Net, and so on - use support in the .Net runtime that understands what COM expects, and produces the appropriate wrappers as needed to allow the interop to work.
So, long story short, it's not the case that any language under the sun will automatically work with COM; it's really the case that a couple of languages - namely C and C++ - are already aligned with COM's requirements; and most other languages will need some compiler support to make it work.
I need to work on converting a very large C++ project to /clr:safe. The current C++ project uses a lot of C++ features such as templates, generics, pointers, storage/stream, OLE APIs, zlib compression APIs, inlines etc. Where can I find a detailed document for this type of conversion? Can you suggest a good book to refer to? If any of you has done such a conversion, can I get some analysis from you?
I'll just cough up the MSDN Library article titled "How to: Migrate to /clr:safe":
Visual C++ can generate verifiable components by using /clr:safe, which causes the compiler to generate errors for each non-verifiable code construct.
The following issues generate verifiability errors:
Native types. Even if it isn't used, the declaration of native classes, structures, pointers, or arrays will prevent compilation.
Global variables
Function calls into any unmanaged library, including common language runtime function calls
A verifiable function cannot contain a static_cast Operator for down-casting. The static_cast operator can be used for casting between primitive types, but for down-casting, safe_cast or a C-Style cast (which is implemented as a safe_cast) must be used.
A verifiable function cannot contain a reinterpret_cast operator (or any C-style cast equivalent).
A verifiable function cannot perform arithmetic on an interior_ptr. It may only assign to it and dereference it.
A verifiable function can only throw or catch pointers to reference types, so value types must be boxed before throwing.
A verifiable function can only call verifiable functions (such that calls to the common language runtime are not allowed, including AtEntry/AtExit, and so global constructors are disallowed).
A verifiable class cannot use Explicit.
If building an EXE, a main function cannot declare any parameters, so GetCommandLineArgs must be used to retrieve command-line arguments.
Making a non-virtual call to a virtual function.
Also, the following keywords cannot be used in verifiable code:
unmanaged and pack pragmas
naked and align __declspec modifiers
__asm
__based
__try and __except
I reckon that will keep you busy for a while. There is no magic wand to wave to turn native C++ into verifiable code. Are you sure this is worth the investment?
The vast majority of native C++ is entirely valid C++/CLI, including templates, inlines, etc, except the CLR STL is rather slow compared to the BCL. Also, native C++ doesn't have generics, only templates.
The reality of compiling as C++/CLI is to flip the switch, hit compile, and wait for it to throw errors.
Rewriting native C++ into safe C++/CLI will result in code that is syntactically different from, but semantically the same as, C#. If that is the case, why not rewrite directly in C#?
If you want to avoid what is essentially a complete rewrite, consider the following alternatives:
P/Invoke. Unfortunately, I don't know whether this would isolate safe from unsafe code. Even if it can perform the isolation, you'll need to wrap your existing C++ code in a procedural, C-like API so it can be consumed by P/Invoke. On the plus side, unless your API is excessively chatty, you get to keep (most of) your native performance.
Wrapping your C++ in an out-of-process COM server and using COM Interop to consume it from the managed code. This way, your managed code is completely protected from any corruption that might happen at the C++ end and can remain "safe". The downside is the performance hit that you'll take for out-of-process marshaling and the effort you'll need to expend to implement the COM server correctly.
What is the Objective-C equivalent of the Java Language Specification or the C++ Standard?
Is it this:
http://developer.apple.com/documentation/Cocoa/Conceptual/ObjectiveC/Introduction/introObjectiveC.html ?
(I'm just looking for an (official) authoritative document which will explain the little nitty-gritties of the language. I'll skip the why for now :)
Appendix A of the document you linked to is a description of all of the language features, which is the closest we have to a specification (Appendix B used to be a grammar specification, but they've clearly removed that from the later versions of the document).
There has never been a standardisation of Objective-C and it's always been under the control of a single vendor: initially StepStone, then NeXT Computer licensed it (and ultimately bought the IP), and finally Apple consumed NeXT Software. I expect there's little motivation to go through the laborious process of standardisation on Apple's part, especially as there are no accusations of ObjC being an anticompetitive platform which standardisation could mitigate.
There is none. The link you provided is the only 'official' documentation, which is essentially a prose description, and not a rigorous language specification. Apple employees suggest that this is sufficient for most purposes, and if you require something more formal you should file a bug report (!). Sadly, the running joke is the Objective-C standard is defined by whatever the compiler is able to compile.
Many consider Objective-C to be either a "strict superset" or "superset" of C. IMHO, for 'classic' Objective-C (or, Objective-C 1.0), I would consider this to be a true statement. In particular, I'm not aware of any Objective-C language addition that does not map to an equivalent "plain C" statement. In essence, this means the Objective-C additions are pure syntactic sugar, and one can use the C standard in effect to reason about the nitty gritty. I'm not convinced that this is entirely true for Objective-C 2.0 with GC enabled. This is because pointers to GC managed memory need to be handled specially (the compiler must insert various barriers depending on the particulars of the pointer). Since the GC pointer type qualifiers, such as __strong, are implemented as __attribute__(()) inside gcc, this means that void *p; and void __strong *p; are similarly qualified pointers according to the C99 standard. The problems that this can cause, and even the ability to write programs that operate in a deterministic manner, are either self evident or not (consult your local language lawyer or compiler writer for more information).
Another problem that comes up from time to time is that the C language has continued to evolve relative to the Objective-C language. Objective-C dates back to the mid 1980s, which is pre-ANSI-C standard time. Consider the following code fragment:
NSMutableArray *mutableArray = [NSMutableArray array];
NSArray *array = mutableArray;
This is legal Objective-C code as defined by the official prose description of the language. This is also one of the main concepts behind Object Oriented programming. However, when one considers those statements from the perspective of "strict superset of C99", one runs into a huge problem. In particular, this violates C99's strict aliasing rules. A standards grade language specification would help clarify the treatment and behavior of such conflicts. Unfortunately, because no such document exists, there can be much debate over such details, which ultimately results in bugs in the compiler. This has resulted in a bug in gcc that dates all the way back to version 3.0 (gcc bug #39753).
Apple's document is about the best you're going to get. Like many other languages, Objective-C doesn't have a formal standard or specification; rather, it is described mostly by its canonical implementations.
Further resources include:
The Objective-C Language and GNUstep Base Library Programming Manual.
The NeXT developers library
Apple now uses clang from the llvm.org project.
Some of the language elements are defined in this context
e.g. Objective-C literals --> http://clang.llvm.org/docs/ObjectiveCLiterals.html
But I didn't find a clear overview of all elements.
--- updated --
The source of Apple's clang is available (as open source) here:
http://opensource.apple.com/source/clang/