Advantage of std::wstring over CComBSTR - com

Several questions discuss converting between std::wstring and CComBSTR, like this one, but what is the advantage of each over the other if both are available in a project?

std::wstring has more methods for actual string handling, whereas CComBSTR is meant specifically for holding a BSTR string. BSTRs are used mostly by COM methods and have a different memory layout. Generally you should use std::wstring or CString unless you actually need the memory layout of BSTRs.
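To make the "different memory layout" concrete: a BSTR variable points at the first UTF-16 character of a buffer that starts with a 4-byte length prefix (the byte count, not including the terminator) and ends with a null terminator. The sketch below imitates that layout in portable C++ so it runs anywhere. FakeBstr, makeFakeBstr and fakeBstrByteLen are hypothetical names invented for this illustration, not the Windows API; real code would use SysAllocString and SysStringByteLen, or CComBSTR. It uses char16_t because wchar_t is not 16 bits wide on every platform.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a BSTR allocation (illustration only):
//   [4-byte length in bytes][UTF-16 code units][terminating 0]
struct FakeBstr {
    std::vector<char16_t> storage;  // 2 units of prefix + characters + terminator

    // The value a BSTR variable holds: a pointer to the first character,
    // which sits just past the length prefix.
    const char16_t* get() const { return storage.data() + 2; }
};

FakeBstr makeFakeBstr(const std::u16string& text) {
    // The prefix is only 32 bits wide, so the byte count must fit in it.
    assert(text.size() <= UINT32_MAX / sizeof(char16_t));
    FakeBstr b;
    b.storage.assign(2 + text.size() + 1, u'\0');
    const std::uint32_t byteLen =
        static_cast<std::uint32_t>(text.size() * sizeof(char16_t));
    std::memcpy(b.storage.data(), &byteLen, sizeof byteLen);   // length prefix
    std::memcpy(b.storage.data() + 2, text.data(), byteLen);   // characters
    return b;
}

// Reads the prefix stored immediately before the character pointer,
// in the spirit of SysStringByteLen.
std::uint32_t fakeBstrByteLen(const char16_t* p) {
    std::uint32_t byteLen = 0;
    std::memcpy(&byteLen,
                reinterpret_cast<const unsigned char*>(p) - sizeof byteLen,
                sizeof byteLen);
    return byteLen;
}
```

A std::wstring gives no such guarantee about what precedes its character data, which is why a BSTR (or a CComBSTR wrapper) is needed at a COM boundary that expects this layout.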

Related

Why do two NSStrings taken from user input end up with the same address?

I am curious to know how the OS or compiler manages the memory usage. Take this example of a login screen.
If I enter the same string for both user ID and password, say "anoop" both times, the following two strings have the same address:
NSString *userID = self.userNameField.stringValue;
NSString *password = self.passwordField.stringValue;
If I enter "anoop" and "Anoop" respectively, the addresses differ.
How does the compiler know that the password text is the same as the user ID, so that instead of allocating new memory it reuses the same reference?
The answer in this case is that you are testing on a 64-bit platform with Objective-C tagged pointers. In this system, some "pointers" are not memory addresses at all; they encode data directly in the pointer value instead.
One kind of tagged pointer on these systems encodes short strings of common characters entirely in the pointer, and "anoop" is a string that can be encoded this way.
Check out Friday Q&A 2015-07-31: Tagged Pointer Strings for a very detailed explanation.
EDIT: see this answer for an explanation of the exact mechanism for short strings on the 64-bit runtime.
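The idea of storing a short string inside the "pointer" itself can be sketched in a few lines. This is a deliberately simplified, hypothetical scheme, not the real Objective-C runtime layout (which is an implementation detail and packs characters more tightly); tryEncodeInline, decodeInline and the bit positions are assumptions made for illustration. It shows why two equal short strings produce the same value while "anoop" and "Anoop" do not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Simplified, hypothetical tagged-value scheme (NOT Apple's real bit layout):
//   bit 0      : tag bit, 1 = "payload is stored inline, this is not an address"
//   bits 1..3  : length (0..7)
//   bits 8..63 : up to 7 ASCII bytes
constexpr std::uint64_t kTagBit = 1;

// Returns true and fills `out` if `s` is short enough to be stored inline.
bool tryEncodeInline(const std::string& s, std::uint64_t& out) {
    if (s.size() > 7) return false;
    std::uint64_t v = kTagBit | (static_cast<std::uint64_t>(s.size()) << 1);
    for (std::size_t i = 0; i < s.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        if (c > 0x7F) return false;  // keep the sketch ASCII-only
        v |= static_cast<std::uint64_t>(c) << (8 * (i + 1));
    }
    out = v;
    return true;
}

// Recovers the string from a value produced by tryEncodeInline.
std::string decodeInline(std::uint64_t v) {
    assert(v & kTagBit);
    const std::size_t len = (v >> 1) & 0x7;
    std::string s;
    for (std::size_t i = 0; i < len; ++i)
        s.push_back(static_cast<char>((v >> (8 * (i + 1))) & 0xFF));
    return s;
}
```

Because the value is computed purely from the characters, no lookup or comparison against existing strings is needed: entering "anoop" twice yields the same bits both times.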
Non-mutable NSString instances (well, the concrete immutable subclass instances, as NSString is a class cluster) are uniqued by the NSString class under certain conditions. Even copying such a string will just return itself instead of creating an actual copy.
It may seem strange, but works well in practice, as the semantics of immutable strings allow it: the string cannot change, ever, so a copy of it will forever be indistinguishable from the original string.
Some of the benefits are: smaller memory footprint, less cache thrashing, faster string compares (as two constant strings just need their addresses compared to check for equality).
First of all, I'd like to know how you checked that. ;-)
User input happens while the program is running. At that point the compiler is no longer involved, so it cannot perform any optimization such as string matching.
But your question is broader than your example, and the answer is the same as in many programming languages: if the memory usage is known at compile time ("static", "stack allocated"), the compiler generates code that reserves the memory on the stack. This applies to local variables.
If the memory usage is not known at compile time, for example when the program allocates memory explicitly, the code asks the system at run time to reserve the required amount of memory.

Is there any gain in Swift by defining constants instead of variables as much as possible?

Is there any gain in speed, memory usage, or anything else in Swift from defining as many values as possible as constants rather than variables?
I mean, declaring as much as possible with let instead of var?
In theory, there should be no difference in speed or memory usage - internally, the variables work the same. In practice, letting the compiler know that something is a constant might result in better optimisations.
However the most important reason is that using constants (or immutable objects) helps to prevent programmer errors. It's not by accident that method parameters and iterators are constant by default.
Using immutable objects is also very useful in multithreaded applications because they prevent one class of synchronization problems.

COM memory layout and method pointer sizes

Suppose I get a pointer to a COM interface in a totally untyped way as just a raw address
void *p
How do I find the addresses of methods and access them? Is *p the address of the virtual table, and is **p then the address of the first method? Are all the pointers involved always 32-bit in COM, so that to find a particular method I just need to index at multiples of 4 bytes into the table, assuming I know at which index the method appears? Is there any potential issue with big-endian vs little-endian?
Yes, technically p should point to a location that holds the vtable pointer: *p is the address of the vtable and **p is the address of the first method. Methods appear in the vtable in the order they were declared, starting with the IUnknown methods. Each entry is pointer-sized, so the stride is 4 bytes only in a 32-bit process; in a 64-bit process it is 8 bytes.
But calling methods by index makes your code type-unsafe: the compiler has no way to check that the parameters you pass are correct. Big-endian vs little-endian only matters if your COM object is out-of-process and on another host. Proxy objects take care of that, so it is transparent to the client.
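The pointer chasing described above can be sketched without any Windows headers. FakeObject, Method, kVtbl and callSlot are hypothetical names for this illustration, and every slot is given the same signature to keep it short; real COM methods have different signatures per slot, return HRESULT or ULONG, and use the platform's COM calling convention.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, portable imitation of a COM-style object (illustration only):
// the first pointer-sized field of the object is a pointer to a table of
// function pointers.
struct FakeObject;
using Method = int (*)(FakeObject* self);

struct FakeObject {
    const Method* vtbl;  // must be the first member, as in COM's binary layout
    int refs;
};

int fakeQueryInterface(FakeObject*)   { return 0; }              // slot 0
int fakeAddRef(FakeObject* self)      { return ++self->refs; }   // slot 1
int fakeRelease(FakeObject* self)     { return --self->refs; }   // slot 2

const Method kVtbl[] = { fakeQueryInterface, fakeAddRef, fakeRelease };

// Given only an untyped address, reach slot `index`:
//   *p is the vtable pointer, (*p)[index] is the method address.
int callSlot(void* p, std::size_t index) {
    assert(index < sizeof kVtbl / sizeof kVtbl[0]);
    const Method* vtbl = *static_cast<const Method* const*>(p);
    return vtbl[index](static_cast<FakeObject*>(p));
}
```

Indexing through a typed pointer (vtbl[index]) lets the compiler apply the correct stride for the target, which avoids hard-coding 4 bytes. The type-safety concern remains: nothing checks that the slot really has the signature the caller assumes.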

Can I assume and handle SEL in Objective-C as a pointer to something?

I'm trying to interface Lua with Objective-C, and I think string conversion with NSSelectorFromString() has too big an overhead because Lua has to copy all strings to internalize them (although I'm not sure about this).
So I'm trying to find more lightweight way to represent a selector in Lua.
An Objective-C selector is an abstracted type, but it's defined as a pointer to something:
typedef struct objc_selector *SEL;
So it looks safe to handle as a regular pointer, so I can pass it to Lua with lightuserdata. Is this fine?
I don't believe it is safe to handle it as a pointer (even a void pointer), because this could change in a future implementation or differ in another implementation of the language. I haven't seen a formal Objective-C spec that says what is implementation-defined, but when opaque types like this are used it often means that you shouldn't need to know the details of the underlying type. In fact, the struct is only forward-declared so that you can't access any of its members.
The other problem you might run into is implementing equality comparisons: are selectors references into a pool of constants, or is each selector mutable? Once again, this is implementation-defined.
Using C strings as suggested above is probably your best bet; Ruby manages to use symbols for selectors without too much of a performance penalty. Since the strings are const, Lua doesn't need to copy them, but it probably does anyway to be safe. If you can find a way to avoid copying the strings, you might not take much of a performance hit.

Determining what a CFTypeRef is?

I have a function which returns CFTypeRef. I have no idea what it really is. How do I determine that? For example it might be a CFStringRef.
Use CFGetTypeID():
if (CFGetTypeID(myObjectRef) == CFStringGetTypeID()) {
    // myObjectRef is a CFString
}
The short answer is that you can (see Dave DeLong's answer). The long answer is that you can't. Both are true. A better question might be "Why do you need to know?" In my opinion, if you can arrange things so that you don't need to know, you're probably going to be better off.
I'm not saying that you can't do it, or even that you shouldn't. What I am saying is that there are some hidden gotchas when you start down this path, and sometimes you're not really aware of what all the unstated assumptions are. Unfortunately, programming correctly depends on knowing all the little details. Off the top of my head, here are a few of the potential gotchas:
To the best of my knowledge, the set of Core Foundation types has grown in each major OS release. Therefore each major OS release has a superset of the Core Foundation types of the previous releases, and likely a strict superset at that. This is "observed behavior", and not necessarily "guaranteed" behavior. The important thing to note is that "things can and do change", and all things being equal, the easier and simpler solutions tend not to take this into account. It is generally considered poor programming style to code something that breaks in the future, regardless of the reason or justification.
Because of Toll-Free Bridging between Core Foundation and Foundation, just because a CFTypeRef = CFStringRef does not mean that a CFTypeRef ≡ CFStringRef, where = means "equal to" and ≡ means "identical to". There is a distinction, which may or may not be important depending on context. As a warning, this tends to be where the bugs roam freely.
For example, a CFMutableStringRef can be used wherever a CFStringRef can be used, or CFStringRef = CFMutableStringRef. However, you cannot use a CFStringRef everywhere a CFMutableStringRef can be used, for obvious reasons. This means CFStringRef ≢ CFMutableStringRef. Again, depending on the context, they can be equal, but they are not identical.
It is very important to note that while there is a CFStringGetTypeID(), there is no corresponding CFMutableStringGetTypeID().
Logically, CFMutableStringRef is a strict superset of CFStringRef. It would follow, then, that passing a bona fide immutable CFStringRef to a CFMutableString API call would cause "some kind of problem". While this may not be true now (i.e., 10.6), I know for a fact that the following was true in the past: The CFMutableString API calls did not verify that "the string argument" was actually mutable (this was actually true for all types that made a distinction between immutable and mutable). The checks were there, but they were in the form of debug assertions that were disabled on "Release" builds (in other words, the checks were never performed in practice).
This is (or possibly was) officially not considered to be a bug, and the (trivial) mutability checks were not done "for performance reasons". No "public" API is provided to tell the mutability of a CFString pointer (or the mutability of any type). Combined with Toll-Free Bridging, this meant that you could mutate immutable NSString objects, even though the NSMutableString APIs did perform a mutability check and caused "some kind of problem" when trying to mutate an immutable object. Flavor with the fact that @"" constant strings in your source are mapped to read-only memory at run time.
The official line, as I recall, was "not to pass immutable objects, either CFStringRef or NSString, to CFMutableString APIs, and furthermore, it was a bug to do so". When it was pointed out that there might be some security-related issues with this stance (never mind the fact that it was fundamentally impossible), say if anything ever made the mistake of critically depending on the immutability of a string, especially "well known" strings, the answer was "the problem is theoretical and nothing will be done at this time until a workable exploit can be demonstrated."
Update: I was curious to see what the current behavior is. On my machine, running 10.6.4, using CFMutableString APIs on an immutable CFString causes the immutable string to become essentially @"", which is at least better than what it did before (<= 10.5), which was to actually mutate the string. It is definitely not the ideal solution; it has that bitter real-world taste where its only redeeming quality is that it is "the least worst solution".
So remember, be careful in your assumptions! You can do it, but if you do, it's more important that you not do it wrong. :) Of course, a lot of "wrong" solutions will work, so the fact that things are working is not necessarily proof that you're doing it right. Good times!
Also, in a Duck Typed system it is often considered bad form, and possibly even a bug, to "look too closely at the type of an object". Objective-C is definitely a Duck Typed system and this unquestionably bleeds over into Core Foundation due to the tight coupling of Toll-Free Bridging. CFTypeRef is a direct manifestation of this Duck Type ambiguity and, depending heavily on the context, may be an explicit way of saying "You are not supposed to be looking too closely at the types".
If you want to find out what type a CFTypeRef is during development, you can use the following snippet.
CFStringRef typeName = CFCopyTypeIDDescription(CFGetTypeID(myObjectRef));
CFShow(typeName); // logs the type's description to stderr
CFRelease(typeName); // a "Copy" function returns a reference the caller owns
This will print a human-readable name for the type so you know what it is. But Apple makes no guarantee that these descriptions will stay consistent, so don't use this in production code.