ComBSTR assignment - com

I'm confused about COM string assignments. Which of the following string assignments is correct, and why?
CComBSTR str;
.
.
Obj->str = L"" //Option1
OR should it be
Obj->str = CComBSTR(L"") //Option2
What is the reason

A real BSTR is:
allocated from the COM heap (via SysAllocString() and family)
a data structure in which the string data is preceded by its length, stored in a 32-bit value.
passed as a pointer to the fifth byte of that data structure, where the string data resides.
See the documentation:
MSDN: BSTR
Most functions which accept a BSTR will not crash when passed a BSTR created by simple assignment from a wide string literal. This leads to confusion: people observe what seems to be working code and infer that a BSTR can be initialized just like any WCHAR *. That inference is incorrect.
Only real BSTRs can be passed to OLE Automation interfaces.
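The layout described above can be modeled in portable C++. This is only a sketch: alloc_prefixed, prefixed_byte_len, and free_prefixed are hypothetical stand-ins for the SysAllocString family, and char16_t stands in for the 2-byte WCHAR used on Windows. It shows why a plain literal is not a real BSTR: nothing meaningful precedes its first character.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical stand-in for SysAllocString: builds
// [4-byte length in bytes][characters][terminator]
// and returns a pointer to the characters (the fifth byte of the block).
char16_t* alloc_prefixed(const std::u16string& s) {
    const std::uint32_t byte_len =
        static_cast<std::uint32_t>(s.size() * sizeof(char16_t));
    unsigned char* block = static_cast<unsigned char*>(
        std::malloc(4 + byte_len + sizeof(char16_t)));
    if (!block) return nullptr;
    std::memcpy(block, &byte_len, 4);                        // length prefix
    std::memcpy(block + 4, s.data(), byte_len);              // string data
    std::memset(block + 4 + byte_len, 0, sizeof(char16_t));  // terminator
    return reinterpret_cast<char16_t*>(block + 4);
}

// Reads the length prefix stored in the four bytes before the data.
// Only valid for pointers returned by alloc_prefixed.
std::uint32_t prefixed_byte_len(const char16_t* p) {
    std::uint32_t n = 0;
    std::memcpy(&n, reinterpret_cast<const unsigned char*>(p) - 4, 4);
    return n;
}

// Hypothetical stand-in for SysFreeString: frees the whole block.
void free_prefixed(char16_t* p) {
    if (p) std::free(reinterpret_cast<unsigned char*>(p) - 4);
}
```

Calling prefixed_byte_len on a pointer to an ordinary literal would read four unrelated bytes, which is the portable analogue of handing a fake BSTR to code that calls SysStringLen().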
By using the CComBSTR() constructor, which calls SysAllocString(), your code will create a real BSTR. The CComBSTR() destructor will take care of returning the allocated storage to the system via SysFreeString().
If you pass the CComBSTR to an API which takes ownership, be sure to call the .Detach() method so that the destructor does not free the BSTR as well. BSTRs are not reference counted (unlike COM objects, which are), so freeing a BSTR more than once is undefined behavior and may crash.
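The ownership rule can be sketched with a small portable wrapper. OwnedStr, dup_str, release_str, and consume are all hypothetical names used for illustration; OwnedStr plays the role of CComBSTR and consume plays the role of an API that takes ownership. The counter only exists to show that the buffer is released exactly once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

static int g_frees = 0;  // number of releases, for illustration only

char* dup_str(const char* s) {
    const std::size_t n = std::strlen(s) + 1;
    char* d = static_cast<char*>(std::malloc(n));
    if (d) std::memcpy(d, s, n);
    return d;
}

void release_str(char* p) {
    if (p) {
        ++g_frees;
        std::free(p);
    }
}

// Owning wrapper: frees its buffer in the destructor unless detached.
class OwnedStr {
public:
    explicit OwnedStr(const char* s) : p_(dup_str(s)) {}
    ~OwnedStr() { release_str(p_); }
    OwnedStr(const OwnedStr&) = delete;
    OwnedStr& operator=(const OwnedStr&) = delete;

    // Gives up ownership: the caller (or callee) must now free the buffer.
    char* detach() {
        char* t = p_;
        p_ = nullptr;
        return t;
    }

private:
    char* p_;
};

// Hypothetical API that takes ownership of its argument and frees it.
void consume(char* p) { release_str(p); }

// Returns how many times the buffer was released when detach() is used.
int frees_with_detach() {
    g_frees = 0;
    {
        OwnedStr s("hello");
        consume(s.detach());  // the wrapper no longer owns the buffer
    }                         // destructor sees nullptr and frees nothing
    return g_frees;
}
```

Without the detach() call, both consume and the destructor would release the same buffer, which is the double free the answer warns about.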

If you use str = CComBSTR(L"") you use the constructor (for a wide literal, the LPCOLESTR overload):
CComBSTR( LPCOLESTR pSrc );
If you use str = L"" you use the assignment operator:
CComBSTR& operator =(LPCOLESTR pSrc);
They both would initialize the CComBSTR object correctly.

Personally, I'd prefer option 1, because that doesn't require constructing a new CComBSTR object. (Whether their code does so behind the scenes is a different story, of course.)

Option 1 is preferred because it does only one allocation for the string, whereas option 2 does two (notwithstanding the creation of a new temporary object for no particular reason). Unlike the _bstr_t type in VC++, the ATL one does not do reference-counted strings, so it will copy the entire string across.
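The allocation count can be illustrated with a toy wrapper. CountingStr is a hypothetical class, not ATL code; it assumes a non-reference-counted wrapper with copy assignment only. A wrapper that also provides move assignment could avoid the second allocation in option 2.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

static int g_allocs = 0;  // number of buffer allocations, for illustration

class CountingStr {
public:
    CountingStr() : p_(nullptr) {}
    CountingStr(const wchar_t* s) : p_(copy(s)) {}
    CountingStr(const CountingStr& o) : p_(copy(o.p_)) {}
    ~CountingStr() { delete[] p_; }

    // Direct assignment from a literal: one allocation.
    CountingStr& operator=(const wchar_t* s) {
        wchar_t* fresh = copy(s);
        delete[] p_;
        p_ = fresh;
        return *this;
    }

    // Copy assignment: deep copy, because nothing is reference counted.
    CountingStr& operator=(const CountingStr& o) {
        if (this != &o) {
            wchar_t* fresh = copy(o.p_);
            delete[] p_;
            p_ = fresh;
        }
        return *this;
    }

private:
    static wchar_t* copy(const wchar_t* s) {
        if (!s) return nullptr;
        ++g_allocs;
        const std::size_t n = std::wcslen(s) + 1;
        wchar_t* d = new wchar_t[n];
        std::wmemcpy(d, s, n);
        return d;
    }
    wchar_t* p_;
};

// Option 1: assign the literal directly.
int allocs_option1() {
    g_allocs = 0;
    CountingStr s;
    s = L"abc";
    return g_allocs;
}

// Option 2: build a temporary, then copy-assign it.
int allocs_option2() {
    g_allocs = 0;
    CountingStr s;
    s = CountingStr(L"abc");
    return g_allocs;
}
```

In this model option 1 allocates once and option 2 allocates twice: once for the temporary and once for the deep copy into s.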

Related

Why do I have to store the result of a function in a variable?

Is it because some functions will change the object and some don't so you have to store the returned value in a variable? I'm sure there's a better way to ask the question, but I hope that makes sense.
Example case: Why doesn't thisString stay capitalized? What happens to the output of the toUpperCase() function when I call it on thisString? Is there a name for this behavior?
var thisString: String = "this string"
var thatString: String = "that string"
thisString.toUpperCase()
thatString = thatString.toUpperCase()
println(thisString)
println(thatString)
which prints:
this string
THAT STRING
By convention if a function starts with the word to or a past participle, it always returns a new object and does not mutate the object it's called on. But that's not exclusively true. Functions that begin with a verb may or may not mutate the object, so you have to check the documentation to know for sure.
A mutable object might still have functions that return new objects. You have to check the documentation for the function you call.
For a function that returns a new object, if you don't do anything with the returned result or store it in a variable, it is lost to the garbage collector and you can never retrieve it.
String is an immutable class, so none of the functions you call on it will ever modify the original object. Immutable classes are generally less error-prone to work with because you can't accidentally modify an instance that's still being used somewhere else.
All the primitives are also immutable. If all the properties of a class are read-only vals and all the class types they reference are also immutable classes, then the class is immutable.
If you want a mutable alternative to String, you can use StringBuilder, StringBuffer, CharArray, or MutableList<Char>, depending on your needs. They all have different pros and cons.
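The same distinction exists outside Kotlin. As a rough illustration in C++ (the language used for the other sketches on this page), to_upper_copy and make_upper_in_place are hypothetical helpers: the first returns a new string and leaves its argument alone, like toUpperCase(), while the second mutates its argument. The [[nodiscard]] attribute asks the compiler to warn when the returned copy is ignored.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Returns an upper-cased copy; the argument is not modified.
[[nodiscard]] std::string to_upper_copy(const std::string& s) {
    std::string out = s;
    for (char& c : out) {
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    return out;
}

// Mutates the argument in place and returns nothing.
void make_upper_in_place(std::string& s) {
    for (char& c : s) {
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
}
```

Calling to_upper_copy(thisString) without storing the result leaves thisString unchanged, which matches the behavior seen in the question.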
Why doesn't thisString stay capitalized?
Because that's how the function was coded (emphasis mine):
"Returns a copy of this string converted to upper case using the rules of the default locale."
What happens to the output of the toUpperCase() function when I call it on thisString?
Nothing. If you don't assign it to a variable (save a reference to it) it's discarded.
Is there a name for this behavior?
AFAIK, this is simply "ignoring the return value".
Hope that helps.

Is passing NULL for COM interface arguments valid?

If I have a COM interface method expecting BSTR and SAFEARRAY parameters, but these are optional, what is the correct way to implement this? Can I pass NULL or do I need to pass empty strings and zero-length arrays? Or would I be better passing VARIANTs which can be VT_EMPTY or VT_BSTR / VT_ARRAY?
e.g.
Login([in]BSTR Name, [in]BSTR Password /*optional*/);
SendEmail([in]SAFEARRAY *To, [in]SAFEARRAY *Cc /*optional*/);
In these examples, should Password be passed as NULL or ""? And should Cc be passed as NULL, or do I need to create a 0-length SAFEARRAY, or pass a VARIANT of type VT_EMPTY... which are valid/sensible options?
Well, those sort of arguments really aren't quite right--the MIDL compiler should throw a warning or even an error if you try to make anything other than a VARIANT to be "optional".
The correct way is to define default values ("defaultvalue"). For BSTRs you want to make the default value to be L"" and not 0 (NULL). If you make the default value for BSTRs to be 0, you will run into problems down the road--I think in some .NET interop.
For the SAFEARRAY it should be safe to make the "defaultvalue" to be NULL.
Of course, this advice is from the point of designing how the interface ought to be. You may be in the situation where someone has already designed and implemented the interface. In that case, you're at the mercy of their implementation. For the BSTR arguments, I would try passing in empty strings (L"") and for the SAFEARRAY I would try passing in NULL.
If you are going to define it as "optional", make it a variant. And in that case, the correct argument is VT_EMPTY.
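On the implementing side, a defensive callee can treat a NULL string and an empty string identically, so callers are safe either way. A minimal portable sketch, assuming plain wide-string pointers in place of BSTRs; optional_len and has_password are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Length of an optional wide-string argument, treating a null pointer
// exactly like an empty string.
std::size_t optional_len(const wchar_t* s) {
    return s ? std::wcslen(s) : 0;
}

// Hypothetical callee logic: the password counts as "not supplied"
// when the pointer is null or the string is empty.
bool has_password(const wchar_t* password) {
    return optional_len(password) != 0;
}
```

With this approach Login(name, NULL) and Login(name, L"") take the same code path in the callee.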

Why Microsoft CRT is so permissive regarding a BSTR double free

This is a simplified version of the question I asked here. I'm using VS2010 (CRT v100) and it never complains in any way when I double free a BSTR.
BSTR s1=SysAllocString(L"test");
SysFreeString(s1);
SysFreeString(s1);
Ok, the question is highly hypothetical (actually, the answer is :).
SysFreeString takes a BSTR, which is a pointer, which actually is a number which has a specific semantic. This means that you can provide any value as an argument to the function, not just a valid BSTR or a BSTR which was valid moments ago. In order for SysFreeString to recognize invalid values, it would need to know all the valid BSTRs and to check against all of them. You can imagine the price of that.
Besides, it is consistent with other C, C++, COM or Windows APIs: free, delete, CloseHandle, IUnknown::Release... all of them expect YOU to know whether the argument is eligible for releasing.
In a nutshell, your question is: "I am calling SysFreeString with an invalid argument. Why does the compiler allow this?"
The Visual C++ compiler allows the call and does not issue a warning because the call itself is valid: the argument type matches, the API function is good, and the call can be converted to binary code that executes. The compiler has no knowledge of whether your argument is valid; you are responsible for tracking this yourself.
The API function on the other hand expects that you pass valid argument. It might or might not check its validity. Documentation says about the argument: "The previously allocated string". So the value is okay for the first call, but afterward the pointer value is no longer a valid argument for the second call and behavior is basically undefined.
Nothing to do with the CRT, this is a winapi function. Which is C based, a language that has always given programmers enough lengths of rope to hang themselves by invoking UB with the slightest mistake. Fast and easy-to-port has forever been at odds with safe and secure.
SysFreeString() doesn't win any prizes, clearly it should have had a BOOL return type. But it can't, the IMalloc::Free() interface function was fumbled a long time ago. Nothing you can't fix yourself:
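BOOL SafeSysFreeString(BSTR* str) {
    if (str == NULL) {
        // Report the bad argument instead of silently ignoring it
        SetLastError(ERROR_INVALID_PARAMETER);
        return FALSE;
    }
    SysFreeString(*str);
    *str = NULL;  // a second call now frees NULL, which is a no-op
    return TRUE;
}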
Don't hesitate to yell louder, RaiseException() gives a pretty good bang that is hard to ignore. But writing COM code in C is cruel and unusual punishment, outlawed by the Geneva Convention on Programmers Rights. Use the _bstr_t or CComBSTR C++ wrapper types instead.
But do watch out when you slice the BSTR out of them, they can't help when you don't or can't use them consistently. Which is how you got into trouble with that VARIANT. Always pay extra attention when you have to leave the safety of the wrapper, there are C sharks out there.
See this quote from MSDN:
Automation may cache the space allocated for BSTRs. This speeds up
the SysAllocString/SysFreeString sequence.
(...)if the application allocates a BSTR and frees it, the free block
of memory is put into the BSTR cache by Automation(...)
This may explain why calling SysFreeString(...) twice with the same pointer does not produce a crash, since the memory is still available (kind of).

NULL in/out parameter in COM

My COM object has a method, in IDL defined as:
HRESULT _stdcall my_method( [in] long value, [in, out] IAnotherObject **result );
Is the caller allowed to call this method like so:
ptr->my_method(1234, NULL);
or would the caller be violating the COM specification in doing so?
In other words, should my code which implements this function check result != NULL before proceeding; and if so, does the COM spec require that I return E_INVALIDARG or E_POINTER or something; or would it be acceptable for my function to continue on and return 0 without allocating an AnotherObject ?
My object is intended to be Automation-compatible; and it uses standard marshaling.
Note: Question edited since my original text. After posting this question I discovered that optional should only be used for VARIANT, and an [in, out] parameter where result != NULL but *result == NULL should be treated like an out parameter, and I must allocate an object.
The Rules of the Component Object Model say:
The in-out parameters are initially allocated by the caller, then freed and re-allocated by the callee if necessary. As with out parameters, the caller is responsible for freeing the final returned value. The standard COM memory allocator must be used.
So, passing NULL is a violation. You can see several violations of COM rules even in Microsoft's own interfaces, such as IDispatch, where a few [out] parameters accept NULL, but that's because they have remote interface methods (see [local] and [call_as]) that most probably allocate the needed memory when crossing apartments, or otherwise perform custom marshaling.
EDIT: To further answer your questions.
I recommend you check for NULL [out] (or [in, out]) arguments and return E_POINTER when you find one. This will allow you to catch/detect most common errors early instead of raising an access violation.
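A minimal portable sketch of that check follows. HResult, kOk, and kPointer are local stand-ins for HRESULT, S_OK, and E_POINTER so the example compiles without Windows headers, and AnotherObject is a plain struct rather than a COM interface. The handling of *result == NULL follows the note in the question (treat it like an out parameter and allocate).

```cpp
#include <cassert>
#include <cstdint>

using HResult = std::int32_t;                                     // stand-in for HRESULT
constexpr HResult kOk = 0;                                        // S_OK
constexpr HResult kPointer = static_cast<HResult>(0x80004003u);   // E_POINTER

struct AnotherObject {
    long value;
};

// Sketch of a method with an [in, out] pointer parameter.
HResult my_method(long value, AnotherObject** result) {
    if (result == nullptr) {
        return kPointer;  // reject a NULL [in, out] argument up front
    }
    if (*result == nullptr) {
        // Nothing passed in: behave like an [out] parameter and allocate.
        *result = new AnotherObject{value};
    } else {
        // Caller supplied an object: update it in place.
        (*result)->value = value;
    }
    return kOk;
}
```

The caller remains responsible for releasing whatever ends up in *result; in real COM code that would be a Release() call rather than delete.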
Yes, you should check for argument validity.
If the client is in-process (and same apartment, etc.) with the server, there's nothing (no proxy, no stub) to protect your code from being called with a NULL.
So you're the only one left there to enforce any COM rule, whether that's considered to be a "violation" or not.
PS: defining in+out (w/o using VARIANTs) for Automation clients seems a bit unusual IMHO. I'm not sure all Automation clients can use this (VBScript?)

What is the equivalent of a C pointer in VB.NET?

What is the most similar thing in VB.NET to a pointer, meaning like C pointers?
I have a TreeView within a class. I need to expose some specific nodes (or leaves) that can be modified by external objects.
C#, and I also believe VB.Net, will work on the concept of references. Essentially, it means when you say
A a = new A()
the 'a' is a reference, and not the actual object.
So if I go
B b = a
b is another reference to the same underlying object.
When you want to expose any internal objects, you can simply do so by exposing 'properties'. Make sure, that you do not provide setters for the properties, or that if you do, there is code to check if the value is legal.
ByRef is used when you want to pass the object as a parameter, and when you want the called method to be able to change the reference (as opposed to the object).
As mentioned above, if you post some code, it will be easier to explain.
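For readers coming from C or C++, the difference between sharing a reference and passing one ByRef can be approximated with std::shared_ptr. This is only an analogy, not how the CLR works internally; Node, rename_node, and replace_node are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Node {
    std::string text;
};

// Like a ByVal reference: the callee can change the object it refers to,
// but rebinding the local copy does not affect the caller's reference.
void rename_node(std::shared_ptr<Node> n) {
    n->text = "renamed";
    n = std::make_shared<Node>(Node{"ignored"});  // caller never sees this
}

// Like ByRef: the callee can make the caller's reference point elsewhere.
void replace_node(std::shared_ptr<Node>& n) {
    n = std::make_shared<Node>(Node{"replaced"});
}
```

After auto b = a; both names refer to the same Node, so a change made through one is visible through the other, just as with B b = a in the example above.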
Nathan W has already suggested the IntPtr structure, which can represent a pointer or handle. However, whilst this structure is part and parcel of the .NET Framework, .NET really doesn't have pointers per se, and certainly not like C pointers.
This is primarily because the .NET Framework is a "managed" platform and memory is managed, assigned, allocated and deallocated by the CLR without you, the developer, having to worry about it (i.e. no malloc commands!) It's mostly because of this memory management that you don't really have access to direct memory addresses.
The closest thing within the .NET Framework that can be thought of as a "pointer" (but really isn't one) is the delegate. You can think of a delegate as a "function pointer", however, it's not really a pointer in the strictest sense. Delegates add type-safety to calling functions, allowing code that "invokes" a delegate instance to ensure that it is calling the correct method with the correct parameters. This is unlike "traditional" pointers as they are not type-safe, and merely reference a memory address.
Delegates are everywhere in the .NET Framework, and whenever you use an event, or respond to an event, you're using delegates.
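A rough C++ analogue of a delegate-backed event is a list of std::function objects with a fixed signature. This is a sketch of the idea only, with a hypothetical Event class; .NET delegates carry more machinery than this.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Minimal event: every handler must match the signature void(int),
// which is the kind of type safety a delegate type provides.
class Event {
public:
    void subscribe(std::function<void(int)> handler) {
        handlers_.push_back(std::move(handler));
    }

    // Invokes every subscribed handler in subscription order.
    void raise(int value) const {
        for (const auto& h : handlers_) {
            h(value);
        }
    }

private:
    std::vector<std::function<void(int)>> handlers_;
};
```

Trying to subscribe a callable with an incompatible signature fails at compile time, in contrast to a raw untyped pointer, which merely references a memory address.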
If you want to use C# rather than VB.NET, you can write code that is marked as "unsafe". This allows code within the unsafe block to run outside of the protection of the CLR. This, in turn, allows usage of "real" pointers, just like C, however, they still do have some limitations (such as what can be at the memory address that is pointed to).
Best way to do it is to just allocate everything manually:
You can move up OR down each Stack at free will without Pushing or Popping.
Dim Stack(4095) as Byte 'for 8bit - 1 bytes each entry
Dim Stack(4095) as Integer 'for 16bit - 2 bytes each entry
Dim Stack(4095) as Long 'for 32bit - 4 bytes each entry
Dim Stack(4095) as Double 'for 64 bit - 8 bytes each entry
Dim i as integer 'Where i is your Stack Pointer(0 through 4095)
Dim int as integer 'Byte Integer Long or Double (8, 16, 32, 64 bit)
for i = 0 to 4095
int = i
Stack(i) = int/256 'For 8bit Byte
Stack(i) = int 'For 16bit Integer
Stack(i) = Microsoft.VisualBasic.MKL$(int) 'For 32bit Long
Stack(i) = Microsoft.VisualBasic.MKD$(int) 'For 64bit Double
MsgBox(Microsoft.VisualBasic.HEX$(Stack(i))) 'To See Bitwise Length Per Entry
next i
If you're looking to pass something back from a subroutine you can pass it by reference, as in "ByRef myParameter As Object".
Best answer depends, to some degree, on what you're trying to do.
If you are using VB, the only thing that is really close to a pointer is an IntPtr. If you have access to C#, you can use unsafe C# code to do pointer work.