Why is the Microsoft CRT so permissive regarding a BSTR double free?

This is a simplified version of the question I asked here. I'm using VS2010 (CRT v100) and it never complains, in any way, when I double free a BSTR.
BSTR s1=SysAllocString(L"test");
SysFreeString(s1);
SysFreeString(s1);

Ok, the question is highly hypothetical (actually, the answer is :).
SysFreeString takes a BSTR, which is a pointer, which actually is a number which has a specific semantic. This means that you can provide any value as an argument to the function, not just a valid BSTR or a BSTR which was valid moments ago. In order for SysFreeString to recognize invalid values, it would need to know all the valid BSTRs and to check against all of them. You can imagine the price of that.
Besides, it is consistent with other C, C++, COM or Windows APIs: free, delete, CloseHandle, IUnknown::Release... all of them expect YOU to know whether the argument is eligible for releasing.
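To make the cost argument concrete, here is a minimal sketch of what such a check would involve. It is not how SysFreeString works; TrackingAllocator and its members are hypothetical names, and malloc/free stand in for the real allocation functions. Every live block has to be recorded, and every release has to consult that record.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <unordered_set>

// Hypothetical debug allocator (an assumption for illustration): it records
// every live block so that release() can reject pointers it did not hand out
// or has already released.
class TrackingAllocator {
public:
    void* allocate(std::size_t size) {
        void* p = std::malloc(size);
        if (p != nullptr) live_.insert(p);   // bookkeeping on every allocation
        return p;
    }

    // Returns false, without freeing, when p is not a currently live block.
    bool release(void* p) {
        if (live_.erase(p) == 0) return false;   // lookup on every release
        std::free(p);
        return true;
    }

    std::size_t live_count() const { return live_.size(); }

private:
    std::unordered_set<void*> live_;
};

// Allocates once and releases twice; only the first release is accepted.
inline bool second_release_is_rejected() {
    TrackingAllocator alloc;
    void* p = alloc.allocate(16);
    bool first = alloc.release(p);
    bool second = alloc.release(p);
    return first && !second;
}
```

The extra set costs memory and a lookup per call, and in a real multithreaded API it would also need locking, which is the price the answer refers to.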

In a nutshell, your question is: "I am calling SysFreeString with an invalid argument. Why does the compiler allow this?"
The Visual C++ compiler allows the call and does not issue a warning because the call itself is valid: the argument type matches, the API function exists, and the call can be converted to binary code that executes. The compiler has no knowledge of whether your argument is valid; you are responsible for tracking this yourself.
The API function, on the other hand, expects you to pass a valid argument. It might or might not check its validity. The documentation describes the argument as "The previously allocated string". So the value is okay for the first call, but afterward the pointer value is no longer a valid argument, and the behavior of the second call is basically undefined.

Nothing to do with the CRT, this is a winapi function. Which is C based, a language that has always given programmers enough lengths of rope to hang themselves by invoking UB with the slightest mistake. Fast and easy-to-port has forever been at odds with safe and secure.
SysFreeString() doesn't win any prizes, clearly it should have had a BOOL return type. But it can't, the IMalloc::Free() interface function was fumbled a long time ago. Nothing you can't fix yourself:
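BOOL SafeSysFreeString(BSTR* str) {
    if (str == NULL) {
        // No variable to free or reset: report the bad argument.
        SetLastError(ERROR_INVALID_PARAMETER);
        return FALSE;
    }
    SysFreeString(*str);  // SysFreeString accepts NULL, so a repeated call is harmless
    *str = NULL;          // clear the caller's pointer so it cannot be freed twice
    return TRUE;
}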
Don't hesitate to yell louder, RaiseException() gives a pretty good bang that is hard to ignore. But writing COM code in C is cruel and unusual punishment, outlawed by the Geneva Convention on Programmers Rights. Use the _bstr_t or CComBSTR C++ wrapper types instead.
But do watch out when you slice the BSTR out of them, they can't help when you don't or can't use them consistently. Which is how you got into trouble with that VARIANT. Always pay extra attention when you have to leave the safety of the wrapper, there are C sharks out there.
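The wrapper idea can be sketched in portable C++. This is not the implementation of _bstr_t or CComBSTR; OwnedString, raw_alloc, raw_free, and free_calls are hypothetical stand-ins, with malloc/free playing the role of SysAllocString/SysFreeString. The sketch shows the two points made above: the wrapper frees exactly once, and once the raw pointer is detached the wrapper can no longer protect you.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Counts calls that actually free memory, so a caller can check it ran once.
inline int& free_calls() { static int n = 0; return n; }

// Hypothetical stand-ins for an allocate/free pair; NOT the Windows functions.
inline char* raw_alloc(const char* text) {
    char* p = static_cast<char*>(std::malloc(std::strlen(text) + 1));
    if (p != nullptr) std::strcpy(p, text);
    return p;
}
inline void raw_free(char* p) {
    if (p != nullptr) { ++free_calls(); std::free(p); }
}

// Minimal owning wrapper (an illustrative assumption, not a library type).
class OwnedString {
public:
    explicit OwnedString(const char* text) : p_(raw_alloc(text)) {}
    ~OwnedString() { reset(); }
    OwnedString(const OwnedString&) = delete;
    OwnedString& operator=(const OwnedString&) = delete;
    OwnedString(OwnedString&& other) noexcept : p_(other.detach()) {}

    // Frees the string and nulls the pointer, so repeating the call is harmless.
    void reset() { raw_free(p_); p_ = nullptr; }

    // Hands ownership to the caller; from here on the wrapper cannot help.
    char* detach() { char* p = p_; p_ = nullptr; return p; }

    const char* get() const { return p_; }

private:
    char* p_;
};
```

Calling reset() twice, or letting the destructor run after reset(), frees the memory once. After detach(), freeing the raw pointer exactly once is the caller's job again.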

See this quote from MSDN:
Automation may cache the space allocated for BSTRs. This speeds up
the SysAllocString/SysFreeString sequence.
(...)if the application allocates a BSTR and frees it, the free block
of memory is put into the BSTR cache by Automation(...)
This may explain why calling SysFreeString(...) twice with the same pointer does not produce a crash, since the memory is still available (kind of).

Related

Can the delete operator be used instead of the Marshal.FreeHGlobal method to free memory from a wchar_t*?

I read the official documentation for the StringToHGlobalUni method and it says that the FreeHGlobal method should be used to free the memory.
The memory allocated by StringToHGlobalUni method is on the native heap, so I don't see why the delete operator could not be used.
I searched a lot, but I couldn't find any explanation why I can't use the delete operator.
I'm new to this and some explanations would help me. Can the delete operator be used or not ?
Code
const wchar_t* filePath = (const wchar_t*)(Marshal::StringToHGlobalUni(inputFilePath)).ToPointer();
Marshal::FreeHGlobal(IntPtr((void*)filePath));
Don't use Marshal::StringToHGlobalUni in C++/CLI.
Use either
PtrToStringChars (accesses Unicode characters in-place, no allocation) or
marshal_as<std::wstring> (manages the allocation with a smart pointer class that will free it correctly and automatically)
Apart from the fact that StringToHGlobalUni requires you to free the memory manually, the name of that function is completely misleading. It has absolutely no connection to HGLOBAL whatsoever.

Fortran Functions with a pointer result in a normal assignment

After some discussion on the question found here Correct execution of Final routine in Fortran
I thought it would be useful to know when a function with a pointer result is appropriate to use with a normal or a pointer assignment. For example, given this simple function
function pointer_result(this)
  implicit none
  type(test_type), intent(in), pointer :: this
  type(test_type), pointer :: pointer_result
  allocate(pointer_result)
end function
I would normally do test=>pointer_result(test), where test has been declared with the pointer attribute. While the normal assignment test=pointer_result(test) is legal it means something different.
What does the normal assignment imply compared to the pointer assignment?
When does it make sense to use one or the other assignment?
A normal assignment
test = pointer_result()
means that the value of the current target of test will be overwritten by the value pointed to by the resulting pointer. If test points to some invalid address (is undefined or null) the program will crash or produce undefined results. The anonymous target allocated by the function will have no pointer to it any more and the memory will be leaked.
There is hardly any legitimate use for this, but it is likely to happen when one makes a typo and writes = instead of =>. It is a very easy one to make and several style guides recommend to never use pointer functions.
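The difference can be illustrated with a rough C++ analogy, offered only as a sketch: C++ pointers are not Fortran pointers, and Tracked, rebind_demo, and copy_demo are hypothetical names. Rebinding the pointer corresponds to =>, while copying through the pointer corresponds to =.

```cpp
#include <cassert>

// Counts live objects so the sketch can show an allocation left outstanding.
struct Tracked {
    static int& live() { static int n = 0; return n; }
    int value;
    explicit Tracked(int v) : value(v) { ++live(); }
    Tracked(const Tracked& o) : value(o.value) { ++live(); }
    Tracked& operator=(const Tracked&) = default;
    ~Tracked() { --live(); }
};

// Plays the role of the Fortran pointer function: allocates an anonymous target.
inline Tracked* pointer_result() { return new Tracked(42); }

// Analogy for `test => pointer_result()`: test is rebound to the new target,
// which stays reachable and can be released later.
inline int rebind_demo() {
    Tracked* test = pointer_result();
    int value = test->value;
    delete test;
    return value;
}

// Analogy for `test = pointer_result()`: the value is copied into the target
// that test already points to. Returns how many extra objects are outstanding
// after the copy, not counting the existing target.
inline int copy_demo(int& copied_value) {
    int start = Tracked::live();
    Tracked target(0);
    Tracked* test = &target;
    Tracked* anonymous = pointer_result();  // Fortran never names this address
    *test = *anonymous;                     // copies the value only
    copied_value = test->value;
    int outstanding = Tracked::live() - start - 1;
    delete anonymous;  // possible only because this sketch kept the address
    return outstanding;
}
```

In the Fortran case there is no counterpart to the anonymous variable, so the one outstanding object reported by copy_demo is exactly the leaked allocation described above.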

What's the Matlab equivalent of NULL, when it's calling COM/ActiveX methods?

I maintain a program which can be automated via COM. Generally customers use VBS to do their scripting, but we have a couple of customers who use Matlab's ActiveX support and are having trouble calling COM object methods with a NULL parameter.
They've asked how they do this in Matlab - and I've been scouring Mathworks' COM/ActiveX documentation for a day or so now and can't figure it out.
Their example code might look something like this:
function do_something()
OurAppInstance = actxserver('Foo.Application');
OurAppInstance.Method('Hello', NULL)
end
where NULL stands for what we'd write in another language as NULL, nil, or Nothing (or, of course, where we'd pass in an object). The problem is that this argument is optional (and these are implemented as optional parameters in most, but not all, cases), so these methods expect to get NULL quite often.
They tell me they've tried [] (which from my reading seemed the most likely) as well as '', Nothing, 'Nothing', None, Null, and 0. I have no idea how many of those are even valid Matlab keywords - certainly none work in this case.
Can anyone help? What's Matlab's syntax for a null pointer / object for use as a COM method parameter?
Update: Thanks for all the replies so far! Unfortunately, none of the answers seem to work, not even libpointer. The error is the same in all cases:
Error: Type mismatch, argument 2
This parameter in the COM type library is described in RIDL as:
HRESULT _stdcall OurMethod([in] BSTR strParamOne, [in, optional] OurCoClass* oParamTwo, [out, retval] VARIANT_BOOL* bResult);
The coclass in question implements a single interface descending from IDispatch.
I'm answering my own question here, after talking to Matlab tech support: There is no equivalent of Nothing, and Matlab does not support this.
In detail: Matlab does support optional arguments, but does not support passing in variant NULL pointers (actually, to follow exactly how VB's Nothing works, a VT_EMPTY variant, I think) whether as an optional argument or not. There is documentation about some null / pointerish types, a lot of which is mentioned in my question or in various answers, but these don't seem to be useable with their COM support.
I was given a workaround by Matlab support using a COM DLL they created and Excel to create a dummy nothing object that could be passed around in scripts. I haven't managed to get this workaround / hack working, and even if I had unfortunately I probably could not redistribute it. However, if you encounter the same problem this description might give you a starting point at least!
Edit
It is possible this Old New Thing blog post may be related. (I no longer work with access to the problematic source code, or access to Matlab, to refresh my memory or to test.)
Briefly, for IUnknown (or derived) parameters, you need a [unique] attribute for them to legally be NULL. The above declaration required Matlab to create or pass in a VT_EMPTY variant, which it couldn't do. Perhaps adding [unique] would have prompted the Matlab engine to pass in a NULL pointer (or a variant containing a NULL pointer) instead, assuming it was able to do that, which is guesswork.
This is all speculation since this code and the intricacies of it are several years behind me at this point. However, I hope it helps any future reader.
From the mathworks documentation, you can use the libpointer function:
p = libpointer;
and then p will be a NULL pointer. See that page for more details.
See also: more information about libpointer.
Peter's answer should work, but something you might want to try is NaN, which is what Matlab usually uses as a NULL value.
In addition to using [] and libpointer (as suggested by Peter), you can also try {}.
The correct answer for something in VB that is expecting a Nothing argument, is to somehow get a COM/ActiveX Variant which has a variant type of VT_EMPTY. (see MSDN docs which reference marshaling behavior for Visual Basic Nothing)
MATLAB may do this with the empty array ([]), but I'm not sure.... so it may not be possible purely in MATLAB. Although someone could easily write a tiny COM library whose purpose is to create a Variant with VT_EMPTY.
But if the argument has the [optional] attribute, and you want to leave that optional argument blank, you should not do this. See the COM/ActiveX docs on Variants, which say under VT_EMPTY:
VT_EMPTY: No value was specified. If an optional argument to an Automation method is left blank, do not pass a VARIANT of type VT_EMPTY. Instead, pass a VARIANT of type VT_ERROR with a value of DISP_E_PARAMNOTFOUND.
Matlab should (but probably does not) provide methods to create these objects (a "nothing" and an "optional blank") so you can interface correctly with COM objects.

How should I check that [out] params in COM can be used?

Officially, one should not use [out] parameters from COM functions unless the function succeeded. This means that there are (at least) three ways to see if an [out] parameter can be used.
Consider the following interface
interface IFoo : IUnknown {
HRESULT GetOtherFoo([out] IFoo** ppFoo);
HRESULT Bar();
};
Which of the following ways would you recommend on using it?
1. Check return value
CComPtr<IFoo> other;
HRESULT hr = foo->GetOtherFoo(&other);
if (SUCCEEDED(hr))
other->Bar();
This makes me a bit nervous since a bug in IFoo could cause a NULL pointer dereferencing.
2. Check the output parameter
This depends on the fact that if a method fails it mustn't change any of the [out] parameters (if the parameter changed <==> it's safe to use it).
CComPtr<IFoo> other;
foo->GetOtherFoo(&other);
if (other)
other->Bar();
Note that this sort of happens anyway, CComPtr's destructor will call Release if the pointer isn't NULL so it can't be garbage.
3. The paranoid way, check both
CComPtr<IFoo> other;
HRESULT hr = foo->GetOtherFoo(&other);
if (SUCCEEDED(hr) && other)
other->Bar();
This is a bit verbose in my opinion.
P.S. See related question.
If you are willing to write more checks and make the code a bit slower in exchange for reliability, option 3 is for you. Since you expect that there may be bugs in the COM server, it is quite reasonable to check for them.
COM server methods that return a success HRESULT, yet set some of their output parameters to NULL are not very common. There are a few cases (IClientSecurity::QueryBlanket comes to mind) where this is used, but usually the client may expect all output parameters to be non-NULL if the method returned successfully.
It is, after all, a matter of how the method is documented. In the default case, however, I would consider 1. to be a safe way to go.
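The three server behaviours discussed here can be compared in a small portable sketch. Nothing in it is real COM: HResult, kOk, kFail, succeeded, Foo, and the get_* functions are hypothetical stand-ins for HRESULT, S_OK, a failure code, SUCCEEDED, IFoo, and GetOtherFoo. It shows option 3 guarding against a server that reports success but leaves the output NULL.

```cpp
#include <cassert>
#include <cstdint>

using HResult = std::int32_t;   // stand-in for HRESULT
constexpr HResult kOk = 0;      // stand-in for S_OK
constexpr HResult kFail = -1;   // stand-in for some failure code
inline bool succeeded(HResult hr) { return hr >= 0; }

// Stand-in for the IFoo interface from the question.
struct Foo {
    int bar_calls = 0;
    void Bar() { ++bar_calls; }
};

// Hypothetical servers: well-behaved, failing, and buggy (success plus NULL).
inline HResult get_ok(Foo* source, Foo** out) { *out = source;  return kOk; }
inline HResult get_fail(Foo*, Foo** out)      { *out = nullptr; return kFail; }
inline HResult get_buggy(Foo*, Foo** out)     { *out = nullptr; return kOk; }

using Getter = HResult (*)(Foo*, Foo**);

// Option 3: call Bar only if the call succeeded AND produced a pointer.
// Returns true when Bar was actually called.
inline bool call_bar_checked(Getter getter, Foo* source) {
    Foo* other = nullptr;
    HResult hr = getter(source, &other);
    if (succeeded(hr) && other != nullptr) {
        other->Bar();
        return true;
    }
    return false;
}
```

With option 1 alone, the get_buggy case would dereference a null pointer; with option 3 it is simply skipped, at the cost of one extra comparison per call.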

const vs enum in D

Check out this quote from here, towards the bottom of the page. (I believe the quoted comment about consts applies to invariants as well.)
Enumerations differ from consts in that they do not consume any space
in the final outputted object/library/executable, whereas consts do.
So apparently value1 will bloat the executable, while value2 is treated as a literal and doesn't appear in the object file.
const int value1 = 0xBAD;
enum int value2 = 42;
Back in C++ I always assumed this was for legacy reasons, and old compilers that couldn't optimize away constants. But if this is still true in D, there must be a deeper reason behind this. Anyone know why?
Just like in C++, an enum in D seems to be a "conserved integer literal" (edit: amazing, D2 even supports floats and strings). Its enumerators have no location. They are just immaterial as values without identity.
Using enum like this is new in D2. It introduces a new name, but that name is not an lvalue (so you also cannot take its address). A declaration such as
enum int a = 10; // new in D2
is like
enum : int { a = 10 }
if I can trust my poor D knowledge. So a here is not an lvalue (it has no location and you can't take its address). A const, however, has an address. If you have a global (not sure whether this is the right D terminology) const variable, the compiler usually can't optimize it away, because it doesn't know which modules can access that variable or take its address. So it has to allocate storage for it.
I think if you have a local const, the compiler can still optimize it away just as in C++, because the compiler knows by looking at its scope whether or not anyone is interested in its address or whether everyone just takes its value.
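Since the answer leans on the C++ analogy, here is a small C++ sketch of the same distinction (it says nothing about what a D compiler actually emits; the helper function names are made up for illustration). The const is an object whose address can be taken, while the enumerator is a pure value.

```cpp
#include <cassert>

const int value1 = 0xBAD;      // a const object: it can have an address
enum : int { value2 = 42 };    // an enumerator: a value only, no storage

// Taking the address obliges the compiler to give value1 a real location.
inline const int* address_of_value1() { return &value1; }

// value2 can only be used by value; writing `&value2` would not compile.
inline int doubled_value2() { return value2 * 2; }
```

As long as nobody calls something like address_of_value1, an optimizer is free to fold value1 into its uses, but only the enumerator gives that as a language-level guarantee.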
Your actual question; why enum/const is the same in D as in C++; seems to be unanswered. Sadly there exists no good reason for this choice whatsoever. I believe that this was just an unintentional side effect in C++ that became a de facto pattern. In D the same pattern was needed, and Walter Bright decided that it should be done as in C++ such that those coming from that place would recognize what to do ... In fact, before this rather IMHO silly decision, the keyword manifest was used instead of enum for this usecase.
I think a good compiler/linker should still remove the constant. It's just that with the enum, it's actually guaranteed in the spec. The difference is primarily a matter of semantics. (Also keep in mind that 2.0 isn't complete yet)
The real purpose of enum being expanded syntactically to support single manifest constants, from what I understand, is that Don Clugston, a D template guru, was doing some crazy stuff with templates. He kept running into long build times, ridiculous compiler memory usage, etc. because the compiler kept creating internal data structures for const variables. One key thing about const/immutable variables compared to enums is that const/immutable variables are lvalues and can have their address taken. This means there is some extra overhead for the compiler. This usually doesn't matter, but when you're executing really complicated compile-time metaprograms, even if const variables are optimized away, this is still significant overhead at compile time.
It sounds like the enum value will be used "inline" in expressions, whereas the const will actually take storage, and any expression referencing it will load the value from that storage.
This sounds similar to the difference between const and readonly in C#. The former is a compile-time constant and the latter is a run-time constant. This definitely affects versioning of assemblies (since assemblies referencing a const receive a copy of the value at compile time and do not pick up a change to the value if the referenced assembly is rebuilt with a different value).