Dump the interface exposed by a COM object - com

I want to find a tool that can show all the interfaces, including the methods, properties and events, exposed by a COM (or ActiveX) component. Is such a tool available?

It's not actually possible to build such a tool for ANY COM object, though you might have some luck with specific objects. If a type library is available then you could use OLEView, or you can programmatically open and traverse the type library itself. Bear in mind that the contents of the type library are just what the developer wanted to include in it; there's nothing to stop objects implementing more interfaces than their type libraries say they do.
For objects without type libraries it's impossible to produce a general purpose tool:
Given the way that QueryInterface works you would have to ask the object under investigation if it supports every interface possible. Where would such a tool obtain a list of all possible interfaces that the object in question could support? Whilst it's true that some interfaces are registered in the registry due to proxy requirements not all interfaces are and it's by no means a requirement that they should be.
Once you know that an object supports a given interface, how do you work out what methods that interface supports? If the interface derives from IDispatch then this is possible, as that's the purpose of IDispatch, but for interfaces derived from IUnknown there is no way to programmatically discover things about the interface.
You also have the added problem that some objects may have additional interfaces implemented for them by the proxy layer, for example, if an interface has been proxied then you will also be able to QueryInterface from it to IProxyManager though the object itself does not implement this interface (it's part of the proxy).
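The QueryInterface limitation described above can be sketched in plain C++. This is a simplified stand-in, not real COM: OpaqueObject, ProbeInterfaces and the use of interface names instead of GUIDs are all assumptions made for illustration. The point is that the probing tool can only confirm interfaces that are already on its candidate list.

```cpp
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-in for a COM object. Like IUnknown::QueryInterface, it
// only answers "yes" or "no" for one specific interface ID at a time and
// offers no way to enumerate what it supports.
class OpaqueObject {
public:
    explicit OpaqueObject(std::unordered_set<std::string> supported)
        : supported_(std::move(supported)) {}

    // Simplification: real COM uses 128-bit IIDs and returns an HRESULT.
    bool QueryInterface(const std::string& iid) const {
        return supported_.count(iid) != 0;
    }

private:
    std::unordered_set<std::string> supported_;  // hidden from callers
};

// Ask the object about every interface ID the tool already knows about.
// Anything missing from `candidates` can never be discovered this way.
std::vector<std::string> ProbeInterfaces(
        const OpaqueObject& object,
        const std::vector<std::string>& candidates) {
    std::vector<std::string> found;
    for (const std::string& iid : candidates) {
        if (object.QueryInterface(iid)) {
            found.push_back(iid);
        }
    }
    return found;
}
```

If an object supports IUnknown, IDispatch and some private interface, and the tool's candidate list is IUnknown, IDispatch and IPersist, the probe reports only IUnknown and IDispatch. The private interface stays invisible, which is why the quality of such a tool depends entirely on where its candidate list comes from.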

If the component has a typelib (in resources or shipped separately) you can use OLE View that comes with Visual Studio. You should use "View Typelib", not "Bind to File" there.

Related

Why do we use only [List, Map, Set] collections in Kotlin?

I've been learning Kotlin and I've come across the Collections API. Before Kotlin I'd been learning Java, and I know that in Java there are a lot of different collection types. For example, instead of the general List, Map, Queue, Set we use ArrayList, HashMap, LinkedList, LinkedHashMap, etc. In Kotlin, though, we mostly use general types like Map, List, Set, but we can also use HashMap and so on. So, what's going on there? Can you help me figure it out?
While Kotlin's original and primary target is the JVM, there is a huge push by JetBrains to make it multiplatform, and support JS and Native as well.
If you're using Kotlin on the JVM, the implementations of any collections you're using will still be the original JDK classes, e.g. java.util.ArrayList or java.util.HashSet. These are not reimplemented by the Kotlin standard library, which has some great benefits:
These are well-tested implementations, which are maintained anyway.
Using the exact same classes makes interop with Java a breeze, as you can pass them back and forth without having to perform conversions or mapping of any kind.
What Kotlin does do is introduce its own collection semantics over these existing implementations, in the form of the standard library interfaces such as List, Map, MutableList, MutableMap and so on. A small bit of compiler magic makes it so that these interfaces are implemented by the existing JDK classes as well.
If you don't need a specific implementation of a certain type of collection, you can use your collections via these interfaces plus the respective factory methods of the standard library (listOf, mapOf, mutableListOf, mutableMapOf, etc.). This keeps your code more generic, and independent of the concrete underlying implementations. You don't know what specific class the standard library mutableListOf function will create for you, only that it will be an object that satisfies the contract of the MutableList interface.
You should basically use these interfaces by default in your code, especially in public API:
In the case of function parameters, this lets clients provide the function with whatever implementation of the collection they wish to give you. If your function can operate on anything that's a List, you should ask for just that interface - no reason to require an ArrayList or LinkedList specifically.
If this is a return type, using these interfaces lets you change the specific implementation that you create internally in the future, without breaking client code. You can promise to just return a MutableList of things, and what implementation backs that list is not exposed to your clients.
If you look at all the collection handling functions of the Kotlin standard library, you'll see that on the surface, they almost exclusively operate on these interfaces. If you dig down deep enough, you'll find ArrayList instances being created, but this is not exposed to the client code, as it doesn't have to care about the concrete implementation most of the time.
Going back to the multiplatform point once more, if you write your code in a way such that it only relies on Kotlin standard library defined types, that code will be easily usable for non-JVM targets. If you reference kotlin.collections.MutableList in your code, that can immediately compile to JS code, because there's a Kotlin standard library implementation of that interface on each platform. Whether that maps to an existing class directly, wraps an existing class somehow, or is implemented for Kotlin from scratch, again, doesn't have to concern you. But if you refer to java.util.TreeSet in your code, that won't fly for the JS target, as the Java platform classes are not available there.
Can you still use classes such as java.util.ArrayList directly? Of course.
If you don't see your code going multiplatform at some point, using Java collections directly is perfectly okay.
If you need a specific implementation for a List or a Set for performance reasons, sometimes you'll have to use the Java classes directly.
Interestingly, in recent releases of Kotlin, these specific types of implementations (such as an array based list) are wrapped under standard library typealiases too, so that they're platform independent by default: see kotlin.collections.ArrayList or kotlin.collections.HashSet for examples of this. These Kotlin-defined types will usually show up first in IntelliJ completion, so you'll find yourself being pushed towards using them wherever possible. Same thing goes for most exceptions, e.g. IllegalArgumentException.
TL;DR: You can use either Kotlin collection types or Java types in Kotlin, but you should probably do the former whenever you can.

Why does TypeName() return different results from .GetType and TypeOf when working with COM?

I feel like I would benefit greatly from understanding the differences in how these functions work so that I could better understand when to use each one.
I'm having a very difficult time working with two different interops (Excel and EPDM), both of which make extensive use of weakly typed parameters. I keep running into problems using returned objects and casting them to the proper type (per the documentation). After wasting a ton of time, I've found that using TypeName, GetType, and a TypeOf operator with COM objects can yield different results, and in different circumstances each one can be more or less reliable than the next.
Now, in most cases TypeName() seems to be the most reliable for determining type with COM objects. However, avoiding the other two functions entirely seems quite cargo cultish to me, and besides that today I ran into an interesting problem where I can't seem to cast an object to the type reported by TypeName(). An interesting notion was brought up in the comments on that problem that objects which implement IDispatch may actually return the dispatched interface typename, which could partially explain the differences.
I'd really like to better understand how these functions actually work, but I get kind of lost running through the .NET ReferenceSource, so I'm offering a bounty on this question in hopes someone can explain how these different functions work and in what context each should be used.
Here is a code excerpt from working with the Excel interop.
Dim DocProps As Object
DocProps = WeeklyReports.CustomDocumentProperties 'WeeklyReports is a Workbook object
Debug.Print(DocProps Is Nothing)
Debug.Print(TypeName(DocProps))
Debug.Print(TypeOf (DocProps) Is DocumentProperties)
Debug.Print(DocProps.GetType.ToString)
The output is:
False
DocumentProperties
False
System.__ComObject
It is a long story and a bit doubtful that English is going to cut it. It does require understanding how COM works and how it was integrated into the Office products.
At breakneck speed, COM is very heavily an interface-based programming paradigm at its core. Interfaces are easy, classes are hard. Something you see back in the .NET design as well, a class can derive from only one single base class but can implement any number of interfaces. To make language interop work smoothly, it is important to take as few dependencies on language implementation details as possible.
There is a lot that COM does not do that you'd be used to in any modern language. It does not support exceptions, only error codes. No notion of generics at all. No Reflection. No support for method overloads. No support for implementation inheritance whatsoever, the notion of a class is completely hidden. It only appears as a number, the CLSID, a guid that identifies a class type. With a factory function implemented in the COM component that creates an object of the class. The COM component retains ownership of that object. The client code then only ever uses interfaces to make calls to use methods and get or set properties. CoCreateInstance() is the primary runtime support function that does this.
This was further whittled down to a subset called OLE Automation, the flavor that you use when you interop with Office. It strictly limits the kind of types you can use for properties and method arguments with prescribed ways to deal with the difficult ones like strings and arrays. It does add some capabilities, it supports late binding through the IDispatch interface, important to scripting languages. And VARIANTs, a data type that can store a value or object reference of an arbitrary type. And supports type libraries, a machine-readable description of the interfaces implemented by the COM server. .NET metadata is the exact analogue.
And important to this question, it limits the number of interfaces that a class can implement to just one. Important to languages that don't support the notion of interfaces at all, like VBA, scripting languages like JavaScript and VBScript and early Visual Basic versions. The Office interop object model was very much designed with these limitations in mind.
So from the point of view of a programmer that uses such a language to automate an Office program, it is completely invisible that his language runtime is actually using interfaces. All he ever sees and uses in his program are identifiers that look like class names, not interface names. That DocumentProperties is actually an interface name is something you can see in Object Browser. Just type the name in the search box, it properly annotates "public interface DocumentProperties / Member of Microsoft.Office.Core" in the lower-right panel.
One specific detail of the Office object model matters a great deal here, many properties and method return types are VARIANTs. An OLE Automation type that can store an arbitrary value or object reference, it is mapped to System.Object when you use .NET. The Workbook.CustomDocumentProperties property is like that. Even though the property is documented to actually return a DocumentProperties interface reference. They probably did this to leave elbow room to some day return another kind of interface. Fairly necessary for "custom document properties".
That the property is a VARIANT doesn't matter that much in languages that support dynamic typing, they take them in stride. It is however pretty painful in a strongly typed language. And pretty unfriendly to programming editors that support auto-completion, like VS's IntelliSense. What you normally do is declare your variable to the expected interface type:
Dim DocProps As DocumentProperties
DocProps = CType(WeeklyReports.CustomDocumentProperties, Microsoft.Office.Core.DocumentProperties)
And now everything lights up. You don't need the CType() cast either if you favor programming VB.NET with Option Strict Off in effect. Which turns it into a programming language that supports dynamic typing well.
We're getting there. As long as you declare DocProps as Object, the compiler doesn't know beans about the interface. Nor does the debugger, it isn't helped by the variable declaration and can only see that it is a System.__ComObject from the runtime type. So it isn't Nothing, that's easy enough to understand, the property getter did not fail and the document has properties.
The TypeName() function uses a feature of the IDispatch interface, it exposes type information at runtime. That happens to work in your case; it usually doesn't. The function first calls IDispatch::GetTypeInfo() to get an ITypeInfo interface reference, then calls ITypeInfo::GetDocumentation(). That works here, you get the interface name back. Otherwise pretty comparable to Reflection in .NET, just not nearly as powerful. Do not rely on it heavily, there are lots of COM components that don't implement this.
And crucial to your question, TypeOf (DocProps) Is DocumentProperties is a fail whale. Something you'll discover when you try to write the code I proposed earlier. You'll get a nasty runtime exception, System.InvalidCastException:
{"Unable to cast COM object of type 'System.__ComObject' to interface type 'Microsoft.Office.Core.DocumentProperties'. This operation failed because the QueryInterface call on the COM component for the interface with IID '{2DF8D04D-5BFA-101B-BDE5-00AA0044DE52}' failed due to the following error: No such interface supported (Exception from HRESULT: 0x80004002 (E_NOINTERFACE))."}
In other words, the Excel documentation is lying to you. You get an interface back that resembles DocumentProperties, it still has the members that this interface documents, but is no longer identical to the Microsoft.Office.Core.DocumentProperties. It probably once was, many moons ago. A nasty little detail that's buried inside this KB article:
Note The DocumentProperties and the DocumentProperty interfaces are late bound interfaces. To use these interfaces, you must treat them like you would an IDispatch interface.

Can an interface be retroactively implemented in VB.NET?

Is it possible to define an interface (e.g. MyClass Implements MyInterface) whose method/property definitions already match some of the methods/properties defined on a third-party (or a native) class?
For example, the DataRow class has a number of properties/methods that make it "row-like". What if I want to implement an interface (i.e. IRowLike) that defines certain methods and properties already existing on the native DataRow class (which I cannot directly touch or extend). I simply want the class to agree at runtime that it does indeed abide by some interface.
Interfaces afford a poor-man's version of "duck typing". Once I have a set of classes that all abide by a given interface, I can define extension methods against that interface and all classes that support the interface immediately gain new behavior. I know it may seem odd to want to retroactively apply an interface against third-party classes, but it would definitely allow us to do more with less code.
This isn't possible in .NET. A type defines the interfaces that it implements in metadata at compile time and its definition isn't alterable at runtime. It is possible to generate types at runtime which implement specific interfaces, but not to alter an existing type.
There are some alternatives though. In VB.NET you could simply choose to use late binding on the type and access the interface methods in that manner (or dynamic in C#). The downside of course is that the code isn't statically verifiable.

How can I stop someone from calling my COM interfaces APIs?

I have a COM inproc DLL that we are using in our product.
Now if someone finds out which interface and APIs we have exposed from the DLL then those APIs can be called easily.
Is there a way to stop unknown applications from calling my APIs?
Can we add some signature in COM?
The formal way of controlling use of your object is by implementing IClassFactory2 on the class factory that creates your COM objects.
Here's a link at MSDN explaining the interface.
IClassFactory2 at MSDN
The benefit of creating an implementation is that nobody can fetch an instance without clearing the hurdles of registration through IClassFactory2.
The downside is that you'll have to inspect all the locations where you are creating an object, to make sure that they haven't broken. Creating instances becomes more burdensome, although some languages already have facilities to make the process less painful (ex. VB6).
If you are trying to protect an object that has a lot of instantiation activity, you might want to go with Mastermind's method of adding a key parameter, or add an unlock method of some sort to your interfaces that must be called correctly before the component behind it can be used.
You could make your interfaces inherit directly from IUnknown (without IDispatch) and not include the type library in the DLL. This way only those who have access to the type library will be able to find out what interfaces are supported, and the only other way to discover the interfaces will be to just guess. If you go this way you might also wish to minimize the number of classes exposed in the registry (those that can be created with CoCreateInstance()) and use a set of factory methods of some dedicated registry-exposed class instead.
This implies that only vtable early-binding will work with your component. You will also be unable to use default call marshaling with this component (since no type library is included). And this is not real protection, just a way to hide things.
Nothing prevents you from adding a "key" parameter to the methods which will just return if the key is wrong.
Very simple but will do for starters.
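A minimal sketch of the "key" parameter idea, in plain C++ rather than real COM. The names ProtectedComponent, DoWork and Result, and the use of a 64-bit integer as the key, are assumptions made for illustration; a real component would return an HRESULT from an interface method.

```cpp
#include <cstdint>
#include <string>

// Hypothetical result codes, loosely modeled on HRESULT-style returns.
enum class Result { Ok, AccessDenied };

// Hypothetical component whose methods refuse to do anything unless the
// caller passes the expected key.
class ProtectedComponent {
public:
    explicit ProtectedComponent(std::uint64_t expected_key)
        : expected_key_(expected_key) {}

    // Returns early, leaving `output` untouched, when the key is wrong.
    Result DoWork(std::uint64_t key, std::string& output) const {
        if (key != expected_key_) {
            return Result::AccessDenied;
        }
        output = "work done";
        return Result::Ok;
    }

private:
    std::uint64_t expected_key_;
};
```

A caller that passes the wrong key gets Result::AccessDenied and no output. As with the other suggestions here, this only raises the bar: the key is compiled into every legitimate client and can be recovered with a debugger.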
Other than some sort of 'key' param, you can't prevent the curious from discovering your function and then calling it. All it takes is a debugger and some patience. To be totally secure you'd have to require some sort of certificate that authorized code could obtain but all others couldn't, but that would mean your code would have to be able to verify the certificate.

Best method of calling managed code(c#) from unmanaged C++

We have developed a s/w architecture consisting of a set of objects developed in C#. They make extensive use of events to notify the client of changes in status, etc.
The intention originally was to allow legacy code to use these managed objects through COM interop services. That was easy to write in the design specification but, I'm seeing that actually implementing it is more problematic. I've searched for many hours looking for a good sample of event handling using this method. Before we march down that path, I want to make sure that COM interop is the best way to allow legacy code to call our new code.
It appears there are several different options: 1) COM interop, 2) write unmanaged wrapper classes 3) use the /clr compiler switch to enable calling of managed objects, 4) use some sort of reverse pInvoke call. Am I missing any?
Each option will have its benefits & drawbacks and I'm wondering what the best way to go is. Here are specific questions/comments for each
COM INTEROP - It appears event handling is a hurdle. We use events that have variable types as parameters. An event parameter may have an event ID and an object. Based on the event ID, the object will be of a certain type. Can this be handled with COM interop? Many of the objects that are exposed have properties. You can't declare properties in an interface so all properties will need a corresponding get/set method.
WRITE UNMANAGED WRAPPER - I assume this means creating a DLL using the /clr option to allow creating and calling managed objects and exposing unmanaged objects. Would the client of these unmanaged. I haven't done this before. What are benefits/drawbacks of this?
USE THE /CLR SWITCH - I understand this means to add support for managed objects. What are the drawbacks of this approach? Does this option support events as described above? Can we say, "here's the managed library. Use the /clr compiler option with your legacy code and have at it?" I don't know the ramifications of this. Is there a good sample of how this works around? (I'm sure there is, I just haven't found it)
USE A REVERSE PINVOKE - I'm not sure exactly how this would work but, from what I've been able to find, this is not a likely valid solution.
So, what does the decision tree look like to find the correct direction? Any help is appreciated.
DP
I think your initial solution is the best one. COM interop is stable and reasonably well documented. All you need to do is ensure that all the different event objects that might pop out of a given event handler implement the same COM-visible base event object interface (that has the event type id, etc). From there, individual objects can implement whatever other interfaces they want, and your unmanaged code can QI for the right "detail" interface based on whatever criteria you want to define. It's really not that hard. Have a look at this CodeProject article for an end-to-end sample including unmanaged event handlers.
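The shape of that design can be sketched in plain C++, with dynamic_cast standing in for QueryInterface. Everything named here (IEventBase, IStatusDetail, StatusChangedEvent, OtherEvent, kStatusChanged, Describe) is hypothetical; in the real setup the interfaces would be COM-visible interfaces declared in C# and the unmanaged client would call QueryInterface on the event object.

```cpp
#include <string>
#include <utility>

// Hypothetical event ID for illustration.
constexpr int kStatusChanged = 1;

// Common base interface that every event object implements.
struct IEventBase {
    virtual ~IEventBase() = default;
    virtual int EventId() const = 0;
};

// "Detail" interface implemented only by status-change events.
struct IStatusDetail {
    virtual ~IStatusDetail() = default;
    virtual std::string Status() const = 0;
};

class StatusChangedEvent : public IEventBase, public IStatusDetail {
public:
    explicit StatusChangedEvent(std::string status) : status_(std::move(status)) {}
    int EventId() const override { return kStatusChanged; }
    std::string Status() const override { return status_; }

private:
    std::string status_;
};

// An event type the handler below does not know about.
class OtherEvent : public IEventBase {
public:
    int EventId() const override { return 99; }
};

// The handler receives only the base interface, checks the event ID, then
// asks for the matching detail interface (QueryInterface in real COM).
std::string Describe(const IEventBase& event) {
    if (event.EventId() == kStatusChanged) {
        if (const auto* detail = dynamic_cast<const IStatusDetail*>(&event)) {
            return "status: " + detail->Status();
        }
    }
    return "unhandled event";
}
```

The handler signature never changes when new event types are added; only the dispatch on the event ID grows, and each event type carries its own detail interface.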