VB.NET/COM Server code way slower than Excel VBA code


Background
I have a client who needs Excel VBA code that produces formula values moved to VB.NET. He is in the business of providing financial analytics, in this case delivered as an Excel add-in. I have translated the VBA into VB.NET code that runs in a separate DLL. The DLL is compiled as a COM Server because, well, Excel-callable .NET UDFs have to be. So far, so good: Excel cells have "=foo(Range1, Range2, ...)", the VB.NET Com Server's UDF is called, and the cell obtains a value that matches the VBA code's value.
The problem
The VB.NET code is way slower. I can stretch a range of VBA-based formulas and get instantaneous calculation. I can stretch a comparable range of VB.NET-based formulas and the calculation takes 5-10 seconds. It is visibly slower and unacceptable to the client.
There are a few possibilities that occur to me:
1. native compilation of VBA is faster because of the absence of a switch
2. the DLL may be loaded and unloaded for each UDF call
3. the DLL calls Excel WorksheetFunction methods and requires an Application object, and creating the Application object is expensive
4. calling an Excel WorksheetFunction method from the DLL is expensive
I don't think that (2) is true because I put calls to append to a file in the Shared New, the Public New, and Finalize functions, and all I get are:
Shared Sub New
Public Sub New
Finalize
when I open the spreadsheet, repeatedly stretch a formula range, and close the spreadsheet.
I don't think (3) is true because the file writing shows that the Application object is created only once.
The question
How do I figure out what is taking the time? How to profile in this environment? Are there obvious enhancements?
In the last category, I have tried to reduce the number of times an Application object (used for WorksheetFunction calls) is created by making it Shared:
<Guid("1ECB17BB-444F-4a26-BC3B-B1D6F07D670E")> _
<ClassInterface(ClassInterfaceType.AutoDual)> _
<ComVisible(True)> _
<ProgId("Library.Class")> _
Public Class MyClass
Private Shared Appp As Application ' Very annoying
Approaches taken
I've tried to reduce the dependence on Excel mathematical functions by rewriting my own. I've replaced Min, Max, Average, Stdev, Small, Percentile, Skew, Kurtosis, and a few more. My UDF code calls out to Excel much less. The unavoidable call seems to be taking a Range as an argument and converting that to a .NET Array for internal use.
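As an illustration of what such a replacement can look like, here is a minimal sketch of a sample standard deviation computed entirely in managed code, so no call crosses back into Excel. The module and function names are hypothetical and not taken from the actual add-in; the n - 1 divisor is chosen to match what Excel's STDEV computes.

```vbnet
Imports System

Module StatsSketch
    ' Hypothetical helper: sample standard deviation (divides by n - 1).
    ' It works on a plain Double array, so no Excel object is touched.
    Function SampleStdev(values As Double()) As Double
        If values Is Nothing OrElse values.Length < 2 Then
            Throw New ArgumentException("Need at least two values.", "values")
        End If

        ' First pass: arithmetic mean.
        Dim mean As Double = 0
        For Each v As Double In values
            mean += v
        Next
        mean /= values.Length

        ' Second pass: sum of squared deviations from the mean.
        Dim sumSq As Double = 0
        For Each v As Double In values
            sumSq += (v - mean) * (v - mean)
        Next

        Return Math.Sqrt(sumSq / (values.Length - 1))
    End Function

    Sub Main()
        Dim sample As Double() = {2, 4, 4, 4, 5, 5, 7, 9}
        Console.WriteLine(SampleStdev(sample))
    End Sub
End Module
```

The two-pass form is used here because it is easy to check against Excel's result; it is not the only way to compute the statistic.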

"The DLL is compiled as a COM Server because, well, Excel-callable .NET UDFs have to be"
A bit of a show-stopper if true, I agree. But of course, it isn't true at all, why else would I have started that way...
You can write your UDFs in C++ against the Excel SDK and deliver them as an XLL, for one thing. It's a common practice among quantitative analysts in banks; in fact they seem to enjoy it, which says a lot about them as a group.
Another, less painful option, that I've only recently come across, is ExcelDNA, which, AFAICT, provides the nasty SDK/XLL bit with a way to hook up your .NET DLLs. It's sufficiently cool that it even lets you load source code, rather than building a separate DLL, which is great for prototyping (it makes use of the fact that the CLR actually contains the compiler). I don't know about performance: I haven't attempted to benchmark it, but it does seem to get around the COM Interop issue, which is well-known to be ghastly.
Beyond that, I can only endorse other recommendations: reference your workbook, its content and the Excel application as little as possible. Every call costs.

I strongly suspect that interop from VB.NET to the COM server is done via marshalling. In VBA the methods were called directly: control was passed into them at a cost of a couple of processor instructions, and that looked really fast. Now, with marshalling, a whole set of extra work is done and each call incurs a serious overhead. You need to either seriously reduce the number of calls (make each call do more work) or disable marshalling and work as if it were VBA. See this question for details on how to possibly accomplish the latter.

I recently benchmarked moving data from Excel to .NET using various products/methods.
All the .NET methods I tried were slower than VBA and VB6, but the best ones were able to use the XLL interface, which gave better results than the Automation interface.
The benchmark was reasonably optimised (transferring ranges to arrays etc.).
Results were (milliseconds for my benchmark):
Method                             Time (ms)
VB6 COM addin                      63
C XLL                              37
Addin Express Automation VB.net    170
Addin Express XLL VB.net           100
ExcelDNA XLL CVB.Net               81
Managed XLL gave comparable times but also enables custom marshallers, which can be fast.

There is some more performance stuff for ExcelDna on CodePlex: http://exceldna.codeplex.com/Wiki/View.aspx?title=ExcelDna%20Performance.
For really simple functions, the overhead of calling a managed function through ExcelDna is very small, allowing you to make several hundred thousand UDF calls per second.

My guess based on a lot of experience using Excel via COM Interop is that it is the context switch and / or the mapping of data from Excel's internal data structures to .NET objects.
SpreadsheetGear for .NET might be an option for you. It is much faster than Excel via COM Interop (see what some customers say here) and it does support Excel compatible calculations and user defined functions (see the Custom Functions sample on this page).
You can download a free trial here if you want to try it out.
Disclaimer: I own SpreadsheetGear LLC

I have the same experience as Joe. It is mostly the interop that is slow.
In most cases this can be solved by working with entire ranges instead of individual cells.
You do this by using .NET arrays and then passing them to/from Excel in one call.
e.g.
' Read a 10 x 10 block starting at A1 in a single interop call
Dim r As Excel.Range = Me.Range("A1").Resize(10, 10)
' For a multi-cell range, Value returns a 1-based Object(,) array
Dim values As Object(,) = CType(r.Value, Object(,))
For ii As Integer = LBound(values, 1) To UBound(values, 1)
    For jj As Integer = LBound(values, 2) To UBound(values, 2)
        values(ii, jj) = CType(values(ii, jj), Double) * 2
    Next
Next
' Write the whole block back in a single interop call
r.Value = values
This has solved all the performance problems I have seen.

One thought. Instead of passing the Range object (it could be that every call on the Range object is marshalled from .NET to Excel), collate all your parameters into basic types (doubles, strings, typed arrays and, if necessary, untyped variant arrays) and pass them into the .NET DLL. That way you only have to marshal a variant.
-- DM

Really late to this question (7 years) but for what it is worth, I have worked on 5/6 separate Excel systems in Investment Banks and have seen a similar design pattern in all their Excel systems which I'll describe.
Yes, they have blocks of cells which contain related data, such as a list of government bond prices, but they do not always pass this block of cells around. Instead they will create an object that resides in memory, is globally accessible, and is labelled with a handle. The object contains a copy of the cells' content and is thus more easily accessed in analytic code.
So an example handle would be
'USTreasuries(103450|2016-07-25T15:33)'
where it can be seen that '103450' is an object number, unique enough to acquire the object from a globally scoped dictionary (say), the timestamp represents when the object was created, and USTreasuries is a user-friendly description. One would create such an object with a formula function something like this
=CreateHandledObject("USTreasuries",A1:D30)
Then one would write an analytic which accepts this handle and acquires the data internally. It requires CreateHandledObject() to be marked volatile, and you have to turn calculation to manual and execute recalculation by code or by user.
Your problems stem from endlessly marshalling data from the sheet. I think this approach will help you reduce this cumbersome element to a minimum.
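The handle pattern described above can be sketched in plain VB.NET as follows. Everything here is an assumption made for illustration: the HandleStore module, the GetHandledObject function, the starting object number and the exact handle format are hypothetical, and the Excel-facing parts (registering the UDF, marking it volatile, converting a Range to an array) are left out.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Globalization

' Hypothetical sketch of a globally scoped handle dictionary.
Module HandleStore
    ' Maps a handle string to a copy of the cell contents.
    Private ReadOnly store As New Dictionary(Of String, Object(,))()
    Private nextId As Integer = 103450

    ' Copies the cell values once and returns a handle such as
    ' USTreasuries(103450|2016-07-25T15:33).
    ' Not thread-safe; a real add-in would need locking.
    Function CreateHandledObject(label As String, cells As Object(,)) As String
        Dim handle As String = String.Format(CultureInfo.InvariantCulture,
            "{0}({1}|{2:yyyy-MM-ddTHH:mm})", label, nextId, DateTime.Now)
        nextId += 1
        store(handle) = CType(cells.Clone(), Object(,))
        Return handle
    End Function

    ' Analytics accept the handle and fetch the data without touching the sheet.
    Function GetHandledObject(handle As String) As Object(,)
        Dim cells As Object(,) = Nothing
        If store.TryGetValue(handle, cells) Then Return cells
        Throw New KeyNotFoundException("Unknown handle: " & handle)
    End Function

    Sub Main()
        Dim prices As Object(,) = {{99.5, 100.25}, {101.0, 98.75}}
        Dim handle As String = CreateHandledObject("USTreasuries", prices)
        Console.WriteLine(handle)
        Console.WriteLine(GetHandledObject(handle)(1, 0))
    End Sub
End Module
```

The point of the design is that the sheet data crosses the interop boundary once, when the handle is created, and every later analytic call only passes a short string.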

Related

"Out Of Memory" error in excel VBA for a very large model. How can I avoid this?

I am trying to declare a new variable in VBA for Excel. I have an Excel model which has 9 modules and 7 class modules. Each module is really large, with an average of 60 variables declared in each module and anywhere from a few hundred to a couple of thousand lines of code. Every time I try typing a new variable, I get an error that says "Out Of Memory". How can I avoid this error and continue declaring more variables?

As mentioned in the comment we have too little data to provide you with a definite answer.
However the reasons may be plenty:
You are declaring a lot of objects ("Set obj = ") and never cleaning them up (Set obj = Nothing). If you do not release the reference to an object, it will remain in memory.
You have a loop in which you are declaring a lot of objects/variables until you get a memory Overflow.
You are creating too many objects at once that allocate too much memory (e.g. IE object etc.)
How to deal with this?
Start with the code that raises the error as most likely this is happening in a loop or another place which increases memory usage (use debugging F8 to traverse code). There may be many solutions depending on the source of your issue.
Leverage memory statistics throughout different milestones in your code https://social.msdn.microsoft.com/Forums/office/en-US/e3aefd82-ec6a-49c7-9fbf-5d57d8ef65ca/check-size-of-excelexe-in-memory or simply use the Task Manager
See if any of these tips help: https://www.add-ins.com/support/out-of-memory-or-not-enough-resource-problem-with-microsoft-excel.htm

Every time I try typing a new variable, I get an error that says "Out Of Memory".
This sounds like a design-time error--an error you get when editing the code rather than a run-time error that you get when running the code.
If indeed this is a design-time error your file may be corrupted. Try rebuilding it by copying all sheets into a new workbook and copying the code into new, blank modules.

Please note that there are size limits on VBA Forms, Standard, and Class Modules, Procedures, Types, and Variables.
I've only seen it documented here:
https://learn.microsoft.com/en-us/previous-versions/visualstudio/visual-basic-6/aa716182(v=vs.60)
You either need to reduce the scope of your program by splitting it into logical steps, or use a more robust programming language like VB.NET.

.NET & CreateObject, whose memory does the COM object use?

I have a VB.NET application that uses CreateObject to use Excel and dump a lot of data into it. We are getting out of memory exceptions and our app is generally hitting 1GB of memory at this point. However I can't make all the numbers add up.
This is how the data is passed to Excel:
worksheet_object.Range("A1").Resize(rows, cols).Value = an_array
The app is around 400mb with the data on screen (datagrid), when it crashes it has used an additional 600mb despite the fact that Excel with the CSV is only 200mb and the CSV is only 68mb. I realise the in memory array could be somewhat larger but how does 600mb get gobbled up passing the data to Excel unless Excel is somehow using my apps memory?
I have tried to find out if Excel via CreateObject runs in its own memory space or using my apps memory space, drew a complete blank. ProcessExplorer shows them as separate processes so I don't know what to think.
We found running the app as 64-bit rather than 32 solved the problem, but not all our clients will have 64-bit office.
So my question is this: How can that one line use 600mb and is there a better way to pass the data to Excel.

I would bet that this takes some time to run the 1 line as well, 9.7M cells!? But anyway, 2 questions:
1: How can that one line use 600mb (I assume you mean MB (Byte), not mb (bit))
I think you answered that with 9.7M cells. That's only about 61 bytes per cell. I would assume you have some strings. Each string uses a couple of bytes of overhead, plus 2 bytes for every character in the cell. Numbers with a decimal point take up 16 bytes, period. Not to mention that the array itself has some size, to hold its own bookkeeping information. See this chart for an idea of what the data takes up in VB: 2012 VB Data Types. One amazing thing that I just learned, because I had not looked at this chart in so long, is that going to a 64-bit OS should actually have taken MORE memory. Luckily, 64-bit can handle more.
2: Is there a better way to pass the data to Excel.
You might try using the OLEDB ACE driver to work with your Excel file. It is faster and does not require Excel to be installed on the computer, which are big pluses.
How To Use ADO.NET to Retrieve and Modify Records in an Excel Workbook With Visual Basic .NET
This post is old but still relevant and very helpful. For connection strings for newer versions of Excel, try ConnectionStrings.com

Loading (and executing) a lisp-file in autocad using .NET

I'm currently in the process of rewriting some old AutoCAD plugins from VBA to VB.NET. As it turns out, a (rather large) part of said plugin is implemented in LISP, and I've been told to leave that be. So the problem became running LISP code in AutoCAD from .NET. Now, there are a few resources online that explain the process necessary to do so (like this one), but all of them take for granted that the LISP files/functions are already loaded. The VBA function I'm currently scratching my head trying to figure out how to convert does a "(LOAD ""<file>"")", and the script is built in such a way that it auto-executes on load (it's a simple script, doesn't register functions, just runs from start to end and does its thing).
So my question is: how can I load (and thus execute) a LISP file in AutoCAD from a .NET plugin?

Ok, there are two ways to sendcommand via .NET.
The first thing you need to understand is that ThisDocument doesn't exist in .NET.
ThisDocument is the document where the VBA code is written, but since your add-in is document-independent, it stands alone and you must take the documents from the Application object.
You access the application with:
Autodesk.AutoCAD.ApplicationServices.Application
If you want to convert it to the same Application object as in VBA, with the same methods and functions:
using Autodesk.AutoCAD.Interop;
using Autodesk.AutoCAD.Interop.Common;
AcadApplication App = (AcadApplication)Autodesk.AutoCAD.ApplicationServices.Application.AcadApplication;
The first application object has DocumentManager.MdiActiveDocument, from which you can get the Editor and send written commands, or call SendStringToExecute as mentioned in the other answer.
The AcadApplication has ActiveDocument (an AcadDocument object that behaves exactly as in VBA).
This document will have the same SendCommand your VBA has, use it the same way it's done in VBA.
If you can explain better the autoexecute part, I can help with that too.
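For the managed route mentioned above, a minimal VB.NET sketch might look like the following. It assumes the plugin references AutoCAD's managed assemblies; the command name LOADMYLISP and the file path are hypothetical placeholders. Note that SendStringToExecute queues the string, so the LISP file runs after the .NET command method has returned, not inside it.

```vbnet
Imports Autodesk.AutoCAD.ApplicationServices
Imports Autodesk.AutoCAD.Runtime

Public Class LispLoader
    ' "LOADMYLISP" and the path below are hypothetical placeholders.
    <CommandMethod("LOADMYLISP")>
    Public Sub LoadMyLisp()
        Dim doc As Document = Application.DocumentManager.MdiActiveDocument
        If doc Is Nothing Then Return

        ' Forward slashes avoid having to escape backslashes inside the LISP string.
        Dim lispPath As String = "C:/scripts/myscript.lsp"

        ' The trailing space acts like pressing Enter on the command line.
        ' Arguments: command text, activate, wrapUpInactiveDoc, echoCommand.
        doc.SendStringToExecute("(load """ & lispPath & """) ", True, False, False)
    End Sub
End Class
```

Because the script in the question runs from start to end on load, loading it this way should also execute it.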

Avoiding cross process calls when doing Word automation via VB.net

The short version
I've got a Word Addin in VB.net and VSTO that exposes a COM compatible object via Word.COMAddins.Object, so that the addin functionality can be called External to Word, without accesses to Word itself being cross-process.
The technique worked in VB6, but with VB.net, it still works, but it's much slower than the same code running directly from the addin via a task pane, as if the calls are all cross process when they shouldn't be.
The Long version
This addin essentially does tons of processing on Word Documents.
The addin can be run in two ways:
from within Word, using a taskpane
externally, via a set of classes exposed to COM (because I have to provide access to the functionality to VB6 client apps).
BUT, here's the rub. Anyone who's ever done Word automation knows that code that runs perfectly acceptably INPROC with Word (in this case the instance of the ADDIN that Word itself loads), will generally run unacceptably slowly out of process (or cross process).
This app is no different.
Ages ago, I made use of a handy trick to circumvent this issue.
Create a Word Addin as usual.
Expose an object via the Word.COMAddin.Object property that will let external code access your addin.
In your external project, instead of manipulating Word directly, use the Application.COMAddins collection, find your addin, retrieve the exposed COMAddin.Object property from it and then call a method on that object that does the work.
Of course, the call to your COMAddin.Object object will still be cross process, BUT, once execution is in the addin that is IN PROCESS with Word, your addin can now perform all the Word object manipulations it wants and it's fast because they're all in-process calls at that point.
That worked in the VB6 COM days.
But I put together this VB.net VSTO addin, and exposed my addin object via the RequestComAddInAutomationService function of VSTO's Connect object.
I can make calls into my addin externally and they all work exactly as I would expect them to, except they're all +slow+, very much like the calls into Word are still being performed cross process even though the code making those calls to Word is part of the addin dll that was loaded in-process by Word!
And slow as in a factor of about 10 to 1; what takes 3 seconds to run when run directly from the ADDIN via the task pane, takes ~30seconds to run when called from external code through the COMADDIN.object object.
I'm guessing that I'm running into some sort of issue with .net APPDOMAINS or something and what +really+ constitutes cross proc calls in .net, but I've found nothing so far that would even hint about this sort of thing.
My next step, barring some mystical insight, will be to code up a repro, which could get tricky because of the sheer number of elements in play.
Any thoughts?

I've made the same observations with my VSTO Word add in. What I'd like to add here: When you add your procedure as a click handler to a button:
this.testButton.Click += new Office._CommandBarButtonEvents_ClickEventHandler(YourProcedure);
and implement your expensive procedure in "YourProcedure", you can call into Word's UI thread using
this.testButton.Execute();
This is not an elegant solution either, but maybe useful if you happen to have buttons ready in a CommandBar.

Unfortunately, the Event hook technique Thorben mentions wouldn't work for my particular situation.
So I'm closing this question out with the workaround that I mentioned in comments and I'll repeat here...
Well, not a perfect solution, but I have found +a+ solution. It involves a timer, so it's definitely suboptimal. Essentially, when the addin is loaded by Word (i.e. during the STARTUP event), initialize a timer (a WINFORMS timer, not a threading timer), and set its interval to 500. When external code connects to the addin via the COMADDIN.Object property and makes a call into the addin, set a flag variable, which is being polled by the timer. When the timer sees it set, it resets the flag and performs the action.
It's not the clean solution I'd have preferred, but it's fairly easy to implement, moderately easy to understand after the fact, and it definitely avoids the slowdown of xprocess COM calls into Word.
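A minimal sketch of that timer workaround, under stated assumptions: the class name AddinBridge and the methods Start, RequestWork and DoWork are hypothetical, and the COM-visibility attributes and the VSTO plumbing that hands this object to external callers are omitted.

```vbnet
Imports System

' Hypothetical sketch of the object exposed through COMAddin.Object.
Public Class AddinBridge
    ' A WinForms timer raises Tick on the thread that started it,
    ' which is Word's UI thread when Start is called during add-in startup.
    Private WithEvents pollTimer As New System.Windows.Forms.Timer()
    Private workRequested As Boolean = False

    ' Call once from the add-in's Startup handler.
    Public Sub Start()
        pollTimer.Interval = 500   ' milliseconds, as described above
        pollTimer.Start()
    End Sub

    ' Called by external code; it only sets the flag and returns at once.
    Public Sub RequestWork()
        workRequested = True
    End Sub

    Private Sub pollTimer_Tick(sender As Object, e As EventArgs) Handles pollTimer.Tick
        If Not workRequested Then Return
        workRequested = False
        DoWork()
    End Sub

    Private Sub DoWork()
        ' Placeholder for the Word object-model processing.
    End Sub
End Class
```

One consequence of this design is that the external caller gets control back before the work has run, so any result has to be collected separately, for example by polling a status property.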

Is use of Mid(), Instr(), LBound(), UBound() etc. in VB.Net not recommended?

I come from a C# background but am now working mostly with VB.Net. It seems to me that the above functions (and others - eg. UCase, LCase) etc. are carryovers from VB6 and before. Is the use of these functions frowned upon in VB.Net, or does it purely come down to personal preference?
My personal preference is to stay well away from them, but I'm wondering if that is just my C# prejudice.
I've come across a couple of issues - particularly with code converted from VB6 to VB.Net, where the 0 indexing of collections has meant that bugs have been introduced into code, and am therefore wary of them.

The reason that those functions are there in the first place is of course that they are part of the VB language, inherited from VB 6.
However, they are not just wrappers for methods in the framework, some of them have some additional logic that makes them different in some ways. The Mid function for example allows that you specify a range that is outside the string, and it will silently reduce the range and return the part of the string that remains. The String.Substring method instead throws an exception if you specify a range outside the string.
So, the functions are not just wrappers, they represent a different approach to programming that is more in line with Visual Basic, where you can throw just about anything at a function and almost always get something out. In some ways that is easier, as you don't have to think about all the special cases, but on the other hand you might want to get an exception instead of getting a result when you feed something unreasonable to a function. When debugging, it's often easier if you get the exception as early as possible instead of trying to trace back where a faulty value comes from.
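The difference can be seen in a small console sketch (the module name is hypothetical). Both calls ask for eight characters starting at the third character of a two-character string.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module MidVersusSubstring
    Sub Main()
        Dim s As String = "ab"

        ' Mid uses 1-based positions and silently clips the range
        ' to whatever part of the string remains.
        Dim viaMid As String = Mid(s, 3, 8)
        Console.WriteLine("Mid returned an empty string: " & (viaMid = "").ToString())

        ' Substring uses 0-based positions and rejects a range
        ' that extends past the end of the string.
        Try
            Dim viaSubstring As String = s.Substring(2, 8)
            Console.WriteLine("Substring returned: " & viaSubstring)
        Catch ex As ArgumentOutOfRangeException
            Console.WriteLine("Substring threw " & ex.GetType().Name)
        End Try
    End Sub
End Module
```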

Those options are for backward compatibility.
But, it will be better for people to use framework classes/methods to ensure consistency.
Having said that, VB6 functions are easy to understand. So, it should not be an issue for someone who has the VB background.
EDIT: Also, some of the overloads available with framework classes might not have an equivalent among the simple VB6-style functions. I cannot think of any right now, but this, I think, could be a better reason to use framework classes/methods.

There will be special cases, but, Hands down, use the VB6 versions, unless you care about the difference between a string being "" and Nothing.
I was working on a big project where different programmers used both ways; the code where people used MyString.Substring(1) was blowing up, while Mid(MyString,2) was working.
The two main errors for this example: (Which apply in various ways to others as well)
(1) String can be nothing and you have to check before running a method on it. Limitation of the OO notation: You can't call a member method if the object is nothing, even if you want 'nothing' or (empty object) back. Even if this were solved by using nullable/stub objects for strings (which you kind of can using "" or string.empty), you'd still have to ensure they're initialized properly - or, as in our case - convert Nothing to "" when receiving strings from library calls beyond our control.
You are going to have strings that are Nothing. 90% of the time you'll want that to mean "". With .Substring, you always have to check for Nothing. With the VB versions, you only check in the 10% of cases where you care.
(2) Specifically with the Mid example, again, 90% of the time if you want chars 3-10 of a 2 char string, you'll want to see "" returned, not have it throw an exception! In fact, you'll rarely want an exception: you'll have to check first for the proper length and code how it should behave (there is usually a defined behaviour, at the very least, a data entry error, for which you don't want to throw an exception).
So you're checking 100% of the time with the .Net versions and rarely with the VB versions.
.Net wanted to keep everything into the object-oriented philosophy. But strings are a little different than most objects used in subtle ways. MS-Basic wasn't thinking about this when they made the functions, it just got lucky - one of the strengths of functions is that they can handle null objects.
For our project, one may ask how Nothing strings got into our flow in the first place. But in the end, the decision of some programmers to use the .Net functions meant avoidable service calls, emergency bug fixes, and patches. Save yourself the trouble.
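The Nothing case from point (1) can be reproduced with a short console sketch (the module name is hypothetical). It also shows the kind of normalisation at the boundary that the answer describes, converting Nothing to "" before calling instance methods.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module NothingStrings
    Sub Main()
        Dim s As String = Nothing

        ' The VB function accepts a Nothing string without throwing.
        Dim viaMid As String = Mid(s, 2)
        Console.WriteLine("Mid result is null or empty: " & String.IsNullOrEmpty(viaMid).ToString())

        ' An instance method cannot be called on a Nothing reference.
        Try
            Dim viaSubstring As String = s.Substring(1)
            Console.WriteLine("Substring returned: " & viaSubstring)
        Catch ex As NullReferenceException
            Console.WriteLine("Substring threw " & ex.GetType().Name)
        End Try

        ' Normalising at the boundary: treat Nothing as "" and guard the length.
        Dim safe As String = If(s, "")
        Dim tail As String = If(safe.Length > 1, safe.Substring(1), "")
        Console.WriteLine("Guarded result is empty: " & (tail = "").ToString())
    End Sub
End Module
```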

I would avoid them. Since you've mentioned them it sounds as though you've inherited some VB6 code that was possibly converted to a VB.NET project. Otherwise, if it was a new VB.NET project, I see no value in using the VB6 methods.
I've been on a few VB6 to VB.NET conversion projects. While I am familiar with the names and the difference in 0 based indexing, any code I came across got refactored to use their .NET equivalents. This was partially for consistency and to get the VB6 programmers on that project familiar with the framework. However, the other benefit I've found in refactoring is the ability to chain method calls together.
Consider the following:
Dim input As String = "hello world"
Dim result As String = input.ToUpper() ' .NET
Dim result As String = UCase(input) ' VB6
Next thing you know, I need to do more work to satisfy other requirements. Let's say I need to take the first five characters and get "HELLO", which results in the code getting updated to:
Dim result As String = input.ToUpper().Substring(0, 5) ' .NET
Dim result As String = Mid(UCase(input), 1, 5) ' VB6
Which is clearer and easier to modify? In .NET I just chain it. In VB6 I had to start at the beginning of the method, then go to the end of it and add the closing parenthesis. If it changes again or I need to remove it, in .NET I just chop off the end, but in VB6 I need to backtrack to the start and end.
I think there's value in using the .NET methods since other .NET developers that join the project later, especially those from a C# background, can easily pick it up.
