.NET & CreateObject, whose memory does the COM object use? - vb.net

I have a VB.NET application that uses CreateObject to use Excel and dump a lot of data into it. We are getting out-of-memory exceptions and our app is generally hitting 1GB of memory at this point. However, I can't make all the numbers add up.
This is how the data is passed to Excel:
worksheet_object.Range("A1").Resize(rows, cols).Value = an_array
The app is around 400mb with the data on screen (in a DataGrid); when it crashes it has used an additional 600mb, despite the fact that Excel with the CSV loaded is only 200mb and the CSV itself is only 68mb. I realise the in-memory array could be somewhat larger, but how does 600mb get gobbled up passing the data to Excel unless Excel is somehow using my app's memory?
I have tried to find out whether Excel launched via CreateObject runs in its own memory space or uses my app's memory space, and drew a complete blank. Process Explorer shows them as separate processes, so I don't know what to think.
We found that running the app as 64-bit rather than 32-bit solved the problem, but not all our clients will have 64-bit Office.
So my question is this: how can that one line use 600mb, and is there a better way to pass the data to Excel?

I would bet that this one line takes some time to run as well, 9.7M cells!? But anyway, two questions:
1: How can that one line use 600mb? (I assume you mean MB (megabytes), not Mb (megabits).)
I think you answered that yourself: 9.7M cells. That's only about 61 bytes per cell. I would assume you have some strings; each string has a few bytes of overhead plus 2 bytes for every character in the cell. Numbers with a decimal point take up 16 bytes, period. Not to mention the array itself has some overhead to hold its bookkeeping information. See this chart for an idea of what the data types take up in VB: 2012 VB Data Types. One amazing thing that I just learned, because I had not looked at this chart in so long, is that by going to a 64-bit OS it should actually have taken MORE memory. Luckily, 64-bit can handle more.
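To make the arithmetic concrete, here is a rough back-of-the-envelope check in VB.NET; the figures are just the ones quoted in the question and the per-cell costs are approximations, not measurements.
' Back-of-the-envelope only; the numbers are the ones quoted in the question.
Dim cellCount As Double = 9700000.0        ' ~9.7 million cells
Dim extraBytes As Double = 600000000.0     ' ~600 MB of additional memory at the crash
Console.WriteLine(extraBytes / cellCount)  ' ~62 bytes per cell
' Each element of an Object(,) array is a boxed value: a Decimal alone is 16 bytes, a string
' costs roughly 2 bytes per character plus object overhead, and the COM interop layer makes
' at least one more copy of the whole array while marshalling it to Excel, so tens of bytes
' per cell is plausible.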
2: Is there a better way to pass the data to Excel?
You might try using the OLEDB ACE driver to work with your Excel file. It is faster and does not require Excel to be installed on the computer; big pluses there.
How To Use ADO.NET to Retrieve and Modify Records in an Excel Workbook With Visual Basic .NET
This post is old but still relevant and very helpful. For connection strings for newer versions of Excel, try ConnectionStrings.com.
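For what it's worth, a minimal sketch of the ADO.NET/ACE route in VB.NET might look like the following; the provider string, file path, and sheet/column names are placeholders, so adapt them from the linked articles.
Imports System.Data.OleDb

Module AceExportSketch
    Sub Main()
        ' Assumes the ACE 12.0 provider is installed; the path and names are placeholders.
        Dim connStr As String = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\temp\output.xlsx;" & _
                                "Extended Properties=""Excel 12.0 Xml;HDR=YES"""
        Using conn As New OleDbConnection(connStr)
            conn.Open()
            ' CREATE TABLE adds a new worksheet with typed columns.
            Using create As New OleDbCommand("CREATE TABLE [Data] (Id INT, Name TEXT)", conn)
                create.ExecuteNonQuery()
            End Using
            ' A parameterised INSERT writes rows without automating Excel cell by cell.
            Using insert As New OleDbCommand("INSERT INTO [Data] (Id, Name) VALUES (?, ?)", conn)
                insert.Parameters.AddWithValue("@p1", 1)
                insert.Parameters.AddWithValue("@p2", "Example")
                insert.ExecuteNonQuery()
            End Using
        End Using
    End Sub
End Module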

Related

After debugging VBA code, excel filesize is inexplicably large

The problem
Although one might assume this is a de rigueur problem with a mismanaged workbook, please hear me out. I'm working with a relatively modest (80-100kb) Excel file. It has minimal formatting, very few formulas, and its VBA code doesn't do anything too fancy. I've been working with VBA for a few months now and have had this problem exactly once before. As this tool will be used to produce reports for a while to come, and as I have limited time, I'd rather not rebuild the whole thing or turn in an unwieldy project.
Details
My code loops through several dozen 200+kb .xls files using the Workbooks.Open method and grabs 1-2 values depending on what it has opened. It had been working through over half a dozen iterations with no change in size. In my latest version it has ballooned from 100kb to 10mb. No tabs have been added, and I've looked extensively for all sorts of formatting and other traditional pitfalls; it's not my first rodeo in that respect. I am new-ish to VBA, though, and so I often step through code and have to pause or stop the debugger to correct syntax before resuming. Sometimes this happens before any Workbooks.Close call.
I think, inexplicably, during some combination of saves/crashes/debugging, excel has 'saved' some of the opened workbooks in this file's memory. It gets weirder: if I copy all of my worksheets, as well as my code, to a new workbook, the file size falls back down to 100kb or so. I've had this problem once before. Using a 20kb workbook I was trying to pull values from a 135mb workbook; after one save I noticed that my workbook was now 135mb and change, and I was only able to get it back down to 20kb by copying everything one sheet at a time to a new workbook. Does anyone know what's going on?
Attempting to google this produces dozens of pages on extraneous formatting, formulas, and VBA code unrelated to my issue.
I could post pseudo-code if it would be useful to see, but I'm pretty sure it wouldn't be too enlightening, and that my problem is some memory issue that occurs between .Open and .Close methods.
Edit: after looking through some of the workbooks that I'm opening, quite a few of them are 'around' 10mb. It seems suspicious to me that my tool would increase in size by about as much as one of the .xlsx files I'm opening.

Excel form object limit? Compile Error out of memory

I have created an Excel sheet that opens a form, but I seem to have hit an object limit of some kind. I say this because, with no other code changes, if I add a few more text boxes it no longer opens and I receive a "Compile error: Out of memory".
I have seen multiple posts about out-of-memory errors where the advice given is to use Set object = Nothing. However, I don't believe this will help in my case because I am not using ANY global objects and all my private Sub variables are Integer, Byte, or Currency. As I understand it, all private variables should be cleaned up by the VBA garbage collector at the end of the procedure anyway.
I exported my form1 and it shows UserForm1.frx as 4,143KB and UserForm1.frm as 140KB.
Did I indeed hit some cap? How can I confirm that?
Is there a way to get more out of Excel, since this appears to be a memory limit?
If I am at a cap, does Visual Studio Professional have a higher cap than Excel for forms?
By default Excel is 32-bit and has a low memory limit of 2 GB; however, I found that if you uninstall and reinstall as 64-bit Excel 2016 you increase the available virtual memory to up to 8 TB. This makes any program less widely usable, but it does allow you to build much larger programs! That was the fix I needed, anyway.
Here is the source that led me there:
http://www.decisionmodels.com/memlimitsc.htm

VBA Ignore Excel cannot complete this task with available resources

I'm writing VBA code that will run everything after I leave the office.
The macro works fine; the problem is that sometimes (more often than I'd like) I get the message:
Excel cannot complete this task with the available resources. Choose less data or close other applications. Continue without Undo?
I just click OK and the code runs fine, but I have to click OK manually.
I've already tried the
Application.DisplayAlerts = False
but this doesn't work.
Does anyone know how I can make Excel bypass this prompt?
Thank you in advance.
I believe "Continue without Undo" means Excel is temporarily clearing the RAM it uses to track undo levels and then (it seems) your macro has the resources it needs to complete the process.
Take a look at what your macro is doing to use so much RAM: Is there a way to modify it so that less RAM is required? There are several options for this listed here:
How to clear memory to prevent "out of memory error" in excel vba?
A second option to fix this might be adding RAM to your machine, but that will not fix the cause of the error.
Third, if you want to risk a registry edit and reduced or eliminated undo levels in Excel, you might be able to prevent this error by reducing the number of undo levels (http://support.microsoft.com/kb/211922).
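If you do go the registry route, a minimal VB.NET sketch might look like the following; the UndoHistory value name and the Office version key are my reading of that KB article and should be treated as assumptions, so check the article (and back up the registry) before running anything like this.
Imports Microsoft.Win32

Module ReduceUndoLevels
    Sub Main()
        ' Assumption: an Office 2016 ("16.0") install and the UndoHistory DWORD described in KB211922.
        ' Adjust the version number for your installation; Excel must be restarted to pick this up.
        Registry.SetValue("HKEY_CURRENT_USER\Software\Microsoft\Office\16.0\Excel\Options", _
                          "UndoHistory", 5, RegistryValueKind.DWord)
    End Sub
End Module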

Using open source SNES emulator code to turn a rom file into a self-contained executable game

Would it be possible to take the source code from a SNES emulator (or any other game system emulator for that matter) and a game ROM for the system, and somehow create a single self-contained executable that lets you play that particular ROM without needing either the individual rom or the emulator itself to play? Would it be difficult, assuming you've already got the rom and the emulator source code to work with?
It shouldn't be too difficult if you have the emulator source code. You can use a method that is often used to store images in C source files.
Basically, what you need to do is create a char * variable in a header file, and store the contents of the rom file in that variable. You may want to write a script to automate this for you.
Then, you will need to alter the source code so that instead of reading the rom in from a file, it uses the in memory version of the rom, stored in your variable and included from your header file.
It may require a little bit of work if you need to emulate file pointers and such, or you may be lucky and find that the rom loading function just loads the whole file in at once. In this case it would probably be as simple as replacing the file load function with a function to return your pointer.
However, be careful about licensing issues. If the emulator is licensed under the GPL, you may not be legally allowed to store a proprietary file in the executable, so it would be worth checking that, especially before you release / distribute it (if you plan to do so).
Yes, more than possible; it has been done many times. Google: static binary translation. Graham Toal has a good howto paper on the subject, which should show up early in the hits. There may be some code out there; I may have left some of my own out there as well.
Completely removing the ROM may be a bit more work than you think, but not using an emulator is definitely possible. Actually, both requirements are possible, and you may be surprised how many handheld console games or set-top-box games are translated rather than emulated, especially on platforms like those from Nintendo where there isn't enough processing power to emulate in real time.
You need a good emulator as a reference, and/or write your own emulator as a reference. Then you need to write a disassembler, and then you have that disassembler generate C code (please don't try to translate directly to another target; I made that mistake once. C is portable, and the compilers will take care of a lot of dead-code elimination for you). So an instruction from a make-believe instruction set might be:
add r0,r0,#2
And that may translate into:
//add r0,r0,#2
r0=r0+2;
do_zflag(r0);
do_nflag(r0);
It looks like the SNES is related to the 6502, which is what Asteroids used, and that is the translation I have been working on, off and on, as a hobby for a while now. The emulator you are using is probably written and tuned for runtime performance and may be difficult at best to use as a reference and to check in lock step with the translated code. The 6502 is nice because, compared to say the Z80, there really are not that many instructions.
As with any variable-word-length instruction set, the disassembler is your first big hurdle. Do not think linearly; think execution order, think like an emulator. You cannot linearly translate instructions from zero to N or from N down to zero. You have to follow all the possible execution paths, marking bytes in the ROM as either being or not being the first byte of an instruction. Some bytes you can decode as data and, if you choose, mark those; otherwise assume all other bytes are data or fill. Figuring out what to do with that data is the real problem with getting rid of the ROM: some code addresses data directly, while other code uses register-indirect addressing, meaning at translation time you have no idea where that data is or how much of it there is. Once you have marked all the starting bytes for instructions, it is a trivial task to walk the ROM from zero to N, disassembling and/or translating.
Good luck, enjoy, it is well worth the experience.

VB.NET/COM Server code way slower than Excel VBA code

Background
I have a client who needs Excel VBA code that produces formula values to be moved to VB.NET. He is in the business of providing financial analytics, in this case delivered as an Excel add-in. I have translated the VBA into VB.NET code that runs in a separate DLL. The DLL is compiled as a COM Server because, well, Excel-callable .NET UDFs have to be. So far, so good: Excel cells have "=foo(Range1, Range2, ...)", the VB.NET COM Server's UDF is called, and the cell obtains a value that matches the VBA code's value.
The problem
The VB.NET code is way slower. I can stretch a range of VBA-based formulas and get instantaneous calculation. I can stretch a comparable range of VB.NET-based formulas and the calculation takes 5-10 seconds. It is visibly slower and unacceptable to the client.
There are a few possibilities that occur to me:
native compilation of VBA is faster because of the absence of a switch
the DLL may be loaded and unloaded for each UDF call
the DLL calls Excel WorksheetFunction methods and requires an Application object, and creating the Application object is expensive
calling an Excel WorksheetFunction method from the DLL is expensive
I don't think that (2) is true because I put calls to append to a file in the Shared New, the Public New, and Finalize functions, and all I get are:
Shared Sub New
Public Sub New
Finalize
when I open the spreadsheet, repeatedly stretch a formula range, and close the spreadsheet.
I don't think (3) is true because the file writing shows that the Application object is created only once.
The question
How do I figure out what is taking the time? How to profile in this environment? Are there obvious enhancements?
In the last category, I have tried to reduce the number of creations of an Application object (used for WorkSheetFunction calls) by making it Shared:
<Guid("1ECB17BB-444F-4a26-BC3B-B1D6F07D670E")> _
<ClassInterface(ClassInterfaceType.AutoDual)> _
<ComVisible(True)> _
<ProgId("Library.Class")> _
Public Class MyClass
Private Shared Appp As Application ' Very annoying
Approaches taken
I've tried to reduce the dependence on Excel mathematical functions by rewriting my own. I've replaced Min, Max, Average, Stdev, Small, Percentile, Skew, Kurtosis, and a few more. My UDF code calls out to Excel much less. The unavoidable call seems to be taking a Range as an argument and converting that to a .NET Array for internal use.
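In outline, that conversion is something like the following (a rough sketch only; it assumes Imports Excel = Microsoft.Office.Interop.Excel and that the incoming range holds numeric values):
Private Function RangeToDoubles(rng As Excel.Range) As Double(,)
    Dim raw As Object(,) = CType(rng.Value, Object(,))    ' one interop call; Excel returns a 1-based array
    Dim result(UBound(raw, 1) - 1, UBound(raw, 2) - 1) As Double
    For i As Integer = 1 To UBound(raw, 1)
        For j As Integer = 1 To UBound(raw, 2)
            result(i - 1, j - 1) = Convert.ToDouble(raw(i, j))
        Next
    Next
    Return result
End Function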
The DLL is compiled as a COM Server
because, well, Excel-callable .NET
UDFs have to be
A bit of a show-stopper if true, I agree. But of course, it isn't true at all, why else would I have started that way...
You can write your UDFs in C++ against the Excel SDK and deliver them as an XLL, for one thing. It's a common practice among quantitative analysts in banks; in fact they seem to enjoy it, which says a lot about them as a group.
Another, less painful option that I've only recently come across is ExcelDNA, which, AFAICT, wraps the nasty SDK/XLL bit and gives you a way to hook up your .NET DLLs. It's sufficiently cool that it even lets you load source code, rather than building a separate DLL, which is great for prototyping (it makes use of the fact that the CLR actually contains the compiler). I don't know about performance: I haven't attempted to benchmark it, but it does seem to get around the COM Interop issue, which is well known to be ghastly.
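To give a flavour, a minimal Excel-DNA UDF in VB.NET looks something like this (a sketch based on the Excel-DNA documentation; the function name is made up, and the class library would be packaged as an .xll via the usual Excel-DNA build steps):
Imports ExcelDna.Integration

Public Module SampleUdfs
    ' Exposed to Excel as a worksheet function once the generated .xll add-in is loaded.
    <ExcelFunction(Description:="Adds two numbers; illustrative only.")> _
    Public Function DnaAddTwo(x As Double, y As Double) As Double
        Return x + y
    End Function
End Module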
Beyond that, I can only endorse other recommendations: reference your workbook, its content and the Excel application as little as possible. Every call costs.
I strongly suspect that interop from VB.NET to the COM server is done via marshalling. In VBA the methods were called directly; control was passed into them at a cost of a couple of processor instructions, which is why it looked so fast. With marshalling, a whole set of extra work is done and each call incurs a serious overhead. You need to either drastically reduce the number of calls (make each call do more work) or disable marshalling and work as you did with VBA. See this question for details on how to possibly accomplish the latter.
I recently benchmarked moving data from Excel to .NET using various products/methods.
All the .NET methods I tried were slower than VBA and VB6 but the best ones were able to use the XLL interface which gave better results than the Automation interface.
the benchmark was reasonably optimised (transferring ranges to arrays etc)
results were (millisecs for my benchmark)
VB6 COM addin 63
C XLL 37
Addin Express Automation VB.net 170
Addin Express XLL VB.net 100
ExcelDNA XLL CVB.Net 81
Managed XLL gave comparable times but also enables custom marshallers, which can be fast.
There is some more performance stuff for ExcelDna on CodePlex: http://exceldna.codeplex.com/Wiki/View.aspx?title=ExcelDna%20Performance.
For really simple functions, the overhead of calling a managed function through ExcelDna is very small, allowing you to make several hundred thousand UDF calls per second.
My guess based on a lot of experience using Excel via COM Interop is that it is the context switch and / or the mapping of data from Excel's internal data structures to .NET objects.
SpreadsheetGear for .NET might be an option for you. It is much faster than Excel via COM Interop (see what some customers say here) and it does support Excel compatible calculations and user defined functions (see the Custom Functions sample on this page).
You can download a free trial here if you want to try it out.
Disclaimer: I own SpreadsheetGear LLC
I have the same experience as Joe. It is mostly the interop that is slow.
In most cases this can be solved by working with entire ranges instead of individual cells.
You do this by using .NET arrays and then passing them to/from Excel in one call.
e.g.
Dim r As Excel.Range = Me.Range("A1").Resize(10, 10)       ' the block of cells to work with
Dim values As Object(,) = CType(r.Value, Object(,))        ' read every cell in one interop call (1-based array)
For ii As Integer = 1 To UBound(values, 1)
    For jj As Integer = 1 To UBound(values, 2)
        values(ii, jj) = CType(values(ii, jj), Double) * 2 ' work on the in-memory copy
    Next
Next
r.Value = values                                           ' write the whole block back in one interop call
This has solved all the performance problems I have seen.
One thought: instead of passing the Range object (it could be that every call onto the Range object is marshalled from .NET to Excel), collate all your parameters into basic types: doubles, strings, typed arrays and, if necessary, un-typed variant arrays, and pass them into the .NET DLL. That way you only have to marshal a variant.
-- DM
Really late to this question (7 years), but for what it is worth: I have worked on 5 or 6 separate Excel systems in investment banks and have seen a similar design pattern in all of them, which I'll describe.
Yes, they have blocks of cells which contain related data, such as a list of government bond prices, but they do not always pass this block of cells around. Instead they will create an object that resides in memory, is globally accessible, and is labelled with a handle. The object contains a copy of the cells' content and is thus more easily accessed in analytic code.
So an example handle would be
'USTreasuries(103450|2016-07-25T15:33)'
where it can be seen that '103450' is an object number, unique enough to acquire the object from a globally scoped dictionary (say), the timestamp represents when the object was created, and USTreasuries is a user-friendly description. One would create such an object with a formula function something like this:
=CreateHandledObject("USTreasuries",A1:D30)
Then one would write an analytic which accepts this handle and acquires the data internally. It requires CreateHandledObject() to be marked volatile, and you have to turn calculation to manual and trigger recalculation from code or by the user.
Your problems stem from endlessly marshalling data from the sheet. I think this approach will help you reduce that cumbersome element to a minimum.
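A bare-bones sketch of the idea in VB.NET (all names here, such as HandleStore, are illustrative rather than a real API, and Excel is assumed to be aliased to Microsoft.Office.Interop.Excel):
Imports System.Collections.Concurrent
Imports Excel = Microsoft.Office.Interop.Excel

Public Module HandleStore
    Private ReadOnly Store As New ConcurrentDictionary(Of String, Object(,))()
    Private counter As Integer = 0

    ' UDF: copy the cell contents once, cache them in memory, and hand back a handle string.
    Public Function CreateHandledObject(name As String, rng As Excel.Range) As String
        Dim data As Object(,) = CType(rng.Value, Object(,))
        Dim id As Integer = System.Threading.Interlocked.Increment(counter)
        Dim handle As String = String.Format("{0}({1}|{2:yyyy-MM-ddTHH:mm})", name, id, DateTime.Now)
        Store(handle) = data
        Return handle
    End Function

    ' Analytics accept the handle and read the cached array instead of touching the sheet again.
    Public Function GetHandledObject(handle As String) As Object(,)
        Return Store(handle)
    End Function
End Module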