Opening Workbook too slow - vb.net

I have many Excel files and I want to read cell values from these files via a VB.NET Windows application. But I realized that I have a performance bottleneck when opening the Excel.Workbook:
xlApp.Workbooks.Open(Path)
It takes about 600 ms on my laptop (with 4 GB of RAM), and I have to open many workbooks.
Is there a way to speed up the opening process? Or is there a way to read cell values from an Excel File without opening it?

Use Excel as a database and then the reading should be quick. Alternatively, you can use OpenXml for your operations. A sample is available at: http://www.codeproject.com/Articles/670141/Read-and-Write-Microsoft-Excel-with-Open-XML-SDK
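If you go the "Excel as a database" route, a minimal sketch could look like the one below. It assumes the Microsoft ACE OLE DB 12.0 provider is installed and that the workbook has a sheet named "Sheet1" with a header row; the file path, sheet name and range are placeholders to adjust for your files.

Imports System.Data.OleDb

Module ReadExcelViaOleDb
    Sub Main()
        Dim path As String = "C:\Data\Book1.xlsx"   ' hypothetical file path
        Dim connStr As String =
            "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & path &
            ";Extended Properties=""Excel 12.0 Xml;HDR=YES"""

        Using conn As New OleDbConnection(connStr)
            conn.Open()
            ' Query a worksheet (or a range within it) like a database table.
            Using cmd As New OleDbCommand("SELECT * FROM [Sheet1$A1:C10]", conn)
                Using reader As OleDbDataReader = cmd.ExecuteReader()
                    While reader.Read()
                        ' First column of each row; index other columns as needed.
                        Console.WriteLine(reader.GetValue(0))
                    End While
                End Using
            End Using
        End Using
    End Sub
End Module

Because this never starts an Excel process, the per-file cost is typically much lower than automating Excel.Workbooks.Open for every workbook.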

Related

PowerBI data model conversion

I published my Power BI dashboard to the cloud using Import Excel Workbook Content. Unfortunately my hard disk died and I lost the original workbook (which contained the Power Pivot data model).
Can I convert the PBIX file back to a Power Pivot model? I tried renaming the file to .zip and could see a file called DataModel with a similar file size to my original model, but I don't know what to do with it.
Help me please?
Currently the conversion between a Power Pivot for Excel data model and a Power BI data model is one-way, from Power Pivot for Excel to Power BI.
May I ask why you need to convert back to Power Pivot? Power BI Desktop offers Power Pivot, so you can do your data modelling and measure definition there.
I've seen some people talk about changing the file extension to '.zip', opening up the archive, removing the DATA file, putting that DATA file into a similarly renamed and unzipped Excel document in the 'model' folder, and then rezipping and renaming the Excel file back to '.xlsx'. I've never had luck with this.

MultiCore processing with VBA

I know there is no way to thread or do anything fancy like that with VBA, but when I recently ran a find-duplicates macro in VBA on about 60,000 rows of Excel data, I opened the Task Manager performance view and found that all 4 cores of my processor were under heavier-than-normal load. One core was at about 80% and the other 3 were at about 60%.
This is rather high, given that right now none of them is going over 15%.
So my question is: how does my sequential VBA macro get divided into 4 parallel parts that tax each processor core?

How to handle a very big array in vb.net

I have a program producing a lot of data, which it writes to a CSV file line by line (as the data is created). If I were able to open the CSV file in Excel it would be about 1 billion cells (75,000 * 14,600). I get a System.OutOfMemoryException every time I try to access it (or even create an array of this size). If anyone has any idea how I can get the data into vb.net so I can do some simple operations (all data needs to be available at once), I'll try every idea you have.
I've looked at increasing the amount of RAM used, but other articles/posts say this will run out well before the 1 billion mark. There are no issues with time here; assuming it's no more than a few days/weeks I can deal with it (I'll only be running it once or twice a year). If you don't know any way to do it, the only other solutions I can think of would be increasing the number of columns in Excel to ~75,000 (if that's possible - I can't write the data the other way around), or I suppose there might be another language that could handle this?
At present it fails right at the start:
Dim bigmatrix(75000, 14600) As Double
Many thanks,
Fraser :)
First, this will always require a 64-bit operating system and a fairly large amount of RAM, as you're trying to allocate about 8 GB.
This is theoretically possible in Visual Basic targeting .NET 4.5 if you turn on gcAllowVeryLargeObjects. That said, I would recommend using a jagged array instead of a multidimensional array if possible, as this removes the need for a single 8 GB allocation. (This will also potentially allow it to work in .NET 4 or earlier.)
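As a rough illustration of the jagged-array suggestion, a sketch like the following avoids any single multi-gigabyte object; the dimensions simply mirror the declaration in the question. (The gcAllowVeryLargeObjects switch is the <gcAllowVeryLargeObjects enabled="true" /> element under <runtime> in app.config, and is only needed if you keep the single multidimensional array.)

' Sketch: a jagged array holds 75,001 separate row arrays of 14,601 Doubles each,
' so no single allocation has to approach the 2 GB per-object limit,
' even though the total is still roughly 8 GB.
Dim bigmatrix(75000)() As Double
For i As Integer = 0 To 75000
    bigmatrix(i) = New Double(14600) {}
Next

' Elements are addressed with two index lists instead of a comma, e.g.:
bigmatrix(0)(0) = 1.23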

Problems with huge CPU overhead

I'm writing an application in vb.net 2005. The app reads a spreadsheet into a DataSet with ADO.NET and uses a column of that table to populate a ListBox. When a ListBox Item is selected, the user will be presented with detailed information on the selected record.
One part of this information isn't in the DataSet. I have to compare a column from the spreadsheet with several external data sources to determine the nature of the record in question. Here's where I have my problem.
This comparison has to search through 9.5m rows in a SQL table at one stage. I've checked and there's no way to "shrink" the query down as I'm already only searching absolutely essential data.
What happens is that the application never visibly does anything. The CPU usage shoots up to 100% regardless of what it was at beforehand and the system's performance becomes almost unbearably slow.
Can anyone suggest a way I can improve this situation while this massive query is running?
EDIT: I was originally going to write the contents of the 9.5m rows in the database table to a text file which I'd then read from, but after 6.5m rows, I got an OutOfMemoryException.
I suspect your CPU might be used in populating the DataSet, though you would have to profile your application to confirm that. Try using a DataReader instead and either storing the results in some more compact format in memory or, if you're running out of memory, then writing them to a file as you go. With the DataReader approach you never need to store the entire result set in memory at the same time.
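A minimal sketch of that DataReader approach, streaming one column to a text file as it is read, might look like this; the connection string, table and column names are placeholders, not the asker's actual schema.

Imports System.Data.SqlClient
Imports System.IO

Module StreamRows
    Sub Main()
        ' Hypothetical connection string; substitute your own server and database.
        Dim connStr As String = "Server=.;Database=MyDb;Integrated Security=True"

        Using conn As New SqlConnection(connStr)
            conn.Open()
            Using cmd As New SqlCommand("SELECT KeyColumn FROM BigTable", conn)
                Using reader As SqlDataReader = cmd.ExecuteReader()
                    Using writer As New StreamWriter("keys.txt")
                        While reader.Read()
                            ' Only one row is held in memory at a time,
                            ' unlike filling a DataSet with all 9.5m rows.
                            writer.WriteLine(reader.GetString(0))
                        End While
                    End Using
                End Using
            End Using
        End Using
    End Sub
End Module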
An index on the column you search?
A new field in the table to help the search run faster?

Excel 2007 cell auto-complete too slow for worksheets with 10,000 to 15,000 rows

I'm building an Excel 2007 VSTO template where I load a worksheet with between 10,000 and 15,000 rows of data, and I was counting on the built-in Excel cell auto-complete to speed up future data entry. However, I noticed the auto-complete is very slow: after starting to edit a cell, it takes Excel a couple of minutes to finish the "internal scan" of all the cells above in order to find a match. Is there anything I can do to speed it up?
I contacted MS support and they said it was by design and could only be solved by a change request to the product team, which would take a long time.