Can EPPLUS read sections of Excel worksheets? - vb.net

I use the following code to read in an excel file creating a package:
Public _package As ExcelPackage = Nothing
Dim flInfo As New FileInfo(FILENAME)
_package = New ExcelPackage(flInfo) 'load the excel file
This code works well when I want to load an entire file, but if the file size exceeds 100MB the import process sometimes crashes, whether due to my local hardware or to limitations in EPPlus itself.
Is there a way to load only part of an Excel file into an EPPlus package? In other words, is it possible to select certain portions of the Excel file?
Many thanks

How many rows and columns of data do you have? EPPlus has trouble when the file gets too big. Version 4 improved this considerably but did not solve it. Unfortunately, there is no great workaround:
EPPlus, handling big ExcelWorksheet

Related

exporting a word Document to PDF and saving it on global drive

I am facing a small issue. I wrote a small program in VBA that writes some data into a Word document and then exports this document to PDF. Since this is part of my job, I have to save it on a global drive at my company.
The problem is that the export takes a very long time. In my VBA code I use the following statement to save it:
WordDoc.ExportAsFixedFormat path, ExportFormat:=wdExportFormatPDF
Generally, if I export a Word document manually and save it to a certain folder for the first time, the export also takes very long. But if I then export another document to the same folder, it becomes very fast. So how does exporting to a global drive actually work internally?
My program, however, is always very slow, and I do not understand why. I am 100 percent sure that it is the export that takes so long, specifically the statement mentioned above (WordDoc.ExportAsFixedFormat path, ExportFormat:=wdExportFormatPDF).
Do you have any idea how to make exporting and saving a file quicker?
I would really need help :)
Thank you so much in advance
jonas

EPPlus - Opening Just The First 1000 Rows Of An Excel File From C#

Some of the Excel files I'm working with are big, and I only need to see a sampling of the data in them. The first 1000 lines is plenty.
Does anyone know of a way to open just the top of a file? Currently we're doing the following:
_package = new ExcelPackage(file);
This reads the whole file into memory. I've played with OpenXML a little bit and it looks like it's possible to read just part of a file with that, but we have a lot of code already built around EPPlus, so I'm trying to do it with EPPlus.
I believe EPPlus uses OpenXML. If anyone knows how to use the two together to read partial files and could give me some guidance, it would be greatly appreciated.
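To illustrate the "read just part of the file" idea outside EPPlus, here is a minimal sketch in Python using only the standard library. It relies on the fact that an .xlsx file is a ZIP archive whose worksheets are XML parts, and it streams one sheet and stops after a row limit instead of loading everything. The function name read_first_rows and the default part name xl/worksheets/sheet1.xml are assumptions for illustration; in a real workbook the part name should be looked up in the workbook relationships. It returns the raw text of each cell's value element, so shared strings come back as index numbers rather than text.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by SpreadsheetML worksheet parts.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def read_first_rows(xlsx, sheet_part="xl/worksheets/sheet1.xml", limit=1000):
    """Return up to `limit` rows, each a list of raw cell value strings.

    `xlsx` is a path or a binary file-like object. `sheet_part` is an
    assumed default; the real part name may differ per workbook.
    """
    rows = []
    with zipfile.ZipFile(xlsx) as archive:
        with archive.open(sheet_part) as stream:
            # iterparse yields each element as its closing tag is read,
            # so the sheet is never held in memory as a whole.
            for _event, elem in ET.iterparse(stream, events=("end",)):
                if elem.tag != NS + "row":
                    continue
                cells = []
                for cell in elem.iter(NS + "c"):
                    value = cell.find(NS + "v")
                    cells.append(value.text if value is not None else None)
                rows.append(cells)
                elem.clear()  # release the cells of the row just processed
                if len(rows) >= limit:
                    break  # stop reading; the rest of the sheet is skipped
    return rows
```

The sampled rows could then be written to a small temporary workbook and opened with the existing EPPlus-based code, though that hand-off is not shown here.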

VBS or VBA for Multiple Excel Files?

Here is what I need to do:
I have around 1000 folders, each containing an Excel file that I need to reformat (all the Excel files are similarly formatted). I imagine the actual programming part is quite easy, but that is beside the point. If I need to run a script on each of these Excel files, should I use VBA or VBS?
As far as I understand, VBS can be run from the command line and doesn't require any Excel files to be open, whereas VBA requires Excel to be open in order to run the script. That seems counterintuitive, since I need to run the script on not just one workbook but a thousand.
Any help will be appreciated.
Many thanks.
Edit: I should say explicitly that I am trying to convert around 1000 CSV files to XLSX. I have tried Python packages such as openpyxl and XlsxWriter, but they are terribly slow: it takes around 5 minutes per file.
I suggest using Python with the openpyxl package to implement your idea. The openpyxl package is intuitive and easy to use, and Python's flexibility will let you get the job done quickly.

Dataset to Excel - VB.NET

I guess this question has already been asked many times on the internet. I have gone through many of the answers but am still stuck with this problem.
My requirement is to write one of the Dataset tables to an Excel file. I have all the data I need in a Dataset.Table object. A lot of the code on the internet loops through the columns and rows and assigns each value to a cell in the Excel file. I am able to do that, but it doesn't really serve the purpose, as large datasets with a few thousand rows take more than 5 minutes to produce an output.
Is there any other efficient way to do it? Any input is appreciated as every bit of information is useful to me.
Thank you
EPPlus is a free, very fast and very powerful Excel tool for Visual Studio and can do everything you want. It can output a DataTable directly to Excel using the LoadFromDataTable() function.
You could create a CSV file, and then open it with excel and convert within excel.
If I understand correctly, you want to create an Excel file from a "dataset". You can try using CSV (http://en.wikipedia.org/wiki/Comma-separated_values); the CSV format is really simple. For performance, keep as much data as possible in memory and write it to the file at the end; if you write to the file every time you read a row from the dataset, it will take much longer. Make sure your file name ends with the .csv extension, otherwise MS Excel may not open it correctly. Hopefully this helps a bit.
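The "build in memory, write once" advice can be sketched as follows. This is only an illustration in Python with the standard csv module, not the VB.NET code the question asks for; the function names rows_to_csv_text and save_csv are hypothetical, and rows stands in for whatever the dataset yields.

```python
import csv
import io


def rows_to_csv_text(header, rows):
    """Build the whole CSV document in memory and return it as one string."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)  # quotes fields containing commas or quotes
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def save_csv(path, header, rows):
    """Write the CSV to disk with a single write call."""
    # newline="" keeps the csv module's own line endings unchanged.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        handle.write(rows_to_csv_text(header, rows))
```

For example, save_csv("report.csv", ["id", "name"], [[1, "Ann"], [2, "Bob, Jr."]]) writes three lines, with the second name quoted because it contains a comma.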
Use GemBox Spreadsheet library (http://gemboxsoftware.com/). It does what you need.
They also have a free version.

Is there a tactical (read 'hack') solution to avoid having the VBA code and Excel sheet as one binary file?

As I understand it, when I create an Excel sheet with VBA code, the VBA code is saved as binary with the sheet. I therefore can't put the code into source control in a useful way, have multiple devs working on the problems, diffing is difficult, etc.
Is there a way round this without switching to VSTO, COM addins etc? E.g. for the sheet to load all its VBA in at runtime from a web service, shared drive, etc? Any ideas appreciated.
Thanks.
I wrote a kind of build system for Excel which imports the VBA code from source files (which can then be imported into source control, diffed, etc.). It works by creating a new Excel file which contains the imported code, so it might not work in your case.
The build macro looks like this, I save it in a file called Build.xls:
Sub Build()
    ' Folder with the source files, relative to the directory set by ChDir below
    Dim path As String
    path = "excelfiles"
    Dim vbaProject As VBIDE.VBProject
    Set vbaProject = ThisWorkbook.VBProject
    ChDir "C:\Excel"
    ' Below are the files that are imported
    vbaProject.VBComponents.Import (path & "\something.frm")
    vbaProject.VBComponents.Import (path & "\somethingelse.frm")
    ' Suppress confirmation prompts while saving the output workbook
    Application.DisplayAlerts = False
    ActiveWorkbook.SaveAs "Output.xls"
    Application.DisplayAlerts = True
    Application.Quit
End Sub
Now, the VBIDE stuff means you have to add a reference called "Microsoft Visual Basic for Applications Extensibility 5.3", I think.
Of course you still have the problem with having to start Excel to build. This can be fixed with a small VB script:
currentPath = CreateObject("Scripting.FileSystemObject") _
    .GetAbsolutePathName(".")
filePath = currentPath & "\" & "Build.xls"
Dim objXL
Set objXL = CreateObject("Excel.Application")
With objXL
    .Workbooks.Open(filePath)
    .Application.Run "Build.Build"
End With
Set objXL = Nothing
Running the above script should start the build Excel file, which outputs the resulting sheet. You probably have to change some things to make it movable in the file system. Hope this helps!
I would strongly recommend against trying to load the VBA at runtime. The VBIDE automation is flaky at best and I think you'll run into quite a lot of maintenance and support headaches.
My solution to this was to export the code periodically for source control as text files. I would also remove all code from the xls (complete removal of Forms, Modules and Class Modules and deletion of code in Worksheet and Workbook modules) and put only this 'stub' xls into source control.
A rebuild from source control would therefore consist of taking the 'stub' xls, importing all the Forms, Class Modules and Modules, and copying and pasting all the code into the Worksheet and Workbook modules.
Whilst this was a painful process, it did allow for proper source control of code with all the usual capabilities of diffing, branching etc. Unfortunately most of it was manual. I did write an add-in which automated some of this, but it was a quickly hacked-together solution, i.e. fairly buggy and requiring manual intervention. If I ever get round to revisiting it and getting it up to scratch then I'll let you know ;-)
Interesting question ... we also have a problem with VBA source control but have never actually got around to addressing it.
Does this Microsoft KB article match your criteria? The code is put in a .BAS file which can be in an arbitrary location (and separate from the .xls).
As I said I've never actually tried to do this but it looks as if this might be one approach.
The shortest path forward is using SourceTools.xla addin from http://www.codeproject.com/KB/office/SourceTools.aspx
Another approach is to write a similar tool using COM bindings to your favorite programming language. An able colleague of mine created one in Python that could export everything including VBA code, worksheet contents and formulas and even VBA references into text files and could assemble a working workbook out of those. That was prior to the XLSX format too.
I'd also advise against loading at runtime and look for a make-style utility. After all, if your code is under source control then you should only need to update your workbooks after a commit.
Unfortunately, there isn't such a utility, or at least not one that I could find.
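For what it's worth, the core of a make-style check is small: rebuild the workbook only when it is missing or older than one of its source files. A minimal sketch in Python, where the function name needs_rebuild and the file names in the usage note are hypothetical:

```python
import os


def needs_rebuild(workbook_path, source_paths):
    """Return True when the workbook is missing or older than any source file."""
    if not os.path.exists(workbook_path):
        return True
    built_at = os.path.getmtime(workbook_path)
    # Any source modified after the last build makes the workbook stale.
    return any(os.path.getmtime(src) > built_at for src in source_paths)
```

A wrapper script could call needs_rebuild("Output.xls", ["Module1.bas", "Form1.frm"]) after each commit or checkout and start the actual build step only when it returns True. The build step itself still has to automate Excel in some way.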
I think that, compared with the alternatives (i.e. constantly rebuilding Excel workbook files), it'd be much less hassle to move over to VSTO or COM Addins, using VBA for light prototyping work. You also get the added benefit of not having easily "hackable" source code (i.e. you need more than to guess the VBA project password).