EPPlus - Opening Just The First 1000 Rows Of An Excel File From C#

Some of the Excel files I'm working with are big, and I only need to see a sampling of the data in them. The first 1000 lines is plenty.
Does anyone know of a way to open just the top of a file? Currently we're doing the following:
_package = new ExcelPackage(file);
This reads the whole file into memory. I've played with OpenXML a little bit and it looks like it's possible to read just part of a file with that, but we have a lot of code already built around EPPlus, so I'm trying to do it with EPPlus.
I believe EPPlus uses OpenXML. If anyone knows how to use the two together to read partial files and could give me some guidance, it would be greatly appreciated.
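The partial read asked about here can be sketched without any Excel library, because an .xlsx file is a ZIP archive of XML parts. The sketch below is not EPPlus code and is in Python rather than C#. It only illustrates the forward-only idea: stream the worksheet XML and stop after the first N rows instead of loading the whole file. The function name `read_first_rows` and the default part name `xl/worksheets/sheet1.xml` are assumptions for illustration (the real part name can differ per workbook). In C#, the Open XML SDK's `OpenXmlReader` follows the same streaming pattern.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by SpreadsheetML worksheet parts.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def read_first_rows(xlsx, limit=1000, sheet_part="xl/worksheets/sheet1.xml"):
    """Return up to `limit` rows as lists of (cell_ref, raw_value) tuples.

    `xlsx` is a path or a binary file-like object. Raw values are the text of
    each cell's <v> element. For shared-string cells (t="s") that text is an
    index into xl/sharedStrings.xml, which this sketch does not resolve.
    """
    rows = []
    with zipfile.ZipFile(xlsx) as archive, archive.open(sheet_part) as part:
        # iterparse yields each element when its end tag is read, so parsing
        # can stop as soon as enough rows have been collected.
        for _event, elem in ET.iterparse(part, events=("end",)):
            if elem.tag != NS + "row":
                continue
            rows.append(
                [(c.get("r"), c.findtext(NS + "v")) for c in elem.iter(NS + "c")]
            )
            elem.clear()  # drop the row's cells once they have been copied
            if len(rows) >= limit:
                break
    return rows
```

Calling `read_first_rows("big.xlsx", limit=1000)` (with a hypothetical file name) would decompress and parse only as much of the sheet as is needed to reach row 1000.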

Related

ExportAsFixedFormat for VBA saving Files

I have a problem with saving a bunch of PDFs (exported from Word documents).
The runtime of my program behaves a bit strangely, which is why I am asking.
I want to save the files on a global drive.
In my program I create a folder on that drive, where I put all the PDFs.
The first time I run this operation, it is really slow.
But if I run it a second time for the same folder, it is really fast (after I deleted the "old" PDFs, or the old PDFs were overwritten).
I am a bit frustrated and cannot explain why that is.
Could somebody help, please?
I would be very happy for an answer.
Greetings
Jonas
Using this simple code
doc.ExportAsFixedFormat wholefile, ExportFormat:=wdExportFormatPDF
There are multiple reasons why the ExportAsFixedFormat method can be slow. I would start by working with local files only. Then I'd experiment with the arguments that specify the document quality and the exported format. It also makes sense to try the other parameters that your code sample leaves at their default values.

VBS or VBA for Multiple Excel Files?

Here is what I need to do:
I have around 1000 folders, and within each of these folders is an Excel file that I need to reformat (all the Excel files are similarly formatted). I imagine the actual programming part is quite easy, but that is beside the point. If I need to run a script on each of these Excel files, should I use VBA or VBS?
As far as I understand, VBS can be run from the command line and doesn't require any Excel files to be open, whereas VBA requires Excel to be open in order to run the script. That seems counterintuitive, since I need to run the script on not just one workbook but a thousand.
Any help will be appreciated.
Many thanks.
Edit: I should say explicitly that I am trying to convert around 1000 CSV files to XLSX. I have tried Python packages such as openpyxl and XlsxWriter, but they are just terribly slow: it takes around 5 minutes per file.
I suggest using Python with the openpyxl package to implement your idea. The openpyxl package is intuitive and easy to use, and Python's flexibility will let you get the job done quickly.
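Whichever converter is chosen, the "one script over a thousand folders" part can be a small driver. This is a minimal sketch, assuming the CSV files sit somewhere under a single root folder. `convert_tree` and its `convert` argument are hypothetical names. `convert` stands for whatever per-file CSV-to-XLSX routine is used (for example one built on openpyxl), and it is not implemented here.

```python
from pathlib import Path


def convert_tree(root, convert):
    """Call convert(src, dst) for every .csv file found under `root`.

    `convert` is a caller-supplied (hypothetical) function that writes the
    XLSX file at `dst` from the CSV file at `src`. Each destination sits next
    to its source, with the extension changed to .xlsx.
    Returns the list of destination paths, in sorted source order.
    """
    written = []
    for src in sorted(Path(root).rglob("*.csv")):
        dst = src.with_suffix(".xlsx")
        convert(src, dst)
        written.append(dst)
    return written
```

Keeping the folder walk separate from the conversion means the slow step can be swapped or timed on its own without touching the loop.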

Select three rows of text in PDF using VBA

I am trying to use VBA to select three rows of data in a PDF file and copy them to the clipboard. I have tried third party libraries but I still can't seem to find a simple solution. I can use the cursor to select the data and copy it, so I just want to automate this step with VBA.
I have looked high and low for an answer to this, and I feel like it might be really simple and I'm just missing it. I assume I could use the "highliteList" method in the Acrobat library to select the rows, but I don't know how to specify where to begin the selection. There is a header on each page, so I just want to say something like:
For Each header In pdf.pages
NextLine.SelectRow
NextLine.SelectRow
Next header
Selection.CopyToClipboard
Is this possible? I know those methods probably don't exist; I was using them as an example. Does anyone have experience with doing this? Thanks in advance for any help.
I found a solution for all those interested. I used Bytescout PDF extractor library to convert the file to .xls format. Then I just parsed out what I needed in Excel since Excel is easy to work with via vb.net.

Create documents without dependency on Office programs?

Is there any way for my app to create a document that needs to be saved and printed without utilizing some external software? Currently I have a spreadsheet that does a bunch of calculations, then a button that runs some VBA to export 2 sheets to a PDF, which it then saves and prints. I want to port this spreadsheet into a .NET app.
I have experience with everything EXCEPT this: how do I re-create these 2 documents that are currently in Excel, without having to utilize Excel? My whole goal here is to get out of Excel...because I hate it, and I'd like to send this app to clients across the country. I really don't want to have to tell clients that they need Excel to use this. I might as well just leave it in the spreadsheet if that's the case.
I'm sorry if this is sort of "nooby", but I'm not sure what the best course of action here is. Should I try to mimic my Excel sheets on 2 hidden forms and save/print those? Should I write some HTML to produce these forms in a browser and save/print from there? Are there any other options here? I'll probably end up saving as XPS if I can find a way to get out of Excel. Would love some pointers if you have any ideas. Thanks everyone!
Edit: a little more info... I don't need help with the calculations or with exporting PDFs. I don't need any help regarding Excel or VBA. The workbook has 1 input sheet where users enter data. The results of the calculations appear on 2 other sheets in this workbook. These 2 sheets are currently exported to 1 PDF using VBA, which is then saved and printed using VBA. These spreadsheets are not "spreadsheets" like you may be thinking. They are actually "forms", for lack of a better term, that the user will never edit after running the macro to export them as PDFs. They contain text, pictures, shapes, etc. Excel is merely the medium currently being used to create these documents. My goal is to build this project in .NET and get us out of Excel, but I'm not quite sure how to reproduce these 2 forms within my app without utilizing Office. Think of them as templates. After the user enters data in my app, it will do some calculations, and the results need to appear on 2 forms that need to be printed and saved on the user's machine. How do I recreate these 2 forms in .NET?
Oh yeah: VB.NET, WinForms (although I could use WPF, as I haven't started yet), .NET Framework 4.0 :)
Here is what you can do:
Add an empty Excel file to your resources and use it as a template.
When your program needs to save data, take that file from the resources and save it to the hard disk.
Connect to the Excel file using "Microsoft.Ace.oleDb."
Save and read data in Excel just as if it were a database table. There are plenty of examples on the net; search for them.
For this project you don't need the Excel application on the machine.
Now, if your concern is Excel, you don't have to worry. Your clients can use OpenOffice, for example. Or you can save the data in CSV format. CSV is not Excel. You can create your own text format, and your clients will be able to read it with Notepad. You can use an HTML/XML combination and have your HTML page load whatever XML you supply.
Seriously, create a spreadsheet and tell your clients that they can open it with their favorite spreadsheet editor.
So instead of using a third-party program or anything fancy, you could just read in the Excel or .xls file and spit it out? Just write some code to format the data properly for users. There is a similar question with a tutorial that may help you, but it is for Java. Are you using C# or VB? .NET 4.5? Razor?
You can create the PDFs from code using tools like iTextSharp. Or use fillable PDF templates and use iTextSharp to fill in the form. You will need a program to print the PDF; I use Foxit Reader. I'm sure there are some full-featured PDF tools out there that include printing.
You can also print from VB.NET, but I would guess mixed-media documents would be difficult.

Dataset to Excel - VB.NET

I guess this question has already been asked a lot on the internet. I have gone through many of the answers but am still stuck with this problem.
My requirement is to export one of the Dataset tables to an Excel file. I have all the data I need in a Dataset.Table object. A lot of the code on the internet loops through the columns and rows and assigns each value to a cell in the Excel file. I am able to do that, but it doesn't really serve the purpose, as large datasets with a few thousand rows take more than 5 minutes to execute and produce an output.
Is there any other efficient way to do it? Any input is appreciated as every bit of information is useful to me.
Thank you
EPPlus is a free, very fast and very powerful Excel tool for Visual Studio, and it can do everything you want. It can output a DataTable directly to Excel using the LoadFromDataTable() function.
You could create a CSV file, and then open it with excel and convert within excel.
If I am understanding correctly, you want to create an Excel file from a "dataset". You can try using CSV (http://en.wikipedia.org/wiki/Comma-separated_values); the CSV format is really simple. For performance, keep as much of the data as possible in memory and write it to the file at the end. If you write to the file every time you read a row from the dataset, it will take much longer. Make sure your file name ends with the .csv extension so that MS Excel recognizes and opens it. Hopefully this helps a bit.
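The "build it in memory, write once" advice above can be sketched as follows. This is an illustration in Python rather than VB.NET. The names `table_to_csv` and `save_csv` are hypothetical, and the table is assumed to be a list of column names plus an iterable of row tuples. The csv module handles quoting of values that contain commas or quotes.

```python
import csv
import io


def table_to_csv(columns, rows):
    """Serialize a header and rows to one CSV string held in memory."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)  # default dialect: comma-separated, \r\n line ends
    writer.writerow(columns)
    writer.writerows(rows)  # all rows go into the buffer, no per-row file I/O
    return buffer.getvalue()


def save_csv(path, columns, rows):
    """Write the whole table to `path` with a single write call."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        handle.write(table_to_csv(columns, rows))
```

For tables too large to hold in memory comfortably, passing the open file straight to `csv.writer` gives similar speed, since the file object buffers its writes.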
Use GemBox Spreadsheet library (http://gemboxsoftware.com/). It does what you need.
They also have a free version.