Select three rows of text in PDF using VBA - vba

I am trying to use VBA to select three rows of data in a PDF file and copy them to the clipboard. I have tried third-party libraries, but I still can't find a simple solution. I can select the data with the cursor and copy it, so I just want to automate that step with VBA.
I have looked high and low for an answer, and I feel it might be really simple and I'm just missing it. I assume I could use the "highliteList" method in the Acrobat library to select the rows, but I don't know how to specify where the selection should begin. There is a header on each page, so I just want to say something like:
For Each header In pdf.pages
NextLine.SelectRow
NextLine.SelectRow
Next header
Selection.CopyToClipboard
Is this possible? I know those methods probably don't exist; I was only using them as an example. Does anyone have experience doing this? Thanks in advance for any help.

I found a solution, for all those interested. I used the Bytescout PDF Extractor library to convert the file to .xls format. Then I just parsed out what I needed in Excel, since Excel is easy to work with via VB.NET.

Related

EPPlus - Opening Just The First 1000 Rows Of An Excel File From C#

Some of the Excel files I'm working with are big, and I only need to see a sampling of the data in them. The first 1000 lines is plenty.
Does anyone know of a way to open just the top of a file? Currently we're doing the following:
_package = new ExcelPackage(file);
This reads the whole file into memory. I've played with OpenXML a little bit and it looks like it's possible to read just part of a file with that, but we have a lot of code already built around EPPlus, so I'm trying to do it with EPPlus.
I believe EPPlus uses OpenXML. If anyone knows how to use the two together to read partial files and could give me some guidance, it would be greatly appreciated.
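To illustrate why a partial read is possible at the file-format level, here is a minimal sketch in Python (standard library only, not EPPlus or the Open XML SDK). An .xlsx file is a zip archive, and each worksheet is an XML part that can be parsed as a stream and abandoned after N rows. The function name `read_first_rows` and the default part name `xl/worksheets/sheet1.xml` are assumptions for illustration; a real workbook maps sheet names to parts through its workbook relationships, and cells that use the shared-string table come back as indexes rather than text.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by SpreadsheetML worksheet parts.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def read_first_rows(xlsx_file, sheet_part="xl/worksheets/sheet1.xml", limit=1000):
    """Return the raw cell values of the first `limit` rows, then stop parsing.

    `sheet_part` is an assumed part name; it is not looked up from the workbook.
    """
    rows = []
    with zipfile.ZipFile(xlsx_file) as archive:
        with archive.open(sheet_part) as sheet_xml:
            # iterparse streams the XML, so rows past `limit` are never built.
            for _event, element in ET.iterparse(sheet_xml, events=("end",)):
                if element.tag != NS + "row":
                    continue
                values = []
                for cell in element.iter(NS + "c"):
                    # <v> holds a number or a shared-string index;
                    # <is><t> holds inline text.
                    node = cell.find(NS + "v")
                    if node is None:
                        node = cell.find(NS + "is/" + NS + "t")
                    values.append(node.text if node is not None else None)
                rows.append(values)
                element.clear()  # release the cells of the row just read
                if len(rows) >= limit:
                    break
    return rows
```

Calling `read_first_rows("big.xlsx", limit=1000)` (the file name is hypothetical) would return at most 1000 lists of strings. Whether EPPlus itself exposes a comparable streaming read is not something this sketch answers.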

How to read a .csv file embedded in a Word file

I am currently working on an MS Word form where I use VBA to import data from a .csv file. The form is a phone request for employees who work outside, and my .csv is a list of phones that is updated from time to time. My boss would like to have this .csv directly inside the Word document, so that there is only one file to handle.
I found how to embed it: I go to the "Insert" tab on the ribbon, click the "Object" button, then open the "Create from File" tab. I check "Display as icon" but not "Link to file".
My .csv is only used to import data through a macro when the Word form is opened. The data is then pasted into a variable that is handled in VBA to fill combo boxes. The .csv is periodically updated and re-embedded, so the form is updated as well. I don't plan to alter the .csv from VBA in any way.
The objective is eventually to give the Word document to someone else, who will embed a new .csv when needed but won't look at the code.
The problem I am working on is to reach the .csv and import data.
Another solution, thanks to @Cindy Meister's and @SlowLearner's comments for pointing out the idea, would be to use Custom XML Parts. I will do some additional research on that.
Moreover, with my currently very limited experience of programming and VBA, I'm not sure how to handle that (I'm not even sure it's possible). The lack of quality documentation on VBA doesn't help either.
After the comments from @Cindy Meister and @SlowLearner, I reformulated part of the question to be more precise, although I feel it's getting very long and messy. I should probably do my own research to narrow down the problem.
ps: that's my first question here, so any advice is welcome, and please be nice with me :)

Create documents without dependency on Office programs?

Is there any way for my app to create a document that needs to be saved and printed without utilizing some external software? Currently I have a spreadsheet that does a bunch of calculations, then a button that runs some VBA to export 2 sheets to a PDF, which it then saves and prints. I want to port this spreadsheet into a .NET app.
I have experience with everything EXCEPT this: how do I re-create these 2 documents that are currently in Excel, without having to utilize Excel? My whole goal here is to get out of Excel...because I hate it, and I'd like to send this app to clients across the country. I really don't want to have to tell clients that they need Excel to use this. I might as well just leave it in the spreadsheet if that's the case.
I'm sorry if this is sort of "nooby", but I'm not sure what the best course of action here is. Should I try to mimic my Excel sheets on 2 hidden forms and save/print those? Should I write some HTML to produce these forms in a browser and save/print from there? Are there any other options here? I'll probably end up saving as XPS if I can find a way to get out of Excel. Would love some pointers if you have any ideas. Thanks everyone!
Edit: a little more info...I don't need help with the calculations, or with exporting PDFs. I don't need any help regarding Excel or VBA. The workbook has 1 input sheet where users enter data. The results of the calculations appear on 2 other sheets in this workbook. These 2 sheets are currently exported to 1 PDF using VBA, which is then saved and printed using VBA. These spreadsheets are not "spreadsheets" like you may be thinking. They are actually "forms", for lack of a better term, that the user will never edit after running the macro to export them as PDFs. They contain text, pictures, shapes, etc. Excel is merely the medium currently being used to create these documents. My goal is to build this project in .NET and get us out of Excel, but I'm not quite sure how to reproduce these 2 forms within my app without utilizing Office. Think of them as templates. After the user enters data in my app, it will do some calculations, and the results need to appear on 2 forms that need to be printed and saved on the user's machine. How do I recreate these 2 forms in .NET?
Oh yeah: VB.NET, WinForms (although I could use WPF as I haven't started yet), 4.0 framework :)
Here is what you can do:
Add an empty Excel file to your resources and use it as a template.
When your program needs to save data, take that file from the resources and save it to the hard disk.
Connect to the Excel file using the "Microsoft.ACE.OLEDB" provider.
Save and read data in Excel just as if it were a database table. There are plenty of examples on the net; Google for it.
For this project you don't need the Excel application on the machine.
Now, if your concern is Excel, you don't have to worry. Your clients can use OpenOffice, for example. Or you can save the data in CSV format. CSV is not Excel. You can create your own text format and your clients will be able to read it with Notepad. You can also do an HTML/XML combination and have your HTML page load whatever XML you supply.
Seriously, create a spreadsheet and tell your clients that they can open it with their favorite spreadsheet editor.
So instead of using a third-party program or anything fancy, you could just read in the Excel or .xls file and spit it out? Just write some code to format the data properly for users... There is a similar question with a tutorial that may help you, but it is for Java. Are you using C# or VB? .NET 4.5? Razor?
You can create the PDFs from code using tools like iTextSharp. Or use fillable PDF templates and use iTextSharp to fill in the form. You will need a program to print the PDF; I use Foxit Reader. I'm sure there are some full-featured PDF tools out there that include printing.
You can also print directly from VB.NET, but I would guess mixed-media documents would be difficult.

Dataset to Excel - VB.NET

I guess this question has already been asked a lot on the internet. I have gone through many of the answers but am still stuck with this problem.
My requirement is to get one of the DataSet tables into an Excel file. I have all the data I need in a Dataset.Table object. A lot of the code on the internet loops through the columns and rows and assigns each value to a cell in the Excel file. I am able to do that, but it doesn't really serve the purpose, as large datasets with a few thousand rows take more than 5 minutes to execute and produce an output.
Is there any other efficient way to do it? Any input is appreciated as every bit of information is useful to me.
Thank you
EPPlus is a free, very fast and very powerful Excel library for Visual Studio that can do everything you want. It can output a DataTable directly to Excel using the LoadFromDataTable() function.
You could create a CSV file, then open it with Excel and convert it within Excel.
If I understand correctly, you want to create an Excel file from a "dataset". You can try CSV (http://en.wikipedia.org/wiki/Comma-separated_values); the format is really simple. For performance, build up as much of the data as you can in memory and write it to the file at the end; if you write to the file every time you read a row from the dataset, it will take much longer. Make sure the file name ends with the .csv extension, otherwise MS Excel will not treat it as a CSV file. Hopefully this helps a bit.
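The "build in memory, write once" advice can be sketched as follows, in Python rather than VB.NET and using only the standard library. The function names `rows_to_csv_text` and `save_csv` are made up for illustration, and the rows are assumed to be already available as plain lists (standing in for the DataTable rows).

```python
import csv
import io


def rows_to_csv_text(columns, rows):
    """Build the whole CSV document in memory and return it as one string."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)  # quotes fields that contain commas or quotes
    writer.writerow(columns)
    writer.writerows(rows)
    return buffer.getvalue()


def save_csv(path, columns, rows):
    """Write the CSV text to disk in a single call."""
    text = rows_to_csv_text(columns, rows)
    # newline="" keeps the writer's own line endings untouched;
    # utf-8-sig adds a byte order mark, which helps Excel detect UTF-8.
    with open(path, "w", newline="", encoding="utf-8-sig") as handle:
        handle.write(text)
```

Using a CSV writer instead of joining strings by hand matters as soon as a value contains a comma, a quote or a line break, because those fields have to be quoted for Excel to split the columns correctly.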
Use GemBox Spreadsheet library (http://gemboxsoftware.com/). It does what you need.
They also have a free version.

Manipulate Excel workbooks programmatically

I have an Excel workbook that I want to use as a template. It has several worksheets set up, including one that produces the pretty graphs and summarizes the numbers. Sheet 1 needs to be populated with data that is generated by another program. The data comes in a tab-delimited file.
Currently the user imports the tab delimited file into a new Workbook, selects all and copies. Then goes to the template and pastes the data into sheet1.
This is a large amount of data, 269 columns and over 135,000 rows. It’s a cumbersome process and the users are not experienced Excel users. All they really want is the pretty graphs.
I would like to add a step after the program that generates the data to programmatically automate the process the user currently must do manually.
Can anyone suggest the best method/programming language that could accomplish this?
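Whatever tool ends up writing into the template, the first automated step is reading the tab-delimited export. A minimal sketch of that step in Python (standard library only) is below; the function name `read_tab_delimited` and the sample data are assumptions for illustration, and writing the rows into the workbook is deliberately left out.

```python
import csv
import io


def read_tab_delimited(stream):
    """Yield each record of a tab-delimited text stream as a list of strings."""
    # The reader is lazy, so a large export is processed one row at a time.
    yield from csv.reader(stream, delimiter="\t")


# Small in-memory sample standing in for the generated file.
sample = io.StringIO("id\tvalue\n1\t3.5\n2\t4.25\n")
rows = list(read_tab_delimited(sample))
```

For a file on disk, the stream would come from something like `open(path, newline="", encoding="utf-8")`, with the path and the encoding depending on the program that generates the data.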
POI is the answer. Look at the Apache website. You can use Java to read the data and place it in cells. The examples are very easy.
You can solve this, for example, with a simple VBA macro. Just use the macro recorder to record the steps the user currently does manually; this will give you something to start with (you will probably have to add a function to let the user choose the import file).
You said you have some data generated by another program. What kind of program? A program that you have developed by yourself and where you can add the excel-import functionality? Or a third party program with a GUI that cannot be automated easily?
And if you really want to create an external program for this task, choose whatever programming language you like, as long as it can use COM objects. In .NET, you have the option of using VSTO, but I would only suggest that for this task if you already have some experience with it (but then you would not ask this kind of question, I think :-))
Look here:
Create Excel (.XLS and .XLSX) file from C#
There's NPOI (the .NET version of POI), so you can code in C# if you want.
If you use two workbooks, one for data and one for graphs, and don't update links automatically, you can use a macro to get the data (maybe through an ODBC connection if the file is in a format it can read, which is a long shot) and then link the charts to the data workbook.
Use a macro to update the links and generate the charts, then send them out and hope no one updates the links.