I need to read data inside a PDF file in an ASP page.
Will I need to convert the file to another format, or is there another way to do this?
Thanks in advance.
You will need either a text version of the file or a COM library that can read PDF files, such as http://www.asppdf.com/ (a result from a Google search; this is not a product recommendation, only an example).
Amyuni PDF Creator ActiveX fits this scenario if a commercial library is an option for you. You can find examples for Visual Basic here.
Usual disclaimer applies
Related
I have a project from my lecturer to create a web app that reads and analyzes a PDF file based on keywords. What programming language can I use?
Example: I need to find or check some keywords or data in the PDF file. If the keyword or data exists, the result is true.
I usually work in JavaScript, so I could answer you in that. The conversation below helped me a great deal, and it might help you too.
extract text from pdf in Javascript
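Once the text has been extracted from the PDF (by whatever library you end up using), the "keyword exists, so the result is true" check from the question is small. Here is a minimal sketch in Python, assuming the extracted text is already available as a string; the function name `find_keywords` and the sample text are made up for illustration, and the same logic carries over directly to JavaScript.

```python
import re


def find_keywords(text, keywords):
    """Return a dict mapping each keyword to True if it occurs in the text."""
    # Extracted PDF text often breaks lines in the middle of a phrase,
    # so collapse all whitespace runs to single spaces before matching.
    normalized = re.sub(r"\s+", " ", text).casefold()
    return {
        kw: re.sub(r"\s+", " ", kw).casefold() in normalized
        for kw in keywords
    }


if __name__ == "__main__":
    # Hypothetical text, standing in for the output of a PDF text extractor.
    sample = "Invoice No. 1234\nTotal amount\ndue: 99.50 EUR"
    print(find_keywords(sample, ["invoice", "amount due", "refund"]))
```

This is a plain case-insensitive substring match, so "art" would also match inside "start"; if whole-word matching matters, a word-boundary regular expression would be the next step.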
I am brand new to PDF generation and rendering, but I have a project to create a PDF template system that allows users to save a template to a database
and later generate a PDF document using the template and values from my database.
Language to use: C#
Questions
a) Is there a PDF tool out there that can help me with this, along with documentation I can study to learn about it?
b) Are there free tools out there for this?
c) How do I create a PDF Template? XML?
Thanks in Advance!
You should have a look at XSL-FO.
Apache has a tool, FOP, which might be helpful.
You can use PHP to create and modify PDFs. (Everything below is completely free.)
Here are two extensive tutorials on generating PDFs in PHP:
http://blog.eirikhoem.net/index.php/2008/04/28/populate-pdf-templates-with-php-fpdf-fpdi/
http://www.astahost.com/info.php/create-pdf-php_t4972.html
You can use the FPDF library located here to handle generating PDFs based off of templates.
If you are using Java, you could try Docmosis or JODReports. They work from templates and can produce PDF output dynamically based on data and those templates. Depending on your template requirements, you might also be able to use Jasper Reports or Apache POI. All have free versions.
If you are looking for an instant solution, take a look at http://pdfnow.com. You can upload your XSL-FO templates and generate PDFs from them with a simple web service call.
I would give jsreport a shot. You can install it on-premises for free or use it online. It supports HTML to PDF transformation using PhantomJS, or XML to PDF transformation using Apache FOP.
The idea is that you first create a report template in jsreport studio using a JavaScript templating engine such as Handlebars, and then you get the PDF back by calling the jsreport API.
If you are working in C#, there is a jsreport C# SDK for it.
Note: I am the author of jsreport
The above question says it all. I know you can create a PDF from an image file or HTML in ColdFusion 8 using CFPDF, but I'm wondering if it's possible to create a PDF from an MS Word document directly, in CF8 or CF9.
Could you import the Word document and convert it to HTML or an image file, and then do the conversion? Or is there a shortcut?
See the docs: Office file interoperability - Using cfdocument
ColdFusion 9 integrates with OpenOffice, so the cfdocument tag can convert a Word document (.doc format) to PDF:
<cfdocument
format="pdf"
srcfile="C:\documents\MyDocument.doc"
filename="C:\documents\MyDocument.pdf">
</cfdocument>
In CF8, you could probably do something with COM object integration or POI integration, but it would not be simple/straightforward.
Converting it to HTML using Word's save as feature is probably the simplest route using CF8. I'll suggest that Henry has the right idea, though, upgrading to CF9 to take advantage of OO.O integration.
Edit: Thanks to @jarofclay, I now know that the POI CFC wrapper has been updated to include Word docs. I remembered it only supporting Excel, but that's clearly changed. Um, is it too late for me to change my vote for how to do this in CF8?
I am not at all familiar with CF, but if you can make web service calls from it, then try this product. It relies on MS Office rather than OpenOffice, so it provides much better conversion fidelity. It also supports additional formats, including InfoPath, Excel, and PowerPoint, as well as watermarking.
Please note that I have worked on this product so the usual disclaimers apply.
I am looking for some good tools (free or paid, though free tools are always preferred)
for doing the following operations on Word doc files:
Manipulating doc/docx/text files (such as replacing placeholders with DB values)
Converting doc files to .pdf
Because I will be using this tool in my WCF service library,
I am looking for a code library and not a GUI-based product.
Please share your experience with any such tools.
Thank you!
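For the plain-text case only, the placeholder replacement described above can be sketched in a few lines. This is an illustration in Python using the standard library's `string.Template`, not a solution for .doc/.docx files (those need one of the libraries discussed in the answers). The helper name `fill_template`, the `$name` placeholder syntax, and the `row` dictionary standing in for a database record are all assumptions made for the example.

```python
from string import Template


def fill_template(template_text, values):
    """Replace $name placeholders in template_text with entries from values."""
    # safe_substitute leaves unknown placeholders in place instead of raising,
    # which makes missing database fields easy to spot in the output.
    return Template(template_text).safe_substitute(values)


if __name__ == "__main__":
    # Hypothetical record, standing in for a row fetched from the database.
    row = {"customer": "Ada Lovelace", "total": "99.50"}
    text = "Dear $customer, your total is $total EUR. Ref: $order_id"
    print(fill_template(text, row))
```

The same pattern (a template with named placeholders plus a dictionary of values) is what the template-based libraries do for richer formats; only the file format handling differs.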
Aspose has a decent collection of MS Office and PDF manipulation libraries.
Aspose Homepage
On the off chance that you're only looking for PDFs for viewing or archival purposes, you could also set up a PDF print driver and print your Office files to a given location using Automation. You could also edit Office files through Automation, although this may be tedious.
VSTO would give you access to the Save As PDF feature of the Office applications.
Please see my answer to a related question on SO, where I recommend a number of ways to convert your Word document to a format that is easier to manipulate programmatically (using XSL-FO).
We are developing a small application that, given a directory of PDF files, creates a single PDF file containing all the PDF files in the directory. This is a simple task using iTextSharp. The problem appears when the directory also contains other files, such as Word or Excel documents.
My question is: is there a way to convert Word and Excel documents into PDF programmatically? And even better, is this possible without having the Office suite installed on the computer running the application?
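To make the situation concrete, here is a minimal sketch of the first step: splitting the directory contents into PDFs that can be merged as-is and Office documents that need converting first. It is written in Python for brevity rather than C#, it does not perform any conversion or merging itself, and the function name `partition_files` and the list of Office extensions are assumptions for illustration.

```python
from pathlib import Path

# Assumed set of extensions to treat as convertible Office documents.
OFFICE_SUFFIXES = {".doc", ".docx", ".xls", ".xlsx"}


def partition_files(directory):
    """Split the files in a directory into (pdfs, to_convert, skipped)."""
    pdfs, to_convert, skipped = [], [], []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue  # ignore subdirectories
        suffix = path.suffix.lower()  # treat .PDF and .pdf the same
        if suffix == ".pdf":
            pdfs.append(path)
        elif suffix in OFFICE_SUFFIXES:
            to_convert.append(path)
        else:
            skipped.append(path)
    return pdfs, to_convert, skipped
```

Everything in `to_convert` would then go through whichever converter is chosen, and the resulting PDFs would be appended to `pdfs` before the merge.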
Office 2007 allows for this. I have found PDFCreator to be good, the VBA is included in sample files, and have heard that CutePDF is also good. PDFCreator and CutePDF are free.
To work without Office, you would need viewers, as far as I know:
http://www.microsoft.com/downloads/details.aspx?FamilyID=c8378bf4-996c-4569-b547-75edbd03aaf0&displaylang=EN
http://www.microsoft.com/downloads/details.aspx?familyid=95E24C87-8732-48D5-8689-AB826E7B8FDF&displaylang=en
I needed to do this myself, but managed to get it done with .Net and without 3rd party tools:
MSDN: Saving Word 2007 Documents to PDF and XPS Formats
Pretty simple, about 50 lines of code. However, I think you will need Word 2007 installed on the machine, as well as the ability to Save As PDF.
To convert Word documents to PDF, take a look at jWordConvert, a Java library that can do exactly that. This will not work with the Excel files though, only with the Word files. It is a Java library, not C#, but you could switch to iText (which is Java) instead of iTextSharp.
You can also use a component like activePDF's DocConverter to convert many formats to PDF.
Use the PDFMaker that comes with Adobe Acrobat 7 to 9.
I just used this code: Convert Doc to PDF
I'm surprised Aspose wasn't mentioned here; it's easy, simple, and reliable. The downside is that it is not free.
I've used iTextSharp in the past. It's really good and easy to install (one DLL, I believe). The merge takes a bit of tinkering, so it's not as easy to use as Aspose, but hey, it's free, so that is the best part.
TallPDF.NET (comes with a hefty price tag) allows you to serve dynamic PDF from any .NET application including ASP.NET pages and web services.
PDFEdit (free and open source) is an editor for manipulating PDF documents. It has a GUI version and a command-line interface. Scripting is used to a great extent in the editor and almost anything can be scripted. It is possible to create your own scripts or plugins.
The most common way to convert files to PDF is to print them to a PDF printer driver. There are a number of such drivers; one that I know of that will do the job is Black Ice.
Another is to use Adobe Acrobat's SDK. From memory, it's very expensive.
It's been a while since I have actually done any work with converting PDFs, and the landscape may have changed.