Not able to load web PDF file which doesn't have .pdf at the end of the URL - pdfbox

I am using the code below with PDFBox to read a PDF file from the web:
PDDocument pddDocument = PDDocument.load(new File("http://<website name>/flow.html?operatorin"));
PDFTextStripper textStripper = new PDFTextStripper();
String doc = textStripper.getText(pddDocument);
pddDocument.close();
System.out.println(doc);
As you can see, the URL in the code above does not end in .pdf the way a normal web PDF URL does.
How can we handle such cases?
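One possible approach, sketched under assumptions: java.io.File refers to a local path, so an http:// address generally has to be opened as a stream, and whether the response really is a PDF can be judged from its first bytes (a well-formed PDF starts with %PDF-) rather than from the URL suffix. The names PdfSniffer, looksLikePdf and fetch are hypothetical, and the hand-off to PDFBox is only indicated in a comment because the right call depends on the PDFBox version.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;

public class PdfSniffer {

    // A well-formed PDF starts with this marker, whatever its URL looks like.
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the data starts with the PDF header marker. */
    public static boolean looksLikePdf(byte[] data) {
        if (data == null || data.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (data[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Reads the whole response body of the given URL into memory. */
    public static byte[] fetch(String address) throws IOException {
        try (InputStream in = URI.create(address).toURL().openStream();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java PdfSniffer <url>");
            return;
        }
        byte[] body = fetch(args[0]);
        if (looksLikePdf(body)) {
            // Assumption: the bytes could now be given to PDFBox, for example
            // PDDocument.load(new ByteArrayInputStream(body)) in PDFBox 2.x.
            System.out.println("PDF content, " + body.length + " bytes");
        } else {
            System.out.println("Not a PDF (possibly an HTML page)");
        }
    }
}
```

If the check fails for a URL like the flow.html one above, the server is probably returning an HTML page that embeds or redirects to the PDF, and the real PDF address would have to be found first.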

Related

When generating a RMarkdown PDF from Shiny, PDF saving and printing is disabled

I am trying to generate a report from a Shiny app where the user can select either HTML or PDF. I am able to generate a PDF from my .rmd file (LaTeX format) and everything looks good with the layout/formatting. However, the PDF is opened (using Foxit PDF Reader) with a filename other than the one I specified in my downloadHandler. I am also unable to save or print the PDF through the Foxit Reader window. The file name that is output is RStudio-randomletters.pdf (e.g. RStudio-FoZvSx.pdf).
Generating an HTML report works fine with no issues: it uses the file name that I specified and opens a window for me to save or rename the file.
It seems that the PDF is the only issue so I'm not sure if it is just related to reading the PDF in the Foxit Reader or if it is something else?
Update
Using Adobe Acrobat instead of Foxit allows me to now save and print, but I am still having issues with the file name for the PDF.
Here is the code for my downloadHandler:
output$downloadReport <- downloadHandler(
  filename = function() {
    paste('Report', sep = '.', switch(
      input$format, PDF = 'pdf', HTML = 'html'))
  },
  content = function(file) {
    out <- if (input$format == 'HTML') {
      rmarkdown::render('report.Rmd',
        params = list(Name = input$Name,
                      Reference = input$Reference),
        switch(input$format,
               PDF = pdf_document(), HTML = html_document()),
        envir = new.env(parent = globalenv()))
    } else if (input$format == 'PDF') {
      rmarkdown::render('pdfreport2.Rmd',
        params = list(Name = input$Name,
                      Reference = input$Reference),
        switch(input$format,
               PDF = pdf_document(), HTML = html_document()),
        envir = new.env(parent = globalenv()))
    }
    file.rename(out, file)
  })

How can I show PDF file content inside a Razor file

I created a document in Word and exported it as a PDF. I want to show the PDF content inside a div element in a Razor page. How can I do that? Please provide example code showing how to do it in Blazor Server.
If your PDF file is stored directly in your project, for example in the folder wwwroot/pdf:
wwwroot/pdf/test.pdf
You can display this PDF with the line of HTML below:
<embed src="pdf/test.pdf" style="width:100%; height:2100px;" />
It gives you a PDF viewer with printing options.
If you want to go further and upload a file before displaying it, I recommend this explanation:
https://www.learmoreseekmore.com/2020/10/blazor-webassembly-fileupload.html
Uploading PDF files works the same way as uploading image files; check the IBrowserFile documentation.
You will see that it has a Size property and an OpenReadStream() method that help you build the display URL for your file (image or PDF).
In case the site above goes offline, this is the upload code shown on it:
@code {
    List<string> imgUrls = new List<string>();

    private async Task OnFileSelection(InputFileChangeEventArgs e)
    {
        // Accept at most 5 files per selection
        foreach (IBrowserFile imgFile in e.GetMultipleFiles(5))
        {
            var buffers = new byte[imgFile.Size];
            await imgFile.OpenReadStream().ReadAsync(buffers);
            string imageType = imgFile.ContentType;
            // Build a data URL that can be used directly as a src attribute
            string imgUrl = $"data:{imageType};base64,{Convert.ToBase64String(buffers)}";
            imgUrls.Add(imgUrl);
        }
    }
}
This code was written by Naveen Bommidi, the author of the blog where I found it.
As I said, you can also upload a PDF and then display it.
Use the same line of HTML:
<embed src="@imgUrl" style="width:100%; height:2100px;" />
Your uploaded files will then be displayed.

Generate PDF from gsp page

I am using Grails 2.5.2.
I have created a table that shows all the data from the database on a GSP page, and now I need to save the displayed data in PDF format with a button click. What is the best way to render it as a PDF and save it to my directory? Please help.
You can use iText to convert HTML into PDF using the code below:
public void createPdf(HttpServletResponse response, String args, String css, String pdfTitle) {
    response.setContentType("application/force-download")
    response.setHeader("Content-Disposition", "attachment;filename=${pdfTitle}.pdf")
    Document document = new Document()
    Rectangle one = new Rectangle(900, 600)
    document.setPageSize(one)
    PdfWriter writer = PdfWriter.getInstance(document, response.getOutputStream())
    document.open()
    ByteArrayInputStream bis = new ByteArrayInputStream(args.toString().getBytes())
    ByteArrayInputStream cis = new ByteArrayInputStream(css.toString().getBytes())
    XMLWorkerHelper.getInstance().parseXHtml(writer, document, bis, cis)
    document.close()
}
Although I am answering this question late, take a look at the Grails export plugin. It is useful if you want to export your data to Excel and PDF (only when there is no predefined template to export into).
I got the idea from iText. I used iText 2.1.7 and wrote all the values to a PDF from a controller method, using images as the background and Paragraph and Phrase objects to show the values from the database.

Print one page of WebBrowser

I am trying to automate printing of intranet websites. Since this is an application that will be put on a specific user's computer, which will be run on an as-needed basis, I'd like it to be as un-disruptive as possible (in other words, not launching IE for each page). The catch is that I need to print the first page of the website and then print the whole website again, which will produce the first page two times. What is the best way to do this?
I have no problem getting it to loop through the pages that it needs to print, nor do I have a problem opening the page with webbrowser. I do, however, have a problem specifying a print range.
I also tried PrintDocument, but couldn't figure out how to get that to open within the form.
Thanks for any help that can be provided.
To create the PDF file, try this solution using iTextSharp:
ITextSharp HTML to PDF?
Except with one substitution, if you want to save directly to a file:
private MemoryStream createPDF(string html)
{
    MemoryStream msOutput = new MemoryStream();
    TextReader reader = new StringReader(html);
    // step 1: create the document object
    Document document = new Document(PageSize.A4, 30, 30, 30, 30);
    // step 2:
    // create a writer that listens to the document
    // and directs the PDF stream to a file
    PdfWriter writer = PdfWriter.GetInstance(document, new FileStream("c:\\my.pdf", FileMode.Create));
    // step 3: create a worker to parse the HTML
    HTMLWorker worker = new HTMLWorker(document);
    // step 4: open the document and start the worker on it
    document.Open();
    worker.StartDocument();
    // step 5: parse the HTML into the document
    worker.Parse(reader);
    // step 6: close the worker and the document
    worker.EndDocument();
    worker.Close();
    document.Close();
    // note: the PDF was written to the file above, so this stream stays empty
    return msOutput;
}
Once the PDF is set up, try Ghostscript to print one page:
Print existing PDF (or other files) in C#
If you start a shell execute of the process, you can use the command line arguments:
gsprint "filename.pdf" -from 1 -to 1
Alternatively, WebBrowser can just print the full page: http://msdn.microsoft.com/en-us/library/b0wes9a3.aspx
I can't find anything referencing that WebBrowser itself can print "From page X to Y" without a print dialog.
Since I'm facing a similar problem, here's an alternate solution:
This open source project converts HTML documents to PDF documents, similar to iTextSharp (http://code.google.com/p/wkhtmltopdf/). We ended up not using iTextSharp because of several formatting issues with the layout of the site we wanted to print. We download the HTML using a WebClient and then pass command line arguments to the tool to turn it into a PDF file.
WebClient wc = new WebClient();
wc.Credentials = CredentialCache.DefaultNetworkCredentials;
string htmlText = wc.DownloadString("http://websitehere.com");
Then, after turning to pdf, you can simply print the file:
Process p = new Process();
p.StartInfo.FileName = string.Format("{0}.pdf", fileLocation);
p.StartInfo.Verb = "Print";
p.Start();
p.WaitForExit();
(Apologies for C#, I'm more familiar with it than VB.NET, though it should be a simple conversion)

Java Image to PDF

How can an image file be converted into a PDF file using Java? I am taking output from a graphics library. The output that I am able to export is in image formats like JPEG and PNG.
I want to convert that image file to PDF file.
You can use iText to add an image to a PDF.
To use the iText PDF API for Java, you must first download the iText JAR file from the iText website.
First, a Document instance is created.
Second, a PdfWriter is created by passing the Document instance and an OutputStream to PdfWriter.getInstance(). The Document instance is the document we are currently adding content to. The OutputStream is where the generated PDF document is written.
OutputStream file = new FileOutputStream(new File("/path/JavaGeneratedPDF.pdf"));
Document document = new Document();
PdfWriter.getInstance(document, file);
Make sure that you handle DocumentException here.
Inserting the image in the PDF:
Image image = Image.getInstance("/Image.jpg");
image.scaleAbsolute(200f, 100f); // image width, height
Make sure that you handle MalformedURLException here.
Now open the PDF document, add the image, and close the document instance:
document.open();
document.add(image);
document.close();
file.close();
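Note that scaleAbsolute(200f, 100f) forces the image into a fixed 200 x 100 box, which distorts any image with different proportions. A minimal sketch of the arithmetic for fitting an image inside a box while keeping its aspect ratio is shown here; ImageFit and fitWithin are hypothetical names that are not part of iText, and the 800 x 600 source size is only an illustrative assumption. The resulting width and height could be passed to scaleAbsolute; iText's Image class also has a scaleToFit method that may cover the same need.

```java
public class ImageFit {

    /**
     * Returns {width, height} scaled to fit inside maxWidth x maxHeight
     * while preserving the aspect ratio of the source image.
     */
    public static float[] fitWithin(float srcWidth, float srcHeight,
                                    float maxWidth, float maxHeight) {
        if (srcWidth <= 0 || srcHeight <= 0 || maxWidth <= 0 || maxHeight <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        // The smaller of the two ratios is the one that keeps both sides inside the box.
        float ratio = Math.min(maxWidth / srcWidth, maxHeight / srcHeight);
        return new float[] { srcWidth * ratio, srcHeight * ratio };
    }

    public static void main(String[] args) {
        // Hypothetical 800 x 600 image fitted into the 200 x 100 box used above.
        float[] size = fitWithin(800f, 600f, 200f, 100f);
        System.out.println(size[0] + " x " + size[1]);
    }
}
```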