When generating an R Markdown PDF from Shiny, PDF saving and printing are disabled

I am trying to generate a report from a Shiny app where the user can select either HTML or PDF. I am able to generate a PDF from my .Rmd file (LaTeX format), and the layout and formatting look good. However, the PDF opens (in Foxit PDF Reader) with a file name other than the one I specified in my downloadHandler, and I am unable to save or print the PDF from the Foxit Reader window. The output file name is RStudio-randomletters.pdf (e.g., RStudio-FoZvSx.pdf).
Generating an HTML report works with no issues: it uses the file name I specified and opens a window for me to save or rename the file.
The PDF seems to be the only issue, so I'm not sure whether it is related to reading the PDF in Foxit Reader or to something else.
Update
Using Adobe Acrobat instead of Foxit allows me to now save and print, but I am still having issues with the file name for the PDF.
Here is the code for my downloadHandler:
output$downloadReport <- downloadHandler(
  filename = function() {
    paste('Report', sep = '.', switch(
      input$format, PDF = 'pdf', HTML = 'html'))
  },
  content = function(file) {
    out <- if (input$format == 'HTML') {
      rmarkdown::render('report.Rmd',
        params = list(Name = input$Name,
                      Reference = input$Reference),
        switch(input$format,
          PDF = pdf_document(), HTML = html_document()),
        envir = new.env(parent = globalenv()))
    } else if (input$format == 'PDF') {
      rmarkdown::render('pdfreport2.Rmd',
        params = list(Name = input$Name,
                      Reference = input$Reference),
        switch(input$format,
          PDF = pdf_document(), HTML = html_document()),
        envir = new.env(parent = globalenv()))
    }
    file.rename(out, file)
  })

Related

Trouble Reading PDF with PyPDF2

import PyPDF2

# Open the PDF file
pdf_file = open(file, 'rb')
# Create a PDF reader object
pdf_reader = PyPDF2.PdfFileReader(pdf_file)
# Get the number of pages in the PDF file
pages = pdf_reader.numPages
# Initialize a variable to store the extracted text
text = ''
# Loop through each page
for page in range(pages):
    # Get the current page
    pdf_page = pdf_reader.getPage(page)
    # Extract the text from the page
    page_text = pdf_page.extractText()
    # If the page contains text, add it to the overall text
    if page_text:
        text += page_text
# Close the PDF file
pdf_file.close()
# Print the extracted text
print(text)
**Error:**
TypeError: 'NumberObject' object is not subscriptable
I tried changing the PDF reader from WPF to Adobe Acrobat XI.

Extract Text from Multipage Attachment PDF Using Google Apps Script

I have a Gmail attachment PDF with multiple scanned pages. When I use Google Apps Script to save the blob from the attachment to a Drive file, open the PDF manually from Google Drive, then select Open With Google Docs, all of the text from the PDF is displayed as a Google Doc. However, when I save the blob as a Google Doc with OCR, only the text from the image on the first page is saved to a Doc, accessed either manually or by code.
The code to get the blob and process it is:
function getAttachments(desiredLabel, processedLabel, emailQuery){
  // Find emails
  var threads = GmailApp.search(emailQuery);
  if(threads.length > 0){
    // Iterate through the emails
    for(var i in threads){
      var mesgs = threads[i].getMessages();
      for(var j in mesgs){
        var processingMesg = mesgs[j];
        var attachments = processingMesg.getAttachments();
        var processedAttachments = 0;
        // Iterate through attachments
        for(var k in attachments){
          var attachment = attachments[k];
          var attachmentName = attachment.getName();
          var attachmentType = attachment.getContentType();
          // Process PDFs
          if (attachmentType.includes('pdf')) {
            processedAttachments += 1;
            var pdfBlob = attachment.copyBlob();
            var filename = attachmentName + " " + processedAttachments;
            processPDF(pdfBlob, filename);
          }
        }
      }
    }
  }
}
function processPDF(pdfBlob, filename){
  // Saves the blob as a PDF.
  // All pages are displayed if I click on it from Google Drive after running this script.
  let pdfFile = DriveApp.createFile(pdfBlob);
  pdfFile.setName(filename);
  // Saves the blob as an OCRed Doc.
  let resources = {
    title: filename,
    mimeType: "application/pdf"
  };
  let options = {
    ocr: true,
    ocrLanguage: "en"
  };
  let file = Drive.Files.insert(resources, pdfBlob, options);
  let fileID = file.getId();
  // Open the file to get the text.
  // Only the text of the image on the first page is available in the Doc.
  let doc = DocumentApp.openById(fileID);
  let docText = doc.getBody().getText();
}
If I try to use Google Docs to read the PDF without OCR directly, I get Exception: Invalid argument, for example:
DocumentApp.openById(pdfFile.getId());
How do I get the text from all of the pages of the PDF?
DocumentApp.openById is a method that can only be used for Google Docs documents.
pdfFile can only be "opened" with DriveApp: DriveApp.getFileById(pdfFile.getId());
Opening a file with DriveApp allows you to use the DriveApp File methods on the file.
When it comes to OCR conversion, your code works correctly for me and converts all pages of a PDF document to Google Docs, so the source of your error is likely the attachment itself or the way you retrieve the blob.
Keep in mind that OCR conversion is not good at preserving formatting, so a two-page PDF might be collapsed into a one-page Doc, depending on the formatting of the PDF.

Unable to create PDF with jsPDF

I save content in a database, then load it and create a PDF file using jsPDF.
The data is stored together with its HTML tags:
select ct_contents from contract where ct_id = 659;
RESULT : `<p style="text-align:justify"><span style="font-size:10.5pt"><span style="font-family:Century,serif"><span style="font-family:"MS Mincho"">氏  名</span></span></span></p>`
I have this JS code:
let pdfName = this.newTemplate.tp_title.trim()
var doc = new jsPDF();
doc.addFileToVFS('NotoSansCJKjp-Regular.ttf', VFS);
doc.addFont('NotoSansCJKjp-Regular.ttf', 'NotoSansCJKjp', 'Bold');
doc.setFont('NotoSansCJKjp', 'Bold');
doc.setFontSize(12);
var paragraph = this.contract.ct_contents;
var lines = doc.splitTextToSize(paragraph, 150);
doc.text(15, 60, lines);
doc.save(pdfName + '.pdf');
I added a font so the text renders, but when I check the downloaded PDF, the HTML tags appear as well.
I want to remove the tags so that only the text appears.
The image shows the downloaded PDF.
Also, the content spans 3 pages in MS Word, but only page 1 ends up in the downloaded PDF.
How can I output the text in this font without the HTML tags?
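One possible way to drop the tags before handing the string to jsPDF is to convert the stored HTML to plain text first. The following is only a minimal sketch: `htmlToPlainText` is a hypothetical helper (not part of jsPDF), it relies on simple regular expressions rather than a real HTML parser, and it assumes the stored markup is simple paragraph/span markup like the sample above.

```javascript
// Hypothetical helper (not a jsPDF API): convert a stored HTML fragment to plain text.
// Assumption: the markup is simple (p/span/br and similar), as in the sample row.
function htmlToPlainText(html) {
  return html
    .replace(/<\/(p|div|h[1-6]|li)>|<br\s*\/?>/gi, '\n') // block ends and <br> become line breaks
    .replace(/<[^>]*>/g, '')                              // drop every remaining tag
    .replace(/&nbsp;/g, ' ')                              // decode a few common entities
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&amp;/g, '&')                               // decode &amp; last to avoid double decoding
    .replace(/\n{2,}/g, '\n')                             // collapse repeated line breaks
    .trim();
}

// Shortened version of the stored value from the question.
const sample = '<p style="text-align:justify"><span style="font-size:10.5pt">氏 名</span></p>';
console.log(htmlToPlainText(sample));
```

With something like this in place, the call in the question could become `doc.splitTextToSize(htmlToPlainText(this.contract.ct_contents), 150)`. Styling from the HTML (font sizes, alignment) is lost with this approach, since only the text is kept.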

How do I convert multiple .pdf files to .ai files all at once?

As a part of my job, my boss wants me to convert thousands of .pdfs into .ai format in Illustrator CS6 without having to open each individual file (among the thousands) and save each pdf as a .ai. I need to convert these files by the thousands with a few simple steps.
Using Illustrator CS6, I tried the batch option, which applies the same action to multiple files (2 in my test). I chose two folders: a source folder containing the PDFs and a destination folder for the converted .ai files.
While the conversions succeeded, each of the files (2 in this case) opened individually in Illustrator, and I had to save each one manually.
This is not what I need. I need to be able to automatically convert thousands of pdfs into .ai's, without having to open and save each and every one of them.
How do I do this?
You can use this script as a starting point. It works for single-page .pdf files right away. For multipage files you will have to tweak it a bit more.
(function(thisObj){
  main();
  function main(){
    var pdffiles = File.openDialog ('select one or more pdf files', '*.pdf', true);
    if(pdffiles === null){
      return;
    }
    for(var f = 0; f < pdffiles.length;f++){
      var pdf = pdffiles[f];
      //~ alert(pdf);
      var doc = app.open (pdf);
      var namepattern = pdf.path + "/" + pdf.name + ".converted.ai";
      var newai = null;
      if(!(File(namepattern).exists)){
        newai = new File(namepattern);
      }else{
        newai = File(namepattern);
      }
      doc.saveAs(newai);
      doc.close (SaveOptions.DONOTSAVECHANGES);
    }
  }
})(this);
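Note that the script above appends ".converted.ai" to the full PDF name, so logo.pdf is saved as logo.pdf.converted.ai. If a plain .ai name is preferred, the name could be derived as in the sketch below. `toAiName` is a hypothetical helper, written with `var` and plain functions with ExtendScript in mind; it is an illustration, not part of the Illustrator scripting API.

```javascript
// Hypothetical helper: derive an .ai file name from a .pdf file name.
// Assumption: the input is a bare file name such as pdf.name in the script above.
function toAiName(pdfName) {
  // Strip a trailing ".pdf" (case-insensitive), then add the new extension.
  return pdfName.replace(/\.pdf$/i, '') + '.ai';
}

var examples = ['logo.pdf', 'Scan.PDF', 'report.v2.pdf'];
for (var i = 0; i < examples.length; i++) {
  console.log(examples[i] + ' -> ' + toAiName(examples[i]));
}
```

In the script, this would replace the name pattern with something like `pdf.path + "/" + toAiName(pdf.name)`.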

Print PDF in Website

I have been searching for days for a solution to this problem.
Description : I have a website which loads a PDF dynamically via an iFrame. The PDF is saved on the server and the user of the website can view the pdf on the website.
Problem : Introduce a Print button on website which prints the PDF which was created dynamically and saved on the server.
Is this even possible? To make things worse, I also need a cross-browser implementation. I have tried a number of JS options from the web, but none of them seem to work. I cannot get the PDF to print the way it looks on screen. In short, I am trying to emulate the print button that appears on the PDF when it is loaded. Is there an option to pass the PDF document from the server to the print dialog box?
Solution : I could not find an exact solution to this problem, but here is how I solved it:
Create the 'Print' button as required and have it redirect to another page that contains only the PDF.
Copy the previous PDF into a new PDF with embedded JavaScript, this.print(), so that the print dialog pops up for the user as soon as the PDF opens.
In the new page:
if ("Location of PDF " != null)
{
    sPdf = "Location of PDF ";
    PdfReader pReader = new PdfReader(sPdf);
    Document document = new Document
        (pReader.GetPageSizeWithRotation(ApplicationConstants.INDEX_ONE));
    int n = pReader.NumberOfPages;
    FileStream fs = new FileStream
        ("New PDF location",
        FileMode.Create, FileAccess.Write);
    PdfCopy copy = new PdfCopy(document, fs);
    // Write to pdf
    document.Open();
    for (int i = ApplicationConstants.INDEX_ONE; i <= n; i++)
    {
        PdfImportedPage page = copy.GetImportedPage(pReader, i);
        copy.AddPage(page);
    }
    // Embed JavaScript that opens the print dialog when the PDF is opened
    copy.AddJavaScript("this.print(true);", true);
    document.Close();
    pReader.Close();
    // Stream the new PDF back to the client
    inStr = File.OpenRead("New PDF location");
    while ((bytecnt = inStr.Read
        (buffer, ApplicationConstants.INDEX_ZERO, buffer.Length))
        > ApplicationConstants.INDEX_ZERO)
    {
        if (Context.Response.IsClientConnected)
        {
            Context.Response.ContentType = "application/PDF";
            // Write only the bytes actually read in this pass
            Context.Response.OutputStream.Write(buffer,
                ApplicationConstants.INDEX_ZERO, bytecnt);
            Context.Response.Flush();
        }
    }
}
Please note that I am using iTextSharp to inject the JavaScript into the new PDF. Hope this helps someone else. I am still trying to find another solution that does not need iTextSharp or any other DLL, but this will have to do for now.
I am not sure if this will work, but you could try launching a popup window with a special version of your PDF file that opens the print dialog when opened. Then close the popup afterwards. This last part might be tricky since I think there is no clean way to know if the print dialog has been closed.