PyPDF2.errors.PdfStreamError AFTER the program did its job?

I have a small program that goes through a folder of PDFs (path_contracts), merges each one with one specific other PDF (anlage), and saves the result in another folder (path_finished). It runs fine and does the job, but after it is done it throws an error:
PyPDF2.errors.PdfStreamError: Stream has ended unexpectedly
Here is the code:
import os
from PyPDF2 import PdfMerger

anlage = r"Folder with one PDF"
path_contracts = r"Folder with many PDFs"
path_finished = r"Path to the export folder"

for pdf in os.listdir(path_contracts):
    # Merge each contract with the attachment and save it under the same name
    merger = PdfMerger()
    merger.append(os.path.join(path_contracts, pdf))
    merger.append(anlage)
    merger.write(path_finished + "\\" + pdf)
    merger.close()
Can anybody help me figure out what is missing?
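One possible cause, offered as an assumption since the post does not show the folder contents: os.listdir returns every entry in path_contracts, including subfolders, non-PDF files, and empty or truncated PDFs, and PyPDF2 can fail when asked to read such an entry. If that entry happens to be processed last, all the real contracts would already be merged by the time the error appears. A minimal sketch that filters the listing first and reports unreadable files instead of stopping; list_pdf_files and merge_all are hypothetical helper names, not part of PyPDF2:

```python
import os


def list_pdf_files(folder):
    """Return sorted names of non-empty regular files in `folder` ending in .pdf."""
    names = []
    for name in sorted(os.listdir(folder)):
        full_path = os.path.join(folder, name)
        if not os.path.isfile(full_path):
            continue  # skip subfolders
        if not name.lower().endswith(".pdf"):
            continue  # skip non-PDF entries such as temporary or system files
        if os.path.getsize(full_path) == 0:
            continue  # skip empty files
        names.append(name)
    return names


def merge_all(path_contracts, anlage, path_finished):
    """Append `anlage` to every PDF in `path_contracts`, writing to `path_finished`."""
    # Imported here so list_pdf_files stays usable without PyPDF2 installed.
    from PyPDF2 import PdfMerger
    from PyPDF2.errors import PdfReadError, PdfStreamError

    for pdf in list_pdf_files(path_contracts):
        merger = PdfMerger()
        try:
            merger.append(os.path.join(path_contracts, pdf))
            merger.append(anlage)
            merger.write(os.path.join(path_finished, pdf))
        except (PdfReadError, PdfStreamError) as exc:
            # Report the unreadable file and continue with the next one
            print(f"Skipping {pdf}: {exc}")
        finally:
            merger.close()
```

Calling merge_all(path_contracts, anlage, path_finished) with the three paths from the question would then print the name of any file PyPDF2 cannot read, which should show whether a stray or damaged file is behind the error.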

Photoshop script suddenly stopped working - Error 8000

So I made a script for Photoshop based on this generator
The important part is
#target photoshop
function main() {
    // prompt user to select source file, cancel returns null
    var sourceFile = File.openDialog("Select a 1:1 square PNG file that is at least 618x618.", "*.png", false);
    if (sourceFile == null) {
        // user canceled
        return;
    }
    var doc = open(sourceFile, OpenDocumentType.PNG);
    if (doc == null) {
        alert("Oh shit!\nSomething is wrong with the file. Make sure it is a valid PNG file.");
        return;
    }
    ....
}
main();
This always worked. But today, when I wanted to change something in the script (I hadn't even started yet, and I hadn't used it for about two weeks), I suddenly got only an error (translated from German):
Error 8000: The file can not be opened since the parameters for opening are incorrect.
Line:764
-> doc = open(sourceFile, OpenDocumentType.PNG);
How can I open a PNG file via a File.Open dialog in a Photoshop script?
I already tried adding the app prefix:
var doc = app.open(sourceFile, OpenDocumentType.PNG);
removing the document type specifier:
var doc = open(sourceFile);
and adding a third argument, as I saw in many forums:
var doc = open(sourceFile, OpenDocumentType.PNG, undefined);
as well as variations of these. Nothing has helped so far.
For debugging I also added
alert(sourceFile);
before the line in question and get, e.g.,
~/Desktop/Example/originalImage_2000x2000.png
The problem apparently was with Photoshop in general!
When I opened Photoshop I didn't even get the default view of recently opened files etc., and I was not actually able to open any file... but I never tested this first.
After rebooting the PC and launching Photoshop, everything went back to normal and the script runs fine, as expected.

Could not write the file an assertation has failed in illustrator

I'm using Adobe ExtendScript to export a file to JPG format in Adobe Illustrator. When I perform an export operation, it sometimes (but not every time) shows "could not write the file. an assertation has failed".
I'm using the following code:
var exportOptions = new ExportOptionsJPEG();
var type = ExportType.JPEG;
var fileSpec = new File(dest);
exportOptions.antiAliasing = true;
exportOptions.qualitySetting = 70;
app.activeDocument.exportFile( fileSpec, type, exportOptions );
Is this an error in my code, or something on Illustrator's side? It doesn't occur every time. Could it be an OS or version problem? I'm using the latest version of Illustrator.

How to check multiple PDF files for annotations/comments?

Problem: I routinely receive PDF reports and annotate (highlight etc.) some of them. I had the bad habit of saving the annotated PDFs together with the non-annotated PDFs. I now have hundreds of PDF files in the same folder, some annotated and some not. Is there a way to check every PDF file for annotations and copy only the annotated ones to a new folder?
Thanks a lot!
I'm on Win 7 64bit, I have Adobe Acrobat XI installed and I'm able to do some beginner coding in Python and Javascript
Please ignore the following suggestion, since the answers already solved the problem.
EDIT: Following Mr. Wyss' suggestion, I created the following code for Acrobat's Javascript console to be run only once at the beginning:
counter = 1;
// Open a new report
var rep = new Report();
rep.size = 1.2;
rep.color = color.blue;
rep.writeText("Files WITH Annotations");
Then this code should be applied to all PDFs:
this.syncAnnotScan();
annots = this.getAnnots();
path = this.path;
if (annots) {
    rep.color = color.black;
    rep.writeText(" ");
    rep.writeText(counter.toString() + "- " + path);
    rep.writeText(" ");
    if (counter % 20 == 0) {
        rep.breakPage();
    }
    counter++;
}
And, at last, one code to be run only once at the end:
//Now open the report
var docRep = rep.open("files_with_annots.pdf");
There are two problems with this solution:
1. The "Action Wizard" seems to always apply the same code afresh to each PDF. That means the "counter" variable, for instance, is meaningless: it will always be 1. More importantly, the variable "rep" will be unassigned when the middle code is run on different PDFs.
2. How can I make the code that should run only once run only at the beginning or at the end, instead of running every time for every single PDF (as it does by default)?
Thank you very much again for your help!
This would be possible using the Action Wizard to put together an action.
The function to determine whether there are annotations in the document would be done in Acrobat JavaScript. Roughly, the core function would look like this:
this.syncAnnotScan(); // updates all annots
var myAnnots = this.getAnnots();
if (myAnnots != null) {
    // do something if there are annots
} else {
    // do something if there are no annots
}
And that should get you there.
I am not completely positive, but I think there is also a Preflight check which tells you whether there are annotations in the document. If so, you would create a Preflight droplet, which would sort out the annotated and not annotated documents.
Mr. Wyss is right, here's a step-by-step guide:
In Acrobat XI Pro, go to the 'Tools' panel on the right side
Click on the 'Action Wizard' tab (you must first make it visible, though)
Click on 'Create New Action...', choose 'More tools' > 'Execute Javascript' and add it to the right-hand pane > click on 'Execute Javascript' > 'Specify Settings' (uncheck 'prompt user' if you want) > paste this code:
this.syncAnnotScan();
var annots = this.getAnnots();
var fname = this.documentFileName;
// Replace every comma, not just the first one
fname = fname.replace(/,/g, ";");
var errormsg = "";
if (annots) {
    try {
        this.saveAs({
            cPath: "/c/folder/" + fname,
            bPromptToOverwrite: false // make this 'true' if you want to be prompted on overwrites
        });
    } catch (e) {
        for (var i in e) {
            errormsg += (i + ": " + e[i] + " / ");
        }
        app.alert({
            cMsg: "Error! Unable to save the file under this name ('" + fname + "' - possibly a Unicode string?) See this: " + errormsg,
            cTitle: "Damn you Acrobat"
        });
    }
}
annots = 0;
Save and run it! All your annotated PDFs will be saved to 'c:\folder' (but only if this folder already exists!)
Be sure to first enable JavaScript in 'Edit' > 'Preferences...' > 'Javascript' > 'Enable Acrobat Javascript'.
VERY IMPORTANT: Acrobat's JS has a bug that doesn't allow Docs to be saved with commas (",") in their names (e.g., "Meeting with suppliers, May 11th.pdf" - this will get an error). Therefore, I substitute in the code above all "," for ";".
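Since the question mentions some Python experience, the same check can also be sketched outside Acrobat. This is an alternative built on assumptions, not what the answers above used: it relies on the third-party PyPDF2 package (version 2.0 or later), it treats any /Annots entry other than a link or form widget as an annotation, and the names has_annotations and copy_annotated are hypothetical.

```python
import os
import shutil


def has_annotations(pdf_path):
    """Return True if any page has an annotation other than a link or form widget.

    Assumes PyPDF2 >= 2.0 is installed.
    """
    from PyPDF2 import PdfReader

    reader = PdfReader(pdf_path)
    for page in reader.pages:
        if "/Annots" not in page:
            continue
        for ref in page["/Annots"]:
            annot = ref.get_object()
            if annot.get("/Subtype") not in ("/Link", "/Widget"):
                return True
    return False


def copy_annotated(src_folder, dst_folder, is_annotated=has_annotations):
    """Copy every PDF in src_folder for which is_annotated(path) is true.

    Returns the list of copied file names.
    """
    os.makedirs(dst_folder, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(src_folder)):
        src_path = os.path.join(src_folder, name)
        if not (os.path.isfile(src_path) and name.lower().endswith(".pdf")):
            continue  # skip subfolders and non-PDF files
        if is_annotated(src_path):
            shutil.copy2(src_path, os.path.join(dst_folder, name))
            copied.append(name)
    return copied
```

The check is passed in as a parameter so that a different test (for example, highlights only) can be swapped in without touching the copy loop. Unlike the Acrobat script, this copies the files byte for byte, so commas in file names are left as they are.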

How do I convert multiple .pdf files to .ai files all at once?

As a part of my job, my boss wants me to convert thousands of .pdfs into .ai format in Illustrator CS6 without having to open each individual file (among the thousands) and save each pdf as a .ai. I need to convert these files by the thousands with a few simple steps.
Using Illustrator CS6, I tried the batch option to apply the same action to multiple files (two, as a test). I chose two folders: a source folder containing the PDFs and a destination folder where the converted .ai files are placed.
While the conversions were successful, the files (two in this case) each opened individually in Illustrator, and I had to save them manually.
This is not what I need. I need to be able to automatically convert thousands of pdfs into .ai's, without having to open and save each and every one of them.
How do I do this?
You can use this script as a starting point. It works for single-page .pdf files right away. For multi-page files you will have to tweak it a bit more.
(function(thisObj){
    main();
    function main(){
        var pdffiles = File.openDialog('select one or more pdf files', '*.pdf', true);
        if(pdffiles === null){
            return;
        }
        for(var f = 0; f < pdffiles.length; f++){
            var pdf = pdffiles[f];
            //~ alert(pdf);
            var doc = app.open(pdf);
            var namepattern = pdf.path + "/" + pdf.name + ".converted.ai";
            var newai = null;
            if(!(File(namepattern).exists)){
                newai = new File(namepattern);
            }else{
                newai = File(namepattern);
            }
            doc.saveAs(newai);
            doc.close(SaveOptions.DONOTSAVECHANGES);
        }
    }
})(this);

Print one page of WebBrowser

I am trying to automate printing of intranet websites. Since this is an application that will be put on a specific user's computer, which will be run on an as-needed basis, I'd like it to be as un-disruptive as possible (in other words, not launching IE for each page). The catch is that I need to print the first page of the website and then print the whole website again, which will produce the first page two times. What is the best way to do this?
I have no problem getting it to loop through the pages that it needs to print, nor do I have a problem opening the page with webbrowser. I do, however, have a problem specifying a print range.
I also tried PrintDocument, but couldn't figure out how to get that to open within the form.
Thanks for any help that can be provided.
To download the pdf file, try this solution using iTextSharp:
ITextSharp HTML to PDF?
Except with one substitution, if you want to save directly to a file:
private MemoryStream createPDF(string html)
{
    MemoryStream msOutput = new MemoryStream();
    TextReader reader = new StringReader(html);
    // step 1: creation of a document-object
    Document document = new Document(PageSize.A4, 30, 30, 30, 30);
    // step 2:
    // we create a writer that listens to the document
    // and directs a PDF stream to a file
    PdfWriter writer = PdfWriter.GetInstance(document, new FileStream("c:\\my.pdf", FileMode.Create));
    // step 3: we create a worker to parse the document
    HTMLWorker worker = new HTMLWorker(document);
    // step 4: we open the document and start the worker on the document
    document.Open();
    worker.StartDocument();
    // step 5: parse the html into the document
    worker.Parse(reader);
    // step 6: close the document and the worker
    worker.EndDocument();
    worker.Close();
    document.Close();
    return msOutput;
}
Once the PDF is set up, try ghostscript to print one page:
Print existing PDF (or other files) in C#
If you start a shell execute of the process, you can use the command line arguments:
gsprint "filename.pdf" -from 1 -to 1
Alternatively, WebBrowser can just print the full page: http://msdn.microsoft.com/en-us/library/b0wes9a3.aspx
I can't find anything referencing that WebBrowser itself can print "From page X to Y" without a print dialog.
Since I'm facing a similar problem, here's an alternate solution:
This open source project turns HTML documents to PDF documents similar to iTextSharp (http://code.google.com/p/wkhtmltopdf/). We ended up not using iTextSharp because of several formatting issues with the way the site we wanted to print was laid out. We send command line arguments to turn the html downloaded using a webclient into a pdf file.
WebClient wc = new WebClient();
wc.Credentials = CredentialCache.DefaultNetworkCredentials;
string htmlText = wc.DownloadString("http://websitehere.com");
Then, after turning to pdf, you can simply print the file:
Process p = new Process();
p.StartInfo.FileName = string.Format("{0}.pdf", fileLocation);
p.StartInfo.Verb = "Print";
p.Start();
p.WaitForExit();
(Apologies for C#, I'm more familiar with it than VB.NET, though it should be a simple conversion)