PyPDF2: Error -5 while decompressing data: incomplete or truncated stream - pdf

I'm getting an "incomplete or truncated stream" error while trying to pull data out of an interactive PDF form. Could anyone help me with this, please?
import PyPDF2 as p2  # assumed alias; the original snippet uses p2 without showing the import
PDFfile = open(fname, "rb")
pdfread = p2.PdfFileReader(PDFfile)
I get the error below when I execute pdfread:
Error -5 while decompressing data: incomplete or truncated stream

This mostly happens when the PDF is already open in a different program or the PDF is corrupted. Try opening the PDF with open().
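To tell a damaged file apart from other causes, it can help to look at the raw bytes before handing the file to PyPDF2. The sketch below is not part of PyPDF2: `looks_truncated` and `flate_error` are hypothetical helper names, and the 1024-byte tail window is an assumption. It checks for the `%PDF-` header and a trailing `%%EOF` marker, and shows that cutting a zlib (Flate) stream short makes decompression fail, which is the kind of failure reported in the question.

```python
import zlib


def looks_truncated(pdf_bytes: bytes) -> bool:
    """Return True if the data lacks the PDF header or a %%EOF marker near the end."""
    has_header = pdf_bytes.startswith(b"%PDF-")
    # Assumption: a complete file carries %%EOF within its last 1024 bytes.
    has_eof = b"%%EOF" in pdf_bytes[-1024:]
    return not (has_header and has_eof)


def flate_error(stream: bytes):
    """Return the zlib error message for a compressed stream, or None if it decompresses."""
    try:
        zlib.decompress(stream)
    except zlib.error as exc:
        return str(exc)
    return None
```

If `looks_truncated(open(fname, "rb").read())` returns True, the file on disk is likely incomplete, and re-downloading or re-exporting it is a more promising fix than changing the reading code.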

Related

Google colab unable to work with hdf5 files

I have 4 hdf5 files in my Drive. While using Colab, db=h5py.File(path_to_file, "r") works sometimes and fails the rest of the time. When writing the hdf5 files, I made sure to close them after writing. Say File1 works in notebook_#1; when I try to use it in notebook_#2, it sometimes works and sometimes doesn't. When I run it again in notebook_#1, it may or may not work.
Size is probably not the cause, because some of my 4 files are 32GB and others 4GB, and the problem occurs mostly with the 4GB files.
The hdf5 files were generated in Colab itself. The error that I get is:
OSError: Unable to open file (file read failed: time = Tue May 19 12:58:36 2020
, filename = '/content/drive/My Drive/Hrushi/dl4cv/hdf5_files/train.hdf5', file descriptor = 61, errno = 5, error message = 'Input/output error', buf = 0x7ffc437c4c20, total read size = 8, bytes this sub-read = 8, bytes actually read = 18446744073709551615, offset = 0
or
/usr/local/lib/python3.6/dist-packages/h5py/_hl/files.py in make_fid(name, mode, userblock_size, fapl, fcpl, swmr)
171 if swmr and swmr_support:
172 flags |= h5f.ACC_SWMR_READ
--> 173 fid = h5f.open(name, flags, fapl=fapl)
174 elif mode == 'r+':
175 fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)
h5py/_objects.pyx in h5py._objects.with_phil.wrapper()
h5py/_objects.pyx in h5py._objects.with_phil.wrapper()
h5py/h5f.pyx in h5py.h5f.open()
OSError: Unable to open file (bad object header version number)
Would be grateful for any help, thanks in advance.
Reading directly from Google Drive can cause problems.
Try copying the file to a local directory, e.g. /content/, first.
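A minimal sketch of that copy step, using only the standard library. `copy_to_local` is a hypothetical helper name, and the size comparison used to skip a repeated copy is an assumption rather than a real integrity check.

```python
import shutil
from pathlib import Path


def copy_to_local(src, local_dir="/content"):
    """Copy src into local_dir (unless a same-sized copy exists) and return the local path."""
    src = Path(src)
    dest = Path(local_dir) / src.name
    # Re-copy when the local file is missing or its size differs from the source.
    if not dest.exists() or dest.stat().st_size != src.stat().st_size:
        shutil.copy2(src, dest)
    return dest


# Usage in a notebook, with the path taken from the error message in the question:
# local = copy_to_local("/content/drive/My Drive/Hrushi/dl4cv/hdf5_files/train.hdf5")
# db = h5py.File(local, "r")
```

Reading from the local copy means h5py reads from the notebook's own disk rather than through the Drive mount.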

unable to load csv file from GCS into bigquery

I am trying to load a 500 MB CSV file from Google Cloud Storage into BigQuery, but I get this error:
Errors:
Too many errors encountered. (error code: invalid)
Job ID xxxx-xxxx-xxxx:bquijob_59e9ec3a_155fe16096e
Start Time Jul 18, 2016, 6:28:27 PM
End Time Jul 18, 2016, 6:28:28 PM
Destination Table xxxx-xxxx-xxxx:DEV.VIS24_2014_TO_2017
Write Preference Write if empty
Source Format CSV
Delimiter ,
Skip Leading Rows 1
Source URI gs://xxxx-xxxx-xxxx-dev/VIS24 2014 to 2017.csv.gz
I gzipped the 500 MB CSV file to csv.gz before uploading it to GCS. Please help me solve this issue.
The internal details for your job show that there was an error reading row #1 of your CSV file. You'll need to investigate further, but it could be that you have a header row that doesn't conform to the schema of the rest of the file, so we're trying to parse a string in the header as an integer or boolean or something like that. You can set the skipLeadingRows property to skip such a row.
Other than that, I'd check that the first row of your data matches the schema you're attempting to import with.
Also, the error message you received is unfortunately very unhelpful, so I've filed a bug internally to make the error you received in this case more helpful.
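One way to investigate locally before re-running the load job is to look at the first few rows of the gzipped file. This is a sketch using only the Python standard library, not a BigQuery API; `peek_rows` and `rows_with_wrong_width` are hypothetical helper names, and UTF-8 encoding is an assumption about the file.

```python
import csv
import gzip
from itertools import islice


def peek_rows(path, n=5, delimiter=","):
    """Return the first n parsed rows of a gzipped CSV file."""
    with gzip.open(path, "rt", newline="", encoding="utf-8") as fh:
        return list(islice(csv.reader(fh, delimiter=delimiter), n))


def rows_with_wrong_width(rows, expected_columns):
    """Return 1-based positions of rows whose column count differs from the schema's."""
    return [i for i, row in enumerate(rows, start=1) if len(row) != expected_columns]
```

Printing the result of `peek_rows` next to the table schema makes it easy to see whether row 1 is a header and whether the first data row has the expected number and kind of values.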

Failed to load pdf in chrome browser

I am getting the error below:
Error message screenshot
I am opening the PDF file via the link below:
http://www.satyajainfratech.com/Bliss-web-brochure-landscaped.pdf
The code written for it is:
Download Brochure
The PDF is corrupt, ending in the middle of a stream. The last object starts with
10 0 obj^M
<<^M
/Filter [/FlateDecode ]^M
/Length 465325^M
>>^M
stream^M
This stream should have 465325 bytes of binary data followed by an endstream marker. But the file ends abruptly mid-stream after about 43475 bytes of data, with no endstream or file trailer.
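The same comparison of declared and actual stream length can be sketched in Python. This is not a PDF parser: `last_stream_status` is a hypothetical helper, and it assumes /Length is a direct integer in the dictionary immediately before the stream keyword, as in the object quoted above. Binary data that happens to contain a similar byte pattern could confuse it.

```python
import re

# Assumption: "/Length <int>" appears in the dictionary right before "stream".
STREAM_START = re.compile(rb"/Length\s+(\d+)[^<>]*>>\s*stream\r?\n")


def last_stream_status(pdf_bytes):
    """Return (declared_length, missing_bytes, has_endstream) for the last stream, or None."""
    matches = list(STREAM_START.finditer(pdf_bytes))
    if not matches:
        return None
    last = matches[-1]
    declared = int(last.group(1))
    body = pdf_bytes[last.end():]  # everything after the "stream" line
    missing = max(0, declared - len(body))
    # A complete stream is followed by an end-of-line and the endstream keyword.
    has_endstream = body[declared:].lstrip(b"\r\n").startswith(b"endstream")
    return declared, missing, has_endstream
```

A non-zero `missing_bytes` together with `has_endstream` being False indicates that the file was cut off inside the stream, so the copy on the server needs to be uploaded again.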
When I opened it I got this warning in chrome: "/deep/ combinator is deprecated. See https://www.chromestatus.com/features/6750456638341120 for more details."
If you remove the deprecated combinator it should open normally.

Reading PDF into a blob then sending as an attachment

I am trying to read a PDF into a blob object and then do an INSERT into my Oracle database so that it can be sent off as an attachment. The email portion is working and it adds an attachment, but the attachment is always corrupt and I can't open it. Below is the code where I create my blob PDF. Can someone help me figure out why this isn't creating a proper attachment?
ls_pdf_name = ls_pdf_path + "\" + "invnum_" + ls_invoice + ".pdf"
ls_pdf_filename = "invoice_" + ls_invoice + ".pdf"
ls_rc = wf_check_pdf_status(ll_invoice_number, ls_sub_type, ll_user_supp_id)
If ls_rc = "Y" Then
li_fnum = FileOpen(ls_pdf_name, StreamMode!)
li_bytes = FileRead(li_fnum, bPDF)
FileClose(li_fnum)
ll_rc = wf_update_pdf_tables(bPDF, ls_pdf_filename, ls_sub_type, ll_user_supp_id, ll_invoice_number, ls_month, ls_year)
EDIT
So I took Calvin's advice and switched my insert to the following:
Here is the INSERT statement that puts the blob into the table
INSERT INTO ATTACH_DOCUMENT
(id, filename, mime_type, date_time_created)
VALUES
(ATTACH_DOCUMENT_SEQ.NEXTVAL, :pdf_filename, 'application/pdf', CURRENT_TIMESTAMP);
UPDATEblob ATTACH_DOCUMENT
SET data = :pdf
WHERE id = ATTACH_DOCUMENT_SEQ.CURRENTVAL;
But when I open the PDF attachment from my email, Adobe shows this error: "Could not open because it is either not a supported file type or because it has been damaged (for example it was sent as an email attachment and wasn't decoded correctly)".
Thanks
How big is the PDF file?
You may not be getting all the contents with a simple FileRead(), which reads at most 32,765 bytes per call. Try using FileReadEx() instead.
The first thing to do is to check whether the PDF is correctly saved in the database:
Don't remove the PDF file after it is inserted into the database.
Using a SQL interpreter, calculate the size of the blob column you inserted the file into and verify that it matches the file size.
You didn't mention the database you are using; for example, in MS SQL Server you could use the datalength() function to do this. Depending on the database, you may also check whether the PDF is corrupted by calculating its MD5 hash.
If you use a capable query tool (e.g. TOAD for Oracle), you could save the blob as a PDF file and verify that it is readable.
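The size and hash comparison described above can be sketched as follows, assuming the blob has already been exported from the database to bytes by some other means. `fingerprint` and `same_content` are hypothetical helper names.

```python
import hashlib


def fingerprint(data: bytes):
    """Return (size_in_bytes, md5_hex_digest) for a byte string."""
    return len(data), hashlib.md5(data).hexdigest()


def same_content(file_bytes: bytes, blob_bytes: bytes) -> bool:
    """Return True when the original file and the stored blob have the same size and MD5."""
    return fingerprint(file_bytes) == fingerprint(blob_bytes)
```

If the stored blob turns out to be much smaller than the file on disk, the read step truncated the data before the insert, which points back at how the file is read rather than at the database or the email step.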

error when trying to import ps file by grImport in R

I need to create a PDF file with several charts created by ggplot2 arranged on an A4 page, and repeat this 20-30 times.
I export the ggplot2 chart to a ps file and try to PostScriptTrace it as instructed in grImport, but it keeps giving me the error "Unrecoverable error, exit code 1".
I ignored the error and tried to import the generated xml file into an R object, which gave me another error:
attributes construct error
Couldn't find end of Start Tag text line 21
Premature end of data in tag picture line 3
Error: 1: attributes construct error
2: Couldn't find end of Start Tag text line 21
3: Premature end of data in tag picture line 3
What's wrong here?
Thanks!
If you have no time to deal with Sweave, you could also write a simple TeX document from R after generating the plots, which you could later compile to pdf.
E.g.:
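# Build the file name once; paste0() avoids the spaces paste() inserts by default
fname <- paste0('filename', id, '.pdf')
ggsave(filename = fname, plot = p)
# Append the include line to a .tex file (not .pdf); it still needs a LaTeX preamble
cat(paste0('\\includegraphics{', fname, '}\n'),
file = 'report.tex', append = TRUE)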
Later, you could easily compile it to pdf with for example pdflatex.