fig, axs = plt.subplots(2, 2) raises UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte - python-unicode

I am running the following commands in Spyder:
import matplotlib.pyplot as plt
fig, axs = plt.subplots(2, 2)
Traceback (most recent call last):
File "/home/hh/.local/lib/python3.8/site-packages/matplotlib_inline/backend_inline.py", line 41, in show
display(
File "/home/hh/anaconda3/envs/gee/lib/python3.8/site-packages/IPython/core/display.py", line 327, in display
publish_display_data(data=format_dict, metadata=md_dict, **kwargs)
File "/home/hh/anaconda3/envs/gee/lib/python3.8/site-packages/IPython/core/display.py", line 119, in publish_display_data
display_pub.publish(
File "/home/hh/.local/lib/python3.8/site-packages/ipykernel/zmqshell.py", line 138, in publish
self.session.send(
File "/home/hh/anaconda3/envs/gee/lib/python3.8/site-packages/jupyter_client/session.py", line 830, in send
to_send = self.serialize(msg, ident)
File "/home/hh/anaconda3/envs/gee/lib/python3.8/site-packages/jupyter_client/session.py", line 704, in serialize
content = self.pack(content)
File "/home/hh/anaconda3/envs/gee/lib/python3.8/site-packages/jupyter_client/session.py", line 95, in json_packer
return jsonapi.dumps(
File "/home/hh/anaconda3/envs/gee/lib/python3.8/site-packages/zmq/utils/jsonapi.py", line 40, in dumps
s = jsonmod.dumps(o, **kwargs)
File "/home/hh/anaconda3/envs/gee/lib/python3.8/site-packages/simplejson/__init__.py", line 398, in dumps
return cls(
File "/home/hh/anaconda3/envs/gee/lib/python3.8/site-packages/simplejson/encoder.py", line 296, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/home/hh/anaconda3/envs/gee/lib/python3.8/site-packages/simplejson/encoder.py", line 378, in iterencode
return _iterencode(o, 0)
File "/home/hh/anaconda3/envs/gee/lib/python3.8/site-packages/simplejson/encoder.py", line 44, in encode_basestring
s = str(s, 'utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte
I get the same error if I run the same code in JupyterLab. However, if I run the command in a terminal, fig, ax = plt.subplots() works fine.
This only started happening recently; I didn't have this issue before. I checked online but didn't find a solution. I would appreciate it if anyone could provide any insights. Thanks.

I had the same problem running plain Jupyter and within VS Code.
Try updating the Jupyter installation, including its required libraries. In my case,
pip3 install --upgrade jupyter_client pyzmq
fixed the problem.
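As a side note on reading the traceback: byte 0x89 is the first byte of the PNG file signature, so the error is consistent with raw PNG image bytes (the inline figure) reaching a JSON encoder that tries to decode them as UTF-8 text. That reading of the cause is an assumption, not something confirmed above. The minimal sketch below only reproduces the error message itself; the helper name try_decode is made up for illustration.

```python
# First eight bytes of every PNG file (the PNG signature).
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def try_decode(raw):
    """Decode bytes as UTF-8, returning the error text if decoding fails."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return str(exc)


# Prints the same message as the last line of the traceback.
print(try_decode(PNG_SIGNATURE))
```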

Related

Data Length Error when Merging PDFs with PyPDF2

I am starting a project that will take specific pages out of each PDF in a folder and merge those pages into a single file. When I run the code quoted below, I get the error shown about the data length for encryption, and I don't know where I would need to address it.
from PyPDF2 import PdfFileMerger
import glob
files = glob.glob('C:/Users/Jake/Documents/UPLOAD/test_merge/*.pdf')
merger = PdfFileMerger()
for file in files:
    merger.append(file)
merger.write("merged.pdf")
merger.close()
ERROR
Traceback (most recent call last):
File "C:\Users\Jake\Documents\Work Projects\Python\Contract Merger\Merger .02", line 10, in <module>
merger.write("merged.pdf")
File "C:\Users\Jake\Anaconda3\lib\site-packages\PyPDF2\_merger.py", line 312, in write
my_file, ret_fileobj = self.output.write(fileobj)
File "C:\Users\Jake\Anaconda3\lib\site-packages\PyPDF2\_writer.py", line 838, in write
self.write_stream(stream)
File "C:\Users\Jake\Anaconda3\lib\site-packages\PyPDF2\_writer.py", line 811, in write_stream
self._sweep_indirect_references(self._root)
File "C:\Users\Jake\Anaconda3\lib\site-packages\PyPDF2\_writer.py", line 960, in _sweep_indirect_references
data = self._resolve_indirect_object(data)
File "C:\Users\Jake\Anaconda3\lib\site-packages\PyPDF2\_writer.py", line 1005, in _resolve_indirect_object
real_obj = data.pdf.get_object(data)
File "C:\Users\Jake\Anaconda3\lib\site-packages\PyPDF2\_reader.py", line 1187, in get_object
retval = self._encryption.decrypt_object(
File "C:\Users\Jake\Anaconda3\lib\site-packages\PyPDF2\_encryption.py", line 747, in decrypt_object
return cf.decrypt_object(obj)
File "C:\Users\Jake\Anaconda3\lib\site-packages\PyPDF2\_encryption.py", line 185, in decrypt_object
obj[dictkey] = self.decrypt_object(value)
File "C:\Users\Jake\Anaconda3\lib\site-packages\PyPDF2\_encryption.py", line 179, in decrypt_object
data = self.strCrypt.decrypt(obj.original_bytes)
File "C:\Users\Jake\Anaconda3\lib\site-packages\PyPDF2\_encryption.py", line 87, in decrypt
d = aes.decrypt(data)
File "C:\Users\Jake\Anaconda3\lib\site-packages\Crypto\Cipher\_mode_cbc.py", line 246, in decrypt
raise ValueError("Data must be padded to %d byte boundary in CBC mode" % self.block_size)
ValueError: Data must be padded to 16 byte boundary in CBC mode
[Finished in 393ms]
I wrote a basic program from a YouTube video and tried to run it, but I got an error saying that PyCryptodome is a dependency of PyPDF2. After installing that, I am getting an error about the data length for encryption when writing the PDF. Googling that error led me to this solution. I am a bit of a novice, and I don't really understand why any kind of encryption is being applied in the first place, other than what I assume is necessary for the PDF reader/writer to operate, so I don't know where I would need to apply that solution in this code.
After writing up this question, I was led to this solution. I tried running the code below, but I received the same error.
from PyPDF2 import PdfFileMerger, PdfFileReader
import glob
merger = PdfFileMerger()
files = glob.glob('C:/Users/Jake/Documents/UPLOAD/test_merge/*.pdf')
for filename in files:
    with open(filename, 'rb') as source:
        tmp = PdfFileReader(source)
        merger.append(tmp)
merger.write('Result.pdf')
ERROR
Traceback (most recent call last):
File "C:\Users\Jake\Documents\Work Projects\Python\Contract Merger\Merger .03.py", line 13, in <module>
merger.write('Result.pdf')
File "C:\Users\Jake\Anaconda3\lib\site-packages\PyPDF2\_merger.py", line 312, in write
my_file, ret_fileobj = self.output.write(fileobj)
File "C:\Users\Jake\Anaconda3\lib\site-packages\PyPDF2\_writer.py", line 838, in write
self.write_stream(stream)
File "C:\Users\Jake\Anaconda3\lib\site-packages\PyPDF2\_writer.py", line 811, in write_stream
self._sweep_indirect_references(self._root)
File "C:\Users\Jake\Anaconda3\lib\site-packages\PyPDF2\_writer.py", line 960, in _sweep_indirect_references
data = self._resolve_indirect_object(data)
File "C:\Users\Jake\Anaconda3\lib\site-packages\PyPDF2\_writer.py", line 1005, in _resolve_indirect_object
real_obj = data.pdf.get_object(data)
File "C:\Users\Jake\Anaconda3\lib\site-packages\PyPDF2\_reader.py", line 1187, in get_object
retval = self._encryption.decrypt_object(
File "C:\Users\Jake\Anaconda3\lib\site-packages\PyPDF2\_encryption.py", line 747, in decrypt_object
return cf.decrypt_object(obj)
File "C:\Users\Jake\Anaconda3\lib\site-packages\PyPDF2\_encryption.py", line 185, in decrypt_object
obj[dictkey] = self.decrypt_object(value)
File "C:\Users\Jake\Anaconda3\lib\site-packages\PyPDF2\_encryption.py", line 179, in decrypt_object
data = self.strCrypt.decrypt(obj.original_bytes)
File "C:\Users\Jake\Anaconda3\lib\site-packages\PyPDF2\_encryption.py", line 87, in decrypt
d = aes.decrypt(data)
File "C:\Users\Jake\Anaconda3\lib\site-packages\Crypto\Cipher\_mode_cbc.py", line 246, in decrypt
raise ValueError("Data must be padded to %d byte boundary in CBC mode" % self.block_size)
ValueError: Data must be padded to 16 byte boundary in CBC mode
[Finished in 268ms]
My thinking is that something else has gone wrong, but I am at a loss as to what that could be.
What have I done wrong with this build to get this error, and how can I correct it?
It turns out this is an issue with PyPDF2. There is a 3-line fix that you can apply to correct the error if you run into this before it is patched.
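On the question of why encryption is involved at all: the traceback passes through the reader's decrypt_object, which suggests that at least one of the input PDFs is itself encrypted. To narrow down which files those are, a crude standard-library check can help. This is only a sketch under that assumption: it scans the raw bytes for an /Encrypt key, which can give false positives, and the function names looks_encrypted and split_by_encryption are invented for illustration.

```python
import glob


def looks_encrypted(path):
    """Crude heuristic: True if the raw file bytes contain an /Encrypt key."""
    with open(path, "rb") as fh:
        return b"/Encrypt" in fh.read()


def split_by_encryption(pattern):
    """Split the files matching a glob pattern into (plain, encrypted) lists."""
    plain, encrypted = [], []
    for path in sorted(glob.glob(pattern)):
        (encrypted if looks_encrypted(path) else plain).append(path)
    return plain, encrypted
```

For example, calling split_by_encryption with the same glob pattern used in the question and merging only the first list would show whether the error is tied to the encrypted inputs.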

Anaconda Pandas breaks on reading hdf file on Python 3.6.x

I am using an Anaconda environment with Python 3.6.8, created with conda create -n temp pandas pytables h5py python=3.6.8. When I try to read a .h5 file like:
f = pd.read_hdf(filename, key)
I get a ValueError exception:
Traceback (most recent call last):
File "read_data.py", line 6, in <module>
f = pd.read_hdf(filename, key)
File "/home/fauzanzaid/anaconda3/envs/temp/lib/python3.6/site-packages/pandas/io/pytables.py", line 394, in read_hdf
return store.select(key, auto_close=auto_close, **kwargs)
File "/home/fauzanzaid/anaconda3/envs/temp/lib/python3.6/site-packages/pandas/io/pytables.py", line 741, in select
return it.get_result()
File "/home/fauzanzaid/anaconda3/envs/temp/lib/python3.6/site-packages/pandas/io/pytables.py", line 1483, in get_result
results = self.func(self.start, self.stop, where)
File "/home/fauzanzaid/anaconda3/envs/temp/lib/python3.6/site-packages/pandas/io/pytables.py", line 734, in func
columns=columns)
File "/home/fauzanzaid/anaconda3/envs/temp/lib/python3.6/site-packages/pandas/io/pytables.py", line 2928, in read
ax = self.read_index('axis%d' % i, start=_start, stop=_stop)
File "/home/fauzanzaid/anaconda3/envs/temp/lib/python3.6/site-packages/pandas/io/pytables.py", line 2523, in read_index
_, index = self.read_index_node(getattr(self.group, key), **kwargs)
File "/home/fauzanzaid/anaconda3/envs/temp/lib/python3.6/site-packages/pandas/io/pytables.py", line 2621, in read_index_node
data = node[start:stop]
File "/home/fauzanzaid/anaconda3/envs/temp/lib/python3.6/site-packages/tables/vlarray.py", line 685, in __getitem__
return self.read(start, stop, step)
File "/home/fauzanzaid/anaconda3/envs/temp/lib/python3.6/site-packages/tables/vlarray.py", line 821, in read
listarr = self._read_array(start, stop, step)
File "tables/hdf5extension.pyx", line 2155, in tables.hdf5extension.VLArray._read_array
ValueError: cannot set WRITEABLE flag to True of this array
This problem goes away if I use an environment with Python 3.7 or 3.5. However, I need to use Python 3.6.
How can I resolve this error?
I downgraded numpy to 1.14.3 with the command below, and it worked for me:
pip3 install numpy==1.14.3

compute() in dask not working

I am trying a simple parallel computation in Dask.
This is my code.
import time
import dask as dask
import dask.distributed as distributed
import dask.dataframe as dd
import dask.delayed as delayed
from dask.distributed import Client,progress
client = Client('localhost:8786')
df = dd.read_csv('file.csv')
ddf = df.groupby(['col1'])[['col2']].sum()
ddf = ddf.compute()
print(ddf)
It looks fine according to the documentation, but when I run it I get this:
Traceback (most recent call last):
File "dask_prg1.py", line 17, in <module>
ddf = ddf.compute()
File "/usr/local/lib/python2.7/site-packages/dask/base.py", line 156, in compute
(result,) = compute(self, traverse=False, **kwargs)
File "/usr/local/lib/python2.7/site-packages/dask/base.py", line 402, in compute
results = schedule(dsk, keys, **kwargs)
File "/usr/local/lib/python2.7/site-packages/distributed/client.py", line 2159, in get
direct=direct)
File "/usr/local/lib/python2.7/site-packages/distributed/client.py", line 1562, in gather
asynchronous=asynchronous)
File "/usr/local/lib/python2.7/site-packages/distributed/client.py", line 652, in sync
return sync(self.loop, func, *args, **kwargs)
File "/usr/local/lib/python2.7/site-packages/distributed/utils.py", line 275, in sync
six.reraise(*error[0])
File "/usr/local/lib/python2.7/site-packages/distributed/utils.py", line 260, in f
result[0] = yield make_coro()
File "/usr/local/lib/python2.7/site-packages/tornado/gen.py", line 1099, in run
value = future.result()
File "/usr/local/lib/python2.7/site-packages/tornado/concurrent.py", line 260, in result
raise_exc_info(self._exc_info)
File "/usr/local/lib/python2.7/site-packages/tornado/gen.py", line 1107, in run
yielded = self.gen.throw(*exc_info)
File "/usr/local/lib/python2.7/site-packages/distributed/client.py", line 1439, in _gather
traceback)
File "/usr/local/lib/python2.7/site-packages/dask/bytes/core.py", line 122, in read_block_from_file
with lazy_file as f:
File "/usr/local/lib/python2.7/site-packages/dask/bytes/core.py", line 166, in __enter__
f = SeekableFile(self.fs.open(self.path, mode=mode))
File "/usr/local/lib/python2.7/site-packages/dask/bytes/local.py", line 58, in open
return open(self._normalize_path(path), mode=mode)
IOError: [Errno 2] No such file or directory: 'file.csv'
I don't understand what is wrong. Kindly help me with this. Thank you in advance.
You may wish to pass the absolute file path to read_csv. The reason is that you are giving the work of opening and reading the file to a dask worker, and you might not have started that worker with the same working directory as your script/session.
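A minimal sketch of that suggestion, using only the standard library to build the absolute path before handing it to read_csv. The helper name absolute_path is made up for illustration, and the sketch assumes the workers can see the same filesystem as the script; if they run on other machines, the file must exist at that path there too.

```python
import os


def absolute_path(relative_name):
    """Resolve a path against the working directory of this process."""
    return os.path.abspath(relative_name)


csv_path = absolute_path("file.csv")
print(csv_path)
# The resolved path would then be passed on, e.g. dd.read_csv(csv_path)
```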

Twisted ValueError: Unknown ECC curve on Raspbian Stretch

I want to use my Raspberry Pi 3, running Raspbian Stretch, for a web scraping project. For Python I use the berryconda distribution.
When I run my spider, I get
ValueError: Unknown ECC curve
On my laptop (Xubuntu 16.04) everything runs fine. Maybe I need to install an additional library or something?
The full traceback is below.
Traceback (most recent call last):
File "/home/pi/berryconda3/lib/python3.6/site-packages/twisted/internet/defer.py", line 1384, in _inlineCallbacks
result = result.throwExceptionIntoGenerator(g)
File "/home/pi/berryconda3/lib/python3.6/site-packages/twisted/python/failure.py", line 393, in throwExceptionIntoGenerator
return g.throw(self.type, self.value, self.tb)
File "/home/pi/berryconda3/lib/python3.6/site-packages/scrapy/core/downloader/middleware.py", line 43, in process_request
defer.returnValue((yield download_func(request=request,spider=spider)))
File "/home/pi/berryconda3/lib/python3.6/site-packages/scrapy/utils/defer.py", line 45, in mustbe_deferred
result = f(*args, **kw)
File "/home/pi/berryconda3/lib/python3.6/site-packages/scrapy/core/downloader/handlers/__init__.py", line 65, in download_request
return handler.download_request(request, spider)
File "/home/pi/berryconda3/lib/python3.6/site-packages/scrapy/core/downloader/handlers/http11.py", line 63, in download_request
return agent.download_request(request)
File "/home/pi/berryconda3/lib/python3.6/site-packages/scrapy/core/downloader/handlers/http11.py", line 300, in download_request
method, to_bytes(url, encoding='ascii'), headers, bodyproducer)
File "/home/pi/berryconda3/lib/python3.6/site-packages/twisted/web/client.py", line 1633, in request
endpoint = self._getEndpoint(parsedURI)
File "/home/pi/berryconda3/lib/python3.6/site-packages/twisted/web/client.py", line 1617, in _getEndpoint
return self._endpointFactory.endpointForURI(uri)
File "/home/pi/berryconda3/lib/python3.6/site-packages/twisted/web/client.py", line 1494, in endpointForURI
uri.port)
File "/home/pi/berryconda3/lib/python3.6/site-packages/scrapy/core/downloader/contextfactory.py", line 59, in creatorForNetloc
return ScrapyClientTLSOptions(hostname.decode("ascii"), self.getContext())
File "/home/pi/berryconda3/lib/python3.6/site-packages/scrapy/core/downloader/contextfactory.py", line 56, in getContext
return self.getCertificateOptions().getContext()
File "/home/pi/berryconda3/lib/python3.6/site-packages/scrapy/core/downloader/contextfactory.py", line 51, in getCertificateOptions
acceptableCiphers=DEFAULT_CIPHERS)
File "/home/pi/berryconda3/lib/python3.6/site-packages/twisted/python/deprecate.py", line 792, in wrapped
return wrappee(*args, **kwargs)
File "/home/pi/berryconda3/lib/python3.6/site-packages/twisted/internet/_sslverify.py", line 1595, in __init__
self._ecCurve = _OpenSSLECCurve(_defaultCurveName)
File "/home/pi/berryconda3/lib/python3.6/site-packages/twisted/internet/_sslverify.py", line 1744, in __init__
raise ValueError("Unknown ECC curve.")
I dropped berryconda and pip installed scrapy. If you're getting this error on Jessie, moving to Stretch gives you access to the newer OpenSSL libraries, which contain the missing pieces.
After I upgraded to Stretch, I removed berryconda from my PATH and pip uninstalled cryptography, twisted, pyopenssl, and scrapy.
Then, with the no-cache option (--no-cache-dir), I pip installed scrapy, which brought all those packages back, and now my spider is running.

matplotlib pgf: OSError: No such file or directory in subprocess.py

I am trying to use matplotlib to create a pgf file for LaTeX:
from matplotlib.pyplot import subplots
from numpy import linspace
x = linspace(0, 100, 30)
fig, ax = subplots(figsize = (10, 6))
ax.scatter(x, x)
fig.tight_layout()
fig.savefig('/home/mark/dicp/python/figure.pgf')
But I get OSError: [Errno 2] No such file or directory:
Traceback (most recent call last):
File "visualize/latex_figs.py", line 32, in <module>
fig.savefig('/home/mark/dicp/python/figure.pgf')
File "/usr/local/lib/python2.7/dist-packages/matplotlib/figure.py", line 1421, in savefig
self.canvas.print_figure(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/matplotlib/backend_bases.py", line 2220, in print_figure
**kwargs)
File "/usr/local/lib/python2.7/dist-packages/matplotlib/backend_bases.py", line 1957, in print_pgf
return pgf.print_pgf(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/matplotlib/backends/backend_pgf.py", line 818, in print_pgf
self._print_pgf_to_fh(fh, *args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/matplotlib/backends/backend_pgf.py", line 797, in _print_pgf_to_fh
RendererPgf(self.figure, fh),
File "/usr/local/lib/python2.7/dist-packages/matplotlib/backends/backend_pgf.py", line 409, in __init__
self.latexManager = LatexManagerFactory.get_latex_manager()
File "/usr/local/lib/python2.7/dist-packages/matplotlib/backends/backend_pgf.py", line 223, in get_latex_manager
new_inst = LatexManager()
File "/usr/local/lib/python2.7/dist-packages/matplotlib/backends/backend_pgf.py", line 305, in __init__
cwd=self.tmpdir)
File "/usr/lib/python2.7/subprocess.py", line 679, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1249, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
It also generates this part of the output file:
%% [whole bunch of comments]
\begingroup%
\makeatletter%
\begin{pgfpicture}%
\pgfpathrectangle{\pgfpointorigin}{\pgfqpoint{10.000000in}{6.000000in}}%
\pgfusepath{use as bounding box}%
I do not understand what OSError: No such file or directory in subprocess.py has to do with anything... The location I'm trying to save the file to is writable. Am I misunderstanding something, or is this a bug I should report?
I also had this problem while trying to run the example scripts. The problem occurs where backend_pgf.py first tries to run the default LaTeX command. It seems that the PGF backend assumes xelatex by default, so the OSError means the xelatex executable could not be found, not your output file. If the problem is the same for you as it was for me, then you have two options:
Add the key "pgf.texsystem" : "pdflatex" (or lualatex, whatever) to your matplotlib.rcParams. For example, add the following snippet to the top of your script:
import matplotlib
pgf_with_rc_fonts = {"pgf.texsystem": "pdflatex"}
matplotlib.rcParams.update(pgf_with_rc_fonts)
Alternatively, ensure that you have xelatex installed and on your PATH, and keep it as the default LaTeX command (assuming you're on a Mac or Linux system, running which xelatex should return a path).
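Both options come down to which TeX engine is actually installed. As a rough sketch, the check can be done from Python with the standard library before touching rcParams. The helper name first_available_tex is hypothetical, and the candidate list assumes the three engine names mentioned above.

```python
import shutil

# Engine names discussed above for the "pgf.texsystem" setting.
TEX_ENGINES = ("xelatex", "lualatex", "pdflatex")


def first_available_tex(candidates=TEX_ENGINES):
    """Return the first candidate found on PATH, or None if none is found."""
    for name in candidates:
        if shutil.which(name):
            return name
    return None


engine = first_available_tex()
if engine is None:
    print("No TeX engine found on PATH; install one before saving .pgf files.")
else:
    print("Found", engine)
    # Before creating any figure, the result could be applied with:
    # matplotlib.rcParams.update({"pgf.texsystem": engine})
```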