Running IJulia on Conda. Trying to plot Quandl data - matplotlib

This is a simple example that loads Google prices in DataFrame format. The Gadfly plot call fails with the following error: TypeError(u'There is no Line2D property "y"',). The traceback also references matplotlib for some reason.
Here's the code:
using Quandl
using DataFrames
google = quandl("GOOG/NASDAQ_QQQ", format = "DataFrame")
date = google[1]
dt_str = Array(Any,length(date))
for i=1:length(date)
    dt_str[i] = string(date[i]);
end
price = google[5]
using Gadfly
set_default_plot_size(20cm, 10cm)
p1 = plot(x=dt_str, y=price,
          Geom.point,
          Geom.smooth(method=:lm),
          Guide.xticks(ticks=[1:25]),
          Guide.yticks(ticks=[1:25]),
          Guide.xlabel("Date"),
          Guide.ylabel("Price"),
          Guide.title("Google: Close Price"))
LoadError: PyError (:PyObject_Call)
TypeError(u'There is no Line2D property "y"',)
File "C:\Anaconda2\lib\site-packages\matplotlib\pyplot.py", line 3154, in plot
ret = ax.plot(*args, **kwargs)
File "C:\Anaconda2\lib\site-packages\matplotlib\__init__.py", line 1811, in inner
return func(ax, *args, **kwargs)
File "C:\Anaconda2\lib\site-packages\matplotlib\axes\_axes.py", line 1424, in plot
for line in self._get_lines(*args, **kwargs):
File "C:\Anaconda2\lib\site-packages\matplotlib\axes\_base.py", line 395, in _grab_next_args
for seg in self._plot_args(remaining[:isplit], kwargs):
File "C:\Anaconda2\lib\site-packages\matplotlib\axes\_base.py", line 374, in _plot_args
seg = func(x[:, j % ncx], y[:, j % ncy], kw, kwargs)
File "C:\Anaconda2\lib\site-packages\matplotlib\axes\_base.py", line 281, in _makeline
self.set_lineprops(seg, **kwargs)
File "C:\Anaconda2\lib\site-packages\matplotlib\axes\_base.py", line 189, in set_lineprops
line.set(**kwargs)
File "C:\Anaconda2\lib\site-packages\matplotlib\artist.py", line 936, in set
(self.__class__.__name__, k))
while loading In[64], in expression starting on line 1
in getindex at C:\Users\yburkitbayev\.julia\v0.4\PyCall\src\PyCall.jl:239

The PyError suggests that PyPlot was loaded before Gadfly in the session where this example was executed. Both PyPlot and Gadfly export the plot function, so in a session where both packages are loaded, calls to plot must be qualified with the package name (e.g. PyPlot.plot or Gadfly.plot).
Executing your example in a session where PyPlot has not been loaded, but Gadfly is loaded, produces a Gadfly plot without displaying the error message provided in your post.

Related

SentenceTransformers throwing KeyError on pandas Series

I'm using the following simplified code:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')
embeddings = model.encode(sentences)
where sentences is a pandas Series containing the sentences I want to transform.
I then get the following traceback:
embeddings = model.encode(sentences)
File "/anaconda/envs/topics/lib/python3.8/site-packages/sentence_transformers/SentenceTransformer.py", line 157, in encode
sentences_sorted = [sentences[idx] for idx in length_sorted_idx]
File "/anaconda/envs/topics/lib/python3.8/site-packages/sentence_transformers/SentenceTransformer.py", line 157, in <listcomp>
sentences_sorted = [sentences[idx] for idx in length_sorted_idx]
File "/anaconda/envs/topics/lib/python3.8/site-packages/pandas/core/series.py", line 942, in __getitem__
return self._get_value(key)
File "/anaconda/envs/topics/lib/python3.8/site-packages/pandas/core/series.py", line 1051, in _get_value
loc = self.index.get_loc(label)
File "/anaconda/envs/topics/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3363, in get_loc
raise KeyError(key) from err
KeyError: 144
The actual solution was to convert the pandas Series to a numpy array:
sentences_array = sentences.to_numpy()
encode() can return the embeddings either as a tensor or as a NumPy array, so request the form you need explicitly:
embeddings = model.encode(sentences, convert_to_tensor=True)
or
embeddings = model.encode(sentences, convert_to_numpy=True)
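Why the NumPy conversion mentioned above helps can be shown without the model. The traceback shows encode evaluating sentences[idx] with positional integers, while a pandas Series looks up integer keys by label. The sketch below assumes (the question does not say) that the Series index has gaps, for example after rows were filtered out; the sample data are made up for illustration.

```python
import pandas as pd

# Hypothetical data: a Series whose integer index has gaps (labels 0, 2, 5).
sentences = pd.Series(
    ["first sentence", "second sentence", "third sentence"],
    index=[0, 2, 5],
)

# Integer keys are treated as labels here. Label 1 does not exist, so this
# raises KeyError, much like sentences[idx] in the traceback above.
try:
    sentences[1]
    raised_key_error = False
except KeyError:
    raised_key_error = True

# A NumPy array has no labels, so index 1 simply means "second element".
sentences_array = sentences.to_numpy()
second = sentences_array[1]
```

Passing sentences_array (or sentences.tolist()) to model.encode would then sidestep the label lookup.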

What do I need in order to save animation videos from matplotlib in mp3 format?

I am using python3.8 on Linux Mint 19.3, and I am trying to save an animation created by a cellular automata model in matplotlib. My actual code for the model is private, but it uses the same code for saving the animation as the code shown below, which is a slight modification of one of the examples shown in the official matplotlib documentation:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
fig, ax = plt.subplots()
def f(x, y):
    return np.sin(x) + np.cos(y)
x = np.linspace(0, 2 * np.pi, 120)
y = np.linspace(0, 2 * np.pi, 100).reshape(-1, 1)
fig, ax = plt.subplots()
ims = []
for i in range(60):
    x += np.pi / 15.
    y += np.pi / 20.
    im = ax.imshow(f(x, y), animated=True)
    if i == 0:
        ax.imshow(f(x, y))  # show an initial one first
    ims.append([im])
ani = animation.ArtistAnimation(fig, ims, interval=50, blit=True,
                                repeat_delay=1000)
# To save the animation, use e.g.
#
# ani.save("movie.mp4")
#
# or
#
writer = animation.FFMpegWriter(fps=15, metadata=dict(artist='Me'), bitrate=1800)
ani.save("movie.mp3", writer=writer)
When executed, the code produces this error:
MovieWriter stderr:
Output file #0 does not contain any stream
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/matplotlib/animation.py", line 234, in saving
yield self
File "/usr/local/lib/python3.8/dist-packages/matplotlib/animation.py", line 1093, in save
writer.grab_frame(**savefig_kwargs)
File "/usr/local/lib/python3.8/dist-packages/matplotlib/animation.py", line 351, in grab_frame
self.fig.savefig(self._proc.stdin, format=self.frame_format,
File "/usr/local/lib/python3.8/dist-packages/matplotlib/figure.py", line 3046, in savefig
self.canvas.print_figure(fname, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/matplotlib/backend_bases.py", line 2319, in print_figure
result = print_method(
File "/usr/local/lib/python3.8/dist-packages/matplotlib/backend_bases.py", line 1648, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/matplotlib/_api/deprecation.py", line 415, in wrapper
return func(*inner_args, **inner_kwargs)
File "/usr/local/lib/python3.8/dist-packages/matplotlib/backends/backend_agg.py", line 486, in print_raw
fh.write(renderer.buffer_rgba())
BrokenPipeError: [Errno 32] Broken pipe
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/justin/animation_test.py", line 36, in <module>
ani.save("movie.mp3", writer=writer)
File "/usr/local/lib/python3.8/dist-packages/matplotlib/animation.py", line 1093, in save
writer.grab_frame(**savefig_kwargs)
File "/usr/lib/python3.8/contextlib.py", line 131, in __exit__
self.gen.throw(type, value, traceback)
File "/usr/local/lib/python3.8/dist-packages/matplotlib/animation.py", line 236, in saving
self.finish()
File "/usr/local/lib/python3.8/dist-packages/matplotlib/animation.py", line 342, in finish
self._cleanup() # Inline _cleanup() once cleanup() is removed.
File "/usr/local/lib/python3.8/dist-packages/matplotlib/animation.py", line 373, in _cleanup
raise subprocess.CalledProcessError(
subprocess.CalledProcessError: Command '['ffmpeg', '-f', 'rawvideo', '-vcodec', 'rawvideo', '-s', '640x480', '-pix_fmt', 'rgba', '-r', '15', '-loglevel', 'error', '-i', 'pipe:', '-vcodec', 'h264', '-pix_fmt', 'yuv420p', '-b', '1800k', '-metadata', 'artist=Me', '-y', 'movie.mp3']' returned non-zero exit status 1.
I have looked at posts on similar queries concerning matplotlib animations, but none have specifically included the error Output file #0 does not contain any stream. I have little experience with ffmpeg, so I am wondering what might be missing.

Can anyone tell me the solution of this ValueError in MatPlotLib?

An error occurs when I plot a random scatter plot. Can someone help me solve it?
The code is:
# Date 27-06-2021
import time
import matplotlib.pyplot as plt
import numpy as np
import random as rd
def rd_color():
    random_number = rd.randint(0, 16777215)
    hex_number = str(hex(random_number))
    hex_number = '#' + hex_number[2:]
    return hex_number
arr1_for_x = np.linspace(10, 99, 1000)
arr1_for_y = np.random.uniform(10, 99, 1000)
# print(rd_color())
for i in range(1000):
    plt.scatter(arr1_for_x[i:i+1], arr1_for_y[i:i+1], s=5,
                linewidths=0, color=rd_color())
plt.show()
and the ValueError is
ValueError: 'color' kwarg must be a color or sequence of color specs. For a sequence of values to be color-mapped, use the 'c' argument instead.
Traceback (most recent call last):
File "C:\Users\amanr\AppData\Local\Programs\Python\Python39\lib\site-packages\matplotlib\axes\_axes.py", line 4289, in _parse_scatter_color_args
mcolors.to_rgba_array(kwcolor)
File "C:\Users\amanr\AppData\Local\Programs\Python\Python39\lib\site-packages\matplotlib\colors.py", line 367, in to_rgba_array
raise ValueError("Using a string of single character colors as "
ValueError: Using a string of single character colors as a color sequence is not supported. The colors can be passed as an explicit list instead.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "c:\Google Drive\tp\Programming\Python\tp2.py", line 24, in <module>
plt.scatter(arr1_for_x[i:i+1], arr1_for_y[i:i+1], s=5,
File "C:\Users\amanr\AppData\Local\Programs\Python\Python39\lib\site-packages\matplotlib\pyplot.py", line 3068, in scatter
__ret = gca().scatter(
File "C:\Users\amanr\AppData\Local\Programs\Python\Python39\lib\site-packages\matplotlib\__init__.py", line 1361, in inner
return func(ax, *map(sanitize_sequence, args), **kwargs)
File "C:\Users\amanr\AppData\Local\Programs\Python\Python39\lib\site-packages\matplotlib\axes\_axes.py", line 4516, in scatter
self._parse_scatter_color_args(
File "C:\Users\amanr\AppData\Local\Programs\Python\Python39\lib\site-packages\matplotlib\axes\_axes.py", line 4291, in _parse_scatter_color_args
raise ValueError(
ValueError: 'color' kwarg must be an color or sequence of color specs. For a sequence of values to be color-mapped, use the 'c' argument instead.
After this error I also tried c in place of color:
# Date 27-06-2021
import time
import matplotlib.pyplot as plt
import numpy as np
import random as rd
def rd_color():
    random_number = rd.randint(0, 16777215)
    hex_number = str(hex(random_number))
    hex_number = '#' + hex_number[2:]
    return str(hex_number)
arr1_for_x = np.linspace(10, 99, 1000)
arr1_for_y = np.random.uniform(10, 99, 1000)
# print(rd_color())
for i in range(1000):
    plt.scatter(arr1_for_x[i:i+1], arr1_for_y[i:i+1], s=5,
                linewidths=0, c=rd_color())
plt.show()
but an error occurred again. This time the error is:
Traceback (most recent call last):
File "C:\Users\amanr\AppData\Local\Programs\Python\Python39\lib\site-packages\matplotlib\axes\_axes.py", line 4350, in _parse_scatter_color_args
colors = mcolors.to_rgba_array(c)
File "C:\Users\amanr\AppData\Local\Programs\Python\Python39\lib\site-packages\matplotlib\colors.py", line 367, in to_rgba_array
raise ValueError("Using a string of single character colors as "
ValueError: Using a string of single character colors as a color sequence is not supported. The colors can be passed as an explicit list instead.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "c:\Google Drive\tp\Programming\Python\tp2.py", line 22, in <module>
plt.scatter(arr1_for_x[i:i+1], arr1_for_y[i:i+1], s=5,
File "C:\Users\amanr\AppData\Local\Programs\Python\Python39\lib\site-packages\matplotlib\pyplot.py", line 3068, in scatter
__ret = gca().scatter(
File "C:\Users\amanr\AppData\Local\Programs\Python\Python39\lib\site-packages\matplotlib\__init__.py", line 1361, in inner
return func(ax, *map(sanitize_sequence, args), **kwargs)
File "C:\Users\amanr\AppData\Local\Programs\Python\Python39\lib\site-packages\matplotlib\axes\_axes.py", line 4516, in scatter
self._parse_scatter_color_args(
File "C:\Users\amanr\AppData\Local\Programs\Python\Python39\lib\site-packages\matplotlib\axes\_axes.py", line 4359, in _parse_scatter_color_args
raise ValueError(
ValueError: 'c' argument must be a color, a sequence of colors, or a sequence of numbers, not #814aa
Could anyone please explain this and help me resolve it?
The way I read Matplotlib's Tutorial on Specifying Colours, the hex literals need to have exactly 6 or 3 "hex digits".
This will work:
for i in range(1000):
    myColor = rd_color()
    plt.scatter(arr1_for_x[i:i+1], arr1_for_y[i:i+1], s=5,
                linewidths=0, color=[myColor])
plt.show()
Here the color is stored in a variable and passed to the color parameter as a single-element list, so Matplotlib treats the string as one color instead of trying to interpret it as a sequence of single-character colors. The string returned by rd_color() must still be a valid color.
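On the digit-count point: hex() does not zero-pad, so rd_color() as written in the question can return fewer than six digits (the #814aa in the error message has only five). A minimal sketch of a padded variant follows, assuming the six-digit '#rrggbb' form is what is wanted. The function name is kept from the question, and the format spec 06x is standard Python string formatting.

```python
import random

def rd_color():
    """Return a random colour as a '#rrggbb' string with exactly six hex digits."""
    random_number = random.randint(0, 0xFFFFFF)  # same range as 0..16777215
    # 06x: lowercase hex, zero-padded on the left to a width of six.
    return f"#{random_number:06x}"

# The value from the error message, padded the same way.
padded = f"#{0x814aa:06x}"

colors = [rd_color() for _ in range(5)]
```

With strings of this shape, either color=rd_color() or c=[rd_color()] should be accepted by plt.scatter.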

How to avoid set_index on a pre-sorted DataFrame constructed with from_delayed?

I am trying to get the expression df.resample('1T', how='mean').sum() to work in Dask, but I am running into an issue: Dask seems to need me to explicitly call set_index on the DataFrame before performing the resample. I get the error below:
>>> c.gather(df).compute()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.7/site-packages/distributed/client.py", line 1508, in gather
asynchronous=asynchronous)
File "/usr/local/lib/python2.7/site-packages/distributed/client.py", line 615, in sync
return sync(self.loop, func, *args, **kwargs)
File "/usr/local/lib/python2.7/site-packages/distributed/utils.py", line 253, in sync
six.reraise(*error[0])
File "/usr/local/lib/python2.7/site-packages/distributed/utils.py", line 238, in f
result[0] = yield make_coro()
File "/usr/local/lib64/python2.7/site-packages/tornado/gen.py", line 1055, in run
value = future.result()
File "/usr/local/lib64/python2.7/site-packages/tornado/concurrent.py", line 238, in result
raise_exc_info(self._exc_info)
File "/usr/local/lib64/python2.7/site-packages/tornado/gen.py", line 1063, in run
yielded = self.gen.throw(*exc_info)
File "/usr/local/lib/python2.7/site-packages/distributed/client.py", line 1385, in _gather
traceback)
File "/usr/local/lib/python2.7/site-packages/dask/dataframe/core.py", line 1633, in resample
return _resample(self, rule, how=how, closed=closed, label=label)
File "/usr/local/lib/python2.7/site-packages/dask/dataframe/tseries/resample.py", line 33, in _resample
return getattr(resampler, how)()
File "/usr/local/lib/python2.7/site-packages/dask/dataframe/tseries/resample.py", line 151, in mean
return self._agg('mean')
File "/usr/local/lib/python2.7/site-packages/dask/dataframe/tseries/resample.py", line 126, in _agg
meta_r = self.obj._meta_nonempty.resample(self._rule, **self._kwargs)
File "/usr/local/lib64/python2.7/site-packages/pandas/core/generic.py", line 7104, in resample
base=base, key=on, level=level)
File "/usr/local/lib64/python2.7/site-packages/pandas/core/resample.py", line 1148, in resample
return tg._get_resampler(obj, kind=kind)
File "/usr/local/lib64/python2.7/site-packages/pandas/core/resample.py", line 1276, in _get_resampler
"but got an instance of %r" % type(ax).__name__)
TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Index'
Below is the Python code I am using. Since the pandas DataFrames returned by my delayed objects are already timestamp-indexed, I expected Dask to infer or construct an index from those timestamp indices instead of requiring me to set one explicitly. I am also unsure how an explicit set_index would be called in this case (what arguments should be passed?). Setting a pd.DatetimeIndex on the meta DataFrame (the commented-out line below) works. Is constructing the index by hand and feeding it to meta the only realistic way to do this, or am I missing something?
#! /usr/bin/env python
# Start dask scheduler and workers
# dask-scheduler &
# dask-worker --nthreads 1 --nprocs 6 --memory-limit 3GB localhost:8786 --local-directory /dev/shm &
from dask.distributed import Client
from dask.delayed import delayed
import pandas as pd
import numpy as np
import dask.dataframe as dd
import time
c = Client('127.0.0.1:8786')
def load(epoch):
    # 1525132800 - 1/5
    # 1527811200 - 1/6
    num_ts=100
    idx = []
    for ts in range(0, 86400, 15):
        idx.append(epoch + ts)
    d = np.random.rand(86400/15, num_ts)
    ts = []
    for i in range(0, num_ts):
        # tsname = "ts_%s_%s" % (i, epoch)
        tsname = "ts_%s" % (i)
        ts.append(tsname)
        gts.append(tsname)
    res = pd.DataFrame(index=idx, data=d, columns=ts, dtype=np.float64)
    res.index = pd.to_datetime(arg=res.index, unit='s')
    return res
gts = []
load(1525132800)
print time.time()
i = pd.DatetimeIndex(start=1525132800, freq='15S', end=1527811185, dtype='datetime64[s]')
# meta = pd.DataFrame(index=i, data=[], columns=gts, dtype=np.float64)
meta = pd.DataFrame(index=[], data=[], columns=gts, dtype=np.float64)
dfs = [delayed(load)(fn) for fn in range(1525132800, 1527811200, 86400)]
print time.time()
df = dd.from_delayed(dfs, meta, 'sorted')
print time.time()
df.npartitions
df.divisions
print time.time()
df = c.submit(dd.DataFrame.resample, df, rule='1T', how='mean')
print time.time()
#df = c.submit(dd.DataFrame.sum, df, axis=1)
print time.time()
c.gather(df).compute()
print time.time()
#c.gather(df).visualize(filename='/usr/share/nginx/html/svg/df4.svg')
Dask uses the meta of a data-frame to infer the data types before computing any of the chunks of data. In your case, your chunks contain datetime indexes, but the meta doesn't. The meta should be a zero-length version of the data:
meta = pd.DataFrame(index=i[:0], data=[], columns=gts, dtype=np.float64)
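To illustrate the point about meta, here is a small pandas-only sketch. The column names are placeholders standing in for gts, and the empty index is built with pd.DatetimeIndex([]) rather than by slicing i as in the line above; both are assumptions made to keep the example self-contained.

```python
import numpy as np
import pandas as pd

# Placeholder column names standing in for gts from the question.
columns = ["ts_0", "ts_1"]

# Meta as in the question: an empty list gives a generic Index, which is what
# the "got an instance of 'Index'" message in the traceback complains about.
bad_meta = pd.DataFrame(index=[], columns=columns, dtype=np.float64)

# A zero-length DatetimeIndex keeps the index type without holding any rows.
good_meta = pd.DataFrame(
    index=pd.DatetimeIndex([]), columns=columns, dtype=np.float64
)

bad_is_datetime = isinstance(bad_meta.index, pd.DatetimeIndex)
good_is_datetime = isinstance(good_meta.index, pd.DatetimeIndex)
```

Both frames are empty and have the same columns; only the index type differs, and that type is what resample inspects.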

Beginner's questions about Matplotlib OO Api

I am new to Matplotlib and am struggling a bit to differentiate between the OO and pyplot interfaces. I’m actually working with the Kivy GUI framework and trying to plot 4 subplots on a single figure, to be displayed by Kivy. Here’s a snippet of my code:
def create_plot(self):
    self.fig, ((self.ax0, self.ax1), (self.ax2, self.ax3)) = plt.subplots(nrows=2, ncols=2)
    self.ax0.set_title("A")
    self.ax0.grid(True, lw = 2, ls = '--', c = '.75')
    self.ax1.set_title("B")
    self.ax1.grid(True, lw = 2, ls = '--', c = '.75')
    self.ax2.set_title("C")
    self.ax2.grid(True, lw = 2, ls = '--', c = '.75')
    self.ax3.set_title("D")
    self.ax3.grid(True, lw = 2, ls = '--', c = '.75')
    #plt.tight_layout()
    plt.show()
    canvas = self.fig.canvas
    self.add_widget(canvas)
What worries me is that I am calling plt methods and assigning the results to my objects. Is plt the state machine interface and not the OO interface, or is this OK?
Secondly, I want to periodically update the plotted lines, so I have a plot method that does this:
def plot(self, xCoords, yCoords):
    if len(self.ax0.lines) > 0:
        self.ax0.lines.pop(0)
    line = self.ax0.plot(xCoords, yCoords, color='blue')
    canvas = self.fig.canvas
    canvas.draw()
Does that look ok? Can I just pop the existing line, or should I reuse the existing line?
Lastly, and most difficult, if I enable:
plt.tight_layout()
I get an exception:
C:\Kivy-1.9.0-py3.4-win32-x64\Python34\lib\site-packages\matplotlib\tight_layout.py:225: UserWarning: tight_layout : falling back to Agg renderer
warnings.warn("tight_layout : falling back to Agg renderer")
Traceback (most recent call last):
File "main.py", line 1117, in <module>
GuiApp().run()
File "C:\Kivy-1.9.0-py3.4-win32-x64\Python34\lib\site-packages\kivy\app.py", line 801, in run
self.load_kv(filename=self.kv_file)
File "C:\Kivy-1.9.0-py3.4-win32-x64\Python34\lib\site-packages\kivy\app.py", line 598, in load_kv
root = Builder.load_file(rfilename)
File "C:\Kivy-1.9.0-py3.4-win32-x64\Python34\lib\site-packages\kivy\lang.py", line 1801, in load_file
return self.load_string(data, **kwargs)
File "C:\Kivy-1.9.0-py3.4-win32-x64\Python34\lib\site-packages\kivy\lang.py", line 1880, in load_string
self._apply_rule(widget, parser.root, parser.root)
File "C:\Kivy-1.9.0-py3.4-win32-x64\Python34\lib\site-packages\kivy\lang.py", line 2038, in _apply_rule
self._apply_rule(child, crule, rootrule)
File "C:\Kivy-1.9.0-py3.4-win32-x64\Python34\lib\site-packages\kivy\lang.py", line 2037, in _apply_rule
self.apply(child)
File "C:\Kivy-1.9.0-py3.4-win32-x64\Python34\lib\site-packages\kivy\lang.py", line 1924, in apply
self._apply_rule(widget, rule, rule)
File "C:\Kivy-1.9.0-py3.4-win32-x64\Python34\lib\site-packages\kivy\lang.py", line 2038, in _apply_rule
self._apply_rule(child, crule, rootrule)
File "C:\Kivy-1.9.0-py3.4-win32-x64\Python34\lib\site-packages\kivy\lang.py", line 2038, in _apply_rule
self._apply_rule(child, crule, rootrule)
File "C:\Kivy-1.9.0-py3.4-win32-x64\Python34\lib\site-packages\kivy\lang.py", line 2035, in _apply_rule
child = cls(__no_builder=True)
File "C:\SVNProj\Raggio\trunk\hostconsole\gui\mygraph.py", line 127, in __init__
self.create_plot()
File "C:\SVNProj\Raggio\trunk\hostconsole\gui\mygraph.py", line 224, in create_plot
self.add_widget(canvas)
File "C:\Kivy-1.9.0-py3.4-win32-x64\Python34\lib\site-packages\kivy\uix\boxlayout.py", line 211, in add_widget
widget.bind(
AttributeError: 'FigureCanvasAgg' object has no attribute 'bind'
Can anyone help with that please?
Best regards
David
Original post
Regarding def create_plot:
plt.subplots() is just a wrapper around creating a new figure and adding subplots (axes) to that figure, so it is safe to use in any context, AFAIK.
Regarding updating lines vs. creating new lines:
Popping off the old line and creating a new one with a call to axes.plot() will work, but it will be slower than reusing the existing line artist and calling artist.set_data(x_data, y_data).
Regarding the tight_layout issue:
Consider calling self.fig.tight_layout() instead of plt.tight_layout(), as plt.tight_layout() might be grabbing the wrong figure.
Update based on discussion in comments:
Since you are plotting one line per panel, I would suggest adding these four lines to your create_plot method to save the line artists (Line2D objects) created by axes.plot():
self.line0, = self.ax0.plot([], [], 'b')
self.line1, = self.ax1.plot([], [], 'b')
self.line2, = self.ax2.plot([], [], 'b')
self.line3, = self.ax3.plot([], [], 'b')
Then in the plot method you can just do this:
self.line0.set_data(xCoords, yCoords)
Though I seem to be struggling to remember how to make ax0 automatically update its limits based on the xCoords and yCoords. I thought that it was as simple as self.ax0.relim(), but that is not working in my jupyter notebook right now. Hmm.
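On the limits question, one commonly used pattern is to follow relim() with autoscale_view(); this is offered as a guess at what was missing, not as the answerer's confirmed fix. The standalone sketch below uses a single Axes and the Agg backend so that it runs without a GUI. The update function and the sample coordinates are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
(line,) = ax.plot([], [], "b")  # create the line artist once, initially empty

def update(x_coords, y_coords):
    """Reuse the existing line and rescale the axes to the new data."""
    line.set_data(x_coords, y_coords)
    ax.relim()           # recompute the data limits from the current artists
    ax.autoscale_view()  # apply those limits to the visible view
    fig.canvas.draw()

update([0, 1, 2, 3], [10, 20, 15, 30])
xlim = ax.get_xlim()
ylim = ax.get_ylim()
```

In the class from the question, the same three calls would go into the plot method, using self.line0, self.ax0 and self.fig.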