How to turn off matplotlib quiver scaling?

The matplotlib.pyplot.quiver function takes a set of "origin" points and a set of "destination" points and then plots a bunch of arrows starting at the "origin" points, headed in the direction of the "destination" points. However, there is a scaling factor, so the arrows don't necessarily end AT the "destination" points; they simply point in that direction.
e.g.
import matplotlib.pyplot as plt
import numpy as np
pts = np.array([[1, 2], [3, 4]])
end_pts = np.array([[2, 4], [6, 8]])
plt.quiver(pts[:,0], pts[:,1], end_pts[:,0], end_pts[:,1])
Note that the vector in the bottom left starts at (1,2) (which I want), but does not end at (2,4). This is governed by a scale parameter to the quiver function that makes the arrow longer or shorter. How do I get the arrow to end at EXACTLY (2,4)?

The quiver documentation states:
"To plot vectors in the x-y plane, with u and v having the same units as x and y, use angles='xy', scale_units='xy', scale=1."
Note, however, that u and v are interpreted as displacements relative to each arrow's position (x, y), not as absolute end coordinates. Hence you need to take the difference between the end points and the start points first.
import matplotlib.pyplot as plt
import numpy as np
pts = np.array([[1, 2], [3, 4]])
end_pts = np.array([[2, 4], [6, 8]])
diff = end_pts - pts
plt.quiver(pts[:,0], pts[:,1], diff[:,0], diff[:,1],
           angles='xy', scale_units='xy', scale=1.)
plt.show()
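As a quick sanity check of the approach above, here is a minimal sketch that overlays the intended end points as red dots so the arrow tips can be compared against them. The 1-unit padding on the axis limits is an arbitrary choice made for this illustration, not something taken from the documentation.

```python
import matplotlib.pyplot as plt
import numpy as np

pts = np.array([[1, 2], [3, 4]])
end_pts = np.array([[2, 4], [6, 8]])
diff = end_pts - pts  # displacement from each start point to its end point

fig, ax = plt.subplots()
ax.quiver(pts[:, 0], pts[:, 1], diff[:, 0], diff[:, 1],
          angles='xy', scale_units='xy', scale=1)
# Mark the intended end points so the arrow tips can be compared visually
ax.plot(end_pts[:, 0], end_pts[:, 1], 'r.')
# Pad the view by 1 unit (arbitrary) around all start and end points
all_pts = np.vstack([pts, end_pts])
ax.set_xlim(all_pts[:, 0].min() - 1, all_pts[:, 0].max() + 1)
ax.set_ylim(all_pts[:, 1].min() - 1, all_pts[:, 1].max() + 1)
```

Call plt.show() afterwards to display the figure; each arrow tip should coincide with a red dot.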

Related

Matplotlib `fill_between`: Remove thin boundary

Consider the following code:
import matplotlib.pyplot as plt
import numpy as np
from pylab import *
graph_data = [[0, 1, 2, 3], [5, 8, 7, 9]]
x = range(len(graph_data[0]))
y = graph_data[1]
fig, ax = plt.subplots()
alpha = 0.5
plt.plot(x, y, '-o',markersize=3, color=[1., alpha, alpha], markeredgewidth=0.0)
ax.fill_between(x, 0, y, facecolor=[1., alpha, alpha], interpolate=False)
plt.show()
filename = 'test1.pdf'
fig.savefig(filename, bbox_inches='tight')
It works fine. However, when I zoom in on the generated PDF, I can see two thin gray/black boundaries that separate the line:
I can see this when viewing in both Edge and Chrome. My question is, how can I get rid of the boundaries?
UPDATE: I forgot to mention that I was using Sage to generate the graph. It now seems to be a problem specific to Sage (and not to Python in general): this time I used native Python and got the correct result.
I could not reproduce it, but you could try not plotting the line, i.e. using 'o' instead of '-o' so that only the markers are drawn.
import matplotlib.pyplot as plt
import numpy as np
from pylab import *
graph_data = [[0, 1, 2, 3], [5, 8, 7, 9]]
x = range(len(graph_data[0]))
y = graph_data[1]
fig, ax = plt.subplots()
alpha = 0.5
plt.plot(x, y, 'o',markersize=3, color=[1., alpha, alpha])
ax.fill_between(x, 0, y, facecolor=[1., alpha, alpha], interpolate=False)
plt.show()
filename = 'test1.pdf'
fig.savefig(filename, bbox_inches='tight')

Multicolor marker with matplotlib plot

I am trying to plot a simple chart with an "x" marker (green) if the data point is even and an "o" marker (red) if the data point is odd. However, the chart is rendered with all markers as "o" except the last, which correctly shows as "x". Please advise.
import matplotlib.pyplot as plt
a = []
b = []
for i in range( 1, 11 ):
    a.append( i )
    b.append( i )
    if ( i % 2 ) == 0:
        plt.plot( a, b, "gx" )
    else:
        plt.plot( a, b, "ro" )
plt.show()
If you look carefully, the first markers are not red "o"'s, but red circles with a green "x" inside. The for loop you made is equivalent to:
plt.plot([1], [1], "ro")
plt.plot([1, 2], [1, 2], "gx")
plt.plot([1, 2, 3], [1, 2, 3], "ro")
(...)
As a consequence, you will plot 10 different graphics (technically lines.Line2D objects), each containing all the points accumulated so far. The last object you plot, for i=10, is "gx"; it ends up on top of the others.
Here's a corrected version of your algorithm (make one plot per point):
# Not a good way to go about it
import matplotlib.pyplot as plt
# Loop, one plot per iteration
for i in range(1,11):
    if i % 2 == 0:
        plt.plot(i, i, "gx")
    else:
        plt.plot(i, i, "ro")
plt.show()
Here's a better algorithm, though:
# use this one instead
import matplotlib.pyplot as plt
# Generate a, b outside the body of the loop
a = list(range(1,11))
b = list(range(1,11))
# Make one plot per condition branch
# See also https://stackoverflow.com/questions/509211/understanding-slice-notation
plt.plot(a[1::2], b[1::2], "gx")  # even values (2, 4, ..., 10): green "x"
plt.plot(a[0::2], b[0::2], "ro")  # odd values (1, 3, ..., 9): red "o"
plt.show()
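The slicing above relies on even and odd values alternating by position. If the data is not ordered like that, a boolean mask is one alternative. The following is only a sketch, assuming NumPy is available; the names values, is_even, even_line and odd_line are made up for this illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

values = np.arange(1, 11)
is_even = values % 2 == 0  # boolean mask: True where the value is even

fig, ax = plt.subplots()
# One plot call per condition, selecting the points with the mask
(even_line,) = ax.plot(values[is_even], values[is_even], "gx", label="even")
(odd_line,) = ax.plot(values[~is_even], values[~is_even], "ro", label="odd")
ax.legend()
```

Call plt.show() afterwards to display the figure. The mask tests the values themselves, so it keeps working if the points are shuffled or contain gaps.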

Customizing legend with scatterplot

I am struggling to customize the legend of my scatterplot. Here is a snapshot:
And here is a code sample:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
sns.set()
my_df = pd.DataFrame([[5, 3, 1], [2, 1, 2], [3, 4, 1], [1, 2, 1]],
                     columns=["DUMMY_CT", "FOO_CT", "CI_CT"])
g = sns.scatterplot(x="DUMMY_CT", y="FOO_CT", data=my_df, size="CI_CT")
g.set_title("Number of Baz", weight="bold")
g.set_xlabel("Dummy count")
g.set_ylabel("Foo count")
g.get_legend().set_title("Baz count")
Also, I work in a Jupyter-lab notebook with Python 3, if it helps.
The red thingy issue
First things first, I wish to hide the name of the CI_CT variable (outlined in red in the picture). After exploring the documentation all afternoon, I found the get_legend_handles_labels method (see here), which produces the following:
>>> g.get_legend_handles_labels()
([<matplotlib.collections.PathCollection at 0xfaaba4a8>,
<matplotlib.collections.PathCollection at 0xfaa3ff28>,
<matplotlib.collections.PathCollection at 0xfaa3f6a0>,
<matplotlib.collections.PathCollection at 0xfaa3fe48>],
['CI_CT', '0', '1', '2'])
There I can spot my dear CI_CT string. However, I'm unable to change this name or hide it completely. I found a dirty workaround, which basically consists of bypassing the dataframe passed as the data parameter and supplying the raw values instead. Here is the scatterplot call:
g = sns.scatterplot(x="DUMMY_CT", y="FOO_CT", data=my_df, size=my_df["CI_CT"].values)
Result here:
It works, but is there a cleaner way to achieve this?
The green thingy issue
Displaying a 0 level in this legend is incorrect, since there is no zero value in the column CI_CT of my_df. It is therefore misleading for readers, who might assume the smaller dots represent a value of 0 or 1. I wish to set up a defined scale, the way one can for the x and y axes. However, I cannot achieve it. Any ideas?
TL;DR : A broader question that could solve everything
Those adventures make me wonder if there is a way to handle the data you can pass to the scatterplots with hue and size parameters in a clean, x-and-y-axis way. Is it actually possible?
Please pardon my English, and let me know if the question is too broad or incorrectly labelled.
The "green thing issue", namely that there is one more legend entry than there are sizes, is solved by specifying legend="full".
g = sns.scatterplot(..., legend="full")
The "red thing issue" is more tricky. The problem here is that seaborn misuses a normal legend label as a headline for the legend. An option is indeed to supply the values directly instead of the name of the column, to prevent seaborn from using that column name.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
sns.set()
my_df = pd.DataFrame([[5, 3, 1], [2, 1, 2], [3, 4, 1], [1, 2, 1]],
                     columns=["DUMMY_CT", "FOO_CT", "CI_CT"])
g = sns.scatterplot(x="DUMMY_CT", y="FOO_CT", data=my_df, size=my_df["CI_CT"].values, legend="full")
g.set_title("Number of Baz", weight="bold")
g.set_xlabel("Dummy count")
g.set_ylabel("Foo count")
g.get_legend().set_title("Baz count")
plt.show()
If you really must use the column name itself, a hacky solution is to crawl into the legend and remove the label you don't want.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
sns.set()
my_df = pd.DataFrame([[5, 3, 1], [2, 1, 2], [3, 4, 1], [1, 2, 1]],
                     columns=["DUMMY_CT", "FOO_CT", "CI_CT"])
g = sns.scatterplot(x="DUMMY_CT", y="FOO_CT", data=my_df, size="CI_CT", legend="full")
g.set_title("Number of Baz", weight="bold")
g.set_xlabel("Dummy count")
g.set_ylabel("Foo count")
g.get_legend().set_title("Baz count")
#Hack to remove the first legend entry (which is the undesired title)
vpacker = g.get_legend()._legend_handle_box.get_children()[0]
vpacker._children = vpacker.get_children()[1:]
plt.show()
I finally managed to get the result I wanted, but in an ugly way. It might be useful to someone, but I would not advise doing this.
The solution for fixing the scale in the legend consists of shifting all the CI_CT column values into the negatives (to keep the order and the consistency of the marker sizes). Then, the values displayed in the legend are shifted back to undo the earlier data change (inspiration from here).
However, I did not find any better way to make the "CI_CT" text disappear from the legend without leaving an atrociously huge blank space.
Here is the code sample and the result.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
sns.set()
my_df = pd.DataFrame([[5, 3, 1], [2, 1, 2], [3, 4, 1], [1, 2, 1]], columns=["DUMMY_CT", "FOO_CT", "CI_CT"])
# Subtracting the maximal value of CI_CT from each value
max_val = my_df["CI_CT"].agg("max")
my_df["CI_CT"] = my_df.apply(lambda x : x["CI_CT"] - max_val, axis=1)
# scatterplot declaration
g = sns.scatterplot(x="DUMMY_CT", y="FOO_CT", data=my_df, size=my_df["CI_CT"].values)
g.set_title("Number of Baz", weight="bold")
g.set_xlabel("Dummy count")
g.set_ylabel("Foo count")
g.get_legend().set_title("Baz count")
# Correcting legend values
l = g.legend_
for t in l.texts:
    t.set_text(int(t.get_text()) + max_val)
# Restoring the DF
my_df["CI_CT"] = my_df.apply(lambda x : x["CI_CT"] + max_val, axis=1)
I'm still looking for a better way to achieve this.

squared-off line plot matplotlib

How do I generate a line graph in Matplotlib where lines connecting the data points are only vertical and horizontal, not diagonal, giving a "blocky" look?
Note that this is sometimes called zero order extrapolation.
MWE
import matplotlib.pyplot as plt
x = [1, 3, 5, 7]
y = [2, 0, 4, 1]
plt.plot(x, y)
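This gives a plot in which consecutive points are joined by diagonal segments, whereas I want them joined by horizontal and vertical segments only.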
I think you are looking for plt.step. Here are some examples.
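A minimal sketch of what that could look like with the data from the question. The choice of where="post" is an assumption about the desired look; "pre" (the default) and "mid" are the other options.

```python
import matplotlib.pyplot as plt

x = [1, 3, 5, 7]
y = [2, 0, 4, 1]

fig, ax = plt.subplots()
# where="post": each y value is held constant from its x up to the next x,
# then the line jumps vertically to the next y value
(step_line,) = ax.step(x, y, where="post")
```

Call plt.show() afterwards to display the figure. The same look can be obtained from plt.plot by passing drawstyle="steps-post".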

matplotlib collection linewidth mapping?

I'm creating some GIS-style plots in matplotlib of road networks and the like, so I'm using LineCollection to store and represent all of the roads and color them accordingly. This is working fine: I color the roads based on a criterion and the following map:
from matplotlib.colors import ListedColormap,BoundaryNorm
from matplotlib.collections import LineCollection
cmap = ListedColormap(['grey','blue','green','yellow','orange','red','black'])
norm = BoundaryNorm([0,0.5,0.75,0.9,0.95,1.0,1.5,100],cmap.N)
roads = LineCollection(road_segments, array=ratios, cmap=cmap, norm=norm)
axes.add_collection(roads)
This works fine; however, I would really like to have linewidths defined in a similar manner to the color map, ranging from 0.5 to 5 across the colors.
Does anyone know of a clever way of doing this?
Use the linewidths keyword of LineCollection, which accepts one width per line:
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
axes = plt.axes()
roads = LineCollection([
        [[0, 0], [1, 1]],
        [[0, 1], [1, 0]]
    ],
    colors=['black', 'red'],
    linewidths=[3, 8],
)
axes.add_collection(roads)
plt.show()
HTH
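To tie the widths to the same bins as the color map, one possible approach is to look up each ratio's bin with np.digitize and pick a width per bin. This is only a sketch: the road_segments and ratios values below are made-up sample data standing in for the question's variables, and spacing the widths evenly between 0.5 and 5 is an assumption about what is wanted.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
from matplotlib.colors import ListedColormap, BoundaryNorm

# Hypothetical sample data standing in for the question's variables
road_segments = [
    [[0, 0], [1, 1]],
    [[0, 1], [1, 0]],
    [[0, 0.5], [1, 0.5]],
]
ratios = np.array([0.3, 0.8, 1.2])

boundaries = [0, 0.5, 0.75, 0.9, 0.95, 1.0, 1.5, 100]
cmap = ListedColormap(['grey', 'blue', 'green', 'yellow', 'orange', 'red', 'black'])
norm = BoundaryNorm(boundaries, cmap.N)

# One width per color bin, evenly spaced from 0.5 to 5 (an assumption)
bin_widths = np.linspace(0.5, 5.0, cmap.N)
# np.digitize gives a 1-based bin number for values inside the boundaries;
# shift to 0-based and clip so out-of-range ratios use the first/last bin
bin_index = np.clip(np.digitize(ratios, boundaries) - 1, 0, cmap.N - 1)
linewidths = bin_widths[bin_index]

fig, ax = plt.subplots()
roads = LineCollection(road_segments, array=ratios, cmap=cmap, norm=norm,
                       linewidths=linewidths)
ax.add_collection(roads)
ax.autoscale()
```

Call plt.show() afterwards to display the figure. Because the widths are computed from the same boundaries list as the BoundaryNorm, each color always gets the same width.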