Read and return all values from an array - matplotlib

Elaborated question:
Let me clarify my question. I want to plot the array output below as a 2D scatter plot, with polarity along the x axis and subjectivity along the y axis. The modality value, which ranges between -1 and 1, determines the marker type (o, x, ^, v).
output
polarities: [ 0. 0. 0. 0.]
subjectivity: [ 0.1 0. 0. 0. ]
modalities: [ 1. -0.25 1. 1. ]
The modified code, limited to two marker ranges:
print "polarities: ", a[:,0]
print "subjectivity: ", a[:,1]
print "modalities: ", a[:,2]
def markers(r):
    markers = np.array(r, dtype=np.object)
    markers[(r>=0)] = 'o'
    markers[r<0] = 'x'
    return markers.tolist()
def colors(s):
    colors = np.array(s, dtype=np.object)
    colors[(s>=0)] = 'g'
    colors[s<0] = 'r'
    return colors.tolist()
fig=plt.figure()
ax=fig.add_subplot(111)
ax.scatter(a[:,0], a[:,1], marker = markers(a[:,2]), color= colors(a[:,0]), s=100, picker=5)
My intent is to check the modality value and return one of the four markers.
If I hard-code 'o', the plot is drawn:
ax.scatter(a[:,0], a[:,1], marker = markers('o'), color= colors(a[:,0]), s=100, picker=5)
As a trial, I tried to mimic the colors function and pass it a[:,2], but I hit this error in the shell:
ValueError: Unrecognized marker style ['o', 'x', 'o', 'o']
The question is: is my approach wrong, or how do I make scatter recognize the marker style?
Edit1
Trying to select the m values between 0 and .5 with this code:
ax.scatter (p[0<m<=.5], s[0<m<=.5], marker = "v", color= colors(a[:,0]), s=100, picker=5)
yields this error:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
How do I restrict m to the range between 0 and .5 in the example given in answer 2?

It's not clear from your question, but I assume your array a is of shape (N,3) and so your arrays s and r are actual arrays and not scalars.
First off, you cannot have several markers in one call of scatter(). If you want your plot to have several markers, you'll have to slice your arrays accordingly and call scatter() once for each marker.
Regarding the colors, your problem is that your function colors(r) only returns one color where it should return an array of colors (with the same number of elements as a[:,0]). Like this:
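import numpy as np
import matplotlib.pyplot as plt
def colors(s):
    # Object array so that each entry can hold a color string
    colors = np.array(s, dtype=object)  # np.object was removed in NumPy 1.24
    colors[(s>0.25)&(s<0.75)] = 'g'
    colors[s>=0.75] = 'b'
    colors[s<=0.25] = 'r'
    return colors.tolist()
a = np.random.random((100,))
b = np.random.random((100,))
plt.scatter(a,b,color=colors(b))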
ANSWER TO YOUR EDIT 1:
You seem to be on the right track: you need one scatter() call per marker.
Your error comes from the slicing index [0<m<=.5]. Python evaluates a chained comparison as (0<m) and (m<=.5), and "and" needs a single truth value for the whole array, which NumPy refuses to give. Use the full elementwise notation [(m>0.)&(m<=.5)] instead.
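As a minimal sketch of the "one scatter() call per marker" idea with four markers: the bin boundaries and the marker assigned to each bin below are assumptions made for illustration (the question does not state them), and the data values are made up. The Agg backend is selected only so that the sketch runs without a display.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

p = np.array([0.0, 0.2, -0.3, 0.2])    # polarity (made-up values)
s = np.array([0.1, 0.0, 0.0, 0.3])     # subjectivity (made-up values)
m = np.array([1.0, -0.25, 0.4, -0.6])  # modality, expected in [-1, 1]

# Hypothetical bins as (low, high, marker); each bin selects low < m <= high.
# The lowest bin is open-ended so that m == -1 is not lost.
bins = [(-np.inf, -0.5, "v"), (-0.5, 0.0, "x"), (0.0, 0.5, "^"), (0.5, 1.0, "o")]

fig, ax = plt.subplots()
counts = {}
for low, high, marker in bins:
    mask = (m > low) & (m <= high)  # elementwise comparison, combined with &
    counts[marker] = int(mask.sum())
    ax.scatter(p[mask], s[mask], marker=marker, s=100,
               label="%s < m <= %s" % (low, high))

ax.set_xlabel("polarity")
ax.set_ylabel("subjectivity")
ax.legend()
# Call plt.show() or fig.savefig(...) to view the result.
```

Each pass through the loop adds one collection to the axes, so the number of scatter() calls equals the number of markers.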

As Diziet pointed out, plt.scatter() cannot handle several markers. You therefore need to make one scatter plot per marker category. This can be done by conditioning on the property which should be reflected by the marker. In this case:
import numpy as np
import matplotlib.pyplot as plt
p = np.array( [ 0. , 0.2 , -0.3 , 0.2] )
s = np.array( [ 0.1, 0., 0., 0.3 ] )
m = np.array( [ 1., -0.25, 1. , -0.6 ] )
colors = np.array([(0.8*(1-x), 0.7*x, 0) for x in np.ceil(p)])
fig=plt.figure()
ax=fig.add_subplot(111)
ax.scatter(p[m>=0], s[m>=0], marker = "o", color= colors[m>=0], s=100)
ax.scatter(p[m<0], s[m<0], marker = "s", color= colors[m<0], s=100)
ax.set_xlabel("polarity")
ax.set_ylabel("subjectivity")
plt.show()

Related

linspace colormesh heatmap does not match initial distribution

I have the result of a tsne algorithm and I want to create a 2D grid with it.
The results look like this:
array([[-31.129612 , 2.836552 ],
[ 14.543636 , 1.628475 ],
[-21.804733 , 17.605087 ],
...,
[ 1.6285285, -5.144769 ],
[ -8.478171 , -17.943161 ],
[-20.473257 , 1.7228899]], dtype=float32)
I plotted the results in a scatter plot to see the overall distribution in the 2D space.
tx2, ty2 = tsne_results[:,0], tsne_results[:,1]
plt.figure(figsize = (16,12))
plt.scatter(tx2,ty2)
plt.show()
However, when creating bins using linspace, I get a very different shape for my data.
bins_nr = 150
tx2, ty2 = tsne_results[:,0], tsne_results[:,1]
grid_tmp, xl, yl = np.histogram2d(tx2, ty2, bins=bins_nr)
gridx_tmp = np.linspace(min(tx2),max(tx2),bins_nr)
gridy_tmp = np.linspace(min(ty2),max(ty2),bins_nr)
plt.figure(figsize = (16,12))
plt.grid(True)
plt.pcolormesh(gridx_tmp, gridy_tmp, grid_tmp)
plt.show()
The latter chart looks like it was inverted and the data is not being projected in the same way as the scatter plot.
Any idea why this is happening?
Kind regards
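One possible explanation, judging only from the code shown: np.histogram2d returns the counts with x along the first axis, while pcolormesh expects rows to run along y, so the histogram usually has to be transposed. The function also returns bins_nr + 1 bin edges per axis, whereas the linspace grids above contain only bins_nr points. The sketch below uses random stand-in data instead of tsne_results, and deliberately different bin counts per axis so that the shapes are easy to tell apart.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Stand-in for tsne_results: wide in x, narrow in y
pts = rng.normal(size=(1000, 2)) * [30.0, 5.0]
tx, ty = pts[:, 0], pts[:, 1]

# H[i, j] counts the points in x-bin i and y-bin j
H, xedges, yedges = np.histogram2d(tx, ty, bins=(30, 10))

fig, ax = plt.subplots()
# pcolormesh wants C[row, col] with rows along y, hence the transpose;
# the edges returned by histogram2d already have the matching lengths.
mesh = ax.pcolormesh(xedges, yedges, H.T)
```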

How do I modify the first label from 0.0 to 0 of the x axis in my graph?

I have tried to change 0.0 to 0 at the start of the x-axis of my graph.
My numerical data are:
x = 0.115, 0.234, 0.329, 0.443, 0.536, 0.654, 0.765, 0.846
y = 5.598, 7.6942, 9.1384, 11.2953, 12.4065, 15.736, 21.603, 31.4367
s = 0.05, 0.1, 0.16, 0.4, 0.32, 0.17, 0.09, 1.2
The original data does not have x = 0, y = 0.
I add that point with the commands below and draw the graph automatically.
But the graph starts at 0.0 on the x-axis. How do I change 0.0 to 0 without affecting the rest of the numbers?
I have studied the following links ... but still have not succeeded ...
Modify tick label text
pyplot remove the digits of zero ( start from 0 not 0.00)
The commands I have are:
import pandas as pd
import matplotlib.pyplot as plt
datos = pd.read_csv('.name.csv')
print(datos)
datosSM1 = datos[0:0]
datosSM1.loc[0] = 0
datosSM2 = datos[0:]
datosSM = pd.concat([datosSM1, datosSM2])
print(datosSM)
x = datosSM['x']
y = datosSM['y']
ys = datosSM['s']
plt.errorbar(x,y, fmt = 'ko', label = 'datos',
             yerr = ys, ecolor='r' )
plt.axis([0, x.max()+0.02, 0, y.max()+(y.max()/10)])
plt.show()
I really appreciate your help and attention.
Really thank you very much, and excellent suggestions.
I think your code is better than the alternative I just wrote ...
ax = plt.axes()
def format_func(value, tick_number):
    N = value
    if N == 0:
        return "0"
    else:
        return "{0:0.2}".format(N)
ax.xaxis.set_major_formatter(plt.FuncFormatter(format_func))
Thank you
To modify a selected label (actually its text), try the below code:
# Prepend first row with zeroes
datosSM = pd.concat([pd.DataFrame({'x': 0, 'y': 0, 's': 0}, index=[0]),
                     datos], ignore_index=True)
# Drawing
fig, ax = plt.subplots() # Will be needed soon
# The error bars come from column s, as in the question
plt.errorbar(datosSM.x, datosSM.y, yerr=datosSM.s, fmt='ko', label='datos', ecolor='r')
plt.axis([0, datosSM.x.max() + 0.02, 0, datosSM.y.max() + (datosSM.y.max() / 10)])
fig.canvas.draw() # Needed to get access to label texts
# Get label texts
labels = [item.get_text() for item in ax.get_xticklabels()]
labels[0] = '0' # Modify the selected label
ax.set_xticklabels(labels)
plt.show()
One additional improvement in the above code is a more concise way to generate a DataFrame with a prepended row of zeroes.
Another improvement is that you don't need to "extract" individual columns: you can pass the existing columns of your DataFrame directly.

Numpy Array to Networkx-Graph with Edge-Labels [duplicate]

The code below produces a very "dodgy" placement of the labels for edge weights in a graph. Please see image. I would like to have a better placement (close to the midpoint of each line), while still taking advantage of the automatic positioning of the nodes - i.e. I don't want to have to manually position the nodes.
Any ideas please? Also, there is a warning, "The iterable function was deprecated in Matplotlib 3.1 and will be removed in 3.3. Use np.iterable instead.", which would be good to address if anyone knows how.
import matplotlib.pyplot as plt
import networkx as nx
import numpy as np
G = nx.Graph()
G.add_nodes_from(["A", "B", "C"])
G.add_edge("A", "B", weight=5)
G.add_edge("B", "C", weight=7)
G.add_edge("C", "A", weight=2)
pos = nx.spring_layout(G)
weights = nx.get_edge_attributes(G, "weight")
nx.draw_networkx(G, with_labels=True)
nx.draw_networkx_edge_labels(G, pos, edge_labels=weights)
plt.show()
From the documentation of draw_networkx:
draw_networkx(G, pos=None, arrows=True, with_labels=True, **kwds)
Parameters:
[...]
pos (dictionary, optional) – A dictionary with nodes as keys and positions as values. If not specified a spring layout positioning will
be computed. See networkx.layout for functions that compute node
positions.
So, if you do not pass pos explicitly, a spring layout is generated, but it won't be identical to the layout that you generate through pos = nx.spring_layout(G), because calling nx.spring_layout(G) twice gives different results:
for a in [0,1]:
    pos = nx.spring_layout(G)
    print(pos)
output:
{'A': array([ 0.65679786, -0.91414348]), 'B': array([0.34320214, 0.5814527 ]), 'C': array([-1. , 0.33269078])}
{'A': array([-0.85295569, -0.70179415]), 'B': array([ 0.58849111, -0.29820585]), 'C': array([0.26446458, 1. ])}
So, passing the same pos to both drawing functions solves the problem:
pos = nx.spring_layout(G)
weights = nx.get_edge_attributes(G, "weight")
nx.draw_networkx(G, pos, with_labels=True)
nx.draw_networkx_edge_labels(G, pos, edge_labels=weights)

Pandas Colormap - Changing line color [duplicate]

If you have a Colormap cmap, for example:
cmap = matplotlib.cm.get_cmap('Spectral')
How can you get a particular colour out of it between 0 and 1, where 0 is the first colour in the map and 1 is the last colour in the map?
Ideally, I would be able to get the middle colour in the map by doing:
>>> do_some_magic(cmap, 0.5) # Return an RGBA tuple
(0.1, 0.2, 0.3, 1.0)
You can do this with the code below. The code in your question was actually very close to what you needed: all you have to do is call the cmap object you have.
import matplotlib.pyplot as plt
# matplotlib.cm.get_cmap was removed in Matplotlib 3.9; plt.get_cmap still works
cmap = plt.get_cmap('Spectral')
rgba = cmap(0.5)
print(rgba) # (0.99807766255210428, 0.99923106502084169, 0.74602077638401709, 1.0)
For values outside of the range [0.0, 1.0] it will return the under and over colour (respectively). This, by default, is the minimum and maximum colour within the range (so 0.0 and 1.0). This default can be changed with cmap.set_under() and cmap.set_over().
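"Special" numbers are handled separately. In current Matplotlib versions np.nan maps to the "bad" colour, which is fully transparent by default and can be changed with cmap.set_bad(), similarly to under and over as above; np.inf is treated as an over-range value.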
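As a small illustration of the under, over and bad colours mentioned above, here is a sketch that sets all three on a copy of a colormap and looks up out-of-range and invalid values. The colormap name and the three colours are arbitrary choices for the example.

```python
import copy
import numpy as np
import matplotlib.pyplot as plt

# Work on a copy so the globally registered colormap is left untouched
cmap = copy.copy(plt.get_cmap("viridis"))
cmap.set_under("black")  # used for values below 0.0
cmap.set_over("white")   # used for values above 1.0
cmap.set_bad("red")      # used for invalid values such as np.nan

below = cmap(-0.5)
above = cmap(1.5)
invalid = cmap(np.nan)
print(below, above, invalid)
```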
Finally it may be necessary for you to normalize your data such that it conforms to the range [0.0, 1.0]. This can be done using matplotlib.colors.Normalize simply as shown in the small example below where the arguments vmin and vmax describe what numbers should be mapped to 0.0 and 1.0 respectively.
import matplotlib.colors
norm = matplotlib.colors.Normalize(vmin=10.0, vmax=20.0)
print(norm(15.0)) # 0.5
A logarithmic normaliser (matplotlib.colors.LogNorm) is also available for data ranges with a large range of values.
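To make the logarithmic case concrete, here is a short sketch that chains LogNorm with a colormap lookup. The data range of 1 to 1000 and the choice of viridis are assumptions for the example only.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import LogNorm

# Map 1 -> 0.0 and 1000 -> 1.0 on a logarithmic scale
log_norm = LogNorm(vmin=1.0, vmax=1000.0)

# The norm returns a masked array; getdata() gives the plain values
scaled = np.ma.getdata(log_norm([1.0, 10.0, 100.0, 1000.0]))
print(scaled)  # each decade covers one third of the range

# One RGBA row per input value
rgba = plt.get_cmap("viridis")(scaled)
```

With a linear Normalize over the same range, 10 and 100 would both land very close to 0.0 and receive almost the same colour.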
(Thanks to both Joe Kington and tcaswell for suggestions on how to improve the answer.)
To get integer RGBA values instead of floats, pass bytes=True:
rgba = cmap(0.5,bytes=True)
So, to simplify the code based on the answer from Ffisegydd:
# import the colormap module and the normalizer
import matplotlib.colors
from matplotlib import cm
# normalize item number values to the colormap range
norm = matplotlib.colors.Normalize(vmin=0, vmax=1000)
# possible colormaps include viridis, jet, Spectral
rgba_color = cm.jet(norm(400),bytes=True)
# 400 is one of the values between 0 and 1000
I once ran into a similar situation where I needed n colors from a colormap so that I could assign one color to each of my data series.
I have put the code for this into a package called "mycolorpy".
You can pip install it using:
pip install mycolorpy
You can then do:
from mycolorpy import colorlist as mcp
import numpy as np
Example: To create a list of 5 hex strings from cmap "winter"
color1=mcp.gen_color(cmap="winter",n=5)
print(color1)
Output:
['#0000ff', '#0040df', '#0080bf', '#00c09f', '#00ff80']
Another example, generating a list of 16 colors from cmap bwr:
color2=mcp.gen_color(cmap="bwr",n=16)
print(color2)
Output:
['#0000ff', '#2222ff', '#4444ff', '#6666ff', '#8888ff', '#aaaaff', '#ccccff', '#eeeeff', '#ffeeee', '#ffcccc', '#ffaaaa', '#ff8888', '#ff6666', '#ff4444', '#ff2222', '#ff0000']
There is a python notebook with usage examples to better visualize this.
Say you want to generate a list of colors from a cmap that is normalized to given data. You can do that using:
import matplotlib.pyplot as plt
a = np.random.randint(1000, size=200)
color1=mcp.gen_color_normalized(cmap="seismic",data_arr=a)
plt.scatter(a,a,c=color1)
You can also reverse the colors using:
color1=mcp.gen_color_normalized(cmap="seismic",data_arr=a,reverse=True)
plt.scatter(a,a,c=color1)
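For comparison, a list of n hex colors can also be produced with plain Matplotlib and NumPy. This is only a sketch of one way to do it, not the implementation used by mycolorpy; sample_hex_colors is a hypothetical helper name introduced here.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import to_hex

def sample_hex_colors(cmap_name, n):
    """Return n hex colour strings sampled evenly across a named colormap."""
    cmap = plt.get_cmap(cmap_name)
    # Calling the colormap with an array of positions in [0, 1]
    # returns one RGBA row per position.
    return [to_hex(rgba) for rgba in cmap(np.linspace(0.0, 1.0, n))]

colors5 = sample_hex_colors("winter", 5)
print(colors5)
```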
I had precisely this problem, but I needed sequential plots to have highly contrasting color. I was also doing plots with a common sub-plot containing reference data, so I wanted the color sequence to be consistently repeatable.
I initially tried simply generating colors randomly, reseeding the RNG before each plot. This worked OK (commented-out in code below), but could generate nearly indistinguishable colors. I wanted highly contrasting colors, ideally sampled from a colormap containing all colors.
I could have as many as 31 data series in a single plot, so I chopped the colormap into that many steps. Then I walked the steps in an order that ensured I wouldn't return to the neighborhood of a given color very soon.
My data is in a highly irregular time series, so I wanted to see the points and the lines, with the point having the 'opposite' color of the line.
Given all the above, it was easiest to generate a dictionary with the relevant parameters for plotting the individual series, then expand it as part of the call.
Here's my code. Perhaps not pretty, but functional.
import matplotlib.pyplot as plt
# cm.get_cmap was removed in Matplotlib 3.9; plt.get_cmap still works
cmap = plt.get_cmap('gist_rainbow') #('hsv') #('nipy_spectral')
max_colors = 31 # Constant, max number of series in any plot. Ideally prime.
color_number = 0 # Variable, incremented for each series.
def restart_colors():
    global color_number
    color_number = 0
    #np.random.seed(1)
def next_color():
    global color_number
    color_number += 1
    #color = tuple(np.random.uniform(0.0, 0.5, 3))
    color = cmap( ((5 * color_number) % max_colors) / max_colors )
    return color
def plot_args(): # Invoked for each plot in a series as: '**(plot_args())'
    mkr = next_color()
    clr = (1 - mkr[0], 1 - mkr[1], 1 - mkr[2], mkr[3]) # Give line inverse of marker color
    return {
        "marker": "o",
        "color": clr,
        "mfc": mkr,
        "mec": mkr,
        "markersize": 0.5,
        "linewidth": 1,
    }
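The stepping rule in next_color() can be checked in isolation. The sketch below reuses the stride of 5 and the 31 steps from the code above and confirms that every colormap step is visited exactly once before the sequence repeats, which holds whenever the stride and the number of steps share no common factor.

```python
from math import gcd

max_colors = 31  # number of colormap steps, as in the code above
stride = 5       # step between consecutive series, as in next_color()

# Colormap steps selected by the first max_colors calls to next_color()
order = [(stride * k) % max_colors for k in range(1, max_colors + 1)]

# gcd(stride, max_colors) == 1, so no step is repeated within one cycle
visits_all = sorted(order) == list(range(max_colors))
print(gcd(stride, max_colors), visits_all, order[:7])
# prints: 1 True [5, 10, 15, 20, 25, 30, 4]
```

Choosing a prime max_colors, as the comment in the original code suggests, makes any stride smaller than max_colors work.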
My context is JupyterLab and Pandas, so here's sample plot code:
restart_colors() # Repeatable color sequence for every plot
fig, axs = plt.subplots(figsize=(15, 8))
plt.title("%s + T-meter"%name)
# Plot reference temperatures:
axs.set_ylabel("°C", rotation=0)
for s in ["T1", "T2", "T3", "T4"]:
    df_tmeter.plot(ax=axs, x="Timestamp", y=s, label="T-meter:%s" % s, **(plot_args()))
# Other series gets their own axis labels
ax2 = axs.twinx()
ax2.set_ylabel(units)
for c in df_uptime_sensors:
    df_uptime[df_uptime["UUID"] == c].plot(
        ax=ax2, x="Timestamp", y=units, label="%s - %s" % (units, c), **(plot_args())
    )
fig.tight_layout()
plt.show()
The resulting plot may not be the best example, but it becomes more relevant when interactively zoomed in.
To build on the solutions from Ffisegydd and amaliammr, here's an example where we make a CSV representation of a custom colormap:
#! /usr/bin/env python3
import matplotlib
import numpy as np
vmin = 0.1
vmax = 1000
norm = matplotlib.colors.Normalize(np.log10(vmin), np.log10(vmax))
lognum = norm(np.log10([.5, 2., 10, 40, 150,1000]))
cdict = {
    'red':
    (
        (0., 0, 0),
        (lognum[0], 0, 0),
        (lognum[1], 0, 0),
        (lognum[2], 1, 1),
        (lognum[3], 0.8, 0.8),
        (lognum[4], .7, .7),
        (lognum[5], .7, .7)
    ),
    'green':
    (
        (0., .6, .6),
        (lognum[0], 0.8, 0.8),
        (lognum[1], 1, 1),
        (lognum[2], 1, 1),
        (lognum[3], 0, 0),
        (lognum[4], 0, 0),
        (lognum[5], 0, 0)
    ),
    'blue':
    (
        (0., 0, 0),
        (lognum[0], 0, 0),
        (lognum[1], 0, 0),
        (lognum[2], 0, 0),
        (lognum[3], 0, 0),
        (lognum[4], 0, 0),
        (lognum[5], 1, 1)
    )
}
mycmap = matplotlib.colors.LinearSegmentedColormap('my_colormap', cdict, 256)
norm = matplotlib.colors.LogNorm(vmin, vmax)
colors = {}
count = 0
step_size = 0.001
for value in np.arange(vmin, vmax+step_size, step_size):
    count += 1
    print("%d/%d %f%%" % (count, vmax*(1./step_size), 100.*count/(vmax*(1./step_size))))
    rgba = mycmap(norm(value), bytes=True)
    color = (rgba[0], rgba[1], rgba[2])
    if color not in colors.values():
        colors[value] = color
print("value, red, green, blue")
for value in sorted(colors.keys()):
    rgb = colors[value]
    print("%s, %s, %s, %s" % (value, rgb[0], rgb[1], rgb[2]))
A plotted mappable carries its own norm, so if you have a plot already made you can use that norm to look up the color at a certain value.
import matplotlib.pyplot as plt
import numpy as np
cmap = plt.cm.viridis
cm = plt.pcolormesh(np.random.randn(10, 10), cmap=cmap)
print(cmap(cm.norm(2.2)))
For a quick and dirty approach you can index the map directly.
Or you can just do what @amaliammr says.
data_size = 23 # range 0..23
colours = plt.cm.turbo
color_normal = colours.N/data_size
for i in range(data_size):
    col = colours.colors[int(i*color_normal)]

translate luminance to an rgb array

I'm trying to translate luminance (an N x M x 1 array) to an rgb array (N x M x 3).
The idea is to use the rgb array to get an rgba array for imshow(). The result I'm looking for is identical to what I'd get just feeding the luminance array to imshow(), but it gives me control over alpha. Is there some simple function kicking around to do this?
There are some useful things which you can use in matplotlib to achieve what you want.
You can easily take a collection of numbers, and given an appropriate normalisation and colormap, turn those into rgba values:
import matplotlib.pyplot as plt
# define a norm which scales data from the range 20-30 to 0-1
norm = plt.Normalize(vmin=20, vmax=30)
cmap = plt.get_cmap('hot')
With these you can do some useful stuff:
>>> # put data in the range 0-1
>>> norm([20, 25, 30])
masked_array(data = [ 0. 0.5 1. ],
mask = False,
fill_value = 1e+20)
>>> # turn numbers in the range 0-1 into colours defined in the cmap
>>> cmap([0, 0.5, 1])
array([[ 0.0416 , 0. , 0. , 1. ],
[ 1. , 0.3593141, 0. , 1. ],
[ 1. , 1. , 1. , 1. ]])
Is this what you meant, or were you trying to do something else?
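Putting the two pieces together for the case in the question, here is a sketch that turns an N x M x 1 luminance array into an N x M x 4 RGBA array and then overwrites the alpha channel. The stand-in data, the gray colormap and the rule that alpha follows the luminance are all assumptions for illustration; any other alpha array of shape N x M would work the same way.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

# Stand-in luminance data with shape (N, M, 1)
lum = np.linspace(20.0, 30.0, 12).reshape(3, 4, 1)

norm = plt.Normalize(vmin=lum.min(), vmax=lum.max())
cmap = plt.get_cmap("gray")

# Drop the trailing axis and scale to 0-1; asarray strips the mask wrapper
scaled = np.asarray(norm(lum[..., 0]))

# Colormap lookup gives one RGBA quadruple per pixel: shape (N, M, 4)
rgba = cmap(scaled)

# Take control of the alpha channel (assumed rule: alpha follows luminance)
rgba[..., 3] = scaled

fig, ax = plt.subplots()
ax.imshow(rgba)
```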