Scatter plot with scalar data - matplotlib

I want to create a scatter plot with matplotlib where the data points have scalar data attached to them and are assigned a color depending on how large their attached value is relative to the other points in the set. In other words, I want something akin to a heatmap. However, I'm looking for a "discrete" heatmap: nothing should be plotted where there were no points in the original data set and, in particular, no interpolation (in space) should be performed.
Can this be done?

You can use scatter and pass the attached values as the c parameter:
import numpy as np
import pylab as pl
x = np.random.uniform(-1, 1, 1000)
y = np.random.uniform(-1, 1, 1000)
z = np.sqrt(x*x+y*y)
pl.scatter(x, y, c=z)
pl.colorbar()
pl.show()

The same plot in Altair:
import numpy as np
import pandas as pd
import altair as alt
x = np.random.uniform(-1, 1, 1000)
y = np.random.uniform(-1, 1, 1000)
z = np.sqrt(x*x+y*y)
# Altair takes its data from a DataFrame
df = pd.DataFrame({'x': x, 'y': y, 'z': z})
alt.Chart(df).mark_circle().encode(x='x', y='y', color='z')

Related

Connecting points without continuous boundaries

I want to plot trajectories without connecting the points across boundaries. I attached an image of what I mean.
My code:
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
# import polygon as poly
x, y = np.loadtxt('c55.txt', delimiter=' ', unpack=True)
plt.plot(x, y, '.' ,color = 'k' , markersize=0.5)
#for i in range(1, len(x),1):
#if abs(x[i]-x[i+1])>300:
plt.plot(x,y,'-o',color='red',ms=5,label="Window 1")
plt.show()
Your x-values go several times from low to high. plt.plot connects all points in the order they are encountered in the x and y arrays.
The following approach first looks for the indices where the x-values start again (that is, where the difference of successive x's is negative).
These indices are then used to draw the separate curves.
import matplotlib.pyplot as plt
import numpy as np
# first create some test data a bit similar to the given ones
x = np.tile(np.linspace(-3, 3, 20), 4)
y = np.cos(x) + np.repeat(np.linspace(-3, 3, 4), 20)
fig, axs = plt.subplots(ncols=2, figsize=(15, 4))
# plotting the test data without change
axs[0].plot(x, y, '-o')
# find the boundaries: indices after which x jumps back to a lower value
bounds = np.flatnonzero(np.diff(x) < 0)
# additional boundaries for the first and last point
bounds = np.concatenate([[0], bounds + 1, [len(x)]])
for b0, b1 in zip(bounds[:-1], bounds[1:]):
    axs[1].plot(x[b0:b1], y[b0:b1], '-o')  # use '-ro' for only red curves
plt.show()
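As a variation on the same idea, a NaN can be inserted at each boundary so that one plot call draws every curve; matplotlib does not draw a line segment through NaN points. This is only a sketch on the test data from above, and the names starts, x_gapped and y_gapped are made up for illustration.

```python
import numpy as np

# Test data similar to the answer above: four sweeps over the same x-range.
x = np.tile(np.linspace(-3, 3, 20), 4)
y = np.cos(x) + np.repeat(np.linspace(-3, 3, 4), 20)

# Indices at which a new sweep starts (x jumps back to a lower value).
starts = np.flatnonzero(np.diff(x) < 0) + 1

# Insert a NaN in front of each start; the line is interrupted at NaN points.
x_gapped = np.insert(x, starts, np.nan)
y_gapped = np.insert(y, starts, np.nan)
```

Calling plt.plot(x_gapped, y_gapped, '-o') then draws all sweeps in one color with gaps at the boundaries. The loop in the answer remains the better fit when each curve should get its own color or label.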

Get Value from Contourplot - Python Matplotlib

I have a problem with my contour plot. I have measured data from experimental work, which I interpolated and plotted with matplotlib's contour plot. Now I want to validate my interpolation.
For this validation I need to know the plotted value at a specific (x, y) point of my contour plot, because I want to check how close my interpolation at (x, y) is to my measured data at (x, y).
In the end I want to plot the difference over x.
I hope you understand my problem and can help me!
Thanks a lot!
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import griddata  # the call below matches scipy.interpolate.griddata
df = pd.read_excel("my_work.xlsx")
x = df.loc["x_messured" ]
y = df.loc["y_messured" ]
z = df.loc["z_messured" ]
x_interp = np.linspace(0, max(x), 200)
y_interp = np.linspace(0, max(y), 200)
z2d = griddata((x, y), z, (x_interp[None,:], y_interp[:,None]))
plt.figure()
cs = plt.contour(x_interp, y_interp, z2d)
csf = plt.contourf(x_interp, y_interp, z2d, cmap="viridis")
diff = []
for q in range(len(x)):
    # get_from_z2d is the missing function this question asks for
    diff.append( abs( z[q] - get_from_z2d(x[q], y[q]) ) )
plt.plot(x, diff)
I need the function get_from_z2d()...
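One possible shape for such a function is a bilinear lookup on the interpolated grid, built from np.interp alone. This is a sketch under assumptions, not an answer from the thread: the extra parameters x_grid, y_grid and z2d are hypothetical additions to the two-argument signature used in the question, and z2d[i, j] is assumed to belong to (y_grid[i], x_grid[j]), which is the layout the griddata call above produces.

```python
import numpy as np

def get_from_z2d(xq, yq, x_grid, y_grid, z2d):
    """Bilinear lookup of z2d at (xq, yq); z2d[i, j] belongs to (y_grid[i], x_grid[j])."""
    # Interpolate along x within every grid row, then along y across the rows.
    along_x = np.array([np.interp(xq, x_grid, row) for row in z2d])
    return float(np.interp(yq, y_grid, along_x))

# Synthetic check on the plane z = 2x + 3y, which bilinear interpolation reproduces.
x_grid = np.linspace(0, 4, 9)
y_grid = np.linspace(0, 2, 5)
z2d = 2 * x_grid[None, :] + 3 * y_grid[:, None]
value = get_from_z2d(1.3, 0.7, x_grid, y_grid, z2d)
```

In the question's loop the call would become get_from_z2d(x[q], y[q], x_interp, y_interp, z2d). Note that np.interp returns the edge value for query points outside the grid, and that NaN cells in z2d (for example outside the convex hull of the measurements) propagate into nearby lookups.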

Interpolation between two values using python

I am trying to perform a linear interpolation in Python on a graph defined by two coordinate pairs, say (x1, y1) and (x2, y2). With my values I get a straight line in the graph (figure not included).
My aim is to read off the y-axis value at 10^6 on the x-axis, but at present I get an extrapolated value that does not lie on the line.
Required output: (image not included)
I tried the code below:
import matplotlib.pyplot as plt
import math
import numpy as np
x = np.array([1, 10000000])
y = np.array([0.65, 0.25])
BK = np.asarray(np.interp(0.7,x,y))
print("aa:",BK)
plt.xscale("log")
plt.plot(x,y)
plt.plot(1000000,BK, marker="o",markersize=10)
plt.plot([1000000,1000000,0],[0,BK,BK], "b--", linewidth=1)
plt.xlim(1, 100000000)
plt.ylim(0, 1)
plt.show()
Note that the line drawn in the chart is unrelated to a linear interpolation of the data: it is straight on the log-scaled axes of the chart, not in data coordinates. Interpolating along that line with the raw x-values hence has no meaning!
If you still want to interpolate along that line, you first need to transform x to log space:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([1, 10000000])
y = np.array([0.65, 0.25])
xinp = 1e6
BK = np.asarray(np.interp(np.log(xinp), np.log(x), y))
print("aa:",BK)
plt.xscale("log")
plt.plot(x,y)
plt.plot(xinp, BK, marker="o",markersize=10)
plt.plot([1000000,1000000,0],[0,BK,BK], "b--", linewidth=1)
plt.xlim(1, 100000000)
plt.ylim(0, 1)
plt.show()
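The result can be checked by hand. On a logarithmic axis, 10^6 lies 6/7 of the way from 1 to 10^7, so the interpolated value should be 0.65 + (6/7)(0.25 - 0.65), roughly 0.307. The sketch below compares that with np.interp; it uses log10 rather than the natural log of the answer, which gives the same result because the base cancels. The names bk_log, by_hand and bk_raw are illustrative.

```python
import numpy as np

x = np.array([1, 10000000])
y = np.array([0.65, 0.25])
xinp = 1e6

# Interpolate against log10(x): 1e6 sits 6/7 of the way from 1 to 1e7.
bk_log = float(np.interp(np.log10(xinp), np.log10(x), y))
by_hand = 0.65 + (6 / 7) * (0.25 - 0.65)

# Interpolating against the raw x-values gives a different number,
# which does not lie on the line drawn in the log-scaled chart.
bk_raw = float(np.interp(xinp, x, y))
```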

Python keeps overwriting hist on previous plot but doesn't save it with the desired plot

I am saving two separate figures, each of which should contain two plots together.
The problem is that the first figure is fine, but the second one is not drawn on a new plot but on top of the previous one, and in the saved figure I find only one of the plots:
This is the first figure, which I get correctly:
import scipy.stats as s
import numpy as np
import os
import pandas as pd
import openpyxl as pyx
import matplotlib
matplotlib.rcParams["backend"] = "TkAgg"
#matplotlib.rcParams['backend'] = "Qt4Agg"
#matplotlib.rcParams['backend'] = "nbAgg"
import matplotlib.pyplot as plt
import math
data = [336256, 620316, 958846, 1007830, 1080401]
pdf = np.array([0.00449982, 0.0045293, 0.00455894, 0.02397463,
                0.02395788, 0.02394114])
fig, ax = plt.subplots();
fig = plt.figure(figsize=(40,30))
x = np.linspace(np.min(data), np.max(data), 100);
plt.plot(x, s.exponweib.pdf(x, *s.exponweib.fit(data, 1, 1, loc=0, scale=2)))
plt.hist(data, bins = np.linspace(data[0], data[-1], 100), normed=True, alpha= 1)
text1= ' Weibull'
plt.savefig(text1+ '.png' )
datar =np.asarray(data)
mu, sigma = datar.mean() , datar.std() # mean and standard deviation
normal_std = np.sqrt(np.log(1 + (sigma/mu)**2))
normal_mean = np.log(mu) - normal_std**2 / 2
hs = np.random.lognormal(normal_mean, normal_std, 1000)
print(hs.max()) # some finite number
print(hs.mean()) # about 136519
print(hs.std()) # about 50405
count, bins, ignored = plt.hist(hs, 100, normed=True)
x = np.linspace(min(bins), max(bins), 10000)
pdfT = [];
for el in range (len(x)):
pdfTmp = (math.exp(-(np.log(x[el]) - normal_mean)**2 / (2 * normal_std**2)))
pdfT += [pdfTmp]
pdf = np.asarray(pdfT)
This is the second set :
fig, ax = plt.subplots();
fig = plt.figure(figsize=(40,40))
plt.plot(x, pdf, linewidth=2, color='r')
plt.hist(data, bins = np.linspace(data[0], data[-1], 100), normed=True, alpha= 1)
text= ' Lognormal '
plt.savefig(text+ '.png' )
The first plot saves the histogram together with the curve, whereas the second one saves only the curve.
Update 1: looking at this question, I found out that clearing the plot history keeps the figures from getting mixed up, but my second set of plots (the lognormal one) still does not save together: I get only the curve and not the histogram.
This is happening because you have set normed=True, which means that the area under the histogram is normalized to 1. Since your bins are very wide, the actual heights of the histogram bars are very small (in this case so small that they are not visible).
If you use
n, bins, _ = plt.hist(data, bins = np.linspace(data[0], data[-1], 100), normed=True, alpha= 1)
n will contain the y-values (bar heights) of your bins, and you can confirm this yourself.
Also have a look at the documentation for plt.hist.
So if you set normed to False, the histogram will be visible.
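The size of the effect can be checked without drawing anything. The sketch below uses np.histogram with density=True as a stand-in for the normalized plt.hist call (recent Matplotlib releases spell the option density instead of normed). With the data from the question, each bin is several thousand units wide, so the normalized heights come out on the order of 1e-5. The names heights, widths, area and tallest are illustrative.

```python
import numpy as np

data = [336256, 620316, 958846, 1007830, 1080401]
bins = np.linspace(data[0], data[-1], 100)

# density=True scales the counts so that sum(height * bin_width) equals 1.
heights, edges = np.histogram(data, bins=bins, density=True)
widths = np.diff(edges)
area = float(np.sum(heights * widths))
tallest = float(heights.max())
```

Bars this short are invisible next to anything plotted on a scale of order 1, which matches the explanation above.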
Edit: number of bins
import numpy as np
import matplotlib.pyplot as plt
rand_data = np.random.uniform(0, 1.0, 100)
fig = plt.figure()
ax_1 = fig.add_subplot(211)
ax_1.hist(rand_data, bins=10)
ax_2 = fig.add_subplot(212)
ax_2.hist(rand_data, bins=100)
plt.show()
This gives two plots similar to the following (they vary because the data is random; image not included), which show how the number of bins changes the histogram.
A histogram visualises the distribution of your data along one dimension, so I am not sure what you mean by number of inputs and bins.

Matplotlib cmap values must be between 0-1

I am having trouble with the code below:
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
from pylab import *
import sys
s = (('408b2e00', '24.21'), ('408b2e0c', '22.51'), ('4089e04a', '23.44'), ('4089e04d', '24.10'))
temp = [x[1] for x in s]
print temp
figure(figsize=(15, 8))
pts = [(886.38864047695108, 349.78744809964849), (1271.1506973277974, 187.65500904929195), (1237.272277227723, 860.38363675077176), (910.58751197700428, 816.82566805067597)]
x = map(lambda x: x[0],pts) # Extract the values from pts
y = map(lambda x: x[1],pts)
t = temp
result = zip(x,y,t)
img = mpimg.imread('floor.png')
imgplot = plt.imshow(img, cmap=cm.hot)
scatter(x, y, marker='h', c=t, s=150, vmin=-20, vmax=40)
print t
# Add cmap
colorbar()
show()
Given the temperatures in s, I am trying to set the values of the cmap so I can use temperatures between -10 and 30 instead of having to use values between 0 and 1. I have set the vmin and vmax values, but it still gives me the error below:
ValueError: to_rgba: Invalid rgba arg "23.44" to_rgb: Invalid rgb arg "23.44" gray (string) must be in range 0-1
I simplified the problem using earlier code and was successful. The example below works and shows what I am (hopefully) trying to do:
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
from pylab import *
figure(figsize=(15, 8))
# use ginput to select markers for the sensors
matplotlib.pyplot.hot()
markers = [(269, 792, -5), (1661, 800, 20), (1017, 457, 30)]
x,y,t = zip(*markers)
img = mpimg.imread('floor.png')
imgplot = plt.imshow(img, cmap=cm.hot)
scatter(x, y, marker='h', c=t, s=150, vmin=-10, vmax=30)
colorbar()
show()
Any ideas why only the second solution works? I am working with dynamic values, i.e. inputs from MySQL and user-selected points, so the first solution would be much easier to get working later on (the rest of that code is in this question: Full program code).
Any help would be great. Thanks!
You are handing in strings instead of floats. Change this line:
temp = [float(x[1]) for x in s]
matplotlib tries to be good about guessing what you mean and lets you define a gray level as a string of a float in [0, 1], which is what it is trying to do with your string values (and it complains because they are not in that range).
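A small sketch of the fix on the data from the question: convert the temperature strings to floats, after which they are treated as scalar data rather than color names. The second half illustrates, by hand, the kind of linear scaling that vmin and vmax ask for (vmin maps to 0, vmax to 1); the name scaled is illustrative, and scatter does this mapping itself, so the manual step is only there to show that the values land in the 0-1 range.

```python
import numpy as np

s = (('408b2e00', '24.21'), ('408b2e0c', '22.51'),
     ('4089e04a', '23.44'), ('4089e04d', '24.10'))

# Convert the temperature strings to floats before using them as colors.
temp = [float(value) for _, value in s]

# Linear scaling like the one vmin/vmax request: vmin maps to 0, vmax to 1.
vmin, vmax = -20, 40
scaled = (np.array(temp) - vmin) / (vmax - vmin)
```

With floats in temp, the original call scatter(x, y, marker='h', c=temp, s=150, vmin=-20, vmax=40) receives numbers and can apply the colormap.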