matplotlib: Get the colormap array - matplotlib

I am new to matplotlib and have gotten stuck on colormaps.
In matplotlib, how do I get the whole array of RGB colors for a specific colormap, say "hot"? For example, in MATLAB I would just do this:
# in matlab
c = hot(256);
disp(c)
Any ideas?

You can look up the values by calling the colormap as a function, and it accepts numpy arrays to query many values at once:
In [12]: from matplotlib import cm
In [13]: cm.hot(range(256))
Out[13]:
array([[ 0.0416 , 0. , 0. , 1. ],
[ 0.05189484, 0. , 0. , 1. ],
[ 0.06218969, 0. , 0. , 1. ],
...,
[ 1. , 1. , 0.96911762, 1. ],
[ 1. , 1. , 0.98455881, 1. ],
[ 1. , 1. , 1. , 1. ]])
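To get something closer to MATLAB's hot(256), which returns an N-by-3 RGB matrix, one option is to sample the colormap at evenly spaced floats and drop the alpha column. This is only a minimal sketch: the variable names are illustrative, and it relies on the colormap call treating floats in [0, 1] as positions along the map (integers, as in the answer above, are used as direct indices into the lookup table).

```python
import numpy as np
from matplotlib import cm

n = 256  # number of colors wanted, as in MATLAB's hot(256)

# Floats in [0, 1] are interpreted as positions along the colormap.
rgba = cm.hot(np.linspace(0.0, 1.0, n))  # shape (n, 4): R, G, B, alpha

# Drop the alpha channel to get an (n, 3) RGB array.
rgb = rgba[:, :3]

print(rgb.shape)
print(rgb[0], rgb[-1])
```

Each row of rgb is one color, with components in the range 0 to 1.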

Got it! In the MATLAB command window, type
cmap = colormap(nameOfTheColormapYouWant)
The available colormaps in MATLAB are: parula, jet, hsv, hot, cool, spring, summer, autumn, winter, gray, bone, copper, pink, lines, colorcube, prism, flag.
You get a matrix where each row is the color code used for the colormap.

Related

Built-in index dependent weight for tensordot in numpy?

I would like to obtain a tensordot of two arrays with the same shape with index-dependent weight applied, without use of explicit loop. For example,
import numpy as np
A=np.array([1,2,3])
B=np.array([-2,6,9])
C=np.zeros((3,3))
for i in range(3):
    for j in range(3):
        C[i,j]=A[i]*B[j]*(np.exp(i-j) if i>j else 0)
Can an array similar to C be obtained with a built-in tool (e.g., with some options for tensordot)?
Here's a vectorized solution:
N = 3
C = np.tril(A[:, None] * B * np.exp(np.arange(N)[:, None] - np.arange(N)), k=-1)
Output:
>>> C
array([[ 0. , 0. , 0. ],
[-10.87312731, 0. , 0. ],
[-44.33433659, 48.92907291, 0. ]])
np.einsum also works. Its speed relative to broadcasting is inconsistent: slightly faster for some larger inputs, slower for others.
import numpy as np
A=np.array([1,2,3])
B=np.array([-2,6,9])
idx=np.arange(len(A))  # the weight depends on the indices i and j, not on the values in A
np.einsum('ij,i,j->ij', np.tril(np.exp(np.subtract.outer(idx,idx)), -1), A, B)
Output
array([[ 0. , 0. , 0. ],
[-10.87312731, 0. , 0. ],
[-44.33433659, 48.92907291, 0. ]])
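As a sanity check, a vectorized result can be compared against the explicit loop from the question. The sketch below builds the index-dependent weight with np.where rather than np.tril; the names diff, weight, C_loop and C_vec are illustrative only, not part of either answer.

```python
import numpy as np

A = np.array([1, 2, 3])
B = np.array([-2, 6, 9])
n = len(A)

# Reference: the explicit double loop from the question.
C_loop = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        C_loop[i, j] = A[i] * B[j] * (np.exp(i - j) if i > j else 0)

# Vectorized: build the weight matrix once, then scale the outer product.
idx = np.arange(n)
diff = idx[:, None] - idx                       # diff[i, j] = i - j
weight = np.where(diff > 0, np.exp(diff), 0.0)  # exp(i - j) where i > j, else 0
C_vec = np.outer(A, B) * weight

print(np.allclose(C_loop, C_vec))
```

Because the weight is built from np.arange rather than from the values in A, the same code applies to arrays with arbitrary contents.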

numpy function to use for mathematical dot product to produce scalar

Question
What numpy function to use for mathematical dot product in the case below?
Backpropagation for a Linear Layer
Define sample (2,3) array:
In [299]: dldx = np.arange(6).reshape(2,3)
In [300]: w
Out[300]:
array([[0.1, 0.2, 0.3],
[0. , 0. , 0. ]])
Element wise multiplication:
In [301]: dldx*w
Out[301]:
array([[0. , 0.2, 0.6],
[0. , 0. , 0. ]])
and summing on the last axis (size 3) produces a 2 element array:
In [302]: (dldx*w).sum(axis=1)
Out[302]: array([0.8, 0. ])
Your (6) is the first term, dropping the 0. One might argue that the use of a dot/inner in (5) is a bit sloppy.
np.einsum borrows ideas from physics, where dimensions may be higher. This case can be expressed as
In [303]: np.einsum('ij,ij->i',dldx,w)
Out[303]: array([0.8, 0. ])
inner and dot do more calculations than we want. We just want the diagonal:
In [304]: np.dot(dldx,w.T)
Out[304]:
array([[0.8, 0. ],
[2.6, 0. ]])
In [305]: np.inner(dldx,w)
Out[305]:
array([[0.8, 0. ],
[2.6, 0. ]])
In matmul/@ terms, the size 2 dimension is a 'batch' one, so we have to add dimensions:
In [306]: dldx[:,None,:]@w[:,:,None]
Out[306]:
array([[[0.8]],
[[0. ]]])
This is (2,1,1), so we need to squeeze out the 1s.
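The row-wise dot product can be written in several equivalent ways. The sketch below puts three of them side by side, using the sample dldx and w from above; the variable names by_sum, by_einsum and by_matmul are illustrative. The einsum subscripts 'ij,ij->i' repeat j on both operands, so products are paired element by element before being summed.

```python
import numpy as np

dldx = np.arange(6).reshape(2, 3)
w = np.array([[0.1, 0.2, 0.3],
              [0.0, 0.0, 0.0]])

# Multiply elementwise, then sum over the last axis.
by_sum = (dldx * w).sum(axis=1)

# Same thing with einsum: j appears in both inputs and is summed out.
by_einsum = np.einsum('ij,ij->i', dldx, w)

# Batched matmul: (2,1,3) @ (2,3,1) -> (2,1,1), then squeeze to (2,).
by_matmul = (dldx[:, None, :] @ w[:, :, None]).squeeze()

print(by_sum, by_einsum, by_matmul)
```

All three produce one scalar per row, that is, a 2-element array for this (2,3) input.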

matrix multiplication with numpy array object and dataframe

I did this using a for loop and feel like there is a faster way to achieve it but it eludes me.
import pandas as pd
from numpy import array
datal=pd.DataFrame([[-9.8839112e-05, -0.001128727317, -0.000197679149],
[-0.0009201639200000001, 0.0005601014289999999, 0.000496686232],
[-0.000184700668, 9.414391600000001e-05, 0.000409526574]],
columns=['tx','ty','tz'])
bigtransfo=[array([[ 0.89442732, 0. , 0.44721334],
[ 0.44721334, 0. , -0.89442732],
[-0. , 1. , 0. ]]),
array([[ 0.27639329, 0.85065091, 0.44721334],
[ 0.13819655, 0.42532516, -0.89442732],
[-0.9510565 , 0.30901705, 0. ]]),
array([[-0.72360684, 0.52573128, 0.44721334],
[-0.36180316, 0.26286545, -0.89442732],
[-0.58778535, -0.80901692, 0. ]])]
vectorfield=[]
for i in range(0,3):
    x=list(bigtransfo[i].dot(datal[['tx','ty','tz']].iloc[i]))
    vectorfield.append(x)
vf=pd.DataFrame(vectorfield,columns=['tx','ty','tz'])
output:
[[-0.00017680915486414586, 0.00013260746120237444, -0.001128727317],
[0.00044424836491567196, -0.0003331879850180065, 0.0010482087675674915],
[0.000366290815094829, -0.00027471928608663234, 3.240032593749042e-05]]
bigtransfo is a list of 800 3x3 transformation arrays. datal is just a chunk of a data frame that has 800 rows and three columns. The idea is to multiply the three components of each row (the selected vector) by the corresponding transformation.
Any ideas are welcome. Thanks in advance.
update: Added working example.
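One possible way to drop the loop is to treat this as a batched matrix-vector product and let np.einsum do the pairing. The sketch below is written under assumptions: that the list of transformations can be stacked into one (N, 3, 3) array (for example with np.stack(bigtransfo)) and that the tx, ty, tz columns can be pulled out as an (N, 3) array (for example with datal[['tx','ty','tz']].to_numpy()). The names vectors and transforms are illustrative, and plain NumPy arrays stand in for the data frame.

```python
import numpy as np

# Sample data from the question: one (tx, ty, tz) vector per row.
vectors = np.array([
    [-9.8839112e-05, -0.001128727317, -0.000197679149],
    [-0.0009201639200000001, 0.0005601014289999999, 0.000496686232],
    [-0.000184700668, 9.414391600000001e-05, 0.000409526574],
])

# One 3x3 transformation per vector, stacked into shape (N, 3, 3).
transforms = np.array([
    [[0.89442732, 0.0, 0.44721334],
     [0.44721334, 0.0, -0.89442732],
     [-0.0, 1.0, 0.0]],
    [[0.27639329, 0.85065091, 0.44721334],
     [0.13819655, 0.42532516, -0.89442732],
     [-0.9510565, 0.30901705, 0.0]],
    [[-0.72360684, 0.52573128, 0.44721334],
     [-0.36180316, 0.26286545, -0.89442732],
     [-0.58778535, -0.80901692, 0.0]],
])

# Loop version, equivalent to the code in the question.
looped = np.array([transforms[i].dot(vectors[i]) for i in range(len(vectors))])

# Batched version: for each n, multiply matrix n by vector n.
batched = np.einsum('nij,nj->ni', transforms, vectors)

print(np.allclose(looped, batched))
```

The same product can also be written with matmul as (transforms @ vectors[:, :, None])[:, :, 0]; the result can be wrapped in a DataFrame with columns tx, ty, tz as in the question.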

How to keep 'nan' from being converted to '0' when reading data with numpy.genfromtxt in Python

I am trying to read an array from a file named "filin1", which looks like:
filin1 = [1,3,4, ....,nan,nan,nan..] (in the file it is actually a single column, not an array like this)
So I am trying to use numpy.genfromtxt as follows:
np.genfromtxt(filin1,dtype=None,delimiter=',',usecols=[0],missing_values='Missing',usemask=False,filling_values=np.nan)
I expected to get [1,3,4, ....,nan,nan,nan..], but the result was:
[1,3,4, ....,0.,0.,0...]
I would like to keep 'nan' rather than have it converted to '0.'.
Could you please offer any ideas or advice?
Thank you,
Isaac
If I try to simulate your case with a string input, I have no problem reading the nan:
In [73]: txt=b'''1,2
3,4
1.23,nan
nan,02
'''
In [74]: txt=txt.splitlines()
In [75]: txt
Out[75]: [b'1,2', b'3,4', b'1.23,nan', b'nan,02']
In [76]: np.genfromtxt(txt,delimiter=',')
Out[76]:
array([[ 1. , 2. ],
[ 3. , 4. ],
[ 1.23, nan],
[ nan, 2. ]])
nan is a valid float value:
In [80]: float('nan')
Out[80]: nan
Your command also works:
In [82]: np.genfromtxt(txt,dtype=None,delimiter=',',usecols=[0],missing_values='Missing',usemask=False,filling_values=np.nan)
Out[82]: array([ 1. , 3. , 1.23, nan])
Expecting the columns to contain integers (rather than float) could cause problems, since nan is a float, not int.
And missing values result in nan with both calls
In [91]: txt
Out[91]: [b'1,2', b'3,', b'1.23,nan', b'nan,02', b',']
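A small self-contained check of this behavior is sketched below, assuming the default float dtype. The input lines mirror the sample above, written as str rather than bytes; the name lines is illustrative.

```python
import numpy as np

# Some fields are empty, some are literally "nan".
lines = ['1,2', '3,', '1.23,nan', 'nan,02', ',']

# With the default float dtype, empty fields and "nan" strings
# are both read as nan rather than 0.
arr = np.genfromtxt(lines, delimiter=',')
print(arr)

# Count the nan entries to confirm none were turned into 0.
print(np.isnan(arr).sum())
```

If zeros show up instead of nan in a real file, the dtype being inferred or requested for that column is a reasonable first thing to inspect, since an integer column cannot hold nan.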

Plotting a histogram of a 2D numpy array of (longitude, latitude), in order to determine the proper values for DBSCAN

I am trying to apply DBSCAN to a dataset of (Lon, Lat) points. The algorithm is very sensitive to its parameters, eps and MinPts.
I would like to look at a histogram of the data to determine proper values. Unfortunately, Matplotlib's hist() takes only a 1D array.
When a 2D matrix is passed as the argument, hist() treats each column as a separate input.
[Figure: scatter plot and histograms]
Does anyone have a way to solve this?
If you follow the DBSCAN article, you only need the 4-nearest-neighbor distance for each object, not all pairwise distances. I.e., a 1 dimensional array.
Instead of doing a histogram, they sort the values and try to choose a knee in the resulting plot:
find the 4th nearest neighbor of each object
collect all 4NN distances in one array
sort this array in descending order
plot the resulting curve
look for a knee, often best at around 5%-10% of your x axis (so 95%-90% of objects are core points).
For details, see the original DBSCAN publication!
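The steps above can be sketched with plain NumPy as follows. This is an illustration under assumptions, not the publication's reference procedure: it uses brute-force Euclidean distances on the raw coordinates (for real longitude/latitude data a great-circle or projected distance may be more appropriate), random points stand in for the dataset, and k_distances is a hypothetical helper name.

```python
import numpy as np

def k_distances(points, k=4):
    """Distance from each point to its k-th nearest neighbor, sorted descending."""
    points = np.asarray(points, dtype=float)
    # Full pairwise distance matrix: fine for a few thousand points,
    # too memory-hungry for very large datasets.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the point itself (distance 0),
    # so column k holds the distance to the k-th nearest neighbor.
    kth = np.sort(dist, axis=1)[:, k]
    return np.sort(kth)[::-1]

rng = np.random.default_rng(0)
pts = rng.random((200, 2))  # stand-in for the (lon, lat) array
kd = k_distances(pts, k=4)

# To look for the knee, plot the sorted curve, e.g.:
#   import matplotlib.pyplot as plt; plt.plot(kd); plt.show()
print(kd[:5])
```

The y value at the knee of this curve is a candidate for eps, with MinPts set to match the k used here.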
You could use numpy.histogram2d:
import numpy as np
np.random.seed(2016)
N = 100
arr = np.random.random((N, 2))
xedges = np.linspace(0, 1, 10)
yedges = np.linspace(0, 1, 10)
lat = arr[:, 0]
lng = arr[:, 1]
hist, xedges, yedges = np.histogram2d(lat, lng, (xedges, yedges))
print(hist)
yields
[[ 0. 0. 5. 0. 3. 0. 0. 0. 3.]
[ 0. 3. 0. 3. 0. 0. 4. 0. 2.]
[ 2. 2. 1. 1. 1. 1. 3. 0. 1.]
[ 2. 1. 0. 3. 1. 2. 1. 1. 3.]
[ 3. 0. 3. 2. 0. 1. 0. 2. 0.]
[ 3. 2. 3. 1. 1. 2. 1. 1. 0.]
[ 2. 3. 0. 1. 0. 1. 3. 0. 0.]
[ 1. 1. 1. 1. 2. 0. 2. 1. 1.]
[ 0. 1. 1. 0. 1. 1. 2. 0. 0.]]
Or to visualize the histogram:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.imshow(hist)
plt.show()