correct way to create a 3d median numpy array - numpy

so I tried to create a 3d array using numpy via this line:
self.dark_median_roi=np.median(self.dark_roi, axis=3)
where self.dark_roi is a multidimensional array and I got this error:
IndexError: axis 3 out of bounds (2)
I'm guessing I went about creating a 3d array the wrong way. What is the correct way to create a median numpy array? This will be running/is trying to run on a Raspberry pi, so I would rather avoid using loops, especially with arrays.
Edit:
I corrected some mistakes earlier in the code that I only noticed once I started adding print statements. This is the error I'm getting now:
IndexError: axis 3 out of bounds (3)
and when I tried changing the axis flag to 2, it created a 2d array

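Axes are numbered from 0, so a 3D array has 3 axes: 0, 1 and 2.
If you mean the last one, pass axis=2 (or axis=-1). Getting a 2D array back is expected: np.median collapses the axis it reduces over.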
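To make the axis numbering concrete, here is a minimal sketch. The array `stack` and its shape are made up for illustration and merely stand in for `self.dark_roi`; `keepdims=True` is one way to keep the result 3D if that is what is needed.

```python
import numpy as np

# Hypothetical stand-in for self.dark_roi: a 3D array with axes 0, 1 and 2
stack = np.arange(24, dtype=float).reshape(2, 3, 4)

# Median over the last axis collapses it: (2, 3, 4) -> (2, 3)
med_last = np.median(stack, axis=2)

# axis=-1 always refers to the last axis, whatever the number of dimensions
med_neg = np.median(stack, axis=-1)

# keepdims=True keeps the reduced axis with length 1: (2, 3, 4) -> (2, 3, 1)
med_keep = np.median(stack, axis=2, keepdims=True)

print(med_last.shape, med_keep.shape)
```

No Python loop is involved, so this should also be fine on a Raspberry Pi.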

Related

python,numpy matrix must be 2-dimensional

Why does line 3 raise ValueError: 'matrix must be 2-dimensional'?
import numpy as np
np.mat([[[1],[2]],[[10],[1,3]]])
np.mat([[[1],[2]],[[10],[1]]])
The reason why this code raises an error is because NumPy tries to determine the dimensionality of your input using nesting levels (nesting levels -> dimensions).
If, at some level, some elements do not have the same length (i.e. they are incompatible), it will create the array using the deepest nesting it can, using the objects as the elements of the array.
For this reason:
np.mat([[[1],[2]],[[10],[1,3]]])
Will give you a matrix of objects (lists), while:
np.mat([[[1],[2]],[[10],[1]]])
would result in a 3D array of numbers which np.mat() does not want to squeeze into a matrix.
Also, please avoid using np.mat() in your code, as the matrix class is no longer recommended.
Use np.array() instead.
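Incidentally, np.array() accepts the second (regular) input and gives you a (2, 2, 1)-shaped array of int, which you could np.squeeze() into a (2, 2) array if you like. For the first (ragged) input, recent NumPy versions raise a ValueError unless you pass dtype=object.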
However, it would be better to start from nesting level of 2 if all you want is a matrix:
np.array([[1, 2], [10, 1]])
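As a quick check of the shapes discussed above, here is a small sketch. The variable names are only for illustration.

```python
import numpy as np

# Three nesting levels -> three dimensions
nested = [[[1], [2]], [[10], [1]]]
arr3d = np.array(nested)      # shape (2, 2, 1)

# squeeze drops the length-1 axis and leaves a 2D array
arr2d = np.squeeze(arr3d)     # shape (2, 2)

# Starting from two nesting levels gives the same 2D array directly
direct = np.array([[1, 2], [10, 1]])

print(arr3d.shape, arr2d.shape)
```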

Construct NumPy matrix row by row

I'm trying to construct a 2D NumPy array from values in an extant 2D NumPy array using an iterative process. Using ordinary python lists the process I'm describing would look like so:
coords = #data from file contained in a 2D list
d = #integer
edges = []
for i in range(d+1):
    for j in range(i+1, d+1):
        edge = coords[j] - coords[i]
        edges.append(edge)
However, the NumPy array imposes restrictions that do not permit the process shown above. Below I try to do the same thing using NumPy arrays, and it should immediately be clear where the problems are:
coords = np.genfromtxt('Energies.txt', dtype=float, skip_header=1)
d = #integer
#how to initialize?
for i in range(d+1):
    for j in range(i+1, d+1):
        edge = coords[j] - coords[i]
        #how to append?
Because .append does not exist for NumPy arrays I need to rely on concatenate or stack instead. But these functions are designed to join existing arrays, and I don't have anything to concatenate or stack until after the first iteration of my loop. So I suppose I need to change my data flow, but I'm unsure how to go about this.
Any help would be greatly appreciated. Thanks in advance.
The function you are looking for is numpy.meshgrid [1]; it does this by default.
[1] https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.meshgrid.html
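Separately from the meshgrid suggestion, two common patterns for the loop in the question are sketched below: collect the rows in a plain list and convert once at the end, or index with np.triu_indices to skip the loop entirely. The `coords` values are made up, and `d` is assumed to be the number of rows minus one; neither comes from the original post.

```python
import numpy as np

# Hypothetical stand-in for the data read from 'Energies.txt'
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
d = coords.shape[0] - 1  # assumption: d + 1 equals the number of rows

# Option 1: append to a Python list, then build the array once
edge_rows = []
for i in range(d + 1):
    for j in range(i + 1, d + 1):
        edge_rows.append(coords[j] - coords[i])
edges = np.array(edge_rows)

# Option 2: index pairs (i, j) with i < j, in the same order as the loops
i_idx, j_idx = np.triu_indices(d + 1, k=1)
edges_vec = coords[j_idx] - coords[i_idx]

print(edges.shape)
```

Appending to a list is cheap, whereas calling np.concatenate inside the loop copies the whole array on every iteration.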

Numpy Array Shape Issue

I have initialized this empty 2d np.array
inputs = np.empty((300, 2), int)
And I am attempting to append a 2d row to it as such
inputs = np.append(inputs, np.array([1,2]), axis=0)
But I'm getting
ValueError: all the input arrays must have same number of dimensions
And NumPy reports the row as a 1D array with 2 elements rather than a 2D row:
np.array([1, 2]).shape
(2,)
Where have I gone wrong?
To add a row to a (300,2) shape array, you need a (1,2) shape array. Note the matching 2nd dimension.
np.array([[1,2]]) works. So does np.array([1,2])[None, :] and np.atleast_2d([1,2]).
I encourage the use of np.concatenate. It forces you to think more carefully about the dimensions.
Do you really want to start with np.empty? Look at its values. They are uninitialized, so they are arbitrary and often very large.
@Divakar suggests np.row_stack. That puzzled me a bit, until I checked and found that it is just another name for np.vstack. That function passes all inputs through np.atleast_2d before doing np.concatenate. So ultimately it is the same solution: turn the (2,) array into a (1,2) array.
np.append with axis=0 needs a 2D row here, and a 2D array literal takes double brackets, so
np.array([1,2])
needs to be
np.array([[1,2]])
If you intend to append that as the last row of inputs, you can simply use np.row_stack (an alias of np.vstack, deprecated since NumPy 2.0 in favor of np.vstack) -
np.row_stack((inputs,np.array([1,2])))
Please note this np.array([1,2]) is a 1D array.
You can even pass it a 2D row version for the same result -
np.row_stack((inputs,np.array([[1,2]])))
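Putting the answers above together, a minimal sketch of the row append. It uses np.zeros instead of np.empty and a small 3-row array purely for illustration; these choices are assumptions, not part of the original question.

```python
import numpy as np

# Small illustrative array; zeros instead of uninitialized np.empty values
inputs = np.zeros((3, 2), dtype=int)
row = np.array([1, 2])            # shape (2,), a 1D array

# Make the row 2D, shape (1, 2), before concatenating along axis 0
out_concat = np.concatenate([inputs, row[None, :]], axis=0)

# np.vstack applies np.atleast_2d to its inputs, so the 1D row works as is
out_vstack = np.vstack([inputs, row])

print(out_concat.shape)
```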

Should a pandas dataframe column be converted in some way before passing it to a scikit learn regressor?

I have a pandas dataframe and am passing df[list_of_columns] as X and df[[single_column]] as Y to a Random Forest regressor.
What does the following warning mean and what should be done to resolve it?
DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
probas = cfr.fit(trainset_X, trainset_Y).predict(testset_X)
Simply check the shape of your Y variable. It should be a one-dimensional object, and you are probably passing something with more (possibly trivial) dimensions. Reshape it into a list or 1D array.
You can use df.single_column.values or df['single_column'].values to get the underlying numpy array of your series (which, in this case, should also have the correct 1D-shape as mentioned by lejlot).
Actually, the warning tells you exactly what the problem is:
You passed a 2D array of shape (X, 1), but the method expects a 1D array of shape (X,).
Moreover, the warning tells you how to get the shape you need: y.values.ravel().
Using Y = df[[single_column]].values.ravel() resolved the DataConversionWarning for me.
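The shape difference behind the warning can be seen in a short sketch. The dataframe and its column names ("feature", "target") are hypothetical and only illustrate the point.

```python
import pandas as pd

# Hypothetical data; the column names are made up for illustration
df = pd.DataFrame({"feature": [1.0, 2.0, 3.0], "target": [10.0, 20.0, 30.0]})

# Double brackets select a one-column DataFrame: values have shape (3, 1)
y_2d = df[["target"]].values

# ravel() flattens the column vector to shape (3,)
y_1d = df[["target"]].values.ravel()

# Single brackets select a Series, whose values are already 1D
y_series = df["target"].values

print(y_2d.shape, y_1d.shape, y_series.shape)
```

Either 1D form can then be passed as y to the regressor's fit method.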

matplotlib: working with range in x-axis

I'm trying to do a basic line graph here, but I can't seem to figure out how to adjust my x axis.
Here is my code, followed by the error I get when I try adjusting my range.
from pylab import *
plot ( range(0,11),[9,4,5,2,3,5,7,12,2,3],'.-',label='sample1' )
plot ( range(0,11),[12,5,33,2,4,5,3,3,22,10],'o-',label='sample2' )
xlabel('x axis')
ylabel('y axis')
title('my sample graphs')
legend(('sample1','sample2'))
savefig("sampleg.png",dpi=(640/8))
show()
File "C:\Python26\lib\site-packages\matplotlib\axes.py", line 228, in _xy_from_xy
raise ValueError("x and y must have same first dimension")
ValueError: x and y must have same first dimension
I want my range to be a list of strings: ["12/1/2007","12/1/2008", "12/1/2009","12/1/2010"]
Any suggestions?
Honestly, I found the code online and was trying to rewrite it to properly understand it. I think I'm going to start from scratch so that I know what I'm doing but I need help on where to start.
I posted another question which explains what I want to do here:
Using PyLab to create a 2D graph from two separate lists
range(0,11) should be range(0,10): range(0,11) produces 11 x values, but each y list has only 10 entries.
In addition to Steve's observation: if your points are always y-values at consecutive integer x's starting from 0, matplotlib supplies the x range implicitly when you pass only y.
plot([9,4,5,2,3,5,7,12,2,3],'.-',label='sample1')
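For the date strings asked about in the question, one possible approach is to plot against integer positions and relabel the ticks. This is a sketch only: the y values are made up, and the Agg backend is chosen just so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

labels = ["12/1/2007", "12/1/2008", "12/1/2009", "12/1/2010"]
y = [9, 4, 5, 2]       # hypothetical values, one per label
x = range(len(y))      # same length as y, so no dimension mismatch

fig, ax = plt.subplots()
(line,) = ax.plot(x, y, ".-", label="sample1")

# Put one tick at each integer position and show the date string there
ax.set_xticks(list(x))
ax.set_xticklabels(labels)
ax.set_xlabel("x axis")
ax.set_ylabel("y axis")
ax.legend()
```

The key point is that x, y and the labels all have the same length.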