I am new to matplotlib and scipy. I want to create a two-dimensional mgrid in matplotlib and assign values that I have generated to individual cells of this two-dimensional array. How can I do it? I am looking for an assignment such as a[i,j] = k but I can't find one. Any clues?
Thanks in advance.
Ranga
OK. I think I found the answer. What I wanted to do was better done with a numpy.array. So the way to do this (for me) was:
t = []
zeroRow = []
for j in range(cols):
    zeroRow.append(0)
for i in range(rows):
    t.append(zeroRow)
spectrogramData = np.array(t,float)
Later I read the values from a file where the row and column are stored and assign them to spectrogramData:
spectrogramData[row][column] = valueRead
My confusion was not knowing how to access the wrapped array. It is accessed like any two dimensional array.
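For anyone landing here with the same question, a shorter route is probably to let NumPy allocate the zero-filled array directly. This is only a minimal sketch: the grid size and the (row, column, value) triples are made up and stand in for whatever is read from the file.

```python
import numpy as np

rows, cols = 3, 4  # hypothetical grid size

# Allocate a rows x cols array of zeros in a single call
spectrogram_data = np.zeros((rows, cols), dtype=float)

# Hypothetical (row, column, value) triples standing in for values read from a file
readings = [(0, 1, 2.5), (2, 3, -1.0)]
for row, column, value in readings:
    # a[i, j] = k style assignment works on a NumPy array
    spectrogram_data[row, column] = value
```

Indexing with spectrogram_data[row, column] addresses the same cell as spectrogram_data[row][column], without creating an intermediate row view.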
Thanks for responding!
Let's say I want to find the smallest data type I can safely cast this array to, to save it as efficiently as possible. (The expected output is int8.)
arr = np.array([-101,125,6], dtype=np.int64)
The most logical solution seems something like
np.min_scalar_type(arr) # dtype('int64')
but that function doesn't work as expected for arrays. It just returns their original data type.
The next thing I tried is this:
np.promote_types(np.min_scalar_type(arr.min()), np.min_scalar_type(arr.max())) # dtype('int16')
but that still doesn't output the smallest possible data type.
What's a good way to achieve this?
Here's a working solution I wrote. It will only work for integers.
def smallest_dtype(arr):
    arr_min = arr.min()
    arr_max = arr.max()
    for dtype_str in ["u1", "i1", "u2", "i2", "u4", "i4", "u8", "i8"]:
        if (arr_min >= np.iinfo(np.dtype(dtype_str)).min) and (arr_max <= np.iinfo(np.dtype(dtype_str)).max):
            return np.dtype(dtype_str)
This is close to your initial idea:
np.result_type(np.min_scalar_type(arr.min()), arr.max())
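It will take the signed int8 from arr.min() if arr.max() fits inside of it. Note that this relies on NumPy's value-based casting of scalars, which NumPy 2.0 changed (NEP 50), so recent versions may return int64 here instead.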
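A variant of the range-check idea that does not depend on how NumPy promotes scalars could look like the sketch below. The function name smallest_int_dtype is my own, and it is only meant for integer arrays; it converts the bounds to plain Python ints before comparing them against each candidate type's limits.

```python
import numpy as np

def smallest_int_dtype(arr):
    """Return the narrowest integer dtype that can hold every value in arr."""
    arr = np.asarray(arr)
    if not np.issubdtype(arr.dtype, np.integer):
        raise TypeError("only integer arrays are supported")
    # Plain Python ints avoid any overflow or promotion surprises in the comparisons
    lowest, highest = int(arr.min()), int(arr.max())
    # Candidates ordered by size, unsigned before signed at each width
    for code in ("u1", "i1", "u2", "i2", "u4", "i4", "u8", "i8"):
        info = np.iinfo(np.dtype(code))
        if info.min <= lowest and highest <= info.max:
            return np.dtype(code)
    raise ValueError("no integer dtype can hold these values")

arr = np.array([-101, 125, 6], dtype=np.int64)
compact = arr.astype(smallest_int_dtype(arr))  # same values, narrower storage
```

For the example array this picks int8, as the question expects.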
Here's the code:
print(array)
Here's part of the output:
array([[1.09080648e-07, 1.27947783e-07, 1.35521106e-07, 2.36965352e-03,
1.76941751e-07, 6.02428392e-03, 1.93768765e-07],
[1.17183374e-03, 1.54375957e-03, 4.94265019e-04, 1.72861062e-07,
7.56083752e-04, 5.68696862e-03, 3.03002388e-04],...)
If I want to sum the elements in each row of the array, what should I do?
I can't directly use .sum() because it returns the total over the whole array...
Can I use a double for loop?
What should I do next?
It seems that I am very close to the answer, but this is kind of urgent...
Thanks in advance!
If you have an array with shape (N,M):
use array.sum(axis=0) to sum all values in the same column, obtaining an array with shape (M,);
use array.sum(axis=1) to sum all values in the same row, obtaining an array with shape (N,);
See the NumPy documentation for other details:
https://numpy.org/doc/stable/reference/generated/numpy.sum.html
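As a small illustration of the two axis arguments, here is a sketch on a made-up 2 x 3 array (the values are arbitrary, not the ones from the question):

```python
import numpy as np

values = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])  # shape (2, 3), i.e. N=2 rows and M=3 columns

column_sums = values.sum(axis=0)  # one sum per column, shape (3,)
row_sums = values.sum(axis=1)     # one sum per row, shape (2,)
total = values.sum()              # no axis: a single sum over everything
```

No explicit for loop is needed in either case.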
I'm looking for a way to select a specific part of a DataFrame. This works as follows:
import geopandas as gpd
df = gpd.read_file("path_to_file")
df.set_index(['OBJECTID'], inplace=True)
Polygon = df.loc[['81207'], 'geometry']
(the code continues with other operations using the same 'OBJECTID' in different GeoDataFrame; this is needed to not lose geometry of points and/or polygons as only one geometry type can be linked to a GeoDataFrame)
This gives the correct output. However, the same process will be incorporated in a function to receive similar output for a user-defined input of the 'OBJECTID'. I'm therefore looking for a way to select data based on a user-defined variable: OBJECTID = 81207. How can an index be called by using a variable?
Any suggestions are welcome.
Thanks in advance.
Example of what I would like to achieve:
def Building(OBJECTID):
    OBJECTID = OBJECTID
    print("Building with OBJECTID:", OBJECTID)
    Polygon = df.loc['OBJECTID'] #OBJECTID defined in function
    Points = df2.loc['OBJECTID'] #OBJECTID defined in function
    return (Polygon, Points)
SOLVED:
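The index labels are strings (e.g. '81207'), so the variable has to be converted to a string before it is passed to .loc, instead of passing the literal 'OBJECTID'.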
def Building(OBJECTID):
    print("Building with ObjectID:", OBJECTID)
    Polygon = df.loc['{}'.format(OBJECTID)]
    Points = df2.loc['{}'.format(OBJECTID)]
    return Polygon, Points
Hope this is helpful for others.
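The same idea can be tried without any geodata. The sketch below uses a plain pandas DataFrame as a hypothetical stand-in for the GeoDataFrame (the geometry values are just placeholder strings) and uses str() for the conversion, assuming the index labels are strings as in the question.

```python
import pandas as pd

# Hypothetical stand-in for the GeoDataFrame; the index labels are strings
df = pd.DataFrame(
    {"OBJECTID": ["81207", "81208"], "geometry": ["polygon A", "polygon B"]}
).set_index("OBJECTID")

def building(object_id):
    """Look up a row by OBJECTID, accepting either an int or a str."""
    label = str(object_id)  # .loc needs a label of the same type as the index
    return df.loc[label]

row = building(81207)  # an int works because it is converted to '81207'
```

If the index held integers instead, the conversion would go the other way (int(object_id)); the point is only that the lookup key must match the type of the index labels.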
I read in the Excel file like so:
data = sc.read_excel('/Users/user/Desktop/CSVB.xlsx',sheet= 'Sheet1', dtype= object)
There are 3 columns in this data set that I need to work with as .obs, but it looks like everything ends up in the .X data matrix.
Has anyone successfully subset the data after reading in the file, or is there something I need to do beforehand?
Okay, so assuming sc stands for the scanpy package, read_excel just takes the first row as .var and the first column as .obs of the AnnData object.
The data returned by read_excel can be tweaked a bit to get what you want.
Let's say the indices of the three columns you want in .obs are stored in the idx variable.
idx = [1,2,4]
Now, .obs is just a pandas DataFrame, and data.X is just a NumPy matrix. Thus, the job is simple.
# assign some names to the new columns
new_col_names = ['C1', 'C2', 'C3']
# add the columns to data.obs
data.obs[new_col_names] = data.X[:,idx]
If you also wish to remove the idx columns from data.X, I suggest making a new AnnData object for this.
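To see what this column shuffle does without installing scanpy, here is a rough sketch on plain stand-ins: X and obs below are hypothetical replacements for data.X and data.obs, and X_rest is the matrix you would pass on to a new AnnData object. It copies the columns one at a time rather than in a single assignment.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for data.X (4 observations x 5 variables) and data.obs
X = np.arange(20, dtype=float).reshape(4, 5)
obs = pd.DataFrame(index=[f"cell{i}" for i in range(4)])

idx = [1, 2, 4]
new_col_names = ["C1", "C2", "C3"]

# Copy each selected column of X into obs under its new name
for name, col in zip(new_col_names, idx):
    obs[name] = X[:, col]

# Matrix that remains once the moved columns are dropped
X_rest = np.delete(X, idx, axis=1)
```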
This might seem like a silly question. I started with Julia very recently and ran into this trivial problem.
I create a matrix as follows:
Matice = rand(10, 10)
Matice = convert(DataFrame, Matice)
I wanted to change the column names to A, B, C, ...
NewColNames = Array(String,ncol(Matice))
for i = 1:(ncol(Matice))
    NewColNames[i] = string('A' + (i-1))
end
names!(Matice, NewColNames)
The last line produces an error.
I also tried something more direct, such as:
for i = 1:(ncol(Matice))
    names(Matice)[i] = string('A' + (i-1))
end
But that doesn't work either.
Any help will be greatly appreciated.
I thought it worth adding three remarks:
1) Probably the simplest code to generate the symbols you want is
Symbol.('A':'A'+ncol-1)
where ncol is number of columns you want.
2) Currently you can create a DataFrame from a vector of vectors and give names for the variables in the constructor, e.g.:
DataFrame([rand(10) for i in 1:10], Symbol.('A':'J'))
3) When this PR to DataFrames is merged and released (probably in a few weeks at most), you will be able to write the same using a matrix:
DataFrame(rand(10, 10), Symbol.('A':'J'))
With the help of Dan, here is an alternative solution that does not create a NewColNames vector:
names!(Matice, [Symbol('A' + (i-1)) for i in 1:(ncol(Matice))])