How to assign a value to a specific mgrid entry in matplotlib?

I am new to matplotlib and scipy. I want to create a two-dimensional mgrid in matplotlib and assign values that I have generated to individual cells of this two-dimensional array. How can I do it? I am looking for an assignment function such as a[i,j] = k, but I can't find one. Any clues?
Thanks in advance.
Ranga

OK. I think I found the answer. What I wanted to do was better done with a numpy.array. So the way to do this (for me) was:
t = []
zeroRow = []
for j in range(cols):
    zeroRow.append(0)
for i in range(rows):
    t.append(zeroRow)
spectrogramData = np.array(t,float)
Later I read the values from a file where the row and column are stored and assign them to spectrogramData:
spectrogramData[row][column] = valueRead
My confusion was not knowing how to access the wrapped array. It is accessed like any two dimensional array.
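For anyone landing here later, a shorter way to get the same zero-filled array is probably np.zeros. This is only a minimal sketch: the sizes and the value assigned are made up for illustration, and spectrogram_data is just a renamed stand-in for the array above.

```python
import numpy as np

# Hypothetical dimensions, chosen only for this example
rows, cols = 3, 4

# Zero-filled float array, equivalent to building the nested list by hand
spectrogram_data = np.zeros((rows, cols), dtype=float)

# Assign a single cell; a[i, j] = k works directly on a NumPy array
spectrogram_data[1, 2] = 7.5

print(spectrogram_data)
```

Indexing with [1, 2] and with [1][2] reaches the same cell here; the single-bracket form is the more usual NumPy spelling.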
Thanks for responding!

Related

Finding smallest dtype to safely cast an array to

Let's say I want to find the smallest data type I can safely cast this array to, to save it as efficiently as possible. (The expected output is int8.)
arr = np.array([-101,125,6], dtype=np.int64)
The most logical solution seems something like
np.min_scalar_type(arr) # dtype('int64')
but that function doesn't work as expected for arrays. It just returns their original data type.
The next thing I tried is this:
np.promote_types(np.min_scalar_type(arr.min()), np.min_scalar_type(arr.max())) # dtype('int16')
but that still doesn't output the smallest possible data type.
What's a good way to achieve this?
Here's a working solution I wrote. It will only work for integers.
def smallest_dtype(arr):
    arr_min = arr.min()
    arr_max = arr.max()
    for dtype_str in ["u1", "i1", "u2", "i2", "u4", "i4", "u8", "i8"]:
        if (arr_min >= np.iinfo(np.dtype(dtype_str)).min) and (arr_max <= np.iinfo(np.dtype(dtype_str)).max):
            return np.dtype(dtype_str)
This is close to your initial idea:
np.result_type(np.min_scalar_type(arr.min()), arr.max())
It will take the signed int8 from arr.min() if arr.max() fits inside of it.
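To see why the promote_types attempt from the question lands on int16, it may help to look at the two intermediate types separately. The sketch below converts the extremes to plain Python ints first (my choice, so the result does not depend on how NumPy treats its own scalars), and the fits helper is a hypothetical name used only here.

```python
import numpy as np

arr = np.array([-101, 125, 6], dtype=np.int64)

# min_scalar_type inspects each value on its own
lo_type = np.min_scalar_type(int(arr.min()))  # -101 needs a signed type
hi_type = np.min_scalar_type(int(arr.max()))  # 125 alone fits an unsigned type

# Promoting a signed and an unsigned 8-bit type needs a wider signed type
combined = np.promote_types(lo_type, hi_type)

def fits(values, dtype):
    """Hypothetical helper: True if every value lies in the integer dtype's range."""
    info = np.iinfo(dtype)
    return bool(info.min <= values.min() and values.max() <= info.max)

print(lo_type, hi_type, combined)
print(fits(arr, np.int8))
```

So the values do fit in int8; the int16 comes only from mixing a signed and an unsigned 8-bit type, which is what a direct range check against np.iinfo avoids.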

how can i add elements together inside the array in numpy?

here's the code:
print(array)
here's part of the outcomes:
array([[1.09080648e-07, 1.27947783e-07, 1.35521106e-07, 2.36965352e-03,
1.76941751e-07, 6.02428392e-03, 1.93768765e-07],
[1.17183374e-03, 1.54375957e-03, 4.94265019e-04, 1.72861062e-07,
7.56083752e-04, 5.68696862e-03, 3.03002388e-04],...)
if i want to add up the elements in each row of the array, what should i do?
i can't use .sum() directly because it returns the total over the whole array...
can i use a double for loop?
what should i do next?
it seems that i am very close to the answer but this is kind of urgent...
THANKS IN ADVANCE!
If you have an array with shape (N,M):
use array.sum(axis=0) to sum all values in the same column, obtaining an array with shape (M,);
use array.sum(axis=1) to sum all values in the same row, obtaining an array with shape (N,);
See the Numpy documentation for other details:
https://numpy.org/doc/stable/reference/generated/numpy.sum.html
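A small sketch of both cases, using a made-up 2x3 array in place of the one from the question:

```python
import numpy as np

# Made-up data with shape (N, M) = (2, 3)
a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

col_sums = a.sum(axis=0)  # one value per column, shape (3,)
row_sums = a.sum(axis=1)  # one value per row, shape (2,)

print(col_sums)
print(row_sums)
```

No explicit for loop is needed; the axis argument selects the direction that gets collapsed.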

Selecting in DataFrame without typing 'INDEX', but by calling user-defined variable as existing index/value

I'm looking for a way to select a specific part of a DataFrame. This works as follows:
df = gpd.read_file("path_to_file")
df.set_index(['OBJECTID'], inplace=True)
Polygon = df.loc[['81207'], 'geometry']
(the code continues with other operations using the same 'OBJECTID' in different GeoDataFrame; this is needed to not lose geometry of points and/or polygons as only one geometry type can be linked to a GeoDataFrame)
This gives the correct output. However, the same process will be incorporated in a function to receive similar output for a user-defined input of the 'OBJECTID'. I'm therefore looking for a way to select data based on a user-defined variable: OBJECTID = 81207. How can an index be called by using a variable?
Any suggestions are welcome.
Thanks in advance.
Example of what I would like to achieve:
def Building(OBJECTID):
    OBJECTID = OBJECTID
    print("Building with OBJECTID:", OBJECTID)
    Polygon = df.loc['OBJECTID'] #OBJECTID defined in function
    Points = df2.loc['OBJECTID'] #OBJECTID defined in function
    return (Polygon, Points)
SOLVED:
This can be done by formatting the variable as a string.
def Building(OBJECTID):
    print("Building with ObjectID:", OBJECTID)
    Polygon = df.loc['{}'.format(OBJECTID)]
    Points = df2.loc['{}'.format(OBJECTID)]
    return Polygon, Points
Hope this is helpful for others.
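The reason this works, as far as I can tell, is that the index labels are stored as strings, so the integer 81207 has to be turned into '81207' before the lookup. Here is a minimal sketch with a plain pandas DataFrame standing in for the GeoDataFrame; the data and the building function are invented for the example.

```python
import pandas as pd

# Invented stand-in for the GeoDataFrame; the index labels are strings
df = pd.DataFrame(
    {"OBJECTID": ["81207", "81208"], "geometry": ["polygon A", "polygon B"]}
).set_index("OBJECTID")

def building(object_id):
    """Look up a row by a user-supplied ID, converting it to a string label."""
    return df.loc[str(object_id)]

row = building(81207)
print(row["geometry"])
```

str(object_id) does the same job as '{}'.format(object_id). Writing the variable name in quotes, as in df.loc['OBJECTID'], looks for a label literally called OBJECTID instead.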

is there a way to subset an AnnData object after reading it in?

I read in the excel file like so:
data = sc.read_excel('/Users/user/Desktop/CSVB.xlsx',sheet= 'Sheet1', dtype= object)
There are 3 columns in this data set that I need to work with as .obs but it looks like everything is in the .X data matrix.
Anyone successfully subset after reading in the file or is there something I need to do beforehand?
Okay, so assuming sc stands for scanpy package, the read_excel just takes the first row as .var and the first column as .obs of the AnnData object.
The data returned by read_excel can be tweaked a bit to get what you want.
Let's say the index of the three columns you want in the .obs are stored in idx variable.
idx = [1,2,4]
Now, .obs is just a pandas DataFrame, and data.X is just a NumPy matrix. Thus, the job is simple.
# assign some names to the new columns
new_col_names = ['C1', 'C2', 'C3']
# add the columns to data.obs
data.obs[new_col_names] = data.X[:,idx]
If you also wish to remove the idx columns from data.X, I suggest making a new AnnData object for this.
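To illustrate the idea without needing scanpy installed, here is a rough sketch that uses a plain NumPy array and a pandas DataFrame as stand-ins for data.X and data.obs. The names X, obs and X_rest and the numbers are assumptions for the example. It assigns one column at a time, which should have the same effect as the single assignment above, and np.delete shows which columns would be left for a new object.

```python
import numpy as np
import pandas as pd

# Stand-ins for data.X (4 observations x 5 variables) and data.obs
X = np.arange(20, dtype=float).reshape(4, 5)
obs = pd.DataFrame(index=["r0", "r1", "r2", "r3"])

idx = [1, 2, 4]
new_col_names = ["C1", "C2", "C3"]

# Copy the chosen matrix columns into the obs table, one column at a time
for name, col in zip(new_col_names, idx):
    obs[name] = X[:, col]

# Matrix with the idx columns removed, e.g. as input for a new object
X_rest = np.delete(X, idx, axis=1)

print(obs)
print(X_rest.shape)
```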

Changing column names in Julia to strings

this might seem like a silly question. Started with Julia very recently and encountered this trivial problem.
Creating matrix as follows:
Matice = rand(10, 10)
Matice = convert(DataFrame, Matice)
Wanted to change the column names to A,B,C,...
NewColNames = Array(String,ncol(Matice))
for i = 1:(ncol(Matice))
    NewColNames[i] = string('A' + (i-1))
end
names!(Matice, NewColNames)
the last line produces an error.
Also tried to do something more direct, such as:
for i = 1:(ncol(Matice))
    names(Matice)[i] = string('A' + (i-1))
end
But that doesn't work either.
Any help will be greatly appreciated.
I thought it worth adding three remarks:
1) Probably the simplest code to generate the symbols you want is
Symbol.('A':'A'+ncol-1)
where ncol is number of columns you want.
2) Currently you can create DataFrame from vector of vectors and give names for variables in the constructor, e.g.:
DataFrame([rand(10) for i in 1:10], Symbol.('A':'J'))
3) when this PR to DataFrames is merged and released (probably in a few weeks at most) you will be able to write the same using a matrix:
DataFrame(rand(10, 10), Symbol.('A':'J'))
With help of Dan, posting an alternative solution without creating a NewColNames vector:
names!(Matice, [Symbol('A' + (i-1)) for i in 1:(ncol(Matice))])