here's the code:
print(array)
here's part of the outcomes:
array([[1.09080648e-07, 1.27947783e-07, 1.35521106e-07, 2.36965352e-03,
1.76941751e-07, 6.02428392e-03, 1.93768765e-07],
[1.17183374e-03, 1.54375957e-03, 4.94265019e-04, 1.72861062e-07,
7.56083752e-04, 5.68696862e-03, 3.03002388e-04],...)
If I want to add up the elements in each row of the array, what should I do?
I can't directly use .sum() because it returns the grand total of all elements...
Can I use a double for loop?
What should I do next?
It seems that I am very close to the answer, but this is kind of urgent...
THANKS IN ADVANCE!
If you have an array with shape (N,M):
use array.sum(axis=0) to sum all values in the same column, obtaining an array with shape (M,);
use array.sum(axis=1) to sum all values in the same row, obtaining an array with shape (N,);
See the Numpy documentation for other details:
https://numpy.org/doc/stable/reference/generated/numpy.sum.html
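A minimal sketch of the two axis arguments on a small made-up array (the values here are illustrative, not the asker's data):

```python
import numpy as np

# Hypothetical 2x3 array standing in for the asker's data
a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

row_sums = a.sum(axis=1)  # one sum per row, shape (2,)
col_sums = a.sum(axis=0)  # one sum per column, shape (3,)
total = a.sum()           # no axis: a single grand total

print(row_sums)
print(col_sums)
print(total)
```

No explicit for loop is needed; the axis argument tells NumPy which dimension to collapse.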
I am having a problem with filtering a pandas dataframe. I am trying to filter a dataframe based on column values being equal to a specific list but I am getting a length error.
I tried every possible way of filtering a dataframe but got nowhere. Any help would be appreciated, thanks in advance.
Here is my code :
for ind in df_hourly.index:
    timeslot = df_hourly['date_parsed'][ind][0:4]  # List value to filter
    filtered_df = df.loc[df['timeslot'] == timeslot]
Error : ValueError: ('Lengths must match to compare', (5696,), (4,))
Above Image : df , Below Image : df_hourly
In the above image, the dataframe I want to filter is shown. Specifically, I want to filter according to the "timeslot" column.
The image below shows the dataframe that contains the values I want to filter by, specifically the "date_parsed" column. In the first line of my code, I iterate through every row of this dataframe and assign the first 4 elements of the list value in df_hourly["date_parsed"] to a variable. Later in the code, I try to filter the dataframe above by that variable.
When comparing columns using ==, pandas tries to compare value by value, i.e. whether the first item equals the first item, the second equals the second, and so on. This is why you receive this error: pandas expects two operands of the same length, and here it gets a column of 5696 values and a list of 4.
If you want to check whether each value is inside a list, you can use .isin:
df.loc[df['timeslot'].isin(timeslot)]
Depending on what timeslot is exactly, you might need to take timeslot.values or something like that (it is hard to tell exactly without an example of your dataframe).
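To make the difference concrete, here is a small sketch with made-up data. The column values and the contents of timeslot are assumptions, since the original dataframes were only posted as images:

```python
import pandas as pd

# Hypothetical stand-in for the asker's df
df = pd.DataFrame({
    "timeslot": ["0900", "1000", "1100", "1200", "1300"],
    "value": [1, 2, 3, 4, 5],
})

# Hypothetical list of 4 values, like the slice taken from 'date_parsed'
timeslot = ["1000", "1200", "9999", "0900"]

# == compares element by element, so lengths 5 and 4 cannot be matched up
try:
    df["timeslot"] == timeslot
    lengths_matched = True
except ValueError:
    lengths_matched = False

# .isin asks, for each row, whether its value appears anywhere in the list
filtered_df = df.loc[df["timeslot"].isin(timeslot)]
print(filtered_df)
```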
Let's say I want to find the smallest data type I can safely cast this array to, to save it as efficiently as possible. (The expected output is int8.)
arr = np.array([-101,125,6], dtype=np.int64)
The most logical solution seems something like
np.min_scalar_type(arr) # dtype('int64')
but that function doesn't work as expected for arrays. It just returns their original data type.
The next thing I tried is this:
np.promote_types(np.min_scalar_type(arr.min()), np.min_scalar_type(arr.max())) # dtype('int16')
but that still doesn't output the smallest possible data type.
What's a good way to achieve this?
Here's a working solution I wrote. It will only work for integers.
def smallest_dtype(arr):
    arr_min = arr.min()
    arr_max = arr.max()
    for dtype_str in ["u1", "i1", "u2", "i2", "u4", "i4", "u8", "i8"]:
        if (arr_min >= np.iinfo(np.dtype(dtype_str)).min) and (arr_max <= np.iinfo(np.dtype(dtype_str)).max):
            return np.dtype(dtype_str)
This is close to your initial idea:
np.result_type(np.min_scalar_type(arr.min()), arr.max())
It will take the signed int8 from arr.min() if arr.max() fits inside of it.
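Whichever dtype you end up with, it may be worth verifying that the cast really is lossless before saving. Below is a minimal sketch; cast_checked is a hypothetical helper name, not a NumPy function. As far as I know, the scalar promotion rules changed in NumPy 2, so the dtype returned by np.result_type here may differ between versions, and the check deliberately does not assume it is int8.

```python
import numpy as np

def cast_checked(arr, dtype):
    """Cast arr to dtype and raise if any value changed (hypothetical helper)."""
    out = arr.astype(dtype)
    if not np.array_equal(out, arr):
        raise ValueError(f"values do not fit into {np.dtype(dtype)}")
    return out

arr = np.array([-101, 125, 6], dtype=np.int64)

# Candidate dtype from the one-liner above; the result may depend on the NumPy version
candidate = np.result_type(np.min_scalar_type(arr.min()), arr.max())

small = cast_checked(arr, candidate)
print(candidate, small.nbytes, "bytes instead of", arr.nbytes)
```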
TLDR: I have 2 arrays indices = numpy.arange(9) and another that contains some of the numbers in indices (maybe none at all, maybe it'll contain [2,4,7]). The output I'd like for this example is [0,1,3,5,6,8]. What method can be used to achieve this?
Edit: I found a method which works somewhat: casting both arrays to a set then taking the difference of the two does give the correct result, but as a set, even if I pass this result to a numpy.array(). I'll update this if I find a solution for that.
Edit2: Casting the result of the subtraction to a list, then passing that to numpy.array(), resolved my issue.
I guess I posted this question a little prematurely, given that I found the solution for it myself, but maybe this'll be useful to somebody in future!
You can make use of boolean masking:
indices[~numpy.isin(indices,[2,4,7])]
Explanation:
numpy.isin() returns a boolean array that says, for each element of indices, whether it appears in the given list. The ~ operator inverts that mask, and indexing indices with the inverted mask keeps only the values that are not in the list.
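Spelled out step by step with the values from the question. The numpy.setdiff1d line is an alternative not mentioned in the answer; it returns the sorted unique values of the first array that are not in the second:

```python
import numpy as np

indices = np.arange(9)
exclude = [2, 4, 7]

mask = ~np.isin(indices, exclude)  # True where the value is NOT in exclude
kept = indices[mask]               # boolean mask keeps the original order

# Alternative: set difference that returns a NumPy array directly
kept_alt = np.setdiff1d(indices, exclude)

print(kept)
print(kept_alt)
```

Both avoid the round trip through Python sets and lists described in the edits to the question.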
I have this piece of code that creates a new dataframe column, using first a conditional, and then slicing some string, with a fixed slicing index (0, 5):
df.loc[df['operation'] == 'dividend', ['order_adj']] = df['comment'].str.slice(0, 5)
But instead of a fixed slicing index, I need to use str.find() at the end of this code to get a dynamic slice index on df['comment'], based on its characters.
As I'm creating a new column by broadcasting, I couldn't find the correct syntax to use str.find('some_string') inside str.slice(). Thanks.
Option using split:
df['comment'].str.split("some_string").str[0]
Or option using regex (move the capture group to be where you want regarding inclusive/exclusive):
df['comment'].str.extract("(.*?)some_string")
df['comment'].str.extract("(.*?some_string)")
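A small sketch of both options on made-up data. The comment strings and the " - " separator are assumptions standing in for "some_string":

```python
import pandas as pd

# Hypothetical data; " - " plays the role of "some_string"
df = pd.DataFrame({
    "operation": ["dividend", "buy", "dividend"],
    "comment": ["12345 - paid", "n/a", "987 - paid"],
})

is_dividend = df["operation"] == "dividend"

# Split option: keep everything before the first separator,
# assigned only to the dividend rows (other rows become NaN)
df.loc[is_dividend, "order_adj"] = df["comment"].str.split(" - ").str[0]

# Regex option: expand=False returns a Series instead of a one-column DataFrame;
# rows without a match become NaN
via_regex = df["comment"].str.extract(r"(.*?) - ", expand=False)

print(df)
print(via_regex)
```

Note one difference: when the separator is missing, split(...).str[0] returns the whole string, while extract returns NaN.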
I am new to matplotlib and scipy. I want to create a two-dimensional mgrid in matplotlib and assign values that I have generated to individual cells of this two-dimensional array. How can I do it? I am looking for an assignment function such as a[i,j] = k but I can't find one. Any clues?
Thanks in advance.
Ranga
OK. I think I found the answer. What I wanted to do is better done with a numpy.array. So the way to do this (for me) was:
t = []
zeroRow = []
for j in range(cols):
    zeroRow.append(0)
for i in range(rows):
    t.append(zeroRow)
spectrogramData = np.array(t,float)
Later I read the values from a file where the row and column are stored and assign to the spectrogramData
spectrogramData[row][column] = valueRead
My confusion was not knowing how to access the wrapped array. It is accessed like any two dimensional array.
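For what it's worth, the same zero-filled array can be created in one call with numpy.zeros, and a single cell can be assigned with a[i, j] = k. A minimal sketch, where the sizes and the value are made up:

```python
import numpy as np

rows, cols = 3, 4  # hypothetical dimensions

# Zero-filled 2D float array, equivalent to the list-building loops above
spectrogramData = np.zeros((rows, cols), dtype=float)

# Assign one cell; [row, column] and [row][column] both work for a 2D array
row, column, valueRead = 1, 2, 0.5
spectrogramData[row, column] = valueRead

print(spectrogramData)
```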
Thanks for responding!