ARRAYFORMULA ON SPECIFIC CELLS - sum

So I have this formula. It basically multiplies the values across each row and sums up the products, which is awesome and all:
=sum(ArrayFormula(iferror(A1:A*B1:B*C1:C)))
I would like a way to choose which rows it multiplies and sums up. Could I put a specific letter in those cells, or tag them in some way, to "filter" them so it only sums up, say, rows 1, 2, 4 and so on, for however many rows I'd like to add and whichever rows I want to include?
EXAMPLE:
1: 100 4 10
2: 120 2 12
3: 125 5 10
4: 105 3 15

Not sure I fully understand the question, but I believe you could solve the problem by introducing a fourth column, using "1" to indicate rows to add to the sum and "0" for rows to ignore. By extending your formula to include the new column D, each row's product is multiplied by 1 (keeping its value, since N*1=N) or by 0 (dropping it, since N*0=0):
=sum(ArrayFormula(iferror(A1:A*B1:B*C1:C*D1:D)))
The example data below would sum rows 1, 2 and 4:
1: 100 4 10 1
2: 120 2 12 1
3: 125 5 10 0
4: 105 3 15 1

Related

We have an age column containing single values or values of 15 and above, and in the target we need either the single value or 15+.

If the source value is 3 or 4, the target value is 3 or 4. If the source has any negative value, the target is -1, and if the source value is 15 or more, the target is 15+.
Table 1      Table 2
Age column   Age column
3            3
4            4
15           15+
-2           -1
-3           -1
100          15+
You can use the CASE WHEN THEN END syntax for problems like that (https://www.w3schools.com/sql/sql_case.asp).
I assume you have 3 conditions:
negative values are always -1
up to 14 it is the original value
15 and above is 15+
That means you have to cover these 3 cases with a CASE WHEN THEN END clause. Just evaluate what comes out of your SELECT against the first table and then transform it into the wanted outcome.
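A minimal sketch of what that could look like, assuming the source table is called Table1 and its numeric column is called Age (both names are guesses from the question); the result is cast to text so it can hold '15+':
SELECT
  CASE
    WHEN Age < 0   THEN '-1'               -- negative values are always -1
    WHEN Age >= 15 THEN '15+'              -- 15 and above is 15+
    ELSE CAST(Age AS VARCHAR(10))          -- up to 14 keep the original value
  END AS Age
FROM Table1;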

How to iterate over rows and get max values of any previous rows

I have this dataframe:
pd.DataFrame({'ids':['a','b','c','d','e','f']
,'id_order':[1,2,3,4,5,6]
,'value':[1000,500,3000,2000,1000,5000]})
What I want is to iterate over the rows and get the maximum value of all previous rows.
For example, when I iterate to id_order==2 I would get 1000 (from id_order 1).
When I move forward to id_order==5, I would get 3000 (from id_order 3).
The desired outcome should be as follows:
pd.DataFrame({'ids':['a','b','c','d','e','f']
,'id_order':[1,2,3,4,5,6]
,'value':[1000,500,3000,2000,1000,5000]
,'outcome':[0,1000,1000,3000,3000,3000]})
This will be done on a big dataset so efficiency is also a factor.
I would greatly appreciate your help in this.
Thanks
You can shift the value column and take the cumulative maximum:
df["outcome"] = df.value.shift(fill_value=0).cummax()
Since shifting leaves the first entry empty, we fill it with 0.
>>> df
ids id_order value outcome
0 a 1 1000 0
1 b 2 500 1000
2 c 3 3000 1000
3 d 4 2000 3000
4 e 5 1000 3000
5 f 6 5000 3000

Count instances of number in Row

I have a sheet formatted somewhat like this:
Thing     5   6   7   Person 1   Person 2   Person 3
Thing 1       1   2   7          7          6
Thing 2               5          5
Thing 3               7          6          6
Thing 4               6          6          5
I am trying to find a query formula that I can place in the columns labeled 5,6,7 that will count the number of people who have that amount of Thing 1. For example, I filled out the Thing 1 row, showing that 1 person has 6 of Thing 1 and 2 people have 7 of Thing 1.
You can use this function: "COUNTIF".
The formula to write in the cells will look like this:
=COUNTIF(E2:G2;"=5")
For more information regarding this function, check the documentation: https://support.google.com/docs/answer/3093480?hl=en
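If the headers 5, 6 and 7 sit in row 1 (say B1:D1) and the person columns are E:G, as the E2:G2 range above suggests (the exact cell layout is an assumption), you can also point the criterion at the column header so the same formula fills across all three count columns:
=COUNTIF($E2:$G2;B$1)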

Looking up values from one dataframe in specific row of another dataframe

I'm struggling with a bit of a complex (to me) lookup-type problem.
I have a dataframe df1 that looks like this:
Grade Value
0 A 16
1 B 12
2 C 5
And another dataframe (df2) where the values from one of df1's columns ('Grade') form the index:
Tier 1 Tier 2 Tier 3
A 20 17 10
B 16 11 3
C 7 6 2
I've been trying to write a bit of code that, for each row in df1, looks up the row corresponding to 'Grade' in df2, finds the smallest value in df2 greater than 'Value', and returns the name of that column.
E.g. for the second row of df1, it looks up the row with index 'B' in df2: 16 is the smallest value greater than 12, so it returns 'Tier 1'. Ideal output would be:
Grade Value Tier
0 A 16 Tier 2
1 B 12 Tier 1
2 C 5 Tier 2
My novice, downloaded-Python-last-week attempt so far has been as follows, which is throwing up all manner of errors and doesn't even try returning the column name. Sorry also about the micro-ness of the question: any help appreciated!
for i, row in input_df1.iterrows():
    Tier = np.argmin(df1['Value'] < df2.loc[row, 0:df2.shape[1]])
    df2.loc[df1.Grade].eq(df1.Value, 0).idxmax(1)
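A possible sketch of the lookup described above, rebuilding both DataFrames from the question and applying the search row by row (the helper function name is made up):
import pandas as pd

df1 = pd.DataFrame({'Grade': ['A', 'B', 'C'], 'Value': [16, 12, 5]})
df2 = pd.DataFrame({'Tier 1': [20, 16, 7],
                    'Tier 2': [17, 11, 6],
                    'Tier 3': [10, 3, 2]},
                   index=['A', 'B', 'C'])

def smallest_tier_above(row):
    # Take the tier thresholds for this row's grade, keep only the ones
    # strictly greater than 'Value', and return the column name of the
    # smallest of them (None if no tier is high enough).
    thresholds = df2.loc[row['Grade']]
    eligible = thresholds[thresholds > row['Value']]
    return eligible.idxmin() if not eligible.empty else None

df1['Tier'] = df1.apply(smallest_tier_above, axis=1)

This reproduces the ideal output above (Tier 2, Tier 1, Tier 2), though apply works row by row and may be slow on very large frames.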

Need query to find rows that may have 4, 5 or 6 consecutive numbers

Trying to come up with a query that will find rows that contain 4, 5 or 6 consecutive numbers.
For example: Table MyNumbers contains 6 columns of number combinations from 1 to 52.
Column names are: nbr1 nbr2 nbr3 nbr4 nbr5 nbr6
Row one contains: 1 5 43 50 51 52
Row two contains: 41 42 43 44 45 52 <----- five consecutive numbers
Row three contains: 8 14 38 39 42 50
Row four contains: 1 2 3 4 15 29 <----- four consecutive numbers
Row five contains: 8 14 24 36 48 51
Row six contains: 1 2 3 4 5 6 <----- six consecutive numbers
Need to come up with a query that would find rows 2, 4 and 6 based on containing a result set where there were 4 or more consecutive numbers in that row.
I created a database that contains all possible combinations of 6 numbers out of 52 (1 to 52). What I would like to do is eliminate rows that have four or more consecutive numbers, so I am not sure the above would do the trick. For those that asked, I am using SQL Server 2008 R2.
Assuming the numbers are always increasing and not repeating:
select *
from mynumbers
where nbr4 - nbr1 = 3
or nbr5 - nbr2 = 3
or nbr6 - nbr3 = 3
I took the liberty of simplifying it, since any run of 5 or 6 consecutive numbers necessarily contains a run of 4 consecutive numbers, so checking for runs of 4 is enough.
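Since the stated goal is to eliminate those rows rather than list them, the same condition can simply be negated (or reused in a DELETE), under the same assumption that the numbers are increasing and non-repeating:
select *
from mynumbers
where not (nbr4 - nbr1 = 3
        or nbr5 - nbr2 = 3
        or nbr6 - nbr3 = 3)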