I've got a column U and a column L.
I need to look up a value in column U and return the corresponding value from column L.
Column L Column U
516 11
123 11
74 5
46 11
748 21
156 11
189 21
For example:
I want to search for 21 in column U, but I need the last match.
So for 21, the value I need to get is 189.
I tried it with:
=INDEX($L$10:$L$500,MATCH(D2,$U$10:$U$500,0))
But this returns the value for the first 21, so I get 748 as the answer.
Does anybody know how to solve this?
Use AGGREGATE instead of MATCH:
=INDEX($L:$L,AGGREGATE(14,6,ROW($U$10:$U$500)/($U$10:$U$500=D2),1))
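AGGREGATE returns to INDEX the highest row number for which ($U$10:$U$500=D2) resolves to TRUE. Function 14 is LARGE and option 6 ignores errors, so rows where the comparison is FALSE (which produce a division-by-zero error) are skipped. Because ROW() returns sheet row numbers, INDEX references the whole column $L:$L rather than $L$10:$L$500.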
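The same "last match" logic can be sketched outside the spreadsheet. This is only an illustration in Python using the sample data from the question; the function name last_match is hypothetical and not part of any library.

```python
# Sample data from the question, in sheet order.
col_l = [516, 123, 74, 46, 748, 156, 189]
col_u = [11, 11, 5, 11, 21, 11, 21]


def last_match(values, keys, target):
    """Return the value paired with the last key equal to target, or None."""
    # Collect the positions where the key matches, mirroring ROW(...)/(U=D2).
    positions = [i for i, key in enumerate(keys) if key == target]
    if not positions:
        return None
    # max() plays the role of AGGREGATE(14, 6, ..., 1): the largest matching position.
    return values[max(positions)]


print(last_match(col_l, col_u, 21))
```

For the sample data, searching for 21 picks the second 21 rather than the first, which is the behaviour MATCH with match type 0 cannot provide.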
I have a table with two columns:
sort_column = A column I use for sorting
value_column = My metric of interest (a positive integer)
Using SQL, I need to create contiguous groups of rows, ordered by sort_column, such that the sum of value_column within each group is the largest possible but staying below 100 (100 not included).
Find below an example of my desired result.
Thanks
sort_column value_column desired_result
----------- ------------ --------------
1           53           1
2           25           1
3           33           2
4           25           2
5           10           2
6           46           3
7           9            3
8           49           4
9           48           4
10          53           5
11          33           5
12          52           6
13          29           6
14          16           6
15          66           7
16          1            7
17          62           8
18          57           9
19          47           10
20          12           10
OK, so after a few lengthy attempts, I came to the conclusion that the task is impossible with pure SQL. A given value of the desired column depends on previous values of that same column, in a way that cannot be derived from the first two columns alone. The problem therefore cannot be tackled without a recursive CTE, which BigQuery does not support.
I solved the issue by writing a JavaScript UDF for the task. It seems to be working fine and produces the expected results.
Many thanks everyone!
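For readers who want to see the row-by-row dependency, here is a minimal sketch of the grouping logic in Python. It is not the JavaScript UDF mentioned above, which was not posted; it assumes a greedy reading of the requirement (start a new group as soon as adding the next value would reach 100), which reproduces the desired_result column. The function name assign_groups and the limit parameter are hypothetical.

```python
def assign_groups(values, limit=100):
    """Greedily assign group ids so each group's sum stays below limit."""
    groups = []
    group_id = 1
    running = 0  # sum of the current group so far
    for value in values:
        # Start a new group when this value would push the sum to limit or above.
        # The running > 0 check avoids opening an empty group; a single value
        # that is already >= limit gets a group of its own (an assumption,
        # since the question does not cover that case).
        if running > 0 and running + value >= limit:
            group_id += 1
            running = 0
        running += value
        groups.append(group_id)
    return groups


# value_column from the question, ordered by sort_column.
value_column = [53, 25, 33, 25, 10, 46, 9, 49, 48, 53,
                33, 52, 29, 16, 66, 1, 62, 57, 47, 12]
print(assign_groups(value_column))
```

The running total is reset whenever a group closes, which is exactly the state that a window function over the first two columns cannot carry.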
In a DataFrame, I have negative numbers, and also missing values that are given by a - . I want to replace the missing values with an empty cell, but this operation should NOT remove the - in front of the negative numbers.
It looks like:
45 45 45 45 45 45 45 45 45 45
45 45 15 31 43 45 45 45 45 45
44.24 121.55 1.80 0.00% - 97.63 -4.87 -6.02 -20.14 169.19
1 1 7 12 3 1 1 1 1 1
So the missing value cell with the - should be empty, but the -4.87 should stay intact.
Any help would be greatly appreciated.
Ideally, the problem would be addressed when loading the file into the DataFrame, by passing the na_values parameter to read_csv() (or whichever function you used).
At this point, use replace(): with a plain string it matches whole cell values, not individual characters, so the minus sign in front of negative numbers is left alone.
import numpy as np
df = df.replace("-", np.nan)
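Both approaches can be tried on a small example. The sketch below uses made-up column names (a, b, x, y, z) and a tiny inline CSV purely for illustration; it assumes the cells are stored as strings and that a missing value is exactly "-" with no surrounding spaces.

```python
import io

import numpy as np
import pandas as pd

# Option 1: clean an existing DataFrame. replace() with a plain string
# matches whole cell values, so "-4.87" and "-6.02" are untouched.
df = pd.DataFrame({"a": ["44.24", "-", "-4.87"],
                   "b": ["0.00%", "-6.02", "-"]})
cleaned = df.replace("-", np.nan)
print(cleaned)

# Option 2: treat "-" as missing while reading the file.
csv_text = "x,y,z\n44.24,-,-4.87\n"
loaded = pd.read_csv(io.StringIO(csv_text), na_values=["-"])
print(loaded)
```

If the cells need to display as empty strings rather than NaN, calling fillna("") on the result is one option.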
Let's say I have a column A:
A
1 | 10
2 | 20
3 | 33
4 | 42
In row 5 I can calculate the maximum of the column: MAX(A1:A4), which returns 42. In row 6 I would like to get the row number of the maximum, i.e. row number 4.
Thanks
A5 contains =MAX(A1:A4)
A6 should then have the formula =MATCH(A5;A1:A4;0)
MATCH returns the position of the value within the search range. Since the range starts in row 1 here, that position equals the row number.
Does this help to solve your problem?
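As a cross-check of what the two formulas compute, here is a short Python sketch with the sample values. It only mirrors the spreadsheet logic; the variable names are illustrative.

```python
# Column A from the question, rows 1 to 4.
values = [10, 20, 33, 42]

maximum = max(values)                   # equivalent of =MAX(A1:A4)
position = values.index(maximum) + 1    # equivalent of =MATCH(A5;A1:A4;0), 1-based

print(maximum, position)
```

Like MATCH with match type 0, index() returns the first occurrence if the maximum appears more than once.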
I have a table in PowerPivot that contains the logged data of a traffic-control camera mounted on a road. The table records the velocity and the number of vehicles that pass the camera during a specific time window (e.g. 14:10 - 15:25). How can I get the average velocity of cars for a specific hour and list the results in a separate table with 24 rows (hours 0 - 23), where the second column of each row is the weighted average velocity for that hour? A sample of my stat_table data is given below:
count vel hour
----- --- ----
133 96.00237 15
117 91.45705 21
81 81.90521 6
2 84.29946 21
4 77.7841 18
1 140.8766 17
2 56.14951 14
6 71.72839 13
4 64.14309 9
1 60.949 17
1 77.00728 21
133 100.3956 6
109 100.8567 15
54 86.6369 9
1 83.96901 17
10 114.6556 21
6 85.39127 18
1 76.77993 15
3 113.3561 2
3 94.48055 2
In a separate PowerPivot table I have 24 rows and 2 columns, but when I enter my formula, every row is filled with the same number. My formula is:
=sumX(FILTER(stat_table, stat_table[hour]=[hour]), stat_table[count] * stat_table[vel])/sumX(FILTER(stat_table, stat_table[hour]=[hour]), stat_table[count])
Create a new calculated column named "WeightedVelocity" as follows:
WeightedVelocity = [count]*[vel]
Create a measure named "WeightedAverage" as follows:
WeightedAverage = sum(stat_table[WeightedVelocity]) / sum(stat_table[count])
Put the "WeightedAverage" measure in the VALUES area of the PivotTable and the "hour" column in ROWS to get the desired result.
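The arithmetic behind the measure can be checked by hand. The following Python sketch computes the count-weighted average velocity per hour for a few rows taken from the sample data; the function name weighted_average_by_hour is hypothetical and the code only illustrates the calculation, not PowerPivot itself.

```python
from collections import defaultdict

# (count, vel, hour) rows copied from the sample stat_table data.
rows = [
    (3, 113.3561, 2),
    (3, 94.48055, 2),
    (1, 140.8766, 17),
    (1, 60.949, 17),
    (1, 83.96901, 17),
    (133, 96.00237, 15),
    (109, 100.8567, 15),
    (1, 76.77993, 15),
]


def weighted_average_by_hour(rows):
    """Return {hour: sum(count * vel) / sum(count)} for each hour present."""
    weighted_sum = defaultdict(float)  # per-hour sum of count * vel
    total_count = defaultdict(int)     # per-hour sum of count
    for count, vel, hour in rows:
        weighted_sum[hour] += count * vel
        total_count[hour] += count
    return {hour: weighted_sum[hour] / total_count[hour]
            for hour in sorted(total_count)}


result = weighted_average_by_hour(rows)
for hour, average in result.items():
    print(hour, round(average, 4))
```

Grouping by hour before dividing is what the PivotTable row context does for the measure; the original formula produced one number everywhere because its filter was not evaluated per hour.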
I have a table called processed. The last column is named monthid, and its data type is bigint. When I run a simple query like this, I get no results:
select * from processed where monthid = 5 ;
A few rows for the table have been shown below. Can someone suggest what's wrong here?
11741 Negative 11 69.55 1401172919 48 27 5
11741 Negative 11 102.0 1401172997 48 27 5
11741 Negative 11 145.78 1401173093 48 27 5
11741 Negative 11 70.54 1401173137 49 27 5
11741 Negative 11 85.2 1401173146 49 27 5
11741 Negative 11 67.47 1401173156 49 27 5
11741 Negative 11 92.76 1401173223 49 27 5
As can be seen from the sample data above, the last column has monthid = 5. However, my query returns nothing.
I believe the problem was that I had partitioned the table on column #6, so, due to either a permissions issue or something else I could not pin down, the query returned nothing. After I dropped the table and recreated it without the partition, the query worked fine. For more information on this, please refer to
Hive - Queries on Partitions return nothing