Selecting the maximum value only for another maximum value - sql

If I have two int columns in SQL Server, how can I write a query that returns the maximum value of one column among the rows that hold the maximum value of the other column?
Let me give an example. Let's say I have this table:
| Name | Version | Category | Value | Number | Replication |
|:-----:|:-------:|:--------:|:-----:|:------:|:-----------:|
| File1 | 1.0 | Time | 123 | 1 | 1 |
| File1 | 1.0 | Size | 456 | 1 | 1 |
| File2 | 1.0 | Time | 312 | 1 | 1 |
| File2 | 1.0 | Size | 645 | 1 | 1 |
| File1 | 1.0 | Time | 369 | 1 | 2 |
| File1 | 1.0 | Size | 258 | 1 | 2 |
| File2 | 1.0 | Time | 741 | 1 | 2 |
| File2 | 1.0 | Size | 734 | 1 | 2 |
| File1 | 1.1 | Time | 997 | 2 | 1 |
| File1 | 1.1 | Size | 997 | 2 | 1 |
| File2 | 1.1 | Time | 438 | 2 | 1 |
| File2 | 1.1 | Size | 735 | 2 | 1 |
| File1 | 1.1 | Time | 786 | 2 | 2 |
| File1 | 1.1 | Size | 486 | 2 | 2 |
| File2 | 1.1 | Time | 379 | 2 | 2 |
| File2 | 1.1 | Size | 943 | 2 | 2 |
| File1 | 1.2 | Time | 123 | 3 | 1 |
| File1 | 1.2 | Size | 456 | 3 | 1 |
| File2 | 1.2 | Time | 312 | 3 | 1 |
| File2 | 1.2 | Size | 645 | 3 | 1 |
| File1 | 1.2 | Time | 369 | 3 | 2 |
| File1 | 1.2 | Size | 258 | 3 | 2 |
| File2 | 1.2 | Time | 741 | 3 | 2 |
| File2 | 1.2 | Size | 734 | 3 | 2 |
| File1 | 1.3 | Time | 997 | 4 | 1 |
| File1 | 1.3 | Size | 997 | 4 | 1 |
| File2 | 1.3 | Time | 438 | 4 | 1 |
| File2 | 1.3 | Size | 735 | 4 | 1 |
How could I write a query that selects the maximum Replication value at the maximum Number value? As you can see, in this table the maximum value in Number is 4, but the maximum value in Replication where Number = 4 is 1.
All I can think to do is this:
SELECT MAX(Replication) FROM Table
WHERE Number IS MAX;
which is obviously wrong and doesn't work.

You can try GROUP BY and HAVING:
select max(Replication) from Table_Name group by [Number] having
[Number]=(select max([Number]) from Table_Name)

Just use a subquery in the where clause to find the max number. If you only want a single number as the result, there is no need for group by and having (which can make the query more expensive):
select max([replication]) from tab
where number = (select max(number) from tab)
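As a quick check of the subquery approach, here is a minimal sketch that runs it against the Number/Replication pairs from the question. It uses Python's built-in sqlite3 module rather than SQL Server, purely so the example is self-contained; the table name tab follows the answer above, and only the two relevant columns are loaded.

```python
import sqlite3

# (Number, Replication) pairs taken from the question's sample table.
pairs = [(1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2), (4, 1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (number INTEGER, replication INTEGER)")
conn.executemany("INSERT INTO tab VALUES (?, ?)", pairs)

# The inner query finds the highest Number; the outer query takes the
# highest Replication among the rows that have that Number.
max_replication = conn.execute(
    "SELECT MAX(replication) FROM tab "
    "WHERE number = (SELECT MAX(number) FROM tab)"
).fetchone()[0]

print(max_replication)  # 1
```

The result is 1 rather than 2 because only rows with Number = 4 are considered, and none of them has Replication = 2.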

Related

Theil–Sen estimator using Hive

I would like to calculate the Theil–Sen estimator (https://en.wikipedia.org/wiki/Theil%E2%80%93Sen_estimator) per ID for the value column in the sample table below using Hive. I tried to use arrays but could not figure out a solution. Any help is appreciated.
+----+-------+-------+
| 1 | 1 | 10 |
| 1 | 2 | 20 |
| 1 | 3 | 30 |
| 1 | 4 | 40 |
| 1 | 5 | 50 |
| 2 | 1 | 100 |
| 2 | 2 | 90 |
| 2 | 3 | 102 |
| 2 | 4 | 75 |
| 2 | 5 | 70 |
| 2 | 6 | 50 |
| 2 | 7 | 100 |
| 2 | 8 | 80 |
| 2 | 9 | 60 |
| 2 | 10 | 50 |
| 2 | 11 | 40 |
| 2 | 12 | 40 |
+----+-------+-------+
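For reference, the estimator is the median of the slopes between all pairs of points. The sketch below computes it per ID in plain Python, not Hive, only to pin down the numbers a Hive query should reproduce. It assumes the three unlabeled columns are an ID, an x position, and the value; the function name theil_sen_slope is made up for this example.

```python
from collections import defaultdict
from itertools import combinations
from statistics import median

# Rows from the sample table, read as (id, x, value); the column meaning is assumed.
rows = [
    (1, 1, 10), (1, 2, 20), (1, 3, 30), (1, 4, 40), (1, 5, 50),
    (2, 1, 100), (2, 2, 90), (2, 3, 102), (2, 4, 75), (2, 5, 70), (2, 6, 50),
    (2, 7, 100), (2, 8, 80), (2, 9, 60), (2, 10, 50), (2, 11, 40), (2, 12, 40),
]

def theil_sen_slope(points):
    """Median of the slopes over all pairs of points with distinct x."""
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if x2 != x1
    ]
    return median(slopes)

# Collect the (x, value) points of each ID.
points_by_id = defaultdict(list)
for row_id, x, value in rows:
    points_by_id[row_id].append((x, value))

slopes_by_id = {row_id: theil_sen_slope(pts) for row_id, pts in points_by_id.items()}
print(slopes_by_id[1])  # 10.0, since every pair for ID 1 has slope 10
```

One way to translate this to Hive would be a self-join on the ID with a.x < b.x to form the pairs, followed by a median of the slopes per ID; that translation is not tested here.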

Pandas column backfill decreasing / increasing

I have a DataFrame:
| ind | A | B |
------------------------
| 1.01 | 10 | -1.734 |
| 1.04 | 10 | -1.244 |
| 1.05 | 10 | 0.016 |
| 1.11 | NaN | -2.737 | <-
| 1.13 | NaN | -4.232 | <-
| 1.19 | 11 | -3.241 | <=
| 1.20 | 12 | -2.832 |
| 1.21 | 10 | -4.277 |
and would like to back-fill the NaN values using a decreasing sequence that ends at the next valid value:
| ind | A | B |
------------------------
| 1.01 | 10 | -1.734 |
| 1.04 | 10 | -1.244 |
| 1.05 | 10 | 0.016 |
| 1.11 | 13 | -2.737 | <-
| 1.13 | 12 | -4.232 | <-
| 1.19 | 11 | -3.241 | <=
| 1.20 | 12 | -2.832 |
| 1.21 | 10 | -4.277 |
Is there a way to do this?
First, get the positions where NaNs are found:
positions = df['A'].isna().astype(int)
| positions |
--------------
| 0 |
| 0 |
| 0 |
| 1 |
| 1 |
| 0 |
| 0 |
| 0 |
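Then do a reverse cumulative sum that restarts after each run of NaNs:
# Keep the mask boolean so it can be negated with ~ and passed to where()
mask = df['A'].isna().loc[::-1]
cumSum = mask.astype(int).cumsum()
posCumSum = (cumSum - cumSum.where(~mask).ffill().fillna(0).astype(int)).loc[::-1]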
| posCumSum |
--------------
| 0 |
| 0 |
| 0 |
| 2 |
| 1 |
| 0 |
| 0 |
| 0 |
Finally, add it to the backfilled original column:
df['A'] = df['A'].bfill() + posCumSum
| ind | A | B |
------------------------
| 1.01 | 10 | -1.734 |
| 1.04 | 10 | -1.244 |
| 1.05 | 10 | 0.016 |
| 1.11 | 13 | -2.737 | <-
| 1.13 | 12 | -4.232 | <-
| 1.19 | 11 | -3.241 | <=
| 1.20 | 12 | -2.832 |
| 1.21 | 10 | -4.277 |
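To make the reverse cumulative sum less opaque, the same logic can be sketched as a plain loop, without pandas: walk the column from the end, remember the last valid value, and count how many missing entries have been seen since it. The function name backfill_decreasing is hypothetical, and None stands in for NaN.

```python
def backfill_decreasing(values):
    """Fill each run of None so it counts down to the next valid value."""
    filled = list(values)
    next_valid = None   # closest valid value to the right of the current position
    steps = 0           # number of missing entries seen since next_valid
    for i in range(len(filled) - 1, -1, -1):
        if filled[i] is None:
            if next_valid is not None:
                steps += 1
                filled[i] = next_valid + steps
        else:
            next_valid = filled[i]
            steps = 0
    return filled

column_a = [10, 10, 10, None, None, 11, 12, 10]
print(backfill_decreasing(column_a))  # [10, 10, 10, 13, 12, 11, 12, 10]
```

Missing entries at the very end of the column have no following valid value, so this sketch leaves them as None.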

How do I conditionally increase the value of the proceeding row number by 1

I need each row's value to be the preceding row's value plus 1. When a row meets another condition, I need to reset the counter. This is probably easiest to explain with an example:
+---------+------------+------------+-----------+----------------+
| Acct_ID | Ins_Date | Acct_RowID | indicator | Desired_Output |
+---------+------------+------------+-----------+----------------+
| 5841 | 07/11/2019 | 1 | 1 | 1 |
| 5841 | 08/11/2019 | 2 | 0 | 2 |
| 5841 | 09/11/2019 | 3 | 0 | 3 |
| 5841 | 10/11/2019 | 4 | 0 | 4 |
| 5841 | 11/11/2019 | 5 | 1 | 1 |
| 5841 | 12/11/2019 | 6 | 0 | 2 |
| 5841 | 13/11/2019 | 7 | 1 | 1 |
| 5841 | 14/11/2019 | 8 | 0 | 2 |
| 5841 | 15/11/2019 | 9 | 0 | 3 |
| 5841 | 16/11/2019 | 10 | 0 | 4 |
| 5841 | 17/11/2019 | 11 | 0 | 5 |
| 5841 | 18/11/2019 | 12 | 0 | 6 |
| 5132 | 11/03/2019 | 1 | 1 | 1 |
| 5132 | 12/03/2019 | 2 | 0 | 2 |
| 5132 | 13/03/2019 | 3 | 0 | 3 |
| 5132 | 14/03/2019 | 4 | 1 | 1 |
| 5132 | 15/03/2019 | 5 | 0 | 2 |
| 5132 | 16/03/2019 | 6 | 0 | 3 |
| 5132 | 17/03/2019 | 7 | 0 | 4 |
| 5132 | 18/03/2019 | 8 | 0 | 5 |
| 5132 | 19/03/2019 | 9 | 1 | 1 |
| 5132 | 20/03/2019 | 10 | 0 | 2 |
+---------+------------+------------+-----------+----------------+
The column I want to create is 'Desired_Output'. As the table shows, it depends on the column 'indicator': each row should be the previous row's value plus 1, unless that row's indicator is 1, in which case the counter resets to 1.
I have tried to use a loop method of some sort but this did not produce the desired results.
Is this possible in some way?
The trick is to identify each group of consecutive rows that starts at a row with indicator = 1 and runs until the next one. This is achieved with a cross apply that finds the latest Acct_RowID with indicator = 1 at or before the current row; that value serves as Grp_RowID, which is then used in the partition by of the row_number() window function:
select *,
       Desired_Output = row_number() over (partition by t.Acct_ID, Grp_RowID
                                           order by Acct_RowID)
from your_table t
cross apply
(
    select Grp_RowID = max(Acct_RowID)
    from your_table x
    where x.Acct_ID = t.Acct_ID
    and x.Acct_RowID <= t.Acct_RowID
    and x.indicator = 1
) g
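An alternative way to build the same groups, sketched here as an assumption rather than as part of the answer above, is a running sum of indicator: it increases by 1 at every row where indicator = 1, so it can serve as the partition key. The example runs in SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later) on a subset of the sample rows.

```python
import sqlite3

# Subset of the question's sample rows: (Acct_ID, Acct_RowID, indicator).
rows = [
    (5841, 1, 1), (5841, 2, 0), (5841, 3, 0), (5841, 4, 0),
    (5841, 5, 1), (5841, 6, 0), (5841, 7, 1), (5841, 8, 0),
    (5132, 1, 1), (5132, 2, 0), (5132, 3, 0), (5132, 4, 1), (5132, 5, 0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table (Acct_ID INTEGER, Acct_RowID INTEGER, indicator INTEGER)"
)
conn.executemany("INSERT INTO your_table VALUES (?, ?, ?)", rows)

query = """
WITH grouped AS (
    -- Running sum of indicator: constant within a group, +1 at each reset row.
    SELECT Acct_ID, Acct_RowID,
           SUM(indicator) OVER (PARTITION BY Acct_ID ORDER BY Acct_RowID) AS grp
    FROM your_table
)
SELECT Acct_ID, Acct_RowID,
       ROW_NUMBER() OVER (PARTITION BY Acct_ID, grp ORDER BY Acct_RowID) AS Desired_Output
FROM grouped
ORDER BY Acct_ID DESC, Acct_RowID
"""
result = conn.execute(query).fetchall()

for acct_id, row_id, desired in result:
    print(acct_id, row_id, desired)
```

This variant relies on indicator being exactly 0 or 1 and on each account's first row having indicator = 1, as in the sample data.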

How do I transpose or pivot subgroups into a single row?

I have a group-by-top-n-results query that is shown in the example input data. The subgroups (grouped by ID) are limited to the top 10 results and they are sorted ASC by rank. How do I go from the input example to the output example?
I was thinking it might be some sort of pivot function or a crosstab solution, or maybe it needs to be joined on itself. I'm just not sure exactly how to make this work. It's almost as if we concatenate each subgroup on its own row.
Each subgroup can have a maximum of 10 top results, but may have fewer. In the example, subgroups 1003 and 1007 have no results past the top 6 and the top 3 respectively. The schema in the example output is what I am looking for, with all fields of the 10 possible top-ranked rows as columns.
Example input data
+------+------+----------------------+---------+---------+-------+---------+---------+
| rank | ID | NAME | FIELD 1 | FIELD 2 | MAIN | FIELD 3 | FIELD 4 |
+------+------+----------------------+---------+---------+-------+---------+---------+
| 1 | 1001 | Backdash | 123053 | 2 | 21.1 | 17.09 | 20 |
| 2 | 1001 | cold Stone Creamery | 115404 | 2 | 19.78 | 1.04 | 0.93 |
| 3 | 1001 | Mado | 97650 | 2 | 16.74 | 0.1 | 0.14 |
| 4 | 1001 | Friendly's | 57638 | 1 | 9.88 | 0.21 | 0.4 |
| 5 | 1001 | Fosters Freeze | 53187 | 2 | 9.12 | 0.24 | 1 |
| 6 | 1001 | Marble Slab Creamery | 51381 | 2 | 8.81 | 15.75 | 28.57 |
| 7 | 1001 | Lappert's | 35929 | 1 | 6.16 | 0.01 | 0.04 |
| 8 | 1001 | Greater's | 23050 | 1 | 3.95 | 0.01 | 0.05 |
| 9 | 1001 | Happy Joe's | 20422 | 1 | 3.5 | 12.73 | 25 |
| 10 | 1001 | Shake Shack | 4260 | 1 | 0.73 | 8.1 | 50 |
| 1 | 1003 | Mauds Ice Cream | 949152 | 11 | 22.29 | 0.98 | 0.75 |
| 2 | 1003 | Mr Whippy | 433590 | 5 | 10.18 | 0.61 | 0.78 |
| 3 | 1003 | New Zeland Natural | 411348 | 7 | 9.66 | 0.03 | 0.12 |
| 4 | 1003 | Tropical Sno | 394558 | 10 | 9.27 | 0.15 | 0.4 |
| 5 | 1003 | Culver's | 375755 | 5 | 8.82 | 3.47 | 2.96 |
| 6 | 1003 | Marble Slab Creamery | 276971 | 7 | 6.5 | 13.05 | 12.07 |
| 1 | 1007 | Greater's | 105866 | 2 | 58.96 | 19.91 | 12.5 |
| 2 | 1007 | Hagan-Daz | 37697 | 3 | 20.99 | 26.17 | 37.5 |
| 3 | 1007 | cold Stone Creamery | 25520 | 1 | 14.21 | 0.23 | 0.47 |
+------+------+----------------------+---------+---------+-------+---------+---------+
Example Output Format
+------+-----------------+---------------+----------------+-------------+----------------+-----------------+---------------------+----------------+----------------+-------------+----------------+-----------------+---------------------+----------------+----------------+-------------+----------------+-----------------+--------------+----------------+----------------+-------------+----------------+-----------------+----------------+----------------+----------------+-------------+----------------+-----------------+----------------------+----------------+----------------+-------------+----------------+-----------------+-------------+----------------+----------------+-------------+----------------+-----------------+-------------+----------------+----------------+-------------+----------------+-----------------+-------------+----------------+----------------+-------------+----------------+-----------------+--------------+-----------------+-----------------+--------------+-----------------+------------------+
| ID | TOP 1 NAME | TOP 1 FIELD 1 | TOP 1 FIELD 2 | TOP 1 MAIN | TOP 1 FIELD 3 | TOP 1 FIELD 4 | TOP 2 NAME | TOP 2 FIELD 1 | TOP 2 FIELD 2 | TOP 2 MAIN | TOP 2 FIELD 3 | TOP 2 FIELD 4 | TOP 3 NAME | TOP 3 FIELD 1 | TOP 3 FIELD 2 | TOP 3 MAIN | TOP 3 FIELD 3 | TOP 3 FIELD 4 | TOP 4 NAME | TOP 4 FIELD 1 | TOP 4 FIELD 2 | TOP 4 MAIN | TOP 4 FIELD 3 | TOP 4 FIELD 4 | TOP 5 NAME | TOP 5 FIELD 1 | TOP 5 FIELD 2 | TOP 5 MAIN | TOP 5 FIELD 3 | TOP 5 FIELD 4 | TOP 6 NAME | TOP 6 FIELD 1 | TOP 6 FIELD 2 | TOP 6 MAIN | TOP 6 FIELD 3 | TOP 6 FIELD 4 | TOP 7 NAME | TOP 7 FIELD 1 | TOP 7 FIELD 2 | TOP 7 MAIN | TOP 7 FIELD 3 | TOP 7 FIELD 4 | TOP 8 NAME | TOP 8 FIELD 1 | TOP 8 FIELD 2 | TOP 8 MAIN | TOP 8 FIELD 3 | TOP 8 FIELD 4 | TOP 9 NAME | TOP 9 FIELD 1 | TOP 9 FIELD 2 | TOP 9 MAIN | TOP 9 FIELD 3 | TOP 9 FIELD 4 | TOP 10 NAME | TOP 10 FIELD 1 | TOP 10 FIELD 2 | TOP 10 MAIN | TOP 10 FIELD 3 | TOP 10 FIELD 4 |
+------+-----------------+---------------+----------------+-------------+----------------+-----------------+---------------------+----------------+----------------+-------------+----------------+-----------------+---------------------+----------------+----------------+-------------+----------------+-----------------+--------------+----------------+----------------+-------------+----------------+-----------------+----------------+----------------+----------------+-------------+----------------+-----------------+----------------------+----------------+----------------+-------------+----------------+-----------------+-------------+----------------+----------------+-------------+----------------+-----------------+-------------+----------------+----------------+-------------+----------------+-----------------+-------------+----------------+----------------+-------------+----------------+-----------------+--------------+-----------------+-----------------+--------------+-----------------+------------------+
| 1001 | Backdash | 123053 | 2 | 21.1 | 17.09 | 20 | cold Stone Creamery | 115404 | 2 | 19.78 | 1.04 | 0.93 | Mado | 97650 | 2 | 16.74 | 0.1 | 0.14 | Friendly's | 57638 | 1 | 9.88 | 0.21 | 0.4 | Fosters Freeze | 53187 | 2 | 9.12 | 0.24 | 1 | Marble Slab Creamery | 51381 | 2 | 8.81 | 15.75 | 28.57 | Lappert's | 35929 | 1 | 6.16 | 0.01 | 0.04 | Greater's | 23050 | 1 | 3.95 | 0.01 | 0.05 | Happy Joe's | 20422 | 1 | 3.5 | 12.73 | 25 | Shake Shack | 4260 | 1 | 0.73 | 8.1 | 50 |
| 1003 | Mauds Ice Cream | 949152 | 11 | 22.29 | 0.98 | 0.75 | Mr Whippy | 433590 | 5 | 10.18 | 0.61 | 0.78 | New Zeland Natural | 411348 | 7 | 9.66 | 0.03 | 0.12 | Tropical Sno | 394558 | 10 | 9.27 | 0.15 | 0.4 | Culver's | 375755 | 5 | 8.82 | 3.47 | 2.96 | Marble Slab Creamery | 276971 | 7 | 6.5 | 13.05 | 12.07 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1007 | Greater's | 105866 | 2 | 58.96 | 19.91 | 12.5 | Hagan-Daz | 37697 | 3 | 20.99 | 26.17 | 37.5 | cold Stone Creamery | 25520 | 1 | 14.21 | 0.23 | 0.47 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
+------+-----------------+---------------+----------------+-------------+----------------+-----------------+---------------------+----------------+----------------+-------------+----------------+-----------------+---------------------+----------------+----------------+-------------+----------------+-----------------+--------------+----------------+----------------+-------------+----------------+-----------------+----------------+----------------+----------------+-------------+----------------+-----------------+----------------------+----------------+----------------+-------------+----------------+-----------------+-------------+----------------+----------------+-------------+----------------+-----------------+-------------+----------------+----------------+-------------+----------------+-----------------+-------------+----------------+----------------+-------------+----------------+-----------------+--------------+-----------------+-----------------+--------------+-----------------+------------------+
PS.
I am coming from a coding mindset, and haven't done SQL in a while. In pseudocode this could be solved with a simple nested for-loop, e.g.:
foreach subgroup i
    foreach rank j <= 10
        print i, rank[j].name, rank[j].FIELD 1, rank[j].FIELD 2, rank[j].MAIN, rank[j].FIELD 3, rank[j].FIELD 4
    print \r\n
You can use conditional aggregation:
select id,
       max(case when rank = 1 then name end) as name_1,
       max(case when rank = 1 then field1 end) as field1_1,
       . . .
       max(case when rank = 2 then name end) as name_2,
       max(case when rank = 2 then field1 end) as field1_2,
       . . .
       max(case when rank = 10 then field3 end) as field3_10,
       max(case when rank = 10 then field4 end) as field4_10
from inputdata t
group by id;
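Writing out 60 near-identical expressions by hand is error-prone, so the column list can be generated instead. The following is a minimal sketch of that idea, run in SQLite through Python's sqlite3 module; the column names (name, field1, field2, main, field3, field4) and the `<field>_<rank>` alias scheme are assumptions based on the query above, and only subgroup 1007 from the question is loaded.

```python
import sqlite3

FIELDS = ["name", "field1", "field2", "main", "field3", "field4"]  # assumed column names
MAX_RANK = 10

# Build one MAX(CASE ...) expression per (rank, field) combination.
select_parts = ["id"]
for r in range(1, MAX_RANK + 1):
    for f in FIELDS:
        select_parts.append(f'MAX(CASE WHEN "rank" = {r} THEN "{f}" END) AS "{f}_{r}"')
query = "SELECT " + ",\n       ".join(select_parts) + "\nFROM inputdata\nGROUP BY id\nORDER BY id"

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows access to columns by name
conn.execute(
    'CREATE TABLE inputdata ("rank" INTEGER, id INTEGER, "name" TEXT, "field1" INTEGER, '
    '"field2" INTEGER, "main" REAL, "field3" REAL, "field4" REAL)'
)
conn.executemany(
    "INSERT INTO inputdata VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 1007, "Greater's", 105866, 2, 58.96, 19.91, 12.5),
        (2, 1007, "Hagan-Daz", 37697, 3, 20.99, 26.17, 37.5),
        (3, 1007, "cold Stone Creamery", 25520, 1, 14.21, 0.23, 0.47),
    ],
)

pivoted = conn.execute(query).fetchall()
row = pivoted[0]
print(row["id"], row["name_1"], row["name_3"], row["name_4"])
```

Ranks that are missing for a subgroup come back as NULL (None in Python), which corresponds to the "-" cells in the example output.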

Ask about query in sql server

I have a table like this:
| ID | id_number | a | b |
| 1 | 1 | 0 | 215 |
| 2 | 2 | 28 | 8952 |
| 3 | 3 | 10 | 2000 |
| 4 | 1 | 0 | 215 |
| 5 | 1 | 0 |10000 |
| 6 | 3 | 10 | 5000 |
| 7 | 2 | 3 |90933 |
I want to sum a*b for rows with the same id_number. What is the query to get that value for every id_number? For example, the result should look like this:
| ID | id_number | result |
| 1 | 1 | 0 |
| 2 | 2 | 523455 |
| 3 | 3 | 70000 |
This is a simple aggregation query:
select id_number, sum(a*b)
from t
group by id_number
I'm not sure what the first column is for.
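To confirm that the aggregation produces the numbers in the question, here is a small self-contained check. It runs the same query in SQLite through Python's sqlite3 module instead of SQL Server; the table name t follows the answer, and the result alias is added only for readability.

```python
import sqlite3

# (ID, id_number, a, b) rows from the question.
rows = [
    (1, 1, 0, 215), (2, 2, 28, 8952), (3, 3, 10, 2000), (4, 1, 0, 215),
    (5, 1, 0, 10000), (6, 3, 10, 5000), (7, 2, 3, 90933),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ID INTEGER, id_number INTEGER, a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)

# a * b is computed per row, then summed within each id_number group.
totals = conn.execute(
    "SELECT id_number, SUM(a * b) AS result FROM t GROUP BY id_number ORDER BY id_number"
).fetchall()

print(totals)  # [(1, 0), (2, 523455), (3, 70000)]
```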