SQL compare one column, then another, by using max over partition by

DB: SAP HANA
I have asked this question before, but now I'm facing a more complicated version. For each user I want the no with the largest qty; when qty is the same, I want to return the biggest no.
Table A:
user  no  qty
A     10  20
A     11  20
B     12  40
B     13  10

Table B:
id  user
1   A
2   B
Expected result:
id  user  no
1   A     11
2   B     12
I tried:
SELECT
B.id,
B.user,
C.max_qty_no
FROM
B
LEFT JOIN (
SELECT
A.user,
CASE
WHEN A.qty = (
MAX(A.qty) OVER (PARTITION BY A.user)
) THEN A.no
END as max_qty_no
FROM
A
) C ON C.user = B.user AND
C.max_qty_no IS NOT NULL;
This returns:
id  user  no
1   A     10
1   A     11
2   B     12

You want to rank the A rows per user and only select the best-ranked row. So far this ranking was on one column only, so you could simply compare the value with the maximum value. Now, however, the ranking must be done considering two columns instead of just one. You can use ROW_NUMBER for this ranking:
select id, user, no
from
(
select
b.id, b.user, a.no,
row_number() over (partition by b.user order by a.qty desc, a.no desc) as rn
from a
join b on b.user = a.user
) ranked
where rn = 1;
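To see the two-column ranking at work, here is a minimal sketch that runs the same ROW_NUMBER query against the sample data from the question. It assumes SQLite (3.25 or newer, for window functions) as a stand-in for SAP HANA, via Python's sqlite3 module; the identifiers user and no are double-quoted only as a precaution.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SAP HANA (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a ("user" TEXT, "no" INTEGER, qty INTEGER);
    CREATE TABLE b (id INTEGER, "user" TEXT);
    INSERT INTO a VALUES ('A', 10, 20), ('A', 11, 20), ('B', 12, 40), ('B', 13, 10);
    INSERT INTO b VALUES (1, 'A'), (2, 'B');
""")

# Rank rows per user: largest qty first, then largest no as the tie-breaker.
rows = conn.execute("""
    SELECT id, "user", "no"
    FROM (
        SELECT b.id, b."user", a."no",
               ROW_NUMBER() OVER (
                   PARTITION BY b."user"
                   ORDER BY a.qty DESC, a."no" DESC
               ) AS rn
        FROM a
        JOIN b ON b."user" = a."user"
    ) ranked
    WHERE rn = 1
    ORDER BY id
""").fetchall()

print(rows)
```

User A has two rows with qty 20, so the second sort key (no desc) decides which one gets rn = 1.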

Since you want the MAX(no) per user having the largest quantity you need to apply additional selection criteria. The partitioning takes care of selecting the rows with MAX(qty) per user but you still need to select the rows with MAX(no) for each distinct user - you can do this by using the MAX aggregate function combined with a GROUP BY. With this small change you can return the expected results:
SELECT
B.id,
B.user,
MAX(C.max_qty_no) AS no
FROM
B
LEFT JOIN (
SELECT
A.user,
CASE
WHEN A.qty = (
MAX(A.qty) OVER (PARTITION BY A.user)
) THEN A.no
END as max_qty_no
FROM
A
) C ON C.user = B.user AND
C.max_qty_no IS NOT NULL
GROUP BY B.id, B.user;
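A minimal sketch of this MAX plus GROUP BY variant, again assuming SQLite (3.25+) in place of SAP HANA. The third row in B (user C, with no matching rows in A) is hypothetical and added only to show what the LEFT JOIN does for users without data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a ("user" TEXT, "no" INTEGER, qty INTEGER);
    CREATE TABLE b (id INTEGER, "user" TEXT);
    INSERT INTO a VALUES ('A', 10, 20), ('A', 11, 20), ('B', 12, 40), ('B', 13, 10);
    -- (3, 'C') is a hypothetical user with no rows in a
    INSERT INTO b VALUES (1, 'A'), (2, 'B'), (3, 'C');
""")

# The CASE keeps "no" only on rows holding the per-user maximum qty;
# MAX(...) with GROUP BY then collapses ties to the biggest "no".
rows = conn.execute("""
    SELECT b.id, b."user", MAX(c.max_qty_no) AS "no"
    FROM b
    LEFT JOIN (
        SELECT a."user",
               CASE WHEN a.qty = MAX(a.qty) OVER (PARTITION BY a."user")
                    THEN a."no" END AS max_qty_no
        FROM a
    ) c ON c."user" = b."user" AND c.max_qty_no IS NOT NULL
    GROUP BY b.id, b."user"
    ORDER BY b.id
""").fetchall()

print(rows)
```

Unlike an inner join, the LEFT JOIN keeps user C in the result, with NULL (None in Python) for no.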

Related

select value based on max of other column

I have a few questions about a table I'm trying to make in Postgres.
The following table is my input:
id  area  count  function
1   100   20     living
1   200   30     industry
2   400   10     living
2   400   10     industry
2   400   20     education
3   150   1      industry
3   150   1      education
I want to group by id, summing area and count, and get the dominant function based on the max area. When areas are equal, it should be based on the max count; when both area and count are equal, it should be based on function priority (I still have to decide whether education takes priority over industry or vice versa). So the result should be:
id  area  count  function
1   300   50     industry
2   1200  40     education
3   300   2      industry
I tried a lot of things and maybe it's easy, but I don't get it. Can someone help me get the right SQL?
One method uses row_number() and conditional aggregation:
select id, sum(area), sum(count),
max(function) filter (where seqnum = 1) as function
from (select t.*,
row_number() over (partition by id order by area desc) as seqnum
from t
) t
group by id;
Another method uses distinct on:
select distinct on (id) id, sum(area) over (partition by id) as area,
sum(count) over (partition by id) as count,
function
from t
order by id, t.area desc;
Use a scalar sub-query for "function".
select t.id, sum(t.area), sum(t.count),
(
select "function"
from the_table
where id = t.id
order by area desc, count desc, "function" desc
limit 1
) as "function"
from the_table as t
group by t.id order by t.id;
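As a quick check of the scalar sub-query approach, the sketch below runs it against the sample rows from the question. It assumes SQLite through Python's sqlite3 module instead of Postgres (the query uses nothing Postgres-specific), and it picks "function" desc as the final tie-breaker, as the query above does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE the_table (id INTEGER, area INTEGER, "count" INTEGER, "function" TEXT);
    INSERT INTO the_table VALUES
        (1, 100, 20, 'living'),   (1, 200, 30, 'industry'),
        (2, 400, 10, 'living'),   (2, 400, 10, 'industry'), (2, 400, 20, 'education'),
        (3, 150, 1,  'industry'), (3, 150, 1,  'education');
""")

# Sums come from GROUP BY; the dominant function comes from a correlated
# sub-query ordered by area, then count, then function name.
rows = conn.execute("""
    SELECT t.id, SUM(t.area) AS area, SUM(t."count") AS "count",
           (SELECT s."function"
            FROM the_table s
            WHERE s.id = t.id
            ORDER BY s.area DESC, s."count" DESC, s."function" DESC
            LIMIT 1) AS "function"
    FROM the_table t
    GROUP BY t.id
    ORDER BY t.id
""").fetchall()

print(rows)
```

For id 3, area and count tie, so the descending order on the function name puts industry ahead of education.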
You can use sum as a window function:
select distinct on (t.id)
id,
sum(area) over (partition by id) as area,
sum(count) over (partition by id) as count,
( select function from tbl_test where tbl_test.id = t.id order by count desc limit 1 ) as function
from tbl_test t
This is how you get the function for each group based on id:
select yt1.id, yt1.function
from yourtable yt1
left join yourtable yt2
on yt1.id = yt2.id and yt1.area < yt2.area
where yt2.id is null;
(This ensures that no yt2 row exists with the same id but a larger area.)
This works, but several rows may share the maximum area while having different functions. To cope with this issue, let's ensure that exactly one is chosen:
select yt1.id, max(yt1.function) as function
from yourtable yt1
left join yourtable yt2
on yt1.id = yt2.id and yt1.area < yt2.area
where yt2.id is null
group by yt1.id;
Now, let's join this to our main table:
select yourtable.id, sum(yourtable.area), sum(yourtable.count), t.function
from yourtable
join (
select yt1.id, max(yt1.function) as function
from yourtable yt1
left join yourtable yt2
on yt1.id = yt2.id and yt1.area < yt2.area
where yt2.id is null
group by yt1.id
) t
on yourtable.id = t.id
group by yourtable.id, t.function;

DB2 Get latest modified and previous value from audit table

I have an audit table and I am trying to get the current and previous value of a column (rank) along with the audit timestamp information. I would like to get the timestamp at which the value was changed. E.g.:
For id = 1, rank was last changed from 3 to 5 on 13-05-2021 14:10 by userid = 2.
I have written the query below. It gives the current and previous modified values, but it returns the latest date and userid (17-05-2021 20:00 and 2), because row_number is ordered by timestamp.
with v_rank as (
select * from (
select
id,
a.rank as current_rank,
b.rank as previous_rank,
a.log_timestamp,
a.log_username,
row_number() over(partition by a.id order by a.log_timestamp) as rnum
from
user a
inner join user b on a.id = b.id and a.log_timestamp > b.timestamp
where
a.rank != b.rank
order by a.log_timestamp, b.timestamp
) where rnum = 1
)
select * from v_rank
Any suggestions on how I can get the correct timestamp (13-05-2021 14:10) and userid (2)?
Edit:
Rank can also be null; in that case I need the query result to show a blank.
Expected output:
You seem to want lag() with filtering:
select u.*
from (select u.*,
lag(rank) over (partition by id order by log_timestamp) as prev_rank
from user u
) u
where rank <> prev_rank;
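The sketch below illustrates the lag() filter. The original audit data is not shown in the question, so the table name user_audit and its three rows are hypothetical, chosen to match the description (rank changes from 3 to 5 on 13-05-2021 14:10, followed by a later row on 17-05-2021 20:00 with the same rank). SQLite (3.25+) stands in for DB2, and timestamps are stored as ISO strings so that they sort correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical audit rows; the real table is not shown in the question.
    CREATE TABLE user_audit (id INTEGER, "rank" INTEGER,
                             log_timestamp TEXT, log_username INTEGER);
    INSERT INTO user_audit VALUES
        (1, 3, '2021-05-10 09:00', 1),
        (1, 5, '2021-05-13 14:10', 2),
        (1, 5, '2021-05-17 20:00', 2);
""")

# LAG fetches the previous rank per id; the outer filter keeps only the
# rows where the rank actually changed.
rows = conn.execute("""
    SELECT id, prev_rank, "rank", log_timestamp, log_username
    FROM (
        SELECT u.*,
               LAG("rank") OVER (PARTITION BY id ORDER BY log_timestamp) AS prev_rank
        FROM user_audit u
    ) u
    WHERE "rank" <> prev_rank
""").fetchall()

print(rows)
```

The first row per id has a NULL prev_rank, so the <> comparison drops it; the same happens for any change to or from a NULL rank, which would need a NULL-aware comparison if those changes should be reported.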

Take precedence on a specific value from a table

For each person's distinct record that has a toyota,
only take toyota and filter out that person's other cars
else bring all cars.
The actual script will not match my logic above. I was trying to simplify my question by using random names and car brands, but the objective was the same since I wanted to get a specific address code and filter out the rest if it did exist for other vendor names (see below). Thank you, GMB.
GPMEM.dbo.PM00200 a -- Vendor Master
LEFT JOIN GPMEM.dbo.PM30200 b -- Historical/Paid Transactions
ON a.VENDORID = b.VENDORID
LEFT JOIN GPMEM.dbo.PM20000 c -- Open/Posted Transactions
ON a.VENDORID = c.VENDORID
LEFT JOIN (
SELECT d.*,
rank() over(
partition by d.VENDORID
order by case when d.ADRSCODE = 'ACH' THEN 0 ELSE 1 END
)rn
FROM GPMEM.dbo.PM00300 d
) d -- Vendor Address Master
ON a.VENDORID = d.VENDORID
WHERE
d.rn = 1
You can use window functions:
select colA, colB
from (
select
t.*,
rank() over(
partition by colA
order by case when colB = 'Toyota' then 0 else 1 end
) rn
from mytable t
) t
where rn = 1
The trick lies in the order by clause inside the over() clause of the window function rank(): if a person has a Toyota, it is ranked first and their other cars (if any) are ranked second. If they have no Toyota, all their cars are ranked first.
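A small sketch of the rank() trick, assuming SQLite (3.25+) via Python's sqlite3 module. The names and cars in mytable are made up for illustration: Ann owns a Toyota and a Honda, Bob owns no Toyota.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data
    CREATE TABLE mytable (colA TEXT, colB TEXT);
    INSERT INTO mytable VALUES
        ('Ann', 'Toyota'), ('Ann', 'Honda'),
        ('Bob', 'Ford'),   ('Bob', 'Honda');
""")

# Toyota rows sort first (0); everything else ties at 1, so rank() gives
# every car rank 1 only when the person has no Toyota at all.
rows = conn.execute("""
    SELECT colA, colB
    FROM (
        SELECT t.*,
               RANK() OVER (
                   PARTITION BY colA
                   ORDER BY CASE WHEN colB = 'Toyota' THEN 0 ELSE 1 END
               ) AS rn
        FROM mytable t
    ) ranked
    WHERE rn = 1
    ORDER BY colA, colB
""").fetchall()

print(rows)
```

rank() matters here: row_number() would break the tie arbitrarily and keep only one of Bob's cars.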
You can do this with filtering logic:
select t.*
from t
where t.colb = 'toyota' or
not exists (select 1 from t t2 where t2.cola = t.cola and t2.colb = 'toyota');
If I were to use window functions for this, I would simply count the toyotas:
select t.*
from (select t.*,
sum(case when colb = 'toyota' then 1 else 0 end) over (partition by cola) as num_toyotas
from t
) t
where colb = 'toyota' or num_toyotas = 0;
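The two queries above should return the same rows. The sketch below checks that on made-up data, assuming SQLite (3.25+) via Python's sqlite3 module; the sample rows are hypothetical and use lowercase 'toyota' to match the string literal in the queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data
    CREATE TABLE t (cola TEXT, colb TEXT);
    INSERT INTO t VALUES
        ('Ann', 'toyota'), ('Ann', 'honda'),
        ('Bob', 'ford'),   ('Bob', 'honda');
""")

# Variant 1: keep toyotas, or any car of a person without a toyota.
not_exists = conn.execute("""
    SELECT t.cola, t.colb
    FROM t
    WHERE t.colb = 'toyota'
       OR NOT EXISTS (SELECT 1 FROM t t2
                      WHERE t2.cola = t.cola AND t2.colb = 'toyota')
    ORDER BY t.cola, t.colb
""").fetchall()

# Variant 2: count toyotas per person with a window function.
windowed = conn.execute("""
    SELECT cola, colb
    FROM (
        SELECT t.*,
               SUM(CASE WHEN colb = 'toyota' THEN 1 ELSE 0 END)
                   OVER (PARTITION BY cola) AS num_toyotas
        FROM t
    ) counted
    WHERE colb = 'toyota' OR num_toyotas = 0
    ORDER BY cola, colb
""").fetchall()

print(not_exists)
print(windowed)
```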

Selecting a column where another column is maximal

I guess this is a standard problem. But I could not find a proper solution yet.
I have three columns in table A:
ID ID_Version Var
1 1 A
1 2 A
1 3 X
1 4 D
2 1 B
2 2 Z
2 3 D
3 1 A
4 1 B
4 2 Q
4 3 Z
For every unique ID, I would like to isolate the Var-value that belongs to the maximal ID-Version.
For ID = 1 this would be D, for ID = 2 this would be D, for ID = 3 this would be A and for ID = 4 this would be Z.
I tried to use a group by statement but I cannot select Var-values when using the max-function on ID-Version and grouping by ID.
Does anyone have a clue how to write fast, effective code for this simple problem?
Use the row_number() analytic function:
select ID,Var from
(
select row_number() over (partition by id order by id_version desc) as rn,
t.*
from tab t
)
where rn = 1
Or use max(var) keep (dense_rank ...):
select id, max(var) keep (dense_rank first order by id_version desc) as var
from tab
group by id
You could use ranking function:
SELECT *
FROM (SELECT tab.*, ROW_NUMBER() OVER(PARTITION BY ID ORDER BY ID_Version DESC) rn
FROM tab)
WHERE rn = 1
Oracle has the keep syntax, so you can also use aggregation:
select id, max(id_version) as id_version,
max(var) keep (dense_rank first order by id_version desc) as var
from a
group by id;
You could also use a simple join to do what you want:
SELECT A.id, A.var FROM A
JOIN
(SELECT id, MAX(id_version) as id_version
FROM A
GROUP BY id) temp ON (temp.id = A.id AND temp.id_version = A.id_version)
Or you could use a correlated subquery like this:
SELECT a1.id, a1.var FROM A a1
WHERE a1.id_version = (SELECT MAX(id_version) FROM A a2 WHERE a2.id = a1.id)
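To compare the approaches on the data from the question, here is a minimal sketch that runs the ROW_NUMBER version and the correlated-subquery version side by side. It assumes SQLite (3.25+) via Python's sqlite3 module rather than Oracle, so the Oracle-only KEEP (DENSE_RANK ...) syntax is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER, ID_Version INTEGER, Var TEXT);
    INSERT INTO A VALUES
        (1, 1, 'A'), (1, 2, 'A'), (1, 3, 'X'), (1, 4, 'D'),
        (2, 1, 'B'), (2, 2, 'Z'), (2, 3, 'D'),
        (3, 1, 'A'),
        (4, 1, 'B'), (4, 2, 'Q'), (4, 3, 'Z');
""")

# Approach 1: number the rows per ID, highest ID_Version first.
by_row_number = conn.execute("""
    SELECT ID, Var
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID_Version DESC) AS rn
        FROM A t
    ) ranked
    WHERE rn = 1
    ORDER BY ID
""").fetchall()

# Approach 2: keep rows whose ID_Version equals the per-ID maximum.
by_subquery = conn.execute("""
    SELECT a1.ID, a1.Var
    FROM A a1
    WHERE a1.ID_Version = (SELECT MAX(a2.ID_Version) FROM A a2 WHERE a2.ID = a1.ID)
    ORDER BY a1.ID
""").fetchall()

print(by_row_number)
print(by_subquery)
```

Both return D, D, A and Z for IDs 1 to 4, matching the values the question expects. The two can differ only if an ID has duplicate maximal versions: the subquery then returns every tied row, while ROW_NUMBER keeps one.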

Join two queries from the same table - SELECT DISTINCT?

I have two tables linked by an AUTO_KEY field. From one table I retrieve the number (id); from the other I get several statuses per number (id), and each status has a date associated with it.
I need to restrict the results to the maximum/latest date for each number (id), together with the corresponding status.
SELECT
OPERATION.NUMBER,
STATUS.STATUS,
Max(STATUS.DATE)
FROM
STATUS,
OPERATION
WHERE
OPERATION.AUTO_KEY = STATUS.AUTO_KEY
From this data:
Number Status Date
-----------------------------
1 A 10/20/13
1 B 10/15/13
2 A 10/10/13
2 AX 10/05/13
2 AD 10/03/13
3 DD 10/03/13
The outcome should be
Number Status Date
-----------------------------
1 A 10/20/13
2 A 10/10/13
3 DD 10/03/13
Thanks in advance
You can use a CTE with the ROW_NUMBER() function. Also, please use an explicit table JOIN instead of FROM STATUS, OPERATION:
;With CTE AS (
SELECT O.NUMBER, S.STATUS, S.DATE,
ROW_NUMBER() OVER (PARTITION BY O.NUMBER ORDER BY S.DATE DESC) RN
FROM STATUS S JOIN OPERATION O
ON O.AUTO_KEY = S.AUTO_KEY
)
SELECT NUMBER, STATUS, DATE
FROM CTE
WHERE RN = 1
ORDER BY NUMBER
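The sketch below runs a ROW_NUMBER() query of this shape on the sample data, ranking statuses within each NUMBER via PARTITION BY so that every number keeps its own latest row. It assumes SQLite (3.25+) via Python's sqlite3 module; the AUTO_KEY values 101 to 103 are hypothetical, and the dates are rewritten as ISO strings (an assumption) so that they sort chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- AUTO_KEY values are hypothetical; dates are stored as ISO text.
    CREATE TABLE "OPERATION" (AUTO_KEY INTEGER, "NUMBER" INTEGER);
    CREATE TABLE "STATUS" (AUTO_KEY INTEGER, "STATUS" TEXT, "DATE" TEXT);
    INSERT INTO "OPERATION" VALUES (101, 1), (102, 2), (103, 3);
    INSERT INTO "STATUS" VALUES
        (101, 'A',  '2013-10-20'), (101, 'B',  '2013-10-15'),
        (102, 'A',  '2013-10-10'), (102, 'AX', '2013-10-05'),
        (102, 'AD', '2013-10-03'), (103, 'DD', '2013-10-03');
""")

# Number the statuses per operation number, newest first, and keep row 1.
rows = conn.execute("""
    WITH cte AS (
        SELECT o."NUMBER" AS num, s."STATUS" AS status, s."DATE" AS dt,
               ROW_NUMBER() OVER (
                   PARTITION BY o."NUMBER"
                   ORDER BY s."DATE" DESC
               ) AS rn
        FROM "STATUS" s
        JOIN "OPERATION" o ON o.AUTO_KEY = s.AUTO_KEY
    )
    SELECT num, status, dt
    FROM cte
    WHERE rn = 1
    ORDER BY num
""").fetchall()

print(rows)
```

Without PARTITION BY the numbering would run over the whole result set, and RN = 1 would keep only the single newest row overall.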
Alternatively, compare each date with the maximum date for the same number, using a correlated subquery:
SELECT OPERATION.CNUMBER,
STATUS.STATUS,
STATUS.CDATE
FROM STATUS,
OPERATION
WHERE OPERATION.AUTO_KEY = STATUS.AUTO_KEY
AND STATUS.CDATE = (
SELECT MAX(S2.CDATE)
FROM STATUS S2,
OPERATION O2
WHERE O2.AUTO_KEY = S2.AUTO_KEY
AND O2.CNUMBER = OPERATION.CNUMBER )