SQL window functions to find the last row of a joined table - sql

I have tables as follows in a MariaDB 10.3 server.
person (id, name)
donation (id, person_id, amount, date_given, campaign)
With a one-to-many relationship from person to donation. Imagine 'campaign' describes what the donation was for, but let's just use campaign names A and B. There are only these two.
I want a query that returns a row for each person, with columns being all the columns from their most recent donation to A and their most recent donation to B.
e.g. (this code is wrong, but may communicate better than my words!)
WITH lastDonationA AS (
SELECT *
FROM donation
WHERE campaign = 'A' AND /* IS LAST ROW */
GROUP BY person_id
)
WITH lastDonationB AS (
SELECT *
FROM donation
WHERE campaign = 'B' AND /* IS LAST ROW */
GROUP BY person_id
)
SELECT person.name, lastDonationA.*, lastDonationB.*
FROM person
LEFT JOIN lastDonationA ON lastDonationA.person_id = person.id
LEFT JOIN lastDonationB ON lastDonationB.person_id = person.id
;
I have a hunch that a window function is going to be good for this, but I can't quite fathom it!

I think I was confused by the FIRST_VALUE window function that sounds perfect but isn't for this use case.
WITH orderedDonationsForA AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY person_id ORDER BY date_given DESC) rn
FROM donation
WHERE campaign = 'A'
),
orderedDonationsForB AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY person_id ORDER BY date_given DESC) rn
FROM donation
WHERE campaign = 'B'
)
SELECT person.*, lastAdonation.*, lastBdonation.*
FROM person
LEFT JOIN orderedDonationsForA lastAdonation
ON lastAdonation.person_id = person.id AND lastAdonation.rn = 1
LEFT JOIN orderedDonationsForB lastBdonation
ON lastBdonation.person_id = person.id AND lastBdonation.rn = 1
This works by numbering each person's donations from most recent to oldest; the ON clause of each join then keeps only the row numbered 1.
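For anyone who wants to try the idea without a MariaDB server, here is a minimal sketch that runs against an in-memory SQLite database from Python. SQLite (3.25 or later) is only a stand-in, and the sample rows and the output column aliases are invented for illustration. It is also a small variation on the query above: it partitions by campaign as well as person, so a single CTE can serve both joins.

```python
import sqlite3

# In-memory SQLite stands in for MariaDB (assumption); window functions need SQLite 3.25+.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE donation (id INTEGER PRIMARY KEY, person_id INTEGER, amount INTEGER,
                       date_given TEXT, campaign TEXT);
INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
-- Illustrative data only: Ann gave to A twice and to B once, Bob only to B, Cy never.
INSERT INTO donation VALUES
  (1, 1, 10, '2020-01-01', 'A'),
  (2, 1, 20, '2020-06-01', 'A'),
  (3, 1, 5,  '2020-03-01', 'B'),
  (4, 2, 7,  '2020-02-01', 'B');
""")

rows = conn.execute("""
WITH ordered AS (
  -- Number each person's donations per campaign, newest first.
  SELECT *, ROW_NUMBER() OVER (
           PARTITION BY person_id, campaign ORDER BY date_given DESC) AS rn
  FROM donation
)
SELECT p.name, a.amount AS last_a_amount, b.amount AS last_b_amount
FROM person p
LEFT JOIN ordered a ON a.person_id = p.id AND a.campaign = 'A' AND a.rn = 1
LEFT JOIN ordered b ON b.person_id = p.id AND b.campaign = 'B' AND b.rn = 1
ORDER BY p.id
""").fetchall()

for row in rows:
    print(row)
```

Because the rn = 1 condition sits in the ON clause rather than in a WHERE clause, people with no donation to a campaign still get a row, with NULLs in that campaign's columns.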

Related

How to select multiple max values from a sql table

I am trying to get the top performers from a table, grouped by the company but can't seem to get the grouping right.
I have tried to use subqueries but this goes beyond my knowledge
I am trying to make a query that selects the rows in green. In other words I want to include the name, the company, and what they paid but only the top performers of each company.
Here is the raw data
create table test (person varchar(50),company varchar(50),paid numeric);
insert into
test
values
('bob','a',200),
('jane','a',100),
('mark','a',350),
('susan','b',650),
('thabo','b',100),
('thembi','b',210),
('lucas','b',110),
('oscar','c',10),
('janet','c',20),
('nancy','c',30)
You can use MAX() in a subquery as
CREATE TABLE T(
Person VARCHAR(45),
Company CHAR(1),
Paid INT
);
INSERT INTO T
VALUES ('Person1', 'A', 10),
('Person2', 'A', 20),
('Person3', 'B', 10);
SELECT T.*
FROM T INNER JOIN
(
SELECT Company, MAX(Paid) Paid
FROM T
GROUP BY Company
) TT ON T.Company = TT.Company AND T.Paid = TT.Paid;
Or using a window function as
SELECT Person,
Company,
Paid
FROM
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY Company ORDER BY Paid DESC) RN
FROM T
) TT
WHERE RN = 1;
Here's your query.
select a.person, a.company, a.paid from tableA a
inner join
(select person, company, row_number() over (partition by company order by paid desc) as rn from tableA) as t1
on t1.person = a.person and t1.company = a.company
where t1.rn = 1
Maybe something like
WITH ranked AS (SELECT person, company, paid
, rank() OVER (PARTITION BY company ORDER BY paid DESC) AS rnk
FROM yourtable)
SELECT person, company, paid
FROM ranked
WHERE rnk = 1
ORDER BY company;
You can use the rank() function with a PARTITION BY clause.
DENSE_RANK also gives you the ranking within your ordered partition, but its ranks are consecutive: no ranks are skipped when several rows share a rank.
WITH cte AS (
SELECT person, company, paid,
rank() OVER (PARTITION BY company ORDER BY paid desc) rn
FROM yourtable
)
SELECT
*
FROM cte
WHERE rn = 1;
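The difference between the three numbering functions only shows up when there is a tie, so here is a rough sketch of that case. It uses SQLite from Python purely so it can be run anywhere; the three rows are hypothetical (two people tied for top pay in one company), not the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite 3.25+ assumed for window functions
conn.executescript("""
CREATE TABLE test (person TEXT, company TEXT, paid INTEGER);
-- Hypothetical rows with a tie for first place in company 'a'.
INSERT INTO test VALUES ('bob', 'a', 350), ('mark', 'a', 350), ('jane', 'a', 100);
""")

ranked = conn.execute("""
SELECT person,
       ROW_NUMBER() OVER (PARTITION BY company ORDER BY paid DESC, person) AS row_num,
       RANK()       OVER (PARTITION BY company ORDER BY paid DESC)         AS rnk,
       DENSE_RANK() OVER (PARTITION BY company ORDER BY paid DESC)         AS dense_rnk
FROM test
ORDER BY row_num
""").fetchall()

for row in ranked:
    print(row)
```

With these rows, filtering on rnk = 1 or dense_rnk = 1 keeps both tied top performers, while row_num = 1 keeps exactly one of them (here decided by the extra ORDER BY on person).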

Insert into another table after fetching latest date and performing an inner join

I have a table called "Member_Details" which has multiple records for each member_ID. For Example,
I have another table called "BMI_Data" that looks like the following.
The goal is to fetch the names of those members whose "BMI" in "Member_Details" is less than the "target_BMI" in "BMI_Data" table and insert it into a new table called "results" with "Member_ID, First_Name and BMI" as its schema.
Also, one consideration is to fetch the latest data available in the "Member_Details" for each member (based on date) and then do the comparison
The result for the above scenario would be something like this.
I tried using the following query
INSERT INTO results_table (Member_ID, First_Name, BMI)
select c.Member_ID, First_Name, BMI
from
(SELECT *, ROW_NUMBER() OVER (PARTITION BY Member_ID ORDER BY Date desc)
AS ROWNUM FROM Member_Details) x
JOIN
BMI_Data c ON x.Member_ID = c.Member_ID
where
x.BMI < c.Target_BMI
The above query doesn't fetch the latest date and simply loads all records in which member BMI is less than target_BMI.
Please help !
An alternate query might be
INSERT INTO results_table (Member_ID, First_Name, BMI)
select md2.member_ID, md2.First_Name, md2.BMI
from BMI_Data bd
inner join (
select distinct md.member_ID, md.First_Name,
(select top 1 BMI from Member_Details where member_ID = md.member_ID order by Date desc) BMI
from Member_Details md
) md2 on md2.member_ID = bd.member_ID
where md2.BMI < bd.Target_BMI
You number the rows but never filter on that number, so every row is kept. Add a condition on ROWNUM:
INSERT INTO results_table (Member_ID, First_Name, BMI)
select c.Member_ID, First_Name, BMI
from (SELECT *,
ROW_NUMBER() OVER (PARTITION BY Member_ID ORDER BY Date desc) AS ROWNUM
FROM Member_Details
) x JOIN BMI_Data c
ON x.Member_ID = c.Member_ID
where x.ROWNUM = 1 and x.BMI < c.Target_BMI;
Wanted to note: there is no such date as '31-April-2018'! You might have meant '1-May-2018'.
In any case, when you order by Date, make sure you first convert it to the DATE data type; if it is stored as a string, it is sorted as text and the ordering is not chronological. The query below does that conversion and, in addition, shows an alternative approach using ARRAY_AGG() with ORDER BY and LIMIT 1.
#standardSQL
INSERT INTO results_table (Member_ID, First_Name, BMI)
SELECT * EXCEPT(Target_BMI)
FROM (
SELECT Member_ID, First_Name,
ARRAY_AGG(BMI ORDER BY PARSE_DATE('%d-%B-%Y', Date) DESC LIMIT 1)[OFFSET(0)] BMI
FROM `project.dataset.member_details`
GROUP BY Member_ID, First_Name
) d
JOIN `project.dataset.bmi_data` t
USING(Member_ID)
WHERE BMI < Target_BMI
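To see why the conversion matters, here is a small sketch in plain Python (not BigQuery) that compares text ordering with date ordering. The format string is the one used in the query above; the three sample values are made up.

```python
from datetime import datetime

# Hypothetical date strings in the same '%d-%B-%Y' layout as the Date column.
dates = ["30-April-2018", "01-May-2018", "15-January-2019"]

# Compared as text, the "latest" value is simply the one that sorts last alphabetically.
latest_as_text = max(dates)

# Parsed into real dates first, the comparison is chronological.
latest_as_date = max(dates, key=lambda s: datetime.strptime(s, "%d-%B-%Y"))

print(latest_as_text)
print(latest_as_date)
```

The text comparison picks the string starting with the largest day number, which is not the most recent date.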

Select MAX Value for Each ROW - Oracle Sql

I have one doubt.
I need to find what is the latest occurrence for a specific list of Customers, let's say to simplify, I need it for 3 Customers out of 100.
I need to check when it was the last time each of them got a bonus.
The table would be:
EVENT_TBL
Fields: Account ID, EVENT_DATE, BONUS ID, ....
Can you suggest a way to grab the latest (MAX) EVENT DATE (that means one row each)
I'm using SELECT...IN to specify the Account ID but not sure how to use MAX, Group BY etc etc (if ever needed).
Use the ROW_NUMBER() analytic function:
SELECT *
FROM (
SELECT t.*,
ROW_NUMBER() OVER ( PARTITION BY Account_id ORDER BY event_date DESC ) AS rn
FROM EVENT_TBL t
WHERE Account_ID IN ( 123, 456, 789 )
)
WHERE rn = 1
You can try
with AccountID_Max_EVENT_DATE as (
select AccountID, max(EVENT_DATE) MAX_D
from EVENT_TBL
group by AccountID
)
SELECT E.*
FROM EVENT_TBL E
INNER JOIN AccountID_Max_EVENT_DATE M
ON (E.AccountID = M.AccountID AND M.MAX_D = E.EVENT_DATE)
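One difference between the two answers is worth checking against your data: if an account has two events on the same latest date, the MAX() join returns both rows, while ROW_NUMBER() returns exactly one. A rough sketch of that case, using SQLite from Python as a stand-in for Oracle, with invented rows and a simplified, assumed column list (account_id, event_date, bonus_id):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite 3.25+ assumed for window functions
conn.executescript("""
CREATE TABLE event_tbl (account_id INTEGER, event_date TEXT, bonus_id INTEGER);
INSERT INTO event_tbl VALUES
  (123, '2022-01-01', 1),
  (123, '2022-03-01', 2),
  (123, '2022-03-01', 3),   -- same latest date as bonus 2
  (456, '2022-02-01', 4);
""")

# Approach 1: aggregate to the latest date per account, then join back.
join_back = conn.execute("""
SELECT e.account_id, e.bonus_id
FROM event_tbl e
JOIN (SELECT account_id, MAX(event_date) AS max_d
      FROM event_tbl GROUP BY account_id) m
  ON e.account_id = m.account_id AND e.event_date = m.max_d
ORDER BY e.account_id, e.bonus_id
""").fetchall()

# Approach 2: number the rows per account; bonus_id is an assumed tie-breaker.
row_number = conn.execute("""
SELECT account_id, bonus_id
FROM (SELECT *, ROW_NUMBER() OVER (
               PARTITION BY account_id
               ORDER BY event_date DESC, bonus_id DESC) AS rn
      FROM event_tbl)
WHERE rn = 1
ORDER BY account_id
""").fetchall()

print(join_back)
print(row_number)
```

If one row per account is a hard requirement, ROW_NUMBER() with an explicit tie-breaker in the ORDER BY is the safer choice.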

ORACLE SQL Return only duplicated values (not the original)

I have a database with the following info
Customer_id, plan_id, plan_start_dte,
Since some customer switch plans, there are customers with several duplicated customer_ids, but with different plan_start_dte. I'm trying to count how many times a day members switch to the premium plan from any other plan ( plan_id = 'premium').
That is, I'm trying to do roughly this: return all rows with duplicate customer_id, except for the original plan (min(plan_start_dte)), where plan_id = 'premium', and group them by plan_start_dte.
I'm able to get all duplicate records with their count:
with plan_counts as (
select c.*, count(*) over (partition by CUSTOMER_ID) ct
from CUSTOMERS c
)
select *
from plan_counts
where ct > 1
The other steps have me stuck. First I tried to select everything except the original plan:
SELECT CUSTOMERS c
where START_DTE not in (
select min(PLAN_START_DTE)
from CUSTOMERS i
where c.CUSTOMER_ID = i.CUSTOMER_ID
)
But this failed. If I can solve this I believe all I have to add is an additional condition where c.PLAN_ID = 'premium' and then group by date and do a count. Anyone have any ideas?
I think you want lag():
select c.*
from (select c.*,
lag(plan_id) over (partition by customer_id order by plan_start_dte) as prev_plan_id
from customers c
) c
where prev_plan_id <> 'premium' and plan_id = 'premium';
I'm not sure what output you want. For the number of times this occurs per day:
select plan_start_dte, count(*)
from (select c.*, lag(plan_id) over (partition by customer_id order by plan_start_dte) as prev_plan_id
from customers c
) c
where prev_plan_id <> 'premium' and plan_id = 'premium'
group by plan_start_dte
order by plan_start_dte;

need query syntax to fetch latest by PKID after joining multiple tables

I have 3 tables:
select * from company
select * from emp_profile
select * from emp_salary_upgrade_tracker
table 1, company_pkid, company_code, company_name
table 2, emp_profile_pkid,company_fk_id, emp_number, emp_name, salary
table 3, salary_pk_id,emp_profile_fk_id,emp_number, old_salary, current salary
whenever an employee's salary is changed, its tracked in emp_salary_upgrade_tracker.
I need to write a query to fetch
company_code, emp_number, emp_name, old_salary, current salary
Here the result should have the latest entry ,ie latest salary change from emp_salary_upgrade_tracker.
After joining the tables, i need to fetch the latest from emp_salary_upgrade_tracker(order by pkid may be).
But am clueless of the query syntax. Please help
Use a subquery to get the latest ids:
select c.company_code,
e.emp_number,
e.emp_name,
t.old_salary,
t.current_salary
from(select emp_profile_fk_id, max(salary_pk_id) as salary_pk_id
from emp_salary_upgrade_tracker group by emp_profile_fk_id) m
join emp_salary_upgrade_tracker t on m.salary_pk_id = t.salary_pk_id
join emp_profile e on t.emp_profile_fk_id = e.emp_profile_pkid
join company c on e.company_fk_id = c.company_pkid
Here's a way using row_number to select the row containing the latest value in the emp_salary_upgrade_tracker table when ordered by its primary key.
select company_code, emp_number, emp_name, old_salary, current_salary
from (
select c.company_code, esut.emp_number, ep.emp_name, esut.old_salary, esut.current_salary,
row_number() over (partition by esut.emp_number order by esut.salary_pk_id desc) rn
from emp_salary_upgrade_tracker esut
join emp_profile ep on ep.emp_profile_pkid = esut.emp_profile_fk_id
join company c on c.company_pkid = ep.company_fk_id
) t where rn = 1