Select second most recent date from inner join - sql

I have this query :
SELECT
companies.display_name, companies.pay_schedule_id,
pay_schedule_periods.schedule_id,
pay_schedule_periods.created_at
FROM
companies
INNER JOIN
pay_schedule_periods ON pay_schedule_id = pay_schedule_periods.schedule_id
ORDER BY
companies.display_name, pay_schedule_periods.created_at DESC;
I get this result :
How can I select only the second most recent created_at date from each unique display_name ?

You could use row_number to assign a sequence to your dates and apply this before joining, then include as part of your join criteria, such as:
select c.display_name, c.pay_schedule_id, psp.schedule_id, psp.created_at
from companies c
join (
select pay_schedule_id, created_at,
Row_Number() over(partition by pay_schedule_id order by created_at desc) rn
from pay_schedule_periods
)psp on psp.schedule_id = c.pay_schedule_id and rn = 2
order by c.display_name, psp.created_at desc;
You could also apply this using a lateral join which would simplify further.

Related

Subquery instead of distinct and order by in postgres

I have two tables: Company, Event.
Company
{
id,
name
}
Event
{
id,
id_company,
date,
name
}
It's one to many relation.
I want to have results of companies (every company only once) with the latest event.
I'm using postgres. I have this query:
select distinct on (COM.id)
COM.id,
COM.name,
EVT.date,
EVT.name
FROM Company COM
LEFT JOIN Event EVT on EVT.id_company = COM.id
ORDER BY COM.id, EVT.date DESC
It looks good but I'm wondering if I could have the same result using subqueries or something else instead of distinct and order by date of the event.
You can use row_number and then select row number 1, like below:
select COM.id,
COM.name,
EVT.date,
EVT.name
FROM Company COM
LEFT JOIN (select EVT.*, row_number() over(partition by id_company order by EVT.date DESC) rnk from Event EVT ) EVT on EVT.id_company = COM.id and rnk=1
You could use rank() to achieve your results using a subquery or CTE such as this one.
with event_rank as (
select id_company, date, name,
rank() over (partition by id_company order by date desc) as e_rank
from event
)
select c.id, c.name, er.date, er.name
from company c
left join event_rank er
on c.id = er.id_company
where er.e_rank = 1
or er.e_rank is null --to get the companies who don't have an event
View on DB Fiddle

JOIN 2 tables ORDER BY SUM value

I have 2 tables: 1st is comment, 2nd is rating
SELECT * FROM comment_table a
INNER JOIN (SELECT comment_id, SUM(rating_value) AS total_rating FROM rating_table GROUP BY comment_id) b
ON a.comment_id = b.comment_id
ORDER BY b.total_rating DESC
I tried the above SQL but doesn't work!
Object is to display a list of comments order by rating points of each comments.
SELECT s.* FROM (
SELECT * FROM comment_table a
INNER JOIN (SELECT comment_id, SUM(rating_value) AS total_rating FROM rating_table GROUP BY comment_id) b
ON a.comment_id = b.comment_id
) AS s
ORDER BY s.total_rating DESC
Nest it inside an another select. It will then output the data in the correct order.

Average of top 2

I would like to get the average of the top2 limit1 per policyid. I need my resulting table to also have objectid.
Limit1 and objectid come from the table p_coverage.
Policyid comes from the table p_risk.
The table p_item is a linking table between p_risk and p_coverage.
The way I thought I should build my query is: create a ranking of limit1 within each policyid. Then take the avg top2.
However the ranking doesn't work and give wrong result. My query works if I take columns from ONE table, but as soon as I add joins between them it gives false ranking.
SELECT policyid, limit1, /*pcob,*/ RANK() OVER(PARTITION BY policyid ORDER BY limit1 DESC) AS rn
FROM (SELECT policyid, limit1/*, pc.objectid ASpcob*/
FROM p_risk pr
LEFT JOIN p_item
ON pr.objectid=p_item.riskobjectid
LEFT JOIN p_coverage pc
ON p_item.objectid=pc.insuranceitemid) AS s
) AS SubQueryAlias
GROUP BY
policyid, limit1/*, pcob*/, rn
ORDER BY rn,policyid,limit1 DESC
The table at the end of the picture is what I'd like to have. The first table is the result of the query of Golden Linoff
If I understand correctly, you want the ROW_NUMBER() in the subquery and then to aggregate and filter in the outer query:
SELECT policyid, AVG(limit1) as avg_top2_limit1
FROM (SELECT policyid, limit1,
DENSE_RANK() OVER (PARTITION BY policyid ORDER BY limit1 DESC) as seqnum
FROM p_risk pr LEFT JOIN
p_item i
ON pr.objectid = i.riskobjectid LEFT JOIN
p_coverage pc
ON i.objectid = pc.insuranceitemid) AS s
) p
WHERE seqnum <= 2
GROUP BY policyid
thanks to previous comment! I succeed to do what I wanted. There is the query
select b.policyid, avg(b.limit1) as avg_top2_limit1 from(
SELECT distinct(policyid) policyid, limit1
FROM (SELECT policyid, limit1,
Dense_rank() OVER (PARTITION BY policyid ORDER BY limit1 DESC) as
seqnum
FROM p_risk pr LEFT JOIN
p_item i
ON pr.objectid = i.riskobjectid LEFT JOIN
p_coverage pc
ON i.objectid = pc.insuranceitemid) AS s
WHERE seqnum <= 2 ) as b
GROUP BY policyid`

Get Min date as condition

I have a table that contains invoices for all phone numbers, and each number has several invoices, I want to display only the first invoice for precise number but i don't really know how get only first invoice , this is my query
SELECT
b.contrno
a.AR_INVDATE
FROM P_STG_TABS.IVM_INVOICE_RECORD a
INNER JOIN P_EDW_TMP.invoice b
ON b.contrno=a.contrno
WHERE a.AR_INVDATE< (SELECT AR_INVDATE FROM P_STG_TABS.IVM_INVOICE_RECORD WHERE contrno=b.contrno )
Teradata supports a QUALIFY clause to filter the result of an OLAP-function (similar to HAVING after GROUP BY), which greatly simplifies Tim Biegeleisens's answer:
SELECT *
FROM P_STG_TABS.IVM_INVOICE_RECORD a
INNER JOIN P_EDW_TMP.invoice b
ON b.contrno = a.contrno
QUALIFY
ROW_NUMBER()
OVER (PARTITION BY b.contrno
ORDER BY a.AR_INVDATE) = 1
Additionally you can apply the ROW_NUMBER before the join (might be more efficient depending on additional conditions):
SELECT *
FROM
( SELECT *
FROM P_STG_TABS.IVM_INVOICE_RECORD a
QUALIFY
ROW_NUMBER()
OVER (PARTITION BY b.contrno
ORDER BY a.AR_INVDATE) = 1
) AS a
INNER JOIN P_EDW_TMP.invoice b
ON b.contrno = a.contrno
Use ROW_NUMBER():
SELECT
t.contrno,
t.AR_INVDATE
FROM
(
SELECT
b.contrno,
a.AR_INVDATE,
ROW_NUMBER() OVER (PARTITION BY b.contrno ORDER BY a.AR_INVDATE) rn
FROM P_STG_TABS.IVM_INVOICE_RECORD a
INNER JOIN P_EDW_TMP.invoice b
ON b.contrno = a.contrno
) t
WHERE t.rn = 1;
If you are worried about ties, and you want to display all ties, then you can replace ROW_NUMBER with either RANK or DENSE_RANK.
If I correctly understand, then one way is to use group by with min(a.AR_INVDATE):
SELECT
b.contrno,
min(a.AR_INVDATE)
FROM P_STG_TABS.IVM_INVOICE_RECORD a
INNER JOIN P_EDW_TMP.invoice b
ON b.contrno=a.contrno
group by b.contrno

How can I fix my GROUP BY clause

I have 2 tables as seen below:
Now the question is :
How can I have a view which shows the details of the last Owner? in other words I need the details of person who has MAX(StartDate) in tbl_Owners table?
I want to find the latest owner of each apartment.
I tried different approaches but I couldn't find the way to do that.
I know I need to get the personID in a Group By clause which groups records by AppID but I can't do that
Thank you
Try this
select t1.* from tbl_persons as t1 inner join
(
select t1.* from tbl_owners as t1 inner join
(
select appid,max(startdate) as startdate from tbl_owners group by appid
) as t2
on t1.appid=t2.appid and t1.startdate=t2.startdate
) as t2
on t1.personid=t2.personid
Add this to your query:
JOIN (select AppId, MAX(StartDate) as MAxStartDate
from dbo.tbl_Owners
group by PersonId) o2
ON dbo.tbl_Owners.AppId= o2.AppId and
dbo.tbl_Owners.StartDate = o2.MAxStartDate
The sub-query above returns every AppId together with it's latest StartDate. Self-joining with that result will give you what you want.
You can USE CTE for this purpose
;WITH CTE AS
(
SELECT AppID,PersonID,StartDate,
ROW_NUMBER() OVER (PARTITION BY AppID ORDER BY StartDate DESC) RN
FROM TableNAme
GROUP BY AppID,PersonID,StartDate
)
SELECT * FROM CTE
WHERE RN=1
Using row_number
select t.*, p.* -- change as needed
from (select *, rn= row_number() over(partition by AppID order by StartDate desc)
from dbo.tbl_Owners
) t
join dbo.tbl_Persons p on t.rn=1 and t.PersonId = p.PersonId
using cross apply
select t.*, p.* -- change as needed
from dbo.tbl_Persons p
cross apply (
select top(1) *
from dbo.tbl_Owners o
where o.PersonId = p.PersonId
order by o.StartDate desc
) t
SELECT dbo.tbl_Owners.*,dbo.tbl_Persons.PersonFullname FROM dbo.tbl_Owners
INNER JOIN
dbo.tbl_Persons ON dbo.tbl_Owners.PersonID=dbo.tbl_Persons.PersonID
GROUP BY dbo.tbl_Owners.StartDate HAVING MAX(StartDate);
Use GROUP BY on StartDate instead on PersonID.