SQL get MAX for other value

I would like to figure out how to select, from a table, the ID of the person with the highest pay.
So if I had
Table=theJobs
JobID Pay
----------
12345 10
12346 12
12347 13
table=thePerson
Person PersonID
--------------
Person1 1
Person2 2
Person3 3
table=hire(FKs)
JobID PersonID
----------------
12345 2
12347 1
12346 3
I'd like it to show the highest-paid person, so it should show
Person1
I tried to use a Max function in the WHERE clause, but it seems to fail. I'm not very good with these group functions. I guess I'm really asking how to use a group function as a constraint, since I had a similar issue a while ago.

This solution will work in pretty much any database system:
Select ....
From thePerson As P
Join (
Select H1.PersonId, Max( J1.Pay ) As MaxPay
From hire As H1
Join theJobs As J1
On J1.JobId = H1.JobID
Group By H1.PersonId
) As PayPerPerson
On PayPerPerson.PersonId = P.Person
Where Exists (
-- compare each person's top pay with the overall top pay
Select 1
From hire As H2
Join theJobs As J2
On J2.JobId = H2.JobID
Having Max( J2.Pay ) = PayPerPerson.MaxPay
)
If you are using a DBMS that supports ranking functions and common-table expressions such as SQL Server 2005 and later, then the problem is easier. This solution will show only one name and ignore ties:
With RankedPay As
(
Select ...
, Row_Number() Over( Order By J.Pay Desc ) As Rnk
From thePerson As P
Join hire As H
On H.PersonId = P.Person
Join theJobs As J
On J.JobId = H.JobId
)
Select ...
From RankedPay
Where Rnk = 1
This solution will show any that match the top pay and include ties:
With RankedPay As
(
Select ...
, Rank() Over( Order By J.Pay Desc ) As Rnk
From thePerson As P
Join hire As H
On H.PersonId = P.Person
Join theJobs As J
On J.JobId = H.JobId
)
Select ...
From RankedPay
Where Rnk = 1
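The difference between the two ranking queries only shows up when there is a tie. The following is a minimal sketch, not a definitive implementation, that runs both against an in-memory SQLite database (window functions need SQLite 3.25 or later). It assumes thePerson has the columns (Person, PersonID), and it adds a hypothetical fourth job and person, which are not in the question, purely to create a tie at the top pay.

```python
import sqlite3

# Hypothetical in-memory copy of the question's tables. Job 12348 and
# Person4 are NOT in the question; they are added only to create a tie.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE theJobs (JobID INTEGER, Pay INTEGER);
CREATE TABLE thePerson (Person TEXT, PersonID INTEGER);
CREATE TABLE hire (JobID INTEGER, PersonID INTEGER);
INSERT INTO theJobs VALUES (12345, 10), (12346, 12), (12347, 13), (12348, 13);
INSERT INTO thePerson VALUES
    ('Person1', 1), ('Person2', 2), ('Person3', 3), ('Person4', 4);
INSERT INTO hire VALUES (12345, 2), (12347, 1), (12346, 3), (12348, 4);
""")

# Same CTE as in the answer; only the ranking function changes.
query = """
WITH RankedPay AS (
    SELECT P.Person,
           {fn}() OVER (ORDER BY J.Pay DESC) AS Rnk
    FROM thePerson AS P
    JOIN hire AS H ON H.PersonID = P.PersonID
    JOIN theJobs AS J ON J.JobID = H.JobID
)
SELECT Person FROM RankedPay WHERE Rnk = 1 ORDER BY Person
"""

# ROW_NUMBER numbers tied rows arbitrarily, so exactly one name survives.
one_name = [row[0] for row in conn.execute(query.format(fn="ROW_NUMBER"))]
# RANK gives every tied row the same rank, so both top earners survive.
all_top = [row[0] for row in conn.execute(query.format(fn="RANK"))]
print(one_name, all_top)
```

Which of the tied names ROW_NUMBER keeps is not defined unless the ORDER BY includes a tie-breaker such as the person ID.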

SELECT p.Person
FROM thePerson p JOIN hire h ON p.PersonID = h.PersonID
JOIN theJobs j ON h.JobID = j.JobID
ORDER BY j.Pay DESC
LIMIT 1;
If you are using an RDBMS that does not support the LIMIT clause, try using the TOP clause instead:
SELECT TOP 1 p.Person
FROM thePerson p JOIN hire h ON p.PersonID = h.PersonID
JOIN theJobs j ON h.JobID = j.JobID
ORDER BY j.Pay DESC
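The question also asks, more generally, how to use a group function as a constraint. An aggregate cannot be written directly in a WHERE clause, but a scalar subquery that returns the aggregate can. Here is a minimal sketch of that idea using the question's sample data in an in-memory SQLite database; it assumes thePerson has the columns (Person, PersonID).

```python
import sqlite3

# In-memory copy of the question's sample data
# (assumed columns: thePerson(Person, PersonID)).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE theJobs (JobID INTEGER, Pay INTEGER);
CREATE TABLE thePerson (Person TEXT, PersonID INTEGER);
CREATE TABLE hire (JobID INTEGER, PersonID INTEGER);
INSERT INTO theJobs VALUES (12345, 10), (12346, 12), (12347, 13);
INSERT INTO thePerson VALUES ('Person1', 1), ('Person2', 2), ('Person3', 3);
INSERT INTO hire VALUES (12345, 2), (12347, 1), (12346, 3);
""")

rows = conn.execute("""
SELECT p.Person
FROM thePerson p
JOIN hire h ON h.PersonID = p.PersonID
JOIN theJobs j ON j.JobID = h.JobID
-- the scalar subquery yields the highest pay among filled jobs
WHERE j.Pay = (SELECT MAX(j2.Pay)
               FROM hire h2
               JOIN theJobs j2 ON j2.JobID = h2.JobID)
""").fetchall()

top_paid = [r[0] for r in rows]
print(top_paid)
```

Unlike LIMIT 1 or TOP 1, this form returns every person tied for the highest pay.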

Related

Return only the row where the date in one column is closest to the date in another column?

I'm working on a query but it's returning some duplicate values and I need to return only the row where the date in one column is closest to the date in another column.
My query looks something like this:
SELECT p.Id, r.ReferralDate, s.SupervisionDate
FROM person p
INNER JOIN referral r on r.PersonId = p.Id
INNER JOIN supervision s on s.PersonId = p.Id
Which returns something like this:
Id   Supervision Date  Referral Date
------------------------------------
123  2015-09-30        2015-08-30
123  2020-02-30        2015-08-30
123  2020-06-30        2015-08-30
456  2010-06-30        2010-07-30
456  2005-06-30        2010-07-30
How can I write a query that returns the Supervision Date that is closest to the Referral Date? So that the final output looks like this:
Id   Supervision Date  Referral Date
------------------------------------
123  2015-09-30        2015-08-30
456  2010-06-30        2010-07-30
There are two ways to select the record that you need in this scenario.
I prefer the first, as it is the more generic solution: number the rows for each person by date difference and keep the row with the smallest difference. You can also choose which columns to show.
SELECT A.Id
, A.ReferralDate
, A.SupervisionDate
FROM
(
SELECT
p.Id
, r.ReferralDate
, s.SupervisionDate
-- ABS so that dates before and after the referral are ranked by distance
,ROW_NUMBER() OVER (PARTITION BY p.Id ORDER BY ABS(DATEDIFF(DD,r.ReferralDate,s.SupervisionDate)) ASC) as [row_num]
FROM person p
INNER JOIN referral r
on r.PersonId = p.Id
INNER JOIN supervision s
on s.PersonId = p.Id
) AS A
WHERE A.row_num = 1
The second way orders by the date difference and takes the top 1 record. If you wrap it in a CTE or derived table, you can still return only the columns you need. This query is scenario-specific: because it does not partition, it returns a single row for the whole result set, not one row per Id.
SELECT top 1
p.Id
, r.ReferralDate
, s.SupervisionDate
,ABS(DATEDIFF(DD,r.ReferralDate,s.SupervisionDate)) as [date dif]
FROM person p
INNER JOIN referral r
on r.PersonId = p.Id
INNER JOIN supervision s
on s.PersonId = p.Id
ORDER BY [date dif] ASC
Both ways order by the date difference, but they return the same result only when the query is restricted to a single person; the first one returns one row per Id.
Here is one way:
select t.Id, t.ReferralDate, t.SupervisionDate from (
select p.Id, r.ReferralDate, s.SupervisionDate , row_number() over (partition by p.Id order by abs(datediff(day , r.ReferralDate, s.SupervisionDate))) rn
from person p
join referral r on r.PersonId = p.Id
join supervision s on s.PersonId = p.Id
) t
where rn = 1
Another way (SQLite, which fills the non-aggregated columns from the row that holds the MIN):
select ID, SUP from
( SELECT
p.Id 'ID',
r.ReferralDate 'REF',
s.SupervisionDate 'SUP',
min(abs(julianday(s.SupervisionDate) - julianday(r.ReferralDate))) 'diff'
FROM person p
INNER JOIN referral r on r.PersonId = p.Id
INNER JOIN supervision s on s.PersonId = p.Id
GROUP BY p.Id
)
This will return ID and SupervisionDate as requested.
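To check the "closest date" logic end to end, here is a small sketch under stated assumptions rather than a definitive implementation. It loads the question's sample rows into an in-memory SQLite database (3.25 or later for window functions) and uses julianday in place of DATEDIFF. The table layout is assumed from the question's query, and the sample value 2020-02-30 is replaced by 2020-02-28 because it is not a real calendar date.

```python
import sqlite3

# Assumed layout, taken from the question's query. '2020-02-30' from the
# sample output is replaced by '2020-02-28' (not a real date otherwise).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (Id INTEGER);
CREATE TABLE referral (PersonId INTEGER, ReferralDate TEXT);
CREATE TABLE supervision (PersonId INTEGER, SupervisionDate TEXT);
INSERT INTO person VALUES (123), (456);
INSERT INTO referral VALUES (123, '2015-08-30'), (456, '2010-07-30');
INSERT INTO supervision VALUES
    (123, '2015-09-30'), (123, '2020-02-28'), (123, '2020-06-30'),
    (456, '2010-06-30'), (456, '2005-06-30');
""")

closest = conn.execute("""
SELECT Id, SupervisionDate, ReferralDate
FROM (
    SELECT p.Id, s.SupervisionDate, r.ReferralDate,
           ROW_NUMBER() OVER (
               PARTITION BY p.Id
               -- ABS ranks by distance, whichever side of the referral
               ORDER BY ABS(julianday(s.SupervisionDate)
                            - julianday(r.ReferralDate))
           ) AS rn
    FROM person p
    JOIN referral r ON r.PersonId = p.Id
    JOIN supervision s ON s.PersonId = p.Id
) AS t
WHERE rn = 1
ORDER BY Id
""").fetchall()
print(closest)
```

Person 456 is the interesting case: both supervision dates fall before the referral, so without ABS the most distant date would sort first.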

PARTITION BY duplicated id and JOIN with the ID with the least value

I need to JOIN, through a view in SQL Server 2008, the tables hstT and hstD. The main table contains data about employees and their "logins" (so there are multiple records for a given employee in a given month), and the second table has info about their area by month. I need to join both tables, but using the earliest hstD record as the reference for the join and applying it to all the records associated with that id.
So hstT its something like:
id id2 period name
----------------------
x 1 0718 john
x 1 0818 john
y 2 0718 jane
And hstD:
id2 period area
----------------------
1 0718 sales
1 0818 hr
2 0707 mng
With an OUTER JOIN I managed to merge all the data based on id2 (the user id) and the period, BUT as I mentioned, I need to join the other table based on its earliest record for each id (which I could use as the criterion), so it would look like this:
id id2 period name area
---------------------------
x 1 0718 john sales
x 1 0818 john sales
y 2 0718 jane mng
I know I could use ROW_number but I don't know how to use it in a view and JOIN it on those conditions:
SELECT T.*,D.*, ROW_NUMBER() OVER (PARTITION BY T.ID ORDER BY T.PERIOD ASC) AS ORID
FROM dbo.hstT AS T LEFT OUTER JOIN
dbo.hstD AS D ON T.period = D.period AND T.id2 = D.id2
WHERE ORID = 1
--prompts error as orid doesn't exist in any table
You can use apply for this:
select t.*, d.area
from hstT t outer apply
(select top (1) d.*
from hstD d
where d.id2 = t.id2 and d.period <= t.period
order by d.period asc
) d;
Actually, if you just want the earliest period, then you can filter and join:
select t.*, d.area
from hstT t left join
(select d.*, row_number() over (partition by id2 order by period asc) as seqnum
from hstD d
) d
on d.id2 = t.id2 and seqnum = 1;
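The filter-and-join idea can be checked against the question's sample rows. This is a minimal sketch using an in-memory SQLite database (3.25 or later) rather than SQL Server. It stores period as text exactly as shown in the question, so "earliest" here means the smallest MMYY string, which is only chronological within a single year; that is an assumption of the sketch, not something the question confirms.

```python
import sqlite3

# Sample rows from the question; period is kept as the MMYY text shown.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hstT (id TEXT, id2 INTEGER, period TEXT, name TEXT);
CREATE TABLE hstD (id2 INTEGER, period TEXT, area TEXT);
INSERT INTO hstT VALUES
    ('x', 1, '0718', 'john'), ('x', 1, '0818', 'john'), ('y', 2, '0718', 'jane');
INSERT INTO hstD VALUES (1, '0718', 'sales'), (1, '0818', 'hr'), (2, '0707', 'mng');
""")

rows = conn.execute("""
SELECT t.id, t.id2, t.period, t.name, d.area
FROM hstT AS t
LEFT JOIN (
    -- number each user's hstD rows, earliest period first
    SELECT id2, area,
           ROW_NUMBER() OVER (PARTITION BY id2 ORDER BY period) AS seqnum
    FROM hstD
) AS d
  ON d.id2 = t.id2 AND d.seqnum = 1
ORDER BY t.id, t.period
""").fetchall()

for row in rows:
    print(row)
```

Because the row number is computed inside the derived table, it can be used in the ON clause, which is what the WHERE ORID = 1 attempt in the question could not do.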

How to make LEFT JOIN with row having max date?

I have two tables in Oracle DB
Person (
id
)
Bill (
id,
date,
amount,
person_id
)
I need to get the person and the amount from the last bill, if one exists.
I am trying to do it this way:
SELECT
p.id,
b.amount
FROM Person p
LEFT JOIN Bill b
ON b.person_id = p.id AND b.date = (SELECT MAX(date) FROM Bill WHERE person_id = 1)
WHERE p.id = 1;
But this query works only with INNER JOIN. With LEFT JOIN it throws ORA-01799: a column may not be outer-joined to a subquery.
How can I get the amount from the last bill using a left join?
Please try the query below, which avoids outer-joining to a subquery:
SELECT
p.id,
b.amount
FROM Person p
LEFT JOIN(select * from Bill where date =
(SELECT MAX(date) FROM Bill b1 WHERE person_id = 1)) b ON b.person_id = p.id
WHERE p.id = 1;
What you are looking for is a way to tell in bills, for each person, what is the latest record, and that one is the one to join with. One way is to use row_number:
select * from person p
left join (select b.*,
row_number() over (partition by person_id order by date desc) as seq_num
from Bill b) b
on p.id = b.person_id
and seq_num = 1
Oracle does not allow a column to be outer-joined to a subquery inside the ON clause.
Instead, move the subquery into a derived table and LEFT JOIN to that.
Not tested, but this should work.
SELECT
p.id,
b.amount
FROM Person p
LEFT JOIN (
SELECT person_id, amount FROM Bill
WHERE date = (SELECT MAX(date) FROM Bill WHERE person_id = 1)) b
ON b.person_id = p.id
WHERE p.id = 1;
I'm not quite sure why you would want to filter for the date though.
Simply filtering for the person_id should do the trick
You should join Person to the max date per person_id, and then join Bill on that date:
select Person.id, bill.amount
from Person
left join (
select person_id, max(date) as max_date
from bill
group by person_id ) t on t.person_id = Person.id
left join bill on bill.person_id = t.person_id and bill.date = t.max_date
You can do it like this:
SELECT
p.id,
b.amount
FROM Person p
LEFT JOIN Bill b
ON b.person_id = p.id AND b.date = (SELECT max(date) FROM Bill WHERE person_id = 1)
WHERE p.id = 1
SELECT
p.id,
b.amount
FROM Person p
LEFT JOIN Bill b
ON b.person_id = p.id
WHERE b.person_id IS NULL OR b.date = (SELECT max(date) FROM Bill sb WHERE sb.person_id = p.id);
SELECT
p.id,
c.amount
FROM Person p
LEFT JOIN (select b.person_id as personid,b.amount as amount from Bill b where b.date1= (select max(date1) from Bill where person_id=1)) c
ON c.personid = p.id
WHERE p.id = 1;
Try this:
select p.id, b.amount from person p
left join (select person_id, MAX(amount) KEEP (DENSE_RANK FIRST ORDER BY date DESC) as amount
from Bill
group by person_id) b
on p.id = b.person_id
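I tried the GREATEST() function in the join condition, but note that GREATEST compares the values passed to it within a single row, not across rows, so this condition is true for every bill and returns all of the person's bills rather than only the last one: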
SELECT
p.id,
b.amount
FROM Person p
LEFT JOIN Bill b
ON b.person_id = p.id
AND b.date = GREATEST(b.date)
WHERE p.id = 1
This allows you to grab the whole row if necessary and grab the top x rows
SELECT p.id
,b.amount
FROM person p
LEFT JOIN
(
SELECT * FROM
(
SELECT person_id, amount, date
,ROW_NUMBER() OVER (PARTITION BY person_id ORDER BY date DESC) AS row_num
FROM bill
)
WHERE row_num = 1
) b ON p.id = b.person_id
WHERE p.id = 1
;
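The behaviour the question asks for (keep the person even when there is no bill) is easy to verify with a tiny data set. The following is a sketch under stated assumptions: it uses an in-memory SQLite database (3.25 or later) instead of Oracle, renames the date column to the hypothetical bill_date to avoid the reserved word, and invents two persons and two bills purely for illustration.

```python
import sqlite3

# Hypothetical sample data: person 1 has two bills, person 2 has none.
# bill_date is an assumed column name standing in for the question's "date".
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (id INTEGER);
CREATE TABLE Bill (id INTEGER, bill_date TEXT, amount INTEGER, person_id INTEGER);
INSERT INTO Person VALUES (1), (2);
INSERT INTO Bill VALUES (10, '2020-01-01', 50, 1), (11, '2020-03-01', 75, 1);
""")

rows = conn.execute("""
SELECT p.id, b.amount
FROM Person p
LEFT JOIN (
    -- seq_num = 1 marks each person's most recent bill
    SELECT person_id, amount,
           ROW_NUMBER() OVER (PARTITION BY person_id
                              ORDER BY bill_date DESC) AS seq_num
    FROM Bill
) b
  ON b.person_id = p.id AND b.seq_num = 1
ORDER BY p.id
""").fetchall()
print(rows)
```

Keeping the seq_num = 1 test in the ON clause rather than in a WHERE clause is what preserves the person without bills; moving it to WHERE would turn the LEFT JOIN back into an inner join.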

Getting oldest Date SQL Complexity

I have a problem which I cannot resolve in a SQL script, no matter what, without resorting to application code.
I have 2 tables
Person
ID Name Type
1 A A1
2 B A2
3 C A3
4 D A4
5 E A6
PersonHomes
HOMEID Location PurchaseDate PersonID
1 CA 20160101 1
2 CT 20160202 1
3 DT 20160101 2
4 BT 20170102 3
5 CT 20160303 1
6 CA 20160101 2
PersonID is foreign key of Person Table
There are no other rows in the tables.
We have to show the details of EACH person WITH a home.
The rules for the output are:
1. IF a person has a SINGLE entry in PersonHomes, use it.
2. IF a person has MORE than ONE entry in PersonHomes with different purchase dates, USE the PersonHomes row with the OLDEST date AND DELETE that person's other rows.
3. IF a person has MORE than ONE entry in PersonHomes with the SAME purchase date, USE the row with the LOWER ID AND DELETE that person's other rows.
This is very easy to do in code but using SQL it is complex
What I tried was to
WITH PERSON (
SELECT * FROM Person)
SELECT * FROM PERSON
INNER JOIN PersonHomes ON Person.ID = PersonHomes.PersonID
WHERE PersonHomes.PersonID = CASE WHEN (COUNT (*) FROM PersonHomes...)
Then I think I can write SQL function ?
I am stuck, Please help!
SAMPLE OUTPUT for PERSON A
ID NAME Type HOMEID Location PurchaseDate
1 A A1 5 CT 20160303
For PERSON B
ID NAME Type HOMEID Location PurchaseDate
2 B A2 3 DT 20160101
It is not so easy to get the desired output with a single SQL query; I would write more than one.
First I created a temp table which summarizes each person's homes:
select PersonID, count(*) as HomeCount,
count(distinct PurchaseDate) as PurchaseDateCount,
min(PurchaseDate) as oldestPurchaseDate,
min(HOMEID) as LowerHomeID
into #PersonHomesAbstractTable
from PersonHomes
group by PersonID
Then for the output of your first rule:
select p.ID, p.NAME, p.Type, ph.HOMEID, ph.Location, ph.PurchaseDate from Person p
inner join #PersonHomesAbstractTable a on p.ID = a.PersonID
inner join PersonHomes ph on p.ID = ph.PersonID
where a.HomeCount = 1
For the output of your second rule:
select p.ID, p.NAME, p.Type, ph.HOMEID, ph.Location, ph.PurchaseDate
from Person p inner join #PersonHomesAbstractTable a on p.ID = a.PersonID
inner join PersonHomes ph on p.ID = ph.PersonID and
ph.PurchaseDate = a.oldestPurchaseDate
where a.HomeCount > 1 and a.PurchaseDateCount <> 1
And finally for the output of your third rule:
select p.ID, p.NAME, p.Type, ph.HOMEID, ph.Location, ph.PurchaseDate
from Person p inner join #PersonHomesAbstractTable a on p.ID = a.PersonID
inner join PersonHomes ph on p.ID = ph.PersonID and
ph.HOMEID = a.LowerHomeID
where a.HomeCount > 1 and a.PurchaseDateCount = 1
Of course there are other ways, but this is the one that came to mind.
If you want to delete the undesired rows, you can use the script below:
delete from PersonHomes where HOMEID in
(
select ph.HOMEID from #PersonHomesAbstractTable a
inner join PersonHomes ph on a.PersonID = ph.PersonID and
ph.PurchaseDate <> a.oldestPurchaseDate
where a.HomeCount > 1 and a.PurchaseDateCount <> 1
union
select ph.HOMEID from #PersonHomesAbstractTable a
inner join PersonHomes ph on a.PersonID = ph.PersonID and
ph.HOMEID <> a.LowerHomeID
where a.HomeCount > 1 and a.PurchaseDateCount = 1
)
You seem to have a prioritization query. I would solve this using row_number():
select ph.*
from (select ph.*,
row_number() over (partition by personid
order by purchasedate asc, homeid asc
) as seqnum
from personhomes ph
) ph
where seqnum = 1;
This doesn't actually change the data in the table. Although you say delete, it seems like you just want a result set with one home per person.
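This is the shortest approach, taken from a linked answer. Its Address table and AddressMoveDate column stand in for PersonHomes and PurchaseDate, and ID for PersonID; to keep the oldest row, order ascending instead of DESC: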
;WITH cte AS
(
SELECT *, RowN = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY AddressMoveDate DESC) FROM Address
)
DELETE FROM cte WHERE RowN > 1
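All three rules collapse into one ordering: oldest PurchaseDate first, then lowest HOMEID. A minimal sketch of the delete step on the question's data follows, using an in-memory SQLite database (3.25 or later). SQLite cannot delete through a CTE the way SQL Server can, so this sketch deletes by HOMEID instead; that substitution is a choice made here, not part of any answer above.

```python
import sqlite3

# PersonHomes rows exactly as listed in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PersonHomes (
    HOMEID INTEGER PRIMARY KEY, Location TEXT,
    PurchaseDate TEXT, PersonID INTEGER);
INSERT INTO PersonHomes VALUES
    (1, 'CA', '20160101', 1), (2, 'CT', '20160202', 1),
    (3, 'DT', '20160101', 2), (4, 'BT', '20170102', 3),
    (5, 'CT', '20160303', 1), (6, 'CA', '20160101', 2);
""")

# Keep only the row ranked first per person; delete everything else.
conn.execute("""
DELETE FROM PersonHomes
WHERE HOMEID NOT IN (
    SELECT HOMEID
    FROM (
        SELECT HOMEID,
               ROW_NUMBER() OVER (
                   PARTITION BY PersonID
                   ORDER BY PurchaseDate ASC, HOMEID ASC
               ) AS seqnum
        FROM PersonHomes
    ) AS ranked
    WHERE seqnum = 1
)
""")

remaining = conn.execute(
    "SELECT HOMEID, PersonID FROM PersonHomes ORDER BY HOMEID"
).fetchall()
print(remaining)
```

The yyyymmdd text dates sort chronologically as strings, which is why no date conversion is needed in the ORDER BY.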

SQL Server Query using GROUP BY

I am having trouble writing a query that will select all Skills, joining the Employee and Competency records, but only return one skill per employee, their newest Skill. Using this sample dataset
Skills
======
id employee_id competency_id created
1 1 1 Jan 1
2 2 2 Jan 1
3 1 2 Jan 3
Employees
===========
id first_name last_name
1 Mike Jones
2 Steve Smith
Competencies
============
id title
1 Problem Solving
2 Compassion
I would like to retrieve the following data
Skill.id Skill.employee_id Skill.competency_id Skill.created Employee.id Employee.first_name Employee.last_name Competency.id Competency.title
2 2 2 Jan 1 2 Steve Smith 2 Compassion
3 1 2 Jan 3 1 Mike Jones 2 Compassion
I was able to select the employee_id and max created using
SELECT MAX(created) as created, employee_id FROM skills GROUP BY employee_id
But when I start to add more fields in the select statement or add in a join I get the 'Column 'xyz' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.' error.
Any help is appreciated and I don't have to use GROUP BY, it's just what I'm familiar with.
The error you were getting occurs because SQL Server requires every non-aggregated item in the SELECT list to be included in the GROUP BY when an aggregate function is used.
The problem with that is you might have unique values in some columns which can throw off the result. So you will want to rewrite the query to use one of the following:
You can use a subquery to get this result. This gets the max(created) in a subquery and then you use that result to get the correct employee record:
select s.id SkillId,
s.employee_id,
s.competency_id,
s.created,
e.id employee,
e.first_name,
e.last_name,
c.id competency,
c.title
from Employees e
left join Skills s
on e.id = s.employee_id
inner join
(
SELECT MAX(created) as created, employee_id
FROM skills
GROUP BY employee_id
) s1
on s.employee_id = s1.employee_id
and s.created = s1.created
left join Competencies c
on s.competency_id = c.id
See SQL Fiddle with Demo
Or another way to do this is to use row_number():
select *
from
(
select s.id SkillId,
s.employee_id,
s.competency_id,
s.created,
e.id employee,
e.first_name,
e.last_name,
c.id competency,
c.title,
row_number() over(partition by e.id
order by s.created desc) rn
from Employees e
left join Skills s
on e.id = s.employee_id
left join Competencies c
on s.competency_id = c.id
) src
where rn = 1
See SQL Fiddle with Demo
For every non-aggregated column you add to your SELECT statement you need to update your GROUP BY to include it.
This article may help you understand why.
;WITH
MAX_SKILL_created AS
(
SELECT
MAX(skills.created) as created,
skills.employee_id
FROM
skills
GROUP BY
skills.employee_id
),
MAX_SKILL_id AS
(
SELECT
MAX(skills.id) as id,
skills.employee_id
FROM
skills
INNER JOIN MAX_SKILL_created
ON MAX_SKILL_created.employee_id = skills.employee_id
AND MAX_SKILL_created.created = skills.created
GROUP BY
skills.employee_id
)
SELECT
* -- type all your columns here
FROM
employees
INNER JOIN MAX_SKILL_id
ON MAX_SKILL_id.employee_id = employees.id
INNER JOIN skills
ON skills.id = MAX_SKILL_id.id
INNER JOIN competencies
ON competencies.id = skills.competency_id
If you are using SQL Server, then you can use OUTER APPLY:
SELECT *
FROM employees E
OUTER APPLY (
SELECT TOP 1 *
FROM skills
WHERE employee_id = E.id
ORDER BY created DESC
) S
INNER JOIN competencies C
ON C.id = S.competency_id
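For comparison, the same "newest skill per employee" result can be produced with a correlated subquery, which is roughly the portable counterpart of TOP 1 inside APPLY. This is a sketch under stated assumptions: it runs on an in-memory SQLite database, and the question's "Jan 1" and "Jan 3" are stored as ISO dates in a hypothetical year (2013) so that they sort correctly as text.

```python
import sqlite3

# Sample data from the question; the year 2013 is an assumption made
# only so the created dates sort correctly as ISO text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (id INTEGER, first_name TEXT, last_name TEXT);
CREATE TABLE Competencies (id INTEGER, title TEXT);
CREATE TABLE Skills (id INTEGER, employee_id INTEGER,
                     competency_id INTEGER, created TEXT);
INSERT INTO Employees VALUES (1, 'Mike', 'Jones'), (2, 'Steve', 'Smith');
INSERT INTO Competencies VALUES (1, 'Problem Solving'), (2, 'Compassion');
INSERT INTO Skills VALUES
    (1, 1, 1, '2013-01-01'), (2, 2, 2, '2013-01-01'), (3, 1, 2, '2013-01-03');
""")

rows = conn.execute("""
SELECT s.id, e.first_name, c.title, s.created
FROM Skills s
JOIN Employees e ON e.id = s.employee_id
JOIN Competencies c ON c.id = s.competency_id
WHERE s.id = (
    -- newest skill for this employee; id breaks ties on created
    SELECT s2.id
    FROM Skills s2
    WHERE s2.employee_id = s.employee_id
    ORDER BY s2.created DESC, s2.id DESC
    LIMIT 1
)
ORDER BY s.id
""").fetchall()

for row in rows:
    print(row)
```

The extra s2.id DESC in the ORDER BY guarantees a single row per employee even when two skills share the same created value, which a plain join on MAX(created) does not.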