PostgreSQL join two tables with LIMIT 1 - sql

I have two tables:
First table "persons"
id | name |
---------------
1 | peter |
3 | martin |
5 | lucy |
Second table "meetings"
id | date | id_persons |
--------------------------------
1 | 2014-12-08 | 1 |
2 | 2013-05-10 | 2 |
3 | 2015-08-25 | 1 |
4 | 2016-10-18 | 1 |
5 | 2012-01-01 | 3 |
6 | 2016-09-28 | 5 |
I need to get only the latest date from the "meetings" table for every person (or for selected persons), and the result must be ordered by name. I thought it could look like this, but the subquery in the LEFT JOIN cannot reference persons.id in its WHERE clause:
SELECT meetings.id, meetings.date, persons.name FROM persons
LEFT JOIN (SELECT meetings.date, meetings.id, meetings.id_persons FROM
meetings WHERE persons.id = meetings.id_persons ORDER BY
meetings.date DESC LIMIT 1) m ON m.id_persons = persons.id
WHERE persons.id < 6 ORDER BY persons.name
So I tried DISTINCT ON and it worked, but I don't think it is a good idea:
SELECT * FROM
(SELECT DISTINCT ON (persons.id) persons.id, persons.name,
m.date, m.id FROM persons
LEFT JOIN (SELECT meetings.id, meetings.date, meetings.id_persons
FROM meetings ORDER BY meetings.date DESC) m
ON m.id_persons = persons.id
WHERE persons.id < 6 ORDER BY persons.id) p
ORDER BY p.name
The result I need is:
name | date | id_meetings
-----------------------------------
lucy | 2016-09-28 | 6
martin | 2012-01-01 | 5
peter | 2016-10-18 | 4
Could you help me with a better solution?

In Postgres, the easiest way is probably distinct on:
select distinct on (p.id) p.*, m.*
from persons p left join
meetings m
on m.id_persons = p.id
order by p.id, m.date desc;
Note: distinct on is specific to Postgres.
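The asker's first attempt (a correlated subquery with LIMIT 1 inside the join) can also be made to work with a LATERAL join, which lets the subquery reference persons. This is a minimal sketch, assuming PostgreSQL 9.3 or later and the table and column names from the question; the aliases p, mt and m are only illustrative.

```sql
-- For each person, the lateral subquery picks that person's latest meeting.
SELECT p.name, m.date, m.id AS id_meetings
FROM persons p
LEFT JOIN LATERAL (
    SELECT mt.id, mt.date
    FROM meetings mt
    WHERE mt.id_persons = p.id   -- allowed because of LATERAL
    ORDER BY mt.date DESC
    LIMIT 1
) m ON true                      -- keep persons that have no meetings
WHERE p.id < 6
ORDER BY p.name;
```

With the sample data this should return lucy, martin and peter with meetings 6, 5 and 4. The final ORDER BY is free to sort by name here, because nothing ties it to the per-person ordering inside the subquery.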

Related

Postgresql left join

I have two tables, cars and usages. I create a record in usages once a month for some of the cars.
Now I want to get a distinct list of cars, each with the latest usage that I saved.
First, please look at the tables:
cars:
| id | model | reseller_id |
|----|-------------|-------------|
| 1 | Samand Sall | 324228 |
| 2 | Saba 141 | 92933 |
usages:
| id | car_id | year | month | gas |
|----|--------|------|-------|-----|
| 1 | 2 | 2020 | 2 | 68 |
| 2 | 2 | 2020 | 3 | 94 |
| 3 | 2 | 2020 | 4 | 33 |
| 4 | 2 | 2020 | 5 | 12 |
The problem is that I need only the latest usage by year and month.
I tried a lot of ways, but none of them is good enough, because sometimes the query returns a usage record that is not the latest one.
SELECT * FROM cars AS c
LEFT JOIN
(select *
from usages
) u on (c.id = u.car_id)
order by u.gas desc
You can do this with a DISTINCT ON in the derived table:
SELECT *
FROM cars AS c
LEFT JOIN (
select distinct on (u.car_id) *
from usages u
order by u.car_id, u.year desc, u.month desc
) lu on c.id = lu.car_id
order by lu.gas desc;
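To see how the derived-table approach above behaves for a car that has no usages at all, here is a self-contained sketch. It assumes PostgreSQL and rebuilds the question's sample rows as inline VALUES lists, leaving out reseller_id for brevity; NULLS LAST is an addition of this sketch, not part of the answer.

```sql
WITH cars(id, model) AS (
    VALUES (1, 'Samand Sall'), (2, 'Saba 141')
),
usages(id, car_id, year, month, gas) AS (
    VALUES (1, 2, 2020, 2, 68),
           (2, 2, 2020, 3, 94),
           (3, 2, 2020, 4, 33),
           (4, 2, 2020, 5, 12)
)
SELECT c.id, c.model, lu.year, lu.month, lu.gas
FROM cars c
LEFT JOIN (
    -- one row per car_id: the row with the latest (year, month)
    SELECT DISTINCT ON (u.car_id) u.*
    FROM usages u
    ORDER BY u.car_id, u.year DESC, u.month DESC
) lu ON lu.car_id = c.id
ORDER BY lu.gas DESC NULLS LAST;  -- cars without usages go to the end
```

Car 2 comes back with the 2020/5 usage (gas 12), and car 1 comes back with NULL usage columns. Without NULLS LAST, a descending sort in PostgreSQL would put the NULL row first.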
I think you need the window function row_number:
select
id,
model,
reseller_id,
year,
month,
gas
from
(
select
c.id,
c.model,
c.reseller_id,
u.year,
u.month,
u.gas,
-- number each car's usages from newest to oldest
row_number() over (partition by c.id order by u.year desc, u.month desc) as rn
from cars c
left join usages u
on c.id = u.car_id
) subq
where rn = 1

PostgreSQL - Max value for each id

I'm trying to get max value of exam_id from table exams for each protege.
proteges
protege_id | protege_patron | protege_firstname | protege_lastname
------------+----------------+-------------------+------------------
1 | 1 | Andrzej | Maniek
2 | 1 | Anna | Maj
3 | 1 | Joanna | Jankowska
exams
exam_id | exam_protege | exam_weight | exam_glucose | exam_pressure
---------+--------------+-------------+--------------+---------------
1 | 1 | 84 | 3ml | 123/84
2 | 1 | 99 | 23ml | 124/72
3 | 2 | 99 | 23ml | 124/72
4 | 3 | 94 | 23ml | 124/72
First I've tried
SELECT DISTINCT protege_patron, exams.*
FROM exams INNER JOIN proteges ON protege_id = exam_protege
WHERE exam_id = (SELECT MAX(exam_id) FROM exams WHERE protege_patron = 1);
and the output was:
protege_patron | exam_id | exam_protege | exam_weight | exam_glucose | exam_pressure
----------------+---------+--------------+-------------+--------------+---------------
1 | 4 | 3 | 94 | 23ml | 124/72
(1 row)
After trying SELECT protege_firstname, protege_lastname, MAX (exam_id) FROM exams JOIN proteges ON protege_id = exam_protege GROUP BY protege_id; the output is:
protege_firstname | protege_lastname | max
-------------------+------------------+-----
Andrzej | Maniek | 2
Anna | Maj | 3
Joanna | Jankowska | 4
(3 rows)
So the logical next step was to add more columns, like exam_weight.
That's what I did:
SELECT protege_firstname, protege_lastname, exam_weight, MAX (exam_id) FROM exams JOIN proteges ON protege_id = exam_protege GROUP BY protege_id;
ERROR: column "exams.exam_weight" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select protege_firstname, protege_lastname, exam_weight, MAX...
^
At the moment I don't know how to fix that. I tried distinct and read a bit about aggregate functions. Is there any way to do that? All I want is to JOIN the two tables and, for each protege, select all of his values together with the values of his exam with the max exam_id.
You can use distinct on. I think the logic is:
select distinct on (exam_protege) e.*
from exams e
order by exam_protege, exam_id desc;
You can, of course, also bring in the protege information using a join:
select distinct on (e.exam_protege) e.*, p.*
from exams e join
proteges p
on e.exam_protege = p.protege_id
order by e.exam_protege, e.exam_id desc;
You can do it like this:
select * from
(
select max(exam_id) maxexamid, exam_protege from exams group by exam_protege
) as maxexams
inner join proteges p
on maxexams.exam_protege = p.protege_id
inner join exams e
on e.exam_id = maxexams.maxexamid
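If a version that does not depend on Postgres-only syntax is preferred, the same "latest exam per protege" result can be sketched with row_number(). This assumes the proteges and exams tables from the question; the aliases ex and e and the column name rn are illustrative.

```sql
SELECT p.protege_firstname,
       p.protege_lastname,
       e.exam_id,
       e.exam_weight,
       e.exam_glucose,
       e.exam_pressure
FROM proteges p
JOIN (
    -- rn = 1 marks the exam with the highest exam_id for each protege
    SELECT ex.*,
           row_number() OVER (PARTITION BY ex.exam_protege
                              ORDER BY ex.exam_id DESC) AS rn
    FROM exams ex
) e ON e.exam_protege = p.protege_id
   AND e.rn = 1;
```

This avoids the GROUP BY error from the question, because no aggregate is involved: whole exam rows are ranked and then filtered.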

SQL How to combine MAX and COUNT using inner query properly

I'm having a problem with selecting rows only with maximum values from column ProblemsAmount, which represents COUNT(*) from inner query. It looks like:
PersonID | PersonName | ProblemID | ProblemsAmount
1 | Johny | 1 | 10
1 | Johny | 2 | 5
1 | Johny | 3 | 18
2 | Sara | 4 | 2
2 | Sara | 5 | 12
3 | Katerina | 6 | 17
3 | Katerina | 7 | 2
4 | Elon | 8 | 20
5 | Willy | 9 | 6
5 | Willy | 10 | 2
What I want to get:
PersonID | PersonName | ProblemID | ProblemsAmount
1 | Johny | 3 | 18
2 | Sara | 5 | 12
3 | Katerina | 6 | 17
4 | Elon | 8 | 20
5 | Willy | 9 | 6
The code I have right now:
SELECT A.PersonID,
A.PersonName,
A.ProblemID,
MAX(A.ProblemsCounter) AS ProblemsAmount
FROM (SELECT Person.PersonId AS PersonID,
Person.Name AS PersonName,
Problem.ProblemId AS ProblemID,
COUNT(*) AS ProblemsCounter
FROM Person,
Problem
WHERE Problem.ProblemId = Person.ProblemId
GROUP BY Person.PersonId, Person.Name, Problem.ProblemId
) A
GROUP BY A.PersonID, A.PersonName, A.ProblemID
ORDER BY A.PersonName, ProblemsAmount DESC;
The inner query returns the same thing as the outer one does, and I'm confused by the MAX function. It doesn't work and I don't understand why. I tried to fix it using HAVING, but it wasn't successful.
Thanks in advance.
A simple method that doesn't require a subquery is TOP (1) WITH TIES and ROW_NUMBER():
SELECT TOP (1) WITH TIES p.PersonId, p.Name AS PersonName,
pr.ProblemId, COUNT(*) AS ProblemsCounter
FROM Person p JOIN
Problem pr
ON pr.ProblemId = p.ProblemId
GROUP BY p.PersonId, p.Name, pr.ProblemId
ORDER BY ROW_NUMBER() OVER (PARTITION BY p.PersonId ORDER BY COUNT(*) DESC);
Note that I have also fixed the JOIN syntax. Always use proper, explicit, standard JOIN syntax.
There is no need for a subquery; try the query below, and avoid comma-separated joins:
SELECT Person.PersonID ,
Person.Name AS PersonName,
COUNT(Problem.ProblemId) AS ProblemsAmount
FROM Person left join
Problem
on Problem.ProblemId = Person.ProblemId
GROUP BY Person.PersonID, Person.Name
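As for why MAX appears to do nothing in the question's query: the outer GROUP BY still includes ProblemID, so every group holds exactly one row from the inner query and MAX just returns that row's count. One way around this, sketched here with the question's Person and Problem tables, is to rank the counts per person in a derived table and keep the top row. The names ranked and rn are illustrative.

```sql
SELECT PersonID, PersonName, ProblemID, ProblemsAmount
FROM (
    SELECT p.PersonId   AS PersonID,
           p.Name       AS PersonName,
           pr.ProblemId AS ProblemID,
           COUNT(*)     AS ProblemsAmount,
           -- rank each person's problems by their count, highest first
           ROW_NUMBER() OVER (PARTITION BY p.PersonId
                              ORDER BY COUNT(*) DESC) AS rn
    FROM Person p
    JOIN Problem pr
      ON pr.ProblemId = p.ProblemId
    GROUP BY p.PersonId, p.Name, pr.ProblemId
) ranked
WHERE rn = 1
ORDER BY PersonName;
```

If two problems tie for a person's highest count, ROW_NUMBER() keeps only one of them, chosen arbitrarily; RANK() in its place would keep all tied rows.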

Unique rows from PostgreSQL database and ordering

How can I get unique rows from PostgreSQL (8.4.20)? At the moment my query looks like this:
SELECT
DISTINCT ON(student.id) student.id,
student.*,
programme_stage.*
FROM
person AS student
INNER JOIN
programme ON (student.id = programme.person_id)
RIGHT JOIN
programme_stage ON (programme.id = programme_stage.programme_id)
ORDER BY
student.id,
student.last_name ASC,
student.first_name ASC
LIMIT 10 OFFSET 0
The above query works correctly, but I want to sort by last name and first name first.
Sample data:
| id | first_name | last_name | programme_id | programme_stage _id |
|----|------------|-----------|--------------|---------------------|
| 1 | Michał | Nowak | 1 | 1 |
| 2 | Jan | Kowalski | 2 | 2 |
| 3 | Tomasz | Thomas | 2 | 1 |
Expected output:
| id | first_name | last_name | programme_id | programme_stage _id |
|----|------------|-----------|--------------|---------------------|
| 2 | Jan | Kowalski | 2 | 2 |
| 1 | Michał | Nowak | 1 | 1 |
| 3 | Tomasz | Thomas | 2 | 1 |
If I try to remove the student.id column from the ORDER BY clause, I get an error:
..SELECT DISTINCT ON expressions must match the expressions .. ORDER BY LINE 2: DISTINCT ON(student_person.id) student_person.id,
^
Use a subquery:
SELECT s.*
FROM (SELECT DISTINCT ON (s.id) s.*, ps.*
FROM programme_stage ps LEFT JOIN
programme p
ON p.id = ps.programme_id LEFT JOIN
person s
ON s.id = p.person_id
ORDER BY s.id
) s
ORDER BY s.last_name ASC, s.first_name ASC
LIMIT 10 OFFSET 0;
Notes:
I prefer shorter table aliases so the query is easier to write and to read.
I prefer LEFT JOIN to RIGHT JOIN. However, because you are using the person table for aggregation, you probably want inner joins.
There is no need to select the id twice.
You may need to select explicit columns instead of * if columns have the same name.

how to select unique records from a table based on a column which has distinct values in another column

I have below table SUBJ_SKILLS which has records like
TCHR_ID | LINE_NBR | SUBJ | SUBJ_TYPE
--------| ------- | ---------- | ----------
1 | 1 | Maths | R
1 | 2 | 101 | U
2 | 1 | BehaviourialTech | U
3 | 2 | Maths | R
4 | 1 | RegionalLANG | U
5 | 3 | ForeignLANG | U
5 | 4 | Maths | R
6 | 2 | Science | R
7 | 1 | 101 | U
7 | 3 | Physics | R
..
..
I am trying to retrieve records like below (i.e. single teacher who taught multiple different subjects)
TCHR_ID | LINE_NBR | SUBJ | SUBJ_TYPE
--------| ------- | ---------- | ----------
5 | 3 | ForeignLANG | U
5 | 4 | Maths | R
7 | 1 | 101 | U
7 | 3 | Physics | R
1 | 1 | Maths | R
1 | 2 | 101 | U
Here, the line numbers are unique but are not renumbered: for example, TCHR_ID 5 originally taught Physics as LINE_NBR=1, and that row was removed later, so the remaining LINE_NBR values stay as they are.
I also have a lookup table (SUBJ_LKUP) for subjects and their categories/types, as below ('R' for a Regular subject and 'U' for a Unique subject):
SUBJ | SUBJ_TYPE
----------------- | ------------
Maths | R
Physics | R
ForeignLANG | U
101 | U
Science | R
BehaviourialTech | U
RegionalLANG | U
My approach was to create a table holding the teachers that have 2 records, and then run another query against the base table (SUBJ_SKILLS) and the new table to filter out the distinct records. I came up with the queries below.
Query-1:
create table tchr_with_2_subj as select SS.TCHR_ID
from SUBJ_SKILLS SS, SUBJ_LKUP SL
where SS.SUBJ = SL.SUBJ
and SL.SUBJ_TYPE IN ('R', 'U') AND SS.TCHR_ID IN
(select SS.TCHR_ID from SUBJ_SKILLS SS)
GROUP BY SS.TCHR_ID HAVING COUNT(*) = 2)
Query-2:
select SS.TCHR_ID from SUBJ_SKILLS SS, tchr_with_2_subj tw2s
where SS.TCHR_ID = tw2s.TCHR_ID
GROUP BY SS.TCHR_ID,SS.SUBJ_TYPE HAVING COUNT(*) > 1)
Question:
1)'IN' condition in Query-1 is causing problems and pulling wrong records.
2) Is there a better way to write query to pull matching records using a single query (i.e. instead of creating a table)
Could someone help me on this pls.
For the answer to your original question, I would use window functions:
select ss.*
from (select ss.*,
min(subj) over (partition by tchr_id) as mins,
max(subj) over (partition by tchr_id) as maxs
from SUBJ_SKILLS ss
) ss
where mins <> maxs;
It is unclear how the subject type fits in, but if you need to include that, similar logic will work.
Your second table can be obtained from your first table with:
select ss.*
from
subj_skills as ss
inner join (
select tchr_id
from subj_skills
group by tchr_id
having count(*) > 1
) as mult on mult.tchr_id=ss.tchr_id;
I'd use analytic functions here, something like:
select tchr_id, line_nbr, subj, SUBJ_TYPE
from (select count(distinct subj) over (partition by tchr_id) as grp_cnt,
s.*
from subj_skills s)
where grp_cnt > 1
If you need to filter out invalid records, you can do it in the inner query. If a teacher cannot teach the same subject multiple times (the req 'multiple different subjects' can be translated to 'multiple subjects'), then I'd rather use count(*) instead of count(distinct subj).
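Some databases, PostgreSQL among them, reject DISTINCT inside a window aggregate, so count(distinct subj) over (...) will not run everywhere. A sketch of the same filter that avoids window functions, assuming the SUBJ_SKILLS table from the question (the alias multi is illustrative):

```sql
SELECT ss.tchr_id, ss.line_nbr, ss.subj, ss.subj_type
FROM subj_skills ss
JOIN (
    -- teachers with more than one distinct subject
    SELECT tchr_id
    FROM subj_skills
    GROUP BY tchr_id
    HAVING COUNT(DISTINCT subj) > 1
) multi ON multi.tchr_id = ss.tchr_id
ORDER BY ss.tchr_id, ss.line_nbr;
```

On the rows shown in the question this should keep teachers 1, 5 and 7 and drop those with a single subject.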