Unique rows from PostgreSQL database and ordering - sql

How can I get unique rows from PostgreSQL (8.4.20)? At the moment my query looks like this:
SELECT
DISTINCT ON(student.id) student.id,
student.*,
programme_stage.*
FROM
person AS student
INNER JOIN
programme ON (student.id = programme.person_id)
RIGHT JOIN
programme_stage ON (programme.id = programme_stage.programme_id)
ORDER BY
student.id,
student.last_name ASC,
student.first_name ASC
LIMIT 10 OFFSET 0
The above query works correctly, but I want the results sorted primarily by last name and first name.
Sample data:
| id | first_name | last_name | programme_id | programme_stage_id |
|----|------------|-----------|--------------|---------------------|
| 1 | Michał | Nowak | 1 | 1 |
| 2 | Jan | Kowalski | 2 | 2 |
| 3 | Tomasz | Thomas | 2 | 1 |
Expected output:
| id | first_name | last_name | programme_id | programme_stage_id |
|----|------------|-----------|--------------|---------------------|
| 2 | Jan | Kowalski | 2 | 2 |
| 1 | Michał | Nowak | 1 | 1 |
| 3 | Tomasz | Thomas | 2 | 1 |
If I try to remove the student.id column from the ORDER BY clause, I get this error:
..SELECT DISTINCT ON expressions must match the expressions .. ORDER BY LINE 2: DISTINCT ON(student_person.id) student_person.id,
^

Use a subquery:
SELECT s.*
FROM (SELECT DISTINCT ON (s.id) s.*, ps.*
FROM programme_stage ps LEFT JOIN
programme p
ON p.id = ps.programme_id LEFT JOIN
person s
ON s.id = p.person_id
ORDER BY s.id
LIMIT 10 OFFSET 0
) s
ORDER BY s.last_name ASC, s.first_name ASC;
Notes:
I prefer shorter table aliases so the query is easier to write and to read.
I prefer LEFT JOIN to RIGHT JOIN. However, because you are using the person table for aggregation, you probably want inner joins.
There is no need to select the id twice.
You may need to select explicit columns instead of * if columns have the same name.
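As a minimal sketch of the "deduplicate first, then sort by name" idea, the snippet below uses ROW_NUMBER() as a portable stand-in for DISTINCT ON and runs it against an in-memory SQLite database so it is self-contained. The table layout, the extra programme_stage row for student 1, and the function name unique_students_by_name are assumptions made for illustration, not the asker's real schema.

```python
import sqlite3


def unique_students_by_name():
    """Return one row per student, sorted by last name and first name."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE person (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
        CREATE TABLE programme (id INTEGER PRIMARY KEY, person_id INTEGER);
        CREATE TABLE programme_stage (id INTEGER PRIMARY KEY, programme_id INTEGER);
        INSERT INTO person VALUES
            (1, 'Michał', 'Nowak'), (2, 'Jan', 'Kowalski'), (3, 'Tomasz', 'Thomas');
        INSERT INTO programme VALUES (1, 1), (2, 2), (3, 3);
        -- Assumed data: student 1 has two stages, so a plain join returns him twice.
        INSERT INTO programme_stage VALUES (1, 1), (2, 1), (3, 2), (4, 3);
    """)
    rows = con.execute("""
        SELECT id, first_name, last_name, stage_id
        FROM (SELECT s.id, s.first_name, s.last_name, ps.id AS stage_id,
                     -- Number the rows of each student; rn = 1 keeps exactly one.
                     ROW_NUMBER() OVER (PARTITION BY s.id ORDER BY ps.id) AS rn
              FROM person s
              JOIN programme p ON p.person_id = s.id
              JOIN programme_stage ps ON ps.programme_id = p.id) AS t
        WHERE rn = 1
        ORDER BY last_name, first_name
        LIMIT 10 OFFSET 0
    """).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in unique_students_by_name():
        print(row)
```

Because the deduplication happens in the inner query, the outer ORDER BY is free to use any columns, and the LIMIT applies to the name-sorted result.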

Related

PostgreSQL - Max value for each id

I'm trying to get max value of exam_id from table exams for each protege.
proteges
protege_id | protege_patron | protege_firstname | protege_lastname
------------+----------------+-------------------+------------------
1 | 1 | Andrzej | Maniek
2 | 1 | Anna | Maj
3 | 1 | Joanna | Jankowska
exams
exam_id | exam_protege | exam_weight | exam_glucose | exam_pressure
---------+--------------+-------------+--------------+---------------
1 | 1 | 84 | 3ml | 123/84
2 | 1 | 99 | 23ml | 124/72
3 | 2 | 99 | 23ml | 124/72
4 | 3 | 94 | 23ml | 124/72
First I've tried
SELECT DISTINCT protege_patron, exams.*
FROM exams INNER JOIN proteges ON protege_id = exam_protege
WHERE exam_id = (SELECT MAX(exam_id) FROM exams WHERE protege_patron = 1);
and the output was:
protege_patron | exam_id | exam_protege | exam_weight | exam_glucose | exam_pressure
----------------+---------+--------------+-------------+--------------+---------------
1 | 4 | 3 | 94 | 23ml | 124/72
(1 row)
After trying SELECT protege_firstname, protege_lastname, MAX (exam_id) FROM exams JOIN proteges ON protege_id = exam_protege GROUP BY protege_id; the output is:
protege_firstname | protege_lastname | max
-------------------+------------------+-----
Andrzej | Maniek | 2
Anna | Maj | 3
Joanna | Jankowska | 4
(3 rows)
So the logical next step was to add more columns, such as exam_weight.
That's what I did:
SELECT protege_firstname, protege_lastname, exam_weight, MAX (exam_id) FROM exams JOIN proteges ON protege_id = exam_protege GROUP BY protege_id;
ERROR: column "exams.exam_weight" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select protege_firstname, protege_lastname, exam_weight, MAX...
^
At the moment I don't know how to fix that. I tried DISTINCT and read a bit about aggregate functions. Is there any way to do this? All I want is to JOIN the two tables and, for each protege, select all of his values together with the values of his exam with the highest exam_id.
You can use distinct on. I think the logic is:
select distinct on (exam_protege) e.*
from exams e
order by exam_protege, exam_id desc;
You can, of course, also bring in the protege information using a join:
select distinct on (exam_protege) e.*, p.*
from exams e join
proteges p
on e.exam_protege = p.protege_id
order by exam_protege, exam_id desc;
You can do it like this:
select * from
(
select max(exam_id) maxexamid, exam_protege from exams group by exam_protege
) as maxexams
inner join proteges p
on maxexams.exam_protege = p.protege_id
inner join exams e
on e.exam_id = maxexams.maxexamid
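The asker's first attempt returned a single row because its subquery computed one MAX(exam_id) over the whole exams table. A hedged sketch of a correlated variant, where the subquery is evaluated per protege, is shown below. It loads the sample rows from the question into in-memory SQLite; the function name latest_exam_per_protege is made up for this example.

```python
import sqlite3


def latest_exam_per_protege():
    """Return each protege together with the exam that has the highest exam_id."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE proteges (protege_id INTEGER, protege_patron INTEGER,
                               protege_firstname TEXT, protege_lastname TEXT);
        CREATE TABLE exams (exam_id INTEGER, exam_protege INTEGER, exam_weight INTEGER,
                            exam_glucose TEXT, exam_pressure TEXT);
        INSERT INTO proteges VALUES
            (1, 1, 'Andrzej', 'Maniek'), (2, 1, 'Anna', 'Maj'), (3, 1, 'Joanna', 'Jankowska');
        INSERT INTO exams VALUES
            (1, 1, 84, '3ml', '123/84'), (2, 1, 99, '23ml', '124/72'),
            (3, 2, 99, '23ml', '124/72'), (4, 3, 94, '23ml', '124/72');
    """)
    rows = con.execute("""
        SELECT p.protege_firstname, p.protege_lastname, e.exam_id, e.exam_weight
        FROM proteges p
        JOIN exams e ON e.exam_protege = p.protege_id
        -- The subquery is correlated: it looks only at the current protege's exams.
        WHERE e.exam_id = (SELECT MAX(e2.exam_id)
                           FROM exams e2
                           WHERE e2.exam_protege = p.protege_id)
        ORDER BY p.protege_id
    """).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in latest_exam_per_protege():
        print(row)
```

No GROUP BY is involved, so any exam column such as exam_weight can be selected without triggering the "must appear in the GROUP BY clause" error.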

how to select unique records from a table based on a column which has distinct values in another column

I have the table SUBJ_SKILLS below, which has records like:
TCHR_ID | LINE_NBR | SUBJ | SUBJ_TYPE
--------| ------- | ---------- | ----------
1 | 1 | Maths | R
1 | 2 | 101 | U
2 | 1 | BehaviourialTech | U
3 | 2 | Maths | R
4 | 1 | RegionalLANG | U
5 | 3 | ForeignLANG | U
5 | 4 | Maths | R
6 | 2 | Science | R
7 | 1 | 101 | U
7 | 3 | Physics | R
..
..
I am trying to retrieve records like the ones below (i.e. teachers who taught multiple different subjects):
TCHR_ID | LINE_NBR | SUBJ | SUBJ_TYPE
--------| ------- | ---------- | ----------
5 | 3 | ForeignLANG | U
5 | 4 | Maths | R
7 | 1 | 101 | U
7 | 3 | Physics | R
1 | 1 | Maths | R
1 | 2 | 101 | U
Here, the line numbers are unique. For example, TCHR_ID 5 taught Physics (which was LINE_NBR=1, but was removed later), and the remaining LINE_NBR values were not renumbered; they stay as they are.
I also have a lookup table (SUBJ_LKUP) for subjects and their categories/types, like below ('R' for a Regular subject and 'U' for a Unique subject):
SUBJ | SUBJ_TYPE
----------------- | ------------
Maths | R
Physics | R
ForeignLANG | U
101 | U
Science | R
BehaviourialTech | U
RegionalLANG | U
My approach was to create a table holding the teachers that have 2 records, and then run another query on the base table (SUBJ_SKILLS) and the new table to filter out the distinct records. I came up with the queries below.
Query-1:
create table tchr_with_2_subj as select SS.TCHR_ID
from SUBJ_SKILLS SS, SUBJ_LKUP SL
where SS.SUBJ = SL.SUBJ
and SL.SUBJ_TYPE IN ('R', 'U') AND SS.TCHR_ID IN
(select SS.TCHR_ID from SUBJ_SKILLS SS
GROUP BY SS.TCHR_ID HAVING COUNT(*) = 2)
Query-2:
select SS.TCHR_ID from SUBJ_SKILLS SS, tchr_with_2_subj tw2s
where SS.TCHR_ID = tw2s.TCHR_ID
GROUP BY SS.TCHR_ID,SS.SUBJ_TYPE HAVING COUNT(*) > 1
Questions:
1) The 'IN' condition in Query-1 is causing problems and pulling the wrong records.
2) Is there a better way to pull the matching records using a single query (i.e. without creating a table)?
Could someone please help me with this?
For the answer to your original question, I would use window functions:
select ss.*
from (select ss.*,
min(subj) over (partition by tchr_id) as mins,
max(subj) over (partition by tchr_id) as maxs
from SUBJ_SKILLS ss
) ss
where mins <> maxs;
It is unclear how the subject type fits in, but if you need to include that, similar logic will work.
Your second table can be obtained from your first table with:
select ss.*
from
subj_skills as ss
inner join (
select tchr_id
from subj_skills
group by tchr_id
having count(*) > 1
) as mult on mult.tchr_id=ss.tchr_id;
I'd use analytic functions here, something like:
select tchr_id, line_nbr, subj, SUBJ_TYPE
from (select count(distinct subj) over (partition by tchr_id) as grp_cnt,
s.*
from subj_skills s)
where grp_cnt > 1
If you need to filter out invalid records, you can do it in the inner query. If a teacher cannot teach the same subject multiple times (so that the requirement 'multiple different subjects' simply means 'multiple subjects'), then I'd rather use count(*) instead of count(distinct subj).
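To make the difference between count(*) and count(distinct subj) concrete, here is a small sketch that combines the grouped-subquery join with COUNT(DISTINCT subj). It runs on in-memory SQLite with the sample rows from the question; the function name teachers_with_multiple_subjects is an assumption for the example, and SQLite stands in for whatever database the asker uses.

```python
import sqlite3


def teachers_with_multiple_subjects():
    """Return all SUBJ_SKILLS rows of teachers who have more than one distinct subject."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE subj_skills (tchr_id INTEGER, line_nbr INTEGER, subj TEXT, subj_type TEXT);
        INSERT INTO subj_skills VALUES
            (1, 1, 'Maths', 'R'), (1, 2, '101', 'U'),
            (2, 1, 'BehaviourialTech', 'U'),
            (3, 2, 'Maths', 'R'),
            (4, 1, 'RegionalLANG', 'U'),
            (5, 3, 'ForeignLANG', 'U'), (5, 4, 'Maths', 'R'),
            (6, 2, 'Science', 'R'),
            (7, 1, '101', 'U'), (7, 3, 'Physics', 'R');
    """)
    rows = con.execute("""
        SELECT ss.tchr_id, ss.line_nbr, ss.subj, ss.subj_type
        FROM subj_skills ss
        JOIN (SELECT tchr_id
              FROM subj_skills
              GROUP BY tchr_id
              -- DISTINCT ignores a subject that is listed twice for the same teacher.
              HAVING COUNT(DISTINCT subj) > 1) multi
          ON multi.tchr_id = ss.tchr_id
        ORDER BY ss.tchr_id, ss.line_nbr
    """).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in teachers_with_multiple_subjects():
        print(row)
```

With this sample data only teachers 1, 5 and 7 qualify, which matches the rows the asker listed as the desired result.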

how to bake in a record count in a sql query

I have a query that looks like this:
select id, extension, count(distinct(id)) from publicids group by id,extension;
This is what the results look like:
id | extension | count
-------------+-------------------------+-------
18459154909 | 12333 | 1
18459154909 | 9891114 | 1
18459154919 | 43244 | 1
18459154919 | 8776232 | 1
18766145025 | 12311 | 1
18766145025 | 1122111 | 1
18766145201 | 12422 | 1
18766145201 | 14141 | 1
But what I really want is for the results to look like this:
id | extension | count
-------------+-------------------------+-------
18459154909 | 12333 | 2
18459154909 | 9891114 | 2
18459154919 | 43244 | 2
18459154919 | 8776232 | 2
18766145025 | 12311 | 2
18766145025 | 1122111 | 2
18766145201 | 12422 | 2
18766145201 | 14141 | 2
I'm trying to get the count field to show the total number of records that have the same id.
Any suggestions would be appreciated
I think you want to count distinct extensions, not ids.
Run this query:
select id
, extension
, (select count(*) from publicids p1 where p.id = p1.id) distinct_id_count
from publicids p
group by id,extension;
This is more or less the same as Pastor's answer. Depending on what the optimizer does, it might be faster for source tables with a higher record count.
select p.id, p.extension, p2.id_count
from publicids p
inner join (
select id, count(*) as id_count
from publicids group by id
) as p2 on p.id = p2.id
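A third option, not taken from the answers above, is a window function: COUNT(*) OVER (PARTITION BY id) attaches the per-id total to every row without a join or a GROUP BY. The sketch below assumes a database with window-function support (PostgreSQL 8.4 or later; here an in-memory SQLite 3.25 or later is used so the example is self-contained). The function name rows_with_id_count is hypothetical.

```python
import sqlite3


def rows_with_id_count():
    """Return every (id, extension) row plus the number of rows sharing that id."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE publicids (id INTEGER, extension INTEGER);
        INSERT INTO publicids VALUES
            (18459154909, 12333), (18459154909, 9891114),
            (18459154919, 43244), (18459154919, 8776232),
            (18766145025, 12311), (18766145025, 1122111),
            (18766145201, 12422), (18766145201, 14141);
    """)
    rows = con.execute("""
        SELECT id, extension,
               -- Counts the rows in the current id's partition; no row is collapsed.
               COUNT(*) OVER (PARTITION BY id) AS id_count
        FROM publicids
        ORDER BY id, extension
    """).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in rows_with_id_count():
        print(row)
```

The original query always returned 1 because grouping by id and extension leaves exactly one distinct id in each group.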

PostgreSQL join two tables with LIMIT 1

I have two tables:
First table "persons"
id | name |
---------------
1 | peter |
3 | martin |
5 | lucy |
Second table "meetings"
id | date | id_persons |
--------------------------------
1 | 2014-12-08 | 1 |
2 | 2013-05-10 | 2 |
3 | 2015-08-25 | 1 |
4 | 2016-10-18 | 1 |
5 | 2012-01-01 | 3 |
6 | 2016-09-28 | 5 |
I need to get only the latest date from the "meetings" table for every person (or for selected persons), and the result must be ordered by name. I thought it could look like this, but a WHERE clause like this can't be used inside the LEFT JOIN subquery:
SELECT meetings.id, meetings.date, persons.name FROM persons
LEFT JOIN (SELECT meetings.date, meetings.id, meetings.id_persons FROM
meetings WHERE persons.id = meetings.id_persons ORDER BY
meetings.date DESC LIMIT 1) m ON m.id_persons = persons.id
WHERE persons.id < 6 ORDER BY persons.name
So I started with DISTINCT ON and it worked, but I don't think it is a good idea:
SELECT * FROM
(SELECT DISTINCT ON (persons.id) persons.id, persons.name,
m.date, m.id FROM persons
LEFT JOIN (SELECT meetings.id, meetings.date, meetings.id_persons
FROM meetings ORDER BY meetings.date DESC) m
ON m.id_persons = persons.id
WHERE persons.id < 6 ORDER BY persons.id) p
ORDER BY p.name
The result I need is:
name | date | id_meetings
-----------------------------------
lucy | 2016-09-28 | 6
martin | 2012-01-01 | 5
peter | 2016-10-18 | 4
Could you help me with a better solution?
In Postgres, the easiest way is probably distinct on:
select distinct on (p.id) p.*, m.*
from persons p left join
meetings m
on m.id_persons = p.id
order by p.id, m.date desc;
Note: distinct on is specific to Postgres.
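For databases without distinct on, one portable alternative is to compute the latest date per person in a grouped subquery and join back to meetings. The following is only a sketch under that assumption, run on in-memory SQLite with the sample data from the question; the function name last_meeting_per_person and the alias latest are invented for the example.

```python
import sqlite3


def last_meeting_per_person():
    """Return (name, date, meeting id) of the latest meeting per person, by name."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE persons (id INTEGER, name TEXT);
        CREATE TABLE meetings (id INTEGER, date TEXT, id_persons INTEGER);
        INSERT INTO persons VALUES (1, 'peter'), (3, 'martin'), (5, 'lucy');
        INSERT INTO meetings VALUES
            (1, '2014-12-08', 1), (2, '2013-05-10', 2), (3, '2015-08-25', 1),
            (4, '2016-10-18', 1), (5, '2012-01-01', 3), (6, '2016-09-28', 5);
    """)
    rows = con.execute("""
        SELECT p.name, m.date, m.id AS id_meetings
        FROM persons p
        -- Latest meeting date per person; ISO dates stored as text sort correctly.
        LEFT JOIN (SELECT id_persons, MAX(date) AS last_date
                   FROM meetings
                   GROUP BY id_persons) latest
               ON latest.id_persons = p.id
        -- Join back to pick up the id of the meeting held on that date.
        LEFT JOIN meetings m
               ON m.id_persons = p.id AND m.date = latest.last_date
        WHERE p.id < 6
        ORDER BY p.name
    """).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in last_meeting_per_person():
        print(row)
```

Unlike distinct on, this version returns two rows for a person who has two meetings on the same latest date, so a tie-breaker may be needed in that case.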

Sql two table query most duplicated foreign key

I have these two tables, sport and student:
First table sport:
|idsport | name |
_______________________
| 1 | bobsled |
| 2 | skating |
| 3 | boarding |
| 4 | iceskating |
| 5 | skiing |
Second table student:
foreign key
|idstudent | name | sport_idsport
__________________________________________
| 1 | john | 3 |
| 2 | pauly | 2 |
| 3 | max | 1 |
| 4 | jane | 2 |
| 5 | nico | 5 |
So far I wrote this, which outputs the id that occurs most often, but I can't get it to work with two tables:
SELECT sport_idsport
FROM (SELECT sport_idsport FROM student GROUP BY sport_idsport ORDER BY COUNT(*) desc)
WHERE ROWNUM<=1;
I need to output the name of the most popular sport; in this case it would be skating.
I use Oracle SQL.
with counter as (
Select sport_idsport,
count(*) as cnt,
dense_rank() over (order by count(*) desc) as rn
from student
group by sport_idsport
)
select s.*, c.cnt
from sport s
join counter c on c.sport_idsport = s.idsport and c.rn = 1;
SQLFiddle example: http://sqlfiddle.com/#!4/b76e21/1
select cnt, sport_idsport from (
select count(*) cnt, sport_idsport
from student
group by sport_idsport
order by count(*) desc
)
where rownum = 1
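The rownum query above returns only the sport id, not its name. As a hedged sketch of one way to get the name without rownum or LIMIT, the query below joins sport to student and keeps the groups whose count equals the maximum count, so ties are all returned. It is run here on in-memory SQLite with the question's sample data rather than on Oracle, and the names most_popular_sports, per_sport and n are assumptions for the example.

```python
import sqlite3


def most_popular_sports():
    """Return (sport name, student count) for the sport(s) with the most students."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE sport (idsport INTEGER, name TEXT);
        CREATE TABLE student (idstudent INTEGER, name TEXT, sport_idsport INTEGER);
        INSERT INTO sport VALUES
            (1, 'bobsled'), (2, 'skating'), (3, 'boarding'), (4, 'iceskating'), (5, 'skiing');
        INSERT INTO student VALUES
            (1, 'john', 3), (2, 'pauly', 2), (3, 'max', 1), (4, 'jane', 2), (5, 'nico', 5);
    """)
    rows = con.execute("""
        SELECT s.name, COUNT(*) AS students
        FROM sport s
        JOIN student st ON st.sport_idsport = s.idsport
        GROUP BY s.idsport, s.name
        -- Keep only the groups whose size equals the largest group size.
        HAVING COUNT(*) = (SELECT MAX(n)
                           FROM (SELECT COUNT(*) AS n
                                 FROM student
                                 GROUP BY sport_idsport) per_sport)
        ORDER BY s.name
    """).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    print(most_popular_sports())
```

This behaves like the dense_rank() answer with rn = 1: if two sports share the top count, both names come back.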