Postgres do array agg for each row - sql

I have a query that takes jobs_locum_hospital_ids from my doctor table, joins it to the hospital table on id to fetch each name, and places all of the names into an array.
so [187,123] --> ("George Eliot Hospital - Acute Services"),("Good Hope Hospital")
select array_agg(t)
from (
select h.name from (select jsonb_array_elements_text(d.jobs_locum_hospital_ids)::int as id from doctor d
where d.id = 11720) as q1
left join hospital h on h.id = q1.id
)t
But this only does it for the row where d.id = 11720.
What I'd like is to do this for each row, in a way joining to
select * from doctor
left join that thing above

It is a bit hard to figure out your data structure or why you are using json functions for this. From what I can tell, doctors have an array of hospital ids and you want the names:
select d.*,
(select array_agg(h.name)
from unnest(d.jobs_locum_hospital_ids) dh join
hospital h
on dh = h.id
) as hospital_names
from doctor d;
Just the fact that you want to do this suggests that you really want a junction table, doctorHospitals with one row per doctor and per hospital.
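A minimal sketch of the junction-table idea, assuming a hypothetical doctor_hospital table (the name and columns are not from the question). It uses Python's built-in sqlite3 as a stand-in for Postgres so it can run anywhere. SQLite has no array_agg, so the names are collected into lists in Python; in Postgres the same join could feed array_agg(h.name) with GROUP BY d.id.

```python
import sqlite3

# In-memory SQLite stands in for Postgres; the junction table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE doctor (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE hospital (id INTEGER PRIMARY KEY, name TEXT);
-- Junction table: one row per (doctor, hospital) pair.
CREATE TABLE doctor_hospital (
    doctor_id INTEGER REFERENCES doctor(id),
    hospital_id INTEGER REFERENCES hospital(id),
    PRIMARY KEY (doctor_id, hospital_id)
);
INSERT INTO doctor VALUES (11720, 'Dr A'), (11721, 'Dr B');
INSERT INTO hospital VALUES (123, 'Good Hope Hospital'),
                            (187, 'George Eliot Hospital - Acute Services');
INSERT INTO doctor_hospital VALUES (11720, 187), (11720, 123);
""")


def hospital_names_by_doctor(conn):
    """Return {doctor_id: list of hospital names}, keeping doctors with none."""
    rows = conn.execute("""
        SELECT d.id, h.name
        FROM doctor d
        LEFT JOIN doctor_hospital dh ON dh.doctor_id = d.id
        LEFT JOIN hospital h ON h.id = dh.hospital_id
        ORDER BY d.id, h.name
    """).fetchall()
    result = {}
    for doctor_id, name in rows:
        names = result.setdefault(doctor_id, [])
        if name is not None:  # NULL means the doctor has no hospitals
            names.append(name)
    return result


names = hospital_names_by_doctor(conn)
print(names)
```

The LEFT JOINs keep doctors that have no hospital rows, which matches the "for each row" requirement in the question.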

Related

Getting single row from JOIN given an additional condition

I'm making a select in which I give a year (hardcoded as 1981 below) and I expect to get one row per qualifying band. The main problem is to get the oldest living member for each band:
SELECT b.id_band,
COUNT(DISTINCT a.id_album),
COUNT(DISTINCT s.id_song),
COUNT(DISTINCT m.id_musician),
(SELECT name FROM MUSICIAN WHERE year_death IS NULL ORDER BY(birth)LIMIT 1)
FROM BAND b
LEFT JOIN ALBUM a ON(b.id_band = a.id_band)
LEFT JOIN SONG s ON(a.id_album = s.id_album)
JOIN MEMBER m ON(b.id_band= m.id_band)
JOIN MUSICIAN mu ON(m.id_musician = mu.id_musician)
/*LEFT JOIN(SELECT name FROM MUSICIAN WHERE year_death IS NULL
ORDER BY(birth) LIMIT 1) AS alive FROM mu*/ -- ??
WHERE b.year_formed = 1981
GROUP BY b.id_band;
I would like to obtain the oldest living member from mu for each band. But I just get the oldest musician overall from the relation MUSICIAN.
Well, I think you can follow the structure that you have, but you need JOINs in the subquery.
SELECT b.id_band,
COUNT(DISTINCT a.id_album),
COUNT(DISTINCT s.id_song),
COUNT(DISTINCT mem.id_musician),
(SELECT m.name
FROM MUSICIAN m JOIN
MEMBER mem
ON mem.id_musician = m.id_musician
WHERE m.year_death IS NULL AND mem.id_band = b.id_band
ORDER BY m.birth
LIMIT 1
) as oldest_member
FROM BAND b LEFT JOIN
ALBUM a
ON b.id_band = a.id_band LEFT JOIN
SONG s
ON a.id_album = s.id_album LEFT JOIN
MEMBER mem
ON mem.id_band = b.id_band
WHERE b.year_formed = 1981
GROUP BY b.id_band
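To see the correlation at work, here is a small sketch using Python's built-in sqlite3 as a stand-in database. The sample rows are invented, and birth is assumed to be a year stored as an integer (the question does not say what type it is). The counts are left out so that only the correlated subquery is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE band (id_band INTEGER PRIMARY KEY, year_formed INTEGER);
CREATE TABLE musician (id_musician INTEGER PRIMARY KEY, name TEXT,
                       birth INTEGER, year_death INTEGER);
CREATE TABLE member (id_band INTEGER, id_musician INTEGER);
INSERT INTO band VALUES (1, 1981), (2, 1981), (3, 1975);
INSERT INTO musician VALUES
    (10, 'Ann', 1950, NULL),
    (11, 'Bob', 1940, 1999),   -- oldest in band 1, but deceased
    (12, 'Cid', 1960, NULL),
    (13, 'Dee', 1955, NULL);
INSERT INTO member VALUES (1, 10), (1, 11), (1, 12), (2, 13), (3, 12);
""")

oldest = conn.execute("""
    SELECT b.id_band,
           (SELECT mu.name
            FROM musician mu
            JOIN member mem ON mem.id_musician = mu.id_musician
            WHERE mu.year_death IS NULL
              AND mem.id_band = b.id_band   -- correlation to the outer row
            ORDER BY mu.birth
            LIMIT 1) AS oldest_member
    FROM band b
    WHERE b.year_formed = 1981
    ORDER BY b.id_band
""").fetchall()
print(oldest)
```

Without the mem.id_band = b.id_band condition the subquery would return the same musician for every band, which is the symptom described in the question.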
Following query will give you oldest member of each band group. You can put filter by year_formed = 1981 if you need.
SELECT
b.id_band,
total_albums,
total_songs,
total_musicians
FROM
(
SELECT b.id_band,
COUNT(DISTINCT a.id_album) as total_albums,
COUNT(DISTINCT s.id_song) as total_songs,
COUNT(DISTINCT m.id_musician) as total_musicians,
dense_rank() over (partition by b.id_band order by mu.year_death desc) as rnk
FROM BAND b
LEFT JOIN ALBUM a ON(b.id_band = a.id_band)
LEFT JOIN SONG s ON(a.id_album = s.id_album)
JOIN MEMBER m ON(b.id_band= m.id_band)
JOIN MUSICIAN mu ON(m.id_musician = mu.id_musician)
WHERE mu.year_death is NULL
)
where rnk = 1
You can reference a table from outside the nested select (a correlated subquery), like so:
SELECT b.id_band,
COUNT(DISTINCT a.id_album),
COUNT(DISTINCT s.id_song),
COUNT(DISTINCT m.id_musician),
(SELECT mu2.name FROM MUSICIAN mu2 JOIN MEMBER m2 ON(m2.id_musician = mu2.id_musician)
WHERE mu2.year_death IS NULL AND m2.id_band = b.id_band ORDER BY mu2.birth LIMIT 1)
FROM BAND b
LEFT JOIN ALBUM a ON(b.id_band = a.id_band)
LEFT JOIN SONG s ON(a.id_album = s.id_album)
JOIN MEMBER m ON(b.id_band= m.id_band)
JOIN MUSICIAN mu ON(m.id_musician = mu.id_musician)
/*LEFT JOIN(SELECT name FROM MUSICIAN WHERE year_death IS NULL ORDER
BY(birth)LIMIT 1) AS alive FROM mu*/
WHERE b.year_formed= 1981
GROUP BY b.id_band
For queries where you want to find the "max person by age" you can use ROW_NUMBER() partitioned by the band:
SELECT b.id_band,
COUNT(DISTINCT a.id_album),
COUNT(DISTINCT s.id_song),
COUNT(DISTINCT m.id_musician),
oldest_living_members.name AS oldest_living_member
FROM
band b
LEFT JOIN album a ON(b.id_band = a.id_band)
LEFT JOIN song s ON(a.id_album = s.id_album)
LEFT JOIN member m ON(b.id_band = m.id_band) -- needed for the musician count
LEFT JOIN
(
SELECT
m.id_band,
mu.name,
ROW_NUMBER() OVER(PARTITION BY m.id_band ORDER BY mu.birth ASC) rown
FROM
MEMBER m
JOIN MUSICIAN mu ON(m.id_musician = mu.id_musician)
WHERE mu.year_death IS NULL
) oldest_living_members
ON
b.id_band = oldest_living_members.id_band AND
oldest_living_members.rown = 1
WHERE b.year_formed= 1981
GROUP BY b.id_band, oldest_living_members.name
If you run just the subquery you'll see how it works: musicians are joined to member to get the band id, and the band id forms the partition. ROW_NUMBER() numbers the rows from 1 in order of birth, so the oldest person (earliest birth) gets a 1. Every time the band id changes, the numbering restarts from 1 with the oldest person in that band. Then, when we join it, we just pick the 1s.
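The numbering can be inspected on its own. The sketch below runs only the inner subquery with Python's built-in sqlite3 (window functions need SQLite 3.25 or later). The rows are invented, and birth is assumed to be a year stored as an integer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE musician (id_musician INTEGER PRIMARY KEY, name TEXT,
                       birth INTEGER, year_death INTEGER);
CREATE TABLE member (id_band INTEGER, id_musician INTEGER);
INSERT INTO musician VALUES
    (10, 'Ann', 1950, NULL), (11, 'Bob', 1940, 1999),
    (12, 'Cid', 1960, NULL), (13, 'Dee', 1955, NULL), (14, 'Eve', 1945, NULL);
INSERT INTO member VALUES (1, 10), (1, 11), (1, 12), (2, 13), (2, 14);
""")

# Number the living members of each band, oldest first.
numbered = conn.execute("""
    SELECT m.id_band, mu.name,
           ROW_NUMBER() OVER (PARTITION BY m.id_band ORDER BY mu.birth) AS rown
    FROM member m
    JOIN musician mu ON mu.id_musician = m.id_musician
    WHERE mu.year_death IS NULL
    ORDER BY m.id_band, rown
""").fetchall()
print(numbered)

# Keeping only rown = 1 leaves one row per band: its oldest living member.
oldest_per_band = [(band, name) for band, name, rown in numbered if rown == 1]
print(oldest_per_band)
```

Bob is the oldest member of band 1 but is filtered out before numbering, so Ann receives row number 1 there.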
I think this should be considerably faster (while also solving your problem):
SELECT b.id_band, a.*, m.*
FROM band b
LEFT JOIN LATERAL (
SELECT count(*) AS ct_albums, sum(ct_songs) AS ct_songs
FROM (
SELECT id_album, count(s.id_song) AS ct_songs -- count(*) would report 1 song for an album with none
FROM album a
LEFT JOIN song s USING (id_album)
WHERE a.id_band = b.id_band
GROUP BY 1
) ab
) a ON true
LEFT JOIN LATERAL (
SELECT count(*) OVER () AS ct_musicians
, name AS senior_member -- any other columns you need?
FROM member m
JOIN musician mu USING (id_musician)
WHERE m.id_band = b.id_band
ORDER BY year_death IS NOT NULL -- sorts the living first
, birth
, name -- as tiebreaker (my optional addition)
LIMIT 1
) m ON true
WHERE b.year_formed = 1981;
Getting the senior band member is solved in the LATERAL subquery m - without multiplying the cost for the base query. It works because the window function count(*) OVER () is computed before ORDER BY and LIMIT are applied. Since bands naturally only have few members, this should be the fastest possible way. See:
Best way to get result count before LIMIT was applied
What is the difference between LATERAL and a subquery in PostgreSQL?
Prevent duplicate values in LEFT JOIN
The other optimization for counting albums and songs builds on the assumption that the same id_song is never included in multiple albums of the same band. Else, those are counted multiple times. (Easily fixed, and uncorrelated to the task of getting the senior band member.)
The point is to eliminate the need for DISTINCT at the top level after multiplying rows at the N-side repeatedly (I like to call that "proxy cross join"). That would produce a possibly huge number of rows in the derived table without need.
Plus, it's much more convenient to retrieve additional columns (like more columns for the senior band member) than with some other query styles.

Oracle SQL - How do i select tables with 0 or more values from other tables

So I need to write a query that shows the ID of each city, its name, and how many clients that city has, including cities that have 0 clients.
I first tried to get just the cities that have clients, but I have no idea how to include cities that have no clients.
I have a table CITIES that contains ID_city, NAME, and REGION
and a table CLIENTS that contains ID_client, NAME and ID_city
query:
select l.name, l.ID_city, count(c.name) from clients c
JOIN cities l on l.ID_city = c.ID_city
GROUP BY l.name, l.ID_city;
Use a left join, with the cities table on the left:
select l.nomecidade, l.codcidade, count(c.nomecliente) from prova.cidades l
left JOIN prova.clientes c on l.codcidade = c.codcidade
GROUP BY l.nomecidade, l.codcidade
Use a LEFT JOIN, but pay attention to which table comes first: the table whose rows must all be kept (cities) goes on the left.
The code below should solve the problem:
select C1.name, C1.ID_city, count(C2.name) from cities C1
LEFT JOIN clients C2 on C1.ID_city = C2.ID_city
GROUP BY C1.name, C1.ID_city;
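A quick way to check the zero-client case is to run the query on a few rows. The sketch below uses Python's built-in sqlite3 in place of Oracle, with invented sample data, and counts ID_client rather than name, which is an assumption that avoids undercounting clients whose name is NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cities (ID_city INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE clients (ID_client INTEGER PRIMARY KEY, name TEXT, ID_city INTEGER);
INSERT INTO cities VALUES (1, 'City A', 'R1'), (2, 'City B', 'R1'), (3, 'City C', 'R2');
INSERT INTO clients VALUES (1, 'Ana', 1), (2, 'Rui', 1), (3, 'Eva', 2);
""")

counts = conn.execute("""
    SELECT c1.name, c1.ID_city, COUNT(c2.ID_client) AS n_clients
    FROM cities c1                      -- every city is kept
    LEFT JOIN clients c2 ON c1.ID_city = c2.ID_city
    GROUP BY c1.name, c1.ID_city
    ORDER BY c1.ID_city
""").fetchall()
print(counts)
```

COUNT over a column from the right-hand table skips the NULLs produced by the LEFT JOIN, which is why City C reports 0 rather than 1.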

SQL: IN condition from array

The best way I can describe my problem is by giving you an example of my problem:
I have a table called "person" with the columns id | name | hobbys
hobbys would be a manyToMany association
So I have this statement: SELECT * FROM person p LEFT JOIN hobbys h ON p.hobby_id = h.id WHERE p.hobby_id IN($array);
The problem is that this selects all persons who have any one of the hobbys in the array, but I want only persons who have all of the hobbys in the array.
Is there a function for this in SQL?
Use GROUP BY and HAVING.
SELECT p.id, p.name
FROM person p
JOIN hobbys h ON p.hobby_id = h.id
WHERE p.hobby_id IN($array)
GROUP BY p.id, p.name
HAVING count(distinct h.id) = <size_of_array>
There are also other solutions using INTERSECT, IN, or EXISTS; however, this one keeps the list of values in a single IN clause.
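As a sketch of the GROUP BY / HAVING approach, the snippet below builds the IN list from a Python list with bound parameters and compares the distinct match count to the list size. It uses Python's built-in sqlite3, and it assumes person holds one row per (person, hobby) pair, which is what the query in the question implies. The helper name persons_with_all is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed layout: one row per (person, hobby) pair.
CREATE TABLE hobbys (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE person (id INTEGER, name TEXT, hobby_id INTEGER);
INSERT INTO hobbys VALUES (1, 'chess'), (2, 'golf'), (3, 'yoga');
INSERT INTO person VALUES (1, 'Ann', 1), (1, 'Ann', 2), (1, 'Ann', 3),
                          (2, 'Bob', 1), (3, 'Cid', 1), (3, 'Cid', 2);
""")


def persons_with_all(conn, hobby_ids):
    """Return (id, name) of persons who have every hobby in hobby_ids."""
    wanted = sorted(set(hobby_ids))  # duplicates would break the count check
    if not wanted:
        return []  # an empty IN () list is not valid SQL, so return early
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT p.id, p.name
        FROM person p
        JOIN hobbys h ON p.hobby_id = h.id
        WHERE p.hobby_id IN ({placeholders})
        GROUP BY p.id, p.name
        HAVING COUNT(DISTINCT h.id) = ?
        ORDER BY p.id
    """
    return conn.execute(sql, (*wanted, len(wanted))).fetchall()


print(persons_with_all(conn, [1, 2]))
print(persons_with_all(conn, [1, 2, 3]))
```

Only the placeholders are interpolated into the SQL text; the values themselves are passed as parameters.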
You have to check each hobby in the array against the person's hobbies and combine the checks with AND.
First, get all the hobby records each person has:
SELECT * FROM person p LEFT JOIN hobbys h ON p.hobby_id = h.id
Then build the AND condition and query with it:
SELECT * FROM person p LEFT JOIN hobbys h ON p.hobby_id = h.id
WHERE first_result IN($array) and second_result in ($array) and....;

Get some data which corresponds to the maximum date

I have these 3 tables:
Table ORG:
Fields:historyid, personid
Table PERSON:
Fields: id
Table HISTORY:
Fields: id,date,personid
Both HISTORY and ORG are linked to PERSON with a 1:N relationship. Also, ORG is linked to HISTORY with a 1:N relationship. I want to get just one row from table ORG for each person: the one that corresponds to the HISTORY row with the highest date. The following SQL gives the highest date for a certain person. However, I do not know how to combine this with the above requirement.
SELECT ash1.id
FROM
(SELECT * FROM history a WHERE a.personid=person.id) ash1
LEFT JOIN
(SELECT * FROM history b WHERE b.personid=person.id) ash2
ON ash1.personid=ash2.personid
AND ash1.date < ash2.date
WHERE ash2.date IS NULL
I think you can do it by using MAX() and GROUP BY:
SELECT
o.historyid AS o_hist,
o.personid AS o_per,
h.id AS h_id,
MAX(h.date) AS h_date,
h.personid AS h_person
FROM
org o
LEFT JOIN
person p ON p.id = o.personid
LEFT JOIN
history h ON h.id = o.historyid AND h.personid = p.id
GROUP BY o_per
Try the query below:
;WITH CTE_HighestHistory
AS (SELECT PersonID,MAX([Date]) HDate
FROM History
GROUP BY PersonID)
SELECT o.*,h.*
FROM org o
LEFT JOIN History h ON o.Historyid=h.Id and o.PersonID=h.PersonId
INNER JOIN CTE_HighestHistory ch ON h.Personid=ch.Personid and h.[Date]=ch.HDate
WHERE EXISTS(SELECT 1 FROM Person p WHERE p.Id=o.PersonID )
There are multiple ways to approach this, depending on the database. However, your data structure is awkward. Why does org have historyid? That doesn't really make sense to me.
In any case, based on your description, this should work:
select o.*, h.*
from org o join
history h
on h.personid = o.personid
where h.date = (select max(h2.date)
from history h2
where h2.personid = h.personid
);
You might want to start the from clause as:
from (select distinct personid from org) o
That way you get only one row per person, even if they are repeated in the table.
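A small runnable sketch of the correlated MAX() filter, using Python's built-in sqlite3 with invented rows. Dates are stored as ISO-8601 text here, which is an assumption made so that string comparison matches date order in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE history (id INTEGER PRIMARY KEY, date TEXT, personid INTEGER);
CREATE TABLE org (historyid INTEGER, personid INTEGER);
INSERT INTO history VALUES (1, '2015-01-10', 100), (2, '2016-03-05', 100),
                           (3, '2014-07-01', 200);
INSERT INTO org VALUES (1, 100), (2, 100), (3, 200);
""")

latest = conn.execute("""
    SELECT o.personid, h.id, h.date
    FROM (SELECT DISTINCT personid FROM org) o   -- one row per person
    JOIN history h ON h.personid = o.personid
    WHERE h.date = (SELECT MAX(h2.date)          -- latest date for that person
                    FROM history h2
                    WHERE h2.personid = h.personid)
    ORDER BY o.personid
""").fetchall()
print(latest)
```

If two history rows for the same person share the highest date, both are returned, so a tiebreaker would be needed when exactly one row per person is required.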

SQL Query - max(count()) - possible without CTE?

I'm studying for my Database Systems exam at the moment, having some trouble with an exercise creating a query.
I have four tables:
A referent-table with personal data of referents,
A course-table with course data (with the responsible referent as foreign key),
A workshop-table with workshop data (with the corresponding course as foreign key),
A booking-table which manages bookings (with the corresponding workshop which has been booked as a foreign key)
My exercise is to find out how much money a referent earns (there's a price-column in workshop)
It's not very difficult to list how much money he earns per workshop; I created this query to show me:
SELECT r.referentid,
r.name,
(SELECT COUNT(*) FROM g22_courses c WHERE c.responsiblerefid = r.referentid)*c.price AS income
FROM referent r
LEFT JOIN g22_courses c ON (c.responsiblerefid = r.referentid)
LEFT JOIN g22_workshop w ON (w.courseid = c.id)
LEFT JOIN g22_booking b ON (b.workshopid = w.id)
GROUP BY r.referentid, c.responsiblerefid
This returns this:
2;"Anna";0.60
4;"Ahmed";3.5
1;"Hans";
2;"Anna";13.20
3;"Wolfgang";
As you can see, it works fine.
I now have two rows for Anna (because she is responsible for two courses) and want one row with the sum of both.
Unfortunately, the only way to do this (as I found out) is with a Common Table Expression (CTE) - with a CTE it works:
WITH incomepercourse AS (
SELECT r.referentid,
r.name,
(SELECT COUNT(*) FROM g22_courses c WHERE c.responsiblerefid = r.referentid)*c.price AS income
FROM referent r
LEFT JOIN g22_courses c ON (c.responsiblerefid = r.referentid)
LEFT JOIN g22_workshop w ON (w.courseid = c.id)
LEFT JOIN g22_booking b ON (b.workshopid = w.id)
GROUP BY r.referentid, c.responsiblerefid
)
SELECT referentid, name, SUM(income) FROM incomepercourse GROUP BY referentid, name
this returns:
3;"Wolfgang";
4;"Ahmed";3.50
2;"Anna";13.80
1;"Hans";
Is there any way to avoid a CTE?
My professor didn't talk about CTE, and it also isn't in his lecture notes - so there has to be some other, simpler way.
Is there anyone out there who knows a better way to achieve this?
Thanks in Advance!
You can wrap the CTE's query in a derived table (a subquery in the FROM clause) and aggregate over it, as shown below.
SELECT tbl1.referentid, tbl1.name, SUM(tbl1.income) FROM (
SELECT r.referentid,
r.name,
(SELECT COUNT(*) FROM g22_courses c WHERE c.responsiblerefid = r.referentid)*c.price AS income
FROM referent r
LEFT JOIN g22_courses c ON (c.responsiblerefid = r.referentid)
LEFT JOIN g22_workshop w ON (w.courseid = c.id)
LEFT JOIN g22_booking b ON (b.workshopid = w.id)
GROUP BY r.referentid, c.responsiblerefid)tbl1
GROUP BY tbl1.referentid, tbl1.name
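The equivalence of the two forms can be checked on a small data set. This sketch uses Python's built-in sqlite3 with invented rows and a deliberately simplified inner query: the income per course is just the course price, and the workshop and booking tables are left out. That simplification is an assumption for illustration and not the exact query from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE referent (referentid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE g22_courses (id INTEGER PRIMARY KEY, responsiblerefid INTEGER, price REAL);
INSERT INTO referent VALUES (1, 'Hans'), (2, 'Anna'), (4, 'Ahmed');
INSERT INTO g22_courses VALUES (1, 2, 10.0), (2, 2, 2.5), (3, 4, 3.5);
""")

# Simplified per-course query, shared by both forms below.
per_course = """
    SELECT r.referentid, r.name, c.price AS income
    FROM referent r
    LEFT JOIN g22_courses c ON c.responsiblerefid = r.referentid
"""

# Form 1: common table expression.
with_cte = conn.execute(f"""
    WITH incomepercourse AS ({per_course})
    SELECT referentid, name, SUM(income)
    FROM incomepercourse
    GROUP BY referentid, name
    ORDER BY referentid
""").fetchall()

# Form 2: the same query as a derived table in FROM.
with_derived = conn.execute(f"""
    SELECT t.referentid, t.name, SUM(t.income)
    FROM ({per_course}) t
    GROUP BY t.referentid, t.name
    ORDER BY t.referentid
""").fetchall()

print(with_derived)
```

Hans has no courses, so SUM over his single all-NULL row yields NULL (None in Python), matching the empty income shown for him in the question's output.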
You do not need to join the workshop and booking tables if you do not use them.
If the price is in the course table, as shown in your first query, I think what you are asking for is answered by this query:
SELECT referent.referentid, referent.name, sum(price)
FROM referent LEFT JOIN g22_courses ON g22_courses.responsiblerefid=referent.referentid
GROUP BY referent.referentid, referent.name
If the price is in the workshop table, just add a join to that table.
Okay, I solved it.
It was a problem with my understanding of the problem after all... ._.
I now managed to write a query like a_horse_with_no_name suggested in his comment:
SELECT r.referentid,
r.name,
SUM(c.price) AS income
FROM referent r
LEFT JOIN g22_courses c ON (c.responsiblerefid = r.referentid)
LEFT JOIN g22_workshop w ON (w.courseid = c.id)
LEFT JOIN g22_booking b ON (b.workshopid = w.id)
GROUP BY r.referentid
ORDER BY r.referentid
This solves my problem perfectly, returning the right values.
Thank you!