SQL: find pairs of people who attended the same event

create table people(
id_pers int,
nom_pers char(25),
d_nais date,
d_mort date,
primary key(id_pers)
);
create table event(
id_evn int,
primary key(id_evn)
);
create table assisted_to(
id_pers int,
id_evn int,
foreign key (id_pers) references people(id_pers),
foreign key (id_evn) references event(id_evn)
);
insert into people(id_pers, nom_pers, d_nais, d_mort) values (1, 'A', current_date - integer '20', current_date);
insert into people(id_pers, nom_pers, d_nais, d_mort) values (2, 'B', current_date - integer '50', current_date - integer '20');
insert into people(id_pers, nom_pers, d_nais, d_mort) values (3, 'C', current_date - integer '25', current_date - integer '20');
insert into event(id_evn) values (1);
insert into event(id_evn) values (2);
insert into event(id_evn) values (3);
insert into event(id_evn) values (4);
insert into event(id_evn) values (5);
insert into assisted_to(id_pers, id_evn) values (1, 5);
insert into assisted_to(id_pers, id_evn) values (2, 5);
insert into assisted_to(id_pers, id_evn) values (2, 4);
insert into assisted_to(id_pers, id_evn) values (3, 5);
insert into assisted_to(id_pers, id_evn) values (3, 4);
insert into assisted_to(id_pers, id_evn) values (3, 3);
I need to find couples (pairs of people) who attended the same event on any particular day.
I tried:
select p1.id_pers, p2.id_pers from people p1, people p2, assisted_to ae
where ae.id_pers = p1.id_pers
and ae.id_pers = p2.id_pers
But it returns 0 rows.
What am I doing wrong?

Try this:
select distinct ae1.id_evn,
p1.nom_pers personA, p2.nom_pers personB
from assisted_to ae1
join assisted_to ae2
on ae2.id_evn = ae1.id_evn
and ae2.id_pers > ae1.id_pers
join people p1
on p1.id_pers = ae1.id_pers
join people p2
on p2.id_pers = ae2.id_pers
This generates all pairs of people (couples) who attended the same event. Your schema stores no event date, so there is no way to restrict the results to cases where they attended on the same day. The assumption is that an event can only have occurred on one day, so attending the same event implies attending on the same day.

You select two persons, so you need to select two assisted_to rows as well, because each person has their own row in the assisted_to table. The idea is to build a link between p1 and p2 through a pair of assisted_to rows sharing the same id_evn:
select p1.id_pers, p2.id_pers
from people p1, people p2
where p1.id_pers < p2.id_pers -- skip (a,a) and keep only one of (a,b)/(b,a)
and exists (
select *
from assisted_to e1
join assisted_to e2 on e1.id_evn=e2.id_evn
where e1.id_pers=p1.id_pers and e2.id_pers=p2.id_pers
)

When re-phrased into ANSI JOIN syntax so I can read it, your query reads:
select p1.id_pers, p2.id_pers
from assisted_to ae
inner join people p1 ON (ae.id_pers = p1.id_pers)
inner join people p2 ON (ae.id_pers = p2.id_pers)
Since id_pers is the primary key of people, ae.id_pers can equal both p1.id_pers and p2.id_pers only when p1 and p2 are the same row, so this query can never pair two different people. You'll need to find another approach.
You don't need to join on people at all for this, though you'll probably want to in order to populate their details. To get the desired results, self-join the people-to-events join table (assisted_to), not the people table, and filter the self-join to include only rows where the event ID is the same but the people are different. Using > rather than <> means you don't need another pass to filter out the (a,b) vs (b,a) pairings.
Something like:
select ae1.id_evn event_id, ae1.id_pers id_pers1, ae2.id_pers id_pers2
from assisted_to ae1
inner join assisted_to ae2
on (ae2.id_evn = ae1.id_evn and ae1.id_pers > ae2.id_pers)
You can now, if desired, add additional joins on the event and people tables to populate details. You'll need to join people twice with different aliases to populate the two different "sides". See Charles Bretana's example.
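The self-join idea can be checked against the sample data from the question. The following is only a sketch: it uses Python's built-in sqlite3 module rather than the asker's database, keeps just the columns needed for the join, and the function name attended_pairs is made up for this illustration.

```python
import sqlite3


def attended_pairs():
    """Return (event, person, person) rows for people who share an event."""
    conn = sqlite3.connect(":memory:")
    # Trimmed-down copy of the question's schema and sample rows (assumption:
    # the date columns and foreign keys are not needed for this demonstration).
    conn.executescript("""
        create table people(id_pers int primary key, nom_pers char(25));
        create table assisted_to(id_pers int, id_evn int);
        insert into people values (1, 'A'), (2, 'B'), (3, 'C');
        insert into assisted_to values
            (1, 5), (2, 5), (2, 4), (3, 5), (3, 4), (3, 3);
    """)
    rows = conn.execute("""
        select ae1.id_evn, p2.nom_pers, p1.nom_pers
        from assisted_to ae1
        -- self-join: same event, different person; '>' keeps one ordering only
        join assisted_to ae2
          on ae2.id_evn = ae1.id_evn and ae1.id_pers > ae2.id_pers
        -- people is joined twice, once per side of the pair
        join people p1 on p1.id_pers = ae1.id_pers
        join people p2 on p2.id_pers = ae2.id_pers
        order by ae1.id_evn, p2.nom_pers, p1.nom_pers
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in attended_pairs():
        print(row)
```

Event 3 has a single attendee, so it contributes no pair; events 4 and 5 contribute one and three pairs respectively.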

Related

How can I get rows from a table that matches all the foreign keys from another table

Say I have two tables role and roleApp defined like this:
create table #tempRole(roleId int);
insert into #tempRole (roleId) values (1)
insert into #tempRole (roleId) values (2)
create table #tempRoleApp(roleId int, appId int);
insert into #tempRoleApp (roleId, appId) values (1, 26)
insert into #tempRoleApp (roleId, appId) values (2, 26)
insert into #tempRoleApp (roleId, appId) values (1, 27)
From the #tempRoleApp table, I want only the appId values that match all the values in the #tempRole table (1 and 2). In this case the output needs to be 26 (as it matches both 1 and 2) but not 27, because the table has no (2, 27) row.
The #tempRole table is actually the output of another query, so it can hold an arbitrary number of values.
I tried a few things like:
select *
from #tempRoleApp
where roleId = ALL(select roleId FROM #tempRole)
This returns nothing. I tried a few more things but am not getting what I want.
I believe this gives what you were looking for.
select tra.appId
from #tempRoleApp as tra
join #tempRole as tr on tra.roleId = tr.roleId
group by tra.appId
having count(distinct tra.roleId) = (select count(distinct roleId) from #tempRole)
It uses count(distinct ...) to get the number of unique roleIds in the #tempRole table and compares that with the number of unique roleIds per appId, after the join has confirmed that the roleIds match between the tables.
As you clarified in the comment, once you add another roleId to #tempRole, no appId has all of the roleIds any more, so no rows are returned.
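The count-and-compare approach can be tried outside SQL Server as well. The sketch below is an illustration only: it runs the same query in SQLite through Python's sqlite3 module, drops the # prefix (SQL Server temp-table syntax) from the table names, and the helper apps_with_all_roles is a hypothetical name.

```python
import sqlite3


def apps_with_all_roles(roles, role_apps):
    """Return the appIds that are linked to every roleId in `roles`."""
    conn = sqlite3.connect(":memory:")
    # Plain tables stand in for #tempRole / #tempRoleApp (assumption).
    conn.execute("create table tempRole(roleId int)")
    conn.execute("create table tempRoleApp(roleId int, appId int)")
    conn.executemany("insert into tempRole values (?)", [(r,) for r in roles])
    conn.executemany("insert into tempRoleApp values (?, ?)", role_apps)
    rows = conn.execute("""
        select tra.appId
        from tempRoleApp as tra
        join tempRole as tr on tra.roleId = tr.roleId
        group by tra.appId
        -- an app qualifies when it covers as many distinct roles as exist
        having count(distinct tra.roleId) =
               (select count(distinct roleId) from tempRole)
        order by tra.appId
    """).fetchall()
    conn.close()
    return [app_id for (app_id,) in rows]


if __name__ == "__main__":
    sample = [(1, 26), (2, 26), (1, 27)]
    print(apps_with_all_roles([1, 2], sample))
    print(apps_with_all_roles([1, 2, 3], sample))
```

The second call mirrors the comment about adding another roleId: with role 3 in the list, no app covers every role.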

Can you sort the result in GROUP BY?

I have two tables. One is objects, with the attributes id and is_green. The other is object_closure, with the attributes ancestor_id, descendant_id, and created_at, i.e.
Objects: id, is_green
Object_closure: ancestor_id, descendant_id, created_at
The objects table has more attributes, but they are not relevant to this question.
I have a query like this:
-- create a table
CREATE TABLE objects (
id INTEGER PRIMARY KEY,
is_green boolean
);
CREATE TABLE object_Closure (
ancestor_id INTEGER ,
descendant_id INTEGER,
created_at date
);
-- insert some values
INSERT INTO objects VALUES (1, 1 );
INSERT INTO objects VALUES (2, 1 );
INSERT INTO objects VALUES (3, 1 );
INSERT INTO objects VALUES (4, 0 );
INSERT INTO objects VALUES (5, 1 );
INSERT INTO objects VALUES (6, 1 );
INSERT INTO object_Closure VALUES (1, 2, '2020-12-12');
INSERT INTO object_Closure VALUES (1, 3, '2020-12-13');
INSERT INTO object_Closure VALUES (2, 3, '2020-12-14');
INSERT INTO object_Closure VALUES (4, 5, '2020-12-15');
INSERT INTO object_Closure VALUES (4, 6, '2020-12-16');
INSERT INTO object_Closure VALUES (5, 6, '2020-12-17');
-- fetch some values
SELECT
O.id,
P.id,
group_concat(DISTINCT P.id ) as p_ids
FROM objects O
LEFT JOIN object_Closure OC on O.id=OC.descendant_id
LEFT JOIN objects P on OC.ancestor_id=P.id AND P.is_green=1
GROUP BY O.id
The result is:
[Image: query result]
I would like P.id for O.id=6 to be 5 instead of null. After all, 5 is still a parent ID (P.id). More importantly, when there is more than one candidate, I want P.id to show the one that was created first (see OC.created_at).
I understand why this happens: the first row the system picks is null, and that null was produced by the join with the is_green condition. However, I still need P.id to include only objects that are green.
I cannot use an inner join, because I need the other attributes of the table, and rows where both P.id and p_ids are null must still appear in the result. I cannot restructure the database; it already exists and cannot be changed. I also cannot just use a Min() or Max() aggregation, because the ID that is picked must be the one created first.
So is there a way to skip the null in the join?
Or is there a way to filter the selection in the select clause?
Or is there a way to do an order by before the grouping?
P.S. My original code concatenates P.id in order of created_at. For some reason, I cannot replicate that in the online SQL simulator.

Referencing a calculated field in SQL statement

I have the following schema
CREATE TABLE QUOTE (id int, amount int);
CREATE TABLE QUOTE_LINE (id int, quote_id int, line_amount int);
INSERT INTO QUOTE VALUES(1, 100);
INSERT INTO QUOTE VALUES(2, 200);
INSERT INTO QUOTE VALUES(3, 100);
INSERT INTO QUOTE VALUES(4, 300);
INSERT INTO QUOTE_LINE VALUES(1, 1, 5);
INSERT INTO QUOTE_LINE VALUES(2, 1, 6);
INSERT INTO QUOTE_LINE VALUES(3, 1, 4);
INSERT INTO QUOTE_LINE VALUES(4, 1, 2);
INSERT INTO QUOTE_LINE VALUES(1, 2, 5);
INSERT INTO QUOTE_LINE VALUES(2, 2, 5);
INSERT INTO QUOTE_LINE VALUES(3, 2, 5);
INSERT INTO QUOTE_LINE VALUES(4, 2, 5);
And I need to run the following query:
SELECT QUOTE.id,
line_amount*12 AS amount,
amount*2 as amount_doubled
from QUOTE_LINE
LEFT JOIN QUOTE ON QUOTE_LINE.quote_id=QUOTE.id;
The 3rd line in the query, amount*2 as amount_doubled, needs to reference the amount calculated in the prior line, i.e. line_amount*12 AS amount.
However, if I run this query, it picks the amount from the QUOTE table instead of the amount that was calculated. How can I make my query use the calculated amount without changing the name of the calculated field?
Here is the sqlfiddle for this:
http://sqlfiddle.com/#!17/914b2/1
Note: I understand that I can create a sub-query, CTE or a lateral join, but the tables I am working with are very wide, and the queries have many joins. As such, I need to keep the LEFT and INNER JOINs, and I don't always know whether a calculated field will duplicate a column name in a JOINed table. Table structures change.
Move the definition to the FROM clause using a LATERAL JOIN:
select q.id, v.amount, v.amount * 2 as amount_doubled
from QUOTE_LINE ql left join
QUOTE q
on ql.quote_id = q.id CROSS JOIN LATERAL
(values (ql.line_amount*12)) v(amount);
You can also use a subquery or CTE, but I like the lateral join method.
Note: I would expect QUOTE to be the first table in the LEFT JOIN.
Qualify all column names with the table name and use a subquery:
SELECT q.id,
q.amount,
q.amount * 2 AS amount_doubled
FROM (SELECT quote.id,
quote_line.line_amount * 12 AS amount
FROM quote_line
LEFT JOIN quote
ON quote_line.quote_id = quote.id
) AS q;
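To see that the subquery version picks up the calculated amount rather than QUOTE.amount, it can be run against the sample rows from the question. This is a sketch under assumptions: it uses SQLite via Python's sqlite3 module instead of the PostgreSQL fiddle, adds explicit column aliases and an ORDER BY for a stable result, and the function name doubled_amounts is invented here.

```python
import sqlite3


def doubled_amounts():
    """Run the subquery form on the question's sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table quote (id int, amount int);
        create table quote_line (id int, quote_id int, line_amount int);
        insert into quote values (1, 100), (2, 200), (3, 100), (4, 300);
        insert into quote_line values
            (1, 1, 5), (2, 1, 6), (3, 1, 4), (4, 1, 2),
            (1, 2, 5), (2, 2, 5), (3, 2, 5), (4, 2, 5);
    """)
    rows = conn.execute("""
        select q.id, q.amount, q.amount * 2 as amount_doubled
        from (
            -- inside the subquery, "amount" is defined once, unambiguously
            select quote.id as id,
                   quote_line.line_amount * 12 as amount
            from quote_line
            left join quote on quote_line.quote_id = quote.id
        ) as q
        order by q.id, q.amount
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in doubled_amounts():
        print(row)
```

Because the outer query only sees the subquery's columns, q.amount can only mean line_amount * 12; QUOTE.amount (100, 200, ...) never appears in the output.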
A little simple algebra resolves the issue quite easily. The calculated amount is 12 times line_amount, and amount_doubled is 2 times that. So:
select q.id
, ql.line_amount*12 as amount
, ql.line_amount*12*2 as amount_doubled
from quote_line ql
left join quote q
on ql.quote_id = q.id;
However, a child left join parent seems strange, as it basically says "give me the quote line amounts even where there is no quote". One would hope a FK from line to quote would prevent that from happening.
If so, then an inner join would suffice. Further, if id is the only column needed from quote, the join can be removed by taking quote_id from quote_line. So perhaps this reduces to:
select ql.quote_id as id
, ql.line_amount*12 as amount
, ql.line_amount*24 as amount_doubled
from quote_line ql;

Query database for distinct values and aggregate data based on condition

I am trying to extract distinct items from a Postgres database pairing a column from a table with a column from another table based on a condition. Simplified version looks like this:
CREATE TABLE users
(
id SERIAL PRIMARY KEY,
name VARCHAR(255)
);
CREATE TABLE photos
(
id INT PRIMARY KEY,
user_id INTEGER REFERENCES users(id),
flag VARCHAR(255)
);
INSERT INTO users VALUES (1, 'Bob');
INSERT INTO users VALUES (2, 'Alice');
INSERT INTO users VALUES (3, 'John');
INSERT INTO photos VALUES (1001, 1, 'a');
INSERT INTO photos VALUES (1002, 1, 'b');
INSERT INTO photos VALUES (1003, 1, 'c');
INSERT INTO photos VALUES (1004, 2, 'a');
INSERT INTO photos VALUES (1005, 2, 'x');
What I need is to extract each user name, only once, and a flag value for each of them. The flag value should prioritize a specific one, let's say b. So, the result should look like:
Bob b
Alice a
Where Bob owns a photo having the b flag, while Alice does not and John has no photos. For Alice the output for the flag value is not important (a or x would be just as good) as long as she owns no photo flagged b.
The closest thing I found were some self-join queries where the flag value would have been aggregated using min() or max(), but I am looking for a particular value, which is not first, nor last. Moreover, I found out that you can define your own aggregate functions, but I wonder if there is an easier way of conditioning the query in order to obtain the required data.
Thank you!
Here is a method with aggregation:
select u.name,
coalesce(max(flag) filter (where flag = 'b'),
min(flag)
) as flag
from users u left join
photos p
on u.id = p.user_id
group by u.id, u.name;
That said, a more typical method would be a prioritization query. Perhaps:
select distinct on (u.id) u.name, p.flag
from users u left join
photos p
on u.id = p.user_id
order by u.id, (p.flag = 'b') desc nulls last;
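The aggregation method can be tried on the sample data as follows. This is a sketch with several assumptions: it runs in SQLite through Python's sqlite3 module rather than Postgres, it replaces the FILTER clause with an equivalent CASE expression so it also works on older SQLite builds, it gives the last photo the id 1005 (the id 1004 appears twice in the question and would violate the primary key), and the function name preferred_flags is hypothetical.

```python
import sqlite3


def preferred_flags(preferred="b"):
    """Return one (name, flag) row per user, preferring `preferred`."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table users (id integer primary key, name varchar(255));
        create table photos (
            id int primary key,
            user_id integer references users(id),
            flag varchar(255)
        );
        insert into users values (1, 'Bob'), (2, 'Alice'), (3, 'John');
        insert into photos values
            (1001, 1, 'a'), (1002, 1, 'b'), (1003, 1, 'c'),
            (1004, 2, 'a'), (1005, 2, 'x');
    """)
    rows = conn.execute("""
        select u.name,
               -- CASE yields the preferred flag or NULL; max() ignores NULLs,
               -- so coalesce falls back to min(flag) when it is absent
               coalesce(max(case when p.flag = ? then p.flag end),
                        min(p.flag)) as flag
        from users u
        left join photos p on u.id = p.user_id
        group by u.id, u.name
        order by u.id
    """, (preferred,)).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(preferred_flags())
```

Because of the left join, John is returned with a NULL flag; switching to an inner join would drop users without photos, which matches the two-row output shown in the question.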

SQL query for querying counts from a table

The task is to write a SQL query that finds the name and ID of every student who attends all lectures with more than 4 ECTS.
The tables are:
CREATE TABLE CLASS (
STUDENT_ID INT NOT NULL,
LECTURE_ID INT NOT NULL
);
CREATE TABLE STUDENT (
STUDENT_ID INT NOT NULL,
STUDENT_NAME VARCHAR(255),
PRIMARY KEY (STUDENT_ID)
);
CREATE TABLE LECTURE (
LECTURE_ID INT NOT NULL,
LECTURE_NAME VARCHAR(255),
ECTS INT,
PRIMARY KEY (LECTURE_ID)
);
I came up with this query but this didn't seem to work on SQLFIDDLE. I'm new to SQL and this query has been a little troublesome for me. How would you query this?
SELECT STUD.STUDENT_NAME FROM STUDENT STUD
INNER JOIN CLASS CLS AND LECTURE LEC ON
CLS.STUDENT_ID = STUD.STUDENT_ID
WHERE LEC.CTS > 4
How do I fix this query?
UPDATE
insert into STUDENT values(1, 'wick', 20);
insert into STUDENT values(2, 'Drake', 25);
insert into STUDENT values(3, 'Bake', 42);
insert into STUDENT values(4, 'Man', 5);
insert into LECTURE values(1, 'Math', 6);
insert into LECTURE values(2, 'Prog', 6);
insert into LECTURE values(3, 'Physics', 1);
insert into LECTURE values(4, '4ects', 4);
insert into LECTURE values(5, 'subj', 4);
insert into SCLASS values(1, 3);
insert into SCLASS values(1, 2);
insert into SCLASS values(2, 3);
insert into SCLASS values(3, 1);
insert into SCLASS values(3, 2);
insert into SCLASS values(3, 3);
insert into SCLASS values(4, 4);
insert into SCLASS values(4, 5);
The following approach might get the job done.
It works by generating two subqueries:
one that counts, for each student, how many lectures with ects greater than 4 that student attends
another that counts the total number of lectures with ects greater than 4
Then, the outer query keeps only the students whose count reaches the total:
SELECT x.student_id, x.student_name
FROM
(
SELECT s.student_id, s.student_name, COUNT(DISTINCT l.lecture_id) cnt
FROM
student s
INNER JOIN class c ON c.student_id = s.student_id
INNER JOIN lecture l ON l.lecture_id = c.lecture_id
WHERE l.ects > 4
GROUP BY s.student_id, s.student_name
) x
CROSS JOIN (SELECT COUNT(*) cnt FROM lecture WHERE ects > 4 ) y
WHERE x.cnt = y.cnt ;
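The query above can be exercised on the sample rows from the question's update. This is only a sketch: it runs in SQLite via Python's sqlite3 module, it loads the attendance rows into class (the update inserts into a table called SCLASS) and drops the third value from the STUDENT inserts because the posted schema has only two columns; both adjustments are assumptions, and the function name students_attending_all is made up.

```python
import sqlite3


def students_attending_all(min_ects=4):
    """Return students who attend every lecture with ects > min_ects."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table class (student_id int not null, lecture_id int not null);
        create table student (student_id int primary key,
                              student_name varchar(255));
        create table lecture (lecture_id int primary key,
                              lecture_name varchar(255), ects int);
        insert into student values
            (1, 'wick'), (2, 'Drake'), (3, 'Bake'), (4, 'Man');
        insert into lecture values
            (1, 'Math', 6), (2, 'Prog', 6), (3, 'Physics', 1),
            (4, '4ects', 4), (5, 'subj', 4);
        insert into class values
            (1, 3), (1, 2), (2, 3), (3, 1), (3, 2), (3, 3), (4, 4), (4, 5);
    """)
    rows = conn.execute("""
        select x.student_id, x.student_name
        from (
            -- per student: number of qualifying lectures attended
            select s.student_id, s.student_name,
                   count(distinct l.lecture_id) as cnt
            from student s
            inner join class c on c.student_id = s.student_id
            inner join lecture l on l.lecture_id = c.lecture_id
            where l.ects > :min_ects
            group by s.student_id, s.student_name
        ) x
        -- overall: number of qualifying lectures
        cross join (select count(*) as cnt
                    from lecture where ects > :min_ects) y
        where x.cnt = y.cnt
        order by x.student_id
    """, {"min_ects": min_ects}).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(students_attending_all())
```

Only Math and Prog have more than 4 ECTS in the sample data, and student 3 is the only one enrolled in both.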
As GMB already said in their answer: count the required lectures and compare that with the number taken per student. Here is another way to write such a query. We outer join classes to all lectures with ECTS > 4. Analytic window functions allow us to aggregate over two different groups at the same time (here: all rows and each student's rows).
select *
from student
where (student_id, 0) in -- 0 means no gap between required and taken lectures
(
select
student_id,
count(distinct lecture_id) over () -
count(distinct lecture_id) over (partition by c.student_id) as gap
from lecture l
left join class c using (lecture_id)
where l.ects > 4
);
Demo: https://dbfiddle.uk/?rdbms=oracle_18&fiddle=74371314913565243863c225847eb044
You can try the following query.
SELECT
STUD.STUDENT_NAME,
STUD.STUDENT_ID
FROM STUDENT STUD
INNER JOIN CLASS CLS ON CLS.STUDENT_ID = STUD.STUDENT_ID
INNER JOIN LECTURE LEC ON LEC.LECTURE_ID = CLS.LECTURE_ID
WHERE LEC.ECTS > 4
GROUP BY STUD.STUDENT_ID, STUD.STUDENT_NAME
HAVING COUNT(DISTINCT LEC.LECTURE_ID) = (SELECT COUNT(*) FROM LECTURE WHERE ECTS > 4)