Postgres Join Query is SOMETIMES taking the cartesian product - sql

I'm attempting to join multiple tables for one query and I am getting inconsistent results from the database, I believe my query is taking the cartesian product of all the users, when I only want users who are in the DirectConversation.
The Schema for reference:
The query is (where $id stands for the variable User.id):
SELECT c.*, count(dm.id),
u1.first_name, u1.last_name, u1.company, u1.picture,
u2.first_name, u2.last_name, u2.company, u2.picture
FROM "DirectConversation" as c, "DirectMessage" as dm, "Profile" as u1, "Profile" as u2
WHERE u1."id_User" = c."id_User1"
AND u2."id_User" = c."id_User2"
AND c.id = dm."id_DirectConversation"
AND dm.viewed = 'f' AND dm.deleted = 'f'
AND c."id_User1" = $id OR c."id_User2" = $id
GROUP BY c.id, u1.id, u2.id;
The expected result (the result when the user id = 1 ):
id | id_User1 | id_User2 | count | first_name | last_name | company | picture | first_name | last_name | company | picture
----+----------+----------+-------+------------+-----------+--------------------+-----------------------------------------------------------------------------+------------+-----------+----------------+--------------------------------------------------------------------------
1 | 1 | 2 | 3 | Albert | Einstein | alberts inventions | http://upload.wikimedia.org/wikipedia/commons/d/d3/Albert_Einstein_Head.jpg | Nikola | Tesla | Teslas Widgets | http://upload.wikimedia.org/wikipedia/commons/7/79/Tesla_circa_1890.jpeg
(1 row)
(END)
The error result (the result when the user id= 2):
id | id_User1 | id_User2 | count | first_name | last_name | company | picture | first_name | last_name | company | picture
----+----------+----------+-------+------------+-----------+--------------------+----------------------------------------------------------------------------------------------------------------+------------+-----------+--------------------+----------------------------------------------------------------------------------------------------------------
1 | 1 | 2 | 4 | Albert | Einstein | alberts inventions | http://upload.wikimedia.org/wikipedia/commons/d/d3/Albert_Einstein_Head.jpg | Albert | Einstein | alberts inventions | http://upload.wikimedia.org/wikipedia/commons/d/d3/Albert_Einstein_Head.jpg
1 | 1 | 2 | 4 | Albert | Einstein | alberts inventions | http://upload.wikimedia.org/wikipedia/commons/d/d3/Albert_Einstein_Head.jpg | Nikola | Tesla | Teslas Widgets | http://upload.wikimedia.org/wikipedia/commons/7/79/Tesla_circa_1890.jpeg
1 | 1 | 2 | 4 | Albert | Einstein | alberts inventions | http://upload.wikimedia.org/wikipedia/commons/d/d3/Albert_Einstein_Head.jpg | Rosalind | Franklin | DNA R US | http://upload.wikimedia.org/wikipedia/en/9/97/Rosalind_Franklin.jpg
1 | 1 | 2 | 4 | Albert | Einstein | alberts inventions | http://upload.wikimedia.org/wikipedia/commons/d/d3/Albert_Einstein_Head.jpg | Charles | Babbage | Babbages Cabbages | http://upload.wikimedia.org/wikipedia/commons/6/6b/Charles_Babbage_-_1860.jpg
... Note this was truncated for brevity. I believe this is taking the cartesian product of all the users, however I am unaware as to why
The version of postgres I'm using:
version
-----------------------------------------------------------------------------------------------------
PostgreSQL 9.3.6 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2, 64-bit

I just moved your where clauses for the joins to the ON statement and made proper joins, if this doesn't work I'll set up a sqlfiddle and see what the problem with this sql is
SELECT c.*, count(dm.id),
u1.first_name, u1.last_name, u1.company, u1.picture,
u2.first_name, u2.last_name, u2.company, u2.picture
FROM "DirectConversation" c
JOIN "DirectMessage" dm ON c.id = dm."id_DirectConversation"
JOIN "Profile" u1 ON u1."id_User" = c."id_User1"
JOIN "Profile" u2 ON u2."id_User" = c."id_User2"
WHERE
dm.viewed = 'f' AND dm.deleted = 'f'
AND (c."id_User1" = $id OR c."id_User2" = $id)
GROUP BY c.id, u1.id, u2.id;
Edit: grouped the OR clause just to be safe

Related

Remove Duplicate Result on Query

could help me solve this duplication problem where it returns more than 1 result for the same record I want to bring only 1 result for each id, and only the last history of each record.
My Query:
SELECT DISTINCT ON(tickets.ticket_id,ticket_histories.created_at)
ticket.id AS ticket_id,
tickets.priority,
tickets.title,
tickets.company,
tickets.ticket_statuse,
tickets.created_at AS created_ticket,
group_user.id AS group_id,
group_user.name AS user_group,
ch_history.description AS ch_description,
ch_history.created_at AS ch_history
FROM
tickets
INNER JOIN company ON (company.id = tickets.company_id)
INNER JOIN (SELECT id,
tickets_id,
description,
user_id,
MAX(tickets.created_at) AS created_ticket
FROM
ch_history
GROUP BY id,
created_at,
ticket_id,
user_id,
description
ORDER BY created_at DESC LIMIT 1) AS ch_history ON (ch_history.ticket_id = ticket.id)
INNER JOIN users ON (users.id = ch_history.user_id)
INNER JOIN group_users ON (group_users.id = users.group_user_id)
WHERE company = 15
GROUP BY
tickets.id,
ch_history.created_at DESC;
Result of my query, but returns 3 or 5 identical ids with different histories
I want to return only 1 id of each ticket, and only the last recorded history of each tick
ticket_id | priority | title | company_id | ticket_statuse | created_ticket | company | user_group | group_id | ch_description | ch_history
-----------+------------+--------------------------------------+------------+-----------------+----------------------------+------------------------------------------------------+-----------------+----------+------------------------+----------------------------
49713 | 2 | REMOVE DATA | 1 | t | 2019-12-09 17:50:35.724485 | SAME COMPANY | people | 5 | TEST 1 | 2019-12-10 09:31:45.780667
49706 | 2 | INCLUDE DATA | 1 | f | 2019-12-09 09:16:35.320708 | SAME COMPANY | people | 5 | TEST 2 | 2019-12-10 09:38:52.769515
49706 | 2 | ANY TITLE | 1 | f | 2019-12-09 09:16:35.320708 | SAME COMPANY | people | 5 | TEST 3 | 2019-12-10 09:39:22.779473
49706 | 2 | NOTING ELSE MAT | 1 | f | 2019-12-09 09:16:35.320708 | SAME COMPANY | people | 5 | TESTE 4 | 2019-12-10 09:42:59.50332
49706 | 2 | WHITESTRIPES | 1 | f | 2019-12-09 09:16:35.320708 | SAME COMPANY | people | 5 | TEST 5 | 2019-12-10 09:44:30.675434
wanted to return as below
ticket_id | priority | title | company_id | ticket_statuse | created_ticket | company | user_group | group_id | ch_description | ch_history
-----------+------------+--------------------------------------+------------+-----------------+----------------------------+------------------------------------------------------+-----------------+----------+------------------------+----------------------------
49713 | 2 | REMOVE DATA | 1 | t | 2019-12-09 17:50:10.724485 | SAME COMPANY | people | 5 | TEST 1 | 2020-01-01 18:31:45.780667
49707 | 2 | INCLUDE DATA | 1 | f | 2019-12-11 19:22:21.320701 | SAME COMPANY | people | 5 | TEST 2 | 2020-02-05 16:38:52.769515
49708 | 2 | ANY TITLE | 1 | f | 2019-12-15 07:15:57.320950 | SAME COMPANY | people | 5 | TEST 3 | 2020-02-06 07:39:22.779473
49709 | 2 | NOTING ELSE MAT | 1 | f | 2019-12-16 08:30:28.320881 | SAME COMPANY | people | 5 | TESTE 4 | 2020-01-07 11:42:59.50332
49701 | 2 | WHITESTRIPES | 1 | f | 2019-12-21 11:04:00.320450 | SAME COMPANY | people | 5 | TEST 5 | 2020-01-04 10:44:30.675434
I wanted to return as shown below, see that the field ch_description, and ch_history bring only the most recent records and only the last of each ticket listed, without duplication I wanted to bring this way could help me.
Two things jump out at me:
You have listed "created at" as part of your "distinct on," which is going to inherently give you multiple rows per ticket id (unless there happens to be only one)
The distinct on should make the subquery on the ticket history unnecessary... and even if you chose to do it this way, you again are going on the "created at" column, which will give you multiple results. The ideal subquery, should you choose this approach, would have been to group by ticket_id and only ticket_id.
Slightly related:
An alternative approach to the subquery would be an analytic function (windowing function), but I'll save that for another day.
I think the query you want, which will give you one row per ticket_id, based on the history table's created_at field would be something like this:
select distinct on (t.id)
<your fields here>
from
tickets t
join company c on t.company_id = c.id
join ch_history ch on ch.ticket_id = t.id
join users u on ch.user_id = u.ud
join group_users g on u.group_user_id = g.id
where
company = 15
order by
t.id, ch.created_at -- this is what tells distinct on which record to choose

SQL - joining 3 tables and choosing newest logged entry per id

I got rather complicated riddle to solve. So far I'm unlocky.
I got 3 tables which I need to join to get the result.
Most important is that I need highest h_id per p_id. h_id is uniqe entry in log history. And I need newest one for given point (p_id -> num).
Apart from that I need ext and name as well.
history
+----------------+---------+--------+
| h_id | p_id | str_id |
+----------------+---------+--------+
| 1 | 1 | 11 |
| 2 | 5 | 15 |
| 3 | 5 | 23 |
| 4 | 1 | 62 |
+----------------+---------+--------+
point
+----------------+---------+
| p_id | num |
+----------------+---------+
| 1 | 4564 |
| 5 | 3453 |
+----------------+---------+
street
+----------------+---------+-------------+
| str_id | ext | name |
+----------------+---------+-------------+
| 15 | | Mein st. 33 | - bad name
| 11 | | eck st. 42 | - bad name
| 62 | abc | Main st. 33 |
| 23 | efg | Back st. 42 |
+----------------+---------+-------------+
EXPECTED RESULT
+----------------+---------+-------------+-----+
| num | ext | name |h_id |
+----------------+---------+-------------+-----+
| 3453 | efg | Back st. 42 | 3 |
| 4564 | abc | Main st. 33 | 4 |
+----------------+---------+-------------+-----+
I'm using Oracle SQL. Tried using query below but result is not true.
SELECT num, max(name), max(ext), MAX(h_id) maxm FROM history
INNER JOIN street on street.str_id = history._str_id
INNER JOIN point on point.p_id = history.p_id
GROUP BY point.num
In Oracle, you can use keep:
SELECT p.num,
MAX(h.h_id) as maxm,
MAX(s.name) KEEP (DENSE_RANK FIRST ORDER BY h.h_id DESC) as name,
MAX(s.ext) KEEP (DENSE_RANK FIRST ORDER BY h.h_id DESC) as ext
FROM history h INNER JOIN
street s
ON s.str_id = h._str_id INNER JOIN
point p
ON p.p_id = h.p_id
GROUP BY p.num;
The keep syntax allows you to do "first()" and "last()" for aggregations.

How to print the students name in this query?

The concerned tables are as follows:
students(rollno, name, deptcode)
depts(deptcode, deptname)
course(crs_rollno, crs_name, marks)
The query is
Find the name and roll number of the students from each department who obtained
highest total marks in their own department.
Consider:
i) Courses of different department are different.
ii) All students of a particular department take same number and same courses.
Then only the query makes sense.
I wrote a successful query for displaying the maximum total marks by a student in each department.
select do.deptname, max(x.marks) from students so
inner join depts do
on do.deptcode=so.deptcode
inner join(
select s.name as name, d.deptname as deptname, sum(c.marks) as marks from students s
inner join crs_regd c
on s.rollno=c.crs_rollno
inner join depts d
on d.deptcode=s.deptcode
group by s.name,d.deptname) x
on x.name=so.name and x.deptname=do.deptname group by do.deptname;
But as mentioned I need to display the name as well. Accordingly if I include so.name in select list, I need to include it in group by clause and the output is as below:
Kendra Summers Computer Science 274
Stewart Robbins English 80
Cole Page Computer Science 250
Brian Steele English 83
expected output:
Kendra Summers Computer Science 274
Brian Steele English 83
Where is the problem?
I guess this can be easily achieved if you use window function -
select name, deptname, marks
from (select s.name as name, d.deptname as deptname, sum(c.marks) as marks,
row_number() over(partition by d.deptname order by sum(c.marks) desc) rn
from students s
inner join crs_regd c on s.rollno=c.crs_rollno
inner join depts d on d.deptcode=s.deptcode
group by s.name,d.deptname) x
where rn = 1;
To solve the problem with a readable query I had to define a couple of views:
total_marks: For each student the sum of their marks
create view total_marks as select s.deptcode, s.name, s.rollno, sum(c.marks) as total from course c, students s where s.rollno = c.crs_rollno group by s.rollno;
dept_max: For each department the highest total score by a single student of that department
create view dept_max as select deptcode, max(total) max_total from total_marks group by deptcode;
So I can get the desidered output with the query
select a.deptcode, a.rollno, a.name from total_marks a join dept_max b on a.deptcode = b.deptcode and a.total = b.max_total
If you don't want to use views you can replace their selects on the final query, which will result in this:
select a.deptcode, a.rollno, a.name
from
(select s.deptcode, s.name, s.rollno, sum(c.marks) as total from course c, students s where s.rollno = c.crs_rollno group by s.rollno) a
join (select deptcode, max(total) max_total from (select s.deptcode, s.name, s.rollno, sum(c.marks) as total from course c, students s where s.rollno = c.crs_rollno group by s.rollno) a_ group by deptcode) b
on a.deptcode = b.deptcode and a.total = b.max_total
Which I'm sure it is easily improvable in performance by someone more skilled then me...
If you (and anybody else) want to try it the way I did, here is the schema:
create table depts ( deptcode int primary key auto_increment, deptname varchar(20) );
create table students ( rollno int primary key auto_increment, name varchar(20) not null, deptcode int, foreign key (deptcode) references depts(deptcode) );
create table course ( crs_rollno int, crs_name varchar(20), marks int, foreign key (crs_rollno) references students(rollno) );
And here all the entries I inserted:
insert into depts (deptname) values ("Computer Science"),("Biology"),("Fine Arts");
insert into students (name,deptcode) values ("Turing",1),("Jobs",1),("Tanenbaum",1),("Darwin",2),("Mendel",2),("Bernard",2),("Picasso",3),("Monet",3),("Van Gogh",3);
insert into course (crs_rollno,crs_name,marks) values
(1,"Algorithms",25),(1,"Database",28),(1,"Programming",29),(1,"Calculus",30),
(2,"Algorithms",24),(2,"Database",22),(2,"Programming",28),(2,"Calculus",19),
(3,"Algorithms",21),(3,"Database",27),(3,"Programming",23),(3,"Calculus",26),
(4,"Zoology",22),(4,"Botanics",28),(4,"Chemistry",30),(4,"Anatomy",25),(4,"Pharmacology",27),
(5,"Zoology",29),(5,"Botanics",27),(5,"Chemistry",26),(5,"Anatomy",25),(5,"Pharmacology",24),
(6,"Zoology",18),(6,"Botanics",19),(6,"Chemistry",22),(6,"Anatomy",23),(6,"Pharmacology",24),
(7,"Sculpture",26),(7,"History",25),(7,"Painting",30),
(8,"Sculpture",29),(8,"History",24),(8,"Painting",30),
(9,"Sculpture",21),(9,"History",19),(9,"Painting",25) ;
Those inserts will load these data:
select * from depts;
+----------+------------------+
| deptcode | deptname |
+----------+------------------+
| 1 | Computer Science |
| 2 | Biology |
| 3 | Fine Arts |
+----------+------------------+
select * from students;
+--------+-----------+----------+
| rollno | name | deptcode |
+--------+-----------+----------+
| 1 | Turing | 1 |
| 2 | Jobs | 1 |
| 3 | Tanenbaum | 1 |
| 4 | Darwin | 2 |
| 5 | Mendel | 2 |
| 6 | Bernard | 2 |
| 7 | Picasso | 3 |
| 8 | Monet | 3 |
| 9 | Van Gogh | 3 |
+--------+-----------+----------+
select * from course;
+------------+--------------+-------+
| crs_rollno | crs_name | marks |
+------------+--------------+-------+
| 1 | Algorithms | 25 |
| 1 | Database | 28 |
| 1 | Programming | 29 |
| 1 | Calculus | 30 |
| 2 | Algorithms | 24 |
| 2 | Database | 22 |
| 2 | Programming | 28 |
| 2 | Calculus | 19 |
| 3 | Algorithms | 21 |
| 3 | Database | 27 |
| 3 | Programming | 23 |
| 3 | Calculus | 26 |
| 4 | Zoology | 22 |
| 4 | Botanics | 28 |
| 4 | Chemistry | 30 |
| 4 | Anatomy | 25 |
| 4 | Pharmacology | 27 |
| 5 | Zoology | 29 |
| 5 | Botanics | 27 |
| 5 | Chemistry | 26 |
| 5 | Anatomy | 25 |
| 5 | Pharmacology | 24 |
| 6 | Zoology | 18 |
| 6 | Botanics | 19 |
| 6 | Chemistry | 22 |
| 6 | Anatomy | 23 |
| 6 | Pharmacology | 24 |
| 7 | Sculpture | 26 |
| 7 | History | 25 |
| 7 | Painting | 30 |
| 8 | Sculpture | 29 |
| 8 | History | 24 |
| 8 | Painting | 30 |
| 9 | Sculpture | 21 |
| 9 | History | 19 |
| 9 | Painting | 25 |
+------------+--------------+-------+
I take chance to point out that this database is badly designed. This becomes evident with course table. For these reasons:
The name is singular
This table does not represent courses, but rather exams or scores
crs_name should be a foreign key referencing the primary key of another table (that would actually represent the courses)
There is no constrains to limit the marks to a range and to avoid a student to take twice the same exam
I find more logical to associate courses to departments, instead of student to departments (this way also would make these queries easier)
I tell you this because I understood you are learning from a book, so unless the book at one point says "this database is poorly designed", do not take this exercise as example to design your own!
Anyway, if you manually resolve the query with my data you will come to this results:
+----------+--------+---------+
| deptcode | rollno | name |
+----------+--------+---------+
| 1 | 1 | Turing |
| 2 | 6 | Bernard |
| 3 | 8 | Monet |
+----------+--------+---------+
As further reference, here the contents of the views I needed to define:
select * from total_marks;
+----------+-----------+--------+-------+
| deptcode | name | rollno | total |
+----------+-----------+--------+-------+
| 1 | Turing | 1 | 112 |
| 1 | Jobs | 2 | 93 |
| 1 | Tanenbaum | 3 | 97 |
| 2 | Darwin | 4 | 132 |
| 2 | Mendel | 5 | 131 |
| 2 | Bernard | 6 | 136 |
| 3 | Picasso | 7 | 81 |
| 3 | Monet | 8 | 83 |
| 3 | Van Gogh | 9 | 65 |
+----------+-----------+--------+-------+
select * from dept_max;
+----------+-----------+
| deptcode | max_total |
+----------+-----------+
| 1 | 112 |
| 2 | 136 |
| 3 | 83 |
+----------+-----------+
Hope I helped!
Try the following query
select a.name, b.deptname,c.marks
from students a
, crs_regd b
, depts c
where a.rollno = b.crs_rollno
and a.deptcode = c.deptcode
and(c.deptname,b.marks) in (select do.deptname, max(x.marks)
from students so
inner join depts do
on do.deptcode=so.deptcode
inner join (select s.name as name
, d.deptname as deptname
, sum(c.marks) as marks
from students s
inner join crs_regd c
on s.rollno=c.crs_rollno
inner join depts d
on d.deptcode=s.deptcode
group by s.name,d.deptname) x
on x.name=so.name
and x.deptname=do.deptname
group by do.deptname
)
Inner/Sub query will fetch the course name and max marks and the outer query gets the corresponding name of the student.
try and let know if you got the desired result
Dense_Rank() function would be helpful in this scenario:
SELECT subquery.*
FROM (SELECT Student_Total_Marks.rollno,
Student_Total_Marks.name,
Student_Total_Marks.deptcode, depts.deptname,
rank() over (partition by deptcode order by total_marks desc) Student_Rank
FROM (SELECT Stud.rollno,
Stud.name,
Stud.deptcode,
sum(course.marks) total_marks
FROM students stud inner join course course on stud.rollno = course.crs_rollno
GROUP BY stud.rollno,Stud.name,Stud.deptcode) Student_Total_Marks,
dept dept
WHERE Student_Total_Marks.deptcode = dept.deptname
GROUP BY Student_Total_Marks.deptcode) subquery
WHERE suquery.student_rank = 1

MS Access SQL - Aggregate Count with 2 Example Table

I have a situation where I need to query from a 2 table in MS Access.
I'm new to this SQL access.
Table: UserInfo
===================
| UserID | Gender |
===================
| K01 | M |
| K02 | M |
| K03 | F |
| K04 | M |
===================
Table: OrderInfo
===================================
| OrderID | Type | UserID_FK |
===================================
| 1 | Food | K01 |
| 2 | Food | K01 |
| 3 | Toolkit | K02 |
| 4 | Food | K04 |
| 5 | Toolkit | K03 |
===================================
The question is:
I want to query so the result produce this table, how can I do it in MS Access?
I'm thinking that I should do a subquery but I don't know how to do it.
Table: Summary
================================================================
| UserID | Gender | CountOfToolkit | CountOfFood | TotalCount |
================================================================
| K01 | M | 0 | 2 | 2 |
| K02 | M | 1 | 0 | 1 |
| K03 | F | 1 | 0 | 1 |
| K04 | M | 0 | 1 | 1 |
================================================================
First, Type is a reserved word in MS Access.
You should avoid using reserved words.
Having said that, try this:
SELECT a.UserID, a.Gender, SUM(IIF(b.[Type] ='Food',1,0)) AS CountOfFood, SUM(IIF(b.[Type] ='Toolkit',1,0)) AS CountOfToolkit, COUNT(*) AS TotalCount
FROM UserInfo a INNER JOIN OrderInfo b ON a.UserId = b.UserID_FK
GROUP BY a.UserID, a.Gender
This crosstab query doesn't rely on having just Food & Toolkit as your types - it will count any new types you add.
The NZ wrapped around the Count(sType) ensures the 0 values are shown, while the CLNG ensures it's still treated as a number rather than text.
You could just use Count(sType) AS CountOfType.
TRANSFORM CLNG(NZ(Count(sType),0)) AS CountOfType
SELECT UserID
,Gender
,COUNT(sType) AS TotalCount
FROM UserInfo LEFT JOIN OrderInfo ON UserInfo.UserID = OrderInfo.UserID_FK
GROUP BY UserID, Gender
PIVOT sType

Split column into two columns based on type code in third column

My SQL is very rusty. I'm trying to transform this table:
+----+-----+--------------+-------+
| ID | SIN | CONTACT | TYPE |
+----+-----+--------------+-------+
| 1 | 737 | b#bacon.com | email |
| 2 | 760 | 250-555-0100 | phone |
| 3 | 737 | 250-555-0101 | phone |
| 4 | 800 | 250-555-0102 | phone |
| 5 | 850 | l#lemon.com | email |
+----+-----+--------------+-------+
Into this table:
+----+-----+--------------+-------------+
| ID | SIN | PHONE | EMAIL |
+----+-----+--------------+-------------+
| 1 | 737 | 250-555-0101 | b#bacon.com |
| 2 | 760 | 250-555-0100 | |
| 4 | 800 | 250-555-0102 | |
| 5 | 850 | | l#lemon.com |
+----+-----+--------------+-------------+
I wrote this query:
SELECT *
FROM (SELECT *
FROM people
WHERE TYPE = 'phone') phoneNumbers
FULL JOIN (SELECT *
FROM people
WHERE TYPE = 'email') emailAddresses
ON phoneNumbers.SIN = emailAddresses.SIN;
Which produces:
+----+-----+--------------+-------+------+-------+-------------+--------+
| ID | SIN | CONTACT | TYPE | ID_1 | SIN_1 | CONTACT_1 | TYPE_1 |
+----+-----+--------------+-------+------+-------+-------------+--------+
| 2 | 760 | 250-555-0100 | phone | | | | |
| 3 | 737 | 250-555-0101 | phone | 1 | 737 | b#bacon.com | email |
| 4 | 800 | 250-555-0102 | phone | | | | |
| | | | | 5 | 850 | l#lemon.com | email |
+----+-----+--------------+-------+------+-------+-------------+--------+
I know that I can select the columns I want, but the SIN column is incomplete. I seem to recall that I should join in the table a third time to get a complete SIN column, but I cannot remember how.
How can I produce my target table (ID, SIN, PHONE, EMAIL)?
Edit and clarification: I am grateful for the answers I have received so far, but as a SQL greenhorn I am unfamiliar with the techniques you are using (case statements, conditional aggregation, and pivoting). Can this not be done using JOIN and SELECT? Please excuse my ignorance in this matter. (It's not that I am not interested in superior techniques, but I do not want to move too fast too soon.)
One way to approach this is conditional aggregation:
select min(ID), SIN,
max(case when type = 'phone' then contact end) as phone,
max(case when type = 'email' then contact end) as email
from people t
group by sin;
Seems a pivot (oracle.com) would work easily here.
SELECT ID, SIN, PHONE, EMAIL
FROM PEOPLE
PIVOT (
MAX(CONTACT)
FOR TYPE IN ('EMAIL', 'PHONE')
)
I realize this is less elegant than all the solutions posted, but here it is anyhow, a solution using only JOIN and SELECT:
SELECT sins.SIN, phone, email
FROM ((SELECT SIN email_sin, contact email
FROM people
WHERE TYPE = 'email') email
FULL JOIN (SELECT SIN phone_sin, contact phone
FROM people
WHERE TYPE = 'phone') phone
ON email.email_sin = phone.phone_sin)
RIGHT JOIN (SELECT DISTINCT SIN FROM people) sins
ON sins.SIN = phone_sin OR sins.SIN = email_sin;
This lacks the ID column.