Postgres group by columns and within group select other columns by max aggregate - sql

This is probably a standard problem, and I've keyed off some other greatest-n-per-group answers, but so far been unable to resolve my current problem.
Table A:
+----+-------+
| id | start |
+----+-------+
| 1  | 1     |
| 2  | 2     |
+----+-------+
Table B:
+----+------+
| id | a_id |
+----+------+
| 1  | 1    |
| 2  | 1    |
| 3  | 2    |
+----+------+
Table C:
+----+------+-------+
| id | b_id | name  |
+----+------+-------+
| 1  | 1    | aname |
| 2  | 2    | aname |
| 3  | 3    | aname |
| 4  | 3    | bname |
+----+------+-------+
In English what I'd like to accomplish is:
For each c.name, select its newest entry based on the start time in a.start
The SQL I've tried is the following:
SELECT a.id, a.start, c.id, c.name
FROM a
INNER JOIN (
SELECT id, MAX(start) as start
FROM a
GROUP BY id
) a2 ON a.id = a2.id AND a.start = a2.start
JOIN b
ON a.id = b.a_id
JOIN c
on b.id = c.b_id
GROUP BY c.name;
It fails with errors such as:
ERROR: column "a.id" must appear in the GROUP BY clause or be used in an aggregate function Position: 8
To be useful I really need the ids from the query, but cannot group on them since they are unique. Here is an example of output I'd love for the first case above:
+------+---------+------+--------+
| a.id | a.start | c.id | c.name |
+------+---------+------+--------+
| 2 | 2 | 3 | aname |
| 2 | 2 | 4 | bname |
+------+---------+------+--------+
Here is a Sqlfiddle
Edit - removed second case

Case 1
select distinct on (c.name)
a.id, a.start, c.id, c.name
from
a
inner join
b on a.id = b.a_id
inner join
c on b.id = c.b_id
order by c.name, a.start desc
;
id | start | id | name
----+-------+----+-------
2 | 2 | 3 | aname
2 | 2 | 4 | bname
Case 2
select distinct on (c.name)
a.id, a.start, c.id, c.name
from
a
inner join
b on a.id = b.a_id
inner join
c on b.id = c.b_id
where
b.a_id in (
select a_id
from b
group by a_id
having count(*) > 1
)
order by c.name, a.start desc
;
id | start | id | name
----+-------+----+-------
1 | 1 | 1 | aname
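The Case 1 query relies on DISTINCT ON, which is specific to Postgres. As a rough sketch of the same "newest row per c.name" idea in a more portable form, the following uses ROW_NUMBER() and runs against an in-memory SQLite database through Python's sqlite3 module. The table definitions are assumptions reconstructed from the sample data in the question, and window functions require SQLite 3.25 or later.

```python
import sqlite3

# In-memory database holding the sample rows from the question (schema is assumed).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, start INTEGER);
    CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER);
    CREATE TABLE c (id INTEGER PRIMARY KEY, b_id INTEGER, name TEXT);
    INSERT INTO a VALUES (1, 1), (2, 2);
    INSERT INTO b VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO c VALUES (1, 1, 'aname'), (2, 2, 'aname'),
                         (3, 3, 'aname'), (4, 3, 'bname');
""")

# Rank the joined rows within each name by a.start (newest first), keep rank 1.
newest_per_name = conn.execute("""
    SELECT a_id, start, c_id, name
    FROM (
        SELECT a.id AS a_id, a.start, c.id AS c_id, c.name,
               ROW_NUMBER() OVER (PARTITION BY c.name
                                  ORDER BY a.start DESC) AS rn
        FROM a
        JOIN b ON a.id = b.a_id
        JOIN c ON b.id = c.b_id
    ) AS ranked
    WHERE rn = 1
    ORDER BY name
""").fetchall()

for row in newest_per_name:
    print(row)
```

If several rows share the newest start for a name, ROW_NUMBER() picks one of them arbitrarily, just as DISTINCT ON does; adding a tiebreaker column to the ORDER BY makes the choice deterministic.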

Related

Get left table data completely even when there is no reference in right joined table

Database used: SQL Server
I have three tables A,B,C.
TABLE A:
------------------
| ID | Name |
------------------
| 1 | X |
------------------
| 2 | Y |
------------------
TABLE B:
----------------------
| ID | Date |
----------------------
| 1 | 2019-11-06 |
----------------------
| 2 | 2019-11-05 |
----------------------
TABLE C:
----------------------------------
| ID | B.ID | A.ID | Amount |
----------------------------------
| 1 | 1 | 1 | 500 |
----------------------------------
| 2 | 2 | 2 | 1000 |
----------------------------------
The result I would like to get is every A.Name together with its C.Amount for rows where B.Date = 2019-11-06. The result set should include all A.Name entries even if they have no reference in table C.
Required result is:
-----------------------
| A.Name | C.Amount |
-----------------------
| X | 500 |
-----------------------
| Y | NULL |
-----------------------
Code I tried with :
SELECT A.Name,C.Amount
FROM A
LEFT OUTER JOIN C ON C.A_ID=A.ID
LEFT OUTER JOIN B ON B.ID = C.B_ID
WHERE B.Date='2019-11-06'
The result I obtained with above code is :
------------------
| Name | Amount |
------------------
| X | 500 |
------------------
There is no Y in the result because there is no entry for Y on that particular date. I just want to show Y with the amount as NULL or zero.
SQL Fiddle with my query
Please help me with this.
There is no direct relationship between your A and B, so we need to join B and C in a subquery that filters by date before doing the left join.
SELECT A.Name, t1.Amount
FROM A
LEFT JOIN
(SELECT C.A_ID, C.Amount FROM C
INNER JOIN B ON B.ID = C.B_ID
WHERE B.Date='2019-11-06') t1
ON t1.A_ID=A.ID
see dbfiddle
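To make the difference concrete, here is a small sketch that runs both the original query and the subquery version against an in-memory SQLite database via Python's sqlite3 module. The column types are assumptions based on the sample tables; the point is only that a WHERE condition on B discards the NULL-extended row for Y, while filtering inside the derived table keeps it.

```python
import sqlite3

# Sample data from the question; column types are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE B (ID INTEGER PRIMARY KEY, Date TEXT);
    CREATE TABLE C (ID INTEGER PRIMARY KEY, B_ID INTEGER,
                    A_ID INTEGER, Amount INTEGER);
    INSERT INTO A VALUES (1, 'X'), (2, 'Y');
    INSERT INTO B VALUES (1, '2019-11-06'), (2, '2019-11-05');
    INSERT INTO C VALUES (1, 1, 1, 500), (2, 2, 2, 1000);
""")

# Filtering in WHERE runs after the outer joins, so Y's row is removed.
filter_in_where = conn.execute("""
    SELECT A.Name, C.Amount
    FROM A
    LEFT OUTER JOIN C ON C.A_ID = A.ID
    LEFT OUTER JOIN B ON B.ID = C.B_ID
    WHERE B.Date = '2019-11-06'
    ORDER BY A.Name
""").fetchall()

# Filtering inside the derived table happens before the left join,
# so every row of A survives and unmatched names get NULL.
filter_in_subquery = conn.execute("""
    SELECT A.Name, t1.Amount
    FROM A
    LEFT JOIN (
        SELECT C.A_ID, C.Amount
        FROM C
        INNER JOIN B ON B.ID = C.B_ID
        WHERE B.Date = '2019-11-06'
    ) AS t1 ON t1.A_ID = A.ID
    ORDER BY A.Name
""").fetchall()

print(filter_in_where)
print(filter_in_subquery)
```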
Try this-
Fiddle Here
SELECT A.Name,C.Amount
FROM A
LEFT JOIN B ON A.ID = B.ID AND B.Date = '2019-11-06'
LEFT JOIN C ON B.ID = C.ID
Output is-
Name Amount
X 500
Y (null)

SQL: CROSS JOIN over table partitions

I have the following table
session_id | page_viewed
1 | A
1 | B
1 | C
2 | B
2 | E
What I would like to do is a cross join of the page_viewed column with itself but where the cross join is done on the partitions from session_id. So, from the table above the query would return:
session_id | page_1 | page_2
1 | A | A
1 | A | B
1 | A | C
1 | B | A
1 | B | B
1 | B | C
1 | C | A
1 | C | B
1 | C | C
2 | B | B
2 | B | E
2 | E | B
2 | E | E
I have looked into window functions today trying to find a way around it but it seems join functions cannot be used. Can anyone help?
You may join giving only the session_id as the join criteria:
SELECT
t1.session_id,
t1.page_viewed AS page_1,
t2.page_viewed AS page_2
FROM yourTable t1
INNER JOIN yourTable t2
ON t1.session_id = t2.session_id;
-- ORDER BY clause optional, if you need it here
Demo
Hmmm . . . you seem to want a self-join:
select t1.session_id, t1.page_viewed as page_1, t2.page_viewed as page_2
from t t1 join
t t2
on t1.session_id = t2.session_id
order by t1.session_id, t1.page_viewed, t2.page_viewed;
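As a quick check of the self-join approach, the sketch below loads the sample rows into an in-memory SQLite database through Python's sqlite3 module. The table name page_views is a made-up placeholder, since the question does not name the table.

```python
import sqlite3

# page_views is a hypothetical table name; the rows are the question's sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page_views (session_id INTEGER, page_viewed TEXT);
    INSERT INTO page_views VALUES
        (1, 'A'), (1, 'B'), (1, 'C'), (2, 'B'), (2, 'E');
""")

# Joining the table to itself on session_id pairs every page with every
# page of the same session, i.e. a cross join restricted to each partition.
pairs = conn.execute("""
    SELECT t1.session_id,
           t1.page_viewed AS page_1,
           t2.page_viewed AS page_2
    FROM page_views t1
    JOIN page_views t2 ON t1.session_id = t2.session_id
    ORDER BY t1.session_id, t1.page_viewed, t2.page_viewed
""").fetchall()

for row in pairs:
    print(row)
```

Session 1 has three pages and session 2 has two, so the join yields 3 x 3 + 2 x 2 = 13 rows, matching the desired output in the question.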

Use COUNT and JOIN in a same query SQL

My tables:
Customers:
+------+----+
| Name | ID |
+------+----+
| Phu | 12 |
| Nam | 23 |
| Mit | 33 |
+------+----+
Orders:
+----+------------+
| ID | Order |
+----+------------+
| 12 | Laptop |
| 12 | Mouse |
| 33 | Smartphone |
| 23 | Keyboard |
| 33 | Computer |
+----+------------+
I want to get output like this:
+------+--------+
| Name | Orders |
+------+--------+
| Phu | 2 |
| Mit | 2 |
+------+--------+
I use this query but this doesn't work:
SELECT
Name,
COUNT(*) AS 'Orders'
FROM
Orders a
INNER JOIN
Customers b ON a.ID = b.ID
GROUP BY
a.ID
HAVING
COUNT(*) > 1;
It has the error like this:
Column 'Customers.Name' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Any help is really appreciated. Thank you.
SELECT c.Name, COUNT(o.ID) AS Orders
FROM Customers c INNER JOIN Orders o ON c.ID = o.ID
GROUP BY o.ID
HAVING Orders > 1
Working Sqlfiddle: http://sqlfiddle.com/#!2/869790/4
As mentioned in my comment, every non-aggregated column in the SELECT list has to appear in the GROUP BY clause.
SELECT Name, COUNT(*) AS 'Orders'
FROM Orders a
INNER JOIN Customers b
ON a.ID = b.ID
GROUP BY b.Name
HAVING COUNT(*)>1;
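A small runnable sketch of the grouping rule, using an in-memory SQLite database via Python's sqlite3 module. The order column is renamed to Item here because ORDER is a reserved word; that name and the column types are assumptions. The query groups by the customer ID as well as the name, so two customers who happen to share a name would not be merged.

```python
import sqlite3

# Sample data from the question; Item stands in for the "Order" column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (Name TEXT, ID INTEGER PRIMARY KEY);
    CREATE TABLE Orders (ID INTEGER, Item TEXT);
    INSERT INTO Customers VALUES ('Phu', 12), ('Nam', 23), ('Mit', 33);
    INSERT INTO Orders VALUES
        (12, 'Laptop'), (12, 'Mouse'), (33, 'Smartphone'),
        (23, 'Keyboard'), (33, 'Computer');
""")

# Every non-aggregated column in the SELECT list (c.Name) is in GROUP BY;
# HAVING then keeps only customers with more than one order.
repeat_customers = conn.execute("""
    SELECT c.Name, COUNT(*) AS OrderCount
    FROM Orders o
    INNER JOIN Customers c ON o.ID = c.ID
    GROUP BY c.ID, c.Name
    HAVING COUNT(*) > 1
    ORDER BY c.ID
""").fetchall()

print(repeat_customers)
```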

SQL query for this combined expected results - pivot?

Sorry for this long-winded question, but I'm not sure how to construct the SQL query needed for the results I want. I'll outline the two queries that I currently run, which work fine, and then the results I need. Any help will be appreciated.
1st Query:
SELECT c.name AS name, count(*) AS total, sum(a.views) AS total_views, sum(a.views) / count(*) as average_views
FROM table_a a
JOIN table_b b ON b.id = a.b_id
JOIN table_c c ON c.id = b.c_id
WHERE a.status = 0 AND a.type in (2, 4, 5)
GROUP BY c.name ORDER BY c.name;
Result:
--------------------------------------------
name | total | total_views | average_views |
--------------------------------------------
aaaa | 2 | 150 | 75 |
bbbb | 1 | 75 | 75 |
dddd | 1 | 25 | 25 |
--------------------------------------------
2nd query:
SELECT c.name AS name, count(*) AS total, sum(a.views) AS total_views, sum(a.views) / count(*) as average_views
FROM table_a a
JOIN table_b b ON b.id = a.b_id
JOIN table_c c ON c.id = b.c_id
WHERE a.status = 0 AND a.type in (1, 3)
GROUP BY c.name ORDER BY c.name;
2nd results:
--------------------------------------------
name | total | total_views | average_views |
--------------------------------------------
aaaa | 2 | 200 | 100 |
bbbb | 1 | 100 | 100 |
dddd | 1 | 25 | 25 |
--------------------------------------------
Given these tables with this data:
Table table_a:
-----------------------------------
id | b_id | views | type | status |
-----------------------------------
1 | 100 | 100 | 2 | 0 |
2 | 200 | 75 | 4 | 0 |
3 | 300 | 50 | 5 | 0 |
4 | 400 | 25 | 2 | 0 |
5 | 500 | 100 | 1 | 0 |
6 | 600 | 100 | 1 | 0 |
7 | 700 | 100 | 3 | 0 |
8 | 800 | 25 | 3 | 0 |
-----------------------------------
Table table_b:
-------------
id | c_id |
-------------
100 | 1000 |
200 | 2000 |
300 | 1000 |
400 | 4000 |
500 | 1000 |
600 | 2000 |
700 | 4000 |
800 | 1000 |
-------------
Table table_c:
-------------
id | name |
-------------
1000 | aaaa |
2000 | bbbb |
3000 | cccc |
4000 | dddd |
-------------
This is the table that I actually want, which is simply a concatenation of the above two result sets, joined on the common name column.
-------------------------------------------------------------------------------------------------------------------------------
name | total_type245 | total_views_type245 | average_views_type245 | total_type13 | total_views_type13 | average_views_type13 |
-------------------------------------------------------------------------------------------------------------------------------
aaaa | 2 | 150 | 75 | 2 | 200 | 100 |
bbbb | 1 | 75 | 75 | 1 | 100 | 100 |
dddd | 1 | 25 | 25 | 1 | 25 | 25 |
-------------------------------------------------------------------------------------------------------------------------------
It's most likely quite a simple query, but I cannot work out how to do it.
Thanks.
Just join the results together;
SELECT ResultsA.name,
total_type245,
total_views_type245,
average_views_type245,
total_type13,
total_views_type13,
average_views_type13
FROM
(
SELECT c.name AS name, count(*) AS total_type245, sum(a.views) AS total_views_type245, sum(a.views) / count(*) as average_views_type245
FROM table_a a
JOIN table_b b ON b.id = a.b_id
JOIN table_c c ON c.id = b.c_id
WHERE a.status = 0 AND a.type in (2, 4, 5)
GROUP BY name
) as ResultsA
JOIN
(
SELECT c.name AS name, count(*) AS total_type13, sum(a.views) AS total_views_type13, sum(a.views) / count(*) as average_views_type13
FROM table_a a
JOIN table_b b ON b.id = a.b_id
JOIN table_c c ON c.id = b.c_id
WHERE a.status = 0 AND a.type in (1, 3)
GROUP BY name
) as ResultsB ON ResultsA.name = ResultsB.name
ORDER BY ResultsA.name
Ok, so with Matt's help this query works:
SELECT c.name, total_type245, total_views_type245, average_views_type245, total_type13, total_views_type13, average_views_type13
FROM table_c c
LEFT JOIN (
SELECT c.name AS name, count(*) AS total_type245, sum(a.views) AS total_views_type245, sum(a.views) / count(*) as average_views_type245
FROM table_a a
JOIN table_b b ON b.id = a.b_id
JOIN table_c c ON c.id = b.c_id
WHERE a.status = 0 AND a.type in (2, 4, 5)
GROUP BY name
) as ResultsA ON ResultsA.name = c.name
LEFT JOIN (
SELECT c.name AS name, count(*) AS total_type13, sum(a.views) AS total_views_type13, sum(a.views) / count(*) as average_views_type13
FROM table_a a
JOIN table_b b ON b.id = a.b_id
JOIN table_c c ON c.id = b.c_id
WHERE a.status = 0 AND a.type in (1, 3)
GROUP BY name
) as ResultsB ON ResultsB.name = c.name;
Is this the most efficient query for the job, though? It seems I'm repeating most of the query, with the a.type filter being the only difference.
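One commonly used way to avoid repeating the joins is conditional aggregation: scan the joined rows once and let CASE expressions decide which aggregate each row feeds. This is only a sketch, not a drop-in replacement for the query above. It runs on an in-memory SQLite database via Python's sqlite3 module, the column types are assumed, and because it uses inner joins it omits names with no matching rows (such as cccc), which the LEFT JOIN version keeps. The figures are computed from the sample rows as listed in the question, so they may not match the result tables shown there line for line.

```python
import sqlite3

# Sample rows as listed in the question; column types are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER, b_id INTEGER, views INTEGER,
                          type INTEGER, status INTEGER);
    CREATE TABLE table_b (id INTEGER, c_id INTEGER);
    CREATE TABLE table_c (id INTEGER, name TEXT);
    INSERT INTO table_a VALUES
        (1, 100, 100, 2, 0), (2, 200, 75, 4, 0), (3, 300, 50, 5, 0),
        (4, 400, 25, 2, 0), (5, 500, 100, 1, 0), (6, 600, 100, 1, 0),
        (7, 700, 100, 3, 0), (8, 800, 25, 3, 0);
    INSERT INTO table_b VALUES
        (100, 1000), (200, 2000), (300, 1000), (400, 4000),
        (500, 1000), (600, 2000), (700, 4000), (800, 1000);
    INSERT INTO table_c VALUES
        (1000, 'aaaa'), (2000, 'bbbb'), (3000, 'cccc'), (4000, 'dddd');
""")

# One pass over the joins; each CASE yields NULL for rows outside its
# type group, and COUNT/SUM ignore NULLs. NULLIF guards the division.
combined = conn.execute("""
    SELECT c.name,
           COUNT(CASE WHEN a.type IN (2, 4, 5) THEN 1 END) AS total_type245,
           SUM(CASE WHEN a.type IN (2, 4, 5) THEN a.views END) AS total_views_type245,
           SUM(CASE WHEN a.type IN (2, 4, 5) THEN a.views END)
             / NULLIF(COUNT(CASE WHEN a.type IN (2, 4, 5) THEN 1 END), 0)
             AS average_views_type245,
           COUNT(CASE WHEN a.type IN (1, 3) THEN 1 END) AS total_type13,
           SUM(CASE WHEN a.type IN (1, 3) THEN a.views END) AS total_views_type13,
           SUM(CASE WHEN a.type IN (1, 3) THEN a.views END)
             / NULLIF(COUNT(CASE WHEN a.type IN (1, 3) THEN 1 END), 0)
             AS average_views_type13
    FROM table_a a
    JOIN table_b b ON b.id = a.b_id
    JOIN table_c c ON c.id = b.c_id
    WHERE a.status = 0 AND a.type IN (1, 2, 3, 4, 5)
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

for row in combined:
    print(row)
```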

Distinct multi-columns

For this table:
mysql> select * from work;
+------+---------+-------+
| code | surname | name |
+------+---------+-------+
| 1 | John | Smith |
| 2 | John | Smith |
+------+---------+-------+
I'd like to get the pairs of codes where the names are equal, so I do this:
select distinct A.code, B.code from work A, work B where A.name = B.name group by A.code, B.code;
However, I get the following result back:
+------+------+
| code | code |
+------+------+
| 1 | 1 |
| 1 | 2 |
| 2 | 1 |
| 2 | 2 |
+------+------+
As you can see, this result has 2 duplicates, obviously from a Cartesian product. I'd like to find out how I can do this such that it outputs only:
+------+------+
| code | code |
+------+------+
| 1 | 2 |
+------+------+
Any clue? Thanks!
This should work (assuming code is the primary key):
SELECT A.code, B.code
FROM work A, work B
WHERE A.name = B.name AND A.code < B.code
Try this:
Select A.Code, B.Code
From work a
Join work b
On A.surname = b.surname
And A.Name = B.Name
And A.Code > B.Code
You need to use A.Code > B.Code rather than != to eliminate dupes of the type
{1, 2} and {2, 1}
(If you only care about when the name is the same and not the surname, eliminate that predicate from the join condition)
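The inequality trick can be checked with a short sketch on an in-memory SQLite database via Python's sqlite3 module. The third row (Jane Doe) is a hypothetical addition, not part of the question's data, included only to show that non-matching names produce no pair. Using A.code < B.code yields the pair as (1, 2); flipping it to > yields (2, 1) instead.

```python
import sqlite3

# The first two rows are from the question; row 3 is a hypothetical extra.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE work (code INTEGER PRIMARY KEY, surname TEXT, name TEXT);
    INSERT INTO work VALUES
        (1, 'John', 'Smith'), (2, 'John', 'Smith'), (3, 'Jane', 'Doe');
""")

# A strict inequality on the key removes self-pairs such as (1, 1)
# and keeps only one ordering of each mirrored pair.
code_pairs = conn.execute("""
    SELECT A.code, B.code
    FROM work A
    JOIN work B ON A.name = B.name AND A.code < B.code
    ORDER BY A.code, B.code
""").fetchall()

print(code_pairs)
```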