SQL: CROSS JOIN over table partitions

I have the following table
session_id | page_viewed
1 | A
1 | B
1 | C
2 | B
2 | E
What I would like to do is a cross join of the page_viewed column with itself but where the cross join is done on the partitions from session_id. So, from the table above the query would return:
session_id | page_1 | page_2
1 | A | A
1 | A | B
1 | A | C
1 | B | A
1 | B | B
1 | B | C
1 | C | A
1 | C | B
1 | C | C
2 | B | B
2 | B | E
2 | E | B
2 | E | E
I have looked into window functions today trying to find a way around it but it seems join functions cannot be used. Can anyone help?

You may join giving only the session_id as the join criteria:
SELECT
t1.session_id,
t1.page_viewed AS page_1,
t2.page_viewed AS page_2
FROM yourTable t1
INNER JOIN yourTable t2
ON t1.session_id = t2.session_id;
-- ORDER BY clause optional, if you need it here

Hmmm . . . you seem to want a self-join:
select t1.session_id, t1.page_viewed as page_1, t2.page_viewed as page_2
from t t1 join
t t2
on t1.session_id = t2.session_id
order by t1.session_id, t1.page_viewed, t2.page_viewed;
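To check either self-join against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. The table name t and the in-memory database are only assumptions for the illustration; the query itself is the same join on session_id.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (session_id INTEGER, page_viewed TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "A"), (1, "B"), (1, "C"), (2, "B"), (2, "E")],
)

# Joining only on session_id pairs every page with every page
# of the same session, i.e. a cross join per partition.
pairs = conn.execute(
    """
    SELECT t1.session_id, t1.page_viewed AS page_1, t2.page_viewed AS page_2
    FROM t AS t1
    JOIN t AS t2 ON t1.session_id = t2.session_id
    ORDER BY t1.session_id, t1.page_viewed, t2.page_viewed
    """
).fetchall()

for row in pairs:
    print(row)
```

Session 1 has 3 pages and session 2 has 2, so the result has 3*3 + 2*2 = 13 rows, which matches the expected output in the question.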

Related

SQL select all rows that are not equal to an id, and replace the id column with the value - without cross join

Say I have a table like this:
+----+-------+
| id | value |
+----+-------+
| 1 | a |
| 1 | b |
| 2 | c |
| 2 | d |
| 3 | e |
| 3 | f |
+----+-------+
I want to select all rows whose id is not 1 and change their id to 1, select all rows whose id is not 2 and change their id to 2, and likewise for id 3.
Here is the output I want:
+----+-------+
| id | value |
+----+-------+
| 1 | c |
| 1 | d |
| 1 | e |
| 1 | f |
| 2 | a |
| 2 | b |
| 2 | e |
| 2 | f |
| 3 | a |
| 3 | b |
| 3 | c |
| 3 | d |
+----+-------+
The only solution I can think of is through cross join and distinct:
select distinct a.id, b.value
from table a
cross join table b
where a.id != b.id
Is there any other way that avoids such an expensive operation?
I think the typical way to write this is to generate all pairs of id and value and then remove the ones that exist:
select i.id, v.value
from (select distinct id from t) i cross join
(select distinct value from t) v left join
t
on t.id = i.id and t.value = v.value
where t.id is null;
First, I don't think this is what your query does. But this is what you seem to be describing.
From a performance perspective, you might have other sources for i and v that don't require subqueries. If so, use those for performance.
Finally, I don't think you can do much to improve the performance of this, apart from using explicit tables -- and perhaps having appropriate indexes on all the tables.
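As a quick check on the sample data, here is a sketch that runs the "all pairs minus existing pairs" query through Python's sqlite3 module and compares it with the cross join plus distinct from the question. The table name t follows the answer; everything else (in-memory database, variable names) is just scaffolding for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "a"), (1, "b"), (2, "c"), (2, "d"), (3, "e"), (3, "f")],
)

# Generate every (id, value) pair, then keep the pairs that are
# NOT present in t (left join + IS NULL, i.e. an anti-join).
missing = conn.execute(
    """
    SELECT i.id, v.value
    FROM (SELECT DISTINCT id FROM t) AS i
    CROSS JOIN (SELECT DISTINCT value FROM t) AS v
    LEFT JOIN t ON t.id = i.id AND t.value = v.value
    WHERE t.id IS NULL
    ORDER BY i.id, v.value
    """
).fetchall()

# The query from the question, for comparison.
original = conn.execute(
    """
    SELECT DISTINCT a.id, b.value
    FROM t AS a
    CROSS JOIN t AS b
    WHERE a.id != b.id
    ORDER BY a.id, b.value
    """
).fetchall()

print(missing)
print(missing == original)
```

On this sample the two queries agree, but only because no value appears under more than one id. If the same value were stored under two ids, the question's query would still pair it with one of those ids while the anti-join would not.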

Iterate over the rows of a second table to return resultset with cumulative sum

Yesterday, with the help of an SO user on my earlier question, "Iterate over the rows of a second table to return resultset", I was able to build the combinations of rows with a self-join.
After some modifications to adapt it to my implementation, I faced a new challenge that I'm stuck on: how do I compute an aggregate sum of a third column?
My issue is better explained in the image below:
Based on the code
SELECT
b1.table_a_id,
b1.label_x,
b2.label_y
FROM table_a a
INNER JOIN table_b b1
ON b1.table_a_id = a.table_a_id
INNER JOIN table_b b2
ON b2.table_a_id = b1.table_a_id AND
b2.label_y > b1.label_x
ORDER BY
b1.table_a_id,
b1.label_x,
b2.label_y;
I was able to acquire the combinations.
What should be the next step to get the cumulative sum based on a third column?
I couldn't think of a solution without using a second service, such as python with pandas, using a cumsum function.
To generate the expected resultset, you would need to join the table with itself with an inequality condition on the order column. Then, you can do a window sum:
select
t1.table_a_id,
t1.label_x,
t2.label_y,
sum(t2.value) over(
partition by t1.table_a_id, t1.label_x
order by t1."order", t2."order"
) agg_value
from
table_b t1
inner join table_b t2
on t1.table_a_id = t2.table_a_id
and t2."order" >= t1."order"
order by t1."order", t2."order"
Note: order is a reserved word, so it needs to be quoted; if your actual database column has a different name, you can remove the double quotes.
Demo on DB Fiddle:
TABLE_A_ID | LABEL_X | LABEL_Y | AGG_VALUE
---------: | :------ | :------ | --------:
1 | A | B | 1
1 | A | C | 3
1 | A | D | 6
1 | A | E | 10
1 | A | F | 15
1 | B | C | 2
1 | B | D | 5
1 | B | E | 9
1 | B | F | 14
1 | C | D | 3
1 | C | E | 7
1 | C | F | 12
1 | D | E | 4
1 | D | F | 9
1 | E | F | 5
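The fiddle data is not shown here, so the following is only a sketch under an assumption: table_b is taken to hold five rows (A,B,1), (B,C,2), (C,D,3), (D,E,4), (E,F,5) with an "order" column running 1 to 5, which is one layout that reproduces the result above. It runs the self-join plus window sum through Python's sqlite3 module (window functions need SQLite 3.25 or newer).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout of table_b; the real fiddle schema may differ.
conn.execute(
    'CREATE TABLE table_b (table_a_id INTEGER, label_x TEXT, '
    'label_y TEXT, value INTEGER, "order" INTEGER)'
)
conn.executemany(
    "INSERT INTO table_b VALUES (?, ?, ?, ?, ?)",
    [
        (1, "A", "B", 1, 1),
        (1, "B", "C", 2, 2),
        (1, "C", "D", 3, 3),
        (1, "D", "E", 4, 4),
        (1, "E", "F", 5, 5),
    ],
)

# Pair each row with itself and every later row, then take a
# running sum of the later rows' values per starting label.
rows = conn.execute(
    """
    SELECT t1.table_a_id, t1.label_x, t2.label_y,
           SUM(t2.value) OVER (
               PARTITION BY t1.table_a_id, t1.label_x
               ORDER BY t2."order"
           ) AS agg_value
    FROM table_b AS t1
    JOIN table_b AS t2
      ON t1.table_a_id = t2.table_a_id
     AND t2."order" >= t1."order"
    ORDER BY t1."order", t2."order"
    """
).fetchall()

for row in rows:
    print(row)
```

Under that assumed data, the rows for label A carry the running totals 1, 3, 6, 10, 15, in line with the result table.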
You seem to want a cumulative sum:
SELECT b1.table_a_id, b1.label_x, b2.label_y,
SUM(b1.value) OVER (PARTITION BY b1.table_a_id, b1.label_x
ORDER BY b2.order
) as AGG_VALUE

SQL JOIN two table & show all rows for table A

I have a question about JOIN.
TABLE A     | TABLE B
------------|------------------
PK | div    | PK | div | val
------------|------------------
A  | a      | 1  | a   | 10
B  | b      | 2  | a   | 100
C  | c      | 3  | c   | 9
            | 4  | c   | 99
There are two tables something like above, and I have been trying to join two tables but I want to see all rows from TABLE A.
Something like
SELECT T1.PK, T1.div, T2.val
FROM A T1
LEFT OUTER JOIN B T2
ON T1.div = T2.div
and I want the result to look like this:
PK | div | val |
-------------------------
A | a | 10 |
A | a | 100 |
B | null | null |
C | c | 9 |
C | c | 99 |
I have tried all the JOINs I know, but B doesn't appear because it has no match. Is it possible to show all rows of TABLE A and just show null where no match exists in TABLE B?
Thanks in advance!
If you change your query to
SELECT T1.PK, T2.div, T2.val
FROM A T1
LEFT OUTER JOIN B T2
ON T1.div = T2.div
(note that div comes from T2 here), you'll get exactly the result you posted, though possibly in a different order; add an ORDER BY clause if you need a specific one.
Your query as it stands will get you:
PK | div | val |
-------------------------
A | a | 10 |
A | a | 100 |
B | b | null |
C | c | 9 |
C | c | 99 |
(Note that div is b for the row with the PK of B, not null.)
To get to your resultset, all you need to do is use T2.Div as that is the value that does not exist in the second table:
SELECT T1.PK, T2.div, T2.val
FROM A T1
LEFT OUTER JOIN B T2
ON T1.div = T2.div
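To see the difference between T1.div and T2.div side by side, here is a small sketch that loads the two sample tables into an in-memory SQLite database via Python's sqlite3 module and selects both columns. The ORDER BY is added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (PK TEXT, div TEXT)")
conn.execute("CREATE TABLE B (PK INTEGER, div TEXT, val INTEGER)")
conn.executemany("INSERT INTO A VALUES (?, ?)",
                 [("A", "a"), ("B", "b"), ("C", "c")])
conn.executemany("INSERT INTO B VALUES (?, ?, ?)",
                 [(1, "a", 10), (2, "a", 100), (3, "c", 9), (4, "c", 99)])

# LEFT JOIN keeps every row of A; columns taken from B are NULL
# (None in Python) when there is no matching div.
result = conn.execute(
    """
    SELECT T1.PK, T1.div AS a_div, T2.div AS b_div, T2.val
    FROM A AS T1
    LEFT OUTER JOIN B AS T2 ON T1.div = T2.div
    ORDER BY T1.PK, T2.val
    """
).fetchall()

for row in result:
    print(row)
```

The row for PK B is present in both variants; only the div taken from T2 is null for it.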

Postgres group by columns and within group select other columns by max aggregate

This is probably a standard problem, and I've keyed off some other greatest-n-per-group answers, but so far been unable to resolve my current problem.
A                 B                C
+----+-------+    +----+------+    +----+------+-------+
| id | start |    | id | a_id |    | id | b_id | name  |
+----+-------+    +----+------+    +----+------+-------+
|  1 |     1 |    |  1 |    1 |    |  1 |    1 | aname |
|  2 |     2 |    |  2 |    1 |    |  2 |    2 | aname |
+----+-------+    |  3 |    2 |    |  3 |    3 | aname |
                  +----+------+    |  4 |    3 | bname |
                                   +----+------+-------+
In English what I'd like to accomplish is:
For each c.name, select its newest entry based on the start time in a.start
The SQL I've tried is the following:
SELECT a.id, a.start, c.id, c.name
FROM a
INNER JOIN (
SELECT id, MAX(start) as start
FROM a
GROUP BY id
) a2 ON a.id = a2.id AND a.start = a2.start
JOIN b
ON a.id = b.a_id
JOIN c
on b.id = c.b_id
GROUP BY c.name;
It fails with errors such as:
ERROR: column "a.id" must appear in the GROUP BY clause or be used in an aggregate function Position: 8
To be useful I really need the ids from the query, but cannot group on them since they are unique. Here is an example of output I'd love for the first case above:
+------+---------+------+--------+
| a.id | a.start | c.id | c.name |
+------+---------+------+--------+
| 2 | 2 | 3 | aname |
| 2 | 2 | 4 | bname |
+------+---------+------+--------+
Here is a Sqlfiddle
Edit - removed second case
Case 1
select distinct on (c.name)
a.id, a.start, c.id, c.name
from
a
inner join
b on a.id = b.a_id
inner join
c on b.id = c.b_id
order by c.name, a.start desc
;
id | start | id | name
----+-------+----+-------
2 | 2 | 3 | aname
2 | 2 | 4 | bname
Case 2
select distinct on (c.name)
a.id, a.start, c.id, c.name
from
a
inner join
b on a.id = b.a_id
inner join
c on b.id = c.b_id
where
b.a_id in (
select a_id
from b
group by a_id
having count(*) > 1
)
order by c.name, a.start desc
;
id | start | id | name
----+-------+----+-------
1 | 1 | 1 | aname
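DISTINCT ON is specific to Postgres. For anyone who wants to try Case 1 elsewhere, here is a sketch of a different but equivalent technique, ROW_NUMBER() over a partition by name, run on the sample data through Python's sqlite3 module (SQLite 3.25 or newer). The aliases a_id, c_id, rn and ranked are made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER, start INTEGER)")
conn.execute("CREATE TABLE b (id INTEGER, a_id INTEGER)")
conn.execute("CREATE TABLE c (id INTEGER, b_id INTEGER, name TEXT)")
conn.executemany("INSERT INTO a VALUES (?, ?)", [(1, 1), (2, 2)])
conn.executemany("INSERT INTO b VALUES (?, ?)", [(1, 1), (2, 1), (3, 2)])
conn.executemany(
    "INSERT INTO c VALUES (?, ?, ?)",
    [(1, 1, "aname"), (2, 2, "aname"), (3, 3, "aname"), (4, 3, "bname")],
)

# Number the joined rows per name, newest start first, and keep
# only the first row of each name.
newest = conn.execute(
    """
    SELECT a_id, start, c_id, name
    FROM (
        SELECT a.id AS a_id, a.start AS start, c.id AS c_id, c.name AS name,
               ROW_NUMBER() OVER (
                   PARTITION BY c.name ORDER BY a.start DESC
               ) AS rn
        FROM a
        JOIN b ON a.id = b.a_id
        JOIN c ON b.id = c.b_id
    ) AS ranked
    WHERE rn = 1
    ORDER BY name
    """
).fetchall()

print(newest)
```

As with DISTINCT ON, if two rows of the same name tie on start, which one is kept is arbitrary unless you add a tie-breaker to the ORDER BY.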

Single row for multiple case in a select

I need a solution for the scenario below.
I have a table temp with columns: a, b, c, d and the data looks like this:
TABLE TEMP
+---+----+----+----+
|a | b | c | d |
+===+====+====+====+
| 1 | 1 | 1 | m |
| 1 | 2 | 1 | d |
| 1 | 3 | 1 | w |
| 2 | 1 | 1 | m |
| 2 | 2 | 1 | d |
| 2 | 2 | 1 | w |
+---+----+----+----+
QUERY
SELECT CASE WHEN B=1 AND C=1 THEN D END as T1,
CASE WHEN B=2 AND C=1 THEN D END as T2,
CASE WHEN B=3 AND C=1 THEN D END as T3
FROM TEMP
WHERE A=1
The above query returns multiple rows, with nulls wherever a value is not present.
I need a result set with a single row that looks like this:
Expected Result
+------+-------+------+
| T1 | T2 | T3 |
+======+=======+======+
| m | d | w |
+------+-------+------+
You can do it like this, using CTEs:
QUERY
WITH
CTE1 as (select top 1 d as T1 from temp where b=1 and c=1),
CTE2 as (select top 1 d as T2 from temp where b=2 and c=1),
CTE3 as (select top 1 d as T3 from temp where b=3 and c=1)
select CTE1.*, CTE2.*, CTE3.*
FROM CTE1 CROSS JOIN CTE2 CROSS JOIN CTE3
Please let me know whether it works!
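For comparison, a different technique that is often used for this kind of pivot is conditional aggregation: wrap each CASE from the question in MAX() so the rows collapse into one. This is not the CTE approach above, just an alternative sketch, run here through Python's sqlite3 module. The table is named temp_rows in the sketch (a made-up name) to avoid clashing with SQLite's TEMP keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# temp_rows stands in for the question's TEMP table.
conn.execute("CREATE TABLE temp_rows (a INTEGER, b INTEGER, c INTEGER, d TEXT)")
conn.executemany(
    "INSERT INTO temp_rows VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, "m"), (1, 2, 1, "d"), (1, 3, 1, "w"),
        (2, 1, 1, "m"), (2, 2, 1, "d"), (2, 2, 1, "w"),
    ],
)

# MAX() ignores NULLs, so each column keeps the one non-null value
# produced by its CASE expression and the result is a single row.
single_row = conn.execute(
    """
    SELECT MAX(CASE WHEN b = 1 AND c = 1 THEN d END) AS T1,
           MAX(CASE WHEN b = 2 AND c = 1 THEN d END) AS T2,
           MAX(CASE WHEN b = 3 AND c = 1 THEN d END) AS T3
    FROM temp_rows
    WHERE a = 1
    """
).fetchone()

print(single_row)
```

Because the WHERE a = 1 filter from the question is kept, only the rows for a = 1 feed the aggregates.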