Postgres SQL query to get the first row of distinct id

channels table
id | name
------------
1 | ABC
2 | XYZ
3 | MNO
4 | ASD
user_channels table
user_id | channel_id
----------------------
555 | 1
666 | 1
777 | 1
555 | 2
888 | 2
999 | 3
555 | 3
user_chats table
id | created_at | channel_id | content
---------------------------------------
2 | time 1 | 1 | Hello
3 | time 2 | 1 | Hi
4 | time 3 | 2 | Good day
5 | time 4 | 2 | Morning
I have these 3 tables in Postgres.
I want to write a SQL query that returns a user's channels (from user_channels, by user_id) together with only the latest message in each channel from the user_chats table (time 1 is the oldest message). How can I do that?
For example, for user_id = 555, the query should return
channel_id | content | created_at
---------------------------------------
1 | Hi | time 2
2 | Morning | time 4
3 | Null | Null

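Use distinct on. Note that the inner join below returns only channels that have at least one message, so channel 3 is dropped; a left join starting from user_channels keeps it with NULLs: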
select distinct on (a.channel_id) a.*
from user_chats a
inner join user_channels l on l.channel_id = a.channel_id
where l.user_id = 555
order by a.channel_id, a.created_at desc
If you want this for all users at once:
select distinct on (l.user_id, a.channel_id) l.user_id, a.*
from user_chats a
inner join user_channels l on l.channel_id = a.channel_id
order by l.user_id, a.channel_id, a.created_at desc

You can use distinct on:
select distinct on (c.channel_id) c.channel_id, uc.content, uc.created_at
from user_channels c left join
user_chats uc
on uc.channel_id = c.channel_id
where c.user_id = ?
order by c.channel_id, uc.created_at desc;
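For readers who want to try the left-join approach on the sample data without a Postgres server, here is a minimal sketch using Python's built-in sqlite3 module. SQLite has no distinct on, so this swaps in row_number() over a partition, which picks the same "latest row per channel". The in-memory database, the text timestamps ('time 1' and so on) and the variable names are assumptions made for illustration only.

```python
import sqlite3

# In-memory SQLite database standing in for Postgres (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_channels (user_id INTEGER, channel_id INTEGER);
CREATE TABLE user_chats (id INTEGER, created_at TEXT, channel_id INTEGER, content TEXT);
INSERT INTO user_channels VALUES
  (555,1),(666,1),(777,1),(555,2),(888,2),(999,3),(555,3);
INSERT INTO user_chats VALUES
  (2,'time 1',1,'Hello'),(3,'time 2',1,'Hi'),
  (4,'time 3',2,'Good day'),(5,'time 4',2,'Morning');
""")

# row_number() = 1 per channel plays the role of Postgres' "distinct on (channel_id)".
# The left join keeps channels that have no messages at all.
latest_per_channel = """
SELECT channel_id, content, created_at
FROM (
  SELECT c.channel_id, uc.content, uc.created_at,
         ROW_NUMBER() OVER (PARTITION BY c.channel_id
                            ORDER BY uc.created_at DESC) AS rn
  FROM user_channels c
  LEFT JOIN user_chats uc ON uc.channel_id = c.channel_id
  WHERE c.user_id = ?
) ranked
WHERE rn = 1
ORDER BY channel_id
"""

rows = conn.execute(latest_per_channel, (555,)).fetchall()
for row in rows:
    print(row)
```

Real timestamps would sort chronologically on their own; the placeholder strings here only sort correctly because they differ in a single digit.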

Related

Iterating multiple times over same table (postgres sql)

I have been working with SQL for a few days as a beginner and am stuck on a problem.
Problem:
Given a table user_email_table with columns (id, user_id, email_id),
how do I pair every user with every other user, excluding himself?
user_email_table
id | user_id | email_id
1 | 1 | xyz
2 | 2 | xyz2
3 | 3 | xyz3
4 | 4 | xyz4
Desired Output
id | user_id_1 | user_id_2 | user_id_1_email | user_id_2_email
---------------------------------------------------------------
1 | 1 | 2 | xyz | xyz2
1 | 1 | 3 | xyz | xyz3
1 | 1 | 4 | xyz | xyz4
1 | 2 | 1 | xyz2 | xyz1
1 | 2 | 3 | xyz2 | xyz3
1 | 2 | 4 | xyz2 | xyz4
1 | 3 | 1 | xyz3 | xyz1
1 | 3 | 2 | xyz3 | xyz2
1 | 3 | 4 | xyz3 | xyz4
1 | 4 | 1 | xy4 | xyz1
1 | 4 | 2 | xyz4 | xyz2
1 | 4 | 3 | xyz4 | xyz3
Please ignore the validity of the data in the email fields; they are only there as a reminder to include these columns in the output table.
What SQL query would produce this table?
You want a "cross join" of the table against itself excluding the self-matching rows. You can do:
select
1 as id,
a.user_id as user_id_1,
b.user_id as user_id_2,
a.email_id as user_id_1_email,
b.email_id as user_id_2_email
from user_email_table a
cross join user_email_table b
where a.user_id <> b.user_id
EDIT:
As @IMSoP points out, the query can also use an ordinary join with a join predicate, and can be rephrased as:
select
1 as id,
a.user_id as user_id_1,
b.user_id as user_id_2,
a.email_id as user_id_1_email,
b.email_id as user_id_2_email
from user_email_table a
join user_email_table b on a.user_id <> b.user_id
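A quick way to check the self-join is to run it against the four sample rows. The sketch below does that with Python's built-in sqlite3 module as a stand-in database (an assumption; the query itself is plain SQL and not specific to any engine). With 4 users it should yield 4 x 3 = 12 ordered pairs.

```python
import sqlite3

# In-memory SQLite database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_email_table (id INTEGER, user_id INTEGER, email_id TEXT);
INSERT INTO user_email_table VALUES
  (1,1,'xyz'),(2,2,'xyz2'),(3,3,'xyz3'),(4,4,'xyz4');
""")

# Join the table to itself and drop the rows where a user meets himself.
pairs = conn.execute("""
SELECT 1 AS id,
       a.user_id  AS user_id_1,
       b.user_id  AS user_id_2,
       a.email_id AS user_id_1_email,
       b.email_id AS user_id_2_email
FROM user_email_table a
JOIN user_email_table b ON a.user_id <> b.user_id
ORDER BY a.user_id, b.user_id
""").fetchall()

for pair in pairs:
    print(pair)
```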

Postgres - Unique values for id column using CTE, Joins alongside GROUP BY

I have a table referrals:
id | user_id_owner | firstname | is_active | user_type | referred_at
----+---------------+-----------+-----------+-----------+-------------
3 | 2 | c | t | agent | 3
5 | 3 | e | f | customer | 5
4 | 1 | d | t | agent | 4
2 | 1 | b | f | agent | 2
1 | 1 | a | t | agent | 1
And another table activations
id | user_id_owner | referral_id | amount_earned | activated_at | app_id
----+---------------+-------------+---------------+--------------+--------
2 | 2 | 3 | 3.0 | 3 | a
4 | 1 | 1 | 6.0 | 5 | b
5 | 4 | 4 | 3.0 | 6 | c
1 | 1 | 2 | 2.0 | 2 | b
3 | 1 | 2 | 5.0 | 4 | b
6 | 1 | 2 | 7.0 | 8 | a
I am trying to generate another table from the two tables that has only unique values for referrals.id and returns, as one of the columns, the count for each app as best_selling_app_count.
Here is the query I ran:
with agents
as
(select
referrals.id,
referral_id,
amount_earned,
referred_at,
activated_at,
activations.app_id
from referrals
left outer join activations
on (referrals.id = activations.referral_id)
where referrals.user_id_owner = 1),
distinct_referrals_by_id
as
(select
id,
count(referral_id) as activations_count,
sum(coalesce(amount_earned, 0)) as amount_earned,
referred_at,
max(activated_at) as last_activated_at
from
agents
group by id, referred_at),
distinct_referrals_by_app_id
as
(select id, app_id as best_selling_app,
count(app_id) as best_selling_app_count
from agents
group by id, app_id )
select *, dense_rank() over (order by best_selling_app_count desc) best_selling_app_rank
from distinct_referrals_by_id
inner join distinct_referrals_by_app_id
on (distinct_referrals_by_id.id = distinct_referrals_by_app_id.id);
Here is the result I got:
id | activations_count | amount_earned | referred_at | last_activated_at | id | best_selling_app | best_selling_app_count | best_selling_app_rank
----+-------------------+---------------+-------------+-------------------+----+------------------+------------------------+-----------------------
2 | 3 | 14.0 | 2 | 8 | 2 | b | 2 | 1
1 | 1 | 6.0 | 1 | 5 | 1 | b | 1 | 2
2 | 3 | 14.0 | 2 | 8 | 2 | a | 1 | 2
4 | 1 | 3.0 | 4 | 6 | 4 | c | 1 | 2
The problem with this result is that the table has a duplicate id of 2. I only need unique values for the id column.
I tried a workaround using distinct on, which gave the desired result, but I fear the query results may not be reliable and consistent.
Here is the workaround query:
with agents
as
(select
referrals.id,
referral_id,
amount_earned,
referred_at,
activated_at,
activations.app_id
from referrals
left outer join activations
on (referrals.id = activations.referral_id)
where referrals.user_id_owner = 1),
distinct_referrals_by_id
as
(select
id,
count(referral_id) as activations_count,
sum(coalesce(amount_earned, 0)) as amount_earned,
referred_at,
max(activated_at) as last_activated_at
from
agents
group by id, referred_at),
distinct_referrals_by_app_id
as
(select
distinct on(id) id, app_id as best_selling_app,
count(app_id) as best_selling_app_count
from agents
group by id, app_id
order by id, best_selling_app_count desc)
select *, dense_rank() over (order by best_selling_app_count desc) best_selling_app_rank
from distinct_referrals_by_id
inner join distinct_referrals_by_app_id
on (distinct_referrals_by_id.id = distinct_referrals_by_app_id.id);
I need a recommendation on how best to achieve this.
"I am trying to generate another table from the two tables that has only unique values for referrals.id and returns as one of the columns the count for each apps as best_selling_app_count."
Your question is really complicated with a very complicated SQL query. However, the above is what looks like the actual question. If so, you can use:
select r.*,
a.app_id as most_common_app_id,
a.cnt as most_common_app_id_count
from referrals r left join
(select distinct on (a.referral_id) a.referral_id, a.app_id, count(*) as cnt
from activations a
group by a.referral_id, a.app_id
order by a.referral_id, count(*) desc
) a
on a.referral_id = r.id;
You have not explained the other columns that are in your result set.
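On the worry about reliability: distinct on (or any "top row per group" pick) is only deterministic if the ordering breaks ties. The sketch below reproduces the idea on the sample data with Python's built-in sqlite3 module, using row_number() in place of distinct on because SQLite does not have it. The tie-breaker on app_id, the CTE names and the in-memory database are assumptions added for illustration, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE referrals (id INTEGER, user_id_owner INTEGER, firstname TEXT);
CREATE TABLE activations (id INTEGER, user_id_owner INTEGER, referral_id INTEGER,
                          amount_earned REAL, activated_at INTEGER, app_id TEXT);
INSERT INTO referrals VALUES (3,2,'c'),(5,3,'e'),(4,1,'d'),(2,1,'b'),(1,1,'a');
INSERT INTO activations VALUES
  (2,2,3,3.0,3,'a'),(4,1,1,6.0,5,'b'),(5,4,4,3.0,6,'c'),
  (1,1,2,2.0,2,'b'),(3,1,2,5.0,4,'b'),(6,1,2,7.0,8,'a');
""")

most_common_app = conn.execute("""
WITH counts AS (            -- activations per (referral, app)
  SELECT referral_id, app_id, COUNT(*) AS cnt
  FROM activations
  GROUP BY referral_id, app_id
),
ranked AS (                 -- rank apps within each referral; app_id breaks ties
  SELECT referral_id, app_id, cnt,
         ROW_NUMBER() OVER (PARTITION BY referral_id
                            ORDER BY cnt DESC, app_id) AS rn
  FROM counts
)
SELECT r.id, k.app_id, k.cnt
FROM referrals r
LEFT JOIN ranked k ON k.referral_id = r.id AND k.rn = 1
ORDER BY r.id
""").fetchall()

for row in most_common_app:
    print(row)
```

Each referral id appears exactly once; referral 5 has no activations, so its app columns come back as NULL.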

Group By with MAX value from another column

Table FieldStudies is :
ID Name
---|-----------------------|
1 | Industrial Engineering|
2 | Civil Engineering |
3 | Architecture |
4 | Chemistry |
And table Eductionals is :
ID UserID Degree FieldStudy_ID
---|------|--------|------------|
1 | 100 | 3 | 4 |
2 | 101 | 2 | 2 |
3 | 101 | 3 | 2 |
4 | 101 | 4 | 3 |
5 | 103 | 3 | 4 |
6 | 103 | 4 | 2 |
I want to find the number of students in each field of study, counting each student only under the field of their highest Degree.
Output desired:
ID Name Count
---|-----------------------|--------|
1 | Industrial Engineering| 0 |
2 | Civil Engineering | 0 |
3 | Architecture | 1 |
4 | Chemistry | 2 |
I have tried:
select Temptable2.* , count(*) As CountField from
(select fs.*
from FieldStudies fs
left outer join
(select e.UserID , Max(e.Degree) As ID_Degree , e.FieldStudy_ID
from Eductionals e
group by e.UserID) Temptable
ON fs.ID = Temptable.FieldStudy_ID) Temptable2
group by Temptable2.ID
But I get the following error :
Column 'Eductionals.FieldStudy_ID' is invalid in the select list
because it is not contained in either an aggregate function or the
GROUP BY clause.
If I understand correctly, you want only the highest degree for each person. If so, you can use row_number() to whittle down the multiple rows for a given person and the rest is aggregation and join:
select fs.id, fs.Name, count(e.id)
from fieldstudies fs left join
(select e.*,
row_number() over (partition by userid order by degree desc) as seqnum
from Eductionals e
) e
on e.FieldStudy_ID = fs.id and seqnum = 1
group by fs.id, fs.Name
order by fs.id;
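The row_number() approach can be tried on the posted rows with Python's built-in sqlite3 module (an in-memory stand-in database, assumed here only so the sketch is runnable). The table name Eductionals follows the spelling in the question. On the rows exactly as posted, user 103's highest degree is in field 2, so the query counts one student each for Civil Engineering, Architecture and Chemistry, which differs from the desired output listed in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE FieldStudies (ID INTEGER, Name TEXT);
CREATE TABLE Eductionals (ID INTEGER, UserID INTEGER, Degree INTEGER, FieldStudy_ID INTEGER);
INSERT INTO FieldStudies VALUES
  (1,'Industrial Engineering'),(2,'Civil Engineering'),(3,'Architecture'),(4,'Chemistry');
INSERT INTO Eductionals VALUES
  (1,100,3,4),(2,101,2,2),(3,101,3,2),(4,101,4,3),(5,103,3,4),(6,103,4,2);
""")

# seqnum = 1 marks each user's highest-degree row; only those rows are joined and counted.
counts = conn.execute("""
SELECT fs.ID, fs.Name, COUNT(e.ID) AS student_count
FROM FieldStudies fs
LEFT JOIN (
  SELECT e.*,
         ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY Degree DESC) AS seqnum
  FROM Eductionals e
) e ON e.FieldStudy_ID = fs.ID AND e.seqnum = 1
GROUP BY fs.ID, fs.Name
ORDER BY fs.ID
""").fetchall()

for row in counts:
    print(row)
```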

joining more than two tables without repeating values

I want to join three tables,
I have three tables user, profession and education where "uid" is primary key for user table and foreign key for other two tables. I want to join these tables to produce result in one single table
user profession education
+------+-------+ +-----+----------+ +-----+---------+
| uid | uName | | uid | profName | | uid | eduName |
+------+-------+ +-----+----------+ +-----+---------+
| 1 | aaa | | 1 | prof1 | | 1 | edu1 |
| 2 | bbb | | 1 | prof2 | | 1 | edu2 |
| 3 | ccc | | 2 | prof1 | | 1 | edu3 |
| | | | 3 | prof3 | | 3 | edu4 |
| | | | 3 | prof2 | | | |
+------+-------+ +-----+----------+ +-----+---------+
Expected output
+------+-------+-----+----------+-----+---------+
| uid | uName | uid | profName | uid | eduName |
+------+-------+-----+----------+-----+---------+
| 1 | aaa | 1 | prof1 | 1 | edu1 |
| null | null | 1 | prof2 | 1 | edu2 |
| null | null |null | null | 1 | edu3 |
| 2 | bbb | 2 | prof1 | null| null |
| 3 | ccc | 3 | prof3 | 3 | edu4 |
| null | null | 3 | prof2 | null| null |
+------+-------+-----+----------+-----+---------+
I tried following query
select u.uid ,u.uName,p.uid , p.profName,e.uid,e.eduName
from user u inner join profession p on u.uid=p.uid
inner join education e on u.uid = e.uid
where u.uid=p.uid
and u.uid=e.uid
and u.uid=1
Which gives me duplicate values
+------+-------+-----+----------+-----+---------+
| uid | uName | uid | profName | uid | eduName |
+------+-------+-----+----------+-----+---------+
| 1 | aaa | 1 | prof1 | 1 | edu1 |
| 1 | aaa | 1 | prof2 | 1 | edu1 |
| 1 | aaa | 1 | prof1 | 1 | edu2 |
| 1 | aaa | 1 | prof2 | 1 | edu2 |
| 1 | aaa | 1 | prof1 | 1 | edu3 |
| 1 | aaa | 1 | prof2 | 1 | edu3 |
+------+-------+-----+----------+-----+---------+
Is there a way to get the output without repeating the values?
Thanks
Bit of a swine this one.
I agree with @GordonLinoff that ideally this presentation would be done on the client side.
However, if we wish to do it in SQL, then the basic approach is that you have to get the maximum number of rows that will be consumed by each user (based on a count of how many entries they have in each of the professions and educations tables, and then of these counts, the max count).
Once we have the number of rows required for each user, we expand the rows out for each user as necessary using a numbers table (I've included a number generator for the purpose).
Then we join each table on, according to the uid and the row number of the entry in the joined table relative to the row number of the "expanded" rows for each user. Then we select the relevant columns, and that's us done. Pay the nurse on the way out!
WITH
number_table(number) AS
(
SELECT
(ones.n) + (10 * tens.n) + (100 * hundreds.n) AS number
FROM --available range 0 to 999
(VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS ones(n)
,(VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS tens(n)
,(VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS hundreds(n)
)
,users(u_uid, userName) AS
(
SELECT 1, 'aaa'
UNION ALL
SELECT 2, 'bbb'
UNION ALL
SELECT 3, 'ccc'
)
,professions(p_u_uid, profName) AS
(
SELECT 1, 'prof1'
UNION ALL
SELECT 1, 'prof2'
UNION ALL
SELECT 2, 'prof1'
UNION ALL
SELECT 3, 'prof3'
UNION ALL
SELECT 3, 'prof2'
)
,educations(e_u_uid, eduName) AS
(
SELECT 1, 'edu1'
UNION ALL
SELECT 1, 'edu2'
UNION ALL
SELECT 1, 'edu3'
UNION ALL
SELECT 3, 'edu4'
)
,row_counts(uid, row_count) AS
(
SELECT u_uid, COUNT(u_uid) FROM users GROUP BY u_uid
UNION ALL
SELECT p_u_uid, COUNT(p_u_uid) FROM professions GROUP BY p_u_uid
UNION ALL
SELECT e_u_uid, COUNT(e_u_uid) FROM educations GROUP BY e_u_uid
)
,max_counts(uid, max_count) AS
(
SELECT uid, MAX(row_count) FROM row_counts GROUP BY uid
)
SELECT
u_uid
,userName
,p_u_uid
,profName
,e_u_uid
,eduName
FROM
max_counts
INNER JOIN
number_table ON number BETWEEN 1 AND max_count
LEFT JOIN
(
SELECT u_uid, userName, ROW_NUMBER() OVER (PARTITION BY u_uid ORDER BY userName) AS user_match
FROM users
) AS users
ON u_uid = uid
AND number = user_match
LEFT JOIN
(
SELECT p_u_uid, profName, ROW_NUMBER() OVER (PARTITION BY p_u_uid ORDER BY profName) AS prof_match
FROM professions
) AS professions
ON p_u_uid = uid
AND number = prof_match
LEFT JOIN
(
SELECT e_u_uid, eduName, ROW_NUMBER() OVER (PARTITION BY e_u_uid ORDER BY eduName) AS edu_match
FROM educations
) AS educations
ON e_u_uid = uid
AND number = edu_match
ORDER BY
IIF(COALESCE(u_uid, p_u_uid, e_u_uid) IS NULL, 1, 0) ASC --nulls last
,COALESCE(u_uid, p_u_uid, e_u_uid) ASC
,IIF(COALESCE(p_u_uid, e_u_uid) IS NULL, 1, 0) ASC --nulls last
,COALESCE(p_u_uid, e_u_uid) ASC
,IIF(e_u_uid IS NULL, 1, 0) ASC --nulls last
,e_u_uid ASC
And the results:
u_uid userName p_u_uid profName e_u_uid eduName
----------- -------- ----------- -------- ----------- -------
1 aaa 1 prof1 1 edu1
NULL NULL 1 prof2 1 edu2
NULL NULL NULL NULL 1 edu3
2 bbb 2 prof1 NULL NULL
3 ccc 3 prof2 3 edu4
NULL NULL 3 prof3 NULL NULL
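Since the answer notes that this layout is ideally produced on the client side, here is a minimal sketch of what that could look like in Python. It assumes the three tables have already been fetched as lists of tuples; the function name align_rows and the variable names are hypothetical. itertools.zip_longest pads the shorter columns with NULL-like placeholders, which is the same "expand to the longest list per user" idea as the numbers-table query.

```python
from itertools import zip_longest

# Rows as they might come back from three separate, simple SELECTs (sample data).
users = [(1, "aaa"), (2, "bbb"), (3, "ccc")]
professions = [(1, "prof1"), (1, "prof2"), (2, "prof1"), (3, "prof3"), (3, "prof2")]
educations = [(1, "edu1"), (1, "edu2"), (1, "edu3"), (3, "edu4")]


def align_rows(users, professions, educations):
    """Lay out each user's professions and educations side by side, padding with None."""
    rows = []
    for uid, name in users:
        user_col = [(uid, name)]  # the user row appears once, on the first line
        prof_col = [(u, p) for u, p in professions if u == uid]
        edu_col = [(u, e) for u, e in educations if u == uid]
        # zip_longest runs to the longest column and fills the gaps with (None, None).
        for u, p, e in zip_longest(user_col, prof_col, edu_col, fillvalue=(None, None)):
            rows.append(u + p + e)
    return rows


aligned = align_rows(users, professions, educations)
for row in aligned:
    print(row)
```

This version keeps professions and educations in input order rather than sorting them by name, so user 3 lists prof3 before prof2, as in the expected output of the question.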
Did you try the distinct keyword?
select DISTINCT u.uid ,u.uName,p.uid , p.profName,e.uid,e.eduName
from user u inner join profession p on u.uid=p.uid
inner join education e on u.uid = e.uid
where u.uid=p.uid
and u.uid=e.uid
and u.uid=1

SQL Sum Columns

I want to get the following output:
Main table:
Email | Group | id
a@gmail.com | Y | 1
a@gmail.com | Y | 2
b@gmail.com | N | 3
c@gmail.com | N | 4
Join Table:
Email | Value
a@gmail.com | 10
b@gmail.com | 20
c@gmail.com | 30
Desired result (only take the a@gmail.com value once, despite appearing in the first table twice):
Group | Email Count | Sum
Y | 1 | 10
N | 2 | 50
Here is the sqlfiddle I've been playing around with:
http://sqlfiddle.com/#!9/c2a24d/8
You were close in your SQLFiddle. You just needed to join on a distinct select.
SELECT
e.Unsub as Unsub,
count(e.email) as EmailCount,
sum(c.sum) as EmailSum
FROM CountTable c
JOIN (select distinct email, Unsub from EmailsTable) e on c.email = e.email
GROUP BY e.unsub
First remove the duplicates, and then do the calculations
SELECT filter.`Unsub`, COUNT(*), SUM(`sum`)
FROM (
SELECT DISTINCT `Unsub`, `email`
FROM EmailsTable ) as filter
JOIN CountTable
ON filter.`email` = CountTable.`email`
GROUP BY filter.`Unsub`
OUTPUT
| Unsub | COUNT(*) | SUM(`sum`) |
|-------|----------|------------|
| N | 2 | 50 |
| Y | 1 | 10 |
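The "deduplicate first, then join and aggregate" idea can be checked end to end with Python's built-in sqlite3 module. The table and column names (EmailsTable, CountTable, Unsub, sum) follow the answers; the in-memory SQLite database and the variable names are assumptions for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmailsTable (email TEXT, Unsub TEXT, id INTEGER);
CREATE TABLE CountTable (email TEXT, "sum" INTEGER);
INSERT INTO EmailsTable VALUES
  ('a@gmail.com','Y',1),('a@gmail.com','Y',2),
  ('b@gmail.com','N',3),('c@gmail.com','N',4);
INSERT INTO CountTable VALUES
  ('a@gmail.com',10),('b@gmail.com',20),('c@gmail.com',30);
""")

# The DISTINCT subquery collapses the duplicate a@gmail.com row before the join,
# so its value is summed only once.
totals = conn.execute("""
SELECT e.Unsub, COUNT(e.email) AS EmailCount, SUM(c."sum") AS EmailSum
FROM CountTable c
JOIN (SELECT DISTINCT email, Unsub FROM EmailsTable) e ON c.email = e.email
GROUP BY e.Unsub
ORDER BY e.Unsub
""").fetchall()

for row in totals:
    print(row)
```

Joining the raw EmailsTable instead of the distinct subquery would count a@gmail.com twice and report a sum of 20 for group Y.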