Postgres, groupBy and count for table and relations at the same time - sql

I have a table called 'users' that has the following structure:
id (PK) | campaign_id | createdAt
--------|-------------|-------------------------
1 | 123 | 2022-07-14T10:30:01.967Z
2 | 1234 | 2022-07-14T10:30:01.967Z
3 | 123 | 2022-07-14T10:30:01.967Z
4 | 123 | 2022-07-14T10:30:01.967Z
At the same time I have a table that tracks clicks per user:
id (PK) | user_id (FK) | createdAt
--------|--------------|-------------------------
1 | 1 | 2022-07-14T10:30:01.967Z
2 | 2 | 2022-07-14T10:30:01.967Z
3 | 2 | 2022-07-14T10:30:01.967Z
4 | 2 | 2022-07-14T10:30:01.967Z
Both of these tables hold up to millions of records, so I need the most efficient query to group the data per campaign_id.
The result I am looking for would look like this:
campaign_id | total_users | total_clicks
------------|-------------|-------------
123 | 3 | 1
1234 | 1 | 3
Unfortunately, I have no idea how to achieve this while keeping performance in mind. Most importantly, I need to use WHERE or HAVING to limit the query to a certain time range by createdAt.

Note: PostgreSQL is not my forte, nor is SQL in general, but I'm learning by spending some time on your question. Have a go with an INNER JOIN between two separate SELECT statements:
SELECT * FROM
(
SELECT campaign_id, COUNT (t1."id(PK)") total_users FROM t1 GROUP BY campaign_id
) tbl1
INNER JOIN
(
SELECT campaign_id, COUNT (t2."user_id(FK)") total_clicks FROM t2 INNER JOIN t1 ON t1."id(PK)" = t2."user_id(FK)" GROUP BY campaign_id
) tbl2
USING(campaign_id)
See an online fiddle. I believe this is now also ready for a WHERE clause in both SELECT statements to filter by "createdAt". I'm pretty sure someone else will come up with something better.
Good luck.

Hope this will help you.
select u.campaign_id,
count(distinct u.id) users_count,
count(c.user_id) clicks_count
from
users u left join clicks c on u.id=c.user_id
group by 1;
See here query output
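Neither query above includes the createdAt filter the question asks for. Here is a minimal sketch of one way to add it: count the clicks per user in a subquery first, then join that to users and group by campaign_id. It runs against an in-memory SQLite database through Python's sqlite3 module purely so it can be tried without a server. The clicks table name, the half-open :start/:end range and the choice to filter each table by its own createdAt are my assumptions, not something the question specifies.

```python
import sqlite3

# In-memory SQLite stand-in for the two tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users  (id INTEGER PRIMARY KEY, campaign_id INTEGER, createdAt TEXT);
CREATE TABLE clicks (id INTEGER PRIMARY KEY, user_id INTEGER, createdAt TEXT);
INSERT INTO users VALUES
  (1, 123,  '2022-07-14T10:30:01.967Z'),
  (2, 1234, '2022-07-14T10:30:01.967Z'),
  (3, 123,  '2022-07-14T10:30:01.967Z'),
  (4, 123,  '2022-07-14T10:30:01.967Z');
INSERT INTO clicks VALUES
  (1, 1, '2022-07-14T10:30:01.967Z'),
  (2, 2, '2022-07-14T10:30:01.967Z'),
  (3, 2, '2022-07-14T10:30:01.967Z'),
  (4, 2, '2022-07-14T10:30:01.967Z');
""")

# Aggregate clicks per user before joining, so the join cannot multiply
# user rows and COUNT(*) stays a plain count of users per campaign.
QUERY = """
SELECT u.campaign_id,
       COUNT(*)                   AS total_users,
       COALESCE(SUM(c.clicks), 0) AS total_clicks
FROM users u
LEFT JOIN (
    SELECT user_id, COUNT(*) AS clicks
    FROM clicks
    WHERE createdAt >= :start AND createdAt < :end   -- assumed click range
    GROUP BY user_id
) c ON c.user_id = u.id
WHERE u.createdAt >= :start AND u.createdAt < :end   -- assumed user range
GROUP BY u.campaign_id
ORDER BY u.campaign_id
"""

# ISO-8601 strings compare correctly as text in SQLite.
rows = conn.execute(QUERY, {"start": "2022-07-14", "end": "2022-07-15"}).fetchall()
print(rows)  # [(123, 3, 1), (1234, 1, 3)]
```

On PostgreSQL the same SELECT should carry over with timestamp parameters, though a mixed-case column such as createdAt may need double quotes there. Whether this is the fastest shape at millions of rows depends on the indexes you have, so it is worth checking with EXPLAIN ANALYZE.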

Related

SQL Sort table by number of items in common

I have 3 tables, user, artist and a join table.
I'd like to find for a particular user, the ordering of the rest of the user table by the number of artists they have in common in the join table, or potentially just the n other users who are have the most in common with them.
For example in the table:
userID | artistID
-------|----------
1 | 1
1 | 2
2 | 1
2 | 2
3 | 1
I'd want to get that the ordering for user1 would be (2,3) because user2 shares both artist1 and artist2 with user1, whereas user3 only shares artist1.
Is there a way to get this from SQL?
Thanks
Assuming that you always know the user ID you want to check against, you can also do the following:
SELECT user, count(*) as in_common
FROM user_artist
WHERE
user<>1 AND
artist IN (SELECT artist FROM user_artist WHERE user=1)
GROUP BY user
ORDER BY in_common DESC;
This avoids a join, which might perform better on a large table. Your example is on sqlfiddle here.
You can do this with a self-join and aggregation:
select ua.userID, count(ua1.artistID) as numInCommonWithUser1
from userartists ua left join
userartists ua1
on ua.artistID = ua1.artistID and ua1.userID = 1
where ua.userID <> 1 -- leave user 1 out of their own ranking
group by ua.userID
order by numInCommonWithUser1 desc;
If you know the user ID you are going to check, this query should meet your requirement and should also perform well.
SELECT ua1.user, count(*) as all_Common
FROM user_artist ua1
WHERE
(
Select count(*)
From user_artist ua2
Where ua2.user=1
AND ua2.artist=ua1.artist
)>0
AND ua1.user <> 1
GROUP BY ua1.user
ORDER BY all_Common DESC;
Let me know if you have any questions!
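For anyone who wants to try the ranking on the sample data, here is a small sketch using Python's sqlite3 module. The function name users_by_common_artists and its limit parameter are made up for illustration, and it assumes each (userID, artistID) pair appears at most once in the join table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_artist (userID INTEGER, artistID INTEGER);
INSERT INTO user_artist VALUES (1, 1), (1, 2), (2, 1), (2, 2), (3, 1);
""")

def users_by_common_artists(conn, user_id, limit=-1):
    """Rank the other users by how many artists they share with user_id.

    Hypothetical helper; limit=-1 means "no limit" in SQLite specifically.
    """
    sql = """
    SELECT other.userID, COUNT(*) AS in_common
    FROM user_artist AS mine
    JOIN user_artist AS other
      ON other.artistID = mine.artistID
     AND other.userID <> mine.userID      -- skip the user themselves
    WHERE mine.userID = ?
    GROUP BY other.userID
    ORDER BY in_common DESC, other.userID -- userID breaks ties
    LIMIT ?
    """
    return conn.execute(sql, (user_id, limit)).fetchall()

ranking = users_by_common_artists(conn, 1)
top_one = users_by_common_artists(conn, 1, limit=1)
print(ranking)  # [(2, 2), (3, 1)]
print(top_one)  # [(2, 2)]
```

Users who share no artist with the given user do not show up at all with this inner join. If they should appear with a count of 0, a LEFT JOIN starting from the user table would be needed instead.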

SQL getting data from 2 tables

I've got a tricky (at least for me it's tricky) question, I want to arrange data by comment count. My first table is called all_comments which has these columns (more but not essential):
comment, target_id
My second table is called our_videos which has these columns (more but not essential):
id, title
I want to get the count of all comments that have target_id same as id on 2nd table and arrange that data by comment count. Here is example of what I want:
TABLE #1:
id target_id
----------------
1 3
2 5
3 5
4 3
5 3
TABLE #2:
id title
-----------
1 "test"
2 "another-test"
3 "testing"
5 "......"
This is basically saying that the row in the 2nd table with an id of 3 has 3 comments, and the row with an id of 5 has 2 comments. I want to arrange the data by this comment count and get a result like this:
RESULT:
id title
----------------
3 "testing"
5 "......."
1 "test"
2 "another-test"
If I missed any important info needed for this question just ask, thanks for help, peace :)
This is a very simple query, and it would definitely be worth working through an SQL tutorial.
A naive variant would be:
select videos.id, videos.title, count(comments.target_id) as comment_count
from videos
left outer join
comments
on (videos.id = comments.target_id)
group by videos.id, videos.title
order by comment_count desc
This version can have performance problems because it also has to group by title. To speed it up, we usually aggregate the comments first and then join:
select videos.id, videos.title, q.cnt as comment_count
from videos
left outer join
(
select target_id, count(*) as cnt
from comments
group by target_id
) as q
on videos.id = q.target_id
order by q.cnt DESC
select videos.id, videos.title, isnull(cnt, 0) as cnt
from videos
left outer join
(select target_id, count(*) as cnt
from comments
group by target_id) as cnts
on videos.id = cnts.target_Id
order by isnull(cnt, 0) desc, videos.title
Some systems will let you write this even though sorting is not strictly supposed to happen on a column not included in the output. I don't necessarily recommend it, but I'd argue it's the most straightforward.
select id, title from videos
order by (select count(*) from comments where target_id = videos.id) desc, title
If you don't mind having it in the output it's a quick change:
select id, title,
(select count(*) from comments where target_id = videos.id) as comment_count
from videos
order by comment_count desc, title
SQL generally has a lot of options.
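One pitfall worth checking when counting across a LEFT JOIN: COUNT(*) also counts the NULL-extended row of a video that has no comments, so such a video reports 1 instead of 0, whereas COUNT(column) skips NULLs. A quick sketch with Python's sqlite3 module and the sample data from the question shows the difference. The helper name ranked is made up, and ordering ties by id is my choice to match the expected result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE our_videos   (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE all_comments (id INTEGER PRIMARY KEY, target_id INTEGER);
INSERT INTO our_videos VALUES
  (1, 'test'), (2, 'another-test'), (3, 'testing'), (5, '......');
INSERT INTO all_comments VALUES (1, 3), (2, 5), (3, 5), (4, 3), (5, 3);
""")

def ranked(count_expr):
    """Run the same LEFT JOIN query with a given counting expression."""
    sql = f"""
    SELECT v.id, v.title, {count_expr} AS comment_count
    FROM our_videos v
    LEFT JOIN all_comments c ON c.target_id = v.id
    GROUP BY v.id, v.title
    ORDER BY comment_count DESC, v.id
    """
    return conn.execute(sql).fetchall()

by_column = ranked("COUNT(c.target_id)")  # NULLs from the outer join are skipped
by_star = ranked("COUNT(*)")              # the NULL-extended row is counted

print(by_column)  # [(3, 'testing', 3), (5, '......', 2), (1, 'test', 0), (2, 'another-test', 0)]
print(by_star)    # [(3, 'testing', 3), (5, '......', 2), (1, 'test', 1), (2, 'another-test', 1)]
```

The ordering happens to be the same here, but the counts for videos 1 and 2 are wrong with COUNT(*), and the ordering would break as soon as some video has exactly one real comment.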

How to get a correlated subquery as column

I don't know how to write this SQL query. It's probably simple, but I can't work it out.
I have 2 tables:
Table_Articles:
COD NAME
1 Bottle
2 Car
3 Phone
Table_Articles_Registered
COD_ARTICLE DATE
1 05/11/2014
1 06/11/2014
1 07/11/2014
2 08/11/2014
2 09/11/2014
3 05/11/2014
From the table Table_Articles_Registered I want to take the row with the MAX date for each article. In the end I want to get this result:
COD NAME DATE
1 Bottle 07/11/2014
2 Car 09/11/2014
3 Phone 05/11/2014
I need to use a statement like this. The problem is in the subquery. Later I add other inner joins to the statement; this is only a fragment.
select
_Article.Code,
_Article.Description ,
from Tbl_Articles as _Article left join
(
select top 1 *
from ArticlesRegisterds where DATE_REGISTERED <= '18/11/2014'
order by DATE_REGISTERED
)
as regAux
on regAux.CODE_ARTICLE= _Article.CODE
I don't know how to connect the field CODE_ARTICLE in the table ArticlesRegisterds with the outer query.
I think this is a basic aggregation query with a join:
select a.cod, a.name, max(ar.date) as date
from Articles a join
ArticlesRegisterds ar
on ar.cod_article = a.cod
group by a.cod, a.name
Try this:-
SELECT TAR.COD_ARTICLE, TA.NAME, MAX(TAR.DATE)
FROM Table_Articles_Registered TAR JOIN
Table_Articles TA ON TAR.COD_ARTICLE = TA.COD
GROUP BY TAR.COD_ARTICLE, TA.NAME;
Can't you just do this?:
SELECT
Table_Articles.COD,
Table_Articles.NAME,
(
SELECT MAX(Table_Articles_Registered.DATE)
FROM Table_Articles_Registered
WHERE Table_Articles_Registered.COD_ARTICLE = Table_Articles.COD
) AS DATE
FROM
Table_Articles
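To see the aggregation approach with the cutoff date from the question's own attempt, here is a sketch using Python's sqlite3 module. It assumes the dates are stored in ISO format (YYYY-MM-DD) so that MAX and <= compare them correctly as text; with DD/MM/YYYY strings, as displayed in the question, the comparison would go wrong unless the column is a real date type. The column name DATE_REGISTERED is taken from the asker's fragment, and the cutoff sits in the ON clause so that articles without a matching registration still appear.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table_Articles (COD INTEGER PRIMARY KEY, NAME TEXT);
CREATE TABLE Table_Articles_Registered (COD_ARTICLE INTEGER, DATE_REGISTERED TEXT);
INSERT INTO Table_Articles VALUES (1, 'Bottle'), (2, 'Car'), (3, 'Phone');
-- Dates rewritten as ISO strings (an assumption made for this demo).
INSERT INTO Table_Articles_Registered VALUES
  (1, '2014-11-05'), (1, '2014-11-06'), (1, '2014-11-07'),
  (2, '2014-11-08'), (2, '2014-11-09'),
  (3, '2014-11-05');
""")

QUERY = """
SELECT a.COD, a.NAME, MAX(r.DATE_REGISTERED) AS last_date
FROM Table_Articles a
LEFT JOIN Table_Articles_Registered r
  ON r.COD_ARTICLE = a.COD
 AND r.DATE_REGISTERED <= :cutoff   -- filter here, not in WHERE, to keep all articles
GROUP BY a.COD, a.NAME
ORDER BY a.COD
"""

latest = conn.execute(QUERY, {"cutoff": "2014-11-18"}).fetchall()
earlier = conn.execute(QUERY, {"cutoff": "2014-11-06"}).fetchall()
print(latest)   # [(1, 'Bottle', '2014-11-07'), (2, 'Car', '2014-11-09'), (3, 'Phone', '2014-11-05')]
print(earlier)  # [(1, 'Bottle', '2014-11-06'), (2, 'Car', None), (3, 'Phone', '2014-11-05')]
```

With the earlier cutoff, Car has no registration in range and comes back with a NULL date instead of disappearing from the result.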

Best solution for SQL without looping

I'm relatively new to SQL, and am trying to find the best way to attack this problem.
I am trying to take data from 2 tables and start merging them together to perform analysis on it, but I don't know the best way to go about this without looping or many nested subqueries.
What I've done so far:
I have 2 tables. Table1 has user information and Table2 has information on orders(prices and dates, as well as user)
What I need to do:
I want to have a single row for each user that has a summary of information about all of their orders. I'm looking to find the sum of prices of all orders by each user, the max price paid by that user, and the number of orders. I'm not sure how to best manipulate my data in SQL.
Currently, my code looks as follows:
Select alias1.*, Table2.order_id, Table2.price, Table2.order_date
From (Select * from Table1 where country='United States') as alias1
LEFT JOIN Table2
on alias1.user_id = Table2.user_id
This filters the users by country and then joins them with the orders, creating a record of each order including the user information. I don't know if this is a helpful step, but this is part of my first attempt playing around with the data. I was thinking of looping over this, but I know that is against the spirit of SQL.
Edit: Here is an example of what I have and what I want:
Table 1(user info):
user_id user_country
1 United States
2 United Kingdom
(etc)
Table 2(order info):
order_id price user_id
100 5.00 1
101 3.50 2
102 2.50 1
103 1.00 1
104 8.00 2
What I would like output:
user_id user_country total_price max_price number_of_orders
1 United States 8.50 5.00 3
2 United Kingdom 11.50 8.00 2
Here's one way to do this:
SELECT alias1.user_id,
MAX(alias1.user_name) As user_name,
SUM(Table2.price) As UsersTotalPrice,
MAX(Table2.price) As UsersHighestPrice
FROM Table1 As alias1
LEFT JOIN Table2 ON alias1.user_id = Table2.user_id
WHERE alias1.country = 'United States'
GROUP BY alias1.user_id
If you can give us the actual table definitions, then we can show you some actual working queries.
Something like this? Aggregate the rows in Table2 and then join to Table1 for the detail info you want:
SELECT Table1.*,agg.thesum FROM
(SELECT UserID, SUM(aggregatedata) as thesum FROM Table2 GROUP BY UserID) agg
INNER JOIN Table1 on table1.userid = agg.userid
This should work
select table1.*, t2.total_price, t2.max_price, t2.order_count
from table1
join (select user_id, sum(price) as total_price, max(price) as max_price, count(order_id) as order_count from table2 group by user_id) as t2
on table1.user_id = t2.user_id
where table1.country = 'United States'
EDIT: (removed: "dont use explicit join"; this was wrong. I meant:)
Try the following syntax to better understand what is going on:
1st step:
select
users.user_id, -- you must tell the DB which table's user_id column you mean
user_country,
price,
price
from -- now just the two tables:
Table1 as users, -- Table1 is a bad name, so we alias it as 'users'
Table2 as orders -- 'user' and 'order' are reserved words in many databases
where users.user_id = orders.user_id
so you will get something like:
user_id user_country price price
1 alabama 5 5
2 nebrasca 1 1
2 alabama 7 7
1 alabama 7 7
2 alabama 3 7
and so on ..
The next step is to add another condition, user_country='alabama', so 'nebrasca' drops out:
user_id user_country price price
1 alabama 5 5
2 alabama 7 7
1 alabama 7 7
2 alabama 3 7
Now you are ready to aggregate: just select the MAX and SUM of price, but you have to tell the SQL engine which columns are 'fixed' with GROUP BY:
select
users.user_id, user_country, MAX(price), SUM(price)
from
Table1 as users,
Table2 as orders
where users.user_id = orders.user_id
and user_country='alabama'
group by users.user_id, user_country
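To check the per-user summary against the sample data from the question's edit, here is a sketch using Python's sqlite3 module. The column names follow that sample (user_id, user_country, order_id, price). It aggregates Table2 first and then joins, which is one of the shapes suggested above, and the COALESCE defaults for users without orders are my own addition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (user_id INTEGER PRIMARY KEY, user_country TEXT);
CREATE TABLE Table2 (order_id INTEGER PRIMARY KEY, price REAL, user_id INTEGER);
INSERT INTO Table1 VALUES (1, 'United States'), (2, 'United Kingdom');
INSERT INTO Table2 VALUES
  (100, 5.00, 1), (101, 3.50, 2), (102, 2.50, 1), (103, 1.00, 1), (104, 8.00, 2);
""")

# One row per user: aggregate the orders once, then attach the user details.
QUERY = """
SELECT u.user_id,
       u.user_country,
       COALESCE(o.total_price, 0)      AS total_price,
       o.max_price,
       COALESCE(o.number_of_orders, 0) AS number_of_orders
FROM Table1 u
LEFT JOIN (
    SELECT user_id,
           SUM(price)      AS total_price,
           MAX(price)      AS max_price,
           COUNT(order_id) AS number_of_orders
    FROM Table2
    GROUP BY user_id
) o ON o.user_id = u.user_id
ORDER BY u.user_id
"""

summary = conn.execute(QUERY).fetchall()
print(summary)  # [(1, 'United States', 8.5, 5.0, 3), (2, 'United Kingdom', 11.5, 8.0, 2)]

# Restricting to one country only needs a filter on the outer table.
us_only = conn.execute(
    QUERY.replace("ORDER BY", "WHERE u.user_country = 'United States' ORDER BY")
).fetchall()
print(us_only)  # [(1, 'United States', 8.5, 5.0, 3)]
```

No looping is needed: the GROUP BY in the subquery produces exactly one summary row per user_id.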

How to perform "Select Count" with complicated "Where" statement to compute co-occurrences?

Let me use an example to explain my problem:
Suppose we have a Table (Tags) which has two columns like this
UserID -------------------------------- Tag
1 -------------------------------------- SQL
1 -------------------------------------- Select
1 -------------------------------------- DB
2 -------------------------------------- SQL
2 -------------------------------------- Programming
2 -------------------------------------- Code
2 -------------------------------------- Software
3 -------------------------------------- Code
4 -------------------------------------- SQL
4 -------------------------------------- Code
I need to count DISTINCT co-occurrences for each tag based on UserID
So, the output should be like this (with Order by Co-occurrences desc):
Tag -------------------------------- Co-occurrences
---------------------------------------------
SQL --------------------------------------- 5
Programming ------------------------------- 3
Code -------------------------------------- 3
Software ---------------------------------- 3
Select ------------------------------------ 2
DB ---------------------------------------- 2
This is just an example..
How can I make a Select statement that can do this?
I came up with one way but for only ONE specific tag:
SELECT count (distinct (Tag)) - 1 as Co_occurrences
FROM Tags
WHERE Tag is NOT NULL and UserID in
( SELECT UserID
FROM Tags
where tag = 'SQL')
Is it possible to change the above statement to make it general for all tags in the table?
SELECT t2.tag, count (distinct (t1.Tag)) - 1 as Co_occurrences
FROM Tags t1 inner join
Tags t2 on t1.UserId = t2.UserId
GROUP BY t2.tag
ORDER BY count (distinct (t1.Tag)) desc
A GROUP BY is what you are looking for:
SELECT
UserID,
Tag,
COUNT(DISTINCT Tag) - 1 AS Co_occurrences
FROM Tags
GROUP BY UserID, Tag
ORDER BY UserID, Tag
Edit: As mentioned in the comments, the above does not answer the question. I reworked @OSA-E's answer a bit to make explicit what the -1 after the count is doing.
SELECT
[t1].[Tag],
COUNT(DISTINCT [t2].[Tag]) AS [Co_occurrences]
FROM [Tags] [t1]
INNER JOIN [Tags] [t2] ON [t1].[UserID] = [t2].[UserID]
WHERE [t1].[Tag] <> [t2].[Tag]
GROUP BY [t1].[Tag]
ORDER BY [Co_occurrences] DESC
Here is the Fiddle.
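The self-join version can be checked against the sample table with a short sketch using Python's sqlite3 module. Putting the Tag <> Tag condition in the ON clause and breaking ties alphabetically are my choices for the demo. Note that a tag whose users never use any other tag would not appear at all with this inner join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tags (UserID INTEGER, Tag TEXT);
INSERT INTO Tags VALUES
  (1, 'SQL'), (1, 'Select'), (1, 'DB'),
  (2, 'SQL'), (2, 'Programming'), (2, 'Code'), (2, 'Software'),
  (3, 'Code'),
  (4, 'SQL'), (4, 'Code');
""")

# Pair every tag with the other tags used by the same user, then count
# the distinct partner tags per tag.
QUERY = """
SELECT t1.Tag, COUNT(DISTINCT t2.Tag) AS co_occurrences
FROM Tags t1
JOIN Tags t2
  ON t1.UserID = t2.UserID
 AND t1.Tag <> t2.Tag        -- replaces the "- 1": a tag is not its own partner
GROUP BY t1.Tag
ORDER BY co_occurrences DESC, t1.Tag
"""

result = conn.execute(QUERY).fetchall()
print(result)
# [('SQL', 5), ('Code', 3), ('Programming', 3), ('Software', 3), ('DB', 2), ('Select', 2)]
```

The counts match the expected output in the question; only the order within equal counts differs, since the question does not define a tie-breaker.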