Counting associations from multiple tables - sql

I want to see how many associations each of my records in a given table has. Some of these associations have conditions attached to them.
So far I have:
-- Count app associations
SELECT
distinct a.name,
COALESCE(v.count, 0) as visitors,
COALESCE(am.count, 0) AS auto_messages,
COALESCE(c.count, 0) AS conversations
FROM apps a
LEFT JOIN (SELECT app_id, count(*) AS count FROM visitors GROUP BY 1) v ON a.id = v.app_id
LEFT JOIN (SELECT app_id, count(*) AS count FROM auto_messages GROUP BY 1) am ON a.id = am.app_id
LEFT JOIN (
SELECT DISTINCT c.id, app_id, count(c) AS count
FROM conversations c LEFT JOIN messages m ON m.conversation_id = c.id
WHERE m.visitor_id IS NOT NULL
GROUP BY c.id) c ON a.id = c.app_id
WHERE a.test = false
ORDER BY visitors DESC;
I run into a problem with the last join, the one for conversations. I want to count the number of conversations that have at least one message where visitor_id is not null. For some reason I get multiple records for each app, i.e. the conversations are not being grouped properly.
Any ideas?

My gut feeling, based on limited understanding of the big picture: in the nested query selecting from conversations,
remove DISTINCT
remove c.id from SELECT list
GROUP BY c.app_id instead of c.id
EDIT: the changes above would still count a conversation once per matching message, so try this instead:
...
LEFT JOIN (
SELECT app_id, count(*) AS count
FROM conversations c1
WHERE
EXISTS (
SELECT *
FROM messages m
WHERE m.conversation_id = c1.id and
m.visitor_id IS NOT NULL
)
GROUP BY c1.app_id) c
ON a.id = c.app_id
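A minimal sketch of why the EXISTS form counts each conversation only once, run against an in-memory SQLite database through Python's sqlite3 module. The sample rows and the helper names (conversation_counts, joined_row_counts) are assumptions made up for illustration; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE apps (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE conversations (id INTEGER PRIMARY KEY, app_id INTEGER);
    CREATE TABLE messages (id INTEGER PRIMARY KEY, conversation_id INTEGER,
                           visitor_id INTEGER);
    INSERT INTO apps VALUES (1, 'alpha'), (2, 'beta');
    -- app 1: conversation 10 has two visitor messages, conversation 11 has none
    INSERT INTO conversations VALUES (10, 1), (11, 1), (20, 2);
    INSERT INTO messages VALUES
        (1, 10, 7), (2, 10, 8), (3, 11, NULL), (4, 20, NULL);
""")


def joined_row_counts(conn):
    # Counts joined rows, so conversation 10 is counted once per visitor message.
    return conn.execute("""
        SELECT c.app_id, count(*)
        FROM conversations c
        JOIN messages m ON m.conversation_id = c.id
        WHERE m.visitor_id IS NOT NULL
        GROUP BY c.app_id
    """).fetchall()


def conversation_counts(conn):
    # EXISTS is a yes/no test per conversation, so each one is counted at most once.
    return conn.execute("""
        SELECT a.name, COALESCE(c.n, 0) AS conversations
        FROM apps a
        LEFT JOIN (
            SELECT c1.app_id, count(*) AS n
            FROM conversations c1
            WHERE EXISTS (SELECT 1 FROM messages m
                          WHERE m.conversation_id = c1.id
                            AND m.visitor_id IS NOT NULL)
            GROUP BY c1.app_id
        ) c ON a.id = c.app_id
        ORDER BY a.name
    """).fetchall()


result = conversation_counts(conn)
print(joined_row_counts(conn))
print(result)
```

With this sample data the join-based count reports 2 for app 1 (one per message), while the EXISTS version reports 1 conversation for alpha and 0 for beta.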

Related

SQL Sum and Count JOIN Multiple tables

Please help, I want to merge these two queries.
I have 3 tables (places, ratings, places_image):
places: id, name, description
ratings: id, rating, place_id, user_id
places_image: id, place_id, image
These are the 2 queries:
SELECT places.*, SUM(rating) AS total_rating, COUNT(ratings.user_id) AS total_user
FROM ratings, places
WHERE ratings.place_id = places.id
GROUP BY places.id;
SELECT places.*, places_images.image
FROM places, places_images
WHERE places.id = places_images.place_id
GROUP BY places.id;
I tried the following query, but it gives duplicated values for the aggregate functions:
SELECT places.id, places.name, places.description, places_images.image,
SUM(rating) AS total_rating, COUNT(ratings.user_id) AS total_user
FROM places_images
JOIN places ON places_images.place_id = places.id
JOIN ratings ON ratings.place_id = places.id
GROUP BY places.id
How can I combine them?
You need two inner joins, both on place_id.
It'll be something like
SELECT SUM(r.rating) AS total_ratings, COUNT(r.user_id) AS total_users
FROM ratings r
INNER JOIN places p ON p.id = r.place_id
INNER JOIN places_image pi ON pi.place_id = r.place_id
GROUP BY r.place_id
It seems the only thing I need is to aggregate in subqueries:
select a.*, d.image, b.totalRating, c.totalUser
from places a,
( select place_id, sum(rating) AS totalRating from ratings group by place_id ) b,
( select place_id, count(id) AS totalUser from ratings group by place_id ) c,
places_images d
where c.place_id = a.id and b.place_id = a.id and d.place_id = a.id
GROUP BY a.id
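A small sketch of why aggregating before joining removes the duplication, using an in-memory SQLite database from Python. The sample rows are invented, and picking one image per place with MIN(image) is an assumption of this sketch, since the thread does not say which image should be shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE places (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ratings (id INTEGER PRIMARY KEY, rating INTEGER,
                          place_id INTEGER, user_id INTEGER);
    CREATE TABLE places_images (id INTEGER PRIMARY KEY, place_id INTEGER,
                                image TEXT);
    INSERT INTO places VALUES (1, 'Cafe');
    INSERT INTO ratings VALUES (1, 4, 1, 100), (2, 5, 1, 101);
    INSERT INTO places_images VALUES (1, 1, 'a.jpg'), (2, 1, 'b.jpg');
""")

# Joining both child tables first pairs every rating with every image,
# so each rating is summed and counted once per image.
naive = conn.execute("""
    SELECT p.id, SUM(r.rating), COUNT(r.user_id)
    FROM places p
    JOIN ratings r ON r.place_id = p.id
    JOIN places_images i ON i.place_id = p.id
    GROUP BY p.id
""").fetchall()

# Aggregating each child table on its own yields one row per place
# before the join, so nothing is multiplied.
fixed = conn.execute("""
    SELECT p.id, p.name, i.image, r.total_rating, r.total_user
    FROM places p
    JOIN (SELECT place_id, SUM(rating) AS total_rating,
                 COUNT(user_id) AS total_user
          FROM ratings GROUP BY place_id) r ON r.place_id = p.id
    JOIN (SELECT place_id, MIN(image) AS image  -- assumption: any one image will do
          FROM places_images GROUP BY place_id) i ON i.place_id = p.id
""").fetchall()

print(naive)
print(fixed)
```

Here two ratings (4 and 5) and two images turn into a total of 18 from 4 "users" in the naive join, while the pre-aggregated version reports 9 from 2 users.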
Thank you very much, GBU

Is there any alternative way to write this t-sql query?

I have these 3 tables:
For each car I need to visualize the data about the last (most recent) reservation:
the car model (Model);
the user who reserved the car (Username);
when it was reserved (ReservedOn);
when it was returned (ReservedUntil).
If there is no reservation for a given car, I have to show only the car model. Other fields must be empty.
I wrote the following query:
SELECT
Reservations.CarId,
res.MostRecent,
Reservations.UserId,
Reservations.ReservedOn,
Reservations.ReservedUntil
FROM
Reservations
JOIN (
Select
Reservations.CarId,
MAX(Reservations.ReservedOn) AS 'MostRecent'
FROM
Reservations
GROUP BY
Reservations.CarId
) AS res ON res.carId = Reservations.CarId
AND res.MostRecent = Reservations.ReservedOn
This first query works, but I got stuck trying to obtain the result that I need. How could I complete the query?
It looks like a classic top-n-per-group problem.
One way to do it is to use OUTER APPLY. It is a correlated subquery (lateral join), which returns the latest Reservation for each row in the Cars table. If such reservation doesn't exist for a certain car, there will be nulls.
If you create an index for Reservations table on (CarID, ReservedOn DESC), this query should be more efficient than self-join.
SELECT
Cars.CarID
,Cars.Model
,A.ReservedOn
,A.ReservedUntil
,A.UserName
FROM
Cars
OUTER APPLY
(
SELECT TOP(1)
Reservations.ReservedOn
,Reservations.ReservedUntil
,Users.UserName
FROM
Reservations
INNER JOIN Users ON Users.UserId = Reservations.UserId
WHERE
Reservations.CarID = Cars.CarID
ORDER BY
Reservations.ReservedOn DESC
) AS A
;
For other approaches to this common problem, see "Get top 1 row of each group" and "Retrieving n rows per group".
With not exists:
select r.* from reservations r
where not exists (
select 1 from reservations
where carid = r.carid and reservedon > r.reservedon
)
You can create a CTE with the above code and join it to the other tables:
with cte as (
select r.* from reservations r
where not exists (
select 1 from reservations
where carid = r.carid and reservedon > r.reservedon
)
)
select c.carid, c.model, u.username, cte.reservedon, cte.reserveduntil
from cars c
left join cte on c.carid = cte.carid
left join users u on u.userid = cte.userid
If you don't want to use a CTE:
select c.carid, c.model, u.username, t.reservedon, t.reserveduntil
from cars c
left join (
select r.* from reservations r
where not exists (
select 1 from reservations
where carid = r.carid and reservedon > r.reservedon
)
) t on c.carid = t.carid
left join users u on u.userid = t.userid
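As a rough check of the NOT EXISTS approach, the sketch below runs the non-CTE query against an in-memory SQLite database from Python. The table layout is inferred from the column names used in the answers, and the sample rows are invented; dates are stored as ISO strings so that plain comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cars (carid INTEGER PRIMARY KEY, model TEXT);
    CREATE TABLE users (userid INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE reservations (
        id INTEGER PRIMARY KEY, carid INTEGER, userid INTEGER,
        reservedon TEXT, reserveduntil TEXT);
    INSERT INTO cars VALUES (1, 'Fiat'), (2, 'Volvo');
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    -- car 1 has two reservations, car 2 has none
    INSERT INTO reservations VALUES
        (1, 1, 1, '2020-01-01', '2020-01-05'),
        (2, 1, 2, '2020-02-01', '2020-02-03');
""")

latest = conn.execute("""
    SELECT c.carid, c.model, u.username, t.reservedon, t.reserveduntil
    FROM cars c
    LEFT JOIN (
        -- keep a reservation only if no later one exists for the same car
        SELECT r.* FROM reservations r
        WHERE NOT EXISTS (
            SELECT 1 FROM reservations r2
            WHERE r2.carid = r.carid AND r2.reservedon > r.reservedon)
    ) t ON c.carid = t.carid
    LEFT JOIN users u ON u.userid = t.userid
    ORDER BY c.carid
""").fetchall()

print(latest)
```

Note that if two reservations for the same car share the same latest reservedon, this form returns both of them.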

JOIN only one row from second table and if no rows exist return null

In this query I need to show all records from the left table and only the records from the right table where the result is the highest date.
Current query:
SELECT a.*, c.*
FROM users a
INNER JOIN payments c
ON a.id = c.user_ID
INNER JOIN
(
SELECT user_ID, MAX(date) maxDate
FROM payments
GROUP BY user_ID
) b ON c.user_ID = b.user_ID AND
c.date = b.maxDate
WHERE a.package = 1
This returns all records where the join is valid, but I need to show all users and if they didn't make a payment yet the fields from the payments table should be null.
I could use a union to show the other rows:
SELECT a.*, c.*
FROM users a
INNER JOIN payments c
ON a.id = c.user_ID
INNER JOIN
(
SELECT user_ID, MAX(date) maxDate
FROM payments
GROUP BY user_ID
) b ON c.user_ID = b.user_ID AND
c.date = b.maxDate
WHERE a.package = 1
union
SELECT a.*, c.*
FROM users a
-- here I would need to join with the payments table to get its columns,
-- but only for users who don't have a payment yet
WHERE a.package = 1
The option to use the union doesn't seem like a good solution, but that's what I tried.
So, in other words, you want a list of users and the last payment for each.
You can use OUTER APPLY instead of INNER JOIN to get the last payment for each user. The performance might be better and it will work the way you want regarding users with no payments.
SELECT a.*, b.*
FROM users a
OUTER APPLY ( SELECT * FROM payments c
WHERE c.user_id = a.id
ORDER BY c.date DESC
FETCH FIRST ROW ONLY ) b
WHERE a.package = 1;
Here is a generic version of the same concept that does not require your tables (for other readers). It gives a list of database users and the most recently modified object for each user. You can see it properly includes users that have no objects.
SELECT a.*, b.*
FROM all_users a
OUTER APPLY ( SELECT * FROM all_objects b
WHERE b.owner = a.username
ORDER BY b.last_ddl_time desc
FETCH FIRST ROW ONLY ) b
I like the answer from @Matthew McPeak, but OUTER APPLY requires 12c or higher and isn't very idiomatic Oracle, historically anyway. Here's a straight LEFT OUTER JOIN version:
SELECT *
FROM users a
LEFT OUTER JOIN
(
-- retrieve the list of payments for just those payments that are the maxdate per user
SELECT payments.*
FROM payments
JOIN (SELECT user_id, MAX(date) maxdate
FROM payments
GROUP BY user_id
) maxpayment_byuser
ON maxpayment_byuser.maxdate = payments.date
AND maxpayment_byuser.user_id = payments.user_id
) b ON a.ID = b.user_ID
If performance is an issue, the following may be faster, though for simplicity it leaves you with an extra "maxdate" column.
SELECT *
FROM users a
LEFT OUTER JOIN
(
-- retrieve the list of payments for just those payments that are the maxdate per user
SELECT *
FROM (
SELECT payments.*,
MAX(date) OVER (PARTITION BY user_id) maxdate
FROM payments
) max_payments
WHERE date = maxdate
) b ON a.ID = b.user_ID
A generic approach using row_number() is very useful for "highest date" or "most recent" or similar conditions:
SELECT
*
FROM users a
LEFT OUTER JOIN (
-- determine the row corresponding to "most recent"
SELECT
payments.*
, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY date DESC) is_recent
FROM payments
) b ON a.ID = b.user_ID
AND b.is_recent = 1
(reversing the ORDER BY within the over clause also enables "oldest")
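To see the row_number() variant in action without an Oracle instance, here is a sketch against an in-memory SQLite database from Python (window functions need SQLite 3.25 or later). The sample rows are invented, and the extra id DESC tie-breaker is an assumption added here so that two payments on the same date give a deterministic result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, package INTEGER);
    CREATE TABLE payments (id INTEGER PRIMARY KEY, user_id INTEGER,
                           date TEXT, amount INTEGER);
    INSERT INTO users VALUES (1, 'ann', 1), (2, 'bob', 1);
    -- ann has two payments on her latest date, bob has no payments at all
    INSERT INTO payments VALUES
        (1, 1, '2021-01-10', 10),
        (2, 1, '2021-03-05', 30),
        (3, 1, '2021-03-05', 35);
""")

rows = conn.execute("""
    SELECT a.name, b.date, b.amount
    FROM users a
    LEFT JOIN (
        SELECT p.*,
               ROW_NUMBER() OVER (PARTITION BY user_id
                                  ORDER BY date DESC, id DESC) AS is_recent
        FROM payments p
    ) b ON a.id = b.user_id AND b.is_recent = 1
    WHERE a.package = 1
    ORDER BY a.id
""").fetchall()

print(rows)
```

Because is_recent = 1 sits in the ON clause rather than in WHERE, users without payments keep their row with null payment columns. Unlike the MAX(date) variants, row_number() returns exactly one payment per user even when the latest date is shared.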

Oracle: How to use left outer join to get all entries from left table and satisfying the condition in Where clause

I have the tables below.
Client:
ID | clientName
---------------
1  | A1
2  | A2
3  | A3
Order:
OrdID | clientID | status_cd
----------------------------
100   | 1        | DONE
101   | 1        | SENT
102   | 3        | SENT
Status:
status_cd | status_category
---------------------------
DONE      | COMPL
SENT      | INPROG
I have to write a query that returns all the clients and, for each of them, the count of orders whose status category is "COMPL", whether or not the client has any rows in the Order table.
I am using the query below, but it filters out the clients that have no orders. I want to get all clients, so that the output matches the expected result below.
Query:
select c.ID, count(distinct o.OrdID)
from client c, order o, status s
where c.ID=o.client_id(+)
and o.status_cd=s.status_cd where s.status_category='COMPL'
group by c.ID
Expected result:
C.ID | count(distinct o.OrdID)
------------------------------
1    | 1
2    | 0
3    | 0
Can someone please help me with this? I know, in this case, left outer join is behaving like inner join when I am using where clause, but is there any other way to achieve the results above?
This is a lot easier to handle with explicit join syntax:
select c.ID, count(s.status_cd)
from client c
left join orders o on o.clientid = c.id
left join status s on s.status_cd = o.status_cd and s.status_category='COMPL'
group by c.ID;
The above assumes that orders.status_cd is defined as not null
Another option is to move the join between orders and status in a derived table:
select c.ID, count(distinct o.ordid)
from client c
left join (
select o.ordid, o.clientid
from orders o
join status s on s.status_cd = o.status_cd
where s.status_category='COMPL'
) o on o.clientid = c.id
group by c.ID;
The above "states" more clearly (at least in my eyes) that only orders within that status category are of interest compared to the first solution
As usual, there are lots of ways to express this requirement.
Try this; ANSI join people will hate me and vote down this answer ;)
select c.ID, count(s.status_cd)
from client c, order o, status s
where c.ID = o.client_id(+)
and o.status_cd = s.status_cd(+)
and s.status_category(+) = 'COMPL'
group by c.ID
;
or
select c.ID
, nvl((select count(distinct o.OrdID)
from order o, status s
where c.ID = o.client_id
and o.status_cd = s.status_cd
and s.status_category='COMPL'
), 0) as order_count
from client c
group by c.ID
;
or
with ord as
(select client_id, count(distinct o.OrdID) cnt
from order o, status s
where 1=1
and o.status_cd = s.status_cd
and s.status_category='COMPL'
group by client_id
)
select c.ID
, nvl((select cnt from ord o where c.ID = o.client_id ), 0) as order_count
from client c
group by c.ID
;
or
...
The second WHERE should be an AND.
Other than that, you need the plus sign, (+), marking the left outer join, on every condition that involves status as well: both the second join condition and the status_category filter. It is not enough to left-outer-join the first two tables. And because orders with another status then survive the join with a null status row, count s.status_cd rather than o.OrdID.
Something like
select c.ID, count(s.status_cd)
from client c, order o, status s
where c.ID=o.client_id(+)
and o.status_cd=s.status_cd(+) AND s.status_category(+)='COMPL'
--                         ^^^ ^^^ (not WHERE)      ^^^
group by c.ID
Of course, it would be much better if you used proper (SQL Standard) join syntax.
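The difference between filtering in WHERE and filtering in the join condition can be reproduced with the sample data from the question. This is only a sketch: it uses an in-memory SQLite database from Python instead of Oracle, and it names the table orders because order is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client (id INTEGER PRIMARY KEY, clientname TEXT);
    CREATE TABLE orders (ordid INTEGER PRIMARY KEY, clientid INTEGER,
                         status_cd TEXT);
    CREATE TABLE status (status_cd TEXT PRIMARY KEY, status_category TEXT);
    INSERT INTO client VALUES (1, 'A1'), (2, 'A2'), (3, 'A3');
    INSERT INTO orders VALUES (100, 1, 'DONE'), (101, 1, 'SENT'), (102, 3, 'SENT');
    INSERT INTO status VALUES ('DONE', 'COMPL'), ('SENT', 'INPROG');
""")

# Filter in WHERE: rows where status is NULL are discarded after the join,
# so the outer joins effectively behave like inner joins.
filter_in_where = conn.execute("""
    SELECT c.id, COUNT(s.status_cd)
    FROM client c
    LEFT JOIN orders o ON o.clientid = c.id
    LEFT JOIN status s ON s.status_cd = o.status_cd
    WHERE s.status_category = 'COMPL'
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()

# Filter in ON: every client row is preserved; only matching status rows
# are attached, and COUNT ignores the NULLs of the unmatched ones.
filter_in_on = conn.execute("""
    SELECT c.id, COUNT(s.status_cd)
    FROM client c
    LEFT JOIN orders o ON o.clientid = c.id
    LEFT JOIN status s ON s.status_cd = o.status_cd
                      AND s.status_category = 'COMPL'
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()

print(filter_in_where)
print(filter_in_on)
```

The first query returns only client 1, while the second returns all three clients with counts 1, 0 and 0, matching the expected result in the question.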

SQL query to find the top 3 in a category

Calling all sql enthusiasts!
Quick info: using PostgreSQL.
I have a query that returns the maximum number of likes for a user per category. What I want now is to show the top 3 users with the most likes per category.
A helpful resource was using this example to solve the problem:
select type, variety, price
from fruits
where (
select count(*) from fruits as f
where f.type = fruits.type and f.price <= fruits.price
) <= 2;
I understand this, but my query is using joins and I am also a beginner, so I was not able to use this information effectively.
Down to business, this is my query for returning the MAX likes for a user per category.
SELECT category, username, MAX(post_likes) FROM (
SELECT c.name category, u.username username, SUM(p.like_count) post_likes, COUNT(*) post_num
FROM categories c
JOIN topics t ON c.id = t.category_id
JOIN posts p ON t.id = p.topic_id
JOIN users u ON u.id = p.user_id
GROUP BY c.name, u.username) AS leaders
WHERE post_likes > 0
GROUP BY category, username
HAVING MAX(post_likes) >= (SELECT SUM(p.like_count)
FROM categories c
JOIN topics t ON c.id = t.category_id
JOIN posts p ON t.id = p.topic_id
JOIN users u ON u.id = p.user_id WHERE c.name = leaders.category
GROUP BY u.username order by sum desc limit 1)
ORDER BY MAX(post_likes) DESC;
Any and all help would be greatly appreciated. I am having a difficult time wrapping my head around this problem. Thanks!
If you want the top three users by likes per category, use window functions:
SELECT cu.*
FROM (SELECT c.name as category, u.username as username,
SUM(p.like_count) as post_likes, COUNT(*) as post_num,
ROW_NUMBER() OVER (PARTITION BY c.name ORDER BY SUM(p.like_count) DESC) as seqnum
FROM categories c JOIN
topics t
ON c.id = t.category_id JOIN
posts p
ON t.id = p.topic_id JOIN
users u
ON u.id = p.user_id
GROUP BY c.name, u.username
) cu
WHERE seqnum <= 3;
This returns at most three rows per category, even if there are ties, in which case the choice among tied users is arbitrary. If you want to do something else, then consider DENSE_RANK() or RANK() instead of ROW_NUMBER().
Also, use as for column aliases in the SELECT list. Although optional, one day you will leave out a comma and be grateful that you are in the habit of using as.
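To illustrate how ROW_NUMBER() and RANK() differ when users are tied, here is a sketch using an in-memory SQLite database from Python (SQLite 3.25 or later for window functions). The likes table is a hypothetical, already aggregated stand-in for the joined categories/topics/posts/users query, and the username tie-breaker in the ROW_NUMBER() version is an assumption added to make its output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE likes (category TEXT, username TEXT, post_likes INTEGER);
    -- bob, cid and dee are tied for second place in 'sql'
    INSERT INTO likes VALUES
        ('sql', 'ann', 9), ('sql', 'bob', 7), ('sql', 'cid', 7),
        ('sql', 'dee', 7), ('sql', 'eve', 1), ('py', 'ann', 3);
""")

TEMPLATE = """
    SELECT category, username, post_likes
    FROM (SELECT l.*, {ranking} AS seqnum FROM likes l) ranked
    WHERE seqnum <= 3
    ORDER BY category, post_likes DESC, username
"""

# ROW_NUMBER numbers rows 1, 2, 3, ... so at most three rows survive per category.
row_number_sql = TEMPLATE.format(
    ranking="ROW_NUMBER() OVER (PARTITION BY category "
            "ORDER BY post_likes DESC, username)")

# RANK gives tied rows the same number, so all users tied within the top three stay.
rank_sql = TEMPLATE.format(
    ranking="RANK() OVER (PARTITION BY category ORDER BY post_likes DESC)")

top_by_row_number = conn.execute(row_number_sql).fetchall()
top_by_rank = conn.execute(rank_sql).fetchall()

print(top_by_row_number)
print(top_by_rank)
```

With this data ROW_NUMBER() cuts the 'sql' category off after ann, bob and cid, while RANK() also keeps dee, who has the same number of likes as bob and cid.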