Join two tables with sum and filtering - sql

I have two tables Users and Inputs with the following schema
Users - id, name, create_time
Inputs - id, user_id, create_time, amount
I've created 2 similar queries that select joined data from the two tables.
The first query returns all users, adding a field called daily_amount which sums all the Inputs of each user in a given time range - Works fine
The second query - Adds a user id filter
I want to limit the query to a specific user by id (id = 12 in the given query), but I'm getting inconsistent results: I get a single record, but it has a different id and the daily_amount is incorrect.
Your assistance is appreciated.
-- All users query - Works fine
SELECT Users.id,
Users.name,
Users.create_time,
SUM(Inputs.amount) AS daily_amount
FROM Users
LEFT JOIN Inputs ON Users.id = Inputs.user_id
AND Inputs.create_time BETWEEN startTime AND endTime
GROUP BY Users.id,
Users.name
-- User specific query
SELECT Users.id,
Users.name,
Users.create_time,
SUM(Inputs.amount) AS daily_amount
FROM Users
LEFT JOIN Inputs ON Users.id = Inputs.user_id
AND Users.id = 12 -- trying to filter only specific user by id
AND Inputs.create_time BETWEEN startTime AND endTime
GROUP BY Users.id,
Users.name

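Put the user filter in the WHERE clause. In a LEFT JOIN, a condition on Users in the ON clause does not remove any users; it only decides which Inputs rows are matched. Keep the date range in the ON clause, though: moving it to WHERE would drop a user who has no Inputs in that range instead of returning them with a NULL daily_amount.
SELECT Users.id,
Users.name,
Users.create_time,
SUM(Inputs.amount) AS daily_amount
FROM Users
LEFT JOIN Inputs ON Users.id = Inputs.user_id
AND Inputs.create_time BETWEEN startTime AND endTime
WHERE Users.id = 12
GROUP BY Users.id, Users.name, Users.create_time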
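To see how the placement of the user filter changes the result, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and the literal February date range are made up for illustration (they stand in for startTime and endTime), and SQLite is only an assumption about the engine; the ON versus WHERE behaviour shown is standard for LEFT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users  (id INTEGER PRIMARY KEY, name TEXT, create_time TEXT);
CREATE TABLE Inputs (id INTEGER PRIMARY KEY, user_id INTEGER,
                     create_time TEXT, amount INTEGER);
INSERT INTO Users VALUES (11, 'ann', '2020-01-01'), (12, 'bob', '2020-01-02');
INSERT INTO Inputs VALUES
    (1, 11, '2020-02-01', 5),
    (2, 12, '2020-02-01', 10),
    (3, 12, '2020-02-02', 20),
    (4, 12, '2020-03-15', 99);  -- outside the range used below
""")

# User filter in ON: every user is still returned; users other than 12
# simply match no Inputs rows and get a NULL sum.
filter_in_on = conn.execute("""
    SELECT Users.id, SUM(Inputs.amount) AS daily_amount
    FROM Users
    LEFT JOIN Inputs ON Users.id = Inputs.user_id
        AND Users.id = 12
        AND Inputs.create_time BETWEEN '2020-02-01' AND '2020-02-28'
    GROUP BY Users.id
    ORDER BY Users.id
""").fetchall()

# User filter in WHERE, date range kept in ON: one row, for user 12 only.
filter_in_where = conn.execute("""
    SELECT Users.id, SUM(Inputs.amount) AS daily_amount
    FROM Users
    LEFT JOIN Inputs ON Users.id = Inputs.user_id
        AND Inputs.create_time BETWEEN '2020-02-01' AND '2020-02-28'
    WHERE Users.id = 12
    GROUP BY Users.id
""").fetchall()

print(filter_in_on)     # [(11, None), (12, 30)]
print(filter_in_where)  # [(12, 30)]
```

With the filter in ON, an engine that tolerates non-grouped columns may show an arbitrary row first, which would explain seeing "a different id".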

Related

Count how many times a user logged in 1x, 2x, 3x

I'm just beginning to learn SQL and this has completely stumped me. I join two tables on user_id where the event was a login. So far so good. Then I need to group those occurrences and count them to return the answer. How many times did users log in 1x, 2x, 3x...?
What I am having trouble with is referencing the first count (occurrences) and the fact that I can't group by occurrences since it is an aggregate function.
Here is the code, it returns two columns, user_id and occurrences. The data is on www.mode.com.
SELECT
Users.user_id,
COUNT(Users.user_id) AS occurrences
FROM
tutorial.playbook_users Users
JOIN tutorial.playbook_events EVENTS ON Users.user_id = EVENTS.user_id
WHERE
EVENTS.event_name = 'login'
GROUP BY
1
ORDER BY
2
So just aggregate it again: wrap the per-user count in a subquery and group by that count.
SELECT
q.UserLogins AS occurrences,
COUNT(*) AS Total
FROM
(
SELECT
Users.user_id,
COUNT(EVENTS.user_id) AS UserLogins
FROM tutorial.playbook_users Users
JOIN tutorial.playbook_events EVENTS
ON EVENTS.user_id = Users.user_id
AND EVENTS.event_name = 'login'
GROUP BY Users.user_id
) q
GROUP BY q.UserLogins
ORDER BY q.UserLogins
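The two-level aggregation can be tried on a toy dataset. This is only a sketch: it uses Python's sqlite3 module and a single hypothetical events table standing in for tutorial.playbook_events (the join to the users table is left out because the events already carry user_id).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (user_id INTEGER, event_name TEXT);
INSERT INTO events VALUES
    (1, 'login'),
    (2, 'login'), (2, 'login'),
    (3, 'login'), (3, 'login'),
    (4, 'login'), (4, 'login'), (4, 'login'),
    (4, 'signup');  -- not a login, ignored by the WHERE filter
""")

rows = conn.execute("""
    SELECT q.user_logins AS occurrences,   -- how many logins
           COUNT(*)      AS total          -- how many users had that many
    FROM (
        -- inner level: one row per user with their login count
        SELECT user_id, COUNT(*) AS user_logins
        FROM events
        WHERE event_name = 'login'
        GROUP BY user_id
    ) q
    GROUP BY q.user_logins
    ORDER BY q.user_logins
""").fetchall()

print(rows)  # [(1, 1), (2, 2), (3, 1)]
```

Read the output as: one user logged in once, two users logged in twice, one user logged in three times.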

LEFT JOIN discarding left rows in results?

Simplifying my issue, let's say I have two tables:
"Users" storing user_id and event_date from users who access each day.
"Purchases" storing user_id, event_date and product_id from users who make purchases each day.
I need to get from all users, their respective product purchases, or null value for product_id if a user didn't make a purchase. For that purpose I made this query:
with all_users as (
select user_id from `my_project.my_dataset.Users`
where event_date = "2019-12-01"
)
select user_id,product_id
from all_users
left join `my_project.my_dataset.Purchases`
using(user_id)
where event_date = "2019-12-01"
But this query returns only the user_ids that made purchases. In other words, there are rows in the LEFT from_item (all_users) that are being omitted from the result.
Is this working as expected? I read that LEFT JOIN always retains all rows of the left from_item.
EDIT 1:
Adding some screenshots:
This is the full query detailed before, but with real names (table "Users" is "user_metrics_daily" and table "Purchases" is "virtual_currency_daily"). As you can see, I added the count(distinct user_pseudo_id)OVER() to count how many distinct users are in the result.
On the other hand, this is a query to get the number of users I expect to have in the result (8935 users, with null values in product_id for users who don't purchase). But actually I got 2724 distinct users (the number of users who made purchases).
EDIT 2: I found a solution to my desired result, but still I don't understand what's wrong with my first query.
The WHERE clause is what discards the rows. all_users only exposes user_id, so event_date in the outer WHERE can only refer to the Purchases column. For users without a purchase, the LEFT JOIN fills that column with NULL, the comparison is not true, and the row is filtered out, so the query behaves like an INNER JOIN. (user_id itself is not ambiguous here, because the join uses USING.)
To keep every user, move the condition on Purchases into the join condition. It also helps to say explicitly which table each projected column comes from: in your case, user_id from all_users and product_id from my_project.my_dataset.Purchases.
with all_users as (
select user_id from `my_project.my_dataset.Users`
where event_date = "2019-12-01"
)
select
a.user_id,
p.product_id
from all_users as a
left join `my_project.my_dataset.Purchases` as p
on a.user_id = p.user_id
and p.event_date = "2019-12-01"
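A small sketch of how filtering the right-hand table in WHERE versus ON affects a LEFT JOIN. It runs on SQLite through Python's sqlite3 module rather than BigQuery, and the table contents and product names are invented for the example; the NULL-filtering behaviour it shows is the same in both engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users     (user_id INTEGER, event_date TEXT);
CREATE TABLE Purchases (user_id INTEGER, event_date TEXT, product_id TEXT);
INSERT INTO Users VALUES
    (1, '2019-12-01'), (2, '2019-12-01'), (3, '2019-12-01');
INSERT INTO Purchases VALUES
    (1, '2019-12-01', 'sword'),
    (3, '2019-11-30', 'shield');  -- user 3 bought, but on another day
""")

cte = ("WITH all_users AS "
       "(SELECT user_id FROM Users WHERE event_date = '2019-12-01') ")

# Filter on the right-hand table in WHERE: NULL-extended rows fail the
# comparison, so users without a matching purchase disappear.
where_rows = conn.execute(cte + """
    SELECT a.user_id, p.product_id
    FROM all_users AS a
    LEFT JOIN Purchases AS p ON a.user_id = p.user_id
    WHERE p.event_date = '2019-12-01'
    ORDER BY a.user_id
""").fetchall()

# Same filter in ON: it only limits which purchases match, and every
# user from all_users is kept.
on_rows = conn.execute(cte + """
    SELECT a.user_id, p.product_id
    FROM all_users AS a
    LEFT JOIN Purchases AS p
        ON a.user_id = p.user_id AND p.event_date = '2019-12-01'
    ORDER BY a.user_id
""").fetchall()

print(where_rows)  # [(1, 'sword')]
print(on_rows)     # [(1, 'sword'), (2, None), (3, None)]
```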

Count number of posts of each user - SQL

I need to get the number of posts each user has created.
This is the structure of both tables (users, microposts).
Microposts
id
user_id
content
created_at
Users
id
name
email
admin
SELECT users.*, count( microposts.user_id )
FROM microposts LEFT JOIN users ON users.id=microposts.user_id
GROUP BY microposts.user_id
This gets me only the users that have posts. I need to get all users, even if they have 0 posts
You have the join in the wrong order.
In a LEFT JOIN you ensure you keep all the records in the table written first (to the left).
So, join in the other order (users first/left), and then group by the user table's id, and not the microposts table's user_id...
SELECT users.*, count( microposts.user_id )
FROM users LEFT JOIN microposts ON users.id=microposts.user_id
GROUP BY users.id
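For a quick check of the corrected join order, here is a sketch with Python's sqlite3 module and three made-up users, one of whom has no posts. The trimmed-down columns are an assumption for brevity. It also shows why counting microposts.user_id matters: COUNT(*) would count the NULL-extended row and report 1 instead of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users      (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE microposts (id INTEGER PRIMARY KEY, user_id INTEGER, content TEXT);
INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
INSERT INTO microposts VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'c');
""")

# COUNT(column) skips NULLs, so a user with no posts gets 0.
post_counts = conn.execute("""
    SELECT users.id, users.name, COUNT(microposts.user_id) AS post_count
    FROM users
    LEFT JOIN microposts ON users.id = microposts.user_id
    GROUP BY users.id, users.name
    ORDER BY users.id
""").fetchall()

# COUNT(*) counts rows, including the NULL-extended row for user 3.
star_counts = conn.execute("""
    SELECT users.id, users.name, COUNT(*) AS post_count
    FROM users
    LEFT JOIN microposts ON users.id = microposts.user_id
    GROUP BY users.id, users.name
    ORDER BY users.id
""").fetchall()

print(post_counts)  # [(1, 'ann', 2), (2, 'bob', 1), (3, 'cy', 0)]
print(star_counts)  # [(1, 'ann', 2), (2, 'bob', 1), (3, 'cy', 1)]
```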

Do these SQL queries make sense for the given tables?

Given the following two SQL tables, I want to run two queries (venue_id is a foreign key with id from the venues table):
SELECT venues.name FROM venues INNER JOIN users ON venues.venue_id = users.venue_id WHERE users.id = {param}
To return the venue name associated with the user_id that was passed in
SELECT users.name FROM users WHERE users.venue_id = {param} AND users.expiration_time > CURRENT_TIMESTAMP
To return all the current users with a given venue_id where the expiration_time is after the current time.
Do both these queries do what I expect them to do?
Thanks!
SELECT venues.name FROM venues INNER JOIN users ON venues.id = users.venue_id WHERE users.id = {param}
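SELECT users.name FROM users WHERE users.venue_id = {param} AND users.expiration_time > CURRENT_TIMESTAMP
With the join in the first query written as venues.id = users.venue_id (the key column in venues is id, not venue_id), both queries should do what you expect.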
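One way to verify both queries is to run them against a throwaway database. The sketch below uses Python's sqlite3 module; the table definitions are an assumption (the original tables are not shown in the post), the rows are invented, and {param} is replaced by a bound ? parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE venues (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT,
                     venue_id INTEGER REFERENCES venues(id),
                     expiration_time TEXT);
INSERT INTO venues VALUES (1, 'Arena'), (2, 'Club');
INSERT INTO users VALUES
    (10, 'ann', 1, '2999-01-01 00:00:00'),  -- not expired
    (11, 'bob', 1, '2000-01-01 00:00:00'),  -- expired
    (12, 'cy',  2, '2999-01-01 00:00:00');
""")

# Query 1: venue name for a given user id.
venue_name = conn.execute("""
    SELECT venues.name
    FROM venues
    INNER JOIN users ON venues.id = users.venue_id
    WHERE users.id = ?
""", (11,)).fetchone()

# Query 2: users of a given venue whose expiration_time is in the future.
current_users = conn.execute("""
    SELECT users.name
    FROM users
    WHERE users.venue_id = ?
      AND users.expiration_time > CURRENT_TIMESTAMP
    ORDER BY users.name
""", (1,)).fetchall()

print(venue_name)     # ('Arena',)
print(current_users)  # [('ann',)]
```

Binding the value as a parameter instead of pasting it into the SQL string also avoids injection problems.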

Return records of groupby with a count of one

I have two tables user_expenses and users.
The foreign key for user_expenses is user_expenses.user_id which corresponds to users.id.
I would like to get some information from both tables, in which the users have only ONE expense.
I gave this a shot:
SELECT
users.id, users.email, users.stripe_plan, users.previous_plan,
users.created_at, user_expenses.created_at, user_expenses.description
FROM
users
INNER JOIN
user_expenses ON user_expenses.user_id = users.id
WHERE
user_expenses.description NOT LIKE "%free%"
GROUP BY
user_expenses.user_id
HAVING
COUNT(*) = 1
But of course, this yields the following problem:
SELECT list is not in GROUP BY clause and contains nonaggregated column 'app.user_expenses.created_at' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
Adding this column into the group_by is problematic because it will actually return users who have multiple expenses with different descriptions.
Can anyone offer some advice on how to approach this problem? I only want users with a single entry in the user_expenses table, regardless of the type of description.
You could do either a subquery or you could pseudo-aggregate values that are not in the group by list:
(1):
SELECT users.id, users.email, users.stripe_plan, users.previous_plan, users.created_at, user_expenses.created_at, user_expenses.description
FROM users
INNER JOIN user_expenses
ON user_expenses.user_id = users.id
WHERE user_expenses.description NOT LIKE "%free%"
and users.id not in
(select ue2.user_id from user_expenses ue2 group by user_id having count(*) > 1)
(2)
SELECT users.id, max(users.email), max(users.stripe_plan), max(users.previous_plan), max(users.created_at), max(user_expenses.created_at), max(user_expenses.description)
FROM users
INNER JOIN user_expenses
ON user_expenses.user_id = users.id
WHERE user_expenses.description NOT LIKE "%free%"
GROUP BY user_expenses.user_id
HAVING COUNT(*) = 1
You can do a dummy aggregation. Change:
user_expenses.created_at, user_expenses.description
in the select list to:
min(user_expenses.created_at) created_at, min(user_expenses.description) description
... which will be the same as the original value, since you know you only have one per group.
It would also be more natural to group by the users.id field, which has the advantage that the query keeps working if you ever need to outer join the user_expenses table:
group by users.id
NB: in MySQL 5.7+ it is not necessary to aggregate fields that are functionally dependent on the grouped-by fields. Since all fields of the users record are determined by the users.id value, they can go without aggregation.
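The dummy-aggregation approach can be sketched end to end as follows. This runs on SQLite via Python's sqlite3 module rather than MySQL, uses a reduced set of columns and invented rows, and writes the LIKE pattern with single quotes, so treat it as an illustration of the technique rather than a drop-in query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE user_expenses (id INTEGER PRIMARY KEY, user_id INTEGER,
                            created_at TEXT, description TEXT);
INSERT INTO users VALUES
    (1, 'a@example.com'), (2, 'b@example.com'), (3, 'c@example.com');
INSERT INTO user_expenses VALUES
    (1, 1, '2020-01-01', 'pro plan'),    -- user 1: exactly one expense
    (2, 2, '2020-01-02', 'pro plan'),    -- user 2: two expenses
    (3, 2, '2020-01-03', 'add-on'),
    (4, 3, '2020-01-04', 'free trial');  -- user 3: only a "free" expense
""")

# MIN() over a group of exactly one row returns that row's value, so the
# aggregates satisfy the GROUP BY rules without changing the data.
single_expense_users = conn.execute("""
    SELECT users.id, users.email,
           MIN(user_expenses.created_at)  AS created_at,
           MIN(user_expenses.description) AS description
    FROM users
    INNER JOIN user_expenses ON user_expenses.user_id = users.id
    WHERE user_expenses.description NOT LIKE '%free%'
    GROUP BY users.id, users.email
    HAVING COUNT(*) = 1
    ORDER BY users.id
""").fetchall()

print(single_expense_users)
# [(1, 'a@example.com', '2020-01-01', 'pro plan')]
```

Note that the WHERE filter runs before the count, so a user with one paid and one "free" expense would still pass HAVING COUNT(*) = 1. If the count should cover all expenses regardless of description, the subquery variant is closer to that intent.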