SQL join and count rows (count people in table) - sql

I have a table with users and a table with their posts. I do not understand how to calculate the number of users, for example, whose number of posts is less than 10.
select Users.DisplayName, count(Users.id) as Questions from Users
LEFT JOIN
Posts on Users.id=Posts.OwnerUserId
GROUP BY Users.DisplayName
HAVING (count(Users.Id)<10) or (count(Users.Id)>10 and count(Users.Id)<20)

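If you want the number of users that have fewer than 10 posts, first group by user to get each user's post count, then count the users that pass the filter. Without a GROUP BY, HAVING treats the whole join as a single group, so the filter has to sit in a grouped subquery:
SELECT
COUNT(*) AS UsersUnder10
FROM (
SELECT a.Id
FROM Users a
LEFT JOIN Posts b ON a.Id = b.OwnerUserId
GROUP BY a.Id
HAVING COUNT(b.Id) < 10 -- COUNT(b.Id) skips the NULLs of users with no posts
) t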
You can see that here using the data explorer.
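A minimal sketch of the "group per user, then count the groups" idea, run against an in-memory SQLite database through Python's sqlite3 module. The three users and their post counts are invented sample data, not the real data explorer tables. Note that the question's COUNT(Users.Id) would report 1 for a user with no posts, because the LEFT JOIN still produces one row for that user; counting the Posts column avoids that.

```python
import sqlite3

# Hypothetical miniature of the Users/Posts schema from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (Id INTEGER PRIMARY KEY, DisplayName TEXT);
    CREATE TABLE Posts (Id INTEGER PRIMARY KEY, OwnerUserId INTEGER);
    INSERT INTO Users VALUES (1, 'ann'), (2, 'bob'), (3, 'cid');
""")
# ann gets 12 posts, bob gets 3, cid gets none.
owners = [1] * 12 + [2] * 3
conn.executemany("INSERT INTO Posts (OwnerUserId) VALUES (?)", [(o,) for o in owners])

# Inner query: one row per user, filtered on that user's post count.
# COUNT(p.Id) ignores the NULLs produced by the LEFT JOIN, so cid counts as 0.
# Outer query: count the users that survive the HAVING filter.
users_under_10 = conn.execute("""
    SELECT COUNT(*)
    FROM (
        SELECT u.Id
        FROM Users u
        LEFT JOIN Posts p ON p.OwnerUserId = u.Id
        GROUP BY u.Id
        HAVING COUNT(p.Id) < 10
    ) AS t
""").fetchone()[0]

print(users_under_10)  # bob and cid qualify
```

For a range such as 10 to 19 posts, the same shape should work with the HAVING condition changed to COUNT(p.Id) BETWEEN 10 AND 19.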

Related

SQL - Guarantee at least n unique users with 2 appearances each in query

I'm working with AWS Personalize and one of the service Quotas is to have "At least 1000 records containing a min of 25 unique users with at least 2 records each", I know my raw data has those numbers but I'm trying to find a way to guarantee that those numbers will always be met, even if the query is run by someone else in the future.
The easy way out would be to just use the full dataset, but right now we are working towards a POC, so that is not really my first option. I have covered the "two records each" section by just counting the appearances, but I don't know how to guarantee the min of 25 users.
It is important to say that my data is not shuffled in any way at the time of saving.
My query
SELECT C.productid AS ITEM_ID,
A.userid AS USER_ID,
A.createdon AS "TIMESTAMP",
B.fromaddress_countryname AS "LOCATION"
FROM A AS orders
JOIN B AS sub_orders ON orders.order_id = sub_orders.order_id
JOIN C AS order_items ON orders.order_id = order_items.order_id
WHERE orders.userid IN (
SELECT orders.userid
FROM A AS ORDERS
GROUP BY orders.userid
HAVING count(*) > 2
)
LIMIT 10
I use the LIMIT to just query a subset since I'm in AWS Athena.
Depending on the engine, the IN subquery can be inefficient: in the worst case each row is compared against every element the subquery returns.
It is easier to start by storing all users with at least 2 records in a common table expression (CTE) and joining against it.
To ensure at least 25 distinct users, you need a window function that counts the distinct users seen so far, in userid order, plus a condition on that count. Since you can't use a window function in the WHERE clause, you need a second CTE and a final query that filters on it.
For example:
with users as (
select userid
from A as orders
group by 1
having count(*) > 1 -- this condition ensures at least 2 records
),
cte as (
SELECT order_items.productid AS ITEM_ID,
orders.userid AS USER_ID,
orders.createdon AS "TIMESTAMP",
sub_orders.fromaddress_countryname AS "LOCATION",
-- dense_rank ordered by userid equals the number of distinct users seen so far
dense_rank() over (order by orders.userid) as n_distinct_users
FROM A AS orders
JOIN B AS sub_orders ON orders.order_id = sub_orders.order_id
JOIN C AS order_items ON orders.order_id = order_items.order_id
JOIN users ON orders.userid = users.userid -- keep only users with at least 2 records
)
select * from cte where n_distinct_users < 26
Because the rank is ordered by userid, all rows of one user share the same rank, so every selected user comes back with all of their records (at least 2) and the filter keeps the first 25 distinct users. The query cannot invent users: if fewer than 25 qualify, it returns fewer.
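A scaled-down sketch of the "running count of distinct users" idea, using DENSE_RANK() as the window function and Python's sqlite3 module (window functions need SQLite 3.25 or newer). The orders table, the user ids, and MIN_USERS = 2 are invented for illustration; MIN_USERS stands in for the 25 users in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, userid TEXT);
    INSERT INTO orders (userid) VALUES
        ('u1'), ('u1'), ('u2'), ('u3'), ('u3'), ('u3'), ('u4'), ('u4');
""")

MIN_USERS = 2  # hypothetical stand-in for the 25 users required in the question

rows = conn.execute("""
    WITH good_users AS (
        -- users with at least 2 records
        SELECT userid FROM orders GROUP BY userid HAVING COUNT(*) > 1
    ),
    ranked AS (
        -- every row of the same user gets the same rank, and the rank
        -- increases by one for each new user in userid order
        SELECT o.order_id, o.userid,
               DENSE_RANK() OVER (ORDER BY o.userid) AS user_rank
        FROM orders o
        JOIN good_users g ON g.userid = o.userid
    )
    SELECT order_id, userid FROM ranked WHERE user_rank <= ? ORDER BY order_id
""", (MIN_USERS,)).fetchall()

selected_users = {userid for _, userid in rows}
print(rows)
print(selected_users)
```

Here u2 is dropped for having a single order, and the rank filter keeps every row of u1 and u3. A check such as len(selected_users) >= MIN_USERS after the query is one way to fail loudly when the underlying data no longer contains enough qualifying users.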

Count how many times a user logged in 1x, 2x, 3x

I'm just beginning to learn SQL and this has completely stumped me. I join two tables on user_id where the event was a login. So far so good. Then I need to group those occurrences and count them to return the answer. How many times did users log in 1x, 2x, 3x...?
What I am having trouble with is referencing the first count (occurrences) and the fact that I can't group by occurrences since it is an aggregate function.
Here is the code, it returns two columns, user_id and occurrences. The data is on www.mode.com.
SELECT
Users.user_id,
COUNT(Users.user_id) AS occurrences
FROM
tutorial.playbook_users Users
JOIN tutorial.playbook_events EVENTS ON Users.user_id = EVENTS.user_id
WHERE
EVENTS.event_name = 'login'
GROUP BY
1
ORDER BY
2
So just aggregate it again: use your query as a subquery and group by its count.
SELECT
q.UserLogins AS occurrences,
COUNT(*) AS Total
FROM
(
SELECT
Users.user_id,
COUNT(EVENTS.user_id) AS UserLogins
FROM tutorial.playbook_users Users
JOIN tutorial.playbook_events EVENTS
ON EVENTS.user_id = Users.user_id
AND EVENTS.event_name = 'login'
GROUP BY Users.user_id
) q
GROUP BY q.UserLogins
ORDER BY q.UserLogins
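The two-level aggregation can be tried on a tiny data set. This is a sketch using Python's sqlite3 module; the single events table and its rows are made up and only mimic the shape of tutorial.playbook_events.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (user_id INTEGER, event_name TEXT);
    INSERT INTO events VALUES
        (1, 'login'),
        (2, 'login'), (2, 'login'),
        (3, 'login'), (3, 'login'),
        (4, 'login'), (4, 'login'), (4, 'login'),
        (4, 'signup');
""")

# Inner query: logins per user. Outer query: how many users share each count.
histogram = conn.execute("""
    SELECT q.user_logins AS occurrences, COUNT(*) AS total
    FROM (
        SELECT user_id, COUNT(*) AS user_logins
        FROM events
        WHERE event_name = 'login'
        GROUP BY user_id
    ) AS q
    GROUP BY q.user_logins
    ORDER BY q.user_logins
""").fetchall()

print(histogram)  # one (occurrences, total) pair per distinct login count
```

With this sample, one user logged in once, two users logged in twice, and one user logged in three times; the signup event is ignored by the WHERE clause.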

Trying to count the number of occurrences that 3 columns from 2 tables have on my organizations table? I need the occurrences joined in one table

-- 2. In one table, show how many private topics, admins, and standard users each organization has.
SELECT organizations.name, COUNT(topics.privacy) AS private_topic, COUNT(users.type) AS user_admin, COUNT(users.type) AS user_standard
FROM organizations
LEFT JOIN topics
ON organizations.id=topics.org_id
AND topics.privacy='private'
LEFT JOIN users
ON users.org_id=organizations.id
AND users.type='admin'
LEFT JOIN users
ON users.org_id=organizations.id
AND users.type='standard'
GROUP BY organizations.name
;
org_id is the foreign key that relates both the users table and the topics table to organizations. The query keeps giving me the wrong result: it counts only either the admins or the standard users and puts that number in every row of each column. Any help is really appreciated, as I have been stuck on this for a while now!
So, I am getting an error when I do as you said, which is that the users table cannot be specified more than once. I updated the code to how you said to write it, but still nothing. They really don't give me any sample data either, but I ran some queries and saw, for example, how many private topics there are, which is in the privacy column of the topics table. When I don't get this error, the joins seem to overwrite each other, so that every column in each row shows the count from the last join.
It appears to me that topics and users have no relationship. You're just trying to get the result together in a single query. There are other and possibly better ways to accomplish that but I think this will fix what you've got already (assuming you have id columns for each table.)
SELECT
organizations.name,
COUNT(DISTINCT topics.id) AS private_topic,
COUNT(DISTINCT users.id) FILTER (WHERE users.type = 'admin') AS user_admin,
COUNT(DISTINCT users.id) FILTER (WHERE users.type = 'standard') AS user_standard
FROM organizations
LEFT JOIN topics
ON organizations.id = topics.org_id AND topics.privacy = 'private'
LEFT JOIN users
ON users.org_id = organizations.id
GROUP BY organizations.name;
I propose this as a more straightforward way:
SELECT
min(o.name) as "name",
(
select count(*) from topics t
where t.org_id = o.id AND t.privacy = 'private'
) as private_topics,
(
select count(*) from users u
where u.org_id = o.id and u.type = 'admin'
) AS user_admin,
(
select count(*) from users u
where u.org_id = o.id and u.type = 'standard'
) AS user_standard
FROM organizations o
GROUP BY o.id;
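To see why joining two unrelated child tables inflates plain counts, here is a small sketch using Python's sqlite3 module with invented organizations, topics, and users. It uses COUNT(DISTINCT CASE ... END) as a portable stand-in for the FILTER clause, which not every database supports.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE organizations (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE topics (id INTEGER PRIMARY KEY, org_id INTEGER, privacy TEXT);
    CREATE TABLE users (id INTEGER PRIMARY KEY, org_id INTEGER, type TEXT);
    INSERT INTO organizations VALUES (1, 'acme'), (2, 'globex');
    INSERT INTO topics VALUES (1, 1, 'private'), (2, 1, 'private'), (3, 1, 'public');
    INSERT INTO users VALUES (1, 1, 'admin'), (2, 1, 'standard'), (3, 1, 'standard'),
                             (4, 2, 'admin');
""")

# Both child tables hang off organizations, so each private topic is paired
# with each user of the same organization (2 topics x 3 users = 6 rows for acme).
joins = """
    FROM organizations o
    LEFT JOIN topics t ON t.org_id = o.id AND t.privacy = 'private'
    LEFT JOIN users u ON u.org_id = o.id
    GROUP BY o.id, o.name
    ORDER BY o.id
"""

# Plain COUNT counts the multiplied rows.
naive = conn.execute(
    "SELECT o.name, COUNT(t.id), COUNT(u.id)" + joins).fetchall()

# COUNT(DISTINCT ...) collapses the duplicates; CASE picks the user type.
correct = conn.execute("""
    SELECT o.name,
           COUNT(DISTINCT t.id),
           COUNT(DISTINCT CASE WHEN u.type = 'admin' THEN u.id END),
           COUNT(DISTINCT CASE WHEN u.type = 'standard' THEN u.id END)
""" + joins).fetchall()

print(naive)
print(correct)
```

The correlated-subquery version avoids the row multiplication altogether, which is one reason it can be easier to reason about.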

SQL How can I create an inner join from these 2 tables

I am just learning SQL and I am creating a social network as a learning experience. I have 2 tables called Streams and Votes: Streams holds user-created content and Votes stores the content that people have liked. What I am trying to figure out is how to return data from both tables to check whether a user has liked a particular post being shown. For instance, this is how both my tables look. As you can see, they both have a stream_id field in common and they both contain the number 278. How can I do an inner join that checks whether there are any common stream_id values in both tables? This is the SQL code I use to get the Stream data:
Query 1
select post,profile_id,
votes,id as stream_id FROM streams WHERE latitudes>=28.1036 AND
28.9732>=latitudes AND longitudes>=-81.8696 AND -80.8798>=longitudes
order by id desc limit 10
The user ID is 11, and Streams.profile_id and Votes.my_id are the same field. I have tried this SQL query, but it returns only 1 result in total. Again, I would like to return all results from the Streams table, as I do in Query 1, and also add another column from the Votes table where Votes.stream_id = Streams.id, because that will show that the particular user has liked that post. Any help would be great.
Query 2
select s.post,s.profile_id,
s.votes,s.id as stream_id, v.my_id as ID FROM streams s inner join Votes v on (s.id = v.stream_id) WHERE latitudes>=28.1036 AND
28.9732>=latitudes AND longitudes>=-81.8696 AND -80.8798>=longitudes
order by id desc limit 10
Streams
Votes
You need to use a LEFT OUTER JOIN instead of an INNER JOIN.
SELECT s.post, s.profile_id, s.votes, s.id AS stream_id, v.my_id AS ID
FROM streams s
LEFT OUTER JOIN Votes v ON (s.id = v.stream_id)
WHERE s.latitudes >= 28.1036 AND 28.9732 >= s.latitudes
AND s.longitudes >= -81.8696 AND -80.8798 >= s.longitudes
ORDER BY s.id DESC LIMIT 10 -- s.id is qualified so it cannot be confused with the ID alias
For more information about joins, look here.
I think that you are looking for a LEFT JOIN
SELECT s.post, s.profile_id, s.votes, s.id AS stream_id, v.my_id AS ID
FROM streams s
LEFT JOIN Votes v ON (s.id = v.stream_id)
WHERE s.latitudes >= 28.1036 AND 28.9732 >= s.latitudes
AND s.longitudes >= -81.8696 AND -80.8798 >= s.longitudes
ORDER BY s.id DESC LIMIT 10
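If the goal is a per-post "did user 11 like this" flag, one possible refinement is to put the user condition in the ON clause, so that votes from other users do not add rows and posts without a matching vote still appear. This is a sketch using Python's sqlite3 module; the rows are invented, the latitude and longitude filter is left out for brevity, and CURRENT_USER is a hypothetical parameter name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE streams (id INTEGER PRIMARY KEY, post TEXT);
    CREATE TABLE votes (stream_id INTEGER, my_id INTEGER);
    INSERT INTO streams VALUES (277, 'first'), (278, 'second'), (279, 'third');
    INSERT INTO votes VALUES (278, 11), (279, 42);
""")

CURRENT_USER = 11  # the user whose likes are being checked

# The user filter lives in ON, not WHERE, so unmatched streams are kept
# with NULL vote columns; "IS NOT NULL" turns that into a 0/1 flag.
rows = conn.execute("""
    SELECT s.id, s.post, v.my_id IS NOT NULL AS liked
    FROM streams s
    LEFT JOIN votes v ON v.stream_id = s.id AND v.my_id = ?
    ORDER BY s.id DESC
""", (CURRENT_USER,)).fetchall()

print(rows)
```

Post 279 was liked only by user 42, so it still shows up for user 11, with the flag set to 0.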

Left join with on a limited right table

I have cards, schedules, users. Each schedule has foreign keys to a user and a card. Assuming I have 100 cards and 2 users, each with 10 schedules. I would like a set of 100 cards joined with the 10 schedules of only 1 of the users. When I try to limit my dataset to 1 of the users, I only end up with 10 rows. How can I set up a query to achieve what I want?
Basically you want something in the form of:
select *
from cards c
left join
(
select *
from schedules s
join users u on s.UserID = u.ID
where u.ID = 1
) x on c.ID = x.CardID
This way the WHERE clause is applied before the join and doesn't interfere with the LEFT JOIN. You'll have to change the column names to whatever you are using, and possibly list the columns in the subquery explicitly so there aren't any issues with overlapping column names.
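A small sketch of the difference, using Python's sqlite3 module and made-up data (5 cards instead of 100). It filters in the ON clause, which should be equivalent to the derived-table form above, and it skips the users table because schedules already carries the user's foreign key; the column names card_id and user_id are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cards (id INTEGER PRIMARY KEY);
    CREATE TABLE schedules (id INTEGER PRIMARY KEY, card_id INTEGER, user_id INTEGER);
    INSERT INTO cards VALUES (1), (2), (3), (4), (5);
    INSERT INTO schedules (card_id, user_id) VALUES (1, 1), (2, 1), (2, 2), (3, 2);
""")

# Filtering in WHERE runs after the join and discards the NULL-padded rows,
# so cards without a schedule for user 1 disappear.
where_rows = conn.execute("""
    SELECT c.id, s.id
    FROM cards c
    LEFT JOIN schedules s ON s.card_id = c.id
    WHERE s.user_id = 1
""").fetchall()

# Filtering in ON restricts the right side only; every card is kept.
on_rows = conn.execute("""
    SELECT c.id, s.id
    FROM cards c
    LEFT JOIN schedules s ON s.card_id = c.id AND s.user_id = 1
    ORDER BY c.id
""").fetchall()

print(len(where_rows), len(on_rows))
print(on_rows)
```

The WHERE version returns only the cards that user 1 has scheduled, while the ON version returns all cards with NULL in the schedule column where user 1 has none.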