Is group by good for this use case? - sql

I have a table with user_id and role_id columns. A single user_id (for example, 1) can have multiple roles. I want to group by user_id and show the number of roles for each one, e.g. "this user_id has 7 roles".
How can I achieve this in a raw SQL query?

SELECT user_id, count(role_id) c
FROM atable
GROUP BY user_id

It seems that you are looking for a typical GROUP BY:
select user_id,
count(role_id)
from MyTable
group by user_id;
Here we group all records in MyTable by user_id and then count all non-null role_id values within each group. Depending on how role_id should be counted, you may want to use
count(*)
to count every row in the group, including rows where role_id is null (note that count(all role_id) is the same as count(role_id) and still skips nulls), or
count(distinct role_id)
to count distinct role_id values in each group.
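
As a quick check of how the counting variants differ, here is a minimal sketch that runs them side by side. It assumes an in-memory SQLite database as a stand-in for whatever engine you use, and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (user_id INTEGER, role_id INTEGER)")

# Made-up sample data: user 1 has a duplicate role and a NULL role.
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [(1, 10), (1, 10), (1, 20), (1, None), (2, 30)],
)

rows = conn.execute(
    """
    SELECT user_id,
           COUNT(*)                AS all_rows,        -- every row in the group
           COUNT(role_id)          AS non_null_roles,  -- skips NULL role_id
           COUNT(DISTINCT role_id) AS distinct_roles   -- skips NULL and duplicates
    FROM MyTable
    GROUP BY user_id
    ORDER BY user_id
    """
).fetchall()

print(rows)  # [(1, 4, 3, 2), (2, 1, 1, 1)]
```

For user 1 the three counts are 4, 3 and 2, so the choice of expression matters whenever the column contains nulls or repeated values.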

Related

BigQuery GROUP BY ... HAVING asks for other columns to be grouped

Running something like this:
SELECT user_id, username FROM `table` GROUP BY user_id HAVING COUNT(user_id) = 1
But the BQ console complains that username is neither grouped nor aggregated. I'm looking at this post that explains how to remove rows that appear more than once. I'm assuming this error message is because there are no primary keys or unique constraints in BQ? How can I get around this? I just want to eliminate repeated rows by user_id.
"I just want to eliminate repeated rows by user_id."
The query below should do it:
SELECT user_id, ANY_VALUE(username) as username
FROM `table`
GROUP BY user_id
If you want one row per user_id, you can just use an aggregation function such as:
SELECT user_id, MAX(username) as username
FROM `table`
GROUP BY user_id
HAVING COUNT(user_id) = 1;
However, I might suggest using QUALIFY instead:
select t.*
from `table` t
where 1=1
qualify count(*) over (partition by user_id) = 1;
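
Note that the answers above do two different things: grouping alone keeps one row per user_id, while adding HAVING COUNT(...) = 1 keeps only the user_ids that occur exactly once. The sketch below illustrates the difference. It uses SQLite rather than BigQuery, MAX() in place of ANY_VALUE(), and a hypothetical table name users_raw with made-up rows, so treat it as an approximation of the BigQuery behaviour rather than a test of it. QUALIFY is not exercised here.

```python
import sqlite3

# SQLite stands in for BigQuery here (assumption); users_raw is a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users_raw (user_id INTEGER, username TEXT)")
conn.executemany(
    "INSERT INTO users_raw VALUES (?, ?)",
    [(1, "ann"), (1, "ann"), (2, "bob"), (3, "cy")],
)

# One row per user_id: duplicates are collapsed, every user_id survives.
one_per_user = conn.execute(
    """
    SELECT user_id, MAX(username) AS username
    FROM users_raw
    GROUP BY user_id
    ORDER BY user_id
    """
).fetchall()

# Only user_ids that appear exactly once: duplicated user_ids are dropped entirely.
only_unique = conn.execute(
    """
    SELECT user_id, MAX(username) AS username
    FROM users_raw
    GROUP BY user_id
    HAVING COUNT(*) = 1
    ORDER BY user_id
    """
).fetchall()

print(one_per_user)  # [(1, 'ann'), (2, 'bob'), (3, 'cy')]
print(only_unique)   # [(2, 'bob'), (3, 'cy')]
```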

How to insert unique ID from subquery into a table?

I have two SQL Server tables: users_flags and users.
users:
user_id | email_address
1       | john#company.com
2       | amy#company.com
3       | john#company.com
2       | amy#company.com
users_flags:
flag_id | user_id
How do I insert all unique user_id values from the users table into the users_flags table, using a subquery to filter by email_address?
For example, I have a list of email addresses that I need to retrieve the user_id for:
SELECT user_id FROM users
WHERE email_address IN ('john#company.com',
'amy#company.com',
'guster#company.com')
Normally, I would just use this as a subquery of my INSERT statement (I need to hard code the flag_id):
INSERT INTO users_flags (flag_id, user_id)
SELECT 3,
user_id
FROM users
WHERE email_address IN ('john#company.com',
'amy#company.com',
'guster#company.com')
However, since my users dataset currently has some duplicate data, I need to get only the DISTINCT user_id records from that table.
I can not use the DISTINCT keyword on user_id in my subquery, though (invalid syntax). How would I update my INSERT statement to account for only unique user IDs?
Use the window function row_number() and select only the two columns being inserted (SELECT * would also return seq, which does not match the column list):
INSERT INTO users_flags (flag_id, user_id)
SELECT flag_id, user_id
FROM
(SELECT 3 as flag_id,
user_id,
row_number() over(partition by user_id order by (select null)) as seq
FROM users
WHERE email_address IN ('john#company.com',
'amy#company.com',
'guster#company.com')) T
WHERE seq = 1
You should be able to use select distinct:
INSERT INTO users_flags (flag_id, user_id)
SELECT DISTINCT 3, user_id
FROM users
WHERE email_address IN ('john#company.com',
'amy#company.com',
'guster#company.com');
There are many ways to avoid duplicates, for example:
NOT EXISTS (also skips users that already have the flag):
INSERT INTO users_flags (flag_id, user_id)
SELECT DISTINCT 3,
u.user_id
FROM users u
WHERE u.email_address IN ('john#company.com',
'amy#company.com',
'guster#company.com')
AND NOT EXISTS(SELECT 1
FROM users_flags f
WHERE f.flag_id = 3
AND f.user_id = u.user_id)
A CTE:
;with Temp as(
SELECT Distinct user_id FROM users
WHERE email_address IN ('john#company.com',
'amy#company.com',
'guster#company.com')
)
INSERT INTO users_flags (flag_id, user_id) select 3, user_id from Temp
The CTE approach should work in almost all databases, though the exact syntax varies slightly between them.
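
To see the SELECT DISTINCT variant end to end, here is a minimal sketch using an in-memory SQLite database instead of SQL Server (an assumption made so the example can run anywhere). The table layouts follow the question, and the target table is taken to be users_flags.

```python
import sqlite3

# SQLite stands in for SQL Server (assumption); schemas follow the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER, email_address TEXT)")
conn.execute("CREATE TABLE users_flags (flag_id INTEGER, user_id INTEGER)")

# Sample rows from the question, including the duplicated user_id 2.
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [
        (1, "john#company.com"),
        (2, "amy#company.com"),
        (3, "john#company.com"),
        (2, "amy#company.com"),
    ],
)

# DISTINCT applies to the whole (flag_id, user_id) row, so each user_id is inserted once.
conn.execute(
    """
    INSERT INTO users_flags (flag_id, user_id)
    SELECT DISTINCT 3, user_id
    FROM users
    WHERE email_address IN ('john#company.com',
                            'amy#company.com',
                            'guster#company.com')
    """
)

flags = conn.execute(
    "SELECT flag_id, user_id FROM users_flags ORDER BY user_id"
).fetchall()
print(flags)  # [(3, 1), (3, 2), (3, 3)]
```

The address with no matching user ('guster#company.com') simply contributes no rows.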

How to group by in sql with condition

I am trying to find users whose purchase count is greater than x (10 in this example). How do I put a condition on the group by? What I did in the example below doesn't work:
select count(purchases.id), user_id from purchases
group by if(count(purchases.id)>10)
This is what did work (but I still need to add the condition here):
select count(purchases.id), user_id from purchases
group by user_id
Thanks in advance
I think you want the having clause:
select count(*), user_id
from purchases
group by user_id
having count(*) > 10;
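
A minimal sketch of the HAVING filter in action, assuming an in-memory SQLite database and made-up purchase rows (11, 10 and 2 purchases for users 1, 2 and 3):

```python
import sqlite3

# In-memory SQLite database with made-up data (assumption, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (id INTEGER PRIMARY KEY, user_id INTEGER)")

# User 1 has 11 purchases, user 2 has exactly 10, user 3 has 2.
user_ids = [1] * 11 + [2] * 10 + [3] * 2
conn.executemany(
    "INSERT INTO purchases (user_id) VALUES (?)",
    [(u,) for u in user_ids],
)

# HAVING filters whole groups after aggregation; WHERE cannot reference count(*).
big_buyers = conn.execute(
    """
    SELECT count(*), user_id
    FROM purchases
    GROUP BY user_id
    HAVING count(*) > 10
    """
).fetchall()

print(big_buyers)  # [(11, 1)]
```

User 2 is excluded because the comparison is strict; use >= 10 to include users with exactly 10 purchases.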

Postgresql count results for returned user field

I am running the following postgres SQL query:
SELECT user_id FROM user_log WHERE date>='2016-08-09' ORDER by user_id ASC
It returns the results ordered by user_id, so I can end up with multiple rows for the same user_id, like the example below:
user_id
1001
1001
1001
1008
1008
Instead of listing each user_id, I just want to count how many results there are for each user_id. So for the example above, I would like to know that 1001 has 3 and 1008 has 2.
Is there any way to do this directly with a SQL query?
You can try doing a simple GROUP BY query, with user_id determining the group for which you want the count:
SELECT user_id, COUNT(*) AS userCount
FROM user_log
WHERE date>='2016-08-09'
GROUP BY user_id
ORDER by user_id ASC
If you want to restrict, for example, to users only having a count of at least 3, then you can add a HAVING clause:
SELECT user_id, COUNT(*) AS userCount
FROM user_log
WHERE date>='2016-08-09'
GROUP BY user_id
HAVING COUNT(*) >= 3
ORDER by user_id ASC
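
The order of filtering matters here: WHERE removes rows before grouping, and HAVING removes groups after counting. The sketch below shows both queries on the sample data from the question, plus one made-up older row to show the date filter at work. It assumes SQLite with the date stored as ISO text instead of a PostgreSQL date column.

```python
import sqlite3

# SQLite stands in for PostgreSQL; dates are ISO-8601 text (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_log (user_id INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO user_log VALUES (?, ?)",
    [
        (1001, "2016-08-01"),  # made-up older row, removed by WHERE before grouping
        (1001, "2016-08-09"),
        (1001, "2016-08-10"),
        (1001, "2016-08-11"),
        (1008, "2016-08-09"),
        (1008, "2016-08-12"),
    ],
)

counts = conn.execute(
    """
    SELECT user_id, COUNT(*) AS userCount
    FROM user_log
    WHERE date >= '2016-08-09'
    GROUP BY user_id
    ORDER BY user_id ASC
    """
).fetchall()

at_least_three = conn.execute(
    """
    SELECT user_id, COUNT(*) AS userCount
    FROM user_log
    WHERE date >= '2016-08-09'
    GROUP BY user_id
    HAVING COUNT(*) >= 3
    ORDER BY user_id ASC
    """
).fetchall()

print(counts)          # [(1001, 3), (1008, 2)]
print(at_least_three)  # [(1001, 3)]
```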

How can I count the non-unique combinations of values in MySQL?

I have a table with some legacy data that I suspect may be a little messed up. It is a many-to-many join table.
LIST_MEMBERSHIPS
----------------
list_id
address_id
I'd like to run a query that will count the occurrences of each list_id-address_id pair and show the occurrence count for each from highest to lowest number of occurrences.
I know it's got to involve COUNT() and GROUP BY, right?
select list_id, address_id, count(*) as count
from LIST_MEMBERSHIPS
group by 1, 2
order by 3 desc
You may find it useful to add
having count > 1
select count(*), list_id, address_id
from LIST_MEMBERSHIPS
group by list_id, address_id
order by count(*) desc
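
Combining the grouping with the suggested HAVING filter gives a direct list of duplicated pairs. This is a minimal sketch, assuming an in-memory SQLite database in place of MySQL and made-up membership rows:

```python
import sqlite3

# SQLite stands in for MySQL; the rows are made up for illustration (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE LIST_MEMBERSHIPS (list_id INTEGER, address_id INTEGER)")
conn.executemany(
    "INSERT INTO LIST_MEMBERSHIPS VALUES (?, ?)",
    [(1, 100), (1, 100), (1, 100), (1, 200), (2, 100), (2, 100)],
)

# Group on both columns so each (list_id, address_id) pair is counted separately,
# then keep only pairs that occur more than once.
duplicates = conn.execute(
    """
    SELECT list_id, address_id, COUNT(*) AS occurrences
    FROM LIST_MEMBERSHIPS
    GROUP BY list_id, address_id
    HAVING COUNT(*) > 1
    ORDER BY COUNT(*) DESC
    """
).fetchall()

print(duplicates)  # [(1, 100, 3), (2, 100, 2)]
```

The pair (1, 200) occurs once and is filtered out, leaving only the rows worth cleaning up.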