SQL Select on priority of not exist


I'm quite confused as to how I'd structure a query I need:
select distinct NodeID from GroupsTable
where GroupID in
(
select GroupID from UserGroupsTable
where UserID = @UserID
)
I need to select NodeID from GroupsTable where the user belongs to multiple groups, UNLESS any group that the user belongs to does not have that NodeID.
The above code selects a NodeID from GroupsTable whenever it appears in any one of the user's groups.
So:
Group|Node
1|A
1|B
2|A
I need only B to be selected for example.
Any Ideas?

declare @UserID int
select G.NodeID
from GroupsTable G
inner join
(
select GroupID, count(*) over () GroupCount
from UserGroupsTable
where UserID = @UserID
) UG on UG.GroupID = G.GroupID
GROUP BY G.NodeID, UG.GroupCount
HAVING COUNT(*) != UG.GroupCount OR UG.GroupCount = 1
-- count=1 is special. it will always equal groupcount, but let it through
(assuming this is SQL Server and version >= 2005 for using the count(*) over)
#answer HAVING condition changed following comments

You can use a having clause to ensure that the count of groups per node is less than the count of all groups the user is in:
select g.NodeID
from GroupsTable g
join UserGroupsTable ug
on g.GroupID = ug.GroupID
where ug.UserID = 1
group by
ug.UserID
, g.NodeID
having COUNT(distinct g.GroupID) <
(
select COUNT(distinct GroupID)
from UserGroupsTable ug2
where ug2.UserID = ug.UserID
)
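To check the HAVING comparison against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server. It assumes user 1 belongs to groups 1 and 2; that membership is not stated in the question and is only an illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE GroupsTable (GroupID INTEGER, NodeID TEXT);
    CREATE TABLE UserGroupsTable (UserID INTEGER, GroupID INTEGER);
    -- Sample rows from the question: group 1 has A and B, group 2 has only A.
    INSERT INTO GroupsTable VALUES (1, 'A'), (1, 'B'), (2, 'A');
    -- Assumption: user 1 belongs to both groups.
    INSERT INTO UserGroupsTable VALUES (1, 1), (1, 2);
""")

query = """
    SELECT g.NodeID
    FROM GroupsTable g
    JOIN UserGroupsTable ug ON g.GroupID = ug.GroupID
    WHERE ug.UserID = :user_id
    GROUP BY ug.UserID, g.NodeID
    -- keep a node only if it is missing from at least one of the user's groups
    HAVING COUNT(DISTINCT g.GroupID) <
           (SELECT COUNT(DISTINCT GroupID)
            FROM UserGroupsTable ug2
            WHERE ug2.UserID = ug.UserID)
"""
nodes = [row[0] for row in conn.execute(query, {"user_id": 1})]
print(nodes)
```

Node A appears in both of the user's groups, so its count equals the total and it is filtered out; only B remains.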

Related

SQL Select for Users only in certain List of Groups

I need to select a group of users that are in certain user_groups, but only in those user_groups.
User
1
2
3
Group
a
b
c
UserToGroup
1, a
1, b
1, c
2, a
2, c
3, b
3, c
User must only be in Groups a and c
Result
2, a
2, c
Group and User are both scaling tables, so excluding unwanted groups in the query is not an option.

EDIT: I modified the query to include other data from the user table.
If you need to get other data from user and/or group, you can use a simple JOIN with NOT EXISTS to eliminate users that have anything other than the groups in your list. It should optimize pretty well. The HAVING count(*) check relies on there being no more than one linking record in usertogroup per user and group; if duplicates are a concern, the query can be filtered further to eliminate them. It also needs to be passed how many groups are being searched. How are you passing this list of values to the query?
SELECT u.uid AS userid, u.otherstuff
FROM usr u
INNER JOIN usertogroup utg ON u.uid = utg.uid
INNER JOIN grp g ON utg.gid = g.gid
AND g.gid IN ('a','c')
WHERE NOT EXISTS (
SELECT 1
FROM usertogroup utg2
WHERE utg2.gid NOT IN ('a','c')
AND utg2.uid = u.uid
)
GROUP BY u.uid, u.otherstuff
HAVING count(*) = 2 /* # of items in list to search. */
Demo:
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=f51dbd8a6013d9ec94120cf1ec512735
http://sqlfiddle.com/#!9/96e2a7/1/0
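One possible way to pass both the group list and its size is with bound parameters. The sketch below runs the same NOT EXISTS plus HAVING idea in Python's sqlite3 (not MySQL), using the sample rows from the question. The `otherstuff` values and the `wanted` variable are made up for illustration, and the join to `grp` is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE usr (uid INTEGER PRIMARY KEY, otherstuff TEXT);
    CREATE TABLE usertogroup (uid INTEGER, gid TEXT);
    INSERT INTO usr VALUES (1, 'x'), (2, 'y'), (3, 'z');
    INSERT INTO usertogroup VALUES
        (1, 'a'), (1, 'b'), (1, 'c'),
        (2, 'a'), (2, 'c'),
        (3, 'b'), (3, 'c');
""")

wanted = ["a", "c"]                       # groups the user must be in, and only these
marks = ", ".join("?" for _ in wanted)    # one placeholder per group

query = f"""
    SELECT u.uid, u.otherstuff
    FROM usr u
    JOIN usertogroup utg ON u.uid = utg.uid AND utg.gid IN ({marks})
    WHERE NOT EXISTS (
        SELECT 1 FROM usertogroup utg2
        WHERE utg2.uid = u.uid AND utg2.gid NOT IN ({marks})
    )
    GROUP BY u.uid, u.otherstuff
    HAVING COUNT(*) = ?
"""
# The list is bound twice (IN and NOT IN), followed by its length for HAVING.
params = wanted + wanted + [len(wanted)]
rows = conn.execute(query, params).fetchall()
print(rows)
```

Only placeholders are interpolated into the SQL text; the group values themselves are always bound.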

Use:
SELECT *
FROM UserToGroup
WHERE `user` IN (
SELECT `user`
FROM UserToGroup
GROUP BY `user`
HAVING count(distinct `group`)
= count(distinct CASE WHEN `group` IN ('a','c') THEN `group` END )
)
Demo: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=19bfcfa320ca5e2afbf48a1bbb09a1a1
I need distinct users. I have reduced the columns of user to simplify the example.
To get data from Users use this query:
SELECT *
FROM Users
WHERE `user` IN (
SELECT `user`
FROM UserToGroup
GROUP BY `user`
HAVING count(distinct `group`)
= count(distinct CASE WHEN `group` IN ('a','c') THEN `group` END )
)
Demo: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=7a75d024ce0c871bca6ff3062ca1bc0f
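A small sketch of the COUNT(DISTINCT ...) comparison, run in Python's sqlite3 instead of MySQL (so identifiers are quoted with double quotes, not backticks). User 4 is a hypothetical extra row, added only to show that the comparison by itself also accepts a user who is in just one of the listed groups; if that matters, an extra count check (shown as `strict`) is one way to tighten it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE UserToGroup ("user" INTEGER, "group" TEXT);
    INSERT INTO UserToGroup VALUES
        (1, 'a'), (1, 'b'), (1, 'c'),
        (2, 'a'), (2, 'c'),
        (3, 'b'), (3, 'c'),
        (4, 'a');   -- hypothetical user, in only one of the wanted groups
""")

# Users whose groups all fall inside ('a', 'c').
base = """
    SELECT "user"
    FROM UserToGroup
    GROUP BY "user"
    HAVING COUNT(DISTINCT "group")
         = COUNT(DISTINCT CASE WHEN "group" IN ('a', 'c') THEN "group" END)
"""
loose = [r[0] for r in conn.execute(base + ' ORDER BY "user"')]

# Additionally require that both listed groups are present.
strict = [r[0] for r in conn.execute(
    base + ' AND COUNT(DISTINCT "group") = 2 ORDER BY "user"')]

print(loose, strict)
```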

If you want users that are only in A and C, then you can do this:
SELECT
USER_ID
FROM
UserToGroup
GROUP BY
USER_ID
HAVING
MIN(USER_GROUP) = 'a' AND MAX(USER_GROUP) = 'c'
If they are in A and C and D, then that query won't work for you.
You could also try
SELECT
USER_ID
FROM
UserToGroup
WHERE USER_GROUP = 'a'
AND USER_ID IN (SELECT USER_ID FROM UserToGroup WHERE USER_GROUP = 'c')
This query gets everyone that is in 'a', and also looks in a subquery against the same table for users that are in 'c'.

You can try conditional aggregation in a HAVING clause, that checks for the counts of 'a' and 'c' to be exactly 1 and all others to be 0.
Assuming the column names of usertogroup are user and group.
SELECT ug1.user,
ug1.group
FROM usertogroup ug1
INNER JOIN (SELECT ug2.user
FROM usertogroup ug2
GROUP BY ug2.user
HAVING count(CASE
WHEN ug2.group = 'a' THEN
1
END) = 1
AND count(CASE
WHEN ug2.group = 'c' THEN
1
END) = 1
AND count(CASE
WHEN ug2.group NOT IN ('a', 'c') THEN
1
END) = 0) x
ON x.user = ug1.user;

I hope this is what you want :
SELECT
CASE WHEN GROUPS IN ('a','c')
THEN
USERS||'-'||GROUPS
ELSE
'USER IN SOME OTHER GROUP'
END
FROM TABLE order by Groups ;

First get the users that belong to exactly 2 groups and then apply the condition that these groups are a and c:
select u.* from UserToGroup as u
inner join (
select userid from UserToGroup
group by userid
having count(*) = 2
) as t
on
t.userid = u.userid
and
u.groupid IN ('a', 'c')
and
not exists (
select 1
from UserToGroup as uu
where
uu.userid = t.userid
and
uu.groupid NOT IN ('a', 'c')
)

How to optimize multiple subqueries to the same data set

Imagine I have a query like the following one:
SELECT
u.ID,
( SELECT
COUNT(*)
FROM
POSTS p
WHERE
p.USER_ID = u.ID
AND p.TYPE = 1
) AS interesting_posts,
( SELECT
COUNT(*)
FROM
POSTS p
WHERE
p.USER_ID = u.ID
AND p.TYPE = 2
) AS boring_posts,
( SELECT
COUNT(*)
FROM
COMMENTS c
WHERE
c.USER_ID = u.ID
AND c.TYPE = 1
) AS interesting_comments,
( SELECT
COUNT(*)
FROM
COMMENTS c
WHERE
c.USER_ID = u.ID
AND c.TYPE = 2
) AS boring_comments
FROM
USERS u;
( Hopefully it's correct because I just came up with it and didn't test it )
where I try to calculate the number of interesting and boring posts and comments that the user has.
Now, the problem with this query is that we have 2 sequential scans on both the posts and comments table and I wonder if there is a way to avoid that?
I could probably LEFT JOIN both posts and comments to the users table and do some aggregation but it's gonna generate a lot of rows before aggregation and I am not sure if that's a good way to go.

Aggregate posts and comments and outer join them to the users table.
select
u.id as user_id,
coalesce(p.interesting, 0) as interesting_posts,
coalesce(p.boring, 0) as boring_posts,
coalesce(c.interesting, 0) as interesting_comments,
coalesce(c.boring, 0) as boring_comments
from users u
left join
(
select
user_id,
count(case when type = 1 then 1 end) as interesting,
count(case when type = 2 then 1 end) as boring
from posts
group by user_id
) p on p.user_id = u.id
left join
(
select
user_id,
count(case when type = 1 then 1 end) as interesting,
count(case when type = 2 then 1 end) as boring
from comments
group by user_id
) c on c.user_id = u.id;
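As a quick check of the aggregate-then-outer-join shape, here is a sketch in Python's sqlite3 with a few made-up rows (the data is purely illustrative). Each of posts and comments is aggregated once per user before joining, so no row multiplication happens between the two tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE posts (user_id INTEGER, type INTEGER);
    CREATE TABLE comments (user_id INTEGER, type INTEGER);
    -- Illustrative data: user 3 has no posts and no comments.
    INSERT INTO users VALUES (1), (2), (3);
    INSERT INTO posts VALUES (1, 1), (1, 1), (1, 2), (2, 2);
    INSERT INTO comments VALUES (1, 2), (2, 1), (2, 1);
""")

query = """
    SELECT u.id,
           COALESCE(p.interesting, 0), COALESCE(p.boring, 0),
           COALESCE(c.interesting, 0), COALESCE(c.boring, 0)
    FROM users u
    LEFT JOIN (SELECT user_id,
                      COUNT(CASE WHEN type = 1 THEN 1 END) AS interesting,
                      COUNT(CASE WHEN type = 2 THEN 1 END) AS boring
               FROM posts GROUP BY user_id) p ON p.user_id = u.id
    LEFT JOIN (SELECT user_id,
                      COUNT(CASE WHEN type = 1 THEN 1 END) AS interesting,
                      COUNT(CASE WHEN type = 2 THEN 1 END) AS boring
               FROM comments GROUP BY user_id) c ON c.user_id = u.id
    ORDER BY u.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

COALESCE turns the NULLs produced by the outer joins into zeros for users with no matching rows.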

Compare the results and the execution plan (here posts is scanned only once):
with c as (
select distinct
count(1) filter (where TYPE = 1) over (partition by USER_ID) interesting_posts
, count(1) filter (where TYPE = 2) over (partition by USER_ID) boring_posts
, USER_ID
from POSTS
)
, p as (select USER_ID, max(interesting_posts) interesting_posts, max(boring_posts) boring_posts from c group by USER_ID)
SELECT
u.ID, interesting_posts,boring_posts
, ( SELECT
COUNT(*)
FROM
COMMENTS c
WHERE
c.USER_ID = u.ID
) AS comments
FROM
USERS u
JOIN p on p.USER_ID = u.ID

How to get Max column with group by using MS SQL?

What I want to query is "get each user's last fatal log". The statement below returns only the username and logDate fields, but I also want the rest of the row that this logDate belongs to (I mean logid and logdata):
SELECT user.username, MAX(log.logDate) FROM user
INNER JOIN log ON user.userid = log.userid
WHERE log.logtype = 'fatal'
GROUP BY user.username
My user table;
userid username
-----------------
1 robert
2 ronaldo
log table;
logid logDate logtype userid logdata
----------------------------------------------------------
1 2016-11-28 19:37:53.000 fatal 1 data
2 2016-11-28 22:37:53.000 fatal 1 data
3 2016-11-28 12:37:53.000 fatal 2 data

I would do this using CROSS APPLY (the preferred approach, with a proper index added to the Log table):
SELECT *
FROM [USER] u
CROSS apply (SELECT TOP 1 *
FROM log l
WHERE u.userid = l.userid
AND l.logtype = 'fatal'
ORDER BY l.logDate DESC) cs
If the log table is very large then create a Non Clustered Index on Log table to improve the performance
CREATE NONCLUSTERED INDEX NIX_Log_logtype_userid
ON [log] (logtype,userid)
INCLUDE (logid,logDate,logdata)
Another approach using ROW_NUMBER
SELECT *
FROM (SELECT *,
Row_number()OVER(partition BY [USER].username ORDER BY log.logDate DESC) AS rn
FROM [USER]
INNER JOIN log
ON [USER].userid = log.userid
WHERE log.logtype = 'fatal') A
WHERE rn = 1
Another approach using ROW_NUMBER and TOP 1 with ties
SELECT TOP 1 WITH ties *
FROM [USER]
INNER JOIN log
ON [USER].userid = log.userid
WHERE log.logtype = 'fatal'
ORDER BY Row_number()OVER(partition BY [USER].username ORDER BY log.logDate DESC)
Note: all of these queries return every column from both tables; select only the columns you need.

You can use ROW_NUMBER for this:
SELECT t.username,
t.logid, t.logtype, t.logDate, t.logdata
FROM (
SELECT u.username,
l.logid, l.logtype, l.logDate, l.logdata,
ROW_NUMBER() OVER (PARTITION BY u.username
ORDER BY l.logDate DESC) AS rn
FROM [user] u
INNER JOIN log l ON u.userid = l.userid
WHERE l.logtype = 'fatal') AS t
WHERE t.rn = 1
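The ROW_NUMBER approach can be tried against the sample rows from the question with Python's sqlite3. This is only a sketch: it assumes a bundled SQLite of 3.25 or later (needed for window functions), and the user table is named `users` here to sidestep the reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userid INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE log (logid INTEGER PRIMARY KEY, logDate TEXT,
                      logtype TEXT, userid INTEGER, logdata TEXT);
    INSERT INTO users VALUES (1, 'robert'), (2, 'ronaldo');
    INSERT INTO log VALUES
        (1, '2016-11-28 19:37:53.000', 'fatal', 1, 'data'),
        (2, '2016-11-28 22:37:53.000', 'fatal', 1, 'data'),
        (3, '2016-11-28 12:37:53.000', 'fatal', 2, 'data');
""")

query = """
    SELECT t.username, t.logid, t.logDate
    FROM (SELECT u.username, l.logid, l.logDate,
                 -- number each user's fatal rows, newest first
                 ROW_NUMBER() OVER (PARTITION BY u.username
                                    ORDER BY l.logDate DESC) AS rn
          FROM users u
          JOIN log l ON l.userid = u.userid
          WHERE l.logtype = 'fatal') AS t
    WHERE t.rn = 1
    ORDER BY t.username
"""
latest = conn.execute(query).fetchall()
for row in latest:
    print(row)
```

The timestamps are stored as ISO-style text, which sorts correctly as strings in this sketch.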

A quick option would be to get the max logdate in a subquery. This way, you can select any fields you need from the user table and don't have to aggregate in the outer query. The only issue with this one is that your logdate needs to not have duplicates. If it's a datetime then this isn't likely but you may have duplicates if it's just a date field. Worth checking.
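-- log columns come from the log table, joined back to each user's latest fatal logdate
SELECT
u.username
,lg.logdate
,lg.logid
,lg.logdata
FROM [user] u
INNER JOIN log lg
ON lg.userid = u.userid
AND lg.logtype = 'fatal'
INNER JOIN (SELECT
userid
,MAX(logdate) MaxLog
FROM log
WHERE logtype = 'fatal'
GROUP BY userid) l
ON lg.userid = l.userid
AND lg.logdate = l.MaxLog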
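The duplicate-date caveat is easy to see with a small experiment. The sketch below (Python's sqlite3, hypothetical date-only rows, user table named `users`) joins the log table back to each user's maximum logdate; because robert has two fatal rows on his latest day, both come back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userid INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE log (logid INTEGER PRIMARY KEY, logdate TEXT,
                      logtype TEXT, userid INTEGER);
    INSERT INTO users VALUES (1, 'robert'), (2, 'ronaldo');
    -- Hypothetical date-only values: robert has two fatal rows on his latest day.
    INSERT INTO log VALUES
        (1, '2016-11-27', 'fatal', 1),
        (2, '2016-11-28', 'fatal', 1),
        (3, '2016-11-28', 'fatal', 1),
        (4, '2016-11-28', 'fatal', 2);
""")

query = """
    SELECT u.username, lg.logid
    FROM users u
    JOIN log lg ON lg.userid = u.userid AND lg.logtype = 'fatal'
    JOIN (SELECT userid, MAX(logdate) AS MaxLog
          FROM log WHERE logtype = 'fatal'
          GROUP BY userid) m
      ON m.userid = lg.userid AND m.MaxLog = lg.logdate
    ORDER BY u.username, lg.logid
"""
rows = conn.execute(query).fetchall()
print(rows)
```

If exactly one row per user is required, a tie-breaker such as the highest logid would have to be added.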

WITH MaxLogDate AS (
SELECT user.userid, MAX(log.logDate) logDate FROM user
INNER JOIN log ON user.userid = log.userid
WHERE log.logtype = 'fatal'
GROUP BY user.userid
)
SELECT log.logid, log.logDate, log.logtype, u.userid, u.username
FROM user u
JOIN MaxLogDate m ON u.userid = m.userid
JOIN log ON log.logDate = m.logDate AND log.userid = m.userid
WHERE log.logtype = 'fatal' --This line is optional, may increase the performance.

Remove grouped data set when total of count is zero with subquery

I'm generating a data set that looks like this
category user total
1 jonesa 0
2 jonesa 0
3 jonesa 0
1 smithb 0
2 smithb 0
3 smithb 5
1 brownc 2
2 brownc 3
3 brownc 4
Where a particular user has 0 records in all categories, is it possible to remove their rows from the set? If a user has some activity, like smithb does, I'd like to keep all of their records, even the zero rows. I'm not sure how to go about that. I thought a CASE statement might be of some help, but this is pretty complicated for me. Here is my query:
SELECT DISTINCT c.category,
u.user_name,
CASE WHEN (
SELECT COUNT(e.entry_id)
FROM category c1
INNER JOIN entry e1
ON c1.category_id = e1.category_id
WHERE c1.category_id = c.category_id
AND e.user_name = u.user_name
AND e1.entered_date >= TO_DATE ('20140625','YYYYMMDD')
AND e1.entered_date <= TO_DATE ('20140731', 'YYYYMMDD')) > 0 -- I know this won't work
THEN 'Yes'
ELSE NULL
END AS TOTAL
FROM user u
INNER JOIN role r
ON u.id = r.user_id
AND r.id IN (1,2),
category c
LEFT JOIN entry e
ON c.category_id = e.category_id
WHERE c.category_id NOT IN (19,20)
I realise the case statement won't work, but it was an attempt on how this might be possible. I'm really not sure if it's possible or the best direction. Appreciate any guidance.

Try this:
delete from t1
where user in (
select user
from t1
group by user
having count(distinct category) = sum(case when total=0 then 1 else 0 end) )
The subquery finds all the users that fit your removal requirement.
count(distinct category) gets how many categories a user has.
sum(case when total=0 then 1 else 0 end) gets how many of the user's rows have no activity.
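Here is a minimal sketch of that HAVING test against the data set from the question, using Python's sqlite3. It assumes the result set has been materialised into a table called `t1` with one row per user and category.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (category INTEGER, user TEXT, total INTEGER);
    INSERT INTO t1 VALUES
        (1, 'jonesa', 0), (2, 'jonesa', 0), (3, 'jonesa', 0),
        (1, 'smithb', 0), (2, 'smithb', 0), (3, 'smithb', 5),
        (1, 'brownc', 2), (2, 'brownc', 3), (3, 'brownc', 4);
""")

# Users for whom every category row has total = 0.
all_zero = [r[0] for r in conn.execute("""
    SELECT user FROM t1
    GROUP BY user
    HAVING COUNT(DISTINCT category) = SUM(CASE WHEN total = 0 THEN 1 ELSE 0 END)
""")]

# Remove exactly those users; users with any activity keep their zero rows.
conn.execute("""
    DELETE FROM t1 WHERE user IN (
        SELECT user FROM t1 GROUP BY user
        HAVING COUNT(DISTINCT category) = SUM(CASE WHEN total = 0 THEN 1 ELSE 0 END))
""")
remaining = sorted({r[0] for r in conn.execute("SELECT user FROM t1")})
row_count = conn.execute("SELECT COUNT(*) FROM t1").fetchone()[0]
print(all_zero, remaining, row_count)
```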

There are a number of ways to do this, but the less verbose the SQL is, the harder it may be for you to follow along with the logic. For that reason, I think that using multiple Common Table Expressions will avoid the need to use redundant joins, while being the most readable.
-- assuming user_name and category_name are unique on [user] and [category] respectively.
WITH valid_categories (category_id, category_name) AS
(
-- get set of valid categories
SELECT c.category_id, c.category AS category_name
FROM category c
WHERE c.category_id NOT IN (19,20)
),
valid_users ([user_name]) AS
(
-- get set of users who belong to valid roles
SELECT u.[user_name]
FROM [user] u
WHERE EXISTS (
SELECT *
FROM [role] r
WHERE u.id = r.[user_id] AND r.id IN (1,2)
)
),
valid_entries (entry_id, [user_name], category_id, entry_count) AS
(
-- provides a flag of 1 for easier aggregation
SELECT e.[entry_id], e.[user_name], e.category_id, CAST( 1 AS INT) AS entry_count
FROM [entry] e
WHERE e.entered_date BETWEEN TO_DATE('20140625','YYYYMMDD') AND TO_DATE('20140731', 'YYYYMMDD')
-- determines if entry is within date range
),
user_categories ([user_name], category_id, category_name) AS
( SELECT u.[user_name], c.category_id, c.category_name
FROM valid_users u
-- get the cartesian product of users and categories
CROSS JOIN valid_categories c
-- get only users with a valid entry
WHERE EXISTS (
SELECT *
FROM valid_entries e
WHERE e.[user_name] = u.[user_name]
)
)
/*
You can use these for testing.
SELECT COUNT(*) AS valid_categories_count
FROM valid_categories
SELECT COUNT(*) AS valid_users_count
FROM valid_users
SELECT COUNT(*) AS valid_entries_count
FROM valid_entries
SELECT COUNT(*) AS users_with_entries_count
FROM valid_users u
WHERE EXISTS (
SELECT *
FROM user_categories uc
WHERE uc.user_name = u.user_name
)
SELECT COUNT(*) AS users_without_entries_count
FROM valid_users u
WHERE NOT EXISTS (
SELECT *
FROM user_categories uc
WHERE uc.user_name = u.user_name
)
SELECT uc.[user_name], uc.[category_name], e.[entry_count]
FROM user_categories uc
INNER JOIN valid_entries e ON (uc.[user_name] = e.[user_name] AND uc.[category_id] = e.[category_id])
*/
-- Finally, the results:
SELECT uc.[user_name], uc.[category_name], SUM(NVL(e.[entry_count],0)) AS [entry_count]
FROM user_categories uc
LEFT OUTER JOIN valid_entries e ON (uc.[user_name] = e.[user_name] AND uc.[category_id] = e.[category_id])
GROUP BY uc.[user_name], uc.[category_name]

Here's another method:
WITH totals AS (
SELECT
c.category,
u.user_name,
COUNT(e.entry_id) AS total,
SUM(COUNT(e.entry_id)) OVER (PARTITION BY u.user_name) AS user_total
FROM
user u
INNER JOIN
role r ON u.id = r.user_id
CROSS JOIN
category c
LEFT JOIN
entry e ON c.category_id = e.category_id
AND u.user_name = e.user_name
AND e.entered_date >= TO_DATE ('20140625', 'YYYYMMDD')
AND e.entered_date <= TO_DATE ('20140731', 'YYYYMMDD')
WHERE
r.id IN (1, 2)
AND c.category_id NOT IN (19, 20)
GROUP BY
c.category,
u.user_name
)
SELECT
category,
user_name,
total
FROM
totals
WHERE
user_total > 0
;
The totals derived table calculates the totals per user and category as well as totals across all categories per user (using SUM() OVER ...). The main query returns only rows where the user total is greater than zero.

Rows not being returned in stored proc

I have two rows in dbo.Members but my stored proc is not returning a count. I can run the query alone like SELECT COUNT(*) FROM dbo.Members WHERE MemberID = 1234 and it returns the count as 2 which is correct.
Why does it not return the rows in my stored proc?
SELECT
ValidCount,
InvalidCount,
(SELECT COUNT(*) FROM dbo.Members WHERE MemberID = @pMemberID) AS 'TotalMembers'
FROM
dbo.Reporting
WHERE
MemberID = @pMemberID

Probably because you don't have entries in Reporting with MemberId = 1234.
Try this:
SELECT COALESCE(validCount, 0) AS validCount,
COALESCE(invalidCount, 0) AS invalidCount,
(
SELECT COUNT(*)
FROM members m
WHERE m.memberId = p.memberId
) AS totalMembers
FROM (
SELECT @pMemberId AS memberId
) p
LEFT JOIN
reporting r
ON r.memberId = p.memberId
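To illustrate why the original query returns nothing, here is a sketch in Python's sqlite3 with two Members rows and an empty Reporting table (an assumed state, matching the likely cause given above). The first query drives from Reporting and so yields no rows; the second drives from a one-row derived table and always yields exactly one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Members (MemberID INTEGER);
    CREATE TABLE Reporting (MemberID INTEGER, ValidCount INTEGER, InvalidCount INTEGER);
    -- Two member rows, but (by assumption) no Reporting row for this member.
    INSERT INTO Members VALUES (1234), (1234);
""")

params = {"member_id": 1234}

# Original shape: no Reporting row means no output row, so no count either.
inner = conn.execute("""
    SELECT ValidCount, InvalidCount,
           (SELECT COUNT(*) FROM Members WHERE MemberID = :member_id)
    FROM Reporting
    WHERE MemberID = :member_id
""", params).fetchall()

# Suggested shape: start from the parameter, then LEFT JOIN Reporting.
outer = conn.execute("""
    SELECT COALESCE(r.ValidCount, 0), COALESCE(r.InvalidCount, 0),
           (SELECT COUNT(*) FROM Members m WHERE m.MemberID = p.MemberID)
    FROM (SELECT :member_id AS MemberID) p
    LEFT JOIN Reporting r ON r.MemberID = p.MemberID
""", params).fetchall()

print(inner, outer)
```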
