Group By with Where - sql

I want to do a group by on a dataset with a where clause based upon a datetime, but I need to return a count of 0 for any users in the Account table that do not meet the where date requirement. Here is my SQL statement:
select a.userid, count(c.codeentryid)
from [account] a
left join codesentered c
on a.userid = c.userid
where a.camp = 0 and c.entrydate > '2013-12-03 00:00:00'
group by a.userid
order by a.userid
Currently I get counts for all the users who meet the entrydate requirement, but how would I also return the users who don't meet this requirement with a count of 0?

You can include the condition in the join. Since it is a left outer join, it will always show all records from account, and only those of codesentered which match the condition:
select a.userid, count(c.codeentryid)
from [account] a
left outer join codesentered c
on a.userid = c.userid
/* here */ and c.entrydate > '2013-12-03 00:00:00'
where a.camp = 0
group by a.userid
order by a.userid

When you are using a left join, all conditions on the second table should go into the on clause. Otherwise, the outer join becomes an inner join. So, try this:
select a.userid, count(c.codeentryid)
from [account] a left join
codesentered c
on a.userid = c.userid and c.entrydate > '2013-12-03 00:00:00'
where a.camp = 0
group by a.userid
order by a.userid;
Conditions on the first table in the on clause do not filter out rows of the first table. A left join returns all rows from the first table, even when the on clause evaluates to false or NULL; such a condition only decides whether rows from the second table are matched.
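The difference between filtering in WHERE and filtering in ON can be checked on a toy dataset. This is a minimal sketch, not the asker's real schema: it uses Python's sqlite3 module as a stand-in for SQL Server, and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory database with made-up sample rows (assumption, not real data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (userid INTEGER PRIMARY KEY, camp INTEGER);
    CREATE TABLE codesentered (
        codeentryid INTEGER PRIMARY KEY, userid INTEGER, entrydate TEXT);
    INSERT INTO account VALUES (1, 0), (2, 0), (3, 1);
    INSERT INTO codesentered VALUES
        (10, 1, '2013-12-05 10:00:00'),
        (11, 1, '2013-11-01 09:00:00'),
        (12, 2, '2013-11-20 08:00:00');
""")

# Date filter in WHERE: users without a qualifying entry are dropped.
where_rows = conn.execute("""
    SELECT a.userid, COUNT(c.codeentryid)
    FROM account a
    LEFT JOIN codesentered c ON a.userid = c.userid
    WHERE a.camp = 0 AND c.entrydate > '2013-12-03 00:00:00'
    GROUP BY a.userid
    ORDER BY a.userid
""").fetchall()

# Date filter in ON: every account with camp = 0 is kept, unmatched ones count 0.
on_rows = conn.execute("""
    SELECT a.userid, COUNT(c.codeentryid)
    FROM account a
    LEFT JOIN codesentered c
        ON a.userid = c.userid AND c.entrydate > '2013-12-03 00:00:00'
    WHERE a.camp = 0
    GROUP BY a.userid
    ORDER BY a.userid
""").fetchall()

print(where_rows)
print(on_rows)
```

With this sample data, user 2 appears only in the second result, with a count of 0.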

Something like this maybe. It would be easier to test with some sample data.
select a.userid, SUM(CASE WHEN c.entrydate > '2013-12-03 00:00:00' THEN 1 ELSE 0 END)
from [account] a
left join codesentered c
on a.userid = c.userid
where a.camp = 0
group by a.userid
order by a.userid
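Since the answer above asks for sample data, here is a small sketch of the SUM(CASE ...) variant on made-up rows. It assumes SQLite (through Python's sqlite3 module) behaves like SQL Server for this query; user 3 has no codes at all, so its joined row is NULL and falls into the ELSE branch.

```python
import sqlite3

# Made-up sample rows; user 3 has no entries in codesentered.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (userid INTEGER PRIMARY KEY, camp INTEGER);
    CREATE TABLE codesentered (
        codeentryid INTEGER PRIMARY KEY, userid INTEGER, entrydate TEXT);
    INSERT INTO account VALUES (1, 0), (2, 0), (3, 0);
    INSERT INTO codesentered VALUES
        (10, 1, '2013-12-05 10:00:00'),
        (11, 1, '2013-11-01 09:00:00'),
        (12, 2, '2013-11-20 08:00:00');
""")

# The join keeps every account; the CASE decides which rows add 1 to the sum.
sum_rows = conn.execute("""
    SELECT a.userid,
           SUM(CASE WHEN c.entrydate > '2013-12-03 00:00:00' THEN 1 ELSE 0 END)
    FROM account a
    LEFT JOIN codesentered c ON a.userid = c.userid
    WHERE a.camp = 0
    GROUP BY a.userid
    ORDER BY a.userid
""").fetchall()

print(sum_rows)
```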

Related

Slow query when using NOT EXIST in Query

I would like to seek some help regarding the query below.
Running this script causes the system to time out. The query is so slow that it took 5 minutes to run for just 22 records. I believe this has something to do with the NOT IN clause. I have already looked for answers here on Stack Overflow, and some suggest using LEFT OUTER JOIN or WHERE NOT EXISTS, but I can't seem to incorporate either into this query.
SELECT a.UserId, COUNT(DISTINCT(a.CustomerId)) AS TotalUniqueContact
FROM [UserActivityLog] a WITH(NOLOCK)
WHERE CAST(a.ActivityDatetime AS DATE) BETWEEN '2015-09-28' AND '2015-09-30' AND a.ID
NOT IN (
SELECT DISTINCT(COALESCE(a.activitylogid, 0))
FROM [CustomerNoteInteractions] a WITH(NOLOCK)
WHERE a.reason IN ('20', '36') AND CAST(a.datecreated AS DATE) BETWEEN '2015-09-28' AND '2015-09-30' AND a.UserId IN (SELECT b.Id
FROM [User] b
WHERE b.UserType = 'EpicUser' AND b.IsEpicEmployee = 1 AND b.IsActive = 1)
)
AND a.UserId IN (
SELECT b.Id
FROM [User] b
WHERE b.UserType = 'EpicUser' AND b.IsEpicEmployee = 1 AND b.IsActive = 1)
GROUP BY a.UserId
Here is what should be an equivalent query using EXISTS and NOT EXISTS:
SELECT a.UserId,
COUNT(DISTINCT a.CustomerId) AS TotalUniqueContact
FROM [UserActivityLog] a WITH(NOLOCK)
WHERE CAST(a.ActivityDatetime AS DATE) BETWEEN '2015-09-28' AND '2015-09-30'
AND EXISTS (SELECT *
FROM [User] b
WHERE b.Id = a.UserId
AND b.UserType = 'EpicUser'
AND b.IsEpicEmployee = 1
AND b.IsActive = 1)
AND NOT EXISTS (SELECT *
FROM [CustomerNoteInteractions] b WITH(NOLOCK)
JOIN [User] c
ON c.Id = b.UserId
AND c.UserType = 'EpicUser'
AND c.IsEpicEmployee = 1
AND c.IsActive = 1
WHERE b.activitylogid = a.ID
AND b.reason IN ('20', '36')
AND CAST(b.datecreated AS DATE) BETWEEN '2015-09-28' AND '2015-09-30' )
GROUP BY a.UserId
Obviously, it's hard to understand what will truly help your performance without understanding your data. But here is what I expect:
I think the EXISTS/NOT EXISTS version of the query will help.
I think your conditions on UserActivityLog.ActivityDateTime and CustomerNoteInteractions.datecreated are a problem. Why are you casting? Is it not a date type? If not, why not? You would probably get big gains if you could take advantage of an index on those columns. But with the cast, I don't think you can use an index there. Can you do something about it?
You'll also probably benefit from indexes on User.Id (probably the PK anyways), and CustomerNoteInteractions.ActivityLogId.
Also, not a big fan of using with (nolock) to improve performance (Bad habits : Putting NOLOCK everywhere).
EDIT
If your date columns are of type DateTime as you mention in the comments, and so you are using the CAST to eliminate the time portion, a much better alternative for performance is to not cast, but instead modify the way you filter the column. Doing this will allow you to take advantage of any index on the date column. It could make a very big difference.
The query could then be further improved like this:
SELECT a.UserId,
COUNT(DISTINCT a.CustomerId) AS TotalUniqueContact
FROM [UserActivityLog] a WITH(NOLOCK)
WHERE a.ActivityDatetime >= '2015-09-28'
AND a.ActivityDatetime < dateadd(day, 1, '2015-09-30')
AND EXISTS (SELECT *
FROM [User] b
WHERE b.Id = a.UserId
AND b.UserType = 'EpicUser'
AND b.IsEpicEmployee = 1
AND b.IsActive = 1)
AND NOT EXISTS (SELECT *
FROM [CustomerNoteInteractions] b WITH(NOLOCK)
JOIN [User] c
ON c.Id = b.UserId
AND c.UserType = 'EpicUser'
AND c.IsEpicEmployee = 1
AND c.IsActive = 1
WHERE b.activitylogid = a.ID
AND b.reason IN ('20', '36')
AND b.datecreated >= '2015-09-28'
AND b.datecreated < dateadd(day, 1, '2015-09-30'))
GROUP BY a.UserId
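To check that the half-open range selects the same rows as the CAST ... BETWEEN form, here is a minimal sketch on made-up boundary values. It uses SQLite through Python's sqlite3 module as a stand-in, so date() and the '+1 day' modifier replace CAST(... AS DATE) and dateadd; the index name is hypothetical, and SQL Server's optimizer may behave differently from SQLite's.

```python
import sqlite3

# Made-up rows sitting exactly on the boundaries of the date range.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE UserActivityLog (ID INTEGER PRIMARY KEY, ActivityDatetime TEXT);
    CREATE INDEX ix_log_dt ON UserActivityLog (ActivityDatetime);
    INSERT INTO UserActivityLog VALUES
        (1, '2015-09-27 23:59:59'),
        (2, '2015-09-28 00:00:00'),
        (3, '2015-09-30 23:59:59'),
        (4, '2015-10-01 00:00:00');
""")

# Function wrapped around the column (the analogue of CAST(... AS DATE)).
cast_sql = """
    SELECT ID FROM UserActivityLog
    WHERE date(ActivityDatetime) BETWEEN '2015-09-28' AND '2015-09-30'
    ORDER BY ID
"""

# Bare column compared against a half-open range [start, end + 1 day).
range_sql = """
    SELECT ID FROM UserActivityLog
    WHERE ActivityDatetime >= '2015-09-28'
      AND ActivityDatetime < date('2015-09-30', '+1 day')
    ORDER BY ID
"""

cast_rows = conn.execute(cast_sql).fetchall()
range_rows = conn.execute(range_sql).fetchall()
print(cast_rows, range_rows)

# Print the query plans for inspection; the exact text varies by SQLite version.
for label, sql in (("cast", cast_sql), ("range", range_sql)):
    print(label, conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall())
```

Both forms return the same rows here; only the second leaves the column bare, which is what lets an index on it be used for a range seek.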
This should get you pretty close, or work exactly:
SELECT a.UserId, COUNT(DISTINCT(a.CustomerId)) AS TotalUniqueContact
FROM [UserActivityLog] a WITH(NOLOCK)
inner join [User] b with (Nolock) on a.userid = b.id
and b.UserType = 'EpicUser' AND b.IsEpicEmployee = 1 AND b.IsActive = 1
left outer join [CustomerNoteInteractions] c with (nolock) on a.id = c.activitylogid
and c.reason IN ('20', '36') AND CAST(c.datecreated AS DATE) BETWEEN '2015-09-28' AND '2015-09-30'
left outer join [User] d with (nolock) on c.userid = d.id
and d.UserType = 'EpicUser' AND d.IsEpicEmployee = 1 AND d.IsActive = 1
WHERE CAST(a.ActivityDatetime AS DATE) BETWEEN '2015-09-28' AND '2015-09-30'
and c.activitylogid is null
GROUP BY a.UserId

Count based on subset of data

I have a join to a table and I want to include all users who have a record after a certain date, but to only include records after another date in the count.
Here is my SQL :
select a.userid, count(ce.codeentryid)
from [account] a
inner join [profile] p
on a.userid = p.userid
inner join codesentered ce
on a.userid = ce.userid and ce.entrydate > '2011-01-01 00:00:00'
where a.camp = 0
group by a.userid
order by a.userid
So here I want to view a list of all users who have entered a code after 1st Jan 2011, but to only include in the count codes entered after 1st Jan 2013. How would I do this?
EDIT : So this would give me all users who have entered a code after 01/01/2011, but only include codes entered after 01/01/2013 in the count?
select a.userid, count(CASE WHEN ce.entrydate > '2013-01-01 00:00:00' THEN 1 ELSE 0 END)
from [account] a
inner join [profile] p
on a.userid = p.userid
inner join codesentered ce
on a.userid = ce.userid and ce.entrydate > '2011-01-01 00:00:00'
where a.camp = 0
group by a.userid
order by a.userid
Remove the date condition from the ON clause, and use this in the SELECT clause instead of COUNT(ce.codeentryid):
SUM(CASE WHEN ce.entrydate > '2011-01-01 00:00:00' THEN 1 ELSE 0 END)
Your question doesn't make sense, because using two dates is redundant, unless you want users whose first code entry is after 2011-01-01 and then want to count only what happens after 2013-01-01.
If that is what you want, then use a having clause:
select a.userid, sum(CASE WHEN ce.entrydate > '2013-01-01 00:00:00' THEN 1 ELSE 0 END)
from [account] a inner join
[profile] p
on a.userid = p.userid inner join
codesentered ce
on a.userid = ce.userid
where a.camp = 0
group by a.userid
having min(ce.entrydate) > '2011-01-01 00:00:00'
order by a.userid;
Note that count(CASE WHEN ce.entrydate > '2013-01-01 00:00:00' THEN 1 ELSE 0 END) is the same as count(*). count() counts non-null values. Use sum() instead.
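The point about count() and sum() can be verified on a few made-up rows. This sketch assumes SQLite (through Python's sqlite3 module) as a stand-in for SQL Server; the count_no_else column is an extra variant, not taken from the answers, that works because a CASE without ELSE yields NULL.

```python
import sqlite3

# Made-up entries: user 1 has one code before 2013 and one after, user 2 one before.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE codesentered (
        codeentryid INTEGER PRIMARY KEY, userid INTEGER, entrydate TEXT);
    INSERT INTO codesentered VALUES
        (1, 1, '2011-06-01 00:00:00'),
        (2, 1, '2013-05-01 00:00:00'),
        (3, 2, '2012-03-01 00:00:00');
""")

rows = conn.execute("""
    SELECT userid,
           COUNT(*) AS all_rows,
           -- ELSE 0 is non-NULL, so every row is counted
           COUNT(CASE WHEN entrydate > '2013-01-01 00:00:00' THEN 1 ELSE 0 END)
               AS count_else_0,
           -- no ELSE: non-matching rows become NULL and are skipped
           COUNT(CASE WHEN entrydate > '2013-01-01 00:00:00' THEN 1 END)
               AS count_no_else,
           SUM(CASE WHEN entrydate > '2013-01-01 00:00:00' THEN 1 ELSE 0 END)
               AS sum_case
    FROM codesentered
    WHERE entrydate > '2011-01-01 00:00:00'
    GROUP BY userid
    ORDER BY userid
""").fetchall()

print(rows)
```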

Display including Zero using SQL count(*) and group by

For this query I want to display records with a count of zero using SQL COUNT(*) and GROUP BY. Below is my SQL query:
SELECT B.BranchName as Filter,
Coalesce(COUNT(*), '0') AS NoofSplits,
SUM(ls.Amount) AS TotalLoanValue
FROM dbo.tblBranch B
LEFT OUTER JOIN dbo.tblLoan L ON L.BranchID = B.BranchID
LEFT OUTER JOIN dbo.tblLoanSplit LS ON L.LoanID = LS.LoanID
WHERE LS.DateSettlement BETWEEN @StartDate AND @EndDate
GROUP BY B.BranchName
ORDER BY B.BranchName
Don't use COUNT(*) with an outer join: it returns 1 even if there was no row to join. Instead, COUNT a column of the outer-joined table (here LS.LoanID), which is NULL when there is no match. The date filter on LS also has to move from the WHERE clause into the ON clause; otherwise the rows without a match are filtered out and the outer join behaves like an inner join:
SELECT B.BranchName as Filter,
COUNT(LS.LoanID) AS NoofSplits,
SUM(ls.Amount) AS TotalLoanValue
FROM dbo.tblBranch B
LEFT OUTER JOIN dbo.tblLoan L ON L.BranchID = B.BranchID
LEFT OUTER JOIN dbo.tblLoanSplit LS ON L.LoanID = LS.LoanID
AND LS.DateSettlement BETWEEN @StartDate AND @EndDate
GROUP BY B.BranchName
ORDER BY B.BranchName
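A small sketch makes the COUNT(*) versus COUNT(column) difference visible. It assumes SQLite through Python's sqlite3 module as a stand-in for SQL Server, drops the dbo schema prefix, uses made-up branches and amounts, and places the date filter in the ON clause so that the branch without loans survives the join.

```python
import sqlite3

# Made-up data: 'North' has one loan with two splits, 'South' has no loans.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblBranch (BranchID INTEGER PRIMARY KEY, BranchName TEXT);
    CREATE TABLE tblLoan (LoanID INTEGER PRIMARY KEY, BranchID INTEGER);
    CREATE TABLE tblLoanSplit (
        LoanID INTEGER, Amount INTEGER, DateSettlement TEXT);
    INSERT INTO tblBranch VALUES (1, 'North'), (2, 'South');
    INSERT INTO tblLoan VALUES (100, 1);
    INSERT INTO tblLoanSplit VALUES
        (100, 500, '2020-01-15'),
        (100, 250, '2020-01-20');
""")

rows = conn.execute("""
    SELECT B.BranchName,
           COUNT(*)          AS star_count,   -- counts the NULL-extended row too
           COUNT(LS.LoanID)  AS NoofSplits,   -- skips NULLs, so 0 when unmatched
           SUM(LS.Amount)    AS TotalLoanValue
    FROM tblBranch B
    LEFT OUTER JOIN tblLoan L ON L.BranchID = B.BranchID
    LEFT OUTER JOIN tblLoanSplit LS
        ON L.LoanID = LS.LoanID
       AND LS.DateSettlement BETWEEN ? AND ?
    GROUP BY B.BranchName
    ORDER BY B.BranchName
""", ("2020-01-01", "2020-01-31")).fetchall()

print(rows)
```

For 'South', COUNT(*) reports 1 while COUNT(LS.LoanID) reports 0, and the sum is NULL (None in Python); wrapping the sum in COALESCE would turn that into 0 if needed.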

Using a left join and checking if the row existed along with another check in where clause

I have the following tables:
Users
Banned
SELECT u.*
FROM Users u
WHERE u.isActive = 1
AND
u.status <> 'disabled'
I don't want to include any rows where the user may also be in the Banned table.
What's the best way to do this?
I could put a subquery in the where clause, so it does something like:
u.status <> 'disabled' and not exists (SELECT 1 FROM Banned where userId = @userId)
I think the best way would be to do a LEFT JOIN; how could I do that?
According to this answer, in SQL Server NOT EXISTS is more efficient than LEFT JOIN / IS NULL:
SELECT *
FROM Users u
WHERE u.IsActive = 1
AND u.Status <> 'disabled'
AND NOT EXISTS (SELECT 1 FROM Banned b WHERE b.UserID = u.UserID)
EDIT
For the sake of completeness this is how I would do it with a LEFT JOIN:
SELECT *
FROM Users u
LEFT JOIN Banned b
ON b.UserID = u.UserID
WHERE u.IsActive = 1
AND u.Status <> 'disabled'
AND b.UserID IS NULL -- EXCLUDE ROWS WITH A MATCH IN `BANNED`
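The two forms can be compared side by side on a toy dataset. This is only a sketch of result equivalence, with made-up rows and SQLite (through Python's sqlite3 module) standing in for SQL Server; it says nothing about which form is faster on a real server.

```python
import sqlite3

# Made-up users: 1 is fine, 2 is banned, 3 is disabled, 4 is inactive.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (UserID INTEGER PRIMARY KEY, IsActive INTEGER, Status TEXT);
    CREATE TABLE Banned (UserID INTEGER);
    INSERT INTO Users VALUES
        (1, 1, 'ok'), (2, 1, 'ok'), (3, 1, 'disabled'), (4, 0, 'ok');
    INSERT INTO Banned VALUES (2);
""")

# Anti-join written with NOT EXISTS.
not_exists_rows = conn.execute("""
    SELECT u.UserID
    FROM Users u
    WHERE u.IsActive = 1
      AND u.Status <> 'disabled'
      AND NOT EXISTS (SELECT 1 FROM Banned b WHERE b.UserID = u.UserID)
    ORDER BY u.UserID
""").fetchall()

# Same anti-join written as LEFT JOIN ... IS NULL.
left_join_rows = conn.execute("""
    SELECT u.UserID
    FROM Users u
    LEFT JOIN Banned b ON b.UserID = u.UserID
    WHERE u.IsActive = 1
      AND u.Status <> 'disabled'
      AND b.UserID IS NULL
    ORDER BY u.UserID
""").fetchall()

print(not_exists_rows, left_join_rows)
```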
You would just check that the value you got from LEFT JOINing with Banned was NULL:
SELECT U.*
FROM Users U
LEFT JOIN Banned B ON B.userId = U.userId
WHERE U.isActive = 1
AND U.status <> 'disabled'
AND B.userId IS NULL -- no match in the Banned table.
select u.*
from Users u
left outer join Banned b on u.userId = b.userId
where u.isActive = 1
and u.status <> 'disabled'
and b.UserID is null
SELECT u.*
FROM Users u
LEFT JOIN Banned b ON u.userId = b.userId AND b.userRoles = 'VIP'
WHERE u.isActive = 1 AND b.id IS NULL
Use this form if the rows to exclude are determined by an extra condition on the joined table, not only by its key.

Latest entry SQL problem

I have two tables:
UserTable contains (UserID, UserName) and StoryTable contains
(StoryID, UserID(foreignkey), StoryName, InsertedDate)
How can I query to get each user name along with the latest story name that the user has posted? (I'm new to queries, so kindly excuse me if it's quite basic.)
I tried:
SELECT a.Username, b.StoryName FROM [dbo].[UserTable] as A INNER JOIN
[dbo].[StoryTable] as b ON a.UserID = b.UserID WHERE InsertedDate =
MAX(InsertedDate) group by a.UserName;
but it throws an error in SQL Server 2008.
Change your query to be this:
SELECT a.Username, b.StoryName
FROM [dbo].[UserTable] as A
INNER JOIN [dbo].[StoryTable] as b ON a.UserID = b.UserID
WHERE b.InsertedDate =
(SELECT MAX(InsertedDate) FROM [StoryTable] AS z WHERE z.UserID = A.UserID)
Edited as per comment:
SELECT a.Username, b.StoryName
FROM [dbo].[UserTable] as A
INNER JOIN [dbo].[StoryTable] as b ON a.UserID = b.UserID
WHERE b.StoryID =
(SELECT MAX(z.StoryID) FROM [StoryTable] AS z WHERE z.UserID = A.UserID)
SELECT Top 1 a.Username, b.StoryName
FROM [dbo].[UserTable] as A
INNER JOIN [dbo].[StoryTable] as b ON a.UserID = b.UserID
order by b.InsertedDate desc
MAX is an aggregate function. To filter using an aggregate function, you need to use the HAVING keyword instead of WHERE.
You can do it like this:
SELECT u.Username, s.StoryName
FROM [dbo].[UserTable] AS u
CROSS APPLY (SELECT TOP 1 StoryName
FROM [dbo].[StoryTable] AS ss
WHERE ss.UserID = u.UserID
ORDER BY ss.InsertedDate DESC
) AS s
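The approaches above can be tried on a few made-up rows. This sketch assumes SQLite through Python's sqlite3 module as a stand-in for SQL Server, so it shows the correlated MAX subquery (CROSS APPLY is not available in SQLite) and uses LIMIT 1 as the analogue of TOP 1. It also illustrates that a single TOP 1 over the whole join yields one row overall, not one row per user.

```python
import sqlite3

# Made-up users and stories; alice has two stories, bob has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE UserTable (UserID INTEGER PRIMARY KEY, UserName TEXT);
    CREATE TABLE StoryTable (
        StoryID INTEGER PRIMARY KEY,
        UserID INTEGER REFERENCES UserTable (UserID),
        StoryName TEXT,
        InsertedDate TEXT);
    INSERT INTO UserTable VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO StoryTable VALUES
        (1, 1, 'First',  '2020-01-01'),
        (2, 1, 'Second', '2020-02-01'),
        (3, 2, 'Only',   '2020-01-15');
""")

# Latest story per user: keep the row whose date equals that user's maximum.
latest_rows = conn.execute("""
    SELECT a.UserName, b.StoryName
    FROM UserTable AS a
    INNER JOIN StoryTable AS b ON a.UserID = b.UserID
    WHERE b.InsertedDate =
          (SELECT MAX(z.InsertedDate) FROM StoryTable AS z
           WHERE z.UserID = a.UserID)
    ORDER BY a.UserName
""").fetchall()

# LIMIT 1 (TOP 1 in SQL Server) over the whole join: one row in total.
top1_rows = conn.execute("""
    SELECT a.UserName, b.StoryName
    FROM UserTable AS a
    INNER JOIN StoryTable AS b ON a.UserID = b.UserID
    ORDER BY b.InsertedDate DESC
    LIMIT 1
""").fetchall()

print(latest_rows)
print(top1_rows)
```

If two stories of the same user share the maximum InsertedDate, the subquery form returns both; comparing on MAX(StoryID) instead, as in the edited answer, avoids that when StoryID grows with insertion order.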