SQL Server - Aggregate the number of errors per user per day

The application that I maintain stores user errors in a SQL Server table. When an error occurs it jots down the username of the person that caused it, the time, the error message, and some other housekeeping stuff.
I'm trying to build out a report where we could see the "top 3" errors each day for the past year - the three users with the most errors on each day, the three most common types of errors on each day, etc.
My goal is something like this:
DATE USER1 ERR_COUNT1 USER2 ERR_COUNT2 USER3 ERR_COUNT3
1/1/18 BOB 70 BILL 50 JOE 30
1/2/18 JILL 55 JOY 30 BOB 20
...
I've got a rough loop set up to pull this data from our error logs, but when I run it I get the error: There is already an object named '#TempErrorLog' in the database. Loop code below:
DECLARE @StartDate AS DATE,
@EndDate AS DATE
SET @StartDate = '2018.1.1'
WHILE @StartDate <= CONVERT(DATE, GETDATE())
BEGIN
SET @EndDate = DATEADD(DAY, 1, @StartDate)
SELECT @StartDate AS date_err,
( u.name_frst+' '+u.name_lst ) AS user_err,
COUNT(e.id_err) AS count_err
INTO dbo.#TempErrLog
FROM err_log AS e
LEFT JOIN users AS u ON e.id_user = u.id_user
WHERE e.dttm_err >= @StartDate AND
e.dttm_err < @EndDate AND
e.id_user <> 'system'
GROUP BY ( u.name_frst+' '+u.name_lst )
ORDER BY count_err DESC;
SET @StartDate = DATEADD(DAY, 1, @StartDate)
CONTINUE
END
SELECT * FROM #TempErrLog
My guess is that it is trying to create a new temporary table each time the loop iterates. Is there a better approach I should be using here?

You can pivot this using conditional aggregation and row_number(). For the results in your question:
with ue as (
select e.*, (u.name_frst + ' ' + u.name_lst ) as user_name,
cast(e.dttm_err as date) as err_date
from err_log e join
users u
on e.id_user = u.id_user
)
select u.err_date,
max(case when u.seqnum = 1 then u.user_name end) as user_1,
max(case when u.seqnum = 1 then u.cnt end) as cnt_1,
max(case when u.seqnum = 2 then u.user_name end) as user_2,
max(case when u.seqnum = 2 then u.cnt end) as cnt_2,
max(case when u.seqnum = 3 then u.user_name end) as user_3,
max(case when u.seqnum = 3 then u.cnt end) as cnt_3
from (select err_date, user_name, count(*) as cnt,
row_number() over (partition by err_date order by count(*) desc) as seqnum
from ue
group by err_date, user_name
) u
group by u.err_date
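As for the error itself: SELECT ... INTO creates its target table, so the second pass through the loop fails because the temp table already exists. If you would rather keep the loop, one option is to create the temp table once and INSERT into it on each iteration. The following is only a minimal sketch under assumptions: the table and column names come from the question, while the NVARCHAR width and the TOP (3) limit are my own guesses at what the report needs.

```sql
-- Sketch only. Names are from the question; the column width is an assumption.
CREATE TABLE #TempErrLog (
    date_err  DATE,
    user_err  NVARCHAR(200),
    count_err INT
);

DECLARE @StartDate DATE = '20180101';
DECLARE @EndDate   DATE;

WHILE @StartDate <= CONVERT(DATE, GETDATE())
BEGIN
    SET @EndDate = DATEADD(DAY, 1, @StartDate);

    -- INSERT ... SELECT appends rows; it does not try to re-create the table.
    INSERT INTO #TempErrLog (date_err, user_err, count_err)
    SELECT TOP (3)
           @StartDate,
           u.name_frst + ' ' + u.name_lst,
           COUNT(e.id_err)
    FROM err_log AS e
    LEFT JOIN users AS u ON e.id_user = u.id_user
    WHERE e.dttm_err >= @StartDate
      AND e.dttm_err <  @EndDate
      AND e.id_user <> 'system'
    GROUP BY u.name_frst + ' ' + u.name_lst
    ORDER BY COUNT(e.id_err) DESC;

    SET @StartDate = DATEADD(DAY, 1, @StartDate);
END;

SELECT * FROM #TempErrLog ORDER BY date_err, count_err DESC;

DROP TABLE #TempErrLog;
```

The set-based query is still likely to be the better choice, since it reads the log once instead of once per day.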

Related

Fill in blank dates for rolling average - CTE in Snowflake

I have two tables – activity and purchase
Activity table:
user_id date videos_watched
1 2020-01-02 3
1 2020-01-04 5
1 2020-01-07 5
Purchase table:
user_id purchase_date
1 2020-01-01
2 2020-02-02
What I would like to do is get a 30-day rolling average, counted from the purchase date, of how many videos have been watched.
The base query is like this:
SELECT
DATEDIFF(DAY, p.purchase_date, a.date) AS day_since_purchase,
AVG(A.VIDEOS_VIEWED)
FROM PURCHASE P
LEFT OUTER JOIN ACTIVITY A ON P.USER_ID = A.USER_ID AND
A.DATE >= P.PURCHASE_DATE AND A.DATE <= DATEADD(DAY, 30, P.PURCHASE_DATE)
GROUP BY 1;
However, the Activity table only has records for each day a video has been logged. I would like to fill in the blanks for days a video has not been viewed.
I have started to look into using a CTE like this:
WITH cte AS (
SELECT date('2020-01-01') as fdate
UNION ALL
SELECT CAST(DATEADD(day,1,fdate) as date)
FROM cte
WHERE fdate < date('2020-04-01')
) select * from cte
cross join purchases p
left outer join activity a
on p.user_id = a.user_id
and a.fdate = p.purchase_date
and a.date >= p.purchase_date and a.date <= dateadd(day, 30, p.purchase_date)
The end goal is to have something like this:
days_since_purchase videos_watched
1 3
2 0 --CTE coalesce inserted value
3 0
4 5
Been trying for the last couple of hours to get it right, but still can't really get the hang of it.
If you want to fill in the gaps in the result set, then I think you should be generating integers rather than dates:
WITH cte AS (
SELECT 1 as day_since_purchase
UNION ALL
SELECT 1 + day_since_purchase
FROM cte
WHERE day_since_purchase < 4
)
SELECT cte.day_since_purchase, COALESCE(avg_videos_viewed, 0)
FROM cte LEFT JOIN
(SELECT DATEDIFF(DAY, p.purchase_date, a.date) AS day_since_purchase,
AVG(A.VIDEOS_VIEWED) as avg_videos_viewed
FROM purchases p JOIN
activity a
ON p.user_id = a.user_id AND
a.fdate = p.purchase_date AND
a.date >= p.purchase_date AND
a.date <= dateadd(day, 30, p.purchase_date)
GROUP BY 1
) pa
ON pa.day_since_purchase = cte.day_since_purchase;
You can use a recursive query to generate the 30 days following each purchase, then bring in the activity table:
with cte as (
select
purchase_date,
user_id,
0 days_since_purchase,
purchase_date dt
from purchases
union all
select
purchase_date,
user_id,
days_since_purchase + 1,
dateadd(day, days_since_purchase + 1, purchase_date)
from cte
where days_since_purchase < 30
)
select
c.days_since_purchase,
avg(coalesce(a.videos_watched, 0)) avg_videos_watched
from cte c
left join activity a
on a.user_id = c.user_id
and a.fdate = c.purchase_date
and a.date = c.dt
group by c.days_since_purchase
Your question is unclear on whether you have a column in the activity table that stores the purchase date each row relates to. Your query has column fdate but not your sample data. I used that column in the query (without such column, you might end up counting the same activity in different purchases).

Multiple Selects into one select

I'm trying to put some data together for a Highcharts bar chart using ASP.NET. Basically, I have three users whose logins to the system I need to track. The periods to be used are:
1) Today
2) This Week
3) Last Week
4) Last Month
So, I've created individual T-SQL scripts for today and last week, but I'm now a little stuck on how to combine the two statements, which will eventually be four.
SELECT Count(*) as CountToday from hitsTable WHERE Convert(date,hitDate) =
Convert(date,GETDATE()) Group by UserId
SELECT count(*) as CountLatWeek from hitTable
where hitDate between (DATEADD(week, DATEDIFF (week,0,GETDATE()),-1))
AND getDate() Group by UserId
Searching on Google leads me to nested select statements, which all seem to form dependencies between the two statements. However, what I need to do is produce a table of results like this:
EDIT
I've set up a SQL Fiddle, so we can test out the examples
http://www.sqlfiddle.com/#!6/a21ec
The fiddle has T-SQL for today and T-SQL for last week (which may need some tweaking).
Select Distinct
UserId
, ( Select Count(*) as CountToday from hitsTable h2
Where h2.UserId = h1.UserId
And Convert(date,hitDate) = Convert(date,GETDATE())
) As CountToday
, ( Select count(*) as CountLatWeek from hitsTable h2
Where h2.UserId = h1.UserId
And hitDate Between DATEADD(dd, -(DATEPART(dw, GetDate())-1)-7, GetDate())
And DATEADD(dd, 7-(DATEPART(dw, GetDate()))-7, GetDate())
) As CountLastWeek
FROM hitsTable h1
Here’s another alternative based on @Avinash's comment on the question.
Select Distinct
h1.UserId
, CountTodayTable.CountToday
, CountLatWeekTable.CountLatWeek
, ...
FROM hitsTable h1
-- A derived table cannot reference h1, so aggregate per user and join on UserId.
-- Inner Join keeps only users with hits in every period; use Left Join to keep the rest.
Inner Join
( Select UserId, Count(*) as CountToday from hitsTable
Where Convert(date,hitDate) = Convert(date,GETDATE())
Group By UserId
) CountTodayTable
On CountTodayTable.UserId = h1.UserId
Inner Join
( Select UserId, count(*) as CountLatWeek from hitsTable
Where hitDate between (DATEADD(week, DATEDIFF (week,0,GETDATE()),-1)) And getDate()
Group By UserId
) CountLatWeekTable
On CountLatWeekTable.UserId = h1.UserId
...
Try this query
select
UserId,
sum(case when Convert(date,hitDate) = Convert(date,GETDATE()) then 1 else 0 end) as CountToday,
sum(case when hitDate between (DATEADD(week, DATEDIFF (week,0,GETDATE()),-1)) AND getDate() then 1 else 0 end) as CountLatWeek,
...... -- Add more condition
from
hitsTable
group by
UserId
Edit
select
userid,
sum(case when Convert(date,hitDate) =
Convert(date,GETDATE()) then 1 else 0 end) as cnt
from
hitstable
group by userid
FIDDLE
| USERID | CNT |
|--------|-----|
| User1 | 3 |
| User2 | 0 |
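To extend the conditional aggregation to all four periods from the question, it may help to compute the period boundaries once and compare against half-open ranges. This is a rough sketch, not a tested answer: the variable names and the output aliases are mine, weeks are assumed to start on Monday, "last month" is assumed to mean the previous calendar month, and hitDate is assumed never to lie in the future.

```sql
-- Boundaries are computed once. Day 0 (1900-01-01) is a Monday, so the
-- modulo expression steps back to the Monday on or before today.
DECLARE @Today     datetime = CONVERT(date, GETDATE());
DECLARE @ThisWeek  datetime = DATEADD(day, -(DATEDIFF(day, 0, @Today) % 7), @Today);
DECLARE @LastWeek  datetime = DATEADD(week, -1, @ThisWeek);
DECLARE @ThisMonth datetime = DATEADD(month, DATEDIFF(month, 0, @Today), 0);
DECLARE @LastMonth datetime = DATEADD(month, -1, @ThisMonth);

SELECT
    UserId,
    SUM(CASE WHEN hitDate >= @Today THEN 1 ELSE 0 END) AS CountToday,
    SUM(CASE WHEN hitDate >= @ThisWeek THEN 1 ELSE 0 END) AS CountThisWeek,
    SUM(CASE WHEN hitDate >= @LastWeek  AND hitDate < @ThisWeek
             THEN 1 ELSE 0 END) AS CountLastWeek,
    SUM(CASE WHEN hitDate >= @LastMonth AND hitDate < @ThisMonth
             THEN 1 ELSE 0 END) AS CountLastMonth
FROM hitsTable
GROUP BY UserId;
```

Comparing hitDate directly against the boundaries, rather than wrapping it in Convert(), also leaves the door open for an index on hitDate to be used.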

SQL: Count Users with Activity in the Past Week

For every date, I am trying to count the number of users who have had at least two sessions in the preceding 7 days OR at least ten in the preceding 30 days.
My data is as follows:
Date UserID SessionID
1/1/2013 Bob1234 1
2/1/2013 Bob1234 2
2/2/2013 Bob1234 3
2/3/2013 Cal5678 4
Which would result in the following table (only select dates shown)
Date CountActiveUsers
1/1/2013 1
1/15/2013 0
2/2/2013 1
2/3/2013 2
The real data set has values for all dates in a continuous data range and the results table should have an entry for every date.
SessionIDs are unique and a UserID always refers to the same person.
So far I have two queries that do something close-ish. The first returns the count of sessions in the past week by user:
SELECT Count(
d.[SessionID]
) As SessionPastWeek
,m.[UserID]
,m.[Date]
FROM [Cosmos].[dbo].[Sessions_tbl] as m
Inner Join [Cosmos].[dbo].[Sessions_tbl] as d
on m.[UserID2] = d.[UserID] AND
--Between does not work here for some reason
d.[Date] <= m.[Date] AND
d.[Date] > DATEADD(d,-7,m.[date])
Group By m.[UserID]
,m.[Date]
The other is from the following link, which counts the number of active users on a given date:
Active Users SQL query
I am in SQL Server 2012
I am having trouble combining the two.
Edit for clarification: the query I need likely won't have any getdate() or similar, as I need to know how many users fit the 'active' criteria on Jan 1, today, and all the dates in between.
Thanks for any help!
I think you just need to add a HAVING clause:
HAVING COUNT(d.[SessionID]) >= 2
On your 10 in 30 query, just change your DATEADD() to have 30 days, and change the HAVING clause to be >= 10.
SELECT COUNT(d.[SessionID]) AS SessionPastPeriod
, m.[UserID]
, m.[Date]
FROM Sessions_tbl AS m
INNER JOIN Sessions_tbl as d
ON m.UserID = d.UserID
AND d.[Date] <= m.[Date]
AND d.[Date] > DATEADD(d,-7,m.[Date])
GROUP BY m.UserID
, m.[Date]
HAVING COUNT(d.[SessionID]) >= 2
I hope this helps.
You are very close.
SELECT Count(d.[SessionID]) As SessionPastWeek
,m.[UserID]
,m.[Date]
FROM [Cosmos].[dbo].[Sessions_tbl] as m
Inner Join [Cosmos].[dbo].[Sessions_tbl] as d on m.[UserID2] = d.[UserID]
--Between does not work here for some reason
where --ADD where clause
d.[Date] <= getdate() AND
d.[Date] > DATEADD(d,-7,getdate())
Group By m.[UserID],m.[Date]
having Count(d.[SessionID])>1 --The magical clause for you.
select count(*)
from (
select UserID
, sum(case when Date between dateadd(day, -7, getdate()) and getdate()
then 1 end) as LastWeek
, sum(case when Date between dateadd(day, -30, getdate()) and getdate()
then 1 end) as Last30Days
from Sessions_tbl
group by
UserID
) SubQueryAlias
where LastWeek >= 2
or Last30Days >= 10
The following query works:
Select
Count(UserID) As CountUsers
,[Date]
From(
SELECT COUNT(d.[SessionID]) AS SessionPastPeriod
, m.[Date]
, m.UserID
FROM [Sessions_tbl] AS m
INNER JOIN [Sessions_tbl] as d
ON m.UserID = d.UserID
AND d.[Date] <= m.[Date]
AND d.[Date] > DATEADD(d,-7,m.[Date])
GROUP BY
m.UserID
,m.[Date]
HAVING COUNT(d.[SessionID]) >= 2) SQ
Group By [Date]
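To cover both criteria at once and still return a row for dates with no active users, one possible sketch follows. It assumes the Sessions_tbl columns from the question and that every date of interest appears at least once in the table, as the question states. The CTE names are mine.

```sql
WITH dates AS (
    -- One row per calendar date present in the data.
    SELECT DISTINCT [Date] FROM Sessions_tbl
),
active AS (
    -- For each date, look back 30 days per user and test both thresholds.
    SELECT d.[Date], s.UserID
    FROM dates AS d
    INNER JOIN Sessions_tbl AS s
        ON  s.[Date] <= d.[Date]
        AND s.[Date] >  DATEADD(day, -30, d.[Date])
    GROUP BY d.[Date], s.UserID
    HAVING SUM(CASE WHEN s.[Date] > DATEADD(day, -7, d.[Date])
                    THEN 1 ELSE 0 END) >= 2   -- two sessions in 7 days
        OR COUNT(*) >= 10                     -- or ten sessions in 30 days
)
SELECT d.[Date],
       COUNT(a.UserID) AS CountActiveUsers   -- 0 when no user qualifies
FROM dates AS d
LEFT JOIN active AS a
    ON a.[Date] = d.[Date]
GROUP BY d.[Date]
ORDER BY d.[Date];
```

Unlike the self-join on m.[Date], this counts a user on a date even if they had no session on that exact day, which may or may not be what you want.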

SQL Query in CRM Report

A "Case" in CRM has a field called "Status" with four options.
I'm trying to
build a report in CRM that fills a table with every week of the year (each row is a different week), and then counts the number of cases that have each Status option (the columns would be each of the Status options).
The table would look like this
Status 1 Status 2 Status 3
Week 1 3 55 4
Week 2 5 23 5
Week 3 14 11 33
So far I have the following:
SELECT
SUM(case WHEN status = 1 then 1 else 0 end) Status1,
SUM(case WHEN status = 2 then 1 else 0 end) Status2,
SUM(case WHEN status = 3 then 1 else 0 end) Status3,
SUM(case WHEN status = 4 then 1 else 0 end) Status4,
SUM(case WHEN status = 5 then 1 else 0 end) Status5
FROM [DB].[dbo].[Contact]
Which gives me the following:
Status 1 Status 2 Status 3
2 43 53
Now I need to somehow split this into 52 rows for the past year and filter these results by date (columns in the Contact table). I'm a bit new to SQL queries and CRM - any help here would be much appreciated.
Here is a SQLFiddle with my progress and sample data: http://sqlfiddle.com/#!2/85b19/1
Sounds like you want to group by a range. The trick is to create a new field that represents each range (for you, one per week) and group by that.
Since it also seems like you want an infinite range of dates, marc_s has a good summary for how to do the group by trick with dates in a generic way: SQL group by frequency within a date range
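As a rough illustration of the group-by-range idea in T-SQL, the sketch below derives a week-start value for each row and groups on it. CreatedOn is a hypothetical date column (the question only says the date columns live in the Contact table), weeks are assumed to start on Monday, and only three of the status columns are shown.

```sql
-- Assumption: CreatedOn is a hypothetical datetime column on Contact.
SELECT
    w.WeekStart,
    SUM(CASE WHEN c.status = 1 THEN 1 ELSE 0 END) AS Status1,
    SUM(CASE WHEN c.status = 2 THEN 1 ELSE 0 END) AS Status2,
    SUM(CASE WHEN c.status = 3 THEN 1 ELSE 0 END) AS Status3
FROM [DB].[dbo].[Contact] AS c
CROSS APPLY (
    -- Day 0 (1900-01-01) is a Monday, so this steps back to the
    -- Monday on or before CreatedOn.
    SELECT DATEADD(day,
                   -(DATEDIFF(day, 0, c.CreatedOn) % 7),
                   CAST(c.CreatedOn AS date)) AS WeekStart
) AS w
WHERE c.CreatedOn >= DATEADD(year, -1, GETDATE())
GROUP BY w.WeekStart
ORDER BY w.WeekStart;
```

Weeks with no rows at all will not appear in this output; producing them requires joining against a list of weeks.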
So, let's break this down:
You want to make a report that shows, for each contact, a breakdown, week by week, of the number of cases registered to that contact, which is divided into three columns, one for each StateCode.
If this is the case, then you would need to have 52 date records (or so) for each contact. For calendar-like requests, it's always good to have a separate calendar table that you can query. Dan Guzman has a blog entry that creates a useful calendar table, which I'll use in the query.
WITH WeekNumbers AS
(
SELECT
FirstDateOfWeek,
-- order by first date of week, grouping calendar year to produce week numbers
WeekNumber = row_number() OVER (PARTITION BY CalendarYear ORDER BY FirstDateOfWeek)
FROM
master.dbo.Calendar -- created from script
GROUP BY
FirstDateOfWeek,
CalendarYear
), Calendar AS
(
SELECT
WeekNumber =
(
SELECT
WeekNumber
FROM
WeekNumbers WN
WHERE
C.FirstDateOfWeek = WN.FirstDateOfWeek
),
*
FROM
master.dbo.Calendar C
WHERE
CalendarDate BETWEEN '1/1/2012' AND getutcdate()
)
SELECT
C.FullName,
----include the below if the data is necessary
--Cl.WeekNumber,
--Cl.CalendarYear,
--Cl.FirstDateOfWeek,
--Cl.LastDateOfWeek,
'Week: ' + CAST(Cl.WeekNumber AS VARCHAR(20))
+ ', Year: ' + CAST(Cl.CalendarYear AS VARCHAR(20)) WeekNumber
FROM
CRM.dbo.Contact C
-- use a cartesian join to produce a table list
CROSS JOIN
(
SELECT
DISTINCT WeekNumber,
CalendarYear,
FirstDateOfWeek,
LastDateOfWeek
FROM
Calendar
) Cl
ORDER BY
C.FullName,
Cl.WeekNumber
This is different from the solution Ben linked to because Marc's query only returns weeks where there is a matching value, whereas you may or may not want to see even the weeks where there is no activity.
Once you have your core tables of contacts split out week by week as in the above (or altered for your specific time period), you can simply add a subquery for each StateCode to see the breakdown in columns as in the final query below.
WITH WeekNumbers AS
(
SELECT
FirstDateOfWeek,
WeekNumber = row_number() OVER (PARTITION BY CalendarYear ORDER BY FirstDateOfWeek)
FROM
master.dbo.Calendar
GROUP BY
FirstDateOfWeek,
CalendarYear
), Calendar AS
(
SELECT
WeekNumber =
(
SELECT
WeekNumber
FROM
WeekNumbers WN
WHERE
C.FirstDateOfWeek = WN.FirstDateOfWeek
),
*
FROM
master.dbo.Calendar C
WHERE
CalendarDate BETWEEN '1/1/2012' AND getutcdate()
)
SELECT
C.FullName,
--Cl.WeekNumber,
--Cl.CalendarYear,
--Cl.FirstDateOfWeek,
--Cl.LastDateOfWeek,
'Week: ' + CAST(Cl.WeekNumber AS VARCHAR(20)) +', Year: ' + CAST(Cl.CalendarYear AS VARCHAR(20)) WeekNumber,
(
SELECT
count(*)
FROM
CRM.dbo.Incident I
INNER JOIN CRM.dbo.StringMap SM ON
I.StateCode = SM.AttributeValue
INNER JOIN
(
SELECT
DISTINCT ME.Name,
ME.ObjectTypeCode
FROM
CRM.MetadataSchema.Entity ME
) E ON
SM.ObjectTypeCode = E.ObjectTypeCode
WHERE
I.ModifiedOn >= Cl.FirstDateOfWeek
AND I.ModifiedOn < dateadd(day, 1, Cl.LastDateOfWeek)
AND E.Name = 'incident'
AND SM.AttributeName = 'statecode'
AND SM.LangId = 1033
AND I.CustomerId = C.ContactId
AND SM.Value = 'Active'
) ActiveCases,
(
SELECT
count(*)
FROM
CRM.dbo.Incident I
INNER JOIN CRM.dbo.StringMap SM ON
I.StateCode = SM.AttributeValue
INNER JOIN
(
SELECT
DISTINCT ME.Name,
ME.ObjectTypeCode
FROM
CRM.MetadataSchema.Entity ME
) E ON
SM.ObjectTypeCode = E.ObjectTypeCode
WHERE
I.ModifiedOn >= Cl.FirstDateOfWeek
AND I.ModifiedOn < dateadd(day, 1, Cl.LastDateOfWeek)
AND E.Name = 'incident'
AND SM.AttributeName = 'statecode'
AND SM.LangId = 1033
AND I.CustomerId = C.ContactId
AND SM.Value = 'Resolved'
) ResolvedCases,
(
SELECT
count(*)
FROM
CRM.dbo.Incident I
INNER JOIN CRM.dbo.StringMap SM ON
I.StateCode = SM.AttributeValue
INNER JOIN
(
SELECT
DISTINCT ME.Name,
ME.ObjectTypeCode
FROM
CRM.MetadataSchema.Entity ME
) E ON
SM.ObjectTypeCode = E.ObjectTypeCode
WHERE
I.ModifiedOn >= Cl.FirstDateOfWeek
AND I.ModifiedOn < dateadd(day, 1, Cl.LastDateOfWeek)
AND E.Name = 'incident'
AND SM.AttributeName = 'statecode'
AND SM.LangId = 1033
AND I.CustomerId = C.ContactId
AND SM.Value = 'Canceled'
) CancelledCases
FROM
CRM.dbo.Contact C
CROSS JOIN
(
SELECT
DISTINCT WeekNumber,
CalendarYear,
FirstDateOfWeek,
LastDateOfWeek
FROM
Calendar
) Cl
ORDER BY
C.FullName,
Cl.WeekNumber

How to determine if two records are 1 year apart (using a timestamp)

I need to analyze some weblogs and determine if a user has visited once, taken a year break, and visited again. I want to add a flag to every row (Y/N) with a VisitId that meets the above criteria.
How would I go about creating this sql?
Here are the fields I have, that I think need to be used (by analyzing the timestamp of the first page of each visit):
VisitID - each visit has a unique Id (ie. 12356, 12345, 16459)
UserID - each user has one Id (ie. steve = 1, ted = 2, mark = 12345, etc...)
TimeStamp - looks like this: 2010-01-01 00:32:30.000
select VisitID, UserID, TimeStamp from page_view_t where pageNum = 1;
thanks - any help would be greatly appreciated.
You could rank every user's rows, then join the ranked row set to itself to compare adjacent rows:
;
WITH ranked AS (
SELECT
*,
rnk = ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY TimeStamp)
FROM page_view_t
),
flagged AS (
SELECT
*,
IsReturnVisit = CASE
WHEN EXISTS (
SELECT *
FROM ranked
WHERE UserID = r.UserID
AND rnk = r.rnk - 1
AND TimeStamp <= DATEADD(YEAR, -1, r.TimeStamp)
)
THEN 'Y'
ELSE 'N'
END
FROM ranked r
)
SELECT
VisitID,
UserID,
TimeStamp,
IsReturnVisit
FROM flagged
Note: the above flags only return visits.
UPDATE
To flag the first visits same as return visits, the flagged CTE could be modified as follows:
…
SELECT
*,
IsFirstOrReturnVisit = CASE
WHEN p.UserID IS NULL OR r.TimeStamp >= DATEADD(YEAR, 1, p.TimeStamp)
THEN 'Y'
ELSE 'N'
END
FROM ranked r
LEFT JOIN ranked p ON r.UserID = p.UserID AND r.rnk = p.rnk + 1
…
References that might be useful:
WITH common_table_expression (Transact-SQL)
Ranking Functions (Transact-SQL)
ROW_NUMBER (Transact-SQL)
The other guy was faster, but since I took the time to do it and it's a completely different approach, I might as well post it :D.
SELECT pv2.VisitID,
pv2.UserID,
pv2.TimeStamp,
CASE WHEN pv1.VisitID IS NOT NULL
AND pv3.VisitID IS NULL
THEN 'YES' ELSE 'NO' END AS IsReturnVisit
FROM page_view_t pv2
LEFT JOIN page_view_t pv1 ON pv1.UserID = pv2.UserID
AND pv1.VisitID <> pv2.VisitID
AND (pv1.TimeStamp <= DATEADD(YEAR, -1, pv2.TimeStamp)
OR pv2.TimeStamp <= DATEADD(YEAR, -1, pv1.TimeStamp))
AND pv1.pageNum = 1
LEFT JOIN page_view_t pv3 ON pv1.UserID = pv3.UserID
AND (pv3.TimeStamp BETWEEN pv1.TimeStamp AND pv2.TimeStamp
OR pv3.TimeStamp BETWEEN pv2.TimeStamp AND pv1.TimeStamp)
AND pv3.pageNum = 1
WHERE pv2.pageNum = 1
Assuming the page_view_t table stores the UserID and TimeStamp details of each visit, the following query will return users who have taken a break of at least a year (365 days) between two consecutive visits.
select t1.UserID
from page_view_t t1
where (
select datediff(day, max(t2.[TimeStamp]), t1.[TimeStamp])
from page_view_t t2
where t2.UserID = t1.UserID and t2.[TimeStamp] < t1.[TimeStamp]
group by t2.UserID
) >= 365
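On SQL Server 2012 or later, the comparison with the previous visit can also be written with LAG(), which avoids the self-join. This is a sketch under assumptions: it uses the column names from the question, restricts to pageNum = 1 as the question suggests, and the CTE and PrevTimeStamp names are mine.

```sql
WITH visits AS (
    SELECT
        VisitID,
        UserID,
        [TimeStamp],
        -- Timestamp of the same user's previous visit, NULL for the first one.
        LAG([TimeStamp]) OVER (PARTITION BY UserID
                               ORDER BY [TimeStamp]) AS PrevTimeStamp
    FROM page_view_t
    WHERE pageNum = 1
)
SELECT
    VisitID,
    UserID,
    [TimeStamp],
    CASE
        WHEN PrevTimeStamp IS NOT NULL
         AND [TimeStamp] >= DATEADD(YEAR, 1, PrevTimeStamp)
        THEN 'Y'
        ELSE 'N'
    END AS IsReturnVisit
FROM visits;
```

As with the ranked approach, this flags only the return visit itself, not the visit that preceded the break.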