SQL Server - Count events that happen from 15 min to 14 days from base time

I am using SQL Server 2005. I am trying to count the number of repeats that fall between 15 minutes and 14 days after the first call, when the Client and Type are the same.
The Table [Interactions] looks like:
eci_date user_ID Type Client
2012-05-01 10:29:59.000 user1 12 14
2012-05-01 10:35:04.000 user1 3 15
2012-05-01 10:45:14.000 user3 4 14
2012-05-01 11:50:22.000 user1 5 15
------------------------------------------
2012-05-02 10:30:28.000 user2 12 14
2012-05-02 10:48:59.000 user5 12 14
2012-05-02 10:52:23.000 user2 12 15
2012-05-02 12:49:45.000 user8 3 14
------------------------------------------
2012-05-03 10:30:47.000 user4 5 15
2012-05-03 10:35:00.000 user6 4 12
2012-05-03 10:59:10.000 user7 4 12
I would like the output to look like:
eci_date Type Total_Calls Total_Repeats
2012-05-01 12 1 2
2012-05-01 3 1 0
2012-05-01 4 1 0
2012-05-01 5 1 1
---------------------------------------------
2012-05-02 12 3 0
2012-05-02 3 1 0
---------------------------------------------
2012-05-03 4 2 1
2012-05-03 5 1 0
So for 2012-05-01, Type 12, there would be 2 repeats, because client 14 called in 2 more times after their first call. Client and Type must be the same, and the results need to be broken down by day.
Thank You.

With Metrics As
(
Select T1.Client, T1.Type
, Min(eci_Date) As FirstCallDate
From Table1 As T1
Group By T1.Client, T1.Type
)
Select DateAdd(d, DateDiff(d,0,T1.eci_date), 0) As [Day], Type, Count(*) As TotalCalls
, (
Select Count(*)
From Table1 As T2
Join Metrics As M2
On M2.Client = T2.Client
And M2.Type = T2.Type
Where T2.eci_Date >= DateAdd(mi,15,M2.FirstCallDate)
And T2.eci_date <= DateAdd(d,14,M2.FirstCallDate) -- 14 days, as stated in the question
And DateAdd(d, DateDiff(d,0,T1.eci_date), 0) = DateAdd(d, DateDiff(d,0,T2.eci_date), 0)
) As Total_Repeats
From Table1 As T1
Group By DateAdd(d, DateDiff(d,0,T1.eci_date), 0), Type
Order By [Day] Asc, Type Desc

Your question is vague, so I'm interpreting it to mean the following:
* The "Total_Calls" column is the number of distinct users on a given day
* The number of repeats is the number of calls after the first one in the next 14 days
The following query accomplishes this:
select t.eci_date, count(distinct t.user_id) as numusers, count(t2.user_id) as Total_repeats
from
(
-- one row per user per day, with that user's first call of the day
select cast(eci_date as date) as eci_date,
user_id,
count(*) as total,
min(eci_date) as firstcall
from Interactions
group by cast(eci_date as date), user_id
) t
left outer join Interactions t2
on t.user_id = t2.user_id
and t2.eci_date between t.firstcall and dateadd(day, 14, t.firstcall)
and t2.eci_date <> t.firstcall
-- count(t2.user_id) skips the NULL row produced when a user has no later calls
group by t.eci_date
Note this uses the syntax cast(<datetime> as date) to extract the date portion from a datetime. The date type requires SQL Server 2008 or later; on SQL Server 2005, use DateAdd(d, DateDiff(d, 0, eci_date), 0) instead.
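One reading of the expected output in the question is that Total_Calls counts calls per day and Type, while each repeat is attributed to the day of the first call for its Client/Type pair. The sketch below checks that reading against the sample rows. It is only an illustration: it runs through Python's sqlite3 module because that is easy to execute, so SQLite's date() and datetime(..., '+15 minutes') stand in for the T-SQL DateAdd/DateDiff expressions, and the CTE names (FirstCalls, Repeats, Calls) are made up for this example.

```python
import sqlite3

# Sample rows from the question (milliseconds dropped so plain string
# comparison against SQLite's datetime() output works).
rows = [
    ("2012-05-01 10:29:59", "user1", 12, 14),
    ("2012-05-01 10:35:04", "user1", 3, 15),
    ("2012-05-01 10:45:14", "user3", 4, 14),
    ("2012-05-01 11:50:22", "user1", 5, 15),
    ("2012-05-02 10:30:28", "user2", 12, 14),
    ("2012-05-02 10:48:59", "user5", 12, 14),
    ("2012-05-02 10:52:23", "user2", 12, 15),
    ("2012-05-02 12:49:45", "user8", 3, 14),
    ("2012-05-03 10:30:47", "user4", 5, 15),
    ("2012-05-03 10:35:00", "user6", 4, 12),
    ("2012-05-03 10:59:10", "user7", 4, 12),
]

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE Interactions (eci_date TEXT, user_ID TEXT, Type INTEGER, Client INTEGER)"
)
con.executemany("INSERT INTO Interactions VALUES (?, ?, ?, ?)", rows)

query = """
WITH FirstCalls AS (          -- first call per Client/Type pair
    SELECT Client, Type, MIN(eci_date) AS first_call
    FROM Interactions
    GROUP BY Client, Type
),
Repeats AS (                  -- later calls 15 minutes to 14 days after the first
    SELECT date(f.first_call) AS day, f.Type AS Type, COUNT(*) AS repeats
    FROM FirstCalls AS f
    JOIN Interactions AS i
      ON i.Client = f.Client AND i.Type = f.Type
    WHERE i.eci_date >= datetime(f.first_call, '+15 minutes')
      AND i.eci_date <= datetime(f.first_call, '+14 days')
    GROUP BY date(f.first_call), f.Type
),
Calls AS (                    -- all calls per day and Type
    SELECT date(eci_date) AS day, Type, COUNT(*) AS total_calls
    FROM Interactions
    GROUP BY date(eci_date), Type
)
SELECT c.day, c.Type, c.total_calls, COALESCE(r.repeats, 0) AS total_repeats
FROM Calls AS c
LEFT JOIN Repeats AS r ON r.day = c.day AND r.Type = c.Type
ORDER BY c.day, c.Type
"""

result = con.execute(query).fetchall()
for row in result:
    print(row)
```

With the sample data this yields the same Total_Calls and Total_Repeats values as the table in the question, ordered by ascending Type within each day.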

Related

Calculating moving average over irregular data

I am trying to calculate a moving average of several fields in a SQL Server database that involve irregularly spaced values over time. I realize that for regularly spaced data I can use a SELECT grp, AVG(count) FROM t ... OVER (PARTITION BY grp ... ROWS 7 PRECEDING) to create a moving average of the prior week's data. However, I have data organized as follows:
DATE GRP COUNT
2018-07-05 1 10
2018-07-08 1 4
2018-07-11 1 6
2018-07-12 1 6
2018-07-11 2 5
2018-07-15 2 10
2018-07-17 2 8
2018-07-20 2 10
...
Where for most groups there are no observations for some dates. The output I'm looking for is:
DATE GRP MOVING_AVG
2018-07-05 1 10
2018-07-08 1 7
2018-07-11 1 6.67
2018-07-13 1 5.33
2018-07-11 2 5
2018-07-15 2 7.5
2018-07-16 2 7.67
2018-07-20 2 9.33
Is there a way of specifying dates instead of rows in the PRECEDING clause, or do I have to create some sort of mask to average over?
EDITED FOR CLARIFICATION BASED ON COMMENTS
In SQL Server, I think this might be achieved more simply with a correlated subquery:
select
date,
grp,
(
select avg(count)
from mytable t1
where
t1.grp = t.grp
and t1.date >= dateadd(year, -1, t.date) -- look-back window; use dateadd(day, -7, t.date) for one week
and t1.date <= t.date
) as cnt
from mytable t
If I'm not misunderstanding, you want to average over the rows from the 7 (or however many) days before each date, rather than over a fixed number of preceding rows.
DATE GRP COUNT
2018-07-11 2 5
2018-07-15 2 10
2018-07-17 2 8
2018-07-20 2 10 <--- the AVG for this row must cover only the 7 days before it, so 2018-07-11 is not included
In that case :
select
date,
grp,
(
select avg(count)
from t t1
where
t1.grp = t.grp
and DATEDIFF(day, t1.date, t.date) <= 7 /*7 or whatever day you want*/
and t1.date <= t.date
) as MOVING_AVG
from t
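A date-bounded correlated subquery of this kind can be tried out quickly with the rows from the question. The sketch below is an assumption-laden illustration rather than the answerers' exact code: it uses Python's sqlite3 module, so date(x, '-7 days') replaces DATEDIFF/DATEADD, and the column names obs_date and cnt are hypothetical substitutes for date and count. One SQL Server detail worth keeping in mind: AVG over an int column returns an int there, so the column would need a cast to a decimal or float type to get fractional averages.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (obs_date TEXT, grp INTEGER, cnt INTEGER)")
con.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("2018-07-05", 1, 10),
        ("2018-07-08", 1, 4),
        ("2018-07-11", 1, 6),
        ("2018-07-12", 1, 6),
        ("2018-07-11", 2, 5),
        ("2018-07-15", 2, 10),
        ("2018-07-17", 2, 8),
        ("2018-07-20", 2, 10),
    ],
)

# For each row, average every row of the same group whose date lies in
# the window [obs_date - 7 days, obs_date], however many rows that is.
query = """
SELECT t.obs_date, t.grp,
       (SELECT ROUND(AVG(t1.cnt), 2)
          FROM t AS t1
         WHERE t1.grp = t.grp
           AND t1.obs_date >= date(t.obs_date, '-7 days')
           AND t1.obs_date <= t.obs_date) AS moving_avg
  FROM t
 ORDER BY t.grp, t.obs_date
"""

result = con.execute(query).fetchall()
for row in result:
    print(row)
```

Because the window bound is inclusive, the 2018-07-12 row of group 1 still sees 2018-07-05 (exactly 7 days earlier) and averages to 6.5; using a strict comparison on the lower bound would drop that row from the window.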

Subtract subsequent row from previous row based on User

I have the following data, and I want to subtract the previous row from the current row for each UserID. I tried the code below, but it does not give me what I want:
DECLARE @DATETBLE TABLE (UserID INT, Dates DATE)
INSERT INTO @DATETBLE VALUES
(1,'2018-01-01'), (1,'2018-01-02'), (1,'2018-01-03'),(1,'2018-01-13'),
(2,'2018-01-15'),(2,'2018-01-16'),(2,'2018-01-17'), (5,'2018-02-04'),
(5,'2018-02-05'),(5,'2018-02-06'),(5,'2018-02-11'), (5,'2018-02-17')
;with cte as (
select UserID,Dates, row_number() over (order by UserID) as seqnum
from @DATETBLE t
)
select t.UserID,t.Dates, datediff(day,tprev.Dates,t.Dates)as diff
from cte t left outer join
cte tprev
on t.seqnum = tprev.seqnum + 1;
Current Output
UserID Dates diff
1 2018-01-01 NULL
1 2018-01-02 1
1 2018-01-03 1
1 2018-01-13 10
2 2018-01-15 2
2 2018-01-16 1
2 2018-01-17 1
5 2018-02-04 18
5 2018-02-05 1
5 2018-02-06 1
5 2018-02-11 5
5 2018-02-17 6
My Expected Output
UserID Dates diff
1 2018-01-01 NULL
1 2018-01-02 1
1 2018-01-03 1
1 2018-01-13 10
2 2018-01-15 NULL
2 2018-01-16 1
2 2018-01-17 1
5 2018-02-04 NULL
5 2018-02-05 1
5 2018-02-06 1
5 2018-02-11 5
5 2018-02-17 6
Since your question is tagged sql-server-2008, I suggest using APPLY:
select t.userid, t.dates, datediff(day, t1.dates, t.dates) as diff
from @DATETBLE t outer apply
( select top (1) t1.*
from @DATETBLE t1
where t1.userid = t.userid and
t1.dates < t.dates
order by t1.dates desc
) t1;
If you have SQL Server version 2012 or higher, you could use LAG() with a partition by UserID:
SELECT UserID
, Dates
, DATEDIFF(dd,COALESCE(LAG_DATES, Dates), Dates) as diff
FROM
(
SELECT UserID
, Dates
, LAG(Dates) OVER (PARTITION BY UserID ORDER BY Dates) as LAG_DATES
FROM @DATETBLE
) exp
This will give you a 0 value instead of a NULL value for the first date in the sequence though.
Since you tagged the post with SQL Server 2008, however, you may need to use a method that doesn't rely on this windowed function.
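The per-user "previous row" lookup can be checked against the expected output with a small script. This is a sketch under stated assumptions: it runs on Python's sqlite3 module rather than SQL Server, the table name dates_tbl is hypothetical, and a correlated MAX() subquery plays the role that OUTER APPLY ... TOP (1) plays above. The difference in days is computed with SQLite's julianday(), which stands in for DATEDIFF(day, ...).

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE dates_tbl (UserID INTEGER, Dates TEXT)")
con.executemany(
    "INSERT INTO dates_tbl VALUES (?, ?)",
    [
        (1, "2018-01-01"), (1, "2018-01-02"), (1, "2018-01-03"), (1, "2018-01-13"),
        (2, "2018-01-15"), (2, "2018-01-16"), (2, "2018-01-17"),
        (5, "2018-02-04"), (5, "2018-02-05"), (5, "2018-02-06"),
        (5, "2018-02-11"), (5, "2018-02-17"),
    ],
)

# For each row, find the latest earlier date of the SAME user; when there
# is none, the subquery yields NULL and so does the difference.
query = """
SELECT d.UserID, d.Dates,
       CAST(julianday(d.Dates) - julianday(
                (SELECT MAX(p.Dates)
                   FROM dates_tbl AS p
                  WHERE p.UserID = d.UserID
                    AND p.Dates < d.Dates)
            ) AS INTEGER) AS diff
  FROM dates_tbl AS d
 ORDER BY d.UserID, d.Dates
"""

result = con.execute(query).fetchall()
for row in result:
    print(row)
```

The first row of every user comes back with diff = None (NULL), matching the expected output, because the lookup is restricted to the same UserID instead of relying on a global row number.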

How to get single result when a column has the same value but the second column have different value

I have this table
Id VendorId ClaimRequestDate
1 5 2017-12-14 00:00:00.000
2 5 2018-02-02 00:00:00.000
7 5 2018-02-07 11:08:25.257
I want my result to show only the latest date for each VendorId, considering only dates later than 2 Feb 2018.
What I've done so far:
SELECT DISTINCT
[Project1].[Id] AS [Id],
[Project1].[VendorId] AS [VendorId],
[Project1].[ClaimRequestDate] AS [ClaimRequestDate]
FROM ( SELECT
[Extent1].[Id] AS [Id],
[Extent1].[VendorId] AS [VendorId],
[Extent1].[ClaimRequestDate] AS [ClaimRequestDate]
FROM [dbo].[Claim] AS [Extent1]
WHERE [Extent1].[ClaimRequestDate] >= '2018-02-02 00:00:00.000'
) AS [Project1]
ORDER BY [Project1].[ClaimRequestDate] DESC
But my result is
Id VendorId ClaimRequestDate
7 5 2018-02-07 11:08:25.257
2 5 2018-02-02 00:00:00.000
Can someone help me with this?
There are three problems in your query. The first is this condition:
WHERE [Extent1].[ClaimRequestDate] >= '2018-02-02 00:00:00.000'
The operator >= should be >. The second is that your query returns every row dated later than 2018-02-02, so if a VendorId has more than one such row, all of them are returned. You can try this instead:
SELECT * FROM Claim c
where ClaimRequestDate IN (select MAX(ClaimRequestDate) from claim c1
where c.vendorId =c1.vendorId and c1.Claimrequestdate >'2018.02.02')
The third is that when a VendorId has more than one row with the same MAX(ClaimRequestDate), this query returns all of them. For example, this data:
Id VendorId ClaimRequestDate
1 5 2017-12-14 00:00:00.000
2 5 2018-02-02 00:00:00.000
7 5 2018-02-07 11:08:25.257
8 5 2018-02-07 11:08:25.257
returns
Id VendorId ClaimRequestDate
7 5 2018-02-07 11:08:25.257
8 5 2018-02-07 11:08:25.257
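For these reasons I suggest using this query:
SELECT * FROM Claim c
-- style 121 (yyyy-mm-dd hh:mi:ss.mmm) and a zero-padded ID make the string comparison sort chronologically
where CONVERT(VARCHAR(23), ClaimRequestDate, 121) + RIGHT('0000000000' + CAST(ID AS VARCHAR(10)), 10) IN (select
MAX(CONVERT(VARCHAR(23), c1.ClaimRequestDate, 121) + RIGHT('0000000000' + CAST(c1.ID AS VARCHAR(10)), 10)) from claim c1
where c.vendorId =c1.vendorId and c1.Claimrequestdate >'2018.02.02'
)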
Try the following SQL:
select aa.* from [Claim] as aa inner join
(
select [VendorId], max([Id]) as maxId from [Claim]
where [ClaimRequestDate] >= '2018-02-03 00:00:00'
group by [VendorId]
) as bb on aa.[Id] = bb.[maxId]
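The tie case described above (two rows sharing the same latest ClaimRequestDate) can be handled without string concatenation by ordering on the date and then on Id. The following is a minimal sketch, assuming Python's sqlite3 module as a stand-in for SQL Server; LIMIT 1 here corresponds to TOP (1) in T-SQL, and the data is the four-row example from the answer above.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Claim (Id INTEGER, VendorId INTEGER, ClaimRequestDate TEXT)")
con.executemany(
    "INSERT INTO Claim VALUES (?, ?, ?)",
    [
        (1, 5, "2017-12-14 00:00:00.000"),
        (2, 5, "2018-02-02 00:00:00.000"),
        (7, 5, "2018-02-07 11:08:25.257"),
        (8, 5, "2018-02-07 11:08:25.257"),
    ],
)

# Per vendor, keep the single row with the latest date after the cutoff;
# ties on the date are broken by the highest Id.
query = """
SELECT c.Id, c.VendorId, c.ClaimRequestDate
  FROM Claim AS c
 WHERE c.Id = (SELECT c1.Id
                 FROM Claim AS c1
                WHERE c1.VendorId = c.VendorId
                  AND c1.ClaimRequestDate > '2018-02-02 00:00:00.000'
                ORDER BY c1.ClaimRequestDate DESC, c1.Id DESC
                LIMIT 1)
"""

result = con.execute(query).fetchall()
print(result)
```

Only Id 8 is returned for vendor 5: the row dated exactly 2018-02-02 is excluded by the strict comparison, and the tie between Ids 7 and 8 is resolved by the secondary sort key.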

SQL (Vertica) - Calculate number of users who returned to the app at least x days in the past 7 days

Suppose I have my table like:
uid day_used_app
--- -------------
1 2012-04-28
1 2012-04-29
1 2012-04-30
2 2012-04-29
2 2012-04-30
2 2012-05-01
2 2012-05-21
2 2012-05-22
Suppose I want the number of unique users who returned to the app at least 2 different days in the last 7 days (from 2012-05-03).
So as an example to retrieve the number of users who have used the application on at least 2 different days in the past 7 days:
select count(distinct case when num_different_days_on_app >= 2
then uid else null end) as users_return_2_or_more_days
from (
select uid,
count(distinct day_used_app) as num_different_days_on_app
from table
where day_used_app between current_date() - 7 and current_date()
group by 1
)
This gives me:
users_return_2_or_more_days
---------------------------
2
The question I have is:
What if I want to do this for every day up to now so that my table looks like this, where the second field equals the number of unique users who returned 2 or more different days within a week prior to the date in the first field.
date users_return_2_or_more_days
-------- ---------------------------
2012-04-28 2
2012-04-29 2
2012-04-30 3
2012-05-01 4
2012-05-02 4
2012-05-03 3
Would this help?
WITH
-- your original input, don't use in "real" query ...
input(uid,day_used_app) AS (
SELECT 1,DATE '2012-04-28'
UNION ALL SELECT 1,DATE '2012-04-29'
UNION ALL SELECT 1,DATE '2012-04-30'
UNION ALL SELECT 2,DATE '2012-04-29'
UNION ALL SELECT 2,DATE '2012-04-30'
UNION ALL SELECT 2,DATE '2012-05-01'
UNION ALL SELECT 2,DATE '2012-05-21'
UNION ALL SELECT 2,DATE '2012-05-22'
)
-- end of input, start "real" query here, replace ',' with 'WITH'
,
one_week_b4 AS (
SELECT
uid
, day_used_app
, day_used_app -7 AS day_used_1week_b4
FROM input
)
SELECT
one_week_b4.uid
, one_week_b4.day_used_app
, count(*) AS users_return_2_or_more_days
FROM one_week_b4
JOIN input
ON input.day_used_app BETWEEN one_week_b4.day_used_1week_b4 AND one_week_b4.day_used_app
GROUP BY
one_week_b4.uid
, one_week_b4.day_used_app
HAVING count(*) >= 2
ORDER BY 1;
Output is:
uid|day_used_app|users_return_2_or_more_days
1|2012-04-29 | 3
1|2012-04-30 | 5
2|2012-04-29 | 3
2|2012-04-30 | 5
2|2012-05-01 | 6
2|2012-05-22 | 2
Does that help your needs?
Marco the Sane ...
SELECT DISTINCT
t1.day_used_app,
(
SELECT SUM(CASE WHEN t.num_visits >= 2 THEN 1 ELSE 0 END)
FROM
(
SELECT uid,
COUNT(DISTINCT day_used_app) AS num_visits
FROM table
WHERE day_used_app BETWEEN t1.day_used_app - 7 AND t1.day_used_app
GROUP BY uid
) t
) AS users_return_2_or_more_days
FROM table t1
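To see what a per-day version of the count produces on the sample rows, the idea can be written as a join between the distinct days and the usage rows falling in each day's trailing window. This is a sketch only, with several assumptions: it runs on Python's sqlite3 module instead of Vertica, app_usage is a hypothetical name for the table the question calls "table", and the window is day - 7 through day inclusive, mirroring the BETWEEN used in the question.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE app_usage (uid INTEGER, day_used_app TEXT)")
con.executemany(
    "INSERT INTO app_usage VALUES (?, ?)",
    [
        (1, "2012-04-28"), (1, "2012-04-29"), (1, "2012-04-30"),
        (2, "2012-04-29"), (2, "2012-04-30"), (2, "2012-05-01"),
        (2, "2012-05-21"), (2, "2012-05-22"),
    ],
)

query = """
WITH days AS (                -- every day that appears in the data
    SELECT DISTINCT day_used_app AS day FROM app_usage
),
per_user AS (                 -- distinct active days per user inside each window
    SELECT d.day AS day, u.uid AS uid,
           COUNT(DISTINCT u.day_used_app) AS n_days
      FROM days AS d
      JOIN app_usage AS u
        ON u.day_used_app BETWEEN date(d.day, '-7 days') AND d.day
     GROUP BY d.day, u.uid
)
SELECT day,
       SUM(CASE WHEN n_days >= 2 THEN 1 ELSE 0 END) AS users_return_2_or_more_days
  FROM per_user
 GROUP BY day
 ORDER BY day
"""

result = con.execute(query).fetchall()
for row in result:
    print(row)
```

Only days that occur in the data get a row here. Reporting every calendar day, as the question's desired output does, would need a separate list of dates (for example a calendar table) in place of the days CTE.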

SQL count number of users every 7 days

I am new to SQL and I need to find count of users every 7 days. I have a table with users for every single day starting from April 2015 up until now:
...
2015-05-16 00:00
2015-05-16 00:00
2015-05-17 00:00
2015-05-17 00:00
2015-05-17 00:00
2015-05-17 00:00
2015-05-17 00:00
2015-05-18 00:00
2015-05-18 00:00
...
and I need to count the number of users every 7 days (weekly) so I have data weekly.
SELECT COUNT(user_id), Activity_Date FROM TABLE_NAME
I need output like this:
TotalUsers week1 week2 week3 ..........and so on
82 80 14 16
I am using DB Visualizer to query Oracle database.
You should try the following:
Select
sum(Week1) + sum(Week2) + sum(Week3) + sum(Week4) + sum(Week5) as Total,
sum(Week1) as Week1,
sum(Week2) as Week2,
sum(Week3) as Week3,
sum(Week4) as Week4,
sum(Week5) as Week5
From (
select
case when week = 1 then 1 else 0 end as Week1,
case when week = 2 then 1 else 0 end as Week2,
case when week = 3 then 1 else 0 end as Week3,
case when week = 4 then 1 else 0 end as Week4,
case when week = 5 then 1 else 0 end as Week5
from
(
Select
(datepart(dd,visitdate) - 1) / 7 + 1 as week, -- integer division: days 1-7 -> week 1, 8-14 -> week 2, ...
user_id
from visitor
)T
)D
You need to add month & year in the result as well.
SELECT COUNT(user_id) FROM TABLE_NAME WHERE Activity_Date > DATE '2015-06-30';
That would get the number of users for the last 7 days.
This is my test table:
user_id act_date
1 01/04/2015
2 01/04/2015
3 04/04/2015
4 05/04/2015
..
This is my query:
select week_offset, count(*) nb from (
select trunc((act_date-to_date('01042015','DDMMYYYY'))/7) as week_offset from test_date)
group by week_offset
order by 1
and this is the output:
week_offset nb
0 6
1 3
4 5
5 7
6 3
7 1
18 1
Week offset is the number of the week from 01/04/2015, and we can show the first day of the week.
How do you define your weeks? Here's an approach for SQL Server that starts each seven-day block relative to the start of April. The expressions will vary according to your specific needs:
select
dateadd(
dd,
datediff(dd, cast('20150401' as date), Activity_Date) / 7 * 7,
cast('20150401' as date)
) as WeekStart,
count(*)
from T
group by datediff(dd, cast('20150401' as date), Activity_Date) / 7
Oracle:
select
trunc(Activity_date, 'DAY') as WeekStart,
count(*)
from T
group by trunc(Activity_date, 'DAY') /* D and DAY are the same thing */
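The "seven-day blocks relative to the start of April" idea reduces to integer-dividing the day offset from an anchor date by 7. The sketch below illustrates that arithmetic with Python's sqlite3 module, which is an assumption made for runnability (the question targets Oracle). The table name visits is hypothetical, and the rows are just the dates visible in the excerpts above: four from the test table and nine from the question.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE visits (user_id INTEGER, activity_date TEXT)")
dates = (
    ["2015-04-01", "2015-04-01", "2015-04-04", "2015-04-05"]
    + ["2015-05-16"] * 2 + ["2015-05-17"] * 5 + ["2015-05-18"] * 2
)
con.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [(i + 1, d) for i, d in enumerate(dates)],
)

# wk = whole weeks elapsed since the anchor date 2015-04-01;
# week_start converts that offset back into the first day of the block.
query = """
SELECT date('2015-04-01', '+' || (wk * 7) || ' days') AS week_start,
       COUNT(*) AS total_users
  FROM (SELECT CAST((julianday(activity_date) - julianday('2015-04-01')) / 7
                    AS INTEGER) AS wk
          FROM visits)
 GROUP BY wk
 ORDER BY wk
"""

result = con.execute(query).fetchall()
for row in result:
    print(row)
```

Weeks with no rows simply do not appear, and pivoting the counts into one column per week (as in the desired output) would be a separate step on top of this grouping.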