Number of days since last activity - SQL

I'm trying to count the number of days since the last activity. My data is aggregated weekly. I can compute the lag, but I can't get the current week included.
Data:
ID DATE CHANNEL VENDOR ENG
xyz 2022-11-18 EMAIL ALPHA 1
xyz 2022-11-25 EMAIL ALPHA 1
xyz 2022-12-09 EMAIL ALPHA 1
xyz 2022-12-16 EMAIL ALPHA 0
xyz 2022-12-23 EMAIL ALPHA 0
xyz 2022-12-30 EMAIL ALPHA 3
I would like to have the output to be as follows:
ID DATE CHANNEL VENDOR ENG n_days
xyz 2022-11-18 EMAIL ALPHA 1 0
xyz 2022-11-25 EMAIL ALPHA 1 0
xyz 2022-12-09 EMAIL ALPHA 1 0
xyz 2022-12-16 EMAIL ALPHA 0 7
xyz 2022-12-23 EMAIL ALPHA 0 14
xyz 2022-12-30 EMAIL ALPHA 3 0
I have written a query, but it is not able to include the latest week. Below is my query:
SELECT DISTINCT ID, DATE, CHANNEL, VENDOR,
DATE - LAG(DATE) OVER (PARTITION BY ID, CHANNEL, VENDOR ORDER BY DATE) AS "NDAYS_LAST_ENGAGED_CHANNEL_VENDOR"
FROM
tab1
WHERE
ENG>0

Your WHERE clause means the query will only output the rows where ENG > 0, so there is no way it will return what you want: it throws away the rows where ENG = 0. And LAG just refers to the previous row in the partition; it's not going to search for a row that matches your WHERE clause, as I think you hoped.
This should hopefully put you on the right track. I've assumed SQL Server, though:
DROP TABLE IF EXISTS #t
CREATE TABLE #t (ID VARCHAR(3), [DATE] DATE, CHANNEL VARCHAR(5), VENDOR VARCHAR(5), ENG INT);
INSERT INTO #t VALUES
('xyz','20221118','EMAIL','ALPHA',1),
('xyz','20221125','EMAIL','ALPHA',0),
('xyz','20221209','EMAIL','ALPHA',1),
('xyz','20221216','EMAIL','ALPHA',0),
('xyz','20221223','EMAIL','ALPHA',0);
SELECT t1.ID, t1.DATE, t1.CHANNEL, t1.VENDOR, t1.ENG,
MIN(CASE WHEN t1.ENG = 0
THEN DATEDIFF(DAY, t_eng.DATE, t1.DATE)
ELSE 0
END) AS N_DAYS
FROM #t t1
LEFT OUTER JOIN #t t_eng ON t_eng.ENG > 0
AND t_eng.CHANNEL = t1.CHANNEL
AND t_eng.ID = t1.ID
AND t_eng.VENDOR = t1.VENDOR
AND t_eng.DATE < t1.DATE
GROUP BY t1.ID, t1.DATE, t1.CHANNEL, t1.VENDOR, t1.ENG

With a lot of help from James Casey, I was able to construct a query that gives the number of days since the last event, in chronological order. The query is below:
QUERY:
SELECT T1.ID, T1.DATE, T1.CHANNEL, T1.VENDOR,
IFNULL(MIN(DATEDIFF(DD, T_ENG.DATE, T1.DATE)), -1) AS "N_DAYS"
FROM
TAB1 AS T1
LEFT OUTER JOIN
TAB1 AS T_ENG
ON T_ENG.ENG>0 AND T1.ID=T_ENG.ID AND T1.CHANNEL=T_ENG.CHANNEL AND T1.VENDOR=T_ENG.VENDOR
AND DATEDIFF(DD, T_ENG.DATE, T1.DATE)>=0
GROUP BY
T1.ID, T1.DATE, T1.CHANNEL, T1.VENDOR
ORDER BY 2;
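The same result can probably be had without a self-join, by carrying the latest engaged date forward with a running window aggregate. Below is a minimal sketch, not the original poster's method. It runs the SQL through Python's sqlite3 module (an assumption made so the example is self-contained), and it uses SQLite's julianday() in place of DATEDIFF, which SQLite does not have. The table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE tab1 (ID TEXT, "DATE" TEXT, CHANNEL TEXT, VENDOR TEXT, ENG INTEGER)'
)
conn.executemany(
    "INSERT INTO tab1 VALUES (?, ?, ?, ?, ?)",
    [
        ("xyz", "2022-11-18", "EMAIL", "ALPHA", 1),
        ("xyz", "2022-11-25", "EMAIL", "ALPHA", 1),
        ("xyz", "2022-12-09", "EMAIL", "ALPHA", 1),
        ("xyz", "2022-12-16", "EMAIL", "ALPHA", 0),
        ("xyz", "2022-12-23", "EMAIL", "ALPHA", 0),
        ("xyz", "2022-12-30", "EMAIL", "ALPHA", 3),
    ],
)

# The running MAX only sees dates of rows with ENG > 0, up to and including
# the current row, so an engaged row gets 0 and an idle row gets the gap.
# N_DAYS is NULL when there is no engaged row yet in the partition.
query = """
SELECT ID, "DATE", ENG,
       CAST(julianday("DATE") - julianday(
           MAX(CASE WHEN ENG > 0 THEN "DATE" END)
               OVER (PARTITION BY ID, CHANNEL, VENDOR ORDER BY "DATE")
       ) AS INTEGER) AS N_DAYS
FROM tab1
ORDER BY "DATE"
"""
result = conn.execute(query).fetchall()
n_days = [row[3] for row in result]
print(n_days)  # [0, 0, 0, 7, 14, 0]
```

No rows are filtered out here, which is what keeps the weeks with ENG = 0 in the output.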

Related

SQL Query for retrieving records having the same ID and Type but only separated by 3 minutes apart

I am trying to write a query to retrieve the IDs from a table which looks like this:
ID TYPE CREATED_TIME
1234 start 2021-11-01 21:43:48.0000000
1234 start 2021-11-01 21:44:40.0000000
1234 end 2021-11-04 15:27:50.0000000
4567 start 2021-09-02 20:12:40.0000000
4567 start 2021-09-02 23:01:11.0000000
Ideally I want the query to return the IDs that have 2 or more records of the same type created less than 3 minutes apart. So it should return ID 1234, because it has 2 records of type = start whose created times are less than 1 minute apart.
It should not return 4567, because its created times are about 3 hours apart.
Assuming your table is called DATA this should work:
SELECT DISTINCT t1.ID
FROM
DATA t1 JOIN
DATA t2 ON t1.ID = t2.ID
AND t1.TYPE = t2.TYPE
AND t1.CREATED_TIME <> t2.CREATED_TIME
AND (ABS(DATEDIFF(MINUTE, t1.CREATED_TIME, t2.CREATED_TIME)) < 3)
Self-joining is inefficient. You should use window functions for this:
SELECT ID
FROM (
SELECT *,
PrevVal = LAG(CREATED_TIME, 1, '19000101') OVER (PARTITION BY ID, Type ORDER BY CREATED_TIME)
FROM YourTable t
) t
WHERE DATEADD(minute, 3, t.PrevVal) > t.CREATED_TIME
GROUP BY ID;
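For anyone who wants to try the LAG approach on the sample rows, here is a rough sketch. It uses SQLite through Python's sqlite3 module rather than SQL Server (an assumption for the sake of a runnable example), the table name events is made up, and the fractional seconds are dropped from the timestamps. The gap is measured in elapsed seconds, because DATEDIFF(MINUTE, ...) in SQL Server counts minute boundaries crossed rather than elapsed time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ID INTEGER, TYPE TEXT, CREATED_TIME TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        (1234, "start", "2021-11-01 21:43:48"),
        (1234, "start", "2021-11-01 21:44:40"),
        (1234, "end", "2021-11-04 15:27:50"),
        (4567, "start", "2021-09-02 20:12:40"),
        (4567, "start", "2021-09-02 23:01:11"),
    ],
)

# gap_seconds is the time since the previous row of the same ID and TYPE.
# It is NULL for the first row of each group, and NULL < 180 is not true,
# so those rows drop out of the outer filter.
query = """
SELECT DISTINCT ID
FROM (
    SELECT ID,
           CAST(strftime('%s', CREATED_TIME) AS INTEGER)
           - CAST(strftime('%s', LAG(CREATED_TIME) OVER (
                 PARTITION BY ID, TYPE ORDER BY CREATED_TIME)) AS INTEGER)
             AS gap_seconds
    FROM events
) AS t
WHERE gap_seconds < 180
ORDER BY ID
"""
close_ids = [row[0] for row in conn.execute(query)]
print(close_ids)  # [1234]
```

Comparing each row only with its immediate predecessor is enough, since if any two rows in a group are under 3 minutes apart, some adjacent pair is too.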

Trying to get statistics for data based on another parameter

I'm struggling again with statistics on data based on other sets of data.
I have a list of customers like the following:
CustomerID Value Date
1 3 01/01/2017
2 2 01/02/2017
3 1 01/02/2017
1 5 01/04/2017
1 6 01/04/2017
2 1 01/04/2017
2 2 01/04/2017
I want to get an average for a date range for Customer 1 on the days where customer 2 also has values. Does anyone have any thoughts?
Example:
Select avg(value)
from Table where customerid=1
and (customer 2 values are not blank)
and date between '01/01/2017' and '01/31/2017'
I am using SQL Server Express 2012.
Another Option
Select AvgValue = Avg(Value+0.0) -- Remove +0.0 if you want an INT
From YourTable
Where CustomerID = 1
and Date in (Select Distinct Date from YourTable Where CustomerID=2)
Returns
AvgValue
5.500000
You can select the dates using exists or in and then calculate the average:
select avg(value)
from datatbl t
where customerid = 1 and
exists (select 1 from datatbl t2 where t2.customerId = 2 and t2.date = t.date);
If you want the average per date, then include group by date.
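To check the EXISTS version against the sample rows, here is a small sketch. It assumes SQLite via Python's sqlite3 module instead of SQL Server Express, reads the dates as month/day/year, and stores them as ISO strings so the date range filter from the question can be applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datatbl (CustomerID INTEGER, Value INTEGER, Date TEXT)")
conn.executemany(
    "INSERT INTO datatbl VALUES (?, ?, ?)",
    [
        (1, 3, "2017-01-01"),
        (2, 2, "2017-01-02"),
        (3, 1, "2017-01-02"),
        (1, 5, "2017-01-04"),
        (1, 6, "2017-01-04"),
        (2, 1, "2017-01-04"),
        (2, 2, "2017-01-04"),
    ],
)

# Average customer 1's values, restricted to January 2017 and to the
# dates on which customer 2 has at least one row.
query = """
SELECT AVG(t.Value)
FROM datatbl t
WHERE t.CustomerID = 1
  AND t.Date BETWEEN '2017-01-01' AND '2017-01-31'
  AND EXISTS (SELECT 1
              FROM datatbl t2
              WHERE t2.CustomerID = 2 AND t2.Date = t.Date)
"""
avg_value = conn.execute(query).fetchone()[0]
print(avg_value)  # 5.5
```

Only 2017-01-04 qualifies, so the average is taken over the values 5 and 6. SQLite's AVG always returns a floating-point value, so the +0.0 trick from the other answer is not needed there.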

Selecting data from DB where there are multiple records and the field value is the same in all of them

I have a table in SQL Server and I'm having a difficult time querying for the data that I need.
Here's what it looks like....
ClientNo RecordNo ApptDate
-----------------------------------------------
7 1 10/31/2016
7 2 10/31/2016
7 3 10/15/2016
9 1 11/12/2016
9 2 11/11/2016
18 1 9/19/2016
So looking at this table - each client can have 1 or multiple records. I'm trying to find all clients that have more than 1 recordNo, and for all clients that have more than 1 record - I need to make sure to only display those that have the same ApptDate for both entries.
My end goal is to see this...
ClientNo RecordNo ApptDate
-------------------------------------------
7 1 10/31/2016
7 2 10/31/2016
So client 7 has 3 records (1,2,3) and the ApptDate is the same for 2 out of 3 records. I only want to see the records where ApptDate is the same and skip the record where ApptDate = 10/15/2016 since it's irrelevant!
I have never done anything like this where I'm specifying that ApptDate = ApptDate and really haven't a clue how to do this.
Try this:
SELECT *
FROM mytable
WHERE ClientNo IN (SELECT ClientNo
FROM mytable
GROUP BY ClientNo
HAVING COUNT(DISTINCT RecordNo) > 1 AND
COUNT(DISTINCT ApptDate) = 1)
The first predicate of the HAVING clause:
COUNT(DISTINCT RecordNo) > 1
filters out ClientNo values having only one related RecordNo value.
The second predicate of the HAVING clause:
COUNT(DISTINCT ApptDate) = 1
filters out ClientNo values that are related to more than one ApptDate value.
Edit:
To get records having the same ClientNo, different RecordNo and the same ApptDate you can use a simple JOIN:
SELECT t1.*
FROM mytable AS t1
JOIN mytable AS t2
ON t1.ClientNo = t2.ClientNo AND
t1.ApptDate = t2.ApptDate AND
t1.RecordNo <> t2.RecordNo
I believe this is what you are looking for... the window function (https://msdn.microsoft.com/en-us/library/ms189461.aspx) will let you find clients that have the same date.
select
clientno,
recordno,
apptdate
from
(
select
clientno,
recordno,
apptdate,
count(*) over(partition by clientno, apptdate) as numrecs
from
mytable
) as t
where
numrecs > 1
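Here is a quick sketch that runs the HAVING filter and the window-function query side by side on the sample rows. It assumes SQLite via Python's sqlite3 module rather than SQL Server, with the dates stored as ISO strings. On this data the HAVING filter returns nothing, because client 7 has two distinct ApptDate values, while the window version returns the two rows the question asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (ClientNo INTEGER, RecordNo INTEGER, ApptDate TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (7, 1, "2016-10-31"),
        (7, 2, "2016-10-31"),
        (7, 3, "2016-10-15"),
        (9, 1, "2016-11-12"),
        (9, 2, "2016-11-11"),
        (18, 1, "2016-09-19"),
    ],
)

# Keeps only clients whose records ALL share a single ApptDate.
having_query = """
SELECT ClientNo, RecordNo, ApptDate
FROM mytable
WHERE ClientNo IN (SELECT ClientNo
                   FROM mytable
                   GROUP BY ClientNo
                   HAVING COUNT(DISTINCT RecordNo) > 1
                      AND COUNT(DISTINCT ApptDate) = 1)
"""

# Keeps every row whose (ClientNo, ApptDate) pair occurs more than once.
window_query = """
SELECT ClientNo, RecordNo, ApptDate
FROM (
    SELECT ClientNo, RecordNo, ApptDate,
           COUNT(*) OVER (PARTITION BY ClientNo, ApptDate) AS numrecs
    FROM mytable
) AS t
WHERE numrecs > 1
ORDER BY ClientNo, RecordNo
"""
having_rows = conn.execute(having_query).fetchall()
window_rows = conn.execute(window_query).fetchall()
print(having_rows)  # []
print(window_rows)  # [(7, 1, '2016-10-31'), (7, 2, '2016-10-31')]
```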

How to use a select command to find all the records that has the maximum date value for a specific item?

Say I have a table like this, we call it tbl_test
ID thedate actionid songid
1 2014-10-01 100 10
2 2014-09-30 100 10
3 2014-10-01 80 10
4 2014-09-30 80 10
5 2014-10-01 80 21
6 2014-09-30 100 21
Now I want to find all the records in tbl_test where actionid=100 that have the latest [thedate] value for each songid. In this case, I want the final select result to be
(this is the result I want, not an existing table)
ID thedate actionid songid
1 2014-10-01 100 10
6 2014-09-30 100 21
Question: how can I do that using nothing but a single select command in MS SQL Server?
Use a join to a query that returns the latest date for each song:
select tbl_test.*
from tbl_test
join (select songid, max(theDate) maxDate
from tbl_test
where actionId = 100
group by songid) t on t.songId = tbl_test.songId and theDate = maxDate
where actionid = 100
This should perform pretty well, as it makes only 2 passes over the table: one for the inner query that determines the latest date, and another to output the matching rows.
A general SQL way to get this is using not exists:
select t.*
from tbl_test t
where actionid = 100 and
not exists (select 1
from tbl_test t2
where t2.songid = t.songid and t2.actionid = 100 and t2.thedate > t.thedate
);
For performance, you want an index on songid, actionid, thedate.
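A small sketch of the NOT EXISTS query together with the suggested index, run on the sample rows. It assumes SQLite via Python's sqlite3 module in place of MS SQL Server, and the index name is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_test (ID INTEGER, thedate TEXT, actionid INTEGER, songid INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl_test VALUES (?, ?, ?, ?)",
    [
        (1, "2014-10-01", 100, 10),
        (2, "2014-09-30", 100, 10),
        (3, "2014-10-01", 80, 10),
        (4, "2014-09-30", 80, 10),
        (5, "2014-10-01", 80, 21),
        (6, "2014-09-30", 100, 21),
    ],
)
# Composite index covering the columns the correlated subquery filters on
# (hypothetical name).
conn.execute("CREATE INDEX ix_song_action_date ON tbl_test (songid, actionid, thedate)")

# A row survives when no other actionid = 100 row for the same song is newer.
query = """
SELECT t.ID, t.thedate, t.actionid, t.songid
FROM tbl_test t
WHERE t.actionid = 100
  AND NOT EXISTS (SELECT 1
                  FROM tbl_test t2
                  WHERE t2.songid = t.songid
                    AND t2.actionid = 100
                    AND t2.thedate > t.thedate)
ORDER BY t.ID
"""
latest = conn.execute(query).fetchall()
print(latest)  # [(1, '2014-10-01', 100, 10), (6, '2014-09-30', 100, 21)]
```

If two rows for the same song tie on the latest date, both are returned, since neither has a strictly newer row.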

Basic Cursor in MS SQL Server

I am looking for basic direction on Cursor use in MSSS.
Say there is a table, TABLE1, with 2 fields (ID, Date). The ID is not a unique key. The table records events by id, and some ids occur frequently, some infrequently.
For example:
ID | Date
1 | 2010-01-01
2 | 2010-02-01
3 | 2010-02-15
2 | 2010-02-15
4 | 2010-03-01
I would like to create a new table with the following fields: ID, Date, Number of times ID appears in 6 months previous to Date, Number of times ID appears in 6 months after Date.
Is there a best way to go about accomplishing this? Thanks kindly.
This is one side (I think - not tested)
select t1.id, t1.date, count(*) as 'count'
from TABLE1 t1
join TABLE1 t2
on t2.id = t1.id
and DateDiff(mm,t1.date,t2.date) <= 6
and DateDiff(mm,t1.date,t2.date) > 0
group by t1.id, t1.date
I think you can skip the > 0 condition and use CASE expressions to count the earlier and later rows separately:
sum(case when t2.date < t1.date then 1 else 0 end) as prior_count,
sum(case when t2.date > t1.date then 1 else 0 end) as next_count
with the join bounded in both directions:
and DateDiff(mm, t1.date, t2.date) <= 6
and DateDiff(mm, t2.date, t1.date) <= 6
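Put together, the idea can be sketched as a single set-based query, with no cursor needed. This is a rough version assuming SQLite via Python's sqlite3 module instead of SQL Server. The six-month window is computed with SQLite's date() modifiers on calendar dates, which is not identical to DateDiff(mm, ...) since that counts month boundaries. The column names prior_count and next_count are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [
        (1, "2010-01-01"),
        (2, "2010-02-01"),
        (3, "2010-02-15"),
        (2, "2010-02-15"),
        (4, "2010-03-01"),
    ],
)

# Each row always joins to itself, so no row is lost by the inner join;
# the self-match adds 0 to both sums because its date is neither earlier
# nor later than the row's own date.
query = """
SELECT t1.id, t1.date,
       SUM(CASE WHEN t2.date < t1.date THEN 1 ELSE 0 END) AS prior_count,
       SUM(CASE WHEN t2.date > t1.date THEN 1 ELSE 0 END) AS next_count
FROM table1 t1
JOIN table1 t2
  ON t2.id = t1.id
 AND t2.date BETWEEN date(t1.date, '-6 months') AND date(t1.date, '+6 months')
GROUP BY t1.id, t1.date
ORDER BY t1.date, t1.id
"""
counts = conn.execute(query).fetchall()
for row in counts:
    print(row)
```

On the sample data only id 2 appears twice, so its earlier row gets one "next" event and its later row gets one "prior" event. Note that if the same id and date pair occurs more than once, the GROUP BY merges those rows and the counts are inflated accordingly.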