Basic Cursor in MS SQL Server - sql

I am looking for basic direction on cursor use in MS SQL Server.
Say there is a table, TABLE1, with 2 fields (ID, Date). The ID is not a unique key. The table records events by id, and some ids occur frequently, some infrequently.
For example:
ID | Date
1 | 2010-01-01
2 | 2010-02-01
3 | 2010-02-15
2 | 2010-02-15
4 | 2010-03-01
I would like to create a new table with the following fields: ID, Date, Number of times ID appears in 6 months previous to Date, Number of times ID appears in 6 months after Date.
Is there a best way to go about accomplishing this? Thanks kindly.

This is one side (I think - not tested)
select t1.id, t1.date, count(*) as 'count'
from TABLE1 t1
join TABLE1 t2
on t2.id = t1.id
and DateDiff(mm,t1.date,t2.date) <= 6
and DateDiff(mm,t1.date,t2.date) > 0
group by t1.id, t1.date
I think you can skip the > 0 and use CASE to count the earlier and later rows separately:
sum(case when t2.date < t1.date then 1 else 0 end) as prior
sum(case when t2.date > t1.date then 1 else 0 end) as next
and DateDiff(mm,t1.date,t2.date) <= 6
and DateDiff(mm,t2.date,t1.date) <= 6
Here prior counts the rows dated before t1.date and next counts the rows dated after it.
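As a rough, runnable illustration of the self-join counting idea, the sketch below loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for SQL Server here: the table name events, the column name event_date, and the date(..., '-6 months') window (used instead of DateDiff) are assumptions made for the example.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, event_date TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "2010-01-01"), (2, "2010-02-01"), (3, "2010-02-15"),
     (2, "2010-02-15"), (4, "2010-03-01")],
)

# Self-join each row to the rows with the same id inside a +/- 6 month window,
# then count the earlier and later matches separately with CASE.
query = """
SELECT t1.id, t1.event_date,
       SUM(CASE WHEN t2.event_date < t1.event_date THEN 1 ELSE 0 END) AS prior_count,
       SUM(CASE WHEN t2.event_date > t1.event_date THEN 1 ELSE 0 END) AS next_count
FROM events t1
JOIN events t2
  ON t2.id = t1.id
 AND t2.event_date >= date(t1.event_date, '-6 months')
 AND t2.event_date <= date(t1.event_date, '+6 months')
GROUP BY t1.id, t1.event_date
ORDER BY t1.id, t1.event_date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the sample data only ID 2 has a neighbour: its 2010-02-01 row has one later event and its 2010-02-15 row has one earlier event. No cursor is needed for this kind of count.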

Related

Number of days since last activity - SQL

I'm trying to count the number of days since the last activity. My data is weekly aggregated. I'm able to do the lag but not able to include the current week.
Data:
ID DATE CHANNEL VENDOR ENG
xyz 2022-11-18 EMAIL ALPHA 1
xyz 2022-11-25 EMAIL ALPHA 1
xyz 2022-12-09 EMAIL ALPHA 1
xyz 2022-12-16 EMAIL ALPHA 0
xyz 2022-12-23 EMAIL ALPHA 0
xyz 2022-12-30 EMAIL ALPHA 3
I would like to have the output to be as follows:
ID DATE CHANNEL VENDOR ENG n_days
xyz 2022-11-18 EMAIL ALPHA 1 0
xyz 2022-11-25 EMAIL ALPHA 1 0
xyz 2022-12-09 EMAIL ALPHA 1 0
xyz 2022-12-16 EMAIL ALPHA 0 7
xyz 2022-12-23 EMAIL ALPHA 0 14
xyz 2022-12-30 EMAIL ALPHA 3 0
I have written a query, but it is not able to include the latest week. Below is my query:
SELECT DISTINCT ID, DATE, CHANNEL, VENDOR,
DATE - LAG(DATE) OVER (PARTITION BY ID, CHANNEL, VENDOR ORDER BY DATE) AS "NDAYS_LAST_ENGAGED_CHANNEL_VENDOR"
FROM
tab1
WHERE
ENG>0
Your WHERE clause means it will only output the rows where eng > 0, so there is no way it will return what you want: it's going to throw the rows with eng = 0 away. And LAG will just refer to the previous row in the partition; it's not going to search for a row that matches your WHERE clause, as I think you hoped.
This should hopefully put you on the right track. I've assumed SQL Server, though:
DROP TABLE IF EXISTS #t
CREATE TABLE #t (ID VARCHAR(3), [DATE] DATE, CHANNEL VARCHAR(5), VENDOR VARCHAR(5), ENG INT);
INSERT INTO #t VALUES
('xyz','20221118','EMAIL','ALPHA',1),
('xyz','20221125','EMAIL','ALPHA',0),
('xyz','20221209','EMAIL','ALPHA',1),
('xyz','20221216','EMAIL','ALPHA',0),
('xyz','20221223','EMAIL','ALPHA',0);
SELECT t1.ID, t1.DATE, t1.CHANNEL, t1.VENDOR, t1.ENG,
MIN(CASE WHEN t1.ENG = 0
THEN DATEDIFF(DAY, t_eng.DATE, t1.DATE)
ELSE 0
END) AS N_DAYS
FROM #t t1
LEFT OUTER JOIN #t t_eng ON t_eng.ENG > 0
AND t_eng.CHANNEL = t1.CHANNEL
AND t_eng.ID = t1.ID
AND t_eng.VENDOR = t1.VENDOR
AND t_eng.DATE < t1.DATE
GROUP BY t1.ID, t1.DATE, t1.CHANNEL, t1.VENDOR, t1.ENG
I was able to construct a working query that gives the difference in days since the last event in chronological order, with massive help from James Casey. The query is as below:
QUERY:
SELECT T1.ID, T1.DATE, T1.CHANNEL, T1.VENDOR,
IFNULL(MIN(DATEDIFF(DD, T_ENG.DATE, T1.DATE)), -1) AS "N_DAYS"
FROM
TAB1 AS T1
LEFT OUTER JOIN
TAB1 AS T_ENG
ON T_ENG.ENG>0 AND T1.ID=T_ENG.ID AND T1.CHANNEL=T_ENG.CHANNEL AND T1.VENDOR=T_ENG.VENDOR
AND DATEDIFF(DD, T_ENG.DATE, T1.DATE)>=0
GROUP BY
T1.ID, T1.DATE, T1.CHANNEL, T1.VENDOR
ORDER BY 2;
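A minimal sketch of the same "days since the last engaged row" idea, run against the question's six sample rows in an in-memory SQLite database via Python's sqlite3. The column name dt and the julianday() subtraction are assumptions that stand in for DATE and DATEDIFF; the join shape follows the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tab1 (id TEXT, dt TEXT, channel TEXT, vendor TEXT, eng INTEGER)"
)
conn.executemany(
    "INSERT INTO tab1 VALUES ('xyz', ?, 'EMAIL', 'ALPHA', ?)",
    [("2022-11-18", 1), ("2022-11-25", 1), ("2022-12-09", 1),
     ("2022-12-16", 0), ("2022-12-23", 0), ("2022-12-30", 3)],
)

# For every row, join to all engaged rows (eng > 0) on or before its date and
# keep the smallest gap; rows with no earlier engagement get -1.
query = """
SELECT t1.id, t1.dt, t1.eng,
       COALESCE(MIN(CAST(julianday(t1.dt) - julianday(t_eng.dt) AS INTEGER)), -1) AS n_days
FROM tab1 t1
LEFT JOIN tab1 t_eng
  ON t_eng.eng > 0
 AND t_eng.id = t1.id
 AND t_eng.channel = t1.channel
 AND t_eng.vendor = t1.vendor
 AND t_eng.dt <= t1.dt
GROUP BY t1.id, t1.dt, t1.channel, t1.vendor, t1.eng
ORDER BY t1.dt
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because no rows are filtered out before the join, the weeks with eng = 0 stay in the output and pick up 7 and 14 days, while engaged weeks match themselves and report 0.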

SQL Query for retrieving records having the same ID and Type but only separated by 3 minutes apart

I am trying to write a query to retrieve the IDs from a table which looks like this:
ID | TYPE | CREATED_TIME
1234 | start | 2021-11-01 21:43:48.0000000
1234 | start | 2021-11-01 21:44:40.0000000
1234 | end | 2021-11-04 15:27:50.0000000
4567 | start | 2021-09-02 20:12:40.0000000
4567 | start | 2021-09-02 23:01:11.0000000
Ideally I want the query to return the IDs which have 2 or more records of the same type that were created less than 3 minutes apart. So it should return ID 1234, because it has 2 records of type = start created less than 1 minute apart.
It should not return 4567, because its created times are about 3 hours apart.
Assuming your table is called DATA this should work:
SELECT DISTINCT t1.ID
FROM
DATA t1 JOIN
DATA t2 ON t1.ID = t2.ID
AND t1.TYPE = t2.TYPE
AND t1.CREATED_TIME <> t2.CREATED_TIME
AND (ABS(DATEDIFF(MINUTE, t1.CREATED_TIME, t2.CREATED_TIME)) < 3)
Self-joining is inefficient. You should use window functions for this:
SELECT ID
FROM (
SELECT *,
PrevVal = LAG(CREATED_TIME, 1, '19000101') OVER (PARTITION BY ID, Type ORDER BY CREATED_TIME)
FROM YourTable t
) t
WHERE DATEADD(minute, 3, t.PrevVal) >= t.CREATED_TIME
GROUP BY ID;
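The LAG approach can be tried out with the sketch below, which uses Python's sqlite3 as a stand-in engine (an assumption; it needs a SQLite build of 3.25 or later for window functions). The gap is computed in whole seconds with strftime('%s', ...) rather than DATEADD, and the table name data is taken from the earlier answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER, type TEXT, created_time TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [(1234, "start", "2021-11-01 21:43:48"),
     (1234, "start", "2021-11-01 21:44:40"),
     (1234, "end",   "2021-11-04 15:27:50"),
     (4567, "start", "2021-09-02 20:12:40"),
     (4567, "start", "2021-09-02 23:01:11")],
)

# Compare each row with the previous row of the same id and type; the first
# row of each partition has no predecessor, so its gap is NULL and is skipped.
query = """
SELECT DISTINCT id
FROM (
    SELECT id,
           CAST(strftime('%s', created_time) AS INTEGER)
             - CAST(strftime('%s', LAG(created_time) OVER (
                   PARTITION BY id, type ORDER BY created_time)) AS INTEGER) AS gap_seconds
    FROM data
) AS gaps
WHERE gap_seconds < 180
ORDER BY id
"""
ids = [row[0] for row in conn.execute(query)]
print(ids)
```

Only 1234 qualifies: its two start rows are 52 seconds apart, while the two start rows of 4567 are almost three hours apart.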

Access Top N Query where N is given in another table

I have two tables in MS SQL Server. Table2 has the following:
TaskId TopN
1 2
2 3
3 1
Table1 has the following:
TaskId TopN Value
1 2 12
1 2 12
1 2 12
2 3 1
2 3 1
2 3 5
2 3 12
2 3 8
2 3 5
I want to be able to select the top N records based on the TopN field in table2 (which is the same TopN value found in table1, so maybe I don't even need to bother using two tables). The desired output should be as follows:
TaskId TopN Value
1 2 12
1 2 12
2 3 12
2 3 8
2 3 5
I have tried the below SQL statement, but it skips TaskId=1. Any idea of what I am doing wrong?
SELECT DISTINCT T1.TaskId,
T1.TopN,
T1.Value
FROM Table1 T1 INNER JOIN Table1 T2 ON
T1.TaskId = T2.TaskId AND
T1.TopN = T2.TopN AND
T1.Value <= T2.Value
GROUP BY T1.TaskId,
T1.TopN,
T1.Value
HAVING COUNT(*) <= (
SELECT TopN
FROM table2
WHERE table2.TaskID = T1.TaskId
)
Please note that in the question you have named Table2 as the one which has the fields TaskId, TopN, Value; however, in your query you have used the opposite. Assuming Table2 is the one which has the details, you can use the query below to get the desired result. You would not need the other table (Table1, as per the question), which has just the TaskId and TopN, since all the info is already present in Table2.
Select Taskid, TopN, Value
from
(Select T1.*, row_number() over(partition by Taskid order by Value desc) As rnk
from Table2 T1) Tb
where Tb.TopN >= Tb.rnk;
** Fixed the typo in the code (changed to >= instead of <=); it should work fine now.
The problem is that you have three rows with the same values -- and 3 > 2. That is, the subquery returns "3" which is not less than "2". In SQL Server, you would do this much more simply using row_number().
If you are using MS Access, you need a column that distinguishes the rows.
EDIT:
In SQL Server, you would use:
select t1.*
from (select t1.*,
row_number() over (partition by taskid order by value desc) as seqnum
from table1 t1
) t1
where t1.seqnum <= t1.topn;
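To see the row_number() filter at work, here is a small sketch that runs the same shape of query on the question's Table1 rows. It uses Python's sqlite3 as a stand-in for SQL Server (an assumption; window functions need SQLite 3.25 or later), and the derived-table alias ranked is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (taskid INTEGER, topn INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(1, 2, 12), (1, 2, 12), (1, 2, 12),
     (2, 3, 1), (2, 3, 1), (2, 3, 5), (2, 3, 12), (2, 3, 8), (2, 3, 5)],
)

# Number the rows of each task from the highest value down, then keep only
# the rows whose sequence number is within that task's own TopN.
query = """
SELECT taskid, topn, value
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY taskid ORDER BY value DESC) AS seqnum
    FROM table1 t
) AS ranked
WHERE seqnum <= topn
ORDER BY taskid, value DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Unlike the COUNT(*)-based self-join, ROW_NUMBER() gives tied values distinct sequence numbers, so the three identical rows of TaskId 1 are numbered 1, 2, 3 and the first two survive.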

How to concat year=2019 and period=6 to give 201906 and not 20196, and year=2018, period=3112 to give 20183112

I need to join two tables based on time period, such that
table1.date_period = table2.(combination of period and year)
i.e. 201906 = 201906 (year is 2019 and month is 6, both integers)
i.e. 20183112 = 20183112
The problem is how to concat them when period is 6, so that after concatenating it will be 201906 and not 20196.
table 1
ID NAME DATE_PERIOD
1 conan 201906
1 conan 202012
1 conan 20183112
2 andy 201903
table2
ID PROFILE YEAR PERIOD
1 host 2019 6
1 writer 2018 3112
1 anchor 2020 12
2 sidekick 2019 3
Please refer to this db fiddle:
select
*
from table2 t2
inner join table1 t1 on t1.id = t2.id and t1.date_period= (t2.year*100+t2.period)
expected solution
ID PROFILE YEAR MONTH ID NAME DATE_PERIOD
1 anchor 2020 12 1 conan 202012
1 host 2019 6 1 conan 201906
2 sidekick 2019 3 2 andy 201903
1 writer 2018 3112 1 conan 20183112
If they're both integer data types, you should be able to construct the six-digit variant from year and month by just using:
year * 100 + month
If they're character data types, Oracle provides an lpad function for just this purpose, something like:
concat(year, lpad(month, 2, '0'))
Sybase, on the other hand, can use replicate to do padding but it's much uglier, something like:
year || replicate('0', (2 - char_length(month))) || month
Not sure which one you want since you gave both tags.
And re your edit, where the month may also hold DDMM format, you can simply use modulo arithmetic to get the final two characters. I won't bother looking up the function for that in Oracle/Sybase, I'll leave that as an exercise for the reader. Suffice to say 3112 mod 100 is 12.
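The arithmetic and padding ideas above can be checked quickly in Python. This is only an illustrative sketch: the helper names period_key and month_of are hypothetical, and the f-string padding stands in for lpad, with the difference that it never truncates a period longer than two digits.

```python
def period_key(year: int, period: int) -> int:
    """Join year and period, zero-padding the period to at least two digits."""
    return int(f"{year}{period:02d}")


def month_of(period: int) -> int:
    """Take the last two digits of a DDMM-style period as the month."""
    return period % 100


print(2019 * 100 + 6)            # integer arithmetic: 201906
print(period_key(2019, 6))       # padded concatenation: 201906
print(period_key(2018, 3112))    # longer period kept whole: 20183112
print(month_of(3112))            # modulo: 12
print(2018 * 100 + month_of(3112))  # 201812
```

Note that year * 100 + period only works for one- or two-digit periods; for 3112 it would give 204912, which is why the longer periods need either concatenation or the modulo step.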
You need to use LPAD as follows:
select
*
from table2 t2
inner join table1 t1
on t1.id = t2.id and t1.date_period= concat(year,LPAD(month,2,0))
--UPDATE--
You can use CASE statement as follows:
select
*
from table2 t2
inner join table1 t1 on t1.id = t2.id
and t1.date_period = concat(year,
LPAD(period,
case when length(period) <= 2
then 2
else 4
end,
0))
You can use the query below to handle cases where the length of period is less than or greater than four:
select
*
from table2 t2,table1 t1
where t1.id = t2.id
and t1.date_period = concat(year,
LPAD(period,
decode(length(period),1,2,3,4,length(period))
,
0));

Running Totals again. No over clause, no cursor, but increasing order

I am still having trouble creating a running total based on the increasing order of the value. Row id has no real meaning, it is just the PK. My server doesn't support OVER.
Row Value
1 3
2 7
3 1
4 2
Result:
Row Value
3 1
4 3
1 6
2 13
I have tried self and cross joins where I specify that the value of the second amount (the one being summed up) is less than the current value of the first. I have also tried doing this with the HAVING clause, but that always threw an error when I tried it that way. Can someone explain why it would be wrong to use it in that manner and how I should be doing it?
Here is one way to do a running total:
select row, value,
(select sum(value) from t t2 where t2.value <= t.value) as runningTotal
from t
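The correlated-subquery running total can be verified with the sketch below, which loads the question's four rows into an in-memory SQLite database through Python's sqlite3. The column is named row_id instead of row as a precaution against keyword clashes (an assumption for this example), and an ORDER BY is added so the output follows increasing value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (row_id INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 3), (2, 7), (3, 1), (4, 2)])

# For each row, sum every value that is less than or equal to its own value.
# No OVER clause and no cursor are involved.
query = """
SELECT row_id, value,
       (SELECT SUM(t2.value) FROM t AS t2 WHERE t2.value <= t.value) AS running_total
FROM t
ORDER BY value
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

One caveat of this form: rows with equal values all receive the same total, because the <= comparison cannot tell them apart. Adding the primary key as a tie-breaker in the subquery would avoid that.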
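You can use WITH ROLLUP if you have SQL Server 2008, but note that it is a GROUP BY modifier that appends subtotal and grand-total rows rather than producing a per-row running total:
select value, sum(value) as total from t group by value with rollup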
If your platform supports recursive queries, you can use a recursive CTE (IIRC you should omit the RECURSIVE keyword for Microsoft products). Because the CTE needs to determine the begin/end of a "chain", unfortunately, the tuples need to be ordered in some way (I use the "row" field; an internal tuple-id would be perfect for this purpose):
WITH RECURSIVE sums AS (
-- Terminal part
SELECT d0.row
, d0.value AS value
, d0.value AS runsum
FROM data d0
WHERE NOT EXISTS (
SELECT * FROM data nx
WHERE nx.row < d0.row
)
UNION
-- Recursive part
SELECT t1.row AS row
, t1.value AS value
, t0.runsum + t1.value AS runsum
FROM data t1
, sums t0
WHERE t1.row > t0.row
AND NOT EXISTS (
SELECT * FROM data nx
WHERE nx.row > t0.row
AND nx.row < t1.row
)
)
SELECT * FROM sums
;
RESULT:
row | value | runsum
-----+-------+--------
1 | 3 | 3
2 | 7 | 10
3 | 1 | 11
4 | 2 | 13
(4 rows)
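The result above accumulates in row order. As a hedged variation, the sketch below walks the chain in increasing value order instead, which is the order the question asked for. It runs on SQLite through Python's sqlite3 (a stand-in engine, not the asker's server), and the helper CTE ranked, the rn column, and the row_id column name are all assumptions introduced for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (row_id INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany("INSERT INTO data VALUES (?, ?)", [(1, 3), (2, 7), (3, 1), (4, 2)])

# ranked: give each row its position in increasing value order without OVER,
#         using row_id to break ties.
# sums:   start at position 1 and repeatedly add the row at the next position.
query = """
WITH RECURSIVE
ranked AS (
    SELECT d.row_id, d.value,
           (SELECT COUNT(*) FROM data AS d2
             WHERE d2.value < d.value
                OR (d2.value = d.value AND d2.row_id <= d.row_id)) AS rn
    FROM data AS d
),
sums AS (
    SELECT row_id, value, rn, value AS runsum
    FROM ranked
    WHERE rn = 1
    UNION ALL
    SELECT r.row_id, r.value, r.rn, s.runsum + r.value
    FROM ranked AS r
    JOIN sums AS s ON r.rn = s.rn + 1
)
SELECT row_id, value, runsum FROM sums ORDER BY rn
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The recursion stops on its own once no row has the next position, so no explicit depth limit is needed for this small table.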