Creating one record for a continuous sequence of dates in a new table - sql

We have a table in Microsoft SQL Server 2014, shown below, with the columns Id, LogId, AccountId, StateCode, Number and LastSentDate.
Our goal is to move the data to a new table while keeping only the first and last record of each continuous series of dates. For example, if LastSentDate starts on 5/1 and continues through 5/5, we should create one new row as shown below: FirstSentDate is 5/1, the Log Id is the first log id that appeared (28369), and since the series ended on 5/5, LastSentDate is 5/5 and LastSentLog Id is 28752.
If some dates differ in time, the desired output will be
Since our date series continues, the last row in the new table will be
We were trying to achieve this by grouping by date:
WITH t
     AS (SELECT LastSentDate d,
                ROW_NUMBER() OVER(ORDER BY LastSentDate) i
         FROM [dbo].[RegistrationActivity]
         GROUP BY LastSentDate)
SELECT MIN(d),
       MAX(d)
FROM t
GROUP BY DATEDIFF(day, i, d);

Use lag() to define where a group begins. Then use a cumulative sum to assign a group id to each group. And finally, extract the data you want. I'm not sure what data you actually want, but here is the idea:
select accountid, min(lastsentdate), max(lastsentdate)
from (select t.*,
             -- a row starts a new group unless the previous date is at most one day earlier
             sum(case when prev_lsd >= dateadd(day, -1, lastsentdate) then 0 else 1 end)
                 over (partition by accountid order by lastsentdate) as grp
      from (select t.*,
                   lag(lastsentdate) over (partition by accountid order by lastsentdate) as prev_lsd
            from t
           ) t
     ) t
group by accountid, grp;
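To make the lag-plus-cumulative-sum idea concrete outside SQL, here is a minimal Python sketch of the same logic. The function name `collapse_runs`, the tuple layout `(account_id, log_id, sent_date)` and the rule "a run continues while the next date is at most one day later" are assumptions made for illustration, not something taken from the question's actual schema.

```python
from datetime import date, timedelta
from itertools import groupby

def collapse_runs(rows, max_gap=timedelta(days=1)):
    """Collapse (account_id, log_id, sent_date) rows into one record per run.

    Returns tuples (account_id, first_log_id, first_date, last_log_id, last_date).
    """
    result = []
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))
    for account_id, group in groupby(ordered, key=lambda r: r[0]):
        prev_date = None   # plays the role of lag(lastsentdate)
        current = None     # the run currently being extended
        for _, log_id, sent in group:
            if prev_date is None or sent - prev_date > max_gap:
                # gap found: close the previous run and start a new one
                if current is not None:
                    result.append(tuple(current))
                current = [account_id, log_id, sent, log_id, sent]
            else:
                # still inside the same run: only the "last" fields move
                current[3], current[4] = log_id, sent
            prev_date = sent
        if current is not None:
            result.append(tuple(current))
    return result

if __name__ == "__main__":
    # Hypothetical sample; only log ids 28369 and 28752 come from the question.
    sample = [
        (1, 28369, date(2017, 5, 1)),
        (1, 28500, date(2017, 5, 3)),
        (1, 28400, date(2017, 5, 2)),
        (1, 28752, date(2017, 5, 5)),
        (1, 28600, date(2017, 5, 4)),
    ]
    for record in collapse_runs(sample):
        print(record)
```

Each output tuple corresponds to one row of the new table: the first and last date of the run together with the log ids seen on those rows.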

Related

How can I reset the count to 0 in SQL when I have a condition that is false?

I have a SQL table with the data shown in the picture.
I need to write a SQL query that counts, for each ticker, the number of consecutive days per year on which the close_value is greater than the open_value. If the close_value is less than the open_value, the counter must be reset to zero, and I have to save the value the counter had at that moment.
This is an example of a gaps-and-islands problem. You can use the difference of two row_number() values:
select ticker, min(date), max(date), min(open_value), max(close_value),
       count(*) as num_rows
from (select t.*,
             row_number() over (partition by ticker order by date) as seqnum,
             row_number() over (partition by ticker, (case when close_value > open_value then 1 else 2 end) order by date) as seqnum_2
      from t
     ) t
where close_value > open_value
group by ticker, (seqnum - seqnum_2);
This returns all such periods. You haven't specified what the result set should look like, but this should be pretty close.
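The row-number difference can be hard to picture, so here is a small Python sketch of the same trick. The function name `winning_streaks` and the row layout `(ticker, day, open_value, close_value)` are assumptions for illustration. Like the query, it treats rows as consecutive when they are adjacent in date order and does not check for missing calendar days.

```python
from itertools import groupby

def winning_streaks(rows):
    """rows: (ticker, day, open_value, close_value), with day any sortable value.

    Returns (ticker, first_day, last_day, length) for each run of adjacent
    rows where close_value > open_value.
    """
    streaks = []
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    for ticker, group in groupby(ordered, key=lambda r: r[0]):
        seqnum_2 = 0   # row number counted among "up" rows only
        keyed = []
        for seqnum, (_, day, open_value, close_value) in enumerate(group, start=1):
            if close_value > open_value:
                seqnum_2 += 1
                # seqnum - seqnum_2 stays constant while "up" rows are adjacent
                keyed.append((seqnum - seqnum_2, day))
        for _, run in groupby(keyed, key=lambda k: k[0]):
            days = [day for _, day in run]
            streaks.append((ticker, days[0], days[-1], len(days)))
    return streaks
```

Every "down" row advances `seqnum` without advancing `seqnum_2`, so the difference jumps exactly when a streak is interrupted. That is why the difference can serve as a group key.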

SQL count new values only with partition by - running count with no duplicates

Based on the table below, in Presto, I need a column counting all new 'rid' values. What I have managed so far is the same as what I can achieve with partition by, but it's not exactly what I'm looking for (db<>fiddle demo).
The goal is to have counts for many groupings, but I think this should describe the problem sufficiently.
I need the data truncated to days, with a column for the new users on each day, as in the example below. In simple words: if a value repeats, don't count it. I've tried to find a connection between this and the relational division problem, but I'm stuck.
You could use row_number() to rank the records of each rid by time; then you can aggregate and count only the top record per group.
select
    date_trunc('day', t.time) dy,
    count(*) rid_count,
    sum(case when t.rn = 1 then 1 else 0 end) new_rid_count
from (
    select
        t.*,
        row_number() over(partition by t.rid order by t.time) rn
    from mytable t
) t
group by date_trunc('day', t.time)
I think of this as two levels of aggregation. The inner one gets the earliest date for each rid; the outer one counts the rids per first day:
select first_day, count(*)
from (select rid, cast(date_trunc('day', min(time)) as date) as first_day
      from orders o
      group by rid
     ) r
group by 1
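The two levels of aggregation can also be sketched in a few lines of Python, which may help when checking the query against sample data. The function name `new_rids_per_day` and the `(rid, timestamp)` event layout are assumptions for illustration.

```python
from collections import Counter
from datetime import datetime

def new_rids_per_day(events):
    """events: iterable of (rid, timestamp).

    Returns {date: number of rids whose earliest event falls on that date}.
    """
    first_seen = {}
    for rid, ts in events:
        # inner aggregation: min(time) per rid
        if rid not in first_seen or ts < first_seen[rid]:
            first_seen[rid] = ts
    # outer aggregation: count rids per first day
    return dict(Counter(ts.date() for ts in first_seen.values()))
```

A rid that appears again on a later day is ignored there, because only its earliest timestamp contributes to the count.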

How to take only one entry from a table based on an offset to a date column value

I have a requirement to get values from a table based on an offset condition on a date column.
For example, in the table attached below, if any dates fall within 15 days of each other based on the effectivedate column, I should return only the first one.
So my expected result would be as below:
Here, for policy A1234, it returns the 6/18/16 entry and skips the 6/12/16 entry, because the offset between these two dates is within 15 days and I took the latest one from the list.
If you want to group rows together that are within 15 days of each other, then you have a variant of the gaps-and-islands problem. I would recommend lag() and cumulative sum for this version:
select polno, min(effectivedate), max(expirationdate)
from (select t.*,
             -- a row starts a new group unless the previous row is within 15 days
             sum(case when prev_ed >= dateadd(day, -15, effectivedate)
                      then 0 else 1
                 end) over (partition by polno order by effectivedate) as grp
      from (select t.*,
                   lag(effectivedate) over (partition by polno order by effectivedate) as prev_ed
            from t
           ) t
     ) t
group by polno, grp;
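As a rough illustration of how the group number is built up, here is a Python sketch for the dates of a single policy. The function name `group_within_window` is hypothetical, and the sketch assumes the comparison is made against the previous row's effective date, as in the lag() approach.

```python
from datetime import date, timedelta

def group_within_window(effective_dates, window=timedelta(days=15)):
    """Assign a group number to each date of one policy.

    A new group starts whenever the gap from the previous date exceeds the window.
    Returns a list of (date, group_number) in date order.
    """
    groups = []
    grp = 0
    prev = None
    for d in sorted(effective_dates):
        if prev is None or d - prev > window:
            grp += 1   # the "else 1" branch feeding the cumulative sum
        groups.append((d, grp))
        prev = d
    return groups
```

Note that the comparison chains: dates 10 days apart keep extending the same group, so the first and last date of a group can be more than 15 days apart. If groups must never span more than 15 days in total, a different rule is needed.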

SQL: transposing a time series table into a start-end time table if an event occur

I am trying to use a select statement to create a view that transposes a table of datetime readings into a table with one row per period, giving the start and end time of each run of consecutive rows (ordered by time, partitioned by station) in which the 'record' field is not 0.
Here is a sample of the initial table.
And here is how it should look after transposing.
Can anyone help?
You can use the conditional_change_event analytical function to create a special grouping identifier to split these out in a simple query:
select row_number() over () unique_id,
       station,
       min(datetime) startdate,
       max(datetime) enddate
from (
    select t.*, CONDITIONAL_CHANGE_EVENT(decode(recording,0,0,1))
                over (partition by station order by datetime) chg
    from mytable t
) x
where recording > 0
group by station, chg
order by 1, 2
The decode is just to set up your islands and gaps (where gaps are recording <= 0 and islands are recording > 0). Then the change event on that will generate a new identifier for grouping. Also note that I am grouping on the change event even though it isn't part of the output.
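For readers whose database has no such function, the following Python sketch emulates the idea: number the rows by how many times the zero/nonzero flag has changed so far, then take the min and max time per number. The helper names `conditional_change_event` and `recording_periods` and the `(station, timestamp, recording)` row layout are assumptions for illustration.

```python
def conditional_change_event(values):
    """Running count of how often a value differs from the previous one (starts at 0)."""
    out = []
    changes = 0
    prev = None
    for i, v in enumerate(values):
        if i > 0 and v != prev:
            changes += 1
        out.append(changes)
        prev = v
    return out

def recording_periods(rows):
    """rows: (station, timestamp, recording) -> list of (station, start, end)."""
    by_station = {}
    for station, ts, rec in sorted(rows):
        by_station.setdefault(station, []).append((ts, rec))
    periods = []
    for station, items in by_station.items():
        flags = [0 if rec == 0 else 1 for _, rec in items]   # decode(recording,0,0,1)
        chg = conditional_change_event(flags)
        spans = {}
        for (ts, rec), c in zip(items, chg):
            if rec > 0:                                       # where recording > 0
                start, _ = spans.get(c, (ts, ts))
                spans[c] = (start, ts)                        # min / max per chg
        periods.extend((station, s, e) for s, e in spans.values())
    return periods
```

The change counter is only a grouping key. Its actual values do not matter, which is why it can be left out of the final output.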
ROW_NUMBER() works well for the partitioning. Next, you can self-join the partitioned rows to check whether the difference between consecutive times is greater than five minutes. I think the best solution is to partition on the rolling sum of the timestamp differences, offset by 5 minutes based on your pattern. If the five-minute step is not a regular pattern, there is probably a more general approach that uses the zeroes.
The solution is written as a CTE below for easy view creation (it's a slow view, though).
WITH partitioned as (
    SELECT datetime, station, recording,
           ROW_NUMBER() OVER(PARTITION BY station
                             ORDER BY datetime ASC) rn
    FROM mytable --Not sure what the tablename is
    WHERE recording != 0),
diffed as (
    SELECT a.datetime, a.station,
           DATEDIFF(mi, ISNULL(b.datetime, a.datetime), a.datetime) - 5 Difference
           --The ISNULL logic is for when a.datetime is the beginning of the block,
           --we want the DATEDIFF to be 0
    FROM partitioned a
    LEFT JOIN partitioned b on a.rn = b.rn + 1 and a.station = b.station),
cumulative as (
    --The running sum only changes after a gap longer than 5 minutes,
    --so it identifies each block
    SELECT a.datetime, a.station, SUM(b.difference) offset_grouping
    FROM diffed a
    LEFT JOIN diffed b on a.datetime >= b.datetime and a.station = b.station
    GROUP BY a.datetime, a.station),
ordered as (
    SELECT datetime, station, offset_grouping,
           ROW_NUMBER() OVER(PARTITION BY station, offset_grouping ORDER BY datetime asc) starter,
           ROW_NUMBER() OVER(PARTITION BY station, offset_grouping ORDER BY datetime desc) ender
    FROM cumulative)
SELECT ROW_NUMBER() OVER(ORDER BY a.datetime) unique_id, a.station, a.datetime startdate, b.datetime enddate
FROM ordered a
JOIN ordered b on a.station = b.station and a.offset_grouping = b.offset_grouping
              and a.starter = 1 and b.ender = 1
This is the only solution I can think of, but again, it can be slow depending on the amount of data you have.

SQL Ranking by consecutive date blocks

I'm trying to rank consecutive date blocks. What is the best way to do this? The example below shows the first 3 blocks being consecutive; the 4th has a month between it and the previous block, so the counting begins again.
Data I'm trying to order:
StartDate  | EndDate    | Rank
-----------+------------+-----
01/01/2016 | 01/02/2016 | 1
01/02/2016 | 01/03/2016 | 2
01/03/2016 | 01/04/2016 | 3
01/05/2016 | 01/06/2016 | 1
You can do this by identifying where a grouping begins, doing a cumulative sum to identify the group, and then a row number:
select t.*,
       row_number() over (partition by grp order by startdate) as rank
from (select t.*,
             -- a row with no predecessor ending on its start date begins a new group
             sum(case when tprev.startdate is null then 1 else 0 end) over (order by t.startdate) as grp
      from t left join
           t tprev
           on t.startdate = tprev.enddate
     ) t;
This particular SQL works for the data you have presented. It will not handle data that overlaps by more than one day, nor multiple records that start on the same day. These can be handled. If your data is more like that, then ask another question with appropriate data in it.
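A short Python sketch of the same ranking may help when testing against the sample rows. The function name `rank_consecutive` is hypothetical, and the sketch makes the same simplifying assumption as the query: a block continues the previous one only when its start date equals the previous block's end date, with no overlaps and no duplicate start dates.

```python
from datetime import date

def rank_consecutive(blocks):
    """blocks: list of (start, end) dates.

    Returns (start, end, rank), where rank restarts at 1 whenever a block's
    start does not equal the previous block's end.
    """
    ranked = []
    rank = 0
    prev_end = None
    for start, end in sorted(blocks):
        rank = rank + 1 if start == prev_end else 1
        ranked.append((start, end, rank))
        prev_end = end
    return ranked

if __name__ == "__main__":
    # The four blocks from the question, read as day/month/year.
    sample = [
        (date(2016, 1, 1), date(2016, 2, 1)),
        (date(2016, 2, 1), date(2016, 3, 1)),
        (date(2016, 3, 1), date(2016, 4, 1)),
        (date(2016, 5, 1), date(2016, 6, 1)),
    ]
    for row in rank_consecutive(sample):
        print(row)
```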