Copy value into rows below until a greater value is found in SQL

I have been trying, without much luck, to copy the first value in "episode" down into the rows below until a value greater than it is found (see column "episode_final" below). The logic should partition the data by id and order by date, in SQL Server 2012. Any help will be appreciated.

You can use the LEAD window function to get the next episode value (and LAG for the previous one), then a cumulative SUM over a CASE WHEN that checks episode > COALESCE(nextVal, preVal) to increase episode_final by 1 at each jump.
CREATE TABLE T(
id varchar(50),
date date,
episode int
);
INSERT INTO T VALUES (123,'2018-01-01',1);
INSERT INTO T VALUES (123,'2018-01-02',1);
INSERT INTO T VALUES (123,'2018-01-10',1);
INSERT INTO T VALUES (123,'2018-01-11',1);
INSERT INTO T VALUES (123,'2018-01-12',1);
INSERT INTO T VALUES (123,'2018-01-20',2);
INSERT INTO T VALUES (123,'2018-03-20',1);
INSERT INTO T VALUES (123,'2018-05-01',1);
INSERT INTO T VALUES (123,'2018-05-10',3);
INSERT INTO T VALUES (123,'2018-05-20',1);
INSERT INTO T VALUES (345,'2018-06-20',1);
INSERT INTO T VALUES (345,'2018-07-21',1);
INSERT INTO T VALUES (345,'2018-07-22',2);
Query 1:
SELECT t1.Id,
       t1.Date,
       t1.episode,
       SUM(CASE WHEN episode > COALESCE(nextVal, preVal) THEN 1 ELSE 0 END)
           OVER (PARTITION BY id ORDER BY [date]) + 1 AS episode_final
FROM (
    SELECT T.*,
           LEAD(episode) OVER (PARTITION BY id ORDER BY [date]) nextVal,
           LAG(episode)  OVER (PARTITION BY id ORDER BY [date]) preVal
    FROM T
) t1
Results:
| Id | Date | episode | episode_final |
|-----|------------|---------|---------------|
| 123 | 2018-01-01 | 1 | 1 |
| 123 | 2018-01-02 | 1 | 1 |
| 123 | 2018-01-10 | 1 | 1 |
| 123 | 2018-01-11 | 1 | 1 |
| 123 | 2018-01-12 | 1 | 1 |
| 123 | 2018-01-20 | 2 | 2 |
| 123 | 2018-03-20 | 1 | 2 |
| 123 | 2018-05-01 | 1 | 2 |
| 123 | 2018-05-10 | 3 | 3 |
| 123 | 2018-05-20 | 1 | 3 |
| 345 | 2018-06-20 | 1 | 1 |
| 345 | 2018-07-21 | 1 | 1 |
| 345 | 2018-07-22 | 2 | 2 |
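Looking at the sample output, episode_final is simply the largest episode seen so far within each id, so a running MAX is a shorter option that also works on SQL Server 2012. A sketch, assuming dates are unique per id:
SELECT id,
       [date],
       episode,
       MAX(episode) OVER (PARTITION BY id
                          ORDER BY [date]
                          ROWS UNBOUNDED PRECEDING) AS episode_final  -- running maximum per id
FROM T;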

Related

T-SQL Combine rows in continuation

I have a table that looks like the following.
What I want is for the rows that are continuations of each other to be grouped together - for each "ID".
The column IsContinued marks whether the next row should be combined with the current row.
My data looks like this:
+-----+--------+-------------+-----------+----------+
| ID | Period | IsContinued | StartDate | EndDate |
+-----+--------+-------------+-----------+----------+
| 123 | 1 | 1 | 20180101 | 20180404 |
+-----+--------+-------------+-----------+----------+
| 123 | 2 | 1 | 20180501 | 20180910 |
+-----+--------+-------------+-----------+----------+
| 123 | 3 | 0 | 20181001 | 20181201 |
+-----+--------+-------------+-----------+----------+
| 123 | 4 | 1 | 20190105 | 20190228 |
+-----+--------+-------------+-----------+----------+
| 123 | 5 | 0 | 20190401 | 20190430 |
+-----+--------+-------------+-----------+----------+
| 456 | 2 | 1 | 20180201 | 20180215 |
+-----+--------+-------------+-----------+----------+
| 456 | 3 | 0 | 20180301 | 20180401 |
+-----+--------+-------------+-----------+----------+
| 456 | 4 | 0 | 20180501 | 20180530 |
+-----+--------+-------------+-----------+----------+
| 456 | 5 | 0 | 20180701 | 20180705 |
+-----+--------+-------------+-----------+----------+
The end result I want is this:
+-----+-------------+-----------+-----------+----------+
| ID | PeriodStart | PeriodEnd | StartDate | EndDate |
+-----+-------------+-----------+-----------+----------+
| 123 | 1 | 3 | 20180101 | 20181201 |
+-----+-------------+-----------+-----------+----------+
| 123 | 4 | 5 | 20190105 | 20190430 |
+-----+-------------+-----------+-----------+----------+
| 456 | 2 | 3 | 20180201 | 20180401 |
+-----+-------------+-----------+-----------+----------+
| 456 | 4 | 4 | 20180501 | 20180530 |
+-----+-------------+-----------+-----------+----------+
| 456 | 5 | 5 | 20180701 | 20180705 |
+-----+-------------+-----------+-----------+----------+
DDL Statement:
CREATE TABLE #Period (ID INT, PeriodNr INT, IsContinued INT, STARTDATE DATE, ENDDATE DATE)
INSERT INTO #Period VALUES (123,1,1,'20180101', '20180404'),
(123,2,1,'20180501', '20180910'),
(123,3,0,'20181001', '20181201'),
(123,4,1,'20190105', '20190228'),
(123,5,0,'20190401', '20190430'),
(456,2,1,'20180201', '20180215'),
(456,3,0,'20180301', '20180401'),
(456,4,0,'20180501', '20180530'),
(456,5,0,'20180701', '20180705')
The code should be run on SQL Server 2016
Thanks!
Here is one approach:
with removeFluff as
(
    SELECT *
    FROM (
        SELECT ID, PeriodNr, IsContinued, STARTDATE, ENDDATE,
               LAG(IsContinued, 1, 2) OVER (PARTITION BY ID ORDER BY PeriodNr) AS Lag
        FROM #Period
    ) A
    WHERE (IsContinued <> Lag) OR (IsContinued + Lag = 0)
)
, getValues as
(
    SELECT ID,
           CASE WHEN LAG(IsContinued) OVER (PARTITION BY ID ORDER BY PeriodNr) = 1
                THEN LAG(PeriodNr) OVER (PARTITION BY ID ORDER BY PeriodNr)
                ELSE PeriodNr END AS PeriodStart,
           PeriodNr AS PeriodEnd,
           CASE WHEN LAG(IsContinued) OVER (PARTITION BY ID ORDER BY PeriodNr) = 1
                THEN LAG(STARTDATE) OVER (PARTITION BY ID ORDER BY PeriodNr)
                ELSE STARTDATE END AS StartDate,
           ENDDATE AS EndDate,
           IsContinued
    FROM removeFluff r
)
SELECT ID, PeriodStart, PeriodEnd, StartDate, EndDate
FROM getValues
WHERE IsContinued = 0
Output:
ID PeriodStart PeriodEnd StartDate EndDate
123 1 3 2018-01-01 2018-12-01
123 4 5 2019-01-05 2019-04-30
456 2 3 2018-02-01 2018-04-01
456 4 4 2018-05-01 2018-05-30
456 5 5 2018-07-01 2018-07-05
Method:
The removeFluff CTE removes the lines that are unimportant. These are the records that don't start or end a segment (line 2 in your sample data).
Now that the fluff is removed, we know that either:
A.) The line is complete on its own (LAG(IsContinued) ... = 0), i.e. the previous line is complete
B.) The line needs the "start" info from the previous line (LAG(IsContinued) ... = 1)
We apply these two cases in the CASE expressions of the getValues CTE.
Last, the results are narrowed to only the important rows in the final SELECT with IsContinued = 0. This is because we used LAG to pull the "start" data onto the "end" row, so we only want to select the end rows.
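An alternative in the classic gaps-and-islands style: flag every row whose predecessor (per ID, by PeriodNr) has IsContinued = 0, or that has no predecessor, as the start of a new chain, turn the running count of those flags into a group number, and aggregate per group. A sketch against the #Period table above; the CTE and column names (marked, grouped, isStart, grp) are made up for illustration:
;WITH marked AS (
    SELECT *,
           -- a row starts a new chain when the previous row did not continue (the default 0 covers the first row)
           CASE WHEN LAG(IsContinued, 1, 0) OVER (PARTITION BY ID ORDER BY PeriodNr) = 0
                THEN 1 ELSE 0 END AS isStart
    FROM #Period
), grouped AS (
    SELECT *,
           -- running count of chain starts = chain number
           SUM(isStart) OVER (PARTITION BY ID ORDER BY PeriodNr
                              ROWS UNBOUNDED PRECEDING) AS grp
    FROM marked
)
SELECT ID,
       MIN(PeriodNr)  AS PeriodStart,
       MAX(PeriodNr)  AS PeriodEnd,
       MIN(STARTDATE) AS StartDate,
       MAX(ENDDATE)   AS EndDate
FROM grouped
GROUP BY ID, grp
ORDER BY ID, PeriodStart;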

SQL: Show Records Once SUM Threshold Is Reached

I have a table, sorted on a date value (ASC).
+----+------------+-------+
| Id | Date | Value |
+----+------------+-------+
| 1 | 2018-01-01 | 10 |
| 2 | 2018-01-02 | 5 |
| 3 | 2018-01-03 | 15 |
| 4 | 2018-01-04 | 0 |
| 5 | 2018-01-05 | 5 |
| 6 | 2018-01-06 | 10 |
| 7 | 2018-01-07 | 5 |
| 8 | 2018-01-08 | 0 |
| 9 | 2018-01-09 | 0 |
| 10 | 2018-01-10 | 10 |
+----+------------+-------+
I would like to create a view that only returns the records once the SUM of the Value is higher than 30, starting from the first record.
So my threshold is 30: every record whose value fits within the first 30 should be hidden.
All records that follow once this threshold is reached need to be shown.
This means that my required result looks like this:
+----+------------+-------+
| Id | Date | Value |
+----+------------+-------+
| 4 | 2018-01-04 | 0 |
| 5 | 2018-01-05 | 5 |
| 6 | 2018-01-06 | 10 |
| 7 | 2018-01-07 | 5 |
| 8 | 2018-01-08 | 0 |
| 9 | 2018-01-09 | 0 |
| 10 | 2018-01-10 | 10 |
+----+------------+-------+
As you can see, Ids 1, 2 and 3 are left out because their values (10, 5 and 15) sum up to 30.
Once this threshold is reached, the remaining records are visible (even the 0 value of Id 4).
I've created some scripts to set up a test table with data:
-- Create test table
CREATE TABLE thresholdTest (
[Id] INT IDENTITY(1,1) PRIMARY KEY,
[Date] DATE NOT NULL,
[Value] INT NOT NULL
)
-- Insert dummies
INSERT INTO [thresholdTest] ([Date],[Value])
VALUES
('2018-01-01',10),
('2018-01-02',5),
('2018-01-03',15),
('2018-01-04',0),
('2018-01-05',5),
('2018-01-06',10),
('2018-01-07',5),
('2018-01-08',0),
('2018-01-09',0),
('2018-01-10',10);
-- Select ordered by date
SELECT *
FROM [thresholdTest]
ORDER BY [Date] ASC
All I need is a SELECT statement / view.
The threshold is always static (30 in this example).
The data could of course differ, but it's always sorted on a Date and includes a Value.
Thank you in advance.
I'd use a window function:
;with cte as (
    select *, tot = sum([Value]) over (order by [Date])
    from thresholdTest
)
select Id, [Date], [Value]
from cte
where (tot >= 30 and [Value] = 0)
   or tot > 30
You can use SUM as a window function in a subquery to get the accumulated total, then write the condition in the main query.
select Id,
       [Date],
       [Value]
from (
    SELECT *,
           SUM([Value]) OVER (ORDER BY [Date]) AS total
    FROM thresholdTest
) t
WHERE total > 30 OR ([Value] = 0 AND total = 30)
[Results]:
| Id | Date | Value |
|----|------------|-------|
| 4 | 2018-01-04 | 0 |
| 5 | 2018-01-05 | 5 |
| 6 | 2018-01-06 | 10 |
| 7 | 2018-01-07 | 5 |
| 8 | 2018-01-08 | 0 |
| 9 | 2018-01-09 | 0 |
| 10 | 2018-01-10 | 10 |
Yet another way to do it:
select t1.Id, t1.[Date], t1.[Value]
from [thresholdTest] t1
inner join [thresholdTest] t2 on t1.Id >= t2.Id
group by t1.Id, t1.[Value], t1.[Date]
having sum(t2.[Value]) > 30 or (sum(t2.[Value]) = 30 and t1.[Value] = 0)
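Since the question asks for a view, either of the windowed queries above can be wrapped as-is. A sketch; the view name is made up, and ordering is left to the caller because a plain view cannot guarantee row order:
CREATE VIEW dbo.vw_ThresholdReached
AS
SELECT Id, [Date], [Value]
FROM (
    SELECT *,
           SUM([Value]) OVER (ORDER BY [Date]) AS total   -- running total from the first record
    FROM thresholdTest
) t
WHERE total > 30
   OR ([Value] = 0 AND total = 30);
Query it with SELECT * FROM dbo.vw_ThresholdReached ORDER BY [Date];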

row_number over multiple columns based on trigger

I'm trying to factor in multiple conditions on a dataset I'm working with. ROW_NUMBER combined with the LAG function in a second query seems like the way to go, but I can't quite get it 100% right.
Here is how my data is structured:
CREATE TABLE emailhell(
mainID INTEGER NOT NULL PRIMARY KEY
,acctID VARCHAR(4) NOT NULL
,emailID VARCHAR(2) NOT NULL
,type INTEGER NOT NULL
,created DATETIME NOT NULL
);
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (1,'1234','1',6,'1/1/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (2,'1234','1',11,'1/1/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (3,'1234','2',6,'1/2/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (4,'1234','3',6,'1/3/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (5,'1234','4',6,'1/4/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (6,'ABC','89',6,'1/5/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (7,'ABC','90',6,'1/6/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (8,'ABC','90',11,'1/7/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (9,'258','22',6,'1/7/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (10,'258','1',6,'1/10/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (11,'258','2',6,'1/30/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (12,'258','3',6,'1/31/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (13,'258','29',6,'2/15/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (14,'258','29',11,'2/16/2018');
INSERT INTO emailhell(mainID,acctID,emailID,type,created) VALUES (15,'258','31',6,'3/1/2018');
and my desired output
+--------+--------+---------+------+-----------+-------+------------+
| mainID | acctID | emailID | type | created | index | touchcount |
+--------+--------+---------+------+-----------+-------+------------+
| 1 | 1234 | 1 | 6 | 1/1/2018 | 1 | |
| 2 | 1234 | 1 | 11 | 1/1/2018 | 2 | 1 |
| 3 | 1234 | 2 | 6 | 1/2/2018 | 1 | |
| 4 | 1234 | 3 | 6 | 1/3/2018 | 2 | |
| 5 | 1234 | 4 | 6 | 1/4/2018 | 3 | |
| 6 | ABC | 89 | 6 | 1/5/2018 | 1 | |
| 7 | ABC | 90 | 6 | 1/6/2018 | 2 | |
| 8 | ABC | 90 | 11 | 1/7/2018 | 3 | 2 |
| 9 | 258 | 22 | 6 | 1/7/2018 | 1 | |
| 10 | 258 | 1 | 6 | 1/10/2018 | 2 | |
| 11 | 258 | 2 | 6 | 1/30/2018 | 3 | |
| 12 | 258 | 3 | 6 | 1/31/2018 | 4 | |
| 13 | 258 | 29 | 6 | 2/15/2018 | 5 | |
| 14 | 258 | 29 | 11 | 2/16/2018 | 6 | 5 |
| 15 | 258 | 31 | 6 | 3/1/2018 | 1 | |
+--------+--------+---------+------+-----------+-------+------------+
Here's what I was working with, but it has issues when the activity looks like a type 6 followed by an 11, followed by another 6, then 11, etc. Here's the start of my query, and I'm sure there's a better way to do this. I then run a similar query with the LAG function to grab the times where type 11 appeared.
SELECT dm.TABLE.*,
row_number() over(partition by dm.acctId, dm.type order by dm.acctId, dm.created_date) as index into dm.table2
from dm.TABLE with (NOLOCK)
You are defining groups by acctId and the occurrences of type 11. Then, for the 11s, you want one less than the size of the group. So, a cumulative sum and some other stuff:
select t.*,
       row_number() over (partition by acctID, grp order by mainID) as [index],
       (case when type = 11
             then count(*) over (partition by acctID, grp) - 1
        end) as touchcount
from (select eh.*,
             sum(case when type = 11 then 1 else 0 end)
                 over (partition by acctID order by mainID desc) as grp
      from emailhell eh
     ) t;
I should note that the definition of the group requires counting backwards rather than forwards. That is because an 11 is included in the "previous" group rather than being the first record in the "next" group.
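For comparison, the same groups can be built counting forwards by flagging the row immediately after each type 11 as the start of a new group. A sketch against the emailhell table (the flag has to be computed in its own derived table because window functions cannot be nested); isStart and grp are made-up names, and the result matches the desired output on the sample data:
select t.*,
       row_number() over (partition by acctID, grp order by mainID) as [index],
       case when type = 11
            then count(*) over (partition by acctID, grp) - 1
       end as touchcount
from (select x.*,
             -- running count of "new group" flags = group number
             sum(isStart) over (partition by acctID order by mainID) as grp
      from (select eh.*,
                   -- a row starts a new group when the previous row for the account was a type 11
                   case when lag(type) over (partition by acctID order by mainID) = 11
                        then 1 else 0 end as isStart
            from emailhell eh
           ) x
     ) t;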

Displaying products sales for each day in SQL

I have a problem I can't find a solution to.
I have a table "SELLS" with all the sales of a shop, and I want to display how many sales there were for each product on each day of a period.
For example:
| DATE | NAME | QTY |
|------------|----------|-----|
| 2014-07-03 | Coca | 1 |
| 2014-07-03 | Fanta | 1 |
| 2014-07-03 | Orangina | 5 |
| 2014-07-03 | Coca | 3 |
| 2014-07-04 | Coca | 2 |
| 2014-07-05 | Coca | 4 |
| 2014-07-05 | Fanta | 1 |
| 2014-07-05 | Juice | 2 |
The display i want is :
| NAME | TOTAL | 2014-07-03 | 2014-07-04 | 2014-07-05 |
|------------|--------|------------|-------------|-------------|
| Coca | 10 | 4 | 2 | 4 |
| Fanta | 2 | 1 | 0 | 1 |
| Orangina | 1 | 1 | 0 | 0 |
| Juice | 1 | 0 | 0 | 1 |
The user will specify the period he wants to display, so I have to use a BETWEEN condition on the date.
I tried the PIVOT function, but I'm still not familiar with it.
Edit: I'm using SQL Server 2012.
Thanks a lot for your help.
Create table temp
(
tdate date,
name varchar(10),
qty int
)
insert into temp values (getdate(),'A',10)
insert into temp values (getdate(),'B',20)
insert into temp values (getdate(),'C',20)
insert into temp values (getdate(),'A',20)
insert into temp values (getdate(),'B',30)
insert into temp values (getdate(),'C',40)
insert into temp values (getdate()+1,'A',20)
insert into temp values (getdate()+1,'B',30)
insert into temp values (getdate()+1,'C',40)
select * from
( select tdate, name, qty
from temp
) src
pivot (
sum(qty)
for tdate in ([2015-05-12],[2015-05-13])
) piv;
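Note that the IN list has to name dates that actually occur in the data (the demo above inserts rows based on getdate() but pivots on fixed 2015 dates, so those columns will usually come back NULL). Applied to the SELLS table from the question, with the sample period hard-coded, missing days shown as 0 and a TOTAL column added, a sketch might look like this; the column names [DATE], NAME and QTY are taken from the question:
SELECT NAME,
       COALESCE([2014-07-03], 0) + COALESCE([2014-07-04], 0) + COALESCE([2014-07-05], 0) AS TOTAL,
       COALESCE([2014-07-03], 0) AS [2014-07-03],
       COALESCE([2014-07-04], 0) AS [2014-07-04],
       COALESCE([2014-07-05], 0) AS [2014-07-05]
FROM (
    SELECT [DATE], NAME, QTY
    FROM SELLS
    WHERE [DATE] BETWEEN '2014-07-03' AND '2014-07-05'
) src
PIVOT (
    SUM(QTY)
    FOR [DATE] IN ([2014-07-03], [2014-07-04], [2014-07-05])
) piv;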
Doesn't this do the trick?
SELECT sum(QTY), DATE, NAME FROM SELLS WHERE DATE BETWEEN .... GROUP BY DATE,NAME
P.S.: added the BETWEEN clause
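That GROUP BY returns one row per date and name rather than the requested cross-tab. Conditional aggregation is a PIVOT-free way to get the same layout; a sketch with the sample period hard-coded (for arbitrary periods the date columns would have to be generated with dynamic SQL):
SELECT NAME,
       SUM(QTY) AS TOTAL,
       SUM(CASE WHEN [DATE] = '2014-07-03' THEN QTY ELSE 0 END) AS [2014-07-03],
       SUM(CASE WHEN [DATE] = '2014-07-04' THEN QTY ELSE 0 END) AS [2014-07-04],
       SUM(CASE WHEN [DATE] = '2014-07-05' THEN QTY ELSE 0 END) AS [2014-07-05]
FROM SELLS
WHERE [DATE] BETWEEN '2014-07-03' AND '2014-07-05'
GROUP BY NAME;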

How can I insert records from one table into second table

The current table is not set up for growth and I'd like to migrate the existing data to a table better suited for expansion. Let me explain:
The current table is set like:
+--------+---------------+----+----+----+------+-----------+
| id | DateOfService | AM | MD | PM | RATE | CLIENT_ID |
+--------+---------------+----+----+----+------+-----------+
| 1 | 3/4/2013 | 1 | 0 | 0 | 10 | 123 |
| 2 | 3/5/2013 | 1 | 0 | 0 | 10 | 123 |
| 3 | 3/6/2013 | 1 | 0 | 0 | 10 | 123 |
| 4 | 3/5/2013 | 0 | 1 | 1 | 50 | 147 |
| 5 | 3/6/2013 | 1 | 1 | 1 | 25 | 189 |
+--------+---------------+----+----+----+------+-----------+
And instead, I want to setup my table like:
+----------+---------------+---------------+-----------+
| pkid | DateOfService | ServiceTypeID | CLIENT_ID |
+----------+---------------+---------------+-----------+
| 1 | 3/4/2013 | 1 | 123 |
| 2 | 3/5/2013 | 1 | 123 |
| 3 | 3/6/2013 | 1 | 123 |
| 4 | 3/5/2013 | 2 | 147 |
| 5 | 3/5/2013 | 3 | 147 |
| 6 | 3/6/2013 | 1 | 189 |
| 7 | 3/6/2013 | 2 | 189 |
| 8 | 3/6/2013 | 3 | 189 |
+----------+---------------+---------------+-----------+
The ServiceTypeID table would be an options table setup something like:
+-------------------+---------+
| ServiceTypeID | Service |
+-------------------+---------+
| 1 | AM |
| 2 | MD |
| 3 | PM |
+-------------------+---------+
I need help coming up with a query I can run that will select and loop over the existing data and populate my new table.
You could use the UNPIVOT function to turn your current table from columns into rows. Then you can join on your options table and insert the data into your new table.
The UNPIVOT code will be:
select id,
       dateofservice,
       client_id,
       col,
       value
from yourtable
unpivot
(
    value
    for col in (AM, MD, PM)
) unpiv
where value <> 0
Then you will join to your options table to get the result. You can use this query to INSERT INTO your new table:
-- INSERT INTO yourNewTable
select src.id as pkid,
       src.dateofservice,
       o.servicetypeid,
       src.client_id
from
(
    select id,
           dateofservice,
           client_id,
           col,
           value
    from yourtable
    unpivot
    (
        value
        for col in (AM, MD, PM)
    ) unpiv
    where value <> 0
) src
inner join options o
    on src.col = o.service
Since your options table is already "set up", something like this would work:
INSERT INTO newtable
SELECT id, dateofservice, 1, clientid
FROM oldTable
WHERE AM = 1
UNION ALL
SELECT id, dateofservice, 2, clientid
FROM oldTable
WHERE MD = 1
UNION ALL
SELECT id, dateofservice, 3, clientid
FROM oldTable
WHERE PM = 1
NB: I'm making the same assumption blue is - that your example table is wrong and the pk should have had the values 1,2,3,4,4,5,5,5.
If you want that key to be unique, just define it as an auto-increment key and don't include it in this SELECT.
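Following that note, here is a sketch of what the target table and the UNPIVOT-based insert could look like with pkid as an IDENTITY column; the table names (newtable, yourtable, options) echo the placeholders used above and the column types are assumptions:
CREATE TABLE newtable (
    pkid          INT IDENTITY(1,1) PRIMARY KEY,  -- assigned automatically, so it is omitted from the INSERT
    DateOfService DATE NOT NULL,
    ServiceTypeID INT  NOT NULL,
    CLIENT_ID     INT  NOT NULL
);

INSERT INTO newtable (DateOfService, ServiceTypeID, CLIENT_ID)
SELECT src.dateofservice, o.servicetypeid, src.client_id
FROM (
    SELECT id, dateofservice, client_id, col, value
    FROM yourtable
    UNPIVOT (value FOR col IN (AM, MD, PM)) unpiv
    WHERE value <> 0            -- keep only the slots that were actually used
) src
INNER JOIN options o
    ON src.col = o.service
ORDER BY src.id, o.servicetypeid;   -- optional: keeps the new pkid values in a predictable order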