Difference of data per day - sql

I have a table called Addim and the data looks like:
TName Idate Number
Integrated 3/21/2012 26984013
Integrated 3/20/2012 26959226
Integrated 3/19/2012 26933190
I want the output as:
Idate Diff
3/21/2012 24787
3/20/2012 26036
I did something like this:
Select Count(*),Idate
from dbo.Addim
group by Idate
But I am getting output like this:
Idate Diff
03/21/2012 1
03/20/2012 1
Basically, what I want is the difference from the previous day.
For example:
for 3/21/2012 the diff is 26984013(3/21/2012)-26959226(3/20/2012) = 24787
and for
3/20/2012 is 26959226(3/20/2012)-26933190(3/19/2012) = 26036

the trick is to join the table back to itself for the previous day, like this:
DECLARE @Addim table (TName varchar(10), Idate datetime, Number int)
INSERT INTO @Addim VALUES ('Integrated','3/21/2012',26984013)
INSERT INTO @Addim VALUES ('Integrated','3/20/2012',26959226)
INSERT INTO @Addim VALUES ('Integrated','3/19/2012',26933190)
SELECT
a.TName,a.Idate, a.Number-b.Number
FROM @Addim a
INNER JOIN @Addim b ON a.TName=b.TName AND a.Idate=b.Idate+1
OUTPUT:
TName Idate
---------- ----------------------- -----------
Integrated 2012-03-21 00:00:00.000 24787
Integrated 2012-03-20 00:00:00.000 26036
(2 row(s) affected)
I wasn't sure the significance of TName, so I joined on that column too, assuming that you'd have multiple different values there as well. You can easily remove it from the join if it is not used like that.

Try this:
;WITH CTE AS
(
SELECT TName, Idate, Number, ROW_NUMBER() OVER(ORDER BY Idate) Corr
FROM #Temp1--YourTable
)
SELECT A.Idate, A.number - B.number Diff
FROM CTE A
INNER JOIN CTE B
ON A.Corr = B.Corr + 1
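This assumes at most one record per date. Because each row is matched to the previous row in date order rather than to the previous calendar day, it still works when some days are missing.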
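On SQL Server 2012 or later, the same row-to-previous-row comparison can probably also be written with LAG instead of a self-join. This is only a sketch under that version assumption. It copies the question's sample data into a table variable, and the PARTITION BY TName is my own assumption that differences should be computed separately for each TName.

```sql
-- Sketch only: LAG requires SQL Server 2012 or later.
DECLARE @Addim TABLE (TName varchar(10), Idate datetime, Number int);
INSERT INTO @Addim VALUES
    ('Integrated', '20120321', 26984013),
    ('Integrated', '20120320', 26959226),
    ('Integrated', '20120319', 26933190);

SELECT d.Idate, d.Diff
FROM (
    SELECT Idate,
           -- LAG returns the Number of the previous row in Idate order
           -- (NULL for the earliest row of each TName).
           Number - LAG(Number) OVER (PARTITION BY TName ORDER BY Idate) AS Diff
    FROM @Addim
) AS d
WHERE d.Diff IS NOT NULL   -- drop the earliest day, which has nothing to compare with
ORDER BY d.Idate DESC;
-- With the sample data: 2012-03-21 -> 24787, 2012-03-20 -> 26036
```

Like the ROW_NUMBER version, this compares against the previous available row, not strictly the previous calendar day.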

Calculate Sum From Moving 4 Rows in SQL

I have the following data.
WM_Week POS_Store_Count POS_Qty POS_Sales POS_Cost
------ --------------- ------ -------- --------
201541 3965 77722 153904.67 102593.04
201542 3952 77866 154219.66 102783.12
201543 3951 70690 139967.06 94724.60
201544 3958 70773 140131.41 95543.55
201545 3958 76623 151739.31 103441.05
201546 3956 73236 145016.54 98868.60
201547 3939 64317 127368.62 86827.95
201548 3927 60762 120309.32 82028.70
I need to write a SQL query to get the last four weeks of data, and their last four weeks summed for each of the following columns: POS_Store_Count,POS_Qty,POS_Sales, and POS_Cost.
For example, if I wanted 201548's data it would contain 201548, 201547, 201546, and 201545's.
The sum of 201547 would contain 201547, 201546, 201545, and 201544.
The query should return 4 rows when run successfully.
How would I formulate a recursive query to do this? Is there something easier than recursive to do this?
Edit: The version is Azure Sql DW with version number 12.0.2000.
Edit2: The four rows that should be returned would each have the sum of the columns from that week and its three earlier weeks.
For example, if I wanted the figures for 201548 it would return the following:
WM_Week POS_Store_Count POS_Qty POS_Sales POS_Cost
------ --------------- ------- -------- --------
201548 15780 274938 544433.79 371166.3
Which is the sum of the four (non-identity) columns from 201548, 201547, 201546, and 201545.
I'm pretty sure this will get you what you want. I'm using CROSS APPLY after numbering the rows to apply the sums:
Create Table #WeeklyData (WM_Week Int, POS_Store_Count Int, POS_Qty Int, POS_Sales Money, POS_Cost Money)
Insert #WeeklyData Values
(201541,3965,77722,153904.67,102593.04),
(201542,3952,77866,154219.66,102783.12),
(201543,3951,70690,139967.06,94724.6),
(201544,3958,70773,140131.41,95543.55),
(201545,3958,76623,151739.31,103441.05),
(201546,3956,73236,145016.54,98868.6),
(201547,3939,64317,127368.62,86827.95),
(201548,3927,60762,120309.32,82028.7)
DECLARE @StartWeek INT = 201548;
WITH cte AS
(
SELECT *,
ROW_NUMBER() OVER (ORDER BY [WM_Week] DESC) rn
FROM #WeeklyData
WHERE WM_Week BETWEEN @StartWeek - 9 AND @StartWeek
)
SELECT *
FROM cte c1
CROSS APPLY (SELECT SUM(POS_Store_Count) POS_Store_Count_SUM,
SUM(POS_Qty) POS_Qty_SUM,
SUM(POS_Sales) POS_Sales_SUM,
SUM(POS_Cost) POS_Cost_SUM
FROM cte c2
WHERE c2.rn BETWEEN c1.rn AND (c1.rn + 3)
) ca
WHERE c1.rn <= 4
You can use SUM() in combination with the OVER clause.
Something like:
SELECT WM_Week
, SUM(POS_Store_Count) OVER (ORDER BY WM_Week ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) AS POS_Store_Count_4_Weeks
FROM YourTable
You should be able to use a SQL window function for this.
Add a column to your query like the following:
SUM(POS_Sales) OVER(
ORDER BY WM_Week
ROWS BETWEEN 3 PRECEDING AND CURRENT ROW
) AS POS_Sales_4_Weeks
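Putting the window-function idea together for all four columns might look like the sketch below. The table name #WeeklyDemo, the variable @StartWeek, and the output column names are hypothetical. It is written for SQL Server 2012 or later, so the setup statements may need adapting on Azure SQL DW.

```sql
-- Demo table (hypothetical name) loaded with the question's sample weeks.
CREATE TABLE #WeeklyDemo (WM_Week int, POS_Store_Count int, POS_Qty int,
                          POS_Sales money, POS_Cost money);
INSERT INTO #WeeklyDemo VALUES
    (201541, 3965, 77722, 153904.67, 102593.04),
    (201542, 3952, 77866, 154219.66, 102783.12),
    (201543, 3951, 70690, 139967.06,  94724.60),
    (201544, 3958, 70773, 140131.41,  95543.55),
    (201545, 3958, 76623, 151739.31, 103441.05),
    (201546, 3956, 73236, 145016.54,  98868.60),
    (201547, 3939, 64317, 127368.62,  86827.95),
    (201548, 3927, 60762, 120309.32,  82028.70);

DECLARE @StartWeek int = 201548;

WITH Rolling AS (
    SELECT WM_Week,
           -- Each sum covers the current week plus the three weeks before it.
           SUM(POS_Store_Count) OVER (ORDER BY WM_Week ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) AS POS_Store_Count_4_Weeks,
           SUM(POS_Qty)         OVER (ORDER BY WM_Week ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) AS POS_Qty_4_Weeks,
           SUM(POS_Sales)       OVER (ORDER BY WM_Week ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) AS POS_Sales_4_Weeks,
           SUM(POS_Cost)        OVER (ORDER BY WM_Week ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) AS POS_Cost_4_Weeks,
           ROW_NUMBER() OVER (ORDER BY WM_Week DESC) AS rn
    FROM #WeeklyDemo
    WHERE WM_Week <= @StartWeek
)
SELECT WM_Week, POS_Store_Count_4_Weeks, POS_Qty_4_Weeks, POS_Sales_4_Weeks, POS_Cost_4_Weeks
FROM Rolling
WHERE rn <= 4            -- keep only the four most recent weeks
ORDER BY WM_Week DESC;
-- The 201548 row is 15780, 274938, 544433.79, 371166.30, matching the question.

DROP TABLE #WeeklyDemo;
```

The sums are computed before the rn filter is applied, so each of the four returned weeks still sees its three preceding weeks.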
If I understand correctly, you don't want to return 4 rows, but rather 4 summed columns for each group? If so, here's one option:
select max(WM_Week) as WM_Week,
sum(POS_Store_Count),
sum(POS_Qty),
sum(POS_Sales),
sum(POS_Cost)
from (select top 4 *
from yourtable
where wm_week <= 201548
order by wm_week desc) t
This uses a subquery with top to get the 4 rows you want to aggregate based on the where criteria and order by clause.

SQL Join on nearest available date

I currently have these tables:
CREATE TABLE #SECURITY_TEMP (ID CHAR(30))
CREATE TABLE #SECURITY_TEMP_PRICE_HISTORY (ID CHAR(30), PRICEDATE DATE, PRICE FLOAT)
CREATE TABLE #SECURITY_POST (ID CHAR(30), SECPOS int)
INSERT INTO #SECURITY_TEMP (ID) VALUES ('APPL') ,('VOD'),('VOW3'), ('AAA')
INSERT INTO #SECURITY_TEMP_PRICE_HISTORY (ID,PRICEDATE, PRICE) VALUES
('APPL', '20150101',10.4), ('APPL', '20150116',15.4), ('APPL', '20150124',22.4),
('VOD', '20150101', 30.5), ('VOD', '20150116',16.5), ('VOD', '20150124',16.5),
('VOW3', '20150101', 45.5), ('VOW3', '20150116',48.8) ,('VOW3', '20150124',50.55),
('AAA', '20100118', 0.002)
INSERT INTO #SECURITY_POST (ID,SECPOS) VALUES ('APPL', 100), ('VOD', 350), ('VOW3', 400)
I want to have a clean table that shows me the security ID, the security position and the latest available price for that security when a date is passed.
Now when I do the following:
SELECT sec.ID, sec.SECPOS, t.PRICE
FROM #SECURITY_POST as SEC INNER JOIN #SECURITY_TEMP_PRICE_HISTORY as t
ON sec.ID = t.ID
WHERE t.PriceDate = '20150101'
GROUP BY sec.ID, secPos, t.price
I get the correct result
ID SECPOS PRICE
APPL 100 10.4
VOD 350 30.5
VOW3 400 45.5
However, there may be cases where the price of a stock is not available for the given date. In that case, I want to get the most recent price available.
Doing
SELECT sec.ID, sec.SECPOS, t.PRICE
FROM #SECURITY_POST as SEC INNER JOIN
#SECURITY_TEMP_PRICE_HISTORY as t
ON sec.ID = t.ID
WHERE t.PriceDate = '20150117'
GROUP BY sec.ID, secPos, t.price
Returns 0 rows because of no data, and doing
SELECT sec.ID, sec.SECPOS, t.PRICE
FROM #SECURITY_POST as SEC INNER JOIN
#SECURITY_TEMP_PRICE_HISTORY as t
ON sec.ID = t.ID
WHERE t.PriceDate <= '20150117'
GROUP BY sec.ID, sec.secPos, t.price
HAVING sec.secpos <> 0
Returns duplicate rows.
I have tried loads of different approaches and I just cannot get the output I want. Furthermore, I would also like to get one column with the price nearest a date (call it START_DATE), one column with the price nearest a second date (call it END_DATE), and one column with the position Price@END_DATE - Price@START_DATE. The price is always taken from the same #SECURITY_TEMP_PRICE_HISTORY table.
However, my SQL knowledge is limited, and I could not figure out a good, efficient way of doing this. Any help would be appreciated. Please also note that the #SECURITY_TEMP_PRICE_HISTORY table may contain more securities than the #SECURITY_POST table.
This should do the trick. OUTER APPLY is a join operator that (like CROSS APPLY) allows a derived table to have an outer reference.
SELECT
s.ID,
s.SecPos,
t.Price,
t.PriceDate
FROM
#SECURITY_POST s
OUTER APPLY (
SELECT TOP 1 *
FROM #SECURITY_TEMP_PRICE_HISTORY t
WHERE
s.ID = t.ID
AND t.PriceDate <= '20150117'
ORDER BY t.PriceDate DESC
) t
;
You may also want to consider flagging security prices that are very old, or limiting the lookup for the most recent security to a certain period (a week or a month or something).
Make sure that your price history table has an index with (ID, PriceDate) so that the subquery lookups can use range seeks and your performance can be good. Make sure you do any date math on the security table, not the history table, or you will force the price-lookup subquery to be non-sargable, which would be bad for performance as the range seeks would not be possible.
If no price is found for the security, OUTER APPLY will still allow the row to exist, so the price will show as NULL. If you want securities to not be shown when no appropriate price is found, use CROSS APPLY.
For your second part of the question, you can do this with two OUTER APPLY operations, like so:
DECLARE
@StartDate date = '20150101',
@EndDate date = '20150118';
SELECT
S.ID,
S.SecPos,
StartDate = B.PriceDate,
StartPrice = B.Price,
EndDate = E.PriceDate,
EndPrice = E.Price,
Position = B.Price - E.Price
FROM
#SECURITY_POST S
OUTER APPLY (
SELECT TOP 1 *
FROM #SECURITY_TEMP_PRICE_HISTORY B
WHERE
S.ID = B.ID
AND B.PriceDate <= @StartDate
ORDER BY B.PriceDate DESC
) B
OUTER APPLY (
SELECT TOP 1 *
FROM #SECURITY_TEMP_PRICE_HISTORY E
WHERE
S.ID = E.ID
AND E.PriceDate <= @EndDate
ORDER BY E.PriceDate DESC
) E
;
With your data this yields the following result set:
ID SecPos StartDate StartPrice EndDate EndPrice Position
---- ------ ---------- ---------- ---------- -------- --------
APPL 100 2015-01-01 10.4 2015-01-16 15.4 -5
VOD 350 2015-01-01 30.5 2015-01-16 16.5 14
VOW3 400 2015-01-01 45.5 2015-01-16 48.8 -3.3
Last, while not all agree, I would encourage you to name your ID columns with the table name as in SecurityID instead of ID. In my experience the use of ID only leads to problems.
Note: there is a way to solve this problem using the Row_Number() windowing function. If you have relatively few price points compared to the number of stocks, and you're looking up prices for most of the stocks in the history table, then you might get better performance with that method. However, if there are a great number of price points per stock, or you're filtering to just a few stocks, you may get better performance with the method I've shown you.
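For comparison, the Row_Number() approach mentioned in the note could look roughly like the sketch below. It is not taken from the original answer. The table variables @Post and @PriceHistory and the variable @AsOfDate are hypothetical stand-ins for the question's temp tables and date parameter.

```sql
-- Hypothetical stand-ins for #SECURITY_POST and #SECURITY_TEMP_PRICE_HISTORY.
DECLARE @Post TABLE (ID varchar(30), SECPOS int);
DECLARE @PriceHistory TABLE (ID varchar(30), PriceDate date, Price float);

INSERT INTO @Post VALUES ('APPL', 100), ('VOD', 350), ('VOW3', 400);
INSERT INTO @PriceHistory VALUES
    ('APPL', '20150101', 10.4), ('APPL', '20150116', 15.4), ('APPL', '20150124', 22.4),
    ('VOD',  '20150101', 30.5), ('VOD',  '20150116', 16.5), ('VOD',  '20150124', 16.5),
    ('VOW3', '20150101', 45.5), ('VOW3', '20150116', 48.8), ('VOW3', '20150124', 50.55),
    ('AAA',  '20100118', 0.002);

DECLARE @AsOfDate date = '20150117';

WITH RankedPrices AS (
    SELECT ID, PriceDate, Price,
           -- rn = 1 marks the latest price on or before @AsOfDate for each security
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY PriceDate DESC) AS rn
    FROM @PriceHistory
    WHERE PriceDate <= @AsOfDate
)
SELECT s.ID, s.SECPOS, r.Price, r.PriceDate
FROM @Post AS s
LEFT JOIN RankedPrices AS r      -- LEFT JOIN keeps securities with no price, like OUTER APPLY
       ON r.ID = s.ID AND r.rn = 1
ORDER BY s.ID;
-- With the sample data: APPL 15.4, VOD 16.5, VOW3 48.8, all dated 2015-01-16
```

This version ranks every security in the history table up to the chosen date, which is why it tends to suit the case where most securities are being looked up.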

How to Select the most recent items by dates from another table

I have two tables, Projects and Plans, and I need to get each project's row from its most recent plan by create date:
Projects:
ProjectId, PlanId, StartDate, EndDate
(guid) (guid) (datetime) (datetime)
-------------------------------------
00001, 00001, 1/1/2015 31/1/2015
00001, 00002, 3/2/2015 15/2/2015
00002, 00001, 1/2/2015 20/2/2015
00002, 00002, 1/2/2015 21/2/2015
00003, 00001, 1/3/2015 10/3/2015
Plans:
PlanId, CreateDate
(guid) (datetime)
--------------------
00001, 1/1/2015
00002, 5/2/2015
I wrote a query that takes a single project from its last plan, but I can't work out how to get many projects in a single query.
Here is my query:
SELECT TOP 1 pr.ProjectId,
pl.CreateDate,
pr.StartDate,
pr.EndDate
FROM Projects pr
INNER JOIN Plans pl
ON pr.PlanId = pl.PlanId
WHERE ProjectId = '000002'
ORDER BY pl.CreateDate DESC
Desired result is (all projects from the last plans):
ProjectId, PlanId, StartDate, EndDate
--------------------------------------
00001, 00002, 3/2/2015, 15/2/2015
00002, 00002, 1/2/2015, 21/2/2015
00003, 00001, 1/3/2015, 10/3/2015
UPDATE:
Gordon Linoff gave a good answer, but it didn't solve my problem, because neither of his queries returns project '00003' (its last plan is '00001').
I wrote my query with the OVER clause (as Stanislovas Kalašnikovas suggested).
So I am posting the full answer that solves my question, for future googlers:
SELECT * FROM
(SELECT
result.ProjectId,
result.CreateDate,
result.StartDate,
result.EndDate,
ROW_NUMBER() OVER (PARTITION BY ProjectId ORDER BY CreateDate DESC) AS RowNumber
FROM (
SELECT pr.ProjectId AS ProjectId,
pl.CreateDate AS CreateDate,
pr.StartDate AS StartDate,
pr.EndDate AS EndDate
FROM Projects pr
INNER JOIN Plans pl ON pr.PlanId = pl.PlanId
--WHERE ProjectId IN ('000001', '000003') --Filter
) AS result
) AS result
WHERE result.RowNumber = 1
You can use a subquery to get the most recent plan. Then just join this to projects:
SELECT pr.ProjectId, pl.CreateDate, pr.StartDate, pr.EndDate
FROM (SELECT TOP 1 pl.*
FROM plans pl
ORDER BY pl.CreateDate DESC
) pl JOIN
Projects pr
ON pr.PlanId = pl.PlanId
WHERE pr.ProjectId = '000002'
An alternative method is to just use TOP WITH TIES:
SELECT TOP 1 WITH TIES pr.ProjectId, pl.CreateDate, pr.StartDate, pr.EndDate
FROM plans pl JOIN
Projects pr
ON pr.PlanId = pl.PlanId
WHERE pr.ProjectId = '000002'
ORDER BY pl.CreateDate DESC
This is an example of ROW_NUMBER with one table; you can easily adapt it to your case.
CREATE TABLE #Test
(
Id NVARCHAR(100),
Data DATE
)
INSERT INTO #Test VALUES ('1', '2015-01-04'), ('1', '2015-01-07'), ('2', '2015-01-05'), ('2', '2015-01-08')
SELECT Id, Data
FROM (
SELECT Id, Data, ROW_NUMBER() OVER (PARTITION BY Id ORDER BY Data DESC) rn
FROM #Test
)x
WHERE rn = 1
DROP TABLE #Test
You can use DATEDIFF(). For example, if you want the entries from the last 10 days, use:
SELECT pr.ProjectId,
pl.CreateDate,
pr.StartDate,
pr.EndDate
FROM Projects pr
INNER JOIN Plans pl
ON pr.PlanId = pl.PlanId
WHERE datediff(day,pl.CreateDate,getdate())<10

SQL Server 2008 query, time in each status

I'm wondering if anybody can help with a query I am working on. I'm trying to gather information for 'Time in each status' from my call activity table.
I need to set up 3 time ranges in days: <3 days, 4-5 days, 6+ days, returning the number of days each CallID is spending in each status.
The trouble I'm having is that I need to identify from the table below when there was a status change. This table records any activity on the call, e.g. changed customer details, and not just status changes.
Apologies if this is unclear, let me know if you need further details.
I'm using SQL Server 2008. Here is the table I'm using and related values:
CREATE TABLE Activity ( CallID varchar(30), Call_Date datetime, [User] varchar(30), Status varchar(10) );
INSERT INTO Activity VALUES (366,'2013/09/27 12:24:33',13,9);
INSERT INTO Activity VALUES (366,'2013/09/28 17:36:14',13,9);
INSERT INTO Activity VALUES (366,'2013/09/29 07:29:18',13,10);
INSERT INTO Activity VALUES (366,'2013/09/30 06:22:12',13,-1);
INSERT INTO Activity VALUES (367,'2013/09/27 12:13:16',9,6);
INSERT INTO Activity VALUES (367,'2013/09/27 12:25:03',9,6);
INSERT INTO Activity VALUES (367,'2013/09/29 12:25:29',9,6);
INSERT INTO Activity VALUES (367,'2013/09/30 12:45:55',9,7);
INSERT INTO Activity VALUES (367,'2013/10/01 12:46:04',9,8);
INSERT INTO Activity VALUES (367,'2013/10/02 15:12:27',9,-1);
INSERT INTO Activity VALUES (368,'2013/08/01 15:09:01',5,10);
INSERT INTO Activity VALUES (368,'2013/08/02 14:11:20',5,13);
INSERT INTO Activity VALUES (368,'2013/08/04 16:41:11',5,13);
INSERT INTO Activity VALUES (368,'2013/08/05 01:12:56',5,-1);
Desired Output 1: E.g. if CallID 35931 took 2 days to change from status 1 to status 2, 2 days would be added to the count in the <3 column
Status <3 Days 4-5 days 6+ Days
------ ------- -------- -------
1 10 3 1
2 8 1 2
3 5 3 1
I'm stuck in the first stage trying to identify the rows where there are status changes and ignoring the rest. I'm working on a subquery which selects the top date for each change of status. It's bringing back negative values. See here:
select CallID, T2.[status], Call_Date,
sum(datediff(dd, nextDate, [Call_Date]) - (datediff(wk, nextDate, [Call_Date]) * 2) -
case when datepart(wk, nextDate) = 1 then 1 else 0 end +
case when datepart(wk, [Call_Date]) = 7 then 1 else 0 end) as TotalDays
from (select *,
(select MAX( T0.[Call_Date])
from [Activity] T0
where T0.[Call_Date] > T1.[Call_Date] and
T0.CallID = T1.CallID
) as nextDate
from [Activity] T1
) T2
where T2.[status] <> '-1'
group by Call_Date, T2.[status], CallID
Thanks for your help in advance.
First of all, I think that you need only the rows with the minimum date for each ID and status, as they show a status change. This can be done with a CTE using ROW_NUMBER.
Then you should join the results so that the same record holds both the old status date and the new status date. For the first status of each call, the old-status columns will be NULL.
;WITH CallsCTE AS
(
SELECT CallId,
Call_Date,
Status,
ROW_NUMBER() OVER(PARTITION BY CallId, Status ORDER BY Call_Date) AS rn
FROM Activity
),
StatusChangesCTE AS
(
SELECT CallID,
Call_Date,
Status
FROM CallsCTE
WHERE rn = 1
)
SELECT Sold.*,
Snew.*
FROM StatusChangesCTE Snew
LEFT JOIN StatusChangesCTE Sold
ON Snew.CallID = Sold.CallID
AND Sold.Call_Date = (SELECT MAX(Call_Date) FROM StatusChangesCTE WHERE CallID = Sold.CallID AND Call_Date < Snew.Call_Date)
I think that you can find your way using the above, as you could use DateDiff on Snew.Call_Date and Sold.Call_Date to find the time needed for a status change.
Let me know if you need any more assistance.
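As a rough illustration of the DateDiff step, the sketch below computes how many days each call spent in each status. It is only a sketch with several assumptions of mine: it uses MIN with GROUP BY instead of ROW_NUMBER to find the first date of each status, it assumes a call never returns to an earlier status, and the table variable @Activity and the column name DaysInStatus are hypothetical. It avoids LAG so that it runs on SQL Server 2008.

```sql
-- Hypothetical table variable holding the question's rows for CallID 366.
DECLARE @Activity TABLE (CallID varchar(30), Call_Date datetime, Status varchar(10));
INSERT INTO @Activity VALUES
    ('366', '20130927 12:24:33', '9'),
    ('366', '20130928 17:36:14', '9'),
    ('366', '20130929 07:29:18', '10'),
    ('366', '20130930 06:22:12', '-1');

WITH StatusChanges AS (
    -- First time each call entered each status.
    SELECT CallID, Status, MIN(Call_Date) AS StartDate
    FROM @Activity
    GROUP BY CallID, Status
)
SELECT Sold.CallID,
       Sold.Status,
       DATEDIFF(day, Sold.StartDate, Snew.StartDate) AS DaysInStatus
FROM StatusChanges AS Sold
JOIN StatusChanges AS Snew
  ON Snew.CallID = Sold.CallID
     -- Snew is the next status change after Sold for the same call.
 AND Snew.StartDate = (SELECT MIN(n.StartDate)
                       FROM StatusChanges AS n
                       WHERE n.CallID = Sold.CallID
                         AND n.StartDate > Sold.StartDate)
ORDER BY Sold.CallID, Sold.StartDate;
-- With the sample rows: status 9 -> 2 days, status 10 -> 1 day
```

Note that DATEDIFF(day, ...) counts calendar-day boundaries crossed, not elapsed 24-hour periods. The final '-1' status has no following change, so the inner join drops it.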

Finding the number of concurrent days two events happen over the course of time using a calendar table

I have a table with a structure
(rx)
clmID int
patid int
drugclass char(3)
drugName char(25)
fillDate date
scriptEndDate date
strength int
And a query
;with PatientDrugList(patid, filldate,scriptEndDate,drugClass,strength)
as
(
select rx.patid,rx.fillDate,rx.scriptEndDate,rx.drugClass,rx.strength
from rx
)
,
DrugList(drugName)
as
(
select x.drugClass
from (values('h3a'),('h6h'))
as x(drugClass)
where x.drugClass is not null
)
SELECT PD.patid, C.calendarDate AS overlap_date
FROM PatientDrugList AS PD, Calendar AS C
WHERE drugClass IN ('h3a','h6h')
AND calendardate BETWEEN filldate AND scriptenddate
GROUP BY PD.patid, C.CalendarDate
HAVING COUNT(DISTINCT drugClass) = 2
order by pd.patid,c.calendarDate
The Calendar is simply a calendar table with all possible dates throughout the length of the study, with no other columns.
My query returns data that looks like
The overlap_date represents every day that a person was prescribed a drug in the two classes listed after the PatientDrugList CTE.
I would like to find the number of consecutive days that each person was prescribed both families of drugs. I can't use a simple max and min aggregate because that wouldn't tell me if someone stopped this regimen and then started again. What is an efficient way to find this out?
EDIT: The row constructor in the DrugList CTE should be a parameter for a stored procedure and was amended for the purposes of this example.
You are looking for consecutive sequences of dates. The key observation is that if you subtract an increasing sequence number (such as a row number) from consecutive dates, you get a constant date. This constant identifies a group of dates that are all in sequence, which can then be grouped.
select patid
,MIN(overlap_date) as start_overlap
,MAX(overlap_date) as end_overlap
from(select cte.*,(dateadd(day,-row_number() over(partition by patid order by overlap_Date),overlap_date)) as groupDate
from cte
)t
group by patid, groupDate
This code is untested, so it might have some typos.
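To make the grouping trick concrete, here is a small self-contained sketch. The table variable @Overlap, its sample rows, and the consecutive_days column are made up for illustration. In practice the rows would come from the overlap query in the question.

```sql
-- Hypothetical sample of overlap dates: patient 1 has two separate runs.
DECLARE @Overlap TABLE (patid int, overlap_date date);
INSERT INTO @Overlap VALUES
    (1, '20130101'), (1, '20130102'), (1, '20130103'),
    (1, '20130110'), (1, '20130111'),
    (2, '20130105');

SELECT patid,
       MIN(overlap_date) AS start_overlap,
       MAX(overlap_date) AS end_overlap,
       COUNT(*)          AS consecutive_days
FROM (
    SELECT patid, overlap_date,
           -- Date minus row number is constant within a run of consecutive days.
           DATEADD(day,
                   -1 * ROW_NUMBER() OVER (PARTITION BY patid ORDER BY overlap_date),
                   overlap_date) AS groupDate
    FROM @Overlap
) AS t
GROUP BY patid, groupDate
ORDER BY patid, start_overlap;
-- Patient 1: 2013-01-01..2013-01-03 (3 days) and 2013-01-10..2013-01-11 (2 days)
-- Patient 2: 2013-01-05..2013-01-05 (1 day)
```

This relies on each patient having at most one row per date, which the GROUP BY in the question's query already guarantees.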
You need to pivot on something, and MAX with CASE can work that out: for each person and date, flag whether each drug was present, and then count the dates where both flags are set, if I understand your question correctly.
EG Example SQL:
declare @Temp table ( person varchar(8), dt date, drug varchar(8));
insert into @Temp values ('Brett','1-1-2013', 'h3a'),('Brett', '1-1-2013', 'h6h'),('Brett','1-2-2013', 'h3a'),('Brett', '1-2-2013', 'h6h'),('Joe', '1-1-2013', 'H3a'),('Joe', '1-2-2013', 'h6h');
with a as
(
select
person
, dt
, max(case when drug = 'h3a' then 1 else 0 end) as h3a
, max(case when drug = 'h6h' then 1 else 0 end) as h6h
from @Temp
group by person, dt
)
, b as
(
select *, case when h3a = 1 and h6h = 1 then 1 end as Logic
from a
)
select person, count(Logic) as DaysOnBothPrescriptions
from b
group by person