Get last value and this years value to compare - sql

I have a table on sql server 2014 with date, value, type and project columns, i want to write a query that returns last value from past year and latest value from current year, but if the current year but if there is no entry in this year it returns empty in one row.
My table :
Id | date | value | type | project
1 | 2019-01-01 | 1 | a | p1
2 | 2018-01-01 | 1 | a | p1
3 | 2018-01-01 | 1 | b | p1
Result i expected :
CurrentyearDate | currentyearvalue | pastdate | pastvalue | type | project
2019-01-01 | 1 | 2018-01-01 | 1 | a | p1
Null | null | 2018-01-01 | 1 | b | p1
I've used ROW_NUMBER and partition method but can't make it though

You can try outer self join along with DATEPART:
select t1.date as CurrentyearDate, t1.value as currentyearvalue,
t2.date as pastdate, t2.pastvalue as currentyearvalue ,t2.type ,t2.project
from <<tableName>> t1 right join <<tableName>> t2
on DATEPART(MONTH, t1.lastUpdateTime)=DATEPART(MONTH, t2.lastUpdateTime)
and DATEPART(dd, t1.lastUpdateTime)=DATEPART(dd, t2.lastUpdateTime)
and DATEPART(year, t1.lastUpdateTime)=DATEPART(year, t2.lastUpdateTime)+1;

Related

SQL - get summary of differences vs previous month

I have a table similar to this one:
| id | store | BOMdate |
| 1 | A | 01/10/2018 |
| 1 | B | 01/10/2018 |
| 1 | C | 01/10/2018 |
|... | ... | ... |
| 1 | A | 01/11/2018 |
| 1 | C | 01/11/2018 |
| 1 | D | 01/11/2018 |
|... | ... | ... |
| 1 | B | 01/12/2018 |
| 1 | C | 01/12/2018 |
| 1 | E | 01/12/2018 |
It contains the stores that are active at BOM (beginning of month).
How do I query it to get the amount of stores that are new that month - those that where not active the previous month?
The output should be this:
| BOMdate | #newstores |
| 01/10/2018 | 3 | * no stores on previous month
| 01/11/2018 | 1 | * D is the only new active store
| 01/12/2018 | 2 | * store B was not active on November, E is new
I now how to count the first time that each store is active (nested select, taking the MIN(BOMdate) and then counting). But I have no idea how to check each month vs its previous month.
I use SQL Server, but I am interested in the differences in other platforms if there are any.
Thanks
How do I query it to get the amount of stores that are new that month - those that where not active the previous month?
One option uses not exists:
select bomdate, count(*) cnt_new_stores
from mytable t
where not exists (
select 1
from mytable t1
where t1.store = t.store and t1.bomdate = dateadd(month, -1, t.bomdate)
)
group by bomdate
You can also use window functions:
select bomdate, count(*) cnt_new_stores
from (
select t.*, lag(bomdate) over(partition by store order by bomdate) lag_bomdate
from mytable t
) t
where bomdate <> dateadd(month, 1, lag_bomdate) or lag_bomdate is null
group by bomdate
you can compare a date with previous month's date using DATEDIFF function of TSQL.
Using NOT EXIST you can count the stores which did not appear in last month as well you can get the names in a list using STRING_AGG function of TSQL introduced from SQL 2017.
select BOMDate, NewStoresCount=count(1),NewStores= STRING_AGG(store,',') from
yourtable
where not exists
(
Select 1 from
yourtable y where y.store=store and DATEDIFF(m,y.BOMDate,BOMDate)=1
)
group by BOMDate

SQL interpolating missing values for a specific date range - with some conditions

There are some similar questions on the site, but I believe mine warrants a new post because there are specific conditions that need to be incorporated.
I have a table with monthly intervals, structured like this:
+----+--------+--------------+--------------+
| ID | amount | interval_beg | interval_end |
+----+--------+--------------+--------------+
| 1 | 10 | 12/17/2017 | 1/17/2018 |
| 1 | 10 | 1/18/2018 | 2/18/2018 |
| 1 | 10 | 2/19/2018 | 3/19/2018 |
| 1 | 10 | 3/20/2018 | 4/20/2018 |
| 1 | 10 | 4/21/2018 | 5/21/2018 |
+----+--------+--------------+--------------+
I've found that sometimes there is a month of data missing around the end/beginning of the year where I know it should exist, like this:
+----+--------+--------------+--------------+
| ID | amount | interval_beg | interval_end |
+----+--------+--------------+--------------+
| 2 | 10 | 10/14/2018 | 11/14/2018 |
| 2 | 10 | 11/15/2018 | 12/15/2018 |
| 2 | 10 | 1/17/2019 | 2/17/2019 |
| 2 | 10 | 2/18/2019 | 3/18/2019 |
| 2 | 10 | 3/19/2019 | 4/19/2019 |
+----+--------+--------------+--------------+
What I need is a statement that will:
Identify where this year-end period is missing (but not find missing
months that aren't at the beginning/end of the year).
Create this interval by using the length of an existing interval for
that ID (maybe using the mean interval length for the ID to do it?). I could create the interval from the "gap" between the previous and next interval, except that won't work if I'm missing an interval at the beginning or end of the ID's record (i.e. if the record starts at say 1/16/2015, I need the amount for 12/15/2014-1/15/2015
Interpolate an 'amount' for this interval using the mean daily
'amount' from the closest existing interval.
The end result for the sample above should look like:
+----+--------+--------------+--------------+
| ID | amount | interval_beg | interval_end |
+----+--------+--------------+--------------+
| 2 | 10 | 10/14/2018 | 11/14/2018 |
| 2 | 10 | 11/15/2018 | 12/15/2018 |
| 2 | 10 | 12/16/2018 | 1/16/2018 |
| 2 | 10 | 1/17/2019 | 2/17/2019 |
| 2 | 10 | 2/18/2019 | 3/18/2019 |
+----+--------+--------------+--------------+
A 'nice to have' would be a flag indicating that this value is interpolated.
Is there a way to do this efficiently in SQL? I have written a solution in SAS, but have a need to move it to SQL, and my SAS solution is very inefficient (optimization isn't a goal, so any statement that does what I need is fantastic).
EDIT: I've made an SQLFiddle with my example table here:
http://sqlfiddle.com/#!18/8b16d
You can use a sequence of CTEs to build up the data for the missing periods. In this query, the first CTE (EOYS) generates all the end-of-year dates (YYYY-12-31) relevant to the table; the second (INTERVALS) the average interval length for each ID and the third (MISSING) attempts to find start (from t2) and end (from t3) dates of adjoining intervals for any missing (indicated by t1.ID IS NULL) end-of-year interval. The output of this CTE is then used in an INSERT ... SELECT query to add missing interval records to the table, generating missing dates by adding/subtracting the interval length to the end/start date of the adjacent interval as necessary.
First though we add the interp column to indicate if a row was interpolated:
ALTER TABLE Table1 ADD interp TINYINT NOT NULL DEFAULT 0;
This sets interp to 0 for all existing rows. Then we can do the INSERT, setting interp for all those rows to 1:
WITH EOYS AS (
SELECT DISTINCT DATEFROMPARTS(DATEPART(YEAR, interval_beg), 12, 31) AS eoy
FROM Table1
),
INTERVALS AS (
SELECT ID, AVG(DATEDIFF(DAY, interval_beg, interval_end)) AS interval_len
FROM Table1
GROUP BY ID
),
MISSING AS (
SELECT e.eoy,
ids.ID,
i.interval_len,
COALESCE(t2.amount, t3.amount) AS amount,
DATEADD(DAY, 1, t2.interval_end) AS interval_beg,
DATEADD(DAY, -1, t3.interval_beg) AS interval_end
FROM EOYS e
CROSS JOIN (SELECT DISTINCT ID FROM Table1) ids
JOIN INTERVALS i ON i.ID = ids.ID
LEFT JOIN Table1 t1 ON ids.ID = t1.ID
AND e.eoy BETWEEN t1.interval_beg AND t1.interval_end
LEFT JOIN Table1 t2 ON ids.ID = t2.ID
AND DATEADD(MONTH, -1, e.eoy) BETWEEN t2.interval_beg AND t2.interval_end
LEFT JOIN Table1 t3 ON ids.ID = t3.ID
AND DATEADD(MONTH, 1, e.eoy) BETWEEN t3.interval_beg AND t3.interval_end
WHERE t1.ID IS NULL
)
INSERT INTO Table1 (ID, amount, interval_beg, interval_end, interp)
SELECT ID,
amount,
COALESCE(interval_beg, DATEADD(DAY, -interval_len, interval_end)) AS interval_beg,
COALESCE(interval_end, DATEADD(DAY, interval_len, interval_beg)) AS interval_end,
1 AS interp
FROM MISSING
This adds the following rows to the table:
ID amount interval_beg interval_end interp
2 10 2017-12-05 2018-01-04 1
2 10 2018-12-16 2019-01-16 1
2 10 2019-12-28 2020-01-27 1
Demo on SQLFiddle

How do i get the latest user udpated column value in a table based on timestamp entry on a different table in SQL Server?

I have a temp table #StatusInfo with the following data
+---------+--------------+-------+-------------------------+--+
| OrderNo | GroupLineNum | Type1 | UpdateDate | |
+---------+--------------+-------+-------------------------+--+
| Order85 | NULL | 1 | 2019-11-25 05:15:55.000 | |
+---------+--------------+-------+-------------------------+--+
| Order86 | NULL | 1 | 2019-11-25 05:15:55.000 | |
+---------+--------------+-------+-------------------------+--+
| Order86 | 2 | 2 | 2019-11-25 05:32:23.773 | |
+---------+--------------+-------+-------------------------+--+
| Order87 | NULL | 1 | 2019-11-25 05:15:55.000 | |
+---------+--------------+-------+-------------------------+--+
| Order87 | 1 | 2 | 2019-11-25 05:43:37.637 | | B
+---------+--------------+-------+-------------------------+--+
| Order87 | 2 | 2 | 2019-11-25 05:42:32.390 | | A
+---------+--------------+-------+-------------------------+--+
| Order88 | NULL | 1 | 2019-11-25 06:35:13.000 | |
+---------+--------------+-------+-------------------------+--+
| Order88 | 1 | 2 | 2019-11-25 06:39:16.170 | |
+---------+--------------+-------+-------------------------+--+
Any update the user does on an order will be pulled into this temp table. Type 1 column with value 2 denotes a 'Required Date' field change by the user. The timestamp when the user made the change is the last column.
I have another temp table #LineInfo with the following data. This table is created by joining other tables and a left join with the above table too. The 'LineNum' column from below table will match the 'GroupLineNum' column in the above table for Type1=2
+---------+-----------+---------+------------+-------------------------+-------+
| OrderNo | RowNumber | LineNum | TotalCost | ReqDate | Type1 |
+---------+-----------+---------+------------+-------------------------+-------+
| Order85 | 1 | 1 | 309.110000 | 2019-10-30 23:59:00.000 | 1 |
+---------+-----------+---------+------------+-------------------------+-------+
| Order85 | 2 | 2 | 265.560000 | 2019-10-30 23:59:00.000 | 1 |
+---------+-----------+---------+------------+-------------------------+-------+
| Order86 | 1 | 1 | 309.110000 | 2019-10-30 23:59:00.000 | 1 |
+---------+-----------+---------+------------+-------------------------+-------+
| Order86 | 2 | 2 | 265.560000 | 2019-12-28 23:59:00.000 | 2 |
+---------+-----------+---------+------------+-------------------------+-------+
| Order87 | 1 | 1 | 309.110000 | 2020-01-31 23:59:00.000 | 2 |
+---------+-----------+---------+------------+-------------------------+-------+
| Order87 | 2 | 2 | 265.560000 | 2020-01-01 23:59:00.000 | 2 |
+---------+-----------+---------+------------+-------------------------+-------+
| Order88 | 1 | 1 | 309.110000 | 2019-11-29 23:59:00.000 | 2 |
+---------+-----------+---------+------------+-------------------------+-------+
| Order88 | 2 | 2 | 265.560000 | 2019-12-31 23:59:00.000 | 2 |
+---------+-----------+---------+------------+-------------------------+-------+
I will be joining #lineInfo with other tables to generate a new table with only one record for an orderno. Its grouped by orderno.
What I need to do is ensure that the new selectquery will have a column 'ReqDate' which will be the latest ReqDate value for the order.
For example, Order87 has two lines in the order. User updated Line 2 first at '2019-11-25 05:42:32.390' as seen in the row marked 'A' followed by Line 1 marked B # '2019-11-25 05:43:37.637 ' from the first table.
The new query should have the data from LineInfo and only the 'ReqDate' value matching the 'LineNum' that has the maximum of 'UpdateDate' column for Type1=2 and group by orderno.
So in our example, the output should have the ReqDate value '2020-01-31 23:59:00.000'.
In short, an order should have the most recently updated required date. Order can have multiple line items where reqdate is udpated. If there is no entry in #StatusInfo table with Type2 for an order, then any one of the ReqDate value from the #LineInfo table will suffice. Maybe the first line
I wrote something like this but it doesnt pull orders without any entry in StatusInfo table. Those orders will have a default value even though user didnt udpate and i am not sure how to join the result of this with LineInfo table to set the latest value
Select SIT.Orderno, max_date,grouplinenum
from #StatusInfo SIT
inner join
(SELECT Orderno, MAX(ActDate) as max_date
FROM #StatusInfo SI
WHERE SI.Type1=2
GROUP BY SI.Orderno)a
on a.Orderno = SIT.Orderno and a.max_date = SIT.ActDate
This is what I did. I created the blow CTE to load orders with req date change in order of Updated date and assigned it row number. Record with row number 1 will be the most recently updated date
;WITH cteLatestReqDate AS ( --We need to pull the latest ReqDate value the user set. So we are are ordering the SIT table by ActDate and assigning a row number and respective line's required date here
SELECT SIT.OrderNo, SIT.UpdateDate, SIT.GroupLineNum, LLI.ReqDate,
ROW_NUMBER() OVER (PARTITION BY SIT.OrderNo ORDER BY ActDate DESC) AS RowNum
FROM #StatusInfo SIT INNER JOIN #LineLevelInfo LLI ON SIT.OrderNo = OI.OrderNo AND SIT.GroupLineNum = LLI.LineNum
WHERE SIT.Type1 = 2
)
and then I added the below condition to my select query. Below select query is partial
SELECT
CASE WHEN MAX(LRD.ReqDate) IS NULL THEN CAST(FORMAT(MAX(LLI.ReqDate), 'yyMMdd') AS NVARCHAR(10))
ELSE CAST(FORMAT(MAX(LRD.ReqDate), 'yyMMdd') AS NVARCHAR(10)) END AS LatestReqDate
FROM #LineLevelInfo LLI
LEFT JOIN(SELECT * FROM cteLatestReqDate WHERE RowNum = 1)LRD ON LRD.OrderNo = LLI.OrderNo And LRD.GroupLineNum = LLI.LineNum

Count concurrent dates in user-input date range using SQL

The user will input a date range, and I want to output in SQL every date between and including that range in the number of concurrent uses of said equipment.
In this example, the user date range is 03/08/2016 to 03/09/2016, so you can see below I include anything on or between those dates (grouped by category, but I've simplified here by only using 'powerchair')
The table schema is as follows;
trans_date | trans_end_date | eq_category
17/03/2016 | 16/10/2016 | POWERCHAIR
08/08/2016 | 08/08/2016 | POWERCHAIR
12/08/2016 | 12/08/2016 | POWERCHAIR
17/08/2016 | 18/08/2016 | POWERCHAIR
22/08/2016 | 22/08/2016 | POWERCHAIR
26/08/2016 | 26/08/2016 | POWERCHAIR
02/09/2016 | 02/09/2016 | POWERCHAIR
And I would like to output;
date | concurrent_use
03-08-2016 | 1
04-08-2016 | 1
05-08-2016 | 1
06-08-2016 | 1
07-08-2016 | 1
08-08-2016 | 2
09-08-2016 | 1
10-08-2016 | 1
11-08-2016 | 1
12-08-2016 | 2
13-08-2016 | 1
14-08-2016 | 1
15-08-2016 | 1
16-08-2016 | 1
17-08-2016 | 2
18-08-2016 | 2
19-08-2016 | 1
20-08-2016 | 1
21-08-2016 | 1
22-08-2016 | 2
23-08-2016 | 1
24-08-2016 | 1
25-08-2016 | 1
26-08-2016 | 2
27-08-2016 | 1
28-08-2016 | 1
29-08-2016 | 1
30-08-2016 | 1
31-08-2016 | 1
01-09-2016 | 1
02-09-2016 | 2
03-09-2016 | 1
Anything 1 or 0, I can then filter out as there mustn't have been any equipment out concurrently that day.
I don't think this is a gaps/islands problem, but I'm drawing a blank trying to get this in an SQL statement.
Try like below. You need to generate dates using recursive cte. Then we need to count the no of occurrences of each date falling in range.
;WITH CTE
AS (SELECT CONVERT(DATE, '2016-08-03', 103) DATE1
UNION ALL
SELECT Dateadd(DAY, 1, DATE1) AS DATE1
FROM CTE
WHERE Dateadd(DD, 1, DATE1) <= '2016-09-03')
SELECT C.DATE1,
Count(1) OCCURENCES
FROM CTE C
JOIN #TABLE1 T
ON C.DATE1 BETWEEN [TRANS_DATE] AN [TRANS_END_DATE]
GROUP BY C.DATE1
You need a set of numbers or dates. So, if you want everything in that range:
with d as (
select cast('2016-08-03' as date) as d
union all
select dateadd(day, 1, d.d)
from d
where d < '2016-09-03'
)
select d.d, count(s.trans_date)
from d left join
schema s
on d.d between s.trans_date and s.trans_date_end
group by d.d;
I'm not sure if both the start and end dates are included in the range.

Data aggregation with left-outer join

I am trying to pull some data with transaction counts, by branch, by week, which will later be used to feed some dynamic .Net charts.
I have a calendar table, I have a branch table and I have a transaction table.
Here is my DB info (only relevant columns included):
Branch Table:
ID (int), Branch (varchar)
Calendar Table:
Date (datetime), WeekOfYear(int)
Transaction Table:
Date (datetime), Branch (int), TransactionCount(int)
So, I want to do something like the following:
Select b.Branch, c.WeekOfYear, sum(TransactionCount)
FROM BranchTable b
LEFT OUTER JOIN TransactionTable t
on t.Branch = b.ID
JOIN Calendar c
on t.Date = c.Date
WHERE YEAR(c.Date) = #Year // (SP accepts this parameter)
GROUP BY b.Branch, c.WeekOfYear
Now, this works EXCEPT when a branch doesn't have any transactions for a week, in which case NO RECORD is returned for that branch on that week. What I WANT is to get that branch, that week and "0" for the sum. I tried isnull(sum(TransactionCount), 0) - but that didn't work, either. So I will get the following (making up sums for illustration purposes):
+--------+------------+-----+
| Branch | WeekOfYear | Sum |
+--------+------------+-----+
| 1 | 1 | 25 |
| 2 | 1 | 37 |
| 3 | 1 | 19 |
| 4 | 1 | 0 | //THIS RECORD DOES NOT GET RETURNED, BUT I NEED IT!
| 1 | 2 | 64 |
| 2 | 2 | 34 |
| 3 | 2 | 53 |
| 4 | 2 | 11 |
+--------+------------+-----+
So, why doesn't the left-outer join work? Isn't that supposed to
Any help will be greatly appreciated. Thank you!
EDIT: SAMPLE TABLE DATA:
Branch Table:
+----+---------------+
| ID | Branch |
+----+---------------+
| 1 | First Branch |
| 2 | Second Branch |
| 3 | Third Branch |
| 4 | Fourth Branch |
+----+---------------+
Calendar Table:
+------------+------------+
| Date | WeekOfYear |
+------------+------------+
| 01/01/2015 | 1 |
| 01/02/2015 | 1 |
+------------+------------+
Transaction Table
+------------+--------+--------------+
| Date | Branch | Transactions |
+------------+--------+--------------+
| 01/01/2015 | 1 | 12 |
| 01/01/2015 | 1 | 9 |
| 01/01/2015 | 2 | 4 |
| 01/01/2015 | 2 | 2 |
| 01/01/2015 | 2 | 23 |
| 01/01/2015 | 3 | 42 |
| 01/01/2015 | 3 | 19 |
| 01/01/2015 | 3 | 7 |
+------------+--------+--------------+
If you want to return a query that contains each Branch and each week, then you'll need to first create a full list of that, then use a LEFT JOIN to the transactions to get the count. The code will be similar to:
select bc.Branch,
bc.WeekOfYear,
TotalTransaction = coalesce(sum(t.TransactionCount), 0)
from
(
select b.id, b.branch, c.WeekOfYear, c.date
from branch b
cross join Calendar c
-- if you want to limit the number of rows returned use a WHERE to limit the weeks
-- so far in the year or using the date column
WHERE c.date <= getdate()
and YEAR(c.Date) = #Year // (SP accepts this parameter)
) bc
left join TransactionTable t
on t.Date = bc.Date
and bc.id = t.branch
GROUP BY bc.Branch, bc.WeekOfYear
See Demo
This code will create in your subquery a full list of each branch with each date. Once you have this list, then you can JOIN to the transactions to get your total transaction count and you'd return each date as you want.
Bring in the Calendar before you bring in the transactions:
SELECT b.Branch, c.WeekOfYear, sum(TransactionCount)
FROM BranchTable b
INNER JOIN CalendarTable c ON YEAR(c.Date) = #Year
LEFT JOIN TransactionTable t ON t.Branch = b.ID AND t.Date = c.Date
GROUP BY b.Branch, c.WeekOfYear
ORDER BY c.WeekOfYear, b.Branch