Displaying data twice in SQL Server

I have the following table:
Date Type Amount
Jul-17 Type A 20
Jul-17 Type B 30
Jul-17 Type C 10
Aug-17 Type A 50
Aug-17 Type D 40
Aug-17 Type C 70
My query filters on only two months, as below:
SELECT DATE, Type, Amount FROM Table1 WHERE DATE >= '01-Jul-2017'
AND DATE <= '31-Aug-2017'
I want every Type to appear for both months: a Type that does not exist in July should be displayed for July with Amount 0, and a Type that does not exist in August should be displayed for August with Amount 0, as below:
Date Type Amount
Jul-17 Type A 20
Aug-17 Type A 50
Jul-17 Type B 30
Aug-17 Type B 0
Jul-17 Type C 10
Aug-17 Type C 70
Jul-17 Type D 0
Aug-17 Type D 40
So far I have tried the query below, but it hurts performance. I want to simplify the query without using UNION:
SELECT DATE, Type, Amount
FROM Table1
WHERE DATE >= '01-Jul-2017'
AND DATE <= '31-Aug-2017'
Union
SELECT '01-Jul-2017' AS DATE, TYPE, 0 AS AMOUNT
FROM Table1
WHERE DATE >= '01-Aug-2017'
AND DATE <= '31-Aug-2017'
AND Type NOT in (SELECT DISTINCT TYPE FROM Table1 WHERE DATE >= '01-Jul-2017'
AND DATE <= '31-Jul-2017')
Union
SELECT '01-Aug-2017' AS DATE, TYPE, 0 AS AMOUNT
FROM Table1
WHERE DATE >= '01-Jul-2017'
AND DATE <= '31-Jul-2017'
AND Type NOT in (SELECT DISTINCT TYPE FROM Table1 WHERE DATE >= '01-Aug-2017'
AND DATE <= '31-Aug-2017')

You can use a cross join to get all possible combinations and then use a left outer join to get the actual amount...
WITH cte
AS ( SELECT DISTINCT
t1.Date ,
t2.Type
FROM dbo.Table1 t1
CROSS JOIN dbo.Table1 t2
WHERE t1.Date BETWEEN '2017-07-01' AND '2017-08-31'
AND t2.Date BETWEEN '2017-07-01' AND '2017-08-31'
)
SELECT cte.Date ,
cte.Type ,
COALESCE(t.Amount, 0) AS Amount
FROM cte
LEFT OUTER JOIN dbo.Table1 AS t ON t.Date = cte.Date
AND t.Type = cte.Type;
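To try the cross join plus left join idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It is a variation, not the exact query above: it cross joins a distinct list of months with a distinct list of types. The assumption (not stated in the question) is that each month is stored as the first day of that month in ISO format.

```python
import sqlite3

# In-memory database with the sample rows from the question.
# Assumption: each month is stored as the first day of that month.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (Date TEXT, Type TEXT, Amount INTEGER)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [
        ("2017-07-01", "Type A", 20),
        ("2017-07-01", "Type B", 30),
        ("2017-07-01", "Type C", 10),
        ("2017-08-01", "Type A", 50),
        ("2017-08-01", "Type D", 40),
        ("2017-08-01", "Type C", 70),
    ],
)

query = """
WITH months AS (
    SELECT DISTINCT Date FROM Table1
    WHERE Date BETWEEN '2017-07-01' AND '2017-08-31'
),
types AS (
    SELECT DISTINCT Type FROM Table1
    WHERE Date BETWEEN '2017-07-01' AND '2017-08-31'
)
-- every (month, type) pair, with 0 where no row exists
SELECT m.Date, ty.Type, COALESCE(t.Amount, 0) AS Amount
FROM months m
CROSS JOIN types ty
LEFT JOIN Table1 t ON t.Date = m.Date AND t.Type = ty.Type
ORDER BY ty.Type, m.Date
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Every type comes back once per month, and the missing combinations (Type B in August, Type D in July) show up with Amount 0.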

At the beginning I misread your tags and thought you needed a MySQL solution (apologies). I leave the MySQL version at the end.
I used * for brevity. Please replace it with the full column list in your final query.
MSSQL VERSION
You need to specify the two dates just once (at the beginning).
I used TN as the table name.
WITH TT AS (SELECT *, MONTH(DATE) AS MN from TN WHERE DATE >= '2017-07-01' AND DATE <= '2017-08-31' )
SELECT COALESCE(C.DATE, CASE WHEN C.MN2=MINMN THEN DATEADD(MONTH,+1, C.DATE2) ELSE DATEADD(MONTH,-1, C.DATE2) END) AS DATE
, COALESCE(C.TYPE, C.TYPE2) AS TYPE
, COALESCE(C.AMOUNT,0) AS AMOUNT
FROM
(SELECT A.*, B.DATE AS DATE2, B.TYPE AS TYPE2, B.AMOUNT AS AMOUNT2, B.MN AS MN2, X.MINMN
from TT A
FULL JOIN TT B ON A.MN<>B.MN AND A.TYPE = B.TYPE
CROSS JOIN (SELECT MIN(MN) MINMN FROM TT) X
)C
ORDER BY DATE, TYPE ;
Output:
DATE TYPE AMOUNT
1 17.07.2017 00:00:00 A 20
2 17.07.2017 00:00:00 B 30
3 17.07.2017 00:00:00 C 10
4 17.07.2017 00:00:00 D 0
5 17.08.2017 00:00:00 A 50
6 17.08.2017 00:00:00 B 0
7 17.08.2017 00:00:00 C 70
8 17.08.2017 00:00:00 D 40
MYSQL VERSION
At the moment I have only found this one (it uses only one UNION instead of two).
I used TN as the table name. Please check its performance.
SELECT * from TN WHERE DATE >= '2017-07-01' AND DATE <= '2017-08-31'
UNION ALL
SELECT CASE WHEN MONTH(A.DATE)=M1 THEN '2017-08-01' ELSE '2017-07-01' END AS DATE, A.TYPE, 0 AS AMOUNT
from TN A
LEFT JOIN (SELECT * FROM TN WHERE DATE >= '2017-07-01' AND DATE <= '2017-08-31') B ON DATE_FORMAT(A.DATE ,'%Y-%m-01') <> DATE_FORMAT(B.DATE ,'%Y-%m-01') AND A.TYPE = B.TYPE
CROSS JOIN (SELECT MIN(MONTH(DATE)) AS M1 FROM TN WHERE DATE >= '2017-07-01' AND DATE <= '2017-08-31') X
WHERE A.DATE >= '2017-07-01' AND A.DATE <= '2017-08-31'
AND B.TYPE IS NULL
;
Output:
Date type amount
1 2017-07-17 A 20
2 2017-07-17 B 30
3 2017-07-17 C 10
4 2017-08-17 A 50
5 2017-08-17 D 40
6 2017-08-17 C 70
7 2017-08-01 B 0
8 2017-07-01 D 0

If you want to produce the result using PIVOT, you may try the query below:
select [Date] As Date_For_Table,
[Type A] =
case
when [Type A] IS NULL Then 0
else [Type A]
end,
[Type B] =
case
when [Type B] IS NULL Then 0
else [Type B]
end,
[Type C] =
case
when [Type C] IS NULL Then 0
else [Type C]
end,
[Type D] =
case
when [Type D] IS NULL Then 0
else [Type D]
end
from
(
select CONVERT(CHAR(4), date, 100) + CONVERT(CHAR(4), date, 120) as date ,type,Amount
from Table1
) as PivotData
pivot
(
avg(amount) for type in
([Type A],[Type B],[Type C],[Type D]))as Pivoting
order by date desc

Related

How to calculate new column with sum of moving time window within a group in Snowflake SQL?

I have a table like this:
date ID count
2021-01-01 A 24
2021-01-02 A 10
2021-01-03 A 5
2021-01-04 A 1
2021-01-01 B 5
2021-01-02 B 10
2021-01-03 B 1
2021-01-04 B 10
2021-01-01 C 5
2021-01-03 C 10
2021-01-04 C 1
2021-01-05 C 10
and I want to calculate a new column that sums the count value for the two days before the date within each ID. There might be missing dates (days) in between, which is why a simple lag function probably will not work (see example ID C). So I want to sum the values within a certain date range for each ID.
So the resulting table should look like
date ID count sum_two_days_before
2021-01-01 A 24 Null
2021-01-02 A 10 Null
2021-01-03 A 5 34
2021-01-04 A 1 15
2021-01-01 B 5 Null
2021-01-02 B 10 Null
2021-01-03 B 1 15
2021-01-04 B 10 11
2021-01-01 C 5 Null
2021-01-03 C 10 5
2021-01-04 C 1 10
2021-01-05 C 10 11
Would be glad about help!
A correlated subquery might work, but I can't verify it.
SELECT *
, ( SELECT SUM(t2.count)
FROM your_table t2
WHERE t2.ID = t.ID
AND t2.date IN (DATEADD(day,-1,t.date), DATEADD(day,-2,t.date))
HAVING COUNT(t2.date) = 2
) AS sum_two_days_before
FROM your_table t
ORDER BY t.ID, t.date;
And if that doesn't work, maybe adding two LAGs will work.
But that just looks at the previous rows, not at today-1 and today-2.
SELECT *
, LAG(t.count, 1) OVER (PARTITION BY t.ID ORDER BY t.date) +
LAG(t.count, 2) OVER (PARTITION BY t.ID ORDER BY t.date) AS sum_two_dates_before
FROM your_table t
ORDER BY t.ID, t.date;
But if Snowflake doesn't support HAVING there, then maybe this will work.
SELECT *
, ( SELECT
SUM(CASE WHEN t2.date = DATEADD(day,-2,t.date) THEN t2.count END)
+ SUM(CASE WHEN t2.date = DATEADD(day,-1,t.date) THEN t2.count END)
FROM your_table t2
WHERE t2.ID = t.ID
AND t2.date IN (DATEADD(day,-1,t.date), DATEADD(day,-2,t.date))
) AS sum_two_days_before
FROM your_table t
ORDER BY t.ID, t.date;
The results as described can be obtained with a verbose version:
WITH data AS (
SELECT * FROM values
('2021-01-01','A', 24, null),
('2021-01-02','A', 10, null),
('2021-01-03','A', 5, 34),
('2021-01-04','A', 1, 15),
('2021-01-01','B', 5, null),
('2021-01-02','B', 10, null),
('2021-01-03','B', 1, 15),
('2021-01-04','B', 10, 11),
('2021-01-01','C', 5, null),
('2021-01-03','C', 10, 5),
('2021-01-04','C', 1, 10),
('2021-01-05','C', 10, 11)
v(date, id, count, expected )
)
SELECT
date,
id,
count,
expected,
sum_2_days
FROM (
SELECT
date,
id,
count,
expected,
LAG(date,2)over(partition by id order by date) as d2,
LAG(date,1)over(partition by id order by date) as d1,
lag(count,2)over(partition by id order by date) as c2,
lag(count,1)over(partition by id order by date) as c1,
dateadd(day,-2,date)::date AS dm2,
d2 is not null OR d1 = dm2 as X1,
iff(d2 >= dm2, c2, 0) as x1_v2,
iff(d1 >= dm2, c1, 0) as x1_v1,
x1_v2 + x1_v1 as x1_r,
iff(X1, x1_r, null) as sum_2_days
FROM data
)
order by 2,1;
giving:
DATE ID COUNT EXPECTED SUM_2_DAYS
2021-01-01 A 24 null null
2021-01-02 A 10 null null
2021-01-03 A 5 34 34
2021-01-04 A 1 15 15
2021-01-01 B 5 null null
2021-01-02 B 10 null null
2021-01-03 B 1 15 15
2021-01-04 B 10 11 11
2021-01-01 C 5 null null
2021-01-03 C 10 5 5
2021-01-04 C 1 10 10
2021-01-05 C 10 11 11
or the compressed version:
SELECT
date,
id,
count,
expected,
iff( LAG(date,2)over(partition by id order by date) is not null
OR LAG(date,1)over(partition by id order by date) = dateadd(day,-2,date)
, iff( LAG(date,2)over(partition by id order by date) >= dateadd(day,-2,date)
,lag(count,2)over(partition by id order by date)
,0)
+ iff( LAG(date,1)over(partition by id order by date) >= dateadd(day,-2,date)
,lag(count,1)over(partition by id order by date)
,0)
,null) as sum_2_days
FROM data
order by 2,1;
And if you really want to do it with a join, you can, with some pre-conditioning:
WITH date_a AS (
SELECT date, id, count, expected,
dateadd(day,-2,date) as dm2,
first_value(date)over(partition by id order by date) as fd
FROM data
)
select
a.date,
a.id,
a.count,
a.expected,
sum(iff(a.fd <= a.dm2, b.count, null)) as sum_2_days
FROM date_a a
LEFT JOIN data b
ON a.id = b.id AND a.date > b.date AND b.date >= a.dm2
GROUP BY 1,2,3,4
order by 2,1;
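The date-range idea behind the join version can also be checked outside Snowflake. Below is a minimal sketch with Python's built-in sqlite3 module, not Snowflake syntax: it sums the rows that fall in the two days before each date, and returns NULL when that window starts before the first date of the ID. The column name cnt (instead of count) and the use of SQLite's date() function in place of DATEADD are assumptions made for this sketch.

```python
import sqlite3

# Sample rows from the question: (date, id, cnt). "cnt" stands in for "count".
data = [
    ("2021-01-01", "A", 24), ("2021-01-02", "A", 10),
    ("2021-01-03", "A", 5), ("2021-01-04", "A", 1),
    ("2021-01-01", "B", 5), ("2021-01-02", "B", 10),
    ("2021-01-03", "B", 1), ("2021-01-04", "B", 10),
    ("2021-01-01", "C", 5), ("2021-01-03", "C", 10),
    ("2021-01-04", "C", 1), ("2021-01-05", "C", 10),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (date TEXT, id TEXT, cnt INTEGER)")
conn.executemany("INSERT INTO data VALUES (?, ?, ?)", data)

query = """
SELECT a.date, a.id, a.cnt,
       CASE
           -- only report a sum when the full two-day window lies on or
           -- after the first date known for this id
           WHEN (SELECT MIN(m.date) FROM data m WHERE m.id = a.id)
                <= date(a.date, '-2 day')
           THEN (SELECT SUM(b.cnt)
                 FROM data b
                 WHERE b.id = a.id
                   AND b.date >= date(a.date, '-2 day')
                   AND b.date < a.date)
       END AS sum_two_days_before
FROM data a
ORDER BY a.id, a.date
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because the window is defined by dates rather than by row positions, the missing 2021-01-02 for ID C does not shift the result.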

Select data where sum for last 7 from max-date is greater than x

I have a data set as such:
Date Value Type
2020-06-01 103 B
2020-06-01 100 A
2020-06-01 133 A
2020-06-11 150 A
2020-07-01 1000 A
2020-07-21 104 A
2020-07-25 140 A
2020-07-28 1600 A
2020-08-01 100 A
Like this:
Type ISHIGH
A 1
B 0
Here's the query I tried:
select type, case when sum(value) > 10 then 1 else 0 end as total_usage
from table_a
where (select sum(value) as usage from tableA where date = max(date)-7)
group by type, date
This is clearly not right. What is a simple way to do this?
It is a simple GROUP BY, except that you need access to the max date before grouping:
select type
, max(date) as last_usage_date
, sum(value) as total_usage
, case when sum(case when date >= cutoff_date then value end) >= 1000 then 'y' end as [is high!]
from t
cross apply (
select dateadd(day, -6, max(date))
from t as x
where x.type = t.type
) as ca(cutoff_date)
group by type, cutoff_date
If you want just those two columns then a simpler approach is:
select t.type, case when sum(value) >= 1000 then 'y' end as [is high!]
from t
left join (
select type, dateadd(day, -6, max(date)) as cutoff_date
from t
group by type
) as a on t.type = a.type and t.date >= a.cutoff_date
group by t.type
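For anyone who wants to check the cutoff logic on the sample data, here is a minimal sketch using Python's built-in sqlite3 module. It follows the join-on-cutoff idea above, but with SQLite's date() function instead of DATEADD and a 1/0 flag as in the desired output. The threshold of 1000 is taken from the answers, since the question only says "x".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (date TEXT, value INTEGER, type TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("2020-06-01", 103, "B"),
        ("2020-06-01", 100, "A"),
        ("2020-06-01", 133, "A"),
        ("2020-06-11", 150, "A"),
        ("2020-07-01", 1000, "A"),
        ("2020-07-21", 104, "A"),
        ("2020-07-25", 140, "A"),
        ("2020-07-28", 1600, "A"),
        ("2020-08-01", 100, "A"),
    ],
)

query = """
SELECT t.type,
       CASE WHEN SUM(t.value) > 1000 THEN 1 ELSE 0 END AS ishigh
FROM t
JOIN (
    -- per type: first day of the 7-day window ending on the max date
    SELECT type, date(MAX(date), '-6 day') AS cutoff_date
    FROM t
    GROUP BY type
) AS a
  ON a.type = t.type AND t.date >= a.cutoff_date
GROUP BY t.type
ORDER BY t.type
"""

rows = conn.execute(query).fetchall()
print(rows)
```

For type A the window covers 2020-07-26 to 2020-08-01, so only the values 1600 and 100 are summed; for type B the single row of 103 stays below the threshold.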
Find the max date by type, then use it to find the last 7 days and sum() the value.
with
cte as
(
select [type], max([Date]) as MaxDate
from tableA
group by [type]
)
select c.[type], sum(a.Value),
case when SUM(a.Value) > 1000 then 1 else 0 end as ISHIGH
from cte c
inner join tableA a on a.[type] = c.[type]
and a.[Date] >= DATEADD(DAY, -7, c.MaxDate)
group by c.[type]
This can be done through a cumulative total as follows:
;With CTE As (
Select [type], [date],
SUM([value]) Over (Partition by [type] Order by [date] Desc) As Total,
Row_Number() Over (Partition by [type] Order by [date] Desc) As Row_Num
From Tbl)
Select Distinct CTE.[type], Case When C.[type] Is Not Null Then 1 Else 0 End As ISHIGH
From CTE Left Join CTE As C On (CTE.[type]=C.[type]
And DateDiff(dd,CTE.[date],C.[date])<=7
And C.Total>1000)
Where CTE.Row_Num=1
I think you were quite close with your initial attempt to solve this. Just a tiny edit:
select type, case when sum(value) > 1000 then 1 else 0 end as total_usage
from tableA
where date > (select max(date)-7 from tableA)
group by type

Fill up date gap by month

I have table of products and their sales quantity in months.
Product Month Qty
A 2018-01-01 5
A 2018-02-01 3
A 2018-05-01 5
B 2018-08-01 10
B 2018-10-01 12
...
I'd like to first fill in the data gap between each product's min and max dates like below:
Product Month Qty
A 2018-01-01 5
A 2018-02-01 3
A 2018-03-01 0
A 2018-04-01 0
A 2018-05-01 5
B 2018-08-01 10
B 2018-09-01 0
B 2018-10-01 12
...
Then I would need to perform an accumulation of each product's sales quantity by month.
Product Month total_Qty
A 2018-01-01 5
A 2018-02-01 8
A 2018-03-01 8
A 2018-04-01 8
A 2018-05-01 13
B 2018-08-01 10
B 2018-09-01 10
B 2018-10-01 22
...
I fumbled over the "cross join" clause, however it seems to generate some unexpected results for me. Could someone help to give a hint how I can achieve this in SQL?
Thanks a lot in advance.
I think a recursive CTE is a simple way to do this. The code is just:
with cte as (
select product, min(mon) as mon, max(mon) as end_mon
from t
group by product
union all
select product, dateadd(month, 1, mon), end_mon
from cte
where mon < end_mon
)
select cte.product, cte.mon, coalesce(qty, 0) as qty
from cte left join
t
on t.product = cte.product and t.mon = cte.mon;
Here is a db<>fiddle.
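As a rough way to try the recursive CTE together with the accumulation step the question asks for, here is a sketch using Python's built-in sqlite3 module. It is SQLite syntax, not T-SQL: date(mon, '+1 month') replaces DATEADD, and the running total uses a window SUM, which assumes SQLite 3.25 or later. The table name t and column mon follow the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (product TEXT, mon TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("A", "2018-01-01", 5),
        ("A", "2018-02-01", 3),
        ("A", "2018-05-01", 5),
        ("B", "2018-08-01", 10),
        ("B", "2018-10-01", 12),
    ],
)

query = """
WITH RECURSIVE cte(product, mon, end_mon) AS (
    -- anchor: first and last month per product
    SELECT product, MIN(mon), MAX(mon) FROM t GROUP BY product
    UNION ALL
    -- step: add one month until the product's last month is reached
    SELECT product, date(mon, '+1 month'), end_mon
    FROM cte
    WHERE mon < end_mon
)
SELECT cte.product,
       cte.mon,
       COALESCE(t.qty, 0) AS qty,
       SUM(COALESCE(t.qty, 0))
           OVER (PARTITION BY cte.product ORDER BY cte.mon) AS total_qty
FROM cte
LEFT JOIN t ON t.product = cte.product AND t.mon = cte.mon
ORDER BY cte.product, cte.mon
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The gap months come back with qty 0, and total_qty carries the last running total forward across them.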
Hi, I think this example can help you and does what you expected:
CREATE TABLE #MyTable
(Product varchar(10),
ProductMonth DATETIME,
Qty int
);
GO
CREATE TABLE #MyTableTempDate
(
FullMonth DATETIME
);
GO
INSERT INTO #MyTable
SELECT 'A', '2019-01-01', 214
UNION
SELECT 'A', '2019-02-01', 4
UNION
SELECT 'A', '2019-03-01', 50
UNION
SELECT 'B', '2019-01-01', 214
UNION
SELECT 'B', '2019-02-01', 10
UNION
SELECT 'C', '2019-04-01', 150
INSERT INTO #MyTableTempDate
SELECT '2019-01-01'
UNION
SELECT '2019-02-01'
UNION
SELECT '2019-03-01'
UNION
SELECT '2019-04-01'
UNION
SELECT '2019-05-01'
UNION
SELECT '2019-06-01'
UNION
SELECT '2019-07-01';
------------- FOR NEWER SQL SERVER VERSIONS (windowed SUM with ORDER BY needs 2012 or later)
WITH MyCTE AS
(
SELECT T.Product, T.ProductMonth AS 'MMonth', T.Qty
FROM #MyTable T
UNION
SELECT T.Product, TD.FullMonth AS 'MMonth', 0 AS 'Qty'
FROM #MyTable T, #MyTableTempDate TD
WHERE NOT EXISTS (SELECT 1 FROM #MyTable TT WHERE TT.Product = T.Product AND TD.FullMonth = TT.ProductMonth)
)
-- SELECT * FROM MyCTE;
-- order the running total by month; ordering by Product alone leaves the row order undefined
SELECT Product, MMonth, Qty, SUM( Qty) OVER(PARTITION BY Product ORDER BY MMonth
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as 'TotalQty'
FROM MyCTE
ORDER BY Product, MMonth ASC;
DROP TABLE #MyTable
DROP TABLE #MyTableTempDate
I have another way to do this on older SQL Server versions (2005 and lower).
It uses a SELECT within a SELECT; if that is your case, let me know and I will provide another example.
You can create the months with a recursive CTE:
DECLARE @MyTable TABLE
(
ProductID CHAR(1),
Date DATE,
Amount INT
)
INSERT INTO @MyTable
VALUES
('A','2018-01-01', 5),
('A','2018-02-01', 3),
('A','2018-05-01', 5),
('B','2018-08-01', 10),
('B','2018-10-01', 12)
DECLARE @StartDate DATE
DECLARE @EndDate DATE
SELECT @StartDate = MIN(Date), @EndDate = MAX(Date) FROM @MyTable
;WITH dates AS (
SELECT @StartDate AS Date
UNION ALL
SELECT DATEADD(Month, 1, Date)
FROM dates
WHERE Date < @EndDate
)
SELECT A.ProductID, d.Date, COALESCE(Amount,0) AS Amount, COALESCE(SUM(Amount) OVER(PARTITION BY A.ProductID ORDER BY A.ProductID, d.Date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW),0) AS Total
FROM
(
SELECT ProductID, MIN(date) as DateStart, MAX(date) as DateEnd
FROM @MyTable
GROUP BY ProductID -- As I read in your comments, you need different min and max dates per product
) A
JOIN dates d ON d.Date >= A.DateStart AND d.Date <= A.DateEnd
LEFT JOIN @MyTable T ON A.ProductID = T.ProductID AND T.Date = d.Date
ORDER BY A.ProductID, d.Date
Try this below
IF OBJECT_ID('tempdb..#Temp') IS NOT NULL
DROP TABLE #Temp
;WITH CTE(Product,[Month],Qty)
AS
(
SELECT 'A','2018-01-01', 5 UNION ALL
SELECT 'A','2018-02-01', 3 UNION ALL
SELECT 'A','2018-05-01', 5 UNION ALL
SELECT 'B','2018-08-01', 10 UNION ALL
SELECT 'D','2018-10-01', 12
)
SELECT ct.Product,[MonthDays],ct.Qty
INTO #Temp
FROM
(
SELECT c.Product,[Month],
ISNULL(Qty,0) AS Qty
FROM CTE c
)ct
RIGHT JOIN
(
SELECT -- This code is to get month data
CONVERT(VARCHAR(10),'2018-'+ RIGHT('00'+CAST(MONTH(DATEADD(MM, s.number, CONVERT(DATETIME, 0)))AS VARCHAR),2) +'-01',120) AS [MonthDays]
FROM master.dbo.spt_values s
WHERE [type] = 'P' AND s.number BETWEEN 0 AND 11
)DT
ON dt.[MonthDays] = ct.[Month]
SELECT
MAX(Product)OVER(ORDER BY [MonthDays])AS Product,
[MonthDays],
ISNULL(Qty,0) Qty,
SUM(ISNULL(Qty,0))OVER(ORDER BY [MonthDays]) As SumQty
FROM #Temp
Result
Product MonthDays Qty SumQty
------------------------------
A 2018-01-01 5 5
A 2018-02-01 3 8
A 2018-03-01 0 8
A 2018-04-01 0 8
A 2018-05-01 5 13
A 2018-06-01 0 13
A 2018-07-01 0 13
B 2018-08-01 10 23
B 2018-09-01 0 23
D 2018-10-01 12 35
D 2018-11-01 0 35
D 2018-12-01 0 35
First of all, I would split month and year into separate columns to make the statistics easier.
I will give you an example query, not based on your table but still helpful.
--here i create the table that will be used as calendar
Create Table MA_MonthYears (
Month int not null ,
year int not null
PRIMARY KEY ( month, year) )
--/////////////////
-- here I'm running a loop to fill the MA_MonthYears table
declare @month as int
declare @year as int
set @month = 1
set @year = 2015
while ( @year != 2099 )
begin
insert into MA_MonthYears(Month, year)
select @month, @year
if @month < 12
set @month=@month+1
else
set @month=1
if @month = 1
set @year = @year + 1
end
--/////////////////
--here you are the possible result you are looking for
select SUM(Ma_saledocdetail.taxableamount) as Sold, MA_MonthYears.month , MA_MonthYears.year , item
from MA_MonthYears left outer join MA_SaleDocDetail on year(MA_SaleDocDetail.DocumentDate) = MA_MonthYears.year
and Month(ma_saledocdetail.documentdate) = MA_MonthYears.Month
group by MA_SaleDocDetail.Item, MA_MonthYears.year , MA_MonthYears.month
order by MA_MonthYears.year , MA_MonthYears.month

SQL: selecting rows where column value changed last time

I need to get the latest date on which number changed. I have this SQL statement:
Select
a.[group], a.date, a.number
From
xx.dbo.list a
Where
a.[group] in ('10', '10NC', '210')
And a.date >= '2018-06-01'
And a.number > 0
And a.number <> (Select Top 1 b.number
From xx.dbo.list b
Where b.[group] = a.[group]
And b.date >= '2018-06-01'
And b.number > 0
And b.date < a.date
Order by b.date desc)
order by a.date desc
I have a table that looks like this
Group date Number
--------------------------
10 2018-02-06 4
10 2018-04-06 4
10 2018-06-12 4
10NC 2018-02-06 68
10NC 2018-04-06 35
10NC 2018-06-11 35
10NC 2018-06-12 68
10NC 2018-06-13 35
210 2018-06-02 94
210 2018-06-04 100
210 2018-06-06 100
210 2018-06-07 93
I get this output now, but I only want to get the rows with X
Group date Number
------------------------------
10NC 2018-06-12 68
10NC 2018-06-13 35 X
210 2018-06-04 100
210 2018-06-07 93 X
Can anyone help?
You would use lag():
select a.*
from (select a.[group], a.date, a.number, lag(a.number) over (partition by a.[group] order by a.date) as prev_number
From xx.dbo.list a
where a.[group] in ('10', '10NC', '210') And
a.date >= '2018-06-01' And
a.number > 0
) a
where prev_number <> number;
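The lag() query returns every row where the number changed. To keep only the latest change per group (the rows marked X in the question), one option is to rank the changes by date afterwards. Here is a minimal sketch with Python's built-in sqlite3 module (window functions need SQLite 3.25 or later). The column is named grp here because GROUP is a reserved word; that name and the unqualified table name list are assumptions of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE list (grp TEXT, date TEXT, number INTEGER)")
conn.executemany(
    "INSERT INTO list VALUES (?, ?, ?)",
    [
        ("10", "2018-02-06", 4), ("10", "2018-04-06", 4), ("10", "2018-06-12", 4),
        ("10NC", "2018-02-06", 68), ("10NC", "2018-04-06", 35),
        ("10NC", "2018-06-11", 35), ("10NC", "2018-06-12", 68),
        ("10NC", "2018-06-13", 35),
        ("210", "2018-06-02", 94), ("210", "2018-06-04", 100),
        ("210", "2018-06-06", 100), ("210", "2018-06-07", 93),
    ],
)

query = """
WITH changes AS (
    -- previous number within the same group, among the filtered rows
    SELECT grp, date, number,
           LAG(number) OVER (PARTITION BY grp ORDER BY date) AS prev_number
    FROM list
    WHERE grp IN ('10', '10NC', '210')
      AND date >= '2018-06-01'
      AND number > 0
),
ranked AS (
    -- keep only rows where the number changed, newest change first
    SELECT grp, date, number,
           ROW_NUMBER() OVER (PARTITION BY grp ORDER BY date DESC) AS rn
    FROM changes
    WHERE prev_number <> number
)
SELECT grp, date, number
FROM ranked
WHERE rn = 1
ORDER BY grp
"""

rows = conn.execute(query).fetchall()
print(rows)
```

Group 10 has only one row in the filtered period, so it has no change and does not appear in the result.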
Is this what is expected?
DECLARE @List TABLE ([Group] VARCHAR(100), [Date] DATE, Number INT)
INSERT INTO @List
SELECT '10','2018-02-06',4
UNION ALL
SELECT '10','2018-04-06',4
UNION ALL
SELECT '10','2018-06-12',4
UNION ALL
SELECT '10NC','2018-02-06',68
UNION ALL
SELECT '10NC','2018-04-06',35
UNION ALL
SELECT '10NC','2018-06-11',35
UNION ALL
SELECT '10NC','2018-06-12',68
UNION ALL
SELECT '10NC','2018-06-13',35
UNION ALL
SELECT '210','2018-06-02',94
UNION ALL
SELECT '210','2018-06-04',100
UNION ALL
SELECT '210','2018-06-06',100
UNION ALL
SELECT '210','2018-06-07',93
;WITH CTE AS
(
SELECT
*
,RN = ROW_NUMBER() OVER (Partition by [Group] ORDER BY [DATE] DESC)
FROM @List
WHERE
[Date] >= '2018-06-01'
AND [Group] in ('10', '10NC', '210')
And Number > 0
)
SELECT * FROM CTE WHERE RN = 1
Note: I am posting it directly in answer as i don't have enough reputation to ask questions in comments.

SQL - Find if column dates include at least partially a date range

I need to create a report and I am struggling with the SQL script.
The table I want to query is a company_status_history table which has entries like the following (the ones that I can't figure out)
Table company_status_history
Columns:
| id | company_id | status_id | effective_date |
Data:
| 1 | 10 | 1 | 2016-12-30 00:00:00.000 |
| 2 | 10 | 5 | 2017-02-04 00:00:00.000 |
| 3 | 11 | 5 | 2017-06-05 00:00:00.000 |
| 4 | 11 | 1 | 2018-04-30 00:00:00.000 |
I want to answer the question "Get all companies that have been in status 1 for at least some point inside the time period 01/01/2017 - 31/12/2017"
Above are the cases that I don't know how to handle, since I need to add some logic of this type:
"If this row is status 1 and its date is before the date range, check the next row to see if it has a date inside the date range."
"If this row is status 1 and its date is after the date range, check the row before to see if it has a date inside the date range."
I think this can be handled as a gaps and islands problem. Consider the following input data: (same as sample data of OP plus two additional rows)
id company_id status_id effective_date
-------------------------------------------
1 10 1 2016-12-15
2 10 1 2016-12-30
3 10 5 2017-02-04
4 10 4 2017-02-08
5 11 5 2017-06-05
6 11 1 2018-04-30
You can use the following query:
SELECT t.id, t.company_id, t.status_id, t.effective_date, x.cnt
FROM company_status_history AS t
OUTER APPLY
(
SELECT COUNT(*) AS cnt
FROM company_status_history AS c
WHERE c.status_id = 1
AND c.company_id = t.company_id
AND c.effective_date < t.effective_date
) AS x
ORDER BY company_id, effective_date
to get:
id company_id status_id effective_date cnt
-----------------------------------------------
1 10 1 2016-12-15 0
2 10 1 2016-12-30 1
3 10 5 2017-02-04 2
4 10 4 2017-02-08 2
5 11 5 2017-06-05 0
6 11 1 2018-04-30 0
Now you can identify status = 1 islands using:
;WITH CTE AS
(
SELECT t.id, t.company_id, t.status_id, t.effective_date, x.cnt
FROM company_status_history AS t
OUTER APPLY
(
SELECT COUNT(*) AS cnt
FROM company_status_history AS c
WHERE c.status_id = 1
AND c.company_id = t.company_id
AND c.effective_date < t.effective_date
) AS x
)
SELECT id, company_id, status_id, effective_date,
ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY effective_date) -
cnt AS grp
FROM CTE
Output:
id company_id status_id effective_date grp
-----------------------------------------------
1 10 1 2016-12-15 1
2 10 1 2016-12-30 1
3 10 5 2017-02-04 1
4 10 4 2017-02-08 2
5 11 5 2017-06-05 1
6 11 1 2018-04-30 2
Calculated field grp will help us identify those islands:
;WITH CTE AS
(
SELECT t.id, t.company_id, t.status_id, t.effective_date, x.cnt
FROM company_status_history AS t
OUTER APPLY
(
SELECT COUNT(*) AS cnt
FROM company_status_history AS c
WHERE c.status_id = 1
AND c.company_id = t.company_id
AND c.effective_date < t.effective_date
) AS x
), CTE2 AS
(
SELECT id, company_id, status_id, effective_date,
ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY effective_date) -
cnt AS grp
FROM CTE
)
SELECT company_id,
MIN(effective_date) AS start_date,
CASE
WHEN COUNT(*) > 1 THEN DATEADD(DAY, -1, MAX(effective_date))
ELSE MIN(effective_date)
END AS end_date
FROM CTE2
GROUP BY company_id, grp
HAVING COUNT(CASE WHEN status_id = 1 THEN 1 END) > 0
Output:
company_id start_date end_date
-----------------------------------
10 2016-12-15 2017-02-03
11 2018-04-30 2018-04-30
All you need now is those records from above that overlap with the specified interval.
Demo here with somewhat more complicated use case.
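The final overlap test can be sketched on its own. The example below does not use the gaps-and-islands query above; it swaps in a simpler technique, LEAD, to turn each row into an interval, under the assumption that a status stays in effect until the company's next row and that the last row is open ended. It uses Python's built-in sqlite3 module (SQLite 3.25 or later for window functions) with the four rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE company_status_history "
    "(id INTEGER, company_id INTEGER, status_id INTEGER, effective_date TEXT)"
)
conn.executemany(
    "INSERT INTO company_status_history VALUES (?, ?, ?, ?)",
    [
        (1, 10, 1, "2016-12-30"),
        (2, 10, 5, "2017-02-04"),
        (3, 11, 5, "2017-06-05"),
        (4, 11, 1, "2018-04-30"),
    ],
)

query = """
WITH intervals AS (
    -- each status is assumed to last until the company's next row
    SELECT company_id, status_id,
           effective_date AS start_date,
           LEAD(effective_date)
               OVER (PARTITION BY company_id ORDER BY effective_date) AS next_date
    FROM company_status_history
)
SELECT DISTINCT company_id
FROM intervals
WHERE status_id = 1
  AND start_date <= :range_end                            -- starts before the range ends
  AND (next_date IS NULL OR next_date > :range_start)     -- and ends after it starts
ORDER BY company_id
"""

params = {"range_start": "2017-01-01", "range_end": "2017-12-31"}
rows = conn.execute(query, params).fetchall()
print(rows)
```

Company 10 is in status 1 from 2016-12-30 until 2017-02-04, which overlaps 2017; company 11 only enters status 1 in 2018, so it is not returned.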
Maybe this is what you are looking for? For this kind of question you need to join two instances of your table. In this case I am just joining with the next record by Id, which is probably not totally correct. To do it better, you can create a new Id using a window function such as row_number, ordering the table by your required criteria.
If this row is status 1 and it's date is before the date range check
the next row if it has a date inside the date range
declare @range_st date = '2017-01-01'
declare @range_en date = '2017-12-31'
select
case
when csh1.status_id=1 and csh1.effective_date<@range_st
then
case
when csh2.effective_date between @range_st and @range_en then 1
else 0
end
else NULL
end
from company_status_history csh1
left join company_status_history csh2
on csh2.id=csh1.id+1 -- csh2 is the next record by id
Implementing second criteria:
"If this row is status 1 and it's date is after the date range check
the row before if it has a date inside the date range."
declare @range_st date = '2017-01-01'
declare @range_en date = '2017-12-31'
select
case
when csh1.status_id=1 and csh1.effective_date<@range_st
then
case
when csh2.effective_date between @range_st and @range_en then 1
else 0
end
when csh1.status_id=1 and csh1.effective_date>@range_en
then
case
when csh3.effective_date between @range_st and @range_en then 1
else 0
end
else null -- ¿?
end
from company_status_history csh1
left join company_status_history csh2
on csh2.id=csh1.id+1 -- csh2 is the next record by id
left join company_status_history csh3
on csh3.id=csh1.id-1 -- csh3 is the previous record by id
I would suggest using a CTE and the window function ROW_NUMBER. With this you can find the desired records. An example:
DECLARE @t TABLE(
id INT
,company_id INT
,status_id INT
,effective_date DATETIME
)
INSERT INTO @t VALUES
(1, 10, 1, '2016-12-30 00:00:00.000')
,(2, 10, 5, '2017-02-04 00:00:00.000')
,(3, 11, 5, '2017-06-05 00:00:00.000')
,(4, 11, 1, '2018-04-30 00:00:00.000')
DECLARE @StartDate DATETIME = '2017-01-01';
DECLARE @EndDate DATETIME = '2017-12-31';
WITH cte AS(
SELECT *
,ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY effective_date) AS rn
FROM @t
),
cteLeadLag AS(
SELECT c.*, ISNULL(c2.effective_date, c.effective_date) LagEffective, ISNULL(c3.effective_date, c.effective_date)LeadEffective
FROM cte c
LEFT JOIN cte c2 ON c2.company_id = c.company_id AND c2.rn = c.rn-1
LEFT JOIN cte c3 ON c3.company_id = c.company_id AND c3.rn = c.rn+1
)
SELECT 'Included' AS RangeStatus, *
FROM cteLeadLag
WHERE status_id = 1
AND effective_date BETWEEN @StartDate AND @EndDate
UNION ALL
SELECT 'Following' AS RangeStatus, *
FROM cteLeadLag
WHERE status_id = 1
AND effective_date > @EndDate
AND LagEffective BETWEEN @StartDate AND @EndDate
UNION ALL
SELECT 'Trailing' AS RangeStatus, *
FROM cteLeadLag
WHERE status_id = 1
AND effective_date < @StartDate -- before the range, so 'Included' rows are not repeated
AND LeadEffective BETWEEN @StartDate AND @EndDate
I first select all records with their leading and lagging Dates and then I perform your checks on the inclusion in the desired timespan.
Try with this, self-explanatory. Responds to this part of your question:
I want to answer to the question "Get all companies that have been at
least for some point in status 1 inside the time period 01/01/2017 -
31/12/2017"
If you want to find the companies that have been in status 1 at any moment and have records in the requested period:
SELECT *
FROM company_status_history
WHERE company_id IN
( SELECT company_id
FROM company_status_history
WHERE status_id=1 )
AND effective_date BETWEEN '2017-01-01' AND '2017-12-31'
If you want to find the rows that are in status 1 and inside the period:
SELECT *
FROM company_status_history
WHERE status_id=1
AND effective_date BETWEEN '2017-01-01' AND '2017-12-31'