How to insert values based on another column value - sql

Below is a subset of my table (for the first id):
date        id  value
01/01/2022  1   5
08/01/2022  1   2
For each id, the dates are not consecutive (e.g., for id 1, the min date is 01/01/2022 and the max date is 08/01/2022, with six missing days in between). I want to insert rows so that the dates for each id are contiguous, with the value column filled with 0 for the inserted rows, so that the updated table looks like:
date        id  value
01/01/2022  1   5
02/01/2022  1   0
03/01/2022  1   0
04/01/2022  1   0
05/01/2022  1   0
06/01/2022  1   0
07/01/2022  1   0
08/01/2022  1   2
Any SQL code on how to implement this would be highly appreciated. I have a calendar table but am unsure how to join it with the above table so that I fill in missing dates dynamically for each id with 0s.
My calendar table looks like:
date
01/01/2022
02/01/2022
03/01/2022
04/01/2022

Since you have a calendar table, you need to JOIN it to the MIN and MAX dates from your other table, and then LEFT JOIN back to your table:
WITH MinMax AS(
SELECT ID,
MIN(date) AS MinDate,
MAX(date) AS MaxDate
FROM dbo.YourTable
GROUP BY ID),
Dates AS(
SELECT MM.ID,
C.CalendarDate AS [Date]
FROM MinMax MM
JOIN dbo.CalendarTable C ON MM.MinDate <= C.CalendarDate
AND MM.MaxDate >= C.CalendarDate)
SELECT D.ID,
D.[Date],
ISNULL(YT.[Value],0) AS [Value]
FROM Dates D
LEFT JOIN dbo.YourTable YT ON D.ID = YT.ID
AND D.[Date] = YT.[Date];
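The query above only returns the filled-in rows. If the missing rows should actually be inserted into the table, as the question asks, one possible sketch follows. It reuses the assumed names from this answer (dbo.YourTable with ID, [Date], [Value], and dbo.CalendarTable with CalendarDate) and has not been run against the asker's schema.

```sql
-- Assumed names: dbo.YourTable(ID, [Date], [Value]), dbo.CalendarTable(CalendarDate)
WITH MinMax AS (
    -- First and last date per ID
    SELECT ID,
           MIN([Date]) AS MinDate,
           MAX([Date]) AS MaxDate
    FROM dbo.YourTable
    GROUP BY ID
)
INSERT INTO dbo.YourTable (ID, [Date], [Value])
SELECT MM.ID,
       C.CalendarDate,
       0                      -- inserted rows get a value of 0
FROM MinMax MM
JOIN dbo.CalendarTable C
    ON C.CalendarDate BETWEEN MM.MinDate AND MM.MaxDate
WHERE NOT EXISTS (            -- skip dates that already have a row for this ID
    SELECT 1
    FROM dbo.YourTable YT
    WHERE YT.ID = MM.ID
      AND YT.[Date] = C.CalendarDate
);
```

Running it a second time inserts nothing, because the NOT EXISTS check then finds every date already present.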

SET DATEFORMAT DMY
-- CREATE A TABLE WITH OUR INPUT DATA
DROP TABLE IF EXISTS #TheData
GO
CREATE TABLE #TheData
(TheDate DATE, id INT, TheValue INT)
INSERT INTO #TheData
(TheDate,id,Thevalue)
VALUES
('01/01/2022',1,5),
('08/01/2022',1,2),
('17/01/2022',2,7),
('25/01/2022',2,7),
('15/02/2022',2,7)
-- CREATE A CALENDAR CTE
DECLARE @StartDate date = '20210101';
DECLARE @CutoffDate date = DATEADD(DAY, -1, DATEADD(YEAR, 2, @StartDate));
;WITH DateSeq(TheDate) AS
(
SELECT @StartDate
UNION ALL
SELECT DATEADD(dd,1,TheDate) FROM DateSeq
WHERE TheDate < @CutoffDate
)
-- CROSS JOIN OUR CALENDAR CTE TO OUR SOURCE DATA. DERIVED TABLE TO GET FIRST AND LAST OF EACH RANGE TO USE FOR JOIN
SELECT
ds.*
,SourceDataRangesByID.ID
,ISNULL(td.TheValue,0) AS TheValue
FROM
DateSeq ds
CROSS JOIN
(
SELECT
d.ID
,MIN(d.TheDate) AS MinDatePerID
,MAX(d.TheDate) AS MaxDatePerID
FROM #TheData d
GROUP BY d.ID
) SourceDataRangesByID
LEFT JOIN #TheData td ON td.id = SourceDataRangesByID.ID AND td.TheDate = ds.TheDate
WHERE ds.TheDate >= SourceDataRangesByID.MinDatePerID
AND ds.TheDate <= SourceDataRangesByID.MaxDatePerID
OPTION (MAXRECURSION 0);

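Try generate_series to create a date table, then right join to it and use coalesce to replace the NULLs for missing dates (PostgreSQL syntax):
SELECT daily_series.day,
COALESCE(t.value, 0) AS value
FROM mytable t
RIGHT JOIN (SELECT generate_series('2016-01-01', -- series start date
'2018-06-30', -- series end date
'1 day'::interval)::date AS day) AS daily_series
ON t.date = daily_series.day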
See Generate_Series for TSQL
https://dba.stackexchange.com/questions/255165/does-ms-sql-server-have-generate-series-function
(Sql server 2022)
https://learn.microsoft.com/en-us/sql/t-sql/functions/generate-series-transact-sql?view=sql-server-ver16
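As a rough illustration of the SQL Server 2022 function linked above, the per-id gap fill could be sketched as follows. The table and column names (dbo.YourTable with ID, [Date], [Value]) are assumptions carried over from the question, [Date] is assumed to be of type date, and the database must be at compatibility level 160 or higher.

```sql
-- Assumed: dbo.YourTable(ID, [Date] date, [Value] int); SQL Server 2022+
SELECT r.ID,
       DATEADD(DAY, s.value, r.MinDate) AS [Date],
       ISNULL(t.[Value], 0)             AS [Value]
FROM (
    -- First and last date per ID
    SELECT ID,
           MIN([Date]) AS MinDate,
           MAX([Date]) AS MaxDate
    FROM dbo.YourTable
    GROUP BY ID
) AS r
-- One row per day offset: 0 .. number of days between MinDate and MaxDate
CROSS APPLY GENERATE_SERIES(0, DATEDIFF(DAY, r.MinDate, r.MaxDate)) AS s
LEFT JOIN dbo.YourTable AS t
       ON t.ID = r.ID
      AND t.[Date] = DATEADD(DAY, s.value, r.MinDate)
ORDER BY r.ID, [Date];
```

GENERATE_SERIES returns a single column named value, which is used here as a day offset, so no calendar table is needed.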

Related

Convert Dates range column into months

I have a table called students with three columns: ID, DateFrom, and DateTo. I want to convert the date range between DateFrom and DateTo into a single Months column, with one row per month, as output.
The table structure is as below. Date format (yyyy-mm-dd)
ID DateFrom DateTo
123 2019-12-03 2020-02-03
456 2020-02-03 2020-02-21
Output Structures
ID Months
123 2019-12
123 2020-01
123 2020-02
456 2020-02
The easy way to do it would be to use an auxiliary calendar table - that would make a simple join between both tables:
select distinct a.id, convert(char(7), b.[Date], 120) as month
from yourTable as a
join Calendar as b
on a.DateFrom <= b.[Date]
and a.DateTo >= b.[Date]
;
If you don't have a calendar table, you might want to consider adding one.
If you can't (or don't want to) add a calendar table, you can get the same result using a numbers (tally) table.
If you don't have a tally table you can generate one on the fly, using a common table expression.
Here's what your SQL would look like in that case:
WITH N10 AS
(
SELECT n
FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9))V(n)
), Tally AS
(
SELECT TOP (SELECT MAX(DATEDIFF(MONTH, DateFrom, DateTo)) + 1 FROM yourTable) ROW_NUMBER() OVER(ORDER BY @@SPID) -1 As n
FROM N10 As Ten
CROSS JOIN N10 As Hundred
--CROSS JOIN N10 As Thousand
)
SELECT [ID], CONVERT(char(7), Months, 120) As Months
FROM yourTable
CROSS APPLY
(
SELECT N, DATEADD(MONTH, N, [DateFrom]) As Months
FROM Tally
) X
WHERE Months <= EOMONTH([DateTo])
ORDER BY Id, Months
Let's say that your table is called "dataval".
Try this:
;WITH minmax AS (
SELECT Min(Eomonth(datefrom)) AS MinDate,
Max(Eomonth(dateto)) AS MaxDate
FROM dataval),
months (date) AS (
SELECT mindate
FROM minmax
UNION ALL
SELECT Eomonth(Dateadd(month, 1, date)) -- re-apply Eomonth: Dateadd alone turns Feb 29 into Mar 29, not Mar 31
FROM months
WHERE Eomonth(Dateadd(month, 1, date)) <= (SELECT maxdate FROM minmax))
SELECT ID, FORMAT (date, 'yyyy-MM') Months
FROM dataval
INNER JOIN months
ON months.date BETWEEN Eomonth(dataval.datefrom) AND Eomonth(dataval.dateto)
fiddle here

Generate data between two range date by some values

I have 2 dates, StartDate and EndDate:
Declare @StartDate date='2018/01/01', @Enddate date ='2018/12/31'
Then there is some data with a date and value in a mytable table:
----------------------------
ID date value
----------------------------
1 2018/02/14 4
2 2018/09/26 7
3 2017/09/20 2
The data may start before 2018. If a row exists before @StartDate, the first range should take that row's value; otherwise it should get 0.
I'm looking to get a result that looks like this:
-----------------------------------
fromdate todate value
-----------------------------------
2018/01/01 2018/02/13 2
2018/02/14 2018/09/25 4
2018/09/26 2018/12/31 7
The first fromdate comes from @StartDate and the last todate is from @Enddate, and the other data should be generated.
I'm hoping to get this in an SQL query. I use sql-server 2016
You could use a CTE to create your full range of dates, and then LEAD to create the ToDate column:
DECLARE @FromDate date = '20180101',
@ToDate date = '20181231';
WITH VTE AS(
SELECT ID,
CONVERT(date,[date]) [date], --This is why using keywords for column names is a bad idea
[value]
FROM (VALUES(1,'20180214',4),
(2,'20180926',7),
(3,'20170314',4))V(ID,[date],[value])),
Dates AS(
SELECT [date]
FROM VTE V
WHERE V.[date] BETWEEN @FromDate and @ToDate
UNION ALL
SELECT [date]
FROM (VALUES(@FromDate))V([date]))
SELECT D.[date] AS FromDate,
LEAD(DATEADD(DAY, -1,D.[date]),1,@ToDate) OVER (ORDER BY D.[date]) AS ToDate,
ISNULL(V.[value],0) AS [value]
FROM Dates D
LEFT JOIN VTE V ON D.[date] = V.[date];
db<>fiddle
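The query above joins on an exact date match, so the opening range gets 0 even when an earlier row exists. A possible variation that carries forward the latest value on or before the start date is sketched below. It assumes the question's mytable has a [date] column of type date and an integer [value] column, and it has not been tested against the asker's data.

```sql
DECLARE @FromDate date = '20180101',
        @ToDate   date = '20181231';

WITH Starts AS (
    -- Opening row: value of the latest entry on or before @FromDate, or 0 if there is none
    SELECT @FromDate AS FromDate,
           ISNULL((SELECT TOP (1) [value]
                   FROM mytable
                   WHERE [date] <= @FromDate
                   ORDER BY [date] DESC), 0) AS [value]
    UNION ALL
    -- Every later entry inside the window starts a new range
    SELECT [date], [value]
    FROM mytable
    WHERE [date] > @FromDate
      AND [date] <= @ToDate
)
SELECT FromDate,
       -- A range ends the day before the next one starts; the last one ends at @ToDate
       LEAD(DATEADD(DAY, -1, FromDate), 1, @ToDate) OVER (ORDER BY FromDate) AS ToDate,
       [value]
FROM Starts
ORDER BY FromDate;
```

With the three sample rows from the question, the opening range picks up the value 2 from the 2017/09/20 row.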
with cte as
(
select 0 as row_num, @StartDate as start_date, 0 as val
UNION
select ROW_NUMBER() OVER(ORDER BY start_date) as row_num, * from input
)
select curr.start_date
, DATEADD(day,-1,ISNULL(nex.start_date,DATEADD(day,1,@Enddate))) as end_date
, curr.val
from cte curr
left join cte nex on curr.row_num = nex.row_num - 1;
You can find the simulation here: https://rextester.com/EIAXW23839

Return row with 0 for dates which has no entry in table - SQL

I have a table that records daily sales data. However, there are days when no sale is made, so there is no record in the database for those dates. Is it possible to extract data from the table so that it returns null for the dates when no sale was made?
Referring to the image attached, it is seen there is no sales done on Jan 4 and Jan 8. I would like to write a SQL query that would return all dates from Jan 1 - Jan 10 but for Jan 4 and Jan 8, it should return 0 since there is no row for those dates (no sale done)
My date starts from Mar 1, 2018 and should go on for the next few quarters.
Yes. In Postgres, you can use generate_series() to generate dates or numbers within a range.
Then, you can use a cross join to generate the rows and then a left join to bring in the data:
select s.seller, gs.dte, t.count
from (select generate_series(mindate::timestamp, maxdate::timestamp, interval '1 day')::date
from (select min(date) as mindate, max(date) as maxdate
from t
) x
) gs(dte) cross join
(select distinct seller from t) s left join
t
on t.date = gs.dte and t.seller = s.seller
A recursive CTE is also an alternative here:
DECLARE @FDATE DATE = '2018-01-01'
,@TDATE DATE = '2018-01-10'
;WITH CTE_DATE
AS (
SELECT @FDATE AS CDATE
UNION ALL
SELECT DATEADD(DAY,1,CDATE)
FROM CTE_DATE
WHERE DATEADD(DAY,1,CDATE) <= @TDATE
)
SELECT C.CDATE AS [DATE],COUNT(M.[DATE]) AS [COUNT] -- count matched rows only, so dates with no sale return 0
FROM CTE_DATE AS C
LEFT OUTER JOIN [MY_TABLE] AS M ON C.CDATE = M.[DATE] --*[your table here]*
GROUP BY C.CDATE
OPTION ( MAXRECURSION 0 );

SELECT DateTime not in SQL

I have the following table:
oDateTime pvalue
2017-06-01 00:00:00 70
2017-06-01 01:00:00 65
2017-06-01 02:00:00 90
ff.
2017-08-01 08:00:00 98
The oDateTime field holds hourly data, so it cannot contain duplicate values.
My question is: how can I check that the oDateTime data is complete? I need to make sure there are no jumps in the data; it should always be on an hourly basis.
Am I missing a date? Am I missing a time?
Please advise. Thank you.
Based on this answer, you can get the missing times from your table MyLogTable like this:
DECLARE @StartDate DATETIME = '20170601', @EndDate DATETIME = '20170801'
SELECT DATEADD(hour, nbr - 1, @StartDate)
FROM ( SELECT ROW_NUMBER() OVER ( ORDER BY c.object_id ) AS Nbr
FROM sys.columns c
) nbrs
WHERE nbr - 1 <= DATEDIFF(hour, @StartDate, @EndDate) AND
NOT EXISTS (SELECT 1 FROM MyLogTable WHERE DATEADD(hour, nbr - 1, @StartDate)= oDateTime )
If you need to check a longer period, you can add a CROSS JOIN like this:
FROM sys.columns c
CROSS JOIN sys.columns c1
This lets you check far more than the roughly one thousand rows (the row count of sys.columns) in one query.
Since your table has no unique id, use row_number() in a CTE to number the rows, then self-join each row to the next one on that row number and take the difference of their oDateTime values. This shows exactly which rows are not one hour apart from the next.
;with cte(oDateTime,pValue,Rid)
As
(
select *,row_number() over(order by oDateTime) from [YourTableName] t1
)
select *,datediff(HH,c1.oDateTime,c2.oDateTime) as HourDiff from cte c1
inner join cte c2
on c1.Rid=c2.Rid-1 where datediff(HH,c1.oDateTime,c2.oDateTime) >1
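On SQL Server 2012 or later, the same row-to-next-row comparison can probably be written without the self join by using LAG. This is only a sketch under that version assumption; [YourTableName] is the placeholder name from the answer above.

```sql
-- LAG requires SQL Server 2012 or later.
SELECT g.prev_oDateTime,
       g.oDateTime,
       DATEDIFF(HOUR, g.prev_oDateTime, g.oDateTime) AS HourDiff
FROM (
    -- Pair each row with the timestamp of the row before it
    SELECT oDateTime,
           LAG(oDateTime) OVER (ORDER BY oDateTime) AS prev_oDateTime
    FROM [YourTableName]
) AS g
-- Keep only pairs that are more than one hour apart;
-- the first row has no predecessor (NULL) and is filtered out
WHERE DATEDIFF(HOUR, g.prev_oDateTime, g.oDateTime) > 1;
```

Each returned row marks the end of a gap: the missing hours lie between prev_oDateTime and oDateTime.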
You could use DENSE_RANK() for numbering the hours in a day from 1 to 24. Then all you have to do is to check whether the max rank is 24 or not for a day. If there is at least one entry for each hour, then dense ranking will have a max value of 24.
Use the following query to find the dates that have a missing oDateTime.
SELECT [date]
FROM
(
SELECT *
, CAST(oDateTime AS DATE) AS [date]
, DENSE_RANK() OVER(PARTITION BY CAST(oDateTime AS DATE) ORDER BY DATEPART(HOUR, oDateTime)) AS rank_num
FROM Test
) AS t
GROUP BY [date]
HAVING(MAX(rank_num) != 24);
If you need validation for each row of oDateTime, you could do self join based on rank and get the missing hour for each oDateTime.
Perhaps you are looking for this? This will return dates having count < 24 - which indicates a "jump"
;WITH datecount
AS ( SELECT CAST(oDateTime AS DATE) AS [date] ,
COUNT(CAST(oDateTime AS DATE)) AS [count]
FROM #temp
GROUP BY ( CAST(oDateTime AS DATE) )
)
SELECT *
FROM datecount
WHERE [count] < 24;
EDIT: Since you changed the requirement from "How to know if there is missing" to "What is the missing", here's an updated query.
DECLARE @calendar AS TABLE ( oDateTime DATETIME )
DECLARE @min DATETIME = (SELECT MIN([oDateTime]) FROM #yourTable)
DECLARE @max DATETIME = (SELECT MAX([oDateTime]) FROM #yourTable)
WHILE ( @min <= @max )
BEGIN
INSERT INTO @calendar
VALUES ( @min );
SET @min = DATEADD(hh, 1, @min);
END;
SELECT t1.[oDateTime]
FROM @calendar t1
LEFT JOIN #yourTable t2 ON t1.[oDateTime] = t2.[oDateTime]
GROUP BY t1.[oDateTime]
HAVING COUNT(t2.[oDateTime]) = 0;
I first created an hourly calendar based on your MAX and MIN Datetime, then compared your actual table to the calendar to find out if there is a "jump".

Select rows where value is equal given value or lower and nearest to it

Sorry for the confusing title. Please tell me whether this is possible with a database query. Assume we have the following table:
ind_id name value date
----------- -------------------- ----------- ----------
1 a 10 2010-01-01
1 a 20 2010-01-02
1 a 30 2010-01-03
2 b 10 2010-01-01
2 b 20 2010-01-02
2 b 30 2010-01-03
2 b 40 2010-01-04
3 c 10 2010-01-01
3 c 20 2010-01-02
3 c 30 2010-01-03
3 c 40 2010-01-04
3 c 50 2010-01-05
4 d 10 2010-01-05
I need to query all rows so that each ind_id appears once for the given date. If there is no row for an ind_id on the given date, take the nearest lower date; if there are no lower dates at all, return ind_id + name (name/ind_id pairs are equal) with nulls.
For example, date is 2010-01-04, I expect following result:
ind_id name value date
----------- -------------------- ----------- ----------
1 a 30 2010-01-03
2 b 40 2010-01-04
3 c 40 2010-01-04
4 d NULL NULL
If it's possible, I'll be very grateful if someone help me with building query. I'm using SQL server 2008.
Check this SQL FIDDLE DEMO
with CTE_test
as
(
select ind_id,
max(date) MaxDate
from test
where date<='2010-01-04 00:00:00:000'
group by ind_id
)
select A.ind_id, A.[Value], A.[Date]
from test A
inner join CTE_test B
on a.ind_id=b.ind_id
and a.date = b.Maxdate
union all
select distinct ind_id, null, null
from test
where ind_id not in (select ind_id from CTE_test)
(Updated) Try:
with cte as
(select m.*,
max(date) over (partition by ind_id) max_date,
max(case when date <= @date then date end) over
(partition by ind_id) max_acc_date
from myTable m)
select ind_id,
name,
case when max_acc_date is null then null else value end value,
max_acc_date date
from cte c
where date = coalesce(max_acc_date, max_date)
(SQLFiddle here)
Here is a query that returns the result that you are looking for:
SELECT
t1.ind_id
, CASE WHEN t1.date <= '2010-01-04' THEN t1.value ELSE null END
FROM test t1
WHERE t1.date=COALESCE(
(SELECT MAX(DATE)
FROM test t2
WHERE t2.ind_id=t1.ind_id AND t2.date <= '2010-01-04')
, t1.date)
The idea is to pick a row in a correlated query such that its ID matches that of the current row, and the date is the highest one prior to your target date of '2010-01-04'.
When such row does not exist, the date for the current row is returned. This date needs to be replaced with a null; this is what the CASE statement at the top is doing.
Here is a demo on sqlfiddle.
You can use something like:
declare @date date = '2010-01-04'
;with ids as
(
select distinct ind_id
from myTable
)
,ranks as
(
select *
, ranking = row_number() over (partition by ind_id order by date desc)
from myTable
where date <= @date
)
select ids.ind_id
, ranks.value
, ranks.date
from ids
left join ranks on ids.ind_id = ranks.ind_id and ranks.ranking = 1
SQL Fiddle with demo.
Ideally you wouldn't be using the DISTINCT statement to get the ind_id values to include, but I've used it in this case to get the results you needed.
Also, standard disclaimer for these sorts of queries; if you have duplicate data you should consider a tie-breaker column in the ORDER BY or using RANK instead of ROW_NUMBER.
Edited after OP's update
Just add the new column into the existing query:
with ids as
(
select distinct ind_id, name
from myTable
)
,ranks as
(
select *
, ranking = row_number() over (partition by ind_id order by date desc)
from myTable
where date <= @date
)
select ids.ind_id
, ids.name
, ranks.value
, ranks.date
from ids
left join ranks on ids.ind_id = ranks.ind_id and ranks.ranking = 1
SQL Fiddle with demo.
As with the previous one it would be best to get the ind_id/name information through joining to a standing data table if available.
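Another way to express the same "latest row on or before the date" lookup is OUTER APPLY with TOP (1), which is available in SQL Server 2008. This is a sketch only, using the myTable name and columns assumed in the queries above; it was not run against the asker's data.

```sql
DECLARE @date date = '2010-01-04';

SELECT ids.ind_id,
       ids.name,
       last_row.value,
       last_row.date
FROM (
    -- One row per ind_id/name pair
    SELECT DISTINCT ind_id, name
    FROM myTable
) AS ids
-- OUTER APPLY keeps ids with no qualifying row; their value and date come back as NULL
OUTER APPLY (
    SELECT TOP (1) t.value, t.date
    FROM myTable AS t
    WHERE t.ind_id = ids.ind_id
      AND t.date <= @date
    ORDER BY t.date DESC
) AS last_row;
```

With an index on (ind_id, date), each TOP (1) lookup can typically be satisfied by a short index seek rather than by ranking every row.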
Try
DECLARE @date DATETIME;
SET @date = '2010-01-04';
WITH temp1 AS
(
SELECT t.ind_id
, t.name
, CASE WHEN t.date <= @date THEN t.value ELSE NULL END AS value
, CASE WHEN t.date <= @date THEN t.date ELSE NULL END AS date
FROM test1 AS t
),
temp AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY ind_id ORDER BY t.date DESC) AS rn
FROM temp1 AS t
WHERE t.date <= @date OR t.date IS NULL
)
SELECT *
FROM temp AS t
WHERE rn = 1
Use option with EXISTS operator
DECLARE @date date = '20100104'
SELECT ind_id,
CASE WHEN date <= @date THEN value END AS value,
CASE WHEN date <= @date THEN date END AS date
FROM dbo.test57 t
WHERE EXISTS (
SELECT 1
FROM dbo.test57 t2
WHERE t.ind_id = t2.ind_id AND t2.date <= @date
HAVING ISNULL(MAX(t2.date), t.date) = t.date
)
Demo on SQLFiddle
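This is not the exact answer, but it will give you the concept; I wrote it down quickly without any testing. Note that it branches once for the whole table, not per ind_id.
DECLARE @date date = '2010-01-04';
IF EXISTS (SELECT 1 FROM myTable WHERE date = @date)
-- your code to get the matching rows
SELECT ind_id, name, value, date FROM myTable WHERE date = @date;
ELSE IF EXISTS (SELECT 1 FROM myTable WHERE date < @date)
-- your query to get the rows for the nearest lower date
SELECT ind_id, name, value, date FROM myTable
WHERE date = (SELECT MAX(date) FROM myTable WHERE date < @date);
ELSE
-- your query with NULL values
SELECT DISTINCT ind_id, name, NULL AS value, NULL AS date FROM myTable;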