Select rows where value is equal given value or lower and nearest to it - sql

Sorry for the confusing title. Please tell me whether this is possible with a database query. Assume we have the following table:
ind_id name value date
----------- -------------------- ----------- ----------
1 a 10 2010-01-01
1 a 20 2010-01-02
1 a 30 2010-01-03
2 b 10 2010-01-01
2 b 20 2010-01-02
2 b 30 2010-01-03
2 b 40 2010-01-04
3 c 10 2010-01-01
3 c 20 2010-01-02
3 c 30 2010-01-03
3 c 40 2010-01-04
3 c 50 2010-01-05
4 d 10 2010-01-05
I need a query that returns each ind_id once for the given date. If an ind_id has no row for the given date, take its nearest earlier date; if it has no earlier dates either, return ind_id + name (each ind_id always has the same name) with NULLs.
For example, if the date is 2010-01-04, I expect the following result:
ind_id name value date
----------- -------------------- ----------- ----------
1 a 30 2010-01-03
2 b 40 2010-01-04
3 c 40 2010-01-04
4 d NULL NULL
If this is possible, I'd be very grateful if someone could help me build the query. I'm using SQL Server 2008.

Check this SQL FIDDLE DEMO
with CTE_test
as
(
select ind_id,
max(date) MaxDate
from test
where date<='2010-01-04 00:00:00:000'
group by ind_id
)
select A.ind_id, A.[Value], A.[Date]
from test A
inner join CTE_test B
on A.ind_id=B.ind_id
and A.date = B.MaxDate
union all
-- distinct: an id with several later rows should still appear only once
select distinct ind_id, null, null
from test
where ind_id not in (select ind_id from CTE_test)

(Updated) Try:
with cte as
(select m.*,
max(date) over (partition by ind_id) max_date,
max(case when date <= @date then date end) over
(partition by ind_id) max_acc_date
from myTable m)
select ind_id,
name,
case when max_acc_date is null then null else value end value,
max_acc_date date
from cte c
where date = coalesce(max_acc_date, max_date)
(SQLFiddle here)

Here is a query that returns the result that you are looking for:
SELECT
t1.ind_id
, CASE WHEN t1.date <= '2010-01-04' THEN t1.value ELSE null END
FROM test t1
WHERE t1.date=COALESCE(
(SELECT MAX(DATE)
FROM test t2
WHERE t2.ind_id=t1.ind_id AND t2.date <= '2010-01-04')
, t1.date)
The idea is to pick a row in a correlated subquery such that its ID matches that of the current row and its date is the latest one on or before your target date of '2010-01-04'.
When no such row exists, the date of the current row is returned instead. The value for that row needs to be replaced with a null; this is what the CASE expression at the top is doing.
Here is a demo on sqlfiddle.
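To see the COALESCE fallback concretely, here is a minimal sketch that runs the same correlated-subquery pattern against the sample rows from the question. It uses Python's built-in sqlite3 module rather than SQL Server, purely so it can be run anywhere; that choice, the name column in the select list, and the second CASE for the date are assumptions added for illustration.

```python
import sqlite3

# Sample rows copied from the question; the table name "test" follows the answer.
rows = [
    (1, "a", 10, "2010-01-01"), (1, "a", 20, "2010-01-02"), (1, "a", 30, "2010-01-03"),
    (2, "b", 10, "2010-01-01"), (2, "b", 20, "2010-01-02"), (2, "b", 30, "2010-01-03"),
    (2, "b", 40, "2010-01-04"),
    (3, "c", 10, "2010-01-01"), (3, "c", 20, "2010-01-02"), (3, "c", 30, "2010-01-03"),
    (3, "c", 40, "2010-01-04"), (3, "c", 50, "2010-01-05"),
    (4, "d", 10, "2010-01-05"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (ind_id INTEGER, name TEXT, value INTEGER, date TEXT)")
conn.executemany("INSERT INTO test VALUES (?, ?, ?, ?)", rows)

# ISO-formatted date strings compare correctly as text in SQLite.
QUERY = """
SELECT t1.ind_id,
       t1.name,
       CASE WHEN t1.date <= :d THEN t1.value END AS value,
       CASE WHEN t1.date <= :d THEN t1.date  END AS date
FROM test t1
WHERE t1.date = COALESCE(
        (SELECT MAX(t2.date)
         FROM test t2
         WHERE t2.ind_id = t1.ind_id AND t2.date <= :d),
        t1.date)   -- no earlier row: fall back to the row's own date
ORDER BY t1.ind_id
"""

result = conn.execute(QUERY, {"d": "2010-01-04"}).fetchall()
for row in result:
    print(row)
```

One thing to keep in mind with this pattern: an ind_id whose rows are all later than the target date matches the fallback once per row, so it would come back several times. The sample data has only one such row (ind_id 4), so the effect does not show up here.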

You can use something like:
declare @date date = '2010-01-04'
;with ids as
(
select distinct ind_id
from myTable
)
,ranks as
(
select *
, ranking = row_number() over (partition by ind_id order by date desc)
from myTable
where date <= @date
)
select ids.ind_id
, ranks.value
, ranks.date
from ids
left join ranks on ids.ind_id = ranks.ind_id and ranks.ranking = 1
SQL Fiddle with demo.
Ideally you wouldn't be using the DISTINCT statement to get the ind_id values to include, but I've used it in this case to get the results you needed.
Also, standard disclaimer for these sorts of queries; if you have duplicate data you should consider a tie-breaker column in the ORDER BY or using RANK instead of ROW_NUMBER.
Edited after OP's update
Just add the new column into the existing query:
with ids as
(
select distinct ind_id, name
from myTable
)
,ranks as
(
select *
, ranking = row_number() over (partition by ind_id order by date desc)
from myTable
where date <= @date
)
select ids.ind_id
, ids.name
, ranks.value
, ranks.date
from ids
left join ranks on ids.ind_id = ranks.ind_id and ranks.ranking = 1
SQL Fiddle with demo.
As with the previous one it would be best to get the ind_id/name information through joining to a standing data table if available.
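The reason the ranking = 1 test sits in the ON clause rather than in a WHERE clause can be shown with a small sketch. It runs the same ids/ranks shape on a cut-down copy of the sample data through Python's sqlite3 module (an assumption made only to keep the example runnable; window functions need SQLite 3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (ind_id INTEGER, name TEXT, value INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?, ?)",
    [
        (1, "a", 20, "2010-01-02"), (1, "a", 30, "2010-01-03"),
        (2, "b", 30, "2010-01-03"), (2, "b", 40, "2010-01-04"),
        (4, "d", 10, "2010-01-05"),   # only a later date: must survive with NULLs
    ],
)

CTES = """
WITH ids AS (
    SELECT DISTINCT ind_id, name FROM myTable
),
ranks AS (
    SELECT ind_id, value, date,
           ROW_NUMBER() OVER (PARTITION BY ind_id ORDER BY date DESC) AS ranking
    FROM myTable
    WHERE date <= :d
)
SELECT ids.ind_id, ids.name, ranks.value, ranks.date
FROM ids
"""

# Filter inside the join condition: ids without a match are kept.
in_on = conn.execute(
    CTES + "LEFT JOIN ranks ON ids.ind_id = ranks.ind_id AND ranks.ranking = 1 "
           "ORDER BY ids.ind_id",
    {"d": "2010-01-04"},
).fetchall()

# Filter in WHERE: the NULL ranking of unmatched ids fails the test, so they vanish.
in_where = conn.execute(
    CTES + "LEFT JOIN ranks ON ids.ind_id = ranks.ind_id "
           "WHERE ranks.ranking = 1 ORDER BY ids.ind_id",
    {"d": "2010-01-04"},
).fetchall()

print(in_on)
print(in_where)
```

With the filter in the ON clause, ind_id 4 is returned with NULL value and date; moving it to WHERE effectively turns the LEFT JOIN into an inner join.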

Try
DECLARE @date DATETIME;
SET @date = '2010-01-04';
WITH temp1 AS
(
SELECT t.ind_id
, t.name
, CASE WHEN t.date <= @date THEN t.value ELSE NULL END AS value
, CASE WHEN t.date <= @date THEN t.date ELSE NULL END AS date
FROM test1 AS t
),
temp AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY ind_id ORDER BY t.date DESC) AS rn
FROM temp1 AS t
WHERE t.date <= @date OR t.date IS NULL
)
SELECT *
FROM temp AS t
WHERE rn = 1

Use option with EXISTS operator
DECLARE @date date = '20100104'
SELECT ind_id,
CASE WHEN date <= @date THEN value END AS value,
CASE WHEN date <= @date THEN date END AS date
FROM dbo.test57 t
WHERE EXISTS (
SELECT 1
FROM dbo.test57 t2
WHERE t.ind_id = t2.ind_id AND t2.date <= @date
HAVING ISNULL(MAX(t2.date), t.date) = t.date
)
Demo on SQLFiddle

This is not an exact answer, but it should give you the idea; I just wrote it down quickly without any testing.
use
go
if
(Select value from table where col=@col1) is not null
-- your code to get the matching value
else if
(Select MAX(Date) from table where Date < @date) is not null
-- your query to get the nearest earlier date record
else
-- your query with null values

Related

How to insert values based on another column value

Below is a subset of my table (for the first id)
date        id  value
01/01/2022  1   5
08/01/2022  1   2
For each id, the dates are not consecutive (e.g., for id 1, the min date is 01/01/2022 and the max date is 08/01/2022)--there are 7 days in between both dates. I want to insert rows to make the dates for each id consecutive and contiguous - the value for the value field/column to be filled with 0s so that the updated table looks like:
date        id  value
01/01/2022  1   5
02/01/2022  1   0
03/01/2022  1   0
04/01/2022  1   0
05/01/2022  1   0
06/01/2022  1   0
07/01/2022  1   0
08/01/2022  1   2
Any SQL code on how to implement this would be highly appreciated. I have a calendar table but am unsure how to join it with the above table so that I fill in missing dates dynamically for each id with 0s.
My calendar table looks like:
date
01/01/2022
02/01/2022
03/01/2022
04/01/2022
Considering you state you have a calendar table, it seems what you need to do is JOIN to it using the MIN and MAX dates from your other table, and then LEFT JOIN back to your table:
WITH MinMax AS(
SELECT ID,
MIN(date) AS MinDate,
MAX(date) AS MaxDate
FROM dbo.YourTable
GROUP BY ID),
Dates AS(
SELECT MM.ID,
C.CalendarDate AS [Date]
FROM MinMax MM
JOIN dbo.CalendarTable C ON MM.MinDate <= C.CalendarDate
AND MM.MaxDate >= C.CalendarDate)
SELECT D.ID,
D.[Date],
ISNULL(YT.[Value],0) AS [Value]
FROM Dates D
LEFT JOIN dbo.YourTable YT ON D.ID = YT.ID
AND D.[Date] = YT.[Date];
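As a rough check of the MinMax / Dates / LEFT JOIN shape, the sketch below runs it on the two rows from the question. It uses Python's sqlite3 module instead of SQL Server, ISO-formatted dates, a lowercase date column, and COALESCE in place of ISNULL; all of those are assumptions made to keep the example self-contained, and the calendar contents are generated for illustration.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (date TEXT, id INTEGER, value INTEGER)")
conn.execute("CREATE TABLE CalendarTable (CalendarDate TEXT)")
conn.executemany("INSERT INTO YourTable VALUES (?, ?, ?)",
                 [("2022-01-01", 1, 5), ("2022-01-08", 1, 2)])

# Hypothetical calendar covering January 2022, one row per day.
start = date(2022, 1, 1)
conn.executemany("INSERT INTO CalendarTable VALUES (?)",
                 [((start + timedelta(days=i)).isoformat(),) for i in range(31)])

QUERY = """
WITH MinMax AS (
    SELECT id, MIN(date) AS MinDate, MAX(date) AS MaxDate
    FROM YourTable
    GROUP BY id
),
Dates AS (
    -- every calendar day between each id's first and last date
    SELECT MM.id, C.CalendarDate AS date
    FROM MinMax MM
    JOIN CalendarTable C
      ON C.CalendarDate BETWEEN MM.MinDate AND MM.MaxDate
)
SELECT D.id, D.date, COALESCE(YT.value, 0) AS value
FROM Dates D
LEFT JOIN YourTable YT ON YT.id = D.id AND YT.date = D.date
ORDER BY D.id, D.date
"""

filled = conn.execute(QUERY).fetchall()
for row in filled:
    print(row)
```

The result has one row per day from the first to the last date of the id, with 0 wherever the source table had no row. To persist the gaps rather than just select them, the same query could feed an INSERT restricted to rows where the LEFT JOIN found no match.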
SET DATEFORMAT DMY
-- CREATE A TABLE WITH OUR INPUT DATA
DROP TABLE IF EXISTS #TheData
GO
CREATE TABLE #TheData
(TheDate DATE, id INT, TheValue INT)
INSERT INTO #TheData
(TheDate,id,Thevalue)
VALUES
('01/01/2022',1,5),
('08/01/2022',1,2),
('17/01/2022',2,7),
('25/01/2022',2,7),
('15/02/2022',2,7)
-- CREATE A CALENDAR CTE
DECLARE @StartDate date = '20210101';
DECLARE @CutoffDate date = DATEADD(DAY, -1, DATEADD(YEAR, 2, @StartDate));
;WITH DateSeq(TheDate) AS
(
SELECT @StartDate
UNION ALL
SELECT DATEADD(dd,1,TheDate) FROM DateSeq
WHERE TheDate < @CutoffDate
)
-- CROSS JOIN OUR CALENDAR CTE TO OUR SOURCE DATA. DERIVED TABLE TO GET FIRST AND LAST OF EACH RANGE TO USE FOR JOIN
SELECT
ds.*
,SourceDataRangesByID.ID
,ISNULL(td.TheValue,0) AS TheValue
FROM
DateSeq ds
CROSS JOIN
(
SELECT
d.ID
,MIN(d.TheDate) AS MinDatePerID
,MAX(d.TheDate) AS MaxDatePerID
FROM #TheData d
GROUP BY d.ID
) SourceDataRangesByID
LEFT JOIN #TheData td ON td.id = SourceDataRangesByID.ID AND td.TheDate = ds.TheDate
WHERE ds.TheDate >= SourceDataRangesByID.MinDatePerID
AND ds.TheDate <= SourceDataRangesByID.MaxDatePerID
OPTION (MAXRECURSION 0);
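The recursive calendar CTE is the part of this approach that replaces a physical calendar table. A minimal sketch of just that piece, assuming SQLite (via Python's sqlite3) where date(x, '+1 day') plays the role of DATEADD and no MAXRECURSION hint exists:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

QUERY = """
WITH RECURSIVE DateSeq(TheDate) AS (
    SELECT :start                      -- anchor row: first day of the range
    UNION ALL
    SELECT date(TheDate, '+1 day')     -- recursive step: add one day
    FROM DateSeq
    WHERE TheDate < :cutoff            -- stop once the cutoff has been produced
)
SELECT TheDate FROM DateSeq ORDER BY TheDate
"""

days = [r[0] for r in conn.execute(QUERY, {"start": "2022-01-01",
                                           "cutoff": "2022-01-08"})]
print(days)
```

Both endpoints are included: the anchor produces the start date, and the last recursive step produces the cutoff date before the WHERE condition stops the recursion.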
Try generate_series to create a date table, then right join with it and use coalesce to replace the null values (the :: casts below are PostgreSQL syntax):
SELECT daily_series.day
FROM (SELECT generate_series('2016-01-01', -- series start date
'2018-06-30', -- series end date
'1 day'::interval)::date AS day) AS daily_series
-- then right join mytable to daily_series on its date column
See Generate_Series for TSQL
https://dba.stackexchange.com/questions/255165/does-ms-sql-server-have-generate-series-function
(Sql server 2022)
https://learn.microsoft.com/en-us/sql/t-sql/functions/generate-series-transact-sql?view=sql-server-ver16

Based on todays date, how to get the date of the penultimate working day?

I am trying to figure out how to get the penultimate working day from today's date.
In my query, I would like to add a WHERE clause where a specific date is <= today's date minus 2 working days.
Like:
SELECT
SalesAmount
,SalesDate
FROM mytable t
JOIN D_Calendar c ON t.Date = c.CAL_DATE
WHERE SalesDate <= GETDATE()- 2 workingdays
I have a calendar table with a column "isworkingDay" in my database and I think I have to use it, but I don't know how.
The structure of this table is:
CAL_DATE    DayIsWorkDay
2022-07-28  1
2022-07-29  1
2022-07-30  0
2022-07-31  0
2022-08-01  1
One example: Today is Monday, August 01, 2022. So based on today, I need to get Thursday, July 28 2022.
My desired result in the where clause should get me something like this:
where SalesDate<= Getdate() minus 2 workingdays
Thanks for your ideas!
You could use something like this:
SELECT t.SalesDate,
PreviousWorkingDay = d.CAL_DATE
FROM mytable t
CROSS APPLY
( SELECT c.CAL_DATE
FROM D_Calendar AS c
WHERE c.CAL_DATE < t.SalesDate
AND c.DayIsWorkDay = 1
ORDER BY c.CAL_DATE DESC OFFSET 1 ROWS FETCH NEXT 1 ROW ONLY
) AS d;
It uses OFFSET 1 ROWS within the CROSS APPLY to get the penultimate working day.
This is how I implemented the idea from @SMor:
SELECT
SalesAmount
,SalesDate
FROM mytable t
JOIN D_Calendar c ON t.Date = c.CAL_DATE
WHERE SalesDate <= (SELECT
MIN(t1.CAL_DATE) as MinDate
FROM
(SELECT TOP 2
[CAL_DATE]
FROM [DWH_PROD].[cbi].[D_Calendar]
WHERE CAL_DAYISWORKDAY = 1 AND CAL_DATE < DATEADD(dd,0,DATEDIFF(dd,0,GETDATE()))
ORDER BY CAL_DATE DESC
) t1)
Thank you for your ideas and recommendations!
You can use ROW_NUMBER() OVER(ORDER BY CAL_DATE desc) to number the top 2 rows, then take the row with number 2.
Example:
-- setup
Declare @D_Calendar as Table (CAL_DATE date, DayIsWorkDay bit)
insert into @D_Calendar values('2022-07-27', 1)
insert into @D_Calendar values('2022-07-28', 1)
insert into @D_Calendar values('2022-07-29', 1)
insert into @D_Calendar values('2022-07-30', 0)
insert into @D_Calendar values('2022-07-31', 0)
insert into @D_Calendar values('2022-08-01', 1)
Declare @RefDate DateTime = '2022-08-01 10:00'
-- example query
Select CAL_DATE
From
(Select top 2 ROW_NUMBER() OVER(ORDER BY CAL_DATE desc) AS BusinessDaysBack, CAL_DATE
from @D_Calendar
where DayIsWorkDay = 1
and CAL_DATE < Cast(@RefDate as Date)
order by CAL_DATE desc) as Data -- TOP needs an ORDER BY to pick the latest two rows reliably
Where BusinessDaysBack = 2
From there you can plug that into your where clause to get :
SELECT
SalesAmount
,SalesDate
FROM mytable t
WHERE SalesDate <= (Select CAL_DATE
From (Select top 2 ROW_NUMBER() OVER(ORDER BY CAL_DATE desc) AS BusinessDaysBack, CAL_DATE
from D_Calendar
where DayIsWorkDay = 1
and CAL_DATE < Cast(getdate() as Date)
order by CAL_DATE desc) as Data -- TOP needs an ORDER BY to pick the latest two rows reliably
Where BusinessDaysBack = 2)
Change both 2s (the TOP 2 and the BusinessDaysBack = 2) to 3 to go three working days back, and so on.
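The "number the working days backwards, then pick row n" idea can be wrapped in a small helper where n is a parameter. This is only a sketch under assumptions: it runs on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later), and working_days_back is a hypothetical helper name, not something from the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE D_Calendar (CAL_DATE TEXT, DayIsWorkDay INTEGER)")
conn.executemany("INSERT INTO D_Calendar VALUES (?, ?)", [
    ("2022-07-27", 1), ("2022-07-28", 1), ("2022-07-29", 1),
    ("2022-07-30", 0), ("2022-07-31", 0), ("2022-08-01", 1),
])


def working_days_back(conn, ref_date, n):
    """Return the n-th working day strictly before ref_date, or None if there is none."""
    row = conn.execute(
        """
        SELECT CAL_DATE
        FROM (
            SELECT CAL_DATE,
                   ROW_NUMBER() OVER (ORDER BY CAL_DATE DESC) AS BusinessDaysBack
            FROM D_Calendar
            WHERE DayIsWorkDay = 1
              AND CAL_DATE < :ref
        )
        WHERE BusinessDaysBack = :n
        """,
        {"ref": ref_date, "n": n},
    ).fetchone()
    return row[0] if row else None


print(working_days_back(conn, "2022-08-01", 2))
```

For the reference date 2022-08-01 (a Monday) and n = 2, this skips the weekend and lands on Thursday 2022-07-28, which is the date asked for in the question. Because the filter is on the row number itself, no TOP or LIMIT is needed.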

how to get unique row numbers in sql

How to get only the first row from the result of the below query. I need the latest record for each date so I did the partition by created_date. But in some places, I am getting the same row number and not able to get the expected output. Please find the below query, current output, and expected output.
What changes do I need to make in order to get the expected output? Thank you.
WITH ctetable
AS (
SELECT created_date BPMDate
,tenor
,row_number() OVER (
PARTITION BY created_date ORDER BY created_date DESC
) rw
FROM table1 a
INNER JOIN table2 b ON a.case_id = b.case_id
AND a.eligible_transaction = 'true'
AND to_date(a.created_date) >= '2020-10-01'
AND to_date(a.created_date) <= '2020-10-05'
AND case_status = 'Completed'
)
SELECT BPMDate
,Tenor
,rw
FROM ctetable
Current output:
date tenor rw
2020-10-05 13:24:15.0 1W 1
2020-10-05 12:15:43.0 1Y 1
2020-10-05 12:15:43.0 1Y 2
2020-10-01 13:30:59.0 1W 1
2020-10-01 13:30:59.0 1W 2
Expected output:
date tenor rw
2020-10-05 13:24:15.0 1W 1
2020-10-01 13:30:59.0 1W 1
Regards,
Viresh
That would be:
with ctetable as (
select created_date bpmdate, tenor,
row_number() over (partition by date(created_date) order by created_date desc ) rn
from table1 a
inner join table2 b
on a.case_id = b.case_id
and a.eligible_transaction = 'true'
and to_date(a.created_date) >= '2020-10-01'
and to_date(a.created_date) <= '2020-10-05'
and case_status='Completed'
)
select bpmdate, tenor, rn
from ctetable
where rn = 1
Changes to your original code:
you need to remove the time portion of the date in the partition by clause of the window function; you didn't tell which database you are using: I used date(), but the function might be different in your database (trunc() in Oracle, date_trunc() in Postgres, and so on)
the outer query needs to filter on the row number that is equal to 1
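The effect of the first change can be reproduced with the rows from the question's current output. The sketch below is an illustration under assumptions: it uses SQLite via Python's sqlite3 module, a hypothetical single table named cases in place of the table1/table2 join, and timestamps without the trailing ".0".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "cases" is a hypothetical table standing in for the joined table1/table2 rows.
conn.execute("CREATE TABLE cases (created_date TEXT, tenor TEXT)")
conn.executemany("INSERT INTO cases VALUES (?, ?)", [
    ("2020-10-05 13:24:15", "1W"),
    ("2020-10-05 12:15:43", "1Y"),
    ("2020-10-05 12:15:43", "1Y"),
    ("2020-10-01 13:30:59", "1W"),
    ("2020-10-01 13:30:59", "1W"),
])


def first_rows(partition_expr):
    """Return the rows numbered 1 when partitioning by the given SQL expression."""
    sql = f"""
    WITH ctetable AS (
        SELECT created_date AS BPMDate, tenor,
               ROW_NUMBER() OVER (PARTITION BY {partition_expr}
                                  ORDER BY created_date DESC) AS rn
        FROM cases
    )
    SELECT BPMDate, tenor FROM ctetable WHERE rn = 1 ORDER BY BPMDate DESC
    """
    return conn.execute(sql).fetchall()


per_timestamp = first_rows("created_date")        # original partitioning
per_day = first_rows("date(created_date)")        # time portion removed

print(per_timestamp)
print(per_day)
```

Partitioning by the full timestamp restarts the numbering for every distinct timestamp, so three rows get number 1. Partitioning by the date part leaves one row per day, the latest one because of the descending order.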
You seem to want the latest row per day:
select BPMDate, Tenor, rw
from (select t.*,
row_number() over (partition by trunc(bpmdate) order by bpmdate desc) as seqnum
from ctetable
) t
where seqnum = 1;
Note: I don't know if your database supports trunc(), but that is simply some method for extracting the date from the column.

Ignoring Duplicate Records SQL

In need of some help :)
So I have a table of records with the following columns:
Key (PK, FK, int) DT (smalldatetime) Value (real)
The DT is a datetime for every half hour of the day with an associated value
E.g.
Key DT VALUE
1000 2010-01-01 08:00:00 80
1000 2010-01-01 08:30:00 75
1000 2010-01-01 09:00:00 100
I have a query that finds the max value in every 24-hour period and its associated time. However, on one day the max value occurs twice, which duplicates the date and causes processing issues. I have tried using ROW_NUMBER(), which works, but I can't use a calculated column in my WHERE clause.
Currently I have:
SELECT cast(T1.DT as date) as 'Date',Cast(T1.DT as time(0)) as 'HH', ROW_NUMBER() over (PARTITION BY cast(DT as date) ORDER BY DT) AS 'RowNumber'
FROM TABLE_1 AS T1
INNER JOIN (
SELECT CAST([DT] as date) as 'DATE'
, MAX([VALUE]) as 'MAX_HH'
FROM TABLE_1
WHERE DT > '6-nov-2016' and [KEY] = '1000'
GROUP BY CAST([DT] as date)
) AS MAX_DT
ON MAX_DT.[DATE] = CAST(T1.[DT] as date)
AND T1.VALUE = MAX_DT.MAX_HH
WHERE DT > '6-nov-2016' and [KEY] = '1000'
ORDER BY DT
This results in
Key DT VALUE HH
1000 2010-01-01 80 07:00:00
1000 2010-02-01 100 17:30:00
1000 2010-02-01 100 18:00:00
I need to remove the duplicate date (I have no preference which HH it takes).
I think I've explained that terribly; let me know if it makes no sense and I'll try to rewrite it.
Any ideas?
Can you try this? The new code is the MIN(...) in the select list and the GROUP BY at the end:
SELECT cast(T1.DT as date) as 'Date', MIN(Cast(T1.DT as time(0))) as 'HH'
FROM TABLE_1 AS T1
INNER JOIN (
SELECT CAST([DT] as date) as 'DATE'
, MAX([VALUE]) as 'MAX_HH'
FROM TABLE_1
WHERE DT > '6-nov-2016' and [KEY] = '1000'
GROUP BY CAST([DT] as date)
) AS MAX_DT
ON MAX_DT.[DATE] = CAST(T1.[DT] as date)
AND T1.VALUE = MAX_DT.MAX_HH
WHERE DT > '6-nov-2016' and [KEY] = '1000'
-- put the group by here
GROUP BY cast(T1.DT as date)
ORDER BY cast(T1.DT as date)
I would do something like this.
I didn't try it, but I think it's correct.
SELECT cast(T1.DT as date) as 'Date',Cast(T1.DT as time(0)) as 'HH', T1.VALUE
FROM TABLE_1 T1
WHERE T1.DT > '6-nov-2016' and T1.[KEY] = '1000'
AND T1.DT IN (
--select the latest DT among the max-value rows of each day
SELECT MAX(T2.DT)
FROM TABLE_1 T2
INNER JOIN (
SELECT CAST([DT] as date) as CAST_DATE
,MAX([VALUE]) as MAX_HH
FROM TABLE_1
WHERE DT > '6-nov-2016' and [KEY] = '1000'
GROUP BY CAST([DT] as date)
) M ON M.CAST_DATE = CAST(T2.DT as date) AND M.MAX_HH = T2.VALUE
WHERE T2.DT > '6-nov-2016' and T2.[KEY] = '1000'
GROUP BY CAST(T2.DT as date)
)
Change the JOIN to an APPLY.
The APPLY operation will allow you to limit the connected relation to just one result for each source relation.
SELECT v.[Key], cast(v.DT As Date) as "Date", v.[Value], cast(v.DT as Time(0)) as "HH"
FROM
( -- First a projection to get just the exact dates you want
SELECT DISTINCT [Key], CAST(DT as DATE) as DT
FROM Table_1
WHERE [Key] = '1000' AND DT > '20161106'
) dates
CROSS APPLY (
-- Then use APPLY rather than JOIN to find just the exact one record you need for each date
SELECT TOP 1 *
FROM Table_1
WHERE [Key] = dates.[Key] AND cast(DT as DATE) = dates.DT ORDER BY [Value] DESC
) v
A final note: Both this query and your sample query in the question will include values from Nov 6, 2016. The query says > 2016-11-06 with an exclusive inequality, but the comparison still uses full DateTime values, meaning the literal has an implied midnight (00:00:00) time component. So 12:01 AM on Nov 6 is still greater than 12:00:00 AM on Nov 6. If you want to exclude all Nov 6 values from the query, you either need to change the literal to a time value at the end of that date, or cast to date before making the > comparison.
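The difference between comparing full date-time values and comparing dates only can be illustrated outside the database. This is a plain Python sketch with made-up readings (hypothetical values, not data from the question); it mirrors the comparison semantics described above rather than SQL Server's exact type conversion rules.

```python
from datetime import datetime

# A bare date literal compared against a date-time column behaves like midnight.
cutoff = datetime(2016, 11, 6)

# Hypothetical half-hourly readings around the cutoff.
readings = [
    datetime(2016, 11, 5, 23, 30),
    datetime(2016, 11, 6, 0, 30),
    datetime(2016, 11, 7, 8, 0),
]

# Comparing full date-times: anything after midnight on Nov 6 passes.
by_datetime = [r for r in readings if r > cutoff]

# Casting to a date first: all of Nov 6 is excluded.
by_date = [r for r in readings if r.date() > cutoff.date()]

print(by_datetime)
print(by_date)
```

The first filter keeps the 00:30 reading on Nov 6, while the second keeps only the Nov 7 reading, which corresponds to casting DT to date before the > comparison.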
With SQL you can use SELECT DISTINCT.
The SELECT DISTINCT statement is used to return only distinct (different) values.
Inside a table, a column often contains many duplicate values; and sometimes you only want to list the different (distinct) values.

DB2 SQL Pairing Dates

I am trying to pair up dates that I am getting from my SQL. The output at the moment looks something like this:
start_date end_date
2015-02-02 2015-02-02
2015-02-02 2015-02-03
2015-02-03 2015-02-03
2015-04-12 2015-02-12
I would like the output to be paired up so that the smallest and the biggest date of each date group are chosen, so that the output would look like this:
start_date end_date
2015-02-02 2015-02-03
2015-04-12 2015-02-12
Using the first response I get something like this. I believe I have formatted it wrong: I am getting the same date pairs as before, although it does run.
select min(date), max(date)
from (select date,
sum(case when sum(inc) = 0 then 1 else 0 end) over (order by date desc) as grp
from (select t1.datev as date, 1 as inc
from table2 t1,
table3 c,
table4 cr
where t1.datev between date(c.e_start_date) and date(c.e_end_date)
and t1.datev not in (select date(temp.datev) from mdmins11.temp temp where temp.number < 4000 and temp.organisation_id = 11111)
and c.tp_cd in (1,6)
and cr.from_id = c.id
and cr.organisation_id = 11111
union all
select t.datev as date, -1 as inc
from table1 t,
table3 c,
table4 cr
where t.datev between date(c.e_start_date) and date(c.e_end_date)
and t.datev not in (select date(temp.datev) from mdmins11.temp temp where temp.number < 4000 and temp.organisation_id = 11111)
and c.tp_cd in (1,6)
and cr.from_id = c.id
and cr.organisation_id = 11111
) t
group by date
) t
group by grp;
One method is to determine where groups of non-overlapping dates start. For this, you can use not exists. Then count up this flag over all records. This uses window functions. However, this poses problems because you have multiple starts on the same date.
Another method is to keep track of starts and stops and note where the sum is zero. These represent boundaries between groups. The following should work on your data:
select min(date), max(date)
from (select date,
sum(case when sum(inc) = 0 then 1 else 0 end) over (order by date desc) as grp
from (select start_date as date, 1 as inc
from table
union all
select end_date as date, -1 as inc
from table
) t
group by date
) t
group by grp;
This type of problem is made more complicated when duplicate values are allowed on a given date. Given only the dates, this is challenging. With a separate unique id for each row, there are more robust solutions.
EDIT:
A more robust solution:
select min(start_date), max(end_date)
from (select t.*, sum(StartGroupFlag) over (order by start_date) as grp
from (select t.*,
(case when not exists (select 1
from table t2
where t2.start_date < t.start_date and
t2.end_date >= t.start_date
)
then 1 else 0
end) as StartGroupFlag
from table t
) t
) t
group by grp;
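The flag-and-running-sum idea from the more robust query can be traced on the four rows from the question. The sketch below is an assumption-laden illustration: it runs on SQLite through Python's sqlite3 module rather than DB2 (window functions need SQLite 3.25 or later), and ranges is a hypothetical table name standing in for the placeholder table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ranges (start_date TEXT, end_date TEXT)")
# Rows copied verbatim from the question, including its last row as given.
conn.executemany("INSERT INTO ranges VALUES (?, ?)", [
    ("2015-02-02", "2015-02-02"),
    ("2015-02-02", "2015-02-03"),
    ("2015-02-03", "2015-02-03"),
    ("2015-04-12", "2015-02-12"),
])

QUERY = """
SELECT MIN(start_date), MAX(end_date)
FROM (
    -- running sum of the flags: rows of the same island share one grp value
    SELECT start_date, end_date,
           SUM(StartGroupFlag) OVER (ORDER BY start_date) AS grp
    FROM (
        -- flag = 1 when no earlier-starting range reaches this start date
        SELECT t.start_date, t.end_date,
               CASE WHEN NOT EXISTS (
                        SELECT 1
                        FROM ranges t2
                        WHERE t2.start_date < t.start_date
                          AND t2.end_date >= t.start_date)
                    THEN 1 ELSE 0
               END AS StartGroupFlag
        FROM ranges t
    )
)
GROUP BY grp
ORDER BY MIN(start_date)
"""

pairs = conn.execute(QUERY).fetchall()
for pair in pairs:
    print(pair)
```

Two rows start on 2015-02-02 and both get the flag, but the default window frame treats rows with equal start_date as peers, so they receive the same running sum and still end up in one group.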