T-SQL: most effective way to compare three date values in a WHERE clause?

I am trying to create a stored procedure.
What is the most effective way to compare three date values in a WHERE clause?
Example:
tbl1.Date1,
tbl2.Date2, -- NOTE: Date2 can be NULL.
tbl3.Date3
Example data:
Date1 Date2 Date3
2016-12-20 2016-11-21 2016-11-30
2016-11-21 NULL 2016-12-20
First, I compare Date1 and Date2 and choose the "bigger" date.
Then I compare this "bigger" date to Date3.
If the comparison is true, I write the values to a table.
-- This is a simplified example:
INSERT INTO records
(
[user_date],
[user_name]
)
SELECT
tbl1.Date1,
tbl1.user_name
FROM
table1 AS tbl1
INNER JOIN table2 AS tbl2 ON tbl1.id = tbl2.id
INNER JOIN table3 AS tbl3 ON tbl2.id = tbl3.id
WHERE
-- I need to know which is bigger, Date1 or Date2, so I can compare the correct date to Date3.
ISNULL(tbl2.Date2, tbl1.Date1) <= tbl3.Date3 -- ISNULL alone doesn't work here: when both Date1 and Date2 have values it always picks Date2, so the comparison fails if Date1 is bigger than Date3.
AND ISNULL(tbl2.Date2, tbl1.Date1) > tbl3.last_date

Perhaps something like this?
Declare @tbl1 table (id int,date1 date);Insert Into @tbl1 values (1,'2016-12-20'),(2,'2016-11-21');
Declare @tbl2 table (id int,date2 date);Insert Into @tbl2 values (1,'2016-11-21'),(2,null);
Declare @tbl3 table (id int,date3 date);Insert Into @tbl3 values (1,'2016-12-30'),(2,'2016-12-20');
Select User_Date = (Select max(d) from (values(date1),(date2),(date3)) D(D))
,A.ID
From @tbl1 A
Join @tbl2 B on A.ID=B.ID
Join @tbl3 C on A.ID=C.ID
Returns
User_Date ID
2016-12-30 1
2016-12-20 2
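The same VALUES trick can also drive the original INSERT directly: pick the bigger of Date1 and Date2 once, then reuse it in both comparisons. A sketch, assuming the table and column names from the question; MAX ignores NULLs, so a NULL Date2 simply falls back to Date1.
INSERT INTO records ([user_date], [user_name])
SELECT
tbl1.Date1,
tbl1.user_name
FROM
table1 AS tbl1
INNER JOIN table2 AS tbl2 ON tbl1.id = tbl2.id
INNER JOIN table3 AS tbl3 ON tbl2.id = tbl3.id
-- Bigger of Date1/Date2, computed once per row.
CROSS APPLY (SELECT MAX(d) AS bigger_date
FROM (VALUES (tbl1.Date1), (tbl2.Date2)) AS v(d)) AS x
WHERE
x.bigger_date <= tbl3.Date3
AND x.bigger_date > tbl3.last_date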

You can use a CASE expression, or an inline IF, to conditionally return the bigger of the two dates for comparison.
Sample Data
-- Sample data.
DECLARE @Sample TABLE
(
Date1 DATE NOT NULL,
Date2 DATE NULL,
Date3 DATE NOT NULL
)
;
INSERT INTO @Sample
(
Date1,
Date2,
Date3
)
VALUES
('2016-01-02', NULL, '2016-01-01'), -- D1 > D3.
('2015-12-31', '2016-01-02', '2016-01-01'), -- D2 > D1 and D3
('2015-12-30', '2015-12-31', '2016-01-01') -- D3 > D1 and D2
;
Case Expression
-- Using a CASE EXPRESSION.
SELECT
CASE WHEN s.Date2 > s.Date1 THEN s.Date2 ELSE s.Date1 END AS Bigger_of_D1_D2,
*
FROM
@Sample AS s
WHERE
CASE WHEN s.Date2 > s.Date1 THEN s.Date2 ELSE s.Date1 END > s.Date3
;
Inline IF (SQL Server 2012, or above)
-- Using an INLINE IF.
SELECT
IIF(s.Date2 > s.Date1, s.Date2, s.Date1) AS Bigger_of_D1_D2,
*
FROM
@Sample AS s
WHERE
IIF(s.Date2 > s.Date1, s.Date2, s.Date1) > s.Date3
;
Both methods rely on the fact that NULL is not equal to anything, including itself. This means that comparing Date2 against Date1 is never true when Date2 is NULL, so Date1 is returned. If Date1 also allowed NULLs this technique would fail; in that case you could expand the CASE expression with additional WHEN clauses, or use the ISNULL function.
The second option is an example of syntactic sugar. Behind the scenes SQL Server will convert your code into a simple CASE expression, as per MSDN:
IIF is a shorthand way for writing a CASE expression.
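If Date1 could also be NULL, here is a hedged sketch of the ISNULL route mentioned above, using the @Sample table (imagine Date1 declared NULLable) and '1900-01-01' as an assumed sentinel lower than any real date in the data:
-- Coalesce both sides before comparing, so a NULL on either side falls back to the other date.
SELECT
CASE WHEN ISNULL(s.Date2, '1900-01-01') > ISNULL(s.Date1, '1900-01-01') THEN s.Date2 ELSE s.Date1 END AS Bigger_of_D1_D2,
*
FROM
@Sample AS s
WHERE
CASE WHEN ISNULL(s.Date2, '1900-01-01') > ISNULL(s.Date1, '1900-01-01') THEN s.Date2 ELSE s.Date1 END > s.Date3
;
If both dates are NULL the CASE returns NULL and the row is filtered out, which is usually what you want.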

Here is my suggestion:
INSERT INTO records
(
[user_date],
[user_name]
)
SELECT tbl1.Date1
,tbl1.user_name
FROM table1 as tbl1
INNER JOIN table2 as tbl2 on tbl1.ID=tbl2.ID
INNER JOIN table3 as tbl3 on tbl1.ID=tbl3.ID
WHERE (SELECT CASE WHEN tbl1.Date1 > ISNULL(tbl2.Date2,'1900-01-01') THEN tbl1.Date1 ELSE tbl2.Date2 END) <= tbl3.Date3
AND
(SELECT CASE WHEN tbl1.Date1 > ISNULL(tbl2.Date2,'1900-01-01') THEN tbl1.Date1 ELSE tbl2.Date2 END) > tbl3.last_date

Related

SQL how to count records from date

I would like to create a NUMBER column that counts the records for each date, for example how many NRBs there are in 2021-10. When I use COUNT I get the result below; SUM is not possible because NRB is an identification number, not a numeric value.
Here is my result:
Here is my code:
PROC SQL; /* FIRST QUERY */
create table PolisyEnd as
select distinct
datepart(t1.data_danych) as DATA_DANYCH format yymmdd10.
,(t4.spr_NRB) as NRB
,datepart(t1.PRP_END_DATE) as PRP_END_DATE format yymmdd10.
,datepart(t1.PRP_END_DATE) as POLICY_VINTAGE format yymmd7.,
case
when datepart(t1.PRP_END_DATE) IS NOT NULL and datepart(t1.PRP_END_DATE) - &gv_date_dly. < 0 THEN 'WYGASLA'
when datepart(t1.PRP_END_DATE) IS NOT NULL and datepart(t1.PRP_END_DATE) - &gv_date_dly. >= 0 and datepart(t1.PRP_END_DATE) - &gv_date_dly. <=7 THEN 'UWAGA'
when datepart(t1.PRP_END_DATE) IS NOT NULL and datepart(t1.PRP_END_DATE) - &gv_date_dly. >= 30 THEN 'AKTYWNA'
when datepart(t1.PRP_END_DATE) IS NULL THEN 'BRAK INFORMACJI O POLISIE'
end as POLISA_INFORMACJA
from
cmz.WMDTZDP_BH t1
left join
(select distinct kontr_id,obj_oid from cmz.BH_D_ZAB_X_ALOK_&thismonth) t2
on t2.obj_oid = t1.obj_oid
left join
(select distinct data_danych, kontr_id, kre_nrb from dm.BH_WMDTKRE_&thismonth) t3
on t3.kontr_id = t2.kontr_id
left join
(select distinct spr_NRB, spr_STATUS from _mart.mart_kred) t4
on t4.spr_NRB = t3.kre_nrb
where datepart(t1.data_danych) between '5Aug2019'd and &gv_date_dly. and t1.Actual = "T"
and t4.spr_STATUS ="A"
; /* SECOND QUERY, BUILT FROM THE FIRST */
create table PolisyEnd1 as
select distinct
DATE_
,(POLICY_VINTAGE)
,count(NRB) as NUMBER
,POLISA_INFORMACJA
from PolisyEnd
where INFORMATION ="U"
;
Quit;
EDIT 1:
I got the result, but how do I make it so that 2021-11 gets a single row, with all records for that period summed up?
Rather than using a DISTINCT here, what you really want is a GROUP BY.
PROC SQL;
create table PolisyEnd1 as
select
DATE_
,(POLICY_VINTAGE)
,count(NRB) as NUMBER
,POLISA_INFORMACJA
from PolisyEnd
where INFORMATION ="U"
group by DATE_, (POLICY_VINTAGE), POLISA_INFORMACJA
;
Quit;
You can use GROUP BY.
If you want to count just based on the DATE_ column, here is an example:
select DATE_, count(NRB) as NUMBER
from PolisyEnd
where INFORMATION ="U"
group by DATE_
Otherwise, you can also add other columns to the GROUP BY and SELECT clauses.
For Edit1:
For each month you can use this:
select POLICY_VINTAGE, SUM(NUMBER) as NUMBER
from Your_Table
group by POLICY_VINTAGE
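One caveat with the SAS code above: POLICY_VINTAGE only displays as a month because of its YYMMD7. format; GROUP BY still groups on the underlying date value, so different days in the same month stay separate. To collapse to one row per month, group on the formatted value instead. A sketch, assuming POLISA_INFORMACJA = 'UWAGA' is the filter meant by INFORMATION = "U":
proc sql;
create table PolisyEndByMonth as
select put(POLICY_VINTAGE, yymmd7.) as MONTH
,count(NRB) as NUMBER
from PolisyEnd
where POLISA_INFORMACJA = 'UWAGA' /* assumption: the "U" status */
group by put(POLICY_VINTAGE, yymmd7.);
quit;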

Variables Declaration, CTEs, and While Loops in Oracle SQL

So I might be stuck on something very trivial, but I can't figure out how to make it work. I created two blocks of code that work in SQL Server, but I have some problems with the date variable declaration in Oracle SQL.
I had write access to the SQL Server database when I created this code, so I used INSERT INTO to populate temp tables. I don't have write access anymore, so I am using CTEs instead.
The original code looks like this:
DECLARE @Startdate Datetime = '2021-Jun-01 00:00:00.000'
DECLARE @Enddate Datetime = '2021-Jun-30 00:00:00.000'
Insert into Temp1
select ...
from ...
WHILE @Startdate <= @Enddate
BEGIN
Insert into Temp2
select ...
from (Temp 1)
left join
select ...
set @startdate=dateadd(d,1,@startdate)
end;
With my new code, I have made the following adjustments:
VARIABLE Startdate Datetime = '2021-Jun-01 00:00:00.000'
VARIABLE Enddate Datetime = '2021-Jun-30 00:00:00.000'
EXEC :Startdate := '2021-Jun-30 00:00:00.000'
EXEC :Enddate := '2021-Jun-30 00:00:00.000'
WITH Temp1 as (
select ...
from ...),
/* Unsure about using WHILE with with 2 CTEs so removing them for now but will need to be added*/
WITH Temp2 as
select ...
from (Temp 1)
left join
select ...
set startdate = :startdate + 1
end)
select * from Temp2;
The 2 blocks of code work perfectly individually. I think my concern lies with one or all of the following:
Variable declaration - I read a couple of Stack Overflow posts and it seems like there are bind variables and substitution variables. Is there a different way to declare variables?
The WHILE loop, especially between 2 CTEs. Can we do a while loop as a CTE? (similar to this) create while loop with cte
How the date is incremented. Is this the proper way to increment dates in Oracle PL/SQL?
Any guidance would be helpful.
Also adding the two blocks of code for reference:
Details of Tables:
Transactions - Contains Transaction information. Execution Date is a timestamp of the transaction execution
Account - Contains Account Information with a unique Account_Key for every account
Code_Rel - Maps the transaction code to a transaction type
Group_Rel - Maps the transaction type to a transaction group
/***Block 1 of Code***/
insert into Temp1
select
a.ACCOUNT_KEY
,a.SPG_CD
,t.EXECUTION_DATE
from Schema_Name.TRANSACTIONS t
inner join Schema_Name.ACCOUNT a on a.en_sk=t.ac_sk
inner join Schema_Name.Code_Rel tr on t.t_cd_s = tr.t_cd_s
inner join ( select * from Schema_Name.Group_Rel
where gtrt_cd in ('Type1','Type2')) tt on tr.trt_cd = tt.trt_cd
where t.EXECUTION_DATE >= @startdate and t.EXECUTION_DATE <= @EndDt
and tt.gtrt_cd in ('Type1','Type2')
group by a.ACCOUNT_KEY ,a.SPG_CD, t.EXECUTION_DATE;
/***WHILE LOOP***/
while @startdate <= @EndDt
BEGIN
/***INSERT AND BLOCK 2 OF CODE***/
insert into Temp2
select table1.account_key, table1.SPG_CD, @startdate, coalesce(table2.sum_tr1,0),coalesce(table3.sum_tr2,0),
case when coalesce(table3.sum_tr2,0)>0 THEN coalesce(table2.sum_tr1,0)/coalesce(table3.sum_tr2,0) ELSE 0 END,
case when coalesce(table3.sum_tr2,0)>0 THEN
CASE WHEN coalesce(table2.sum_tr1,0)/coalesce(table3.sum_tr2,0)>=0.9 and coalesce(table2.sum_tr1,0)/coalesce(table3.sum_tr2,0)<=1.10 and coalesce(table2.sum_tr1,0)>=1000 THEN 'Yes' else 'No' END
ELSE 'No' END
FROM ( SELECT * FROM Temp1 WHERE execution_date=@startdate) TABLE1 LEFT JOIN
(
select a.account_key,a.SPG_CD, SUM(t.AC_Amt) as sum_tr1
from Schema_Name.TRANSACTIONS t
inner join Schema_Name.ACCOUNT a on a.en_sk=t.ac_sk
inner join Schema_Name.Code_Rel tr on t.t_cd_s = tr.t_cd_s
inner join ( select * from Schema_Name.Group_Rel
where gtrt_cd in ('Type1')) tt on tr.trt_cd = tt.trt_cd
where t.EXECUTION_DATE <= @startdate
and t.EXECUTION_DATE >= dateadd(day,-6,@startdate)
and tt.gtrt_cd in ('Type1')
group by a.account_key, a.SPG_CD
) table2 ON table1.account_key=table2.account_key
LEFT JOIN
(
select a.account_key,a.SPG_CD, SUM(t.AC_Amt) as sum_tr2
from Schema_Name.TRANSACTIONS t
inner join Schema_Name.ACCOUNT a on a.en_sk=t.ac_sk
inner join Schema_Name.Code_Rel tr on t.t_cd_s = tr.t_cd_s
inner join ( select * from Schema_Name.Group_Rel
where gtrt_cd in ('Type2')) tt on tr.trt_cd = tt.trt_cd
where t.EXECUTION_DATE <= @startdate
and t.EXECUTION_DATE >= dateadd(day,-6,@startdate)
and tt.gtrt_cd in ('Type2')
group by a.account_key, a.SPG_CD ) table3 on table1.account_key=table3.account_key
where coalesce(table2.sum_tr1,0)>=1000
set @startdate=dateadd(d,1,@startdate)
end;
You do not need PL/SQL, a WHILE loop, or variable declarations; you can probably do it all in a single SQL statement, using subquery factoring clauses (and recursion) to generate a calendar of incrementing dates. Something like this made-up example:
INSERT INTO temp2 (col1, col2, col3)
WITH time_bounds(start_date, end_date) AS (
-- You can declare the bounds in the query.
SELECT DATE '2021-06-01',
DATE '2021-06-30'
FROM DUAL
),
calendar (dt, end_date) AS (
-- Recursive query to generate a row for each day.
SELECT start_date, end_date FROM time_bounds
UNION ALL
SELECT dt + INTERVAL '1' DAY, end_date
FROM calendar
WHERE dt + INTERVAL '1' DAY <= end_date
),
temp1 (a, b, c) AS (
-- Made-up query
SELECT a, b, c FROM some_table
),
temp2 (a, d, e) AS (
-- Another made-up query.
SELECT t1.a,
s2.d,
s2.e
FROM temp1 t1
LEFT OUTER JOIN some_other_table s2
ON (t1.b = s2.b)
)
-- Get the values to insert.
SELECT t2.a,
t2.d,
t2.e
FROM temp2 t2
INNER JOIN calendar c
ON (t2.e = c.dt)
WHERE a BETWEEN 3.14159 AND 42;
If you try doing it with multiple inserts in a PL/SQL loop then it will be much slower than a single statement.
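If you want to sanity-check just the calendar part on its own, the recursive branch can be run in isolation (Oracle 11gR2 or later is assumed for recursive subquery factoring; the bounds are the ones from the question):
WITH calendar (dt) AS (
SELECT DATE '2021-06-01' FROM DUAL
UNION ALL
SELECT dt + INTERVAL '1' DAY
FROM calendar
WHERE dt + INTERVAL '1' DAY <= DATE '2021-06-30'
)
SELECT dt FROM calendar;
This should return one row per day from 2021-06-01 through 2021-06-30.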

SQL Comparing 2 date values (one of which is stored in a varchar field and might not contain Date data)

I need to compare a date value within a WHERE clause, i.e. where a certain date is later than another. The problem is that the date I am comparing against is stored in a varchar field. To doubly complicate matters, a simple CONVERT will not work, as the data stored in this varchar field is not always a date value and can be any old gubbins (this is an existing DB design and cannot be modified, unfortunately. Ahh, the joys of legacy design/code.)
Here's a rough example of what I currently have:
SELECT A.Value, B.Value, B.Value2
FROM Table A
JOIN Table B ON B.Id = A.Id
WHERE A.Value3 = 'Some String'
AND ISDATE(B.GubbinsField) = 1 AND CONVERT(DATETIME, B.GubbinsField, 120) >= A.DateField
Does anyone have a potential solution to this problem of how I can successfully check these values?
Well, you are on the right path; all you need to do is get the data types the same on both sides of the comparison operator.
When you do CONVERT(DATETIME, B.GubbinsField, 120) >= A.DateField
you are basically comparing a string value with a datetime value. SQL Server will try to convert the string to a datetime value because of datetime's higher precedence.
You need to write your query something like....
;WITH CTE
AS (
SELECT A.Value ValueA
, B.Value ValueB
, B.Value2
, A.DateField
,B.GubbinsField
FROM Table A
JOIN Table B ON B.Id = A.Id
WHERE A.Value3 = 'Some String'
AND ISDATE(B.GubbinsField) = 1
)
SELECT ValueA, ValueB, Value2
FROM CTE
WHERE CAST(GubbinsField AS DATETIME) >= DateField
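Alternatively, if you are on SQL Server 2012 or later, TRY_CONVERT handles the non-date values without needing ISDATE or a CTE: values that cannot be converted become NULL and simply fall out of the >= comparison. A sketch using the same placeholder names as the question:
SELECT A.Value, B.Value, B.Value2
FROM Table A
JOIN Table B ON B.Id = A.Id
WHERE A.Value3 = 'Some String'
AND TRY_CONVERT(DATETIME, B.GubbinsField, 120) >= A.DateField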
Try using a CTE to ensure you are only working with records from Table B that have a valid date in GubbinsField during the join:
;WITH cte AS (SELECT B.value, B.Value2, B.ID, B.GubbinsField FROM TABLEB B WHERE ISDATE(B.GubbinsField) = 1)
SELECT A.Value, B.Value, B.Value2
FROM Table A
JOIN cte B ON B.Id = A.Id
WHERE A.Value3 = 'Some String'
AND CONVERT(DATETIME, B.GubbinsField, 120) >= A.DateField

Difficult SQL query

I have a table containing many columns; I have to make my selection according to these two columns:
TIME ID
-216 AZA
215 AZA
56 EA
-55 EA
66 EA
-03 AR
03 OUI
-999 OP
999 OP
04 AR
87 AR
The expected output is
TIME ID
66 EA
03 OUI
87 AR
I need to select the rows with no matches. There are rows which have the same ID and almost the same time, but inverted, with a small difference. For example, the first row with TIME -216 matches the second record with TIME 215. I tried to solve it in many ways, but every time I find myself lost.
First step -- find rows with duplicate IDs. Second step -- filter for rows which are near-inverse duplicates.
First step:
SELECT t1.TIME, t2.TIME, t1.ID
FROM mytable t1
JOIN mytable t2 ON t1.ID = t2.ID AND t1.TIME > t2.TIME;
The second part of the join clause ensures we only get one record for each pair.
Second step:
SELECT t1.TIME, t2.TIME, t1.ID
FROM mytable t1
JOIN mytable t2 ON t1.ID = t2.ID AND t1.TIME > t2.TIME
WHERE ABS(t1.TIME + t2.TIME) < 3;
This will produce some duplicate results if, e.g., (10, FI), (-10, FI) and (11, FI) are in your table, as there are two valid pairs. You can possibly filter these out as follows:
SELECT t1.TIME, MAX(t2.TIME), t1.ID
FROM mytable t1
JOIN mytable t2 ON t1.ID = t2.ID AND t1.TIME > t2.TIME
WHERE ABS(t1.TIME + t2.TIME) < 3
GROUP BY t1.TIME, t1.ID;
But it's unclear which result you want to drop. Hopefully this points you in the right direction, though!
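Putting the two steps together, a sketch that returns only the rows with no near-inverse partner; the tolerance of 3 is carried over from the queries above:
SELECT t1.TIME, t1.ID
FROM mytable t1
WHERE NOT EXISTS (
SELECT 1
FROM mytable t2
WHERE t2.ID = t1.ID
AND t2.TIME <> t1.TIME
AND ABS(t1.TIME + t2.TIME) < 3
);
Against the sample data this leaves 66/EA, 03/OUI and 87/AR, which matches the expected output.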
Does this help?
create table #RawData
(
[Time] int,
ID varchar(3)
)
insert into #rawdata ([time],ID)
select -216, 'AZA'
union
select 215, 'AZA'
union
select 56, 'EA'
union
select -55, 'EA'
union
select 66, 'EA'
union
select -03, 'AR'
union
select 03, 'OUI'
union
select -999, 'OP'
union
select 999, 'OP'
union
select 04, 'AR'
union
select 87, 'AR'
union
-- this value added to illustrate that the algorithm does not ignore this value
select 156, 'EA'
--create a copy with an ID to help out
create table #Data
(
uniqueId uniqueidentifier,
[Time] int,
ID varchar(3)
)
insert into #Data(uniqueId,[Time],ID) select newid(),[Time],ID from #RawData
declare @allowedDifference int
select @allowedDifference = 1
--find duplicates with matching inverse time
select *, d1.[Time] + d2.[Time] as pairDifference
from #Data d1
inner join #Data d2 on d1.ID = d2.ID
and (d1.[Time] + d2.[Time] <= @allowedDifference and d1.[Time] + d2.[Time] >= (-1 * @allowedDifference))
-- now find all ID's ignoring these pairs
select [Time], ID from #Data
where uniqueId not in (
select d1.uniqueId from #Data d1
inner join #Data d2 on d1.ID = d2.ID
and (d1.[Time] + d2.[Time] <= 3 and d1.[Time] + d2.[Time] >= -3))

Efficiently finding overlapping entries in a SQL table

What is the most efficient way to find all entries which overlap with others in the same table? Every entry has a start and end date. For example, I have the following database setup:
CREATE TABLE DEMO
(
DEMO_ID int IDENTITY ,
START date NOT NULL ,
[END] date NOT NULL
);
SET IDENTITY_INSERT DEMO ON; -- needed because DEMO_ID is an IDENTITY column
INSERT INTO DEMO (DEMO_ID, START, [END]) VALUES (1, '20100201', '20100205');
INSERT INTO DEMO (DEMO_ID, START, [END]) VALUES (2, '20100202', '20100204');
INSERT INTO DEMO (DEMO_ID, START, [END]) VALUES (3, '20100204', '20100208');
INSERT INTO DEMO (DEMO_ID, START, [END]) VALUES (4, '20100206', '20100211');
SET IDENTITY_INSERT DEMO OFF;
My query looks as follow:
SELECT DISTINCT *
FROM DEMO A, DEMO B
WHERE A.DEMO_ID != B.DEMO_ID
AND A.START < B.[END]
AND B.START < A.[END]
The problem is that when my demo table has, for example, 20,000 rows, the query takes too long. My environment is MS SQL Server 2008.
Thanks for any more efficient solution.
This is simpler and executes in about 2 seconds for over 20000 records
select * from demo a
where not exists(
select 1 from demo b
where a.demo_id!=b.demo_id
AND A.START < B.[END]
AND B.START < A.[END])
You could rewrite the query a bit:
SELECT A.DEMO_ID, B.DEMO_ID
FROM DEMO A, DEMO B
WHERE A.DEMO_ID != B.DEMO_ID
AND A.START >= B.START
AND A.START <= B.[END]
Getting rid of the DISTINCT keyword may make things cheaper, because SQL Server will do a sort on the returned columns (which is all of them when you use DISTINCT *) to eliminate duplicates.
You should also consider adding an index. With SQL Server 2008, I would recommend an index on [START] and [END], containing DEMO_ID.
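For example, one possible shape for that index (the index name is made up):
-- Key on the date range, with DEMO_ID carried along as an included column.
CREATE NONCLUSTERED INDEX IX_DEMO_START_END
ON DEMO (START, [END])
INCLUDE (DEMO_ID);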
Use a function or stored procedure:
First, order the entries by Start and End
DECLARE @t table (
Position int identity(1,1),
DEMO_ID int,
START date NOT NULL ,
[END] date NOT NULL
)
INSERT INTO @t (DEMO_ID, START, [END])
SELECT DEMO_ID, START, [END]
FROM DEMO
ORDER BY START, [END]
Then check for overlaps with previous and next record:
SELECT t.DEMO_ID
FROM @t t INNER JOIN @t u ON t.Position + 1 = u.Position
WHERE u.Start <= t.[End]
UNION
SELECT t.DEMO_ID
FROM @t t INNER JOIN @t u ON t.Position - 1 = u.Position
WHERE t.Start <= u.[End]
You need to measure to be sure this is faster. In any case, we won't compare the date fields of all records with all the other records, so this could be faster for large datasets.
Late answer, but wondering if this would help:
create index IXNCL_Demo_DemoId on Demo(Demo_Id)
select a.demo_id, b.demo_id as [CrossingDate]
from demo a
cross join demo b
where a.[end] between b.start and b.[end]
and a.demo_id <> b.demo_id
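For completeness, a sketch of the symmetric overlap test written with EXISTS, which lists each overlapping row once and can benefit from the index suggested above (assumes the DEMO table from the question):
SELECT d.DEMO_ID, d.START, d.[END]
FROM DEMO AS d
WHERE EXISTS (
SELECT 1
FROM DEMO AS o
WHERE o.DEMO_ID <> d.DEMO_ID
AND o.START < d.[END] -- o starts before d ends
AND d.START < o.[END] -- d starts before o ends
);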