CASE Statement inside a subquery - sql

I was able to create the following query after help from the post below
select * from duppri t
where exists (
select 1
from duppri
where symbolUP = t.symbolUP
AND date = t.date
and price <> t.price)
ORDER BY date
SQL to check when pairs don't match
I have now realized that I need to add a CASE expression to flag rows where all of the above criteria are met but the type value is the same in both rows (duppri and its alias t). This occurs because of case sensitivity. This query is an attempt to clean up a portfolio accounting system that unfortunately allowed numerous duplicates because it didn't have strong referential integrity or constraints.
I would like the case statement to produce the column 'isMatch'
Date |Type|Symbol |SymbolUP |Concatt |Price |IsMatch
6/30/1995 |gaus|313586U72|313586U72|gaus313586U72|109.25|Different
6/30/1995 |gbus|313586U72|313586U72|gbus313586U72|108.94|Different
6/30/1995 |agus|SRR |SRR |agusSRR |10.25 |Different
6/30/1995 |lcus|SRR |SRR |lcusSRR |0.45 |Different
11/27/1996|lcus|LLY |LLY |lcusLLY |76.37 |Matched
11/27/1996|lcus|lly |LLY |lcusLLY |76 |Matched
11/28/1996|lcus|LLY |LLY |lcusLLY |76.37 |Matched
11/28/1996|lcus|lly |LLY |lcusLLY |76 |Matched
I tried the following CASE statement but it is creating errors
SELECT * from duppri t
where exists (
select 1,
CASE IsMatch WHEN [type] = [t.TYPE] THEN 'Matched' ELSE 'Different' END
from duppri
where symbolUP = t.symbolUP
AND date = t.date
and price <> t.price)
ORDER BY date

You could just use window functions, if I understand correctly:
select d.*,
(case when mint = maxt
then 'Matched' else 'Different'
end) as IsMatch
from (select d.*,
min(type) over (partition by symbolup, date) as mint,
max(type) over (partition by symbolup, date) as maxt,
min(price) over (partition by symbolup, date) as minp,
max(price) over (partition by symbolup, date) as maxp
from duppri d
) d
where minp <> maxp
order by date;

A subquery used with the EXISTS predicate can only yield true or false; whatever is in its select list is ignored, so the CASE has to go elsewhere. You can get what you want with a correlated subquery in the outer select list, like this:
select
*,
(select
CASE when count(distinct type) = 1 THEN 'Matched' ELSE 'Different' END
from duppri
where symbolUP = t.symbolUP and date = t.date
) IsMatch
from duppri t
where exists (
select 1
from duppri
where symbolUP = t.symbolUP
and date = t.date
and price <> t.price);
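For anyone who wants to try the correlated-subquery idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. Treat it as a stand-in under stated assumptions: the table is trimmed to four columns, the sample rows are adapted from the question, and SQLite rather than T-SQL runs the query, so collation behaviour may differ from the original system.

```python
import sqlite3

# Sample rows adapted from the question (date, type, symbolUP, price).
rows = [
    ("1995-06-30", "gaus", "313586U72", 109.25),
    ("1995-06-30", "gbus", "313586U72", 108.94),
    ("1995-06-30", "agus", "SRR", 10.25),
    ("1995-06-30", "lcus", "SRR", 0.45),
    ("1996-11-27", "lcus", "LLY", 76.37),
    ("1996-11-27", "lcus", "LLY", 76.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE duppri (date TEXT, type TEXT, symbolUP TEXT, price REAL)")
conn.executemany("INSERT INTO duppri VALUES (?, ?, ?, ?)", rows)

query = """
SELECT t.date, t.type, t.symbolUP, t.price,
       -- one distinct type per (symbolUP, date) group means the types match
       (SELECT CASE WHEN COUNT(DISTINCT d.type) = 1
                    THEN 'Matched' ELSE 'Different' END
        FROM duppri d
        WHERE d.symbolUP = t.symbolUP AND d.date = t.date) AS IsMatch
FROM duppri t
WHERE EXISTS (SELECT 1
              FROM duppri d
              WHERE d.symbolUP = t.symbolUP
                AND d.date = t.date
                AND d.price <> t.price)
ORDER BY t.date, t.symbolUP, t.price
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The CASE lives in the outer select list, while EXISTS is left to do the filtering only.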

Related

SQL - find row with closest date but different column value

I'm new to SQL and I need some help.
I have a table (TAB) and, for every item B in it, I need to find the item A with the closest date. In this case that is the A with 02.09.2021 04:25:30.
Date |Item
07.09.2021 05:02:05|A
06.09.2021 05:01:02|A
05.09.2021 05:00:02|A
04.09.2021 04:59:01|A
03.09.2021 04:58:03|A
02.09.2021 04:56:55|A
02.09.2021 04:33:56|B
02.09.2021 04:25:30|A
WITH CTE(DATE, ITEM) AS
(
SELECT '20210907 05:02:05', 'A' UNION ALL
SELECT '20210906 05:01:02', 'A' UNION ALL
SELECT '20210905 05:00:02', 'A' UNION ALL
SELECT '20210904 04:59:01', 'A' UNION ALL
SELECT '20210903 04:58:03', 'A' UNION ALL
SELECT '20210902 04:56:55', 'A' UNION ALL
SELECT '20210902 04:33:56', 'B' UNION ALL
SELECT '20210902 04:25:30', 'A'
)
SELECT
CAST(C.DATE AS DATETIME) AS X_DATE, C.ITEM, Q.CLOSEST
FROM CTE AS C
OUTER APPLY
(
-- latest 'A' row strictly before the current row, hence DESC
SELECT TOP 1 CAST(X.DATE AS DATETIME) AS CLOSEST
FROM CTE AS X
WHERE X.ITEM = 'A' AND CAST(X.DATE AS DATETIME) < CAST(C.DATE AS DATETIME)
ORDER BY CAST(X.DATE AS DATETIME) DESC
) Q
WHERE C.ITEM = 'B'
You can use the OUTER APPLY approach shown in the query above. Note that it only considers 'A' rows earlier than the 'B' row.
Also note that the datetime column (DATE) is written in an ISO-compliant form.
Your data has only two columns. If you want only the closest A timestamp, then the fastest way is probably window functions:
select t.*,
(case when prev_a_date is null then next_a_date
when next_a_date is null then prev_a_date
when datediff(second, prev_a_date, date) <= datediff(second, date, next_a_date) then prev_a_date
else next_a_date
end) as a_date
from (select t.*,
max(case when item = 'A' then date end) over (order by date) as prev_a_date,
min(case when item = 'A' then date end) over (order by date desc) as next_a_date
from t
) t
where item = 'B';
This uses seconds to measure the time difference, but you can use a smaller unit if appropriate.
You can also do this using apply if you have more columns from the "A" rows that you want:
select b.*, a.*
from t b outer apply
(select top (1) ta.*
from t ta
where ta.item = 'A'
order by abs(datediff(second, ta.date, b.date))
) a
where b.item = 'B';

How to select data without using group?

My base data is aggregated by dealer code only, but for one condition we also need to select another field (chassis no.) so that it can be matched against another temp table. How can I retrieve data grouped only by dealer code while still matching the condition on chassis no.?
Below is the sample data:
This is how we have selected the data for the requirement:
---------------lastyrRenewalpolicy------------------
IF OBJECT_ID('TEMPDB..#LASTYRETEN') IS NOT NULL DROP TABLE #LASTYRETEN
select DEALERMASTERCODE , count(*) RENEWALEXPRPOLICY,SUM(NETOD_YEAR_PREM_PART_A) AS 'ACHIEVED-ODPREMIUM_RENEWAL' into #LASTYRETEN
from [dbo].[T_RE_POLICY_TRANSACTION]
where cast (InsPolicyCreatedDate as date) between @FirstDayC and @LastDayC
AND PolicyStatus= 'Renewal' AND (ltrim(rtrim(ISCANCELLEDSTATUS)) = 0 ) group by DEALERMASTERCODE
-----------------lastrollower------------------------
IF OBJECT_ID('TEMPDB..#LASTYROLWR') IS NOT NULL DROP TABLE #LASTYROLWR
select DEALERMASTERCODE , count(*) ROLLOWEEXPRPOLICY ,SUM(NETOD_YEAR_PREM_PART_A) AS 'ACHIEVED-ODPREMIUM_ROLLOVER'
into #LASTYROLWR from [dbo].[T_RE_POLICY_TRANSACTION] where cast (InsPolicyCreatedDate as date) between @FirstDayC and @LastDayC
AND PolicyStatus= 'ROLLOVER' AND (ltrim(rtrim(ISCANCELLEDSTATUS)) = 0 ) group by DEALERMASTERCODE
Continuing with the above flow, below is the other select statement, which is creating an issue at the end due to grouping:
-------------OTHERYRBASE(EXPIRYRENEWAL)--------------
IF OBJECT_ID('TEMPDB..#OTHERYRBASEEXPIRY') IS NOT NULL DROP TABLE #OTHERYRBASEEXPIRY
select DEALERMASTERCODE ,ChassisNo , count(*) RENEWALPOLICYEXPIRY
into #OTHERYRBASEEXPIRY
from [dbo].[T_RE_POLICY_TRANSACTION] where cast (PolicyExpiryDate as date) between '2020-08-01' and '2020-08-31'
and BASIC_PREM_TOTAL <> 0 AND PolicyStatus in ('Renewal','rollover') and BusinessType='jcb'
AND (ltrim(rtrim(ISCANCELLEDSTATUS)) = 0 ) group by DEALERMASTERCODE,ChassisNo
-------------OTHERYRBASE(EXPIRYRENEWAL)--------------
IF OBJECT_ID('TEMPDB..#OTHERYRCON') IS NOT NULL DROP TABLE #OTHERYRCON
select OTE.DEALERMASTERCODE ,OTE.ChassisNo , count(*) OTHERYRCON into #OTHERYRCON
from [dbo].[T_RE_POLICY_TRANSACTION] OTE INNER JOIN #OTHERYRBASEEXPIRY EXP
ON OTE.ChassisNo=EXP.ChassisNo
where cast(CREATED_DATE as date) between '2020-06-01' and '2020-12-31' and BusinessType='jcb'
and OTE.BASIC_PREM_TOTAL <> 0 AND OTE.PolicyStatus = 'Renewal'
AND (ltrim(rtrim(ISCANCELLEDSTATUS)) = 0 ) group by OTE.DEALERMASTERCODE,OTE.ChassisNo
Thanks a lot in advance for your help.
After taking a look at this code, it seems there may be a missing JOIN condition in the last SELECT statement. In the code provided, the JOIN is only on ChassisNo, but the GROUP BY in the prior query that populates the temporary table also includes the DEALERMASTERCODE column. I think DEALERMASTERCODE should be added to the JOIN condition, something like this:
select OTE.DEALERMASTERCODE ,OTE.ChassisNo , count(*) OTHERYRCON
into #OTHERYRCON
from [dbo].[T_RE_POLICY_TRANSACTION] OTE
INNER JOIN #OTHERYRBASEEXPIRY EXP ON OTE.DEALERMASTERCODE=EXP.DEALERMASTERCODE
and OTE.ChassisNo=EXP.ChassisNo
where cast(CREATED_DATE as date) between '2020-06-01' and '2020-12-31'
and BusinessType='jcb'
and OTE.BASIC_PREM_TOTAL <> 0
AND OTE.PolicyStatus = 'Renewal'
AND (ltrim(rtrim(ISCANCELLEDSTATUS)) = 0 )
group by OTE.DEALERMASTERCODE,OTE.ChassisNo;
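To see why the extra JOIN column might matter, here is a small sketch using Python's built-in sqlite3 module. Everything in it is hypothetical: the tables policy and expiry, their columns, and the rows are invented for illustration, and it assumes a chassis number can appear under more than one dealer, which the question does not confirm.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE policy (dealer TEXT, chassis TEXT)")  # stand-in for the transaction table
conn.execute("CREATE TABLE expiry (dealer TEXT, chassis TEXT)")  # stand-in for the grouped temp table
conn.executemany("INSERT INTO policy VALUES (?, ?)",
                 [("D1", "C1"), ("D1", "C1"), ("D2", "C1")])
conn.executemany("INSERT INTO expiry VALUES (?, ?)",
                 [("D1", "C1"), ("D2", "C1")])

def counts(join_condition):
    """Count policy rows per (dealer, chassis) for a given JOIN condition."""
    sql = f"""
        SELECT p.dealer, p.chassis, COUNT(*)
        FROM policy p
        INNER JOIN expiry e ON {join_condition}
        GROUP BY p.dealer, p.chassis
        ORDER BY p.dealer
    """
    return conn.execute(sql).fetchall()

# Joining on chassis alone pairs every policy row with both expiry rows.
chassis_only = counts("p.chassis = e.chassis")
# Joining on both keys pairs each policy row with its own dealer's expiry row.
both_keys = counts("p.dealer = e.dealer AND p.chassis = e.chassis")

print(chassis_only)
print(both_keys)
```

With the chassis-only join the counts are doubled, because each policy row matches one temp-table row per dealer that shares the chassis.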

Writing a Single Query w/ Multiple CTE Subqueries SQL/R

I have some data I would like to pull from a database, and I'm using RStudio for my query. What I intend to do is write:
The first CTE statement to pull all my necessary information.
The second CTE statement will add two new columns for two row numbers, which are partitioned by different groups. Two additional columns will be added for Lead and Lag values.
The third CTE will produce two more columns where the two columns use nested case_when statements to give me NewOpen and NewClosed dates.
What I have so far:
q5<- sqlQuery(ch,paste("
;with CTE AS
(
select
oz.id as AccountID
,ac.PROD_TYPE_CDE as ProductTypeCode
,CASE WHEN ac.OPEN_DTE='0001-01-01' then null else ac.OPEN_DTE END as OpenDate
,CASE WHEN ac.CLOS_DTE = '0001-01-01' then null else ac.CLOS_DTE END as ClosedDate
,df.proc_dte as FullDate
FROM
dbs.tb_dbs_acct_fact df
inner join
dbs.tb_acct_details ac on df.dw_serv_id = ac.dw_serv_id
left outer join
dbs.tb_oz_id oz on df.proc_dte = oz.proc_dte
),
cte1 as
(
select *
,row_nbr = row_number() over( partition by AccountID order by AccountID, FullDate asc )
,row_nbr2 = row_number() over( partition by AccountID,ProductTypeCode order by AccountID, FullDate asc )
,lag(ProductTypeCode) over(partition by AccountID order by FullDate asc ) as Lagging
,LEAD(ProductTypeCode) over(partition by AccountID order order by FullDate asc ) as Leading
FROM CTE
),
cte2 as (select *
,case when cte1.row_nbr = 1 & cte1.Lagging=cte1.ProductTypeCode then cte1.OpenDate else
case when cte1.Lagging<>cte1.ProductTypeCode then cte1.FullDate else NULL END END as NewOpen
,case when cte1.ClosedDate IS NOT NULL then cte1.ClosedDate else
case when cte1.Leading <> cte1.ProductTypeCode then cte1.FullDate else NULL END END as NewClosed
FROM cte1
);"))
This code, however, won't run.
As mentioned, WITH only defines CTEs for use in a final query. Your query contains only CTE definitions and never uses any of them in a final statement. Additionally, you can combine the first two CTEs, since window functions can run at any level. The last CTE can then probably serve as your final SELECT statement.
sql <- "WITH CTE AS
(SELECT
oz.id AS AccountID
, ac.PROD_TYPE_CDE as ProductTypeCode
, CASE
WHEN ac.OPEN_DTE='0001-01-01'
THEN NULL
ELSE ac.OPEN_DTE
END AS OpenDate
, CASE
WHEN ac.CLOS_DTE = '0001-01-01'
THEN NULL
ELSE ac.CLOS_DTE
END AS ClosedDate
, df.proc_dte AS FullDate
, ROW_NUMBER() OVER (PARTITION BY oz.id
ORDER BY oz.id, df.proc_dte) AS row_nbr
, ROW_NUMBER() OVER (PARTITION BY oz.id, ac.PROD_TYPE_CDE
ORDER BY oz.id, df.proc_dte) AS row_nbr2
, LAG(ac.PROD_TYPE_CDE) OVER (PARTITION BY oz.id
ORDER BY df.proc_dte) AS Lagging
, LEAD(ac.PROD_TYPE_CDE) OVER (PARTITION BY oz.id
ORDER BY df.proc_dte) AS Leading
FROM
dbs.tb_dbs_acct_fact df
INNER JOIN
dbs.tb_acct_details ac ON df.dw_serv_id = ac.dw_serv_id
LEFT OUTER JOIN
dbs.tb_oz_id oz ON df.proc_dte = oz.proc_dte
)
SELECT *
, CASE
WHEN row_nbr = 1 AND Lagging = ProductTypeCode
THEN OpenDate
ELSE
CASE
WHEN Lagging <> ProductTypeCode
THEN FullDate
ELSE NULL
END
END AS NewOpen
, CASE
WHEN ClosedDate IS NOT NULL
THEN ClosedDate
ELSE
CASE
WHEN Leading <> ProductTypeCode
THEN FullDate
ELSE NULL
END
END AS NewClosed
FROM CTE;"
q5 <- sqlQuery(ch, sql)
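The point about needing a final statement can be checked quickly with Python's built-in sqlite3 module. This is a sketch under assumptions: the table acct_fact, its columns, and the simplified new_open rule are hypothetical stand-ins rather than the schema from the question, and it needs an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE acct_fact (acct INTEGER, prod TEXT, day TEXT)")
conn.executemany("INSERT INTO acct_fact VALUES (?, ?, ?)", [
    (1, "X", "2020-01-01"),
    (1, "X", "2020-01-02"),
    (1, "Y", "2020-01-03"),
])

# A WITH clause on its own is not a complete statement.
try:
    conn.execute("WITH cte AS (SELECT 1 AS x)")
    cte_only_failed = False
except sqlite3.Error:
    cte_only_failed = True

# The same kind of CTE followed by a final SELECT that actually uses it.
rows = conn.execute("""
    WITH cte AS (
        SELECT acct, prod, day,
               LAG(prod) OVER (PARTITION BY acct ORDER BY day) AS lagging
        FROM acct_fact
    )
    SELECT acct, prod, day,
           -- flag the day a product first appears or changes
           CASE WHEN lagging IS NULL OR lagging <> prod THEN day END AS new_open
    FROM cte
    ORDER BY acct, day
""").fetchall()

print(cte_only_failed)
for row in rows:
    print(row)
```

Computing the window column inside the CTE and applying the CASE in the final SELECT mirrors the structure of the answer above.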

How to get the difference in dates in SQL Server

I'm having trouble writing a query to get the difference between the UpdateDate and the CreationDate if the record has the lowest ID, and otherwise the difference between the most recent and second most recent UpdateDate.
Here's my Query:
SELECT
a.ID, a.RequestID, b.KrStatus, b.CrDate , b.UpdateDate,
DATEDIFF (HOUR, b.CrDate, b.UpdateDate) AS TimeDifference,
CASE WHEN a.ID = (SELECT MAX(a.ID) FROM [dbo].[Krdocs_hist] a WHERE a.RequestID = 1)
THEN 'YES'
ELSE 'NO'
END AS isMax,
CASE WHEN a.ID = (SELECT MIN(a.ID) FROM [dbo].[Krdocs_hist] a WHERE a.RequestID = 1)
THEN 'YES'
ELSE 'NO'
END AS isMin
FROM [dbo].[Krdocs_hist] a, [dbo].Krdocs_Details_hist b
WHERE
a.RequestId = b.RequestId
and a.ID = b.ID
and a.RequestId = 1
ORDER BY b.RequestID
Here's my current result:
What I'd like to do is get the last record and check whether there was an existing one before it. If there wasn't, compare the UpdateDate and CrDate (UpdateDate minus CrDate). If there was a record before it, I want UpdateDate minus the previous UpdateDate.
Using this query:
SELECT b.Id, b.RequestId, b.UpdateDate, b.KrStatus
FROM [dbo].[Krdocs_Details_hist] b
WHERE b.RequestId = 1
Has this result:
And using this query:
SELECT a.*
FROM [dbo].[Krdocs_hist] a
WHERE RequestId = 1
Has this result:
UPDATE
LAG is available from SQL Server 2012 onward, so you can use it like this:
SELECT
ID,
RequestID,
CrDate,
UpdateDate,
KrStatus,
DATEDIFF(HOUR, PreviousUpdateDate, UpdateDate) as TimeDifference
FROM
(SELECT
ID,
RequestID,
CrDate,
UpdateDate,
KrStatus,
LAG(UpdateDate, 1, CrDate) OVER (PARTITION BY RequestID ORDER BY ID) AS PreviousUpdateDate
FROM [dbo].Krdocs_Details_hist) as tmp
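The three-argument form of LAG, which falls back to CrDate when there is no previous row, can be tried with Python's built-in sqlite3 module. This is a sketch with assumptions: the table details and its rows are made up, rows are ordered by ID within each RequestID (my guess at the intended ordering), and elapsed hours are computed from Unix timestamps because SQLite has no DATEDIFF. T-SQL's DATEDIFF(HOUR, ...) counts hour boundaries crossed, so results can differ for non-round times.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE details (ID INTEGER, RequestID INTEGER, CrDate TEXT, UpdateDate TEXT)"
)
conn.executemany("INSERT INTO details VALUES (?, ?, ?, ?)", [
    (1, 1, "2020-01-01 00:00:00", "2020-01-01 05:00:00"),
    (2, 1, "2020-01-01 00:00:00", "2020-01-02 05:00:00"),
    (3, 1, "2020-01-01 00:00:00", "2020-01-02 07:00:00"),
])

query = """
SELECT ID,
       (CAST(strftime('%s', UpdateDate) AS INTEGER)
        - CAST(strftime('%s', PreviousUpdateDate) AS INTEGER)) / 3600 AS hours
FROM (SELECT ID, UpdateDate,
             -- first row of each request has no predecessor, so use CrDate
             LAG(UpdateDate, 1, CrDate)
                 OVER (PARTITION BY RequestID ORDER BY ID) AS PreviousUpdateDate
      FROM details) AS tmp
ORDER BY ID
"""
diffs = conn.execute(query).fetchall()
print(diffs)
```

The first row is measured against its own CrDate, and every later row against the previous row's UpdateDate.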
I think you can try something like this:
SELECT
CASE
WHEN COUNT(*) <= 1 THEN DATEDIFF(HOUR,
(SELECT CrDate FROM [dbo].Krdocs_Details_hist),
(SELECT UpdateDate FROM [dbo].Krdocs_Details_hist))
WHEN COUNT(*) > 1 THEN DATEDIFF(HOUR,
(SELECT MAX(UpdateDate) FROM [dbo].Krdocs_Details_hist WHERE UpdateDate < ( SELECT MAX(UpdateDate) FROM [dbo].Krdocs_Details_hist)),
(SELECT MAX(UpdateDate) FROM [dbo].Krdocs_Details_hist))
END AS TimeDifference
FROM [dbo].Krdocs_Details_hist

How do I remove certain duplicates in a complex SQL query

I am writing a query that needs to remove all duplicates of a.GenUserID while keeping the most recent login date (b.LogDateTime), and that date must be more than 6 months old. Any later dates have to be removed.
I hope this makes sense.
SELECT DISTINCT
a.GenUserID,
c.DeletionDate,
b.LogDateTime,
(CASE c.Disabled WHEN 0 THEN 'NO' else 'YES - ARCHIVED' end)
FROM RioReport.dbo.GenUser a
LEFT JOIN dbo.GenUserArchive c on a.GenUserID = c.GenUserID
LEFT JOIN dbo.GenUserAccessHistory b on a.GenUserID = b.ExtraInfo
WHERE(a.Disabled=0 or c.Disabled=0)
AND c.DeletionDate IS NOT NULL
AND ((DateAdd(MM, -6, GetDate()) > b.LogDateTime or b.LogDateTime IS NULL))
ORDER BY a.GenUserID, b.LogDateTime desc
You could add the row_number() information to your query, and wrap that query into an outer query that just takes the records with number 1 from that result:
select *
from (
select a.GenUserID,
c.DeletionDate,
b.LogDateTime,
case c.Disabled when 0 then 'NO' else 'YES - ARCHIVED' end as disabled,
row_number() over (partition by a.GenUserID
order by b.LogDateTime desc) as rn
from RioReport.dbo.GenUser a
inner join dbo.GenUserArchive c
on a.GenUserID = c.GenUserID
left join dbo.GenUserAccessHistory b
on a.GenUserID = b.ExtraInfo
where (a.Disabled=0 or c.Disabled=0)
and c.DeletionDate is not null
and (DateAdd(MM, -6, GetDate()) > b.LogDateTime or b.LogDateTime is null)
) t
where rn = 1
order by GenUserID
Note that you can turn the first left join into an inner join without any change to the result set, since you have a non-null check on one of its fields. inner join is then preferred, and might give a performance improvement.
If GenUserAccessHistory.LogDateTime is always non-null, then you can avoid the test or b.LogDateTime is null by moving the DateAdd(MM, -6, GetDate()) > b.LogDateTime condition to the appropriate join on clause.
The generated row number will be given in order of descending LogDateTime values, and restart from 1 for every different user.
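Here is a minimal sketch of the "filter first, then keep row number 1" pattern, using Python's built-in sqlite3 module. The table logins, its columns, and the fixed cutoff date are hypothetical; a fixed cutoff replaces DateAdd(MM, -6, GetDate()) only so that the output is reproducible. It needs an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logins (user_id INTEGER, log_dt TEXT)")
conn.executemany("INSERT INTO logins VALUES (?, ?)", [
    (1, "2020-01-01"),
    (1, "2020-03-01"),
    (1, "2021-06-01"),  # newer than the cutoff, so it is filtered out
    (2, "2019-12-31"),
])

cutoff = "2021-01-01"  # stand-in for "six months ago"

query = """
SELECT user_id, log_dt
FROM (SELECT user_id, log_dt,
             -- numbering restarts at 1 for each user, newest login first
             ROW_NUMBER() OVER (PARTITION BY user_id
                                ORDER BY log_dt DESC) AS rn
      FROM logins
      WHERE log_dt < ?) AS ranked
WHERE rn = 1
ORDER BY user_id
"""
latest_old = conn.execute(query, (cutoff,)).fetchall()
print(latest_old)
```

Because the cutoff filter runs inside the derived table, row number 1 is the most recent login among those old enough, not the most recent login overall.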
Alternative without window functions
row_number() and the other ranking window functions have been supported since SQL Server 2005. In comments you write you cannot use it. If that is the case, here is an alternative that avoids window functions, using a common table expression (also supported since SQL Server 2005):
;with cte as (
select a.GenUserID,
c.DeletionDate,
b.LogDateTime,
case c.Disabled when 0 then 'NO' else 'YES - ARCHIVED' end as disabled
from RioReport.dbo.GenUser a
inner join dbo.GenUserArchive c
on a.GenUserID = c.GenUserID
left join dbo.GenUserAccessHistory b
on a.GenUserID = b.ExtraInfo
where (a.Disabled=0 or c.Disabled=0)
and c.DeletionDate is not null
and (DateAdd(MM, -6, GetDate()) > b.LogDateTime or b.LogDateTime is null)
)
select *
from cte main
where LogDateTime is null
or not exists (select 1
from cte sub
where sub.GenUserID = main.GenUserID
and sub.LogDateTime > main.LogDateTime)
order by GenUserID
Try with the below query.
;WITH CTE_Group
AS(
SELECT
ROW_NUMBER() OVER (PARTITION BY a.GenUserID ORDER BY b.LogDateTime DESC) as RNO,
a.GenUserID,
c.DeletionDate,
b.LogDateTime,
(CASE c.Disabled WHEN 0 THEN 'NO' else 'YES - ARCHIVED' end) IsArchived
FROM RioReport.dbo.GenUser a
LEFT JOIN dbo.GenUserArchive c on a.GenUserID = c.GenUserID
LEFT JOIN dbo.GenUserAccessHistory b on a.GenUserID = b.ExtraInfo
WHERE(a.Disabled=0 or c.Disabled=0)
AND c.DeletionDate IS NOT NULL
AND ((DateAdd(MM, -6, GetDate()) > b.LogDateTime or b.LogDateTime IS NULL)))
SELECT GenUserID,
DeletionDate,
LogDateTime,
IsArchived
FROM CTE_Group
WHERE RNO=1
Use a CTE and a window function:
;with ctr as (
select a.GenUserID, c.DeletionDate, b.LogDateTime,
row_number() over (partition by a.GenUserID order by b.LogDateTime desc) rnk
from RioReport.dbo.GenUser a
left join dbo.GenUserArchive c on a.GenUserID = c.GenUserID
left join dbo.GenUserAccessHistory b on a.GenUserID = b.ExtraInfo )
select a.GenUserID, a.DeletionDate, a.LogDateTime,
(CASE WHEN DATEDIFF(mm, a.LogDateTime, getdate()) < 6 THEN 'NO' else 'YES - ARCHIVED' end) as IsArchived
from ctr a where a.rnk = 1