I am sure this is a very stupid question and I am having a dumb moment.
Consider the following basic scenario (this is a very small scenario compared with reality which has many many dimensions and measures):
What I need to get to is the expected output.
So ALL costs between the input_Date and output_date defined in the params are included. However only the latest PID is included- defined as either:
1- where PIDs run sequentially, or overlap the latest one based on date_to as long as both aren't active at the # output date
2- where there are two PID active at the # output date show both
I can't for the life of me work out how to do this in SQL, note that is has to be non dynamic and not use any CTE unfortunately, just your basic SQL with subqueries
Obviously returning the necessary list of ID and PID is easy:
declare #input_date date ='2006-01-01'
declare #output_date date ='2006-12-31'
select a.PID, a.ID
from #tmp a
where date_from <=#output_date and date_to >=#input_date
But I can't figure out how to join this back to return the correct cost values
drop table tmp
CREATE TABLE [dbo].[tmp](
[date_from] [datetime] NOT NULL,
[date_to] [datetime] NOT NULL,
[ID] [nvarchar](25) NOT NULL,
[PID] [nvarchar](25) NOT NULL,
[cost] [float] NULL
) ON [PRIMARY]
INSERT tmp VALUES('2005-1-1','2005-1-31','10001','X123',1254.32)
INSERT tmp VALUES('2000-10-10','2006-8-21','10005','TEST01',21350.9636378758)
INSERT tmp VALUES('2006-8-22','2099-12-31','10005','TEST02',22593.4926163943)
INSERT tmp VALUES('2006-1-1','2099-12-31','10006','X01',22458.3342354444)
INSERT tmp VALUES('2006-2-8','2099-12-31','10006','X02',22480.3772331959)
INSERT tmp VALUES('2006-1-1','2006-2-7','10007','AB01',565.416874152212)
INSERT tmp VALUES('2006-2-8','2006-7-31','10007','AA05',19108.3206482165)
I've made some progress using a CTE so you can see how I would do it this way if I could:
drop table #tmp
CREATE TABLE #tmp (
[date_from] [datetime] NOT NULL,
[date_to] [datetime] NOT NULL,
[ID] [nvarchar](25) NOT NULL,
[PID] [nvarchar](25) NOT NULL,
[cost] [float] NULL
) ON [PRIMARY]
INSERT #tmp VALUES('2005-1-1','2005-1-31','10001','X123',1254.32)
INSERT #tmp VALUES('2000-10-10','2006-8-21','10005','TEST01',21350.9636378758)
INSERT #tmp VALUES('2006-8-22','2099-12-31','10005','TEST02',22593.4926163943)
INSERT #tmp VALUES('2006-1-1','2099-12-31','10006','X01',22458.3342354444)
INSERT #tmp VALUES('2006-2-8','2099-12-31','10006','X02',22480.3772331959)
INSERT #tmp VALUES('2006-1-1','2006-2-7','10007','AB01',565.416874152212)
INSERT #tmp VALUES('2006-2-8','2006-7-31','10007','AA05',19108.3206482165)
declare #input_date date ='2006-01-01'
declare #output_date date ='2006-12-31'
;with cte as (
select t.id,t.PID,t.cost,t.date_from,t.date_to ,
iif(date_To >= #output_date OR max_date_To is not null,PID,NULL) as PID2,
b.total_id_cost
from #tmp t
left join (select ID,max(date_to) as max_date_to
from #tmp
where date_from <=#output_date and date_to >=#input_date
group by ID) a
on t.ID = a.ID and t.date_to = a.max_date_to
left join (Select ID, sum(cost) as total_id_cost
from #tmp
where date_from <=#output_date and date_to >=#input_date
group by ID) b
on t.ID = b.ID
where date_from <=#output_date and date_to >=#input_date )
select distinct ID,PID2,
iif(ID in (
select ID
from cte
where PID2 IS NULL)
and ID not in (select ID
from cte
where PID IS NOT NULL
group by ID
having count (distinct PID2) >1 ), cte.total_id_cost, cost) as cost
from cte
where PID2 is not null;
so it looks like there's several problems to solve within 1 query.
We want the PID that matches the latest date. This wasn't too difficult and can be solved by joining the data with an aggregate of itself that finds the latest date
Where both PID is active i.e. overlapping from and to dates, then both must show. I found this to be more tricky. in the end I did a query to find the ones that do overlap and meet the dates, and did a count on that. then used this count as a criteria for the join on 1. so that it can conditionally pick the PID that matches the latest date
Then finally using the results from above, you can do the sum to get the cost. The resulting query is a bit of a monster, but here it is.
if it doesn't cover other scenarios not detailed, do let me know.
DECLARE #Data TABLE (date_from DATETIME, date_to DATETIME, ID INT, PID NVARCHAR(50), COST MONEY)
INSERT #Data VALUES('2005-1-1','2005-1-31','10001','X123',1254.32)
INSERT #Data VALUES('2000-10-10','2006-8-21','10005','TEST01',21350.9636378758)
INSERT #Data VALUES('2006-8-22','2099-12-31','10005','TEST02',22593.4926163943)
INSERT #Data VALUES('2006-1-1','2099-12-31','10006','X01',22458.3342354444)
INSERT #Data VALUES('2006-2-8','2099-12-31','10006','X02',22480.3772331959)
INSERT #Data VALUES('2006-1-1','2006-2-7','10007','AB01',565.416874152212)
INSERT #Data VALUES('2006-2-8','2006-7-31','10007','AA05',19108.3206482165)
declare #input_date date ='2006-01-01'
declare #output_date date ='2006-12-31'
select
a.ID,
PIDForMaxDateThatMatches.PID,
SUM(a.cost) as cost
from
#Data a
inner join (
-- number of PIDs for dates that overlap grouped by ID
select
a.ID,
-- where there's no overlap then we want the count to be 1 so that later we can use it as condition
COUNT(DISTINCT ISNULL(b.PID,'')) as NumberOfPID
from
#Data a
-- may or may not find overlaps
LEFT JOIN #data b ON
b.date_from <=#output_date and
b.date_to >=#input_date and
a.date_from <= b.date_to and
a.date_to >= b.date_from and
a.ID = b.ID and
a.PID <> b.PID
where
a.date_from <=#output_date and
a.date_to >=#input_date
group by
a.ID) as PIDCountForOverlappingMatches ON
a.ID = PIDCountForOverlappingMatches.ID
left join (
-- get the PID that matches the max date_to
select
DataForMaxDate.ID,
DataForMaxDate.date_from,
DataForMaxDate.date_to,
DataForMaxDate.PID
from
#Data as DataForMaxDate
inner join (
-- get the max date_to that matches the criteria
select
ID,
MAX(date_to) as maxDateTo
from
#Data a
where
date_from <=#output_date and
date_to >=#input_date
group by
ID) as MaxToDatePerID on
DataForMaxDate.ID = MaxToDatePerID.ID and
DataForMaxDate.date_to = MaxToDatePerID.maxDateTo) as PIDForMaxDateThatMatches on
a.ID = PIDForMaxDateThatMatches.ID AND
-- if there's no overlapping dates the PID count would be 1, which we'll take the PID that matches the max(date_to)
-- but if there is overlap, then we want both dates to show, thus the from date must also match before we take the PID
(PIDCountForOverlappingMatches.NumberOfPID = 1 OR a.date_from = PIDForMaxDateThatMatches.date_from)
where
a.date_from <= #output_date and
a.date_to >= #input_date
GROUP BY
a.ID,
PIDForMaxDateThatMatches.PID
ORDER BY
a.ID
EDIT: DB Fiddle http://dbfiddle.uk/?rdbms=sqlserver_2014&fiddle=d43cb4b9765da1bca035531e78a2c77d
Results:
ID PID cost
10005 TEST02 43944.4562
10006 X01 22458.3342
10006 X02 22480.3772
10007 AA05 19673.7375
Hello you can try the following query :
select a.resource_id ID, max(a.post_id) PID, SUM(a.cost) Cost
from #tmp a
where date_from <=#output_date and date_to >=#input_date
group by a.resource_id
order by a.resource_id;
I think this might work:
SELECT
t1.ID,
q1.PID,
SUM(t1.cost)
FROM
Table AS t1
JOIN
(
SELECT
q2.ID,
t2.PID
FROM
(
SELECT
ID,
MAX(date_to) AS maxdate
FROM
Table
GROUP BY
ID
) AS q2
JOIN
table AS t2
ON
q2.ID = t2.ID
AND
q2.maxdate = t2.date_to
) AS q1
ON
t1.ID = q1.ID
AND
t1.PID = q1.PID
GROUP BY
t1.ID,
q1.PID
Here is a query without CTE. Idea of query:
1) Find consecutive dates and make different groups within each id
2) Find min and max date, sum of costs for each group
3) Limit by input parametres
declare #date_from date = '20060101'
declare #date_to date = '20061231'
declare #myTable table(
date_from date
, date_to date
, id int
, pid varchar(30)
, cost decimal(10,2)
)
insert into #myTable values
('20050101', '20050201', 10001, 'x123', 1254.32)
, ('20001010', '20060821', 10005, 'test01', 21350.96)
, ('20060822', '20991231', 10005, 'test02', 22593.49)
, ('20060101', '20991231', 10006, 'x01', 22548.33)
, ('20060208', '20991231', 10006, 'x02', 22480.38)
, ('20060101', '20060207', 10007, 'abo1', 565.42)
, ('20060208', '20060731', 10007, 'abo2', 19108.32)
select
date_from = min(date_from), date_to = max(date_to)
, id, pid = max(case when date_to = max_date_to then pid end)
, cost = sum(cost)
from (
select
a.date_from, a.date_to, a.id, a.pid, a.cost, a.rn, grp = sum(b.ss)
, max_date_to = max(a.date_to) over (partition by a.id, sum(b.ss))
from
(
select
a.*, ss = case when datediff(dd, b.date_to, a.date_from) = 1 then 0 else 1 end
from
(
select
*, rn = row_number() over (partition by id order by date_from)
from
#myTable
) a
left join (
select
*, rn = row_number() over (partition by id order by date_from)
from
#myTable
) b on a.id = b.id and a.rn - 1 = b.rn
) a
left join (
select
a.*, ss = case when datediff(dd, b.date_to, a.date_from) = 1 then 0 else 1 end
from
(
select
*, rn = row_number() over (partition by id order by date_from)
from
#myTable
) a
left join (
select
*, rn = row_number() over (partition by id order by date_from)
from
#myTable
) b on a.id = b.id and a.rn - 1 = b.rn
) b on a.id = b.id and a.rn >= b.rn
group by a.date_from, a.date_to, a.id, a.pid, a.cost, a.rn
) t
group by id, grp, max_date_to
having min(date_from) <= #date_from and max(date_to) >= #date_to
order by id
Output
date_from date_to id pid cost
------------------------------------------------
2000-10-10 2099-12-31 10005 test02 43944.45
2006-01-01 2099-12-31 10006 x01 22548.33
Result is a bit different than your provided output. But:
1) For id = 10006 and pid = X02 date_from = 08/02/2006 while input is 01/01/2006
2) For id = 10007 date_to = 31/07/2006 while input is 31/12/2006
So, I think query works correctly
Rextester demo in more readable format with cte
Related
I finished writing multiple SQL statements in order to generate my final table CCtbase2. What I want to do next is to package the entire SQL code into a single SQL statement avoiding creating multiple temporary tables. I just want to write my query in a way in which I will be creating my final table without creating multiple temporary tables but I am not sure how to go about it and I apologise for not providing sample of the tables.
DROP TABLE IF EXISTS temp_t1;
Select PID, CID ,CCID, CExpiry, PStatusID, PDate into temp_t1
FROM TABLE1 where PStatusID = 1001;
-- Applying filters
DROP TABLE IF EXISTS temp_t2;
SELECT * into temp_t2
FROM TABLE2 where ADate < GETDATE() and PTID > 0 and PTStatusID <> 812 ;
-- Fetching the latest record as per ADate based on PTID
DROP TABLE IF EXISTS temp_t3;
Select * into temp_t3 from (
select *, row_number() over(partition by PTID order by ADate desc) as rn
from temp_t2
) t
where t.rn = 1
DROP TABLE IF EXISTS temp_t4;
SELECT *,
CASE WHEN TTID = 2301 and PTStatusID in (800,801) THEN PAmount
ELSE 0 END AS OAmount
into temp_t4 from temp_t3
DROP TABLE IF EXISTS temp_t5
Select PID, sum(OAmount) as Final_amt into temp_t5
from temp_t4 group by PID;
DROP TABLE IF EXISTS CCtbase1
Select * into CCtbase1 from temp_t1 where PID not in
(Select distinct(PID) from temp_t5 where Final_amt = 0);
DROP TABLE IF EXISTS CCtbase2
Select a.*, b.EDate, c.EID into CCtbase2 from CCtbase1 a
left join TABLE3 b on a.CCID = b.CCID
left join TABLE4 c on a.CID = c.CID;
As no sample data was provided, I've come up with this:
WITH T1_CTE as (
SELECT PID, CID ,CCID, CExpiry, PStatusID, PDate
FROM TABLE1 where PStatusID = 1001
),
T2_CTE as (
SELECT *, CASE WHEN TTID = 2301 AND PTStatusID IN (800,801) THEN PAmount ELSE 0 END AS OAmount
FROM
(
SELECT *, row_number() OVER(PARTITION BY PTID ORDER BY ADate DESC) rowNum
FROM TABLE2
WHERE ADate < GETDATE() AND PTID > 0 AND PTStatusID <> 812
) t
WHERE t.rowNum = 1
),
T2_F_CTE as (
SELECT PID, SUM(OAmount) as Final_amt
FROM T2_CTE
GROUP BY PID
)
SELECT a.*, b.EDate, c.EID
FROM T1_CTE a
LEFT JOIN TABLE3 b ON a.CCID = b.CCID
LEFT JOIN TABLE4 c ON a.CID = c.CID
WHERE PID NOT IN (SELECT PID from T2_F_CTE where Final_amt = 0)
This will prevent you from creating too much temporary tables.
Can you try this solution
CREATE TABLE TABLE1(
PID int,
CID int,
CCID int,
CExpiry datetime,
PStatusID int,
PDate datetime
)
CREATE TABLE TABLE2(
PID int,
ADate DATETIME,
PTID int,
PTStatusID int,
TTID int,
PAmount int
)
DROP TABLE IF EXISTS CCtbase1
;with TABLE2CTE(PID,OAmount) as
(
SELECT TOP 1
PID, CASE
WHEN TTID = 2301 AND PTStatusID in (800,801) THEN PAmount
ELSE 0
END AS OAmount
FROM TABLE2
WHERE
ADate < GETDATE()
AND PTID > 0
AND TTID = 2301
AND PTStatusID in (800,801)
ORDER BY ADate DESC
)
SELECT
PID,
CID,
CCID,
CExpiry,
PStatusID,
PDate
INTO CCtbase1
FROM TABLE1
WHERE
PID NOT IN
(Select distinct(PID) from TABLE2CTE where OAmount = 0)
I have tried many ways, but unsuccessfully, to combine Start dates and end dates where the record Id is the same and combine the where there is no break in the Date
CREATE TABLE #t (
A_ID VARCHAR(100),
BDate VARCHAR(100),
CDate VARCHAR(100)
)
INSERT INTO #T
(A_ID, BDate, CDate)
VALUES
('1000','2017/12/01','2017/12/31'),
('1000','2018/01/01','2018/03/31'),
('1000','2018/05/01','2018/05/31')
Select A_ID, bDate,cDate from
(
select BDate,A_ID,Cdate,lead(Bdate) over (order by Bdate) next_BDate from #T as t2
where exists ( select null from #T as t1
where t1.A_ID = t2.A_ID and t1.Bdate <= t2.Bdate and t1.CDate <=t2.CDate )
) as combine
where bDate < Cdate
order by BDate;
I would like to see:
1000 2017/12/01 2018/03/31 (no break in first two dates)
1000 2018/05/01 2018/05/31 (Break between 4-1-18 and 5-1-18)
This is a gaps & islands problem, depending on your actual data a solution based on bĀ“nested OLAP-functions might be more efficient that recursion:
with combine as
(
select BDate,A_ID,Cdate,
-- find the gap and flag it
case when lag(Cdate)
over (partition by A_ID
order by CDate) = dateadd(day,-1, BDate)
then 0
else 1
end as flag
from T
)
, groups as
(
Select A_ID, bDate,cDate,
-- cumulative sum over 0/1 to assign the same group number for row without gaps
sum(flag)
over (partition by A_ID
order by Bdate) as grp
from combine
)
-- group consecutive rows into one
select A_ID, min(BDate), max(CDate)
from groups
group by A_ID, grp
order by min(BDate);
How does this work for you?
declare #table table (a_id int, bdate date, cdate date, id int)
insert #table
select a_id, bdate, cdate,
case when lag(cdate, 1,cdate) over(partition by a_id order by bdate) in (cdate, dateadd(day, -1, bdate))
then 1 else 2 end id from #t
select a.a_id, min(a.bdate)bdate, max(a.cdate)cdate from #table a
left join
#table b
on a.id=b.id and a.a_id=b.a_id and b.id=1
group by a.a_id, a.id
This is my original data (anonymised):
id usage verified date
1 4000 Y 2015-03-20
2 5000 N 2015-06-20
3 6000 N 2015-07-20
4 7000 Y 2016-09-20
Original query:
SELECT
me.usage,
mes.verified,
mes.date
FROM
Table1 me,
Table2 mes,
Table3 m,
Table4 mp
WHERE
me.theFk=mes.id
AND mes.theFk=m.id
AND m.theFk=mp.id
How would I go about selecting the most recent verified and non-verified?
So I would be left with:
id usage verified date
1 6000 N 2015-07-20
2 7000 Y 2016-09-20
I am using Microsoft SQL Server 2012.
First, do not use implicit joins. This was discontinued more than 10 years ago.
Second, embrace the power of the CTE, the in clause and row_number:
with CTE as
(
select
me.usage,
mes.verified,
mes.date,
row_number() over (partition by Verified order by Date desc) as CTEOrd
from Table1 me
inner join Table2 mes
on me.theFK = mes.id
where mes.theFK in
(
select m.id
from Table3 m
inner join Table4 mp
on mp.id = m.theFK
)
)
select CTE.*
from CTE
where CTEOrd = 1
You can select the TOP 1 ordered by date for verified=N, union'd with the TOP 1 ordered by date for verified=Y.
Or in pseudo SQL:
SELECT TOP 1 ...fields ...
FROM ...tables/joins...
WHERE Verified = 'N'
ORDER BY Date DESC
UNION
SELECT TOP 1 ...fields ...
FROM ...tables/joins...
WHERE Verified = 'Y'
ORDER BY Date DESC
drop table #stack2
CREATE TABLE #stack2
([id] int, [usage] int, [verified] varchar(1), [date] datetime)
;
INSERT INTO #stack2
([id], [usage], [verified], [date])
VALUES
(1, 4000, 'Y', '2015-03-20 00:00:00'),
(2, 5000, 'N', '2015-06-20 00:00:00'),
(3, 6000, 'N', '2015-07-20 00:00:00'),
(4, 7000, 'Y', '2016-09-20 00:00:00')
;
;with cte as (select verified,max(date) d from #stack2 group by verified)
select row_number() over( order by s2.[verified]),s2.[usage], s2.[verified], s2.[date] from #stack2 s2 join cte c on c.verified=s2.verified and c.d=s2.date
As per the data shown i had written the query.
for your scenario this will be use full
WITH cte1
AS (SELECT me.usage,
mes.verified,
mes.date
FROM Table1 me,
Table2 mes,
Table3 m,
Table4 mp
WHERE me.theFk = mes.id
AND mes.theFk = m.id
AND m.theFk = mp.id),
cte
AS (SELECT verified,
Max(date) d
FROM cte1
GROUP BY verified)
SELECT Row_number()
OVER(
ORDER BY s2.[verified]),
s2.[usage],
s2.[verified],
s2.[date]
FROM cte1 s2
JOIN cte c
ON c.verified = s2.verified
AND c.d = s2.date
You can as the below Without join.
-- Mock data
DECLARE #Tbl TABLE (id INT, usage INT, verified CHAR(1), date DATETIME)
INSERT INTO #Tbl
VALUES
(1, 4000 ,'Y', '2015-03-20'),
(2, 5000 ,'N', '2015-06-20'),
(3, 6000 ,'N', '2015-07-20'),
(4, 7000 ,'Y', '2016-09-20')
SELECT
A.id ,
A.usage ,
A.verified ,
A.MaxDate
FROM
(
SELECT
id ,
usage ,
verified ,
date,
MAX(date) OVER (PARTITION BY verified) MaxDate
FROM
#Tbl
) A
WHERE
A.date = A.MaxDate
Result:
id usage verified MaxDate
----------- ----------- -------- ----------
3 6000 N 2015-07-20
4 7000 Y 2016-09-20
CREATE TABLE #Table ( ID INT ,usage INT, verified VARCHAR(10), _date DATE)
INSERT INTO #Table ( ID , usage , verified , _date)
SELECT 1,4000 , 'Y','2015-03-20' UNION ALL
SELECT 2, 5000 , 'N' ,'2015-06-20' UNION ALL
SELECT 3, 6000 , 'N' ,'2015-07-20' UNION ALL
SELECT 4, 7000 , 'Y' ,'2016-09-20'
SELECT ROW_NUMBER() OVER(ORDER BY usage) ID,usage , A.verified , A._date
FROM #Table
JOIN
(
SELECT verified , MAX(_date) _date
FROM #Table
GROUP BY verified
) A ON #Table._date = A._date
first table is my input and expecting output like second table with out using left join.
this is the table data
declare #table table
(customer_id int,
indicator bit,
salary numeric(22,6)
,netresult numeric(22,6))
INSERT INTO #table (
customer_id
,indicator
,salary
)
VALUES
(1,1,2000),
(1,1,3000),
(2,1,1000),
(1,0,500),
(1,1,5000),
(2,1,2000),
(2,0,100)
select * from #table order by customer_id,indicator desc
I tried in below method it works. Is there any better alternative?
SELECT a.customer_id
,a.indicator
,a.salary
,netresult=p_salary-(2*n_salary)
FROM (
SELECT customer_id
,indicator
,salary
,sum(salary) OVER (PARTITION BY customer_id) p_salary
FROM #table
) a
LEFT JOIN (
SELECT customer_id
,indicator
,salary
,sum(salary) OVER (PARTITION BY customer_id) n_salary
FROM #table
WHERE indicator = 0
) b ON a.customer_id = b.customer_id
order by customer_id,indicator desc
Expected Output
I think you want this:
select t.customer_id, t.indicator,
sum(case when indicator = 1 then salary else - salary end) over (partition by customer_id) as netresult
form #table t;
No joins are necessary.
with math
select t.customer_id, t.indicator, t.salary
, sum((( t.indicator * 2) -1) * salary) over (partition by customer_id) as netresult
from #table t;
I have a table example like this:
date id status
01/01/2013 55555 high
01/01/2014 55555 low
01/01/2010 44444 high
01/01/2011 33333 low
I need in order: group by id and select most recent date.
this is the result I want.
date id status
01/01/2014 55555 low
01/01/2010 44444 high
01/01/2011 33333 low
I do not care the order of the rows.
you need to join your table with a subquery that "links" the record date with the greatest date for each id:
select a.*
from your_table as a
inner join (
select id, max(date) as max_date
from your_table
group by id
) as b on a.id = b.id and a.date = b.max_date;
I think you will need a subquery to get the MAX(Date) and then inner join. Try this:
SELECT A.[Date], A.[Id], A.[Status]
FROM Table A
INNER JOIN(SELECT Id, MAX([Date]) AS MaxDate
FROM Table
GROUP BY [Id]) B ON
A.[Id] = B.[Id] AND
A.[Date] = B.[MaxDate]
--return the group id and the latest date in that group
select id
, MAX([date]) [latestDateInGroup]
from tbl
group by id
--return the group id, and the related status and date for the record with the latest date in that group
select id
, [status] [latestDateInGroup'sStatus]
, [date] [latestDateInGroup]
from
(
select id
, [status]
, [date]
, row_number() over (partition by id order by [date] desc) r
from tbl
) x
where x.r = 1
--return all ids and statuses, along with the latest date in that group's group (requires SQL 2012+)
select id
, [status]
, max([date]) over (partition by id order by [date] desc) [latestDateInGroup]
from tbl
SQL Fiddle's offline at the moment; once back up the following code should allow you to build a table to test the above queries with
http://sqlfiddle.com
create table tbl ([date] date, id bigint, [status] nvarchar(4))
go
insert tbl select '2013-01-01', 55555, 'high'
insert tbl select '2014-01-01', 55555, 'low'
insert tbl select '2010-01-01', 44444, 'high'
insert tbl select '2011-01-01', 33333, 'low'