We have a requirement to assign a row number to every row using the following rules:
A pinned row keeps its ID as its row number.
All other rows are sorted by GMD (descending) and fill the remaining row numbers.
Example:
ID GMD IsPinned
1 2.5 0
2 0 1
3 2 0
4 4 1
5 3 0
Expected output:
ID GMD IsPinned RowNo
5 3 0 1
2 0 1 2
1 2.5 0 3
4 4 1 4
3 2 0 5
Please note that the row numbers for IDs 2 and 4 stayed intact because they are pinned at positions 2 and 4 respectively, even though their GMD values are not in any order.
The remaining rows (IDs 1, 3 and 5) get their row numbers from sorting by GMD descending.
I tried using ROW_NUMBER() in SQL Server 2012; however, it pushes pinned items out of their positions.
Here's a set-based approach to solving this. Note that the first CTE is unnecessary if you already have a Numbers table in your database:
create table #t (ID int,GMD decimal(5,2),IsPinned bit)
insert into #t (ID,GMD,IsPinned) values
(1,2.5,0), (2, 0 ,1), (3, 2 ,0), (4, 4 ,1), (5, 3 ,0)
;With Numbers as (
select ROW_NUMBER() OVER (ORDER BY ID) n from #t
), NumbersWithout as (
select
n,
ROW_NUMBER() OVER (ORDER BY n) as rn
from
Numbers
where n not in (select ID from #t where IsPinned=1)
), DataWithout as (
select
*,
ROW_NUMBER() OVER (ORDER BY GMD desc) as rn
from
#t
where
IsPinned = 0
)
select
t.*,
COALESCE(nw.n,t.ID) as RowNo
from
#t t
left join
DataWithout dw
inner join
NumbersWithout nw
on
dw.rn = nw.rn
on
dw.ID = t.ID
order by COALESCE(nw.n,t.ID)
Hopefully my naming makes it clear what we're doing. I'm a bit cheeky in the final SELECT by using a COALESCE to get the final RowNo when you might have expected a CASE expression. But it works because the DataWithout CTE is defined to contain only unpinned items, so for pinned rows the final LEFT JOIN finds no match, nw.n is NULL, and COALESCE falls back to t.ID.
Results:
ID GMD IsPinned RowNo
----------- --------------------------------------- -------- --------------------
5 3.00 0 1
2 0.00 1 2
1 2.50 0 3
4 4.00 1 4
3 2.00 0 5
Second variant that may perform better (but never assume, always test):
create table #t (ID int,GMD decimal(5,2),IsPinned bit)
insert into #t (ID,GMD,IsPinned) values
(1,2.5,0), (2, 0 ,1), (3, 2 ,0), (4, 4 ,1), (5, 3 ,0)
;With Numbers as (
select ROW_NUMBER() OVER (ORDER BY ID) n from #t
), NumbersWithout as (
select
n,
ROW_NUMBER() OVER (ORDER BY n) as rn
from
Numbers
where n not in (select ID from #t where IsPinned=1)
), DataPartitioned as (
select
*,
ROW_NUMBER() OVER (PARTITION BY IsPinned ORDER BY GMD desc) as rn
from
#t
)
select
dp.ID,dp.GMD,dp.IsPinned,
CASE WHEN IsPinned = 1 THEN ID ELSE nw.n END as RowNo
from
DataPartitioned dp
left join
NumbersWithout nw
on
dp.rn = nw.rn
order by RowNo
In the third CTE, by introducing the PARTITION BY and removing the WHERE clause we ensure we have all rows of data so we don't need to re-join to the original table in the final result in this variant.
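The slot-filling idea behind both variants can also be modelled outside SQL, which is handy for checking expected results by hand. What follows is only a minimal Python sketch and not part of the answer itself: the function name assign_row_numbers is hypothetical, and it assumes, like the queries above, that every pinned ID is itself a valid row number between 1 and the number of rows.

```python
def assign_row_numbers(rows):
    """rows: list of (id, gmd, is_pinned). Returns (id, gmd, is_pinned, row_no) sorted by row_no."""
    # Pinned rows claim the row number equal to their ID (assumption: 1 <= ID <= len(rows)).
    pinned_slots = {row_id for row_id, _, pinned in rows if pinned}
    # The numbers left over, in ascending order (the NumbersWithout CTE).
    free_slots = [n for n in range(1, len(rows) + 1) if n not in pinned_slots]
    # Unpinned rows ordered by GMD descending (the DataWithout CTE).
    unpinned = sorted((r for r in rows if not r[2]), key=lambda r: r[1], reverse=True)

    row_no = {row_id: row_id for row_id in pinned_slots}
    for slot, (row_id, _, _) in zip(free_slots, unpinned):
        row_no[row_id] = slot

    numbered = [(row_id, gmd, pinned, row_no[row_id]) for row_id, gmd, pinned in rows]
    return sorted(numbered, key=lambda r: r[3])


SAMPLE = [(1, 2.5, 0), (2, 0, 1), (3, 2, 0), (4, 4, 1), (5, 3, 0)]
for row in assign_row_numbers(SAMPLE):
    print(row)
```

With the sample data from the question, the IDs come out in the order 5, 2, 1, 4, 3, matching the result set shown above.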
This will work (Oracle syntax):
CREATE TABLE Table1
("ID" int, "GMD" number, "IsPinned" int)
;
INSERT ALL
INTO Table1 ("ID", "GMD", "IsPinned")
VALUES (1, 2.5, 0)
INTO Table1 ("ID", "GMD", "IsPinned")
VALUES (2, 0, 1)
INTO Table1 ("ID", "GMD", "IsPinned")
VALUES (3, 2, 0)
INTO Table1 ("ID", "GMD", "IsPinned")
VALUES (4, 4, 1)
INTO Table1 ("ID", "GMD", "IsPinned")
VALUES (5, 3, 0)
SELECT * FROM dual
;
select * from (
  select "ID","GMD","IsPinned",rank
  from (select m.*, rank() over(order by "ID" asc) rank from Table1 m where "IsPinned"=1)
  union
  (select "ID","GMD","IsPinned",rank
   from (select t.*, rank() over(order by "GMD" desc)-1 rank from (SELECT * FROM Table1) t)
   where "IsPinned"=0)
  order by "GMD" desc
) order by rank, GMD;
output:
2 0 1 1
5 3 0 1
1 2.5 0 2
4 4 1 2
3 2 0 3
Can you try this query
CREATE TABLE Table1
(ID int, GMD numeric (18,2), IsPinned int);
INSERT INTO Table1 (ID,GMD, IsPinned)
VALUES (1, 2.5, 0),
(2, 0, 1),
(3, 2, 0),
(4, 4, 1),
(5, 3, 0)
select *, row_number () over(partition by IsPinned order by (case when IsPinned =0 then GMD else id end) ) [CustOrder] from Table1
This took longer than I thought. ROW_NUMBER on its own only solves part of the problem: we first need to number the rows by ID, and then we can apply a WHILE loop, a cursor or any other form of iteration. In this case we will just use a WHILE loop.
dbo.test (you can replace test with your table name)
1 2.5 False
2 0 True
3 3 False
4 4 True
6 2 False
Here is the query I wrote to achieve your result. I have added a comment above each operation, so it should be easy to follow; if you have any difficulty, let me know.
Query:
--user data table
DECLARE @userData TABLE
(
    id INT NOT NULL,
    gmd FLOAT NOT NULL,
    ispinned BIT NOT NULL,
    rownumber INT NOT NULL
);
--final result table
DECLARE @finalResult TABLE
(
    id INT NOT NULL,
    gmd FLOAT NOT NULL,
    ispinned BIT NOT NULL,
    newrownumber INT NOT NULL
);
--inserting into the user data table from the table test
INSERT INTO @userData
SELECT t.*,
       Row_number() OVER (ORDER BY t.id ASC) AS RowNumber
FROM test t
--creating a temp table for the ids of the non-pinned rows
CREATE TABLE #ids
(
    rn INT,
    id INT,
    gmd FLOAT
)
--inserting into the temp table, numbering the rows by gmd desc
INSERT INTO #ids (rn, id, gmd)
SELECT DISTINCT Row_number() OVER (ORDER BY gmd DESC) AS rn,
       id,
       gmd
FROM @userData
WHERE ispinned = 0
--declaring the variables used to loop through all the non-pinned items
DECLARE @id INT
DECLARE @totalrows INT = (SELECT Count(*) FROM #ids)
DECLARE @currentrow INT = 1
DECLARE @assigningNumber INT = 1
--inserting pinned items first
INSERT INTO @finalResult
SELECT ud.id,
       ud.gmd,
       ud.ispinned,
       ud.rownumber
FROM @userData ud
WHERE ispinned = 1
--looping through all the rows till all non-pinned items are finished
WHILE @currentrow <= @totalrows
BEGIN
    --skipping the numbers already taken by pinned rows
    WHILE EXISTS(SELECT 1
                 FROM @finalResult
                 WHERE newrownumber = @assigningNumber
                   AND ispinned = 1)
    BEGIN
        SET @assigningNumber = @assigningNumber + 1
    END
    --getting row by the number
    SET @id = (SELECT id
               FROM #ids
               WHERE rn = @currentrow)
    --inserting the non-pinned item with new row number into the final result
    INSERT INTO @finalResult
    SELECT ud.id,
           ud.gmd,
           ud.ispinned,
           @assigningNumber
    FROM @userData ud
    WHERE id = @id
    --going to next row
    SET @currentrow = @currentrow + 1
    SET @assigningNumber = @assigningNumber + 1
END
--getting final result
SELECT *
FROM @finalResult
ORDER BY newrownumber ASC
--dropping table
DROP TABLE #ids
I have two tables: one for the current status and one for the finish status.
I want to calculate the time difference between two rows that have the same PO_NO, MANAGEMENT_NO and PROCESS_NAME.
Each PROCESS_NAME has a STATUS (Start/Finish).
ID  INDEXNO  PO_NO    ITEM_CD    MANAGEMENT_NO  SEQ  PROCESS_NAME     STATUS  Time_Occurrence  TimeDiff (Minute)
43  126690   GV12762  332393961  616244         6    RFID             Start   17-03-18 13:28   NULL
44  126690   GV12762  332393961  616244         6    RFID             Finish  17-03-18 13:29   0
49  141646   GV14859  7E7060100  619005         2    Imprint          Start   19-03-18 13:23   NULL
50  141646   GV14859  7E7060100  619005         2    Imprint          Finish  19-03-18 13:30   7
48  141646   GV14859  7E7060100  619005         1    R.M.Requisition  Start   19-03-18 13:18   NULL
56  141646   GV14859  7E7060100  619005         1    R.M.Requisition  Finish  19-03-18 15:54   156
The expected result is the TimeDiff (Minute) column. My attempt so far:
select PO_NO, [MANAGEMENT_NO],[STATUS] [Time_Occurrence],
datediff(minute, (isnull((select [Time_Occurrence] from [TBL_FINISH_STATUS] t1 where t1.id=t2.id-1), dateadd(dd, 0, datediff(dd, 0, getdate())))), [Time_Occurrence])TimeDiff
from [PROC_MN].[dbo].[TBL_FINISH_STATUS] t2
ORDER BY PO_NO,MANAGEMENT_NO,ITEM_CD,Time_Occurrence
With the above query, the result is far from the expected result.
Could anyone help me please?
Note: the IDs (48, 56) of the SEQ 1 rows of PO_NO GV14859 are not consecutive.
If I understand what you want, then this seems like a simple query for it:
select INDEXNO, PO_NO, ITEM_CD, MANAGEMENT_NO, SEQ,
datediff(minute,
min(case when status = 'Start' then Time_Occurrence end),
max(case when status = 'Finish' then Time_Occurrence end)
) as timediff
from t
group by INDEXNO, PO_NO, ITEM_CD, MANAGEMENT_NO, SEQ;
Here is a SQL Fiddle.
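To see what the conditional aggregation computes, here is a small Python sketch of the same idea: the earliest Start and the latest Finish per group, then the difference in minutes. It is an illustration only. The function name minutes_per_process is hypothetical, and it groups by PO_NO, MANAGEMENT_NO and PROCESS_NAME (the key named in the question) rather than by the columns used in the query above. The rows are the GV14859 rows from the question.

```python
from datetime import datetime

# (PO_NO, MANAGEMENT_NO, PROCESS_NAME, STATUS, Time_Occurrence)
ROWS = [
    ("GV14859", 619005, "Imprint", "Start", "2018-03-19 13:23"),
    ("GV14859", 619005, "Imprint", "Finish", "2018-03-19 13:30"),
    ("GV14859", 619005, "R.M.Requisition", "Start", "2018-03-19 13:18"),
    ("GV14859", 619005, "R.M.Requisition", "Finish", "2018-03-19 15:54"),
]


def minutes_per_process(rows):
    """Minutes between the earliest Start and the latest Finish of each key."""
    starts, finishes = {}, {}
    for po_no, management_no, process_name, status, stamp in rows:
        key = (po_no, management_no, process_name)
        moment = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        if status == "Start":
            starts[key] = min(moment, starts.get(key, moment))
        elif status == "Finish":
            finishes[key] = max(moment, finishes.get(key, moment))
    # Only keys that have both a Start and a Finish produce a difference.
    return {
        key: int((finishes[key] - starts[key]).total_seconds() // 60)
        for key in starts
        if key in finishes
    }


for key, minutes in minutes_per_process(ROWS).items():
    print(key, minutes)
```

Note that this computes elapsed whole minutes, whereas DATEDIFF(minute, ...) counts minute boundaries crossed; for timestamps without seconds, as here, the two agree.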
It is not really clear what you are expecting as a result. Looking at your data sample, the design looks flawed from the start: there is too much redundancy for an SQL database. Maybe you don't have any control over the existing database. Anyway, this could be solved in many different ways and, if I remember correctly, the LEAD/LAG functions didn't exist in SQL Server 2008 (ROW_NUMBER is available there as an alternative). I tried to create something that is compatible even with older versions, but I'm not sure whether it is what you meant as a result:
CREATE TABLE #myTable([ID] INT,
[INDEXNO] INT,
[PO_NO] VARCHAR(7),
[ITEM_CD] VARCHAR(10),
[MANAGEMENT_NO] INT,
[SEQ] INT,
[PROCESS_NAME] VARCHAR(15),
[STATUS] VARCHAR(6),
[Time_Occurrence] DATETIME,
[TimeDiff] VARCHAR(4));
INSERT INTO #myTable([ID], [INDEXNO], [PO_NO], [ITEM_CD], [MANAGEMENT_NO], [SEQ], [PROCESS_NAME], [STATUS], [Time_Occurrence], [TimeDiff])
VALUES(43, 126690, 'GV12762', '332393961', 616244, 6, 'RFID', 'Start', '20180317 13:28', NULL),
(44, 126690, 'GV12762', '332393961', 616244, 6, 'RFID', 'Finish', '20180317 13:29', '0'),
(49, 141646, 'GV14859', '7E7060100', 619005, 2, 'Imprint', 'Start', '20180319 13:23', NULL),
(50, 141646, 'GV14859', '7E7060100', 619005, 2, 'Imprint', 'Finish', '20180319 13:30', '7'),
(48, 141646, 'GV14859', '7E7060100', 619005, 1, 'R.M.Requisition', 'Start', '20180319 13:18', NULL),
(56, 141646, 'GV14859', '7E7060100', 619005, 1, 'R.M.Requisition', 'Finish', '20180319 15:54', '156');
SELECT * FROM #myTable;
WITH
Starters AS (
SELECT ID, PO_NO, [MANAGEMENT_NO], [PROCESS_NAME], [Time_Occurrence]
FROM #myTable
WHERE STATUS='Start'
),
Finishers AS (
SELECT ID, PO_NO, [MANAGEMENT_NO], [PROCESS_NAME], [Time_Occurrence]
FROM #myTable
WHERE STATUS='Finish'
)
SELECT s.PO_NO, s.MANAGEMENT_NO, s.PROCESS_NAME,
s.Time_Occurrence as [Start], f.Time_Occurrence as [End],
DATEDIFF(MINUTE, s.Time_Occurrence, f.Time_Occurrence) AS TIMEdiff
FROM Starters s
LEFT JOIN Finishers f ON s.PO_NO=f.PO_NO
AND s.MANAGEMENT_NO=f.MANAGEMENT_NO
AND f.PROCESS_NAME=s.PROCESS_NAME;
I have the following data:
CREATE TABLE Table1
(
ID varchar(10),
StudentName varchar(30),
Course varchar(15),
SECTION varchar(2),
DAY varchar(10),
START_TIME time,
END_TIME time,
actual_starttime time,
actual_endtime time
);
INSERT INTO Table1
VALUES (111, 'Mary', 'Science', 'A', 'Mon', '13:30:00.0000000', '16:20:00.0000000', '09:00:00.0000000', '21:20:00.0000000')
INSERT INTO Table1
VALUES (111, 'Mary', 'Maths', 'A', 'Tue', '12:30:00.0000000', '13:20:00.0000000', '09:00:00.0000000', '21:20:00.0000000')
INSERT INTO Table1
VALUES (111, 'Mary', 'Physics', 'C', 'Tue', '10:30:00.0000000', '11:10:00.0000000', '09:00:00.0000000', '21:20:00.0000000')
INSERT INTO Table1
VALUES (112, 'Robert', 'Maths', 'A', 'Mon', '13:30:00.0000000', '16:20:00.0000000', '09:00:00.0000000', '21:20:00.0000000')
The scenario is as follows: students can have classes from 9:00 in the morning to 9:30 at night, Monday to Friday. I need to identify a time slot in which all the students in the same section are free, so that a teacher can reschedule a class.
Example: both Mary and Robert are free on Monday from 9:00 in the morning until 1:30 in the afternoon. I would like to write a query for this.
Please help.
Thanks in advance!
To return the full list of the timeslots available, you need to build a set of all the timeslots for each day of the week and then find if any of these slots have students being taught within it.
This is easily achieved with a recursive CTE to build your full timeslot set, from which you can JOIN into your Students data. The output of the query below is the day and time of each vacant session:
-- Build the dummy data sets:
create table #Data
(
ID varchar(10),
StudentName varchar(30),
Course varchar(15),
SECTION varchar(2),
DAY varchar(10),
START_TIME time,
END_TIME time,
actual_starttime time,
actual_endtime time
);
insert into #Data values
(111, 'Mary', 'Science', 'A', 'Mon', '13:30:00.0000000', '16:20:00.0000000', '09:00:00.0000000', '21:20:00.0000000')
,(111, 'Mary', 'Maths', 'A', 'Tue', '12:30:00.0000000', '13:20:00.0000000', '09:00:00.0000000', '21:20:00.0000000')
,(111, 'Mary', 'Physics', 'C', 'Tue', '10:30:00.0000000', '11:10:00.0000000', '09:00:00.0000000', '21:20:00.0000000')
,(112, 'Robert', 'Maths', 'A', 'Mon', '13:30:00.0000000', '16:20:00.0000000', '09:00:00.0000000', '21:20:00.0000000');
-- Query the data:
with TimeSlots as -- Recursive CTE builds a table of all timeslots in TIME data type.
(
select cast('09:00:00' as time) as TimeSlotStart
,cast('09:30:00' as time) as TimeSlotEnd
union all
select dateadd(minute,30,TimeSlotStart)
,dateadd(minute,30,TimeSlotEnd)
from TimeSlots
where TimeSlotStart < cast('21:00:00' as time)
)
, TeachingDays as -- Used to return all the time slots above for each day of the week in CROSS JOIN below.
(
select 1 as DaySort
,'Mon' as TeachingDay
union all
select 2 as DaySort
,'Tue'
union all
select 3 as DaySort
,'Wed'
union all
select 4 as DaySort
,'Thu'
union all
select 5 as DaySort
,'Fri'
)
select td.TeachingDay
,t.TimeSlotStart
,t.TimeSlotEnd
from TimeSlots t -- Select all timeslots.
cross join TeachingDays td -- For each day.
left join #Data d -- And find all students that are being taught on that day at the specified time.
on(td.TeachingDay = d.DAY
and t.TimeSlotStart <= d.END_TIME
and t.TimeSlotEnd > d.START_TIME
)
where d.ID is null -- Then only return data where there are no students being taught at this timeslot.
order by td.DaySort
,t.TimeSlotStart;
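The overlap test in the join can be checked by hand with a small model. The Python sketch below is only an illustration under assumptions: times are minutes since midnight, the names free_slots and hm are hypothetical, and the busy condition copies the join predicate of the query above (slot start <= class end and slot end > class start), without any filter on section.

```python
def hm(text):
    """Convert 'HH:MM' to minutes since midnight."""
    hours, minutes = text.split(":")
    return int(hours) * 60 + int(minutes)


# (DAY, START_TIME, END_TIME) taken from the sample rows
CLASSES = [
    ("Mon", hm("13:30"), hm("16:20")),  # Mary, Science
    ("Tue", hm("12:30"), hm("13:20")),  # Mary, Maths
    ("Tue", hm("10:30"), hm("11:10")),  # Mary, Physics
    ("Mon", hm("13:30"), hm("16:20")),  # Robert, Maths
]


def free_slots(classes, day, first=hm("09:00"), last=hm("21:00"), step=30):
    """Start minutes of the half-hour slots on `day` that overlap no class."""
    free = []
    for slot_start in range(first, last + 1, step):
        slot_end = slot_start + step
        busy = any(
            d == day and slot_start <= end and slot_end > start
            for d, start, end in classes
        )
        if not busy:
            free.append(slot_start)
    return free


for start in free_slots(CLASSES, "Mon"):
    print("Mon %02d:%02d" % divmod(start, 60))
```

On Monday this leaves the slots from 09:00 up to the one starting at 13:00, plus those from 16:30 onwards, which agrees with the example in the question.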
You could create a Stored Procedure with following steps.
Step 1: Predefine timeslots in a different table (09:00-10:00, 10:00-11:00, etc.).
Step 2: Select the count of students.
Step 3:
for all the slots
Begin
for all the students
Begin
if (students.actual_starttime = slots.actual_starttime and
students.actual_endtime = slots.actual_endtime)
break;
else count = count + 1;
End
End
Step 4: If the above count matches the total count of students, then the slot is free for all the students; otherwise the slot is not free for all the students.
Hope this helps. Let me know if you find difficulty with it.
You should have three more tables to make this simpler, i.e. Student, Section and Slots.
I tried creating one more table with half-hour slots:
create table table2(timeslot time);
insert into table2 values ('9:00:00.0000000');
insert into table2 values ('9:30:00.0000000');
insert into table2 values ('10:00:00.0000000');
insert into table2 values ('10:30:00.0000000');
insert into table2 values ('11:00:00.0000000');
insert into table2 values ('11:30:00.0000000');
insert into table2 values ('12:00:00.0000000');
insert into table2 values ('12:30:00.0000000');
insert into table2 values ('13:00:00.0000000');
insert into table2 values ('13:30:00.0000000');
insert into table2 values ('14:00:00.0000000');
insert into table2 values ('14:30:00.0000000');
insert into table2 values ('15:00:00.0000000');
insert into table2 values ('15:30:00.0000000');
insert into table2 values ('16:00:00.0000000');
insert into table2 values ('16:30:00.0000000');
insert into table2 values ('17:00:00.0000000');
insert into table2 values ('17:30:00.0000000');
insert into table2 values ('18:00:00.0000000');
insert into table2 values ('18:30:00.0000000');
insert into table2 values ('19:00:00.0000000');
insert into table2 values ('19:30:00.0000000');
insert into table2 values ('20:00:00.0000000');
insert into table2 values ('20:30:00.0000000');
insert into table2 values ('21:00:00.0000000');
insert into table2 values ('21:30:00.0000000');
Following SQL will give you free slot and name of student:
Query:
select t1.StudentName,t2.timeslot
from Table2 t2,
Table1 t1
where t2.timeslot<t1.start_time
and t2.timeslot<t1.end_time
and t1.section='A'
group by t1.StudentName,t2.timeslot
order by t2.timeslot
Output:
StudentName timeslot
1 Mary 09:00:00
2 Robert 09:00:00
3 Mary 09:30:00
4 Robert 09:30:00
5 Mary 10:00:00
6 Robert 10:00:00
7 Mary 10:30:00
8 Robert 10:30:00
9 Mary 11:00:00
10 Robert 11:00:00
11 Mary 11:30:00
12 Robert 11:30:00
13 Mary 12:00:00
14 Robert 12:00:00
15 Mary 12:30:00
16 Robert 12:30:00
17 Mary 13:00:00
18 Robert 13:00:00
This is only half of the task; I have just shown you a way to approach it. Add two more joins, to the Student and Section tables, to complete it.
Shred the day (the 09:00 to 21:30 interval) into minutes, find the minutes that are free for every student of the group on the days of interest, and group the minutes found back into intervals.
CREATE TABLE Table1 (ID varchar(10),StudentName varchar(30), Course varchar(15) ,SECTION varchar(2),DAY varchar(10),
START_TIME time , END_TIME time, actual_starttime time, actual_endtime time);
INSERT INTO Table1 VALUES (111, 'Mary','Science','A','Mon','13:30:00.0000000','16:20:00.0000000','09:00:00.0000000','21:20:00.0000000')
INSERT INTO Table1 VALUES (111, 'Mary','Maths','A','Tue','12:30:00.0000000','13:20:00.0000000','09:00:00.0000000','21:20:00.0000000')
INSERT INTO Table1 VALUES (111, 'Mary','Physics','C','Tue','10:30:00.0000000','11:10:00.0000000','09:00:00.0000000','21:20:00.0000000')
INSERT INTO Table1 VALUES (112, 'Robert','Maths','A','Mon','13:30:00.0000000','16:20:00.0000000','09:00:00.0000000','21:20:00.0000000')
;
-- parameters
declare @tds time = '09:00';
declare @tde time = '21:30';
declare @section varchar(2) = 'A';
create table #daysofinterest (DAY varchar(10) primary key);
insert #daysofinterest (DAY) values ('Mon'),('Tue'),('Fri');
create table #groupmembers(ID int primary key);
insert #groupmembers(ID) values (111),(112);
-- query
select DAY, startt = dateadd(minute, min(n), @tds), endt = dateadd (minute, max(n), @tds)
from (
select DAY, n, grp = n - row_number() over(partition by DAY order by n)
from (
-- all minutes of the day, @tds till @tde
select top (datediff(minute, @tds, @tde)) n = row_number() over(order by (select null))
from sys.all_objects
) tally
cross join #daysofinterest dd
join #groupmembers gm on
not exists (select 1 from table1 t
where t.ID = gm.ID and t.DAY = dd.DAY and SECTION = @section and
dateadd (minute, n, @tds) between t.START_TIME and t.END_TIME )
group by DAY, n
--this minute is free for every group member
having count(*) = (select count(*) from #groupmembers)
) g
group by DAY, grp
order by DAY, min(n)
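The step that turns free minutes back into intervals relies on the grp = n - row_number() expression: within a run of consecutive minutes, both n and its rank grow by one, so the difference stays constant and identifies the run. A minimal Python sketch of that grouping step alone (the function name group_consecutive is hypothetical):

```python
def group_consecutive(minutes):
    """Collapse distinct integers into (first, last) runs of consecutive values."""
    islands = {}
    for rank, n in enumerate(sorted(minutes), start=1):
        # n - rank is constant inside a run and increases at every gap.
        islands.setdefault(n - rank, []).append(n)
    return [(run[0], run[-1]) for run in islands.values()]


print(group_consecutive([1, 2, 3, 7, 8, 10]))
```

For the input 1, 2, 3, 7, 8, 10 the keys are 0, 0, 0, 3, 3, 4, which yields the runs 1 to 3, 7 to 8 and 10 to 10.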
Table Schema:
CREATE TABLE [dbo].[TblPriceDetails](
[PriceID] [int] IDENTITY(1,1) NOT NULL,
[VID] [int] NOT NULL,
TypeID int not null,
[RangeStart] [decimal](18, 3) NOT NULL,
[RangeEnd] [decimal](18, 3) NOT NULL,
[Price] [decimal](18, 2) NOT NULL,
[ExtraLoad] [decimal](18, 3) NULL,
[ExtraPrice] [decimal](18, 2) NULL
)
GO
Sample Data
Insert into dbo.TblPriceDetails values (1,1, 0,0.250,10,0,0)
Insert into dbo.TblPriceDetails values (1,1, 0.251,0.500,15,0.500,15)
Insert into dbo.TblPriceDetails values (1,1, 3,5,40,1,25)
GO
Insert into dbo.TblPriceDetails values (1,2, 0,0.250,15,0,0)
Insert into dbo.TblPriceDetails values (1,2, 0.251,0.500,20,0.500,20)
Insert into dbo.TblPriceDetails values (1,2, 3,5,50,1,30)
GO
Expected Output:
For VID = 1 and TypeID = 1 and a given value of 0.300:
As the input falls between RangeStart 0.251 and RangeEnd 0.500,
the resultant price will be 15.
For VID = 1 and TypeID = 1 and a given value of 0.600:
As per the data, up to 0.500 the price is 15, and for every ExtraLoad
of up to 0.500 it's another 15. So the final price will be 30.
For VID = 1 and TypeID = 1 and a given value of 1.500:
As per the data, up to 0.500 the price is 15. For every extra 0.500
it's another 15, so for the remaining 1 unit it would be 15 * 2. The
final price will be 45.
For VID = 1 and TypeID = 1 and a given value of 5.5:
As per the data, up to 5.000 the price is 40. For every extra 1 unit it's another 25, so the final price will be 65.
Need help in writing a query for this. Unlike my other questions I don't have a query yet to show what I have come up with till now. As of now I am not able to frame a logic and come up with a generic query for this.
It looks like you are looking to calculate postage price. The trick is to join on the RangeStart of the next weight tier. LEAD will help you do that:
;WITH
AdjustedPriceDetails AS
(
SELECT VID, TypeID, RangeStart, RangeEnd, Price, ExtraLoad, ExtraPrice
, ISNULL(LEAD(RangeStart, 1) OVER (PARTITION BY VID, TypeID ORDER BY RangeStart), 1000000) AS NextRangeStart
FROM TblPriceDetails
)
SELECT T.*
, A.Price + IIF(T.Value <= A.RangeEnd, 0, CEILING((T.Value - A.RangeEnd) / A.ExtraLoad) * A.ExtraPrice)
AS FinalPrice
FROM #TestData T
INNER JOIN AdjustedPriceDetails A ON A.RangeStart <= T.Value AND T.Value < A.NextRangeStart
Explanation:
LEAD(RangeStart, 1) OVER (PARTITION BY VID, TypeID ORDER BY RangeStart) gets the RangeStart of the next row that has the same VID and TypeID.
You will eventually reach the highest weight tier, which has no next row. So ISNULL(..., 1000000) makes this tier appear to end at 1M. The 1M is just a stand-in for infinity.
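The pricing rule can be checked against the four expected values from the question with a short Python model. This is a sketch under assumptions, not the query itself: the names TIERS and final_price are hypothetical, only the VID = 1, TypeID = 1 tiers are included, and Decimal is used so that the division behaves exactly. As in the SQL expression, a value above RangeEnd in a tier whose ExtraLoad is 0 would raise a division error.

```python
import math
from decimal import Decimal as D

# (RangeStart, RangeEnd, Price, ExtraLoad, ExtraPrice) for VID = 1, TypeID = 1
TIERS = [
    (D("0"), D("0.250"), D("10"), D("0"), D("0")),
    (D("0.251"), D("0.500"), D("15"), D("0.500"), D("15")),
    (D("3"), D("5"), D("40"), D("1"), D("25")),
]


def final_price(value, tiers):
    """Base price of the matching tier plus ExtraPrice per started ExtraLoad."""
    tiers = sorted(tiers)
    for i, (start, end, price, extra_load, extra_price) in enumerate(tiers):
        # A tier applies until the next tier starts (the LEAD step); the last one is open-ended.
        next_start = tiers[i + 1][0] if i + 1 < len(tiers) else None
        if value >= start and (next_start is None or value < next_start):
            if value <= end:
                return price
            steps = math.ceil((value - end) / extra_load)
            return price + steps * extra_price
    raise ValueError("no tier covers this value")


for v in ("0.3", "0.6", "1.5", "5.5"):
    print(v, final_price(D(v), TIERS))
```

The four inputs give 15, 30, 45 and 65, the values listed in the question.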
Edit: if you want to make this work with SQL Server 2008, change the CTE:
;WITH
tmp AS
(
SELECT VID, TypeID, RangeStart, RangeEnd, Price, ExtraLoad, ExtraPrice
, ROW_NUMBER() OVER (PARTITION BY VID, TypeID ORDER BY RangeStart) AS RowNumber
FROM TblPriceDetails
),
AdjustedPriceDetails AS
(
SELECT T1.VID, T1.TypeID, T1.RangeStart, T1.RangeEnd, T1.Price, T1.ExtraLoad, T1.ExtraPrice
, ISNULL(T2.RangeStart, 1000000) AS NextRangeStart
FROM tmp T1
LEFT JOIN tmp T2 ON T1.VID = T2.VID AND T1.TypeId = T2.TypeID AND T1.RowNumber + 1 = T2.RowNumber
)
If you wonder what #TestData is (you may not need it):
CREATE TABLE #TestData
(
VID int
, TypeID int
, Value float
)
INSERT INTO #TestData
( VID, TypeID, Value)
VALUES ( 1, 1, 0.3 )
, ( 1, 1, 0.6 )
, ( 1, 1, 1.5 )
, ( 1, 1, 5.5 )
I have a query that will result in a customer bill being created on our SSRS 2008 R2 server. The SQL Server instance is also 2008 R2. The query is large and I don't want to post the entire thing for security reasons, etc.
What I need to do with the example data below is to remove the two rows with 73.19 and -73.19 from the result set. So, if two rows have the same absolute value in the LineBalance column, their sum is 0, AND they have the same value in the REF1 column, they should be removed from the result set. The line with REF1 = 14598 and a line balance of 281.47 should still be returned in the result set, and the other two rows below with REF1 = 14598 should not be returned.
The point of this is to "hide" accounting errors and their correction from the customer. By "hide" I mean not showing them on the bill they get in the mail. What happened here is that the customer was mistakenly billed 73.19 when they should have been billed 281.47. So, our AR dept. returned 73.19 to their account and charged them the correct amount of 281.47. As you can see, they all have the same REF1 value.
I would add a field that would contain explicit flag telling you that a certain charge was a mistake/reversal of a mistake and then it is trivial to filter out such rows. Doing it on the fly could make your reports rather slow.
But to solve the problem as given, we can proceed as follows. The solution assumes that SysInvNum is unique.
Create a table with sample data
CREATE TABLE #T (SysInvNum int, REF1 int, LineBalance money);
INSERT INTO #T (SysInvNum, REF1, LineBalance) VALUES (3344299, 14602, 558.83);
INSERT INTO #T (SysInvNum, REF1, LineBalance) VALUES (3344298, 14598, 281.47);
INSERT INTO #T (SysInvNum, REF1, LineBalance) VALUES (3344297, 14602, -95.98);
INSERT INTO #T (SysInvNum, REF1, LineBalance) VALUES (3344296, 14598, -73.19);
INSERT INTO #T (SysInvNum, REF1, LineBalance) VALUES (3341758, 14598, 73.19);
INSERT INTO #T (SysInvNum, REF1, LineBalance) VALUES (11, 100, 50.00);
INSERT INTO #T (SysInvNum, REF1, LineBalance) VALUES (12, 100, -50.00);
INSERT INTO #T (SysInvNum, REF1, LineBalance) VALUES (13, 100, 50.00);
INSERT INTO #T (SysInvNum, REF1, LineBalance) VALUES (21, 200, -50.00);
INSERT INTO #T (SysInvNum, REF1, LineBalance) VALUES (22, 200, -50.00);
INSERT INTO #T (SysInvNum, REF1, LineBalance) VALUES (23, 200, 50.00);
INSERT INTO #T (SysInvNum, REF1, LineBalance) VALUES (31, 300, -50.00);
INSERT INTO #T (SysInvNum, REF1, LineBalance) VALUES (32, 300, 50.00);
INSERT INTO #T (SysInvNum, REF1, LineBalance) VALUES (33, 300, -50.00);
INSERT INTO #T (SysInvNum, REF1, LineBalance) VALUES (34, 300, 50.00);
INSERT INTO #T (SysInvNum, REF1, LineBalance) VALUES (41, 400, 50.00);
INSERT INTO #T (SysInvNum, REF1, LineBalance) VALUES (42, 400, -50.00);
INSERT INTO #T (SysInvNum, REF1, LineBalance) VALUES (43, 400, 50.00);
INSERT INTO #T (SysInvNum, REF1, LineBalance) VALUES (44, 400, -50.00);
INSERT INTO #T (SysInvNum, REF1, LineBalance) VALUES (45, 400, 50.00);
I've added a few more cases that have multiple mistakes.
Number and count rows
SELECT
SysInvNum
, REF1
, LineBalance
, ROW_NUMBER() OVER(PARTITION BY REF1, LineBalance ORDER BY SysInvNum) AS rn
, COUNT(*) OVER(PARTITION BY REF1, ABS(LineBalance)) AS cc1
FROM #T AS TT
This is the result set:
SysInvNum REF1 LineBalance rn cc1
11 100 50.00 1 3
12 100 -50.00 1 3
13 100 50.00 2 3
21 200 -50.00 1 3
23 200 50.00 1 3
22 200 -50.00 2 3
31 300 -50.00 1 4
32 300 50.00 1 4
33 300 -50.00 2 4
34 300 50.00 2 4
41 400 50.00 1 5
42 400 -50.00 1 5
43 400 50.00 2 5
44 400 -50.00 2 5
45 400 50.00 3 5
3341758 14598 73.19 1 2
3344296 14598 -73.19 1 2
3344298 14598 281.47 1 1
3344297 14602 -95.98 1 1
3344299 14602 558.83 1 1
You can see that the rows that contain mistakes have a count > 1. Also, the two rows of each mistake/correction pair have the same row number. So, we need to remove/hide those rows that have a count > 1 and whose row number is shared by exactly two rows within the same REF1.
Determine rows to remove
WITH
CTE_rn
AS
(
SELECT
SysInvNum
, REF1
, LineBalance
, ROW_NUMBER() OVER(PARTITION BY REF1, LineBalance ORDER BY SysInvNum) AS rn
, COUNT(*) OVER(PARTITION BY REF1, ABS(LineBalance)) AS cc1
FROM #T AS TT
)
, CTE_ToRemove
AS
(
SELECT
SysInvNum
, REF1
, LineBalance
, COUNT(*) OVER(PARTITION BY REF1, rn) AS cc2
FROM CTE_rn
WHERE CTE_rn.cc1 > 1
)
SELECT *
FROM CTE_ToRemove
WHERE CTE_ToRemove.cc2 = 2
This is another intermediate result:
SysInvNum REF1 LineBalance cc2
12 100 -50.00 2
11 100 50.00 2
21 200 -50.00 2
23 200 50.00 2
32 300 50.00 2
31 300 -50.00 2
33 300 -50.00 2
34 300 50.00 2
42 400 -50.00 2
41 400 50.00 2
43 400 50.00 2
44 400 -50.00 2
3344296 14598 -73.19 2
3341758 14598 73.19 2
Now, we just put all this together.
Final query
WITH
CTE_rn
AS
(
SELECT
SysInvNum
, REF1
, LineBalance
, ROW_NUMBER() OVER(PARTITION BY REF1, LineBalance ORDER BY SysInvNum) AS rn
, COUNT(*) OVER(PARTITION BY REF1, ABS(LineBalance)) AS cc1
FROM #T AS TT
)
, CTE_ToRemove
AS
(
SELECT
SysInvNum
, REF1
, LineBalance
, COUNT(*) OVER(PARTITION BY REF1, rn) AS cc2
FROM CTE_rn
WHERE CTE_rn.cc1 > 1
)
SELECT *
FROM #T AS TT
WHERE
TT.SysInvNum NOT IN
(
SELECT CTE_ToRemove.SysInvNum
FROM CTE_ToRemove
WHERE CTE_ToRemove.cc2 = 2
)
ORDER BY SysInvNum;
Result:
SysInvNum REF1 LineBalance
13 100 50.00
22 200 -50.00
45 400 50.00
3344297 14602 -95.98
3344298 14598 281.47
3344299 14602 558.83
Note that the final result doesn't have any rows with REF1 = 300, because there were two corrected mistakes that balanced each other completely.
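The pairing logic of the two CTEs can be replayed in plain Python to confirm which rows survive. This is only a model of the query above, under the same assumption that SysInvNum is unique; the names SAMPLE and rows_to_keep are hypothetical.

```python
from collections import Counter
from decimal import Decimal as D

# (SysInvNum, REF1, LineBalance), same rows as the sample table
SAMPLE = [
    (3344299, 14602, D("558.83")), (3344298, 14598, D("281.47")),
    (3344297, 14602, D("-95.98")), (3344296, 14598, D("-73.19")),
    (3341758, 14598, D("73.19")),
    (11, 100, D("50")), (12, 100, D("-50")), (13, 100, D("50")),
    (21, 200, D("-50")), (22, 200, D("-50")), (23, 200, D("50")),
    (31, 300, D("-50")), (32, 300, D("50")), (33, 300, D("-50")), (34, 300, D("50")),
    (41, 400, D("50")), (42, 400, D("-50")), (43, 400, D("50")),
    (44, 400, D("-50")), (45, 400, D("50")),
]


def rows_to_keep(rows):
    # cc1: how many rows share the same REF1 and absolute amount.
    cc1 = Counter((ref1, abs(bal)) for _, ref1, bal in rows)
    # rn: position of each row among rows with the same REF1 and signed amount.
    rn, seen = {}, Counter()
    for inv, ref1, bal in sorted(rows):
        seen[(ref1, bal)] += 1
        rn[inv] = seen[(ref1, bal)]
    # cc2: among rows with cc1 > 1, how many share the same REF1 and rn.
    candidates = [r for r in rows if cc1[(r[1], abs(r[2]))] > 1]
    cc2 = Counter((ref1, rn[inv]) for inv, ref1, _ in candidates)
    removed = {inv for inv, ref1, _ in candidates if cc2[(ref1, rn[inv])] == 2}
    return sorted(r for r in rows if r[0] not in removed)


for row in rows_to_keep(SAMPLE):
    print(row)
```

The surviving SysInvNum values are 13, 22, 45, 3344297, 3344298 and 3344299, the same six rows as in the result above.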
Most AR/billing systems treat "credit memos" (the negative amount) similar to cash, in which case the -73.19 would get applied to the 73.19 LineBalance the same as if the customer had paid that amount, resulting in a $0 balance.
OPTION 1:
Do you handle cash receipts and applications in this system? If so, you may be able to pull data from those cash application tables to show the tie between SysInvNum 3344296 and 3341758.
OPTION 2:
I'm assuming that PayAdjust column is used to reduce the balance after a customer has paid, and that LineBalance is a calculated column that is Charges + PayAdjust.
Most of the time when this occurs, the AR department would be responsible for applying the credit memo to an open invoice, so that the PayAdjust column would net $0 between the 2 rows, and this would cause the LineBalance to also be $0 on each of the 2 rows. It may just be a training issue for the system that is being used.
This would cause the 3 rows in question to look like this, so you don't have an issue, you would just exclude the rows by adding where LineBalance <> 0 to your query since the AR department (which applied the credit to begin with and so knows the answer to this question) explicitly stated which LineBalance the credit applies to:
Option 2 Preferred data structure:
SysInvNum REF1 Charges PayAdjust LineBalance
----------- ----------- --------------------- --------------------- ---------------------
3344298 14598 281.47 0.00 281.47
3344296 14598 -73.19 73.19 0.00
3341758 14598 73.19 -73.19 0.00
OPTION 3:
Without having this data from Option 1 or 2, you have to make many assumptions and run the risk of inadvertently hiding the wrong rows.
That being said, here is a query that attempts to do what you are asking, but I would highly recommend checking with the AR dept to see if they can update "PayAdjust" for these records instead.
I added several test cases of scenarios that could cause issues, but this may not cover all the bases.
This query will only hide rows where one distinct matching negative value is found for a positive value, for the same REF1 and the same DueDate. It also makes sure that the original charge's SysInvNum comes before the credit's, since it can be assumed that a credit would not occur prior to the actual charge (test case 6 still shows both rows because the credit has a SysInvNum that occurred before the charge). If more than one match is found per REF1, DueDate, and LineBalance, then it will not hide the corresponding charge and credit lines (test cases 2 & 4). Test case 3 sums in total to 0, but it still shows all 3 rows because the LineBalance values do not match exactly. These are all assumptions that I made to handle edge cases, so they can be adjusted as needed.
CREATE TABLE #SysInvTable (SysInvNum int not null primary key, REF1 int, Charges money, PayAdjust money, LineBalance as Charges + PayAdjust, DueDate date, REF2 int, Remark varchar(50), REM varchar(50));
INSERT INTO #SysInvTable(SysInvNum, REF1, Charges, PayAdjust, DueDate, Remark)
VALUES
--.....................................
--Your test case
(3344298, 14598, 281.47, 0, '2014-12-08','Your original test case. This one should stay.')
, (3344296, 14598, -73.19, 0, '2014-12-08',null)
, (3341758, 14598, 73.19, 0, '2014-12-08',null)
--.....................................
--Test case 2: How do you match these up?
, (2001, 2, 73.19, 0, '2015-01-06','Charge 2.1')
, (2002, 2, 73.19, 0, '2015-01-06','Charge 2.2')
, (2003, 2, 73.19, 0, '2015-01-06','Charge 2.3')
, (2004, 2, -73.19, 0, '2015-01-06','Credit for charge 2.3')
, (2005, 2, -73.19, 0, '2015-01-06','Credit for charge 2.1')
--.....................................
--Test case 3
, (3001, 3, 73.19, 0, '2015-01-06','Charge 3.1')
, (3002, 3, 73.19, 0, '2015-01-06','Charge 3.2')
, (3003, 3, -146.38, 0, '2015-01-06','Credit for charges 3.1 and 3.2')
--.....................................
--Test case 4: Do you hide 4001 or 4002?
, (4001, 4, 73.19, 0, '2015-01-06','Cable')
, (4002, 4, 73.19, 0, '2015-01-06','Internet')
, (4003, 4, -73.19, 0, '2015-01-06','Misc Credit')
--.....................................
--Test case 5: remove all lines except the first
, (5000, 5, 9.99, 0, '2015-01-06','Charge 5.0 (Should stay)')
, (5001, 5, 11.11, 0, '2015-01-06','Charge 5.1')
, (5002, 5, 22.22, 0, '2015-01-06','Charge 5.2')
, (5003, 5, 33.33, 0, '2015-01-06','Charge 5.3')
, (5004, 5, -11.11, 0, '2015-01-06','Credit for charge 5.1')
, (5005, 5, -33.33, 0, '2015-01-06','Credit for charge 5.3')
, (5006, 5, -22.22, 0, '2015-01-06','Credit for charge 5.2')
--.....................................
--Test case 6: credit occurs before charge, so keep both
, (6000, 6, -73.19, 0, '2015-01-06','Credit occurs before charge')
, (6001, 6, 73.19, 0, '2015-01-06','Charge 6.1')
;
SELECT i.*
FROM #SysInvTable i
WHERE i.SysInvNum not in
(
SELECT IgnoreInvNum = case when c.N = 1 then max(t.SysInvNum) else min(t2.SysInvNum) end
FROM #SysInvTable t
INNER JOIN #SysInvTable t2
ON t.ref1 = t2.ref1
AND t.DueDate = t2.DueDate
CROSS APPLY (SELECT 1 AS N UNION ALL SELECT 2 as N) AS c --used to return both the T and T2 SysInvNum's to exclude
WHERE 1=1
AND t.LineBalance > 0 AND t2.LineBalance < 0
AND t.SysInvNum < t2.SysInvNum --make sure the credit came in after the positive SysInvNum
AND t.LineBalance = t2.LineBalance * -1
GROUP BY t.REF1, t.DueDate, abs(t.LineBalance), c.n
HAVING Count(*) = 1
)
;
DROP TABLE #SysInvTable;
It's clunky, but this allows you to identify the matching negative offenders. You'll need to exclude the sysInvNum in both columns from your resultset
Create table #tmp( SysInvNum int,Ref1 int, LineBalance decimal(8,2))
insert into #tmp
values(3344299,14602,558.83)
,(3344298,14598,281.47)
,(3344297,14602,-95.98)
,(3344296,14598,-73.19)
,(3341758,14598,73.19)
--Select * From #tmp
Select *
INTO #t1
From #tmp
Where LineBalance < 0
--Select * From #t1
Select SysInvNum1 = #tmp.SysInvNum, SysInvNum2 = #T1.SysInvNum
INTO #T2
From #tmp
LEFT JOIN #t1 On #tmp.Ref1 = #T1.Ref1 and #tmp.LineBalance = -#T1.LineBalance
where #t1.SysInvNum is not null
Select * From
#tmp
Where SysInvNum not in(
Select SysInvNum1 from #t2
union Select SysInvNum2 from #t2
)
drop table #tmp
drop table #t1
drop table #t2
This solution requires SQL Server 2012 or later (it uses LAG and LEAD), but I decided to leave it here in case someone has to do something similar in the future, because it is pretty straightforward.
....
This should only remove two transactions for a specific recipient if their sum is 0 and there are no other transactions between them (and the negative transaction happens after the positive transaction). It is more conservative and safe.
There should be information about the order of the transactions, something like a date column. Then you should order by that column, instead of the PRIMARY KEY column, in the ORDER BY of the OVER clause.
IF OBJECT_ID('Receipts', 'U') IS NOT NULL
DROP TABLE Receipts
CREATE TABLE Receipts
(
SysInvNum INT PRIMARY KEY IDENTITY(1,1),
REF1 INT,
LineBalance DECIMAL(10,2)
)
INSERT INTO Receipts
values
(14602,558.83),
(14598,281.47),
(14602,-95.98),
(14598,73.19),
(14598,-73.19),
(14598,73.19),
(14598,73.19),
(14598, 215.6),
(14598,73.19);
WITH ghosts AS
(
SELECT SysInvNum, REF1, LineBalance,
LAG(LineBalance, 1, 0) OVER (PARTITION BY REF1 ORDER BY SysInvNum) PreviousLineBalance,
LEAD(LineBalance, 1, 0) OVER (PARTITION BY REF1 ORDER BY SysInvNum) NextLineBalance
FROM Receipts
)
SELECT r.SysInvNum, r.REF1, r.LineBalance
FROM Receipts r
JOIN ghosts g On (r.SysInvNum = g.SysInvNum)
WHERE NOT (g.LineBalance + g.PreviousLineBalance = 0 AND g.LineBalance < 0)
AND NOT (g.LineBalance + g.NextLineBalance = 0 AND g.LineBalance > 0)
ORDER BY r.SysInvNum
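To see which rows this query hides, the LAG/LEAD comparison can be replayed in Python. The sketch below is an illustration only: the names RECEIPTS and visible_receipts are hypothetical, SysInvNum is assumed to follow the IDENTITY(1,1) insert order, and a missing previous or next row counts as 0, like the default argument of LAG and LEAD.

```python
from collections import defaultdict
from decimal import Decimal as D

AMOUNTS = [
    (14602, "558.83"), (14598, "281.47"), (14602, "-95.98"),
    (14598, "73.19"), (14598, "-73.19"), (14598, "73.19"),
    (14598, "73.19"), (14598, "215.6"), (14598, "73.19"),
]
# (SysInvNum, REF1, LineBalance); SysInvNum numbered 1..9 in insert order
RECEIPTS = [(i, ref1, D(a)) for i, (ref1, a) in enumerate(AMOUNTS, start=1)]


def visible_receipts(receipts):
    by_ref = defaultdict(list)
    for inv, ref1, bal in sorted(receipts):
        by_ref[ref1].append((inv, bal))
    hidden = set()
    for rows in by_ref.values():
        for i, (inv, bal) in enumerate(rows):
            prev_bal = rows[i - 1][1] if i > 0 else D(0)
            next_bal = rows[i + 1][1] if i + 1 < len(rows) else D(0)
            # A credit right after its charge, or a charge right before its credit.
            if (bal < 0 and bal + prev_bal == 0) or (bal > 0 and bal + next_bal == 0):
                hidden.add(inv)
    return [r for r in sorted(receipts) if r[0] not in hidden]


for row in visible_receipts(RECEIPTS):
    print(row)
```

With this data only rows 4 and 5 (73.19 immediately followed by -73.19 for REF1 14598) are hidden; the later 73.19 rows stay because no credit follows them directly.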