I have the following data shown below. It was produced by the following query:
;with noms as
(
select DateS, Region, Sales from tblSalesRegion
where Id = 'B2PM'
)
select * from
noms as source pivot (max(Sales) for Region in ([UK],[US],[EUxUK],[JAP],[Brazil])) as pvt
order by DateS
Data:
DateS UK US EUxUK JAP Brazil
2015-11-24 23634 22187 NULL NULL NULL
2015-11-30 23634 22187 NULL NULL NULL
2015-12-01 23634 22187 NULL NULL NULL
2015-12-02 23634 22187 NULL NULL NULL
2015-12-03 23634 22187 NULL NULL NULL
2015-12-04 56000 22187 NULL NULL NULL
2015-12-07 56000 22187 NULL NULL NULL
2015-12-08 56000 22187 NULL NULL NULL
2015-12-09 56000 22187 NULL NULL NULL
2015-12-10 56000 10025 NULL NULL NULL
2015-12-11 56000 10025 NULL NULL NULL
2015-12-14 56000 10025 NULL NULL NULL
Below is the result I'm after. Basically, whenever a value changes in any of the five columns (excluding the DateS column), I want that row to be shown. Is there a way to do this in SQL? Because I need the date, I don't think a simple GROUP BY would work. It would also be nice if I could change the NULLs to zeros.
Result I'm looking for:
DateS UK US EUxUK JAP Brazil
2015-11-24 23634 22187 0 0 0
2015-12-04 56000 22187 0 0 0
2015-12-10 56000 10025 0 0 0
Seems like a simple GROUP BY is what you want:
;WITH noms AS
(
SELECT DateS, Region, Sales
FROM tblSalesRegion
WHERE Id = 'B2PM'
)
SELECT MIN(DateS) AS DateS, [UK],[US],[EUxUK],[JAP],[Brazil]
FROM (
SELECT DateS,
COALESCE([UK], 0) AS [UK],
COALESCE([US], 0) AS [US],
COALESCE([EUxUK], 0) AS [EUxUK],
COALESCE([JAP], 0) AS [JAP],
COALESCE([Brazil], 0) AS [Brazil]
FROM noms AS source
PIVOT (
MAX(Sales) FOR Region IN ([UK],[US],[EUxUK],[JAP],[Brazil])) AS pvt
) AS t
GROUP BY [UK],[US],[EUxUK],[JAP],[Brazil]
ORDER BY MIN(DateS)
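One caveat with grouping on the five value columns: if a column ever returns to an earlier value (say UK goes 23634, 56000, 23634), both runs of 23634 fall into the same group and the later change is lost. A minimal Python sketch of the alternative, comparing each row with the one before it, is below. The rows are the question's pivoted data with NULLs already replaced by zeros, and the function name `changed_rows` is made up for illustration. On SQL Server 2012 or later the same comparison could be written with LAG().

```python
# Pivoted rows from the question: (DateS, UK, US, EUxUK, JAP, Brazil), NULLs as 0.
rows = [
    ("2015-11-24", 23634, 22187, 0, 0, 0),
    ("2015-11-30", 23634, 22187, 0, 0, 0),
    ("2015-12-01", 23634, 22187, 0, 0, 0),
    ("2015-12-02", 23634, 22187, 0, 0, 0),
    ("2015-12-03", 23634, 22187, 0, 0, 0),
    ("2015-12-04", 56000, 22187, 0, 0, 0),
    ("2015-12-07", 56000, 22187, 0, 0, 0),
    ("2015-12-08", 56000, 22187, 0, 0, 0),
    ("2015-12-09", 56000, 22187, 0, 0, 0),
    ("2015-12-10", 56000, 10025, 0, 0, 0),
    ("2015-12-11", 56000, 10025, 0, 0, 0),
    ("2015-12-14", 56000, 10025, 0, 0, 0),
]


def changed_rows(rows):
    """Keep a row only when its value columns differ from the previous row's."""
    kept = []
    previous_values = None
    # ISO-formatted dates sort chronologically as plain strings.
    for row in sorted(rows, key=lambda r: r[0]):
        values = row[1:]
        if values != previous_values:
            kept.append(row)
        previous_values = values
    return kept


result = changed_rows(rows)
for row in result:
    print(*row)
```

For the sample data this keeps the same three dates as the GROUP BY query; the two approaches only diverge when a combination of values reappears later.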
Example dataset.
CLINIC APPTDATETIME PATIENT_ID NEW_FOLLOWUP_FLAG
TGYN 20/07/2022 09:00:00 1 N
TGYN 20/07/2022 09:45:00 2 F
TGYN 20/07/2022 10:05:00 NULL NULL
TGYN 20/07/2022 10:05:00 4 F
TGYN 20/07/2022 10:25:00 5 F
TGYN 20/07/2022 10:30:00 NULL NULL
TGYN 20/07/2022 10:35:00 NULL NULL
TGYN 20/07/2022 10:40:00 NULL NULL
TGYN 20/07/2022 10:45:00 NULL NULL
TGYN 20/07/2022 11:10:00 6 F
TGYN 20/07/2022 11:10:00 7 F
As you can see there are times with multiple patients, times with empty slots and times with both (generally DQ errors).
I'm trying to calculate how many slots were filled and how many of those were new (N) or follow-up (F). If there is a slot with a patient and also a NULL row, then I only want to count the row with the patient. If there are only NULL rows for a timeslot, then I want to count that as 'unfilled'.
From this dataset I would like to calculate the following for each clinic and appointment date.
CLINIC  APPTDATE    N Capacity  F Capacity  Unfilled Capacity
TGYN    20/07/2022  1           5           4
What's the best way to go about this?
I've considered taking a list of distinct values for each clinic and date and then joining to that, but wanted to know if there is a more elegant way.
First I set up some demo data in a table from what you provided:
DECLARE @table TABLE (CLINIC NVARCHAR(4), APPTDATETIME DATETIME, PATIENT_ID INT, NEW_FOLLOWUP_FLAG NVARCHAR(1))
INSERT INTO @table (CLINIC, APPTDATETIME, PATIENT_ID, NEW_FOLLOWUP_FLAG) VALUES
('TGYN','07/20/2022 09:00:00', 1 ,'N'),
('TGYN','07/20/2022 09:45:00', 2 ,'F'),
('TGYN','07/20/2022 10:05:00', NULL ,NULL),
('TGYN','07/20/2022 10:05:00', 4 ,'F'),
('TGYN','07/20/2022 10:25:00', 5 ,'F'),
('TGYN','07/20/2022 10:30:00', NULL ,NULL),
('TGYN','07/20/2022 10:35:00', NULL ,NULL),
('TGYN','07/20/2022 10:40:00', NULL ,NULL),
('TGYN','07/20/2022 10:45:00', NULL ,NULL),
('TGYN','07/20/2022 11:10:00', 6 ,'F'),
('TGYN','07/20/2022 11:10:00', 7 ,'F')
Reading through your description, it looks like you'd need a couple of CASE expressions and a GROUP BY:
SELECT CLINIC, CAST(APPTDATETIME AS DATE) AS APPTDATE,
SUM(CASE WHEN NEW_FOLLOWUP_FLAG = 'N' THEN 1 ELSE 0 END) AS NCapacity,
SUM(CASE WHEN NEW_FOLLOWUP_FLAG = 'F' THEN 1 ELSE 0 END) AS FCapacity,
SUM(CASE WHEN NEW_FOLLOWUP_FLAG IS NULL THEN 1 ELSE 0 END) AS UnfilledCapacity
FROM @table
GROUP BY CLINIC, CAST(APPTDATETIME AS DATE)
Which returns a result set like this:
CLINIC APPTDATE NCapacity FCapacity UnfilledCapacity
------------------------------------------------------------
TGYN 2022-07-20 1 5 5
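Note that I cast the datetime column to a date and grouped by that.
Each CASE expression tests a condition (is the flag NULL, 'F' or 'N') and returns 1 when it holds and 0 otherwise; SUM then adds these up. Because every NULL row is counted, the empty row at 10:05 that shares its slot with patient 4 is included, which is why UnfilledCapacity is 5 here rather than the 4 in your expected output.
Your title also asked about finding duplicates in the data set. You should probably have a constraint on this table that forces the combination of CLINIC and APPTDATETIME to be unique. That would prevent duplicate rows from being inserted in the first place.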
If you want to find them in the table try something like this:
SELECT CLINIC, APPTDATETIME, COUNT(*) AS Cnt
FROM @table
GROUP BY CLINIC, APPTDATETIME
HAVING COUNT(*) > 1
Which from the test data returned:
CLINIC APPTDATETIME Cnt
-----------------------------------
TGYN 2022-07-20 10:05:00.000 2
TGYN 2022-07-20 11:10:00.000 2
Indicating there are dupes for those clinic/datetime combinations.
HAVING is the magic here: it filters on the aggregate, so we can count the rows in each group and keep only the groups where the count is greater than 1.
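For readers who think more easily in procedural terms, the GROUP BY ... HAVING COUNT(*) > 1 pattern amounts to counting each key and keeping the keys seen more than once. A small Python sketch of that idea, using the clinic and time-slot pairs from the demo data (the variable names are illustrative only):

```python
from collections import Counter

# (CLINIC, APPTDATETIME) pairs from the demo data, one per row.
slots = [
    ("TGYN", "2022-07-20 09:00:00"),
    ("TGYN", "2022-07-20 09:45:00"),
    ("TGYN", "2022-07-20 10:05:00"),
    ("TGYN", "2022-07-20 10:05:00"),
    ("TGYN", "2022-07-20 10:25:00"),
    ("TGYN", "2022-07-20 10:30:00"),
    ("TGYN", "2022-07-20 10:35:00"),
    ("TGYN", "2022-07-20 10:40:00"),
    ("TGYN", "2022-07-20 10:45:00"),
    ("TGYN", "2022-07-20 11:10:00"),
    ("TGYN", "2022-07-20 11:10:00"),
]

# GROUP BY CLINIC, APPTDATETIME with COUNT(*)
counts = Counter(slots)

# HAVING COUNT(*) > 1
duplicates = {slot: n for slot, n in counts.items() if n > 1}

for (clinic, slot), n in duplicates.items():
    print(clinic, slot, n)
```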
This is basically a straightforward conditional aggregation with GROUP BY, with the slight complication of excluding NULL rows where a real appointment exists for the same slot.
For this you can add an anti-semi self-join using NOT EXISTS, so that a NULL row is not counted as unfilled capacity when there is also valid data for the same clinic and appointment time:
select CLINIC, Convert(date, APPTDATETIME) AppDate,
Sum(case when NEW_FOLLOWUP_FLAG = 'N' then 1 end) N_Capacity,
Sum(case when NEW_FOLLOWUP_FLAG = 'F' then 1 end) F_Capacity,
Sum(case when NEW_FOLLOWUP_FLAG is null then 1 end) U_Capacity
from t
where not exists (
-- skip an empty row when the same clinic and slot also has a patient row
select * from t t2
where t.PATIENT_ID is null
and t2.PATIENT_ID is not null
and t.CLINIC = t2.CLINIC
and t.APPTDATETIME = t2.APPTDATETIME
)
group by CLINIC, Convert(date, APPTDATETIME);
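To check the counting rule independently of SQL, here is a rough Python sketch of the same logic: rows are grouped by clinic and time slot, NULL rows are ignored whenever the slot has at least one patient, and a slot with only NULL rows counts as unfilled. It assumes the flag is always 'N', 'F' or NULL, and it counts an empty slot once even if it were listed several times, which is one reading of the question. The function name `capacity` is hypothetical.

```python
from collections import defaultdict

# (CLINIC, APPTDATETIME, PATIENT_ID, NEW_FOLLOWUP_FLAG) from the example dataset.
appointments = [
    ("TGYN", "2022-07-20 09:00:00", 1, "N"),
    ("TGYN", "2022-07-20 09:45:00", 2, "F"),
    ("TGYN", "2022-07-20 10:05:00", None, None),
    ("TGYN", "2022-07-20 10:05:00", 4, "F"),
    ("TGYN", "2022-07-20 10:25:00", 5, "F"),
    ("TGYN", "2022-07-20 10:30:00", None, None),
    ("TGYN", "2022-07-20 10:35:00", None, None),
    ("TGYN", "2022-07-20 10:40:00", None, None),
    ("TGYN", "2022-07-20 10:45:00", None, None),
    ("TGYN", "2022-07-20 11:10:00", 6, "F"),
    ("TGYN", "2022-07-20 11:10:00", 7, "F"),
]


def capacity(appointments):
    """Count N, F and unfilled slots per (clinic, date)."""
    # Collect the flags seen in each (clinic, slot).
    flags_by_slot = defaultdict(list)
    for clinic, slot, _patient_id, flag in appointments:
        flags_by_slot[(clinic, slot)].append(flag)

    totals = defaultdict(lambda: {"N": 0, "F": 0, "Unfilled": 0})
    for (clinic, slot), flags in flags_by_slot.items():
        day = slot[:10]  # date part of 'YYYY-MM-DD HH:MM:SS'
        filled = [flag for flag in flags if flag is not None]
        if filled:
            # NULL rows in a slot that also has patients are ignored.
            for flag in filled:
                totals[(clinic, day)][flag] += 1
        else:
            totals[(clinic, day)]["Unfilled"] += 1
    return dict(totals)


result = capacity(appointments)
print(result)
```

For the example data this gives 1 new, 5 follow-up and 4 unfilled for TGYN on 2022-07-20, matching the expected table in the question.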
I am trying to solve an issue with LAG. I want to fill the current row (if it is NULL) with the previous value and keep repeating that value until the next non-NULL value.
Here is my code:
SELECT PersonID, FirstDateofTermYear,
case
when PhysicalOVRARiskRating is null then LAG (PhysicalOVRARiskRating, 2, PhysicalOVRARiskRating) OVER (PARTITION BY PersonID ORDER BY FirstDateofTermYear)
ELSE PhysicalOVRARiskRating END AS PhysicalOVRARiskRating,
case
when PsychologicalOVRARiskRating is null then LAG (PsychologicalOVRARiskRating, 2, PsychologicalOVRARiskRating) OVER (PARTITION BY PersonID ORDER BY FirstDateofTermYear)
ELSE PsychologicalOVRARiskRating END AS PsychologicalOVRARiskRating
FROM [BI].[vw_Fact_OVT_CCI_V3]
WHERE PersonID = '0258077'
ORDER BY FirstDateofTermYear;
Original result -
PersonID FirstDateofTermYear PhysicalOVRARiskRating PsychologicalOVRARiskRating
0258077 2020-02-03 NULL NULL
0258077 2020-04-28 NULL MEDIUM
0258077 2020-07-20 NULL NULL
0258077 2020-10-12 NULL NULL
0258077 2021-02-01 NULL NULL
0258077 2021-04-19 NULL NULL
0258077 2021-07-12 NULL NULL
0258077 2021-10-05 NULL NULL
0258077 2022-01-31 NULL NULL
0258077 2022-04-26 NULL LOW
0258077 2022-07-18 NULL NULL
Current result -
PersonID FirstDateofTermYear PhysicalOVRARiskRating PsychologicalOVRARiskRating
0258077 2020-02-03 NULL NULL
0258077 2020-04-28 NULL MEDIUM
0258077 2020-07-20 NULL MEDIUM
0258077 2020-10-12 NULL MEDIUM
0258077 2021-02-01 NULL NULL
0258077 2021-04-19 NULL NULL
0258077 2021-07-12 NULL NULL
0258077 2021-10-05 NULL NULL
0258077 2022-01-31 NULL NULL
0258077 2022-04-26 NULL LOW
0258077 2022-07-18 NULL LOW
Expected result -
PersonID FirstDateofTermYear PhysicalOVRARiskRating PsychologicalOVRARiskRating
258077 03/02/2020 NULL NULL
258077 28/04/2020 NULL MEDIUM
258077 20/07/2020 NULL MEDIUM
258077 12/10/2020 NULL MEDIUM
258077 01/02/2021 NULL MEDIUM
258077 19/04/2021 NULL MEDIUM
258077 12/07/2021 NULL MEDIUM
258077 05/10/2021 NULL MEDIUM
258077 31/01/2022 NULL MEDIUM
258077 26/04/2022 NULL LOW
258077 18/07/2022 NULL LOW
How do I repeat the value until we find a value other than NULL? The LAG function only repeats it once or twice, not every time.
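As per @lptr's comment: the running COUNT of non-NULL values only increases on rows that have a value, so each value and the NULL rows that follow it share the same group number, and MIN over that group returns the value.
select PersonID, FirstDateofTermYear
, min(PhysicalOVRARiskRating) over (partition by PersonID, grpphy) as PhysicalOVRARiskRating
, min(PsychologicalOVRARiskRating) over (partition by PersonID, grppsy) as PsychologicalOVRARiskRating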
from (
select *
, count(PhysicalOVRARiskRating) over (partition by PersonID order by FirstDateofTermYear rows unbounded preceding) as grpphy
, count(PsychologicalOVRARiskRating) over (partition by PersonID order by FirstDateofTermYear rows unbounded preceding) as grppsy
from [BI].[vw_Fact_OVT_CCI_V3]
) as d
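The grouping trick can be easier to see outside SQL. The Python sketch below walks one column in date order, keeps a running count of non-NULL values (the equivalent of COUNT(col) OVER (... ROWS UNBOUNDED PRECEDING)), and gives every row the value that opened its group. The function name `fill_down` is made up for this illustration; the input is the PsychologicalOVRARiskRating column from the question.

```python
def fill_down(values):
    """Forward-fill None entries using a running count of non-null values."""
    group = 0          # running count of non-null values seen so far
    group_value = {}   # the one non-null value that starts each group
    filled = []
    for value in values:
        if value is not None:
            group += 1
            group_value[group] = value
        # Rows before the first non-null value are in group 0 and stay None.
        filled.append(group_value.get(group))
    return filled


# PsychologicalOVRARiskRating for PersonID 0258077, ordered by FirstDateofTermYear.
psychological = [None, "MEDIUM", None, None, None, None, None, None, None, "LOW", None]
filled = fill_down(psychological)

for before, after in zip(psychological, filled):
    print(before, "->", after)
```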
My question involves how to identify an index discharge.
The index discharge is the earliest discharge. On that date, the 30 day window starts. Any admissions during that time period are considered readmissions, and they should be ignored. Once the 30 day window is over, then any subsequent discharge is considered an index and the 30 day window begins again.
I can't seem to work out the logic for this. I've tried different windowing functions, I've tried cross joins and cross applies. The issue I keep encountering is that a readmission cannot be an index admission. It must be excluded.
I have successfully written a while loop to solve this problem, but I'd really like to get this in a set based format, if it's possible. I haven't been successful so far.
Ultimate goal is this -
id AdmitDate DischargeDate MedicalRecordNumber IndexYN
1 2021-03-03 00:00:00.000 2021-03-09 13:20:00.000 X0090362 1
4 2021-03-05 00:00:00.000 2021-03-10 16:00:00.000 X0012614 1
6 2021-05-18 00:00:00.000 2021-05-21 22:20:00.000 X0012614 1
7 2021-06-21 00:00:00.000 2021-07-08 13:30:00.000 X0012614 1
8 2021-02-03 00:00:00.000 2021-02-09 17:00:00.000 X0019655 1
10 2021-03-23 00:00:00.000 2021-03-26 16:40:00.000 X0019655 1
11 2021-03-15 00:00:00.000 2021-03-18 15:53:00.000 X4135958 1
13 2021-05-17 00:00:00.000 2021-05-23 14:55:00.000 X4135958 1
15 2021-06-24 00:00:00.000 2021-07-13 15:06:00.000 X4135958 1
Sample code is below.
CREATE TABLE #Admissions
(
[id] INT,
[AdmitDate] DATETIME,
[DischargeDateTime] DATETIME,
[UnitNumber] VARCHAR(20),
[IndexYN] INT
)
INSERT INTO #Admissions
VALUES( 1 ,'2021-03-03' ,'2021-03-09 13:20:00.000' ,'X0090362', NULL)
,(2 ,'2021-03-27' ,'2021-03-30 19:59:00.000' ,'X0090362', NULL)
,(3 ,'2021-03-31' ,'2021-04-04 05:57:00.000' ,'X0090362', NULL)
,(4 ,'2021-03-05' ,'2021-03-10 16:00:00.000' ,'X0012614', NULL)
,(5 ,'2021-03-28' ,'2021-04-16 13:55:00.000' ,'X0012614', NULL)
,(6 ,'2021-05-18' ,'2021-05-21 22:20:00.000' ,'X0012614', NULL)
,(7 ,'2021-06-21' ,'2021-07-08 13:30:00.000' ,'X0012614', NULL)
,(8 ,'2021-02-03' ,'2021-02-09 17:00:00.000' ,'X0019655', NULL)
,(9 ,'2021-02-17' ,'2021-02-22 17:25:00.000' ,'X0019655', NULL)
,(10 ,'2021-03-23' ,'2021-03-26 16:40:00.000' ,'X0019655', NULL)
,(11 ,'2021-03-15' ,'2021-03-18 15:53:00.000' ,'X4135958', NULL)
,(12 ,'2021-04-08' ,'2021-04-13 19:42:00.000' ,'X4135958', NULL)
,(13 ,'2021-05-17' ,'2021-05-23 14:55:00.000' ,'X4135958', NULL)
,(14 ,'2021-06-09' ,'2021-06-14 12:45:00.000' ,'X4135958', NULL)
,(15 ,'2021-06-24' ,'2021-07-13 15:06:00.000' ,'X4135958', NULL)
You can use a recursive CTE to identify all rows associated with each "index" discharge:
with a as (
select a.*,
-- number each patient's stays separately so one patient's window never affects another's
row_number() over (partition by unitnumber order by dischargedatetime) as seqnum
from admissions a
),
cte as (
select id, admitdate, dischargedatetime, unitnumber, seqnum, dischargedatetime as index_dischargedatetime
from a
where seqnum = 1
union all
select a.id, a.admitdate, a.dischargedatetime, a.unitnumber, a.seqnum,
-- an admission more than 30 days after the current index discharge starts a new index
(case when a.admitdate > dateadd(day, 30, cte.index_dischargedatetime)
then a.dischargedatetime else cte.index_dischargedatetime
end) as index_dischargedatetime
from cte join
a
on a.unitnumber = cte.unitnumber
and a.seqnum = cte.seqnum + 1
)
select *
from cte;
You can then incorporate this into an update:
update admissions
set indexyn = (case when admissions.dischargedatetime = cte.index_dischargedatetime then 'Y' else 'N' end)
from cte
where cte.id = admissions.id;
Here is a db<>fiddle. Note that I changed the type of IndexYN to a character to assign 'Y'/'N', which makes sense given the column name.
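As a cross-check on the windowing rule, the following Python sketch walks each patient's admissions in date order and applies it directly. It assumes the 30-day window runs from the index discharge and is compared against later admit dates, which is the reading that reproduces the expected ids in the question; `index_ids` and `window_days` are names invented for this sketch.

```python
from datetime import datetime, timedelta

# (id, AdmitDate, DischargeDateTime, UnitNumber) from the sample code.
admissions = [
    (1, "2021-03-03", "2021-03-09 13:20:00", "X0090362"),
    (2, "2021-03-27", "2021-03-30 19:59:00", "X0090362"),
    (3, "2021-03-31", "2021-04-04 05:57:00", "X0090362"),
    (4, "2021-03-05", "2021-03-10 16:00:00", "X0012614"),
    (5, "2021-03-28", "2021-04-16 13:55:00", "X0012614"),
    (6, "2021-05-18", "2021-05-21 22:20:00", "X0012614"),
    (7, "2021-06-21", "2021-07-08 13:30:00", "X0012614"),
    (8, "2021-02-03", "2021-02-09 17:00:00", "X0019655"),
    (9, "2021-02-17", "2021-02-22 17:25:00", "X0019655"),
    (10, "2021-03-23", "2021-03-26 16:40:00", "X0019655"),
    (11, "2021-03-15", "2021-03-18 15:53:00", "X4135958"),
    (12, "2021-04-08", "2021-04-13 19:42:00", "X4135958"),
    (13, "2021-05-17", "2021-05-23 14:55:00", "X4135958"),
    (14, "2021-06-09", "2021-06-14 12:45:00", "X4135958"),
    (15, "2021-06-24", "2021-07-13 15:06:00", "X4135958"),
]


def index_ids(admissions, window_days=30):
    """Return the ids of index stays, scanning each patient in admit-date order."""
    window_end = {}  # per patient: when the current 30-day window closes
    found = []
    # ISO date strings sort chronologically, so (unit, admit) orders each patient's stays.
    for adm_id, admit, discharge, unit in sorted(admissions, key=lambda a: (a[3], a[1])):
        admit_dt = datetime.fromisoformat(admit)
        discharge_dt = datetime.fromisoformat(discharge)
        if unit not in window_end or admit_dt > window_end[unit]:
            # Outside any open window: this stay is an index and opens a new window.
            found.append(adm_id)
            window_end[unit] = discharge_dt + timedelta(days=window_days)
        # Otherwise the stay is a readmission and leaves the window unchanged.
    return sorted(found)


index_stays = index_ids(admissions)
print(index_stays)
```

The key point, which the recursive CTE also relies on, is that each decision depends on the most recent index stay rather than on the immediately preceding row, so a plain LAG over the previous row is not enough.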
I have a table where I have been gathering users' clock-in and clock-out times, and this gets displayed on a calendar with no problem. However, I would now like to see what hours all users are working.
Every time a user clocks in, a new record is created. Every time a user clocks out, another new record is created.
My SQL code below allows me to bring up Clock In and Clock Out times:
DECLARE @StartDate DateTime;
DECLARE @EndDate DateTime;
DECLARE @AssumedShiftStartTime DateTime;
DECLARE @AssumedShiftEndTime DateTime;
DECLARE @EmployeeName nvarchar(200);
DECLARE @ShiftStart DateTime
DECLARE @ShiftEnd DateTime
-- Date format: YYYY-MM-DD
SET @StartDate = '2014-07-01 00:00:00'
SET @EndDate = DATEADD (DAY, 1, @StartDate); -- Add one day
SET @AssumedShiftEndTime = '18:00:00'
SET @AssumedShiftStartTime = '09:00:00'
SET @EmployeeName = 'Paul';
-------------- Get Clock IN / OUT TIMES -----------------
SELECT EmployeeAttendance.LastUpdate, EmployeeAttendance.ClockInTime, EmployeeAttendance.ClockOutTime
FROM EmployeeAttendance INNER JOIN
Membership ON EmployeeAttendance.UserId = Membership.UserId
WHERE EmployeeAttendance.LastUpdate >= @StartDate AND EmployeeAttendance.LastUpdate <= @EndDate
AND Membership.Username = @EmployeeName
Which gives the following results:
LastUpdate ClockInTime ClockOutTime
2014-07-01 08:48:08.650 2014-07-01 08:48:08.650 NULL
2014-07-01 18:04:39.943 NULL 2014-07-01 18:04:39.923
2014-07-02 08:48:08.680 2014-07-01 09:00:08.340 NULL
2014-07-02 18:04:39.343 NULL 2014-07-01 18:00:39.623
2014-07-03 08:48:08.620 2014-07-01 08:58:08.860 NULL
2014-07-03 18:04:39.455 NULL 2014-07-01 18:05:39.985
What I am really trying to achieve is something that returns the following results.
EDIT: Where a result is NULL, I want to use @AssumedShiftStartTime or @AssumedShiftEndTime so that the total hours can still be calculated, but this gets difficult because two separate records are stored for Clock In and Clock Out:
DATE CLOCK-IN-TIME CLOCK-OUT-TIME TOTAL-HOURS
2014-07-01 08:49 18:04 9 Hours 15 Mins
2014-07-02 09:00 18:00 9 Hours 00 Mins
2014-07-03 08:58 18:05 9 Hours 07 Mins
Total-This-Month
27 Hours 15 Mins
EDIT: Thank you, Sean Lange, for your help. After applying the changes from your reply I get the following output, which has two rows per day: one for Clock In and one for Clock Out. I am trying to work out the best way to get the results for a single day, calculate the total hours, then merge the next two rows and calculate their hours, and so on. I think this is more complicated than it needs to be; maybe it is time to rework the SQL logic?
2014-07-01 08:48:08.650 NULL NULL NULL
NULL 2014-07-01 18:04:39.923 NULL NULL
2014-07-02 08:54:03.483 NULL NULL NULL
NULL 2014-07-02 17:09:34.940 NULL NULL
2014-07-03 08:48:01.070 NULL NULL NULL
NULL 2014-07-03 18:12:11.487 NULL NULL
2014-07-04 08:48:07.983 NULL NULL NULL
NULL 2014-07-04 18:07:09.390 NULL NULL
2014-07-05 08:56:24.410 NULL NULL NULL
NULL 2014-07-05 14:19:12.800 NULL NULL
2014-07-08 08:44:56.727 NULL NULL NULL
NULL 2014-07-08 18:15:12.143 NULL NULL
2014-07-09 08:46:15.103 NULL NULL NULL
NULL 2014-07-09 17:10:46.327 NULL NULL
2014-07-10 08:57:14.733 NULL NULL NULL
NULL 2014-07-10 18:10:37.897 NULL NULL
2014-07-11 08:52:10.783 NULL NULL NULL
NULL 2014-07-11 18:08:58.580 NULL NULL
2014-07-12 08:56:20.073 NULL NULL NULL
NULL 2014-07-12 14:15:44.103 NULL NULL
2014-07-15 08:47:04.330 NULL NULL NULL
NULL 2014-07-15 18:10:05.800 NULL NULL
2014-07-16 08:56:34.490 NULL NULL NULL
NULL 2014-07-16 17:05:06.627 NULL NULL
2014-07-17 08:46:37.263 NULL NULL NULL
NULL 2014-07-17 18:06:08.840 NULL NULL
2014-07-18 08:52:56.200 NULL NULL NULL
NULL 2014-07-18 18:11:25.750 NULL NULL
2014-07-19 08:54:36.277 NULL NULL NULL
NULL 2014-07-19 14:15:09.620 NULL NULL
2014-07-22 08:56:30.623 NULL NULL NULL
NULL 2014-07-22 16:03:00.653 NULL NULL
2014-07-23 08:49:53.687 NULL NULL NULL
NULL 2014-07-23 17:07:37.943 NULL NULL
2014-07-24 08:52:08.690 NULL NULL NULL
2014-07-25 08:57:13.477 NULL NULL NULL
NULL 2014-07-25 18:09:01.793 NULL NULL
2014-07-26 08:53:42.597 NULL NULL NULL
NULL 2014-07-26 14:03:21.063 NULL NULL
Any help would be gratefully accepted.
Thank you
Here we make up some test data:
DECLARE @TimeSheet TABLE
(
EmpId INT,
LastUpdate DATETIME,
ClockInTime DATETIME,
ClockOutTime DATETIME
)
INSERT INTO @TimeSheet
VALUES
(201, '2014-07-01 08:48:08.650', '2014-07-01 08:48:08.650' ,NULL ),
(201, '2014-07-01 18:04:39.943', NULL ,'2014-07-01 18:04:39.923'),
(201, '2014-07-02 08:48:08.680', '2014-07-01 09:00:08.340' ,NULL ),
(201, '2014-07-02 18:04:39.343', NULL ,'2014-07-01 18:00:39.623'),
(201, '2014-07-03 08:48:08.620', '2014-07-01 08:58:08.860' ,NULL ),
(201, '2014-07-03 18:04:39.455', NULL ,'2014-07-01 18:05:39.985'),
(110, '2014-07-01 08:48:08.650', '2014-07-01 06:48:08.650' ,NULL ),
(110, '2014-07-01 18:04:39.943', NULL ,'2014-07-01 14:01:39.923'),
(110, '2014-07-02 08:48:08.680', '2014-07-01 07:10:08.340' ,NULL ),
(110, '2014-07-02 18:04:39.343', NULL ,'2014-07-01 14:00:39.623'),
(110, '2014-07-03 08:48:08.620', '2014-07-01 06:58:58.860' ,NULL ),
(110, '2014-07-03 18:04:39.455', NULL ,'2014-07-01 14:01:39.985');
Now let's create a CTE to number our data:
WITH TimeRows AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY EmpId, CAST(LastUpdate AS DATE) ORDER BY LastUpdate) RN
FROM @TimeSheet
)
Now we query the CTE against itself to find our clock-in and clock-out times (this SELECT directly follows the CTE as part of the same statement):
SELECT T1.EmpId,
T1.ClockInTime,
T2.ClockOutTime,
-- whole hours from total minutes; DATEDIFF(HOUR, ...) counts hour boundaries crossed and would overstate this
DATEDIFF(MINUTE, T1.ClockInTime, T2.ClockOutTime) / 60 AS DHour,
DATEDIFF(MINUTE, T1.ClockInTime, T2.ClockOutTime) % 60 AS DMinutes
FROM TimeRows T1
INNER JOIN TimeRows T2
ON T2.EmpId = T1.EmpId
AND T2.RN = T1.RN + 1
AND CAST(T2.LastUpdate AS DATE) = CAST(T1.LastUpdate AS DATE)
Here is the output:
EmpId ClockInTime ClockOutTime DHour DMinutes
110 2014-07-01 06:48:08.650 2014-07-01 14:01:39.923 7 13
110 2014-07-01 07:10:08.340 2014-07-01 14:00:39.623 6 50
110 2014-07-01 06:58:58.860 2014-07-01 14:01:39.987 7 3
201 2014-07-01 08:48:08.650 2014-07-01 18:04:39.923 9 16
201 2014-07-01 09:00:08.340 2014-07-01 18:00:39.623 9 0
201 2014-07-01 08:58:08.860 2014-07-01 18:05:39.987 9 7
Does something like this help?
declare @Times table(ClockIn datetime, ClockOut datetime)
insert @Times
select '2014-07-01 08:49', '2014-07-01 18:04' union all
select '2014-07-01 09:00', '2014-07-01 18:00'
select *
-- integer division of the total minutes gives whole hours, whatever minute the shift starts on
, datediff(minute, ClockIn, ClockOut) / 60 as MyHours
,datediff(minute, ClockIn, ClockOut) % 60 as MyMinutes
from @Times
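The arithmetic behind the hours and minutes columns is worth spelling out, because DATEDIFF(HOUR, ...) in SQL Server counts the hour boundaries crossed rather than elapsed hours: 08:49 to 18:04 crosses ten boundaries but is only 9 hours 15 minutes long. A short Python sketch of the safer approach, taking the total minutes and splitting them with integer division, is below. The helper name `hours_and_minutes` is made up, and it truncates leftover seconds, whereas DATEDIFF(MINUTE, ...) counts minute boundaries, so the two can differ by a minute at the edges.

```python
from datetime import datetime


def hours_and_minutes(clock_in, clock_out):
    """Split the elapsed time between two timestamps into (hours, minutes)."""
    start = datetime.fromisoformat(clock_in)
    end = datetime.fromisoformat(clock_out)
    total_minutes = int((end - start).total_seconds() // 60)
    # divmod gives whole hours and the remaining minutes in one step.
    return divmod(total_minutes, 60)


shifts = [
    ("2014-07-01 08:49", "2014-07-01 18:04"),
    ("2014-07-01 09:00", "2014-07-01 18:00"),
    ("2014-07-01 08:00", "2014-07-01 18:04"),  # starts exactly on the hour
]

for clock_in, clock_out in shifts:
    hours, minutes = hours_and_minutes(clock_in, clock_out)
    print(f"{clock_in} -> {clock_out}: {hours} Hours {minutes:02d} Mins")
```

A monthly total can be built the same way: sum the total minutes across all days first, then split the sum into hours and minutes once at the end.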
I have a set of data that looks like the sample below. For each debnr, I want to remove one of the rows that has P as its type. I don't care which one. The two rows with P in the type column are identical except for the date. How would I select just one row with P in the type column?
debnr docno date type num amount
4 NULL 2013-08-29 07:26:25.000 P 1761 -12
4 NULL 2013-09-12 00:00:00.000 P 1761 -12
4 168371 2013-08-29 00:00:00.000 I 168371 12
5 NULL 2013-10-11 09:24:58.000 P 7287 -24
5 NULL 2013-10-14 00:00:00.000 P 7287 -24
5 170366 2013-10-11 00:00:00.000 I 170366 24
6 NULL 2013-10-24 00:00:00.000 P 4023 -465
6 NULL 2013-10-24 09:42:18.000 P 4023 -465
6 171095 2013-10-24 00:00:00.000 I 171095 465
7 NULL 2013-12-16 00:00:00.000 P 171502 -394.2
7 NULL 2013-12-16 00:00:00.000 P 6601 -394.2
7 171502 2013-10-30 00:00:00.000 I 171502 394.2
How would I get it to look like this?
4 NULL 2013-09-12 00:00:00.000 P 1761 -12
4 168371 2013-08-29 00:00:00.000 I 168371 12
5 NULL 2013-10-14 00:00:00.000 P 7287 -24
5 170366 2013-10-11 00:00:00.000 I 170366 24
6 NULL 2013-10-24 09:42:18.000 P 4023 -465
6 171095 2013-10-24 00:00:00.000 I 171095 465
7 NULL 2013-12-16 00:00:00.000 P 6601 -394.2
7 171502 2013-10-30 00:00:00.000 I 171502 394.2
Shot in the dark:
select
debnr,
docno,
max(date),
type,
num,
amount
from magical_table
group by
debnr,
docno,
type,
num,
amount
Given your sample above, you could GROUP BY and use an aggregate. If, however, the other fields weren't identical (the amount, for instance), you could use the ROW_NUMBER() function instead to avoid needing an aggregate:
;WITH cte AS (SELECT *
-- number the rows within each debnr and type, so the 'P' rows are counted on their own
,CASE WHEN TYPE = 'P' THEN ROW_NUMBER() OVER(PARTITION BY debnr, type ORDER BY (SELECT 1))
ELSE 0
END AS RN
FROM Table1)
SELECT *
FROM cte
WHERE RN <= 1
Demo: SQL Fiddle
The ORDER BY (SELECT 1) could be changed to any field, that's just one way to get an arbitrary result if you don't want a min/max.
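In procedural terms, filtering on a row number of 1 means "keep the first 'P' row seen for each debnr and let every other type through". A minimal Python sketch of that behaviour on the question's data follows; `keep_one_p` is a hypothetical name, and which 'P' row survives simply depends on input order, in line with the question's "I don't care which one".

```python
# (debnr, docno, date, type, num, amount) from the question.
rows = [
    (4, None, "2013-08-29 07:26:25", "P", 1761, -12),
    (4, None, "2013-09-12 00:00:00", "P", 1761, -12),
    (4, 168371, "2013-08-29 00:00:00", "I", 168371, 12),
    (5, None, "2013-10-11 09:24:58", "P", 7287, -24),
    (5, None, "2013-10-14 00:00:00", "P", 7287, -24),
    (5, 170366, "2013-10-11 00:00:00", "I", 170366, 24),
    (6, None, "2013-10-24 00:00:00", "P", 4023, -465),
    (6, None, "2013-10-24 09:42:18", "P", 4023, -465),
    (6, 171095, "2013-10-24 00:00:00", "I", 171095, 465),
    (7, None, "2013-12-16 00:00:00", "P", 171502, -394.2),
    (7, None, "2013-12-16 00:00:00", "P", 6601, -394.2),
    (7, 171502, "2013-10-30 00:00:00", "I", 171502, 394.2),
]


def keep_one_p(rows):
    """Keep every non-P row and only the first P row for each debnr."""
    seen_p = set()  # debnr values that already have a P row in the output
    kept = []
    for row in rows:
        debnr, row_type = row[0], row[3]
        if row_type == "P":
            if debnr in seen_p:
                continue  # a later P row for this debnr: drop it
            seen_p.add(debnr)
        kept.append(row)
    return kept


result = keep_one_p(rows)
for row in result:
    print(*row)
```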
Do you want the rows with type 'I' left ungrouped?
select debnr, docno, max(date) as date, type, num, amount
from magical_table
where type = 'P'
group by debnr, docno, type, num, amount
UNION
select debnr, docno, date, type, num, amount
from magical_table
where type = 'I'