SQL Server - SUM and comma-separated values using GROUP BY clause

I have 2 tables:
NDEvent:
EventId EndTime
33 2020-10-23 15:00:00.000
33 2020-10-23 15:00:00.000
35 2020-10-21 03:30:00.000
35 2020-10-24 15:00:00.000
35 2020-10-25 15:00:00.000
34 2020-10-23 15:00:00.000
EventAppointment:
Id DocId EventId Amount
1 7647 34 10.00
2 7647 34 10.00
3 28531 33 20.00
4 7647 35 20.00
5 7647 35 100.00
6 7647 35 200.00
I want the result to look like this:
DocId EventId Amount Id
7647 34 20.00 1,2
28531 33 20.00 3
7647 35 320.00 4,5,6
What I have tried is:
select e.Amount,e.DoctorId,e.EventId,
Id= STUFF(
(SELECT DISTINCT ',' + CAST(e.Id as nvarchar(max))
from NDEvent nd
inner join EventAppointment e on nd.Id = e.EventId
where
GETDATE() > nd.EndTime
GROUP BY
e.Amount,e.DoctorId,e.EventId,e.Id
FOR XML PATH(''))
, 1, 1, ''
)
from NDEvent nd
inner join EventAppointment e on nd.Id = e.EventId
where
GETDATE() > nd.EndTime
GROUP BY
e.Amount,e.DoctorId,e.EventId
But it is not giving expected result.
Could anyone help with this query? Or point me to a right direction? Thank you.

It doesn't look like you need the NDEvent table here at all (though I include it in the sample data). Just use SUM and STRING_AGG (available from SQL Server 2017) against EventAppointment:
USE Sandbox
GO
WITH NDEvent AS(
SELECT *
FROM (VALUES(33,CONVERT(datetime,'2020-10-23T15:00:00.000')),
(33,CONVERT(datetime,'2020-10-23T15:00:00.000')),
(35,CONVERT(datetime,'2020-10-21T03:30:00.000')),
(35,CONVERT(datetime,'2020-10-24T15:00:00.000')),
(35,CONVERT(datetime,'2020-10-25T15:00:00.000')),
(34,CONVERT(datetime,'2020-10-23T15:00:00.000')))V(EventID,EndTime)),
EventAppointment AS(
SELECT *
FROM (VALUES(1,7647 ,34,10.00),
(2,7647 ,34,10.00),
(3,28531,33,20.00),
(4,7647 ,35,20.00),
(5,7647 ,35,100.00),
(6,7647 ,35,200.00))V(Id,DocId, EventID, Amount))
SELECT DocID,
EventID,
SUM(Amount) AS Amount,
STRING_AGG(Id,',') WITHIN GROUP (ORDER BY Id) AS IDs
FROM EventAppointment EA
GROUP BY DocId,
EventID;
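The attempt in the question also filtered on EndTime, which the query above drops. If that filter matters, one possible sketch is to test it with EXISTS, so that several NDEvent rows for the same EventId do not duplicate appointment rows and inflate the SUM. The rule used here (an event qualifies when at least one of its NDEvent rows has an EndTime in the past) is an assumption about the intent, not something the question states.

```sql
WITH NDEvent AS (
    SELECT *
    FROM (VALUES (33, CONVERT(datetime, '2020-10-23T15:00:00.000')),
                 (33, CONVERT(datetime, '2020-10-23T15:00:00.000')),
                 (35, CONVERT(datetime, '2020-10-21T03:30:00.000')),
                 (35, CONVERT(datetime, '2020-10-24T15:00:00.000')),
                 (35, CONVERT(datetime, '2020-10-25T15:00:00.000')),
                 (34, CONVERT(datetime, '2020-10-23T15:00:00.000'))
         ) V(EventId, EndTime)
),
EventAppointment AS (
    SELECT *
    FROM (VALUES (1, 7647, 34, 10.00),
                 (2, 7647, 34, 10.00),
                 (3, 28531, 33, 20.00),
                 (4, 7647, 35, 20.00),
                 (5, 7647, 35, 100.00),
                 (6, 7647, 35, 200.00)
         ) V(Id, DocId, EventId, Amount)
)
SELECT EA.DocId,
       EA.EventId,
       SUM(EA.Amount) AS Amount,
       -- One comma-separated list of appointment Ids per (DocId, EventId)
       STRING_AGG(CAST(EA.Id AS varchar(10)), ',') WITHIN GROUP (ORDER BY EA.Id) AS Ids
FROM EventAppointment EA
-- Assumed rule: keep an appointment when its event has any EndTime in the past.
-- EXISTS never multiplies EventAppointment rows, unlike a join to NDEvent.
WHERE EXISTS (SELECT 1
              FROM NDEvent nd
              WHERE nd.EventId = EA.EventId
                AND nd.EndTime < GETDATE())
GROUP BY EA.DocId,
         EA.EventId;
```

Every EndTime in the sample data is in the past, so this returns the same three groups as the expected result in the question.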

An alternative that builds the list with STUFF and FOR XML PATH, which also works on versions before SQL Server 2017:
WITH Table1 AS(
SELECT EventId FROM NDEvent
GROUP BY EventId
),
Table2 AS(
SELECT e.DocId,e.EventId,e.Amount,
STUFF((
SELECT ',' + CAST(ee.Id as nvarchar)
FROM EventAppointment ee
where ee.EventId = e.EventId
GROUP BY ee.EventId,ee.Id
FOR XML PATH('')), 1, 1, '') AS Id
FROM Table1 t
LEFT OUTER JOIN EventAppointment e ON t.EventId = e.EventId
)
SELECT DocId,EventId,SUM(Amount) AS Amount,Id FROM Table2
GROUP BY DocId,EventId,Id

Related

SQL query group by with null values is returning duplicates

I have the following query.
My #dates table has the following records:
month year saledate
9 2020 2020-09-01
10 2020 2020-10-01
11 2020 2020-11-01
with monthlysalesdata as(
select month(salesdate) as salemonth, year(salesdate) as saleyear,salesrepid, salespercentage
from salesrecords r
join #dates d on d.saledate = r.salesdate
group by salesrepid, salesdate),
averagefor3months as(
select 0 as salemonth, 0 as saleyear, salesrepid, salespercentage
from monthlysalesdata
group by salesrepid)
finallist as(
select * from monthlysalesdata
union
select * from averagefor3months
This query returns the following records. The averagefor3months result set contains a duplicate row when there is a null record in the monthlysalesdata result. How can I get the average for 3 months as one record instead of duplicates?
salesrepid salemonth saleyear percentage
232 0 0 null -------------this is the duplicate record
232 0 0 90
232 9 2020 80
232 10 2020 null
232 11 2020 100
My first cte has this result:
salerepid month year percentage
---------------------------------------------
232 9 2020 80
232 10 2020 null
232 11 2020 100
My second cte has this result:
salerepid month year percentage
---------------------------------------------
232 0 0 null
232 0 0 90
How can I avoid the duplicate record in my second CTE?
I suspect that you want a summary row per sales rep based on some aggregation. Your question is not clear on what is needed for the aggregation, but something like this:
with ym as (
select r.salesrepid, d.year, d.month, sum(<something>) as whatever
from salesrecords r join
#dates d
on d.saledate = r.salesdate
group by r.salesrepid, d.year, d.month
)
select ym.*
from ym
union all
select salesrepid, null, null, avg(whatever)
from ym
group by salesrepid;
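To make the pattern above concrete, here is a minimal runnable sketch. It is only an illustration under assumptions: the `<something>` placeholder is taken to be AVG(salespercentage), the #dates temp table is replaced by a CTE named dates, the sample rows are the ones shown in the question for sales rep 232, and the summary row uses 0 for month and year as the question does.

```sql
WITH dates AS (
    SELECT *
    FROM (VALUES (9,  2020, CAST('2020-09-01' AS date)),
                 (10, 2020, CAST('2020-10-01' AS date)),
                 (11, 2020, CAST('2020-11-01' AS date))
         ) v([month], [year], saledate)
),
salesrecords AS (
    SELECT *
    FROM (VALUES (232, CAST('2020-09-01' AS date), CAST(80   AS decimal(5,2))),
                 (232, CAST('2020-10-01' AS date), CAST(NULL AS decimal(5,2))),
                 (232, CAST('2020-11-01' AS date), CAST(100  AS decimal(5,2)))
         ) v(salesrepid, salesdate, salespercentage)
),
ym AS (
    -- One row per sales rep and month
    SELECT r.salesrepid, d.[year], d.[month], AVG(r.salespercentage) AS pct
    FROM salesrecords r
    JOIN dates d ON d.saledate = r.salesdate
    GROUP BY r.salesrepid, d.[year], d.[month]
)
SELECT salesrepid, [year], [month], pct
FROM ym
UNION ALL
-- Grouping by salesrepid only gives exactly one summary row per rep;
-- AVG skips the NULL month instead of producing a second row for it
SELECT salesrepid, 0, 0, AVG(pct)
FROM ym
GROUP BY salesrepid;
```

The duplicate in the question comes from keeping salespercentage as a plain column in the summary query, so the NULL and non-NULL values form separate rows. Aggregating the column leaves a single summary row, here the average of 80 and 100.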
I changed the query to select and group directly from the table instead of from the previous CTE, and got my results. Thank you all for helping.
with ym as (
select r.salesrepid, d.year, d.month, sum(<something>) as whatever
from salesrecords r join
#dates d
on d.saledate = r.salesdate
group by r.salesrepid, d.year, d.month
),
threemonthsaverage as (
select r.salesrepid, 0 as year, 0 as month, sum(<something>) as whatever
from salesrecords as r
group by r.salesrepid
)
select * from ym
union
select * from threemonthsaverage

select rows with events related with another events in the same query column

I need to select rows with EventTypeID = 19 that do not have a related EventTypeID = 21 row with LoggedOn exactly 4 minutes earlier for the same EmployeeID. Here's the query below and some raw output:
SELECT * FROM
(
SELECT rcp..EventLogEntries.EmployeeID, rcp..EventLogEntries.EventTypeID, rcp..EventLogEntries.TerminalID, rcp..EventLogEntries.LoggedOn
FROM rcp..EventLogEntries
WHERE rcp..EventLogEntries.terminalid = 3
UNION
SELECT viso..AccessUserPersons.UserExternalIdentifier, rcp..EventTypes.ID, rcp..Terminals.ID, viso..EventLogEntries.LoggedOn
FROM viso..EventLogEntries, viso..AccessUserPersons, rcp..Terminals, rcp..EventTypes
WHERE viso..EventLogEntries.LocationID = 10
AND viso..EventLogEntries.EventCode = 615
AND rcp..EventTypes.Code = 36
AND viso..EventLogEntries.PersonID = viso..AccessUserPersons.ID
AND viso..EventLogEntries.locationID = rcp..Terminals.TerminalTAID
) results
ORDER BY LoggedOn
EmployeeID EventTypeID TerminalID LoggedOn
273 19 3 2018-12-04 12:31:23.000
273 21 3 2018-12-04 12:34:18.000
483 19 3 2018-12-04 12:40:10.000
268 19 3 2018-12-04 13:19:23.000
273 21 3 2018-12-04 13:28:00.000
273 19 3 2018-12-04 13:32:00.000
459 19 3 2018-12-04 15:01:04.000
What I need to achieve is:
EmployeeID EventTypeID TerminalID LoggedOn
273 19 3 2018-12-04 12:31:23.000
483 19 3 2018-12-04 12:40:10.000
268 19 3 2018-12-04 13:19:23.000
459 19 3 2018-12-04 15:01:04.000
The TerminalID column value is always 3 in this scenario and is not related to any query condition, but it must be in the output because further processing requires it.
It is bad practice to join tables using conditions in WHERE. The WHERE clause should be used primarily for filtering.
Aliases also help to make the code shorter.
SELECT * FROM
(
SELECT EmployeeID, EventTypeID, TerminalID, LoggedOn
FROM rcp..EventLogEntries
WHERE terminalid = 3
UNION
SELECT p.UserExternalIdentifier, et.ID, t.ID, el.LoggedOn
FROM viso..EventLogEntries el
JOIN viso..AccessUserPersons p ON el.PersonID = p.ID
JOIN rcp..Terminals t ON el.locationID = t.TerminalTAID
CROSS JOIN rcp..EventTypes et -- the original query has no join condition for EventTypes
WHERE el.LocationID = 10
AND el.EventCode = 615
AND et.Code = 36
) results
ORDER BY LoggedOn
Try to use the following:
WITH cteData AS
(
SELECT EmployeeID, EventTypeID, TerminalID, LoggedOn
FROM rcp..EventLogEntries
WHERE terminalid = 3
UNION
SELECT p.UserExternalIdentifier, et.ID, t.ID, el.LoggedOn
FROM viso..EventLogEntries el
JOIN viso..AccessUserPersons p ON el.PersonID = p.ID
JOIN rcp..Terminals t ON el.locationID = t.TerminalTAID
CROSS JOIN rcp..EventTypes et -- the original query has no join condition for EventTypes
WHERE el.LocationID = 10
AND el.EventCode = 615
AND et.Code = 36
)
SELECT q19.*
FROM
(
SELECT *
FROM cteData
WHERE EventTypeID=19
) q19
LEFT JOIN
(
SELECT *
FROM cteData
WHERE EventTypeID=21
) q21
ON q19.EmployeeID=q21.EmployeeID
WHERE (DATEDIFF(MINUTE,q19.LoggedOn,q21.LoggedOn)>4 OR q21.LoggedOn IS NULL)
If EventTypes really needs no join condition, CROSS JOIN is the appropriate join.
I got your result using your data:
WITH cteData AS(
SELECT *
FROM (VALUES
(273,19,3,CAST('2018-12-04 12:31:23.000' AS datetime)),
(273,21,3,CAST('2018-12-04 12:34:18.000' AS datetime)),
(483,19,3,CAST('2018-12-04 12:40:10.000' AS datetime)),
(268,19,3,CAST('2018-12-04 13:19:23.000' AS datetime)),
(273,21,3,CAST('2018-12-04 13:28:00.000' AS datetime)),
(273,19,3,CAST('2018-12-04 13:32:00.000' AS datetime)),
(459,19,3,CAST('2018-12-04 15:01:04.000' AS datetime))
)v(EmployeeID,EventTypeID,TerminalID,LoggedOn)
)
SELECT q19.*
FROM
(
SELECT *
FROM cteData
WHERE EventTypeID=19
) q19
LEFT JOIN
(
SELECT *
FROM cteData
WHERE EventTypeID=21
) q21
ON q19.EmployeeID=q21.EmployeeID
WHERE (DATEDIFF(MINUTE,q19.LoggedOn,q21.LoggedOn)>4 OR q21.LoggedOn IS NULL)
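A small change in this part gives the required result: match the type 21 row logged exactly 4 minutes earlier in the join condition, then keep only the type 19 rows that found no match: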
ON q19.EmployeeID=q21.EmployeeID AND q19.LoggedOn=dateadd(mi, 4, q21.LoggedOn)
WHERE q21.LoggedOn IS NULL
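The same "no matching row exactly 4 minutes earlier" rule can also be written with NOT EXISTS. The following is a sketch of that alternative phrasing, not the answerer's query; it reuses the sample rows from the question in a CTE.

```sql
WITH cteData AS (
    SELECT *
    FROM (VALUES
        (273, 19, 3, CAST('2018-12-04T12:31:23' AS datetime)),
        (273, 21, 3, CAST('2018-12-04T12:34:18' AS datetime)),
        (483, 19, 3, CAST('2018-12-04T12:40:10' AS datetime)),
        (268, 19, 3, CAST('2018-12-04T13:19:23' AS datetime)),
        (273, 21, 3, CAST('2018-12-04T13:28:00' AS datetime)),
        (273, 19, 3, CAST('2018-12-04T13:32:00' AS datetime)),
        (459, 19, 3, CAST('2018-12-04T15:01:04' AS datetime))
    ) v(EmployeeID, EventTypeID, TerminalID, LoggedOn)
)
SELECT q19.EmployeeID, q19.EventTypeID, q19.TerminalID, q19.LoggedOn
FROM cteData q19
WHERE q19.EventTypeID = 19
  -- Drop a type 19 row only when the same employee has a type 21 row
  -- logged exactly 4 minutes before it
  AND NOT EXISTS (
      SELECT 1
      FROM cteData q21
      WHERE q21.EventTypeID = 21
        AND q21.EmployeeID = q19.EmployeeID
        AND q21.LoggedOn = DATEADD(MINUTE, -4, q19.LoggedOn)
  )
ORDER BY q19.LoggedOn;
```

With this data, only the row for employee 273 at 13:32:00 is removed, because that employee has a type 21 row at 13:28:00. The remaining four type 19 rows are returned.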

How to get single result when a column has the same value but the second column have different value

I have this table
Id VendorId ClaimRequestDate
1 5 2017-12-14 00:00:00.000
2 5 2018-02-02 00:00:00.000
7 5 2018-02-07 11:08:25.257
I want my result to show only the latest date for each VendorId starting from date later than 2 Feb 2018
This is what I've done so far:
SELECT DISTINCT
[Project1].[Id] AS [Id],
[Project1].[VendorId] AS [VendorId],
[Project1].[ClaimRequestDate] AS [ClaimRequestDate]
FROM ( SELECT
[Extent1].[Id] AS [Id],
[Extent1].[VendorId] AS [VendorId],
[Extent1].[ClaimRequestDate] AS [ClaimRequestDate]
FROM [dbo].[Claim] AS [Extent1]
WHERE [Extent1].[ClaimRequestDate] >= '2018-02-02 00:00:00.000'
) AS [Project1]
ORDER BY [Project1].[ClaimRequestDate] DESC
But my result is
Id VendorId ClaimRequestDate
7 5 2018-02-07 11:08:25.257
2 5 2018-02-02 00:00:00.000
Can someone help me with this?
There are three problems in your query. The first is:
WHERE [Extent1].[ClaimRequestDate] >= '2018-02-02 00:00:00.000'
The comparison is >= but should be >. The second is that your query returns every row dated later than 2018-02-02, so a VendorId with more than one such row gets more than one result. You can try this:
SELECT * FROM Claim c
where ClaimRequestDate IN (select MAX(ClaimRequestDate) from Claim c1
where c.VendorId = c1.VendorId and c1.ClaimRequestDate > '20180202')
The third is that when a VendorId has several rows with the same MAX(ClaimRequestDate), this query returns all of them. For example, this data:
Id VendorId ClaimRequestDate
1 5 2017-12-14 00:00:00.000
2 5 2018-02-02 00:00:00.000
7 5 2018-02-07 11:08:25.257
8 5 2018-02-07 11:08:25.257
returns
Id VendorId ClaimRequestDate
7 5 2018-02-07 11:08:25.257
8 5 2018-02-07 11:08:25.257
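For this reason I suggest using this query:
SELECT * FROM Claim c
-- Style 121 (yyyy-mm-dd hh:mi:ss.mmm) and a zero-padded Id make the string sort in date order, then Id order
where CONVERT(VARCHAR(23), ClaimRequestDate, 121) + RIGHT('0000000000' + CAST(Id AS VARCHAR(10)), 10) IN (select
MAX(CONVERT(VARCHAR(23), ClaimRequestDate, 121) + RIGHT('0000000000' + CAST(Id AS VARCHAR(10)), 10)) from Claim c1
where c.VendorId = c1.VendorId and c1.ClaimRequestDate > '20180202'
)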
Try the following SQL:
select aa.* from [Claim] as aa inner join
(
select [VendorId], max([Id]) as maxId from [Claim]
where [ClaimRequestDate] >= '2018-02-03 00:00:00'
group by [VendorId]
) as bb on aa.[Id] = bb.[maxId]
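Another common way to keep exactly one row per VendorId is ROW_NUMBER. The sketch below is an illustration rather than one of the original answers. It assumes "later than 2 Feb 2018" means strictly after midnight on that date, and it assumes ties on ClaimRequestDate should be broken by the highest Id. The sample rows, including the tied row with Id 8, come from the thread.

```sql
WITH Claim AS (
    SELECT *
    FROM (VALUES
        (1, 5, CAST('2017-12-14T00:00:00.000' AS datetime)),
        (2, 5, CAST('2018-02-02T00:00:00.000' AS datetime)),
        (7, 5, CAST('2018-02-07T11:08:25.257' AS datetime)),
        (8, 5, CAST('2018-02-07T11:08:25.257' AS datetime))
    ) v(Id, VendorId, ClaimRequestDate)
),
Ranked AS (
    SELECT Id, VendorId, ClaimRequestDate,
           -- Number each vendor's rows from newest to oldest; Id breaks ties
           ROW_NUMBER() OVER (PARTITION BY VendorId
                              ORDER BY ClaimRequestDate DESC, Id DESC) AS rn
    FROM Claim
    WHERE ClaimRequestDate > '20180202'
)
SELECT Id, VendorId, ClaimRequestDate
FROM Ranked
WHERE rn = 1;
```

For the sample data this returns only the row with Id 8. It does not rely on Id values increasing with the date, and it avoids string concatenation.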

SQL First Time In Last Time Out (Transact-SQL..preferably)

I have the following table which contains all the time in and time out of people:
CREATE TABLE test (
timecardid INT
, trandate DATE
, employeeid INT
, trantime TIME
, Trantype VARCHAR(1)
, Projcode VARCHAR(3)
)
The task is to get the earliest trantime with trantype A (perhaps using MIN) and the latest trantime with trantype Z (using MAX) for each trandate (e.g. trantype A for July 17 is 8:00 AM and trantype Z for July 17 is 7:00 PM).
The problem is that the output should be in the same format as the table it comes from, meaning that I have to keep these rows and filter out the rest (those that aren't the earliest and latest in/out for that date, per employee).
My current solution is to use two different SELECT commands to get all the earliest rows and all the latest rows, then combine them.
I was wondering, though, is there a much simpler, single-statement solution?
Thank you very much.
EDIT (I apologize, here is the sample. Server is SQL Server 2008):
Timecardid | Trandate | employeeid | trantime | trantype | Projcode
1 2013-04-01 1 8:00:00 A SAMPLE1
2 2013-04-01 1 9:00:00 A SAMPLE1
3 2013-04-01 2 7:00:00 A SAMPLE1
4 2013-04-01 2 6:59:59 A SAMPLE1
5 2013-04-01 1 17:00:00 Z SAMPLE1
6 2013-04-01 1 17:19:00 Z SAMPLE1
7 2013-04-01 2 17:00:00 Z SAMPLE1
8 2013-04-02 1 8:00:00 A SAMPLE1
9 2013-04-02 1 9:00:00 A SAMPLE1
10 2013-04-02 2 7:00:58 A SAMPLE1
11 2013-04-02 2 18:00:00 Z SAMPLE1
12 2013-04-02 2 18:00:01 Z SAMPLE1
13 2013-04-02 1 20:00:00 Z SAMPLE1
Expected Results (the earliest in and the latest out per day, per employee, in a select command):
Timecardid | Trandate | employeeid | trantime | trantype | Projcode
1 2013-04-01 1 8:00:00 A SAMPLE1
4 2013-04-01 2 6:59:59 A SAMPLE1
6 2013-04-01 1 17:19:00 Z SAMPLE1
7 2013-04-01 2 17:00:00 Z SAMPLE1
8 2013-04-02 1 8:00:00 A SAMPLE1
10 2013-04-02 2 7:00:58 A SAMPLE1
12 2013-04-02 2 18:00:01 Z SAMPLE1
13 2013-04-02 1 20:00:00 Z SAMPLE1
Thank you very much
Perhaps this is what you're looking for:
select
t.*
from
test t
where
trantime in (
(select min(trantime) from test t1 where t1.trandate = t.trandate and trantype = 'A'),
(select max(trantime) from test t2 where t2.trandate = t.trandate and trantype = 'Z')
)
Changing my answer to account for the "per employee" requirement:
;WITH EarliestIn AS
(
SELECT trandate, employeeid, min(trantime) AS EarliestTimeIn
FROM test
WHERE trantype = 'A'
GROUP BY trandate, employeeid
),
LatestOut AS
(
SELECT trandate, employeeid, max(trantime) AS LatestTimeOut
FROM test
WHERE trantype = 'Z'
GROUP BY trandate, employeeid
)
SELECT *
FROM test t
WHERE
(t.trantype = 'A' AND EXISTS (SELECT * FROM EarliestIn WHERE t.trandate = EarliestIn.trandate AND t.employeeid = EarliestIn.employeeid AND t.trantime = EarliestIn.EarliestTimeIn))
OR (t.trantype = 'Z' AND EXISTS (SELECT * FROM LatestOut WHERE t.trandate = LatestOut.trandate AND t.employeeid = LatestOut.employeeid AND t.trantime = LatestOut.LatestTimeOut))
Assuming timecardid column is PK or unique, and if I understand it correctly, I would do something like
DECLARE @date DATE
SET @date = '2013-07-01'
SELECT
T0.*
FROM
(SELECT DISTINCT employeeid FROM test) E
CROSS APPLY (
-- Each TOP 1 ... ORDER BY sits in its own derived table, because ORDER BY
-- is not allowed directly inside a UNION ALL branch
SELECT A.timecardid
FROM (
SELECT TOP 1
T.timecardid
FROM
test T
WHERE
T.trandate = @date
AND T.Trantype = 'A'
AND T.employeeid = E.employeeid
ORDER BY T.trantime
) A
UNION ALL
SELECT Z.timecardid
FROM (
SELECT TOP 1
T.timecardid
FROM
test T
WHERE
T.trandate = @date
AND T.Trantype = 'Z'
AND T.employeeid = E.employeeid
ORDER BY T.trantime DESC
) Z
) V
JOIN test T0 ON T0.timecardid = V.timecardid
Appropriate indexes should be created on the table if performance is a concern.
If you're using SQL Server 2012 or later, you can use LAG/LEAD to find the max and min rows in a fairly concise way:
WITH cte AS (
SELECT *,
LAG(timecardid) OVER (PARTITION BY trandate,employeeid,trantype ORDER BY trantime) lagid,
LEAD(timecardid) OVER (PARTITION BY trandate,employeeid,trantype ORDER BY trantime) leadid
FROM test
)
SELECT timecardid,trandate,employeeid,trantime,trantype,projcode
FROM cte
WHERE trantype='A' AND lagid IS NULL
OR trantype='Z' AND leadid IS NULL;
I would use ROW_NUMBER to sort out the rows you want to select:
;with Ordered as (
select *,
ROW_NUMBER() OVER (PARTITION BY Trandate,employeeid,trantype
ORDER BY trantime ASC) as rnEarly,
ROW_NUMBER() OVER (PARTITION BY Trandate,employeeid,trantype
ORDER BY trantime DESC) as rnLate
from
Test
)
select * from Ordered
where
(rnEarly = 1 and trantype='A') or
(rnLate = 1 and trantype='Z')
order by TimecardId
It produces the results you've requested, and I think it's quite readable. The reason that trantype is included in the PARTITION BY clauses is so that A and Z values receive separate numbering.

How to determine the maximum value for each category in SQL?

My table has records like below:
ID EmpID EffectiveDate PayElement Amount ComputeType AddDeduction
42 ISIPL001 2010-04-16 00:00:00.000 Basic 8000.00 On Attendance Addition
43 ISIPL001 2010-04-01 00:00:00.000 Con 2000.00 On Attendance Addition
44 ISIPL001 2010-04-01 00:00:00.000 HRA 2000.00 On Attendance Addition
54 ISIPL001 2011-01-01 00:00:00.000 Basic 15000.00 On Attendance Addition
55 ISIPL001 2011-01-01 00:00:00.000 Con 6000.00 On Attendance Addition
57 ISIPL001 2011-01-01 00:00:00.000 HRA 6000.00 On Attendance Addition
61 ISIPL001 2010-07-10 00:00:00.000 Basic 12000.00 On Attendance Addition
66 ISIPL001 2010-07-10 00:00:00.000 HRA 4200.00 On Attendance Addition
68 ISIPL001 2010-07-10 00:00:00.000 Con 5600.00 On Attendance Addition
For each pay element in my table, I need the record with the latest EffectiveDate.
So my output should be like this:
54 Basic 15000
55 Con 6000
57 HRA 6000
Try this:
SELECT ID,
PayElement,
Amount
FROM (
SELECT a.*,
RANK() OVER(PARTITION BY PayElement ORDER BY EffectiveDate DESC) AS rn
FROM <YOUR_TABLE> a
) a
WHERE rn = 1
;with cte as
(
select *,
row_number() over(partition by PayElement order by EffectiveDate desc) as rn
from YourTable
)
select
ID,
PayElement,
Amount
from cte
where rn = 1
Try this.
select
T.ID,
T.PayElement,
T.Amount
from
Test T inner join (select MAX(T_DATE.EffectiveDate) as MAX_DATE, T_DATE.PayElement from Test T_DATE group by T_DATE.PayElement) T_DATE on (T.PayElement = T_DATE.PayElement) and (T.EffectiveDate = T_DATE.MAX_DATE)
order by
T.ID
Select a.Id,
a.PayElement,
a.Amount
From dbo.YourTable a
Join
(
Select PayElement,
Max(EffectiveDate) as[MaxDate]
From dbo.YourTable
Group By PayElement
)b on a.PayElement = b.PayElement
And a.EffectiveDate = b.MaxDate
try something like
Select
a.ID, a.PayElement, a.Amount
From MyTable a
Inner Join (
Select PayElement, max(EffectiveDate) as MaxDate From MyTable Group By PayElement
) sub on a.EffectiveDate = sub.MaxDate and a.PayElement = sub.PayElement
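select
a.Id, a.PayElement, a.Amount
from
YourTable a
inner join
-- Group by PayElement only; grouping by Id as well would return every row
(select
PayElement, max(EffectiveDate) as EffectiveDate
from
YourTable
group by
PayElement) b
on
a.PayElement = b.PayElement and a.EffectiveDate = b.EffectiveDate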
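For anyone who wants to check these approaches against the question's data, here is a self-contained sketch using ROW_NUMBER. The CTE name PayRecords is a hypothetical stand-in for the real table, and the ComputeType and AddDeduction columns are left out because they do not affect the result.

```sql
WITH PayRecords AS (
    SELECT *
    FROM (VALUES
        (42, 'ISIPL001', CAST('2010-04-16' AS datetime), 'Basic', 8000.00),
        (43, 'ISIPL001', CAST('2010-04-01' AS datetime), 'Con',   2000.00),
        (44, 'ISIPL001', CAST('2010-04-01' AS datetime), 'HRA',   2000.00),
        (54, 'ISIPL001', CAST('2011-01-01' AS datetime), 'Basic', 15000.00),
        (55, 'ISIPL001', CAST('2011-01-01' AS datetime), 'Con',   6000.00),
        (57, 'ISIPL001', CAST('2011-01-01' AS datetime), 'HRA',   6000.00),
        (61, 'ISIPL001', CAST('2010-07-10' AS datetime), 'Basic', 12000.00),
        (66, 'ISIPL001', CAST('2010-07-10' AS datetime), 'HRA',   4200.00),
        (68, 'ISIPL001', CAST('2010-07-10' AS datetime), 'Con',   5600.00)
    ) v(ID, EmpID, EffectiveDate, PayElement, Amount)
),
Ranked AS (
    SELECT ID, PayElement, Amount,
           -- Newest EffectiveDate gets rn = 1 within each PayElement
           ROW_NUMBER() OVER (PARTITION BY PayElement
                              ORDER BY EffectiveDate DESC) AS rn
    FROM PayRecords
)
SELECT ID, PayElement, Amount
FROM Ranked
WHERE rn = 1
ORDER BY ID;
```

This returns the rows with ID 54, 55 and 57. ROW_NUMBER returns exactly one row per PayElement even if two rows share the latest EffectiveDate, whereas RANK and the MAX join variants return all tied rows. The data also has a single EmpID; with several employees, EmpID would probably need to be added to the PARTITION BY.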