SQL - Get the sum of several groups of records

SQL - Get the sum of several groups of records - sql

DESIRED RESULT
Get the hours SUM of all [Hours] including only a single result from each [DevelopmentID] where [Revision] is highest value
e.g SUM 1, 2, 3, 5, 6 (Result should be 22.00)
I'm stuck trying to get the appropriate grouping.
DECLARE #CompanyID INT = 1
SELECT
SUM([s].[Hours]) AS [Hours]
FROM
[dbo].[tblDev] [d] WITH (NOLOCK)
JOIN
[dbo].[tblSpec] [s] WITH (NOLOCK) ON [d].[DevID] = [s].[DevID]
WHERE
[s].[Revision] = (
SELECT MAX([s2].[Revision]) FROM [tblSpec] [s2]
)
GROUP BY
[s].[Hours]

use row_number() to identify the latest revision
SELECT SUM([Hours])
FROM (
SELECT *, R = ROW_NUMBER() OVER (PARTITION BY d.DevID
ORDER BY s.Revision)
FROM [dbo].[tblDev] d
JOIN [dbo].[tblSpec] s
ON d.[DevID] = s.[DevID]
) d
WHERE R = 1

If you want one row per DevId, then that should be in the GROUP BY (and presumably in the SELECT as well):
SELECT s.DevId, SUM(s.Hours) as hours
FROM [dbo].[tblDev] d JOIN
[dbo].[tblSpec] s
ON [d].[DevID] = [s].[DevID]
WHERE s.Revision = (SELECT MAX(s2.Revision) FROM tblSpec s2)
GROUP BY s.DevId;
Also, don't use WITH NOLOCK unless you really know what you are doing -- and I'm guessing you do not. It is basically a license that says: "You can get me data even if it is not 100% accurate."
I would also dispense with all the square braces. They just make the query harder to write and to read.

Related

Calculate MAX for every row in SQL

I have this tables:
Docenza(id, id_facolta, ..., orelez)
Facolta(id, ...)
and I want to obtain, for every facolta, only the id of Docenza who has done the maximum number of orelez and the number of orelez:
id_docenzaP facolta1 max(orelez)
id_docenzaQ facolta2 max(orelez)
...
id_docenzaZ facoltaN max(orelez)
how can I do this? This is what i do:
SELECT DISTINCT ... F.nome, SUM(orelez) AS oreTotali
FROM Docenza D
JOIN Facolta F ON F.id = D.id_facolta
GROUP BY F.nome
I obtain somethings like:
docenzaP facolta1 maxValueForidP
docenzaQ facolta1 maxValueForidQ
...
docenzaR facolta2 maxValueForidR
docenzaS facolta2 maxValueForidS
...
docenzaZ facoltaN maxValueForFacoltaN
How can I take only the max value for every facolta?

Presumably, you just want:
SELECT F.nome, sum(orelez) AS oreTotali
FROM Docenza D JOIN
Facolta F
ON F.id = D.id_facolta
GROUP BY F.nome;
I'm not sure what the SELECT DISTINCT is supposed to be doing. It is almost never used with GROUP BY. The . . . suggests that you are selecting additional columns, which are not needed for the results you want.

This is untested, and since you didn't provide sample data with expected results I can't be sure it's really what you need.
It's a bit ugly and I'm sure there is some clever correlated sub query approach, but I've never been good with those.
SELECT st.focolta,
s_orelez,
TMP3.id_docenza
FROM some_table AS st
INNER
JOIN (SELECT *
FROM (SELECT focolta,
s_orelez,
id_docenza,
ROW_NUMBER() OVER -- Get the ranking of the orelez sum by focolta.
( PARTITION BY focolta
ORDER BY s_orelez DESC
) rn_orelez
FROM (SELECT focolta,
id_docenza,
SUM(orelez) OVER -- Sum the orelez by focolta
( PARTITION BY focolta
) AS s_orelez
FROM some_table
) TMP
) TMP2
WHERE = TMP2.rn_orelez = 1 -- Limit to the highest rank value
) TMP3
ON some_table.focolta = TMP3.focolta; -- Join to focolta to the id associated with the hightest value.

SQL Query Help - Negative reporting

Perhaps somebody can help with Ideas or a Solution. A User asked me for a negative report. We have a table with tickets each ticket has a ticket number which would be easy to select but the user wants a list of missing tickets between the first and last ticket in the system.
E.g. Select TicketNr from Ticket order by TicketNr
Result
1,
2,
4,
7,
11
But we actually want the result 3,5,6,8,9,10
CREATE TABLE [dbo].[Ticket](
[pknTicketId] [int] IDENTITY(1,1) NOT NULL,
[TicketNr] [int] NULL
) ON [PRIMARY]
GO
SQL Server 2016 - TSQL
Any ideas ?
So a bit more information is need all solution thus far works on small table. Our production database has over 4 million tickets. Hence why we need to find the missing ones.

First get the minimum and maximum, then generate all posible ticket numbers and finally select the ones that are missing.
;WITH FirstAndLast AS
(
SELECT
MinTicketNr = MIN(T.TicketNr),
MaxTicketNr = MAX(T.TicketNr)
FROM
Ticket AS T
),
AllTickets AS
(
SELECT
TicketNr = MinTicketNr,
MaxTicketNr = T.MaxTicketNr
FROM
FirstAndLast AS T
UNION ALL
SELECT
TicketNr = A.TicketNr + 1,
MaxTicketNr = A.MaxTicketNr
FROM
AllTickets AS A
WHERE
A.TicketNr + 1 <= A.MaxTicketNr
)
SELECT
A.TicketNr
FROM
AllTickets AS A
WHERE
NOT EXISTS (
SELECT
'missing ticket'
FROM
Ticket AS T
WHERE
A.TicketNr = T.TicketNr)
ORDER BY
A.TicketNr
OPTION
(MAXRECURSION 32000)

If you can accept the results in a different format, the following will do what you want:
select TicketNr + 1 as first_missing,
next_TicketNr - 1 as last_missing,
(next_TicketNr - TicketNr - 1) as num_missing
from (select t.*, lead(TicketNr) over (order by TicketNr) as next_TicketNr
from Ticket t
) t
where next_TicketNr <> TicketNr + 1;
This shows each sequence of missing ticket numbers on a single row, rather than a separate row for each of them.
If you do use a recursive CTE, I would recommend doing it only for the missing tickets:
with cte as (
select (TicketNr + 1) as missing_TicketNr
from (select t.*, lead(TicketNr) over (order by TicketNr) as next_ticketNr
from tickets t
) t
where next_TicketNr <> TicketNr + 1
union all
select missing_TicketNr + 1
from cte
where not exists (select 1 from tickets t2 where t2.TicketNr = cte.missing_TicketNr + 1)
)
select *
from cte;
This version starts with the list of missing ticket numbers. It then adds a new one, as the numbers are not found.

One method is to use recursive cte to find the missing ticket numbers :
with missing as (
select min(TicketNr) as mnt, max(TicketNr) as mxt
from ticket t
union all
select mnt+1, mxt
from missing m
where mnt < mxt
)
select m.*
from missing m
where not exists (select 1 from tickets t where t.TicketNr = m.mnt);

This should do the trick: SQL Fiddle
declare #ticketsTable table (ticketNo int not null)
insert #ticketsTable (ticketNo) values (1),(2),(4),(7),(11)
;with cte1(ticketNo, isMissing, sequenceNo) AS
(
select ticketNo
, 0
, row_number() over (order by ticketNo)
from #ticketsTable
)
, cte2(ticketNo, isMissing, sequenceNo) AS
(
select ticketNo, isMissing, sequenceNo
from cte1
union all
select a.ticketNo + 1
, 1
, a.sequenceNo
from cte2 a
inner join cte1 b
on b.sequenceNo = a.sequenceNo + 1
and b.ticketNo != a.ticketNo + 1
)
select *
from cte2
where isMissing = 1
order by ticketNo
It works by collecting all of the existing tickets, marking them as existing, and assigning each a consecutive number giving their order in the original list.
We can then see the gaps in the list by finding any spots where the consecutive order number shows the next record, but the ticket numbers are not consecutive.
Finally, we recursively fill in the gaps; working from the start of a gap and adding new records until that gap's consecutive numbers no longer has a gap between the related ticket numbers.

I think this one give you easiest solution
with cte as(
select max(TicketNr) maxnum,min(TicketNr) minnum from Ticket )
select a.number FROM master..spt_values a,cte
WHERE Type = 'P' and number < cte.maxnum and number > cte.minno
except
select TicketNr FROM Ticket

So After looking at all the solutions
I went with creating a temp table with a full range of number from Starting to Ending ticket and then select from the Temp table where the ticket number not in the ticket table.
The reason being I kept running in MAXRECURSION problems.

Pivot rows to columns

I need to create one master table to present in reporting services, but i have an issue on how to combine the data.
To be more specific, i have one table named "Reservaciones" which store information from residences that have been reserved in certain dates.
Now for example, I´m grouping the information by this fields:
R.ClaTrab AS WorkerId
R.ClaUbicacion AS UbicationID
R.ClaEstancia AS ResidenceID
R.FechaIni AS InitialDay
R.FechaFin AS LastDay
And the result is First result**
As you see in the picture we have two rows duplicated, the number four and number five to be exact.
So far this is my code
SELECT
R.ClaTrab AS WorkerId,
MAX(E.NomEstancia) AS ResidenceName,
R.ClaUbicacion AS UbicationID,
R.ClaEstancia AS ResidenceID,
DATEDIFF(DAY, R.FechaIni, R.FechaFin) AS NumberDays,
R.FechaIni AS InitialDay,
R.FechaFin AS LastDay
FROM Reservaciones AS R
INNER JOIN Estancias AS E ON E.ClaEstancia = R.ClaEstancia
WHERE E.ClaUbicacionEst = 3
GROUP BY R.ClaTrab,R.ClaUbicacion, R.ClaEstancia, R.FechaIni, R.FechaFin
ORDER BY R.FechaIni
I Want the result to be like this desire result, but i dont know how to do it, i have tried PIVOT but i cant get the result i want it.
If u need more information please, ask me.
thank you very much.
SOLUTION:
What i did is use the ROW NUMBER() and OVER PARTITION BY to create a group of workers in the same residence, then PIVOT the result in new columns.
SNIPPET
SELECT * FROM(
SELECT
MAX(E.NomEstancia) AS ResidenceName,
R.FechaIni AS InitialDay,
R.FechaFin AS LastDay,
DATEDIFF(DAY, R.FechaIni, R.FechaFin) AS NumberDays,
T.NomTrab AS Worker,
R.ClaUbicacion AS UbicationID,
R.ClaEstancia AS ResidenceID,
ROW_NUMBER() OVER(PARTITION BY FechaIni,FechaFin, R.ClaUbicacion, R.ClaEstancia ORDER BY T.NomTrab) AS GUEST
FROM Reservaciones AS R
INNER JOIN Estancias AS E ON E.ClaEstancia = R.ClaEstancia
INNER JOIN Trabajadores AS T ON T.ClaTrab = R.ClaTrab
WHERE E.ClaUbicacionEst = 3
GROUP BY T.NomTrab, R.ClaUbicacion, R.ClaEstancia, R.FechaIni,R.FechaFin) AS ONE
PIVOT( MAX(Worker) FOR GUEST IN ([1],[2],[3])) AS pvt
In the new query I added a new join to obtain the name of the workers

Since there can only be 2 workers, you can use Min and Max.
with cte as(
SELECT
R.ClaTrab AS WorkerId,
MAX(E.NomEstancia) AS ResidenceName,
R.ClaUbicacion AS UbicationID,
R.ClaEstancia AS ResidenceID,
DATEDIFF(DAY, R.FechaIni, R.FechaFin) AS NumberDays,
R.FechaIni AS InitialDay,
R.FechaFin AS LastDay
FROM
Reservaciones AS R
INNER JOIN
Estancias AS E
ON E.ClaEstancia = R.ClaEstancia
WHERE
E.ClaUbicacionEst = 3
GROUP BY
R.ClaTrab,
R.ClaUbicacion,
R.ClaEstancia,
R.FechaIni,
R.FechaFin),
cte2 as(
select
ResidenceName
,UbicationID
,ResidenceID
,NumberDays
,InitalDay
,LastDay
,Worker1 = max(WorkerId)
,Worker2 = min(WorkerId)
from
cte
group by
ResidenceName
,UbicationID
,ResidenceID
,NumberDays
,InitalDay
,LastDay)
select
ResidenceName
,UbicationID
,ResidenceID
,NumberDays
,InitalDay
,LastDay
,Worker1
,Worker2 = case when Worker1 = Worker2 then NULL else Worker2 end
from
cte2
ONLINE DEMO WITH PARTIAL TEST DATA

Select all rows with max date for each ID

I have the following query returning the data as shown below. But I need to exclude the rows with MODIFIEDDATETIME shown in red as they have a lower time stamp by COMMITRECID. As depicted in the data, there may be multiple rows with the max time stamp by COMMITRECID.
SELECT REQCOMMIT.COMMITSTATUS, NOTEHISTORY.NOTE, NOTEHISTORY.MODIFIEDDATETIME, NOTEHISTORY.COMMITRECID
FROM REQCOMMIT INNER JOIN NOTEHISTORY ON REQCOMMIT.RECID = NOTEHISTORY.COMMITRECID
WHERE REQCOMMIT.PORECID = 1234
Here is the result of the above query
The desired result is only 8 rows with 5 in Green and 3 in Black (6 in Red should get eliminated).
Thank you very much for your help :)

Use RANK:
WITH CTE AS
(
SELECT R.COMMITSTATUS,
N.NOTE,
N.MODIFIEDDATETIME,
N.COMMITRECID,
RN = RANK() OVER(PARTITION BY N.COMMITRECID ORDER BY N.MODIFIEDDATETIME)
FROM REQCOMMIT R
INNER JOIN NOTEHISTORY N
ON R.RECID = N.COMMITRECID
WHERE R.PORECID = 1234
)
SELECT *
FROM CTE
WHERE RN = 1;
As an aside, please try to use tabla aliases instead of the whole table name in your queries.
*Disclaimer: You said that you wanted the max date, but the selected values in your post were those with the min date, so I used that criteria in my answer

This method just limits your history table to those with the MINdate as you described.
SELECT
REQCOMMIT.COMMITSTATUS,
NOTEHISTORY.NOTE,
NOTEHISTORY.MODIFIEDDATETIME,
NOTEHISTORY.COMMITRECID
FROM REQCOMMIT
INNER JOIN NOTEHISTORY ON REQCOMMIT.RECID = NOTEHISTORY.COMMITRECID
INNER JOIN (SELECT COMMITRECID, MIN(MODIFIEDDATETIME) DT FROM NOTEHISTORY GROUP BY COMMITRECID) a on a.COMMITRECID = NOTEHISTORY.COMMITRECID and a.DT = NOTEHISTORY.MODIFIEDDATETIME
WHERE REQCOMMIT.PORECID = 1234

Find Segment with Longest Stay Per Booking

We have a number of bookings and one of the requirements is that we display the Final Destination for a booking based on its segments. Our business has defined the Final Destination as that in which we have the longest stay. And Origin being the first departure point.
Please note this is not the segments with the Longest Travel time i.e. Datediff(minute, DepartDate, ArrivalDate) This is requesting the one with the Longest gap between segments.
This is a simplified version of the tables:
Create Table Segments
(
BookingID int,
SegNum int,
DepartureCity varchar(100),
DepartDate datetime,
ArrivalCity varchar(100),
ArrivalDate datetime
);
Create Table Bookings
(
BookingID int identity(1,1),
Locator varchar(10)
);
Insert into Segments values (1,2,'BRU','2010-03-06 10:40','FIH','2010-03-06 20:20:00')
Insert into Segments values (1,4,'FIH','2010-03-13 21:50:00','BRU', '2010-03-14 07:25:00')
Insert into Segments values (2,2,'BOD','2010-02-10 06:50:00','AMS','2010-02-10 08:50:00')
Insert into Segments values (2,3,'AMS','2010-02-10 10:40:00','EBB','2010-02-10 20:40:00')
Insert into Segments values (2,4,'EBB','2010-02-28 22:55:00','AMS','2010-03-01 05:35:00')
Insert into Segments values (2,5,'AMS','2010-03-01 10:25:00','BOD','2010-03-01 12:15:00')
insert into Segments values (3,2,'BRU','2010-03-09 12:10:00','IAD','2010-03-09 14:46:00')
Insert into Segments Values (3,3,'IAD','2010-03-13 17:57:00','BRU','2010-03-14 07:15:00')
insert into segments values (4,2,'BRU','2010-07-27','ADD','2010-07-28')
insert into segments values (4,4,'ADD','2010-07-28','LUN','2010-07-28')
insert into segments values (4,5,'LUN','2010-08-23','ADD','2010-08-23')
insert into segments values (4,6,'ADD','2010-08-23','BRU','2010-08-24')
Insert into Bookings values('5MVL7J')
Insert into Bookings values ('Y2IMXQ')
insert into bookings values ('YCBL5C')
Insert into bookings values ('X7THJ6')
I have created a SQL Fiddle with real data here:
SQL Fiddle Example
I have tried to do the following, however this doesn't appear to be correct.
SELECT Locator, fd.*
FROM Bookings ob
OUTER APPLY
(
SELECT Top 1 DepartureCity, ArrivalCity
from
(
SELECT DISTINCT
seg.segnum ,
seg.DepartureCity ,
seg.DepartDate ,
seg.ArrivalCity ,
seg.ArrivalDate,
(SELECT
DISTINCT
DATEDIFF(MINUTE , seg.ArrivalDate , s2.DepartDate)
FROM Segments s2
WHERE s2.BookingID = seg.BookingID AND s2.segnum = seg.segnum + 1) 'LengthOfStay'
FROM Bookings b(NOLOCK)
INNER JOIN Segments seg (NOLOCK) ON seg.bookingid = b.bookingid
WHERE b.Locator = ob.locator
) a
Order by a.lengthofstay desc
)
FD
The results I expect are:
Locator Origin Destination
5MVL7J BRU FIH
Y2IMXQ BOD EBB
YCBL5C BRU IAD
X7THJ6 BRU LUN
I get the feeling that a CTE would be the best approach, however my attempts do this so far failed miserably. Any help would be greatly appreciated.
I have managed to get the following query working but it only works for one at a time due to the top one, but I'm not sure how to tweak it:
WITH CTE AS
(
SELECT distinct s.DepartureCity, s.DepartDate, s.ArrivalCity, s.ArrivalDate, b.Locator , ROW_NUMBER() OVER (PARTITION BY b.Locator ORDER BY SegNum ASC) RN
FROM Segments s
JOIN bookings b ON s.bookingid = b.BookingID
)
SELECT C.Locator, c.DepartureCity, a.ArrivalCity
FROM
(
SELECT TOP 1 C.Locator, c.ArrivalCity, c1.DepartureCity, DATEDIFF(MINUTE,c.ArrivalDate, c1.DepartDate) 'ddiff'
FROM CTE c
JOIN cte c1 ON c1.Locator = C.Locator AND c1.rn = c.rn + 1
ORDER BY ddiff DESC
) a
JOIN CTE c ON C.Locator = a.Locator
WHERE c.rn = 1

You can try something like this:
;WITH CTE_Start AS
(
--Ordering of segments to eliminate gaps
SELECT *, ROW_NUMBER() OVER (PARTITION BY BookingID ORDER BY SegNum) RN
FROM dbo.Segments
)
, RCTE_Stay AS
(
--recursive CTE to calculate stay between segments
SELECT *, 0 AS Stay FROM CTE_Start s WHERE RN = 1
UNION ALL
SELECT sNext.*, DATEDIFF(Mi, s.ArrivalDate, sNext.DepartDate)
FROM CTE_Start sNext
INNER JOIN RCTE_Stay s ON s.RN + 1 = sNext.RN AND s.BookingID = sNext.BookingID
)
, CTE_Final AS
(
--Search for max(stay) for each bookingID
SELECT *, ROW_NUMBER() OVER (PARTITION BY BookingID ORDER BY Stay DESC) AS RN_Stay
FROM RCTE_Stay
)
--join Start and Final on RN=1 to find origin and departure
SELECT b.Locator, s.DepartureCity AS Origin, f.DepartureCity AS Destination
FROM CTE_Final f
INNER JOIN CTE_Start s ON f.BookingID = s.BookingID
INNER JOIN dbo.Bookings b ON b.BookingID = f.BookingID
WHERE s.RN = 1 AND f.RN_Stay = 1
SQLFiddle DEMO

You can use the OUTER APPLY + TOP operators to find the next values SegNum. After finding the gap between segments are used MIN/MAX aggregate functions with OVER clause as conditions in the CASE expression
;WITH cte AS
(
SELECT seg.BookingID,
CASE WHEN MIN(seg.segNum) OVER(PARTITION BY seg.BookingID) = seg.segNum
THEN seg.DepartureCity END AS Origin,
CASE WHEN MAX(DATEDIFF(MINUTE, seg.ArrivalDate, o.DepartDate)) OVER(PARTITION BY seg.BookingID)
= DATEDIFF(MINUTE, seg.ArrivalDate, o.DepartDate)
THEN o.DepartureCity END AS Destination
FROM Segments seg (NOLOCK)
OUTER APPLY (
SELECT TOP 1 seg2.DepartDate, seg2.DepartureCity
FROM Segments seg2
WHERE seg.BookingID = seg2.BookingID
AND seg.SegNum < seg2.SegNum
ORDER BY seg2.SegNum ASC
) o
)
SELECT b.Locator, MAX(c.Origin) AS Origin, MAX(c.Destination) AS Destination
FROM cte c JOIN Bookings b ON c.BookingID = b.BookingID
GROUP BY b.Locator
See demo on SQLFiddle

The statement below:
;WITH DataSource AS
(
SELECT ROW_NUMBER() OVER(PARTITION BY BookingID ORDER BY DATEDIFF(SS,DepartDate,ArrivalDate) DESC) AS Row
,Segments.BookingID
,Segments.SegNum
,Segments.DepartureCity
,Segments.DepartDate
,Segments.ArrivalCity
,Segments.ArrivalDate
,DATEDIFF(SS,DepartDate,ArrivalDate) AS DiffInSeconds
FROM Segments
)
SELECT *
FROM DataSource DS
INNER JOIN Bookings B
ON DS.[BookingID] = B.[BookingID]
Will give the following output:
So, adding the following clause to the above statement:
WHERE Row = 1
will give you what you need.
Few important things:
As you can see from the screenshot below, there are two records with same difference in second. If you want to show both of them (or all of them if there are), instead ROW_NUMBER function use RANK function.
The return type of DATEDIFF is INT. So, there is limitation for seconds max deference value. It is as follows:
If the return value is out of range for int (-2,147,483,648 to
+2,147,483,647), an error is returned. For millisecond, the maximum difference between startdate and enddate is 24 days, 20 hours, 31
minutes and 23.647 seconds. For second, the maximum difference is 68
years.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

SQL - Get the sum of several groups of records - sql

use row_number() to identify the latest revision SELECT SUM([Hours]) FROM ( SELECT *, R = ROW_NUMBER() OVER (PARTITION BY d.DevID ORDER BY s.Revision) FROM [dbo].[tblDev] d JOIN [dbo].[tblSpec] s ON d.[DevID] = s.[DevID] ) d WHERE R = 1

Related

Calculate MAX for every row in SQL

SQL Query Help - Negative reporting

Pivot rows to columns

Select all rows with max date for each ID

Find Segment with Longest Stay Per Booking

Categories

Resources