T-SQL: GROUP BY truncated DATETIME and ORDER BY

I have a simple table of Orders like:
Order
=========
Date (DATETIME)
Price (INT)
I would like to know how much money we earned each month. The output should look like:
January-2018 : 100
February-2018: 200
...
January-2019: 300
...
I have the following SQL:
SELECT
DATENAME(MONTH , DATEADD(M, t.MDate , -1 ) ) + '-' + CAST(t.YDate as NVARCHAR(20)),
SUM(Price)
FROM
(
SELECT
DATEPART(YYYY, o.Date) as YDate,
DATEPART(MM, o.Date) as MDate,
Price
FROM [Order] o
) t
GROUP BY t.YDate, t.MDate
ORDER BY t.YDate, t.MDate
Is this OK, or is there a better approach?

format() is the way to go. I would recommend:
select format(o.date, 'MMMM-yyyy') as Monthyear,
sum(o.price) as total
from orders o
group by format(o.date, 'MMMM-yyyy')
order by min(o.date)

If you are using SQL Server 2012 or later, you can use the FORMAT function as well.
select cast('2018-02-23 15:34:09.390' as datetime) as datenew, 100 as price into #temp union all
select cast('2018-02-24 15:34:09.390' as datetime) as datenew, 300 as price union all
select cast('2018-03-10 15:34:09.390' as datetime) as datenew, 500 as price union all
select cast('2018-03-11 15:34:09.390' as datetime) as datenew, 700 as price union all
select cast('2019-02-23 15:34:09.390' as datetime) as datenew, 900 as price union all
select cast('2019-02-23 15:34:09.390' as datetime) as datenew, 1500 as price
select Monthyear, total from (
select format(datenew, 'MMMM-yyyy') Monthyear, format(datenew, 'yyyyMM') NumMMYY ,sum(price) total from #temp
group by format(datenew, 'MMMM-yyyy') , format(datenew, 'yyyyMM') ) test
order by NumMMYY
Since FORMAT converts the date to nvarchar, I had to add one more column in the subquery to order the result properly. If you don't mind the extra column in the output, you can skip the subquery and order by that column directly. I could not find any other way to do it in the same SELECT (except for a CASE expression, which is not a very good idea).
Output:
Monthyear Total
February-2018 400
March-2018 1200
February-2019 2400
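Another option, sketched here on the assumption that the table is the question's [Order] with Date and Price columns, is to group on the first day of each month instead of on a formatted string. The month-start value stays a datetime, so it can drive the ORDER BY directly, and the label is built only for display. The names MonthStart, MonthYear and Total are made up for illustration.

```sql
-- Truncate each date to the first day of its month, aggregate,
-- then build the 'January-2018' style label in the outer query.
SELECT DATENAME(MONTH, m.MonthStart) + '-' + DATENAME(YEAR, m.MonthStart) AS MonthYear,
       m.Total
FROM (
    SELECT DATEADD(MONTH, DATEDIFF(MONTH, 0, o.[Date]), 0) AS MonthStart,
           SUM(o.Price) AS Total
    FROM [Order] AS o
    GROUP BY DATEADD(MONTH, DATEDIFF(MONTH, 0, o.[Date]), 0)
) AS m
ORDER BY m.MonthStart;  -- chronological, because MonthStart is still a datetime
```

This should also run on versions older than SQL Server 2012, since it does not rely on FORMAT.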

Maybe just with this?
with Orders as (
select convert(date,'2018-01-15') as [date], 10 as Price union all
select convert(date,'2018-01-20'), 20 union all
select convert(date,'2018-01-30'), 30 union all
select convert(date,'2018-02-15'), 20 union all
select convert(date,'2018-03-15'), 40 union all
select convert(date,'2018-03-20'), 50
)
select
datename(month,o.Date) + '-' + datename(year,o.Date) + ': ' +
convert(varchar(max),SUM(Price))
FROM [Orders] o
group by datename(month,o.Date) + '-' + datename(year,o.Date)
order by 1 desc
I don't quite understand why you are subtracting one month in your query; I am not doing that. You can test it here: https://rextester.com/GRKK80105

Related

Taking most recent values in sum over date range

I have a table which has the following columns: DeskID *, ProductID *, Date *, Amount (where the columns marked with * make the primary key). The products in use vary over time, as represented in the image below.
Table format on the left, and a (hopefully) intuitive representation of the data on the right for one desk
The objective is to have the sum of the latest amounts of products by desk and date, including products which are no longer in use, over a date range.
e.g. using the data above the desired table is:
So on the 1st Jan, the sum is 1 of Product A
On the 2nd Jan, the sum is 2 of A and 5 of B, so 7
On the 4th Jan, the sum is 1 of A (out of use, so take the value from the 3rd), 5 of B, and 2 of C, so 8 in total
etc.
I have tried using a partition on the desk and product, ordered by date, to get the most recent value, and turned the following code into a function (Function1 below) with a @date Date parameter:
select @date 'Date', t.DeskID, SUM(t.Amount) 'Sum' from (
select @date 'Date', t.DeskID, t.ProductID, t.Amount
, row_number() over (partition by t.DeskID, t.ProductID order by t.Date desc) as roworder
from Table1 t
where 1 = 1
and t.Date <= @date
) t
where t.roworder = 1
group by t.DeskID
And then using a utility calendar table and cross apply to get the required values over a time range, as below
select * from Calendar c
cross apply Function1(c.CalendarDate)
where c.CalendarDate >= '20190101' and c.CalendarDate <= '20191009'
This has the expected results, but is far too slow. Currently each desk uses around 50 products, and the products roll every month, so after just 5 years each desk has a history of ~3000 products, which causes the whole thing to grind to a halt. (Roughly 30 seconds for a range of a single month)
Is there a better approach?
Changing your function to the following should be faster:
select @date 'Date', t.DeskID, SUM(t.Amount) 'Sum'
FROM (SELECT m.DeskID, m.ProductID, MAX(m.[Date]) AS MaxDate
FROM Table1 m
where m.[Date] <= @date
GROUP BY m.DeskID, m.ProductID) d
INNER JOIN Table1 t
ON d.DeskID=t.DeskID
AND d.ProductID=t.ProductID
and t.[Date] = d.MaxDate
group by t.DeskID
The performance of TVF usually suffers. The following removes the TVF completely:
-- DROP TABLE Table1;
CREATE TABLE Table1 (DeskID int not null, ProductID nvarchar(32) not null, [Date] Date not null, Amount int not null, PRIMARY KEY ([Date],DeskID,ProductID));
INSERT Table1(DeskID,ProductID,[Date],Amount)
VALUES (1,'A','2019-01-01',1),(1,'A','2019-01-02',2),(1,'B','2019-01-02',5),(1,'A','2019-01-03',1)
,(1,'B','2019-01-03',4),(1,'C','2019-01-03',3),(1,'B','2019-01-04',5),(1,'C','2019-01-04',2),(1,'C','2019-01-05',2)
GO
DECLARE @StartDate date=N'2019-01-01';
DECLARE @EndDate date=N'2019-01-05';
;WITH cte_p
AS
(
SELECT DISTINCT DeskID,ProductID
FROM Table1
WHERE [Date] <= @EndDate
),
cte_a
AS
(
SELECT @StartDate AS [Date], p.DeskID, p.ProductID, ISNULL(a.Amount,0) AS Amount
FROM (
SELECT t.DeskID, t.ProductID
, MAX(t.Date) AS FirstDate
FROM Table1 t
WHERE t.Date <= @StartDate
GROUP BY t.DeskID, t.ProductID) f
INNER JOIN Table1 a
ON f.DeskID=a.DeskID
AND f.ProductID=a.ProductID
AND f.[FirstDate]=a.[Date]
RIGHT JOIN cte_p p
ON p.DeskID=a.DeskID
AND p.ProductID=a.ProductID
UNION ALL
SELECT DATEADD(DAY,1,a.[Date]) AS [Date], t.DeskID, t.ProductID, t.Amount
FROM Table1 t
INNER JOIN cte_a a
ON t.DeskID=a.DeskID
AND t.ProductID=a.ProductID
AND t.[Date] > a.[Date]
AND t.[Date] <= DATEADD(DAY,1,a.[Date])
WHERE a.[Date]<@EndDate
UNION ALL
SELECT DATEADD(DAY,1,a.[Date]) AS [Date], a.DeskID, a.ProductID, a.Amount
FROM cte_a a
WHERE NOT EXISTS(SELECT 1 FROM Table1 t
WHERE t.DeskID=a.DeskID
AND t.ProductID=a.ProductID
AND t.[Date] > a.[Date]
AND t.[Date] <= DATEADD(DAY,1,a.[Date]))
AND a.[Date]<@EndDate
)
SELECT [Date], DeskID, SUM(Amount)
FROM cte_a
GROUP BY [Date], DeskID;

SQL Server group by date

SELECT [DATE], [AMOUNT], SUM(AMOUNT) OVER (ORDER BY DATE) AS 'Running Total'
FROM PeopleActi
WHERE INSTANCE = 'Bank'
AND DATE IS NOT NULL
GROUP BY [DATE], [AMOUNT];
In the code above I am selecting a user's date and amount; "SUM(AMOUNT) OVER (ORDER BY DATE) AS 'Running Total'" is the running total of their costs over a period of dates. When I run this code I get the following results:
DATE AMOUNT Running Total
2018-10-05 100 100
2018-10-06 1000 1100
2018-10-07 5000 6100
2018-10-08 2000 8100
2018-10-09 1000 9100
2018-10-10 5000 14100
2018-10-11 3000 25100
2018-10-11 8000 25100
This works nicely, but my issue is the last two rows. I want them grouped by date, with the amounts for the same day added together, so it should be:
Date Amount Running Total
2018-10-11 11000 25100
Does anyone have an idea of how this can be achieved? My [DATE] is of type DATE.
UPDATE!!!!
I've seen some of your solutions and they are good, but it's important that I display the AMOUNT and the Running Total as well, so the final result should be...
DATE AMOUNT Running Total
2018-10-05 100 100
2018-10-06 1000 1100
2018-10-07 5000 6100
2018-10-08 2000 8100
2018-10-09 1000 9100
2018-10-10 5000 14100
2018-10-11 11000 25100
Thank you everyone for the help so far!
Group up the amounts and then do your cumulative total
WITH CTE
AS
(
SELECT A.Dt,
SUM(A.Amount) AS Amount
FROM (
VALUES ('2018-10-05',100),
('2018-10-06',1000),
('2018-10-07',5000),
('2018-10-08',2000),
('2018-10-09',1000),
('2018-10-10',5000),
('2018-10-11',3000),
('2018-10-11',8000)
) AS A(Dt,Amount)
GROUP BY A.Dt
)
SELECT C.Dt,
C.Amount,
SUM(C.Amount) OVER (ORDER BY C.Dt) AS CumTotal
FROM CTE AS C;
Try like below
SELECT [DATE], SUM([AMOUNT]) AS AMOUNT, SUM(SUM([AMOUNT])) OVER (ORDER BY [DATE]) AS 'Running Total'
FROM PeopleActi
WHERE INSTANCE = 'Bank'
AND DATE IS NOT NULL
GROUP BY [DATE]
If you only need the grouped sum, why use a window function? Aggregation alone is enough:
SELECT [DATE], SUM([AMOUNT])
FROM PeopleActi
WHERE INSTANCE = 'Bank' AND DATE IS NOT NULL
GROUP BY [DATE];
Try this
;WITH CTe([DATE],AMOUNT)
AS
(
SELECT '2018-10-05', 100 UNION ALL
SELECT '2018-10-06', 1000 UNION ALL
SELECT '2018-10-07', 5000 UNION ALL
SELECT '2018-10-08', 2000 UNION ALL
SELECT '2018-10-09', 1000 UNION ALL
SELECT '2018-10-10', 5000 UNION ALL
SELECT '2018-10-11', 3000 UNION ALL
SELECT '2018-10-11', 8000
)
SELECT DISTINCT [DATE],SUM(AMOUNT)OVER(PARTITION BY [DATE] ORDER BY [DATE]) AMOUNT , SUM(AMOUNT)OVER( ORDER BY [DATE]) AS RuningTot FROM CTe
Script
SELECT DISTINCT [DATE],
SUM(AMOUNT)OVER(PARTITION BY [DATE] ORDER BY [DATE]) AS AMOUNT,
SUM(AMOUNT) OVER (ORDER BY DATE) AS 'Running Total'
FROM PeopleActi
WHERE INSTANCE = 'Bank'
AND DATE IS NOT NULL
I would use a CTE to first group by Date, and then do your running total ..
So something like
with myAmounts AS
(
SELECT [DATE], SUM([AMOUNT]) AS Amount
FROM PeopleActi
WHERE INSTANCE = 'Bank'
AND DATE IS NOT NULL
GROUP BY [DATE]
)
SELECT [DATE], [AMOUNT], SUM(AMOUNT) OVER (ORDER BY DATE) AS 'Running Total'
FROM myAmounts
GROUP BY [DATE], [AMOUNT]
;
HTH,
B
ps; just saw that its the same answer as another .. democoding in action
Every field in a group by is going to cause it to potentially create new lines. If you SUM the amount field and remove it from your grouping, that should solve the issue. EDIT: I see the issue, I provided a fully stand alone example of the query below that you can adapt.
DECLARE @PeopleActi TABLE ([DATE] DATE,[AMOUNT] MONEY)
INSERT INTO @PeopleActi SELECT '2018-10-05',100
INSERT INTO @PeopleActi SELECT '2018-10-06',1000
INSERT INTO @PeopleActi SELECT '2018-10-07',5000
INSERT INTO @PeopleActi SELECT '2018-10-08',2000
INSERT INTO @PeopleActi SELECT '2018-10-09',1000
INSERT INTO @PeopleActi SELECT '2018-10-10',5000
INSERT INTO @PeopleActi SELECT '2018-10-11',3000
INSERT INTO @PeopleActi SELECT '2018-10-11',8000
SELECT *, SUM(AMOUNT) OVER (ORDER BY DATE) AS 'Running Total'
FROM (
SELECT [DATE], SUM([AMOUNT]) AS AMOUNT
FROM @PeopleActi
WHERE DATE IS NOT NULL
GROUP BY [DATE]
) a
GROUP BY [DATE],Amount
Try Subselect:
SELECT p.[DATE], p.[AMOUNT], SUM(AMOUNT) OVER (ORDER BY DATE) AS 'Running Total'
FROM
(
select [date], sum([amount]) as Amount from PeopleActi
WHERE INSTANCE = 'Bank'
AND DATE IS NOT NULL
group by [date]
) p
GROUP BY p.[DATE], p.[AMOUNT]
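The "group first, then accumulate" idea from the answers above can also be sketched in a single SELECT by nesting the aggregate inside the window function. The inline VALUES rows below are a stand-in for PeopleActi, and the explicit ROWS frame is my own addition (one output row per date after grouping, so the frame choice does not change the result here).

```sql
-- The inner SUM collapses rows that share a date;
-- the outer SUM ... OVER accumulates those daily totals.
SELECT v.[DATE],
       SUM(v.AMOUNT) AS AMOUNT,
       SUM(SUM(v.AMOUNT)) OVER (ORDER BY v.[DATE]
                                ROWS UNBOUNDED PRECEDING) AS [Running Total]
FROM (VALUES
    (CAST('2018-10-10' AS DATE), 5000),
    (CAST('2018-10-11' AS DATE), 3000),
    (CAST('2018-10-11' AS DATE), 8000)
) AS v([DATE], AMOUNT)
GROUP BY v.[DATE]
ORDER BY v.[DATE];
-- 2018-10-10   5000    5000
-- 2018-10-11  11000   16000
```

Window functions are evaluated after GROUP BY, which is why the window's argument has to be an aggregate (or a grouped column) rather than the raw AMOUNT column.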

How to get the right order?

I have written a query to get the total qty for each month in 2016, then added another row called Total to sum up the total qty for 2016. But the ordermonth column in the result comes out in the wrong order.
So the question is: is there any way to put ordermonth in the right order? Either ASC or DESC is fine. Thanks in advance.
The query I've written:
WITH CTE AS(
SELECT CONVERT(CHAR(10), MONTH(orderdate)) AS ordermonth, SUM(qty) AS qty
FROM dbo.orders
WHERE YEAR(orderdate) = 2016
GROUP BY MONTH(orderdate)
)
SELECT * FROM CTE
UNION
SELECT 'Total', SUM(qty)
FROM dbo.orders
WHERE YEAR(orderdate) = 2016;
Current result:
ordermonth qty
1 4134
10 6454
11 9780
12 4000
2 5548
3 6970
4 3543
5 3309
6 4251
7 4997
8 6134
9 6926
Total 66046
First, you can do what you want without UNION. Something like this:
select coalesce(cast(month(orderdate) as varchar(255)), 'Total') as mon . . .
from . . .
group by grouping sets (month(orderdate), ())
order by month(orderdate)
No CTE, that's the entire query.
You can put into subquery and try like below:
select * from (
--your query including cte and union
) a
order by case when a.ordermonth = 'Total' then 9999 else convert(int,a.ordermonth) end
You can sort it by yyyymm, that is the best way even when you'll have more than one year, i.e.
order by orderyear * 100 + ordermonth
Or you can just add '0' for getting the right order like this:
order by right('0' + cast(ordermonth as varchar(2)), 2)
I found an easy way to solve this issue, that is, replacing UNION with UNION ALL.
WITH CTE AS(
SELECT CONVERT(CHAR(10), MONTH(orderdate)) AS ordermonth, SUM(qty) AS qty
FROM dbo.orders
WHERE YEAR(orderdate) = 2016
GROUP BY MONTH(orderdate)
)
SELECT * FROM CTE
UNION ALL
SELECT 'Total', SUM(qty)
FROM dbo.orders
WHERE YEAR(orderdate) = 2016;
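Without an ORDER BY, SQL Server does not guarantee the row order of a result, so the UNION ALL version may stop returning rows in month order at some point. A sketch that makes the order explicit, assuming the same dbo.orders table; the monthnum and sortkey columns are hypothetical helpers added only for sorting:

```sql
WITH monthly AS (
    SELECT MONTH(orderdate) AS monthnum, SUM(qty) AS qty
    FROM dbo.orders
    WHERE YEAR(orderdate) = 2016
    GROUP BY MONTH(orderdate)
)
SELECT u.ordermonth, u.qty
FROM (
    -- Keep the numeric month next to its text label so it can drive the sort.
    SELECT CONVERT(VARCHAR(10), monthnum) AS ordermonth, qty, monthnum AS sortkey
    FROM monthly
    UNION ALL
    -- 13 sorts after every real month, so Total lands on the last row.
    SELECT 'Total', SUM(qty), 13
    FROM monthly
) AS u
ORDER BY u.sortkey;
```

Use ORDER BY u.sortkey DESC if the months should run from December down to January, with Total first.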

Getting rid of grouping field

Is there a safe way to not have to group by a field when using an aggregate in another field? Here is my example
SELECT
C.CustomerName
,D.INDUSTRY_CODE
,CASE WHEN D.INDUSTRY_CODE IN ('003','004','005','006','007','008','009','010','017','029')
THEN 'PM'
WHEN UPPER(CustomerName) = 'ULINE INC'
THEN 'ULINE'
ELSE 'DR'
END AS BU
,ISNULL((SELECT SUM(GrossAmount)
where CONVERT(date,convert(char(8),InvoiceDateID )) between DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) - 1, 0) and DATEADD(year, -1, GETDATE())),0) [PREVIOUS YEAR GROSS]
FROM factMargins A
LEFT OUTER JOIN dimDate B ON A.InvoiceDateID = B.DateId
LEFT OUTER JOIN dimCustomer C ON A.CustomerID = C.CustomerId
LEFT OUTER JOIN CRCDATA.DBO.CU10 D ON D.CUST_NUMB = C.CustomerNumber
GROUP BY
C.CustomerName,D.INDUSTRY_CODE
,A.InvoiceDateID
order by CustomerName
Before grouping I was getting only 984 rows, but after grouping by the A.InvoiceDateID field I get over 11k rows. The row count blows up because there are multiple invoices per customer. MIN and MAX won't work, since they would pull the data incorrectly. Would it be best to let my application (Crystal) get rid of the extra lines? I usually like my base data to be as close as possible to how the report will be laid out.
Try moving the reference to InvoiceDateID to within an aggregate function, rather than within a selected subquery's WHERE clause.
In Oracle, here's an example:
with TheData as (
select 'A' customerID, 25 AMOUNT , trunc(sysdate) THEDATE from dual union
select 'B' customerID, 35 AMOUNT , trunc(sysdate-1) THEDATE from dual union
select 'A' customerID, 45 AMOUNT , trunc(sysdate-2) THEDATE from dual union
select 'A' customerID, 11000 AMOUNT , trunc(sysdate-3) THEDATE from dual union
select 'B' customerID, 12000 AMOUNT , trunc(sysdate-4) THEDATE from dual union
select 'A' customerID, 15000 AMOUNT , trunc(sysdate-5) THEDATE from dual)
select
CustomerID,
sum(amount) as "AllRevenue",
sum(case when thedate<sysdate-3 then amount else 0 end) as "OlderRevenue"
from thedata
group by customerID;
Output:
CustomerID | AllRevenue | OlderRevenue
A | 26070 | 26000
B | 12035 | 12000
This says:
For each customerID
I want the sum of all amounts
and I want the sum of amounts earlier than 3 days ago
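Since the question is about SQL Server, here is roughly the same conditional aggregation as a T-SQL sketch. The inline VALUES rows mirror the Oracle sample data and are not the real factMargins table; the column names are made up for illustration.

```sql
-- One row per customer: the date condition lives inside the aggregate,
-- so the date column never has to appear in GROUP BY.
SELECT d.CustomerID,
       SUM(d.Amount) AS AllRevenue,
       SUM(CASE WHEN d.TheDate < DATEADD(DAY, -3, GETDATE())
                THEN d.Amount ELSE 0 END) AS OlderRevenue
FROM (VALUES
    ('A',    25, CAST(GETDATE() AS DATE)),
    ('B',    35, DATEADD(DAY, -1, CAST(GETDATE() AS DATE))),
    ('A',    45, DATEADD(DAY, -2, CAST(GETDATE() AS DATE))),
    ('A', 11000, DATEADD(DAY, -3, CAST(GETDATE() AS DATE))),
    ('B', 12000, DATEADD(DAY, -4, CAST(GETDATE() AS DATE))),
    ('A', 15000, DATEADD(DAY, -5, CAST(GETDATE() AS DATE)))
) AS d(CustomerID, Amount, TheDate)
GROUP BY d.CustomerID;
```

In the original query the same idea would mean putting the previous-year date test inside a SUM(CASE ...) over GrossAmount and dropping A.InvoiceDateID from the GROUP BY.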

How to count open records, grouped by hour and day in SQL-server-2008-r2

I have hospital patient admission data in Microsoft SQL Server 2008 R2 that looks something like this:
PatientID, AdmitDate, DischargeDate
Jones. 1-jan-13 01:37. 1-jan-13 17:45
Smith 1-jan-13 02:12. 2-jan-13 02:14
Brooks. 4-jan-13 13:54. 5-jan-13 06:14
I would like to count the number of patients in the hospital day by day and hour by hour, i.e. at:
1-jan-13 00:00. 0
1-jan-13 01:00. 0
1-jan-13 02:00. 1
1-jan-13 03:00. 2
And I need to include the hours when there are no patients admitted in the result.
I can't create tables so making a reference table listing all the hours and days is out, though.
Any suggestions?
To solve this problem, you need a list of date-hours. The following gets this from the admit dates cross joined to a table of 24 hours. The table of 24 hours is calculated from INFORMATION_SCHEMA.COLUMNS, a trick for getting small sequences of numbers in SQL Server.
The rest is just a join between this table and the hours. This version counts the patients at the hour, so someone admitted and discharged in the same hour, for instance is not counted. And in general someone is not counted until the next hour after they are admitted:
with dh as (
select DATEADD(hour, seqnum - 1, thedatehour ) as DateHour
from (select distinct cast(cast(AdmitDate as DATE) as datetime) as thedatehour
from Admissions a
) a cross join
(select ROW_NUMBER() over (order by (select NULL)) as seqnum
from INFORMATION_SCHEMA.COLUMNS
) hours
where hours.seqnum <= 24
)
select dh.DateHour, COUNT(*) as NumPatients
from dh join
Admissions a
on dh.DateHour between a.AdmitDate and a.DischargeDate
group by dh.DateHour
order by 1
This also assumes that there are admissions on every day. That seems like a reasonable assumption. If not, a calendar table would be a big help.
Here is one (ugly) way:
;WITH DayHours AS
(
SELECT 0 DayHour
UNION ALL
SELECT DayHour+1
FROM DayHours
WHERE DayHour+1 <= 23
)
SELECT B.AdmitDate, A.DayHour, COUNT(DISTINCT PatientID) Patients
FROM DayHours A
CROSS JOIN (SELECT DISTINCT CONVERT(DATE,AdmitDate) AdmitDate
FROM YourTable) B
LEFT JOIN YourTable C
ON B.AdmitDate = CONVERT(DATE,C.AdmitDate)
AND A.DayHour = DATEPART(HOUR,C.AdmitDate)
GROUP BY B.AdmitDate, A.DayHour
This is a bit messy and includes a temp table with the test data you provided:
CREATE TABLE #HospitalPatientData (PatientId NVARCHAR(MAX), AdmitDate DATETIME, DischargeDate DATETIME)
INSERT INTO #HospitalPatientData
SELECT 'Jones.', '1-jan-13 01:37:00.000', '1-jan-13 17:45:00.000' UNION
SELECT 'Smith', '1-jan-13 02:12:00.000', '2-jan-13 02:14:00.000' UNION
SELECT 'Brooks.', '4-jan-13 13:54:00.000', '5-jan-13 06:14:00.000'
;WITH DayHours AS
(
SELECT 0 DayHour
UNION ALL
SELECT DayHour+1
FROM DayHours
WHERE DayHour+1 <= 23
),
HospitalPatientData AS
(
SELECT CONVERT(nvarchar(max),AdmitDate,103) as AdmitDate ,DATEPART(hour,(AdmitDate)) as AdmitHour, COUNT(PatientID) as CountOfPatients
FROM #HospitalPatientData
GROUP BY CONVERT(nvarchar(max),AdmitDate,103), DATEPART(hour,(AdmitDate))
),
Results AS
(
SELECT MAX(h.AdmitDate) as Date, d.DayHour
FROM HospitalPatientData h
INNER JOIN DayHours d ON d.DayHour=d.DayHour
GROUP BY AdmitDate, CountOfPatients, DayHour
)
SELECT r.*, COUNT(h.PatientId) as CountOfPatients
FROM Results r
LEFT JOIN #HospitalPatientData h ON CONVERT(nvarchar(max),AdmitDate,103)=r.Date AND DATEPART(HOUR,h.AdmitDate)=r.DayHour
GROUP BY r.Date, r.DayHour
ORDER BY r.Date, r.DayHour
DROP TABLE #HospitalPatientData
This may get you started:
BEGIN TRAN
DECLARE @pt TABLE
(
PatientID VARCHAR(10)
, AdmitDate DATETIME
, DischargeDate DATETIME
)
INSERT INTO @pt
( PatientID, AdmitDate, DischargeDate )
VALUES ( 'Jones', '1-jan-13 01:37', '1-jan-13 17:45' ),
( 'Smith', '1-jan-13 02:12', '2-jan-13 02:14' )
, ( 'Brooks', '4-jan-13 13:54', '5-jan-13 06:14' )
DECLARE @StartDate DATETIME = '20130101'
, @FutureDays INT = 7
;
WITH dy
AS ( SELECT TOP (@FutureDays)
ROW_NUMBER() OVER ( ORDER BY name ) dy
FROM sys.columns c
) ,
hr
AS ( SELECT TOP 24
ROW_NUMBER() OVER ( ORDER BY name ) hr
FROM sys.columns c
)
SELECT refDate, COUNT(p.PatientID) AS PtCount
FROM ( SELECT DATEADD(HOUR, hr.hr - 1,
DATEADD(DAY, dy.dy - 1, @StartDate)) AS refDate
FROM dy
CROSS JOIN hr
) ref
LEFT JOIN @pt p ON ref.refDate BETWEEN p.AdmitDate AND p.DischargeDate
GROUP BY refDate
ORDER BY refDate
ROLLBACK