Selecting fields that are not in GROUP BY [duplicate] - sql

I am working with GROUP BY in SQL Server. I want to select fields that are not part of the GROUP BY clause.
I want to select DishID without adding it to the GROUP BY clause. If I add DishID to the GROUP BY, the range is repeated, as the following query shows.
SELECT v.range, COUNT(*) AS 'occurrences',COUNT(DishID) * SUM(OrderQty) AS [TotalOrders],po.DishID
FROM ( SELECT PgrId,CASE WHEN DATEDIFF(YEAR, p.PgrDOB, GETDATE()) >= 0 AND DATEDIFF(YEAR, p.PgrDOB, GETDATE()) < 10 THEN '0-9'
WHEN DATEDIFF(YEAR, p.PgrDOB, GETDATE()) >= 10 AND DATEDIFF(YEAR, p.PgrDOB, GETDATE()) < 20 THEN '10-19'
WHEN DATEDIFF(YEAR, p.PgrDOB, GETDATE()) >= 20 AND DATEDIFF(YEAR, p.PgrDOB, GETDATE()) < 30 THEN '20-29'
WHEN DATEDIFF(YEAR, p.PgrDOB, GETDATE()) >= 30 AND DATEDIFF(YEAR, p.PgrDOB, GETDATE()) < 40 THEN '30-39'
ELSE '40+'
END AS 'range'
FROM Passenger p
) v inner join PassengerOrder po on v.PgrId = po.PgrID
GROUP BY v.range,po.DishID

Sounds like you want the top dish in each age range, with the results ordered by total sales descending. There are many other improvements that could be made here, but the simplest way is to generate a row number per range ordered by total sales descending, wrap that in a CTE, and keep only the rows where the row number is 1.
;WITH cte AS
(
SELECT v.range,
COUNT(*) AS occurrences,
COUNT(DishID) * SUM(OrderQty) AS TotalOrders,
po.DishID,
rn = ROW_NUMBER() OVER (PARTITION BY v.range
ORDER BY COUNT(DishID) * SUM(OrderQty) DESC)
FROM
( SELECT PgrId,
CASE WHEN p.PgrDOB < DATEADD(YEAR, -40, GETDATE()) THEN '40+'
WHEN p.PgrDOB < DATEADD(YEAR, -30, GETDATE()) THEN '30-39'
WHEN p.PgrDOB < DATEADD(YEAR, -20, GETDATE()) THEN '20-29'
WHEN p.PgrDOB < DATEADD(YEAR, -10, GETDATE()) THEN '10-19'
ELSE '0-9'
END AS range
FROM dbo.Passenger p
) v inner join dbo.PassengerOrder po on v.PgrId = po.PgrID
GROUP BY v.range,po.DishID
)
SELECT range, occurrences, TotalOrders, DishID
FROM cte
WHERE rn = 1
ORDER BY TotalOrders DESC;
Output (shown in this db<>fiddle):
range  occurrences  TotalOrders  DishID
40+    2            220          dsh0000001
30-39  2            84           dsh0000001
10-19  1            11           dsh0000001
20-29  1            1            dsh0000001
0-9    1            1            dsh0000001
I fixed a couple of other things, too:
Always use schema name for objects.
Don't use 'single quotes' to delimit column aliases; this makes them too easy to confuse with string literals. Use [square brackets] instead, but don't delimit at all unless you need to. (I talk a little about that here.)
Your DATEDIFF calculation was not accurate to a person's birthday, nor is mine, but if you flip the order you can make a much less complex CASE expression, and when you move functions away from the column, you make it more likely that the query could benefit from a current (or future) index. Lots more at Dating Responsibly.
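To illustrate the last point, here is a minimal sketch of why DATEDIFF(YEAR, ...) overstates age: it counts the year boundaries crossed, not the birthdays reached. The variables @dob and @today and their values are hypothetical, chosen only for the demonstration.

```sql
-- DATEDIFF(YEAR, ...) counts year boundaries crossed, not birthdays reached.
SELECT DATEDIFF(YEAR, '20001231', '20010101') AS YearBoundaries;  -- 1, although only one day has passed

-- Hypothetical sample values, for illustration only.
DECLARE @dob   DATE = '19900615';
DECLARE @today DATE = '20200101';

SELECT DATEDIFF(YEAR, @dob, @today) AS DatediffYears,              -- 30
       CASE WHEN @dob <= DATEADD(YEAR, -30, @today)                -- compare the column to a shifted date
            THEN 'has turned 30'
            ELSE 'still 29'
       END AS ByDateComparison;                                    -- 'still 29'
```

The second form also leaves the column bare on one side of the comparison, which is what allows an index on it to be considered.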

Related

Don't have the answer to attached query?

My data is in two tables. The format of the two tables is below:
I need to find, for all customers aged between 25 and 35, the net total revenue they generated in the last 30 days of transactions, counting back from the latest transaction date available in the data.
I wrote the code below:
SELECT
TOP 1 YEAR(T2.TRAN_DATE)[TRAN_YEAR] ,MONTH(T2.TRAN_DATE)[TRAN_Month],
SUM(T2.Total_amt)[REVENUE]
FROM TRANSACTIONS T2
RIGHT JOIN CUSTOMER T1
ON T1.CUSTOMER_ID = T2.CUST_ID
WHERE DATEDIFF(YY, T1.DOB, GETDATE()) BETWEEN 25 AND 35
GROUP BY YEAR(T2.TRAN_DATE),MONTH(T2.TRAN_DATE)
ORDER BY YEAR(T2.TRAN_DATE) DESC, MONTH(T2.TRAN_DATE) DESC
My query runs, but when I calculated the same thing in Excel it gave a different answer.
I am not able to figure out my mistake.
I am expecting a query like this:
SELECT SUM(T.Total_amt) AS REVENUE
FROM TRANSACTIONS T JOIN
CUSTOMER c
ON c.CUSTOMER_ID = t.CUST_ID
WHERE c.DOB >= DATEADD(YEAR, -35, GETDATE()) AND
c.DOB < DATEADD(YEAR, -24, GETDATE()) AND
t.TRAN_DATE > DATEADD(DAY, -30, GETDATE());
Note that this uses direct date comparisons rather than DATEDIFF(). These are usually more accurate.
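As a hedged variation on the same idea, the sketch below anchors the 30-day window on the latest transaction date in the data, which is what the question asks for, instead of on GETDATE(). It assumes that DOB and TRAN_DATE are stored as date types and that "aged 25 to 35" means whole years, inclusive; the variables @MaxTranDate and @Today are introduced here for illustration.

```sql
-- Assumption: TRAN_DATE and DOB are DATE/DATETIME columns, not strings.
DECLARE @MaxTranDate DATE = (SELECT MAX(TRAN_DATE) FROM TRANSACTIONS);
DECLARE @Today       DATE = GETDATE();

SELECT SUM(t.Total_amt) AS REVENUE
FROM TRANSACTIONS AS t
JOIN CUSTOMER AS c
  ON c.CUSTOMER_ID = t.CUST_ID
WHERE c.DOB >  DATEADD(YEAR, -36, @Today)            -- has not yet turned 36
  AND c.DOB <= DATEADD(YEAR, -25, @Today)            -- has already turned 25
  AND t.TRAN_DATE > DATEADD(DAY, -30, @MaxTranDate); -- last 30 days before the latest transaction
```

If age should instead be measured as of the latest transaction date, @Today can be replaced with @MaxTranDate in the two DOB comparisons.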

Calculate percentage along with count

I am trying to show the count and percentage in a table.
The query I used is this:
DECLARE @BeginDate AS DATETIME
SET @BeginDate = GETDATE();
SELECT TOP 10
s.Title AS Title, COUNT(*) AS TotalSessions
FROM
History s
WHERE
CONVERT(DATE, s.DateStamp) >= DATEADD(DAY, -7, @BeginDate)
AND CONVERT(DATE, s.DateStamp) <= DATEADD(DAY, -1, @BeginDate)
GROUP BY
Title
ORDER BY
TotalSessions DESC
This returns the top 10 records. Now:
I want to show the percentage value with respect to total as the third column. Can I do this in same query?
I want to show the remaining count as others (if 100 records are there, first 10 rows shows top 10 records and row #11 shows sum of remaining 90 records with title "Others"). Can I do it in the same query?
You can use window functions. Something like this:
SELECT TOP 10 s.Title as Title, count(*) as TotalSessions,
COUNT(*) * 1.0 / SUM(COUNT(*)) OVER ()
FROM History s
WHERE convert(date,s.DateStamp) >= DATEADD(DAY, -7, @BeginDate)
AND convert(date,s.DateStamp) <= DATEADD(DAY, -1, @BeginDate)
GROUP BY Title
ORDER BY TotalSessions DESC
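For the second part of the question (a final "Others" row), one possible approach is to rank the titles first and then fold everything past rank 10 into a single bucket. This is only a sketch: the CTE names ranked and bucketed, the SortKey column, and the PercentOfTotal alias are made up for the example, and ties in the count are broken alphabetically by title.

```sql
DECLARE @BeginDate AS DATETIME = GETDATE();

WITH ranked AS
(
    -- One row per title, numbered from the most sessions to the fewest.
    SELECT s.Title,
           COUNT(*) AS TotalSessions,
           ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC, s.Title) AS rn
    FROM History AS s
    WHERE CONVERT(DATE, s.DateStamp) >= DATEADD(DAY, -7, @BeginDate)
      AND CONVERT(DATE, s.DateStamp) <= DATEADD(DAY, -1, @BeginDate)
    GROUP BY s.Title
),
bucketed AS
(
    -- Keep the top 10 titles as they are; relabel the rest as 'Others'.
    SELECT CASE WHEN rn <= 10 THEN Title ELSE 'Others' END AS Title,
           CASE WHEN rn <= 10 THEN rn ELSE 11 END AS SortKey,
           TotalSessions
    FROM ranked
)
SELECT Title,
       SUM(TotalSessions) AS TotalSessions,
       SUM(TotalSessions) * 100.0 / SUM(SUM(TotalSessions)) OVER () AS PercentOfTotal
FROM bucketed
GROUP BY Title, SortKey
ORDER BY SortKey;
```

Grouping by SortKey as well as Title keeps a real title that happens to be called "Others" separate from the bucket row when it ranks in the top 10.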

Having case multiple conditions in SQL Server 2014 [duplicate]

I have a table 'FinancialTrans', of which only 3 fields are needed.
AcctID  TransTypeCode  DateOfTrans  Field 4  Field 5  Field 6....
123     TOLL           2016-06-06
123     TOLL           2016-06-02
123     TOLL           2016-04-28
123     PYMT           2016-03-11
123     TOLL           2015-12-22
123     TOLL           2015-12-22
What I need:
I only need account numbers where there are No Tolls AND No Pymt in the last 2 years.
My attempt at the code:
I know I need a Having clause but not quite sure how to write it.
Perhaps, a NOT Exist?
SELECT [AcctID]
,[TransTypeCode]
,[TransDate]
FROM [FinancialTrans]
WHERE (
(TransTypeCode = 'TOLL' AND Max(TransDate) <= DATEADD(year, -2, GETDATE()))
OR (TransTypeCode = 'PYMT' AND Max(TransDate) <= DATEADD(year, -2, GETDATE()))
)
GROUP BY AcctID, TransTypeCode, TransDate
The challenge I'm facing is that I want account numbers where there is NEITHER a toll NOR a payment in the past two years.
I'm getting account numbers that have no tolls in the past two years but do have a payment in the past two years.
Question: How do I ensure I get only account numbers that have NEITHER in the past two years?
This question is different from an earlier question asked because the requirements have now changed.
Not exists would work also.
Select AcctID,
TransTypeCode,
TransDate
From FinancialTrans ft1
Where Not Exists (Select 1
From FinancialTrans ft2
Where ft1.AcctID = ft2.AcctID
and ft2.TransTypeCode IN ('TOLL','PYMT')
-- the sample data labels this column DateOfTrans; the question's query calls it TransDate
and ft2.TransDate > DATEADD(year, -2, getdate()))
You can use group by and having:
SELECT [AcctID]
FROM [FinancialTrans]
GROUP BY [AcctID]
HAVING MAX(CASE WHEN TransTypeCode = 'TOLL' THEN TransDate END) <= DATEADD(year, -2, GETDATE()) AND
MAX(CASE WHEN TransTypeCode = 'PYMT' THEN TransDate END) <= DATEADD(year, -2, GETDATE()) ;
That above actually requires that there be both types of transactions. It might be better to do:
SELECT [AcctID]
FROM [FinancialTrans]
GROUP BY [AcctID]
HAVING SUM(CASE WHEN TransTypeCode IN ('TOLL', 'PYMT') AND TransDate > DATEADD(year, -2, GETDATE())
THEN 1 ELSE 0
END) = 0;
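The difference between the two HAVING versions comes from NULL handling: MAX() over no matching rows is NULL, and NULL <= anything is not true, so an account that has never had a payment drops out of the first query. A small sketch with made-up rows in a table variable (the three accounts are hypothetical):

```sql
DECLARE @FinancialTrans TABLE
(
    AcctID        INT,
    TransTypeCode VARCHAR(4),
    TransDate     DATE
);

INSERT INTO @FinancialTrans (AcctID, TransTypeCode, TransDate)
VALUES (1, 'TOLL', DATEADD(YEAR,  -3, GETDATE())),  -- account 1: old toll, no payment at all
       (2, 'TOLL', DATEADD(YEAR,  -3, GETDATE())),  -- account 2: old toll ...
       (2, 'PYMT', DATEADD(YEAR,  -4, GETDATE())),  -- ... and old payment
       (3, 'TOLL', DATEADD(MONTH, -1, GETDATE()));  -- account 3: recent toll

-- First version: returns only account 2 (account 1 has no PYMT row, so its MAX is NULL).
SELECT AcctID
FROM @FinancialTrans
GROUP BY AcctID
HAVING MAX(CASE WHEN TransTypeCode = 'TOLL' THEN TransDate END) <= DATEADD(YEAR, -2, GETDATE())
   AND MAX(CASE WHEN TransTypeCode = 'PYMT' THEN TransDate END) <= DATEADD(YEAR, -2, GETDATE());

-- Second version: returns accounts 1 and 2 (no recent TOLL or PYMT row on either).
SELECT AcctID
FROM @FinancialTrans
GROUP BY AcctID
HAVING SUM(CASE WHEN TransTypeCode IN ('TOLL', 'PYMT')
                 AND TransDate > DATEADD(YEAR, -2, GETDATE())
                THEN 1 ELSE 0
           END) = 0;
```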

How to count number of records per month over a time period

Is there a way to run a query for a specified amount of time, say the last 5 months, and to be able to return how many records were created each month? Here's what my table looks like:
SELECT rID, dateOn FROM claims
SELECT COUNT(rID) AS ClaimsPerMonth,
MONTH(dateOn) AS inMonth,
YEAR(dateOn) AS inYear FROM claims
WHERE dateOn >= DATEADD(month, -5, GETDATE())
GROUP BY MONTH(dateOn), YEAR(dateOn)
ORDER BY inYear, inMonth
In this query, WHERE dateOn >= DATEADD(month, -5, GETDATE()) restricts the rows to the past 5 months, and GROUP BY MONTH(dateOn), YEAR(dateOn) then counts them per month.
And to appease the community, here is a SQL Fiddle to prove it.
Unlike the other two answers, this will return all 5 months, even when the count is 0. It will also use an index on the dateOn column, if a suitable one exists (the other two answers so far are non-sargable).
DECLARE @nMonths INT = 5;
;WITH m(m) AS
(
SELECT TOP (@nMonths) DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE())-number, 0)
FROM master.dbo.spt_values WHERE [type] = N'P' ORDER BY number
)
SELECT m.m, num_claims = COUNT(c.rID)
FROM m LEFT OUTER JOIN dbo.claims AS c
ON c.dateOn >= m.m AND c.dateOn < DATEADD(MONTH, 1, m.m)
GROUP BY m.m
ORDER BY m.m;
You also don't have to use a variable in the TOP clause, but this might make the code more reusable (e.g. you could pass the number of months as a parameter).
SELECT
count(rID) as Cnt,
DatePart(Month, dateOn) as MonthNumber,
Max(DateName(Month, dateOn)) as MonthName
FROM claims
WHERE dateOn >= DateAdd(Month, -5, getdate())
GROUP BY DatePart(Month, dateOn)

MSSQL 2005 query group dates - even dates with no records?

I have this MS SQL 2005 query:
SELECT
DATEDIFF(dd, getdate(), CreatedOn) as Day,
COUNT(CreatedOn) as 'Active Cases'
FROM
[dbo].[IncidentBase]
WHERE
(StatusCode != 6 AND StatusCode != 5)
AND (CaseTypeCode = '200000' OR CaseTypeCode = '200005' OR CaseTypeCode = '200006')
GROUP BY
DATEDIFF(dd, getdate(), CreatedOn)
ORDER BY
Day DESC
And returns something like this:
-1 10
-2 6
-5 4
-7 8
I would really like it to be like:
-1 10
-2 6
-3 0
-4 0
-5 4
-6 0
-7 8
(Insert zero between dates with no records)
How can I do that?
Many thanks in advance!
Try an outer join on a subquery returning all the dates
SELECT table_cal.day_diff as "Day",
COALESCE(table_count.base_count,0) as "Active Cases"
FROM
(SELECT DISTINCT DATEDIFF(dd, getdate(), ibase.CreatedOn) as day_diff
FROM [dbo].[IncidentBase] ibase) table_cal
LEFT OUTER JOIN
(SELECT DATEDIFF(dd, getdate(), ibase.CreatedOn) as day_diff,
COUNT(ibase.CreatedOn) as base_count
FROM [dbo].[IncidentBase] ibase
WHERE ibase.StatusCode NOT IN (5,6) AND ibase.CaseTypeCode IN ('200000','200005','200006')
GROUP BY DATEDIFF(dd, getdate(), ibase.CreatedOn)) table_count
ON (table_cal.day_diff = table_count.day_diff)
ORDER BY table_cal.day_diff DESC
The idea behind it is quite simple. You need one subquery to generate the list of existing dates, and another to generate the result values. Then you outer join the two and replace null values with 0.
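Because the list of days above is taken from IncidentBase itself, a day on which no case of any kind was created would still be missing. A possible variation is to generate the day offsets independently of the data. The sketch below uses master.dbo.spt_values as a numbers source (an undocumented system table, so treat that as an assumption) and a hypothetical @nDays variable for the length of the window. It sticks to syntax available in SQL Server 2005.

```sql
DECLARE @nDays INT;
SET @nDays = 7;  -- hypothetical window length: the last 7 days

WITH d(day_diff) AS
(
    -- Offsets -1, -2, ..., -@nDays, generated without touching IncidentBase.
    SELECT TOP (@nDays) -number
    FROM master.dbo.spt_values
    WHERE [type] = N'P' AND number >= 1
    ORDER BY number
)
SELECT d.day_diff AS [Day],
       COUNT(i.CreatedOn) AS [Active Cases]   -- counts only matched rows, so empty days give 0
FROM d
LEFT OUTER JOIN dbo.IncidentBase AS i
  ON DATEDIFF(dd, GETDATE(), i.CreatedOn) = d.day_diff
 AND i.StatusCode NOT IN (5, 6)
 AND i.CaseTypeCode IN ('200000', '200005', '200006')
GROUP BY d.day_diff
ORDER BY d.day_diff DESC;
```

The status and case-type filters sit in the ON clause, not in a WHERE clause; in a WHERE clause they would discard the unmatched days and turn the outer join back into an inner join.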