Multiple GROUP BY? - sql

I'm hoping to be able to write a SQL query that can use multiple COUNTs based upon different criteria and have the values grouped together.
Let's say I have a (canned) scenario where I change the price-points of my products and want to analyze what people have paid.
SELECT Product, COUNT(*) as Total FROM Orders WHERE Location = 'Amazon'
SELECT Product, COUNT(*) as HighPriceCount FROM Orders
WHERE Location = 'Amazon' and PRICE > 10
From here, I'd like to be able to see results like this.
--------------------------------------------
| Product | Total | HighPriceCount | Avg |
--------------------------------------------
| Game 1 | 50 | 20 | .40 |
| Prod 2 | 300 | 200 | .66 |
--------------------------------------------
Where Avg. is the "price above 10" / "total sold". My initial approach is to group by Product but I wanted to see if an "inner-select" is the only path or whether there is a more elegant way to do this. Seems like a lot of duplication? Here's my initial version of a query.
-- I don't know if this works?
SELECT Product, COUNT(*) AS Total,
(
SELECT Product, COUNT(*) FROM Orders WHERE Location = 'Amazon' and Price > 10
GROUP BY Product
) AS HighPriceCount,
(Total / HighPriceCount) AS Avg
From Orders
WHERE Location = 'Amazon'
GROUP BY Product

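To get the count, use a CASE expression for HighPriceCount as below. Aggregate functions ignore NULLs (COUNT(*) is the exception, since it counts rows), so a COUNT over a CASE with no ELSE branch counts only the matching rows.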
SELECT Product, COUNT(*) as Total,
COUNT(case when price > 10 then 1 end) as HighPriceCount,
COUNT(case when price > 10 then 1 end) * 1.0 / COUNT(*) as Avg
FROM Orders
WHERE Location = 'Amazon'
GROUP BY Product
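To see why the CASE trick works, here is a minimal sketch using Python's built-in sqlite3 module. The rows and prices are invented for illustration (they are not from the question), and the column name Ratio is my own. COUNT skips the NULL that CASE returns when there is no ELSE branch, so only rows with Price > 10 are counted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (Product TEXT, Location TEXT, Price REAL)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [
        # made-up sample rows, not the asker's data
        ("Game 1", "Amazon", 12.0),
        ("Game 1", "Amazon", 8.0),
        ("Game 1", "Amazon", 15.0),
        ("Game 1", "Amazon", 9.0),
        ("Prod 2", "Amazon", 20.0),
        ("Prod 2", "Amazon", 5.0),
        ("Prod 2", "Ebay", 30.0),  # filtered out by the WHERE clause
    ],
)

rows = conn.execute(
    """
    SELECT Product,
           COUNT(*) AS Total,
           COUNT(CASE WHEN Price > 10 THEN 1 END) AS HighPriceCount,
           -- multiply by 1.0 so the division is not done on integers
           COUNT(CASE WHEN Price > 10 THEN 1 END) * 1.0 / COUNT(*) AS Ratio
    FROM Orders
    WHERE Location = 'Amazon'
    GROUP BY Product
    ORDER BY Product
    """
).fetchall()
print(rows)
```

Both counts come from a single scan of Orders, which is the point of the approach compared with a separate sub-select per column.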

Did you try using SUM and CASE? (I can't try it right now, but I think it should work.)
SELECT PRODUCT,
SUM(CASE WHEN PRICE>10 THEN 1 ELSE 0 END) as highpricecount,
COUNT(*) as total
FROM Orders
WHERE LOCATION='AMAZON'
GROUP BY PRODUCT;

Here's mine...
Select Product, Total, HighPriceCount, HighPriceCount * 1.0 / Total As Avg
From (Select Product,
Sum(Case When Location = 'Amazon' Then 1 Else 0 End) As Total,
Sum(Case When Location = 'Amazon' And Price > 10 Then 1 Else 0 End) As HighPriceCount
From Orders
Group By Product) o
Where Total > 0 -- skip products with no Amazon orders (avoids dividing by zero)

Here is mine
Select
Product, Sum(Total) as Total, Sum(HighPriceCount) as HighPriceCount, Sum(HighPriceCount) * 1.0 / Sum(Total) as AVG
from
(
select Product, 1 as Total, case WHEN PRICE>10 THEN 1 ELSE 0 END as HighPriceCount
from Orders where LOCATION='AMAZON'
) t
group by Product

Your query doesn't work. Please try this one:
select A.Product, A.Total, ISNULL(B.HighPriceCount,0) as HighPriceCount, ISNULL(B.HighPriceCount,0)*1.0/A.Total as [Avg]
from
(select Product, count(*) as Total
from Orders
where Location='Amazon'
group by Product ) A
left join
(SELECT Product, COUNT(*) as HighPriceCount
from Orders
where Location = 'Amazon' and Price > 10
group by Product) B
on A.Product = B.Product
I don't have your database, so I can't test this query and there may be a typo, but with some changes on your side it should work.

Related

Calculating multiple averages across different parts of the table?

I have the following transactions table:
customer_id purchase_date product category department quantity store_id
1 2020-10-01 Kit Kat Candy Food 2 store_A
1 2020-10-01 Snickers Candy Food 1 store_A
1 2020-10-01 Snickers Candy Food 1 store_A
2 2020-10-01 Snickers Candy Food 2 store_A
2 2020-10-01 Baguette Bread Food 5 store_A
2 2020-10-01 iPhone Cell phones Electronics 2 store_A
3 2020-10-01 Sony PS5 Games Electronics 1 store_A
I would like to calculate the average number of products purchased (for each product in the table). I'm also looking to calculate averages across each category and each department by accounting for all products within the same category or department respectively. Care should be taken to divide over unique customers AND the product quantity being greater than 0 (a 0 quantity indicates a refund, and should not be accounted for).
So basically, the output table would look like the one below:
...where store_id and average_level_type are partition columns.
Is there a way to achieve this in a single pass over the transactions table? or do I need to break down my approach into multiple steps?
Thanks!
How about using “union all” as below -
Select store_id, 'product' as average_level_type,product as id, sum(quantity) as total_quantity,
Count(distinct customer_id) as unique_customer_count, sum(quantity)/count(distinct customer_id) as average
from transactions
where quantity > 0
group by store_id,product
Union all
Select store_id, 'category' as average_level_type, category as id, sum(quantity) as total_quantity,
Count(distinct customer_id) as unique_customer_count, sum(quantity)/count(distinct customer_id) as average
from transactions
where quantity > 0
group by store_id,category
Union all
Select store_id, 'department' as average_level_type,department as id, sum(quantity) as total_quantity,
Count(distinct customer_id) as unique_customer_count, sum(quantity)/count(distinct customer_id) as average
from transactions
where quantity > 0
group by store_id,department;
If you want to avoid UNION ALL, you can use something like rollup() or group by grouping sets() to achieve the same result, but the query is a little more complicated if you need the output in the exact format shown in the question.
EDIT : Below is how you can use grouping sets to get the same output -
Select store_id,
case when G_ID = 3 then 'product'
when G_ID = 5 then 'category'
when G_ID = 6 then 'department' end As average_level_type,
case when G_ID = 3 then product
when G_ID = 5 then category
when G_ID = 6 then department end As id,
total_quantity,
unique_customer_count,
average
from
(select store_id, product, category, department, sum(quantity) as total_quantity, Count(distinct customer_id) as unique_customer_count, sum(quantity)/count(distinct customer_id) as average, GROUPING__ID As G_ID
from transactions
where quantity > 0
group by store_id,product,category,department
grouping sets((store_id,product),(store_id,category),(store_id,department))
) Tab
order by 2
;
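As a quick check of the UNION ALL approach, here is a small sketch that loads the seven sample rows from the question into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in engine here (it has no GROUPING SETS), the helper name level_query is hypothetical, and the * 1.0 is my addition because SQLite would otherwise divide integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (customer_id INTEGER, purchase_date TEXT, product TEXT, "
    "category TEXT, department TEXT, quantity INTEGER, store_id TEXT)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "2020-10-01", "Kit Kat", "Candy", "Food", 2, "store_A"),
        (1, "2020-10-01", "Snickers", "Candy", "Food", 1, "store_A"),
        (1, "2020-10-01", "Snickers", "Candy", "Food", 1, "store_A"),
        (2, "2020-10-01", "Snickers", "Candy", "Food", 2, "store_A"),
        (2, "2020-10-01", "Baguette", "Bread", "Food", 5, "store_A"),
        (2, "2020-10-01", "iPhone", "Cell phones", "Electronics", 2, "store_A"),
        (3, "2020-10-01", "Sony PS5", "Games", "Electronics", 1, "store_A"),
    ],
)


def level_query(column):
    # column is one of the three fixed names below, never user input
    return (
        f"SELECT store_id, '{column}' AS average_level_type, {column} AS id, "
        "SUM(quantity) * 1.0 / COUNT(DISTINCT customer_id) AS average "
        f"FROM transactions WHERE quantity > 0 GROUP BY store_id, {column}"
    )


# One SELECT per aggregation level, glued together with UNION ALL
sql = " UNION ALL ".join(level_query(c) for c in ("product", "category", "department"))
averages = {(level, item): avg for _store, level, item, avg in conn.execute(sql)}
for key, avg in sorted(averages.items()):
    print(key, avg)
```

Note that this scans the table once per level, which is the trade-off against a grouping sets query on an engine that supports it.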

SQL Query to get sums among multiple payments which are greater than or less than 10k

I am trying to write a query to get sums of payments from accounts for a month. I have been able to get most of the way there, but I have hit a roadblock. My challenge is that I need a count of the payment totals that are either < 10000 or >= 10000. The business rules are that a single payment may not exceed 10000, but multiple payments can add up to more than 10000. As a simple mock database it might look like
ID | AccountNo | Payment
1 | 1 | 5000
2 | 1 | 6000
3 | 2 | 5000
4 | 3 | 9000
5 | 3 | 5000
So the results I would expect would be something like
NumberOfPaymentsBelow10K | NumberOfPayments10K+
1 | 2
I would like to avoid doing a function or stored procedure and would prefer a sub query.
Any help with this query would be greatly appreciated!
I suggest avoiding sub-queries where you can, because they can hurt performance, especially with a large amount of data. You can use a common table expression (CTE) instead:
;WITH CTE
AS
(
SELECT AccountNo, SUM(Payment) AS TotalPayment
FROM Payments
GROUP BY AccountNo
)
SELECT
SUM(CASE WHEN TotalPayment < 10000 THEN 1 ELSE 0 END) AS 'NumberOfPaymentsBelow10K',
SUM(CASE WHEN TotalPayment >= 10000 THEN 1 ELSE 0 END) AS 'NumberOfPayments10K+'
FROM CTE
You can get the totals per account using SUM and GROUP BY...
SELECT AccountNo, SUM(Payment) AS TotPay
FROM payments
GROUP BY AccountNo
You can use that result to count the accounts that reach 10000
SELECT COUNT(*)
FROM (
SELECT AccountNo, SUM(Payment) AS TotPay
FROM payments
GROUP BY AccountNo
) t
WHERE TotPay >= 10000
You can get the number at or above and the number below in a single query if you want, but that's a bit more complicated:
SELECT
COUNT(CASE WHEN TotPay <  10000 THEN 1 END) AS Below10K,
COUNT(CASE WHEN TotPay >= 10000 THEN 1 END) AS Above10K
FROM (
SELECT AccountNo, SUM(Payment) AS TotPay
FROM payments
GROUP BY AccountNo
) t
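For what it's worth, here is a small sketch that runs the two-step idea (sum per account, then count the sums on each side of 10000) against the five mock rows from the question. It uses Python's sqlite3 module purely as a convenient test bed; the variable names are mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Payments (ID INTEGER, AccountNo INTEGER, Payment INTEGER)")
conn.executemany(
    "INSERT INTO Payments VALUES (?, ?, ?)",
    [(1, 1, 5000), (2, 1, 6000), (3, 2, 5000), (4, 3, 9000), (5, 3, 5000)],
)

below_10k, at_least_10k = conn.execute(
    """
    SELECT SUM(CASE WHEN TotalPayment <  10000 THEN 1 ELSE 0 END),
           SUM(CASE WHEN TotalPayment >= 10000 THEN 1 ELSE 0 END)
    FROM (
        -- inner step: one row per account with its summed payments
        SELECT AccountNo, SUM(Payment) AS TotalPayment
        FROM Payments
        GROUP BY AccountNo
    ) AS t
    """
).fetchone()
print(below_10k, at_least_10k)  # 1 2
```

Accounts 1 and 3 total 11000 and 14000, account 2 totals 5000, which matches the expected output in the question.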

SQL Get percent of bad records from total

I am relatively new to SQL. Each employee accesses an account for testing with a tech. Sometimes it's a good attempt and sometimes it's bad, so I mostly need to calculate the percentage of bad attempts. My report should look something like this:
SELECT
employee, event, total, percentage
FROM my_table
employee | event | total | percentage|
user1 | good | 50 | 50% |
user1 | bad | 50 | 50% |
Calculate the total in a subquery and then JOIN to calculate percentage on each row.
SELECT m.employee, m.event, COUNT(*) AS total, COUNT(*) * 100.0 / t.total AS percentage
FROM my_table m
JOIN (SELECT employee, COUNT(*) AS total
FROM my_table
GROUP BY employee) t
ON m.employee = t.employee
GROUP BY m.employee, m.event, t.total
Try something like this to calculate the bad event percentage for each employee:
select employee, sum(case when event = 'bad' then 1 else 0 end) * 100.0 / count(*) as bad_percentage
From Yourtable
Group by employee
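One thing to watch with this kind of percentage is integer division. The sketch below uses Python's sqlite3 module and invented rows to compare dividing two integer aggregates first against multiplying by 100.0 first. SQLite divides integers the way SQL Server and PostgreSQL do; other engines (MySQL, for example) return a decimal from /, so treat this as an illustration, not a statement about your database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (employee TEXT, event TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [
        # made-up attempts: user1 is 2 of 4 bad, user2 is 1 of 4 bad
        ("user1", "good"), ("user1", "good"), ("user1", "bad"), ("user1", "bad"),
        ("user2", "good"), ("user2", "good"), ("user2", "good"), ("user2", "bad"),
    ],
)

rows = conn.execute(
    """
    SELECT employee,
           -- integer / integer truncates to 0 before the * 100
           SUM(CASE WHEN event = 'bad' THEN 1 ELSE 0 END) / COUNT(*) * 100 AS truncated,
           -- multiplying by 100.0 first forces floating-point division
           SUM(CASE WHEN event = 'bad' THEN 1 ELSE 0 END) * 100.0 / COUNT(*) AS bad_percentage
    FROM my_table
    GROUP BY employee
    ORDER BY employee
    """
).fetchall()
print(rows)
```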

Build customers report of last years with T-SQL

I've 3 tables (simplified):
-----------Orders--------------------
Id | Total_Price | Customer_Id | Date
--------Order Details---------------------
Id | Order_Id | Product Name | Qty | Value
----Customers------
Id | Name | Address
I take a total order value of single customer with this query:
SELECT C.ID, C.NAME , SUM(O.TOTAL_PRICE)
FROM CUSTOMERS C
JOIN ORDERS O ON O.CUSTOMER_ID = C.ID
GROUP BY C.ID, C.NAME
Now, I want to build a report with total order value filtered by a range of dates:
SELECT C.ID, C.NAME , SUM(O.TOTAL_PRICE)
FROM CUSTOMERS C
JOIN ORDERS O ON O.CUSTOMER_ID = C.ID
WHERE O.DATE BETWEEN @value1 AND @value2
GROUP BY C.ID, C.NAME
This works OK, but I want to select the sums of total order value for each of the last 3 years, grouped by customer. These are the results that I want:
1Year | 2Year | 3Year | Customer_Name
-------------------------------------------------
XXX | YYY | ZZZZ | Customer1
XYX | YYZ | ZZTZ | Customer2
....
I've this cardinality:
Customer table with 22.000 rows
Orders table with 87.000 rows
Orders details with 600.000
Is this possible without a temp table, a table variable, or a stored procedure with a long execution time?
In my report I also want to calculate the total Qty of a product over the last 3 years, grouped by customer, but that is the next step.
Any ideas?
Thanks
You can use a CASE expression to get the result you want. Since there is some ambiguity in your post about how the year ranges are defined, I've left out any calculations of the year starts and ends and just put variables in. You can revise to suit your needs.
SELECT C.ID
,C.NAME
,SUM(CASE
WHEN o.DATE BETWEEN @year1start
AND @year1end
THEN O.TOTAL_PRICE
ELSE 0
END) Year1
,SUM(CASE
WHEN o.DATE BETWEEN @year2start
AND @year2end
THEN O.TOTAL_PRICE
ELSE 0
END) Year2
,SUM(CASE
WHEN o.DATE BETWEEN @year3start
AND @year3end
THEN O.TOTAL_PRICE
ELSE 0
END) Year3
FROM CUSTOMERS C
INNER JOIN ORDERS O ON O.CUSTOMER_ID = C.ID
GROUP BY C.ID
,C.NAME
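Here is a rough, runnable sketch of the same per-year CASE idea, using Python's sqlite3 module with a cut-down schema. The customers, orders, and the three calendar years are all assumptions made up for the demo, and the year bounds are passed as bind parameters in place of the variables above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Customers (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Orders (Id INTEGER PRIMARY KEY, Total_Price INTEGER,
                         Customer_Id INTEGER, Date TEXT);
    INSERT INTO Customers VALUES (1, 'Customer1'), (2, 'Customer2');
    INSERT INTO Orders VALUES
        (1, 100, 1, '2013-03-01'),
        (2,  50, 1, '2013-07-15'),
        (3, 200, 1, '2014-01-10'),
        (4,  70, 2, '2015-05-05');
    """
)

# One (start, end) pair per year column; ISO date strings compare correctly as text.
bounds = [
    "2013-01-01", "2013-12-31",
    "2014-01-01", "2014-12-31",
    "2015-01-01", "2015-12-31",
]

report = conn.execute(
    """
    SELECT c.Id, c.Name,
           SUM(CASE WHEN o.Date BETWEEN ? AND ? THEN o.Total_Price ELSE 0 END) AS Year1,
           SUM(CASE WHEN o.Date BETWEEN ? AND ? THEN o.Total_Price ELSE 0 END) AS Year2,
           SUM(CASE WHEN o.Date BETWEEN ? AND ? THEN o.Total_Price ELSE 0 END) AS Year3
    FROM Customers c
    INNER JOIN Orders o ON o.Customer_Id = c.Id
    GROUP BY c.Id, c.Name
    ORDER BY c.Id
    """,
    bounds,
).fetchall()
print(report)
```

Because of the INNER JOIN, customers with no orders at all do not appear; a LEFT JOIN from Customers would keep them with zero sums.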
Another option is to use a pivot statement. I assume each of your date ranges is one calendar year (e.g. 2013, 2014 and so on).
If the years are fixed, pivot works, though it isn't a very pretty option (see the full sqlfiddle example, which also has a possible solution for your additional question):
select
c.Id, c.Name, c.Address, CostByYear.[2013], CostByYear.[2014], CostByYear.[2015]
from Customers c
left join (
select
pt.Customer_Id, isnull(pt.[2013], 0) as [2013],
isnull(pt.[2014], 0) as [2014], isnull(pt.[2015], 0) as [2015]
from (
select
o.Customer_Id, year(o.Date) [Year], sum(o.Total_Price) [TotalCost]
from Orders o
group by
o.Customer_Id, year(o.Date)
) src
pivot (
sum(TotalCost) for [Year] in ([2013], [2014], [2015])
) pt
) CostByYear on
c.Id = CostByYear.Customer_Id
order by
c.Name
You can also build either approach (mine and the previous answer's) as a dynamically created query if the year ranges aren't known in advance.

Multiple counts with different conditions

I want to retrieve and display on one row, the number of sales made by an employee followed by the total number of sales.
SELECT COUNT(SalesID) AS SalesForEmployee, COUNT(SalesID) AS TotalSales
FROM Sales
WHERE EmployeeID = 123
How do I make it so that the where clause only applies to the first column in the select?
SELECT
sum(case when EmployeeID = 123 then 1 else 0 end) AS SalesForEmployee
,COUNT(SalesID) AS TotalSales
FROM Sales
select count(SalesEmp.SalesID) AS SalesForEmployee, count(Sales.salesID) As TotalSales
from Sales left outer join Sales as SalesEmp
on Sales.salesID=SalesEmp.SalesID
and SalesEmp.EmployeeID = 123
You can't have a where that only applies to one column.
In order to get both counts while only scanning the table once, you can do this:
select
sum(case when EmployeeID=123 then 1 else 0 end) as SalesForEmployee,
Count(SalesID) as TotalSales
from Sales
There's no where clause because Count(SalesID) needs to count every row to give you the total count.
Since you have to look at every row, case when EmployeeID=123 then 1 else 0 end gives you a 1 for each row that belongs to the target employee and a 0 for every row that doesn't. Therefore, summing that expression gives you the count only for that employee.
SalesID EmployeeID (case when ... )
1 123 1
2 311 0
3 333 0
4 123 1
5 300 0
count = 5 sum = 2
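The little table above can be reproduced directly. This is just a sketch using Python's sqlite3 module with the same five rows; the employee ID is passed as a bind parameter, which is my choice and not part of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (SalesID INTEGER PRIMARY KEY, EmployeeID INTEGER)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?)",
    [(1, 123), (2, 311), (3, 333), (4, 123), (5, 300)],
)

sales_for_employee, total_sales = conn.execute(
    """
    SELECT SUM(CASE WHEN EmployeeID = ? THEN 1 ELSE 0 END) AS SalesForEmployee,
           COUNT(SalesID) AS TotalSales
    FROM Sales
    """,
    (123,),
).fetchone()
print(sales_for_employee, total_sales)  # 2 5
```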
You could also do it like this:
select
(select count(SalesID) from Sales where EmployeeID=123) as SalesForEmployee,
(select count(SalesID) from Sales) as TotalSales
But now you are scanning the Sales table twice, which will be slower.
This is a nested subquery approach combining two select statements into one.
SELECT
(SELECT COUNT(SalesID) FROM Sales WHERE EmployeeID = 123) AS SalesForEmployee,
(SELECT COUNT(SalesID) FROM Sales) AS TotalSales
Another way to write it would be like this.
SELECT COUNT(SalesID) AS SalesForEmployee,
(SELECT COUNT(SalesID) FROM Sales) AS TotalSales
FROM Sales WHERE EmployeeID = 123
With a correlated subquery, you can link the outer query with the inner query. Say you wanted to get the Total sales not including the sales of EmployeeID 123
SELECT COUNT(SalesEmployee.SalesID) AS SalesForEmployee,
(SELECT COUNT(SalesID) FROM Sales WHERE Sales.EmployeeID <> SalesEmployee.EmployeeID) AS TotalSales
FROM Sales As SalesEmployee WHERE SalesEmployee.EmployeeID = 123
GROUP BY SalesEmployee.EmployeeID
Here the inner query references the outer query's EmployeeID in its WHERE clause to filter those sales out.