How to count rows by group in a SQL query


I have a SQL query that returns data which includes quote types along with customer information. There are 4 types of quotes (Open, Dead, Requote, Project). I want to count each type for each customer, and also count the total per customer. I have not found a way to accomplish this.
Ultimately I want this to go into a gauge on an SSRS report so that we can tell how many of the quotes (percentage) eventually turn into a project.
I haven't found anything in my searches that works. Thanks in advance for any advice.

Using CTEs:
WITH quoteTotal as (
Select customer, count(*) as customer_total
from customer
group by customer
),
typeQuote as (
Select customer, quote_type, count(*) as quote_total
from customer
group by customer, quote_type
)
SELECT T.customer, T.quote_type, T.quote_total, Q.customer_total
FROM typeQuote T
INNER JOIN quoteTotal Q
ON T.customer = Q.customer

I think window functions make this easy:
SELECT DISTINCT
customer,
quote_type,
COUNT(*) OVER (partition by customer, quote_type) as type_total,
COUNT(*) OVER (partition by customer) as customer_total
FROM customers

For the data as below:
I have executed the query below
select CustomerEntityId,QuoteType,
(select count(QuoteType) from Customers c
where c.QuoteType=Customers.QuoteType
and c.CustomerEntityId=Customers.CustomerEntityId
group by CustomerEntityId,QuoteType) as QuoteTypeTotal,
(select count(*) from Customers c1
where c1.CustomerEntityId=Customers.CustomerEntityId
group by CustomerEntityId) as CustomerTotal
from Customers
Group By CustomerEntityId,QuoteType
Order By CustomerEntityId
The result is as below :
I hope it helps.
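
Since the end goal is a percentage for an SSRS gauge, conditional aggregation may be another option: it returns one row per customer with a count per quote type, the total, and the share of quotes that became projects. Below is a minimal sketch, run through Python's sqlite3 module only so that it is self-contained. The table name quotes, its columns, and the sample rows are assumptions made up for illustration; they are not the asker's actual schema.

```python
import sqlite3

# Hypothetical schema and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE quotes (customer TEXT, quote_type TEXT)")
conn.executemany(
    "INSERT INTO quotes VALUES (?, ?)",
    [
        ("Acme", "Open"), ("Acme", "Project"), ("Acme", "Project"), ("Acme", "Dead"),
        ("Birch", "Requote"), ("Birch", "Open"),
    ],
)

# One row per customer: a count per quote type, the total, and the
# percentage of that customer's quotes that turned into a project.
rows = conn.execute(
    """
    SELECT customer,
           SUM(CASE WHEN quote_type = 'Open'    THEN 1 ELSE 0 END) AS open_count,
           SUM(CASE WHEN quote_type = 'Dead'    THEN 1 ELSE 0 END) AS dead_count,
           SUM(CASE WHEN quote_type = 'Requote' THEN 1 ELSE 0 END) AS requote_count,
           SUM(CASE WHEN quote_type = 'Project' THEN 1 ELSE 0 END) AS project_count,
           COUNT(*) AS total_count,
           100.0 * SUM(CASE WHEN quote_type = 'Project' THEN 1 ELSE 0 END) / COUNT(*) AS project_pct
    FROM quotes
    GROUP BY customer
    ORDER BY customer
    """
).fetchall()

for row in rows:
    print(row)
```

The SELECT itself uses only CASE, SUM and COUNT, so it should carry over to SQL Server with little or no change; multiplying by 100.0 rather than 100 avoids integer division.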

Related

SQL How to select customers with highest transaction amount by state

I am trying to write a SQL query that returns the name and purchase amount of the five customers in each state who have spent the most money.
Table schemas
customers
|_state
|_customer_id
|_customer_name
transactions
|_customer_id
|_transact_amt
Attempts look something like this
SELECT state, Sum(transact_amt) AS HighestSum
FROM (
SELECT name, transactions.transact_amt, SUM(transactions.transact_amt) AS HighestSum
FROM customers
INNER JOIN customers ON transactions.customer_id = customers.customer_id
GROUP BY state
) Q
GROUP BY transact_amt
ORDER BY HighestSum
I'm lost. Thank you.
Expected results are the names of customers with the top 5 highest transactions in each state.
ERROR: table name "customers" specified more than once
SQL state: 42712

First, you need for your JOIN to be correct. Second, you want to use window functions:
SELECT ct.*
FROM (SELECT c.customer_id, c.customer_name, c.state, SUM(t.transact_amt) AS total,
ROW_NUMBER() OVER (PARTITION BY c.state ORDER BY SUM(t.transact_amt) DESC) as seqnum
FROM customers c JOIN
transactions t
ON t.customer_id = c.customer_id
GROUP BY c.customer_id, c.customer_name, c.state
) ct
WHERE seqnum <= 5;
You seem to have several issues with SQL. I would start with understanding aggregation functions. You have a SUM() with the alias HighestSum. It is simply the total per customer.

You can get them using aggregation and then by using the RANK() window function. For example:
select
state,
rk,
customer_name
from (
select
*,
rank() over(partition by state order by total desc) as rk
from (
select
c.customer_id,
c.customer_name,
c.state,
sum(t.transact_amt) as total
from customers c
join transactions t on t.customer_id = c.customer_id
group by c.customer_id, c.customer_name, c.state
) x
) y
where rk <= 5
order by state, rk

There are two valid answers already. Here's a third:
SELECT *
FROM (
SELECT c.state, c.customer_name, t.*
, row_number() OVER (PARTITION BY c.state ORDER BY t.transact_sum DESC NULLS LAST, customer_id) AS rn
FROM (
SELECT customer_id, sum(transact_amt) AS transact_sum
FROM transactions
GROUP BY customer_id
) t
JOIN customers c USING (customer_id)
) sub
WHERE rn < 6
ORDER BY state, rn;
Major points
When aggregating all or most rows of a big table, it's typically substantially faster to aggregate before the join. Assuming referential integrity (FK constraints), we won't be aggregating rows that would be filtered otherwise. This might change from nice-to-have to a pure necessity when joining to more aggregated tables. Related:
Why does the following join increase the query time significantly?
Two SQL LEFT JOINS produce incorrect result
Add additional ORDER BY item(s) in the window function to define which rows to pick from ties. In my example, it's simply customer_id. If you have no tiebreaker, results are arbitrary in case of a tie, which may be OK. But every other execution might return different results, which typically is a problem. Or you include all ties in the result. Then we are back to rank() instead of row_number(). See:
PostgreSQL equivalent for TOP n WITH TIES: LIMIT "with ties"?
As long as transact_amt can be NULL (this has not been ruled out), any sum may end up being NULL as well. With an unsuspecting ORDER BY t.transact_sum DESC those customers come out on top, because NULL sorts first in descending order. Use DESC NULLS LAST to avoid this pitfall. (Or define the column transact_amt as NOT NULL.)
PostgreSQL sort by datetime asc, null first?
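
The point about ties can be seen on a tiny data set. The sketch below is an illustration only: the table totals and its three rows are made up, and it runs on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later) rather than on PostgreSQL. It compares row_number() with a customer_id tiebreaker against rank() without one.

```python
import sqlite3

# Hypothetical pre-aggregated totals; customers 1 and 2 are tied.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE totals (state TEXT, customer_id INTEGER, total INTEGER)")
conn.executemany(
    "INSERT INTO totals VALUES (?, ?, ?)",
    [("TX", 1, 100), ("TX", 2, 100), ("TX", 3, 50)],
)

ranked = conn.execute(
    """
    SELECT customer_id,
           -- deterministic: the tie is broken by customer_id
           row_number() OVER (PARTITION BY state ORDER BY total DESC, customer_id) AS rn,
           -- tied rows share a rank, and the next rank is skipped
           rank()       OVER (PARTITION BY state ORDER BY total DESC) AS rk
    FROM totals
    ORDER BY customer_id
    """
).fetchall()

for row in ranked:
    print(row)
```

With a filter such as rn <= 1 only customer 1 is returned, while rk <= 1 returns both tied customers, which is the trade-off described above.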

SQL: Take 1 value per grouping

I have a very simplified table / view like below to illustrate the issue:
The stock column represents the current stock quantity of the style at the retailer. The reason the stock column is included is to avoid joins for reporting. (the table is created for reporting only)
I want to query the table to get what is currently in stock, grouped by stylenumber (across retailers). Like:
select stylenumber,sum(sold) as sold,Max(stock) as stockcount
from MGTest
I expect to get Stylenumber, Total Sold, Most Recent Stock Total:
A, 6, 15
B, 1, 6
But using ...Max(Stock) I get 10, and with (Sum) I get 25....
I have tried with over(partition.....) also without any luck...
How do I solve this?

I would answer this using window functions:
SELECT Stylenumber, Date, TotalStock
FROM (SELECT M.Stylenumber, M.Date, SUM(M.Stock) as TotalStock,
ROW_NUMBER() OVER (PARTITION BY M.Stylenumber ORDER BY M.Date DESC) as seqnum
FROM MGTest M
GROUP BY M.Stylenumber, M.Date
) m
WHERE seqnum = 1;

The query is a bit tricky since you want a cumulative total of the Sold column, but only the total of the Stock column for the most recent date. I didn't actually try running this, but something like the query below should work. However, because of the shape of your schema this isn't the most performant query in the world since it is scanning your table multiple times to join all of the data together:
SELECT MDate.Stylenumber, MDate.TotalSold, MStock.TotalStock
FROM (SELECT M.Stylenumber, MAX(M.Date) MostRecentDate, SUM(M.Sold) TotalSold
FROM [MGTest] M
GROUP BY M.Stylenumber) MDate
INNER JOIN (SELECT M.Stylenumber, M.Date, SUM(M.Stock) TotalStock
FROM [MGTest] M
GROUP BY M.Stylenumber, M.Date) MStock ON MDate.Stylenumber = MStock.Stylenumber AND MDate.MostRecentDate = MStock.Date

If you don't want to use joins, you can do something like this:
SELECT B.Stylenumber,SUM(B.Sold),SUM(B.Stock) FROM
(SELECT Stylenumber AS 'Stylenumber',SUM(Sold) AS 'Sold',MAX(Stock) AS 'Stock'
FROM MGTest A
GROUP BY RetailerId,Stylenumber) B
GROUP BY B.Stylenumber

My solution, like Gordon Linoff's, uses window functions. In my case, though, everything revolves around the RANK window function.
SELECT stylenumber, sold, SUM(stock) totalstock
FROM (
SELECT
stylenumber,
SUM(sold) OVER(PARTITION BY stylenumber) sold,
RANK() OVER(PARTITION BY stylenumber ORDER BY [Date] DESC) r,
stock
FROM MGTest
) T
WHERE r = 1
GROUP BY stylenumber, sold
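
To check the "total sold, plus stock on the most recent date" logic against the numbers in the question, here is a minimal sketch. The original sample table is not reproduced on this page, so the rows below are hypothetical: they were chosen only so that style A has MAX(stock) = 10 and SUM(stock) = 25, as the question reports. The column names RetailerId, Stylenumber, Date, Sold and Stock are taken from the answers above. The query follows the join-on-latest-date idea and runs on SQLite through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE MGTest (RetailerId INTEGER, Stylenumber TEXT, "Date" TEXT, '
    "Sold INTEGER, Stock INTEGER)"
)
# Hypothetical rows, not the asker's data.
conn.executemany(
    "INSERT INTO MGTest VALUES (?, ?, ?, ?, ?)",
    [
        (1, "A", "2016-01-01", 3, 10),
        (1, "A", "2016-01-02", 2, 10),
        (2, "A", "2016-01-02", 1, 5),
        (1, "B", "2016-01-02", 1, 6),
    ],
)

result = conn.execute(
    """
    SELECT s.Stylenumber, s.TotalSold, SUM(m.Stock) AS TotalStock
    FROM (
        -- per style: everything sold so far, and the latest date seen
        SELECT Stylenumber, MAX("Date") AS MostRecentDate, SUM(Sold) AS TotalSold
        FROM MGTest
        GROUP BY Stylenumber
    ) AS s
    -- only rows from the latest date contribute to the stock total
    JOIN MGTest AS m
      ON m.Stylenumber = s.Stylenumber AND m."Date" = s.MostRecentDate
    GROUP BY s.Stylenumber, s.TotalSold
    ORDER BY s.Stylenumber
    """
).fetchall()

for row in result:
    print(row)
```

On these rows the query yields the A, 6, 15 and B, 1, 6 lines the question expects. Note that it assumes every retailer has a row on the style's latest date; a retailer whose last row is older would be left out of the stock total.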

SQL should I use a subquery?

I need help on a course question for my school. So I am supposed to get two tables Seller & Item, and I need to return the most active seller based on the most items offered. I have the tables as links below.
How could I just return one record with the sellers ID# and Name? Do I need to do a subquery? Thank you so much in advance.

Actually this is a way to accomplish it via subquery. I don't know that any teacher would anticipate students using >= all though:
select s.sellerid, min(s.name) as name
from seller s inner join item i on i.sellerid = s.sellerid
group by s.sellerid
having count(*) >= all (
select count(*)
from item
group by sellerid
)
You can also do it doubly nested without even needing aliases!
select * from seller where sellerid in
(
select sellerid from item group by sellerid
having count(*) >= all (select count(*) from item group by sellerid)
)
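
Not every engine supports >= ALL (SQLite, for example, does not). A variant that swaps it for a comparison against MAX over the per-seller counts is sketched below. This is a different technique from the answer above, though it should return the same sellers. The table and column names seller, item, sellerid and name follow the queries above; the itemid column and the sample rows are assumptions made for illustration. It runs through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE seller (sellerid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE item (itemid INTEGER PRIMARY KEY, sellerid INTEGER);
    -- hypothetical data: Ann offers two items, Bob offers one
    INSERT INTO seller VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO item VALUES (10, 1), (11, 1), (12, 2);
    """
)

most_active = conn.execute(
    """
    SELECT s.sellerid, s.name
    FROM seller s
    JOIN item i ON i.sellerid = s.sellerid
    GROUP BY s.sellerid, s.name
    -- keep only sellers whose item count equals the highest count
    HAVING COUNT(*) = (
        SELECT MAX(cnt)
        FROM (SELECT COUNT(*) AS cnt FROM item GROUP BY sellerid) AS per_seller
    )
    """
).fetchall()

print(most_active)
```

If two sellers share the highest count, both are returned, just as with the >= ALL version.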

A simple subquery in SQL

I have a table ORDERS with columns NAME and AMOUNT.
I need to get a NAME and total AMOUNT of each product.
I have this solution:
select PRODUCT_NAME, SUM(AMOUNT) from ORDERS GROUP BY PRODUCT_NAME;
But I don't use any subqueries to achieve that, and the lesson I'm going through is about subqueries. Maybe I'm wrong about this solution?

It's a simple aggregation query with no need for a subquery:
SELECT name, SUM(AMOUNT)
FROM Orders
GROUP BY name

This solution is perfectly fine:
select name, sum(amount)
from Orders
group by name
But, if you must use subquery, use this:
select o2.name,
(select sum(amount)
from Orders o1
where o1.name = o2.name) as total
from
(select distinct name
from Orders) o2

MS-Access: HAVING clause not returning any records

I have a Select query to extract Customer Names and Purchase Dates from a table. My goal is to select only those names and dates for customers who have ordered on more than one distinct date. My code is as follows:
SELECT Customer, PurchDate
FROM (SELECT DISTINCT PurchDate, Customer
FROM (SELECT CDate(FORMAT(DateAdd("h",-7,Mid([purchase-date],1,10)+""+Mid([purchase-date],12,8)), "Short Date")) AS PurchDate,
[buyer-name] AS Customer
FROM RawImport
WHERE sku ALIKE "%RE%"))
GROUP BY Customer, PurchDate
HAVING COUNT(PurchDate)>1
ORDER BY PurchDate
This returns no results, even though there are many customers with more than one Purchase Date. The inner two Selects work perfectly and return a set of distinct dates for each customer, so I believe there is some problem in my GROUP/HAVING/ORDER clauses.
Thanks in advance for any help!

In the inner select you are doing
SELECT DISTINCT PurchDate, Customer
and in the outer select
GROUP BY Customer, PurchDate
That means every group contains exactly one row, so all of them have
having count(*) = 1
and none can pass HAVING COUNT(PurchDate)>1.
I can't give you the exact syntax for Access, but you need something like this.
I will use YourTable as a stand-in for your inner derived table to make it easier to read:
SELECT DISTINCT Customer, PurchDate
FROM YourTable
WHERE Customer IN (
SELECT Customer
FROM (SELECT DISTINCT Customer, PurchDate
FROM YourTable)
GROUP BY Customer
HAVING COUNT(*) > 1
)
The inner select gives you the customers who ordered on more than one day.
The outer select returns those customers together with all of their purchase dates.
Or maybe you can try something simpler to get the list of customers who bought on more than one day, like this:
SELECT [buyer-name]
FROM RawImport
WHERE sku ALIKE "%RE%"
GROUP BY [buyer-name]
HAVING Format(MAX([purchase-date]), "DD/MM/YYYY") <>
Format(MIN([purchase-date]), "DD/MM/YYYY")
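
The effect of grouping by both columns after a DISTINCT can be reproduced outside Access. The sketch below uses SQLite through Python's sqlite3 module, with a made-up table purchases standing in for the cleaned-up inner query; the table name and rows are assumptions for illustration, and Access syntax may differ in details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (Customer TEXT, PurchDate TEXT)")
# Hypothetical rows: Ann bought on two distinct dates (one date twice), Bob on one.
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [
        ("Ann", "2020-01-01"), ("Ann", "2020-01-01"), ("Ann", "2020-01-05"),
        ("Bob", "2020-01-02"),
    ],
)

# Shape of the original query: after DISTINCT, every (Customer, PurchDate)
# group holds exactly one row, so COUNT(...) > 1 is never true.
broken = conn.execute(
    """
    SELECT Customer, PurchDate
    FROM (SELECT DISTINCT PurchDate, Customer FROM purchases) AS d
    GROUP BY Customer, PurchDate
    HAVING COUNT(PurchDate) > 1
    """
).fetchall()

# Shape of the suggested fix: count distinct dates per customer first,
# then list all dates for the customers that qualify.
fixed = conn.execute(
    """
    SELECT DISTINCT Customer, PurchDate
    FROM purchases
    WHERE Customer IN (
        SELECT Customer
        FROM (SELECT DISTINCT Customer, PurchDate FROM purchases) AS d
        GROUP BY Customer
        HAVING COUNT(*) > 1
    )
    ORDER BY Customer, PurchDate
    """
).fetchall()

print(broken)
print(fixed)
```

The first query comes back empty, matching the behaviour described in the question, while the second lists both of Ann's purchase dates and leaves Bob out.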
