Grouping in SQL using CASE Statements - sql

Hello I am trying to group multiple customer orders into buckets in SQL, the output should look something like it does below. Do I have to use a case statement to group them?
Table1 looks like:
CustomerID
Order_date
1
somedate
2
somedate
3
somedate
2
somedate
Edit: # of customers meaning if CustomerID 2 had 2 orders he/she would be of the in the bucket of #of orders of 2.
Output should be something like this?
# of Customers
# of Orders
2
1
1
2
My code so far is:
select count(*) CustomerID
FROM Table1
GROUP BY CustomerID;

Use a double aggregation:
SELECT COUNT(*) AS num_customers, cnt AS num_orders
FROM
(
SELECT CustomerID, COUNT(*) AS cnt
FROM Table1
GROUP BY CustomerID
) t
GROUP BY cnt;
The inner subquery finds the number of orders for each customer. The outer query then aggregates by number of orders and finds out the number of customers having each number of orders.

If you want to sort your tables and your users depending on the number of orders they made, this query should work:
SELECT CustomerID, COUNT(CustomerID) as NbOrder
FROM Table1
GROUP BY(NbOrder)

I believe what you want to do is get the count of orders by customer, first, via aggregation. Then get the count of customers by order count from that query.
SELECT count(*) as count_of_customers, count_of_orders
FROM
(
SELECT customerid, count(*) as count_of_orders
FROM your_table
GROUP BY customerid
) sub
GROUP BY count_of_orders
ORDER BY count_of_orders

Related

SQL select with GROUP BY and COUNT?

Let's say I have three tables to manage purchases from my online shop:
Products: with columns ID, Name, Price
Customers: with columns ID, Name
Purchases: with columns ProductID, CustomerID, PurchaseDate
Now, how would I go about retrieving products purchased by more than N distinct customers?
I've tried the following on SQL Server 2019 Trial Edition but I'm getting a syntax error on COUNT.
SELECT ProductID, CustomerID, COUNT(*) as C
FROM Purchases
GROUP BY ProductID, CustomerID
HAVING C > 100
ORDER BY C DESC
Better still, how would I go about retrieving products purchased by more than N distinct customers over a 30-day period?
Thanks for any help and/or pointers.
With your current query, you are just counting how often each customer bought each product, because you are grouping by the combination of productid and customerid. Furthermore you cannot reference to the column alias for the count in the HAVING or ORDER BY clause
Try this
declare #purchasedatelower datetime = dateadd(day, -30, getdate())
declare #purchasedateupper datetime = getdate()
declare #distinctcustomers int = 100
select productid, count(distinct customerid) as customercount
from purchases
where purchasedate between #purchasedatelower and #purchasedateupper
group by productid
having count (distinct customerid) >= #distinctcustomers
order by count(distinct customerid) desc
This will return all products, which have been bought by at least 100 distinct customers, in the last 30 days, together with the distinct number of customers.
Try to order them by COUNT()
SELECT ProductID, CustomerID
FROM Purchases
GROUP BY ProductID, CustomerID
HAVING COUNT(*) > 100
ORDER BY COUNT(*) DESC
Now, how would I go about retrieving products purchased by more than N distinct customers?
You would use COUNT(DISTINCT):
SELECT ProductID, COUNT(DISTINCT CustomerID) as num_customers
FROM Purchases
GROUP BY ProductID
HAVING COUNT(DISTINCT CustomerID) > 100
ORDER BY COUNT(DISTINCT CustomerID) DESC;
If you have a particular period in mind, then add a WHERE clause before the GROUP BY.
DECLARE #Popularity int = 1
DECLARE #MonthsAgo int = 1
SELECT p.ProductId, p.ProductName, pc.CustomersDistinct
FROM (
-- Sub-query to compute the aggregate.
-- Count the number of distinct customers that purchased each product.
SELECT ProductId, COUNT(DISTINCT CustomerId) AS CustomersDistinct
FROM Purchases
WHERE PurchaseDate >= DATEADD(MONTH, -#MonthsAgo, CURRENT_TIMESTAMP)
GROUP BY ProductId
HAVING COUNT(DISTINCT CustomerId) > #Popularity
) AS pc
-- Now do JOINs to look up helpful values for the final SELECT list.
INNER JOIN Products p ON pc.ProductId = p.ProductId
ORDER BY 3 DESC

Select customer with highest price of all his orders

i want to list customer id with highest sum of price of his orders. Please see graph oí orders down.
SQL DEMO
SELECT *
FROM (
SELECT "customerid", SUM("price")
FROM Orders
GROUP BY "customerid"
ORDER BY SUM("price") DESC
) T
WHERE ROWNUM <= 1
You need to group by customer_id to get all prices of each customers.
Then sum these prices filter it with max( sum(price))or get first row by descending order of sum(price).
--for Oracle
select * from (Select c.name,c.id,sum(o.price) from Customer c
inner join order o on o.customer_id=c.id
group by c.name,c.id
order by sum(o.price)desc
)where rownum =1
--For sql server and mysql
Select top 1 c.name,c.id,sum(o.price) from Customer c
inner join order o on o.customer_id=c.id
group by c.name,c.id
order by sum(o.price)desc

SQL show orders where 2 values are distinct and match the first

I'm looking for a way to let me select all orders that have multiple distinct names within the same order-number, it looks like this:
order - name
111-Paul
112-Paula
113-John
113-John
113-Jessica
114-Eric
114-Eric
114-John
115-Zack
115-Zack
115-Zack
etc.
so that i would get all the orders that have 2 or more distinct names in it:
113-John
113-Jessica
114-Eric
114-John
with which I could do further queries but I'm stuck. Can anyone give me some hints on how to tackle this problem please? I've tried it with count(*) which looked like this:
select order, name, count(name) from dbo.orders
group by order, name
having count(name) > 1
which gave me all the orders which had more than 1 name in it but I don't know how to let it only show orders with the distinct names.
Here's one approach using exists:
select distinct [order], name
from orders o
where exists (
select 1
from orders o2
where o.[order] = o2.[order] and o.name != o2.name)
Fiddle Demo
I would use windows functions for this
For example:
select distinct order
from
(select
order,
row_number() over(partition by order, name order by order asc) as rn
) as t1
where rn > 1
you can do the same with count
count(*) over(partition by order,name order by order asc) as cnt
Here's a straight forward implementation in Sql Server:
select distinct *
from table1
where [order] in (
select [order]
from (select distinct * from table1) iq
group by [order]
having count(*) > 1)
It's essentially breaking down the problem into:
Finding the orders that have more than one distinct value.
Finding the pairs of distinct order - name that belong to the list previously calculated.
When you use HAVING COUNT(name) > 1, it is counting all of the rows in those groups, including duplicate rows (rows 113-John and 113-John are 2 rows for order 113). I would query distinct rows from your table, and then select from that:
SELECT [order], [name] FROM (
SELECT DISTINCT [order], [name] FROM dbo.orders
) A
GROUP BY [order], [name]
HAVING COUNT([name]) > 1
As a note, if a [name] is null, then it will not be counted with COUNT(name). If you want nulls to be counted, use COUNT(*) instead.
You can use count(distinct name) to get the number of unique names for each order:
select [order], count(distinct name)
from orders
group by [order]
To just get the order for those you can use having:
select [order]
from orders
group by [order]
having count(distinct name) > 1
To get the details for those orders you can put that in the where clause to just return the rows with order in that list:
select *
from orders
where [order] in (
select [order]
from orders
group by [order]
having count(distinct name) > 1
)
sqlfiddle
I would use RANK (or DENSE_RANK) for this as shown below.
SELECT [Order]
FROM (SELECT
[Order],
RANK() OVER(PARTITION BY [Order] ORDER BY Name) AS NameRank
FROM [StackOverflow].[dbo].[OrderAndName]) ranked
WHERE ranked.NameRank > 1
GROUP BY [Order]
The sub-query ranks (gives a seeding) to the names in an order according to their value. Names with the same value would have the same rank i.e. when an order has one name multiple times (like 115) the rank of all names would be 1.
The partition is important here as otherwise you would get the rank for all names for all orders which wouldn't give you the result you'd like.
It is then just a case of pulling out the orders that have a RANK greater than 1 and grouping (could use distinct if that's a preference).
You can then join to this table to get get the orders and names as follows;
SELECT oan.[Order], [Name]
FROM [StackOverflow].[dbo].[OrderAndName] oan
INNER JOIN (SELECT [Order]
FROM (SELECT [Order],
RANK() OVER(PARTITION BY [Order] ORDER BY Name) AS NameRank
FROM [StackOverflow].[dbo].[OrderAndName]) ranked
WHERE ranked.NameRank > 1
GROUP BY [Order]) twoOrMore ON oan.[Order] = twoOrMore.[Order]

Order by most frequent occurrence in SQL

I have a table PAYTB with transaction information. The table contains ACAUREQ_AUREQ_ENV_M_CMONNM which is the Common Merchant Name.
Now I want an output like this:
orders ACAUREQ_AUREQ_ENV_M_CMONNM
---------+---------+----------------
100 Antique Shop
30 Airleisure
23 Books
12 ....
How can I construct the "orders" column which is the count of all transaction with a certain common merchant name?
You need to group on ACAUREQ_AUREQ_ENV_M_CMONNM and find count of rows for each, then order by that count in descending order.
SELECT COUNT(*) orders,
ACAUREQ_AUREQ_ENV_M_CMONNM
FROM PAYTB
GROUP BY ACAUREQ_AUREQ_ENV_M_CMONNM
ORDER BY orders desc;
SELECT count(*) as orders , ACAUREQ_AUREQ_ENV_M_CMONNM
FROM PAYTB
GROUP BY ACAUREQ_AUREQ_ENV_M_CMONNM
ORDER BY 1 desc;
you want to use a group by:
select count(*) as orders, ACAUREQ_AUREQ_ENV_M_CMONNM
from PAYTB
group by ACAUREQ_AUREQ_ENV_M_CMONNM;
Order by count(*) desc to show descending counts in a group by expression.
SELECT count(*) as orders , ACAUREQ_AUREQ_ENV_M_CMONNM
FROM PAYTB
GROUP BY ACAUREQ_AUREQ_ENV_M_CMONNM
ORDER BY count(*) desc;

Sql query to select records for a single distinct customer id

I have written a query for sales by customers in groups it is as follows:
SELECT customerid,
SUM (salestax1)As total_salestax1,
SUM(total_payment_received) As total_payment_recieved,
COUNT (orderid)as order_qty,
SUM(paymentamount)As paymentamount
FROM Orders_74V94W6D22$
WHERE orderdate between '7/6/2011 16:35' and '2/3/2012 11:53'
GROUP BY customerid
but this query shows only 5 fields but I need to show following fields:
orderid billingcompanyname billingfirstname billinglastname
billingcountry shipcountry paymentamount creditcardtransactionid
orderdate creditcardauthorizationdate orderstatus
total_payment_received tax1_title salestax1
then how to deal with it?
you need to understand what GROUP BY means.
If you are grouping by customerId, you will have only one customer because all data is grouped into it.
How do you want to group by orderid and display the orderid on your result set? If you have 10 order ids, do you expect 10 rows on the result? If yes, fine, group by it but I don't think that's what you want
EDIT:
Well, this is NOT a good idea, your table structure is WRONG and I dont think you fully understand that a group by means, BUT I think this query will get your result:
SELECT customerid,
(select top 1 [column1] from Orders_74V94W6D22$ where customerid = ORD.customerid),
(select top 1 [column2] from Orders_74V94W6D22$ where customerid = ORD.customerid),
(select top 1 [column3] from Orders_74V94W6D22$ where customerid = ORD.customerid),
SUM (salestax1)As total_salestax1,
SUM(total_payment_received) As total_payment_recieved,
COUNT (orderid)as order_qty,
SUM(paymentamount)As paymentamount
FROM Orders_74V94W6D22$ ORD
WHERE orderdate between '7/6/2011 16:35' and '2/3/2012 11:53'
GROUP BY customerid
To select more about the customer, you need to use your query as a sub query, something like:
Select distinct c.[column1], c.[column2], c.[column3], tbl.*
From Orders_74V94W6D22$ c inner join (
SELECT customerid,
SUM (salestax1)As total_salestax1,
SUM(total_payment_received) As total_payment_recieved,
COUNT (orderid)as order_qty,
SUM(paymentamount)As paymentamount
FROM Orders_74V94W6D22$
WHERE orderdate between '7/6/2011 16:35' and '2/3/2012 11:53'
GROUP BY customerid
) as tbl on tbl.customerid = c.customerid
but you cant logically select something about 1 order as youve grouped multiple orders