I have a database for a pet shop, and I'm trying to get the most bought pets per customer. So if a customer bought 6 mice, 3 birds, 2 cats, 3 dogs, I'm trying to get the following:
Customer ID Animal Count
----------- ------ -----
1 mouse 6
1 bird 3
1 dog 3
However, in order to do that, I need to group by animal and customer ID, and create a row number for each record by count.
I have 3 tables:
Orders
Order contents (i.e., animals in the order)
Animal details (i.e., type of animal, cost, etc.)
Here is my query so far:
SELECT customer_id, animal, count(*) as cnt, row_number() over (order by count(*)) as seqnum
FROM [Order_Contents] cc
INNER JOIN [Animals] p on cc.animal_id = p.animal_id
INNER JOIN [Orders] o ON cc.order_id = o.order_id
WHERE customer_id = 1
GROUP BY animal, customer_id
ORDER BY customer_id, seqnum
Here is what I expect:
Customer ID Animal Count seqnum
----------- ------ ----- ------
1 mouse 6 1
1 bird 3 2
1 dog 3 3
1 cat 2 4
However, the sequential number isn't per customer, it's just sequential for the whole result set:
Customer ID Animal Count seqnum
----------- ------ ----- ------
1 mouse 6 98
1 bird 3 33
1 dog 3 36
1 cat 2 15
What am I doing wrong here? I need seqnum to be able to do "top 3" per customer later.
The challenge is getting the highest total for a customer with the details.
This returns the information:
SELECT o.customer_id, a.animal, COUNT(*) as cnt,
SUM(COUNT(*)) OVER (PARTITION BY o.customer_id) as customer_cnt
FROM Order_Contents cc INNER JOIN
Animals a
ON cc.animal_id = a.animal_id INNER JOIN
Orders o ON cc.order_id = o.order_id
WHERE customer_id = 1
GROUP BY animal, customer_id
ORDER BY customer_cnt DESC;
To get the details for the customer with the highest count, you can use the TOP WITH TIES trick:
SELECT TOP (1) WITH TIES ca.*
FROM (SELECT o.customer_id, a.animal, COUNT(*) as cnt,
SUM(COUNT(*)) OVER (PARTITION BY o.customer_id) as customer_cnt
FROM Order_Contents cc INNER JOIN
Animals a
ON cc.animal_id = a.animal_id INNER JOIN
Orders o ON cc.order_id = o.order_id
WHERE customer_id = 1
GROUP BY a.animal, o.customer_id
) ca
ORDER BY DENSE_RANK() OVER (ORDER BY customer_cnt DESC);
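You need to add PARTITION BY so that the sequence restarts for each customer. Order by the count descending so that the most bought animal gets seqnum 1:
row_number() over (partition by customer_id order by count(*) desc)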
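To see the effect of the partition, here is a minimal sketch using Python's built-in sqlite3 module (assuming SQLite 3.25 or later, which is needed for window functions). The flattened purchases table and its rows are made up for illustration; the real schema has three tables. The animal name is added as a tie-breaker so the numbering is deterministic.

```python
import sqlite3

# Hypothetical flattened data: one row per animal bought (customer_id, animal).
rows = (
    [(1, "mouse")] * 6 + [(1, "bird")] * 3 + [(1, "dog")] * 3 + [(1, "cat")] * 2
    + [(2, "cat")] * 4 + [(2, "dog")] * 1
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (customer_id INTEGER, animal TEXT)")
conn.executemany("INSERT INTO purchases VALUES (?, ?)", rows)

query = """
WITH counts AS (
    SELECT customer_id, animal, COUNT(*) AS cnt
    FROM purchases
    GROUP BY customer_id, animal
),
ranked AS (
    -- PARTITION BY restarts the numbering for every customer
    SELECT customer_id, animal, cnt,
           ROW_NUMBER() OVER (PARTITION BY customer_id
                              ORDER BY cnt DESC, animal) AS seqnum
    FROM counts
)
SELECT customer_id, animal, cnt, seqnum
FROM ranked
WHERE seqnum <= 3            -- "top 3" per customer
ORDER BY customer_id, seqnum
"""
top3 = conn.execute(query).fetchall()
for row in top3:
    print(row)
```

Customer 1's cat row gets seqnum 4 and is filtered out, while customer 2 starts again at 1.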
Given a table products
pid | name
123 | Milk
456 | Tea
789 | Cake
... | ...
and a table sales
stamp | pid | units
14:54 | 123 | 3
15:02 | 123 | 9
15:09 | 456 | 1
15:14 | 456 | 1
15:39 | 456 | 2
15:48 | 789 | 12
... | ... | ...
How would I be able to get the product(s) with the most sold units?
My goal is to run a SELECT statement that results in, for this example,
pid | name
123 | Milk
789 | Cake
because the sum of sold units of both those products is 12, the maximum value (greater than 4 for Tea, despite there being more sales for Tea).
I have the following query:
SELECT DISTINCT products.pid, products.name
FROM sales
INNER JOIN products ON sales.pid = products.pid
INNER JOIN (
SELECT pid, SUM(units) as sum_units
FROM sales
GROUP BY pid
) AS total_units ON total_units.pid = sales.pid
WHERE total_units.sum_units IN (
SELECT MAX(sum_units) as max_units
FROM (
SELECT pid, SUM(units) as sum_units
FROM sales
GROUP BY pid
) AS total_units
);
However, this seems very long, confusing, and inefficient, even repeating the sub-query to obtain total_units, so I was wondering if there was a better way to accomplish this.
How can I simplify this? Note that I can't use ORDER BY SUM(units) LIMIT 1 in case there are multiple (i.e., >1) products with the most units sold.
Thank you in advance.
Postgres has supported WITH TIES since version 13, so your query can be simply this:
select p.pId, p.name
from sales s
join products p on p.pid = s.pid
group by p.pId, p.name
order by Sum(units) desc
fetch first 1 rows with ties;
See demo Fiddle
Solution for your problem:
WITH cte1 AS
(
SELECT s.pid, p.name,
SUM(units) as total_units
FROM sales s
INNER JOIN products p
ON s.pid = p.pid
GROUP BY s.pid, p.name
),
cte2 AS
(
SELECT *,
DENSE_RANK() OVER(ORDER BY total_units DESC) as rn
FROM cte1
)
SELECT pid,name
FROM cte2
WHERE rn = 1
ORDER BY pid;
Working example: db_fiddle link
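For anyone who wants to try the DENSE_RANK approach without a Postgres server, here is a small sketch that runs it against the sample data from the question using Python's built-in sqlite3 module (assuming SQLite 3.25 or later for window functions). The CTE names totals and ranked are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (pid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sales (stamp TEXT, pid INTEGER, units INTEGER);
INSERT INTO products VALUES (123, 'Milk'), (456, 'Tea'), (789, 'Cake');
INSERT INTO sales VALUES
  ('14:54', 123, 3), ('15:02', 123, 9), ('15:09', 456, 1),
  ('15:14', 456, 1), ('15:39', 456, 2), ('15:48', 789, 12);
""")

query = """
WITH totals AS (
    -- total units sold per product
    SELECT s.pid, p.name, SUM(s.units) AS total_units
    FROM sales s
    JOIN products p ON p.pid = s.pid
    GROUP BY s.pid, p.name
),
ranked AS (
    -- DENSE_RANK gives every product tied for the maximum the rank 1
    SELECT pid, name, DENSE_RANK() OVER (ORDER BY total_units DESC) AS rn
    FROM totals
)
SELECT pid, name FROM ranked WHERE rn = 1 ORDER BY pid
"""
best = conn.execute(query).fetchall()
print(best)
```

Milk (3 + 9) and Cake (12) tie at 12 units, so both come back; Tea totals only 4.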
I'm trying to find repetitions between rows based on the column. I've tried window functions with row_number() / rank() but they group all the values that are found (similar to GROUP BY) which I do not expect.
How can I find repetitions of the values?
I tried to do something like this:
SELECT *, rank() OVER(PARTITION BY customer ORDER BY id) FROM customers ORDER BY id
And got the following result:
id | customer | rank
1 | customer_1 | 1
2 | customer_2 | 1
3 | customer_2 | 2
4 | customer_1 | 2
5 | customer_3 | 1
6 | customer_1 | 3
What I want to do:
id | customer | rank
1 | customer_1 | 1
2 | customer_2 | 1
3 | customer_2 | 2
4 | customer_1 | 1
5 | customer_3 | 1
6 | customer_1 | 1
You are looking for counts within adjacent rows. This is a type of gaps-and-islands problem. You can define the adjacent rows with the difference of row_numbers() and then enumerate them:
SELECT c.*,
ROW_NUMBER() OVER (PARTITION BY customer, seqnum - seqnum_2 ORDER BY id) as ranking
FROM (SELECT c.*,
ROW_NUMBER() OVER (ORDER BY id) as seqnum,
ROW_NUMBER() OVER (PARTITION BY customer ORDER BY id) as seqnum_2
FROM customers c
) c
ORDER BY id
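To check the difference-of-row-numbers idea against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module (assuming SQLite 3.25 or later for window functions). Rows of the same customer that are adjacent by id share the same value of seqnum - seqnum_2, so they land in the same partition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, customer TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "customer_1"), (2, "customer_2"), (3, "customer_2"),
     (4, "customer_1"), (5, "customer_3"), (6, "customer_1")],
)

query = """
SELECT id, customer,
       -- number the rows inside each island of adjacent equal customers
       ROW_NUMBER() OVER (PARTITION BY customer, seqnum - seqnum_2
                          ORDER BY id) AS ranking
FROM (SELECT id, customer,
             ROW_NUMBER() OVER (ORDER BY id) AS seqnum,
             ROW_NUMBER() OVER (PARTITION BY customer ORDER BY id) AS seqnum_2
      FROM customers) c
ORDER BY id
"""
ranked = conn.execute(query).fetchall()
for row in ranked:
    print(row)
```

Only id 3 gets ranking 2, because it is the only row that directly follows a row with the same customer.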
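You can use a recursive query. It assumes the ids are consecutive, because each step joins on c.id = r.id + 1. Every row also appears once with repeat_count = 1 from the non-recursive part, so take the maximum per id:
WITH RECURSIVE repeatitions(id, customer, repeat_count) AS (
SELECT id, customer, 1 as repeat_count
FROM customers
UNION ALL
SELECT c.id, c.customer, r.repeat_count + 1
FROM customers c, repeatitions r
WHERE c.id = r.id + 1 AND c.customer = r.customer
)
SELECT id, customer, MAX(repeat_count) AS repeat_count
FROM repeatitions
GROUP BY id, customer
ORDER BY id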
I created a working fiddle to demonstrate it.
I would like to achieve the following and to be honest, I don't even know where to start. We have two tables, Customers and Orders. I need to create a third table, which will have combined data, and displayed in a horizontal way.
Those are the current tables:
CUSTOMERS:
Id Email Language
Customer1 1 cust1@email.com en
Customer2 2 cust2@email.com sp
Customer3 3 cust3@email.com ru
ORDERS:
Id CustomerId Total
a 1 200
b 1 300
c 2 400
d 3 500
e 3 500
f 3 500
g 3 500
And the desired outcome:
CustomerID Email Language Order1 Order2 Order3 Order4 Order5 Order6
1 a b - - - -
2 c - - - - -
3 d e f g - -
Each customer can have up to 6 active orders, but the logic can also be that for each customer only the 6 first orders will be listed.
Any suggestions on how to achieve this result? Your help will be greatly appreciated.
SQL tables represent unordered sets. There is no ordering unless a column specifies it. Let me assume that id plays that role.
Then, you can do this with conditional aggregation:
select c.id, c.email, c.language,
max(case when seqnum = 1 then o.id end) as order_1,
max(case when seqnum = 2 then o.id end) as order_2,
max(case when seqnum = 3 then o.id end) as order_3,
max(case when seqnum = 4 then o.id end) as order_4,
max(case when seqnum = 5 then o.id end) as order_5,
max(case when seqnum = 6 then o.id end) as order_6
from customers c left join
(select o.*,
row_number() over (partition by customerid order by id) as seqnum
from orders o
) o
on c.id = o.customerid
group by c.id, c.email, c.language;
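Here is a small runnable sketch of the conditional aggregation on the sample data, using Python's built-in sqlite3 module (assuming SQLite 3.25 or later for window functions). The six CASE columns are generated in a loop only to keep the example short; the column names order_1 to order_6 follow the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, language TEXT);
CREATE TABLE orders (id TEXT PRIMARY KEY, customerid INTEGER, total INTEGER);
INSERT INTO customers VALUES
  (1, 'cust1@email.com', 'en'), (2, 'cust2@email.com', 'sp'),
  (3, 'cust3@email.com', 'ru');
INSERT INTO orders VALUES
  ('a', 1, 200), ('b', 1, 300), ('c', 2, 400),
  ('d', 3, 500), ('e', 3, 500), ('f', 3, 500), ('g', 3, 500);
""")

# One MAX(CASE ...) column per order slot; slots without an order stay NULL.
order_cols = ",\n       ".join(
    f"MAX(CASE WHEN o.seqnum = {n} THEN o.id END) AS order_{n}"
    for n in range(1, 7)
)
query = f"""
SELECT c.id, c.email, c.language,
       {order_cols}
FROM customers c
LEFT JOIN (SELECT id, customerid,
                  ROW_NUMBER() OVER (PARTITION BY customerid
                                     ORDER BY id) AS seqnum
           FROM orders) o
  ON c.id = o.customerid
GROUP BY c.id, c.email, c.language
ORDER BY c.id
"""
pivot = conn.execute(query).fetchall()
for row in pivot:
    print(row)
```

Empty slots come back as NULL (None in Python) rather than "-"; wrap the columns in COALESCE if you need a placeholder.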
I work with PostgreSQL.
I have this SQL query:
SELECT lp."RegionId", COUNT(w."Id") FROM public.workplace w
GROUP BY lp."RegionId"
which returns
RegionId | Count
1 | 3
2 | 12
3 | 5
I also have a table 'person'. Each person has a RegionId.
For region 1 I want to select the first 3 persons, for region 2 the first 12 persons, and for region 3 the first 5 persons.
How can I use the query above as a subquery against the 'person' table?
WITH (SELECT lp."RegionId", COUNT(w."Id") FROM public.workplace w
GROUP BY lp."RegionId") AS pc
SELECT * FROM public.person p
???????
limit pc."Count"
???
Something like:
SELECT p.*
FROM (SELECT *, row_number() OVER (PARTITION BY RegionId ORDER BY PersonId) AS rn
FROM person) AS p
JOIN (SELECT RegionId, count(*) AS cnt
FROM workplace
GROUP BY RegionId) AS r ON p.RegionId = r.RegionId
WHERE p.rn <= r.cnt
ORDER BY p.RegionId, p.PersonId;
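A quick way to convince yourself that the per-region limit works is to run the same shape of query on toy data. This sketch uses Python's built-in sqlite3 module (assuming SQLite 3.25 or later for window functions); the rows and the PersonId column are assumptions, since the question does not show the person table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (PersonId INTEGER PRIMARY KEY, RegionId INTEGER);
CREATE TABLE workplace (Id INTEGER PRIMARY KEY, RegionId INTEGER);
-- region 1: 3 persons, 2 workplaces; region 2: 2 persons, 1 workplace
INSERT INTO person VALUES (1, 1), (2, 1), (3, 1), (4, 2), (5, 2);
INSERT INTO workplace VALUES (10, 1), (11, 1), (12, 2);
""")

query = """
SELECT p.PersonId, p.RegionId
FROM (SELECT PersonId, RegionId,
             ROW_NUMBER() OVER (PARTITION BY RegionId
                                ORDER BY PersonId) AS rn
      FROM person) AS p
JOIN (SELECT RegionId, COUNT(*) AS cnt
      FROM workplace
      GROUP BY RegionId) AS r ON p.RegionId = r.RegionId
WHERE p.rn <= r.cnt          -- keep only the first cnt persons of each region
ORDER BY p.RegionId, p.PersonId
"""
selected = conn.execute(query).fetchall()
print(selected)
```

Region 1 has two workplaces, so only its first two persons are kept; region 2 has one, so only person 4 is kept.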
I have a table with customer IDs, location IDs, and their order values. For each customer, I need to select the location ID with the largest total spend.
Customer | Location | Order $
1 | 1A | 100
1 | 1A | 20
1 | 1B | 100
2 | 2A | 50
2 | 2B | 20
2 | 2B | 50
So I would get
Customer | Location | Order $
1 | 1A | 120
2 | 2B | 70
I tried something like this:
SELECT
a.CUST
,a.LOC
,c.BOOKINGS
FROM (SELECT DISTINCT TOP 1 b.CUST, b.LOC, sum(b.ORDER_VAL) as BOOKINGS
FROM ORDER_TABLE b
GROUP BY b.CUST, b.LOC
ORDER BY BOOKINGS DESC) as c
INNER JOIN ORDER_TABLE a
ON a.CUST = c.CUST
But that just returns the top order.
Just use variables to emulate ROW_NUMBER()
DEMO
SELECT *
FROM ( SELECT `Customer`, `Location`, SUM(`Order`) as `Order`,
@rn := IF(@customer = `Customer`,
@rn + 1,
IF(@customer := `Customer`, 1, 1)
) as rn
FROM Table1
CROSS JOIN (SELECT @rn := 0, @customer := '') as par
GROUP BY `Customer`, `Location`
ORDER BY `Customer`, SUM(`Order`) DESC
) t
WHERE t.rn = 1
First, you have to sum the values for each location (Order is a reserved word, so it needs backticks):
select Customer, Location, Sum(`Order`) as tot_order
from order_table
group by Customer, Location
Then you can get the maximum total with MAX, and the top location by combining group_concat, which returns all locations ordered by total descending, with substring_index, which keeps only the first one:
select
Customer,
substring_index(
group_concat(Location order by tot_order desc),
',', 1
) as location,
Max(tot_order) as max_order
from (
select Customer, Location, Sum(`Order`) as tot_order
from order_table
group by Customer, Location
) s
group by Customer
(If there's a tie, i.e., two locations with the same top total, this query will return just one of them.)
This looks like a problem of ordering by an aggregate. Here is my stab at it:
SELECT
c.customer,
c.location,
SUM(`order`) as `order_total`,
(
SELECT
SUM(`order`) as `order_total`
FROM customer cm
WHERE cm.customer = c.customer
GROUP BY location
ORDER BY `order_total` DESC LIMIT 1
) as max_order_amount
FROM customer c
GROUP BY c.customer, c.location
HAVING max_order_amount = order_total
Here is the SQL fiddle. http://sqlfiddle.com/#!9/2ac0d1/1
This is how I'd handle it (maybe not the best method?). I wrote it using a CTE first, only to find that MySQL doesn't support CTEs (they were only added in MySQL 8.0), so I switched to writing the same subquery twice:
SELECT B.Customer, C.Location, B.MaxOrderTotal
FROM
(
SELECT A.Customer, MAX(A.OrderTotal) AS MaxOrderTotal
FROM
(
SELECT Customer, Location, SUM(`Order`) AS OrderTotal
FROM Table1
GROUP BY Customer, Location
) AS A
GROUP BY A.Customer
) AS B INNER JOIN
(
SELECT Customer, Location, SUM(`Order`) AS OrderTotal
FROM Table1
GROUP BY Customer, Location
) AS C ON B.Customer = C.Customer AND B.MaxOrderTotal = C.OrderTotal;
Edit: used the table structure provided
This solution will provide multiple rows in the event of a tie.
SQL fiddle for this solution
How about:
select a.*
from (
select customer, location, SUM(val) as s
from orders
group by customer, location
) as a
left join
(
select customer, MAX(b.tot) as t
from (
select customer, location, SUM(val) as tot
from orders
group by customer, location
) as b
group by customer
) as c
on a.customer = c.customer where a.s = c.t;
with
Q_1 as
(
select customer,location, sum(order_$) as order_sum
from cust_order
group by customer,location
order by customer, order_sum desc
),
Q_2 as
(
select customer,max(order_sum) as order_max
from Q_1
group by customer
),
Q_3 as
(
select Q_1.customer,Q_1.location,Q_1.order_sum
from Q_1 inner join Q_2 on Q_1.customer = Q_2.customer and Q_1.order_sum = Q_2.order_max
)
select * from Q_3
Q_1 computes the total per customer and location, Q_2 takes the maximum total per customer from Q_1, and Q_3 selects the customer, location, and total from Q_1 that match Q_2.
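On engines that have window functions (MySQL 8.0 and later, or SQLite 3.25 and later as used here), the same "aggregate, then pick the maximum per customer" idea can be written with RANK instead of a self-join. This is only a sketch on the sample data from the question, run through Python's built-in sqlite3 module; the column name order_val and the CTE names are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_table (customer INTEGER, location TEXT, order_val INTEGER)"
)
conn.executemany(
    "INSERT INTO order_table VALUES (?, ?, ?)",
    [(1, "1A", 100), (1, "1A", 20), (1, "1B", 100),
     (2, "2A", 50), (2, "2B", 20), (2, "2B", 50)],
)

query = """
WITH totals AS (
    -- total spend per customer and location
    SELECT customer, location, SUM(order_val) AS total
    FROM order_table
    GROUP BY customer, location
),
ranked AS (
    -- RANK keeps ties: two locations with the same top total both get rank 1
    SELECT customer, location, total,
           RANK() OVER (PARTITION BY customer ORDER BY total DESC) AS rnk
    FROM totals
)
SELECT customer, location, total
FROM ranked
WHERE rnk = 1
ORDER BY customer
"""
top_locations = conn.execute(query).fetchall()
print(top_locations)
```

Swap RANK for ROW_NUMBER if you want exactly one row per customer even when two locations tie.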