Find the top 2 vendors per country for each year available in the dataset (why does the QUALIFY clause not work in my case?) - sql

SELECT
o.country_name,
v.vendor_name,
EXTRACT(year FROM date_local) AS year,
ROUND(SUM(o.gmv_local), 2) AS total_gmv
FROM Orders o
LEFT JOIN Vendors v
ON o.vendor_id = v.id
GROUP BY
o.country_name,
v.vendor_name,
EXTRACT(year FROM date_local)
--QUALIFY ROW_NUMBER() OVER (PARTITION BY country_name, EXTRACT(year FROM date_local) ORDER BY total_gmv DESC) <= 2
ORDER BY
o.country_name DESC,
total_gmv DESC;
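Whether QUALIFY is accepted at all depends on the engine, which the question does not name, so the following is only a portable sketch that avoids it: aggregate first, number the rows in an outer query, then filter. Table and column names are taken from the question; date_local is assumed to be a column of Orders, and the alias order_year is introduced here for illustration.

```sql
-- Sketch: top 2 vendors per country and year without QUALIFY.
-- Assumes date_local lives in Orders; order_year is a hypothetical alias.
SELECT country_name, vendor_name, order_year, total_gmv
FROM (
    SELECT g.*,
           ROW_NUMBER() OVER (PARTITION BY country_name, order_year
                              ORDER BY total_gmv DESC) AS rn
    FROM (
        -- innermost level: one row per country, vendor and year
        SELECT o.country_name,
               v.vendor_name,
               EXTRACT(year FROM o.date_local) AS order_year,
               ROUND(SUM(o.gmv_local), 2) AS total_gmv
        FROM Orders o
        LEFT JOIN Vendors v
               ON o.vendor_id = v.id
        GROUP BY o.country_name,
                 v.vendor_name,
                 EXTRACT(year FROM o.date_local)
    ) g
) ranked
WHERE rn <= 2
ORDER BY country_name DESC, order_year, total_gmv DESC;
```

Because the ranking runs over the already aggregated rows, the window's ORDER BY can use total_gmv directly, with no need to repeat SUM(o.gmv_local) inside the window.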

Related

Psql: Get min, max and count of records for each partner's invoices, and the last payment

I have a table invoice like this :
id, partner_id, number, invoice_date
And a Payment Table like this:
id, payment_date, partner_id
I want to get min and max for both number & invoice_date, and count invoices, and last payment for each partner, something like this :
partner_id, min number, min date, max number, max date, count, last_pay
1, INV-2017-003, 02-01-2017, INV-2020-010, 01-01-2020, 142, 02-12-2019
5, INV-2019-124, 05-03-2019, INV-2020-005, 01-01-2020, 150, 01-01-2020
....
You can join the three tables, including partners, and group by the partners' id column along with the related aggregations:
select pr.id, min(invoice_date), max(invoice_date), count(*), max(payment_date) as last_pay
from partners pr
left join invoices i on i.partner_id = pr.id
left join payments p on p.partner_id = pr.id
group by pr.id
Update : You can use min() over (), max() over () and row_number() analytic functions to get the desired code depending on max and min dates :
select *
from
(
select pr.id,
min(invoice_date) over (partition by pr.id order by invoice_date) as min_invoice_date,
max(invoice_date) over (partition by pr.id order by invoice_date desc) as max_invoice_date,
max(code) over (partition by pr.id order by invoice_date desc) as max_code,
min(code) over (partition by pr.id order by invoice_date) as min_code,
count(*) over (partition by pr.id) as cnt,
max(payment_date) over (partition by pr.id) as last_pay,
row_number() over (partition by pr.id order by invoice_date desc) as rn
from partners pr
left join invoices i on i.partner_id = pr.id
left join payments p on p.partner_id = pr.id
) q
where rn = 1
Why isn't this simple aggregation?
select i.partner_id,
min(i.number) as min_number,
min(i.invoice_date) as min_invoice_date,
max(i.number) as max_number,
max(i.invoice_date) as max_invoice_date,
count(distinct i.invoice_id) as num_invoices,
max(p.payment_date) as max_payment_date
from invoices i left join
payments p
on p.invoice_id = i.invoice_id
group by i.partner_id;
If you want the number on the earliest invoice (and min() doesn't work), then you can do this with a "first" aggregation function. Unfortunately, Postgres doesn't directly support one. But it does through array functions:
select i.partner_id,
(array_agg(i.number order by i.invoice_date asc))[1] as min_number,
min(i.invoice_date) as min_invoice_date,
(array_agg(i.number order by i.invoice_date desc))[1] as max_number,
max(i.invoice_date) as max_invoice_date,
count(distinct i.invoice_id) as num_invoices,
max(p.payment_date) as max_payment_date
from invoices i left join
payments p
on p.partner_id = i.partner_id
group by i.partner_id;
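The array_agg trick can be tried without any tables. This is a minimal Postgres sketch with made-up sample rows (only the column names come from the question); first_number and last_number are illustrative aliases.

```sql
-- Hypothetical sample data; the invoice numbers are deliberately not in date order.
WITH invoices (partner_id, number, invoice_date) AS (
    VALUES (1, 'INV-B', DATE '2017-01-02'),
           (1, 'INV-A', DATE '2020-01-01'),
           (1, 'INV-C', DATE '2018-06-15'),
           (5, 'INV-Z', DATE '2019-03-05'),
           (5, 'INV-Y', DATE '2020-01-01')
)
SELECT partner_id,
       -- number on the earliest invoice, not the alphabetically smallest number
       (array_agg(number ORDER BY invoice_date ASC))[1]  AS first_number,
       -- number on the latest invoice
       (array_agg(number ORDER BY invoice_date DESC))[1] AS last_number,
       min(number) AS min_number,
       count(*)    AS num_invoices
FROM invoices
GROUP BY partner_id
ORDER BY partner_id;
```

For partner 1 this gives first_number = INV-B and last_number = INV-A, while min(number) is INV-A, which shows where min() and "first by date" diverge.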
This is similar to @BarbarosÖzhan's answer, but calculates the min/max before the join (if you get multiple rows per partner for both invoices and payments, the COUNT will be wrong otherwise). Additionally, there's only a single PARTITION/ORDER, which should result in a more efficient plan.
SELECT i.*, p.last_pay
FROM
( -- 1st row has all the min values = filtered using row_number
SELECT
partner_id
,number AS min_code
,invoice_date AS min_invoice_date
-- value from row with max date
,Last_Value(number)
Over (PARTITION BY partner_id
ORDER BY invoice_date
ROWS BETWEEN Unbounded Preceding AND Unbounded Following) AS max_code
,Last_Value(invoice_date)
Over (PARTITION BY partner_id
ORDER BY invoice_date
ROWS BETWEEN Unbounded Preceding AND Unbounded Following) AS max_invoice_date
,Count(*)
Over (PARTITION BY partner_id) AS Cnt
,Row_Number()
Over (PARTITION BY partner_id ORDER BY invoice_date) AS rn
FROM invoices
) AS i
LEFT JOIN
( -- max payment date per partner
SELECT partner_id, Max(payment_date) AS last_pay
FROM payments
GROUP BY partner_id
) AS p
ON p.partner_id = i.partner_id
WHERE i.rn = 1
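The reason for aggregating before the join can be seen on a tiny example. This Postgres sketch uses invented rows (three invoices and two payments for one partner) to show how joining both child tables directly multiplies the rows.

```sql
-- Hypothetical sample data: 3 invoices and 2 payments for partner 1.
WITH invoices (id, partner_id) AS (
    VALUES (1, 1), (2, 1), (3, 1)
),
payments (id, partner_id) AS (
    VALUES (10, 1), (11, 1)
)
SELECT i.partner_id,
       count(*)             AS joined_rows,       -- 3 invoices x 2 payments = 6
       count(DISTINCT i.id) AS distinct_invoices  -- 3
FROM invoices i
LEFT JOIN payments p
       ON p.partner_id = i.partner_id
GROUP BY i.partner_id;
```

count(DISTINCT ...) hides the fan-out for a plain count, but sums or averages over the joined rows would still be inflated, which is why pre-aggregating each side is the safer pattern.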

How to Nest query with different criteria

I have a Sales_details table from which I would like to get a report of the top 150 products and the top 10 customers of each product. The code below does just that and works perfectly. However, it uses the same date range for both. How do I modify it so that the top 150 products are based on 10 years of history while the top 10 customers are based on 2 years of history?
select pc.*
from (select pc.*,
dense_rank() over (order by product_sales desc, product_id) as product_rank
from (select sd.product_id, sd.custno, sum(sd.sales$) as total_sales,
row_number() over (partition by sd.product_id order by sum(sd.sales$) desc) as cust_within_product_rank,
sum(sum(sd.sales$)) over (partition by sd.product_id) as product_sales
from salesdetails sd
group by sd.product_id, sd.custno
) pc
) pc
where product_rank <= 150 and cust_within_product_rank <= 10;
You can use conditional aggregation:
select pc.*
from (select pc.*,
dense_rank() over (order by product_sales desc, product_id) as product_rank
from (select sd.product_id, sd.custno, sum(sd.sales$) as total_sales,
row_number() over (partition by sd.product_id
order by sum(case when date > dateadd(year, -2, getdate()) then sd.sales$ else 0 end) desc
) as cust_within_product_rank,
sum(sum(case when date > dateadd(year, -10, getdate()) then sd.sales$ else 0 end)) over (partition by sd.product_id) as product_sales
from salesdetails sd
group by sd.product_id, sd.custno
) pc
) pc
where product_rank <= 150 and cust_within_product_rank <= 10;
I'm not sure what column you use for date, so I just called it date.
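To see what the two CASE expressions do on their own, here is a small SQL Server sketch with invented rows. The column names sale_date and sales are placeholders, not the real schema, and the dates are built relative to getdate() so the result does not depend on when it is run.

```sql
-- Hypothetical sample data: customer A bought a lot 5 years ago, B only recently.
WITH salesdetails AS (
    SELECT *
    FROM (VALUES (1, 'A', DATEADD(year, -1, GETDATE()), 100.0),
                 (1, 'A', DATEADD(year, -5, GETDATE()), 900.0),
                 (1, 'B', DATEADD(year, -1, GETDATE()), 300.0)
         ) AS t (product_id, custno, sale_date, sales)
)
SELECT product_id,
       custno,
       -- only sales from the last 2 years count here
       SUM(CASE WHEN sale_date > DATEADD(year, -2, GETDATE())
                THEN sales ELSE 0 END) AS sales_2y,
       -- sales from the last 10 years count here
       SUM(CASE WHEN sale_date > DATEADD(year, -10, GETDATE())
                THEN sales ELSE 0 END) AS sales_10y
FROM salesdetails
GROUP BY product_id, custno;
```

Customer A ends up with 100 over 2 years and 1000 over 10 years, customer B with 300 in both columns, so B leads the 2-year customer ranking even though A dominates the 10-year total.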

Postgres get sales for top account with ranking

I have the following tables:
Account (id, name)
Solution (id, name)
Sales (solution_id, account_id, month, year, amount)
I need to calculate the monthly sales of each account in a specific period:
SELECT
to_char(make_date(sales.year, sales.month, 1), 'YYYY-MM') AS period,
acc.id AS account_id,
acc.name AS account_name,
COALESCE(SUM(sales.net_sales), 0) AS amount
FROM
(SELECT *
FROM sales
WHERE make_date(year, month, 1) >= FROM_DATE
AND make_date(year, month, 1) <= TO_DATE) sales
INNER JOIN account acc.id = sales.account_id
GROUP BY sales.year, sales.month
ORDER BY sales.year, sales.month ASC
I can now also calculate the total sales over the selected period:
SELECT
to_char(make_date(sales.year, sales.month, 1), 'YYYY-MM') AS period,
acc.id AS account_id,
acc.name AS account_name,
COALESCE(SUM(sales.net_sales), 0) AS amount
FROM
(SELECT *, COALESCE(SUM(net_sales) OVER (PARTITION BY client_id), 0) AS total
FROM sales
WHERE make_date(year, month, 1) >= FROM_DATE
AND make_date(year, month, 1) <= TO_DATE) sales
INNER JOIN account acc.id = sales.account_id
GROUP BY sales.year, sales.month
ORDER BY sales.year, sales.month ASC
Is there a way to rank the total sales in order to get only the top n accounts in the selected period?
Your queries are a bit of a mess. The first is not syntactically correct. I think you can simplify and the intention is:
SELECT to_char(make_date(s.year, s.month, 1), 'YYYY-MM') AS period,
a.id AS account_id, a.name AS account_name,
COALESCE(SUM(s.net_sales), 0) AS amount,
SUM(SUM(s.net_sales)) OVER (PARTITION BY a.id) as total
FROM sales s INNER JOIN
account a
ON a.id = s.account_id
WHERE make_date(s.year, s.month, 1) >= FROM_DATE AND
make_date(s.year, s.month, 1) <= TO_DATE
GROUP BY s.year, s.month, a.id, a.name
ORDER BY s.year, s.month ASC;
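The nested SUM(SUM(...)) OVER (...) may look odd at first: the inner SUM is the ordinary aggregate per group, and the outer SUM is a window function that adds those group totals up per account. A minimal Postgres sketch with invented rows:

```sql
-- Hypothetical sample data; column names follow the queries above.
WITH sales (account_id, year, month, net_sales) AS (
    VALUES (1, 2020, 1, 100),
           (1, 2020, 1,  50),
           (1, 2020, 2,  25),
           (2, 2020, 1,  10)
)
SELECT account_id,
       year,
       month,
       SUM(net_sales) AS amount,                                  -- per account and month
       SUM(SUM(net_sales)) OVER (PARTITION BY account_id) AS total -- per account, all months
FROM sales
GROUP BY account_id, year, month
ORDER BY account_id, year, month;
```

Account 1 gets amount 150 for January and 25 for February, both with total 175; account 2 gets amount 10 and total 10.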
If you want to rank by total sales (or monthly sales), then you can use dense_rank():
SELECT ym.*
FROM (SELECT to_char(make_date(s.year, s.month, 1), 'YYYY-MM') AS period,
a.id AS account_id, a.name AS account_name,
COALESCE(SUM(s.net_sales), 0) AS amount,
total,
DENSE_RANK() OVER (ORDER BY total DESC) as seqnum
FROM (SELECT s.*, SUM(s.net_sales) OVER (PARTITION BY client_id) as total
FROM sales s
WHERE make_date(s.year, s.month, 1) >= FROM_DATE AND
make_date(s.year, s.month, 1) <= TO_DATE
) s INNER JOIN
account a
ON a.id = s.account_id
GROUP BY s.year, s.month, a.id, a.name, total
) ym
WHERE seqnum <= 3
ORDER BY period;

oracle window functions

Could someone help me out with this query:
SELECT SUM(summa), name,
TO_CHAR(invoice_date, 'YYYY/mm')
OVER (PARTITON EXTRACT(MONTH FROM i.invoice_date, c.name)
FROM invoice i, customer c
WHERE i.customer_id = c.id
AND months_between(sysdate, invoice_date) = 3
AND rownum < 11 GROUP BY invoice_date, name
ORDER BY SUM(SUMMA) DESC;
It is supposed to return the first ten rows from the last three months, grouped by month and ordered by sum.
Thanks.
First, use proper explicit join syntax. Second, you need row_number():
SELECT t.*
FROM (SELECT SUM(summa) as sumsumma, name,
TO_CHAR(invoice_date, 'YYYY/mm') as yyyymm,
ROW_NUMBER() OVER (PARTITION BY TO_CHAR(invoice_date, 'YYYY/mm')
ORDER BY SUM(summa) DESC
) as seqnum
FROM invoice i JOIN
customer c
ON i.customer_id = c.id
WHERE months_between(sysdate, invoice_date) <= 3
GROUP BY TO_CHAR(invoice_date, 'YYYY/mm'), name
) t
WHERE seqnum <= 10
ORDER BY sumsumma DESC;

display only specific rows in a column with a group by

I'm somewhat new to Oracle SQL and can't figure this out. I want to display the rows with the highest value in the third column. Here is the table I'm working with:
theyear custseg sales
2010 Corporate 573637.62
2010 Home Office 515314.98
2010 Small Biz 390361.94
2010 Consumer 383825.67
2011 Corporate 731208
2011 Home Office 521274.34
2011 Consumer 390967.03
2011 Small Biz 273264.81
2012 Corporate 823861.38
2012 Consumer 480082.9
2012 Home Office 478106.93
I want the highest value grouped by year. If I do a group by with just the year I get the answer somewhat, but I can't include/display the customer segment (ugh). It just displays the year and the max sales. When I include the customer segment it gives me that table, which displays all the sales, which is not what I'm looking for. I simply want the rows that contain the MAX sales for each year (theyear), along with the customer segment (custseg). For what it's worth, here is the code I used to create the above:
select theyear, custseg, max(totalsales) sales from (
select custseg, extract(year from ordshipdate) theyear, sum(ordsales) TotalSales from customers, orderdet
where customers.custid = orderdet.custid
group by custseg, extract(year from ordshipdate)
order by sum(ordsales) desc)
group by theyear, custseg
order by theyear, max(totalsales) desc;
Assuming all fields are in the customer table as described in the question, the following query would do what you want:
select c.theyear, c.custseg, c.sales
from
customer c inner join
(
select theyear, max(sales) as max_sales_in_year
from customer
group by theyear
) maxvalues
on (
c.theyear = maxvalues.theyear and
c.sales = maxvalues.max_sales_in_year
);
Note that this does not settle ties: if several segments share the maximum sales in a year, all of them are returned.
I would use ROW_NUMBER():
SELECT theyear, custseg, totalsales FROM
(
select theyear, custseg, totalsales,
ROW_NUMBER() OVER (PARTITION BY theyear ORDER BY totalsales DESC) rn
from
(
select custseg, extract(year from ordshipdate) theyear, sum(ordsales) TotalSales
from customers, orderdet
where customers.custid = orderdet.custid
group by custseg, extract(year from ordshipdate)
) a
) b
WHERE rn = 1;
BTW, the query above is more readable when written with CTEs:
WITH a AS(
select custseg, extract(year from ordshipdate) theyear, sum(ordsales) TotalSales
from customers, orderdet
where customers.custid = orderdet.custid
group by custseg, extract(year from ordshipdate)),
b AS (
select theyear, custseg, totalsales,
ROW_NUMBER() OVER (PARTITION BY theyear ORDER BY totalsales DESC) rn
FROM a)
SELECT theyear, custseg, totalsales
FROM b
WHERE rn = 1;