Aggregate highest prices per client of salesmen - sql

I have a table like this:
SELECT * FROM orders;
client_id | order_id | salesman_id | price
-----------+----------+-------------+-------
1 | 167 | 1 | 65
1 | 367 | 1 | 27
2 | 401 | 1 | 29
2 | 490 | 2 | 48
3 | 199 | 1 | 68
3 | 336 | 2 | 22
3 | 443 | 1 | 84
3 | 460 | 2 | 92
I want to find an array of order_ids holding the highest-priced sale for each unique salesman and client pair. In this case I want the resulting table:
salesman_id | order_id
-------------+----------------
1 | {167, 401, 443}
2 | {490, 460}
So far I have an outline for a query:
SELECT salesman_id, max_client_salesman(order_id)
FROM orders
GROUP BY salesman_id;
However I'm having trouble writing the aggregate function max_client_salesman.
The documentation online for aggregate functions and arrays in postgres is very minimal. Any help is appreciated.

Standard SQL
I would combine the window function last_value() or first_value() with DISTINCT to get the order with the highest price per (salesman_id, client_id) efficiently, and then aggregate the results into the array you are looking for with the simple aggregate function array_agg().
SELECT salesman_id
,array_agg(max_order_id) AS most_expensive_orders_per_client
FROM (
SELECT DISTINCT
salesman_id, client_id
,last_value(order_id) OVER (PARTITION BY salesman_id, client_id
ORDER BY price
ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING) AS max_order_id
FROM orders
) x
GROUP BY salesman_id
ORDER BY salesman_id;
Returns:
salesman_id | most_expensive_orders_per_client
-------------+------------------------------------
1 | {167, 401, 443}
2 | {490, 460}
SQL Fiddle.
If there are multiple highest prices per (salesman_id, client_id), this query picks only one order_id arbitrarily, for lack of definition.
For this solution it is essential to understand that window functions are applied before DISTINCT. How to combine DISTINCT with a window function:
PostgreSQL: running count of rows for a query 'by minute'
For an explanation on ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING refer to this closely related answer on dba.SE.
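To see why the explicit frame matters, here is a minimal sketch that runs the inner query twice on the sample data, once with the default frame and once with ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING. It assumes SQLite 3.25 or later (via Python's sqlite3) as a stand-in for PostgreSQL; the variable names are made up for illustration.

```python
import sqlite3

# Assumption: SQLite (3.25+, window function support) stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (client_id INT, order_id INT, salesman_id INT, price INT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, 167, 1, 65), (1, 367, 1, 27), (2, 401, 1, 29), (2, 490, 2, 48),
     (3, 199, 1, 68), (3, 336, 2, 22), (3, 443, 1, 84), (3, 460, 2, 92)],
)

# {frame} is filled in below: empty for the default frame, explicit otherwise.
query = """
SELECT DISTINCT salesman_id, client_id,
       last_value(order_id) OVER (PARTITION BY salesman_id, client_id
                                  ORDER BY price {frame}) AS max_order_id
FROM orders
ORDER BY salesman_id, client_id, max_order_id
"""

# Default frame ends at the current row (and its peers), so last_value()
# just returns each row's own order_id and DISTINCT collapses nothing.
default_frame = conn.execute(query.format(frame="")).fetchall()

# Full-partition frame: last_value() sees the highest price in the partition.
full_frame = conn.execute(
    query.format(frame="ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING")
).fetchall()

print(len(default_frame), "rows with the default frame")
print(full_frame)
```

With the default frame every order survives DISTINCT; with the full frame only one row per (salesman_id, client_id) remains.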
Simpler with non-standard DISTINCT ON
PostgreSQL implements DISTINCT ON as an extension to the SQL standard. With it you can very effectively select rows that are unique according to a defined set of columns.
It won't get simpler or faster than this:
SELECT salesman_id
,array_agg(order_id) AS most_expensive_orders_per_client
FROM (
SELECT DISTINCT ON (1, client_id)
salesman_id, order_id
FROM orders
ORDER BY salesman_id, client_id, price DESC
) x
GROUP BY 1
ORDER BY 1;
SQL Fiddle.
I also use positional references (ordinal numbers in DISTINCT ON, GROUP BY and ORDER BY) for shorter syntax. Details:
Select first row in each GROUP BY group?
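DISTINCT ON is PostgreSQL-specific, so it cannot be tried out in every engine. As a rough cross-check, the sketch below swaps in row_number() for the "first row per (salesman_id, client_id)" step and builds the per-salesman lists in Python instead of array_agg(). It assumes SQLite 3.25 or later via Python's sqlite3; this is a substitute technique, not the DISTINCT ON query itself.

```python
import sqlite3
from collections import defaultdict

# Assumption: SQLite (3.25+) as a stand-in; it has neither DISTINCT ON nor array_agg().
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (client_id INT, order_id INT, salesman_id INT, price INT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, 167, 1, 65), (1, 367, 1, 27), (2, 401, 1, 29), (2, 490, 2, 48),
     (3, 199, 1, 68), (3, 336, 2, 22), (3, 443, 1, 84), (3, 460, 2, 92)],
)

# rn = 1 marks the most expensive order per (salesman_id, client_id),
# which is the row DISTINCT ON ... ORDER BY price DESC would keep.
rows = conn.execute("""
    SELECT salesman_id, order_id
    FROM (
        SELECT salesman_id, order_id,
               row_number() OVER (PARTITION BY salesman_id, client_id
                                  ORDER BY price DESC) AS rn
        FROM orders
    ) AS x
    WHERE rn = 1
    ORDER BY salesman_id, order_id
""").fetchall()

# Group the order_ids per salesman on the client side (replaces array_agg()).
orders_per_salesman = defaultdict(list)
for salesman_id, order_id in rows:
    orders_per_salesman[salesman_id].append(order_id)

print(dict(orders_per_salesman))
```

The lists come out sorted by order_id here because of the ORDER BY; array_agg() in the original query gives no such ordering guarantee unless one is requested.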

I think you want the Postgres function array_agg in combination with row_number(). However, your description of the query does not make sense to me.
The following gets clients and salesmen and the list of orders for the highest priced order by salesman:
select client_id, salesman_id, array_agg(order_id)
from (select o.*,
row_number() over (partition by salesman_id order by price desc) as sseqnum,
row_number() over (partition by client_id order by price desc) as cseqnum
from orders o
) o
where sseqnum = 1
group by salesman_id, client_id
I don't know what you mean by "highest priced sales for each salesman and client". Perhaps you want:
where sseqnum = 1 or cseqnum = 1

Related

Select row with smallest number on multiple groups of same ids

I have the following table as an output from a sql statement
user | product | price
…
123 | 12 | 451.29
373 | 12 | 637.28
623 | 12 | 650.84
672 | 16 | 356.87
123 | 16 | 263.90
…
Now I want to get only the row with the smallest price for each product_id.
The SQL is fairly simple:
SELECT user, product, price
FROM t
WHERE product IN (
SELECT product_id
FROM p
WHERE typ LIKE 'producttyp1'
)
but adding MIN(price) does not work the way it usually does. I think it's because there are several groups of the same product_ids in the same table. Is there an easy-to-use solution, or do I have to rewrite the whole query?
Edit: when I remove user from the query, I can get the product and the smallest price:
12 | 451.29
16 | 263.90
But now I would have to join the user, which I am trying to avoid.
You can use row_number():
select p.*
from (select p.*,
row_number() over (partition by product order by price asc) as seqnum
from p
) p
where seqnum = 1;
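A quick way to check the row_number() approach is to run it on the rows shown in the question. The sketch below assumes SQLite 3.25 or later via Python's sqlite3 and a table named t that holds only the five visible sample rows (the elided rows are left out).

```python
import sqlite3

# Assumption: SQLite (3.25+) as the engine; only the visible sample rows are loaded.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE t ("user" INT, product INT, price REAL)')
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(123, 12, 451.29), (373, 12, 637.28), (623, 12, 650.84),
     (672, 16, 356.87), (123, 16, 263.90)],
)

# Number the rows of each product from cheapest to most expensive,
# then keep only the cheapest one; the user column comes along for free.
cheapest = conn.execute("""
    SELECT "user", product, price
    FROM (
        SELECT t.*,
               row_number() OVER (PARTITION BY product ORDER BY price ASC) AS seqnum
        FROM t
    ) AS ranked
    WHERE seqnum = 1
    ORDER BY product
""").fetchall()

print(cheapest)
```

No extra join is needed to recover the user, which is what the question was trying to avoid.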

SQL Query find users with only one product type

I solemnly swear I did my best to find an existing question; maybe I'm just not sure how to phrase it correctly.
I would like to return records for users that have quota for only one product type.
| user_id | product |
| 1 | A |
| 1 | B |
| 1 | C |
| 2 | B |
| 3 | B |
| 3 | C |
| 3 | D |
In the example above I'd like a query that only returns users who carry quota for only one product type - doesn't really matter which product at this point.
I tried using select user_id, product from table group by 1,2 having count(user) < 2 but this does not work, nor does select user_id, product from table group by 1,2 having count(*) < 2
Any help is appreciated.
Your having clause is good; the issue's with your group by. Try this:
select user_id
, count(distinct product) NumberOfProducts
from table
group by user_id
having count(distinct product) = 1
Or you could do this; which is closer to your original:
select user_id
from table
group by user_id
having count(*) < 2
Depending on the database, the group by clause may not accept ordinal arguments the way the order by clause does (PostgreSQL and MySQL do accept them). Where ordinals are not supported, grouping by a value like 1 means grouping by the literal value 1, which is the same for every row in the table, so all the rows end up in one group. Since there is more than one product in the entire table, no rows will be returned.
Instead, you should group by the user_id:
SELECT user_id
FROM mytable
GROUP BY user_id
HAVING COUNT(*) = 1
If you want the product, then do:
select user_id, max(product) as product
from table
group by user_id
having min(product) = max(product);
The having clause could also be:
having count(distinct product) = 1
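The difference between the two groupings is easy to see on the sample data. In the sketch below, the table name user_products is made up (the question's table is literally called "table", which is a reserved word), and SQLite via Python's sqlite3 is assumed as the engine.

```python
import sqlite3

# Assumption: SQLite as the engine; user_products is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_products (user_id INT, product TEXT)")
conn.executemany(
    "INSERT INTO user_products VALUES (?, ?)",
    [(1, "A"), (1, "B"), (1, "C"), (2, "B"), (3, "B"), (3, "C"), (3, "D")],
)

# Grouping by both columns: every (user_id, product) pair is its own group
# of size 1, so the HAVING filter lets all seven rows through.
grouped_by_both = conn.execute("""
    SELECT user_id, product
    FROM user_products
    GROUP BY user_id, product
    HAVING count(*) < 2
""").fetchall()

# Grouping by user_id only: the count now describes the user, and
# min(product) = max(product) holds exactly when there is one product type.
single_product_users = conn.execute("""
    SELECT user_id, max(product) AS product
    FROM user_products
    GROUP BY user_id
    HAVING min(product) = max(product)
    ORDER BY user_id
""").fetchall()

print(len(grouped_by_both), single_product_users)
```

Only user 2 carries a single product type in this data.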

Using Qualify with Rank in Teradata

I am trying to get the latest two month start dates on which a particular product was sold in Teradata. Since a product was sold in multiple months, I should only get the latest two selling months for each product.
Trying to use Qualify with Dense Rank :
SELECT DISTINCT PRODUCT, MONTH_START_DATE,
DENSE_RANK() OVER (PARTITION BY PRODUCT ORDER BY MONTH_START_DATE DESC ) AS RNK
FROM EMP_TABLE
WHERE PRODUCT = 'SOAP'
This will give me Different months with Rank and Product. Something like this :
+---------+------------------+------+
| Product | Month_start_date | RNK |
+---------+------------------+------+
| SOAP | 2016-12-01 | 1 |
| SOAP | 2016-11-01 | 2 |
| SOAP | 2016-10-01 | 3 |
+---------+------------------+------+
But if I rewrite code to get only top 2 :
SELECT DISTINCT PRODUCT, MONTH_START_DATE,
DENSE_RANK() OVER (PARTITION BY PRODUCT ORDER BY MONTH_START_DATE DESC ) AS RNK
FROM EMP_TABLE
WHERE PRODUCT = 'SOAP'
QUALIFY RNK < 3
I always get only the top rank result. What is the reason for this? A subquery solves it, but I wanted to understand why QUALIFY gives only the top row.
Thanks for the help.

GROUP BY PostgreSQL query where I need a column that is not in the GROUP BY clause [duplicate]

I have a database that parallels the 'widget' database below.
widget_id | vendor_id | price
------------------------------
1 | 101 | 10.00
2 | 101 | 9.00
3 | 102 | 6.00
4 | 102 | 7.00
I want to find the cheapest widget by vendor, so something like the below output:
widget_id | vendor_id | price
------------------------------
2 | 101 | 9.00
3 | 102 | 6.00
In MySQL or SQLite, I could query
SELECT widget_id, vendor_id, min( price ) AS price FROM widgets GROUP BY( vendor_id )
However, it seems that this is contrary to the SQL spec. In PostgreSQL, I'm unable to run the above query. The error message is "widget_id must appear in the GROUP BY clause or be used in an aggregate function". I can kind of see PostgreSQL's point, but it seems like a perfectly reasonable thing to want the widget_id of the widget that has the minimum price.
What am I doing wrong?
You can use DISTINCT ON:
SELECT DISTINCT ON (vendor_id) *
FROM widget
ORDER BY vendor_id, price;
You can also use the row_number window function in a subquery:
SELECT widget_id, vendor_id, price
FROM (
SELECT *, row_number() OVER (PARTITION BY vendor_id ORDER BY price) AS rn
FROM widget
) t
WHERE rn=1;
Finally, you can also do it with a LATERAL join:
SELECT t2.*
FROM
(SELECT DISTINCT vendor_id FROM widget) t1,
LATERAL (SELECT * FROM widget WHERE vendor_id=t1.vendor_id ORDER BY price LIMIT 1) t2
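The LATERAL variant boils down to "for each vendor, look up its cheapest widget with ORDER BY price LIMIT 1". The sketch below shows that per-vendor lookup on the question's sample rows. Note the substitution: it assumes SQLite via Python's sqlite3, which has no LATERAL, so a correlated scalar subquery is used in its place.

```python
import sqlite3

# Assumption: SQLite as the engine; it lacks LATERAL (and DISTINCT ON),
# so a correlated subquery performs the per-vendor top-1 lookup instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE widget (widget_id INT, vendor_id INT, price REAL)")
conn.executemany(
    "INSERT INTO widget VALUES (?, ?, ?)",
    [(1, 101, 10.00), (2, 101, 9.00), (3, 102, 6.00), (4, 102, 7.00)],
)

# For each row, the subquery finds the cheapest widget of the same vendor
# (ties broken by widget_id); the row is kept only if it is that widget.
cheapest_per_vendor = conn.execute("""
    SELECT w.widget_id, w.vendor_id, w.price
    FROM widget AS w
    WHERE w.widget_id = (
        SELECT w2.widget_id
        FROM widget AS w2
        WHERE w2.vendor_id = w.vendor_id
        ORDER BY w2.price, w2.widget_id
        LIMIT 1
    )
    ORDER BY w.vendor_id
""").fetchall()

print(cheapest_per_vendor)
```

The extra widget_id in the inner ORDER BY is only there to make ties deterministic; the original queries leave ties to the planner.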

PostgreSQL list companies and rank by sales

So I have:
companies (id, name, tenant_id)
invoices (id, company_id, tenant_id, total)
What I want to do is return a result set like:
company | Feb Sales | Feb Rank | Lifetime Sales | Lifetime Rank
-----------------------------------------------------------------------
ABC Comp | 1,000 | 1 | 2,000 | 2
XYZ Corp | 500 | 2 | 5,000 | 1
I can do the sales totals using subselects, but when I do, the rank always returns 1. I'm assuming that is because each subselect only returns 1 row, which will always be the top row?
Here is a piece of the sql:
SELECT
"public".companies."name",
(
SELECT
rank() OVER (PARTITION BY i.tenant_id ORDER BY sum(grand_total) DESC) AS POSITION
FROM
invoices i
where
company_id = companies.id
group by
i.tenant_id, i.company_id
)
from companies
Below is an untested version that may contain typos. Please treat it just as a description of the approach. For simplicity I assumed that invoices have a month column.
SELECT
"public".companies."name",
-- rank across all companies of a tenant, highest total first
rank() OVER (PARTITION BY companies.tenant_id ORDER BY sales.lifetime DESC NULLS LAST) AS "Lifetime Rank",
rank() OVER (PARTITION BY companies.tenant_id ORDER BY sales.month_total DESC NULLS LAST) AS "One Month Rank"
FROM companies LEFT JOIN
(
-- one row per company, so the outer rank() has all companies to compare
SELECT
i.company_id,
SUM(grand_total) AS lifetime,
SUM(CASE WHEN i.month = <the month of report> THEN grand_total ELSE 0 END) AS month_total
FROM
invoices i
GROUP BY i.company_id
) sales
ON companies.id = sales.company_id
If you run into problems, add the actual code that you used and sample data to your post and I will attempt to create a live demo for you.
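To illustrate the approach (aggregate per company first, then rank over the aggregated rows), here is a small sketch on made-up data. Everything in it is an assumption for demonstration: SQLite 3.25 or later via Python's sqlite3 as the engine, a text month column on invoices, the month value 'feb', and invoice amounts chosen to reproduce the totals in the question's desired output.

```python
import sqlite3

# Assumptions: SQLite (3.25+) as the engine, a hypothetical "month" column,
# and invented invoice rows that add up to the totals shown in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE companies (id INT, name TEXT, tenant_id INT)")
conn.execute(
    "CREATE TABLE invoices (id INT, company_id INT, tenant_id INT,"
    " grand_total INT, month TEXT)"
)
conn.executemany(
    "INSERT INTO companies VALUES (?, ?, ?)",
    [(1, "ABC Comp", 1), (2, "XYZ Corp", 1)],
)
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 1, 1000, "feb"), (2, 1, 1, 1000, "jan"),
     (3, 2, 1, 500, "feb"), (4, 2, 1, 4500, "jan")],
)

# The subquery yields one row per company; rank() then runs over all
# companies of a tenant at once, instead of over a single-row subselect.
ranking = conn.execute("""
    SELECT c.name,
           s.month_total,
           rank() OVER (PARTITION BY c.tenant_id ORDER BY s.month_total DESC),
           s.lifetime,
           rank() OVER (PARTITION BY c.tenant_id ORDER BY s.lifetime DESC)
    FROM companies AS c
    LEFT JOIN (
        SELECT company_id,
               SUM(grand_total) AS lifetime,
               SUM(CASE WHEN month = 'feb' THEN grand_total ELSE 0 END) AS month_total
        FROM invoices
        GROUP BY company_id
    ) AS s ON c.id = s.company_id
    ORDER BY c.name
""").fetchall()

for row in ranking:
    print(row)
```

In the question's version the window function only ever saw the one row produced by the correlated subselect, which is why the rank was always 1.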