PostgreSQL - Select row with composite maximum value from 2 columns - sql

I would like to select the best offer for each merchant in PostgreSQL 9.6 according to some simple rules:
An offer is better than another if its discount value is greater, regardless of the benefit type.
If the discount values are equal, then the one with benefit type ALL beats the one with FOOD.
If both discount and benefit type are the same, then any offer can be selected, e.g. pick the first one.
So the best offer is not just a max() call but a "conditional" max(), where another column must be inspected too to determine which row wins.
Could you please help?
Schema:
create table offer (
    id bigserial not null,
    discount int4,
    benefit_type varchar(25),
    ...
    merchant_id int8 not null
);
Query (partial):
select merchant_id, max(discount) as max_discount
from offer
group by merchant_id;
Sample offers in DB:
id  discount  benefit_type  ...  merchant_id
0   10        FOOD               0
1   20        FOOD               0
2   20        ALL                0
3   30        ALL                1
4   40        ALL                1
5   40        FOOD               1
6   40        ALL                2
7   50        FOOD               2
Desired result set:
merchant_id  max_discount  benefit_type
0            20            ALL
1            40            ALL
2            50            FOOD
Merchant 0's best offer is offer 2 because 20 ALL > 20 FOOD.
Merchant 1's best offer is offer 4 because 40 ALL > 30 ALL and 40 ALL > 40 FOOD.
Merchant 2's best offer is offer 7 because 50 FOOD > 40 ALL.

This can be achieved using distinct on () and a custom sort order for benefit_type:
select distinct on (merchant_id) *
from offer
order by merchant_id,
         discount desc,
         case when benefit_type = 'ALL' then 1 else 2 end;
distinct on (merchant_id) keeps only the first row for each merchant_id in the order given by order by. The sort prefers the higher discount; if two discounts are equal, a benefit_type of ALL is used as the tie-breaker.
Online example: http://rextester.com/TFBP17217
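For anyone who wants to check the ordering logic without a PostgreSQL instance, here is a minimal sketch using Python's built-in sqlite3 module. SQLite has no distinct on, so this swaps in row_number() with the same sort order (this needs SQLite 3.25 or later); the extra id tie-breaker and the column subset are my assumptions, not part of the answer above.

```python
import sqlite3

# In-memory stand-in for the offer table (only the columns used here).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table offer (id integer primary key, discount integer, "
    "benefit_type text, merchant_id integer not null)"
)
conn.executemany(
    "insert into offer values (?, ?, ?, ?)",
    [
        (0, 10, "FOOD", 0), (1, 20, "FOOD", 0), (2, 20, "ALL", 0),
        (3, 30, "ALL", 1), (4, 40, "ALL", 1), (5, 40, "FOOD", 1),
        (6, 40, "ALL", 2), (7, 50, "FOOD", 2),
    ],
)

# row_number() = 1 plays the role of "distinct on (merchant_id)":
# rank offers per merchant by discount, then ALL before anything else,
# then id (assumed tie-breaker so the pick is deterministic).
best_offer_sql = """
select merchant_id, discount, benefit_type
from (
    select o.*,
           row_number() over (
               partition by merchant_id
               order by discount desc,
                        case when benefit_type = 'ALL' then 1 else 2 end,
                        id
           ) as rn
    from offer o
) ranked
where rn = 1
order by merchant_id
"""
best_offers = conn.execute(best_offer_sql).fetchall()
print(best_offers)
```

On the sample offers this should reproduce the desired result set from the question.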

Related

Multiple select statements - Single result

I have a product list view and a v1 value which is used for mapping.
For example, in the ERP view it looks like this:
PRODUCT_ID  PRODUCT_NAME        QUANTITY  v1
1           WOOD PALLET 120x80  10        25
2           WOOD PALLET 200x80  5         25
25          RAW MATERIAL WOOD   100       25
In Postgres we have 2 tables for it.
Table: PRODUCTS
PRODUCT_ID  PRODUCT_NAME        QUANTITY
1           WOOD PALLET 120x80  10
2           WOOD PALLET 200x80  5
25          RAW MATERIAL WOOD   100
Table: PRODUCTS_MAPPING
PRODUCT_ID  MAPPING_KEY
1           25
2           25
25          25
Now I need a query that returns a single row: the total quantity grouped by v1, so the result should be 115.
I tried the query below:
(
SELECT
SUM( pr.quantity )
FROM
products AS pr
WHERE
pr.product_id = ( SELECT map.product_id FROM products_mapping AS map WHERE mapping_key = v1 )
)
My problem is that the second SELECT statement after WHERE returns multiple rows. I need a statement that performs the following calculation:
Check the v1 value (25).
Go to the product mapping table and find the 3 entries for mapping key 25.
Go to the products table, sum the quantity for product_ids 1, 2 and 25, and give the result 10+5+100 = 115.
If I understood correctly, you want a single result, but the subquery compared with = returns several rows and your statement has no group by clause. I have rewritten your statement using a join:
SELECT sum(pr.quantity)
FROM products pr
JOIN products_mapping map ON pr.product_id = map.product_id
WHERE map.mapping_key = v1
GROUP BY map.mapping_key
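To see the join produce the expected 115, here is a small sketch using Python's sqlite3 module as a stand-in for Postgres. The helper name total_quantity is hypothetical, and v1 is passed as a bound parameter, which is an assumption about how the value reaches the query.

```python
import sqlite3

# In-memory copies of the two tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table products (product_id integer primary key,
                       product_name text, quantity integer);
create table products_mapping (product_id integer, mapping_key integer);
insert into products values
    (1, 'WOOD PALLET 120x80', 10),
    (2, 'WOOD PALLET 200x80', 5),
    (25, 'RAW MATERIAL WOOD', 100);
insert into products_mapping values (1, 25), (2, 25), (25, 25);
""")

def total_quantity(mapping_key):
    """Sum the quantity of every product mapped to mapping_key."""
    row = conn.execute(
        """
        select sum(pr.quantity)
        from products pr
        join products_mapping map on pr.product_id = map.product_id
        where map.mapping_key = ?
        """,
        (mapping_key,),
    ).fetchone()
    return row[0]  # None when no product maps to the key

print(total_quantity(25))
```

Without the group by, the aggregate still returns exactly one row; for a key with no mapped products the sum is NULL, which sqlite3 reports as None.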

Select column's occurrence order without group by

I currently have two tables, users and coupons.
users:
id  first_name
1   Roberta
2   Oliver
3   Shayna
4   Fechin
coupons:
id  discount  user_id
1   20%       1
2   40%       2
3   15%       3
4   30%       1
5   10%       1
6   70%       4
What I want to do is select from the coupons table until I've selected X users.
So if I chose X = 2, the resulting table would be
id  discount  user_id
1   20%       1
2   40%       2
4   30%       1
5   10%       1
I've tried using both dense_rank and row_number, but they return the count of occurrences of each user_id, not its order.
SELECT id,
discount,
user_id,
dense_rank() OVER (PARTITION BY user_id)
FROM coupons
I'm guessing I need to do it in multiple subqueries (which is fine), where the first subquery would return something like
id  discount  user_id  order_of_occurence
1   20%       1        1
2   40%       2        2
3   15%       3        3
4   30%       1        1
5   10%       1        1
6   70%       4        4
which I can then use to filter by what I need.
PS: I'm using postgresql.
You've stated that you want to parameterize the query so that you can retrieve X users. I'm reading that as all coupons for the first X distinct user_ids, in coupon id order.
Your attempt was close. dense_rank() is the right idea, but it has to rank across the entire table rather than within each user_id partition, and it needs a sorting column to determine the ranking. Ordering by id alone would give every coupon its own rank, so first compute the earliest coupon id per user and rank by that:
with first_seen as (
    select *,
           min(id) over (partition by user_id) as first_id
    from coupons
), data as (
    select *,
           dense_rank() over (order by first_id) as dr
    from first_seen
)
select * from data where dr <= <X>;
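As a quick check against the sample data, the sketch below ranks each user by the id of their earliest coupon and keeps the coupons of the first X users. It uses Python's sqlite3 module as a stand-in for PostgreSQL (window functions need SQLite 3.25 or later); the function name coupons_for_first_users and the first_id column are hypothetical names chosen for illustration.

```python
import sqlite3

# In-memory copy of the coupons table from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table coupons (id integer primary key, discount text, user_id integer)"
)
conn.executemany(
    "insert into coupons values (?, ?, ?)",
    [(1, "20%", 1), (2, "40%", 2), (3, "15%", 3),
     (4, "30%", 1), (5, "10%", 1), (6, "70%", 4)],
)

def coupons_for_first_users(x):
    """Return all coupons of the first x users, by first appearance."""
    return conn.execute(
        """
        with first_seen as (
            -- earliest coupon id of the user owning each row
            select c.*, min(id) over (partition by user_id) as first_id
            from coupons c
        ), ranked as (
            -- users sharing a first_id share a rank: order of occurrence
            select id, discount, user_id,
                   dense_rank() over (order by first_id) as dr
            from first_seen
        )
        select id, discount, user_id
        from ranked
        where dr <= ?
        order by id
        """,
        (x,),
    ).fetchall()

print(coupons_for_first_users(2))
```

With X = 2 this should return coupons 1, 2, 4 and 5, matching the table in the question.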

How do I make a query that selects where the SUM equals a fixed value

I've spent the last couple of days searching for a way to write a SQL query that returns records where the SUM over rows with the same ID is equal to or greater than a given value.
I've been using the W3Schools database to test it out, specifically the products table.
This is roughly what I've been trying to do:
SELECT * FROM products
WHERE supplierid=? and SUM(price) > 50
The idea is to go through each supplier, sum the prices of that supplier's products, and return the records only if the sum is higher than 50.
For example, supplier ID 1 gives 18+19+10=47, and 47 < 50, so those records are not returned. Supplier ID 2 gives 22+21.35=43.35, so those records are not returned either. Records are returned only for suppliers whose summed price is higher than 50.
I'm working with a DB2 database.
SAMPLE data:
ProductID  ProductName  SupplierID  CategoryID  Price
1          Chais        1           1           18
2          Chang        1           1           19
3          Aniseed      1           2           10
4          Chef Anton   2           2           22
5          Chef Anton   2           2           21.35
6          Grandma's    3           2           25
7          Uncle Bob    3           7           30
8          Northwoods   3           2           40
9          Mishi        4           6           97
10         Ikura        4           8           31
11         Queso        5           4           21
12         Queso        5           4           38
13         Konbu        6           8           6
14         Tofu         6           7           23.25
How about:
select * from products where supplierid in (
select supplierid
from products
group by supplierid
having sum(price) > 50
);
The subquery finds out all the supplierid values that match your condition. The main (external) query retrieves all rows that match the list of supplierids.
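Here is a minimal sketch of that subquery approach run against the sample data, using Python's sqlite3 module as a stand-in for DB2 (an assumption made purely so the example is runnable; the query itself is plain SQL).

```python
import sqlite3

# In-memory copy of the sample products table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table products (productid integer primary key, productname text, "
    "supplierid integer, categoryid integer, price real)"
)
conn.executemany(
    "insert into products values (?, ?, ?, ?, ?)",
    [
        (1, "Chais", 1, 1, 18), (2, "Chang", 1, 1, 19), (3, "Aniseed", 1, 2, 10),
        (4, "Chef Anton", 2, 2, 22), (5, "Chef Anton", 2, 2, 21.35),
        (6, "Grandma's", 3, 2, 25), (7, "Uncle Bob", 3, 7, 30),
        (8, "Northwoods", 3, 2, 40), (9, "Mishi", 4, 6, 97), (10, "Ikura", 4, 8, 31),
        (11, "Queso", 5, 4, 21), (12, "Queso", 5, 4, 38),
        (13, "Konbu", 6, 8, 6), (14, "Tofu", 6, 7, 23.25),
    ],
)

# Inner query: suppliers whose total price exceeds 50.
# Outer query: every product row belonging to one of those suppliers.
rows = conn.execute(
    """
    select productid, supplierid, price
    from products
    where supplierid in (
        select supplierid
        from products
        group by supplierid
        having sum(price) > 50
    )
    order by productid
    """
).fetchall()
product_ids = [r[0] for r in rows]
print(product_ids)
```

Suppliers 3, 4 and 5 have totals of 95, 128 and 59, so only their products should come back.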
Not tested, but I would expect DB2 to support analytic functions and CTEs, so perhaps:
with basedata as (
    select t.*,
           sum(t.price) over (partition by t.supplierid) as sum_price
    from products t
)
select *
from basedata
where supplierid = ?
  and sum_price > 50
The analytic function aggregates the price per supplier without collapsing the result set, so you still get the individual rows, restricted to those whose supplier has an aggregated price > 50.
Compared with the subquery solution, the analytic function may be more efficient, since the table should only need to be read once to produce the result.
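The window-function variant can be tried the same way. This sketch again uses Python's sqlite3 module in place of DB2 (window functions need SQLite 3.25 or later); products_for_supplier and its threshold parameter are hypothetical names for illustration, and only a subset of the sample rows is loaded to keep it short.

```python
import sqlite3

# Subset of the sample data: suppliers 1 (total 47) and 3 (total 95).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table products (productid integer primary key, productname text, "
    "supplierid integer, categoryid integer, price real)"
)
conn.executemany(
    "insert into products values (?, ?, ?, ?, ?)",
    [
        (1, "Chais", 1, 1, 18), (2, "Chang", 1, 1, 19), (3, "Aniseed", 1, 2, 10),
        (6, "Grandma's", 3, 2, 25), (7, "Uncle Bob", 3, 7, 30),
        (8, "Northwoods", 3, 2, 40),
    ],
)

def products_for_supplier(supplier_id, threshold=50):
    """Rows of one supplier, only if that supplier's total price > threshold."""
    return conn.execute(
        """
        with basedata as (
            -- attach the per-supplier total to every row, no grouping
            select t.*,
                   sum(t.price) over (partition by t.supplierid) as sum_price
            from products t
        )
        select productid, productname, price, sum_price
        from basedata
        where supplierid = ? and sum_price > ?
        order by productid
        """,
        (supplier_id, threshold),
    ).fetchall()

print(products_for_supplier(3))
print(products_for_supplier(1))
```

Supplier 3 should return its three rows, each carrying the total in sum_price, while supplier 1 should return nothing.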

Grouping by multiple fields

I have a table of ParentIDs, which are products made by combining the required amount of each corresponding BaseID product.
Product table:
ParentID BaseID Required UOH
-------------------------------------
1 55 1 400
1 56 .5 400
2 55 1 400
2 57 1 400
3 58 1 0
I need to select the ParentIDs for which there is enough of each required base product on hand (UOH) to create the parent.
The query should return
ParentID
----------------
1
2
The only way I know how to do this is by using a pivot view. Is there another or a better way to accomplish this?
Thanks
You can use group by and having:
select parentid
from product t
group by parentid
having sum(case when uoh < required then 1 else 0 end) = 0
The having clause counts the number of times where uoh is less than required. If the count is zero, then all base ids have sufficient amounts.
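A small runnable sketch of this having filter, using Python's sqlite3 module and the sample rows from the question. The table name product is an assumption, since the question only calls it the "Product table".

```python
import sqlite3

# In-memory copy of the sample data; "product" is an assumed table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table product (parentid integer, baseid integer, "
    "required real, uoh real)"
)
conn.executemany(
    "insert into product values (?, ?, ?, ?)",
    [(1, 55, 1, 400), (1, 56, 0.5, 400), (2, 55, 1, 400),
     (2, 57, 1, 400), (3, 58, 1, 0)],
)

# Count, per parent, the base products that are short (uoh < required);
# a parent qualifies only when that count is zero.
buildable = [
    row[0]
    for row in conn.execute(
        """
        select parentid
        from product
        group by parentid
        having sum(case when uoh < required then 1 else 0 end) = 0
        order by parentid
        """
    )
]
print(buildable)
```

Parent 3 is excluded because its only base product has 0 units on hand against 1 required.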

Efficient ways to count the number of times two items are ordered together

I am currently stuck on a problem where I have to write a SQL query to count the number of times a pair of items is ordered together.
The table that I have at my disposal is something like:
ORDER_ID | PRODUCT_ID | QUANTITY
1        | 1          | 10
1        | 2          | 20
1        | 3          | 10
2        | 1          | 10
2        | 2          | 20
3        | 3          | 50
4        | 2          | 10
I am looking to write a SQL query that can, for every unique pair of items, count the number of times they were ordered together and tell me the quantities when they were in the same order.
The resulting table should look like:
PRODUCT_ID_1 | PRODUCT_ID_2 | NUM_JOINT_ORDERS | SUM_QUANTITY_1 | SUM_QUANTITY_2
1            | 2            | 2                | 20             | 40
1            | 3            | 1                | 10             | 10
2            | 3            | 1                | 20             | 10
Some things to exploit are that:
Some orders only contain 1 item and so are not relevant in counting the pairwise relationship (not sure how to exclude these but maybe it makes sense to filter them first)
We only need to list the pairwise relationship once in the final table (so maybe a WHERE PRODUCT_ID_1 < PRODUCT_ID_2)
There is a similar post here, though I have reposted the question because
I really want to know the fastest way to do this since my original table is huge and my computational resources are limited, and
in this case I only have a single table and no table that lists the number.
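You may use the following approach, which gives you the result shown above. Orders with a single item drop out on their own, because they contain no pair of rows with T1.PRODUCT_ID < T2.PRODUCT_ID.
select
    PRODUCT1 as PRODUCT_ID_1,
    PRODUCT2 as PRODUCT_ID_2,
    count(*) as NUM_JOINT_ORDERS,
    sum(QUANTITY1) as SUM_QUANTITY_1,
    sum(QUANTITY2) as SUM_QUANTITY_2
from (
    select
        T1.PRODUCT_ID as PRODUCT1,
        T2.PRODUCT_ID as PRODUCT2,
        T1.QUANTITY as QUANTITY1,
        T2.QUANTITY as QUANTITY2
    from TABLE as T1  -- TABLE stands for your actual table name
    join TABLE as T2
      on T1.ORDER_ID = T2.ORDER_ID
     and T1.PRODUCT_ID < T2.PRODUCT_ID  -- list each pair only once
) as PAIRS  -- many databases require an alias for a derived table
group by PRODUCT1, PRODUCT2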
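To confirm the self-join against the sample orders, here is a minimal sketch using Python's sqlite3 module. The table name order_items is hypothetical, since the question does not name the table, and this says nothing about performance on a huge table; it only checks the logic.

```python
import sqlite3

# In-memory copy of the sample data; "order_items" is an assumed table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table order_items (order_id integer, product_id integer, "
    "quantity integer)"
)
conn.executemany(
    "insert into order_items values (?, ?, ?)",
    [(1, 1, 10), (1, 2, 20), (1, 3, 10), (2, 1, 10),
     (2, 2, 20), (3, 3, 50), (4, 2, 10)],
)

# Self-join on the order; product_id < product_id keeps each pair once
# and drops single-item orders, which have no second row to pair with.
pair_counts = conn.execute(
    """
    select t1.product_id as product_id_1,
           t2.product_id as product_id_2,
           count(*) as num_joint_orders,
           sum(t1.quantity) as sum_quantity_1,
           sum(t2.quantity) as sum_quantity_2
    from order_items t1
    join order_items t2
      on t1.order_id = t2.order_id
     and t1.product_id < t2.product_id
    group by t1.product_id, t2.product_id
    order by product_id_1, product_id_2
    """
).fetchall()
print(pair_counts)
```

The three rows it returns should match the expected result table in the question.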