Join max records in PostgreSQL - sql

I have two tables:
products
+----+--------+
| id | name |
+----+--------+
| 1 | Orange |
| 2 | Juice |
| 3 | Fance |
+----+--------+
reviews
+----+------------+-------+------------+
| id | created_at | price | product_id |
+----+------------+-------+------------+
| 1 | 12/12/20 | 2 | 1 |
| 2 | 12/14/20 | 4 | 1 |
| 3 | 12/15/20 | 5 | 2 |
+----+------------+-------+------------+
How can I get a list of products ordered by the price of the most recent (max created_at) review?
+------------+--------+-----------+-------+
| product_id | name | review_id | price |
+------------+--------+-----------+-------+
| 2 | Juice | 3 | 5 |
| 1 | Orange | 2 | 4 |
| 3 | Fance | | |
+------------+--------+-----------+-------+
I use the latest PostgreSQL.

demo:db<>fiddle
Using DISTINCT ON
SELECT
*
FROM (
SELECT DISTINCT ON (p.id)
p.id,
p.name,
r.id as review_id,
r.price
FROM
reviews r
RIGHT JOIN products p ON r.product_id = p.id
ORDER BY p.id, r.created_at DESC NULLS LAST
) s
ORDER BY price DESC NULLS LAST
1. Join both tables (products LEFT JOIN reviews or reviews RIGHT JOIN products).
2. Now set up the ordering. First order by product so that each product's rows are grouped together, then by date in descending order so that the most recent entry is the first row of each group.
3. DISTINCT ON always keeps the first row of each ordered group, so you get the most recent entry per product.
4. To sort the product rows, put steps 1-3 into a subquery and order by price afterwards.
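The steps above can be tried out without a PostgreSQL server. The following is only a sketch under assumptions: it uses Python's built-in sqlite3 module as a stand-in, SQLite has no DISTINCT ON, so ROW_NUMBER() with rn = 1 plays the role of "first row of each ordered group", and the dates are rewritten in ISO format so that text order matches date order.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE reviews (
    id INTEGER PRIMARY KEY,
    created_at TEXT,   -- ISO dates (assumption) so text order equals date order
    price INTEGER,
    product_id INTEGER REFERENCES products (id)
);
INSERT INTO products VALUES (1, 'Orange'), (2, 'Juice'), (3, 'Fance');
INSERT INTO reviews VALUES
    (1, '2020-12-12', 2, 1),
    (2, '2020-12-14', 4, 1),
    (3, '2020-12-15', 5, 2);
""")

latest_reviews = conn.execute("""
SELECT product_id, name, review_id, price
FROM (
    SELECT p.id AS product_id, p.name, r.id AS review_id, r.price,
           -- rn = 1 marks the newest review of each product
           ROW_NUMBER() OVER (
               PARTITION BY p.id ORDER BY r.created_at DESC
           ) AS rn
    FROM products p
    LEFT JOIN reviews r ON r.product_id = p.id
) s
WHERE rn = 1
ORDER BY price IS NULL, price DESC  -- portable spelling of NULLS LAST
""").fetchall()

for row in latest_reviews:
    print(row)
```

The product without any review still appears, with NULL (None in Python) for review_id and price, and is sorted to the end.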

DISTINCT ON and an outer join is a good approach, but I would handle this as:
SELECT . . . -- whatever columns you want
FROM products p LEFT JOIN
(SELECT DISTINCT ON (r.product_id) r.*
FROM reviews r
ORDER BY r.product_id, r.created_at DESC NULLS LAST
) r
ON r.product_id = p.id
ORDER BY r.price DESC NULLS LAST;
The difference in doing DISTINCT ON before the JOIN or after may look minor. But this version of the query can take advantage of an index on reviews(product_id, created_at desc). And that could be a big performance win on a lot of data.
Indexes cannot be used for an ORDER BY that mixes columns from different tables.
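A rough sketch of this "reduce reviews first, then join" shape, again with Python's sqlite3 as a stand-in for PostgreSQL (assumption). ROW_NUMBER() replaces DISTINCT ON because SQLite lacks it, the index name is made up for illustration, and whether a planner actually uses the index depends on the engine and the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE reviews (
    id INTEGER PRIMARY KEY,
    created_at TEXT,   -- ISO dates (assumption)
    price INTEGER,
    product_id INTEGER
);
INSERT INTO products VALUES (1, 'Orange'), (2, 'Juice'), (3, 'Fance');
INSERT INTO reviews VALUES
    (1, '2020-12-12', 2, 1),
    (2, '2020-12-14', 4, 1),
    (3, '2020-12-15', 5, 2);
-- Hypothetical index name; the column list follows the answer.
CREATE INDEX reviews_product_created
    ON reviews (product_id, created_at DESC);
""")

rows = conn.execute("""
SELECT p.id, p.name, r.id AS review_id, r.price
FROM products p
LEFT JOIN (
    -- Only columns of reviews are involved in picking the latest row,
    -- so the ordering never mixes columns from two tables.
    SELECT id, price, product_id,
           ROW_NUMBER() OVER (
               PARTITION BY product_id ORDER BY created_at DESC
           ) AS rn
    FROM reviews
) r ON r.product_id = p.id AND r.rn = 1
ORDER BY r.price IS NULL, r.price DESC
""").fetchall()

for row in rows:
    print(row)
```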

Related

Postgresql left join

I have two tables, cars and usages. I create a record in usages once a month for some of the cars.
Now I want to get a distinct list of cars, each with the latest usage that I saved.
First of all, please look at the tables:
cars:
| id | model | reseller_id |
|----|-------------|-------------|
| 1 | Samand Sall | 324228 |
| 2 | Saba 141 | 92933 |
usages:
| id | car_id | year | month | gas |
|----|--------|------|-------|-----|
| 1 | 2 | 2020 | 2 | 68 |
| 2 | 2 | 2020 | 3 | 94 |
| 3 | 2 | 2020 | 4 | 33 |
| 4 | 2 | 2020 | 5 | 12 |
Here is the problem:
I need only the latest usage by year and month.
I tried a lot of ways, but none of them is good enough, because sometimes this query returns a usage record that is not the latest one.
SELECT * FROM cars AS c
LEFT JOIN
(select *
from usages
) u on (c.id = u.car_id)
order by u.gas desc
You can do this with a DISTINCT ON in the derived table:
SELECT *
FROM cars AS c
LEFT JOIN (
select distinct on (u.car_id) *
from usages u
order by u.car_id, u.year desc, u.month desc
) lu on c.id = lu.car_id
order by lu.gas desc;
I think you need the window function row_number. Here is the demo.
select
id,
model,
reseller_id
from
(
select
c.id,
model,
reseller_id,
row_number() over (partition by u.car_id order by u.id desc) as rn
from cars c
left join usages u
on c.id = u.car_id
) subq
where rn = 1
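For comparison, here is a small sketch that also returns the usage columns and ranks by year and month rather than by id. It runs on Python's sqlite3 as a stand-in (assumption), so it relies on ROW_NUMBER() instead of DISTINCT ON; the data is the sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cars (id INTEGER PRIMARY KEY, model TEXT, reseller_id INTEGER);
CREATE TABLE usages (
    id INTEGER PRIMARY KEY, car_id INTEGER,
    year INTEGER, month INTEGER, gas INTEGER
);
INSERT INTO cars VALUES (1, 'Samand Sall', 324228), (2, 'Saba 141', 92933);
INSERT INTO usages VALUES
    (1, 2, 2020, 2, 68),
    (2, 2, 2020, 3, 94),
    (3, 2, 2020, 4, 33),
    (4, 2, 2020, 5, 12);
""")

rows = conn.execute("""
SELECT c.id, c.model, u.year, u.month, u.gas
FROM cars c
LEFT JOIN (
    SELECT car_id, year, month, gas,
           -- newest (year, month) per car gets rn = 1
           ROW_NUMBER() OVER (
               PARTITION BY car_id ORDER BY year DESC, month DESC
           ) AS rn
    FROM usages
) u ON u.car_id = c.id AND u.rn = 1
ORDER BY c.id
""").fetchall()

for row in rows:
    print(row)
```

Cars without any usage row are kept by the LEFT JOIN and show NULL in the usage columns.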

Return rows from a table and add field for that row if the ID has a relationship with another table

DBMS used: Amazon Aurora
I have a table that I store a list of all my products, let's call it products
+----+--------------+
| id | product_name |
+----+--------------+
| 1 | Product 1 |
+----+--------------+
| 2 | Product 2 |
+----+--------------+
| | |
+----+--------------+
Another table called redeemed_products stores the ID of the product that the user has redeemed.
+----+---------+------------+
| id | user_id | product_id |
+----+---------+------------+
| 1 | 1 | 1 |
+----+---------+------------+
| | | |
+----+---------+------------+
| | | |
+----+---------+------------+
I would like to retrieve all rows of products and add an extra field to the row which has a relation in the redeemed_products
+----+--------------+----------+
| id | product_name | redeemed |
+----+--------------+----------+
| 1 | Product 1 | true |
+----+--------------+----------+
| 2 | Product 2 | |
+----+--------------+----------+
| | | |
+----+--------------+----------+
The purpose of this is to retrieve the list of products and show which of the products have already been redeemed by the user. I do not know how I should approach this problem.
Use an outer join:
select p.id, p.product_name, rp.product_id is not null as redeemed
from products p
left join redeemed_products rp on rp.product_id = p.id;
Note that this will repeat rows from the products table if the product_id occurs more than once in the redeemed_products table (e.g. the same product_id for multiple user_ids).
If that is the case you could use a scalar sub-select:
select p.id, p.product_name,
exists (select *
from redeemed_products rp
where rp.product_id = p.id) as redeemed
from products p;
You haven't tagged your DBMS. The above is standard ANSI SQL, but not all DBMS products actually support boolean expressions like that in the SELECT list.
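To see the row duplication and how EXISTS avoids it, here is a minimal sketch using Python's sqlite3 as a stand-in DBMS (assumption). The two redemption rows are made-up data in which the same product was redeemed by two different users. SQLite reports the boolean as 1 or 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE redeemed_products (
    id INTEGER PRIMARY KEY, user_id INTEGER, product_id INTEGER
);
INSERT INTO products VALUES (1, 'Product 1'), (2, 'Product 2');
-- Hypothetical data: product 1 redeemed by users 1 and 2.
INSERT INTO redeemed_products VALUES (1, 1, 1), (2, 2, 1);
""")

# Outer join: one output row per matching redemption, so product 1 repeats.
joined = conn.execute("""
SELECT p.id, p.product_name, rp.product_id IS NOT NULL AS redeemed
FROM products p
LEFT JOIN redeemed_products rp ON rp.product_id = p.id
ORDER BY p.id
""").fetchall()

# EXISTS: exactly one output row per product, however many redemptions exist.
with_exists = conn.execute("""
SELECT p.id, p.product_name,
       EXISTS (SELECT 1
               FROM redeemed_products rp
               WHERE rp.product_id = p.id) AS redeemed
FROM products p
ORDER BY p.id
""").fetchall()

print(joined)
print(with_exists)
```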
One option would be using a conditional within a LEFT JOIN query:
SELECT p.*, CASE WHEN r.product_id IS NOT NULL THEN 'true' END AS redeemed
FROM products p
LEFT JOIN redeemed_products r
ON r.product_id = p.id

Excluding tuples based on maximum condition

I have been trying to solve this SQL query problem, but with no success. The problem is the following:
PROBLEM:
Given 4 tables, PRODUCTS, REPAIRS, OWNERS and MALFUNCTIONS, for each product Brand and Model display the type of malfunction that has been repaired the most times.
The tables have the following fields:
PRODUCTS: *Series_num, Brand, Model, Year, Code_Owner
OWNERS: *Code_Owner, Name, Surname, Street, Civic, City, (u)Phone
MALFUNCTIONS: *Malf_code, Desc
REPAIRS: *Series_num, *Malf_code, *Repair_Date, Price
* <- Primary key
(u) <- Unique attribute
The expected result, given this example of data:
| MODEL | BRAND | MALF_CODE | NUMBER OF REPAIRS|
|----------------------------------------------------|
| 1 | BRAND1 | 1 | 20 |
| 1 | BRAND1 | 2 | 10 |
| 2 | BRAND1 | 1 | 1 |
| 2 | BRAND1 | 2 | 1 |
| 1 | BRAND2 | 1 | 10 |
| 1 | BRAND2 | 2 | 11 |
Should be:
| MODEL | BRAND | MALF_CODE | NUMBER OF REPAIRS|
|----------------------------------------------------|
| 1 | BRAND1 | 1 | 20 |
| 2 | BRAND1 | 1 | 1 |
| 1 | BRAND2 | 2 | 11 |
Note that BRAND1, MODEL:2 has the same number of repairs for two different types of malfunction, so one of the rows can be ignored or both of them can be shown (it does not matter)
WHAT I'VE TRIED:
To get the first table, I used a simple JOIN query:
SELECT A.MODEL, A.BRAND, R.MALF_CODE, COUNT(*) AS N_REP
FROM REPAIRS R LEFT JOIN PRODUCTS A ON A.SERIES_NUM = R.SERIES_NUM
GROUP BY A.MODEL, A.BRAND, R.MALF_CODE;
Then I tried to get the second table thanks to MAX() function:
SELECT A.MODEL, A.BRAND, R.MALF_CODE, COUNT(*) AS N_REP
FROM REPAIRS R LEFT JOIN PRODUCTS A ON A.SERIES_NUM = R.SERIES_NUM
GROUP BY A.MODEL, A.BRAND, R.MALF_CODE
HAVING COUNT(*) IN(
SELECT MAX(R.MALF_CODE)
FROM REPAIRS R LEFT JOIN PRODUCTS A ON A.SERIES_NUM = R.SERIES_NUM
GROUP BY A.MODEL, A.BRAND, R.MALF_CODE
ORDER BY A.BRAND, R.MALF_CODE);
But this throws me the following error:
[42000][907] ORA-00907: Missing closing Parenthesis
It seems I can't find the error.
I hope I've been clear enough. Thanks in advance.
EDIT: I forgot to mention that I'm aware of RANK functions and such, but never heard of Partitions. So a solution without them is highly appreciated but not mandatory.
If I understand correctly, you want the row with the most repairs for each model/brand combination. If so, window functions are one method:
SELECT MODEL, BRAND, MALF_CODE, N_REP
FROM (SELECT P.MODEL, P.BRAND, R.MALF_CODE, COUNT(*) AS N_REP,
ROW_NUMBER() OVER (PARTITION BY P.MODEL, P.BRAND ORDER BY COUNT(*) DESC, R.MALF_CODE) as SEQNUM
FROM REPAIRS R LEFT JOIN
PRODUCTS P
ON P.SERIES_NUM = R.SERIES_NUM
GROUP BY P.MODEL, P.BRAND, R.MALF_CODE
) MB
WHERE seqnum = 1;
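Since the question asks for a solution without partitions if possible, here is one possible sketch that compares each count with the maximum count of its brand/model group. It is not the method from the answer above. It runs on Python's sqlite3 rather than Oracle (assumption), and the serial numbers and repair dates are made-up, scaled-down sample data. Ties are all returned, which the question allows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (series_num TEXT PRIMARY KEY, brand TEXT, model INTEGER);
CREATE TABLE repairs (series_num TEXT, malf_code INTEGER, repair_date TEXT);
INSERT INTO products VALUES
    ('s1', 'BRAND1', 1), ('s2', 'BRAND1', 2), ('s3', 'BRAND2', 1);
""")

# Hypothetical repair counts per (series_num, malf_code).
repair_counts = {
    ("s1", 1): 3, ("s1", 2): 1,
    ("s2", 1): 1, ("s2", 2): 1,
    ("s3", 1): 1, ("s3", 2): 2,
}
for (series_num, malf_code), n in repair_counts.items():
    for day in range(1, n + 1):
        conn.execute(
            "INSERT INTO repairs VALUES (?, ?, ?)",
            (series_num, malf_code, f"2020-01-{day:02d}"),
        )

top_malfunctions = conn.execute("""
WITH counts AS (
    SELECT p.brand, p.model, r.malf_code, COUNT(*) AS n_rep
    FROM repairs r
    JOIN products p ON p.series_num = r.series_num
    GROUP BY p.brand, p.model, r.malf_code
)
SELECT c.model, c.brand, c.malf_code, c.n_rep
FROM counts c
WHERE c.n_rep = (SELECT MAX(c2.n_rep)      -- highest count in the same group
                 FROM counts c2
                 WHERE c2.brand = c.brand AND c2.model = c.model)
ORDER BY c.brand, c.model, c.malf_code
""").fetchall()

for row in top_malfunctions:
    print(row)
```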

Conditionally return values from LEFT JOIN between 3 tables based on CASE

First off, apologies for a long post. It's really more simple than it looks ;-)
I'm trying to do something that I think is conceptually simple, and I believe I'm most of the way there, but there's one last part that I can't implement without errors that I can't figure out how to fix.
I have three related tables.
Orders:
Each row is an Order with a unique ID, there will never be duplicates.
+---------+---------+
| OrderID | Name |
+---------+---------+
| 1 | Order 1 |
| 2 | Order 2 |
| 3 | Order 3 |
+---------+---------+
Order Details:
Relational table where each row is a product line on an order.
+---------+-----------+
| OrderID | ProductID |
+---------+-----------+
| 1 | a |
| 2 | b |
| 2 | c |
| 3 | a |
| 3 | b |
| 3 | b |
+---------+-----------+
As you can see some orders have just one product (1), some will have multiple products (2) and some will have duplicate products (3).
Products
Each row is a product with a unique ID, there will never be duplicates.
+-----------+-------------+
| ProductID | Description |
+-----------+-------------+
| a | Chicken |
| b | Fish |
| c | Beef |
+-----------+-------------+
I want to return all rows from the Orders table and conditionally return some information about the related Products in one column.
The condition is that I look at how many DISTINCT products each Order has. If it's just 1 then I want to return the Product Description value. If it's more than 1 then I want to return some placeholder text such as 'Multi'.
I think that I need to use CASE to get this working, but I can't figure it out.
I can count the unique products successfully like this:
SELECT
o.Name
,COUNT(DISTINCT d.ProductId) as 'Unique Products'
FROM Orders o
LEFT JOIN OrderDetails d ON o.OrderID = d.OrderID
LEFT JOIN Products p on d.ProductId = p.ProductId
GROUP BY o.Name
ORDER BY o.Name DESC
GO
Results are like this:
+---------+-----------------+
| Name | Unique Products |
+---------+-----------------+
| Order 1 | 1 |
| Order 2 | 2 |
| Order 3 | 2 |
+---------+-----------------+
What I want is this:
+---------+-----------------+
| Name | Unique Products |
+---------+-----------------+
| Order 1 | Chicken |
| Order 2 | Multi |
| Order 3 | Multi |
+---------+-----------------+
I have been trying to use CASE which I believe I've gotten correct:
CASE WHEN (COUNT(DISTINCT d.ProductId)) > 1 THEN 'Multi' ELSE p.Description END AS 'Products'
However unless I add p.Description to GROUP BY then I get the error (which I understand):
Column 'Product.Description' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
But if I do add it the results aren't what I want, for example:
+---------+----------+
| Name | Products |
+---------+----------+
| Order 1 | Chicken |
| Order 2 | Fish |
| Order 2 | Beef |
| Order 3 | Chicken |
| Order 3 | Fish |
| Order 3 | Fish |
+---------+----------+
When it should just say "Order 2 - Multi" on one row for example. This is the bit I don't understand.
If I can get some help on this bit alone it would solve my problem and I'd accept the answer. However...
Bonus Round
The above is fine and all, but if this bit is possible I'd accept this as an answer above the others.
Can I concatenate the product names? I've been looking at COALESCE and FOR XML PATH but I can't wrap my head around them at all so I don't even have any code to show.
Results would look something like this:
+---------+--------------+
| Name | Products |
+---------+--------------+
| Order 1 | Chicken |
| Order 2 | Fish;Beef |
| Order 3 | Chicken;Fish |
+---------+--------------+
If you've made it this far I commend you! Thanks!
You are pretty close. You just need some case logic and an aggregation function around the description:
SELECT o.Name,
(CASE WHEN COUNT(DISTINCT d.ProductId) = 1
THEN MAX(p.description)
ELSE 'Multi'
END) as Descriptions
FROM Orders o LEFT JOIN
OrderDetails d
ON o.OrderID = d.OrderID LEFT JOIN
Products p
ON d.ProductId = p.ProductId
GROUP BY o.Name
ORDER BY o.Name DESC
The second part is a very different question. In SQL Server, you need to use an XML subquery:
select o.Name,
stuff((select distinct ';' + p.description
from OrderDetails d left join
Products p
on d.ProductId = p.ProductId
where o.OrderID = d.OrderID
for xml path (''), type
).value('.', 'nvarchar(max)'
), 1, 1, ''
) as descriptions
from Orders o
order by o.Name desc
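The CASE-around-an-aggregate idea from the first query can be checked against the sample data from the question. This is only a sketch on Python's sqlite3 instead of SQL Server (assumption); the FOR XML PATH concatenation is SQL Server specific and is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE OrderDetails (OrderID INTEGER, ProductID TEXT);
CREATE TABLE Products (ProductID TEXT PRIMARY KEY, Description TEXT);
INSERT INTO Orders VALUES (1, 'Order 1'), (2, 'Order 2'), (3, 'Order 3');
INSERT INTO OrderDetails VALUES
    (1, 'a'), (2, 'b'), (2, 'c'), (3, 'a'), (3, 'b'), (3, 'b');
INSERT INTO Products VALUES ('a', 'Chicken'), ('b', 'Fish'), ('c', 'Beef');
""")

rows = conn.execute("""
SELECT o.Name,
       CASE WHEN COUNT(DISTINCT d.ProductID) = 1
            THEN MAX(p.Description)  -- only one product, so MAX just picks it
            ELSE 'Multi'
       END AS Descriptions
FROM Orders o
LEFT JOIN OrderDetails d ON o.OrderID = d.OrderID
LEFT JOIN Products p ON d.ProductID = p.ProductID
GROUP BY o.Name
ORDER BY o.Name DESC
""").fetchall()

for row in rows:
    print(row)
```

Note that an order with no detail rows at all has a distinct count of 0, so it also falls into the ELSE branch and is reported as 'Multi'.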

Select DISTINCT returning too many records

I have two tables: Products and Items. I want to select distinct items that belong to a product based on the condition column, sorted by price ASC.
+-------------------+
| id | name |
+-------------------+
| 1 | Mickey Mouse |
+-------------------+
+-------------------------------------+
| id | product_id | condition | price |
+-------------------------------------+
| 1 | 1 | New | 90 |
| 2 | 1 | New | 80 |
| 3 | 1 | Excellent | 60 |
| 4 | 1 | Excellent | 50 |
| 5 | 1 | Used | 30 |
| 6 | 1 | Used | 20 |
+-------------------------------------+
Desired output:
+----------------------------------------+
| id | name | condition | price |
+----------------------------------------+
| 2 | Mickey Mouse | New | 80 |
| 4 | Mickey Mouse | Excellent | 50 |
| 6 | Mickey Mouse | Used | 20 |
+----------------------------------------+
Here's the query. It returns six records instead of the desired three:
SELECT DISTINCT(items.condition), items.price, products.name
FROM products
INNER JOIN items ON products.id = items.product_id
WHERE products.id = 1
ORDER BY items."price" ASC, products.name;
Correct PostgreSQL query:
SELECT DISTINCT ON (items.condition) items.id, items.condition, items.price, products.name
FROM products
INNER JOIN items ON products.id = items.product_id
WHERE products.id = 1
ORDER BY items.condition, items.price, products.name;
SELECT DISTINCT ON ( expression [, ...] ) keeps only the first row of
each set of rows where the given expressions evaluate to equal.
Details here
There is no distinct() function in SQL. Your query is being parsed as
SELECT DISTINCT (items.condition), ...
which is equivalent to
SELECT DISTINCT items.condition, ...
DISTINCT applies to the whole row - if two or more rows all have the same field values, THEN the "duplicate" row is dropped from the result set.
You probably want something more like
SELECT items.condition, MIN(items.price), products.name
FROM ...
...
GROUP BY items.condition, products.name
I want to select distinct items that belong to a product based on the
condition column, sorted by price ASC.
You most probably want DISTINCT ON:
SELECT *
FROM (
SELECT DISTINCT ON (i.condition)
i.id AS item_id, p.name, i.condition, i.price
FROM products p
JOIN items i ON i.product_id = p.id
WHERE p.id = 1
ORDER BY i.condition, i.price ASC
) sub
ORDER BY item_id;
Since the leading columns of ORDER BY have to match the columns used in DISTINCT ON, you need a subquery to get the sort order you display.
Better yet:
SELECT i.item_id, p.name, i.condition, i.price
FROM (
SELECT DISTINCT ON (condition)
id AS item_id, product_id, condition, price
FROM items
WHERE product_id = 1
ORDER BY condition, price
) i
JOIN products p ON p.id = i.product_id
ORDER BY item_id;
Should be a bit faster.
Aside: You shouldn't be using the non-descriptive name id as identifier. Use item_id and product_id instead.
More details, links and a benchmark test in this related answer:
Select first row in each GROUP BY group?
Use a SELECT GROUP BY, extracting only the MIN(price) for every PRODUCT/CONDITION.
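A minimal sketch of that GROUP BY approach, run on Python's sqlite3 as a stand-in for PostgreSQL (assumption), with the sample data from the question. The alias min_price is chosen here for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE items (
    id INTEGER PRIMARY KEY, product_id INTEGER,
    "condition" TEXT, price INTEGER
);
INSERT INTO products VALUES (1, 'Mickey Mouse');
INSERT INTO items VALUES
    (1, 1, 'New', 90), (2, 1, 'New', 80),
    (3, 1, 'Excellent', 60), (4, 1, 'Excellent', 50),
    (5, 1, 'Used', 30), (6, 1, 'Used', 20);
""")

cheapest = conn.execute("""
SELECT p.name, i."condition", MIN(i.price) AS min_price
FROM items i
JOIN products p ON p.id = i.product_id
WHERE p.id = 1
GROUP BY p.name, i."condition"   -- one row per product/condition pair
ORDER BY min_price
""").fetchall()

for row in cheapest:
    print(row)
```

Unlike the DISTINCT ON versions, this returns only the grouped columns and the minimum price, not the id of the cheapest item.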