Select multiple columns with not all columns mentioned in Groupby - Postgres v12 - sql

I have a table that contains review_id, product_id, ratings, reviewer_id, and review_comments. The table I have is as below.
My need is quite simple, but I am having trouble figuring it out. For each product_id, I need the product_id, rating, reviewer_id, and review_comments of the row that has the maximum review_id.
With below query, I am able to get product_id and review_id properly.
SELECT product_id,max(review_id) as review_id
FROM public.products Group by product_id;
But when I try to add ratings, reviewer_id, and review_comments, it raises an error saying those columns have to be part of the GROUP BY. If I add those columns, the grouping is disturbed, since I need to group on product_id only and nothing else.
Is there a way to solve this?
My expected result should contain the full rows with review_id 7, 5, and 8, since review_id 7 is the highest for product_id 1, review_id 5 is the highest for product_id 2, and review_id 8 is the highest for product_id 3.

Try PostgreSQL's DISTINCT ON:
SELECT DISTINCT ON (product_id)
product_id,
review_id,
rating,
reviewer_id,
review_comments
FROM products
ORDER BY product_id, review_id DESC;
This will return the first row for each product_id in the ORDER BY order.

This can be done with NOT EXISTS:
select p.product_id, p.rating, p.reviewer_id, p.review_comments
from public.products p
where not exists (
select 1 from public.products
where product_id = p.product_id and review_id > p.review_id
)

You can try the approach below:
select * from tablename a
where review_id =(select max(review_id) from tablename b where a.product_id=b.product_id)
Or use row_number():
select * from
(
select *, row_number() over(partition by product_id order by review_id desc) as rn
from tablename
)A where rn=1
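To check that the NOT EXISTS and row_number() approaches pick the same rows, here is a minimal sketch using Python's sqlite3 module. The sample rows are invented (the question's table was not shown) and only arranged so that review_id 7, 5, and 8 are the highest for products 1, 2, and 3. DISTINCT ON is PostgreSQL-specific, so it is not part of this check, and the row_number() query needs SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical sample rows: (review_id, product_id, rating, reviewer_id, review_comments).
ROWS = [
    (1, 1, 4, 101, "good"),
    (2, 2, 3, 102, "average"),
    (3, 3, 5, 103, "great"),
    (4, 1, 2, 104, "poor"),
    (5, 2, 5, 105, "excellent"),
    (6, 3, 1, 106, "bad"),
    (7, 1, 3, 107, "okay"),
    (8, 3, 4, 108, "nice"),
]

# Keep a row only if no other row of the same product has a larger review_id.
NOT_EXISTS = """
SELECT p.product_id, p.review_id, p.rating, p.reviewer_id, p.review_comments
FROM products p
WHERE NOT EXISTS (
    SELECT 1 FROM products q
    WHERE q.product_id = p.product_id AND q.review_id > p.review_id
)
ORDER BY p.product_id
"""

# Number the rows of each product from newest to oldest and keep number 1.
ROW_NUMBER = """
SELECT product_id, review_id, rating, reviewer_id, review_comments
FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY product_id ORDER BY review_id DESC) AS rn
    FROM products
) ranked
WHERE rn = 1
ORDER BY product_id
"""


def latest_reviews(sql):
    """Run one of the queries above against a fresh in-memory copy of ROWS."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (review_id INTEGER PRIMARY KEY, product_id INTEGER, "
        "rating INTEGER, reviewer_id INTEGER, review_comments TEXT)"
    )
    conn.executemany("INSERT INTO products VALUES (?, ?, ?, ?, ?)", ROWS)
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in latest_reviews(ROW_NUMBER):
        print(row)
```

Both queries return one full row per product_id without listing the extra columns in a GROUP BY.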

Related

SQL query for table with multiple keys?

I am sorry if this seems too easy but I was asked this question and I couldn't answer even after preparing SQL thoroughly :(. Can someone answer this?
There's a table - Seller id, product id, warehouse id, quantity of products at each warehouse for each product as per each seller.
We have to list the Product Ids with Seller Id who has highest number of products for that product and the total number of units he has for that product.
I think I got confused because there were 3 keys in the table.
It's not quite clear which DBMS you are using. The query below should work if your DBMS supports window functions.
You can find count of rows for each product and seller, rank each seller within each product using window function rank and then use filter to get only top ranked sellers in each product along with count of units.
select
product_id,
seller_id,
no_of_products
from (
select
product_id,
seller_id,
count(*) no_of_products,
rank() over (partition by product_id order by count(*) desc) rnk
from your_table
group by
product_id,
seller_id
) t where rnk = 1;
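A minimal sketch of the same rank-then-filter pattern, run through Python's sqlite3 module (SQLite 3.25 or newer). Two things here are assumptions rather than part of the answer: the table and column names (stock, quantity) and the sample rows are invented, and the ranking uses SUM(quantity) per seller instead of count(*), on the reading that "number of units" means the stored quantities. The aggregation is done in a CTE first so the window function only sees one row per product and seller.

```python
import sqlite3

# Hypothetical stock rows: (seller_id, product_id, warehouse_id, quantity).
STOCK = [
    (1, 10, 100, 5),
    (1, 10, 101, 7),   # seller 1 holds 12 units of product 10 across two warehouses
    (2, 10, 100, 12),  # seller 2 also holds 12 units of product 10: a tie
    (3, 10, 102, 4),
    (1, 20, 100, 3),
    (2, 20, 101, 9),   # seller 2 leads product 20
]

TOP_SELLERS = """
WITH totals AS (
    -- one row per (product, seller) with units summed over warehouses
    SELECT product_id, seller_id, SUM(quantity) AS total_units
    FROM stock
    GROUP BY product_id, seller_id
),
ranked AS (
    -- RANK() gives tied sellers the same rank, so ties are all kept
    SELECT product_id, seller_id, total_units,
           RANK() OVER (PARTITION BY product_id ORDER BY total_units DESC) AS rnk
    FROM totals
)
SELECT product_id, seller_id, total_units
FROM ranked
WHERE rnk = 1
ORDER BY product_id, seller_id
"""


def top_sellers():
    """Return the top seller(s) per product from the sample STOCK rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stock (seller_id INTEGER, product_id INTEGER, "
        "warehouse_id INTEGER, quantity INTEGER)"
    )
    conn.executemany("INSERT INTO stock VALUES (?, ?, ?, ?)", STOCK)
    rows = conn.execute(TOP_SELLERS).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in top_sellers():
        print(row)
```

With these rows, product 10 comes back twice because sellers 1 and 2 tie; swapping RANK() for ROW_NUMBER() would keep only one of them, chosen arbitrarily.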
If window functions are not supported, you can use a correlated subquery to achieve the same effect:
select
product_id,
seller_id,
count(*) no_of_products
from your_table a
group by
product_id,
seller_id
having count(*) = (
select max(cnt)
from (
select count(*) cnt
from your_table b
where b.product_id = a.product_id
group by seller_id
) t
);
Don't know why having id columns would mess you up... group by the right columns, sum up the totals and just return the first row:
select *
from (
select sellerid, productid, sum(quantity) as total_sold
from theres_a_table
group by sellerid, productid
) x
order by total_sold desc
fetch first 1 row only
If I ignore optimization, the straightforward answer looks like this:
select *
from
(
select seller_id, product_id, sum(product_qty) as seller_prod_qty
from your_table
group by seller_id, product_id
) spqo
inner join
(
select product_id, max(seller_prod_qty) as max_prod_qty
from
(
select seller_id, product_id, sum(product_qty) as seller_prod_qty
from your_table
group by seller_id, product_id
) spqi
group by product_id
) pmaxq
on spqo.product_id = pmaxq.product_id
and spqo.seller_prod_qty = pmaxq.max_prod_qty
Both spqi (inner) and spqo (outer) give you the seller, the product, and the sum of quantity across warehouses. pmaxq gives you the maximum of those sums for each product, and the final inner join keeps a seller's total only if that seller has the highest (max) total for the product (there could be multiple sellers with the same quantity). I think this is the answer you are looking for. However, I'm sure the query can be improved, since what I'm posting is the "conceptual" one :)

Create multiply function SQL

I don't know if you can call it a multiply function, or a function within a function.
I want to output the product names number 5, 6, 7, 8, ordered from smallest to largest.
The current output is ordered from largest to smallest.
I want to create the reverse output: a function that outputs product names 5, 6, 7, 8 in ascending order,
and later another function that outputs 5, 6, 7, 8 ordered by price descending.
How can I do this? Thanks!
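Just order by the column in descending order and use LIMIT to pick the rows you need. In MySQL, LIMIT 4, 4 skips the first four rows and returns the next four (rows 5 to 8):
select * from products order by unitprice desc limit 4, 4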
ROW_NUMBER() will fix your issue:
WITH OrderedProducts AS
(
SELECT product_id, unit_price,
ROW_NUMBER() OVER (ORDER BY unit_price DESC) AS RowNumber
FROM products -- assumes the table is named products
)
SELECT product_id, unit_price
FROM OrderedProducts
WHERE RowNumber BETWEEN 5 AND 8;
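As a rough check of how row numbering relates to LIMIT/OFFSET paging, the sketch below uses Python's sqlite3 module (SQLite 3.25 or newer) with an invented ten-row products table. It picks positions 5 through 8 by descending price in two ways and shows they agree. The table layout and prices are assumptions for illustration only.

```python
import sqlite3

# Hypothetical catalogue: ten products priced 10, 20, ..., 100.
PRODUCTS = [(i, f"product_{i}", 10.0 * i) for i in range(1, 11)]

# Number rows by descending price, then keep positions 5 to 8.
BY_ROW_NUMBER = """
WITH ordered AS (
    SELECT product_id, unit_price,
           ROW_NUMBER() OVER (ORDER BY unit_price DESC) AS row_num
    FROM products
)
SELECT product_id, unit_price
FROM ordered
WHERE row_num BETWEEN 5 AND 8
ORDER BY row_num
"""

# Same window expressed as paging: skip 4 rows, take 4 rows.
BY_LIMIT = """
SELECT product_id, unit_price
FROM products
ORDER BY unit_price DESC
LIMIT 4 OFFSET 4
"""


def run(sql):
    """Execute a query against a fresh in-memory copy of PRODUCTS."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (product_id INTEGER PRIMARY KEY, "
        "product_name TEXT, unit_price REAL)"
    )
    conn.executemany("INSERT INTO products VALUES (?, ?, ?)", PRODUCTS)
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(run(BY_ROW_NUMBER))
    print(run(BY_LIMIT))
```

To reverse the order of just those four rows, wrap either query in an outer SELECT and sort that by unit_price ASC.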
If you want to skip the first, second, third and fourth items, then you can use the NOT IN clause. Something like this:
Select Top 8 product_id, price, other_fields etc from Table1 Where product_id not in (select Top 4 product_id from Table1 where filter_goes_here Order By product_id asc) Order By Price desc

Sorting Records on the Basis of Number of Items in a Group- SQL Server

I have a set of records and I want to sort these records on the basis of the number of items in a group.
I want to arrange the records in such a way that Products with maximum number of items are at the top i.e. the required order is- Product_ID 3 (with 6 items), then Product_ID 1 (with 5 items) and the last one would be Product_ID 2(with 3 items).
The following query returns the count of the items with same Product_ID, however, I want Item_Name, Item_Description and Item_Number to be arranged as well.
Select Product_ID, Count(*) from Product group by Product_ID order by Count(*) DESC
I have tried another query as follows, but I know I am wrong somewhere that it is not giving the desired results and I can't think of a possible solution:
Select Product_ID, Item_Name, Item_Description, Item_Number from Product
group by Product_ID,item_name,item_description,item_number
order by COUNT(product_ID)
Thanks in advance for your help!!
Select Product_ID, Item_Name, Item_Description, Item_Number
from Product
order by COUNT(1) over (partition by Product_ID) desc
I assume you want to group by the ID only but you want to list all other fields, you don't need to group by at all if you just want to order by:
SELECT product_id,
item_name,
item_description,
item_number
FROM product p1
ORDER BY (SELECT Count(product_id)
FROM product p2
WHERE p1.product_id = p2.product_id) DESC
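Here is a small sketch, using Python's sqlite3 module (SQLite 3.25 or newer), that compares the windowed count with the correlated-subquery count as sort keys. The item rows are invented, and the windowed variant computes the count in a derived table and sorts on that column, which is a slight restructuring of the answer above that also makes the group size available for display.

```python
import sqlite3

# Hypothetical items: product 3 has three, product 1 has two, product 2 has one.
ITEMS = [
    (1, "bolt"), (1, "nut"),
    (2, "washer"),
    (3, "gear"), (3, "shaft"), (3, "spring"),
]

# Attach the group size to every row with a window count, then sort on it.
WINDOW_SORT = """
SELECT product_id, item_name
FROM (
    SELECT product_id, item_name,
           COUNT(*) OVER (PARTITION BY product_id) AS group_size
    FROM product
) sized
ORDER BY group_size DESC, product_id, item_name
"""

# Same ordering without window functions: count the group per row.
SUBQUERY_SORT = """
SELECT p1.product_id, p1.item_name
FROM product p1
ORDER BY (SELECT COUNT(*) FROM product p2
          WHERE p2.product_id = p1.product_id) DESC,
         p1.product_id, p1.item_name
"""


def run(sql):
    """Execute a query against a fresh in-memory copy of ITEMS."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE product (product_id INTEGER, item_name TEXT)")
    conn.executemany("INSERT INTO product VALUES (?, ?)", ITEMS)
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run(WINDOW_SORT):
        print(row)
```

The extra product_id and item_name sort keys only make the order deterministic when two products have the same number of items.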
Try using an alias:
Select Product_ID, Count(*) AS num_products from Product group by Product_ID order by num_products DESC;

How to use COUNT in my sql query

I'm trying to count the number of occurrence of product_id.
This is my query. It returns 10 products, but they are not ordered by the number of occurrences of product_id.
SELECT name, category_id, product_id
FROM product, notification
WHERE type='product_consumed' OR type='product_rate' AND product.id = notification.product_id AND category_id="1"
GROUP BY product_id ORDER BY product.category_id ASC LIMIT 10
I tried to COUNT the occurrences of product_id like this, but it gives me bad results:
SELECT name, category_id, product_id, COUNT(*) AS most_viewed
FROM product, notification
WHERE type='product_consumed' OR type='product_rate' AND product.id = notification.product_id AND category_id=".$cat."
GROUP BY product_id ORDER BY product.category_id ASC, most_viewed DESC LIMIT 10
My wish is to have a sql response like this :
Category of the product | Name of product | Number of views
Thanks
Try this, you are missing a set of brackets
SELECT name, category_id, product_id, COUNT(*) AS most_viewed
FROM product, notification
WHERE (type='product_consumed' OR type='product_rate') AND product.id = notification.product_id AND category_id=".$cat."
GROUP BY product_id
ORDER BY product.category_id ASC, most_viewed DESC LIMIT 10
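The reason the brackets matter is that AND binds more tightly than OR, so `a OR b AND c` is read as `a OR (b AND c)`. A minimal sketch with Python's sqlite3 module and an invented single-table layout (the real product/notification join is left out to keep the effect visible):

```python
import sqlite3

# Hypothetical notifications: (type, category_id).
EVENTS = [
    ("product_consumed", 2),
    ("product_rate", 1),
    ("product_rate", 2),
    ("product_consumed", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notification (type TEXT, category_id INTEGER)")
conn.executemany("INSERT INTO notification VALUES (?, ?)", EVENTS)

# Without brackets: parsed as type='product_consumed' OR (type='product_rate' AND category_id=1),
# so every product_consumed row matches regardless of category.
without_brackets = conn.execute(
    "SELECT COUNT(*) FROM notification "
    "WHERE type = 'product_consumed' OR type = 'product_rate' AND category_id = 1"
).fetchone()[0]

# With brackets: the category filter applies to both notification types.
with_brackets = conn.execute(
    "SELECT COUNT(*) FROM notification "
    "WHERE (type = 'product_consumed' OR type = 'product_rate') AND category_id = 1"
).fetchone()[0]

conn.close()
print(without_brackets, with_brackets)
```

In the original query the same precedence also detached the join condition from the product_consumed branch, which is why the counts looked wrong.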

PostgreSQL DISTINCT ON with different ORDER BY

I want to run this query:
SELECT DISTINCT ON (address_id) purchases.address_id, purchases.*
FROM purchases
WHERE purchases.product_id = 1
ORDER BY purchases.purchased_at DESC
But I get this error:
PG::Error: ERROR: SELECT DISTINCT ON expressions must match initial ORDER BY expressions
Adding address_id as first ORDER BY expression silences the error, but I really don't want to add sorting over address_id. Is it possible to do without ordering by address_id?
Documentation says:
DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. [...] Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first. [...] The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s).
Official documentation
So you'll have to add the address_id to the order by.
Alternatively, if you're looking for the full row that contains the most recent purchased product for each address_id and that result sorted by purchased_at then you're trying to solve a greatest N per group problem which can be solved by the following approaches:
The general solution that should work in most DBMSs:
SELECT t1.* FROM purchases t1
JOIN (
SELECT address_id, max(purchased_at) max_purchased_at
FROM purchases
WHERE product_id = 1
GROUP BY address_id
) t2
ON t1.address_id = t2.address_id AND t1.purchased_at = t2.max_purchased_at
ORDER BY t1.purchased_at DESC
A more PostgreSQL-oriented solution based on @hkf's answer:
SELECT * FROM (
SELECT DISTINCT ON (address_id) *
FROM purchases
WHERE product_id = 1
ORDER BY address_id, purchased_at DESC
) t
ORDER BY purchased_at DESC
Problem clarified, extended and solved here: Selecting rows ordered by some column and distinct on another
A subquery can solve it:
SELECT *
FROM (
SELECT DISTINCT ON (address_id) *
FROM purchases
WHERE product_id = 1
) p
ORDER BY purchased_at DESC;
Leading expressions in ORDER BY have to agree with columns in DISTINCT ON, so you can't order by different columns in the same SELECT.
Only use an additional ORDER BY in the subquery if you want to pick a particular row from each set:
SELECT *
FROM (
SELECT DISTINCT ON (address_id) *
FROM purchases
WHERE product_id = 1
ORDER BY address_id, purchased_at DESC -- get "latest" row per address_id
) p
ORDER BY purchased_at DESC;
If purchased_at can be NULL, use DESC NULLS LAST - and match your index for best performance. See:
Sort by column ASC, but NULL values first?
Why does ORDER BY NULLS LAST affect the query plan on a primary key?
Related, with more explanation:
Select first row in each GROUP BY group?
Sort by column ASC, but NULL values first?
You can order by address_id in a subquery, then order by what you want in an outer query. Note that the subquery needs an alias:
SELECT * FROM
(SELECT DISTINCT ON (address_id) purchases.address_id, purchases.*
FROM "purchases"
WHERE "purchases"."product_id" = 1 ORDER BY address_id DESC ) AS p
ORDER BY purchased_at DESC
Window function may solve that in one pass:
SELECT DISTINCT ON (address_id)
LAST_VALUE(purchases.address_id) OVER wnd AS address_id
FROM "purchases"
WHERE "purchases"."product_id" = 1
WINDOW wnd AS (
PARTITION BY address_id ORDER BY purchases.purchased_at DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
For anyone using Flask-SQLAlchemy, this worked for me
from app import db
from app.models import Purchases
from sqlalchemy.orm import aliased
from sqlalchemy import desc
stmt = Purchases.query.distinct(Purchases.address_id).subquery('purchases')
alias = aliased(Purchases, stmt)
distinct = db.session.query(alias)
# order_by returns a new query, so keep the result
distinct = distinct.order_by(desc(alias.purchased_at))
In addition to the other answers, it can also be solved with the following query:
WITH purchase_data AS (
SELECT address_id, purchased_at, product_id,
row_number() OVER (PARTITION BY address_id ORDER BY purchased_at DESC) AS row_number
FROM purchases
WHERE product_id = 1)
SELECT address_id, purchased_at, product_id
FROM purchase_data where row_number = 1
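The row_number() form is portable, so it can be tried outside PostgreSQL. The sketch below runs it through Python's sqlite3 module (SQLite 3.25 or newer) on invented purchase rows and adds the final ORDER BY purchased_at DESC that the question asks for. Dates are stored as ISO strings, which sort correctly as text; the table layout is an assumption.

```python
import sqlite3

# Hypothetical purchases: (id, address_id, product_id, purchased_at).
PURCHASES = [
    (1, 1, 1, "2023-01-01"),
    (2, 1, 1, "2023-03-01"),  # latest product-1 purchase for address 1
    (3, 2, 1, "2023-02-01"),  # latest product-1 purchase for address 2
    (4, 2, 2, "2023-05-01"),  # other product, filtered out
    (5, 3, 1, "2023-04-01"),  # only purchase for address 3
]

LATEST_PER_ADDRESS = """
WITH purchase_data AS (
    SELECT address_id, purchased_at, product_id,
           ROW_NUMBER() OVER (PARTITION BY address_id
                              ORDER BY purchased_at DESC) AS rn
    FROM purchases
    WHERE product_id = 1
)
SELECT address_id, purchased_at
FROM purchase_data
WHERE rn = 1
ORDER BY purchased_at DESC  -- final sort is independent of the per-address pick
"""


def latest_per_address():
    """Return the newest product-1 purchase per address, newest first."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE purchases (id INTEGER PRIMARY KEY, address_id INTEGER, "
        "product_id INTEGER, purchased_at TEXT)"
    )
    conn.executemany("INSERT INTO purchases VALUES (?, ?, ?, ?)", PURCHASES)
    rows = conn.execute(LATEST_PER_ADDRESS).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in latest_per_address():
        print(row)
```

The result is sorted by purchase date rather than by address_id, which is the ordering DISTINCT ON alone cannot give in a single SELECT.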
You can also do this with a GROUP BY clause:
SELECT purchases.address_id, purchases.* FROM "purchases"
WHERE "purchases"."product_id" = 1 GROUP BY address_id,
purchases.purchased_at ORDER BY purchases.purchased_at DESC