SQL select with group by and join for lookup

Imagine I have a product table and an orders table. I want to list the most recent order for each product.
I imagine something like this:
select name, description, price, max(date)
from product
join order on order.item = product.name
group by order.item
But my Postgres DB complains that I can't have raw fields (SQLite doesn't complain); I need to have an aggregate function. I can put min() around each column, but that seems like a waste, given that all the values for a particular product are always the same. I wondered about 'distinct', but that doesn't seem to help here.
NOTE - I need standard portable SQL , not specific to any given engine.

In Postgres, you can use distinct on:
select distinct on (o.item) p.name, description, price, date
from product p join
"order" o
on o.item = p.name
order by o.item, date desc;
I added table aliases to the query. I strongly advise you to always qualify all column names. I would have done that here, but in most cases I don't know which table the columns come from.

If you require standard ANSI SQL you can use a window function:
select *
from (
select p.name, p.description, p.price,
o.date,
max(o.date) over (partition by o.item) as last_date
from product p
join "order" o on o.item = p.name
) t
where date = last_date;
But in Postgres distinct on () is usually a lot faster.
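For anyone who wants to try the window-function version without a Postgres instance, here is a minimal sketch that runs it through Python's sqlite3 module. The sample rows are made up, and it assumes the bundled SQLite is 3.25 or newer (needed for window functions). It also shows one thing to watch for: if two orders share the latest date, this query returns both of them.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (name TEXT PRIMARY KEY, description TEXT, price REAL);
    CREATE TABLE "order" (item TEXT, date TEXT);
    INSERT INTO product VALUES ('apple', 'red fruit', 1.5), ('pear', 'green fruit', 2.0);
    INSERT INTO "order" VALUES
        ('apple', '2020-01-01'), ('apple', '2020-03-01'),
        ('apple', '2020-03-01'), ('pear', '2020-02-01');
""")

# Keep only the rows whose date equals the per-item maximum.
window_rows = conn.execute("""
    SELECT name, date
    FROM (
        SELECT p.name, o.date,
               MAX(o.date) OVER (PARTITION BY o.item) AS last_date
        FROM product p
        JOIN "order" o ON o.item = p.name
    ) t
    WHERE date = last_date
    ORDER BY name
""").fetchall()

for row in window_rows:
    print(row)  # 'apple' appears twice because two orders tie on the latest date
```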

If it were Oracle or MS SQL Server, you would need to group by all the fields in your select that aren't aggregate functions.
It would be an extra line before "order by" with "group by p.name, description, price, date" ...
I am not so sure about Postgres, but it will probably work.

You can use a correlated subquery:
select p.name, p.description, p.price, o.date
from product p inner join
"order" o
on o.item = p.name
where o.date = (select max(o1.date)
from "order" o1
where o1.item = p.name
);
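A quick way to check the correlated-subquery approach is to run it against a throwaway in-memory SQLite database. This is only a sketch: the rows are invented for illustration, and the table and column names are taken from the question.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (name TEXT PRIMARY KEY, description TEXT, price REAL);
    CREATE TABLE "order" (item TEXT, date TEXT);
    INSERT INTO product VALUES ('apple', 'red fruit', 1.5), ('pear', 'green fruit', 2.0);
    INSERT INTO "order" VALUES
        ('apple', '2020-01-01'), ('apple', '2020-03-01'), ('pear', '2020-02-01');
""")

# For each joined row, keep it only if its date is the latest for that product.
latest = conn.execute("""
    SELECT p.name, p.description, p.price, o.date
    FROM product p
    JOIN "order" o ON o.item = p.name
    WHERE o.date = (SELECT MAX(o1.date) FROM "order" o1 WHERE o1.item = p.name)
    ORDER BY p.name
""").fetchall()

for row in latest:
    print(row)
```

No GROUP BY is involved, so every selected column can be a plain field.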

Related

SQL Anywhere error - Function or column reference to 'Fraction' must also appear in a GROUP BY

I have this simple SQL SELECT that summarizes by ProductID but I get this "must appear" error.
SELECT Products.ProductID,
Products.Qtty / Catalog.Fraction as Amount
FROM Products, Catalog
WHERE Catalog.ID = Products.ID
GROUP BY Products.ProductID
SQL requires me to put column Fraction into GROUP BY clause which I do not need. I just need to summarize by ProductID. How do I build correct SQL statement here?
Thanks.
You need to use an aggregate function:
SELECT Products.ProductID,
SUM(Products.Qtty) / SUM(Catalog.Fraction) as Amount
FROM Products
JOIN Catalog
ON Catalog.ID = Products.ID
GROUP BY Products.ProductID
You are doing division, so I might suspect that you want to divide before adding things up. You should also use explicit JOIN syntax and table aliases:
SELECT p.ProductID,
SUM(p.Qtty / c.Fraction) as Amount
FROM Products p JOIN
Catalog c
ON c.ID = p.ID
GROUP BY p.ProductID;
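The two answers above are not interchangeable: dividing the sums and summing the divisions generally give different numbers. A small sketch with made-up rows (run here through SQLite rather than SQL Anywhere, and with REAL columns to avoid integer division) makes the difference visible.

```python
import sqlite3

# Hypothetical sample data; one product spread over two catalog entries.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (ProductID INTEGER, ID INTEGER, Qtty REAL);
    CREATE TABLE Catalog (ID INTEGER PRIMARY KEY, Fraction REAL);
    INSERT INTO Catalog VALUES (10, 2.0), (11, 4.0);
    INSERT INTO Products VALUES (1, 10, 10.0), (1, 11, 20.0);
""")

# Aggregate first, then divide: (10 + 20) / (2 + 4)
ratio_of_sums = conn.execute("""
    SELECT p.ProductID, SUM(p.Qtty) / SUM(c.Fraction) AS Amount
    FROM Products p JOIN Catalog c ON c.ID = p.ID
    GROUP BY p.ProductID
""").fetchall()

# Divide per row, then aggregate: 10 / 2 + 20 / 4
sum_of_ratios = conn.execute("""
    SELECT p.ProductID, SUM(p.Qtty / c.Fraction) AS Amount
    FROM Products p JOIN Catalog c ON c.ID = p.ID
    GROUP BY p.ProductID
""").fetchall()

print(ratio_of_sums)
print(sum_of_ratios)
```

Which one is right depends on what Amount is supposed to mean, so it is worth deciding that before picking a query.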

What can I use other than Group By?

I have a question that uses this DML statement
SELECT SupplierID, COUNT(*) AS TotalProducts
FROM Products
GROUP BY SupplierID;
I'm trying to get the same results without using "Group By". I can use a table variable or a temp table, with Insert and Update if needed. Also, using While and IF-Else is allowed.
I'm really lost; any help would be awesome. Thanks SO Community.
This is used in SQL Server. Thanks again.
You can always use SELECT DISTINCT with window functions:
SELECT DISTINCT SupplierID,
COUNT(*) OVER (PARTITION BY SupplierId) AS TotalProducts
FROM Products;
But GROUP BY is the right way to write an aggregation query.
You may also use the following query:
select distinct P.SupplierID, (select count(*) from Products
where SupplierID=P.SupplierID) TotalProducts from Products P
You will get the same result using the above query, but I don't think avoiding GROUP BY is a good idea!
Using a subquery:
SELECT DISTINCT SupplierID
,(SELECT COUNT(*)
FROM Products P2
WHERE P2.SupplierID = P.SupplierID
) AS TotalProducts
FROM Products P
The distinct is to remove duplicates... the count executes for every row so without distinct you would get repeat answers for supplierID.
Another way
select distinct supplierId, p2.ttl
from products p1
cross apply
(
select count(*)
from products p2
where p1.supplierId = p2.supplierId
) p2(ttl);
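To confirm that the GROUP BY-free rewrites really match the original, the sketch below compares the GROUP BY query with the DISTINCT plus correlated-subquery version on a few invented rows. It runs on SQLite via Python rather than SQL Server, so treat it as a check of the logic only.

```python
import sqlite3

# Hypothetical sample data: supplier 1 has two products, supplier 2 has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, SupplierID INTEGER);
    INSERT INTO Products VALUES (1, 1), (2, 1), (3, 2);
""")

grouped = conn.execute("""
    SELECT SupplierID, COUNT(*) AS TotalProducts
    FROM Products
    GROUP BY SupplierID
    ORDER BY SupplierID
""").fetchall()

# Same result without GROUP BY: count per row, then collapse duplicates.
without_group_by = conn.execute("""
    SELECT DISTINCT P.SupplierID,
           (SELECT COUNT(*)
            FROM Products P2
            WHERE P2.SupplierID = P.SupplierID) AS TotalProducts
    FROM Products P
    ORDER BY P.SupplierID
""").fetchall()

print(grouped)
print(without_group_by)
```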

Using SQL query to find details of customers who ordered > x types of products

Please note that I have seen a similar query here, but think my query is different enough to merit a separate question.
Suppose that there is a database with the following tables:
customer_table with customer_ID (key field), customer_name
orders_table with order_ID (key field), customer_ID, product_ID
Now suppose I would like to find the names of all the customers who have ordered more than 10 different types of product, and the number of types of products they ordered. Multiple orders of the same product do not count.
I think the query below should work, but have the following questions:
Is the use of count(distinct xxx) generally allowed with a "group by" statement?
Is the method I use the standard way? Does anybody have any better ideas (e.g. without involving temporary tables)?
Below is my query
select T1.customer_name, T1.customer_ID, T2.number_of_products_ordered
from customer_table T1
inner join
(
select cust.customer_ID as customer_identity, count(distinct ord.product_ID) as number_of_products_ordered
from customer_table cust
inner join orders_table ord on cust.customer_ID=ord.customer_ID
group by ord.customer_ID, ord.product_ID
having count(distinct ord.product_ID) > 10
) T2
on T1.customer_ID=T2.customer_identity
order by T2.number_of_products_ordered, T1.customer_name
Isn't that what you are looking for? Seems to be a little bit simpler. Tested it on SQL Server - works fine.
SELECT customer_name, COUNT(DISTINCT product_ID) as products_count FROM customer_table
INNER JOIN orders_table ON customer_table.customer_ID = orders_table.customer_ID
GROUP BY customer_table.customer_ID, customer_name
HAVING COUNT(DISTINCT product_ID) > 10
You could do it more simply:
select
c.id,
c.cname,
count(distinct o.pid) as `uniques`
from o join c
on c.id = o.cid
group by c.id
having `uniques` > 10
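Here is a small sketch of why grouping by customer only (as in the answers) matters. With product_ID also in the GROUP BY, as in the question's inner query, each group holds a single product, so the distinct count is always 1 and the HAVING filter removes everything. The data is made up, the threshold is lowered from 10 to 1 so that a tiny table is enough, and it runs on SQLite.

```python
import sqlite3

# Hypothetical sample data; threshold lowered to "> 1" for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_table (customer_ID INTEGER PRIMARY KEY, customer_name TEXT);
    CREATE TABLE orders_table (order_ID INTEGER PRIMARY KEY, customer_ID INTEGER, product_ID INTEGER);
    INSERT INTO customer_table VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders_table VALUES (1, 1, 100), (2, 1, 100), (3, 1, 101),
                                    (4, 2, 100), (5, 2, 100);
""")

# Group by customer only: repeated orders of product 100 count once.
per_customer = conn.execute("""
    SELECT c.customer_name, COUNT(DISTINCT o.product_ID) AS products_count
    FROM customer_table c
    JOIN orders_table o ON c.customer_ID = o.customer_ID
    GROUP BY c.customer_ID, c.customer_name
    HAVING COUNT(DISTINCT o.product_ID) > 1
""").fetchall()

# Group by customer AND product: every group has exactly one distinct product.
per_customer_and_product = conn.execute("""
    SELECT o.customer_ID, COUNT(DISTINCT o.product_ID) AS products_count
    FROM customer_table c
    JOIN orders_table o ON c.customer_ID = o.customer_ID
    GROUP BY o.customer_ID, o.product_ID
    HAVING COUNT(DISTINCT o.product_ID) > 1
""").fetchall()

print(per_customer)
print(per_customer_and_product)
```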

Get Product Onhand Quantity

Using sqlite3, I have two tables: products, orders. I want to know how many products are left in the shop.
SELECT pid,
txt,
price,
qty-coalesce((SELECT SUM(qty)
FROM ORDERS
WHERE pid=?),0)
FROM PRODUCTS
WHERE pid=?
This works if I select one product, but I would like a list of all my products.
SELECT
P.pid, P.txt, P.price,
P.qty - coalesce((SELECT sum(O.qty) FROM orders O WHERE O.pid = P.pid), 0)
FROM products P
Try this:
SELECT
pid,
txt,
price,
qty-coalesce(
(SELECT sum(qty)
FROM orders
WHERE orders.pid = products.pid),0)
FROM products
I recommend using:
SELECT t.pid,
t.txt,
t.price,
t.qty - IFNULL(qs.qty_sold, 0) AS onhand_qty
FROM PRODUCTS t
LEFT JOIN (SELECT o.pid,
SUM(o.qty) AS qty_sold
FROM ORDERS o
GROUP BY o.pid) qs ON qs.pid = t.pid
WHERE t.pid = ?
While it works, using correlated SELECT statements in the SELECT clause can perform poorly, because they may be executed once for every row returned by your query.
IFNULL is preferable to COALESCE in this case. COALESCE is intended for checking two or more values for null, so using it here can give a false impression to someone else reading your code. There isn't any inherent benefit: per the documentation, the two-argument forms are the same.
Reference: SQLite Core Functions
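As a sanity check, the sketch below runs the correlated-subquery version and the LEFT JOIN version for all products (no WHERE pid = ? filter) and compares the results. The rows are invented; the table and column names follow the question.

```python
import sqlite3

# Hypothetical sample data: 'pen' has two orders, 'ink' has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (pid INTEGER PRIMARY KEY, txt TEXT, price REAL, qty INTEGER);
    CREATE TABLE orders (pid INTEGER, qty INTEGER);
    INSERT INTO products VALUES (1, 'pen', 1.0, 10), (2, 'ink', 3.0, 5);
    INSERT INTO orders VALUES (1, 3), (1, 2);
""")

# Correlated subquery: the inner SUM refers to the outer row's pid.
correlated = conn.execute("""
    SELECT P.pid, P.txt, P.price,
           P.qty - COALESCE((SELECT SUM(O.qty) FROM orders O WHERE O.pid = P.pid), 0)
    FROM products P
    ORDER BY P.pid
""").fetchall()

# LEFT JOIN onto a per-product total; GROUP BY gives one total row per pid.
joined = conn.execute("""
    SELECT t.pid, t.txt, t.price,
           t.qty - IFNULL(qs.qty_sold, 0) AS onhand_qty
    FROM products t
    LEFT JOIN (SELECT o.pid, SUM(o.qty) AS qty_sold
               FROM orders o
               GROUP BY o.pid) qs ON qs.pid = t.pid
    ORDER BY t.pid
""").fetchall()

print(correlated)
print(joined)
```

Products with no orders keep their full quantity in both versions, because the missing total is replaced by 0.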

How to reference a custom field in SQL

I am using mssql and am having trouble using a subquery. The real query is quite complicated, but it has the same structure as this:
select
customerName,
customerId,
(
select count(*)
from Purchases
where Purchases.customerId=customerData.customerId
) as numberTransactions
from customerData
And what I want to do is order the table by the number of transactions, but when I use
order by numberTransactions
It tells me there is no such field. Is it possible to do this? Should I be using some sort of special keyword, such as this, or self?
Use the column position, in this case:
order by 3
Sometimes you have to wrestle with SQL's syntax (expected scope of clauses)
SELECT *
FROM
(
select
customerName,
customerId,
(
select count(*)
from Purchases
where Purchases.customerId=customerData.customerId
) as numberTransactions
from customerData
) as sub
order by sub.numberTransactions
Also, a solution using JOIN is correct. Look at the query plan, SQL Server should give identical plans for both solutions.
Do an inner join. It's much easier and more readable.
select
c.customerName,
c.customerID,
count(*) as numberTransactions
from
customerdata c inner join purchases p on c.customerID = p.customerID
group by c.customerName, c.customerID
order by numberTransactions
EDIT: Hey Nathan,
You realize you can inner join this whole query as a subquery, right?
Select T.*, T2.*
From T inner join
(select
c.customerName,
c.customerID,
count(*) as numberTransactions
from
customerdata c inner join purchases p on c.customerID = p.customerID
group by c.customerName, c.customerID
) T2 on T.CustomerID = T2.CustomerID
order by T2.numberTransactions
Or if that's no good you can construct your queries using temporary tables (#T1 etc)
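One difference between the subquery and the inner join answers is worth checking before switching: the inner join drops customers who have no purchases at all, while the subquery reports them with a count of 0. The sketch below shows this on made-up rows, and adds a LEFT JOIN variant that keeps them. It runs on SQLite through Python, not on SQL Server, so it illustrates the result sets only, not SQL Server's alias rules.

```python
import sqlite3

# Hypothetical sample data: Ann has two purchases, Bob has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customerData (customerId INTEGER PRIMARY KEY, customerName TEXT);
    CREATE TABLE Purchases (customerId INTEGER);
    INSERT INTO customerData VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Purchases VALUES (1), (1);
""")

# Subquery wrapped in a derived table so the alias is an ordinary column.
subquery_rows = conn.execute("""
    SELECT *
    FROM (
        SELECT c.customerName, c.customerId,
               (SELECT COUNT(*) FROM Purchases p
                WHERE p.customerId = c.customerId) AS numberTransactions
        FROM customerData c
    ) AS sub
    ORDER BY sub.numberTransactions
""").fetchall()

# Inner join: customers without purchases disappear.
inner_rows = conn.execute("""
    SELECT c.customerName, c.customerId, COUNT(*) AS numberTransactions
    FROM customerData c
    INNER JOIN Purchases p ON c.customerId = p.customerId
    GROUP BY c.customerName, c.customerId
    ORDER BY numberTransactions
""").fetchall()

# Left join, counting the purchase column so unmatched customers get 0.
left_rows = conn.execute("""
    SELECT c.customerName, c.customerId, COUNT(p.customerId) AS numberTransactions
    FROM customerData c
    LEFT JOIN Purchases p ON c.customerId = p.customerId
    GROUP BY c.customerName, c.customerId
    ORDER BY numberTransactions
""").fetchall()

print(subquery_rows)
print(inner_rows)
print(left_rows)
```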
There are better ways to get your result, but going just from your example query, this will work on SQL2000 or better
if you wrap your alias in single ticks ('numberTransactions') and then call ORDER BY 'numberTransactions':
select
customerName,
customerId,
(
select count(*)
from Purchases
where Purchases.customerId=customerData.customerId
) as 'numberTransactions'
from customerData
ORDER BY 'numberTransactions'
The same thing could be achieved by using GROUP BY and a JOIN, and you'll be rid of the subquery. This might be faster too.
I think you can do this in SQL2005, but not SQL2000.
You need to duplicate your logic. SQL Server isn't very smart about columns that you've named but that aren't part of the dataset in your FROM clause.
So use
select
customerName,
customerId,
(
select count(*)
from Purchases p
where p.customerId = c.customerId
) as numberTransactions
from customerData c
order by (select count(*) from purchases p where p.customerID = c.customerid)
Also, use aliases, they make your code easier to read and maintain. ;)