SQL Beginner: Getting items from 2 tables (+grouping+ordering)

SQL Beginner: Getting items from 2 tables (+grouping+ordering) - sql

I have an e-commerce website (using VirtueMart) and I sell products that consist child products. When a product is a parent, it doesn't have ParentID, while it's children refer to it. I know, not the best logic but I didn't create it.
My SQL is very basic and I believe I ask for something quite easy to achieve
Select products that have children.
Sort results by prices (ASC/DSC).

SELECT * FROM Products INNER JOIN Prices ON Products.ProductID = Prices.ProductID ORDER BY Products.Price [ASC/DSC]
Explanation:
SELECT - Select (Get/Retrieve)
* - ALL
FROM Products - Get them from a DB Table named "Products".
INNER JOIN Prices - Selects all rows from both tables as long as there is a match between the columns in both tables. Rather, JOIN DB Table "Products" with DB Table "Prices".
ON - Like WHERE, this defines which rows will be checked for matches.
Products.ProductID = Prices.ProductID - Your match criteria. Get the rows where "ProductID" exists in both DB Tables "Products" and "Prices".
ORDER BY Products.Price [ASC/DSC] - Sorting. Use ASC for Ascending, DSC for Descending.

This table design is subpar for a number of reasons. First, it appears that the value 0 is being used to indicate lack of a parent (as there's no 0 ID for products). Typically this will be a NULL value instead.
If it were a NULL value, the SQL statement to get everything without a parent would be as simple as this:
SELECT * FROM Products WHERE ParentID IS NULL
However, we can't do that. If we make the assumption that 0 = no parent, we can do this:
SELECT * FROM Products WHERE ParentID = 0
However, that's a dangerous assumption to make. Thus, the correct way to do this (given your schema above), would be to compare the two tables and ensure that the parentID exists as a ProductID:
SELECT a.*
FROM Products AS a
WHERE EXISTS (SELECT * FROM Products AS b WHERE a.ID = b.ParentID)
Next, to get the pricing, we have to join those two tables together on a common ID. As the Prices table seems to reference a ProductID, we can use that like so:
SELECT p.ProductID, p.ProductName, pr.Price
FROM Products AS p INNER JOIN Prices AS pr ON p.ProductID = pr.ProductID
WHERE EXISTS (SELECT * FROM Products AS b WHERE p.ID = b.ParentID)
ORDER BY pr.Price
That might be sufficient per the data you've shown, but usually that type of table structure indicates that it's possible to have more than one price associated with a product (we're unable to tell whether this is true based on the quick snapshot).
That should get you close... if you need something more, we'll need more detail.

use the below script if you are using ssms.
SELECT pd.ProductId,ProductName,Price
FROM product pd
LEFT JOIN price pr ON pd.ProductId=pr.ProductID
WHERE EXISTS (SELECT 1 FROM product pd1 WHERE pd.productID=pd1.ParentID)
ORDER BY pr.Price ASC
Note :neither of your parent product have price in price table. If you want the sum of price of their child product use the below script.
SELECT pd.ProductId,pd.ProductName,SUM(ISNULL(pr.Price,0)) SUM_ChildPrice
FROM product pd
LEFT JOIN product pd1 ON pd.productID=pd1.ParentID
LEFT JOIN price pr ON pd1.ProductId=pr.ProductID
GROUP BY pd.ProductId,pd.ProductName
ORDER BY pr.Price ASC

You will have to use self-join:
For example:
SELECT * FROM products parent
JOIN products children ON parent.id = children.parent_id
JOIN prices ON prices.product_id = children.id
ORDER BY prices.price
Because we are using JOIN it will filter out all entries that don't have any children.
I haven't tested it, I hope it would work.

Related

SQL how to count the number of relations between two tables and include zeroes?

I have a table of orders, and a table of products contained in these orders. (The products-table has order_id, a foreign key referring to orders.id).
I would like to query the number of products contained in each order. However, I also want orders to be contained in the results if they do not contain any products at all.
This means that a simple
SELECT *, COUNT(*) n_products FROM `orders` INNER JOIN `products` on `products.order_id` = `orders.id` GROUP_BY `order_id`
does not work, since orders without any products disappear.
Using a LEFT OUTER JOIN instead would add rows without product-information, but the distinction between an order with 1 product and an order with 0 products is lost.
What am I missing here?

You need a left join here, and you should be counting some column from the products table:
SELECT
o.*,
COUNT(p.order_id) AS n_products
FROM orders o
LEFT JOIN products p
ON p.order_id = o.id
GROUP BY
o.id;
Note that I assume that Postgres would allow grouping by orders.id and then selecting all columns from that table. If not, then you would only be able to select o.id in addition to the count.

Deleting data from one table if the reference doesn't exist in two other tables

I managed to import too much data into one of my database tables. I want to delete most of this data, but I need to ensure that the reference doesn't exist in either of two other tables before I delete it.
I figured this query would be the solution. It give me the right result on a test database, but in the production environment it returns no hits.
select product
from products
where 1=1
and product not in (select product from location)
and product not in (select product from lines)

You are getting no results/hits it means that you table location and/or lines having the null values in the product column. in clause failed if column having null value.
try below query just added the null condition on the top of your shared query.
select product from products
where 1=1
and product not in ( select product from location where product is not null)
and product not in ( select product from lines where product is not null)

Use EXISTS instead of IN which is more efficient
DELETE FROM products WHERE
NOT EXISTS
(
SELECT
1
FROM [Location]
WHERE Product = Products.Product
) AND
NOT EXISTS
(
SELECT
1
FROM lines
WHERE Product = Products.Product
)

Try this..
DELETE FROM Products where not exists
(select 1 from Location
join lines on lines.Product = Location.Product
and Location.Product = Products.Product
);

It's difficult to tell from your post why the query would return results in the test database but not production other than there is different data or different structures. You might try including the DDL for the participating tables in your post so that we know what the table structures are. For example, is the "product" column a PK or a text name?
One thing that does jump out is that your query will probably perform poorly. Try something like this instead: (Assuming the "product" column is a PK in Products and FK in the other tables.)
Select product
From Products As p
Left Outer Join Location As l
On p.product = l.product
And l.product is null
Left Outer Join Lines as li
On p.product = li.product
And li.product is null;

This simple set based approach may help ...
DELETE p
FROM products p
LEFT JOIN location lo ON p.product = lo.product
LEFT JOIN lines li ON p.product = li.product
WHERE lo.product IS NULL AND li.product IS NULL

SQL Server query for related products

I am trying to get related products but the issue which I'm facing is that there is product photos table which has one-to-many relationship with products table, so when I get products by matching category Id it also returns multiple product photos with that product which i do not want. I want only one product photo from product photos table of specific product. Is there any way to use distinct in joins or any other way? what I have done so far....
SELECT [Product].[ID],
,[Thumbnail]
,[ProductName]
,[Model]
,[SKU]
,[Price]
,[IsExclusive]
,[DiscountPercentage]
,[DiscountFixed]
,[NetPrice]
,[Url]
FROM [dbo].[Product]
INNER JOIN [ProductPhotos] ON [ProductPhotos].[ProductID]=[Product].[ID]
INNER JOIN [ProductCategories] ON [ProductCategories].[ProductID]=
[Product].[ID]
WHERE [ProductCategories].[CategoryID]=4
And the result I am getting is...
Product Photos table has
Is there any way to use distinct or group by on product Id column in product photos table to return only one row from photos table.

Instead of using inner join, use cross apply:
SELECT . . .
FROM dbo.Product p CROSS APPLY
(SELECT TOP (1) pp.*
FROM ProductPhotos pp
WHERE pp.ProductID = p.id
ORDER BY NEW_ID()
) pp INNER JOIN
ProductCategories pc
ON pc.ProductID = p.id
WHERE pc.CategoryID = 4;
Notes:
The ORDER BY NEWID() chooses a random photo. You can order by specific columns to get the earliest, latest, biggest, or whatever.
Note that I added table aliases. These make the query easier to write and to read.
You should qualify all column names in your query, so it is clear which tables they come from.
I removed the superfluous square braces. They just make the query harder to write and to read.

You can use ROW_NUMBER() to return one row for ProductID, like this:
JOIN (SELECT *,
ROW_NUMBER() OVER (PARTITION BY ProductID ORDER BY PhotoID) rn
FROM [ProductPhotos]) [ProductPhotos]
ON [ProductPhotos].[ProductID]=[Product].[ID] AND [ProductPhotos].rn = 1
Instead of this:
JOIN [ProductPhotos] ON [ProductPhotos].[ProductID]=[Product].[ID]

you can use sub query in join with distinct instead of joining table directly.
you can create alias and use that column as distinct in select statement, but it will create performance issues when having loads of data inside.
if you have 3 different photos for same product Id (like 2). you can use sub-query with top 1 order by PK desc to get latest picture.

SQL many-to-many, how to check criteria on multiple rows

In many-to-many table, how to find ID where all criteria are matched, but maybe one row matches one criterion and another row matches another criterion?
For example, let's say I have a table that maps shopping carts to products, and another table where the products are defined.
How can I find a shopping cart that has at least one one match for every criterion?
Criteria could be, for example, product.category like '%fruit%', product.category like '%vegetable%', etc.
Ultimately I want to get back a shopping cart ID (could be all of them, but in my specific case I am happy to get any matching ID) that has at least one of each match in it.

I am assuming a table named cart_per_product with fields cart,product, and a table named product with fields product,category.
select cart from cart_per_product c
where exists
(
select 1 from product p1 where p1.product=c.product and p1.category like N'%fruit%'
)
and exists
(
select 1 from product p2 where p2.product=c.product and p2.category like N'%vegetable%'
)

You can use ANY and ALL operators combined with outer joins. A simple sample on a M:N relation:
select p.name
from products p
where id_product = ALL -- all operator
( select pc.id_product
from categories c
left outer join product_category pc on pc.id_product = p.id_product and
pc.id_category = c.id_category
)

I think you can figure out the column names
select c.id
from cart c
join product p
on c.pID = p.ID
group by c.id
having count(distinct p.catID) = (select count(distinct p.catID) from product)

Generic approach that possibly isn't the most efficient:
with data as (
select *,
count(case when <match condition> then 1 end)
over (partition by cartid) as matches
from <cart inner join products ...>
)
select * from data
where matches > 0;

Excluding multiple results in specific column (SQL JOIN)

I'm taking my first steps in terms of practical SQL use in real life.
I have a few tables with contractual and financial information and the query works exactly as I need - to a certain point. It looks more or less like that:
SELECT /some columns/ from CONTRACTS
Linked 3 extra tables with INNER JOIN to add things like department names, product information etc. This all works but they all have simplish one-to-one relationship (one contract related to single department in Department table, one product information entry in the corresponding table etc).
Now this is my challenge:
I also need to add contract invoicing information doing something like:
inner join INVOICES on CONTRACTS.contnoC = INVOICES.contnoI
(and selecting also the Invoice number linked to the Contract number, although that's partly optional)
The problem I'm facing is that unlike with other tables where there's always one-to-one relationship when joining tables, INVOICES table can have multiple (or none at all) entries that correspond to a single contract no. The result is that I will get multiple query results for a single contract no (with different invoice numbers presented), needlessly crowding the query results.
Essentially I'm looking to add INVOICES table to a query to just identify if the contract no is present in the INVOICES table (contract has been invoiced or not). Invoice number itself could be presented (it is with INNER JOIN), however it's not critical as long it's somehow marked. Invoice number fields remains blank in the result with the INNER JOIN function, which is also necessary (i.e. to have the row presented even if the match is not found in INVOICES table).
SELECT DISTINCT would look to do what I need, but I seemed to face the problem that I need to levy DISTINCT criteria only for column representing contract numbers, NOT any other column (there can be same values presented, but all those should be presented).
Unfortunately I'm not totally aware of what database system I am using.

Seems like the question is still getting some attention and in an effort to provide some explanation here are a few techniques.
If you just want any contract with details from the 1 to 1 tables you can do it similarily to what you have described. the key being NOT to include any column from Invoices table in the column list.
SELECT
DISTINCT Contract, Department, ProductId .....(nothing from Invoices Table!!!)
FROM
Contracts c
INNER JOIN Departments D
ON c.departmentId = d.Department
INNER JOIN Product p
ON c.ProductId = p.ProductId
INNER JOIN Invoices i
ON c.contnoC = i.contnoI
Perhaps a Little cleaner would be to use IN or EXISTS like so:
SELECT
Contract, Department, ProductId .....(nothing from Invoices Table!!!)
FROM
Contracts c
INNER JOIN Departments D
ON c.departmentId = d.Department
INNER JOIN Product p
ON c.ProductId = p.ProductId
WHERE
EXISTS (SELECT 1 FROM Invoices i WHERE i.contnoI = c.contnoC )
SELECT
Contract, Department, ProductId .....(nothing from Invoices Table!!!)
FROM
Contracts c
INNER JOIN Departments D
ON c.departmentId = d.Department
INNER JOIN Product p
ON c.ProductId = p.ProductId
WHERE
contnoC IN (SELECT contnoI FROM Invoices)
Don't use IN if the SELECT ... list can return a NULL!!!
If you Actually want all of the contracts and just know if a contract has been invoiced you can use aggregation and a case expression:
SELECT
Contract, Department, ProductId, CASE WHEN COUNT(i.contnoI) = 0 THEN 0 ELSE 1 END as Invoiced
FROM
Contracts c
INNER JOIN Departments D
ON c.departmentId = d.Department
INNER JOIN Product p
ON c.ProductId = p.ProductId
LEFT JOIN Invoices i
ON c.contnoC = i.contnoI
GROUP BY
Contract, Department, ProductId
Then if you actually want to return details about a particular invoice you can use a technique similar to that of cybercentic87 if your RDBMS supports or you could use a calculated column with TOP or LIMIT depending on your system.
SELECT
Contract, Department, ProductId, (SELECT TOP 1 InvoiceNo FROM invoices i WHERE c.contnoC = i.contnoI ORDER BY CreateDate DESC) as LastestInvoiceNo
FROM
Contracts c
INNER JOIN Departments D
ON c.departmentId = d.Department
INNER JOIN Product p
ON c.ProductId = p.ProductId
GROUP BY
Contract, Department, ProductId

I would do it this way:
with mainquery as(
<<here goes you main query>>
),
invoices_rn as(
select *,
ROW_NUMBER() OVER (PARTITION BY contnoI order by
<<some column to decide which invoice you want to take eg. date>>) as rn
)
invoices as (
select * from invoices_rn where rn = 1
)
select * from mainquery
left join invoices i on contnoC = i.contnoI
This gives you an ability to get all of the invoice details to your query, also it gives you full control of which invoice you want see in your main query. Please read more about CTEs; they are pretty handy and much easier to understand / read than nested selects.
I still don't know what database you are using. If ROW_NUMBER is not available, I will figure out something else :)
Also with a left join you should use COALESCE function for example:
COALESCE(i.invoice_number,'0')
Of course this gives you some more possibilities, you could for example in your main select do:
CASE WHEN i.invoicenumber is null then 'NOT INVOICED'
else 'INVOICED'
END as isInvoiced

You can use
SELECT ..., invoiced = 'YES' ... where exists ...
union
SELECT ..., invoiced = 'NO' ... where not exists ...
or you can use a column like "invoiced" with a subquery into invoices to set it's value depending on whether you get a hit or not

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas