Find a value which is contained in ALL rows of a table - sql

I need to select all values which are contained in ALL rows of a table.
I have two tables: Ingredient and ProductIngredient (the latter holds the recipe of each product).
Ingredient
| ingredient_id | name | price |
| 1 | Bla | 100
| 2 | foo | 50
ProductIngredient.
| Product_id | ingredient_id
| 1 | 1
| 1 | 2
| 2 | 1
The output should be
| 1 | Bla |
as it is in all rows of ProductIngredient.
SELECT DISTINCT Ingredient_Id
FROM Ingredients I
WHERE Ingredient_Id = ALL
(SELECT Ingredient_id FROM ProductIngredient PI
WHERE PI.Ingredient_Id = I.Ingredient_Id );
How can I fix my code to make it work?

This will give you all ingredients from I that appear in PI for every product. It assumes that PI does not contain duplicate rows for the same product and ingredient combination.
SELECT I.Ingredient_Id
FROM Ingredients I INNER JOIN ProductIngredient PI
ON PI.Ingredient_Id = I.Ingredient_Id
GROUP BY I.Ingredient_Id
HAVING COUNT(*) >= (SELECT COUNT(DISTINCT Product_id) FROM ProductIngredient)
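The counting approach above can be tried end to end with a small sketch. This is only an illustration under assumptions: it uses Python's built-in sqlite3 module rather than the asker's DBMS, the table names follow the question's sample data, and it counts DISTINCT products per ingredient so that duplicate rows in ProductIngredient would not inflate the count.

```python
import sqlite3

# In-memory database loaded with the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Ingredient (ingredient_id INTEGER PRIMARY KEY, name TEXT, price INTEGER);
CREATE TABLE ProductIngredient (product_id INTEGER, ingredient_id INTEGER);
INSERT INTO Ingredient VALUES (1, 'Bla', 100), (2, 'foo', 50);
INSERT INTO ProductIngredient VALUES (1, 1), (1, 2), (2, 1);
""")

# An ingredient qualifies when the number of distinct products using it
# equals the total number of distinct products.
query = """
SELECT i.ingredient_id, i.name
FROM Ingredient i
JOIN ProductIngredient pi ON pi.ingredient_id = i.ingredient_id
GROUP BY i.ingredient_id, i.name
HAVING COUNT(DISTINCT pi.product_id) =
       (SELECT COUNT(DISTINCT product_id) FROM ProductIngredient)
"""
common_ingredients = conn.execute(query).fetchall()
print(common_ingredients)
```

With the sample rows, only ingredient 1 is used by both products, so it is the only row returned.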

Related

Return rows from a table and add field for that row if the ID has a relationship with another table

DBMS used: Amazon Aurora
I have a table that I store a list of all my products, let's call it products
+----+--------------+
| id | product_name |
+----+--------------+
| 1 | Product 1 |
+----+--------------+
| 2 | Product 2 |
+----+--------------+
| | |
+----+--------------+
Another table called redeemed_products stores the ID of the product that the user has redeemed.
+----+---------+------------+
| id | user_id | product_id |
+----+---------+------------+
| 1 | 1 | 1 |
+----+---------+------------+
| | | |
+----+---------+------------+
| | | |
+----+---------+------------+
I would like to retrieve all rows of products and add an extra field to the row which has a relation in the redeemed_products
+----+--------------+----------+
| id | product_name | redeemed |
+----+--------------+----------+
| 1 | Product 1 | true |
+----+--------------+----------+
| 2 | Product 2 | |
+----+--------------+----------+
| | | |
+----+--------------+----------+
The purpose of this is to retrieve the list of products and it will show which of the product has already been redeemed by the user. I do not know how I should approach this problem.
Use an outer join:
select p.id, p.product_name, rp.product_id is not null as redeemed
from products p
left join redeemed_products rp on rp.product_id = p.id;
Note that this will repeat rows from the products table if the product_id occurs more than once in the redeemed_products table (e.g. the same product_id for multiple user_ids).
If that is the case you could use a scalar sub-select:
select p.id, p.product_name,
exists (select *
from redeemed_products rp
where rp.product_id = p.id) as redeemed
from products p;
You haven't tagged your DBMS. The above is standard ANSI SQL, but not all DBMS products actually support boolean expressions like that in the SELECT list.
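The difference between the two queries can be seen in a small sketch. It is run on SQLite through Python's sqlite3 module, which is an assumption (the question uses Amazon Aurora), and SQLite reports booleans as 1 and 0. The second redemption row and the user_id filter are hypothetical additions, included to show the duplicate-row effect and a per-user flag.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE redeemed_products (id INTEGER PRIMARY KEY, user_id INTEGER, product_id INTEGER);
INSERT INTO products VALUES (1, 'Product 1'), (2, 'Product 2');
-- Hypothetical data: product 1 redeemed by two different users.
INSERT INTO redeemed_products VALUES (1, 1, 1), (2, 2, 1);
""")

# Outer join: product 1 is repeated once per matching redemption row.
join_rows = conn.execute("""
SELECT p.id, p.product_name, rp.product_id IS NOT NULL AS redeemed
FROM products p
LEFT JOIN redeemed_products rp ON rp.product_id = p.id
ORDER BY p.id
""").fetchall()

# EXISTS sub-select: exactly one row per product, here limited to one user.
exists_rows = conn.execute("""
SELECT p.id, p.product_name,
       EXISTS (SELECT 1
               FROM redeemed_products rp
               WHERE rp.product_id = p.id
                 AND rp.user_id = ?) AS redeemed
FROM products p
ORDER BY p.id
""", (1,)).fetchall()

print(join_rows)
print(exists_rows)
```

The join returns three rows for two products, while the EXISTS version always returns one row per product.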
One option would be using a conditional within a LEFT JOIN query :
SELECT p.*, CASE WHEN r.product_id IS NOT NULL THEN 'true' END AS redeemed
FROM products p
LEFT JOIN redeemed_products r
ON r.product_id = p.id

SQL - IN clause with no match

I'm trying to build a query where I can select from a table all products with a certain ID but I would also like to find out what products were not found within the IN clause.
Product Table
ID | Name
---|---------
1 | ProductA
2 | ProductB
4 | ProductD
5 | ProductE
6 | ProductF
7 | ProductG
select *
from products
where id in (2,3,7);
As you can see, product id 3 does not exist in the table.
My query will only return rows 2 and 7.
I would like a blank/null row returned if a value in the IN clause did not return anything.
Desired Results:
ID | Name
---|---------
2 | ProductB
3 | null
7 | ProductG
You can use a left join:
select i.id, p.name
from (select 2 as id union all select 3 union all select 7
) i left join
products p
on p.id = i.id
IN is not useful in this case.
Use a CTE with the ids that you want to search for and left join to the table:
with cte(id) as (select * from (values (2),(3),(7)))
select c.id, p.name
from cte c left join products p
on p.id = c.id
Results:
| id | Name |
| --- | -------- |
| 2 | ProductB |
| 3 | |
| 7 | ProductG |
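A runnable sketch of the same idea, assuming SQLite via Python's sqlite3 module (the CTE name wanted is made up for illustration). The list of ids becomes a small derived table, and the left join keeps ids that have no matching product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO products VALUES
  (1, 'ProductA'), (2, 'ProductB'), (4, 'ProductD'),
  (5, 'ProductE'), (6, 'ProductF'), (7, 'ProductG');
""")

# The searched ids drive the query; missing products come back as NULL.
lookup_rows = conn.execute("""
WITH wanted(id) AS (VALUES (2), (3), (7))
SELECT w.id, p.name
FROM wanted w
LEFT JOIN products p ON p.id = w.id
ORDER BY w.id
""").fetchall()
print(lookup_rows)
```

Id 3 is returned with a NULL name (None in Python), which matches the desired output in the question.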

How to write this query to avoid cartesian product?

I want to create a CSV export for orders showing the warehouse_id where each order_item had shipped from, if available.
For brevity, here is the pertinent schema:
create table o (id integer);
orders have many order_items:
create table oi (id integer, o_id integer, sku text, quantity integer);
For each order_item in the CSV we want to show the warehouse_id it shipped from. But that is not stored in order_items; it is stored in the shipment.
An order can be split up into many shipments, potentially from different warehouses.
create table s (id integer, o_id integer, warehouse_id integer);
shipments have many shipment items too:
create table si (id integer, s_id integer, oi_id integer, quantity_shipped integer);
How do I extract the warehouse_id for each order_item, given that warehouse_id is on the shipment and not every order has shipped yet (so it may not have a shipment record or shipment_items)?
We are doing something like this (simplified):
select oi.sku, s.warehouse_id from oi
left join s on s.o_id = oi.o_id;
However, suppose an order has 2 order items, let's call them sku A and B, and that order was split into two shipments, where A was shipped from warehouse '50' and a second shipment shipped B from '200'.
What we want would be a CSV output like:
sku | warehouse_id
-----|--------------
A | 50
B | 200
But what we get is some kind of cartesian product:
=================================
Here is the sample data:
select * from o;
id
----
1
(1 row)
select * from oi;
id | o_id | sku | quantity
----+------+-----+----------
1 | 1 | A | 1
2 | 1 | B | 1
(2 rows)
select * from s;
id | o_id | warehouse_id
----+------+--------------
1 | 1 | 50
2 | 1 | 200
(2 rows)
select * from si;
id | s_id | oi_id
----+------+------
1 | 1 | 1
2 | 2 | 2
(2 rows)
select oi.sku, s.warehouse_id from oi left join s on s.o_id = oi.o_id;
sku | warehouse_id
-----+--------------
A | 50
A | 200
B | 50
B | 200
(4 rows)
UPDATE ========
Per spencer, I'm adding a different example with different pk ids for more clarity. The following is 2 example orders. Order 2 has items A,B,C. A,B are shipped from shipment 200, C is shipped from shipment 201. Order 3 has 2 items E and A. E is not yet shipped and A is shipped twice out of the same warehouse '700', (like it was on back order).
# select * from o;
id
----
2
3
(2 rows)
# select * from oi;
id | o_id | sku | quantity
-----+------+-----+----------
100 | 2 | A | 1
101 | 2 | B | 1
102 | 2 | C | 1
103 | 3 | E | 1
104 | 3 | A | 2
(5 rows)
# select * from s;
id | o_id | warehouse_id
-----+------+--------------
200 | 2 | 700
201 | 2 | 800
202 | 3 | 700
203 | 3 | 700
(4 rows)
# select * from si;
id | s_id | oi_id
-----+------+-------
300 | 200 | 100
301 | 200 | 101
302 | 201 | 102
303 | 202 | 104
304 | 203 | 104
(5 rows)
I think this works: I use left joins to keep the order_items in the report whether or not the order has shipped, and group by to squash multiple shipments from the same warehouse. I believe this is what I need.
# select oi.o_id, oi.id, oi.sku, s.warehouse_id from oi left join si on si.oi_id = oi.id left join s on s.id = si.s_id group by oi.o_id, oi.id, oi.sku, s.warehouse_id order by oi.o_id;
o_id | id | sku | warehouse_id
------+-----+-----+--------------
2 | 102 | C | 800
2 | 101 | B | 700
2 | 100 | A | 700
3 | 104 | A | 700
3 | 103 | E |
(5 rows)
Order items that have shipped ...
SELECT oi.id
, oi.sku
, s.warehouse_id
FROM oi
JOIN si ON si.oi_id = oi.id
JOIN s ON s.id = si.s_id
Order items that haven't yet shipped, using anti-join to exclude rows where there is a matching row in si
SELECT oi.id
, oi.sku
, s.warehouse_id
FROM oi
JOIN s ON s.o_id = oi.o_id -- fk to fk shortcut join
-- anti-join
LEFT
JOIN si ON si.oi_id = oi.id
WHERE si.oi_id IS NULL
But this will still produce a (partial) Cartesian product. We can add a GROUP BY clause to collapse the rows...
GROUP BY oi.id
Group on oi.id, since si.oi_id is NULL on every row the anti-join keeps. This doesn't avoid producing an intermediate cartesian product; the addition of the GROUP BY clause only collapses the set. It is also indeterminate which of the matching rows from s the returned column values will come from.
The two queries could be combined with a UNION ALL operation. If I did that, I'd likely add a discriminator column (an additional column in each query with different values, which would tell which query returned a row.)
This set might meet the specification outlined in the OP question. But I don't think this is really the set that needs to be returned. Figuring out which warehouse an item should ship from may involve several factors... total quantity ordered, quantity available in each warehouse, can order be fulfilled from one warehouse, which warehouse is closer to delivery destination, etc.
I don't want to leave anyone with the impression that this query is really a "fix" for the cartesian product problem... this query just hides a bigger problem.
I think you need the si table:
select oi.sku, s.warehouse_id
from si join
oi
on si.oi_id = oi.id join
s
on s.id = si.s_id;
si seems to be the proper junction table between the tables. I'm not sure why there is another join key that doesn't use it.
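As a sketch of the join path through si, the following loads the second sample data set from the question into SQLite through Python's sqlite3 module (an assumption; the question appears to use PostgreSQL). Left joins keep unshipped order items, and DISTINCT stands in for the GROUP BY that collapses repeated shipments from the same warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE oi (id INTEGER, o_id INTEGER, sku TEXT, quantity INTEGER);
CREATE TABLE s  (id INTEGER, o_id INTEGER, warehouse_id INTEGER);
CREATE TABLE si (id INTEGER, s_id INTEGER, oi_id INTEGER);
INSERT INTO oi VALUES (100, 2, 'A', 1), (101, 2, 'B', 1), (102, 2, 'C', 1),
                      (103, 3, 'E', 1), (104, 3, 'A', 2);
INSERT INTO s VALUES (200, 2, 700), (201, 2, 800), (202, 3, 700), (203, 3, 700);
INSERT INTO si VALUES (300, 200, 100), (301, 200, 101), (302, 201, 102),
                      (303, 202, 104), (304, 203, 104);
""")

# oi -> si -> s follows the junction table, so each order item only meets
# the shipments it was actually part of, not every shipment of its order.
warehouse_rows = conn.execute("""
SELECT DISTINCT oi.o_id, oi.id, oi.sku, s.warehouse_id
FROM oi
LEFT JOIN si ON si.oi_id = oi.id
LEFT JOIN s  ON s.id = si.s_id
ORDER BY oi.o_id, oi.id
""").fetchall()
print(warehouse_rows)
```

Item 103 (sku E) has no shipment item yet, so its warehouse_id is NULL, and item 104 appears once even though it shipped twice from warehouse 700.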

Doing a market basket analysis on the order details

I have a table that looks (abbreviated) like:
| order_id | item_id | amount | qty | date |
|---------- |--------- |-------- |----- |------------ |
| 1 | 1 | 10 | 1 | 10-10-2014 |
| 1 | 2 | 20 | 2 | 10-10-2014 |
| 2 | 1 | 10 | 1 | 10-12-2014 |
| 2 | 2 | 20 | 1 | 10-12-2014 |
| 2 | 3 | 45 | 1 | 10-12-2014 |
| 3 | 1 | 10 | 1 | 9-9-2014 |
| 3 | 3 | 45 | 1 | 9-9-2014 |
| 4 | 2 | 20 | 1 | 11-11-2014 |
I would like to run a query that would calculate the list of items
that most frequently occur together.
In this case the result would be:
|items|frequency|
|-----|---------|
|1,2 |2 |
|1,3 |1 |
|2,3 |1 |
|2 |1 |
Ideally, first presenting orders with more than one items, then presenting
the most frequently ordered single items.
Could anyone please provide an example for how to structure this SQL?
This query generates all of the requested output for the cases where 2 items occur together. It doesn't include the last item of the requested output, since a single value (2) technically doesn't occur together with anything, although you could easily add a UNION query to include values that occur alone.
This is written for PostgreSQL 9.3
create table orders(
order_id int,
item_id int,
amount int,
qty int,
date timestamp
);
INSERT INTO ORDERS VALUES(1,1,10,1,'10-10-2014');
INSERT INTO ORDERS VALUES(1,2,20,2,'10-10-2014');
INSERT INTO ORDERS VALUES(2,1,10,1,'10-12-2014');
INSERT INTO ORDERS VALUES(2,2,20,1,'10-12-2014');
INSERT INTO ORDERS VALUES(2,3,45,1,'10-12-2014');
INSERT INTO ORDERS VALUES(3,1,10,1,'9-9-2014');
INSERT INTO ORDERS VALUES(3,3,45,1,'9-9-2014');
INSERT INTO ORDERS VALUES(4,2,20,1,'11-11-2014');
with order_pairs as (
select (pg1.item_id, pg2.item_id) as items, pg1.date
from
(select distinct item_id, date
from orders) as pg1
join
(select distinct item_id, date
from orders) as pg2
ON
(
pg1.date = pg2.date AND
pg1.item_id != pg2.item_id AND
pg1.item_id < pg2.item_id
)
)
SELECT items, count(*) as frequency
FROM order_pairs
GROUP by items
ORDER by items;
output
items | frequency
-------+-----------
(1,2) | 2
(1,3) | 2
(2,3) | 1
(3 rows)
Market Basket Analysis with Join.
Self-join orders on order_id and keep only the pairs where the first item_id is less than the second, so each pair of items sold together appears once per order. Then group by the pair and count the rows for each combination.
select items,count(*) as 'Freq' from
(select concat(x.item_id,',',y.item_id) as items from orders x
JOIN orders y ON x.order_id = y.order_id and
x.item_id != y.item_id and x.item_id < y.item_id) A
group by A.items order by A.items;
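The self-join on order_id can be sketched as follows, assuming SQLite via Python's sqlite3 module, where string concatenation is written with || instead of concat. The column name order_date and the second query for single-item orders are illustrative additions, the latter following the UNION idea mentioned in the first answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, item_id INTEGER, amount INTEGER,
                     qty INTEGER, order_date TEXT);
INSERT INTO orders VALUES
  (1, 1, 10, 1, '10-10-2014'), (1, 2, 20, 2, '10-10-2014'),
  (2, 1, 10, 1, '10-12-2014'), (2, 2, 20, 1, '10-12-2014'),
  (2, 3, 45, 1, '10-12-2014'),
  (3, 1, 10, 1, '9-9-2014'),   (3, 3, 45, 1, '9-9-2014'),
  (4, 2, 20, 1, '11-11-2014');
""")

# Pairs: x.item_id < y.item_id emits each unordered pair once per order.
pair_counts = conn.execute("""
SELECT x.item_id || ',' || y.item_id AS items, COUNT(*) AS frequency
FROM orders x
JOIN orders y ON x.order_id = y.order_id AND x.item_id < y.item_id
GROUP BY x.item_id, y.item_id
ORDER BY frequency DESC, items
""").fetchall()

# Singles: items from orders that contain exactly one line.
single_counts = conn.execute("""
SELECT CAST(item_id AS TEXT) AS items, COUNT(*) AS frequency
FROM orders
WHERE order_id IN (SELECT order_id FROM orders
                   GROUP BY order_id HAVING COUNT(*) = 1)
GROUP BY item_id
ORDER BY frequency DESC, items
""").fetchall()

print(pair_counts + single_counts)
```

On the sample data the pair (1,3) is counted twice, because it occurs in orders 2 and 3; this agrees with the output of the first answer rather than with the frequency listed in the question.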

Tree with recursive and default

Using Postgres.
I have a pricelists
CREATE TABLE pricelists(
id SERIAL PRIMARY KEY,
name TEXT,
parent_id INTEGER REFERENCES pricelists
);
and another table, prices, referencing it
CREATE TABLE prices(
pricelist_id INTEGER REFERENCES pricelists,
name TEXT,
value INTEGER NOT NULL,
PRIMARY KEY (pricelist_id, name)
);
Parent pricelist id=1 may have 10 prices.
Pricelist id=2 as a child of parent 1 may have 5 prices which override parent 1 prices of the same price name.
Child pricelist id=3, as a child of pricelist 2, may have 2 prices which override child 2 prices of the same price name.
Thus when I ask for child 3 prices, I want to get
all prices of child 3 and
those prices of its parent (child 2) that do not exist in child 3 and
all parent 1 prices that have not appeared so far.
The schema can be changed in order to be efficient.
Example:
If
SELECT pl.id AS id, pl.parent_id AS parent, p.name AS price_name, value
FROM pricelists pl
JOIN prices p ON pl.id = p.pricelist_id;
gives
| id | parent | price_name | value |
|----------|:-------------:|------------:|------------:|
| 1 | 1 | bb | 10 |
| 1 | 1 | cc | 10 |
| 2 | 1 | aa | 20 |
| 2 | 1 | bb | 20 |
| 3 | 2 | aa | 30 |
then I'm looking for a way of fetching pricelist_id = 3 prices that'd give me
| id | parent | price_name | value |
|----------|:-------------:|------------:|------------:|
| 1 | 1 | cc | 10 |
| 2 | 1 | bb | 20 |
| 3 | 2 | aa | 30 |
WITH RECURSIVE cte AS (
SELECT id, name, parent_id, 1 AS lvl
FROM pricelists
WHERE id = 3 -- provide your id here
UNION ALL
SELECT pl.id, pl.name, pl.parent_id, c.lvl + 1
FROM cte c
JOIN pricelists pl ON pl.id = c.parent_id
)
SELECT DISTINCT ON (p.name)
c.id, c.parent_id, p.name AS price_name, p.value
FROM cte c
JOIN prices p ON p.pricelist_id = c.id
ORDER BY p.name, c.lvl; -- lower lvl beats higher level
Use a recursive CTE like here:
Total children values based on parent
Recursive SELECT query to return rates of arbitrary depth?
There are many related answers.
Join to prices once at the end, that's cheaper.
Use DISTINCT ON to get the "greatest per group":
Select first row in each GROUP BY group?
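The recursive walk plus "nearest level wins" can be sketched on SQLite through Python's sqlite3 module. Two assumptions are made here: the root pricelist has a NULL parent_id (a root that points at itself would make the UNION ALL recursion loop forever), and because DISTINCT ON is PostgreSQL-specific, the sketch swaps in a correlated MIN(lvl) subquery to pick the closest price per name. The pricelist names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pricelists (id INTEGER PRIMARY KEY, name TEXT,
                         parent_id INTEGER REFERENCES pricelists);
CREATE TABLE prices (pricelist_id INTEGER REFERENCES pricelists, name TEXT,
                     value INTEGER NOT NULL, PRIMARY KEY (pricelist_id, name));
-- Assumption: the root has parent_id NULL so the recursion terminates.
INSERT INTO pricelists VALUES (1, 'root', NULL), (2, 'child', 1), (3, 'grandchild', 2);
INSERT INTO prices VALUES (1, 'bb', 10), (1, 'cc', 10),
                          (2, 'aa', 20), (2, 'bb', 20), (3, 'aa', 30);
""")

effective_prices = conn.execute("""
WITH RECURSIVE chain(id, parent_id, lvl) AS (
    SELECT id, parent_id, 1 FROM pricelists WHERE id = ?
    UNION ALL
    SELECT pl.id, pl.parent_id, c.lvl + 1
    FROM chain c
    JOIN pricelists pl ON pl.id = c.parent_id
),
candidates AS (
    -- Join to prices once, after the ancestor chain is known.
    SELECT c.id, c.parent_id, c.lvl, p.name, p.value
    FROM chain c
    JOIN prices p ON p.pricelist_id = c.id
)
SELECT a.id, a.parent_id, a.name, a.value
FROM candidates a
WHERE a.lvl = (SELECT MIN(b.lvl) FROM candidates b WHERE b.name = a.name)
ORDER BY a.name
""", (3,)).fetchall()
print(effective_prices)
```

For pricelist 3 this yields aa from pricelist 3, bb from pricelist 2 and cc from pricelist 1, matching the expected rows in the question apart from the NULL parent of the root.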