SQL Query 2 tables null results - sql

I was asked this question in an interview:
From the 2 tables below, write a query to pull customers with no sales orders.
How many ways to write this query and which would have best performance.
Table 1: Customer - CustomerID
Table 2: SalesOrder - OrderID, CustomerID, OrderDate
Query:
SELECT *
FROM Customer C
RIGHT OUTER JOIN SalesOrder SO ON C.CustomerID = SO.CustomerID
WHERE SO.OrderID = NULL
Is my query correct and are there other ways to write the query and get the same results?

Answering for MySQL instead of SQL Server, cause you tagged it later with SQL Server, so I thought (since this was an interview question, that it wouldn't bother you, for which DBMS this is). Note though, that the queries I wrote are standard sql, they should run in every RDBMS out there. How each RDBMS handles those queries is another issue, though.
I wrote this little procedure for you, to have a test case. It creates the tables customers and orders like you specified and I added primary keys and foreign keys, like one would usually do it. No other indexes, as every column worth indexing here is already primary key. 250 customers are created, 100 of them made an order (though out of convenience none of them twice / multiple times). A dump of the data follows, posted the script just in case you want to play around a little by increasing the numbers.
delimiter $$
create procedure fill_table()
begin
create table customers(customerId int primary key) engine=innodb;
set #x = 1;
while (#x <= 250) do
insert into customers values(#x);
set #x := #x + 1;
end while;
create table orders(orderId int auto_increment primary key,
customerId int,
orderDate timestamp,
foreign key fk_customer (customerId) references customers(customerId)
) engine=innodb;
insert into orders(customerId, orderDate)
select
customerId,
now() - interval customerId day
from
customers
order by rand()
limit 100;
end $$
delimiter ;
call fill_table();
For me, this resulted in this:
CREATE TABLE `customers` (
`customerId` int(11) NOT NULL,
PRIMARY KEY (`customerId`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `customers` VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24),(25),(26),(27),(28),(29),(30),(31),(32),(33),(34),(35),(36),(37),(38),(39),(40),(41),(42),(43),(44),(45),(46),(47),(48),(49),(50),(51),(52),(53),(54),(55),(56),(57),(58),(59),(60),(61),(62),(63),(64),(65),(66),(67),(68),(69),(70),(71),(72),(73),(74),(75),(76),(77),(78),(79),(80),(81),(82),(83),(84),(85),(86),(87),(88),(89),(90),(91),(92),(93),(94),(95),(96),(97),(98),(99),(100),(101),(102),(103),(104),(105),(106),(107),(108),(109),(110),(111),(112),(113),(114),(115),(116),(117),(118),(119),(120),(121),(122),(123),(124),(125),(126),(127),(128),(129),(130),(131),(132),(133),(134),(135),(136),(137),(138),(139),(140),(141),(142),(143),(144),(145),(146),(147),(148),(149),(150),(151),(152),(153),(154),(155),(156),(157),(158),(159),(160),(161),(162),(163),(164),(165),(166),(167),(168),(169),(170),(171),(172),(173),(174),(175),(176),(177),(178),(179),(180),(181),(182),(183),(184),(185),(186),(187),(188),(189),(190),(191),(192),(193),(194),(195),(196),(197),(198),(199),(200),(201),(202),(203),(204),(205),(206),(207),(208),(209),(210),(211),(212),(213),(214),(215),(216),(217),(218),(219),(220),(221),(222),(223),(224),(225),(226),(227),(228),(229),(230),(231),(232),(233),(234),(235),(236),(237),(238),(239),(240),(241),(242),(243),(244),(245),(246),(247),(248),(249),(250);
CREATE TABLE `orders` (
`orderId` int(11) NOT NULL AUTO_INCREMENT,
`customerId` int(11) DEFAULT NULL,
`orderDate` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`orderId`),
KEY `fk_customer` (`customerId`),
CONSTRAINT `orders_ibfk_1` FOREIGN KEY (`customerId`) REFERENCES `customers` (`customerId`)
) ENGINE=InnoDB AUTO_INCREMENT=128 DEFAULT CHARSET=utf8;
INSERT INTO `orders` VALUES (1,247,'2013-06-24 19:50:07'),(2,217,'2013-07-24 19:50:07'),(3,8,'2014-02-18 20:50:07'),(4,40,'2014-01-17 20:50:07'),(5,52,'2014-01-05 20:50:07'),(6,80,'2013-12-08 20:50:07'),(7,169,'2013-09-10 19:50:07'),(8,135,'2013-10-14 19:50:07'),(9,115,'2013-11-03 20:50:07'),(10,225,'2013-07-16 19:50:07'),(11,112,'2013-11-06 20:50:07'),(12,243,'2013-06-28 19:50:07'),(13,158,'2013-09-21 19:50:07'),(14,24,'2014-02-02 20:50:07'),(15,214,'2013-07-27 19:50:07'),(16,25,'2014-02-01 20:50:07'),(17,245,'2013-06-26 19:50:07'),(18,182,'2013-08-28 19:50:07'),(19,166,'2013-09-13 19:50:07'),(20,69,'2013-12-19 20:50:07'),(21,85,'2013-12-03 20:50:07'),(22,44,'2014-01-13 20:50:07'),(23,103,'2013-11-15 20:50:07'),(24,19,'2014-02-07 20:50:07'),(25,33,'2014-01-24 20:50:07'),(26,102,'2013-11-16 20:50:07'),(27,41,'2014-01-16 20:50:07'),(28,94,'2013-11-24 20:50:07'),(29,43,'2014-01-14 20:50:07'),(30,150,'2013-09-29 19:50:07'),(31,218,'2013-07-23 19:50:07'),(32,131,'2013-10-18 19:50:07'),(33,77,'2013-12-11 20:50:07'),(34,2,'2014-02-24 20:50:07'),(35,45,'2014-01-12 20:50:07'),(36,230,'2013-07-11 19:50:07'),(37,101,'2013-11-17 20:50:07'),(38,31,'2014-01-26 20:50:07'),(39,56,'2014-01-01 20:50:07'),(40,176,'2013-09-03 19:50:07'),(41,223,'2013-07-18 19:50:07'),(42,145,'2013-10-04 19:50:07'),(43,26,'2014-01-31 20:50:07'),(44,62,'2013-12-26 20:50:07'),(45,195,'2013-08-15 19:50:07'),(46,153,'2013-09-26 19:50:07'),(47,179,'2013-08-31 19:50:07'),(48,104,'2013-11-14 20:50:07'),(49,7,'2014-02-19 20:50:07'),(50,209,'2013-08-01 19:50:07'),(51,86,'2013-12-02 20:50:07'),(52,110,'2013-11-08 20:50:07'),(53,204,'2013-08-06 19:50:07'),(54,187,'2013-08-23 19:50:07'),(55,114,'2013-11-04 20:50:07'),(56,38,'2014-01-19 20:50:07'),(57,236,'2013-07-05 19:50:07'),(58,79,'2013-12-09 20:50:07'),(59,96,'2013-11-22 20:50:07'),(60,37,'2014-01-20 20:50:07'),(61,207,'2013-08-03 19:50:07'),(62,22,'2014-02-04 20:50:07'),(63,120,'2013-10-29 20:50:07'),(64,200,'2013-08-10 19:50:07'),(65,51,'2014-01-06 20:50:07'),(66,181,'2013-08-29 19:50:07'),(67,4,'2014-02-22 20:50:07'),(68,123,'2013-10-26 19:50:07'),(69,108,'2013-11-10 20:50:07'),(70,55,'2014-01-02 20:50:07'),(71,76,'2013-12-12 20:50:07'),(72,6,'2014-02-20 20:50:07'),(73,18,'2014-02-08 20:50:07'),(74,211,'2013-07-30 19:50:07'),(75,53,'2014-01-04 20:50:07'),(76,216,'2013-07-25 19:50:07'),(77,32,'2014-01-25 20:50:07'),(78,74,'2013-12-14 20:50:07'),(79,138,'2013-10-11 19:50:07'),(80,197,'2013-08-13 19:50:07'),(81,221,'2013-07-20 19:50:07'),(82,118,'2013-10-31 20:50:07'),(83,61,'2013-12-27 20:50:07'),(84,28,'2014-01-29 20:50:07'),(85,16,'2014-02-10 20:50:07'),(86,39,'2014-01-18 20:50:07'),(87,3,'2014-02-23 20:50:07'),(88,46,'2014-01-11 20:50:07'),(89,189,'2013-08-21 19:50:07'),(90,59,'2013-12-29 20:50:07'),(91,249,'2013-06-22 19:50:07'),(92,127,'2013-10-22 19:50:07'),(93,47,'2014-01-10 20:50:07'),(94,178,'2013-09-01 19:50:07'),(95,141,'2013-10-08 19:50:07'),(96,188,'2013-08-22 19:50:07'),(97,220,'2013-07-21 19:50:07'),(98,15,'2014-02-11 20:50:07'),(99,175,'2013-09-04 19:50:07'),(100,206,'2013-08-04 19:50:07');
Okay, now to the queries. Three ways came to my mind, I omitted the right join that MDiesel did, because it's actually just another way of writing left join. It was invented for lazy sql developers, that don't want to switch table names, but instead just rewrite one word.
Anyway, first query:
select
c.*
from
customers c
left join orders o on c.customerId = o.customerId
where o.customerId is null;
Results in an execution plan like this:
+----+-------------+-------+-------+---------------+-------------+---------+------------------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+-------------+---------+------------------+------+--------------------------+
| 1 | SIMPLE | c | index | NULL | PRIMARY | 4 | NULL | 250 | Using index |
| 1 | SIMPLE | o | ref | fk_customer | fk_customer | 5 | wtf.c.customerId | 1 | Using where; Using index |
+----+-------------+-------+-------+---------------+-------------+---------+------------------+------+--------------------------+
Second query:
select
c.*
from
customers c
where c.customerId not in (select distinct customerId from orders);
Results in an execution plan like this:
+----+--------------------+--------+----------------+---------------+-------------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+--------+----------------+---------------+-------------+---------+------+------+--------------------------+
| 1 | PRIMARY | c | index | NULL | PRIMARY | 4 | NULL | 250 | Using where; Using index |
| 2 | DEPENDENT SUBQUERY | orders | index_subquery | fk_customer | fk_customer | 5 | func | 2 | Using index |
+----+--------------------+--------+----------------+---------------+-------------+---------+------+------+--------------------------+
Third query:
select
c.*
from
customers c
where not exists (select 1 from orders o where o.customerId = c.customerId);
Results in an execution plan like this:
+----+--------------------+-------+-------+---------------+-------------+---------+------------------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+-------+---------------+-------------+---------+------------------+------+--------------------------+
| 1 | PRIMARY | c | index | NULL | PRIMARY | 4 | NULL | 250 | Using where; Using index |
| 2 | DEPENDENT SUBQUERY | o | ref | fk_customer | fk_customer | 5 | wtf.c.customerId | 1 | Using where; Using index |
+----+--------------------+-------+-------+---------------+-------------+---------+------------------+------+--------------------------+
We can see in all execution plans, that the customers table is read as a whole, but from the index (the implicit one as the only column is primary key). This may change, when you select other columns from the table, that are not in an index.
The first one seems to be the best. For each row in customers only one row in orders is read. The id column suggests, that MySQL can do this in one step, as only indexes are involved.
The second query seems to be the worst (though all 3 queries shouldn't perform too bad). For each row in customers the subquery is executed (the select_type column tells this).
The third query is not much different in that it uses a dependent subquery, but should perform better than the second query. Explaining the small differences would lead to far now. If you're interested, here's the manual page that explains what each column and their values mean here: EXPLAIN output
Finally: I'd say, that the first query will perform best, but as always, in the end one has to measure, to measure and to measure.

I can thing of two other ways to write this query:
SELECT C.*
FROM Customer C
LEFT OUTER JOIN SalesOrder SO ON C.CustomerID = SO.CustomerID
WHERE SO.CustomerID IS NULL
SELECT C.*
FROM Customer C
WHERE NOT C.CustomerID IN(SELECT CustomerID FROM SalesOrder)

The solutions involving outer joins will perform better than a solution using NOT IN.

Related

SQL N:M query merging results by condition flag in intermediate table

[First of all, if this is a duplicate, sorry, I couldn't find a response for this, as this is a strange solution for a limitation on an ORM and I'm clearly a noobie on SQL]
Domain requirements:
A brigades must be composed by one user (the commissar one) and, optionally, one and only one assistant (1:1)
A user can only be part of one brigade (1:1)
CREATE TABLE Users
(
id SERIAL PRIMARY KEY,
username VARCHAR(100) NOT NULL UNIQUE,
password VARCHAR(100) NOT NULL
);
CREATE TABLE Brigades
(
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL
);
-- N:M relationship with a flag inside which determine if that user is a commissar or not
CREATE TABLE Brigade_User
(
brigade_id INT NOT NULL REFERENCES Brigades(id)
ON DELETE CASCADE
ON UPDATE CASCADE,
user_id INT NOT NULL REFERENCES Users(id)
ON DELETE CASCADE
ON UPDATE CASCADE,
is_commissar BOOLEAN NOT NULL
PRIMARY KEY(brigade_id, user_id)
);
Ideally, as relations are 1:1, Brigade_User intermediate table could be erased and a Brigade table with two foreign keys could be created instead (this is not supported by Diesel Rust ORM, so I think I'm coupled to first approach)
CREATE TABLE Brigades
(
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL
-- 1:1
commisar_id INT NOT NULL REFERENCES Users(id)
ON DELETE CASCADE
ON UPDATE CASCADE,
-- 1:1
assistant_id INT NOT NULL REFERENCES Users(id)
ON DELETE CASCADE
ON UPDATE CASCADE
);
An example...
> SELECT * FROM brigade_user LEFT JOIN brigades ON brigade_user.brigade_id = brigades.id;
brigade_id | user_id | is_commissar | id | name
------------+---------+--------------+----+------------------
1 | 1 | t | 1 | Patrulla gatuna
1 | 2 | f | 1 | Patrulla gatuna
2 | 3 | t | 2 | Patrulla perruna
2 | 4 | f | 2 | Patrulla perruna
3 | 6 | t | 3 | Patrulla canina
3 | 5 | f | 3 | Patrulla canina
(4 rows)
Is it possible to make a query which returns a table like this?
brigade_id | commissar_id | assistant_id | name
-----------+--------------+--------------+--------------------
1 | 1 | 2 | Patrulla gatuna
2 | 3 | 4 | Patrulla perruna
3 | 6 | 5 | Patrulla canina
See that each two rows have been merged into one (remember, a brigade is composed by one commissary and, optionally, one assistant) depending on the flag.
Could this model be improved (having in mind the limitation on multiple foreign keys referencing the same table, discussed here)
Try the following:
with cte as
(
SELECT A.brigade_id,A.user_id,A.is_commissar,B.name
FROM brigade_user A LEFT JOIN brigades B ON A.brigade_id = B.id
)
select C1.brigade_id, C1.user_id as commissar_id , C2.user_id as assistant_id, C1.name from
cte C1 left join cte C2
on C1.brigade_id=C2.brigade_id
and C1.user_id<>C2.user_id
where C1.is_commissar=true
See a demo from here.

Result of query as column value

I've got three tables:
Lessons:
CREATE TABLE lessons (
id SERIAL PRIMARY KEY,
title text NOT NULL,
description text NOT NULL,
vocab_count integer NOT NULL
);
+----+------------+------------------+-------------+
| id | title | description | vocab_count |
+----+------------+------------------+-------------+
| 1 | lesson_one | this is a lesson | 3 |
| 2 | lesson_two | another lesson | 2 |
+----+------------+------------------+-------------+
Lesson_vocabulary:
CREATE TABLE lesson_vocabulary (
lesson_id integer REFERENCES lessons(id),
vocabulary_id integer REFERENCES vocabulary(id)
);
+-----------+---------------+
| lesson_id | vocabulary_id |
+-----------+---------------+
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 2 | 2 |
| 2 | 4 |
+-----------+---------------+
Vocabulary:
CREATE TABLE vocabulary (
id integer PRIMARY KEY,
hiragana text NOT NULL,
reading text NOT NULL,
meaning text[] NOT NULL
);
Each lesson contains multiple vocabulary, and each vocabulary can be included in multiple lessons.
How can I get the vocab_count column of the lessons table to be calculated and updated whenevr I add more rows to the lesson_vocabulary table. Is this possible, and how would I go about doing this?
Thanks
You can use SQL triggers to serve your purpose. This would be similar to mysql after insert trigger which updates another table's column.
The trigger would look somewhat like this. I am using Oracle SQL, but there would just be minor tweaks for any other implementation.
CREATE TRIGGER vocab_trigger
AFTER INSERT ON lesson_vocabulary
FOR EACH ROW
begin
for lesson_cur in (select LESSON_ID, COUNT(VOCABULARY_ID) voc_cnt from LESSON_VOCABULARY group by LESSON_ID) LOOP
update LESSONS
set VOCAB_COUNT = LESSON_CUR.VOC_CNT
where id = LESSON_CUR.LESSON_ID;
end loop;
END;
It's better to create a view that calculates that (and get rid of the column in the lessons table):
select l.*, lv.vocab_count
from lessons l
left join (
select lesson_id, count(*)
from lesson_vocabulary
group by lesson_id
) as lv(lesson_id, vocab_count) on l.id = lv.lesson_id
If you really want to update the lessons table each time the lesson_vocabulary changes, you can run an UPDATE statement like this in a trigger:
update lessons l
set vocab_count = t.cnt
from (
select lesson_id, count(*) as cnt
from lesson_vocabulary
group by lesson_id
) t
where t.lesson_id = l.id;
I would recommend using a query for this information:
select l.*,
(select count(*)
from lesson_vocabulary lv
where lv.lesson_id = l.lesson_id
) as vocabulary_cnt
from lessons l;
With an index on lesson_vocabulary(lesson_id), this should be quite fast.
I recommend this over an update, because the data remains correct.
I recommend this over a trigger, because it is simpler.
I recommend this over a subquery with aggregation because it should be faster, particularly if you are filtering on the lessons table.

How to add foreign key constraint to Table A (id, type) referencing either of two tables Table B (id, type) or Table C (id, type)?

I'm looking to use two columns in Table A as foreign keys for either one of two tables: Table B or Table C. Using columns table_a.item_id and table_a.item_type_id, I want to force any new rows to either have a matching item_id and item_type_id in Table B or Table C.
Example:
Table A: Inventory
+---------+--------------+-------+
| item_id | item_type_id | count |
+---------+--------------+-------+
| 2 | 1 | 32 |
| 3 | 1 | 24 |
| 1 | 2 | 10 |
+---------+--------------+-------+
Table B: Recipes
+----+--------------+-------------------+-------------+----------------------+
| id | item_type_id | name | consistency | gram_to_fluid_ounces |
+----+--------------+-------------------+-------------+----------------------+
| 1 | 1 | Delicious Juice | thin | .0048472 |
| 2 | 1 | Ok Tasting Juice | thin | .0057263 |
| 3 | 1 | Protein Smoothie | heavy | .0049847 |
+----+--------------+-------------------+-------------+----------------------+
Table C: Products
+----+--------------+----------+--------+----------+----------+
| id | item_type_id | name | price | in_stock | is_taxed |
+----+--------------+----------+--------+----------+----------+
| 1 | 2 | Purse | $200 | TRUE | TRUE |
| 2 | 2 | Notebook | $14.99 | TRUE | TRUE |
| 3 | 2 | Computer | $1,099 | FALSE | TRUE |
+----+--------------+----------+--------+----------+----------+
Other Table: Item_Types
+----+-----------+
| id | type_name |
+----+-----------+
| 1 | recipes |
| 2 | products |
+----+-----------+
I want to be able to have an inventory table where employees can enter inventory counts regardless of whether an item is a recipe or a product. I don't want to have to have a product_inventory and recipe_inventory table as there are many operations I need to do across all inventory items regardless of item types.
One solution would be to create a reference table like so:
Table CD: Items
+---------+--------------+------------+-----------+
| item_id | item_type_id | product_id | recipe_id |
+---------+--------------+------------+-----------+
| 2 | 1 | NULL | 2 |
| 3 | 1 | NULL | 3 |
| 1 | 2 | 1 | NULL |
+---------+--------------+------------+-----------+
It just seems very cumbersome, plus I'd now need to add/remove products/recipes from this new table whenever they are added/removed from their respective tables. (Is there an automatic way to achieve this?)
CREATE TABLE [dbo].[inventory] (
[id] [bigint] IDENTITY(1,1) NOT NULL,
[item_id] [smallint] NOT NULL,
[item_type_id] [tinyint] NOT NULL,
[count] [float] NOT NULL,
CONSTRAINT [PK_inventory_id] PRIMARY KEY CLUSTERED ([id] ASC)
) ON [PRIMARY]
What I would really like to do is something like this...
ALTER TABLE [inventory]
ADD CONSTRAINT [FK_inventory_sources] FOREIGN KEY ([item_id],[item_type_id])
REFERENCES {[products] ([id],[item_type_id]) OR [recipes] ([id],[item_type_id])}
Maybe there is no solution as I'm describing it, so if you have any ideas where I can maintain the same/similar schema, I'm definitely open to hearing them!
Thanks :)
Since your products and recipes are stored separately, and appear to mostly have separate columns, then separate inventory tables is probably the correct approach. e.g.
CREATE TABLE dbo.ProductInventory
(
Product_id INT NOT NULL,
[count] INT NOT NULL,
CONSTRAINT FK_ProductInventory__Product_id FOREIGN KEY (Product_id)
REFERENCES dbo.Product (Product_id)
);
CREATE TABLE dbo.RecipeInventory
(
Recipe_id INT NOT NULL,
[count] INT NOT NULL,
CONSTRAINT FK_RecipeInventory__Recipe_id FOREIGN KEY (Recipe_id)
REFERENCES dbo.Recipe (Recipe_id )
);
If you need all types combined, you can simply use a view:
CREATE VIEW dbo.Inventory
AS
SELECT Product_id AS item_id,
2 AS item_type_id,
[Count]
FROM ProductInventory
UNION ALL
SELECT recipe_id AS item_id,
1 AS item_type_id
[Count]
FROM RecipeInventory;
GO
IF you create a new item_type, then you need to amend the DB design anyway to create a new table, so you would just need to amend the view at the same time
Another possibility, would be to have a single Items table, and then have Products/Recipes reference this. So you start with your items table, each of which has a unique ID:
CREATE TABLE dbo.Items
(
item_id INT IDENTITY(1, 1) NOT NULL
Item_type_id INT NOT NULL,
CONSTRAINT PK_Items__ItemID PRIMARY KEY (item_id),
CONSTRAINT FK_Items__Item_Type_ID FOREIGN KEY (Item_Type_ID) REFERENCES Item_Type (Item_Type_ID),
CONSTRAINT UQ_Items__ItemID_ItemTypeID UNIQUE (Item_ID, Item_type_id)
);
Note the unique key added on (item_id, item_type_id), this is important for referential integrity later on.
Then each of your sub tables has a 1:1 relationship with this, so your product table would become:
CREATE TABLE dbo.Products
(
item_id BIGINT NOT NULL,
Item_type_id AS 2,
name VARCHAR(50) NOT NULL,
Price DECIMAL(10, 4) NOT NULL,
InStock BIT NOT NULL,
CONSTRAINT PK_Products__ItemID PRIMARY KEY (item_id),
CONSTRAINT FK_Products__Item_Type_ID FOREIGN KEY (Item_Type_ID)
REFERENCES Item_Type (Item_Type_ID),
CONSTRAINT FK_Products__ItemID_ItemTypeID FOREIGN KEY (item_id, Item_Type_ID)
REFERENCES dbo.Item (item_id, item_type_id)
);
A few things to note:
item_id is again the primary key, ensuring the 1:1 relationship.
the computed column item_type_id (as 2) ensuring all item_type_id's are set to 2. This is key as it allows a foreign key constraint to be added
the foreign key on (item_id, item_type_id) back to the items table. This ensures that you can only insert a record to the product table, if the original record in the items table has an item_type_id of 2.
A third option would be a single table for recipes and products and make any columns not required for both nullable. This answer on types of inheritance is well worth a read.
I think there is a flaw in your database design. The best way to solve your actual problem, is to have Recipies and products as one single table. Right now you have a redundant column in each table called item_type_id. That column is not worth anything, unless you actually have the items in the same table. I say redundant, because it has the same value for absolutely every entry in each table.
You have two options. If you can not change the database design, work without foreign keys, and make the logic layer select from the correct tables.
Or, if you can change the database design, make products and recipies exist in the same table. You already have a item_type table, which can identify item categorization, so it makes sense to put all items in the same table
you can only add one constraint for a column or pair of columns. Think about apples and oranges. A column cannot refer to both oranges and apples. It must be either orange or apple.
As a side note, this can be somehow achieved with PERSISTED COMPUTED columns, however It only introduces overhead and complexity.
Check This for Reference
You can add some computed columns to the Inventory table:
ALTER TABLE Inventory
ADD _recipe_item_id AS CASE WHEN item_type_id = 1 THEN item_id END persisted
ALTER TABLE Inventory
ADD _product_item_id AS CASE WHEN item_type_id = 2 THEN item_id END persisted
You can then add two separate foreign keys to the two tables, using those two columns instead of item_id. I'm assuming the item_type_id column in those two tables is already computed/constraint appropriately but if not you may want to consider that too.
Because these computed columns are NULL when the wrong type is selected, and because SQL Server doesn't check FK constraints if at least one column value is NULL, they can both exist and only one or the other will be satisfied at any time.

Union all data unmanaged

Forgive me if this question already ask,not much of db guy here ,
here is what i tried,
select row_number() over (partition by name order by challanto_date) , *
from (
select
rma,
p.id,
p.name,
challanto_date,
CURRENT_TIMESTAMP as fromDate
from challan_to_vendor cv
left join challan_to_vendor_detail cvd on cv.id = cvd.challan_to_vendor_id
inner join main_product p on p.id = cvd.product_id
union all
select
rma,
p.id,
p.name,
CURRENT_TIMESTAMP as toDate,
challan_date
from challan_from_vendor cv
left join challan_from_vendor_detail cvd on cv.id = cvd.challan_from_vendor_id
inner join main_product p on p.id = cvd.product_id
) as a
Here is my create table script :
challan_from_vendor
CREATE TABLE public.challan_from_vendor
(
id character varying NOT NULL,
date_ad date,
rma integer DEFAULT 1,
CONSTRAINT psk PRIMARY KEY (id)
)
challan_from_vendor_detail
CREATE TABLE public.challan_from_vendor_detail
(
id character varying NOT NULL,
challan_from_id character varying,
product_id character varying,
CONSTRAINT psks PRIMARY KEY (id),
CONSTRAINT fsks FOREIGN KEY (challan_from_id)
REFERENCES public.challan_from_vendor (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
)
challan_to_vendor;
CREATE TABLE public.challan_to_vendor
(
id character varying NOT NULL,
date_ad date,
rma integer DEFAULT 1,
CONSTRAINT pk PRIMARY KEY (id)
)
challan_to_vendor_detail
CREATE TABLE public.challan_to_vendor_detail
(
id character varying NOT NULL,
challan_to_id character varying,
product_id character varying,
CONSTRAINT pks PRIMARY KEY (id),
CONSTRAINT fks FOREIGN KEY (challan_to_id)
REFERENCES public.challan_to_vendor (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
)
product
CREATE TABLE public.product
(
id character varying NOT NULL,
product_name character varying,
CONSTRAINT pks PRIMARY KEY (id)
)
Here is my table structures and desire output.
challan_from_vendor
| id | rma | date |
|:-----------|------------:|:------------:|
| 12012 | 0001 | 2018-11-10
| 123121 | 0001 | 2018-11-11
challan_to_vendor
| id | rma | date |
|:-----------|------------:|:------------:|
| 12 | 0001 | 2018-12-10
| 123 | 0001 | 2018-12-11
challan_from_vendor_detail
| id | challan_from_vendor_id | product_id |
|:-----------|------------:|:------------:|
| 121 | 12012 | 121313
| 1213 | 12012 | 131381
challan_to_vendor_detail
challan_from_vendor_detail
| id | challan_to_vendor_id | product_id |
|:-----------|------------------------|:------------:|
| 121 | 12 | 121313
| 1213 | 123 | 131381
product
| id | product_name |
|:-----------|------------:|
| 191313 | apple |
| 89113 | banana |
Output
| ram | product_id | challan_from_date | challan_to_date|
|:-----------|------------:|:-----------------:|:--------------:|
| 0001 | 191313| 2018-11-10 |2018-11-11 |
| 0001 | 89113 | 2018-12-10 |2018-12-11 |
There is some strange things in the query you have tried so it is not clear what tables, how they are related or what the columns are in those tables.
So by some guessing I give you this to start of with:
select
main_product.*,
challan_to_vendor.toDate,
challan_from_vendor.fromDate
from main_product
join challan_to_vendor using(product_id)
join challan_from_vendor using(product_id)
If you explain more about your db an what you want out of it I might be able to help you more.
Edit: So I could not run your create statements in my db since there was naming conflicts among other minor things. Here is some advice on the create process that I find useful:
Let the id's be integer instead of character varying otherwise it is probably a name-column and should not be named id. You also used integer-id's in your examples.
Use SERIAL PRIMARY KEY (see tutorial) to help you with the key creation. This also removes the naming-conflict since the constraints are given implicit unique names.
Use the same column-name for the same thing in all places to avoid confusion by having multiple things called id after a join plus that it simplify's the join. So for example the id of the product should be product_id in all places, that way you could use using(product_id) as your join condition.
So with the advises given above here's how I would create one of your table and then query them:
CREATE TABLE public.challan_to_vendor_detail
(
challan_to_vendor_detail_id SERIAL PRIMARY KEY,
challan_to_vendor_id integer,
product_id integer,
CONSTRAINT fks FOREIGN KEY (challan_to_vendor_id)
REFERENCES public.challan_to_vendor (challan_to_vendor_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
);
select
product_name,
challan_to_vendor.date_ad as date_to,
challan_from_vendor.date_ad as date_from
from product
join challan_to_vendor_detail using(product_id)
join challan_to_vendor using(challan_to_vendor_id)
join challan_from_vendor_detail using(product_id)
join challan_from_vendor using(challan_from_vendor_id)
Unfortunately the overall db-design does not make sense to me so I do not know if this is what you expect.
Good luck!

Removing duplicate SQL records to permit a unique key

I have a table ('sales') in a MYSQL DB which should rightfully have had a unique constraint enforced to prevent duplicates. To first remove the dupes and set the constraint is proving a bit tricky.
Table structure (simplified):
'id (unique, autoinc)'
product_id
The goal is to enforce uniqueness for product_id. The de-duping policy I want to apply is to remove all duplicate records except the most recently created, eg: the highest id.
Or to put another way, I would like to delete only duplicate records, excluding the ids matched by the following query whilst also preserving the existing non-duped records:
select id
from sales s
inner join (select product_id,
max(id) as maxId
from sales
group by product_id
having count(product_id) > 1) groupedByProdId on s.product_id
and s.id = groupedByProdId.maxId
I've struggled with this on two fronts - writing the query to select the correct records to delete and then also the constraint in MYSQL where a subselect FROM clause of a DELETE cannot reference the same table from which data is being removed.
I checked out this answer and it seemed to deal with the subject, but seem specific to sql-server, though I wouldn't rule this question out from duplicating another.
In reply to your comment, here's a query that works in MySQL:
delete YourTable
from YourTable
inner join YourTable yt2
on YourTable.product_id = yt2.product_id
and YourTable.id < yt2.id
This would only remove duplicate rows. The inner join will filter out the latest row for each product, even if no other rows for the same product exist.
P.S. If you try to alias the table after FROM, MySQL requires you to specify the name of the database, like:
delete <DatabaseName>.yt
from YourTable yt
inner join YourTable yt2
on yt.product_id = yt2.product_id
and yt.id < yt2.id;
Perhaps use ALTER IGNORE TABLE ... ADD UNIQUE KEY.
For example:
describe sales;
+------------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+---------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| product_id | int(11) | NO | | NULL | |
+------------+---------+------+-----+---------+----------------+
select * from sales;
+----+------------+
| id | product_id |
+----+------------+
| 1 | 1 |
| 2 | 1 |
| 3 | 2 |
| 4 | 3 |
| 5 | 3 |
| 6 | 2 |
+----+------------+
ALTER IGNORE TABLE sales ADD UNIQUE KEY idx1(product_id), ORDER BY id DESC;
Query OK, 6 rows affected (0.03 sec)
Records: 6 Duplicates: 3 Warnings: 0
select * from sales;
+----+------------+
| id | product_id |
+----+------------+
| 6 | 2 |
| 5 | 3 |
| 2 | 1 |
+----+------------+
See this pythian post for more information.
Note that the ids end up in reverse order. I don't think this matters, since order of the ids should not matter in a database (as far as I know!). If this displeases you however, the post linked to above shows a way to solve this problem too. However, it involves creating a temporary table which requires more hard drive space than the in-place method I posted above.
I might do the following in sql-server to eliminate the duplicates:
DELETE FROM Sales
FROM Sales
INNER JOIN Sales b ON Sales.product_id = b.product_id AND Sales.id < b.id
It looks like the analogous delete statement for mysql might be:
DELETE FROM Sales
USING Sales
INNER JOIN Sales b ON Sales.product_id = b.product_id AND Sales.id < b.id
This type of problem is easier to solve with CTEs and Ranking functions, however, you should be able to do something like the following to solve your problem:
Delete Sales
Where Exists(
Select 1
From Sales As S2
Where S2.product_id = Sales.product_id
And S2.id > Sales.Id
Having Count(*) > 0
)