First: I know how to use all types of join, but I don't know why it works like this for this query.
I have a scenario where I build a SQL query over 3 tables, with left outer joins to the selling (Invoice_Item) and order (Item_Order) tables.
My Tables:
--------------------
Item
--------------------
ID | Code
--------------------
1 | 7502
SQL > select * from Item where id = 1
---------------------
Item_Order
---------------------------
Item | Box | Quantity
---------------------------
1 | 30 | 15000
1 | 12 | 6000
SQL > select * from Item_Order where Item = 1
--------------------------
Invoice_Item
-------------------
Item | Num | Quantity
-------------------------
1 | 1.64 | 10
1 | 2.4 | 8
SQL > select * from Invoice_Item where Item = 1
I want this output:
Item | OrderQ | OrderB | SellN | SellQ
-----------------------------------------
1 | 15000 | 30 | 1.64 | 10
1 | 6000 | 12 | 2.4 | 8
My SQL code:
SELECT Item.ID, Item_Order.Box As OrderB, Item_Order.Quantity As OrderQ, Invoice_Item.Num As SellN, Invoice_Item.Quantity As SellQ
FROM Item LEFT OUTER JOIN
Invoice_Item ON Item.ID = Invoice_Item.Item LEFT OUTER JOIN
Item_Order ON Item_Order.Item = Item.ID
where Item.ID = 1
Why is my output doubled, i.e. why does the query return 4 records instead of 2?
Your result can be achieved with row_number:
select a.ID
, a.OrderB
, a.OrderQ
, b.Quantity SellQ
, b.Num SellN
from
(SELECT Item.ID
, Item_Order.Box As OrderB
, Item_Order.Quantity As OrderQ
, row_number () over (order by Item.ID) rn
FROM Item
left outer JOIN Item_Order ON Item.ID = Item_Order.Item) a
left outer join (select Item
, Num
, Quantity
, row_number () over (order by Item) rn
from Invoice_Item ) b
on a.ID = b.Item
and a.rn = b.rn
Here is a demo
You can add more tables like this:
left outer join (select Item
, Num
, Quantity
, row_number () over (order by Item) rn
from Invoice_Item ) b
Because when you first join Item with Invoice_Item, the result has two records, since there are two matching records in Invoice_Item. That intermediate result is then left joined with Item_Order, and each of those two records is joined with every matching record of Item_Order.
You can better understand it like this:
SELECT Item.ID, Invoice_Item.Num As SellN, Invoice_Item.Quantity As SellQ
INTO table4 -- only to explain
FROM Item LEFT OUTER JOIN
Invoice_Item ON Item.ID = Invoice_Item.Item
where Item.ID = 1
Now the result of the first query, table4, will be joined with Item_Order.
You are joining on one key -- two rows with the same key in one table times two rows in the second table = 4 rows.
You need a separate key. You can generate one using row_number():
SELECT i.ID, io.Box As OrderB, io.Quantity As OrderQ,
ii.Num As SellN, ii.Quantity As SellQ
FROM Item i LEFT OUTER JOIN
((SELECT ii.*,
ROW_NUMBER() OVER (PARTITION BY ii.Item ORDER BY ii.Item) as seqnum
FROM Invoice_Item ii
) ii FULL JOIN
(SELECT io.*,
ROW_NUMBER() OVER (PARTITION BY io.Item ORDER BY io.Item) as seqnum
FROM Item_Order io
) io
ON io.Item = ii.Item AND io.seqnum = ii.seqnum
)
ON i.ID = COALESCE(ii.Item, io.Item) -- either side of the FULL JOIN may be NULL
where i.ID = 1;
Note that this is one of the few cases where I use parentheses in the FROM clause. This code can handle additional rows in either of the tables -- if one table is longer than the other, the columns from the other will be NULL.
If you know the two tables have the same number of rows (for a given item) you can just use inner joins and no parentheses.
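A minimal sketch of the inner-join variant, run against SQLite from Python so it can be checked without a database server. The ORDER BY Quantity DESC inside ROW_NUMBER() is an assumption made here so that the pairing is deterministic; the original orders by the partition key, which leaves the pairing up to the engine. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Item (ID INTEGER, Code INTEGER);
CREATE TABLE Item_Order (Item INTEGER, Box INTEGER, Quantity INTEGER);
CREATE TABLE Invoice_Item (Item INTEGER, Num REAL, Quantity INTEGER);
INSERT INTO Item VALUES (1, 7502);
INSERT INTO Item_Order VALUES (1, 30, 15000), (1, 12, 6000);
INSERT INTO Invoice_Item VALUES (1, 1.64, 10), (1, 2.4, 8);
""")

# Number the rows of each child table per item, then join on (Item, seqnum)
# so that the n-th order row is paired with the n-th invoice row only.
paired = conn.execute("""
SELECT i.ID, io.Quantity AS OrderQ, io.Box AS OrderB,
       ii.Num AS SellN, ii.Quantity AS SellQ
FROM Item i
JOIN (SELECT *, ROW_NUMBER() OVER (PARTITION BY Item ORDER BY Quantity DESC) AS seqnum
      FROM Item_Order) io
  ON io.Item = i.ID
JOIN (SELECT *, ROW_NUMBER() OVER (PARTITION BY Item ORDER BY Quantity DESC) AS seqnum
      FROM Invoice_Item) ii
  ON ii.Item = i.ID AND ii.seqnum = io.seqnum
WHERE i.ID = 1
ORDER BY io.seqnum
""").fetchall()

for row in paired:
    print(row)
```

With the sample data this returns two rows instead of four, because the extra seqnum condition gives each order row exactly one invoice row to match.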
It is duplicating because you have no secondary association between Invoice_Item and Item_Order. Each record in Invoice_Item is matched to every Item_Order record (known as a Cartesian result) based ONLY on the Item ID. Your data APPEARS to be a 1:1 reference, such that the first Invoice_Item Quantity of 10 is MEANT to be associated with Item_Order Box = 30, and Quantity 8 is MEANT to be associated with Item_Order Box = 12.
Item_Order
Item Box Quantity
1 30 15000
1 12 6000
Invoice_Item
Item Num Quantity
1 1.64 10
1 2.4 8
You probably need to tack on the "Box" reference so Item_Order and Invoice_Item are a 1:1 match.
What is happening is that each row in Invoice_Item is joined to every Item_Order row with the same Item ID, so with two rows in each table you get 2 x 2 = 4 rows. If you had 3 Invoice_Item rows for item 1 and 6 Item_Order rows, you would be getting 18 rows.
FEEDBACK
Even though you have an accepted answer based on OVER/PARTITION/ROW_NUMBER, that approach forces a surrogate secondary ID onto each row. Relying on it is not a sound basis for associating the data. What happens if you delete the second item on an order? Are you positive you are also deleting the second item in Invoice_Item?
As for returning 2 records in the original scenario, you can get that via the surrogate process, but I think it would be better for you long term to understand what is happening in the join. Going back to your sample data of Item_Order and Invoice_Item, let's start with the Item_Order table. Conceptually, the SQL engine processes each row individually.
First row: SQL grabs Item = 1, Box = 30, Qty = 15000.
It now joins to the Invoice_Item table, and your criteria join only on Item. So it sees the first row and says... yup, this is item 1, so include that with the Item_Order record (first row returned). Now it goes to the second line in the Invoice_Item table... yup, it too is item 1, so it returns it again (second row returned).
Now SQL grabs the second row: Item = 1, Box = 12, Qty = 6000.
It goes back to the Invoice_Item table and does the exact same test for each row that has Item = 1, returning the 3rd and 4th rows, hence your doubling. If either table had more records with the same Item ID, the query would return that many more records: 3 and 3 records would return 9 rows, 4 and 4 records would return 16 rows, etc. The surrogate approach will work, but I don't think it is as safe as a better/updated design structure.
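To check the multiplication described above, here is a small sketch using SQLite from Python. The helper joined_row_count is hypothetical (it is not part of the question); it fills both child tables with a given number of rows for item 1 and counts what the original join shape returns.

```python
import sqlite3

def joined_row_count(n_orders, n_invoices):
    """Count the rows returned when both child tables join to item 1 on Item only."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE Item (ID INTEGER);
    CREATE TABLE Item_Order (Item INTEGER, Box INTEGER);
    CREATE TABLE Invoice_Item (Item INTEGER, Num REAL);
    INSERT INTO Item VALUES (1);
    """)
    # Dummy rows: only the number of rows per item matters here.
    conn.executemany("INSERT INTO Item_Order VALUES (1, ?)",
                     [(b,) for b in range(n_orders)])
    conn.executemany("INSERT INTO Invoice_Item VALUES (1, ?)",
                     [(float(n),) for n in range(n_invoices)])
    (count,) = conn.execute("""
        SELECT COUNT(*)
        FROM Item
        LEFT JOIN Invoice_Item ON Item.ID = Invoice_Item.Item
        LEFT JOIN Item_Order ON Item_Order.Item = Item.ID
        WHERE Item.ID = 1
    """).fetchone()
    conn.close()
    return count

# (order rows, invoice rows) -> rows returned by the join
counts = {(o, v): joined_row_count(o, v) for o, v in [(2, 2), (3, 3), (6, 3)]}
print(counts)
```

The counts are always the product of the two row counts, which is why a second matching column (or a generated sequence number) is needed to get one output row per pair.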
Related
Thanks in advance for any help on this. I am a bit of a newbie to MS SQL and I want to do something that I think is achievable, but I don't have the know-how.
I have a simple table called "suppliers" where I can do (SELECT id, name FROM suppliers ORDER BY id ASC)
id | name
---------------------------------
1 | ACME
2 | First Stop Business Supplies
3 | All in One Supply Warehouse
4 | Farm First Supplies
I have another table called "products"
id | name | supplier_id
---------------------------------
1 | Item 1 | 2
2 | Item 2 | 1
3 | Item 3 | 1
4 | Item 4 | 3
5 | Item 5 | 2
I want to list all the suppliers and, on the same row, get the total number of products for each supplier. I am just not sure how to pass suppliers.id through the query to get the count.
I am hoping to get to this:
id | name | total_products
-------------------------------------------------
1 | ACME | 2
2 | First Stop Business Supplies | 2
3 | All in One Supply Warehouse | 1
4 | Farm First Supplies | 0
I really appreciate any help on this.
Three concepts to grasp here: LEFT JOIN, GROUP BY, and COUNT().
select s.id, s.name, Count(p.id) as total_products
from suppliers s
left join products p on s.id=p.supplier_id --the left join keeps suppliers with no matches
group by s.id, s.name
A left join keeps every row from the first table even if there are no matches in the second; the second table's columns are then NULL.
GROUP BY collapses the rows that share the same values in the listed columns into one row per group.
COUNT(p.id) counts the rows in each group where p.id is not NULL, so a supplier with no products gets 0. COUNT(*) would count the NULL-extended row too and report 1.
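A small sketch of how COUNT behaves over a LEFT JOIN, using the sample data from the question in SQLite (run from Python). The column alias count_star is only there for comparison and is not part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE suppliers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, supplier_id INTEGER);
INSERT INTO suppliers VALUES
  (1, 'ACME'), (2, 'First Stop Business Supplies'),
  (3, 'All in One Supply Warehouse'), (4, 'Farm First Supplies');
INSERT INTO products VALUES
  (1, 'Item 1', 2), (2, 'Item 2', 1), (3, 'Item 3', 1),
  (4, 'Item 4', 3), (5, 'Item 5', 2);
""")

# COUNT(*) counts every joined row, including the NULL-extended row that the
# LEFT JOIN produces for a supplier without products.
# COUNT(p.id) skips rows where p.id is NULL.
rows = conn.execute("""
SELECT s.id, s.name,
       COUNT(*)    AS count_star,
       COUNT(p.id) AS total_products
FROM suppliers s
LEFT JOIN products p ON s.id = p.supplier_id
GROUP BY s.id, s.name
ORDER BY s.id
""").fetchall()

for row in rows:
    print(row)
```

For 'Farm First Supplies' the two counts differ: count_star is 1 while total_products is 0, which is the value the question asks for.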
Try this :-
SELECT id, name, C.total_products
FROM Suppliers S
OUTER APPLY (
SELECT Count(id) AS total_products
FROM Products P
WHERE P.supplier_id = S.id
) C
Today I've faced some unexplainable (for me) behavior in PostgreSQL — LEFT OUTER JOIN does not return records for main table (with nulls for joined one fields) in case the joined table fields are used in WHERE expression.
To make it easier to grasp the case details, I'll provide an example. So, let's say we have 2 tables: item with some goods, and price, referring item, with prices for the goods in different years:
CREATE TABLE item(
id INTEGER PRIMARY KEY,
name VARCHAR(50)
);
CREATE TABLE price(
id INTEGER PRIMARY KEY,
item_id INTEGER NOT NULL,
year INTEGER NOT NULL,
value INTEGER NOT NULL,
CONSTRAINT goods_fk FOREIGN KEY (item_id) REFERENCES item(id)
);
The table item has 2 records (TV set and VCR items), and the table price has 3 records, a price for TV set in years 2000 and 2010, and a price for VCR for year 2000 only:
INSERT INTO item(id, name)
VALUES
(1, 'TV set'),
(2, 'VCR');
INSERT INTO price(id, item_id, year, value)
VALUES
(1, 1, 2000, 290),
(2, 1, 2010, 270),
(3, 2, 2000, 770);
-- no price of VCR for 2010
Now let's make a LEFT OUTER JOIN query, to get prices for all items for year 2010:
SELECT
i.*,
p.year,
p.value
FROM item i
LEFT OUTER JOIN price p ON i.id = p.item_id
WHERE p.year = 2010 OR p.year IS NULL;
For some reason, this query returns a result only for TV set, which has a price for this year. The record for VCR is absent from the results:
id | name | year | value
----+--------+------+-------
1 | TV set | 2010 | 270
(1 row)
After some experimenting, I've found a way to make the query return the results I need (all records from the item table, with nulls in the fields of the joined table when there are no matching records for the year). It was achieved by moving the year filtering into a subquery that is then joined:
SELECT
i.*,
p.year,
p.value
FROM item i
LEFT OUTER JOIN (
SELECT * FROM price
WHERE year = 2010 -- <= here I filter a year
) p ON i.id = p.item_id;
And now the result is:
id | name | year | value
----+--------+------+-------
1 | TV set | 2010 | 270
2 | VCR | |
(2 rows)
My main question is — why the first query (with year filtering in WHERE) does not work as expected, and turns instead into something like INNER JOIN?
I'm severely blocked by this issue on my current project, so I'll be thankful about tips/hints on the next related questions too:
Are there any other options to achieve the proper results?
... especially — easily translatable to Django's ORM queryset?
Upd: @astentx suggested moving the filtering condition directly into the JOIN (and it works too):
SELECT
i.*,
p.year,
p.value
FROM item i
LEFT OUTER JOIN price p
ON
i.id = p.item_id
AND p.year = 2010;
Though, the same as my first solution, I don't see how to express it in terms of Django ORM querysets. Are there any other suggestions?
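The first query does not work as expected because the expectation is wrong. It does not behave like an INNER JOIN either. The WHERE clause is applied after the join: VCR has a price for 2000, so it is joined to that row, p.year is 2000 rather than NULL, and the row is filtered out. The query returns a record for VCR only if there is no price for VCR at all.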
SELECT
i.*,
y.year,
p.value
FROM item i
CROSS JOIN (SELECT 2010 AS year) y -- here could be a table
LEFT OUTER JOIN price p
ON (p.item_id = i.id
AND p.year = y.year);
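To see the difference between filtering in WHERE and filtering in ON, here is a sketch that runs both forms against the sample data. It uses SQLite from Python instead of PostgreSQL (an assumption made for a self-contained example); the LEFT JOIN semantics shown are the same in both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE price (id INTEGER PRIMARY KEY, item_id INTEGER NOT NULL,
                    year INTEGER NOT NULL, value INTEGER NOT NULL);
INSERT INTO item VALUES (1, 'TV set'), (2, 'VCR');
INSERT INTO price VALUES (1, 1, 2000, 290), (2, 1, 2010, 270), (3, 2, 2000, 770);
""")

# Step 1: the join alone. VCR is matched to its year-2000 price,
# so no NULL-extended row is produced for it.
joined = conn.execute("""
SELECT i.name, p.year, p.value
FROM item i LEFT JOIN price p ON i.id = p.item_id
ORDER BY i.id, p.year
""").fetchall()

# Step 2a: WHERE filters the joined rows, dropping the VCR/2000 row.
where_filtered = conn.execute("""
SELECT i.name, p.year, p.value
FROM item i LEFT JOIN price p ON i.id = p.item_id
WHERE p.year = 2010 OR p.year IS NULL
ORDER BY i.id
""").fetchall()

# Step 2b: the same condition in ON decides which price rows match;
# an item with no matching price row still gets a NULL-extended row.
on_filtered = conn.execute("""
SELECT i.name, p.year, p.value
FROM item i LEFT JOIN price p ON i.id = p.item_id AND p.year = 2010
ORDER BY i.id
""").fetchall()

print(joined)
print(where_filtered)
print(on_filtered)
```

The ON version keeps VCR with NULL year and value; the WHERE version only sees rows that already survived the join.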
We have a situation in which one part of our stored procedure needs to be a join query with multiple filters in it. We need a solution that uses only a join (it is easy to implement with a subquery, but our situation demands a join [since the procedure has a where clause following it]).
We have two tables, Customer and Order. We need to exclude a row of the Customer table if its Customer_id is present in the Order table with order_code = 10 and Customer.Grade = 3. Not every Customer_id has to be present in the Order table, but customers without orders are still needed in the final result.
Customer Table
Customer_id | Grade
---------------------
1 | 3
2 | 3
3 | 2
4 | 3
OrderTable
Customer_id | order_code
--------------------------
1 | 10
1 | 40
2 | 50
3 | 30
* A Customer_id can appear multiple times in the OrderTable
Expected result :
Customer_id Grade
2 3
3 2
4 3
I think this may be what you need, not sure I understand the question properly.
select c.id, c.grade
from customer c left join customer_order o on (c.id = o.customer_id and o.order_code <> 10)
where c.grade = 3
This gives all customers with a Grade of 3, together with their orders whose order_code is not 10. Because it is a left join, customers that have no such orders are shown as well; with an inner join they would be dropped.
You can express the logic like this:
select c.*
from customers c
where not (grade = 3 and
exists (select 1
from orders o
where o.customer_id = c.customer_id and
o.order_code = 10
)
);
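Since the question asks for a join-only form, here is a hedged sketch of the same exclusion written as a LEFT JOIN anti-join, checked against the sample data in SQLite from Python. The table and column names follow the question's tables (Customer, OrderTable); whether this shape fits the surrounding stored procedure is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (Customer_id INTEGER, Grade INTEGER);
CREATE TABLE OrderTable (Customer_id INTEGER, order_code INTEGER);
INSERT INTO Customer VALUES (1, 3), (2, 3), (3, 2), (4, 3);
INSERT INTO OrderTable VALUES (1, 10), (1, 40), (2, 50), (3, 30);
""")

# The ON clause describes exactly the rows that disqualify a customer:
# same customer, order_code = 10, and the customer has Grade 3.
# Customers with no such match get a NULL-extended row, and the
# WHERE clause keeps only those.
rows = conn.execute("""
SELECT c.Customer_id, c.Grade
FROM Customer c
LEFT JOIN OrderTable o
       ON o.Customer_id = c.Customer_id
      AND o.order_code = 10
      AND c.Grade = 3
WHERE o.Customer_id IS NULL
ORDER BY c.Customer_id
""").fetchall()

print(rows)
```

Customer 1 is the only one with a disqualifying order, so it is the only row removed; customer 4, who has no orders at all, is kept.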
I have three tables (at least, something similar) with the following relationships:
Item table:
ID | Val
---------+---------
1 | 12
2 | 5
3 | 22
Group table:
ID | Parent | Range
---------+---------+---------
1 | NULL | [10-30]
2 | 1 | [20-25]
3 | NULL | [0-15]
GroupToItem table:
GroupID | ItemID
---------+---------
1 | 1
1 | 3
And now I want to add rows to the GroupToItem table for Groups 2 and 3, using the same query (since some other conditions not shown here are more complicated). I want to restrict the items through which I search if the new group has a parent, but to look through all items if there is not.
At the moment I am using an IF/ELSE on two statements that are almost exactly the same, but for the addition of another JOIN row when a parent exists. Is it possible to do a join to reduce the number of items to look at, only if a restriction is possible?
My two queries as they stand are given below:
DECLARE @GroupID INT = 2;...
INSERT INTO GroupToItem(GroupID, ItemID)
SELECT g.ID,
i.ID
FROM Group g
JOIN Item i ON i.Val IN g.Range
JOIN GroupToItem gti ON g.Parent = gti.GroupID AND i.ID = gti.ItemID
WHERE g.ID = @GroupID
-
DECLARE @GroupID INT = 3;...
INSERT INTO GroupToItem(GroupID, ItemID)
SELECT g.ID,
i.ID
FROM Group g
JOIN Item i ON i.Val IN g.Range
WHERE g.ID = @GroupID
So essentially I only want to do the second JOIN if the given group has a parent. Is this possible in a single query? It is important that the number of items that are compared against the range is as small as possible, since for me this is an intensive operation.
EDIT: This seems to have solved it in this test setup, similar to what was suggested by Denis Valeev. I'll accept if I can get it to work with my live data. I've been having some weird issues - potentially more questions coming up.
SELECT g.Id,
i.Id
FROM Group g
JOIN Item i ON (i.Val > g.Start AND i.Val < g.End)
WHERE g.Id = 2
AND (
(g.ParentId IS NULL)
OR
(EXISTS(SELECT 1 FROM GroupToItem gti WHERE g.ParentId = gti.GroupId AND i.Id = gti.ItemId))
)
SQL Fiddle
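A sketch of the "parent is NULL, or the item is in the parent group" condition, run in SQLite from Python. Several things here are assumptions for illustration: the range is split into RangeStart/RangeEnd columns and treated as inclusive, the Group table name is quoted because it is a reserved word, and item 4 (Val 23) is an extra hypothetical row added to show the parent restriction doing something.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Item (Id INTEGER PRIMARY KEY, Val INTEGER);
CREATE TABLE "Group" (Id INTEGER PRIMARY KEY, ParentId INTEGER,
                      RangeStart INTEGER, RangeEnd INTEGER);
CREATE TABLE GroupToItem (GroupId INTEGER, ItemId INTEGER);
INSERT INTO Item VALUES (1, 12), (2, 5), (3, 22), (4, 23);  -- item 4 is hypothetical
INSERT INTO "Group" VALUES (1, NULL, 10, 30), (2, 1, 20, 25), (3, NULL, 0, 15);
INSERT INTO GroupToItem VALUES (1, 1), (1, 3);
""")

QUERY = """
SELECT g.Id, i.Id
FROM "Group" g
JOIN Item i ON i.Val BETWEEN g.RangeStart AND g.RangeEnd
WHERE g.Id = ?
  AND (g.ParentId IS NULL
       OR EXISTS (SELECT 1 FROM GroupToItem gti
                  WHERE gti.GroupId = g.ParentId AND gti.ItemId = i.Id))
ORDER BY i.Id
"""

# Group 2 has a parent, so only items already linked to group 1 qualify.
with_parent = conn.execute(QUERY, (2,)).fetchall()
# Group 3 has no parent, so every item in its range qualifies.
without_parent = conn.execute(QUERY, (3,)).fetchall()

print(with_parent)
print(without_parent)
```

Item 4 falls inside group 2's range but is not linked to the parent group, so it is left out; for group 3 the EXISTS branch is never needed. Whether the engine actually checks fewer items this way depends on the query plan, which this sketch does not measure.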
Try this:
INSERT INTO GroupToItem(GroupID, ItemID)
SELECT g.ID,
i.ID
FROM Group g
JOIN Item i ON i.Val IN g.Range
WHERE g.ID = @GroupID
and (g.ID in (3) or exists (select top 1 1 from GroupToItem gti where g.Parent = gti.GroupID AND i.ID = gti.ItemID))
If a Range column is a varchar datatype, you can try something like this:
INSERT INTO GROUPTOITEM (GROUPID, ITEMID)
SELECT A.ID, B.ID
FROM GROUP AS A
LEFT JOIN ITEM AS B
ON B.VAL BETWEEN CAST(SUBSTRING(SUBSTRING(A.RANGE,1,CHARINDEX('-',A.RANGE,1)-1),2,10) AS INT)
AND CAST(REPLACE(SUBSTRING(A.RANGE,CHARINDEX('-',A.RANGE,1)+1,10),']','') AS INT)
I am using postgresql.
I have a table called custom_field_answers. The data looks like this
Id | product_id | value | number_value |
4 | 2 | | 117 |
3 | 1 | | 107 |
2 | 1 | bangle | |
1 | 2 | necklace | |
I want to find all the products which have value 'bangle' and number_value less than 50.
Here was my first attempt.
SELECT "products".* FROM "products" INNER JOIN "custom_field_answers"
ON "custom_field_answers"."product_id" = "products"."id"
WHERE ("custom_field_answers"."value" ILIKE 'bangle')
Here is my second attempt.
SELECT "products".* FROM "products" INNER JOIN "custom_field_answers"
ON "custom_field_answers"."product_id" = "products"."id"
where ("custom_field_answers"."number_value" < 50)
Here is my final attempt.
SELECT "products".* FROM "products" INNER JOIN "custom_field_answers"
ON "custom_field_answers"."product_id" = "products"."id"
WHERE ("custom_field_answers"."value" ILIKE 'bangle')
AND ("custom_field_answers"."number_value" < 50)
but this does not select any product record.
A WHERE clause can only look at columns from one row at a time.
So if you need a condition that applies to two different rows from a table, you need to join to that table twice, so you can get columns from both rows.
SELECT p.*
FROM "products" AS p
INNER JOIN "custom_field_answers" AS a1 ON p."id" = a1."product_id"
INNER JOIN "custom_field_answers" AS a2 ON p."id" = a2."product_id"
WHERE a1."value" = 'bangle' AND a2."number_value" < 50
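A sketch comparing the single join with the double join, run in SQLite from Python. The rows are hypothetical: the question's sample data has no number_value below 50, so product 1 is given a number_value of 40 here so that there is something to find.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY);
CREATE TABLE custom_field_answers (id INTEGER PRIMARY KEY, product_id INTEGER,
                                   value TEXT, number_value INTEGER);
INSERT INTO products VALUES (1), (2);
-- Hypothetical rows: product 1 is a bangle with number_value 40,
-- product 2 is a necklace with number_value 117.
INSERT INTO custom_field_answers VALUES
  (1, 2, 'necklace', NULL), (2, 1, 'bangle', NULL),
  (3, 1, NULL, 40), (4, 2, NULL, 117);
""")

# One join: both conditions must hold on the SAME answer row, which never happens
# because the text and the number live in different rows.
single = conn.execute("""
SELECT p.id
FROM products p
JOIN custom_field_answers a ON a.product_id = p.id
WHERE a.value = 'bangle' AND a.number_value < 50
""").fetchall()

# Two joins: a1 finds the text row, a2 finds the number row of the same product.
double = conn.execute("""
SELECT p.id
FROM products p
JOIN custom_field_answers a1 ON a1.product_id = p.id
JOIN custom_field_answers a2 ON a2.product_id = p.id
WHERE a1.value = 'bangle' AND a2.number_value < 50
""").fetchall()

print(single)
print(double)
```

Note that if a product had several rows matching either condition, the double join would return it more than once; SELECT DISTINCT would be one way to collapse those.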
It produces no records because there is no custom_field_answers record that meets both criteria. What you want is a list of product_ids that have the necessary records in the table. Just in case no one gets to writing the SQL for you, and until I have a chance to work it out myself, I thought I would at least explain to you why your query is not working.
This should work:
SELECT p.* FROM products p LEFT JOIN custom_field_answers c
ON (c.product_id = p.id AND c.value LIKE '%bangle%' AND c.number_value
Hope it helps
Your bangle-related number_value fields are null, so you won't be able to do a straight comparison in those cases. Instead, convert your nulls to 0s first.
SELECT "products".* FROM "products" INNER JOIN "custom_field_answers"
ON "custom_field_answers"."product_id" = "products"."id"
WHERE ("custom_field_answers"."value" LIKE '%bangle%')
AND (coalesce("custom_field_answers"."number_value", 0) < 50)
Didn't actually test it, but this general idea should work:
SELECT *
FROM products
WHERE
EXISTS (
SELECT *
FROM custom_field_answers
WHERE
custom_field_answers.product_id = products.id
AND value = 'bangle'
)
AND EXISTS (
SELECT *
FROM custom_field_answers
WHERE
custom_field_answers.product_id = products.id
AND number_value < 50
)
In plain English: Get all products such that...
there is a related row in custom_field_answers where value = 'bangle'
and there is a (possibly different) related row in custom_field_answers where number_value < 50.