SQL LEFT JOIN empties out the left table columns

I came across some weird behavior today with PostgreSQL.
WITH actual_prices AS (
-- Looking for prices from now to the given number of days back
SELECT *
FROM prices
WHERE price_date >= now()::date - 93
)
, distinct_products_sold AS (
SELECT distinct(id_product) as pid FROM products_sold
)
, first_prices AS (
SELECT s.pid, p.product_id, p.price_date, p.price
FROM distinct_products_sold s
LEFT JOIN actual_prices p ON p.product_id = s.pid
)
select * from first_prices;
This code outputs something of this kind:
129 | | |
195 | | |
251 | | |
...
In other words, columns of table actual_prices are empty. I tried messing around with JOIN just to see what's going on: if I do RIGHT JOIN instead of LEFT JOIN, it empties the column of distinct_products_sold but the columns of actual_prices are displayed correctly. What can cause this?

You have it the wrong way around: the outer join does not cause data to be lost from one table. Rather, it forces a union between the tables by padding the missing columns with nulls, e.g.
WITH P ( PID ) AS
(
SELECT *
FROM (
VALUES ( 1 ), ( 2 ), ( 3 )
) AS T ( C )
),
Q ( QID ) AS
(
SELECT *
FROM (
VALUES ( 4 ), ( 5 ), ( 6 )
) AS T ( C )
)
SELECT p.PID, q.QID
FROM P p, Q q
WHERE p.PID = q.QID
UNION
SELECT p.PID, NULL
FROM P p
WHERE p.PID NOT IN ( SELECT QID FROM Q );
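To check that equivalence on a case where some rows do match, here is a small sketch using Python's built-in sqlite3 module as a stand-in for PostgreSQL. The tables p and q and their rows are made up for illustration. The two forms only agree as written because the sample data has no duplicate rows (UNION removes duplicates, LEFT JOIN does not).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE p (pid INTEGER);
    CREATE TABLE q (qid INTEGER);
    INSERT INTO p VALUES (1), (2), (3);
    INSERT INTO q VALUES (2), (3), (4);  -- hypothetical data: 1 has no match
""")

# The outer join: every row of p, padded with NULL where q has no match.
left_join = conn.execute("""
    SELECT p.pid, q.qid
    FROM p LEFT JOIN q ON p.pid = q.qid
    ORDER BY p.pid
""").fetchall()

# The same result spelled out as inner join UNION null-padded leftovers.
union_form = conn.execute("""
    SELECT p.pid, q.qid FROM p JOIN q ON p.pid = q.qid
    UNION
    SELECT p.pid, NULL FROM p
    WHERE p.pid NOT IN (SELECT qid FROM q)
    ORDER BY 1
""").fetchall()

print(left_join)                # [(1, None), (2, 2), (3, 3)]
print(left_join == union_form)  # True
```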

Forgive me for my brainfart. Turns out it output the unmatched rows (how surprising). LEFT/RIGHT joins also output the unmatched rows of the left or right table.
P.S. Have lunch before posting a question.

No need for a WITH clause here; try this:
SELECT t.pid , p.product_id, p.price_date, p.price
FROM (SELECT distinct id_product as pid FROM products_sold) t
LEFT JOIN prices p
ON(t.pid = p.product_id AND p.price_date >= now()::date - 93)
If all the columns from table prices are still NULL, then there are just no matches.
A left join keeps all the records from the leading table (the left table) and only the matched data from the right table; rows without a match get NULL in the right table's columns.
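A rough way to confirm the "no matches" diagnosis is to run the same join on a tiny data set and count the matches with an inner join. The sketch below uses Python's sqlite3 module rather than PostgreSQL, the sample rows are invented, and a fixed cutoff string stands in for now()::date - 93.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products_sold (id_product INTEGER);
    CREATE TABLE prices (product_id INTEGER, price_date TEXT, price REAL);
    INSERT INTO products_sold VALUES (129), (129), (195), (251);
    -- 129 has a recent price, 195 only an old one, 251 none at all
    INSERT INTO prices VALUES (129, '2024-01-10', 9.5), (195, '2020-01-01', 4.0);
""")

cutoff = "2024-01-01"  # assumption: fixed stand-in for now()::date - 93

# Date filter inside ON: unmatched products survive with NULL price columns.
rows = conn.execute("""
    SELECT t.pid, p.product_id, p.price_date, p.price
    FROM (SELECT DISTINCT id_product AS pid FROM products_sold) t
    LEFT JOIN prices p
      ON t.pid = p.product_id AND p.price_date >= ?
    ORDER BY t.pid
""", (cutoff,)).fetchall()

# Diagnostic: how many sold products have any price on or after the cutoff?
matched = conn.execute("""
    SELECT COUNT(DISTINCT s.id_product)
    FROM products_sold s
    JOIN prices p ON p.product_id = s.id_product AND p.price_date >= ?
""", (cutoff,)).fetchone()[0]

for row in rows:
    print(row)
print("products with a recent price:", matched)
```

If the count comes back as 0 on the real tables, the all-NULL output is simply what the data says, not a join bug.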

Related

Select max column from joined tables

The joined tables produce a result like the one below. I wish to select just 1 record, the one with the max id or prod_month column.
con_model srt_value_current con_id prod_month id
model 4 49 37 45145
model 4 49 38 45726
SELECT DISTINCT TOP (100) PERCENT dbo.DM_TBL_CONFIGURATION_MODEL.con_model, dbo.SRT_Data.SRT_VALUE_CURRENT, dbo.DM_TBL_CONFIGURATION_MODEL.con_id, dbo.SRT_Data.ID, dbo.SRT_Data.PROD_MONTH
FROM dbo.DM_TBL_CONFIGURATION_MODEL LEFT OUTER JOIN
dbo.SRT_ItemNumbers ON dbo.DM_TBL_CONFIGURATION_MODEL.con_model = dbo.SRT_ItemNumbers.ITEM_NUMBER LEFT OUTER JOIN
dbo.SRT_Data ON dbo.SRT_ItemNumbers.ID = dbo.SRT_Data.ITEM_NUMBER_ID
WHERE (SRT_Data.id) IN
( SELECT MAX(id)
FROM SRT_Data
)
and con_model='model'
If you want one row, use TOP (1):
SELECT TOP (1) cm.con_model, s.SRT_VALUE_CURRENT, cm.con_id, s.ID, s.PROD_MONTH
FROM dbo.DM_TBL_CONFIGURATION_MODEL cm LEFT OUTER JOIN
dbo.SRT_ItemNumbers i
ON cm.con_model = i.ITEM_NUMBER LEFT OUTER JOIN
dbo.SRT_Data s
ON i.ID = s.ITEM_NUMBER_ID
WHERE cm.con_model = 'model'
ORDER BY s.prod_month DESC;
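The same "sort descending, keep one row" idea can be tried outside SQL Server. This is only a sketch: it uses Python's sqlite3 module, where LIMIT 1 plays the role of TOP (1), and the shortened table names are hypothetical. The two data rows are the ones shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE config_model (con_id INTEGER, con_model TEXT);
    CREATE TABLE item_numbers (id INTEGER, item_number TEXT);
    CREATE TABLE srt_data (id INTEGER, item_number_id INTEGER,
                           srt_value_current INTEGER, prod_month INTEGER);
    INSERT INTO config_model VALUES (49, 'model');
    INSERT INTO item_numbers VALUES (1, 'model');
    INSERT INTO srt_data VALUES (45145, 1, 4, 37), (45726, 1, 4, 38);
""")

latest = conn.execute("""
    SELECT cm.con_model, s.srt_value_current, cm.con_id, s.id, s.prod_month
    FROM config_model cm
    LEFT JOIN item_numbers i ON cm.con_model = i.item_number
    LEFT JOIN srt_data s ON i.id = s.item_number_id
    WHERE cm.con_model = 'model'
    ORDER BY s.prod_month DESC
    LIMIT 1  -- SQLite's counterpart of TOP (1)
""").fetchall()

print(latest)  # [('model', 4, 49, 45726, 38)]
```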
I am using Cross Apply.
SELECT *
FROM dbo.DM_TBL_CONFIGURATION_MODEL a CROSS APPLY
(
Select Top 1 b.ID,dbo.SRT_Data.SRT_VALUE_CURRENT
From dbo.SRT_ItemNumbers b LEFT OUTER JOIN
dbo.SRT_Data ON b.ID = dbo.SRT_Data.ITEM_NUMBER_ID
Where b.ITEM_NUMBER = a.con_model
Order By ID Desc
) X

Getting Latest 3 orders by Supplier ID

I have the following SQL Server code to get information from a combination of 4 tables.
I would like to modify it to only retrieve the latest 3 orders (pmpOrderDate) by supplier (pmpSupplierOrganizationID).
SELECT
PO.pmpPurchaseOrderID, PO.pmpOrderDate, PO.pmpSupplierOrganizationID, O.cmoName
FROM
PurchaseOrders PO
INNER JOIN
PurchaseOrderLines POL ON PO.pmpPurchaseOrderID = POL.pmlPurchaseOrderID
INNER JOIN
Organizations O ON PO.pmpSupplierOrganizationID = O.cmoOrganizationID
INNER JOIN
Parts P ON POL.pmlPartID = P.impPartID
WHERE
P.impPartClassID LIKE 'PUMP%'
Can you please help?
EDIT:
I wasn't fully clear on my actual requirements. To clarify further, what I need in the end is to display the latest 3 unique purchase orders per supplier ID, counting only purchase orders where at least one line in PurchaseOrderLines has a part whose PartClassID begins with the string 'PUMP'.
Use a ROW_NUMBER to partition by pmpSupplierOrganizationID and order by pmpOrderDate.
with cteTopOrders AS (
SELECT PO.pmpPurchaseOrderID, PO.pmpOrderDate, PO.pmpSupplierOrganizationID, O.cmoName,
ROW_NUMBER() OVER(PARTITION BY pmpSupplierOrganizationID ORDER BY pmpOrderDate DESC) AS RowNum
FROM PurchaseOrders PO
Inner Join PurchaseOrderLines POL ON PO.pmpPurchaseOrderID = POL.pmlPurchaseOrderID
Inner Join Organizations O On PO.pmpSupplierOrganizationID = O.cmoOrganizationID
Inner Join Parts P ON POL.pmlPartID = P.impPartID
WHERE P.impPartClassID Like 'PUMP%'
)
SELECT pmpPurchaseOrderID, pmpOrderDate, pmpSupplierOrganizationID, cmoName
FROM cteTopOrders
WHERE RowNum <= 3;
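One thing to watch with the join to PurchaseOrderLines: an order with several PUMP lines appears several times and can use up more than one of the three slots. A possible variation (not the query above) filters orders with EXISTS before numbering them. The sketch below runs on Python's sqlite3 module (window functions need SQLite 3.25 or newer), with a simplified hypothetical schema and invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchase_orders (po_id INTEGER, order_date TEXT, supplier_id INTEGER);
    CREATE TABLE po_lines (po_id INTEGER, part_class TEXT);
    INSERT INTO purchase_orders VALUES
        (1, '2024-01-01', 1), (2, '2024-02-01', 1), (3, '2024-03-01', 1),
        (4, '2024-04-01', 1), (5, '2024-01-15', 2), (6, '2024-02-15', 2);
    -- order 4 has two PUMP lines; order 6 has no PUMP line at all
    INSERT INTO po_lines VALUES
        (1, 'PUMP-A'), (2, 'PUMP-A'), (3, 'PUMP-B'),
        (4, 'PUMP-A'), (4, 'PUMP-B'), (5, 'PUMP-A'), (6, 'VALVE-A');
""")

top_orders = conn.execute("""
    WITH pump_orders AS (
        -- one row per order that has at least one PUMP line
        SELECT po.po_id, po.order_date, po.supplier_id
        FROM purchase_orders po
        WHERE EXISTS (SELECT 1 FROM po_lines l
                      WHERE l.po_id = po.po_id AND l.part_class LIKE 'PUMP%')
    ), ranked AS (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY supplier_id
                                     ORDER BY order_date DESC) AS rn
        FROM pump_orders
    )
    SELECT supplier_id, po_id, order_date
    FROM ranked
    WHERE rn <= 3
    ORDER BY supplier_id, order_date DESC
""").fetchall()

for row in top_orders:
    print(row)
```

With this data, supplier 1 gets orders 4, 3 and 2 (order 4 only once), and supplier 2 gets order 5 only.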
I'm a fan of lateral joins for this . . . cross apply:
select p.*, O.cmoName
from Organizations O cross apply
(select top (3) PO.pmpPurchaseOrderID, PO.pmpOrderDate, PO.pmpSupplierOrganizationID
from PurchaseOrders PO join
PurchaseOrderLines POL
on PO.pmpPurchaseOrderID = POL.pmlPurchaseOrderID join
Parts P
on POL.pmlPartID = P.impPartID
where PO.pmpSupplierOrganizationID = O.cmoOrganizationID and
P.impPartClassID Like 'PUMP%'
order by PO.pmpOrderDate desc
) p
You need a nested row_number to get the three rows per supplier and another OLAP-function on top of it:
with OrderRowNum as
(
SELECT PO.pmpPurchaseOrderID, PO.pmpOrderDate, PO.pmpSupplierOrganizationID, O.cmoName, P.impPartClassID,
row_number()
over (partition by PO.pmpSupplierOrganizationID
order by pmpOrderDate desc) as rn
FROM PurchaseOrders PO
Inner Join PurchaseOrderLines POL ON PO.pmpPurchaseOrderID = POL.pmlPurchaseOrderID
Inner Join Organizations O On PO.pmpSupplierOrganizationID = O.cmoOrganizationID
Inner Join Parts P ON POL.pmlPartID = P.impPartID
)
, CheckPUMP as
(
select *,
-- check if at least one of the three rows contains PUMP
max(case when impPartClassID Like 'PUMP%' then 1 else 0 end)
over (partition by pmpSupplierOrganizationID) as PUMPflag
from OrderRowNum
where rn <= 3 -- get the last three rows per supplier
)
select *
from CheckPUMP
where PUMPflag = 1

Changing SQL NOT IN to JOINS

Hello guys,
Our aim is to get a script that will insert the missing pairs of product - TaxCategory in the intermediate table (ProductTaxCategory)
The following script is correctly working but we are trying to find a way to optimize it:
INSERT ProductTaxCategory
(ProductTaxCategory_TaxCategoryId,ProductTaxCategory_ProductId)
SELECT
TaxCategoryId
,ProductId
FROM Product pr
CROSS JOIN TaxCategory tx
WHERE pr.ProductId NOT IN
(
SELECT ProductTaxCategory_ProductId
FROM ProductTaxCategory
)
OR
pr.ProductId IN
(
SELECT ProductTaxCategory_ProductId
FROM ProductTaxCategory
)
AND
tx.TaxCategoryId NOT IN
(
SELECT ProductTaxCategory_TaxCategoryId
FROM ProductTaxCategory
WHERE ProductTaxCategory_ProductId = pr.ProductId
)
How can we optimize this query?
Try something like (full statement now):
INSERT INTO ProductTaxCategory
(ProductTaxCategory_TaxCategoryId,ProductTaxCategory_ProductId)
SELECT TaxCategoryId, ProductId
FROM Product pr CROSS JOIN TaxCategory tx
WHERE NOT EXISTS
(SELECT 1 FROM ProductTaxCategory
WHERE ProductTaxCategory_ProductId = pr.ProductId
AND ProductTaxCategory_TaxCategoryId = tx.TaxCategoryId)
EXISTS with (SELECT 1 ... WHERE ID=...) is often a better alternative to IN (SELECT ID FROM ... ) constructs.
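One concrete reason to prefer NOT EXISTS, which the answer does not spell out: if the subquery of a NOT IN returns even one NULL, the NOT IN matches nothing. The sketch below shows this with Python's sqlite3 module. The table and column names are shortened, hypothetical versions of the ones in the question, and the NULL row is planted on purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id INTEGER);
    CREATE TABLE product_tax_category (product_id INTEGER, tax_category_id INTEGER);
    INSERT INTO product VALUES (1), (2), (3);
    INSERT INTO product_tax_category VALUES (1, 10), (NULL, 20);
""")

# NOT IN: comparing against the NULL yields UNKNOWN, so no row passes.
not_in = conn.execute("""
    SELECT product_id FROM product
    WHERE product_id NOT IN (SELECT product_id FROM product_tax_category)
    ORDER BY product_id
""").fetchall()

# NOT EXISTS: the NULL row never matches, so it is simply ignored.
not_exists = conn.execute("""
    SELECT pr.product_id FROM product pr
    WHERE NOT EXISTS (SELECT 1 FROM product_tax_category ptc
                      WHERE ptc.product_id = pr.product_id)
    ORDER BY pr.product_id
""").fetchall()

print(not_in)      # []
print(not_exists)  # [(2,), (3,)]
```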
You can do a LEFT JOIN with ProductTaxCategory and check for NULLs.
Something like this.
INSERT ProductTaxCategory
(
ProductTaxCategory_TaxCategoryId,
ProductTaxCategory_ProductId
)
SELECT p.TaxCategoryId, p.ProductId
FROM
(
SELECT TaxCategoryId, ProductId
FROM Product pr
CROSS JOIN TaxCategory tx
) p
LEFT JOIN ProductTaxCategory ptx
ON P.TaxCategoryId = ptx.ProductTaxCategory_TaxCategoryId
AND P.ProductId = ptx.ProductTaxCategory_ProductId
WHERE ptx.ProductTaxCategory_ProductId IS NULL
Use CROSS JOIN and EXCEPT
INSERT ProductTaxCategory(ProductTaxCategory_ProductId, ProductTaxCategory_TaxCategoryId)
SELECT p.ProductID, tc.TaxCategoryId FROM Product p CROSS JOIN TaxCategory tc
EXCEPT
SELECT ProductTaxCategory_ProductId, ProductTaxCategory_TaxCategoryId FROM ProductTaxCategory
CROSS JOIN will search all the possible pairs. EXCEPT will get you what's missing. Finally you can INSERT them onto the table.
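A quick way to convince yourself that this only inserts the missing pairs is to run it twice on a toy data set. This is a sketch on Python's sqlite3 module rather than SQL Server, with shortened hypothetical table and column names and made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id INTEGER);
    CREATE TABLE tax_category (tax_category_id INTEGER);
    CREATE TABLE product_tax_category (product_id INTEGER, tax_category_id INTEGER);
    INSERT INTO product VALUES (1), (2);
    INSERT INTO tax_category VALUES (10), (20);
    INSERT INTO product_tax_category VALUES (1, 10);  -- one pair already exists
""")

# All possible pairs minus the pairs already present.
INSERT_MISSING = """
    INSERT INTO product_tax_category (product_id, tax_category_id)
    SELECT p.product_id, tc.tax_category_id
    FROM product p CROSS JOIN tax_category tc
    EXCEPT
    SELECT product_id, tax_category_id FROM product_tax_category
"""

def pairs():
    """Return the current contents of the link table in a stable order."""
    return conn.execute(
        "SELECT product_id, tax_category_id FROM product_tax_category ORDER BY 1, 2"
    ).fetchall()

conn.execute(INSERT_MISSING)
after_first = pairs()
conn.execute(INSERT_MISSING)  # second run finds nothing missing
after_second = pairs()

print(after_first)                  # [(1, 10), (1, 20), (2, 10), (2, 20)]
print(after_first == after_second)  # True
```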

Join record with most recent record on second table

I have 2 tables
Delivery
--------
deliveryid int (PK)
description long varchar
DeliveryHistory
---------------
historyid int
deliveryid int
statusid int
recordtime timestamp
What I am trying to do is a left outer join to bring back all records from table Delivery with only the most recent entry in DeliveryHistory for each delivery. However, if there are no entries in DeliveryHistory for the delivery, I would like a null value.
I have done this:
select d.deliveryid,d.description, h.statusid from delivery d
left outer join Deliveryhistory h on d.deliveryid = h.deliveryid
where h.recordtime =
( SELECT MAX(recordtime)
FROM Deliveryhistory
WHERE deliveryid = d.deliveryid)
But it only returns the rows that have an entry in DeliveryHistory.
Your where clause is resulting in all null values being excluded. Try
where h.RecordTime is null OR
h.recordtime =
( SELECT MAX(recordtime)
FROM Deliveryhistory
WHERE deliveryid = d.deliveryid)
select d.deliveryid,d.description, h.statusid from delivery d
left outer join Deliveryhistory h on d.deliveryid = h.deliveryid
where (h.recordtime =
( SELECT MAX(recordtime)
FROM Deliveryhistory
WHERE deliveryid = d.deliveryid)
or h.deliveryid IS NULL)
The existing answers are all it takes but if you'd like to do this without using a WHERE clause you can use following construct.
SELECT d.deliveryid
,d.description
, dh.statusid
FROM Delivery d
LEFT OUTER JOIN (
SELECT deliveryid, MAX(recordtime) AS recordtime
FROM DeliveryHistory
GROUP BY
deliveryid
) dhm ON dhm.deliveryid = d.deliveryid
LEFT OUTER JOIN DeliveryHistory dh ON dh.deliveryid = dhm.deliveryid
AND dh.recordtime = dhm.recordtime
CTE to yield the maxrow (IFF the implementation supports CTEs ;-) plus simple left join with the CTE.
WITH last AS (
SELECT * FROM Deliveryhistory dh
WHERE NOT EXISTS (
SELECT *
FROM Deliveryhistory nx
WHERE nx.deliveryid = dh.deliveryid
AND nx.recordtime > dh.recordtime -- no one is bigger: dh must be the max
)
)
SELECT d.deliveryid, d.description, l.statusid
FROM delivery d
LEFT JOIN last l ON d.deliveryid = l.deliveryid
;
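To see why the original query drops deliveries without history, it can help to run it next to a version that puts the "latest row" condition in the ON clause instead of WHERE. This is a sketch on Python's sqlite3 module with two invented deliveries. Moving the condition into ON is a variation on the answers above, not a quote of any of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE delivery (deliveryid INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE deliveryhistory (historyid INTEGER, deliveryid INTEGER,
                                  statusid INTEGER, recordtime TEXT);
    INSERT INTO delivery VALUES (1, 'has history'), (2, 'no history');
    INSERT INTO deliveryhistory VALUES
        (1, 1, 10, '2024-01-01 08:00:00'),
        (2, 1, 20, '2024-02-01 08:00:00');
""")

# Condition in WHERE: the NULL-padded row for delivery 2 fails the
# comparison and is filtered out, so the join behaves like an inner join.
filter_in_where = conn.execute("""
    SELECT d.deliveryid, d.description, h.statusid
    FROM delivery d
    LEFT JOIN deliveryhistory h ON d.deliveryid = h.deliveryid
    WHERE h.recordtime = (SELECT MAX(recordtime) FROM deliveryhistory
                          WHERE deliveryid = d.deliveryid)
    ORDER BY d.deliveryid
""").fetchall()

# Condition in ON: it only decides which history row matches,
# and deliveries without any match are still kept.
filter_in_on = conn.execute("""
    SELECT d.deliveryid, d.description, h.statusid
    FROM delivery d
    LEFT JOIN deliveryhistory h
      ON d.deliveryid = h.deliveryid
     AND h.recordtime = (SELECT MAX(recordtime) FROM deliveryhistory
                         WHERE deliveryid = d.deliveryid)
    ORDER BY d.deliveryid
""").fetchall()

print(filter_in_where)  # [(1, 'has history', 20)]
print(filter_in_on)     # [(1, 'has history', 20), (2, 'no history', None)]
```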

Creating Association between 2 product IDs

I have to create an association between 2 products, which have unique product_ids, and insert them into an already constructed table. The association is created based on the part number these product ids share. For instance:
Product_id = 7578711
Part Number = 0101-2478
Product Id = 7957948
Part Number = 0101-2478
Product Id = 10558140
Part Number = 0101-2478
and my current table has the following columns:
ID (int) identity (1, 1)
product_id
date
guid
where data is in the form of:
1, 7578711, 12345, 2010-08-24 04:29:04.000,00286AFB-3880-4085-BAA0-DBCC0D59A391
I have a query which has the ability to roll Product_id to part number level and then a query to roll the part number to product_id level.
Based on the above data, where they have the same part number, I want to create an association and generate insert statements which will add 2 records in the form of:
2, 7957948, 12345, 2010-08-24 04:29:04.000,00286AFB-3880-4085-BAA0-DBCC0D59A391
3, 10558140, 12345, 2010-08-24 04:29:04.000,00286AFB-3880-4085-BAA0-DBCC0D59A391
There are going to be many product IDs in that table. The above one is just an example:
I have 2 Common Table Expressions: 1 rolls the product Id to part number level, and another rolls back the part number to multiple product Ids. I am trying to avoid a cursor.
Could anyone here help me with this problem?
My 2 CTEs are as follows:
;WITH cte (product_id, item_number)
AS
(
SELECT DISTINCT --TOP 1000
pds.product_id
--,pd.productOwner_id
, i.item_number
FROM SurfwatcherEndeavorStats.dbo.productDetailBySite pds WITH ( NOLOCK )
INNER JOIN ProductData.dbo.productDimensions pd with ( NOLOCK ) ON pds.product_id = pd.product_id
INNER JOIN ProductData.dbo.options o with ( NOLOCK ) ON pds.product_id = o.product_id
INNER JOIN ProductData.dbo.items i with ( NOLOCK ) ON o.option_id = i.item_id
WHERE pds.productDetail_date > DATEADD(yyyy, -1, GETDATE())
AND i.item_number IS NOT NULL
--AND i.item_number = '0101-3258'
)
SELECT TOP 1 item_number
FROM cte WITH (NOLOCK)
WHERE product_id = 7957948
;WITH cte1 (product_id, item_number)
AS
(
SELECT DISTINCT --TOP 1000
pds.product_id
--,pd.productOwner_id
, i.item_number
FROM SurfwatcherEndeavorStats.dbo.productDetailBySite pds WITH ( NOLOCK )
INNER JOIN ProductData.dbo.productDimensions pd with ( NOLOCK ) ON pds.product_id = pd.product_id
INNER JOIN ProductData.dbo.options o with ( NOLOCK ) ON pds.product_id = o.product_id
INNER JOIN ProductData.dbo.items i with ( NOLOCK ) ON o.option_id = i.item_id
WHERE pds.productDetail_date > DATEADD(yyyy, -1, GETDATE())
AND i.item_number IS NOT NULL
)
SELECT product_id
FROM cte1 WITH (NOLOCK)
WHERE item_number = '0101-2478'
Try this SQL. It should give you a complete list of all associations between two products that both use the same part number. Is that what you want?
Select Distinct A.Product_Id, B.Product_ID
From YourTable A
Join YourTable B
On B.PartNumber = A.PartNumber
And B.Product_Id > A.Product_Id
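Here is how that self-join behaves on the three product ids from the question. It is a sketch on Python's sqlite3 module. The table name product_parts is hypothetical, and the fourth row (product 111 with another part number) is invented to show that unrelated products are not paired.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product_parts (product_id INTEGER, part_number TEXT);
    INSERT INTO product_parts VALUES
        (7578711,  '0101-2478'),
        (7957948,  '0101-2478'),
        (10558140, '0101-2478'),
        (111,      '0101-3258');  -- invented row with a different part number
""")

# b.product_id > a.product_id keeps each pair once and drops self-pairs.
associations = conn.execute("""
    SELECT DISTINCT a.product_id AS product_a, b.product_id AS product_b
    FROM product_parts a
    JOIN product_parts b
      ON b.part_number = a.part_number
     AND b.product_id > a.product_id
    ORDER BY product_a, product_b
""").fetchall()

for pair in associations:
    print(pair)
# (7578711, 7957948)
# (7578711, 10558140)
# (7957948, 10558140)
```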