Duplication happening when second call to warehouse item - sql

I am getting some strange duplication in my query the problem is I need to be able to query both stock qtys from warehouse item table and as you see i am calling it twice hence there is duplication of warehouse items. Will find that its my joins that will be the issue.
SELECT ki.[KitItemID]
,ki.[KitHeaderID]
,kh.StockCode as KitHeaderStockCode
,ki.[StockCode]
,ki.[StockDescription]
,ki.[StockItemID]
,ki.[Qty]
,ki.[IsBoard]
,ki.[IsSubAssembly],
ISNULL(wip.ConfirmedQtyInStock,0) as InStockCs,
ISNULL(SUM(wip2.ConfirmedQtyInStock), 0) as InStockWip,
#ppBatch as BatchID
FROM KitItem ki
LEFT JOIN KitHeader kh ON ki.KitHeaderID = kh.KitHeaderID
LEFT JOIN WarehouseItem wip2 ON ki.StockItemID = wip2.ItemID AND wip2.WarehouseID = (SELECT TOP 1 WIP_LocationID FROM Settings)
INNER JOIN
dbo.WarehouseItem wip ON ki.StockItemID =wip.ItemID INNER JOIN
dbo.Warehouse war ON wip.WarehouseID = war.WarehouseID
WHERE ki.IsBox = 0
GROUP BY
ki.[KitItemID]
,ki.[KitHeaderID]
,kh.StockCode
,ki.[StockCode]
,ki.[StockDescription]
,ki.[StockItemID]
,ki.[Qty]
,ki.[IsBoard]
,ki.[IsSubAssembly]
,wip.ConfirmedQtyInStock
,wip2.ConfirmedQtyInStock

After ON ki.StockItemID =wip.ItemID, add
and wip.ItemID=wip2.ItemID
so that each row does only have data about 1 item, not 2

If you get full duplicates you can use distinct in select, otherwise add a column in group by which is different

Related

SQL - join three tables based on (different) latest dates in two of them

Using Oracle SQL Developer, I have three tables with some common data that I need to join.
Appreciate any help on this!
Please refer to https://i.stack.imgur.com/f37Jh.png for the input and desired output (table formatting doesn't work on all tables).
These tables are made up in order to anonymize them, and in reality contain other data with millions of entries, but you could think of them as representing:
Product = Main product categories in a grocery store.
Subproduct = Subcategory products to the above. Each time the table is updated, the main product category may loses or get some new suproducts assigned to it. E.g. you can see that from May to June the Pulled pork entered while the Fishsoup was thrown out.
Issues = Status of the products, for example an apple is bad if it has brown spots on it..
What I need to find is: for each P_NAME, find the latest updated set of subproducts (SP_ID and SP_NAME), and append that information with the latest updated issue status (STATUS_FLAG).
Please note that each main product category gets its set of subproducts updated at individual occasions i.e. 1234 and 5678 might be "latest updated" on different dates.
I have tried multiple queries but failed each time. I am using combos of SELECT, LEFT OUTER JOIN, JOIN, MAX and GROUP BY.
Latest attempt, which gives me the combo of the first two tables, but missing the third:
SELECT
PRODUCT.P_NAME,
SUBPRODUCT.SP_PRODUCT_ID, SUBPRODUCT.SP_NAME, SUBPRODUCT.SP_ID, SUPPRODUCT.SP_VALUE_DATE
FROM SUBPRODUCT
LEFT OUTER JOIN PRODUCT ON PRODUCT.P_ID = SUBPRODUCT.SP_PRODUCT_ID
JOIN(SELECT SP_PRODUCT_ID, MAX(SP_VALUE_DATE) AS latestdate FROM SUBPRODUCT GROUP BY SP_PRODUCT_ID) sub ON
sub.SP_PRODUCT_ID = SUBPRODUCT.SP_PRODUCT_ID AND sub.latestDate = SUBPRODUCT.SP_VALUE_DATE;
Trying to find a row with a max value is a common SQL pattern - you can do it with a join, like your example, but it's usually more clear to use a subquery or a window function.
Correlated subquery example
select
PRODUCT.P_NAME,
SUBPRODUCT.SP_PRODUCT_ID, SUBPRODUCT.SP_NAME, SUBPRODUCT.SP_ID, SUPPRODUCT.SP_VALUE_DATE,
ISSUES.STATUS_FLAG, ISSUES.STATUS_LAST_UPDATED
from PRODUCT
join SUBPRODUCT
on PRODUCT.P_ID = SUBPRODUCT.SP_PRODUCT_ID
and SUBPRODUCT.SP_VALUE_DATE = (select max(S2.SP_VALUE_DATE) as latestDate
from SUBPRODUCT S2
where S2.SP_PRODUCT_ID = SUBPRODUCT.SP_PRODUCT_ID)
join ISSUES
on ISSUES.ISSUE_ID = SUBPRODUCT.SP_ID
and ISSUES.STATUS_LAST_UPDATED = (select max(I2.STATUS_LAST_UPDATED) as latestDate
from ISSUES I2
where I2.ISSUE_ID = ISSUES.ISSUE_ID)
Window function / inline view example
select
PRODUCT.P_NAME,
S.SP_PRODUCT_ID, S.SP_NAME, S.SP_ID, S.SP_VALUE_DATE,
I.STATUS_FLAG, I.STATUS_LAST_UPDATED
from PRODUCT
join (select SUBPRODUCT.*,
max(SP_VALUE_DATE) over (partition by SP_PRODUCT_ID) as latestDate
from SUBPRODUCT) S
on PRODUCT.P_ID = S.SP_PRODUCT_ID
and S.SP_VALUE_DATE = S.latestDate
join (select ISSUES.*,
max(STATUS_LAST_UPDATED) over (partition by ISSUE_ID) as latestDate
from ISSUES) I
on I.ISSUE_ID = S.SP_ID
and I.STATUS_LAST_UPDATED = I.latestDate
This often performs a bit better, but window functions can be tricky to understand.

SQL Server - Need to SUM values in across multiple returned records

In the following query I am trying to get TotalQty to SUM across both the locations for item 6112040, but so far I have been unable to make this happen. I do need to keep both lines for 6112040 separate in order to capture the different location.
This query feeds into a Jasper ireport using something called Java.Groovy. Despite this, none of the PDFs printed yet have been either stylish or stained brown. Perhaps someone could address that issue as well, but this SUM issue takes priority
I know Gordon Linoff will get on in about an hour so maybe he can help.
DECLARE #receipt INT
SET #receipt = 20
SELECT
ent.WarehouseSku AS WarehouseSku,
ent.PalletId AS [ReceivedPallet],
ISNULL(inv.LocationName,'') AS [ActualLoc],
SUM(ISNULL(inv.Qty,0)) AS [LocationQty],
SUM(ISNULL(inv.Qty,0)) AS [TotalQty],
MAX(CAST(ent.ReceiptLineNumber AS INT)) AS [LineNumber],
MAX(ent.WarehouseLotReference) AS [WarehouseLot],
LEFT(SUM(ent.WeightExpected),7) AS [GrossWeight],
LEFT(SUM(inv.[Weight]),7) AS [NetWeight]
FROM WarehouseReceiptDetail AS det
INNER JOIN WarehouseReceiptDetailEntry AS ent
ON det.ReceiptNumber = ent.ReceiptNumber
AND det.FacilityName = ent.FacilityName
AND det.WarehouseName = ent.WarehouseName
AND det.ReceiptLineNumber = ent.ReceiptLineNumber
LEFT OUTER JOIN Inventory AS inv
ON inv.WarehouseName = det.WarehouseName
AND inv.FacilityName = det.FacilityName
AND inv.WarehouseSku = det.WarehouseSku
AND inv.CustomerLotReference = ent.CustomerLotReference
AND inv.LotReferenceOne = det.ReceiptNumber
AND ISNULL(ent.CaseId,'') = ISNULL(inv.CaseId,'')
WHERE
det.WarehouseName = $Warehouse
AND det.FacilityName = $Facility
AND det.ReceiptNumber = #receipt
GROUP BY
ent.PalletId
, ent.WarehouseSku
, inv.LocationName
, inv.Qty
, inv.LotReferenceOne
ORDER BY ent.WarehouseSku
The lines I need partially coalesced are 4 and 5 in the above return.
Create a second dataset with a subquery and join to that subquery - you can extrapolate from the following to apply to your situation:
First the Subquery:
SELECT
WarehouseSku,
SUM(Qty)
FROM
Inventory
GROUP BY
WarehouseSku
Now apply to your query - insert into the FROM clause:
...
LEFT JOIN (
SELECT
WarehouseSKU,
SUM(Qty)
FROM
Inventory
GROUP BY
WarehouseSKU
) AS TotalQty
ON Warehouse.WarehouseSku = TotalQty.WarehouseSku
Without seeing the actual schema DDL it is hard to know the exact cardinality, but I think this will point you in the right direction.

MS Access SQL left join giving all 0

I am trying to make a summary table of all the items I have. I have a raw data table with 10 users who respectively have different items. There are maximum 3 different items and I want to do a count to see how many items each individual has. The following is my code.
Select b.Country,b.UserID,Num_including_fruits, Apple,Orange
from
(((SELECT o.Country,o.UserID, IIF(ISNULL(Count(o.UserID)),0,Count(o.UserID))
AS Num_including_fruits
FROM [SEA2_View] as o
GROUP BY o.UserID, o.Country
ORDER BY Country)as b
LEFT JOIN
(SELECT o.Country,o.UserID,IIF(ISNULL(Count(o.UserID)),0,Count(o.UserID)
AS Apple
FROM [APAC2_View] as o
WHERE o.fruit_status <>"fresh" AND o.HWType = "Apple"
GROUP BY o.Country,o.UserID)as d
ON (b.UserID = d.UserID))
LEFT JOIN
(SELECT o.Country,o.UserID,IIF(ISNULL(Count(o.UserID)),0,Count(o.UserID))
AS Orange
FROM [SEA2_View] as o
WHERE o.fruit_status <>"fresh" AND o.HWType = "Orange"
GROUP BY o.Country,o.UserID)as e
ON (d.UserID = e.UserID))
;
The first join returns the correct result but the second join somehow returns all 0, which is incorrect. Therefore please help! and I would appreciate any advice for best practice when it comes to joins in SQL. Thanks lot!
Are you sure you don't have a table naming error?
You're first joining [SEA2_View] with [APAC2_View]. The second join is joining with [SEA2_View] with itself.

Timeout running SQL query

I'm trying to using the aggregation features of the django ORM to run a query on a MSSQL 2008R2 database, but I keep getting a timeout error. The query (generated by django) which fails is below. I've tried running it directs the SQL management studio and it works, but takes 3.5 min
It does look it's aggregating over a bunch of fields which it doesn't need to, but I wouldn't have though that should really cause it to take that long. The database isn't that big either, auth_user has 9 records, ticket_ticket has 1210, and ticket_watchers has 1876. Is there something I'm missing?
SELECT
[auth_user].[id],
[auth_user].[password],
[auth_user].[last_login],
[auth_user].[is_superuser],
[auth_user].[username],
[auth_user].[first_name],
[auth_user].[last_name],
[auth_user].[email],
[auth_user].[is_staff],
[auth_user].[is_active],
[auth_user].[date_joined],
COUNT([tickets_ticket].[id]) AS [tickets_captured__count],
COUNT(T3.[id]) AS [assigned_tickets__count],
COUNT([tickets_ticket_watchers].[ticket_id]) AS [tickets_watched__count]
FROM
[auth_user]
LEFT OUTER JOIN [tickets_ticket] ON ([auth_user].[id] = [tickets_ticket].[capturer_id])
LEFT OUTER JOIN [tickets_ticket] T3 ON ([auth_user].[id] = T3.[responsible_id])
LEFT OUTER JOIN [tickets_ticket_watchers] ON ([auth_user].[id] = [tickets_ticket_watchers].[user_id])
GROUP BY
[auth_user].[id],
[auth_user].[password],
[auth_user].[last_login],
[auth_user].[is_superuser],
[auth_user].[username],
[auth_user].[first_name],
[auth_user].[last_name],
[auth_user].[email],
[auth_user].[is_staff],
[auth_user].[is_active],
[auth_user].[date_joined]
HAVING
(COUNT([tickets_ticket].[id]) > 0 OR COUNT(T3.[id]) > 0 )
EDIT:
Here are the relevant indexes (excluding those not used in the query):
auth_user.id (PK)
auth_user.username (Unique)
tickets_ticket.id (PK)
tickets_ticket.capturer_id
tickets_ticket.responsible_id
tickets_ticket_watchers.id (PK)
tickets_ticket_watchers.user_id
tickets_ticket_watchers.ticket_id
EDIT 2:
After a bit of experimentation, I've found that the following query is the smallest that results in the slow execution:
SELECT
COUNT([tickets_ticket].[id]) AS [tickets_captured__count],
COUNT(T3.[id]) AS [assigned_tickets__count],
COUNT([tickets_ticket_watchers].[ticket_id]) AS [tickets_watched__count]
FROM
[auth_user]
LEFT OUTER JOIN [tickets_ticket] ON ([auth_user].[id] = [tickets_ticket].[capturer_id])
LEFT OUTER JOIN [tickets_ticket] T3 ON ([auth_user].[id] = T3.[responsible_id])
LEFT OUTER JOIN [tickets_ticket_watchers] ON ([auth_user].[id] = [tickets_ticket_watchers].[user_id])
GROUP BY
[auth_user].[id]
The weird thing is that if I comment out any two lines in the above, it runs in less that 1s, but it doesn't seem to matter which lines I remove (although obviously I can't remove a join without also removing the relevant SELECT line).
EDIT 3:
The python code which generated this is:
User.objects.annotate(
Count('tickets_captured'),
Count('assigned_tickets'),
Count('tickets_watched')
)
A look at the execution plan shows that SQL Server is first doing a cross-join on all the table, resulting in about 280 million rows, and 6Gb of data. I assume that this is where the problem lies, but why is it happening?
SQL Server is doing exactly what it was asked to do. Unfortunately, Django is not generating the right query for what you want. It looks like you need to count distinct, instead of just count: Django annotate() multiple times causes wrong answers
As for why the query works that way: The query says to join the four tables together. So say an author has 2 captured tickets, 3 assigned tickets, and 4 watched tickets, the join will return 2*3*4 tickets, one for each combination of tickets. The distinct part will remove all the duplicates.
what about this?
SELECT auth_user.*,
C1.tickets_captured__count
C2.assigned_tickets__count
C3.tickets_watched__count
FROM
auth_user
LEFT JOIN
( SELECT capturer_id, COUNT(*) AS tickets_captured__count
FROM tickets_ticket GROUP BY capturer_id ) AS C1 ON auth_user.id = C1.capturer_id
LEFT JOIN
( SELECT responsible_id, COUNT(*) AS assigned_tickets__count
FROM tickets_ticket GROUP BY responsible_id ) AS C2 ON auth_user.id = C2.responsible_id
LEFT JOIN
( SELECT user_id, COUNT(*) AS tickets_watched__count
FROM tickets_ticket_watchers GROUP BY user_id ) AS C3 ON auth_user.id = C3.user_id
WHERE C1.tickets_captured__count > 0 OR C2.assigned_tickets__count > 0
--WHERE C1.tickets_captured__count is not null OR C2.assigned_tickets__count is not null -- also works (I think with beter performance)

SQL: return 0 count in case no record is found

A very simple issue as it appears but somehow not working for me on Oracle 10gXE.
Based on my SQLFiddle, I have to show all staff names and count if present or 0 if no record found having status = 2
How can I achieve it in a single query without calling Loop in my application side.
SELECT S.NAME,ISTATUS.STATUS,COUNT(ISTATUS.Q_ID) as TOTAL
FROM STAFF S
LEFT OUTER JOIN QUESTION_STATUS ISTATUS
ON S.ID = ISTATUS.DONE_BY
AND ISTATUS.STATUS = 2 <--- instead of WHERE
GROUP BY S.NAME,ISTATUS.STATUS
By filtering in the WHERE clause, you filter too late, and you remove STAFF rows that you do want to see. Moving the filter into the join condition means only QUESTION_STATUS rows get filtered out.
Note that STATUS is not really a useful column here, since you won't ever get any result other than 2 or NULL, so you could omit it:
SELECT S.NAME,COUNT(ISTATUS.Q_ID) as TOTAL
FROM STAFF S
LEFT OUTER JOIN QUESTION_STATUS ISTATUS
ON S.ID = ISTATUS.DONE_BY
AND ISTATUS.STATUS = 2
GROUP BY S.NAME
I corrected your sqlfiddle: http://sqlfiddle.com/#!4/90ba0/12
The rule of thumb is that the filters must appear in the ON condition of the table they depend on.
Move filter into the LEFT JOIN , also use COALESCE to have your results display 0 instead of null as you requested in your question
select S.NAME,COALESCE(ISTATUS.STATUS,0),COUNT(ISTATUS.Q_ID) as TOTAL
from STAFF S
LEFT OUTER JOIN QUESTION_STATUS ISTATUS
ON S.ID = ISTATUS.DONE_BY
AND ISTATUS.STATUS =2
GROUP BY S.NAME,ISTATUS.STATUS