Oracle SQL Self-joins in the same statement - sql

I'm trying to achieve the following. Any help will be highly appreciated!
1.In the first table, do a self-join on a main deal with its corresponding offset deal
2.In the second table, find all the main deals where the prices don't match with the corresponding offset deals.
I don't think the second self-join works. That's supposed to link the main and the offset deals in the prices table and find cases where the prices don't match for certain price dates.
SELECT *
FROM transactions tran1 transactions tran2
,prices pr1
,prices pr1
WHERE tran1.tran_type = 1 --Original deal
AND tran2.tran_type = 2 -- Offset deal
AND tran1.tran_num = tran2.offset_tran_num
AND tran1.ins_num = pr1.ins_num
AND tran2.ins_num = pr2.ins_num
AND pr1.ins_num = pr2.ins_num
AND pr1.profile_num = pr2.profile_num
AND pr1.price_date = pr2.price_date
AND pr1.value != pr2.value

This statement could benefit from using proper join syntax, instead of this older style - immediately on trying to convert it you can see the alias 'pr1' is used twice, you meant to use 'pr2'
Given that being correct, (and assuming you may need some help on converting it to a modern syntax) you have :
SELECT *
FROM transactions tran1
inner join prices pr1 on pr1.ins_num = tran1.ins_num
inner join prices pr2 on pr2.ins_num = pr1.ins_num
and pr2.profile_num = pr1.profile_num
and pr2.price_date = pr1.price_date
and pr2.value != pr1.value
inner join transactions tran2 on tran2.ins_num = pr2.ins_num
and tran2.offset_tran_num = tran1.tran_num
WHERE tran1.tran_type = 1 --Original deal
AND tran2.tran_type = 2 -- Offset deal
It's easier to read but we would need more information, input + expected output to confirm if it is what you intended.

Related

Sub-query works but would a join or other alternative be better?

I am trying to select rows from one table where the id referenced in those rows matches the unique id from another table that relates to it like so:
SELECT *
FROM booklet_tickets
WHERE bookletId = (SELECT id
FROM booklets
WHERE bookletNum = 2000
AND seasonId = 9
AND bookletTypeId = 3)
With the bookletNum/seasonId/bookletTypeId being filled in by a user form and inserted into the query.
This works and returns what I want but seems messy. Is a join better to use in this type of scenario?
If there is even a possibility for your subquery to return multiple value you should use in instead:
SELECT *
FROM booklet_tickets
WHERE bookletId in (SELECT id
FROM booklets
WHERE bookletNum = 2000
AND seasonId = 9
AND bookletTypeId = 3)
But I would prefer exists over in :
SELECT *
FROM booklet_tickets bt
WHERE EXISTS (SELECT 1
FROM booklets b
WHERE bookletNum = 2000
AND seasonId = 9
AND bookletTypeId = 3
AND b.id = bt.bookletId)
It is not possible to give a "Yes it's better" or "no it's not" answer for this type of scenario.
My personal rule of thumb if number of rows in a table is less than 1 million, I do not care optimising "SELECT WHERE IN" types of queries as SQL Server Query Optimizer is smart enough to pick an appropriate plan for the query.
In reality however you often need more values from a joined table in the final resultset so a JOIN with a filter WHERE clause might make more sense, such as:
SELECT BT.*, B.SeasonId
FROM booklet_tickes BT
INNER JOIN booklets B ON BT.bookletId = B.id
WHERE B.bookletNum = 2000
AND B.seasonId = 9
AND B.bookletTypeId = 3
To me it comes down to a question of style rather than anything else, write your code so that it'll be easier for you to understand it months later. So pick a certain style and then stick to it :)
The question however is old as the time itself :)
SQL JOIN vs IN performance?

Should I use an SQL full outer join for this?

Consider the following tables:
Table A:
DOC_NUM
DOC_TYPE
RELATED_DOC_NUM
NEXT_STATUS
...
Table B:
DOC_NUM
DOC_TYPE
RELATED_DOC_NUM
NEXT_STATUS
...
The DOC_TYPE and NEXT_STATUS columns have different meanings between the two tables, although a NEXT_STATUS = 999 means "closed" in both. Also, under certain conditions, there will be a record in each table, with a reference to a corresponding entry in the other table (i.e. the RELATED_DOC_NUM columns).
I am trying to create a query that will get data from both tables that meet the following conditions:
A.RELATED_DOC_NUM = B.DOC_NUM
A.DOC_TYPE = "ST"
B.DOC_TYPE = "OT"
A.NEXT_STATUS < 999 OR B.NEXT_STATUS < 999
A.DOC_TYPE = "ST" represents a transfer order to transfer inventory from one plant to another. B.DOC_TYPE = "OT" represents a corresponding receipt of the transferred inventory at the receiving plant.
We want to get records from either table where there is an ST/OT pair where either or both entries are not closed (i.e. NEXT_STATUS < 999).
I am assuming that I need to use a FULL OUTER join to accomplish this. If this is the wrong assumption, please let me know what I should be doing instead.
UPDATE (11/30/2021):
I believe that #Caius Jard is correct in that this does not need to be an outer join. There should always be an ST/OT pair.
With that I have written my query as follows:
SELECT <columns>
FROM A LEFT JOIN B
ON
A.RELATED_DOC_NUM = B.DOC_NUM
WHERE
A.DOC_TYPE IN ('ST') AND
B.DOC_TYPE IN ('OT') AND
(A.NEXT_STATUS < 999 OR B.NEXT_STATUS < 999)
Does this make sense?
UPDATE 2 (11/30/2021):
The reality is that these are DB2 database tables being used by the JD Edwards ERP application. The only way I know of to see the table definitions is by using the web site http://www.jdetables.com/, entering the table ID and hitting return to run the search. It comes back with a ton of information about the table and its columns.
Table A is really F4211 and table B is really F4311.
Right now, I've simplified the query to keep it simple and keep variables to a minimum. This is what I have currently:
SELECT CAST(F4211.SDDOCO AS VARCHAR(8)) AS SO_NUM,
F4211.SDRORN AS RELATED_PO,
F4211.SDDCTO AS SO_DOC_TYPE,
F4211.SDNXTR AS SO_NEXT_STATUS,
CAST(F4311.PDDOCO AS VARCHAR(8)) AS PO_NUM,
F4311.PDRORN AS RELATED_SO,
F4311.PDDCTO AS PO_DOC_TYPE,
F4311.PDNXTR AS PO_NEXT_STATUS
FROM PROD2DTA.F4211 AS F4211
INNER JOIN PROD2DTA.F4311 AS F4311
ON F4211.SDRORN = CAST(F4311.PDDOCO AS VARCHAR(8))
WHERE F4211.SDDCTO IN ( 'ST' )
AND F4311.PDDCTO IN ( 'OT' )
The other part of the story is that I'm using a reporting package that allows you to define "virtual" views of the data. Virtual views allow the report developer to specify the SQL to use. This is the application where I am using the SQL. When I set up the SQL, there is a validation step that must be performed. It will return a limited set of results if the SQL is validated.
When I enter the query above and validate it, it says that there are no results, which makes no sense. I'm guessing the data casting is causing the issue, but not sure.
UPDATE 3 (11/30/2021):
One more twist to the story. The related doc number is not only defined as a string value, but it contains leading zeros. This is true in both tables. The main doc number (in both tables) is defined as a numeric value and therefore has no leading zeros. I have no idea why those who developed JDE would have done this, but that is what is there.
So, there are matching records between the two tables that meet the criteria, but I think I'm getting no results because when I convert the numeric to a string, it does not match, because one value is, say "12345", while the other is "00012345".
Can I pad the numeric -> string value with zeros before doing the equals check?
UPDATE 4 (12/2/2021):
Was able to finally get the query to work by converting the numeric doc num to a left zero padded string.
SELECT <columns>
FROM PROD2DTA.F4211 AS F4211
INNER JOIN PROD2DTA.F4311 AS F4311
ON F4211.SDRORN = RIGHT(CONCAT('00000000', CAST(F4311.PDDOCO AS VARCHAR(8))), 8)
WHERE F4211.SDDCTO IN ( 'ST' )
AND F4311.PDDCTO IN ( 'OT' )
AND ( F4211.SDNXTR < 999
OR F4311.PDNXTR < 999 )
You should write your query as follows:
SELECT <columns>
FROM A INNER JOIN B
ON
A.RELATED_DOC_NUM = B.DOC_NUM
WHERE
A.DOC_TYPE IN ('ST') AND
B.DOC_TYPE IN ('OT') AND
(A.NEXT_STATUS < 999 OR B.NEXT_STATUS < 999)
LEFT join is a type of OUTER join; LEFT JOIN is typically a contraction of LEFT OUTER JOIN). OUTER means "one side might have nulls in every column because there was no match". Most critically, the code as posted in the question (with a LEFT JOIN, but then has WHERE some_column_from_the_right_table = some_value) runs as an INNER join, because any NULLs inserted by the LEFT OUTER process, are then quashed by the WHERE clause
See Update 4 for details of how I resolved the "data conversion or mapping" error.

SQL - join three tables based on (different) latest dates in two of them

Using Oracle SQL Developer, I have three tables with some common data that I need to join.
Appreciate any help on this!
Please refer to https://i.stack.imgur.com/f37Jh.png for the input and desired output (table formatting doesn't work on all tables).
These tables are made up in order to anonymize them, and in reality contain other data with millions of entries, but you could think of them as representing:
Product = Main product categories in a grocery store.
Subproduct = Subcategory products to the above. Each time the table is updated, the main product category may loses or get some new suproducts assigned to it. E.g. you can see that from May to June the Pulled pork entered while the Fishsoup was thrown out.
Issues = Status of the products, for example an apple is bad if it has brown spots on it..
What I need to find is: for each P_NAME, find the latest updated set of subproducts (SP_ID and SP_NAME), and append that information with the latest updated issue status (STATUS_FLAG).
Please note that each main product category gets its set of subproducts updated at individual occasions i.e. 1234 and 5678 might be "latest updated" on different dates.
I have tried multiple queries but failed each time. I am using combos of SELECT, LEFT OUTER JOIN, JOIN, MAX and GROUP BY.
Latest attempt, which gives me the combo of the first two tables, but missing the third:
SELECT
PRODUCT.P_NAME,
SUBPRODUCT.SP_PRODUCT_ID, SUBPRODUCT.SP_NAME, SUBPRODUCT.SP_ID, SUPPRODUCT.SP_VALUE_DATE
FROM SUBPRODUCT
LEFT OUTER JOIN PRODUCT ON PRODUCT.P_ID = SUBPRODUCT.SP_PRODUCT_ID
JOIN(SELECT SP_PRODUCT_ID, MAX(SP_VALUE_DATE) AS latestdate FROM SUBPRODUCT GROUP BY SP_PRODUCT_ID) sub ON
sub.SP_PRODUCT_ID = SUBPRODUCT.SP_PRODUCT_ID AND sub.latestDate = SUBPRODUCT.SP_VALUE_DATE;
Trying to find a row with a max value is a common SQL pattern - you can do it with a join, like your example, but it's usually more clear to use a subquery or a window function.
Correlated subquery example
select
PRODUCT.P_NAME,
SUBPRODUCT.SP_PRODUCT_ID, SUBPRODUCT.SP_NAME, SUBPRODUCT.SP_ID, SUPPRODUCT.SP_VALUE_DATE,
ISSUES.STATUS_FLAG, ISSUES.STATUS_LAST_UPDATED
from PRODUCT
join SUBPRODUCT
on PRODUCT.P_ID = SUBPRODUCT.SP_PRODUCT_ID
and SUBPRODUCT.SP_VALUE_DATE = (select max(S2.SP_VALUE_DATE) as latestDate
from SUBPRODUCT S2
where S2.SP_PRODUCT_ID = SUBPRODUCT.SP_PRODUCT_ID)
join ISSUES
on ISSUES.ISSUE_ID = SUBPRODUCT.SP_ID
and ISSUES.STATUS_LAST_UPDATED = (select max(I2.STATUS_LAST_UPDATED) as latestDate
from ISSUES I2
where I2.ISSUE_ID = ISSUES.ISSUE_ID)
Window function / inline view example
select
PRODUCT.P_NAME,
S.SP_PRODUCT_ID, S.SP_NAME, S.SP_ID, S.SP_VALUE_DATE,
I.STATUS_FLAG, I.STATUS_LAST_UPDATED
from PRODUCT
join (select SUBPRODUCT.*,
max(SP_VALUE_DATE) over (partition by SP_PRODUCT_ID) as latestDate
from SUBPRODUCT) S
on PRODUCT.P_ID = S.SP_PRODUCT_ID
and S.SP_VALUE_DATE = S.latestDate
join (select ISSUES.*,
max(STATUS_LAST_UPDATED) over (partition by ISSUE_ID) as latestDate
from ISSUES) I
on I.ISSUE_ID = S.SP_ID
and I.STATUS_LAST_UPDATED = I.latestDate
This often performs a bit better, but window functions can be tricky to understand.

SQL Server - Need to SUM values in across multiple returned records

In the following query I am trying to get TotalQty to SUM across both the locations for item 6112040, but so far I have been unable to make this happen. I do need to keep both lines for 6112040 separate in order to capture the different location.
This query feeds into a Jasper ireport using something called Java.Groovy. Despite this, none of the PDFs printed yet have been either stylish or stained brown. Perhaps someone could address that issue as well, but this SUM issue takes priority
I know Gordon Linoff will get on in about an hour so maybe he can help.
DECLARE #receipt INT
SET #receipt = 20
SELECT
ent.WarehouseSku AS WarehouseSku,
ent.PalletId AS [ReceivedPallet],
ISNULL(inv.LocationName,'') AS [ActualLoc],
SUM(ISNULL(inv.Qty,0)) AS [LocationQty],
SUM(ISNULL(inv.Qty,0)) AS [TotalQty],
MAX(CAST(ent.ReceiptLineNumber AS INT)) AS [LineNumber],
MAX(ent.WarehouseLotReference) AS [WarehouseLot],
LEFT(SUM(ent.WeightExpected),7) AS [GrossWeight],
LEFT(SUM(inv.[Weight]),7) AS [NetWeight]
FROM WarehouseReceiptDetail AS det
INNER JOIN WarehouseReceiptDetailEntry AS ent
ON det.ReceiptNumber = ent.ReceiptNumber
AND det.FacilityName = ent.FacilityName
AND det.WarehouseName = ent.WarehouseName
AND det.ReceiptLineNumber = ent.ReceiptLineNumber
LEFT OUTER JOIN Inventory AS inv
ON inv.WarehouseName = det.WarehouseName
AND inv.FacilityName = det.FacilityName
AND inv.WarehouseSku = det.WarehouseSku
AND inv.CustomerLotReference = ent.CustomerLotReference
AND inv.LotReferenceOne = det.ReceiptNumber
AND ISNULL(ent.CaseId,'') = ISNULL(inv.CaseId,'')
WHERE
det.WarehouseName = $Warehouse
AND det.FacilityName = $Facility
AND det.ReceiptNumber = #receipt
GROUP BY
ent.PalletId
, ent.WarehouseSku
, inv.LocationName
, inv.Qty
, inv.LotReferenceOne
ORDER BY ent.WarehouseSku
The lines I need partially coalesced are 4 and 5 in the above return.
Create a second dataset with a subquery and join to that subquery - you can extrapolate from the following to apply to your situation:
First the Subquery:
SELECT
WarehouseSku,
SUM(Qty)
FROM
Inventory
GROUP BY
WarehouseSku
Now apply to your query - insert into the FROM clause:
...
LEFT JOIN (
SELECT
WarehouseSKU,
SUM(Qty)
FROM
Inventory
GROUP BY
WarehouseSKU
) AS TotalQty
ON Warehouse.WarehouseSku = TotalQty.WarehouseSku
Without seeing the actual schema DDL it is hard to know the exact cardinality, but I think this will point you in the right direction.

Complicated Calculation Using Oracle SQL

I have created a database for an imaginary solicitors, my last query to complete is driving me insane. I need to work out the total a solicitor has made in their career with the company, I have time_spent and rate to multiply and special rate to add. (special rate is a one off charge for corporate contracts so not many cases have them). the best I could come up with is the code below. It does what I want but only displays the solicitors working on a case with a special rate applied to it.
I essentially want it to display the result of the query in a table even if the special rate is NULL.
I have ordered the table to show the highest amount first so i can use ROWNUM to only show the top 10% earners.
CREATE VIEW rich_solicitors AS
SELECT notes.time_spent * rate.rate_amnt + special_rate.s_rate_amnt AS solicitor_made,
notes.case_id
FROM notes,
rate,
solicitor_rate,
solicitor,
case,
contract,
special_rate
WHERE notes.solicitor_id = solicitor.solicitor_id
AND solicitor.solicitor_id = solicitor_rate.solicitor_id
AND solicitor_rate.rate_id = rate.rate_id
AND notes.case_id = case.case_id
AND case.contract_id = contract.contract_id
AND contract.contract_id = special_rate.contract_id
ORDER BY -solicitor_made;
Query:
SELECT *
FROM rich_solicitors
WHERE ROWNUM <= (SELECT COUNT(*)/10
FROM rich_solicitors)
I'm suspicious of your use of ROWNUM in your example query...
Oracle9i+ supports analytic functions, like ROW_NUMBER and NTILE, to make queries like your example easier. Analytics are also ANSI, so the syntax is consistent when implemented (IE: Not on MySQL or SQLite). I re-wrote your query as:
SELECT x.*
FROM (SELECT n.time_spent * r.rate_amnt + COALESCE(spr.s_rate_amnt, 0) AS solicitor_made,
n.case_id,
NTILE(10) OVER (ORDER BY solicitor_made) AS rank
FROM NOTES n
JOIN SOLICITOR s ON s.solicitor_id = n.solicitor_id
JOIN SOLICITOR_RATE sr ON sr.solicitor_id = s.solicitor_id
JOIN RATE r ON r.rate_id = sr.rate_id
JOIN CASE c ON c.case_id = n.case_id
JOIN CONTRACT cntrct ON cntrct.contract_id = c.contract_id
LEFT JOIN SPECIAL_RATE spr ON spr.contract_id = cntrct.contract_id) x
WHERE x.rank = 1
If you're new to SQL, I recommend using ANSI-92 syntax. Your example uses ANSI-89, which doesn't support OUTER JOINs and is considered deprecated. I used a LEFT OUTER JOIN against the SPECIAL_RATE table because not all jobs are likely to have a special rate attached to them.
It's also not recommended to include an ORDER BY in views, because views encapsulate the query -- no one will know what the default ordering is, and will likely include their own (waste of resources potentially).
you need to left join in the special rate.
If I recall the oracle syntax is like:
AND contract.contract_id = special_rate.contract_id (+)
but now special_rate.* can be null so:
+ special_rate.s_rate_amnt
will need to be:
+ coalesce(special_rate.s_rate_amnt,0)