Complex Left Outer Joins in Oracle, converting to PostgreSQL

I have this enormous SQL statement, from an Oracle+SAS environment. I get most of it, but what is confusing me most are the Left Outer Joins/plus signs in the WHERE clause. I need to convert this to Postgres. I can handle the first part of the code, it's the joins that confuse me.
SELECT
--A bunch of columns from several tables
FROM prd_acct_cmp_grp pacg,
product_acct pa,
customer_acct ca,
(SELECT DISTINCT member_id, group_id
FROM group_members
WHERE group_id IN (33158, 27156, 35376, 36217)) gm,
prd_acct_acct_cmp pac,
pacg_usage pu,
sales_hierarchy sh,
sales_region sr
WHERE pacg.component_group_cd = 'AN'
AND pacg.component_grp_val IN (%s) --string that is added in later
AND pacg.product_account_id = pa.product_account_id
AND pa.customer_acct_id = ca.customer_acct_id
AND ca.customer_acct_id = gm.member_id(+)
AND pacg.product_account_id = pac.product_account_id
AND pacg.occurencce_number = pac.occurence_number
AND pac.prcmp_code = 'USAGE'
AND pacg.component_group_cd = pu.component_group_cd(+)
AND pacg.component_grp_val = pu.component_grp_val(+)
AND ca.primary_sales_rep = sh.sales_rep_id(+)
AND sh.region_cd = sr.sales_region_code(+)
I know how to do simple joins when converting from Oracle. However, this one compares the same tables in several join conditions, mixed in with many conditions that are plain filters rather than joins. So how would the joins be written? And would I need an additional WHERE clause at the end of the statement?
Thanks.

Try this:
SELECT
--A bunch of columns from several tables
FROM prd_acct_cmp_grp pacg
JOIN product_acct pa
ON pacg.product_account_id = pa.product_account_id
JOIN customer_acct ca
ON pa.customer_acct_id = ca.customer_acct_id
JOIN prd_acct_acct_cmp pac
ON pacg.product_account_id = pac.product_account_id
AND pacg.occurencce_number = pac.occurence_number
AND pac.prcmp_code = 'USAGE'
LEFT JOIN (SELECT DISTINCT member_id, group_id
FROM group_members
WHERE group_id IN (33158, 27156, 35376, 36217)) gm
ON ca.customer_acct_id = gm.member_id
LEFT JOIN sales_hierarchy sh
ON ca.primary_sales_rep = sh.sales_rep_id
LEFT JOIN sales_region sr
ON sh.region_cd = sr.sales_region_code
LEFT JOIN pacg_usage pu
ON pacg.component_group_cd = pu.component_group_cd
AND pacg.component_grp_val = pu.component_grp_val
WHERE pacg.component_group_cd = 'AN'
AND pacg.component_grp_val IN (%s) --string that is added in later
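
For anyone who wants to check the (+) rule on a small scale, here is a minimal sketch using Python's built-in sqlite3 module rather than Oracle or PostgreSQL. The two tables are cut-down stand-ins for customer_acct and sales_hierarchy, and the rows are invented for illustration. The table that carried the (+) marker becomes the right-hand side of the LEFT JOIN, so accounts without a matching sales rep survive with NULLs.

```python
import sqlite3

# Hypothetical miniature of two of the tables; columns and rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_acct (customer_acct_id INTEGER, primary_sales_rep INTEGER);
    CREATE TABLE sales_hierarchy (sales_rep_id INTEGER, region_cd TEXT);
    INSERT INTO customer_acct VALUES (1, 10), (2, 99);
    INSERT INTO sales_hierarchy VALUES (10, 'EAST');
""")

# Oracle: WHERE ca.primary_sales_rep = sh.sales_rep_id(+)
# ANSI:   the table marked with (+) goes on the right of a LEFT JOIN.
rows = conn.execute("""
    SELECT ca.customer_acct_id, sh.region_cd
    FROM customer_acct ca
    LEFT JOIN sales_hierarchy sh
      ON ca.primary_sales_rep = sh.sales_rep_id
    ORDER BY ca.customer_acct_id
""").fetchall()

# Account 2 has no matching sales rep, so its region_cd comes back as None.
print(rows)
```

The plain filters on pacg can stay in the WHERE clause because pacg is never on the optional side of an outer join.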


How to subtract from another table in SQL

SELECT
COUNT(ca.Plate) as 'OccupiedElectricSlots'
FROM cities C
JOIN ParkingHouses HS on C.Id = hs.CityId
JOIN ParkingSlots PS on HS.Id = ps.ParkingHouseId
LEFT JOIN Cars Ca on PS.Id = Ca.ParkingSlotsId
WHERE ps.ElectricOutlet = 1
GROUP BY hs.HouseName, C.CityName
SELECT
MAX(Ps.SlotNumber) as 'ParkingSlotTotal'
,MAX(PS.SlotNumber) - Count(ca.Plate) as 'FreeSlots'
,SUM(CAST(PS.ElectricOutlet AS INT)) as 'ElectricOutlet'
,Hs.HouseName
,C.CityName
FROM Cities C
JOIN ParkingHouses HS on C.Id = hs.CityId
JOIN ParkingSlots PS on HS.Id = ps.ParkingHouseId
LEFT JOIN Cars Ca on PS.Id = Ca.ParkingSlotsId
GROUP BY hs.HouseName, C.CityName
How can I subtract the first query's numbers from the second one's?
I want to see how many slots with an electric outlet are free.
Like this: column ElectricOutlet - OccupiedElectricSlots = result
I'm quite new to SQL, but I have tried OUTER APPLY (I don't fully understand it), and I tried joining both queries together. I tried different WHERE conditions, but I'm stuck at the moment.
Your queries are almost identical as far as I can see. You can change your first query to:
SELECT COUNT(CASE WHEN ps.ElectricOutlet = 1 THEN ca.Plate END) as 'OccupiedElectricSlots'
FROM cities C
JOIN ParkingHouses HS on C.Id = hs.CityId
JOIN ParkingSlots PS on HS.Id = ps.ParkingHouseId
LEFT JOIN Cars Ca on PS.Id = Ca.ParkingSlotsId
GROUP BY hs.HouseName, C.CityName
I.e., instead of filtering on ps.ElectricOutlet you just ignore those rows in COUNT. Now you can just:
SELECT
[...]
,SUM(CAST(PS.ElectricOutlet AS INT)) - COUNT(CASE WHEN ...) AS result
[...]
FROM Cities C
JOIN ParkingHouses HS
ON C.Id = hs.CityId
JOIN ParkingSlots PS
ON HS.Id = ps.ParkingHouseId
LEFT JOIN Cars Ca
ON PS.Id = Ca.ParkingSlotsId
GROUP BY hs.HouseName, C.CityName
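
To see the conditional COUNT at work, here is a small sketch run through Python's sqlite3 module (not SQL Server, so the quoting of aliases differs). The two tables are trimmed to the columns that matter and the four slots are made-up data: three have an outlet, and one of those three is occupied.

```python
import sqlite3

# Invented sample data: slots 1, 2 and 4 have an outlet; cars sit in slots 1 and 3.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ParkingSlots (Id INTEGER, ElectricOutlet INTEGER);
    CREATE TABLE Cars (Plate TEXT, ParkingSlotsId INTEGER);
    INSERT INTO ParkingSlots VALUES (1, 1), (2, 1), (3, 0), (4, 1);
    INSERT INTO Cars VALUES ('ABC123', 1), ('XYZ789', 3);
""")

# CASE yields NULL for non-electric slots, and COUNT skips NULLs,
# so only cars parked in electric slots are counted.
row = conn.execute("""
    SELECT SUM(ps.ElectricOutlet) AS ElectricOutlet,
           COUNT(CASE WHEN ps.ElectricOutlet = 1 THEN ca.Plate END) AS OccupiedElectricSlots,
           SUM(ps.ElectricOutlet)
             - COUNT(CASE WHEN ps.ElectricOutlet = 1 THEN ca.Plate END) AS FreeElectricSlots
    FROM ParkingSlots ps
    LEFT JOIN Cars ca ON ps.Id = ca.ParkingSlotsId
""").fetchone()

print(row)
```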
The MINUS operator subtracts the result set of the second SELECT query from the result set of the first SELECT query.
MINUS compares the rows returned by the two queries, using the selected columns, and returns only the rows that exist in the first result but not in the second.
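
Note that this is a set difference on rows, not arithmetic on counts. A minimal sketch, assuming two throwaway tables a and b that are not part of the question: MINUS is Oracle's spelling, and the standard keyword EXCEPT is used here because SQLite (driven through Python's sqlite3 module) does not accept MINUS.

```python
import sqlite3

# Hypothetical tables a and b, invented purely to show the set difference.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (x INTEGER);
    CREATE TABLE b (x INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (2);
""")

# Rows of the first SELECT that do not appear in the second SELECT.
rows = conn.execute("""
    SELECT x FROM a
    EXCEPT
    SELECT x FROM b
    ORDER BY x
""").fetchall()

print(rows)
```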

Speed up SQL query performance with nested queries

Could anyone help me speed this query up? It currently takes 17 minutes to run, but it does return the correct data, and it populates a subform in MS Access. Functions in the rest of the VBA are declared as Long to try to speed things up further.
Here's the full query:
SELECT lots of things
FROM (((((((((((((((ngstest
INNER JOIN patients
ON ngstest.internalpatientid = patients.internalpatientid)
INNER JOIN referral
ON ngstest.referralid = referral.referralid)
INNER JOIN checker
ON ngstest.bookby = checker.check1id)
INNER JOIN ngspanel
ON ngstest.ngspanelid = ngspanel.ngspanelid)
LEFT JOIN ngspanel AS ngspanel_1
ON ngstest.ngspanelid_b = ngspanel_1.ngspanelid)
INNER JOIN status
ON ngstest.statusid = status.statusid)
INNER JOIN dbo_patient_table
ON patients.patientid = dbo_patient_table.patienttrustid)
LEFT JOIN dna
ON ngstest.dna = dna.dnanumber)
INNER JOIN status AS status_1
ON patients.s_statusoverall = status_1.statusid)
LEFT JOIN gw_gendertable
ON dbo_patient_table.genderid = gw_gendertable.genderid)
LEFT JOIN ngswesbatch
ON ngstest.wesbatch = ngswesbatch.ngswesbatchid)
LEFT JOIN checker AS checker_1
ON ngstest.check1id = checker_1.check1id)
LEFT JOIN checker AS checker_2
ON ngstest.check2id = checker_2.check1id)
LEFT JOIN checker AS checker_3
ON ngstest.check3id = checker_3.check1id)
LEFT JOIN ngspanel AS ngspanel_2
ON ngstest.ngspanelid_c = ngspanel_2.ngspanelid)
LEFT JOIN checker AS checker_4
ON ngstest.check4id = checker_4.check1id
WHERE ((ngstest.referralid IN
(SELECT referralid FROM referral
WHERE grouptypeid = 14)
AND ngstest.ngstestid IN
(SELECT ngstest.ngstestid
FROM ngsanalysis
INNER JOIN ngstest
ON ngsanalysis.ngstestid = ngstest.ngstestid
WHERE ngsanalysis.pedigree = 3302) )
AND status.statusid = 1202218800)
ORDER BY ngstest.priority,
ngstest.daterequested;
The two nested queries are strings from elsewhere in the code, so they are spliced in from the VBA as " & includereferrals & " And " & ParentsStatusesFilter & "
They are:
ParentsStatusesFilter = "NGSTest.NGSTestID in
(SELECT NGSTest.NGSTestID
FROM NGSAnalysis
INNER JOIN NGSTest
ON NGSAnalysis.NGSTestID = NGSTest.NGSTestID
WHERE NGSAnalysis.Pedigree IN (3302,3303,3304))"
And
includereferrals = "NGSTest.ReferralID IN
(SELECT referralid FROM referral WHERE referral.grouptypeid = 14)"
The query needs to remain readable (and therefore editable) so can't use things like Distinct, Group By or contain any Unions. Have tried Exists instead of In for the nested queries but that stops it from actually filtering the results.
WHERE EXISTS (SELECT NGSTest.NGSTestID
FROM NGSAnalysis
INNER JOIN NGSTest
ON NGSAnalysis.NGSTestID = NGSTest.NGSTestID
WHERE NGSAnalysis.Pedigree IN (3302,3303,3304)
The EXISTS clause you have there isn't tied to the outer query, so it behaves much like adding 1 = 1 to the WHERE clause. I took your WHERE clause and converted it. It should look something like this...
WHERE EXISTS (
SELECT referralid
FROM referral
WHERE grouptypeid = 14 AND ngstest.referralid = referral.referralid)
AND EXISTS (
SELECT ngsanalysis.ngstestid
FROM ngsanalysis
WHERE ngsanalysis.pedigree IN (3302,3303,3304) AND ngstest.ngstestid = ngsanalysis.ngstestid
)
AND status.statusid = 1202218800
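
Here is a small sketch of why the uncorrelated EXISTS "stops filtering". It runs in SQLite through Python's sqlite3 module rather than Access, the tables are cut down to two columns, and the rows are invented. In the first query the subquery brings its own copy of ngstest (aliased t2 here), so it is either true for every outer row or false for every outer row. In the second, the subquery refers back to the outer row.

```python
import sqlite3

# Invented sample data: only test 1 has an analysis with a matching pedigree.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ngstest (ngstestid INTEGER);
    CREATE TABLE ngsanalysis (ngstestid INTEGER, pedigree INTEGER);
    INSERT INTO ngstest VALUES (1), (2), (3);
    INSERT INTO ngsanalysis VALUES (1, 3302), (2, 9999);
""")

# Uncorrelated: the subquery never mentions the outer alias t.
uncorrelated = conn.execute("""
    SELECT t.ngstestid
    FROM ngstest t
    WHERE EXISTS (SELECT 1
                  FROM ngsanalysis a
                  JOIN ngstest t2 ON a.ngstestid = t2.ngstestid
                  WHERE a.pedigree IN (3302, 3303, 3304))
    ORDER BY t.ngstestid
""").fetchall()

# Correlated: the subquery is evaluated against each outer row.
correlated = conn.execute("""
    SELECT t.ngstestid
    FROM ngstest t
    WHERE EXISTS (SELECT 1
                  FROM ngsanalysis a
                  WHERE a.pedigree IN (3302, 3303, 3304)
                    AND a.ngstestid = t.ngstestid)
    ORDER BY t.ngstestid
""").fetchall()

print(uncorrelated)  # every test comes back
print(correlated)    # only the test with a matching analysis
```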
Adding EXISTS will speed it up a bit, but the bulk of the slowness is the left joins. Access does not handle left joins as well as SQL Server does. Change all your joins to inner joins and you will see the query run very fast. This is obviously not ideal, since some relationships are optional. What I have done to get around this is add a default record that stands in for a null relationship.
Here is what that looks like for you: In the checker table you could add a record that represents a null value. So put a record into the checker table with check1id of -1 or 0. Then default check1id, check2id, check3id on ngstest to -1 or 0. You will need to do that type of thing for all tables you need to left join on.

SQL select results not appearing if a value is null

I am building a complex select statement, and when one of my values (pcf_auto_key) is null it will not display any values for that header entry.
select c.company_name, h.prj_number, h.description, s.status_code, h.header_notes, h.cm_udf_001, h.cm_udf_002, h.cm_udf_008, l.classification_code
from project_header h, companies c, project_status s, project_classification l
where exists
(select company_name from companies where h.cmp_auto_key = c.cmp_auto_key)
and exists
(select status_code from project_status s where s.pjs_auto_key = h.pjs_auto_key)
and exists
(select classification_code from project_classification where h.pcf_auto_key = l.pcf_auto_key)
and pjm_auto_key = 11
--and pjt_auto_key = 10
and c.cmp_auto_key = h.cmp_auto_key
and h.pjs_auto_key = s.pjs_auto_key
and l.pcf_auto_key = h.pcf_auto_key
and s.status_type = 'O'
How does my select statement look? Is this an appropriate way of pulling info from other tables?
This is an Oracle database, and I am using SQL Developer.
Assuming you want to show all the data that you can find but display the classification as blank when there is no match in that table, you can use a left outer join; which is much clearer with explicit join syntax:
select c.company_name, h.prj_number, h.description, s.status_code, h.header_notes,
h.cm_udf_001, h.cm_udf_002, h.cm_udf_008, l.classification_code
from project_header h
join companies c on c.cmp_auto_key = h.cmp_auto_key
join project_status s on s.pjs_auto_key = h.pjs_auto_key
left join project_classification l on l.pcf_auto_key = h.pcf_auto_key
where pjm_auto_key = 11
and s.status_type = 'O'
I've taken out the exists conditions as they just seem to be replicating the join conditions.
If you might not have matching data in any of the other tables, you can turn the other inner joins into outer joins in the same way. Be aware, though, that if you outer join to project_status you will need to move the status_type check into the join condition as well, or Oracle will effectively convert that back into an inner join.
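
The effect of where the status_type check lives can be sketched on a toy version of two of the tables. This runs in SQLite through Python's sqlite3 module rather than Oracle, the tables are trimmed to two columns each, and the rows are invented. The behaviour shown is standard outer-join semantics, not something Oracle-specific.

```python
import sqlite3

# Invented sample data: P1 is open, P2 is closed, P3 has no status at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project_header (prj_number TEXT, pjs_auto_key INTEGER);
    CREATE TABLE project_status (pjs_auto_key INTEGER, status_type TEXT);
    INSERT INTO project_header VALUES ('P1', 1), ('P2', 2), ('P3', NULL);
    INSERT INTO project_status VALUES (1, 'O'), (2, 'C');
""")

# Filter in WHERE: NULL-extended rows fail the test, so the outer join
# behaves like an inner join.
filter_in_where = conn.execute("""
    SELECT h.prj_number, s.status_type
    FROM project_header h
    LEFT JOIN project_status s ON s.pjs_auto_key = h.pjs_auto_key
    WHERE s.status_type = 'O'
    ORDER BY h.prj_number
""").fetchall()

# Filter in ON: every header row is kept; non-matching statuses become NULL.
filter_in_on = conn.execute("""
    SELECT h.prj_number, s.status_type
    FROM project_header h
    LEFT JOIN project_status s
      ON s.pjs_auto_key = h.pjs_auto_key
     AND s.status_type = 'O'
    ORDER BY h.prj_number
""").fetchall()

print(filter_in_where)
print(filter_in_on)
```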
Read more about the different kinds of joins.

JOIN syntax and order for multiple tables

SQL Gurus,
I have a query that uses the "old" style of join syntax as follows using 7 tables (table and column names changed to protect the innocent), as shown below:
SELECT v1_col, p1_col
FROM p1_tbl, p_tbl, p2_tbl, p3_tbl, v1_tbl, v2_tbl, v3_tbl
WHERE p1_code = 1
AND v1_code = 1
AND p1_date >= v1_date
AND p_uid = p1_uid
AND p2_uid = p1_uid AND p2_id = v2_id
AND p3_uid = p1_uid AND p3_id = v3_id
AND v2_uid = v1_uid
AND v3_uid = v1_uid
The query works just fine and produces the results it is supposed to, but as an academic exercise, I tried to rewrite the query using the more standard JOIN syntax, for example, below is one version I tried:
SELECT V1.v1_col, P1.p1_col
FROM p1_tbl P1, v1_tbl V1
JOIN p_tbl P ON ( P.p_uid = P1.p1_uid )
JOIN p2_tbl P2 ON ( P2.p2_uid = P1.p1_uid AND P2.p2_id = V2.v2_id )
JOIN p3_tbl P3 ON ( P3.p3_uid = P1.p1_uid AND P3.p3_id = V3.v3_id )
JOIN v2_tbl V2 ON ( V2.v2_uid = V1.v1_uid )
JOIN v3_tbl V3 ON ( V3.v3_uid = V1.v1_uid )
WHERE P1.p1_code = 1
AND V1.v1_code = 1
AND P1.p1_date >= V1.v1_date
But, no matter how I arrange the JOINs (using MS SQL 2008 R2), I keep running into the error:
The Multi-part identifier "col-name" could not be bound,
where "col-name" varies depending on the order of the JOINs I am attempting...
Does anyone have any good examples of how to use the JOIN syntax with this number of tables?
Thanks in advance!
When you use JOIN-syntax you can only access columns from tables in your current join or previous joins. In fact it's easier to write the old syntax, but it's more error-prone, e.g. you can easily forget a join-condition.
This should be what you want.
SELECT v1_col, p1_col
FROM p1_tbl
JOIN v1_tbl ON p1_date >= v1_date
JOIN v2_tbl ON v2_uid = v1_uid
JOIN v3_tbl ON v3_uid = v1_uid
JOIN p_tbl ON p_uid = p1_uid
JOIN p2_tbl ON p2_uid = p1_uid AND p2_id = v2_id
JOIN p3_tbl ON p3_uid = p1_uid AND p3_id = v3_id
WHERE p1_code = 1
AND v1_code = 1
You are not qualifying the columns with table aliases in your joins, so the server can't tell which column comes from which table. Try something like:
SELECT a.p_uid, b.p1_col
FROM p1_tbl b
JOIN p_tbl a ON a.p_uid = b.p1_uid
WHERE b.p1_code = 1
From your query above, I am assuming a naming convention in which p2_uid comes from p2_tbl. Below is my best interpretation of the WHERE joins rewritten as INNER joins. Each table is joined before any ON clause refers to it.
SELECT
v1_tbl.v1_col, p1_tbl.p1_col
FROM
p1_tbl
INNER JOIN v1_tbl
ON p1_tbl.p1_date >= v1_tbl.v1_date
INNER JOIN p_tbl
ON p_tbl.p_uid = p1_tbl.p1_uid
INNER JOIN v2_tbl
ON v2_tbl.v2_uid = v1_tbl.v1_uid
INNER JOIN p2_tbl
ON p2_tbl.p2_uid = p1_tbl.p1_uid
AND p2_tbl.p2_id = v2_tbl.v2_id
INNER JOIN v3_tbl
ON v3_tbl.v3_uid = v1_tbl.v1_uid
INNER JOIN p3_tbl
ON p3_tbl.p3_uid = p1_tbl.p1_uid
AND p3_tbl.p3_id = v3_tbl.v3_id
WHERE
p1_tbl.p1_code = 1
AND
v1_tbl.v1_code = 1
Some general points I have found useful in SQL statements with many joins:
Always fully qualify the names, i.e. don't use ID; rather use TableName.ID.
Don't use aliases unless they carry meaning (e.g. joining a table to itself, where aliasing is needed).

Selecting the first row out of many sql joins

Alright, so I'm putting together a path to select a revision of a particular novel:
SELECT Catalog.WbsId, Catalog.Revision, NovelRevision.Revision
FROM Catalog, BookInCatalog
INNER JOIN NovelMaster
INNER JOIN HasNovelRevision
INNER JOIN NovelRevision
ON HasNovelRevision.right = NovelRevision.obid
ON HasNovelRevision.Left=NovelMaster.obid
ON NovelMaster.obid = BookInCatalog.Right
WHERE Catalog.obid = BookInCatalog.Left;
This returns all revisions that are in the Novel Master for each Novel Master that is in the catalog.
The problem is, I only want the FIRST revision of each novel master in the catalog. How do I go about doing that? Oh, and btw: my flavor of sql is hobbled, as many others are, in that it does not support the LIMIT Function.
****UPDATE****
So using answer 1 as a guide I upgraded my query to this:
SELECT Catalog.wbsid
FROM Catalog, BookInCatalog, NovelVersion old, NovelMaster, HasNovelRevision
LEFT JOIN NovelVersion newRevs
ON old.revision < newRevs.revision AND HasNovelRevision.right = newRevs.obid
LEFT JOIN HasNovelRevision NewerHasNovelRevision
ON NewerHasNovelRevision.right = newRevs.obid
LEFT JOIN NovelMaster NewTecMst
ON NewerHasNovelRevision.left = NewTecMst.obid
WHERE Catalog.programName = 'E18' AND Catalog.obid = BookInCatalog.Left
AND BookInCatalog.right = NewTecMst.obid AND newRevs.obid = null
ORDER BY newRevs.documentname;
I get an error on the fourth line:
"old"."revision": invalid identifier
SOLUTION
Well, I had to go to another forum, but I got a working solution:
select nr1.title, nr1.revision
from novelrevision nr1
where nr1.revision in (select min(revision) from novelrevision nr2
where nr1.title = nr2.title)
So this solution uses the JOIN mentioned by the OA, along with the IN keyword to match it to a revision.
Something like this might work, it's called an exclusive left join:
....
INNER JOIN NovelRevision
ON HasNovelRevision.right = NovelRevision.obid
LEFT JOIN NovelRevision as NewerRevision
ON HasNovelRevision.right = NewerRevision.obid
AND NewerRevision.revision > NovelRevision.revision
...
WHERE NewerRevision.obid is null
The where clause filters out rows for which a newer revision exists. This effectively limits the query to the newest revisions.
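
A stripped-down sketch of this pattern, assuming a single novelrevision table with title and revision columns (the names are borrowed from the asker's posted solution, and the rows are invented). It runs in SQLite through Python's sqlite3 module. Flipping the comparison from > to < keeps the first revision instead of the newest, which is what the question asked for.

```python
import sqlite3

# Invented sample data: novel A has revisions 1 and 2, novel B has 3 and 5.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE novelrevision (title TEXT, revision INTEGER);
    INSERT INTO novelrevision VALUES ('A', 1), ('A', 2), ('B', 3), ('B', 5);
""")

# Keep a row only when no newer revision of the same title exists.
newest = conn.execute("""
    SELECT nr.title, nr.revision
    FROM novelrevision nr
    LEFT JOIN novelrevision newer
      ON newer.title = nr.title
     AND newer.revision > nr.revision
    WHERE newer.title IS NULL
    ORDER BY nr.title
""").fetchall()

# Same pattern with the comparison flipped: keep a row only when no
# older revision of the same title exists.
first = conn.execute("""
    SELECT nr.title, nr.revision
    FROM novelrevision nr
    LEFT JOIN novelrevision older
      ON older.title = nr.title
     AND older.revision < nr.revision
    WHERE older.title IS NULL
    ORDER BY nr.title
""").fetchall()

print(newest)
print(first)
```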
In response to your comment, you could filter out only revisions that have a newer revision in the same NovelMaster. For example:
....
LEFT JOIN NovelRevision as NewerRevision
ON HasNovelRevision.right = NewerRevision.obid
AND NewerRevision.revision > NovelRevision.revision
LEFT JOIN HasNovelRevision as NewerHasNovelRevision
ON NewerHasNovelRevision.right = NewerRevision.obid
LEFT JOIN NovelMaster as NewerNovelMaster
ON NewerHasNovelRevision.left = NewerNovelMaster.obid
AND NewerNovelMaster.obid = NovelMaster.obid
....
WHERE NewerNovelMaster.obid is null
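P.S. Grouping JOINs and following them with a group of ON conditions is legal in standard SQL (the ONs pair up with the JOINs from the innermost outward), but it is hard to read. I would put each ON directly after its JOIN.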
You can use a CTE. Check this:
WITH NovelRevesion_CTE(obid,RevisionDate)
AS
(
SELECT obid,MIN(RevisionDate) RevisionDate FROM NovelRevision Group by obid
)
SELECT Catalog.WbsId, Catalog.Revision, NovelRevision.Revision
FROM Catalog, BookInCatalog
INNER JOIN NovelMaster
INNER JOIN HasNovelRevision
INNER JOIN NovelRevision
INNER JOIN NovelRevesion_CTE
ON NovelRevesion_CTE.obid = NovelRevision.obid
ON HasNovelRevision.[right] = NovelRevision.obid
ON HasNovelRevision.[Left]=NovelMaster.obid
ON NovelMaster.obid = BookInCatalog.[Right]
WHERE Catalog.obid = BookInCatalog.[Left];
First it selects the first revision written for each novel (assuming obid is the novel's foreign key) by taking the smallest date in each group.
Then it adds that result as a join in your query.