How can this be translated into postgresql? - sql

I am fairly new to sql in general, and I am trying to work with a database which is extremely large. Now, on the website's examples, there is this query
SELECT m.chembl_id AS compound_chembl_id,
s.canonical_smiles,
r.compound_key,
NVL(TO_CHAR(d.pubmed_id),d.doi) AS pubmed_id_or_doi,
a.description AS assay_description, act.standard_type,
act.standard_relation,
act.standard_value,
act.standard_units,
act.activity_comment
FROM compound_structures s,
molecule_dictionary m,
compound_records r,
docs d,
activities act,
assays a,
target_dictionary t
WHERE s.molregno (+) = m.molregno
AND m.molregno = r.molregno
AND r.record_id = act.record_id
AND r.doc_id = d.doc_id
AND act.assay_id = a.assay_id
AND a.tid = t.tid
AND t.chembl_id = 'CHEMBL1827';
because of this (+) and this NVL I assumed it is Oracle. I am working on pgadmin4 and all my attempts at translating this after doing some research, resulted in errors. Example of my attempt and the error given bellow
SELECT m.chembl_id AS compound_chembl_id,
s.canonical_smiles,
r.compound_key,
COALESCE(CAST(d.pubmed_id AS varchar),d.doi) AS pubmed_id_or_doi,
a.description AS assay_description,
act.standard_type,
act.standard_relation,
act.standard_value,
act.standard_units,
act.activity_comment
FROM compound_structures s,
compound_records r,
docs d,
activities act,
assays a,
target_dictionary t
LEFT OUTER JOIN molecule_dictionary m ON s.molregno = m.molregno
WHERE m.molregno = r.molregno
AND r.record_id = act.record_id
AND r.doc_id = d.doc_id
AND act.assay_id = a.assay_id
AND a.tid = t.tid
AND t.chembl_id = 'CHEMBL1827';
ERROR: invalid reference to FROM-clause entry for table "s"
LINE 16: LEFT OUTER JOIN molecule_dictionary m ON s.molregno = m.molr...
^
HINT: There is an entry for table "s", but it cannot be referenced from this part of the
query.
SQL state: 42P01
Character: 492
Could someone help me?

Don't mix the old, ancient implicit joins and explicit JOIN operator:
SELECT m.chembl_id AS compound_chembl_id,
s.canonical_smiles,
r.compound_key,
COALESCE(CAST(d.pubmed_id AS varchar),d.doi) AS pubmed_id_or_doi,
a.description AS assay_description,
act.standard_type,
act.standard_relation,
act.standard_value,
act.standard_units,
act.activity_comment
FROM compound_structures s
LEFT JOIN molecule_dictionary m ON s.molregno = m.molregno
JOIN compound_records r ON m.molregno = r.molregno
JOIN docs d ON r.doc_id = d.doc_id
JOIN activities act ON r.record_id = act.record_id
JOIN assays a ON act.assay_id = a.assay_id
JOIN target_dictionary t ON a.tid = t.tid
WHERE t.chembl_id = 'CHEMBL1827';
I hope I got the direction of the outer join correct - I haven't used that Oracle syntax for decades (note that even Oracle recommends to stop using it).
But the inner join on the following tables effectively turns that outer join back into an inner join - I don't think the Oracle syntax changes that (meaning: I think that attempt on an outer join in the original query was wrong to begin with). So maybe you can simplify the LEFT JOIN to a JOIN. I am not sure if that outer join actually makes sense in the Oracle query.

Consider:
SELECT
m.chembl_id AS compound_chembl_id,
s.canonical_smiles,
r.compound_key,
COALESCE((d.pubmed_id)::text, d.doi) AS pubmed_id_or_doi,
a.description AS assay_description,
act.standard_type,
act.standard_relation,
act.standard_value,
act.standard_units,
act.activity_comment
FROM
compound_structures s
INNER JOIN molecule_dictionary m ON s.molregno = m.molregno
INNER JOIN compound_records r ON m.molregno = r.molregno
INNER JOIN docs d ON r.doc_id = d.doc_id
INNER JOIN activities act ON r.record_id = act.record_id
INNER JOIN assays a ON act.assay_id = a.assay_id
INNER JOIN target_dictionary t ON a.tid = t.tid AND t.chembl_id = 'CHEMBL1827'
Rationale:
use standard, explicit joins everywhere; old school joins have been deprecated 20 years ago!
COALESCE() is more standard than NVL()
a LEFT JOIN followed by INNER JOINs that relate to columns coming for the left table is actually equivalent to INNER JOIN

Related

I am converting following oracle query to bigquery but the result record counts are different

I am converting following oracle query to bigquery query but the results(record counts) are different, eventhough base tables involved in the query are having same number of records in both oracle and bq.
oracle :
SELECT
to_char(R_PROJECT_S.PROJECT_COPYRIGHT_YEAR),
R_PROJECT_S.PROJECT_TITLE,
to_char(R_PROJECT_S.EDITION),
R_PROJECT_S.CIRCULATION_DESC,
R_PROJECT_S.DISTRIBUTION_DESC,
R_PROJECT_S.PROJECT_ID,
DB.R_USAGE_INFO_S.OBJECT_ID,
UPPER(DB.R_INFO_S.PHOTOGRAPHER),
UPPER(DB.R_INFO_S.SOURCE_CAPTION),
ADMIN.BIC_APHEISBN00_BO_VW.BIC_ZCHEAU,
ADMIN.BIC_APHEISBN00_BO_VW.BIC_ZCHECPYR,
ADMIN.BIC_APHEISBN00_BO_VW.BIC_ZCHEED,
ADMIN.BIC_APHEISBN00_BO_VW.BIC_ZCHEPRDDE,
ADMIN.BIC_APHEISBN00_BO_VW.BIC_ZCHEGRDE,
ADMIN.BIC_APHEISBN00_BO_VW.BIC_ZCHSODE,
R_PROJECT_S.CHARGE_TO_ISBN,
ADMIN.BIC_APHEISBN00_BO_VW.BIC_ZCHEPTIT,
DB.R_INFO_S.SOURCE_NAME,
R_PROJECT_S.LANGUAGE_DESC,
R_PROJECT_S.PROJECT_FORMAT_DESC,
DB.R_USAGE_INFO_S.USAGE_ID,
DB.R_USAGE_INFO_S.PAGE,
DB.R_USAGE_INFO_S.CHAPTER,
DB.R_INFO_S.WORK_PROJECT_ID,
DB.R_INFO_S.IMAGE_TYPE_DESC,
DB.R_INFO_S.IMAGE_DESC,
DB.R_USAGE_INFO_S.PERMISSION_TYPE_DESC,
DB.R_USAGE_INFO_S.PERMISSION_STATUS_DESC,
DB.R_USAGE_INFO_S.PERMISSION_USAGE_DESC,
DB.R_USAGE_INFO_S.USAGE_LABEL,
DB.R_USAGE_INFO_S.QUOTED_COST,
DB.R_INFO_S.SOURCE_OBJECT_ID,
DB.R_USAGE_INFO_S.USAGE_TYPE_DESC,
GHEPM_TITLE_PSPP.TITLE_DESCRIPTION,
ADMIN.BIC_APHEISBN00_BO_VW.BIC_ZCHESOAB,
ADMIN.BIC_APHEISBN00_BO_VW.BIC_ZCHEGRCD
FROM
DB.R_PROJECT_S_VW R_PROJECT_S,
DB.R_USAGE_INFO_S,
DB.R_INFO_S,
ADMIN.BIC_APHEISBN00_BO_VW,
DB.GHEPM_TITLE GHEPM_TITLE_PSPP
WHERE
( R_PROJECT_S.PROJECT_ID=DB.R_USAGE_INFO_S.PROJECT_ID(+)
)
AND ( DB.R_USAGE_INFO_S.OBJECT_ID=DB.R_INFO_S.OBJECT_ID )
AND ( R_PROJECT_S.PROJECT_ID=ADMIN.BIC_APHEISBN00_BO_VW.BIC_ZCHETIIS(+) )
AND ( R_PROJECT_S.PROJECT_ID=DB.GHEPM_TITLE_PSPP.ISBN10(+) )
AND UPPER(DB.R_USAGE_INFO_S.USAGE_LABEL) NOT LIKE UNISTR('%KILL%')
BQ:
SELECT
CAST(R_PROJECT_S.PROJECT_COPYRIGHT_YEAR AS string) COPYRIGHT_YEAR,
R_PROJECT_S.PROJECT_TITLE,
CAST(R_PROJECT_S.EDITION AS string) EDITION,
R_PROJECT_S.CIRCULATION_DESC,
R_PROJECT_S.DISTRIBUTION_DESC,
R_PROJECT_S.PROJECT_ID,
R_USAGE_INFO_S.OBJECT_ID,
UPPER(R_INFO_S.PHOTOGRAPHER) PHOTOGRAPHER,
UPPER(R_INFO_S.SOURCE_CAPTION) SOURCE_CAPTION,
BIC_APHEISBN00_BO._BIC_ZCHEAU,
BIC_APHEISBN00_BO._BIC_ZCHECPYR,
BIC_APHEISBN00_BO._BIC_ZCHEED,
BIC_APHEISBN00_BO._BIC_ZCHEPRDDE,
BIC_APHEISBN00_BO._BIC_ZCHEGRDE,
BIC_APHEISBN00_BO._BIC_ZCHSODE,
R_PROJECT_S.CHARGE_TO_ISBN,
BIC_APHEISBN00_BO._BIC_ZCHEPTIT,
R_INFO_S.SOURCE_NAME,
R_PROJECT_S.LANGUAGE_DESC,
R_PROJECT_S.PROJECT_FORMAT_DESC,
R_USAGE_INFO_S.USAGE_ID,
R_USAGE_INFO_S.PAGE,
R_USAGE_INFO_S.CHAPTER,
R_INFO_S.WORK_PROJECT_ID,
R_INFO_S.IMAGE_TYPE_DESC,
R_INFO_S.IMAGE_DESC,
R_USAGE_INFO_S.PERMISSION_TYPE_DESC,
R_USAGE_INFO_S.PERMISSION_STATUS_DESC,
R_USAGE_INFO_S.PERMISSION_USAGE_DESC,
R_USAGE_INFO_S.USAGE_LABEL,
R_USAGE_INFO_S.QUOTED_COST,
R_INFO_S.SOURCE_OBJECT_ID,
R_USAGE_INFO_S.USAGE_TYPE_DESC,
GHEPM_TITLE_PSPP.TITLE_DESCRIPTION,
BIC_APHEISBN00_BO._BIC_ZCHESOAB,
BIC_APHEISBN00_BO._BIC_ZCHEGRCD
FROM
`domain-rr.oracle_DB_DB.R_info_s` R_INFO_S
inner join
`domain-rr.oracle_DB_DB.R_usage_info_s` R_USAGE_INFO_S
on
R_USAGE_INFO_S.OBJECT_ID=R_INFO_S.OBJECT_ID
right outer join
`domain-rr.DB_RPT.R_PROJECT_S_VW` R_PROJECT_S
on
R_PROJECT_S.PROJECT_ID=R_USAGE_INFO_S.PROJECT_ID
left outer join
`domain-rr.DB_RPT.BIC_APHEISBN00_BO_VW` BIC_APHEISBN00_BO
ON
R_PROJECT_S.PROJECT_ID=BIC_APHEISBN00_BO._BIC_ZCHETIIS
left outer join
`domain-rr.oracle_DB_DB.ghepm_title` GHEPM_TITLE_PSPP
ON
R_PROJECT_S.PROJECT_ID=GHEPM_TITLE_PSPP.ISBN10
AND UPPER(R_USAGE_INFO_S.USAGE_LABEL) NOT LIKE '%KILL%'
Oracle count - 1553437
BQ count - 2414413
Please help me on how to get counts are same on both oracle and bq
Thanks,
Naren
Had you used more readable, shortened table aliases several differences can be illuminated:
Oracle does not attempt any RIGHT JOIN;
GBQ should run UPPER(...) expression in WHERE not on last LEFT JOIN clause or move expression to INNER JOIN on ui table (but without testing may not make a difference but readability);
Table order may make a difference especially with use of both INNER and OUTER joins;
Oracle (using the older, outdated implicit joins)
...
FROM
GRDW.RMS_IMAGE_PROJECT_S_VW p,
GRDW.RMS_IMAGE_USAGE_INFO_S ui,
GRDW.RMS_IMAGE_INFO_S i,
BOADMIN.BIC_APHEISBN00_BO_VW b,
GRDW.GHEPM_TITLE g
WHERE
( p.PROJECT_ID = ui.PROJECT_ID(+) -- LEFT JOIN
)
AND ( ui.OBJECT_ID = i.OBJECT_ID ) -- INNER JOIN
AND ( p.PROJECT_ID = b.BIC_ZCHETIIS(+) ) -- LEFT JOIN
AND ( p.PROJECT_ID = g.ISBN10(+) ) -- LEFT JOIN
AND UPPER(ui.USAGE_LABEL) NOT LIKE UNISTR('%KILL%')
Google BigQuery (using current standard of explicit joins)
...
FROM
`pearson-rr.oracle_grdw_grdw.rms_image_info_s` i
INNER JOIN
`pearson-rr.oracle_grdw_grdw.rms_image_usage_info_s` ui
ON ui.OBJECT_ID = i.OBJECT_ID
RIGHT OUTER JOIN
`pearson-rr.GRDW_RPT.RMS_IMAGE_PROJECT_S_VW` p
ON p.PROJECT_ID = ui.PROJECT_ID
LEFT OUTER JOIN
`pearson-rr.GRDW_RPT.BIC_APHEISBN00_BO_VW` b
ON p.PROJECT_ID = b._BIC_ZCHETIIS
LEFT OUTER JOIN
`pearson-rr.oracle_grdw_grdw.ghepm_title` g
ON p.PROJECT_ID = g.ISBN10
AND UPPER(ui.USAGE_LABEL) NOT LIKE '%KILL%'
Therefore, to account for table order and appropriate JOIN, consider below adjusted Google BigQuery:
...
FROM
`pearson-rr.GRDW_RPT.RMS_IMAGE_PROJECT_S_VW` p
LEFT OUTER JOIN
`pearson-rr.oracle_grdw_grdw.rms_image_usage_info_s` ui
ON p.PROJECT_ID = ui.PROJECT_ID
INNER OUTER JOIN
`pearson-rr.oracle_grdw_grdw.rms_image_info_s` i
ON ui.OBJECT_ID = i.OBJECT_ID AND UPPER(ui.USAGE_LABEL) NOT LIKE '%KILL%'
LEFT OUTER JOIN
`pearson-rr.GRDW_RPT.BIC_APHEISBN00_BO_VW` b
ON p.PROJECT_ID = b._BIC_ZCHETIIS
LEFT OUTER JOIN
`pearson-rr.oracle_grdw_grdw.ghepm_title` g
ON p.PROJECT_ID = g.ISBN10

i m confuse about this query

What is the name|type of this query? Like inner join, outer join.
SELECT a.tutorial_id, a.tutorial_author, b.tutorial_count
FROM tutorials_tbl a, tcount_tbl b
WHERE a.tutorial_author = b.tutorial_author
It's an Implicit INNER JOIN most commonly found in older code. It is synonymous with:
SELECT a.tutorial_id,
a.tutorial_author,
b.tutorial_count
FROM tutorials_tbl a
INNER JOIN tcount_tbl b ON a.tutorial_author = b.tutorial_author
which is also synonymous with just using JOIN:
SELECT a.tutorial_id,
a.tutorial_author,
b.tutorial_count
FROM tutorials_tbl a
JOIN tcount_tbl b ON a.tutorial_author = b.tutorial_author

join graph disconnect ORACLE

The query below won't run as I have a join graph disconnect. As far as I understand when you alias a table you have to put it first in the join condition, but when i do so I still get the same error, any suggestions?
select
date1.LABEL_YYYY_MM_DD as Some_Label1,
A.Some_Field as Some_Label2,
round(sum(D.Some_Field3)/count (*),0)as Some_Label3
from Table_A A
inner JOIN Table_B B ON (A.some_key = B.some_key)
inner JOIN date_time date1 ON (A.START_DATE_TIME_KEY = date1.DATE_TIME_KEY)
left outer join Table_C C on(C.some_GUID = A.some_ID)
left outer join Table_D D on(D.a_ID = C.a_ID)
where
(1=1)
and date1.LABEL_YYYY_MM_DD ='2015-03-30'
and D.blah ='1'
group by
date1.LABEL_YYYY_MM_DD,
A.Some_Field
order by
date1.LABEL_YYYY_MM_DD
;
Based on my comment above I should change the first inner join to B.some_key = A.some_key and so on however I still get the disconnect...
according to the Oracle documentation (http://docs.oracle.com/javadb/10.8.3.0/ref/rrefsqlj18922.html) there is space required between ON and the boolean expression

sql join history table to active table SSRS report

I'm trying to pull results from a database (sql server 2005) which takes 4 tables:
Subscriber S, Member M, ClaimLines L, ClaimHistoryLines H
Query is as follows:
select S.SBSB_ID, M.MEME_NAME,
(CASE L.CLCL_ID WHEN '' THEN H.CLCL_ID ELSE L.CLCL_ID END) AS CLAIM_ID
FROM CMC_CDDL_CL_LINE L, CMC_MEME_MEMBER M LEFT OUTER JOIN CMC_CLDH_DEN_HIST H
ON H.MEME_CK = M.MEME_CK, CMC_SBSB_SUBSC S
WHERE
S.SBSB_ID = '120943270' AND
L.MEME_CK = M.MEME_CK AND
M.SBSB_CK = S.SBSB_CK
This query successfully pulls in the result rows from the ClaimLines L table but no results from the History table are shown. I'm not sure how to do this, any sql experts out there that can help would be great. -Thanks!
CMC_CDDL_CL_LINE L, CMC_MEME_MEMBER M LEFT OUTER JOIN CMC_CLDH_DEN_HIST H
don't mix obsolete implied join syntax with left join. They don't play well together. Use the correct ANSII standard join syntax. Infact stop using the obsolete syntax altogether.
This was a very silly mistake on my part. I neglected to think of using a UNION which solved my problem and allowed me to pull from both places into a single results set without creating massive duplicate rows. Glad someone pointed out the proper way to write sql using ANSII standards.
select S.SBSB_ID, M.MEME_NAME,
L.CLCL_ID AS CLAIM_ID
FROM
CMC_SBSB_SUBSC S INNER JOIN CMC_MEME_MEMBER M ON S.SBSB_CK = M.SBSB_CK
INNER JOIN CMC_CDDL_CL_LINE L ON L.MEME_CK = M.MEME_CK
WHERE
S.SBSB_ID = '120943270'
UNION
select S.SBSB_ID, M.MEME_NAME,
H.CLCL_ID AS CLAIM_ID
FROM
CMC_SBSB_SUBSC S INNER JOIN CMC_MEME_MEMBER M ON S.SBSB_CK = M.SBSB_CK
INNER JOIN CMC_CLDH_DEN_HIST H ON H.MEME_CK = M.MEME_CK
WHERE
S.SBSB_ID = '120943270'

How to convert Sql query with inner join statement to sql query with Where statement(no inner join in statement)

I have generated the following sql query code in access using the Qbe and converting it to sql.However, i want proof that i did without help so i intend to take out all inner join statements and replace them with the WHERE statement. How do i work it out. Please explain and provide answer. thank you.
SQL Query:
SELECT Entertainer.EntertainerID, Entertainer.FirstName, Entertainer.LastName,
Booking.CustomerID, Booking.EventDate, BookingDetail.Category, BookingDetail.Duration,
Speciality.SpecialityDescription, EntertainerSpeciality.EntertainerSpecialityCost
FROM (Entertainer INNER JOIN (Booking INNER JOIN BookingDetail ON
Booking.BookingID=BookingDetail.BookingID) ON
Entertainer.EntertainerID=BookingDetail.EntertainerID)
INNER JOIN (Speciality INNER JOIN EntertainerSpeciality ON
Speciality.SpecialityID=EntertainerSpeciality.SpecialityID) ON
Entertainer.EntertainerID=EntertainerSpeciality.EntertainerID
WHERE (((Entertainer.EntertainerID)=[Enter EntertainerID]));
That is the weirdest JOIN statement I've seen to date. Here's your query converted to ANSI-89 syntax:
SELECT e.entertainerid,
e.firstname,
e.lastname,
b.customerid,
b.eventdate,
bd.category,
bd.duration,
s.specialitydescription,
es.entertainerspecialitycost
FROM Entertainer e,
BookingDetail bd,
Booking b,
EntertainerSpeciality es,
Speciality s
WHERE e.entertainerid = bd.entertainerid
AND b.bookingid = bd.bookingid
AND e.entertainerid = es.entertainerid
AND s.specialityid = es.specialityid
AND e.entertainerid = [Enter EntertainerID]
Here your original query with the syntax cleaned up - it should help make it easier to see the common information:
SELECT e.entertainerid,
e.firstname,
e.lastname,
b.customerid,
b.eventdate,
bd.category,
bd.duration,
s.specialitydescription,
es.entertainerspecialitycost
FROM ENTERTAINER e
JOIN BOOKINGDETAIL bd ON bd.entertainerid = e.entertainerid
JOIN BOOKING b ON b.bookingid = bd.bookingid
JOIN ENTERTAINERSPECIALITY es ON es.entertainerid = e.entertainerid
JOIN SPECIALITY s ON s.specialityid = es.specialityid
WHERE e.entertainerid = ?
The difference is the ANSI-89 syntax includes the join criteria in the WHERE clause, along with the actual filter criteria. To highlight, ANSI-92:
FROM ENTERTAINER e
JOIN BOOKINGDETAIL bd ON bd.entertainerid = e.entertainerid
...vs ANSI-89:
FROM ENTERTAINER e,
BOOKINGDETAIL bd
,... -- omitted for purpose of example
WHERE e.entertainerid = bd.entertainerid
ANSI-92 syntax is preferred:
ANSI-89 didn't have consistently implemented LEFT JOIN syntax in various databases, so statements were that portable
ANSI-92 provides more powerful JOINs (IE: x ON x.id = y.id AND x.col IS NOT NULL)
ANSI-92 is easier to read, separating the join criteria from the actual filter criteria
It's a strange thing to want to do (as commenters say, INNER JOIN is usually better, and applying a mechanical transformation proves nothing), but not difficult. Each INNER JOIN can be mechanically transformed into a WHERE as follows:
instead of each 2-way inner join A INNER JOIN B ON (cond1) WHERE (cond2)
rewrite it as A, B WHERE (cond1) AND (cond2)
whatever the tables A and B and the conditions cond1 and cond2 might be.