Is there a better way to write this Oracle SQL query? - sql

I have been using Oracle SQL for around 6 months so still a beginner. I need to query the database to get information on all items on a particular order (order number is via $_GET['id']).
I have come up with the below query, it works as expected and as I need but I do not know whether I am over complicating things which would slow the query down at all. I understand there are a number of ways to do a single thing and there may be better methods to write this query since I am a beginner.
I am using Oracle 8i (due to this is the version an application we use is supplied with) so I believe that some JOIN etc. are not available in this version, but is there a better way to write a query such as the below?
SELECT auf_pos.auf_pos,
(SELECT auf_stat.anz
FROM auf_stat
WHERE auf_stat.auf_pos = auf_pos.auf_pos
AND auf_stat.auf_nr = ".$_GET['id']."),
(SELECT auf_text.zl_str
FROM auf_text
WHERE auf_text.zl_mod = 0
AND auf_text.auf_pos = auf_pos.auf_pos
AND auf_text.auf_nr = ".$_GET['id']."),
(SELECT glas_daten_basis.gl_bez
FROM glas_daten_basis
WHERE glas_daten_basis.idnr = auf_pos.glas1),
(SELECT lzr_daten.lzr_breite
FROM lzr_daten
WHERE lzr_daten.lzr_idnr = auf_pos.lzr1),
(SELECT glas_daten_basis.gl_bez
FROM glas_daten_basis
WHERE glas_daten_basis.idnr = auf_pos.glas2),
auf_pos.breite,
auf_pos.hoehe,
auf_pos.spr_jn
FROM auf_pos
WHERE auf_pos.auf_nr = ".$_GET['id']."
Thanks in advance to any Oracle gurus that could help this beginner out!

You could rewrite it using joins. If your subselects aren't expected to return any NULL values, then you can use INNER JOINS:
SELECT auf_pos.auf_pos,
auf_stat.anz,
auf_text.zl_str,
glas_daten_basis.gl_bez,
lzr_daten.lzr_breite,
glas_daten_basis.gl_bez,
auf_pos.breite,
auf_pos.hoehe,
auf_pos.spr_jn
FROM auf_pos
INNER JOIN auf_stat ON auf_stat.auf_pos = auf_pos.auf_pos AND auf_stat.auf_nr = ".$_GET['id'].")
INNER JOIN auf_text ON auf_text.zl_mod = 0 AND auf_text.auf_pos = auf_pos.auf_pos AND auf_text.auf_nr = ".$_GET['id'].")
INNER JOIN glas_daten_basis ON glas_daten_basis.idnr = auf_pos.glas1
INNER JOIN lzr_daten ON lzr_daten.lzr_idnr = auf_pos.lzr1
INNER JOIN glas_daten_basis ON glas_daten_basis.idnr = auf_pos.glas2
Or if there are cases where you wouldn't have matches on all the tables, you could replace the INNER joins with LEFT OUTER joins:
SELECT auf_pos.auf_pos,
auf_stat.anz,
auf_text.zl_str,
glas_daten_basis.gl_bez,
lzr_daten.lzr_breite,
glas_daten_basis.gl_bez,
auf_pos.breite,
auf_pos.hoehe,
auf_pos.spr_jn
FROM auf_pos
LEFT OUTER JOIN auf_stat ON auf_stat.auf_pos = auf_pos.auf_pos AND auf_stat.auf_nr = ".$_GET['id'].")
LEFT OUTER JOIN auf_text ON auf_text.zl_mod = 0 AND auf_text.auf_pos = auf_pos.auf_pos AND auf_text.auf_nr = ".$_GET['id'].")
LEFT OUTER JOIN glas_daten_basis ON glas_daten_basis.idnr = auf_pos.glas1
LEFT OUTER JOIN lzr_daten ON lzr_daten.lzr_idnr = auf_pos.lzr1
LEFT OUTER JOIN glas_daten_basis ON glas_daten_basis.idnr = auf_pos.glas2
Whether or not you see any performance gains is debatable. As I understand it, the Oracle query optimizer should take your query and execute it with a similar plan to the join queries, but this is dependent on a number of factors, so the best thing to do it give it a try..

Related

Speed up SQL query performance with nested queries

Could anyone help me speed this query up? It currently take 17 minutes to run but does return the correct data and it populates a subform in MS Access. Functions in the rest of the VBA are declared as long to try to speed up more.
Here's the full query:
SELECT lots of things
FROM (((((((((((((((ngstest
INNER JOIN patients
ON ngstest.internalpatientid = patients.internalpatientid)
INNER JOIN referral
ON ngstest.referralid = referral.referralid)
INNER JOIN checker
ON ngstest.bookby = checker.check1id)
INNER JOIN ngspanel
ON ngstest.ngspanelid = ngspanel.ngspanelid)
LEFT JOIN ngspanel AS ngspanel_1
ON ngstest.ngspanelid_b = ngspanel_1.ngspanelid)
INNER JOIN status
ON ngstest.statusid = status.statusid)
INNER JOIN dbo_patient_table
ON patients.patientid = dbo_patient_table.patienttrustid)
LEFT JOIN dna
ON ngstest.dna = dna.dnanumber)
INNER JOIN status AS status_1
ON patients.s_statusoverall = status_1.statusid)
LEFT JOIN gw_gendertable
ON dbo_patient_table.genderid = gw_gendertable.genderid)
LEFT JOIN ngswesbatch
ON ngstest.wesbatch = ngswesbatch.ngswesbatchid)
LEFT JOIN checker AS checker_1
ON ngstest.check1id = checker_1.check1id)
LEFT JOIN checker AS checker_2
ON ngstest.check2id = checker_2.check1id)
LEFT JOIN checker AS checker_3
ON ngstest.check3id = checker_3.check1id)
LEFT JOIN ngspanel AS ngspanel_2
ON ngstest.ngspanelid_c = ngspanel_2.ngspanelid)
LEFT JOIN checker AS checker_4
ON ngstest.check4id = checker_4.check1id
WHERE ((ngstest.referralid IN
(SELECT referralid FROM referral
WHERE grouptypeid = 14)
AND ngstest.ngstestid IN
(SELECT ngstest.ngstestid
FROM ngsanalysis
INNER JOIN ngstest
ON ngsanalysis.ngstestid = ngstest.ngstestid
WHERE ngsanalysis.pedigree = 3302) )
AND status.statusid = 1202218800)
ORDER BY ngstest.priority,
ngstest.daterequested;
The two nested queries are strings from elsewhere in the code so are called in the vba as " & includereferralls & " And " & ParentsStatusesFilter & "
They are:
ParentsStatusesFilter = "NGSTest.NGSTestID in
(SELECT NGSTest.NGSTestID
FROM NGSAnalysis
INNER JOIN NGSTest
ON NGSAnalysis.NGSTestID = NGSTest.NGSTestID
WHERE NGSAnalysis.Pedigree IN (3302,3303,3304)"
And
includereferrals = "NGSTest.ReferralID
(SELECT referralid FROM referral WHERE referral.grouptypeid = 14)"
The query needs to remain readable (and therefore editable) so can't use things like Distinct, Group By or contain any Unions. Have tried Exists instead of In for the nested queries but that stops it from actually filtering the results.
WHERE EXISTS (SELECT NGSTest.NGSTestID
FROM NGSAnalysis
INNER JOIN NGSTest
ON NGSAnalysis.NGSTestID = NGSTest.NGSTestID
WHERE NGSAnalysis.Pedigree IN (3302,3303,3304)
So the exist clause you have there isn't tied to the outer query which would run similar to just added 1 = 1 to the where clause. I took your where clause and converted it. It should look something like this...
WHERE EXISTS (
SELECT referralid
FROM referral
WHERE grouptypeid = 14 AND ngstest.referralid = referral.referralid)
AND EXISTS (
SELECT ngsanalysis.ngstestid
FROM ngsanalysis
WHERE ngsanalysis.pedigree IN (3302,3303,3304) AND ngstest.ngstestid = ngsanalysis.ngstestid
)
AND status.statusid = 1202218800
Adding exists will speed it up a bit, but the the bulk of the slowness is the left joins. Access does not handle the left joins as well as SQL Server does. Change all your joins to inner joins and you will see the query runs very fast. This is obviously not ideal since some relationships are optional. What I have done to get around this is add a default record that replaces a null relationship.
Here is what that looks like for you: In the checker table you could add a record that represents a null value. So put a record into the checker table with check1id of -1 or 0. Then default check1id, check2id, check3id on ngstest to -1 or 0. You will need to do that type of thing for all tables you need to left join on.

Updating upon a join

I can simply use a select statment on a join to pull results I'd Like using:
Select * from RESULTS (NOLOCK) left join orders on Results.ordno = orders.ordno
left join folders on folders.folderno = orders.folderno
left join pranaparms on folders.prodcode = pranaparms.prodcode and results.analyte = pranaparms.analyte
WHERE Results.s <> 'OOS-A' and Results.Final Between pranaparms.LOWERQCLIMIT and pranaparms.UPPERQCLIMIT and (pranaparms.LOWERQCLIMIT IS NOT NULL and pranaparms.UPPERQCLIMIT IS NOT NULL)
and results.ordno in (1277494)
However, is there a convenient way which I could do an update on the selected fields?
I have tried this so far:
Update RESULTS (NOLOCK) left join orders on Results.ordno = orders.ordno
left join folders on folders.folderno = orders.folderno
left join pranaparms on folders.prodcode = pranaparms.prodcode and results.analyte = pranaparms.analyte
set Results.S = 'OOS-B'
WHERE Results.s <> 'OOS-A' and Results.Final Between pranaparms.LOWERQCLIMIT and pranaparms.UPPERQCLIMIT and (pranaparms.LOWERQCLIMIT IS NOT NULL and pranaparms.UPPERQCLIMIT IS NOT NULL)
and results.ordno in (1277494)
However, it is passing an error indicating "Incorrect syntax near '('"
Is there a way for me to update off of this join or will I need to do tables individually?
Your select syntax suggests SQL Server. The correct update syntax in SQL Server is:
update r
set S = 'OOS-B'
from results r left join
orders o
on r.ordno = o.ordno left join
folders f
on f.folderno = o.folderno left join
pranaparms p
on f.prodcode = p.prodcode and r.analyte = p.analyte
where r.s <> 'OOS-A' and
r.Final Between p.LOWERQCLIMIT and p.UPPERQCLIMIT and
(p.LOWERQCLIMIT IS NOT NULL and p.UPPERQCLIMIT IS NOT NULL) and
r.ordno in (1277494);
Notes:
Table aliases make the query much easier to write and to read.
Do not use NOLOCK unless you know what you are doing. Given that you don't know the syntax rules for update for the database you are using, I will guess that you don't understand locks either.
Your WHERE conditions are turning the outer joins into inner joins. You should just specify the correct join type -- and probably move the conditions to the on clauses. I've left the logic as you wrote it.

Use query with InnerJoin

I'm trying to pass this query to inner join, but it does not know how?
This is the query I want to use InnerJoin with these values ​​where
SELECT
ticket.id_ticket,
ticket.id_rede,
historico.id_historico,
historico.id_ticket,
centro.id_centro,
eqpto.id_eqpto,
eqpto.nome,
centro.sigla_centro,
interface.id_interface,
interface.id_eqpto,
interface.desig,
tecnologia.descricao,
interface.id_tecnologia,
tecnologia.id_tecnologia,
eqpto.id_centro,
eqpto.id_rede
FROM
app_gpa_ticket.ticket,
app_gpa_ticket.historico,
dados_v3.centro,
dados_v3.eqpto,
dados_v3.interface,
dados_v3.tecnologia
WHERE
ticket.id_ticket = historico.id_ticket AND
centro.id_centro = eqpto.id_centro AND
eqpto.id_eqpto = interface.id_eqpto AND
eqpto.id_rede = ticket.id_rede AND
tecnologia.id_tecnologia = interface.id_tecnologia;
Thank you!
I think you are trying to go to the standard (explicit) syntax. What you need to do is take what would be your JOIN operators in the WHERE clause and move them near the table itself. What you need to know is what tables on the left (Before the INNER JOIN operator you are joining to the right (After the INNER JOIN operator)
SELECT
ticket.id_ticket,
ticket.id_rede,
historico.id_historico,
historico.id_ticket,
centro.id_centro,
eqpto.id_eqpto,
eqpto.nome,
centro.sigla_centro,
interface.id_interface,
interface.id_eqpto,
interface.desig,
tecnologia.descricao,
interface.id_tecnologia,
tecnologia.id_tecnologia,
eqpto.id_centro,
eqpto.id_rede
FROM app_gpa_ticket.ticket
INNER JOIN app_gpa_ticket.historico ON ticket.id_ticket = historico.id_ticket
INNER JOIN dados_v3.eqpto ON eqpto.id_rede = ticket.id_rede
INNER JOIN dados_v3.interface ON eqpto.id_eqpto = interface.id_eqpto
INNER JOIN dados_v3.centro ON centro.id_centro = eqpto.id_centro
INNER JOIN dados_v3.tecnologia ON tecnologia.id_tecnologia = interface.id_tecnologia

Super Slow Query - sped up, but not perfect... Please help

I posted a query yesterday (see here) that was horrible (took over a minute to run, resulting in 18,215 records):
SELECT DISTINCT
dbo.contacts_link_emails.Email, dbo.contacts.ContactID, dbo.contacts.First AS ContactFirstName, dbo.contacts.Last AS ContactLastName, dbo.contacts.InstitutionID,
dbo.institutionswithzipcodesadditional.CountyID, dbo.institutionswithzipcodesadditional.StateID, dbo.institutionswithzipcodesadditional.DistrictID
FROM
dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_3
INNER JOIN
dbo.contacts
INNER JOIN
dbo.contacts_link_emails
ON dbo.contacts.ContactID = dbo.contacts_link_emails.ContactID
ON contacts_def_jobfunctions_3.JobID = dbo.contacts.JobTitle
INNER JOIN
dbo.institutionswithzipcodesadditional
ON dbo.contacts.InstitutionID = dbo.institutionswithzipcodesadditional.InstitutionID
LEFT OUTER JOIN
dbo.contacts_def_jobfunctions
INNER JOIN
dbo.contacts_link_jobfunctions
ON dbo.contacts_def_jobfunctions.JobID = dbo.contacts_link_jobfunctions.JobID
ON dbo.contacts.ContactID = dbo.contacts_link_jobfunctions.ContactID
WHERE
(dbo.contacts.JobTitle IN
(SELECT JobID
FROM dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_1
WHERE (ParentJobID <> '1841')))
AND
(dbo.contacts_link_emails.Email NOT IN
(SELECT EmailAddress
FROM dbo.newsletterremovelist))
OR
(dbo.contacts_link_jobfunctions.JobID IN
(SELECT JobID
FROM dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_2
WHERE (ParentJobID <> '1841')))
AND
(dbo.contacts_link_emails.Email NOT IN
(SELECT EmailAddress
FROM dbo.newsletterremovelist AS newsletterremovelist))
ORDER BY EMAIL
With a lot of coaching and research, I've tuned it up to the following:
SELECT contacts.ContactID,
contacts.InstitutionID,
contacts.First,
contacts.Last,
institutionswithzipcodesadditional.CountyID,
institutionswithzipcodesadditional.StateID,
institutionswithzipcodesadditional.DistrictID
FROM contacts
INNER JOIN contacts_link_emails ON
contacts.ContactID = contacts_link_emails.ContactID
INNER JOIN institutionswithzipcodesadditional ON
contacts.InstitutionID = institutionswithzipcodesadditional.InstitutionID
WHERE
(contacts.ContactID IN
(SELECT contacts_2.ContactID
FROM contacts AS contacts_2
INNER JOIN contacts_link_emails AS contacts_link_emails_2 ON
contacts_2.ContactID = contacts_link_emails_2.ContactID
LEFT OUTER JOIN contacts_def_jobfunctions ON
contacts_2.JobTitle = contacts_def_jobfunctions.JobID
RIGHT OUTER JOIN newsletterremovelist ON
contacts_link_emails_2.Email = newsletterremovelist.EmailAddress
WHERE (contacts_def_jobfunctions.ParentJobID <> 1841)
GROUP BY contacts_2.ContactID
UNION
SELECT contacts_1.ContactID
FROM contacts_link_jobfunctions
INNER JOIN contacts_def_jobfunctions AS contacts_def_jobfunctions_1 ON
contacts_link_jobfunctions.JobID = contacts_def_jobfunctions_1.JobID
AND contacts_def_jobfunctions_1.ParentJobID <> 1841
INNER JOIN contacts AS contacts_1 ON
contacts_link_jobfunctions.ContactID = contacts_1.ContactID
INNER JOIN contacts_link_emails AS contacts_link_emails_1 ON
contacts_link_emails_1.ContactID = contacts_1.ContactID
LEFT OUTER JOIN newsletterremovelist AS newsletterremovelist_1 ON
contacts_link_emails_1.Email = newsletterremovelist_1.EmailAddress
GROUP BY contacts_1.ContactID))
While this query is now super fast (about 3 seconds), I've blown part of the logic somewhere - it only returns 14,863 rows (instead of the 18,215 rows that I believe is accurate).
The results seem near correct. I'm working to discover what data might be missing in the result set.
Can you please coach me through whatever I've done wrong here?
Thanks,
Russell Schutte
The main problem with your original query was that you had two extra joins just to introduce duplicates and then a DISTINCT to get rid of them.
Use this:
SELECT cle.Email,
c.ContactID,
c.First AS ContactFirstName,
c.Last AS ContactLastName,
c.InstitutionID,
izip.CountyID,
izip.StateID,
izip.DistrictID
FROM dbo.contacts c
INNER JOIN
dbo.institutionswithzipcodesadditional izip
ON izip.InstitutionID = c.InstitutionID
INNER JOIN
dbo.contacts_link_emails cle
ON cle.ContactID = c.ContactID
WHERE cle.Email NOT IN
(
SELECT EmailAddress
FROM dbo.newsletterremovelist
)
AND EXISTS
(
SELECT NULL
FROM dbo.contacts_def_jobfunctions cdj
WHERE cdj.JobId = c.JobTitle
AND cdj.ParentJobId <> '1841'
UNION ALL
SELECT NULL
FROM dbo.contacts_link_jobfunctions clj
JOIN dbo.contacts_def_jobfunctions cdj
ON cdj.JobID = clj.JobID
WHERE clj.ContactID = c.ContactID
AND cdj.ParentJobId <> '1841'
)
ORDER BY
email
Create the following indexes:
newsletterremovelist (EmailAddress)
contacts_link_jobfunctions (ContactID, JobID)
contacts_def_jobfunctions (JobID)
Do you get the same results when you do:
SELECT count(*)
FROM
dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_3
INNER JOIN
dbo.contacts
INNER JOIN
dbo.contacts_link_emails
ON dbo.contacts.ContactID = dbo.contacts_link_emails.ContactID
ON contacts_def_jobfunctions_3.JobID = dbo.contacts.JobTitle
SELECT COUNT(*)
FROM
contacts
INNER JOIN contacts_link_jobfunctions
ON contacts.ContactID = contacts_link_jobfunctions.ContactID
INNER JOIN contacts_link_emails
ON contacts.ContactID = contacts_link_emails.ContactID
If so keep adding each join conditon on until you don't get the same results and you will see where your mistake was. If all the joins are the same, then look at the where clauses. But I will be surprised if it isn't in the first join because the syntax you have orginally won't even work on SQL Server and it is pretty nonstandard SQL and may have been incorrect all along but no one knew.
Alternatively, pick a few of the records that are returned in the orginal but not the revised. Track them through the tables one at a time to see if you can find why the second query filters them out.
I'm not directly sure what is wrong, but when I run in to this situation, the first thing I do is start removing variables.
So, comment out the where clause. How many rows are returned?
If you get back the 11,604 rows then you've isolated the problems to the joins. Work though the joins, commenting each one out (remove the associated columns too) and figure out how many rows are eliminated.
As you do this, aim to find what is causing the desired rows to be eliminated. Once isolated, consider the join differences between the first query and the second query.
In looking at the first query, you could probably just modify that to eliminate any INs and instead do a EXISTS instead.
Consider your indexes as well. Any thing in the where or join clauses should probably be indexed.

Complex SQL Query to NHibernate DetachedCriteria or HQL

I have the following SQL Query returning the results I need:
SELECT
Person.FirstName,Person.LastName,OrganisationUnit.Name AS UnitName, RS_SkillsArea.Name AS SkillsArea, Activity.Name AS ActivityName, Activity.CLASS, Activity.StartsOn, Activity.EndsOn,
SUM(ActivityCost.CostAmount) /
NULLIF(
(
SELECT COUNT(Registration.ActivityId) FROM
Registration INNER JOIN AttemptResultsSummary ON Registration.CurrentResultId = AttemptResultsSummary.AttemptResultsSummaryId AND
Registration.RegistrationId = AttemptResultsSummary.RegistrationId
WHERE (Registration.Status = 1) AND (Registration.ActivityId = Activity.ActivityId)
AND (AttemptResultsSummary.AttendanceStatus <> 1)
)
,0)
AS IndividualCost
FROM Registration AS Registration_1 INNER JOIN
Activity ON Registration_1.ActivityId = Activity.ActivityId INNER JOIN
Person ON Registration_1.PersonId = Person.PersonId INNER JOIN
OrganisationUnit ON Person.OrganisationUnitId = OrganisationUnit.OrganisationUnitId INNER JOIN
AttemptResultsSummary ON Registration_1.CurrentResultId = AttemptResultsSummary.AttemptResultsSummaryId AND
Registration_1.RegistrationId = AttemptResultsSummary.RegistrationId AND Activity.ActivityId = AttemptResultsSummary.ActivityId AND
Person.PersonId = AttemptResultsSummary.PersonId INNER JOIN
ActivityCost ON Activity.ActivityId = ActivityCost.ActivityId LEFT OUTER JOIN
(SELECT Category.Name, Category.CategoryId
FROM Category INNER JOIN
CategoryGroup ON Category.[Group] = CategoryGroup.CategoryGroupId
WHERE (CategoryGroup.Name = N'Skills Area')) AS RS_SkillsArea INNER JOIN
ActivityInCategory ON RS_SkillsArea.CategoryId = ActivityInCategory.CategoryId ON Activity.ActivityId = ActivityInCategory.ActivityId
AND AttemptResultsSummary.AttendanceStatus <> 1
GROUP BY RS_SkillsArea.Name, Person.FirstName,Person.LastName,Activity.Name, Activity.CLASS, Activity.StartsOn, Activity.EndsOn, Activity.ActivityId, OrganisationUnit.Name,
AttemptResultsSummary.CompletionStatus, AttemptResultsSummary.AttendanceStatus
HAVING AttemptResultsSummary.AttendanceStatus <> 1
Essentially is there any way using either DetachedCriteria or HQL to do the same against the entities rather than direct SQL?
The two challenges are:
The query for cost calculation per row.
The derived table join (which needs to be an outer join as this value may not exist)
I'd appreciate any pointers. I'd rather not use SQL because of infrastructure changes and the issues with (lack of) refactoring support
Take a look at the official HQL examples # http://docs.jboss.org/hibernate/stable/core/reference/en/html/queryhql.html#queryhql-examples .
In my opinion, the 'derived joins' would be even easier to pull off using HQL.
In the case of performance, my first start would be to catch how much it costs using native SQL using your prefered profiler, and then how much it costs on NHibernate using NHProf.