Should a subquery on a join use tables from an outer query in the where clause?

I need to add a subquery to a join because one payment can have more than one allotment, and I only want the first match (where ROWNUM = 1).
However, I'm not sure whether referencing pmt from the outer query inside the subquery on the allotment join is the best approach.
Should I be doing this differently to avoid performance problems?
SELECT
    pmt.payment_uid,
    alt.allotment_uid
FROM
    payment pmt
    /* HERE: is the reference to pmt.pay_key and pmt.client_id
       incorrect in the below subquery? */
    INNER JOIN allotment alt ON alt.allotment_uid = (
        SELECT
            allotment_uid
        FROM
            allotment
        WHERE
            pay_key = pmt.pay_key
            AND pay_code = 'xyz'
            AND deleted = 'N'
            AND client_id = pmt.client_id
            AND ROWNUM = 1
    )
WHERE
    pmt.deleted = 'N'
    AND pmt.date_paid >= TO_DATE('2017-07-01', 'YYYY-MM-DD')
    AND pmt.date_paid < TO_DATE('2017-10-01', 'YYYY-MM-DD') + 1;

It's difficult to identify a performance issue in your query without seeing the explain plan output. Your query does appear to run an additional SELECT on allotment for every row from the main query.
Here is a version that doesn't use a correlated subquery. I haven't been able to test it. It does a plain join and then filters out all but one allotment per payment. Hope this helps.
WITH v_payment AS (
    SELECT
        pmt.payment_uid,
        alt.allotment_uid,
        -- number the allotments within each payment; only the first is kept below
        ROW_NUMBER() OVER (PARTITION BY pmt.payment_uid
                           ORDER BY alt.allotment_uid) r_num
    FROM
        payment pmt
        JOIN allotment alt
            ON (pmt.pay_key = alt.pay_key AND
                pmt.client_id = alt.client_id)
    WHERE pmt.deleted = 'N' AND
          pmt.date_paid >= TO_DATE('2017-07-01', 'YYYY-MM-DD') AND
          pmt.date_paid < TO_DATE('2017-10-01', 'YYYY-MM-DD') + 1 AND
          alt.pay_code = 'xyz' AND
          alt.deleted = 'N'
)
SELECT payment_uid,
       allotment_uid
FROM v_payment
WHERE r_num = 1;
Let us know how this performs!

You can phrase the query that way. I would be more likely to do:
SELECT . . .
FROM payment p INNER JOIN
(SELECT a.*,
ROW_NUMBER() OVER (PARTITION BY pay_key, client_id
ORDER BY allotment_uid
) as seqnum
FROM allotment a
WHERE pay_code = 'xyz' AND deleted = 'N'
) a
ON a.pay_key = p.pay_key AND a.client_id = p.client_id AND
seqnum = 1
WHERE p.deleted = 'N' AND
p.date_paid >= DATE '2017-07-01' AND
p.date_paid < (DATE '2017-10-01') + 1;
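
To make the ROW_NUMBER() approach concrete, here is a minimal sketch that runs the same query shape on toy data. It uses Python's built-in sqlite3 module as a stand-in for Oracle (window functions need SQLite 3.25 or later). The rows, the function name first_allotment_per_payment, and the reduced column set are all made up for illustration; the date filter is left out to keep it short.

```python
import sqlite3


def first_allotment_per_payment():
    """Return (payment_uid, allotment_uid), keeping one allotment per payment."""
    con = sqlite3.connect(":memory:")
    # Hypothetical toy schema: only the columns the query needs.
    con.executescript("""
        CREATE TABLE payment (payment_uid INTEGER, pay_key INTEGER,
                              client_id INTEGER, deleted TEXT);
        CREATE TABLE allotment (allotment_uid INTEGER, pay_key INTEGER,
                                client_id INTEGER, pay_code TEXT, deleted TEXT);
        INSERT INTO payment VALUES
            (1, 10, 100, 'N'), (2, 20, 100, 'N'), (3, 30, 100, 'Y');
        INSERT INTO allotment VALUES
            (501, 10, 100, 'xyz', 'N'),
            (502, 10, 100, 'xyz', 'N'),   -- second allotment for payment 1
            (503, 20, 100, 'xyz', 'Y'),   -- deleted, must be skipped
            (504, 20, 100, 'xyz', 'N'),
            (505, 30, 100, 'xyz', 'N');   -- belongs to a deleted payment
    """)
    rows = con.execute("""
        SELECT p.payment_uid, a.allotment_uid
        FROM payment p
        JOIN (SELECT al.*,
                     ROW_NUMBER() OVER (PARTITION BY pay_key, client_id
                                        ORDER BY allotment_uid) AS seqnum
              FROM allotment al
              WHERE pay_code = 'xyz' AND deleted = 'N') a
          ON a.pay_key = p.pay_key
         AND a.client_id = p.client_id
         AND a.seqnum = 1
        WHERE p.deleted = 'N'
        ORDER BY p.payment_uid
    """).fetchall()
    con.close()
    return rows


print(first_allotment_per_payment())
```

Unlike ROWNUM = 1 without an ordering, the ORDER BY inside the window makes the choice of "first" allotment deterministic (here, the lowest allotment_uid).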

Related

SQL count(*) with having

I have this query on Oracle.
SELECT CBG.refs, CBG.cuo, CBG.date, CBG.nber, CG.date, CBG.conso,
(SELECT COUNT(*)
FROM MAD.VIN CBV
WHERE CBV.CUO = CBG.CUO AND
CBV.NBER = CBG.NBER AND
CBV.DATE = CBG.DATE AND
CBV.REFS = CBG.REFS
GROUP BY CUO , DATE , NBER , REFS ) AS COUNTS ,
CBG.CONSO_CONCESS AS CONCESS
FROM MAD.GEN CBG, MAD.CAR_GEN CG
WHERE CBG.cuo = CG.cuo AND
CBG.CONSO_DATE IS NOT NULL AND
CBG.date = CG.date AND
CBG.nber = CG.nber
HAVING COUNTS > 0;
When I run this query, it gives me an error: invalid identifier COUNTS.
How can I get results only when the count is greater than a given parameter?
Thanks.

Unlike in MySQL, in Oracle we cannot refer to an alias in the HAVING clause (aliases can only be referenced in the ORDER BY clause). One workaround would be to put your current logic into a CTE and then filter it.
WITH cte AS (
SELECT CBG.refs, CBG.cuo, CBG.date AS cbg_date, CBG.nber, CG.date AS cg_date,
CBG.conso,
(SELECT COUNT(*)
FROM MAD.VIN CBV
WHERE CBV.CUO = CBG.CUO AND
CBV.NBER = CBG.NBER AND
CBV.DATE = CBG.DATE AND
CBV.REFS = CBG.REFS
GROUP BY CUO, DATE, NBER, REFS) AS COUNTS,
CBG.CONSO_CONCESS AS CONCESS
FROM MAD.GEN CBG
INNER JOIN MAD.CAR_GEN CG
ON CBG.cuo = CG.cuo AND
CBG.date = CG.date AND
CBG.nber = CG.nber
WHERE CBG.CONSO_DATE IS NOT NULL
)
SELECT refs, cuo, cbg_date, nber, cg_date, conso, COUNTS, CONCESS
FROM cte
WHERE COUNTS > 0;
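
The pattern of computing the count in an inner query and filtering on it in an outer WHERE can be tried on toy data. The sketch below uses Python's sqlite3 module rather than Oracle, so it only illustrates the shape of the workaround: SQLite is more lenient than Oracle about aliases and would not reproduce the original error. The tables gen and vin, their rows, and the function name are hypothetical, and the GROUP BY inside the scalar subquery is dropped because the correlated WHERE already restricts it to one group.

```python
import sqlite3


def rows_with_count_above(min_count=0):
    """Return (refs, cuo, counts) for parent rows whose child count exceeds min_count."""
    con = sqlite3.connect(":memory:")
    # Hypothetical toy tables standing in for MAD.GEN and MAD.VIN.
    con.executescript("""
        CREATE TABLE gen (refs TEXT, cuo TEXT);
        CREATE TABLE vin (refs TEXT, cuo TEXT);
        INSERT INTO gen VALUES ('A', 'X'), ('B', 'X'), ('C', 'Y');
        INSERT INTO vin VALUES ('A', 'X'), ('A', 'X'), ('C', 'Y');
    """)
    rows = con.execute("""
        WITH cte AS (
            SELECT g.refs, g.cuo,
                   (SELECT COUNT(*)
                    FROM vin v
                    WHERE v.refs = g.refs AND v.cuo = g.cuo) AS counts
            FROM gen g
        )
        SELECT refs, cuo, counts
        FROM cte
        WHERE counts > ?      -- the alias is visible here, one level up
        ORDER BY refs
    """, (min_count,)).fetchall()
    con.close()
    return rows


print(rows_with_count_above(0))
print(rows_with_count_above(1))
```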

SELECT DISTINCT, ORDER BY expressions must appear in target list

I wrote this SQL query:
SELECT DISTINCT
*, j_start_date, typeNP.j_value, g_challenge.j_title
FROM g_challenge
INNER JOIN g_challenge_catset typeNP ON g_challenge.j_row_id = typeNP.j_item_id AND typeNP.j_value IN ('jbm_5233','jbm_5232','bfr_8227')
INNER JOIN g_challenge_catset as tps ON g_challenge.j_row_id = tps.j_item_id AND tps.j_value IN ('aga_7777','aga_7778','aga_7776')
LEFT OUTER JOIN g_dbsocialgroup gdb ON g_challenge.j_dbsocialgroup_id = gdb.j_row_id || '_DBSocialGroup'
WHERE (j_start_date >= '2018-08-13' OR (j_start_date <= '2018-08-13' AND j_end_date >= '2018-08-13')) AND g_challenge.j_pstatus = 0 AND g_challenge.j_mutualiste_type = false
ORDER BY
CASE typeNP.j_value
WHEN 'bfr_8227' THEN '1'
WHEN 'jbm_5233' THEN '2'
WHEN 'jbm_5232' THEN '3'
END,
j_start_date ASC, g_challenge.j_title ASC
But I get this error: SELECT DISTINCT, ORDER BY expressions must appear in target list, and I don't know why.
Please help.

OK, I solved this by adding a GROUP BY:
GROUP BY g_challenge.j_start_date, typeNP.j_value, g_challenge.j_title, g_challenge.j_row_id, typeNP.j_item_id, tps.j_item_id, tps.j_value, gdb.j_row_id
and by removing the DISTINCT and the named columns from the SELECT list.
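
As a rough illustration of that fix, the sketch below deduplicates with GROUP BY and sorts by a CASE expression over a grouped column. The error in the question looks like PostgreSQL's rule that, with SELECT DISTINCT, every ORDER BY expression must also appear in the select list. The example runs on Python's sqlite3 module, which does not enforce that rule, so it shows only the rewritten query shape, not the error. The tables challenge and catset, their rows, and the function name are invented for the example.

```python
import sqlite3


def ordered_challenges():
    """Return (title, value) deduplicated with GROUP BY and sorted by a CASE ranking."""
    con = sqlite3.connect(":memory:")
    # Hypothetical, simplified stand-ins for g_challenge and g_challenge_catset.
    con.executescript("""
        CREATE TABLE challenge (row_id INTEGER, title TEXT, start_date TEXT);
        CREATE TABLE catset (item_id INTEGER, value TEXT);
        INSERT INTO challenge VALUES
            (1, 'Run',  '2018-09-01'),
            (2, 'Swim', '2018-08-20'),
            (3, 'Bike', '2018-08-25');
        INSERT INTO catset VALUES
            (1, 'jbm_5233'),
            (1, 'jbm_5233'),   -- duplicate row that the GROUP BY collapses
            (2, 'bfr_8227'),
            (3, 'jbm_5232');
    """)
    rows = con.execute("""
        SELECT c.title, t.value
        FROM challenge c
        JOIN catset t ON t.item_id = c.row_id
        GROUP BY c.row_id, c.title, c.start_date, t.value
        ORDER BY CASE t.value
                     WHEN 'bfr_8227' THEN 1
                     WHEN 'jbm_5233' THEN 2
                     WHEN 'jbm_5232' THEN 3
                 END,
                 c.start_date, c.title
    """).fetchall()
    con.close()
    return rows


print(ordered_challenges())
```

Because every ORDER BY expression depends only on grouped columns, the sort is well defined for each output row, which is the property the DISTINCT version could not guarantee.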

ORACLE SQL: Slow query when using "join table on id = id" vs "where id = number"

I'm having a performance problem in a query. When I use a subquery to set ID = number and then join that subquery in the main query to look up that ID, it takes about 150 seconds. But if I remove the subquery and filter on ID = number directly in the main query, it takes 0.5 seconds.
Here is some example code.
This is the 150-second version.
Here I set cto_in_codigo in the WITH clause.
WITH CONTRATOS AS (
SELECT CTO_IN_CODIGO FROM MGCAR.CAR_CONTRATO
WHERE CTO_IN_CODIGO = 14393
)
SELECT
PT.PAR_IN_CODIGO,
PTC.PARCOR_IN_INDICE
FROM (
SELECT
MAX(PT.HPAR_IN_CODIGO) OVER (PARTITION BY PT.PAR_IN_CODIGO, PT.CTO_IN_CODIGO) HPAR_IN_CODIGO_MAX,
PT.HPAR_IN_CODIGO,
PT.CTO_IN_CODIGO,
PT.PAR_IN_CODIGO
FROM
QUERIE.PARCELA_TOTAL PT
JOIN CONTRATOS CTO
ON CTO.CTO_IN_CODIGO = PT.CTO_IN_CODIGO
WHERE
PT.PAR_DT_REAJUSTE <= TO_DATE('31/12/2017', 'DD/MM/YYYY')
) PT
LEFT OUTER JOIN (
SELECT
MAX(PTC.PARCOR_IN_CODIGO) OVER (PARTITION BY PTC.PAR_IN_CODIGO, PTC.CTO_IN_CODIGO) PARCOR_IN_CODIGO_MAX,
PTC.PARCOR_IN_CODIGO,
PTC.CTO_IN_CODIGO,
PTC.PAR_IN_CODIGO,
PTC.HPAR_IN_CODIGO,
PTC.PARCOR_IN_INDICE
FROM
QUERIE.PARCELA_TOTAL_CORRECAO PTC
JOIN CONTRATOS CTO
ON CTO.CTO_IN_CODIGO = PTC.CTO_IN_CODIGO
) PTC
ON PTC.CTO_IN_CODIGO = PT.CTO_IN_CODIGO
AND PTC.PAR_IN_CODIGO = PT.PAR_IN_CODIGO
AND PTC.HPAR_IN_CODIGO = PT.HPAR_IN_CODIGO
AND PTC.PARCOR_IN_CODIGO = PTC.PARCOR_IN_CODIGO_MAX
WHERE
PT.HPAR_IN_CODIGO = PT.HPAR_IN_CODIGO_MAX
And this is the 0.5-second version.
Here I set cto_in_codigo inside each query.
SELECT
PT.PAR_IN_CODIGO,
PTC.PARCOR_IN_INDICE
FROM (
SELECT
MAX(PT.HPAR_IN_CODIGO) OVER (PARTITION BY PT.PAR_IN_CODIGO, PT.CTO_IN_CODIGO) HPAR_IN_CODIGO_MAX,
PT.HPAR_IN_CODIGO,
PT.CTO_IN_CODIGO,
PT.PAR_IN_CODIGO
FROM
QUERIE.PARCELA_TOTAL PT
WHERE
PT.PAR_DT_REAJUSTE <= TO_DATE('31/12/2017', 'dd/MM/yyyy')
AND PT.CTO_IN_CODIGO = 14393
) PT
LEFT OUTER JOIN (
SELECT
MAX(PTC.PARCOR_IN_CODIGO) OVER (PARTITION BY PTC.PAR_IN_CODIGO, PTC.CTO_IN_CODIGO) PARCOR_IN_CODIGO_MAX,
PTC.PARCOR_IN_CODIGO,
PTC.CTO_IN_CODIGO,
PTC.PAR_IN_CODIGO,
PTC.HPAR_IN_CODIGO,
PTC.PARCOR_IN_INDICE
FROM
QUERIE.PARCELA_TOTAL_CORRECAO PTC
WHERE
PTC.CTO_IN_CODIGO = 14393
) PTC
ON PTC.CTO_IN_CODIGO = PT.CTO_IN_CODIGO
AND PTC.PAR_IN_CODIGO = PT.PAR_IN_CODIGO
AND PTC.HPAR_IN_CODIGO = PT.HPAR_IN_CODIGO
AND PTC.PARCOR_IN_CODIGO = PTC.PARCOR_IN_CODIGO_MAX
WHERE
PT.HPAR_IN_CODIGO = PT.HPAR_IN_CODIGO_MAX
What confuses me is that the WITH clause returns just one row containing the cto_in_codigo number, much as if I had hard-coded it inside each query as in the second example. What could be causing this huge delay?

Query specific field based on max sequence number

I have a record set in which some rows are duplicated, specifically the last three rows. Of the four rows, the result I want contains only the first row and the last row, because for a given SARAPPD_TERM_CODE_ENTRY the record I need is the one where SARAPPD_SEQ_NO is at its maximum. So I want the first row, because for that term the sequence number peaks at one, and the last row, because for its term the sequence number peaks at six. The query is below.
select ppd.sarappd_seq_no, ppd.sarappd_term_code_entry, ppd.sarappd_apdc_code,
dap.saradap_term_code_entry,
spri.spriden_id,
t.sgbstdn_astd_code, t.*
from sgbstdn t
left join spriden spri on t.sgbstdn_pidm = spri.spriden_pidm
left join saradap dap on spri.spriden_pidm = dap.saradap_pidm
join sarappd ppd on dap.saradap_pidm = ppd.sarappd_pidm
where t.sgbstdn_astd_code not in ('AS', 'DS', 'WD', 'SU', 'LA')
and t.sgbstdn_stst_code = 'AS'
and spri.spriden_change_ind is null
and spri.spriden_id = '123456789'
and (ppd.sarappd_apdc_code = 25 or ppd.sarappd_apdc_code = 30
or ppd.sarappd_apdc_code =35)
and ppd.sarappd_term_code_entry = dap.saradap_term_code_entry
--where b.sarappd_term_code_entry = ppd.sarappd_term_code_entry)
order by ppd.sarappd_term_code_entry
I believe this is a simple "where this = (select max())" type of query, but I've tried several different things and none of them gives the results I want. Any help would be greatly appreciated. Thanks in advance.

You could use ROW_NUMBER():
WITH cte AS (
    SELECT
        ROW_NUMBER() OVER (PARTITION BY ppd.sarappd_term_code_entry
                           ORDER BY ppd.sarappd_seq_no DESC) AS rn,
        ppd.sarappd_seq_no,
        ppd.sarappd_term_code_entry,
        ppd.sarappd_apdc_code,
        dap.saradap_term_code_entry,
        spri.spriden_id,
        t.*  -- already includes sgbstdn_astd_code; listing it twice would make SELECT * ambiguous
    FROM sgbstdn t
    LEFT JOIN spriden spri ON t.sgbstdn_pidm = spri.spriden_pidm
    LEFT JOIN saradap dap ON spri.spriden_pidm = dap.saradap_pidm
    JOIN sarappd ppd ON dap.saradap_pidm = ppd.sarappd_pidm
    WHERE t.sgbstdn_astd_code NOT IN ('AS', 'DS', 'WD', 'SU', 'LA')
      AND t.sgbstdn_stst_code = 'AS'
      AND spri.spriden_change_ind IS NULL
      AND spri.spriden_id = '123456789'
      AND (ppd.sarappd_apdc_code = 25 OR ppd.sarappd_apdc_code = 30
           OR ppd.sarappd_apdc_code = 35)
      AND ppd.sarappd_term_code_entry = dap.saradap_term_code_entry
)
SELECT *
FROM cte
WHERE rn = 1
ORDER BY sarappd_term_code_entry
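
The effect of ordering by the sequence number in descending order can be checked on four toy rows shaped like the ones the question describes: one term whose sequence peaks at 1 and another whose sequence peaks at 6. This is only a sketch on Python's sqlite3 module (SQLite 3.25 or later); the term codes, decision codes, single-table layout, and function name are hypothetical.

```python
import sqlite3


def max_seq_row_per_term():
    """Return (term, seq_no, apdc_code) for the highest seq_no within each term."""
    con = sqlite3.connect(":memory:")
    # Hypothetical single-table reduction of the joined result set.
    con.executescript("""
        CREATE TABLE sarappd (term_code_entry TEXT, seq_no INTEGER, apdc_code TEXT);
        INSERT INTO sarappd VALUES
            ('201710', 1, '25'),
            ('201810', 4, '25'),
            ('201810', 5, '30'),
            ('201810', 6, '35');
    """)
    rows = con.execute("""
        WITH cte AS (
            SELECT ROW_NUMBER() OVER (PARTITION BY term_code_entry
                                      ORDER BY seq_no DESC) AS rn,
                   term_code_entry, seq_no, apdc_code
            FROM sarappd
        )
        SELECT term_code_entry, seq_no, apdc_code
        FROM cte
        WHERE rn = 1
        ORDER BY term_code_entry
    """).fetchall()
    con.close()
    return rows


print(max_seq_row_per_term())
```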

Get the rest of the row in a max group by

I'm trying to acquire the most recently passed training someone has taken. To do this, I have a view that works great
CREATE OR REPLACE FORCE VIEW MYAPP.most_recent_training (
employee_id, course_id, date_taken
) AS SELECT
who.employee_id,
course.course_id,
MAX(sess.end_date) date_taken
FROM employee_session_join esj
JOIN training_session sess on sess.session_id = esj.session_id
JOIN course_version vers on vers.version_id = sess.version_id
JOIN course course on course.course_id = vers.course_id
JOIN employee who on who.employee_id = esj.employee_id
WHERE esj.active_flag = 'Y'
AND sess.active_flag = 'Y'
AND course.active_flag = 'Y'
AND who.active_flag = 'Y'
AND esj.approval_status = 5 -- successfully passed
GROUP BY who.employee_id, course.course_id
Okay, so my query works well. Here's my problem: I also need the expiry date so I know when they go out of compliance. This is stored as a number of months on the version. But I can't add vers.valid_for_months because Oracle complains with ORA-00979: not a GROUP BY expression.
I just want to get whatever the rest of that row is. How can I do this?

I would think this would solve your problem:
SELECT who.employee_id, course.course_id,
MAX(add_months(sess.end_date, vers.valid_for_months))
That gets the latest expiry date across all of a person's sessions for the course. If you want the values from the most recent session instead, use row_number():
SELECT employee_id, course_id, end_date
FROM (SELECT who.employee_id, course.course_id, sess.end_date,
row_number() over (partition by who.employee_id, course.course_id
order by sess.end_date desc
) as seqnum
FROM employee_session_join esj
JOIN training_session sess on sess.session_id = esj.session_id
JOIN course_version vers on vers.version_id = sess.version_id
JOIN course course on course.course_id = vers.course_id
JOIN employee who on who.employee_id = esj.employee_id
WHERE esj.active_flag = 'Y'
AND sess.active_flag = 'Y'
AND course.active_flag = 'Y'
AND who.active_flag = 'Y'
AND esj.approval_status = 5 -- successfully passed
) e
WHERE seqnum = 1;
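
To show how the rest of the row, including an expiry date, comes along once the latest session is picked with row_number(), here is a small sketch on toy data. It runs on Python's sqlite3 module (SQLite 3.25 or later), so SQLite's date(..., '+N months') stands in for Oracle's ADD_MONTHS. The single flattened table, its rows, and the function name are hypothetical; the real view would keep the joins from the question.

```python
import sqlite3


def most_recent_training():
    """Return the latest session per employee and course, with its expiry date."""
    con = sqlite3.connect(":memory:")
    # Hypothetical flattened table: one row per passed session.
    con.executescript("""
        CREATE TABLE passed_session (employee_id INTEGER, course_id INTEGER,
                                     end_date TEXT, valid_for_months INTEGER);
        INSERT INTO passed_session VALUES
            (1, 7, '2017-01-15', 12),
            (1, 7, '2018-03-10', 24),   -- latest session for employee 1
            (2, 7, '2017-06-30', 12);
    """)
    rows = con.execute("""
        SELECT employee_id, course_id, end_date, valid_for_months,
               date(end_date, '+' || valid_for_months || ' months') AS expiry_date
        FROM (SELECT s.*,
                     ROW_NUMBER() OVER (PARTITION BY employee_id, course_id
                                        ORDER BY end_date DESC) AS seqnum
              FROM passed_session s)
        WHERE seqnum = 1
        ORDER BY employee_id
    """).fetchall()
    con.close()
    return rows


print(most_recent_training())
```

Because the filter is on seqnum rather than on an aggregate, any other column of the winning row, such as valid_for_months, can be selected without a GROUP BY.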
