SQL (snowflake) - how can I return 1 row from a join or use MAX in a second join from result of first - sql

I have a large query that I have pasted parts of below.
I am wanting to use the result of the first join in my second join.
What I am trying to do get the last session that has a lead_conversion then I am getting all sessions in between then and the current row
This is the part I am struggling with
left join (
select ss.id, ss.session_start, ss.lead_id
from sessions ss
inner join lead_conversions inner_lc on inner_lc.session_id = ss.id
) prev_lc
on prev_lc.lead_id = lc.lead_id
and prev_lc.session_start::TIMESTAMP < s.session_start::TIMESTAMP
left join cte_sessions reset_prev_sess
on reset_prev_sess.lead_id = lc.lead_id
and reset_prev_sess.session_start::TIMESTAMP <= s.session_start::TIMESTAMP
and (
prev_lc.session_start::TIMESTAMP IS NULL
OR
reset_prev_sess.session_start::TIMESTAMP > prev_lc.session_start::TIMESTAMP
)
my issue is I cant just fetch the last prev_lc and I cant seem to use max(prev_lc.session_start)
I have tried grouping in first select and using max but this does not work as I believe this is ran before the on
left join (
select max(ss.session_start) as session_start, max(ss.lead_id) as lead_id
from sessions ss
inner join lead_conversions inner_lc on inner_lc.session_id = ss.id
group by inner_lc.id
) prev_lc on prev_lc.lead_id = lc.lead_id
I have also tried using max in the second join but this give the error
SQL compilation error: Invalid aggregate function in ON clause [MAX(CAST(PREV_LC.SESSION_START AS TIMESTAMP_NTZ(9)))]
left join cte_sessions reset_prev_sess
on reset_prev_sess.lead_id = lc.lead_id
and reset_prev_sess.session_start::TIMESTAMP <= s.session_start::TIMESTAMP
and (
prev_lc.session_start::TIMESTAMP IS NULL
OR
reset_prev_sess.session_start::TIMESTAMP > max(prev_lc.session_start::TIMESTAMP)
)
any help with this would be very appreciated
Thank you

if I understand correctly you are looking for to join with the last session start,so what you can do is to order by startsession in your subquery and limit to 1 record:
left join (
select ss.id, ss.session_start, ss.lead_id
from sessions ss
inner join lead_conversions inner_lc on inner_lc.session_id = ss.id
order by ss.session_start desc
limit 1
) prev_lc
the rest of query stays untouched.

So I have found a solution for this if any one comes across this. I ended up just rethinking how I go about it.
I ended up adding a row number for each conversion
with cte_sessions as (
select
s.id
,s.lead_id
,s.session_start::TIMESTAMP as session_start
,CASE WHEN MAX(lc.id) IS NOT NULL
then ROW_NUMBER() over (partition by s.lead_id, (CASE WHEN
MAX(lc.id) IS NOT NULL then 1 else 0 end)
order by s.session_start
)
END as conversion_row
from sessions s
left join lead_conversions lc on lc.session_id = s.id
group by s.id, s.session_start, s.lead_id, s.project_id, s.crawler_id
order by s.session_start
)
The I just did this in the join
left join cte_sessions prev_lc on prev_lc.lead_id = lc.lead_id and prev_lc.conversion_row = s.conversion_row - 1

Related

LEFT JOIN & SUM GROUP BY

EDIT:
The result supposed to be like this:
desired result
I have this query:
SELECT DISTINCT mitarbeiter.mitarbnr, mitarbeiter.login, mitarbeiter.name1, mitarbeiter.name2
FROM vertragspos
left join vertrag_ek_vk_zuord ON vertragspos.id = vertrag_ek_vk_zuord.ek_vertragspos_id
left join mitarbeiter ON vertrag_ek_vk_zuord.anlage_mitarbnr = mitarbeiter.mitarbnr
left join vertragskopf ON vertragskopf.id = vertragspos.vertrag_id
left join
(
SELECT wkurse.*, fremdwaehrung.wsymbol
FROM wkurse
INNER join
(
SELECT lfdnr, Max(tag) AS maxTag
FROM wkurse
WHERE tag < SYSDATE
GROUP BY lfdnr
) t1
ON wkurse.lfdnr = t1.lfdnr AND wkurse.Tag = t1.maxTag
INNER JOIN fremdwaehrung ON wkurse.lfdnr = fremdwaehrung.lfdnr
) wkurse ON vertragskopf.blfdwaehrung = wkurse.lfdnr
left join
(
SELECT vertrag_ID, Sum (preis) preis, Sum (menge) menge, Sum (preis * menge / Decode (vertragskopf.zahlintervall, 1,1,2,2,3,3,4,6,5,12,1) / wkurse.kurs) vertragswert
FROM vertragspos
GROUP BY vertrag_ID
) s ON vertragskopf.id = s.vertrag_id
But I always get an error on line 21 Pos 145:
ORA-00904 WKURSE.KURS invalid identifier
The WKURSE table is supposed be joined already above, but why do I still get error?
How can I do join with all these tables?
I need to join all these tables:
Mitarbeiter, Vertragspos, vertrag_ek_vk_zuord, wkurse, fremdwaehrung, vertragskopf.
What is the right syntax? I'm using SQL Tool 1,8 b38
Thank you.
Because LEFT JOIN is executed on entire dataset, and not in row-by-row manner. So there's no wkurse.kurs available in the execution context of subquery. Since you join that tables, you can place the calculation in the top-most select statement.
EDIT:
After you edited the statement, it became clear where does vertragskopf.zahlintervall came from. But I don't know where are you going to use calculated vertragswert (now it is absent in the query), so I've put it in the result. As I'm not a SQL parser and have no idea of your tables, so I cannot check the code, but calculation now can be resolved (all the values are available in calculation context).
SELECT DISTINCT mitarbeiter.mitarbnr, mitarbeiter.login, mitarbeiter.name1, mitarbeiter.name2, s.amount / Decode (vertragskopf.zahlintervall, 1,1,2,2,3,3,4,6,5,12,1) / wkurse.kurs) vertragswert
FROM vertragspos
left join vertrag_ek_vk_zuord ON vertragspos.id = vertrag_ek_vk_zuord.ek_vertragspos_id
left join mitarbeiter ON vertrag_ek_vk_zuord.anlage_mitarbnr = mitarbeiter.mitarbnr
left join vertragskopf ON vertragskopf.id = vertragspos.vertrag_id
left join (
SELECT wkurse.*, fremdwaehrung.wsymbol
FROM wkurse
INNER join (
SELECT lfdnr, Max(tag) AS maxTag
FROM wkurse
WHERE tag < SYSDATE
GROUP BY lfdnr
) t1
ON wkurse.lfdnr = t1.lfdnr AND wkurse.Tag = t1.maxTag
INNER JOIN fremdwaehrung ON wkurse.lfdnr = fremdwaehrung.lfdnr
) wkurse ON vertragskopf.blfdwaehrung = wkurse.lfdnr
left join (
SELECT vertrag_ID, Sum (preis) preis, Sum (menge) menge, Sum (preis * menge) as amount
FROM vertragspos
GROUP BY vertrag_ID
) s ON vertragskopf.id = s.vertrag_id
Rewriting the code using WITH clause makes it much clearer than select from select.
Also get the rate on last day before today in oracle is as simple as
select wkurse.lfdnr
, max(wkurse.kurs) keep (dense_rank first order by wkurse.tag desc) as rate
from wkurse
where tag < sysdate
group by wkurse.lfdnr
One option is a lateral join:
left join lateral
(SELECT vertrag_ID, Sum(preis) as preis, Sum(menge) as menge,
Sum (preis * menge / Decode (vertragskopf.zahlintervall, 1,1,2,2,3,3,4,6,5,12,1) / wkurse.kurs) vertragswert
FROM vertragspos
GROUP BY vertrag_ID
) s
ON vertragskopf.id = s.vertrag_id

Subquery SQL DB2

I am trying to create a subquery (for a particular column) inside my base query. the code is as follows.
SELECT z.po_id,
max
(
select etcdc.ship_evnt_tms
FROM covinfos.shipment_event etcdc
WHERE etcdc.ship_evnt_cd = '9P'
AND etcdc.ship_id=scdc.ship_id
ORDER BY etcdc.updt_job_tms desc
FETCH first ROW only) AS llp_estimated_delivery_cdc
FROM covinfos.ibm_plant_order z
LEFT JOIN covinfos.ipo_line_to_case a
ON z.po_id = a.po_id
LEFT JOIN covinfos.shipment scdc
ON (
a.ship_id = scdc.ship_id
AND a.ship_to_loc_code = scdc.ship_to_loc_code
AND scdc.loc_type = 'CDC')
GROUP BY z.po_id
There seems to be some kind of typo somewhere based on the error message that pops up when I try to run the code.
BIC00004. DAL01008. An error occurred while accessing the database.
ILLEGAL SYMBOL ".". SOME SYMBOLS THAT MIGHT BE LEGAL ARE: , ). SQLCODE=-104,
SQLSTATE=42601, DRIVER=3.62.56; THE CURSOR SQL_CURLH200C1 IS NOT IN A
PREPARED STATE. SQLCODE=-514, SQLSTATE=26501, DRIVER=3.62.56
However, at plain sight or at least my sight, there is nothing that spots the misstake. Furthermore, running the subselect in a blank sheet (outside the base query, new one) it does correctly.
Thanks
You would probably be best to remove the co-related sub-select, and just join to a plain sub-select. E.g.
SELECT z.po_id,
max(ship_evnt_tms) AS llp_estimated_delivery_cdc
FROM covinfos.ibm_plant_order z
LEFT JOIN covinfos.ipo_line_to_case a
ON z.po_id = a.po_id
LEFT JOIN covinfos.shipment scdc
ON a.ship_id = scdc.ship_id
AND a.ship_to_loc_code = scdc.ship_to_loc_code
AND scdc.loc_type = 'CDC'
LEFT JOIN
( select ship_id
, ship_evnt_tms
FROM
( select ship_id
, ship_evnt_tms
, row_number() over(partition by ship_id order by updt_job_tms desc) as RN
FROM covinfos.shipment_event
WHERE ship_evnt_cd = '9P'
) s
WHERE RN = 1
) AS etcdc
ON etcdc.ship_id=scdc.ship_id
GROUP BY z.po_id
P.S. you could just INNER JOINs unless you want to include po_id's with no ship_evnt_tms
Try adding parentheses around the sub-select. At least this then parses Data Studio using z/OS validation
SELECT z.po_id,
max
((
select etcdc.ship_evnt_tms
FROM covinfos.shipment_event etcdc
WHERE etcdc.ship_evnt_cd = '9P'
AND etcdc.ship_id=scdc.ship_id
ORDER BY etcdc.updt_job_tms desc
FETCH first ROW only)) AS llp_estimated_delivery_cdc
FROM covinfos.ibm_plant_order z
LEFT JOIN covinfos.ipo_line_to_case a
ON z.po_id = a.po_id
LEFT JOIN covinfos.shipment scdc
ON (
a.ship_id = scdc.ship_id
AND a.ship_to_loc_code = scdc.ship_to_loc_code
AND scdc.loc_type = 'CDC')
GROUP BY z.po_id
Still, I'm not sure this is a very nice bit of SQL code.

select top 1 in subquery?

I'm trying to get a top 1 returned for each code in this query but this is giving me syntax errors. Any ideas what I can do?
SELECT d.doc_no,
d.doc_short_name,
d.doc_name,
r.revision_code,
r.revision,
r.description,
dtg.group_name,
(SELECT top 1 di.index_user_id
FROM document_instance
WHERE doc_no = d.doc_no
ORDER BY entry_time DESC) AS 'last indexed by'
FROM documents d
LEFT JOIN document_revision r ON r.doc_no = d.doc_no
LEFT JOIN document_instance di ON di.doc_no = d.doc_no
LEFT JOIN document_type_group dtg ON dtg.doc_no = d.doc_no
Assuming Sybase ASE, the top # clause is only supported in derived tables, ie, it's not supported in correlated sub-queries like you're attempting.
Also note that order by is not supported in any sub-queries (derived table or correlated).
If I'm reading the query correctly, you want the index_user_id for the record with the max(entry_time); if this is correct, and assuming a single record is returned for a given doc_no/entry_time combo, you could try something like:
SELECT d.doc_no,
d.doc_short_name,
d.doc_name,
r.revision_code,
r.revision,
r.description,
dtg.group_name,
(SELECT d2.index_user_id
FROM document_instance d2
WHERE d2.doc_no = d1.doc_no
and d2.entry_time = (select max(d3.entry_time)
from document_instance d3
where d3.doc_no = d1.doc_no)
) as 'last indexed by'
FROM documents d
LEFT JOIN document_revision r ON r.doc_no = d.doc_no
LEFT JOIN document_instance d1 ON d1.doc_no = d.doc_no
LEFT JOIN document_type_group dtg ON dtg.doc_no = d.doc_no

Problems with joins and a query

Edit: Deleted my old message cause it was confusing. And i can't answer my own question for now.
Found, problem come from GROUP BY.
After some researches, i found that we can't use GROUP BY for group a column inside grouped rows.
So this work as expected :
SELECT candidats.*
, AVG(test_results.rate_good_answer) AS toto
FROM "candidats"
LEFT JOIN "sessiontests"
ON "sessiontests"."candidat_id" = "candidats"."id"
LEFT JOIN "test_results"
ON "test_results"."sessiontest_id" = "sessiontests"."id"
LEFT JOIN "questionnaires"
ON "questionnaires"."id" = "test_results"."questionnaire_id"
WHERE (sessiontests.terminate = 't' )
AND ("questionnaires"."category" LIKE '%java%' )
GROUP BY candidats.id
ORDER BY toto
But this will grouped only my column in test_results :
SELECT candidats.*,
FROM "candidats"
LEFT JOIN "sessiontests"
ON "sessiontests"."candidat_id" = "candidats"."id"
LEFT JOIN "test_results"
ON "test_results"."sessiontest_id" = "sessiontests"."id"
LEFT JOIN "questionnaires"
ON "questionnaires"."id" = "test_results"."questionnaire_id"
WHERE (sessiontests.terminate = 't' )
AND ("questionnaires"."category" LIKE '%java%' )
GROUP BY candidats.id, test_results.rate_good_answer
ORDER BY AVG(test_results.rate_good_answer)
Edit 3 :
My problem was the second query was returning each different test_results row for my candidats, whereas i expected it to return me one line per candidat.
First query, is the answer, and it works nice.
SELECT candidats.*
, AVG(test_results.rate_good_answer) AS toto
FROM "candidats"
LEFT JOIN "sessiontests"
ON "sessiontests"."candidat_id" = "candidats"."id"
LEFT JOIN "test_results"
ON "test_results"."sessiontest_id" = "sessiontests"."id"
LEFT JOIN "questionnaires"
ON "questionnaires"."id" = "test_results"."questionnaire_id"
WHERE (sessiontests.terminate = 't' )
AND ("questionnaires"."category" LIKE '%java%' )
GROUP BY candidats.id
ORDER BY toto

Max date in view on left outer join

Thanks to a previous question, I found out how to pull the most recent data based on a linked table. BUT, now I have a related question.
The solution that I found used row_number() and PARTITION to pull the most recent set of data. But what if there's a possibility for zero or more rows in a linked table in the view? For example, the table FollowUpDate might have 0 rows, or 1, or more. I just want the most recent FollowUpDate:
SELECT
EFD.FormId
,EFD.StatusName
,MAX(EFD.ActionDate)
,EFT.Name AS FormType
,ECOA.Account AS ChargeOffAccount
,ERC.Name AS ReasonCode
,EAC.Description AS ApprovalCode
,MAX(EFU.FollowUpDate) AS FollowUpDate
FROM (
SELECT EF.FormId, EFD.ActionDate, EFS.Name AS StatusName, EF.FormTypeId, EF.ChargeOffId, EF.ReasonCodeId, EF.ApprovalCodeId,
row_number() OVER ( PARTITION BY EF.FormId ORDER BY EFD.ActionDate DESC ) DateSortKey
FROM Extension.FormDate EFD INNER JOIN Extension.Form EF ON EFD.FormId = EF.FormId INNER JOIN Extension.FormStatus EFS ON EFD.StatusId = EFS.StatusId
) EFD
INNER JOIN Extension.FormType EFT ON EFD.FormTypeId = EFT.FormTypeId
LEFT OUTER JOIN Extension.ChargeOffAccount ECOA ON EFD.ChargeOffId = ECOA.ChargeOffId
LEFT OUTER JOIN Extension.ReasonCode ERC ON EFD.ReasonCodeId = ERC.ReasonCodeId
LEFT OUTER JOIN Extension.ApprovalCode EAC ON EFD.ApprovalCodeId = EAC.ApprovalCodeId
LEFT OUTER JOIN (Select EFU.FormId, EFU.FollowUpDate, row_number() OVER (PARTITION BY EFU.FormId ORDER BY EFU.FollowUpDate DESC) FUDateSortKey FROM Extension.FormFollowUp EFU INNER JOIN Extension.Form EF ON EFU.FormId = EF.FormId) EFU ON EFD.FormId = EFU.FormId
WHERE EFD.DateSortKey = 1
GROUP BY
EFD.FormId, EFD.ActionDate, EFD.StatusName, EFT.Name, ECOA.Account, ERC.Name, EAC.Description, EFU.FollowUpDate
ORDER BY
EFD.FormId
If I do a similar pull using row_number() and PARTITION, I get the data only if there is at least one row in FollowUpDate. Kinda defeats the purpose of a LEFT OUTER JOIN. Can anyone help me get this working?
I rewrote your query - you had unnecessary subselects, and used row_number() for the FUDateSortKey but didn't use the column:
SELECT t.formid,
t.statusname,
MAX(t.actiondate) 'actiondate',
t.formtype,
t.chargeoffaccount,
t.reasoncode,
t.approvalcode,
MAX(t.followupdate) 'followupdate'
FROM (
SELECT t.formid,
fs.name 'StatusName',
t.actiondate,
ft.name 'formtype',
coa.account 'ChargeOffAccount',
rc.name 'ReasonCode',
ac.description 'ApprovalCode',
ffu.followupdate,
row_number() OVER (PARTITION BY ef.formid ORDER BY t.actiondate DESC) 'DateSortKey'
FROM EXTENSION.FORMDATE t
JOIN EXTENSION.FORM ef ON ef.formid = t.formid
JOIN EXTENSION.FORMSTATUS fs ON fs.statusid = t.statusid
JOIN EXTENSION.FORMTYPE ft ON ft.formtypeid = ef.formtypeid
LEFT JOIN EXTENSION.CHARGEOFFACCOUNT coa ON coa.chargeoffid = ef.chargeoffid
LEFT JOIN EXTENSION.REASONCODE rc ON rc.reasoncodeid = ef.reasoncodeid
LEFT JOIN EXTENSION.APPROVALCODE ac ON ac.approvalcodeid = ef.approvalcodeid
LEFT JOIN EXTENSION.FORMFOLLOWUP ffu ON ffu.formid = t.formid) t
WHERE t.datesortkey = 1
GROUP BY t.formid, t.statusname, t.formtype, t.chargeoffaccount, t.reasoncode, t.approvalcode
ORDER BY t.formid
The change I made to allow for FollowUpDate was to use a LEFT JOIN onto the FORMFOLLOWUP table - you were doing an INNER JOIN, so you'd only get rows with FORMFOLLOWUP records associated.
It's pretty hard to guess what's going on without table definitions and sample data.
Also, this is confusing: "the table FollowUpDate might have 0 rows" and you "want the most recent FollowUpDate." (especially when there is no table named FollowUpDate) There is no "most recent FollowUpDate" if there are zero FollowUpDates.
Maybe you want
WHERE <follow up date row number> in (1,NULL)
I figured it out. And as usual, I need a nap. I just needed to change my subselect to something I would swear I'd tried with no success:
SELECT field1, field2
FROM Table1 t1
LEFT JOIN (
SELECT field3, max(dateColumn)
FROM Table2
GROUP BY
field3
) t2
ON (t1.field1 = t2.field3)