How to run multiple queries one after other in pyspark? - dataframe

Hi I have 3 queries to run one after other and generate the output. I am trying to run them but getting error. Could you please help me how can I run it?
Example:
UPDATE vw_usr_hist set end_date = current_timestamp where id in (select u.id from vw_usr u,vw_usr_hist uh where u.id = uh.id and (uh.is_active <> u.is_active or nvl(uh.dept,'x') <> nvl(u.dept,'x')) and uh.end_date is null) and end_date is null;INSERT INTO vw_usr_hist (id,is_active,dept, start_date) select u.id,u.is_active,u.dept,u.created_dt as start_date from vw_usr u left outer join (select id,is_active,dept,start_date from (select * from vw_usr_hist where end_date is null)) uh on u.id = uh.id where not exists (select 1 from vw_usr_hist where uh.id = u.id and uh.is_active = u.is_active and nvl(uh.dept,'x') = nvl(u.dept,'x'));select * from vw_usr_hist;
All 3 queries separated by ;. created temporary view(vw_usr & vw_usr_hist) and update & insert need to do into these 2 views and in last need to run select query to generate the output. Is it possible?
I am using below command to run the sql-
df = spark.sql(query)

Related

Trying to update a field conditionally in SQL Stored Procedure

I have a procedure that populates two sets of application information into the same fields. First the fields are filled out with applicable accounts from group "A" and then the same process happens for group "B" accounts.
Most of the group B fields are filled in by a insert/select statement. However, the query to select "account number" is a little more complex and that is in an UPDATE statement. I will paste the code below but I cannot get it to properly update the rows (for group B) with account numbers, despite the fact the query works on its own outside the procedure (essentially, the account numbers do exist).
Any idea why? I tried adding a case statement to single out group B rows (the where clause is hardcoded for group B... e.g. clfcode = 3) but that didn't work. Let me know if you need more information. I haven't much experience with update statements in stored procedures.
update src
set account_key = case when src.clfcode = 3 and src.branch_key = 12 then a.account_key else src.account_key end
from #src_table src
inner join SDFDW_Landing.cu.FICS_ms_Investor_Loan l
on l.loan_id = src.application_number
left join dm.dim_product p
on p.product_key = src.product_key
left join (
Select Distinct t.PARENTACCOUNT, t.USERCHAR1 as loan_id
from SDFDW_Landing.dbo.tracking t
where t.TYPE = 1
and t.ProcessDate = #v_max_last_processed_date
and t.USERCHAR1 is not null
) t on t.loan_id = l.loan_id
left join dm.dim_account a
on t.PARENTACCOUNT = a.account_nkey
WHERE p.bdw_report_category = 'Mortgage'
and l.processdate = #v_max_last_processed_date
The join on a subquery might cause the issue. You could try to replace it with an apply and see if that helps.
update
src
set
account_key =
case
when
src.clfcode = 3
and src.branch_key = 12
then
a.account_key
else
src.account_key
end
from
#src_table src
inner join
SDFDW_Landing.cu.FICS_ms_Investor_Loan l
on l.loan_id = src.application_number
left join
dm.dim_product p
on p.product_key = src.product_key
outer apply (
Select
acc.*
from
dm.dim_account acc
inner join
SDFDW_Landing.dbo.tracking t
on acc.account_nkey = t.parentaccount
where
t.TYPE = 1
and t.ProcessDate = #v_max_last_processed_date
and t.USERCHAR1 is not null
and t.loan_id = l.loan_id
) a
WHERE
p.bdw_report_category = 'Mortgage'
and l.processdate = #v_max_last_processed_date
alternatively since you are already within a stored procedure, I'd populate a temp table with the data from your subquery and simply join on that temp table from your update statement.

how to set expression variable in query select oracle sql

I have Oracle SQL like these :
SELECT
z."date", z.id_outlet as idOutlet, z.name as outletName, z.matClass, z.targetBulanan, z.targetBulanan/totalVisit as targetAwal,
z.actual,rownumber = tartot + rownumber as targetTotal
FROM (SELECT
b.visit_date as "date", a.id_outlet, max(o.name) as name, max(a.target_sales) as targetBulanan, a.id_material_class as matClass,
max(x.totalVisit) as totalVisit, NVL(SUM(d.billing_value),0) as actual
FROM (
select * from target_bulanan
where deleted = 0 and enabled = 1 and id_salesman = :id_salesman AND id_material_class like :id_material_class AND id_outlet like :id_outlet AND month = TO_NUMBER(TO_CHAR(current_date,'mm')) and year = to_number(TO_CHAR(current_date,'YYYY'))
) a
INNER JOIN outlet o ON o.id_outlet = a.id_outlet
LEFT JOIN visit_plan b ON b.deleted = 0 and a.id_salesman = b.id_salesman AND a.month = TO_NUMBER(TO_CHAR(b.visit_date,'mm')) AND a.year = to_number(TO_CHAR(b.visit_date,'yyyy')) AND a.id_outlet = b.id_outlet
LEFT JOIN so_header c ON SUBSTR(c.id_to,'0',1) = 'TO' AND a.id_salesman = c.id_salesman AND a.id_outlet = c.id_outlet
LEFT JOIN assign_billing d ON c.no_so_sap = d.no_so_sap AND d.billing_date = b.visit_date AND a.id_material_class = (SELECT id_material_class FROM material WHERE id = d.id_material)
LEFt JOIN (SELECT id_salesman, to_char(visit_date,'mm') as month, to_char(visit_date,'yyyy') as year, id_outlet, COUNT(*) as totalVisit FROM visit_plan
WHERE deleted = 0
group by id_salesman, id_outlet,to_char(visit_date,'mm'), to_char(visit_date,'yyyy')) x on
x.id_salesman = a.id_salesman AND x.month = a.month AND x.year = a.year AND x.id_outlet = a.id_outlet
GROUP BY b.visit_date, a.id_outlet, a.id_material_class) z
CROSS JOIN (SELECT 0 as rownumber FROM DUAL ) r
CROSS JOIN (SELECT 0 as tartot FROM DUAL ) t
CROSS JOIN (SELECT '' as mat FROM DUAL ) m
CROSS JOIN (SELECT '' as outlet FROM DUAL ) o
ORDER by outletName, z.matClass, z."date"
I want value of rownumber is formula in my select query but the result is error with this message
ORA-00923: FROM keyword not found where expected
00923. 00000 - "FROM keyword not found where expected"
Anyone can help me ? thanks
Just for enumeration -
replace the line
rownumber = rownumber + 1 AS row_number
with this
rownum AS row_number
rownum is an Oracle inbuilt function that enumerates each record of the result set and with auto increments
As mentioned by Gordon Linoff in his answer, there are further problems in your query.
At the first look (without executing it), I could list the problematic lines -
AND month = TO_NUMBER(TO_CHAR(current_date,'mm'))
AND year = to_number(TO_CHAR(current_date,'YYYY'))
Instead of current_date use sysdate
LEFT JOIN so_header c ON SUBSTR(c.id_to,'0',1) = 'TO'
I guess, you meant to do this -
LEFT JOIN so_header c ON SUBSTR(c.id_to,0,2) = 'TO'
i.e. substring from index 0 upto 2 characters
Plus, no need of those cross joins
THIS ADDRESSES THE ORIGINAL QUESTION.
You may have multiple problems in your query. After all, the best way to debug and to write queries is to start simple and gradually add complexity.
But, you do have an obvious error. In your outermost select:
SELECT z."date", z.id_outlet as idOutlet, z.name as outletName,
z.matClass, z.targetBulanan, z.targetBulanan/totalVisit as targetAwal,
z.actual,
rownumber = rownumber + 1 as row_number
The = is not Oracle syntax -- it looks like a SQL Server extension for naming a column or a MySQL use of variables.
I suspect that you want to enumerate the rows. If so, one syntax is row_number():
SELECT z."date", z.id_outlet as idOutlet, z.name as outletName,
z.matClass, z.targetBulanan, z.targetBulanan/totalVisit as targetAwal,
z.actual,
row_number() over (order by outletName, z.matClass, z."date") as row_number
In Oracle, you could also do:
rownum as row_number

SQL Most Recent Register FROM Second Table by Id

I have 2 tables (Opportunity and Stage). I need to get each opportunity with the most recent stage by StageTypeId.
Opportunity: Id, etc
Stage: Id, CreatedOn, OpportunityId, StageTypeId.
Let's suppose I have "opportunity1" and "opportunity2" each one with many Stages added.
By passing the StageTypeId I need to get the opportunity which has this StageTypeId as most recent.
I'm trying the following query but it´s replicating the same Stage for all the Opportunities.
It seems that it's ignoring this line: "AND {Stage}.[OpportunityId] = ID"
SELECT {Opportunity}.[Id] ID,
{Opportunity}.[Name],
{Opportunity}.[PotentialAmount],
{Contact}.[FirstName],
{Contact}.[LastName],
(SELECT * FROM
(
SELECT {Stage}.[StageTypeId]
FROM {Stage}
WHERE {Stage}.[StageTypeId] = #StageTypeId
AND {Stage}.[OpportunityId] = ID
ORDER BY {Stage}.[CreatedOn] DESC
)
WHERE ROWNUM = 1) AS StageTypeId
FROM {Opportunity}
LEFT JOIN {Contact}
ON {Opportunity}.[ContactId] = {Contact}.[Id]
Thank you
Most of DBMS support fetch first clause So, you can do :
select o.*
from Opportunity o
where o.StageTypeId = (select s.StageTypeId
from Stage s
where s.OpportunityId = o.id
order by s.CreatedOn desc
fetch first 1 rows only
);
you can try below way all dbms will support
select TT*. ,o*. from
(
select s1.OpportunityId,t.StageTypeId from Stage s1 inner join
(select StageTypeId,max(CreatedOn) as createdate Stage s
group by StageTypeId
) t
on s1.StageTypeId=t.StageTypeId and s1.CreatedOn=t.createdate
) as TT inner join Opportunity o on TT.OpportunityId=o.id

How to rewrite a SQL query that contains repeated SubQuery?

Could anyone help me to rewrite a SQL query as below in short ? There is a subquery to be repeated.
update policy
set totalvehicles = (
select count(*) from riskunit
where riskunit.policyId = policy.id
and riskunit.subtype = 7)
where policy.verified = '1'
and policy.Totalvehicles <(
select count(*)
from riskunit
where riskunit.policyId = policy.id
and riskunit.subtype = 7
);
Thanks !!
I prefer this because it's easy to insert a select above the from and see what would be changed.
UPDATE p
SET totalvehicles = cnt.[Count]
FROM policy p
INNER JOIN (
SELECT
policyId,COUNT(*) [Count]
FROM riskunit
WHERE ru.subtype = 7
GROUP BY policyId
) cnt on cnt.policyId=p.policyId
WHERE p.verified = '1'
AND p.Totalvehicles < cnt.[Count]
This should work (assuming MySQL, also working for Oracle):
update policy p
inner join (
select policyId, count(*) as n from riskunit
where riskunit.subtype = 7
group by policyId
) ru on ru.policyId = p.id
set p.totalvehicles = ru.n
where p.verified = '1'
and p.Totalvehicles < ru.n;

SQL JOIN Statement

Lets say I have a table e.g
Request No. Type Status
---------------------------
1 New Renewed
and then another table
Action ID Request No LastUpdated
------------------------------------
1 1 06-10-2010
2 1 07-14-2010
3 1 09-30-2010
How can I join the second table with the first table but only get the latest record from the second table(e.g Last Updated DESC)
SELECT T1.RequestNo ,
T1.Type ,
T1.Status,
T2.ActionId ,
T2.LastUpdated
FROM TABLE1 T1
JOIN TABLE2 T2
ON T1.RequestNo = T2.RequestNo
WHERE NOT EXISTS
(SELECT *
FROM TABLE2 T2B
WHERE T2B.RequestNo = T2.RequestNo
AND T2B.LastUpdated > T2.LastUpdated
)
Using aggregates:
SELECT r.*, re.*
FROM REQUESTS r
JOIN REQUEST_EVENTS re ON re.request_no = r.request_no
JOIN (SELECT t.request_no,
MAX(t.lastupdated) AS latest
FROM REQUEST_EVENTS t
GROUP BY t.request_no) x ON x.request_no = re.request_no
AND x.latest = re.lastupdated
Using LEFT JOIN & NOT EXISTS:
SELECT r.*, re.*
FROM REQUESTS r
JOIN REQUEST_EVENTS re ON re.request_no = r.request_no
WHERE NOT EXISTS(SELECT NULL
FROM REQUEST_EVENTS re2
WHERE re2.request_no = r2.request_no
AND re2.LastUpdated > re.LastUpdated)
SELECT *
FROM REQUEST, ACTION
WHERE REQUEST.REQUESTNO = ACTION.REQUESTNO --Joining here
AND ACTION.LastUpdated = (SELECT MAX(LastUpdated) FROM ACTION WHERE REQUEST.REQUESTNO = ACTION.REQUESTNO);
A sub-query is used to get the last updated record's date and matches against itself to prevent the other records being joined.
Granted, depending on how precise the LastUpdated field is, it can have problems with two records being updated on the same date, but that is a problem encountered in any other implementation, so the precision would have to be increased or some other logic would have to be in place or another distinguishing characteristic to prevent multiple rows being returned.
SELECT r.RequestNo, r.Type, r.Status, a.ActionID, MAX(a.LastUpdated)
FROM Request r
INNER JOIN Action a ON r.RequestNo = a.RequestNo
GROUP BY r.RequestNo, r.Type, r.Status, a.ActionID
We can use the operation Top 1 with ORDER BY clause. For instance, if your tables are RequestTable(ID,Type,Status) and ActionTable(ActionID,RequestID,LastUpdated), the query will be like this:
Select Top 1 rq.ID, rq.Status, at.ActionID
From RequestTable as rq
JOIN ActionTable as at ON rq.ID = at.RequestID
Order by at.LastUpdated DESC