I have the following SQL:
with q1 (Tdata) as (
SELECT XMLtype(transportdata, nls_charset_id('AL32UTF8'))
from bph_owner.paymentinterchange pint
where --PINT.INCOMING = 'T' and
PINT.TRANSPORTTIME >= to_date('2022-08-10', 'yyyy-mm-dd')
and pint.fileformat = 'pain.001.001.03'
)
--select XMLQuery('//*:GrpHdr/*:InitgPty/Nm/text()'
select tdata, XMLQuery('//*:GrpHdr/*:CtrlSum/text()'
passing Tdata
returning content).getstringval()
,
XMLQuery('//*:GrpHdr/*:MsgId/text()'
passing Tdata
returning content).getstringval()
from q1;
This works for CtrlSum and MsgId, but not for InitgPty/Nm. Does anybody know how I can extract this information?
Please be gentle as I don't work with XML much.
Thanks
Based on sample data from a previous question and others relating to this XML schema, it appears you just need to wildcard the namespace for the Nm node, as well as its parents:
'//*:GrpHdr/*:InitgPty/*:Nm/text()'
so the query could become:
with q1 (Tdata) as (
SELECT XMLtype(transportdata, nls_charset_id('AL32UTF8'))
from paymentinterchange pint
where --PINT.INCOMING = 'T' and
PINT.TRANSPORTTIME >= to_date('2022-08-10', 'yyyy-mm-dd')
and pint.fileformat = 'pain.001.001.03'
)
select
XMLQuery('//*:GrpHdr/*:InitgPty/*:Nm/text()'
passing Tdata
returning content).getstringval() as nm
,
XMLQuery('//*:GrpHdr/*:CtrlSum/text()'
passing Tdata
returning content).getstringval() as ctrlsum
,
XMLQuery('//*:GrpHdr/*:MsgId/text()'
passing Tdata
returning content).getstringval() as msgid
from q1;
But if you're pulling multiple values out of the same document then it's simpler to use XMLTable rather than multiple XMLQuery calls:
select x.nm, x.ctrlsum, x.msgid
from paymentinterchange pint
cross apply XMLTable(
'//*:GrpHdr'
passing XMLtype(pint.transportdata, nls_charset_id('AL32UTF8'))
columns nm varchar2(50) path '*:InitgPty/*:Nm',
ctrlsum number path '*:CtrlSum',
msgid varchar2(50) path '*:MsgId'
) x
where pint.transporttime >= date '2022-08-10'
and pint.fileformat = 'pain.001.001.03';
db<>fiddle with data adapted from a previous question.
Before Oracle 12c you can use cross join instead of cross apply. I don't think it would make a difference in later versions either, that just seems to be my default now...
select x.nm, x.ctrlsum, x.msgid
from paymentinterchange pint
cross join XMLTable(
'//*:GrpHdr'
passing XMLtype(pint.transportdata, nls_charset_id('AL32UTF8'))
columns nm varchar2(50) path '//*:InitgPty/*:Nm',
ctrlsum number path '//*:CtrlSum',
msgid varchar2(50) path '//*:MsgId'
) x
where pint.transporttime >= date '2022-08-12'
and pint.fileformat = 'pain.001.001.03';
db<>fiddle showing it in 11.2.0.2, where it seems to need the // at the start of all the paths, including in the columns clause - those aren't needed in later versions, but here it returns nulls without them.
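To try the namespace wildcard without access to the real table, a cut-down document can be passed inline. This is only a sketch: the XML below is a made-up fragment (the element values are invented), and it assumes the document declares the pain.001.001.03 default namespace, which is why the unprefixed Nm step is expected to come back empty while *:Nm finds the name.

```sql
-- Hypothetical, cut-down pain.001 document; MSG-1, 100.50 and Example Ltd are invented values.
with q1 (tdata) as (
  select XMLType(
    '<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.03">
       <CstmrCdtTrfInitn>
         <GrpHdr>
           <MsgId>MSG-1</MsgId>
           <CtrlSum>100.50</CtrlSum>
           <InitgPty><Nm>Example Ltd</Nm></InitgPty>
         </GrpHdr>
       </CstmrCdtTrfInitn>
     </Document>')
  from dual
)
select
  -- Nm without a prefix only matches an element in no namespace
  XMLQuery('//*:GrpHdr/*:InitgPty/Nm/text()'
           passing tdata returning content).getstringval() as nm_without_wildcard,
  -- *:Nm matches the element whatever namespace it is in
  XMLQuery('//*:GrpHdr/*:InitgPty/*:Nm/text()'
           passing tdata returning content).getstringval() as nm_with_wildcard
from q1;
```

Comparing the two columns side by side makes it easy to see whether the namespace is what is hiding a node in your own documents.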
I have an Oracle 11.2.0.4.0 table named LOOKUPTABLE with 3 fields
LOOKUPTABLEID NUMBER(12)
LOOKUPTABLENM NVARCHAR2(255)
LOOKUPTABLECONTENT NCLOB
The data in the NCLOB field is strictly validated on insert, so I'm certain it is always a comma-separated string with a CRLF at the end of each line, and reads exactly like a simple CSV file. Example ([CRLF] represents an actual CRLF, not literal text):
WITH lookuptable AS (
SELECT
1 AS "LOOKUPTABLEID",
'CODES.TBL' AS "LOOKUPTABLENM",
TO_NCLOB('851,ALL HOURS WORKED GLASS,G,0,,,,,,'||chr(13)||chr(10)||
'935,ALL OT AND HW HRS,G,0,,,,,,'||chr(13)||chr(10)||
'934,ALL PAID TIME,G,0,,,,,,'||chr(13)||chr(10)) AS "LOOKUPTABLECONTENT"
FROM dual
)
SELECT lookuptablecontent FROM lookuptable WHERE lookuptablenm='CODES.TBL';
"851,ALL HOURS WORKED GLASS,G,0,,,,,,[CRLF]935,ALL OT AND HW HRS,G,0,,,,,,[CRLF]934,ALL PAID TIME,G,0,,,,,,[CRLF]"
I essentially want a query that outputs one row for each line in the CLOB. I'm using an application that will read this SQL and write the result to a text file for me, but it cannot handle CLOB data types, and I don't have the option of writing directly to a file from SQL itself. I need a query that produces this result so that my app can write the file. I can create and write to my own tables, so a procedure that reads the CLOB into a new table, which I would then select from in my application, would be acceptable if that's better; it's just over my head right now. Desired output below, thanks in advance for any help :)
1. 851,ALL HOURS WORKED GLASS,G,0,,,,,,
2. 935,ALL OT AND HW HRS,G,0,,,,,,
3. 934,ALL PAID TIME,G,0,,,,,,
This is a specific case of the general question "how to split a string", and I link this question a lot for more details on that. In this case, instead of a comma, the delimiter that you want to split on is CRLF, i.e. chr(13)||chr(10) (carriage return followed by line feed).
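If you're not sure which terminator the stored data actually uses, a quick check along these lines may help. It's only a sketch: the inline sample row stands in for the real table, the column aliases are made up for illustration, and it assumes the text contains no blank lines (two consecutive CRLFs would also contain an LF followed by a CR).

```sql
WITH lookuptable AS (
  -- Stand-in for the real table: two lines, each terminated by CR + LF
  SELECT TO_NCLOB('851,ALL HOURS WORKED GLASS,G,0,,,,,,' || chr(13) || chr(10) ||
                  '935,ALL OT AND HW HRS,G,0,,,,,,' || chr(13) || chr(10)) AS lookuptablecontent
  FROM dual
)
SELECT INSTR(lookuptablecontent, chr(13) || chr(10)) AS crlf_pos,  -- CR then LF
       INSTR(lookuptablecontent, chr(10) || chr(13)) AS lfcr_pos   -- LF then CR
FROM lookuptable;
```

A non-zero position means that sequence occurs in the text, and zero means it does not, so whichever column is non-zero tells you the order to split on.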
Here's a simple solution with regexp_substr. It's not the fastest solution, but it works fine in simple scenarios. If you need better performance, see the version in the link above with a recursive CTE and no regexp.
WITH lookuptable AS (
SELECT
1 AS LOOKUPTABLEID,
'CODES.TBL' AS LOOKUPTABLENM,
TO_NCLOB('851,ALL HOURS WORKED GLASS,G,0,,,,,,'||chr(13)||chr(10)||
'935,ALL OT AND HW HRS,G,0,,,,,,'||chr(13)||chr(10)||
'934,ALL PAID TIME,G,0,,,,,,'||chr(13)||chr(10)) AS LOOKUPTABLECONTENT
FROM dual
)
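-- The bracket expression lists only CR and LF, so each match is one line of text.
-- (Parentheses inside [...] are literal characters, not grouping, so they are left out.)
SELECT lookuptableid as id, to_char(regexp_substr(lookuptablecontent,'[^'||chr(13)||chr(10)||']+', 1, level)) as r
FROM lookuptable
WHERE lookuptablenm='CODES.TBL'
connect by level <= regexp_count(lookuptablecontent, '[^'||chr(13)||chr(10)||']+')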
and PRIOR lookuptableid = lookuptableid and PRIOR SYS_GUID() is not null -- needed if more than 1 source row
order by lookuptableid, level
;
Output:
id r
1 851,ALL HOURS WORKED GLASS,G,0,,,,,,
1 935,ALL OT AND HW HRS,G,0,,,,,,
1 934,ALL PAID TIME,G,0,,,,,,
My example data and format, using the recursive CTE without regexp from the link provided by @kfinity:
WITH lookuptable (lookuptableid, lookuptablenm, lookuptablecontent) AS (
SELECT
1,
'CODES.TBL',
TO_NCLOB('ID,NAME,TYPE,ISMONEYSW,EARNTYPE,EARNCODE,RATESW,NEGATIVESW,OVERRIDEID,DAILYSW'||chr(13)||chr(10)||
'851,ALL HOURS WORKED GLASS,G,0,,,,,,'||chr(13)||chr(10)||
'935,ALL OT AND HW HRS,G,0,,,,,,'||chr(13)||chr(10)||
'934,ALL PAID TIME,G,0,,,,,,'
)
FROM dual
), CTE (lookuptableid, lookuptablenm, lookuptablecontent, startposition, endposition) AS (
SELECT
lookuptableid,
lookuptablenm,
lookuptablecontent,
1,
INSTR(lookuptablecontent, chr(13)||chr(10))
FROM lookuptable
WHERE lookuptablenm = 'CODES.TBL'
UNION ALL
SELECT
lookuptableid,
lookuptablenm,
lookuptablecontent,
endposition + 2, -- skip both characters of the CR+LF delimiter
INSTR(lookuptablecontent, chr(13)||chr(10), endposition + 2)
FROM CTE
WHERE endposition > 0
)
SELECT
lookuptableid,
lookuptablenm,
SUBSTR(lookuptablecontent, startposition, DECODE(endposition, 0, LENGTH(lookuptablecontent) + 1, endposition) - startposition) AS lookuptablecontent
FROM CTE
ORDER BY lookuptableid, startposition;
I'm trying to simplify a column in BigQuery by using BigQuery extract on it but I am having a bit of an issue.
Here are two examples of the data I'm extracting from:
dc_pre=CLXk_aigyOMCFQb2dwod4dYCZw;gtm=2wg7f1;gcldc=;gclaw=;gac=UA-5815571-8:;auiddc=;u1=OVERDRFT;u2=undefined;u3=undefined;u4=undefined;u5=SSA;u6=undefined;u7=na;u8=undefined;u9=undefined;u10=undefined;u11=undefined;~oref=https://www.online.bank.co.za/onlineContent/ga_bridge.html
dc_pre=COztt4-tyOMCFcji7Qod440PCw;gtm=2wg7f1;gcldc=;gclaw=;gac=UA-5815571-8:;auiddc=;u1=DDA13;u2=undefined;u3=undefined;u4=undefined;u5=SSA;u6=undefined;u7=na;u8=undefined;u9=undefined;u10=undefined;u11=undefined;~oref=https://www.online.support.co.za/onlineContent/ga_bridge.html
I want to extract the portion between ;u1= and ;u2
Running the following legacy SQL Query
SELECT
Date(Event_Time),
Activity_ID,
REGEXP_EXTRACT(Other_Data, r'(?<=u1=)(.*\n?)(?=;u2)')
FROM
[sprt-data-transfer:dtftv2_sprt.p_activity_166401]
WHERE
Activity_ID in ('8179851')
AND Site_ID_DCM NOT IN ('2134603','2136502','2539719','2136304','2134604','2134602','2136701','2378406')
AND Event_Time BETWEEN 1563746400000000 AND 1563832799000000
I get the error...
Failed to parse regular expression "(?<=u1=)(.*\n?)(?=;u2)": invalid perl operator: (?<
And this is where my talent runs out. Is the error caused by my using legacy SQL, or is this an unsupported regex syntax?
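Just tried this and it worked, but with "Standard SQL" enabled. The error itself comes from the lookbehind (?<=...): BigQuery's regular expressions use the RE2 library, which does not support lookaround, so the pattern below uses a capture group instead.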
select
other_data,
regexp_extract(other_data, ';u1=(.+?);u2') as some_part
from
unnest([
'dc_pre=CLXk_aigyOMCFQb2dwod4dYCZw;gtm=2wg7f1;gcldc=;gclaw=;gac=UA-5815571-8:;auiddc=;u1=OVERDRFT;u2=undefined;u3=undefined;u4=undefined;u5=SSA;u6=undefined;u7=na;u8=undefined;u9=undefined;u10=undefined;u11=undefined;~oref=https://www.online.bank.co.za/onlineContent/ga_bridge.html',
'dc_pre=COztt4-tyOMCFcji7Qod440PCw;gtm=2wg7f1;gcldc=;gclaw=;gac=UA-5815571-8:;auiddc=;u1=DDA13;u2=undefined;u3=undefined;u4=undefined;u5=SSA;u6=undefined;u7=na;u8=undefined;u9=undefined;u10=undefined;u11=undefined;~oref=https://www.online.support.co.za/onlineContent/ga_bridge.html'
]) as other_data
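Because the capture-group pattern avoids lookbehind altogether, it may well work in legacy SQL too, where REGEXP_EXTRACT also returns the part matched by the capture group. The following is a sketch of the original query with only the pattern changed (and the Site_ID_DCM filter left out for brevity); it is untested against the real table, and the u1_value alias is just an illustrative name.

```sql
-- Legacy SQL: same table and filters as in the question, lookbehind replaced by a capture group
SELECT
  Date(Event_Time),
  Activity_ID,
  -- [^;]* captures everything between ";u1=" and the next ";" (which must start ";u2=")
  REGEXP_EXTRACT(Other_Data, r';u1=([^;]*);u2=') AS u1_value
FROM
  [sprt-data-transfer:dtftv2_sprt.p_activity_166401]
WHERE
  Activity_ID in ('8179851')
  AND Event_Time BETWEEN 1563746400000000 AND 1563832799000000
```

Using [^;]* rather than .+? also lets an empty u1 value through as an empty string.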
Not using regex but it still works...
with test as (
select 1 as id, 'dc_pre=CLXk_aigyOMCFQb2dwod4dYCZw;gtm=2wg7f1;gcldc=;gclaw=;gac=UA-5815571-8:;auiddc=;u1=OVERDRFT;u2=undefined;u3=undefined;u4=undefined;u5=SSA;u6=undefined;u7=na;u8=undefined;u9=undefined;u10=undefined;u11=undefined;~oref=https://www.online.bank.co.za/onlineContent/ga_bridge.html' as my_str UNION ALL
select 2 as id, 'dc_pre=COztt4-tyOMCFcji7Qod440PCw;gtm=2wg7f1;gcldc=;gclaw=;gac=UA-5815571-8:;auiddc=;u1=DDA13;u2=undefined;u3=undefined;u4=undefined;u5=SSA;u6=undefined;u7=na;u8=undefined;u9=undefined;u10=undefined;u11=undefined;~oref=https://www.online.support.co.za/onlineContent/ga_bridge.html'
),
temp as (
select
id,
split(my_str,';') as items
from test
),
flattened as (
select
id,
split(i,'=')[SAFE_OFFSET(0)] as left_side,
split(i,'=')[SAFE_OFFSET(1)] as right_side
from temp
left join unnest(items) i
)
select * from flattened
where left_side = 'u1'
I need to find a substring that is in a text field that is actually partially xml. I tried converting it to xml and then use the .value method but to no avail.
The element(substring) I am looking for is a method name that looks like this:
AssemblyQualifiedName="IPMGlobal.CRM2011.IPM.CustomWorkflowActivities.ProcessChildRecords,
where the method at the end, "ProcessChildRecords", could be another name such as "SendEmail". I know I can use "CustomWorkflowActivities." and the , (comma) to locate the substring (the method name), but I'm not sure how to accomplish it. In addition, there may be more than one instance of **"CustomWorkflowActivities.<method>"** in the text.
Some Clarifications:
Below is my original query. It returns the first occurrence in each row but none of the others. For example, the string might contain both '...IPM.CustomWorkflowActivities.ProcessChildRecords...' and '...IPM.CustomWorkflowActivities.GetworkflowContext...'.
The current query only returns:
Approve Time Process, ipm_mytimesheetbatch, ProcessChildRecords
SELECT WF.name WFName,
(
SELECT TOP 1 Name
FROM entity E
WHERE WF.primaryentity = E.ObjectTypeCode
) Entity,
Convert(xml, xaml) Xaml,
SUBSTRING(xaml, Charindex('CustomWorkflowActivities.', xaml) + Len('CustomWorkflowActivities.'), Charindex(', IPMGlobal.CRM2011.IPM.CustomWorkflowActivities, Version=1.0.0.0', xaml) - Charindex('CustomWorkflowActivities.', xaml) - Len('CustomWorkflowActivities.'))
FROM FilteredWorkflow WF
WHERE 1 = 1
AND xaml LIKE '%customworkflowactivities%'
AND statecodename = 'Activated'
AND typename = 'Definition'
ORDER BY NAME
If you are using Oracle, you could use a REGEXP function:
WITH cte(t) as (
SELECT 'AssemblyQualifiedName="IPMGlobal.CRM2011.IPM.CustomWorkflowActivities.ProcessChildRecords,' FROM dual
)
SELECT t,
regexp_replace(t, '.*CustomWorkflowActivities\.([^,]+),.*', '\1') AS r -- literal dot, then everything up to the next comma
FROM cte;
DBFiddle Demo
SQL Server:
WITH cte(t) as (
SELECT 'AssemblyQualifiedName="IPMGlobal.CRM2011.IPM.CustomWorkflowActivities.ProcessChildRecords,asfdsa'
)
SELECT t,SUBSTRING(t, s, CHARINDEX(',', t, s)-s)
FROM (SELECT t, PATINDEX( '%CustomWorkflowActivities.%', t) + LEN('CustomWorkflowActivities.') AS s
FROM cte
) sub;
DBFiddle Demo 2
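The queries above pick out a single method name. For the case where one row holds several "CustomWorkflowActivities.<method>" entries, one option in SQL Server is a recursive CTE that steps from one occurrence of the marker to the next. This is a sketch under assumptions: the src CTE and its sample string are made up to stand in for the xaml column, the names @marker, hits, pos and method_name are illustrative, and every occurrence is assumed to be followed by a comma.

```sql
DECLARE @marker varchar(50) = 'CustomWorkflowActivities.';

WITH src(t) AS (
    -- Hypothetical sample text with two activities; replace with the real xaml column
    SELECT CAST('AssemblyQualifiedName="IPMGlobal.CRM2011.IPM.CustomWorkflowActivities.ProcessChildRecords, IPMGlobal.CRM2011.IPM.CustomWorkflowActivities, Version=1.0.0.0" AssemblyQualifiedName="IPMGlobal.CRM2011.IPM.CustomWorkflowActivities.GetworkflowContext, IPMGlobal.CRM2011.IPM.CustomWorkflowActivities, Version=1.0.0.0"' AS varchar(max))
),
hits AS (
    -- Anchor member: position of the first occurrence of the marker (0 if none)
    SELECT t, CHARINDEX(@marker, t) AS pos
    FROM src
    UNION ALL
    -- Recursive member: look for the next occurrence after the previous one
    SELECT t, CHARINDEX(@marker, t, pos + 1)
    FROM hits
    WHERE pos > 0
)
SELECT SUBSTRING(t,
                 pos + LEN(@marker),                              -- first character after the marker
                 CHARINDEX(',', t, pos) - pos - LEN(@marker))     -- length up to the next comma
       AS method_name
FROM hits
WHERE pos > 0;   -- drop the terminating "not found" row
```

The assembly name "CustomWorkflowActivities," is not picked up because the marker ends with a dot. Recursive CTEs stop after 100 levels by default, so a row with more occurrences than that would need OPTION (MAXRECURSION n) added to the final SELECT.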