In TDE Marklogic how to escape null value triples? - marklogic-9

<template xmlns="http://marklogic.com/xdmp/tde">
<context>/test</context>
<vars>
<var>
<name>subprefix</name>
<val>"http://www.test.com/resource/test/"</val>
</var>
<var>
<name>objprefix</name>
<val>"http://www.test.com/resource/test/"</val>
</var>
</vars>
<triples>
<triple>
<subject>
<val>sem:iri($subprefix || ElemenetName)</val>
<invalid-values>ignore</invalid-values>
</subject>
<predicate>
<val>sem:iri('is')</val>
</predicate>
<object>
<val>sem:iri($objprefix || FullName)</val>
<invalid-values>ignore</invalid-values>
</object>
</triple>
</triples>
</template>
I have created a Template to get triples out of XML.
But want to escape null value triples(s,p or o).
I am using ignore, but this works only if there is not prefix in subject or object. If there is prefix it creates triples with null(only prefix).
Do we have any way to handle this in MarkLogic TDE?
Nullable object/subject issue.

You can take more use from the context expression, particularly if you use sub-templates. Here a crude example showing a sub-template, applied to 3 example docs:
xquery version "1.0-ml";
let $tde :=
<template xmlns="http://marklogic.com/xdmp/tde">
<context>/test</context>
<vars>
<var>
<name>subprefix</name>
<val>"http://www.test.com/resource/test/"</val>
</var>
<var>
<name>objprefix</name>
<val>"http://www.test.com/resource/test/"</val>
</var>
</vars>
<templates>
<template>
<context>FullName</context>
<triples>
<triple>
<subject>
<val>sem:iri($subprefix || ../ElemenetName)</val>
<invalid-values>ignore</invalid-values>
</subject>
<predicate>
<val>sem:iri('is')</val>
</predicate>
<object>
<val>sem:iri($objprefix || .)</val>
<invalid-values>ignore</invalid-values>
</object>
</triple>
</triples>
</template>
</templates>
</template>
let $xml1 := <test><ElemenetName>elem</ElemenetName><FullName>full</FullName></test>
let $xml2 := <test><ElemenetName>elem</ElemenetName></test>
let $xml3 := <test><FullName>full</FullName></test>
return tde:node-data-extract(($xml1, $xml2, $xml3), $tde)
Some more background on sub-templates can be found here:
https://docs.marklogic.com/guide/sql/creating-template-views#id_28999

There is simplest way to handle the NULL values in TDE:
<...>
<val>if(ElemenetName ne '') then sem:iri($subprefix || ElemenetName) else ()</val>
<invalid-values>ignore</invalid-values>
<....>

Related

loop / Extract nodes from clob xml column in oracle pl sql

I have this xml content stored in a clob column of a table, I have to loop through the "molecule" nodes under the "reactantList" node,and store each "molecule" node into another table containing a list of molecules,
Any help please?
I tried with xmltype, xmlsequence, xmltable etc but did not work, I also have to specify the namespace "xmlns=.." somewhere as an argument to xmltype I think, to be able to make it work...
<cml xmlns="http://www.chemaxon.com" version="ChemAxon file format v20.20.0, generated by vunknown" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.chemaxon.com http://www.chemaxon.com/marvin/schema/mrvSchema_20_20_0.xsd">
<MDocument>
<MChemicalStruct>
<reaction>
<arrow type="DEFAULT" x1="-8.022119140625" y1="0.8333333333333334" x2="-3.5637858072916657" y2="0.8333333333333334" />
<reactantList>
<molecule molID="m1">
<atomArray>
<atom id="a1" elementType="C" x2="-13.938333333333334" y2="0.7083333333333333" />
<atom id="a2" elementType="O" x2="-15.478333333333333" y2="0.7083333333333333" lonePair="2" />
</atomArray>
<bondArray>
<bond id="b1" atomRefs2="a1 a2" order="1" />
</bondArray>
</molecule>
<molecule molID="m2">
<atomArray>
<atom id="a1" elementType="O" x2="-9.897119140624998" y2="0.8333333333333333" mrvValence="0" lonePair="3" />
</atomArray>
<bondArray />
</molecule>
</reactantList>
<agentList />
<productList />
</reaction>
</MChemicalStruct>
<MReactionSign toption="NOROT" fontScale="14.0" halign="CENTER" valign="CENTER" autoSize="true" id="o1">
<Field name="text">
<![CDATA[{D font=SansSerif,size=18,bold}+]]>
</Field>
<MPoint x="-11.730452473958332" y="0.6666666666666666" />
<MPoint x="-11.217119140624998" y="0.6666666666666666" />
<MPoint x="-11.217119140624998" y="1.18" />
<MPoint x="-11.730452473958332" y="1.18" />
</MReactionSign>
</MDocument>
</cml>
You can use:
INSERT INTO molecules (molecule)
SELECT x.molecule
FROM table_name t
CROSS APPLY XMLTABLE(
XMLNAMESPACES(
'http://www.w3.org/2001/XMLSchema-instance' AS "xsi",
DEFAULT 'http://www.chemaxon.com'
),
'/cml/MDocument/MChemicalStruct/reaction/reactantList/molecule'
PASSING XMLTYPE(t.xml)
COLUMNS
molecule XMLTYPE PATH '.'
) x
db<>fiddle here

XPath doesn't provide proper tag

I'm trying to get tag "" from xml below.
If i execute request like this:
WITH x(col) AS (select'<document xmlns="http://example.com/digital/back/" xmlns:ns2="http://example.com/digital/back/complexId" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="">
<header>
<docId>13a2f29a28b12ecb</docId>
<dt>2018-12-10T11:59:48.112+03:00</dt>
</header>
<pay>
<reqTransfer id="154638">
<source>
<card>
<virtualCardNum>4B74C1EE187</virtualCardNum>
<bsc>VISA</bsc>
</card>
</source>
</reqTransfer>
</pay>
</document>
'::xml)
SELECT xpath('/document/pay/reqTransfer/source/card/bsc/text()', col) AS bsc
FROM x;
I get {}, but if I relpace the document start tag
<document xmlns="http://example.com/digital/back/" xmlns:ns2="http://example.com/digital/back/complexId" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="">
with <document> or even <document xmlns="">, I get { VISA } - that is right.
What should I do to replace <document xmlns="..."> with <document> or get { VISA } without replacement?
If you are working with XML namespaces, they are worth mentioning in your Xpath queries too, i.e. use
SELECT xpath('/d:document/d:pay/d:reqTransfer/d:source/d:card/d:bsc/text()', col,
ARRAY[ARRAY['d', 'http://example.com/digital/back/']]) AS bsc
http://sqlfiddle.com/#!17/9eecb/24719
See also:
how to ignore namespaces with XPath

Use RegEx to execute multiple find/replace commands in Notepad++ in one click

The purpose of this is to extract list of field names from the XML (of Adobe LiveCycle Designer). So, I created the fields in designer, then, I copy the XML of the related fields, paste in Notepad++, and then executer find/replace (ctrl-h) to get only the field names, one field in each line.
This will make it then easier to write the SQL statements to add such fields to the DB to register them.
The XML looks like the following:
<field xmlns="http://www.xfa.org/schema/xfa-template/2.8/" y="0in" x="0.343mm" w="8.881pt" h="9.108pt" name="detcon_recreation_only">
<ui>
<checkButton size="8.881pt">
<border>
<edge stroke="lowered"/>
<fill/>
</border>
</checkButton>
</ui>
<font size="0pt" typeface="Adobe Pi Std"/>
<para vAlign="middle"/>
<value>
<text>0</text>
</value>
<items>
<text>1</text>
<text>0</text>
<text/>
</items>
</field>
<field xmlns="http://www.xfa.org/schema/xfa-template/2.8/" name="detcon_special_housing" y="5.393mm" w="27.94mm" h="4.134mm" x="0.343mm">
<ui>
<choiceList>
<border>
<edge stroke="lowered"/>
</border>
<margin/>
</choiceList>
</ui>
<font typeface="Arial Narrow" size="6pt"/>
<margin topInset="0mm" bottomInset="0mm" leftInset="0mm" rightInset="0mm"/>
<para vAlign="middle"/>
<value>
<text>NA</text>
</value>
<items>
<text>Not Applicable</text>
<text>Hotel Component</text>
</items>
<items save="1" presence="hidden">
<text>NA</text>
<text>HC</text>
</items>
</field>
<exclGroup xmlns="http://www.xfa.org/schema/xfa-template/2.8/" name="detcon_photo_taken" x="0in" y="0in">
<?templateDesigner itemValuesSpecified 1?>
<field w="12.446mm" h="3.825mm" name="lb_yes">
<ui>
<checkButton size="1.7639mm" shape="round">
<border>
<?templateDesigner StyleID apcb1?>
<edge/>
<fill/>
</border>
</checkButton>
</ui>
<font typeface="Myriad Pro"/>
<margin leftInset="1mm" rightInset="1mm"/>
<para vAlign="middle"/>
<caption placement="right" reserve="7.698mm">
<para vAlign="middle" spaceAbove="0pt" spaceBelow="0pt" textIndent="0pt" marginLeft="0pt" marginRight="0pt"/>
<font size="8pt" typeface="Arial Narrow" baselineShift="0pt"/>
<value>
<text>YES</text>
</value>
</caption>
<value>
<text xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>
</value>
<items>
<text>1</text>
</items>
</field>
<field w="28.702mm" h="3.825mm" name="lb_no" x="13.233mm">
<ui>
<checkButton size="1.7639mm" shape="round">
<border>
<?templateDesigner StyleID apcb1?>
<edge/>
<fill/>
</border>
</checkButton>
</ui>
<font typeface="Myriad Pro"/>
<margin leftInset="1mm" rightInset="1mm"/>
<para vAlign="middle"/>
<caption placement="right" reserve="23.954mm">
<para vAlign="middle" spaceAbove="0pt" spaceBelow="0pt" textIndent="0pt" marginLeft="0pt" marginRight="0pt"/>
<font size="8pt" typeface="Arial Narrow" baselineShift="0pt"/>
<value>
<text>NO (see comments)</text>
</value>
</caption>
<value>
<text xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>
</value>
<items>
<text>0</text>
</items>
</field>
<border>
<edge presence="hidden"/>
</border>
<?templateDesigner expand 1?></exclGroup>
So I figured out the following RegEx to perform find/replace to get only the field names, one field on each line.
Find to get Field Name: (?i)<(field|exclGroup).*name="([a-z_]\w*)".*$
Replace: $2
Another find/replace...
Remove all other lines: ^.*<(?!.*name=).*.*[\r\n]*
Replace with blank
If you execute the above two find/replace sessions, you will end up with list of field names one field per line.
What I wanted to do is to perform the above in one find/replace session, and then convert the above into SQL Statements using also find/replace, using this template:
INSERT INTO table_name (element_id, element_name, element_type, default_value, required, clone)
VALUES (12345,"field_name_goes_here","/Tx", "", "N", "Y"),
VALUES (12346,"field_name_goes_here","/Tx", "", "N", "Y"),
VALUES (12347,"field_name_goes_here","/Tx", "", "N", "Y"),
VALUES (12348,"field_name_goes_here","/Tx", "", "N", "Y"),
VALUES (12349,"field_name_goes_here","/Tx", "", "N", "Y"),
The element_id field is sequential, but don't worry about that, I can take care of this in Excel.
Appreciate your help,
Tarek
Scraper Series
One small help. The element is slightly different than the which is making things a little more complicated. Your RegEx is so sophisticated, I couldn't modify it to include the element. I think I need more time to digest it. So could you modify it to include exclGroup and only extract name only without extracting the inner field elements of the exclGroup?
Ok, here you go.
It does make it a little more complicated.
I have 2 versions to do this. One that uses recursion, one that doesn't.
I'm posting the version that uses recursion.
If you need the non-recursion, let me know and I'll post that.
Find (?:(?!<(?:field|exclGroup)(?!\w)(?>"[\S\s]*?"|'[\S\s]*?'|(?:(?!/>)[^>])?)+>)[\S\s])*(?><(field|exclGroup)(?=(?:[^>"']|"[^"]*"|'[^']*')*?\sname\s*=\s*(?:(['"])([\S\s]*?)\2))\s+(?>"[\S\s]*?"|'[\S\s]*?'|(?:(?!/>)[^>])?)+>)(?:(?&core)|)</\1\s*>(?:(?!<(?:field|exclGroup)(?!\w)(?>"[\S\s]*?"|'[\S\s]*?'|(?:(?!/>)[^>])?)+>)[\S\s])*(?(DEFINE)(?<core>(?>(?><([\w:]+)(?>"[\S\s]*?"|'[\S\s]*?'|(?:(?!/>)[^>])?)+>)(?:(?&core)|)</\5\s*>|(?!</[\w:]+\s*>)(?>[\S\s]))+))
Replace VALUES (12345,"$3","/Tx", "", "N", "Y"),\r\n
https://regex101.com/r/icnF3i/1
Formatted (incase you need to look at it)
(?: # Prefix - Optional any chars that don't start a field or exclGroup tag
(?!
<
(?: field | exclGroup )
(?! \w )
(?>
" [\S\s]*? "
| ' [\S\s]*? '
| (?:
(?! /> )
[^>]
)?
)+
>
)
[\S\s]
)*
(?> # open 'field' or 'exclGroup' tag ------------------
<
( field | exclGroup ) # (1)
(?= # Asserttion (a pseudo atomic group)
(?: [^>"'] | " [^"]* " | ' [^']* ' )*?
\s name \s* = \s*
(?:
( ['"] ) # (2), Quote
( [\S\s]*? ) # (3), Name value - only thing we want
\2
)
)
\s+
(?>
" [\S\s]*? "
| ' [\S\s]*? '
| (?:
(?! /> )
[^>]
)?
)+
>
)
(?:
(?&core) # Call the core recursion function (balanced tags)
|
)
</ \1 \s* > # Close 'field' or 'exclGroup' tag ------------------
(?: # Postfix - Optional any chars that don't start a field or exclGroup tag
(?!
<
(?: field | exclGroup )
(?! \w )
(?>
" [\S\s]*? "
| ' [\S\s]*? '
| (?:
(?! /> )
[^>]
)?
)+
>
)
[\S\s]
)*
# ---------------------------------------------------------
(?(DEFINE)
(?<core> # (4 start), Inner balanced tags
(?>
(?>
<
( [\w:]+ ) # (5), Any open tag
(?>
" [\S\s]*? "
| ' [\S\s]*? '
| (?:
(?! /> )
[^>]
)?
)+
>
)
(?: # Recurse core
(?&core)
|
)
</ \5 \s* > # Balanced close tag (I can see you 5)
|
(?! </ [\w:]+ \s* > ) # Any char not starting a close tag (passive)
(?> [\S\s] )
)+
) # (4 end)
)
You can view the non-recursive version here https://regex101.com/r/ztOrP5/1
I am trying to simplify the RegEx provided in the previous answer.
This is my simplified version:
RegEx: (?|(?><field.*name\s*=\s*"([a-z_]\w*)"(?:.|\n)*?(?:<\/field>))|(?:<exclGroup.*name\s*=\s*"([a-z_]\w*)"(?:.|\n)*?(?:<\/exclGroup>)))
Replace: $1
Check it out over here: https://regex101.com/r/icnF3i/3
Appreciate your feedback.
Thanks to sln for helping me to reach this level.
EDIT:
The above RegEx doesn't work in Notepad++.
To use the same under Notepad++ use the following find/replace combination:
Find: (?i)<(field|exclGroup).*name\s*=\s*"([a-z_]\w*)"[\s\S]*?<\/\1>
Replace: \(12345,"$2","/Tx", "", "N", "Y"\),

Parse out XML Value in SQL Server

I have the following XML contained in a column XML_TRANSACTION in a table TRANSACTION in SQL Server. I am trying to parse out the information inside the From and To tags into separate columns:
<Transaction Id="1234" Timestamp="2012-04-28T05:02:20" Version="TransactionVersion2" SenderId="abcd" SenderLocId="vxyz">
<Instance Name="Home" />
<Messages>
<Message Id="0" Timestamp="2014-04-28T01:00:46">
<MessageRequest Name="Movement" Xsd="Movement.xsd" Version="5">
<Body>
<Parts>
<Part PartNumber="11111" Qty="1" PersonUniqueId="A1B2C3" />
</Parts>
<Order Number="13579" Uid="01" />
<Ship Number="1ZW23" Type="Out" />
<From VendorId="XY1X2" VendorLocId="XY1X2" VendorName="Vendor_Extra" VendorStockRoom="OPEN" CountryCode="US" />
<To VendorId="XY1X2" VendorLocId="XY1X2" VendorName="Vendor_Extra" VendorStockRoom="CLOSED" CountryCode="US" />
</Body>
</MessageRequest>
</Message>
</Messages>
</Transaction>
Desired results:
From_VendorID || From_VendorLocID || From_VendorName || To_VendorID || To_VendorLocID
--------------++------------------++--------------------------------------------------
XY1X2 || XY1X2 || Vendor_Extra || XY1X2 || XY1X2
etc.
I have made several attempts, but have been unsuccessful. Any assistance would be greatly appreciated!
This should get you started:
SELECT
t.[XML_TRANSACTION].value('(//From/#VendorId)[1]','varchar(20)') as From_VendorID
,t.[XML_TRANSACTION].value('(//From/#VendorLocId)[1]','varchar(20)') as From_VendorLocID
,t.[XML_TRANSACTION].value('(//To/#VendorLocId)[1]','varchar(20)') as To_VendorLocID
,t.[XML_TRANSACTION].value('(//To/#VendorLocId)[1]','varchar(20)') as To_VendorLocID
FROM [CD].[TEST] AS t

FOR XML multiple control by attribute in tree concept

I want to figure out one issue.
I already had question about simple ordering issue but I want to order more detail.
check below this link :
SQL Server : FOR XML sorting control by attribute
I made a example case.
SQL Query.
select (
select '123' AS '#id', (
select
(
select 'test' AS '#testid' , '20' AS '#order'
FOR XML path ('tree') , TYPE
),
(
select 'test2' AS '#testid' , '30' AS '#order'
FOR XML path ('tree-order') , TYPE
),
(
select 'test' AS '#testid' , '10' AS '#order'
FOR XML path ('tree') , TYPE
)
FOR XML path ('Node') , TYPE
)
FOR XML path ('Sample') , TYPE
),
(select '456' AS '#id', (
select
(
select 'test' AS '#testid' , '20' AS '#order'
FOR XML path ('tree') , TYPE
),
(
select 'test2' AS '#testid' , '30' AS '#order'
FOR XML path ('tree-order') , TYPE
),
(
select 'test' AS '#testid' , '10' AS '#order'
FOR XML path ('tree') , TYPE
)
FOR XML path ('Node') , TYPE
)
FOR XML path ('Sample') , TYPE)
FOR XML path ('Main') , TYPE
Result :
<Main>
<Sample id="123">
<Node>
<tree testid="test" order="20" />
<tree-order testid="test2" order="30" />
<tree testid="test" order="10" />
</Node>
</Sample>
<Sample id="456">
<Node>
<tree testid="test" order="20" />
<tree-order testid="test2" order="30" />
<tree testid="test" order="10" />
</Node>
</Sample>
</Main>
Expected result :
<Main>
<Sample id="123">
<Node>
<tree testid="test" order="10" />
<tree testid="test" order="20" />
<tree-order testid="test2" order="30" />
</Node>
</Sample>
<Sample id="456">
<Node>
<tree testid="test" order="10" />
<tree testid="test" order="20" />
<tree-order testid="test2" order="30" />
</Node>
</Sample>
</Main>
final result :
<Main>
<Sample id="123">
<Node>
<tree testid="test" />
<tree testid="test" />
<tree-order testid="test2" />
</Node>
</Sample>
<Sample id="456">
<Node>
<tree testid="test" />
<tree testid="test" />
<tree-order testid="test2" />
</Node>
</Sample>
</Main>
That's order by tree-order.
finally I don't want to show order information in attribute
Any one has great Idea?
Thank you for everybody who interesting to this.
Updated ----------------------------------------
Thank you every body finally I solved problem as below about order by and remove attribute issue :
declare #resultData xml = (select #data.query('
element Main {
for $s in Main/Sample
return element Sample {
$s/#*,
for $n in $s/Node
return element Node {
for $i in $n/*
order by $i/#order
return $i
}
}
}'));
SET #resultData.modify('delete (Main/Sample/Node/tree/#order)');
SET #resultData.modify('delete (Main/Sample/Node/tree-order/#order)');
select #resultData
select #data.query('
element Main {
for $s in Main/Sample
return element Sample {
$s/#*,
for $n in $s/Node
return element Node {
for $i in Node/*
order by $i/#order
return
if ($i/self::tree)
then element tree { $i/#testid }
else element tree-order { $i/#testid }
}
}
}
}')
What's interesting to me is that in your original post, you're stating that you're generating the XML as the result of a SQL query. If it were me, I'd control the ordering at that level.