Use RegEx to execute multiple find/replace commands in Notepad++ in one click - sql

The purpose of this is to extract a list of field names from the XML (of Adobe LiveCycle Designer). I created the fields in Designer, copied the XML of the related fields, pasted it into Notepad++, and then executed find/replace (Ctrl+H) to get only the field names, one field per line.
This then makes it easier to write the SQL statements that add these fields to the DB to register them.
The XML looks like the following:
<field xmlns="http://www.xfa.org/schema/xfa-template/2.8/" y="0in" x="0.343mm" w="8.881pt" h="9.108pt" name="detcon_recreation_only">
<ui>
<checkButton size="8.881pt">
<border>
<edge stroke="lowered"/>
<fill/>
</border>
</checkButton>
</ui>
<font size="0pt" typeface="Adobe Pi Std"/>
<para vAlign="middle"/>
<value>
<text>0</text>
</value>
<items>
<text>1</text>
<text>0</text>
<text/>
</items>
</field>
<field xmlns="http://www.xfa.org/schema/xfa-template/2.8/" name="detcon_special_housing" y="5.393mm" w="27.94mm" h="4.134mm" x="0.343mm">
<ui>
<choiceList>
<border>
<edge stroke="lowered"/>
</border>
<margin/>
</choiceList>
</ui>
<font typeface="Arial Narrow" size="6pt"/>
<margin topInset="0mm" bottomInset="0mm" leftInset="0mm" rightInset="0mm"/>
<para vAlign="middle"/>
<value>
<text>NA</text>
</value>
<items>
<text>Not Applicable</text>
<text>Hotel Component</text>
</items>
<items save="1" presence="hidden">
<text>NA</text>
<text>HC</text>
</items>
</field>
<exclGroup xmlns="http://www.xfa.org/schema/xfa-template/2.8/" name="detcon_photo_taken" x="0in" y="0in">
<?templateDesigner itemValuesSpecified 1?>
<field w="12.446mm" h="3.825mm" name="lb_yes">
<ui>
<checkButton size="1.7639mm" shape="round">
<border>
<?templateDesigner StyleID apcb1?>
<edge/>
<fill/>
</border>
</checkButton>
</ui>
<font typeface="Myriad Pro"/>
<margin leftInset="1mm" rightInset="1mm"/>
<para vAlign="middle"/>
<caption placement="right" reserve="7.698mm">
<para vAlign="middle" spaceAbove="0pt" spaceBelow="0pt" textIndent="0pt" marginLeft="0pt" marginRight="0pt"/>
<font size="8pt" typeface="Arial Narrow" baselineShift="0pt"/>
<value>
<text>YES</text>
</value>
</caption>
<value>
<text xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>
</value>
<items>
<text>1</text>
</items>
</field>
<field w="28.702mm" h="3.825mm" name="lb_no" x="13.233mm">
<ui>
<checkButton size="1.7639mm" shape="round">
<border>
<?templateDesigner StyleID apcb1?>
<edge/>
<fill/>
</border>
</checkButton>
</ui>
<font typeface="Myriad Pro"/>
<margin leftInset="1mm" rightInset="1mm"/>
<para vAlign="middle"/>
<caption placement="right" reserve="23.954mm">
<para vAlign="middle" spaceAbove="0pt" spaceBelow="0pt" textIndent="0pt" marginLeft="0pt" marginRight="0pt"/>
<font size="8pt" typeface="Arial Narrow" baselineShift="0pt"/>
<value>
<text>NO (see comments)</text>
</value>
</caption>
<value>
<text xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>
</value>
<items>
<text>0</text>
</items>
</field>
<border>
<edge presence="hidden"/>
</border>
<?templateDesigner expand 1?></exclGroup>
So I figured out the following RegEx to perform find/replace to get only the field names, one field on each line.
Find to get Field Name: (?i)<(field|exclGroup).*name="([a-z_]\w*)".*$
Replace: $2
Another find/replace...
Remove all other lines: ^.*<(?!.*name=).*.*[\r\n]*
Replace with blank
If you execute the above two find/replace sessions, you will end up with a list of field names, one field per line.
What I want is to perform the above in a single find/replace session, and then convert the result into SQL statements, also using find/replace, with this template:
INSERT INTO table_name (element_id, element_name, element_type, default_value, required, clone)
VALUES (12345,"field_name_goes_here","/Tx", "", "N", "Y"),
VALUES (12346,"field_name_goes_here","/Tx", "", "N", "Y"),
VALUES (12347,"field_name_goes_here","/Tx", "", "N", "Y"),
VALUES (12348,"field_name_goes_here","/Tx", "", "N", "Y"),
VALUES (12349,"field_name_goes_here","/Tx", "", "N", "Y"),
The element_id field is sequential, but don't worry about that, I can take care of this in Excel.
Appreciate your help,
Tarek

One small request. The exclGroup element is slightly different from the field element, which makes things a little more complicated. Your RegEx is so sophisticated that I couldn't modify it to include the exclGroup element; I think I need more time to digest it. Could you modify it to include exclGroup and extract only its name, without extracting the inner field elements of the exclGroup?
Ok, here you go.
It does make it a little more complicated.
I have 2 versions to do this. One that uses recursion, one that doesn't.
I'm posting the version that uses recursion.
If you need the non-recursion, let me know and I'll post that.
Find (?:(?!<(?:field|exclGroup)(?!\w)(?>"[\S\s]*?"|'[\S\s]*?'|(?:(?!/>)[^>])?)+>)[\S\s])*(?><(field|exclGroup)(?=(?:[^>"']|"[^"]*"|'[^']*')*?\sname\s*=\s*(?:(['"])([\S\s]*?)\2))\s+(?>"[\S\s]*?"|'[\S\s]*?'|(?:(?!/>)[^>])?)+>)(?:(?&core)|)</\1\s*>(?:(?!<(?:field|exclGroup)(?!\w)(?>"[\S\s]*?"|'[\S\s]*?'|(?:(?!/>)[^>])?)+>)[\S\s])*(?(DEFINE)(?<core>(?>(?><([\w:]+)(?>"[\S\s]*?"|'[\S\s]*?'|(?:(?!/>)[^>])?)+>)(?:(?&core)|)</\5\s*>|(?!</[\w:]+\s*>)(?>[\S\s]))+))
Replace VALUES (12345,"$3","/Tx", "", "N", "Y"),\r\n
https://regex101.com/r/icnF3i/1
Formatted (in case you need to look at it)
(?: # Prefix - Optional any chars that don't start a field or exclGroup tag
(?!
<
(?: field | exclGroup )
(?! \w )
(?>
" [\S\s]*? "
| ' [\S\s]*? '
| (?:
(?! /> )
[^>]
)?
)+
>
)
[\S\s]
)*
(?> # open 'field' or 'exclGroup' tag ------------------
<
( field | exclGroup ) # (1)
(?= # Assertion (a pseudo atomic group)
(?: [^>"'] | " [^"]* " | ' [^']* ' )*?
\s name \s* = \s*
(?:
( ['"] ) # (2), Quote
( [\S\s]*? ) # (3), Name value - only thing we want
\2
)
)
\s+
(?>
" [\S\s]*? "
| ' [\S\s]*? '
| (?:
(?! /> )
[^>]
)?
)+
>
)
(?:
(?&core) # Call the core recursion function (balanced tags)
|
)
</ \1 \s* > # Close 'field' or 'exclGroup' tag ------------------
(?: # Postfix - Optional any chars that don't start a field or exclGroup tag
(?!
<
(?: field | exclGroup )
(?! \w )
(?>
" [\S\s]*? "
| ' [\S\s]*? '
| (?:
(?! /> )
[^>]
)?
)+
>
)
[\S\s]
)*
# ---------------------------------------------------------
(?(DEFINE)
(?<core> # (4 start), Inner balanced tags
(?>
(?>
<
( [\w:]+ ) # (5), Any open tag
(?>
" [\S\s]*? "
| ' [\S\s]*? '
| (?:
(?! /> )
[^>]
)?
)+
>
)
(?: # Recurse core
(?&core)
|
)
</ \5 \s* > # Balanced close tag (I can see you 5)
|
(?! </ [\w:]+ \s* > ) # Any char not starting a close tag (passive)
(?> [\S\s] )
)+
) # (4 end)
)
You can view the non-recursive version here https://regex101.com/r/ztOrP5/1
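For comparison, the same "outermost field or exclGroup only" rule can be sketched with an XML parser instead of a recursive regex. This is a minimal sketch in Python, not part of the Notepad++ workflow: the function name top_level_names and the synthetic wrapper element are assumptions made for illustration, and it assumes the pasted fragment is well-formed and has no XML declaration.

```python
import xml.etree.ElementTree as ET


def top_level_names(fragment):
    """Return the name attribute of each outermost field/exclGroup element.

    The pasted fragment has several sibling elements and no single root,
    so it is wrapped in a synthetic <root> element (an assumption of this
    sketch) before parsing.
    """
    root = ET.fromstring("<root>" + fragment + "</root>")
    names = []
    for child in root:  # direct children only, so nested fields are skipped
        # Tags look like "{namespace}field"; keep only the local part.
        local = child.tag.rsplit("}", 1)[-1]
        if local in ("field", "exclGroup") and "name" in child.attrib:
            names.append(child.get("name"))
    return names


SAMPLE = """<field xmlns="http://www.xfa.org/schema/xfa-template/2.8/" name="detcon_recreation_only">
<ui/>
</field>
<exclGroup xmlns="http://www.xfa.org/schema/xfa-template/2.8/" name="detcon_photo_taken">
<?templateDesigner itemValuesSpecified 1?>
<field name="lb_yes"><ui/></field>
<field name="lb_no"><ui/></field>
</exclGroup>"""

if __name__ == "__main__":
    print(top_level_names(SAMPLE))
```

Because only the direct children of the wrapper are inspected, the lb_yes and lb_no fields inside the exclGroup are never reported, which is the behaviour the recursive pattern achieves by consuming balanced inner tags.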

I am trying to simplify the RegEx provided in the previous answer.
This is my simplified version:
RegEx: (?|(?><field.*name\s*=\s*"([a-z_]\w*)"(?:.|\n)*?(?:<\/field>))|(?:<exclGroup.*name\s*=\s*"([a-z_]\w*)"(?:.|\n)*?(?:<\/exclGroup>)))
Replace: $1
Check it out over here: https://regex101.com/r/icnF3i/3
Appreciate your feedback.
Thanks to sln for helping me to reach this level.
EDIT:
The above RegEx doesn't work in Notepad++.
To use the same under Notepad++ use the following find/replace combination:
Find: (?i)<(field|exclGroup).*name\s*=\s*"([a-z_]\w*)"[\s\S]*?<\/\1>
Replace: \(12345,"$2","/Tx", "", "N", "Y"\),
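The Notepad++ pattern above can also be tried outside the editor. The following is a rough sketch in Python using the same expression; the helper names field_names and value_rows are made up for this example, and the starting id 12345 is simply taken from the template in the question. It also numbers the rows, which the find/replace alone cannot do.

```python
import re

# Same expression as the Notepad++ "Find" field above. As in Notepad++ with
# ". matches newline" unchecked, "." stops at line ends; [\s\S] crosses them.
FIELD_RE = re.compile(
    r'(?i)<(field|exclGroup).*name\s*=\s*"([a-z_]\w*)"[\s\S]*?</\1>'
)


def field_names(xml_text):
    """Names of the outermost field/exclGroup elements, in document order."""
    return [m.group(2) for m in FIELD_RE.finditer(xml_text)]


def value_rows(names, first_id=12345):
    """One row per name, shaped like the replacement string above."""
    return [
        f'({first_id + i},"{name}","/Tx", "", "N", "Y"),'
        for i, name in enumerate(names)
    ]


SAMPLE = """<field xmlns="u" name="a_one" y="0">
<ui/>
</field>
<exclGroup name="grp" x="0in">
<field w="1" name="lb_yes">
<items/>
</field>
<field name="lb_no">
</field>
</exclGroup>"""

if __name__ == "__main__":
    print("\n".join(value_rows(field_names(SAMPLE))))
```

Each match swallows everything up to the closing tag of the element it started on, so the fields nested in the exclGroup are skipped rather than listed separately.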

Related

Replace spaces in a string by another string of characters

I have a text file containing values as follow:
<data>
<values=11.0200004578 -1.17999994755 -16.1200008392 />
<values=97.0999984741 -0.449999988079 2.16000008583 />
<values=41.7299995422 60.6699981689 43.75 />
</data>
I am trying to get it as this:
<data>
<values A="11.0200004578" B="-1.17999994755" C="-16.1200008392" />
<values A="97.0999984741" B="-0.449999988079" C="2.16000008583" />
<values A="41.7299995422" B="60.6699981689" C="43.75" />
</data>
For the first part, it's easy, it is just a sed replacement
sed 's#<values=#<values A="#'
But I cannot manage to find a way for the other values.
Instead of hardcoding the attributes A, B, C, etc., you may use this awk solution that generates these names from ASCII codes:
awk -F= '
$1 ~ /<values$/ && (n=split($2, a, /[[:blank:]]+/)) {
s = ""
ch = 0
for (i=1; i<n; ++i)
s = s sprintf("%c=\"%s\" ", 65+ch++, a[i])
$2 = s a[n]
} 1' file
<data>
<values A="11.0200004578" B="-1.17999994755" C="-16.1200008392" />
<values A="97.0999984741" B="-0.449999988079" C="2.16000008583" />
<values A="41.7299995422" B="60.6699981689" C="43.75" />
</data>
Using sed
$ sed -E 's/(=)([[:digit:].-]+) ([[:digit:].-]+) ([[:digit:].-]+)/ A\1"\2" B\1"\3" C\1"\4"/' input_file
<data>
<values A="11.0200004578" B="-1.17999994755" C="-16.1200008392" />
<values A="97.0999984741" B="-0.449999988079" C="2.16000008583" />
<values A="41.7299995422" B="60.6699981689" C="43.75" />
</data>
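If you always have 3 values A, B, and C, you can match the whole tag and use capture groups to retrieve the values. sed does not support \d, and the values can be negative, so with extended regular expressions (-E) I think it is:
sed -E 's#<values=(-?[0-9]+\.?[0-9]*) (-?[0-9]+\.?[0-9]*) (-?[0-9]+\.?[0-9]*) />#<values A="\1" B="\2" C="\3" />#' input_file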
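For readers who prefer a script over sed or awk, here is a minimal sketch of the same idea in Python that works for any number of values up to 26. The function name label_values is invented for this example, and it assumes each tag sits on a single line exactly as in the sample.

```python
import re
from string import ascii_uppercase


def label_values(line):
    """Rewrite '<values=v1 v2 ... />' as '<values A="v1" B="v2" ... />'.

    Lines without a <values=... /> tag are returned unchanged.
    """
    def repl(match):
        values = match.group(1).split()
        # zip stops at the shorter input, so at most 26 values get a letter.
        attrs = " ".join(
            f'{letter}="{value}"'
            for letter, value in zip(ascii_uppercase, values)
        )
        return f"<values {attrs} />"

    return re.sub(r"<values=([^>]*?)\s*/>", repl, line)


if __name__ == "__main__":
    for text in ("<data>", "<values=41.7299995422 60.6699981689 43.75 />"):
        print(label_values(text))
```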

loop / Extract nodes from clob xml column in oracle pl sql

I have this XML content stored in a CLOB column of a table. I have to loop through the "molecule" nodes under the "reactantList" node and store each "molecule" node in another table containing a list of molecules.
Any help, please?
I tried xmltype, xmlsequence, xmltable, etc., but none of them worked. I think I also have to specify the namespace "xmlns=.." somewhere as an argument to xmltype to make it work...
<cml xmlns="http://www.chemaxon.com" version="ChemAxon file format v20.20.0, generated by vunknown" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.chemaxon.com http://www.chemaxon.com/marvin/schema/mrvSchema_20_20_0.xsd">
<MDocument>
<MChemicalStruct>
<reaction>
<arrow type="DEFAULT" x1="-8.022119140625" y1="0.8333333333333334" x2="-3.5637858072916657" y2="0.8333333333333334" />
<reactantList>
<molecule molID="m1">
<atomArray>
<atom id="a1" elementType="C" x2="-13.938333333333334" y2="0.7083333333333333" />
<atom id="a2" elementType="O" x2="-15.478333333333333" y2="0.7083333333333333" lonePair="2" />
</atomArray>
<bondArray>
<bond id="b1" atomRefs2="a1 a2" order="1" />
</bondArray>
</molecule>
<molecule molID="m2">
<atomArray>
<atom id="a1" elementType="O" x2="-9.897119140624998" y2="0.8333333333333333" mrvValence="0" lonePair="3" />
</atomArray>
<bondArray />
</molecule>
</reactantList>
<agentList />
<productList />
</reaction>
</MChemicalStruct>
<MReactionSign toption="NOROT" fontScale="14.0" halign="CENTER" valign="CENTER" autoSize="true" id="o1">
<Field name="text">
<![CDATA[{D font=SansSerif,size=18,bold}+]]>
</Field>
<MPoint x="-11.730452473958332" y="0.6666666666666666" />
<MPoint x="-11.217119140624998" y="0.6666666666666666" />
<MPoint x="-11.217119140624998" y="1.18" />
<MPoint x="-11.730452473958332" y="1.18" />
</MReactionSign>
</MDocument>
</cml>
You can use:
INSERT INTO molecules (molecule)
SELECT x.molecule
FROM table_name t
CROSS APPLY XMLTABLE(
XMLNAMESPACES(
'http://www.w3.org/2001/XMLSchema-instance' AS "xsi",
DEFAULT 'http://www.chemaxon.com'
),
'/cml/MDocument/MChemicalStruct/reaction/reactantList/molecule'
PASSING XMLTYPE(t.xml)
COLUMNS
molecule XMLTYPE PATH '.'
) x
db<>fiddle here
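The reason the query needs the DEFAULT namespace is that every element in the document belongs to http://www.chemaxon.com, so an unqualified path matches nothing. The same effect can be seen outside Oracle; the following is a small illustrative sketch in Python, where the function name reactant_molecules and the prefix "c" are arbitrary choices for this example.

```python
import xml.etree.ElementTree as ET

# Prefix chosen for this sketch; only the namespace URI has to match the document.
NS = {"c": "http://www.chemaxon.com"}


def reactant_molecules(cml_text):
    """Return the molecule elements under reactantList, in document order."""
    root = ET.fromstring(cml_text)  # root is the <cml> element
    path = "c:MDocument/c:MChemicalStruct/c:reaction/c:reactantList/c:molecule"
    return root.findall(path, NS)


def molecule_xml(cml_text):
    """Serialize each reactant molecule back to an XML string."""
    return [
        ET.tostring(molecule, encoding="unicode")
        for molecule in reactant_molecules(cml_text)
    ]
```

Dropping the "c:" prefixes from the path makes findall return an empty list, which mirrors what happens in XMLTABLE when the namespace declaration is left out.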

In TDE Marklogic how to escape null value triples?

<template xmlns="http://marklogic.com/xdmp/tde">
<context>/test</context>
<vars>
<var>
<name>subprefix</name>
<val>"http://www.test.com/resource/test/"</val>
</var>
<var>
<name>objprefix</name>
<val>"http://www.test.com/resource/test/"</val>
</var>
</vars>
<triples>
<triple>
<subject>
<val>sem:iri($subprefix || ElemenetName)</val>
<invalid-values>ignore</invalid-values>
</subject>
<predicate>
<val>sem:iri('is')</val>
</predicate>
<object>
<val>sem:iri($objprefix || FullName)</val>
<invalid-values>ignore</invalid-values>
</object>
</triple>
</triples>
</template>
I have created a template to get triples out of XML,
but I want to skip triples that have a null value (s, p or o).
I am using ignore, but this works only if there is no prefix in the subject or object. If there is a prefix, it creates triples with a null value (only the prefix).
Is there any way to handle this in MarkLogic TDE?
Nullable object/subject issue.
You can get more out of the context expression, particularly if you use sub-templates. Here is a crude example showing a sub-template, applied to 3 example docs:
xquery version "1.0-ml";
let $tde :=
<template xmlns="http://marklogic.com/xdmp/tde">
<context>/test</context>
<vars>
<var>
<name>subprefix</name>
<val>"http://www.test.com/resource/test/"</val>
</var>
<var>
<name>objprefix</name>
<val>"http://www.test.com/resource/test/"</val>
</var>
</vars>
<templates>
<template>
<context>FullName</context>
<triples>
<triple>
<subject>
<val>sem:iri($subprefix || ../ElemenetName)</val>
<invalid-values>ignore</invalid-values>
</subject>
<predicate>
<val>sem:iri('is')</val>
</predicate>
<object>
<val>sem:iri($objprefix || .)</val>
<invalid-values>ignore</invalid-values>
</object>
</triple>
</triples>
</template>
</templates>
</template>
let $xml1 := <test><ElemenetName>elem</ElemenetName><FullName>full</FullName></test>
let $xml2 := <test><ElemenetName>elem</ElemenetName></test>
let $xml3 := <test><FullName>full</FullName></test>
return tde:node-data-extract(($xml1, $xml2, $xml3), $tde)
Some more background on sub-templates can be found here:
https://docs.marklogic.com/guide/sql/creating-template-views#id_28999
There is a simpler way to handle the NULL values in TDE:
<...>
<val>if(ElemenetName ne '') then sem:iri($subprefix || ElemenetName) else ()</val>
<invalid-values>ignore</invalid-values>
<....>

How can I highlight syntax in Microsoft OneNote 2013?

I want to highlight syntax for my programming language of choice (proprietary) in Microsoft OneNote 2013 with a macro or script. I found a free Macro creator for MS OneNote '13 that allows creation of custom macros called "OneTastic". I created a macro that is given two arrays with lists of predefined words associated with different colors to give each list (ex: List 1 words = blue, list 2 words = orange, etc.)
API: https://www.omeratay.com/onetastic/docs/
Problem: The search logic is finding words inside of bigger words, like "IN" inside of the word "domain" (domaIN). My code is below:
<?xml version="1.0" encoding="utf-16"?>
<Macro name="CCL TEST 3" category="Color" description="" version="10">
<ModifyVar name="KEYWORDS1" op="set">
<Function name="String_Split">
<Param name="string" value="drop create program go %i declare call set end END execute else elseif protect constant curqual of subroutine to noconstant record free range in is protect define macro endmacro" />
<Param name="delimiter" value=" " />
</Function>
</ModifyVar>
<ModifyVar name="counter" op="set" value="0" />
<WhileVar name="counter" op="lt">
<Function name="Array_Length">
<Param name="array" var="KEYWORDS1" />
</Function>
<IsRootOp />
<ModifyVar name="keyword" op="set" var="KEYWORDS1">
<RightIndex var="counter" />
</ModifyVar>
<For each="Text">
<That hasProp="value" op="eq" var="keyword" />
<ModifyProp name="fontColor" op="set" value="blue" />
</For>
<ModifyVar name="counter" op="add" value="1" />
</WhileVar>
<ModifyVar name="KEYWORDS2" op="set">
<Function name="String_Split">
<Param name="string" value="datetimefind datetimediff cnvtdatetime cnvtalias format build concat findfile error alterlist alter initrec cnvtdate esmError echo max min avg sum count uar_get_code_meaning mod substring size trim hour day isnumeric expand locateval cnvtstring fillstring btestfindstring logical uar_get_code_display uar_get_meaning_by_codeset UAR_GET_CODE_BY sqltype cnvtreal echorecord cnvtupper cnvtlower cnvtdatetimeutc abs datetimediff year julian btest decode evaluate findstring asis replace validate nullterm parser value uar_timer_create uar_CreatePropList uar_SetPropString uar_CloseHandle uar_Timer_Destroy uar_Timer_Stop build2 patstring piece cnvtalphanum timestampdiff" />
<Param name="delimiter" value=" " />
</Function>
</ModifyVar>
<ModifyVar name="counter2" op="set" value="0" />
<WhileVar name="counter2" op="lt">
<Function name="Array_Length">
<Param name="array" var="KEYWORDS2" />
</Function>
<IsRootOp />
<ModifyVar name="keyword" op="set" var="KEYWORDS2">
<RightIndex var="counter2" />
</ModifyVar>
<For each="Text">
<That hasProp="value" op="eq" var="keyword" />
<ModifyProp name="fontColor" op="set" value="orange" />
</For>
<ModifyVar name="counter2" op="add" value="1" />
</WhileVar>
</Macro>
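The "IN inside domaIN" symptom described above is the usual difference between substring matching and whole-word matching. As an illustration only, written in Python rather than in the OneTastic macro language, whole-word matching can be sketched as below; the function name keyword_spans is made up for this example. Lookarounds are used instead of \b because a keyword such as %i does not start with a word character.

```python
import re


def keyword_spans(text, keywords):
    """Return (start, end, keyword) for each whole-word keyword occurrence.

    A hit only counts when it is not directly preceded or followed by a
    word character, so "in" inside "domain" is ignored.
    """
    # Longest keywords first, so "elseif" is preferred over "else".
    alternation = "|".join(
        re.escape(k) for k in sorted(keywords, key=len, reverse=True)
    )
    pattern = re.compile(rf"(?<!\w)(?:{alternation})(?!\w)")
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(text)]


if __name__ == "__main__":
    print(keyword_spans("set domain in range", ["in", "set", "range"]))
```

Matching is case-sensitive here, which fits a keyword list that contains both end and END.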
There is no such built-in feature in OneNote, but you can do it.
Use Visual Studio Code; it's free. Turn on rich text copy/pasting. Write your code in VS Code and copy it. It will paste exactly as you see it, colors and all.
While this does not use VBA, I use and love the add-in NoteHighlight2016.
If you don't find a language you can go through and add your own. I've added the Excel Formula keywords to the languages supported and I believe it is a bit easier than creating in OneTastic, which I also use and love.

FOR XML multiple control by attribute in tree concept

I want to figure out one issue.
I already asked a question about a simple ordering issue, but now I want to control the ordering in more detail.
See the link below:
SQL Server : FOR XML sorting control by attribute
I made an example case.
SQL query:
select (
select '123' AS '@id', (
select
(
select 'test' AS '@testid' , '20' AS '@order'
FOR XML path ('tree') , TYPE
),
(
select 'test2' AS '@testid' , '30' AS '@order'
FOR XML path ('tree-order') , TYPE
),
(
select 'test' AS '@testid' , '10' AS '@order'
FOR XML path ('tree') , TYPE
)
FOR XML path ('Node') , TYPE
)
FOR XML path ('Sample') , TYPE
),
(select '456' AS '@id', (
select
(
select 'test' AS '@testid' , '20' AS '@order'
FOR XML path ('tree') , TYPE
),
(
select 'test2' AS '@testid' , '30' AS '@order'
FOR XML path ('tree-order') , TYPE
),
(
select 'test' AS '@testid' , '10' AS '@order'
FOR XML path ('tree') , TYPE
)
FOR XML path ('Node') , TYPE
)
FOR XML path ('Sample') , TYPE)
FOR XML path ('Main') , TYPE
Result :
<Main>
<Sample id="123">
<Node>
<tree testid="test" order="20" />
<tree-order testid="test2" order="30" />
<tree testid="test" order="10" />
</Node>
</Sample>
<Sample id="456">
<Node>
<tree testid="test" order="20" />
<tree-order testid="test2" order="30" />
<tree testid="test" order="10" />
</Node>
</Sample>
</Main>
Expected result :
<Main>
<Sample id="123">
<Node>
<tree testid="test" order="10" />
<tree testid="test" order="20" />
<tree-order testid="test2" order="30" />
</Node>
</Sample>
<Sample id="456">
<Node>
<tree testid="test" order="10" />
<tree testid="test" order="20" />
<tree-order testid="test2" order="30" />
</Node>
</Sample>
</Main>
final result :
<Main>
<Sample id="123">
<Node>
<tree testid="test" />
<tree testid="test" />
<tree-order testid="test2" />
</Node>
</Sample>
<Sample id="456">
<Node>
<tree testid="test" />
<tree testid="test" />
<tree-order testid="test2" />
</Node>
</Sample>
</Main>
That is, ordered by the order attribute.
Finally, I don't want to show the order information in the attributes.
Does anyone have a good idea?
Thank you to everybody who is interested in this.
Updated ----------------------------------------
Thank you, everybody. I finally solved the ordering and attribute-removal problem as below:
declare @resultData xml = (select @data.query('
element Main {
for $s in Main/Sample
return element Sample {
$s/@*,
for $n in $s/Node
return element Node {
for $i in $n/*
order by $i/@order
return $i
}
}
}'));
SET @resultData.modify('delete (Main/Sample/Node/tree/@order)');
SET @resultData.modify('delete (Main/Sample/Node/tree-order/@order)');
select @resultData
select @data.query('
element Main {
for $s in Main/Sample
return element Sample {
$s/@*,
for $n in $s/Node
return element Node {
for $i in $n/*
order by $i/@order
return
if ($i/self::tree)
then element tree { $i/@testid }
else element tree-order { $i/@testid }
}
}
}')
What's interesting to me is that in your original post, you're stating that you're generating the XML as the result of a SQL query. If it were me, I'd control the ordering at that level.
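To make the intended transformation concrete independently of SQL Server, here is a small sketch in Python of the two steps discussed in this thread: sort the children of each Node by their order attribute, then drop that attribute. The function name sort_and_strip is made up for this example, and it assumes every child of Node carries a numeric order attribute.

```python
import xml.etree.ElementTree as ET


def sort_and_strip(xml_text):
    """Sort each Node's children by their numeric order attribute, then
    remove the attribute so it does not appear in the output."""
    root = ET.fromstring(xml_text)
    for node in root.findall(".//Node"):
        # sorted() is stable, so children with equal order keep their position.
        node[:] = sorted(node, key=lambda child: int(child.get("order")))
        for child in node:
            del child.attrib["order"]
    return ET.tostring(root, encoding="unicode")


SAMPLE = (
    '<Main><Sample id="123"><Node>'
    '<tree testid="test" order="20" />'
    '<tree-order testid="test2" order="30" />'
    '<tree testid="test" order="10" />'
    "</Node></Sample></Main>"
)

if __name__ == "__main__":
    print(sort_and_strip(SAMPLE))
```

The sort key converts the attribute to an integer, so an order of 100 would sort after 20; a plain string comparison would not guarantee that.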