Oracle SQL*Loader getting CDATA values - sql

Does anybody know how to do this? I know there are better ways of loading XML data into Oracle than SQL*Loader, but I'm curious how it is done with it. I already have a control file that loads XML data into the DB; however, it won't run if the XML file has values wrapped in CDATA.
Below is the control file, which works when the values are not CDATA:
LOAD DATA
INFILE FRATS.xml "str '</ROW>'"
APPEND
INTO TABLE "FRATERNITIES"
(
DUMMY FILLER TERMINATED BY "<ROW>",
THE_CODE SEQUENCE (MAX, 1),
DUMMY2 FILLER TERMINATED BY "</COLUMN>",
STORE_NN_KJ ENCLOSED BY '<COLUMN NAME="THE_NAME">' AND '</COLUMN>',
STAFF_COUNT ENCLOSED BY '<COLUMN NAME="THE_COUNT">' AND '</COLUMN>'
)
Here's the XML file:
<?xml version='1.0' encoding='MS932' ?>
<RESULTS>
<ROW>
<COLUMN NAME="THE_CODE">777</COLUMN>
<COLUMN NAME="THE_NAME">CharlieOscarDelta</COLUMN>
<COLUMN NAME="THE_COUNT">24</COLUMN>
</ROW>
</RESULTS>
Here's the XML file with CDATA values. My control file will not run with it:
<?xml version='1.0' encoding='MS932' ?>
<RESULTS>
<ROW>
<COLUMN NAME="THE_CODE"><![CDATA[777]]></COLUMN>
<COLUMN NAME="THE_NAME"><![CDATA[CharlieOscarDelta]]></COLUMN>
<COLUMN NAME="THE_COUNT"><![CDATA[24]]></COLUMN>
</ROW>
</RESULTS>

Have you tried the following? (EDIT: I forgot a ) in my first version; try this one.)
STORE_NN_KJ ENCLOSED BY '<COLUMN NAME="THE_NAME">' AND '</COLUMN>' "substr(substr(:STORE_NN_KJ,instr(:STORE_NN_KJ,'<![CDATA[')+9),1,instr(substr(:STORE_NN_KJ,instr(:STORE_NN_KJ,'<![CDATA[')+9),']]>')-1)"
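The idea behind the expression, taking the text between `<![CDATA[` and `]]>`, can be checked outside the database. The following is only a Python sketch of that string logic, not SQL*Loader code; the function name `strip_cdata` is made up for illustration, and unlike the SQL expression it also passes plain (non-CDATA) values through unchanged.

```python
def strip_cdata(field: str) -> str:
    """Return the text inside the first CDATA section, or the field unchanged."""
    start_marker, end_marker = "<![CDATA[", "]]>"
    start = field.find(start_marker)
    if start == -1:
        return field  # plain value, nothing to strip
    start += len(start_marker)  # same role as the "+9" in the SQL expression
    end = field.find(end_marker, start)
    # Stop just before the closing marker; keep the rest if it is missing.
    return field[start:end] if end != -1 else field[start:]


print(strip_cdata("<![CDATA[CharlieOscarDelta]]>"))  # CharlieOscarDelta
print(strip_cdata("24"))  # 24
```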

Related

Delete lines by multiple patterns in specific range of lines

I have the following (simplified) file:
<RESULTS>
<ROW>
<COLUMN NAME="TITLE">title 1</COLUMN>
<COLUMN NAME="VERSION">1,3</COLUMN>
</ROW>
<ROW>
<COLUMN NAME="TITLE">title 1</COLUMN>
<COLUMN NAME="VERSION">1,1</COLUMN>
</ROW>
<ROW>
<COLUMN NAME="TITLE">title 1</COLUMN>
<COLUMN NAME="VERSION">1,2</COLUMN>
</ROW>
</RESULTS>
What I am trying to achieve is to delete all ROW elements that match on the title, but do not match on the latest VERSION (in this case 1,3).
So, what I have in mind is something like the following with sed:
sed -i '/<ROW>/,/<\/ROW>/<COLUMN NAME=\"TITLE\">title 1.*<COLUMN NAME=\"VERSION\">^1,3<\/COLUMN>/d' file
The expected output should be the following:
<RESULTS>
<ROW>
<COLUMN NAME="TITLE">title 1</COLUMN>
<COLUMN NAME="VERSION">1,3</COLUMN>
</ROW>
</RESULTS>
Unfortunately, this did not work, neither did anything that I tried. I searched a lot for similar issues, but nothing worked for me.
Is there a way of achieving it with any Linux command line utility (sed, awk, etc)?
Thanks a lot in advance.
A plain /<ROW>/,/<\/ROW>/ range won't work on its own: sed still processes the lines in the range one at a time, so a single pattern can never see the TITLE line and the VERSION line together.
You'll have to use one of the advanced features of sed. The simplest is probably the hold space.
This:
sed -n '/<ROW>/{h;d;};H'
will store an entire ROW block in the hold space, and overwrite it when it encounters a new ROW block. (And print nothing.)
This:
sed -n '/<ROW>/{h;d;};H;/<\/ROW>/{g;p;}'
will store the entire ROW block, then print it out when it is complete.
This:
sed -n '/<ROW>/{h;d;};H;/<\/ROW>/{g;/title 1/!d;p;}'
will do the same, but will delete a block that does not contain "title 1".
This:
sed -n '/<ROW>/{h;d;};H;/<\/ROW>/{g;/title 1/!d;/1,3/p;}'
will do the same, but print only if the block contains "1,3". (You can spell out the matching lines more explicitly; I'm trying to keep this code concise.)
This might work for you (GNU sed):
sed '/<ROW>/{:a;N;/<\/ROW>/!ba;/TITLE.*title 1/!b;/VERSION.*1,3/b;d}' file
Gather up lines between <ROW> and </ROW>.
If the lines collected don't contain the correct title, bail out (the block is printed unchanged).
If the lines collected do contain the correct version, bail out as well.
Otherwise delete the lines collected.
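For comparison, the same gather-then-decide logic can be sketched in Python. This is an illustration under assumptions, not a replacement for the sed answers: the function name `filter_rows` is hypothetical, and it assumes each tag sits on its own line, as in the sample file.

```python
def filter_rows(text: str, title: str, keep_version: str) -> str:
    """Drop ROW blocks with the given title whose VERSION is not keep_version."""
    out, block = [], None
    for line in text.splitlines():
        if block is None:
            if "<ROW>" in line:
                block = [line]        # start collecting a ROW block
            else:
                out.append(line)      # lines outside any ROW pass through
            continue
        block.append(line)
        if "</ROW>" in line:
            joined = "\n".join(block)
            has_title = f'<COLUMN NAME="TITLE">{title}</COLUMN>' in joined
            has_version = f'<COLUMN NAME="VERSION">{keep_version}</COLUMN>' in joined
            # Keep blocks with another title, or with the wanted version.
            if not has_title or has_version:
                out.extend(block)
            block = None
    if block:                         # unterminated block: keep it untouched
        out.extend(block)
    return "\n".join(out)


sample = "\n".join([
    "<RESULTS>",
    "<ROW>",
    '<COLUMN NAME="TITLE">title 1</COLUMN>',
    '<COLUMN NAME="VERSION">1,3</COLUMN>',
    "</ROW>",
    "<ROW>",
    '<COLUMN NAME="TITLE">title 1</COLUMN>',
    '<COLUMN NAME="VERSION">1,1</COLUMN>',
    "</ROW>",
    "</RESULTS>",
])
print(filter_rows(sample, "title 1", "1,3"))
```

Unlike the hold-space variant, this keeps the surrounding RESULTS lines and any ROW blocks with a different title.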

Concatenate XML in BPEL 2.0

I need help with a requirement in BPEL 2.0. I have a collection in the format below:
<FilesCollection>
<Files>
<transactionid/>
<status/>
<filename/>
</Files>
</FilesCollection>
I will receive several such collections while traversing a ForEach loop.
Once the loop has finished, I need to concatenate all the collections so that I finally get something like this:
<FilesCollection>
<Files>
<transactionid/>
<status/>
<filename/>
</Files>
<Files>
<transactionid/>
<status/>
<filename/>
</Files>
<Files>
<transactionid/>
<status/>
<filename/>
</Files>
</FilesCollection>
Please note that the number of FilesCollection elements and the number of Files elements within each of them are dynamic.
Please help me with this.
Thanks
Arijit
As I understand it, you have multiple FilesCollection elements in one XML document and want to wrap their contents in a single FilesCollection. In that case you need something like this.
Note: this assumes the root element of the source XML is named root.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs" version="1.0">
<xsl:template match="root">
<root>
<FilesCollection>
<xsl:copy-of select="FilesCollection/node()"/>
</FilesCollection>
</root>
</xsl:template>
</xsl:stylesheet>
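The merge that the stylesheet performs can also be sketched outside BPEL, for example to check the expected result on sample data. The Python below is only an illustration: the function name `merge_collections` and the transaction IDs 1 to 3 are invented, and it assumes each collection arrives as a separate XML string.

```python
import xml.etree.ElementTree as ET


def merge_collections(fragments):
    """Combine the <Files> children of several <FilesCollection> documents."""
    merged = ET.Element("FilesCollection")
    for fragment in fragments:
        root = ET.fromstring(fragment)
        # Move every Files element of this collection into the merged one.
        merged.extend(root.findall("Files"))
    return merged


parts = [
    "<FilesCollection><Files><transactionid>1</transactionid></Files></FilesCollection>",
    "<FilesCollection><Files><transactionid>2</transactionid></Files>"
    "<Files><transactionid>3</transactionid></Files></FilesCollection>",
]
merged = merge_collections(parts)
print(ET.tostring(merged, encoding="unicode"))
```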

TransferSpreadsheet builds invalid XLSX file if user customized decimal separator

In Microsoft Access 2016 (build 16.0.8201.2200), the VBA TransferSpreadsheet method does not work properly when the Windows 10 number format is customized. Specifically, the problem appears on a computer with the US region selected if you swap the "decimal symbol" and the "digit grouping symbol" so that numbers are formatted as is customary in Germany.
When I use TransferSpreadsheet to save a query and then attempt to open that workbook in Excel, it says:
We have found some problem in some content in '...'. Do you want us to try to recover as much as we can?
When I do, I get the following warning:
Excel was able to open the file by repairing or removing the unreadable content.
When I look at the contents of the XLSX file, I'm not surprised it's having a problem, because the internal XML is not valid. Because I've changed the decimal separator to "," in Windows, the numbers are written to the XML with commas instead of periods. But regardless of your regional preferences, numbers in the XLSX XML must use "." as the decimal symbol.
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
<dimension ref="A1:K20"/>
<sheetViews>...</sheetViews>
<sheetFormatPr defaultRowHeight="15"/>
<sheetData>
<row outlineLevel="0" r="1">...</row>
<row outlineLevel="0" r="2">
...
<c r="D2" s="0">
<v>2,9328903531E+16</v>
</c>
<c r="E2" s="0">
<v>5,404939826E+16</v>
</c>
<c r="F2" s="0">
<v>2,3923963705E+16</v>
</c>
...
</row>
...
</sheetData>
<pageMargins left="0.7" right="0.7" top="0.75" bottom="0.75" header="0.3" footer="0.3"/>
</worksheet>
While "," might be the desired decimal symbol in the UI, the XLSX internal format must use "." as the decimal symbol.
How do I solve this?
Bottom line: for the TransferSpreadsheet method to work correctly when you want to change the formatting of numbers, do not use the "Customize Format" setting.
You should instead reset those values back to their defaults, and then select an appropriate region in the preceding dialog box, one that formats numbers as you prefer.
Having chosen a region that is formatted as desired, you thereby avoid the TransferSpreadsheet bug. When you do this, the spreadsheet will appear correctly in Excel, and the XLSX will be formatted properly, too:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" mc:Ignorable="x14ac" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac">
<dimension ref="D3:F3"/>
<sheetViews>
<sheetView tabSelected="1" workbookViewId="0">
<selection activeCell="F12" sqref="F12"/>
</sheetView>
</sheetViews>
<sheetFormatPr defaultRowHeight="15" x14ac:dyDescent="0.25"/>
<cols>
<col min="4" max="6" width="26.85546875" style="1" bestFit="1" customWidth="1"/>
</cols>
<sheetData>
<row r="3" spans="4:6" x14ac:dyDescent="0.25">
<c r="D3" s="1">
<v>2.9328903531E+16</v>
</c>
<c r="E3" s="1">
<v>5.40493826E+16</v>
</c>
<c r="F3" s="1">
<v>2.3923963705E+16</v>
</c>
</row>
</sheetData>
<pageMargins left="0.7" right="0.7" top="0.75" bottom="0.75" header="0.3" footer="0.3"/>
</worksheet>
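To check whether an exported worksheet is affected, one could scan its XML for numeric cells whose cached value does not use "." as the decimal symbol. The Python below is a rough sketch, not part of Access or Excel: the function name `bad_cells` and the regular expression are assumptions made for illustration, and the sample is a cut-down version of the broken sheet above.

```python
import re
import xml.etree.ElementTree as ET

NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}
# Optional sign, digits with "." as the only decimal symbol, optional exponent.
NUMBER = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")


def bad_cells(sheet_xml: str):
    """Return (cell reference, raw text) for numeric cells with a malformed value."""
    root = ET.fromstring(sheet_xml)
    found = []
    for cell in root.iterfind(".//m:c", NS):
        # A t attribute other than "n" marks strings, booleans, etc.; skip those.
        if cell.get("t") not in (None, "n"):
            continue
        value = cell.find("m:v", NS)
        if value is None:
            continue
        text = (value.text or "").strip()
        if not NUMBER.fullmatch(text):
            found.append((cell.get("r"), text))
    return found


sample = """<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <sheetData>
    <row r="2">
      <c r="D2" s="0"><v>2,9328903531E+16</v></c>
      <c r="E2" s="0"><v>5.404939826E+16</v></c>
    </row>
  </sheetData>
</worksheet>"""
print(bad_cells(sample))  # [('D2', '2,9328903531E+16')]
```

An XLSX file is a ZIP archive, so the sheet XML can be read with Python's `zipfile` module first; the part name inside the archive (typically something under `xl/worksheets/`) depends on the workbook.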

Group by ID and Sum Amount in XSLT

I'm fairly new to XSLT and stuck on a problem. I've searched Stack Overflow (the Muenchian method seems to be the common grouping approach), but I haven't been able to adapt any of the posted ideas yet.
I'm working with a system that reads line items, and I'm trying to write XSLT that checks the supplier ID on every line. Lines with the same ID should be aggregated into one line with their amounts summed; a different ID should start a new line with its own ID and summed amount, and so forth. I am using xml version='1.0'
Below is my current data file in XML:
<data>
<row>
<column1>06-11111</column1>
<column2>CP</column2>
<column3>744.04</column3>
<column4>CAD</column4>
</row>
<row>
<column1>06-11111</column1>
<column2>CP</column2>
<column3>105.09</column3>
<column4>CAD</column4>
</row>
<row>
<column1>06-11111</column1>
<column2>CP</column2>
<column3>1366.24</column3>
<column4>CAD</column4>
</row>
<row>
<column1>06-11111</column1>
<column2>CP</column2>
<column3>485.71</column3>
<column4>CAD</column4>
</row>
<row>
<column1>06-11112</column1>
<column2>Ever</column2>
<column3>459.60</column3>
<column4>CAD</column4>
</row>
<row>
<column1>06-11112</column1>
<column2>Ever</column2>
<column3>409.14</column3>
<column4>CAD</column4>
</row>
<row>
<column1>06-11112</column1>
<column2>Ever</column2>
<column3>397.12</column3>
<column4>CAD</column4>
</row>
<row>
<column1>06-11113</column1>
<column2>GE</column2>
<column3>1425</column3>
<column4>CAD</column4>
</row>
<row>
<column1>06-11114</column1>
<column2>Husky</column2>
<column3>-215.14</column3>
<column4>USD</column4>
</row>
<row>
<column1>06-11114</column1>
<column2>Husky</column2>
<column3>2015</column3>
<column4>USD</column4>
</row>
<row>
<column1>06-11114</column1>
<column2>Husky</column2>
<column3>11195.34</column3>
<column4>USD</column4>
</row>
</data>
The output I would like to achieve after running the XSLT is
06-11111 | CP |2701.08
06-11112 | Ever |1265.86
06-11113 | GE |1425
06-11114 | Husky |12995.20
Any help to get me started would be fantastic!
Here is the grouping using the Muenchian method. I'll let you play with getting the numbers formatted correctly based on the number of decimal points.
I typically don't use this because it's limited, tricky and doesn't lend itself to push programming. But, it will work for you today.
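<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<!-- Suppress the default text output; only the data template below writes anything. -->
<xsl:template match="@* | node()">
<xsl:apply-templates select="@* | node()"/>
</xsl:template>
<!-- Index rows by supplier ID and name for Muenchian grouping. -->
<xsl:key name="rows" match="row" use="concat(column1, '||', column2)" />
<xsl:template match="data">
<!-- Visit only the first row of each group. -->
<xsl:for-each select="row[generate-id(.) = generate-id(key('rows', concat(column1, '||', column2))[1])]">
<xsl:sort select="column1" data-type="text" order="ascending"/>
<xsl:sort select="column2" data-type="text" order="ascending"/>
<xsl:value-of select="concat(column1,'|',column2,'|')"/>
<!-- Sum column3 over every row in the current group. -->
<xsl:variable name="mySum">
<xsl:value-of select="sum(key('rows', concat(column1, '||', column2))/column3)"/>
</xsl:variable>
<xsl:value-of select="format-number($mySum,'#,##0.00')"/>
<!-- End each output line with a newline. -->
<xsl:text>&#10;</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>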
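To cross-check the totals the stylesheet should produce, the same group-and-sum can be sketched in Python. This is only a verification aid under assumptions, not an XSLT solution: the function name `sum_by_supplier` is hypothetical, and the sample is a subset of the rows from the question.

```python
from decimal import Decimal
import xml.etree.ElementTree as ET


def sum_by_supplier(xml_text):
    """Sum column3 per (column1, column2) pair, keeping first-seen order."""
    totals = {}
    for row in ET.fromstring(xml_text).iter("row"):
        key = (row.findtext("column1"), row.findtext("column2"))
        # Decimal avoids binary floating-point rounding on the amounts.
        totals[key] = totals.get(key, Decimal("0")) + Decimal(row.findtext("column3"))
    return totals


sample = """<data>
<row><column1>06-11113</column1><column2>GE</column2><column3>1425</column3><column4>CAD</column4></row>
<row><column1>06-11114</column1><column2>Husky</column2><column3>-215.14</column3><column4>USD</column4></row>
<row><column1>06-11114</column1><column2>Husky</column2><column3>2015</column3><column4>USD</column4></row>
<row><column1>06-11114</column1><column2>Husky</column2><column3>11195.34</column3><column4>USD</column4></row>
</data>"""
for (supplier_id, name), total in sum_by_supplier(sample).items():
    print(f"{supplier_id} | {name} | {total}")
```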

Piwik statistics about all websites

Is it possible to use the Piwik API with all websites, not just a single one?
What I want to do is get a mean value of the browsers used. I can do this for a single website like this:
?module=API&method=UserSettings.getBrowser&idSite=1&period=day&date=last10&format=rss
If I just remove idSite=1, I get an error.
You can specify all sites using idSite=all. You can also specify multiple sites by separating the IDs with commas: idSite=1,2,4,5.
The resulting output is given per idSite, wrapped in an extra <results> tag. So whereas before you had
<result>
<row>
<label>Chrome 14.0</label>
<nb_uniq_visitors>13</nb_uniq_visitors>
...
</row>
<row>
<label>Chrome 13.0</label>
<nb_uniq_visitors>13</nb_uniq_visitors>
...
</row>
...
</result>
You now get
<results>
<result idSite="2">
<row>
<label>Chrome 14.0</label>
<nb_uniq_visitors>13</nb_uniq_visitors>
...
</row>
<row>
<label>Chrome 13.0</label>
<nb_uniq_visitors>13</nb_uniq_visitors>
...
</row>
...
</result>
<result idSite="3">
<row>
<label>Chrome 14.0</label>
<nb_uniq_visitors>13</nb_uniq_visitors>
...
</row>
<row>
<label>Chrome 13.0</label>
<nb_uniq_visitors>13</nb_uniq_visitors>
...
</row>
...
</result>
...
</results>
This does mean that any aggregation for your mean value has to be done after you get the data, but this should be relatively trivial.
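A minimal Python sketch of that aggregation follows. It assumes the response has exactly the `<results>`/`<result idSite="...">` shape shown above and averages `nb_uniq_visitors` per browser label over the sites returned; the function name `mean_visitors_per_browser` is made up, and a real response may be nested differently (for example per date when a date range is requested).

```python
from collections import defaultdict
import xml.etree.ElementTree as ET


def mean_visitors_per_browser(xml_text):
    """Average nb_uniq_visitors per browser label across all <result> blocks."""
    sites = ET.fromstring(xml_text).findall("result")
    totals = defaultdict(int)
    for site in sites:
        for row in site.findall("row"):
            totals[row.findtext("label")] += int(row.findtext("nb_uniq_visitors"))
    # A browser missing from a site counts as zero visitors for that site.
    return {label: total / len(sites) for label, total in totals.items()}


sample = """<results>
  <result idSite="2">
    <row><label>Chrome 14.0</label><nb_uniq_visitors>13</nb_uniq_visitors></row>
    <row><label>Chrome 13.0</label><nb_uniq_visitors>13</nb_uniq_visitors></row>
  </result>
  <result idSite="3">
    <row><label>Chrome 14.0</label><nb_uniq_visitors>13</nb_uniq_visitors></row>
    <row><label>Chrome 13.0</label><nb_uniq_visitors>13</nb_uniq_visitors></row>
  </result>
</results>"""
print(mean_visitors_per_browser(sample))
```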