How do you add a numbered or bulleted list in a table cell entry or line breaks - docbook-5

In DocBook 5, I am trying to add a simplelist, or single lines of text, inside a table row's cell entry, but I cannot find the combination of allowed elements. Is it possible?

This is valid according to my editor:
<informaltable>
<tgroup cols="1">
<tbody>
<row>
<entry>
<para>A single line of text</para>
<simplelist>
<member>Apiary</member>
<member>Beekeeper</member>
</simplelist>
</entry>
</row>
</tbody>
</tgroup>
</informaltable>

Related

Delete lines by multiple patterns in specific range of lines

I have the following (simplified) file:
<RESULTS>
<ROW>
<COLUMN NAME="TITLE">title 1</COLUMN>
<COLUMN NAME="VERSION">1,3</COLUMN>
</ROW>
<ROW>
<COLUMN NAME="TITLE">title 1</COLUMN>
<COLUMN NAME="VERSION">1,1</COLUMN>
</ROW>
<ROW>
<COLUMN NAME="TITLE">title 1</COLUMN>
<COLUMN NAME="VERSION">1,2</COLUMN>
</ROW>
</RESULTS>
What I am trying to achieve is to delete all ROW elements that match on the title, but do not match on the latest VERSION (in this case 1,3).
So, what I have in mind is something like the following with sed:
sed -i '/<ROW>/,/<\/ROW>/<COLUMN NAME=\"TITLE\">title 1.*<COLUMN NAME=\"VERSION\">^1,3<\/COLUMN>/d' file
The expected output should be the following:
<RESULTS>
<ROW>
<COLUMN NAME="TITLE">title 1</COLUMN>
<COLUMN NAME="VERSION">1,3</COLUMN>
</ROW>
</RESULTS>
Unfortunately, this did not work, and neither did anything else I tried. I searched a lot for similar issues, but nothing worked for me.
Is there a way of achieving it with any Linux command line utility (sed, awk, etc)?
Thanks a lot in advance.
/<ROW>/,/<\/ROW>/ won't work on its own: a range address only selects lines, and sed still runs the command that follows on one line at a time, so a pattern that spans several lines of a ROW block can never match.
You'll have to use one of the advanced features of sed. The simplest is probably the hold space.
This:
sed -n '/<ROW>/{h;d;};H'
will store an entire ROW block in the hold space, and overwrite it when it encounters a new ROW block. (And print nothing.)
This:
sed -n '/<ROW>/{h;d;};H;/<\/ROW>/{g;p;}'
will store the entire ROW block, then print it out when it is complete.
This:
sed -n '/<ROW>/{h;d;};H;/<\/ROW>/{g;/title 1/!d;p;}'
will do the same, but will delete a block that does not contain "title 1".
This:
sed -n '/<ROW>/{h;d;};H;/<\/ROW>/{g;/title 1/!d;/1,3/p;}'
will do the same, but print only if the block contains "1,3". (You can spell out the matching lines more explicitly; I'm trying to keep this code concise.)
This might work for you (GNU sed):
sed '/<ROW>/{:a;N;/<\/ROW>/!ba;/TITLE.*title 1/!b;/VERSION.*1,3/b;d}' file
Gather up lines between <ROW> and </ROW>.
If the lines collected don't contain the correct title, bail out.
If the lines collected do contain the correct version, bail out.
Otherwise delete the lines collected.
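For comparison, the same filtering can be sketched with an XML parser instead of sed. This is a minimal Python sketch and not part of the answers above: the helper names row_title, row_version and keep_latest_rows are made up for illustration, and it assumes every VERSION value is a comma-separated list of integers such as 1,3. Unlike the sed commands, it keeps the highest version for every title instead of hard-coding "title 1" and "1,3".

```python
import xml.etree.ElementTree as ET

SAMPLE = """<RESULTS>
<ROW>
<COLUMN NAME="TITLE">title 1</COLUMN>
<COLUMN NAME="VERSION">1,3</COLUMN>
</ROW>
<ROW>
<COLUMN NAME="TITLE">title 1</COLUMN>
<COLUMN NAME="VERSION">1,1</COLUMN>
</ROW>
<ROW>
<COLUMN NAME="TITLE">title 1</COLUMN>
<COLUMN NAME="VERSION">1,2</COLUMN>
</ROW>
</RESULTS>"""


def row_title(row):
    """Return the text of the TITLE column of a ROW element."""
    return row.findtext("COLUMN[@NAME='TITLE']")


def row_version(row):
    """Return the VERSION column as a tuple of ints, e.g. "1,3" -> (1, 3).

    Assumption: versions are comma-separated integers.
    """
    text = row.findtext("COLUMN[@NAME='VERSION']")
    return tuple(int(part) for part in text.split(","))


def keep_latest_rows(xml_text):
    """Parse the document and drop every ROW that is not the latest for its title."""
    root = ET.fromstring(xml_text)
    rows = root.findall("ROW")

    # First pass: remember the ROW with the highest version per title.
    latest = {}
    for row in rows:
        title = row_title(row)
        if title not in latest or row_version(row) > row_version(latest[title]):
            latest[title] = row

    # Second pass: remove every ROW that lost the comparison.
    for row in rows:
        if latest[row_title(row)] is not row:
            root.remove(row)
    return root


if __name__ == "__main__":
    print(ET.tostring(keep_latest_rows(SAMPLE), encoding="unicode"))
```

Comparing versions as integer tuples avoids the string-comparison trap where "1,10" would sort before "1,9".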

Remove base64 value from string column in Postgres

I have a text column in a table that contains HTML data along with image represented in base64 encoding.
Here is an example:
</p><p><span lang="EN"> </span></p><p>
</p><p><img width="263" height="135" align="right" src="data:image/png;base64,/9j/4AAQSkZJRgABAQEAYABgAA...." alt=""></p>
The string after base64 is really long. I want to remove the long string representation and replace with the word "image".
I tried a pattern match on base64, removing everything after it up to the " mark before the alt keyword. It worked in cases where there is only one occurrence of a base64 value. When there are multiple occurrences, it fails.
Is there a better way to approach this problem in order to remove just the string representing image in base64 encoding?
To have the actual replacement work more than just once, you need to use the "global" flag for the regexp_replace, e.g.:
=# SELECT regexp_replace(E'\n\n...height="135" align="right" src="data:image/png;base64,/9j/4AAQSkZJRgABAQEAYABgAA...." alt="" ...\n<p></p>\n<p><img align="left" src="data:image/png;base64,/9j/4AAQSkZJRgABAQEAYABgAA...." class="test" alt=""/>\n', '(data:[^,]+,)[^"]+', '\1<data>', 'g');
regexp_replace
------------------------------------------------------------------------------
+
+
...height="135" align="right" src="data:image/png;base64,<data>" alt="" ... +
<p></p> +
<p><img align="left" src="data:image/png;base64,<data>" class="test" alt=""/>+
(1 row)
...so: regexp_replace(my_html_column, '(data:[^,]+,)[^"]+', '\1<data>', 'g')
That should match and replace all data URIs in the given text.
Maybe your problem is the greedy match, and the solution is to match anything but " characters (with the 'g' flag, so that every occurrence is replaced):
regexp_replace(col, 'base64,[^"]*', 'image', 'g')
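To try the pattern outside the database, the same regular expression can be exercised in Python. This is only an illustrative sketch: the HTML constant uses a shortened stand-in payload, the function name strip_payloads is hypothetical, and the replacement word "image" follows the question rather than the <data> marker used in the first answer. Note that Python's re.sub replaces every match by default, whereas PostgreSQL's regexp_replace needs the 'g' flag for that.

```python
import re

# Shortened stand-in for the real base64 payloads, which are much longer.
HTML = (
    '<p><img width="263" height="135" align="right" '
    'src="data:image/png;base64,/9j/4AAQSkZJRgABAQEAYABgAA" alt=""></p>\n'
    '<p><img align="left" src="data:image/png;base64,/9j/4AAQSkZJRgABAQEAYABgAA" '
    'class="test" alt=""/></p>'
)

# Group 1 keeps the "data:<mime>;base64," prefix; [^"]+ then consumes the
# payload up to, but not including, the closing quote of the attribute.
DATA_URI_PAYLOAD = re.compile(r'(data:[^,]+,)[^"]+')


def strip_payloads(html):
    """Replace every data-URI payload with the word "image"."""
    return DATA_URI_PAYLOAD.sub(r"\g<1>image", html)


if __name__ == "__main__":
    print(strip_payloads(HTML))
```

Because the character class stops at the first double quote, each match stays inside its own src attribute, which is what makes the multiple-occurrence case work.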

TransferSpreadsheet builds invalid XLSX file if user customized decimal separator

In Microsoft Access 2016 (build 16.0.8201.2200), the VBA TransferSpreadsheet method does not work properly when the number format in Windows 10 is customized. Specifically, on a computer with the US region selected, the problem appears if you swap the "decimal symbol" and the "digit grouping symbol" so that numbers are formatted as is customary in Germany.
When I use TransferSpreadsheet to save a query and then try to open the resulting workbook in Excel, it says:
We have found some problem in some content in '...'. Do you want us to try to recover as much as we can?
When I do, I get the following warning:
Excel was able to open the file by repairing or removing the unreadable content.
When I look at the contents of the XLSX file, I'm not surprised it's having a problem, because the internal XML is not valid. Because I've changed the decimal separator to "," in Windows, it's writing the numbers in the XML with decimal commas, not decimal points. But regardless of your regional preferences, numeric values in the sheet XML must use "." as the decimal symbol.
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
<dimension ref="A1:K20"/>
<sheetViews>...</sheetViews>
<sheetFormatPr defaultRowHeight="15"/>
<sheetData>
<row outlineLevel="0" r="1">...</row>
<row outlineLevel="0" r="2">
...
<c r="D2" s="0">
<v>2,9328903531E+16</v>
</c>
<c r="E2" s="0">
<v>5,404939826E+16</v>
</c>
<c r="F2" s="0">
<v>2,3923963705E+16</v>
</c>
...
</row>
...
</sheetData>
<pageMargins left="0.7" right="0.7" top="0.75" bottom="0.75" header="0.3" footer="0.3"/>
</worksheet>
While "," might be the desired decimal symbol in the UI, the internal XLSX format must still use "." as the decimal symbol.
How do I solve this?
Bottom line: for the TransferSpreadsheet method to work correctly, if you want to change the formatting of numbers, do not use the "Customize Format" setting.
You should instead reset those values back to their defaults, and then select an appropriate region in the preceding dialog box, one that formats numbers as you prefer.
Having chosen a region that is formatted as desired, you thereby avoid the TransferSpreadsheet bug. When you do this, the spreadsheet will appear correctly in Excel, and the XLSX will be formatted properly, too:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" mc:Ignorable="x14ac" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac">
<dimension ref="D3:F3"/>
<sheetViews>
<sheetView tabSelected="1" workbookViewId="0">
<selection activeCell="F12" sqref="F12"/>
</sheetView>
</sheetViews>
<sheetFormatPr defaultRowHeight="15" x14ac:dyDescent="0.25"/>
<cols>
<col min="4" max="6" width="26.85546875" style="1" bestFit="1" customWidth="1"/>
</cols>
<sheetData>
<row r="3" spans="4:6" x14ac:dyDescent="0.25">
<c r="D3" s="1">
<v>2.9328903531E+16</v>
</c>
<c r="E3" s="1">
<v>5.40493826E+16</v>
</c>
<c r="F3" s="1">
<v>2.3923963705E+16</v>
</c>
</row>
</sheetData>
<pageMargins left="0.7" right="0.7" top="0.75" bottom="0.75" header="0.3" footer="0.3"/>
</worksheet>
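To check whether an exported sheet is affected, the cell values can be scanned for numbers that do not use "." as the decimal symbol. The following Python sketch is a diagnostic aid only, not a fix and not part of the answer above; the function name malformed_numeric_cells and the SHEET sample are made up for illustration, and it assumes that cells without a t attribute (or with t="n") hold plain numbers.

```python
import re
import xml.etree.ElementTree as ET

NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# A number with "." as the decimal symbol and an optional exponent.
NUMBER = re.compile(r"[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?")

# Reduced sample: one cell with a decimal comma, one with a decimal point.
SHEET = """<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
<sheetData>
<row r="2">
<c r="D2" s="0"><v>2,9328903531E+16</v></c>
<c r="E2" s="0"><v>5.404939826E+16</v></c>
</row>
</sheetData>
</worksheet>"""


def malformed_numeric_cells(sheet_xml):
    """Return (cell reference, raw value) pairs for numeric cells that do not parse."""
    root = ET.fromstring(sheet_xml)
    bad = []
    for cell in root.iterfind(".//m:c", NS):
        # Assumption: only cells typed as numbers (the default) are checked.
        if cell.get("t", "n") != "n":
            continue
        value = cell.find("m:v", NS)
        if value is None or value.text is None:
            continue
        if not NUMBER.fullmatch(value.text.strip()):
            bad.append((cell.get("r"), value.text))
    return bad


if __name__ == "__main__":
    print(malformed_numeric_cells(SHEET))
```

For a real workbook, the sheet XML would first have to be read out of the XLSX archive, for example with the zipfile module; the member name inside the archive depends on the workbook.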

Replace all tags below a node

I need to change all tags below a node in a transformation.
Source XML looks like this:
<Address>
<s:name>name</s:name>
<s:lastName>last name</s:lastName>
<s:address1>Address Line 1</s:address1>
<s:address2>Address Line 2</s:address2>
Required output:
<Address>
<name>name</name>
<lastName>last name</lastName>
<address1>Address Line 1</address1>
<address2>Address Line 2</address2>
There are thousands of tags, so I cannot write a match for each one. Is there a way I can take the top-level node and handle all tags below it?
If you use match="/*//*" then you match all descendant elements of the root element. Then you can construct a new element using
<xsl:template match="/*//*">
  <xsl:element name="{local-name()}">
    <xsl:apply-templates select="@* | node()"/>
  </xsl:element>
</xsl:template>
Then add the identity transformation template to your code and you are done.
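The effect of that template (every descendant of the root is rebuilt under its local name, attributes and content are kept) can be illustrated outside XSLT as well. The following Python sketch is not the XSLT solution itself, just the same idea expressed with ElementTree; the function name strip_descendant_namespaces is hypothetical, and the namespace URI urn:example:s is an assumption, since the question does not show the declaration for the s: prefix.

```python
import xml.etree.ElementTree as ET

# Assumption: the s: prefix is bound to some namespace; the URI is a placeholder.
SOURCE = """<Address xmlns:s="urn:example:s">
<s:name>name</s:name>
<s:lastName>last name</s:lastName>
<s:address1>Address Line 1</s:address1>
<s:address2>Address Line 2</s:address2>
</Address>"""


def strip_descendant_namespaces(root):
    """Rename every descendant of root to its local name, like match="/*//*"."""
    for element in root.iter():
        if element is root:
            continue  # the root element itself is left untouched
        # ElementTree stores qualified names as "{namespace-uri}local-name".
        if isinstance(element.tag, str) and element.tag.startswith("{"):
            element.tag = element.tag.split("}", 1)[1]
    return root


if __name__ == "__main__":
    root = strip_descendant_namespaces(ET.fromstring(SOURCE))
    print(ET.tostring(root, encoding="unicode"))
```

As in the XSLT version, text and attributes of the renamed elements are preserved; only the element names change.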

How to not escape special chars when updating XML in oracle SQL

I have a problem trying to update xmlType values in oracle.
I need to modify the xml looking similar to the following:
<a>
<b>Something to change here</b>
<c>Here is some narrative containing weirdly escaped &lt;tags>\&lt;/tags> </c>
</a>
What I want to achieve is to modify <b/> without modifying <c/>
Unfortunately, the following updateXml call:
select
updatexml(XML_TO_MODIFY, '/a/b/text()', 'NewValue')
from dual;
returns this:
<a>
<b>NewValue</b>
<c>Here is some narrative containing weirdly escaped &lt;tags&gt;\&lt;/tags&gt; </c>
</a>
As you can see, the '>' has been escaped.
The same happens with xmlQuery (the new, non-deprecated replacement for updateXml):
select /*+ no_xml_query_rewrite */
xmlquery(
'copy $d := .
modify (
for $i in $d/a
return replace value of node $i/b with ''nana''
)
return $d'
passing t.xml_data
returning content
) as updated_doc
from (select xmlType('<a>
<b>Something to change here</b>
<c>Here is some narrative containing weirdly escaped \<tags>\</tags> </c>
</a>') as xml_data from dual) t
;
Also when using xmlTransform I will get the same result.
I tried to use the
disable-output-escaping="yes"
But it did the opposite: it unescaped the &lt; :
select XMLTransform(
xmlType('<a>
<b>Something to change here</b>
<c>Here is some narrative containing weirdly escaped \<tags>\</tags> </c>
</a>'),
XMLType(
'<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/a/b">
<b>
<xsl:value-of select="text()"/>
</b>
</xsl:template>
<xsl:template match="/a/c">
<c>
<xsl:value-of select="text()" disable-output-escaping="yes"/>
</c>
</xsl:template>
</xsl:stylesheet>'))
from dual;
returned:
<a>
<b>NewValue</b>
<c>Here is some narrative containing weirdly escaped <tags></tags> </c>
</a>
Any suggestions?
Two things you need to know:
I cannot modify the initial format: it comes to me this way and I need to preserve it.
The original message is so big that converting the message to a string and back (to use regexps as a workaround) will not do the trick.
The root of your issue seems to be that your original XML value for node C is not valid XML if it contains the > within the value instead of &gt;, and not inside a CDATA section (see also "What does <![CDATA[]]> in XML mean?").
The string value of:
Here is some narrative containing weirdly escaped <tags>\</tags>
in XML format should really be
<c>Here is some narrative containing weirdly escaped &lt;tags>\&lt;/tags></c>
OR
<c><![CDATA[Here is some narrative containing weirdly escaped <tags>\</tags>]]></c>
I would either request that the XML be corrected at the source, or implement some method to sanitize the inputs yourself, such as wrapping the <c> node values in <![CDATA[]]>. If you need to save the exact original value, and the messages are large, then the best I can think of is to store duplicate copies: the original value as a string, and the "sanitized" value as an XML data type.
In the end we managed to do this with the help of Java, by:
reading the XML as a CLOB
modifying it in Java
storing it back in the database using java.sql.Connection
For some reason, when we used JdbcTemplate, it complained about casting to Long, which indicated that the string was over 4000 bytes (talking about clean errors, all hail Oracle), and using the CLOB type didn't really help. I guess that's a different story, though.
When storing the data, Oracle does not perform any magic; only updates tend to modify escape characters.
Possibly not an answer for everyone, but a nice workaround if you stumble upon the same problem as we did.
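The idea behind this workaround is to edit the document as text, so that no XML serializer gets a chance to re-escape the untouched parts. The answer used Java; the following is only a Python sketch of the same idea under stated assumptions: the function name replace_b_text is hypothetical, the RAW sample writes the <c> content as &lt;tags>\&lt;/tags>, the document is assumed to contain a single <b> element without nested markup, and reading from and writing to the database is left out.

```python
import re
from xml.sax.saxutils import escape

# Sample document; <c> holds an escaped "<" but an unescaped ">".
RAW = (
    "<a>\n"
    "<b>Something to change here</b>\n"
    "<c>Here is some narrative containing weirdly escaped &lt;tags>\\&lt;/tags> </c>\n"
    "</a>"
)


def replace_b_text(raw_xml, new_value):
    """Replace the content of the first <b> element, leaving all other text as-is."""
    # Escape the new value so the edited document stays well-formed.
    replacement = "<b>" + escape(new_value) + "</b>"
    # A function replacement avoids backslash processing of the new text.
    return re.sub(
        r"<b>.*?</b>",
        lambda _match: replacement,
        raw_xml,
        count=1,
        flags=re.DOTALL,
    )


if __name__ == "__main__":
    print(replace_b_text(RAW, "NewValue"))
```

Since the rest of the string is never parsed or re-serialized, the content of <c> comes out exactly as it went in.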