XSLT concat, translate to uppercase - xslt-1.0

I am trying to concatenate two fields and translate the result to upper case in XSLT with the code below:
<xsl:value-of select="concat(translate(E_Firstname, ',', E_Lastname, $uppercase)"/>
However, it just appends ABCDEFGHIJKLMNOPQRSTUVWXYZ to the field I am trying to transform into all caps. Can someone tell me what this syntax should look like?

I am guessing (!) you want:
translate(concat(E_Firstname, ',', E_Lastname), $lowercase, $uppercase)
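For readers who want to check the order of operations outside an XSLT processor, here is a minimal Python sketch that approximates XPath 1.0 translate(). It assumes $lowercase and $uppercase hold the 26 ASCII letters; the helper name xpath_translate and the sample names are hypothetical and used only for illustration.

```python
import string

# Assumption: these stand in for the $lowercase and $uppercase variables.
LOWERCASE = string.ascii_lowercase
UPPERCASE = string.ascii_uppercase


def xpath_translate(value: str, from_chars: str, to_chars: str) -> str:
    """Approximate XPath 1.0 translate(): map each character of from_chars
    to the character at the same position in to_chars, and delete it when
    to_chars is too short to have a counterpart."""
    table = {}
    for index, char in enumerate(from_chars):
        if ord(char) in table:
            continue  # the first occurrence in from_chars wins
        table[ord(char)] = to_chars[index] if index < len(to_chars) else None
    return value.translate(table)


# Concatenate first, then translate the whole string (sample data is made up).
first_name, last_name = "John", "Smith"
combined = xpath_translate(first_name + "," + last_name, LOWERCASE, UPPERCASE)
print(combined)
```

The point of the sketch is the nesting: the concatenation is the first argument of translate(), and the two alphabets are its second and third arguments.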


How to remove a specific part of a string in Postgres SQL?

Say I have a column in postgres database called pid that looks like this:
set1/2019-10-17/ASR/20190416832-ASR.pdf
set1/2019-03-15/DEED/20190087121-DEED.pdf
set1/2021-06-22/DT/20210376486-DT.pdf
I want to remove everything after the last dash "-" including the dash itself. So expected results:
set1/2019-10-17/ASR/20190416832.pdf
set1/2019-03-15/DEED/20190087121.pdf
set1/2021-06-22/DT/20210376486.pdf
I've looked into replace() and split_part() functions but still can't figure out how
to do this. Please advise.
We can use a regex replacement here:
SELECT pid, REGEXP_REPLACE(pid, '^(.*)-[^-]+(\.\w+)$', '\1\2') AS pid_out
FROM yourTable;
The regex used above captures the column value before the last dash in \1, and the extension in \2. It then builds the output using \1\2 as the replacement.
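The same pattern can be tried outside the database. The sketch below uses Python's re module, whose syntax agrees with the Postgres pattern for this expression as far as I know; the function name strip_last_dash_suffix is hypothetical.

```python
import re

# Group 1: everything before the last dash. Group 2: the file extension.
LAST_DASH_SUFFIX = re.compile(r'^(.*)-[^-]+(\.\w+)$')


def strip_last_dash_suffix(path: str) -> str:
    """Drop the last '-' and the text after it, but keep the extension."""
    return LAST_DASH_SUFFIX.sub(r'\1\2', path)


samples = [
    "set1/2019-10-17/ASR/20190416832-ASR.pdf",
    "set1/2019-03-15/DEED/20190087121-DEED.pdf",
    "set1/2021-06-22/DT/20210376486-DT.pdf",
]
for sample in samples:
    print(strip_last_dash_suffix(sample))
```

Because .* is greedy, the dashes inside the date are left alone and only the final dash-delimited suffix is removed. A value without any dash does not match and is returned unchanged.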

How to efficiently replace special characters in an XML in Oracle SQL?

I'm parsing an xml in oracle sql.
XMLType(replace(column1,'&','<![CDATA[&]]>')) //column1 is a column name that has xml data
While parsing, I'm temporarily wrapping '&' in CDATA to prevent any xml exception. After getting rid of the exception caused by '&', I'm getting "invalid character 32 (' ') found in a Name or Nmtoken". This is because of '<' character.
E.g: <child> 40 < 50 </child> // This causes the above exception.
So I tried the below and it works.
XMLType(replace(replace(column1,'&','<![CDATA[&]]>'),'< ','<![CDATA[< ]]>'))
In the above, I'm wrapping '< '(less than symbol followed by space) in CDATA. But the above is a bit time consuming. So I'm trying to use regex to reduce the time taken.
Does anyone know how to implement the above using a regex in Oracle SQL?
Input : <child> 40 & < 50 </child>
Expected Output : <child> 40 <![CDATA[&]]> <![CDATA[< ]]> 50 </child>
Note: Replacing '&' with '&amp;' sometimes leads to an 'entity reference not well formed' exception. Hence I have opted to wrap it in CDATA.
You can do that with a regexp like this:
select regexp_replace(sr.column1,'(&|< )','<![CDATA[\1]]>') from your_table sr; -- your_table is a placeholder for the table that holds column1
However, regexp_replace (and all the regexp_* functions) are often slower than using plain replace, because they do more complicated logic. So I'm not sure if it'll be faster or not.
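To see what the single-pass replacement produces, here is a small Python sketch of the same pattern. It is only an illustration of the regex, not of Oracle's performance, and the function name wrap_specials_in_cdata is hypothetical.

```python
import re
import xml.etree.ElementTree as ET


def wrap_specials_in_cdata(text: str) -> str:
    """Wrap every '&' and every '< ' (less-than followed by a space) in CDATA."""
    return re.sub(r'(&|< )', r'<![CDATA[\1]]>', text)


broken = "<child> 40 & < 50 </child>"
repaired = wrap_specials_in_cdata(broken)
print(repaired)

# The repaired text parses, and the element text is the original content.
print(repr(ET.fromstring(repaired).text))
```

Note that the space after '<' ends up inside the CDATA section, and that real tags such as </child> are untouched because their '<' is not followed by a space.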
You might already be aware, but your underlying problem here is that you're starting out with invalid XML that you're trying to fix, which is a hard problem! The ideal solution is to not have invalid XML in the first place - if possible, you should escape special characters when originally generating your XML. There are built-in functions which can do that quickly, like DBMS_XMLGEN.CONVERT or HTF.ESCAPE_SC.
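As an illustration of escaping at generation time, the following Python sketch uses the standard library's xml.sax.saxutils.escape in the role that DBMS_XMLGEN.CONVERT or HTF.ESCAPE_SC would play in Oracle. The sample value is taken from the question; everything else is an assumption made for the example.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

raw_value = " 40 & < 50 "

# Escape the text before embedding it in the markup.
document = "<child>" + escape(raw_value) + "</child>"
print(document)

# The document is well formed, and parsing returns the original text.
parsed = ET.fromstring(document)
print(repr(parsed.text))
```

When the value is escaped before it is embedded, no repair step is needed on the reading side.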

How to remove semicolon in a string in HIVE SQL

I am trying to remove ";" semi-colon from a string.
What command in HIVE SQL should I use? I know regexp_replace may work, but what pattern should I use?
It appears that the special character ; does not work, but other special characters like , or : do.
For example,
Data looks like
;;;;;0123445
I want the data to look like this
0123445
Any help on this will be appreciated. I have been struggling with this.
REGEXP_REPLACE indeed looks like a good pick. For example, this removes all semicolons from the field:
REGEXP_REPLACE(my_column, ';', '')
From the documentation:
Returns the string resulting from replacing all substrings in INITIAL_STRING that match the java regular expression syntax defined in PATTERN with instances of REPLACEMENT.
Please note that the semicolon has no special meaning in the regexp language.
If you want to match semicolons at the beginning of the string only (as shown in your question), use the regexp special character ^, which indicates the beginning of the string, together with + so that the whole run of leading semicolons is matched:
REGEXP_REPLACE(my_column, '^;+', '')
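The difference between the patterns can be checked quickly with Python's re module, which behaves like Java regular expressions for these simple cases. This is only a sketch of the regex behavior, not of Hive itself.

```python
import re

value = ";;;;;0123445"

# Remove every semicolon, wherever it appears.
all_removed = re.sub(r';', '', value)

# '^;' matches a single semicolon at the start, so only one is removed.
one_leading_removed = re.sub(r'^;', '', value)

# '^;+' matches the whole run of leading semicolons.
all_leading_removed = re.sub(r'^;+', '', value)

print(all_removed, one_leading_removed, all_leading_removed)
```

The anchored pattern with + is the one to use when semicolons later in the string must be preserved.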
To remove all semicolons, you can simply use replace():
replace(my_column, ';', '')
To remove leading semicolons, you need a regular expression, which replace() does not support, so use regexp_replace():
regexp_replace(my_column, '^;+', '')
In Hive you need to escape the semi-colon.
regexp_replace(column_name,'\;','')

XSLT 1 - How can I replace '&' with '&amp;' in an XML file?

I need to find a solution using XSLT 1! Most of the XML files we are sent are well formed, but someone makes a mess by adding unescaped characters (& < > ...). Is there any way to replace these on my side? I tried XSLT 2, but the replace() function does not work because I use the XSLT processor from Microsoft.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:saxon="http://saxon.sf.net/"
exclude-result-prefixes="xs saxon"
version="2.0">
<xsl:param name="path" select="'file:///E:/foo.xml'"></xsl:param>
<xsl:template match="/">
<xsl:copy-of select="unparsed-text($path)"></xsl:copy-of>
<xsl:copy-of select="saxon:parse(replace(unparsed-text($path), '&amp;', '&amp;amp;'))"/>
</xsl:template>
</xsl:stylesheet>
Is there any other way to solve this issue? For example, I have an input XML file like:
<?xml version="1.0" encoding="UTF-8"?>
<name>Stack & Exchange</name>
It fails on the '&' character.
Please advise!
Thank you
Conclusion
Two observations:
XSLT requires at least a well-formed XML document as input, so I can't use it to correct invalid XML (it is an XML transformation language).
To use replace() or otherwise escape invalid characters in the XML input, I would need an XSLT 2.0 processor (I use the Microsoft XSLT 1.0 processor).
I see two options:
If I receive an error on input, investigate and validate manually and send back the error message. THIS IS WHAT I TRIED TO AVOID! (Use text tools like Notepad++ or Excel to find the issue.)
Write a correcting parser in a .NET language to fix the file before loading it as XML.
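The second option can be sketched as a text-level pre-processing step. The example below is in Python rather than .NET, purely for illustration; it only handles bare ampersands (stray '<' and '>' would need more care), and the function name escape_bare_ampersands is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# An '&' that does not start a predefined entity or a character reference.
BARE_AMPERSAND = re.compile(
    r'&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);)'
)


def escape_bare_ampersands(text: str) -> str:
    """Escape stray ampersands while leaving existing entities untouched."""
    return BARE_AMPERSAND.sub('&amp;', text)


broken = "<name>Stack & Exchange</name>"
fixed = escape_bare_ampersands(broken)
print(fixed)

# The repaired text can now be loaded by an XML parser.
print(ET.fromstring(fixed).text)
```

The negative lookahead is what keeps the step safe to run on files that are already well formed: an existing &amp; is not escaped a second time.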

How to not escape special chars when updating XML in oracle SQL

I have a problem trying to update xmlType values in oracle.
I need to modify the xml looking similar to the following:
<a>
<b>Something to change here</b>
<c>Here is some narrative containing weirdly escaped &lt;tags>\&lt;/tags> </c>
</a>
What I want to achieve is to modify <b/> without modifying <c/>
Unfortunately, the following updateXml call:
select
updatexml(XML_TO_MODIFY, '/a/b/text()', 'NewValue')
from dual;
returns this:
<a>
<b>NewValue</b>
<c>Here is some narrative containing weirdly escaped &lt;tags&gt;\&lt;/tags&gt; </c>
</a>
as you can see, the '>' had been escaped.
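This kind of normalization is not specific to Oracle. As a rough illustration, Python's ElementTree does the same thing when a document whose text contains an escaped '<' and a raw '>' is parsed and then serialized again. The sketch shows the general parse-and-serialize effect only and makes no claim about Oracle's internals.

```python
import xml.etree.ElementTree as ET

# '<' is escaped as &lt; but '>' is left raw, as in the question.
original = (
    "<a><b>Something to change here</b>"
    "<c>weirdly escaped &lt;tags>\\&lt;/tags> </c></a>"
)

root = ET.fromstring(original)
root.find("b").text = "NewValue"

# Serializing re-escapes the text, so the raw '>' comes back as &gt;.
serialized = ET.tostring(root, encoding="unicode")
print(serialized)
```

Once the document has been parsed, the original spelling of the escapes is gone; the serializer chooses its own.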
Same happens for xmlQuery (the new non-deprecated version of updateXml):
select /*+ no_xml_query_rewrite */
xmlquery(
'copy $d := .
modify (
for $i in $d/a
return replace value of node $i/b with ''nana''
)
return $d'
passing t.xml_data
returning content
) as updated_doc
from (select xmlType('<a>
<b>Something to change here</b>
<c>Here is some narrative containing weirdly escaped \&lt;tags>\&lt;/tags> </c>
</a>') as xml_data from dual) t
;
Also when using xmlTransform I will get the same result.
I tried to use the
disable-output-escaping="yes"
But it did the opposite - it unescaped the < :
select XMLTransform(
xmlType('<a>
<b>Something to change here</b>
<c>Here is some narrative containing weirdly escaped \&lt;tags>\&lt;/tags> </c>
</a>'),
XMLType(
'<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/a/b">
<b>
<xsl:value-of select="text()"/>
</b>
</xsl:template>
<xsl:template match="/a/c">
<c>
<xsl:value-of select="text()" disable-output-escaping="yes"/>
</c>
</xsl:template>
</xsl:stylesheet>'))
from dual;
returned:
<a>
<b>NewValue</b>
<c>Here is some narrative containing weirdly escaped <tags></tags> </c>
</a>
Any suggestions?
Two things you need to know:
I cannot modify the initial format - it comes to me in this way and I need to preserve it.
The original message is so big that changing the message to a string and back (to use regexps as a workaround) will not do the trick.
The root of your issue seems to be that your original XML value for node <c> is not valid XML if it contains the > within the value instead of &gt;, and not inside a CDATA section (see also: What does <![CDATA[]]> in XML mean?).
The string value of:
Here is some narrative containing weirdly escaped <tags>\</tags>
in XML format should really be
<c>Here is some narrative containing weirdly escaped &lt;tags>\&lt;/tags></c>
OR
<c><![CDATA[Here is some narrative containing weirdly escaped <tags>\</tags>]]></c>
I would either request that the XML be corrected at the source, or implement some method to sanitize the inputs yourself, such as wrapping the <c> node values in <![CDATA[]]>. If you need to save the exact original value, and the messages are large, then the best I can think of is to store duplicate copies: the original value as a string, and the "sanitized" value as an XML data type.
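To make the equivalence of the two spellings concrete, here is a short Python sketch that parses both forms and compares the resulting string values. It uses the standard ElementTree parser and the sample text from this answer.

```python
import xml.etree.ElementTree as ET

escaped_form = (
    "<c>Here is some narrative containing weirdly escaped "
    "&lt;tags>\\&lt;/tags></c>"
)
cdata_form = (
    "<c><![CDATA[Here is some narrative containing weirdly escaped "
    "<tags>\\</tags>]]></c>"
)

# Both spellings carry the same string value once parsed.
escaped_text = ET.fromstring(escaped_form).text
cdata_text = ET.fromstring(cdata_form).text
print(escaped_text == cdata_text)
```

To an XML parser the two documents are the same content, which is also why a tool that parses and re-serializes is free to pick either spelling.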
In the end we managed to do this with the help of Java, by:
reading the XML as a CLOB
modifying it in Java
storing it back in the database using java.sql.Connection (for some reason, when we used JdbcTemplate, it complained about casting to Long, which indicated that the string was over 4000 bytes (talking about clean errors, all hail Oracle), and using the CLOB type didn't really help; I guess that's a different story, though)
When storing the data, Oracle does not perform any magic; only updates tend to modify escape characters.
Possibly not an answer for everyone, but a nice workaround if you stumble upon same problem as we did.
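The answer does not show how the text was modified, so the following is only a guess at the general idea, written in Python rather than Java: treat the document as plain text and replace just the content of <b>, so that the characters of <c> are never re-serialized. It assumes a single <b> element without attributes and a new value that is already XML-escaped; the function name replace_b_text is hypothetical.

```python
import re

B_ELEMENT = re.compile(r'(<b>).*?(</b>)', re.DOTALL)


def replace_b_text(xml_text: str, new_value: str) -> str:
    """Replace the content of the first <b>...</b> without parsing the XML.

    Assumption: new_value is already escaped for use as XML text.
    """
    return B_ELEMENT.sub(
        lambda match: match.group(1) + new_value + match.group(2),
        xml_text,
        count=1,
    )


original = (
    "<a>\n"
    "<b>Something to change here</b>\n"
    "<c>Here is some narrative containing weirdly escaped "
    "&lt;tags>\\&lt;/tags> </c>\n"
    "</a>"
)
updated = replace_b_text(original, "NewValue")
print(updated)
```

A callable is used as the replacement so that backslashes in the new value are not interpreted as regex escapes. Because nothing is parsed, the escaping inside <c> stays exactly as it arrived.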