Oracle error due to non ascii characters in xmlelement method - sql

I’m getting xml parsing failed error on this select query. ‘Colname’ contains non-ascii values. How to make this query not to fail due to the non-ascii characters.
SELECT RTRIM(XMLAGG(XMLELEMENT(E,colname,',').EXTRACT('//text()') ORDER BY colname).GetClobVal(),',')
FROM tablename;

I think you should use replace for special characters. Like: replace(column_name,'<','&lt'). The replace should be applied for each special character.
For a list of special chars, see: https://docs.oracle.com/cd/E14004_01/books/EAI5/EAI5_Overview5.html

Related

How to store hyphen/dash in Oracle Varchar2

I am trying to store some text with a hyphen aka dash (-) in Oracle 12c Varchar2 field.
But when I go to do a Select on the table value, the hyphen/dash character results in a funny looking symbol. I have tried escaping before using the dash (-) but that still produced the funny looking symbol.
How do i store hypens/dashes properly in Oracle?
Thank you
Putting as answer as for comment it would be too long.
First you have to establish the problem is with inserting dash or while fetching it. To verify, run this on the column
select * from table where column like '%-%';
If you get output, that means it is stored properly. So the problem is with displaying it.
If you don't get ouput, that means you are not inserting it properly. In that case show your insert statement. You just have to treat dash as any other string character.

Replace special character apostrophe with normal apostrophe

I have a field which is getting data that contains a special character type of apostrophe outside of the normal oracle ascii range of 0-127. I am trying to do a replace function on this but it keeps being switched to a ? in the DDL. Looking for another way to do the replace
This works in a query but switches when put in the DDL for a view
regexp_replace(field_name,'’',chr(39))
switches to
regexp_replace(field_name,'?',chr(39))
A dump function shows that oracle is storing the apostrophe as three characters of ascii 226,128,153. I tried to write the replace on a concatenation of those but that didn't work either.
First, examine the original data that contains the weird apostrophe. I'm not convinced that it is indeed three characters. Use this:
select value
, substr(value, 5, 1) one_character
, ascii(substr(value, 5, 1)) ascii_value
from table;
This would isolate the 5th character from a column value and its ascii value. Adjust the 5 to the place where the weird apostrophe is located.
When you have the ascii value, use plain replace like this to get rid of it (regexp_replace seems overkill):
replace(value, chr(ascii_value_of_weird_apostrophe), chr(39));

SQL Server regular expression select and update

I have a column that I need to clean the data up on.
First I'd like to do a select to get a record of the bad data then I've like to run a replace on the invalid charters.
I'm looking to select anything that contains non alphanumeric characters but ignores the slash "\" as the second character and also ignores underscores and dashes in the rest of the string. Here's a couple of example of the data I'm expecting to get back from this query.
#\AAA
A\Adam's
A\Amanda.Smith
B\Bear's-ltd
C\Couple & More
After this I'd like to run a replace on any of these invalid characters and replace them with underscores so the result would look like this:
_\AAA
A\Adam_s
A\Amanda_Smith
B\Bear_s-ltd
C\Couple_More
I do not think there is native support for that. You can create a CLR to support regex, ex: https://www.simple-talk.com/sql/t-sql-programming/clr-assembly-regex-functions-for-sql-server-by-example/

Store string with special characters like quotes or backslash in postgresql table

I have a string with value
'MAX DATE QUERY: SELECT iso_timestamp(MAX(time_stamp)) AS MAXTIME FROM observation WHERE offering_id = 'HOBART''
But on inserting into postgresql table i am getting error:
org.postgresql.util.PSQLException: ERROR: syntax error at or near "HOBART".
This is probably because my string contains single quotes. I don't know my string value. Every time it keeps changing and may contain special characters like \ or something since I am reading from a file and saving into postgres database.
Please give a general solution to escape such characters.
As per the SQL standard, quotes are delimited by doubling them, ie:
insert into table (column) values ('I''m OK')
If you replace every single quote in your text with two single quotes, it will work.
Normally, a backslash escapes the following character, but literal backslashes are similarly escaped by using two backslashes"
insert into table (column) values ('Look in C:\\Temp')
You can use double dollar quotation to escape the special characters in your string.
The above query as mentioned insert into table (column) values ('I'm OK')
changes to insert into table (column) values ($$I'm OK$$).
To make the identifier unique so that it doesn't mix with the values, you can add any characters between 2 dollars such as
insert into table (column) values ($aesc6$I'm OK$aesc6$).
here $aesc6$ is the unique string identifier so that even if $$ is part of the value, it will be treated as a value and not a identifier.
You appear to be using Java and JDBC. Please read the JDBC tutorial, which describes how to use paramaterized queries to safely insert data without risking SQL injection problems.
Please read the prepared statements section of the JDBC tutorial and these simple examples in various languages including Java.
Since you're having issues with backslashes, not just 'single quotes', I'd say you're running PostgreSQL 9.0 or older, which default to standard_conforming_strings = off. In newer versions backslashes are only special if you use the PostgreSQL extension E'escape strings'. (This is why you always include your PostgreSQL version in questions).
You might also want to examine:
Why you should use prepared statements.
The PostgreSQL documentation on the lexical structure of SQL queries.
While it is possible to explicitly quote values, doing so is error-prone, slow and inefficient. You should use parameterized queries (prepared statements) to safely insert data.
In future, please include a code snippet that you're having a problem with and details of the language you're using, the PostgreSQL version, etc.
If you really must manually escape strings, you'll need to make sure that standard_conforming_strings is on and double quotes, eg don''t manually escape text; or use PostgreSQL-specific E'escape strings where you \'backslash escape\' quotes'. But really, use prepared statements, it's way easier.
Some possible approaches are:
use prepared statements
convert all special characters to their equivalent html entities.
use base64 encoding while storing the string, and base64 decoding while reading the string from the db table.
Approach 1 (prepared statements) can be combined with approaches 2 and 3.
Approach 3 (base64 encoding) converts all characters to hexadecimal characters without loosing any info. But you may not be able to do full-text search using this approach.
Literals in SQLServer start with N like this:
update table set stringField = N'/;l;sldl;'''mess'

How do I escape an enclosure character in a SQL Loader data file?

I have a SQL*Loader control file that has a line something like this:
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '#'
Normally, I'd use a quotation mark, but that seems to destroy emacs's python syntax highlighting if used inside a multi-line string. The problem is that we are loading an ADDRESS_LINE_2 column where only 7,000 out of a million records are loading because they have lines like this:
...(other columns),Apt #2,(other columns)...
Which is of course causing errors. Is there any way to escape the enclosing character so this doesn't happen? Or do I just need to choose a better enclosing character?
I've looked through the documentation, but don't seem to have found an answer to this.
I found it...
If two delimiter characters are encountered next to each other, a single occurrence of the delimiter character is used in the data value. For example, 'DON''T' is stored as DON'T. However, if the field consists of just two delimiter characters, its value is null.
Field List Reference
Unfortunately, SqlLoader computes both occurrences of the delimiter while checking for max length of the field. For instance, DON''T will be rejected in a CHAR(5) field, with ORA-12899: value too large for column blah.blah2 (actual: 6, maximum: 5).
At least in my 11gR2 . Haven't tried in other versions....