Teradata SQL - Replacing special characters

I'm using Report Builder 3.0 for my reports. My report runs; however, if a user exports the results to Excel (xlsx) instead of Excel 2003 (xls), they get an "illegal xml character" message when the file is opened.
Four of the columns contain "&" and/or "'", so I'm trying to replace these special characters, which I believe are causing the issue.
I've tried to update this line:
j.journal_desc AS "Jrnl Description",
with this line:
oreplace(oreplace(j.journal_desc, '&', 'and'),'''','') AS "Jrnl Description",
and it works fine. However, when I do this on a second line I get the message: "SELECT Failed. [9804] Response Row size or Constant Row size overflow".
I've tried "otranslate" and it works on 2 columns. However, when I try it on the 3rd column, I get the same overflow message.
Is it possible to use oreplace or otranslate on multiple columns? Am I doing something wrong? Is there a better way to replace these special characters?
Thanks for the help......

When OREPLACE or OTRANSLATE is used, the result string is defined as 8000 characters in the Unicode character set, so each additional call adds another 8000-character result to the row and the row size eventually overflows. Casting each result to a smaller length should fix the problem:
CAST(oreplace(journal_desc,'&','and') AS VARCHAR(100))
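A minimal sketch extending the same fix to several columns at once; the journal table name and the journal_source/journal_ref columns are hypothetical stand-ins for the other affected columns, and the VARCHAR lengths are assumptions that should match the real column sizes:
-- Cast each OREPLACE result down so the combined row size stays under the limit
SELECT
    CAST(oreplace(oreplace(j.journal_desc,   '&', 'and'), '''', '') AS VARCHAR(100)) AS "Jrnl Description",
    CAST(oreplace(oreplace(j.journal_source, '&', 'and'), '''', '') AS VARCHAR(50))  AS "Jrnl Source",
    CAST(oreplace(oreplace(j.journal_ref,    '&', 'and'), '''', '') AS VARCHAR(50))  AS "Jrnl Reference"
FROM journal j;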

Related

Replace string output from query in results tab

I have a DB with a large number of CLOB columns. Each of these columns contains a repeated set of characters, which are fillers designed to designate a paragraph break (I didn't design the tables).
Is there a way to write the script so that each time it finds these characters it inserts a paragraph or line break into the results that are returned, while still staying in the same row of the results?
The data would look like
"Hello all,XYZ!£$I can't get ... to work.XYZ!£$The error meassge says ..."
As an example:
SELECT *
FROM ALERTS
REPLACE(Alert_Text, 'XYZ!£$', CH(13))
(Obviously the above returns errors)
The ideal output of the query would be:
"Hello all,
I can't get ... to work.
The error message says ..."
I am using SqlDbx to connect to an Oracle DB.
The obvious error is ORA-00904: "CH": invalid identifier
The reason is that the function name is CHR:
select REPLACE(txt, 'XYZ!£$', CHR(13)) from tab;
REPLACE(TXT,'XYZ!£$',CHR(13))
--------------------------------------------------------------------------------
Hello all,
I can't get ... to work.
The error message says ...
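If the client displaying the results expects Windows-style line endings, a CRLF variant may render better (an assumption about the environment; CHR(13) || CHR(10) is the carriage-return/line-feed pair):
select REPLACE(txt, 'XYZ!£$', CHR(13) || CHR(10)) from tab;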

HeidiSQL - Importing CSV; Incorrect string value

I am a beginner at database management. Currently, I am trying to use HeidiSQL to import a CSV file from MS Access. I was able to import 5 of 6 tables properly. On the last one, I am constantly getting
Incorrect string value: '\xE1...' for column 'Desc' at row 1248856
For numerous rows. From my research, I've tried numerous permutations: changing the data type of "Desc" to TEXT or LONGBLOB, changing the default collation to utf8mb4_unicode_ci, and changing the encoding to UTF-8 Unicode (utf8mb4).
But nothing has worked thus far. Can someone tell me how to correct this?
I figured it out by changing the encoding to "Western European" and removing the backslashes that were escaping the " characters surrounding my fields.
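For reference, a rough SQL-level equivalent of those import settings (HeidiSQL's CSV import roughly corresponds to a MySQL LOAD DATA statement); this assumes the data is Latin-1 ("Western European") encoded and the fields are wrapped in plain double quotes with no backslash escaping, and the file name, table, and layout are hypothetical:
-- Hypothetical example: load a Latin-1 CSV without treating backslashes as escape characters
LOAD DATA LOCAL INFILE 'access_export.csv'
INTO TABLE my_table
CHARACTER SET latin1
FIELDS TERMINATED BY ',' ENCLOSED BY '"' ESCAPED BY ''
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES;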

Informix 11.5 SQL Select Carriage Return and Line Feed

Informix 11.5
I am trying to search for carriage returns and line feeds that may exist in a VARCHAR field. First, I need a SELECT statement to show that they exist. Second, I need to REPLACE them with a space or other character.
I've tried all kinds of variations:
CHR(10) + CHR(13)
CHR(10) || CHR(13)
CHAR(13) + CHAR(10)
CHAR(13) || CHAR(10)
SELECT CHR(10) from systables;
Everything gives an error: Routine (chr) can not be resolved.
I've been searching all over and just can't find anything that works, and I'm sure this is crazy stupid easy.
Get the ASCII package from the IIUG
The CHR() function was added to IDS 11.70; it isn't in IDS 11.50.
The good news is you can add the function because IDS is an extensible server. The better news for you is that you can obtain the relevant code from the IIUG web site in the Software Archive under the Miscellaneous section as ascii.
That should allow you to do what you need. (Note: I wrote the code way back when — before there was support built into any of the servers.)
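Once the package is installed, a sketch of the kind of query needed here, assuming the package provides chr(); the table and column names (my_table, id, notes) are hypothetical:
-- Find rows containing a carriage return or line feed
SELECT id, notes
FROM my_table
WHERE notes LIKE '%' || chr(13) || '%'
   OR notes LIKE '%' || chr(10) || '%';
-- Replace CR and LF with spaces using the built-in REPLACE function
UPDATE my_table
SET notes = REPLACE(REPLACE(notes, chr(13), ' '), chr(10), ' ')
WHERE notes LIKE '%' || chr(13) || '%'
   OR notes LIKE '%' || chr(10) || '%';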
Windows makes things more complicated
I was uploading the ascii.unl file and I get an error that the number of columns does not match on line 13. Have you seen this before? I'm on Windows 2008. The errors are:
846: Number of values in load file is not equal to number of columns.
847: Error in load file line 13.
I hadn't seen it before, but I've not tried the file on Windows and … well, let's say life gets trickier on Windows than it is on Unix (and this bit isn't all that simple on Unix).
First of all, the data file needs to have CRLF line endings instead of the NL-only line endings that are standard on Unix. (Note that NL, newline, is another name for LF, line feed — aka '\n'.) For most lines in the unload file, that isn't a problem.
The two entries for which it might be (is) a problem are for CR and LF — entries 13 and 10 respectively. In theory, if the entry for line 10 contains (in C string notation) "10|\\\n\r\n" (that is, 10, pipe, backslash, newline, CRLF), all should be OK; the absence of an error message for line 10 suggests that it is OK.
Similarly, the entry for line 13 is "13|\r\r\n", which apparently causes grief. The simplest trial fix is to add a backslash here too: "13|\\\r\r\n". The backslash says "the next character doesn't have a special meaning". If that doesn't work, we'll probably have to try hex-escape notation: "13|\\0d\r\n", and use dbaccess -X to enable the hex escape notation.
With luck, one of those two (or both) will work. If neither works, come back and we'll try to think of something else.
As per my comment above (errors 846 and 847 on line 13 when loading the ascii.unl file on Windows 2008), here is what I see in the ascii.unl file when I open it in MS Word with Show Formatting/Paragraph marks turned on:

Pentaho Spoon - Validate Fixed Width Input File Format

I'm trying to process a fixed-width input file in Pentaho and validate the format. The file will be a mixture of strings, numbers and dates. However, when attempting to process a number field that has an incorrect character present (which I had expected would throw an error), it just reads the first part of the number and ignores the bad character.
I can recreate this issue with a very simple input file containing a single field:
I specify the expected number format, along with start position and length:
On running the transformation I would have expected the 'Q' to cause an error; instead the following result is displayed, just reading the first two digits "67" and padding the rest to match the specified format:
If the input file is formatted correctly it runs perfectly well, but I need it to throw an error otherwise. Any suggestions would be awesome. Thanks!
Just an FYI in case someone stumbles across this question after hitting the same issue as I did.
I was able to construct a workaround by reading all values in the "Text File Input" step as strings, and then using a "Data Validator" step equipped with regex evaluation to ensure numbers were correctly formatted before parsing to number type with a following "Select Values" step.
It takes a bit longer to do this for every field, but it was the most robust solution I could come up with.
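For reference, a minimal example of the kind of pattern one might put in the Data Validator step's regular-expression check for a plain integer field (an assumption; the exact pattern depends on each field's format, and the step evaluates it as a Java regular expression):
^-?[0-9]+$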
Thanks

Replace character in SQL results

This is from an Oracle SQL query. It has these weird skinny rectangle shapes in the database in places where apostrophes should be. (I wish we could paste screenshots in here.)
It looks like this when I copy and paste the results:
spouse�s
Is there a way to write a SQL SELECT statement that searches for this character in the field and replaces it with an apostrophe in the results?
Edit: I need to change only the results in a SELECT statement for reporting purposes, I can't change the Database.
I ran this
select dump('�') from dual;
which returned
Typ=96 Len=3: 239,191,189
This seems to work so far
select translate('What is your spouse�s first name?', '�', '''') from dual;
but this doesn't work
select translate(Fieldname, '�', '''') from TableName
Select FN from TN
What is your spouse�s first name?
SELECT DUMP(FN, 1016) from TN
Typ=1 Len=33 CharacterSet=US7ASCII: 57,68,61,74,20,69,73,20,79,6f,75,72,20,73,70,6f,75,73,65,92,73,20,66,69,72,73,74,20,6e,61,6d,65,3f
EDIT:
So I have established that it is the backquote character. I can't get the DB updated, so I'm trying this code:
SELECT REGEX_REPLACE(FN,"\0092","\0027") FROM TN
and I"m getting ORA-00904:"Regex_Replace":invalid identifier
This seems to be a problem with your charset configuration. Check your NLS_LANG and other NLS_xxx environment/registry values. You have to check the Oracle server, your client, and the client used to insert that data.
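For reference, the server- and session-side settings can be inspected with the standard NLS dictionary views (the client-side NLS_LANG still has to be checked in the environment or registry):
-- Database (server) character sets
SELECT parameter, value FROM nls_database_parameters WHERE parameter LIKE '%CHARACTERSET%';
-- Session-level NLS settings negotiated with your client
SELECT parameter, value FROM nls_session_parameters;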
Try to DUMP the value. You can do it with a SELECT as simple as:
SELECT DUMP(the_column)
FROM xxx
WHERE xxx
UPDATE: I think that before trying to replace anything, you should look for the root of the problem. If this happens because of charset trouble, you can get big problems with bad data.
UPDATE 2: Answering the comments: the problem may not be on the database server side; it may be on the client side. The problem (if this is the problem) can be a translation in the server-to/from-client communication, caused by a bad server-client configuration. For instance, if the server is defined with a UTF8 charset and your client uses US7ASCII, then all accented characters will appear as ?.
Another possibility is that the server is defined with a UTF8 charset and your client also uses UTF8, but the application is not able to show UTF8 characters; in that case the problem is on the application side.
UPDATE 3: On your examples:
select translate('What is your spouse�s first name?', ...): it works because the � is exactly the same character; you pasted it on both sides.
select translate(Fieldname, ...): it does not work because the � is not what is stored in the database; it's the character that the client receives, possibly because some translation occurs between the data in the table and what is shown to you.
Next step: look at the DUMP syntax and try to extract the codes for the mysterious character (from the table, not by pasting the �!).
I would say there's a good chance the character is a single-tick "smart quote" (I hate the name). The smart quotes are characters 0x91-0x94 (decimal 145-148) in the Windows-1252 encoding, or Unicode U+2018, U+2019, U+201C, and U+201D.
I'm going to propose a front-end application-based, client-side approach to the problem:
I suspect that this problem has more to do with a mismatch between the font you are trying to display the word spouse�s with, and the character �. That icon appears when you are trying to display a character in a Unicode font that doesn't have the glyph for the character's code.
The Oracle database will dutifully return whatever characters were INSERTed into its column. It's more up to you and your application to interpret what it will look like given the font you are trying to display your data with, so I suggest investigating what this mysterious � character is that is replacing your apostrophes. Start by using FerranB's recommended DUMP().
Try running the following query to get the character code:
SELECT DUMP(<column with weird character>, 1016)
FROM <your table>
WHERE <column with weird character> like '%spouse%';
If that doesn't grab your actual text from the database, you'll need to modify the WHERE clause to actually grab the offending column.
Once you've found the code for the character, you could replace it using the regexp_replace() built-in function: determine the raw hex code of the character and then supply the ASCII / C0 Controls and Basic Latin character 0x0027 ('), using code similar to this:
UPDATE <table>
SET <column with offending character>
    = REGEXP_REPLACE(<column with offending character>,
                     '<character code of �>',
                     '''')
WHERE REGEXP_LIKE(<column with offending character>, '<character code of �>');
If you aren't familiar with Unicode and different ways of character encoding, I recommend reading Joel's article The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!). I wasn't until I read that article.
EDIT: If you're seeing 0x92, there's likely a charset mismatch here:
0x92 in CP-1252 (the default Windows code page) is the right single quotation mark, which looks like a curly apostrophe. This code isn't a valid ASCII character, and it isn't valid in ISO-8859-1 either. So probably either the database is in CP-1252 encoding (I don't find that likely), or a database connection which spoke CP-1252 inserted it, or somehow the apostrophe got converted to 0x92. The database is returning values that are valid in CP-1252 (or some other charset where 0x92 is valid), but your db client connection isn't expecting CP-1252. Hence the weird question mark.
And FerranB is likely right. I would talk with your DBA or some other admin about this to get the issue straightened out. If you can't, I would try either doing the update above (seems like you can't), or doing this:
INSERT INTO <table> (<normal table columns>, ..., <column with offending character>)
SELECT <all normal columns>,
       REGEXP_REPLACE(<column with offending character>,
                      '\0092',
                      '\0027') -- for the ASCII/ISO-8859-1 apostrophe
FROM <table>
WHERE REGEXP_LIKE(<column with offending character>, '\0092');
DELETE FROM <table> WHERE REGEXP_LIKE(<column with offending character>, '\0092');
Before you do this, you need to understand what actually happened. It looks to me like someone inserted non-ASCII strings into the database, for example Unicode or UTF-8. Before you fix this, be very sure that this is actually a bug; the apostrophe comes in many forms, not just "'".
TRANSLATE() is a useful function for replacing or eliminating known single character codes.
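Putting that together for the report-only constraint here, a sketch assuming the offending byte really is 0x92 (decimal 146), as the second DUMP output suggests, and reusing the FN/TN shorthand from above:
-- Clean the character only in the SELECT output; CHR(146) is the 0x92 byte
SELECT TRANSLATE(FN, CHR(146), '''') AS FN_clean
FROM TN;
-- Equivalent with the built-in regular-expression function (note the name is
-- REGEXP_REPLACE, which is why REGEX_REPLACE raised ORA-00904)
SELECT REGEXP_REPLACE(FN, CHR(146), '''') AS FN_clean
FROM TN;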