I uploaded a spreadsheet to my database and some of the emails have spaces after them. I tried TRIM and RTRIM, and neither of them works. Then I started to think it might be some invisible character.
This is what shows in my queue when I copy it out:
"john.red#test.com\u00a0\"
This is what it looks like in the database, with the space:
john.red#test.com
How would I remove this space from all the fields?
The following code should help:
UPDATE Your_Table
SET Your_Column = REPLACE(Your_Column, NCHAR(0x00A0), '')
\u00a0 is the non-breaking space. You can remove it by replacing it with ''.
\u00a0 is the NO-BREAK SPACE. It is a different character from the ordinary space (U+0020), which is the only thing RTRIM strips, so RTRIM will not take it away.
You will have to use REPLACE to remove it:
REPLACE(<column-name>, NCHAR(0x00A0), '')
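To see why trimming fails while REPLACE works, here is a minimal Python sketch run outside the database. It assumes the stray character really is U+00A0, as the copied value suggests; the helper names rtrim_spaces and remove_nbsp are made up for illustration.

```python
# Sketch: trimming ordinary spaces leaves U+00A0 behind, replacing it does not.
NBSP = "\u00a0"  # NO-BREAK SPACE, the same code point as NCHAR(0x00A0)


def rtrim_spaces(value: str) -> str:
    """Mimic an RTRIM that only strips the ordinary space U+0020."""
    return value.rstrip(" ")


def remove_nbsp(value: str) -> str:
    """Rough equivalent of REPLACE(col, NCHAR(0x00A0), '')."""
    return value.replace(NBSP, "")


email = "john.red#test.com" + NBSP  # sample value from the question

print(repr(rtrim_spaces(email)))  # the trailing \xa0 is still there
print(repr(remove_nbsp(email)))   # the trailing \xa0 is gone
```

Note that REPLACE removes the character everywhere in the value, not only at the end, which is usually what you want for email addresses.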
Related
I need to find and clean up line breaks, carriage returns, tabs and "SUB"-characters in a set of 400k+ string records, but this DB2 environment is taking a toll on me.
Thought I could do some search and replacing with the REPLACE() and CHR() functions, but it seems CHR() is not available on this system (Error: CHR in *LIBL type *N not found). Working with \t, \r, \n etc doesn't seem to be working either. The chars can be in the middle of strings or at the end of them.
DBMS = DB2
System = iSeries
Language = SQL
Encoding = Not sure, possibly EBCDIC
Any hints on what I can do with this?
I used this SQL to find x'25' and x'0D':
SELECT
<field>
, LOCATE(x'0D', <field>) AS "0D"
, LOCATE(x'25', <field>) AS "25"
, length(trim(<field>)) AS "Length"
FROM <file>
WHERE LOCATE(x'25', <field>) > 0
OR LOCATE(x'0D', <field>) > 0
And I used this SQL to replace them:
UPDATE <file>
SET <field> = REPLACE(REPLACE(<field>, x'0D', ' '), x'25', ' ')
WHERE LOCATE(x'25', <field>) > 0
OR LOCATE(x'0D', <field>) > 0
If you want to clear up specific characters like carriage return (EBCDIC x'0d') and line feed (EBCDIC x'25') you should find the translated character in EBCDIC then use the TRANSLATE() function to replace them with space.
If you just want to remove undisplayable characters then look for anything under x'40'.
Here is a sample script that replaces X'41' with X'40', something that was creating issues at our shop:
UPDATE [yourfile] SET [yourfield] = TRANSLATE([yourfield], X'40', X'41')
WHERE [yourfield] LIKE '%' CONCAT X'41' CONCAT '%'
If you need to replace more than one character, extend the "to" and "from" hexadecimal strings to the values you need in the TRANSLATE function.
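The "anything under x'40'" idea can be tried out on raw bytes before touching the table. This is only a sketch in Python, and it assumes the data is in EBCDIC code page 037 (Python's cp037 codec); your CCSID may differ. The name blank_controls is hypothetical.

```python
# Sketch only: models a TRANSLATE()-style cleanup on raw EBCDIC bytes,
# assuming code page 037. Check your real CCSID before relying on it.
SPACE = 0x40  # EBCDIC space

# 256-entry table: every byte below x'40' (the control range) maps to a
# space, every other byte maps to itself.
CONTROL_TO_SPACE = bytes(SPACE if b < SPACE else b for b in range(256))


def blank_controls(raw: bytes) -> bytes:
    """Replace every byte under x'40' with an EBCDIC space."""
    return raw.translate(CONTROL_TO_SPACE)


raw = b"\xc1\x0d\x25\xc2"  # 'A', CR (x'0D'), LF (x'25'), 'B' in cp037
cleaned = blank_controls(raw)

print(cleaned.hex())                  # hex view of the cleaned bytes
print(repr(cleaned.decode("cp037")))  # readable view after decoding
```

Replacing with a space rather than deleting keeps the field length unchanged, which matters for fixed-length columns.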
Try TRANSLATE or REPLACE.
The brute force method involves using POSITION to find the errant character, then SUBSTR before and after it. CONCAT the two substrings (less the undesirable character) to re-form the column.
The character encoding is almost certainly one of the EBCDIC character sets. Depending on how the table got loaded in the first place, the CR may be x'0d' and the LF x'15' or x'25'. An easy way to find out is to get to a green screen and do a DSPPFM against the table. Press F10 then F11 to view the table in raw, hexadecimal (over/under) format.
For details on the available functions see the DB2 for i5/OS SQL Reference.
Perhaps the TRANSLATE() function will serve your needs.
TRANSLATE( data, tochars, fromchars )
...where fromchars is the set of characters you don't want, and tochars is the corresponding characters you want them replaced with. You may have to write this out in hex format, as x'nnnnnn...' and you will need to know what character set you are working with.
Using the DSPFFD command on your table should show the CCSID of your fields.
We struggled a lot to remove the newline and carriage return characters from a flat file.
Finally we used the SQL below to sort out the issue:
REPLACE(REPLACE(COLUMN_NAME, CHR(13), ''), CHR(10), '')
Try it out
CR = CHR(13)
LF = CHR(10)
I have some rows in a table that contain an unusual character. When I use ascii() or unicode() on that character, it returns 63. But when I try this:
update MyTable
set MyColumn = replace(MyColumn,char(63),'')
it does not replace. The unusual character still exists after the replace function. Char(63) incidentally looks like a question mark.
For example, my string is 'ddd#dd ddd', where # is my unusual character, and
select unicode('#')
returns 63. But this code
declare @str nvarchar(10) = 'ddd#dd ddd'
declare @char nchar(1) = char(unicode('#'))
set @str = replace(@str, @char, '')
is working!
Any ideas how to resolve this?
Additional information:
select ascii('�') returns 63, and so does select ascii('?'). Finally select char(63) returns ? and not the diamond-question-mark.
When this character is pasted into Excel or a text editor, it looks like a space, but in an SQL Server Query window (and, apparently, here on StackOverflow as well), it looks like a diamond containing a question mark.
Not only does char(63) look like a '?', it is actually a '?'.
(As a simple test, make sure Num Lock is on, hold down the Alt key and type '63' on the number pad. You can have all sorts of fun this way: try alt-205, then alt-206 and alt-205 again: ═╬═)
It's possible, however, that the '?' you are seeing isn't a char(63) and is instead a sign of a character that SQL Server doesn't know how to display.
What do you get when you run:
select ascii(substring('[yourstring]',[pos],1));
--or
select unicode(substring('[yourstring]',[pos],1));
Where [yourstring] is your string and [pos] is the position of your char in the string
EDIT
From your comment it seems like it is a question mark. Have you tried:
replace(MyColumn,'?','')
EDIT2
Out of interest, what does the following do for you:
replace(replace(MyColumn,char(146),''),char(63),'')
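If you can copy the value out of the database, the same position-by-position inspection can be sketched in Python. This assumes the copy preserves the original characters; the function name describe_suspects and the sample string are made up for illustration.

```python
import unicodedata


def describe_suspects(text: str):
    """Return (position, code point, Unicode name) for each character that is
    a control character or lies outside printable ASCII."""
    found = []
    for pos, ch in enumerate(text, start=1):  # 1-based, like SUBSTRING
        code = ord(ch)
        if code < 32 or code > 126:
            found.append((pos, code, unicodedata.name(ch, "<unnamed>")))
    return found


# Hypothetical sample: a zero-width space in the middle, a no-break space at the end.
sample = "ddd\u200bdd ddd\u00a0"
for row in describe_suspects(sample):
    print(row)
```

The code point it reports is the number to pass to NCHAR() in the REPLACE call.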
char(63) is a question mark. It sounds like these "unusual" characters are displayed as a question mark, but are not actually characters with char code 63.
If this is the case, then removing occurrences of char(63) (aka '?') will of course have no effect on these "unusual" characters.
I believe you actually didn't have issues with literally CHAR(63), because that should be just a normal character and you should be able to properly work with it.
What I think happened is that, by mistake, a Unicode character (for example, a Cyrillic "А") was inserted into the table, and either your:
columns setup,
the SQL code,
or the passed in parameters
were not prepared for that.
In this case, the character might be visible to you as ?, and the ASCII() function would actually give 63 for it, but you should really use UNICODE() to figure out its real code.
Let me give a specific example that I have hit multiple times: issues with that Cyrillic "А", which looks identical to the Latin one but has a Unicode code point of 1040.
If you use the non-Unicode ASCII function on that character, you get 63, which is not its real code. The character has no equivalent in the non-Unicode code page, so it is converted to '?' first, and 63 is simply the code of '?'.
Actually, run this to make the differences in my example obvious:
SELECT NCHAR(65) AS Latin_A, NCHAR(1040) Cyrilic_A, ASCII(NCHAR(1040)) Latin_A_Code, UNICODE(NCHAR(1040)) Cyrilic_A_Code;
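The same effect can be reproduced outside SQL Server. The Python sketch below assumes the non-Unicode code page is Windows-1252 (the usual one for Latin1_General collations); with a different code page the result could differ.

```python
# Sketch: narrowing a Cyrillic letter to a single-byte Latin code page.
cyrillic_a = "\u0410"  # CYRILLIC CAPITAL LETTER A

print(ord(cyrillic_a))  # the real code point, what UNICODE() reports

# cp1252 has no Cyrillic letters, so the encoder substitutes '?'.
narrowed = cyrillic_a.encode("cp1252", errors="replace")
print(narrowed, narrowed[0])  # the substitute byte and its code

# Removing '?' from the original string changes nothing, because the
# stored character was never a question mark.
print(cyrillic_a.replace("?", "") == cyrillic_a)
```

This is why replacing char(63) has no effect: the 63 only appears after the lossy conversion, not in the stored data.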
The seemingly empty character that shows up as '?' in a substring, and gives an ASCII value of 63, is a zero-width space.
It gets appended when you copy data from a UI and insert it into the database.
To remove it, you can use the query below:
set MyColumn = replace(MyColumn,NCHAR(8203),'')
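A quick way to convince yourself that a zero-width space survives trimming is the Python sketch below. It assumes the character is U+200B (decimal 8203), as in the query above; the sample value is made up.

```python
# Sketch: a zero-width space is not whitespace, so trimming ignores it.
ZWSP = chr(8203)  # U+200B ZERO WIDTH SPACE, the same code as NCHAR(8203)

value = "abc" + ZWSP  # looks like "abc" on screen

trimmed = value.strip()            # strip() only removes whitespace
replaced = value.replace(ZWSP, "")  # explicit removal by code point

print(len(value), len(trimmed), len(replaced))
```

The lengths make the invisible character visible: the trimmed value is still one character longer than what you see.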
It's an older question, but I've run into this problem as well. I found the solution somewhere else on internet, but I thought it would be good to share it here as well. Have a good day.
Replace(YourString, nchar(65533) COLLATE Latin1_General_BIN2, '')
This should work as well:
UPDATE TABLE
SET [FieldName] = SUBSTRING([FieldName], 2, LEN([FieldName]))
WHERE ASCII([FieldName]) = 63
So I have a field that's basically storing an entire XML file per row, complete with line breaks, and I need to remove some text from close to three hundred rows. The replace() function doesn't find the offending text no matter what I do, and all I can find by searching is a bunch of people trying to remove the line breaks themselves. I don't see any reason that replace() just wouldn't work, so I must be formatting it wrong somehow. Help?
Edit: Here's an example of what I mean in broad terms:
<script>...</script><dependencies>...</dependencies><bunch of other stuff></bunch of other stuff><labels><label description="Field2" languagecode="1033" /></labels><events><event name="onchange" application="false" active="true"><script><![field2.DataValue = (some equation);
</script><dependencies /></event></events><a bunch more stuff></a bunch more stuff>
I need to just remove everything between the events tags. So my sql code is this:
replace(fieldname, '<events><event name="onchange" application="false" active="true"><script><![field2.DataValue = (some equation);
</script><dependencies /></event></events>', '')
I've tried it like that, and I've tried it all on one line, and I've tried using char(10) where the line breaks are supposed to be, and nothing.
Nathan's answer was close. Since this question is the first thing that came up from a search I wanted to add a solution for my problem.
select replace(field,CHAR(13)+CHAR(10),' ')
I replaced the line break with a space in case there was no other separator between the two lines. It may be that you always want to replace it with nothing, in which case '' should be used instead of ' '.
Hope this helps someone else and they don't have to click the second link in the results from the search engine.
This worked for me on SQL Server 2012:
UPDATE YourTable
SET YourCol = REPLACE(YourCol, CHAR(13) + CHAR(10), '')
If your column is an xml typed column, you can use the delete method on the column to remove the events nodes. See http://msdn.microsoft.com/en-us/library/ms190254(v=SQL.90).aspx for more info.
Try two simple tests.
Try the replace on an XML string that has no double quotes (or single quotes) but does have CRLFs. Does it work? If yes, you need to escape the quote marks.
Try the replace on an XML string that has no CRLFs. Does it work? If yes, use two nested replace() calls: an inner one for the CRLFs only, then an outer one for the string in question.
A lot of people do not remember that Windows line breaks are two characters
(Char 13 \r, followed by Char 10 \n).
Replace both, and you should be good.
SELECT
REPLACE(REPLACE(field, CHAR(13), ''), CHAR(10), '')
FROM Blah..
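The order of the two characters matters when you search for them as a pair. A small Python sketch, using a made-up two-line string with Windows-style line breaks, shows the difference:

```python
# Sketch: a Windows line break is CR (13) followed by LF (10), in that order.
text = "line one\r\nline two"

# Searching for LF+CR finds nothing, because the pair is stored as CR+LF.
wrong_order = text.replace("\n\r", "")

# Replacing each character on its own works whatever the order, and also
# catches lone CRs or LFs.
each_char = text.replace("\r", "").replace("\n", "")

print(repr(wrong_order))
print(repr(each_char))
```

Replacing each character separately is the safer choice when the data may mix Windows and Unix line endings.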
I'm doing some screen scraping and I'm getting back a string that appears to end with whitespace, but neither string.strip nor string.gsub(/\s/u, '') removes the character.
I'm guessing it's a character encoding issue. Any suggestions?
There are a lot of different "space characters", and strip only handles some of them.
You can use something like this:
my_string.gsub("\302\240", ' ').strip
You can try this: my_string.gsub(/\A[[:space:]]+|[[:space:]]+\z/, '')
This should remove all space characters from the beginning and the end of string, including all possible unicode space variations.
Figure out the character code of the last character (str[-1].ord) and explicitly search and destroy it. Rinse/repeat if there exist more unwanted characters after that. After doing this, report back here what the invisible character was. (Perhaps it's only invisible because the font you are using does not have that glyph?)
Hi, I am facing a problem with the LIKE operator in SQL.
I want to search for special characters within a column.
The special characters are a single quotation mark ' and the braces { and }.
I have tried placing these special characters inside [] but it still doesn't work for '.
I have also tried the EXCEPT option, but that was of no help either.
Waiting for a response soon.
When you specify a value which has single quote, you need to double it.
SELECT *
FROM dbo.Northwind
WHERE Summary LIKE 'single''quotes%'
Try this:
select * from <table> where <column> like '%''%'
SQL Server escaping is a pain because there are various ways to escape characters, each with a different meaning and use case.
A single quote is escaped with another single quote: WHERE myfield LIKE '%''%'.
For LIKE wildcards such as % and _, the general solution is to name an escape character with ESCAPE and put it in front of the wildcard:
SELECT .... WHERE my_column LIKE '%\%%' ESCAPE '\'
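The two kinds of escaping can be tried side by side. The sketch below uses Python's built-in sqlite3 module purely as a stand-in, so it shows the standard SQL behavior rather than anything specific to SQL Server; the notes table and its rows are made up. In it, ESCAPE is applied to a LIKE wildcard (%), which is what that clause is meant for, while the quote is handled by doubling.

```python
import sqlite3

# SQLite is only a stand-in here; table name and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (summary TEXT)")
conn.executemany(
    "INSERT INTO notes (summary) VALUES (?)",
    [("it's here",), ("50% off",), ("{braces}",), ("plain",)],
)


def find(sql: str):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql)]


# A single quote inside a string literal is written as two single quotes.
quote_rows = find("SELECT summary FROM notes WHERE summary LIKE '%''%'")

# A literal % needs the ESCAPE clause, because % is a LIKE wildcard.
percent_rows = find(
    r"SELECT summary FROM notes WHERE summary LIKE '%\%%' ESCAPE '\'"
)

# Braces are not wildcards, so they need no escaping at all.
brace_rows = find("SELECT summary FROM notes WHERE summary LIKE '%{%'")

print(quote_rows, percent_rows, brace_rows)
```

In application code, passing the search pattern as a bound parameter avoids the quote-doubling step altogether.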