Replacing Azerbaijani characters in a string does not work correctly - sql

I want to replace the Azerbaijani character 'ş' in my string with 'sh'. It works, but it also replaces 's' with 'sh'.
How can I solve this? Any ideas?
REPLACE(mystring,'ş','sh')

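The character "ş" is not in the code page used for non-Unicode (varchar) literals unless the database collation uses a code page that contains it (for example Turkish_CI_AS), so a plain 'ş' literal is silently converted to 's'. The same problem shows up when inserting such characters into the database and retrieving them. The trick is to use nvarchar and the N prefix when inserting and querying.
See the example below.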
SELECT REPLACE(N'arshad khan earns 1000ş',N'ş','sh');
SELECT 'ş'
SELECT N'ş'
The output is:
arshad khan earns 1000sh
s
ş

This character is outside basic ASCII and has to be handled as Unicode:
http://en.wikipedia.org/wiki/%C5%9E
The answers to "how to insert unicode text to SQL Server from query window" show how to solve your problem.

This is the only solution I found after some research.
It happens with two characters, 'ş' and 'ç': if you replace these characters with something else, 's' and 'c' are replaced as well.
Of course, if your database collation is 'Turkish_CI_AS' it works. But in my case I could not change the database collation just for two characters; that would make no sense.
My client simply wanted Azeri characters changed to Latin ones, for example 'ş' -> 's'.
My solution: at the start I replace 's' and 'c' with special placeholders, and I put them back after all Azeri characters have been replaced with Latin ones. That way the original 'c' and 's' characters are not affected by the replacement.
This is the function I wrote:
CREATE FUNCTION [dbo].[funRGMReplaceAzeriCharacters]
(
    @string nvarchar(MAX)
)
RETURNS varchar(MAX)
AS
BEGIN
    DECLARE @Result nvarchar(MAX)
    -- Protect plain s and c with placeholders before touching the Azeri letters
    SET @Result = REPLACE(@string, 's', 'V1986Q')
    SET @Result = REPLACE(@Result, 'c', 'V1987Q')
    -- N'...' keeps the Azeri letters as Unicode instead of converting them to the database code page
    SET @Result = REPLACE(@Result, N'ı', N'i')
    SET @Result = REPLACE(@Result, N'ə', N'a')
    SET @Result = REPLACE(@Result, N'ğ', N'g')
    SET @Result = REPLACE(@Result, N'ü', N'u')
    SET @Result = REPLACE(@Result, N'ş', N'sh')
    SET @Result = REPLACE(@Result, N'ç', N'ch')
    SET @Result = REPLACE(@Result, N'ö', N'o')
    -- Bring back s and c
    SET @Result = REPLACE(@Result, 'V1986Q', 's')
    SET @Result = REPLACE(@Result, 'V1987Q', 'c')
    RETURN UPPER(@Result)
END
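If changing the database collation is not an option, one possible alternative is to apply a collation to the single expression instead. The following is only a sketch: the variable name and sample text are made up, and it assumes the data is nvarchar. A binary collation compares code points, so 'ş' and 's' are never treated as the same character.

```sql
-- Hypothetical sample value; N'...' keeps the literal as Unicode
DECLARE @name nvarchar(100) = N'şəhər sistem';

-- COLLATE applies only to this expression; the database collation is untouched.
-- Binary comparison is also case-sensitive, so an uppercase 'Ş' would need its own REPLACE.
SELECT REPLACE(@name COLLATE Latin1_General_BIN2, N'ş', N'sh') AS replaced;
-- replaced: shəhər sistem
```

With this approach the placeholder step for plain 's' and 'c' should not be needed, because only the exact character U+015F matches.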

Related

How to remove digits and special characters from the beginning of a string?

For instance I have
'234 - ? Hi there'
The result should be:
'Hi there'
For Oracle you have the regexp_replace function, so you could do the following to remove non-alphabetic characters from the beginning of the string:
select regexp_replace('24 Hi','^([^a-zA-Z]*)','') from dual
The first ^ in ^([^a-zA-Z]*) is to match the beginning of the string. The second ^ is to match any non-alphabetic characters.
In Oracle you can use REGEXP_REPLACE(). I recommend using a slightly different regex than the one in the accepted answer; there's no reason to do any replacing on a pattern that can be of zero width. Additionally, the parentheses are unnecessary since you don't need to capture a group:
SELECT REGEXP_REPLACE(my_column, '^[^A-Za-z]+') FROM my_table;
We can also omit the third argument to REGEXP_REPLACE since in Oracle a NULL and an empty string are equivalent. Another alternative in Oracle is to use the POSIX character class [:alpha:]:
SELECT REGEXP_REPLACE(my_column, '^[^[:alpha:]]+')
FROM my_table;
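Use this function to remove numeric characters and special symbols. Note that it removes them everywhere in the string, not only at the beginning, and that it keeps letters, spaces and periods.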
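CREATE FUNCTION [dbo].[RemoveNumericandSpecialSymbolValue](@str varchar(500))
RETURNS varchar(500)
AS
BEGIN
    -- @text holds the position of the next character to remove
    DECLARE @text int
    SET @text = 0
    WHILE 1 = 1
    BEGIN
        -- Find the first character that is not a letter, a space or a period
        SET @text = PATINDEX('%[^a-z .]%', @str)
        IF @text <> 0
        BEGIN
            -- Remove every occurrence of that character
            SET @str = REPLACE(@str, SUBSTRING(@str, @text, 1), '')
        END
        ELSE BREAK;
    END
    RETURN @str
END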
Example:
select dbo.RemoveNumericandSpecialSymbolValue('234 - ? Hi there')
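To strip only the leading digits and symbols in SQL Server, as the question asks, a shorter sketch is possible with PATINDEX and STUFF. The variable name is made up for illustration, and the example assumes the string contains at least one letter.

```sql
DECLARE @s varchar(100) = '234 - ? Hi there';

-- PATINDEX returns the position of the first letter (9 here);
-- STUFF deletes everything before that position.
SELECT STUFF(@s, 1, PATINDEX('%[a-zA-Z]%', @s) - 1, '') AS stripped;
-- stripped: Hi there
```

If the string contains no letter at all, PATINDEX returns 0, the length passed to STUFF becomes negative and the result is NULL, so that case would need separate handling.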

How to trim string (with Ideographic space U+3000) in sql server?

I have to trim a string of Japanese characters that has a double-byte space at the start and at the end of the string.
I have to do this in a procedure on SQL Server 2016.
For example,
SELECT LTRIM(RTRIM(' A A '))
works perfectly.
But the problem is in the line below, where the leading and trailing spaces are ideographic spaces (U+3000):
SELECT LTRIM(RTRIM('　A A　'))
I want the output of this one to be 'A A'.
Any idea how to do this?
Adapted SQL from OP's post:
SELECT LTRIM(RTRIM(REPLACE(N'　A A　', N'　', N' ')))
The space in that string is the Ideographic space (U+3000) Unicode character, which LTRIM and RTRIM don't recognize as whitespace. Even TRIM in SQL Server 2017 won't recognize it unless it's specified explicitly.
Another problem is that this character is outside the normal range of characters and can't appear in a varchar field or value. This leads to inconsistent results between SQL Server versions. In SQL Server 2014 it will even appear as a ?. In later versions LTRIM/RTRIM may or may not work without emitting the error character. I don't have access to all versions to test this.
In SQL Server 2017 it's possible to explicitly specify the character to trim, e.g.:
select trim(N'　' from N'　A A　')
This produces A A.
In previous versions, PATINDEX can be used to find the position of the first non-space character and the position of the trailing space:
declare @str nvarchar(10)=N'　A A　';
declare @start int=PATINDEX(N'%[^　]%',@str)
declare @end int=PATINDEX(N'%　',@str)
SELECT SUBSTRING(@str,@start,@end-@start)
The pattern N'%[^　]%' finds the first non-U+3000 character in the string. N'%　' matches a string that ends with U+3000 and returns the position of that final character. SUBSTRING(@str,@start,@end-@start) extracts the content between the two positions.
The result is:
A A
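Because U+3000 is invisible in most editors, it can be easier to build it with NCHAR(12288) than to paste it into a literal. The sketch below uses made-up variable names and applies the REPLACE approach from the adapted SQL above; note that it also turns any ideographic space in the middle of the string into a regular space.

```sql
DECLARE @ideo nchar(1) = NCHAR(12288);               -- U+3000 IDEOGRAPHIC SPACE
DECLARE @str nvarchar(20) = @ideo + N'A A' + @ideo;  -- sample value with U+3000 at both ends

-- Turn every U+3000 into a regular space, then trim as usual
SELECT LTRIM(RTRIM(REPLACE(@str, @ideo, N' '))) AS trimmed;
-- trimmed: A A
```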
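I found a solution.
Thank you so much for your efforts.
Please use this function to remove trailing double-byte spaces.
CREATE FUNCTION [RTRIMBYTE](@AV_VALUE NVARCHAR(MAX))
RETURNS NVARCHAR(MAX)
AS
BEGIN
    DECLARE @AV_RETURN NVARCHAR(MAX) = @AV_VALUE;
    -- Loop while the last character is a regular space or an ideographic space (U+3000)
    WHILE DATALENGTH(@AV_RETURN) > 0 AND RIGHT(@AV_RETURN, 1) IN (N' ', N'　')
        -- LEN ignores trailing spaces, so pad with 'X' on both sides to get the real length, then drop the last character
        SET @AV_RETURN = LEFT(@AV_RETURN, LEN('X' + @AV_RETURN + 'X') - 3);
    RETURN @AV_RETURN;
END;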

What is the difference when parsing between Tab and Spaces in sql server 2008 R2

I have encountered the scenario below:
Declare @var int = '     123'
select @var
Declare @var1 int = '	123'
select @var1
In the first case I used spaces in front of the value, and on execution it returns 123.
In the second case I used a tab instead of spaces in front of the value, and on execution it throws a conversion error.
Can anyone explain the difference between these two scenarios?
Even though the two values look the same on screen, a space and a tab are different characters with different character codes (space is CHAR(32), tab is CHAR(9)), which is why SQL Server treats them differently.
More information about character codes and character encoding can be found at the two links below:
https://www.computerhope.com/jargon/c/charcode.htm
https://www.pcmag.com/encyclopedia/term/51983/standards-character-codes
Also, thinking about it mathematically and logically: leading spaces before an integer carry no meaning, much like leading zeros.
For example, '     123' (5 spaces and then 123) is like 00000123.
That is one more reason leading spaces are ignored when the string is converted to an integer, while a tab is not handled the same way and causes the conversion error.
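The difference between the two characters can be checked directly, and a tab can be removed before the conversion. This is a minimal sketch with a made-up variable name; it avoids TRY_CONVERT, which is not available in SQL Server 2008 R2.

```sql
-- Space and tab have different character codes
SELECT ASCII(' ') AS space_code, ASCII(CHAR(9)) AS tab_code;
-- space_code: 32, tab_code: 9

-- Strip the tab explicitly before converting to int
DECLARE @raw varchar(20) = CHAR(9) + '123';
SELECT CAST(REPLACE(@raw, CHAR(9), '') AS int) AS cleaned_value;
-- cleaned_value: 123
```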

Reverse characters in string with mixed Left-to-right and Right-to-left languages using SQL?

I have string values in my table that include Hebrew characters (or any RTL language, for that matter) and English ones (or numbers).
The problem is that the English characters are reversed and look like this:
בדיקה 123456 esrever sti fI kcehC.
The numbers and the English characters are reversed; the Hebrew ones are fine.
How can I use built-in SQL functions to identify the English substring (and the numbers) and reverse it while maintaining the order of the other RTL characters? Any workaround will do :-) ...
thanks
I believe that your entire string is reversed, and the fact that the Hebrew words are displayed in the correct order is actually the result of a different problem. What I suspect is that the Hebrew words are stored in a non-lexical order.
In theory you should be able to resolve your problem by simply reversing the string and then forcing SQL Server to display the Hebrew words from left to right. This is done by adding a special character to the front and back of your string, as follows (NCHAR(8237) is U+202D LEFT-TO-RIGHT OVERRIDE and NCHAR(8236) is U+202C POP DIRECTIONAL FORMATTING):
DECLARE @sourceString NVARCHAR(100) = N'123456 בדיקה esrever sti fI kcehC';
DECLARE @reversedString NVARCHAR(4000) = nchar(8237) + REVERSE(@sourceString) + nchar(8236)
SELECT @reversedString;
I've never worked with Hebrew characters, so I'm not sure whether this will work.
However, I think you can implement a function with a while loop using PATINDEX.
You'll need:
a variable for holding the reversed English part, @EngTemp
a variable to hold the substring currently being processed, @SubTemp
a variable to hold the remaining text in the string that still needs to be processed, @SubNext
a variable to hold the length of the current substring, @Len
an output variable, @Out
Steps:
Take a string input and put it into @SubNext
while PatIndex('%[^a-z]%', @SubNext) > 0
take the substring up to the PATINDEX position and store it in @SubTemp; also trim @SubNext to the PATINDEX position
store the length of @SubTemp in @Len
if @Len > 1; set @Out = @Out + @EngTemp + @SubTemp; Set @EngTemp = ''
(This step allows for cases where the English string is not at the end of the line)
if @Len = 1; set @EngTemp = @SubTemp + @EngTemp
if @Len = 0; set @Out = @Out + @EngTemp
(At this point the loop should also end)
I'm going to play with this when I have some time and post actual code; sorry if my scribbles don't make any sense.
You can use the ASCII function in SQL Server to get the ASCII value of each character in the text field. Once you have the value, compare it against the valid range of visible English characters and numerals. Anything else can be considered a Hebrew character. (For nvarchar data, the UNICODE function returns the code point instead.)
SQL Server also has a built-in REVERSE function for reversing the string as required.
The following link has some sample code.
http://www.sqlservercurry.com/2009/09/how-to-find-ascii-value-of-each.html
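For nvarchar data, a character check along these lines can be sketched with UNICODE and the Hebrew block range U+0590 to U+05FF (1424 to 1535 in decimal). The variable name, sample text and column aliases are illustrative only.

```sql
DECLARE @text nvarchar(100) = N'בדיקה 123456 Check';

-- Inspect the first stored character; a loop over SUBSTRING positions would classify the rest
SELECT UNICODE(SUBSTRING(@text, 1, 1)) AS first_code_point,
       CASE WHEN UNICODE(SUBSTRING(@text, 1, 1)) BETWEEN 1424 AND 1535
            THEN 'Hebrew'
            ELSE 'other'
       END AS script;
-- first_code_point: 1489, script: Hebrew
```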

Why does CHARINDEX function return an index for 'Œ' in string 'manoeuvre'?

I have this SQL code:
declare @s varchar(8000) = 'manoeuvre'
select CHARINDEX(char(140), @s, 0)
char(140) = Œ, which does not exist in the string 'manoeuvre'.
Yet SQL Server returns the following:
4 (indicating it had located the char(140) on this line)
If I replace 'Œ' with a '*' I get
man*uvre
It seems like SQL Server has matched the 'o' and 'e' as the one character, but why?
Why is it replacing 'oe' with 'Œ'?
The same effect can be seen with the string 'mass' and 'ß' (which I believe is German for double s). Replacing on this character returns the string 'ma*'.
Is SQL Server trying to do something "smart" under the covers?
EDIT
Extra information:
SQL server 2008 R2.
collation of database is Latin1_General_CI_AS.
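If you look up that character (code 140 in the Windows-1252 code page) it is described as
capital OE ligature
See www.table-ascii.com for instance.
Under a linguistic collation such as Latin1_General_CI_AS the ligature is matched against the letter pair 'oe', which is why it is found. Try
select CHARINDEX(char(140), @s COLLATE Latin1_General_BIN, 0)
which will do a binary comparison instead.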
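A side-by-side check makes the difference visible. This sketch assumes the same Latin1_General_CI_AS database as in the question; the column aliases are made up for illustration.

```sql
DECLARE @s varchar(8000) = 'manoeuvre';

SELECT
    -- Linguistic comparison under the database collation (the question reports 4)
    CHARINDEX(CHAR(140), @s) AS linguistic_match,
    -- Binary comparison looks for the exact byte 0x8C, which 'manoeuvre' does not contain
    CHARINDEX(CHAR(140), @s COLLATE Latin1_General_BIN) AS binary_match;
```

The binary column should come back as 0, since no byte in the string equals 0x8C.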