How to remove digits and special characters from the beginning of a string? - sql

For instance I have
'234 - ? Hi there'
The result should be:
'Hi there'

For Oracle, you have the regexp_replace function. You can use it as follows to strip non-alphabetic characters from the beginning of the string:
select regexp_replace('24 Hi','^([^a-zA-Z]*)','') from dual
The first ^ in ^([^a-zA-Z]*) anchors the match to the beginning of the string. The second ^, inside the brackets, negates the character class, so [^a-zA-Z] matches any non-alphabetic character.

In Oracle you can use REGEXP_REPLACE(). I recommend using a slightly different regex than the one in the accepted answer; there's no reason to do any replacing on a pattern that can be of zero width. Additionally, the parentheses are unnecessary since you don't need to capture a group:
SELECT REGEXP_REPLACE(my_column, '^[^A-Za-z]+') FROM my_table;
We can also omit the third argument to REGEXP_REPLACE since, in Oracle, NULL and an empty string are equivalent. Another alternative in Oracle is to use the POSIX character class [:alpha:]:
SELECT REGEXP_REPLACE(my_column, '^[^[:alpha:]]+')
FROM my_table;
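To check the pattern outside the database, here is a minimal Python sketch. It assumes Python's re module treats this simple anchored character class the same way Oracle does; re has no [:alpha:] class, so only the [^A-Za-z] form is shown. The helper name strip_leading_non_letters is made up for illustration.

```python
import re

def strip_leading_non_letters(text: str) -> str:
    """Remove everything before the first ASCII letter (hypothetical helper)."""
    # '^' anchors at the start; '[^A-Za-z]+' needs at least one non-letter,
    # so a string that already starts with a letter is returned unchanged.
    return re.sub(r'^[^A-Za-z]+', '', text)

print(strip_leading_non_letters('234 - ? Hi there'))  # Hi there
print(strip_leading_non_letters('Hi 5 there'))        # Hi 5 there
```

Digits and symbols later in the string are left alone, because the pattern can only match at position 1.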

Use this function to remove numeric and special characters.
CREATE FUNCTION [dbo].[RemoveNumericandSpecialSymbolValue](@str varchar(500))
returns varchar(500)
begin
    -- position of the next character that is not a letter, space or period
    declare @text int
    set @text=0
    while 1=1
    begin
        set @text= patindex('%[^a-z .]%',@str)
        if @text <> 0
        begin
            -- remove every occurrence of that character, then search again
            set @str = replace(@str,substring(@str,@text,1),'')
        end
        else break;
    end
    return @str
end
Example:
select dbo.RemoveNumericandSpecialSymbolValue('234 - ? Hi there')
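Note that the loop above strips the offending characters wherever they occur, not only at the beginning, and that it keeps spaces and periods. A rough Python model of the same logic makes this visible. It assumes a case-insensitive collation, so that [^a-z .] keeps upper-case letters too; the function name is hypothetical.

```python
import re

def remove_numeric_and_special(text: str) -> str:
    """Python model of the T-SQL loop (assumes a case-insensitive collation)."""
    # Keep letters, spaces and periods; drop every other character,
    # no matter where it appears in the string.
    return re.sub(r'[^A-Za-z .]', '', text)

print(repr(remove_numeric_and_special('234 - ? Hi there')))  # '   Hi there'
print(repr(remove_numeric_and_special('Room 101')))          # 'Room '
```

The three leading spaces survive, so the result would still need an LTRIM to match the expected 'Hi there'.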


How to trim string (with Ideographic space U+3000) in sql server?

I have to trim a string of Japanese characters that has a double-byte space at the start and at the end.
I have to do this in a SQL Server 2016 procedure.
For example,
SELECT LTRIM(RTRIM(' A A '))
works perfectly.
But the problem is with the line below:
SELECT LTRIM(RTRIM('　A A　'))
I want the output of this one to be 'A A'.
Any idea how to do this?
Adapted SQL from OP's post:
SELECT LTRIM(RTRIM(REPLACE(N'　A A　', N'　', ' ')))
The space in that string is the Ideographic space (U+3000) Unicode character, which LTRIM and RTRIM don't recognize as whitespace. Even TRIM in SQL Server 2017 won't recognize it unless it's specified explicitly.
Another problem is that this character is outside the normal range of characters and can't appear in a varchar field or value. This leads to inconsistent results between SQL Server versions. In SQL Server 2014 it will even appear as a ?. In later versions LTRIM/RTRIM may or may not work without emitting the error character. I don't have access to all versions to test this.
In SQL Server 2017 it's possible to explicitly specify the character to trim, e.g.:
select trim(N'　' from N'　A A　')
This produces A A.
In previous versions, PATINDEX can be used to find the position of the first non-space character and the position of the trailing space:
declare @str nvarchar(10)=N'　A A　';
declare @start int=PATINDEX(N'%[^　]%',@str)
declare @end int=PATINDEX(N'%　',@str)
SELECT SUBSTRING(@str,@start,@end-@start)
The pattern N'%[^　]%' finds the first non-U+3000 character in the string. N'%　' finds the position of the U+3000 character at the very end of the string. SUBSTRING(@str,@start,@end-@start) extracts the content between the two positions.
The result is:
A A
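The idea of locating the first and last characters that are not U+3000 can be sketched in Python as follows. This is only a model of the approach, not SQL Server behavior; unlike the single PATINDEX(N'%　') lookup, it also copes with several trailing ideographic spaces or with none. The names are hypothetical.

```python
IDEOGRAPHIC_SPACE = '\u3000'

def trim_ideographic(text: str) -> str:
    """Trim U+3000 from both ends by locating the first and last other character."""
    start = 0
    # Walk forward past leading ideographic spaces.
    while start < len(text) and text[start] == IDEOGRAPHIC_SPACE:
        start += 1
    end = len(text)
    # Walk backward past trailing ideographic spaces.
    while end > start and text[end - 1] == IDEOGRAPHIC_SPACE:
        end -= 1
    return text[start:end]

sample = '\u3000A A\u3000'
print(trim_ideographic(sample))  # A A
```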
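I found a solution.
Thank you so much for your efforts.
Please use this function to remove trailing double-byte spaces.
CREATE FUNCTION [RTRIMBYTE](@AV_VALUE NVARCHAR(MAX))
RETURNS NVARCHAR(MAX)
AS
BEGIN
    DECLARE @AV_RETURN NVARCHAR(MAX) = @AV_VALUE;
    -- strip one character at a time while the string ends in a regular or ideographic space
    WHILE DATALENGTH(@AV_RETURN) > 0 AND RIGHT(@AV_RETURN, 1) in (N' ', N'　')
        -- LEN('X' + value + 'X') - 3 is the length minus one; the X padding stops LEN from ignoring trailing spaces
        SET @AV_RETURN = LEFT(@AV_RETURN, LEN('X' + @AV_RETURN + 'X') -3 ) ;
    RETURN @AV_RETURN;
END;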

Replace Azerbaijani characters in string not works correct

I want to replace the Azerbaijani character 'ş' in my string with 'sh'. It works, but it also replaces 's' with 'sh'.
How can I solve this? Any ideas?
REPLACE(mystring,'ş','sh')
The character "ş" is covered by the Turkish_CI_AS collation. Inserting such characters into the database and retrieving them is a problem. The trick is to use nvarchar and to prefix string literals with N when inserting and querying.
Refer to the example below.
SELECT REPLACE(N'arshad khan earns 1000ş',N'ş','sh');
SELECT 'ş'
SELECT N'ş'
Output is as below
arshad khan earns 1000sh
s
ş
This character is a Unicode character:
http://en.wikipedia.org/wiki/%C5%9E
See this article to solve your problem:
how to insert unicode text to SQL Server from query window
This is the only solution I found after research.
It happens with two characters, 'ş' and 'ç': if you replace these characters with something else, 's' and 'c' are replaced as well.
Of course, it works if your database collation is 'Turkish_CI_AS'. But in my case I could not change the database collation just for two characters; that would make no sense.
My client just wanted me to change Azeri characters to Latin ones ('ş' -> 's').
My solution: first I replace 's' and 'c' with special placeholders, then I replace all Azeri characters with Latin ones, and finally I put 's' and 'c' back. That way my original 'c' and 's' characters are not affected by the replacement.
This is the function I wrote:
Create FUNCTION [dbo].[funRGMReplaceAzeriCharacters]
(
    @string nvarchar (MAX)
)
RETURNS varchar(MAX)
AS
BEGIN
    DECLARE
    @Result nvarchar(MAX)
    Begin
        -- park plain s and c behind placeholders
        SET @Result=REPLACE(@string,'s' ,'V1986Q')
        SET @Result=REPLACE(@Result,'c' ,'V1987Q')
        -- the N prefix keeps the Azeri literals as Unicode instead of converting them to varchar
        SET @Result=REPLACE(@Result,N'ı' ,'i')
        SET @Result=REPLACE(@Result,N'ə','a')
        SET @Result=REPLACE(@Result,N'ğ','g')
        SET @Result=REPLACE(@Result,N'ü','u')
        SET @Result=REPLACE(@Result,N'ş','sh')
        SET @Result=REPLACE(@Result,N'ç','ch')
        Set @Result=REPLACE(@Result,N'ö','o')
        -- bring back s and c
        Set @Result=REPLACE(@Result,'V1986Q','s')
        Set @Result=REPLACE(@Result,'V1987Q','c')
    END
    RETURN UPPER (@Result)
END
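For comparison, here is a minimal Python sketch of the same transliteration. Python strings are Unicode throughout, so plain 's' and 'c' are never confused with 'ş' and 'ç' and no placeholder step is needed. Whether the same holds in SQL Server depends on N-prefixed literals and on the collation, which is an assumption here. The mapping copies the replacements from the function above; the names are hypothetical, and upper-case Azeri letters are not handled.

```python
# Lower-case Azerbaijani letters and the Latin replacements used in the function above.
AZERI_TO_LATIN = str.maketrans({
    'ı': 'i', 'ə': 'a', 'ğ': 'g', 'ü': 'u',
    'ş': 'sh', 'ç': 'ch', 'ö': 'o',
})

def replace_azeri_characters(text: str) -> str:
    """Transliterate, then upper-case, mirroring the T-SQL function (hypothetical port)."""
    return text.translate(AZERI_TO_LATIN).upper()

print(replace_azeri_characters('şəkər çay, sus'))  # SHAKAR CHAY, SUS
```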

Remove last x characters until a specific character

I got this string /uk-en/contact-us/frequently-asked-questions/your-trip/there-wi-fi-access-in-the-eurostar-terminals-and-board-your-trains and I need to get just the last part of the URL (until the last /).
Then I want to replace '-' with a space. The strings are not with the same number of characters.
How can I do this?
Thank you!
Solution using BigQuery functions:
select regexp_replace(last(split(x, "/")), "-", " ") from
(select
"/uk-en/contact-us/frequently-asked-questions/your-trip/there-wi-fi-access-in-the-eurostar-terminals-and-board-your-trains"
as x)
Here is what I tried in SQL Server:
DECLARE @s VARCHAR(max)= '/uk-en/contact-us/frequently-asked-questions/your-trip/there-wi-fi-access-in-the-eurostar-terminals-and-board-your-trains'
-- returns the path up to the last '/' unchanged, followed by the last segment with '-' replaced by spaces
SELECT REVERSE(SUBSTRING(REVERSE(@s),CHARINDEX('/',REVERSE(@s)),LEN(REVERSE(@s))))+REVERSE(REPLACE(SUBSTRING(REVERSE(@s),1,CHARINDEX('/',REVERSE(@s))-1),'-',' '))
Sorry, this was for SQL Server.
Did you try using SPLIT in BigQuery?
SPLIT('str' [, 'delimiter']) Returns a set of substrings as a repeated string. If delimiter is specified, the SPLIT function breaks str into substrings, using delimiter as the delimiter.
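The split-then-replace idea is easy to verify in a few lines of Python. This is a sketch of the logic only, not of BigQuery's SPLIT; the function name is made up for illustration.

```python
def last_segment_as_words(url: str) -> str:
    """Return the text after the last '/' with hyphens turned into spaces."""
    # rsplit with maxsplit=1 splits only at the final '/'.
    last_part = url.rsplit('/', 1)[-1]
    return last_part.replace('-', ' ')

url = ('/uk-en/contact-us/frequently-asked-questions/your-trip/'
       'there-wi-fi-access-in-the-eurostar-terminals-and-board-your-trains')
print(last_segment_as_words(url))
# there wi fi access in the eurostar terminals and board your trains
```

A URL that ends in '/' yields an empty string, so trailing slashes may need to be stripped first.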

Transact SQL replace part of string

Is it possible to delete part of string using regexp (or something else, may be something like CHARINDEX could help) in SQL query?
I use MS SQL Server (2008 most likely).
Example: I have strings like "[some useless info] Useful part of string" I want to delete parts with text in brackets if they are in line.
Use REPLACE, for example:
UPDATE authors SET city = replace(city, 'To Remove', 'With BLANK or Whatever')
WHERE city LIKE 'Salt%'; -- with a WHERE condition
You can use the PATINDEX function. It's not a complete regular expression implementation, but you can use it for simple things.
PATINDEX (Transact-SQL): Returns the starting position of the first occurrence of a pattern in a specified expression, or zeros if the pattern is not found, on all valid text and character data types.
OR You can use CLR to extend the SQL Server with a complete regular expression implementation.
SQL Server 2005: CLR Integration
SELECT * FROM temp where replace(replace(replace(url,'http://',''),'www.',''),'https://','')='"+url+"';
You can use STUFF to insert a string into another string. It deletes a specified length of characters in the first string at the start position and then inserts the second string into the first string at the start position.
For example, the code below replaces the 5 with 666666:
DECLARE @Variable NVARCHAR(MAX) = '12345678910'
SELECT STUFF(@Variable, 5, 1, '666666')
Note that the second argument is not a string but a position, which you can calculate using CHARINDEX, for example.
Here is your case:
DECLARE @Variable NVARCHAR(MAX) = '[some useless info] Useful part of string'
SELECT STUFF(
    @Variable
    ,CHARINDEX('[', @Variable)
    -- number of characters from '[' through ']' inclusive
    ,CHARINDEX(']', @Variable) - CHARINDEX('[', @Variable) + 1
    ,''
)
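The position arithmetic behind the STUFF call can be checked with a short Python sketch. It models only the index logic (Python indexes from 0, T-SQL from 1); the function name is hypothetical, and the guard for a missing bracket is an addition that the T-SQL above does not have.

```python
def remove_first_bracketed(text: str) -> str:
    """Delete the first '[...]' span, like STUFF(text, start, length, '')."""
    start = text.find('[')
    end = text.find(']', start + 1) if start != -1 else -1
    if start == -1 or end == -1:
        return text  # no complete bracket pair: leave the string alone
    # Keep what precedes '[' and what follows ']'.
    return text[:start] + text[end + 1:]

print(repr(remove_first_bracketed('[some useless info] Useful part of string')))
# ' Useful part of string'
```

The space after the closing bracket is kept, so an LTRIM around the result may be wanted.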
In the end, REPLACE, SUBSTRING and PATINDEX did the job:
REPLACE(t.badString, Substring(t.badString , Patindex('%[%' , t.badString)+1 , Patindex('%]%' , t.badString)), '').
Thanks to all.

How to remove part of the string in oracle

Input data:
abcdef_fhj_viji.dvc
Expected output:
fhj_viji.dvc
The part to be trimmed is not constant.
Use the REPLACE function:
Select REPLACE('abcdef_fhj_viji.dvc','abcdef_','') from dual
If you want this query for your table:
Select REPLACE(column,'abcdef_','') from myTable
For an update:
UPDATE myTable
SET column = REPLACE(column,'abcdef_','')
select substr('abcdef_fhj_viji.dvc',instr('abcdef_fhj_viji.dvc','_')+1) from dual
So it all depends on the INSTR function: specify the starting position and the occurrence you want, and pass the index it returns to SUBSTR to get your string.
Since you didn't give a lot of information, I'm going to assume some.
Let's assume you want a prefix of some string to be deleted. A good way to do that is by using Regular Expressions. There's a function called regexp_replace, that can find a substring of a string, depending on a pattern, and replace it with a different string. In PL/SQL you could write yourself a function using regexp_replace, like this:
function deletePrefix(stringName in varchar2) return varchar2 is
begin
return regexp_replace(stringName, '^[a-zA-Z]+_', '');
end;
or just use this in plain sql like:
regexp_replace(stringName, '^[a-zA-Z]+_', '');
stringName being the string you want to process, and the ^[a-zA-Z]+_ part depending on what characters the prefix includes. Here I only included upper- and lowercase letters.
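Both Oracle approaches (the anchored regex and SUBSTR with INSTR) can be compared in a short Python sketch. It assumes Python's re handles this simple pattern the same way; the function names are hypothetical.

```python
import re

def delete_prefix(name: str) -> str:
    """Drop a leading run of ASCII letters plus the underscore that follows it."""
    return re.sub(r'^[a-zA-Z]+_', '', name)

def after_first_underscore(name: str) -> str:
    """SUBSTR/INSTR analogue: everything after the first '_' (whole string if none)."""
    position = name.find('_')  # -1 when there is no underscore
    return name[position + 1:]

print(delete_prefix('abcdef_fhj_viji.dvc'))           # fhj_viji.dvc
print(after_first_underscore('abcdef_fhj_viji.dvc'))  # fhj_viji.dvc
```

The two differ when the prefix contains something other than letters: for 'abc1_x' the regex leaves the string unchanged, while the index-based version returns 'x'.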