Searching on telephone numbers with different formatting [duplicate]

We have a table where we store telephone numbers. The numbers are stored in a VARCHAR(200) column, but in different formats, for example:
040-1551515
(073) 614 53 97
+31884637222
I would like to search on these values with all non-numeric characters stripped. So if my search value is '0736145397', it should match '(073) 614 53 97'. Is this even possible?
Ideally we would convert them all to one format, but that is not going to happen soon.

Although inelegant, you can use REPLACE to strip characters from the phone number column in your WHERE clause, like so:
CREATE TABLE #DemoTable (number VARCHAR(200))
INSERT INTO #DemoTable (number) VALUES ('040-1551515'), ('(073) 614 53 97'), ('+31884637222')
SELECT *
FROM #DemoTable
WHERE (REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(number, '-', ''), '+', ''), ' ', ''), '(', ''), ')', '') = '0736145397')
DROP TABLE #DemoTable
You would need to add a new REPLACE for each character you wish to remove.
As I said, inelegant, but if you only have a few characters to exclude this will do the trick.

Try this:
select phone_no from table
where replace(replace(replace(replace(phone_no,' ',''),'(',''),')',''),'+','')='0736145397'
However, you could also add another column that stores the phone numbers with no formatting and search on that column instead.

You can use replace() a lot to solve the problem:
where replace(replace(replace(replace(replace(tnumber, ' ', ''
), '(', ''
), ')', ''
), '-', ''
), '+', ''
) = '0736145397'
If you cannot change the data itself, you can consider adding a "cleaned" number as a computed column. Something like:
alter table t add CleanNumber as (
replace(replace(replace(replace(replace(tnumber, ' ', ''
), '(', ''
), ')', ''
), '-', ''
), '+', ''
) )
Then the query would just be:
where CleanNumber = '0736145397'
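A minimal sketch of the computed-column idea, assuming SQL Server. The temp table #Phones, the index name, and the CAST to varchar(200) are illustrative additions, not part of the answer above. Persisting and indexing the column is one way to let the search use the index instead of evaluating the REPLACE chain for every row at query time; whether that pays off depends on the workload.

```sql
-- Hypothetical demo table; all names are illustrative only.
CREATE TABLE #Phones (
    tnumber     varchar(200),
    -- Strip spaces, parentheses, dashes and plus signs once, when the row is written.
    CleanNumber AS (CAST(
        REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(tnumber, ' ', ''), '(', ''), ')', ''), '-', ''), '+', '')
        AS varchar(200))) PERSISTED
);

-- Index the cleaned value so equality searches do not have to scan the table.
CREATE INDEX IX_Phones_CleanNumber ON #Phones (CleanNumber);

INSERT INTO #Phones (tnumber)
VALUES ('040-1551515'), ('(073) 614 53 97'), ('+31884637222');

-- Matches the row stored as '(073) 614 53 97'.
SELECT tnumber
FROM #Phones
WHERE CleanNumber = '0736145397';

DROP TABLE #Phones;
```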

We can use the REPLACE function for this:
create table phonenumber (number varchar(200))
insert into phonenumber values('123456')
insert into phonenumber values('+555555')
insert into phonenumber values('77(123)989')
select replace(replace(number,'(',''),')','') formatednumber ,* from phonenumber with(nolock)
where replace(replace(number,'(',''),')','') ='77123989'
Thanks

Related

Remove items in a delimited list that are non numeric in SQL for Redshift

I am working with a field called codes that is a delimited list of values, separated by commas. Within each item there is a title ending in a colon and then a code number following the colon. I want a list of only the code numbers after each colon.
Example Value:
name-form-na-stage0:3278648990379886572,rules-na-unwanted-sdfle2:6886328308933282817,us-disdg-order-stage1:1273671130817907765
Desired Output:
3278648990379886572,6886328308933282817,1273671130817907765
The title always starts with a letter and ends with a colon, so I can see how REGEXP_REPLACE might work: replace any string that starts with a letter and ends with a colon with ''. However, I am not good at REGEXP_REPLACE patterns.
Side note: if anyone knows of a good guide for understanding pattern notation for regular expressions, it would be much appreciated!
I tried this and it is not working REGEXP_REPLACE(REPLACE(REPLACE(codes,':', ' '), ',', ' ') ,' [^0-9]+ ', ' ')
This solution assumes a few things:
No colons anywhere else except immediately before the numbers
No number at the very start
At a high level, this query finds how many colons there are, splits the entire string into that many parts, and then only keeps the number up to the comma immediately after the number, and then aggregates the numbers into a comma-delimited list.
Assuming a table like this:
create temp table tbl_string (id int, strval varchar(1000));
insert into tbl_string
values
(1, 'name-form-na-stage0:3278648990379886572,rules-na-unwanted-sdfle2:6886328308933282817,us-disdg-order-stage1:1273671130817907765');
with recursive cte_num_of_delims AS (
select max(regexp_count(strval, ':')) AS num_of_delims
from tbl_string
), cte_nums(nums) AS (
select 1 as nums
union all
select nums + 1
from cte_nums
where nums <= (select num_of_delims from cte_num_of_delims)
), cte_strings_nums_combined as (
select id,
strval,
nums as index
from cte_nums
cross join tbl_string
), prefinal as (
select *,
split_part(strval, ':', index) as parsed_vals
from cte_strings_nums_combined
where parsed_vals != ''
and index != 1
), final as (
select *,
case
when charindex(',', parsed_vals) = 0
then parsed_vals
else left(parsed_vals, charindex(',', parsed_vals) - 1)
end as final_vals
from prefinal
)
select listagg(final_vals, ',')
from final
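For comparison, the REGEXP_REPLACE route the question asks about can be sketched as follows. This is a different technique from the recursive CTE above, shown under similar assumptions: a colon appears only immediately before each number, and titles contain no commas. The pattern '[^,:]+:' matches a run of characters that are neither a comma nor a colon, followed by a colon, which is one title together with its colon. The literal is the example value from the question.

```sql
-- Redshift's REGEXP_REPLACE replaces every occurrence of the pattern,
-- so each "title:" prefix is removed and only the code numbers remain.
select regexp_replace(
    'name-form-na-stage0:3278648990379886572,rules-na-unwanted-sdfle2:6886328308933282817,us-disdg-order-stage1:1273671130817907765',
    '[^,:]+:',
    ''
) as code_numbers;
-- Returns 3278648990379886572,6886328308933282817,1273671130817907765
```

A number is never consumed by the pattern because it is followed by a comma (or the end of the string) rather than a colon.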

Show the ASCII code of each character in a string

In T-SQL, how to show the ASCII code of each character in a string?
Example of a string:
Adrs_line1
ABCD
I would like to show this:
Adrs_line1_ASCII
65|66|67|68
Currently, I do this (I increment the number manually):
ASCII(SUBSTRING(Adrs_line1, 1, 1))
The whole purpose of this is to find which character breaks an address,
because this code doesn't work for 5% of the addresses:
PARSENAME(REPLACE(REPLACE(ADDRESSSTREET, CHAR(13), ''), CHAR(10), '.'), 3) as Adrs_line1,
PARSENAME(REPLACE(REPLACE(ADDRESSSTREET, CHAR(13), ''), CHAR(10), '.'), 2) as Adrs_line2,
PARSENAME(REPLACE(REPLACE(ADDRESSSTREET, CHAR(13), ''), CHAR(10), '.'), 1) as Adrs_line3,
EDIT :
More information about my problem :
In our ERP, the address is shown with line breaks (line 1, line 2 and line 3). But in the database, the three lines are concatenated into one string. So I have to separate the three lines by detecting the character that produces the line break. Normally it is char(10) or char(13), but for 5% of the addresses it's another character that I can't find manually.
Here's something you might be able to use and adapt.
Create a table-valued function that will return any deemed "bad" characters:
create or alter function dbo.FindBadAscii (@v varchar(max))
returns table as
return
select v [Char], Ascii(v) [Ascii]
from (
select top (Len(@v)) Row_Number() over(order by (select null)) n
from master.dbo.spt_values
) d
cross apply(values(Substring(@v, d.n, 1)))x(v)
where Ascii(v)<32 and Ascii(v) not in (9,10,13)
You can then test it like so:
select * from dbo.FindBadAscii ('a' + Char(10) + 'b' + Char(20) + 'c')
This identifies any "control" characters that are not a tab/lf/cr
Result
20
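To produce the pipe-separated list the question asks for (65|66|67|68), one possible sketch reuses the same row-number trick and aggregates the codes with STRING_AGG. This assumes SQL Server 2017 or later, where STRING_AGG is available; the variable @s is a hypothetical stand-in for the Adrs_line1 value.

```sql
-- Assumes SQL Server 2017+ (STRING_AGG); @s is a hypothetical sample value.
DECLARE @s varchar(100) = 'ABCD';

SELECT STRING_AGG(CAST(ASCII(SUBSTRING(@s, t.n, 1)) AS varchar(3)), '|')
           WITHIN GROUP (ORDER BY t.n) AS Adrs_line1_ASCII
FROM (
    -- One row per character position, 1..LEN(@s).
    SELECT TOP (LEN(@s)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
    FROM master.dbo.spt_values
) AS t;
-- Returns 65|66|67|68 for 'ABCD'.
```

Note that LEN ignores trailing spaces, so a string ending in spaces would need DATALENGTH instead to have those positions listed.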

Identify a record that contain the phone Number with different format

I have three types of phone numbers in my SQL server table as below
Can someone please suggest how can I perform a search operation for below scenarios.
Search by 10 digit number --- 029700456
Search by 10 digit number that was in the range --- 0294005623
You can convert all the numbers to a canonical format using translate() and replace(). From the canonical format, you can define the upper and lower bound on the range, depending on whether the phone has 10 or 14 characters:
select t.*
from t cross apply
(values (case when phone like '(+61)%'
then stuff(replace(translate(t.phone, '()+-', '    '), ' ', ''), 1, 2, '')
else replace(translate(t.phone, '()+-', '    '), ' ', '')
end)
) v(canonical) cross apply
(values (left(canonical, 10),
(case when len(canonical) = 10 then canonical
else left(canonical, 6) + right(canonical, 4)
end)
)
) v2(phone_upper, phone_lower)
Then your conditions are:
where @phone between v2.phone_lower and v2.phone_upper
I would advise you to figure out how to fix the data model. This is a really, really, really bad way to store phone numbers and phone number ranges.
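As a small illustration of the canonicalization step on its own: TRANSLATE (SQL Server 2017 and later) requires its second and third arguments to have the same length, so the four characters '()+-' are mapped to four spaces, and REPLACE then removes every space. The sample values below are hypothetical, since the question's data is not shown.

```sql
-- Hypothetical sample values; requires SQL Server 2017+ for TRANSLATE.
SELECT s.phone,
       REPLACE(TRANSLATE(s.phone, '()+-', '    '), ' ', '') AS canonical
FROM (VALUES ('(02) 9700-4560'),
             ('+61-2-9700-4560')) AS s(phone);
-- '(02) 9700-4560'  -> '0297004560'
-- '+61-2-9700-4560' -> '61297004560'
```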

Need to replace some underscores with hyphens

I have a column with value
AAA_ZZZZ_7890_10_28_2014_123456.jpg
I need to replace the middle underscores so that it displays it as date i.e.
AAA_ZZZZ_7890_10-28-2014_123456.jpg
Can someone please suggest a simple update query for this?
The number of underscores is the same for all the values in the column, but the length will vary; for example, some can have
AAA_q10WRQ_001_10_28_2014_12.jpg
The following should do it:
http://sqlfiddle.com/#!3/d41d8/30384/0
declare @filename varchar(64) = 'AAA_ZZZZ_7890_10_28_2014_123456.jpg'
declare @datepattern varchar(64) = '%[_][0-1][0-9][_][0-3][0-9][_][1-2][0-9][0-9][0-9][_]%'
select
filename,
substring(filename,1,datepos+2)+'-'+
substring(filename,datepos+4,2)+'-'+
substring(filename,datepos+7,1000)
from
(
select
@filename filename,
patindex(@datepattern,@filename)
as datepos
) t
;
Resulting in
AAA_ZZZZ_7890_10-28-2014_123456.jpg
Caveats to watch out for:
It is important to exactly define how you find the date. In my definition it is MM_DD_YYYY surrounded by further two underscores, and I check that the first digits of M,D,Y are 0-1,0-3,1-2 respectively (i.e. I do NOT check if month is e.g. 13.) -- of course we assume that there is only one such string in any file name.
datepos actually finds the position of the underscore before the date -- this is not an issue if taken into account in the indexing of substring.
In the 3rd substring the length cannot be NULL or infinity, and I couldn't get LEN() to work in SQL Fiddle, so I hardcoded a large enough number (1000). Corrections to this are welcome.
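Since the question asks for an update query, the same PATINDEX idea can be sketched as an UPDATE. This is an adaptation, not the answer's own code: the temp table #Files and column fname are hypothetical, and STUFF is used in place of the three SUBSTRING calls. Because datepos is the position of the underscore before the month, the two underscores to replace sit at datepos + 3 and datepos + 6.

```sql
-- Hypothetical demo table; names are illustrative only.
CREATE TABLE #Files (fname varchar(64));
INSERT INTO #Files (fname)
VALUES ('AAA_ZZZZ_7890_10_28_2014_123456.jpg'),
       ('AAA_q10WRQ_001_10_28_2014_12.jpg');

UPDATE f
SET fname = STUFF(STUFF(f.fname, p.datepos + 3, 1, '-'), p.datepos + 6, 1, '-')
FROM #Files AS f
CROSS APPLY (
    -- Position of the underscore immediately before MM_DD_YYYY, or 0 if absent.
    SELECT PATINDEX('%[_][0-1][0-9][_][0-3][0-9][_][1-2][0-9][0-9][0-9][_]%', f.fname) AS datepos
) AS p
WHERE p.datepos > 0;  -- leave rows without a date pattern untouched

SELECT fname FROM #Files;
-- AAA_ZZZZ_7890_10-28-2014_123456.jpg
-- AAA_q10WRQ_001_10-28-2014_12.jpg

DROP TABLE #Files;
```

Each STUFF swaps one character for one character, so the second position does not shift after the first replacement.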
Try this (assuming that the DATE portion always starts at the same character index)
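declare @string varchar(64) = 'AAA_ZZZZ_7890_10_28_2014_123456.jpg'
select replace(@string, reverse(substring(reverse(@string), charindex('_', reverse(@string), 0) + 1, 10)), replace(reverse(substring(reverse(@string), charindex('_', reverse(@string), 0) + 1, 10)), '_', '-'))
If there are always exactly 6 underscores, the following replaces the first of the two date underscores (the 4th underscore overall); the second can be handled the same way with one more level of CHARINDEX nesting: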
select STUFF ( 'AAA_ZZZZ_7890_10_28_2014_123456.jpg' , CHARINDEX ( '_' ,'AAA_ZZZZ_7890_10_28_2014_123456.jpg', CHARINDEX ( '_' ,'AAA_ZZZZ_7890_10_28_2014_123456.jpg', CHARINDEX ( '_' ,'AAA_ZZZZ_7890_10_28_2014_123456.jpg', CHARINDEX ( '_' ,'AAA_ZZZZ_7890_10_28_2014_123456.jpg', 0 ) + 1 ) + 1 ) + 1 ) , 1 , '-' )

How can I remove leading and trailing quotes in SQL Server?

I have a table in a SQL Server database with an NTEXT column. This column may contain data that is enclosed with double quotes. When I query for this column, I want to remove these leading and trailing quotes.
For example:
"this is a test message"
should become
this is a test message
I know of the LTRIM and RTRIM functions, but these work only for spaces. Any suggestions on which functions I can use to achieve this?
I have just tested this code in MS SQL 2008 and validated it.
Remove left-most quote:
UPDATE MyTable
SET FieldName = SUBSTRING(FieldName, 2, LEN(FieldName))
WHERE LEFT(FieldName, 1) = '"'
Remove right-most quote: (Revised to avoid error from implicit type conversion to int)
UPDATE MyTable
SET FieldName = SUBSTRING(FieldName, 1, LEN(FieldName)-1)
WHERE RIGHT(FieldName, 1) = '"'
I thought this is a simpler script if you want to remove all quotes
UPDATE Table_Name
SET col_name = REPLACE(col_name, '"', '')
You can simply use the REPLACE function in SQL Server, like this:
select REPLACE('"this is a test message"','"','')
Note: the second parameter here is a double quote inside two single quotes, and the third parameter is simply two single quotes (an empty string). The idea is to replace the double quotes with an empty string.
Very simple and easy to execute !
My solution is to compare the length of the column value with the length of the same value after the double quotes are replaced with spaces and trimmed, and to use the difference to calculate the start and length parameters for a SUBSTRING call.
The advantage of doing it this way is that you can remove any leading or trailing character even if it occurs multiple times whilst leaving any characters that are contained within the text.
Here is my answer with some test data:
SELECT
x AS before
,SUBSTRING(x
,LEN(x) - (LEN(LTRIM(REPLACE(x, '"', ' ')) + '|') - 1) + 1 --start_pos
,LEN(LTRIM(REPLACE(x, '"', ' '))) --length
) AS after
FROM
(
SELECT 'test' AS x UNION ALL
SELECT '"' AS x UNION ALL
SELECT '"test' AS x UNION ALL
SELECT 'test"' AS x UNION ALL
SELECT '"test"' AS x UNION ALL
SELECT '""test' AS x UNION ALL
SELECT 'test""' AS x UNION ALL
SELECT '""test""' AS x UNION ALL
SELECT '"te"st"' AS x UNION ALL
SELECT 'te"st' AS x
) a
Which produces the following results:
before     after
-----------------
test       test
"
"test      test
test"      test
"test"     test
""test     test
test""     test
""test""   test
"te"st"    te"st
te"st      te"st
One thing to note that when getting the length I only need to use LTRIM and not LTRIM and RTRIM combined, this is because the LEN function does not count trailing spaces.
I know this is an older question post, but my daughter came to me with the question, and referenced this page as having possible answers. Given that she's hunting an answer for this, it's a safe assumption others might still be as well.
All are great approaches, and as with everything there are about as many ways to skin a cat as there are cats to skin.
If you're looking for a left trim and a right trim of a character or string, and your trailing character/string is uniform in length, here's my suggestion:
SELECT SUBSTRING(ColName,VAR, LEN(ColName)-VAR)
Or in this question...
SELECT SUBSTRING('"this is a test message"',2, LEN('"this is a test message"')-2)
With this, you simply adjust the SUBSTRING starting point (2), and LEN position (-2) to whatever value you need to remove from your string.
It's non-iterative and doesn't require explicit case testing and above all it's inline all of which make for a cleaner execution plan.
The following script removes quotation marks only from around the column value if table is called [Messages] and the column is called [Description].
-- If the content is in the form of "anything" (LIKE '"%"')
-- then take the whole text without the first and last characters
-- (start at the 2nd character and take LEN([Description]) - 2 characters)
UPDATE [Messages]
SET [Description] = SUBSTRING([Description], 2, LEN([Description]) - 2)
WHERE [Description] LIKE '"%"'
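Before running the UPDATE, the same rule can be previewed as a SELECT. The sketch below uses hypothetical inline sample values rather than the [Messages] table. One assumption to flag: LEN does not accept NTEXT arguments, so for the NTEXT column in the question the value would likely need a CAST to nvarchar(max) first.

```sql
-- Hypothetical sample values; only strings wrapped in quotes on both ends change.
SELECT v.x AS before_value,
       CASE WHEN v.x LIKE '"%"'
            THEN SUBSTRING(v.x, 2, LEN(v.x) - 2)  -- drop first and last character
            ELSE v.x
       END AS after_value
FROM (VALUES ('"this is a test message"'),
             ('no quotes'),
             ('"te"st"'),
             ('"leading only')) AS v(x);
-- "this is a test message" -> this is a test message
-- no quotes                -> no quotes
-- "te"st"                  -> te"st
-- "leading only            -> "leading only
```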
You can use following query which worked for me-
For updating-
UPDATE table SET colName= REPLACE(LTRIM(RTRIM(REPLACE(colName, '"', ''))), '', '"') WHERE...
For selecting-
SELECT REPLACE(LTRIM(RTRIM(REPLACE(colName, '"', ''))), '', '"') FROM TableName
you could replace the quotes with an empty string...
SELECT AllRemoved = REPLACE(CAST(MyColumn AS varchar(max)), '"', ''),
LeadingAndTrailingRemoved = CASE
WHEN MyColumn like '"%"' THEN SUBSTRING(MyColumn, 2, LEN(CAST(MyColumn AS nvarchar(max)))-2)
ELSE MyColumn
END
FROM MyTable
Some UDFs for re-usability.
Left Trimming by character (any number)
CREATE FUNCTION [dbo].[LTRIMCHAR] (@Input NVARCHAR(max), @TrimChar CHAR(1) = ',')
RETURNS NVARCHAR(max)
AS
BEGIN
RETURN REPLACE(REPLACE(LTRIM(REPLACE(REPLACE(@Input,' ','¦'), @TrimChar, ' ')), ' ', @TrimChar),'¦',' ')
END
Right Trimming by character (any number)
CREATE FUNCTION [dbo].[RTRIMCHAR] (@Input NVARCHAR(max), @TrimChar CHAR(1) = ',')
RETURNS NVARCHAR(max)
AS
BEGIN
RETURN REPLACE(REPLACE(RTRIM(REPLACE(REPLACE(@Input,' ','¦'), @TrimChar, ' ')), ' ', @TrimChar),'¦',' ')
END
Note the dummy character '¦' (Alt+0166) cannot be present in the data (you may wish to test your input string, first, if unsure or use a different character).
To remove both quotes you could do this
SUBSTRING(fieldName, 2, lEN(fieldName) - 2)
you can either assign or project the resulting value
On SQL Server 2017 and later, you can use TRIM('"' FROM '"this "is" a test"'), which returns: this "is" a test
CREATE FUNCTION dbo.TRIM(@String VARCHAR(MAX), @Char varchar(5))
RETURNS VARCHAR(MAX)
BEGIN
RETURN SUBSTRING(@String,PATINDEX('%[^' + @Char + ' ]%',@String)
,(DATALENGTH(@String)+2 - (PATINDEX('%[^' + @Char + ' ]%'
,REVERSE(@String)) + PATINDEX('%[^' + @Char + ' ]%',@String)
)))
END
GO
Select dbo.TRIM('"this is a test message"','"')
Reference : http://raresql.com/2013/05/20/sql-server-trim-how-to-remove-leading-and-trailing-charactersspaces-from-string/
I use this:
UPDATE DataImport
SET PRIO =
CASE WHEN LEN(PRIO) < 2
THEN
(CASE PRIO WHEN '""' THEN '' ELSE PRIO END)
ELSE REPLACE(PRIO, '"' + SUBSTRING(PRIO, 2, LEN(PRIO) - 2) + '"',
SUBSTRING(PRIO, 2, LEN(PRIO) - 2))
END
Try this:
SELECT left(right(cast(SampleText as nvarchar(max)),LEN(cast(SampleText as nvarchar(max)))-1),LEN(cast(SampleText as nvarchar(max)))-2)
FROM TableName