Check valid phone number in SQL Server - sql

I am trying to check whether the phone numbers in a table are in the proper format. If they are not, I want to move them to another column.
Invalid phone numbers are:
Fewer or more than 10 digits (numbers only)
Numbers that contain letters or special characters (except ( ) -)
Sample input:
phoneNumber
--------------
+1(111)11-1111
1111111111
11(11)111111
11abcd1111
Expected result:
phoneNumber
---------------
+1(111)11-1111
1111111111
I wrote this query but it doesn't seem to work
SELECT PhoneNumber
FROM Customers
WHERE PhoneNumber LIKE '[(][0-9]{3}[)]\s[0-9]{3}[-][0-9]{4}'

What you are attempting to test in your code example does not match your list of requirements, and SQL Server's LIKE does not support general regular expressions of the kind you have written. Specifically, {3} and \s are not supported.
Using just your stated requirements, you can first check that the string does not contain any invalid characters and then separately check that the string contains exactly 10 digits after removing the valid non-numeric characters.
Something like:
SELECT PhoneNumber
FROM Customers
WHERE PhoneNumber NOT LIKE '%[^0-9()-]%'
AND REPLACE(REPLACE(REPLACE(PhoneNumber, '(', ''), ')', ''), '-', '') LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
-- AND LEN(REPLACE(REPLACE(REPLACE(PhoneNumber, '(', ''), ')', ''), '-', '')) = 10 -- Alternate test
Note that this will still allow some unusual cases like ')123(45678-9-0-'.
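To see the two-step check in isolation, here is a minimal Python sketch of the same logic: reject any character outside digits, parentheses and hyphen, then count what remains after stripping the punctuation. The function name is_valid_phone is an assumption made for illustration and is not part of the original query.

```python
import re

# Only ASCII digits, parentheses and hyphens are allowed anywhere in the string.
ALLOWED = re.compile(r"[0-9()\-]*")


def is_valid_phone(phone: str) -> bool:
    """Hypothetical helper mirroring the two conditions of the WHERE clause."""
    # Step 1: no invalid characters (the NOT LIKE '%[^0-9()-]%' test).
    if not ALLOWED.fullmatch(phone):
        return False
    # Step 2: exactly 10 digits once the permitted punctuation is removed.
    digits = re.sub(r"[()\-]", "", phone)
    return len(digits) == 10
```

Like the SQL version, this sketch only enforces the stated requirements, so it accepts oddly punctuated strings as long as they contain exactly 10 digits, and it rejects any value containing a plus sign.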

Related

split text if contains certain string postgres

exit_reason
sr_inefficient_management
tech_too_complex
company_member_resignation
sr_product_engagement
sr_contractual_reasons
sr_contractual_reasons-expectation_issues
sr_churn-takeover_business
I would like to split the column if the value contains the string "sr_" and keep the rest as it is. If the column contains "-" such as "sr_contractual_reasons-expectation_issues", I only want to keep it as "contractual reasons".
So far, my idea is to use
case when exit_reason like '%inefficient_management%' then 'inefficient management'
but if there are many different values, I am in trouble.
Expected output
exit_reason column
tech too complex
company member resignation
product engagement
contractual reasons
contractual reasons
churn
You can just replace 'sr_'
replace(exit_reason, 'sr_', '')
It is unlikely that 'sr_' would appear anywhere other than at the start of a reason. But you can use regexp_replace() with an anchor to be sure:
regexp_replace(exit_reason, '^sr_', '')
You can try something like this:
REPLACE(
CASE
WHEN exit_reason LIKE 'sr\_%'
THEN split_part(split_part(exit_reason,'-',1),'sr_',2)
ELSE split_part(exit_reason,'-',1)
END
, '_', ' '
)
This code first keeps only the part of exit_reason before any hyphen, then drops a leading 'sr_' if there is one, and finally replaces all underscores with blanks. The underscore in the LIKE pattern is escaped because an unescaped _ matches any single character.
To also remove the suffix, you could use:
SELECT replace(
regexp_replace(
'sr_contractual_reasons-expectation_issues',
'^(sr_)?([^-]*).*$',
'\2'
),
'_',
' '
);
replace
═════════════════════
contractual reasons
(1 row)
The regular expression matches an optional leading sr_, then all characters until the first -, then anything that follows that, and keeps only the middle part. replace then replaces underscores with spaces.
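The same pattern can be tried outside the database. The following is a small Python sketch, assuming Python's re module treats this particular pattern the same way PostgreSQL does; the function name clean_exit_reason is hypothetical.

```python
import re


def clean_exit_reason(exit_reason: str) -> str:
    """Hypothetical helper: drop a leading 'sr_' and any '-suffix'."""
    # Group 1 is the optional 'sr_' prefix, group 2 is everything up to the
    # first hyphen, and the trailing '.*' swallows the suffix.
    core = re.sub(r"^(sr_)?([^-]*).*$", r"\2", exit_reason)
    # Underscores become spaces, as in the outer replace() call.
    return core.replace("_", " ")
```

Applied to the sample values from the question, 'sr_contractual_reasons-expectation_issues' becomes 'contractual reasons' and 'tech_too_complex' becomes 'tech too complex'.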

Replacing multiple special characters using Oracle SQL functions

#All - Thanks for your help
ID Email
1 karthik.sanu#gmail.com. , example#gmail.com#
2 karthik?sanu#gmail.com
In the above example, if you see the 1st row, the email address is invalid because of dot at the end of 1st email address and # at the end of 2nd email address.
In 2nd row, there is a ? in between email address.
I would like to know whether there is any way to handle these characters, removing them from the email addresses with a SQL function and updating the table accordingly.
Thanks in advance.
You can also look at the translate function:
translate('my ,string#with .special chars','#,?. ', ' ')
You could nest multiple invocations of replace(), but this quickly becomes convoluted.
On the other hand, regexp_replace() comes in handy for this:
regexp_replace(column_name, '#|,|\?|\.', ' ')
The pipe character (|) stands for "or". The dot (.) and the question mark (?) have special meanings in regular expressions, so they need to be escaped with a backslash (\).
Something like this will "remove" everything but digits, letters and spaces (if that's what you wanted).
SQL> with test (col) as
2 (select 'This) is a se#nten$ce with. everything "but/ only 123 numbers, and ABC lett%ers' from dual)
3 select regexp_replace(col, '[^a-zA-Z0-9 ]') result
4 from test;
RESULT
-----------------------------------------------------------------------
This is a sentence with everything but only 123 numbers and ABC letters
SQL>
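For comparison, the same whitelist idea (delete every character that is not a letter, digit or space) can be sketched in Python. The function name strip_special is an assumption used only for this illustration.

```python
import re


def strip_special(text: str) -> str:
    """Hypothetical helper: keep only ASCII letters, digits and spaces."""
    # The negated character class matches anything outside the whitelist,
    # and replacing with the empty string deletes it.
    return re.sub(r"[^a-zA-Z0-9 ]", "", text)
```

Note that a whitelist this strict also removes the dots and the separator inside an email address, so for the original question the character class would need to be widened accordingly.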

Search for multiple phone number formats in database

I have a database with a user table. This table contains a column phonenumber. The problem is that its values use multiple number patterns.
The current patterns I found:
06403/975-0
+496403975 0
06403 975-0
06403 975 0
+49 6403 975 0
When searching for a user in the database, is there a way to search for all number patterns?
SELECT id FROM user WHERE phone = '0123456789'
I use Oracle and MS SQL
Assuming your question means this:
"Is it possible to remove all the non-digit characters from the stored phone number, before making the comparison in the WHERE clause?"
a possible solution looks like this:
...
where translate(phone, '0123456789' || phone, '0123456789') = <input value here>
TRANSLATE will translate every digit to itself, and all other characters in phone to nothing (they will simply be deleted from the string). This is exactly what you want.
If you find that the query is slow, you may want to create a (function-based) index on translate(phone, '0123456789' || phone, '0123456789').
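The trick of appending phone to the from-string can be hard to picture. Below is a rough Python emulation of Oracle's TRANSLATE, written under the assumption (taken from Oracle's documented behavior) that characters in the from-string with no counterpart in the to-string are deleted, and that the first occurrence of a repeated character decides its mapping. The function name oracle_translate is hypothetical.

```python
def oracle_translate(expr: str, from_string: str, to_string: str) -> str:
    """Hypothetical emulation of Oracle TRANSLATE for non-empty arguments."""
    mapping = {}
    for i, ch in enumerate(from_string):
        if ch in mapping:
            continue  # the first occurrence of a character wins
        # Characters beyond the end of to_string are mapped to "delete".
        mapping[ch] = to_string[i] if i < len(to_string) else None
    out = []
    for ch in expr:
        target = mapping.get(ch, ch)  # unmapped characters pass through
        if target is not None:
            out.append(target)
    return "".join(out)


def digits_only(phone: str) -> str:
    # Digits map to themselves; every other character of phone lands past
    # the end of the to-string and is therefore removed.
    return oracle_translate(phone, "0123456789" + phone, "0123456789")
```

With this emulation, digits_only("+49 6403 975 0") yields "4964039750", which is the value the WHERE clause would then compare against the input.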
EDIT: I missed the part where you said you are using both Oracle and SQL Server. I did a quick search and found that SQL Server does not have a function similar to Oracle's TRANSLATE. I will leave it to SQL Server experts to help you with that part; I don't know SQL Server.
In Oracle you could do it like this. Strip out the non-numeric characters with translate() to get the phone number. You need to handle the leading zero or international dialling code:
select username from your_table
where translate(phone, '1234567890+/ -', '1234567890') in ('064039750', '4964039750')
You may need to tweak this if you don't know what the international dialling code is.
Obviously the actual problem is one of data quality: the application should enforce a strict format on phone numbers. One bout of data cleansing on write saves a whole bunch of grief on read.
You have a database containing phone numbers. These are sometimes in international format, but often in some national format, probably German, where two leading zeros introduce a country code, while a single leading zero would introduce an area code instead (assuming the home country Germany then). Moreover, a phone number can contain symbols for readability, namely '-', '/', and ' '.
So
+49 12/3456-7 means +491234567 of course
00441234567 means +441234567
04412345 means +494412345
I suggest you convert all numbers into international format in these steps:
replace a leading + with a leading 00, thus making only digits important
remove every character that is not a digit
replace a leading 00 with a leading +
replace a leading 0 with a leading +49
Use Oracle's REGEXP_REPLACE for this:
select
regexp_replace(
regexp_replace(
regexp_replace(
regexp_replace(trim(phone),
'^\+', '00'), -- leading '+' -> leading '00'
'[^[:digit:]]', ''), -- remove all non-digits
'^00' , '+'), -- leading '00' -> leading '+'
'^0', '+49') -- leading '0' -> leading '+49'
as international_phone
from mytable;
You can do this in the WHERE clause of course:
SELECT id FROM user WHERE regexp_replace(...) = '+49123456789'
or even
SELECT id FROM user WHERE regexp_replace(...phone...) = regexp_replace(...'0123456789'...)
And you may write a PL/SQL function for this for convenience and use it so:
SELECT id FROM user WHERE international_phone(phone) = international_phone('0123456789')
This is for Oracle. There may be something similar for SQL Server.
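The four normalization steps from the last answer can also be sketched outside the database. This Python version borrows the name international_phone from the suggested PL/SQL function; the country_code parameter and its default of "49" are assumptions that generalize the hard-coded German prefix.

```python
import re


def international_phone(phone: str, country_code: str = "49") -> str:
    """Sketch of the four steps; country_code is an illustrative parameter."""
    s = phone.strip()
    s = re.sub(r"^\+", "00", s)               # leading '+'  -> leading '00'
    s = re.sub(r"[^0-9]", "", s)              # remove all non-digits
    s = re.sub(r"^00", "+", s)                # leading '00' -> leading '+'
    s = re.sub(r"^0", "+" + country_code, s)  # leading '0'  -> '+' and country code
    return s
```

Normalizing both the stored value and the search term this way means that "+49 12/3456-7" and "0123456-7" compare equal, which is the point of applying the function on both sides of the comparison.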

Sybase SQLAnywhere matching phone numbers formatted different ways

I have a table with a phone number varchar field. This field has phone numbers that are formatted many different ways. 999-999-9999 or (999) 999-9999 and so on. I have a phone number that I am trying to find which is formatted like this: "9999999999". I would like to do something like this:
SELECT …
WHERE replace(PHO_PhoneNumber, "[^\\d]", "") = "9999999999"
Basically remove all non digits from the field and then compare.
Is there such a function "replace" that uses regex, or is there a better way of trying to find this number when the phone number field can have many different formatting characters in it ? I have no control over how phone numbers get entered into this table.
Thanks,
Warren
You don't say what version of SQL Anywhere you're using, but as of version 11.0, SQL Anywhere supports the REGEXP operator in the where clause, so you could do something like:
select ...
where PHO_PhoneNumber regexp '\(?\d{3}\)?-?\d{3}-?\d{4}'
Disclaimer: I work for SAP in SQL Anywhere engineering.
I don't think Sybase has such a function. You could write one. However, the "special" characters in phone numbers are typically: "()+- ". You can use multiple replaces for these:
WHERE replace(replace(replace(replace(replace(PHO_PhoneNumber, ' ', ''), ')', ''), '(', ''), '+', ''), '-', '') = '9999999999'

SQL Server CONCAT function

I am trying to write a statement like this
SELECT CONCAT(street_name, ' ', street_number) as 'street_detail'
FROM geo_map
WHERE CONCAT(street_name, ' ', street_number) LIKE '%'
My table is something like this
postal_code int
building_name nchar(200)
street_number nchar(60)
street_name nchar(120)
The result I get is just the street name without the street number, although my street number has a value. Any idea what went wrong in my concat?
I am using SQL Server
It is best to use NVARCHAR(...) instead of NCHAR(...) types for storing information like what you have. The reason is that for NCHAR(...) types, strings are padded with trailing spaces to fill the whole length of the field.
A string in an NCHAR(120) field is always 120 characters wide. The concatenation of street_name, a space and street_number will therefore be 181 characters wide (120 + 1 + 60), and the street number will start at the 122nd character of the concatenation.
Perhaps you are not seeing a street number in your concatenation because your display field (in your program, SSMS, webpage, ...) just isn't wide enough.
Now with storing your street name in an NVARCHAR(200) and pretty much all other related information in NVARCHAR(...) fields, you would not have that problem. Strings stored in those fields are not padded with trailing spaces, and you would see your street number at the place you expected in your concatenation.
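The padding effect is easy to reproduce outside SQL Server. The following Python sketch simulates fixed-width storage with the column widths from the question, nchar(120) and nchar(60); the helper name nchar and the sample values "Main Street" and "42" are made up for illustration.

```python
def nchar(value: str, width: int) -> str:
    """Hypothetical stand-in for NCHAR(width): pad with trailing spaces."""
    return value.ljust(width)[:width]


# Sample values are invented; the widths come from the table definition.
street_name = nchar("Main Street", 120)
street_number = nchar("42", 60)

# Fixed-width concatenation: the number is pushed far to the right.
detail = street_name + " " + street_number

# Trimming the trailing spaces gives the result the asker expected.
trimmed = street_name.rstrip() + " " + street_number.rstrip()
```

Here detail is 181 characters long and the street number only begins at position 122, while trimmed is simply "Main Street 42". In SQL Server the equivalent of rstrip() would be RTRIM, or avoiding the padding altogether by using NVARCHAR columns.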