I have a table with a phone number varchar field. This field has phone numbers that are formatted many different ways. 999-999-9999 or (999) 999-9999 and so on. I have a phone number that I am trying to find which is formatted like this: "9999999999". I would like to do something like this:
SELECT …
WHERE replace(PHO_PhoneNumber, "[^\\d]", "") = "9999999999"
Basically remove all non digits from the field and then compare.
Is there a "replace" function that uses regex, or is there a better way to find this number when the phone number field can contain many different formatting characters? I have no control over how phone numbers get entered into this table.
Thanks,
Warren
You don't say what version of SQL Anywhere you're using, but as of version 11.0, SQL Anywhere supports the REGEXP operator in the where clause, so you could do something like:
select ...
where PHO_PhoneNumber regexp '\(?\d{3}\)?[- ]?\d{3}[- ]?\d{4}'
Disclaimer: I work for SAP in SQL Anywhere engineering.
I don't think Sybase has such a function. You could write one. However, the "special" characters in phone numbers are typically: "()+- ". You can use multiple replaces for these:
WHERE replace(replace(replace(replace(replace(PHO_PhoneNumber, ' ', ''), ')', ''), '(', ''), '+', ''), '-', '') = '9999999999'
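To sanity-check the strip-then-compare idea outside the database, here is a minimal Python sketch. The function name digits_only and the sample values are made up for illustration; they are not part of any Sybase API.

```python
import re

def digits_only(phone: str) -> str:
    """Remove every character that is not an ASCII digit."""
    return re.sub(r"[^0-9]", "", phone)

# Hypothetical stored values in the formats mentioned in the question.
stored = ["999-999-9999", "(999) 999-9999", "9999999999", "+1 999 999 9999"]

# Keep only the rows whose digits equal the number being searched for.
matches = [p for p in stored if digits_only(p) == "9999999999"]
```

Note that the last sample value keeps its leading country code after stripping, so it does not match; the nested replace() query behaves the same way.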
Related
I am trying to check if the phone numbers in the table are in the proper format or not. If not I want it to move to another column.
Invalid phone numbers are:
Less or more than 10 digits (numbers only)
Number that has alphabet or special characters (except () -)
Sample input:
phoneNumber
--------------
+1(111)11-1111
1111111111
11(11)111111
11abcd1111
Expected result:
phoneNumber
---------------
+1(111)11-1111
1111111111
I wrote this query but it doesn't seem to work
SELECT PhoneNumber
FROM Customers
WHERE PhoneNumber LIKE '[(][0-9]{3}[)]\s[0-9]{3}[-][0-9]{4}'
What you are attempting to test in your code example does not match your list of requirements, and SQL Server LIKE does not support the general regular-expression syntax that you have coded. Specifically, {3} and \s are not supported.
Using just your stated requirements, you can first check that the string does not contain any invalid characters and then separately check that the string contains exactly 10 digits after removing the valid non-numeric characters.
Something like:
SELECT PhoneNumber
FROM Customers
WHERE PhoneNumber NOT LIKE '%[^0-9()-]%'
AND REPLACE(REPLACE(REPLACE(PhoneNumber, '(', ''), ')', ''), '-', '') LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
-- AND LEN(REPLACE(REPLACE(REPLACE(PhoneNumber, '(', ''), ')', ''), '-', '')) = 10 -- Alternate test
Note that this will still allow some unusual cases like ')123(45678-9-0-'.
See this db<>fiddle.
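The same two-step check (no invalid characters, then exactly 10 digits once the allowed punctuation is removed) can be sketched in Python to try out sample values. The function name is_valid_phone is hypothetical, and the rules are taken from the stated requirements only, so '+' counts as invalid here.

```python
import re

# Only digits, parentheses and hyphens are allowed anywhere in the string.
ALLOWED = re.compile(r"[0-9()\-]*")

def is_valid_phone(phone: str) -> bool:
    """Apply the two stated rules: allowed characters only, exactly 10 digits."""
    if not ALLOWED.fullmatch(phone):
        return False
    # After the check above, removing the punctuation leaves only digits.
    digits = re.sub(r"[()\-]", "", phone)
    return len(digits) == 10
```

As with the query, this sketch accepts oddly punctuated strings such as ')123(45678-9-0-' because it only counts digits and does not check where the punctuation sits.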
I am having trouble finding the correct syntax to parse out a word between two characters in Netezza.
PATIENT_NAME
SMITH,JOHN L
BROWN,JANE R
JONES,MARY LYNN
I need the first name which is always after the comma and before the first space. How would I do this in Netezza?
I think Netezza supports regexp_extract(). That would be:
select replace(regexp_extract(name, ',[^ ]+'), ',', '')
Or regexp_replace():
select regexp_replace(name, '^[^,]+,([^ ]+)( |$).*$', '\1')
Netezza supports regexp_extract and if there is reason to handle all kinds of whitespace between first and middle name(s) then this would work -
select regexp_extract(name,
'^[^[:space:],]+[[:space:],]+([^[:space:]]+)')
This also handles optional whitespace (spaces, tabs, etc.) on either side of the comma.
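For testing the "after the comma, up to the first whitespace" rule on sample data, a small Python sketch might look like this. The function name first_name is an assumption, and the pattern is a simplified variant of the ones above, not Netezza syntax.

```python
import re

def first_name(patient_name: str) -> str:
    """Return the token that follows the first comma, or '' if there is none."""
    # Skip any whitespace after the comma, then capture up to the next whitespace.
    m = re.search(r",\s*(\S+)", patient_name)
    return m.group(1) if m else ""

names = ["SMITH,JOHN L", "BROWN,JANE R", "JONES,MARY LYNN"]
firsts = [first_name(n) for n in names]
```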
I got a database with a user table. This table contains a column phonenumber. The problem is that its fields use multiple number patterns.
The current patterns I found:
06403/975-0
+496403975 0
06403 975-0
06403 975 0
+49 6403 975 0
When searching for a user in the database, is there a way to search for all number patterns?
SELECT id FROM user WHERE phone = '0123456789'
I use Oracle and MS SQL
Assuming your question means this:
"Is it possible to remove all the non-digit characters from the stored phone number, before making the comparison in the WHERE clause?"
a possible solution looks like this:
...
where translate(phone, '0123456789' || phone, '0123456789') = <input value here>
TRANSLATE will translate every digit to itself, and all other characters in phone to nothing (they will simply be deleted from the string). This is exactly what you want.
If you find that the query is slow, you may want to create a (function-based) index on translate(phone, '0123456789' || phone, '0123456789').
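Why the TRANSLATE trick works may be easier to see in a small emulation. The Python sketch below imitates the documented behaviour of Oracle's TRANSLATE (first occurrence in the "from" string wins; characters with no counterpart in the "to" string are deleted). The function name oracle_translate is made up for illustration.

```python
def oracle_translate(expr: str, from_chars: str, to_chars: str) -> str:
    """Rough emulation of Oracle TRANSLATE(expr, from_chars, to_chars)."""
    mapping = {}
    for i, ch in enumerate(from_chars):
        if ch not in mapping:  # the first occurrence of a character wins
            # Characters beyond the end of to_chars map to None (deleted).
            mapping[ch] = to_chars[i] if i < len(to_chars) else None
    out = []
    for ch in expr:
        if ch not in mapping:
            out.append(ch)            # unmapped characters pass through
        elif mapping[ch] is not None:
            out.append(mapping[ch])   # mapped characters are replaced
    return "".join(out)

phone = "+49 6403 975-0"
# Digits map to themselves; every other character of phone sits past
# position 10 in the "from" string and is therefore deleted.
digits = oracle_translate(phone, "0123456789" + phone, "0123456789")
```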
EDIT: I missed the part where you said you are using both Oracle and SQL Server. I did a quick search and found that SQL Server does not have a function similar to Oracle's TRANSLATE. I will leave it to SQL Server experts to help you with that part; I don't know SQL Server.
In Oracle you could do it like this. Strip out the non-numeric characters with translate() to get the phone number. You need to handle the leading zero or international dialling code:
select username from your_table
where translate(phone, '1234567890+/ -', '1234567890') in ('064039750', '4964039750')
You may need to tweak this if you don't know what the international dialling code is.
Obviously the actual problem is one of data quality: the application should enforce a strict format on phone numbers. One bout of data cleansing on write saves a whole bunch of grief on read.
You have a database containing phone numbers. These are sometimes in international format, but often in some national format, probably German, where two leading zeros introduce a country code, while a single leading zero introduces an area code (assuming Germany is the home country). Moreover, a phone number can contain symbols for readability, namely '-', '/', and ' '.
So
+49 12/3456-7 means +491234567 of course
00441234567 means +441234567
04412345 means +494412345
I suggest you convert all numbers into international format in these steps:
replace a leading + with a leading 00, thus making only digits important
remove every character that is not a digit
replace a leading 00 with a leading +
replace a leading 0 with a leading +49
Use Oracle's REGEXP_REPLACE for this:
select
  regexp_replace(
    regexp_replace(
      regexp_replace(
        regexp_replace(trim(phone),
          '^\+', '00'),        -- leading '+' -> leading '00'
        '[^[:digit:]]', ''),   -- remove all non-digits
      '^00', '+'),             -- leading '00' -> leading '+'
    '^0', '+49')               -- leading '0' -> leading '+49'
  as international_phone
from mytable;
You can do this in the WHERE clause of course:
SELECT id FROM user WHERE regexp_replace(...) = '+49123456789'
or even
SELECT id FROM user WHERE regexp_replace(...phone...) = regexp_replace(...'0123456789'...)
And you may write a PL/SQL function for this for convenience and use it so:
SELECT id FROM user WHERE international_phone(phone) = international_phone('0123456789')
This is for Oracle. There may be something alike for SQL Server.
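The four normalization steps can also be sketched in Python, which is handy for checking the examples above before writing the PL/SQL function. The name international_phone mirrors the suggested function; the country_code parameter is an assumption added for illustration.

```python
import re

def international_phone(phone: str, country_code: str = "49") -> str:
    """Normalize a phone number to international format, e.g. +491234567."""
    s = phone.strip()
    s = re.sub(r"^\+", "00", s)               # leading '+'  -> leading '00'
    s = re.sub(r"[^0-9]", "", s)              # remove all non-digits
    s = re.sub(r"^00", "+", s)                # leading '00' -> leading '+'
    s = re.sub(r"^0", "+" + country_code, s)  # leading '0'  -> '+49'
    return s
```

With this sketch, international_phone('06403/975-0') and international_phone('+49 6403 975 0') produce the same string, which is what makes the comparison in the WHERE clause work.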
I'm attempting to isolate eight digits from a cell that contains other numbers as well as text and no rhyme or reason to where it is placed. An example return would look something like this:
will deliver 11/07 in USA at 12:30 with conf# 12345678
I need the conf# only, but it could be at the end, beginning, middle of the string and I don't know how to isolate it. I'm working in DB2 so I can't use functions such as PATINDEX or CHARINDEX, so what are my other option for pulling out only "12345678" regardless of where it is located?
While DB2 doesn't have PATINDEX or CHARINDEX, it does have LOCATE.
If your DB2 version supports pureXML, you can use the regular expression support in XQuery, something like:
select xmlcast(
xmlquery(
' if (fn:matches( $YOURCOLUMN, "(^|.*[^\d])(\d{8})([^\d].*$|$)")) then fn:replace( $YOURCOLUMN,"(^|.*[^\d])(\d{8})([^\d].*$|$)","$2") else "" '
)
as varchar(20)
)
from YOURTABLE
This assumes that the 8-digit sequence appears only once in the column. You may need to tweak the regex to handle edge cases.
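To experiment with the "exactly eight digits, not part of a longer run" rule outside DB2, a Python sketch could look like the following. The function name conf_number is hypothetical, and the lookaround syntax is Python's, not XQuery's.

```python
import re

# Eight digits that are neither preceded nor followed by another digit.
EIGHT_DIGITS = re.compile(r"(?<![0-9])[0-9]{8}(?![0-9])")

def conf_number(text: str) -> str:
    """Return the first standalone 8-digit sequence, or '' if none is found."""
    m = EIGHT_DIGITS.search(text)
    return m.group(0) if m else ""

sample = "will deliver 11/07 in USA at 12:30 with conf# 12345678"
found = conf_number(sample)
```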
I have a column store_name (varchar). It contains entries with spaces, such as prime sport, best buy, and so on. When a user types a concatenated string without the space, like primesport, I need the result prime sport to be shown. How can I achieve this? Please help.
SELECT *
FROM TABLE
WHERE replace(store_name, ' ', '') LIKE '%'+#SEARCH+'%' OR STORE_NAME LIKE '%'+#SEARCH +'%'
I don't know of a complete solution and am looking for one myself, but what I know may work for you. You can achieve this by performing different kinds of string operations:
Mike can be Myke or Myce or Mikke and so on.
Cat can be Kat or katt or catt and so on.
For this you would write a function that generates the possible strings, then build a SQL query from all of them and run it against the database.
A similar kind of search is known as a Soundex search, which both Oracle and Microsoft SQL Server provide. Have a look at it; it may work.
And in general, make use of functions like upper and lower.
Have you tried using replace()?
You can remove the white space from the column value and then use LIKE:
SELECT * FROM table WHERE replace(store_name, ' ', '') LIKE '%primesport%'
It will work for entries like 'prime soft' when querying with 'primesoft'.
Or you can use regex.
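The space-insensitive comparison can be sketched in Python to check how it behaves on sample store names. The function name matches_ignoring_spaces is made up, and lowercasing both sides is an added assumption in the spirit of the upper/lower suggestion above.

```python
def matches_ignoring_spaces(store_name: str, search: str) -> bool:
    """True if search occurs in store_name once spaces and case are ignored."""
    def squash(s: str) -> str:
        return s.replace(" ", "").lower()
    return squash(search) in squash(store_name)

# Hypothetical table contents taken from the question's examples.
stores = ["prime sport", "best buy"]
hits = [s for s in stores if matches_ignoring_spaces(s, "primesport")]
```

Stripping spaces from the search term as well means that a user typing 'prime sport' still finds the same row.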