SQL Server Replace with multiple SUBSTRING SSMS - sql

I would like to replace multiple substrings at once. For example, given the input
'0100001000100'
the desired output is
' | ABC | | | | DFG | | | HIG | |'
I tried
REPLACE(SUBSTRING([column],1,1),' 1 ' ,' XYZ '),SUBSTRING([column],2,1),' 1 ' ,' zzz ')....
but it does not work.

It's unclear how your input is supposed to turn into your desired output. However, hopefully this will clear up your confusion about the REPLACE function.
REPLACE(targetstring, str1, str2) will find each occurrence of str1 in targetstring and replace it with str2. So REPLACE('I know John and John knows me', 'John', 'Fred') would result in the string 'I know Fred and Fred knows me'.
To chain REPLACEs together, so that the output from the first REPLACE is used as the target for the next replace, you need a structure like this:
REPLACE(
    REPLACE(
        REPLACE(
            'I know Fred and Fred knows me', 'Fred', 'Bill'
        ), 'know', 'like'
    ), 'and', 'but'
)
Next point: if you want to concatenate the replacement results for each character, do not chain the REPLACEs; instead, join them with +:
REPLACE(SUBSTRING([column],1,1),'1','XYZ')
+
REPLACE(SUBSTRING([column],2,1),'1','zzz')
+ ...
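To compare the two shapes outside the database, here is a minimal Python sketch. It is not T-SQL: str.replace stands in for REPLACE and iterating over the string stands in for SUBSTRING([column], i, 1). The labels dictionary is a hypothetical mapping invented for illustration, since the question does not say which label belongs to which position.

```python
def chained_replace(text):
    """Nested REPLACE calls: each call works on the result of the previous one."""
    return text.replace("Fred", "Bill").replace("know", "like").replace("and", "but")


def per_character_replace(flags, labels):
    """Concatenate one REPLACE(SUBSTRING(col, i, 1), '1', label) per position.

    `labels` maps a 1-based position to its replacement text (hypothetical).
    Characters other than '1' pass through unchanged, as REPLACE leaves them.
    """
    parts = []
    for position, char in enumerate(flags, start=1):
        parts.append(char.replace("1", labels.get(position, "1")))
    return "".join(parts)


if __name__ == "__main__":
    print(chained_replace("I know Fred and Fred knows me"))
    print(per_character_replace("0100001000100", {2: "ABC", 7: "DFG", 11: "HIG"}))
```

Note that the '0' characters survive in the concatenated result, so producing the separators in the desired output would need a further replacement for '0' as well.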

Related

SQL - trimming values before bracket

I have a column of values where some values contain brackets with text which I would like to remove. This is an example of what I have and what I want:
CREATE TABLE test
(column_i_have varchar(50),
column_i_want varchar(50))
INSERT INTO test (column_i_have, column_i_want)
VALUES ('hospital (PWD)', 'hopistal'),
('nursing (LLC)','nursing'),
('longterm (AT)', 'longterm'),
('inpatient', 'inpatient')
I have only come across approaches that use the number of characters or the position to trim the string, but these values have varying lengths. One approach I tried was something like:
TRIM('(*',col1)
but it doesn't work. Is there a way to do this in PostgreSQL without using the position? Thank you!
If all the values contain "valid" brackets, then you may use the split_part function without any regular expressions:
select
test.*,
trim(split_part(column_i_have, '(', 1)) as res
from test
column_i_have | column_i_want | res
:------------- | :------------ | :--------
hospital (PWD) | hopistal | hospital
nursing (LLC) | nursing | nursing
longterm (AT) | longterm | longterm
inpatient | inpatient | inpatient
You can replace partial patterns using regular expressions. For example:
select *, regexp_replace(v, '\([^\)]*\)', '', 'g') as r
from (
select '''hospital (PWD)'', ''nursing (LLC)'', ''longterm (AT)'', ''inpatient''' as v
) x
Result:
r
-------------------------------------------------
'hospital ', 'nursing ', 'longterm ', 'inpatient'
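The two approaches so far differ in one detail that is easy to miss: the regular expression removes only the bracketed part, so the space before it stays, while split_part plus trim removes it. The following Python sketch imitates both on the sample values. It is an illustration only, not PostgreSQL: str.split stands in for split_part and re.sub for regexp_replace with the 'g' flag.

```python
import re


def before_bracket(value):
    # Like trim(split_part(value, '(', 1)): keep the text before the first '('.
    return value.split("(", 1)[0].strip()


def without_brackets(value):
    # Like regexp_replace(value, '\([^\)]*\)', '', 'g'): drop each (...) group,
    # leaving any whitespace that surrounded it.
    return re.sub(r"\([^)]*\)", "", value)


if __name__ == "__main__":
    for sample in ["hospital (PWD)", "nursing (LLC)", "longterm (AT)", "inpatient"]:
        print(repr(before_bracket(sample)), repr(without_brackets(sample)))
```

If the regular expression route is preferred, wrapping the result in trim() would remove the leftover trailing space.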
Could it be as easy as:
SELECT SUBSTRING(column_i_have, '\w+') AS column_i_want FROM test
If not, and you still want to use SUBSTRING() to get everything up to but excluding the parenthesis, then maybe:
SELECT SUBSTRING(column_i_have, '^(.+?)(?:\s*\(.*)?$') AS column_i_want FROM test
But if you really just want everything up to the opening parenthesis, then maybe just use SPLIT_PART():
SELECT SPLIT_PART(column_i_have, ' (', 1) AS column_i_want FROM test

Oracle REGEXP_SUBSTR to ignore the first occurrence of a character but include the 2nd occurrence

I have a string in the format "number - name". I'm using REGEXP_SUBSTR to split it into two separate columns, one for the number and one for the name.
SELECT
REGEXP_SUBSTR('123 - ABC','[^-]+',1,1) AS NUM,
REGEXP_SUBSTR('123 - ABC','[^-]+',1,2) AS NAME
from dual;
But it doesn't work if the name includes a hyphen, for example ABC-Corp: the name is then shown as 'ABC' instead of 'ABC-Corp'. How can I write a regular expression that ignores everything before the first hyphen and includes everything after it?
You want to split the string on the first occurrence of ' - '. This is a simple enough task to be performed efficiently by string functions rather than regexes:
select
substr(mycol, 1, instr(mycol, ' - ') - 1) num,
substr(mycol, instr(mycol, ' - ') + 3) name
from mytable
Demo on DB Fiddle:
with mytable as (
select '123 - ABC' mycol from dual
union all select '123 - ABC - Corp' from dual
)
select
mycol,
substr(mycol, 1, instr(mycol, ' - ') - 1) num,
substr(mycol, instr(mycol, ' - ') + 3) name
from mytable
MYCOL | NUM | NAME
:--------------- | :-- | :---------
123 - ABC | 123 | ABC
123 - ABC - Corp | 123 | ABC - Corp
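The same "split on the first separator only" idea can be checked quickly outside Oracle. This is a small Python sketch, not the SQL above: str.partition plays the role of the instr/substr pair. Returning None for the name when the separator is missing is a choice made for this sketch and is not what the SQL does in that case.

```python
def split_number_and_name(value, separator=" - "):
    """Split on the first occurrence of the separator; later ones stay in the name."""
    number, found, name = value.partition(separator)
    if not found:
        # Sketch-only behaviour for input without a separator.
        return value, None
    return number, name


if __name__ == "__main__":
    for sample in ["123 - ABC", "123 - ABC - Corp", "123 - ABC-Corp"]:
        print(split_number_and_name(sample))
```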
NB: @GMB's solution is much better in your simple case. Regular expressions are overkill for that.
TL;DR:
It is usually easier and more readable to use the subexpr parameter instead of occurrence for fixed masks like this. You can specify the full mask: \d+\s*-\s*\S+
i.e. digits, then zero or more whitespace characters, then -, then zero or more whitespace characters again, and one or more non-whitespace characters.
Then we add () to mark the subexpressions. Since we only need the digits and the trailing non-whitespace characters, we put those in ():
'(\d+)\s*-\s*(\S+)'
Then we just specify which subexpression we need, 1 or 2:
SELECT
REGEXP_SUBSTR(column_value,'(\d+)\s*-\s*(\S+)',1,1,null,1) AS NUM,
REGEXP_SUBSTR(column_value,'(\d+)\s*-\s*(\S+)',1,1,null,2) AS NAME
from table(sys.odcivarchar2list('123 - ABC', '123 - ABC-Corp'));
Result:
NUM NAME
---------- ----------
123 ABC
123 ABC-Corp
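For readers who want to try the mask before running it in Oracle, here is a hedged Python sketch using the same pattern with capture groups. Python's re module is a different engine from Oracle's, so treat this as an approximation of the subexpr parameter, not a replacement for testing in the database.

```python
import re

# Same mask as above: group 1 is the number, group 2 the name.
PATTERN = re.compile(r"(\d+)\s*-\s*(\S+)")


def num_and_name(value):
    """Return (number, name), or None when the mask does not match."""
    match = PATTERN.search(value)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    print(num_and_name("123 - ABC"))
    print(num_and_name("123 - ABC-Corp"))
```

One caveat follows from \S+: the name stops at the first whitespace character, so a value such as '123 - ABC - Corp' yields only 'ABC' as the name with this mask.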
https://docs.oracle.com/database/121/SQLRF/functions164.htm#SQLRF06303
https://docs.oracle.com/database/121/SQLRF/ap_posix003.htm#SQLRF55544

oracle sql regexp_replace

I have a table that has the values like this.
ExpTable
+--------+
|expCol |
+--------+
|abc.abc |
|bcd.123 |
|efg.#/. |
+--------+
And what I wanted is that when the character after the period is a letter or number, the output will add a space after the dot like this:
Expected Output
+--------+
|expCol |
+--------+
|abc. abc|
|bcd. 123|
|efg.#/. | --the value here stays the same because after the period is not a letter/number
+--------+
I tried:
SELECT REGEXP_REPLACE(expCol, '.', '. ') from expTable WHERE /*condition*/
And as expected, everything including the last value 'efg.#/.' has got a space after the period. I don't know what to put in the WHERE clause.
You could try this. It searches for a . followed by a word character, and replaces it with a dot ., then a space and the matched character.
select REGEXP_REPLACE(expCol, '\.(\w)','. \1') FROM ExpTable;
If you only want the first such occurrence to be replaced, you can specify that:
REGEXP_REPLACE(expCol, '\.(\w)','. \1',1,1)
Note that \w matches letters, digits, and the underscore. If you don't want "_" to count, use [[:alnum:]] or [a-zA-Z0-9] in place of \w.
Demo
SELECT REGEXP_REPLACE(expCol, '\.([a-zA-Z0-9])', '. \1') AS expCol FROM expTable
OR
SELECT REGEXP_REPLACE(expCol, '[.]([a-zA-Z0-9])', '. \1') AS expCol FROM expTable
Output
EXPCOL
abc. abc
bcd. 123
efg.#/.
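The difference between the escaped and the unescaped dot is easy to reproduce in any regex engine. This Python sketch is an illustration only (re.sub stands in for REGEXP_REPLACE) and shows why the attempt in the question touched every character.

```python
import re


def space_after_dot(value):
    # Literal dot followed by a letter or digit; the captured character is kept.
    return re.sub(r"\.([A-Za-z0-9])", r". \1", value)


def unescaped_attempt(value):
    # An unescaped '.' matches any character, so every character is replaced.
    return re.sub(r".", ". ", value)


if __name__ == "__main__":
    for sample in ["abc.abc", "bcd.123", "efg.#/."]:
        print(space_after_dot(sample))
    print(repr(unescaped_attempt("a.b")))
```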
LiveDemo
http://sqlfiddle.com/#!4/0a6e0/31
You can try this.
. is a metacharacter in regex (it matches any character), so you need to put \ in front of it to match a literal dot.
SELECT REGEXP_REPLACE(expCol, '\.(\w)', '. \1') from T
sqlfiddle :http://sqlfiddle.com/#!4/94ffec/1

Format phone number in Oracle with country code

I have a requirement to format phone numbers in the following way:
No spaces
No special characters
Remove preceding zero - if area code exists
Remove country code if present e.g. +44
For instance this: (03069) 990927 would become: 3069990927.
So far I have come up with this:
replace(replace(replace(replace(replace(replace(substr(replace(ltrim([VALUE],0), ' ', ''),nvl(length(substr(replace(ltrim([VALUE],0), ' ', ''),11)),0)+1), '-', ''), '(', ''), ')', ''),'/', ''), '.', ''), '+', '')
Is there a shorter version of this, maybe using a regular expression?
The final version of this snippet will become a column in a view that will return the following columns:
Customer Number
Customer Name
Country
Formatted Phone Number
The formatted phone number will be concatenated with the international dial code (e.g. +44) that are saved in the database in a table - DIALCODE_TAB(COUNTRY_CODE, CODE). Below is an example using the replace syntax above:
CREATE OR REPLACE FORCE VIEW "CUST_PHONE" ("CUSTOMER_ID", "NAME", "COUNTRY", "PHONE_NUMBER") AS
select
cicm.customer_id,
cicm.name,
dct.country,
dct.code || replace(replace(replace(replace(replace(replace(substr(replace(ltrim(cicm.value,0), ' ', ''),nvl(length(substr(replace(ltrim(cicm.value,0), ' ', ''),11)),0)+1), '-', ''), '(', ''), ')', ''),'/', ''), '.', ''), '+', '') phone_number
from customer_info_comm_method cicm
join dialcode_tab dct
on dct.country_code = customer_info_api.get_country_code(cicm.customer_id)
where cicm.method_id_db = 'PHONE'
--and dct.code || replace(replace(replace(replace(replace(replace(substr(replace(ltrim(cicm.value,0), ' ', ''),nvl(length(substr(replace(ltrim(cicm.value,0), ' ', ''),11)),0)+1), '-', ''), '(', ''), ')', ''),'/', ''), '.', ''), '+', '') = [phone_number]
--in terms of performance this SQL has to be written so that it returns all the records or a specific record when searching for the phone number - very quickly (<10s).
WITH read only;
N.B. A customer record can have more than 1 phone number and the same phone number can exist on more than 1 customer record.
To begin with a remark: This only works if the country is stored elsewhere for the record and there are no telephone numbers without an area code. Otherwise one would not be able to reconstruct the complete phone number again.
Then: How are country codes represented in your data? Is it always +44 or can it be 0044? Be careful here. Especially don't remove a single zero (assuming it's an area code), when it's actually the first of two zeros representing the country code :-)
Then: You need a list of all country codes. Take +1441441441, for example. Where does the country code end? (Solution: +1441 is Bermuda.)
As to "no spaces" and "no special characters" you can solve this best with regexp_replace.
So all in all not so simple a task as you obviously expected it to be. (But not too hard to do either.)
I would use PL/SQL for this.
Hope my hints help you. Good luck.
EDIT: Here is what is needed. I still think a PL/SQL function will be best here.
Make sure your DIALCODE_TAB contains all country codes necessary.
1. Trim the phone number.
2. Then check whether it starts with a country identifier (+, 00).
2.1. If so: remove that. Remove all non-digits. Look up the country code in your table and remove it.
2.2. If not: check whether it starts with an area identifier (0).
2.2.1. If so: remove it.
2.2.2. In any case: remove all non-digits.
That should do it, provided the numbers are valid. In Germany sometimes people write +49(0)40-123456, which is not valid, because one either uses a country code or an area code, not both in the same number. The (0) would have to be removed to make the number valid.
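The numbered steps above can be sketched as a small function. This is a Python illustration of the logic only, not the PL/SQL function the answer recommends. DIAL_CODES is a hypothetical stand-in for the contents of DIALCODE_TAB, and the sketch removes non-digits before checking for the leading area-code zero so that input like '(03069) 990927' is handled; both are assumptions of this sketch.

```python
import re

# Hypothetical subset of DIALCODE_TAB codes, stored without the leading '+'.
DIAL_CODES = {"44", "49", "358"}


def normalize_phone(raw, dial_codes=DIAL_CODES):
    number = raw.strip()  # step 1: trim
    if number.startswith("+") or number.startswith("00"):  # step 2
        prefix_len = 1 if number.startswith("+") else 2
        digits = re.sub(r"\D", "", number[prefix_len:])  # step 2.1
        # Try longer codes first so a short code cannot shadow a longer one.
        for code in sorted(dial_codes, key=len, reverse=True):
            if digits.startswith(code):
                return digits[len(code):]
        return digits  # unknown country code: left in place
    digits = re.sub(r"\D", "", number)  # step 2.2.2
    # Steps 2.2 and 2.2.1: drop a single leading area-code zero.
    return digits[1:] if digits.startswith("0") else digits


if __name__ == "__main__":
    for sample in ["(03069) 990927", "+44 1234 567890", "0044 1234 567890"]:
        print(normalize_phone(sample))
```

As the answer warns, a number written like +49(0)40-123456 is not repaired by these steps: the zero from the (0) stays in the result.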
SELECT LTRIM(REGEXP_REPLACE(
           REGEXP_REPLACE('+44(03069) 990927',
                          '(\+).([[:digit:]])+'), -- strip the country code
           '[^[:alnum:]]'), -- strip non-alphanumeric characters; use [^[:digit:]] to keep digits only
       '0') -- remove the leading zero
FROM DUAL;
This won't work for +44990927 (when the country code is not followed by a space or another separator), or when the country code doesn't start with +.
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE phone_numbers ( phone_number ) AS
SELECT '(03069) 990927' FROM DUAL
UNION ALL SELECT '+44 1234 567890' FROM DUAL
UNION ALL SELECT '+44(0)1234 567890' FROM DUAL
UNION ALL SELECT '+44(012) 34-567-890' FROM DUAL
UNION ALL SELECT '+44-1234-567-890' FROM DUAL
UNION ALL SELECT '+358-1234567890' FROM DUAL;
Query 1:
If you are just dealing with +44 international dialling codes then you could:
use ^\+44|\D to strip the +44 international code and all non-digit characters; then
use ^0 to strip a leading zero if it's present.
Like this:
SELECT REGEXP_REPLACE(
REGEXP_REPLACE(
phone_number,
'^\+44|\D',
''
),
'^0', '' ) AS phone_number
FROM phone_numbers
Results:
| PHONE_NUMBER |
|---------------|
| 3069990927 |
| 1234567890 |
| 1234567890 |
| 1234567890 |
| 1234567890 |
| 3581234567890 |
(You can see it doesn't work for the final number with a +358 international code.)
Query 2:
This can be simplified into a single regular expression (that's slightly less readable):
SELECT REGEXP_REPLACE(
phone_number,
'^(\+44)?\D*0?|\D',
''
) AS phone_number
FROM phone_numbers
Results:
| PHONE_NUMBER |
|---------------|
| 3069990927 |
| 1234567890 |
| 1234567890 |
| 1234567890 |
| 1234567890 |
| 3581234567890 |
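Query 1 and Query 2 can be compared side by side on the sample numbers with a short Python sketch. This is an illustration only: re.sub stands in for REGEXP_REPLACE, and regex engines can differ in edge cases, so the patterns should still be verified in Oracle itself.

```python
import re

SAMPLES = [
    "(03069) 990927",
    "+44 1234 567890",
    "+44(0)1234 567890",
    "+44(012) 34-567-890",
    "+44-1234-567-890",
    "+358-1234567890",
]


def two_step(phone):
    # Query 1: strip a leading +44 and all non-digits, then one leading zero.
    stripped = re.sub(r"^\+44|\D", "", phone)
    return re.sub(r"^0", "", stripped)


def single_pattern(phone):
    # Query 2: the same idea folded into one pattern.
    return re.sub(r"^(\+44)?\D*0?|\D", "", phone)


if __name__ == "__main__":
    for sample in SAMPLES:
        print(sample, two_step(sample), single_pattern(sample))
```

On these six inputs both functions agree, including the +358 number that neither handles.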
Query 3:
If you want to deal with multiple international dialling codes then you will need to know which ones are valid (see http://en.wikipedia.org/wiki/List_of_country_calling_codes for a list).
This is an example of a regular expression which will strip out valid international dialling codes beginning with +3, +4 or +5 (I'll leave all the other dialling codes for you to code up):
SELECT REGEXP_REPLACE(
phone_number,
'^(\+(3[0123469]|3[57]\d|38[01256789]|4[013456789]|42[013]|5[09]\d|5[12345678]))?\D*0?|\D',
''
) AS phone_number
FROM phone_numbers
Results:
| PHONE_NUMBER |
|--------------|
| 3069990927 |
| 1234567890 |
| 1234567890 |
| 1234567890 |
| 1234567890 |
| 1234567890 |
If the + at the start of the international dialling code is optional then just replace \+ (near the start of the regular expression) with \+?.

Remove sub string from a column's text

I have the following two columns in a Postgres table:
name | last_name
----------------
AA | AA aa
BBB | BBB bbbb
.... | .....
.... | .....
How can I update the last_name by removing name text from it?
The final output should look like this:
name | last_name
----------------
AA | aa
BBB | bbbb
.... | .....
.... | .....
UPDATE table SET last_name = regexp_replace(last_name, '^' || name || ' ', '');
This only removes one copy from the beginning of the column and correctly removes the trailing space.
Edit
I'm using a regular expression here. '^' || name || ' ' builds the regular expression, so with the 'Davis McDavis' example, it builds the regular expression '^Davis '. The ^ causes the regular expression to be anchored to the beginning of the string, so it's going to match the word 'Davis' followed by a space only at the beginning of the string it is replacing in, which is the last_name column.
You could achieve the same effect without regular expressions like this:
UPDATE table SET last_name = substr(last_name, length(name) + 2);
You need to add two to the length to create the offset because substr is one-based (+1) and you want to include the space (+1). However, I prefer the regular expression solution even though it probably performs worse because I find it somewhat more self-documenting. It has the additional advantage that it is idempotent: if you run it again on the database it won't have any effect. The substr/offset method is not idempotent; if you run it again, it will eat more characters off your last name.
Not sure about syntax, but try this:
UPDATE table
SET last_name = TRIM(REPLACE(last_name,name,''))
I suggest checking it first with a SELECT:
SELECT REPLACE(last_name,name,'') FROM table
You need the replace function; see http://www.postgresql.org/docs/8.1/static/functions-string.html
UPDATE table SET last_name = REPLACE(last_name,name,'')
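The three approaches above behave differently on repeated runs and on names that recur inside the last name. The Python sketch below imitates each one for comparison. It is an illustration, not PostgreSQL. The re.escape call is an addition of this sketch: without some form of escaping, a name containing regex metacharacters would be interpreted as a pattern.

```python
import re


def strip_name_regex(last_name, name):
    # Like regexp_replace(last_name, '^' || name || ' ', ''): anchored at the start.
    return re.sub("^" + re.escape(name) + " ", "", last_name)


def strip_name_offset(last_name, name):
    # Like substr(last_name, length(name) + 2); substr is 1-based, slices 0-based.
    return last_name[len(name) + 1:]


def strip_name_replace(last_name, name):
    # Like TRIM(REPLACE(last_name, name, '')): removes every occurrence.
    return last_name.replace(name, "").strip()


if __name__ == "__main__":
    once = strip_name_regex("AA aa", "AA")
    print(once, strip_name_regex(once, "AA"))    # second run changes nothing
    once = strip_name_offset("AA aa", "AA")
    print(once, repr(strip_name_offset(once, "AA")))  # second run eats characters
    print(strip_name_replace("Davis McDavis", "Davis"))
```

With the 'Davis McDavis' example, the anchored regex keeps 'McDavis', while the plain replace also removes the second 'Davis' and leaves 'Mc'.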