PostgreSQL regexp replace function with condition - sql

There is a PostgreSQL table with a field that contains the queries of stored procedures as strings.
I am looking for a regexp replace solution that removes part of the string, but only in cases where the string contains 'tmp'.
Example string inputs:
...from schema1.table_1...
...from schema1.table_1_tmp...
...from schema1.table_2...
...from schema1.table_2_tmp...
Aim:
...from schema1.table_1...
...from table_1_tmp...
...from schema1.table_2...
...from table_2_tmp...
schema1 is a static value; only the table names differ. Some of them contain the tmp substring, some do not.
If a table name contains tmp, the schema1 prefix should be removed.

You could use regexp_replace() as follows:
regexp_replace(mycol, '\sschema1\.(\w+_tmp)\s', ' \1 ')
Regex breakdown:
\s a whitespace character
schema1\. literal string "schema1."
( beginning of a capturing group
\w+ as many word characters as possible (letters, digits and "_")
_tmp literal string "_tmp"
) end of the capturing group
\s a whitespace character
When the string matches the regex, the matching expression is replaced by: a space, then the captured part, then another space. Only the first match is replaced unless you pass the 'g' flag as a fourth argument.
Demo on DB Fiddle:
with t as (
select '... from schema1.table_1_tmp ...' mycol
union all select '... from schema1.table_2 ...'
)
select mycol, regexp_replace(mycol, '\sschema1\.(\w+_tmp)\s', ' \1 ') newcol from t
mycol | newcol
:------------------------------- | :---------------------------
... from schema1.table_1_tmp ... | ... from table_1_tmp ...
... from schema1.table_2 ... | ... from schema1.table_2 ...
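The same conditional replacement can be sketched outside the database. This is only an illustration: Python's re module stands in for PostgreSQL's regex engine, strip_schema_for_tmp is a hypothetical helper name, and word boundaries (\b in Python; PostgreSQL spells it \y) are used instead of the surrounding \s so that no whitespace has to be consumed and put back. Python's re.sub replaces every match, which corresponds to passing the 'g' flag to regexp_replace().

```python
import re

# Assumption: Python's re is used as a stand-in for PostgreSQL's regex engine.
# \b (word boundary) replaces the surrounding \s of the SQL pattern.
TMP_TABLE = re.compile(r"\bschema1\.(\w+_tmp)\b")


def strip_schema_for_tmp(query: str) -> str:
    """Drop the 'schema1.' prefix, but only before table names ending in _tmp."""
    # \1 puts back the captured table name without its schema prefix.
    return TMP_TABLE.sub(r"\1", query)


if __name__ == "__main__":
    sample = "select * from schema1.table_1 a join schema1.table_2_tmp b on a.id = b.id"
    print(strip_schema_for_tmp(sample))
```

Because the pattern requires the table name to end in _tmp, a name such as table_tmp_old is left untouched in this sketch.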

You really need to update your Postgres version; version 8.3.x reached end-of-life in February 2013. That said, the answer from @GMB should work, since all the required regexp functions exist in that version. You can also try the replace function:
with test_tab (tbl) as
( values ('...from schema1.table_1...')
, ('...from schema1.table_1_tmp...')
, ('...from schema1.table_2...')
, ('...from schema1.table_2_tmp...')
)
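select replace(tbl,'schema1.','') "Without Schema"
from test_tab
-- "_" is a single-character wildcard in LIKE, so it is escaped to match a literal underscore
where tbl ilike '%schema1.%!_tmp%' escape '!';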

Related

How to give argument for a repeating character in snowflake regex

My string is a comment that looks like:
***z|Samuel|Amount:15|Frequency:1
I want to use a regex to flag all such rows in a database. My query is below:
select
ID,
COMMENT,
max(case when lower(COMMENT) Rlike '\*+z\|Samuel\|Amount:[0-9]+\|Frequency:[0-9]+'
then 1 else 0 end) as indicator
from Table_Name group by 1,2
But this gives me an error:
Invalid regular expression: '*+z|Samuel|Amount:[0-9]+|Frequency:[0-9]+', no argument for repetition operator: *
Does anyone know how to navigate through this?
Using '[*]+z[|]Samuel[|]Amount:[0-9]+[|]Frequency:[0-9]+':
CREATE OR REPLACE TEMPORARY TABLE t AS
SELECT '***z|Samuel|Amount:15|Frequency:1' AS COMMENT;
SELECT *
FROM t
WHERE RLIKE (t.COMMENT, '[*]+z[|]Samuel[|]Amount:[0-9]+[|]Frequency:[0-9]+', 'i');
Alternatively, either double each backslash in the original pattern or wrap the string in $$ instead of single quotes:
'\*+z\|Samuel\|Amount:[0-9]+\|Frequency:[0-9]+'
=>
'\\*+z\\|Samuel\\|Amount:[0-9]+\\|Frequency:[0-9]+'
$$\*+z\|Samuel\|Amount:[0-9]+\|Frequency:[0-9]+$$
Matching Characters That Are Metacharacters
If you are using the regular expression in a single-quoted string constant, you must escape the backslash with a second backslash (e.g. \\., \\*, \\?, etc.).
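To see why the error appears, it may help to look at the three pattern strings as the regex engine receives them. The sketch below uses Python's re module as a stand-in for Snowflake's engine (an assumption; the two engines differ in details), and the helper names compiles and full_match are made up for illustration. STRIPPED is the pattern quoted in the error message, i.e. what is left once the single-quoted literal has swallowed the backslashes. re.fullmatch is used because RLIKE implicitly anchors the pattern at both ends.

```python
import re

COMMENT = "***z|Samuel|Amount:15|Frequency:1"

# Pattern as shown in the error message: the backslashes are gone.
STRIPPED = "*+z|Samuel|Amount:[0-9]+|Frequency:[0-9]+"
# Character classes need no backslashes, so nothing can be swallowed.
BRACKETED = "[*]+z[|]Samuel[|]Amount:[0-9]+[|]Frequency:[0-9]+"
# What the engine sees when the backslashes survive ($$...$$ or doubled).
ESCAPED = r"\*+z\|Samuel\|Amount:[0-9]+\|Frequency:[0-9]+"


def compiles(pattern: str) -> bool:
    """Return True if the regex engine accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


def full_match(pattern: str, text: str) -> bool:
    """Whole-string, case-insensitive match, similar to RLIKE(..., 'i')."""
    return re.fullmatch(pattern, text, flags=re.IGNORECASE) is not None


if __name__ == "__main__":
    print("stripped compiles:", compiles(STRIPPED))
    print("bracketed matches:", full_match(BRACKETED, COMMENT))
    print("escaped matches:", full_match(ESCAPED, COMMENT))
```

The stripped pattern starts with a bare *, which has nothing to repeat, so it is rejected before any matching takes place.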
SELECT COMMENT,
RLIKE (t.COMMENT, '[*]+z[|]Samuel[|]Amount:[0-9]+[|]Frequency:[0-9]+', 'i') AS "[]",
RLIKE (t.COMMENT, $$\*+z\|Samuel\|Amount:[0-9]+\|Frequency:[0-9]+$$, 'i') AS "$$",
RLIKE (t.COMMENT, '\\*+z\\|Samuel\\|Amount:[0-9]+\\|Frequency:[0-9]+', 'i') AS "\\"
FROM t;

Select statement with column contains '%'

I want to select names from a table where the 'name' column contains '%' anywhere in the value. For example, I want to retrieve the name 'Approval for 20 % discount for parts'.
SELECT NAME FROM TABLE WHERE NAME ... ?
You can use like with escape. The default is a backslash in some databases (but not in Oracle), so:
select name
from table
where name like '%\%%' ESCAPE '\'
This is standard, and works in most databases. The Oracle documentation is here.
Of course, you could also use instr():
where instr(name, '%') > 0
One way to do it is using replace with an empty string and checking to see if the difference in length of the original string and modified string is > 0.
select name
from table
where length(name) - length(replace(name,'%','')) > 0
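The three approaches so far (LIKE with ESCAPE, instr(), and the length difference) can be compared side by side. The sketch below is only an illustration under stated assumptions: an in-memory SQLite database stands in for Oracle, and the items table and its rows are made up. All three predicates are expected to select the same rows.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Oracle; table and data are invented.
conn = sqlite3.connect(":memory:")
conn.execute("create table items (name text)")
conn.executemany(
    "insert into items values (?)",
    [("Approval for 20 % discount for parts",), ("Plain name",), ("100%",)],
)

QUERIES = {
    # Escaped percent sign between two wildcards.
    "like_escape": r"select name from items where name like '%\%%' escape '\' order by name",
    # Position of the first percent sign; 0 means not found.
    "instr": "select name from items where instr(name, '%') > 0 order by name",
    # The string gets shorter if at least one percent sign was removed.
    "length_diff": (
        "select name from items "
        "where length(name) - length(replace(name, '%', '')) > 0 order by name"
    ),
}

# Run each query and keep the returned names per approach.
results = {label: [row[0] for row in conn.execute(sql)] for label, sql in QUERIES.items()}

if __name__ == "__main__":
    for label, names in results.items():
        print(label, names)
```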
Make life easy on yourselves and just use REGEXP_LIKE( )!
SQL> with tbl(name) as (
select 'ABC' from dual
union
select 'E%FS' from dual
)
select name
from tbl
where regexp_like(name, '%');
NAME
----
E%FS
SQL>
I read the documentation mentioned by Gordon. The relevant sentence is:
An underscore (_) in the pattern matches exactly one character (as opposed to one byte in a multibyte character set) in the value
Here was my test:
select c
from (
select 'a%be' c
from dual) d
where c like '_%'
The value a%be was returned.
While the suggestions to use instr() or length() in the other two answers will lead to the correct answer, they will do so slowly. Filtering on function results simply takes longer than filtering on fields.

Find value having leading and trailing space in DB table column - Oracle

I have a record whose column value has leading and trailing spaces.
eg: column value is ' D0019 '
I want to pass this particular column in where clause.
select * from my_table where my_column='D0019';
Since the value has spaces, the where clause does not match it.
How can I select the record even if it has leading and trailing spaces in the value?
My DB is ORACLE
========================================
UPDATE :
I get value only when I try
select * from my_table where my_column like '%D0019%'
not even with ' %D0019% '
=============================================
UPDATE 2 :
SELECT my_column ,DUMP(my_column) FROM my_table WHERE my_column like '%D0019';
output is
" D0019" Typ=1 Len=6: 9,68,48,48,49,57
It's not a normal space that you have to remove; it's a horizontal tab character (ASCII 9).
The regexp below strips all characters in the ASCII range 1-32, which includes the whitespace characters.
select * from my_table
WHERE
REGEXP_REPLACE(my_column,'['||chr(1)||'-'||chr(32)||']' ) = 'D0019';
More on ASCII table
select *
from my_table
where trim(my_column) = 'D0019';
Edit:
Based on the output of the dump() function, your values contain not just (leading) spaces but also a tab character. The trim() function only removes space characters, not other whitespace.
In order to get rid of any whitespace at the beginning, you will need to use regexp_replace():
select *
from my_table
where regexp_replace(my_column,'^(\s)+','') = 'D0019'
If you need to get rid of leading and trailing whitespace, the regex needs to be expanded:
select *
from my_table
where regexp_replace(my_column,'^(\s)+|(\s)+$','') = 'D0019'
SQLFiddle example: http://sqlfiddle.com/#!4/3326a/1
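The difference between trimming blanks and trimming all whitespace can be sketched as follows. Python is used purely for illustration: trim_spaces_only mimics what trim() does with its default of blank characters, and trim_whitespace mirrors the '^(\s)+|(\s)+$' pattern above. Both function names are hypothetical.

```python
import re

# The value from the DUMP() output: byte 9 (tab) followed by "D0019".
raw = "\tD0019"


def trim_spaces_only(value: str) -> str:
    """Strip blanks only, as trim() does by default; tabs are left alone."""
    return value.strip(" ")


def trim_whitespace(value: str) -> str:
    """Strip any leading or trailing whitespace, including tabs."""
    return re.sub(r"^\s+|\s+$", "", value)


if __name__ == "__main__":
    print(list(raw.encode("ascii")))
    print(repr(trim_spaces_only(raw)))
    print(repr(trim_whitespace(raw)))
```

Only the regex-based version yields a value that compares equal to 'D0019'.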
Seems as simple as (if I get it):
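select * from TAB where REGEXP_LIKE (COL,'^\s*D0019\s*$')
returns values such as ' D0019' and 'D0019', with any amount of surrounding whitespace. The ^ and $ anchors keep values that merely contain D0019 from matching.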
Try it like this:
select * from my_table where my_column like '%D0019';
You can also use this if you only want values with spaces at the start:
select * from my_table where my_column like '% %D0019';
or
select * from my_table where my_column like ' %D0019';

Parsing string based on last occurrence of a delimiter (space in this case)

How to parse string - last delimiter.
In Teradata I have name data stored in a varchar column. I don't know how long the name could be, or how many pieces it could have: given name, potential multiple middle names (or no middle name), surname, etc.
I would like to parse the string, assuming everything after the last space in the name is the last name. Anyone have any better ideas than mine?
Here is my solution:
(It's Hack-y, but it works, and avoids recursion, looping, udfs, etc.)
drop table tmp;
create volatile table tmp (str1 varchar(50)) on commit preserve rows;
insert into tmp values('mortecai ali von allen o''shae');
insert into tmp values('hillary rodham-clinton');
insert into tmp values('cher');
insert into tmp values('a.e. schatzschneider');
select str1
,length(str1)-length(oreplace(str1,' ','')) as occurrence
,(1-ABS(occurrence-0.1)/(occurrence-0.1))/2
as if_occurence_is_0_return_1
-- this just to handle the case that there are no spaces in the string at all
-- in the case of no spaces, assumes whole field is just last name
,occurrence+if_occurence_is_0_return_1
,instr(str1,' ',1,occurrence+if_occurence_is_0_return_1) as lastspace
,substr(str1,1,lastspace) as first_nm
,substr(str1,lastspace,length(str1)-lastspace+1) as last_nm
from tmp;
--pulling it all together
--(just str1, first_nm & last_nm - no intermediate placeholder fields):
select str1
,substr(str1,1,instr(str1,' ',1,length(str1)-length(oreplace(str1,' ',''))
+(1-ABS(length(str1)-length(oreplace(str1,' ',''))-0.1)/(length(str1)
-length(oreplace(str1,' ',''))-0.1))/2)) as first_nm
,substr(str1,instr(str1,' ',1,length(str1)-length(oreplace(str1,' ',''))
+(1-ABS(length(str1)-length(oreplace(str1,' ',''))-0.1)/(length(str1)
-length(oreplace(str1,' ',''))-0.1))/2),length(str1)-instr(str1,' ',1,length(str1)
-length(oreplace(str1,' ',''))+(1-ABS(length(str1)
-length(oreplace(str1,' ',''))-0.1)/(length(str1)
-length(oreplace(str1,' ',''))-0.1))/2)+1) as last_nm
from tmp;
As you're using INSTR you're probably on TD14.
You should check the parameters for INSTR; you can search backwards, too :-)
trim(substring(str1 from instr(str1,' ',-1,1))) as last_nm
The TRIM gets rid of the leading blank.
And the first name is
trim(substring(str1 from 1 for instr(str1,' ',-1,1))) as first_nm,
And of course you could also use a regular expression:
REGEXP_SUBSTR(str1, '[^ ]+$') as lst_nm,
REGEXP_SUBSTR(str1, '.*[ ]') as first_nm
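The idea of splitting at the last space can be sketched as below. This is an illustration only: Python's str.rpartition stands in for searching backwards with instr(str1,' ',-1,1), and split_last_name is a hypothetical helper name. A name without any space is treated as a last name only, matching the assumption in the question.

```python
def split_last_name(full_name: str) -> tuple[str, str]:
    """Split a name at its last space into (first part, last name)."""
    # rpartition searches from the right; without a space it returns
    # ('', '', full_name), so the whole value becomes the last name.
    first, _sep, last = full_name.strip().rpartition(" ")
    return first, last


if __name__ == "__main__":
    for name in (
        "mortecai ali von allen o'shae",
        "hillary rodham-clinton",
        "cher",
        "a.e. schatzschneider",
    ):
        print(split_last_name(name))
```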

Delete certain character based on the preceding or succeeding character - ORACLE

I have used the REPLACE function to delete email addresses from hundreds of records. The semicolon is the separator between one email address and another, so the problem is that a lot of stray semicolons are left behind.
For example: the field:
123#hotmail.com;456#yahoo.com;789#gmail.com;xyz#msn.com
Let's say that after I deleted two email addresses, the field content became like:
;456#yahoo.com;789#gmail.com;
I need to clean these fields from these extra undesired semicolons to be like
456#yahoo.com;789#gmail.com
For double semicolons I have used REPLACE as well by replacing each ;; with ;
Is there any way to delete every semicolon that is not both preceded and followed by an address?
If you only need to replace semicolons at the start or end of the string, using a regular expression with the anchor '^' (beginning of string) / '$' (end of string) should achieve what you want:
with v_data as (
select '123#hotmail.com;456#yahoo.com;789#gmail.com;xyz#msn.com' value
from dual union all
select ';456#yahoo.com;789#gmail.com;' value from dual
)
select
value,
regexp_replace(regexp_replace(value, '^;', ''), ';$', '') as normalized_value
from v_data
If you also need to clean up stray semicolons in the middle of the string, collapse each run with a quantifier such as ';+'; Oracle's regular expressions do not support lookahead/lookbehind.
You remove leading and trailing characters with TRIM:
select trim(both ';' from ';456#yahoo.com;;;789#gmail.com;') from dual;
To replace multiple characters with only one occurrence use REGEXP_REPLACE:
select regexp_replace(';456#yahoo.com;;;789#gmail.com;', ';+', ';') from dual;
Both methods combined:
select regexp_replace( trim(both ';' from ';456#yahoo.com;;;789#gmail.com;'), ';+', ';' ) from dual;
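The combined approach (collapse runs of separators, then trim the ends) can be sketched like this. Python is used only for illustration, and normalize_separators is a hypothetical helper name. The sample values keep the # character exactly as it appears in the question.

```python
import re


def normalize_separators(value: str) -> str:
    """Collapse repeated semicolons and drop leading/trailing ones."""
    # Step 1: a run of one or more semicolons becomes a single semicolon.
    collapsed = re.sub(r";+", ";", value)
    # Step 2: remove a semicolon left at either end of the string.
    return collapsed.strip(";")


if __name__ == "__main__":
    print(normalize_separators(";456#yahoo.com;;;789#gmail.com;"))
```

The order of the two steps does not matter for the result; collapsing first simply mirrors the nesting of the SQL expression read from the inside out in reverse.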
A regular expression replace can help:
select regexp_replace('123#hotmail.com;456#yahoo.com;;456#yahoo.com;;789#gmail.com',
'456#yahoo.com(;)+') as result from dual;
Output:
| RESULT |
|-------------------------------|
| 123#hotmail.com;789#gmail.com |