Parse out word between two characters SQL Netezza - sql

I am having trouble finding the correct syntax to parse out a word between two characters in Netezza.
PATIENT_NAME
SMITH,JOHN L
BROWN,JANE R
JONES,MARY LYNN
I need the first name which is always after the comma and before the first space. How would I do this in Netezza?

I think Netezza supports regexp_extract(). That would be:
select replace(regexp_extract(name, ',[^ ]+'), ',', '')
Or regexp_replace():
select regexp_replace(name, '^[^,]+,([^ ]+)( |$).*$', '\1')

Netezza supports regexp_extract and if there is reason to handle all kinds of whitespace between first and middle name(s) then this would work -
select regexp_extract(name,
'^[^[:space:],]+[[:space:],]+([^[:space:]]+)')
This would handle optional whitespaces, tabs etc on either side of , as well.

Related

Regex to split apart text. Special case for parentheses with spaces in them

I am trying to split a field by delimiter in LookML. This field either follows the format of:
Managers (AE)
Managers (AE - MM)
I was able to split to first case using this
sql: case
when rlike (${user_role_name}, '^.*[\\(\\)].*$') then split_part(${user_role_name}, ' ', -1)
However, I haven't been able to get the 2nd case to do the same. It's in a case statement so I am going to add another when statement, but am not able to figure out the regex for parentheses that contains spaces.
Thanks in advance for the help!
By "split" the string, I think you mean you want to extract the part in parentheses, right?
I would do this using a regex substring method. You didn't mention what warehouse you're using, and the syntax will vary a little, but on snowflake that would look like:
regexp_substr(${user_role_name}, '\\([^)]*\\)')
So, for example, with the inputs you gave:
select regexp_substr('Managers (AE)', '\\([^)]*\\)')
union all
select regexp_substr('Managers (AE - MM)', '\\([^)]*\\)')
result
(AE)
(AE - MM)

Snowflake SQL - Format Phone Number to 9 digits

I have a column with phone numbers in varchar, currently looks something like this. Because there is no consistent format, I don't think substring works.
(956) 444-3399
964-293-4321
(929)293-1234
(919)2991234
How do I remove all brackets, spaces and dashes and have the query return just the digits, in Snowflake? The desired output:
9564443399
9642934321
9292931234
9192991234
You can use regexp_replace() function to achieve this:
REGEXP_REPLACE(yourcolumn, '[^0-9]','')
That will strip out any non-numeric character.
You could use regexp_replace to remove all of the special characters
something like this
select regexp_replace('(956) 444-3399', '[\(\) -]', '')
An alternative using translate . Documentation
select translate('(956) 444-3399', '() -', '')

Extract data in parentheses with Amazon-redshift

I have these characters in a table:
LEASE THIRD-CCP-MANAGER (AAAA)
THE MANAGEMENT OF A THIRD PARTY (BBBB / AAAA)
When I extract the information:
AAAA
BBBB/AAAA
That is, I have to look for the pattern and extract what is inside the parenthesis.
I'm trying to use the REGEXP_SUBSTR function.
In amazon redshift, how do I extract the characters in parentheses?
thanks
Whoa, that was hard!
Here is the syntax to use:
SELECT REGEXP_SUBSTR('One (Two) Three', '[(](.*)[)]', 1, 1, 'e')
This will return: Two
It appears that escaping brackets with \( doesn't work, but putting them in [(] does work. The 'e' at the end will "Extract a substring using a subexpression".
use position for finding the index of parenthesis ( and then substring
select
substring(position('(' in 'LEASE THIRD-CCP-MANAGER (AAAA)'),position(')' in 'LEASE THIRD-CCP-MANAGER (AAAA)'))
or you can use split_part
split_part('LEASE THIRD-CCP-MANAGER (AAAA)','(',2)
You’re probably struggling with () meaning something in regular expressions (lookup “back references”)
To tell regex that you just mean the characters ( and ) without their special meaning, “escape” them using \
regexp_substr(yourtable.yourcolumn,'\(.*\)')

Postgresql: Extracting substring after first instance of delimiter

I'm trying to extract everything after the first instance of a delimiter.
For example:
01443-30413 -> 30413
1221-935-5801 -> 935-5801
I have tried the following queries:
select regexp_replace(car_id, E'-.*', '') from schema.table_name;
select reverse(split_part(reverse(car_id), '-', 1)) from schema.table_name;
However both of them return:
01443-30413 -> 30413
1221-935-5801 -> 5801
So it's not working if delimiter appears multiple times.
I'm using Postgresql 11. I come from a MySQL background where you can do:
select SUBSTRING(car_id FROM (LOCATE('-',car_id)+1)) from table_name
Why not just do the PG equivalent of your MySQL approach and substring it?
SELECT SUBSTRING('abcdef-ghi' FROM POSITION('-' in 'abcdef-ghi') + 1)
If you don't like the "from" and "in" way of writing arguments, PG also has "normal" comma separated functions:
SELECT SUBSTR('abcdef-ghi', STRPOS('abcdef-ghi', '-') + 1)
I think that regexp_replace is appropriate, but using the correct pattern:
select regexp_replace('1221-935-5801', E'^[^-]+-', '');
935-5801
The regex pattern ^[^-]+- matches, from the start of the string, one or more non dash characters, ending with a dash. It then replaces with empty string, effectively removing this content.
Note that this approach also works if the input has no dashes at all, in which case it would just return the original input.
Use this regexp pattern :
select regexp_replace('1221-935-5801', E'^[^-]+-', '') from schema.table_name
Regexp explanation :
^ is the beginning of the string
[^-]+ means at least one character different than -
...until the - character is met
I tried it in a conventional way in general what we do (found
something similar to instr as strpos in postgrsql .) Can try the below
SELECT
SUBSTR(car_id,strpos(car_id,'-')+1,
length(car_id) ) from table ;

Cut string after first occurrence of a character

I have strings like 'keepme:cutme' or 'string-without-separator' which should become respectively 'keepme' and 'string-without-separator'. Can this be done in PostgreSQL? I tried:
select substring('first:last' from '.+:')
But this leaves the : in and won't work if there is no : in the string.
Use split_part():
SELECT split_part('first:last', ':', 1) AS first_part
Returns the whole string if the delimiter is not there. And it's simple to get the 2nd or 3rd part etc.
Substantially faster than functions using regular expression matching. And since we have a fixed delimiter we don't need the magic of regular expressions.
Related:
Split comma separated column data into additional columns
regexp_replace() may be overload for what you need, but it also gives the additional benefit of regex. For instance, if strings use multiple delimiters.
Example use:
select regexp_replace( 'first:last', E':.*', '');
SQL Select to pick everything after the last occurrence of a character
select right('first:last', charindex(':', reverse('first:last')) - 1)