Avoiding Lookbehind in Postgres - sql

In a Postgres query I wrote, I'm trying to find an empty char between a digit and a char.
The Raw-Data looks pretty similar to this:
"X 1111-11-112222-22-22YY 3333-33-334444-44-44ZZZ5555-55-556666-66-66AAA7777-77-778888-88-88B 9999-99-991111-11-11"
I would like to split this into following table:
X 1111-11-112222-22-22
YY 3333-33-334444-44-44
ZZZ5555-55-556666-66-66
AAA7777-77-778888-88-88
B 9999-99-991111-11-11
So, normally i would do this by defining the Regex (?<\d)(?=[A-z]) which is giving me the empty char between the chars and digits, but Postgres doesn't support lookbehinds.
Anyone an idea how to fix this?
Thanks in advance!

(\w+\s*[\d-]+)
This will give all the groups.See demo.
http://regex101.com/r/nW8dX7/1

I would do a replace with a character you can then later split on. I'm not familiar with psql but something like:
split('~', replace(value, '\d(?=[A-z])', '$0~'))

Related

TERADATA REGEXP_SUBSTR Get string between two values

I am fairly new to teradata, but I was trying to understand how to use REGEXP_SUBSTR
For example I have the following cell value = ABCD^1234567890^1
How can I extract 1234567890
What I attempted to do is the following:
REGEXP_SUBSTR(x, '(?<=^).*?(?=^)')
But this didnt seem to work.
Can anyone help?
It might (or might not) be possible to use REGEXP_SUBSTR() to handle this, but you would need to use a capture group. An alternative here would be to do a regex replacement instead:
SELECT x, REGEXP_REPLACE(x, '^.*?\^|\^.*$', '') AS output
FROM yourTable;
The regex pattern used here matches:
^.*?\^ everything from the start to the first ^
| OR
\^.*$ everything from the second ^ to the end
We then replace with empty string to remove the content being matched.

Regexp_Extract BigQuery anything up to "|"

I'm fairly new to coding and I was wondering if you could give me a hand writing some regular expression for BigQuery SQL.
Basically I would like to extract everything before the bar sign "|" for one of my column.
Example:
Source string:
bla-BLABLA-cid=123456_sept1220_blabla--potato-Blah|someMore_string_stuff-IDontNeed
Desired output:
bla-BLABLA-cid=123456_sept1220_blabla--potato-Blah
I thought about using the REGEXP_EXTRACT(string, delimiter) function but I'm totally unable to write some regex (LOL). Therefore I had a look over Stack, and have found stuff like:
SELECT REGEXP_EXTRACT( String_Name , "\S*\s*\|" ) ,
# or
SELECT REGEXP_EXTRACT( String_Name , '.+?(?=|)')
But every time I get error messages like " invalid perl operator: (?= " or "Illegal escape space"
Would you have any suggestions on why I get these messages and/or how could I proceed to extract these strings?
Many many thanks in advance <3
You can use SPLIT instead:
SELECT SPLIT("bla-BLABLA-cid=123456_sept1220_blabla--potato-Blah|someMore_string_stuff-IDontNeed", "|")[OFFSET(0)]
Prefix the pattern string with r:
SELECT REGEXP_EXTRACT(String_Name, r'\S*\s*\|')
This is the syntax for a raw string constant. You can review what this means in the documentation.

sql looking for pattern

I have a string similar to this 'MSH|^~\&|STF_ALL_LAB_IN_C...
I'm trying to find some sql that will bring back all messages that contain
MSH|^~\&|(any 3 characters)_(anything after the underscore).
Tried something like this
WHERE TransText LIKE 'MSH|^~\&|%_%_%_'
But that doesn't seem to require the underscore.
Any suggestions?
MSH|^~&|(any 3 characters)_(anything after the underscore).
The pattern would be:
where TransText like 'MSH|^~\&|___\_%'
In some databases, the backslash would need to be escaped, so that would be:
where TransText like 'MSH|^~\\&|___\_%'
_ is a special character in a LIKE clause. It matches any one character, where % matches any series of 0 or more characters.
You need to escape it, using \_.

How can I escape the wildcard for like operator? [duplicate]

This question also has the answer, but it mentions DB2 specifically.
How do I search for a string using LIKE that already has a percent % symbol in it? The LIKE operator uses % symbols to signify wildcards.
Use brackets. So to look for 75%
WHERE MyCol LIKE '%75[%]%'
This is simpler than ESCAPE and common to most RDBMSes.
You can use the ESCAPE keyword with LIKE. Simply prepend the desired character (e.g. '!') to each of the existing % signs in the string and then add ESCAPE '!' (or your character of choice) to the end of the query.
For example:
SELECT *
FROM prices
WHERE discount LIKE '%80!% off%'
ESCAPE '!'
This will make the database treat 80% as an actual part of the string to search for and not 80(wildcard).
MSDN Docs for LIKE
WHERE column_name LIKE '%save 50[%] off!%'
You can use the code below to find a specific value.
WHERE col1 LIKE '%[%]75%'
When you want a single digit number after the% sign, you can write the following code.
WHERE col2 LIKE '%[%]_'
In MySQL,
WHERE column_name LIKE '%|%%' ESCAPE '|'

Find character sequence at specific position in string

I need to use SQL to find a sequence of characters at a specific position in a string.
Example:
atcgggatgccatg
I need to find 'atg' starting at character 7 or at character 7-9, either way would work. I don't want to find the 'atg' at the end of the string. I know about LIKE but couldn't find how to use it for a specific position.
Thank you
In MS Access, you could write this as:
where col like '???????atg*' or
col like '????????atg*' or
col like '?????????atg*'
However, if you interested in this type of comparison, you might consider using a database that supports regular expressions.
If you have a look at this page you'll find that LIKE is entirely capable of doing what you want. To find something at, for example, a 3 char offset you can use something like this
SELECT * FROM SomeTable WHERE [InterestingField] LIKE '___FOO%'
The '_' (underscore) is a place marker for any char. Having 3 "any char" markers in the pattern, with a trailing '%', means that the above SQL will match anything with FOO starting from the fourth char, and then anything else (including nothing).
To look for something 7 chars in, use 7 underscores.
Let me know ifthis isn't quite clear.
EDIT: I quoted SQL Server stuff, not Access. Swap in '?' where I have '_', use '*' instead of '%', and check out this link instead.
Revised query:
SELECT * FROM SomeTable WHERE [InterestingField] LIKE '???FOO*'