I have a string that I'm trying to extract a URL from. When I run the pattern on this RegEx site, it works fine.
The Regex Pattern is: http:\/\/GNTXN.US\/\S+
The message I'm extracting from is below, and lives in a column called body in my SQL database.
Test Message: We want to hear from you! Take our 2022 survey & tell us what matters most to you this year: http://GNTXN.US/qsx Text STOP 2 stop/HELP 4 help
But when I run the following in SQL:
SELECT
body,
REGEXP_SUBSTR(body, 'http:\/\/GNTXN.US\/\S+') new_body
FROM
table.test
It returns no value. I have to imagine it's something to do with the backslashes in the pattern, but I've tried everything.
The new_body output should read as http://GNTXN.US/qsx
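In MySQL, the backslash is the escape character inside a string literal, so each \ in the pattern has to be doubled for it to reach the regex engine: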
select body, REGEXP_SUBSTR(body, 'http:\\/\\/GNTXN.US\\/\\S+') as new_body
from table.test;
new_body output:
http://GNTXN.US/qsx
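As a rough illustration of why the doubled backslashes matter, the sketch below uses Python's re module as a stand-in for the SQL regex engine. It assumes MySQL's default string-literal handling, where an escape such as \S or \/ loses its backslash before the pattern reaches the engine. The variable names are made up for the example.

```python
import re

body = (
    "Test Message: We want to hear from you! Take our 2022 survey & tell us "
    "what matters most to you this year: http://GNTXN.US/qsx "
    "Text STOP 2 stop/HELP 4 help"
)

# Assumption: with single backslashes in the SQL literal, the string parser
# drops them, so the engine receives a pattern that ends in a literal "S+".
seen_with_single_backslashes = "http://GNTXN.US/S+"

# With doubled backslashes in the SQL literal, the engine receives the
# pattern that was tested on the regex site.
seen_with_doubled_backslashes = r"http:\/\/GNTXN.US\/\S+"

no_match = re.search(seen_with_single_backslashes, body)
match = re.search(seen_with_doubled_backslashes, body)

print(no_match)        # nothing found: "/qsx" does not start with "S"
print(match.group(0))  # the extracted URL
```

The forward slashes do not need escaping in either form; only \S depends on the backslash surviving.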
Related
I want to extract a set of characters between "u1=" and the first semi-colon using a regex. For instance, given the following string: id=1w54;name=nick;u1=blue;u2=male;u3=ohio;u5=
The desired regex output should be just blue.
I tested (?<=u1=)[^;]* on https://regex101.com and it works. However, when I run this in BigQuery, using regexp_extract(string, '(?<=u1=)[^;]*') , I get an error that reads "Cannot parse regular expression: invalid perl operator: (?<"
I'm confused why this isn't working in BQ. Any help would be appreciated.
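BigQuery's regex engine (RE2) does not support lookbehind such as (?<=...). Use a capturing group instead; regexp_extract() then returns only the text matched by the group: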
regexp_extract(string, 'u1=([^;]+)')
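The capturing-group approach can be checked outside BigQuery. This is a minimal sketch using Python's re module as a stand-in; `sample` and `value` are illustrative names, and taking `group(1)` mirrors what regexp_extract does when the pattern has one capturing group.

```python
import re

sample = "id=1w54;name=nick;u1=blue;u2=male;u3=ohio;u5="

# Match the literal prefix, then capture everything up to the next semicolon.
m = re.search(r"u1=([^;]+)", sample)

# Keep only the captured group, not the "u1=" prefix.
value = m.group(1) if m else None
print(value)
```

Because the pattern uses `+`, an empty field such as the trailing `u5=` gives no match at all rather than an empty string.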
I've looked at lots of examples for TRIM and REPLACE on the internet and for some reason I keep getting errors when I try.
I need to strip suffixes from my Netsuite item record names in a saved item search. There are three possible suffixes: -T, -D, -S. So I need to turn 24335-D into 24335, and 24335-S into 24335, and 24335-T into 24335.
Here's what I've tried and the errors I get:
Can you help me please? Note: I can't assume a specific character length of the starting string.
Use case: We already have a field on item records called Nickname with the suffixes stripped. But I've run into cases where Nickname is incorrect compared to Name. Ex: Name is 24335-D but Nickname is 24331-D. I'm trying to build a saved search alert that tells me any time the Nickname does not equal suffix-stripped Name.
PS: is there anywhere I can pay for quick a la carte Netsuite saved search questions like this? I feel bad relying on free technical internet advice but I greatly appreciate any help you can give me!
You are including too much SQL. A formula is a single result-field expression, not a full statement, so it takes no FROM or AS; the result column/field name is set elsewhere. One option here is REGEXP_REPLACE().
REGEXP_REPLACE({name},'\-[TDS]$', '')
Regex meaning:
\- : a literal -
[TDS] : one of T D or S
$ : end of line/string
To compare fields, a Formula (Numeric) with a CASE expression is useful because the result can be compared to a number in a filter, for example a simple "equal to 1".
CASE WHEN {custitem_nickname} <> REGEXP_REPLACE({name},'\-[TDS]$', '') then 1 else 0 end
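To see the strip-and-compare logic in isolation, here is a small sketch in Python, with the re module standing in for REGEXP_REPLACE. The function names `strip_suffix` and `mismatch_flag` are hypothetical, not anything NetSuite provides.

```python
import re


def strip_suffix(name: str) -> str:
    """Remove a trailing -T, -D or -S, leaving other names unchanged."""
    return re.sub(r"-[TDS]$", "", name)


def mismatch_flag(nickname: str, name: str) -> int:
    """Return 1 when nickname differs from the suffix-stripped name, else 0."""
    return 1 if nickname != strip_suffix(name) else 0


for item in ("24335-D", "24335-S", "24335-T", "24335"):
    print(item, "->", strip_suffix(item))

print(mismatch_flag("24331", "24335-D"))
print(mismatch_flag("24335", "24335-D"))
```

Anchoring on `$` means a hyphen in the middle of a name is left alone, and the length of the name does not matter.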
You are getting an error because TRIM can trim only one character: see the Oracle doc
https://docs.oracle.com/javadb/10.8.3.0/ref/rreftrimfunc.html (last example).
So try something like this (shown here for the -D suffix only):
TRIM(TRAILING '-' FROM TRIM(TRAILING 'D' FROM {entityid}))
And always keep in mind that saved searches run as Oracle SQL queries, so the Oracle SQL documentation can help you understand how to use the available functions.
I want to retrieve file names from URLs in SQL.
For example:
Input:
url:
https://www.google.co.in/root/subdir/file.extension?p1=v1&p2=v2
https://www.abxdhcak.com/sitemap-companies.xml
then Output should be:
file.extension
sitemap-companies.xml
To match your expected output you can use REGEXP_REPLACE
REGEXP_REPLACE(txt, '^.*/|\?.*$') as rg
This does 2 things:
'^.*/'
This removes all characters up to and including the last forward-slash in the string.
'\?.*$'
This removes all characters after and including a question mark.
This may not work for all cases, but it works for the examples provided.
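The two-part pattern can be tried on the sample URLs. This is a sketch using Python's re.sub as a stand-in for REGEXP_REPLACE with an empty replacement; the helper name `file_name` is made up for the example.

```python
import re


def file_name(url: str) -> str:
    # First alternative: everything up to and including the last "/".
    # Second alternative: the "?" and everything after it.
    return re.sub(r"^.*/|\?.*$", "", url)


urls = [
    "https://www.google.co.in/root/subdir/file.extension?p1=v1&p2=v2",
    "https://www.abxdhcak.com/sitemap-companies.xml",
]
names = [file_name(u) for u in urls]
print(names)
```

One case where it breaks: a query string that itself contains a slash, since the greedy `^.*/` then runs past the question mark.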
This is probably a simple problem but unfortunately I wasn't able to get the results I wanted.
I have the following input line
A[C1234/3/4]b[123/0]C[123/0]d[123/0]E[123/0]d[http://google.com]AD[M/1/2]g[ab]
I want to retrieve the numbers using regexp_extract in Hive
1/2
which are preceded by "AD[M/" in each case.
I am currently using
'\(AD([^)]+)\)' which gives output AD[M/1/2]g[ab]
Trying anything else, such as (//d*), gives a code 2 error. Please suggest possible replacements.
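Try this regex
.*AD\[M\/([^\]]*)\].*
By the way, () should be the capturing bracket pair, not \(\). The group uses [^\]]* rather than .* so that the capture stops at the first closing bracket; a greedy .* would run on to the last ] in the line and return 1/2]g[ab.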
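To check what the capture group actually returns on the sample line, here is a sketch in Python, whose re module is used as a stand-in for Hive's Java regex engine. It compares a greedy `(.*)` group with a negated character class; the variable names are illustrative.

```python
import re

line = (
    "A[C1234/3/4]b[123/0]C[123/0]d[123/0]E[123/0]"
    "d[http://google.com]AD[M/1/2]g[ab]"
)

# Greedy group: expands as far as it can, so it stops at the LAST "]".
greedy = re.search(r"AD\[M/(.*)\]", line).group(1)

# Negated class: cannot cross a "]", so it stops at the FIRST one.
bounded = re.search(r"AD\[M/([^\]]*)\]", line).group(1)

print(greedy)
print(bounded)
```

In Hive itself the backslashes inside the quoted pattern would typically need doubling, and the group is selected with the third argument of regexp_extract.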
I'm working with Google BigQuery and trying to extract some information from a string column into another column using Regexp_extract. In short:
Data in myVariable:
yippie/eggs-spam/?portlet:hungry=1234
yippie/eggs-spam/?portlet:hungry=456&portlet:hungrier=7890
I want a column with:
1234
456
My command:
SELECT Regexp_extract(myVariable, r'SOME_MAGIC') as result
FROM table
I tried for SOME_MAGIC:
hungry=(.*)[&$] - null, 456 (I learned that $ is taken literally inside a character class)
hungry=(.*)(&|$) - Error: Exactly one capturing group must be specified
hungry=(.*)^& - null, null
hungry=(&.*)?$ - null, null
I read this, but there the number has a fixed length. Also looked at this, but "?=" is no known command for perl.
Does anybody have an idea? Thank you in advance!
I just found an answer to how I can solve my problem differently:
hungry=([0-9]+) - 1234, 456
It isn't an answer to my abstract question (a regex that selects from Character A to [Character B or EOL]), so it's not that satisfying. E.g. it won't work with
yippie/eggs-spam/?portlet:hungry=12AB34
However my original problem is solved. I leave the question open for a while in case somebody has a better answer.
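For the abstract case (from a marker up to either a delimiter or the end of the string), one possible approach is a negated character class, which needs only a single capturing group and no lookahead. The sketch below uses Python's re module as a stand-in for BigQuery's engine; `extract_hungry` is a hypothetical helper.

```python
import re
from typing import Optional


def extract_hungry(value: str) -> Optional[str]:
    # [^&]* consumes everything that is not "&", so the capture ends at the
    # next "&" or, if there is none, at the end of the string.
    m = re.search(r"hungry=([^&]*)", value)
    return m.group(1) if m else None


rows = [
    "yippie/eggs-spam/?portlet:hungry=1234",
    "yippie/eggs-spam/?portlet:hungry=456&portlet:hungrier=7890",
    "yippie/eggs-spam/?portlet:hungry=12AB34",
]
results = [extract_hungry(r) for r in rows]
print(results)
```

The same idea carries over to any single delimiter character: put that character inside the negated class.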
I think I had a similar problem where I was trying to select the last 6 characters in a string (link_id) as a new column.
I kept getting this error:
Exactly one capturing group must be specified
My code originally was:
SELECT
...
REGEXP_EXTRACT(link_id, r'......$') AS updated_link_id
FROM sometable;
To get rid of the error and retrieve the correct substring as a column, I had to add parentheses around my regex string.
SELECT
...
REGEXP_EXTRACT(link_id, r'(......$)') AS updated_link_id
FROM sometable;
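The effect of the parentheses can be sketched in Python, with the re module standing in for the SQL function. The link_id value below is invented for illustration; reading `group(1)` corresponds to the single capturing group that the error message asked for.

```python
import re

link_id = "t3_abc123xyz"  # hypothetical sample value

# Six "any character" tokens anchored at the end, wrapped in one group.
m = re.search(r"(......$)", link_id)
updated_link_id = m.group(1) if m else None
print(updated_link_id)
```

Strings shorter than six characters give no match, so the result would be null for those rows.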