I am pretty new to string operations in SQL (Redshift). I want to extract part of a string from strings in the following formats:
http://bunnytalks.com/goingOn?name=Bunny&phone=2340
http://bunnytalks.com/goingOn?name=Bunny
http://bunnytalks.com/goingOn?name=Talks/whatson%goingOn%name%Bunny
http://bunnytalks.com/goingOn?name=Talks/whatson%goingOn%name%Bunny&phone=2340
The final output I need from any of the above strings when applying regex:
Bunny
From the examples above, I need the string that follows the last occurrence of name plus either = or %, running up to the end of the string or the next &.
I need a regex or any other SQL string operation that achieves this, as shown in the examples. Thanks in advance.
Try:
.*name[%=]([^&\n]+)
.*name - greedily match everything up to the last occurrence of name
[%=] - followed by % or =
([^&\n]+) - match all non-&, non-\n characters as group 1
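A quick way to sanity-check the pattern outside the database is to run it against the four sample URLs. The sketch below uses Python's re module purely as a test harness; extract_name is a hypothetical helper name, and the regex functions in Redshift may differ in syntax details.

```python
import re

# Pattern from the answer: the greedy ".*" pushes the match to the last "name".
PATTERN = re.compile(r".*name[%=]([^&\n]+)")

def extract_name(url):
    """Return the value after the last 'name=' or 'name%', or None if absent."""
    match = PATTERN.search(url)
    return match.group(1) if match else None

urls = [
    "http://bunnytalks.com/goingOn?name=Bunny&phone=2340",
    "http://bunnytalks.com/goingOn?name=Bunny",
    "http://bunnytalks.com/goingOn?name=Talks/whatson%goingOn%name%Bunny",
    "http://bunnytalks.com/goingOn?name=Talks/whatson%goingOn%name%Bunny&phone=2340",
]

for url in urls:
    print(extract_name(url))
```

Each of the four sample URLs yields Bunny, because group 1 stops at the first & or at the end of the string.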
How can I grab the number 2627995 from this string?
"hellotest/2627995?hl=en"
Here is my current regex, but it does not work when I use REGEXP_EXTRACT in BigQuery:
(\/)\d{7,7}
SELECT
REGEXP_EXTRACT(DESC, r"(\/)\d{7,7}")
AS number
FROM
`string table`
here is the output
Thank you!!
I think you just want to match all digits coming after the last path separator, before either the start of the query parameter, or the end of the URL.
SELECT REGEXP_EXTRACT(DESC, r"/(\d+)(?:\?|$)") AS number
FROM `string table`
Try this one: r"\/(\d+)"
Your code returns the slash because you captured it (see the parentheses in (\/)\d{7,7}). REGEXP_EXTRACT only returns the captured substring.
Thus, you could just wrap the other part of your regex with the parentheses:
SELECT
REGEXP_EXTRACT(DESC, r"/(\d{7})")
AS number
FROM
`string table`
NOTE:
In BigQuery, a regex is specified as a string literal, not as a regex literal (which is usually delimited with forward slashes), so you do not need to escape the / character; it is not a special regex metacharacter.
{7,7} is equivalent to the limiting quantifier {7}, meaning exactly seven occurrences.
Also, if you are sure the number is at the end of the string or is followed by a query string or fragment, you can refine it to
REGEXP_EXTRACT(DESC, r"/(\d+)(?:[?#]|$)")
where the regex means
/ - a / char
(\d+) - Group 1 (the actual output): one or more digits
(?:[?#]|$) - either ? or # char, or end of string.
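The difference between the original pattern and the corrected one can be checked outside BigQuery. This is a small sketch in Python; extract_id is a hypothetical helper name, and Python's re engine is not identical to the RE2 engine BigQuery uses, although the constructs used here are common to both.

```python
import re

def extract_id(text):
    """Return the digits after a slash that end at '?', '#', or end of string."""
    match = re.search(r"/(\d+)(?:[?#]|$)", text)
    return match.group(1) if match else None

sample = "hellotest/2627995?hl=en"

# The original pattern puts only the slash inside the capturing group,
# so group 1 is the slash rather than the number.
original = re.search(r"(\/)\d{7,7}", sample)
print(original.group(1))

# The corrected pattern captures the digits instead.
print(extract_id(sample))
```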
How do I write a SQL WHERE clause that checks whether a string contains some substring followed by a number? For example:
string: macsea01
where string like 'macsea' plus a number
Pattern matching is the most obvious solution to this question. Without more detail about the specific format of the string, I can suggest the following, which matches any value containing a letter immediately followed by a digit (this bracket syntax works in SQL Server's LIKE; other databases generally need a regex operator instead):
where column_name like '%[a-zA-Z][0-9]%'
If you're literally looking for macsea at the beginning of the string followed by a digit, it would be:
where column_name like 'macsea[0-9]%'
Regex seems to be a little slippery here. Depending on your needs, you can instead split the string into parts: take the text part first, then try to convert the rest of the string to a number.
Something like this (but I think this particular code is broken):
where substring(column_name, 1, 6) = 'macsea' and cast(substring(column_name, 7, 1000) as int) > 0
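To make the intended rule concrete, here is a minimal sketch of "prefix followed by a digit" written in Python, which can be used to check expected results for test values before translating the rule into a particular SQL dialect. The function name and the default prefix are assumptions for illustration.

```python
import re

def has_prefix_and_number(value, prefix="macsea"):
    """True if value starts with the prefix immediately followed by a digit."""
    # re.match anchors at the start of the string; re.escape guards against
    # prefixes that contain regex metacharacters.
    return re.match(re.escape(prefix) + r"\d", value) is not None

for candidate in ["macsea01", "macsea", "macseax1", "xmacsea01"]:
    print(candidate, has_prefix_and_number(candidate))
```

Only macsea01 passes: the others lack a digit right after the prefix or do not start with it.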
What I'm looking for is a way to check if a string follows a specific format.
Something Like:
if strPhoneNumber = format(XX-XXXX-XXXX) then
MsgBox("Correct")
else
MsgBox("Incorrect")
The specific format is two digits, a dash, four digits, a dash, four digits, e.g. 04-9567-3915
Thanks.
You want to parse a Regular Grammar, so use a regular expression:
' The ^ and $ anchors require the whole string to match, not just a part of it.
Dim regex As New Regex("^\d{2}-\d{4}-\d{4}$")
If regex.IsMatch(strPhoneNumber) Then
    MsgBox("Correct")
Else
    MsgBox("Incorrect")
End If
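The same whole-string check can be sketched in Python to show why anchoring matters: without it, a longer string that merely contains a valid number would also pass. The names PHONE_FORMAT and is_valid_phone are hypothetical.

```python
import re

# Two digits, a dash, four digits, a dash, four digits.
PHONE_FORMAT = re.compile(r"\d{2}-\d{4}-\d{4}")

def is_valid_phone(text):
    """True only if the entire string has the form XX-XXXX-XXXX."""
    # fullmatch requires the pattern to cover the whole string.
    return PHONE_FORMAT.fullmatch(text) is not None

for candidate in ["04-9567-3915", "04-9567-39155", "x04-9567-3915", "0495673915"]:
    print(candidate, is_valid_phone(candidate))
```

Only the first candidate is accepted; the second and third contain a valid number as a substring and would pass an unanchored search.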
I need to extract the part of a string that starts with originMode and runs up to URBAN98D....F0F" from the following string: version":"7.1.1","originMode":"URBAN98DC66F9-E141-408C-B6A5-99C727571F0F","ModeVersion":
I used the regex below:
regexp_extract(string_content ,'^.*originMode\"\:\"(URBAN+)\"',0 )
I am able to extract up to URBAN if I use this expression:
regexp_extract(string_content ,'^.*originMode\"\:\"URBAN',0 )
Any help is highly appreciated.
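URBAN+ only matches URBA followed by one or more N characters, so it cannot reach the closing quote. You need to use a negated character class to consume everything up to that quote.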
regexp_extract(string_content ,'^.*\boriginMode\"\:\"(URBAN[^\"]*)\"',0 )
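The effect of the negated character class can be illustrated with a short Python sketch run against the sample string from the question. extract_origin_mode is a hypothetical helper name, and the SQL engine's regex dialect may differ in escaping rules.

```python
import re

content = ('version":"7.1.1","originMode":'
           '"URBAN98DC66F9-E141-408C-B6A5-99C727571F0F","ModeVersion":')

def extract_origin_mode(text):
    """Return the originMode value starting with URBAN, or None."""
    # [^"]* consumes every character up to (not including) the closing quote.
    match = re.search(r'\boriginMode":"(URBAN[^"]*)"', text)
    return match.group(1) if match else None

# The original attempt: "N+" repeats only the letter N, so the required
# closing quote is never reached and there is no match at all.
failed = re.search(r'originMode":"(URBAN+)"', content)
print(failed)

print(extract_origin_mode(content))
```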
I'm trying to parse this string 'Smith, Joe M_16282' to get everything before the comma, combined with everything after the underscore.
The resulting string would be: Smith16282
string longName = "Smith, Joe M_16282";
string shortName = longName.Substring(0, longName.IndexOf(",")) + longName.Substring(longName.LastIndexOf("_") + 1);
Notes:
The second "substring" doesn't need a length parameter, because we want everything after the underscore
The LastIndexOf is used instead of IndexOf in case there are other underscores appearing in the name such as "Smith_Jones, Joe M_16282"
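This code assumes that there is at least one comma and at least one underscore in the string "longName." If there is no comma, Substring throws an exception; if there is no underscore, the whole string is appended instead of just the number. I will leave that checking to you if you need it.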
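For comparison, here is a minimal sketch of the same index-based approach with the missing-separator checks included. It is written in Python rather than C#, the function name shorten is hypothetical, and returning None for malformed input is one possible choice among several.

```python
def shorten(long_name):
    """Join the text before the first comma with the text after the last underscore."""
    comma = long_name.find(",")        # first comma, or -1 if there is none
    underscore = long_name.rfind("_")  # last underscore, or -1 if there is none
    # Reject input that lacks either separator, or whose last underscore
    # comes before the comma.
    if comma == -1 or underscore < comma:
        return None
    return long_name[:comma] + long_name[underscore + 1:]

print(shorten("Smith, Joe M_16282"))
print(shorten("Smith_Jones, Joe M_16282"))
print(shorten("no separators"))
```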
As others have said, the simple approach for parsing a string like that would be to use the String's various parsing methods, such as IndexOf and Substring. If you want something more powerful and flexible, you may also want to consider using a RegEx replacement. For instance, you could do something like this:
Dim input As String = "Smith, Joe M_16282"
Dim pattern As String = "(.*?),.*?_(.*)"
Dim replacement As String = "$1$2"
Dim output As String = Regex.Replace(input, pattern, replacement)
Or, more simply:
Dim output As String = Regex.Replace("Smith, Joe M_16282", "(.*?),.*?_(.*)", "$1$2")
Here's the meaning of the pattern:
(.*?) - The first group capturing all of the characters before the comma
  ( - Starts the capturing group
  . - This is a wildcard which matches any character
  * - Specifies that the previous thing (any character) is repeated any number of times
  ? - Specifies that the * is non-greedy, meaning it won't match everything until the end of the string--it will only match until it finds the following comma
  ) - Ends the capturing group
, - The comma to look for
.*? - Says that there will be any number of any characters between the comma and the underscore which we don't care about
  . - Any character
  * - Any number of times
  ? - Until you find the underscore
_ - The underscore to look for
(.*) - The second group capturing all of the characters after the underscore
  ( - Starts the capturing group
  . - Any character
  * - Any number of times
  ) - Ends the capturing group
Here's the meaning of the replacement:
$1 - The value of all of the characters found in the first capturing group
$2 - The value of all of the characters found in the second capturing group
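The pattern and replacement described above can be tried out quickly in Python as well. This is only a sketch: shorten_with_regex is a hypothetical name, and Python writes the group references as \1 and \2 where .NET uses $1 and $2.

```python
import re

def shorten_with_regex(long_name):
    """Keep the text before the comma and the text after the underscore."""
    # Group 1: lazily everything before the first comma.
    # Group 2: everything after the first underscore that follows the comma.
    return re.sub(r"(.*?),.*?_(.*)", r"\1\2", long_name)

print(shorten_with_regex("Smith, Joe M_16282"))
print(shorten_with_regex("no separators"))
```

When the input has no comma or underscore, the pattern does not match and the string comes back unchanged rather than raising an error.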
RegEx may be overkill for your particular situation, but it is a very handy tool to learn. One major advantage is that you could move the pattern and replacement values out into external settings in the app.config, or somewhere. Then, you could modify the replacement rules without recompiling your application.