I need to match two ipaddress/hostname with a regular expression:
Like 20.20.20.20
should match with 20.20.20.20
should match with [http://20.20.20.20/abcd]
should not match with 20.20.20.200
should not match with [http://20.20.20.200/abcd]
should not match with [http://120.20.20.20/abcd]
should match with AB_20.20.20.20
should match with 20.20.20.20_AB
At present i am using something like this regular expression: "(.*[^(\w)]|^)20.20.20.20([^(\w)].*|$)"
But it is not working for the last two cases. As the "\w" is equal to [a-zA-Z0-9_]. Here I also want to eliminate the "_" underscore. I tried different combination but not able to succeed. Please help me with this regular expression.
(.*[_]|[^(\w)]|^)10.10.10.10([_]|[^(\w)].*|$)
I spent some more time on this.This regular expression seems to work.
I don't know which language you're using, but with Perl-like regular expressions you could use the following, shorter expression:
(?:\b|\D)20\.20\.20\.20(?:\b|\D)
This effectively says:
Match word boundary (\b, here: the start of the word) or a non-digit (\D).
Match IP address.
Match word boundary (\b, here: the end of the word) or a non-digit (\D).
Note 1: ?: causes the grouping (\b|\D) not to create a backreference, i.e. to store what it has found. You probably don't need the word boundaries/non-digits to be stored. If you actually need them stored, just remove the two ?:s.
Note 2: This might be nit-picking, but you need to escape the dots in the IP address part of the regular expression, otherwise you'd also match any other character at those positions. Using 20.20.20.20 instead of 20\.20\.20\.20, you might for example match a line carrying a timestamp when you're searching through a log file...
2012-07-18 20:20:20,20 INFO Application startup successful, IP=20.20.20.200
...even though you're looking for IP addresses and that particular one (20.20.20.200) explicitly shouldn't match, according to your question. Admittedly though, this example is quite an edge case.
Related
Am trying to determine how one attempts to identify, in Snowflake SQL, if a product code begins with three letters.
Suggestions?
I did just try: LEFT(P0.PRODUCTCODE,3) NOT LIKE '[a-zA-Z]%' but it didn't work.
Thanks folks
You can use REGEXP_LIKE to return a boolean value indicating whether or not your string matched the pattern you're interested in.
In your case, something like REGEXP_LIKE(string_field_here, '[a-zA-Z]{3}.*')
Breaking down the regular expression pattern:
[a-zA-Z]: Only match letter characters, both upper and lowercase
{3}: Require three of those letters
.*: Allow any number of any characters after those three letters
Note: in many cases, you would need to specifically indicate the beginning/ending of the string in the pattern, but Snowflake's implementation handles that for you. From the docs:
The function implicitly anchors a pattern at both ends (i.e. ''
automatically becomes '^$', and 'ABC' automatically becomes '^ABC$').
To match any string starting with ABC, the pattern would be 'ABC.*'.
You can try running these examples:
SELECT REGEXP_LIKE('abc', '[a-zA-Z]{3}.*') AS _abc,
REGEXP_LIKE('123', '[a-zA-Z]{3}.*') AS _123,
REGEXP_LIKE('abc123', '[a-zA-Z]{3}.*') AS _abc123,
REGEXP_LIKE('123abc', '[a-zA-Z]{3}.*') AS _123abc
I don't fully understand, why the results are different here. Does :ov apply only to <left>, so having found the longest match it wouldn't do anything else?
my regex left {
a | ab
}
my regex right {
bc | c
}
"abc" ~~ m:ex/<left><right>
{put $<left>, '|', $<right>}/; # 'ab|c' and 'a|bc'
say '---';
"abc" ~~ m:ov/<left><right>
{put $<left>, '|', $<right>}/; # only 'ab|c'
Types of adverbs
It's important to understand that there are two different types of regex adverbs:
Those that fine-tune how your regex code is compiled (e.g. :sigspace/:s, :ignorecase/:i, ...). These can also be written inside the regex, and only apply to the rest of their lexical scope within the regex.
Those that control how regex matches are found and returned (e.g. :exhaustive/:ex, :overlap/:ov, :global/:g). These apply to a given regex matching operation as a whole, and have to be written outside the regex, as an adverb of the m// operator or .match method.
Match adverbs
Here is what the relevant adverbs of the second type do:
m:ex/.../ finds every possible match at every possible starting position.
m:ov/.../ finds the first possible match at every possible starting position.
m:g/.../ finds the first possible match at every possible starting position that comes after the end of the previous match (i.e., non-overlapping).
m/.../ finds the first possible match at the first possible starting position.
(In each case, the regex engine moves on as soon as it has found what it was meant to find at any given position, that's why you don't see additional output even by putting print statements inside the regexes.)
Your example
In your case, there are only two possible matches: ab|c and a|bc.
Both start at the same position in the input string, namely at position 0.
So only m:ex/.../ will find both of them – all the other variants will only find one of them and then move on.
:ex will find all possible combinations of overlapping matches.
:ov acts like :ex except that it limits the search algorithm by constraining it to find only a single match for a given starting position, causing it to produce a single match for a given length. :ex is allowed to start from the very beginning of the string to find a new unique match, and so it may find several matches of length 3; :ov will only ever find exactly one match of length 3.
Documentation:
https://docs.perl6.org/language/regexes
Exhaustive:
To find all possible matches of a regex – including overlapping ones – and several ones that start at the same position, use the :exhaustive (short :ex) adverb
Overlapping:
To get several matches, including overlapping matches, but only one (the longest) from each starting position, specify the :overlap (short :ov) adverb:
I have a test string such as: The Sun and the Moon together, forever
I want to be able to type a few characters or words and be able to match this string if the characters appear in the correct sequence together, even if there are missing words. For example, the following search word(s) should all match against this string:
The Moon
Sun tog
Tsmoon
The get ever
What regex pattern should I be using for this? I should add that the supplied test strings are going to be dynamic within an app, and so I'd like to be able to use a pattern based on the search string.
From your example Tsmoon you show partial words (T), ignoring case (s, m) and allow anything between each entered character. So as a first attempt you can:
Set the ignore case option
Between each chapter input insert the regular expression to match zero or more of anything. You can choose whether to match the shortest or longest run.
Try that, reading the documentation for NSRegularExpression if you're stuck, and see how it goes. If you get stuck ask a new question showing your code and the RE constructed and explain what happens/doesn't work as expected.
HTH
Using Regular Extractor in JMeter, I need to get the value of "fullBkupUNIXTime" from the below response,
{"fullBackupTimeString":["Mon 10 Apr 2017 14:14:36"],"fullBkupUNIXTime":["1491833676"],"fullBackupDirName":["10_04_2017_0636"]}
I tried with Ref Name as time and
Regular Expression: "fullBkupUNIXTime": "([0-9])" and "(.+?)"
and pass them as input for 2nd request ${time}
The above 2 two doesn't work out for me.
Please Help me out of this.
First of all: why not just use this thing?
Then, if you firm with your RegExp adventure to get happen.
First expression is not going to work because you've defined it to match exactly one [0-9] charcter.
Add the appropriate repetition character, like "fullBkupUNIXTime": "([0-9]+)".
And basically it make sense to tell the engine to stop at first narrowest match too: "fullBkupUNIXTime": "([0-9]+?)"
Next, make sure you're handling space chars between key and value and colon mark properly. Better mark them explicitly, if any, with \s
And last but not least: make sure you're properly handle multiple lines (if appropriate, of course). Add the (?m) modifier to your expression.
And/or (?im) to be not case-sensitive, in addition.
[ is a reserve character in regex, you need to escape it, in your case use:
Regular Expression fullBkupUNIXTime":\["(\d+)
Template: $1$
Match No.: 1
I'd like to create a regular expression such that when I compare the a string against an array of strings, matches are returned with the regex ignoring certain characters.
Here's one example. Consider the following array of names:
{
"Andy O'Brien",
"Bob O'Brian",
"Jim OBrien",
"Larry Oberlin"
}
If a user enters "ob", I'd like the app to apply a regex predicate to the array and all of the names in the above array would match (e.g. the ' is ignored).
I know I can run the match twice, first against each name and second against each name with the ignored chars stripped from the string. I'd rather this by done by a single regex so I don't need two passes.
Is this possible? This is for an iOS app and I'm using NSPredicate.
EDIT: clarification on use
From the initial answers I realized I wasn't clear. The example above is a specific one. I need a general solution where the array of names is a large array with diverse names and the string I am matching against is entered by the user. So I can't hard code the regex like [o]'?[b].
Also, I know how to do case-insensitive searches so don't need the answer to focus on that. Just need a solution to ignore the chars I don't want to match against.
Since you have discarded all the answers showing the ways it can be done, you are left with the answer:
NO, this cannot be done. Regex does not have an option to 'ignore' characters. Your only options are to modify the regex to match them, or to do a pass on your source text to get rid of the characters you want to ignore and then match against that. (Of course, then you may have the problem of correlating your 'cleaned' text with the actual source text.)
If I understand correctly, you want a way to match the characters "ob" 1) regardless of capitalization, and 2) regardless of whether there is an apostrophe in between them. That should be easy enough.
1) Use a case-insensitivity modifier, or use a regexp that specifies that the capital and lowercase version of the letter are both acceptable: [Oo][Bb]
2) Use the ? modifier to indicate that a character may be present either one or zero times. o'?b will match both "o'b" and "ob". If you want to include other characters that may or may not be present, you can group them with the apostrophe. For example, o['-~]?b will match "ob", "o'b", "o-b", and "o~b".
So the complete answer would be [Oo]'?[Bb].
Update: The OP asked for a solution that would cause the given character to be ignored in an arbitrary search string. You can do this by inserting '? after every character of the search string. For example, if you were given the search string oleary, you'd transform it into o'?l'?e'?a'?r'?y'?. Foolproof, though probably not optimal for performance. Note that this would match "o'leary" but also "o'lea'r'y'" if that's a concern.
In this particular case, just throw the set of characters into the middle of the regex as optional. This works specifically because you have only two characters in your match string, otherwise the regex might get a bit verbose. For example, match case-insensitive against:
o[']*b
You can add more characters to that character class in the middle to ignore them. Note that the * matches any number of characters (so O'''Brien will match) - for a single instance, change to ?:
o[']?b
You can make particular characters optional with a question mark, which means that it will match whether they're there or not, e.g:
/o\'?b/
Would match all of the above, add .+ to either side to match all other characters, and a space to denote the start of the surname:
/.+? o\'?b.+/
And use the case-insensitivity modifier to make it match regardless of capitalisation.