T-SQL Contains not working for values with point as 2nd or 3rd symbol - sql

There is a SQL statement, generated by stored procedure, looking like this:
SELECT Id, Name
FROM UInstitutions as UI
WHERE Contains(UI.Name, @ParamName)
It seems that if a value has a dot (.) as its second or third character, it cannot be found, whether searching by exact match or by substring. E.g.:
[dbo].[FindRecords] N'b.la bla'
or
[dbo].[FindRecords] N'bl.a bla'
return nothing, while
[dbo].[FindRecords] N'bla. bla'
returns
Id Name
--------------
1388 b.la bla
1389 bl.a bla
1386 bla bla
1390 bla. bla
What could be the reason for this, and how to fix it?

Per MSDN for the Contains statement:
Punctuation is ignored. Therefore, CONTAINS(testing, "computer failure") matches a row with the value, "Where is my computer? Failure to find it would be expensive." For more information on word-breaker behavior, see Configure and Manage Word Breakers and Stemmers for Search.
CONTAINS (Transact-SQL)
Also see: Configure and Manage Word Breakers and Stemmers for Search

Related

regex findText with exception

I have a jenkins pipeline in which I've always been searching for a string in case I need to update my build to failure.
findText regexp: 'ERROR', alsoCheckConsoleOutput: true, unstableIfFound: true
https://www.jenkins.io/doc/pipeline/steps/text-finder/
I now need to make an exception so that certain cases of ERROR are handled differently.
I was looking at the question "Regular expression to match a line that doesn't contain a word" and the question "Regex with exception of particular words", but I still could not make it work.
I really suck at regex, and even after playing with the different tester websites I'm struggling a bit with the negative lookaround parts.
Basically I want to ensure that
This is an ERROR bla bla bla
Is an error caught by my regex
But I need to have this as not an error because XYZ shows up before
This XYZ is not an ERROR bla bla bla because of the prefix
I was trying something like this but not really close.
(ERROR)+?:(XYZ)
Basically ERROR can appear more than once in my output lines but XYZ should never be found. If it helps XYZ will always show up before the ERROR
Any idea?
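One possible approach, sketched in Python only because it is easy to run: a negative lookahead anchored at the start of the line rejects any line that contains XYZ, and only then looks for ERROR. This assumes the pattern is tested against one line at a time. The two sample lines are the ones from the question; the names PATTERN and is_error are purely illustrative and not part of Jenkins.

```python
import re

# Reject the whole line if XYZ occurs anywhere in it, otherwise look for ERROR.
PATTERN = re.compile(r"^(?!.*XYZ).*ERROR")

def is_error(line: str) -> bool:
    """Return True when the line contains ERROR and no XYZ."""
    return PATTERN.search(line) is not None

print(is_error("This is an ERROR bla bla bla"))  # True
print(is_error("This XYZ is not an ERROR bla bla bla because of the prefix"))  # False
```

The lookahead syntax itself is not Python-specific, so the same pattern string could be tried as the regexp value of findText, but I have not verified how the plugin feeds lines to the matcher.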

regex not working correctly when the test is fine

For my database, I have a list of company numbers where some of them start with two letters. I have created a regex which should eliminate these from a query and according to my tests, it should. But when executed, the result still contains the numbers with letters.
Here is my regex, which I've tested on https://www.regexpal.com
([^A-Z+|a-z+].*)
I've tested it against numerous variations such as SC08093, ZC000191 and NI232312 which shouldn't match and don't in the tests, which is fine.
My SQL query looks like this:
SELECT companyNumber FROM company_data
WHERE companyNumber ~ '([^A-Z+|a-z+].*)' order by companyNumber desc
To summarise, strings like SC08093 should not match as they start with letters.
I've read through the documentation for postgres but I couldn't seem to find anything regarding this. I'm not sure what I'm missing here. Thanks.
The ~ '([^A-Z+|a-z+].*)' condition does not work because ~ is a regex matching operation that returns true even upon a partial match (it does not require a full string match, so the pattern can match anywhere in the string). [^A-Z+|a-z+].* matches any char other than a letter from A to Z, +, | or a letter from a to z, and then any zero or more chars, anywhere inside a string.
You may use
WHERE companyNumber NOT SIMILAR TO '[A-Za-z]{2}%'
See the online demo
Here, NOT SIMILAR TO returns the inverse result of the SIMILAR TO operation. This SIMILAR TO operator accepts patterns that are almost regex patterns, but are also like regular wildcard patterns. NOT SIMILAR TO '[A-Za-z]{2}%' means all records that start with two ASCII letters ([A-Za-z]{2}) and having anything after (%) are NOT returned and all others will be returned. Note that SIMILAR TO requires a full string match, same as LIKE.
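To make the partial-match behaviour concrete, here is a small sketch in Python, whose re.search also succeeds on a match anywhere in the string. The first three values are from the question; 12345678 is a made-up all-digit value, and the fullmatch call only imitates the full-string requirement of SIMILAR TO, it is not PostgreSQL itself.

```python
import re

values = ["SC08093", "ZC000191", "NI232312", "12345678"]  # last value is made up

# Unanchored search: the negated class can match the first digit anywhere.
unanchored = [v for v in values if re.search(r"[^A-Z+|a-z+].*", v)]

# Imitation of NOT SIMILAR TO '[A-Za-z]{2}%': the whole string must match.
no_letter_prefix = [v for v in values if not re.fullmatch(r"[A-Za-z]{2}.*", v)]

print(unanchored)        # every value, including the letter-prefixed ones
print(no_letter_prefix)  # ['12345678']
```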
Your pattern: [^A-Z+|a-z+].* means "a string where at least some characters are not A-Z" - to extend that to the whole string you would need to use an anchored regex as shown by S-Man (the group defined with (..) isn't really necessary btw)
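I would probably use a regex that specifies what the valid pattern is, i.e. a value that starts with a digit:
where company ~ '^[0-9].*$'
^[0-9].*$ means "starts with a digit, followed by anything" (a pattern for "only consists of digits" would be ^[0-9]+$). To list the invalid values instead, negate the match with !~, which means "does not match":
where company !~ '^[0-9].*$'
or
where not (company ~ '^[0-9].*$')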
"Does not start with a letter" could be done with
WHERE company ~ '^[^A-Za-z].*'
demo: db<>fiddle
The first ^ marks the beginning. The [^A-Za-z] says "no letter" (including small and capital letters).
Edit: Changed [A-z] into the more precise [A-Za-z] (Why is this regex allowing a caret?)
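The difference between [A-z] and [A-Za-z] mentioned in the edit can be checked quickly. A sketch in Python, where character ranges behave the same way for this pattern; the sample strings other than SC08093 are invented for illustration.

```python
import re

samples = ["SC08093", "08093", "^8093", "_8093"]

# [A-z] spans ASCII 65..122, which also covers [ \ ] ^ _ ` between Z and a.
loose = [s for s in samples if re.search(r"^[^A-z]", s)]

# [A-Za-z] covers letters only, so ^ and _ count as "not a letter".
precise = [s for s in samples if re.search(r"^[^A-Za-z]", s)]

print(loose)    # ['08093']
print(precise)  # ['08093', '^8093', '_8093']
```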

Match beginning of string with lookbehind and named group

I need help matching the full message with a lookbehind.
Let's say I have the following simplified string:
1 hostname Here is some Text
At the beginning I could have 1 or 2 digits followed by a space, which I would ignore.
Then I need the first word captured as "host".
Then I would like to look behind to the first space, so that the capture group "message" has everything after the leading digits and space, i.e. "hostname Here is some Text".
My regex is:
^[1-9]\d{0,2}\s(?<host>[\w][\w\d\.#-]*)\s(?<message>(?<=\s).*$)
this gives me:
host = "hostname"
message = "Here is some Text"
I can't figure out what my lookbehind needs to look like.
Thanks for your help.
OK, I found it. The message group needs to come first, with everything else, including the host group, nested inside it:
^[1-9]\d{0,2}\s(?<message>(?<host>[\w][\w\d\.#-]*)\s.*$)
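A quick check of the nested-group version, sketched in Python. Note that Python spells named groups (?P<name>...) rather than (?<name>...); apart from that the pattern is the one above, and the input is the sample string from the question.

```python
import re

# Same pattern as above, with Python's (?P<name>...) named-group syntax.
pattern = re.compile(
    r"^[1-9]\d{0,2}\s(?P<message>(?P<host>[\w][\w\d\.#-]*)\s.*$)"
)

m = pattern.match("1 hostname Here is some Text")
print(m.group("host"))     # hostname
print(m.group("message"))  # hostname Here is some Text
```

Because the host group sits inside the message group, both capture from the same starting position and no lookbehind is needed.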

Regex to match BIN ranges

I'm trying to write a regex that matches the numbers 456725 to 456744 (Last 2 digits, 25-44), but can't seem to figure out a correct regex format. I've tried ^(4567[2-4][0-9]) but using this also matches 456745 which it shouldn't.
If you do it like ^(4567[2-4][0-9]), you are allowing any number in the range between [2-4] together with any number in the range between [0-9], which is obviously not what you wanted.
So you need to change it to something like:
^4567(?:2[5-9]|3[0-9]|4[0-4])
Explanation
^ asserts position at start of the string
4567 matches the characters 4567 literally
Non-capturing group (?:2[5-9]|3[0-9]|4[0-4])
  1st Alternative 2[5-9]
    2 matches the character 2 literally
    Match a single character present in the list [5-9]
  2nd Alternative 3[0-9]
    3 matches the character 3 literally
    Match a single character present in the list [0-9]
  3rd Alternative 4[0-4]
    4 matches the character 4 literally
    Match a single character present in the list [0-4]
You could use the page regex101 to learn more and read good explanations on the subject. Hope it helps.
If your variable is just an integer it is best to just compare it as such...
As for the regex: the ^(4567 part is correct; your issue is that [2-4] and [0-9] are independent of each other. You need to put the pieces together so that only 25-29, 30-39 and 40-44 are allowed.
This should get you on the right track:
^(4567(?:2[5-9]|3[0-9]|4[0-4]))$
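Both suggestions can be checked by brute force. A minimal sketch in Python, testing every six-digit number that starts with 4567 against the anchored pattern and against a plain integer comparison; the names BIN_RE, matched and in_range are made up for this example.

```python
import re

BIN_RE = re.compile(r"^4567(?:2[5-9]|3[0-9]|4[0-4])$")

# Every six-digit number starting with 4567 that the regex accepts.
matched = [n for n in range(456700, 456800) if BIN_RE.match(str(n))]

def in_range(n: int) -> bool:
    """Plain integer comparison, as suggested for integer inputs."""
    return 456725 <= n <= 456744

print(matched[0], matched[-1], len(matched))  # 456725 456744 20
print(all(in_range(n) for n in matched))      # True
```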

How should a string be matched with a regular expression in Objective C

I'm finding it hard to match strings using NSRegularExpression. Generic alpha characters are not a problem with [a-z] but if I need to match a word like 'import' I'm struggling to make it work. I'm sure I have to escape the word in some manner but I can't find any docs around this. A really basic example would be
{{import "hello"}}
where I want to get hold of the string: hello
edit: to clarify - 'hello' could be any string - it's the bit I want returned
This regular expression matches the text between the double quotes in your example:
\{\{import "([^"]+)"\}\}
The match will be stored in the first match group.
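The pattern can be tried out independently of NSRegularExpression. A small sketch in Python, used here only as a convenient regex tester; in an Objective-C string literal the backslashes and the double quotes inside the pattern would additionally need to be escaped.

```python
import re

# Same pattern as above; the parentheses capture the text between the quotes.
IMPORT_RE = re.compile(r'\{\{import "([^"]+)"\}\}')

m = IMPORT_RE.search('{{import "hello"}}')
print(m.group(1))  # hello
```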