How to use Regex to lowercase catalogue values without any logic codes - sql

For a loan domain we pass some catalogue values, e.g. whether a customer is a primary or secondary customer. I need to check the values irrespective of case (uppercase, lowercase, camel case). The software I am using accepts only regex, not Java or JS code (it uses a different scripting language). I am trying to convert the value using only regex but I still get an error.
If catalogue_value ~"(/A-Z/)" then
Catalogue_value ~"/l"
Endif
I am still learning regex and figuring out the correct expressions to use.
Please tell me the correct regex format for converting a value to lowercase / uppercase.

If I understood your problem, you want to search without worrying about case. For example, the data is Paul, and you want to find this record when searching for PAUL, paul, PaUl, etc.?
One common technique is to convert both sides to upper or lower case, without regex. For example, in JavaScript:
"Paul".toLowerCase() === "paUL".toLowerCase()
In SQL:
select case when LOWER('Paul') = LOWER('paUL') then 1 else 0 end
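For comparison, here is a minimal Python sketch of the same idea, plus a regex-only variant using the inline `(?i)` flag. The function names are made up for illustration, and whether the asker's tool accepts `(?i)` is an assumption that would need checking against its documentation.

```python
import re

def same_ignoring_case(a: str, b: str) -> bool:
    # Normalise both sides to lowercase before comparing.
    return a.lower() == b.lower()

def matches_ignoring_case(expected: str, value: str) -> bool:
    # Regex-only variant: the (?i) inline flag makes the match case-insensitive.
    # re.escape keeps any special characters in `expected` literal.
    return re.fullmatch("(?i)" + re.escape(expected), value) is not None

print(same_ignoring_case("Paul", "paUL"))
print(matches_ignoring_case("primary", "PRIMARY"))
```

Note that neither variant changes the stored value; a regex match can only test the value, it cannot rewrite it to lowercase.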

Related

Regex matching everything except specific words

I have looked through the other questions asked on excluding regex, but I was unable to find the answer to my question.
I have the SQL statement
select --(* vendor(microsoft), product(odbc) guid'12345678-1234-1234-1234-123456789012' *)-- from TAB
With regex, I want to find every single character in that string, except
--(* vendor(microsoft), product(odbc)
and
*)--
The vendor and product names (microsoft and odbc) could be anything as well, I still want to exclude it.
I don't care what kind of characters there are, or if the SQL statement is even syntactically correct. The string could look like this, and I still want to find everything, including whitespaces, excluding what I mentioned above:
{Jane Doe?= --(* vendor(micro1macro2?), product(cdb!o) 123$% --(**) *)-- = ?
So far, I have this expression:
(--\(\* vendor\(.*\), product\(.*?\))|(\*\)--)
Which seems to work in finding what I want to exclude https://regex101.com/r/rMbYHz/204. However, I'm unable to negate it.
Does replace() do what you want?
select replace(replace(t.col, '--(* vendor(microsoft), product(odbc)', ''),
               '*)--', '')
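If the vendor and product names really can be anything, one alternative is to delete the unwanted parts with the asker's own pattern instead of trying to negate it. The following is a rough Python sketch under that assumption; `strip_markers` is a hypothetical helper name, and the vendor group is made lazy (`.*?`) so it cannot run past the first closing parenthesis.

```python
import re

# The asker's pattern, with both groups lazy: either the opening
# "--(* vendor(...), product(...)" marker or the closing "*)--" marker.
MARKERS = re.compile(r"--\(\* vendor\(.*?\), product\(.*?\)|\*\)--")

def strip_markers(text: str) -> str:
    # Remove every marker and keep everything else, whitespace included.
    return MARKERS.sub("", text)

sql = ("select --(* vendor(microsoft), product(odbc) "
       "guid'12345678-1234-1234-1234-123456789012' *)-- from TAB")
print(strip_markers(sql))
```

What remains after the substitution is exactly the text the question wants to "find".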

Usage of Regular Expression Extractor JMeter?

Using the Regular Expression Extractor in JMeter, I need to get the value of "fullBkupUNIXTime" from the response below:
{"fullBackupTimeString":["Mon 10 Apr 2017 14:14:36"],"fullBkupUNIXTime":["1491833676"],"fullBackupDirName":["10_04_2017_0636"]}
I tried with Ref Name as time and
Regular Expression: "fullBkupUNIXTime": "([0-9])" and "(.+?)"
and passed the result as input to the 2nd request as ${time}.
Neither of the two expressions works for me.
Please help me out with this.
First of all: why not just use this thing?
Then, if you are set on the regex route, read on.
The first expression will not work because you've defined it to match exactly one [0-9] character.
Add the appropriate repetition quantifier, like "fullBkupUNIXTime": "([0-9]+)".
It also makes sense to tell the engine to stop at the first, narrowest match: "fullBkupUNIXTime": "([0-9]+?)"
Next, make sure you handle any whitespace between the key, the colon and the value properly. It is better to match it explicitly, if there is any, with \s
And last but not least: make sure you handle multiple lines properly (if appropriate, of course). Add the (?m) modifier to your expression.
Or use (?im) to make it case-insensitive as well.
[ is a reserved character in regex, so you need to escape it. In your case use:
Regular Expression: fullBkupUNIXTime":\["(\d+)
Template: $1$
Match No.: 1
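To check the expression outside JMeter, here is a small Python sketch that applies the same pattern to the response from the question. It only illustrates the regex itself; the variable names are made up, and JMeter's own template and match-number handling are not modelled.

```python
import re

response = ('{"fullBackupTimeString":["Mon 10 Apr 2017 14:14:36"],'
            '"fullBkupUNIXTime":["1491833676"],'
            '"fullBackupDirName":["10_04_2017_0636"]}')

# Same pattern as above: the literal [ is escaped, and group 1 captures the digits.
pattern = re.compile(r'fullBkupUNIXTime":\["(\d+)')

match = pattern.search(response)
time_value = match.group(1) if match else None
print(time_value)
```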

Is it possible to ignore characters in a string when matching with a regular expression

I'd like to create a regular expression such that when I compare a string against an array of strings, matches are returned with the regex ignoring certain characters.
Here's one example. Consider the following array of names:
{
"Andy O'Brien",
"Bob O'Brian",
"Jim OBrien",
"Larry Oberlin"
}
If a user enters "ob", I'd like the app to apply a regex predicate to the array and all of the names in the above array would match (e.g. the ' is ignored).
I know I can run the match twice, first against each name and second against each name with the ignored chars stripped from the string. I'd rather this be done by a single regex so I don't need two passes.
Is this possible? This is for an iOS app and I'm using NSPredicate.
EDIT: clarification on use
From the initial answers I realized I wasn't clear. The example above is a specific one. I need a general solution where the array of names is a large array with diverse names and the string I am matching against is entered by the user. So I can't hard code the regex like [o]'?[b].
Also, I know how to do case-insensitive searches so don't need the answer to focus on that. Just need a solution to ignore the chars I don't want to match against.
Since you have discarded all the answers showing the ways it can be done, you are left with the answer:
NO, this cannot be done. Regex does not have an option to 'ignore' characters. Your only options are to modify the regex to match them, or to do a pass on your source text to get rid of the characters you want to ignore and then match against that. (Of course, then you may have the problem of correlating your 'cleaned' text with the actual source text.)
If I understand correctly, you want a way to match the characters "ob" 1) regardless of capitalization, and 2) regardless of whether there is an apostrophe in between them. That should be easy enough.
1) Use a case-insensitivity modifier, or use a regexp that specifies that the capital and lowercase version of the letter are both acceptable: [Oo][Bb]
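2) Use the ? modifier to indicate that a character may be present either one or zero times. o'?b will match both "o'b" and "ob". If you want to include other characters that may or may not be present, you can group them with the apostrophe in a character class. For example, o['~-]?b will match "ob", "o'b", "o-b", and "o~b". Keep the hyphen at the end of the class: written as ['-~] it would be read as a range from ' to ~ and match most printable characters.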
So the complete answer would be [Oo]'?[Bb].
Update: The OP asked for a solution that would cause the given character to be ignored in an arbitrary search string. You can do this by inserting '? after every character of the search string. For example, if you were given the search string oleary, you'd transform it into o'?l'?e'?a'?r'?y'?. Foolproof, though probably not optimal for performance. Note that this would match "o'leary" but also "o'lea'r'y'" if that's a concern.
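The transformation described in the update can be sketched in a few lines of Python. This is only an illustration of the idea, not NSPredicate code; `ignoring_pattern` is a hypothetical helper, and escaping each character of the user's input is an added precaution that the answer does not mention.

```python
import re

def ignoring_pattern(search: str, ignored: str = "'") -> str:
    # Insert an optional "ignored" character after every character of the
    # search string, e.g. "ob" -> "o'?b'?". Each piece is escaped so that
    # user input such as "." or "(" is matched literally.
    optional = re.escape(ignored) + "?"
    return "".join(re.escape(ch) + optional for ch in search)

names = ["Andy O'Brien", "Bob O'Brian", "Jim OBrien", "Larry Oberlin"]
pattern = re.compile(ignoring_pattern("ob"), re.IGNORECASE)
matches = [name for name in names if pattern.search(name)]
print(matches)
```

With the four names from the question, every entry matches the search string "ob".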
In this particular case, just throw the set of characters into the middle of the regex as optional. This works specifically because you have only two characters in your match string, otherwise the regex might get a bit verbose. For example, match case-insensitive against:
o[']*b
You can add more characters to that character class in the middle to ignore them. Note that the * allows any number of those characters (so O'''Brien will match); to allow at most one, change it to ?:
o[']?b
You can make particular characters optional with a question mark, which means that it will match whether they're there or not, e.g:
/o\'?b/
That would match all of the above. Add .+ to either side to match all other characters, and a space to denote the start of the surname:
/.+? o\'?b.+/
And use the case-insensitivity modifier to make it match regardless of capitalisation.

Change Url using Regex

I have url, for example:
http://i.myhost.com/myimage.jpg
I want to change this url to
http://i.myhost.com/myimageD.jpg.
(Add D after image name and before point)
i.e I want add some words after image name and before point using regex.
What is the best way do it using regex?
Try using ^(.*)\.([a-zA-Z]{3,5})$ and replacing with \1D.\2 (the dot is matched outside the groups, so it has to be written back in the replacement). I'm assuming the extension is 3-5 letters but you can modify it to suit. E.g. if it's just jpg images then you can put that instead of the [a-zA-Z]{3,5}.
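As a quick check of this kind of substitution, here is a Python sketch using `re.sub`. The function name and the `suffix` parameter are made up for illustration, and the replacement writes the dot back explicitly because it sits between the two capture groups.

```python
import re

def add_before_extension(url: str, suffix: str = "D") -> str:
    # Group 1: everything up to the last dot; group 2: a 3-5 letter extension.
    # The dot itself is outside both groups, so the replacement restores it.
    return re.sub(r"^(.*)\.([a-zA-Z]{3,5})$",
                  lambda m: m.group(1) + suffix + "." + m.group(2),
                  url)

print(add_before_extension("http://i.myhost.com/myimage.jpg"))
```

A replacement function is used instead of a replacement string so that a suffix containing backslashes or digits cannot be misread as a group reference.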
Sounds like a homework question given the solution must use a regex, on that assumption here is an outline to get you going.
If all you have is a URL then @mathematical.coffee's solution will suit. However if you have a chunk of text within which is one or more URLs and you have to locate and change just those then you'll need something a little more involved.
Look at the structure of a URL: {protocol}{address}{item}; where
{protocol} is "http://", "ftp://" etc.;
{address} is a name, e.g. "www.google.com", or a number, e.g. "74.125.237.116" - there will always be at least one dot in the address; and
{item} is "/name" where name is quite flexible - there will be zero or more items, you can think of them as directories and a file but this isn't strictly true. Also the sequence of items can end in a "/" (including when there are zero of them).
To make a regex which matches a URL start by matching each part. In the case of the items you'll want to match the last in the sequence separately - you'll have zero or more "directories" and one "file", the latter must be of the form "name.extension".
Once you have regexes for each part you just concatenate them to produce a regex for the whole. To form the replacement pattern you can surround parts of your regex with parentheses and refer to those parts using \number in the replacement string - see @mathematical.coffee's solution for an example.
The best way to learn regexes is to use an editor which supports them and just experiment. The exact syntax may not be the same as NSRegularExpression but they are mostly pretty similar for the basic stuff and you can translate from one to another easily.
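One possible reading of the part-by-part outline above, sketched in Python rather than NSRegularExpression. The individual sub-patterns are simplifying assumptions (for instance, names are limited to word characters and hyphens), so treat this as a starting point for experiments rather than a complete URL matcher.

```python
import re

PROTOCOL = r"[a-z]+://"              # "http://", "ftp://", ...
ADDRESS = r"[\w-]+(?:\.[\w-]+)+"     # name or number with at least one dot
DIRECTORIES = r"(?:/[\w-]+)*"        # zero or more "/name" items
FILE_NAME = r"/[\w-]+"               # the last item, before its extension
EXTENSION = r"\.\w+"                 # ".jpg", ".png", ...

# Group 1 is everything up to the file name, group 2 is the extension.
URL_WITH_FILE = re.compile(
    "(" + PROTOCOL + ADDRESS + DIRECTORIES + FILE_NAME + ")(" + EXTENSION + ")"
)

def add_suffix_to_urls(text: str, suffix: str = "D") -> str:
    # Rewrite only URLs that end in "name.extension"; other text is untouched.
    return URL_WITH_FILE.sub(lambda m: m.group(1) + suffix + m.group(2), text)

sample = "See http://i.myhost.com/myimage.jpg and http://example.org/a/b/pic.png."
print(add_suffix_to_urls(sample))
```

URLs without a file part, such as a bare host name, are left unchanged because the file and extension pieces are required by the combined pattern.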

What's the best way to parse an Address field using t-sql or SSIS?

I have a data set that I import into a SQL table every night. One field is 'Address_3' and contains the City, State, Zip and Country fields. However, this data isn't standardized. How can I best parse the data that currently arrives in one field into individual fields? Here are some examples of the data I might receive:
'INDIANAPOLIS, IN 46268 US'
'INDIANAPOLIS, IN 46268-1234 US'
'INDIANAPOLIS, IN 46268-1234'
'INDIANAPOLIS, IN 46268'
Thanks in advance!
David
I've done something similar (not in T-SQL) and I find it works best to start at the end of the string and work backwards.
1) Grab the rightmost element up to the first space or comma.
   - Is it a known country code? It's a country.
   - If not, is it all numeric (including a hyphen)? It's a zip code.
   - Else discard it.
2) Grab the second rightmost element up to the next space or comma.
   - Is it a two alpha-character field? It's the state.
3) Grab everything else preceding the last comma and call it the city.
You'll need to make some adjustments based on what your input data looks like but the basic idea is to start from the right, grab the elements you can easily classify and call everything else the city.
You can implement something like this by using the REVERSE function to make searching easier (in which case you'll be parsing the string from left to right instead of right to left like I said above), the PATINDEX or CHARINDEX functions to find spaces and commas, and the SUBSTRING function to pull the address apart based on the positions found by PATINDEX and CHARINDEX. You could use the ASCII function to determine if a character is numeric or not.
You tagged your question with the SSIS tag as well - it might be easier to implement the parsing in some VB script in SSIS rather than try to do it with T-SQL.
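To make the right-to-left approach concrete, here is a rough sketch in Python rather than T-SQL or VB. The function name, the country list and the rule for leftover tokens are all assumptions made for illustration; a real import would need a proper country lookup and handling for inputs that do not fit the four sample shapes.

```python
import re

KNOWN_COUNTRIES = {"US", "CA", "MX"}  # hypothetical lookup list

def parse_address(value: str) -> dict:
    # Everything before the last comma is the city; the rest is classified
    # token by token, working from the right-hand end.
    city_part, _, rest = value.rpartition(",")
    tokens = rest.split()
    country = zip_code = state = None
    if tokens and tokens[-1].upper() in KNOWN_COUNTRIES:
        country = tokens.pop().upper()
    if tokens and re.fullmatch(r"\d+(-\d+)?", tokens[-1]):
        zip_code = tokens.pop()          # all numeric, optionally with a hyphen
    if tokens and re.fullmatch(r"[A-Za-z]{2}", tokens[-1]):
        state = tokens.pop().upper()     # two alphabetic characters
    # Anything left unclassified is kept with the city (an assumption).
    city = " ".join(part for part in [city_part.strip(), *tokens] if part)
    return {"city": city, "state": state, "zip": zip_code, "country": country}

print(parse_address("INDIANAPOLIS, IN 46268-1234 US"))
```

Checking the country before the zip code is what lets the same function handle both the inputs that end in US and the ones that stop at the zip code.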
By far the best way is to not reinvent the wheel and get an address parsing and standardization engine. Ideally, you would use a CASS certified engine which is what is approved by the Postal Service. However, there are free address parsers on the net these days and any of those would be more accurate and less frustrating than trying to parse the address yourself.
That said, I will say that address parsers and the Post Office work from bottom up (So, country, then zip code, then city, then state then address line 2 etc.).
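In SSIS you can have 4 derived columns (city,state,zip,country). FINDSTRING takes the string to search first and the text to find second, and the third argument of SUBSTRING is a length rather than an end position:
SUBSTRING(column,1,FINDSTRING(column,",",1)-1) --city
SUBSTRING(column,FINDSTRING(column," ",1)+1,FINDSTRING(column," ",2)-FINDSTRING(column," ",1)-1) --state
SUBSTRING(column,FINDSTRING(column," ",2)+1,FINDSTRING(column," ",3)-FINDSTRING(column," ",2)-1) --zip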
You can see the pattern above and continue accordingly. This might get a bit complicated. You can use a Script Component to better pull out the lines of text.
something like this should help:
select
    -- City: everything before the last comma (the whole string if there is no comma)
    substring(CityStateZip, 1,
        case when charindex(',', reverse(CityStateZip)) = 0 then len(CityStateZip)
             else len(CityStateZip) - charindex(',', reverse(CityStateZip)) end) as City,
    -- State: the first two characters after the last comma
    left(ltrim(
        substring(CityStateZip,
            case when charindex(',', reverse(CityStateZip)) = 0 then len(CityStateZip)
                 else len(CityStateZip) - charindex(',', reverse(CityStateZip)) + 2 end,
            len(CityStateZip))), 2) as State,
    -- Zip: the last space-separated token (this is the country code when one is present)
    substring(CityStateZip,
        case when charindex(' ', reverse(CityStateZip)) = 0 then len(CityStateZip)
             else len(CityStateZip) - charindex(' ', reverse(CityStateZip)) + 2 end,
        len(CityStateZip)) as Zip
from YourAddressTable