SQL Server : PATINDEX using pattern - sql

I am using PATINDEX to replace stray characters in string (column in my table).
select PATINDEX('%[^A-Z,a-z0-9 -()&_/\.]%', 'This has a stray character$')
I am puzzled as to why the result is 0 (I am expecting 27): $ is not in the pattern that I want to keep. Any insights?

The thing is that dash - is used in pattern matching. So you have to escape it:
select PATINDEX('%[^A-Z,a-z0-9 \-()&_/\.]%', 'This has a stray character$')

I would escape all special character:
select PATINDEX('%[^A-Z,a-z0-9 \-\(\)\&\_\/\\\.]%', 'This has a stray character$')
LiveDemo

Related

Semicolon in LIKE is not working for SQL server 2017

I have a query like this which is not retrieving the values from DB table even if the required value exist there.
Here's the query, which return zero rows:
Select * from SitePanel_FieldValue WHere SiteFieldIdfk =111
And SiteFieldvalue like '%!##$%&*()-_=+{}|:"<>?[]\;'',./%'
Following is the value in the table:
'!##$%&*()-_=+{}|:"<>?[]\;'',./'
When I run the query without ";" it is returning the value.
Can any one help me in figuring this out?
Thanks
Ritu
You are using multiple characters which are reserved when using LIKE statement.
i.e. %, _, []
Use the escape character clause (where I have used backtick to treat special characters as regular) such as
Select * from SitePanel_FieldValue WHere SiteFieldIdfk =111
And SiteFieldvalue like '%!##$`%&*()-`_=+{}|:"<>?`[`]\;'',./%' escape '`'
The value in your table is:
!##$%&*()-_=+{};; :"<>?[]\;'',./
And the one in the like is:
(!##$%&*()-_=+{};;
Starting with ( it will never match, also you should scape the percent (%) in the middle of the string like this:
Select *
FROM SitePanel_FieldValue
WHERE SiteFieldIdfk =111
AND SiteFieldvalue like '%!##$\%&*()-_=+{};;%' ESCAPE '\'
The problem is your brackets ([]), it has nothing to do with semicolons. If we remove the brackets, the above works:
SELECT CASE WHEN '!##$%&*()-_=+{}|:"<>?\;'',./' LIKE '%!##$%&*()-_=+{}|:"<>?\;'',./%' THEN 1 END AS WithoutBrackets,
CASE WHEN '!##$%&*()-_=+{}|:"<>?[]\;'',./' LIKE '%!##$%&*()-_=+{}|:"<>?[]\;'',./%' THEN 1 END AS WithBrackets
Notice that WithoutBrackets returns 1, where as WithBrackets returns NULL.
Brackets in a LIKE are to denote a pattern. For example SomeExpress LIKE '[ABC]' would match the characters, A, B, and C. If you are going to include special characters, you need to ESCAPE them. You have both brackets, a percent sign (%) and an underscore (_) you need to escape. You don't need to escape the hyphen (-), as it doesn't appear in a pattern (for example [A-Z]). I choose to use a backtick as the ESCAPE character, as it doesn't appear in your string, and demonstrate with a CASE expression again:
SELECT CASE WHEN '!##$%&*()-_=+{}|:"<>?[]\;'',./' LIKE '%!##$`%&*()-`_=+{}|:"<>?`[`]\;'',./%' ESCAPE '`' THEN 1 END;
If you wanted to use a backslash (\ ), which many do, you would need to also escape the backslash in your string:
SELECT CASE WHEN '!##$\%&*()-_=+{}|:"<>?[]\;'',./' LIKE '%!##$%&*()-\_=+{}|:"<>?\[\]\\;'',./%' ESCAPE '\' THEN 1 END;
db<>fiddle
I think the issue is actually with the backslash. This is an escape character and so if you want it to be included, you have to put it in twice.
Select * from SitePanel_FieldValue WHere SiteFieldIdfk =111
And SiteFieldvalue like '%!##$%&*()-_=+{}|:"<>?[]\\;'',./%'

How can I replace a string pattern with blank in hive?

I have a string as:
https://maps.googleapis.com/maps/api/staticmap?center=41.892532+-87.63811&zoom=11&scale=2&size=280x320&maptype=roadmap&format=png&visual_refresh=true%7C&markers=size:mid%7Ccolor:0x8000ff%7Clabel:1%7C2413+S+State+St++Chicago+IL+60616%7C&markers=size:mid%7Ccolor:0x8000ff%7Clabel:2%7C3000+N+Halsted+St++Chicago+IL+60657%7C&markers=size:mid%7Ccolor:0x8000ff%7Clabel:3%7C++++&key=AIzaSyBNEAQcC5niAEeiP3zkA_nuWGvtl0IOEs4
I want to replace the '++++' pattern at the end with blank and not the single occurrence of '+'. Tried using regexp_replace and translate functions in hive but that replaces all the single occurrences of '+' as well.
Use
regexp_replace(string,'[+]{4}','')
Pattern '[+]{4}' means + caracter four times.
Test:
select regexp_replace('++markers=size:mid%7Ccolor:0x8000ff%7Clabel:3%7C++++&','[+]{4}','');
Result:
OK
++markers=size:mid%7Ccolor:0x8000ff%7Clabel:3%7C&
Dod you try this?
replace(string, '++++', '')
Admittedly, this will replace all occurrences of '++++', but your string only has one of them.

Postgresql: Extracting substring after first instance of delimiter

I'm trying to extract everything after the first instance of a delimiter.
For example:
01443-30413 -> 30413
1221-935-5801 -> 935-5801
I have tried the following queries:
select regexp_replace(car_id, E'-.*', '') from schema.table_name;
select reverse(split_part(reverse(car_id), '-', 1)) from schema.table_name;
However both of them return:
01443-30413 -> 30413
1221-935-5801 -> 5801
So it's not working if delimiter appears multiple times.
I'm using Postgresql 11. I come from a MySQL background where you can do:
select SUBSTRING(car_id FROM (LOCATE('-',car_id)+1)) from table_name
Why not just do the PG equivalent of your MySQL approach and substring it?
SELECT SUBSTRING('abcdef-ghi' FROM POSITION('-' in 'abcdef-ghi') + 1)
If you don't like the "from" and "in" way of writing arguments, PG also has "normal" comma separated functions:
SELECT SUBSTR('abcdef-ghi', STRPOS('abcdef-ghi', '-') + 1)
I think that regexp_replace is appropriate, but using the correct pattern:
select regexp_replace('1221-935-5801', E'^[^-]+-', '');
935-5801
The regex pattern ^[^-]+- matches, from the start of the string, one or more non dash characters, ending with a dash. It then replaces with empty string, effectively removing this content.
Note that this approach also works if the input has no dashes at all, in which case it would just return the original input.
Use this regexp pattern :
select regexp_replace('1221-935-5801', E'^[^-]+-', '') from schema.table_name
Regexp explanation :
^ is the beginning of the string
[^-]+ means at least one character different than -
...until the - character is met
I tried it in a conventional way in general what we do (found
something similar to instr as strpos in postgrsql .) Can try the below
SELECT
SUBSTR(car_id,strpos(car_id,'-')+1,
length(car_id) ) from table ;

Regex not matching correct string

I am busy building a lookup table for specific names of merchants. I tried to make use of the following regex but it's returning less results than the standard "like" function in Netezza SQL. Please refer to below:
SQL Like function: where trim(upper(a.MRCH_NME)) like '%CNA %' -- returns 4622 matches
Regex function in Netezza SQL: where array_combine(regexp_extract_all(trim(upper(a.MRCH_NME)),'.*CNA\s','i'),'|') = 'CNA' -- returns 2226 matches
I looked at the two result sets and found that strings such as the following aren't matched:
!C CNA INT ARR
*CNA PLATZ 0400
015764 CNA CRAD
C#CNA PARK 0
I made use of the following regex expression: /.*CNA\s'/
Any idea why the above strings aren't being returned as matches?
Thank you.
You probably should be using regexp_like:
SELECT *
FROM yourTable
WHERE REGEXP_LIKE(MRCH_NME, 'CNA[ ]', 'i');
This would be logically identical to the following query using LIKE:
SELECT *
FROM yourTable
WHERE MRCH_NME LIKE '%CNA ';
It seems to me the problem is more with your code rather than the regex. Look: like '%CNA %' returns all entries that contain a CNA substring followed with a literal space anywhere inside the entry. The '.*CNA\s' regex matches any 0+ chars other than newline followed with CNA and **any whitespace char*.
Acc. to this reference, \s matches "a white space character. White space is defined as [\t\n\f\r\p{Z}].
Thus, you should in fact just use
WHERE REGEXP_LIKE(MRCH_NME, 'CNA ', 'i')
or, better with a word boundary check:
WHERE REGEXP_LIKE(MRCH_NME, '\bCNA\b', 'i')
where \b marks a transition from a word to non-word and non-word to word character, thus ensuring a whole word search and justifying the regex usage.
If you do not need to match the merchant name as a whole word, use the regular LIKE with '%CNA %', it should be more efficient.

How to escape '|' in SQL Server or T-SQL Query?

SELECT *
FROM performance_table
WHERE ad_group like '%|%'
I have no idea on how to escape the Pipe operator here.
You don't need to escape | in T-SQL as it has no special meaning inside like. However, if for example you would like to find texts containing % character, what you're looking for is:
SELECT *
FROM performance_table
WHERE ad_group like '%#%%' escape '#'
where escape defines escape character.
The pipe character does not need to be escaped.
Your query will find all records that contain a pipe character in the ad_group column.
When used inside a string literal ('|'), the character is not treated as an operator. Its function as an operator is bitwise OR, as for example in
select 8|3
will be 11.