SQL Server Like Pattern - sql

I want to match a custom pattern in one of the columns of a SQL Server database. The problem is that I don't know the exact pattern length.
I want only those rows that contain 'function' and an alphanumeric pattern of at least 5 and at most 8 characters. The starting and ending characters are not fixed, and the match is not case sensitive.
Column value looks like this:
Row Value
--------------------------------------------------------------------
1 I have a single own function and its namely 123BA689,BAS54256
2 Everyone has base function AFD12,CHD12234
3 Nicole has its own ASS1256902,25ADFG2
Desired output:
Row Value
--------------------------------------------------------------------
1 I have a single own function and its namely 123BA689,BAS54256a
2 Everyone has base function AFD12,CHD1223465AS
I have tried Like and regex to match pattern but failed.
Does anybody know how to fix it?
select *
from ab
where lower(ab.a) like '%function' and '%[a-z0-9]{6}%'
Thanks.

SQL Server doesn't support regular expressions. You could conceptually do what you want with:
where lower(ab.a) like '%function%' and
lower(ab.a) like '%[a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9]%' and
lower(ab.a) not like '%[a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9]%'
However, this will return any string that contains "function" (as long as it has no run of nine or more alphanumeric characters), because "function" is itself an alphanumeric pattern of 5-8 characters.
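To make that caveat concrete, here is a minimal Python sketch that mimics the three LIKE conditions with regular expressions. It is only an emulation: like_filter is a hypothetical helper, the character classes are ASCII-only, SQL Server collation rules are ignored, and the fourth row is invented for illustration.

```python
import re

# Stand-ins for the two bracketed LIKE patterns (ASCII only, collation ignored).
HAS_RUN_OF_5 = re.compile(r"[a-z0-9]{5}")
HAS_RUN_OF_9 = re.compile(r"[a-z0-9]{9}")

def like_filter(value: str) -> bool:
    """Mimic: like '%function%' and like '%<5 chars>%' and not like '%<9 chars>%'."""
    v = value.lower()
    return (
        "function" in v
        and HAS_RUN_OF_5.search(v) is not None
        and HAS_RUN_OF_9.search(v) is None
    )

rows = [
    "I have a single own function and its namely 123BA689,BAS54256",
    "Everyone has base function AFD12,CHD12234",
    "Nicole has its own ASS1256902,25ADFG2",
    "This function works",  # hypothetical row that contains no code at all
]
results = [like_filter(r) for r in rows]
print(results)  # [True, True, False, True]
```

The last row passes even though it has no code, because the word "function" alone satisfies the 5-8 character test.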

Related

How to filter String in where clause

I would like to extract part of a string using a where clause in SAP HANA. For example, these are 3 strings from the name column.
123._SYS_BIC.meag.app.qthor.cidwh_eingangsschicht.backend.dblayer.l2.checks/MasterData_Holdings.
153._SYS_BIC.meag.app.qthor.centralAdministration.backend.dblayer.l2.checks/AuditAndSecurities.
meag.app.qthor.centralAdministration.backend.dblayer.l2.checks/GeneralLedger
After filtering the name column in the where clause, the output should contain only the last portion of each string. In other words, whatever the value is, everything from the beginning up to the '/' should be removed, so the output looks like this:
"MasterData_Holdings"
"AuditAndSecurities"
"GeneralLedger"
You can try using REPLACE_REGEXPR.
I'm not familiar with HANA myself, but the function is pretty straightforward and it should be:
select REPLACE_REGEXPR('.+/(.+)' IN fieldName WITH '\1' OCCURRENCE ALL) as field
...
where
... -- your filter
Be aware that the regex '.+/(.+)' will eat everything up to the last /, so if you have, for instance, ....checks/MasterData_Holdings/Something, it will return only Something.
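The greedy behaviour can be tried outside HANA. The following is a small Python sketch using the same pattern with re.sub. The function name last_segment and the two shortened test strings are made up for illustration, and HANA's regex engine may differ in details.

```python
import re

def last_segment(name: str) -> str:
    # The greedy ".+" runs up to the LAST "/", so group 1 is whatever follows it.
    # A value without any "/" does not match and is returned unchanged.
    return re.sub(r".+/(.+)", r"\1", name)

print(last_segment(
    "meag.app.qthor.centralAdministration.backend.dblayer.l2.checks/GeneralLedger"
))
print(last_segment("x.checks/MasterData_Holdings/Something"))
```

The first call prints GeneralLedger. The second prints Something, which is the "last slash wins" effect.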

Regular expression to remove element not match specific prefix

I am doing this in Impala or Hive. Basically, let's say I have a string like this:
f-150:aa|f-150:cc|g-210:dd
Each element is separated by the pipe |, and each has a prefix such as f-150. I want to remove the prefix and keep only the elements that match a specific prefix. For example, if the prefix is f-150, I want the final string after regexp_replace to be
aa|cc
dd is removed because g-210 is a different prefix and does not match, so the whole element is removed.
Any idea how to do this using string expression in one SQL?
Thanks
UPDATE 1
I tried this in Impala:
select regexp_extract('f-150:aa|f-150:cc|g-210:dd','(?:(?:|(\\|))f-150|keep|those):|(?:^|\\|)\\w-\\d{3}:\\w{2}',0);
But got this output:
f-150:aa
In Hive, I got NULL.
The regex in question could look like this:
(?:(?:|(\\|))f-150|keep|those):|(?:^|\\|)\\w-\\d{3}:\\w{2}
I have added some pseudo keywords to retain, but I am sure you get the idea:
Wholly match the elements that should be dropped, but match only the prefix of those that should be retained.
To keep the separator intact, match | at the beginning of an element in group 1 and put it back in the replacement with $1.
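As a rough check of this idea, here is a Python sketch of the same pattern. It is written with single backslashes because it is not inside a SQL string literal, and it replaces each match with group 1. Python's re module is not the Java engine that Hive and Impala document, so treat this as an illustration only. keep_prefix is a hypothetical name.

```python
import re

# Same alternation as above: keep-prefixes are matched alone (optionally with a
# leading "|" captured in group 1); other elements are matched whole.
PATTERN = re.compile(r"(?:(?:|(\|))f-150|keep|those):|(?:^|\|)\w-\d{3}:\w{2}")

def keep_prefix(s: str) -> str:
    # An unmatched group 1 is substituted as an empty string (Python 3.5+).
    return PATTERN.sub(r"\1", s)

print(keep_prefix("f-150:aa|f-150:cc|g-210:dd"))  # aa|cc
print(keep_prefix("g-210:dd|f-150:aa"))           # |aa
```

The second call shows a limitation of the sketch: when the dropped element comes first, the separator in front of the next kept element survives.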
According to the documentation, your query should be written like a Java regex; likewise, this should perform like this code sample in Java.
You could match the values that you want to remove and then replace with an empty string:
f-150:|\|[^:]+:[^|]+$|[^|]+:[^|]+\|
f-150:|\\|[^:]+:[^|]+$|[^|]+:[^|]+\\|
Explanation
f-150: Match literally
| Or
\|[^:]+:[^|]+$ Match a pipe, not a colon one or more times followed by not a pipe one or more times and assert the end of the line
| Or
[^|]+:[^|]+\| Match not a pipe one or more times, a colon followed by matching not a pipe one or more times and then match a pipe
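The three alternatives can be exercised with a short Python sketch that replaces every match with an empty string. strip_elements is a hypothetical name, and the inputs other than the one from the question are invented test cases.

```python
import re

# f-150:              -> the prefix of an element to keep
# \|[^:]+:[^|]+$      -> a trailing "|prefix:value" element
# [^|]+:[^|]+\|       -> a "prefix:value|" element followed by a pipe
DROP = re.compile(r"f-150:|\|[^:]+:[^|]+$|[^|]+:[^|]+\|")

def strip_elements(s: str) -> str:
    return DROP.sub("", s)

print(strip_elements("f-150:aa|f-150:cc|g-210:dd"))  # aa|cc
print(strip_elements("g-210:dd|f-150:aa"))           # aa
print(strip_elements("f-150:aa|f-150:cc"))           # aa
```

The third call exposes an edge case in this sketch: when the last element carries the wanted prefix, the second alternative matches it first and the whole element is removed, so inputs of that shape would need extra handling.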
You may have to loop through the string to the end to get all the matching substrings. Lookahead syntax is not supported in most SQL dialects, so the regexps above might not be usable in SQL. For your purpose, you can create a small table to loop over, mimicking Oracle's LEVEL syntax, and join it with the table containing the string.
With loop_tab as (
Select 1 loop union all
Select 2 union all
select 3 union all
select 4 union all
select 5),
string_tab as(Select 'f-150:aa|ade|f-150:ce|akg|f-150:bb|'::varchar(40) as str)
Select regexp_substr(str,'(f\\-150\\:\\w+\\|)',1,loop)
from string_tab
join loop_tab on 1=1
Output:
regexp_substr
f-150:aa|
f-150:ce|
f-150:bb|
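For comparison, the same "one row per occurrence" idea can be sketched in Python, where re.findall plays the role of the loop table. prefixed_elements is a hypothetical helper, and the final join into aa|ce|bb is an extra step that the SQL above does not perform.

```python
import re

def prefixed_elements(s: str, prefix: str = "f-150") -> list[str]:
    # Every "prefix:word|" occurrence, in order, like calling
    # regexp_substr with occurrence 1, 2, 3, ...
    return re.findall(re.escape(prefix) + r":\w+\|", s)

source = "f-150:aa|ade|f-150:ce|akg|f-150:bb|"
found = prefixed_elements(source)
print(found)  # ['f-150:aa|', 'f-150:ce|', 'f-150:bb|']

# Strip the prefix and the trailing pipe, then rebuild the pipe-separated string.
joined = "|".join(item[len("f-150:"):-1] for item in found)
print(joined)  # aa|ce|bb
```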

Consider a query to find details of research fields where the first two parts of the ID are D and 2 and the last part is one character (digit)

The ID of research fields have three parts, each part separated by a period.
Consider a query to find the details of research fields where the first two parts of the ID are D and 2, and the last part is a single character (digit).
IDs like D.2.1 and D.2.3 are in the query result whereas IDs like D.2.12 or D.2.15 are not.
The SQL query given below does not return the correct result. Explain the reason why it does not return the correct result and give the correct SQL query.
select *
from field
where ID like 'B.1._';
I have no idea why it doesn't work.
Can anyone help with this? Many thanks.
D.2.1 and D.2.3 are in the query result whereas IDs like D.2.12 or D.2.15 are not.
An underscore matches any single character in a LIKE filter, so B.1._ is looking for the start of the string followed by a B character, a . character, a 1 character, a . character, then any single character, then the end of the string. The trailing underscore already restricts the last part to one character; the query fails because the literal prefix is B.1. rather than D.2., so no ID that starts with D.2. can match.
You could use:
SELECT *
FROM field
WHERE ID like 'D.2._';
Do not append % after the underscore. The % matches any number of characters (including zero) up to the end of the string, so 'D.2._%' would only enforce that there is at least one character after the period and would also return D.2.12 and D.2.15.
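The effect of _ and % can be checked with a small Python sketch that translates a LIKE pattern into a regular expression. like_matches is a hypothetical helper that handles only the two standard wildcards and ignores escape characters, bracket classes, and collation.

```python
import re

def like_matches(pattern: str, value: str) -> bool:
    """Evaluate a LIKE pattern that uses only % and _ (illustration only)."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # any number of characters, including zero
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal, "." included
    return re.fullmatch("".join(parts), value, flags=re.DOTALL) is not None

print(like_matches("D.2._", "D.2.1"))    # True
print(like_matches("D.2._", "D.2.12"))   # False
print(like_matches("B.1._", "D.2.1"))    # False
print(like_matches("D.2._%", "D.2.12"))  # True
```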

SQL LIKE - Using square bracket (character range) matching to match an entire word

In REGEX you can do something like [a-c]+, which will match on
aaabbbccc
abcccaabc
cbccaa
b
aaaaaaaaa
In SQL LIKE it seems that one can either do the equivalent of ".*" which is "%", or [a-c]. Is it possible to use the +(at least one) quantifier in SQL to do [a-c]+?
EDIT: Just to clarify, the desired end-query would look something like
SELECT * FROM table WHERE column LIKE '[a-c]+'
which would then match on the list above, but would NOT match on e.g "xxxxxaxxxx"
As a general rule, SQL Server's LIKE patterns are much weaker than regular expressions. For your particular example, you can do:
where col not like '%[^a-c]%'
That is, the column contains no characters that are not a, b, or c.
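A quick way to see how this double negative relates to [a-c]+ is the Python sketch below. only_a_to_c is a hypothetical name, and the comparison is case-sensitive, which may not be true under your collation.

```python
import re

def only_a_to_c(value: str) -> bool:
    # Same test as NOT LIKE '%[^a-c]%': no character outside a-c appears anywhere.
    return re.search(r"[^a-c]", value) is None

samples = ["aaabbbccc", "abcccaabc", "cbccaa", "b", "aaaaaaaaa", "xxxxxaxxxx", ""]
accepted = [s for s in samples if only_a_to_c(s)]
print(accepted)
```

All five strings from the question are accepted and xxxxxaxxxx is rejected. The empty string is accepted as well, which [a-c]+ would not match, so add a non-empty check if that matters.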
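You can get regex-like character classes in SQL Server by combining them with LIKE, e.g.:
SELECT * FROM Table WHERE Field LIKE '%[^a-z0-9 .]%'
This works in SQL Server and returns the rows that contain at least one character outside the listed set.
So in your case, negate the test to keep only the rows made up entirely of a, b, or c:
SELECT * FROM Table WHERE Field NOT LIKE '%[^a-c]%'
It seems you want to pull data whose exact shape you don't know in advance; it would help to show your column and all the characters you want to allow in that field.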

MSAccess Query a string matching a pattern

I have a table with a string field containing location information. I want to be able to query this table and retrieve all of the tags matching the format xxxxxxAA where xxxxxx is a 6-digit number and AA is two alphabetic characters.
Is there a method of querying this using SQL or is this something that I need to do in VBA?
Sample data:
BGS5 PM RGP5
022051PM
022201PM
030539PM
WAS3N
179546MM
And I want to return the following without knowing the values:
022051PM
022201PM
030539PM
179546MM
thanks in advance
Jason
You can use a query with a Like comparison in the WHERE clause.
SELECT y.text_field
FROM YourTable AS y
WHERE y.text_field Like '######[A-Z][A-Z]'
The # matches a digit.
[A-Z] matches one character from a character class consisting of only letters. That character class is actually upper case letters. However, the comparison is case-insensitive, so will match lower case letters, too.
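For readers who want to verify the pattern against the sample data without Access, here is a Python sketch of an equivalent regular expression. It assumes each sample line is one field value, and it spells out both letter cases because Python's matching is case-sensitive by default.

```python
import re

# Six digits followed by two letters, matching the whole value,
# like the Access pattern '######[A-Z][A-Z]'.
TAG = re.compile(r"[0-9]{6}[A-Za-z]{2}")

samples = ["BGS5 PM RGP5", "022051PM", "022201PM", "030539PM", "WAS3N", "179546MM"]
tags = [s for s in samples if TAG.fullmatch(s)]
print(tags)  # ['022051PM', '022201PM', '030539PM', '179546MM']
```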