MySQL match existence of a term amongst comma seperated values using REGEXP - sql

I have a MySQL select statement dilemma that has thrown me off course a little. In a certain column I have a category term. This term could look something like 'category' or it could contain multiple CSV's like this: 'category, bank special'.
I need to retrieve all rows containing the term 'bank special' in the comma separated value. Currently I am using the following SQL statement:
SELECT * FROM tblnecklaces WHERE nsubcat REGEXP 'bank special';
Now this works OK, but if I had the category as follows: 'diamond special' for example then the row is still retrieved because the term 'special' in my column seems to be matching up to the term 'bank special' in my REGEXP statement.
How would I go abut checking for the existence of the whole phrase 'bank special' only and not partially matching the words?
Many thanks for your time and help

The simplest solution is to use the LIKE clause (% is wildcard):
SELECT * FROM tblnecklaces WHERE nsubcat LIKE '%bank special%';
Note that LIKE is also a lot faster than REGEXP.

You can prefix the column with comma's, and compare it to the bank special:
SELECT *
FROM tblnecklaces
WHERE ',' + replace(nsubcat,', ','') + ',' LIKE ',bank special,'
I put in a replace to remove optional space after a comma, because your example has one.

Not tested but this RegExp should work (LIKE works too but will match items that starts or ends with the same phrase):
SELECT * FROM tblnecklaces WHERE nsubcat REGEXP '(^|, )bank special($|,)';

Related

How run Select Query with LIKE on thousands of rows

Newbie here. Been searching for hours now but I can seem to find the correct answer or properly phrase my search.
I have thousands of rows (orderids) that I want to put on an IN function, I have to run a LIKE at the same time on these values since the columns contains json and there's no dedicated table that only has the order_id value. I am running the query in BigQuery.
Sample Input:
ORD12345
ORD54376
Table I'm trying to Query: transactions_table
Query:
SELECT order_id, transaction_uuid,client_name
FROM transactions_table
WHERE JSON_VALUE(transactions_table,'$.ordernum') LIKE IN ('%ORD12345%','%ORD54376%')
Just doesn't work especially if I have thousands of rows.
Also, how do I add the order id that I am querying so that it appears under an order_id column in the query result?
Desired Output:
Option one
WITH transf as (Select order_id, transaction_uuid,client_name , JSON_VALUE(transactions_table,'$.ordernum') as o_num from transactions_table)
Select * from transf where o_num like '%ORD12345%' or o_num like '%ORD54376%'
Option two
split o_num by "-" as separator , create table of orders like (select 'ORD12345' as num
Union
Select 'ORD54376' aa num) and inner join it with transf.o_num
One method uses OR:
WHERE JSON_VALUE(transactions_table, '$.ordernum') LIKE IN '%ORD12345%' OR
JSON_VALUE(transactions_table, '$.ordernum') LIKE '%ORD54376%'
An alternative method uses regular expressions:
WHERE REGEXP_CONTAINS(JSON_VALUE(transactions_table, '$.ordernum'), 'ORD12345|ORD54376')
According to the documentation, here, the LIKE operator works as described:
Checks if the STRING in the first operand X matches a pattern
specified by the second operand Y. Expressions can contain these
characters:
A percent sign "%" matches any number of characters or
bytes.
An underscore "_" matches a single character or byte.
You can escape "\", "_", or "%" using two backslashes. For example, "\%". If
you are using raw strings, only a single backslash is required. For
example, r"\%".
Thus , the syntax would be like the following:
SELECT
order_id,
transaction_uuid,
client_name
FROM
transactions_table
WHERE
JSON_VALUE(transactions_table,
'$.ordernum') LIKE '%ORD12345%'
OR JSON_VALUE(transactions_table,
'$.ordernum') LIKE '%ORD54376%
Notice that we specify two conditions connected with the OR logical operator.
As a bonus information, when querying large datasets it is a good pratice to select only the columns you desire in your out output ( either in a Temp Table or final view) instead of using *, because BigQuery is columnar, one of the reasons it is faster.
As an alternative for using LIKE, you can use REGEXP_CONTAINS, according to the documentation:
Returns TRUE if value is a partial match for the regular expression, regex.
Using the following syntax:
REGEXP_CONTAINS(value, regex)
However, it will also work if instead of a regex expression you use a STRING between single/double quotes. In addition, you can use the pipe operator (|) to allow the searched components to be logically ordered, when you have more than expression to search, as follows:
where regexp_contains(email,"gary|test")
I hope if helps.

Similar to with regex in Postgresql

In Postgresql database I have a column called names where I have some names which need to be parsed using regex to clean up punctuation parts. I am able to get a clean name using regexp_replace as follows:
select regexp_replace(name,'\.COM|''[A-Z]|[^a-zA-Z0-9 -]+|\s(?=&)|(?<!\w\w)(?:\s+|-)(?!\w\w)','','g')
from tableA
However, I would like to compare with some strings that are also cleaned of punctuation. How can I use similar to with the formed regular expression?
select name
from tableA
where (lower(name) ~ '\.COM|''[A-Za-z]|[^a-zA-Z0-9 -]+|\s(?=&)|(?<!\w\w)(?:\s+|-)(?!\w\w)') as nameParsed similar to '(fg )%' and
(lower(name) ~ '\.COM|''[A-Za-z]|[^a-zA-Z0-9 -]+|\s(?=&)|(?<!\w\w)(?:\s+|-)(?!\w\w)') as nameParsed similar to '%( cargo| carrier| cartage )%'
With the previous query I am getting this error:
LINE 3: ...-zA-Z0-9 -]+|\s(?=&)|(?<!\w\w)(?:\s+|-)(?!\w\w)') as namePar...
I have tried in where clause like this and it seems to be working:
select name
from tableA
where (select lower(regexp_replace(name,'\.COM|''[A-Z]|[^a-zA-Z0-9 -]+|\s(?=&)|(?<!\w\w)(?:\s+|-)(?!\w\w)','','g'))) similar to '(fg )%'
Is this the best approach? The execution time went to 46 seconds :(
Thanks in advance
You're trying to get a column name in a WHERE clause (is a comparison, not a column). So, you can use as follows:
SELECT name
FROM "tableA"
WHERE (regexp_replace(name,'\.COM|''[A-Z]|[^a-zA-Z0-9 -]+|\s(?=&)|(?<!\w\w)(?:\s+|-)(?!\w\w)','','g') similar to '(fg )%'
OR regexp_replace(name,'\.COM|''[A-Z]|[^a-zA-Z0-9 -]+|\s(?=&)|(?<!\w\w)(?:\s+|-)(?!\w\w)','','g') similar to '%( cargo| carrier| cartage )%');
Alternatively, you can use ilike instead of similar to if you want to find a specific word.

SQL LIKE - Using square bracket (character range) matching to match an entire word

In REGEX you can do something like [a-c]+, which will match on
aaabbbccc
abcccaabc
cbccaa
b
aaaaaaaaa
In SQL LIKE it seems that one can either do the equivalent of ".*" which is "%", or [a-c]. Is it possible to use the +(at least one) quantifier in SQL to do [a-c]+?
EDIT: Just to clarify, the desired end-query would look something like
SELECT * FROM table WHERE column LIKE '[a-c]+'
which would then match on the list above, but would NOT match on e.g "xxxxxaxxxx"
As a general rule, SQL Server's LIKE patterns are much weaker than regular expressions. For your particular example, you can do:
where col not like '%[^a-c]%'
That is, the column contains no characters that are not a, b, or c.
You can use regex in SQL with combination of LIKE e.g :
SELECT * FROM Table WHERE Field LIKE '%[^a-z0-9 .]%'
This works in SQL
Or in your case
SELECT * FROM Table WHERE Field LIKE '%[^a-c]%'
I seems you want some data from database, That is you don't know exactly, You must show your column and the all character that you want in that filed.

Matching exactly 2 characters in string - SQL

How can i query a column with Names of people to get only the names those contain exactly 2 “a” ?
I am familiar with % symbol that's used with LIKE but that finds all names even with 1 a , when i write %a , but i need to find only those have exactly 2 characters.
Please explain - Thanks in advance
Table Name: "People"
Column Names: "Names, Age, Gender"
Assuming you're asking for two a characters search for a string with two a's but not with three.
select *
from people
where names like '%a%a%'
and name not like '%a%a%a%'
Use '_a'. '_' is a single character wildcard where '%' matches 0 or more characters.
If you need more advanced matches, use regular expressions, using REGEXP_LIKE. See Using Regular Expressions With Oracle Database.
And of course you can use other tricks as well. For instance, you can compare the length of the string with the length of the same string but with 'a's removed from it. If the difference is 2 then the string contained two 'a's. But as you can see things get ugly real soon, since length returns 'null' when a string is empty, so you have to make an exception for that, if you want to check for names that are exactly 'aa'.
select * from People
where
length(Names) - 2 = nvl(length(replace(Names, 'a', '')), 0)
Another solution is to replace everything that is not an a with nothing and check if the resulting String is exactly two characters long:
select names
from people
where length(regexp_replace(names, '[^a]', '')) = 2;
This can also be extended to deal with uppercase As:
select names
from people
where length(regexp_replace(names, '[^aA]', '')) = 2;
SQLFiddle example: http://sqlfiddle.com/#!4/09bc6
select * from People where names like '__'; also ll work

Search for “whole word match” with SQL Server LIKE pattern

Does anyone have a LIKE pattern that matches whole words only?
It needs to account for spaces, punctuation, and start/end of string as word boundaries.
I am not using SQL Full Text Search as that is not available. I don't think it would be necessary for a simple keyword search when LIKE should be able to do the trick. However if anyone has tested performance of Full Text Search against LIKE patterns, I would be interested to hear.
Edit:
I got it to this stage, but it does not match start/end of string as a word boundary.
where DealTitle like '%[^a-zA-Z]pit[^a-zA-Z]%'
I want this to match "pit" but not "spit" in a sentence or as a single word.
E.g. DealTitle might contain "a pit of despair" or "pit your wits" or "a pit" or "a pit." or "pit!" or just "pit".
Full text indexes is the answer.
The poor cousin alternative is
'.' + column + '.' LIKE '%[^a-z]pit[^a-z]%'
FYI unless you are using _CS collation, there is no need for a-zA-Z
you can just use below condition for whitespace delimiters:
(' '+YOUR_FIELD_NAME+' ') like '% doc %'
it works faster and better than other solutions. so in your case it works fine with "a pit of despair" or "pit your wits" or "a pit" or "a pit." or just "pit", but not works for "pit!".
I think the recommended patterns exclude words with do not have any character at the beginning or at the end. I would use the following additional criteria.
where DealTitle like '%[^a-z]pit[^a-z]%' OR
DealTitle like 'pit[^a-z]%' OR
DealTitle like '%[^a-z]pit'
I hope it helps you guys!
Surround your string with spaces and create a test column like this:
SELECT t.DealTitle
FROM yourtable t
CROSS APPLY (SELECT testDeal = ' ' + ISNULL(t.DealTitle,'') + ' ') fx1
WHERE fx1.testDeal LIKE '%[^a-z]pit[^a-z]%'
If you can use regexp operator in your SQL query..
For finding any combination of spaces, punctuation and start/end of string as word boundaries:
where DealTitle regexp '(^|[[:punct:]]|[[:space:]])pit([[:space:]]|[[:punct:]]|$)'
Another simple alternative:
WHERE DealTitle like '%[^a-z]pit[^a-z]%' OR
DealTitle like '[^a-z]pit[^a-z]%' OR
DealTitle like '%[^a-z]pit[^a-z]'
This is a good topic and I want to complement this to someone how needs to find some word in some string passing this as element of a query.
SELECT
ST.WORD, ND.TEXT_STRING
FROM
[ST_TABLE] ST
LEFT JOIN
[ND_TABLE] ND ON ND.TEXT_STRING LIKE '%[^a-z]' + ST.WORD + '[^a-z]%'
WHERE
ST.WORD = 'STACK_OVERFLOW' -- OPTIONAL
With this you can list all the incidences of the ST.WORD in the ND.TEXT_STRING and you can use the WHERE clausule to filter this using some word.
You could search for the entire string in SQL:
select * from YourTable where col1 like '%TheWord%'
Then you could filter the returned rows client site, adding the extra condition that it must be a whole word. For example, if it matches the regex:
\bTheWord\b
Another option is to use a CLR function, available in SQL Server 2005 and higher. That would allow you to search for the regex server-side. This MSDN artcile has the details of how to set up a dbo.RegexMatch function.
Try using charindex to find the match:
Select *
from table
where charindex( 'Whole word to be searched', columnname) > 0