Search for “whole word match” with SQL Server LIKE pattern - sql

Does anyone have a LIKE pattern that matches whole words only?
It needs to account for spaces, punctuation, and start/end of string as word boundaries.
I am not using SQL Full Text Search as that is not available. I don't think it would be necessary for a simple keyword search when LIKE should be able to do the trick. However if anyone has tested performance of Full Text Search against LIKE patterns, I would be interested to hear.
Edit:
I got it to this stage, but it does not match start/end of string as a word boundary.
where DealTitle like '%[^a-zA-Z]pit[^a-zA-Z]%'
I want this to match "pit" but not "spit" in a sentence or as a single word.
E.g. DealTitle might contain "a pit of despair" or "pit your wits" or "a pit" or "a pit." or "pit!" or just "pit".

Full-text indexes are the answer.
The poor cousin alternative is
'.' + column + '.' LIKE '%[^a-z]pit[^a-z]%'
FYI unless you are using _CS collation, there is no need for a-zA-Z
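To see why the padding helps, here is a minimal Python sketch that mimics the padded pattern with a regular expression. It is only an emulation, not SQL Server itself: `matches_whole_word` is a hypothetical helper, and `re.IGNORECASE` stands in for a case-insensitive collation.

```python
import re

def matches_whole_word(title: str, word: str) -> bool:
    """Emulate  '.' + title + '.' LIKE '%[^a-z]word[^a-z]%'  (case-insensitive)."""
    # The padding supplies a non-letter at both ends, so a word at the
    # very start or end of the title still has a "boundary" on each side.
    padded = "." + title + "."
    pattern = "[^a-zA-Z]" + re.escape(word) + "[^a-zA-Z]"
    return re.search(pattern, padded, re.IGNORECASE) is not None

titles = ["a pit of despair", "pit your wits", "a pit", "a pit.",
          "pit!", "pit", "spit", "pitfall"]
results = {t: matches_whole_word(t, "pit") for t in titles}
for title, matched in results.items():
    print(f"{title!r}: {matched}")
```

Every title from the question matches, while "spit" and "pitfall" do not, because the character next to "pit" is a letter.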

You can use the condition below if whitespace is the only delimiter you care about:
(' '+YOUR_FIELD_NAME+' ') like '% doc %'
It is simpler than the character-class patterns, but only spaces count as word boundaries. So in your case (searching for pit) it works fine with "a pit of despair", "pit your wits", "a pit" and just "pit", but it does not work for "a pit." or "pit!".

I think the recommended patterns exclude words which do not have any character before or after them. I would use the following additional criteria.
where DealTitle like '%[^a-z]pit[^a-z]%' OR
DealTitle like 'pit[^a-z]%' OR
DealTitle like '%[^a-z]pit' OR
DealTitle = 'pit'
I hope it helps you guys!

Surround your string with spaces and create a test column like this:
SELECT t.DealTitle
FROM yourtable t
CROSS APPLY (SELECT testDeal = ' ' + ISNULL(t.DealTitle,'') + ' ') fx1
WHERE fx1.testDeal LIKE '%[^a-z]pit[^a-z]%'

If your database has a REGEXP operator (MySQL does; SQL Server's LIKE does not offer one), you can use it.
For finding any combination of spaces, punctuation and start/end of string as word boundaries:
where DealTitle regexp '(^|[[:punct:]]|[[:space:]])pit([[:space:]]|[[:punct:]]|$)'

Another simple alternative:
WHERE DealTitle like '%[^a-z]pit[^a-z]%' OR
DealTitle like 'pit[^a-z]%' OR
DealTitle like '%[^a-z]pit'

This is a good topic, and I want to add something for anyone who needs to find a word in a string when the word comes from another table in the query.
SELECT
ST.WORD, ND.TEXT_STRING
FROM
[ST_TABLE] ST
LEFT JOIN
[ND_TABLE] ND ON ND.TEXT_STRING LIKE '%[^a-z]' + ST.WORD + '[^a-z]%'
WHERE
ST.WORD = 'STACK_OVERFLOW' -- OPTIONAL
With this you can list all the occurrences of ST.WORD in ND.TEXT_STRING, and you can use the WHERE clause to filter for a particular word.

You could search for the entire string in SQL:
select * from YourTable where col1 like '%TheWord%'
Then you could filter the returned rows client side, adding the extra condition that it must be a whole word. For example, if it matches the regex:
\bTheWord\b
Another option is to use a CLR function, available in SQL Server 2005 and higher. That would allow you to search for the regex server-side. This MSDN article has the details of how to set up a dbo.RegexMatch function.
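As a rough illustration of the client-side filtering step, here is a minimal Python sketch. The list `candidate_rows` is a made-up stand-in for whatever the LIKE '%TheWord%' query returned, and `filter_whole_word` is a hypothetical helper name.

```python
import re

def filter_whole_word(rows, word):
    """Keep only the rows in which `word` appears as a whole word."""
    # \b matches at a transition between a word char and a non-word char.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [row for row in rows if pattern.search(row)]

# Stand-in for rows returned by:  select * from YourTable where col1 like '%pit%'
candidate_rows = ["a pit of despair", "spit", "pit!", "cockpit view", "Pit stop"]
whole_word_rows = filter_whole_word(candidate_rows, "pit")
print(whole_word_rows)
```

Keep in mind that \b treats digits and underscores as word characters, so "pit_1" or "pit2" would not count as a whole-word match here.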

Try using charindex to find the match:
Select *
from table
where charindex( 'Whole word to be searched', columnname) > 0

Related

Regex to split apart text. Special case for parentheses with spaces in them

I am trying to split a field by delimiter in LookML. This field either follows the format of:
Managers (AE)
Managers (AE - MM)
I was able to split the first case using this
sql: case
when rlike (${user_role_name}, '^.*[\\(\\)].*$') then split_part(${user_role_name}, ' ', -1)
However, I haven't been able to get the 2nd case to do the same. It's in a case statement, so I am going to add another when clause, but I am not able to figure out the regex for parentheses that contain spaces.
Thanks in advance for the help!
By "split" the string, I think you mean you want to extract the part in parentheses, right?
I would do this using a regex substring method. You didn't mention what warehouse you're using, and the syntax will vary a little, but on Snowflake that would look like:
regexp_substr(${user_role_name}, '\\([^)]*\\)')
So, for example, with the inputs you gave:
select regexp_substr('Managers (AE)', '\\([^)]*\\)')
union all
select regexp_substr('Managers (AE - MM)', '\\([^)]*\\)')
result
(AE)
(AE - MM)
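The pattern itself can be checked outside the warehouse. Below is a small Python sketch of the same extraction; `extract_parenthesized` is a hypothetical helper, and Python's `re` is only a stand-in for Snowflake's regex engine (here a single backslash inside a raw string is enough).

```python
import re

def extract_parenthesized(role_name: str):
    """Return the first '(...)' group, parentheses included, or None."""
    # \(      a literal opening parenthesis
    # [^)]*   anything except a closing parenthesis (spaces and dashes included)
    # \)      a literal closing parenthesis
    match = re.search(r"\([^)]*\)", role_name)
    return match.group(0) if match else None

print(extract_parenthesized("Managers (AE)"))
print(extract_parenthesized("Managers (AE - MM)"))
```

Because the character class excludes only the closing parenthesis, spaces inside the parentheses need no special handling.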

Regex not matching correct string

I am busy building a lookup table for specific merchant names. I tried to use the following regex, but it returns fewer results than the standard "like" function in Netezza SQL. Please see below:
SQL Like function: where trim(upper(a.MRCH_NME)) like '%CNA %' -- returns 4622 matches
Regex function in Netezza SQL: where array_combine(regexp_extract_all(trim(upper(a.MRCH_NME)),'.*CNA\s','i'),'|') = 'CNA' -- returns 2226 matches
I looked at the two result sets and found that strings such as the following aren't matched:
!C CNA INT ARR
*CNA PLATZ 0400
015764 CNA CRAD
C#CNA PARK 0
I made use of the following regex expression: /.*CNA\s/
Any idea why the above strings aren't being returned as matches?
Thank you.
You probably should be using regexp_like:
SELECT *
FROM yourTable
WHERE REGEXP_LIKE(MRCH_NME, 'CNA[ ]', 'i');
This would be logically identical to the following query using LIKE:
SELECT *
FROM yourTable
WHERE MRCH_NME LIKE '%CNA %';
It seems to me the problem is more with your code than with the regex. Look: like '%CNA %' returns all entries that contain a CNA substring followed by a literal space anywhere inside the entry. The '.*CNA\s' regex matches any 0+ chars other than newline, followed by CNA and any whitespace char.
According to this reference, \s matches "a white space character. White space is defined as [\t\n\f\r\p{Z}]".
Thus, you should in fact just use
WHERE REGEXP_LIKE(MRCH_NME, 'CNA ', 'i')
or, better with a word boundary check:
WHERE REGEXP_LIKE(MRCH_NME, '\bCNA\b', 'i')
where \b marks a transition from a word to non-word and non-word to word character, thus ensuring a whole word search and justifying the regex usage.
If you do not need to match the merchant name as a whole word, use the regular LIKE with '%CNA %', it should be more efficient.
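To make the difference concrete, here is a small Python sketch run against the four sample strings from the question. Python's `re` module is used as a stand-in, so details of Netezza's regex functions may differ; the variable names are made up for illustration.

```python
import re

names = ["!C CNA INT ARR", "*CNA PLATZ 0400", "015764 CNA CRAD", "C#CNA PARK 0"]

# Rough equivalent of LIKE '%CNA %': a plain substring test.
like_hits = [n for n in names if "CNA " in n.upper()]

# Whole-word test with \b on both sides.
word_hits = [n for n in names if re.search(r"\bCNA\b", n, re.IGNORECASE)]

# What '.*CNA\s' matches: the leading characters are part of the match,
# so the extracted text is generally not equal to 'CNA'.
extracted = [re.search(r".*CNA\s", n, re.IGNORECASE).group(0) for n in names]

print(like_hits)
print(word_hits)
print(extracted)
```

All four strings pass both the substring test and the word-boundary test. The extracted text, however, carries the prefix (for example "*CNA "), which would be one way an equality comparison against 'CNA' ends up dropping rows.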

Match Character Whether or Not It Exists in Like Statement

I need a like expression that will match a character whether or not it exists. It needs to match the following values:
..."value": "123456"...
..."value": "123456"...
"...value":"123456"...
This like statement will almost work: LIKE '%value":%"123456"%'
But there are values like this one that would also match, but I don't want returned:
..."value":"99999", "other":"123456"...
A regex expression to do what I'm looking to do is 'value": *?"123456"'. I need to do this in SQL Server 2008 and I don't believe there is good regex support in that version. How can I match using a like statement?
Remove the whitespace in your compare with REPLACE():
WHERE REPLACE(column,' ','') LIKE '%"value":"123456"%'
You may need a double replace to strip tabs as well:
REPLACE(REPLACE(column,' ',''),CHAR(9),'')
I don't think you can with the like operator. You could exclude ones you could match, like if you want to make sure it just doesn't contain other:
[field] LIKE '%value":%"123456"%' AND [field] NOT LIKE '%"other"%'
Otherwise I think you'd have to do some processing on the string. You could write a UDF to take the string and parse it to find the value for 'value' and compare based on that:
dbo.fn_GetValue([field], 'value') = '123456'
The function could find the index of '"' + @name + '"', find the index of the next quote and of the one after that, then return the string between those two quotes.
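The parsing steps described above can be sketched in Python to check the logic before writing the T-SQL UDF. This is only an illustration: `get_value` is a hypothetical name mirroring dbo.fn_GetValue, and the sketch assumes string values with no escaped quotes inside them.

```python
def get_value(text: str, name: str):
    """Return the quoted value that follows "name", or None if not found."""
    key = '"' + name + '"'
    key_pos = text.find(key)
    if key_pos == -1:
        return None
    # Next quote after the key: the opening quote of the value.
    open_quote = text.find('"', key_pos + len(key))
    if open_quote == -1:
        return None
    # The quote after that: the closing quote of the value.
    close_quote = text.find('"', open_quote + 1)
    if close_quote == -1:
        return None
    return text[open_quote + 1:close_quote]

print(get_value('{"value": "123456"}', "value"))
print(get_value('{"value":"99999", "other":"123456"}', "value"))
```

Because the comparison is made on the extracted value, the row with "value":"99999" no longer matches a search for 123456, whatever the spacing after the colon.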

SQL fetch results by concatenating words in column

I have a column store_name (varchar). In that column I have entries like prime sport, best buy... with a space. But when a user types a concatenated string like primesport, without the space, I need to return the result prime sport. How can I achieve this? Please help me.
SELECT *
FROM TABLE
WHERE replace(store_name, ' ', '') LIKE '%'+@SEARCH+'%' OR STORE_NAME LIKE '%'+@SEARCH+'%'
I don't know of a complete solution, and I am looking for one myself, but what I do know may work for you. You can achieve this by applying different kinds of string operations:
Mike can be Myke or Myce or Mikke and so on.
Cat can be Kat or katt or catt and so on.
For this you would write a function that generates the possible variants, build a SQL query from all of them, and query the database.
A similar kind of search is known as Soundex search, which both Oracle and Microsoft SQL Server offer. Have a look at it; it may work.
And overall, make use of functions like upper and lower.
Have you tried using replace()?
You can strip the white space from the column in the query and then use LIKE:
SELECT * FROM table WHERE replace(store_name, ' ', '') LIKE '%primesport%'
It will work for entries like 'prime sport' when querying with 'primesport'.
Or you can use regex.
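The replace() approach can be tried out quickly with an in-memory SQLite database from Python. This is a sketch under assumptions: the `stores` table and `find_stores` helper are made up, SQLite stands in for the real database, and `||` is SQLite's concatenation operator where T-SQL would use `+`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stores (store_name TEXT)")
conn.executemany(
    "INSERT INTO stores VALUES (?)",
    [("prime sport",), ("best buy",), ("primesport outlet",)],
)

def find_stores(search: str):
    """Match store names while ignoring spaces on both sides of the comparison."""
    sql = """
        SELECT store_name FROM stores
        WHERE replace(store_name, ' ', '') LIKE '%' || ? || '%'
        ORDER BY store_name
    """
    # Strip spaces from the search term too, so 'prime sport' and
    # 'primesport' behave the same.
    return [row[0] for row in conn.execute(sql, (search.replace(" ", ""),))]

print(find_stores("primesport"))
print(find_stores("best buy"))
```

Passing the search term as a bound parameter avoids building the SQL by string concatenation. Note that wrapping the column in replace() prevents the use of an ordinary index on store_name, so every row is scanned.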

SQL (MySQL): Match first letter of any word in a string?

(Note: This is for MySQL's SQL, not SQL Server.)
I have a database column with values like "abc def GHI JKL". I want to write a WHERE clause that includes a case-insensitive test for any word that begins with a specific letter. For example, that example would test true for the letters a, d, g, j because there's a 'word' beginning with each of those letters. The application is for a search that offers to find records that have only words beginning with the specified letter. Also note that there is not a fulltext index for this table.
You can use a LIKE operation. If your words are space-separated, append a space to the start of the string to give the first word a chance to match:
SELECT
StringCol
FROM
MyTable
WHERE
' ' + StringCol LIKE '% ' + MyLetterParam + '%'
Where MyLetterParam could be something like this:
'[acgj]'
To look for more than a space as a word separator, you can expand that technique. The following would treat TAB, CR, LF, space and NBSP as word separators.
WHERE
' ' + StringCol LIKE '%['+' '+CHAR(9)+CHAR(10)+CHAR(13)+CHAR(160)+'][acgj]%'
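This approach needs nothing beyond LIKE, but as written it is SQL Server syntax: the + string concatenation and the [...] character classes in LIKE are T-SQL features. MySQL concatenates with CONCAT() and its LIKE supports only the % and _ wildcards, so the query would need adapting there.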
Using the REGEXP operator:
SELECT * FROM `articles` WHERE `body` REGEXP '[[:<:]][acgj]'
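It returns records where the body column contains words starting with a, c, g or j (case insensitive). Note that the [[:<:]] word-boundary marker belongs to MySQL's older regex library; MySQL 8.0.4 and later use ICU, where \b (written '\\b' in a string literal) is the word-boundary marker instead.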
Be aware though: this is not a very good idea if you expect any heavy load (not using index - it scans every row!)
Check the Pattern Matching and Regular Expressions sections of the MySQL Reference Manual.
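The word-start test can be tried out in Python before committing to a particular MySQL regex syntax. This is a sketch only: `has_word_starting_with` is a hypothetical helper, and Python's \b is used in place of MySQL's word-boundary marker.

```python
import re

def has_word_starting_with(text: str, letters: str) -> bool:
    """True if any word in `text` begins with one of `letters` (case-insensitive)."""
    # \b followed by a character class: a word boundary, then one of the letters.
    pattern = r"\b[" + re.escape(letters) + r"]"
    return re.search(pattern, text, re.IGNORECASE) is not None

sample = "abc def GHI JKL"
for letter in "abcdgj":
    print(letter, has_word_starting_with(sample, letter))
```

For the sample value, the test is true for a, d, g and j, and false for b and c, since no word starts with those letters.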