How do I exclude particular words from a match? - sql

Suppose I have the following SQL query:
SELECT
id, text
FROM
comments
WHERE
lower(text) LIKE '%hell%'
ORDER BY len(text) ASC
This matches any text that contains hell. However, I want to exclude a particular word from a match: "seashells". How can I write an SQL query that matches everything that contains hell, but ignores seashells?
Examples:
He said hello to the boy — matches.
Are there seashells on the beach? — no match.
Unshelled seashells — matches.

You could try keeping your current logic but checking for %hell% after first removing seashell from the text:
SELECT id, text
FROM comments
WHERE LOWER(REPLACE(text, 'seashell', '')) LIKE '%hell%'
ORDER BY LEN(text);
Note that a regex search would be a much better way to handle your requirement, but SQL Server does not have much built-in regex support.
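To make the logic of this answer concrete outside SQL Server, here is a minimal Python sketch. It is an illustration only: the function names `matches_by_replace` and `matches_by_regex` are made up for this example, and the negative lookbehind is one possible regex formulation, not something the answer above prescribes.

```python
import re

def matches_by_replace(text):
    # Same idea as the SQL above: drop every "seashell", then look for "hell".
    return "hell" in text.lower().replace("seashell", "")

def matches_by_regex(text):
    # Regex variant: "hell" counts only when it is not preceded by "seas".
    return re.search(r"(?<!seas)hell", text, re.IGNORECASE) is not None

examples = [
    "He said hello to the boy",
    "Are there seashells on the beach?",
    "Unshelled seashells",
]
results = [(matches_by_replace(t), matches_by_regex(t)) for t in examples]
```

Both functions agree on the three example sentences from the question: the first and third match, the second does not.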

You can try excluding the unwanted rows in the SQL query itself. Here is something you can try.
SELECT
id, text
FROM (
SELECT
id, text
FROM
comments
WHERE
lower(text) LIKE '%hell%'
EXCEPT
SELECT
id, text
FROM
comments
WHERE
lower(text) LIKE '%seashells%'
OR
lower(text) LIKE '%other_word%'
) AS filtered
ORDER BY len(text) ASC
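This excludes specific keywords that contain hell. Be aware that it drops every row containing seashells, including rows such as "Unshelled seashells" that also contain hell elsewhere.
Note: whether LIKE and NOT LIKE are case sensitive depends on the collation, which is why lower() is applied here.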

Related

BigQuery - Using regexp with LIKE operator (?)

I'd like to get product IDs from a URL. I've almost fine-tuned a query to do it, but there is still an issue I cannot solve.
The url usually looks like this:
/xp-pen/toll-spe43-deco-pro-small-medium-spe43-tobuy-p665088831/
or
/harry-potter-es-a-tuz-serlege-2019-m19247107/
As you can see there are two types of ids:
in general, ids start with '-p'
ids of some special products start with '-m'
I created this case when statement:
CASE
WHEN MAX(hits.page.pagePath) LIKE '%-p%'
THEN MAX(REGEXP_REPLACE(REGEXP_EXTRACT(
hits.page.pagePath, '-p[0-9]+/'), '\\-|p|/', ''))
WHEN MAX(hits.page.pagePath) LIKE '%-m%'
THEN MAX(REGEXP_REPLACE(REGEXP_EXTRACT(
hits.page.pagePath, '-m[0-9]+/'), '\\-|m|/', ''))
ELSE NULL
END AS productId
It looks a little complicated at first, but I really needed both regexp_replace and regexp_extract, because the '-p' or '-m' characters don't appear only before the id; they can occur multiple times in a url.
The problem with my code is that there are some special cases when the url looks like this:
/elveszett-profeciak-2019-m17855487/
As you can see the id starts with '-m' but the url also contains '-p'. In this case the result is empty value in the query.
I think it could be solved by modifying the like operator in the when part of the case when statement: LIKE '%-p%' or LIKE '%-m%'
It would be great to have a regular expression after or instead of the LIKE operator, something similar to the '-p[0-9]+/' pattern that I used in the regexp_extract function.
So what I would need is to define in the when part of the statement that if the '-p' or '-m' text is followed by numbers in the urls
I'm not sure whether this is possible in BQ.
So what I would need is to define in the when part of the statement that if the '-p' or '-m' text is followed by numbers in the urls
I think you want '-p' and '-m' followed by digits. If so, I think this does what you want:
select regexp_extract(url, '-[pm][0-9]+')
from (select '/xp-pen/toll-spe43-deco-pro-small-medium-spe43-tobuy-p665088831/' as url union all
select '/elveszett-profeciak-2019-m17855487/' union all
select '/harry-potter-es-a-tuz-serlege-2019-m19247107/'
) x
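The same pattern can be tried outside BigQuery. The Python sketch below is an assumption-laden illustration: the capture group around the digits and the trailing `/` are additions for this example (the answer's pattern is just `-[pm][0-9]+`), and `extract_product_id` is a hypothetical helper name.

```python
import re

# "-p" or "-m", then one or more digits, then the closing slash.
# The parentheses capture only the digits, so no REGEXP_REPLACE step is needed.
ID_PATTERN = re.compile(r"-[pm]([0-9]+)/")

def extract_product_id(path):
    match = ID_PATTERN.search(path)
    return match.group(1) if match else None

paths = [
    "/xp-pen/toll-spe43-deco-pro-small-medium-spe43-tobuy-p665088831/",
    "/elveszett-profeciak-2019-m17855487/",
    "/harry-potter-es-a-tuz-serlege-2019-m19247107/",
]
ids = [extract_product_id(p) for p in paths]
```

Because the pattern requires digits right after `-p` or `-m`, fragments such as `-pen`, `-pro`, or `-profeciak` are skipped, which is the case the original CASE expression stumbled on.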

Making a query which selects the news that contain a specific word

CREATE TABLE IF NOT EXISTS news (
title TEXT PRIMARY KEY,
newsletter_description TEXT NOT NULL
);
I need to write a query which selects all the news that contain the word "apple" or "watermelon"(or both) in their title or in their newsletter_description and I am not very sure about how I can do that. (case insensitive, it can also be "AppLe" or "WaterMelon")
SELECT * FROM news
WHERE title LIKE '%apple%' OR
title LIKE '%watermelon%' OR
newsletter_description LIKE '%apple%' OR
newsletter_description LIKE '%watermelon%';
SQLite's LIKE operator is case insensitive for ASCII characters by default. Unless your text contains Unicode characters outside the ASCII range, you can use the above query.
However, if you do use such characters, the lower and upper functions do not handle them either, so there is no point in using lower or upper at all.
https://www.sqlite.org/c3ref/strlike.html
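As a quick check of the default behaviour described above, here is a small sketch using Python's built-in `sqlite3` module with an in-memory database. The sample rows are invented for illustration, and only ASCII text is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE news (title TEXT PRIMARY KEY, newsletter_description TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO news VALUES (?, ?)",
    [
        ("AppLe harvest begins", "Orchard report"),
        ("Summer fruit", "WaterMelon prices fall"),
        ("Weather update", "Rain expected"),
    ],
)

# No lower()/upper() here: LIKE alone ignores case for ASCII letters.
rows = conn.execute(
    """
    SELECT title FROM news
    WHERE title LIKE '%apple%' OR title LIKE '%watermelon%'
       OR newsletter_description LIKE '%apple%'
       OR newsletter_description LIKE '%watermelon%'
    ORDER BY title
    """
).fetchall()
titles = [row[0] for row in rows]
```

This relies on the `case_sensitive_like` pragma being left at its default (off).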
You can use lower(title) like '%apple%'.
The lower function converts the whole field to lowercase, which lets you find the word regardless of how it is capitalized.
You can use the LIKE operator. For a case-insensitive search, apply lower or upper to the column, and convert the search input to the same case before passing it to the query:
select *
from news
where lower(newsletter_description) like '%watermelon%'
or lower(newsletter_description) like '%apple%'
or lower(title) like '%watermelon%'
or lower(title) like '%apple%';
Use a CTE that returns all the words that you search for and join it to the table:
with cte(word) as (values ('apple'), ('watermelon'))
select n.*
from news n inner join (select '%' || word || '%' word from cte) c
on n.title like c.word or n.newsletter_description like c.word
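One thing to watch with the join approach is that a row matching more than one word comes back once per matching word. The sketch below, again using Python's `sqlite3` with invented sample rows, shows the effect and one possible remedy (`SELECT DISTINCT`), which is a suggestion added here rather than part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE news (title TEXT PRIMARY KEY, newsletter_description TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO news VALUES (?, ?)",
    [
        ("Apple and Watermelon salad", "Recipe"),
        ("Market news", "apple prices rise"),
        ("Weather update", "Rain expected"),
    ],
)

# The CTE and join from the answer, with the SELECT keyword left as a placeholder.
QUERY = """
WITH cte(word) AS (VALUES ('apple'), ('watermelon'))
{select} n.title
FROM news n INNER JOIN (SELECT '%' || word || '%' AS word FROM cte) c
ON n.title LIKE c.word OR n.newsletter_description LIKE c.word
ORDER BY n.title
"""

plain = [r[0] for r in conn.execute(QUERY.format(select="SELECT"))]
distinct = [r[0] for r in conn.execute(QUERY.format(select="SELECT DISTINCT"))]
```

Here `plain` contains the salad row twice (once for each word), while `distinct` lists each matching title once.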
The naive way would be: select * from news where lower(title) like '%apple%' or lower(title) like '%watermelon%' or lower(newsletter_description) like '%apple%' or lower(newsletter_description) like '%watermelon%';

Finding first and second word in a string in SQL Developer

How can I find the first word and second word in a string separated by unknown number of spaces in SQL Developer? I need to run a query to get the expected result.
String:
Hello Monkey this is me
Different sentences have different number of spaces between the first and second word and I need a generic query to get the result.
Expected Result:
Hello
Monkey
I have managed to find the first word using substr and instr. However, I do not know how to find the second word due to the unknown number of spaces between the first and second word.
select substr((select ltrim(sentence) from table1),1,
(select (instr((select ltrim(sentence) from table1),' ',1,1)-1)
from table1))
from table1
Since you seem to want them as separate result rows, you could use a simple common table expression to duplicate the rows, once with the full row, then with the first word removed. Then all you have to do is get the first word from each:
WITH cte AS (
SELECT value FROM table1
UNION ALL
SELECT SUBSTR(TRIM(value), INSTR(TRIM(value), ' ')) FROM table1
)
SELECT SUBSTR(TRIM(value), 1, INSTR(TRIM(value), ' ') -1) word
FROM cte
Note that this very simple example assumes that there is a second word. If there isn't, NULL will be returned for both words.
An SQLfiddle to test with.
While Joachim Isaksson's answer is a robust and fast approach, you could also consider splitting the string and selecting from the resulting set of pieces. This is just meant as a hint at another approach in case your requirements change (e.g. more than two pieces).
You could then split by the regex /[ ]+/ to get the words between the blanks.
Find more about splitting here: How do I split a string so I can access item x?
This will strongly depend on the SQL dialect you are using.
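To show what splitting on /[ ]+/ gives you, here is a minimal Python sketch. It is not Oracle SQL; `first_two_words` is a hypothetical helper name, and stripping the string before splitting is an assumption made so that leading blanks do not produce an empty first piece.

```python
import re

def first_two_words(sentence):
    # Split on runs of one or more spaces, after trimming the ends.
    words = re.split(r"[ ]+", sentence.strip())
    first = words[0] if words and words[0] else None
    second = words[1] if len(words) > 1 else None
    return first, second

pair = first_two_words("Hello      Monkey this is me")
```

The number of spaces between the words no longer matters, and a sentence with only one word yields `None` for the second piece instead of failing.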
Try this with REGEXP_SUBSTR:
SELECT
REGEXP_SUBSTR(sentence,'\w+\s+'),
REGEXP_SUBSTR(sentence,'\s+(\w+)\s'),
REGEXP_SUBSTR(sentence,'\s+(\w+)\s+(\w+)'),
REGEXP_SUBSTR(REGEXP_SUBSTR(sentence,'\s+(\w+)\s+(\w+)'),'\w+$'),
REGEXP_SUBSTR(sentence,'\s+(\w+)\s+$')
FROM table1;
Result:
1      2       3            4     5
Hello  Monkey  Monkey this  this  is_me
To learn more about REGEXP_SUBSTR, refer to Using Regular Expressions With Oracle Database.
Test it with SqlFiddle: http://sqlfiddle.com/#!4/8e9ef/9
If you only want the first and the second word, use REGEXP_INSTR to find the start position of the second word:
SELECT
REGEXP_SUBSTR(sentence,'\w+\s+') AS FIRST,
REGEXP_SUBSTR(sentence,'\w+\s',REGEXP_INSTR(sentence,'\w+\s+')+length(REGEXP_SUBSTR(sentence,'\w+\s+'))) AS SECOND
FROM table1;

Oracle sql queries

I am working on a small relational database for school and having trouble with a simple query. I am supposed to find all records in a table that have a field with the word 'OMG' somewhere in the text.
I have tried a few different approaches and I can't figure it out. I am getting an invalid relational operator error.
my attempts:
select * from comments where contains('omg');
select * from comments where text contains('omg');
select * from comments where about('omg');
and several more variations of the above. As you can see I am brand spanking new to SQL.
text is the name of the field in the comments table.
Thanks for any help.
Assuming that the column name is text:
select * from comments where text like '%omg%';
The percent sign % is a wildcard, which means that any text can come before or after omg.
Pay attention: if you don't want the results to contain words in which omg is a substring, you might want to consider adding whitespace:
select * from comments where text like '% omg %';
Of course, that doesn't cover cases like:
omg is at the end of the sentence (followed by a ".")
omg has "!" right afterwards
etc
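The difference between the two patterns is easy to see on a few sample rows. The sketch below uses Python's built-in `sqlite3` rather than Oracle, so treat it as an illustration of the LIKE patterns only; the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, text TEXT)")
conn.executemany(
    "INSERT INTO comments VALUES (?, ?)",
    [
        (1, "omg that is wild"),       # omg at the start
        (2, "so omg really"),          # omg surrounded by spaces
        (3, "a zomgwtf moment"),       # omg inside another word
        (4, "wow omg!"),               # omg followed by "!"
        (5, "that was close, omg."),   # omg followed by "."
    ],
)

def ids_matching(pattern):
    cursor = conn.execute(
        "SELECT id FROM comments WHERE text LIKE ? ORDER BY id", (pattern,)
    )
    return [row[0] for row in cursor]

substring_ids = ids_matching("%omg%")
spaced_ids = ids_matching("% omg %")
```

The substring pattern returns all five rows, including the unwanted one, while the space-padded pattern returns only row 2 and misses the start-of-text and punctuation cases listed above.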
You may want to use the LIKE operator with wildcards (%):
SELECT * FROM comments WHERE text LIKE '%omg%';

Search for “whole word match” with SQL Server LIKE pattern

Does anyone have a LIKE pattern that matches whole words only?
It needs to account for spaces, punctuation, and start/end of string as word boundaries.
I am not using SQL Full Text Search as that is not available. I don't think it would be necessary for a simple keyword search when LIKE should be able to do the trick. However if anyone has tested performance of Full Text Search against LIKE patterns, I would be interested to hear.
Edit:
I got it to this stage, but it does not match start/end of string as a word boundary.
where DealTitle like '%[^a-zA-Z]pit[^a-zA-Z]%'
I want this to match "pit" but not "spit" in a sentence or as a single word.
E.g. DealTitle might contain "a pit of despair" or "pit your wits" or "a pit" or "a pit." or "pit!" or just "pit".
Full-text indexes are the answer.
The poor cousin alternative is
'.' + column + '.' LIKE '%[^a-z]pit[^a-z]%'
FYI unless you are using _CS collation, there is no need for a-zA-Z
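The padding trick works because the added '.' characters give the pattern a non-letter to match at the start and end of the string. Here is a rough Python sketch of the same idea, using a regex in place of the SQL Server LIKE pattern; `contains_whole_word` is a hypothetical name, and lowercasing stands in for a case-insensitive collation.

```python
import re

def contains_whole_word(title, word):
    # Pad with "." on both sides so a word at the very start or end
    # still has a non-letter neighbour, then require non-letters around it.
    padded = ("." + title + ".").lower()
    pattern = "[^a-z]" + re.escape(word.lower()) + "[^a-z]"
    return re.search(pattern, padded) is not None

titles = ["a pit of despair", "pit your wits", "a pit", "a pit.", "pit!", "pit", "spit"]
flags = [contains_whole_word(t, "pit") for t in titles]
```

Every title from the question matches except "spit", which is the behaviour the asker wanted.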
You can just use the condition below for whitespace delimiters:
(' '+YOUR_FIELD_NAME+' ') like '% doc %'
It is faster and simpler than the other solutions. In your case it works for "a pit of despair", "pit your wits", "a pit", or just "pit", but it does not work for "a pit." or "pit!", where punctuation directly follows the word.
I think the recommended patterns exclude words that do not have any character before them or after them. I would use the following additional criteria.
where DealTitle like '%[^a-z]pit[^a-z]%' OR
DealTitle like 'pit[^a-z]%' OR
DealTitle like '%[^a-z]pit'
I hope it helps you guys!
Surround your string with spaces and create a test column like this:
SELECT t.DealTitle
FROM yourtable t
CROSS APPLY (SELECT testDeal = ' ' + ISNULL(t.DealTitle,'') + ' ') fx1
WHERE fx1.testDeal LIKE '%[^a-z]pit[^a-z]%'
If you can use regexp operator in your SQL query..
For finding any combination of spaces, punctuation and start/end of string as word boundaries:
where DealTitle regexp '(^|[[:punct:]]|[[:space:]])pit([[:space:]]|[[:punct:]]|$)'
Another simple alternative:
WHERE DealTitle like '%[^a-z]pit[^a-z]%' OR
DealTitle like '[^a-z]pit[^a-z]%' OR
DealTitle like '%[^a-z]pit[^a-z]'
This is a good topic, and I want to add something for anyone who needs to find a word in a string when the word itself comes from another table in the query.
SELECT
ST.WORD, ND.TEXT_STRING
FROM
[ST_TABLE] ST
LEFT JOIN
[ND_TABLE] ND ON ND.TEXT_STRING LIKE '%[^a-z]' + ST.WORD + '[^a-z]%'
WHERE
ST.WORD = 'STACK_OVERFLOW' -- OPTIONAL
With this you can list all the occurrences of ST.WORD in ND.TEXT_STRING, and you can use the WHERE clause to filter on a specific word.
You could search for the entire string in SQL:
select * from YourTable where col1 like '%TheWord%'
Then you could filter the returned rows client side, adding the extra condition that it must be a whole word. For example, if it matches the regex:
\bTheWord\b
Another option is to use a CLR function, available in SQL Server 2005 and higher. That would allow you to search for the regex server-side. This MSDN article has the details of how to set up a dbo.RegexMatch function.
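For the client-side filtering option, a sketch in Python might look like the following. It assumes the rows have already been fetched with the broad LIKE '%TheWord%' query; `filter_whole_word` is a hypothetical helper name, and the case-insensitive flag is an assumption about the desired behaviour.

```python
import re

def filter_whole_word(rows, word):
    # \b marks a boundary between a word character and a non-word character,
    # so "pit" matches in "pit!" but not inside "spit" or "pitfall".
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [row for row in rows if pattern.search(row)]

fetched = ["a pit of despair", "spit", "pit!", "pitfall", "Pit"]
whole_word_rows = filter_whole_word(fetched, "pit")
```

Keep in mind that `\b` treats digits and underscores as word characters, so a value like "pit_stop" would not count as containing the whole word "pit".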
Try using charindex to find the match:
Select *
from table
where charindex( 'Whole word to be searched', columnname) > 0