Count and Order By Where clause Matches - sql

I'm writing some very simple search functionality for a list of FAQs. I'm splitting the search string on a variety of characters, including spaces, then performing a select along the lines of
SELECT *
FROM "faq"
WHERE
((LOWER("Question") LIKE '%what%'
OR LOWER("Question") LIKE '%is%'
OR LOWER("Question") LIKE '%a%'
OR LOWER("Question") LIKE '%duck%'))
I've had to edit this slightly as it's generated by our data access layer, but it should give you an idea of what's going on.
The problem is demonstrated well by the above query: most questions are likely to contain the words "a" or "is", but I cannot filter these out, as acronyms may well be important to the searcher. It has been suggested that we order by the number of matching keywords. However, I have been unable to find a way of doing this in SQL (we do not have time to create a simple search engine with an index of keywords, etc.). Does anyone know of a way to count the number of LIKE matches in an SQL statement and order by that, so that the questions with the most keywords appear at the top of the results?

I assume the list of matching keywords is being entered by the user and inserted into the query dynamically by the application, immediately prior to executing the query. If so, I suggest amending the query like so:
SELECT *
FROM "faq"
WHERE
((LOWER("Question") LIKE '%what%'
OR LOWER("Question") LIKE '%is%'
OR LOWER("Question") LIKE '%a%'
OR LOWER("Question") LIKE '%duck%'))
ORDER BY
case when LOWER("Question") LIKE '%what%' then 1 else 0 end +
case when LOWER("Question") LIKE '%is%' then 1 else 0 end +
case when LOWER("Question") LIKE '%a%' then 1 else 0 end +
case when LOWER("Question") LIKE '%duck%' then 1 else 0 end
DESC;
This would even enable you to "weight" the importance of each selection term, assuming the user (or an algorithm) could assign a weighting to each term.
One caveat: if your query is being constructed dynamically, are you aware of the risk of SQL injection attacks?
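A minimal sketch of the ordering idea, run through Python's built-in sqlite3 module with bound parameters so the search terms never become part of the SQL text. The choice of SQLite, the sample rows, and the search_faq helper are assumptions made for illustration; the question does not say which database or data access layer is in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "faq" ("Question" TEXT)')
conn.executemany(
    'INSERT INTO "faq" VALUES (?)',
    [("What is a duck?",), ("Is it raining?",), ("Why?",), ("How do ducks swim?",)],
)

def search_faq(conn, terms):
    """Return (question, score) rows, best match first (hypothetical helper)."""
    if not terms:
        return []
    # One CASE per term; each adds 1 to the score when its pattern matches.
    score = " + ".join(
        'CASE WHEN LOWER("Question") LIKE ? THEN 1 ELSE 0 END' for _ in terms
    )
    sql = (
        'SELECT "Question", score FROM '
        f'(SELECT "Question", {score} AS score FROM "faq") AS scored '
        'WHERE score > 0 ORDER BY score DESC, "Question"'
    )
    # Terms travel as bound parameters, never as part of the SQL text.
    # Note: % and _ inside a term would still act as LIKE wildcards.
    patterns = [f"%{term.lower()}%" for term in terms]
    return conn.execute(sql, patterns).fetchall()

results = search_faq(conn, ["what", "is", "a", "duck"])
```

To weight terms, the constant 1 in each CASE could be replaced by a per-term weight passed as another bound parameter.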

You can write a function which counts the occurrences of one string in another like this:
CREATE OR REPLACE FUNCTION CountInString(text,text)
RETURNS integer AS $$
SELECT(Length($1) - Length(REPLACE($1, $2, ''))) / Length($2) ;
$$ LANGUAGE SQL IMMUTABLE;
And use it in the select: select CountInString("Question",' what ') from "faq".
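The same length-difference trick can be tried without creating a function. The sketch below assumes SQLite (via Python's sqlite3) and made-up sample rows. It lowercases the question so the count is case-insensitive for ASCII text, and it counts non-overlapping occurrences only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE faq ("Question" TEXT)')
conn.executemany(
    "INSERT INTO faq VALUES (?)",
    [("What is a duck? What is a goose?",), ("Is it raining?",)],
)

# Removing every occurrence of the term shortens the string by
# (occurrences * length of term), so dividing gives the count.
COUNT_SQL = """
SELECT "Question",
       (LENGTH("Question")
        - LENGTH(REPLACE(LOWER("Question"), :term, ''))) / LENGTH(:term) AS hits
FROM faq
ORDER BY hits DESC
"""

counts = conn.execute(COUNT_SQL, {"term": "what"}).fetchall()
```

An empty search term would make the divisor zero, so the caller should reject it before running the query.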

Related

How to combine LIKE and CASE WHEN in SQLite?

...Hi, a database ingenue here. I'm trying to figure out how to use LIKE with CASE in SQLite, or some equivalent approach. I've got a prod_names table that contains concatenated data--occasionally just 1 item, but usually containing several comma-separated items. For my new 'Toy' column, I need to find every record that contains 'CapGun'. The code below works only when 'CapGun' is the only item, and not when there are multiple items (eg, 'BarbieDoll, CapGun, EasyBakeOven').
SELECT
customer_id,
prod_names,
CASE prod_names WHEN 'CapGun' THEN 'CG' ELSE 'not_CG' END Toy
FROM
Toys_table
ORDER BY
Toy
I've tried various approaches like WHEN LIKE '%CapGun%', WHEN INSTR(prod_names,'CapGun') > 0, and WHEN GLOB '*CapGun*' but they all return no results or throw a syntax error.
Any suggestions? I'm sure there must be a simple solution.
Use the expanded case syntax:
CASE
WHEN prod_names LIKE '%CapGun%' THEN ...
ELSE ...
END
This lets you use any expression as the condition in your CASE, including other columns.
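For illustration, here is the searched CASE form run against SQLite through Python's sqlite3 module. The three sample rows are assumptions; the table and column names come from the question. The simple form, CASE prod_names WHEN 'CapGun', only tests for equality, which is why it misses the comma-separated rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Toys_table (customer_id INTEGER, prod_names TEXT)")
conn.executemany(
    "INSERT INTO Toys_table VALUES (?, ?)",
    [
        (1, "CapGun"),
        (2, "BarbieDoll, CapGun, EasyBakeOven"),
        (3, "EasyBakeOven"),
    ],
)

# Searched CASE: each WHEN carries its own boolean expression.
rows = conn.execute(
    """
    SELECT customer_id,
           CASE WHEN prod_names LIKE '%CapGun%' THEN 'CG' ELSE 'not_CG' END AS Toy
    FROM Toys_table
    ORDER BY Toy, customer_id
    """
).fetchall()
```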

SQL full text search behavior on numeric values

I have a table with about 200 million records. One of the columns is defined as varchar(100) and it's included in a full text index. Most of the values are numeric. Only a few are not numeric.
The problem is that it's not working well. For example, if a row contains the value '123456789' and I look for '567', it's not returning this row. It will only return rows where the value is exactly '567'.
What am I doing wrong?
SQL Server 2012.
Thanks.
Full text search doesn't support leading wildcards
In my setup, these return the same
SELECT *
FROM [dbo].[somelogtable]
where CONTAINS (logmessage, N'28400')
SELECT *
FROM [dbo].[somelogtable]
where CONTAINS (logmessage, N'"2840*"')
This gives zero rows
SELECT *
FROM [dbo].[somelogtable]
where CONTAINS (logmessage, N'"*840*"')
You'll have to use LIKE or some fancy trigram approach
The problem is probably that you are using the wrong tool: full-text queries perform linguistic searches, and it seems you want a simple LIKE condition.
If you want to get a solution to your needs then you can post DDL+DML+'desired result'
You can do this:
....your_query.... LIKE '%567%' ;
This will return all the rows that have 567 at the beginning, at the end, or somewhere in between.
Most likely you're missing the % before and after the string you search for in the LIKE clause.
E.g.:
SELECT * FROM t WHERE att LIKE '66'
is the same as using WHERE att = '66'
If you write:
SELECT * FROM t WHERE att LIKE '%66%'
it will return all the rows containing two sixes, one after the other.
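A small sketch of how the position of % changes what LIKE matches. It uses SQLite through Python's sqlite3 as a stand-in, on the assumption that % behaves the same way in SQL Server's LIKE; the table t, the column att, and the sample values are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (att TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?)",
    [("567",), ("123456789",), ("567000",), ("000567",)],
)

def matches(pattern):
    """Return the sorted values of att that match the LIKE pattern."""
    cur = conn.execute("SELECT att FROM t WHERE att LIKE ?", (pattern,))
    return sorted(row[0] for row in cur)

exact = matches("567")       # no wildcard: behaves like equality
prefix = matches("567%")     # value must start with 567
anywhere = matches("%567%")  # 567 may appear at any position
```

Keep in mind that a leading % generally prevents an ordinary index on the column from being used, which matters on a table with 200 million rows.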

Optimize SQL Query containing Like Operator

Trying to make the following query neat and run faster. Any insights are helpful. I have to use the LIKE operator since I need to search for the pattern anywhere in the field.
Select Col1,Col2,Col3 from TableName where
(Subject like '%Maths%' OR
Subject like '%Physics%' OR
Subject like '%Chemistry%' OR
Subject like '%English%')
AND
(Description like '%Maths%' OR
Description like '%Physics%' OR
Description like '%Chemistry%' OR
DESCRIPTION like '%English%')
AND
(Extra like '%Maths%' OR
Extra like '%Physics%' OR
Extra like '%Chemistry%' OR
Extra like '%English%') AND CreatedDate > '2017-01-01'
Basically, you can't optimize this query using basic SQL. If the strings you are searching for are words in the texts, then you can use a full-text index. The place to start in learning about this is the documentation.
If you happen to know that you will be searching for these four strings, you can set up computed columns and then build indexes on the computed columns. That would be fast. But you would be limited to exactly those strings.
All is not lost. Technically, there are other solutions, such as those based on n-grams or by converting to XML/JSON and indexing that. However, these are either not supported in SQL Server or non-trivial to implement.
You can try this using a CTE and CHARINDEX,
then compare its execution plan with that of the query in your OP.
;with mycte as ( --- building words to look for
select * from
(values ('English'),
('Maths'),
('Physics'),
('Chemistry')
) t (words)
)
select t.* from
tablename t
inner join mycte c
--- note: this join needs the same word to match all three columns
on charindex(c.words,t.subject) > 0 --- check if there is a match
and charindex(c.words,t.description) > 0 --- check if there is a match
and charindex(c.words,t.extra) > 0 --- check if there is a match
where
t.createddate > '2017-01-01'
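One thing to watch: the join requires a single word to appear in all three columns, while the original query accepts any listed word in each column independently. A sketch that keeps the original meaning uses one EXISTS per column. It is shown here on SQLite via Python's sqlite3, with instr() standing in for CHARINDEX; the words table, the id column, and the sample rows are assumptions. It tidies the query but is not expected to make it faster, and instr() is case-sensitive where LIKE usually is not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE words (word TEXT);
    INSERT INTO words VALUES ('Maths'), ('Physics'), ('Chemistry'), ('English');
    CREATE TABLE tablename (
        id INTEGER, subject TEXT, description TEXT, extra TEXT, createddate TEXT
    );
    INSERT INTO tablename VALUES
        (1, 'Maths 101', 'Intro to Physics', 'English notes', '2017-03-01'),
        (2, 'Maths 101', 'Maths basics', 'Maths extra', '2016-12-31'),
        (3, 'History', 'Maths basics', 'Maths extra', '2017-05-01');
    """
)

# Each column only needs to contain some word from the list,
# not necessarily the same word as the other columns.
ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT t.id
        FROM tablename t
        WHERE t.createddate > '2017-01-01'
          AND EXISTS (SELECT 1 FROM words w WHERE instr(t.subject, w.word) > 0)
          AND EXISTS (SELECT 1 FROM words w WHERE instr(t.description, w.word) > 0)
          AND EXISTS (SELECT 1 FROM words w WHERE instr(t.extra, w.word) > 0)
        ORDER BY t.id
        """
    )
]
```

Row 1 matches with a different word in each column, which a single join on the word list would miss.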

An esoteric pondering regarding the lack of compatibility between % and = and <>

I am new to the world of programming but please humor me nonetheless.
I know that % works with LIKE and NOT LIKE. For example the following two queries work:
--QUERY 1
SELECT *
FROM TrumpFeccandid_Pacs
WHERE PACID NOT LIKE 'C%'
--QUERY 2
SELECT *
FROM TrumpFeccandid_Pacs
WHERE PACID LIKE 'C%'
However % does not work with = or <>. For example, the following two queries do not work:
--QUERY A
SELECT *
FROM TrumpFeccandid_Pacs
WHERE PACID <> 'C%'
--QUERY B
SELECT *
FROM TrumpFeccandid_Pacs
WHERE PACID = 'C%'
Why is this the case? Intuitively speaking I feel like not only should queries A and B work but Query A should be equivalent to Query 1 and Query B should be equivalent to Query 2.
These examples were using T-SQL from Sql Server 2016.
Imagine a relatively simple query like this one:
SELECT *
FROM A
JOIN B ON A.Name = B.Name
If = worked like LIKE, god help you if Name contains a percent or underscore!
"Intuitively speaking I feel like"
That is where you go awry!
LIKE is defined a certain way, as are = and <>. The people who designed the language presumably tried to make it accessible, to make it easy to understand and remember and use. What they did not do, because they could not do, is define it such that it meets everyone's expectations and hunches.
Why is LIKE different from =?
a like 'C%' is true if a starts with 'C'
a = 'C%' is true if a is exactly the 2 letter string 'C%'
But the real moral to this story IMO is that if you want to know how the language works, the best advice is RTFM. Especially when it doesn't work as expected.
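The difference is easy to see with a row whose value is literally 'C%'. This sketch uses SQLite through Python's sqlite3, assuming that = and LIKE treat % the same way there as in T-SQL; the pacs table and its three rows are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pacs (PACID TEXT)")
conn.executemany("INSERT INTO pacs VALUES (?)", [("C001",), ("C%",), ("D002",)])

def ids(where):
    """Run a fixed, trusted WHERE clause and return the matching PACIDs."""
    sql = f"SELECT PACID FROM pacs WHERE {where} ORDER BY PACID"
    return [row[0] for row in conn.execute(sql)]

like_c = ids("PACID LIKE 'C%'")          # % is a wildcard: anything starting with C
eq_c = ids("PACID = 'C%'")               # % is an ordinary character
ne_c = ids("PACID <> 'C%'")              # everything except the literal 'C%'
not_like_c = ids("PACID NOT LIKE 'C%'")  # anything not starting with C
```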
SQL provides pattern matching similar to that used in Unix tools such as grep and sed. These patterns can be used only with the operators LIKE and NOT LIKE.
LIKE and NOT LIKE are Boolean operators, i.e. they return TRUE or FALSE depending on whether the match_expression matches the specified pattern.
The following wildcards are used to match patterns:
% = Any number of characters
_ = Any Single character
[] = Any single character within the specified range
[^] = Any single character not within the specified range
Documentation on patterns and like operators:
SQL server LIKE operator

Need to check parameter against table rows in SQL Query

I would like to build an SQL query to check that one of my form fields does not contain any common names (a list of words maintained in a separate table). I am passing the field's value as a parameter and want to check that it doesn't contain any common name from that table.
How can I achieve that using an SQL query?
Note: if the common name is 'abc' and I pass the parameter '!abc123', the query should return false, since the parameter contains that word.
Thanks in advance.
Try something like (Untested Query):
SELECT CommonName
FROM CommonNamesTable
WHERE CommonName like '%NameToTest%'
OR CONTAINS(NameToTest, CommonName);
Basically you need the string match options:
Take a look at options of CONTAINS and read about Queries with full text search
Is this what you're looking for?
SELECT (COUNT(*) = 0) FROM tablewithcommonwords
WHERE wordfromform LIKE CONCAT('%', wordcolumnnfromcommonwordstable, '%');
Try this:
IF NOT EXISTS(SELECT word FROM CommonWord WHERE @yourparam
LIKE '%' + word + '%')
BEGIN
RETURN 1
END
ELSE
BEGIN
RETURN 0
END
This checks whether @yourparam contains any word or name from the table that you do not want to allow. It returns 1 only if no row in the table is contained in the parameter.
I wrote the statement this way (you could use a simple EXISTS instead of NOT EXISTS) because you may want to extend the functionality in the true branch.
if exists (select * from reservedwords where @parameter like '%'+word + '%')
select 0
else
select 1
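The same check can be tried end to end with a bound parameter. This sketch assumes SQLite through Python's sqlite3, so string concatenation is written || rather than the + used in T-SQL; the is_allowed helper and the two sample words are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reservedwords (word TEXT)")
conn.executemany("INSERT INTO reservedwords VALUES (?)", [("abc",), ("admin",)])

def is_allowed(value):
    """Return True when value contains none of the reserved words."""
    row = conn.execute(
        # The parameter is on the left of LIKE; each stored word becomes
        # the pattern, wrapped in % so it may match anywhere in the value.
        "SELECT NOT EXISTS ("
        "  SELECT 1 FROM reservedwords WHERE ? LIKE '%' || word || '%'"
        ")",
        (value,),
    ).fetchone()
    return bool(row[0])
```

For example, is_allowed("!abc123") is False because the value contains 'abc'. Stored words that themselves contain % or _ would act as wildcards, so they would need escaping.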
I would suggest handling the KeyPress event of your TextBox and running your check after each character is entered.