SQL query to bring all results regardless of punctuation with JSF - sql

I have a database of articles, and the user should be able to type in a keyword and get back every article that contains that word.
For example, if someone searches for the word Alzheimer's, I want it to return articles with the word spelled either way, regardless of the apostrophe, so:
Alzheimer's
Alzheimers
should both be returned. At the moment the search matches only the exact spelling, so results that differ in punctuation are not returned.
So what I have at the minute for the query is:
private static final String QUERY_FIND_BY_SEARCH_TEXT = "SELECT o FROM EmailArticle o where UPPER(o.headline) LIKE :headline OR UPPER(o.implication) LIKE :implication OR UPPER(o.summary) LIKE :summary";
And the user's input is called 'searchText' which comes from the input box.
public static List<EmailArticle> findAllEmailArticlesByHeadlineOrSummaryOrImplication(String searchText) {
Query query = entityManager().createQuery(QUERY_FIND_BY_SEARCH_TEXT, EmailArticle.class);
String searchTextUpperCase = "%" + searchText.toUpperCase() + "%";
query.setParameter("headline", searchTextUpperCase);
query.setParameter("implication", searchTextUpperCase);
query.setParameter("summary", searchTextUpperCase);
List<EmailArticle> emailArticles = query.getResultList();
return emailArticles;
}
So I would like to bring back all results for Alzheimer's regardless of whether there is an apostrophe or not. I think I have given enough information, but if you need more just say. I'm not really sure where to go with it or how to do it. Is it possible to just replace or remove all punctuation, or just apostrophes, from a user's search?

In my view, you should change your query.
You should alter your table and add a FULLTEXT index to your columns (headline, implication, summary).
You should also use MATCH ... AGAINST rather than LIKE and, most importantly, read about SOUNDEX(), a very nice function.
All I can give you is a native query example:
SELECT o.* FROM email_article o WHERE MATCH(o.headline, o.implication, o.summary) AGAINST('your-text') OR SOUNDEX(o.headline) LIKE SOUNDEX('your-text') OR SOUNDEX(o.implication) LIKE SOUNDEX('your-text') OR SOUNDEX(o.summary) LIKE SOUNDEX('your-text') ;
It won't give you results like a Google search, but it works to some extent. Let me know what you think.
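On the narrower question of stripping apostrophes from the user's search: that is possible, but the stored text has to be stripped the same way, or "Alzheimers" will still not match "Alzheimer's". Below is a minimal sketch, assuming JPA 2.1 or later (for FUNCTION) and a database that provides a REPLACE string function. The class name SearchTextNormalizer, the constant name and the single :text parameter are made up for illustration.

```java
import java.util.Locale;

final class SearchTextNormalizer {

    // Hypothetical variant of QUERY_FIND_BY_SEARCH_TEXT. Assumes JPA 2.1+ (FUNCTION)
    // and a database REPLACE function; '''' is a JPQL literal holding one apostrophe.
    static final String QUERY_IGNORING_APOSTROPHES =
            "SELECT o FROM EmailArticle o WHERE "
            + "FUNCTION('REPLACE', UPPER(o.headline), '''', '') LIKE :text "
            + "OR FUNCTION('REPLACE', UPPER(o.implication), '''', '') LIKE :text "
            + "OR FUNCTION('REPLACE', UPPER(o.summary), '''', '') LIKE :text";

    // Removes straight and typographic apostrophes, upper-cases, and wraps in % wildcards.
    // Note: the query above only strips the straight apostrophe from the columns.
    static String toLikePattern(String searchText) {
        String stripped = searchText.replace("'", "").replace("\u2019", "");
        return "%" + stripped.toUpperCase(Locale.ROOT) + "%";
    }
}
```

With this, query.setParameter("text", SearchTextNormalizer.toLikePattern(searchText)) would send the same pattern for both spellings. The leading % already keeps an ordinary index from helping, and wrapping the columns in REPLACE does not improve that, so this is likely only suitable for modest table sizes.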

Related

lucene wildcard query with space

I have a Lucene index which has city names.
Suppose I want to search for 'New Delhi'. I have the string 'New Del', which I want to pass to the Lucene searcher, and I expect 'New Delhi' as output.
If I generate a query like Name:New Del*, it gives me all cities with 'New' and 'Del' in them.
Is there any way to create a Lucene wildcard query with spaces in it?
I referred to and tried a few of the solutions given at http://www.gossamer-threads.com/lists/lucene/java-user/5487
It sounds like you have indexed your city names with analysis. That will tend to make this more difficult. With analysis, "new" and "delhi" are separate terms, and must be treated as such. Searching over multiple terms with wildcards like this tends to be a bit more difficult.
The easiest solution would be to index your city names without tokenization (lowercasing might not be a bad idea though). Then you would be able to search with the query parser simply by escaping the space:
QueryParser parser = new QueryParser("defaultField", analyzer);
Query query = parser.parse("cityname:new\\ del*");
Or you could use a simple WildcardQuery:
Query query = new WildcardQuery(new Term("cityname", "new del*"));
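For user-typed text, the escaping in the query-parser form can be done by a small helper. A rough sketch, assuming the field was indexed untokenized and lowercased as suggested above. PrefixQueryText and its method are hypothetical names, not part of Lucene, and only whitespace is escaped here (other query-syntax characters would need the same treatment).

```java
import java.util.Locale;

final class PrefixQueryText {

    // Builds query-parser text such as cityname:new\ del* from raw user input.
    // Assumption: the field is untokenized and lowercased at index time.
    static String build(String field, String userInput) {
        String normalized = userInput.trim().toLowerCase(Locale.ROOT);
        // Prefix each whitespace character with a backslash so the parser keeps one term.
        String escaped = normalized.replaceAll("(\\s)", "\\\\$1");
        return field + ":" + escaped + "*";
    }
}
```

The returned string is what would be handed to parser.parse(...); for the WildcardQuery form no escaping is needed, since the term text is not parsed.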
If the field is analyzed by the standard analyzer, you will need to rely on SpanQueries, something like this:
SpanQuery queryPart1 = new SpanTermQuery(new Term("cityname", "new"));
SpanQuery queryPart2 = new SpanMultiTermQueryWrapper<>(new WildcardQuery(new Term("cityname", "del*")));
// slop 0, in order: "new" must be immediately followed by a term matching del*
Query query = new SpanNearQuery(new SpanQuery[] {queryPart1, queryPart2}, 0, true);
Or, you can use the surround query parser (which provides query syntax intended to provide more robust support of span queries), using a query like W(new, del*):
org.apache.lucene.queryparser.surround.parser.QueryParser surroundparser = new org.apache.lucene.queryparser.surround.parser.QueryParser();
SrndQuery srndquery = surroundparser.parse("W(new, del*)");
query = srndquery.makeLuceneQueryField("cityname", new BasicQueryFactory());
As I learnt from the thread mentioned by you (http://www.gossamer-threads.com/lists/lucene/java-user/5487), you can either do an exact match with space or treat either parts w/ wild card.
So something like this should work - [New* Del*]

Using Wildcard Sql for searching a word in a TextField

To make it clearer, I have these fields:
Columntobesearch
aword1 bword1
aword2 bword2
aword3 bword4
Now what I want to do is search using the SQL wildcard, so what I did is this:
%searchbox%
I placed two wildcards, one on each end of my search term, but it only seems to search the first word in the field.
When I search 'aword' all of the rows show up, but when I search 'bword' nothing shows. Please help.
Here is my full code:
$Input=Input::all();
$makethis=Input::flash();
$soptions=Input::get('soptions');
$searchbox=Input::get('searchbox');
$items = Gamefarm::where('roost_hen', '=',Input::get('sex'))
->where($soptions, 'LIKE','%' . $searchbox . '%')
->paginate(12);
If you use mysql you can try this:
<?php
$q = Input::get('searchbox');
$results = DB::table('table')
->whereRaw("MATCH(columntobesearch) AGAINST(? IN BOOLEAN MODE)",
array($q)
)->get();
Of course, you need to prepare your table for full-text search in your migration file with
DB::statement('ALTER TABLE table ADD FULLTEXT search(columntobesearch)');
Anyway, this is not the most scalable or efficient way to do full-text search.
For scalable and reliable full-text search, I strongly recommend you look at Elasticsearch and use a Laravel package for it.

How to make LIKE in SQL look for specific string instead of just a wildcard

My SQL Query:
SELECT
[content_id] AS [LinkID]
, dbo.usp_ClearHTMLTags(CONVERT(nvarchar(600), CAST([content_html] AS XML).query('root/Physicians/name'))) AS [Physician Name]
FROM
[DB].[dbo].[table1]
WHERE
[id] = '188'
AND
(content LIKE '%Urology%')
AND
(contentS = 'A')
ORDER BY
--[content_title]
dbo.usp_ClearHTMLTags(CONVERT(nvarchar(600), CAST([content_html] AS XML).query('root/Physicians/name')))
The issue I am having is that content with either Neurology or Urology appears in the result.
Is there any way to make it so that if it's Urology, it will only give Urology results, and if it's Neurology, it will only give Neurology results?
It can be Urology, Neurology, Internal Medicine, etc. The two used above are the ones causing the issue.
The content is an ntext column with XML tags inside, for example:
<root><Location><location>Office</location>
<office>Office</office>
<Address><image><img src="Rd.jpg?n=7513" /></image>
<Address1>1 Road</Address1>
<Address2></Address2>
<City>Qns</City>
<State>NY</State>
<zip>14404</zip>
<phone>324-324-2342</phone>
<fax></fax>
<general></general>
<from_north></from_north>
<from_south></from_south>
<from_west></from_west>
<from_east></from_east>
<from_connecticut></from_connecticut>
<public_trans></public_trans>
</Address>
</Location>
</root>
With the update this content column has the following XML:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<Physicians>
<name>Doctor #1</name>
<picture>
<img src="phys_lab coat_gradation2.jpg?n=7529" />
</picture>
<gender>M</gender>
<langF1>
English
</langF1>
<specialty>
<a title="Neurology" href="neu.aspx">Neurology</a>
</specialty>
</Physicians>
</root>
If I search for Lab the result appears because there is the text lab in the column.
This is what I would do if you're not into making a CLR proc to use Regexes (SQL Server doesn't have regex capabilities natively)
SELECT
[...]
WHERE
(content LIKE @strService OR
content LIKE '%[^a-z]' + @strService + '[^a-z]%' OR
content LIKE @strService + '[^a-z]%' OR
content LIKE '%[^a-z]' + @strService)
This way you check whether content is equal to @strService, OR the word exists somewhere within content with non-letters around it, OR it sits at the very beginning or very end of content with a non-letter following or preceding it respectively.
[^...] means "a character that is none of these". If there are other characters you don't want to accept before or after the search term, put them in each of the four square brackets (after the ^!). For instance [^a-zA-Z_].
As I see it, your options are to either:
Create a function that processes a string and finds a whole match inside it
Create a CLR extension that allows you to call .NET code and leverage the REGEX capabilities of .NET
Aaron's suggestion is a good one IF you can know up front all the terms that could be used for searching. The problem I could see is if someone searches for a specific word combination.
Databases are notoriously bad at semantics (i.e. they don't understand the concept of neurology or urology - everything is just a string of characters).
The best solution would be to create a table which defines the terms (two columns, PK and the name of the term).
The query is then a join:
JOIN terms ON table1.term_id = terms.term_id AND terms.term = 'Urology'
That way, you can avoid the LIKE and search for specific results.
If you can't do this, then SQL is probably the wrong tool. Use LIKE to get a set of results which match and then, in an imperative programming language, clean those results from unwanted ones.
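The clean-up step described above can be sketched in Java. This assumes the rows were already fetched with a broad LIKE '%Urology%' and that the content values are available as strings; WholeWordFilter and its method name are hypothetical.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class WholeWordFilter {

    // Keeps only the values that contain the term as a whole word, ignoring case.
    // \b is a word boundary, so "Neurology" does not count as a match for "Urology".
    static List<String> keepWholeWordMatches(List<String> contents, String term) {
        Pattern wholeWord = Pattern.compile(
                "\\b" + Pattern.quote(term) + "\\b", Pattern.CASE_INSENSITIVE);
        return contents.stream()
                .filter(content -> wholeWord.matcher(content).find())
                .collect(Collectors.toList());
    }
}
```

Pattern.quote keeps characters in the search term from being read as regex syntax. Note that this only narrows matches to whole words; a term such as Lab would still match the image file name in the sample XML.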
Judging from your content, can you not leverage the fact that there are quotes in the string you're searching for?
SELECT
[...]
WHERE
(content LIKE '%"Urology"%')

The right way to prepare SQL statements with parametrized text search

Suddenly I've realized that while this works in Groovy just as expected:
Sql.newInstance(connectionParams).rows("SELECT FROM ITEMS WHERE id = ?", [200])
this won't work
Sql.newInstance(connectionParams).rows("SELECT FROM ITEMS WHERE name LIKE '%?%'", ["some"])
All you can get is
Failed to execute: SELECT FROM ITEMS WHERE name LIKE '%?%' because:
The column index is out of range: 1, number of columns: 0.
My questions are:
Is it intentionally implemented this way? I've never needed a parametrized text search before, so I'm not sure whether this behaviour is typical or not.
How can I nevertheless safely parametrize a statement with a text search in it?
I believe you want to include the %'s in the parameter, like:
Sql.newInstance(connectionParams).rows("SELECT FROM ITEMS WHERE name LIKE ?", ["%some%"])
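If the search text comes from a user, any % or _ it contains would also act as wildcards. A small sketch of building the parameter in Java, assuming the statement is written as name LIKE ? ESCAPE '\'. The ESCAPE clause and the LikeParam helper are additions for illustration, not something the question uses, and backslash handling in string literals varies between databases.

```java
final class LikeParam {

    // Escapes LIKE metacharacters with a backslash, then wraps the text in % wildcards.
    // Assumption: the SQL declares the escape character, e.g. name LIKE ? ESCAPE '\'.
    static String contains(String userText) {
        String escaped = userText
                .replace("\\", "\\\\")   // escape the escape character first
                .replace("%", "\\%")
                .replace("_", "\\_");
        return "%" + escaped + "%";
    }
}
```

The value is still passed as a bound parameter, so the statement stays safe from injection; the escaping only controls which characters are treated as wildcards.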

SQL Select Like Keywords in Any Order

I am building a search function for a shopping cart site, which queries a SQL Server database. When the user enters "Hula Hoops" in the search box, I want results for all records containing both "Hula" and "Hoop", in any order. Furthermore, I need to search multiple columns (e.g. ProductName, Description, ShortName, ManufacturerName, etc.)
All of these product names should be returned, when searching for "Hula hoop":
Hula hoop
Hoop Hula
The Hoopity of xxhula sticks
(Bonus points if these can be ordered by relevance!)
It sounds like you're really looking for full-text search, especially since you want to weight the words.
In order to use LIKE, you'll have to use multiple expressions (one per word, per column), which means dynamic SQL. I don't know which language you're using, so I can't provide an example, but you'll have to produce a statement that's like this:
For "Hula Hoops":
where (ProductName like '%hula%' or ProductName like '%hoops%')
and (Description like '%hula%' or Description like '%hoops%')
and (ShortName like '%hula%' or ShortName like '%hoops%')
etc.
Unfortunately, that's really the only way to do it. Using Full Text Search would allow you to reduce your criteria to one per column, but you'll still have to specify the columns explicitly.
Since you're using SQL Server, I'm going to hazard a guess that this is a C# question. You'd have to do something like this (assuming you're constructing the SqlCommand or DbCommand object yourself; if you're using an ORM, all bets are off and you probably wouldn't be asking this anyway):
SqlCommand command = new SqlCommand();
int paramCount = 0;
string searchTerms = "Hula Hoops";
string commandPrefix = @"select *
from Products";
StringBuilder whereBuilder = new StringBuilder();
foreach(string term in searchTerms.Split(' '))
{
if(whereBuilder.Length == 0)
{
whereBuilder.Append(" where ");
}
else
{
whereBuilder.Append(" and ");
}
paramCount++;
SqlParameter param = new SqlParameter(string.Format("param{0}",paramCount), "%" + term + "%");
command.Parameters.Add(param);
whereBuilder.AppendFormat("(ProductName like @param{0} or Description like @param{0} or ShortName like @param{0})",paramCount);
}
command.CommandText = commandPrefix + whereBuilder.ToString();
SQL Server Full Text Search should help you out. You basically create full-text indexes on the columns you want to search. In the WHERE clause of your query you use the CONTAINS predicate and pass it your search input.
You can start HERE or HERE to learn more.
You might want to check out SOLR too - if you're going to be doing this type of searching. Super cool.
http://lucene.apache.org/solr/