best search in mdb for big data - sql

I have created an English-to-Kurdish dictionary and saved the data in an Access .mdb file. It contains more than 78,000 words.
Can anyone help me make the search fast?
I'm using this query for the search:
"SELECT english FROM table WHERE English LIKE '" +text Searchlight. Text+"%'";

If your query is:
SELECT english
FROM table
WHERE English LIKE '" +text Searchlight. Text+"%'"
Then I'm a little confused. Access generally uses * as the wildcard for searching rather than % (which is the SQL standard). Because the LIKE pattern does not start with a wildcard, many databases will use an index (if available) for this query. I don't know if MS Access has this optimization.
In any case, you seem to be going down a path where full text search is beneficial. If so, I think you have the wrong tool for the job. MS Access doesn't support full text search. I would suggest that you use a database that does (obvious choices are SQL Server Express, Postgres, and MySQL, all of which are free). By the way, all three of these do use an index for LIKE, when the pattern does not start with a wildcard character.
If you decide to use SQL Server Express, this answer should be helpful for the installation.
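To illustrate the prefix search discussed above, here is a minimal Python sketch. It uses the standard-library sqlite3 module purely as a stand-in database so it can run on its own; the table name words, the index name, and the sample rows are assumptions, not the asker's schema. The user's text is passed as a parameter instead of being concatenated into the SQL string, and any wildcard characters it contains are escaped. Whether the index is actually used for the LIKE depends on the engine and its collation settings.

```python
import sqlite3


def prefix_search(conn, prefix, limit=20):
    """Return words starting with `prefix`, passing the text as a parameter."""
    # Escape LIKE wildcards so user input such as "100%" is matched literally.
    escaped = (
        prefix.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    rows = conn.execute(
        "SELECT english FROM words "
        "WHERE english LIKE ? ESCAPE '\\' "
        "ORDER BY english LIMIT ?",
        (escaped + "%", limit),
    )
    return [row[0] for row in rows]


# Hypothetical sample data standing in for the 78,000-word dictionary.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (english TEXT)")
conn.execute("CREATE INDEX idx_words_english ON words (english)")
conn.executemany(
    "INSERT INTO words (english) VALUES (?)",
    [("sea",), ("search",), ("season",), ("table",)],
)

print(prefix_search(conn, "sea"))
```

The same pattern (a parameter placeholder plus a trailing wildcard appended in code) carries over to SQL Server, Postgres, and MySQL, though the placeholder syntax differs by driver.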

Related

SQL Full-Text Search (CONTAINSTABLE) - Better String Parser?

I'm currently using a RadSearchBox (Telerik) for AutoComplete purposes to collect the search string from a user and pass it to a stored procedure that queries against a MS SQL full-text index (400k+ rows) fairly fast and returns the results in a RadGrid. The customer wants (as they always do) for the search string parser to be more advanced. I currently have it so if it detects any non-alphanumeric characters ("/","-",etc) it will search for the "exact/phrase", as full-text doesn't do well with " * exact/phrase * ".
But before I completely reinvent the wheel, I was wondering if anyone out there has something more intelligent that will insert 'AND', 'OR', 'NEAR', search for "exact/phrase" and " * exact * " OR " * phrase * ", and return the results racked and stacked based on the CONTAINSTABLE ranking. Having FORMSOF(INFLECTIONAL, exact) would be icing on the cake.
I know this is a lot to ask, just putting a "line out" to see if anyone has anything decent they would be willing to share. I would look into using Lucene, but the web app has a Fluent Data Model that isn't easily compatible (that I know of) with Lucene.
I've done my fair share of googling, but can't seem to find anything that looks clean (would prefer not to dive into and out of numerous functions/stored procedures to accomplish) and also something that was written within the last several years.
Thanks in advance...
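One possible shape for such a parser, sketched in Python under stated assumptions: quoted input and any term containing non-alphanumeric characters is treated as an exact phrase, every other word is wrapped in FORMSOF(INFLECTIONAL, ...), and the pieces are joined with AND. The function name build_search_condition is hypothetical, and the output is only the search-condition string that would be passed as a parameter to CONTAINSTABLE. It does not handle OR, NEAR, or stopwords.

```python
import re


def build_search_condition(user_input: str) -> str:
    """Turn raw user text into a full-text search condition string."""
    parts = []
    # Each match is either a "quoted phrase" or a run of non-space characters.
    for phrase, word in re.findall(r'"([^"]*)"|(\S+)', user_input):
        if phrase.strip():
            parts.append(f'"{phrase.strip()}"')
            continue
        # Drop stray double quotes so they cannot break out of the term.
        term = word.replace('"', "")
        if not term:
            continue
        if term.isalnum():
            # Plain words also match their inflected forms (run, running, ran).
            parts.append(f'FORMSOF(INFLECTIONAL, "{term}")')
        else:
            # Terms such as exact/phrase are searched as an exact phrase.
            parts.append(f'"{term}"')
    return " AND ".join(parts)


print(build_search_condition('exact/phrase running "ice cream"'))
```

Keeping the parsing in one small function like this avoids chaining several stored procedures; the resulting string would still be bound as a single parameter rather than concatenated into the query.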

SQL Server Efficient Search for LIKE '%str%'

In SQL Server, I have a table containing 46 million rows.
I want to search the "Title" column of the table. The search word may appear at any position in the field value.
For example:
Value in table: BROTHERS COMPANY
Search string: ROTHER
I want this search to match the given record. This is exactly what LIKE '%ROTHER%' does. However, LIKE '%...%' should not be used on large tables because of performance issues. How can I achieve this?
Though I don't know your requirements, your best approach may be to challenge them. Middle-of-the-string searches are usually not very practical. If you can get your users to perform prefix searches (broth%) then you can easily use Full Text's wildcard search (CONTAINS(*, '"broth*"')). Full Text can also handle suffix searches (%rothers) with a little extra work.
But when it comes to middle-of-the-string searches with SQL Server, you're stuck using LIKE. However you may be able to improve performance of LIKE by using a binary collation as explained in this article. (I hate to post a link without including its content but it is way too long of an article to post here and I don't understand the approach enough to sum it up.)
If that doesn't help and if middle-of-the-string searches are that important of a requirement then you should consider using a different search solution like Lucene.
Add Full-Text index if you want.
You can search the table using CONTAINS:
SELECT *
FROM YourTable
WHERE CONTAINS(TableColumnName, 'SearchItem')

SQL - searching database with the LIKE operator

Given your data stored somewhere in a database:
Hello my name is Tom I like dinosaurs to talk about SQL.
SQL is amazing. I really like SQL.
We want to implement a site search, allowing visitors to enter terms and return relating records. A user might search for:
Dinosaurs
And the SQL:
WHERE articleBody LIKE '%Dinosaurs%'
Copes fine with returning the correct set of records.
How would we cope, however, if a user misspells dinosaurs? I.e.:
Dinosores
(Poor sore dino). How can we search allowing for error in spelling? We can associate common misspellings we see in search with the correct spelling, and then search on the original terms + corrected term, but this is time consuming to maintain.
Is there any way to do this programmatically?
Edit
Appears SOUNDEX could help, but can anyone give me an example using soundex where entering the search term:
Dinosores wrocks
returns records instead of doing:
WHERE articleBody LIKE '%Dinosaurs%' OR articleBody LIKE '%Wrocks%'
which would return squadoosh?
If you're using SQL Server, have a look at SOUNDEX.
For your example:
select SOUNDEX('Dinosaurs'), SOUNDEX('Dinosores')
Returns identical values (D526).
You can also use DIFFERENCE function (on same link as soundex) that will compare levels of similarity (4 being the most similar, 0 being the least).
SELECT DIFFERENCE('Dinosaurs', 'Dinosores'); --returns 4
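To show why the two spellings collapse to the same code, here is a minimal Python sketch of the classic American Soundex rules: keep the first letter, ignore vowels, map the remaining consonants to digits, and pad or cut to four characters. SQL Server's SOUNDEX may differ in edge cases depending on version and compatibility level, so treat this as an illustration rather than a reimplementation.

```python
def soundex(word: str) -> str:
    """Classic American Soundex: first letter plus three consonant digits."""
    codes = {}
    for letters, digit in (
        ("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
        ("L", "4"), ("MN", "5"), ("R", "6"),
    ):
        for ch in letters:
            codes[ch] = digit

    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""

    result = letters[0]
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        code = codes.get(ch, "")  # vowels, H, W, and Y have no code
        if code and code != prev:
            result += code
        # H and W do not separate equal codes; vowels do.
        if ch not in "HW":
            prev = code
    return (result + "000")[:4]


print(soundex("Dinosaurs"), soundex("Dinosores"))
```

Because the vowels are dropped, both words reduce to D, N, S, R, which is why the codes agree.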
Edit:
After hunting around a bit for a multi-text option, it seems that this isn't all that easy. I would refer you to the link on the Fuzzy Logic answer provided by @Neil Knight (+1 to that, for me!).
This stackoverflow article also details possible sources for implementations of Fuzzy Logic in TSQL. One respondent also outlined Full Text Indexing as a potential option that you might want to investigate.
Perhaps your RDBMS has a SOUNDEX function? You didn't mention which one was involved here.
SQL Server's SOUNDEX
Just to throw an alternative out there. If SSIS is an option, then you can use Fuzzy Lookup.
SSIS Fuzzy Lookup
I'm not sure if introducing a separate "search engine" is possible, but if you look at products like the Google search appliance or Autonomy, these products can index a SQL database and provide more searching options - for example, handling misspellings as well as synonyms, search results weighting, alternative search recommendations, etc.
Also, SQL Server's full-text search feature can be configured to use a thesaurus, which might help:
http://msdn.microsoft.com/en-us/library/ms142491.aspx
Here is another SO question from someone setting up a thesaurus to handle common misspellings:
FORMSOF Thesaurus in SQL Server
Short answer: there is nothing built into most SQL engines that can do dictionary-based correction of "fat fingers". SoundEx does work as a tool to find words that sound alike and thus correct for phonetic misspellings, but it only encodes the first letter plus the next few consonant sounds, so if the user truly "fat-fingered" it and entered "Dinodaurs" or "Finosaurs", SoundEx would not return a match.
Sounds like you want something on the level of Google Search's "Did you mean __?" feature. I can tell you that is not as simple as it looks. At a 10,000-foot level, the search engine would look at each of those keywords and see if it's in a "dictionary" of known "good" search terms. If it isn't, it uses an algorithm much like a spell-checker suggestion to find the dictionary word that is the closest match (requires the fewest letter substitutions, additions, deletions and transpositions to turn the given word into the dictionary word). This will require some heavy procedural code, either in a stored proc or CLR Db function in your database, or in your business logic layer.
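The "closest dictionary word" step described above can be sketched in Python as follows. This is a minimal illustration, not how any particular search engine does it: edit_distance computes the optimal string alignment distance (substitutions, additions, deletions, and adjacent transpositions), and suggest is a hypothetical helper that scans a small in-memory word list. The word list and the max_distance threshold are assumptions.

```python
def edit_distance(a: str, b: str) -> int:
    """Fewest substitutions, insertions, deletions, or adjacent swaps."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete every character of a
    for j in range(cols):
        d[0][j] = j  # insert every character of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


def suggest(word, dictionary, max_distance=2):
    """Return the closest known word, or None if nothing is close enough."""
    target = word.lower()
    best = min(
        dictionary,
        key=lambda known: edit_distance(target, known.lower()),
        default=None,
    )
    if best is None or edit_distance(target, best.lower()) > max_distance:
        return None
    return best


known_terms = ["dinosaurs", "sql", "amazing"]  # hypothetical dictionary
print(suggest("Dinosayrs", known_terms))
```

A linear scan like this is fine for a few thousand terms; a larger dictionary would need an index structure on top, which is where the "heavy procedural code" comes in.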
You can also try SubString() to compare only the first 3 or so characters. Below is an example of how that can be achieved:
SELECT Table1.Fname, Table1.Lname
FROM Table1, Table2
WHERE substr(Table1.Fname, 1, 3) || substr(Table1.Lname, 1, 3) = substr(Table2.Fname, 1, 3) || substr(Table2.Lname, 1, 3)
ORDER BY Table1.Fname;

Good SQL search tool?

FreeTextTable is really great for searching, as it actually returns a relevancy score for each item it finds.
The problem is, it doesn't support the logical operator AND, so if I have 10 items with the word 'ice' in it, but not 'cream', and vice versa, then 20 results will be returned, when in this scenario 0 should've been returned.
Are there any alternative tools to search a SQL Server database? Or should I just write my own code to provide 'AND' functionality (i.e. doing two separate searches in the scenario 'Ice Cream', splitting each search by spaces)?
You can try SQL Search from RedGate.
It is a free tool (though not open source) - I have used it before and it is very powerful.
There is also a free SQL Search tool from ApexSQL you can try. It integrates into SSMS and can also show relationship diagrams and help with safely removing/renaming objects in your database. They do require you to leave an email address, but the product itself is completely free. ApexSQL Search
Since you have full text search enabled to use FREETEXTTABLE perhaps you could make use of CONTAINS instead? (I have to be honest, I've not used full text search myself).
It would appear you can query like this:
SELECT Name, Price FROM Product
WHERE CONTAINS(Name, 'ice')
AND CONTAINS(Name, 'cream')

Best way to implement a stored procedure with full text search

I would like to run a search with MSSQL Full text engine where given the following user input:
"Hollywood square"
I want the results to have both Hollywood and square[s] in them.
I can create a method on the web server (C#, ASP.NET) to dynamically produce a sql statement like this:
SELECT TITLE
FROM MOVIES
WHERE CONTAINS(TITLE,'"hollywood*"')
AND CONTAINS(TITLE, '"square*"')
Easy enough. HOWEVER, I would like this in a stored procedure for added speed benefit and security for adding parameters.
Can I have my cake and eat it too?
I agree with the above; look into AND clauses:
SELECT TITLE
FROM MOVIES
WHERE CONTAINS(TITLE,'"hollywood*" AND "square*"')
However, you shouldn't have to split the input sentence in SQL; you can pass the whole search condition in a variable:
SELECT TITLE
FROM MOVIES
WHERE CONTAINS(TITLE, @parameter)
By the way:
CONTAINS searches for the exact term.
FREETEXT searches for any term in the phrase.
The last time I had to do this (with MSSQL Server 2005) I ended up moving the whole search functionality over to Lucene (the Java version, though Lucene.Net now exists I believe). I had high hopes of the full text search but this specific problem annoyed me so much I gave up.
Have you tried using the AND logical operator in your string? I pass in a raw string to my sproc and stuff 'AND' between the words.
http://msdn.microsoft.com/en-us/library/ms187787.aspx
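The "stuff AND between the words" approach can be sketched on the web-server side like this. It is a minimal Python illustration (the question uses C#, where the same logic applies); the function name and_prefix_terms is hypothetical. Each word becomes a quoted prefix term, double quotes in the input are removed so they cannot break the condition, and the resulting string is what would be bound to the stored procedure's parameter.

```python
def and_prefix_terms(user_input: str) -> str:
    """Build a CONTAINS condition such as '"hollywood*" AND "square*"'."""
    # Strip double quotes from each word, then drop words that end up empty.
    words = [w.replace('"', "") for w in user_input.split()]
    terms = [f'"{w.lower()}*"' for w in words if w]
    return " AND ".join(terms)


print(and_prefix_terms("Hollywood square"))
```

The stored procedure then only needs a single string parameter in its CONTAINS call, so the query text itself never changes.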