SQL Server - Need to check spelling errors in a column - sql

Looking for some quick help.
In SQL Server, one of my tables has an ntext column. I need to count the spelling mistakes in that field. Using the MS Word dictionary would be great, but any other dictionary would work as well.
Any help on this would be great.

A possible way would be to read the text fields and pass them through a spellcheck library such as PyEnchant to count how many misspelt words there are (see Python: check whether a word is spelled correctly, https://pythonhosted.org/pyenchant/tutorial.html)
Otherwise, if you want to use MS Word spellcheck specifically, you can find some ideas here: Using MS Office's Spellchecking feature with C# and https://msdn.microsoft.com/en-us/library/windows/desktop/hh869748(v=vs.85).aspx.
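As a rough illustration of the first suggestion, here is a minimal Python sketch that counts words not found in a dictionary. It assumes the column values have already been read out of the database as strings. The names count_misspelled and KNOWN_WORDS are hypothetical, and the tiny word set only stands in for a real dictionary or a spellcheck library's lookup.

```python
import re

def count_misspelled(text, known_words):
    """Count the tokens in `text` that are not in `known_words` (case-insensitive)."""
    words = re.findall(r"[A-Za-z']+", text)
    return sum(1 for word in words if word.lower() not in known_words)

# Hypothetical miniature dictionary; a real one would be loaded from a word
# list or replaced by a spellcheck library's "is this word known?" check.
KNOWN_WORDS = {"he", "lives", "in", "london"}

rows = ["He lives in London", "He liives in Londn"]
error_counts = [count_misspelled(row, KNOWN_WORDS) for row in rows]
print(error_counts)
```

Each count could then be written back to a column such as Error_Count.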

Thanks for the reply, Micah; however, I am looking for a solution written in SQL.
For example, if one of my column values is 'Liives' instead of 'Lives', then I should get 1 in a column named 'Error_Count'.
If anyone can share the code for this, it would be a great help.
Thanks,
Victor

Related

matching two columns in excel with slight difference in the spelling

I am working on huge Excel sheets from different sources about the same thing. The sources report and write down information differently. So, for example, one would write the location as "Khurais" whereas the other would write it as "Khorais".
Since both of these files contain important information, I would like to combine them in one Excel sheet so that I can deal with them more easily. So if you have any suggestion or tool that you think would be beneficial, please share it here.
P.S. The words in the Excel sheet are translations of Arabic words.
You could use Levenshtein distance to determine if two words are "close" to each other. Based on that you could match.
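To make the idea concrete, here is a small Python sketch of Levenshtein distance using the standard dynamic-programming approach. The is_close helper and its max_distance threshold of 1 are assumptions for illustration; a suitable threshold depends on the data.

```python
def levenshtein(a, b):
    """Return the number of single-character edits needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]

def is_close(a, b, max_distance=1):
    """Hypothetical matching rule: treat names within max_distance edits as equal."""
    return levenshtein(a.lower(), b.lower()) <= max_distance

print(levenshtein("Khurais", "Khorais"))
print(is_close("Khurais", "Khorais"))
```

With the example from the question, "Khurais" and "Khorais" differ by one substitution, so they would be matched under this threshold.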
You could use FuzzyLookup, a macro that allows you to do appropriate matching. It worked really well for me in the past and is actually really well documented.
You can find it here: https://www.mrexcel.com/forum/excel-questions/195635-fuzzy-matching-new-version-plus-explanation.html including examples on how to use it.
Hope that helps!
PS obviously you can also use it strictly within VBA (not using worksheet functions).
The Double Metaphone algorithm springs to mind. It attempts to convert strings into phonetic representations. For example, "Folly" and "Pholee" should have the same phonetic code.
If you could generate these codes, you could then match your records based on them, instead of the strings.
Here's an article that explains, along with sample VBA code:
https://bytes.com/topic/access/insights/965241-fuzzy-string-matching-double-metaphone-algorithm
Hope that inspires you :)

SELECT query using LIKE property in Microsoft Access returns no results when it should

I'm sure I'm making some kind of rookie error here, but I have no idea what the problem is. I am trying to run a simple query on one table in a Microsoft Access database using the LIKE operator to find records that have a certain text string in a particular field. More specifically, the table, called Catreqs, has a few fields, bib_num, MARC_336, MARC_337, and MARC_338. The MARC_336 field has a text string in it and I want a query that selects all the records for which that text string includes the characters "txt".
Here's my query:
SELECT [Catreqs].record_num, [Catreqs].MARC_336
FROM [Catreqs]
WHERE [Catreqs].MARC_336 Like '%txt%';
I should note that I created this query in MS Access design view and this is the query that was generated when I switched to SQL view. I am a little familiar with SQL and even less familiar with Access so this is actually my preferred way of dealing with it.
I've also tried using Like '*txt*' but that didn't return any results either. For reference, here is the entire text string these characters are in:
text txt rdacontent
Any suggestions or thoughts on why this fails and how I can fix it?
Thanks!
In Access, the wildcard that matches any string of characters in a Like pattern is *, not % (unless the database is set to ANSI-92 query mode).
Check if [Catreqs] has rows where MARC_336 contains "txt".
This is the official documentation of Access:
https://support.office.com/en-us/article/Like-Operator-b2f7ef03-9085-4ffb-9829-eef18358e931?ui=it-IT&rs=en-001&ad=IT&omkt=en-001

Access: Comparing Memo fields - Not In

Good morning! I am seeking guidance on an issue I have been stuck on since last week, but hopefully there is an easy solution.
As you know, you cannot directly link/join memo fields in MS Access. I created a query last week to return rows where a memo field in one table contained the text field from another table via the Where clause "[memo] LIKE '*[text]*'" and this worked out perfectly.
However, now I would like to find out the memo values from the table NOT present in the query. I was hoping it would be simple to do with a "Not in" clause, but this does not seem to be the case.
Is there another method to do this? Is there a way to perhaps convert the data type in a SQL query? Or is the only way to do this type of query in VBA?
Thank you in advance! I can provide more info upon request, but I did not feel the field/table names would be of any use.
Cheers to @HansUp! I added the original primary key to the initial query and just compared those as opposed to trying to compare the memo fields; a much simpler solution! I might add the primary key in a subquery so that the original query contains only the fields of interest, but at least it works accurately! Cheers all! I love this community.
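The primary-key approach can be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 module rather than Access, so the wildcard is % and concatenation is || (Access would use * and &). The notes and terms tables and their columns are hypothetical stand-ins for the real ones.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE notes (id INTEGER PRIMARY KEY, memo TEXT);
    CREATE TABLE terms (term TEXT);
    INSERT INTO notes VALUES
        (1, 'order shipped late'), (2, 'no issues'), (3, 'late payment');
    INSERT INTO terms VALUES ('late');
""")

# The "initial query": primary keys of notes whose memo contains any term.
matched_ids = """
    SELECT DISTINCT n.id
    FROM notes AS n
    JOIN terms AS t ON n.memo LIKE '%' || t.term || '%'
"""

# Compare primary keys instead of memo text to find the rows that did not match.
unmatched = conn.execute(
    f"SELECT id, memo FROM notes WHERE id NOT IN ({matched_ids}) ORDER BY id"
).fetchall()
print(unmatched)
```

Comparing keys avoids comparing the long text fields themselves, which is what made the original "Not in" attempt awkward.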

Contains() function falters with strings of numbers?

For some background information, I'm creating an application that searches against a couple of indexed tables to retrieve some records. It isn't overtly complex to the point of say Google, but it's good enough for the purpose it serves, barring this strange issue.
I'm using the Contains() function, and it's going very well, except when the search contains strings of numbers. Now, I'm only passing in a string: no numerical datatypes are being passed in anywhere, only characters. We're searching against a collection of emails, each appended with a custom ID when shot off from a workflow. So while testing, we decided to search via number strings.
In our test, we isolated a number 0042600006, which belongs to one and only one email subject. However, when using our query we are getting results for 0042600001, 0042600002, etc. The query is as follows (with some generic columns standing in):
SELECT description, subject FROM tableA WHERE CONTAINS((subject), '0042600006')
We've tried every possible combination: '0042600006*', '"0042600006"' and '"0042600006*"'.
I think it's just a limitation of the function, but I thought this would probably be the best place for answers. Thanks in advance.
Asked this same question recently. Please see the insightful answer someone left me here
Essentially what this user says to do is to turn off the noise words (Microsoft has included integers 0-9 as noise in the Full Text Search). Hope you can use this awesome tool with integers as I now am!
Try adding language 1033 as an additional parameter. That worked for my solution.
SELECT description, subject FROM tableA WHERE CONTAINS((subject), '0042600006', language 1033)
Try using:
SELECT description, subject FROM tableA WHERE CONTAINS((subject), '%0042600006%')

How can I convert a user's search query into a MS SQL Full-Text Query Statement

I've searched for answers to this and I can't seem to find an answer to what should be somewhat simple.
This is related to another question I asked, but it's different. What's the best way to take a user's search phrase and throw it into a CONTAINSTABLE(table, column, @phrase, topN) call?
Say, for example the user inputs: Books by "Dr. Seuss"
What's the best way to turn that into something that will return results in my CONTAINSTABLE() call?
I was previously parsing the search phrase and writing something like ISABOUT("Books" WEIGHT(1.0), "by" WEIGHT(0.9), "Dr. Seuss" WEIGHT(0.8)) as my @phrase but ISABOUT seems to be returning odd results... especially when one-word searches are entered.
Any ideas?
We've implemented a slightly modified version of the code found in this article on SQL Server Central. It uses the Irony Compiler Construction Kit from Codeplex.
There was a bug in the original version when starting any search query with a reserved word. For example, by searching for 'Orange', it would recognise the OR term and expect binary operands which weren't supplied. This was fixed in some code provided in the discussion forum on the article which is now up to 13 pages!
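For comparison, here is a much simpler Python sketch than the grammar-based approach in the article, and it is not the article's code. It splits the input into words and double-quoted phrases, wraps every term in double quotes, and joins them with AND. Because every term is quoted, an input such as 'Orange' or even 'or' is treated as a literal term rather than an operator. The function name to_fulltext_condition and the choice of AND are assumptions for illustration.

```python
import re

def to_fulltext_condition(user_input):
    """Build a full-text search condition string from free-form user input."""
    # Each match is either a double-quoted phrase or a run of non-space characters.
    tokens = re.findall(r'"([^"]+)"|(\S+)', user_input)
    terms = []
    for phrase, word in tokens:
        # Drop stray double quotes so the generated condition stays well-formed.
        term = (phrase or word).replace('"', "").strip()
        if term:
            terms.append(f'"{term}"')
    return " AND ".join(terms)

print(to_fulltext_condition('Books by "Dr. Seuss"'))
```

The resulting string would be passed to the query as a parameter rather than concatenated into the SQL text. How common words such as "by" are treated depends on the full-text configuration of the server.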