SQL Server 2005 FTS unexpected results

SQL Server 2005 FTS unexpected results - sql

I have an Indexed View with two columns, a primary key char and a field for full-text indexing varchar(300). I need to search from a MS Great Plains database, so I created the view to populate a field with concatenated values from my primary table IV00101.
CREATE VIEW SearchablePartContent WITH SCHEMABINDING AS
SELECT ITEMNMBR, rtrim(ITEMNMBR)+' '+rtrim(ITMSHNAM)+' '+rtrim(ITMGEDSC)
as SearchableContent
FROM dbo.IV00101
GO
-- create the index on the view to be used as full text key index
CREATE UNIQUE CLUSTERED INDEX IDX_ITEMNMBR ON SearchablePartContent(ITEMNMBR)
CREATE FULLTEXT INDEX ON SearchablePartContent(SearchableContent) KEY INDEX IDX_ITEMNMBR ON Cat1_PartContent
WHILE fulltextcatalogproperty('Cat1_PartContent','populatestatus') <> 0
BEGIN
WAITFOR DELAY '00:00:01'
END
The problem is that when I do a search with particular keyword(s) it will yield unexpected results. For instance a simple query such as:
SELECT * FROM SearchablePartContent WHERE CONTAINS(SearchableContent, 'rotor')
should yield 5 results, instead I get 1. There's about 72,000 records indexed. However, if I do a LIKE comparison, I will get the expected rows. My data is not complex, here are a couple results that should be returned from my query, but are not:
MN-H151536 John Chopper, Rotor Assembly Monkey 8820,9600,8820FRT
MN-H152756 John Rotor, Bearing 9650STS,9750STS1
MN-H160613 John Rotor, Bearing 9650STS,9750STS2
Any help would be greatly appreciated. Thanks

Just a thought: Try enclosing your search term with double quotes to see if it makes a difference.
SELECT * FROM SearchablePartContent WHERE CONTAINS(SearchableContent, ' "rotor" ')

Related

Indexing a varchar2 column in oracle with a pattern?

I have a problem on indexing oracle
Following SQL query is given (means, it is automatically generated, I can't adapt it):
select * from t where f like 'AA.%.123456.%'
On column f, a unique index exists, but it is not beeing used by the oracle optimizer (it's doing a full table scan)
I think this because of the data
Data in column f allways begin with AA
Please note following data example:
AA.248117.123456.UAAA
AA.741447.123456.UAAA
AA.741449.123456.UAAA
AA.741450.123456.UAAA
AA.322812.123456.UAAA
AA.741363.123456.UZZZ
AA.741365.123456.UZZZ
AA.356110.123456.UZZZ
AA.356333.456789.UBBB
AA.256321.456789.UBBB
AA.333219.456789.UBBX
My question:
Is it possible to create an oracle index with a string-pattern?
Meaning a index as some kind of substring (but not using a function based index Substr(), as mentioned above, I can't change the query)
Index should look like this
'AA.<someNumberNotRelevant>.<indexedValue>.<someStringNotRelevant>'
Thanks

SQL Server stored procedure to search list of values without special characters

What is the most efficient way to search a column and return all matching values while ignoring special characters?
For example if a table has a part_number column with the following values '10-01' '14-02-65' '345-23423' and the user searches for '10_01' and 140265 it should return '10-01' and '14-02-65'.
Processing the input to with a regex to remove those characters is possible, so the stored procedure could could be passed a parameter '1001 140265' then it could split that input to form a SQL statement like
SELECT *
FROM MyTable
WHERE part_number IN ('1001', '140265')
The problem here is that this will not match anything. In this case the following would work
SELECT *
FROM MyTable
WHERE REPLACE(part_number,'-','') IN ('1001', '140265')
But I need to remove all special characters. Or at the very least all of these characters ~!##$%^&*()_+?/\{}[]; with a replace for each of those characters the query takes several minutes when the number of parts in the IN clause is less than 200.
Performance is improved by creating a function that does the replaces, so the query takes less than a minute. But without removals the query takes around 1 second, is there any way to create some kind of functional index that will work on multiple SQL Server engines?

You could use a computed column and index it:
CREATE TABLE MyTable (
part_number VARCHAR(10) NOT NULL,
part_number_int AS CAST(replace(part_number, '-', '') AS int)
);
ALTER TABLE dbo.MyTable ADD PRIMARY KEY (part_number);
ALTER TABLE dbo.MyTable ADD UNIQUE (part_number_int);
INSERT INTO dbo.MyTable (part_number)
VALUES ('100-1'), ('140265');
SELECT *
FROM dbo.MyTable AS MT
WHERE MT.part_number_int IN ('1001', '140265');
Of course your replace statement will be more complex and you'll have to sanitize user input the same way you sanitize column values. But this is going to be the most efficient way to do it.
This query can now seek your column efficiently:
But to be honest, I'd just create a separate column to store cleansed values for querying purpose and keep the actual values for display. You'll have to take care of extra update/insert clauses, but that's a minimum damage.

Best way to index a SQL table to find best matching string

Let's say I have a SQL table with an int PK column and an nvarchar(max). In the the nvarchar(max) column, I have a bunch of table entries that are all like this:
SOME_PEOPLE_LIKE_APPLES
SOME_PEOPLE_LIKE_APPLES_ON_TUESDAY
SOME_PEOPLE_LIKE_APPLES_ON_THE_MOON
SOME_PEOPLE_LIKE_APPLES_ON_THE_MOON_CAFE
SOME_PEOPLE_LIKE_APPLES_ON_THE_RIVER
.
.
.
SOME_ANTS_HATE_SYRUP
SOME_ANTS_HATE_SYRUP_WITH_STRAWBERRIES
There's millions of these rows - Then let's say my goal is to find the row with the most overlap for an input searchTerm - So in this case, if I input SOME PEOPLE_LIKE_APPLES_ON_THE_MOON_MOUNTAIN, the returned entry would be the third entry from the table above, SOME_PEOPLE_LIKE_APPLES_ON_THE_MOON
I have a SPROC that does this very naively, it goes through the entire table as follows:
SELECT DISTINCT phrase, len(phrase) l, [id] FROM X WHERE searchTerm LIKE phrase + '%'
-- phrase is the row entry being searched against
-- searchTerm is the phrase we're searching for
I then ORDER BY length and pick the TOP only
Would there be a way to speed this up, perhaps by doing some indexing?
If this is confusing, think of it as tableRowEntry + wildcard = searchTerm
I'm on MSSQL 2008 if that makes any difference

If there is an index on your NVARCHAR-column a LIKE 'Something%' -search will be able to use it and should be pretty fast.
If there is a wildcard in the beginning you are out of luck. But - in your case - this should work.
You might use an indexed persistant computed column storing the length of the string. In this case you might reduce the workload enormously by filtering out all string which are to short or to long.
If there are certain words in your search terms which appear often but not everywhere, you might use side columns again and filter like AND InlcudePEOPLE=1 AND IncludeMOON=1
UPDATE
Here is an example
CREATE TABLE Phrase(ID INT IDENTITY
,Phrase NVARCHAR(100)
,PhraseLength AS LEN(Phrase) PERSISTED);
CREATE INDEX IX_Phrase_Phrase ON Phrase(Phrase);
CREATE INDEX IX_Phrase_PhraseLength ON Phrase(PhraseLength);
INSERT INTO Phrase
VALUES
('SOME_PEOPLE_LIKE_APPLES')
,('SOME_PEOPLE_LIKE_APPLES_ON_TUESDAY')
,('SOME_PEOPLE_LIKE_APPLES_ON_THE_MOON')
,('SOME_PEOPLE_LIKE_APPLES_ON_THE_MOON_CAFE')
,('SOME_PEOPLE_LIKE_APPLES_ON_THE_RIVER')
,('SOME_ANTS_HATE_SYRUP')
,('SOME_ANTS_HATE_SYRUP_WITH_STRAWBERRIES');
DECLARE #SearchTerm NVARCHAR(100)=N'SOME_PEOPLE_LIKE_APPLES_ON_THE_MOON_MOUNTAIN';
--This uses the index (checked against execution plan)
SELECT TOP 1 *
FROM Phrase
WHERE #SearchTerm LIKE Phrase + '%'
ORDER BY PhraseLength DESC;
--This might be even better, check with your high row count.
SELECT TOP 1 *
FROM Phrase
WHERE Phrase=LEFT(#SearchTerm,PhraseLength)
ORDER BY PhraseLength DESC;
GO
--Clean-Up
DROP TABLE Phrase;

The best solution here is to create a full-text search index:
https://msdn.microsoft.com/en-us/library/ms142571.aspx
Full text search is optimized for this task, once the index is created you can use full-text queries with the CONTAINS full-text function to find the matches efficiently:
SELECT DISTINCT phrase, len(phrase) l, [id] FROM X WHERE CONTAINS(phrase, searchPhrase)
Full text search not only allows custom optimization through query hints like OPTIMIZE FOR, it also allows for stopwords like AND and OR within the search terms, and a variety of other text-searching goodies, like being able to find spelling variations of the same word automatically and filter by relevance, etc..

Improving performance on an alphanumeric text search query

I have table where millions of records are there I'm just posting sample data. Actually I'm looking to get only Endorsement data by using LIKE or LEFT but there is no difference between them in Execution time. IS there any fine way to get data in less time while dealing with Alphanumeric Data. I have 4.4M records in table. Suggest me
declare #t table (val varchar(50))
insert into #t(val)values
('0-1AB11BC11yerw123Endorsement'),
('0-1AB114578Endorsement'),
('0-1BC11BC11yerw122553Endorsement'),
('0-1AB11BC11yerw123newBusiness'),
('0-1AB114578newBusiness'),
('0-1BC11BC11yerw122553newBusiness'),
('0-1AB11BC11yerw123Renewal'),
('0-1AB114578Renewal'),
('0-1BC11BC11yerw122553Renewal')
SELECT * FROM #t where RIGHT(val,11) = 'Endorsement'
SELECT * FROM #t where val like '%Endorsement%'

Imagine you'd have to find names in a telephone book that end with a certain string. All you could do is read every single name and compare. It doesn't help you at all to see where the names with A, B, C, etc. start, because you are not interested in the initial characters of the names but only in the last characters instead. Well, the only thing you could do to speed this up is ask some friends to help you and each person scans a range of pages only. In a DBMS it is the same. The DBMS performs a full table scan and does this parallelized if possible.
If however you had a telephone book listing the words backwards, so you'd see which words end with A, B, C, etc., that sure would help. In SQL Server: Create a computed column on the reverse string:
alter table t add reverse_val as reverse(val);
And add an index:
create index idx_reverse_val on t(reverse_val);
Then query the string with LIKE. The DBMS should notice that it can use the index for speeding up the search process.
select * from t where reverse_val like reverse('Endorsement') + '%';
Having said this, it seems strange that you are interested in the end of your strings at all. In a good database you store atomic information, e.g. you would not store a person's name and birthdate in the same column ('John Miller 12.12.2000'), but in separate columns instead. Sure, it does happen that you store names and want to look for names starting with, ending with, containing substrings, but this is a rare thing after all. Check your column and think about whether its content should be separate columns instead. If you had the string ('Endorsement', 'Renewal', etc.) in a separate column, this would really speed up the lookup, because all you'd have to do is ask where val = 'Endorsement' and with an index on that column this is a super-simple task for the DBMS.

try charindex or patindex:
SELECT *
FROM #t t
WHERE CHARINDEX('endorsement', t.val) > 0
SELECT *
FROM #t t
WHERE PATINDEX('%endorsement%', t.val) > 0

CREATE TABLE tbl
(val varchar(50));
insert into tbl(val)values
('0-1AB11BC11yerw123Endorsement'),
('0-1AB114578Endorsement'),
('0-1BC11BC11yerw122553Endorsement'),
('0-1AB11BC11yerw123newBusiness'),
('0-1AB114578newBusiness'),
('0-1BC11BC11yerw122553newBusiness'),
('0-1AB11BC11yerw123Renewal'),
('0-1AB114578Renewal'),
('0-1BC11BC11yerw122553Renewal');
CREATE CLUSTERED INDEX inx
ON dbo.tbl(val)
SELECT * FROM tbl where val like '%Endorsement';
--LIKE '%Endorsement' will give better performance it will utilize the index well efficiently than RIGHT(val,ll)

Full-Text Catalog unable to search Primary Key

I have create this little sample Table Person.
The Id is a primary key and is identity = True
Now when I try to create a FullText Catalog, I'm unable to search for the Id.
If I follow this wizard, I'll end up with a Catalog, where its only possible to search for a persons name. I would really like to know, how I can make it possible to do a fulltext search for both the Id and Name.
edit
SELECT *
FROM [TestDB].[dbo].[Person]
WHERE FREETEXT (*, 'anders' );
SELECT *
FROM [TestDB].[dbo].[Person]
WHERE FREETEXT (*, '1' );
I would like them to return the same result, the first returns id = 1 name = Anders, while the second query don't return anything.
edit 2
Looks like the problem is in using int, but is it not possible to trick FullText to support it?
edit 3
Created a view where I convert the int to a nvarchar. CONVERT(nvarchar(50), Id) AS PersonId this did make it possible for me to select that column, when creating the Full Text Catalog, but It still won't let me find it searching for the Id.

From reading your question, I am not sure that you understand the purpose of a full-text index. A full-text index is intended to search TEXT (one or more columns) on a table. And not just as a replacement for:
SELECT *
FROM table
WHERE col1 LIKE 'Bob''s Pizzaria%'
OR col2 LIKE 'Bob''s Pizzaria%'
OR col3 LIKE 'Bob''s Pizzaria%'
It also allows you to search for variations of "Bob's Pizzaria" like "Bobs Pizzeria" (in case someone misspells Pizzeria or forgot to put in the ' or over-zealous anti-SQL-injection code stripped the ') or "Robert's Pizza" or "Bob's Pizzeria" or "Bob's Pizza", etc. It also allows you to search "in the middle" of a text column (char, varchar, nchar, nvarchar, etc.) without the dreaded "%Bob%Pizza%" that eliminates any chance of using a traditional index.
Enough with the lecture, however. To answer your specific question, I would create a separate column (not a "computed column") "IdText varchar(10)" and then an AFTER INSERT trigger something like this:
UPDATE t
SET IdText = CAST(Id AS varchar(10))
FROM table AS t
INNER JOIN inserted i ON i.Id = t.Id
If you don't know what "inserted" is, see this MSDN article or search Stack Overflow for "trigger inserted". Then you can include the IdText column in your full-text index.
Again, from your example I am not sure that a full-text index is what you should use here but then again your actual situation might be something different and you just created this example for the question. Also, full-text indexes are relatively "expensive" so make sure you do a cost-benefit analysis. Here is a Stack Overflow question about full-text index usage.

Why not just do a query like this.
SELECT *
FROM [TestDB].[dbo].[Person]
WHERE FREETEXT (*, '1' )
OR ID = 1
You can leave off the "OR ID = 1" part if by first checking if the search term is a number.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

SQL Server 2005 FTS unexpected results - sql

Just a thought: Try enclosing your search term with double quotes to see if it makes a difference. SELECT * FROM SearchablePartContent WHERE CONTAINS(SearchableContent, ' "rotor" ')

Related

Indexing a varchar2 column in oracle with a pattern?

SQL Server stored procedure to search list of values without special characters

Best way to index a SQL table to find best matching string

Improving performance on an alphanumeric text search query

Full-Text Catalog unable to search Primary Key

Categories

Resources