I wrote a query to fetch data from a table containing 93,781,665 rows, so that the results can be displayed as suggestions in an autocomplete text box.
However, it takes more than 300 seconds to return results.
The query is given below.
select * from table
where upper(column1||' '||column2||' '||column3) like upper('searchstring%')
and rownum <= 10;
Kindly help me to optimize it.
The WHERE clause in your query is not sargable, meaning that a plain index on the underlying columns cannot be used for it. This rules out most of the methods you might use here to optimize the query. Here is one suggestion:
SELECT *
FROM yourTable
WHERE column4 LIKE 'SEARCHSTRING%';
Here, column4 is a new column in your table, which contains the concatenation of the first three columns. Furthermore, all text in column4 will always be uppercase, and the search string you pass into the query will also always be uppercase. Given these assumptions, the following index might help the query:
CREATE INDEX idx ON yourTable (column4);
In Oracle, you can index an expression:
create index idx_t_columns on t(upper(column1||' '||column2||' '||column3))
Then, this condition can use the index:
where upper(column1||' '||column2||' '||column3) like 'SEARCHSTRING%'
If the search string is constant, then this should also work:
where upper(column1||' '||column2||' '||column3) like upper('searchstring%')
Note that a wildcard at the beginning of the like pattern would preclude the use of an index.
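To see why a trailing wildcard can use an index while a leading one cannot, it may help to picture the index as a sorted list of keys. The sketch below is only an illustration in Python, not a description of how Oracle implements its indexes; the names prefix_range and keys, and the sample values, are made up for the example. A pattern such as 'JOHN%' corresponds to one contiguous range of the sorted keys, which binary search can locate, whereas a pattern such as '%DOE' does not, so every key has to be tested.

```python
from bisect import bisect_left

def prefix_range(sorted_keys, prefix):
    """Return the keys that start with prefix, located by binary search only."""
    lo = bisect_left(sorted_keys, prefix)
    # Smallest string greater than every string with this prefix: bump the
    # last character by one code point. Assumes prefix is non-empty and its
    # last character is not the maximum code point.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    hi = bisect_left(sorted_keys, upper)
    return sorted_keys[lo:hi]

# Hypothetical index contents: uppercased concatenations, kept in sorted order.
keys = sorted(["JOHN A SMITH", "JOHN B DOE", "JOAN C ROE", "MARY D POE"])

# Trailing wildcard ('JOHN%'): two binary searches bound the matching range.
print(prefix_range(keys, "JOHN"))

# Leading wildcard ('%DOE'): sort order does not help, every key is checked.
print([k for k in keys if k.endswith("DOE")])
```

This is the same reason the answers above insist that the indexed expression and the search string agree on case: the range lookup only works if both sides are compared in the same form.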
Related
I am using regexp_like function to search specific patterns on a column. But, I see this query is not taking the index created on this column instead going for full table scan. Is there any option to create function based index for regexp_like so that my query will use that index? Here, the pattern SV4889 is not constant expression but it will vary every time.
select * from test where regexp_like(id,'SV4889')
Yup. Regular expressions do not use indexes. What can you do?
Well, if you are just looking for equality, then use equality:
where id = 'SV4889'
This will use an index on (id).
If you are looking for a leading value, then use like:
where id like 'SV4889%'
This will use an index because the wildcard is at the end of the pattern.
If you are storing multiple values in the column, say 'SV4889,SV4890' then fix your data model. It is broken! You should have another table with one row per id.
Finally, if you really need more sophisticated full text capabilities, then look into Oracle's support for full text indexes. However, such capabilities are usually not needed on a column called id.
You can add a virtual column to your table to determine if the substring you're interested in exists in the field, then index the virtual column. For example:
ALTER TABLE TEST
ADD SV4889_FLAG CHAR(1)
GENERATED ALWAYS AS (CASE
WHEN REGEXP_LIKE(ID,'SV4889') THEN 'Y'
ELSE 'N'
END) VIRTUAL;
This adds a field named SV4889_FLAG to your table which will contain Y if the text SV4889 exists in the ID field, and N if it doesn't. Then you can create an index on the new field:
CREATE INDEX IDX_TEST_SV4889_FLAG
ON TEST (SV4889_FLAG);
So to determine if a row has 'SV4889' in it you can use a query such as:
SELECT *
FROM TEST
WHERE SV4889_FLAG = 'Y'
Is doing something like the following possible in SQLite:
create INDEX idx on mytable (synopsis(20));
In other words, indexing by something less than the full text field? This is useful on long-text fields where I don't want to index everything into memory (the index itself could take up more space than the entire table).
You seem to be looking for an index on expression:
Use a CREATE INDEX statement to create a new index on one or more expressions just like you would to create an index on columns. The only difference is that expressions are listed as the elements to be indexed rather than column names.
Consider:
CREATE INDEX idx ON mytable(SUBSTR(synopsis, 1, 20));
Please note that, as explained in the documentation, for this index to be considered by the sqlite query planner, you need to use the exact same expression that was given when creating the index.
So this query would use the index:
SELECT * FROM mytable WHERE SUBSTR(synopsis, 1, 20) = 'a text with 20 chars';
While, typically, this would not:
SELECT * FROM mytable WHERE synopsis LIKE 'a text with 20 chars%';
Note: yes, 'a text with 20 chars' is 20 chars long...
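One way to check how the planner treats each of the two queries is to run EXPLAIN QUERY PLAN from Python's built-in sqlite3 module. The sketch below assumes an in-memory database with made-up sample rows; the index name idx_synopsis_prefix and the helper plan_for are chosen for the example. The exact plan text differs between SQLite versions, so it is printed rather than assumed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, synopsis TEXT)")
# Index only the first 20 characters of the long text column.
conn.execute("CREATE INDEX idx_synopsis_prefix ON mytable (SUBSTR(synopsis, 1, 20))")
conn.executemany(
    "INSERT INTO mytable (synopsis) VALUES (?)",
    [
        ("a text with 20 chars and then a much longer tail",),
        ("another synopsis that does not match",),
    ],
)

def plan_for(sql, params=()):
    """Return the planner's description of how it would run sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text

# Same expression as in CREATE INDEX, so the index is a candidate.
indexed_sql = "SELECT id FROM mytable WHERE SUBSTR(synopsis, 1, 20) = ?"
# Different expression (the bare column), so the expression index does not apply.
like_sql = "SELECT id FROM mytable WHERE synopsis LIKE ?"

print(plan_for(indexed_sql, ("a text with 20 chars",)))
print(plan_for(like_sql, ("a text with 20 chars%",)))

matches = conn.execute(indexed_sql, ("a text with 20 chars",)).fetchall()
print(matches)
```

Note that the equality on the 20-character prefix returns every row sharing that prefix, so a second, more precise filter on the full column may still be needed afterwards.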
Let's say I have a SQL table with an int PK column and an nvarchar(max) column. In the nvarchar(max) column, I have a bunch of entries that all look like this:
SOME_PEOPLE_LIKE_APPLES
SOME_PEOPLE_LIKE_APPLES_ON_TUESDAY
SOME_PEOPLE_LIKE_APPLES_ON_THE_MOON
SOME_PEOPLE_LIKE_APPLES_ON_THE_MOON_CAFE
SOME_PEOPLE_LIKE_APPLES_ON_THE_RIVER
.
.
.
SOME_ANTS_HATE_SYRUP
SOME_ANTS_HATE_SYRUP_WITH_STRAWBERRIES
There are millions of these rows. My goal is to find the row with the most overlap for an input searchTerm. In this case, if I input SOME_PEOPLE_LIKE_APPLES_ON_THE_MOON_MOUNTAIN, the returned entry would be the third entry from the table above, SOME_PEOPLE_LIKE_APPLES_ON_THE_MOON.
I have a SPROC that does this very naively, it goes through the entire table as follows:
SELECT DISTINCT phrase, len(phrase) l, [id] FROM X WHERE searchTerm LIKE phrase + '%'
-- phrase is the row entry being searched against
-- searchTerm is the phrase we're searching for
I then ORDER BY length and pick the TOP only
Would there be a way to speed this up, perhaps by doing some indexing?
If this is confusing, think of it as tableRowEntry + wildcard = searchTerm
I'm on MSSQL 2008 if that makes any difference
If there is an index on your NVARCHAR column, a LIKE 'Something%' search will be able to use it and should be pretty fast. Note that an NVARCHAR(MAX) column cannot be used as an index key column, so the column needs a bounded length for this to work.
If there is a wildcard at the beginning, you are out of luck. But in your case this should work.
You might use an indexed, persisted computed column storing the length of the string. In this case you might reduce the workload enormously by filtering out all strings which are too short or too long.
If there are certain words in your search terms which appear often but not everywhere, you might use side columns again and filter like AND IncludePEOPLE=1 AND IncludeMOON=1
UPDATE
Here is an example
CREATE TABLE Phrase(ID INT IDENTITY
,Phrase NVARCHAR(100)
,PhraseLength AS LEN(Phrase) PERSISTED);
CREATE INDEX IX_Phrase_Phrase ON Phrase(Phrase);
CREATE INDEX IX_Phrase_PhraseLength ON Phrase(PhraseLength);
INSERT INTO Phrase
VALUES
('SOME_PEOPLE_LIKE_APPLES')
,('SOME_PEOPLE_LIKE_APPLES_ON_TUESDAY')
,('SOME_PEOPLE_LIKE_APPLES_ON_THE_MOON')
,('SOME_PEOPLE_LIKE_APPLES_ON_THE_MOON_CAFE')
,('SOME_PEOPLE_LIKE_APPLES_ON_THE_RIVER')
,('SOME_ANTS_HATE_SYRUP')
,('SOME_ANTS_HATE_SYRUP_WITH_STRAWBERRIES');
DECLARE @SearchTerm NVARCHAR(100)=N'SOME_PEOPLE_LIKE_APPLES_ON_THE_MOON_MOUNTAIN';
--This uses the index (checked against execution plan)
SELECT TOP 1 *
FROM Phrase
WHERE @SearchTerm LIKE Phrase + '%'
ORDER BY PhraseLength DESC;
--This might be even better, check with your high row count.
SELECT TOP 1 *
FROM Phrase
WHERE Phrase=LEFT(@SearchTerm,PhraseLength)
ORDER BY PhraseLength DESC;
GO
--Clean-Up
DROP TABLE Phrase;
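One detail worth checking in the first query: in a LIKE pattern, an underscore matches any single character, so a phrase full of underscores is a looser pattern than it looks. The second form compares with equality and avoids that. Below is a minimal sketch of the equality form using Python's sqlite3 module rather than SQL Server, so substr, length and LIMIT stand in for LEFT, LEN and TOP. The table layout and the helper name longest_prefix_match are assumptions for the example, and it says nothing about how SQL Server would plan the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE phrase (id INTEGER PRIMARY KEY, phrase TEXT NOT NULL)")
conn.execute("CREATE INDEX ix_phrase_phrase ON phrase (phrase)")
conn.executemany(
    "INSERT INTO phrase (phrase) VALUES (?)",
    [
        ("SOME_PEOPLE_LIKE_APPLES",),
        ("SOME_PEOPLE_LIKE_APPLES_ON_TUESDAY",),
        ("SOME_PEOPLE_LIKE_APPLES_ON_THE_MOON",),
        ("SOME_PEOPLE_LIKE_APPLES_ON_THE_MOON_CAFE",),
        ("SOME_PEOPLE_LIKE_APPLES_ON_THE_RIVER",),
        ("SOME_ANTS_HATE_SYRUP",),
        ("SOME_ANTS_HATE_SYRUP_WITH_STRAWBERRIES",),
    ],
)

def longest_prefix_match(search_term):
    """Return the longest stored phrase that is an exact prefix of search_term."""
    row = conn.execute(
        """
        SELECT phrase
        FROM phrase
        WHERE phrase = substr(?, 1, length(phrase))  -- equality, no wildcards
        ORDER BY length(phrase) DESC
        LIMIT 1
        """,
        (search_term,),
    ).fetchone()
    return row[0] if row else None

print(longest_prefix_match("SOME_PEOPLE_LIKE_APPLES_ON_THE_MOON_MOUNTAIN"))
# An 'X' where a phrase has '_' is rejected here; a LIKE pattern would accept it.
print(longest_prefix_match("SOMEXANTS_HATE_SYRUP"))
```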
The best solution here is to create a full-text search index:
https://msdn.microsoft.com/en-us/library/ms142571.aspx
Full-text search is optimized for this task. Once the index is created, you can use full-text queries with the CONTAINS full-text predicate to find the matches efficiently:
SELECT DISTINCT phrase, len(phrase) l, [id] FROM X WHERE CONTAINS(phrase, searchPhrase)
Queries that use full-text predicates can still take ordinary query hints such as OPTIMIZE FOR. Full-text search also supports Boolean operators such as AND and OR within the search terms, along with a variety of other text-searching features, such as matching inflectional forms of the same word and ranking results by relevance.
I would like to ask if it is possible to do this:
For example, the search string is '009' (treat the digits as a string).
Is it possible to write a query that returns every value in the database made up of these characters, regardless of their order?
For this example it would return
'009'
'090'
'900'
provided these exist in the database. Thanks!
Use the LIKE operator.
For example:
SELECT Marks FROM Report WHERE Marks LIKE '%009%' OR Marks LIKE '%090%' OR Marks LIKE '%900%'
Split the string into individual characters, select all rows containing the first character and put them in a temporary table, then select all rows from the temporary table that contain the second character and put these in a temporary table, then select all rows from that temporary table that contain the third character.
Of course, there are probably many ways to optimize this, but I see no reason why it would not be possible to make a query like that work.
This cannot be achieved in a straightforward way, because there is no built-in sort() function for the characters of a value in the way that there are lower() and upper() functions.
But there are some workarounds, for example:
Suppose you are querying column A. Maintain another column, SORTED_A, in which the application stores the value of A with its characters sorted.
Then, when you run the query, sort the characters of the search token and match the sorted token against the SORTED_A column.
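This workaround can be sketched as follows, using Python's sqlite3 module as a stand-in database. The names report, marks, sorted_marks, sort_chars and find_permutations are assumptions for the example. The point is only that the application stores a canonical, character-sorted copy of each value and compares it against the sorted search string, which turns the lookup into a plain indexed equality.

```python
import sqlite3

def sort_chars(value):
    """Canonical form: the characters of value in sorted order, e.g. '900' -> '009'."""
    return "".join(sorted(value))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE report (marks TEXT, sorted_marks TEXT)")
conn.execute("CREATE INDEX idx_report_sorted ON report (sorted_marks)")

# The application fills in the sorted copy whenever it writes a row.
for marks in ["009", "090", "900", "019", "99"]:
    conn.execute("INSERT INTO report VALUES (?, ?)", (marks, sort_chars(marks)))

def find_permutations(search):
    """Return stored values that use exactly the same characters as search."""
    rows = conn.execute(
        "SELECT marks FROM report WHERE sorted_marks = ? ORDER BY marks",
        (sort_chars(search),),
    )
    return [r[0] for r in rows]

print(find_permutations("009"))
```

The sorted copy has to be kept in step with the original column on every insert and update, otherwise the lookup silently misses rows.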
I need to retrieve rows from my table where a field does not start with a certain value:
I'm currently doing so with a simple query like this:
SELECT A.ID FROM SCHEMA.TABLE A WHERE A.FIELD NOT LIKE 'WORD%'
However, I have learned that A.FIELD sometimes contains a varying number of blank spaces before "WORD".
Obviously, I could re-write the query with another wildcard, but that would make it non-sargable and slow it down a fair bit (this query runs on a reasonably large table and needs to be as efficient as possible).
Is there any way I can write a sargable query to fix this problem?
If you can't clean the data for any reason, one option is to add a computed column to your table that trims all leading and trailing spaces:
ALTER TABLE YourTable
ADD TrimmedYourColumn as (RTRIM(LTRIM(YourColumn)))
And index the computed column:
CREATE INDEX IX_YourTable_TrimmedYourColumn
ON YourTable (TrimmedYourColumn)
And now search that column instead:
SELECT A.ID FROM YourTable A WHERE TrimmedYourColumn NOT LIKE 'WORD%'
How about ltrim?
SELECT A.ID FROM SCHEMA.TABLE A WHERE ltrim(A.FIELD) NOT LIKE 'WORD%'