SQL Server Function Based Index Without Computed Column

I have a SQL Server query which is running slowly due to the WHERE clause
WHERE LEFT(CustomerId,3) = 'ABC'
I need the LEFT function there because there are other customer IDs which begin with 'BCD', 'CDE' or 'DEF', and I only want those starting with 'ABC' in my results. However, I need the query to be quick, but the function is killing it, causing an index scan.
I know that in Oracle there is a way to create a function-based index, so I would be able to do
CREATE INDEX IX_Customer_ID ON Customer (SUBSTR(CustomerId, 1, 3))
I would like to create similar functionality in SQL Server. I know I can do this using a computed column and creating an index on that column. However, I was wondering if there was a way to do this without a computed column?

You cannot create an index based on a function in SQL Server. You can, however, create an index on a computed column (with restrictions, such as that the computation has to use only deterministic functions).
In your specific case, you don't have to use any functions; you just need to modify your query slightly.
Since left(column, 3) = 'abc' is equivalent to column like 'abc%', you can safely use the latter. SQL Server is able to use indexes with the LIKE operator when the pattern starts with a constant ('abc' in this case).
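For illustration, a minimal sketch of the rewrite; table, column, and index names here are assumptions:

CREATE INDEX IX_Customer_CustomerId ON Customer (CustomerId);

-- Sargable: the optimizer can seek the index on the 'ABC' prefix
SELECT *
FROM Customer
WHERE CustomerId LIKE 'ABC%';

And if you ever do need the computed value itself, the computed-column route could look like this (LEFT is deterministic, so the column is indexable):

ALTER TABLE Customer ADD CustomerPrefix AS LEFT(CustomerId, 3);
CREATE INDEX IX_Customer_CustomerPrefix ON Customer (CustomerPrefix);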

Related

Can I ask Firebird to use index (with like on strings) in left join condition on other column?

I have these table structures:
journal_entries (id integer, account varchar(20), doc_date date, amount numeric(15,4))
selected_accounts (account varchar(20), selection_id integer)
I can run this query, and the SQL engine uses the indexes on both doc_date and account:
select je.*
from journal_entries je
where je.doc_date>='01.01.2022' and
je.doc_date<='31.03.2022' and
(je.account like '23%' or
je.account like '24%')
But when I fill the selected_accounts table with this data:
23%, 1
24%, 1
and try to use the condition in a left join:
select je.*
from selected_accounts sa
left join journal_entries je on (
je.doc_date>='01.01.2022' and
je.doc_date<='31.03.2022' and
je.account like sa.account)
Then the SQL engine does not use the index on journal_entries.account; it uses only the index on je.doc_date.
Can I give some hint to the optimizer or SQL engine that the condition je.account like sa.account should use the index on je.account?
I am using Firebird 3.1 and Firebird 2.1, but I guess this issue exists in other SQL databases as well.
I am gravitating towards accepting that I cannot make an optimal query with the condition in the left join...
Question supplemented: I copied the plan (e.g. as given by IBExpert) from the first query into the PLAN clause of the second query, but the SQL engine reports:
index <index on journal_entries.account> cannot be used in the specified plan
So, there is something in my query that prevents the reference to and use of the journal_entries.account index.
Additional observation: in fact, my database has 1M journal entries in 2022Q1 (the period specified in my examples); the first (good) query reports less than 1M indexed record reads, but the second (bad) query reports 2*1M indexed record reads (indexed because of the index on journal_entries.doc_date), so this is even worse than a full read by doc_date followed by filtering on the selected_accounts entries.
One step forward: thanks to @Damien_The_Unbeliever's comment I made this test (sic! the first string is prefixed with %):
select je.*
from journal_entries je
where je.doc_date>='01.01.2022' and
je.doc_date<='31.03.2022' and
(je.account like '%23%' or
je.account like '24%')
Now the je.account index is no longer used and the number of reads increased. So it seems to me that the Firebird query engine/optimizer inspects the string literals used in LIKE conditions and decides whether the index on je.account can be used.
So, maybe I can give Firebird some notice (for my second/slow query) that I expect only %-suffixed strings as the selected_accounts.account output? That would solve my issue in the case of the Firebird engine.
It is not possible to optimize LIKE in this case, because Firebird cannot know what values your column has. The fact that somecolumn like '24%' can use an index is because Firebird will rewrite that expression to somecolumn starting with '24' (see also LIKE, specifically the note titled "About LIKE and the Optimizer"). It is not possible to do so with parameters or values obtained from columns.
In other words, the obvious solution to your problem is to populate selected_accounts.account not with '24%' but with '24', and to use STARTING WITH in your join condition.
In case the wildcard doesn't always occur and you sometimes need an exact match, you could use something like je.account starting with replace(sa.account, '%', '') and je.account like sa.account. This solution assumes that % only ever occurs as the last character, and that the _ wildcard is not used.
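A sketch of the suggested rewrite, assuming selected_accounts.account now stores plain prefixes such as '23' and '24' rather than '23%' and '24%':

select je.*
from selected_accounts sa
left join journal_entries je on (
  je.doc_date >= '01.01.2022' and
  je.doc_date <= '31.03.2022' and
  je.account starting with sa.account)

With STARTING WITH, the optimizer no longer has to inspect a pattern at run time, so the index on journal_entries.account becomes usable.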

Oracle CONTAINS() not returning results for numbers

So I have this table with a full-text-indexed column "value". The field contains some number strings, which are correctly returned when I use a query like this:
select value
from mytable
where value like '17946234';
Trouble is, it's incredibly slow because there are a lot of rows in this table, but when I use the CONTAINS operator I get no results:
select value
from mytable
where CONTAINS(value, '17946234', 1) > 0
Anyone got any thoughts?
Unfortunately, I'm not an Oracle dude, and the same query works fine in SQL Server. I feel like it must be a stoplist or something with the Oracle lexer that I can change, but I'm not really sure how.
This could be due to INDEXTYPE IS CTXSYS.CONTEXT in general, or to the index not having been updated after the looked-for records were added (CONTEXT-type indexes are not transactional, whilst CTXCAT-type ones are).
However, if you did not accidentally lose the wildcard in your statement (in other words: if no wildcard is required indeed), you could just query
select value from mytable where value = '17946234';
which could possibly be backed by an ordinary index. (Depending on your specific data distribution and the queries run, an index might not help query performance.)
Otherwise
select value from mytable where instr(value, '17946234') > 0;
might be sufficient.
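If the CONTEXT index is simply out of date, synchronizing it is worth a try; a minimal sketch, assuming an index named my_value_idx and sufficient privileges on CTX_DDL (both are assumptions):

BEGIN
  CTX_DDL.SYNC_INDEX('my_value_idx');
END;
/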

Use of function calls in stored procedure SQL Server 2005?

Does the use of function calls in the WHERE clause of a stored procedure slow down performance in SQL Server 2005?
SELECT * FROM Member M
WHERE LOWER(dbo.GetLookupDetailTitle(M.RoleId,'MemberRole')) != 'administrator'
AND LOWER(dbo.GetLookupDetailTitle(M.RoleId,'MemberRole')) != 'moderator'
In this query, GetLookupDetailTitle is a user-defined function and LOWER() is a built-in function; I am asking about both.
Yes.
Both of these are practices to be avoided where possible.
Applying almost any function to a column makes the expression unsargable, which means an index cannot be used; and even if the column is not indexed, it makes cardinality estimates incorrect for the rest of the plan.
Additionally, your dbo.GetLookupDetailTitle scalar function looks like it does data access, and this logic should be inlined into the query.
The query optimiser does not inline logic from scalar UDFs, so your query will perform this lookup for each row in your source data, which will effectively enforce a nested loops join irrespective of its suitability.
Additionally, this will actually happen twice per row because of the two function invocations. You should probably rewrite it as something like:
SELECT M.* /* But don't use * either; list columns explicitly... */
FROM Member M
WHERE NOT EXISTS (SELECT *
                  FROM MemberRoles R
                  WHERE R.MemberId = M.MemberId
                    AND R.RoleId IN (1, 2))
Don't be tempted to replace the literal values 1,2 with variables with more descriptive names as this too can mess up cardinality estimates.
Using a function in a WHERE clause forces a table scan.
There's no way to use an index since the engine can't know what the result will be until it runs the function on every row in the table.
You can avoid both the user-defined function and the built-in by:
- defining "magic" values for administrator and moderator roles and comparing Member.RoleId against these scalars
- defining IsAdministrator and IsModerator flags on a MemberRole table and joining it with Member to filter on those flags (see the sketch below)
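A minimal sketch of the second option, assuming a MemberRole lookup table keyed by RoleId with bit flags IsAdministrator and IsModerator (all names are illustrative):

SELECT M.MemberId, M.RoleId
FROM Member M
JOIN MemberRole R ON R.RoleId = M.RoleId
WHERE R.IsAdministrator = 0
  AND R.IsModerator = 0;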

IN vs OR in Oracle, which is faster?

I'm developing an application which processes a lot of data in an Oracle database.
In some cases, I have to fetch many objects based on a given list of conditions, and I use SELECT ... FROM ... WHERE ... IN (...), but an IN list accepts at most 1,000 items.
So I use OR expressions instead, but as far as I can observe, the query using OR is slower than the one using IN (with the same list of conditions). Is that right? And if so, how can I improve the speed of the query?
IN is preferable to OR -- OR is a notoriously bad performer, and can cause other issues that would require using parentheses in complex queries.
A better option than either IN or OR is to join to a table containing the values you want (or don't want). This table for comparison can be derived, temporary, or already existing in your schema.
In this scenario I would do this (see the sketch below):
1. Create a one-column global temporary table
2. Populate this table with your list from the external source (and quickly - another whole discussion)
3. Do your query by joining the temporary table to the other table (consider dynamic sampling, as the temporary table will not have good statistics)
This means you can leave the sort to the database and write a simple query.
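A sketch of those three steps, with illustrative table and column names; the hint asks Oracle to sample the temporary table at optimization time, since it has no useful statistics:

CREATE GLOBAL TEMPORARY TABLE wanted_ids (id NUMBER)
ON COMMIT PRESERVE ROWS;

-- populate wanted_ids from the external source, then:
SELECT /*+ dynamic_sampling(w 2) */ t.*
FROM my_table t
JOIN wanted_ids w ON w.id = t.id;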
Oracle internally converts IN lists to lists of ORs anyway, so there should really be no performance difference. The only difference is that Oracle has to transform the INs, but has longer strings to parse if you supply the ORs yourself.
Here is how you test that.
CREATE TABLE my_test (id NUMBER);
SELECT 1
FROM my_test
WHERE id IN (1,2,3,4,5,6,7,8,9,10,
21,22,23,24,25,26,27,28,29,30,
31,32,33,34,35,36,37,38,39,40,
41,42,43,44,45,46,47,48,49,50,
51,52,53,54,55,56,57,58,59,60,
61,62,63,64,65,66,67,68,69,70,
71,72,73,74,75,76,77,78,79,80,
81,82,83,84,85,86,87,88,89,90,
91,92,93,94,95,96,97,98,99,100
);
SELECT sql_text, hash_value
FROM v$sql
WHERE sql_text LIKE '%my_test%';
SELECT operation, options, filter_predicates
FROM v$sql_plan
WHERE hash_value = 1181594990; -- hash_value from the previous query
SELECT STATEMENT
TABLE ACCESS FULL ("ID"=1 OR "ID"=2 OR "ID"=3 OR "ID"=4 OR "ID"=5
OR "ID"=6 OR "ID"=7 OR "ID"=8 OR "ID"=9 OR "ID"=10 OR "ID"=21 OR
"ID"=22 OR "ID"=23 OR "ID"=24 OR "ID"=25 OR "ID"=26 OR "ID"=27 OR
"ID"=28 OR "ID"=29 OR "ID"=30 OR "ID"=31 OR "ID"=32 OR "ID"=33 OR
"ID"=34 OR "ID"=35 OR "ID"=36 OR "ID"=37 OR "ID"=38 OR "ID"=39 OR
"ID"=40 OR "ID"=41 OR "ID"=42 OR "ID"=43 OR "ID"=44 OR "ID"=45 OR
"ID"=46 OR "ID"=47 OR "ID"=48 OR "ID"=49 OR "ID"=50 OR "ID"=51 OR
"ID"=52 OR "ID"=53 OR "ID"=54 OR "ID"=55 OR "ID"=56 OR "ID"=57 OR
"ID"=58 OR "ID"=59 OR "ID"=60 OR "ID"=61 OR "ID"=62 OR "ID"=63 OR
"ID"=64 OR "ID"=65 OR "ID"=66 OR "ID"=67 OR "ID"=68 OR "ID"=69 OR
"ID"=70 OR "ID"=71 OR "ID"=72 OR "ID"=73 OR "ID"=74 OR "ID"=75 OR
"ID"=76 OR "ID"=77 OR "ID"=78 OR "ID"=79 OR "ID"=80 OR "ID"=81 OR
"ID"=82 OR "ID"=83 OR "ID"=84 OR "ID"=85 OR "ID"=86 OR "ID"=87 OR
"ID"=88 OR "ID"=89 OR "ID"=90 OR "ID"=91 OR "ID"=92 OR "ID"=93 OR
"ID"=94 OR "ID"=95 OR "ID"=96 OR "ID"=97 OR "ID"=98 OR "ID"=99 OR
"ID"=100)
I would question the whole approach. The client of the SP has to send 100,000 IDs. Where does the client get those IDs from? Sending such a large number of IDs as a parameter of the proc is going to cost significantly anyway.
If you create the table with a primary key:
CREATE TABLE my_test (id NUMBER,
CONSTRAINT PK PRIMARY KEY (id));
and go through the same SELECTs to run the query with the multiple IN values, followed by retrieving the execution plan via hash value, what you get is:
SELECT STATEMENT
INLIST ITERATOR
INDEX RANGE SCAN
This seems to imply that when you have an IN list and are using it with a PK column, Oracle keeps the list internally as an "INLIST", because that is more efficient to process than converting it to ORs, as happens in the case of an unindexed table.
I was using Oracle 10gR2 above.

Creating indexes for optimizing the execution of Stored Procedures

The WHERE clause of one of my queries looks like this:
and tbl0.Type = 'Alert'
AND (tbl0.AccessRights like '%'+TblCUG0.userGroup+'%'
or tbl0.AccessRights like 'All' )
AND (tbl0.ExpiryDate > CONVERT(varchar(8), GETDATE(), 1)
or tbl0.ExpiryDate is null)
order by tbl0.Priority,tbl0.PublishedDate desc, tbl0.Title asc
I would like to know which columns I can create indexes on, and which type of index will suit best. Also, I have heard that indexes don't work with LIKE and wildcards at the start of the pattern. So what should be the approach to optimizing the queries?
1 and tbl0.Type = 'Alert'
2 AND (tbl0.AccessRights like '%'+TblCUG0.userGroup+'%'
3 or tbl0.AccessRights like 'All' )
4 AND (tbl0.ExpiryDate > CONVERT(varchar(8), GETDATE(), 1)
5 or tbl0.ExpiryDate is null)
Most likely, you will not be able to use an index with a WHERE clause like this.
Line 1: you could create an index on tbl0.Type, but if you have many rows and few distinct values, SQL Server will just skip the index and table scan anyway. Also, having nothing to do with the index issue, a code/flag column like this is better as a fixed-width value: char(1), tinyint, etc., where "A" = alert or 1 = alert. I would name the column XyzType, where Xyz is what the type describes (DoctorType, CarType, etc.). I would create a new table XyzType, with a FK back to this column in tbl0; this new table would have two columns, XyzType (PK) and XyzDescription, where you expand out the name.
Line 2: are you combining multiple values into tbl0.AccessRights and trying to use LIKE to find values within it? If so, split this out into a separate table; then you can remove the LIKE and possibly add an index there.
Line 3: OR kills index usage. Imagine looking through the phone book for all names that are "Smith" or start with "G"; you can't just use the index. You may try splitting the query into a UNION or UNION ALL around the OR so an index can be used (one part looks for "Smith" and the other part looks for "G"). You have not provided enough of the query to determine whether this is possible in your case. You may need to use a derived table that contains this UNION so you can join it to the rest of your query.
Line 4: tbl0.ExpiryDate could benefit from an index, but the OR will kill its usage; see the Line 3 comment.
Line 5: you may try the OR/UNION trick discussed above (see the sketch below), or just not use NULL; put in a default like '01/01/3000' so you don't need the OR.
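A sketch of that OR/UNION rewrite for lines 4-5, simplified to compare the date directly (the column list is illustrative):

SELECT tbl0.Title, tbl0.Priority
FROM tbl0
WHERE tbl0.Type = 'Alert'
  AND tbl0.ExpiryDate > GETDATE()
UNION ALL
SELECT tbl0.Title, tbl0.Priority
FROM tbl0
WHERE tbl0.Type = 'Alert'
  AND tbl0.ExpiryDate IS NULL;

UNION ALL is preferable to UNION here because the two branches cannot overlap, so no duplicate-removal sort is needed.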
SQL Server's Database Tuning Advisor can suggest which indexes would optimize your query, including covering indexes that also include the columns you select but do not filter on. Just because you add an index doesn't mean the query optimizer will use it. Some indexes may cost more to use than others, so the optimizer will choose the best indexes using the underlying tables' statistics.
Off-hand, you could add all ordering and criteria columns to an index, but that would be useless if, for example, there are too few distinct Priority values to make it worth the storage.
You are right about LIKE and wildcards. An index is a B-tree, which means it can speed up searches for specific values or range queries. A wildcard at the beginning means the query has to touch all records to check whether they match the pattern. A wildcard at the end means the query only has to touch items that start with the substring before the wildcard, partially turning it into a range query that can benefit from an index.
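For example (assuming an index on tbl0.AccessRights; all names are illustrative):

-- Prefix pattern: can be answered with an index range seek
SELECT tbl0.Title FROM tbl0 WHERE tbl0.AccessRights LIKE 'All%';

-- Leading wildcard: every row must be examined
SELECT tbl0.Title FROM tbl0 WHERE tbl0.AccessRights LIKE '%Admin%';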