Does order of boolean statements make a performance difference in a MySQL query?

Suppose I want to query a table based on multiple WHERE conditions.
Would either of these statements be faster than the other?
SELECT *
FROM table
WHERE (line_type='section_intro' OR line_type='question')
AND (line_order BETWEEN 0 AND 12)
ORDER BY line_order;
...or:
SELECT *
FROM table
WHERE (line_order BETWEEN 0 AND 12)
AND (line_type='section_intro' OR line_type='question')
ORDER BY line_order;
I guess what it would come down to is whether the first one would select more than 12 records, and then pare down from there.

No, the order does not matter. The query optimizer is going to estimate all conditions separately and decide on the best order based on which indexes are applicable, the size of the targeted selection, and so on.

It depends on your indexes. If you have a multi-column index on (line_type, line_order), the first query is faster. If you have an index on (line_order, line_type), the second one is faster. This is because for multi-column primary keys, MySQL can only do the comparisons in order. Otherwise, there is no difference.
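The two answers above disagree, and the simplest way to settle it for a given schema is to compare the plans MySQL reports for each ordering. The following is only a sketch: the table name survey_lines is a hypothetical stand-in for the question's placeholder "table" (which is a reserved word), and the result depends on whatever indexes actually exist.

```sql
-- Hypothetical table name: survey_lines (assumption, not from the question).
-- Run both statements and compare the "key", "type" and "rows" columns.
EXPLAIN
SELECT *
FROM survey_lines
WHERE (line_type = 'section_intro' OR line_type = 'question')
  AND (line_order BETWEEN 0 AND 12)
ORDER BY line_order;

EXPLAIN
SELECT *
FROM survey_lines
WHERE (line_order BETWEEN 0 AND 12)
  AND (line_type = 'section_intro' OR line_type = 'question')
ORDER BY line_order;
```

If both statements report the same key and access type, the order of the conditions made no difference to the chosen plan on that schema.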

Related

SQL: index on "order by" column when a query also has a lot of "where" predicates

Suppose I have this sql query:
select * from my_table
where col1 = 'abc' and col2 = 'qwe' and ... --e.g. 10 predicates or more
order by my_date desc
Will an index on only the my_date column even be used by the DB? Will it improve performance somehow?
I'm more interested in Postgres.
The PostgreSQL optimizer will use the index if it thinks that that is cheaper than fetching the rows that match the WHERE condition and sorting them.
This will probably be the case if:
there are many such rows, and sorting would be more expensive than the index scan
there are no indexes to support the WHERE condition
Without a LIMIT, the chances of using the single-column index to provide order here are pretty low. Indeed, I can't contrive a situation to do so without monkeying around with enable_sort or enable_seqscan.
Even with a LIMIT, after applying 10 equality conditions it will be pretty unusual for the expected number of rows left over to be high enough to make the index appear to be worthwhile.
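To see which way the planner goes on real data, one option is to create the single-column index and inspect the plan with and without a LIMIT. This is a minimal sketch for PostgreSQL: the index name is hypothetical, only two of the question's predicates are written out, and the LIMIT of 10 is an arbitrary value chosen for illustration.

```sql
-- Hypothetical index name (assumption).
CREATE INDEX my_table_my_date_idx ON my_table (my_date);

-- Without LIMIT: look for a Sort node versus an Index Scan on my_date.
EXPLAIN
SELECT * FROM my_table
WHERE col1 = 'abc' AND col2 = 'qwe'
ORDER BY my_date DESC;

-- With LIMIT: the planner may prefer walking the index in order.
EXPLAIN
SELECT * FROM my_table
WHERE col1 = 'abc' AND col2 = 'qwe'
ORDER BY my_date DESC
LIMIT 10;

-- For experimentation only: discourage explicit sorts in this session,
-- then restore the default afterwards.
SET enable_sort = off;
EXPLAIN
SELECT * FROM my_table
WHERE col1 = 'abc' AND col2 = 'qwe'
ORDER BY my_date DESC;
RESET enable_sort;
```

The enable_sort setting is meant for diagnosing planner choices, not as a permanent tuning knob.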

Index in query plan is skipped when using OR condition in Postgres

Say, I have a table my_table with field kind:string and an index on this field.
I've noticed that Postgres builds two different query plans for the queries:
SELECT * FROM my_table
WHERE kind = 'kind1' OR kind IS NULL;
and
SELECT * FROM my_table
WHERE kind = 'kind1';
The first one does not use index whereas the second one does. Why?
I know there are a lot of conditions why indexes may be used or not, and I've read a lot about query plans but this case still is not clear to me.
Abelisto explains that the two versions of the query are not the same. SQL engines (in general) can do a poor job of using indexes for ORs. It is possible that there are so many NULL values, that Postgres simply does not think an index is useful when comparing to NULLs. That depends on the data.
You can try rewriting the query as:
SELECT *
FROM my_table
WHERE kind = 'kind1'
UNION ALL
SELECT *
FROM my_table
WHERE kind IS NULL;
Postgres might choose to use indexes on each subquery, if they are appropriate for the data.
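Whether the NULL explanation applies can be checked rather than guessed. The sketch below assumes PostgreSQL and that the table has been analyzed recently; it reads the planner's own estimate of the NULL fraction for the kind column and then compares the plans of the two query forms.

```sql
-- Estimated fraction of NULLs in my_table.kind, from planner statistics.
SELECT null_frac
FROM pg_stats
WHERE tablename = 'my_table' AND attname = 'kind';

-- Plan for the original OR form.
EXPLAIN
SELECT * FROM my_table
WHERE kind = 'kind1' OR kind IS NULL;

-- Plan for the UNION ALL rewrite.
EXPLAIN
SELECT * FROM my_table WHERE kind = 'kind1'
UNION ALL
SELECT * FROM my_table WHERE kind IS NULL;
```

A large null_frac would be consistent with the planner judging a sequential scan cheaper than index access for the OR form.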

Does DISTINCT perform a full table scan with multiple expressions?

I have a DISTINCT clause to remove the duplicate values.
What is the performance if there are multiple expressions?
For example:
SELECT DISTINCT city, state
FROM customers
WHERE total_orders > 10
ORDER BY city;
Will this perform a full table scan?
The DBMS performs a full table scan when it thinks it appropriate.
In your example, when the DBMS thinks that with total_orders > 10 it will only get very few rows and there is an index on that column, it will use that index to access the table records. In a second step it will apply DISTINCT and then sort (or sort on-the-fly when making rows distinct). If, however, the DBMS thinks it will get too many records with total_orders > 10, it may decide on a full table scan (and then apply DISTINCT and ORDER BY). So whatever the situation, DISTINCT doesn't change anything.
In case you have an index on total_orders + city + state, the DBMS may decide not to access the table at all, because all data exists in the index and even in the order needed. The DBMS would do the same without DISTINCT, however.
In case you have an index on state + total_orders + city (i.e. wrong order; the WHERE clause cannot be directly applied), the DBMS may still decide to read the index only, but it is less likely. And again: the DBMS would do the same without DISTINCT.
And if you have no index, the DBMS must do a full table scan of course, because there is no index to circumvent it. Well, I guess that was needless to say :-)
Will this perform a full table scan?
Check the EXPLAIN PLAN.
EXPLAIN PLAN FOR your_query;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
It is up to the optimizer to decide the optimal plan for execution of the query. Since you do not have an index on the column used in the filter predicate, it has no other option than an FTS (Full Table Scan).
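Putting the two answers together, a rough way to test the covering-index idea is to create the composite index and re-check the plan. This sketch uses Oracle syntax to match the EXPLAIN PLAN example above; the index name is hypothetical.

```sql
-- Hypothetical index name (assumption). Column order follows the
-- "total_orders + city + state" suggestion from the first answer.
CREATE INDEX ix_customers_orders_city_state
    ON customers (total_orders, city, state);

EXPLAIN PLAN FOR
SELECT DISTINCT city, state
FROM customers
WHERE total_orders > 10
ORDER BY city;

-- Look for TABLE ACCESS FULL versus an index operation in the output.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Running the same EXPLAIN PLAN with DISTINCT removed is a quick way to confirm the point that DISTINCT itself does not change the access path.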

Avoiding a Full Table Scan in MySQL

How can I avoid a full table scan on mysql?
In general, by making sure you have a usable index on fields that appear in WHERE, JOIN and ORDER BY clauses.
Index your data.
Write queries that use those indexes.
For anything more than that, we need specifics.
Also note that sometimes you just cannot get rid of a full table scan, e.g. when you need all the rows from your table, or when the cost of scanning the index is greater than the cost of scanning the full table.
Use a LIMIT clause when you know how many rows you expect to return. For example, if you are looking for a record with a known, unique ID field, limit your SELECT to 1; that way MySQL will stop searching after it finds the first record. The same goes for updates and deletes.
SELECT * FROM `yourTable` WHERE `idField` = 123 LIMIT 1
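As a small illustration of the "index your data, then check that queries use it" advice, the sketch below adds an index on the lookup column and asks MySQL for the plan. The index name is hypothetical, and if idField is already the primary key or has a unique index, the extra index is unnecessary.

```sql
-- Hypothetical index name (assumption); skip if idField is already indexed.
CREATE INDEX idx_yourTable_idField ON `yourTable` (`idField`);

-- In the EXPLAIN output, type = ALL means a full table scan;
-- the key column shows which index, if any, was chosen.
EXPLAIN
SELECT * FROM `yourTable` WHERE `idField` = 123 LIMIT 1;
```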

Creating Indexes for Group By Fields?

Do you need to create an index for fields of group by fields in an Oracle database?
For example:
select *
from some_table
where field_one is not null and field_two = ?
group by field_three, field_four, field_five
I was testing the indexes I created for the above and the only relevant index for this query is an index created for field_two. Other single-field or composite indexes created on any of the other fields will not be used for the above query. Does this sound correct?
It could be correct, but that would depend on how much data you have. Typically I would create an index for the columns I was using in a GROUP BY, but in your case the optimizer may have decided that, after using the field_two index, there wouldn't be enough data returned to justify using the other index for the GROUP BY.
No, this can be incorrect.
If you have a large table, Oracle can prefer deriving the fields from the indexes rather than from the table, even if there is no single index that covers all values.
In the latest article in my blog, "NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL: Oracle", there is a query in which Oracle does not use a full table scan but rather joins two indexes to get the column values:
SELECT l.id, l.value
FROM t_left l
WHERE NOT EXISTS
(
SELECT value
FROM t_right r
WHERE r.value = l.value
)
The plan is:
SELECT STATEMENT
HASH JOIN ANTI
VIEW , 20090917_anti.index$_join$_001
HASH JOIN
INDEX FAST FULL SCAN, 20090917_anti.PK_LEFT_ID
INDEX FAST FULL SCAN, 20090917_anti.IX_LEFT_VALUE
INDEX FAST FULL SCAN, 20090917_anti.IX_RIGHT_VALUE
As you can see, there is no TABLE SCAN on t_left here.
Instead, Oracle takes the indexes on id and value, joins them on rowid and gets the (id, value) pairs from the join result.
Now, to your query:
SELECT *
FROM some_table
WHERE field_one is not null and field_two = ?
GROUP BY
field_three, field_four, field_five
First, it will not compile, since you are selecting * from a table with a GROUP BY clause.
You need to replace * with expressions based on the grouping columns and aggregates of the non-grouping columns.
You will most probably benefit from the following index:
CREATE INDEX ix_sometable_23451 ON some_table (field_two, field_three, field_four, field_five, field_one)
This index will contain everything needed for filtering on field_two, sorting on field_three, field_four, field_five (useful for GROUP BY), and making sure that field_one is NOT NULL.
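For reference, a version of the query that would compile might look like the sketch below. The COUNT(*) aggregate and the :field_two bind variable are assumptions added for illustration; the original question does not say which aggregates it needs.

```sql
-- Only grouping columns and aggregates may appear in the select list.
-- COUNT(*) and the :field_two bind variable are illustrative assumptions.
SELECT field_three, field_four, field_five, COUNT(*) AS row_count
FROM some_table
WHERE field_one IS NOT NULL
  AND field_two = :field_two
GROUP BY field_three, field_four, field_five;
```

With this shape, every column the query touches is present in the suggested index, so it is at least possible for Oracle to answer it from the index alone. Whether it does so is up to the optimizer and can be checked with EXPLAIN PLAN.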
Do you need to create an index for fields of group by fields in an Oracle database?
No. You don't need to, in the sense that a query will run irrespective of whether any indexes exist or not. Indexes are provided to improve query performance.
It can, however, help; but I'd hesitate to add an index just to help one query, without thinking about the possible impact of the new index on the database.
...the only relevant index for this query is an index created for field_two. Other single-field or composite indexes created on any of the other fields will not be used for the above query. Does this sound correct?
Not always. Often a GROUP BY will require Oracle to perform a sort (but not always); and you can eliminate the sort operation by providing a suitable index on the column(s) to be sorted.
Whether you actually need to worry about the GROUP BY performance, however, is an important question for you to think about.