I have a table in PosgreSQL 9.1.9. There's a schema:
CREATE TABLE chpl_text
(
id integer NOT NULL DEFAULT nextval('chpl_text_id_seq1'::regclass),
page_id bigint NOT NULL,
page_idx integer NOT NULL,
...
);
I have around 40000000 (40M) rows in this table.
Now, there's a query:
SELECT
...
FROM chpl_text
ORDER BY id
LIMIT 100000
OFFSET N
Everything runs smoothly for N <= 5300000. Execution plan looks like this
Limit (cost=12743731.26..12984179.02 rows=100000 width=52)
-> Index Scan using chpl_text_pkey on chpl_text t (cost=0.00..96857560.86 rows=40282164 width=52)
But for N >= 5400000 it magically changes into
Limit (cost=13042543.16..13042793.16 rows=100000 width=52)
-> Sort (cost=13029043.16..13129748.57 rows=40282164 width=52)
Sort Key: id
-> Seq Scan on chpl_text t (cost=0.00..1056505.64 rows=40282164 width=52)
Resulting in very long runtime.
How can I prevent postresql from changeing query plan for higher offsets?
Note: I am aware, that big offsets are not good at all, but I am forced to use them here.
If Postgres is configured halfway decently, your statistics are up to date (ANALYZE or autovacuum) and the cost settings are sane, Postgres generally knows better when to do an index scan or a sequential scan. Details and links:
Keep PostgreSQL from sometimes choosing a bad query plan
To actually test performance without sequential scan, "disable" it (in a debug session only!)
SET enable_seqscan=OFF;
More in the manual.
Then run EXPLAIN ANALYZE again ...
Also, the release of Postgres 9.2 had a focus on "big data". With your given use case you should urgently consider upgrading to the current release.
You can also try this alternative query with a CTE and row_number() and see if the query plan turns out more favorable:
WITH cte AS (
SELECT ..., row_number() OVER (ORDER BY id) AS rn
FROM chpl_text
)
SELECT ...
FROM cte
WHERE rn BETWEEN N+1 AND N+100000
ORDER BY id;
That's not always the case, but might be in your special situation.
Related
I have following table:
create table if not exists inventory
(
expired_at timestamp(0),
-- ...
);
create index if not exists inventory_expired_at_index
on inventory (expired_at);
However when I run following query:
EXPLAIN UPDATE "inventory" SET "status" = 'expired' WHERE "expired_at" < '2020-12-08 12:05:00';
I get next execution plan:
Update on inventory (cost=0.00..4.09 rows=2 width=126)
-> Seq Scan on inventory (cost=0.00..4.09 rows=2 width=126)
Filter: (expired_at < '2020-12-08 12:05:00'::timestamp without time zone)
Same happens for big dataset:
EXPLAIN SELECT * FROM "inventory" WHERE "expired_at" < '2020-12-08 12:05:00';
-[ RECORD 1 ]---------------------------------------------------------------------------
QUERY PLAN | Seq Scan on inventory (cost=0.00..58616.63 rows=1281058 width=71)
-[ RECORD 2 ]---------------------------------------------------------------------------
QUERY PLAN | Filter: (expired_at < '2020-12-08 12:05:00'::timestamp without time zone)
The question is: why not Index Scan but Seq Scan?
This is a bit long for a comment.
The short answer is that you have two rows in the table, so it doesn't make a difference.
The longer answer is that your are using an update, so the data rows have to be retrieved anyway. Using an index requires loading both the index and the data rows and then indirecting from the index to the data rows. It is a little more complicated. And with two rows, not worth the effort at all.
The power of indexes is to handle large amounts of data, not small amounts of data.
To respond to the large question: Database optimizers are not required to use an index. They use some sort of measures (often cost-based optimization) to determine whether or not an index is appropriate. In your larger example, the optimizer has determined that the index is not appropriate. This could happen if the statistics are out-of-synch with the underlying data.
In my postgreSQL database I have a table named "product". In this table I have a column named "date_touched" with type timestamp. I created a simple btree index on this column. This is the schema of my table (I omitted irrelevant column & index definitions):
Table "public.product"
Column | Type | Modifiers
---------------------------+--------------------------+-------------------
id | integer | not null default nextval('product_id_seq'::regclass)
date_touched | timestamp with time zone | not null
Indexes:
"product_pkey" PRIMARY KEY, btree (id)
"product_date_touched_59b16cfb121e9f06_uniq" btree (date_touched)
The table has ~300,000 rows and I want to get the n-th element from the table ordered by "date_touched". when I want to get the 1000th element, it takes 0.2s, but when I want to get the 100,000th element, it takes about 6s. My question is, why does it take too much time to retrieve the 100,000th element, although I've defined a btree index?
Here is my query with explain analyze that shows postgreSQL does not use the btree index and instead sorts all rows to find the 100,000th element:
first query (100th element):
explain analyze
SELECT product.id
FROM product
ORDER BY product.date_touched ASC
LIMIT 1
OFFSET 1000;
QUERY PLAN
-----------------------------------------------------------------------------------------------------
Limit (cost=3035.26..3038.29 rows=1 width=12) (actual time=160.208..160.209 rows=1 loops=1)
-> Index Scan using product_date_touched_59b16cfb121e9f06_uniq on product (cost=0.42..1000880.59 rows=329797 width=12) (actual time=16.651..159.766 rows=1001 loops=1)
Total runtime: 160.395 ms
second query (100,000th element):
explain analyze
SELECT product.id
FROM product
ORDER BY product.date_touched ASC
LIMIT 1
OFFSET 100000;
QUERY PLAN
------------------------------------------------------------------------------------------------------
Limit (cost=106392.87..106392.88 rows=1 width=12) (actual time=6621.947..6621.950 rows=1 loops=1)
-> Sort (cost=106142.87..106967.37 rows=329797 width=12) (actual time=6381.174..6568.802 rows=100001 loops=1)
Sort Key: date_touched
Sort Method: external merge Disk: 8376kB
-> Seq Scan on product (cost=0.00..64637.97 rows=329797 width=12) (actual time=1.357..4184.115 rows=329613 loops=1)
Total runtime: 6629.903 ms
It is a very good thing, that SeqScan is used here. Your OFFSET 100000 is not a good thing for the IndexScan.
A bit of theory
Btree indexes contain 2 structures inside:
balanced tree and
double-linked list of keys.
First structure allows for fast keys lookups, second is responsible for the ordering. For bigger tables, linked list cannot fit into a single page and therefore it is a list of linked pages, where each page's entries maintain ordering, specified during index creation.
It is wrong to think, though, that such pages are sitting together on the disk. In fact, it is more probable that those are spread across different locations. And in order to read pages based on the index's order, system has to perform random disk reads. Random disk IO is expensive, compared to sequential access. Therefore good optimizer will prefer a SeqScan instead.
I highly recommend “SQL Performance Explained” book to better understand indexes. It is also available on-line.
What is going on?
Your OFFSET clause would cause database to read index's linked list of keys (causing lots of random disk reads) and than discarding all those results, till you hit the wanted offset. And it is good, in fact, that Postgres decided to use SeqScan + Sort here — this should be faster.
You can check this assumption by:
running EXPLAIN (analyze, buffers) of your big-OFFSET query
than do SET enable_seqscan TO 'off';
and run EXPLAIN (analyze, buffers) again, comparing the results.
In general, it is better to avoid OFFSET, as DBMSes not always pick the right approach here. (BTW, which version of PostgreSQL you're using?)
Here's a comparison of how it performs for different offset values.
EDIT: In order to avoid OFFSET one would have to base pagination on the real data, that exists in the table and is a part of the index. For this particular case, the following might be possible:
show first N (say, 20) elements
include maximal date_touched that is shown on the page to all the “Next” links. You can compute this value on the application side. Do similar for the “Previous” links, except include minimal date_touch for these.
on the server side you will get the limiting value. Therefore, say for the “Next” case, you can do a query like this:
SELECT id
FROM product
WHERE date_touched > $max_date_seen_on_the_page
ORDER BY date_touched ASC
LIMIT 20;
This query makes best use of the index.
Of course, you can adjust this example to your needs. I used pagination as it is a typical case for the OFFSET.
One more note — querying 1 row many times, increasing offset for each query by 1, will be much more time consuming, than doing a single batch query that returns all those records, which are then iterated from on the application side.
Recently i upgraded Postgresql from 9.1 to 9.2 version. New planner uses wrong index and query executes too long.
Query:
explain SELECT mentions.* FROM mentions WHERE (searches_id = 7646553) ORDER BY id ASC LIMIT 1000
Explain in 9.1 version:
Limit (cost=5762.99..5765.49 rows=1000 width=184)
-> Sort (cost=5762.99..5842.38 rows=31755 width=184)
Sort Key: id
-> Index Scan using mentions_searches_id_idx on mentions (cost=0.00..4021.90 rows=31755 width=184)
Index Cond: (searches_id = 7646553)
Expain in 9.2 version:
Limit (cost=0.00..450245.54 rows=1000 width=244)
-> Index Scan using mentions_pk on mentions (cost=0.00..110469543.02 rows=245354 width=244
Index Cond: (id > 0)"
Filter: (searches_id = 7646553)
The correct approach is in 9.1 version, where planner uses index on searches_id. In 9.2 version planner doesn't not uses that index and filter rows by searches_id.
When i execute on 9.2 version query without ORDER BY id, planner uses index on searches_id, but i need to order by id.
I also tried to select rows in subquery and order it in second query, but explain shows that, the planner do the same like in normal query.
select * from (
SELECT mentions.* FROM mentions WHERE (searches_id = 7646553))
AS q1
order by id asc
What would you recommend?
If searches_id #7646553 rows are more than a few percent of the table then the index on that column will not be used as a table scan would be faster. Do a
select count(*) from mentions where searches_id = 7646553
and compare to the total rows.
If they are less than a few percent of the table then try
with m as (
SELECT *
FROM mentions
WHERE searches_id = 7646553
)
select *
from m
order by id asc
(From PostgreSQL v12 on, you have to use with ... as materialized.)
Or create a composite index:
create index index_name on mentions (searches_id, id)
If searches_id has low cardinality then create the same index in the opposite order
create index index_name on mentions (id, searches_id)
Do
analyze mentions
After creating an index.
For me, I had indexes but they were all based on 3 columns, and I wasn't calling out one of the columns in the indexes, so it was doing seq scan across the entire thing. Possible fix: more indexes but that use fewer columns (and/or switch column order).
Another problem we saw was that we had the right index, but apparently it was an "invalid" (poorly created CONCURRENT?) index. So dropping it and creating it (or reindexing it) and it started using it.
What are the available options to identify and remove the invalid objects in Postgres (ex: corrupted indexes)
See also http://www.postgresql.org/docs/8.4/static/indexes-multicolumn.html
My query:
DROP TABLE IF EXISTS tmp;
CREATE TEMP TABLE tmp AS SELECT *, ST_BUFFER(the_geom::GEOGRAPHY, 3000)::GEOMETRY AS buffer FROM af_modis_master LIMIT 20000;
CREATE INDEX idx_tmp_the_geom ON tmp USING gist(buffer);
EXPLAIN SELECT (DUMP(ST_UNION(buffer))).path[1], (DUMP(ST_UNION(buffer))).geom FROM tmp;
Output from EXPLAIN:
Aggregate (cost=1705.52..1705.54 rows=1 width=32)
-> Seq Scan on tmp (cost=0.00..1625.01 rows=16101 width=32)
Seq Scan means it is not using the index, right? Why not?
(This question was first posted here: https://gis.stackexchange.com/questions/51877/postgis-query-not-using-gist-index-when-doing-a-st-dumpst-union . Apologies for reposting but the community here is much more active, so perhaps wil provide an answer quicker.)
UPDATE: Even adding a where clause that filters based on the buffer causes a Seq Scan:
ANALYZE tmp;
EXPLAIN SELECT (DUMP(ST_UNION(buffer))).path[1], (DUMP(ST_UNION(buffer))).geom FROM tmp WHERE ST_XMIN(buffer) = 0.0;
A query like you have will never use an index ever. To do so would substitute significant random disk I/O (possibly even in addition to normal disk I/O) for the scan of the table.
In essence you are not selecting on criteria so an index will be slower than just pulling the data from disk in physical order and processing it.
Now if you pull only a single row with a where condition your index might help with, then you may find it may use the index, or not, depending on how big the table is. Very small tables will never use indexes because the extra random disk I/O is never a win. Remember no query plan beats a sequential scan through a single page....
I've got a table with around 20 million rows. For arguments sake, lets say there are two columns in the table - an id and a timestamp. I'm trying to get a count of the number of items per day. Here's what I have at the moment.
SELECT DATE(timestamp) AS day, COUNT(*)
FROM actions
WHERE DATE(timestamp) >= '20100101'
AND DATE(timestamp) < '20110101'
GROUP BY day;
Without any indices, this takes about a 30s to run on my machine. Here's the explain analyze output:
GroupAggregate (cost=675462.78..676813.42 rows=46532 width=8) (actual time=24467.404..32417.643 rows=346 loops=1)
-> Sort (cost=675462.78..675680.34 rows=87021 width=8) (actual time=24466.730..29071.438 rows=17321121 loops=1)
Sort Key: (date("timestamp"))
Sort Method: external merge Disk: 372496kB
-> Seq Scan on actions (cost=0.00..667133.11 rows=87021 width=8) (actual time=1.981..12368.186 rows=17321121 loops=1)
Filter: ((date("timestamp") >= '2010-01-01'::date) AND (date("timestamp") < '2011-01-01'::date))
Total runtime: 32447.762 ms
Since I'm seeing a sequential scan, I tried to index on the date aggregate
CREATE INDEX ON actions (DATE(timestamp));
Which cuts the speed by about 50%.
HashAggregate (cost=796710.64..796716.19 rows=370 width=8) (actual time=17038.503..17038.590 rows=346 loops=1)
-> Seq Scan on actions (cost=0.00..710202.27 rows=17301674 width=8) (actual time=1.745..12080.877 rows=17321121 loops=1)
Filter: ((date("timestamp") >= '2010-01-01'::date) AND (date("timestamp") < '2011-01-01'::date))
Total runtime: 17038.663 ms
I'm new to this whole query-optimization business, and I have no idea what to do next. Any clues how I could get this query running faster?
--edit--
It looks like I'm hitting the limits of indices. This is pretty much the only query that gets run on this table (though the values of the dates change). Is there a way to partition up the table? Or create a cache table with all the count values? Or any other options?
Is there a way to partition up the table?
Yes:
http://www.postgresql.org/docs/current/static/ddl-partitioning.html
Or create a cache table with all the count values? Or any other options?
Create a "cache" table certainly is possible. But this depends on how often you need that result and how accurate it needs to be.
CREATE TABLE action_report
AS
SELECT DATE(timestamp) AS day, COUNT(*)
FROM actions
WHERE DATE(timestamp) >= '20100101'
AND DATE(timestamp) < '20110101'
GROUP BY day;
Then a SELECT * FROM action_report will give you what you want in a timely manner. You would then schedule a cron job to recreate that table on a regular basis.
This approach of course won't help if the time range changes with every query or if that query is only run once a day.
In general most databases will ignore indexes if the expected number of rows returned is going to be high. This is because for each index hit, it will need to then find the row as well, so it's faster to just do a full table scan. This number is between 10,000 and 100,000. You can experiment with this by shrinking the date range and seeing where postgres flips to using the index. In this case, postgres is planning to scan 17,301,674 rows, so your table is pretty large. If you make it really small and you still feel like postgres is making the wrong choice then try running an analyze on the table so that postgres gets its approximations right.
It looks like the range covers just about covers all the data available.
This could be a design issue. If you will be running this often, you are better off creating an additional column timestamp_date that contains only the date. Then create an index on that column, and change the query accordingly. The column should be maintained by insert+update triggers.
SELECT timestamp_date AS day, COUNT(*)
FROM actions
WHERE timestamp_date >= '20100101'
AND timestamp_date < '20110101'
GROUP BY day;
If I am wrong about the number of rows the date range will find (and it is only a small subset), then you can try an index on just the timestamp column itself, applying the WHERE clause to just the column (which given the range works just as well)
SELECT DATE(timestamp) AS day, COUNT(*)
FROM actions
WHERE timestamp >= '20100101'
AND timestamp < '20110101'
GROUP BY day;
Try running explain analyze verbose ... to see if the aggregate is using a temp file. Perhaps increase work_mem to allow more to be done in memory?
Set work_mem to say 2GB and see if that changes the plan. If it doesn't, you might be out of options.
What you really want for such DSS type queries is a date table that describes days. In database design lingo it's called a date dimension. To populate such table you can use the code I posted in this article: http://www.mockbites.com/articles/tech/data_mart_temporal
Then in each row in your actions table put the appropriate date_key.
Your query then becomes:
SELECT
d.full_date, COUNT(*)
FROM actions a
JOIN date_dimension d
ON a.date_key = d.date_key
WHERE d.full_date = '2010/01/01'
GROUP BY d.full_date
Assuming indices on the keys and full_date, this will be super fast because it operates on INT4 keys!
Another benefit is that you can now slice and dice by any other date_dimension column(s).