Setup
A table with one jsonb column, attributes, and a non-unique numeric ID, campaignid:
CREATE TABLE coupons (
id integer NOT NULL,
created timestamp with time zone DEFAULT now() NOT NULL,
campaignid bigint NOT NULL,
attributes jsonb NOT NULL
);
This table would have up to 500M rows, arbitrary key/values in attributes and hundreds of different campaignid values.
Two indexes exist on the table:
CREATE INDEX campaignid_attrs_idx ON coupons
USING gin (campaignid,attributes);
CREATE INDEX campaignid_idx ON coupons
USING btree (campaignid, deleted);
What I did
I executed the query:
SELECT COUNT(*)
FROM coupons
WHERE
(campaignid = 97 AND
attributes #> '{"CountryId": 3}');
Expected Results
I expected the index campaignid_attrs_idx on (campaignid,attributes) to be fully used and the query to complete quite fast.
Actual Result
The query took a long time (~40 seconds) to execute.
Here's the output from explain (ANALYZE, COSTS):
Aggregate (cost=32337.78..32337.79 rows=1 width=8) (actual time=39726.410..39726.414 rows=1 loops=1)
-> Bitmap Heap Scan on coupons (cost=30164.40..32332.44 rows=2136 width=0) (actual time=16893.439..39549.891 rows=1088478 loops=1)
" Recheck Cond: ((attributes #> '{""CountryId"": 3}'::jsonb) AND (campaignid = 97))"
Rows Removed by Index Recheck: 10531586
Heap Blocks: exact=138344 lossy=583282
-> BitmapAnd (cost=30164.40..30164.40 rows=2136 width=0) (actual time=16837.885..16837.887 rows=0 loops=1)
-> Bitmap Index Scan on coupons_campaignid_attrs_index (cost=0.00..1465.15 rows=178954 width=0) (actual time=9872.279..9872.279 rows=81799565 loops=1)
" Index Cond: (attributes #> '{""CountryId"": 3}'::jsonb)"
-> Bitmap Index Scan on campaignid_idx (cost=0.00..28697.93 rows=2135515 width=0) (actual time=6454.972..6454.972 rows=3088167 loops=1)
Index Cond: (campaignid = 97)
Planning Time: 0.175 ms
Execution Time: 39726.480 ms
Conclusions
It seems the index campaignid_attrs_idx was used for the first part of the WHERE clause, attributes #> '{"CountryId": 3}', returning ~80M rows, while the index campaignid_idx was used for the second part, campaignid = 97, returning ~3M rows. The bitmaps from both scans were then ANDed to arrive at the set that fulfills both conditions. Finally, a Bitmap Heap Scan verified that the result set actually complies with the conditions, and that recheck took most of the time (16893.439..39549.891).
My main question: why wasn't campaignid_attrs_idx used to filter both conditions?
EDIT: I removed the second index, campaignid_idx, to see whether the multicolumn index would then be used for both conditions. Strangely, only one of the conditions is still used in the index scan. Here's the plan:
Aggregate (cost=181951.27..181951.28 rows=1 width=8) (actual time=209633.017..209633.018 rows=1 loops=1)
-> Bitmap Heap Scan on coupons (cost=1424.30..181945.81 rows=2183 width=0) (actual time=8938.605..209401.433 rows=1091580 loops=1)
" Recheck Cond: (attributes #> '{""CountryId"": 3}'::jsonb)"
Rows Removed by Index Recheck: 31487517
Filter: (campaignid = 97)
Rows Removed by Filter: 80674951
Heap Blocks: exact=121875 lossy=5572599
-> Bitmap Index Scan on coupons_campaignid_attributes_idx (cost=0.00..1423.75 rows=179434 width=0) (actual time=8908.682..8908.682 rows=81802589 loops=1)
" Index Cond: (attributes #> '{""CountryId"": 3}'::jsonb)"
Planning Time: 6.885 ms
Execution Time: 209638.234 ms
Related
I'm trying to write some queries to implement cursor pagination (something like this: https://shopify.engineering/pagination-relative-cursors) on Postgres. In my implementation I'm trying to achieve efficient pagination even when ordering by non-unique columns.
I'm struggling to do that efficiently, in particular with the query that retrieves the previous page given a specific cursor.
The table I'm using to test these queries (>3M records) is very simple and has this structure:
CREATE TABLE "placemarks" (
"id" serial NOT NULL DEFAULT,
"assetId" text,
"createdAt" timestamptz,
PRIMARY KEY ("id")
);
I have an index on the id field, of course, and also an index on the assetId column.
This is the query I'm using for retrieving the next page given a cursor composed by the latest ID and the latest assetId:
SELECT
*
FROM
"placemarks"
WHERE
"assetId" > 'CURSOR_ASSETID'
or("assetId" = 'CURSOR_ASSETID'
AND id > CURSOR_INT_ID)
ORDER BY
"assetId",
id
LIMIT 5;
This query is actually pretty fast: it uses the indexes, and by including the unique id field in the cursor it also handles duplicate values of assetId without skipping rows that share the same CURSOR_ASSETID value.
-> Sort (cost=25709.62..25726.63 rows=6803 width=2324) (actual time=0.128..0.138 rows=5 loops=1)
" Sort Key: ""assetId"", id"
Sort Method: top-N heapsort Memory: 45kB
-> Bitmap Heap Scan on placemarks (cost=271.29..25596.63 rows=6803 width=2324) (actual time=0.039..0.088 rows=11 loops=1)
" Recheck Cond: (((""assetId"")::text > 'CURSOR_ASSETID'::text) OR ((""assetId"")::text = 'CURSOR_ASSETID'::text))"
" Filter: (((""assetId"")::text > 'CURSOR_ASSETID'::text) OR (((""assetId"")::text = 'CURSOR_ASSETID'::text) AND (id > CURSOR_INT_ID)))"
Rows Removed by Filter: 1
Heap Blocks: exact=10
-> BitmapOr (cost=271.29..271.29 rows=6803 width=0) (actual time=0.030..0.034 rows=0 loops=1)
" -> Bitmap Index Scan on ""placemarks_assetId_key"" (cost=0.00..263.45 rows=6802 width=0) (actual time=0.023..0.023 rows=11 loops=1)"
" Index Cond: ((""assetId"")::text > 'CURSOR_ASSETID'::text)"
" -> Bitmap Index Scan on ""placemarks_assetId_key"" (cost=0.00..4.44 rows=1 width=0) (actual time=0.005..0.005 rows=1 loops=1)"
" Index Cond: ((""assetId"")::text = 'CURSOR_ASSETID'::text)"
Planning time: 0.201 ms
Execution time: 0.194 ms
The issue arises when I try to get the same page with the query that should return the previous page:
SELECT
*
FROM
placemarks
WHERE
"assetId" < 'CURSOR_ASSETID'
or("assetId" = 'CURSOR_ASSETID'
AND id < CURSOR_INT_ID)
ORDER BY
"assetId" desc,
id desc
LIMIT 5;
With this query no index is used, even though using one would be much faster:
Limit (cost=933644.62..933644.63 rows=5 width=2324)
-> Sort (cost=933644.62..944647.42 rows=4401120 width=2324)
" Sort Key: ""assetId"" DESC, id DESC"
-> Seq Scan on placemarks (cost=0.00..860543.60 rows=4401120 width=2324)
" Filter: (((""assetId"")::text < 'CURSOR_ASSETID'::text) OR (((""assetId"")::text = 'CURSOR_ASSETID'::text) AND (id < CURSOR_INT_ID)))"
I've noticed that by forcing index usage with SET enable_seqscan = OFF; the query does use the indexes and performs much better and faster. The resulting query plan:
Limit (cost=12.53..12.54 rows=5 width=108) (actual time=0.532..0.555 rows=5 loops=1)
-> Sort (cost=12.53..12.55 rows=6 width=108) (actual time=0.524..0.537 rows=5 loops=1)
Sort Key: assetid DESC, id DESC
Sort Method: top-N heapsort Memory: 25kB
" -> Bitmap Heap Scan on ""placemarks"" (cost=8.33..12.45 rows=6 width=108) (actual time=0.274..0.340 rows=14 loops=1)"
" Recheck Cond: ((assetid < 'CURSOR_ASSETID'::text) OR (assetid = 'CURSOR_ASSETID'::text))"
" Filter: ((assetid < 'CURSOR_ASSETID'::text) OR ((assetid = 'CURSOR_ASSETID'::text) AND (id < 14)))"
Rows Removed by Filter: 1
Heap Blocks: exact=1
-> BitmapOr (cost=8.33..8.33 rows=7 width=0) (actual time=0.152..0.159 rows=0 loops=1)
" -> Bitmap Index Scan on ""placemarks_assetid_idx"" (cost=0.00..4.18 rows=6 width=0) (actual time=0.108..0.110 rows=12 loops=1)"
" Index Cond: (assetid < 'CURSOR_ASSETID'::text)"
" -> Bitmap Index Scan on ""placemarks_assetid_idx"" (cost=0.00..4.15 rows=1 width=0) (actual time=0.036..0.036 rows=3 loops=1)"
" Index Cond: (assetid = 'CURSOR_ASSETID'::text)"
Planning time: 1.319 ms
Execution time: 0.918 ms
Any clue how to optimize the second query so that it always uses the indexes?
Postgres DB version: 10.20
The fast performance of your first query seems to come down to where your constant 'CURSOR_ASSETID' happens to fall in the distribution of that column. Or maybe this luck is not luck, but is how it will always be?
For good performance more generally, including for reverse sorting, you need to write your query with a tuple comparator, not an OR comparator.
WHERE
("assetId",id) < ('something',500000)
If you are using a version before incremental sorting was introduced in v13, or if "assetId" can have a large number of ties, then you will need a multicolumn index on ("assetId",id) to get optimal performance.
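Putting both pieces together, here is a minimal sketch of the previous-page query, reusing the placemarks table and cursor placeholders from the question (the index name is my own, illustrative only):

-- Multicolumn index to support ordering by ("assetId", id) in both directions:
CREATE INDEX placemarks_assetid_id_idx ON placemarks ("assetId", id);

-- Previous page, using a row comparator instead of the OR construct:
SELECT *
FROM placemarks
WHERE ("assetId", id) < ('CURSOR_ASSETID', CURSOR_INT_ID)
ORDER BY "assetId" DESC, id DESC
LIMIT 5;

-- The next page is symmetric:
-- WHERE ("assetId", id) > ('CURSOR_ASSETID', CURSOR_INT_ID) ORDER BY "assetId", id LIMIT 5;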
And there is no reason to decorate the index with DESC, as PostgreSQL knows how to read the index backwards. Decorating the index is needed when the two columns have different ordering than each other, as then you would need to read the undecorated index "spirally" rather than either completely forward or completely backwards. (But that wouldn't work well here anyway, as tuple comparators can't have different orderings between the columns.)
I have a function that is running too slow. I've isolated the piece of the function that is slow: a small SELECT statement:
SELECT image_group_id
FROM programs.image_family fam
JOIN programs.provider_file pf
ON (fam.provider_data_id = pf.provider_data_id
AND fam.family_id = $1 AND pf.image_group_id IS NOT NULL)
LIMIT 1
When I run the function this piece of SQL generates the following query plan:
Query Text: SELECT image_group_id FROM programs.image_family fam JOIN programs.provider_file pf ON (fam.provider_data_id = pf.provider_data_id AND fam.family_id = $1 AND pf.image_group_id IS NOT NULL) LIMIT 1
Limit (cost=0.56..6.75 rows=1 width=6) (actual time=3471.004..3471.004 rows=0 loops=1)
-> Nested Loop (cost=0.56..594054.42 rows=96017 width=6) (actual time=3471.002..3471.002 rows=0 loops=1)
-> Seq Scan on image_family fam (cost=0.00..391880.08 rows=96023 width=6) (actual time=3471.001..3471.001 rows=0 loops=1)
Filter: ((family_id)::numeric = '8419853'::numeric)
Rows Removed by Filter: 19204671
-> Index Scan using "IX_DBO_PROVIDER_FILE_1" on provider_file pf (cost=0.56..2.11 rows=1 width=12) (never executed)
Index Cond: (provider_data_id = fam.provider_data_id)
Filter: (image_group_id IS NOT NULL)
When I run the selected query in a query tool (outside of the function) the query plan looks like this:
Limit (cost=1.12..3.81 rows=1 width=6) (actual time=0.043..0.043 rows=1 loops=1)
Output: pf.image_group_id
Buffers: shared hit=11
-> Nested Loop (cost=1.12..14.55 rows=5 width=6) (actual time=0.041..0.041 rows=1 loops=1)
Output: pf.image_group_id
Inner Unique: true
Buffers: shared hit=11
-> Index Only Scan using image_family_family_id_provider_data_id_idx on programs.image_family fam (cost=0.56..1.65 rows=5 width=6) (actual time=0.024..0.024 rows=1 loops=1)
Output: fam.family_id, fam.provider_data_id
Index Cond: (fam.family_id = 8419853)
Heap Fetches: 2
Buffers: shared hit=6
-> Index Scan using "IX_DBO_PROVIDER_FILE_1" on programs.provider_file pf (cost=0.56..2.58 rows=1 width=12) (actual time=0.013..0.013 rows=1 loops=1)
Output: pf.provider_data_id, pf.provider_file_path, pf.posted_dt, pf.file_repository_id, pf.restricted_size, pf.image_group_id, pf.is_master, pf.is_biggest
Index Cond: (pf.provider_data_id = fam.provider_data_id)
Filter: (pf.image_group_id IS NOT NULL)
Buffers: shared hit=5
Planning time: 0.809 ms
Execution time: 0.100 ms
If I disable sequence scans in the function I can get a similar query plan:
Query Text: SELECT image_group_id FROM programs.image_family fam JOIN programs.provider_file pf ON (fam.provider_data_id = pf.provider_data_id AND fam.family_id = $1 AND pf.image_group_id IS NOT NULL) LIMIT 1
Limit (cost=1.12..8.00 rows=1 width=6) (actual time=3855.722..3855.722 rows=0 loops=1)
-> Nested Loop (cost=1.12..660217.34 rows=96017 width=6) (actual time=3855.721..3855.721 rows=0 loops=1)
-> Index Only Scan using image_family_family_id_provider_data_id_idx on image_family fam (cost=0.56..458043.00 rows=96023 width=6) (actual time=3855.720..3855.720 rows=0 loops=1)
Filter: ((family_id)::numeric = '8419853'::numeric)
Rows Removed by Filter: 19204671
Heap Fetches: 368
-> Index Scan using "IX_DBO_PROVIDER_FILE_1" on provider_file pf (cost=0.56..2.11 rows=1 width=12) (never executed)
Index Cond: (provider_data_id = fam.provider_data_id)
Filter: (image_group_id IS NOT NULL)
The query plans differ in where the Filter is applied for the Index Only Scan. The function's plan has more Heap Fetches and seems to treat the argument as a string cast to numeric.
Things I've tried:
Increasing statistics (and running VACUUM/ANALYZE)
Calling the problematic piece of SQL from another function with LANGUAGE sql
Adding another index (the one it is now using to perform the Index Only Scan)
Creating a CTE for the image_family table (this did help performance, but it still did a sequential scan on image_family instead of using the index, so it was still too slow)
Changing from executing raw SQL to using EXECUTE ... INTO ... USING in the function
Makeup of the two tables:
image_family:
provider_data_id: numeric(16)
family_id: int4
(rest omitted for brevity)
unique index on provider_data_id
index on family_id
I recently added a unique index on (family_id, provider_data_id) as well
Approximately 20 million rows here. Families have many provider_data_ids but not all provider_data_ids are part of families and thus aren't all in this table.
provider_file:
provider_data_id numeric(16)
image_group_id numeric(16)
(rest omitted for brevity)
unique index on provider_data_id
Approximately 32 million rows in this table. Most rows (> 95%) have a Non-Null image_group_id.
Postgres Version 10
How can I get the query performance to match whether I call it from a function or as raw SQL in a query tool?
The problem is exhibited in this line:
Filter: ((family_id)::numeric = '8419853'::numeric)
The index on family_id cannot be used because family_id is compared to a numeric value. This requires a cast to numeric, and there is no index on family_id::numeric.
Even though integer and numeric both are types representing numbers, their internal representation is quite different, and so the indexes are incompatible. In other words, the cast to numeric is like a function for PostgreSQL, and since it has no index on that functional expression, it has to resort to a scan of the whole table (or index).
The solution is simple, however: use an integer instead of a numeric parameter for the query. If in doubt, use a cast like
fam.family_id = $1::integer
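For example, here is a hedged sketch of what the function could look like with an integer parameter (the function name and signature are my own illustration, not your original code):

-- Hypothetical wrapper; the point is only that p_family_id is integer, not numeric.
CREATE OR REPLACE FUNCTION get_image_group(p_family_id integer)
RETURNS numeric
LANGUAGE sql STABLE AS $$
    SELECT pf.image_group_id
    FROM programs.image_family fam
    JOIN programs.provider_file pf
      ON fam.provider_data_id = pf.provider_data_id
    WHERE fam.family_id = p_family_id      -- integer = integer, so the index is usable
      AND pf.image_group_id IS NOT NULL
    LIMIT 1;
$$;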
Recently we changed the format of one of our tables from using a single entry in a column to a JSONB column in the format of ["key1","key2","key3"] etc. Although we built a GIN index on the JSONB field, the queries we run against it are EXTREMELY slow (in the range of 50 minutes according to the explain plan). I am trying to find a way to optimize the query and correctly utilize the index. I pasted the query below as well as its explain plan. The indexed fields are visit.visitor, launch.campaign_key, launch.launch_key, visit.store_key, and the visit.stops JSONB field with a GIN index. We are using PostgreSQL 9.4.
explain (analyze on) select count(subselect.visitors) as visitors,
subselect.campaign as campaign
from (
select distinct visit.visitor as visitors,
launch.campaign_key as campaign
from visit
join launch on (jsonb_exists(visit.stops, launch.launch_key)) where
visit.store_key = 'ahBzfmdlYXJsYXVuY2gtaHVi'
and launch.state = 'PRODUCTION') as subselect group by subselect.campaign
Explain results:
HashAggregate (cost=63873548.47..63873550.47 rows=200 width=88) (actual time=248617.348..248617.365 rows=58 loops=1)
Group Key: launch.campaign_key
-> HashAggregate (cost=63519670.22..63661221.52 rows=14155130 width=88) (actual time=248587.320..248616.558 rows=1938 loops=1)
Group Key: visit.visitor, launch.campaign_key
-> HashAggregate (cost=63307343.27..63448894.57 rows=14155130 width=88) (actual time=248553.278..248584.868 rows=1938 loops=1)
Group Key: visit.visitor, launch.campaign_key
-> Nested Loop (cost=4903.09..56997885.96 rows=1261891461 width=88) (actual time=180648.410..248550.249 rows=2085 loops=1)
Join Filter: jsonb_exists(visit.stops, (launch.launch_key)::text)
Rows Removed by Join Filter: 624114512
-> Bitmap Heap Scan on launch (cost=3213.19..126084.38 rows=169389 width=123) (actual time=32.082..317.561 rows=166121 loops=1)
Recheck Cond: ((state)::text = 'PRODUCTION'::text)
Heap Blocks: exact=56635
-> Bitmap Index Scan on launch_state_idx (cost=0.00..3170.85 rows=169389 width=0) (actual time=21.172..21.172 rows=166121 loops=1)
Index Cond: ((state)::text = 'PRODUCTION'::text)
-> Materialize (cost=1689.89..86736.04 rows=22349 width=117) (actual time=0.000..0.487 rows=3757 loops=166121)
-> Bitmap Heap Scan on visit (cost=1689.89..86624.29 rows=22349 width=117) (actual time=1.324..14.381 rows=3757 loops=1)
Recheck Cond: ((store_key)::text = 'ahBzfmdlYXJsYXVuY2gtaHVicg8LEgVTdG9yZRinzbKcDQw'::text)
Heap Blocks: exact=3672
-> Bitmap Index Scan on visit_store_key_idx (cost=0.00..1684.31 rows=22349 width=0) (actual time=0.780..0.780 rows=3757 loops=1)
Index Cond: ((store_key)::text = 'ahBzfmdlYXJsYXVuY2gtaHVicg8LEgVTdG9yZRinzbKcDQw'::text)
Planning time: 0.232 ms
Execution time: 248708.088 ms
I should mention the index on stops is built as:
CREATE INDEX ON visit USING GIN (stops)
I'm wondering if switching it to:
CREATE INDEX ON visit USING GIN ((stops->'value'))
will resolve the issue?
The wrapper function jsonb_exists() prevents the use of the GIN index on visit.stops. Instead of
from visit
join launch on (jsonb_exists(visit.stops, launch.launch_key))
try
from visit
join launch on visit.stops ? launch.launch_key::text
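For reference, here is the full query from the question with that one change applied (a sketch; everything else stays the same):

select count(subselect.visitors) as visitors,
       subselect.campaign as campaign
from (
    select distinct visit.visitor as visitors,
           launch.campaign_key as campaign
    from visit
    join launch on visit.stops ? launch.launch_key::text
    where visit.store_key = 'ahBzfmdlYXJsYXVuY2gtaHVi'
      and launch.state = 'PRODUCTION'
) as subselect
group by subselect.campaign;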
I encountered a strange behaviour of the Postgres optimizer on the following query:
select count(product0_.id) as col_0_0_ from Product product0_
where product0_.active=true
and (product0_.aggregatorId is null
or product0_.aggregatorId in ($1 , $2 , $3))
Product has about 54 columns, active is a boolean with a btree index, and aggregatorId is varchar(15) with a btree index.
On the query above, the index on aggregatorId is not used:
Aggregate (cost=169995.75..169995.76 rows=1 width=32) (actual time=3904.726..3904.727 rows=1 loops=1)
-> Seq Scan on product product0_ (cost=0.00..165510.39 rows=1794146 width=32) (actual time=0.055..2407.195 rows=1851827 loops=1)
Filter: (active AND ((aggregatorid IS NULL) OR ((aggregatorid)::text = ANY ('{5109037,5001015,70601}'::text[]))))
Rows Removed by Filter: 542146
Total runtime: 3904.925 ms
But if we reduce the query by leaving out the null check for this column, the index gets used:
Aggregate (cost=17600.93..17600.94 rows=1 width=32) (actual time=614.933..614.935 rows=1 loops=1)
-> Index Scan using idx_prod_aggr on product product0_ (cost=0.43..17487.56 rows=45347 width=32) (actual time=19.284..594.509 rows=12099 loops=1)
Index Cond: ((aggregatorid)::text = ANY ('{5109037,5001015,70601}'::text[]))
Filter: active
Rows Removed by Filter: 49130
Total runtime: 150.255 ms
As far as I know a btree index can handle NULL checks, so I don't understand why the index is not used for the complete query. The product table contains about 2.3 million rows, so the sequential scan is not very fast.
EDIT:
The index is very standard:
CREATE INDEX idx_prod_aggr
ON product
USING btree
(aggregatorid COLLATE pg_catalog."default");
Since the condition on that column matches so many rows (78% of all the table rows according to your numbers), the database concludes that it is cheaper to use a full table scan than to spend additional time reading the index.
The rule of thumb with most database vendors is that an index will probably not be used if it can't narrow the search down to about 5% of all the table's records.
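One way to sanity-check that estimate (assuming ANALYZE has run recently) is to look at the planner's own statistics for the column; null_frac plus the frequencies of your IN values roughly gives the fraction of matching rows:

SELECT null_frac, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'product'
  AND attname = 'aggregatorid';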
Your problem looked interesting, so I reproduced your scenario: Postgres 9.1, a table with 1M rows, one boolean column and one varchar column, both indexed, with half of the table having NULL names.
I got the same EXPLAIN ANALYZE output when the varchar column was not indexed. With the index, however, Postgres uses a bitmap scan for the NULL condition and for the IN condition and then merges them with OR.
It then handles the boolean condition with a plain filter (because the indexes are separate):
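For reference, a rough sketch of the test setup described above (my own reconstruction of the DDL; only the shape of the data matters, the exact values may have differed):

CREATE TABLE a (
    active boolean,
    name   varchar(15)
);

-- 1M rows, half of them with NULL name
INSERT INTO a (active, name)
SELECT (i % 2 = 1),
       CASE WHEN i % 2 = 0 THEN NULL ELSE i::text END
FROM generate_series(1, 1000000) AS i;

CREATE INDEX idx_prod_aggr ON a (name);   -- same index name as in the question
CREATE INDEX idx_active    ON a (active);
ANALYZE a;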
explain analyze
select * from A where active is true and ((name is null) OR (name in ('1','2','3') ));
See output:
"Bitmap Heap Scan on a (cost=17.34..21.35 rows=1 width=18) (actual time=0.048..0.048 rows=0 loops=1)"
" Recheck Cond: ((name IS NULL) OR ((name)::text = ANY ('{1,2,3}'::text[])))"
" Filter: (active IS TRUE)"
" -> BitmapOr (cost=17.34..17.34 rows=1 width=0) (actual time=0.047..0.047 rows=0 loops=1)"
" -> Bitmap Index Scan on idx_prod_aggr (cost=0.00..4.41 rows=1 width=0) (actual time=0.010..0.010 rows=0 loops=1)"
" Index Cond: (name IS NULL)"
" -> Bitmap Index Scan on idx_prod_aggr (cost=0.00..12.93 rows=1 width=0) (actual time=0.036..0.036 rows=0 loops=1)"
" Index Cond: ((name)::text = ANY ('{1,2,3}'::text[]))"
"Total runtime: 0.077 ms"
This makes me think that you missed some details; if so, add them to your question.
I have a table with over a billion records. In order to improve performance, I partitioned it into 30 partitions. The most frequent queries have (id = ...) in their WHERE clause, so I decided to partition the table on the id column.
Basically, the partitions were created in this way:
CREATE TABLE foo_0 (CHECK (id % 30 = 0)) INHERITS (foo);
CREATE TABLE foo_1 (CHECK (id % 30 = 1)) INHERITS (foo);
CREATE TABLE foo_2 (CHECK (id % 30 = 2)) INHERITS (foo);
CREATE TABLE foo_3 (CHECK (id % 30 = 3)) INHERITS (foo);
.
.
.
I ran ANALYZE for the entire database and, in particular, I made it collect extra statistics for this table's id column by running:
ALTER TABLE foo ALTER COLUMN id SET STATISTICS 10000;
However, when I run queries that filter on the id column, the plan shows that all the partitions are still scanned. constraint_exclusion is set to partition, so that's not the problem.
EXPLAIN ANALYZE SELECT * FROM foo WHERE (id = 2);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=0.00..8106617.40 rows=3620981 width=54) (actual time=30.544..215.540 rows=171477 loops=1)
-> Append (cost=0.00..8106617.40 rows=3620981 width=54) (actual time=30.539..106.446 rows=171477 loops=1)
-> Seq Scan on foo (cost=0.00..0.00 rows=1 width=203) (actual time=0.002..0.002 rows=0 loops=1)
Filter: (id = 2)
-> Bitmap Heap Scan on foo_0 foo (cost=3293.44..281055.75 rows=122479 width=52) (actual time=0.020..0.020 rows=0 loops=1)
Recheck Cond: (id = 2)
-> Bitmap Index Scan on foo_0_idx_1 (cost=0.00..3262.82 rows=122479 width=0) (actual time=0.018..0.018 rows=0 loops=1)
Index Cond: (id = 2)
-> Bitmap Heap Scan on foo_1 foo (cost=3312.59..274769.09 rows=122968 width=56) (actual time=0.012..0.012 rows=0 loops=1)
Recheck Cond: (id = 2)
-> Bitmap Index Scan on foo_1_idx_1 (cost=0.00..3281.85 rows=122968 width=0) (actual time=0.010..0.010 rows=0 loops=1)
Index Cond: (id = 2)
-> Bitmap Heap Scan on foo_2 foo (cost=3280.30..272541.10 rows=121903 width=56) (actual time=30.504..77.033 rows=171477 loops=1)
Recheck Cond: (id = 2)
-> Bitmap Index Scan on foo_2_idx_1 (cost=0.00..3249.82 rows=121903 width=0) (actual time=29.825..29.825 rows=171477 loops=1)
Index Cond: (id = 2)
.
.
.
What could I do to make the planner choose a better plan? Do I need to run ALTER TABLE foo ALTER COLUMN id SET STATISTICS 10000; for all the partitions as well?
EDIT
After using Erwin's suggested change to the query, the planner only scans the correct partition; however, the execution time is actually worse than a full scan (at least of the index).
EXPLAIN ANALYZE SELECT * FROM foo WHERE (id = 2);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=0.00..8106617.40 rows=3620981 width=54) (actual time=32.611..224.934 rows=171477 loops=1)
-> Append (cost=0.00..8106617.40 rows=3620981 width=54) (actual time=32.606..116.565 rows=171477 loops=1)
-> Seq Scan on foo (cost=0.00..0.00 rows=1 width=203) (actual time=0.002..0.002 rows=0 loops=1)
Filter: (id = 2)
-> Bitmap Heap Scan on foo_0 foo (cost=3293.44..281055.75 rows=122479 width=52) (actual time=0.046..0.046 rows=0 loops=1)
Recheck Cond: (id = 2)
-> Bitmap Index Scan on foo_0_idx_1 (cost=0.00..3262.82 rows=122479 width=0) (actual time=0.044..0.044 rows=0 loops=1)
Index Cond: (id = 2)
-> Bitmap Heap Scan on foo_1 foo (cost=3312.59..274769.09 rows=122968 width=56) (actual time=0.021..0.021 rows=0 loops=1)
Recheck Cond: (id = 2)
-> Bitmap Index Scan on foo_1_idx_1 (cost=0.00..3281.85 rows=122968 width=0) (actual time=0.020..0.020 rows=0 loops=1)
Index Cond: (id = 2)
-> Bitmap Heap Scan on foo_2 foo (cost=3280.30..272541.10 rows=121903 width=56) (actual time=32.536..86.730 rows=171477 loops=1)
Recheck Cond: (id = 2)
-> Bitmap Index Scan on foo_2_idx_1 (cost=0.00..3249.82 rows=121903 width=0) (actual time=31.842..31.842 rows=171477 loops=1)
Index Cond: (id = 2)
-> Bitmap Heap Scan on foo_3 foo (cost=3475.87..285574.05 rows=129032 width=52) (actual time=0.035..0.035 rows=0 loops=1)
Recheck Cond: (id = 2)
-> Bitmap Index Scan on foo_3_idx_1 (cost=0.00..3443.61 rows=129032 width=0) (actual time=0.031..0.031 rows=0 loops=1)
.
.
.
-> Bitmap Heap Scan on foo_29 foo (cost=3401.84..276569.90 rows=126245 width=56) (actual time=0.019..0.019 rows=0 loops=1)
Recheck Cond: (id = 2)
-> Bitmap Index Scan on foo_29_idx_1 (cost=0.00..3370.28 rows=126245 width=0) (actual time=0.018..0.018 rows=0 loops=1)
Index Cond: (id = 2)
Total runtime: 238.790 ms
Versus:
EXPLAIN ANALYZE select * from foo where (id % 30 = 2) and (id = 2);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=0.00..273120.30 rows=611 width=56) (actual time=31.519..257.051 rows=171477 loops=1)
-> Append (cost=0.00..273120.30 rows=611 width=56) (actual time=31.516..153.356 rows=171477 loops=1)
-> Seq Scan on foo (cost=0.00..0.00 rows=1 width=203) (actual time=0.002..0.002 rows=0 loops=1)
Filter: ((id = 2) AND ((id % 30) = 2))
-> Bitmap Heap Scan on foo_2 foo (cost=3249.97..273120.30 rows=610 width=56) (actual time=31.512..124.177 rows=171477 loops=1)
Recheck Cond: (id = 2)
Filter: ((id % 30) = 2)
-> Bitmap Index Scan on foo_2_idx_1 (cost=0.00..3249.82 rows=121903 width=0) (actual time=30.816..30.816 rows=171477 loops=1)
Index Cond: (id = 2)
Total runtime: 270.384 ms
For non-trivial expressions, you have to repeat the condition more or less verbatim in your queries to make the Postgres query planner understand it can rely on the CHECK constraint. Even if it seems redundant!
Per documentation:
With constraint exclusion enabled, the planner will examine the
constraints of each partition and try to prove that the partition need
not be scanned because it could not contain any rows meeting the
query's WHERE clause. When the planner can prove this, it excludes
the partition from the query plan.
Bold emphasis mine. The planner does not understand complex expressions.
Of course, this has to be met, too:
Ensure that the constraint_exclusion configuration parameter is not
disabled in postgresql.conf. If it is, queries will not be optimized as desired.
Instead of
SELECT * FROM foo WHERE (id = 2);
Try:
SELECT * FROM foo WHERE id % 30 = 2 AND id = 2;
And:
The default (and recommended) setting of constraint_exclusion is
actually neither on nor off, but an intermediate setting called
partition, which causes the technique to be applied only to queries
that are likely to be working on partitioned tables. The on setting
causes the planner to examine CHECK constraints in all queries, even
simple ones that are unlikely to benefit.
You can experiment with constraint_exclusion = on to see if the planner catches on without the redundant verbatim condition. But you have to weigh the cost and benefit of this setting.
The alternative would be simpler conditions for your partitions, as already outlined by @harmic.
And no, increasing the number for STATISTICS will not help in this case. Only the CHECK constraints and your WHERE conditions in the query matter.
Unfortunately, partitioning in PostgreSQL is fairly primitive. It only works for range- and list-based constraints. Your partition constraints are too complex for the query planner to use to decide to exclude some partitions.
In the manual it says:
Keep the partitioning constraints simple, else the planner may not be
able to prove that partitions don't need to be visited. Use simple
equality conditions for list partitioning, or simple range tests for
range partitioning, as illustrated in the preceding examples. A good
rule of thumb is that partitioning constraints should contain only
comparisons of the partitioning column(s) to constants using
B-tree-indexable operators.
You might get away with changing your WHERE clause so that the modulus expression is explicitly mentioned, as Erwin suggested. I haven't had much luck with that in the past, although I have not tried recently and as he says, there have been improvements in the planner. That is probably the first thing to try.
Otherwise, you will have to rearrange your partitions to use ranges of id values instead of the modulus method you are using now. Not a great solution, I know.
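A hedged sketch of what range-based partitions could look like (the boundaries are arbitrary, purely for illustration):

CREATE TABLE foo_r0 (CHECK (id >= 0         AND id < 100000000)) INHERITS (foo);
CREATE TABLE foo_r1 (CHECK (id >= 100000000 AND id < 200000000)) INHERITS (foo);
-- ... and so on; a plain  WHERE id = 2  can then be proven to hit only foo_r0.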
One other solution is to store the modulus of the id in a separate column, which you can then use a simple value equality check for the partition constraint. Bit of a waste of disk space, though, and you would also need to add a term to the where clauses to boot.
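As a sketch of that last idea (column and table names are hypothetical), assuming the application or a trigger keeps the extra column in sync:

ALTER TABLE foo ADD COLUMN id_mod int;                     -- holds id % 30, maintained on insert
CREATE TABLE foo_m2 (CHECK (id_mod = 2)) INHERITS (foo);   -- simple equality constraint

-- Queries then need the extra term as well:
SELECT * FROM foo WHERE id_mod = 2 AND id = 2;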
In addition to Erwin's answer about the details of the how the planner works with partitions, there is a larger issue here.
Partitioning is not a magic bullet. There are a handful of very specific things for which partitioning is very useful. If none of those very specific things apply to you, then you cannot expect a performance improvement from partitioning, and most likely will get a decrease instead.
To do partitioning correctly, you need a thorough understanding of your usage patterns, or your data loading and unloading patterns.