On PostgreSQL 9.0 we have an SQL query:
SELECT count(*) FROM lane
WHERE not exists
(SELECT 1 FROM id_map
WHERE id_map.new_id=lane.lane_id
and id_map.column_name='lane_id'
and id_map.table_name='lane')
and lane.lane_id is not null;
It usually takes around 1.5 seconds to finish.
Here's the explain plan: http://explain.depesz.com/s/axNN
Sometimes, though, this query hangs and does not finish; it can run for as long as 11 hours without completing.
When that happens it takes up 100% of the CPU.
The only locks this query takes are "AccessShareLock"s and they are all granted.
SELECT a.datname,
c.relname,
l.transactionid,
l.mode,
l.granted,
a.usename,
a.current_query,
a.query_start,
age(now(), a.query_start) AS "age",
a.procpid
FROM pg_stat_activity a
JOIN pg_locks l ON l.pid = a.procpid
JOIN pg_class c ON c.oid = l.relation
ORDER BY a.query_start;
The query runs as part of a Java process that connects to the database through a connection pool and sequentially executes similar SELECT queries of this form:
SELECT count(*) FROM {} WHERE not exists (SELECT 1 FROM id_map WHERE id_map.new_id={}.{} and id_map.column_name='{}' and id_map.table_name='{}') and {}.{} is not null
No updates or deletes happen in parallel with this process, so I don't think vacuuming is the issue here.
Before the whole process runs (that is, before the 6 queries of this sort are executed), ANALYZE was run on all the tables.
The Postgres logs show no entry for the long-running queries, because they never finish and so never get logged.
Any idea what may cause this kind of behavior and how to prevent it from happening?
The explain plan without ANALYZE:
Aggregate  (cost=874337.91..874337.92 rows=1 width=0)
  ->  Nested Loop Anti Join  (cost=0.00..870424.70 rows=1565283 width=0)
        Join Filter: (id_map.new_id = lane.lane_id)
        ->  Seq Scan on lane  (cost=0.00..30281.84 rows=1565284 width=8)
              Filter: (lane_id IS NOT NULL)
        ->  Materialize  (cost=0.00..816663.60 rows=1 width=8)
              ->  Seq Scan on id_map  (cost=0.00..816663.60 rows=1 width=8)
                    Filter: (((column_name)::text = 'lane_id'::text) AND ((table_name)::text = 'lane'::text))
VACUUM VERBOSE ANALYZE;
Refreshing the statistics should help the database choose a better plan than the nested loop, which I believe is what is using 100% of the CPU.
From what I understand, this problem might have one of the following causes:
Postgres has run out of available transaction IDs. (Once roughly two billion transactions have passed without old rows being frozen by VACUUM, transaction ID wraparound can cause severe data loss or force a database shutdown.)
The tables are bloated: DELETE and UPDATE (which Postgres internally treats as an INSERT plus a DELETE) only mark tuples as deleted and do not physically remove them.
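Both suspicions can be checked from the system catalogs before changing any settings. The following is only a sketch: it assumes the tables from the question (lane, id_map) and uses the standard pg_database and pg_stat_user_tables views.

```sql
-- Transaction ID age per database; values approaching ~2 billion
-- would indicate a wraparound risk.
select datname, age(datfrozenxid) as xid_age
from pg_database
order by xid_age desc;

-- Dead tuples and the last vacuum/analyze times for the tables in the query.
select relname, n_live_tup, n_dead_tup,
       last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
from pg_stat_user_tables
where relname in ('lane', 'id_map');
```

If xid_age is small and n_dead_tup is close to zero, neither cause is likely and the plan choice itself is the more probable culprit.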
If you run on a managed cloud service such as Google Cloud, you can set database flags so that VACUUM runs automatically and cleans up tuples that are marked as deleted but still present in your database, and so that ANALYZE gathers fresh statistics on frequently updated tables for the planner to use. Example:
autovacuum: on
autovacuum_analyze_scale_factor: 0.05
autovacuum_analyze_threshold: 10
autovacuum_naptime: 15
autovacuum_vacuum_cost_delay: 10
autovacuum_vacuum_cost_limit: 1000
autovacuum_vacuum_scale_factor: 0.1
autovacuum_vacuum_threshold: 25
log_autovacuum_min_duration: 0
track_counts: on
Source:
https://www.postgresql.org/docs/9.5/runtime-config-autovacuum.html
https://www.techonthenet.com/postgresql/autovacuum.php
https://aws.amazon.com/premiumsupport/knowledge-center/transaction-id-wraparound-effects/
I am designing a table that has a jsonb column realizing permissions with the following format:
[
{"role": 5, "perm": "view"},
{"role": 30, "perm": "edit"},
{"role": 52, "perm": "view"}
]
TL;DR
How do I convert such jsonb value into an SQL array of integer roles? In this example, it would be '{5,30,52}'::int[]. I have some solutions but none are fast enough. Keep reading...
Each logged-in user has some roles (one or more). The idea is to filter the records using the overlap operator (&&) on int[].
SELECT * FROM data WHERE extract_roles(access) && '{1,5,17}'::int[]
I am looking for the extract_roles function/expression that can also be used in the definition of an index:
CREATE INDEX data_roles ON data USING gin ((extract_roles(access)))
jsonb in Postgres seems to have broad support for building and transforming but less for extracting values - SQL arrays in this case.
What I tried:
create or replace function extract_roles(access jsonb) returns int[]
language sql
strict
parallel safe
immutable
-- with the following bodies:
-- (0) 629ms
select translate(jsonb_path_query_array(access, '$.role')::text, '[]', '{}')::int[]
-- (1) 890ms
select array_agg(r::int) from jsonb_path_query(access, '$.role') r
-- (2) 866ms
select array_agg((t ->> 'role')::int) from jsonb_array_elements(access) as x(t)
-- (3) 706ms
select f1 from jsonb_populate_record(row('{}'::int[]), jsonb_build_object('f1', jsonb_path_query_array(access, '$.role'))) as x (f1 int[])
When the index is used, the query is fast. But there are two problems with these expressions:
some of the functions are only stable and not immutable; this also applies to cast. Am I allowed to mark my function as immutable? The immutability is required by the index definition.
they are slow; the planner does not use the index in some scenarios, and then the query can become really slow (times above are on a table with 3M records):
explain (analyse)
select id, access
from data
where extract_roles(access) && '{-3,99}'::int[]
order by id
limit 100
with the following plan (same for all variants above; prefers scanning the index associated with the primary key, gets sorted results and hopes that it finds 100 of them soon):
Limit  (cost=1000.45..2624.21 rows=100 width=247) (actual time=40.668..629.193 rows=100 loops=1)
  ->  Gather Merge  (cost=1000.45..476565.03 rows=29288 width=247) (actual time=40.667..629.162 rows=100 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Parallel Index Scan using data_pkey on data  (cost=0.43..472184.44 rows=12203 width=247) (actual time=25.522..513.463 rows=35 loops=3)
              Filter: (extract_roles(access) && '{-3,99}'::integer[])
              Rows Removed by Filter: 84918
Planning Time: 0.182 ms
Execution Time: 629.245 ms
Removing the LIMIT clause is paradoxically fast:
Gather Merge  (cost=70570.65..73480.29 rows=24938 width=247) (actual time=63.263..75.710 rows=40094 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Sort  (cost=69570.63..69601.80 rows=12469 width=247) (actual time=59.870..61.569 rows=13365 loops=3)
        Sort Key: id
        Sort Method: external merge  Disk: 3744kB
        Worker 0:  Sort Method: external merge  Disk: 3232kB
        Worker 1:  Sort Method: external merge  Disk: 3160kB
        ->  Parallel Bitmap Heap Scan on data  (cost=299.93..68722.36 rows=12469 width=247) (actual time=13.823..49.336 rows=13365 loops=3)
              Recheck Cond: (extract_roles(access) && '{-3,99}'::integer[])
              Heap Blocks: exact=9033
              ->  Bitmap Index Scan on data_roles  (cost=0.00..292.44 rows=29926 width=0) (actual time=9.429..9.430 rows=40094 loops=1)
                    Index Cond: (extract_roles(access) && '{-3,99}'::integer[])
Planning Time: 0.234 ms
Execution Time: 77.719 ms
Is there any better and faster way to extract int[] from a jsonb? Because I cannot rely on the planner always using the index. Playing with COST of the extract_roles function helps a bit (planner starts using the index for LIMIT 1000) but even an insanely high value does not force the index for LIMIT 100.
Comments:
If there is not, I will probably store the information in another column roles int[], which is fast but takes extra space and requires extra treatment (can be solved using generated columns on Postgres 12+, which Azure still does not provide, or a trigger, or in the application logic).
Looking into the future, will there be any better support in Postgres 15? Maybe JSON_QUERY but I don’t see any immediate improvement because its RETURNING clause probably refers to the whole result and not its elements.
Maybe jsonb_populate_record could also consider non-composite types (its signature allows it) such as:
select jsonb_populate_record(null::int[], '[123,456]'::jsonb)
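The fallback of a separate roles int[] column mentioned in the comments above could look like the sketch below on Postgres 12+, where generated columns are available. It assumes extract_roles can legitimately be declared immutable (generated columns require that), and the index name data_roles_col is made up for illustration.

```sql
-- Materialize the roles once per row instead of recomputing them per query.
alter table data
  add column roles int[] generated always as (extract_roles(access)) stored;

-- A plain GIN index on the stored column; no functional index needed.
create index data_roles_col on data using gin (roles);

-- Even when the planner picks a plan that filters row by row, the filter
-- now compares a stored array instead of calling extract_roles.
select id, access
from data
where roles && '{1,5,17}'::int[];
```

On versions without generated columns, the same column would have to be maintained by a trigger or by the application.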
The two closest questions are:
Extract integer array from jsonb within postgres 9.6
Cast postgresql jsonb value as array of int and remove element from it
Reaction to suggested normalization:
Normalization is probably not viable, but let's follow the train of thought.
I assume that the extra table would look like this: *_perm (id, role, perm). There would be an index on id and another index on role.
Because a user has multiple roles, it could join multiple records for the same id, which would cause multiplication of the records in the data table and force a group by aggregation.
A group by is bad for performance because it prevents some optimizations. I am designing a building block. So there can be for example two data tables at play:
select pd.*, jsonb_agg(to_jsonb(pp))
from posts_data pd
join posts_perm pp on pd.id = pp.id
where exists(
  select 1
  from comments_data cd
  join comments_perm cp on cp.id = cd.id
  where cd.post_id = pd.id
    and cd.reputation > 100
    and cp.role in (3,34,52)
  -- no group by needed due to semi-join
)
and pp.role in (3,34,52)
group by pd.id
order by pd.title
limit 10
If I am not mistaken, this query will require the aggregation of all records before they are sorted. No index can help here. That will never be fast with millions of records. Moreover, there is non-trivial logic behind group by usage - it is not always needed.
What if we did not need to return the permissions but only cared about their existence?
select pd.*
from posts_data pd
where exists(
  select 1
  from posts_perm pp
  where pp.id = pd.id
    and pp.role in (3,34,52)
)
and exists(
  select 1
  from comments_data cd
  where cd.post_id = pd.id
    and exists(
      select 1
      from comments_perm cp
      where cp.id = cd.id
        and cp.role in (3,34,52)
    )
    and cd.reputation > 100
)
order by pd.title
limit 10
Then we don't need any aggregation - the database will simply issue a SEMI-JOIN. If there is an index on title, the database may consider using it. We can even fetch the permissions in the projection; something like this:
select pd.*, (select jsonb_agg(to_jsonb(pp)) from posts_perm pp where pp.id = pd.id) perm
...
Here a nested-loop join would be issued only for the few (10) returned records. I will test this approach.
Another option is to keep the data in both tables - the data table would only store an int[] of roles. Then we save a JOIN and only fetch from the permission table at the end. Now we need an index that supports array operations - GIN.
select pd.*, (select jsonb_agg(to_jsonb(pp)) from posts_perm pp where pp.id = pd.id) perm
from posts_data pd
where pd.roles && '{3,34,52}'::int[]
and exists(
  select 1
  from comments_data cd
  where cd.post_id = pd.id
    and cd.roles && '{3,34,52}'::int[]
    and cd.reputation > 100
)
order by pd.title
limit 10
Because we always aggregate all permissions for the returned records (their interpretation is in the application and does not matter that we return all of them), we can store the post_perms as a json. Because we never need to work with the values in SQL, storing them directly in the data table seems reasonable.
We will need to support some bulk-sharing operations later that update the permissions for many records, but that is much rarer than selects. Because of this we could favor jsonb instead.
The projection does not need the select of permissions anymore:
select pd.*
...
But now the roles column is redundant - we have the same information in the same table, just in JSON format. If we can write a function that extracts just the roles, we can directly index it.
And we are back at the beginning. But it looks like the extract_roles function is never going to be fast, so we need to keep the roles column.
Another reason for keeping permissions in the same table is the possibility of combining multiple indices using Bitmap And and avoiding a join.
There will be a huge bias in the roles. Some are going to be present on almost all rows (admin can edit everything), others will be rare (John Doe can only access these 3 records that were explicitly shared with him). I am not sure how well statistics will work on the int[] approach but so far my tests show that the GIN index is used when the role is infrequent (high selectivity).
It looks like the core problem here is the classic one with WHERE...ORDER BY...LIMIT, that the planner assumes all of the qualifying rows are scattered evenly throughout the ordering. But that isn't the case here: rows meeting your && condition are selectively deficient in low-numbered "id". So it has to walk that index far farther than it thought it would need to before it catches the LIMIT.
There is nothing you can do (in any version) to get the planner to estimate that better. You could just prevent that index from being used by rewriting it to order by id+0. But then it wouldn't use that plan even when it would truly be faster, like the admin who is on everything. (Which by the way seems like a bad idea--an exceptional user should probably be handled exceptionally, not shoehorned into the normal system).
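As a sketch of the id+0 rewrite described above, applied to the query from the question: the expression id + 0 no longer matches the primary-key index ordering, so the planner cannot walk data_pkey to satisfy the ORDER BY and has to evaluate the && condition first, typically through the GIN index, and then sort.

```sql
explain (analyse)
select id, access
from data
where extract_roles(access) && '{-3,99}'::int[]
order by id + 0   -- same ordering as "order by id", but hides the pkey index from the planner
limit 100;
```

The trade-off is the one already noted: this also blocks the primary-key plan in the cases where it really would be faster.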
The immutable extraction function certainly is slow, but if the above planning problem were fixed that wouldn't matter. Making the function faster would probably require some compiled code, and Azure surely would not let you link the .so file into their managed server.
Because the JSON has a regular structure (int, text), I also considered two alternative storages:
create a composite type role of (int, text) and store the array role[]; extract_roles function is still needed;
store two arrays int[] and text[].
The latter one won for the following reasons:
smallest disk space (important for queries that require seq scan);
no need for extract_roles function - the int array is stored directly;
no need for functional index;
easy append (but same is true for JSON);
the library that I am using (jOOQ) has a good binding for arrays so working with them may even be more pleasant than with a JSON.
Disadvantages are:
harder remove - need to unnest and reaggregate.
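A minimal sketch of the two-array layout follows. The column names roles and perms and the convention that roles[i] pairs with perms[i] are assumptions for illustration; the removal statement uses the multi-column SET form, which needs Postgres 9.5 or later.

```sql
-- Hypothetical layout: roles[i] is paired with perms[i].
create table data (
  id    bigint primary key,
  roles int[]  not null default '{}',
  perms text[] not null default '{}'
);

-- Plain GIN index; no functional index and no extract_roles.
create index data_roles on data using gin (roles);

-- Filter by overlap with the user's roles.
select id from data where roles && '{1,5,17}'::int[];

-- Easy append: extend both arrays together.
update data
set roles = roles || 30, perms = perms || 'edit'::text
where id = 1;

-- Harder remove: unnest both arrays in parallel and reaggregate without role 30.
update data d
set (roles, perms) = (
  select coalesce(array_agg(u.role_id order by u.ord), '{}'),
         coalesce(array_agg(u.perm order by u.ord), '{}')
  from unnest(d.roles, d.perms) with ordinality as u(role_id, perm, ord)
  where u.role_id <> 30
)
where d.roles @> '{30}'::int[];
```

The coalesce calls keep the columns as empty arrays rather than NULL when the last role is removed.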
Is there any optimization I can do to speed up this query? It currently takes 30 minutes to run.
SELECT
*
FROM
service s
JOIN
bucket b ON s.procedure = b.hcpc
WHERE
month >= '201904'
AND bucket = 'Respirator'
Explain execution plan -
Gather  (cost=1002.24..81397944.91 rows=9782404 width=212)
  Workers Planned: 2
  ->  Hash Join  (cost=2.24..80418704.51 rows=4076002 width=212)
        Hash Cond: ((s.procedure)::text = (bac.hcpc)::text)
        ->  Parallel Seq Scan on service s  (cost=0.00..77753288.33 rows=699907712 width=154)
              Filter: ((month)::text >= '201904'::text)
        ->  Hash  (cost=2.06..2.06 rows=14 width=58)
              ->  Seq Scan on buckets b  (cost=0.00..2.06 rows=14 width=58)
                    Filter: ((bucket)::text = 'Respirator'::text)
SELECT *
FROM service s JOIN
bucket b
ON s.procedure = b.hcpc
WHERE s.month >= '201904' AND
b.bucket = 'Respirator';
I would suggest indexes on:
bucket(bucket, hcpc)
service(procedure, month)
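Written out as DDL, the two suggested indexes would look roughly like this. The index names are made up for illustration, and the table name bucket follows the query text (the plan shows it as buckets, so adjust to whichever is real).

```sql
-- Lets the planner find the 'Respirator' rows and read hcpc from the index.
create index bucket_bucket_hcpc_idx on bucket (bucket, hcpc);

-- Lets each matching hcpc be looked up in service by procedure,
-- narrowed by month, instead of scanning the whole table.
create index service_procedure_month_idx on service (procedure, month);

-- Refresh statistics so the planner considers the new indexes sensibly.
analyze bucket;
analyze service;
```

With only about 14 matching bucket rows in the plan, a nested loop over the service index is the kind of plan these indexes are meant to enable; whether the planner picks it depends on how selective procedure is.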
Query optimization doesn't have hard and fast rules; it's more a matter of trial and error. Sometimes a technique works really well on one query and has little to no effect on another. That said, here are a couple of things I would try to get you started.
Instead of SELECT *, list out the column names that you need. If you need all of both tables, still list them out.
Are there any numeric columns that you can use in your WHERE clause to do some preliminary filtering? Comparing only string data types is almost always a pain point in query optimization.
Look at the existing indexes on the table and see if any changes need to be made. Indexes can have a huge impact on query performance, both positive and negative depending on setup.
Again, it's all trial and error; these are just a few places to start.
This issue has been resolved
The solution was to create an index on (listings.id, listings.lat, listings.lng). At first the planner ignored this index in favor of a seq scan; however, vacuuming the table (VACUUM calendar) resolved that. Whew.
I'm having issues with query speed: this query takes approx. 20s to run, and I really need the result within a few seconds. Am I using inefficient SQL here?
select c.date, avg(c.price)
from calendar c
join listings l on l.id = c.listing_id
where l.lat >= 51.45 and l.lat <= 51.5
and l.lng >= -0.2 and l.lng <= -0.1
group by c.date;
The lat/lng are hard-coded here, in reality they are dynamic tableau parameters.
I know the lat/lng test and join onto listings is NOT the bottleneck, that runs fast. The query runs in roughly the same time without a group by as well, so I don't believe that is the bottleneck either.
To indicate size, that particular query is averaging ~230,000 rows. I have tried multi-column indexes of (c.date, c.listing_id) and (c.date, c.listing_id, c.price) without any speed improvement.
Naturally I can't pre-compute the result as the lat/lng will be different each time.
Can I improve the speed here? Do I need to upgrade my Amazon RDS instance?
Any advice appreciated, thanks!
(Using Postgres 9.5)
Update
From what I can tell from testing, the query is very slow because it uses a seq scan on 'calendar'. I added the index (l.id, l.lat, l.lng), forced index usage with SET enable_seqscan = off; and the query ran in <1s. I have no idea why the optimiser chooses a seq scan instead of an index scan.
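For reference, the steps described in this update and in the resolution can be sketched as follows. The index name is an assumption; enable_seqscan is changed for the current session only and is meant as a diagnostic, not a permanent setting.

```sql
-- The index from the update (name is hypothetical).
create index listings_id_lat_lng_idx on listings (id, lat, lng);

-- What eventually made the planner pick the index on its own.
vacuum calendar;

-- Diagnostic: discourage seq scans in this session and compare plans.
set enable_seqscan = off;
explain analyze
select c.date, avg(c.price)
from calendar c
join listings l on l.id = c.listing_id
where l.lat >= 51.45 and l.lat <= 51.5
  and l.lng >= -0.2 and l.lng <= -0.1
group by c.date;
reset enable_seqscan;
```

Comparing the EXPLAIN ANALYZE output with and without the setting shows whether the planner's cost estimates, rather than the SQL itself, are the problem.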
EXPLAIN ANALYZE: with seq scan, without seq scan
I have a simple select query:
SELECT * FROM entities WHERE entity_type_id = 1 ORDER BY entity_id
Then I want to get the first 100 results, so I use this:
SELECT * FROM entities WHERE entity_type_id = 1 ORDER BY entity_id LIMIT 100
The problem is that the second query runs much slower than the first one. It takes less than a second to execute the first query and more than a minute to execute the second one.
These are execution plans for the queries:
without limit:
Sort  (cost=26201.43..26231.42 rows=11994 width=72)
  Sort Key: entity_id
  ->  Index Scan using entity_type_id_idx on entities  (cost=0.00..24895.34 rows=11994 width=72)
        Index Cond: (entity_type_id = 1)
with limit:
Limit  (cost=0.00..8134.39 rows=100 width=72)
  ->  Index Scan using xpkentities on entities  (cost=0.00..975638.85 rows=11994 width=72)
        Filter: (entity_type_id = 1)
I don't understand why these two plans are so different and why the performance decreases so much. How should I tweak the second query to make it work faster?
I use PostgreSQL 9.2.
You want the 100 smallest entity_id's matching your condition. Now - if those were numbers 1..100 then clearly using the entity_id index is the best way to handle this - everything is pre-sorted. In fact, if the 100 you wanted were in the range 1..200 then it still makes sense. Probably 1..1000 would.
So - PostgreSQL thinks it will find lots of entity_type_id=1 values at the "start" of the table. It estimates a cost of 8134 vs 26231 to filter by type then sort. In your case it is wrong.
Now - either there is some correlation which isn't obvious (that's bad - we can't tell the planner about that at present), or we don't have up-to-date or sufficient stats.
Does an ANALYZE entities make any difference? You can see what values the planner knows about by reading the planner-stats page in the manuals.
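A short sketch of that check, using the table and columns from the question and the standard pg_stats view:

```sql
-- Refresh the planner statistics for the table.
analyze entities;

-- Inspect what the planner now knows about the two columns involved.
select attname, n_distinct, most_common_vals, most_common_freqs, correlation
from pg_stats
where tablename = 'entities'
  and attname in ('entity_type_id', 'entity_id');
```

If entity_type_id = 1 shows up in most_common_vals with a frequency far from reality, stale or insufficient statistics are the likely cause; if the statistics look right, the hidden-correlation explanation above is the more probable one.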
I have two queries that are functionally identical. One of them performs very well, the other one performs very poorly. I do not see from where the performance difference arises.
Query #1:
SELECT id
FROM subsource_position
WHERE
id NOT IN (SELECT position_id FROM subsource)
This comes back with the following plan:
QUERY PLAN
-------------------------------------------------------------------------------
 Seq Scan on subsource_position  (cost=0.00..362486535.10 rows=128524 width=4)
   Filter: (NOT (SubPlan 1))
   SubPlan 1
     ->  Materialize  (cost=0.00..2566.50 rows=101500 width=4)
           ->  Seq Scan on subsource  (cost=0.00..1662.00 rows=101500 width=4)
Query #2:
SELECT id FROM subsource_position
EXCEPT
SELECT position_id FROM subsource;
Plan:
QUERY PLAN
-------------------------------------------------------------------------------------------------
 SetOp Except  (cost=24760.35..25668.66 rows=95997 width=4)
   ->  Sort  (cost=24760.35..25214.50 rows=181663 width=4)
         Sort Key: "*SELECT* 1".id
         ->  Append  (cost=0.00..6406.26 rows=181663 width=4)
               ->  Subquery Scan on "*SELECT* 1"  (cost=0.00..4146.94 rows=95997 width=4)
                     ->  Seq Scan on subsource_position  (cost=0.00..3186.97 rows=95997 width=4)
               ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..2259.32 rows=85666 width=4)
                     ->  Seq Scan on subsource  (cost=0.00..1402.66 rows=85666 width=4)
(8 rows)
I have a feeling I'm missing either something obviously bad about one of my queries, or I have misconfigured the PostgreSQL server. I would have expected this NOT IN to optimize well; is NOT IN always a performance problem or is there a reason it does not optimize here?
Additional data:
=> select count(*) from subsource;
count
-------
85158
(1 row)
=> select count(*) from subsource_position;
count
-------
93261
(1 row)
Edit: I have now fixed the A-B != B-A problem mentioned below. But my problem as stated still exists: query #1 is still massively worse than query #2. This, I believe, follows from the fact that both tables have similar numbers of rows.
Edit 2: I'm using PostgreSQL 9.0.4. I cannot use EXPLAIN ANALYZE because query #1 takes too long. All of these columns are NOT NULL, so there should be no difference as a result of that.
Edit 3: I have an index on both these columns. I haven't yet gotten query #1 to complete (gave up after ~10 minutes). Query #2 returns immediately.
Query #1 is not an elegant way of doing this. (NOT) IN (SELECT ...) is fine for a few entries, but here it can't use indexes (hence the Seq Scan).
Without EXCEPT, the alternative is to use a JOIN (HASH JOIN):
SELECT sp.id
FROM subsource_position AS sp
LEFT JOIN subsource AS s ON (s.position_id = sp.id)
WHERE
s.position_id IS NULL
EXCEPT appeared in Postgres a long time ago. With MySQL, though, I believe the LEFT JOIN is still the only way to achieve this using indexes.
Since you are running with the default configuration, try bumping up work_mem. Most likely the subquery ends up getting spooled to disk because you only allow 1MB of work memory. Try 10MB or 20MB.
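A sketch of trying this for the current session only, using query #1 from the question; 20MB is just the value suggested above, not a tuned recommendation.

```sql
-- Check the current value (1MB by default on this version).
show work_mem;

-- Raise it for this session only.
set work_mem = '20MB';

-- Re-check the plan: with enough memory the filter may show up as
-- "NOT (hashed SubPlan 1)" instead of a Materialize node rescanned per row.
explain
select id
from subsource_position
where id not in (select position_id from subsource);

reset work_mem;
```

If the plan changes to a hashed subplan, the setting can be made permanent in postgresql.conf or per role.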
Your queries are not functionally equivalent so any comparison of their query plans is meaningless.
Your first query is, in set theory terms, this:
{subsource.position_id} - {subsource_position.id}
^ ^ ^ ^
but your second is this:
{subsource_position.id} - {subsource.position_id}
^ ^ ^ ^
And A - B is not the same as B - A for arbitrary sets A and B.
Fix your queries to be semantically equivalent and try again.
If id and position_id are both indexed (either on their own or first column in a multi-column index), then two index scans are all that are necessary - it's a trivial sorted-merge based set algorithm.
Personally I think PostgreSQL simply doesn't have the optimization intelligence to understand this.
(I came to this question after diagnosing a query running for over 24 hours that I could perform with sort x y y | uniq -u on the command line in seconds. Database less than 50MB when exported with pg_dump.)
PS: more interesting comment here:
more work has been put into optimizing
EXCEPT and NOT EXISTS than NOT IN, because the latter is substantially
less useful due to its unintuitive but spec-mandated handling of NULLs.
We're not going to apologize for that, and we're not going to regard it as a bug.
What it comes down to is that EXCEPT differs from NOT IN with respect to NULL handling. I haven't looked up the details, but it means PostgreSQL (aggressively) doesn't optimize NOT IN.
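The NULL-handling difference can be seen with a self-contained example that needs no tables; the value lists here are made up purely for illustration.

```sql
-- NOT IN: returns no rows. For x = 1 the test "1 <> NULL" is unknown,
-- so the whole NOT IN condition is unknown and the row is filtered out.
select x
from (values (1), (2)) as a(x)
where x not in (select y from (values (2), (null)) as b(y));

-- NOT EXISTS: returns 1. The NULL in b simply never matches.
select x
from (values (1), (2)) as a(x)
where not exists (
  select 1 from (values (2), (null)) as b(y) where b.y = a.x
);

-- EXCEPT: also returns 1.
select x from (values (1), (2)) as a(x)
except
select y from (values (2), (null)) as b(y);
```

Because a single NULL in the subquery changes the result of NOT IN, the planner cannot blindly turn it into an anti-join the way it does for NOT EXISTS.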
The second query makes use of PostgreSQL's HASH JOIN feature. This is much faster than the Seq Scan of the first one.