PostgreSQL performance issue when querying on a time range

I'm trying to understand a strange performance issue on Postgres (v10.9).
We have a requests table and I want to get all requests made by a set of particular users in several time ranges. Here is the relevant info:
There is no user_id column in the table. Rather, there is a jsonb column named params, where the user_id field is stored as a string.
The set of users in question is very large, in the thousands.
There is a time column of type timestamptz and it's indexed with a standard BTREE index.
There is also a separate BTREE index on params->>'user_id'.
The queries I am running are based on the following template:
SELECT *
FROM requests
WHERE params->>'user_id' = ANY (VALUES ('id1'), ('id2'), ('id3')...)
AND time > 't1' AND time < 't2'
Where the ids and times here are placeholders for actual ids and times.
I am running a query like this for several consecutive time ranges of 2 weeks each. The queries for the first few time ranges take a couple of minutes each, which is obviously very long in terms of production but OK for research purposes. Then suddenly there is a dramatic spike in query runtime, and they start taking hours each, which begins to be untenable even for offline purposes.
This spike happens in the same range every time. It's worth noting that this time range has a 1.5x increase in total requests over the previous one: certainly more, but not enough to warrant a slowdown by a full order of magnitude.
Here is the output for EXPLAIN ANALYZE for the last time range with the reasonable running time:
Hash Join  (cost=442.69..446645.35 rows=986171 width=1217) (actual time=66.305..203593.238 rows=445175 loops=1)
  Hash Cond: ((requests.params ->> 'user_id'::text) = "*VALUES*".column1)
  ->  Index Scan using requests_time_idx on requests  (cost=0.56..428686.19 rows=1976888 width=1217) (actual time=14.336..201643.439 rows=2139604 loops=1)
        Index Cond: (("time" > '2019-02-12 22:00:00+00'::timestamp with time zone) AND ("time" < '2019-02-26 22:00:00+00'::timestamp with time zone))
  ->  Hash  (cost=439.62..439.62 rows=200 width=32) (actual time=43.818..43.818 rows=29175 loops=1)
        Buckets: 32768 (originally 1024)  Batches: 1 (originally 1)  Memory Usage: 2536kB
        ->  HashAggregate  (cost=437.62..439.62 rows=200 width=32) (actual time=24.887..33.775 rows=29175 loops=1)
              Group Key: "*VALUES*".column1
              ->  Values Scan on "*VALUES*"  (cost=0.00..364.69 rows=29175 width=32) (actual time=0.006..10.303 rows=29175 loops=1)
Planning time: 133.807 ms
Execution time: 203697.360 ms
If I understand this correctly, it seems that most of the time is spent on filtering the requests by time range, even though:
The time index seems to be used.
When I run the same query without the filter on the users (basically just fetching all requests in the time range), it runs in an acceptable time.
Any thoughts on how to solve this problem would be appreciated, thanks!

Since you are retrieving so many rows, the query will never be really fast.
Unfortunately there is no single index to cover both conditions, but you can use these two:
CREATE INDEX ON requests ((params->>'user_id'));
CREATE INDEX ON requests (time);
Then you can hope for two bitmap index scans that get combined with a “BitmapAnd”.
I am not sure whether that will improve performance; PostgreSQL may well stick with the current plan, which is not a bad one. If your indexes are cached or random access to your storage is fast, raise effective_cache_size or lower random_page_cost accordingly; that will make PostgreSQL lean towards index scans.
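For example, both settings can be adjusted per session before re-running the query (the values below are illustrative, not recommendations):

```sql
-- Tell the planner how much of the data is likely cached
-- (set this towards the RAM available for caching):
SET effective_cache_size = '8GB';

-- Lower random_page_cost towards seq_page_cost (1.0) on SSDs
-- or when the working set is mostly cached:
SET random_page_cost = 1.1;

-- Then re-check which plan the planner chooses:
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM requests
WHERE params->>'user_id' = ANY (VALUES ('id1'), ('id2'))
  AND time > '2019-02-12 22:00:00+00' AND time < '2019-02-26 22:00:00+00';
```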

Related

Too many buffers hit+read during index scan

I have two tables, User and Info. I'm writing a simple query with an inner join and inserting the result into an unlogged table.
INSERT INTO Result (
iProfileId,email,format,content
)
SELECT
COALESCE(N1.iprofileId, 0),
Lower(N1.email),
W0.format,
W0.content
FROM
Info W0,
User N1
where
(N1.iprofileId = W0.iId);
The Info table has 30M rows and the User table has 158M rows. For some reason, this query is taking too long on one of my prod setups. At first glance it looks like it's reading/hitting too many buffers:
Insert on Result (cost=152813.60..15012246.06 rows=31198136 width=1080) (actual time=5126063.502..5126063.502 rows=0 loops=1)
Buffers: shared hit=128815094 read=6103564 dirtied=599445 written=2088037
I/O Timings: read=2563306.517 write=570919.940
-> Merge Join (cost=152813.60..15012246.06 rows=31198136 width=1080) (actual time=0.097..5060947.922 rows=31191937 loops=1)
Merge Cond: (w0.iid = n1.iprofileid)
Buffers: shared hit=96480126 read=5574864 dirtied=70745 written=2009998
I/O Timings: read=2563298.981 write=562810.833
-> Index Scan using user_idx on info w0 (cost=0.56..2984094.60 rows=31198136 width=35) (actual time=0.012..246299.026 rows=31191937 loops=1)
Buffers: shared hit=481667 read=2490602 written=364347
I/O Timings: read=178000.987 write=38663.457
-> Index Scan using profile_id on user n1 (cost=0.57..14938848.88 rows=158842848 width=32) (actual time=0.020..4718272.082 rows=115378606 loops=1)
Buffers: shared hit=95998459 read=3084262 dirtied=70745 written=1645651
I/O Timings: read=2385297.994 write=524147.376
Planning Time: 11.531 ms
Execution Time: 5126063.577 ms
When I ran this query on a different setup with similar tables and numbers of records, the profile_id scan only used 5M pages (and ran in 3 min), whereas here it used (read+hit) 100M buffers (and ran in 1.45 h). When I checked with VACUUM VERBOSE, this table only has 10M pages:
INFO: "User": found 64647 removable, 109184385 nonremovable row versions in 6876625 out of 10546400 pages
This is one of the good runs, but we've seen this query take up to 4-5 hours as well. My test system, which ran in under 3 minutes, also had iid values distributed across the profile_id range, but it had fewer columns and indexes than the prod system. What could be the reason for this slowness?
The execution plan you are showing has a lot of dirtied and written pages. That indicates that the tables were freshly inserted, and your query was the first reader.
In PostgreSQL, the first reader of a new table row consults the commit log to see if that row is visible or not (did the transaction that created it commit?). It then sets flags in the row (the so-called hint bits) to save the next reader that trouble.
Setting the hint bits modifies the row, so the block is dirtied and has to be written to disk eventually. That writing is normally done by the checkpointer or the background writer, but they couldn't keep up, so the query had to clean out many dirty pages itself.
If you run the query a second time, it will be faster. For that reason, it is a good idea to VACUUM tables after bulk loading, which will also set the hint bits.
However, a large query like that will always be slow. Things you can try to speed it up further are:
have lots of RAM and load the tables into shared buffers with pg_prewarm
crank up work_mem in the hope to get a faster hash join
CLUSTER the tables using the indexes, so that heap fetches become more efficient
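A sketch of those suggestions (the table and index names are taken from the question; pg_prewarm requires installing the extension):

```sql
-- Set hint bits and refresh statistics right after the bulk load:
VACUUM (ANALYZE) info;

-- Load the table into shared buffers ahead of the big query:
CREATE EXTENSION IF NOT EXISTS pg_prewarm;
SELECT pg_prewarm('info');

-- Give this session more memory for the join (value is illustrative):
SET work_mem = '256MB';

-- Physically reorder the table along the index used for the merge join:
CLUSTER info USING user_idx;
```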

Why does a pg query stop using an index after a while?

I have this query in Postgres 12.0:
SELECT "articles"."id"
FROM "articles"
WHERE ((jsonfields ->> 'etat') = '0' OR (jsonfields ->> 'etat') = '1' OR (jsonfields ->> 'etat') = '2')
ORDER BY ordre ASC;
At this time:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=1274.09..1274.97 rows=354 width=8) (actual time=13.000..13.608 rows=10435 loops=1)
Sort Key: ordre
Sort Method: quicksort Memory: 874kB
-> Bitmap Heap Scan on articles (cost=15.81..1259.10 rows=354 width=8) (actual time=1.957..10.807 rows=10435 loops=1)
Recheck Cond: (((jsonfields ->> 'etat'::text) = '1'::text) OR ((jsonfields ->> 'etat'::text) = '2'::text) OR ((jsonfields ->> 'etat'::text) = '0'::text))
Heap Blocks: exact=6839
-> BitmapOr (cost=15.81..15.81 rows=356 width=0) (actual time=1.171..1.171 rows=0 loops=1)
-> Bitmap Index Scan on myidx (cost=0.00..5.18 rows=119 width=0) (actual time=0.226..0.227 rows=2110 loops=1)
Index Cond: ((jsonfields ->> 'etat'::text) = '1'::text)
-> Bitmap Index Scan on myidx (cost=0.00..5.18 rows=119 width=0) (actual time=0.045..0.045 rows=259 loops=1)
Index Cond: ((jsonfields ->> 'etat'::text) = '2'::text)
-> Bitmap Index Scan on myidx (cost=0.00..5.18 rows=119 width=0) (actual time=0.899..0.899 rows=8066 loops=1)
Index Cond: ((jsonfields ->> 'etat'::text) = '0'::text)
Planning Time: 0.382 ms
Execution Time: 14.234 ms
(15 lignes)
After a while:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=7044.04..7079.35 rows=14127 width=8) (actual time=613.445..614.679 rows=15442 loops=1)
Sort Key: ordre
Sort Method: quicksort Memory: 1108kB
-> Seq Scan on articles (cost=0.00..6070.25 rows=14127 width=8) (actual time=0.060..609.477 rows=15442 loops=1)
Filter: (((jsonfields ->> 'etat'::text) = '1'::text) OR ((jsonfields ->> 'etat'::text) = '2'::text) OR ((jsonfields ->> 'etat'::text) = '3'::text))
Rows Removed by Filter: 8288
Planning Time: 0.173 ms
Execution Time: 615.744 ms
(8 lignes)
I need to re-create the index:
DROP INDEX myidx;
CREATE INDEX myidx ON articles ( (jsonfields->>'etat') );
Why? How to fix this?
I tried decreasing the memory settings to discourage the sequential scan. It didn't work.
I tried running select pg_stat_reset();. It didn't work either.
pg_stat_reset() does not reset table statistics. It only resets counters (like how often an index was used), it has no effects on query plans.
To update table statistics, use ANALYZE (or VACUUM ANALYZE, while being at it). autovacuum should take care of this automatically, normally.
Your first query finds rows=10435, your second query finds rows=15442. Postgres expects to find rows=354 (!) in the first, but rows=14127 in the second. It largely under-estimates the number of result rows in the first, which favours indexes. So your first query was only fast by accident.
Table statistics have changed, there may be table and index bloat. Most importantly, your cost settings are probably misleading. Consider a lower setting for random_page_cost (and possibly for cpu_index_tuple_cost and others).
Related:
Keep PostgreSQL from sometimes choosing a bad query plan
If recreating the index leads to a different query plan, the index may have been bloated. (A bloated index would also discourage Postgres from using it.) More aggressive autovacuum settings (globally, just for the table, or even just for the index) may help.
Also, expression indexes introduce additional statistics (the essential one on jsonfields->>'etat' in your case). Dropping the index drops those, too. And the new expression index starts out with empty statistics which are filled with the next manual ANALYZE or by autovacuum. So, typically, you should run ANALYZE on the table after creating an expression index - except that in your case you currently only seem to get the fast query when based on misleading stats, so fix that first.
Maybe revisit your database design. Does that etat value really have to be nested in a JSON column? Might be a lot cheaper overall to have it as separate column.
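If that is an option, a hypothetical migration could look like this (the column and index names are invented for illustration):

```sql
-- Extract the value into a plain column and backfill it:
ALTER TABLE articles ADD COLUMN etat text;
UPDATE articles SET etat = jsonfields->>'etat';

-- Index the plain column and refresh statistics:
CREATE INDEX articles_etat_idx ON articles (etat);
ANALYZE articles;

-- Queries then filter on the plain column:
SELECT id FROM articles WHERE etat IN ('0', '1', '2') ORDER BY ordre;
```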
Be that as it may, the most expensive part of your first (fast) query plan is the Bitmap Heap Scan, where Postgres reads actual data pages to return id values. A shortcut with a "covering" index would be possible since Postgres 11:
CREATE INDEX myidx ON articles ((jsonfields->>'etat')) INCLUDE (ordre, id);
But this relies on autovacuum doing its job in timely manner even more, as it requires the visibility map to be up to date.
Or, if your WHERE clause is constant (always filtering for (jsonfields ->> 'etat') = ANY ('{0,1,2}')), a partial index would reign supreme:
CREATE INDEX myidx ON articles (ordre, id)
WHERE (jsonfields ->> 'etat') = ANY ('{0,1,2}');
Immediately after you create the functional index, there are no statistics gathered on it, so PostgreSQL has to make generic assumptions. Once auto-analyze has had a chance to run, it has real stats to work with. Now it turns out that the more accurate estimates actually lead to a worse plan, which is rather unfortunate.
The PostgreSQL planner generally assumes much of our data is not in cache. This assumption pushes it to favor seq scans over index scan when it will be returning a large number of rows (Your second plan is returning 2/3 of the table!). The reasons it makes this assumption is that it is safer. Assuming too little data is cached leads to merely bad plans, but assuming too much is cached leads to utterly catastrophic plans.
In general, the amount of data assumed to be cached is baked into the random_page_cost setting, so you can tweak that setting if you want. (Baking it into that setting, rather than having a separate setting, was a poor design decision in my opinion, but it was made a very long time ago.)
You could set random_page_cost equal to seq_page_cost, to see if that solves the problem. But that is probably not a change you would want to make permanently, as it is likely to create more problems than it solves. Perhaps the correct setting is lower than the default but still higher than seq_page_cost. You should also do EXPLAIN (ANALYZE, BUFFERS), and set track_io_timing = on, to give you more information to use in evaluating this.
Another issue is that the bitmap heap scan never needs to consult the actual JSON data; it gets everything it needs from the index. The seq scan does need to consult the JSON data, and how slow that is depends on things like what type it is (json or jsonb) and how much other stuff is in that JSON. PostgreSQL rather ridiculously thinks that parsing a JSON document takes about the same amount of time as comparing two integers.
You can more or less fix this problem (for json type) by running the following statement:
UPDATE pg_proc SET procost = 100 WHERE proname = 'json_object_field_text';
(This is imperfect, as the cost of this function gets charged to the recheck condition of the bitmap heap scan even though no recheck is done. But the recheck is charged for each tuple expected to be returned, not each tuple expected to be in the table, so this creates a distinction you can take advantage of).

PostgreSQL GIN index slower than GIST for pg_trgm?

Despite what all the documentation says, I'm finding GIN indexes to be significantly slower than GIST indexes for pg_trgm related searches. This is on a table of 25 million rows with a relatively short text field (average length of 21 characters). Most of the rows of text are addresses of the form "123 Main st, City".
GIST index takes about 4 seconds with a search like
select suggestion from search_suggestions where suggestion % 'seattle';
But GIN takes 90 seconds and the following result when running with EXPLAIN ANALYZE:
Bitmap Heap Scan on search_suggestions (cost=330.09..73514.15 rows=25043 width=22) (actual time=671.606..86318.553 rows=40482 loops=1)
Recheck Cond: ((suggestion)::text % 'seattle'::text)
Rows Removed by Index Recheck: 23214341
Heap Blocks: exact=7625 lossy=223807
-> Bitmap Index Scan on tri_suggestions_idx (cost=0.00..323.83 rows=25043 width=0) (actual time=669.841..669.841 rows=1358175 loops=1)
Index Cond: ((suggestion)::text % 'seattle'::text)
Planning time: 1.420 ms
Execution time: 86327.246 ms
Note that over a million rows are being selected by the index, even though only 40k rows actually match. Any ideas why this is performing so poorly? This is on PostgreSQL 9.4.
Some issues stand out:
First, consider upgrading to a current version of Postgres. At the time of writing that's pg 9.6 or pg 10 (currently beta). Since Pg 9.4 there have been multiple improvements for GIN indexes, the additional module pg_trgm and big data in general.
Next, you need much more RAM, in particular a higher work_mem setting. I can tell from this line in the EXPLAIN output:
Heap Blocks: exact=7625 lossy=223807
"lossy" in the details for a Bitmap Heap Scan (with your particular numbers) indicates a dramatic shortage of work_mem. Postgres only collects block addresses in the bitmap index scan instead of row pointers because that's expected to be faster with your low work_mem setting (can't hold exact addresses in RAM). Many more non-qualifying rows have to be filtered in the following Bitmap Heap Scan this way. This related answer has details:
“Recheck Cond:” line in query plans with a bitmap index scan
But don't set work_mem too high without considering the whole situation:
Optimize simple query using ORDER BY date and text
There may be other problems, like index or table bloat, or more configuration bottlenecks. But if you fix just these two items, the query should be much faster already.
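For instance, work_mem can be raised for the current session only, and the effect checked in the plan (the value is illustrative; with enough memory the "lossy" heap blocks should disappear):

```sql
SET work_mem = '128MB';

EXPLAIN (ANALYZE, BUFFERS)
SELECT suggestion FROM search_suggestions WHERE suggestion % 'seattle';
```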
Also, do you really need to retrieve all 40k rows in the example? You probably want to add a small LIMIT to the query and make it a "nearest-neighbor" search, in which case a GiST index is the better choice after all, since nearest-neighbor searches are supposed to be faster with GiST. Example:
Best index for similarity function
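A sketch of such a nearest-neighbor search, using the pg_trgm distance operator <-> (which is supported by GiST trigram indexes, not by GIN):

```sql
-- A GiST trigram index supports ordering by similarity distance:
CREATE INDEX trgm_gist_idx ON search_suggestions
USING gist (suggestion gist_trgm_ops);

-- Return only the closest matches instead of all ~40k rows:
SELECT suggestion
FROM search_suggestions
ORDER BY suggestion <-> 'seattle'
LIMIT 10;
```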

Why PostgreSQL queries are slower in the first request after first new connection than during the subsequent requests?

I'm using several different technologies to connect to a PostgreSQL database. The first request might take 1.5 seconds; the exact same query takes 0.03 seconds the second time. If I open a second instance of my application (connecting to the same database), its first request takes 1.5 seconds and the second 0.03 seconds.
Because of the different technologies we are using, they connect at different points and use different connection methods, so I really don't think it has anything to do with any code I have written.
I'm thinking that opening a connection doesn't do 'everything' until the first request, so that request has some overhead.
Because I have used the database and kept the server up, everything is in memory, so indexes and the like should not be an issue.
Edit: EXPLAIN tells me about the query, and honestly the query looks pretty good (indexed, etc.). I really think PostgreSQL has some kind of overhead on the first query of a new connection.
I don't know how to prove or disprove that. If I use pgAdmin III (version 1.12.3), all the queries seem fast. With any of the other tools I have, the first query is slow. Most of the time it's not noticeably slower, and when it was I always chalked it up to the index being loaded into RAM. But this is clearly not that: if I open my tool(s) and run any other query that returns results, the second query is fast regardless. If the first query doesn't return results, then the second is still slow and the third is fast.
edit 2
Even though I don't think the query has anything to do with the delay (every first query is slow), here are two results from running EXPLAIN ANALYZE:
EXPLAIN ANALYZE
select * from company
where company_id = 39
Output:
"Seq Scan on company (cost=0.00..1.26 rows=1 width=54) (actual time=0.037..0.039 rows=1 loops=1)"
" Filter: (company_id = 39)"
"Total runtime: 0.085 ms"
and:
EXPLAIN ANALYZE
select * from group_devices
where device_name ilike 'html5_demo'
and group_id in ( select group_id from manager_groups
where company_id in (select company_id from company where company_name ='TRUTHPT'))
output:
Nested Loop Semi Join  (cost=1.26..45.12 rows=1 width=115) (actual time=1.947..2.457 rows=1 loops=1)
  Join Filter: (group_devices.group_id = manager_groups.group_id)
  ->  Seq Scan on group_devices  (cost=0.00..38.00 rows=1 width=115) (actual time=0.261..0.768 rows=1 loops=1)
        Filter: ((device_name)::text ~~* 'html5_demo'::text)
  ->  Hash Semi Join  (cost=1.26..7.09 rows=9 width=4) (actual time=0.297..1.596 rows=46 loops=1)
        Hash Cond: (manager_groups.company_id = company.company_id)
        ->  Seq Scan on manager_groups  (cost=0.00..5.53 rows=509 width=8) (actual time=0.003..0.676 rows=469 loops=1)
        ->  Hash  (cost=1.26..1.26 rows=1 width=4) (actual time=0.035..0.035 rows=1 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 1kB
              ->  Seq Scan on company  (cost=0.00..1.26 rows=1 width=4) (actual time=0.025..0.027 rows=1 loops=1)
                    Filter: ((company_name)::text = 'TRUTHPT'::text)
Total runtime: 2.566 ms
I have observed the same behavior. If I start a new connection and run a query multiple times, the first execution is about 25% slower than the following executions. (The query had been run earlier in other connections, and I verified that no disk I/O was involved.) I profiled the process with perf during the first execution: a lot of time was spent handling page faults. When I profiled the second execution, there were no page faults. AFAICT these are what are called minor/soft page faults. They happen when a process accesses a page in shared memory for the first time; at that point, the process needs to map the page into its virtual address space (see https://en.wikipedia.org/wiki/Page_fault). If the page needs to be read from disk, it is called a major/hard page fault.
This explanation also fits with other observations that I have made: If I later run a different query in the same connection, the amount of overhead for its first execution seems to depend on how much overlap there is with the data accessed by the first query.
This is a very old question, but hopefully this may help.
First query
It doesn't seem like an index is being used, and the optimizer is resorting to a sequential scan of the table.
While scanning the table sequentially, the optimizer may cache the entire table in RAM, if the data fits into the buffer. See this article for more information.
Why the buffering is occurring for each connection I don't know. Regardless, a sequential scan is not desirable for this kind of query and can be avoided with correct indexing and statistics.
Check the structure of the company table. Make sure that company_id is part of a UNIQUE INDEX or PRIMARY KEY.
Make sure you run ANALYZE, so that the optimizer has the correct statistics. This will help to ensure that the index for company will be used in your queries instead of a sequential scan of the table.
See PostgreSQL documentation
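For example (assuming company_id is indeed meant to uniquely identify rows in company):

```sql
-- Give the planner a unique index it can use:
ALTER TABLE company ADD PRIMARY KEY (company_id);

-- Refresh the planner's statistics:
ANALYZE company;
```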
Second query
Try using INNER JOIN to keep the optimizer from selecting a Hash Semi Join, to get more consistent performance and a simpler EXPLAIN plan:
select gd.*
from group_devices gd
inner join manager_groups mg on mg.group_id = gd.group_id
inner join company c on c.company_id = mg.company_id
where gd.device_name like 'html5_demo%'
and c.company_name = 'TRUTHPT';
See related question
First request will read blocks from disk to buffers.
Second request will read from buffers.
It doesn't matter how many connections are made; the result depends on whether that query has already been parsed.
Please note that changing literals will cause the query to be re-parsed.
Also note that if the query hasn't been executed in a while, physical reads may still occur, depending on many variables.

Postgres on AWS performance issues on filter or aggregate

I'm working on a system that has a table with approx. 13 million records.
That does not seem like a big deal for Postgres, but I'm facing serious performance issues when hitting this particular table.
The table has approx. 60 columns (I know that's too many, but I can't change it for reasons beyond my control).
Hardware isn't the problem. It's running on AWS, and I tested several configurations, even the new RDS for Postgres:
            arch    vCPU  ECU   mem (GB)
m1.xlarge   64-bit  4     8     15
m2.xlarge   64-bit  2     6.5   17
hs1.8xlarge 64-bit  16    35    117 (SSD)
I tuned the Postgres settings with pgtune, and also set Ubuntu's kernel shmall and shmmax.
Some "explain analyze" queries:
For select count(*):
Aggregate  (cost=350390.93..350390.94 rows=1 width=0) (actual time=24864.287..24864.288 rows=1 loops=1)
-> Index Only Scan using core_eleitor_sexo on core_eleitor (cost=0.00..319722.17 rows=12267505 width=0) (actual time=0.019..12805.292 rows=12267505 loops=1)
Heap Fetches: 9676
Total runtime: 24864.325 ms
For select distinct core_eleitor_city (filtered to one city):
HashAggregate (cost=159341.37..159341.39 rows=2 width=516) (actual time=15965.740..15966.090 rows=329 loops=1)
-> Bitmap Heap Scan on core_eleitor (cost=1433.53..159188.03 rows=61338 width=516) (actual time=956.590..9021.457 rows=5886110 loops=1)
Recheck Cond: ((core_eleitor_city)::text = 'RIO DE JANEIRO'::text)
-> Bitmap Index Scan on core_eleitor_city (cost=0.00..1418.19 rows=61338 width=0) (actual time=827.473..827.473 rows=5886110 loops=1)
Index Cond: ((core_eleitor_city)::text = 'RIO DE JANEIRO'::text)
Total runtime: 15977.460 ms
I have btree indexes on columns frequently used for filter or aggregations.
So, given I can't change my table design. Is there something I can do to improve the performance?
Any help would be awesome.
Thanks
You're aggregating ~12.3M and ~5.9M rows on a VPS cluster that, if I am not mistaken, might span multiple physical servers, with data that is probably pulled from a SAN on yet another set of servers than Postgres itself.
Imho, there is little you can do to make it faster (on AWS, anyway), besides a) not running queries that visit essentially the entire table to begin with, and b) maintaining a pre-count using triggers if you persist in doing so.
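A sketch of such a trigger-maintained pre-count (all object names here are hypothetical):

```sql
-- A one-row table holding the current count:
CREATE TABLE core_eleitor_count (n bigint NOT NULL);
INSERT INTO core_eleitor_count SELECT count(*) FROM core_eleitor;

-- Keep it up to date on every insert and delete:
CREATE FUNCTION adjust_count() RETURNS trigger AS $$
BEGIN
  IF TG_OP = 'INSERT' THEN
    UPDATE core_eleitor_count SET n = n + 1;
  ELSE
    UPDATE core_eleitor_count SET n = n - 1;
  END IF;
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER core_eleitor_count_trg
AFTER INSERT OR DELETE ON core_eleitor
FOR EACH ROW EXECUTE PROCEDURE adjust_count();

-- count(*) then becomes a single-row read:
SELECT n FROM core_eleitor_count;
```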
To improve read performance on RDS, you can use read replicas, as described here:
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ReadRepl.html
Amazon RDS uses MySQL’s built-in replication functionality to create a special type of DB instance called a read replica from a source DB instance. Updates made to the source DB instance are copied to the read replica. You can reduce the load on your source DB instance by routing read queries from your applications to the read replica. Read replicas allow you to elastically scale out beyond the capacity constraints of a single DB instance for read-heavy database workloads.