I've got a simple table with a single-column PRIMARY KEY called id, of type serial. There are exactly 100,000,000 rows in it. The table takes 48GB and the PK index about 2.1GB. The machine is dedicated to Postgres and is something like a Core i5 with a 500GB HDD and 8GB RAM. The Pg config was created by the pgtune utility (shared buffers about 2GB, effective cache about 7GB). The OS is Ubuntu Server 14.04, Postgres 9.3.6.
Why are both SELECT count(id) and SELECT count(*) so slow in this simple case (about 11 minutes)?
Why is the PostgreSQL planner choosing a full table scan instead of an index scan, which should be at least 25 times faster (even in the case where it would have to read the whole index from the HDD)? Or where am I wrong?
Btw, running the query several times in a row does not change anything; it still takes about 11 minutes.
Execution plan here:
Aggregate (cost=7500001.00..7500001.01 rows=1 width=0) (actual time=698316.978..698316.979 rows=1 loops=1)
Buffers: shared hit=192 read=6249809
-> Seq Scan on transaction (cost=0.00..7250001.00 rows=100000000 width=0) (actual time=0.009..680594.049 rows=100000001 loops=1)
Buffers: shared hit=192 read=6249809
Total runtime: 698317.044 ms
Considering that the spec of an HDD is usually somewhere between 50MB/s and 100MB/s, for 48GB I would expect everything to be read in between 500 and 1000 s.
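The estimate above can be checked against the plan itself. This is a rough sketch only: it assumes the default 8 kB PostgreSQL block size (not stated in the question) and decimal megabytes, and takes the buffer counts and runtime from the EXPLAIN output.

```python
# Figures copied from the question and its EXPLAIN (ANALYZE, BUFFERS) output.
PAGE_BYTES = 8192                      # assumption: default 8 kB block size
pages_hit, pages_read = 192, 6249809   # "Buffers: shared hit=192 read=6249809"
runtime_s = 698317.044 / 1000.0        # "Total runtime: 698317.044 ms"


def scan_seconds(size_gb, mb_per_s):
    """Seconds to read size_gb gigabytes sequentially at mb_per_s (decimal units)."""
    return size_gb * 1000.0 / mb_per_s


expected_fast = scan_seconds(48, 100)  # best case of the asker's range
expected_slow = scan_seconds(48, 50)   # worst case of the asker's range

scanned_gib = (pages_hit + pages_read) * PAGE_BYTES / 2**30
observed_mb_per_s = pages_read * PAGE_BYTES / 1e6 / runtime_s

print(f"expected: {expected_fast:.0f} to {expected_slow:.0f} s")
print(f"observed: {runtime_s:.0f} s, {scanned_gib:.1f} GiB scanned, "
      f"{observed_mb_per_s:.0f} MB/s from disk")
```

Under those assumptions the measured 11 minutes falls inside the expected range, which suggests the scan is simply running at the speed of the disk.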
Since you have no WHERE clause, the planner sees that you are interested in the large majority of records, so it does not use the index, as this would require reading index pages in addition to the table. The reason PostgreSQL cannot answer the count from the index alone lies in MVCC, which PostgreSQL uses for transaction consistency: the rows themselves have to be visited to check visibility and ensure accurate results (see https://wiki.postgresql.org/wiki/Slow_Counting).
The cache, CPU, etc. will not affect this, nor will changing the caching settings. This is IO bound, and the cache will be completely trashed after the query.
If you can live with an approximation you can use the reltuples field in the table metadata:
SELECT reltuples FROM pg_class WHERE relname = 'tbl';
Since this is just a single row this is blazing fast.
Update: since 9.2, a new way of storing the visibility information has allowed index-only counts. However, there are quite a few caveats, especially in the case where there is no predicate to limit the rows. See https://wiki.postgresql.org/wiki/Index-only_scans for more details.
Related
Despite what all the documentation says, I'm finding GIN indexes to be significantly slower than GIST indexes for pg_trgm related searches. This is on a table of 25 million rows with a relatively short text field (average length of 21 characters). Most of the rows of text are addresses of the form "123 Main st, City".
GIST index takes about 4 seconds with a search like
select suggestion from search_suggestions where suggestion % 'seattle';
But GIN takes 90 seconds and gives the following result when run with EXPLAIN ANALYZE:
Bitmap Heap Scan on search_suggestions (cost=330.09..73514.15 rows=25043 width=22) (actual time=671.606..86318.553 rows=40482 loops=1)
Recheck Cond: ((suggestion)::text % 'seattle'::text)
Rows Removed by Index Recheck: 23214341
Heap Blocks: exact=7625 lossy=223807
-> Bitmap Index Scan on tri_suggestions_idx (cost=0.00..323.83 rows=25043 width=0) (actual time=669.841..669.841 rows=1358175 loops=1)
Index Cond: ((suggestion)::text % 'seattle'::text)
Planning time: 1.420 ms
Execution time: 86327.246 ms
Note that over a million rows are being selected by the index, even though only 40k rows actually match. Any ideas why this is performing so poorly? This is on PostgreSQL 9.4.
Some issues stand out:
First, consider upgrading to a current version of Postgres. At the time of writing that's pg 9.6 or pg 10 (currently beta). Since Pg 9.4 there have been multiple improvements for GIN indexes, the additional module pg_trgm and big data in general.
Next, you need much more RAM, in particular a higher work_mem setting. I can tell from this line in the EXPLAIN output:
Heap Blocks: exact=7625 lossy=223807
"lossy" in the details for a Bitmap Heap Scan (with your particular numbers) indicates a dramatic shortage of work_mem. Postgres only collects block addresses in the bitmap index scan instead of row pointers because that's expected to be faster with your low work_mem setting (can't hold exact addresses in RAM). Many more non-qualifying rows have to be filtered in the following Bitmap Heap Scan this way. This related answer has details:
“Recheck Cond:” line in query plans with a bitmap index scan
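To put numbers on how lossy the bitmap was, here is a small sketch using only the figures from the EXPLAIN ANALYZE output in the question. Treating "rows returned plus rows removed by recheck" as the number of rows examined is a simplifying assumption.

```python
# Figures copied from the EXPLAIN ANALYZE output in the question.
exact_blocks, lossy_blocks = 7625, 223807   # "Heap Blocks: exact=7625 lossy=223807"
rows_returned = 40482                       # rows=40482 on the Bitmap Heap Scan
rows_removed_by_recheck = 23214341          # "Rows Removed by Index Recheck"

# Share of visited heap blocks for which only the block address was kept.
lossy_share = lossy_blocks / (exact_blocks + lossy_blocks)

# Assumption: every examined row was either returned or removed by the recheck.
rows_examined = rows_returned + rows_removed_by_recheck
discarded_share = rows_removed_by_recheck / rows_examined

print(f"lossy blocks: {lossy_share:.1%}")
print(f"rows discarded after recheck: {discarded_share:.1%}")
```

Nearly all visited blocks were lossy and nearly all examined rows were thrown away, which is where most of the 86 seconds appears to go.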
But don't set work_mem too high without considering the whole situation:
Optimize simple query using ORDER BY date and text
There may be other problems, like index or table bloat or more configuration bottlenecks. But if you fix just these two items, the query should be much faster already.
Also, do you really need to retrieve all 40k rows in the example? You probably want to add a small LIMIT to the query and make it a "nearest-neighbor" search, in which case a GiST index is the better choice after all, because that kind of search is supposed to be faster with a GiST index. Example:
Best index for similarity function
I'm trying to understand how an index scan is actually performed.
EXPLAIN ANALYZE SELECT * FROM tbl WHERE id = 46983
Consider the following plan:
Index Scan using pk_tbl on tbl (cost=0.29..8.30 rows=1 width=1064) (actual time=0.012..0.013 rows=1 loops=1)
Index Cond: (id = 46983)
Planning time: 0.101 ms
Execution time: 0.050 ms
As far as I understand, the index scan process consists of two random page reads. In my case
SHOW random_page_cost
returns 4.
So, I guess we need to find the block that the row with id = 46983 is stored in (random access in the index) and then we need to read that block by its address (random access to the block in physical storage). That's clear: two random accesses actually occur. But on the wiki I read that
In data structures, direct access implies the ability to access any
entry in a list in constant time
But obviously traversing a balanced tree doesn't have constant-time complexity, because it depends on the depth of the tree.
So how is it correct to say that requesting a block of the index is actually random access?
The reason is that indexes in databases are normally stored as B-trees or B+trees, n-ary tree structures with a variable but very large number of children per node. A typical tree of this kind with three or four levels can address millions of records, and almost certainly at least the root is kept in the cache (buffer pool), so that a typical access for a random key has a cost in the order of 1 or 2 disk accesses. For this reason, in the database field (and when costs are estimated) the access to a B-tree index is considered as a small constant.
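A quick sketch of why the depth stays so small. The fanout of 300 children per node is a hypothetical round number chosen for illustration, not a PostgreSQL constant; the real value depends on key size and page size.

```python
def btree_levels(n_keys, fanout):
    """Smallest number of levels whose capacity (fanout ** levels) covers n_keys."""
    levels, capacity = 1, fanout
    while capacity < n_keys:
        capacity *= fanout
        levels += 1
    return levels


FANOUT = 300  # hypothetical number of children per node

for n_keys in (10**6, 10**8, 10**10):
    print(f"{n_keys:>14,} keys -> {btree_levels(n_keys, FANOUT)} levels")
```

Going from a million keys to ten billion adds only two levels, so with the upper levels cached the number of disk reads per lookup barely changes. That is why the cost is treated as a small constant.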
I have a vanilla postgres database running on a small server with only one table called "posts".
The table is on the order of ~5GB and contains 9 million rows.
When I run a simple sequential scan operation it takes about 51 seconds!:
EXPLAIN ANALYZE select count(*) from posts;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=988701.41..988701.42 rows=1 width=0) (actual time=51429.607..51429.607 rows=1 loops=1)
-> Seq Scan on posts (cost=0.00..966425.33 rows=8910433 width=0) (actual time=0.004..49530.025 rows=9333639 loops=1)
Total runtime: 51429.639 ms
(3 rows)
The server specs:
Xeon E3-1220v2
4GB RAM
500GB hard drive (stock 7200rpm, No RAID)
postgres 9.1
Ubuntu 12.04
No L1 or L2 cache
Postgres runs on 1 of 4 cores
Postgres configuration is standard, nothing special
I have isolated the server, and nothing else significant is running on the server
When the query runs, the disk is being read at a rate of ~122M/s (according to iotop) with an "IO>" of ~90%. Only 1 core is in use, at 12% of its capacity. It looks like little to no memory is used in this operation, maybe ~5MB.
From these statistics it sounds like the bottleneck is IO, but I'm confused because the disk is capable of reading way faster (from a speed test I did using sudo hdparm -Tt /dev/sda I was getting about 10,000M/s), while at the same time iotop is showing a value of 90%, which I don't fully understand.
Your disk certainly does not read at 10GB/sec :) This is cached performance. The hardware is maxed out here. 120MB/sec is a typical sequential rate.
I see no indication of a hardware problem. The hardware is being used maximally efficiently.
51sec * 120MB/sec ~ 6GB
You say the table is 5GB in size. Probably it is more like 6GB.
The numbers make sense. No problem here.
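The table size can also be cross-checked from the plan's cost figures. This sketch assumes the default planner settings (seq_page_cost = 1.0, cpu_tuple_cost = 0.01) and the default 8 kB block size, which seems plausible since the question describes the configuration as standard; with other settings the derived page count would be wrong.

```python
# Assumptions: default planner cost settings and default 8 kB block size.
SEQ_PAGE_COST = 1.0
CPU_TUPLE_COST = 0.01
PAGE_BYTES = 8192

# Figures copied from the EXPLAIN ANALYZE output in the question.
seq_scan_cost = 966425.33
estimated_rows = 8910433
runtime_s = 51429.639 / 1000.0
iotop_mb_per_s = 122

# An unfiltered Seq Scan is costed as pages * seq_page_cost + rows * cpu_tuple_cost.
pages = round((seq_scan_cost - estimated_rows * CPU_TUPLE_COST) / SEQ_PAGE_COST)
table_gb = pages * PAGE_BYTES / 1e9

# The back-of-the-envelope check: runtime times observed read rate.
read_gb = runtime_s * iotop_mb_per_s / 1000.0

print(f"planner page count: {pages} (~{table_gb:.1f} GB)")
print(f"runtime x read rate: ~{read_gb:.1f} GB")
```

Both figures land above the 5GB quoted in the question, which supports the reading that the scan is limited by sequential disk throughput.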
I'm working on a system which has a table with approx. 13 million records.
It does not appear to be a big deal for Postgres, but I'm facing serious performance issues when hitting this particular table.
The table has approx. 60 columns (I know it's too many, but I can't change it for reasons beyond my control).
Hardware isn't the problem. It's running on AWS. I tested several configurations, even the new RDS for Postgres:
instance     arch     vCPU  ECU  mem (GB)  storage
m1.xlarge    64 bits  4     8    15
m2.xlarge    64 bits  2     6.5  17
hs1.8xlarge  64 bits  16    35   117       SSD
I tuned the pg settings with pgtune, and also set Ubuntu's kernel shmall and shmmax.
Some "explain analyze" queries:
select count(*) from:
Aggregate (cost=350390.93..350390.94 rows=1 width=0) (actual time=24864.287..24864.288 rows=1 loops=1)
-> Index Only Scan using core_eleitor_sexo on core_eleitor (cost=0.00..319722.17 rows=12267505 width=0) (actual time=0.019..12805.292 rows=12267505 loops=1)
Heap Fetches: 9676
Total runtime: 24864.325 ms
select distinct core_eleitor_city from:
HashAggregate (cost=159341.37..159341.39 rows=2 width=516) (actual time=15965.740..15966.090 rows=329 loops=1)
-> Bitmap Heap Scan on core_eleitor (cost=1433.53..159188.03 rows=61338 width=516) (actual time=956.590..9021.457 rows=5886110 loops=1)
Recheck Cond: ((core_eleitor_city)::text = 'RIO DE JANEIRO'::text)
-> Bitmap Index Scan on core_eleitor_city (cost=0.00..1418.19 rows=61338 width=0) (actual time=827.473..827.473 rows=5886110 loops=1)
Index Cond: ((core_eleitor_city)::text = 'RIO DE JANEIRO'::text)
Total runtime: 15977.460 ms
I have btree indexes on columns frequently used for filter or aggregations.
So, given that I can't change my table design, is there something I can do to improve the performance?
Any help would be awesome.
Thanks
You're aggregating ~12.3M and ~5.9M rows on a VPS cluster that, if I am not mistaken, might span multiple physical servers, with data that is probably pulled from a SAN on yet another set of servers, separate from the one running Postgres itself.
Imho, there's little you can do to make it faster (on AWS anyway), besides a) not running queries that basically visit the entire database table to begin with and b) maintaining a pre-count using triggers, if possible, if you persist in doing so.
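To illustrate option b), here is a minimal sketch of a trigger-maintained counter. It runs against an in-memory SQLite database from Python's standard library, purely so that it is self-contained; it is not PostgreSQL syntax (there the trigger body would live in a trigger function), and the table and trigger names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE eleitor (id INTEGER PRIMARY KEY, city TEXT);

-- Hypothetical side table holding the pre-computed row count.
CREATE TABLE eleitor_count (n INTEGER NOT NULL);
INSERT INTO eleitor_count VALUES (0);

-- Keep the counter in step with every insert and delete.
CREATE TRIGGER eleitor_ins AFTER INSERT ON eleitor
BEGIN
    UPDATE eleitor_count SET n = n + 1;
END;

CREATE TRIGGER eleitor_del AFTER DELETE ON eleitor
BEGIN
    UPDATE eleitor_count SET n = n - 1;
END;
""")

conn.executemany(
    "INSERT INTO eleitor (city) VALUES (?)",
    [("RIO DE JANEIRO",), ("NITEROI",), ("RIO DE JANEIRO",)],
)
conn.execute("DELETE FROM eleitor WHERE city = 'NITEROI'")

# Reading the counter touches one row instead of scanning the table.
precount = conn.execute("SELECT n FROM eleitor_count").fetchone()[0]
actual = conn.execute("SELECT count(*) FROM eleitor").fetchone()[0]
print(precount, actual)
```

The trade-off is that every insert and delete now also updates the single counter row, so write-heavy workloads pay for the cheap read.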
Here you go for improving the performance on RDS:
As referred to link here:
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ReadRepl.html
Amazon RDS uses MySQL’s built-in replication functionality to create a special type of DB instance called a read replica from a source DB instance. Updates made to the source DB instance are copied to the read replica. You can reduce the load on your source DB instance by routing read queries from your applications to the read replica. Read replicas allow you to elastically scale out beyond the capacity constraints of a single DB instance for read-heavy database workloads.
I have been trying out postgres 9.3 running on an Azure VM on Windows Server 2012. I was originally running it on a 7GB server... I am now running it on a 14GB Azure VM. I went up a size when trying to solve the problem described below.
I am quite new to PostgreSQL by the way, so I am only getting to know the configuration options bit by bit. Also, while I'd love to run it on Linux, I and my colleagues simply don't have the expertise to address issues when things go wrong in Linux, so Windows is our only option.
Problem description:
I have a table called test_table; it currently stores around 90 million rows. It will grow by around 3-4 million rows per month. There are 2 columns in test_table:
id (bigserial)
url (character varying(300))
I created indexes after importing the data from a few CSV files. Both columns are indexed.... the id is the primary key. The index on the url is a normal btree created using the defaults through pgAdmin.
When I ran:
SELECT sum(((relpages*8)/1024)) as MB FROM pg_class WHERE reltype=0;
... The total size is 5980MB
The individual sizes of the 2 indexes in question are as follows; I got them by running:
# SELECT relname, ((relpages*8)/1024) as MB, reltype FROM pg_class WHERE
reltype=0 ORDER BY relpages DESC LIMIT 10;
relname | mb | reltype
----------------------------------+------+--------
test_url_idx | 3684 | 0
test_pk | 2161 | 0
There are other indexes on other smaller tables, but they are tiny (< 5MB).... so I ignored them here
The trouble when querying the test_table using the url, particularly when using a wildcard in the search, is the speed (or lack of it). e.g.
select * from test_table where url like 'orange%' limit 20;
...would take anything from 20-40 seconds to run.
Running explain analyze on the above gives the following:
# explain analyze select * from test_table where
url like 'orange%' limit 20;
QUERY PLAN
-----------------------------------------------------------------
Limit (cost=0.00..4787.96 rows=20 width=57) (actual time=0.304..1898.583 rows=20 loops=1)
-> Seq Scan on test_table (cost=0.00..2303247.60 rows=9621 width=57) (actual time=0.302..1898.542 rows=20 loops=1)
Filter: ((url)::text ~~ 'orange%'::text)
Rows Removed by Filter: 210286
Total runtime: 1898.650 ms
(5 rows)
Taking another example... this time with the wildcard between american and .com....
# explain select * from test_table where url
like 'american%.com' limit 50;
QUERY PLAN
-------------------------------------------------------
Limit (cost=0.00..11969.90 rows=50 width=57)
-> Seq Scan on test_table (cost=0.00..2303247.60 rows=9621 width=57)
Filter: ((url)::text ~~ 'american%.com'::text)
(3 rows)
# explain analyze select * from test_table where url
like 'american%.com' limit 50;
QUERY PLAN
-----------------------------------------------------
Limit (cost=0.00..11969.90 rows=50 width=57) (actual time=83.470..3035.696 rows=50 loops=1)
-> Seq Scan on test_table (cost=0.00..2303247.60 rows=9621 width=57) (actual time=83.467..3035.614 rows=50 loops=1)
Filter: ((url)::text ~~ 'american%.com'::text)
Rows Removed by Filter: 276142
Total runtime: 3035.774 ms
(5 rows)
I then went from a 7GB to a 14GB server. Query Speeds were no better.
Observations on the server
I can see that Memory usage never really goes beyond 2MB.
Disk reads go off the charts when running a query using a LIKE statement.
Query speed is perfectly fine when matching against the id (primary key)
The postgresql.conf file has had only a few changes from the defaults. Note that I took some of these suggestions from the following blog post: http://www.gabrielweinberg.com/blog/2011/05/postgresql.html.
Changes to conf:
shared_buffers = 512MB
checkpoint_segments = 10
(I changed checkpoint_segments as I got lots of warnings when loading in CSV files... although a production database will not be very write intensive so this can be changed back to 3 if necessary...)
cpu_index_tuple_cost = 0.0005
effective_cache_size = 10GB # recommendation in the blog post was 2GB...
On the server itself, in the Task Manager -> Performance tab, the following are probably the relevant bits for someone who can assist:
CPU: rarely over 2% (regardless of what queries are run... it hit 11% once when I was importing a 6GB CSV file)
Memory: 1.5/14.0GB (11%)
More details on Memory:
In use: 1.4GB
Available: 12.5GB
Committed 1.9/16.1 GB
Cached: 835MB
Paged Pool: 95.2MB
Non-paged pool: 71.2 MB
Questions
How can I ensure an index will sit in memory (providing it doesn't get too big for memory)? Is it just configuration tweaking I need here?
Is implementing my own search index (e.g. Lucene) a better option here?
Are the full-text indexing features in postgres going to improve performance dramatically, even if I can solve the index in memory issue?
Thanks for reading.
Those seq scans make it look like you didn't run analyze on the table after importing your data.
http://www.postgresql.org/docs/current/static/sql-analyze.html
During normal operation, scheduling to run vacuum analyze isn't useful, because the autovacuum periodically kicks in. But it is important when doing massive writes, such as during imports.
On a slightly related note, see this reversed index tip on Pavel's PostgreSQL Tricks site, if you ever need to run queries anchored at the end, rather than at the beginning, e.g. like '%.com'
http://postgres.cz/wiki/PostgreSQL_SQL_Tricks_I#section_20
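The idea behind the reversed index can be sketched outside the database: a suffix match on the original string is a prefix match on the reversed string, and a prefix match is a contiguous range in a sorted list. The sorted Python list below is only a stand-in for a btree index, and the sample URLs are invented.

```python
import bisect

urls = ["example.com", "orange.org", "apple.com", "orange.com", "pear.net"]

# Stand-in for an index on the reversed column: reversed strings, kept sorted.
reversed_index = sorted(u[::-1] for u in urls)


def ends_with(suffix):
    """Return the URLs ending in suffix via a range lookup on the reversed index."""
    prefix = suffix[::-1]
    lo = bisect.bisect_left(reversed_index, prefix)
    # Upper bound: the prefix followed by the largest code point
    # (good enough for a sketch, not airtight for arbitrary input).
    hi = bisect.bisect_left(reversed_index, prefix + "\U0010ffff")
    return sorted(r[::-1] for r in reversed_index[lo:hi])


matches = ends_with(".com")
print(matches)
```

Only the entries inside the located range are touched, instead of every string being tested against the pattern.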
Regarding your actual questions, be wary that some of the suggestions in that post you linked to are dubious at best. Changing the cost of index use is frequently dubious and disabling seq scan is downright silly. (Sometimes, it is cheaper to seq scan a table than it is to use an index.)
With that being said:
1. Postgres primarily caches indexes based on how often they're used, and it will not use an index if the stats suggest that it shouldn't -- hence the need to analyze after an import. Giving Postgres plenty of memory will, of course, increase the likelihood it's in memory too, but keep the latter points in mind.
2 and 3. Full text search works fine.
For further reading on fine-tuning, see the manual and:
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
Two last notes on your schema:
Last I checked, bigint (bigserial in your case) was slower than plain int. (This was a while ago, so the difference might now be negligible on modern, 64-bit servers.) Unless you foresee that you'll actually need more than 2.1 billion entries (the int maximum is 2,147,483,647), int is plenty and takes less space.
From an implementation standpoint, the only difference between a varchar(300) and a varchar without a specified length (or text, for that matter) is an extra check constraint on the length. If you don't actually need data to fit that size and are merely doing so for no reason other than habit, your db inserts and updates will run faster by getting rid of that constraint.
Unless your database's locale (collation) is C or POSIX, an ordinary btree index cannot efficiently satisfy an anchored LIKE query. You may have to declare a btree index with the varchar_pattern_ops operator class to benefit.
The problem is that you're getting hit with a full table scan for each of those lookups ("index in memory" isn't really an issue). Each time you run one of those queries the database is visiting every single row, which is causing the high disk usage. You might check here for a little more information (especially follow the links to the docs on operator classes and index types). If you follow that advice you should be able to get prefix lookups working fine, i.e. those situations where you're matching something like 'orange%'.
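Why a prefix lookup can use a btree at all: under plain character-by-character ordering, every string starting with 'orange' lies in the half-open range from 'orange' up to (but excluding) 'orangf'. The sketch below shows that rewrite with a sorted Python list standing in for the index; the sample URLs are invented, and Python's code-point ordering plays the role of the C-locale or pattern_ops ordering.

```python
import bisect


def prefix_range(prefix):
    """Half-open range [low, high) covering every string that starts with prefix.

    Assumes a non-empty prefix and code-point ordering.
    """
    return prefix, prefix[:-1] + chr(ord(prefix[-1]) + 1)


# Stand-in for a btree index on url: the values kept in sorted order.
index = sorted([
    "apple.com", "orange.com", "orangejuice.org", "orangutan.net",
    "oranges.co.uk", "pear.net",
])

low, high = prefix_range("orange")
lo = bisect.bisect_left(index, low)
hi = bisect.bisect_left(index, high)
matches = index[lo:hi]

print(low, high)
print(matches)
```

With a locale-aware collation the sort order no longer guarantees that such a range contains exactly the prefix matches, which is the reason an ordinary btree index may not be usable for LIKE there.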
Full text search is nice for more natural text search, like written documents, but it might be more difficult to get it working well for URL searching. There was also this thread in the mailing lists a few months back that might have more domain-specific information for what you're trying to do.
explain analyze select * from test_table where
url like 'orange%' limit 20;
You probably want to use a gin/gist index for like queries. Should give you much better results than btree - I don't think btree supports like queries at all.