I am storing versions of a document in PostgreSQL 9.4. Every time the user creates a new version, it inserts a row so that I can track all changes over time. Each row shares a reference_id column with the previous rows. Some of the rows get approved, and some remain as drafts. Each row also has a viewable_at time.
id | reference_id | approved | viewable_at         | created_at | content
---+--------------+----------+---------------------+------------+----------
 1 | 1            | true     | 2015-07-15 00:00:00 | 2015-07-13 | Hello
 2 | 1            | true     | 2015-07-15 11:00:00 | 2015-07-14 | Guten Tag
 3 | 1            | false    | 2015-07-15 17:00:00 | 2015-07-15 | Grüß Gott
The most frequent query is to get the rows grouped by the reference_id where approved is true and viewable_at is less than the current time. (In this case, row id 2 would be included in the results)
So far, this is the best query I've come up with that doesn't require me to add additional columns:
SELECT DISTINCT ON (reference_id) reference_id, id, approved, viewable_at, content
FROM documents
WHERE approved = true AND viewable_at <= '2015-07-15 13:00:00'
ORDER BY reference_id, created_at DESC
I have an index on reference_id and a multi-column index on approved and viewable_at.
At only 15,000 rows it's still averaging a few hundred milliseconds (140 - 200) on my local machine. I suspect that the distinct call or the ordering may be slowing it down.
What is the most efficient way to store this information so that SELECT queries are the most performant?
Result of EXPLAIN (BUFFERS, ANALYZE):
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=6668.86..6730.36 rows=144 width=541) (actual time=89.862..99.613 rows=145 loops=1)
   Buffers: shared hit=2651, temp read=938 written=938
   ->  Sort  (cost=6668.86..6699.61 rows=12300 width=541) (actual time=89.861..97.796 rows=13184 loops=1)
         Sort Key: reference_id, created_at
         Sort Method: external merge  Disk: 7488kB
         Buffers: shared hit=2651, temp read=938 written=938
         ->  Seq Scan on documents  (cost=0.00..2847.80 rows=12300 width=541) (actual time=0.049..40.579 rows=13184 loops=1)
               Filter: (approved AND (viewable_at < '2015-07-20 06:46:55.222798'::timestamp without time zone))
               Rows Removed by Filter: 2560
               Buffers: shared hit=2651
 Planning time: 0.218 ms
 Execution time: 178.583 ms
(12 rows)
Document Usage Notes:
The documents are manually edited and we're not yet autosaving the documents every X seconds or anything, so the volume will be reasonably low. At this point, there is an average of 7 versions and an average of only 2 approved versions per reference_id. (~30%)
On the min and max side, the vast majority of documents will have 1 or 2 versions and it seems unlikely that any document would have more than 30 or 40. There is a garbage collection process to clean out unapproved versions older than a week, so the total number of versions should stay pretty low.
For retrieving and practical usage, I could use limit / offset on the queries but in my tests that doesn't make a huge difference. Ideally this is a base query that populates a view or something so that I can do additional queries on top of these results, but I'm not entirely sure how that would affect the resulting performance and am open to suggestions. My impression is that if I can get this storage / query as simple / fast as possible then all other queries that start from this point could be improved, but it's likely that I'm wrong and that each query needs more independent thought.
Looking at your explain output, it looks like you're fetching most of the contents in the documents table so it's sensibly doing a sequential scan. Your rowcount estimates are reasonable, there doesn't seem to be any stats issue here.
It's doing an external merge sort on disk, so you might see a significant increase in performance by increasing work_mem in the session, e.g.
SET work_mem = '12MB'
It's possible that an index on (reference_id ASC, created_at DESC) WHERE (approved) might be useful, since it'll allow results to be fetched in the order required.
You could also experiment with adding viewable_at to the index. I think it might have to be the last column, but I'm not sure. Or even making it into a covering index by appending viewable_at, id, content and omitting the unnecessary approved column from the result set. This may permit an index-only scan, though with DISTINCT ON involved I'm not sure.
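In DDL form, those suggestions might look like this (index names are placeholders; the covering variant is speculative, and very wide content values could exceed the btree entry size limit):

CREATE INDEX documents_approved_ref_created_idx
    ON documents (reference_id ASC, created_at DESC)
    WHERE approved;

-- covering variant: append the remaining output columns for a possible index-only scan
CREATE INDEX documents_approved_covering_idx
    ON documents (reference_id ASC, created_at DESC, viewable_at, id, content)
    WHERE approved;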
@Craig already covers most options to make this query faster. More work_mem for the session is probably the most effective item.
Since:
There is a garbage collection process to clean out unapproved versions
older than a week
A partial index excluding unapproved versions won't amount to much. If you use an index, you would still exclude those irrelevant rows, though.
Since you seem to have very few versions per reference_id:
the vast majority of documents will have 1 or 2 versions
You already have the best query technique with DISTINCT ON:
Select first row in each GROUP BY group?
With a growing number of versions, other techniques would be increasingly superior:
Optimize GROUP BY query to retrieve latest record per user
The only slightly unconventional element in your query is that the predicate is on viewable_at, but you then take the row with the latest created_at, which is why your index would actually be:
(reference_id, viewable_at ASC, created_at DESC) WHERE (approved)
Assuming all columns to be defined NOT NULL. The alternating sort order between viewable_at and created_at is important. Then again, while you have so few rows per reference_id I don't expect any index to be of much use. The whole table has to be read anyway, a sequential scan will be about as fast. The added maintenance cost of the index may even outweigh its benefit.
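Spelled out as DDL, that would be something like this (a sketch; the index name is a placeholder and the sort column is created_at as in the query):

CREATE INDEX documents_ref_viewable_created_idx
    ON documents (reference_id, viewable_at, created_at DESC)
    WHERE approved;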
However, since:
Ideally this is a base query that populates a view or something so
that I can do additional queries on top of these results
I have one more suggestion: Create a MATERIALIZED VIEW from your query, giving you a snapshot of your project for the given point in time. If disk space is not an issue and snapshots might be reused, you might even keep a couple of those around:
CREATE MATERIALIZED VIEW doc_20150715_1300 AS
SELECT DISTINCT ON (reference_id)
reference_id, id, approved, viewable_at, content
FROM documents
WHERE approved -- simpler expression for boolean column
AND viewable_at <= '2015-07-15 13:00:00'
ORDER BY reference_id, created_at DESC;
Or, if all additional queries happen in the same session, use a temp table instead (which dies at the end of the session automatically):
CREATE TEMP TABLE doc_20150715_1300 AS ...;
ANALYZE doc_20150715_1300;
Be sure to run ANALYZE on the temp table (and also on the MV if you run queries immediately after creating it):
Are regular VACUUM ANALYZE still recommended under 9.1?
PostgreSQL partial index unused when created on a table with existing data
Either way, it may pay to create one or more indexes on the snapshots supporting subsequent queries. Depends on data and queries.
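For example, if later queries look up single documents in the snapshot (a hypothetical index; adjust to your actual queries):

CREATE INDEX doc_20150715_1300_ref_idx ON doc_20150715_1300 (reference_id);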
Note, the current version 1.20.0 of pgAdmin does not display indexes for MVs. That's already been fixed and is waiting to be released with the next version.
I am designing a table that has a jsonb column realizing permissions with the following format:
[
{"role": 5, "perm": "view"},
{"role": 30, "perm": "edit"},
{"role": 52, "perm": "view"}
]
TL;DR
How do I convert such jsonb value into an SQL array of integer roles? In this example, it would be '{5,30,52}'::int[]. I have some solutions but none are fast enough. Keep reading...
Each logged-in user has some roles (one or more). The idea is to filter the records using the overlap operator (&&) on int[].
SELECT * FROM data WHERE extract_roles(access) && '{1,5,17}'::int[]
I am looking for the extract_roles function/expression that can also be used in the definition of an index:
CREATE INDEX data_roles ON data USING gin ((extract_roles(access)))
jsonb in Postgres seems to have broad support for building and transforming but less for extracting values - SQL arrays in this case.
What I tried:
create or replace function extract_roles(access jsonb) returns int[]
language sql
strict
parallel safe
immutable
-- with the following bodies:
-- (0) 629ms
select translate(jsonb_path_query_array(access, '$.role')::text, '[]', '{}')::int[]
-- (1) 890ms
select array_agg(r::int) from jsonb_path_query(access, '$.role') r
-- (2) 866ms
select array_agg((t ->> 'role')::int) from jsonb_array_elements(access) as x(t)
-- (3) 706ms
select f1 from jsonb_populate_record(row('{}'::int[]), jsonb_build_object('f1', jsonb_path_query_array(access, '$.role'))) as x (f1 int[])
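For reference, variant (0) written out as one complete definition would look roughly like this (a sketch of the tested setup):

create or replace function extract_roles(access jsonb) returns int[]
language sql strict parallel safe immutable
as $$
    select translate(jsonb_path_query_array(access, '$.role')::text, '[]', '{}')::int[]
$$;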
When the index is used, the query is fast. But there are two problems with these expressions:
some of the functions are only stable, not immutable; this also applies to the cast. Am I allowed to mark my function as immutable anyway? Immutability is required by the index definition.
they are slow; the planner does not use the index in some scenarios, and then the query can become really slow (times above are on a table with 3M records):
explain (analyse)
select id, access
from data
where extract_roles(access) && '{-3,99}'::int[]
order by id
limit 100
with the following plan (same for all variants above; prefers scanning the index associated with the primary key, gets sorted results and hopes that it finds 100 of them soon):
 Limit  (cost=1000.45..2624.21 rows=100 width=247) (actual time=40.668..629.193 rows=100 loops=1)
   ->  Gather Merge  (cost=1000.45..476565.03 rows=29288 width=247) (actual time=40.667..629.162 rows=100 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Parallel Index Scan using data_pkey on data  (cost=0.43..472184.44 rows=12203 width=247) (actual time=25.522..513.463 rows=35 loops=3)
               Filter: (extract_roles(access) && '{-3,99}'::integer[])
               Rows Removed by Filter: 84918
 Planning Time: 0.182 ms
 Execution Time: 629.245 ms
Paradoxically, removing the LIMIT clause makes the query fast:
 Gather Merge  (cost=70570.65..73480.29 rows=24938 width=247) (actual time=63.263..75.710 rows=40094 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Sort  (cost=69570.63..69601.80 rows=12469 width=247) (actual time=59.870..61.569 rows=13365 loops=3)
         Sort Key: id
         Sort Method: external merge  Disk: 3744kB
         Worker 0:  Sort Method: external merge  Disk: 3232kB
         Worker 1:  Sort Method: external merge  Disk: 3160kB
         ->  Parallel Bitmap Heap Scan on data  (cost=299.93..68722.36 rows=12469 width=247) (actual time=13.823..49.336 rows=13365 loops=3)
               Recheck Cond: (extract_roles(access) && '{-3,99}'::integer[])
               Heap Blocks: exact=9033
               ->  Bitmap Index Scan on data_roles  (cost=0.00..292.44 rows=29926 width=0) (actual time=9.429..9.430 rows=40094 loops=1)
                     Index Cond: (extract_roles(access) && '{-3,99}'::integer[])
 Planning Time: 0.234 ms
 Execution Time: 77.719 ms
Is there any better and faster way to extract int[] from a jsonb? Because I cannot rely on the planner always using the index. Playing with COST of the extract_roles function helps a bit (planner starts using the index for LIMIT 1000) but even an insanely high value does not force the index for LIMIT 100.
Comments:
If there is not, I will probably store the information in another column roles int[], which is fast but takes extra space and requires extra treatment (can be solved using generated columns on Postgres 12+, which Azure still does not provide, or a trigger, or in the application logic).
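For illustration, that fallback with a stored generated column might look like this (a sketch reusing the extract_roles() function from above; column and index names are made up, and the expression must be declared immutable):

ALTER TABLE data
    ADD COLUMN roles int[] GENERATED ALWAYS AS (extract_roles(access)) STORED;

CREATE INDEX data_roles_arr ON data USING gin (roles);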
Looking into the future, will there be any better support in Postgres 15? Maybe JSON_QUERY but I don’t see any immediate improvement because its RETURNING clause probably refers to the whole result and not its elements.
Maybe jsonb_populate_record could also consider non-composite types (its signature allows it) such as:
select jsonb_populate_record(null::int[], '[123,456]'::jsonb)
The two closest questions are:
Extract integer array from jsonb within postgres 9.6
Cast postgresql jsonb value as array of int and remove element from it
Reaction to suggested normalization:
Normalization is probably not viable. But let's follow the train of thought.
I assume that the extra table would look like this: *_perm (id, role, perm). There would be an index on id and another index on role.
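For concreteness, the hypothetical normalized layout (all names assumed for illustration):

CREATE TABLE posts_perm (
    id   bigint NOT NULL REFERENCES posts_data (id),
    role int    NOT NULL,
    perm text   NOT NULL
);
CREATE INDEX posts_perm_id_idx   ON posts_perm (id);
CREATE INDEX posts_perm_role_idx ON posts_perm (role);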
Because a user has multiple roles, it could join multiple records for the same id, which would cause multiplication of the records in the data table and force a group by aggregation.
A group by is bad for performance because it prevents some optimizations. I am designing a building block. So there can be for example two data tables at play:
select pd.*, jsonb_agg(to_jsonb(pp))
from posts_data pd
join posts_perm pp on pp.id = pd.id
where exists(
    select 1
    from comments_data cd
    join comments_perm cp on cp.id = cd.id
    where cd.post_id = pd.id
      and cd.reputation > 100
      and cp.role in (3,34,52)
    -- no group by needed due to the semi-join
)
and pp.role in (3,34,52)
group by pd.id
order by pd.title
limit 10
If I am not mistaken, this query will require the aggregation of all records before they are sorted. No index can help here. That will never be fast with millions of records. Moreover, there is non-trivial logic behind group by usage - it is not always needed.
What if we did not need to return the permissions but only cared about their existence?
select pd.*
from posts_data pd
where exists(
    select 1
    from posts_perm pp
    where pp.id = pd.id
      and pp.role in (3,34,52)
)
and exists(
    select 1
    from comments_data cd
    where cd.post_id = pd.id
      and cd.reputation > 100
      and exists(
          select 1
          from comments_perm cp
          where cp.id = cd.id
            and cp.role in (3,34,52)
      )
)
order by pd.title
limit 10
Then we don't need any aggregation - the database will simply issue a SEMI-JOIN. If there is an index on title, the database may consider using it. We can even fetch the permissions in the projection; something like this:
select pd.*, (select jsonb_agg(to_jsonb(pp)) from posts_perm pp where pp.id = pd.id) perm
...
Here a nested-loop join will be issued for only the few (10) returned records. I will test this approach.
Another option is to keep the data in both tables - the data table would only store an int[] of roles. Then we save a JOIN and only fetch from the permission table at the end. Now we need an index that supports array operations - GIN.
select pd.*, (select jsonb_agg(to_jsonb(pp)) from posts_perm pp where pp.id = pd.id) perm
from posts_data pd
where pd.roles && '{3,34,52}'::int[]
and exists(
    select 1
    from comments_data cd
    where cd.post_id = pd.id
      and cd.roles && '{3,34,52}'::int[]
      and cd.reputation > 100
)
order by pd.title
limit 10
Because we always aggregate all permissions for the returned records (their interpretation happens in the application, so it does not matter that we return all of them), we can store the permissions as JSON. And because we never need to work with those values in SQL, storing them directly in the data table seems reasonable.
We will need to support some bulk-sharing operations later that update the permissions for many records, but that is much rarer than selects. Because of this we could favor jsonb instead.
The projection does not need the select of permissions anymore:
select pd.*
...
But now the roles column is redundant - we have the same information in the same table, just in JSON format. If we can write a function that extracts just the roles, we can directly index it.
And we are back at the beginning. But it looks like the extract_roles function is never going to be fast, so we need to keep the roles column.
Another reason for keeping permissions in the same table is the possibility of combining multiple indices using Bitmap And and avoiding a join.
There will be a huge bias in the roles. Some are going to be present on almost all rows (admin can edit everything), others will be rare (John Doe can only access these 3 records that were explicitly shared with him). I am not sure how well statistics will work on the int[] approach but so far my tests show that the GIN index is used when the role is infrequent (high selectivity).
It looks like the core problem here is the classic one with WHERE...ORDER BY...LIMIT, that the planner assumes all of the qualifying rows are scattered evenly throughout the ordering. But that isn't the case here: rows meeting your && condition are selectively deficient in low-numbered "id". So it has to walk that index far farther than it thought it would need to before it catches the LIMIT.
There is nothing you can do (in any version) to get the planner to estimate that better. You could just prevent that index from being used by rewriting it to order by id+0. But then it wouldn't use that plan even when it would truly be faster, like the admin who is on everything. (Which by the way seems like a bad idea--an exceptional user should probably be handled exceptionally, not shoehorned into the normal system).
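For reference, the rewrite described above would look something like this (id + 0 is no longer a plain column reference, so the planner cannot use the primary-key index to deliver pre-sorted rows):

SELECT id, access
FROM data
WHERE extract_roles(access) && '{-3,99}'::int[]
ORDER BY id + 0
LIMIT 100;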
The immutable extraction function certainly is slow, but if the above planning problem were fixed that wouldn't matter. Making the function faster would probably require some compiled code, and Azure surely would not let you link the .so file into their managed server.
Because the JSON has a regular structure (int, text), I also considered two alternative storages:
create a composite type role of (int, text) and store the array role[]; extract_roles function is still needed;
store two arrays int[] and text[].
The latter one won for the following reasons:
smallest disk space (important for queries that require seq scan);
no need for extract_roles function - the int array is stored directly;
no need for functional index;
easy append (but same is true for JSON);
the library that I am using (jOOQ) has a good binding for arrays so working with them may even be more pleasant than with a JSON.
Disadvantages are:
removal is harder - the arrays need to be unnested and re-aggregated (see the sketch below).
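For illustration, such a removal could be written as an unnest/re-aggregate update (a sketch assuming parallel columns roles int[] and perms text[], here dropping role 52 and its matching permission):

UPDATE data d
SET (roles, perms) = (
    SELECT coalesce(array_agg(t.r), '{}'),    -- re-aggregate what is left
           coalesce(array_agg(t.p), '{}')
    FROM unnest(d.roles, d.perms) AS t(r, p)  -- unnest both arrays in parallel
    WHERE t.r <> 52
)
WHERE d.roles && '{52}'::int[];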
I have a simple select query:
SELECT * FROM entities WHERE entity_type_id = 1 ORDER BY entity_id
Then I want to get the first 100 results, so I use this:
SELECT * FROM entities WHERE entity_type_id = 1 ORDER BY entity_id LIMIT 100
The problem is that the second query works much slower than the first one. It takes less than a second to execute the first query and more than a minute to execute the second one.
These are execution plans for the queries:
without limit:
Sort  (cost=26201.43..26231.42 rows=11994 width=72)
  Sort Key: entity_id
  ->  Index Scan using entity_type_id_idx on entities  (cost=0.00..24895.34 rows=11994 width=72)
        Index Cond: (entity_type_id = 1)
with limit:
Limit  (cost=0.00..8134.39 rows=100 width=72)
  ->  Index Scan using xpkentities on entities  (cost=0.00..975638.85 rows=11994 width=72)
        Filter: (entity_type_id = 1)
I don't understand why these two plans are so different and why the performance decreases so much. How should I tweak the second query to make it work faster?
I use PostgreSQL 9.2.
You want the 100 smallest entity_id's matching your condition. Now - if those were numbers 1..100 then clearly using the entity_id index is the best way to handle this - everything is pre-sorted. In fact, if the 100 you wanted were in the range 1..200 then it still makes sense. Probably 1..1000 would.
So - PostgreSQL thinks it will find lots of entity_type_id=1 values at the "start" of the table. It estimates a cost of 8134 vs 26231 to filter by type then sort. In your case it is wrong.
Now - either there is some correlation which isn't obvious (that's bad - we can't tell the planner about that at present), or we don't have up-to-date or sufficient stats.
Does an ANALYZE entities make any difference? You can see what values the planner knows about by reading the planner-stats page in the manuals.
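As a quick check (just a sketch, not part of the answer above): refresh the statistics, then look at what the planner believes about entity_type_id in the pg_stats view:

ANALYZE entities;

SELECT n_distinct, most_common_vals, most_common_freqs, correlation
FROM pg_stats
WHERE tablename = 'entities'
  AND attname = 'entity_type_id';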
I need the lowest value for runnerId.
This query:
SELECT "runnerId" FROM betlog WHERE "marketId" = '107416794' ;
takes 80 ms (1968 result rows).
This:
SELECT min("runnerId") FROM betlog WHERE "marketId" = '107416794' ;
takes 1600 ms.
Is there a faster way to find the minimum, or should I calc the min in my java program?
"Result (cost=100.88..100.89 rows=1 width=0)"
" InitPlan 1 (returns $0)"
" -> Limit (cost=0.00..100.88 rows=1 width=9)"
" -> Index Scan using runneridindex on betlog (cost=0.00..410066.33 rows=4065 width=9)"
" Index Cond: ("runnerId" IS NOT NULL)"
" Filter: ("marketId" = 107416794::bigint)"
CREATE INDEX marketidindex
ON betlog
USING btree
("marketId" COLLATE pg_catalog."default");
Another idea:
SELECT "runnerId" FROM betlog WHERE "marketId" = '107416794' ORDER BY "runnerId" LIMIT 1;  -- >1600 ms
SELECT "runnerId" FROM betlog WHERE "marketId" = '107416794' ORDER BY "runnerId";          -- >100 ms
How can a LIMIT slow the query down?
What you need is a multi-column index:
CREATE INDEX betlog_mult_idx ON betlog ("marketId", "runnerId");
If interested, you'll find in-depth information about multi-column indexes in PostgreSQL, links and benchmarks under this related question on dba.SE.
How did I figure?
In a multi-column index, rows are ordered (and thereby clustered) by the first column of the index ("marketId"), and each cluster is in turn ordered by the second column of the index - so the first row of each cluster holds the minimum "runnerId" for that "marketId". This makes the index scan extremely fast.
Concerning the paradox effect of LIMIT slowing down a query - the Postgres query planner has a weakness there. The common workaround is to use a CTE (not necessary in this case). Find more information under this recent, closely related question:
PostgreSQL query taking too long
Without a suitable index, the min() aggregate would be executed by PostgreSQL using a scan of the entire table. You could optimize the query using the following approach:
SELECT col FROM sometable ORDER BY col ASC LIMIT 1;
When you had an index on ("runnerId") (or at least with "runnerId" as the high order column) but did not have the index on ("marketId", "runnerId") it compared the cost of passing all rows with a matching "marketId" using the index on that column and picking out the minimum "runnerId" from that set to the cost of scanning using the index on "runnerId" and stopping when it found the first row with a matching "marketId". Based on available statistics and the assumption that "marketId" values would be randomly distributed within the index entries for the index on "runnerId" it estimated a lower cost for the latter approach.
It also estimated the cost of scanning the whole table and picking the minimum from matching rows as well as probably a number of other alternatives. It does not always use a certain type of plan, but compares costs of all the alternatives.
The problem is that the assumption that values will be randomly distributed in the range is not necessarily true (as in this example), leading to a scan of a high percentage of the range to find the rows lurking at the end. For some values of "marketId", where the chosen value is available near the beginning of the "runnerId" index, this plan should be very fast.
There has been discussion in the PostgreSQL developer community of how we might bias against plans which are "risky" in terms of running long if the data distribution is not what was assumed, and there has been work on tracking multi-column statistics so that correlated values don't run into such problems. Expect improvements in this area in the next few releases. Until then, Erwin's suggestions are on target for how to work around the issue.
Basically it comes down to making a more attractive plan available or introducing an optimization barrier. In this case you can provide a more attractive option by adding the index on ("marketId", "runnerId") -- which allows a very direct way to get straight to the answer. The planner assigns a very low cost to that alternative, causing it to be chosen. If you preferred not to add the index, you could force an optimization barrier by doing something like this:
SELECT min("runnerId")
FROM (SELECT "runnerId" FROM betlog
WHERE "marketId" = '107416794'
OFFSET 0) x;
When there is an OFFSET clause (even for an offset of zero) it forces the subquery to be planned separately and its results fed to the outer query. I would expect this to run in 80 ms rather than the 1600 ms you get without the optimization barrier. Of course, if you can add the index, the speed of the query when data is cached should be less than 1 ms.
I have the following table:
CREATE TABLE recipemetadata
(
  --Lots of columns
  diet_glutenfree boolean NOT NULL
);
Most every row will be set to FALSE unless someone comes up with some crazy new gluten free diet that sweeps the country.
I need to be able to very quickly query for rows where this value is true. I've created the index:
CREATE INDEX IDX_RecipeMetadata_GlutenFree ON RecipeMetadata(diet_glutenfree) WHERE diet_glutenfree;
It appears to work, however I can't figure out how to tell if indeed it's only indexing rows where the value is true. I want to make sure it's not doing something silly like indexing any rows with any value at all.
Should I add an operator to the WHERE clause, or is this syntax perfectly valid? Hopefully this isn't one of those super easy RTFM questions that will get downvoted 30 times.
UPDATE:
I've gone ahead and added 10,000 rows to RecipeMetadata with random values. I then did an ANALYZE on the table and a REINDEX just to be sure. When I run the query:
select recipeid from RecipeMetadata where diet_glutenfree;
I get:
Seq Scan on recipemetadata  (cost=0.00..214.26 rows=5010 width=16)
  Filter: diet_glutenfree
So, it appears to be doing a sequential scan on the table even though only about half the rows have this flag. The index is being ignored.
If I do:
select recipeid from RecipeMetadata where not diet_glutenfree;
I get:
Seq Scan on recipemetadata  (cost=0.00..214.26 rows=5016 width=16)
  Filter: (NOT diet_glutenfree)
So no matter what, this index is not being used.
I've confirmed the index works as expected.
I re-created the random data, only this time set diet_glutenfree to random() > 0.9 so there's only a 10% chance of an on bit.
I then re-created the indexes and tried the query again.
SELECT RecipeId from RecipeMetadata where diet_glutenfree;
Returns:
Index Scan using idx_recipemetadata_glutenfree on recipemetadata  (cost=0.00..135.15 rows=1030 width=16)
  Index Cond: (diet_glutenfree = true)
And:
SELECT RecipeId from RecipeMetadata where NOT diet_glutenfree;
Returns:
Seq Scan on recipemetadata  (cost=0.00..214.26 rows=8996 width=16)
  Filter: (NOT diet_glutenfree)
It seems my first attempt was polluted since PG estimates it's faster to scan the whole table rather than hit the index if it has to load over half the rows anyway.
However, I think I would get these exact results on a full index of the column. Is there a way to verify the number of rows indexed in a partial index?
UPDATE
The index is around 40k. I created a full index of the same column and it's over 200k, so it looks like it's definitely partial.
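(For reference, this is roughly how the two sizes can be compared; the name of the full index below is made up:)

SELECT pg_size_pretty(pg_relation_size('idx_recipemetadata_glutenfree'))      AS partial_index,
       pg_size_pretty(pg_relation_size('idx_recipemetadata_glutenfree_full')) AS full_index;

-- the partial index covers exactly the rows that satisfy its predicate
SELECT count(*) FROM recipemetadata WHERE diet_glutenfree;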
An index on a one-bit field makes no sense. To understand the decisions made by the planner, you must think in terms of pages, not in terms of rows.
For 8K pages and an (estimated) row size of 80 bytes, there are about 100 rows on every page. Assuming a random 50/50 distribution, the chance that a page consists of only true rows (or only false rows) is negligible: pow(0.5, 100), about 1e-30, IIRC. So virtually every page contains rows of both values, which means that for a query on diet_glutenfree = true every page has to be fetched anyway and filtered afterwards. Using an index would only cause more pages (the index itself) to be fetched.
I'm working with a non-profit that is mapping out solar potential in the US. Needless to say, we have a ridiculously large PostgreSQL 9 database. Running a query like the one shown below is speedy until the order by line is uncommented, in which case the same query takes forever to run (185 ms without sorting compared to 25 minutes with). What steps should be taken to ensure this and other queries run in a more manageable and reasonable amount of time?
select A.s_oid, A.s_id, A.area_acre, A.power_peak, A.nearby_city, A.solar_total
from global_site A cross join na_utility_line B
where (A.power_peak between 1.0 AND 100.0)
and A.area_acre >= 500
and A.solar_avg >= 5.0
AND A.pc_num <= 1000
and (A.fips_level1 = '06' AND A.fips_country = 'US' AND A.fips_level2 = '025')
and B.volt_mn_kv >= 69
and B.fips_code like '%US06%'
and B.status = 'active'
and ST_within(ST_Centroid(A.wkb_geometry), ST_Buffer((B.wkb_geometry), 1000))
--order by A.area_acre
offset 0 limit 11;
The sort is not the problem - in fact the CPU and memory cost of the sort is close to zero since Postgres has Top-N sort where the result set is scanned while keeping up to date a small sort buffer holding only the Top-N rows.
select count(*) from (1 million row table) -- 0.17 s
select * from (1 million row table) order by x limit 10; -- 0.18 s
select * from (1 million row table) order by x; -- 1.80 s
So you see the Top-10 sorting only adds 10 ms to a dumb fast count(*) versus a lot longer for a real sort. That's a very neat feature, I use it a lot.
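Those timings can be reproduced with a throwaway table, e.g. (table and column names made up; absolute numbers will vary):

CREATE TABLE sort_demo AS SELECT generate_series(1, 1000000) AS x;
ANALYZE sort_demo;

SELECT count(*) FROM sort_demo;               -- plain scan, no sort
SELECT * FROM sort_demo ORDER BY x LIMIT 10;  -- top-N heapsort: barely slower
SELECT * FROM sort_demo ORDER BY x;           -- full sort of 1M rows: much slower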
OK, now without EXPLAIN ANALYZE it's impossible to be sure, but my feeling is that the real problem is the cross join. Basically you're filtering the rows in both tables using:
where (A.power_peak between 1.0 AND 100.0)
and A.area_acre >= 500
and A.solar_avg >= 5.0
AND A.pc_num <= 1000
and (A.fips_level1 = '06' AND A.fips_country = 'US' AND A.fips_level2 = '025')
and B.volt_mn_kv >= 69
and B.fips_code like '%US06%'
and B.status = 'active'
OK. I don't know how many rows are selected in both tables (only EXPLAIN ANALYZE would tell), but it's probably significant. Knowing those numbers would help.
Then we have the worst-case CROSS JOIN condition ever:
and ST_within(ST_Centroid(A.wkb_geometry), ST_Buffer((B.wkb_geometry), 1000))
This means all rows of A are matched against all rows of B (so, this expression is going to be evaluated a large number of times), using a bunch of pretty complex, slow, and cpu-intensive functions.
Of course it's horribly slow!
When you remove the ORDER BY, Postgres just comes up (by chance?) with a bunch of matching rows right at the start, outputs those, and stops since the LIMIT is reached.
Here's a little example:
Tables a and b are identical and contain 1000 rows, and a column of type BOX.
select * from a cross join b where (a.b && b.b) --- 0.28 s
Here 1000000 box overlap (operator &&) tests are completed in 0.28s. The test data set is generated so that the result set contains only 1000 rows.
create index a_b on a using gist(b);
create index b_b on b using gist(b);
select * from a cross join b where (a.b && b.b) --- 0.01 s
Here the index is used to optimize the cross join, and speed is ridiculous.
You need to optimize that geometry matching.
add columns which will cache:
ST_Centroid(A.wkb_geometry)
ST_Buffer((B.wkb_geometry), 1000)
There is NO POINT in recomputing those slow functions a million times during your CROSS JOIN, so store the results in a column and use a trigger to keep them up to date (see the sketch after this list).
add columns of type BOX which will cache:
Bounding Box of ST_Centroid(A.wkb_geometry)
Bounding Box of ST_Buffer((B.wkb_geometry), 1000)
add gist indexes on the BOXes
add a Box overlap test (using the && operator) which will use the index
keep your ST_Within which will act as a final filter on the rows that pass
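Roughly, the caching approach could look like this (a sketch only; all column, function and index names are invented, PostGIS is assumed, and na_utility_line needs an equivalent trigger):

-- cache the expensive geometry computations once per row
ALTER TABLE global_site     ADD COLUMN centroid_cache geometry;
ALTER TABLE na_utility_line ADD COLUMN buffer_cache   geometry;

UPDATE global_site     SET centroid_cache = ST_Centroid(wkb_geometry);
UPDATE na_utility_line SET buffer_cache   = ST_Buffer(wkb_geometry, 1000);

-- keep the cache current on future writes
CREATE OR REPLACE FUNCTION global_site_cache_centroid() RETURNS trigger AS $$
BEGIN
    NEW.centroid_cache := ST_Centroid(NEW.wkb_geometry);
    RETURN NEW;
END $$ LANGUAGE plpgsql;

CREATE TRIGGER global_site_cache_centroid_trg
    BEFORE INSERT OR UPDATE OF wkb_geometry ON global_site
    FOR EACH ROW EXECUTE PROCEDURE global_site_cache_centroid();

-- GiST indexes make the && bounding-box pre-filter an index lookup
CREATE INDEX global_site_centroid_cache_gist   ON global_site     USING gist (centroid_cache);
CREATE INDEX na_utility_line_buffer_cache_gist ON na_utility_line USING gist (buffer_cache);

-- the join condition then becomes: cheap indexed && test first, exact ST_Within last
--   AND A.centroid_cache && B.buffer_cache
--   AND ST_Within(A.centroid_cache, B.buffer_cache)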
Maybe you can just index the ST_Centroid and ST_Buffer columns... and use an (indexed) "contains" operator, see here:
http://www.postgresql.org/docs/8.2/static/functions-geometry.html
I would suggest creating an index on area_acre. You may want to take a look at the following: http://www.postgresql.org/docs/9.0/static/sql-createindex.html
I would recommend doing this sort of thing off of peak hours though because this can be somewhat intensive with a large amount of data. One thing you will have to look at as well with indexes is rebuilding them on a schedule to ensure performance over time. Again this schedule should be outside of peak hours.
You may want to take a look at this article from a fellow SO'er and his experience with database slowdowns over time with indexes: Why does PostgresQL query performance drop over time, but restored when rebuilding index
If the A.area_acre field is not indexed that may slow it down. You can run the query with EXPLAIN to see what it is doing during execution.
First off I would look at creating indexes, ensuring your DB is being vacuumed, and increasing shared_buffers and work_mem for your install.
First thing to look at is whether you have an index on the field you're ordering by. If not, adding one will dramatically improve performance. I don't know PostgreSQL that well, but something similar to:
CREATE INDEX area_acre ON global_site(area_acre)
As noted in other replies, the indexing process is intensive when working with a large data set, so do this during off-peak.
I am not familiar with the PostgreSQL optimizations, but it sounds like what is happening when the query is run with the ORDER BY clause is that the entire result set is created, then it is sorted, and then the top 11 rows are taken from that sorted result. Without the ORDER BY, the query engine can just generate the first 11 rows in whatever order it pleases and then it's done.
Having an index on the area_acre field very possibly may not help for the sorting (ORDER BY) depending on how the result set is built. It could, in theory, be used to generate the result set by traversing the global_site table using an index on area_acre; in that case, the results would be generated in the desired order (and it could stop after generating 11 rows in the result). If it does not generate the results in that order (and it seems like it may not be), then that index will not help in sorting the results.
One thing you might try is to remove the "CROSS JOIN" from the query. I doubt that this will make a difference, but it's worth a test. Because a WHERE clause is involved joining the two tables (via ST_WITHIN), I believe the result is the same as an inner join. It is possible that the use of the CROSS JOIN syntax is causing the optimizer to make an undesirable choice.
Otherwise (aside from making sure indexes exist for fields that are being filtered), you could play a bit of a guessing game with the query. One condition that stands out is the area_acre >= 500. This means that the query engine is considering all rows that meet that condition. But then only the first 11 rows are taken. You could try changing it to area_acre >= 500 and area_acre <= somevalue. The somevalue is the guessing part that would need adjustment to make sure you get at least 11 rows. This, however, seems like a pretty cheesy thing to do, so I mention it with some reticence.
Have you considered creating Expression based indexes for the benefit of the hairier joins and where conditions?
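For example, expression (functional) GiST indexes on the computed geometries, as an alternative to the cached columns sketched earlier (assuming the PostGIS functions are immutable on your install, and that the query repeats exactly the same expressions so the planner can match them):

CREATE INDEX global_site_centroid_gist   ON global_site     USING gist (ST_Centroid(wkb_geometry));
CREATE INDEX na_utility_line_buffer_gist ON na_utility_line USING gist (ST_Buffer(wkb_geometry, 1000));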