How can I speed up queries that are looking for the root node of a transitive closure? - sql

I have a historical transitive closure table that represents a tree.
CHILD_NODE_ID number not null enable,
ANCESTOR_NODE_ID number not null enable,
DISTANCE number not null enable,
FROM_DATE date not null enable,
TO_DATE date not null enable,
Here's some sample data (CHILD_NODE_ID | ANCESTOR_NODE_ID | DISTANCE):
1 | 1 | 0
2 | 1 | 1
2 | 2 | 0
3 | 1 | 2
3 | 2 | 1
3 | 3 | 0
Unfortunately, my current query for finding the root node causes a full table scan:
select *
from transitive_closure tc
where distance = 0
and not exists (
select null
from transitive_closure tci
where tc.child_node_id = tci.child_node_id
and tci.distance <> 0
);
On the surface, it doesn't look too expensive, but as I approach 1 million rows, this particular query is starting to get nasty... especially when it's part of a view that grabs the adjacency tree for legacy support.
Is there a better way to find the root node of a transitive closure? I would like to rewrite all of our old legacy code, but I can't... so I need to build the adjacency list somehow. Getting everything except the root node is easy, so is there a better way? Am I thinking about this problem the wrong way?
Query plan on a table with 800k rows: [plan output not preserved; only its "Access Predicates" and "Filter Predicates" section labels survive]

How long does the query take to execute, and how long do you want it to take? (You usually do not want to use the cost for tuning. Very few people know what the explain plan cost really means.)
On my slow desktop the query only took 1.5 seconds for 800K rows. And then 0.5 seconds after the data was in memory. Are you getting something significantly worse,
or will this query be run very frequently?
I don't know what your data looks like, but I'd guess that a full table scan will always be best for this query. Assuming that your hierarchical data
is relatively shallow, i.e. there are many distances of 0 and 1 but very few distances of 100, the most important column will not be very distinct. This means
that any of the index entries for distance will point to a large number of blocks. It will be much cheaper to read the whole table at once using multi-block reads
than to read a large amount of it one block at a time.
Also, what do you mean by historical? Can you store the results of this query in a materialized view?
Another possible idea is to use analytic functions. This replaces the second table scan with a sort. This approach is usually faster, but for me this
query actually takes longer, 5.5 seconds instead of 1.5. But maybe it will do better in your environment.
select * from
(
select transitive_closure.*,
max(case when distance <> 0 then 1 else 0 end)
over (partition by child_node_id) has_non_zero_distance
from transitive_closure
)
where distance = 0
and has_non_zero_distance = 0;
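To check that the two formulations agree, here is a minimal sketch that runs both against the question's sample rows. It uses Python's built-in sqlite3 module as a stand-in for Oracle (an assumption made only so the example is self-contained; window functions need SQLite 3.25 or later), and it leaves out the FROM_DATE and TO_DATE columns. It says nothing about relative performance on a real 800k-row table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table transitive_closure ("
    " child_node_id integer not null,"
    " ancestor_node_id integer not null,"
    " distance integer not null)"
)
# Sample rows from the question: node 1 is the root, 2 its child, 3 its grandchild.
rows = [(1, 1, 0), (2, 1, 1), (2, 2, 0), (3, 1, 2), (3, 2, 1), (3, 3, 0)]
conn.executemany("insert into transitive_closure values (?, ?, ?)", rows)

# Original approach: a node is a root if it has no row with distance <> 0.
not_exists_sql = """
select child_node_id
from transitive_closure tc
where distance = 0
and not exists (
    select null
    from transitive_closure tci
    where tc.child_node_id = tci.child_node_id
    and tci.distance <> 0
)
"""

# Analytic approach: flag each node that has any non-zero distance, then filter.
analytic_sql = """
select child_node_id from (
    select child_node_id, distance,
           max(case when distance <> 0 then 1 else 0 end)
               over (partition by child_node_id) as has_non_zero_distance
    from transitive_closure
)
where distance = 0
and has_non_zero_distance = 0
"""

roots_not_exists = [r[0] for r in conn.execute(not_exists_sql)]
roots_analytic = [r[0] for r in conn.execute(analytic_sql)]
print(roots_not_exists, roots_analytic)
```

Both lists contain only node 1 for this sample, so the rewrite is at least equivalent on the example data; timing has to be measured on the real table.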

Can you try adding an index on (distance, child_node_id), or changing the order of these columns in the existing unique index? I think the outer query could then access the table via the index on distance, while the inner query would need to access only the index.

Add ONE root node from which all your current root nodes are descended. Then you would simply query the children of your one root. Problem solved.


PostgreSQL efficiently find last descendant in linear list

I am currently trying to efficiently retrieve the last descendant from a linked-list-like structure.
Essentially there's a table with a data series, with certain criteria I split it up to get a list like this
current_id | next_id
for example
1 | 2
2 | 3
3 | 4
4 | NULL
42 | 43
43 | 45
45 | NULL
would result in lists like
1 -> 2 -> 3 -> 4
42 -> 43 -> 45
Now I want to get the first and the last id from each of those lists.
This is what I have right now:
WITH RECURSIVE contract(rstart_ts, rend_ts) AS ( -- recursive query to traverse the "linked list" of continuous timestamps
SELECT start_ts, end_ts FROM track_caps tc
UNION ALL
SELECT c.rstart_ts, tc.end_ts AS end_ts0 FROM contract c INNER JOIN track_caps tc ON (tc.start_ts = c.rend_ts AND c.rend_ts IS NOT NULL AND tc.end_ts IS NOT NULL)
),
fcontract AS ( -- final step: after traversing the "linked list", pick the largest timestamp found as the end_ts and the smallest as the start_ts
SELECT DISTINCT ON(start_ts, end_ts) min(rstart_ts) AS start_ts, rend_ts AS end_ts
FROM (
SELECT rstart_ts, max(rend_ts) AS rend_ts FROM contract
GROUP BY rstart_ts
) sq
GROUP BY end_ts
)
SELECT * FROM fcontract
ORDER BY start_ts
In this case I just used timestamps which work fine for the given data.
Basically I just use a recursive query that walks through all the nodes until it reaches the end, as suggested by many other posts on StackOverflow and other sites. The next query removes all the sub-steps and returns what I want, like in the first list example: 1 | 4
Just for illustration, the produced result set by the recursive query looks like this:
1 | 2
2 | 3
3 | 4
1 | 3
2 | 4
1 | 4
It works nicely, but it is quite a memory hog, which is unsurprising when looking at the results of EXPLAIN ANALYZE.
For a dataset of roughly 42,600 rows, the recursive query produces a whopping 849,542,346 rows. It was actually supposed to process around 2,000,000 rows, but with the current solution that seems infeasible.
Did I just use recursive queries improperly? Is there a way to reduce the amount of data it produces (like removing the sub-steps)?
Or are there better single-query solutions to this problem?
The main problem is that your recursive query doesn't properly filter the root nodes, which is caused by the model you have. So the non-recursive part already selects the entire table, and then Postgres needs to recurse for each and every row of the table.
To make that more efficient only select the root nodes in the non-recursive part of your query. This can be done using:
select t1.current_id, t1.next_id, t1.current_id as root_id
from track_caps t1
where not exists (select *
from track_caps t2
where t2.next_id = t1.current_id)
Now that is still not very efficient (compared to the "usual" where parent_id is null design), but it at least makes sure the recursion doesn't need to process more rows than necessary.
To find the root node of each tree, just select that as an extra column in the non-recursive part of the query and carry it over to each row in the recursive part.
So you wind up with something like this:
with recursive contract as (
select t1.current_id, t1.next_id, t1.current_id as root_id
from track_caps t1
where not exists (select *
from track_caps t2
where t2.next_id = t1.current_id)
union all
select c.current_id, c.next_id, p.root_id
from track_caps c
join contract p on c.current_id = p.next_id
and c.next_id is not null
)
select *
from contract
order by current_id;
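As a rough illustration of the root-carrying idea, the sketch below runs a similar recursive query against the question's example lists using Python's built-in sqlite3 module (a stand-in for PostgreSQL, chosen only so the example is self-contained). The two-column track_caps table is a simplification taken from the question's current_id | next_id example. Unlike the query above, this variant does not filter out rows with a NULL next_id in the recursive step, so the final node of each list can be picked directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table track_caps (current_id integer, next_id integer)")
# The two example lists from the question: 1 -> 2 -> 3 -> 4 and 42 -> 43 -> 45.
conn.executemany(
    "insert into track_caps values (?, ?)",
    [(1, 2), (2, 3), (3, 4), (4, None), (42, 43), (43, 45), (45, None)],
)

cte = """
with recursive contract(current_id, next_id, root_id) as (
    -- non-recursive part: only rows that nothing points to, i.e. the list heads
    select t1.current_id, t1.next_id, t1.current_id
    from track_caps t1
    where not exists (select 1 from track_caps t2
                      where t2.next_id = t1.current_id)
    union all
    -- recursive part: follow next_id and carry the head's id along
    select c.current_id, c.next_id, p.root_id
    from track_caps c
    join contract p on c.current_id = p.next_id
)
"""

# Each table row is visited exactly once, so the CTE has as many rows as the table.
visited = conn.execute(cte + "select count(*) from contract").fetchone()[0]

# The last node of each list is the row whose next_id is NULL.
first_last = conn.execute(
    cte + "select root_id, current_id from contract"
          " where next_id is null order by root_id"
).fetchall()
print(visited, first_last)
```

For the sample data the recursion touches 7 rows in total and yields one (first, last) pair per list, instead of one intermediate row per sub-path.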

How to change / convert values in Output that comes from SQL Server table

I have created a view in my SQL Server database which returns a number of columns.
One of the column heading is Priority and the values in this column are Low, Medium, High and Immediate.
When I execute this view, the result is returned perfectly like below. I want to change or assign values for these priorities. For example: instead of Low I should get 4, instead of Medium I should get 3, for High it should be 2 and for Immediate it should be 1.
What should I do to achieve this?
Ticket# Priority
123 Low
1254 Low
5478 Medium
4585 High
etc., etc.,
Instead of Low I should get 4, instead of Medium I should get 3, for
High it should be 2 and for Immediate it should be 1
SELECT [Ticket#],
[Priority] = CASE Priority
WHEN 'Low' THEN 4
WHEN 'Medium' THEN 3
WHEN 'High' THEN 2
WHEN 'Immediate' THEN 1
END
FROM table_name;
If you use a dictionary table, as in George Botros's solution, keep the following in mind:
1) Maintaining and storing the dictionary table
2) Adding a UNIQUE index on Priority.Name to avoid duplicates like:
Priority table
Id | Name | Value
1 | Low | 4
2 | Low | 4
3) Using LEFT JOIN instead of INNER JOIN, defensively, so that you get all results even when there is no corresponding value in the dictionary table.
I have an alternative solution for your problem: create a new Priority table (Id, Name, Value).
By joining to this table you can select the Value column:
SELECT Ticket.*, Priority.Value
FROM Ticket INNER JOIN Priority
ON Priority.Name = Ticket.Priority;
Note: although the CASE expression is the most straightforward solution for this problem, this approach may be useful if you need the priority value in many places in your system.
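The two approaches, and the INNER JOIN versus LEFT JOIN caveat, can be compared in a small sketch. It uses Python's built-in sqlite3 module rather than SQL Server so that it is self-contained. The column name TicketNo and the ticket 9999 with priority 'Urgent' are hypothetical, added only to show what happens to a value that is missing from the dictionary table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Ticket (TicketNo integer, Priority text);
insert into Ticket values (123, 'Low'), (1254, 'Low'), (5478, 'Medium'),
                          (4585, 'High'), (9999, 'Urgent');
-- Name is declared unique so the same priority cannot be stored twice.
create table Priority (Id integer primary key, Name text unique, Value integer);
insert into Priority (Name, Value) values
    ('Low', 4), ('Medium', 3), ('High', 2), ('Immediate', 1);
""")

# Approach 1: map the text to a number inline with CASE.
case_rows = conn.execute("""
select TicketNo,
       case Priority
           when 'Low' then 4
           when 'Medium' then 3
           when 'High' then 2
           when 'Immediate' then 1
       end
from Ticket
order by TicketNo
""").fetchall()

# Approach 2: look the number up in the dictionary table.
left_join_rows = conn.execute("""
select t.TicketNo, p.Value
from Ticket t
left join Priority p on p.Name = t.Priority
order by t.TicketNo
""").fetchall()

inner_join_rows = conn.execute("""
select t.TicketNo, p.Value
from Ticket t
inner join Priority p on p.Name = t.Priority
order by t.TicketNo
""").fetchall()

print(case_rows)
print(left_join_rows)
print(inner_join_rows)
```

CASE and LEFT JOIN both keep the unmatched ticket with a NULL value, whereas the INNER JOIN silently drops it.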

Efficiently running an SQL query over multiple inputs

Hi, I've got a simulation snapshot that is currently stored in a PostgreSQL database as a table. The schema for the snapshot table is:
simdb=> \d isonew_4.snapshot_102
Table "isonew_4.snapshot_102"
Column | Type | Modifiers
id | integer |
x | real |
y | real |
z | real |
vx | real |
vy | real |
vz | real |
pot | real |
mass | real |
Indexes:
"snapshot_102_id_idx" btree (id) WITH (fillfactor=100)
I've got a query that calculates the mass enclosed for a single radius fine:
SELECT SUM(mass) AS mass
FROM isonew_4.snapshot_102 AS s
WHERE SQRT(s.x^2 + s.y^2 + s.z^2) < {radius}
However, I would like to run this over a number of different radii.
Since the table has around 100 million rows it's something that I would prefer to do as a SQL query rather than grabbing all of the particles and using something like numpy.histogram in python to do the binning on my machine locally.
Method #1
This query might work, with for example 10,20 and 25 as the successive values for the radius:
WITH r(radius) as (values (10),(20),(25))
SELECT radius, SUM(mass) AS mass
FROM isonew_4.snapshot_102 AS s CROSS JOIN r
WHERE SQRT(s.x^2 + s.y^2 + s.z^2) < radius
GROUP BY radius;
The output has two columns: radius and corresponding sum(mass).
Method #2
If the query is too slow because of the CROSS JOIN with the list (presumably, EXPLAIN or better EXPLAIN ANALYZE would tell for sure), a different approach that certainly guarantees a single scan of the big table is to gather all results in a single row, one column per radius, with a generated query looking like this:
SELECT
sum(case when r < 10 then s.mass else 0 end) as radius10,
sum(case when r < 20 then s.mass else 0 end) as radius20,
sum(case when r < 25 then s.mass else 0 end) as radius25
FROM (select mass,SQRT(x^2 + y^2 + z^2) as r from isonew_4.snapshot_102) AS s
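The following sketch shows how the Method #2 query might be generated for an arbitrary list of radii and checks that it agrees with Method #1. It uses Python's built-in sqlite3 module instead of PostgreSQL, a tiny hypothetical four-particle table named snapshot, and compares squared distances against squared radii (equivalent for non-negative radii) because SQLite's ^ is not a power operator and SQRT may not be compiled in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table snapshot (x real, y real, z real, mass real)")
# Hypothetical particles at distances 5, 15, 22 and 30 from the origin.
conn.executemany(
    "insert into snapshot values (?, ?, ?, ?)",
    [(3.0, 4.0, 0.0, 1.0), (0.0, 15.0, 0.0, 2.0),
     (0.0, 0.0, 22.0, 4.0), (30.0, 0.0, 0.0, 8.0)],
)
radii = [10, 20, 25]  # integers defined in code, so safe to format into SQL

# Method #1 shape: one output row per radius via a cross join.
per_radius = conn.execute(
    "with r(radius) as (values (10), (20), (25)) "
    "select radius, sum(mass) from snapshot cross join r "
    "where x*x + y*y + z*z < radius*radius "
    "group by radius order by radius"
).fetchall()

# Method #2 shape: one conditional sum per radius, all in a single row.
columns = ", ".join(
    f"sum(case when r2 < {r * r} then mass else 0 end) as radius{r}"
    for r in radii
)
single_row = conn.execute(
    f"select {columns} "
    "from (select mass, x*x + y*y + z*z as r2 from snapshot)"
).fetchone()

print(per_radius)
print(single_row)
```

Both shapes return the same enclosed masses; the generated form reads the table once regardless of how many radii are requested, at the cost of one output column per radius.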
Method #3
If it's not practical, another completely different approach that might be worth trying would be to pre-compute SQRT(x^2 + y^2 + z^2) in a btree functional index in the hope that the SQL engine can use it with the inequality comparison. Whether this happens and if the query would be faster or not depends mainly on the data distribution.
create index radius_idx on isonew_4.snapshot_102(SQRT(x^2 + y^2 + z^2));
Then use the first query, either repeated with single radius each time, or method #1 with the GROUP BY and all values at once. If the values are very selective, the execution might be way faster than even a single large sequential scan.

SQL optimization - execution plan changes based on constraint value - Why?

I've got a table ItemValue full of data on a SQL 2005 Server running in 2000 compatibility mode that looks something like (it's a User-Defined values table):
ID ItemCode FieldID Value
-- ---------- ------- ------
1 abc123 1 D
2 abc123 2 287.23
4 xyz789 1 A
5 xyz789 2 3782.23
6 xyz789 3 23
7 mno456 1 W
9 mno456 3 45
... and so on.
FieldID comes from the ItemField table:
ID FieldNumber DataFormatID Description ...
-- ----------- ------------ -----------
1 1 1 Weight class
2 2 4 Cost
3 3 3 Another made up description
. . x xxx
. . x xxx
. . x xxx
x 91 (we have 91 user-defined fields)
Because I can't PIVOT in 2000 mode, we're stuck building an ugly query using CASEs and GROUP BY to get the data to look how it should for some legacy apps, which is:
ItemNumber Field1 Field2 Field3 .... Field51
---------- ------ ------- ------
abc123 D 287.23 NULL
xyz789 A 3782.23 23
mno456 W NULL 45
You can see we only need this table to show values up to the 51st UDF. Here's the query:
SELECT iv.ItemNumber
,MAX(CASE WHEN f.FieldNumber = 1 THEN iv.[Value] ELSE NULL END) [Field1]
,MAX(CASE WHEN f.FieldNumber = 2 THEN iv.[Value] ELSE NULL END) [Field2]
,MAX(CASE WHEN f.FieldNumber = 3 THEN iv.[Value] ELSE NULL END) [Field3]
-- ... Field4 through Field50 follow the same pattern ...
,MAX(CASE WHEN f.FieldNumber = 51 THEN iv.[Value] ELSE NULL END) [Field51]
FROM ItemField f
LEFT JOIN ItemValue iv ON f.ID = iv.FieldID
WHERE f.FieldNumber <= 51
GROUP BY iv.ItemNumber
When the FieldNumber constraint is <= 51, the execution plan goes something like:
SELECT <== Compute Scalar <== Stream Aggregate <== Sort (Cost: 70%) <== Hash Match <== (Clustered Index Seek && Table Scan)
and it's fast! I can pull back 100,000+ records in about a second, which suits our needs.
However, if we had more UDFs and I change the constraint to anything above 66 (yes, I tested them one by one) or if I remove it completely, I lose the Sort in the Execution plan, and it gets replaced with a whole bunch of Parallelism blocks that gather, repartition, and distribute streams, and the entire thing is slow (30 seconds for even just 1 record).
FieldNumber has a clustered, unique index, and is part of composite primary key with the ID column (non-clustered index) in the ItemField table. The ItemValue table's ID and ItemNumber columns make a PK, and there is an extra non-clustered index on the ItemNumber column.
What is the reasoning behind this? Why does changing my simple integer constraint change the entire execution plan?
And if you're up to it... what would you do differently? There's a SQL upgrade planned for a couple months from now but I need to get this problem fixed before that.
SQL Server is smart enough to take CHECK constraints into account when optimizing the queries.
Your f.FieldNumber <= 51 is optimized out and the optimizer sees that the whole two tables should be joined (which is best done with a HASH JOIN).
If you don't have the constraint, the engine needs to check the condition and most probably uses index traversal to do this. This may be slower.
Could you please post the whole plans for the queries? Just run SET SHOWPLAN_TEXT ON and then the queries.
What is the reasoning behind this? Why does changing my simple integer constraint change the entire execution plan?
If by a constraint you mean the WHERE condition, this is probably the other thing.
Set operations (that's what SQL does) have no single most efficient algorithm: efficiency of each algorithm depends heavily on the data distribution in the sets.
Say, for taking a subset (that's what the WHERE clause does) you can either find the range of records in the index and use the index record pointers to locate the data rows in the table, or just scan all records in the table and filter them using the WHERE condition.
Efficiency of the former operation is m × const, that of the latter is n, where m is the number of records satisfying the condition, n is the total number of records in the table and const > 1.
This means that for larger values of m the full scan is more efficient.
SQL Server is aware of that and changes execution plans according to the constants that affect the data distribution in the set operations.
To do this, SQL Server maintains statistics: aggregated histograms of the data distribution in each indexed column, which it uses to build the query plans.
So changing the integer in the WHERE condition in fact affects the size and the data distribution of the underlying sets and makes SQL Server reconsider which algorithms are the best fit for sets of that size and layout.
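The m × const versus n comparison can be made concrete with a toy calculation. This is only an illustration of the simplified model above, not of SQL Server's actual costing; the function name and the value const = 4.0 are hypothetical.

```python
def cheaper_access_path(m: int, n: int, const: float) -> str:
    """Pick the cheaper path under the toy model: index cost m * const, scan cost n."""
    index_cost = m * const
    scan_cost = n
    return "index seek" if index_cost < scan_cost else "full scan"


n = 100_000   # total rows in the table
const = 4.0   # hypothetical per-row overhead of going through the index

# The plan flips once m exceeds n / const (25,000 rows here).
for m in (1_000, 25_000, 60_000):
    print(m, cheaper_access_path(m, n, const))
```

Under these assumed numbers, a predicate matching 1,000 rows favours the index, while one matching 60,000 rows favours the scan, which is the kind of flip a changed constant in the WHERE clause can trigger.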
it gets replaced with a whole bunch of Parallelism blocks
Try this:
SELECT iv.ItemNumber
,MAX(CASE WHEN f.FieldNumber = 1 THEN iv.[Value] ELSE NULL END) [Field1]
,MAX(CASE WHEN f.FieldNumber = 2 THEN iv.[Value] ELSE NULL END) [Field2]
,MAX(CASE WHEN f.FieldNumber = 3 THEN iv.[Value] ELSE NULL END) [Field3]
-- ... Field4 through Field50 follow the same pattern ...
,MAX(CASE WHEN f.FieldNumber = 51 THEN iv.[Value] ELSE NULL END) [Field51]
FROM ItemField f
LEFT JOIN ItemValue iv ON f.ID = iv.FieldID
WHERE f.FieldNumber <= 51
GROUP BY iv.ItemNumber
OPTION (MAXDOP 1)
Using OPTION (MAXDOP 1) should prevent the parallelism in the execution plan.
At 66 you are hitting some internal cost estimate threshold that decides it is better to use one plan vs. the other. What that threshold is and why it happens is not really important. Note that your query differs with each FieldNumber value, as you are not only changing the WHERE clause: you also change the pseudo-'pivot' projected fields.
Now I don't know all the details of your table and your queries and insert/update/delete/pattern, but for the particular query you posted the proper clustered index structure for the ItemValue table is this:
CREATE CLUSTERED INDEX [cdxItemValue] ON ItemValue (FieldID, ItemNumber);
This structure eliminates the need for an intermediate sort of the results for this 'pivot' query.

PostgreSQL - fetch the rows which have the Max value for a column in each GROUP BY group

I'm dealing with a Postgres table (called "lives") that contains records with columns for time_stamp, usr_id, trans_id, and lives_remaining. I need a query that will give me the most recent lives_remaining total for each usr_id.
There are multiple users (distinct usr_id's)
time_stamp is not a unique identifier: sometimes user events (one per row in the table) will occur with the same time_stamp.
trans_id is unique only for very small time ranges: over time it repeats
lives_remaining (for a given user) can both increase and decrease over time
time_stamp | lives_remaining | usr_id | trans_id
07:00 | 1 | 1 | 1
09:00 | 4 | 2 | 2
10:00 | 2 | 3 | 3
10:00 | 1 | 2 | 4
11:00 | 4 | 1 | 5
11:00 | 3 | 1 | 6
13:00 | 3 | 3 | 1
As I will need to access other columns of the row with the latest data for each given usr_id, I need a query that gives a result like this:
time_stamp | lives_remaining | usr_id | trans_id
11:00 | 3 | 1 | 6
10:00 | 1 | 2 | 4
13:00 | 3 | 3 | 1
As mentioned, each usr_id can gain or lose lives, and sometimes these timestamped events occur so close together that they have the same timestamp! Therefore this query won't work:
SELECT b.time_stamp,b.lives_remaining,b.usr_id,b.trans_id FROM
(SELECT usr_id, max(time_stamp) AS max_timestamp
FROM lives GROUP BY usr_id ORDER BY usr_id) a
JOIN lives b ON a.max_timestamp = b.time_stamp
Instead, I need to use both time_stamp (first) and trans_id (second) to identify the correct row. I also then need to pass that information from the subquery to the main query that will provide the data for the other columns of the appropriate rows. This is the hacked up query that I've gotten to work:
SELECT b.time_stamp,b.lives_remaining,b.usr_id,b.trans_id FROM
(SELECT usr_id, max(time_stamp || '*' || trans_id)
AS max_timestamp_transid
FROM lives GROUP BY usr_id ORDER BY usr_id) a
JOIN lives b ON a.max_timestamp_transid = b.time_stamp || '*' || b.trans_id
ORDER BY b.usr_id
Okay, so this works, but I don't like it. It requires a query within a query and a self join, and it seems to me that it could be much simpler by grabbing the row that MAX found to have the largest timestamp and trans_id. The table "lives" has tens of millions of rows to parse, so I'd like this query to be as fast and efficient as possible. I'm new to RDBMSs and Postgres in particular, so I know that I need to make effective use of the proper indexes. I'm a bit lost on how to optimize.
I found a similar discussion here. Can I perform some type of Postgres equivalent to an Oracle analytic function?
Any advice on accessing related column information used by an aggregate function (like MAX), creating indexes, and creating better queries would be much appreciated!
P.S. You can use the following to create my example case:
create TABLE lives (time_stamp timestamp, lives_remaining integer,
usr_id integer, trans_id integer);
insert into lives values ('2000-01-01 07:00', 1, 1, 1);
insert into lives values ('2000-01-01 09:00', 4, 2, 2);
insert into lives values ('2000-01-01 10:00', 2, 3, 3);
insert into lives values ('2000-01-01 10:00', 1, 2, 4);
insert into lives values ('2000-01-01 11:00', 4, 1, 5);
insert into lives values ('2000-01-01 11:00', 3, 1, 6);
insert into lives values ('2000-01-01 13:00', 3, 3, 1);
I would propose a clean version based on DISTINCT ON (see docs):
SELECT DISTINCT ON (usr_id)
time_stamp, lives_remaining, usr_id, trans_id
FROM lives
ORDER BY usr_id, time_stamp DESC, trans_id DESC;
I tested on a table with 158k pseudo-random rows (usr_id uniformly distributed between 0 and 10k, trans_id uniformly distributed between 0 and 30).
By query cost, below, I am referring to the cost estimate of Postgres' cost-based optimizer (with Postgres' default xxx_cost values), which is a weighted estimate of the required I/O and CPU resources; you can obtain this by firing up PgAdminIII and running "Query/Explain (F7)" on the query with "Query/Explain options" set to "Analyze".
Quassnoi's query has a cost estimate of 745k (!), and completes in 1.3 seconds (given a compound index on (usr_id, trans_id, time_stamp))
Bill's query has a cost estimate of 93k, and completes in 2.9 seconds (given a compound index on (usr_id, trans_id))
Query #1 below has a cost estimate of 16k, and completes in 800ms (given a compound index on (usr_id, trans_id, time_stamp))
Query #2 below has a cost estimate of 14k, and completes in 800ms (given a compound function index on (usr_id, EXTRACT(EPOCH FROM time_stamp), trans_id)); this is Postgres-specific
Query #3 below (Postgres 8.4+) has a cost estimate and completion time comparable to (or better than) query #2 (given a compound index on (usr_id, time_stamp, trans_id)); it has the advantage of scanning the lives table only once and, should you temporarily increase (if needed) work_mem to accommodate the sort in memory, it will be by far the fastest of all queries.
All times above include retrieval of the full 10k-row result set.
Your goal is minimal cost estimate and minimal query execution time, with an emphasis on estimated cost. Query execution time can depend significantly on runtime conditions (e.g. whether relevant rows are already fully cached in memory or not), whereas the cost estimate does not. On the other hand, keep in mind that the cost estimate is exactly that, an estimate.
The best query execution time is obtained when running on a dedicated database without load (e.g. playing with pgAdminIII on a development PC.) Query time will vary in production based on actual machine load/data access spread. When one query appears slightly faster (<20%) than the other but has a much higher cost, it will generally be wiser to choose the one with higher execution time but lower cost.
When you expect that there will be no competition for memory on your production machine at the time the query is run (e.g. the RDBMS cache and filesystem cache won't be thrashed by concurrent queries and/or filesystem activity) then the query time you obtained in standalone (e.g. pgAdminIII on a development PC) mode will be representative. If there is contention on the production system, query time will degrade proportionally to the estimated cost ratio, as the query with the lower cost does not rely as much on cache whereas the query with higher cost will revisit the same data over and over (triggering additional I/O in the absence of a stable cache), e.g.:
cost | time (dedicated machine) | time (under load) |
some query A: 5k | (all data cached) 900ms | (less i/o) 1000ms |
some query B: 50k | (all data cached) 900ms | (lots of i/o) 10000ms |
Do not forget to run ANALYZE lives once after creating the necessary indices.
Query #1
-- incrementally narrow down the result set via inner joins
-- the CBO may elect to perform one full index scan combined
-- with cascading index lookups, or as hash aggregates terminated
-- by one nested index lookup into lives - on my machine
-- the latter query plan was selected given my memory settings and
-- histogram
SELECT
l1.*
FROM
lives AS l1
INNER JOIN (
SELECT
usr_id,
MAX(time_stamp) AS time_stamp_max
FROM lives
GROUP BY usr_id
) AS l2
ON
l1.usr_id = l2.usr_id AND
l1.time_stamp = l2.time_stamp_max
INNER JOIN (
SELECT
usr_id,
time_stamp,
MAX(trans_id) AS trans_max
FROM lives
GROUP BY
usr_id, time_stamp
) AS l3
ON
l1.usr_id = l3.usr_id AND
l1.time_stamp = l3.time_stamp AND
l1.trans_id = l3.trans_max
Query #2
-- cheat to obtain a max of the (time_stamp, trans_id) tuple in one pass
-- this results in a single table scan and one nested index lookup into lives,
-- by far the least I/O intensive operation even in case of great scarcity
-- of memory (least reliant on cache for the best performance)
SELECT
l1.*
FROM
lives AS l1
INNER JOIN (
SELECT
usr_id,
MAX(ARRAY[EXTRACT(EPOCH FROM time_stamp),trans_id])
AS compound_time_stamp
FROM lives
GROUP BY usr_id
) AS l2
ON
l1.usr_id = l2.usr_id AND
EXTRACT(EPOCH FROM l1.time_stamp) = l2.compound_time_stamp[1] AND
l1.trans_id = l2.compound_time_stamp[2]
2013/01/29 update
Finally, as of version 8.4, Postgres supports window functions, meaning you can write something as simple and efficient as:
Query #3
-- use Window Functions
-- performs a SINGLE scan of the table
SELECT DISTINCT
usr_id,
last_value(time_stamp) OVER wnd,
last_value(lives_remaining) OVER wnd,
last_value(trans_id) OVER wnd
FROM lives
WINDOW wnd AS (
PARTITION BY usr_id ORDER BY time_stamp, trans_id
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
);
Here's another method, which happens to use no correlated subqueries or GROUP BY. I'm not expert in PostgreSQL performance tuning, so I suggest you try both this and the solutions given by other folks to see which works better for you.
SELECT l1.*
FROM lives l1 LEFT OUTER JOIN lives l2
ON (l1.usr_id = l2.usr_id AND (l1.time_stamp < l2.time_stamp
OR (l1.time_stamp = l2.time_stamp AND l1.trans_id < l2.trans_id)))
WHERE l2.usr_id IS NULL
ORDER BY l1.usr_id;
I am assuming that trans_id is unique at least over any given value of time_stamp.
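To see the anti-join at work on the question's sample rows, and to compare it with a window-function formulation, here is a small sketch. It uses Python's built-in sqlite3 module as a stand-in for Postgres (window functions need SQLite 3.25 or later) and stores the timestamps as text, which sorts correctly for this fixed format. It checks only that the results agree, not which is faster.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table lives (time_stamp text, lives_remaining integer,"
    " usr_id integer, trans_id integer)"
)
# Sample rows from the question.
conn.executemany(
    "insert into lives values (?, ?, ?, ?)",
    [
        ("2000-01-01 07:00", 1, 1, 1),
        ("2000-01-01 09:00", 4, 2, 2),
        ("2000-01-01 10:00", 2, 3, 3),
        ("2000-01-01 10:00", 1, 2, 4),
        ("2000-01-01 11:00", 4, 1, 5),
        ("2000-01-01 11:00", 3, 1, 6),
        ("2000-01-01 13:00", 3, 3, 1),
    ],
)

# Anti-join: keep a row only if no "later" row exists for the same user,
# where later means a greater (time_stamp, trans_id) pair.
anti_join = conn.execute("""
select l1.time_stamp, l1.lives_remaining, l1.usr_id, l1.trans_id
from lives l1 left outer join lives l2
  on (l1.usr_id = l2.usr_id and (l1.time_stamp < l2.time_stamp
      or (l1.time_stamp = l2.time_stamp and l1.trans_id < l2.trans_id)))
where l2.usr_id is null
order by l1.usr_id
""").fetchall()

# Window function: number each user's rows from newest to oldest, keep the first.
windowed = conn.execute("""
select time_stamp, lives_remaining, usr_id, trans_id
from (
    select lives.*,
           row_number() over (partition by usr_id
                              order by time_stamp desc, trans_id desc) as r
    from lives
)
where r = 1
order by usr_id
""").fetchall()

print(anti_join)
print(windowed)
```

Both queries return the three rows the question asks for, one per usr_id, with the tie at 11:00 for user 1 resolved by the larger trans_id.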
PostgreSQL has an option called DISTINCT ON (it long predates 9.5):
SELECT DISTINCT ON (location) location, time, report
FROM weather_reports
ORDER BY location, time DESC;
It eliminates duplicate rows and leaves only the first row of each group, as defined by the ORDER BY clause.
see the official documentation
I like the style of Mike Woodhouse's answer on the other page you mentioned. It's especially concise when the thing being maximised over is just a single column, in which case the subquery can just use MAX(some_col) and GROUP BY the other columns. In your case you have a two-part quantity to be maximised, but you can still do so by using ORDER BY plus LIMIT 1 instead (as done by Quassnoi):
SELECT *
FROM lives o
WHERE (usr_id, time_stamp, trans_id) IN (
SELECT usr_id, time_stamp, trans_id
FROM lives sq
WHERE sq.usr_id = o.usr_id
ORDER BY time_stamp DESC, trans_id DESC
LIMIT 1
)
I find using the row-constructor syntax WHERE (a, b, c) IN (subquery) nice because it cuts down on the amount of verbiage needed.
Actually, there's a hacky solution for this problem. Let's say you want to select the biggest tree of each forest in a region.
SELECT (array_agg( ORDER BY tree_size.size)))[1]
FROM tree JOIN forest ON (tree.forest =
When you group trees by forest, you get an unsorted list of trees and need to find the biggest one. The first thing to do is sort the rows by size and select the first one in the list. It may seem inefficient, but if you have millions of rows it will be considerably faster than the solutions that include JOINs and WHERE conditions.
BTW, note that ORDER BY inside array_agg was introduced in PostgreSQL 9.0.
You can do it with window functions
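SELECT t.*
FROM (SELECT *,
ROW_NUMBER() OVER (PARTITION BY usr_id ORDER BY time_stamp DESC, trans_id DESC) AS r
FROM lives) as t
WHERE t.r = 1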
SELECT l.*
FROM (
SELECT DISTINCT usr_id
FROM lives
) lo, lives l
WHERE l.ctid = (
SELECT ctid
FROM lives li
WHERE li.usr_id = lo.usr_id
ORDER BY
time_stamp DESC, trans_id DESC
LIMIT 1
)
Creating an index on (usr_id, time_stamp, trans_id) will greatly improve this query.
You should always, always have some kind of PRIMARY KEY in your tables.
I think you've got one major problem here: there's no monotonically increasing "counter" to guarantee that a given row has happened later in time than another. Take this example:
timestamp lives_remaining user_id trans_id
10:00 4 3 5
10:00 5 3 6
10:00 3 3 1
10:00 2 3 2
You cannot determine from this data which is the most recent entry. Is it the second one or the last one? There is no sort or max() function you can apply to any of this data to give you the correct answer.
Increasing the resolution of the timestamp would be a huge help. Since the database engine serializes requests, with sufficient resolution you can guarantee that no two timestamps will be the same.
Alternatively, use a trans_id that won't roll over for a very, very long time. Having a trans_id that rolls over means you can't tell (for the same timestamp) whether trans_id 6 is more recent than trans_id 1 unless you do some complicated math.