I have a Firebird database with a table with two columns: a date and a value.
I want to get the most recent value. The problem is that this table can easily have over 200k rows. My select query takes more than 400 ms, which is just too long for my application. Is there any way I can speed this up?
I can't change the database in any way, and I cannot use any window functions introduced in Firebird 3.0.
Here is my query:
SELECT REG_DATE_TIME, REG_VAL FROM TAB_REG_VAL
WHERE REG_DATE_TIME = (SELECT MAX(REG_DATE_TIME) FROM TAB_REG_VAL);
I also tried select first .. order by, but the execution time was similar.
I am using the C# ADO.NET connected layer, if that's important.
You need to create a descending index for the column REG_DATE_TIME:
create descending index idx_tab_reg_val_reg_date_time on TAB_REG_VAL(REG_DATE_TIME);
You can then use
SELECT FIRST 1 REG_DATE_TIME,REG_VAL
FROM TAB_REG_VAL
ORDER BY REG_DATE_TIME DESC
Without the index, Firebird will need to materialize and sort the entire result set before it can return you that first row. This is inefficient (and can get even worse if the result set is larger than the sort memory, in which case Firebird will sort in a temporary file on disk).
With the index, Firebird just needs to access a few pages in the index and one or more records in the table to locate the first record that is visible to your transaction.
Note: Firebird indexes can - currently - only be used for sorting in a single direction, which means that depending on your access needs, you may need to create an index in ascending direction as well.
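For example, if you also needed the oldest value (a hypothetical additional query), an ascending index would be required as well. A minimal sketch, with a made-up index name:
create ascending index idx_reg_date_time_asc on TAB_REG_VAL(REG_DATE_TIME);
SELECT FIRST 1 REG_DATE_TIME, REG_VAL
FROM TAB_REG_VAL
ORDER BY REG_DATE_TIME ASC
-- the ascending index lets Firebird walk the index from the smallest key instead of sorting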
I would try:
SELECT FIRST (1) REG_DATE_TIME, REG_VAL
FROM TAB_REG_VAL
ORDER BY REG_DATE_TIME DESC;
With an index on TAB_REG_VAL(REG_DATE_TIME, REG_VAL).
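The DDL for that index might look like this (a sketch; the index name is made up, and the index is created descending so it can serve the ORDER BY ... DESC directly):
create descending index idx_reg_date_val on TAB_REG_VAL(REG_DATE_TIME, REG_VAL);
-- a descending composite index on (REG_DATE_TIME, REG_VAL)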
Related
Background: a large table, 50M+ rows; every column used in the query is indexed.
When I run a query like this:
select * from table where A=? order by id DESC limit 10;
In the statement, A and id are both indexed.
Now something confusing happens:
the more rows are returned, the less time the whole query takes
the fewer rows are returned, the more time the whole query takes
I have a guess here: Postgres does the ORDER BY first and then the WHERE, so it takes more time to find 10 rows in the ordered index when the target row set is small (like finding 10 particular grains of sand on a beach); conversely, if the target row set is large, it's easy to find the first 10.
Is that right? Or is there some other reason for this?
Final question: how can I optimize this situation?
It can either use the index on A to apply the selectivity, then sort on "id" and apply the limit. Or it can read the rows already in order using the index on "id", then filter for the ones that meet the A condition until it finds 10 of them. It will choose whichever it thinks is faster, and sometimes it makes the wrong choice.
If you had a multi-column index on (A, id), it could use that one index to do both things at the same time: apply the selectivity on A and still fetch the rows already ordered by "id".
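A sketch of that index (the table name mytable and the index name are placeholders, since the question only shows a generic table):
CREATE INDEX idx_a_id ON mytable (A, id);
-- PostgreSQL can also scan this index backwards, so it serves ORDER BY id DESC as well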
Do you know pgAdmin? With EXPLAIN VERBOSE before your statement, you can check how the query is executed (i.e. the order of the operators). Usually the filter happens first and only afterwards the sorting...
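For example (a sketch; mytable and the literal 42 stand in for the question's generic names):
EXPLAIN VERBOSE
SELECT * FROM mytable WHERE A = 42 ORDER BY id DESC LIMIT 10;
-- the plan shows whether the planner sorts after filtering or walks an index already in order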
I need to know the last entry in a Derby database with a few million entries. Using the MAX() function I get very poor performance. I also tried scrollable result sets and jumping to the last entry, but that was even worse.
Is there any simple, performant way to get the last row without iterating over all entries?
SQL: select max(IDDATALINE) from DATADOUBLE_2 where DATADOUBLE_2.DATATYPE = 19
Table: IDDATALINE [BIGINT] | DATATYPE [INTEGER] | DATA [DOUBLE]
I don't know if any indexes are defined; I'm working with source code I took over. I'll see if I find something. If I don't find anything, where/how do I add an index?
This is your query:
select max(IDDATALINE)
from DATADOUBLE_2
where DATADOUBLE_2.DATATYPE = 19;
You should be able to improve performance by creating an index. I would recommend:
create index idx_DATADOUBLE2_DATATYPE_IDDATALINE on DATADOUBLE_2(DATATYPE, IDDATALINE)
Note this is a composite index using two columns in that particular order.
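To see whether any indexes already exist, you can query Derby's system catalog; this is a sketch (catalog table and column names as documented for recent Derby versions, so treat it as an assumption to verify):
select c.CONGLOMERATENAME
from SYS.SYSCONGLOMERATES c
join SYS.SYSTABLES t on c.TABLEID = t.TABLEID
where t.TABLENAME = 'DATADOUBLE_2' and c.ISINDEX
-- lists the index conglomerates defined on DATADOUBLE_2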
You can sort descending and fetch the first row. Example:
select Something
from SomeTable
order by SomeField desc
fetch first 1 rows only
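Applied to the table from the question, that would be something like this (a sketch; with the composite index from the other answer in place, Derby can read just the first matching index entry):
select IDDATALINE, DATA
from DATADOUBLE_2
where DATATYPE = 19
order by IDDATALINE desc
fetch first 1 rows only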
I have a big table that contains more than 100K records, in Oracle. I want to get all of the records and save each row to a file with JDBC.
To make this faster, I want to create 100 threads to read the data from the table concurrently. I will get the total count of the records in a first SQL statement, then split it into 100 pages, then fetch one page per thread, each with its own connection.
But I have a problem: there is no column that can be used for ordering. There is no sequence column and no accurate timestamp. I can't use a SQL query without an ORDER BY clause, since there is no guarantee it will return the data in the same order every time (per this question).
So is it possible to solve this?
Finally, I used rowid to order:
select * from mytable order by rowid
It seems to work well.
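A per-thread page query could then look like this (a sketch, assuming Oracle 12c or later for the OFFSET/FETCH syntax; :page_offset and :page_size are hypothetical bind variables the application computes for each thread):
select *
from mytable
order by rowid
offset :page_offset rows
fetch next :page_size rows only
Note that rowids are only stable while rows are not physically moved, so this relies on the table not being reorganized during the export.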
I am not clear on the following:
Assume I have a table A and I have created an index on column X.
If I do a SELECT based on that column, the result will be sorted, right?
But if for some reason I did a SELECT followed by an ORDER BY X (e.g. I was unaware that the column was indexed), will SQL Server do the sort with sequential access, or will it go and use the index?
If you don't specify an ORDER BY in your SELECT query, then there's no guaranteed / no deterministic ordering of any kind.
The data will be returned - but there's no guarantee of any order (or that it will be returned in the same order next time you query for it)
So if you need an order, then you must specify an ORDER BY in your SELECT - it's as simple as that.
And yes - if there's an index in place, and if it makes sense - then the SQL Server query optimizer will indeed use that index. But there are a lot of ifs and buts involved - it might - or might not - use the index - that's entirely up to the optimizer to decide (and this depends on a lot of factors - too many to list here).
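A minimal illustration (using the table and column from the question):
SELECT X FROM A ORDER BY X;  -- deterministic order; the optimizer may read the index on X and skip the sort
SELECT X FROM A;             -- no guaranteed order, even though the index on X exists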
(this specifically applies to SQL Server which I know well - I don't know how other RDBMS handle this situation)
Whether or not the RDBMS will use the index depends on the predicates in your SQL query, what other indexes are on the table, and how the optimiser chooses to execute your query.
So if you have
SELECT A, X
FROM T
WHERE A = some_value
ORDER BY X
and there is an index on A, then the RDBMS may (emphasis on MAY) choose to access the data via that index rather than via the index on X. This means the SELECT will be followed by a sort.
As with most database questions, the answer is "It depends"
Without an ORDER BY in your SELECT, there is no deterministic ordering of any kind.
Indexing plus sorting is quite a complex story. It depends on the RDBMS, the RDBMS version, and the data in the table.
Example: a comments table with 1,000,000 rows:
id - int, unique number from a sequence/auto_increment etc.
created - long, indexed
updated - long, indexed
title - varchar(50), indexed
body - text
Selects:
SELECT id FROM comments;
Oracle, MySQL, and PostgreSQL return records in what is effectively random order. You can have the illusion of order; however, after some administrative work (VACUUM, ANALYZE TABLE, OPTIMIZE TABLE) everything can change.
SELECT * FROM comments ORDER BY created DESC;
Postgresql and Oracle will perform full table scan.
SELECT created FROM comments ORDER BY created;
Oracle will perform full index scan (only index data) but Postgresql will perform full table scan.
SELECT * FROM comments WHERE created > sysdate-7 ORDER BY created DESC;
Postgresql and Oracle will perform index range scan and read data from table.
SELECT created FROM comments WHERE created > sysdate-7 ORDER BY created;
Oracle will perform index range scan (no read from table) but Postgresql will perform index range scan and read data from table.
I've got a table with a huge amount of data, let's say 10 GB of rows, containing a bunch of crap. I need to select, for example, the X rows (X is usually below 10) with the highest values in the amount column.
Is there any way to do it without sorting the whole table? Sorting that amount of data is extremely time-expensive. I'd be OK with one scan through the whole table that selects the X highest values and leaves the rest untouched. I'm using SQL Server.
Create an index on amount; then SQL Server can select the top 10 from that index and do bookmark lookups to retrieve the missing columns.
SELECT TOP 10 Amount FROM myTable ORDER BY Amount DESC
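A sketch of that index (the name is made up):
CREATE NONCLUSTERED INDEX IX_myTable_Amount ON myTable (Amount DESC);
-- an ascending index works too; SQL Server can scan it backwards to satisfy ORDER BY Amount DESC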
If it is indexed, the query optimizer should use the index.
If not, I do not see how one could avoid scanning the whole thing...
Whether an index is useful or not depends on how often you run that search.
You could also consider putting that query into an indexed view. I think this will give you the best benefit/cost ratio.