I have a SQLite DB file lets say a.db. And I have 100 rows in table a. When I give
SELECT * FROM a WHERE rowid BETWEEN 1 AND 100
it gives me the first 100 results. Now if I delete the first 2 rows from the table and run the
SELECT * FROM a WHERE rowid BETWEEN 1 AND 100
it gives me only 98 rows. And when I give the query
SELECT * FROM a WHERE rowid = 1
it gives me an empty row. The database doesn't seem to rearrange the rowid.
Please help.. Is there any way to force the SQLite to rearrange the row ids ?
In SQLite, rowid is just an int column with autoincremented values. The only thing you're guaranteed is that each rowid will be unique to that table -- nothing more. You aren't guaranteed that rowids will be sequential, and you aren't even guaranteed that every number will be used! For example, if an INSERT fails, the rowid used in that failed insert will probably be skipped and never used again (if the autoincrement keyword is explicitly used -- otherwise, if a row is deleted, its rowid is free to be reused by any newly inserted data.)
If you want rowids to be sequential and autoupdated, you'll have to create your own version of the rowid column and keep it updated with triggers.
I found out the answer. We need to run the Vacuum query after every delete. This will rearrange all the rowindex and remove the dead rows.
http://sqlite.org/lang_vacuum.html
Related
I have a table called points. I executed the following query and expected a list of lexicographicaly sorted list of ROWIDs but that did not happen. How does Order by rowid sorts the row?
select rowid from points order by rowid
I had rows like following
AAAE6MAAFAAABiSAAA
AAAE6MAAFAAABi+AAA
2nd row is lexicographicaly smaller than first row. So what is sorting criteria if it is not lecxicographical sorting?
Why you see is only a representation used for display purposes.
The actual rowid contains binary information about the data block, the row in the block, the file where the block is located and the internal object id of the table (See the manual for details)
When you use order by rowid Oracle sorts the rows based on that (internal) information, not based on the "string representation".
If you change your query to:
select rowid,
dbms_rowid.rowid_relative_fno(rowid) as rel_fno,
dbms_rowid.rowid_row_number(rowid) as row_num,
dbms_rowid.rowid_block_number(rowid) as block_num,
dbms_rowid.rowid_object(rowid)
from points
order by rowid
You will most probably see the logic behind the ordering of the rownumber.
Note that the value for dbms_rowid.rowid_object() will always be the same. And if you only have two rows in your table, both will most probably also have the same value for rowid_block_number()
The sequence of rowid is not guranteed. It depends on how you have set the NLS settings. Also rowid represents the physical allocation of the row in the database. A rowid is considered immutable(does not change) but if you delete a row and insert it again then it changes.
If you delete a row, then Oracle may reassign its rowid to a new row
inserted later.
What is the fastest regarding performance way to check that integer column contains specific value?
I have a table with 10 million rows in postgresql 8.4. I need to do at least 10000 checks per sec.
Currently i am doing query SELECT id FROM table WHERE id = my_value and then checking does DataReader have rows. But it is quite slow. Is there any way to speed up without loading whole column into memory?
You can select COUNT instead:
SELECT COUNT(*) FROM table WHERE id = my_value
It will return just one integer value - number of rows matching your select condition.
You need two things,
As Marcin pointed out, you want to use the COUNT(*) if all you need is to know how many. You also need an index on that column. The index will have the answer pretty much right at hand. Without the index, Postgresql would still have to go through the entire table to count that one number.
CREATE INDEX id_idx ON table (id) ASC NULLS LAST;
Something of the sort should get you there. Whether it is enough to run the query 10,000/sec. will depend on your hardware...
If you use where id = X then all values matching X will be returned. Suppose 1000 values match X then 1000 values will be returned.
Now, if you only want to check if the value is at least once then after you matched the first value there is no need to process the other 999. Even if you count the values you are still going through all of them.
What I would do in this case is this:
SELECT 1 FROM table
WHERE id = my_value
LIMIT 1
Note that I'm not even returning the id itself. So if you get one record then the value is there.
Of course, in order to improve this query, make sure you have an index on the id column.
I have some data in oracle table abot 10,000 rows i want to genrate column which return 1 for ist row and 2 for second and so on 1 for 3rd and 2 for 4th and 1 for 5th and 2 for 6th and so on..Is there any way that i can do it using sql query or any script which can update my column like this.that it will generate 1,2 as i mentioned above i have thought much but i didn't got to do this using sql or any other scencrio for my requirements.plz help if any possibility for doing this with my table data
You can use the combination of the ROWNUM and MOD functions.
Your query would look something like this:
SELECT ROWNUM, 2 - MOD(ROWNUM, 2) FROM ...
The MOD function will return 0 for even rows and 1 for odd rows.
select mod(rownum,5)+1,fld1, fld2, fld3 from mytable;
Edit:
I did not misunderstand requirements, I worked around them. Adding a column and then updating a table that way is a bad design idea. Tables are seldom completely static, even rule and validation tables. The only time this might make any sense is if the table is locked against delete, insert, and update. Any change to any existing row can alter the logical order. Which was never specified. Delete means the entire sequence has to be rewritten. Update and insert can have the same effect.
And if you wanted to do this you can use a sequence to insert a bogus counter. A sequence that cycles over and over, assuming you know the order and can control inserts and updates in terms of that order.
I need to periodically update a local cache with new additions to some DB table. The table rows contain an auto-increment sequential number (SN) field. The cache keeps this number too, so basically I just need to fetch all rows with SN larger than the highest I already have.
SELECT * FROM table where SN > <max_cached_SN>
However, the majority of the attempts will bring no data (I just need to make sure that I have an absolutely up-to-date local copy). So I wander if this will be more efficient:
count = SELECT count(*) from table;
if (count > <cache_size>)
// fetch new rows as above
I suppose that selecting by an indexed numeric field is quite efficient, so I wander whether using count has benefit. On the other hand, this test/update will be done quite frequently and by many clients, so there is a motivation to optimize it.
this test/update will be done quite frequently and by many clients
this could lead to unexpected race competition for cache generation
I would suggest
upon new addition to your table add the newest id into a queue table
using like crontab to trigger the cache generation by checking queue table
upon new cache generated, delete the id from queue table
as you stress majority of the attempts will bring no data, the above will only trigger where there is new addition
and the queue table concept, even can expand for update and delete
I believe that
SELECT * FROM table where SN > <max_cached_SN>
will be faster, because select count(*) may call table scan. Just for clarification, do you never delete rows from this table?
SELECT COUNT(*) may involve a scan (even a full scan), while SELECT ... WHERE SN > constant can effectively use an index by SN, and looking at very few index nodes may suffice. Don't count items if you don't need the exact total, it's expensive.
You don't need to use SELECT COUNT(*)
There is two solution.
You can use a temp table that has one field that contain last count of your table, and create new Trigger after insert on your table and inc temp table field in Trigger.
You can use a temp table that has one field that contain last SN of your table is cached and create new Trigger after insert on your table and update temp table field in Trigger.
not much to this really
drop table if exists foo;
create table foo
(
foo_id int unsigned not null auto_increment primary key
)
engine=innodb;
insert into foo values (null),(null),(null),(null),(null),(null),(null),(null),(null);
select * from foo order by foo_id desc limit 10;
insert into foo values (null),(null),(null),(null),(null),(null),(null),(null),(null);
select * from foo order by foo_id desc limit 10;
I've always just used "SELECT COUNT(1) FROM X" but perhaps this is not the most efficient. Any thoughts? Other options include SELECT COUNT(*) or perhaps getting the last inserted id if it is auto-incremented (and never deleted).
How about if I just want to know if there is anything in the table at all? (e.g., count > 0?)
The best way is to make sure that you run SELECT COUNT on a single column (SELECT COUNT(*) is slower) - but SELECT COUNT will always be the fastest way to get a count of things (the database optimizes the query internally).
If you check out the comments below, you can see arguments for why SELECT COUNT(1) is probably your best option.
To follow up on girasquid's answer, as a data point, I have a sqlite table with 2.3 million rows. Using select count(*) from table, it took over 3 seconds to count the rows. I also tried using SELECT rowid FROM table, (thinking that rowid is a default primary indexed key) but that was no faster. Then I made an index on one of the fields in the database (just an arbitrary field, but I chose an integer field because I knew from past experience that indexes on short fields can be very fast, I think because the index is stored a copy of the value in the index itself). SELECT my_short_field FROM table brought the time down to less than a second.
If you are sure (really sure) that you've never deleted any row from that table and your table has not been defined with the WITHOUT ROWID optimization you can have the number of rows by calling:
select max(RowId) from table;
Or if your table is a circular queue you could use something like
select MaxRowId - MinRowId + 1 from
(select max(RowId) as MaxRowId from table) JOIN
(select min(RowId) as MinRowId from table);
This is really really fast (milliseconds), but you must pay attention because sqlite says that row id is unique among all rows in the same table. SQLite does not declare that the row ids are and will be always consecutive numbers.
The fastest way to get row counts is directly from the table metadata, if any. Unfortunately, I can't find a reference for this kind of data being available in SQLite.
Failing that, any query of the type
SELECT COUNT(non-NULL constant value) FROM table
should optimize to avoid the need for a table, or even an index, scan. Ideally the engine will simply return the current number of rows known to be in the table from internal metadata. Failing that, it simply needs to know the number of entries in the index of any non-NULL column (the primary key index being the first place to look).
As soon as you introduce a column into the SELECT COUNT you are asking the engine to perform at least an index scan and possibly a table scan, and that will be slower.
I do not believe you will find a special method for this. However, you could do your select count on the primary key to be a little bit faster.
sp_spaceused 'table_name' (exclude single quote)
this will return the number of rows in the above table, this is the most efficient way i have come across yet.
it's more efficient than select Count(1) from 'table_name' (exclude single quote)
sp_spaceused can be used for any table, it's very helpful when the table is exceptionally big (hundreds of millions of rows), returns number of rows right a way, whereas 'select Count(1)' might take more than 10 seconds. Moreover, it does not need any column names/key field to consider.