How to get rid of gaps in rowid numbering after deleting rows?

Table tmp :
CREATE TABLE if not exists tmp (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL);
I inserted 5 rows. select rowid,id,name from tmp; returns:
rowid id name
1 1 a
2 2 b
3 3 c
4 4 d
5 5 e
Now I delete the rows with id 3 and 4 and run the above query again:
rowid id name
1 1 a
2 2 b
5 5 e
rowid is not getting reset and leaves holes. Even after vacuum it doesn't reset rowid.
I want:
rowid id name
1 1 a
2 2 b
3 5 e
How to achieve above output?

I assume you already know a little about rowid, since you're asking about its interaction with the VACUUM command, but this may be useful information for future readers:
rowid is a special column available in all tables (unless you use WITHOUT ROWID), used internally by sqlite. A VACUUM is supposed to rebuild the table, aiming to reduce fragmentation in the database file, and may change the values of the rowid column. Moving on.
Here's the answer to your question: rowid is really special. So special that if you have an INTEGER PRIMARY KEY, it becomes an alias for the rowid column. From the docs on rowid:
With one exception noted below, if a rowid table has a primary key that consists of a single column and the declared type of that column is "INTEGER" in any mixture of upper and lower case, then the column becomes an alias for the rowid. Such a column is usually referred to as an "integer primary key". A PRIMARY KEY column only becomes an integer primary key if the declared type name is exactly "INTEGER". Other integer type names like "INT" or "BIGINT" or "SHORT INTEGER" or "UNSIGNED INTEGER" causes the primary key column to behave as an ordinary table column with integer affinity and a unique index, not as an alias for the rowid.
This makes your primary key faster than it would've been otherwise (presumably because there's no lookup from your primary key to rowid):
The data for rowid tables is stored as a B-Tree structure containing one entry for each table row, using the rowid value as the key. This means that retrieving or sorting records by rowid is fast. Searching for a record with a specific rowid, or for all records with rowids within a specified range is around twice as fast as a similar search made by specifying any other PRIMARY KEY or indexed value.
Of course, when your primary key is an alias for rowid, it would be terribly inconvenient if this could change. Since rowid is now aliased to your application data, it would not be acceptable for sqlite to change it.
Hence, this little note in the VACUUM docs:
The VACUUM command may change the ROWIDs of entries in any tables that do not have an explicit INTEGER PRIMARY KEY.
If you really really really absolutely need the rowid to change on a VACUUM (I don't see why -- feel free to discuss your reasons in the comments, I may have some suggestions), you can avoid this aliasing behavior. Note that it will decrease the performance of any table lookups using your primary key.
To avoid the aliasing, and degrade your performance, you can use INT instead of INTEGER when defining your key:
A PRIMARY KEY column only becomes an integer primary key if the declared type name is exactly "INTEGER". Other integer type names like "INT" or "BIGINT" or "SHORT INTEGER" or "UNSIGNED INTEGER" causes the primary key column to behave as an ordinary table column with integer affinity and a unique index, not as an alias for the rowid.
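To see the aliasing from the question's side, here is a minimal sketch using Python's built-in sqlite3 module. The file name demo.db and the function name are made up for illustration; the table is the one from the question. It only shows the documented behaviour for an INTEGER PRIMARY KEY table: rowid and id stay identical, and VACUUM leaves both alone.

```python
import os
import sqlite3
import tempfile


def rowids_after_vacuum():
    """Delete ids 3 and 4, run VACUUM, and return the (rowid, id, name) rows."""
    with tempfile.TemporaryDirectory() as tmpdir:
        # File-backed database in a throwaway directory (hypothetical file name).
        conn = sqlite3.connect(os.path.join(tmpdir, "demo.db"))
        try:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS tmp ("
                "id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
            )
            conn.executemany(
                "INSERT INTO tmp (name) VALUES (?)",
                [("a",), ("b",), ("c",), ("d",), ("e",)],
            )
            conn.execute("DELETE FROM tmp WHERE id IN (3, 4)")
            conn.commit()  # VACUUM cannot run inside an open transaction
            conn.execute("VACUUM")
            return conn.execute(
                "SELECT rowid, id, name FROM tmp ORDER BY rowid"
            ).fetchall()
        finally:
            conn.close()


print(rowids_after_vacuum())  # [(1, 1, 'a'), (2, 2, 'b'), (5, 5, 'e')]
```

Because id is the rowid here, an output with rowid 3 next to id 5 cannot come out of this table definition.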

I found a solution that works in some cases. I don't know why, but this worked.
1. Rename the column "id" to any other name (without PRIMARY KEY), or delete this column, because you already have "rowid".
CREATE TABLE if not exists tmp (
my_i INTEGER NOT NULL,
name TEXT NOT NULL);
2. Insert 5 rows into it.
select rowid,* from tmp;
rowid my_i name
1 1 a
2 2 b
3 3 c
4 4 d
5 5 e
3. Delete the rows with rowid 3 and 4 and run the above query again.
DELETE FROM tmp WHERE rowid = 3;
DELETE FROM tmp WHERE rowid = 4;
select rowid,* from tmp;
rowid my_i name
1 1 a
2 2 b
5 5 e
4. Run SQL
VACUUM;
5. Run SQL
select rowid,* from tmp;
The output:
rowid my_i name
1 1 a
2 2 b
3 5 e
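The five steps above can be replayed from Python roughly like this (a sketch; the file name demo.db and the function name are invented for the example). One caveat: the VACUUM docs only say rowids "may" change for tables without an explicit INTEGER PRIMARY KEY, so the renumbered rowids in the second result are what was observed above, not something the sketch can promise on every SQLite version.

```python
import os
import sqlite3
import tempfile


def vacuum_without_primary_key():
    """Run steps 1-5 and return the rows seen before and after VACUUM."""
    with tempfile.TemporaryDirectory() as tmpdir:
        # File-backed database in a throwaway directory (hypothetical file name).
        conn = sqlite3.connect(os.path.join(tmpdir, "demo.db"))
        try:
            # Step 1: no PRIMARY KEY, so my_i is an ordinary column.
            conn.execute(
                "CREATE TABLE IF NOT EXISTS tmp ("
                "my_i INTEGER NOT NULL, name TEXT NOT NULL)"
            )
            # Step 2: insert 5 rows.
            conn.executemany(
                "INSERT INTO tmp (my_i, name) VALUES (?, ?)",
                [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")],
            )
            # Step 3: delete the rows with rowid 3 and 4.
            conn.execute("DELETE FROM tmp WHERE rowid IN (3, 4)")
            conn.commit()  # VACUUM cannot run inside an open transaction
            query = "SELECT rowid, my_i, name FROM tmp ORDER BY my_i"
            before = conn.execute(query).fetchall()
            # Steps 4 and 5: VACUUM, then look at the rowids again.
            conn.execute("VACUUM")
            after = conn.execute(query).fetchall()
            return before, after
        finally:
            conn.close()


before, after = vacuum_without_primary_key()
print("before VACUUM:", before)
print("after VACUUM: ", after)
```

The my_i and name values are untouched either way; only the hidden rowid is allowed to move.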

You must read all the data from the database into a new array / list. After that, you must delete the table and rewrite all the data from the array / list to the database.
Check it:
https://stackoverflow.com/a/57862686/8363647
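A rough sketch of that read-everything-then-rewrite idea, using the tmp table from the question. The helper name renumber_ids is made up, and it empties the table with DELETE instead of dropping it, which is my simplification. Note that with id INTEGER PRIMARY KEY the id values themselves are renumbered too (id is the rowid), so anything else that stores the old ids would have to be updated as well.

```python
import sqlite3


def renumber_ids(conn):
    """Rewrite tmp so that its ids run 1..n in their current order."""
    # Read all rows into a list first.
    rows = conn.execute("SELECT name FROM tmp ORDER BY id").fetchall()
    with conn:  # commits on success, rolls back if anything fails
        conn.execute("DELETE FROM tmp")
        conn.executemany(
            "INSERT INTO tmp (id, name) VALUES (?, ?)",
            [(new_id, name) for new_id, (name,) in enumerate(rows, start=1)],
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tmp (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany("INSERT INTO tmp (name) VALUES (?)", [(c,) for c in "abcde"])
conn.execute("DELETE FROM tmp WHERE id IN (3, 4)")
conn.commit()

renumber_ids(conn)
result = conn.execute("SELECT rowid, id, name FROM tmp ORDER BY id").fetchall()
print(result)  # [(1, 1, 'a'), (2, 2, 'b'), (3, 3, 'e')]
```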

I don't get why there is so much hesitance in illustrating the answer here.
If there are any tips or specific examples y'all could provide on how, why, or when to be wary of this usage, we'd all appreciate it.
Here is how I solved my problem, similar to OP.
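# Bind the id as a parameter instead of formatting it into the SQL string
c.execute("DELETE FROM customers WHERE rowid = ?", (id,))
print(f"deleted {id}")
conn.commit()
# VACUUM cannot run inside a transaction, so commit first
c.execute("VACUUM")
conn.close()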

Related

Database cache in SQL Or correcting autoincrement [duplicate]

This question already has answers here:
How to get rid of gaps in rowid numbering after deleting rows?
(4 answers)
Closed 5 months ago.
I've created 2 rows in a table in SQL (sqlite3 on cmd) and then deleted 1 of them.
CREATE TABLE sample1( name TEXT, id INTEGER PRIMARY KEY AUTOINCREMENT);
INSERT INTO sample1 VALUES ('ROY',1);
INSERT INTO sample1(name) VALUES ('RAJ');
DELETE FROM sample1 WHERE id = 2;
Later when I inserted another row, its id was given 3 by the system instead of 2.
INSERT INTO sample1 VALUES ('AMIE',NULL);
SELECT * FROM sample1;
picture of table
How do I correct it so the next values are given right id's automatically? Or how do I clear the sql database cache to solve it?

The simplest fix for the problem you describe is to omit AUTOINCREMENT.
The result of your test would then be as you wish.
However, the rowid (which the id column is an alias of, if INTEGER PRIMARY KEY is specified, with or without AUTOINCREMENT), will still be generated and probably be 1 higher than the highest existing id (alias of rowid).
There is a subtle difference between using and not using AUTOINCREMENT.
Without AUTOINCREMENT, the generated value of the rowid, and therefore of its alias, will be the highest existing rowid for the table plus 1 (not absolutely guaranteed, though).
With AUTOINCREMENT, the generated value will be 1 plus the higher of:-
the highest existing rowid, or
the highest used rowid (which, in some circumstances, may have only existed briefly)
In your example, as 2 had been used, the next id is 2 + 1 = 3, even though 2 had been deleted.
Using AUTOINCREMENT is inefficient because knowing the last used value requires a system table, sqlite_sequence, which has to be accessed both to store the latest id and to retrieve it.
The SQLite AUTOINCREMENT documentation, says this:-
The AUTOINCREMENT keyword imposes extra CPU, memory, disk space, and disk I/O overhead and should be avoided if not strictly needed. It is usually not needed.
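A small sketch of that bookkeeping, run through Python's sqlite3 module against the sample1 table from the question (an in-memory database is my choice for the demo). It shows the id jumping to 3 and the value that sqlite_sequence keeps for the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sample1 (name TEXT, id INTEGER PRIMARY KEY AUTOINCREMENT)"
)
conn.execute("INSERT INTO sample1 VALUES ('ROY', 1)")
conn.execute("INSERT INTO sample1 (name) VALUES ('RAJ')")  # gets id 2
conn.execute("DELETE FROM sample1 WHERE id = 2")
conn.execute("INSERT INTO sample1 VALUES ('AMIE', NULL)")  # gets id 3, not 2

rows = conn.execute("SELECT name, id FROM sample1 ORDER BY id").fetchall()
# sqlite_sequence stores the highest id ever used per AUTOINCREMENT table.
sequence = conn.execute(
    "SELECT name, seq FROM sqlite_sequence WHERE name = 'sample1'"
).fetchall()

print(rows)      # [('ROY', 1), ('AMIE', 3)]
print(sequence)  # [('sample1', 3)]
```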
There are other differences. For example, with AUTOINCREMENT, once the id 9223372036854775807 has been reached, another insert will result in an SQLITE_FULL error, whereas without AUTOINCREMENT SQLite will look for an unused id instead (there would be one, as current-day storage devices could not hold that number of rows).
The intention of ids (rowids) is to uniquely identify a row and to make accessing that row by its id efficient. The intention is not for them to be used as a sequence/order. Using an id as a sequence/order number will very likely result in unanticipated sequences, or in inefficient overheads from trying to maintain such a sequence/order.
You should always consider that rows are unordered unless specifically ordered by a clause that orders the output, such as an ORDER BY clause.
However, if you take your example a little further, omitting AUTOINCREMENT will probably still result in order/sequence issues: if, for example, the row with an id of 1 were deleted instead of 2, you would end up with ids of 2 and 3.
Perhaps consider the following, which shows a) how the limited issue you have posed is solved without AUTOINCREMENT, and b) that it is not the solution if the deleted id is not the highest:-
DROP TABLE IF EXISTS sample1;
CREATE TABLE IF NOT EXISTS sample1( name TEXT, id INTEGER PRIMARY KEY);
INSERT INTO sample1 VALUES ('ROY',1);
INSERT INTO sample1(name) VALUES ('RAJ');
DELETE FROM sample1 WHERE id = 2;
INSERT INTO sample1 VALUES ('AMIE',NULL);
/* Result 1 */
SELECT * FROM sample1;
/* BUT if a lower than the highest id is deleted */
DELETE FROM sample1 WHERE id=1;
INSERT INTO sample1 VALUES ('EMMA',NULL);
/* Result 2 */
SELECT * FROM sample1;
Result 1 (your exact issue resolved)
Result 2 (if not the highest id deleted)
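For anyone who wants to reproduce the two result sets, the same statements can be run from Python's sqlite3 module (a sketch; the in-memory database and variable names are mine):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample1 (name TEXT, id INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO sample1 VALUES ('ROY', 1)")
conn.execute("INSERT INTO sample1 (name) VALUES ('RAJ')")
conn.execute("DELETE FROM sample1 WHERE id = 2")
conn.execute("INSERT INTO sample1 VALUES ('AMIE', NULL)")
# Result 1: the highest id was deleted, so it is reused.
result_1 = conn.execute("SELECT name, id FROM sample1 ORDER BY id").fetchall()

conn.execute("DELETE FROM sample1 WHERE id = 1")
conn.execute("INSERT INTO sample1 VALUES ('EMMA', NULL)")
# Result 2: a lower id was deleted, so the gap at 1 stays.
result_2 = conn.execute("SELECT name, id FROM sample1 ORDER BY id").fetchall()

print(result_1)  # [('ROY', 1), ('AMIE', 2)]
print(result_2)  # [('AMIE', 2), ('EMMA', 3)]
```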

Oracle Enforce Uniqueness

I need to enforce uniqueness on specific data in a table (~10 million rows). This example data illustrates the rule:
For code=X, the part# cannot be duplicated. For any other code, there can be duplicate part# values, e.g. the ID 8 row can't be there, but the ID 6 row is fine. There are several different codes and part# values in the table, but uniqueness is desired only for code=X.
ID CODE PART#
1 A R0P98
2 X R9P01
3 A R0P98
4 A R0P44
5 X R0P44
6 A R0P98
7 X T0P66
8 X T0P66
The only way I see is to create a trigger on the table and check for PART# for code=X before insert or update. However, I fear this solution may slow down inserts and updates on this table.
Appreciate your help!

In Oracle, you can create a unique index on an expression for this:
create unique index myidx
on mytable (case when code = 'X' then part# end);
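The trick is that the CASE expression is NULL for every code other than X, and rows whose index key is entirely NULL never collide in the unique index. I don't have an Oracle instance to paste output from, so here is the same idea sketched against SQLite via Python's sqlite3 module, purely as a stand-in: SQLite also accepts unique indexes on expressions and treats NULLs as distinct. The column is called part_no here (an assumption, since part# would need quoting), and try_insert is a helper made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, code TEXT, part_no TEXT)"
)
# Only rows with code = 'X' produce a non-NULL key in the unique index.
conn.execute(
    "CREATE UNIQUE INDEX myidx "
    "ON mytable (CASE WHEN code = 'X' THEN part_no END)"
)


def try_insert(code, part_no):
    """Insert one row; return False if the unique index rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO mytable (code, part_no) VALUES (?, ?)",
                (code, part_no),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# The sample rows from the question, in ID order 1..8.
sample = [
    ("A", "R0P98"), ("X", "R9P01"), ("A", "R0P98"), ("A", "R0P44"),
    ("X", "R0P44"), ("A", "R0P98"), ("X", "T0P66"), ("X", "T0P66"),
]
accepted = [try_insert(code, part_no) for code, part_no in sample]
print(accepted)  # only the last row (ID 8) is rejected
```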

SELECT statement to select from multiple tables referenced by ROWIDs

I have a small SQLITE 3 database accessed by AutoIt. Works all great, but now I need a more complex statement and maybe I now regret that I have referenced tables using only the ROWID instead of particular ID fields...
This is the configuration:
Table 1 Person
Name (string)
Initials (string)
Table 2 Projekte
Description (string)
Person (containing the ROWID of table Person)
Table 3 Planungen
ProjID (contains ROWID of table Projekte)
PlID (numeric, main selection identifier)
(plus some other fields that do not matter)
Initially, I only needed to read all data from table 3 Planungen filtered by a specific PlID. I did that successfully by using:
SELECT ROWID,* FROM Planungen WHERE PlID=[FilterValue1] ORDER BY ROWID;
Works great.
Now, I need to SELECT only a subset of these records, where PlID=[FilterValue1] and where ProjID points to a table 2 Projekte entry that satisfies Projekte.Person=[FilterValue2]. So I do not even need table 1 (Person), just 2 and 3.
I thought I could do it that way (now it becomes obvious, I am SQL idiot):
SELECT ROWID,* FROM Planungen p, Projekte pj WHERE pj.Person=[FilterValue2] and p.ProjID=pj.ROWID and p.PlID=[FilterValue1] ORDER BY ROWID;
That runs into an SQLite error telling me that there is no such column ROWID. Oops! Really? How can that be? I can't use ROWID in the WHERE clause?? Well, it probably won't do what I intend anyway.
Can someone please help me? Can this be done without changing the database structure and introducing ID fields?
It would be great if the output of the SELECT would be identical to the first, working SELECT command, just with the additional "filtering" applied.

You really should add a proper INTEGER PRIMARY KEY column to your tables. (The implicit rowid might be changed by a VACUUM.)
Anyway, this query fails because the column name rowid is ambiguous. Replace it with pj.rowid (or whatever table you want to access).
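To get the same columns as the original single-table SELECT, the rowid and the * can both be qualified with the Planungen alias. A sketch via Python's sqlite3 module; the sample rows and filter values are invented, and the explicit JOIN is just my spelling of the same comma join:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Projekte (Description TEXT, Person INTEGER);
    CREATE TABLE Planungen (ProjID INTEGER, PlID INTEGER);
    -- Made-up sample data: project 1 belongs to person 1, project 2 to person 2.
    INSERT INTO Projekte VALUES ('alpha', 1), ('beta', 2);
    INSERT INTO Planungen VALUES (1, 10), (2, 10), (1, 20);
""")

# Every rowid is qualified, and p.* keeps only the Planungen columns.
query = """
    SELECT p.ROWID, p.*
    FROM Planungen AS p
    JOIN Projekte AS pj ON p.ProjID = pj.ROWID
    WHERE pj.Person = ? AND p.PlID = ?
    ORDER BY p.ROWID
"""
rows = conn.execute(query, (1, 10)).fetchall()  # Person = 1, PlID = 10
print(rows)  # [(1, 1, 10)]
```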

Indexes and multi column primary keys

In a MySQL database I have a table with the following primary key
PRIMARY KEY id (invoice, item)
In my application I will also frequently be selecting on item by itself and less frequently on only invoice. I'm assuming I would benefit from indexes on these columns.
MySQL does not complain when I define the following:
INDEX (invoice),
INDEX (item),
PRIMARY KEY id (invoice, item)
But I don't see any evidence (using DESCRIBE -- the only way I know how to look) that separate indexes have been established for these two columns.
Are the columns that make up a primary key automatically indexed individually? Is there a better way than DESCRIBE to explore the structure of my table?

I'm not intimately familiar with the internals of indexes in MySQL, but in the two database products that I am familiar with (MS SQL Server, Oracle), indexes are balanced-tree structures whose nodes are organized as a sequenced tuple of the columns the index is defined on (in the sequence defined).
So, unless MySQL does it very differently (probably not), any composite index (on more than one column) can be used by any query that needs to filter or sort by a subset of the index's columns, as long as that subset is a leading prefix of the index: taken in index order, the columns must start at the beginning of the index sequence, with no gaps except at the end.
In other words, if you have an index on (a,b,c,d), a query that filters on (a), (a,b), or (a,b,c) can also use the index, but a query that needs to filter on (b), or (c), or (b,c) will not be able to use the index.
So in your case, if you often need to filter or sort on column item alone, you need to add another index on that column by itself.
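As a toy model of that prefix rule (not how any real optimizer is implemented, and ignoring features such as index-only scans), a few lines of Python make the cases concrete. The function name usable_prefix is made up for the illustration.

```python
def usable_prefix(index_columns, filter_columns):
    """Return the leading index columns that the filter columns cover."""
    prefix = []
    for column in index_columns:
        if column not in filter_columns:
            break  # the first gap ends the usable part of the index
        prefix.append(column)
    return prefix


index = ["a", "b", "c", "d"]
for filters in (["a"], ["a", "b"], ["a", "b", "c"], ["b"], ["c"], ["b", "c"]):
    print(filters, "->", usable_prefix(index, filters))

# The case from the question: PRIMARY KEY (invoice, item), filtering on item.
print(usable_prefix(["invoice", "item"], ["item"]))  # []
```

An empty list means the query cannot seek into the index at all, which is the situation for item on its own.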

I personally use phpMyAdmin to view and edit the structure of MySQL databases. It is a web application but it runs well enough on a local web server (I run an instance of apache on my machine for this and phpPgAdmin).
As for the composite key of (invoice, item), it acts like an index for (invoice, item) and for invoice. If you want to index by just item you have to add that index yourself. Your PK will be sorted by invoice and then by item where invoice is the same in multiple records. While the order in a composite PK does not matter for uniqueness enforcement, it does matter for access.
On your table I would use:
PRIMARY KEY id (invoice, item), INDEX (item)

I'm not that familiar with MySQL, but generally a multiple-column index is as useful on the first column in the index as an index on that column alone. The multiple-column index becomes less useful for querying against a single column the further that column appears into the index.
This makes some sense if you think of the multi-column index as a hierarchy. The first column in the index is the root of the hierarchy, so searching it is just a matter of scanning that first level. However, in order to scan the second column, the database has to look up the tree for each unique value found in the first column. This can be costly enough that most optimizers won't bother to look deeply into a multi-column index, instead opting to full-table-scan.
For example, if you have a table as follows:
Col1 |Col2 |Col3
----------------
A | 1 | Z
A | 2 | Y
A | 2 | X
B | 1 | Z
B | 2 | X
Assuming you have an index on all three columns, in order, the tree will look something like this:
A
+-1
  +-Z
+-2
  +-X
  +-Y
B
+-1
  +-Z
+-2
  +-X
Looking for Col1='A' is easy: you only have to look at 2 ordered values. However, to resolve col3='X', you have to look at all of the values in the 4 bigger buckets, each of which is ordered individually.
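The same point can be shown with the five example rows kept as a sorted list of tuples in Python. This is only an analogy for an ordered index, not a model of a real B-tree; the function names are invented.

```python
from bisect import bisect_left

# The five example rows, kept in index order (Col1, Col2, Col3).
index_entries = sorted([
    ("A", 1, "Z"), ("A", 2, "Y"), ("A", 2, "X"), ("B", 1, "Z"), ("B", 2, "X"),
])


def find_by_col1(value):
    """Binary-search the contiguous run of entries whose Col1 equals value."""
    start = bisect_left(index_entries, (value,))
    # (value, inf) sorts after every entry with this Col1, as Col2 is numeric.
    stop = bisect_left(index_entries, (value, float("inf")))
    return index_entries[start:stop]


def find_by_col3(value):
    """Col3 is not ordered on its own, so every entry has to be inspected."""
    return [entry for entry in index_entries if entry[2] == value]


print(find_by_col1("A"))  # [('A', 1, 'Z'), ('A', 2, 'X'), ('A', 2, 'Y')]
print(find_by_col3("X"))  # [('A', 2, 'X'), ('B', 2, 'X')]
```

The Col1 lookup jumps straight to one slice of the list, while the Col3 lookup walks all five entries.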

To return table index information, you can use:
SHOW INDEX FROM <table>;
See: http://dev.mysql.com/doc/refman/5.0/en/show-index.html
To view table information:
SHOW CREATE TABLE <table>;
See: http://dev.mysql.com/doc/refman/5.0/en/show-create-table.html
Primary keys are indexes, so there's no need to create an additional index on the same columns. You can find out more information about them under the CREATE TABLE syntax (there's too much to insert here):
http://dev.mysql.com/doc/refman/5.0/en/create-table.html

There is a difference between composite index and composite primary key.
If you have defined a composite index like below
INDEX idx(invoice,item)
the index won't work if you query based on item alone, and you need to add a separate index
INDEX itemidx(item)
But, if you have defined a composite primary key like below
PRIMARY KEY(invoice, item)
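the primary key can still be picked for a query on item, though only as a full index scan (type index with possible_keys NULL in the EXPLAIN output below) rather than a lookup, so a separate index on item is still worth adding on a large table.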
Working example:
mysql> create table test ( col1 int(20), col2 int(20), primary key(col1,col2) );
mysql>explain select * from test where col2 = 1;
+----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
| 1 | SIMPLE | test | index | NULL | PRIMARY | 8 | NULL | 10 | Using where; Using index |
+----+-------------+-------+-------+---------------+---------+---------+------+------+--------------------------+

MySQL automatically creates an index for composite keys. Depending on your queries, you may have to create a separate index for an individual column in the composite key.
If you are using MySQL Workbench, you can right-click the schema and click on edit to see everything about the table.

If your query is using both columns in where clause then you don't need to create a separate index in a composite primary key.
EXPLAIN SELECT * FROM `table` WHERE invoice = 1 and item = 1
You are also fine if you want to query with first column only
EXPLAIN SELECT * FROM `table` WHERE invoice = 1
But if you want to query with subsequent columns col2, col3 in composite PK then you would need to create separate indexes on those columns. The following explain query shows the second column does not have a possible key detected by MySQL
EXPLAIN SELECT * FROM `table` WHERE item = 1

What's a MySQL index table?

I need to speed up a query. Is an index table what I'm looking for? If so, how do I make one? Do I have to update it each insert?
Here are the table schemas:
--table1-- | --tableA-- | --table2--
id | id | id
attrib1 | t1id | attrib1
attrib2 | t2id | attrib2
| attrib1 |
And the query:
SELECT
table1.attrib1,
table1.attrib2,
tableA.attrib1
FROM
table1,
tableA
WHERE
table1.id = tableA.t1id
AND (tableA.t2id = x or ... or tableA.t2id = z)
GROUP BY
table1.id

You need to create a composite index on tableA:
CREATE INDEX ix_tablea_t1id_t2id ON tableA (t1id, t2id)
Indexes in MySQL are considered a part of a table: they are updated automatically, and used automatically whenever the optimizer decides it's a good move to use them.
MySQL does not use the term index table.
This term is used by Oracle to refer to what other databases call CLUSTERED INDEX: a kind of table where the records themselves are arranged according to the value of a column (or a set of columns).
In MySQL:
When you use MyISAM storage, an index is created as a separate file that has .MYI extension.
The contents of this file represent a B-Tree, each leaf containing the index key and a pointer to the offset in .MYD file which contains the data.
The size of the pointer is determined by the server setting called myisam_data_pointer_size, which can vary from 2 to 7 bytes, and defaults to 6 since MySQL 5.0.6.
This allows creating MyISAM tables up to 2 ^ (8 * 6) bytes = 256 TB
In InnoDB, all tables are inherently ordered by the PRIMARY KEY; it does not support heap-organized tables.
Each index, therefore, is in fact just a plain InnoDB table consisting of a single PRIMARY KEY of N+M columns: N columns holding the indexed value, and M columns holding the PRIMARY KEY of the main table record that holds the indexed data.
