Sqlite - composite PK with two auto-incrementing values [duplicate]

Sqlite - composite PK with two auto-incrementing values [duplicate] - sql

I have a composite primary key {shop_id, product_id} for SQLite
Now, I want an auto-increment value for product_id which resets to 1 if shop id is changed. Basically, I want auto-generated composite key
e.g.
Shop ID Product Id
1 1
1 2
1 3
2 1
2 2
3 1
Can I achieve this with auto-increment? How?

Normal Sqlite tables are B*-trees that use a 64-bit integer as their key. This is called the rowid. When inserting a row, if a value is not explicitly given for this, one is generated. An INTEGER PRIMARY KEY column acts as an alias for this rowid. The AUTOINCREMENT keyword, which can only be used on said INTEGER PRIMARY KEY column, contrary to the name, merely alters how said rowid is calculated - if you leave out a value, one will be created whether that keyword is present or not, because it's really the rowid and must have a number. Details here. (rowid values are generally generated in increasing, but not necessarily sequential, order, and shouldn't be treated like a row number or anything like that, btw).
Any primary key other than a single INTEGER column is treated as a unique index, while the rowid remains the true primary key (Unless it's a WITHOUT ROWID table), and is not autogenerated. So, no, you can't (easily) do what you want.
I would probably work out a database design where you have a table of shops, a table of products, each with their own ids, and a junction table that establishes a many-to-many relation between the two. This keeps the product id the same between stores, which is probably going to be less confusing to people - I wouldn't expect the same item to have a different SKU in two different stores of the same chain, for instance.
Something like:
CREATE TABLE stores(store_id INTEGER PRIMARY KEY
, address TEXT
-- etc
);
CREATE TABLE product(prod_id INTEGER PRIMARY KEY
, name TEXT
-- etc
);
CREATE TABLE inventory(store_id INTEGER REFERENCES stores(store_id)
, prod_id INTEGER REFERENCES product(prod_id)
, PRIMARY KEY(store_id, prod_id)) WITHOUT ROWID;

Related

Can I use identity for primary key in more than one table in the same ER model

As it is said in the title, my question is can I use int identity(1,1) for primary key in more than one table in the same ER model? I found on Internet that Primary Key need to have unique value and row, for example if I set int identity (1,1) for table:
CREATE TABLE dbo.Persons
(
Personid int IDENTITY(1,1) PRIMARY KEY,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int
);
GO
and the other table
CREATE TABLE dbo.Job
(
jobID int IDENTITY(1,1) NOT NULL PRIMARY KEY,
nameJob NVARCHAR(25) NOT NULL,
Personid int FOREIGN KEY REFERENCES dbo.Persons(Personid)
);
Wouldn't Personid and jobID have the same value and because of that cause an error?

Constraints in general are defined and have a scope of one table (object) in the database. The only exception is the FOREIGN KEY which usually has a REFERENCE to another table.
The PRIMARY KEY (or any UNIQUE key) sets a constraint only on the table it is defined on and is not affecting or is not affected by other constraints on other tables.
The PRIMARY KEY defines a column or a set of columns which can be used to uniquely identify one record in one table (and none of the columns can hold NULL, UNIQUE on the other hand allows NULLs and how it is treated might differ in different database engines).
So yes, you might have the same value for PersonID and JobID, but their meaning is different. (And to select the one unique record, you will need to tell SQL Server in which table and in which column of that table you are looking for it, this is the table list and the WHERE or JOIN conditions in the query).
The query SELECT * FROM dbo.Job WHERE JobID = 1; and SELECT * FROM dbo.Person WHERE PersonID = 1; have a different meaning even when the value you are searching for is the same.
You will define the IDENTITY on the table (the table can have only one IDENTITY column). You don't need to have an IDENTITY definition on a column to have the value 1 in it, the IDENTITY just gives you an easy way to generate unique values per table.
You can share sequences across tables by using a SEQUENCE, but that will not prevent you to manually insert the same values into multiple tables.
In short, the value stored in the column is just a value, the table name, the column name and the business rules and roles will give it a meaning.
To the notion "every table needs to have a PRIMARY KEY and IDENTITY, I would like to add, that in most cases there are multiple (independent) keys in the table. Usually every entity has something what you can call business key, which is in loose terms the key what the business (humans) use to identify something. This key has very similar, but usually the same characteristics as a PRIMARY KEY with IDENTITY.
This can be a product's barcode, or the employee's ID card number, or something what is generated in another system (say HR) or a code which is assigned to a customer or partner.
These business keys are useful for humans, but not always useful for computers, but they could serve as PRIMARY KEY.
In databases we (the developers, architects) like simplicity and a business key can be very complex (in computer terms), can consist of multiple columns, and can also cause performance issues (comparing a strings is not the same as comparing numbers, comparing multiple columns is less efficient than comparing one column), but the worst, it might change over time. To resolve this, we tend to create our own technical key which then can be used by computers more easily and we have more control over it, so we use things like IDENTITYs and GUIDs and whatnot.

sqlite text as primary key vs autoincrement integers

I'm currently debating between two strategies to using a text column as a key.
The first one is to simply use the text column itself as a key, as such:
create table a(
key_a text primary key,
)
create table b(
key_b text primary key,
)
create table c(
key_a text,
key_b text,
foreign key("key_a") references a("key_a"),
foreign key("key_b") references b("key_b")
)
I'm concerned that this would result in every key being duplicated, once in a and b and another in c, since text isn't stored inline.
My second approach is to use an autoincrement id on the first two tables as a primary key, and use those ids on table c to refer to them, as such:
create table a(
id_a integer,
key_a text unique,
primary key("id_a" autoincrement)
)
create table b(
id_b integer,
key_b text unique,
primary key("id_a" autoincrement)
)
create table c(
id_a integer,
id_b integer,
foreign key("id_a") references a("id_a"),
foreign key("id_b") references b("id_b")
)
Am I right to be concerned about text duplication in the first case? Or does sqlite somehow intern these and just use an id for both, akin to what the second strategy does?

SQLite does not automatically compress text. So the answer to your question is "no".
Should you use text or an auto-incrementing id as the primary key? This can be a complex question. But happily, the answer is that it doesn't make much difference. That said, there are some considerations:
Integers are of fixed length. In general, fix length keys are slightly more efficient in B-tree indexes than variable length keys.
If the strings are short (like 1 or 2 or 3 characters), then they may be shorter -- or no longer -- than integers.
If you change the string (say, if it is originally misspelled), then using an "artificial" primary key makes this easy: just change the value in one table. Using the string itself as a key can result in lots of updates to lots of tables.

Am I right to be concerned about text duplication in the first case?
Or does sqlite somehow intern these and just use an id for both, akin
to what the second strategy does?
Yes, you are right to be concerned. The text will be duplicated.
Also, even if you did not define an integer primary key in your 1st approach, there is one.
From Rowid Tables:
The PRIMARY KEY of a rowid table (if there is one) is usually not the
true primary key for the table, in the sense that it is not the unique
key used by the underlying B-tree storage engine. The exception to
this rule is when the rowid table declares an INTEGER PRIMARY KEY. In
the exception, the INTEGER PRIMARY KEY becomes an alias for the rowid.
The true primary key for a rowid table (the value that is used as the
key to look up rows in the underlying B-tree storage engine) is the
rowid.
In your 2nd approach actually you are not creating a new column in each of the tables a and b by defining an integer primary key.
What you are doing is aliasing the existing rowid column:
id_a becomes the alias of rowid of the table a
id_b becomes the alias of rowid of the table b.
So, defining these integer primary keys is not more expensive in terms of space in the parent tables.
Although with your 1st approach you can avoid explicit updates in the child tables when you update a value in the parent tables by defining the foreign keys with ON UPDATE CASCADE, your 2nd approach is what I would suggest.
An integer primary key with a value assigned to it by the system and you don't even have to know or worry about it is common practice.
All you have to do is use that primary key and its corresponding foreign keys in the queries that you create to access the parent tables when you want to fetch from them the text values.

For performance (also it is a good db practice) you should stick to numeric/int value for the Primary Key.
As for the second approach, I'm not getting the concept you are after. Could you elaborate more on this?

Foreign and Primary Key Conceptual Questions

I am a newbie at SQL/PostgreSQL, and I had a conceptual question about foreign keys and keys in general:
Let's say I have two tables: Table A and Table B.
A has a bunch of columns, two of which are A.id, A.seq. The primary key is btree(A.id, A.seq) and it has a foreign key constraint that A.id references B.id. Note that A.seq is a sequential column (first row has value 1, second has 2, etc).
Now say B has a bunch of columns, one of which is the above mentioned B.id. The primary key is btree(B.id).
I have the following questions:
What exactly does btree do? What is the significance of having two column names in the btree rather than just one (as in btree(B.id)).
Why is it important that A references B instead of B referencing A? Why does order matter when it comes to foreign keys??
Thanks so much! Please correct me if I used incorrect terminology for anything.
EDIT: I am using postgres

A btree index stored values in sorted order, which means you can not only search for a single primary key value, but you can also efficiently search for a range of values:
SELECT ... WHERE id between 6060842 AND 8675309
PostgreSQL also supports other index types, but only btree is supported for a unique index (for example, the primary key).
In your B table, the primary key being a single column id means that only one row can exist for each value in id. In other words, it is unique, and if you search for a value by primary key, it will find at most one row (it may also find zero rows if you don't have a row with that value).
The primary key in your A table is for (id, seq). This means you can have multiple rows for each value of id. You can also have multiple rows for each value of seq as long as they are for different id values. The combination must be unique though. You can't have more than one row with the same pair of values.
When a foreign key in A references B, it means that the row must exist in B before you are allowed to store the row in A with the same id value. But the reverse is not necessary.
Example:
Suppose B is for users and A is for phones. You must store a user before you can store a phone record for that user. You can store one or more phones for that user, one per row in the A table. We say that a row in A therefore references the user row in B, meaning, "this phone belongs to user #1234."
But the reverse is not restricted. A user might have no phones (at least not known to this database), so there is no requirement for B to reference A. In other words, it is not required for a user to have a phone. You can store a row in B (the user) even if there is no row in A (the phones) for that user.
The reference also means you are not allowed to DELETE FROM B WHERE id = ? if there are rows in another table that reference that given row in B. Deleting that user would cause those other rows to become orphaned. No one would be able to know who those phones belonged to, if the user row they reference is deleted.

To your questions:
There are several strategies to implement unique keys. The most common one is to use an index that is "unique" using a "b-tree" strategy. That's what "btree" means in PostgreSQL.
Having two columns in a key just depends on how you want to design your table. When you have a key with more than one column that is called a "composite key".
When A references B, the columns in B must represent a "key". The columns in A do not represent a key, but just a reference to one. In fact the values in A for that column can be repeated; that is, multiple rows in A can point to the same row in B.

Your data structure makes no sense. Why would the primary key of A haver both id and name? Normally it would just be id. In some data models, you might have a version or timestamp added. I can't think of a reasonable data model where name would also be included.
In addition, B's foreign key would have to be to both id and name.
But, your question is what is btree for? Most databases don't have such an option. A primary key would typically be expressed as:
id int primary key;
constraint unq_t_id primary key (id);
btree is a type of index -- in fact the default type of index in all databases that I'm aware of. Databases that have a plethora of available types of indexes -- such as Postgres -- you can specify the index type associated with the primary key.

Is there a way to auto increment a column with respect to the foreign key in DB2?

Say I have an address table with the addresses of different facilities of a manufacturing company.
The foreign key lets me know which company the addresses belong to, but i need a surrogate id to differentiate between each facility. This id should increment automatically based on the foreign key value.
Note : I just need simple integer values for keys.
eg:
My table has the following columns, ORGANIZATION_ID is the foreign key.
FACILITY_ID is a second surrogate key dependednt on the foreign key.
ADDRESS_TABLE
->ORGANIZATION_ID
->FACILITY_ID
->ADDRESS_LINE_1
->ADDRESS_LINE_2
->CITY
->STATE
->ZIP_CODE
I want the facility id to increase automatically from 1 depending on the
organization id. i.e
ORGANIZATION_ID 1
FACILITY_ID 1
When I insert data for new organization, facility should start from 1
ORGANIZATION_ID 2
FACILITY_ID 1
Next time I insert data for the same organization, my facility id should increment accordingly -
ORGANIZATION_ID 1
FACILITY_ID 2
Is there any way to make this happen in DB2?
I'm currently on DB2 V 10.5.6

No. Auto-increment, or Identity keys as DB2 calls them, don't support composite keys.
Best you could do would be to have a on insert trigger that handled assigning the values you want. Possibly making use of a SEQUENCE; though you'd have to create a new sequence to use for the facilities of each new organization.

sqlite: how can I add an autoincrementing id to an existing table?

I would like to add an autoincrementing integer field called uid to an existing table assoc, but it doesn't look like I can do that unless it's a primary key.
I have fields local_id and remote_id which are the existing primary key pair, and I do that so that I can INSERT OR IGNORE INTO assoc so that I don't get duplicate primary keys, but if I have a pair of columns as a primary key, I can't seem to use them as an update (see other SO question).
Could anyone suggest how to restructure the table (and implement that restructuring using ALTER TABLE) so that I can get the behavior I need:
a single autoincrementing key, so I can use that for UPDATEs
a pair of fields local_id and remote_id so that the pair (local_id, remote_id) remains unique in the table

In this case, you could drop the primary key on your existing columns, create the new primary key integer autoincrementing column, then create a UNIQUE index on the other two columns.

Aha, I don't need to -- there's a builtin rowid column.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas