Why can a primary key column be empty in SQLite?

I have a table with a single primary key column (nvarchar). In the designer it is marked as the primary key and does not allow nulls. Somehow, though, there is a row in that table whose key value is empty. It is of course not null, it doesn't duplicate any other value in the primary key column, and no conflict or violation occurs.
However, as far as I know, that kind of value (empty) is not allowed for a primary key column in SQL Server. Is there an option I can turn on to make it work properly, or do I have to check the value myself through a CHECK constraint or in C# code (before updating)?
Your help would be highly appreciated. Thanks!

"Empty-string" is a string with length of zero. It is NOT NULL, so it doesn't violate a null check. It most certainly IS an allowed value for a character-based primary key column in SQL Server. If the empty string value is not allowed by the business, a check constraint would be the best way to implement this as a business rule. That way, clients which might not know about the rule can't violate it.
This code runs without violations in SQL Server; I tested it to be sure.
create table TestTable (
    myKey varchar(10) primary key,
    myData int
)
GO
insert TestTable
select '', 1
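Since the title asks about SQLite, here is a minimal sketch of the same test there, run through Python's built-in sqlite3 module against an in-memory database. The explicit NOT NULL on the key and the helper name empty_string_key_demo are additions for illustration, not part of the answer above.

```python
import sqlite3

def empty_string_key_demo():
    """Return (key_is_null, duplicate_rejected) for an empty-string key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TestTable (myKey TEXT PRIMARY KEY NOT NULL, myData INTEGER)"
    )
    # The empty string is a real value, so NOT NULL does not object to it.
    conn.execute("INSERT INTO TestTable VALUES ('', 1)")
    key_is_null = conn.execute("SELECT myKey IS NULL FROM TestTable").fetchone()[0]
    try:
        # A second empty string is a duplicate key like any other value.
        conn.execute("INSERT INTO TestTable VALUES ('', 2)")
        duplicate_rejected = False
    except sqlite3.IntegrityError:
        duplicate_rejected = True
    conn.close()
    return bool(key_is_null), duplicate_rejected
```

The first insert goes through and the second one fails on uniqueness, which matches what the answer describes for SQL Server.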

Related

Does defining a PRIMARY KEY constraint already ensure that the column values are unique and not null, or do you have to define that separately?

Yes, with one exception: SQLite.
See https://sqlite.org/lang_createtable.html
Each row in a table with a primary key must have a unique combination of values in its primary key columns. For the purposes of determining the uniqueness of primary key values, NULL values are considered distinct from all other values, including other NULLs. If an INSERT or UPDATE statement attempts to modify the table content so that two or more rows have identical primary key values, that is a constraint violation.
According to the SQL standard, PRIMARY KEY should always imply NOT NULL. Unfortunately, due to a bug in some early versions, this is not the case in SQLite. Unless the column is an INTEGER PRIMARY KEY or the table is a WITHOUT ROWID table or the column is declared NOT NULL, SQLite allows NULL values in a PRIMARY KEY column. SQLite could be fixed to conform to the standard, but doing so might break legacy applications. Hence, it has been decided to merely document the fact that SQLite allows NULLs in most PRIMARY KEY columns.
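The quoted behaviour is easy to check for yourself. This is a small sketch using Python's built-in sqlite3 module and an in-memory database; the table names loose and guarded and the function name are made up for the example.

```python
import sqlite3

def null_primary_key_counts():
    """Return (null_keys_stored_in_loose, null_rejected_by_guarded)."""
    conn = sqlite3.connect(":memory:")
    # Ordinary rowid table, key is not INTEGER and not declared NOT NULL.
    conn.execute("CREATE TABLE loose (k TEXT PRIMARY KEY, v INTEGER)")
    # Same table, but with NOT NULL spelled out on the key column.
    conn.execute("CREATE TABLE guarded (k TEXT PRIMARY KEY NOT NULL, v INTEGER)")

    # NULLs count as distinct from each other, so both rows are accepted.
    conn.execute("INSERT INTO loose VALUES (NULL, 1)")
    conn.execute("INSERT INTO loose VALUES (NULL, 2)")
    null_keys = conn.execute(
        "SELECT COUNT(*) FROM loose WHERE k IS NULL"
    ).fetchone()[0]

    try:
        conn.execute("INSERT INTO guarded VALUES (NULL, 1)")
        rejected = False
    except sqlite3.IntegrityError:
        rejected = True
    conn.close()
    return null_keys, rejected
```

So in SQLite it is worth writing NOT NULL explicitly on non-integer primary key columns.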

Which constraint makes sure a column has some value entered?

I am confused between the PRIMARY KEY and NOT NULL constraints.

A NOT NULL constraint.
All columns that participate in a primary key must also disallow NULL, but the PK constraint guarantees something more: uniqueness, i.e. no two rows in the table can have the same value for the primary key.
In SQL Server, even though you can syntactically name a NOT NULL constraint in the DDL, it differs from other constraints in that no metadata (not even the name) is actually stored for the constraint itself.
CREATE TABLE T
(
    X INT CONSTRAINT NotNull NOT NULL
)

Another point I didn't see addressed: NULL and empty string are two very different things, but they are often deemed interchangeable by a large portion of the community.
You can declare a varchar column as NOT NULL but you can still do this:
DECLARE @x TABLE(y VARCHAR(32) NOT NULL);
INSERT @x(y) VALUES('');
So if your goal is to make sure there is a valid value that is neither NULL nor a zero-length string, you can also add a check constraint, e.g.
DECLARE @x TABLE(y VARCHAR(32) NOT NULL CHECK (DATALENGTH(LTRIM(y)) > 0));
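For anyone on SQLite rather than SQL Server, a rough equivalent can be sketched with Python's built-in sqlite3 module. Note the assumptions: SQLite has no DATALENGTH, so this uses length(trim(y)), which strips spaces on both sides rather than only leading ones, and the helper name try_insert is hypothetical.

```python
import sqlite3

def try_insert(value):
    """Return True if value is accepted by the NOT NULL + CHECK column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE x (y TEXT NOT NULL CHECK (length(trim(y)) > 0))"
    )
    try:
        conn.execute("INSERT INTO x (y) VALUES (?)", (value,))
        return True
    except sqlite3.IntegrityError:
        # Raised for both the NOT NULL and the CHECK violation.
        return False
    finally:
        conn.close()
```

With this in place, NULL, the empty string, and a string of only spaces are all refused, while an ordinary value goes in.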

NOT NULL is the condition that a field has a value. You can enforce that a field always has a value for every record inserted or updated by making the field NOT NULL in the table definition.
A primary key must meet these three conditions:
The values of the field are NOT NULL.
The values are unique.
The values are immutable.
The database can enforce the first two conditions with a unique index (and a not null condition on the field).
The third condition is not typically enforced by the database. Databases will typically allow changes to primary key fields, so DBAs can "fix" them. So the third condition is more philosophical, in that you agree to use the key for identification, and not write an application which changes the value, unless intended for an administrator to fix the keys.
I have been using field here, but a primary key can be a compound primary key, made up of any combination of fields which meets the conditions. Any combination of fields which matches the first 2 or all 3 conditions is called a candidate key.
Only one candidate key can be used as the primary key. Which one is just an arbitrary choice.

SQLite: autoincrement primary key questions

I have the following SQLite query:
CREATE TABLE Logs ( Id integer IDENTITY (1, 1) not null CONSTRAINT PKLogId PRIMARY KEY, ...
IDENTITY (1, 1) -> What does this mean?
PKLogId what is this? This doesn't seem to be defined anywhere
I want Id to be integer primary key with autoincrement. I would like to be able to insert into this Logs table omitting Id column in my query. I want Id to be automatically added and incremented. Is this possible? How can I do this?
At the moment when I try to insert without Id I get:
Error while executing query: Logs.Id may not be NULL

Judging by the syntax of your example, I'm not sure you're actually using SQLite.
If you are, you may be interested in SQLite FAQ #1: How do I create an AUTOINCREMENT field?:
Short answer: A column declared INTEGER PRIMARY KEY will autoincrement.

Change it to:
CREATE TABLE Logs ( Id integer PRIMARY KEY,....

If you are, you may be interested in SQLite FAQ #1: How do I create an AUTOINCREMENT field?:
Short answer: A column declared INTEGER PRIMARY KEY will auto increment.
This is not entirely accurate. An INTEGER PRIMARY KEY column does get a value assigned automatically, but without the AUTOINCREMENT keyword SQLite picks one more than the largest Id currently in the table, so the ids of deleted rows can be reused; if all rows are deleted, numbering starts from the beginning again. If associated records must stay tied to the correct row, add AUTOINCREMENT after the PRIMARY KEY declaration on the integer column.
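The difference can be seen with a short sketch using Python's built-in sqlite3 module. The Message column and the function name next_id_after_delete are placeholders for the example, since the question does not show the rest of the Logs table.

```python
import sqlite3

def next_id_after_delete(id_definition):
    """Insert three rows, delete them all, and return the Id of the next row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE Logs (Id {id_definition}, Message TEXT)")
    for message in ("a", "b", "c"):
        # Id is omitted, so SQLite assigns it.
        conn.execute("INSERT INTO Logs (Message) VALUES (?)", (message,))
    conn.execute("DELETE FROM Logs")
    cursor = conn.execute("INSERT INTO Logs (Message) VALUES ('d')")
    new_id = cursor.lastrowid
    conn.close()
    return new_id

plain = next_id_after_delete("INTEGER PRIMARY KEY")
sticky = next_id_after_delete("INTEGER PRIMARY KEY AUTOINCREMENT")
```

Here plain ends up as 1 because the table was empty again, while sticky is 4 because AUTOINCREMENT remembers the highest Id ever handed out.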

Constraint To Prevent Adding Value Which Exists In Another Table

I would like to add a constraint which prevents adding a value to a column if the value exists in the primary key column of another table. Is this possible?
EDIT:
Table: MasterParts
MasterPartNumber (Primary Key)
Description
....
Table: AlternateParts
MasterPartNumber (Composite Primary Key, Foreign Key to MasterParts.MasterPartNumber)
AlternatePartNumber (Composite Primary Key)
Problem - Alternate part numbers for each master part number must not themselves exist in the master parts table.
EDIT 2:
Here is an example:
MasterParts
MasterPartNumber  Description      MinLevel  MaxLevel  ReorderLevel
010-00820-50      Garmin GTN™ 750  1         5         2
AlternateParts
MasterPartNumber  AlternatePartNumber
010-00820-50      0100082050
010-00820-50      GTN750

The only way I can think of to solve this is to write a checking function (I'm not sure what language you are working with), or to rework the table relationships so that uniqueness is enforced.

Why not have a single "part" table with an "is master part" flag and then have an "alternate parts" table that maps a "master" part to one or more "alternate" parts?

Here's one way to do it without procedural code. I've deliberately left out ON UPDATE CASCADE and ON DELETE CASCADE, but in production I might use both. (But I'd severely limit who's allowed to update and delete part numbers.)
-- New tables
create table part_numbers (
    pn varchar(50) primary key,
    pn_type char(1) not null check (pn_type in ('m', 'a')),
    unique (pn, pn_type)
);
create table part_numbers_master (
    pn varchar(50) primary key,
    pn_type char(1) not null default 'm' check (pn_type = 'm'),
    description varchar(100) not null,
    foreign key (pn, pn_type) references part_numbers (pn, pn_type)
);
create table part_numbers_alternate (
    pn varchar(50) primary key,
    pn_type char(1) not null default 'a' check (pn_type = 'a'),
    foreign key (pn, pn_type) references part_numbers (pn, pn_type)
);
-- Now, your tables.
create table masterparts (
    master_part_number varchar(50) primary key references part_numbers_master,
    min_level integer not null default 0 check (min_level >= 0),
    max_level integer not null default 0 check (max_level >= min_level),
    reorder_level integer not null default 0
        check ((reorder_level < max_level) and (reorder_level >= min_level))
);
create table alternateparts (
    master_part_number varchar(50) not null references part_numbers_master (pn),
    alternate_part_number varchar(50) not null references part_numbers_alternate (pn),
    primary key (master_part_number, alternate_part_number)
);
-- Some test data
insert into part_numbers values
    ('010-00820-50', 'm'),
    ('0100082050', 'a'),
    ('GTN750', 'a');
insert into part_numbers_master values
    ('010-00820-50', 'm', 'Garmin GTN™ 750');
insert into part_numbers_alternate (pn) values
    ('0100082050'),
    ('GTN750');
insert into masterparts values
    ('010-00820-50', 1, 5, 2);
insert into alternateparts values
    ('010-00820-50', '0100082050'),
    ('010-00820-50', 'GTN750');
In practice, I'd build updatable views for master parts and for alternate parts, and I'd limit client access to the views. The updatable views would be responsible for managing inserts, updates, and deletes. (Depending on your company's policies, you might use stored procedures instead of updatable views.)
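To see why this arrangement keeps a part number from being both a master and an alternate, here is a cut-down sketch of the three part_numbers tables only, translated to SQLite and run through Python's built-in sqlite3 module. The translation is an assumption on my part (TEXT columns, the description column dropped, foreign keys switched on with a PRAGMA), and open_parts_db and can_insert are hypothetical helper names.

```python
import sqlite3

SCHEMA = """
CREATE TABLE part_numbers (
    pn      TEXT PRIMARY KEY NOT NULL,
    pn_type TEXT NOT NULL CHECK (pn_type IN ('m', 'a')),
    UNIQUE (pn, pn_type)
);
CREATE TABLE part_numbers_master (
    pn      TEXT PRIMARY KEY NOT NULL,
    pn_type TEXT NOT NULL DEFAULT 'm' CHECK (pn_type = 'm'),
    FOREIGN KEY (pn, pn_type) REFERENCES part_numbers (pn, pn_type)
);
CREATE TABLE part_numbers_alternate (
    pn      TEXT PRIMARY KEY NOT NULL,
    pn_type TEXT NOT NULL DEFAULT 'a' CHECK (pn_type = 'a'),
    FOREIGN KEY (pn, pn_type) REFERENCES part_numbers (pn, pn_type)
);
"""

def open_parts_db():
    """Create the tables and load one master and one alternate part number."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO part_numbers VALUES ('010-00820-50', 'm')")
    conn.execute("INSERT INTO part_numbers VALUES ('GTN750', 'a')")
    conn.execute("INSERT INTO part_numbers_master (pn) VALUES ('010-00820-50')")
    conn.execute("INSERT INTO part_numbers_alternate (pn) VALUES ('GTN750')")
    return conn

def can_insert(conn, sql):
    """Return True if the statement succeeds, False on a constraint violation."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False
```

Registering GTN750 as a master fails either way: part_numbers refuses a second row for the same pn, and part_numbers_master refuses it because the pair ('GTN750', 'm') does not exist in part_numbers.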

Your design is perfect.
But SQL isn't very helpful when you try to implement such a design. There is no declarative way in SQL to enforce your business rule. You'll have to write two triggers: one for inserts into masterparts, checking that the new master part identifier doesn't already exist as an alias, and another for inserts of aliases, checking that the new alias doesn't already identify a master part.
Or you can do this in the application, which is worse than triggers, from the data integrity point of view.
(If you want to read up on how to enforce constraints of arbitrary complexity within an SQL engine, best coverage I have seen of the topic is in the book "Applied Mathematics for Database Professionals")
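As a rough illustration of the two-trigger idea, here is a sketch in SQLite's trigger dialect, driven from Python's built-in sqlite3 module. The trimmed-down tables, the trigger names, and the helper functions are all assumptions for the example, and it only covers INSERT; a real version would need matching UPDATE triggers as well.

```python
import sqlite3

SCHEMA = """
CREATE TABLE masterparts (
    master_part_number TEXT PRIMARY KEY NOT NULL
);
CREATE TABLE alternateparts (
    master_part_number    TEXT NOT NULL REFERENCES masterparts,
    alternate_part_number TEXT NOT NULL,
    PRIMARY KEY (master_part_number, alternate_part_number)
);
-- Reject a master part number that is already in use as an alternate.
CREATE TRIGGER masterparts_not_alias
BEFORE INSERT ON masterparts
WHEN EXISTS (SELECT 1 FROM alternateparts
             WHERE alternate_part_number = NEW.master_part_number)
BEGIN
    SELECT RAISE(ABORT, 'part number already used as an alternate');
END;
-- Reject an alternate part number that already identifies a master part.
CREATE TRIGGER alternateparts_not_master
BEFORE INSERT ON alternateparts
WHEN EXISTS (SELECT 1 FROM masterparts
             WHERE master_part_number = NEW.alternate_part_number)
BEGIN
    SELECT RAISE(ABORT, 'part number already used as a master');
END;
"""

def open_db():
    """Create the tables and triggers and load one master with one alternate."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO masterparts VALUES ('010-00820-50')")
    conn.execute("INSERT INTO alternateparts VALUES ('010-00820-50', 'GTN750')")
    return conn

def can_insert(conn, sql):
    """Return True if the statement succeeds, False if the database refuses it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.DatabaseError:
        return False
```

With the sample row loaded, trying to add GTN750 as a master part, or 010-00820-50 as an alternate, is refused by the corresponding trigger.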

Apart from the fact that this sounds like a possibly poor design, you essentially want the values spanning two columns in different tables to be unique.
To use the database's native uniqueness checking, you can create a third, helper column that contains a copy of all the values in the two columns, and put a uniqueness constraint on it. For each new value added to one of your target columns, you add the same value to the helper column. To keep this a constraint inside the database, you can do that with a trigger.
And again, needing to do the above sounds like evidence of a poor design.
--
Edit:
Regarding your edit:
You say " Alternate part numbers for each master part number must not themselves exist in the master parts table."
This itself is a design decision, which you don't explain.
I don't know enough about the domain of your problem, but:
If you think of master and alternate parts as totally different things, there is no reason to require that "Alternate part numbers for each master part number must not themselves exist in the master parts table". Otherwise, you have a common notion of "parts", whether master or alternate, which means they belong in the same table and column.
If the second is true, you need something like this:
table "parts"
columns:
    id - pk
    is_master - boolean (assuming a part cannot be master and alternate at the same time)
    description - text
This table's role is to list and describe the parts.
Then you have several ways to denote which part is an alternate to which. It depends on whether a part can be an alternate to more than one part. In any case, it sounds like one master part can have several alternates.
You can do it in the same table or create another one.
If the same table: add a column alternate_to, which is null for master parts and has a foreign key to the id column of the same table.
Otherwise, create a table, say "alternatives", with master_id and alternate_id, both referencing the parts table with a foreign key.
(The first option assumes that a part cannot be an alternate to more than one other part. If that is not true, the second option works anyway.)
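The same-table variant could look roughly like this in SQLite, using Python's built-in sqlite3 module. This is only a sketch under a few assumptions: the part number itself is used as the id, is_master is left out because a NULL alternate_to already marks a master part, and the function names are made up.

```python
import sqlite3

def build_parts_db():
    """Create a single parts table where alternates point at their master."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.execute("""
        CREATE TABLE parts (
            id           TEXT PRIMARY KEY NOT NULL,
            description  TEXT,
            alternate_to TEXT REFERENCES parts (id)  -- NULL for master parts
        )""")
    conn.execute("INSERT INTO parts VALUES ('010-00820-50', 'Garmin GTN 750', NULL)")
    conn.execute("INSERT INTO parts VALUES ('GTN750', NULL, '010-00820-50')")
    conn.execute("INSERT INTO parts VALUES ('0100082050', NULL, '010-00820-50')")
    return conn

def alternates_of(conn, master_id):
    """Return the ids of all parts recorded as alternates of master_id."""
    rows = conn.execute(
        "SELECT id FROM parts WHERE alternate_to = ? ORDER BY id", (master_id,)
    )
    return [row[0] for row in rows]
```

Because every part number lives in one primary key column, an alternate can never collide with a master: the second insert of the same id is simply a duplicate key.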

Can I put constraint on column without referring to another table?

I have a text column that should only have 1 of 3 possible strings. To put a constraint on it, I would have to reference another table. Can I instead put the values of the constraint directly on the column without referring to another table?

If this is SQL Server, Oracle, or PostgreSQL, yes, you can use a check constraint.
If it's MySQL, check constraints are parsed but not enforced in versions before 8.0.16 (8.0.16 and later do enforce them). You can use an ENUM, though. If you need a comma-separated list, you can use a SET.
However, this is generally frowned upon, since it's not easy to maintain. It is usually best to create a lookup table and enforce referential integrity through that.

In addition to the CHECK constraint and ENUM data type that others mention, you could also write a trigger to enforce your desired restriction.
I don't necessarily recommend a trigger as a good solution; I'm just pointing out another option that meets your criterion of not referencing a lookup table.
My habit is to define lookup tables instead of using constraints or triggers when the rule is simply to restrict a column to a finite set of values. The performance impact of checking against a lookup table is no worse than using CHECK constraints or triggers, and it's a lot easier to manage when the set of values might change from time to time.
Another common task is to query the set of permitted values, for instance to populate a form field in the user interface. When the permitted values are in a lookup table, this is a lot easier than when they're defined as a list of literal values in a CHECK constraint or ENUM definition.
Re comment "how exactly to do lookup without id"
CREATE TABLE LookupStrings (
    string VARCHAR(20) PRIMARY KEY
);
CREATE TABLE MainTable (
    main_id INT PRIMARY KEY,
    string VARCHAR(20) NOT NULL,
    FOREIGN KEY (string) REFERENCES LookupStrings (string)
);
Now you can be assured that no value in MainTable.string is invalid, since the referential integrity prevents that. But you don't have to join to the LookupStrings table to get the string, when you query MainTable:
SELECT main_id, string FROM MainTable;
See? No join! But you get the string value.
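Here is a small sketch of that lookup-table setup in SQLite, via Python's built-in sqlite3 module. The three sample strings and the helper names are assumptions for the example; note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON.

```python
import sqlite3

def build_lookup_db():
    """Create LookupStrings and MainTable and load three permitted values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.executescript("""
        CREATE TABLE LookupStrings (string TEXT PRIMARY KEY NOT NULL);
        CREATE TABLE MainTable (
            main_id INTEGER PRIMARY KEY,
            string  TEXT NOT NULL REFERENCES LookupStrings (string)
        );
        INSERT INTO LookupStrings VALUES ('first'), ('second'), ('third');
    """)
    return conn

def permitted_values(conn):
    """List the allowed strings, e.g. to fill a form field."""
    rows = conn.execute("SELECT string FROM LookupStrings ORDER BY string")
    return [row[0] for row in rows]

def insert_main(conn, main_id, value):
    """Return True if the row is accepted, False if the foreign key rejects it."""
    try:
        conn.execute("INSERT INTO MainTable VALUES (?, ?)", (main_id, value))
        return True
    except sqlite3.IntegrityError:
        return False
```

Adding a fourth permitted value later is a plain INSERT into LookupStrings, with no ALTER TABLE needed.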
Re comment about multiple foreign key columns:
You can have two individual foreign keys, each potentially pointing to different rows in the lookup table. The foreign key column doesn't have to be named the same as the column in the referenced table.
My common example is a bug-tracking database, where a bug was reported by one user, but assigned to be fixed by a different user. Both reported_by and assigned_to are foreign keys referencing the Accounts table.
CREATE TABLE Bugs (
    bug_id INT PRIMARY KEY,
    reported_by INT NOT NULL,
    assigned_to INT,
    FOREIGN KEY (reported_by) REFERENCES Accounts (account_id),
    FOREIGN KEY (assigned_to) REFERENCES Accounts (account_id)
);
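To make the two-foreign-keys case concrete, here is a sketch in SQLite through Python's built-in sqlite3 module. The Accounts table is not shown in the answer, so its name column, the sample rows, and the helper names are assumptions.

```python
import sqlite3

def build_bugs_db():
    """Create Accounts and Bugs with two foreign keys into Accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.executescript("""
        CREATE TABLE Accounts (
            account_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL
        );
        CREATE TABLE Bugs (
            bug_id      INTEGER PRIMARY KEY,
            reported_by INTEGER NOT NULL REFERENCES Accounts (account_id),
            assigned_to INTEGER REFERENCES Accounts (account_id)
        );
        INSERT INTO Accounts VALUES (1, 'alice'), (2, 'bob');
        INSERT INTO Bugs VALUES (10, 1, 2), (11, 2, NULL);
    """)
    return conn

def bug_people(conn, bug_id):
    """Return (reporter_name, assignee_name); the assignee may be None."""
    return conn.execute("""
        SELECT r.name, a.name
        FROM Bugs b
        JOIN Accounts r ON r.account_id = b.reported_by
        LEFT JOIN Accounts a ON a.account_id = b.assigned_to
        WHERE b.bug_id = ?""", (bug_id,)).fetchone()
```

Each foreign key is joined to its own alias of Accounts, which is why the two columns can point at different rows of the same table.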

In Oracle, SQL Server and PostgreSQL, use a CHECK constraint.
CREATE TABLE mytable (myfield VARCHAR(50) CHECK (myfield IN ('first', 'second', 'third')));
In MySQL, use the ENUM datatype:
CREATE TABLE mytable (myfield ENUM ('first', 'second', 'third'));
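SQLite also enforces CHECK constraints, so the first form can be tried there with Python's built-in sqlite3 module. This is a sketch with two assumptions: the column is declared TEXT NOT NULL (a NULL would otherwise pass the CHECK, because NULL IN (...) is not false), and the helper name accepts is made up.

```python
import sqlite3

def accepts(value):
    """Return True if value passes the CHECK (myfield IN (...)) constraint."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable ("
        " myfield TEXT NOT NULL"
        " CHECK (myfield IN ('first', 'second', 'third')))"
    )
    try:
        conn.execute("INSERT INTO mytable (myfield) VALUES (?)", (value,))
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()
```

Only the three listed strings get in; anything else, including NULL, is refused without any second table being involved.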
