Help with Primary keys and unique constraints - sql

In a table I've got 3 columns:
id
tag1
tag2
id is a primary key.
And I want each tag1-tag2 combination to appear only once in that table.
E.g. if one entry looks like:
id: 1
tag1: cat
tag2: dog
I don't want a second entry like the one below to be inserted:
id: 2
tag1: cat
tag2: dog
So I made all 3 columns the primary key, but then the second entry still gets inserted, because uniqueness is checked on the combination of all 3 columns.
How do I solve this so that only the combination of tag1 and tag2 is unique?
UPDATE: I added a unique constraint on tag1 and tag2. However, it's still possible to insert:
id: 3
tag1: dog
tag2: cat
Is there a way to prevent this?

You should leave id as the primary key and create a unique constraint on tag1 and tag2:
ALTER TABLE my_table ADD CONSTRAINT uc_tags UNIQUE (tag1, tag2);
With the unique constraint, you will be guaranteed that you will never have two rows with duplicate tag1 and tag2 values.
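A minimal sketch of that constraint in action, using Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for the asker's database (an assumption; the question does not show the real schema). The table and constraint names follow the ALTER TABLE statement above, and try_insert is a hypothetical helper.

```python
import sqlite3

# In-memory SQLite database, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE my_table (
        id   INTEGER PRIMARY KEY,
        tag1 TEXT NOT NULL,
        tag2 TEXT NOT NULL,
        CONSTRAINT uc_tags UNIQUE (tag1, tag2)
    )
""")


def try_insert(row_id, tag1, tag2):
    """Return True if the row was stored, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO my_table (id, tag1, tag2) VALUES (?, ?, ?)",
                (row_id, tag1, tag2),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = try_insert(1, "cat", "dog")          # accepted
duplicate = try_insert(2, "cat", "dog")      # rejected by uc_tags
reversed_pair = try_insert(3, "dog", "cat")  # accepted: a different ordered pair
print(first, duplicate, reversed_pair)
```

The last insert succeeds because the constraint compares ordered pairs, which is the behaviour described in the question's update.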
EDIT:
Further to your last update, you cannot enforce that with unique constraints. Keep in mind that for the database a record with (tag1 = dog, tag2 = cat) is totally different from a record with (tag1 = cat, tag2 = dog).
Probably your best bet is to redesign your database schema, as follows:
Table "tags"
Table "messages" (or whatever you are tagging)
Table "messages_tags" with the following fields (message_id, tag_id)
Then you can simply set (message_id, tag_id) of the "messages_tags" table as the primary key. This automatically ensures that no message can have the same tag twice.
Some sample data:
Table: messages
message_id | title
-------------+------------------
1 | some message
2 | another message
Table: tags
tag_id | tags
-------------+-------------------
1 | cat
2 | dog
3 | duck
4 | horse
Table: messages_tags
message_id | tag_id
-------------+-------------------
1 | 1
1 | 2
2 | 3
2 | 4
2 | 1
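The redesign can be sketched as follows, assuming SQLite via Python's sqlite3 module as a stand-in database. The table and column names come from the sample data above; the foreign keys and the tag_message helper are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE messages (message_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE tags     (tag_id INTEGER PRIMARY KEY, tags TEXT NOT NULL);
    CREATE TABLE messages_tags (
        message_id INTEGER NOT NULL REFERENCES messages (message_id),
        tag_id     INTEGER NOT NULL REFERENCES tags (tag_id),
        PRIMARY KEY (message_id, tag_id)   -- one row per (message, tag) pair
    );
    INSERT INTO messages VALUES (1, 'some message'), (2, 'another message');
    INSERT INTO tags VALUES (1, 'cat'), (2, 'dog'), (3, 'duck'), (4, 'horse');
    INSERT INTO messages_tags VALUES (1, 1), (1, 2), (2, 3), (2, 4), (2, 1);
""")


def tag_message(message_id, tag_id):
    """Attach a tag to a message; return False if the pair already exists."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO messages_tags (message_id, tag_id) VALUES (?, ?)",
                (message_id, tag_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


repeat = tag_message(1, 2)    # message 1 is already tagged 'dog'
new_tag = tag_message(1, 3)   # 'duck' is new for message 1
tags_of_1 = [
    row[0]
    for row in conn.execute(
        "SELECT t.tags FROM messages_tags mt "
        "JOIN tags t ON t.tag_id = mt.tag_id "
        "WHERE mt.message_id = 1 ORDER BY t.tag_id"
    )
]
print(repeat, new_tag, tags_of_1)
```

Because each tag is its own row, the order in which tags were attached no longer matters, so the (cat, dog) versus (dog, cat) problem does not arise.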

You can keep the primary key on the "id" column and add a unique constraint on the "tag1" and "tag2" columns. See this link.

Add a unique index that combines tag1 and tag2.
http://dev.mysql.com/doc/refman/5.1/en/create-index.html

Depending on if and when you need to use the "unique record" in other tables, it can be argued that your "id" field is unnecessary. (ID here is a surrogate key.) If you won't be using the "id" field in another table, then it really makes more sense to make (tag1, tag2) your primary key and to remove the "id" column altogether.

I guess the question is, Why would you do it this way? It would help to know the business reason.
You can always SELECT DISTINCT to only get the rows with unique values.

If you have some control over the order of insertion and update you can enforce uniqueness of permutation:
alter table t23
add constraint tags_ck check (tag1 < tag2)
/
alter table t23
add constraint tags_uk unique (tag1, tag2)
/
This works because the check constraint rejects ('dog','cat') as an invalid combination. Consequently the unique constraint can ensure that there is only ever one record with that particular permutation of tags.
As a solution this does require some intervention at insert and update time, which may be enough to sink this implementation for you. I know of an elegant solution which works in Oracle, using a function-based index (I posted it here) but I don't think MySQL supports a similar type of index.
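A sketch of the check-plus-unique approach, including the "intervention at insert time": the application sorts the two tags before inserting. It assumes SQLite through Python's sqlite3 module as a stand-in database; insert_pair and insert_raw are hypothetical helpers, and the example sticks to plain ASCII tags so that Python's string ordering and the database's default ordering agree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE t23 (
        id   INTEGER PRIMARY KEY,
        tag1 TEXT NOT NULL,
        tag2 TEXT NOT NULL,
        CONSTRAINT tags_ck CHECK (tag1 < tag2),
        CONSTRAINT tags_uk UNIQUE (tag1, tag2)
    )
""")


def insert_raw(tag1, tag2):
    """Insert the pair exactly as given; return False if a constraint fails."""
    try:
        with conn:
            conn.execute("INSERT INTO t23 (tag1, tag2) VALUES (?, ?)", (tag1, tag2))
        return True
    except sqlite3.IntegrityError:
        return False


def insert_pair(a, b):
    """Normalise the pair so the smaller tag always goes into tag1."""
    low, high = sorted((a, b))
    return insert_raw(low, high)


stored = insert_pair("cat", "dog")         # stored as ('cat', 'dog')
swapped = insert_pair("dog", "cat")        # normalised to ('cat', 'dog'): duplicate
unsorted_raw = insert_raw("dog", "cat")    # violates tags_ck
print(stored, swapped, unsorted_raw)
```

Note that CHECK (tag1 < tag2) also rejects a row whose two tags are equal.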


Validate whether value in one table is the same as in related table - performance

Let's say I have two tables and I'm doing all the operations in .NET Core 2 Web API.
Table A:
Id,
SomeValue,
TeamName
Table B:
Id,
Fk_Id_a (references Id in table A),
OtherValue,
TeamName
I can add and get records from table B independently.
But for every record in Table B, TeamName has to be the same as for its corresponding Fk_Id_a in Table A.
Assume these values come in:
{
"Fk_Id_a": 3,
"SomeValue": "test val",
"TeamName": "Super team"
}
Which way would be better in terms of performance? The 1st way requires two connections, whereas the 2nd requires storing some extra keys etc.
1ST WAY:
get record from Table A for Fk_Id_a (3),
check if TeamName is the same as in coming request (Super team),
do the rest of the logic
2ND WAY:
using compound foreign keys and indexes:
TableA has alternate unique key (Id, TeamName)
TableB has foreign compound key (Fk_Id_a, TeamName) that references TableA (Id, TeamName)
SQL SCRIPT TO SHOW:
ALTER TABLE Observation
ADD UNIQUE (Id, PowelTeamId)
GO
ALTER TABLE ObservationPicturesId
ADD FOREIGN KEY(ObservationId, PowelTeamId)
REFERENCES Observation(Id, PowelTeamId)
ON DELETE CASCADE
ON UPDATE CASCADE
EDIT: A simple example of how the tables might look. TeamName has to be valid for the FK-referenced row in Table A.
Table A
ID | ObservationTitle | TeamName
---------------------------------------
1 | Fire damage | CX_team
2 | Water damage | CX_team
3 | Wind damage | Dd_WP3
Table B
ID | PictureId | AddedBy | TeamName | TableA_ID_FK
-----------------------------------------------------
1 | Fire | James | CX_team | 1
2 | Water | Andrew | CX_team | 1
3 | Wind | John | Dd_WP3 | 3
Performance-wise, the 2nd option would be faster, because no comparison is needed when selecting rows from the table: the foreign key forces the values to match on insert, update, and delete. It would also create a unique index on table A.
That being said, there is something very fishy about the structure you describe. First of all, why is TeamName repeated in table B? If a row in table B is "valid" only when the TeamName matches, then you should enforce that through the ID foreign key alone, without storing the TeamName value in table B at all, so that no row can carry a different TeamName. If some records in table B represent something other than the entity linked to table A, then you should split them into another table, or set the foreign key column only when the team matches rather than always.
The issue is that you are using a foreign key as a partial link, making the relationship valid only when an additional condition is true.
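For reference, the compound-key option from the question (the 2nd way) can be sketched like this. It assumes SQLite via Python's sqlite3 module rather than the asker's SQL Server, and the snake_case table and column names and the add_picture helper are illustrative, loosely following the example tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE table_a (
        id                INTEGER PRIMARY KEY,
        observation_title TEXT NOT NULL,
        team_name         TEXT NOT NULL,
        UNIQUE (id, team_name)             -- alternate key the compound FK points at
    );
    CREATE TABLE table_b (
        id         INTEGER PRIMARY KEY,
        picture_id TEXT NOT NULL,
        team_name  TEXT NOT NULL,
        table_a_id INTEGER NOT NULL,
        FOREIGN KEY (table_a_id, team_name)
            REFERENCES table_a (id, team_name)
            ON DELETE CASCADE ON UPDATE CASCADE
    );
    INSERT INTO table_a VALUES
        (1, 'Fire damage',  'CX_team'),
        (2, 'Water damage', 'CX_team'),
        (3, 'Wind damage',  'Dd_WP3');
""")


def add_picture(picture_id, team_name, table_a_id):
    """Insert into table_b; return False if the team does not match table_a."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO table_b (picture_id, team_name, table_a_id) "
                "VALUES (?, ?, ?)",
                (picture_id, team_name, table_a_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


matching = add_picture("Wind", "Dd_WP3", 3)        # team matches row 3 of table_a
mismatched = add_picture("Wind", "Super team", 3)  # rejected by the compound FK
print(matching, mismatched)
```

The database does the team check as part of the insert, so the application needs no separate lookup of Table A first.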

Can a foreign key refer to a primary key in the same table?

I think the answer is no, because a foreign key doesn't have the uniqueness property.
But some people said that it is possible when a table is joined to itself.
I am new to SQL. If it's true, please explain how and why.
Employee table
| e_id | e_name | e_sala | d_id |
|---- |------- |----- |--------|
| 1 | Tom | 50K | A |
| 2 | Billy | 15K | A |
| 3 | Bucky | 15K | B |
department table
| d_id | d_name |
|---- |------- |
| A | XXX |
| B | YYY |
Now, d_id is a foreign key, so how can it be a primary key? Also, please explain joins: what are they used for?
I think the question is a bit confusing.
If you mean "can foreign key 'refer' to a primary key in the same table?", the answer is a firm yes as some replied. For example, in an employee table, a row for an employee may have a column for storing manager's employee number where the manager is also an employee and hence will have a row in the table like a row of any other employee.
If you mean "can a column (or set of columns) be a primary key as well as a foreign key in the same table?", the answer, in my view, is no; it seems meaningless. However, the following definition succeeds in SQL Server!
create table t1(c1 int not null primary key foreign key references t1(c1))
But I think it is meaningless to have such a constraint unless somebody comes up with a practical example.
AmanS, in your example d_id can under no circumstances be a primary key in the Employee table. A table can have only one primary key. I hope this clears up your doubt. d_id is/can be a primary key only in the department table.
This may be a good example:
CREATE TABLE employees (
id INTEGER NOT NULL PRIMARY KEY,
managerId INTEGER REFERENCES employees(id),
name VARCHAR(30) NOT NULL
);
INSERT INTO employees(id, managerId, name) VALUES(1, NULL, 'John');
INSERT INTO employees(id, managerId, name) VALUES(2, 1, 'Mike');
-- Explanation:
-- In this example.
-- John is Mike's manager. Mike does not manage anyone.
-- Mike is the only employee who does not manage anyone.
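Since the question also asks what a join is for, here is a small sketch that runs the employees example and joins the table to itself to pair each employee with their manager. It assumes SQLite via Python's sqlite3 module; the query, the aliases e and m, and the extra 'Ann' row are illustrative additions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE employees (
        id        INTEGER NOT NULL PRIMARY KEY,
        managerId INTEGER REFERENCES employees(id),
        name      VARCHAR(30) NOT NULL
    );
    INSERT INTO employees (id, managerId, name) VALUES (1, NULL, 'John');
    INSERT INTO employees (id, managerId, name) VALUES (2, 1, 'Mike');
""")

# Self-join: alias e is the employee, alias m is the same table read as "manager".
pairs = conn.execute("""
    SELECT e.name, m.name
    FROM employees AS e
    LEFT JOIN employees AS m ON m.id = e.managerId
    ORDER BY e.id
""").fetchall()
print(pairs)

# A managerId that matches no existing employee is rejected by the foreign key.
try:
    with conn:
        conn.execute(
            "INSERT INTO employees (id, managerId, name) VALUES (3, 99, 'Ann')"
        )
    dangling_accepted = True
except sqlite3.IntegrityError:
    dangling_accepted = False
print(dangling_accepted)
```

The LEFT JOIN keeps John in the result even though he has no manager; his manager column comes back as NULL (None in Python).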
Sure, why not? Let's say you have a Person table, with id, name, age, and parent_id, where parent_id is a foreign key to the same table. You wouldn't need to normalize the Person table to Parent and Child tables, that would be overkill.
Person
| id | name | age | parent_id |
|----|-------|-----|-----------|
| 1 | Tom | 50 | null |
| 2 | Billy | 15 | 1 |
Something like this.
I suppose to maintain consistency, there would need to be at least 1 null value for parent_id, though. The one "alpha male" row.
EDIT: As the comments show, Sam found a good reason not to do this. It seems that in MySQL, when you attempt to edit the primary key, the edit won't propagate properly even if you specify ON UPDATE CASCADE. Although primary keys are (usually) off-limits to editing in production, it is nevertheless a limitation not to be ignored. Thus I change my answer to: you should probably avoid this practice unless you have pretty tight control over the production system (and can guarantee no one will implement a control that edits the PKs). I haven't tested it outside of MySQL.
E.g. n levels of sub-categories for categories: the primary key id is referenced by the foreign key sub_category_id of the same table.
A good example of using ids of other rows in the same table as foreign keys is nested lists.
With ON DELETE CASCADE on the foreign key, deleting a row that has children (i.e., rows that refer to the parent's id), which in turn have children of their own (i.e., rows referencing the children's ids), deletes the whole cascade of rows.
This will save a lot of pain (and a lot of code of what to do with orphans - i.e., rows, that refer to non-existing ids).
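A sketch of that cascading delete on a nested list, assuming SQLite via Python's sqlite3 module. The categories table, its parent_id column, and the sample rows are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE categories (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES categories (id) ON DELETE CASCADE,
        name      TEXT NOT NULL
    );
    INSERT INTO categories (id, parent_id, name) VALUES
        (1, NULL, 'root'),
        (2, 1,    'child'),
        (3, 2,    'grandchild'),
        (4, NULL, 'other root');
""")

# Deleting the root also removes its child and, through it, the grandchild.
with conn:
    conn.execute("DELETE FROM categories WHERE id = 1")

remaining = [row[0] for row in conn.execute("SELECT name FROM categories ORDER BY id")]
print(remaining)
```

No application code is needed to find and clean up the orphans; the foreign key action does it.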
Other answers have given clear enough examples of a record referencing another record in the same table.
There are even valid use cases for a record referencing itself in the same table. For example, a point of sale system accepting many tenders may need to know which tender to use for change when the payment is not the exact value of the sale. For many tenders that's the same tender, for others that's domestic cash, for yet other tenders, no form of change is allowed.
All this can be pretty elegantly represented with a single tender attribute which is a foreign key referencing the primary key of the same table, and whose values sometimes match the respective primary key of same record. In this example, the absence of value (also known as NULL value) might be needed to represent an unrelated meaning: this tender can only be used at its full value.
Popular relational database management systems support this use case smoothly.
Take-aways:
When inserting a record, the foreign key reference is verified to be present after the insert, rather than before the insert.
When inserting multiple records with a single statement, the order in which the records are inserted matters. The constraints are checked for each record separately.
Certain other data patterns, such as those involving circular dependences on record level going through two or more tables, cannot be purely inserted at all, or at least not with all the foreign keys enabled, and they have to be established using a combination of inserts and updates (if they are truly necessary).
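The tender example can be sketched as below, assuming SQLite via Python's sqlite3 module; other database systems may differ in exactly when they check the reference. The tenders table, the change_tender_id column, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("""
    CREATE TABLE tenders (
        id               INTEGER PRIMARY KEY,
        name             TEXT NOT NULL,
        change_tender_id INTEGER REFERENCES tenders (id)
    )
""")

with conn:
    # Cash gives change in cash: the row references itself.
    conn.execute("INSERT INTO tenders VALUES (1, 'cash', 1)")
    # A gift card gives change in cash: the row references another row.
    conn.execute("INSERT INTO tenders VALUES (2, 'gift card', 1)")
    # A voucher allows no change: NULL means "no change tender".
    conn.execute("INSERT INTO tenders VALUES (3, 'voucher', NULL)")

self_referencing = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM tenders WHERE change_tender_id = id ORDER BY id"
    )
]
print(self_referencing)

# A reference to a tender that does not exist is still rejected.
try:
    with conn:
        conn.execute("INSERT INTO tenders VALUES (4, 'cheque', 99)")
    dangling_accepted = True
except sqlite3.IntegrityError:
    dangling_accepted = False
print(dangling_accepted)
```

The first insert succeeds even though row 1 does not exist before the statement runs, which matches the first take-away above.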
Adding to the answer by #mysagar the way to do the same in MySQL is demonstrated below -
CREATE TABLE t1 (
c1 INT NOT NULL,
PRIMARY KEY (c1),
CONSTRAINT fk FOREIGN KEY (c1)
REFERENCES t1 (c1)
ON UPDATE RESTRICT
ON DELETE RESTRICT
);
would give error -
ERROR 1822 (HY000): Failed to add the foreign key constraint. Missing index for constraint 'fk' in the referenced table 't1'
The correct way to do it is -
CREATE TABLE t1 (
c1 INT NOT NULL,
PRIMARY KEY (c1),
KEY i (c1),
CONSTRAINT fk FOREIGN KEY (c1)
REFERENCES t1 (c1)
ON UPDATE RESTRICT
ON DELETE RESTRICT
);
One practical utility I can think of is a quick-fix to ensure that after a value is entered in the PRIMARY KEY column, it can neither be updated, nor deleted.
For example, over here let's populate table t1 -
INSERT INTO t1 (c1) VALUES
(1),
(2),
(3),
(4),
(5);
SELECT * FROM t1;
+----+
| c1 |
+----+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
+----+
Now, let's try updating row1 -
UPDATE t1
SET c1 = 6 WHERE c1 = 1;
ERROR 1451 (23000): Cannot delete or update a parent row: a foreign key constraint fails (`constraints`.`t1`, CONSTRAINT `fk` FOREIGN KEY (`c1`) REFERENCES `t1` (`c1`) ON DELETE RESTRICT ON UPDATE RESTRICT)
Now, let's try deleting row1 -
DELETE FROM t1
WHERE c1 = 1;
ERROR 1451 (23000): Cannot delete or update a parent row: a foreign key constraint fails (`constraints`.`t1`, CONSTRAINT `fk` FOREIGN KEY (`c1`) REFERENCES `t1` (`c1`) ON DELETE RESTRICT ON UPDATE RESTRICT)

Constrain a table such that each account can have one of another table

I have a table which has these columns:
Id (Primary Key): the id.
OwnerId (Foreign Key): the id of the owner, which resides in another table.
TypeId (Foreign Key): the type of thing this record represents. There are a finite number of types, which are represented in another table. This links to that table.
TypeCreatorId (ForeignKey): the owner of the type represented by TypeId.
SourceId (Foreign Key): this isn't important to this question.
I need to constrain this table such that for each Id, there can be only one of each TypeCreatorId. I hope that makes sense!
For SQL Server, you have two options:
create a UNIQUE CONSTRAINT
ALTER TABLE dbo.YourTable
ADD CONSTRAINT UNIQ_Id_TypeCreator UNIQUE(Id, TypeCreatorId)
create a UNIQUE INDEX:
CREATE UNIQUE INDEX UIX_YourTable_ID_TypeCreator
ON dbo.YourTable(Id, TypeCreatorId)
Basically, both things achieve the same thing - you cannot have two rows with the same (Id, TypeCreatorId) values.
Simply create a unique index on OwnerId and TypeCreatorId.
An example using MySQL (sorry, I don't use SQL Server):
alter table yourTable
add unique index idx_newIndex(OwnerId, TypeCreatorId);
Example. I'll just put here what would happen with this new unique index:
OwnerId | TypeCreatorId
--------+--------------
1 | 1
1 | 2 -- This is Ok
2 | 1 -- Ok too
2 | 2 -- Ok again
1 | 2 -- THIS WON'T BE ALLOWED because it would be a duplicate
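The same walkthrough can be replayed programmatically. This sketch assumes SQLite via Python's sqlite3 module (neither SQL Server nor MySQL) and a cut-down version of the table containing only the columns that matter here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE yourTable (
        Id            INTEGER PRIMARY KEY,
        OwnerId       INTEGER NOT NULL,
        TypeCreatorId INTEGER NOT NULL
    )
""")
conn.execute(
    "CREATE UNIQUE INDEX idx_newIndex ON yourTable (OwnerId, TypeCreatorId)"
)

# The five rows from the example above, in the same order.
rows = [(1, 1), (1, 2), (2, 1), (2, 2), (1, 2)]
outcomes = []
for owner_id, type_creator_id in rows:
    try:
        with conn:
            conn.execute(
                "INSERT INTO yourTable (OwnerId, TypeCreatorId) VALUES (?, ?)",
                (owner_id, type_creator_id),
            )
        outcomes.append(True)
    except sqlite3.IntegrityError:
        outcomes.append(False)  # duplicate (OwnerId, TypeCreatorId) pair

print(outcomes)
```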

Inserting 2 rows, each to different tables where one row references the other's primary key

Hello folks
Checkout this scenario
Table 1 columns -> | table_1_id (pkey) | some_column | comments |
Table 2 columns -> | table_2_id (pkey) | some_other_column | table_1_id (fkey) | comments |
All primary keys are of type serial or auto number.
The 3rd column on Table 2 is an fk that references Table 1's primary key.
I would like to insert rows into both programmatically (from a C++ app).
Do I have to insert into Table 1, then SELECT-query the entry's primary key, then insert the Table 2 row with the pkey result?
Is there a more efficient way of handling this? Say, using at most 2 queries?
I would suggest looking at http://wiki.postgresql.org/wiki/FAQ
The site is a useful resource to go through to get familiar with PostgreSQL
Specifically, the section How do I get the value of a SERIAL insert?
The simplest way is to retrieve the assigned SERIAL value with RETURNING. Using the example table in the previous question, it would look like this:
INSERT INTO person (name) VALUES ('Blaise Pascal') RETURNING id;
You can also call nextval() and use that value in the INSERT, or call currval() after the INSERT.
If you don't need the table_1_id value in your application, you can skip retrieving it completely:
INSERT INTO table_1(cols...) VALUES(vals...);
INSERT INTO table_2(table_1_id, cols...) VALUES(currval('table_1_table_1_id_seq'), vals...);
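The two-statement pattern can be tried out without a PostgreSQL server. This sketch substitutes SQLite through Python's sqlite3 module, where the cursor's lastrowid attribute plays the role that RETURNING or currval() plays in PostgreSQL; that substitution is an assumption made for illustration, as are the sample values. Table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE table_1 (
        table_1_id  INTEGER PRIMARY KEY AUTOINCREMENT,
        some_column TEXT,
        comments    TEXT
    );
    CREATE TABLE table_2 (
        table_2_id        INTEGER PRIMARY KEY AUTOINCREMENT,
        some_other_column TEXT,
        table_1_id        INTEGER NOT NULL REFERENCES table_1 (table_1_id),
        comments          TEXT
    );
""")

with conn:  # both inserts commit together or not at all
    cur = conn.execute(
        "INSERT INTO table_1 (some_column, comments) VALUES (?, ?)",
        ("value a", "parent row"),
    )
    parent_id = cur.lastrowid  # generated key, obtained without a separate SELECT
    conn.execute(
        "INSERT INTO table_2 (some_other_column, table_1_id, comments) "
        "VALUES (?, ?, ?)",
        ("value b", parent_id, "child row"),
    )

linked = conn.execute("""
    SELECT t1.comments, t2.comments
    FROM table_2 AS t2
    JOIN table_1 AS t1 ON t1.table_1_id = t2.table_1_id
""").fetchall()
print(parent_id, linked)
```

Either way, two statements are enough: the first insert hands back the generated key, and the second uses it.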

how to keep combination of cells unique

I have table A and table B. I have a bridge table called tableC.
In table C I have:
ID
tableA_ID
tableB_ID
ID is the primary key.
I also want to enforce that the combination of tableA_ID and tableB_ID is unique, so there are no duplicate records.
How do I enforce this?
create unique index myIdx on tableC(tableA_ID, tableB_ID)
or whatever the syntax for your particular database system is.
Make the PRIMARY KEY (tableA_ID, tableB_ID), excluding ID.
Let's say we have a table TABLEA with values
tableAID
1
2
3
and table TABLEB with values
tableBID
4
5
6
Making the primary key (ID, tableA_ID, tableB_ID) will not work, e.g.
ID | tableA_ID | tableB_ID
1 | 1 | 4
2 | 1 | 4
will both be accepted under the above primary key, so you need PRIMARY KEY (tableA_ID, tableB_ID) instead.
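The difference between the two key definitions can be checked with a short sketch, assuming SQLite via Python's sqlite3 module. The table names wide_pk and pair_pk and the accepts helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Key spans all three columns: only the full triple must be unique.
    CREATE TABLE wide_pk (
        ID        INTEGER NOT NULL,
        tableA_ID INTEGER NOT NULL,
        tableB_ID INTEGER NOT NULL,
        PRIMARY KEY (ID, tableA_ID, tableB_ID)
    );
    -- Key spans only the two reference columns.
    CREATE TABLE pair_pk (
        tableA_ID INTEGER NOT NULL,
        tableB_ID INTEGER NOT NULL,
        PRIMARY KEY (tableA_ID, tableB_ID)
    );
""")


def accepts(sql, params):
    """Run one INSERT; return True if stored, False if a constraint fails."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


wide = [
    accepts("INSERT INTO wide_pk VALUES (?, ?, ?)", (1, 1, 4)),
    accepts("INSERT INTO wide_pk VALUES (?, ?, ?)", (2, 1, 4)),  # same pair, new ID
]
pair = [
    accepts("INSERT INTO pair_pk VALUES (?, ?)", (1, 4)),
    accepts("INSERT INTO pair_pk VALUES (?, ?)", (1, 4)),  # duplicate pair
]
print(wide, pair)
```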
Drop the ID column then make the other two columns the primary key and their uniqueness will be enforced by the database server.
It's not really necessary to have the ID column - even though it serves as a handy way of referencing a particular record - as the uniqueness of the other two columns will mean that they are sufficient to reference a particular record.
You may also want to put an index on this table that includes both columns, to make access faster.