naming the id field of a table properly


I am currently reading the book "SQL Programming Style", written by Joe Celko.
In the first chapter, in the section "Develop Standardized Postfixes", he states for the id column:
"_id" = identifier. It is unique in
the schema and refers to one entity
anywhere it appears in the schema.
Never use "<table_name>_id"
A few pages later he states:
Do not use an underscore as the first
or last letter in a name. It looks
like the name is missing another
component.
He also deprecates "id" as a column name.
So I would like to know: how do you name the id column?
I know most people might wonder what the point of this question is, but I want to standardize my data model, following industry and ISO standards as much as I can.

I also deprecate the use of "Id" as a column name, even though it has become very widespread. "EmployeeId" is longer than "Id", but it is more descriptive. It also allows a foreign key to generally have the same name as the column to which it refers. This is enormously helpful when control over the database passes from one person to the next.
There is an exception to the above. It's possible to have two foreign keys in the same table that both refer to the same key. It's also possible to have a reflexive foreign key that refers to the key in a different row of the same table where it appears.
Let me give an example of a reflexive key. You have a table of employees, with key EmployeeId. You have another column, called SupervisorId, that records the relationship between a supervisor and several subordinates. The name of the foreign key in this case names the role, and not the entity.
As an alternative, it's possible to use user defined domains to document the fact that two columns refer to the same thing. Again, this is most useful when the fundamental meaning of the data has to be communicated to someone new.
The use of underscore as an internal visual separator inside a symbol is a completely separable issue. Camelcasing has become more widespread than underscore, and there are even systems where underscore is not allowed as a symbol constituent.
Above all, keep it consistent. If you use arbitrary, capricious, and contradictory naming conventions, you'll eventually confuse even yourself.
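To make the reflexive key described above concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in database; the Employee table, the Name column, and the sample rows are assumptions for illustration, not something taken from Celko's book.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("""
    CREATE TABLE Employee (
        EmployeeId   INTEGER PRIMARY KEY,
        Name         TEXT NOT NULL,
        -- Role-named reflexive key: it points at another row of this same table.
        SupervisorId INTEGER REFERENCES Employee (EmployeeId)
    )
""")
conn.executemany(
    "INSERT INTO Employee (EmployeeId, Name, SupervisorId) VALUES (?, ?, ?)",
    [(1, "Ann", None), (2, "Bob", 1), (3, "Cho", 1)],  # hypothetical sample data
)

# Join the table to itself: e is the subordinate, s is the supervisor.
rows = conn.execute("""
    SELECT e.Name, s.Name
    FROM Employee AS e
    JOIN Employee AS s ON s.EmployeeId = e.SupervisorId
    ORDER BY e.EmployeeId
""").fetchall()
print(rows)
```

The join condition reads naturally because the foreign key names the role (SupervisorId) while the referenced column names the entity (EmployeeId).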

I think it's a good question. Do what looks good to you, and always do that, every time. Then you'll be fine.
I use the tablename + 'id' model: UserId, PersonId etc.

Rather than share my opinions on naming standards, I'll attempt to answer your question ;)
I think the point Celko is making is that student_ID in a table of students is a code smell, i.e. it could be that the designer's style is to always add an ID column, probably an auto-increment column, to every table they create in the physical model (even when there is no such column in the logical model) with the intention of using these ID columns for foreign keys. In other words, Celko does not want you to always use a surrogate key; rather, he wants you to use natural keys where appropriate.
If you read on to section 1.2.5 (p14-15) and follow his rules for table names, you'll discover why table name + _ID is an unlikely occurrence:
if I cannot find an industry standard
(name), I would look for a collective
or class name... Exception: use a
singular name if the table actually
has one and only one row in it.
So, for example, if you had a table containing student data it may be called Students rather than Student but more likely to be Enrolment (or similar). And a table containing one and only one row is unlikely to need an _ID column.
I suppose there are nouns whose plural is the same as the singular, so maybe Sheep_ID is acceptable (but only in the absence of an industry standard ovine identifier, of course!)
Also consider the rule 1.3.2. (p19) Avoid Names That Change From Place to Place e.g. the same domain referred to in the Students table as ID and in other tables as student_ID. It is unlikely that there will only be one element named _ID in the entire schema!

For Table IDs I always use tablename + ID.
The reason for this is to avoid ambiguous column names in queries when it is a 1 to 1 mapping.
Sometimes I quickly write up SQL to test, like this:
Select
*
FROM table1
Inner join table2 on table1ID = table2ID
If I didn't put the table name in the ID column, the same query would throw an ambiguous-column error (forcing me to use aliases on the tables):
Select
*
FROM table1
Inner join table2 on ID = ID
Another good reason to use the table name: when testing queries to see what data exists, I often use "*" to select columns. If you do a join and SELECT *, it can be difficult to tell which ID came from which table, especially if you are returning a large number of columns from more than two tables.
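The ambiguity described above can be reproduced in a few lines. This is only a sketch that uses Python's sqlite3 module as a stand-in for whichever database you run; the plain1/plain2 tables, the val column, and the sample rows are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE plain1 (ID INTEGER PRIMARY KEY, val TEXT);
    CREATE TABLE plain2 (ID INTEGER PRIMARY KEY, val TEXT);
    CREATE TABLE table1 (table1ID INTEGER PRIMARY KEY, val TEXT);
    CREATE TABLE table2 (table2ID INTEGER PRIMARY KEY, val TEXT);
    INSERT INTO table1 VALUES (1, 'left');
    INSERT INTO table2 VALUES (1, 'right');
""")

# Both tables call the key "ID", so the unqualified join condition is rejected.
try:
    conn.execute("SELECT * FROM plain1 INNER JOIN plain2 ON ID = ID")
    ambiguous_error = None
except sqlite3.OperationalError as exc:
    ambiguous_error = str(exc)

# With table-specific key names the same unqualified style of join just works.
ok_rows = conn.execute(
    "SELECT * FROM table1 INNER JOIN table2 ON table1ID = table2ID"
).fetchall()

print(ambiguous_error)
print(ok_rows)
```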

I always advocate for globally unique TABLENAME_ID. On that note, I strongly encourage table names which fully describe their context, so there is never any ambiguity as to their application when foreign references are made.

ID as a column name is hard to maintain and in my opinion can more easily lead to mistakes in joins.
Suppose for instance you always used ID as a column name in every table.
Now suppose you need to join six of those tables. Being a typical person, you copy the first join and change the table names. If you miss one and you use id, you get a query that runs and gives the wrong answer. If you use tablenameId, you get an error (invalid column name) instead. See the following code for an example:
create table #test1 (id int identity, test varchar(10))
create table #test2 (id int identity, test varchar(10))
create table #test3 (id int identity, test varchar(10))
insert #test1
values ('hi')
insert #test1
values ('hello')
insert #test2
values ('hi there')
insert #test3
values ('hello')
insert #test3
values ('hi')
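-- Copy-paste mistake: the last ON still compares t1 with t2, so this runs but returns wrong rows
select *
from #test1 t1
join #test2 t2
on t1.id = t2.id
join #test3 t3
on t1.id = t2.id
-- Intended query
select *
from #test1 t1
join #test2 t2
on t1.id = t2.id
join #test3 t3
on t1.id = t3.id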
Drop table #test1
drop table #test2
drop table #test3
Go
create table #test1 (t1id int identity, test varchar(10))
create table #test2 (t2id int identity, test varchar(10))
create table #test3 (t3id int identity, test varchar(10))
insert #test1
values ('hi')
insert #test1
values ('hello')
insert #test2
values ('hi there')
insert #test3
values ('hello')
insert #test3
values ('hi')
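-- Intended query with table-specific id names
select *
from #test1 t1
join #test2 t2
on t1.t1id = t2.t2id
join #test3 t3
on t1.t1id = t3.t3id
-- Same copy-paste mistake: #test2 has no t3id column, so this fails instead of returning wrong rows
select *
from #test1 t1
join #test2 t2
on t1.t1id = t2.t2id
join #test3 t3
on t1.t1id = t2.t3id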
Drop table #test1
drop table #test2
drop table #test3
Another thing about using tablenameId is that when you want the actual id from several tables in a complex reporting query, you don't have to create aliases in order to see which id came from where (and to make the reporting application happy, as most of them insist on unique field names for a report).
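For anyone without SQL Server at hand, the same experiment can be sketched with Python's sqlite3 module. The named1/named2/named3 table names are assumptions standing in for the second set of temp tables; the data matches the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test1 (id INTEGER PRIMARY KEY, test TEXT);
    CREATE TABLE test2 (id INTEGER PRIMARY KEY, test TEXT);
    CREATE TABLE test3 (id INTEGER PRIMARY KEY, test TEXT);
    INSERT INTO test1 (test) VALUES ('hi'), ('hello');
    INSERT INTO test2 (test) VALUES ('hi there');
    INSERT INTO test3 (test) VALUES ('hello'), ('hi');
    CREATE TABLE named1 (t1id INTEGER PRIMARY KEY, test TEXT);
    CREATE TABLE named2 (t2id INTEGER PRIMARY KEY, test TEXT);
    CREATE TABLE named3 (t3id INTEGER PRIMARY KEY, test TEXT);
""")

# Copy-paste mistake with generic "id": the query runs and returns extra rows.
wrong_rows = conn.execute("""
    SELECT * FROM test1 t1
    JOIN test2 t2 ON t1.id = t2.id
    JOIN test3 t3 ON t1.id = t2.id
""").fetchall()

# Intended join.
right_rows = conn.execute("""
    SELECT * FROM test1 t1
    JOIN test2 t2 ON t1.id = t2.id
    JOIN test3 t3 ON t1.id = t3.id
""").fetchall()

# The same mistake with table-specific names is rejected by the database.
try:
    conn.execute("""
        SELECT * FROM named1 t1
        JOIN named2 t2 ON t1.t1id = t2.t2id
        JOIN named3 t3 ON t1.t1id = t2.t3id
    """)
    named_error = None
except sqlite3.OperationalError as exc:
    named_error = str(exc)

print(len(wrong_rows), len(right_rows), named_error)
```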

Wow, I was going to write "I always use TablenameID but everyone else in the world disagrees with me". However, it looks like everyone here agrees with me.
That is, of course, when I use a surrogate integer ID in the table. If there's a natural primary key I use that instead.

In my database:
For a foreign key ID, I use the singular version of the foreign table name + "Id". I use the capital I, lower d as it is a standard ingrained in me by FxCop.
For auto incrementing identities I often use "SequenceId"
In my data layer:
I use the name of the object + "Id", following best practice standards for "Id"

Related

How to add data manually and from other tables to a table in SQL?

I need to insert data into a relation table. I want to add three different types of data. Two ints and a string. The table looks like this:
drop table if exists Favorite cascade;
create table Favorite (
id_user int,
id_serie int,
since date,
primary key (id_user, id_serie),
foreign key (id_user) references User (id_user),
foreign key (id_serie) references Serie (id_serie)
);
Tables User and Serie have already been created and their values have already been inserted.
I want to insert into table Favorite all the data, making sure Favorite.id_user = User.id_user and also Favorite.id_serie = Serie.id_serie but I also want to add manually the since field.
How can I do this?
insert into Favorite (id_user, id_serie, since) [What should I type here?];
Would something like this work (I guess it won't)?
insert into Favorite (id_user, id_serie, since) select User.id_user, Serie.id_serie from User, Serie values ('2020/02/15');
Thanks in advance.

INSERT INTO Favorite (id_user, id_serie, since)
SELECT
User.id_user,
Serie.id_serie,
'2020/02/15'
FROM User, Serie
WHERE ...
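As a rough, runnable illustration of the statement above, here is a sketch using Python's sqlite3 module. The name and title columns, the sample rows, and the filter in the WHERE clause are all assumptions, since the question does not say how users and series should be matched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE User  (id_user  INTEGER PRIMARY KEY, name  TEXT);
    CREATE TABLE Serie (id_serie INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE Favorite (
        id_user  INTEGER REFERENCES User (id_user),
        id_serie INTEGER REFERENCES Serie (id_serie),
        since    TEXT,
        PRIMARY KEY (id_user, id_serie)
    );
    INSERT INTO User  VALUES (1, 'ana'), (2, 'ben');
    INSERT INTO Serie VALUES (10, 'Dark'), (20, 'Lost');
""")

# The two ids come from the existing tables; the date is a constant in the SELECT list.
conn.execute(
    """
    INSERT INTO Favorite (id_user, id_serie, since)
    SELECT User.id_user, Serie.id_serie, ?
    FROM User CROSS JOIN Serie
    WHERE User.name = ? AND Serie.title = ?   -- hypothetical matching rule
    """,
    ("2020-02-15", "ana", "Dark"),
)
favorites = conn.execute("SELECT id_user, id_serie, since FROM Favorite").fetchall()
print(favorites)
```

Without a WHERE clause the cross join would pair every user with every series, which is rarely what a favorites table should contain.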

Would something like this work (I guess it won't)?
Almost. The idea works, but the constant belongs in the SELECT list rather than in a separate VALUES clause.
Is this what you are looking for?
INSERT INTO table3 (column1, column2, column3, column4, ...)
SELECT table1.column1, table1.column2, table2.column3, 'constant value', ...
FROM table1, table2
WHERE condition;
Go here for more
https://www.w3schools.com/sql/sql_insert_into_select.asp

Matrix table index SQL Server 2008

I have a table with two columns built from another table of names, one identity and one a name like this:
ID---Name
1----Mike
2----Jeff
3----Robert
...down to however many
Could be 10 rows, could be 100. This will vary depending on input from other tables that are always changing, but it will never be over 160 or so.
Now, pairings of names will have some meaning and thus a decimal data type score will be associated with said pairing (how at this point doesn’t matter, just need to build it for now...numbers just illustrative). I envision a matrix kind of like this:
ID------Name------Mike-------Jeff--------Robert-------- ...out to however many
1 -------Mike-------NULL------100.1------5.4-------- ...out to however many
2 -------Jeff---------100.1------NULL-----21.23--------- ...out to however many
3 ------Robert-------5.4--------21.23-----NULL---------...out to however many
…down to however many happen to be in the first table…
Maybe this isn’t quite the most optimal way to go (yes, I know there are duplicates in the table, but I plan to structure the queries such that the duplicates are ignored), but at this point I am not aware of many viable options. After searching around, I thought maybe I wanted a pivot, but that doesn’t seem to fit what I have here because I’m leaving the names in the column and associating them as column heads for a paired score. Then I thought maybe I wanted to store a variable as the value of each row and then add them as the columns. That was no help. My latest iteration was creating a temp table as an exact copy with an identity column, then trying to select the specific name by the identity and looping through them, but I can’t even seem to grab the first name and make it a column name in addition to a row value under the name column...see below
--create a table of names with an identity column
CREATE TABLE myTable2
(
ID INT IDENTITY(1,1),
Name VARCHAR(5),
);
--add names to the table from a different table
INSERT INTO myTable2 (Name)
SELECT Name
FROM myTable1
--create a temp table with the same values
SELECT ID, Name
INTO #new
FROM myTable2
GROUP BY ID, Name
--insert name from first row as a column head
INSERT INTO myTable2 (SELECT Number FROM #new WHERE ID =1)
So, in the last bit there, "INSERT INTO", I want to copy the names, in this instance "Mike", and make it ALSO a column head in the same table where it is a row (like in my second table). I get an error message that the syntax is not correct for the statement. Why isn’t this allowed? How can I get it to do what I want? It has also been suggested, by someone who knows way more about this stuff than I do, that instead of building the table as a matrix I build it as below. It is possible to get rid of the duplicates this way, and I would, except I have no idea where to even begin doing this…
Name1-----------Name2-----------Calculated Value
Mike--------------Mike-------------NULL
Jeff---------------Mike-------------100.1
Robert-------------Mike-------------5.4
Mike--------------Jeff-------------100.1
Jeff----------------Jeff-------------NULL
Robert------------Jeff-------------21.23
Mike--------------Robert-----------5.4
Jeff---------------Robert-----------21.23
Robert------------Robert-----------NULL
...etc
Any help suggestions or pointing of me in the right and most appropriate direction would be greatly appreciated!
EDIT: Here's how I solved my problem. Looks like the Cartesian product was the way to go. Thanks @Alex Kudryashev
--create a table of cross joined names
CREATE TABLE cartNames
(
Name1 VARCHAR(5),
Name2 VARCHAR(5),
);
--create two temporary tables from a source table of names
SELECT Name AS Name1
INTO #name1
FROM names
GROUP BY Name
SELECT Name AS Name2
INTO #Name2
FROM names
GROUP BY Name
--populate the Cartesian table
INSERT INTO cartNames
SELECT * FROM #name1 CROSS JOIN #name2
--get rid of the temp tables
DROP TABLE #Name1
DROP TABLE #Name2
--add columns and populate calculated scores
---

It looks like you want to create a Cartesian product. There is a very easy way to do so.
declare @tbl table(name varchar(10))
insert @tbl(name) values('Mike'),('Jeff'),('Robert')
select t1.name name1,t2.name name2, some_udf(t1.name,t2.name) calc_value
from @tbl t1 cross join @tbl t2
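Here is a small runnable sketch of the same cross join, using Python's sqlite3 module instead of SQL Server. The pair_score function is a made-up placeholder for some_udf, and the t1.name < t2.name filter is one possible way (an assumption, not part of the answer above) to keep each unordered pair once and drop the self-pairs.

```python
import sqlite3

def pair_score(name1, name2):
    # Hypothetical placeholder for the real scoring rule.
    return float(abs(len(name1) - len(name2)))

conn = sqlite3.connect(":memory:")
conn.create_function("pair_score", 2, pair_score)  # expose the Python function to SQL
conn.execute("CREATE TABLE names (name TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO names VALUES (?)", [("Mike",), ("Jeff",), ("Robert",)]
)

# Cross join the table with itself; the filter keeps each pair only once.
pairs = conn.execute("""
    SELECT t1.name, t2.name, pair_score(t1.name, t2.name)
    FROM names AS t1
    CROSS JOIN names AS t2
    WHERE t1.name < t2.name
    ORDER BY t1.name, t2.name
""").fetchall()
print(pairs)
```

Dropping the WHERE clause gives the full matrix in long form, including the mirrored duplicates and the self-pairs.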

Relating two tables

I have created table T1 with columns (id as primary key, and name) and table T2 with columns (id as primary key, name, and t_id as a foreign key referencing T1(id)). I inserted some values taken from inputs on a Windows form. After querying SELECT * FROM T2; using isql, all the values in the foreign key column are null instead of duplicating the values in T1(id), as I expected from the relationship I created. Is there anything I have left out or need to add? The primary keys of both tables are auto-incremented.

You are confusing what auto-incremented keys do with what relationships do.
Auto-incremented keys (or, more generally, auto-incremented fields) only help when you insert a new record into the table that owns the key. When you insert a record that references a record in another table, you must specify that record yourself through the foreign key field. In your case, whoever inserts the "name" in T2 must say which T1 record that T2 row refers to.
Your confusion about the relationship is that you expect an established relationship to fill in those values automatically. A relationship only validates the values. So the field t_id in T2 will not automatically take the value of the last record in T1. But if you try to insert a value into t_id that does not exist in T1, the relationship will reject it.
So, to answer your question: what did you leave out, and what do you need to add?
You left out the part of the code that inserts the value into the t_id field of the T2 table.
Let me try to explain with a more common example.
Most commonly, the application inserts the T1 record first, and then, when the user inserts a T2 record, the application gives the user a way to choose which T1 record the T2 record references.
Suppose T1 is a publishers table and T2 is a books table. The user inserts a publisher and later, when inserting a book, chooses which publisher publishes that book.
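The publisher/book example can be sketched as follows. This uses Python's sqlite3 module rather than Firebird, so the syntax differs from what isql would accept; the table names, column names, and sample values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE publisher (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (
        id           INTEGER PRIMARY KEY,
        name         TEXT,
        publisher_id INTEGER REFERENCES publisher (id)
    );
""")

publisher_id = conn.execute(
    "INSERT INTO publisher (name) VALUES (?)", ("Acme Press",)
).lastrowid

# Leaving the foreign key out stores NULL: the relationship fills in nothing.
conn.execute("INSERT INTO book (name) VALUES (?)", ("Orphan",))

# The application has to supply the parent id itself.
conn.execute(
    "INSERT INTO book (name, publisher_id) VALUES (?, ?)", ("Linked", publisher_id)
)

# The relationship only validates: a non-existent parent id is rejected.
try:
    conn.execute("INSERT INTO book (name, publisher_id) VALUES (?, ?)", ("Bad", 999))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

books = conn.execute("SELECT name, publisher_id FROM book ORDER BY id").fetchall()
print(books, rejected)
```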

The ID field of Customers can be made auto-incrementing with a BEFORE INSERT trigger on the CUSTOMERS table. For example:
CREATE TRIGGER nametrigger FOR nametable
ACTIVE BEFORE INSERT POSITION 0
AS
BEGIN
IF (NEW.ID IS NULL) THEN BEGIN
NEW.ID = GEN_ID(GEN_PK_ID, 1);
END
END
Now insert a new record into Customers:
INSERT INTO Customers (CustomerName, ContactName, Address, City, PostalCode, Country)
VALUES ('Cardinal','Tom B. Erichsen','Skagen 21','Stavanger','4006','Norway');
Then ID automatically receives the next sequential number, from 1 up to the largest value of the type (SMALLINT, INTEGER, or BIGINT) you chose in CREATE TABLE. Note that the ID field is not included in the field list or in VALUES, because the trigger fills it in.
Now you can use the dataset object options to link the MASTER and DETAIL tables (see the Delphi help),
or in SQL you can use parameters.
After inserting the record in the MASTER table, try:
INSERT INTO xTable2 (IDcustomersField, ..., ..., ...., ....)
VALUES ( :IDcustomersField, ..., ..., ...., ....);
xTable2 can have its own auto-incrementing ID field (primary key) too. This helps when deleting or updating records in that table.
You can then supply the value for :IDcustomersField in the detail table using
xQuery.Params[0].Value or xQuery.ParamByName('IDcustomersField').Value (here I'm using a Query object as an example).
You can also set the value for IDcustomersField in code through a DataSource.
You can also use
events (triggers) in SQL
or
stored procedures in SQL.
Don't forget:
you have to create the relationship between the two tables (referential integrity, with a primary key in the master table), and both key fields must be NOT NULL.
I hope you can follow my explanation despite my poor English.

You need to insert the values for t_id manually, after you get the ID's value from the main table T1.
Depending on your logic in the database, you can also use a trigger or a stored procedure. Can you tell us what values you expect to have in the NAME field in T2 after the insert? Are they duplicates from T1 or independent of T1?
If T1.NAME=T2.NAME, you can automate the process with a trigger
CREATE OR ALTER TRIGGER TR_T1_AI0 FOR T1
ACTIVE AFTER INSERT POSITION 0
AS
BEGIN
INSERT INTO T2(NAME, T_ID)
VALUES (NEW.NAME, NEW.ID);
END
If T2.NAME's value is different from T1.NAME, you can use a stored procedure that takes both names as parameters:
CREATE OR ALTER PROCEDURE XXXX(
P_NAME_T1 TYPE OF T1.NAME,
P_NAME_T2 TYPE OF T2.NAME)
AS
DECLARE VARIABLE L_ID TYPE OF T1.ID;
BEGIN
INSERT INTO T1(NAME)
VALUES (:p_NAME_T1)
RETURNING ID INTO :L_ID;
INSERT INTO T2(NAME, T_ID)
VALUES (:P_NAME_T2, :l_ID);
END
You can use both statements from the stored procedure directly in your program if it supports the returning syntax. If not, you need an additional query with SELECT NEXT VALUE FOR GENERATOR_FOR_T1 FROM RDB$DATABASE; and use the value returned from it in both INSERT statements.
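If you issue the two statements from application code, the flow looks roughly like this. The sketch below uses Python's sqlite3 module and its lastrowid attribute in place of Firebird's RETURNING clause, so treat it as an outline of the idea rather than Firebird code; add_pair and the sample names are made up.

```python
import sqlite3

def add_pair(conn, name_t1, name_t2):
    """Insert one T1 row and one T2 row that references it; return the new T1 id."""
    with conn:  # commit both inserts together, or roll both back on error
        new_id = conn.execute(
            "INSERT INTO T1 (NAME) VALUES (?)", (name_t1,)
        ).lastrowid
        conn.execute(
            "INSERT INTO T2 (NAME, T_ID) VALUES (?, ?)", (name_t2, new_id)
        )
    return new_id

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (ID INTEGER PRIMARY KEY, NAME TEXT);
    CREATE TABLE T2 (ID INTEGER PRIMARY KEY, NAME TEXT,
                     T_ID INTEGER REFERENCES T1 (ID));
""")

first = add_pair(conn, "parent A", "child A")
second = add_pair(conn, "parent B", "child B")
links = conn.execute("SELECT NAME, T_ID FROM T2 ORDER BY ID").fetchall()
print(first, second, links)
```

Wrapping both inserts in one transaction means a failure on the second insert does not leave a T1 row without its T2 row.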

using insert into to append to existing table in a remote database

My goal is to select items from a table and append those items to another table located in a different database on the same server. All columns in both tables match up and are identical.
In this case, I have the T-SQL:
INSERT INTO db1.dbo.tblitems
SELECT *
FROM db2.dbo.tblitems i2
WHERE i2 = 'import'
I get an error saying:
An explicit value for the identity column in table 'db1.dbo.tblitems' can only be specified when a column list is used and IDENTITY_INSERT is ON.
Any ideas why this doesn't work?
Thanks in advance.

Sounds like there is an identity column in the table. An identity column is a column that is made up of values generated by the database. For example:
create table #TestTable (id int identity, name varchar(50))
insert into #TestTable select 1, 'Will Smith'
This gives the identity column error. You can avoid that in two ways: the first is not to insert the identity column, like:
insert into #TestTable (name) select 'Will Smith'
The second is to use set identity_insert (requires owning the table or having ALTER permission on it):
set identity_insert #TestTable on
insert into #TestTable (id, name) select 1, 'Will Smith'
set identity_insert #TestTable off
In both cases, you have to specify the column list.

I agree with Andomar but a further consideration...
Have you considered the effects of merging these two data sets?
Say I had two identical tables in two databases with this data:
Id Name
1 Bill
2 Bob
3 Bert
Id Name
3 Jenny
4 Joan
5 Jackie
Option 1 of Andomar's would give the girls new IDs. If that ID has been used as a primary key in the table and other tables referenced it as a foreign key then this will break the referential integrity (you will have records pointing to the wrong place).
Option 2 would fall over if there is a unique index on the ID column, which is quite likely if it is being used as a key. This is because the two ID values for Bert and Jenny are not unique.
So while Andomar is right in that it will fix the identity insert problem, it doesn't address the issue of why there were identity columns in the first place.
p.s. if this is an issue ask for a solution in a new question.
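Both outcomes can be seen in a quick sketch. It uses Python's sqlite3 module, where an INTEGER PRIMARY KEY column plays the role of the identity column, so the error differs from SQL Server's; the target and source table names are assumptions.

```python
import sqlite3

def fresh_db():
    """Build the two sample tables from the example above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE target (Id INTEGER PRIMARY KEY, Name TEXT);
        CREATE TABLE source (Id INTEGER PRIMARY KEY, Name TEXT);
        INSERT INTO target VALUES (1, 'Bill'), (2, 'Bob'), (3, 'Bert');
        INSERT INTO source VALUES (3, 'Jenny'), (4, 'Joan'), (5, 'Jackie');
    """)
    return conn

# Option 1: leave the key out. The rows arrive, but with new ids.
db1 = fresh_db()
db1.execute("INSERT INTO target (Name) SELECT Name FROM source ORDER BY Id")
renumbered = db1.execute(
    "SELECT Id, Name FROM target WHERE Id > 3 ORDER BY Id"
).fetchall()

# Option 2: keep the original ids. Id 3 already belongs to Bert, so the insert fails.
db2 = fresh_db()
try:
    db2.execute("INSERT INTO target (Id, Name) SELECT Id, Name FROM source")
    collided = False
except sqlite3.IntegrityError:
    collided = True

print(renumbered, collided)
```

Jenny was id 3 in the source and ends up as id 4, so any child row that still says 3 would now point at Bert.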

This might be an issue of permissions. As the server the query is running on cannot determine if the connected user has the permission to insert the data into the destination server/table, it just might not be possible.

Stuck trying to migrate two tables from one DB to another DB

I'm trying to migrate some data from two tables in an OLD database to a NEW database.
The problem is that I wish to generate new primary keys in the new database for the first table that is getting imported. That's simple.
But the second table in the old database has a foreign key dependency on the first table. So when I migrate the old data from the second table, the foreign keys don't match any more.
Are there any tricks/best practices involved to help me migrate the data?
Serious note: I cannot change the current schema of the new tables, which do not have any 'old id' column.
Let's use the following table schema:
Old Table1              New Table1
ParentId INT PK         ParentId INT PK
Name VARCHAR(50)        Name VARCHAR(50)
Old Table2              New Table2
ChildId INT PK          ChildId INT PK
ParentId INT FK         ParentId INT FK
Foo VARCHAR(50)         Foo VARCHAR(50)
So the table schemas are identical.
Thoughts?
EDIT:
For those who are asking, the RDBMS is SQL Server 2008. I didn't specify the software because I was hoping I would get an agnostic answer with some generic T-SQL :P

I think you need to do this in 2 steps.
You need to import the old tables and keep the old ids (and generate new ones). Then, once they're in the new database with both new and old ids, you can use the old ids to associate the new ids, and then drop the old ids.
You can do this by importing into temporary (i.e. they will be thrown away) tables, then inserting into the permanent tables, leaving out the old ids.
Or import directly into the new tables (with the schema modified to also hold old ids), then drop the old ids when they're no longer necessary.
EDIT:
OK, I'm a bit clearer on what you're looking for thanks to comments here and on other answers. I knocked this up, I think it'll do what you want.
Basically, without cursors, it steps through the parent table row by row and inserts the new parent row and all the child rows for that parent row, keeping the new ids in sync.
I tried it out and it should work; it doesn't need exclusive access to the tables and should be orders of magnitude faster than a cursor.
declare @oldId as int
declare @newId as int
select @oldId = Min(ParentId) from OldTable1
while not @oldId is null
begin
Insert Into NewTable1 (Name)
Select Name from OldTable1 where ParentId = @oldId
Select @newId = SCOPE_IDENTITY()
Insert Into NewTable2 (ParentId, Foo)
Select @newId, Foo From OldTable2 Where ParentId = @oldId
select @oldId = Min(ParentId) from OldTable1 where ParentId > @oldId
end
Hope this helps,

Well, I guess you'll have to determine other criteria to create a map like oldPK => newPK (for example: is the Name field equal?).
Then you can determine the new PK that matches the old PK and adjust the ParentID accordingly.
You may also do a little trick: Add a new column to the original Table1 which stores the new PK value for a copied record. Then you can easily copy the values of Table2 pointing them to the value of the new column instead of the old PK.
EDIT: I'm trying to provide some sample code of what I meant by my little trick. I'm not altering the original database structure, but I'm using a temporary table now.
OK, you might try the following:
1) Create temporary table that holds the values of the old table, plus, it gets a new PK:
CREATE TABLE #tempTable1
(
newPKField INT,
oldPKField INT,
Name VARCHAR(50)
)
2) Insert all the values from your old table into the temporary table calculating a new PK, copying the old PK:
INSERT INTO #tempTable1
SELECT
newPKValueHere AS newPKField,
ParentID as oldPKField,
Name
FROM
Table1
3) Copy the values to the new table
INSERT INTO NewTable1
SELECT
newPKField as ParentId,
Name
FROM
#tempTable1
4) Copy the values from Table2 to NewTable2
INSERT INTO NewTable2
SELECT
ChildID,
t.newPKField AS ParentId,
Foo
FROM
Table2
INNER JOIN #tempTable1 t ON t.oldPKField = Table2.ParentId
This should do. Please note that this is only pseudo T-SQL Code - I have not tested this on a real database! However, it should come close to what you need.

Can you change the schema of the old tables? If so, you could put a "new id" column on the old tables, and use that as the reference.
You might have to do a row by row insert on the new table and then retrieve the scope_identity, store it in the old table1. But for table2, you can then join to the old table1 and grab the new_id.

First of all - can you not even have some temporary schema that you can later drop?! That would make life easier. Assuming you can't:
If you're lucky (and if you can guarantee that no other inserts will be happening at the same time) then when you insert the Table1's data into your new table you could perhaps cheat by relying on the sequential order of the inserts.
You could then create a view that joins the 2 tables on a row-count so that you have a way to correlate the keys to each other. That way you'd be one step closer to being able to identify the 'ParentId' for the new Table2.

I'm not sure from your question what database software you're using, but if temporary tables are an option, create a temporary table containing the original primary key of table1 and the new primary key of table1. Then create another temporary table with a copy of table2, update the copy using the "old key, new key" table you created earlier, then use "insert into select from" (or whatever the appropriate command is for your database) to copy the revised temporary table into its permanent location.
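A minimal sketch of the "old key, new key" mapping idea, written with Python's sqlite3 module rather than T-SQL. The KeyMap temp table, the sample rows, and the pre-existing 'already here' row are assumptions added to show that the new ids really differ from the old ones; the permanent tables keep the schema from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE OldTable1 (ParentId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE OldTable2 (ChildId INTEGER PRIMARY KEY, ParentId INTEGER, Foo TEXT);
    CREATE TABLE NewTable1 (ParentId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE NewTable2 (ChildId INTEGER PRIMARY KEY,
                            ParentId INTEGER REFERENCES NewTable1 (ParentId),
                            Foo TEXT);
    -- Throwaway mapping table; the permanent schema is left untouched.
    CREATE TEMP TABLE KeyMap (OldId INTEGER PRIMARY KEY, NewId INTEGER NOT NULL);
    INSERT INTO OldTable1 VALUES (10, 'a'), (20, 'b');
    INSERT INTO OldTable2 VALUES (1, 10, 'x'), (2, 20, 'y'), (3, 10, 'z');
    INSERT INTO NewTable1 (Name) VALUES ('already here');
""")

with conn:  # one transaction for the whole migration
    parents = conn.execute(
        "SELECT ParentId, Name FROM OldTable1 ORDER BY ParentId"
    ).fetchall()
    for old_id, name in parents:
        # Insert the parent, then remember which new id replaced the old one.
        new_id = conn.execute(
            "INSERT INTO NewTable1 (Name) VALUES (?)", (name,)
        ).lastrowid
        conn.execute("INSERT INTO KeyMap (OldId, NewId) VALUES (?, ?)", (old_id, new_id))
    # Children are copied in one statement, translated through the map.
    conn.execute("""
        INSERT INTO NewTable2 (ParentId, Foo)
        SELECT m.NewId, c.Foo
        FROM OldTable2 AS c
        JOIN KeyMap AS m ON m.OldId = c.ParentId
        ORDER BY c.ChildId
    """)

migrated = conn.execute("""
    SELECT p.Name, c.Foo
    FROM NewTable2 AS c
    JOIN NewTable1 AS p ON p.ParentId = c.ParentId
    ORDER BY c.ChildId
""").fetchall()
print(migrated)
```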

I had the wonderful opportunity to be deep in migration scripts last summer. I was using Oracle's PL/SQL for the task. But you did not mention which technology you are using. What are you migrating the data into? SQL Server? Oracle? MySQL?
The approach is to INSERT a row from table1 RETURNING the new primary key generated (probably by a SEQUENCE [in Oracle]) and then INSERT the dependent records from table2, changing their foreign key value to the value returned by the first INSERT. I can't help you any further unless you specify which DBMS you are migrating data into.

The following Pseudo-ish code should work for you
CREATE TABLE newtable1
ParentId INT PK
OldId INT
Name VARCHAR(50)
CREATE TABLE newtable2
ChildId INT pk
ParentId INT FK
OldParent INT
Foo VARCHAR(50)
INSERT INTO newtable1(OldId, Name)
SELECT ParentId, Name FROM oldtable1
INSERT INTO newtable2(OldParent, Foo)
SELECT ParentId, Foo FROM oldtable2
UPDATE newtable2 SET ParentId = (
SELECT n.ParentId
FROM newtable1 AS n
WHERE n.OldId = newtable2.oldParent
)
ALTER TABLE newtable1 DROP COLUMN OldId
ALTER TABLE newtable2 DROP COLUMN OldParent
