What is difference between primary key and identity? - sql

What is its use, when both identifies the unique row?
Why people are using identity column as a primary key ?
Can anyone briefly describe the answer ?

A primary key is a logical concept - it is the means by which you will uniquely identify each record in a table. There are several types of primary key - a natural key uses a data attribute from the business domain which is guaranteed to have the requirements for a primary key (unique, not null, immutability) such as a social security number, a compound key is a key made up of multiple columns (often used in "parent-child" relationships), and a surrogate key is created by the system; it could be an auto-increment, or identity column.
Identity is a data type. It is very useful for use as a surrogate primary key, because it has all the attributes required. It's unlikely you'd use the identity type for purposes other than as a primary key, but there's nothing to stop you from doing so.
So, not all primary keys use the identity data type, and not all identity columns are primary keys.

Primary key is a kind of unique key. It's, in fact, a restriction (constraint) that values for a specific column (or, in general case, set of columns) cannot be the same in different rows (even when manually/explicitly setting to same values with an insert/update).
Primary/unique key isn't required to be an auto-incremented. It's, in fact, isn't required to be integer at all — it can be text or other type.
Primary key is a bit stricter than usual unique key in that it usually implies NOT NULL and, in additional to that, only one primary key is allowed per table (while several unique keys per table are allowed in addition to primary key).
Creating primary/unique key usually implicitly creates an index to make search and constraint-checking by that column(s) faster.
E.g. if column my_column of my_table is marked as primary or unique key, you can't do this:
INSERT INTO my_table (my_column, other_column, third_column)
VALUES (10, …, …);
INSERT INTO my_table (my_column, other_column, third_column)
VALUES (10, …, …); -- the same value for my_column again
Identity in your RDBMS is what other RDBMSes may call auto_increment or serial. It's just a feature that during an row-insert operation a specific column, when not being explicitly set to some value, is automatically initialized to (most often) consecutive integer values.
E.g. if column my_column of my_table is marked as auto_increment/serial/identity, you can do this:
INSERT INTO my_table (other_column, third_column) VALUES (…, …);
-- not specifying any value for my_column manually,
-- it'll be initialized automatically to some value
-- (usually an increasing integer sequence)
Auto_increment/serial/identity usually doesn't guarantee strict consequentiality of automatic values (especially in case of aborted transactions).
Concretely documentation for TRANSACT-SQL says that identity doesn't guarantee:
uniqueness (use unique/primary keys to enforce that);
strict consequentiality.
Update: As a_horse_with_no_name suggested, "identity" appears to be not only a name of the common auto_increment/serial/identity feature within specific RDBMSes (e.g. Microsoft SQL Server), but also a name defined by ANSI SQL standard.
AFAIK, it doesn't differ very much from what I described above (about the common auto_increment/serial/identity feature in implementations). I mean that it makes column values to be automatically initialized with an integer sequence, but doesn't guarantee uniqueness and strict consequentiality.
Still, I suppose that, unlike auto_increment/serial columns in MySQL/PostgreSQL, an ANSI-SQL-standard generated always as identity column doesn't allow its values to be set manually in INSERT or UPDATE (only automatically).

In a database table, every row should be unique, and you need to be able to identify a particular row uniquely.
A given table may have one or more columns which have unique values, so any of these columns can do the job. There may also be two or more columns which, while not unique of themselves, form a unique combination. That will also do.
Any column, or combination of columns, which can uniquely identify a row is called a candidate key. In principle, you can choose any key that you like, but you need to ensure that uniqueness is enduring. For example, in a small table of persons, the given name may be unique, but you run the risk of blowing that with the next additional person.
A primary key is the candidate key you nominate as your preferred
key. For example, a table of persons may have a number of unique attributes such as email address, mobile phone number and others. The primary key is the attribute you choose in preference to the others.
The following is not strictly required, but is good practice for a good Primary Key:
A Primary Key shouldn’t change
A Primary Key shouldn’t be recycled
For this reason, a Primary Key shouldn’t have any real meaning, so there should never be a reason to change or reuse it. As a result, the primary key is often an arbitrary code whose only real meaning is that it identifies the row. If the key is purely used for identification and has no other meaning, it is often referred to as a Surrogate Key.
You can put some effort into generating arbitrary codes. Sometimes they follow complex patterns which can be used to check their validity.
If you want to take a lazier approach, you can use a sequence number. Contrary to my previous advice, though, it does sort of have a meaning: it is strictly sequential, so you can learn which row was added after another, but not exactly when. However, that fact won’t change — it will not change in value, and will not be reused — so it’s still pretty stable.
An identity column is, in fact, a sequence number. It is auto-generated, and very useful if you want an arbitrary code for your primary key. Unfortunately it is relatively late to the very slow moving standards, so every DBMS had its own non-standard variation:
MySQL calls it AUTO_INCREMENT
SQLite calls it AUTOINCREMENT
MSSQL calls it IDENTITY()
MSACCESS calls it AUTONUMBER
PostgreSQL calls it SERIAL
and each has its own quirks.
More recently, (2003, I believe) it has been added to the standards in the form of:
int generated by default as identity
but this has only just started to appear in PostgreSQL and Oracle. This use of IDENTITY behaves differently to Microsoft’s.

primary key required unique number and primary key value can not be null that can get by identity and we not need to manually add at each new record which is added in table.
when record is failed to insert in table for some reason that time also identity is increase in sql server.

Related

When unique value can’t be used practically as a primary key in a database

I am having trouble answering the following question...
Illustrate by an example a scenario where an attribute has unique values in the different rows, yet it can’t be used practically as a primary key in the database relations/tables.
If the suggested column that has unique values is nullable and contains null values too, it cannot be a practical primary key. Because primary keys can't be null.
A non-sequential GUID is a bad candidate for a primary key, when the data is stored ordered by that key. An insert of a new row will not be appended to the table but must be inserted in the middle, meaning that data must be moved around to make room.
That is why there may also exist sequential GUIDs .
"random" is unique but practically as a key I would not use it.

Confusing t-sql exam answer about sequence or uniqueidentifier

I found a t-sql question and its answer. It is too confusing. I could use a little help.
The question is:
You develop a database application. You create four tables. Each table stores different categories of products. You create a Primary Key field on each table.
You need to ensure that the following requirements are met:
The fields must use the minimum amount of space.
The fields must be an incrementing series of values.
The values must be unique among the four tables.
What should you do?
A. Create a ROWVERSION column.
B. Create a SEQUENCE object that uses the INTEGER data type.
C. Use the INTEGER data type along with IDENTITY
D. Use the UNIQUEIDENTIFIER data type along with NEWSEQUENTIALID()
E. Create a TIMESTAMP column.
The said answer is D. But, I think the more suitable answer is B. Because sequence will use less space than GUID and it satisfies all the requirements.
D is a wrong answer, because NEWSEQUENTIALID doesn't guarantee "an incrementing series of values" (second requirement).
NEWSEQUENTIALID()
Creates a GUID that is greater than any GUID
previously generated by this function on a specified computer since
Windows was started. After restarting Windows, the GUID can start
again from a lower range, but is still globally unique.
I'd say that B (sequence) is the correct answer. At least, you can use a sequence to fulfil all three requirements, if you don't restart/recycle it manually. I think it is the easiest way to meet all three requirements.
Between the choices provided D B is the correct answer, since it meets all requirements:
ROWVERSION is a bad choice for a primary key, as stated in MSDN:
Every time that a row with a rowversion column is modified or inserted, the incremented database rowversion value is inserted in the rowversion column. This property makes a rowversion column a poor candidate for keys, especially primary keys. Any update made to the row changes the rowversion value and, therefore, changes the key value. If the column is in a primary key, the old key value is no longer valid, and foreign keys referencing the old value are no longer valid.
TIMESTAMP is deprecated, as stated in that same page:
The timestamp syntax is deprecated. This feature will be removed in a future version of Microsoft SQL Server. Avoid using this feature in new development work, and plan to modify applications that currently use this feature.
An IDENTITY column does not guarantee uniqueness, unless all it's values are only ever generated automatically (you can use SET IDENTITY_INSERT to insert values manually), nor does it guarantee uniqueness between tables for any value.
A GUID is practically guaranteed to be unique per system, so if a guid is the primary key for all 4 tables it ensures uniqueness for all tables. the one requirement it doesn't fulfill is storage size - It's storage size is quadruple that of int (16 bytes instead of 4).
A SEQUENCE, when is not declared as recycle, guarantee uniqueness, and has the lowest storage size.
The sequence of numeric values is generated in an ascending or descending order at a defined interval and can be configured to restart (cycle) when exhausted.
However,
I would actually probably choose a different option all together - create a base table with a single identity column and link it with a 1:1 relationship with all other categories. then use an instead of insert trigger for all categories tables that will first insert a record to the base table and then use scope_identity() to get the value and insert it as the primary key for the category table.
This will enforce uniqueness as well as make it possible to use a single foreign key reference between the categories and products.
The issue has been discussed extensively in the past, in general:
http://blog.codinghorror.com/primary-keys-ids-versus-guids/
The constraint #3 is why a SEQUENCE could run into issues as there is a higher risk of collision/lowered number of possible rows in each table.

What is the difference between a primary key and a surrogate key?

I googled a lot, but I did not find the exact straight forward answer with an example.
Any example for this would be more helpful.
The primary key is a unique key in your table that you choose that best uniquely identifies a record in the table. All tables should have a primary key, because if you ever need to update or delete a record you need to know how to uniquely identify it.
A surrogate key is an artificially generated key. They're useful when your records essentially have no natural key (such as a Person table, since it's possible for two people born on the same date to have the same name, or records in a log, since it's possible for two events to happen such they they carry the same timestamp). Most often you'll see these implemented as integers in an automatically incrementing field, or as GUIDs that are generated automatically for each record. ID numbers are almost always surrogate keys.
Unlike primary keys, not all tables need surrogate keys, however. If you have a table that lists the states in America, you don't really need an ID number for them. You could use the state abbreviation as a primary key code.
The main advantage of the surrogate key is that they're easy to guarantee as unique. The main disadvantage is that they don't have any meaning. There's no meaning that "28" is Wisconsin, for example, but when you see 'WI' in the State column of your Address table, you know what state you're talking about without needing to look up which state is which in your State table.
A surrogate key is a made up value with the sole purpose of uniquely identifying a row. Usually, this is represented by an auto incrementing ID.
Example code:
CREATE TABLE Example
(
SurrogateKey INT IDENTITY(1,1) -- A surrogate key that increments automatically
)
A primary key is the identifying column or set of columns of a table. Can be surrogate key or any other unique combination of columns (for example a compound key). MUST be unique for any row and cannot be NULL.
Example code:
CREATE TABLE Example
(
PrimaryKey INT PRIMARY KEY -- A primary key is just an unique identifier
)
All keys are identifiers used as surrogates for the things they identify. E.F.Codd explained the concept of system-assigned surrogates as follows [1]:
Database users may cause the system to generate or delete a surrogate,
but they have no control over its value, nor is its value ever
displayed to them.
This is what is commonly referred to as a surrogate key. The definition is immediately problematic however because Codd was assuming that such a feature would be provided by the DBMS. DBMSs in general have no such feature. The keys are normally visible to at least some DBMS users as, for obvious reasons, they have to be. The concept of a surrogate has therefore morphed slightly in usage. The term is generally used in the data management profession to mean a key that is not exposed and used as an identifier in the business domain. Note that this is essentially unrelated to how the key is generated or how "artificial" it is perceived to be. All keys consist of symbols invented by humans or machines. The only possible significance of the term surrogate therefore relates how the key is used, not how it is created or what its values are.
[1] Extending the database relational model to capture more meaning, E.F.Codd, 1979
This is a great treatment describing the various kinds of keys:
http://www.agiledata.org/essays/keys.html
A surrogate key is typically a numeric value. Within SQL Server, Microsoft allows you to define a column with an identity property to help generate surrogate key values.
The PRIMARY KEY constraint uniquely identifies each record in a database table.
Primary keys must contain UNIQUE values.
A primary key column cannot contain NULL values.
Most tables should have a primary key, and each table can have only ONE primary key.
http://www.databasejournal.com/features/mssql/article.php/3922066/SQL-Server-Natural-Key-Verses-Surrogate-Key.htm
I think Michelle Poolet describes it in a very clear way:
A surrogate key is an artificially produced value, most often a
system-managed, incrementing counter whose values can range from 1 to
n, where n represents a table's maximum number of rows. In SQL Server,
you create a surrogate key by assigning an identity property to a
column that has a number data type.
http://sqlmag.com/business-intelligence/surrogate-key-vs-natural-key
It usually helps you use a surrogate key when you change a composite key with an identity column.

SQL: what exactly do Primary Keys and Indexes do?

I've recently started developing my first serious application which uses a SQL database, and I'm using phpMyAdmin to set up the tables. There are a couple optional "features" I can give various columns, and I'm not entirely sure what they do:
Primary Key
Index
I know what a PK is for and how to use it, but I guess my question with regards to that is why does one need one - how is it different from merely setting a column to "Unique", other than the fact that you can only have one PK? Is it just to let the programmer know that this value uniquely identifies the record? Or does it have some special properties too?
I have no idea what "Index" does - in fact, the only times I've ever seen it in use are (1) that my primary keys seem to be indexed, and (2) I heard that indexing is somehow related to performance; that you want indexed columns, but not too many. How does one decide which columns to index, and what exactly does it do?
edit: should one index colums one is likely to want to ORDER BY?
Thanks a lot,
Mala
Primary key is usually used to create a numerical 'id' for your records, and this id column is automatically incremented.
For example, if you have a books table with an id field, where the id is the primary key and is also set to auto_increment (Under 'Extra in phpmyadmin), then when you first add a book to the table, the id for that will become 1'. The next book's id would automatically be '2', and so on. Normally, every table should have at least one primary key to help identifying and finding records easily.
Indexes are used when you need to retrieve certain information from a table regularly. For example, if you have a users table, and you will need to access the email column a lot, then you can add an index on email, and this will cause queries accessing the email to be faster.
However there are also downsides for adding unnecessary indexes, so add this only on the columns that really do need to be accessed more than the others. For example, UPDATE, DELETE and INSERT queries will be a little slower the more indexes you have, as MySQL needs to store extra information for each indexed column. More info can be found at this page.
Edit: Yes, columns that need to be used in ORDER BY a lot should have indexes, as well as those used in WHERE.
The primary key is basically a unique, indexed column that acts as the "official" ID of rows in that table. Most importantly, it is generally used for foreign key relationships, i.e. if another table refers to a row in the first, it will contain a copy of that row's primary key.
Note that it's possible to have a composite primary key, i.e. one that consists of more than one column.
Indexes improve lookup times. They're usually tree-based, so that looking up a certain row via an index takes O(log(n)) time rather than scanning through the full table.
Generally, any column in a large table that is frequently used in WHERE, ORDER BY or (especially) JOIN clauses should have an index. Since the index needs to be updated for evey INSERT, UPDATE or DELETE, it slows down those operations. If you have few writes and lots of reads, then index to your hear's content. If you have both lots of writes and lots of queries that would require indexes on many columns, then you have a big problem.
The difference between a primary key and a unique key is best explained through an example.
We have a table of users:
USER_ID number
NAME varchar(30)
EMAIL varchar(50)
In that table the USER_ID is the primary key. The NAME is not unique - there are a lot of John Smiths and Muhammed Khans in the world. The EMAIL is necessarily unique, otherwise the worldwide email system wouldn't work. So we put a unique constraint on EMAIL.
Why then do we need a separate primary key? Three reasons:
the numeric key is more efficient
when used in foreign key
relationships as it takes less space
the email can change (for example
swapping provider) but the user is
still the same; rippling a change of
a primary key value throughout a schema
is always a nightmare
it is always a bad idea to use
sensitive or private information as
a foreign key
In the relational model, any column or set of columns that is guaranteed to be both present and unique in the table can be called a candidate key to the table. "Present" means "NOT NULL". It's common practice in database design to designate one of the candidate keys as the primary key, and to use references to the primary key to refer to the entire row, or to the subject matter item that the row describes.
In SQL, a PRIMARY KEY constraint amounts to a NOT NULL constraint for each primary key column, and a UNIQUE constraint for all the primary key columns taken together. In practice many primary keys turn out to be single columns.
For most DBMS products, a PRIMARY KEY constraint will also result in an index being built on the primary key columns automatically. This speeds up the systems checking activity when new entries are made for the primary key, to make sure the new value doesn't duplicate an existing value. It also speeds up lookups based on the primary key value and joins between the primary key and a foreign key that references it. How much speed up occurs depends on how the query optimizer works.
Originally, relational database designers looked for natural keys in the data as given. In recent years, the tendency has been to always create a column called ID, an integer as the first column and the primary key of every table. The autogenerate feature of the DBMS is used to ensure that this key will be unique. This tendency is documented in the "Oslo design standards". It isn't necessarily relational design, but it serves some immediate needs of the people who follow it. I do not recommend this practice, but I recognize that it is the prevalent practice.
An index is a data structure that allows for rapid access to a few rows in a table, based on a description of the columns of the table that are indexed. The index consists of copies of certain table columns, called index keys, interspersed with pointers to the table rows. The pointers are generally hidden from the DBMS users. Indexes work in tandem with the query optimizer. The user specifies in SQL what data is being sought, and the optimizer comes up with index strategies and other strategies for translating what is being sought into a stategy for finding it. There is some kind of organizing principle, such as sorting or hashing, that enables an index to be used for fast lookups, and certain other uses. This is all internal to the DBMS, once the database builder has created the index or declared the primary key.
Indexes can be built that have nothing to do with the primary key. A primary key can exist without an index, although this is generally a very bad idea.

in general, should every table in a database have an identity field to use as a PK?

I'm running into an issue with a join: getting back too many records. I added a table to the set of joins and the number of rows expanded. Usually when this happens I add a select of all the ID fields that are involved in the join. That way it's pretty obvious where the expansion is happening and I can change the ON of the join to fix it. Except in this case, the table that I added doesn't have an ID field. This is a problem. But perhaps I'm wrong.
Should every table in a database have an IDENTITY field that's used as the PK? Are there any drawbacks to having an ID field in every table? What if you're reasonably sure this table will never be used in a PK/FK relationship?
When having an identity column is not a good idea?
Surrogate vs. natural/business keys
Wikipedia Surrogate Key article
There are two concepts that are close but should not be confused: IDENTITY and PRIMARY KEY
Every table (except for the rare conditions) should have a PRIMARY KEY, that is a value or a set of values that uniquely identify a row.
See here for discussion why.
IDENTITY is a property of a column in SQL Server which means that the column will be filled automatically with incrementing values.
Due to the nature of this property, the values of this column are inherently UNIQUE.
However, no UNIQUE constraint or UNIQUE index is automatically created on IDENTITY column, and after issuing SET IDENTITY_INSERT ON it's possible to insert duplicate values into an IDENTITY column, unless it had been explicity UNIQUE constrained.
The IDENTITY column should not necessarily be a PRIMARY KEY, but most often it's used to fill the surrogate PRIMARY KEYs
It may or may not be useful in any particular case.
Therefore, the answer to your question:
The question: should every table in a database have an IDENTITY field that's used as the PK?
is this:
No. There are cases when a database table should NOT have an IDENTITY field as a PRIMARY KEY.
Three cases come into my mind when it's not the best idea to have an IDENTITY as a PRIMARY KEY:
If your PRIMARY KEY is composite (like in many-to-many link tables)
If your PRIMARY KEY is natural (like, a state code)
If your PRIMARY KEY should be unique across databases (in this case you use GUID / UUID / NEWID)
All these cases imply the following condition:
You shouldn't have IDENTITY when you care for the values of your PRIMARY KEY and explicitly insert them into your table.
Update:
Many-to-many link tables should have the pair of id's to the table they link as the composite key.
It's a natural composite key which you already have to use (and make UNIQUE), so there is no point to generate a surrogate key for this.
I don't see why would you want to reference a many-to-many link table from any other table except the tables they link, but let's assume you have such a need.
In this case, you just reference the link table by the composite key.
This query:
CREATE TABLE a (id, data)
CREATE TABLE b (id, data)
CREATE TABLE ab (a_id, b_id, PRIMARY KEY (a_id, b_id))
CREATE TABLE business_rule (id, a_id, b_id, FOREIGN KEY (a_id, b_id) REFERENCES ab)
SELECT *
FROM business_rule br
JOIN a
ON a.id = br.a_id
is much more efficient than this one:
CREATE TABLE a (id, data)
CREATE TABLE b (id, data)
CREATE TABLE ab (id, a_id, b_id, PRIMARY KEY (id), UNIQUE KEY (a_id, b_id))
CREATE TABLE business_rule (id, ab_id, FOREIGN KEY (ab_id) REFERENCES ab)
SELECT *
FROM business_rule br
JOIN a_to_b ab
ON br.ab_id = ab.id
JOIN a
ON a.id = ab.a_id
, for obvious reasons.
Almost always yes. I generally default to including an identity field unless there's a compelling reason not to. I rarely encounter such reasons, and the cost of the identity field is minimal, so generally I include.
Only thing I can think of off the top of my head where I didn't was a highly specialized database that was being used more as a datastore than a relational database where the DBMS was being used for nearly every feature except significant relational modelling. (It was a high volume, high turnover data buffer thing.)
I'm a firm believer that natural keys are often far worse than artificial keys because you often have no control over whether they will change which can cause horrendous data integrity or performance problems.
However, there are some (very few) natural keys that make sense without being an identity field (two-letter state abbreviation comes to mind, it is extremely rare for these official type abbreviations to change.)
Any table which is a join table to model a many to many relationship probably also does not need an additional identity field. Making the two key fields together the primary key will work just fine.
Other than that I would, in general, add an identity field to most other tables unless given a compelling reason in that particular case not to. It is a bad practice to fail to create a primary key on a table or if you are using surrogate keys to fail to place a unique index on the other fields needed to guarantee uniqueness where possible (unless you really enjoy resolving duplicates).
Every table should have some set of field(s) that uniquely identify it. Whether or not there is a numeric identifier field separate from the data fields will depend on the domain you are attempting to model. Not all data easily falls into the 'single numeric id' paradigm, and as such it would be inappropriate to force it. Given that, a lot of data does easily fit in this paradigm and as such would call for such an identifier. There is no one answer to always do X in any programming environment, and this is another example.
If you have modelled, designed, normalised etc, then you will have no identity columns.
You will have identified natural and candidate keys for your tables.
You may decide on a surrogate key because of the physical architecture (eg narrow, numeric, strictly monotonically increasing), say, because using a nvarchar(100) column is not a good idea (still need unique constraint).
Or because of ideology: they appeal to OO developers I've found.
Ok, assume ID columns. As your db gets more complex, say several layers, how can you jon parent and grand-.child tables directly. You can't: you always need intermediate tables and well indexed PK-FL columns. With a composite key, it's all there for you...
Don't get me wrong: I use them. But I know why I use them...
Edit:
I'd be interested to collate "always ID"+"no stored procs" matches on one hand, with "use stored procs"+"IDs when they benefit" on the other...
No. Whenever you have a table with an artificial identity column, you also need to identify the natural primary key for the table and ensure that there is a unique constraint on that set of columns too so that you don't get two rows that are identical apart from the meaningless identity column by accident.
Adding an identity column is not cost free. There is an overhead in adding an unnecessary identity column to a table - typically 4 bytes per row of storage for the identity value, plus a whole extra index (which will probably weigh in at 8-12 bytes per row plus overhead). It also takes slightly to work out the most cost-effective query plan because there is an extra index per table. Granted, if the table is small and the machine is big, this overhead is not critical - but for the biggest systems, it matters.
Yes, for the vast majority of cases.
Edge cases or exceptions might be things like:
two-way join tables to model m:n relationships
temporary tables used for bulk-inserting huge amounts of data
But other than that, I think there is no good reason against having a primary key to uniquely identify each row in a table, and in my opinion, using an IDENTITY field is one of the best choices (I prefer surrogate keys over natural keys - they're more reliable, stable, never changing etc.).
Marc
I can't think of any drawback about having an ID field in each table. Providing your the type of your ID field provides enough space for your table to grow.
However, you don't necessarily need a single field to ensure the identity of your rows.
So no, a single ID field is not mandatory.
Primary and Foreign Keys can consist not only of one field, but of multiple fields. This is typical for tables implementing a N-N relationship.
You can perfectly have PRIMARY KEY (fa, fb) on your table:
CREATE TABLE t(fa INT , fb INT);
ALTER TABLE t ADD PRIMARY KEY(fa , fb);
Recognize the distinction between an Identity field and a key... Every table should have a key, to eliminate the data corruption of inadvertently entering multiple rows that represent the same 'entity'. If the only key a table has is a meaningless surrogate key, then this function is effectively missing.
otoh, No table 'needs' an identity, and certainly not every table benefits from one... Examples are: A table with a short and functional key, a table which does not have any other table referencing it through a foreign Key, or a table which is in a one to zero-or-one relationship with another table... none of these need an Identity
I'd say, if you can find a simple, natural key in your table (i.e. one column), use that as a key instead of an identity column.
I generally give every table some kind of unique identifier, whether it is natural or generated, because then I am guaranteed that every row is uniquely identified somehow.
Personally, I avoid IDENTITY (incrementing identity columns, like 1, 2, 3, 4) columns like the plague. They cause a lot of hassle, especially if you delete rows from that table. I use generated uniqueidentifiers instead if there is no natural key in the table.
Anyway, no idea if this is the accepted practice, just seems right to me. YMMV.