SQL Server: Using binary_checksum column on a table best practice

Looking at table definitions in a SQL Server database, I have noticed that (1) the binary_checksum column sometimes includes the primary key my_table_id and sometimes does not. What is the best practice?
(2) Also, what about update_by and update_timestamp: should they be included or not?
CREATE TABLE [dbo].[my_table] (
[my_table_id] SMALLINT NOT NULL PRIMARY KEY,
[a] SMALLINT NOT NULL,
[b] CHAR(25) NOT NULL,
[update_timestamp] DATETIME NOT NULL DEFAULT getdate(),
[update_by] CHAR(8) NOT NULL,
[my_checksum_col] AS (binary_checksum([a], [b], [update_by], [update_timestamp]))
)
VS
CREATE TABLE [dbo].[my_table] (
[my_table_id] SMALLINT NOT NULL PRIMARY KEY,
[a] SMALLINT NOT NULL,
[b] CHAR(25) NOT NULL,
[update_timestamp] DATETIME NOT NULL DEFAULT getdate(),
[update_by] CHAR(8) NOT NULL,
[my_checksum_col] AS (binary_checksum([my_table_id], [a], [b], [update_by], [update_timestamp]))
)

This may be a matter of opinion, but it depends on how the checksum is going to be used. If the primary key is auto-generated (such as an identity or newid() column), then including it in the checksum is not very useful: every row then has a distinct input, so at the very least you can't use the checksum to find duplicate rows.
If the primary key is a data key provided externally, then it is functioning as both data and as a primary key. In that case, including it in the checksum makes more sense.
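As a rough sketch of the duplicate-finding use mentioned above, the query below groups rows by a checksum computed over the data columns only. It assumes the checksum leaves out my_table_id as well as update_by and update_timestamp, since two rows with the same data but different audit values would otherwise hash differently. The column name data_checksum is hypothetical and not part of the original table. Matching checksums only flag candidates, because binary_checksum can return the same value for different inputs, so the sketch groups on the real columns too.

```sql
-- Hypothetical computed column over the data columns only (not in the original table).
ALTER TABLE dbo.my_table
    ADD data_checksum AS (BINARY_CHECKSUM([a], [b]));
GO  -- batch separator understood by SSMS and sqlcmd

-- Rows that share a checksum are duplicate candidates; grouping on [a] and [b]
-- as well confirms that the underlying values match and are not a collision.
SELECT data_checksum, [a], [b], COUNT(*) AS row_count
FROM dbo.my_table
GROUP BY data_checksum, [a], [b]
HAVING COUNT(*) > 1;
```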

Related

ORA-00907 (missing right parenthesis)

I'm trying to run this in my database, but for some reason I keep getting a "missing right parenthesis" error. Thoughts?
CREATE TABLE PET_OWNER
(
OwnerID Int NOT NULL IDENTITY(1, 1),
OwnerLastName Char(25) NOT NULL,
OwnerFirstName Char(25) NOT NULL,
OwnerPhone Char(12) NULL,
OwnerEmail VarChar(100) NULL
CONSTRAINT OWNER_PK PRIMARY KEY(OwnerID)
);
EShirvana's answer directly answers the question you asked. However, I would suggest:
CREATE TABLE PET_OWNER (
OwnerID Int generated always as identity primary key,
OwnerLastName varchar2(25) NOT NULL,
OwnerFirstName varchar2(25) NOT NULL,
OwnerPhone varchar2(12),
OwnerEmail varchar2(100)
);
The differences:
The primary key constraint can be inlined. To be honest, I don't generally see much use for naming the primary key as a separate constraint (no harm).
Declaring a primary key column as NOT NULL is redundant. That is part of being a primary key.
Do you know what the char() data type does? It pads strings with spaces so they match the length. Use variable length strings -- and Oracle recommends varchar2() for this purpose.
By default, columns are NULLable. I actually find that explicitly declaring NULL is harder to read because I have to distinguish between NOT NULL and NULL which requires more work than just seeing NOT NULL.
Two issues: Oracle does not accept the IDENTITY(1, 1) syntax, and there is a missing comma before the constraint:
CREATE TABLE PET_OWNER(
OwnerID Int GENERATED ALWAYS AS IDENTITY NOT NULL,
OwnerLastName Char(25) NOT NULL,
OwnerFirstName Char(25) NOT NULL,
OwnerPhone Char(12) NULL,
OwnerEmail VarChar(100) NULL,
CONSTRAINT OWNER_PK PRIMARY KEY(OwnerID)
);

SQL on Azure - using a computed column as a primary key index

I am not sure what is wrong with the below SQL.
I used to have a primary key based off of the customer_reference_no.
They now have some duplicates so I am creating a column called uniquePoint that is a combination of customer_no, customer_reference_no and stop_zip.
The below works fine:
CREATE TABLE [dbo].[stop_address_details] (
[customer_no] NCHAR (5) NOT NULL,
[customer_reference_no] VARCHAR (20) NOT NULL,
[stop_name] VARCHAR (40) NOT NULL,
[stop_address] VARCHAR (40) NULL,
[stop_city] VARCHAR (30) NULL,
[stop_state] CHAR (2) NULL,
[stop_zip] VARCHAR (10) NULL,
[point_no] VARCHAR (20) NULL,
[branch_id] VARCHAR (6) NULL,
[delivery_route] VARCHAR (10) NULL,
[dateTimeCreated] DATETIME NULL,
[dateTimeUpdated] DATETIME NULL,
[estimated_delivery_time] TIME (0) NULL,
[est_del_time] DATETIME NULL,
[dateTimeLastUsedInDatatrac] DATETIME NULL,
[uniquePoint] as customer_no + '_' + customer_reference_no + '_' + stop_zip PERSISTED ,
CONSTRAINT [AK_stop_address_details_customer_reference_no] UNIQUE NONCLUSTERED ([customer_reference_no] ASC),
CONSTRAINT [PK_stop_address_details] PRIMARY KEY ([uniquePoint])
);
But when I remove the constraint for customer_reference_no I get the following error:
SQL71516 :: The referenced table '[dbo].[stop_address_details]' contains no primary or candidate keys that match the referencing column list in the foreign key. If the referenced column is a computed column, it should be persisted.
I am referencing the computed column and it is persisted.
Not sure what is missing?
Thank you,
Joe
The answer appears to be that I have another table that is referencing this table with a foreign key:
CREATE TABLE [dbo].[rep_assigned_stop_matrix] (
[customer_reference_no] VARCHAR (20) NOT NULL,
[rep_id] INT NULL,
[dateTimeCreated] DATETIME NULL,
[sendSMS] BIT NULL,
[sendEmail] BIT NULL,
[id] INT IDENTITY (1, 1) NOT NULL,
CONSTRAINT [PK_rep_assigned_stop_matrix] PRIMARY KEY CLUSTERED ([id] ASC),
CONSTRAINT [AK_rep_assigned_stop_matrix_Column] UNIQUE NONCLUSTERED ([customer_reference_no] ASC, [rep_id] ASC),
CONSTRAINT [FK_pod_update_lookup_rep_info] FOREIGN KEY ([rep_id]) REFERENCES [dbo].[rep_info] ([id]) ON DELETE CASCADE,
CONSTRAINT [FK_lookup_Stop_Details] FOREIGN KEY ([customer_reference_no]) REFERENCES [dbo].[stop_address_details] ([customer_reference_no])
);
When this last constraint was removed, the error went away. What I don't understand is why the error message was not a bit clearer (meaning naming the rep_assigned_stop_matrix table) -- or am I still missing something?
Joe
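It seems that your '[dbo].[stop_address_details]' is still being referenced through the customer_reference_no column. A foreign key has to point at a primary key or unique constraint, so once the unique constraint on customer_reference_no is gone, nothing matches the referencing column. Try removing the foreign key and re-adding it using the new column name.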
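One way to see which tables still point at stop_address_details is to query the catalog views in a deployed copy of the database. This is only a sketch and assumes such a deployed database exists; SQL71516 appears to be raised when the database project is built, so the query helps locate the referencing constraint rather than fix the error by itself.

```sql
-- List every foreign key that references dbo.stop_address_details,
-- together with the referencing table and the columns involved.
SELECT fk.name AS foreign_key_name,
       OBJECT_SCHEMA_NAME(fk.parent_object_id) AS referencing_schema,
       OBJECT_NAME(fk.parent_object_id) AS referencing_table,
       COL_NAME(fkc.parent_object_id, fkc.parent_column_id) AS referencing_column,
       COL_NAME(fkc.referenced_object_id, fkc.referenced_column_id) AS referenced_column
FROM sys.foreign_keys AS fk
JOIN sys.foreign_key_columns AS fkc
  ON fkc.constraint_object_id = fk.object_id
WHERE fk.referenced_object_id = OBJECT_ID(N'dbo.stop_address_details');
```

With the tables shown in the question, this should return FK_lookup_Stop_Details from rep_assigned_stop_matrix.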

Use a common table with many to many relationship

I have two SQL tables: Job and Employee. I need to compare Job language proficiencies with Employee language proficiencies. A language proficiency is composed of a language and a language level.
create table dbo.EmployeeLanguageProficiency (
EmployeeId int not null,
LanguageProficiencyId int not null,
constraint PK_ELP primary key clustered (EmployeeId, LanguageProficiencyId)
)
create table dbo.JobLanguageProficiency (
JobId int not null,
LanguageProficiencyId int not null,
constraint PK_JLP primary key clustered (JobId, LanguageProficiencyId)
)
create table dbo.LanguageProficiency (
Id int identity not null
constraint PK_LanguageProficiency_Id primary key clustered (Id),
LanguageCode nvarchar (4) not null,
LanguageLevelId int not null,
constraint UQ_LP unique (LanguageCode, LanguageLevelId)
)
create table dbo.LanguageLevel (
Id int identity not null
constraint PK_LanguageLevel_Id primary key clustered (Id),
Name nvarchar (80) not null
constraint UQ_LanguageLevel_Name unique (Name)
)
create table dbo.[Language]
(
Code nvarchar (4) not null
constraint PK_Language_Code primary key clustered (Code),
Name nvarchar (80) not null
)
My question is about the LanguageProficiency table. I added an Id as PK, but I am not sure this is the best option.
What do you think about this scheme?
Your constraint of EmployeeId, LanguageProficiencyId allows an employee to have more than one proficiency per language. This sounds counterintuitive.
This would be cleaner, as it allows only one entry per language:
create table dbo.EmployeeLanguageProficiency (
EmployeeId int not null,
LanguageId int not null,
LanguageLevelId int not null,
constraint PK_ELP primary key clustered (EmployeeId, LanguageId)
)
I don't see the point of table LanguageProficiency at the moment.
The same applies to the Job, of course, unless you would like to allow a "range" of proficiencies. But assuming that "too high a proficiency" does not hurt, it can easily be defined through a >= condition in your queries.
Rgds
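The >= comparison could look roughly like the query below. It is a sketch under two assumptions that are not stated in the question: that JobLanguageProficiency is reshaped the same way as the employee table (JobId, LanguageId, LanguageLevelId, with a primary key on JobId and LanguageId), and that a larger LanguageLevelId means a higher proficiency.

```sql
-- Employees who meet or exceed every language requirement of a job.
SELECT j.JobId, e.EmployeeId
FROM dbo.JobLanguageProficiency AS j
JOIN dbo.EmployeeLanguageProficiency AS e
  ON e.LanguageId = j.LanguageId
 AND e.LanguageLevelId >= j.LanguageLevelId   -- "too high" is acceptable
GROUP BY j.JobId, e.EmployeeId
-- Keep the pair only if the number of satisfied requirements
-- equals the total number of requirements for that job.
HAVING COUNT(*) = (SELECT COUNT(*)
                   FROM dbo.JobLanguageProficiency AS r
                   WHERE r.JobId = j.JobId);
```

Jobs without any language requirement do not appear in the result, because there is nothing to join on.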

Making one of my columns default the DateCreated to current time

I have the following SQL definition:
CREATE TABLE [dbo].[James] (
[JamesID] INT IDENTITY (1, 1) NOT NULL,
[Name] NVARCHAR (255) NOT NULL,
[DateCreated] DATETIME NULL,
CONSTRAINT [PK_dbo.James] PRIMARY KEY CLUSTERED ([JamesID] ASC)
);
How might I make it so that DateCreated is filled out automatically when I create new entries?
What about existing data that has not had that column filled out?
If you are starting from scratch and assuming this is SQL Server:
CREATE TABLE [dbo].[James] (
[JamesID] INT IDENTITY (1, 1) NOT NULL,
[Name] NVARCHAR (255) NOT NULL,
[DateCreated] DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT [PK_dbo.James] PRIMARY KEY CLUSTERED ([JamesID] ASC)
);
If you want to alter the existing table instead, you can use this:
ALTER TABLE dbo.James
ADD CONSTRAINT DF_namehere DEFAULT CURRENT_TIMESTAMP FOR DateCreated;
However, any existing NULL values will remain NULL with the ALTER TABLE solution. How you address this depends on whether you want to backfill that information.
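If backfilling is wanted, a minimal sketch could be a single UPDATE. The real creation time of the old rows is not recoverable from the table as shown, so stamping them with the current time is an assumption; a fixed sentinel date or a value taken from another source may fit better.

```sql
-- Backfill rows that were inserted before the default existed.
-- Assumption: "now" is an acceptable stand-in for the unknown creation time.
UPDATE dbo.James
SET DateCreated = CURRENT_TIMESTAMP
WHERE DateCreated IS NULL;
```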

Table Primary Key Length

I am wondering whether the length of a primary key has a non-trivial effect on performance. For example, consider the following table definitions:
CREATE TABLE table1 (
id VARCHAR(50) PRIMARY KEY,
first_column VARCHAR(50) NULL,
second_column VARCHAR(75) NOT NULL
);
CREATE TABLE table2(
id VARCHAR(250) PRIMARY KEY,
first_column VARCHAR(50) NULL,
second_column VARCHAR(75) NOT NULL
);
Does table1 perform better than table2, and if so, why?
In general, performance will depend more on what is stored than on the length of a varchar column. If both the varchar(50) and varchar(250) columns have a median length of 40 characters, they'll probably have similar performance.
In some dbms, the primary key is also a clustered key by default. But if your primary key is unsuitable as a clustered key, you can usually tell the dbms to not use a clustered key.
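As an illustration of opting out of the clustered key, the sketch below uses SQL Server syntax; the question does not say which DBMS is in use, so that choice is an assumption, and the constraint name PK_table1 is made up for the example.

```sql
-- SQL Server: keep the primary key but store it as a nonclustered index,
-- leaving the table free to be clustered on another column (or left as a heap).
CREATE TABLE table1 (
    id VARCHAR(50) NOT NULL,
    first_column VARCHAR(50) NULL,
    second_column VARCHAR(75) NOT NULL,
    CONSTRAINT PK_table1 PRIMARY KEY NONCLUSTERED (id)
);
```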
Yes, the primary key with varchar(50) will be more efficient. As you know, the primary key has a clustered index on it (by default in SQL Server), and as soon as a new record is entered in the table, the value is arranged in the clustered index internally. You will see this difference with billions of records.
So it is generally advised to use a short primary key, such as a numeric id.