Data historicization

Data historicization - sql

During the requirements collection and analysis phase, the users of the application told me that many concepts should be historicized. I'm thinking of implementing it this way.
Suppose we have to historicize a table called user_details:
create table user_details(
id int,
mail varchar(50),
telephone varchar(50),
fax varchar(50),
dateFrom date,
dateTo date,
primary key(id,dateFrom)
);
With the last two fields I plan to manage the historicization of this entity.
Any suggestions about this?
Is this the best way to manage it?

It might work, but the problem nowadays is that DBMSs generally do not enforce temporal primary key constraints or temporal referential integrity constraints.
I would do it like this (in T-SQL syntax):
CREATE TABLE [USER] -- USER is a reserved word in T-SQL, so the name is bracketed
(
id_user int not null identity (1,1),
natural_key varchar(50) not null
)
;
ALTER TABLE [USER]
ADD CONSTRAINT [XPK_user_iduser]
PRIMARY KEY CLUSTERED (id_user ASC)
GO
ALTER TABLE [USER]
ADD CONSTRAINT [XAK1_user_naturalkey]
UNIQUE (natural_key ASC)
GO
CREATE TABLE USER_DETAIL
(
id_user int not null,
mail varchar(50),
telephone varchar(50),
fax varchar(50),
dateFrom date not null,
dateTo date
)
;
ALTER TABLE USER_DETAIL
ADD CONSTRAINT [XPK_userdetail_1]
PRIMARY KEY CLUSTERED (id_user ASC, dateFrom ASC)
GO
And finally, the referential integrity constraint:
ALTER TABLE USER_DETAIL
ADD CONSTRAINT [XFK_userdetail_user_1]
FOREIGN KEY (id_user) REFERENCES [USER] (id_user)
ON DELETE NO ACTION
ON UPDATE NO ACTION
GO
This construct does not prevent all anomalies (overlapping periods are still possible), but at least two tuples for the same user can never have the same start date.
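Since the primary key cannot rule out overlapping periods, a periodic check query can at least detect them. The following is a minimal sketch, assuming periods are half-open (dateFrom inclusive, dateTo exclusive) and that a NULL dateTo means "still valid"; neither convention is stated in the original design.

```sql
-- Lists pairs of rows for the same user whose validity periods overlap.
-- Assumption: [dateFrom, dateTo) is half-open and dateTo = NULL is open-ended.
SELECT a.id_user,
       a.dateFrom AS firstFrom,
       a.dateTo   AS firstTo,
       b.dateFrom AS secondFrom,
       b.dateTo   AS secondTo
FROM USER_DETAIL AS a
JOIN USER_DETAIL AS b
  ON b.id_user = a.id_user
 AND b.dateFrom > a.dateFrom                        -- each pair is reported once
 AND (a.dateTo IS NULL OR b.dateFrom < a.dateTo);   -- b starts before a ends
```

If the periods are meant to be closed on both ends, the last comparison would be `<=` instead of `<`.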
Of course, you could also create two tables, USER_DETAIL and USER_DETAIL_HIST. The latter would contain the values from earlier periods, so that USER_DETAIL contains only current records.
I would then create the following view for end-user applications:
CREATE VIEW USER_DETAIL_TOT AS
SELECT id_user,mail,telephone,fax,dateFrom,dateTo,'Current' as rowStatus
FROM USER_DETAIL
UNION ALL
SELECT id_user,mail,telephone,fax,dateFrom,dateTo,'Historical' as rowStatus
FROM USER_DETAIL_HIST
GO
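With the two-table variant, every change to a user's details has to archive the current row before overwriting it. A possible sketch follows, assuming USER_DETAIL_HIST has the same columns as USER_DETAIL; the user id and the new mail address are hypothetical values.

```sql
BEGIN TRANSACTION;

DECLARE @id_user int = 1;                          -- hypothetical user
DECLARE @today date = CAST(GETDATE() AS date);

-- Close the current period and copy the row into the history table.
INSERT INTO USER_DETAIL_HIST (id_user, mail, telephone, fax, dateFrom, dateTo)
SELECT id_user, mail, telephone, fax, dateFrom, @today
FROM USER_DETAIL
WHERE id_user = @id_user;

-- Overwrite the current row and start a new period.
UPDATE USER_DETAIL
SET mail = 'new.address@example.com',              -- hypothetical new value
    dateFrom = @today,
    dateTo = NULL
WHERE id_user = @id_user;

COMMIT TRANSACTION;
```

Because the periods have day granularity, a second change for the same user on the same day would try to archive another row with the same (id_user, dateFrom) and would need separate handling.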

Related

Add unique constraint on fields from different tables

I have two tables/entities:
One table Users with these 3 fields :
id | login | external_id
There is a unique constraint on external_id but not on login
And another table User_Platforms that have these 3 fields :
id | user_id | platform_name
There is a one-to-many relation between Users and Platforms (one user can have multiple platforms).
Is there a way to put a unique constraint on the fields login (from the Users table) and platform_name (from the User_Platforms table) to avoid having multiple users with the same login on the same platform?
I was thinking of duplicating the login field inside the User_Platforms table to be able to do it easily. Is there a better way?

UNIQUE constraints cannot span multiple tables. In the model you are presenting "as is" it's not possible to create a unique constraint that will ensure data quality.
However, if you are willing to add redundancy to the data model you can enforce the rule. You'll need to add the column login as a redundant copy in the second table. This will change the way you insert data in the second table, but will ensure data quality.
For example:
create table users (
id int primary key not null,
login varchar(10) not null,
external_id varchar(10) not null,
constraint uq1 unique (id, login)
-- extra UNIQUE constraint so that (id, login) can be the target of a foreign key
);
create table user_platforms (
id int primary key not null,
user_id int not null, -- same type as users.id
platform_name varchar(10) not null,
login varchar(10) not null, -- new redundant column
constraint fk1 foreign key (user_id, login) references users (id, login),
-- FK ensures that the redundancy doesn't become stale
constraint uq2 unique (platform_name, login) -- finally, here's the prize!
);
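To illustrate how the two constraints cooperate, here is a short usage sketch with made-up rows. It assumes that user_platforms.user_id is an int and that fk1 is declared on (user_id, login), so that each platform row points at its user.

```sql
-- Two different users may share a login, since login is not unique in users.
insert into users (id, login, external_id) values (1, 'alice', 'ext-1');
insert into users (id, login, external_id) values (2, 'alice', 'ext-2');

-- Accepted: first 'alice' on platform 'web'.
insert into user_platforms (id, user_id, platform_name, login)
values (10, 1, 'web', 'alice');

-- Expected to be rejected by uq2: same login on the same platform.
insert into user_platforms (id, user_id, platform_name, login)
values (11, 2, 'web', 'alice');

-- Expected to be rejected by fk1: user 1 has no login 'bob',
-- so the redundant copy cannot drift away from users.login.
insert into user_platforms (id, user_id, platform_name, login)
values (12, 1, 'mobile', 'bob');
```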

SQL create multiple vs single "statuses tables"

After reviewing many DB designs I'm still not sure what is the best approach.
I am designing a database where most of the entities have different statuses. For example I may have something like
User statuses: Active, Inactive, Disabled, etc.
Order statuses: Open, Close, Canceled;
Office statuses: Open, Close.
I am considering two different options.
1) Create one "status" table for every entity
CREATE TABLE UserStatus(
UserStatusID int,
Description varchar(255)
);
CREATE TABLE OrderStatus(
OrderStatusID int,
Description varchar(255)
);
2) Create a single shared status table for all the entities
CREATE TABLE Status(
StatusID int,
Description varchar(255)
);
If you could explain which option is better, or the advantages of each one, I would be grateful.

Multiple tables have a key advantage: You can declare proper foreign key relationships to ensure that the values are correct in the referenced tables.
A single table has a different advantage: You have all the statuses in one place. This can be quite handy if you need to do something like translate all the statuses into a different language.
In most cases, I think the first advantage outweighs the second. In some cases, however, the second can be important.
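As a sketch of the first advantage, a per-entity status table lets the entity table declare an ordinary foreign key. The Users table and its columns below are hypothetical, added only for illustration.

```sql
CREATE TABLE UserStatus(
UserStatusID int NOT NULL PRIMARY KEY,
Description varchar(255) NOT NULL
);
-- Hypothetical entity table: every user must carry a status that exists in UserStatus.
CREATE TABLE Users(
UserID int NOT NULL PRIMARY KEY,
UserStatusID int NOT NULL
    REFERENCES UserStatus (UserStatusID)
);
```

With a single shared Status table, the same foreign key would also accept order or office statuses for a user, because the key alone cannot tell them apart.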

One more option - create a single shared status table for all the entities with EntityID.
Something like this:
CREATE TABLE dbo.Entity(
EntityID int NOT NULL
CONSTRAINT PK_Entity PRIMARY KEY CLUSTERED,
Name varchar(50) NOT NULL,
Description varchar(255) NULL
);
CREATE TABLE dbo.Status(
EntityID int NOT NULL,
CONSTRAINT FK_Status_Entity
FOREIGN KEY(EntityID) REFERENCES dbo.Entity (EntityID),
StatusID int NOT NULL,
Name varchar(50) NOT NULL,
Description varchar(255) NULL,
CONSTRAINT PK_Status PRIMARY KEY CLUSTERED (
EntityID ASC,
StatusID ASC),
CONSTRAINT UQ_Status_EntityID_Name UNIQUE NONCLUSTERED (
EntityID ASC,
Name ASC)
);
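The answer does not show how an entity table would point at this shared table. One possible way, sketched here under assumptions, is a composite foreign key together with a CHECK that pins the entity. The table dbo.CustomerOrder and the value 2 for the "Order" entity are hypothetical.

```sql
CREATE TABLE dbo.CustomerOrder(
OrderID int NOT NULL
    CONSTRAINT PK_CustomerOrder PRIMARY KEY CLUSTERED,
-- Assumption: EntityID 2 is the row in dbo.Entity that stands for orders.
EntityID int NOT NULL
    CONSTRAINT CK_CustomerOrder_EntityID CHECK (EntityID = 2),
StatusID int NOT NULL,
-- The composite key only matches statuses that belong to the order entity.
CONSTRAINT FK_CustomerOrder_Status
    FOREIGN KEY (EntityID, StatusID) REFERENCES dbo.Status (EntityID, StatusID)
);
```

This keeps all statuses in one place while still preventing an order from taking a user or office status.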

Is my query correct when I set primary key for 3 columns in a table?

In my case, a candidate can be attached to only one job at a time, so I think these two columns must both be part of the primary key.
Then there is a JobApplicationId column, which the CandidateDetail table uses as a foreign key.
Is it correct to set these three columns as the primary key, or are there other ways to address my problem?
CREATE TABLE Candidate(
CandidateId int identity primary key,
FullName nvarchar(50)
)
CREATE TABLE Job(
JobId int identity primary key,
JobTitle nvarchar(50)
)
CREATE TABLE JobApplication(
JobApplicationId int identity,
JobId int,
CandidateId int,
CreatedDate datetime,
primary key(JobApplicationId, JobId, CandidateId)
)
CREATE TABLE CandidateDetail(
CandidateDetailId int identity primary key,
JobApplicationId int,
[Description] nvarchar(300)
)
ALTER TABLE JobApplication ADD CONSTRAINT fk_JobApplication_Job FOREIGN KEY (JobId) REFERENCES Job(JobId)
ALTER TABLE JobApplication ADD CONSTRAINT fk_JobApplication_Candidate FOREIGN KEY (CandidateId) REFERENCES Candidate(CandidateId)
ALTER TABLE CandidateDetail ADD CONSTRAINT fk_CandidateDetail_JobApplication FOREIGN KEY (JobApplicationId) REFERENCES JobApplication(JobApplicationId)

Instead of a primary key with three columns you could just have JobApplicationId as the primary key and a unique constraint on JobId, CandidateId.
Otherwise, two rows with JobApplicationId=1, JobId=1, CandidateId=1 and JobApplicationId=2, JobId=1, CandidateId=1 would still be valid in terms of your current primary key approach, but would be invalid in terms of the business case.
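A minimal sketch of that suggestion, reusing the table and column names from the question (the constraint name is made up):

```sql
CREATE TABLE JobApplication(
JobApplicationId int identity primary key,   -- single-column surrogate key
JobId int not null,
CandidateId int not null,
CreatedDate datetime,
-- business rule: a candidate can apply to a given job only once
constraint uq_JobApplication_Job_Candidate unique (JobId, CandidateId)
)
```

With JobApplicationId as the sole primary key, the foreign key from CandidateDetail(JobApplicationId) also has a proper key to reference.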

From both a performance and usability perspective, a compound primary key can be a hassle and can create performance issues. Personally, I would choose JobApplicationId as the primary key (because this is an identity column and will be unique for each record). Then, if you need to constrain the table so that JobId and CandidateId are always unique (not allowing more than 1 record for any given candidate and the job they've applied for) then I would use a compound Unique Constraint.
However, I would suggest that you evaluate those requirements more closely because what if a candidate applies for the same position in a different time frame? It might stand to reason that having the same candidate applied to the same job more than once in that table might be valid data.

SQL Foreign key issue with 2 parent tables

I have 2 tables, User and Group.
I have a table Attributes, shared by User and Group, with these columns:
AttributeName
AttributeValue
ObjectID
ObjectID points to either the primary key of User or the primary key of Group.
I have added a foreign key constraint with ON DELETE CASCADE in order to automatically delete the attributes when a user or a group is deleted.
The problem now is that when I insert an attribute for a user, I get a foreign key constraint violation because the group does not exist.
How should I proceed?

You basically have 3 options:
1) Keep your current design, but replace Attribute.ObjectID with UserID and GroupID, attach a separate FK to each of them (one towards Group and the other towards User) and allow either to be NULL. You'd also want a CHECK constraint to ensure that they are not both NULL.
2) Split the Attribute table into UserAttribute and GroupAttribute, thus separating each foreign key into its own table.
3) Use inheritance, with User and Group as subtypes of a common parent table that Attribute references.
Solution (1) is highly dependent on how your DBMS handles UNIQUE on NULLs, and both (1) and (2) allow the same AttributeName to be used for two different attributes, one for a user and the other for a group.
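Option (1) could look roughly like the following sketch. It assumes [User] and [Group] tables with integer primary keys UserID and GroupID; AttributeID and the constraint name are hypothetical.

```sql
create table Attribute
(
AttributeID int identity primary key,
AttributeName varchar(50) not null,
AttributeValue varchar(50) not null,
-- one nullable FK per parent, each with its own cascade
UserID int null references [User](UserID) on delete cascade,
GroupID int null references [Group](GroupID) on delete cascade,
-- an attribute must belong to a user or a group
constraint CK_Attribute_HasOwner check (UserID is not null or GroupID is not null)
)
```

If an attribute must belong to exactly one parent, the CHECK can additionally require that UserID and GroupID are not both set.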

As you have discovered, you cannot have one column as a foreign key to two different tables. You can't add an attribute for a user when no group with the same id exists. And of course you cannot know whether the attribute is for a user or a group.
In the comments you also mentioned an m:m relation between user and group, so I would suggest the following.
create table [User]
(
UserID int identity primary key,
Name varchar(50) not null
)
go
create table [Group]
(
GroupID int identity primary key,
Name varchar(50) not null
)
go
create table UserGroup
(
UserID int not null references [User](UserID),
GroupID int not null references [Group](GroupID),
primary key (UserID, GroupID)
)
go
create table UserAttribute
(
UserAttributeID int identity primary key,
Name varchar(50) not null,
Value varchar(50) not null,
UserID int not null references [User](UserID) on delete cascade
)
go
create table GroupAttribute
(
GroupAttributeID int identity primary key,
Name varchar(50) not null,
Value varchar(50) not null,
GroupID int not null references [Group](GroupID) on delete cascade
)
Note: An attribute table should only be used for attributes you don't know beforehand. Everything you already know to be an attribute should be a field in the actual table instead. Reserve the attribute tables for customer-defined attributes.

I think you should allow NULL values for the foreign key field ObjectId, so that you can insert a row with ObjectId = NULL that does not reference any user or group.
For a better design, you should remove the ObjectId column and add a new AttributeId column to both the User and Group tables.

SQL Server 2008 - deleting rows with FK constraints

I've got a SQL database in SQL Server 2008, generated as follows:
CREATE TABLE Client (
ID bigint,
Code varchar(50),
ClientID int NOT NULL
);
ALTER TABLE Client
ADD CONSTRAINT PK_Client PRIMARY KEY CLUSTERED (ClientID);
CREATE TABLE Company (
ID bigint,
Description nvarchar(100),
SubsidiaryOf bigint,
companyID int NOT NULL,
FK_Client_Company int,
PK_Company int
);
ALTER TABLE Company
ADD CONSTRAINT PK_Company PRIMARY KEY CLUSTERED (companyID);
ALTER TABLE Company
ADD CONSTRAINT (ID = ID) FOREIGN KEY (FK_Client_Company)
REFERENCES Client (ClientID);
ALTER TABLE Company
ADD CONSTRAINT (SubsidiaryOf = ID) FOREIGN KEY (PK_Company)
REFERENCES Company (companyID);
CREATE TABLE ContactData (
ID bigint,
LocationID bigint,
Contact nvarchar(50),
contactDataID int NOT NULL,
PK_Location int
);
ALTER TABLE ContactData
ADD CONSTRAINT PK_ContactData PRIMARY KEY CLUSTERED (contactDataID);
ALTER TABLE ContactData
ADD CONSTRAINT (LocationID = ID) FOREIGN KEY (PK_Location)
REFERENCES Location (locationID);
CREATE TABLE Location (
ID bigint,
CompanyID bigint,
Country nvarchar(50),
ZIPCode nvarchar(50),
locationID int NOT NULL,
PK_Company int
);
ALTER TABLE Location
ADD CONSTRAINT PK_Location PRIMARY KEY CLUSTERED (locationID);
ALTER TABLE Location
ADD CONSTRAINT (CompanyID = ID) FOREIGN KEY (PK_Company)
REFERENCES Company (companyID);
I would like to delete all the companies with ID > 140000, together with their related rows in other tables. I tried some combinations of INNER JOINs, all together in one transaction, but there is still a problem with the FK_Client_Company constraint. Can anyone help me?
One more thing: I cannot add anything to or modify the DB structure or constraints. It has to be a query-based solution.

First delete those companies' clients
delete client where id in (select fk_client_company from company where id > 140000)
After that you should be able to run the delete statement on the company table
delete company where id > 140000
I'm fairly sure that's the answer you're looking for, but I'm not 100% positive, only because your naming scheme seems a little odd. I'm making the assumption that company.fk_client_company = client.id.
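Reading the DDL in the question as posted, the foreign keys seem to run from ContactData.PK_Location to Location, from Location.PK_Company to Company, and from Company.FK_Client_Company to Client. Under that reading, Client is a parent of Company, so clients do not block the delete, while locations and contact data do. A child-first sketch under those assumptions:

```sql
BEGIN TRANSACTION;

-- 1. Contact data attached to locations of the targeted companies.
DELETE cd
FROM ContactData AS cd
JOIN Location AS l ON l.locationID = cd.PK_Location
JOIN Company  AS c ON c.companyID  = l.PK_Company
WHERE c.ID > 140000;

-- 2. Locations of the targeted companies.
DELETE l
FROM Location AS l
JOIN Company AS c ON c.companyID = l.PK_Company
WHERE c.ID > 140000;

-- 3. The companies themselves.
DELETE FROM Company
WHERE ID > 140000;

COMMIT TRANSACTION;
```

The last statement would still fail if a company with ID <= 140000 references one of the deleted companies through the self-referencing PK_Company column; such rows would have to be handled first. Clients that are no longer referenced can be deleted afterwards if that is wanted.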
