Creating a table specifically for tracking change information to remove duplicated columns from tables - sql

When creating tables, I have generally created them with a couple extra columns that track change times and the corresponding user:
CREATE TABLE dbo.Object
(
ObjectId int NOT NULL IDENTITY (1, 1),
ObjectName varchar(50) NULL ,
CreateTime datetime NOT NULL,
CreateUserId int NOT NULL,
ModifyTime datetime NULL ,
ModifyUserId int NULL
) ON [PRIMARY]
GO
I have a new project now where, if I continued with this structure, I would have 6 additional columns on each table with this type of change tracking: a time column, a user id column, and a geography column, each for both create and modify. I'm now thinking that adding 6 columns to every table I want to do this on doesn't make sense. What I'm wondering is whether the following structure would make more sense:
CREATE TABLE dbo.Object
(
ObjectId int NOT NULL IDENTITY (1, 1),
ObjectName varchar(50) NULL ,
CreateChangeId int NOT NULL,
ModifyChangeId int NULL
) ON [PRIMARY]
GO
-- foreign key relationships on CreateChangeId & ModifyChangeId
CREATE TABLE dbo.Change
(
ChangeId int NOT NULL IDENTITY (1, 1),
ChangeTime datetime NOT NULL,
ChangeUserId int NOT NULL,
ChangeCoordinates geography NULL
) ON [PRIMARY]
GO
Can anyone offer some insight into this minor database design problem, such as common practices and functional designs?

Where I work, we use the same construct as yours - every table has the following fields:
CreatedBy (int, not null, FK users table - user id)
CreationDate (datetime, not null)
ChangedBy (int, null, FK users table - user id)
ChangeDate (datetime, null)
Pro: easy to track and maintain; only one I/O operation (I'll come to that later)
Con: I can't think of any at the moment (well, OK, sometimes we don't use the change fields ;-)
IMO the approach with the extra table has the problem that the tracking row would somehow also have to reference which table each record belongs to (unless you only need the one direction, from Object to tracking table). The approach also leads to more database I/O operations - for every insert or modify you will need to:
add entry to Table Object
add entry to Tracking Table and get the new Id
update Object Table entry with the Tracking Table Id
It would certainly make the application code that communicates with the DB a bit more complicated and error-prone.
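For illustration, here is a minimal T-SQL sketch of what a single create would cost with the separate Change table (the parameter values are placeholders, and the Change row is inserted first here so the NOT NULL CreateChangeId foreign key can be satisfied in one pass):
DECLARE @UserId int = 42;
DECLARE @ObjectName varchar(50) = 'Widget';
DECLARE @Coords geography = NULL; -- placeholder; would come from the client
DECLARE @ChangeId int;
BEGIN TRANSACTION;
-- 1. add the tracking entry and capture its new id
INSERT INTO dbo.Change (ChangeTime, ChangeUserId, ChangeCoordinates)
VALUES (GETDATE(), @UserId, @Coords);
SET @ChangeId = SCOPE_IDENTITY();
-- 2. add the Object row pointing at the tracking entry
INSERT INTO dbo.Object (ObjectName, CreateChangeId)
VALUES (@ObjectName, @ChangeId);
COMMIT;
Either ordering still means multiple statements and a transaction per write, which is the extra cost being described.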

Save filter criteria on a SQL database

I have a Products table as follows:
create table dbo.Product (
Id int not null,
Name nvarchar (80) not null,
Price decimal not null
)
I am creating Baskets (lists of products) as follows:
create table dbo.Baskets (
Id int not null,
Name nvarchar (80) not null
)
create table dbo.BasketProducts (
BasketId int not null,
ProductId int not null
)
A basket is created based on a Search Criteria using parameters:
MinimumPrice;
MaximumPrice;
Categories (can be zero to many);
MinimumWarrantyPeriod
I need to save these parameters so later I know how the basket was created.
In the future I will have more parameters so I see 2 options:
Add MinimumPrice, MaximumPrice and MinimumWarrantyPeriod as columns to Basket table and add a BasketCategories and Categories tables to relate a Basket to Categories.
Create a more flexible design using a Parameters table:
create table dbo.BasketParameters (
BasketId int not null,
ParameterTypeId int not null,
Value nvarchar (400) not null
)
create table dbo.ParameterType (
Id int not null,
Name nvarchar (80) not null
)
Parameter types are MinimumPrice, MaximumPrice, Categories, MinimumWarrantyPeriod, etc.
So for each Basket I have a list of BasketParameters, all different, each with one value. Later, if I need more parameter types, I add them to the ParameterType table ...
The application will be responsible for using each Basket's parameters to build the Basket ... I will have, for example, a Categories table, but it will be decoupled from the BasketParameters.
Does this make sense? Which approach would you use?
Your first option is superior (especially since you are using a relational data store, i.e. SQL Server), since it is properly referential. It will be much easier to maintain and query, as well as far more performant.
Your second solution is equivalent to an EAV table: https://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model
EAV tables are usually a terrible idea (and if you need that type of flexibility, you should probably use a document database or other NoSQL solution instead). The only benefit is if you need to add/remove attributes regularly or based on other criteria.
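For contrast, a hedged sketch of what option 1 might look like (column types, nullability, and the Category table shape are assumptions; the point is that every criterion is a real, typed, constrainable column):
create table dbo.Basket (
Id int not null primary key,
Name nvarchar (80) not null,
MinimumPrice decimal (18, 2) null,
MaximumPrice decimal (18, 2) null,
MinimumWarrantyPeriod int null
)
create table dbo.Category (
Id int not null primary key,
Name nvarchar (80) not null
)
create table dbo.BasketCategories (
BasketId int not null foreign key references dbo.Basket (Id),
CategoryId int not null foreign key references dbo.Category (Id),
primary key (BasketId, CategoryId)
)
With this shape the database itself enforces that a basket can only reference categories that exist, and range queries on price or warranty period stay simple and indexable.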

T-SQL create table statement will not accept variable

Why can I not use a variable to name a new table?
As a beginning SQL project, I'm making a personal finance database. Each account will have a corresponding table in the database. There is also a table listing all the current accounts. See (simplified) code sample below:
CREATE TABLE accountList
(
[Id] INT NOT NULL PRIMARY KEY IDENTITY,
[Name] NCHAR(30) NOT NULL UNIQUE,
[Active] BIT NOT NULL
)
INSERT INTO accountList(name, active)
VALUES
('Bank_One_Checking', 1);
CREATE TABLE Bank_One_Checking
(
[Id] BIGINT NOT NULL PRIMARY KEY IDENTITY,
[payee] NCHAR(30) NOT NULL UNIQUE,
[category] NCHAR(30) NOT NULL UNIQUE,
[amount] INT NOT NULL DEFAULT 0.00
)
This code works. I want to set the account name to a variable (so it can be passed as a parameter to a stored procedure). See code below:
DECLARE @accountName nchar(30);
SET @accountName = 'Bank_One_Savings';
INSERT INTO accountList(name, active)
VALUES
(@accountName, 1);
CREATE TABLE @accountName
(
[Id] BIGINT NOT NULL PRIMARY KEY IDENTITY,
[payee] NCHAR(30) NOT NULL UNIQUE,
[category] NCHAR(30) NOT NULL UNIQUE,
[amount] INT NOT NULL DEFAULT 0.00
)
Line 6 in that code (CREATE TABLE @accountName) produces an error
Incorrect syntax near @accountName, expecting '.', 'ID', or 'QUOTEID'.
Why won't it insert the variable into the command?
SQL doesn't allow tables to be variables. You could use dynamic SQL, if you like, but I strongly recommend against it.
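For completeness, the dynamic SQL version would look roughly like this sketch (QUOTENAME guards the identifier against injection; sysname is SQL Server's built-in type for object names) - but the advice above stands:
DECLARE @accountName sysname = 'Bank_One_Savings';
DECLARE @sql nvarchar(max) = N'CREATE TABLE ' + QUOTENAME(@accountName) + N'
(
[Id] BIGINT NOT NULL PRIMARY KEY IDENTITY,
[payee] NVARCHAR(30) NOT NULL,
[category] NVARCHAR(30) NULL,
[amount] DECIMAL(19, 4) NOT NULL DEFAULT 0
)';
EXEC sys.sp_executesql @sql;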
Your code has several flaws. You should learn not only to fix them but why they are wrong.
You need a "master" table, where AccountName is a column. Multiple tables with the same structure is almost always a sign of poor database design.
Strings should be declared as VARCHAR() or NVARCHAR(), unless they are short and known to always be the same length (say, an account number that is always 15 characters). Fixed-length strings otherwise just waste space.
I find it unlikely that a column named category would be unique in such a table. It seems to violate what uniqueness means.
Integers are not appropriate for monetary amounts in most of the world (use decimal or money). And, they shouldn't be initialized to constants with a decimal point.
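Putting those points together, the single-table design the first bullet recommends might look like this sketch (the transactions table name, column sizes, and types are illustrative assumptions):
CREATE TABLE accountTransaction
(
[Id] BIGINT NOT NULL PRIMARY KEY IDENTITY,
[accountId] INT NOT NULL FOREIGN KEY REFERENCES accountList ([Id]),
[payee] NVARCHAR(60) NOT NULL,
[category] NVARCHAR(30) NULL,
[amount] DECIMAL(19, 4) NOT NULL DEFAULT 0
)
Every account's rows live in one table, keyed by accountId, so adding a new account is just a new row in accountList - no DDL required.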

Database Normalization using Foreign Key

I have a sample table like the one below, where the course completion status of a student is stored:
Create Table StudentCourseCompletionStatus
(
CourseCompletionID int primary key identity(1,1),
StudentID int not null,
AlgorithmCourseStatus nvarchar(30),
DatabaseCourseStatus nvarchar(30),
NetworkingCourseStatus nvarchar(30),
MathematicsCourseStatus nvarchar(30),
ProgrammingCourseStatus nvarchar(30)
)
Insert into StudentCourseCompletionStatus Values (1, 'In Progress', 'In Progress', 'Not Started', 'Completed', 'Completed')
Insert into StudentCourseCompletionStatus Values (2, 'Not Started', 'In Progress', 'Not Started', 'Not Applicable', 'Completed')
Now, as part of normalizing the schema, I have created two other tables - CourseStatusType and Status - for storing the course status names and the statuses.
Create Table CourseStatusType
(
CourseStatusTypeID int primary key identity(1,1),
CourseStatusType nvarchar(100) not null
)
Insert into CourseStatusType Values ('AlgorithmCourseStatus')
Insert into CourseStatusType Values ('DatabaseCourseStatus')
Insert into CourseStatusType Values ('NetworkingCourseStatus')
Insert into CourseStatusType Values ('MathematicsCourseStatus')
Insert into CourseStatusType Values ('ProgrammingCourseStatus')
Insert into CourseStatusType Values ('OperatingSystemsCourseStatus')
Insert into CourseStatusType Values ('CompilerCourseStatus')
Create Table Status
(
StatusID int primary key identity(1,1),
StatusName nvarchar (100) not null
)
Insert into Status Values ('Completed')
Insert into Status Values ('Not Started')
Insert into Status Values ('In Progress')
Insert into Status Values ('Not Applicable')
The modified table is as below:
Create Table StudentCourseCompletionStatus1
(
CourseCompletionID int primary key identity(1,1),
StudentID int not null,
CourseStatusTypeID int not null CONSTRAINT [FK_StudentCourseCompletionStatus1_CourseStatusType] FOREIGN KEY (CourseStatusTypeID) REFERENCES dbo.CourseStatusType (CourseStatusTypeID),
StatusID int not null CONSTRAINT [FK_StudentCourseCompletionStatus1_Status] FOREIGN KEY (StatusID) REFERENCES Status (StatusID)
)
I have a few questions on this:
Is this the correct way to normalize it? The old table was very helpful for getting data easily - I could store a student's course status in a single row, but now 5 rows are required. Is there a better way to do it?
Moving the data from the old table to this new table does not seem to be an easy task. Can I achieve this with a query, or do I have to do it manually?
Any help is appreciated.
You could also consider storing the results in a flat table like this:
studentID,courseID,status
1,1,"completed"
1,2,"not started"
2,1,"not started"
2,3,"in progress"
You will also need an additional Courses table like this:
courseId,courseName
1, math
2, programming
3, networking
and a Students table:
studentID,name
1,"john smith"
2,"perry clam"
3,"john deere"
etc. You could also optionally create a Status table to store the distinct status strings and refer to their PK instead of the strings:
studentID,courseID,status
1,1,1
1,2,2
2,1,2
2,3,3
... etc
and a Status table:
id,status
1,"completed"
2,"not started"
3,"in progress"
The beauty of this representation is that it is quite easy to filter and aggregate data, i.e. it is easy to query which subjects a particular person has completed, how many subjects the average student completes, etc. These things are much more difficult in a columnar design like the one you had. You can also easily add new subjects without needing to adapt your tables or even your queries - they will just work.
You can also always use SQL's PIVOT to get it into a familiar columnar presentation like
name,mathstatus,programmingstatus,networkingstatus,etc.
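For example, a hedged PIVOT sketch over the flat tables above (the table names results, students, courses, and status follow this answer's ad-hoc sketches and are assumptions):
select name,
[math] as mathstatus,
[programming] as programmingstatus,
[networking] as networkingstatus
from
(
select st.name, c.courseName, s.status
from results r
join students st on st.studentID = r.studentID
join courses c on c.courseId = r.courseID
join status s on s.id = r.status
) src
pivot
(
max(status) for courseName in ([math], [programming], [networking])
) p;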
but now 5 rows are required
No, it's still just one row. That row simply contains identifiers for values stored in other tables.
There are pros and cons to this. One of the main reasons to normalize in this way is to protect the integrity of the data. If a column is just a string then anything can be stored there. But if there's a foreign key relationship to a table containing a finite set of values then only one of those options can be stored there. Additionally, if you ever want to change the text of an option or add/remove options, you do it in a centralized place.
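For example, with the foreign keys from the question's StudentCourseCompletionStatus1 in place, a bogus status is rejected at the database level rather than silently stored:
-- fails with a foreign key violation, since 999 is not a StatusID in Status
Insert into StudentCourseCompletionStatus1 Values (1, 1, 999)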
Moving the data from the old table to this new table does not seem to be an easy task.
No problem at all. Create your new numeric columns on the data table and populate them with the identifiers of the lookup table records associated with each data table record. If they're nullable, you can make them foreign keys right away. If they're not nullable then you need to populate them before you can make them foreign keys. Once you've verified that the data is correct, remove the old de-normalized columns. Done.
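As a concrete sketch of that process for one of the five columns (repeat per status column; the new column and constraint names are illustrative):
ALTER TABLE StudentCourseCompletionStatus ADD AlgorithmStatusID int NULL
GO
-- populate the new numeric column from the lookup table by matching the old text
UPDATE sccs
SET AlgorithmStatusID = s.StatusID
FROM StudentCourseCompletionStatus sccs
JOIN Status s ON s.StatusName = sccs.AlgorithmCourseStatus
GO
-- once the data is verified, enforce the relationship and drop the old text column
ALTER TABLE StudentCourseCompletionStatus
ADD CONSTRAINT FK_SCCS_AlgorithmStatus FOREIGN KEY (AlgorithmStatusID) REFERENCES Status (StatusID)
GO
ALTER TABLE StudentCourseCompletionStatus DROP COLUMN AlgorithmCourseStatus
GO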
In StudentCourseCompletionStatus1 you still need 2 associations, to Status and CourseStatusType. So I think you should consider the following variant of normalization:
It means that your StudentCourseCompletionStatus would hold only one CourseStatusID, and another table, CourseStatus, would hold the associations to CourseStatusType and Status.
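In code, that variant might look like this sketch (table and column names beyond those mentioned are assumptions):
Create Table CourseStatus
(
CourseStatusID int primary key identity(1,1),
CourseStatusTypeID int not null foreign key references CourseStatusType (CourseStatusTypeID),
StatusID int not null foreign key references Status (StatusID)
)
Create Table StudentCourseCompletionStatus2
(
CourseCompletionID int primary key identity(1,1),
StudentID int not null,
CourseStatusID int not null foreign key references CourseStatus (CourseStatusID)
)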
To move your data you can surely use a query.

Beginner with triggers

I'm a beginner with databases, and I got this difficult auction database project.
I'm also using SQL Server Management Studio.
create table user(
name char(10) not null,
lastname char(10) not null
)
create table item(
buyer varchar(10) null,
seller varchar(10) not null,
startprice numeric(5) not null,
description char(22) not null,
start_date datetime not null,
end_date datetime not null,
seller char(10) not null,
item_nummer numeric(9) not null,
constraint fk_user foreign key (buyer) references user (name)
)
Basically, the rule I'm trying to create here is:
Column buyer holds NULL unless the time (start_date to end_date) is over and the startprice has gone up (i.e. someone bid). Then column buyer should get the name of the user (from table user) who bid on the item.
The rule is a bit too difficult for me to implement; I was thinking of making a trigger, but I'm not sure...
Your model is incorrect. First you need a table to store the bids. Then, when the auction is over, you update the highest one as the winning bid. Probably the best way is to have a job that runs once a minute and finds the winners of any newly closed auctions.
A trigger will not work on the two tables you have, because triggers only fire on insert, update, or delete; one would not fire simply because a point in time has passed. Furthermore, triggers are an advanced technique, and a DB beginner should avoid them, as you can do horrendous damage with a badly written trigger.
You could have a trigger that works on insert to the bids table, which updates the bid to be the winner and takes that status away from the previous winner. Then you simply stop accepting new bids at the time the auction is over. Your application could show the bidder who is marked as the winner as the leader if the auction is still open, and as the winner if it is closed.
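For illustration, a hedged sketch of the job-based approach (the bid table and its columns are assumptions built on top of the question's item table):
create table bid(
bid_id int not null identity primary key,
item_nummer numeric(9) not null,
bidder_name char(10) not null,
amount numeric(5) not null,
bid_time datetime not null
)
go
-- run periodically (e.g., from a SQL Agent job once a minute):
-- for each closed, unresolved item, pick the highest (earliest) bid above the start price
update i
set i.buyer = b.bidder_name
from item i
cross apply
(
select top (1) bidder_name, amount
from bid
where bid.item_nummer = i.item_nummer
order by amount desc, bid_time asc
) b
where i.end_date < getdate()
and i.buyer is null
and b.amount > i.startprice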
There are some initial problems with your schema that need to be addressed before tackling your question. Here are the changes I would make to significantly ease the implementation of the answer:
-- Added brackets around User b/c "user" is a reserved keyword
-- Added INT Identity PK to [User]
CREATE TABLE [user]
(
UserId INT NOT NULL
IDENTITY
PRIMARY KEY
, name CHAR(10) NOT NULL
, lastname CHAR(10) NOT NULL
)
/* changed item_nummer (I'm not sure what a nummer is...) to ItemId int not null identity primary key
Removed duplicate Seller columns and buyer column
Replaced buyer/seller columns with FK references to [User].UserId
Add currentBid to capture current bid
Added CurrentHighBidderId
Added WinningBidderId as computed column
*/
CREATE TABLE item
(
ItemId INT NOT NULL
IDENTITY
PRIMARY KEY
, SellerId INT NOT NULL
FOREIGN KEY REFERENCES [User] ( UserId )
, CurrentHighBidderId INT NULL
FOREIGN KEY REFERENCES [User] ( UserId )
, CurrentBid MONEY NOT NULL
, StartPrice NUMERIC(5) NOT NULL
, Description CHAR(22) NOT NULL
, StartDate DATETIME NOT NULL
, EndDate DATETIME NOT NULL
)
go
ALTER TABLE dbo.item ADD
WinningBidderId AS CASE WHEN EndDate < CURRENT_TIMESTAMP
AND currentBid > StartPrice THEN CurrentHighBidderId ELSE NULL END
GO
With the additional columns, a computed column can return the correct information. If you must return the winner's name instead of the id, you could keep the schema above the same, add an additional column to store the user's name, populate it with a trigger, and keep the computed column to conditionally show or hide the winner.
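A hedged sketch of that trigger idea (the column and trigger names are assumptions; it denormalizes the current high bidder's name onto item whenever item rows change):
ALTER TABLE dbo.item ADD CurrentHighBidderName CHAR(10) NULL
GO
CREATE TRIGGER trg_item_high_bidder_name ON dbo.item
AFTER INSERT, UPDATE
AS
BEGIN
SET NOCOUNT ON;
-- copy the current high bidder's name from [user] onto the changed item rows
UPDATE i
SET CurrentHighBidderName = u.name
FROM dbo.item i
JOIN inserted ins ON ins.ItemId = i.ItemId
LEFT JOIN [user] u ON u.UserId = ins.CurrentHighBidderId;
END
GO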

Can "auto_increment" on "sub_groups" be enforced at a database level?

In Rails, I have the following
class Token < ActiveRecord::Base
belongs_to :grid
attr_accessible :turn_order
end
When you insert a new token, turn_order should auto-increment. HOWEVER, it should only auto-increment for tokens belonging to the same grid.
So, take 4 tokens for example:
Token_1 belongs to Grid_1, turn_order should be 1 upon insert.
Token_2 belongs to Grid_2, turn_Order should be 1 upon insert.
If I insert Token_3 to Grid_1, turn_order should be 2 upon insert.
If I insert Token_4 to Grid_2, turn_order should be 2 upon insert.
There is an additional constraint: imagine I execute Token_3.turn_order = 1; now Token_1 must automatically set its turn_order to 2, because within these "sub-groups" there can be no turn_order collision.
I know MySQL has auto_increment, I was wondering if there is any logic that can be applied at the DB level to enforce a constraint such as this. Basically auto_incrementing within sub-groups of a query, those sub-groups being based on a foreign key.
Is this something that can be handled at a DB level, or should I just strive for implementing rock-solid constraints at the application layer?
If I understood your question properly, you could use one of the following two methods (InnoDB vs. MyISAM). Personally, I'd take the InnoDB road, as I'm a fan of clustered indexes (which MyISAM doesn't support) and I prefer performance over how many lines of code I need to type, but the decision is yours...
More on InnoDB tables and clustered indexes: http://dev.mysql.com/doc/refman/5.0/en/innodb-table-and-index.html
See also: Rewriting mysql select to reduce time and writing tmp to disk
Full SQL script here: http://pastie.org/1259734
innodb implementation (recommended)
-- TABLES
drop table if exists grid;
create table grid
(
grid_id int unsigned not null auto_increment primary key,
name varchar(255) not null,
next_token_id int unsigned not null default 0
)
engine = innodb;
drop table if exists grid_token;
create table grid_token
(
grid_id int unsigned not null,
token_id int unsigned not null,
name varchar(255) not null,
primary key (grid_id, token_id) -- note clustered PK order (innodb only)
)
engine = innodb;
-- TRIGGERS
delimiter #
create trigger grid_token_before_ins_trig before insert on grid_token
for each row
begin
declare tid int unsigned default 0;
-- read the per-grid counter from the parent grid row and bump it by one
select next_token_id + 1 into tid from grid where grid_id = new.grid_id;
set new.token_id = tid;
-- persist the new counter value so the next insert for this grid continues the sequence
update grid set next_token_id = tid where grid_id = new.grid_id;
end#
delimiter ;
-- TEST DATA
insert into grid (name) values ('g1'),('g2'),('g3');
insert into grid_token (grid_id, name) values
(1,'g1 t1'),(1,'g1 t2'),(1,'g1 t3'),
(2,'g2 t1'),
(3,'g3 t1'),(3,'g3 t2');
select * from grid;
select * from grid_token;
myisam implementation (not recommended)
-- TABLES
drop table if exists grid;
create table grid
(
grid_id int unsigned not null auto_increment primary key,
name varchar(255) not null
)
engine = myisam;
drop table if exists grid_token;
create table grid_token
(
grid_id int unsigned not null,
token_id int unsigned not null auto_increment,
name varchar(255) not null,
primary key (grid_id, token_id) -- non clustered PK
)
engine = myisam;
-- TEST DATA
insert into grid (name) values ('g1'),('g2'),('g3');
insert into grid_token (grid_id, name) values
(1,'g1 t1'),(1,'g1 t2'),(1,'g1 t3'),
(2,'g2 t1'),
(3,'g3 t1'),(3,'g3 t2');
select * from grid;
select * from grid_token;
My opinion: Rock-solid constraints at the app level. You may get it to work in SQL -- I've seen some people do some pretty amazing stuff. A lot of SQL logic used to be squirreled away in triggers, but I don't see much of that lately.
This smells more like business logic and you absolutely can get it done in Ruby without wrapping yourself around a tree. And... people will be able to see the tests and read the code.
This to me sounds like something you'd want to handle in an after_save method or in an observer. If the model itself doesn't need to be aware of when or how something increments then I'd stick the business logic in the observer. This approach will make the incrementing logic more expressive to other developers and database agnostic.