SQL Server Composite Primary Keys

I am attempting to replace all records for a given day in a certain table. The table has a composite primary key comprised of 7 fields. One such field is date.
I have deleted all records which have a date value of 2/8/2010. When I try to then insert records into the table for 2/8/2010, I get a primary key violation. The records I am attempting to insert are only for 2/8/2010.
Since date is a component of the PK, shouldn't there be no way to violate the constraint as long as the date I'm inserting is not already in the table?
Thanks in advance.

You could have duplicates in the data you are inserting.
Also, it is a very, very, very poor practice to have a primary key that consists of 7 fields. The proper way to handle this is to have a surrogate identity key and a unique index on the seven fields. Joining to child tables on 7 fields is a guarantee of poor performance, and updating records when they have child records becomes a nightmare that can completely lock up your system. A primary key should be unique and it should NEVER change.
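As a sketch of that approach, with hypothetical column names (the real table would list all seven fields in the unique constraint):
CREATE TABLE DailyRecords (
    -- surrogate key: unique, never changes, and gives single-column joins
    RecordID   INT IDENTITY(1,1) NOT NULL,
    RecordDate DATE NOT NULL,
    SourceCode INT NOT NULL,
    ItemCode   INT NOT NULL,
    CONSTRAINT PK_DailyRecords PRIMARY KEY (RecordID),
    -- the old natural key is still enforced, just not used for joins
    CONSTRAINT UQ_DailyRecords_NaturalKey UNIQUE (RecordDate, SourceCode, ItemCode)
)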

Do all the rows have only a date component in that field (i.e. the time is always set to midnight: 00:00:00)? If not, you'll need to delete the rows >= 2/8/2010 and < 2/9/2010.
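For example, a sketch with placeholder table and column names (the yyyymmdd string format is unambiguous in SQL Server):
DELETE FROM YourTable
WHERE [Date] >= '20100208'
  AND [Date] <  '20100209'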
Also, are you sure you're not accidentally trying to insert two records with the same date (and same values in the other 6 PK fields)?

Perhaps there's something going on here you're not aware of. When you insert a row and get a primary key violation, try doing a SELECT with the appropriate key values from the row which could not be inserted (after doing a ROLLBACK, of course) and see what you get. Or perhaps there's a trigger on the table into which you're inserting data that is inserting rows into another table which uses the same primary key but was not cleaned out.
You might try the following SELECT to see what turns up:
SELECT *
FROM YOUR_TABLE
WHERE [DATE] >= '20100208' AND
      [DATE] <  '20100209';
(The yyyymmdd string literals above are an unambiguous date format in SQL Server; note that an unquoted 2/7/2010 would be evaluated as integer division, not as a date.) See what you get.
Good luck.

Related

Pros & Cons of Date Column as Part of Primary Key

I am currently working on a database, where a log is required to track a bunch of different changes of data. Stuff like price changes, project status changes, etc. To accomplish this I've made different 'log' tables that will be storing the data needing to be kept.
To give a solid example, in order to track the changing prices for parts which need to be ordered, I've created a table called Part_Price_Log. The primary key is a composite made up of the date on which the part price is being modified and a foreign key to the part's unique ID in the Parts table.
My logic here is that if you need to look up the current price for a part, you just need to find the most recent entry for that Part ID. However, I am being told not to implement it this way, because using Date as part of a primary key is an easy way to get errors in your data.
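For illustration, the "most recent entry" lookup described above would be something like this (the literal PartID value is hypothetical):
SELECT TOP (1) *
FROM Part_Price_Log
WHERE PartID = 42
ORDER BY ModifiedDate DESC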
So my question is thus.
What are the pros/cons of using a Date column as part of a composite primary key? What are some better alternatives?
In general, I think the best primary keys are synthetic auto-incremented keys. These have certain advantages:
The key value records the insertion order.
The keys are fixed length (typically 4 bytes).
Single keys are much simpler for foreign key references.
In databases (such as SQL Server by default) that cluster the data based on the primary key, inserts go "at the end".
They are relatively easy to type and compare (my eyes just don't work well for comparing UUIDs).
The fourth of these is a really big concern in a database that has lots of inserts, as suggested by your data.
There is nothing a priori wrong with composite primary keys. They are sometimes useful. But that is not a direction I would go in.
I agree that it is better to keep the identity column/uniqueidentifier as the primary key in this scenario. Also, if you make PartID and date a composite primary key, it is going to fail when two concurrent users try to update the part price at the same time: the second insert will violate the primary key. So the better approach is to have an identity column as the primary key and keep appending the changes to the log table. If you hit performance barriers later on, you can partition the table by year to overcome that challenge.
Pros and cons will vary depending on the performance requirements and how often you will query this table.
As a first example, consider the following:
CREATE TABLE Part_Price_Log (
    ModifiedDate DATE,
    PartID INT,
    PRIMARY KEY (ModifiedDate, PartID))
If ModifiedDate is first and this is a logging table with insert-only rows, then every new row will be placed at the end, which is good (it reduces fragmentation). This approach is also good when you want to filter directly by ModifiedDate, or by ModifiedDate + PartID, as ModifiedDate is the first column in the primary key. The con here is searching by PartID alone, as the clustered index behind the primary key won't be able to seek directly on PartID.
A second example would be the same table but with the primary key ordering inverted:
CREATE TABLE Part_Price_Log (
    ModifiedDate DATE,
    PartID INT,
    PRIMARY KEY (PartID, ModifiedDate))
This is good for queries by PartID, but not so much for queries directly by ModifiedDate. Also, having PartID first means inserts displace data pages whenever the inserted PartID is lower than the max PartID (which increases fragmentation).
The last example would be using a surrogate primary key such as an IDENTITY:
CREATE TABLE Part_Price_Log (
    LogID BIGINT IDENTITY PRIMARY KEY,
    ModifiedDate DATE,
    PartID INT)
This will make all inserts go last and reduce fragmentation but you will need an additional index to query your data, such as:
CREATE NONCLUSTERED INDEX NCI_Part_Price_Log_Date_PartID ON Part_Price_Log (ModifiedDate, PartID)
CREATE NONCLUSTERED INDEX NCI_Part_Price_Log_PartID_Date ON Part_Price_Log (PartID, ModifiedDate)
The con about this last one is that insert operations will take longer (as the indexes also have to be updated) and the size of the table will increase due to the indexes.
Also keep in mind that if your data allows for multiple updates of the same part on the same day, then using a compound PRIMARY KEY would make the 2nd update fail. Your choices here are to use a surrogate key, use a DATETIME instead of DATE (which gives you more margin for updates), or use a CLUSTERED INDEX with no PRIMARY KEY or UNIQUE constraint.
I would suggest doing the following. You keep only one index (the actual table, as it is clustered), rows are always appended in insert order, you don't need to worry about a repeated ModifiedDate with the same PartID, and your queries by date will be fast.
CREATE TABLE Part_Price_Log (
    LogID INT IDENTITY PRIMARY KEY NONCLUSTERED,
    ModifiedDate DATE,
    PartID INT)
CREATE CLUSTERED INDEX CI_Part_Price_Log_Date_PartID ON Part_Price_Log (ModifiedDate, PartID)
Without knowing your domain, it's really hard to advise. How do you identify a part in the real world? Let's assume you use EAN. This is your 'natural key'. Now, does a part get a new EAN each time the price changes? Probably not, in which case the real-world identifier for a part price is a composite of its EAN and the period of time during which that price was effective.
I think the comment about "an easy way to get errors in your data" refers to the fact that temporal databases are not only more complex by nature (they have an additional dimension: time), but support for temporal functionality is also lacking in most SQL DBMSs.
For example, does your SQL product of choice have an interval data type, or do you need to roll your own using a pair of start_date and end_date columns? Does it have the capability to declare intra-table constraints, e.g. to prevent overlapping or non-contiguous intervals for the same part? Does it have temporal functions to query temporal data easily?
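As a rough sketch of the roll-your-own approach with a start/end pair (names here are hypothetical; note that a plain CHECK constraint can only validate one row at a time, so preventing overlapping intervals across rows would need a trigger or an indexed view):
CREATE TABLE Part_Price (
    PartID    INT NOT NULL,
    StartDate DATE NOT NULL,
    EndDate   DATE NULL,           -- NULL means the price is still in effect
    Price     DECIMAL(10, 2) NOT NULL,
    CONSTRAINT PK_Part_Price PRIMARY KEY (PartID, StartDate),
    -- row-level sanity check only; it cannot see other rows' intervals
    CONSTRAINT CK_Part_Price_Interval CHECK (EndDate IS NULL OR EndDate > StartDate)
)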

Order of comparison in composite UNIQUE constraint in MS SQL

I'm designing a database and have come across a performance-related problem. Please note that we are still in the design phase, not implementation, so I can't test anything yet.
I have the following table structure
Events
----------------------
EventID INT PK
SourceID INT FK
TypeID INT FK
Date DATETIME
The table is expected to contain tens of millions of entries. SourceID and TypeID both reference tables with at most hundreds of entries.
What I want is for the tuple (SourceID, TypeID, Date) to be unique across the table. The question is: can I somehow specify which of the three columns will be used first to determine uniqueness when I insert a new item into the table?
Because if the index compared the Date first, then the addition would be much faster than if it used, for example, TypeID first, right? Or is this the wrong question altogether, and should I trust SQL Server to optimize this itself?
Any feedback is appreciated.
The underlying index created to support the unique constraint will have the same column order as defined by the constraint.
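So you control the order simply by how you declare the constraint. For example, to make Date the leading column (the constraint name here is arbitrary):
ALTER TABLE Events
ADD CONSTRAINT UQ_Events_Date_Source_Type UNIQUE ([Date], SourceID, TypeID)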

How to force duplicate key insert mssql

I know this sounds crazy (and if I had designed the database I would have done it differently), but I actually want to force a duplicate key on an insert. I'm working with a database that was designed to have columns as 'not null' PKs that have the same value in them for every row. The record-keeping software I'm working with is somehow able to insert dupes into these columns for every one of its records. I need to copy data from a column in another table into one column on this one. Normally I would just try to insert into that column only, but the PKs are set to 'not null', so I have to put something in them, and the way the table is set up, that something has to be the same thing for every record. This should be impossible, but the company that made the record-keeping software made it work somehow. I was wondering if anyone knows how this could be done?
P.S. I know this is normally not a good idea at all. So please just include suggestions for how this could be done, regardless of how crazy it is. Thank you.
A SQL Server primary key has to be unique and NOT NULL. So, the column you're seeing duplicate data in cannot be the primary key on its own. As urlreader suggests, it must be part of a composite primary key with one or more other columns.
How to tell what columns make up the primary key on a table: in Enterprise Manager, expand the table and then expand Columns. The primary key columns will have a "key" symbol next to them. You'll also see "PK" in the column description, like this:
MyFirstIDColumn (PK, int, not null)
MySecondIDColumn (PK, int, not null)
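Alternatively, you can list the primary key columns with a query against the catalog views (SQL Server 2005 and later; substitute your own table name):
SELECT c.name AS pk_column
FROM sys.key_constraints kc
JOIN sys.index_columns ic
    ON ic.object_id = kc.parent_object_id
   AND ic.index_id = kc.unique_index_id
JOIN sys.columns c
    ON c.object_id = ic.object_id
   AND c.column_id = ic.column_id
WHERE kc.[type] = 'PK'
  AND kc.parent_object_id = OBJECT_ID('dbo.MyTable')
ORDER BY ic.key_ordinal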
Once you know which columns make up the primary key, simply ensure that you are inserting a combination of unique data into the columns. So, for my sample table above, that would be:
INSERT INTO MyTable (MyFirstIDColumn, MySecondIDColumn) VALUES (1,1) --SUCCEED
INSERT INTO MyTable (MyFirstIDColumn, MySecondIDColumn) VALUES (1,2) --SUCCEED
INSERT INTO MyTable (MyFirstIDColumn, MySecondIDColumn) VALUES (1,1) --FAIL because of duplicate (1,1)
INSERT INTO MyTable (MyFirstIDColumn, MySecondIDColumn) VALUES (1,3) --SUCCEED
More on primary keys:
http://msdn.microsoft.com/en-us/library/ms191236%28v=sql.105%29.aspx

SQL Server Does Not Delete Records

I am a newbie to MSSQL Server and don't have any knowledge about it.
I have the below question.
I have added nine records with the same values, as shown in the image below, in SQL Server 2005.
I have not given the table any primary key.
Now when I select one record or multiple records and hit the delete key, it does not delete the records from the table and instead gives me an error.
You need to add a primary key to uniquely identify each record; otherwise SQL Server has no way of distinguishing the records, and therefore no way of knowing which one to delete, which causes the error.
That's because you don't have any primary key and the server doesn't know which row to remove. Clear the table (DELETE FROM dbo.Patient) and create a new Id column as a primary key.
In MSSQL you need to have a primary key for the table. This will uniquely identify each row of that particular table.
For example, in Oracle you don't need this, as there you can use ROWID (meaning every row of every table has a unique ID in the database). Once you know this ID, Oracle knows for sure which table the row belongs to.
So now you can add a primary key to the table, and you can make it auto-increment, ensuring uniqueness.
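A sketch of that, using the dbo.Patient table mentioned above:
-- add an auto-increment column
ALTER TABLE dbo.Patient
ADD Id INT IDENTITY(1,1) NOT NULL
GO
-- then make it the primary key
ALTER TABLE dbo.Patient
ADD CONSTRAINT PK_Patient PRIMARY KEY (Id)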

Database table id-key Null value and referential integrity

I'm learning databases, using SQLce. I've run into some problems with this error:
A foreign key value cannot be inserted because a corresponding primary key value does not exist.
How does the integrity and acceptance of data work when attempting to save a data row that does not have one of its foreign keys specified? Isn't it possible to set it to NULL in some way, meaning it will not reference the other table? If so, how would I do that? (For an integer key field.)
Also, what if you save a row with a valid foreign key that corresponds to an existing primary key in another table, but then decide to delete that entry in the other table, so the foreign key is no longer valid? Will I be allowed to delete? How does it work? I would think it should then simply be reset to a null value... but maybe it's not that simple?
What you need to do is insert your data starting from the parent down.
So if you have an orders table and an items table that refers to orders, you have to create the new order first before adding all the children to the list.
Many of the data access libraries available (in C#, for example, LINQ to SQL) will try to abstract this problem away.
If you need to delete data, you actually have to go the other way: delete the items before you delete the parent order record.
Of course, this assumes you are enforcing the foreign key; it is possible not to enforce it, which might be useful during a bulk delete.
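For example, in full SQL Server (SQL Server Compact may not support this), with hypothetical constraint and column names on the orders/items tables above:
-- stop enforcing the FK while bulk-deleting parents
ALTER TABLE dbo.Items NOCHECK CONSTRAINT FK_Items_Orders

DELETE FROM dbo.Orders WHERE OrderDate < '20100101'
-- remove the now-orphaned children, or re-checking will fail
DELETE FROM dbo.Items WHERE OrderID NOT IN (SELECT OrderID FROM dbo.Orders)

-- WITH CHECK re-validates existing rows so the constraint is trusted again
ALTER TABLE dbo.Items WITH CHECK CHECK CONSTRAINT FK_Items_Orders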
This is because of "bad data" you have in the tables. Check whether you have all corresponding values in the primary table.
The DBMS checks referential integrity to ensure the "correctness" of data within the database.
For example, if you have a column called some_id in TableA with values 1 through 10, and a column called some_id in TableB with values 1 through 11, then TableA has no corresponding value for the 11 that already exists in TableB.
You can make a foreign key nullable, but I don't recommend it. There are too many problems and inconsistencies that can arise. Redesign your tables so that you don't need to populate the foreign key for values that don't exist. Usually you can do that by moving the column to a new table, for example.
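That said, if you do want the behaviour asked about above, a minimal sketch (names borrowed from the TableA/TableB example) is a nullable foreign key, optionally with ON DELETE SET NULL so that deleting a parent clears the reference instead of failing:
CREATE TABLE TableA (
    some_id INT NOT NULL PRIMARY KEY
)
CREATE TABLE TableB (
    id INT NOT NULL PRIMARY KEY,
    some_id INT NULL,   -- NULL simply means "no reference"
    CONSTRAINT FK_TableB_TableA FOREIGN KEY (some_id)
        REFERENCES TableA (some_id)
        ON DELETE SET NULL   -- deleting the parent resets the reference to NULL
)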