Merging contacts in SQL table without creating duplicate entries

I have a table that holds only two columns - a ListID and PersonID. When a person is merged with another in the system, I want to update all references from the "source" person to be references to the "destination" person.
Ideally, I would like to call something simple like
UPDATE MailingListSubscription
SET PersonID = @DestPerson
WHERE PersonID = @SourcePerson
However, if the destination person already exists in this table with the same ListID as the source person, a duplicate entry will be made. How can I perform this action without creating duplicated entries? (ListID, PersonID is the primary key)
EDIT: Multiple ListIDs are used. If SourcePerson is assigned to ListIDs 1, 2, and 3, and DestinationPerson is assigned to ListIDs 3 and 4, then the end result needs to have four rows - DestinationPerson assigned to ListID 1, 2, 3, and 4.

--out with the bad
DELETE
FROM MailingListSubscription
WHERE PersonID = @SourcePerson
AND ListID IN (SELECT ListID FROM MailingListSubscription WHERE PersonID = @DestPerson)
--update the rest (good)
UPDATE MailingListSubscription
SET PersonID = @DestPerson
WHERE PersonID = @SourcePerson
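For anyone who wants to try the delete-then-update approach end to end, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for the real database, and merge_person is a hypothetical helper name, not something from the question. Both statements run in one transaction so a failure cannot leave the merge half done.

```python
import sqlite3

def merge_person(conn, source_id, dest_id):
    """Move source_id's subscriptions to dest_id without violating the primary key."""
    with conn:  # commits on success, rolls back if either statement raises
        # Drop source rows whose list the destination already belongs to.
        conn.execute(
            """DELETE FROM MailingListSubscription
               WHERE PersonID = ?
                 AND ListID IN (SELECT ListID FROM MailingListSubscription
                                WHERE PersonID = ?)""",
            (source_id, dest_id),
        )
        # The remaining source rows can now be repointed safely.
        conn.execute(
            "UPDATE MailingListSubscription SET PersonID = ? WHERE PersonID = ?",
            (dest_id, source_id),
        )

# Sample data from the question: source on lists 1, 2, 3; destination on 3, 4.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MailingListSubscription ("
    "ListID INTEGER, PersonID INTEGER, PRIMARY KEY (ListID, PersonID))"
)
conn.executemany(
    "INSERT INTO MailingListSubscription VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 10), (3, 20), (4, 20)],
)
merge_person(conn, source_id=10, dest_id=20)
rows = conn.execute(
    "SELECT ListID, PersonID FROM MailingListSubscription ORDER BY ListID"
).fetchall()
```

With the sample data, rows ends up holding four entries, all for person 20, on lists 1 through 4.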

First you should subscribe DestPerson to all lists that SourcePerson is subscribed to and that DestPerson isn't already subscribed to. Then delete all of SourcePerson's subscriptions.
This will work with multiple ListIDs.
Insert into MailingListSubscription
(
ListID,
PersonID
)
Select
ListID,
@DestPerson
From
MailingListSubscription as t1
Where
PersonID = @SourcePerson and
Not Exists
(
Select *
From MailingListSubscription as t2
Where
PersonID = @DestPerson and
t1.ListID = t2.ListID
);
Delete From MailingListSubscription
Where
PersonID = @SourcePerson;

I have to agree with David B here. Remove all the older stuff that shouldn't be there and then do your update.

Actually, I think you should go back and reconsider your database design as you really shouldn't be in circumstances where you're changing the primary key for a record as you're proposing to do - it implies that the PersonID column is not actually a suitable primary key in the first place.
My guess is your PersonID is exposed to your users, they've renumbered their database for some reason and you're syncing the change back in. This is generally a poor idea as it breaks audit trails and temporal consistency. In these circumstances, it's generally better to use your own non-changing primary key - usually an identity - and set up the PersonID that the users see as an attribute of that. It's extra work but will give you additional consistency and robustness in the long run.
A good rule of thumb is the primary key of a record should not be exposed to the users where possible and you should only do so after careful consideration. OK, I confess to breaking this myself on numerous occasions but it's worth striving for where you can :-)

Related

Can I delete entries from two tables in one statement?

I have to remove a row from each of two tables, they're linked by an ID but not with a proper PK - FK relationship (this db has NO foreign keys!)
The tables have a supposed 1-1 relationship. I don't know why they weren't just put in the same table but I'm not at liberty to change it.
People
PersonId | Name | OwnsMonkey
----------------------------
1 Jim true
2 Jim false
3 Gaz true
Info
PersonId | FurtherInfo
-----------------------------
1 Hates his monkey
2 Wants a monkey
3 Loves his monkey
To decide what to delete, I have to find a username and whether or not they own a monkey:
Select PersonId from People where Name = 'Jim' and OwnsMonkey = 'false'
So I'm doing two separate statements using this idea, deleting from Info first and then from People
delete from Info where PersonId = (Select PersonId from People where Name = 'Jim' and OwnsMonkey = 'false');
delete from People where PersonId = (Select PersonId from People where Name = 'Jim' and OwnsMonkey = 'false');
I found a promising answer here on StackOverflow
delete a.*, b.*
from People a
inner join Info b
where a.People = b.Info
and a.PersonId =
(Select PersonId from People where Name = 'Jim' and OwnsMonkey = 'false')
But it gives a syntax error in SQL Server (2012). I tried it without aliases too, but it doesn't seem possible to delete from two tables at once.
Can I delete entries from two tables in one statement?
No. One statement can delete rows only from one table in MS SQL Server.
The answer that you refer to is about MySQL, and MySQL does allow deleting from several tables with one statement, as can be seen in the MySQL docs. MS SQL Server doesn't support this, as can be seen in its docs. There is no syntax to include more than one table in the DELETE statement in SQL Server. If you try to delete from a view, rather than a table, there is a limitation as well:
The view referenced by table_or_view_name must be updatable and
reference exactly one base table in the FROM clause of the view
definition.
I was hoping to avoid two separate statements on the off-chance the
second doesn't work for whatever reason, interrupted - concurrency
really, I guess the TRY/CATCH will work well for that.
This is what transactions are for.
You can put several statements in a transaction and either all of them would succeed, or all of them would fail. Either all or nothing.
In your case you not only can, but should, put both DELETE statements in a transaction.
TRY/CATCH helps to process possible errors in a more controlled way, but the primary concept is "transaction".
BEGIN TRANSACTION
delete from Info where PersonId = (Select PersonId from People where Name = 'Jim' and OwnsMonkey = 'false');
delete from People where PersonId = (Select PersonId from People where Name = 'Jim' and OwnsMonkey = 'false');
COMMIT
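To see the all-or-nothing behaviour in action, here is a small sketch. It assumes SQLite via Python's sqlite3 rather than SQL Server, and delete_person with its fail_midway flag is a made-up helper that simulates an interruption between the two deletes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE People (PersonId INTEGER PRIMARY KEY, Name TEXT, OwnsMonkey TEXT);
    CREATE TABLE Info (PersonId INTEGER PRIMARY KEY, FurtherInfo TEXT);
    INSERT INTO People VALUES (1, 'Jim', 'true'), (2, 'Jim', 'false'), (3, 'Gaz', 'true');
    INSERT INTO Info VALUES (1, 'Hates his monkey'), (2, 'Wants a monkey'),
                            (3, 'Loves his monkey');
""")

def delete_person(conn, name, owns_monkey, fail_midway=False):
    """Delete matching rows from Info and People as one unit of work."""
    ids = "SELECT PersonId FROM People WHERE Name = ? AND OwnsMonkey = ?"
    try:
        with conn:  # COMMIT if the block finishes, ROLLBACK if it raises
            conn.execute(f"DELETE FROM Info WHERE PersonId IN ({ids})", (name, owns_monkey))
            if fail_midway:
                raise RuntimeError("simulated interruption")
            conn.execute(f"DELETE FROM People WHERE PersonId IN ({ids})", (name, owns_monkey))
        return True
    except RuntimeError:
        return False

def counts(conn):
    """Return (rows in Info, rows in People)."""
    return tuple(
        conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0] for t in ("Info", "People")
    )

failed = delete_person(conn, "Jim", "false", fail_midway=True)
after_failure = counts(conn)   # the first delete was rolled back
succeeded = delete_person(conn, "Jim", "false")
after_success = counts(conn)   # both deletes were committed together
```

After the simulated failure both tables still hold three rows; after the successful call each holds two.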
I highly recommend reading the article Error and Transaction Handling in SQL Server by Erland Sommarskog.
If you try to be tricky, like this:
WITH
CTE
AS
(
SELECT
Info.PersonId AS ID1, People.PersonId AS ID2
FROM
Info
INNER JOIN People ON Info.PersonId = People.PersonId
)
DELETE FROM CTE
WHERE ID1 = 1;
You'll get an error:
View or function 'CTE' is not updatable because the modification
affects multiple base tables.
Or like this:
WITH
CTE
AS
(
SELECT
PersonId
FROM Info
UNION ALL
SELECT
PersonId
FROM People
)
DELETE FROM CTE
WHERE PersonId = 1;
You'll get another error:
View 'CTE' is not updatable because the definition contains a UNION
operator.

How to prevent from updating the record in sql which is used in another table?

Tables
Country (country_id, country_name)
Company (company_id, country_id, Company_name)
How to prevent updating the row in the country table which is used in Company table?
UPDATE Country SET your_values_here
WHERE
country_id = 123 AND
NOT EXISTS (SELECT 1 FROM Company WHERE Company.country_id = 123)
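A runnable sketch of the guarded update, assuming SQLite through Python's sqlite3 and made-up sample rows. The helper name rename_country_if_unused is hypothetical. It reports whether the row was actually changed by looking at the affected-row count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Country (country_id INTEGER PRIMARY KEY, country_name TEXT);
    CREATE TABLE Company (company_id INTEGER PRIMARY KEY, country_id INTEGER,
                          company_name TEXT);
    INSERT INTO Country VALUES (1, 'Norway'), (2, 'Chile');
    INSERT INTO Company VALUES (100, 1, 'Acme');
""")

def rename_country_if_unused(conn, country_id, new_name):
    """Update the country only if no Company row references it; True if updated."""
    with conn:
        cur = conn.execute(
            """UPDATE Country SET country_name = ?
               WHERE country_id = ?
                 AND NOT EXISTS (SELECT 1 FROM Company
                                 WHERE Company.country_id = ?)""",
            (new_name, country_id, country_id),
        )
    return cur.rowcount == 1

used = rename_country_if_unused(conn, 1, "Norge")              # referenced by Acme
unused = rename_country_if_unused(conn, 2, "Chile (renamed)")  # not referenced
names = [r[0] for r in conn.execute(
    "SELECT country_name FROM Country ORDER BY country_id")]
```

Because the check and the update are a single statement, there is no window between "check" and "update" in which another session could insert a Company row.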
You will have to use a stored procedure to do that. Here's the pseudo code:
First, check if the country_id exists in Company:
select 1 from Company where country_id = @country_id
If it exists, throw an error; otherwise do the update:
update country set your_values_here where country_id = @country_id
Company::country_id is just a posted foreign key so an update to one table shouldn't affect the other.
Are you talking about insertions or deletes? If so you would either alter Company so that country_id can be null (i.e. not requiring a corresponding record in Country table) or you would turn off cascading deletes.
Hopefully one of those and Google will sort it for you.

unique pair in a "friendship" database

I'm posting this question which is somewhat a summary of my other question.
I have two databases:
1) db_users.
2) db_friends.
I stress that they're stored in separate databases on different servers and therefore no foreign keys can be used.
In 'db_friends' I have the table 'tbl_friends' which has the following columns:
- id_user
- id_friend
Now how do I make sure that each pair is unique in this table ('tbl_friends')?
I'd like to enforce that at the table level, and not through a query.
For example these are invalid rows:
1 - 2
2 - 1
I'd like this to be impossible to add.
Additionally - how would I search for all of the friends of user 713 when he could be mentioned, on some friendship rows, in the second column ('id_friend')?
You're probably not going to be able to do this at the database level -- your application code is going to have to do this. If you make sure that your tbl_friends records always go in with (lowId, highId), then a typical PK/Unique Index will solve the duplicate problem. In fact, I'd go so far as to rename the columns in your tbl_friends to (id_low, id_high) just to reinforce this.
Your query to find anything with user 713 would then be something like
SELECT id_low AS friend FROM tbl_friends WHERE (id_high = ?)
UNION ALL
SELECT id_high AS friend FROM tbl_friends WHERE (id_low = ?)
For efficiency, you'd probably want to index it forward and backward -- that is by (id_user, id_friend) and (id_friend, id_user).
If you must do this at a DB level, then a stored procedure to swap arguments to (low,high) before inserting would work.
You'd have to use a trigger to enforce that business rule.
Making the two columns in tbl_friends the primary key (unique constraint failing that) would only ensure there can't be duplicates of the same set: 1, 2 can only appear once but 2, 1 would be valid.
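A rough sketch of the trigger idea, assuming SQLite (via Python's sqlite3) rather than whatever engine db_friends actually runs on; trigger syntax differs between products. The trigger name trg_no_reversed_pair and the helper add_friendship are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_friends ("
    "id_user INTEGER NOT NULL, id_friend INTEGER NOT NULL, "
    "PRIMARY KEY (id_user, id_friend))"
)
# Reject a row when the same pair is already stored in the opposite order.
conn.execute("""
    CREATE TRIGGER trg_no_reversed_pair
    BEFORE INSERT ON tbl_friends
    WHEN EXISTS (SELECT 1 FROM tbl_friends
                 WHERE id_user = NEW.id_friend AND id_friend = NEW.id_user)
    BEGIN
        SELECT RAISE(ABORT, 'pair already stored in the other order');
    END
""")

def add_friendship(conn, a, b):
    """Try to insert (a, b); return False if the key or the trigger rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO tbl_friends VALUES (?, ?)", (a, b))
        return True
    except sqlite3.IntegrityError:
        return False

# First insert succeeds, the reversed pair hits the trigger,
# and the exact duplicate hits the primary key.
results = [add_friendship(conn, 1, 2), add_friendship(conn, 2, 1), add_friendship(conn, 1, 2)]
```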
how would I search for all of the friends of user 713 when he could be mentioned, on some friendship rows, in the second column ('id_friend')?
You could use an IN:
WHERE 713 IN (id_user, id_friend)
..or a UNION:
JOIN (SELECT id_user AS user
FROM TBL_FRIENDS
UNION ALL
SELECT id_friend
FROM TBL_FRIENDS) x ON x.user = u.user
Well, a unique constraint on the pair of columns will get you halfway there. I think the easiest way to ensure you don't get the reversed version would be to add a constraint ensuring that id_user < id_friend. You will need to compensate for this ordering at insertion time, but it will get you the database-level constraint you desire without duplicating data or relying on foreign keys.
As for the second question, to find all friends for id=1 you could select id_user, id_friend from tbl_friend where id_user = 1 or id_friend = 1 and then in your client code throw out all the 1's regardless of column.
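Here is a small sketch of the ordering constraint together with a lookup in both columns. It assumes SQLite via Python's sqlite3; add_friendship and friends_of are hypothetical helper names. The application sorts the pair before inserting, and the CHECK constraint rejects anything that slips through unsorted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_friends ("
    "id_user INTEGER NOT NULL, id_friend INTEGER NOT NULL, "
    "PRIMARY KEY (id_user, id_friend), "
    "CHECK (id_user < id_friend))"
)

def add_friendship(conn, a, b):
    """Store the pair as (low, high); return False if it already exists."""
    low, high = sorted((a, b))
    try:
        with conn:
            conn.execute("INSERT INTO tbl_friends VALUES (?, ?)", (low, high))
        return True
    except sqlite3.IntegrityError:
        return False

def friends_of(conn, user_id):
    """Look for the user in either column and return the other side of each pair."""
    rows = conn.execute(
        """SELECT id_friend FROM tbl_friends WHERE id_user = ?
           UNION ALL
           SELECT id_user FROM tbl_friends WHERE id_friend = ?""",
        (user_id, user_id),
    )
    return sorted(r[0] for r in rows)

first = add_friendship(conn, 713, 2)      # stored as (2, 713)
duplicate = add_friendship(conn, 2, 713)  # same pair, rejected by the primary key
add_friendship(conn, 900, 713)            # stored as (713, 900)
friends = friends_of(conn, 713)
```

The UNION ALL form returns only the other user's id, so no client-side filtering is needed.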
One way you could do it is to store the two friends on two rows:
CREATE TABLE FriendPairs (
pair_id INT NOT NULL,
friend_id INT NOT NULL,
PRIMARY KEY (pair_id, friend_id)
);
INSERT INTO FriendPairs (pair_id, friend_id)
VALUES (1234, 317), (1234, 713);
See? It doesn't matter which order you insert them, because both friends go in the friend_id column. So you can enforce uniqueness easily.
You can also query easily for friends of 713:
SELECT f2.friend_id
FROM FriendPairs AS f1
JOIN FriendPairs AS f2 ON (f1.pair_id = f2.pair_id)
WHERE f1.friend_id = 713
AND f2.friend_id <> f1.friend_id

TSQL foreign keys on views?

I have a SQL-Server 2008 database and a schema which uses foreign key constraints to enforce referential integrity. Works as intended. Now the user creates views on the original tables to work on subsets of the data only. My problem is that filtering certain datasets in some tables but not in others will violate the foreign key constraints.
Imagine two tables "one" and "two". "one" contains just an id column with values 1,2,3. "Two" references "one". Now you create views on both tables. The view for table "two" doesn't filter anything while the view for table "one" removes all rows but the first. You'll end up with entries in the second view that point nowhere.
Is there any way to avoid this? Can you have foreign key constraints between views?
Some Clarification in response to some of the comments:
I'm aware that the underlying constraints will ensure integrity of the data even when inserting through the views. My problem lies with the statements consuming the views. Those statements have been written with the original tables in mind and assume certain joins cannot fail. This assumption is always valid when working with the tables - but views potentially break it.
Joining/checking all constraints when creating the views in the first place is annoying because of the large number of referencing tables. Thus I was hoping to avoid that.
I love your question. It screams of familiarity with the Query Optimizer, and how it can see that some joins are redundant if they serve no purpose, or if it can simplify something knowing that there is at most one hit on the other side of a join.
So, the big question is around whether you can make a FK against the CIX of an Indexed View. And the answer is no.
create table dbo.testtable (id int identity(1,1) primary key, val int not null);
go
create view dbo.testview with schemabinding as
select id, val
from dbo.testtable
where val >= 50
;
go
insert dbo.testtable
select 20 union all
select 30 union all
select 40 union all
select 50 union all
select 60 union all
select 70
go
create unique clustered index ixV on dbo.testview(id);
go
create table dbo.secondtable (id int references dbo.testview(id));
go
All this works except for the last statement, which errors with:
Msg 1768, Level 16, State 0, Line 1
Foreign key 'FK__secondtable__id__6A325CF7' references object 'dbo.testview' which is not a user table.
So the Foreign key must reference a user table.
But... the next question is about whether you could reference a unique index that is filtered in SQL 2008, to achieve a view-like FK.
And still the answer is no.
create unique index ixUV on dbo.testtable(val) where val >= 50;
go
This succeeded.
But now if I try to create a table that references the val column
create table dbo.thirdtable (id int identity(1,1) primary key, val int not null check (val >= 50) references dbo.testtable(val));
(I was hoping that the check constraint that matched the filter in the filtered index might help the system understand that the FK should hold)
But I get an error saying:
There are no primary or candidate keys in the referenced table 'dbo.testtable' that matching the referencing column list in the foreign key 'FK__thirdtable__val__0EA330E9'.
If I drop the filtered index and create a non-filtered unique non-clustered index, then I can create dbo.thirdtable without any problems.
So I'm afraid the answer still seems to be No.
It took me some time to figure out the misunderstanding here -- not sure if I understand it completely even now, but here it is.
I will use an example, close to yours, but with some data -- easier for me to think in these terms.
So first two tables; A = Department B = Employee
CREATE TABLE Department
(
DepartmentID int PRIMARY KEY
,DepartmentName varchar(20)
,DepartmentColor varchar(10)
)
GO
CREATE TABLE Employee
(
EmployeeID int PRIMARY KEY
,EmployeeName varchar(20)
,DepartmentID int FOREIGN KEY REFERENCES Department ( DepartmentID )
)
GO
Now I'll toss some data in
INSERT INTO Department
( DepartmentID, DepartmentName, DepartmentColor )
SELECT 1, 'Accounting', 'RED' UNION
SELECT 2, 'Engineering', 'BLUE' UNION
SELECT 3, 'Sales', 'YELLOW' UNION
SELECT 4, 'Marketing', 'GREEN' ;
INSERT INTO Employee
( EmployeeID, EmployeeName, DepartmentID )
SELECT 1, 'Lyne', 1 UNION
SELECT 2, 'Damir', 2 UNION
SELECT 3, 'Sandy', 2 UNION
SELECT 4, 'Steve', 3 UNION
SELECT 5, 'Brian', 3 UNION
SELECT 6, 'Susan', 3 UNION
SELECT 7, 'Joe', 4 ;
So, now I'll create a view on the first table to filter some departments out.
CREATE VIEW dbo.BlueDepartments
AS
SELECT * FROM dbo.Department
WHERE DepartmentColor = 'BLUE'
GO
This returns
DepartmentID DepartmentName DepartmentColor
------------ -------------------- ---------------
2 Engineering BLUE
And per your example, I'll add a view for the second table which does not filter anything.
CREATE VIEW dbo.AllEmployees
AS
SELECT * FROM dbo.Employee
GO
This returns
EmployeeID EmployeeName DepartmentID
----------- -------------------- ------------
1 Lyne 1
2 Damir 2
3 Sandy 2
4 Steve 3
5 Brian 3
6 Susan 3
7 Joe 4
It seems to me that you think that Employee No 5, DepartmentID = 3 points to nowhere?
"You'll end up with entries in the
second view that point nowhere."
Well, it points to the Department table DepartmentID = 3, as specified with the foreign key. Even if you try to join view on view nothing is broken:
SELECT e.EmployeeID
,e.EmployeeName
,d.DepartmentID
,d.DepartmentName
,d.DepartmentColor
FROM dbo.AllEmployees AS e
JOIN dbo.BlueDepartments AS d ON d.DepartmentID = e.DepartmentID
ORDER BY e.EmployeeID
Returns
EmployeeID EmployeeName DepartmentID DepartmentName DepartmentColor
----------- -------------------- ------------ -------------------- ---------------
2 Damir 2 Engineering BLUE
3 Sandy 2 Engineering BLUE
So nothing is broken here; the join simply did not find matching records for DepartmentID <> 2. This is actually the same as if I joined the tables and then included the filter from the first view:
SELECT e.EmployeeID
,e.EmployeeName
,d.DepartmentID
,d.DepartmentName
,d.DepartmentColor
FROM dbo.Employee AS e
JOIN dbo.Department AS d ON d.DepartmentID = e.DepartmentID
WHERE d.DepartmentColor = 'BLUE'
ORDER BY e.EmployeeID
Returns again:
EmployeeID EmployeeName DepartmentID DepartmentName DepartmentColor
----------- -------------------- ------------ -------------------- ---------------
2 Damir 2 Engineering BLUE
3 Sandy 2 Engineering BLUE
In both cases joins do not fail, they simply do as expected.
Now I will try to break the referential integrity through a view (there is no DepartmentID= 127)
INSERT INTO dbo.AllEmployees
( EmployeeID, EmployeeName, DepartmentID )
VALUES( 10, 'Bob', 127 )
And this results in:
Msg 547, Level 16, State 0, Line 1
The INSERT statement conflicted with the FOREIGN KEY constraint "FK__Employee__Depart__0519C6AF". The conflict occurred in database "Tinker_2", table "dbo.Department", column 'DepartmentID'.
If I try to delete a department through the view
DELETE FROM dbo.BlueDepartments
WHERE DepartmentID = 2
Which results in:
Msg 547, Level 16, State 0, Line 1
The DELETE statement conflicted with the REFERENCE constraint "FK__Employee__Depart__0519C6AF". The conflict occurred in database "Tinker_2", table "dbo.Employee", column 'DepartmentID'.
So constraints on underlying tables still apply.
Hope this helps, but then maybe I misunderstood your problem.
Peter already hit on this, but the best solution is to:
1) Create the "main" logic (the view that filters the referenced table) once.
2) Have all views on related tables join to the view created for (1), not the original table.
I.e.,
CREATE VIEW v1 AS SELECT * FROM table1 WHERE blah
CREATE VIEW v2 AS SELECT * FROM table2 WHERE EXISTS
(SELECT NULL FROM v1 WHERE v1.id = table2.FKtoTable1)
Sure, syntactic sugar for propagating filters for views on one table to views on subordinate tables would be handy, but alas, it's not part of the SQL standard. That said, this solution is still good enough -- efficient, straightforward, maintainable, and guarantees the desired state for the consuming code.
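A quick way to convince yourself that this keeps the views consistent: the sketch below assumes SQLite via Python's sqlite3 and borrows the Department/Employee sample from the other answer (trimmed to the columns that matter). It then checks that no row in v2 lacks a partner in v1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (DepartmentID INTEGER PRIMARY KEY, DepartmentColor TEXT);
    CREATE TABLE Employee (
        EmployeeID INTEGER PRIMARY KEY,
        DepartmentID INTEGER REFERENCES Department (DepartmentID)
    );
    INSERT INTO Department VALUES (1, 'RED'), (2, 'BLUE'), (3, 'YELLOW');
    INSERT INTO Employee VALUES (1, 1), (2, 2), (3, 2), (4, 3);

    -- The filter on the referenced table is written once, in v1.
    CREATE VIEW v1 AS
        SELECT * FROM Department WHERE DepartmentColor = 'BLUE';
    -- The dependent view reuses v1 instead of repeating the filter.
    CREATE VIEW v2 AS
        SELECT * FROM Employee
        WHERE EXISTS (SELECT NULL FROM v1
                      WHERE v1.DepartmentID = Employee.DepartmentID);
""")

visible = [r[0] for r in conn.execute("SELECT EmployeeID FROM v2 ORDER BY EmployeeID")]
# Rows of v2 with no matching row in v1; should be zero by construction.
orphans = conn.execute(
    "SELECT COUNT(*) FROM v2 LEFT JOIN v1 ON v1.DepartmentID = v2.DepartmentID "
    "WHERE v1.DepartmentID IS NULL"
).fetchone()[0]
```

Only the two employees in the blue department show up in v2, so consuming code that joins v2 to v1 never sees a dangling reference.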
If you try to insert, update or delete data through a view, the underlying table constraints still apply.
Something like this in View2 is probably your best bet:
CREATE VIEW View2
AS
SELECT
T2.col1,
T2.col2,
...
FROM
Table2 T2
INNER JOIN Table1 T1 ON
T1.pk = T2.t1_fk
If rolling over tables so that Identity columns will not clash, one possibility would be to use a lookup table that referenced the different data tables by Identity and a table reference.
Foreign keys on this table would work down the line for referencing tables.
This would be expensive in a number of ways
Referential integrity on the lookup table would have to be enforced using triggers.
Additional storage of the lookup table and indexing in addition to the data tables.
Data reading would almost certainly involve a Stored Procedure or three to execute a filtered UNION.
Query plan evaluation would also have a development cost.
The list goes on, but it might work in some scenarios.
Using Rob Farley's schema:
CREATE TABLE dbo.testtable(
id int IDENTITY(1,1) PRIMARY KEY,
val int NOT NULL);
go
INSERT dbo.testtable(val)
VALUES(20),(30),(40),(50),(60),(70);
go
CREATE TABLE dbo.secondtable(
id int NOT NULL,
CONSTRAINT FK_SecondTable FOREIGN KEY(id) REFERENCES dbo.TestTable(id));
go
CREATE TABLE z(n tinyint PRIMARY KEY);
INSERT z(n)
VALUES(0),(1);
go
CREATE VIEW dbo.SecondTableCheck WITH SCHEMABINDING AS
SELECT 1 n
FROM dbo.TestTable AS t JOIN dbo.SecondTable AS s ON t.Id = s.Id
CROSS JOIN dbo.z
WHERE t.Val < 50;
go
CREATE UNIQUE CLUSTERED INDEX NoSmallIds ON dbo.SecondTableCheck(n);
go
I had to create a tiny helper table (dbo.z) in order to make this work, because indexed views cannot have self joins, outer joins, subqueries, or derived tables (and TVCs count as derived tables).
Another approach, depending on your requirements, would be to use a stored procedure to return two recordsets. You pass it filtering criteria and it uses the filtering criteria to query table 1, and then those results can be used to filter the query to table 2 so that its results are also consistent. Then you return both results.
You could stage the filtered table 1 data to another table. The contents of this staging table are your view 1, and then you build view 2 via a join of the staging table and table 2. This way the processing for filtering table 1 is done once and reused for both views.
Really what it boils down to is that view 2 has no idea what kind of filtering you performed in view 1, unless you tell view 2 the filtering criteria, or make it somehow dependent on the results of view 1, which means emulating the same filtering that occurs on view1.
Constraints don't perform any kind of filtering, they only prevent invalid data, or cascade key changes and deletes.
No, you can't create foreign keys on views.
Even if you could, where would that leave you? You would still have to declare the FK after creating the view. Who would declare the FK, you or the user? If the user is sophisticated enough to declare a FK, why couldn't he add an inner join to the referenced view? eg:
create view view1 as select a, b, c, d from table1 where a in (1, 2, 3)
go
create view view2 as select a, m, n, o from table2 where a in (select a from view1)
go
vs:
create view view1 as select a, b, c, d from table1 where a in (1, 2, 3)
go
create view view2 as select a, m, n, o from table2
--# pseudo-syntax for fk:
alter view view2 add foreign key (a) references view1 (a)
go
I don't see how the foreign key would simplify your job.
Alternatively:
Copy the subset of data into another schema or database. Same tables, same keys, less data, faster analysis, less contention.
If you need a subset of all the tables, use another database. If you only need a subset of some tables, use a schema in the same database. That way your new tables can still reference the non-copied tables.
Then use the existing views to copy the data over. Any FK violations will raise an error and identify which views require editing. Create a job and schedule it daily, if necessary.
From a purely data integrity perspective (and nothing to do with the Query Optimizer), I had considered an Indexed View. I figured you could make a unique index on it, which could be broken when you try to have broken integrity in your underlying tables.
But... I don't think you can get around the restrictions of indexed views well enough.
For example:
You can't use outer joins, or sub-queries. That makes it very hard to find the rows that don't exist in the view. If you use aggregates, you can't use HAVING, so that cuts out some options you could use there too. You can't even have constants in an indexed view if you have grouping (whether or not you use a GROUP BY clause), so you can't even try putting an index on a constant field so that a second row will fall over. You can't use UNION ALL, so the idea of having a count which will break a unique index when it hits a second zero won't work.
I feel like there should be an answer, but I'm afraid you're going to have to take a good look at your actual design and work out what you really need. Perhaps triggers (and good indexes) on the tables involved, so that any changes that might break something can be rolled back.
But I was really hoping to be able to suggest something that the Query Optimizer might be able to leverage to help the performance of your system, but I don't think I can.

What's the best way to store (and access) historical 1:M relationships in a relational database?

Hypothetical example:
I have Cars and Owners. Each Car belongs to one (and only one) Owner at a given time, but ownership may be transferred. Owners may, at any time, own zero or more cars. What I want is to store the historical relationships in a MySQL database such that, given an arbitrary time, I can look up the current assignment of Cars to Owners.
I.e. At time X (where X can be now or anytime in the past):
Who owns car Y?
Which cars (if any) does owner Z own?
Creating an M:N table in SQL (with a timestamp) is simple enough, but I'd like to avoid a correlated sub-query as this table will get large (and, hence, performance will suffer). Any ideas? I have a feeling that there's a way to do this by JOINing such a table with itself, but I'm not terribly experienced with databases.
UPDATE: I would like to avoid using both a "start_date" and "end_date" field per row as this would necessitate a (potentially) expensive look-up each time a new row is inserted. (Also, it's redundant).
Make a third table called CarOwner with fields for carid, ownerid, start_date and end_date.
When a car is bought, fill in the first three and check the table to make sure no one else is listed as the current owner. If someone is, update that record with the purchase date as its end_date.
To find current owner:
select carid, ownerid from CarOwner where end_date is null
To find the owner at a point in time (shown here for the current time; substitute any past date):
select carid, ownerid from CarOwner where start_date <= getdate()
and (end_date is null or end_date > getdate())
getdate() is MS SQL Server specific, but every database has some function that returns the current date - just substitute.
Of course if you also want additional info from the other tables, you would join to them as well.
select co.carid, co.ownerid, o.owner_name, c.make, c.Model, c.year
from CarOwner co
JOIN Car c on co.carid = c.carid
JOIN Owner o on o.ownerid = co.ownerid
where co.end_date is null
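A minimal sketch of the start/end date design, assuming SQLite via Python's sqlite3 and ISO-8601 date strings (which compare correctly as text). The helpers transfer and owner_at are hypothetical names; transfer closes the open row and inserts the new one in a single transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CarOwner (carid INTEGER, ownerid INTEGER, "
    "start_date TEXT NOT NULL, end_date TEXT)"
)

def transfer(conn, carid, ownerid, when):
    """Close the current ownership row (if any) and open a new one."""
    with conn:
        conn.execute(
            "UPDATE CarOwner SET end_date = ? WHERE carid = ? AND end_date IS NULL",
            (when, carid),
        )
        conn.execute(
            "INSERT INTO CarOwner (carid, ownerid, start_date) VALUES (?, ?, ?)",
            (carid, ownerid, when),
        )

def owner_at(conn, carid, when):
    """Return the owner of carid at the given date, or None if nobody owned it yet."""
    row = conn.execute(
        """SELECT ownerid FROM CarOwner
           WHERE carid = ? AND start_date <= ?
             AND (end_date IS NULL OR end_date > ?)""",
        (carid, when, when),
    ).fetchone()
    return row[0] if row else None

transfer(conn, 1, 100, "2008-01-01")
transfer(conn, 1, 200, "2009-06-15")
before_sale = owner_at(conn, 1, "2009-03-01")
on_sale_day = owner_at(conn, 1, "2009-06-15")
before_any = owner_at(conn, 1, "2007-01-01")
```

The as-of lookup is a plain range predicate with no subquery, which is the main attraction of keeping end_date even though it is redundant.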
I've found that the best way to handle this sort of requirement is to just maintain a log of VehicleEvents, one of which would be ChangeOwner. In practice, you can derive the answers to all the questions posed here - at least as accurately as you are collecting the events.
Each record would have a timestamp indicating when the event occurred.
One benefit of doing it this way is that the minimum amount of data can be added in each event, but the information about the Vehicle can accumulate and evolve.
Also, with the timestamp, events can be added after the fact (as long as the timestamp accurately reflects when the event occurred).
Trying to maintain historical state for something like this in any other way I've tried leads to madness. (Maybe I'm still recovering. :D)
BTW, the distinguishing characteristic here is probably that it's a Time Series or Event Log, not that it's 1:m.
Given your business rule that each car belongs to exactly one owner at a time (i.e. owners exist before they are assigned to a car) and your operational constraint that the table may grow large, I'd design the schema as follows:
(generic sql 92 syntax:)
CREATE TABLE Cars
(
CarID integer not null default autoincrement,
OwnerID integer not null,
CarDescription varchar(100) not null,
CreatedOn timestamp not null default current timestamp,
Primary key (CarID),
FOREIGN KEY (OwnerID ) REFERENCES Owners(OwnerID )
)
CREATE TABLE Owners
(
OwnerID integer not null default autoincrement,
OwnerName varchar(100) not null,
Primary key(OwnerID )
)
CREATE TABLE HistoricalCarOwners
(
CarID integer not null,
OwnerID integer not null,
OwnedFrom timestamp null,
Owneduntil timestamp null,
primary key (CarID, OwnerID),
FOREIGN KEY (OwnerID ) REFERENCES Owners(OwnerID ),
FOREIGN KEY (CarID ) REFERENCES Cars(CarID )
)
I personally would not touch the third table from my client application but would simply let the database do the work - and maintain data integrity - with ON UPDATE AND ON DELETE triggers on the Cars table to populate the HistoricalCarOwners table whenever a car changes owners (i.e whenever an UPDATE is committed on the OwnerId column) or a car is deleted.
With the above schema, selecting the current car owner is trivial and selecting historical car owners is as simple as
select o.ownerid, o.ownername from owners o inner join historicalcarowners hco
on hco.ownerid = o.ownerid
where hco.carid = :arg_id and
:arg_timestamp between hco.ownedfrom and hco.owneduntil
order by ...
HTH, Vince
If you really do not want to have a start and end date you can use just a single date and do a query like the following.
SELECT * FROM CarOwner co
WHERE co.CarId = @CarId
AND co.TransferDate <= @AsOfDate
AND NOT EXISTS (SELECT * FROM CarOwner co2
WHERE co2.CarId = @CarId
AND co2.TransferDate <= @AsOfDate
AND co2.TransferDate > co.TransferDate)
or a slight variation
SELECT * FROM Car ca
JOIN CarOwner co ON ca.Id = co.CarId
AND co.TransferDate = (SELECT MAX(TransferDate)
FROM CarOwner WHERE CarId = @CarId
AND TransferDate <= @AsOfDate)
WHERE co.CarId = @CarId
These solutions are functionally equivalent to Javier's suggestion, but depending on the database you are using one solution may be faster than the other.
However, depending on your read versus write ratio you may find the performance better if you redundantly update the end date in the associative entity.
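The single-date design also answers the other question from the original post (which cars does owner Z hold at time X). The sketch below assumes SQLite via Python's sqlite3, an OwnerId column on CarOwner, and ISO-8601 date strings; cars_owned_by is a hypothetical helper name and the rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CarOwner (CarId INTEGER, OwnerId INTEGER, TransferDate TEXT, "
    "PRIMARY KEY (CarId, TransferDate))"
)
conn.executemany(
    "INSERT INTO CarOwner VALUES (?, ?, ?)",
    [(1, 100, "2008-01-01"), (1, 200, "2009-06-15"), (2, 100, "2009-01-10")],
)

def cars_owned_by(conn, owner_id, as_of):
    """Cars whose latest transfer on or before as_of went to owner_id."""
    rows = conn.execute(
        """SELECT co.CarId FROM CarOwner AS co
           WHERE co.OwnerId = ? AND co.TransferDate <= ?
             AND NOT EXISTS (SELECT 1 FROM CarOwner AS later
                             WHERE later.CarId = co.CarId
                               AND later.TransferDate <= ?
                               AND later.TransferDate > co.TransferDate)
           ORDER BY co.CarId""",
        (owner_id, as_of, as_of),
    )
    return [r[0] for r in rows]

march = cars_owned_by(conn, 100, "2009-03-01")     # before car 1 changes hands
december = cars_owned_by(conn, 100, "2009-12-31")  # after car 1 went to owner 200
new_owner = cars_owned_by(conn, 200, "2009-12-31")
```

The primary key on (CarId, TransferDate) doubles as the index the NOT EXISTS probe needs, so each probe is a short range scan rather than a table scan.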
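Why not have a transaction table? It would contain the car ID, the FROM owner, the TO owner and the date the transaction occurred.
Then all you do is find the most recent transaction for a car on or before the desired date.
To find cars owned by Owner 253 on March 1st (carId is an assumed name for the car ID column):
SELECT * FROM transactions t WHERE t.ownerToId = 253 AND t.date <= '2009-03-01'
AND NOT EXISTS (SELECT 1 FROM transactions t2 WHERE t2.carId = t.carId AND t2.date > t.date AND t2.date <= '2009-03-01')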
The cars table can have a column called ownerID. You can then simply:
1.select car from cars inner join owners on car.ownerid=owner.ownerid where ownerid=y
2.select car from cars where owner=z
Not the exact syntax but simple pseudo code.