A table to hold History of changes for two fields - sql

I have a table which holds a Person (details). We have a requirement to store and show the history of changes for two of its fields: 'IsActive BIT' and 'IsIdle BIT'.
At the moment, they're fields within the Person table.
The requirement is to be able to display when the person was active and when they were idle, as well as who set those values. (All tables have a LastUpdatedBy and CreatedBy column.)
My plan is to use a PersonHistory table, with the PersonID FK to the Person, the IsActive and IsIdle columns, the CreatedBy and LastUpdatedBy columns, and an 'EffectiveFrom' DATETIME.
So when we create a person, we add a row to the history with the IsActive and IsIdle values, user and the PersonID.
To display a person, we have to do an (Untidy?) selection of the person record, and then join to the last record for that person in the history table. INNER JOIN .. SELECT TOP 1 * FROM History, using a ROW_NUMBER? Might be slow.
When editing 'IsActive', we need to add a new row with the PersonID and the new IsActive (and/or IsIdle) value. Actually, we need to store both. A row will only get written when these values change. Which means we'll need to do a pre-save check to see if the values changed.
Does this seem like a standard way to handle this requirement - or is there a better more common approach?

You might try this method before changing the data structure:
select p.*, ph.*
from Person p outer apply
(select top 1 ph.*
from personhistory ph
where ph.personid = p.personid
order by ph.effectivefrom desc
) ph;
For performance, you want an index on personhistory(personid, effectivefrom).
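The history-table plan from the question, together with the index above, can be sketched as follows. This is a minimal illustration that uses Python's built-in sqlite3 module so it runs standalone; SQLite has no OUTER APPLY or TOP, so the "latest row" lookup uses ORDER BY ... LIMIT 1 instead. The helper names current_state and record_state, the index name, and the sample data are assumptions made for the example, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (PersonID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE PersonHistory (
        PersonID      INTEGER NOT NULL REFERENCES Person(PersonID),
        IsActive      INTEGER NOT NULL,
        IsIdle        INTEGER NOT NULL,
        CreatedBy     TEXT    NOT NULL,
        EffectiveFrom TEXT    NOT NULL   -- ISO-8601 text sorts chronologically
    );
    -- Index suggested in the answer: supports "latest row per person".
    CREATE INDEX IX_PersonHistory ON PersonHistory (PersonID, EffectiveFrom);
""")

def current_state(person_id):
    """Return (IsActive, IsIdle) from the newest history row, or None."""
    return conn.execute(
        "SELECT IsActive, IsIdle FROM PersonHistory "
        "WHERE PersonID = ? ORDER BY EffectiveFrom DESC LIMIT 1",
        (person_id,),
    ).fetchone()

def record_state(person_id, is_active, is_idle, user, effective_from):
    """Append a history row only when the state differs from the latest one."""
    if current_state(person_id) == (is_active, is_idle):
        return False  # pre-save check: nothing changed, so no new row
    conn.execute(
        "INSERT INTO PersonHistory VALUES (?, ?, ?, ?, ?)",
        (person_id, is_active, is_idle, user, effective_from),
    )
    return True

conn.execute("INSERT INTO Person VALUES (1, 'Ann')")
record_state(1, 1, 0, "admin", "2024-01-01T09:00:00")
record_state(1, 1, 0, "admin", "2024-02-01T09:00:00")  # unchanged: skipped
record_state(1, 0, 1, "bob", "2024-03-01T09:00:00")

history_rows = conn.execute("SELECT COUNT(*) FROM PersonHistory").fetchone()[0]
```

Because a row is written only when a value changes, each history row marks the start of a period, and the period ends at the EffectiveFrom of the next row for the same person.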

List manipulating data from two tables content

I have these two tables
I would like to know how to list the product table (ProdID, Quantity, Name, Price)
together with the productUser table (userId, State).
The problem is that I also need to list all the rows of the product table, adding the userId field with the same value on every row, with State falling back to a default value of false.
Is it possible? Alternatively, I could leave out userId and State and assign those values in my application code. Thanks
UPDATE:
If I understand your question correctly, you want to list specific fields from both tables but only when the child records match your criteria. If so, the query below would allow you to specify the userID and State.
SELECT p.prodId
,p.Quantity
,p.Name
,p.Price
,pu.userId
,pu.State
FROM Product p
INNER JOIN ProductUser pu
ON p.prodId = pu.prodId
WHERE pu.userID = @userID
AND pu.State = 0
--AND pu.State = @State
If my understanding is not correct, please post some sample table data and indicate which results you want returned.
Update to your update: I've defaulted State to zero in the query above. Followup question: do you want columns from both tables or are you trying to do an existence check against ProductUser to return just columns from Product?
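If the goal is instead to list every product, whether or not a ProductUser row exists, a LEFT JOIN with a default for State is one option. The sketch below is only an illustration under that assumption; it uses Python's sqlite3 module so it can run standalone, and the function name products_for_user and the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (prodId INTEGER PRIMARY KEY, Quantity INTEGER, Name TEXT, Price REAL);
    CREATE TABLE ProductUser (prodId INTEGER, userId INTEGER, State INTEGER);
    INSERT INTO Product VALUES (1, 5, 'Pen', 1.5), (2, 3, 'Ink', 4.0);
    INSERT INTO ProductUser VALUES (1, 42, 1);
""")

def products_for_user(user_id):
    """List all products; State defaults to 0 (false) when no match exists."""
    return conn.execute(
        """
        SELECT p.prodId, p.Quantity, p.Name, p.Price,
               ? AS userId,                      -- same value on every row
               COALESCE(pu.State, 0) AS State    -- default when unmatched
        FROM Product AS p
        LEFT JOIN ProductUser AS pu
               ON pu.prodId = p.prodId AND pu.userId = ?
        ORDER BY p.prodId
        """,
        (user_id, user_id),
    ).fetchall()

rows = products_for_user(42)
```

The user filter sits in the ON clause rather than the WHERE clause; moving it to WHERE would discard the unmatched products and turn the query back into an inner join.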

Design : multiple visits per patient

Above is my schema. What you can't see in tblPatientVisits is the foreign key from tblPatient, which is patientid.
tblPatient contains a distinct copy of each patient in the dataset as well as their gender. tblPatientVisits contains their demographic information, where they lived at time of admission and which hospital they went to. I chose to put that information into a separate table because it changes throughout the data (a person can move from one visit to the next and go to a different hospital).
I don't get any strange numbers with my queries until I add tblPatientVisits. There are just under one million claims in tblClaims, but when I add tblPatientVisits so I can check out where that person was from, it returns over a million. I think this is due to the fact that in tblPatientVisits the same patientID shows up more than once (due to the fact that they had different admission/discharge dates).
For the life of me I can't see where this is incorrect design, nor do I know how to rectify it beyond doing one query with count(tblPatientVisits.patientID)=1 and then a union with count(tblPatientVisits.patientID)>1.
Any insight into this type of design, or how I might more elegantly find a way to get the claimType from tblClaims to give me the correct number of rows when I associate a claim ID with a patientID?
EDIT: The biggest problem I'm having is the fact that if I include the admissionDate, dischargeDate or the patientState in the tblPatient table I can't use the patientID as a primary key.
It should be noted that tblClaims are NOT necessarily related to tblPatientVisits.admissionDate, tblPatientVisits.dischargeDate.
EDIT: sample queries to show that when tblPatientVisits is added, more rows are returned than claims
SELECT tblclaims.id, tblClaims.claimType
FROM tblClaims INNER JOIN
tblPatientClaims ON tblClaims.id = tblPatientClaims.id INNER JOIN
tblPatient ON tblPatientClaims.patientid = tblPatient.patientID INNER JOIN
tblPatientVisits ON tblPatient.patientID = tblPatientVisits.patientID
more than one million query rows returned
SELECT tblClaims.id, tblPatient.patientID
FROM tblClaims INNER JOIN
tblPatientClaims ON tblClaims.id = tblPatientClaims.id INNER JOIN
tblPatient ON tblPatientClaims.patientid = tblPatient.patientID
less than one million query rows returned
I think this is crying for a better design. I really think that a visit should be associated with a claim, and that a claim can only be associated with a single patient, so I think the design should be (and eliminating the needless tbl prefix, which is just clutter):
CREATE TABLE dbo.Patients
(
PatientID INT PRIMARY KEY
-- , ... other columns ...
);
CREATE TABLE dbo.Claims
(
ClaimID INT PRIMARY KEY,
PatientID INT NOT NULL FOREIGN KEY
REFERENCES dbo.Patients(PatientID)
-- , ... other columns ...
);
CREATE TABLE dbo.PatientVisits
(
PatientID INT NOT NULL FOREIGN KEY
REFERENCES dbo.Patients(PatientID),
ClaimID INT NULL FOREIGN KEY
REFERENCES dbo.Claims(ClaimID),
VisitDate DATE
-- , ... other columns ...
, PRIMARY KEY (PatientID, ClaimID, VisitDate) -- not convinced on this one: a PRIMARY KEY cannot include the nullable ClaimID
);
There is some redundant information here, but it's not clear from your model whether a patient can have a visit that is not associated with a specific claim, or even whether you know that a visit belongs to a specific claim (this seems like crucial information given the type of query you're after).
In any case, given your current model, one query you might try is:
SELECT c.id, c.claimType
FROM dbo.tblClaims AS c
INNER JOIN dbo.tblPatientClaims AS pc
ON c.id = pc.id
INNER JOIN dbo.tblPatient AS p
ON pc.patientid = p.patientID
-- where exists tells SQL server you don't care how many
-- visits took place, as long as there was at least one:
WHERE EXISTS (SELECT 1 FROM dbo.tblPatientVisits AS pv
WHERE pv.patientID = p.patientID);
This will still return one row for every patient / claim combination, but it will no longer return an extra row for every visit. Again, it really feels like the design isn't right here. You should also get in the habit of using table aliases - they make your query much easier to read, especially if you insist on the messy tbl prefix. You should also always use the dbo (or whatever schema you use) prefix when creating and referencing objects.
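The difference between joining to the visits table and testing it with EXISTS can be seen on a tiny data set. This is a sketch using Python's sqlite3 module with the table names from the question; the sample rows (one patient, two claims, three visits) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblPatient (patientID INTEGER PRIMARY KEY);
    CREATE TABLE tblClaims (id INTEGER PRIMARY KEY, claimType TEXT);
    CREATE TABLE tblPatientClaims (id INTEGER, patientid INTEGER);
    CREATE TABLE tblPatientVisits (patientID INTEGER, admissionDate TEXT);
    INSERT INTO tblPatient VALUES (1);
    INSERT INTO tblClaims VALUES (10, 'A'), (11, 'B');
    INSERT INTO tblPatientClaims VALUES (10, 1), (11, 1);
    -- Three visits for the same patient.
    INSERT INTO tblPatientVisits VALUES
        (1, '2020-01-01'), (1, '2020-02-01'), (1, '2020-03-01');
""")

base = """
    SELECT c.id, c.claimType
    FROM tblClaims AS c
    INNER JOIN tblPatientClaims AS pc ON c.id = pc.id
    INNER JOIN tblPatient AS p ON pc.patientid = p.patientID
"""

# Joining to visits repeats each claim once per visit: 2 claims x 3 visits.
joined = conn.execute(
    base + " INNER JOIN tblPatientVisits AS pv ON pv.patientID = p.patientID"
).fetchall()

# EXISTS only asks whether at least one visit exists, so no rows are repeated.
exists = conn.execute(
    base + " WHERE EXISTS (SELECT 1 FROM tblPatientVisits AS pv"
           " WHERE pv.patientID = p.patientID)"
).fetchall()
```

Here len(joined) is 6 while len(exists) is 2, which mirrors the inflated row count described in the question.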
I'm not sure I understand the concept of a claim but I suspect you want to remove the link table between claims and patient and instead make the association between patient visit and a claim.
Would that work out better for you?

Update values in each row based on foreign_key value

Downloads table:
id (primary key)
user_id
item_id
created_at
updated_at
The user_id and item_id in this case are both incorrect: they hold the import_id of the matching row in the users and items tables, respectively, rather than its id. Here's what I'm trying to script:
downloads.each do |download|
user = User.find_by_import_id(download.user_id)
item = Item.find_by_import_id(download.item_id)
if user && item
download.update_attributes(:user_id => user.id, :item_id => item.id)
end
end
So: look up the user and item based on their respective import_ids, then update those values in the download record.
This takes forever with a ton of rows. Any way to do this in SQL?
If I understand you correctly, you simply need to add two subqueries in your SELECT statement to look up the correct IDs. For example:
SELECT id,
(SELECT correct_id FROM User WHERE import_id=user_id) AS UserID,
(SELECT correct_id FROM Item WHERE import_id=item_id) AS ItemID,
created_at,
updated_at
FROM Downloads
This will translate your incorrect user_ids to whatever ID you want to come from the User table and it will do the same for your item_ids. The information coming from SQL will now be correct.
If, however, you want to update the tables with the correct information, you could write this like so:
UPDATE Downloads
SET user_id = User.user_id,
item_id = Item.item_id
FROM Downloads
INNER JOIN User ON Downloads.user_id = User.import_id
INNER JOIN Item ON Downloads.item_id = Item.import_id
WHERE ...
Make sure to put something in the WHERE clause so you don't update every record in the Downloads table (unless that is the plan). I rewrote the above statement to be a bit more optimized since the original version had two SELECT statements per row, which is a bit intense.
Edit:
Since this is PostgreSQL, you can't have the table name in both the UPDATE and the FROM section. Instead, the tables in the FROM section are joined to the table being updated. Here is a quote about this from the PostgreSQL website:
When a FROM clause is present, what essentially happens is that the target table is joined to the tables mentioned in the fromlist, and each output row of the join represents an update operation for the target table. When using FROM you should ensure that the join produces at most one output row for each row to be modified. In other words, a target row shouldn't join to more than one row from the other table(s). If it does, then only one of the join rows will be used to update the target row, but which one will be used is not readily predictable.
http://www.postgresql.org/docs/8.1/static/sql-update.html
With this in mind, here is an example that I think should work (can't test it, sorry):
UPDATE Downloads
SET user_id = User.user_id,
item_id = Item.item_id
FROM User, Item
WHERE Downloads.user_id = User.import_id AND
Downloads.item_id = Item.import_id
That is the basic idea. Don't forget you will still need to add extra criteria to the WHERE section to limit the rows that are updated.
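For comparison, the same remapping can be written without UPDATE ... FROM, using correlated subqueries, which most engines accept. The sketch below runs against SQLite through Python's sqlite3 module rather than PostgreSQL; the table names users and items and the id / import_id columns are assumed from the Ruby snippet in the question, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, import_id INTEGER UNIQUE);
    CREATE TABLE items (id INTEGER PRIMARY KEY, import_id INTEGER UNIQUE);
    CREATE TABLE downloads (id INTEGER PRIMARY KEY, user_id INTEGER, item_id INTEGER);
    INSERT INTO users VALUES (1, 900), (2, 901);
    INSERT INTO items VALUES (7, 500);
    -- Download 3 points at an import_id (999) that has no matching user.
    INSERT INTO downloads VALUES (1, 900, 500), (2, 901, 500), (3, 999, 500);
""")

conn.execute("""
    UPDATE downloads
    SET user_id = (SELECT u.id FROM users AS u WHERE u.import_id = downloads.user_id),
        item_id = (SELECT i.id FROM items AS i WHERE i.import_id = downloads.item_id)
    -- Mirror the "if user && item" guard: skip rows without both matches.
    WHERE EXISTS (SELECT 1 FROM users AS u WHERE u.import_id = downloads.user_id)
      AND EXISTS (SELECT 1 FROM items AS i WHERE i.import_id = downloads.item_id)
""")

result = conn.execute(
    "SELECT id, user_id, item_id FROM downloads ORDER BY id"
).fetchall()
```

The UNIQUE constraints on import_id guarantee at most one match per download row, which is the condition the PostgreSQL documentation quoted above asks for. A statement like this is meant to be run once: if import_id values overlap with real ids, a second run would remap rows that are already correct.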
I'm totally guessing from your question, but you have some kind of lookup table that will match an import user_id with the real user_id, and similarly for items. That is, the assumption is that your line of code:
User.find_by_import_id(download.user_id)
hits the database to do the lookup. The import_users / import_items tables are just the names I've given to the lookup tables that do this.
UPDATE downloads
SET downloads.user_id = import_users.user_id
, downloads.item_id = import_items.item_id
FROM downloads
INNER JOIN import_users ON downloads.user_id = import_users.import_user_id
INNER JOIN import_items ON downloads.item_id = import_items.import_item_id
Either way (whether the lookup is in the DB or derived from code), would it not be easier to insert the information correctly in the first place? As it stands, you can't have any FKs on your table, since sometimes the columns point to one table and other times to another, which seems a bit odd.

SQL - Updating records based on most recent date

I am having difficulty updating records within a database based on the most recent date and am looking for some guidance. By the way, I am new to SQL.
As background, I have a windows forms application with SQL Express and am using ADO.NET to interact with the database. The application is designed to enable the user to track employee attendance on various courses that must be attended on a periodic basis (e.g. every 6 months, every year etc.). For example, they can pull back data to see the last time employees attended a given course and also update attendance dates if an employee has recently completed a course.
I have three data tables:
EmployeeDetailsTable - simple list of employees names, email address etc., each with unique ID
CourseDetailsTable - simple list of courses, each with unique ID (e.g. 1, 2, 3 etc.)
AttendanceRecordsTable - has 4 columns { EmployeeID, CourseID, AttendanceDate, Comments }
For any given course, an employee will have an attendance history i.e. if the course needs to be attended each year then they will have one record for as many years as they have been at the company.
What I want to be able to do is to update the 'Comments' field for a given employee and given course based on the most recent attendance date. What is the 'correct' SQL syntax for this?
I have tried many things (like below) but cannot get it to work:
UPDATE AttendanceRecordsTable
SET Comments = @Comments
WHERE AttendanceRecordsTable.EmployeeID = (SELECT EmployeeDetailsTable.EmployeeID FROM EmployeeDetailsTable WHERE (EmployeeDetailsTable.LastName = @ParameterLastName AND EmployeeDetailsTable.FirstName = @ParameterFirstName)
AND AttendanceRecordsTable.CourseID = (SELECT CourseDetailsTable.CourseID FROM CourseDetailsTable WHERE CourseDetailsTable.CourseName = @CourseName))
GROUP BY MAX(AttendanceRecordsTable.LastDate)
After much googling, I discovered that MAX is an aggregate function and so I need to use GROUP BY. I have also tried using the HAVING keyword but without success.
Can anybody point me in the right direction? What is the 'conventional' syntax to update a database record based on the most recent date?
So you want to update the AttendanceRecordsTable, and set the comment on the most recent attendance record for a given employee and course?
UPDATE
ar
SET
Comments = @Comments
FROM
dbo.AttendanceRecordsTable ar
INNER JOIN
dbo.EmployeeDetailsTable e ON e.EmployeeID = ar.EmployeeID
INNER JOIN
dbo.CourseDetailsTable cd ON cd.CourseID = ar.CourseID
WHERE
e.LastName = @LastName
AND e.FirstName = @FirstName
AND cd.CourseName = @CourseName
-- only touch the row holding the latest date for this employee and course
AND ar.LastDate =
(SELECT MAX(a.LastDate)
FROM dbo.AttendanceRecordsTable a
WHERE a.EmployeeID = e.EmployeeID
AND a.CourseID = cd.CourseID)
I think something like that should work.
You basically need to do a join between the AttendanceRecordsTable, which you want to update, and the EmployeeDetailsTable and CourseDetailsTable tables. For these two, you have defined certain parameters to select a single row each, and then you need to make sure to update only the last AttendanceRecordsTable entry, which you do by making sure it has the MAX(LastDate) for that employee and course.
The subselect here:
(SELECT MAX(LastDate)
FROM AttendanceRecordsTable a
WHERE a.EmployeeID = e.EmployeeID AND a.CourseID = cd.CourseID)
will select the MAX (last) of the LastDate entries in AttendanceRecordsTable, based on selection of a given employee (e.EmployeeID) and a given course (cd.CourseID).
Pair that with the conditions that select the single employee by first name and last name (that of course only works if you never have two John Millers in your employee table!). You also select the course by means of the course name, so that too must be unique - otherwise you'll get multiple hits in the course table.
Marc
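The "update only the row with the latest date" idea can be tried out on a small data set. The following is a sketch using Python's sqlite3 module, not SQL Server; it expresses the same condition with correlated subqueries because SQLite's UPDATE syntax differs. The function name comment_latest and the sample rows are assumptions for the example, and the date column is called LastDate as in the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EmployeeDetailsTable (EmployeeID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE CourseDetailsTable (CourseID INTEGER PRIMARY KEY, CourseName TEXT);
    CREATE TABLE AttendanceRecordsTable (EmployeeID INTEGER, CourseID INTEGER, LastDate TEXT, Comments TEXT);
    INSERT INTO EmployeeDetailsTable VALUES (1, 'John', 'Miller');
    INSERT INTO CourseDetailsTable VALUES (1, 'Fire Safety');
    INSERT INTO AttendanceRecordsTable VALUES
        (1, 1, '2022-05-01', NULL), (1, 1, '2023-05-01', NULL);
""")

def comment_latest(first, last, course, comment):
    """Set Comments on the newest attendance row; return rows changed."""
    cur = conn.execute(
        """
        UPDATE AttendanceRecordsTable
        SET Comments = :comment
        WHERE EmployeeID = (SELECT EmployeeID FROM EmployeeDetailsTable
                            WHERE FirstName = :first AND LastName = :last)
          AND CourseID = (SELECT CourseID FROM CourseDetailsTable
                          WHERE CourseName = :course)
          -- latest date for this employee/course pair
          AND LastDate = (SELECT MAX(a.LastDate) FROM AttendanceRecordsTable AS a
                          WHERE a.EmployeeID = AttendanceRecordsTable.EmployeeID
                            AND a.CourseID = AttendanceRecordsTable.CourseID)
        """,
        {"comment": comment, "first": first, "last": last, "course": course},
    )
    return cur.rowcount

updated = comment_latest("John", "Miller", "Fire Safety", "Passed")
rows = conn.execute(
    "SELECT LastDate, Comments FROM AttendanceRecordsTable ORDER BY LastDate"
).fetchall()
```

Dates are stored as ISO-8601 text here so that MAX compares them chronologically; with a real DATE or DATETIME column that concern does not arise.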
Assuming that your primary key on the AttendanceRecordsTable is id:
UPDATE AttendanceRecordsTable SET Comments = @Comments
WHERE AttendanceRecordsTable.id = (
SELECT TOP 1 AttendanceRecordsTable.id
FROM EmployeeDetailsTable
JOIN AttendanceRecordsTable ON AttendanceRecordsTable.EmployeeID = EmployeeDetailsTable.EmployeeID
JOIN CourseDetailsTable ON AttendanceRecordsTable.CourseID = CourseDetailsTable.CourseID
WHERE
EmployeeDetailsTable.LastName = @ParameterLastName AND EmployeeDetailsTable.FirstName = @ParameterFirstName AND
CourseDetailsTable.CourseName = @CourseName
ORDER BY AttendanceRecordsTable.LastDate DESC)
Basically, that subselect will first join the attendance, employee and course detail tables, extract those rows where the employee's and course's names match those given by your parameters, and limit the output, in descending date order, to one row. You might want to test that subselect statement first.
Edit: I just read your posting again, you don't have a single primary key column on AttendanceRecordsTable. Bummer.

How can I compare two tables and delete on matching fields (not matching records)

Scenario: A sampling survey needs to be performed on a membership of 20,000 individuals. The survey sample size is 3,500 of the total 20,000 members. All members are in table tblMember. The same survey was performed the previous year, and the members who were surveyed are in tblSurvey08. Membership data can change over the year (e.g. new email address, etc.) but the MemberID data stays the same.
How do I remove the MemberIDs/records contained in tblSurvey08 from tblMember to create a new table of potential members to be surveyed (let's call it tblPotentialSurvey09)? Again, the record for an individual member may not match between the different tables, but the MemberID field will remain constant.
I am fairly new at this stuff but I seem to be having a problem Googling a solution - I could use the EXCEPT function but the records for the individuals members are not necessarily the same from one table to next - just the MemberID may be the same.
Thanks
SELECT
m.* -- replace with an explicit column list
FROM
tblMember m
LEFT JOIN
tblSurvey08 s08
ON m.MemberID = s08.MemberID
WHERE
s08.MemberID IS NULL
will give you only members not in the 08 survey. On many engines this join performs at least as well as a NOT IN construct, and unlike NOT IN it is not thrown off by NULLs in tblSurvey08.MemberID.
A new table is not such a great idea, since you are duplicating data. A view with the above query would be a better choice.
I apologize in advance if I didn't understand your question but I think this is what you're asking for. You can use the insert into statement.
insert into tblPotentialSurvey09
select your_criteria from tblMember where tblMember.MemberId not in (
select MemberId from tblSurvey08
)
First of all, I wouldn't create a new table just for selecting potential members. Instead, I would create a new true/false (1/0) field telling if they are eligible.
However, if you'd still want to copy data to the new table, here's how you can do it:
INSERT INTO tblPotentialSurvey09 (MemberID)
SELECT MemberID
FROM tblMember m
WHERE NOT EXISTS (SELECT 1 FROM tblSurvey08 s WHERE s.MemberID = m.MemberID)
If you just want to create a new field as I suggested, a similar query would do the job.
An outer join should do:
select m_09.MemberID
from tblMember m_09 left outer join
tblSurvey08 m_08 on m_09.MemberID = m_08.MemberID
where
m_08.MemberID is null
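The three approaches suggested in these answers (LEFT JOIN ... IS NULL, NOT EXISTS, and NOT IN) can be compared side by side. The sketch below uses Python's sqlite3 module with made-up sample rows; the helper ids is an assumption for the example. It also shows the one case where the forms diverge: a NULL MemberID in tblSurvey08.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblMember (MemberID INTEGER PRIMARY KEY, Email TEXT);
    CREATE TABLE tblSurvey08 (MemberID INTEGER, Email TEXT);
    INSERT INTO tblMember VALUES
        (1, 'a@new.example'), (2, 'b@x.example'), (3, 'c@x.example');
    -- Member 1 was surveyed last year under an older email address.
    INSERT INTO tblSurvey08 VALUES (1, 'a@old.example');
""")

left_join = """SELECT m.MemberID FROM tblMember AS m
               LEFT JOIN tblSurvey08 AS s ON s.MemberID = m.MemberID
               WHERE s.MemberID IS NULL ORDER BY m.MemberID"""
not_exists = """SELECT m.MemberID FROM tblMember AS m
                WHERE NOT EXISTS (SELECT 1 FROM tblSurvey08 AS s
                                  WHERE s.MemberID = m.MemberID)
                ORDER BY m.MemberID"""
not_in = """SELECT m.MemberID FROM tblMember AS m
            WHERE m.MemberID NOT IN (SELECT MemberID FROM tblSurvey08)
            ORDER BY m.MemberID"""

def ids(sql):
    """Run a query and return the MemberIDs as a plain list."""
    return [row[0] for row in conn.execute(sql)]

# All three forms agree while tblSurvey08.MemberID has no NULLs.
before = (ids(left_join), ids(not_exists), ids(not_in))

# A single NULL in the subquery makes NOT IN return no rows at all.
conn.execute("INSERT INTO tblSurvey08 VALUES (NULL, 'unknown@x.example')")
after = (ids(left_join), ids(not_exists), ids(not_in))
```

Only MemberID is compared, so the changed email address for member 1 does not matter, which is the reason EXCEPT on whole rows was not suitable here.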