Data Design debate - sql

I am having a tough time with this design problem and would appreciate any insight.
I have a doctor's office that is granted certain privileges. Currently there are only 5 privileges, but more could be added. Each privilege has a status of Yes or No, though a finer-grained status could be needed in the future. Each privilege is also tied to a location (ER, inpatient, outpatient), and the set of locations could expand in the future too.
So currently I have the tables OfficePrivileges, PrivilegeLocation, PrivilegeType, PrivilegeStatus.
OfficePrivileges is a joining table between PrivilegeLocation and DoctorOffice. It has a composite primary key of OfficeID and PrivilegeLocationID.
At one time I had Type and Status joined to the OfficePrivileges table, then switched to having the Type table be a child of Location and Status be a child of Type. They are all single-primary-key tables.
If you were designing this set of tables, how would you do it? I am thinking that this is almost a hierarchy problem, and I hate them. I would like to lay out the edit screen as a crosstab table, with Location across the top, Type down the side, and the statuses as the details. That is how it currently works in the system I am trying to integrate with, but it has a COBOL backend and handles hierarchies better than a relational DB does.
EDIT: To help clear up confusion:
For my example there are Admit privileges, Attending privileges, Consulting privileges and Surgery privileges. The locations are Inpatient, ER, OutPatient, and OP/Surgery. Currently the statuses are only Yes or No, but they could change in the future depending on client need.
This information is stored in tables in my database.

I think the big thing here is to try to be as flexible as possible, since you're aware of "possible" changes but don't really want to code for them yet.
What you've got is probably pretty good. I'd go this way: you actually have a 3-way relationship between Offices, PrivilegeLocations, and PrivilegeTypes.
I would say your OfficePrivileges table should have the following 6 columns: Id (its own PK), OfficeId, LocationId, PTypeId, StartDate, EndDate. When a new privilege is granted to an office, add a record to this table linking the three, and put the first date the office holds the privilege in StartDate. If the privilege is revoked, put that date in EndDate. If it's re-enabled, either add a new row or reset the StartDate.
I would avoid making PrivilegeType a child of PrivilegeLocation, because then you have to store every location n times (n = the number of different privilege combinations). This way, you're only storing Offices, Locations, and Types once each.
So, for instance, Doctor A could have Inpatient Privileges at Location X, but only DX Imaging privileges at Location Y, while Doctor B could have Inpatient privileges at both Locations X and Y.
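As a rough illustration of the six-column table described in this answer, here is a minimal sketch using Python's built-in sqlite3 module. The sample office name, the dates, and the privilege_grid helper are assumptions made up for the example. It also assumes that "Yes" simply means an active row exists (EndDate is NULL); a finer-grained status would need its own column or table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DoctorOffice      (OfficeId   INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE PrivilegeLocation (LocationId INTEGER PRIMARY KEY, Name TEXT NOT NULL UNIQUE);
CREATE TABLE PrivilegeType     (PTypeId    INTEGER PRIMARY KEY, Name TEXT NOT NULL UNIQUE);
-- The 3-way link: one row per (office, location, type) grant.
CREATE TABLE OfficePrivileges (
    Id         INTEGER PRIMARY KEY,
    OfficeId   INTEGER NOT NULL REFERENCES DoctorOffice(OfficeId),
    LocationId INTEGER NOT NULL REFERENCES PrivilegeLocation(LocationId),
    PTypeId    INTEGER NOT NULL REFERENCES PrivilegeType(PTypeId),
    StartDate  TEXT NOT NULL,
    EndDate    TEXT              -- NULL while the privilege is still held
);
INSERT INTO DoctorOffice VALUES (1, 'Office A');  -- hypothetical sample data
INSERT INTO PrivilegeLocation VALUES (1, 'Inpatient'), (2, 'ER'), (3, 'OutPatient'), (4, 'OP/Surgery');
INSERT INTO PrivilegeType VALUES (1, 'Admit'), (2, 'Attending'), (3, 'Consulting'), (4, 'Surgery');
-- Two active grants and one revoked grant (EndDate filled in).
INSERT INTO OfficePrivileges (OfficeId, LocationId, PTypeId, StartDate, EndDate) VALUES
    (1, 1, 1, '2024-01-01', NULL),
    (1, 2, 3, '2024-01-01', NULL),
    (1, 4, 4, '2023-01-01', '2023-12-31');
""")

def privilege_grid(conn, office_id):
    """Build the crosstab: Type down the side, Location across the top."""
    locations = [r[0] for r in conn.execute(
        "SELECT Name FROM PrivilegeLocation ORDER BY LocationId")]
    types = [r[0] for r in conn.execute(
        "SELECT Name FROM PrivilegeType ORDER BY PTypeId")]
    active = set(conn.execute("""
        SELECT t.Name, l.Name
        FROM OfficePrivileges op
        JOIN PrivilegeType t     ON t.PTypeId = op.PTypeId
        JOIN PrivilegeLocation l ON l.LocationId = op.LocationId
        WHERE op.OfficeId = ? AND op.EndDate IS NULL
    """, (office_id,)))
    return {t: {l: ("Yes" if (t, l) in active else "No") for l in locations}
            for t in types}

grid = privilege_grid(conn, 1)
```

Because locations and types are read from their own tables, adding a fifth location or type only adds a column or row to the grid; the query itself does not change.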

Related

Correct relations to Loans table?

Hello stackers!
I've made a library database and I was wondering about the relations. I am making a one-to-one relation between Copies and Loans, and a one-to-many relation from Users to Loans.
A copy of a book should only be assigned to one loan at a time, and a loan should only contain one copy of a book. If a user borrows other books, those are separate loans.
A user should be able to make multiple loans, but a loan can only be assigned to one specific user.
Are my current relations between these three tables correct?
If not, I would love to know how to fix them, and where my logic went wrong.
Thank you in advance!
Following on from the earlier answer: if you add a User_Roles table, it could (and likely will) keep you from falling into the membership trap. If you assume a user with the Admin role can perform every function available to a user with only the Basic role, then every function that requires role-checking has to carry a list of acceptable roles (Admin + Basic). It is often more efficient to directly assign all the different roles, i.e. Basic AND Admin, to individual users. Then when a feature requires Basic role authorization, all users can be treated the same way. It's simpler in the long run.
The Loans table has a number of issues. First, there's no primary key; to be consistent with the rest of your design, it could be a LoanID. CopyID should be a foreign key to the Copies table (maybe that's what is currently drawn).
One 'advanced' or modern approach to your data model could be to use a temporal table to model the Copies. Since a single copy may only be lent out once at a time, properties of the loan could be added to the Copies table. Then any time a change is made to the system-versioned Copies table, the Copies_history table would automatically keep a full accounting of all prior loan activity.
The model looks good to me. You may need to add some logic to prevent a user from having more than one loan of the same copy of a book at once.
Will a user be able to borrow the same copy over and over again? If so, the relationship from Copy to Loan is 1:M.
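To make the points in these answers concrete, here is a minimal sketch in Python with the built-in sqlite3 module. It gives Loans a LoanID primary key and foreign keys, treats Copy to Loan as 1:M over time, and uses a partial unique index so that a copy can have only one open loan at once. The column names LoanDate and ReturnDate, the index name, and the try_loan helper are assumptions for the example, not part of the original diagram.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Users  (UserID INTEGER PRIMARY KEY, Name  TEXT NOT NULL);
CREATE TABLE Copies (CopyID INTEGER PRIMARY KEY, Title TEXT NOT NULL);
CREATE TABLE Loans (
    LoanID     INTEGER PRIMARY KEY,
    UserID     INTEGER NOT NULL REFERENCES Users(UserID),
    CopyID     INTEGER NOT NULL REFERENCES Copies(CopyID),
    LoanDate   TEXT NOT NULL,
    ReturnDate TEXT               -- NULL while the copy is still out
);
-- At most one open (unreturned) loan per copy; returned loans are unrestricted.
CREATE UNIQUE INDEX OneOpenLoanPerCopy ON Loans(CopyID) WHERE ReturnDate IS NULL;
INSERT INTO Users  VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO Copies VALUES (10, 'Dune');
INSERT INTO Loans (UserID, CopyID, LoanDate) VALUES (1, 10, '2024-01-05');
""")

def try_loan(user_id, copy_id, loan_date):
    """Attempt to record a loan; return False if the database rejects it."""
    try:
        conn.execute(
            "INSERT INTO Loans (UserID, CopyID, LoanDate) VALUES (?, ?, ?)",
            (user_id, copy_id, loan_date))
        return True
    except sqlite3.IntegrityError:
        return False

blocked = try_loan(2, 10, "2024-01-06")   # copy 10 is still out with user 1
conn.execute(
    "UPDATE Loans SET ReturnDate = '2024-01-20' "
    "WHERE CopyID = 10 AND ReturnDate IS NULL")
allowed = try_loan(1, 10, "2024-02-01")   # same user borrows the same copy again
loan_count = conn.execute(
    "SELECT COUNT(*) FROM Loans WHERE CopyID = 10").fetchone()[0]
```

Partial indexes are SQLite syntax; other databases offer similar features (for example filtered indexes), but the exact statement would differ.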

Best way to add content (large list) to relational database

I apologize if this may seem like somewhat of a novice question (which it probably is), but I'm just introducing myself to the idea of relational databases and I'm struggling with this concept.
I have a database with roughly 75 fields which represent different characteristics of a 'user'. One of those fields represents the locations that the user has been to, and I'm wondering what the best way is to store the data so that it is easily retrievable and can be used later on (i.e. tracking a route on Google Maps, identifying if two users shared the same location etc.)
The problem is that some users may have 5 locations in total while others may be well over 100.
Is it best to store these locations in a text file named using the unique id of each user(one location on each line, or in a csv)?
Or to create a separate table for each individual user connected to their unique id (that seems like overkill to me)?
Or, is there a way to store all of the locations directly in the single field in the original table?
I'm hoping that I'm missing a concept, or there is a link to a tutorial that will help my understanding.
If it helps, you can assume that the locations will be stored in order and will not be changed once stored. Also, these locations are static (once they are stored, I don't need to add any more, and they can't be updated).
Thank you for your time in helping me. I appreciate it!
Store the location data for the user in a separate table. The location table would link back to the user table by a common user_id.
Keeping multiple locations for a particular user in the user table itself is not a good idea: you'll end up with denormalized data.
You may want to read up on:
Referential Integrity
Relational denormalization
The most common way would be to have a separate table, something like
USER_LOCATION
+---------+-------------+
| USER_ID | LOCATION_ID |
+---------+-------------+
|         |             |
+---------+-------------+
If user 3 has 5 locations, there will be five rows containing user_id 3.
However, if you say the order of locations matter then an additional field specifying the ordinal position of the location within a user can be used.
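A minimal sketch of that layout, using Python's built-in sqlite3 module: one row per visited location, plus an ordinal column so the stored order can be recovered. The table name app_user, the column name position, the sample ids, and the route helper are all assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE app_user (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE user_location (
    user_id     INTEGER NOT NULL REFERENCES app_user(user_id),
    position    INTEGER NOT NULL,     -- ordinal position within this user's list
    location_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, position)
);
INSERT INTO app_user VALUES (3, 'user three'), (4, 'user four');
-- User 3 has five locations, so there are five rows with user_id 3.
INSERT INTO user_location VALUES
    (3, 1, 7), (3, 2, 2), (3, 3, 7), (3, 4, 9), (3, 5, 4),
    (4, 1, 9);
""")

def route(user_id):
    """Return the user's location ids in the order they were stored."""
    rows = conn.execute(
        "SELECT location_id FROM user_location "
        "WHERE user_id = ? ORDER BY position", (user_id,))
    return [r[0] for r in rows]

route_of_user_3 = route(3)
```

The same table works whether a user has 5 locations or well over 100; only the number of rows changes.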
The separate table approach is what we call normalized.
If you store a location list as a comma-separated string of location ids, for example, it is trivial to maintain the order, but you lose the ability for the database to quickly answer the question "which users have been at location x?". Your data would be what we call denormalized.
You do have options, of course, but relational databases are pretty good with joining tables, and they are not overkill. They do look a little funny when you have ordering requirements, like the one you mention. But people use them all the time.
In a relational database you would use a mapping table. So you would have user, location and userlocation tables (user is a reserved word so you may wish to use a different name). This allows you to have a many-to-many relationship, i.e. many users can visit many locations. If you want to model a route as an ordered collection of locations then you will need to do more work. This site gives an example
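The mapping-table idea from these answers can be sketched as follows, again with Python's built-in sqlite3 module. It shows the two lookups mentioned in the thread: "which users have been at location x?" and "did two users share a location?". The table name app_user (chosen to avoid the reserved word), the index, the sample rows, and both helper functions are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE app_user (user_id     INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE location (location_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Mapping table: many users can visit many locations.
CREATE TABLE userlocation (
    user_id     INTEGER NOT NULL REFERENCES app_user(user_id),
    location_id INTEGER NOT NULL REFERENCES location(location_id)
);
-- Lets the database answer location-first questions without a full scan.
CREATE INDEX userlocation_by_location ON userlocation(location_id, user_id);
INSERT INTO app_user VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
INSERT INTO location VALUES (10, 'Library'), (11, 'Cafe'), (12, 'Gym');
INSERT INTO userlocation VALUES (1, 10), (1, 11), (2, 11), (2, 12), (3, 12);
""")

def users_at(location_id):
    """Names of all users who have been at the given location."""
    rows = conn.execute("""
        SELECT DISTINCT u.name
        FROM userlocation ul
        JOIN app_user u ON u.user_id = ul.user_id
        WHERE ul.location_id = ?
        ORDER BY u.name
    """, (location_id,))
    return [r[0] for r in rows]

def shared_locations(user_a, user_b):
    """Names of locations that both users have visited (self-join)."""
    rows = conn.execute("""
        SELECT DISTINCT l.name
        FROM userlocation a
        JOIN userlocation b ON b.location_id = a.location_id
        JOIN location l     ON l.location_id = a.location_id
        WHERE a.user_id = ? AND b.user_id = ?
        ORDER BY l.name
    """, (user_a, user_b))
    return [r[0] for r in rows]

cafe_visitors = users_at(11)
```

With a comma-separated string in a single field, both questions would require parsing every user's list instead of a plain join.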

Database design for CRM User table for a specific scenario

I'm designing a database for our CRM system and need some help with the CRM User table.
Types of users:
Admin
Sales Reps from Branch 2
Sales Reps from Branch 3
Client login
Now for this scenario, would it make sense to have all users in a single table with an attribute called "type" to identify the type of user? Or should I have a separate table for each type of user? Also, there will be some information sharing between the sales reps.
I usually go with one User table with a Type associated with it. If you have additional Sales Rep attributes you want to store, create a SalesRep table with a foreign key back to the User table. Then create a view that joins User and SalesRep, so that it looks, logically, like there's a single usvSalesRep table that has all of the attributes you need for Sales Reps.
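A minimal sketch of that arrangement, using Python's built-in sqlite3 module. The table is named CrmUser here to sidestep the reserved word, and the Branch and Territory columns, the logins, and the sample rows are hypothetical stand-ins for whatever rep-specific attributes you actually need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE UserType (TypeId INTEGER PRIMARY KEY, Name TEXT NOT NULL UNIQUE);
CREATE TABLE CrmUser (
    UserId INTEGER PRIMARY KEY,
    Login  TEXT NOT NULL UNIQUE,
    TypeId INTEGER NOT NULL REFERENCES UserType(TypeId)
);
-- Extra attributes that only apply to sales reps; keyed by the same UserId.
CREATE TABLE SalesRep (
    UserId    INTEGER PRIMARY KEY REFERENCES CrmUser(UserId),
    Branch    INTEGER NOT NULL,
    Territory TEXT
);
-- The view makes reps look like one logical table.
CREATE VIEW usvSalesRep AS
    SELECT u.UserId, u.Login, s.Branch, s.Territory
    FROM CrmUser u
    JOIN SalesRep s ON s.UserId = u.UserId;

INSERT INTO UserType VALUES (1, 'Admin'), (2, 'SalesRep'), (3, 'Client');
INSERT INTO CrmUser VALUES (1, 'admin', 1), (2, 'rep.b2', 2), (3, 'rep.b3', 2), (4, 'client', 3);
INSERT INTO SalesRep VALUES (2, 2, 'North'), (3, 3, 'South');
""")

sales_reps = conn.execute(
    "SELECT UserId, Login, Branch, Territory FROM usvSalesRep ORDER BY UserId"
).fetchall()
```

Admins and clients never appear in the view because they have no SalesRep row, while login and authentication code can keep working against the single user table.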
But, this depends a lot on data volume and transaction load, so additional information you can supply there is useful.
It depends on the number of users that you expect.
But usually a single table is enough.
If you'll have billions of users, maybe you can do horizontal partitioning and make multiple tables.
A single table should be fine. And I disagree that the number of users really has much impact on its design in this case.
Whenever possible, you should design your tables to mimic real life. Admin, Sales Rep, etc. are just descriptions/attributes of who they are. Ultimately, they are all "People"... or User. So having one "User" table with "Admin" and "SalesRep" as attributes makes sense to me. Use the "Type" approach only if a user can only be one "Type". Use separate columns if a user can have multiple user types, i.e. one can be a SalesRepBranch2 and a SalesRepBranch3 at the same time. You might consider normalizing this even further.
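One possible reading of "normalizing this even further" is a junction table between users and roles, so that a user can hold any number of types without adding columns. The sketch below uses Python's built-in sqlite3 module; the table names CrmUser, Role and UserRole, the logins, and the roles_of helper are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CrmUser (UserId INTEGER PRIMARY KEY, Login TEXT NOT NULL UNIQUE);
CREATE TABLE Role    (RoleId INTEGER PRIMARY KEY, Name  TEXT NOT NULL UNIQUE);
-- One row per (user, role) pair; the composite key prevents duplicates.
CREATE TABLE UserRole (
    UserId INTEGER NOT NULL REFERENCES CrmUser(UserId),
    RoleId INTEGER NOT NULL REFERENCES Role(RoleId),
    PRIMARY KEY (UserId, RoleId)
);
INSERT INTO CrmUser VALUES (1, 'root'), (2, 'pat');
INSERT INTO Role VALUES (1, 'Admin'), (2, 'SalesRepBranch2'), (3, 'SalesRepBranch3');
INSERT INTO UserRole VALUES (1, 1), (2, 2), (2, 3);
""")

def roles_of(login):
    """All role names held by the user with this login."""
    rows = conn.execute("""
        SELECT r.Name
        FROM CrmUser u
        JOIN UserRole ur ON ur.UserId = u.UserId
        JOIN Role r      ON r.RoleId = ur.RoleId
        WHERE u.Login = ?
        ORDER BY r.Name
    """, (login,))
    return [r[0] for r in rows]

pat_roles = roles_of("pat")
```

Adding a new branch then means inserting a Role row rather than altering the user table.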

SSIS Population of Slowly Changing Dimension with outrigger

Working on a data warehouse, a suitable analogy for the problem is that we have Healthcare Practitioners. Healthcare Practitioners have a number of professional attributes and work in an open number of teams and in an open number of clinical areas.
For example, you may have a nurse who works in children's services across a number of teams as a relief/contractor/bank staff person. Or you may have a newly qualified doctor who works general medicine who is doing time in a special area pending qualifying as a consultant of that special area.
So we have an open number of areas of work and an open number of teams; we can't have team 1, team 2 etc. in our dimensions. The other attributes may also change over time, like base location (where they work out of) and the main team and area they work in.
So, following Kimball, I've gone for outriggers:
Table DimHealthProfessionals:
Key (primary key, identity)
Name
Main Team
Main Area of Work
Base Location
Other Attribute 1
Other Attribute 2
Start Date
End Date
Table OutriggerHealthProfessionalTeam:
HPKey (foreign key to DimHealthProfessionals.Key)
Team Name
Team Type
Other Team Attribute 1
Other Team Attribute 2
Table OutriggerHealthProfessionalAreaOfWork:
HPKey (as above)
Area of Work
Other AoW attribute 1
If any attribute of the HP changes, or the combination of teams or areas of work in which they work changes, we need to create a new entry in the SCD and its outrigger tables to encapsulate this.
And we're doing this in SSIS.
The source data is basically an HP table with the main attributes, a table of areas of work, a table of teams and a pair of mapping tables to map a current set of areas of work to an HP.
I have three data sources, one brings in the HCP information, one the areas of work of all HCPs and one the team memberships.
The problem is how to run over all three datasets to determine if an HP has changed an attribute, and if they have changed an attribute, how we update the DIM and two outriggers appropriately.
Can anyone point me at a best practice for this? OR suggest an alternative way of modelling this dimension?
Admittedly I may not understand everything here, but it seems to me that the relationship in this example should be reversed. Place TeamKey and the WorkAreaKey in the dimHealthProfessionals -- this should simplify things.
With this in place, you simply make sure to deliver outriggers before the dimHealthProfessionals.
Treat outriggers as dimensions in their own right. You may want to treat dimHealthProfessionals as a type 2 dimension, to properly capture the history.
EDIT
Considering that team to person is many-to-many, a fact is more appropriate.
A column in a dimension table is appropriate only if a person can belong to only one team at a time. Same with work areas.
I'm not sure I understand your question fully. If you are unsure about change detection, then use Checksums in the package. Build up a temp table with the data as it is in the source, then compare each row to its counterpart (joined via the business keys) by computing the checksum for both rows and comparing those. If they differ, the data has changed.
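The checksum comparison can be sketched outside SSIS in a few lines of Python. This is only an illustration of the idea, not an SSIS component: the function names, the dictionary-based rows, the hp_id business key, and the sample data are all assumptions. Folding the sorted set of teams into the hashed columns is one way to treat a change in team membership as a change to the HP.

```python
import hashlib

def row_checksum(row, columns):
    """Hash the tracked columns of one row into a single hex digest."""
    # repr() keeps None distinct from the string 'None'; the unit separator
    # keeps ('ab', 'c') distinct from ('a', 'bc').
    payload = "\x1f".join(repr(row[c]) for c in columns)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def detect_changes(source_rows, dim_rows, key, columns):
    """Compare source rows with current dimension rows via the business key."""
    current = {r[key]: row_checksum(r, columns) for r in dim_rows}
    new_keys, changed_keys = [], []
    for r in source_rows:
        k = r[key]
        if k not in current:
            new_keys.append(k)        # not in the dimension yet
        elif current[k] != row_checksum(r, columns):
            changed_keys.append(k)    # some tracked attribute differs
    return new_keys, changed_keys

# Hypothetical sample data: 'teams' is a sorted tuple of team names.
columns = ["name", "main_team", "teams"]
dim_rows = [
    {"hp_id": 1, "name": "Ann", "main_team": "Paediatrics", "teams": ("Paediatrics",)},
    {"hp_id": 2, "name": "Bob", "main_team": "General", "teams": ("General",)},
]
source_rows = [
    {"hp_id": 1, "name": "Ann", "main_team": "Paediatrics", "teams": ("Paediatrics",)},
    {"hp_id": 2, "name": "Bob", "main_team": "General", "teams": ("Cardiology", "General")},
    {"hp_id": 3, "name": "Cy", "main_team": "Surgery", "teams": ("Surgery",)},
]
new_keys, changed_keys = detect_changes(source_rows, dim_rows, "hp_id", columns)
```

For a historized dimension, dim_rows would hold only the currently valid row for each business key.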
If you are talking about cascading updates in a historized dimension hierarchy (and you can treat the outriggers like a hierarchy in this context), then the foreign key lookups will automatically look up the newer entry in DimHealthProfessionals, provided you have historization (i.e. validFrom / validThrough timestamps in DimHealthProfessionals). Those different foreign keys result in a different checksum.

Database design to hold a person's information that changes with time?

We use a third-party product to manage our sports centre membership. We have several membership types (eg. junior, student, staff, community) and several membership statuses (eg. annual, active, inactive, suspended). Unfortunately the product only records a member's current membership type and status. I'd like to be able to track the way our members' type and status have changed over time.
At present, we have access to the product's database design. It runs on SQL Server and we regularly run our own SQL queries against the product's tables to produce our own tables. We then link our tables to pivot-tables in Excel to produce charts. So we're familiar with database design and SQL. However we're stuck as to how to best approach this problem.
The product records a member's membership purchases and their start and expiry dates. So we can work back through that data to determine a member's type and status at any point in time. For example, if they bought a junior membership on Jan 1, 2007 and it expired on Dec 31, 2007 and then they bought a student membership on Jun 1, 2008, we can see their status went from active to inactive to active (on Jan 1, 2008 and Jun 1, 2008, respectively) and their type went from junior to student (on Jun 1, 2008).
Essentially we'd like to turn a member's type and status properties into temporal properties or effectivities à la Fowler (or some other thing that varies with time).
Our question (finally :) given the above: what database table design would you recommend we use to hold this member information? I imagine it would have a column for MemberID so we can key into the existing Member table. It would also need to store a member's status and type and the date range they were held for. We'd like to be able to easily write queries against this table (or tables) to determine how many members of each type and status we had at a given point in time.
UPDATE 2009-08-25: Have been side-tracked and haven't had a chance to try out the proposed solutions yet. Hope to do so soon and will select an answer based on the results.
Given that your system is already written and in place, the simplest approach to this problem (and the one that affects the existing database/code the least), is to add a membership history table that contains MemberID, status, type and date columns. Then add an UPDATE and an INSERT trigger to the main member table. When these triggers fire, you write the new values for the member (along with the date of the status change) into the member history table. You can then just query this table to get the histories for each member.
This is fairly simple to implement, and won't affect the existing system at all.
I'll write this for you for a free membership. :)
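A minimal sketch of the history-table-plus-triggers approach, written with Python's built-in sqlite3 module so it can be run anywhere. The product runs on SQL Server, where the trigger syntax is different, so treat this as an illustration of the idea only. The table, column, and trigger names and the sample member are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Member (
    MemberID INTEGER PRIMARY KEY,
    Name     TEXT NOT NULL,
    Type     TEXT NOT NULL,
    Status   TEXT NOT NULL
);
CREATE TABLE MemberHistory (
    HistoryID INTEGER PRIMARY KEY,
    MemberID  INTEGER NOT NULL REFERENCES Member(MemberID),
    Type      TEXT NOT NULL,
    Status    TEXT NOT NULL,
    ChangedOn TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
-- Record the initial values when a member is created.
CREATE TRIGGER Member_ai AFTER INSERT ON Member
BEGIN
    INSERT INTO MemberHistory (MemberID, Type, Status)
    VALUES (NEW.MemberID, NEW.Type, NEW.Status);
END;
-- Record a new row only when type or status actually changes.
CREATE TRIGGER Member_au AFTER UPDATE OF Type, Status ON Member
WHEN NEW.Type <> OLD.Type OR NEW.Status <> OLD.Status
BEGIN
    INSERT INTO MemberHistory (MemberID, Type, Status)
    VALUES (NEW.MemberID, NEW.Type, NEW.Status);
END;
""")

conn.execute("INSERT INTO Member VALUES (1, 'Sam', 'junior', 'active')")
conn.execute("UPDATE Member SET Status = 'inactive' WHERE MemberID = 1")
conn.execute("UPDATE Member SET Name = 'Samuel' WHERE MemberID = 1")  # no history row
conn.execute("UPDATE Member SET Type = 'student', Status = 'active' WHERE MemberID = 1")

history = conn.execute(
    "SELECT Type, Status FROM MemberHistory WHERE MemberID = 1 ORDER BY HistoryID"
).fetchall()
```

Note that triggers only capture changes from the moment they are installed; earlier history would still have to be reconstructed from the purchase records.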
I cannot recommend enough that you read Joe Celko's "SQL for Smarties: Advanced SQL Programming". He has a whole chapter on temporal database design AND on how to (efficiently and effectively) run temporal projection, selection and temporal join queries. I would not do him justice if I attempted to explain what he says in that chapter in this post.
I would create a reporting database that was organized into a star schema. The membership dimension would be arranged temporally, so that there would be different rows for the same member at different points in time. That way different rows in the fact table could pertain to different points in history.
Then I would create update procedures for updating the reporting database periodically, say once a week, from the main database. This is where the main work would come.
Then, I would drive the reports off the reporting database. It's pretty easy to make a star schema do the same things a pivot table does. If necessary, I'd get some kind of OLAP tool to sit in front of the reporting database.
This is a lot of work, but it would pay off over time.
I would put the membership info in its own table with start and end dates, keeping the customer in a separate table. This is a pain if you need the "current" membership info all the time, but there are many ways to get around that, either through queries or triggers.
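As a rough sketch of a membership table with start and end dates, and of the "how many members of each type on a given day" query, here is an example using Python's built-in sqlite3 module. The first member follows the junior and student example from the question; the student expiry date, the second member, and the names Membership and counts_on are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Member (MemberID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
-- One row per period during which a member held a given membership type.
CREATE TABLE Membership (
    MembershipID INTEGER PRIMARY KEY,
    MemberID     INTEGER NOT NULL REFERENCES Member(MemberID),
    Type         TEXT NOT NULL,
    StartDate    TEXT NOT NULL,   -- ISO dates compare correctly as text
    EndDate      TEXT NOT NULL
);
INSERT INTO Member VALUES (1, 'Sam'), (2, 'Lee');
INSERT INTO Membership (MemberID, Type, StartDate, EndDate) VALUES
    (1, 'junior',  '2007-01-01', '2007-12-31'),
    (1, 'student', '2008-06-01', '2009-05-31'),  -- expiry date is assumed
    (2, 'staff',   '2007-03-01', '2008-02-29');  -- hypothetical second member
""")

def counts_on(day):
    """Number of members holding each membership type on the given ISO date."""
    rows = conn.execute("""
        SELECT Type, COUNT(DISTINCT MemberID)
        FROM Membership
        WHERE StartDate <= ? AND EndDate >= ?
        GROUP BY Type
        ORDER BY Type
    """, (day, day))
    return dict(rows.fetchall())

mid_2007 = counts_on("2007-06-15")
```

A member with no covering row on a given day is simply inactive on that day, which matches the active, inactive, active sequence described in the question.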