User-specific database records - sql

This is more of a theoretical questions. Anyways,
Assume that I have a database (50 tables)
And Data in ALL of those tables should be hold specifically to one user.
What I mean is, usually we do foreign keys for each table that is user-specific. and retrieving data using something like:
select Column1,Col2,Col3 from Table where UserId=#UserId
This clutters queries significantly. Is the other way of storing such data?

What you're referring to has a name. It's called multi-tenant. The link gives some general information on multi-tenant structures.
This all depends on your RDBMS though. Oracle provides a tool which makes this very easy.
The
WHERE User_ID = #User_ID
becomes a policy which you can apply to tables. You also create a protected stored procedure to update #User_id to ensure that it cannot be changed without authority. Then every query to a table with such a policy adds the where clause in the background. It's automatic and there's no way to avoid it.

There might be other ways, but I wouldn't say they are better. If every record in every table needs to belong to a specific user then you need a way to define that.
A possible method to reduce the number of UserId columns would be if you have child tables and you simply assumed that the parent table defined ownership and that it was implied that it's child records belonged to that user.
Example:
CarMake:
MakeId
MakeName
UserId
CarModel:
ModelId
MakeId
ModelName
Notice that CarModel does not have a UserId column, but you could assume that it inherits the same UserId of the parent CarMake record. If you cannot assume that children should always inherit the parent UserId then your original method is best.

Related

EF: Inserting already present record in many to many relationship

For what I searched there are 2 ways to insert an already present record into a ICollection list:
group.Users.Add(db.Users.FirstOrDefault(x=> x.Id = 1));
var to_add = new User{Id: 1}; db.Users.Attach(to_add); group.Users.Add(to_add);
The problem with both the above approach is it makes a db call every time we want to add a record. While we already know the user's Id and the group's id and that's all it needs to create a relationship.
Imagine a long list to be added, both the above methods would make multiple calls to db.
So you have Groups and Users. Every Group has zero or more Users; every User has zero or more Groups. A traditional many-to-many relationship.
Normally one would add a User to a Group, or a Group to a User. However you don't have a Group, nor a User, you only have a GroupId and a UserId. and because of the large number of insertions you don't want to fetch the Users and the Groups of which you want to create relations
The problem is, if you could add the GroupId-UserId combination directly to your junction table, how would you know that you wouldn't be adding a Group-User relation that already exists? If you wouldn't care, you'd end up with twice the relation. This would lead to problems: Would you want them to be shown twice if you'd ask the Users of a Group? Which one should be removed if the relation ends, or should they all be removed?
If you really want to implement the possibility of double relation, then you'd need to Implement a a Custom Junction Table as described here The extra field would be the number of relations.
This would not help you with your large batch, because you would still need to fetch the field from the custom junction table to increment the NrOfRelations value.
On the other hand, if you don't want double relations, you'd have to check whether the value already exists, and you didn't want to fetch data before inserting.
Usually the number of additions to a database is far less then the number of queries. If you have a large batch of data to be inserted, then it is usually only during the initialization phase of the database. I wouldn't bother optimizing initialization too much.
Consider remembering already fetched Groups and Users in a dictionary, preventing them to be fetched twice. However, if your list is really huge, this is not a practical solution.
If you really need this functionality for a prolonged period of time consider creating a Stored Procedure that checks if the GroupId / UserId already exists in the junction table, and if not, add it.
See here For SQL code how to do Add-Or-Update
Entity Framework call stored procedure

Table referenced by other tables having different PKs

I would like to create a table called "NOTES". I was thinking this table would contain a "table_name" VARCHAR(100) which indicates what table put in the note, a "key" or multiple "key" columns representing the primary key values of the table that this note applies to and a "note" field VARCHAR(MAX). When other tables use this table they would supply THEIR primary key(s) and their "table_name" and get all the notes associated with the primary key(s) they supplied. The problem is that other tables might have 1, 2 or more PKs so I am looking for ideas on how I can design this...
What you're suggesting sounds a little convoluted to me. I would suggest something like this.
Notes
------
Id - PK
NoteTypeId - FK to NoteTypes.Id
NoteContent
NoteTypes
----------
Id - PK
Description - This could replace the "table_name" column you suggested
SomeOtherTable
--------------
Id - PK
...
Other Columns
...
NoteId - FK to Notes.Id
This would allow you to keep your data better normalized, but still get the relationships between data that you want. Note that this assumes a 1:1 relationship between rows in your other tables and Notes. If that relationship will be many to one, you'll need a cross table.
Have a look at this thread about database normalization
What is Normalisation (or Normalization)?
Additionally, you can check this resource to learn more about foreign keys
http://www.w3schools.com/sql/sql_foreignkey.asp
Instead of putting the other table name's and primary key's in this table, have the primary key of the NOTES table be NoteId. Create an FK in each other table that will make a note, and store the corresponding NoteId's in the other tables. Then you can simply join on NoteId from all of these other tables to the NOTES table.
As I understand your problem, you're attempting to "abstract" the auditing of multiple tables in a way that you might abstract a class in OOP.
While it's a great OOP design principle, it falls flat in databases for multiple reasons. Perhaps the largest single reason is that if you cannot envision it, neither will someone (even you) looking at it later have an easy time reassembling the data. Smaller that that though, is that while you tend to think of a table as a container and thus similar to an object, in reality they are implemented instances of this hypothetical container you are trying to put together and operate better if you treat them as such. By creating an audit table specific to a table or a subset of tables that share structural similarity and data similarity, you increase the performance of your database and you won't run in to strange trigger or select related issues later.
And you can't envision it not because you're not good at what you're doing, but rather, the structure is not conducive to database logging.
Instead, I would recommend that you create separate logging tables that manage the auditing of each table you want to audit or log. In fact, some fast google searches show many scripts already written to do much of this tasking for you: Example of one such search
You should create these individual tables and then if you want to be able to report on multiple table or even all tables at once, you can create a stored procedure (if you want to make queries based on criterion) or a view with an included SELECT statement that JOINs and/or UNIONs the tables you are interested in - in a fashion that makes sense to the report type. You'll still have to write new objects in to the view, but even with your original table design, you'd have to account for that.

What is the proper table and join structure for a many to many relationship between the same attributes of the same table?

Lets say I have a Users table with a UserID column and I need to model a scenario where a user can have multiple relationships with another user, e.g. phone calls. Any user can initiate 0...n phone calls with another user.
Would it be a classic junction table like:
UserCalls
-----------------------
| CallerID | CalleeID |
-----------------------
?
The thing that's really important to get right on these tables is the primary key. If a person is allowed to call another person several times and each is represented by a distinct row, then (Caller, Callee) is not a candidate key. There needs to be something like a surrogate key or some kind of timestamp which is used to ensure you have a good primary key.
In addition, from a business rules perspective, if the relationship is reversible where any time you are looking for calls, you only care that the two parties were the same (not who called who), having the table distinguish them in the way you have can be problematic. The typical way around that is to have a Calls table and a CallParties table which links the call to the parties in the call (which may have flags which help identify the call originator). In this way the column order dependency goes away and MAY make certain queries easier (it may make others more difficult). This can also reduce the number of indexes required.
So, I would consider first the table design as you have it, but also keep in mind the possible need for reversals.
Sounds like you're on the right track...
CREATE TABLE CallHistory
(
CallerID int,
RecipientID int,
DurationInMinutes int,
/* etc etc */
CallStartedAt smalldatetime
)
For your PK, consider this article on choosing a PK:
http://www.agiledata.org/essays/keys.html
Yes, that's correct.

What to do if 2 (or more) relationship tables would have the same name?

So I know the convention for naming M-M relationship tables in SQL is to have something like so:
For tables User and Data the relationship table would be called
UserData
User_Data
or something similar (from here)
What happens then if you need to have multiple relationships between User and Data, representing each in its own table? I have a site I'm working on where I have two primary items and multiple independent M-M relationships between them. I know I could just use a single relationship table and have a field which determines the relationship type, but I'm not sure whether this is a good solution. Assuming I don't go that route, what naming convention should I follow to work around my original problem?
To make it more clear, say my site is an auction site (it isn't but the principle is similar). I have registered users and I have items, a user does not have to be registered to post an item but they do need to be to do anything else. I have table User which has info on registered users and Items which has info on posted items. Now a user can bid on an item, but they can also report a item (spam, etc.), both of these are M-M relationships. All that happens when either event occurs is that an email is generated, in my scenario I have no reason to keep track of the actual "report" or "bid" other than to know who bid/reported on what.
I think you should name tables after their function. Lets say we have Cars and People tables. Car has owners and car has assigned drivers. Driver can have more than one car. One of the tables you could call CarsDrivers, second CarsOwners.
EDIT
In your situation I think you should have two tables: AuctionsBids and AuctionsReports. I believe that report requires additional dictinary (spam, illegal item,...) and bid requires other parameters like price, bid date. So having two tables is justified. You will propably be more often accessing bids than reports. Sending email will be slightly more complicated then when this data is stored in one table, but it is not really a big problem.
I don't really see this as a true M-M mapping table. Those usually are JUST a mapping. From your example most of these will have additional information as well. For example, a table of bids, which would have a User and an Item, will probably have info on what the bid was, when it was placed, etc. I would call this table... wait for it... Bids.
For reporting items you might want what was offensive about it, when it was placed, etc. Call this table OffenseReports or something.
You can name tables whatever you want. I would just name them something that makes sense. I think the convention of naming them Table1Table2 is just because sometimes the relationships don't make alot of sense to an outside observer.
There's no official or unofficial convention on relations or tables names. You can name them as you want, the way you like.
If you have multiple user_data relationships with the same keys that makes absolutely no sense. If you have different keys, name the relation in a descriptive way like: stores_products_manufacturers or stores_products_paymentMethods
I think you're only confused because the join tables are currently simple. Once you add more information, I think it will be obvious that you should append a functional suffix. For example:
Table User
UserID
EmailAddress
Table Item
ItemID
ItemDescription
Table UserItem_SpamReport
UserID
ItemID
ReportDate
Table UserItem_Post
UserID -- can be (NULL, -1, '', ...)
ItemID
PostDate
Table UserItem_Bid
UserId
ItemId
BidDate
BidAmount
Then the relation will have a Role. For instance a stock has 2 companies associated: an issuer and a buyer. The relationship is defined by the role the parent and child play to each other.
You could either put each role in a separate table that you name with the role (IE Stock_Issuer, Stock_Buyer etc, both have a relationship one - many to company - stock)
The stock example is pretty fixed, so two tables would be fine. When there are multiple types of relations possible and you can't foresee them now, normalizing it into a relationtype column would seem the better option.
This also depends on the quality of the developers having to work with your model. The column approach is a bit more abstract... but if they don't get it maybe they'd better stay away from databases altogether..
Both will work fine I guess.
Good luck, GJ
GJ

How to model a mutually exclusive relationship in SQL Server

I have to add functionality to an existing application and I've run into a data situation that I'm not sure how to model. I am being restricted to the creation of new tables and code. If I need to alter the existing structure I think my client may reject the proposal.. although if its the only way to get it right this is what I will have to do.
I have an Item table that can me link to any number of tables, and these tables may increase over time. The Item can only me linked to one other table, but the record in the other table may have many items linked to it.
Examples of the tables/entities being linked to are Person, Vehicle, Building, Office. These are all separate tables.
Example of Items are Pen, Stapler, Cushion, Tyre, A4 Paper, Plastic Bag, Poster, Decoration"
For instance a Poster may be allocated to a Person or Office or Building. In the future if they add a Conference Room table it may also be added to that.
My intital thoughts are:
Item
{
ID,
Name
}
LinkedItem
{
ItemID,
LinkedToTableName,
LinkedToID
}
The LinkedToTableName field will then allow me to identify the correct table to link to in my code.
I'm not overly happy with this solution, but I can't quite think of anything else. Please help! :)
Thanks!
It is not a good practice to store table names as column values. This is a bad hack.
There are two standard ways of doing what you are trying to do. The first is called single-table inheritance. This is easily understood by ORM tools but trades off some normalization. The idea is, that all of these entities - Person, Vehicle, whatever - are stored in the same table, often with several unused columns per entry, along with a discriminator field that identifies what type the entity is.
The discriminator field is usually an integer type, that is mapped to some enumeration in your code. It may also be a foreign key to some lookup table in your database, identifying which numbers correspond to which types (not table names, just descriptions).
The other way to do this is multiple-table inheritance, which is better for your database but not as easy to map in code. You do this by having a base table which defines some common properties of all the objects - perhaps just an ID and a name - and all of your "specific" tables (Person etc.) use the base ID as a unique foreign key (usually also the primary key).
In the first case, the exclusivity is implicit, since all entities are in one table. In the second case, the relationship is between the Item and the base entity ID, which also guarantees uniqueness.
Note that with multiple-table inheritance, you have a different problem - you can't guarantee that a base ID is used by exactly one inheritance table. It could be used by several, or not used at all. That is why multiple-table inheritance schemes usually also have a discriminator column, to identify which table is "expected." Again, this discriminator doesn't hold a table name, it holds a lookup value which the consumer may (or may not) use to determine which other table to join to.
Multiple-table inheritance is a closer match to your current schema, so I would recommend going with that unless you need to use this with Linq to SQL or a similar ORM.
See here for a good detailed tutorial: Implementing Table Inheritance in SQL Server.
Find something common to Person, Vehicle, Building, Office. For the lack of a better term I have used Entity. Then implement super-type/sub-type relationship between the Entity and its sub-types. Note that the EntityID is a PK and a FK in all sub-type tables. Now, you can link the Item table to the Entity (owner).
In this model, one item can belong to only one Entity; one Entity can have (own) many items.
your link table is ok.
the trouble you will have is that you will need to generate dynamic sql at runtime. parameterized sql does not typically allow the objects inthe FROM list to be parameters.
i fyou want to avoid this, you may be able to denormalize a little - say by creating a table to hold the id (assuming the ids are unique across the other tables) and the type_id representing which table is the source, and a generated description - e.g. the name value from the inital record.
you would trigger the creation of this denormalized list when the base info is modified, and you could use that for generalized queries - and then resort to your dynamic queries when needed at runtime.