Best way to mimic inheritance in postgresql? - orm

For an application I am writing, there are two types of "users", those who have made accounts and those who have not, virtual_users. These two types are nearly identical, except account_users have a password, and email is required and must be unique amongst all account_users, although it can be the same as any number for virtual_users. A large number of tables have a column that references users, which should include both, and 90% of app functionality treats them as interchangeable. What is the best way of handling this? Some options I have considered:
-Put both types of users in the same table and have a complicated constraints regarding uniqueness, basically, if password is not NULL, email must be unique among all users where password is not NULL. I have no idea how I would write this constraint. On the few occasions I only want account_users query for only users who have a password. This seems like the best solution if I can figure out how to write the constraint.
-Have Account_users inherit from Virtual_usersand Virtual_users has an additional column password and unique constraints on email. From here there are two potential options:
---Have a Users table which includes two columns account_user_id and virtual_user_id one of which is NULL and one of which corresponds to the appropriate user. When other tables need to reference a user, they reference this table. Have all my queries server side for users query both tables and combine.
---When other tables need to reference they reference either table. I don't think this is possible. Have all my queries server side for users query both tables and combine.
Any advice would be appreciated.

I assume the scenario is you have a system which some parts require the user to be signed into a registered account, and others do not, but you'd still like to track users.
Postgres has table inheritance. You could use that, but I'd be concerned about the caveats.
You could put them all into one table and use some sort of flag, like Single Table Inheritance, but then you run into constraint issues. You would then enforce constraints in the model. This should be fine if you have a strong model.
You could have separate accounts and users. Rather than one being a special case of the other, they key is thinking of them as two conceptually different things. In OO terms, an account has a user.
-- Visitors to the site who haven't signed up.
create table users (
id serial,
-- The unverified email they might have given you.
email text,
-- Any other common information like a tracking token
token text
);
-- Users who have registered.
create table accounts (
id serial,
user_id int references users(id),
-- Their verified email.
email text not null,
-- Hashed, of course.
password text not null
-- any additional information only for accounts
);
I like this because there are no flags involved, users and accounts can have separate constraints, and the accounts table doesn't get bloated with users that showed up once and never came back.
You'd access complete account information with a simple join.
select accounts.*, users.token
from accounts
join users on accounts.user_id = users.id
where accounts.id = ?
If you want to delete an account you can do so without losing the underlying user information.
delete from accounts where accounts.id = ?
Systems which require an account use accounts. Systems which don't use users. Systems which behave differently for users and accounts can check if a user has an account easily.
select accounts.id
from accounts
where accounts.user_id = ?
User tracking is associated with the users table, so you have a consistent record of a user's behavior before and after they register.

Related

Same form that submits to two different tables

I am quite new to Microsoft Access but i have been working with databases for quite a while now, so I understand "the back end" (so to speak) as to how I will set up this database, but I'm not sure how to tell Access what to do and how to do it.
The aim of my database is to load up a customer's account with their information; eg, phone number address, email, etc. All this information has been migrated to a table within my new database project.
from here I would like to create a separate table which has notes which can be assigned to each user's account number. the aim of this is to be able to read up on recent activity of these customers and be able to search and filter that information on an easy to use front end interface.
So so far I have two tables, one with the customer's information in it and another which is where I would like to save the notes for each customer.
the primary key which I will be using is the customer's account numbers. This, of course, being unique to each customer and would be perfect for the primary key in both of these tables.
I have set up a relationship between both of the tables as they will both contain the user's account numbers.
I'm just not sure how to go about it and would really appreciate the help. I am currently using Microsoft Access 2007.
for your second table you need to use a new primary key like "customerInformationID". Reference the customer table by a foreign key.
Then you can join these tables like this:
SELECT c.Name, c.Number, ci.notes
FROM customer AS c
INNER JOIN customerInformation AS ci ON c.customerID = ci.customerID
Why do you need two tables anyways? If you don't need them for a technical purpose like "every user shall be able to store his own notes to a customer" then you can also store them in your main table as all of these information are only dependent from the primary key.
Best Regards
Bruellhusten

Database design relations in User and Profile

I'm designing a web application for a school. So far, I'm stuck with the database which has these tables:
users
id
username
password
profile
user_id (FK)
name
last_name
sex
group_id (FK)
(other basic information)
... And other tables irrelevant now, like events, comitees, groups and so on.
So, the users table stores basic information about the login, and the profiles table stores all the personal data about the user.
Now, the *group_id* column in the profile table has a foreign key that references the ID column of the group in which the user is currently enrolled, in the groups table. A user can only be enrolled in one group at once, so there's no need for any additional tables.
The thing is that it doesn't make much sense to me declaring a relation like group HAS MANY profiles. Instead, the relation should be group HAS MANY users, but then, I would have to put a *group_id* column on the users table, which doesn't really fit in, since the users table only stores auth information.
On the other side, I would like to list all the users enrolled in a group using an ORM and getting the a users collection and not profiles. The way I see it, is that the users table is like the 'parent' and the profiles table extends the users table.
The same problem would occur when setting attendances for events. Should I reference the profile as a foreign key in the events_attendance table? Or should I reference the user ID?
Of course both solutions could be implemented and work, but which of them is the best choice?
I have dug a little and found that both solutions would comply with 3NF, so in theory, would be correct, but I'm having a hard time designing the right way my database.
This is a question of your own conventions. You need to decide what is the main entity, right after that you can easiy find a proper solution. Both ways are good, but if you think of User as of the main entity while Profile is a property then you should put GroupId into User, otherwise, if you mean User and Profile as a single entity, you can leave GroupId in Profile, and by this you're not saying group HAS MANY profiles but group HAS MANY users.
By setting a proper one-to-one relation (User-Profile) you can force your data integrity good enough.

Link table(s) or redundant columns, SQL optimisation

I have two tables, Users and People, both of which share a common attribute, email address, of which they should be allowed to have many email addresses.
I can see three options myself:
One link table with redundant columns:
Users [id,email_id] and People [id,email_id]
EmailAddress [id,user_id,person_id,email_id]
Emails [id,address,type]
Two link tables without redundancies:
Users [id,email_id] and People [id,email_id]
PersonEmail [id,person_id,email_id]
UserEmail [id,user_id,email_id]
Emails [id,address,type]
No link tables with redundant columns:
Users [id] and People [id]
Emails [id,address,type,user_id,person_id]
Does anyone have any idea what would be the best option, or if there is any other ways? Also, if anyone knows how to implement or feel it is better to have link tables without the generated id column please also specify.
Update: a User has many People, a person belongs to a User
First off, the relationship between user and e-mail is 1:N, not M:N, so in any case you don't need the "link" table EmailAddress.
You need to decide which of these possibilities is true for your application:
User is always person.
Person is always user.
There can be a person that is not user and there can be a user that is not person.
Option 1:
Assuming the option (1) is the correct one, the logical model should look like this:
The symbol between Person and User is "category", which at the level of the physical database can be implemented either:
as a "1 to 0 or 1" relationship between separate tables Person and User,
or a single table containing both person and user fields, where user fields are NULL for persons that are not also users.
If you have...
many user-specific fields,
there are user-specific foreign keys,
new kinds of persons could be added in the future
and you don't need to squeeze-out every last drop of performance,
...choose the implementation strategy with two tables.
If there are:
relatively few user-specific fields,
there are no user-specific relationships,
low "evolvability" is acceptable
and performance is of high importance,
...choose the implementation strategy with the single table.
Similar analysis can be done for each of the remaining possibilities...
Option 2:
Option 3:
If the two entities are conceptually related, then it might make sense to have one table. But if they are two different concepts, then in my experience it is best to have separate tables in order to avoid future confusion. And you're not going to take a big hit anywhere by doing so.
Isn't the User a Person (People)?
That would solve the redundant field issue right away.
----------
| Person |
----------
|
--------
| User |
--------
The User should have the single e-mail field, or mantain the relation with the e-mails table, since Person is an abstract concept not related to any application.
I would say start thinking about (re)modeling your schema, so you won't have problems like this.
Read the Multiple Table Inheritance in Rails guide, that should get you started.

Should I include user_id in multiple tables?

I'm at the planning stages of a multi-user application where each user will only have access their own data. There'll be a few tables that relate to each other, so I could use JOINs to ensure they're accessing only their data, but should I include user_id in each table? Would this be faster? It would certainly make some of the queries easier in the long run.
Specifically, the question is about multiple tables containing the user_id field.
For example, each user can configure categories, items (in those categories), and sub-items against those items. There's a logical path from user, to sub-items through the other tables, but it would require 3 JOINs. Should I just include user_id in all the tables?
Thanks!
This is a design decision in multi-tenant databases. With "root" tables, obviously you have to have the user_id. But in the non-"root" tables, you do have a choice when you are using surrogate PKs.
Say you have users with projects and projects with actions. Projects obviously has to have a user_id, but if actions are tied to one and only one project, then the user_id is redundant, and also violates normal form, since if it was to move to another user's project (probably not likely in your use cases), both the project FK and the user FK would have to be updated. Typically in multi-tenant scenarios, this isn't really a possible scenario, and so the primary key of every table is really a combination of tenant and a unique primary key "within" the tenant (which may also happen to be globally unique).
If you use natural keys extensively in your design, then clearly tenant+natural key is necessary so that each tenant's natural keys can be used. It's only when using surrogates like IDENTITY or GUIDs or sequences, that this becomes an issue, since it is tempting to make the IDENTITY the PK, after all, it is unique by definition.
Having the user_id in all tables does allow you to do certain things in views to enhance security (defense in depth), giving you a little bit of defensive programming (in SQL Server you can restrict all access through inline table valued function - essentially parametrized views - which require the app to specify user_id on every "table" access), and also allows you to easily scale out to multiple databases by forklifting everything on shared keys.
See this article for some interesting insights.
(In a massively multi-parallel paradigm like Teradata, the PRIMARY INDEX determines the amp on which the data lives, so I would think that this is a must to stop redistribution of rows to the other amps.)
In general, I would say you have a tenantid in each table, it should be the first column in the table, in most indexes and should be part of the primary key in most cases, unless otherwise justified. Where possible, it should be a required parameter in most stored procedures.
Generally, you use foreign keys to relate data between tables. In many cases, this foreign key is the user id. For example:
users
id
name
phonenumbers
user_id
phonenumber
So yes, that'd make perfect sense.
If a category can only belong to one user then yes, you need to include the user_id in the category table. If a category can belong to multiple people then you would have a separate table that maps category IDs to user IDs. You can still do this if you have a one to one mapping between the two, but there is no real reason for it.
You don't need to include the user_id in further tables if you can guarantee that those child tables will always be accessed via joining to the category table. If there is a chance that you will access them independantly of the category table then you should also have the user_id on those tables.
The extent to which to normalize can be a difficult decision. One of the best StackOverflow answers on this topic (Database Development Mistakes Made by App Developers) warns against both (1) failing to normalize, and (2) over-normalizing.
You mention that it might be easier "in the long run" to repeat the same data in multiple tables (that is, not to normalize that data). Look at the "Not simplifying complex queries through views" topic in the previous link. If you use views effectively, you will only have to do the 3 join query once when writing the view and then you can use a query with no joins for most purposes.
Most developers tend to under-normalize because it seems simpler. Go ahead and normalize. Use views to simplify your daily queries. When your requiremens get more complex or you decide to add features, you will be glad that you put time into a relational database design.
Alternatively, depending on your toolset, you may want to use a database abstraction layer that does the relational design under the covers while you manipulate higher level data object.
if it is Oracle, then you would probably set up a fine grained security rule to do the joins and prevent certain activities based on the existence of the original user id... (SELECT INSERT UPDATE DELETE etc)
You would need a map between the logged in user and the user_id. You could use uid, but then remember this umber may change if the database is reconstructed after some disaster...

put login and password in one table or in multiple tables for each type of user?

I have different 3 types of users and each type of user can have columns and relationships with tables that another type doesn't, but all of them have login(Unique) and password,
how would you do:
create a table for each type or
create one table for all of them or
create a table for all of them only for login and password and separate for all the other things and bind them with a FK
something else
Number 3 is the best of the options you suggested (updated slightly for clarification):
create a table for all of them for login and password and anything else that is shared and a separate table for all the other things that are not shared and bind them with a FK
Except don't store the password, store a hashed version of a salted password.
An alternative might be to assign groups and/or roles to your users. This might be more flexible than a fixed table structure, allowing you to add new roles dynamically. But it depends on your needs whether this is useful for you or not.
As Aaronaught pointed out, in the main table you need an AccountType to ensure that a user can only have one of the roles. You must remember to check the value of this column when joining the tables to ensure that a user has only one role active.
A unique constraint on the foreign key ensures that a user can only have a role once.