Which database structure to choose? [closed] - sql

Closed 10 years ago.
I want to implement a notification system. I have users, and each user has a notification setting.
Structure 1:
Users
-id (pk)
-username
-password
Notification_settings
-id
-user_id (fk) references Users
-receive_email (boolean)
Notifications
-id (pk)
-user_id (fk)
-message
-is_read
Structure 2:
Users
-id (pk)
-setting_id (fk) references Notifications_settings
-username
-password
Notifications_settings
-id (pk)
-receive_email
Notifications
-id (pk)
-user_id (fk)
-message
-is_read
Which of these structures should I choose, or is there a better database structure for a notification system?

And now for a Joe Celko quote:
A strong entity is one that exists on its own merit. A weak entity is one that exists because of a strong entity. The classic example is that of a sales order; the order header is strong and the order details are weak. If the order is dropped, then all the order details should disappear.
So, then, can a User exist on its own without a Notification? Then the Notifications table should have the foreign key to Users. Is the converse true? Then make it the other way. Are neither of these true? Then perhaps your model is incorrect, and there should be a junction table between them (even with unique constraints on it), or maybe they truly do belong in the same table. I don't particularly like this last option of putting them in the same table because you've already naturally come up with two nouns to describe distinct entities.
Can other entities other than Users "have a" Notification? Maybe this kind of thinking could help you model this domain.
Update - Some additional ideas:
If you were to put all of these columns into one table, which of them, if any, now looks like it's going to contain redundant data? Let's imagine there are only a few different messages now. Perhaps you don't need a Notifications table but a Messages table, and a junction table between that and Users which could store Messages sent to Users over time, including if they've been read or not. The receive_email could be a better attribute of Users at this point, though maybe just having a Message mapped to a User is enough to say that this User should receive email. These are just some of the things that I might be thinking and would hope to lead to a better understanding of the app.
Also, beware that the bit/boolean datatype is not ANSI SQL, can often be derived from other data, or can even turn into an int down the road mapping to multiple statuses.
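To make the Messages-plus-junction idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (messages, user_messages, sent_at, read_at) are my own assumptions, not something from the question. It stores read_at as a nullable timestamp rather than a boolean, so "is it read?" is derived from other data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE users (
    id       INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE
);
CREATE TABLE messages (
    id   INTEGER PRIMARY KEY,
    body TEXT NOT NULL UNIQUE          -- each distinct text is stored once
);
-- Junction table (hypothetical name): one row per message sent to a user.
CREATE TABLE user_messages (
    user_id    INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    message_id INTEGER NOT NULL REFERENCES messages(id),
    sent_at    TEXT NOT NULL,
    read_at    TEXT,                   -- NULL means "not read yet"
    PRIMARY KEY (user_id, message_id, sent_at)
);
""")

conn.execute("INSERT INTO users (id, username) VALUES (1, 'alice'), (2, 'bob')")
conn.execute("INSERT INTO messages (id, body) VALUES (1, 'Your password was changed')")
conn.executemany(
    "INSERT INTO user_messages (user_id, message_id, sent_at) VALUES (?, ?, ?)",
    [(1, 1, "2024-01-01 10:00:00"), (2, 1, "2024-01-01 10:00:00")],
)
# alice reads her copy; bob's copy stays unread.
conn.execute(
    "UPDATE user_messages SET read_at = '2024-01-01 11:00:00' WHERE user_id = 1"
)

unread = conn.execute("""
    SELECT u.username, m.body
    FROM user_messages um
    JOIN users u    ON u.id = um.user_id
    JOIN messages m ON m.id = um.message_id
    WHERE um.read_at IS NULL
""").fetchall()
message_count = conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0]
```

The message text exists once however many users receive it, and the per-user state (when it was sent, whether it was read) lives on the junction row.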

It seems like it should be users < notifications (one to many), but maybe you have a specific reason that it should be 1-1. In that case, I would make the parent table (the one without the FK column) the table with more one-to-many relationships. So, naturally, it would make sense to store the user ID in the notification table.
It would seem that a notification would always have a user, but NOT vice-versa. Therefore, you should store the foreign key to a UserID in your notification table -- and not the other way around.
edit-- as others have suggested, if you truly do want a 1-1 relationship, you could just add notification fields to the User table. This seems to violate normalization rules at a glance, but if it reeeaaaally is a 1-1 relationship, then by all means, have at it.
Edit 2- Since you explicitly stated that a notification does not exist without a user, I will definitively say that you should store the foreign key to a User in the Notifications table, no exceptions (except if you want to store notification information inside the user's table :)
Edit #2937:
You should store the user's notification preferences in the User table - there's no need to split that into a different table unless you have some obfuscated design and have 256 columns for your user already and that is the limit.
You should store the notifications in a separate table, with a One-Many Relationship from Users to Notifications. That is my final answer, Regis :)
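A rough sketch of that layout, using Python's built-in sqlite3 module: the preference sits on the users row, and notifications carry the foreign key. The ON DELETE CASCADE clause and the CHECK constraints are my additions, assuming a notification should disappear along with its user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE users (
    id            INTEGER PRIMARY KEY,
    username      TEXT NOT NULL UNIQUE,
    password      TEXT NOT NULL,
    -- the notification preference lives directly on the user row
    receive_email INTEGER NOT NULL DEFAULT 1 CHECK (receive_email IN (0, 1))
);
CREATE TABLE notifications (
    id      INTEGER PRIMARY KEY,
    -- a notification cannot exist without its user
    user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    message TEXT NOT NULL,
    is_read INTEGER NOT NULL DEFAULT 0 CHECK (is_read IN (0, 1))
);
""")

conn.execute("INSERT INTO users (id, username, password) VALUES (1, 'alice', 'hashed-value')")
conn.executemany(
    "INSERT INTO notifications (user_id, message) VALUES (?, ?)",
    [(1, "Welcome"), (1, "New follower")],
)
count_before = conn.execute("SELECT COUNT(*) FROM notifications").fetchone()[0]

# Deleting the user removes the dependent notifications as well.
conn.execute("DELETE FROM users WHERE id = 1")
count_after = conn.execute("SELECT COUNT(*) FROM notifications").fetchone()[0]

# A notification pointing at a non-existent user is rejected.
try:
    conn.execute("INSERT INTO notifications (user_id, message) VALUES (99, 'orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```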

If your site does not get huge traffic, put it all in the same table (user and settings). This is the relevant answer for the OP.
Usually, you separate a 1:1 relationship into different tables only when several conditions hold together:
each group of fields is relevant to a different module in your app
each group of fields is accessed at a different rate (username/password on each login, billing settings once or twice a week, for example)
there is huge traffic to your site, and you need to squeeze every ounce of performance out of your system
As you can see from the above, most sites do not need to separate it.

Related

When to use one to many vs many to many in right situation?

I'm quite confused about when to use one-to-many versus many-to-many, for example with user roles. In that situation many-to-many seems to have the advantage of reducing data size, because each row just points to an integer; maybe it saves 1-10 bytes per row. For example, the role "senior developer" with id 7 takes 2 bytes as a smallint instead of 16 bytes as a char. On the other hand, it bloats the schema with an extra table. If many-to-many has this advantage, why should one-to-many exist at all? Isn't many-to-many always better?
Users table
id
username
password
Users_Roles table
user_id
role
Versus
Users table
Users_Roles table
user_id
role_id
Roles table
id
role
You're prematurely optimizing. A few integers here and there is unlikely to impact your data size nor your performance. If it does, the schema can be changed later, but usually there is much bigger bloat to be concerned with.
One-to-many vs many-to-many is not an optimizing issue. It's about the relationship between the tables.
If one and only one user can have a role, use one-to-many.
If many users can have the same role, use many-to-many.
For example, if you have an admin role and there can ever only be one admin user, use one-to-many. If there can be many admins, use many-to-many. You have to decide what the relationship is between users and roles.
Note: Use bigints for ids. 4 billion might seem like a lot, but it comes up fast and one of the worst things that can happen is to run out of IDs.
This is a data modeling question, and its answer comes out of and is dictated by the analysis of the relationships of the entities involved. You have identified 2 entities you want to store data about, users and roles. Now describe their relationship in spoken language terms, looking at the relationship from both directions.
Can a user have more than one role? Can a role be held by more than one user? If the answer to both is yes, then it's a many-to-many relationship. Take the primary keys of both entities and bring them together as the composite primary key of an associative table. It may not have any other attributes unless there is data about the user/role relationship itself that needs to be captured.
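As an illustration, the associative table for users and roles could look like the sketch below (Python's built-in sqlite3 module, sample data invented by me). The composite primary key is what stops the same role from being assigned to the same user twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE users (
    id       INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE
);
CREATE TABLE roles (
    id   INTEGER PRIMARY KEY,
    role TEXT NOT NULL UNIQUE
);
-- Associative table: the two foreign keys together form the primary key.
CREATE TABLE users_roles (
    user_id INTEGER NOT NULL REFERENCES users(id),
    role_id INTEGER NOT NULL REFERENCES roles(id),
    PRIMARY KEY (user_id, role_id)
);
""")

conn.execute("INSERT INTO users (id, username) VALUES (1, 'alice'), (2, 'bob')")
conn.execute("INSERT INTO roles (id, role) VALUES (1, 'admin'), (2, 'senior developer')")
conn.executemany(
    "INSERT INTO users_roles (user_id, role_id) VALUES (?, ?)",
    [(1, 1), (1, 2), (2, 2)],  # alice holds two roles; one role is shared by both users
)

assignments = conn.execute("""
    SELECT u.username, r.role
    FROM users_roles ur
    JOIN users u ON u.id = ur.user_id
    JOIN roles r ON r.id = ur.role_id
    ORDER BY u.username, r.role
""").fetchall()

# Assigning the same role to the same user twice violates the composite key.
try:
    conn.execute("INSERT INTO users_roles (user_id, role_id) VALUES (1, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```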
However, what if you are modeling entities of invoices and line items? Can an invoice have more than one line item? Yes. Can an instance of a line item on an invoice belong to more than one invoice? No (note I'm modeling a line item, not a product or part number as a line item could include special pricing for this invoice, color, logo, etc). So this is clearly a one to many relationship in the direction of one invoice can have many line items.
For more information, do some searching on data modeling, it will be a huge help in your database design efforts and you will end up with a better design for more efficient queries by designing the database correctly.
Looks like Schwern and I were typing at the same time :-)

Table with multiple foreign keys -- only one not null

I'm trying to design a system where an administrator will have to approve changes to the data and other various administrative tasks -- add a user, add an admin etc.
My idea is to have a notification table that contains these notifications, but the problem is that a notification can be any of the previously mentioned types, i.e. its data is stored in one of many tables. Here is a picture to describe my current plan -- note I'm sure that it's not a proper ER diagram.
Also, the data goes into a pending table, that reflects the table it will eventually wind up in, provided the data is approved -- it's a staging ground of sorts. So, a pending_user is a user that is not in the user table. And as you can see the user table, amongst others, is not shown here, but one can use their imagination.
I'm concerned that the multiple null values in the pending table will have adverse effects that I'm not totally aware of, such as increased space usage and possibly increase query time. Also, I'm not sure how I'll implement the retrieval of these notifications. My naive approach is to select the first X notifications, analyze the rows to find the non-null column, retrieve the appropriate data and then load all the data in a response.
Is there a more straightforward pattern for this type of problem?
Thanks in advance for any help.
I think, the traditional way is to provide various levels of access/read/write rights to users. These access rights define what actions a user can and can't perform. In this traditional approach if a user has access to a certain function, he can do it without further approval.
Also, traditionally there are some kind of audit logs that contain a trace of all important changes to the data. With such logs it would be possible to know who made a change (and when).
If you need to build a two-stage system, where a change has to go through an approval, I'd add a flag column to each important table that would indicate that values in the given row are not final and have to be approved. The table would store all historical changes to the data and with the help of this flag the system would know which variant is the latest approved version and which variant is pending and waiting for approval.
I would not try to make a single universal table that would hold data related to changes in many different tables. Each table is different and approval process for each table is likely to be different. I doubt that you'll have more than a dozen entities that are important enough to go through this approval process.
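Here is a minimal sketch of the flag-column idea for a single table, using Python's built-in sqlite3 module. The table name user_profiles, its columns, and the helper current_email are hypothetical; I am assuming each change is stored as a new row and that a higher version_id means a newer change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Every change is a new row; is_approved marks whether it is final.
CREATE TABLE user_profiles (
    version_id  INTEGER PRIMARY KEY,
    user_id     INTEGER NOT NULL,
    email       TEXT NOT NULL,
    is_approved INTEGER NOT NULL DEFAULT 0 CHECK (is_approved IN (0, 1))
);
""")

def current_email(conn, user_id):
    """Return the email from the latest approved version, or None."""
    row = conn.execute(
        """SELECT email FROM user_profiles
           WHERE user_id = ? AND is_approved = 1
           ORDER BY version_id DESC LIMIT 1""",
        (user_id,),
    ).fetchone()
    return row[0] if row else None

# An approved row, followed by a pending change to the same user.
conn.execute(
    "INSERT INTO user_profiles (user_id, email, is_approved) VALUES (1, 'old@example.com', 1)"
)
conn.execute("INSERT INTO user_profiles (user_id, email) VALUES (1, 'new@example.com')")

email_before = current_email(conn, 1)
pending = conn.execute(
    "SELECT version_id FROM user_profiles WHERE is_approved = 0"
).fetchall()

# The administrator approves the pending row; it becomes the current version.
conn.execute("UPDATE user_profiles SET is_approved = 1 WHERE version_id = ?", pending[0])
email_after = current_email(conn, 1)
```

The administrator's queue is then just a per-table query for unapproved rows, with no nullable foreign keys to inspect.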

Database Design - Linking two users

I need some help with some database design. I am a FE developer by trade and have only dealt with very basic DBs. I am just starting to branch out into more "advanced" web apps and would like some pointers in the right direction for the schema.
What I am looking for is an account system that can basically link two accounts. I will give you the scenario I had imagined off the top of my head.
A user signs up in the regular way, just providing name, email, and password for the simplicity of this question. After they have signed up, the user can then link their account to another user by entering the other's email and having it accepted by the other user.
Once this link has been created, the two users can CRUD tasks together.
The bit I am struggling with is how to create the link between the two users. I obviously have my users table.
USERS:
id
name
email
password
Now, I believe I need to create another table that holds the two linked accounts, that has its own unique ID that we can use to CRUD tasks. Something like:
LINKED_USERS:
id
user1id
user2id
verified
TASKS
id
lu_id (FK, Linked_Users id)
// Any other fields for the two combined here.
Is this correct? If so, how would I set up the relationships between the users table and the linked_users table? This is the bit that is confusing me because I need the relationship to reference two user IDs. Say I wanted to display the names for user1id and user2id, how would the relationship work? I just really need a bit of help wrapping my head around this.
I hope this makes sense, if you need any more information I will just edit the question.
Thanks for any help in advance!
Your question is not entirely clear as to the requirements. My design assumes the following about requirements:
People are linked together in pairs
Each pair owns zero, one, or more task records.
Each person can be assigned to zero, one, or more pairs. If not currently, then perhaps over time (past pairs, current pairs, future pairs).
I think your confusion revolves around the pairing. Instead think of it as teams. The fact that a team can have at most two people is beside the point; 2, 10, 100 does not matter because any number is handled the same way. That way is a Team table that has members assigned. Each person can belong to one or more teams, and each team can have one or more members. That means we have a Many-To-Many relationship between Person and Team. A many-to-many is a problem in relational design that is always solved by adding a third intermediate or "bridge" table. In this case, that bridge table is membership_.
Each team owns zero, one, or more tasks. Each task is owned by one and only one team. This is a simple One-To-Many relationship between Team and Task.
If these assumptions and constraints are correct, then you would have the following table design in a relational database such as Postgres.
I added a start_ and stop_ pair of fields on membership_ to show the idea that people may have past, present, or future assignments to teams.
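For readers who want something to run, the following is a rough approximation of such a design. It uses SQLite through Python's built-in sqlite3 module as a stand-in for Postgres. Only membership_, start_ and stop_ are named in the text above; the other table and column names (person_, team_, task_, id_, title_ and so on) and the sample data are my assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE person_ (
    id_    INTEGER PRIMARY KEY,
    name_  TEXT NOT NULL,
    email_ TEXT NOT NULL UNIQUE
);
CREATE TABLE team_ (
    id_ INTEGER PRIMARY KEY
);
-- Bridge table resolving the many-to-many between person_ and team_.
CREATE TABLE membership_ (
    person_id_ INTEGER NOT NULL REFERENCES person_(id_),
    team_id_   INTEGER NOT NULL REFERENCES team_(id_),
    start_     TEXT NOT NULL,
    stop_      TEXT,                     -- NULL while the membership is open
    PRIMARY KEY (person_id_, team_id_, start_)
);
-- Each task is owned by exactly one team.
CREATE TABLE task_ (
    id_      INTEGER PRIMARY KEY,
    team_id_ INTEGER NOT NULL REFERENCES team_(id_),
    title_   TEXT NOT NULL
);
""")

conn.execute(
    "INSERT INTO person_ (id_, name_, email_) VALUES "
    "(1, 'Ann', 'ann@example.com'), (2, 'Ben', 'ben@example.com')"
)
conn.execute("INSERT INTO team_ (id_) VALUES (1)")
conn.executemany(
    "INSERT INTO membership_ (person_id_, team_id_, start_) VALUES (?, ?, ?)",
    [(1, 1, "2024-01-01"), (2, 1, "2024-01-01")],
)
conn.execute("INSERT INTO task_ (id_, team_id_, title_) VALUES (1, 1, 'Buy groceries')")

# Names of everyone on the team that owns task 1.
owners = [row[0] for row in conn.execute("""
    SELECT p.name_
    FROM task_ t
    JOIN membership_ m ON m.team_id_ = t.team_id_
    JOIN person_ p     ON p.id_ = m.person_id_
    WHERE t.id_ = 1
    ORDER BY p.name_
""")]
```

Displaying both linked users' names becomes a join through membership_, with no user1id/user2id columns involved.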

archiving strategies and limitations of data in a table

Environment: Jboss, Mysql, JPA, Hibernate
Our web application will be catering to a large number of users (~ 1,000,000), and there are a lot of child tables where user-specific data is stored (e.g. personal, health, forum contributions ...).
What would be the best practice to archive users and user-specific information?
[a] Would it be wise to move the archived users and user-specific information to their respective tables within the same database (e.g. user_archive, user_forum_comments_archive ...) OR
[b] Would you just mark the database entries with a flag in the original table(s) and query only non-archived entries?
We have a unique constraint on User.loginid. How do you handle this requirement if the users are archived via 1-[a] (i.e. if a user with loginid 'samuel' gets moved into the archive table and a new user gets added with the same name in the original table, how would you prevent this)? What would be the best strategy to address the unique key constraints?
We have a requirement to selectively archive records and bring them back if necessary. Would you rely on database tools, or would you handle this via the persistence APIs exposed by the JPA entity model?
Personally, I'd go for the archive flag in the original tables (your option [b]).
Having things split across two table sets (current and archived) would make things a bit hard to manage in terms of common RDBMS concepts (example: a forum comment's author would be a foreign key pointing to the users table... but you can't have a field behave as a foreign key to two different tables).
You could go for a compromise (the users table uses the flag, while all the other tables like profile get archived to a twin table as in your option [a]) but this would make things unnecessarily complicated for your code (in some cases you have to look at the non-archived rows, in some at the archived only, and in other cases at the union of both).
The flag would easily solve requirements #2 and #3, too. Uniqueness of user name is easy to enforce if everything is in the same table, and resurrecting archived users is just a matter of flipping a bit (Archived=Y/N) on the main user table.
10% is not much, I doubt that the difference in terms of performance would really justify the extra complexity (and risk of bugs).
I would put an archived flag on the table and then create a view to use when you don't want to see archived records. That way, I suspect, people will be more consistent in applying the archive flag.
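A small sketch of the flag-plus-view approach, written with Python's built-in sqlite3 module rather than MySQL/JPA. The view name active_users and the column name archived are assumptions of mine. It also shows that the unique constraint on loginid keeps holding while a user is archived.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id       INTEGER PRIMARY KEY,
    loginid  TEXT NOT NULL UNIQUE,
    archived INTEGER NOT NULL DEFAULT 0 CHECK (archived IN (0, 1))
);
-- Day-to-day queries go through the view and never see archived rows.
CREATE VIEW active_users AS
    SELECT id, loginid FROM users WHERE archived = 0;
""")

conn.execute("INSERT INTO users (loginid) VALUES ('samuel'), ('maria')")

# Archive samuel: he drops out of the view but stays in the table.
conn.execute("UPDATE users SET archived = 1 WHERE loginid = 'samuel'")
active_after_archive = [
    r[0] for r in conn.execute("SELECT loginid FROM active_users ORDER BY loginid")
]

# The unique constraint still covers archived rows, so the name cannot be reused.
try:
    conn.execute("INSERT INTO users (loginid) VALUES ('samuel')")
    reuse_rejected = False
except sqlite3.IntegrityError:
    reuse_rejected = True

# Bringing the user back is a single update.
conn.execute("UPDATE users SET archived = 0 WHERE loginid = 'samuel'")
active_after_restore = [
    r[0] for r in conn.execute("SELECT loginid FROM active_users ORDER BY loginid")
]
```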

Database for microblogging startup

I am building a microblogging web service (for school, so don't blast me for the lack of a new idea), and I worry that the DB could often be overloaded. A user can follow other users or even tags, so I suppose the SELECT will be heavy: fetch the 20 latest messages matching all followed tags and users.
My idea is to create another table that stores only statusID and userID (the user who should receive the message). The danger is that if a tag or user has many followers, there will be a lot of records with that status ID. So, is it a good idea? Or would an M2M relation (one status -> many receivers) be better?
I think most databases can easily handle large record sets. The responsibility for making it perform lies in your design and in properly setting up the indexes. If you create the right indexes, the SELECT queries should perform really well.
I'd go with a users table, a messages table, and a table holding the m2m relationship between users.
You can then do one select to find all of the users a user is following and then a second select in to get all of the messages of interest (sorting and limiting the results as appropriate). Extending this to tagging should be pretty simple.
This design should be fine for large numbers of users and messages as long as you index the right columns. If you got massive, then you could also move the users and messages tables to different servers or add read-only replicas. I wouldn't even worry about that for the moment - you'd need to be huge.
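As a rough illustration of that design, here is a sketch using Python's built-in sqlite3 module. The follows table, the index name, and the sample data are assumptions of mine, and the two selects described above are collapsed into a single join, which is a variation rather than the only way to do it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE users (
    id       INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE
);
-- Who follows whom (users to users, many-to-many).
CREATE TABLE follows (
    follower_id INTEGER NOT NULL REFERENCES users(id),
    followee_id INTEGER NOT NULL REFERENCES users(id),
    PRIMARY KEY (follower_id, followee_id)
);
CREATE TABLE messages (
    id         INTEGER PRIMARY KEY,
    author_id  INTEGER NOT NULL REFERENCES users(id),
    body       TEXT NOT NULL,
    created_at TEXT NOT NULL
);
-- Supports "latest messages by these authors" lookups.
CREATE INDEX idx_messages_author_created
    ON messages (author_id, created_at DESC);
""")

conn.execute("INSERT INTO users (id, username) VALUES (1, 'alice'), (2, 'bob'), (3, 'carol')")
conn.execute("INSERT INTO follows (follower_id, followee_id) VALUES (1, 2)")  # alice follows bob
conn.executemany(
    "INSERT INTO messages (author_id, body, created_at) VALUES (?, ?, ?)",
    [
        (2, "hello", "2024-01-01 09:00:00"),
        (3, "not followed", "2024-01-01 10:00:00"),
        (2, "second", "2024-01-01 11:00:00"),
    ],
)

def latest_feed(conn, user_id, limit=20):
    """Newest messages written by the people user_id follows."""
    rows = conn.execute(
        """SELECT m.body
           FROM messages m
           JOIN follows f ON f.followee_id = m.author_id
           WHERE f.follower_id = ?
           ORDER BY m.created_at DESC
           LIMIT ?""",
        (user_id, limit),
    )
    return [r[0] for r in rows]

feed = latest_feed(conn, 1)
```

Following tags could presumably be handled the same way with a second pair of tables, with each message stored only once.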
When implementing Collabinate (http://www.collabinate.com), a service-based engine for microblogging and shared activity streams, I used a graph database. The fact that people create posts and follow other people lends itself to a graph structure. With the right relationships and algorithms, this can be a very efficient and performant solution.