Designing a database with flexible user profile - sql

I am working on a design where I can have flexible attributes for users and I am confused how to continue the design of the schema.
I made a table where I kept system needed information:
Table name: users
id
username
password
Now I wish to create a profile table with a one-to-one relation to users, holding all the other attributes such as email, first name, last name, etc. My question is: is there a way to add a third table in which profiles will be flexible? In other words, if my client needs to create a new attribute, he/she won't need any customization to the code.

You're looking for a normalized table. That is a table that has user_id, key, value columns which produce a 1:N relationship between User & this new table. Look into http://en.wikipedia.org/wiki/Database_normalization for a little more information. Performance isn't amazing with normalized tables and it can take some interesting planning for optimization of your code but it's a very standard practice.

Keep the fixed parts of the profile in a standard table to make it easy to query, add constraints, etc.
For the configurable parts it sounds like you are looking for an entity-attribute-value model. The extra configurability comes at a high cost though: everything will have to be stored as strings and you will have to do any data validation in the application, not in the database.
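A minimal sketch of that split, using Python's built-in sqlite3 module so it runs anywhere. The users columns come from the question; the profiles and profile_attributes tables, their column names, and the sample rows are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id       INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE,
    password TEXT NOT NULL
);
-- Fixed profile fields: one row per user (1:1 with users).
CREATE TABLE profiles (
    user_id    INTEGER PRIMARY KEY REFERENCES users(id),
    email      TEXT NOT NULL,
    first_name TEXT,
    last_name  TEXT
);
-- Flexible fields: one row per (user, attribute) pair (1:N with users).
-- Every value is stored as text, which is the cost mentioned above.
CREATE TABLE profile_attributes (
    user_id    INTEGER NOT NULL REFERENCES users(id),
    attr_name  TEXT NOT NULL,
    attr_value TEXT,
    PRIMARY KEY (user_id, attr_name)
);
""")

conn.execute("INSERT INTO users VALUES (1, 'alice', 'not-a-real-hash')")
conn.execute(
    "INSERT INTO profiles VALUES (1, 'alice@example.com', 'Alice', 'Smith')"
)
# Adding a new kind of attribute is just another row, no schema change.
conn.executemany(
    "INSERT INTO profile_attributes VALUES (?, ?, ?)",
    [(1, "favorite_color", "green"), (1, "shoe_size", "38")],
)

def load_custom_attributes(conn, user_id):
    """Return all flexible attributes of one user as a dict of strings."""
    rows = conn.execute(
        "SELECT attr_name, attr_value FROM profile_attributes WHERE user_id = ?",
        (user_id,),
    )
    return dict(rows.fetchall())

attrs = load_custom_attributes(conn, 1)
```

Note that shoe_size comes back as the string "38": any type checking has to happen in the application.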

How will these attributes be used? Are they simply a bag of data or would the user expect that the system would do something with these values? Are there ever going to be any reports against them?
If the system must do something with these attributes then you should make them columns since code will have to be written anyway that does something special with the values. However, if the customers just want them to store data then an EAV might be the ticket.
If you are going to implement an EAV, I would suggest adding a DataType column to your attributes table. This enables you to do some rudimentary validation on the entered data and dynamically change the control used for entry.
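Here is one way the DataType idea could look, again as a sketch with sqlite3. The table names, the three type labels, and the VALIDATORS mapping are all hypothetical; the point is only that the stored type lets the application reject bad input before it reaches the value table.

```python
import sqlite3
from datetime import date

# Hypothetical validators, keyed by the label stored in attributes.data_type.
# Each one raises ValueError when the raw text does not parse.
VALIDATORS = {
    "text": str,
    "integer": int,
    "date": date.fromisoformat,
}

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE attributes (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE,
    data_type TEXT NOT NULL
);
CREATE TABLE user_attribute_values (
    user_id      INTEGER NOT NULL,
    attribute_id INTEGER NOT NULL REFERENCES attributes(id),
    value        TEXT NOT NULL,
    PRIMARY KEY (user_id, attribute_id)
);
INSERT INTO attributes (id, name, data_type) VALUES
    (1, 'nickname', 'text'),
    (2, 'age', 'integer'),
    (3, 'birthday', 'date');
""")

def set_attribute(conn, user_id, attribute_name, raw_value):
    """Validate raw_value against the attribute's data_type, then store it."""
    row = conn.execute(
        "SELECT id, data_type FROM attributes WHERE name = ?", (attribute_name,)
    ).fetchone()
    if row is None:
        raise KeyError(f"unknown attribute: {attribute_name}")
    attribute_id, data_type = row
    VALIDATORS[data_type](raw_value)  # raises ValueError on bad input
    conn.execute(
        "INSERT OR REPLACE INTO user_attribute_values VALUES (?, ?, ?)",
        (user_id, attribute_id, raw_value),
    )

set_attribute(conn, 1, "age", "42")
try:
    set_attribute(conn, 1, "birthday", "not-a-date")
    rejected = False
except ValueError:
    rejected = True
```

The same data_type value could also drive which input control the UI shows, which is the second use mentioned above.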
If you are going to use an EAV, then the one rule you must follow is to never write any code where you specify a particular attribute. If these custom attributes are nothing more than a wad of data, then an EAV for this one portion of your system will work. You could even consider creating an XML column to store these attributes. SQL Server actually has an XML data type but all databases have some form of large text data type that will also work. On reports, the data would only ever be spit out. You would never place specific values in specific places on reports nor would you ever do any kind of numerical operation against the data.
The price of an EAV is vigilance and discipline. You and the other developers, and especially the report writers, must have the discipline to never filter on a specific attribute, no matter how much pressure you get from management. The moment a client wants to filter or do operations on a specific attribute, it must become a first-class attribute as a column. If you feel that this kind of discipline cannot be maintained, then I would simply create columns for each attribute, which would mean an adjustment to code but will create less of a mess down the road.

Related

Database schema sample for storing social media post

I am building a small social networking website, I have a doubt regarding database schema:
How should I store the posts(text) by a user?
I'll have a separate POST table and will link USERS table with it, through USERS_POST table.
But every time to display all the posts on user's profile, system will have to search the entire USERS_POST table for USER id and then display?
What else should I do?
Similarly how should I store the multiple places the user has worked or studied?
I understand it's broad but I am new to Database. :)
First, don't worry too much: start by making it work and see where you get performance problems. The database might be a lot quicker than you expect. Also, it is often much easier to see what the best solution is when you have an actual query that is too slow.
Regarding your design, if a post is never linked to more than one user, then forget the USERS_POST table and put the user id in the POST table. In either case an index on the user id would help (as in not having to read the whole table) when the database grows large.
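As a rough sketch of that layout (run here against SQLite through Python's sqlite3 module; the table and column names are assumptions, kept lowercase), the user id lives directly in the post table and gets its own index. EXPLAIN QUERY PLAN is SQLite's way of showing whether the index is used; other databases have their own EXPLAIN variants.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- No users_post link table: each post belongs to exactly one user.
CREATE TABLE post (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    body    TEXT NOT NULL
);
CREATE INDEX idx_post_user_id ON post(user_id);
""")
conn.execute("INSERT INTO users VALUES (1, 'alice'), (2, 'bob')")
conn.executemany(
    "INSERT INTO post (user_id, body) VALUES (?, ?)",
    [(1, "first post"), (2, "hello"), (1, "second post")],
)

def posts_for_user(conn, user_id):
    """Return the bodies of one user's posts, oldest first."""
    rows = conn.execute(
        "SELECT body FROM post WHERE user_id = ? ORDER BY id", (user_id,)
    )
    return [body for (body,) in rows]

# Ask SQLite how it would run the lookup; the last column describes the step.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT body FROM post WHERE user_id = ?", (1,)
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)
```

If the index is picked up, plan_text mentions idx_post_user_id, meaning the lookup does not scan the whole post table.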
Multiple places for a single user you would store in an additional table. For instance called USERS_PLACES, give it a column user_id to link it to USERS plus other columns for the data you wish to store per place.
BTW In postgresql you might want to keep all object (tables, columns, ...) names lowercase because unless you take care to always quote them like "USERS" postgresql will make them lowercase which can be confusing.

Storing relational data in MongoDB (NoSQL)

I've been trying to get my head around NoSQL, and I do see the benefits to embedding data in documents.
What I can't understand, and hope someone can clear up, is how to store data if it must be relational.
For example.
I have many users. They are all buying a product. So every time they buy a product, we add it under the user's document in Mongo, so it's embedded and it's all great.
The problem I have is when something in reference to that product changes.
Let's say user A buys a car called "Porsche". Then, we add a reference to that under the user's profile. However, in a strange turn of events Porsche gets purchased by Ferrari.
What do you do now, update each and every record and change the name from Porsche to Ferrari?
Typically in SQL, we would create 3 tables. One for users, one for Cars (description, model etc) & one for mapping users to purchases.
Do you do the same thing for Mongo? It seems like if you go down this route, you are trying to make Mongo do things the SQL way, which is not what it's intended for.
I can understand how certain data is great for embedding (addresses, contact details, comments, etc) but what happens when you need to reference data that can and needs to change at a regular basis?
I hope this question is clear
DBRefs/Manual References were made specifically to solve this issue. Instead of manually adding the data to each document and then needing to update when something changes, you can store a reference to another collection. Here is the mongoDB documentation for details.
References in Mongo
Then all you would need to do is update the reference collection and the change would be reflected in all downstream locations.
When I used the mongoose library for Node.js, it actually created 3 collections, similar to the tables you might use in SQL. You can use object IDs as foreign keys and enrich them either on the client side or on the backend. There is still no joining, but you can do an 'in' query for the IDs and then enrich the objects that way. Mongoose can do this automatically by 'populating'.
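To make the reference idea concrete without depending on a MongoDB driver, here is a small Python sketch that models the two collections as plain dicts. Nothing here is MongoDB or Mongoose API; the populate_purchases helper and the sample documents are made up, and only imitate the "look up the ids, then enrich" step described above.

```python
# Two "collections" modeled as plain dicts keyed by _id (no database involved).
cars = {
    "car1": {"_id": "car1", "make": "Porsche", "model": "911"},
    "car2": {"_id": "car2", "make": "Fiat", "model": "500"},
}
users = {
    # Users store only the ids of the cars they bought, not copies of the data.
    "userA": {"_id": "userA", "name": "A", "purchases": ["car1"]},
    "userB": {"_id": "userB", "name": "B", "purchases": ["car1", "car2"]},
}

def populate_purchases(user, car_collection):
    """Return a copy of the user with purchase ids replaced by car documents."""
    wanted = set(user["purchases"])  # the ids an 'in' style query would ask for
    found = {cid: doc for cid, doc in car_collection.items() if cid in wanted}
    enriched = dict(user)
    enriched["purchases"] = [found[cid] for cid in user["purchases"]]
    return enriched

# One update in the referenced collection...
cars["car1"]["make"] = "Ferrari"

# ...is seen by every user that references it, with no per-user update.
makes_a = [c["make"] for c in populate_purchases(users["userA"], cars)["purchases"]]
makes_b = [c["make"] for c in populate_purchases(users["userB"], cars)["purchases"]]
```

The trade-off is the extra lookup on read, which is why data that rarely changes is still a good fit for embedding.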

Confused whether using the correct database design concept will make it difficult to design PHP login system

I would very much appreciate any information regarding this. I have got a database that follows the correct principles; I say this because I approached it using ERD and normalisation to model the data.
I am using this database for a web program that I am developing which has got a login system. I am aware that the login system can be implemented using one table, e.g. a user table with an extra field to define the authorisation level of the user within the system, which would be much easier to develop. But on the other hand I am confused, as a compsci student, as to whether doing this will degrade my marks since it isn't the correct principle.
Just to clarify, the database I've designed has got 3 different types of users, and they have relationships to different entities.
Thank you so much for your time and reading this !!!!
So you have three different types of users, and you want to impress your teacher by not merely using one table.
A good schema would be:
users
for all the things they have in common
common_data , admin_data, and organizer_data
The former for regular login/authentication -
username
hash (password)
access_level (or type)
-- and you might even include:
last_login
Or you know, whatever.
and in the other tables, have the generalized information
(that you would be reading less-often)
email
phone_number
address
etc --
For the organizer_data table, you might have groupID, which of course you could also put in the users table. admin_data would get something like failed_login_attempts, or, as in some of my projects, "last_ip_address" for the admin.
But you get the idea --- separate user-entities, that require separate data-sets -- since this project doesn't seem to be very code-oriented, I'm sure you could get away with making up whatever columns that seem remotely logical
And of course, both tables get an id column - which provides their relationship !
Now, insofar as one table making it easier than two - you should look into JOIN's - which make two tables appear as one when you need them to - otherwise, they can be separate entities--
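A small sketch of what that looks like, using sqlite3 from Python. The users and admin_data columns follow the names mentioned in this answer; the user_id key linking the two tables and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Login data shared by every type of user.
CREATE TABLE users (
    id           INTEGER PRIMARY KEY,
    username     TEXT NOT NULL UNIQUE,
    hash         TEXT NOT NULL,
    access_level TEXT NOT NULL
);
-- Extra data that only admins have; user_id ties it back to users.
CREATE TABLE admin_data (
    user_id               INTEGER PRIMARY KEY REFERENCES users(id),
    failed_login_attempts INTEGER NOT NULL DEFAULT 0,
    last_ip_address       TEXT
);
INSERT INTO users VALUES
    (1, 'root', 'placeholder-hash', 'admin'),
    (2, 'carol', 'placeholder-hash', 'regular');
INSERT INTO admin_data VALUES (1, 3, '203.0.113.7');
""")

# Inner join: only users that have a matching admin_data row.
admins = conn.execute("""
    SELECT u.username, u.access_level, a.failed_login_attempts
    FROM users AS u
    JOIN admin_data AS a ON a.user_id = u.id
""").fetchall()

# Left join: every user, with NULL (None in Python) where no admin row exists.
everyone = conn.execute("""
    SELECT u.username, a.failed_login_attempts
    FROM users AS u
    LEFT JOIN admin_data AS a ON a.user_id = u.id
    ORDER BY u.id
""").fetchall()
```

The login code itself only ever needs the users table; the join is for the places where the type-specific data matters.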

Efficient Database Structure

I'm working on an app where part of it involves people liking and commenting on pictures other people posted. Obviously I want the user to be notified when someone comments/likes their picture, but I also want that user to be able to see the pictures that they posted. This brings up a couple of structuring questions.
I have a table that stores an image with its ID, image, other info such as likes/comments, date posted info, and finally the userID of the user that posted the image:
Here's that table structure:
Image Posts Table: |postID|image|misc. image info|userID|
The userID is used to grab information from the users entry in the user table for notifications. Now when that user looks at a page containing his own posts I have two options:
1.) Query the Image Posts Table for any image containing that user's userID.
2.) Create a table for each user and put a postID of each image they posted :
Said User's Table: |postID|
I would assume that the second option would be more efficient because I don't have to query a table with a large amount of entries. Are there any more efficient ways to do this?
Obviously I should read up on good database design so do any of you have any good recommendations?
Multiple tables of identical structure almost never make sense. Writing queries using your 2nd option would become ugly in short order. Stick with 1 large user's table; databases are designed to handle tables with many rows.
I would recommend against manually storing the userID, as Parse will do its own internal magic if you just set a property called user to the current user. Internally it stores the ID and marks it as an object reference. They may or may not have extra optimizations in there for query performance.
Given that the system is designed around the concept of references, you should keep to just the two tables/classes you mentioned.
When you query the Image Posts table you can just add a where expression using the current user (again it internally gets the ID and searches on that). It is a fully indexed search so should perform well.
The other advantage is that when you query the Image Posts table you can use the include method to include the User object it is linked to, avoiding a 2nd query. This is only available if you store a reference instead of manually extracting and storing the userID.
Have a look at the AnyPic sample app on the tutorial page as it is very similar to what you mention and will demonstrate the ideas.

I need to store HTML emails in a database. Is that a bad idea?

The templates for these HTML emails are all the same, but there are just different variables for say, first name, last name and such.
Would it make sense to store only the minimal data that I need, then load the template and replace the variables every time?
Another option would be to actually create the HTML file and store a reference to it, which probably would be the easiest to do except it might be a pain managing the files, and it adds complexity in regards to migrating, file permissions, et cetera.
Looking for opinions from people who've done this before...
GOAL/PURPOSE/USE:
I have a booking engine. When users make a booking, they are sent a confirmation email, generated from the sessionized booking data.
This email provides a "Cannot view this email? See it here" link which provides a web view of the email, in addition to a plaintext view.
I need to display the same email that was sent out, in addition to the plaintext view.
The template is subject to change, but I think because of that very fact I should have a table of templates and map the data to a template.
That's what I would do, because the template layout may change over the time, but the person information should remain the same. So, it makes sense to just store the person information in the database and leave the template out from the database.
In fact, it would be even better if you use template engine such as Velocity (in Java) to construct your HTML emails... very easy, by the way.
On the one hand, CPU is more expensive than memory, so mostly it is better to save more data to reduce the CPU power used by computation.
But in your case, I would save the minimal data rather than the emails you are trying to save, because it allows you to easily remodel your templates and to reuse the data at multiple places in your application.
You persist redundant data (especially because of the template) which is in no way normalized. I would not suggest doing that. But as mentioned in the comment, it matters what you want to do with that data.
If you only save the data you need you could for example exchange that template easy and use another one.
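A minimal Python sketch of the "store only the data, render on demand" approach, using the standard library's string.Template. The template text, field names, and booking values are invented for illustration; a real setup might use a fuller template engine such as the Velocity one mentioned above.

```python
from string import Template

# Hypothetical template; in practice it would live in a file or a templates table.
CONFIRMATION_HTML = Template(
    "<p>Hello $first_name $last_name,</p>"
    "<p>Your booking $booking_ref is confirmed.</p>"
)

# Only the variable data is persisted for each email that was sent.
stored_booking = {
    "first_name": "Ada",
    "last_name": "Lovelace",
    "booking_ref": "BK-1001",
}

def render_confirmation(template, data):
    """Fill the template; substitute() raises KeyError if a value is missing."""
    return template.substitute(data)

html = render_confirmation(CONFIRMATION_HTML, stored_booking)
```

If the web view has to match the email exactly as it was sent, one option is to also store which template version was used, so an old booking can be re-rendered with the template it originally went out with.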
Yeah, you're right on track. I did a similar thing. All dynamic/runtime variables started with a ## symbol.
So in database you would have one Template table. One table would be for dynamic/runtime variables. One table for Mapping between Template and dynamic/runtime variables.
tblTemplate - TemplateID, TemplateValue
tblRuntimeVariables - RuntimeVariableID, VariableString, VariableSQL
tblMapping - TemplateID, RuntimeVariableID, RuntimeVariableValue
The advantage of using an extra mapping table is that adding new dynamic variables to an existing template requires no change to the existing database schema. Only more rows would be added to tblMapping.
In my case I was also having one extra column for storing SQL Statements in tblRuntimeVariables in case the value for a runtime variable is fetched from database.
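For illustration, the three tables could be wired together roughly like this (sqlite3 from Python). The table and column names are the ones listed above; the key constraints, the sample rows, and the render_template helper are assumptions, and VariableSQL is left NULL since running stored SQL is out of scope for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblTemplate (
    TemplateID    INTEGER PRIMARY KEY,
    TemplateValue TEXT NOT NULL
);
CREATE TABLE tblRuntimeVariables (
    RuntimeVariableID INTEGER PRIMARY KEY,
    VariableString    TEXT NOT NULL UNIQUE,
    VariableSQL       TEXT
);
CREATE TABLE tblMapping (
    TemplateID           INTEGER NOT NULL REFERENCES tblTemplate(TemplateID),
    RuntimeVariableID    INTEGER NOT NULL
        REFERENCES tblRuntimeVariables(RuntimeVariableID),
    RuntimeVariableValue TEXT,
    PRIMARY KEY (TemplateID, RuntimeVariableID)
);
INSERT INTO tblTemplate VALUES (1, '<p>Dear ##FirstName ##LastName</p>');
INSERT INTO tblRuntimeVariables VALUES
    (1, '##FirstName', NULL),
    (2, '##LastName', NULL);
INSERT INTO tblMapping VALUES (1, 1, 'Ada'), (1, 2, 'Lovelace');
""")

def render_template(conn, template_id):
    """Replace every mapped ## variable in the template with its value."""
    (text,) = conn.execute(
        "SELECT TemplateValue FROM tblTemplate WHERE TemplateID = ?",
        (template_id,),
    ).fetchone()
    rows = conn.execute(
        """
        SELECT v.VariableString, m.RuntimeVariableValue
        FROM tblMapping AS m
        JOIN tblRuntimeVariables AS v
          ON v.RuntimeVariableID = m.RuntimeVariableID
        WHERE m.TemplateID = ?
        ORDER BY length(v.VariableString) DESC
        """,
        (template_id,),
    ).fetchall()
    # Longest names first, so ##First can never clobber part of ##FirstName.
    for placeholder, value in rows:
        text = text.replace(placeholder, value or "")
    return text

rendered = render_template(conn, 1)
```

Adding a new variable to the template means one new row in tblRuntimeVariables and one in tblMapping, with no change to the table definitions.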