Database schema sample for storing social media post - sql

I am building a small social networking website and have a question about the database schema:
How should I store the posts (text) made by a user?
I'll have a separate POST table and will link the USERS table to it through a USERS_POST table.
But every time it displays all the posts on a user's profile, will the system have to search the entire USERS_POST table for the USER id?
What else should I do?
Similarly, how should I store the multiple places where the user has worked or studied?
I understand it's broad, but I am new to databases. :)

First, don't worry too much: start by making it work and see where you get performance problems. The database might be a lot quicker than you expect. It is also often much easier to see what the best solution is when you have an actual query that is too slow.
Regarding your design, if a post is never linked to more than one user, then forget the USERS_POST table and put the user id in the POST table. In either case, an index on the user id would help (the database would not have to read the whole table) when the database grows large.
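As a rough sketch of that layout (not the only way to do it), here is the "user id in the posts table plus an index" idea using Python's built-in sqlite3 module. The table and column names (users, posts, user_id, body, idx_posts_user_id) and the sample rows are made up for illustration; the original question only names USERS and POST.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id       INTEGER PRIMARY KEY,
        username TEXT NOT NULL UNIQUE
    );
    -- Each post belongs to exactly one user, so no link table is needed.
    CREATE TABLE posts (
        id         INTEGER PRIMARY KEY,
        user_id    INTEGER NOT NULL REFERENCES users(id),
        body       TEXT NOT NULL,
        created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    -- The index lets the database find one user's posts without a full scan.
    CREATE INDEX idx_posts_user_id ON posts(user_id);
""")
conn.execute("INSERT INTO users (id, username) VALUES (1, 'alice'), (2, 'bob')")
conn.executemany(
    "INSERT INTO posts (user_id, body) VALUES (?, ?)",
    [(1, "first post"), (2, "hello"), (1, "second post")],
)


def posts_for_user(conn, user_id):
    """Return the bodies of one user's posts, oldest first."""
    rows = conn.execute(
        "SELECT body FROM posts WHERE user_id = ? ORDER BY id", (user_id,)
    )
    return [body for (body,) in rows]


alice_posts = posts_for_user(conn, 1)

# Ask SQLite how it would run the lookup; the last column is the plan detail.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT body FROM posts WHERE user_id = ?", (1,)
).fetchall()
plan_text = " ".join(row[3] for row in plan)
print(alice_posts)
print(plan_text)
```

The query plan text should mention idx_posts_user_id, which is a quick way to check that a profile page lookup is not reading the whole table.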
Multiple places for a single user would go in an additional table, for instance one called USERS_PLACES. Give it a user_id column to link it to USERS, plus other columns for the data you wish to store per place.
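A possible shape for that places table, again sketched with sqlite3. Only the user_id column comes from the answer; the kind, name, start_year and end_year columns and the sample data are assumptions about what you might want to store per place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id       INTEGER PRIMARY KEY,
        username TEXT NOT NULL
    );
    -- One row per place; a user can have any number of rows.
    CREATE TABLE users_places (
        id         INTEGER PRIMARY KEY,
        user_id    INTEGER NOT NULL REFERENCES users(id),
        kind       TEXT NOT NULL CHECK (kind IN ('work', 'study')),
        name       TEXT NOT NULL,
        start_year INTEGER,
        end_year   INTEGER          -- NULL means "still there"
    );
    CREATE INDEX idx_users_places_user_id ON users_places(user_id);
""")
conn.execute("INSERT INTO users (id, username) VALUES (1, 'alice')")
conn.executemany(
    "INSERT INTO users_places (user_id, kind, name, start_year, end_year) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (1, "study", "Example University", 2008, 2012),
        (1, "work", "Example Corp", 2012, None),
    ],
)

# All places for one user, in chronological order.
places = conn.execute(
    "SELECT kind, name FROM users_places WHERE user_id = ? ORDER BY start_year",
    (1,),
).fetchall()
print(places)
```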
BTW, in PostgreSQL you might want to keep all object names (tables, columns, ...) lowercase: unless you take care to always quote them, like "USERS", PostgreSQL folds them to lowercase, which can be confusing.

Related

DB Architecture in a Node.js environment

Say we have an app that allows people to like or dislike pictures.
This looks like a data-intensive application, as you would expect a huge amount of (dis)like requests, so say we chose Node.js for it.
As we don't want people to vote more than once, we need a way of relating the picId and userId. This could be done:
with a relational database, by using a table where picId and userId are keys,
with NoSQL, by creating a 'file' for each user and storing there all the picIds she voted on... or the other way around, creating a file for each picture and storing there all the userIds that have voted on the pic.
This part of the DB will be intensively read and written, as for every vote you first need to check if the user has already voted and then write the new vote, plus updating the total vote count of the pic.
Which is the best option (based ONLY on technical reasons)?
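For reference, the relational option from the question (a table where picId and userId together are the key) might look like the sketch below. It uses Python's sqlite3; the names votes, pic_id, user_id, value, cast_vote and score are hypothetical. The composite primary key makes the database itself reject a second vote, so no separate "has this user voted?" read is needed before the write.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE votes (
        pic_id  INTEGER NOT NULL,
        user_id INTEGER NOT NULL,
        value   INTEGER NOT NULL CHECK (value IN (-1, 1)),  -- -1 dislike, 1 like
        PRIMARY KEY (pic_id, user_id)   -- one vote per user per picture
    )
""")


def cast_vote(conn, pic_id, user_id, value):
    """Record a vote; return False if this user already voted on this picture."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO votes (pic_id, user_id, value) VALUES (?, ?, ?)",
                (pic_id, user_id, value),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def score(conn, pic_id):
    """Likes minus dislikes for one picture (0 if nobody voted)."""
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(value), 0) FROM votes WHERE pic_id = ?", (pic_id,)
    ).fetchone()
    return total


results = [
    cast_vote(conn, 10, 1, 1),    # user 1 likes picture 10
    cast_vote(conn, 10, 2, -1),   # user 2 dislikes it
    cast_vote(conn, 10, 1, -1),   # user 1 tries to vote again: rejected
    cast_vote(conn, 10, 3, 1),    # user 3 likes it
]
print(results, score(conn, 10))
```

Here the total is computed on read; whether to keep a stored counter per picture instead is a separate trade-off that this sketch does not cover.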
Say you use MongoDB (a NoSQL document-oriented DB) and you have unique usernames. You can do the following:
Create an ImageModel (Mongoose model) for the images and a UserModel for your users
Store the images in the database (you can use references to find them)
In your ImageModel, keep a like array and a dislike array containing the usernames of the users who liked or disliked the picture
With that, the number of likes/dislikes is the length of the array, and you can easily find out what your users like or dislike.
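To make the document shape concrete, here is a plain-Python stand-in. This is not Mongoose or MongoDB code, just a dataclass that mimics one image document with its two username arrays; ImageDoc, vote and the field names are invented for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ImageDoc:
    """Stand-in for one image document holding two arrays of usernames."""
    image_id: str
    likes: list = field(default_factory=list)
    dislikes: list = field(default_factory=list)

    def vote(self, username, like):
        """Add the username to one array unless it already appears in either."""
        if username in self.likes or username in self.dislikes:
            return False
        (self.likes if like else self.dislikes).append(username)
        return True


pic = ImageDoc("pic-1")
outcomes = [
    pic.vote("alice", True),
    pic.vote("bob", False),
    pic.vote("alice", False),  # alice already voted, so this is ignored
]
# The counts are simply the array lengths.
like_count, dislike_count = len(pic.likes), len(pic.dislikes)
print(outcomes, like_count, dislike_count)
```

In a real database the check and the append would need to happen atomically on the server; this in-memory sketch does not address that.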
Nevertheless, if you want to create a social-network application, document-oriented DBs aren't the best choice, because they don't implement relations and it will be difficult for you to link information and users.
In my view SQL DBs aren't ideal either, because they may not offer enough performance for big apps, so I suggest you take a look at NoSQL graph or graph-document DBs like OrientDB or Neo4j.
I hope this helps, and sorry for my English (feel free to correct me) ;)

Confused whether using the correct database design concept will make it difficult to design PHP login system

I would very much appreciate any information on this. I have a database that follows the correct principles; I say this because I approached it using an ERD and normalisation to model the data.
I am using this database for a web program I am developing, which has a login system. I am aware that a login system can be implemented with a single table, e.g. a user table with an extra field that defines the user's authorisation level within the system, which would be much easier to develop. On the other hand, as a compsci student I am unsure whether doing this will cost me marks, since it isn't the correct principle.
Just to clarify, the database I've designed has 3 different types of users, each with relationships to different entities.
Thank you so much for your time and for reading this!
So you have three different types of users, and you want to impress your teacher by not merely using one table.
A good schema would be:
users
for all the things the user types have in common
common_data, admin_data, and organizer_data
for everything else
The users table is for regular login/authentication:
username
hash (of the password)
access_level (or type)
and you might even include:
last_login
or whatever else you need.
The other tables hold the more general information
(which you would read less often):
email
phone_number
address
etc.
For the organizer table, you might have groupID, which of course you could also put in the users table. The admin table would get something like failed_login_attempts, or, as in some of my projects, last_ip_address for the admin.
But you get the idea: separate user entities that require separate data sets. Since this project doesn't seem to be very code-oriented, I'm sure you could get away with making up whatever columns seem remotely logical.
And of course, both tables get an id column, which provides their relationship!
Now, as for one table being easier than two: you should look into JOINs, which make two tables appear as one when you need them to. Otherwise, they can remain separate entities.
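Here is one way the layout above could be read, sketched with Python's sqlite3. The column names follow the answer where it gives them (username, access_level, last_login, email, failed_login_attempts, last_ip_address); the user_id keys, the CHECK constraint, the three access levels and the sample rows are assumptions. The LEFT JOIN at the end shows how users and admin_data can be read as if they were one table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Login data that every type of user has.
    CREATE TABLE users (
        id            INTEGER PRIMARY KEY,
        username      TEXT NOT NULL UNIQUE,
        password_hash TEXT NOT NULL,
        access_level  TEXT NOT NULL
            CHECK (access_level IN ('regular', 'organizer', 'admin')),
        last_login    TEXT
    );
    -- Less frequently read contact details.
    CREATE TABLE common_data (
        user_id      INTEGER PRIMARY KEY REFERENCES users(id),
        email        TEXT,
        phone_number TEXT,
        address      TEXT
    );
    -- Extra columns that only make sense for admins.
    CREATE TABLE admin_data (
        user_id               INTEGER PRIMARY KEY REFERENCES users(id),
        failed_login_attempts INTEGER NOT NULL DEFAULT 0,
        last_ip_address       TEXT
    );
""")
conn.executemany(
    "INSERT INTO users (id, username, password_hash, access_level) "
    "VALUES (?, ?, ?, ?)",
    [(1, "root", "<hash goes here>", "admin"),
     (2, "sam", "<hash goes here>", "regular")],
)
conn.execute(
    "INSERT INTO common_data (user_id, email) VALUES (1, 'root@example.com')"
)
conn.execute(
    "INSERT INTO admin_data (user_id, last_ip_address) VALUES (1, '192.0.2.10')"
)

# LEFT JOIN keeps non-admin users too; their admin columns come back as NULL.
rows = conn.execute("""
    SELECT u.username, u.access_level, a.failed_login_attempts, a.last_ip_address
    FROM users AS u
    LEFT JOIN admin_data AS a ON a.user_id = u.id
    ORDER BY u.id
""").fetchall()
print(rows)
```

The login check itself only ever touches the users table, so splitting the data this way does not make the login code harder to write.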

Efficient Database Structure

I'm working on an app where part of it involves people liking and commenting on pictures other people posted. Obviously I want the user to be notified when someone comments on or likes their picture, but I also want that user to be able to see the pictures that they posted. This brings up a couple of structuring questions.
I have a table that stores an image with its ID, the image, other info such as likes/comments, date posted info, and finally the userID of the user that posted the image:
Here's that table structure:
Image Posts Table: |postID|image|misc. image info|userID|
The userID is used to grab information from the users entry in the user table for notifications. Now when that user looks at a page containing his own posts I have two options:
1.) Query the Image Posts Table for any image containing that user's userID.
2.) Create a table for each user and put a postID of each image they posted :
Said User's Table: |postID|
I would assume that the second option would be more efficient because I don't have to query a table with a large number of entries. Are there any more efficient ways to do this?
Obviously I should read up on good database design so do any of you have any good recommendations?
Multiple tables of identical structure almost never make sense. Writing queries using your 2nd option would become ugly in short order. Stick with one large table; databases are designed to handle tables with many rows.
I would recommend against manually storing the userID, as Parse will do its own internal magic if you just set a property called user to the current user. Internally it stores the ID and marks it as an object reference. They may or may not have extra optimizations in there for query performance.
Given that the system is designed around the concept of references, you should keep to just the two tables/classes you mentioned.
When you query the Image Posts table you can just add a where expression using the current user (again it internally gets the ID and searches on that). It is a fully indexed search so should perform well.
The other advantage is that when you query the Image Posts table you can use the include method to include the User object it is linked to, avoiding a 2nd query. This is only available if you store a reference instead of manually extracting and storing the userID.
Have a look at the AnyPic sample app on the tutorial page as it is very similar to what you mention and will demonstrate the ideas.

How should I deal with copies of data in a database?

What should I do if a user has a few hundred records in the database and would like to make a draft, where they can take all the current data, make some changes, and save this as a draft, potentially for good, keeping the two copies?
Should I duplicate all the data in the same table and mark it as a draft?
Or should I duplicate only the changes, and then use the "non-draft" data where no changes exist?
The user should be able to make their changes and then still go back to the live data and make changes there, without affecting the draft.
Just simply introduce a version field in the tables that would be affected.
Content management systems (CMS) do this already. You can create a blog post for example, and it has version 1. Then a change is made and that gets version 2 and on and on.
You will obviously end up storing quite a bit more data. A nice benefit though is that you can easily write queries to load a version (or a snapshot) of data.
As a convention you could always make the highest version number the "active" version.
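A minimal sketch of the version-field idea, using Python's sqlite3. The post_versions table and the helper names save_version and load are made up; the "highest version is the active one" rule is the convention suggested above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE post_versions (
        post_id INTEGER NOT NULL,
        version INTEGER NOT NULL,
        body    TEXT NOT NULL,
        PRIMARY KEY (post_id, version)   -- every edit is a new row
    )
""")


def save_version(conn, post_id, body):
    """Store body as the next version of the post and return that version."""
    with conn:
        (latest,) = conn.execute(
            "SELECT COALESCE(MAX(version), 0) FROM post_versions WHERE post_id = ?",
            (post_id,),
        ).fetchone()
        conn.execute(
            "INSERT INTO post_versions (post_id, version, body) VALUES (?, ?, ?)",
            (post_id, latest + 1, body),
        )
    return latest + 1


def load(conn, post_id, version=None):
    """Load one specific version, or the highest ("active") one by default."""
    if version is None:
        row = conn.execute(
            "SELECT body FROM post_versions WHERE post_id = ? "
            "ORDER BY version DESC LIMIT 1",
            (post_id,),
        ).fetchone()
    else:
        row = conn.execute(
            "SELECT body FROM post_versions WHERE post_id = ? AND version = ?",
            (post_id, version),
        ).fetchone()
    return row[0] if row else None


v1 = save_version(conn, 1, "first draft")
v2 = save_version(conn, 1, "edited text")
print(v1, v2, load(conn, 1), load(conn, 1, version=1))
```

Note that this convention alone does not give you a long-lived draft next to a live copy that keeps changing; for that you would need something extra, such as a status column, which is outside this sketch.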
You can either use BEGIN TRANSACTION, COMMIT and ROLLBACK statements, or you can create a stored procedure or piece of code that puts any amendments the user makes into temporary tables until they are ready to go into production.
If you are making a raft of changes, it is best to use temporary tables, as holding a transaction open until COMMIT can result in locks on the live data for other users.
This article might help if the above means nothing to you: http://www.sqlteam.com/article/temporary-tables
EDIT - You could create new tables (i.e. NOT temporary, but full-fledged SQL tables) "on the fly" and name them something meaningful. For instance, the user's initials, followed by the original table name, followed by a timestamp.
You can then programmatically create, amend and delete these tables over long periods of time, as well as compare them against the live tables. You would need to keep track of how many tables are being created in case your database grows to a vast size.
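Roughly, the "copy into a named table" idea could look like this in Python with sqlite3. The records table, the create_draft_table helper and the sample values are hypothetical. Because table names cannot be passed as query parameters, the sketch checks each part of the name against a strict pattern before building the SQL.

```python
import re
import sqlite3

SAFE_PART = re.compile(r"[A-Za-z0-9_]+")


def create_draft_table(conn, source_table, initials, stamp):
    """Copy source_table into a table named <initials>_<table>_<stamp>."""
    for part in (source_table, initials, stamp):
        # Identifiers are interpolated into SQL, so allow only safe characters.
        if not SAFE_PART.fullmatch(part):
            raise ValueError(f"unsafe identifier part: {part!r}")
    draft = f"{initials}_{source_table}_{stamp}"
    with conn:
        conn.execute(f'CREATE TABLE "{draft}" AS SELECT * FROM "{source_table}"')
    return draft


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany(
    "INSERT INTO records (id, amount) VALUES (?, ?)", [(1, 100), (2, 200)]
)

draft = create_draft_table(conn, "records", "ab", "20240101")

# Edit the draft only; the live table is untouched.
conn.execute(f'UPDATE "{draft}" SET amount = 150 WHERE id = 1')
live = conn.execute("SELECT amount FROM records ORDER BY id").fetchall()

# Compare draft against live: rows whose amount differs.
changed = conn.execute(
    f'SELECT d.id, r.amount, d.amount FROM "{draft}" AS d '
    "JOIN records AS r ON r.id = d.id WHERE r.amount <> d.amount"
).fetchall()
print(draft, live, changed)
```

CREATE TABLE ... AS SELECT copies the data but not the primary key or other constraints, so a real version would probably create the draft table from the original definition first.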
The only major headache then is putting the changes back into the live data. For instance, if someone takes a cut of data into a new table and then 3 weeks later decides to send it into live after making changes. In this instance there is a likelihood of the live data having changed anyway and possibly superseding the changes the user will submit.
You can get around this with some creative coding though. There are many ways to tackle this, so if you get stuck at the next step you might want to start a new question. Hopefully this at least gives you some inspiration though.

Designing a database with flexible user profile

I am working on a design where I can have flexible attributes for users and I am confused how to continue the design of the schema.
I made a table where I kept system needed information:
Table name: users
id
username
password
Now, I wish to create a profile table with a one-to-one relation to users, holding all the other attributes such as email, first name, last name, etc. My question is: is there a way to add a third table that makes profiles flexible? In other words, if my client needs to create a new attribute, he/she shouldn't need any customization of the code.
You're looking for a key/value table, that is, a table with user_id, key and value columns, which gives a 1:N relationship between User and this new table. Look into http://en.wikipedia.org/wiki/Database_normalization for a little more background on normalization. Performance isn't amazing with this kind of table and it can take some interesting planning to optimize your code, but it's a very standard practice.
Keep the fixed parts of the profile in a standard table to make it easy to query, add constraints, etc.
For the configurable parts it sounds like you are looking for an entity-attribute-value model. The extra configurability comes at a high cost though: everything will have to be stored as strings and you will have to do any data validation in the application, not in the database.
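As an illustration of the split described here (fixed columns in a normal table, configurable ones in an entity-attribute-value table), a sketch in Python with sqlite3 follows. The profile_attributes table, its attr_name and attr_value columns and the two helpers are hypothetical names. Note how every value is stored as a string, which is the cost mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id       INTEGER PRIMARY KEY,
        username TEXT NOT NULL UNIQUE
    );
    -- Fixed part of the profile: real columns, easy to query and constrain.
    CREATE TABLE profiles (
        user_id    INTEGER PRIMARY KEY REFERENCES users(id),
        email      TEXT,
        first_name TEXT,
        last_name  TEXT
    );
    -- Configurable part: one row per (user, attribute), value kept as text.
    CREATE TABLE profile_attributes (
        user_id    INTEGER NOT NULL REFERENCES users(id),
        attr_name  TEXT NOT NULL,
        attr_value TEXT,
        PRIMARY KEY (user_id, attr_name)
    );
""")
conn.execute("INSERT INTO users (id, username) VALUES (1, 'alice')")


def set_attribute(conn, user_id, name, value):
    """Create or overwrite one custom attribute; the value is stored as text."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO profile_attributes "
            "(user_id, attr_name, attr_value) VALUES (?, ?, ?)",
            (user_id, name, str(value)),
        )


def get_attributes(conn, user_id):
    """Return all custom attributes of a user as a name -> text value dict."""
    rows = conn.execute(
        "SELECT attr_name, attr_value FROM profile_attributes WHERE user_id = ?",
        (user_id,),
    ).fetchall()
    return dict(rows)


set_attribute(conn, 1, "favorite_color", "green")
set_attribute(conn, 1, "shoe_size", 42)
set_attribute(conn, 1, "shoe_size", 43)   # overwrites the earlier value
attrs = get_attributes(conn, 1)
print(attrs)
```

Adding a new attribute is just a new attr_name value, so no schema or code change is needed, but the number 43 comes back as the string "43".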
How will these attributes be used? Are they simply a bag of data or would the user expect that the system would do something with these values? Are there ever going to be any reports against them?
If the system must do something with these attributes then you should make them columns since code will have to be written anyway that does something special with the values. However, if the customers just want them to store data then an EAV might be the ticket.
If you are going to implement an EAV, I would suggest adding a DataType column to your attributes table. This enables you to do some rudimentary validation on the entered data and dynamically change the control used for entry.
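The DataType suggestion might be sketched like this in Python with sqlite3. The attributes table, the four type names and the parse_value and is_valid helpers are assumptions for the example, not a standard. The idea is that the application looks up the declared type of an attribute and uses it to parse the stored string.

```python
import sqlite3
from datetime import date

# Maps a declared data type to a function that parses the stored string.
# Each parser raises ValueError on bad input.
PARSERS = {
    "text": str,
    "integer": int,
    "decimal": float,
    "date": date.fromisoformat,
}

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE attributes (
        name      TEXT PRIMARY KEY,
        data_type TEXT NOT NULL
            CHECK (data_type IN ('text', 'integer', 'decimal', 'date'))
    )
""")
conn.executemany(
    "INSERT INTO attributes (name, data_type) VALUES (?, ?)",
    [("shoe_size", "integer"), ("birthday", "date"), ("nickname", "text")],
)


def parse_value(conn, name, raw):
    """Parse raw text according to the attribute's declared data type."""
    row = conn.execute(
        "SELECT data_type FROM attributes WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(f"unknown attribute: {name}")
    return PARSERS[row[0]](raw)


def is_valid(conn, name, raw):
    """Rudimentary validation: does raw parse as the declared type?"""
    try:
        parse_value(conn, name, raw)
        return True
    except ValueError:
        return False


print(parse_value(conn, "shoe_size", "42"))
print(parse_value(conn, "birthday", "1990-05-17"))
print(is_valid(conn, "shoe_size", "forty-two"))
```

The same data_type value could also drive which input control is shown for the attribute, for example a date picker for "date".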
If you are going to use an EAV, then the one rule you must follow is to never write any code where you specify a particular attribute. If these custom attributes are nothing more than a wad of data, then an EAV for this one portion of your system will work. You could even consider creating an XML column to store these attributes. SQL Server actually has an XML data type but all databases have some form of large text data type that will also work. On reports, the data would only ever be spit out. You would never place specific values in specific places on reports nor would you ever do any kind of numerical operation against the data.
The price of an EAV is vigilance and discipline. You, the other developers, and especially report writers need the discipline to never filter on a specific attribute, no matter how much pressure you get from management. The moment a client wants to filter or do operations on a specific attribute, it must become a first-class attribute stored as a column. If you feel that this kind of discipline cannot be maintained, then I would simply create columns for each attribute, which would mean an adjustment to the code but would create less of a mess down the road.