Say we have an app that allows people to like or dislike pictures.
This looks like a data-intensive application, since you would expect a huge number of (dis)like requests, so say we choose Node.js for it.
As we don't want people to vote more than once, we need a way of relating the picId and the userId. This could be done:
with a relational database, by using a table where picId and userId together form the key,
with NoSQL, by creating a 'file' for each user and storing in it all the picIds they have voted on, or the other way around, creating a file for each picture and storing in it all the userIds that have voted on the pic.
This part of the DB will be read and written intensively, since for every vote you first need to check whether the user has already voted and then write the new vote, as well as update the total vote count of the pic.
Which is the best option (based ONLY on technical reasons)?
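For reference, the relational option can be sketched roughly as follows. This is only a minimal sketch using Python's built-in sqlite3 as a stand-in database; the table names (pics, votes), the like_count column and the cast_like helper are assumptions made for illustration, and it only handles likes. The composite primary key lets the database itself reject a second vote, so no separate "has this user voted?" read is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pics (
        pic_id     INTEGER PRIMARY KEY,
        like_count INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE votes (
        pic_id  INTEGER NOT NULL REFERENCES pics(pic_id),
        user_id INTEGER NOT NULL,
        PRIMARY KEY (pic_id, user_id)  -- one vote per user per picture
    );
""")
conn.execute("INSERT INTO pics (pic_id) VALUES (1)")
conn.commit()


def cast_like(conn, pic_id, user_id):
    """Record a like; return False if this user already voted on the picture."""
    try:
        with conn:  # one transaction: both statements commit or roll back together
            conn.execute(
                "INSERT INTO votes (pic_id, user_id) VALUES (?, ?)",
                (pic_id, user_id),
            )
            conn.execute(
                "UPDATE pics SET like_count = like_count + 1 WHERE pic_id = ?",
                (pic_id,),
            )
        return True
    except sqlite3.IntegrityError:
        # The composite primary key rejected a duplicate (pic_id, user_id) pair.
        return False


print(cast_like(conn, 1, 42))  # first vote by user 42
print(cast_like(conn, 1, 42))  # second attempt by the same user
```

Whether this holds up under a very high write rate depends on the database and hardware, so treat it as a starting point to measure rather than an answer to the question.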
Say you use MongoDB (a NoSQL document-oriented DB) and you have unique usernames. You can do this:
Create an ImageModel (a mongoose model) for the images and a UserModel for your users.
Store the images in the database (you can use references to find them).
In your ImageModel, keep a like array and a dislike array containing the usernames of the users who liked or disliked the picture.
With that, the number of likes/dislikes is the length of the corresponding array, and you can easily find out what your users like or dislike.
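The document shape described above could look something like this. It is a plain-Python sketch with dicts standing in for MongoDB documents, not mongoose code; new_image, cast_vote and counts are hypothetical helper names.

```python
def new_image(image_id):
    """Build an image document with empty like and dislike arrays."""
    return {"_id": image_id, "likes": [], "dislikes": []}


def cast_vote(image, username, liked):
    """Add username to the likes or dislikes array; return False if already voted."""
    if username in image["likes"] or username in image["dislikes"]:
        return False
    image["likes" if liked else "dislikes"].append(username)
    return True


def counts(image):
    """The vote totals are just the lengths of the two arrays."""
    return len(image["likes"]), len(image["dislikes"])


pic = new_image("pic1")
cast_vote(pic, "alice", liked=True)
cast_vote(pic, "bob", liked=False)
cast_vote(pic, "alice", liked=False)  # ignored, alice has already voted
print(counts(pic))
```

In MongoDB itself, the `$addToSet` update operator serves a similar purpose for keeping an array free of duplicates. Note that the arrays grow with every vote, so a very popular picture means a very large document.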
Nevertheless, if you want to build a social-network application, document-oriented DBs aren't the best choice, because they don't implement relations and it will be difficult for you to link information and users.
In my opinion SQL DBs aren't a good fit either, because they may not offer enough performance for big apps, so I suggest you take a look at NoSQL graph or graph-document DBs like OrientDB or Neo4j.
I hope this helps, and sorry for my English (feel free to correct me) ;)
Related
I am building a small social networking website and I have a question about the database schema:
How should I store the posts (text) by a user?
I'll have a separate POST table and will link the USERS table to it through a USERS_POST table.
But every time it displays all the posts on a user's profile, will the system have to search the entire USERS_POST table for the USER id?
What else should I do?
Similarly, how should I store the multiple places where the user has worked or studied?
I understand it's broad, but I am new to databases. :)
First, don't worry too much: start by making it work and see where you get performance problems. The database might be a lot quicker than you expect. Also, it is often much easier to see what the best solution is when you have an actual query that is too slow.
Regarding your design, if a post is never linked to more than one user, then forget the USERS_POST table and put the user id in the POST table. In either case, an index on the user id would help (the database would not have to read the whole table) when the database grows large.
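To make that concrete, here is a small sketch using Python's built-in sqlite3 instead of PostgreSQL, so the planner output will look different on your database. The column names, the index name idx_post_user_id and the posts_for helper are assumptions for illustration. The query plan is printed so you can check that the lookup by user id goes through the index rather than scanning the whole table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE post (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),  -- no USERS_POST table needed
        body    TEXT NOT NULL
    );
    CREATE INDEX idx_post_user_id ON post(user_id);
""")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice'), (2, 'bob')")
conn.executemany(
    "INSERT INTO post (user_id, body) VALUES (?, ?)",
    [(1, "first post"), (2, "hello"), (1, "second post")],
)


def posts_for(conn, user_id):
    """Return the post bodies of one user, oldest first."""
    rows = conn.execute(
        "SELECT body FROM post WHERE user_id = ? ORDER BY id", (user_id,)
    )
    return [body for (body,) in rows]


# Ask SQLite how it would run the lookup; the last column describes each step.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT body FROM post WHERE user_id = ?", (1,)
).fetchall()

print(posts_for(conn, 1))
print([row[3] for row in plan])
```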
Multiple places for a single user would be stored in an additional table, for instance one called USERS_PLACES. Give it a user_id column to link it to USERS, plus other columns for the data you wish to store per place.
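A possible shape for that table, again sketched with sqlite3 as a stand-in. The kind and name columns and the places_for helper are made up for the example; you would pick whatever per-place data you actually need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE users_places (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        kind    TEXT NOT NULL CHECK (kind IN ('work', 'study')),
        name    TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
conn.executemany(
    "INSERT INTO users_places (user_id, kind, name) VALUES (?, ?, ?)",
    [(1, "study", "Some University"), (1, "work", "First Job"), (1, "work", "Second Job")],
)


def places_for(conn, user_id, kind):
    """Return the names of the places of one kind for one user, in insertion order."""
    rows = conn.execute(
        "SELECT name FROM users_places WHERE user_id = ? AND kind = ? ORDER BY id",
        (user_id, kind),
    )
    return [name for (name,) in rows]


print(places_for(conn, 1, "work"))
```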
BTW, in PostgreSQL you might want to keep all object names (tables, columns, ...) lowercase, because unless you take care to always quote them, like "USERS", PostgreSQL folds them to lowercase, which can be confusing.
I've been trying to get my head around NoSQL, and I do see the benefits to embedding data in documents.
What I can't understand, and hope someone can clear up, is how to store data if it must be relational.
For example.
I have many users. They are all buying a product. So every time they buy a product, we add it under the user's document in Mongo, so it's embedded and it's all great.
The problem I have is when something in reference to that product changes.
Let's say user A buys a car called "Porsche". Then we add a reference to that under the user's profile. However, in a strange turn of events, Porsche gets purchased by Ferrari.
What do you do now: update each and every record and change the name from Porsche to Ferrari?
Typically in SQL, we would create 3 tables: one for users, one for cars (description, model, etc.) and one for mapping users to purchases.
Do you do the same thing in Mongo? It seems like if you go down this route, you are trying to make Mongo do things the SQL way, which is not what it's intended for.
I can understand how certain data is great for embedding (addresses, contact details, comments, etc.), but what happens when you need to reference data that can and needs to change on a regular basis?
I hope this question is clear
DBRefs/manual references were made specifically to solve this issue. Instead of copying the data into each document and then needing to update every copy when something changes, you can store a reference to a document in another collection. See the MongoDB documentation on references for details.
Then all you need to do is update the referenced document, and the change is reflected everywhere it is referenced.
When I used the mongoose library for Node.js, it actually created 3 collections, similar to how you might do it in SQL. You can use ObjectIds as foreign keys and enrich them either on the client side or on the backend. There is still no joining, but you can do an 'in' query for the IDs and then enrich the objects that way; mongoose can do this automatically by 'populating'.
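The idea behind manual references and 'populating' can be sketched without any database at all. Below, plain Python dicts stand in for the users and cars collections; the IDs and the populate helper are hypothetical, and this is not how mongoose implements it, just the same pattern in miniature.

```python
# Two "collections": users hold only the IDs of the cars they bought.
cars = {"car1": {"_id": "car1", "name": "Porsche"}}
users = {
    "userA": {"_id": "userA", "purchases": ["car1"]},
    "userB": {"_id": "userB", "purchases": ["car1"]},
}


def populate(user, cars):
    """Return a copy of the user with purchase IDs replaced by the car documents."""
    return {**user, "purchases": [cars[car_id] for car_id in user["purchases"]]}


# The rename happens in exactly one place: the referenced car document.
cars["car1"]["name"] = "Ferrari"

for user in users.values():
    print(user["_id"], [car["name"] for car in populate(user, cars)["purchases"]])
```

The trade-off is an extra lookup at read time in exchange for a single write when the referenced data changes.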
I would very much appreciate any information regarding this. I have a database that follows correct design principles; I say this because I modelled it using an ERD and normalisation.
I am using this database for a web program I am developing, which has a login system. I am aware that a login system can be implemented using one table, e.g. a user table with an extra field defining the user's authorisation level within the system, which would be much easier to develop. On the other hand, as a compsci student, I am unsure whether doing this will lower my marks, since it isn't the correct principle.
Just to clarify, the database I've designed has 3 different types of users, which have relationships to different entities.
Thank you so much for your time and for reading this!
So you have three different types of users, and you want to impress your teacher by not merely using one table.
A good schema would be:
users, for all the things they have in common, plus
common_data, admin_data, and organizer_data
The former (users) is for regular login/authentication:
username
hash (password)
access_level (or type)
and you might even include:
last_login
or, you know, whatever.
In the other tables, keep the more general information (that you would be reading less often):
email
phone_number
address
etc.
For organizer_data, you might have groupID, which of course you could also put in the users table. admin_data would get something like failed_login_attempts; in some of my projects, I have "last_ip_address" for the admin.
But you get the idea: separate user entities that require separate data sets. Since this project doesn't seem to be very code-oriented, I'm sure you could get away with making up whatever columns seem remotely logical.
And of course, both tables get an id column, which provides their relationship!
Now, insofar as one table being easier than two, you should look into JOINs, which make two tables appear as one when you need them to; otherwise, they can stay separate entities.
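As a rough illustration of the JOIN point, here is a sketch using Python's built-in sqlite3. The exact columns are assumptions based on the lists above, and the user_id column in admin_data is a hypothetical choice for how the two tables share an id; the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read columns by name
conn.executescript("""
    CREATE TABLE users (
        id           INTEGER PRIMARY KEY,
        username     TEXT NOT NULL UNIQUE,
        hash         TEXT NOT NULL,
        access_level TEXT NOT NULL
    );
    CREATE TABLE admin_data (
        user_id               INTEGER PRIMARY KEY REFERENCES users(id),
        failed_login_attempts INTEGER NOT NULL DEFAULT 0
    );
""")
conn.executemany(
    "INSERT INTO users (id, username, hash, access_level) VALUES (?, ?, ?, ?)",
    [(1, "root", "placeholder-hash", "admin"), (2, "guest", "placeholder-hash", "regular")],
)
conn.execute("INSERT INTO admin_data (user_id, failed_login_attempts) VALUES (1, 3)")

# The JOIN makes the two tables look like one wide table for admins only.
rows = conn.execute("""
    SELECT u.username, u.access_level, a.failed_login_attempts
    FROM users AS u
    JOIN admin_data AS a ON a.user_id = u.id
""").fetchall()

for row in rows:
    print(row["username"], row["access_level"], row["failed_login_attempts"])
```

Login checks only ever touch users; the admin-specific columns are pulled in only when a query asks for them.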
I'm working on an app where part of it involves people liking and commenting on pictures other people posted. Obviously I want the user to be notified when someone comments on or likes their picture, but I also want that user to be able to see the pictures that they posted. This brings up a couple of structuring questions.
I have a table that stores each image with its ID, the image itself, other info such as likes/comments, the date posted, and finally the userID of the user that posted the image:
Here's that table structure:
Image Posts Table: |postID|image|misc. image info|userID|
The userID is used to grab information from the user's entry in the user table for notifications. Now when that user looks at a page containing their own posts I have two options:
1.) Query the Image Posts Table for any image containing that user's userID.
2.) Create a table for each user and put a postID of each image they posted :
Said User's Table: |postID|
I would assume that the second option would be more efficient because I don't have to query a table with a large amount of entries. Are there any more efficient ways to do this?
Obviously I should read up on good database design so do any of you have any good recommendations?
Multiple tables of identical structure almost never make sense. Writing queries using your 2nd option would become ugly in short order. Stick with one large table; databases are designed to handle tables with many rows.
I would recommend against manually storing the userID, as Parse will do its own internal magic if you just set a property called user to the current user. Internally it stores the ID and marks it as an object reference. They may or may not have extra optimizations in there for query performance.
Given that the system is designed around the concept of references, you should keep to just the two tables/classes you mentioned.
When you query the Image Posts table you can just add a where expression using the current user (again it internally gets the ID and searches on that). It is a fully indexed search so should perform well.
The other advantage is that when you query the Image Posts table you can use the include method to include the User object it is linked to, avoiding a 2nd query. This is only available if you store a reference instead of manually extracting and storing the userID.
Have a look at the AnyPic sample app on the tutorial page as it is very similar to what you mention and will demonstrate the ideas.
One of the basics of this new website I am making is that it allows a user to select their favourite game. Because I want users to be able to search for people who like the same games that they like, I want the spellings and layouts of each game to be the same, so I was thinking of simply having a drop-down menu for this. But here's where my data structure issue comes in:
As each user will only have one favourite game, should I just have one table for the users where the 'Favourite Game' field is included? OR, should I have a Users table, and a Games table with a link table in-between which stores the ID of the User and the ID of the game?
If I were to have other options such as favourite genre too, would this be an efficient way of doing it? I just feel that at the end I'll end up with a huge flat file database, but this seems to be the best way and it'd be easier to create forms for it.
Thoughts?
If the relationship is, and will always be, one favorite per user, then just add the column to the Users table. If you suspect that at any point you may want your users to be able to select more than one favorite, then implement the link table now.
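A minimal sketch of the single-column variant, using Python's built-in sqlite3. The games table is an assumption (it would feed the drop-down and keep the spellings consistent), as are the column names, the same_favourite helper and the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE games (id INTEGER PRIMARY KEY, title TEXT NOT NULL UNIQUE);
    CREATE TABLE users (
        id                INTEGER PRIMARY KEY,
        name              TEXT NOT NULL,
        favourite_game_id INTEGER REFERENCES games(id)  -- exactly one favourite
    );
""")
conn.executemany("INSERT INTO games (id, title) VALUES (?, ?)", [(1, "Chess"), (2, "Tetris")])
conn.executemany(
    "INSERT INTO users (id, name, favourite_game_id) VALUES (?, ?, ?)",
    [(1, "ann", 1), (2, "ben", 1), (3, "cat", 2)],
)


def same_favourite(conn, user_id):
    """Return the names of other users who share this user's favourite game."""
    rows = conn.execute(
        """
        SELECT other.name
        FROM users AS me
        JOIN users AS other
          ON other.favourite_game_id = me.favourite_game_id
         AND other.id <> me.id
        WHERE me.id = ?
        ORDER BY other.name
        """,
        (user_id,),
    )
    return [name for (name,) in rows]


print(same_favourite(conn, 1))
```

If users later need several favourites, the favourite_game_id column would be replaced by a link table holding (user_id, game_id) pairs, and the self-join above would go through that table instead.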