Efficient Database Structure

Efficient Database Structure - sql

I'm working on an app where part of it involves people liking and commenting on pictures other people posted. Obviously I want the user to be notified when someone comments/likes their picture but I also want that user to be able to be able to see the pictures that they posted. This brings up a couple structuring questions.
I have a table that stores an image with it's ID, image, other info such as likes/comments, date posted info, and finally the userID of the user that posted the image:
Here's that table structure:
Image Posts Table: |postID|image|misc. image info|userID|
The userID is used to grab information from the users entry in the user table for notifications. Now when that user looks at a page containing his own posts I have two options:
1.) Query the Image Posts Table for any image containing that user's userID.
2.) Create a table for each user and put a postID of each image they posted :
Said User's Table: |postID|
I would assume that the second option would be more efficient because I don't have to query a table with a large amount of entries. Are there any more efficient ways to do this?
Obviously I should read up on good database design so do any of you have any good recommendations?

Multiple tables of identical structure almost never makes sense. Writing queries using your 2nd option would become ugly in short order. Stick with 1 large user's table, databases are designed to handle tables with many rows.

I would recommend against manually storing the userID, as Parse will do it's own internal magic if you just set a property called user to the current user. Internally it stores the ID and marks it as an object reference. They may or may not have extra optimizations in there for query performance.
Given that the system is designed around the concept of references, you should keep to just the two tables/classes you mentioned.
When you query the Image Posts table you can just add a where expression using the current user (again it internally gets the ID and searches on that). It is a fully indexed search so should perform well.
The other advantage is that when you query the Image Posts table you can use the include method to include the User object it is linked to, avoiding a 2nd query. This is only available if you store a reference instead of manually extracting and storing the userID.
Have a look at the AnyPic sample app on the tutorial page as it is very similar to what you mention and will demonstrate the ideas.

Related

Database schema sample for storing social media post

I am building a small social networking website, I have a doubt regarding database schema:
How should I store the posts(text) by a user?
I'll have a separate POST table and will link USERS table with it, through USERS_POST table.
But every time to display all the posts on user's profile, system will have to search the entire USERS_POST table for USER id and then display?
What else should I do?
Similarly how should I store the multiple places the user has worked or studied?
I understand it's broad but I am new to Database. :)

First don't worry too much, start by making it work and see where you get performance problems. The database might be a lot quicker then you expect. Also it is often much easier to see what the best solution is when you have an actual query that is too slow.
Regarding your design, if a post is never linked to more then one user then forget the USERS_POST table and put the user id in the POST table. In either case an index on the user id would help (as in not having to read the whole table) when the database grows large.
Multiple places for a single user you would store in an additional table. For instance called USERS_PLACES, give it a column user_id to link it to USERS plus other columns for the data you wish to store per place.
BTW In postgresql you might want to keep all object (tables, columns, ...) names lowercase because unless you take care to always quote them like "USERS" postgresql will make them lowercase which can be confusing.

DB Architecture in a Node.js environment

Say we have an app that allows people to like or dislike pictures.
This looks like a data-intensive application, as you would expect a huge amount of (dis)like requests, so say we chose Node.js for it.
As we don't want people to vote more than once, we need a way of relating the picId and userId. This could be done:
with a relational database, by using a table where picId and userId are keys,
with NoSQL, by creating a 'file' for each user and storing there all the picIds she voted... or the other way around, creating a file for each picture and storing there all the userIds that have voted the pic.
This part of the DB will be intensively read and written, as for every vote you first need to check if the user has already voted and then write the new vote, plus updating the total vote count of the pic.
Which is the best option (based ONLY on technical reasons)?

Say you use MongoDB (NoSQL document-oriented DB) and you have unique usernames. You can do that :
Create a Model ImageModel (mongoose model) for the images and UserModel for your users
Store the images into the database (you can use references to find them)
In your ImageModel, you have a like and dislike array containing the usernames of the users who will like or dislike your picture
With that, the number of like/dislike will be the length of the array and you will be able to find out easily what your users like or dislike.
Nevertheless, if you want to create a social-network application, document-oriented DB aren't the best choices because they don't implements relations and it will be difficult for you to link informations and users.
SQL DB aren't good either because they don't offer enough performance for big apps, so I suggest you to take a look at NoSQL Graph or Graph-Document DB like OrientDB or Neo4j
I hope it could help you and sry for my english (feel free to correct me) ;)

Storing relational data in MongoDB (NoSQL)

I've been trying to get my head around NoSQL, and I do see the benefits to embedding data in documents.
What I can't understand, and hope someone can clear up, is how to store data if it must be relational.
For example.
I have many users. They are all buying a product. So everytime that they buy a product, we add it under the users document in mongo, so its embedded and its all great.
The problem I have is when something in reference to that product changes.
Lets say user A buys a car called "Porsche". Then, we add a reference to that under the users profile. However, in a strange turn of events Porsche gets purchased by Ferrari.
What do you do now, update each and every record and change to name from Porsche to Ferrari?
Typically in SQL, we would create 3 tables. One for users, one for Cars (description, model etc) & one for mapping users to purchases.
Do you do the same thing for Mongo? It seems like if you go down this route, you are trying to make Mongo do things SQL way, which is not what its intended for.
I can understand how certain data is great for embedding (addresses, contact details, comments, etc) but what happens when you need to reference data that can and needs to change at a regular basis?
I hope this question is clear

DBRefs/Manual References were made specifically to solve this issue. Instead of manually adding the data to each document and then needing to update when something changes, you can store a reference to another collection. Here is the mongoDB documentation for details.
References in Mongo
Then all you would need to do is update the reference collection and the change would be reflected in all downstream locations.

When i used the mongoose library for node js it actually creates 3 tables similar to how you might do it in SQL, you can use object id's as foreign keys and enrich them either on the client side or on the backend, still no joining but you could do an 'in' query for the ID's then enrich the objects that way, mongoose can do this automatically by 'populating'

Access 2010 Inside Out

I recently acquired this book from Microsoft Press. I currently have Office Enterprise 2007 (Access included) and have firmly decided to convert my Informix-SQL app to Access 2010. However, I'm not experienced with VBA, Macros and several other functionality my app needs. This is going to be a new learning process for me, but I must modernize my 20 year old char-based app and take advantage of new features. I have begun defining my tables and columns, but not the relationships. With INFORMIX, I join a serial (autoincrement) column with an INT column in another table. Now when I display a customers master row, I would like to automatically display all of the transactions associated with that customer in a sub-form and have the ability to add, update, query, delete on either tables. Can this be accomplished with A'10?
EDIT: OK, this is what I have done so far, defined tables and relationships:
there are more validation lookup tables to follow, but these are the main tables. So if now I create a form and specify the CLIENTES (customer table), LOTES (lot table), ARTICULOS (item table) and TRANSACCIONES (transaction table) it will create a CLIENT table as the master form and the other child tables as subforms on one screen?
Also, the reason I created a lot table is because when customers pawn or sell items, the pawnshop groups all these items into one lot, calculates the total loan or purchase amount, stores it all under one transaction and prints the ticket with a description of all the items and total amount. So I want the ability to say, if customer defaults on interest payments or does not redeem pawn, then customer forfeits all items and pawnshop may choose to sell some items to gold refinery and/or transfer other non-gold items to inventory to sell to the public, so would the above ER be adequate for this capability?
I also want to ensure that every row in every table has the same store_ID (company ID) while users are working within a specific company, as this system will be multi-company and there will be consolidated reports, etc.

This type of setup can be accomplished in any version of access going back to 1992.
The way you model these classic one to many relationships in access is to base the parent form on the parent table (note I said partent table, not a query – I going to repeat this again: you do not need a query that joins the data for you). You then create what is called a continues form based on the child table. You now have two forms, and you then can simple drag + drop in the form for the child table into the above parent form, and you are done.
In fact, if you design and setup the relationship correctly in the relationships window, then if you use a wizard to create the form, it will actually build and automatically insert a sub form for you.
So, there is some several interesting issues about the above process that you should know As I pointed above, you don't have to build any SQL query at all. You don't have to write sql to join together the data. Access will do all of this for you, and do it without any code.
So, when you navigate records in the main form, the child records will be automatically displayed and filtered for you in the sub form. (and if you add or delete or edit those child records, the correct relationship key is inserted and maintained for you, again done without writing any code at all).
In the following form, we have a classic accounting funds distribution example. In this example we are tracking membership donations. So, the top part of the form is one record based on the master table and is the donation event. I then created an continuous form. When dropped into the main form, it becomes a sub form. That form is the one on the left side, and it simply allows me to enter a list of members who donated for the above event.
The form looks like:
At this point the form will work without any code having been written.
In fact the above form I have the one main record, on the left side I have many members who made a donation. However, I also needed to split out each donation for those memebers on the left side to an to particular account for accounting purposes. (a classic check spliting that you see in just about every accounting package these days)
So the above models a one to many and then the many members also split out into many different accounts for each donation. A really incredible powerful setup, and one that has almost no code.
So, in the above I'm really doing three tables deep as a model. |And, to be fair, the right side (donations split into accounts) did need one line of code to update correctly, as access does not do this for me when you go 3 tables deep.
However for the most part, to model these classic parent to child table relationships in access, you don't need to write any code at all. You use a main form and then for the child table, you insert a sub-form based on the child table.
And as noted if you set the relationships up correctly, access will automatically stitch the two together for you, and maintain the relationship for you. So display of the child records belonging to the one parent record will display for you automatically. And this ability includes you to edit and delete and add those child records. And, thus as you navigate to a new reocrd, all of the child records and information will automatically be refreshed for the next form.
In other words all the above can be done without building any SQL queries, and not one line of code is required.

Unfortunately Stack Overflow works great for asking a question and having an answer. However, a serious of Q + A that builds on previous question we much find that StackOverflow REALLY breaks down here. I will give this a shot, and with all due respect to the great stack overflow, I am temped to suggest you try a forum based system as opposed to so. Perahps Utter access, or even the access dev forum here:
http://social.msdn.microsoft.com/Forums/en-US/accessdev
Anyway, lets see what we can come up with. I should also point out that your table design and how to relate those tables is not really a access question but simply that of database design. How this should be laid out will be the same for MySql, Oracle, FoxPro or in fact just about ANY relational database system that we had in the last 20+ years in our industry. So, this is not really a access question when you ask about how a database is to be setup.
Ok, lets see. In the above you have the LOTES table attached to customers. If you WERE going to attach (relate) the LOTS table to a customer, then you don't need the store_id in the LOTES tables since the table is a child to that customer record. Anytime you can go up to the parent table to get that information, you don't need to repeat it in the child table(s). So, that customer record you are thus relating back to already has the store id and you be able to get at that store id value any time since it is a relation (that is the whole idea of a relational database, you don't have to repeat data over and over again).
And, as you have it now, the same advice applies to the two child table to LOTES. Again they don't need the store id since by you above design they are already attached to the LOTES table which in turn is attached to the customer table where the store id is. So, in your current design ALL of the child tables below customer can and should and would have the store_id removed.
However, the above I suspect is not what you looking to accomplish here. It is quite likely that LOTES records belong to a store. If that is the case, then you want LOTES table to be child of stores.
You thus should have STORES->LOTES
The only part not clear here is if your design is going to allow one customer to be attached to more then one store. If this is a very rare case, then perhaps you adopt a easier design that either forces you to enter the customer again, or even via some copy code. While this breaks normalizing here often there are some issues like if the shipping address is going to change for different stores for the same customer, then often you save a lot of code and design issues by not normalizing in this case.
However, if you need this feature, then the reverse is also true and there is MANY advantages to normalizing this data correctly, but often more desing work up front.
If you really do want customers to have more then one store, then you need a table called tblCustomerStores. You will attach this table to customers as a child table, and this this table will have the store_id. You can then attached all customer transactions etc. to this child store table. And this Customer store table might even include perhaps payment type and deliver preferences (if they vary from store to store for example).
So, you start at
Customers->customer stores->Transactions->
And, it not quite clear what is in table transactions, but it possible that everthing else belongs below the above trans table.
And, it not clear if Articles are attached to a particular transaction or not? If they are, then articles for thus be a child of transactions.
I thinking more of
Customers->customer stores->Articles->lots
And you want Customers stores->Tranascation
And you want Customers stores->Articles
So, I think you are currently going too many tables deep. Just keep in mind that you can relate many table to the parent table just like you have in your diagram, but I think what you have it not what you are looking for

sDesigning a database with flexible user profile

I am working on a design where I can have flexible attributes for users and I am confused how to continue the design of the schema.
I made a table where I kept system needed information:
Table name: users
id
username
password
Now, I wish to create a profile table and have one to one relation where all the other attributes in profile table such as email, first name, last name, etc. My question is: is there a way to add a third table in which profiles will be flexible? In other words, if my clients need to create a new attribute he/she won't need any customization to the code.

You're looking for a normalized table. That is a table that has user_id, key, value columns which produce a 1:N relationship between User & this new table. Look into http://en.wikipedia.org/wiki/Database_normalization for a little more information. Performance isn't amazing with normalized tables and it can take some interesting planning for optimization of your code but it's a very standard practice.

Keep the fixed parts of the profile in a standard table to make it easy to query, add constraints, etc.
For the configurable parts it sounds like you are looking for an entity-attribute-value model. The extra configurability comes at a high cost though: everything will have to be stored as strings and you will have to do any data validation in the application, not in the database.

How will these attributes be used? Are they simply a bag of data or would the user expect that the system would do something with these values? Are there ever going to be any reports against them?
If the system must do something with these attributes then you should make them columns since code will have to be written anyway that does something special with the values. However, if the customers just want them to store data then an EAV might be the ticket.
If you are going to implement an EAV, I would suggest adding a DataType column to your attributes table. This enables you to do some rudimentary validation on the entered data and dynamically change the control used for entry.
If you are going to use an EAV, then the one rule you must follow is to never write any code where you specify a particular attribute. If these custom attributes are nothing more than a wad of data, then an EAV for this one portion of your system will work. You could even consider creating an XML column to store these attributes. SQL Server actually has an XML data type but all databases have some form of large text data type that will also work. On reports, the data would only ever be spit out. You would never place specific values in specific places on reports nor would you ever do any kind of numerical operation against the data.
The price of an EAV is vigilence and discipline. You have to have discipline amongst yourself and the other developers and especially report writers to never filter on a specific attribute no matter how much pressure you get from management. The moment a client wants to filter or do operations on a specific attribute, it must become a first class attribute as a column. If you feel that this kind of discipline cannot be maintained, then I would simply create columns for each attribute which would mean an adjustment to code but it will create less of mess down the road.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas