Database structure for voting system with up- and down votes - sql

I am going to create a voting system for a web application and wonder what the best way would be to store the votes in the (SQL) database.
The voting system is similar to the one on Stack Overflow. I am wondering whether I should store the up and down votes in different tables. That would make it easier to count all up votes or all down votes separately. On the other hand, I would have to query two tables to find all votes for a user or a voted item.
An alternative would be one table with a boolean field that specifies if this vote is an up or down vote. But I guess counting up or down votes is quite slow (when you have a lot of votes), and an index on a boolean field (as far as I know) does not make a lot of sense.
How would you create the database structure? One or two tables?

Based on the comments, we found the solution that best fits Zardoz's needs.
He does not want to count votes on every request and needs as much detail as possible, so the solution is a mix of both:
Add an integer field to the table being voted on to store the vote count (make sure it cannot overflow).
Create additional tables to log the individual votes (user, post, date, up/down, etc.).
I would recommend using triggers to automatically update the vote count field when a vote is inserted, deleted, or updated in the log table.
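A minimal sketch of the counter-plus-log idea, using Python's built-in sqlite3 so it runs as-is. The table, column, and trigger names (posts, vote_log, vote_count, vote_added, ...) are assumptions for illustration, and trigger syntax differs between database systems, so treat this as a starting point only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (
    post_id    INTEGER PRIMARY KEY,
    vote_count INTEGER NOT NULL DEFAULT 0   -- cached tally, maintained by triggers
);
CREATE TABLE vote_log (
    user_id  INTEGER NOT NULL,
    post_id  INTEGER NOT NULL REFERENCES posts (post_id),
    value    INTEGER NOT NULL CHECK (value IN (1, -1)),   -- +1 = up, -1 = down
    voted_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (user_id, post_id)
);
CREATE TRIGGER vote_added AFTER INSERT ON vote_log
BEGIN
    UPDATE posts SET vote_count = vote_count + NEW.value
    WHERE post_id = NEW.post_id;
END;
CREATE TRIGGER vote_removed AFTER DELETE ON vote_log
BEGIN
    UPDATE posts SET vote_count = vote_count - OLD.value
    WHERE post_id = OLD.post_id;
END;
CREATE TRIGGER vote_changed AFTER UPDATE OF value ON vote_log
BEGIN
    UPDATE posts SET vote_count = vote_count - OLD.value + NEW.value
    WHERE post_id = NEW.post_id;
END;
""")

conn.execute("INSERT INTO posts (post_id) VALUES (1)")
conn.executemany(
    "INSERT INTO vote_log (user_id, post_id, value) VALUES (?, ?, ?)",
    [(10, 1, 1), (11, 1, 1), (12, 1, -1)],
)
# User 12 switches from a down vote to an up vote; user 10 retracts the vote.
conn.execute("UPDATE vote_log SET value = 1 WHERE user_id = 12 AND post_id = 1")
conn.execute("DELETE FROM vote_log WHERE user_id = 10 AND post_id = 1")

# Reading the tally is a single-row lookup, no counting needed.
tally = conn.execute(
    "SELECT vote_count FROM posts WHERE post_id = 1"
).fetchone()[0]
```

The sketch does not handle a vote row being moved to a different post; if that can happen in your application, the update trigger needs to adjust both posts.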

If your votes are just up/down then you could make a votes table linking to the posts and having a value of 1 or -1 (up / down). This way you can sum in a single go.
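To illustrate, here is a small sketch with Python's sqlite3 that gets the score and the separate up/down counts in one query. The table and column names (votes, value, ...) are assumed, not taken from any real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE votes (
        user_id INTEGER NOT NULL,
        post_id INTEGER NOT NULL,
        value   INTEGER NOT NULL CHECK (value IN (1, -1)),  -- +1 = up, -1 = down
        PRIMARY KEY (user_id, post_id)
    )
""")
conn.executemany(
    "INSERT INTO votes (user_id, post_id, value) VALUES (?, ?, ?)",
    [(1, 7, 1), (2, 7, 1), (3, 7, -1), (4, 8, -1)],
)

# One pass over the post's votes yields the net score and both counts.
score, ups, downs = conn.execute(
    """
    SELECT SUM(value),
           SUM(CASE WHEN value = 1  THEN 1 ELSE 0 END),
           SUM(CASE WHEN value = -1 THEN 1 ELSE 0 END)
    FROM votes
    WHERE post_id = ?
    """,
    (7,),
).fetchone()
```

An index starting with post_id (here the primary key would need post_id first, or a separate index) keeps this query limited to the votes of one post.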

Worth a look:
https://meta.stackexchange.com/questions/1863/so-database-schema
http://sqlserverpedia.com/wiki/Understanding_the_StackOverflow_Database_Schema

You will need a link table between users and the entities which are being voted on, I would have thought. This will allow you to see which users have already voted and prevent them from submitting further votes. The table can record in a boolean whether it is an up or down vote.
I would advise storing a current vote tally field in the voted entity to ease querying. Omitting it would save only a negligible amount of space.
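A rough sketch of such a link table, again with Python's sqlite3. The names entity_votes, is_upvote, and cast_vote are made up for the example. The unique constraint is what stops a second vote by the same user on the same entity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE entity_votes (
        user_id   INTEGER NOT NULL,
        entity_id INTEGER NOT NULL,
        is_upvote INTEGER NOT NULL CHECK (is_upvote IN (0, 1)),  -- boolean flag
        UNIQUE (user_id, entity_id)   -- at most one vote per user and entity
    )
""")


def cast_vote(user_id: int, entity_id: int, is_upvote: bool) -> bool:
    """Record a vote; return False if this user already voted on the entity."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO entity_votes (user_id, entity_id, is_upvote) "
                "VALUES (?, ?, ?)",
                (user_id, entity_id, int(is_upvote)),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first_attempt = cast_vote(1, 42, True)
second_attempt = cast_vote(1, 42, False)  # rejected by the unique constraint
```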

Related

Proper Design For SQL table connecting to many other tables

I have an app where users can log comments related to any specific entity that the system has and I wanted to know if there is a "best practice" way to handle the design of the db for this kind of feature.
For example: The three current entities (tables) are Question, Documentation, and ReferenceMaterial (it is self-explanatory what each holds). The user can leave a comment on any one of those particular items, and all comments are simply a varchar field, user id, and date of comment. EDIT: Comments can also belong to more than one entity. For example, all Question entities belong to a Quiz or Test entity. Each of those (Quiz and Test) can also have comments associated with it. So you could run a report to see all comments left for a test by simply querying the Comment table for every record with that test foreign key, or you could limit your query to just the comments left for questions in that test, or for a particular question itself. This offers a lot of flexibility. END EDIT
Right now I have one Comment table with a foreign key relationship to each of the other entity tables (i.e. fkQuestion, fkDocumentation, fkReferenceMaterial, etc). So all comments in the system are stored in this table and, based on what page the user is on, I join to that particular entity's records.
Is there a best practice way of doing this?
Thanks in advance for any help.

Redundant field in SQL for Performance

Let's say I have two tables, called Person and Couple, where each Couple record stores a pair of Person ids (also assume that each person is bound to at most one other person).
I am planning to support a lot of queries where I will ask for Person records that are not married yet. Do you guys think it's worthwhile to add a 'partnerId' field to Person? (It would be set to null if that person is not married yet)
I am hesitant to do this because the partnerId field is something that is computable - just go through the Couple table to find out. The performance cost of creating a new couple will also increase because I have to do this extra bookkeeping.
I hope that it doesn't sound like I am asking two different questions here, but I felt that this is relevant. Is it a good/common idea to include extra fields that are redundant (computable/inferable by joining with other tables), but will make your query a lot easier to write and faster?
Thanks!
A better option is to keep the data normalized, and utilize a view (indexed, if supported by your rdbms). This gets you the convenience of dealing with all the relevant fields in one place, without denormalizing your data.
Note: Even if a database doesn't support indexed views, you'll likely still be better off with a view as the indexes on the underlying tables can be utilized.
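As a sketch of the view approach (plain, not indexed), here is a runnable example using Python's sqlite3. The Couple columns PersonAId and PersonBId, the Name column, and the view name PersonWithPartner are assumptions, since the question does not give the actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (
    PersonId INTEGER PRIMARY KEY,
    Name     TEXT NOT NULL
);
CREATE TABLE Couple (
    PersonAId INTEGER NOT NULL UNIQUE REFERENCES Person (PersonId),
    PersonBId INTEGER NOT NULL UNIQUE REFERENCES Person (PersonId)
);
-- The view exposes a computed PartnerId without storing it on Person.
CREATE VIEW PersonWithPartner AS
SELECT p.PersonId,
       p.Name,
       CASE WHEN c.PersonAId = p.PersonId THEN c.PersonBId
            ELSE c.PersonAId
       END AS PartnerId
FROM Person AS p
LEFT JOIN Couple AS c
       ON p.PersonId IN (c.PersonAId, c.PersonBId);
""")

conn.executemany(
    "INSERT INTO Person (PersonId, Name) VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob"), (3, "Cid"), (4, "Dee")],
)
conn.execute("INSERT INTO Couple (PersonAId, PersonBId) VALUES (1, 2)")

# People without a partner: the LEFT JOIN found no Couple row for them.
unmarried = [
    row[0]
    for row in conn.execute(
        "SELECT PersonId FROM PersonWithPartner "
        "WHERE PartnerId IS NULL ORDER BY PersonId"
    )
]
partner_of_bob = conn.execute(
    "SELECT PartnerId FROM PersonWithPartner WHERE PersonId = 2"
).fetchone()[0]
```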
Is there always a zero to one relationship between Person and Couples? i.e. a person can have zero or one partner? If so then your Couple table is actually redundant, and your new field is a better approach.
The only reason to split Couple off to another table is if one Person can have many partners.
When someone gets a partner you either write one record to the Couple table or update one record in the Person table. I argue that your Couple table is redundant here. You haven't indicated that there is any extra info on the Couple record besides the link, and it appears that there is only ever zero or one Couple record for every Person record.
How about one table?
-- Single self-referencing table; PartnerId stays NULL until a person has a partner.
CREATE TABLE Person
(
    PersonId  INT NOT NULL PRIMARY KEY,
    PartnerId INT NULL,
    FOREIGN KEY (PartnerId) REFERENCES Person (PersonId)
);
With this:
Everyone in the system has a row and a PersonId.
If you have a partner, they are listed in the PartnerId column.
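Here is how the one-table version might be used, sketched with Python's sqlite3. The marry helper and the UNIQUE constraint on PartnerId are my own additions for the example. Note that both rows have to be updated together, which is the extra bookkeeping the question mentions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.execute("""
    CREATE TABLE Person (
        PersonId  INTEGER PRIMARY KEY,
        PartnerId INTEGER NULL UNIQUE REFERENCES Person (PersonId)
    )
""")
conn.executemany("INSERT INTO Person (PersonId) VALUES (?)", [(1,), (2,), (3,)])


def marry(a: int, b: int) -> None:
    """Link two people to each other in a single transaction."""
    with conn:  # both updates commit together or not at all
        conn.execute("UPDATE Person SET PartnerId = ? WHERE PersonId = ?", (b, a))
        conn.execute("UPDATE Person SET PartnerId = ? WHERE PersonId = ?", (a, b))


marry(1, 2)

# Finding everyone without a partner needs no join at all.
single = [
    row[0]
    for row in conn.execute(
        "SELECT PersonId FROM Person WHERE PartnerId IS NULL ORDER BY PersonId"
    )
]
```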
Unnormalized data is always bad. Denormalized data, now, that can be beneficial under very specific circumstances. The best advice I ever heard on this subject is to first fully normalize your data, assess performance/goals/objectives, and then carefully denormalize only if it's demonstrably worth the extra overhead.
I agree with Nick. Also consider the need for a history of the couples. You could use row versioning in the same table, but this doesn't work very well for application databases; it works best in a data warehouse (DW) scenario. A history table would in theory duplicate all the data in the table, not just the relationship. A secondary table would give you the flexibility to add additional information about the relationship, including StartDate and EndDate.

questionnaire time spent mysql

Say I have a table where I store questions.
Now I would like to track how much time people spend per question on average and how many came up with the right solution.
Should I store the time spent per question in table_questions itself or in a different table?
Should I store whether it was answered correctly in table_questions or in a separate table, maybe together with the time spent?
The reason I am hesitating is twofold. First, I would rather not let the user perform update queries on my questions table. But separating the time spent and "answered correctly" into a different table seems weird to me, because they are inherent to the question.
Does anyone with normalization talent (unlike me) know what would be a good approach?
My suggestion:
Don't name tables TABLE_QUESTIONS or TABLE_USERS or anything similar, unless you have a good reason, and I cannot think of one at the moment. Just call them QUESTIONS and USERS.
If you actually have a USERS table, and you care who answers correctly (I cannot tell based on the wording of the question), then I think you should also have a USER_QUESTIONS table. The tables might look like this:
QUESTIONS
---------
Question_Id
Question_Descr
USERS
-----
User_Id
User_Name
USER_QUESTIONS
--------------
Question_Id
User_Id
Answer
Grade
StartTime
EndTime
Then questions (and only questions) go in their own table, and users (and only users) go in their own table. But when a user answers a question, it goes in the mixed table.
You have a many-to-many relationship between users and questions, and creating an intermediate table like this is the normal way of resolving that.
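With a table like that, the two statistics from the question fall out of one grouped query. A small sketch using Python's sqlite3, assuming (these are my assumptions, not part of the layout above) that StartTime and EndTime are stored as seconds and that Grade is 1 for a correct answer and 0 otherwise:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE USER_QUESTIONS (
        Question_Id INTEGER NOT NULL,
        User_Id     INTEGER NOT NULL,
        Answer      TEXT,
        Grade       INTEGER NOT NULL,   -- assumed: 1 = correct, 0 = wrong
        StartTime   INTEGER NOT NULL,   -- assumed: seconds
        EndTime     INTEGER NOT NULL,
        PRIMARY KEY (Question_Id, User_Id)
    )
""")
conn.executemany(
    "INSERT INTO USER_QUESTIONS VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 100, "A", 1, 0, 30),
        (1, 101, "B", 0, 0, 50),
        (2, 100, "C", 1, 100, 110),
    ],
)

# Per question: average seconds spent and number of correct answers.
stats = conn.execute("""
    SELECT Question_Id,
           AVG(EndTime - StartTime) AS avg_seconds,
           SUM(Grade)               AS correct_answers
    FROM USER_QUESTIONS
    GROUP BY Question_Id
    ORDER BY Question_Id
""").fetchall()
```

Since users only ever write to USER_QUESTIONS, the QUESTIONS table itself can stay read-only for them.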

Sql design question - many tables or not?

15 ECTS credits worth of database design down the bin.. I really can't come up with the best design solution for my problem.
Which is this: Basically I'm making a tool that gathers a lot of information concerning the user. At most, the user would fill in 50 fields of data, ranging from simple checkboxes to text input. I'm designing the db right now (with MySQL) and can't decide whether to use a single User table with all of those fields, or to have a table for each category of input.
One example would be "type of payment". This one has three options and if I went with the "table" way I would add a table paymentType and give it binary fields for each payment type. Then I would need an id table to identify which paymentType the user has chosen, whereas if I use a single user table, the data would already be there.
The site will probably see a lot of users (tv, internet and radio marketing) so I'm concerned which alternative would be the best.
I'll be happy to provide more details if you need more to base a decision.
Thanks for reading.
Read this article "Database Normalization Basics", and come back here if you still have questions. It should help a lot.
The most fundamental idea behind these decisions, as you will see in this article, is that each table should represent one and only one "thing", and each field should relate directly and only to that thing.
In your payment types example, it probably makes sense to break it out into a separate table if you anticipate the need to store additional information about each payment type.
Create your "Type of Payment" table; there's no real question there. That's proper normalization and the power behind using relational databases. One of the many reasons to do so is the ability to update a Type of Payment record and not have to touch the related data in your users table. Your join between the two tables will allow your app to see the updated type of payment info by changing it in just the 1 place.
Regarding your other fields, they may not be as clear cut. The question to ask yourself about each field is "does this field relate only to a user or does it have meaning and possible use in its own right?". If you can never imagine a field having meaning outside of the context of a user you're safe leaving it as a field on the user table, otherwise do the primary key-foreign key relationship and put the information in its own table.
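To make the payment type case concrete, here is a minimal sketch with Python's sqlite3. Table and column names are invented for the example, and so are the 'Invoice' and 'PayPal' options. Each payment type is a row rather than a binary column, and the user row just points at one of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PaymentType (
    PaymentTypeId INTEGER PRIMARY KEY,
    Name          TEXT NOT NULL UNIQUE
);
CREATE TABLE Users (
    UserId        INTEGER PRIMARY KEY,
    UserName      TEXT NOT NULL,
    PaymentTypeId INTEGER NOT NULL REFERENCES PaymentType (PaymentTypeId)
);
""")
conn.executemany(
    "INSERT INTO PaymentType (PaymentTypeId, Name) VALUES (?, ?)",
    [(1, "Visa"), (2, "Invoice"), (3, "PayPal")],
)
conn.executemany(
    "INSERT INTO Users (UserId, UserName, PaymentTypeId) VALUES (?, ?, ?)",
    [(1, "alice", 1), (2, "bob", 1)],
)

# Renaming a payment type touches one row; no user rows change.
conn.execute("UPDATE PaymentType SET Name = 'Visa Card' WHERE PaymentTypeId = 1")

users_with_payment = conn.execute("""
    SELECT u.UserName, pt.Name
    FROM Users AS u
    JOIN PaymentType AS pt ON pt.PaymentTypeId = u.PaymentTypeId
    ORDER BY u.UserId
""").fetchall()
```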
If you are building a form with variable inputs, I wouldn't recommend building it as one table. That is inflexible and messy.
Normalization is the key. However, if you end up with a key/value setup, or effectively a scalar type implementation spread across many tables, you may hit a performance boundary if you:
a) can't cache the form definition from table data,
b) can't cache the joined result of storage (either a caching view or otherwise), or
c) don't build in proper sharding.
In this KVP setup, you might want to look at something like CouchDB or a less table-driven storage format.
You may also want to look at trickier setups such as serialized object storage and cache tables if your internal data is heavily related to other data already in the database.
50 columns is a lot. Have you considered a table that stores values like a property sheet? This would only be useful if you didn't need to regularly query the values it contains.
INSERT INTO UserProperty(UserID, Name, Value)
VALUES(1, 'PaymentType', 'Visa');
INSERT INTO UserProperty(UserID, Name, Value)
VALUES(1, 'TrafficSource', 'TV');
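For completeness, a sketch of how such a property sheet could be defined and read back, using Python's sqlite3. The column types and the composite primary key are assumptions. Loading one user's sheet is easy; filtering users by a property value is where this layout gets awkward.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE UserProperty (
        UserID INTEGER NOT NULL,
        Name   TEXT    NOT NULL,
        Value  TEXT,
        PRIMARY KEY (UserID, Name)   -- one value per property per user
    )
""")
conn.executemany(
    "INSERT INTO UserProperty (UserID, Name, Value) VALUES (?, ?, ?)",
    [(1, "PaymentType", "Visa"), (1, "TrafficSource", "TV")],
)

# Load the whole sheet for one user into a dictionary.
props = dict(
    conn.execute("SELECT Name, Value FROM UserProperty WHERE UserID = ?", (1,))
)

# Querying by value works, but every value is text and each extra
# property in the filter needs another self-join or subquery.
tv_users = [
    row[0]
    for row in conn.execute(
        "SELECT UserID FROM UserProperty WHERE Name = ? AND Value = ?",
        ("TrafficSource", "TV"),
    )
]
```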
I think I figured out a great way of solving this. Thanks to a friend of mine for suggesting this!
I have three tables, Field {IdField, FieldName, FieldType}, FieldInput {IdInput, IdField, IdUser} and User { IdUser, UserName... etc }
This way it becomes very easy to see what a user has answered, the solution is somewhat scalable, and it provides a good overview. I will constrain the alternatives in another layer, farther away from the db. I believe it's a tradeoff worth making.
Any suggestions or criticism of this solution?

how to make mysql structure of up,down rating

Every programmer here knows about ratings like this:
Rating system http://img69.imageshack.us/img69/4241/98948761.gif
The problem is that I don't know how to make my SQL structure for that.
I think I can add up and down fields to the article table in MySQL, but this does not allow multi-voting.
Could you please tell me how to make it?
Do I have to create a new table?
The easiest way is to simply store the vote counter per article.
When an article gets voted up, increase the counter. Voting down - decrease the counter.
If you need to keep track of which user voted up/down (and avoid multiple votes), you need to define an intersection table between users and articles.
It could look like this:
article_votes
--------------
user_id
article_id
vote
where vote can be either +1 or -1.
If you need the points of an article, you get them with:
SELECT SUM( vote )
FROM article_votes
WHERE article_id = <your_article_id>;
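A possible way to wire this up, sketched with Python's sqlite3 rather than MySQL so it runs without a server. The composite primary key and the helper names set_vote and article_points are my own choices. INSERT OR REPLACE is SQLite syntax; in MySQL you would probably reach for REPLACE INTO or INSERT ... ON DUPLICATE KEY UPDATE instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE article_votes (
        user_id    INTEGER NOT NULL,
        article_id INTEGER NOT NULL,
        vote       INTEGER NOT NULL CHECK (vote IN (1, -1)),
        PRIMARY KEY (user_id, article_id)   -- one vote per user and article
    )
""")


def set_vote(user_id: int, article_id: int, vote: int) -> None:
    """Store the user's vote, replacing any earlier vote on the same article."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO article_votes (user_id, article_id, vote) "
            "VALUES (?, ?, ?)",
            (user_id, article_id, vote),
        )


def article_points(article_id: int) -> int:
    """Net points of an article; 0 when nobody has voted yet."""
    return conn.execute(
        "SELECT COALESCE(SUM(vote), 0) FROM article_votes WHERE article_id = ?",
        (article_id,),
    ).fetchone()[0]


set_vote(1, 5, 1)
set_vote(2, 5, 1)
set_vote(2, 5, -1)   # user 2 changes their mind; the old row is replaced
points = article_points(5)
```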
You may get some ideas out of how stackoverflow does it:
Stack Overflow Creative Commons Data Dump
Understanding the StackOverflow Database Schema
Meta Stackoverflow: Anatomy of a data dump