MySQL query (joining tables) - sql

QuestionsTable
id* (int) | question_text (string) | question_type (int)
AlternativesTable
id* (int) | question_id (int) | alternative_text (string) | is_correct (bool)
AnswersTable
id* (int) | question_id (int) | alternative_id (int) | answer_text (string)
" * " = primary key
Every question is either of type free text or multiple selector. A multiple selector question has one or more alternatives, and only one of them can be correct.
An answer is defined by a question_id and either an alternative_id (multiple selector) or an answer_text (free text). The is_correct bool is there so I can mark which alternative is the correct one.
How do I write an SQL query that gives me all the alternatives for every question, with a count on each alternative showing how many people have selected it? Say I store the result in an array, iterate through it with a foreach loop, and show it as in the example below.
In the example, every question is represented by its question_text, and every alternative belonging to that question is represented by Alt1 (alternative_text), Alt2, and so on. The number after each alternative is the number of selections (answers) it got.
MultiQuestion1
Alt1 | 3
Alt2 | 2
Alt3 | 0
MultiQuestion2
Alt1 | 2
Alt2 | 3
FreeText1
Answer1
Answer2
....
I can make the query that gives me all questions and all the alternatives that belong to it, but I fail when I try to get the count on all answers for every question alternative.
So now I could use some help from a SQL-ninja =)
Thanks in advance Daniel

Off the top of my head, this might work:
SELECT alternatives.id, COUNT(DISTINCT(answers.id))
FROM alternatives
LEFT JOIN answers ON alternatives.id = answers.alternative_id
GROUP BY alternatives.id;
That should give you the total number of times each alternative occurs in the answers table.
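To get output shaped like the example in the question, the same LEFT JOIN can be extended with the questions table. Below is a minimal sketch, not a definitive solution: it uses Python's built-in sqlite3 as a stand-in for MySQL, the table names questions, alternatives and answers are assumed (the question calls them QuestionsTable and so on), and the sample rows are made up.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE questions (id INTEGER PRIMARY KEY, question_text TEXT, question_type INTEGER);
CREATE TABLE alternatives (id INTEGER PRIMARY KEY, question_id INTEGER,
                           alternative_text TEXT, is_correct INTEGER);
CREATE TABLE answers (id INTEGER PRIMARY KEY, question_id INTEGER,
                      alternative_id INTEGER, answer_text TEXT);
-- Made-up sample data: Alt1 chosen 3 times, Alt2 twice, Alt3 never.
INSERT INTO questions VALUES (1, 'MultiQuestion1', 1);
INSERT INTO alternatives VALUES (1, 1, 'Alt1', 1), (2, 1, 'Alt2', 0), (3, 1, 'Alt3', 0);
INSERT INTO answers (question_id, alternative_id) VALUES (1, 1), (1, 1), (1, 1), (1, 2), (1, 2);
""")

# LEFT JOIN keeps alternatives nobody selected; COUNT(ans.id) skips the
# NULLs produced for them, so they show up with a count of 0.
counts = conn.execute("""
SELECT q.question_text, a.alternative_text, COUNT(ans.id) AS selections
FROM questions q
JOIN alternatives a ON a.question_id = q.id
LEFT JOIN answers ans ON ans.alternative_id = a.id
GROUP BY q.id, q.question_text, a.id, a.alternative_text
ORDER BY q.id, a.id
""").fetchall()

for question_text, alternative_text, selections in counts:
    print(question_text, alternative_text, selections)
```

Free text answers have no alternative_id, so they would need a separate query on answer_text.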

Related

SQL multiple table nested group by using Ruby on Rails Active Record

Models:
A survey has survey_answers which each have an answer and a question
A question belongs to a category.
Example:
Survey_answer has survey_id and answer_id and question_id
question has category_id
I am using Ruby on Rails with Active Record
Using SQL, and starting with a survey, how can I group as following:
{ category => questions => answers }
Example output:
{[
category_one_record,
[question_1,[answer_1, answer_2]],
[question_2,[answer_3, answer_4]],
category_two_record,
[question_3,[answer_5, answer_6]],
[question_4,[answer_7, answer_8]],
category_three_record,
...
]}
I understand how to do a group but I don't understand how to do a nested grouping with SQL
I believe that you're thinking about this in a non-SQL manner.
SQL is tabular, and when returning this kind of information it is normal to accept the repetition rather than look to generate a nested data structure (such as can be done with JSON, XML, etc).
You don't appear to want any actual aggregation (what GROUP BY is for). Instead you're referring to one way of keeping related records next to each other.
That would simply be joining the data and then ordering it.
Your original post isn't clear on the table structures, so this is a very rough example of what I mean...
SELECT
survey.name AS survey_name,
category.name AS category_name,
question.text AS question_text,
answer.text AS answer_text
FROM
survey
INNER JOIN
question
ON question.survey_id = survey.id
INNER JOIN
category
ON category.id = question.category_id
INNER JOIN
answer
ON answer.question_id = question.id
ORDER BY
survey.name,
category.name,
question.text,
answer.text
Such a query might then return a record set that looks like the following...
survey_name | category_name | question_text | answer_text
-------------+---------------+---------------+-------------
survey_1 | category_1 | question_1 | answer_1
survey_1 | category_1 | question_1 | answer_2
survey_1 | category_1 | question_2 | answer_3
survey_1 | category_1 | question_2 | answer_4
survey_1 | category_2 | question_3 | answer_5
Tabular. Not nested.
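If the nested shape is still needed, it can be built in application code from the ordered rows. Here is a rough sketch in Python rather than Ruby (my choice for illustration only); the rows are hard-coded from the sample result above with the survey column dropped, and the variable names are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

# (category_name, question_text, answer_text), already sorted the way the
# ORDER BY would return them. groupby only merges adjacent rows, so the
# ordering matters.
rows = [
    ("category_1", "question_1", "answer_1"),
    ("category_1", "question_1", "answer_2"),
    ("category_1", "question_2", "answer_3"),
    ("category_1", "question_2", "answer_4"),
    ("category_2", "question_3", "answer_5"),
]

# Group by category first, then by question inside each category.
nested = [
    (
        category,
        [
            (question, [answer for _, _, answer in question_rows])
            for question, question_rows in groupby(category_rows, key=itemgetter(1))
        ],
    )
    for category, category_rows in groupby(rows, key=itemgetter(0))
]

print(nested)
```

The database still returns flat rows; the nesting happens in a single pass over them.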
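If you have natural joins available, you can often pair the tables up with them without spelling out the conditions, provided the columns to join on have the same name in both tables (a natural join matches on identically named columns), like this: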
select * from Survey_answer
natural join question
natural join category
where category_id = 1
If you don't have this option available, you need to specify the join condition and the join type (right, left or inner), for example:
select * from Survey_answer
inner join question on Survey_answer.question_id = question.id
where question.category_id = 1
You may learn more about this here: https://www.w3schools.com/sql/sql_join.asp

Clash of multivalued attribute

I have a database with name and hobbies (as a multivalued attribute), and I want to count how many times the same combination of more than one value occurs.
For example
If this is a sample database
A reading
A dancing
B reading
B dancing
Then the result should be
List of hobbies | Number of occurrence
-----------------|---------------------
reading, dancing | 2
I think you have a query like this:
SELECT hobbies, Count(*) As hNo
FROM t
GROUP BY hobbies
That has a result set like this:
hobbies | hNo
--------+------
reading | 2
dancing | 2
Now, for this data set, you can follow the answers to this question [Concatenate many rows into a single text string] to get them into one row.
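Another reading of the question is that identical sets of hobbies should be counted per person. A small sketch of that, assuming a table t with columns name and hobbies (the name column is taken from the question text, not from the query above), and doing the concatenation in Python instead of SQL so it doesn't depend on any database-specific string aggregation:

```python
import sqlite3
from collections import Counter, defaultdict

# SQLite in memory as a stand-in database; data copied from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (name TEXT, hobbies TEXT);
INSERT INTO t VALUES ('A', 'reading'), ('A', 'dancing'),
                     ('B', 'reading'), ('B', 'dancing');
""")

# Collect the set of hobbies for each name.
hobbies_by_name = defaultdict(set)
for name, hobby in conn.execute("SELECT name, hobbies FROM t"):
    hobbies_by_name[name].add(hobby)

# Sort each set so that the same combination always yields the same key,
# then count how many names share that combination.
combo_counts = Counter(", ".join(sorted(h)) for h in hobbies_by_name.values())

for combo, occurrences in combo_counts.items():
    print(combo, "|", occurrences)
```

The hobbies come out in alphabetical order ("dancing, reading") rather than in the order shown in the question.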

Is there a way to optimise an array of subquery in a SQL select?

I currently have two tables
question
--------
id
title, character varying
answer
--------
id
question_id
votes, integer
I use the following query to return a list of questions, each with its corresponding array of votes:
SELECT question.id,
question.title,
ARRAY(SELECT votes
FROM answer
WHERE answer.question_id = question.id)
FROM question
ORDER BY question.id
The output looks like:
id | title | ?column?
----+----------+-----------------------------------------------------
100 | How to | {5,2,7}
101 | Where is | {0}
102 | What is | {1}
The above query can take close to 50s to run with hundreds of thousands of questions, where each question can have 5 or more answers. Is there a way to optimise the above?
You should use a join:
SELECT question.id, question.title, answer.votes
FROM question
JOIN answer ON answer.question_id = question.id
ORDER BY question.id
If you want the output column to contain a concatenated list of all "votes" associated with a question, and you are on Postgres, check out this question: How to concatenate strings of a string field in a PostgreSQL 'group by' query?
I recommend creating an index on your answer table, and using your original query.
CREATE INDEX answer_question_id_idx ON answer(question_id);
Without this index, it will have to do a sequential scan of the entire table to find rows with a matching question_id. It will have to do that for every single question.
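A quick way to check that an index like this is picked up is to look at the query plan. The sketch below is only an illustration: it uses SQLite through Python's sqlite3 instead of Postgres, group_concat in place of ARRAY(...), and EXPLAIN QUERY PLAN in place of Postgres's EXPLAIN, so the plan text will look different on a real Postgres server.

```python
import sqlite3

# SQLite stand-in for the Postgres schema in the question (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE question (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE answer (id INTEGER PRIMARY KEY, question_id INTEGER, votes INTEGER);
CREATE INDEX answer_question_id_idx ON answer(question_id);
""")

# Ask the planner how it would run the correlated subquery.
plan = conn.execute("""
EXPLAIN QUERY PLAN
SELECT question.id,
       question.title,
       (SELECT group_concat(votes)
        FROM answer
        WHERE answer.question_id = question.id)
FROM question
ORDER BY question.id
""").fetchall()

# The last column of each plan row is the human-readable step description.
plan_details = [row[3] for row in plan]
uses_index = any("answer_question_id_idx" in detail for detail in plan_details)

for detail in plan_details:
    print(detail)
```

If the index name appears in the plan, the lookup per question is an index search rather than a scan of the whole answer table.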
Alternatively, consider using a join, as arc suggested. I'm not an expert in the matter, but I think Postgres will use a hash join rather than multiple sequential scans, making the query faster. If you want to retain the id/title/array format, use array_agg:
SELECT question.id, question.title, array_agg(answer.votes)
FROM question
LEFT JOIN answer ON answer.question_id = question.id
GROUP BY question.id, question.title
ORDER BY question.id;
However, there's a caveat. If a question has no answers, you'll get a weird-looking result:
id | title | array_agg
----+-------------------+-----------
1 | How do I do this? | {3,5}
2 | How do I do that? | {NULL}
(2 rows)
This is because of the LEFT JOIN, which creates a NULL value when no rows from the joined table are available. With INNER JOIN, the second row won't appear at all.
That's why I recommend using your original query. It produces the expected result:
id | title | ?column?
----+-------------------+----------
1 | How do I do this? | {3,5}
2 | How do I do that? | {}
If you want the query to produce one row per question, with votes gathered into an array, you can use a join, with array_agg:
SELECT question.id,
question.title,
array_agg(answer.votes) as answer_votes
FROM question
JOIN answer ON answer.question_id = question.id
GROUP BY question.id, question.title
ORDER BY question.id

Help with Voting Table Schema Idea

I'm trying to create a voting table and maximize performance. Since a vote can only be UP or DOWN, I'm thinking of using a bit where 1 = up and 0 = down. Is this unintuitive? Is there a better way?
UserVotes (3 way primary key between all three tables)
+----------+----------+-------------+
| UserID | IsUp | CommentID |
+----------+----------+-------------+
| 1 | 1 | 99 |
| 2 | 0 | 99 |
etc.
The updates will happen when a user clicks a vote up or a vote down button
If VoteUpButtonClicked Then
VoteService.Add(userID,True, CommentID)
End If
If VoteDownButtonClicked Then
VoteService.Add(userID, False, CommentID)
End If
Then the calls will be "count"
Dim TotalUpVotes = VoteService.QueryVotes().Where(Function(v) v.IsUp And v.CommentID = CommentID).Count
Dim TotalDownVotes = VoteService.QueryVotes().Where(Function(v) Not v.IsUp And v.CommentID = CommentID).Count
I'm using SQL Server 2008 and Linq to SQL.
And yes, I would like to allow users to delete a vote.
How about:
UserId (int) PK
CommentId (int) PK
Vote (tinyint) (1 or -1)
Voted (Date)
Then you can simply sum the values
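As a rough sketch of what summing the values looks like, here is the same layout in SQLite via Python's sqlite3 (a stand-in for SQL Server, so the column types are approximations, and the table name UserVotes and the sample votes are borrowed or made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Composite primary key: one vote per user per comment.
CREATE TABLE UserVotes (
    UserId    INTEGER NOT NULL,
    CommentId INTEGER NOT NULL,
    Vote      INTEGER NOT NULL CHECK (Vote IN (1, -1)),
    Voted     TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (UserId, CommentId)
);
INSERT INTO UserVotes (UserId, CommentId, Vote) VALUES (1, 99, 1), (2, 99, -1), (3, 99, 1);
""")

def totals(comment_id):
    """Return (net score, up votes, down votes) for one comment."""
    return conn.execute("""
        SELECT COALESCE(SUM(Vote), 0),
               COALESCE(SUM(CASE WHEN Vote = 1 THEN 1 ELSE 0 END), 0),
               COALESCE(SUM(CASE WHEN Vote = -1 THEN 1 ELSE 0 END), 0)
        FROM UserVotes
        WHERE CommentId = ?
    """, (comment_id,)).fetchone()

before = totals(99)

# Deleting a vote is just removing the row; the sums adjust on their own.
conn.execute("DELETE FROM UserVotes WHERE UserId = ? AND CommentId = ?", (2, 99))
after = totals(99)

print(before, after)
```

The net score and the separate up/down totals come from one query, and no extra bookkeeping is needed when a user removes a vote.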
I would have the voting table keep a live sum of the votes:
UpVote int
DownVote int
+1 to the applicable column when a user votes.
Store the votes in a log table
UserID
CommentID
IsUp (bit)
If a user deletes his vote, you can interrogate IsUp and -1 on either UpVote or DownVote.
Since you are the one that will be wrapping the table in logic to accomplish the requirements of your project, I would assume it is you that needs to answer the question of whether or not it is intuitive. It needs to be intuitive to you.

Need lowest price in each region in a mysql query

I am trying to write up a query for wordpress which will give me all the post_id's with the lowest fromprice field for each region. Now the trick is these are custom fields in wordpress, and due to such, the information is stored row based, so there is no region and fromprice columns.
So the data I have is (but of course containing a lot more rows):
Post_ID | Meta_Key | Meta_Value
1 | Region | Location1
1 | FromPrice | 150
2 | Region | Location1
2 | FromPrice | 160
3 | Region | Location2
3 | FromPrice | 145
The query I am endeavoring to build should return the post_id of the "lowest priced" matching post grouped by each region with results like:
Post_ID | Region | From Price
1 | Location1 | 150
3 | Location2 | 145
This will allow me to easily iterate the post_id's and print the required information, in fact, I would be just happy with returning post_id's if the rest is harder, I can then fetch the information independently if need be.
Thanks a lot, I'm tearing my hair out over this one; I don't often have to think about turning results on their side from row based to column based, but this time I need it!
So that you get an idea of the table structure I have, you can use the query below as a guide. I thought I had it, but it turned out that while this query does print each distinct region with the lowest from price found in that region, the post_id is completely incorrect. I don't know why; it seems to just take the first post_id it finds and use that.
SELECT pm.post_id,
pm2.meta_value as region,
MIN(pm.meta_value) as price
FROM `wp_postmeta` pm
inner join `wp_postmeta` pm2
on pm2.post_id = pm.post_id
AND pm2.meta_key = 'region'
AND pm.meta_key = 'fromprice'
group by region
I suggest changing MIN(pm.meta_value) in your query to be MIN(CAST(pm.meta_value AS DECIMAL)). Meta_value is a character field, so your existing query will be returning the minimum string value, not the minimum numeric value; for example, "100" will be deemed to be lower than "21".
EDIT - amended CAST syntax.
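The cast fixes the comparison, but the wrong post_id is a separate issue: post_id is neither aggregated nor in the GROUP BY, so MySQL (when it allows the query at all) is free to pick it from any row of the group. One common way around that is to compute the minimum price per region in a subquery and join back to find the post that has it. A sketch of that idea, run against SQLite through Python's sqlite3 as a stand-in for MySQL, with made-up rows and lowercase meta keys as in the query above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wp_postmeta (post_id INTEGER, meta_key TEXT, meta_value TEXT);
-- Post 4 is priced at '1000', which sorts before '145' as a string.
INSERT INTO wp_postmeta VALUES
    (1, 'region', 'Location1'), (1, 'fromprice', '150'),
    (2, 'region', 'Location1'), (2, 'fromprice', '160'),
    (3, 'region', 'Location2'), (3, 'fromprice', '145'),
    (4, 'region', 'Location2'), (4, 'fromprice', '1000');
""")

# Inner query: lowest numeric price per region.
# Outer query: the post(s) in that region whose price equals that minimum.
cheapest = conn.execute("""
SELECT pm_price.post_id,
       pm_region.meta_value AS region,
       CAST(pm_price.meta_value AS DECIMAL) AS price
FROM wp_postmeta pm_price
JOIN wp_postmeta pm_region
  ON pm_region.post_id = pm_price.post_id
 AND pm_region.meta_key = 'region'
JOIN (
    SELECT r.meta_value AS region_name,
           MIN(CAST(p.meta_value AS DECIMAL)) AS min_price
    FROM wp_postmeta p
    JOIN wp_postmeta r
      ON r.post_id = p.post_id
     AND r.meta_key = 'region'
    WHERE p.meta_key = 'fromprice'
    GROUP BY r.meta_value
) lowest
  ON lowest.region_name = pm_region.meta_value
 AND lowest.min_price = CAST(pm_price.meta_value AS DECIMAL)
WHERE pm_price.meta_key = 'fromprice'
ORDER BY pm_price.post_id
""").fetchall()

for post_id, region, price in cheapest:
    print(post_id, region, price)
```

If two posts in a region share the lowest price, both are returned.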
It's hard to figure out without being able to execute the query, but would it help to just change your group by to:
group by pm.post_id, region