MySQL COUNT can't count - sql

Well, it can, but I can't query ;)
Here's my query:
SELECT code.id AS codeid, code.title AS codetitle, code.summary AS codesummary, code.author AS codeauthor, code.date, code.challengeid, ratingItems.*, FORMAT((ratingItems.totalPoints / ratingItems.totalVotes), 1) AS rating, code_tags.*, tags.*, users.firstname AS authorname, users.id AS authorid, GROUP_CONCAT(tags.tag SEPARATOR ', ') AS taggroup,
    COUNT(DISTINCT comments.codeid) AS commentcount
FROM (code)
JOIN code_tags ON code_tags.code_id = code.id
JOIN tags ON tags.id = code_tags.tag_id
JOIN users ON users.id = code.author
LEFT JOIN comments ON comments.codeid = code.id
LEFT JOIN ratingItems ON uniqueName = code.id
WHERE `code`.`approved` = 1
GROUP BY code_id
ORDER BY date desc
LIMIT 15
The important line is the second one - the one I've indented. I'm asking it to COUNT the number of comments on a particular post, but it doesn't return the right number. For example, something with two comments will return "1". Something with 8 comments by two different authors will still return "1"...
Any ideas?
Thanks!
Jack
EDIT: Forgot to mention. When I remove the DISTINCT part, something with 8 comments from two authors returns "28". Sorry, I'm not a MySQL expert and don't really understand why it's returning that :(

You group by code.id and in each group you count (DISTINCT comments.codeid), but comments.codeid = code.id as defined in JOIN, that's why you always get 1.
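You need to count some other column from comments. If there is a surrogate primary key, that is the way to go: COUNT(DISTINCT comments.commentid). The DISTINCT still matters here, because the joins to code_tags and tags repeat every comment once per tag, which is most likely why the count inflates when you remove it.
Also, if every row in a group is known to correspond to exactly one comment (no other join multiplies the rows, and every post has at least one comment, since the LEFT JOIN still produces one row for a post without comments), a simple COUNT(*) should work.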
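To see the effect in isolation, here is a minimal sketch. It uses SQLite through Python's sqlite3 module as a stand-in for MySQL, and the three tiny tables and the comments.id key column are assumptions made for illustration, not the real schema from the question.

```python
import sqlite3

# Hypothetical miniature of the schema in the question; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE code      (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE code_tags (code_id INTEGER, tag_id INTEGER);
    CREATE TABLE comments  (id INTEGER PRIMARY KEY, codeid INTEGER);
    INSERT INTO code VALUES (1, 'snippet');
    INSERT INTO code_tags VALUES (1, 10), (1, 11), (1, 12);  -- three tags
    INSERT INTO comments (codeid) VALUES (1), (1);           -- two comments
""")

row = conn.execute("""
    SELECT COUNT(DISTINCT comments.codeid) AS by_join_column,
           COUNT(comments.id)              AS plain_count,
           COUNT(DISTINCT comments.id)     AS distinct_ids
    FROM code
    JOIN code_tags ON code_tags.code_id = code.id
    LEFT JOIN comments ON comments.codeid = code.id
    GROUP BY code.id
""").fetchone()

# by_join_column collapses to 1, plain_count is tags x comments,
# distinct_ids is the real number of comments.
print(row)
```

With three tags and two comments the join produces six rows per post, so only the DISTINCT count over the comment key gives the real figure.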

Related

SQL ORDER BY something else than one of the table's columns

I have a table with posts in them. Website visitors can upvote or downvote such a post. I want to order a certain sql query by the score of the post, but my posts table doesn't have a score column - I keep the upvotes and downvotes in a different votes table, because that tells me who voted on what. I could add a score column to my posts table, and update it every time someone votes on a post, but I'd rather not do this, as the score is something I can work out by subtracting the downvotes from the upvotes anyway.
Do you have any suggestions? Or should I just go ahead and add a score column to my table?
Edit
My posts table has a post_id column (among other irrelevant columns) and my votes table has columns post_id, user_id and positive (the latter is a BOOLEAN, being 1 when the vote is an upvote and 0 when the vote is a downvote).
I can easily determine the score of a post 'by hand', by first querying the number of upvotes of that post, then the number of downvotes, and calculating their difference. However, I would like to query my posts table and order by the score of that post, so I want to know how/if I can query the votes table in the ORDER BY command while querying the posts table.
No, you do not have to create a score column. You can order by the calculated score, as below:
Since you do have the upvotes and downvotes in a different table, you need to join, as Tim Schmelter has explained.
SELECT p.*
FROM Post p
INNER JOIN Votes v
ON p.PostID = v.PostID
ORDER BY (v.upvotes - v.downvotes);
If you want to get the query to perform better, you could add a function-based index for (v.upvotes - v.downvotes).
EDIT:
Based on the updated information about the posts and the votes table, the following query can be used. The score is calculated within an inline view using a CASE expression. Then, this inline view is joined with the posts table, ordering the rows by the score. Note that an INNER JOIN is used, so only posts that have votes would be listed. To list all posts, a LEFT JOIN could be used instead.
SELECT p.*
FROM posts p
INNER JOIN
(
SELECT
post_id,
SUM
(
CASE
WHEN positive = 0 THEN -1
ELSE 1
END
) score
FROM votes v
GROUP BY post_id
) scores
ON p.post_id = scores.post_id
ORDER BY scores.score;
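A rough sketch of the LEFT JOIN variant mentioned above, run against SQLite through Python's sqlite3 module rather than the asker's database. The sample rows, the net_score alias, the COALESCE to 0 for unvoted posts and the descending order are all assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (post_id INTEGER PRIMARY KEY);
    CREATE TABLE votes (post_id INTEGER, user_id INTEGER, positive INTEGER);
    INSERT INTO posts VALUES (1), (2), (3);
    INSERT INTO votes VALUES (1, 100, 1), (1, 101, 1), (1, 102, 0),  -- post 1 nets +1
                             (2, 100, 0), (2, 101, 0);               -- post 2 nets -2
""")

rows = conn.execute("""
    SELECT p.post_id, COALESCE(scores.score, 0) AS net_score
    FROM posts p
    LEFT JOIN (
        SELECT post_id,
               SUM(CASE WHEN positive = 0 THEN -1 ELSE 1 END) AS score
        FROM votes
        GROUP BY post_id
    ) scores ON p.post_id = scores.post_id
    ORDER BY net_score DESC, p.post_id
""").fetchall()

# Post 3 has no votes at all and still appears, with a score of 0.
print(rows)
```

Without the COALESCE, posts that have no votes would carry a NULL score, and where NULLs sort differs between database systems.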
You have to link both tables via JOIN. Presuming that the Score-table has a column PostID:
SELECT p.*, Score = s.Upvotes- s.DownVotes
FROM Post p
INNER JOIN Score s
ON p.PostID = s.PostID
ORDER BY Score
Presumably, your data has a scores table with a column for each vote and an indicator of whether it is an up vote or down vote. If so, you need to aggregate this information and then you can use it for ordering:
select p.*, (NumUpVotes - NumDownVotes) as NetVotes
from posts p left outer join
(select PostId, sum(case when IsUpVote = 'Y' then 1 else 0 end) as NumUpvotes,
sum(case when IsDownVote = 'Y' then 1 else 0 end) as NumDownVotes
from scores s
group by PostId
) s
on p.postId = s.PostId
order by (NumUpVotes - NumDownVotes);
You don't specify what database you are using so this uses standard SQL that should work in any database. You can adapt the logic for your particular data structure.

Counting from two tables according to single criteria

I am trying to do something like this:
SELECT COUNT(topic.topic_id) + COUNT(post.post_id)
FROM topic, post WHERE author_id = ?
Both tables have column author_id.
I get column reference "author_id" is ambiguous error.
How can I tell it that author_id is present in both tables?
While you could, you most probably do not want to join both tables, since that might result in different counts. Explanation in this related answer:
Two SQL LEFT JOINS produce incorrect result
Two subqueries would be fastest:
SELECT (SELECT COUNT(topic_id) FROM topic WHERE author_id = ?)
+ (SELECT COUNT(post_id) FROM post WHERE author_id = ?) AS result
If topic_id and post_id are defined NOT NULL in their respective tables, you can slightly simplify:
SELECT (SELECT COUNT(*) FROM topic WHERE author_id = ?)
+ (SELECT COUNT(*) FROM post WHERE author_id = ?) AS result
If at least one of both author_id columns is unique, a JOIN would work, too, in this case (but slower, and I wouldn't use it):
SELECT COUNT(t.topic_id) + COUNT(p.post_id) AS result
FROM topic t
LEFT JOIN post p USING (author_id)
WHERE t.author_id = ?;
If you want to enter the value only once, use a CTE:
WITH x AS (SELECT ? AS author_id) -- enter value here
SELECT (SELECT COUNT(*) FROM topic JOIN x USING (author_id))
+ (SELECT COUNT(*) FROM post JOIN x USING (author_id)) AS result
But be sure to understand how joins work. Read the chapter about Joined Tables in the manual.
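As a quick illustration of why joining the two tables can give a different count, here is a small sketch using SQLite through Python's sqlite3 module. The sample data (two topics and three posts by a made-up author 7) is purely hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE topic (topic_id INTEGER PRIMARY KEY, author_id INTEGER NOT NULL);
    CREATE TABLE post  (post_id  INTEGER PRIMARY KEY, author_id INTEGER NOT NULL);
    INSERT INTO topic (author_id) VALUES (7), (7);       -- 2 topics by author 7
    INSERT INTO post  (author_id) VALUES (7), (7), (7);  -- 3 posts by author 7
""")

author = 7

# Two independent subqueries: each table is counted on its own.
(separate,) = conn.execute("""
    SELECT (SELECT COUNT(*) FROM topic WHERE author_id = ?)
         + (SELECT COUNT(*) FROM post  WHERE author_id = ?)
""", (author, author)).fetchone()

# Joining on author_id pairs every topic with every post (2 x 3 = 6 rows),
# so both counts see 6 rows.
(joined,) = conn.execute("""
    SELECT COUNT(t.topic_id) + COUNT(p.post_id)
    FROM topic t
    JOIN post p ON p.author_id = t.author_id
    WHERE t.author_id = ?
""", (author,)).fetchone()

print(separate, joined)
```

The subquery version returns 5, while the join version returns 12 for the same data, because neither author_id column is unique here.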

Find total user posting per post SQL QUERY

I have a database with the following two tables, member and POSTS. I am looking for a way to get the count of how many posts a user has.
(Source: http://i.stack.imgur.com/FDv31.png)
I have tried many variations of the following SQL command without any success. Instead of showing the count of posts for a single user, it shows a single row with all the posts as the count.
In the end I want something like this
(Source: http://i.stack.imgur.com/EbaEj.png)
Might be that I'm missing something here, but this query would seem to give you the results you want:
SELECT member.ID,
member.Name,
(SELECT COUNT(*) FROM Posts WHERE member.ID = Posts.user_id) AS total
FROM member;
I have left comment out of the query as it is not obvious what comment you want to be returned in that column for the group of comments that is counted.
See a SQL Fiddle demo here.
Edit
Sorry, misinterpreted your question :-) This query will properly return all the comments, along with the person who posted them and the total number of comments that the person made:
SELECT Posts.ID,
member.Name,
(SELECT COUNT(*) FROM Posts WHERE member.ID = Posts.user_id) AS total,
Posts.comment
FROM Posts
INNER JOIN member ON Posts.user_id = member.ID
GROUP BY Posts.ID, member.Name, member.ID, Posts.comment;
See an updated SQL Fiddle demo here.
You could use a subquery to calculate the total posts per member:
select m.ID
, m.Name
, coalesce(grp.total, 0)
, p.comment
from member m
left join
posts p
on p.user_id = m.id
left join
(
select user_id
, count(*) as total
from posts
group by
user_id
) grp
on grp.user_id = m.id;
select
a.id
, a.name
, count(1) over (partition by b.user_id) as TotalCountPerUser
, b.comment
from member a join posts b
on a.id = b.user_id
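If one row per member is enough, a LEFT JOIN with GROUP BY is another option. The sketch below runs on SQLite through Python's sqlite3 module. Since the table screenshots are not available here, the column names are taken from the answers above and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE member (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE posts  (ID INTEGER PRIMARY KEY, user_id INTEGER, comment TEXT);
    INSERT INTO member VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO posts (user_id, comment) VALUES (1, 'a1'), (1, 'a2'), (2, 'b1');
""")

rows = conn.execute("""
    SELECT m.ID, m.Name, COUNT(p.ID) AS total
    FROM member m
    LEFT JOIN posts p ON p.user_id = m.ID
    GROUP BY m.ID, m.Name
    ORDER BY m.ID
""").fetchall()

# COUNT(p.ID) skips the NULL produced by the LEFT JOIN,
# so a member without posts gets 0 rather than 1.
print(rows)
```

Counting p.ID instead of * is what keeps members without posts at zero.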

mysql left join ascending

I have a query:
SELECT reply.id,
reply.message,
reply.userid,
reply.date,
medal.id,
medal.url,
medal.name,
user.id,
user.name AS username
FROM posts AS reply
LEFT JOIN users AS user ON reply.userid = user.id
LEFT JOIN medals AS medal ON medal.userid = user.id
GROUP BY reply.id
ORDER BY reply.id ASC
Everything is OK, except that I get the medals in ascending rather than descending order,
which means that it grabs the first medal the user got. I need to get the last one.
The fact that you are seeing the first record per group is accidental. Selecting a non-aggregate, non-group-unique column in a GROUP BY query causes an undefined record to be selected.
For a more detailed explanation read: http://dev.mysql.com/tech-resources/articles/debunking-group-by-myths.html.
One correct way to do what you're doing is by using a subquery, in which you select the maximum medal date per desired group.
This approach is outlined here: http://dev.mysql.com/doc/refman/5.6/en/example-maximum-column-group-row.html
You could perhaps make a subquery or temporary table for medals instead of joining directly on medals. The temporary table would be the medals table but sorted DESC. The subquery would be LEFT JOIN (select id, url, name, userid from medals order by id desc), but I feel like that would be very slow and not the best option.
The simplest solution is to do this:
ORDER BY reply.id ASC, medal.id DESC
and remove your GROUP BY clause.
You can filter the posts / users who have multiple medals afterwards (skip rows that contain the medals you don't want).
The next option is to select MAX(medal.id) in a subquery, and then join on that. That gets messy, but is doable. Here is the general idea (you will probably have to rename a few table aliases, since they are used more than once):
SELECT t.*,
medal.url,
medal.name
FROM
(
SELECT reply.id,
reply.message,
reply.userid,
reply.date,
MAX(medal.id) AS medal_id,
user.id AS user_id,
user.name AS username
FROM posts AS reply
LEFT JOIN users AS user ON reply.userid = user.id
LEFT JOIN medals AS medal ON medal.userid = user.id
GROUP BY reply.id
)
AS t
LEFT JOIN medals AS medal ON medal.id = t.medal_id
ORDER BY t.id ASC
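Here is a small runnable sketch of the same idea, using SQLite through Python's sqlite3 module instead of MySQL. It is a variant rather than the exact query above: it picks MAX(id) per user in a derived table first and then joins the medal row back in. It assumes, as the MAX(medal.id) approach does, that the highest medal id is the most recent one. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts  (id INTEGER PRIMARY KEY, userid INTEGER, message TEXT);
    CREATE TABLE medals (id INTEGER PRIMARY KEY, userid INTEGER, name TEXT);
    INSERT INTO users  VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO posts  VALUES (1, 1, 'first'), (2, 2, 'second');
    INSERT INTO medals VALUES (1, 1, 'bronze'), (2, 1, 'silver'), (3, 1, 'gold');
""")

rows = conn.execute("""
    SELECT reply.id, u.name, medal.name
    FROM posts AS reply
    LEFT JOIN users AS u ON u.id = reply.userid
    LEFT JOIN (
        SELECT userid, MAX(id) AS medal_id   -- newest medal per user
        FROM medals
        GROUP BY userid
    ) AS latest ON latest.userid = u.id
    LEFT JOIN medals AS medal ON medal.id = latest.medal_id
    ORDER BY reply.id ASC
""").fetchall()

# ann's reply carries her newest medal; bob has no medals, so his is NULL/None.
print(rows)
```

Because the medals are reduced to one row per user before the join, the outer query needs no GROUP BY.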

SQL: Get all posts with any comments

I need to construct some rather simple SQL, I suppose, but as it's a rare event that I work with DBs these days I can't figure out the details.
I have a table 'posts' with the following columns:
id, caption, text
and a table 'comments' with the following columns:
id, name, text, post_id
What would the (single) SQL statement look like which retrieves the captions of all posts which have one or more comments associated with it through the 'post_id' key? The DBMS is MySQL if it has any relevance for the SQL query.
select p.caption, count(c.id)
from posts p join comments c on p.id = c.post_id
group by p.caption
having count(c.id) > 0
SELECT DISTINCT p.caption, p.id
FROM posts p,
comments c
WHERE c.post_ID = p.ID
I think using a join would be a lot faster than using the IN clause or a subquery.
SELECT DISTINCT caption
FROM posts
INNER JOIN comments ON posts.id = comments.post_id
Forget about counts and subqueries.
The inner join will pick up all the comments that have valid posts and exclude all the posts that have 0 comments. The DISTINCT will coalesce the duplicate caption entries for posts that have more than 1 comment.
I find this syntax to be the most readable in this situation:
SELECT * FROM posts P
WHERE EXISTS (SELECT * FROM Comments WHERE post_id = P.id)
It expresses your intent better than most of the others in this thread - "give me all the posts ..." (select * from posts) "... that have any comments" (where exists (select * from comments ... )). It's essentially the same as the joins above, but because you're not actually doing a join, you don't have to worry about getting duplicates of the records in Posts, so you'll just get one record per post.
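To make the duplicate-row point concrete, here is a short sketch comparing EXISTS with a plain join. It runs on SQLite through Python's sqlite3 module rather than MySQL, and the sample posts and comments are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts    (id INTEGER PRIMARY KEY, caption TEXT, text TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, name TEXT, text TEXT, post_id INTEGER);
    INSERT INTO posts VALUES (1, 'hello', ''), (2, 'quiet', ''), (3, 'busy', '');
    INSERT INTO comments (name, text, post_id) VALUES
        ('a', 'x', 1), ('b', 'y', 3), ('c', 'z', 3);
""")

# EXISTS only asks whether at least one matching comment is there.
with_exists = conn.execute("""
    SELECT caption FROM posts p
    WHERE EXISTS (SELECT 1 FROM comments c WHERE c.post_id = p.id)
    ORDER BY p.id
""").fetchall()

# A plain join returns one row per (post, comment) pair.
plain_join = conn.execute("""
    SELECT caption FROM posts p
    JOIN comments c ON c.post_id = p.id
    ORDER BY p.id
""").fetchall()

print(with_exists)
print(plain_join)
```

The post with two comments shows up once with EXISTS and twice with the join, which is why the join versions need DISTINCT or GROUP BY.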
SELECT caption FROM posts
INNER JOIN comments ON comments.post_id = posts.id
GROUP BY posts.id;
No need for a having clause or count().
edit: Should be an inner join of course (to avoid nulls if a comment is orphaned), thanks to jishi.
Just going off the top of my head here but maybe something like:
SELECT caption FROM posts WHERE id IN (SELECT post_id FROM comments GROUP BY post_id HAVING COUNT(*) > 0)
You're basically looking at performing a subquery --
SELECT p.caption FROM posts p WHERE (SELECT COUNT(*) FROM comments c WHERE c.post_id=p.id) > 0;
This has the effect of running the SELECT COUNT(*) subquery for each row in the posts table. Depending on the size of your tables, you might consider adding an additional column, comment_count, into your posts table to store the number of corresponding comments, such that you can simply do
SELECT p.caption FROM posts p WHERE comment_count > 0