SQL Join and Count can't GROUP BY correctly? - sql

So let's say I want to select the ID of all my blog posts and then a count of the comments associated with that blog post, how do I use GROUP BY or ORDER BY so that the returned list is in order of number of comments per post?
I have this query which returns the data but not in the order I want? Changing the group by makes no difference:
SELECT p.ID, count(c.comment_ID)
FROM wp_posts p, wp_comments c
WHERE p.ID = c.comment_post_ID
GROUP BY c.comment_post_ID;

I'm not familiar with pre-SQL92 syntax, so I'll express it in a way that I'm familiar with:
SELECT c.comment_post_ID, COUNT(c.comment_ID)
FROM wp_comments c
GROUP BY c.comment_post_ID
ORDER BY COUNT(c.comment_ID) -- ASC or DESC
What database engine are you using? In SQL Server, at least, there's no need for a join unless you're pulling more data from the posts table. With a join:
SELECT p.ID, COUNT(c.comment_ID)
FROM wp_posts p
JOIN wp_comments c ON c.comment_post_ID = p.ID
GROUP BY p.ID
ORDER BY COUNT(c.comment_ID)

SELECT p.ID, count(c.comment_ID) AS [count]
FROM wp_posts p, wp_comments c
WHERE p.ID = c.comment_post_ID
GROUP BY p.ID
ORDER BY [count] DESC;

Probably some posts have no related rows in the comments table, so try grouping by the post ID and using a LEFT JOIN so that those posts still appear. Please also learn the JOIN syntax; it is very helpful:
SELECT p.ID, count(c.comment_ID)
FROM wp_posts p
LEFT JOIN wp_comments c ON (p.ID = c.comment_post_ID)
GROUP BY p.ID
I also encountered that kind of situation in my SQL query journeys :)
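To make the difference between the answers above concrete, here is a minimal sketch run against an in-memory SQLite database from Python. The tables are cut down from the WordPress schema in the question, the sample rows are made up, and the comment_count alias is my own. The inner join drops posts that have no comments, the LEFT JOIN keeps them with a count of 0, and ORDER BY on the aggregate gives the ordering the question asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY);
    CREATE TABLE wp_comments (comment_ID INTEGER PRIMARY KEY, comment_post_ID INTEGER);
    INSERT INTO wp_posts (ID) VALUES (1), (2), (3);
    -- post 1 has one comment, post 2 has three, post 3 has none
    INSERT INTO wp_comments (comment_ID, comment_post_ID)
    VALUES (10, 1), (11, 2), (12, 2), (13, 2);
""")

# Inner join: only posts that have at least one comment survive the join.
inner_rows = conn.execute("""
    SELECT p.ID, COUNT(c.comment_ID) AS comment_count
    FROM wp_posts p
    JOIN wp_comments c ON c.comment_post_ID = p.ID
    GROUP BY p.ID
    ORDER BY comment_count DESC
""").fetchall()

# Left join: every post is kept; COUNT ignores the NULL comment_ID and yields 0.
left_rows = conn.execute("""
    SELECT p.ID, COUNT(c.comment_ID) AS comment_count
    FROM wp_posts p
    LEFT JOIN wp_comments c ON c.comment_post_ID = p.ID
    GROUP BY p.ID
    ORDER BY comment_count DESC
""").fetchall()

print(inner_rows)  # [(2, 3), (1, 1)]
print(left_rows)   # [(2, 3), (1, 1), (3, 0)]
```

Note that the sorting comes from ORDER BY, not from GROUP BY, which is why changing the GROUP BY column made no difference in the original query.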

Related

SELECT DISTINCT query taking too long SQL

This is my code below; it's taking a very long time to execute. Adding SELECT DISTINCT makes it much slower.
What I'm trying to do is get unique companies that satisfy these conditions and also calculate how many teams each company has (this is given by team_id which is given to each user in auth_user u table).
Any help would be amazing, I want to learn how to make better SQL queries. I know that GROUP BY is the better way to do this, but I can't seem to get it.
SELECT DISTINCT u.company_id, c.name, c.company_type, c.office_location, (SELECT (COUNT(DISTINCT u.team_id)) FROM auth_user u WHERE u.company_id = c.id GROUP BY u.company_id) as number_of_teams, s.status, h.auto_renewal
FROM auth_user u, companies_company c, subscriptions_subscription s, hubspot_company h
WHERE u.company_id = c.id
AND s.company_id = c.id
AND h.myagi_id = c.id
ORDER BY u.company_id ASC
First of all refactor your query to use the 1992 JOIN syntax instead of your grandpa's comma-join syntax. (I'm a grandpa and I jumped at using JOIN as soon as it became available.)
SELECT DISTINCT u.company_id, c.name, c.company_type, c.office_location,
count_of_teams_TODO,
s.status, h.auto_renewal
FROM auth_user u
JOIN companies_company c ON u.company_id = c.id
JOIN subscriptions_subscription s ON s.company_id = c.id
JOIN hubspot_company h ON h.myagi_id = c.id
ORDER BY u.company_id ASC;
Then, I believe each user belongs to one team; that is, each user has one value of auth_user.team_id. And you want your result set to show how many teams the company has.
So substitute COUNT(DISTINCT u.team_id) teams for my count_of_teams_TODO placeholder, getting this. There's no need for a subquery. But for the aggregate function COUNT() we need GROUP BY. And we want to group by company, status, and autorenewal.
SELECT c.id AS company_id, c.name, c.company_type, c.office_location,
COUNT(DISTINCT u.team_id) teams,
s.status, h.auto_renewal
FROM auth_user u
JOIN companies_company c ON u.company_id = c.id
JOIN subscriptions_subscription s ON s.company_id = c.id
JOIN hubspot_company h ON h.myagi_id = c.id
GROUP BY c.id, s.status, h.auto_renewal
ORDER BY c.id ASC;
And that should do it. Study up on GROUP BY and aggregate functions. Every second you spend learning those concepts better will help you.
As far as performance goes, get this working and then ask another question. Tag it with query-optimization and read this before you ask it.
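A minimal sketch of the GROUP BY approach, again using an in-memory SQLite database from Python. The tables are trimmed to a few columns, hubspot_company is left out for brevity, and the sample rows are invented. It shows that COUNT(DISTINCT u.team_id) with GROUP BY yields one row per company without SELECT DISTINCT or a subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies_company (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE auth_user (id INTEGER PRIMARY KEY, company_id INTEGER, team_id INTEGER);
    CREATE TABLE subscriptions_subscription (company_id INTEGER, status TEXT);
    INSERT INTO companies_company VALUES (1, 'Acme'), (2, 'Globex');
    -- Acme: three users in two teams; Globex: one user in one team
    INSERT INTO auth_user VALUES (1, 1, 100), (2, 1, 100), (3, 1, 101), (4, 2, 200);
    INSERT INTO subscriptions_subscription VALUES (1, 'active'), (2, 'trial');
""")

# The join produces one row per user; GROUP BY collapses them per company,
# and COUNT(DISTINCT ...) counts each team once however many users it has.
rows = conn.execute("""
    SELECT c.id, c.name, COUNT(DISTINCT u.team_id) AS teams, s.status
    FROM auth_user u
    JOIN companies_company c ON u.company_id = c.id
    JOIN subscriptions_subscription s ON s.company_id = c.id
    GROUP BY c.id, c.name, s.status
    ORDER BY c.id
""").fetchall()

print(rows)  # [(1, 'Acme', 2, 'active'), (2, 'Globex', 1, 'trial')]
```

Listing every non-aggregated column in GROUP BY, as this sketch does, is the most portable form; some engines accept the shorter list when the grouped column is the primary key.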

SQL query to find the top 3 in a category

Calling all sql enthusiasts!
Quick info: using PostgreSQL.
I have a query that returns the maximum number of likes for a user per category. What I want now is to show the top 3 users with the most likes per category.
A helpful resource was using this example to solve the problem:
select type, variety, price
from fruits
where (
select count(*) from fruits as f
where f.type = fruits.type and f.price <= fruits.price
) <= 2;
I understand this, but my query is using joins and I am also a beginner, so I was not able to use this information effectively.
Down to business, this is my query for returning the MAX likes for a user per category.
SELECT category, username, MAX(post_likes) FROM (
SELECT c.name category, u.username username, SUM(p.like_count) post_likes, COUNT(*) post_num
FROM categories c
JOIN topics t ON c.id = t.category_id
JOIN posts p ON t.id = p.topic_id
JOIN users u ON u.id = p.user_id
GROUP BY c.name, u.username) AS leaders
WHERE post_likes > 0
GROUP BY category, username
HAVING MAX(post_likes) >= (SELECT SUM(p.like_count)
FROM categories c
JOIN topics t ON c.id = t.category_id
JOIN posts p ON t.id = p.topic_id
JOIN users u ON u.id = p.user_id WHERE c.name = leaders.category
GROUP BY u.username order by sum desc limit 1)
ORDER BY MAX(post_likes) DESC;
Any and all help would be greatly appreciated. I am having a difficult time wrapping my head around this problem. Thanks!
If you want the most likes per category, use window functions:
SELECT cu.*
FROM (SELECT c.name as category, u.username as username,
SUM(p.like_count) as post_likes, COUNT(*) as post_num,
ROW_NUMBER() OVER (PARTITION BY c.name ORDER BY SUM(p.like_count) DESC) as seqnum
FROM categories c JOIN
topics t
ON c.id = t.category_id JOIN
posts p
ON t.id = p.topic_id JOIN
users u
ON u.id = p.user_id
GROUP BY c.name, u.username
) cu
WHERE seqnum <= 3;
This returns at most three rows per category, even if there are ties. If you want to do something else, then consider DENSE_RANK() or RANK() instead of ROW_NUMBER().
Also, use as for column aliases in the SELECT clause. Although optional, one day you will leave out a comma and be grateful that you are in the habit of using as.
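The tie behaviour is easier to see on a small example. This is only a sketch: it uses SQLite from Python rather than PostgreSQL (assuming SQLite 3.25 or later, which added window functions), and a hypothetical pre-aggregated table user_likes standing in for the joined and grouped subquery above. The sample rows are made up, with carol and dave tied for third place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical stand-in for the grouped categories/topics/posts/users join
    CREATE TABLE user_likes (category TEXT, username TEXT, post_likes INTEGER);
    INSERT INTO user_likes VALUES
        ('sql', 'alice', 50), ('sql', 'bob', 40), ('sql', 'carol', 30),
        ('sql', 'dave', 30), ('sql', 'erin', 10),
        ('css', 'frank', 7), ('css', 'grace', 5);
""")

def top3(window_function):
    # window_function comes from the fixed strings below, never from user input
    sql = f"""
        SELECT category, username, post_likes
        FROM (SELECT ul.*,
                     {window_function}() OVER (PARTITION BY category
                                               ORDER BY post_likes DESC) AS seqnum
              FROM user_likes ul) ranked
        WHERE seqnum <= 3
        ORDER BY category, post_likes DESC, username
    """
    return conn.execute(sql).fetchall()

by_row_number = top3("ROW_NUMBER")  # exactly one of carol/dave is kept
by_rank = top3("RANK")              # both carol and dave are kept

print(len([r for r in by_row_number if r[0] == "sql"]))  # 3
print(len([r for r in by_rank if r[0] == "sql"]))        # 4
```

With ROW_NUMBER() the choice between the tied users is arbitrary unless the ORDER BY inside OVER includes a tiebreaker such as username.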

How can I add a column in sql query from nested join?

I am using Access and have three tables: Category, Topic and Post. What I am trying to achieve is to include the creation date of the most recent post in the result.
Post table has a CreatedOn column.
Currently my query looks like this:
SELECT
category.id,
category.CategoryName,
category.Description,
count(TP.id) AS NumberOfTopics,
Sum(numofposts) AS NumberOfPosts
FROM category
LEFT JOIN (
SELECT
topic.id,
topic.categoryId,
count(post.id) AS numofposts
FROM
topic
LEFT JOIN post ON topic.id = post.topicId
GROUP BY topic.id, topic.categoryId
) AS TP ON category.id=TP.categoryid
GROUP BY category.id, category.CategoryName, category.Description;
My best attempt (in my opinion) was to extend the query in the following way:
SELECT
category.id,
category.CategoryName,
category.Description,
COUNT(topic.id) AS NumberOfTopics,
sum(numofposts) AS NumberOfPosts,
"DUMMY" AS last
FROM category
LEFT JOIN (
SELECT
topic.id,
COUNT(ps.id) AS numofposts,
topic.categoryId
FROM topic
LEFT JOIN (
SELECT
id,
CreatedOn,
topicId
FROM post
ORDER BY post.CreatedOn DESC
) AS ps ON topic.id = ps.topicId
GROUP BY topic.id, topic.categoryId
) AS TP ON category.id=TP.categoryid
GROUP BY category.id, category.CategoryName, category.Description;
Unfortunately, I've tried many different ways to get it, but I am still unsuccessful.
Thanks in advance.
Using inline SELECT clauses:
SELECT
c.id,
c.CategoryName,
c.Description,
(SELECT count(t.id)
FROM topic t
WHERE t.categoryId = c.id
) AS NumberOfTopics,
(SELECT count(p.id)
FROM post p
JOIN topic t ON p.topicId = t.id
WHERE t.categoryId = c.id
) AS NumberOfPosts,
(SELECT max(p.createdOn) FROM post p
JOIN topic t ON p.topicId = t.id
WHERE t.categoryId = c.id
) AS LastPostDate
FROM category c;
This may not be the most efficient query, but produces the right results.
See http://www.sqlfiddle.com/#!3/165d1/2 for a demo.
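For readers without Access or the fiddle at hand, here is a cut-down sketch of the same inline-subquery idea on SQLite from Python. The tables keep only the columns the query needs, dates are stored as ISO text (an assumption made so that MAX compares them correctly), and the rows are invented. The second category has no topics, which shows what the subqueries return for an empty category.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, CategoryName TEXT);
    CREATE TABLE topic (id INTEGER PRIMARY KEY, categoryId INTEGER);
    CREATE TABLE post (id INTEGER PRIMARY KEY, topicId INTEGER, CreatedOn TEXT);
    INSERT INTO category VALUES (1, 'General'), (2, 'Empty');
    INSERT INTO topic VALUES (1, 1), (2, 1);
    INSERT INTO post VALUES
        (1, 1, '2020-01-01'), (2, 1, '2020-03-15'), (3, 2, '2020-02-10');
""")

# Each scalar subquery is evaluated once per category row.
rows = conn.execute("""
    SELECT c.id,
           (SELECT COUNT(t.id) FROM topic t WHERE t.categoryId = c.id) AS NumberOfTopics,
           (SELECT COUNT(p.id) FROM post p JOIN topic t ON p.topicId = t.id
             WHERE t.categoryId = c.id) AS NumberOfPosts,
           (SELECT MAX(p.CreatedOn) FROM post p JOIN topic t ON p.topicId = t.id
             WHERE t.categoryId = c.id) AS LastPostDate
    FROM category c
    ORDER BY c.id
""").fetchall()

print(rows)  # [(1, 2, 3, '2020-03-15'), (2, 0, 0, None)]
```

A category with no posts gets counts of 0 and a NULL last-post date, so nothing is dropped from the result.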

Single SQL query on many to many relationship

I have a simple database with few tables (and some sample columns):
Posts (ID, Title, Content)
Categories (ID, Title)
PostCategories (ID, ID_Post, ID_Category)
Is there a way to create single SQL query which will return posts with categories that are assigned to each post?
In MySQL, you can use the GROUP_CONCAT function:
select p.*, group_concat(DISTINCT c.title ORDER BY c.title DESC SEPARATOR ', ')
from Posts p
inner join PostCategories pc on p.ID = pc.ID_Post
inner join Categories c on pc.ID_Category = c.ID
group by p.id, p.title, p.content
Simple joins work well.
SELECT Posts.ID, Posts.Title, Categories.ID, Categories.Title
FROM Posts
JOIN PostCategories ON Posts.ID = PostCategories.ID_Post
JOIN Categories ON PostCategories.ID_Category = Categories.ID
select p.*, c.*
from Posts p
inner join PostCategories pc on p.ID = pc.ID_Post
inner join Categories c on pc.ID_Category = c.ID
If you mean with only one record per post, I will need to know what database platform you are using.
Sure. If I understand your question correctly, it should be as simple as
SELECT Posts.title, Categories.title
FROM Posts, Categories, PostCategories
WHERE PostCategories.ID_Post = Posts.ID AND PostCategories.ID_Category = Categories.ID
ORDER BY Posts.title, Categories.title;
Getting one row per Post will be a little more complicated, and will depend on what RDBMS you're using.
We can use this query also.
select e.*, c.* from Posts e, Categories c, PostCategories cp where cp.ID in ( select s.ID from PostCategories s where s.ID_Post = e.ID and s.ID_Category = c.ID );
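Since several answers note that getting one row per post depends on the database, here is a sketch of the aggregate-concatenation approach on SQLite from Python. SQLite's group_concat(X, Y) takes the separator as a second argument rather than MySQL's SEPARATOR keyword, and it does not promise an order inside the list. The schema follows the question; the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Posts (ID INTEGER PRIMARY KEY, Title TEXT, Content TEXT);
    CREATE TABLE Categories (ID INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE PostCategories (ID INTEGER PRIMARY KEY, ID_Post INTEGER, ID_Category INTEGER);
    INSERT INTO Posts VALUES (1, 'Joins', 'sample text'), (2, 'Indexes', 'sample text');
    INSERT INTO Categories VALUES (1, 'sql'), (2, 'performance');
    -- post 1 is in one category, post 2 is in both
    INSERT INTO PostCategories VALUES (1, 1, 1), (2, 2, 1), (3, 2, 2);
""")

rows = conn.execute("""
    SELECT p.ID, p.Title, group_concat(c.Title, ', ') AS categories
    FROM Posts p
    JOIN PostCategories pc ON p.ID = pc.ID_Post
    JOIN Categories c ON pc.ID_Category = c.ID
    GROUP BY p.ID, p.Title
    ORDER BY p.ID
""").fetchall()

# The order inside the concatenated string is not guaranteed, so sort after splitting.
categories_by_post = {title: sorted(cats.split(", ")) for _, title, cats in rows}
print(categories_by_post)  # {'Joins': ['sql'], 'Indexes': ['performance', 'sql']}
```

Other engines have their own equivalents, so the function name and separator syntax here should be treated as SQLite-specific.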

Simple SQL question about getting rows and associated counts

this oughta be an easy one.
My question is very similar to this one; basically, I've got a table of posts, a table of comments with a foreign key for the post_id, and a table of votes with a foreign key for the post id. I'd like to do a single query and get back a result set containing one row per post, along with the count of associated comments and votes.
From the question I've linked to above, it seems that for getting a table back containing just a row for each post and a comment count, this is the right approach:
SELECT a.ID, a.Title, COUNT(c.ID) AS NumComments
FROM Articles a
LEFT JOIN Comments c ON c.ParentID = a.ID
GROUP BY a.ID, a.Title
I thought adding vote count would be as easy as adding another left join, as in
SELECT a.ID, a.Title, COUNT(c.ID) AS NumComments, COUNT(v.id) AS NumVotes
FROM Articles a
LEFT JOIN Comments c ON c.ParentID = a.ID
LEFT JOIN Votes v ON v.ParentID = a.ID
GROUP BY a.ID, a.Title
but I'm getting bad numbers back. What am I missing?
SELECT
a.ID,
a.Title,
COUNT(DISTINCT c.ID) AS NumComments,
COUNT(DISTINCT v.id) AS NumVotes
FROM
Articles a
LEFT JOIN Comments c ON c.ParentID = a.ID
LEFT JOIN Votes v ON v.ParentID = a.ID
GROUP BY
a.ID,
a.Title
SELECT id, title,
(
SELECT COUNT(*)
FROM comments c
WHERE c.ParentID = a.ID
) AS NumComments,
(
SELECT COUNT(*)
FROM votes v
WHERE v.ParentID = a.ID
) AS NumVotes
FROM articles a
try:
COUNT(DISTINCT c.ID) AS NumComments
You are thinking in trees, not recordsets.
In the recordset you get each Comment and each Vote returned multiple times, combined with each other. Run the query without the GROUP BY and the COUNT to see what I mean.
The solution is simple: use COUNT(DISTINCT c.ID) and COUNT(DISTINCT v.ID)
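The row multiplication described above can be reproduced in a few lines. This is a sketch on an in-memory SQLite database from Python, with the three tables reduced to the columns used in the query and one made-up article that has two comments and three votes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Articles (ID INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE Comments (ID INTEGER PRIMARY KEY, ParentID INTEGER);
    CREATE TABLE Votes (ID INTEGER PRIMARY KEY, ParentID INTEGER);
    INSERT INTO Articles VALUES (1, 'First post');
    -- two comments and three votes on the same article
    INSERT INTO Comments VALUES (1, 1), (2, 1);
    INSERT INTO Votes VALUES (1, 1), (2, 1), (3, 1);
""")

joined = """
    FROM Articles a
    LEFT JOIN Comments c ON c.ParentID = a.ID
    LEFT JOIN Votes v ON v.ParentID = a.ID
"""

# Without grouping, every comment is paired with every vote: 2 * 3 rows.
raw_row_count = conn.execute("SELECT COUNT(*) " + joined).fetchone()[0]

# Plain COUNT counts those multiplied rows for both columns.
plain = conn.execute(
    "SELECT COUNT(c.ID), COUNT(v.ID) " + joined + " GROUP BY a.ID, a.Title"
).fetchone()

# COUNT(DISTINCT ...) counts each comment and each vote once.
distinct = conn.execute(
    "SELECT COUNT(DISTINCT c.ID), COUNT(DISTINCT v.ID) " + joined + " GROUP BY a.ID, a.Title"
).fetchone()

print(raw_row_count)  # 6
print(plain)          # (6, 6)
print(distinct)       # (2, 3)
```

The correlated-subquery answer avoids the multiplication altogether, because each count is computed against its own table.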