How to join three tables having relation parent-child-child's child. And I want to access all records related to parent - sql

I have three tables:
articles(id,title,message)
comments(id,article_id,commentedUser_id,comment)
comment_likes(id, likedUser_id, comment_id, action_like, action_dislike)
I want to show comments.id, comments.commentedUser_id, comments.comment, ( Select count(action_like) where action_like="like") as likes and comment_id=comments.id where comments.article_id=article.id
Actually I want to count all action_likes that related to any comment. And show all all comments of articles.
action_likes having only two values null or like
SELECT c.id , c.CommentedUser_id , c.comment , (cl.COUNT(action_like) WHERE action_like='like' AND comment_id='c.id') as likes
FROM comment_likes as cl
LEFT JOIN comments as c ON c.id=cl.comment_id
WHERE c.article_id=article.id
It shows nothing, I know I'm doing wrong way, that was just that I want say

I guess you are looking for something like below. This will return Article/Comment wise LIKE count.
SELECT
a.id article_id,
c.id comment_id,
c.CommentedUser_id ,
c.comment ,
COUNT (CASE WHEN action_like='like' THEN 1 ELSE NULL END) as likes
FROM article a
INNER JOIN comments C ON a.id = c.article_id
LEFT JOIN comment_likes as cl ON c.id=cl.comment_id
GROUP BY a.id,c.id , c.CommentedUser_id , c.comment
IF you need results for specific Article, you can add WHERE clause before the GROUP BY section like - WHERE a.id = N

I would recommend a correlated subquery for this:
SELECT a.id as article_id, c.id as comment_id,
c.CommentedUser_id, c.comment,
(SELECT COUNT(*)
FROM comment_likes cl
WHERE cl.comment_id = c.id AND
cl.action_like = 'like'
) as num_likes
FROM article a INNER JOIN
comments c
ON a.id = c.article_id;
This is a case where a correlated subquery often has noticeably better performance than an outer aggregation, particularly with the right index. The index you want is on comment_likes(comment_id, action_like).
Why is the performance better? Most databases will implement the group by by sorting the data. Sorting is an expensive operation that grows super-linearly -- that is, twice as much data takes more than twice as long to sort.
The correlated subquery breaks the problem down into smaller pieces. In fact, no sorting should be necessary -- just scanning the index and counting the matching rows.

Related

Sum fields of an Inner join

How I can add two fields that belong to an inner join?
I have this code:
select
SUM(ACT.NumberOfPlants ) AS NumberOfPlants,
SUM(ACT.NumOfJornales) AS NumberOfJornals
FROM dbo.AGRMastPlanPerformance MPR (NOLOCK)
INNER JOIN GENRegion GR ON (GR.intGENRegionKey = MPR.intGENRegionLink )
INNER JOIN AGRDetPlanPerformance DPR (NOLOCK) ON
(DPR.intAGRMastPlanPerformanceLink =
MPR.intAGRMastPlanPerformanceKey)
INNER JOIN vwGENPredios P โ€‹โ€‹(NOLOCK) ON ( DPR.intGENPredioLink =
P.intGENPredioKey )
INNER JOIN AGRSubActivity SA (NOLOCK) ON (SA.intAGRSubActivityKey =
DPR.intAGRSubActivityLink)
LEFT JOIN (SELECT RA.intGENPredioLink, AR.intAGRActividadLink,
AR.intAGRSubActividadLink, SUM(AR.decNoPlantas) AS
intPlantasTrabajads, SUM(AR.decNoPersonas) AS NumOfJornales,
SUM(AR.decNoPlants) AS NumberOfPlants
FROM AGRRecordActivity RA WITH (NOLOCK)
INNER JOIN AGRActividadRealizada AR WITH (NOLOCK) ON
(AR.intAGRRegistroActividadLink = RA.intAGRRegistroActividadKey AND
AR.bitActivo = 1)
INNER JOIN AGRSubActividad SA (NOLOCK) ON (SA.intAGRSubActividadKey
= AR.intAGRSubActividadLink AND SA.bitEnabled = 1)
WHERE RA.bitActive = 1 AND
AR.bitActive = 1 AND
RA.intAGRTractorsCrewsLink IN(2)
GROUP BY RA.intGENPredioLink,
AR.decNoPersons,
AR.decNoPlants,
AR.intAGRAActivityLink,
AR.intAGRSubActividadLink) ACT ON (ACT.intGENPredioLink IN(
DPR.intGENPredioLink) AND
ACT.intAGRAActivityLink IN( DPR.intAGRAActivityLink) AND
ACT.intAGRSubActivityLink IN( DPR.intAGRSubActivityLink))
WHERE
MPR.intAGRMastPlanPerformanceKey IN(4) AND
DPR.intAGRSubActivityLink IN( 1153)
GROUP BY
P.vchRegion,
ACT.NumberOfFloors,
ACT.NumOfJournals
ORDER BY ACT.NumberOfFloors DESC
However, it does not perform the complete sum. It only retrieves all the values โ€‹โ€‹of the columns and adds them 1 by 1, instead of doing the complete sum of the whole column.
For example, the query returns these results:
What I expect is the final sums. In NumberOfPlants the result of the sum would be 163,237 and of NumberJornales would be 61.
How can I do this?
First of all the (nolock) hints are probably not accomplishing the benefit you hope for. It's not an automatic "go faster" option, and if such an option existed you can be sure it would be already enabled. It can help in some situations, but the way it works allows the possibility of reading stale data, and the situations where it's likely to make any improvement are the same situations where risk for stale data is the highest.
That out of the way, with that much code in the question we're better served with a general explanation and solution for you to adapt.
The issue here is GROUP BY. When you use a GROUP BY in SQL, you're telling the database you want to see separate results per group for any aggregate functions like SUM() (and COUNT(), AVG(), MAX(), etc).
So if you have this:
SELECT Sum(ColumnB) As SumB
FROM [Table]
GROUP BY ColumnA
You will get a separate row per ColumnA group, even though it's not in the SELECT list.
If you don't really care about that, you can do one of two things:
Remove the GROUP BY If there are no grouped columns in the SELECT list, the GROUP BY clause is probably not accomplishing anything important.
Nest the query
If option 1 is somehow not possible (say, the original is actually a view) you could do this:
SELECT SUM(SumB)
FROM (
SELECT Sum(ColumnB) As SumB
FROM [Table]
GROUP BY ColumnA
) t
Note in both cases any JOIN is irrelevant to the issue.

SQL issue with RIGHT JOINS

The following SQL query does exactly what it should do, but I don't know how to change it so it does what I need it to do.
SELECT
a.id AS comment_id, a.parent_comment_id, a.user_id, a.pid, a.comment, a.blocked_at, a.blocked_by, a.created_at, a.updated_at,
b.name, b.is_moderator, COUNT(IF(d.type = 1, 1, NULL)) AS a_count, COUNT(IF(d.type = 2, 1, NULL)) AS b_count
FROM
comments AS a
RIGHT JOIN
users AS b ON b.id = a.user_id
RIGHT JOIN
node_user AS c ON c.user = b.id
RIGHT JOIN
nodes AS d ON d.id = c.node
WHERE
a.pid = 999
GROUP BY
comment_id
ORDER BY
a.created_at ASC
It gets all comments belonging to a specific pid, it then RIGHT JOINS additional user data like name and is_moderator, then RIGHT JOINS any (so called) nodes including additional data based on the user id and node id. As seen in the SELECT, I count the nodes based on their type.
This works great for users that have any nodes attached to their accounts. But users who don't have any, so whose id doesn't exist in the node_user and nodes tables, won't be shown in the query results.
So my question:
How can I make it so that even users who don't have any nodes, are still shown in the query results but with an a_count and b_count of 0 or NULL.
I'm pretty sure you want left joins not right joins. You also want to fix your table aliases so they are meaningful:
SELECT . . .
FROM comments c LEFT JOIN
users u
ON u.id = c.user_id LEFT JOIN
node_user nu
ON nu.user = u.id LEFT JOIN
nodes n
ON n.id = nu.node
WHERE c.pid = 999
GROUP BY c.id
ORDER BY c.created_at ASC;
This keeps everything in the first table, regardless of whether or not rows match in the subsequent tables. That appears to be what you want.

how to count two values from three dataset

I have 3 datasets: company, post, postedited,
I want to count the numbers of companies' post and postedited. some companies post but did not edited.
here is my query :
SELECT company.name, company.id, count(*),
( select count(*)
from post, postedited
where post.id=postedited.post_id)
from company, post as p
where company.id=p.company_id
group by company_id
the outcome of post is right, but the column of postedited is the same. what's wrong with my query?
Your subquery is completely unrelated to the main query. It selects post and postedited and counts. You are showing this result for every row of the main query.
You want the subquery relate to the main query's post. So remove the post table from the subquery's from clause:
(select count(*) from postedited where postedited.post_id = p.id)
Now this subquery selects a count for the post_id of the main query's records. At last you must get the sum of the counts:
select
c.name, c.id, count(*) as posts,
sum(select count(*) from postedited pe where pe.post_id = p.id) as edits
from company c
join post p on p.company_id = c.id
group by c.id;
You can achieve the same thus:
select
c.name, c.id, count(distinct p.id) as posts, count(pe.post_id) as edits
from company c
join post p on p.company_id = c.id
left join postedited pe on pe.post_id = p.id
group by c.id;
SELECT c.name AS companyName
, c.id AS companyID
, COUNT(DISTINCT p.id) AS postCount
, COUNT(DISTINCT pe.post_id) AS postEditCount
FROM company c
LEFT OUTER JOIN post p ON p.Company_ID = c.ID
LEFT OUTER JOIN postEdited pe ON pe.Company_ID = c.ID
GROUP BY c.id, c.name
That will give you a list of all companies in your company table with a count of each of their posts and edited posts. If you need to further query against that dataset, you can. Or you can add a WHERE clause to the above query to filter it.
And I agree, please don't use comma syntax. It's very easy to produce unintended results, and it doesn't give a good representation of what you're actually querying against. Plus, it's no longer standard and being deprecated in many flavors of SQL. Good JOIN syntax will make your life much easier.

Help with Complicated SELECT query

I have this SELECT query:
SELECT Auctions.ID, Users.Balance, Users.FreeBids,
COUNT(CASE WHEN Bids.Burned=0 AND Auctions.Closed=0 THEN 1 END) AS 'ActiveBids',
COUNT(CASE WHEN Bids.Burned=1 AND Auctions.Closed=0 THEN 1 END) AS 'BurnedBids'
FROM (Users INNER JOIN Bids ON Users.ID=Bids.BidderID)
INNER JOIN Auctions
ON Bids.AuctionID=Auctions.ID
WHERE Users.ID=#UserID
GROUP BY Users.Balance, Users.FreeBids, Auctions.ID
My problam is that it returns no rows if the UserID cant be found on the Bids table.
I know it's something that has to do with my
(Users INNER JOIN Bids ON Users.ID=Bids.BidderID)
But i dont know how to make it return even if the user is no on the Bids table.
You're doing an INNER JOIN, which only returns rows if there are results on both sides of the join. To get what you want, change your WHERE clause like this:
Users LEFT JOIN Bids ON Users.ID=Bids.BidderID
You may also have to change your SELECT statement to handle Bids.Burned being NULL.
If you want to return rows even if there's no matching Auction, then you'll have to make some deeper changes to your query.
My problam is that it returns no rows if the UserID cant be found on the Bids table.
Then INNER JOIN Bids/Auctions should probably be left outer joins. The way you've written it, you're filtering users so that only those in bids and auctions appear.
Left join is the simple answer, but if you're worried about performance I'd consider re-writing it a little bit. For one thing, the order of the columns in the group by matters to performance (although it often doesn't change the results). Generally, you want to group by a column that's indexed first.
Also, it's possible to re-write this query to only have one group by, which will probably speed things up.
Try this out:
with UserBids as (
select
a.ID
, b.BidderID
, ActiveBids = count(case when b.Burned = 0 then 1 end)
, BurnedBids = count(case when b.Burned = 0 then 1 end)
from Bids b
join Auctions a
on a.ID = b.AuctionID
where a.Closed = 0
group by b.BidderID, a.AuctionID
)
select
b.ID
, u.Balance
, u.FreeBids
, b.ActiveBids
, b.BurnedBids
from Users u
left join UserBids b
on b.BidderID = u.ID
where u.ID = #UserID;
If you're not familiar with the with UserBids as..., it's called a CTE (common table expression), and is basically a way to make a one-time use view, and a nice way to structure your queries.

SQL: Speed Improvement - Left Join on cond1 or cond2

SELECT DISTINCT a.*, b.*
FROM current_tbl a
LEFT JOIN import_tbl b
ON ( a.user_id = b.user_id
OR ( a.f_name||' '||a.l_name = b.f_name||' '||b.l_name)
)
Two tables that are basically the same
I don't have access to the table structure or data input (thus no cleaning up primary keys)
Sometimes the user_id is populated in one and not the other
Sometimes names are equal, sometimes they are not
I've found that I can get the most of the data by matching on user_id or the first/last names. I'm using the ' ' between the names to avoid cases where one user has the same first name as another's last name and both are missing the other field (unlikely, but plausible).
This query runs in 33000ms, whereas individualized they are each about 200ms.
I've been up late and can't think straight right now
I'm thinking that I could do a UNION and only query by name where a user_id does not exist (the default join is the user_id, if a user_id doesn't exist then I want to join by the name)
Here is some free points to anyone that wants to help
Please don't ask for the execution plan.
Looks like you can easily avoid the string concatenation:
OR ( a.f_name||' '||a.l_name = b.f_name||' '||b.l_name)
Change it to:
OR ( a.f_name = b.f_name AND a.l_name = b.l_name)
Rather than concatenating first and last name and comparing them, try comparing them individually instead. Assuming you have them (and you should create them if you don't), this should improve your chances of using indexes on the first name and last name columns.
SELECT DISTINCT a.*, b.*
FROM current_tbl a
LEFT JOIN import_tbl b
ON ( a.user_id = b.user_id
OR (a.f_name = b.f_name and a.l_name = b.l_name)
)
If people's suggestions don't provide a major speed increase, there is a possibility that your real problem is that the best query plan for your two possible join conditions is different. For that situation you would want to do two queries and merge results in some way. This is likely to make your query much, much uglier.
One obscure trick that I have used for that kind of situation is to do a GROUP BY off of a UNION ALL query. The idea looks like this:
SELECT a_field1, a_field2, ...
MAX(b_field1) as b_field1, MAX(b_field2) as b_field2, ...
FROM (
SELECT a.field_1 as a_field1, ..., b.field1 as b_field1, ...
FROM current_tbl a
LEFT JOIN import_tbl b
ON a.user_id = b.user_id
UNION ALL
SELECT a.field_1 as a_field1, ..., b.field1 as b_field1, ...
FROM current_tbl a
LEFT JOIN import_tbl b
ON a.f_name = b.f_name AND a.l_name = b.l_name
)
GROUP BY a_field1, a_field2, ...
And now the database can do each of the two joins using the most efficient plan.
(Warning of a drawback in this approach. If a row in current_tbl joins to multiple rows in import_tbl, then you'll wind up merging data in a very odd way.)
Incidental random performance tip. Unless you have reason to believe that there are potential duplicate rows, avoid DISTINCT. It forces an implicit GROUP BY, which can be expensive.
I don't really understand why you're concatenating those strings. Seems like that's where your slowdown would be. Does this work instead?
SELECT DISTINCT a.*, b.*
FROM current_tbl a
LEFT JOIN import_tbl b
ON ( a.user_id = b.user_id
OR ( a.f_name = b.f_name AND a.l_name = b.l_name)
)
Here is Yet Another Ugly Way To Do It.
SELECT a.*
, CASE WHEN b.user_id IS NULL THEN c.field1 ELSE b.field1 END as b_field1
, CASE WHEN b.user_id IS NULL THEN c.field2 ELSE b.field2 END as b_field2
...
FROM current_tbl a
LEFT JOIN import_tbl b
ON a.user_id = b.user_id
LEFT JOIN import_tbl c
ON a.f_name = c.f_name AND a.l_name = c.l_name;
This avoids any GROUP BY, and also handles conflicting matches in a somewhat reasonable way.
Try using JOIN hints:
http://msdn.microsoft.com/en-us/library/ms173815.aspx
We were encountering the same type of behavior with one of our queries. As a last resort we added the LOOP hint, and the query ran much much faster.
It's important to note that Microsoft says this about JOIN hints:
Because the SQL Server query optimizer typically selects the best execution plan for a query, we recommend that hints, including , be used only as a last resort by experienced developers and database administrators.
my boss at my last job.. I swear.. he thought that using UNIONS was ALWAYS FASTER THAN OR.
For example.. instead of writing
Select * from employees Where Employee_id = 12 or employee_id = 47
he would write (and have me write)
Select * from employees Where employee_id = 12
UNION
Select * from employees Where employee_id = 47
SQL Sever optimizer said that this was the right thing to do in SOME situations.. I have a friend who works on the SQL Server team at Microsoft, I emailed him about this and he told me that my stats were out of date or something along those lines.
I never really got a good answer on WHY the unions are faster, it seems REALLY counter-intuitive.
I'm not recommending you DO this, but in some situations it can help.
Also two more things-- GET RID OF THE DISTINCT CLAUSE unless you absolutely need it.. n
and more importantly, you can easily get rid of the concatenation in your join, like this for example (pardon my lack of mySQL knowledge)
SELECT DISTINCT a., b.
FROM current_tbl a
LEFT JOIN import_tbl b
ON ( a.user_id = b.user_id
OR ( a.f_name = b.f_name and a.l_name = b.l_name)
)
I've had some tests at work in a similiar situation that show 10x performance improvement by getting rid of the simple concatenation in your join