How to count the number of occurrences in a self-referencing column? - sql

I am new to SQL and I've been struggling with this example for quite a while now.
I have this table:
I am asked to produce a count of the number of recommendations each member has made, ordered by number of recommendations. The final result should look something like this:
I am really confused, since the values in the recommendedby column are actually member ids. I don't know how to "recognize" them as ids rather than plain values, count how many recommendations each member has made, and "connect" the counts to the memid column to get the needed result.
So far I managed to do this:
SELECT COUNT(recommendedby)
FROM members
GROUP BY recommendedby
But I'm stuck now. I get the number of recommendations for each id, but the count is not connected to the actual id. This is my result.

I think this is what you're looking for:
select "id"
, (select count(1)
from "members"
where "recommendedby" = m."id")
from "members" m
Although subqueries are not very popular and can cause serious performance issues, this is IMHO the easiest way to see what you're doing.

You should use a self-join for this:
SELECT m.id,
count(r.id) AS recommendations
FROM members AS m
LEFT JOIN members AS r
ON r.recommendedby = m.id
GROUP BY m.id
ORDER BY recommendations;
The left join will make r.id be NULL for members that made no recommendation, and count won't count such NULL values.
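To see the NULL behaviour concretely, here is a minimal sketch that runs the self-join and the correlated subquery from the earlier answer side by side. It uses Python's built-in sqlite3 module and a made-up four-row members table (the names and values are assumptions for illustration, not the asker's data); the ordering is descending with a tie-breaker on id so the output is deterministic.

```python
import sqlite3

# Hypothetical sample data: (id, name, recommendedby).
# None means nobody recommended that member.
rows = [
    (1, "Ann", None),
    (2, "Bob", 1),
    (3, "Cid", 1),
    (4, "Dee", 2),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT, recommendedby INTEGER)"
)
conn.executemany("INSERT INTO members VALUES (?, ?, ?)", rows)

# Self-join: r holds the members that were recommended by m.
# COUNT(r.id) skips the NULL produced by the LEFT JOIN, so it yields 0.
self_join = """
    SELECT m.id, COUNT(r.id) AS recommendations
    FROM members AS m
    LEFT JOIN members AS r ON r.recommendedby = m.id
    GROUP BY m.id
    ORDER BY recommendations DESC, m.id
"""

# Correlated subquery: one count per outer row.
correlated = """
    SELECT m.id,
           (SELECT COUNT(*) FROM members WHERE recommendedby = m.id) AS recommendations
    FROM members AS m
    ORDER BY recommendations DESC, m.id
"""

join_counts = conn.execute(self_join).fetchall()
subquery_counts = conn.execute(correlated).fetchall()
print(join_counts)  # [(1, 2), (2, 1), (3, 0), (4, 0)]
```

Both queries return the same rows here; members 3 and 4 recommended nobody and still show up with a count of 0.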

There are two ways you can go about this.
Join + Group By
This is pretty simple: put the query you already wrote into a subquery and join your table with it.
SELECT members.*, rec.recommendations
FROM members
LEFT JOIN (
SELECT COUNT(recommendedby) recommendations, recommendedby
FROM members
GROUP BY recommendedby
) rec ON rec.recommendedby = members.id
Lateral
SELECT m1.*, rec.recommendations
FROM members m1,
LATERAL (
SELECT COUNT(recommendedby) recommendations
FROM members m2
WHERE m2.recommendedby = m1.id
) rec
Hope this can help.
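One detail about the Join + Group By variant: members who recommended nobody have no matching row in the grouped subquery, so the LEFT JOIN gives them NULL rather than 0. A small sketch of that, again with sqlite3 and an invented members table (sample values are assumptions), with COALESCE shown as one possible way to turn the NULL into 0:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (id INTEGER PRIMARY KEY, recommendedby INTEGER)")
# Hypothetical data: member 1 recommended 2 and 3, member 2 recommended 4.
conn.executemany(
    "INSERT INTO members VALUES (?, ?)",
    [(1, None), (2, 1), (3, 1), (4, 2)],
)

query = """
    SELECT m.id,
           rec.recommendations,               -- NULL when there is no match
           COALESCE(rec.recommendations, 0)   -- 0 instead of NULL
    FROM members AS m
    LEFT JOIN (
        SELECT recommendedby, COUNT(*) AS recommendations
        FROM members
        WHERE recommendedby IS NOT NULL
        GROUP BY recommendedby
    ) AS rec ON rec.recommendedby = m.id
    ORDER BY m.id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

SQLite has no LATERAL, so the second variant is not shown here; in PostgreSQL the LATERAL subquery always returns exactly one row per member, so it yields 0 directly.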

Related

Sum fields of an Inner join

How can I sum two fields that come from an inner join?
I have this code:
select
SUM(ACT.NumberOfPlants ) AS NumberOfPlants,
SUM(ACT.NumOfJornales) AS NumberOfJornals
FROM dbo.AGRMastPlanPerformance MPR (NOLOCK)
INNER JOIN GENRegion GR ON (GR.intGENRegionKey = MPR.intGENRegionLink )
INNER JOIN AGRDetPlanPerformance DPR (NOLOCK) ON
(DPR.intAGRMastPlanPerformanceLink =
MPR.intAGRMastPlanPerformanceKey)
INNER JOIN vwGENPredios P (NOLOCK) ON ( DPR.intGENPredioLink =
P.intGENPredioKey )
INNER JOIN AGRSubActivity SA (NOLOCK) ON (SA.intAGRSubActivityKey =
DPR.intAGRSubActivityLink)
LEFT JOIN (SELECT RA.intGENPredioLink, AR.intAGRActividadLink,
AR.intAGRSubActividadLink, SUM(AR.decNoPlantas) AS
intPlantasTrabajads, SUM(AR.decNoPersonas) AS NumOfJornales,
SUM(AR.decNoPlants) AS NumberOfPlants
FROM AGRRecordActivity RA WITH (NOLOCK)
INNER JOIN AGRActividadRealizada AR WITH (NOLOCK) ON
(AR.intAGRRegistroActividadLink = RA.intAGRRegistroActividadKey AND
AR.bitActivo = 1)
INNER JOIN AGRSubActividad SA (NOLOCK) ON (SA.intAGRSubActividadKey
= AR.intAGRSubActividadLink AND SA.bitEnabled = 1)
WHERE RA.bitActive = 1 AND
AR.bitActive = 1 AND
RA.intAGRTractorsCrewsLink IN(2)
GROUP BY RA.intGENPredioLink,
AR.decNoPersons,
AR.decNoPlants,
AR.intAGRAActivityLink,
AR.intAGRSubActividadLink) ACT ON (ACT.intGENPredioLink IN(
DPR.intGENPredioLink) AND
ACT.intAGRAActivityLink IN( DPR.intAGRAActivityLink) AND
ACT.intAGRSubActivityLink IN( DPR.intAGRSubActivityLink))
WHERE
MPR.intAGRMastPlanPerformanceKey IN(4) AND
DPR.intAGRSubActivityLink IN( 1153)
GROUP BY
P.vchRegion,
ACT.NumberOfFloors,
ACT.NumOfJournals
ORDER BY ACT.NumberOfFloors DESC
However, it does not compute the complete sum. It returns the individual values of the columns row by row instead of a single total for each column.
For example, the query returns these results:
What I expect are the final totals: 163,237 for NumberOfPlants and 61 for NumberOfJornals.
How can I do this?
First of all the (nolock) hints are probably not accomplishing the benefit you hope for. It's not an automatic "go faster" option, and if such an option existed you can be sure it would be already enabled. It can help in some situations, but the way it works allows the possibility of reading stale data, and the situations where it's likely to make any improvement are the same situations where risk for stale data is the highest.
That out of the way, with that much code in the question we're better served with a general explanation and solution for you to adapt.
The issue here is GROUP BY. When you use a GROUP BY in SQL, you're telling the database you want to see separate results per group for any aggregate functions like SUM() (and COUNT(), AVG(), MAX(), etc).
So if you have this:
SELECT Sum(ColumnB) As SumB
FROM [Table]
GROUP BY ColumnA
You will get a separate row per ColumnA group, even though it's not in the SELECT list.
If you don't really care about that, you can do one of two things:
1. Remove the GROUP BY. If there are no grouped columns in the SELECT list, the GROUP BY clause is probably not accomplishing anything important.
2. Nest the query. If option 1 is somehow not possible (say, the original is actually a view), you could do this:
SELECT SUM(SumB)
FROM (
SELECT Sum(ColumnB) As SumB
FROM [Table]
GROUP BY ColumnA
) t
Note in both cases any JOIN is irrelevant to the issue.
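The difference between the grouped result and the two fixes can be checked on a toy table. This is only a sketch: the table t and its columns column_a and column_b are stand-ins for [Table], ColumnA and ColumnB, and it runs on SQLite through Python's sqlite3 module rather than on SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (column_a TEXT, column_b INTEGER)")
# Hypothetical rows: two in group 'x', one in group 'y'.
conn.executemany("INSERT INTO t VALUES (?, ?)", [("x", 10), ("x", 5), ("y", 7)])

# With GROUP BY: one row per column_a value, even though column_a is not selected.
per_group = conn.execute(
    "SELECT SUM(column_b) FROM t GROUP BY column_a ORDER BY column_a"
).fetchall()

# Option 1: drop the GROUP BY to get a single total.
total = conn.execute("SELECT SUM(column_b) FROM t").fetchone()[0]

# Option 2: keep the grouped query and sum its rows in an outer query.
nested = conn.execute(
    """
    SELECT SUM(sum_b)
    FROM (SELECT SUM(column_b) AS sum_b FROM t GROUP BY column_a) AS s
    """
).fetchone()[0]

print(per_group, total, nested)  # [(15,), (7,)] 22 22
```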

Subtracting values of columns from two different tables

I would like to take values from one table column and subtract those values from another column from another table.
I was able to achieve this by joining those tables and then subtracting both columns from each other.
Data from first table:
SELECT max_participants FROM courses ORDER BY id;
Data from second table:
SELECT COUNT(id) FROM participations GROUP BY course_id ORDER BY course_id;
Here is some code:
SELECT max_participants - participations AS free_places FROM
(
SELECT max_participants, COUNT(participations.id) AS participations
FROM courses
INNER JOIN participations ON participations.course_id = courses.id
GROUP BY courses.max_participants, participations.course_id
ORDER BY participations.course_id
) AS course_places;
In general it works, but I was wondering whether there is a way to make it simpler, or whether my approach is incorrect and the code will fail under some conditions. Maybe it needs to be optimized.
I have read that you should not rely on the natural order of a result set in a database, and that is what raised my doubts.
If you want the values per course, I would recommend:
SELECT c.id, (c.max_participants - COUNT(p.id)) AS free_places
FROM courses c LEFT JOIN
participations p
ON p.course_id = c.id
GROUP BY c.id, c.max_participants
ORDER BY 1;
Note the LEFT JOIN to be sure all courses are included, even those with no participants.
The overall number is a little trickier. One method is to use the above as a subquery. Alternatively, you can pre-aggregate each table:
select c.max_participants - p.num_participants
from (select sum(max_participants) as max_participants from courses) c cross join
(select count(*) as num_participants from participations) p;
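A quick way to convince yourself that the LEFT JOIN matters is to run both join types on a small data set with one empty course. The sketch below uses sqlite3 with invented rows (three courses, three participations); only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE courses (id INTEGER PRIMARY KEY, max_participants INTEGER)")
conn.execute("CREATE TABLE participations (id INTEGER PRIMARY KEY, course_id INTEGER)")
# Hypothetical data: course 3 has no participants at all.
conn.executemany("INSERT INTO courses VALUES (?, ?)", [(1, 10), (2, 5), (3, 8)])
conn.executemany("INSERT INTO participations VALUES (?, ?)", [(1, 1), (2, 1), (3, 2)])

per_course_sql = """
    SELECT c.id, c.max_participants - COUNT(p.id) AS free_places
    FROM courses c
    {join} JOIN participations p ON p.course_id = c.id
    GROUP BY c.id, c.max_participants
    ORDER BY c.id
"""
left_join = conn.execute(per_course_sql.format(join="LEFT")).fetchall()
inner_join = conn.execute(per_course_sql.format(join="INNER")).fetchall()

# Overall number: aggregate each table separately, then combine the two single rows.
overall = conn.execute(
    """
    SELECT c.max_participants - p.num_participants
    FROM (SELECT SUM(max_participants) AS max_participants FROM courses) c
    CROSS JOIN (SELECT COUNT(*) AS num_participants FROM participations) p
    """
).fetchone()[0]

print(left_join)   # [(1, 8), (2, 4), (3, 8)]
print(inner_join)  # [(1, 8), (2, 4)]  course 3 is silently dropped
print(overall)     # 20
```

Selecting c.id and ordering by it explicitly also removes any reliance on the natural order of the result set.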

Microsoft Access SQL query count distinct

I am trying to create a query in MS Access SQL that performs two separate count functions, and have drafted the code below:
SELECT DISTINCT A.Name, Count(A.Name) AS X, Count(b.Address) AS Y
FROM PEOPLE AS A INNER JOIN PEOPLE Sub AS b ON A.PID = b.PID
GROUP BY A.Nam
The problem with this query is that both count functions return the total number of addresses for each name, whereas I want the first one to return a count of names. I would be grateful if someone could advise how to amend this code so that the first count is a count distinct.
Thanks
Nick
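Your query is wrong: the GROUP BY must use A.Name. I assume "A.Nam" is a copy-paste error; if it is not, fix it.
What happens if you run it without DISTINCT? You could also try SUM instead of COUNT.
The DISTINCT in your query is redundant because of the GROUP BY clause.
Furthermore, it is not clear whether your 'People sub' refers to another table or is a self-join. The following code should work in databases that support COUNT(DISTINCT ...); Access SQL does not accept that syntax, so there you would need to count over a subquery that selects the distinct values first: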
SELECT P.Name
, COUNT(*) AS X
, COUNT(DISTINCT A.Address) AS Y
FROM PEOPLE AS P
INNER JOIN ADDRESS AS A ON A.PID = P.PID
GROUP BY P.Name
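Since COUNT(DISTINCT ...) is not available everywhere, it may help to see it next to the subquery form that only needs SELECT DISTINCT and a plain COUNT. This is a sketch under assumptions: it runs on SQLite via Python's sqlite3 module, not on Access, and the people and address tables and their rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (pid INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE address (pid INTEGER, address TEXT)")
# Hypothetical data: two different people are called Nick.
conn.executemany("INSERT INTO people VALUES (?, ?)", [(1, "Nick"), (2, "Nick"), (3, "Ana")])
conn.executemany(
    "INSERT INTO address VALUES (?, ?)",
    [(1, "A St"), (1, "B St"), (2, "A St"), (3, "C St")],
)

# Plain COUNT counts joined rows; COUNT(DISTINCT ...) counts unique values.
direct = conn.execute(
    """
    SELECT p.name, COUNT(*) AS x, COUNT(DISTINCT a.address) AS y
    FROM people AS p
    INNER JOIN address AS a ON a.pid = p.pid
    GROUP BY p.name
    ORDER BY p.name
    """
).fetchall()

# Same distinct count without COUNT(DISTINCT): de-duplicate first, then count.
via_subquery = conn.execute(
    """
    SELECT d.name, COUNT(*) AS y
    FROM (
        SELECT DISTINCT p.name, a.address
        FROM people AS p
        INNER JOIN address AS a ON a.pid = p.pid
    ) AS d
    GROUP BY d.name
    ORDER BY d.name
    """
).fetchall()

print(direct)        # [('Ana', 1, 1), ('Nick', 3, 2)]
print(via_subquery)  # [('Ana', 1), ('Nick', 2)]
```

The second query uses only constructs that Access is generally understood to accept, though the exact syntax there may need adjusting.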

Is it possible to use AND with three tables and using left join/inner join

I am facing a very strange problem. I hope you brilliant guys will enjoy it (or maybe it's an easy task for you :) ).
Here is the query I am using to return values from three tables
select listing.*
, sum(review.rNumber) as nor
, count(review.rNumber) as total
, users.username from listing
where
left join review on listing.lid=review.lID
inner join users on users.uid=listing.cuid
group by listing.lid
Now I want to add a filter to this query. It currently returns all rows from the listing table, but I want to return only those matching WHERE cat='Hair' or something similar.
I don't have any idea how to insert a WHERE clause into this query. Please let me know if it is doable.
Thanks
If you want the filter to apply before the group by:
select listing.*, sum(review.rNumber) as nor, count(review.rNumber) as total, users.username from listing
left join review on listing.lid=review.lID
inner join users on users.uid=listing.cuid
where cat='Hair'
group by listing.lid
If you want it after, you use "having" instead:
select listing.*, sum(review.rNumber) as nor, count(review.rNumber) as total, users.username from listing
left join review on listing.lid=review.lID
inner join users on users.uid=listing.cuid
group by listing.lid
having cat='Hair'
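The practical difference is easiest to see when the HAVING condition uses an aggregate, which WHERE cannot do. The following sketch uses sqlite3 with made-up listing and review rows (the users table is left out to keep it short, and the HAVING condition on the review count is my own example, not part of the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE listing (lid INTEGER PRIMARY KEY, cat TEXT)")
conn.execute("CREATE TABLE review (lid INTEGER, rNumber INTEGER)")
# Hypothetical data: listing 2 has no reviews.
conn.executemany("INSERT INTO listing VALUES (?, ?)", [(1, "Hair"), (2, "Hair"), (3, "Nails")])
conn.executemany("INSERT INTO review VALUES (?, ?)", [(1, 5), (1, 3), (3, 4)])

# WHERE filters rows before grouping: only 'Hair' listings are aggregated.
where_rows = conn.execute(
    """
    SELECT l.lid, SUM(r.rNumber) AS nor, COUNT(r.rNumber) AS total
    FROM listing l
    LEFT JOIN review r ON l.lid = r.lid
    WHERE l.cat = 'Hair'
    GROUP BY l.lid
    ORDER BY l.lid
    """
).fetchall()

# HAVING filters groups after aggregation: keep listings with at least one review.
having_rows = conn.execute(
    """
    SELECT l.lid, SUM(r.rNumber) AS nor, COUNT(r.rNumber) AS total
    FROM listing l
    LEFT JOIN review r ON l.lid = r.lid
    GROUP BY l.lid
    HAVING COUNT(r.rNumber) >= 1
    ORDER BY l.lid
    """
).fetchall()

print(where_rows)   # [(1, 8, 2), (2, None, 0)]
print(having_rows)  # [(1, 8, 2), (3, 4, 1)]
```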
Your query should be as follows.
select listing.lid
, sum(review.rNumber) as nor
, count(review.rNumber) as total
, users.username from listing
left join review on listing.lid=review.lID
inner join users on users.uid=listing.cuid
where
cat='Hair'
group by listing.lid,users.username
First of all, the clauses must appear in this order:
Select
From
Where
Group By
Second, if you use GROUP BY, you may not select any column that is neither inside an aggregate function nor listed in the GROUP BY. Therefore, listing.* cannot be used. If you need those values, put the columns in both the GROUP BY and the SELECT.

Aggregate query with subquery (SUM)

I have the following query:
SELECT UserId, (
0.099 *
(
CASE WHEN
(SELECT AcceleratedProfitPercentage FROM CustomGroups cg
INNER JOIN UserCustomGroups ucg ON ucg.CustomGroupId = cg.Id
WHERE Packs.UserCustomGroupId = ucg.Id)
IS NOT NULL THEN
((SELECT AcceleratedProfitPercentage FROM CustomGroups cg
INNER JOIN UserCustomGroups ucg ON ucg.CustomGroupId = cg.Id
WHERE Packs.UserCustomGroupId = ucg.Id)*1.0) / (100*1.0)
ELSE 1
END
)
)
As amount
FROM Packs WHERE Id IN (
SELECT ap.Id FROM Packs ap JOIN Users u ON ap.UserId = u.UserId
WHERE ap.MoneyToReturn > ap.MoneyReturned AND
u.Mass LIKE '1%');
This produces the correct output. However, I have no idea how to aggregate it properly. I tried a standard GROUP BY, but I get the error (Column 'Packs.UserCustomGroupId' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause). Any ideas? Here is the output I currently get:
I want to aggregate it by UserId. Thanks in advance.
The option that involves the least query-rewriting is to drop your existing query into a CTE or temp table, like so:
; with CTE as (MyQueryHere)
Select UserID, sum(amount)
from CTE
Group by UserID
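As a rough illustration of the wrapping idea, the sketch below puts a simplified per-row query inside a CTE and aggregates outside it. It runs on SQLite through Python's sqlite3 module; the Packs table here has a single made-up Pct column standing in for the AcceleratedProfitPercentage lookup, so the inner query is an assumption and much simpler than the original.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Packs (Id INTEGER PRIMARY KEY, UserId INTEGER, Pct INTEGER)")
# Hypothetical data: Pct is NULL when the pack has no custom group.
conn.executemany(
    "INSERT INTO Packs VALUES (?, ?, ?)",
    [(1, 10, 50), (2, 10, None), (3, 20, 200)],
)

rows = conn.execute(
    """
    WITH cte AS (
        -- Stand-in for the original per-row query: one amount per pack.
        SELECT UserId, 0.099 * COALESCE(Pct / 100.0, 1) AS amount
        FROM Packs
    )
    SELECT UserId, SUM(amount) AS total
    FROM cte
    GROUP BY UserId
    ORDER BY UserId
    """
).fetchall()

totals = dict(rows)
print(totals)
```

The inner query never needs a GROUP BY of its own, which is why the error about Packs.UserCustomGroupId goes away.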
Wow that is one crazy query you've got going on there.
Try this:
SELECT P.UserId,
0.099 * SUM(t.Amount) AS [Amount SUM]
FROM Packs P
JOIN Users U
ON P.UserID = U.UserID
LEFT OUTER JOIN UserCustomGroups UCG
ON P.UserCustomGroupID = UCG.ID
LEFT OUTER JOIN CustomGroups CG
ON UCG.CustomGroupID = CG.ID
CROSS APPLY
(
SELECT CASE WHEN CG.ID IS NULL
THEN 1
ELSE CG.AcceleratedProfitPercentage / 100.0
END AS [Amount]
) t
WHERE P.MoneyToReturn > P.MoneyReturned
AND U.Mass LIKE '1%'
GROUP BY P.UserId
First, multiplying by 1.0 does not change the value, yet I see it twice in your original post. It only matters if the column is an integer type and the goal was to avoid integer division, and in that case dividing by 100.0 does the same job.
Also, using CROSS APPLY eliminates the need to repeat your subquery. Granted, it may be slower (since it runs for every row returned), but I think it makes sense in this case. Using left outer joins instead of CASE - SELECT - IS NULL makes your query more efficient and much more readable.
Next, it appears that you are attempting to SUM percentages. I can't think of any practical reason to do that. I'm not sure what kind of data you're looking to return, but perhaps AVG would be more appropriate?
Lastly, APH's answer will most certainly work (assuming your original query works), but given the obfuscation and inefficiency of your query, I would definitely rewrite it.
Please let me know if you have any questions.