How do you explicitly show rows which have count(*) equal to 0 - sql

The query I'm running in DB2
select yrb_customer.name,
yrb_customer.city,
CASE count(*) WHEN 0 THEN 0 ELSE count(*) END as #UniClubs
from yrb_member, yrb_customer
where yrb_member.cid = yrb_customer.cid and yrb_member.club like '%Club%'
group by yrb_customer.name, yrb_customer.city order by count(*)
Shows me people which are part of clubs which has the word 'Club' in it, and it shows how many such clubs they are part of (#UniClubs) along with their name and City. However for students who are not part of such a club, I would still like for them to show up but just have 0 instead of them being hidden which is what's happening right now. I cannot get this functionality with count(*). Can somebody shed some light? I can explain further if the above is not clear enough.

I'm not familiar with DB2 so I'm taking a stab in the dark, but try this:
select yrb_customer.name,
yrb_customer.city,
CASE WHEN yrb_member.club like '%Club% THEN count(*) ELSE 0 END as #UniClubs
from yrb_member, yrb_customer
where yrb_member.cid = yrb_customer.cid
group by yrb_customer.name, yrb_customer.city order by count(*)
Basically you don't want to filter for %Club% in your WHERE clause because you want ALL rows to come back.

You're going to want a LEFT JOIN:
SELECT yrb_customer.name, yrb_customer.city,
COUNT(yrb_member.club) as clubCount
FROM yrb_customer
LEFT JOIN yrb_member
ON yrb_member.cid = yrb_customer.cid
AND yrb_member.club LIKE '%Club%
GROUP BY yrb_customer.name, yrb_customer.city
ORDER BY clubCount
Also, if the tuple (yrb_customer.name, yrb_customer.city) is unique (or is supposed to be - are you counting all students with the same name as the same person?), you might get better performance out of the following:
SELECT yrb_customer.name, yrb_customer.city,
COALESCE(club.count, 0)
FROM yrb_customer
LEFT JOIN (SELECT cid, COUNT(*) as count
FROM yrb_member
WHERE club LIKE '%Club%
GROUP BY cid) club
ON club.cid = yrb_customer.cid
ORDER BY club.count
The reason that your original results were being hidden was because in your original query, you have an implicit inner join, which of course requires matching rows. The implicit-join syntax (comma-separated FROM clause) is great for inner (regular) joins, but is terrible for left-joins, which is what you really needed. The use of the implicit-join syntax (and certain types of related filtering in the WHERE clause) is considered deprecated.

Related

COUNT is outputting more than one row

I am having a problem with my SQL query using the count function.
When I don't have an inner join, it counts 55 rows. When I add the inner join into my query, it adds a lot to it. It suddenly became 102 rows.
Here is my SQL Query:
SELECT COUNT([fmsStage].[dbo].[File].[FILENUMBER])
FROM [fmsStage].[dbo].[File]
INNER JOIN [fmsStage].[dbo].[Container]
ON [fmsStage].[dbo].[File].[FILENUMBER] = [fmsStage].[dbo].[Container].[FILENUMBER]
WHERE [fmsStage].[dbo].[File].[RELATIONCODE] = 'SHIP02'
AND [fmsStage].[dbo].[Container].DELIVERYDATE BETWEEN '2016-10-06' AND '2016-10-08'
GROUP BY [fmsStage].[dbo].[File].[FILENUMBER]
Also, I have to do TOP 1 at the SELECT statement because it returns 51 rows with random numbers inside of them. (They are probably not random, but I can't figure out what they are.)
What do I have to do to make it just count the rows from [fmsStage].[dbo].[file].[FILENUMBER]?
First, your query would be much clearer like this:
SELECT COUNT(f.[FILENUMBER])
FROM [fmsStage].[dbo].[File] f INNER JOIN
[fmsStage].[dbo].[Container] c
ON v.[FILENUMBER] = c.[FILENUMBER]
WHERE f.[RELATIONCODE] = 'SHIP02' AND
c.DELIVERYDATE BETWEEN '2016-10-06' AND '2016-10-08';
No GROUP BY is necessary. Otherwise you'll just one row per file number, which doesn't seem as useful as the overall count.
Note: You might want COUNT(DISTINCT f.[FILENUMBER]). Your question doesn't provide enough information to make a judgement.
Just remove GROUP BY Clause
SELECT COUNT([fmsStage].[dbo].[File].[FILENUMBER])
FROM [fmsStage].[dbo].[File]
INNER JOIN [fmsStage].[dbo].[Container]
ON [fmsStage].[dbo].[File].[FILENUMBER] = [fmsStage].[dbo].[Container].[FILENUMBER]
WHERE [fmsStage].[dbo].[File].[RELATIONCODE] = 'SHIP02'
AND [fmsStage].[dbo].[Container].DELIVERYDATE BETWEEN '2016-10-06' AND '2016-10-08'

Count of how many times id occurs in table SQL regexp

Hi I have a redshift table of articles that has a field on it that can contain many accounts. So there is a one to many relationship between articles to accounts.
However I want to create a new view where it lists the partner id's in one column and in another column a count of how many times the partner id appears in the articles table.
I've attempted to do this using regex and created a new redshift view, but am getting weird results where it doesn't always build properly. So one day it will say a partner appears 15 times, then the next 17, then the next 15, when the partner id count hasn't actually changed.
Any help would be greatly appreciated.
SELECT partner_id,
COUNT(DISTINCT id)
FROM (SELECT id,
partner_ids,
SPLIT_PART(partner_ids,',',i) partner_id
FROM positron_articles a
LEFT JOIN util.seq_0_to_500 s
ON s.i < regexp_count (partner_ids,',') + 2
OR s.i = 1
WHERE i > 0
AND regexp_count (partner_ids,',') = 0
ORDER BY id)
GROUP BY 1;
Let's start with some of the more obvious things and see if we can start to glean other information.
Next GROUP BY 1 on your outer query needs to be GROUP BY partner_id.
Next you don't need an order by in your INNER query and the database engine will probably do a better job optimizing performance without it so remove ORDER BY id.
If you want your final results to be ordered then add an ORDER BY partner_id or similar clause after your group by of your OUTER query.
It looks like there are also problems with how you are splitting a partnerid from partnerids but I am not positive about that because I need to understand your view and the data it provides to know how that affects your record count for partnerid.
Next your LEFT JOIN statement on the util.seq_0_to_500 I am pretty sure you can drop off the s.i = 1 as the first condition will satisfy that as well because 2 is greater than 1. However your left join really acts more like an inner join because you then exclude any non matches from positron_articles that don't have a s.i > 0.
Oddly then your entire join and inner query gets kind of discarded because you only want articles that have no commas in their partnerids: regexp_count (partner_ids,',') = 0
I would suggest posting the code for your util.seq_0_to_500 and if you have a partner table let use know about that as well because you can probably get your answer a lot easier with that additional table depending on how regexp_count works. I suspect regex_count(partnerids,partnerid) exampleregex_count('12345,678',1234) will return greater than 0 at which point you have no choice but to split the delimited strings into another table before counting or building a new matching function.
If regex_count only matches exact between commas and you have a partner table your query could be as easy as this:
SELECT
p.partner_id
,COUNT(a.id) AS ArticlesAppearedIn
FROM
positron_articles a
LEFT JOIN PARTNERTABLE p
ON regexp_count(a.partnerids,p.partnerid) > 0
GROUP BY
p.partner_id
I will actually correct myself as I just thought of a way to join a partner table without regexp_count. So if you have a partner table this might work for you. If not you will need to split strings. It basically tests to see if the partnerid is the entire partnerids, at the beginning, in the middle, or at the end of partnerids. If one of those is met then the records is returned.
SELECT
p.partner_id
,COUNT(a.id) AS ArticlesAppearedIn
FROM
PARTNERTABLE p
INNER JOIN positron_articles a
ON
(
CASE
WHEN a.partnerids = CAST(p.partnerid AS VARCHAR(100)) THEN 1
WHEN a.partnerids LIKE p.partnerid + ',%' THEN 1
WHEN a.partnerids LIKE '%,' + p.partnerid + ',%' THEN 1
WHEN a.partnerids LIKE '%,' + p.partnerid THEN 1
ELSE 0
END
) = 1
GROUP BY
p.partner_id

SQL - Counting only one table's entries in joint query

This seems like a regular thing to do, but I can't seem to find how to do it.
I have a join query
SELECT a.nom_batim, COUNT(b.maxten) AS NumFaulty
FROM tblTrials AS b, tblRooms AS a
WHERE b.batiment = a.batiment
AND b.maxten > 10
GROUP BY a.nom_batim
ORDER BY a.nom_batim
that should only return a count of the tblTrials entries. However, since I don't know how to code that, it's currently counting all occurances of b.maxten > 10, TIMES all occurances of b.batiment = a.batiment. I have 1 actual occurance of b.maxten > 10 in the table, but 231 occurances of b.batiment = a.batiment (the tables are set up badly, not my choice; these tables are considered read-only to me), so it returns a count of 231.
How do I COUNT(b.maxten) correctly, but still display a.nom_batim as a user-friendly representation of the batiment ID field? (a.nom_batim is the long name for the building #batiment)
UPDATE
This is what I ended up doing so far..
SELECT a.nom_batim, COUNT(b.batiment) AS NumFaulty
FROM (SELECT DISTINCT nom_batim, batiment FROM tblRooms) AS a
INNER JOIN tblTrials AS b ON a.batiment = b.batiment
WHERE b.maxten > 10
GROUP BY a.nom_batim
ORDER BY a.nom_batim
It works but seems like a resource hog when I only need max ~30 values from tblRooms, but have to query all 5000+ rows selecting only distinct batiment values. Is there no way to do this without having a batiment table tblBatiment: batiment, nom_batim I know it's the best way but I don't have the access.
You can perform the count in a sub-query so it only applies to the one table's records:
SELECT ..
FROM (SELECT batiment, COUNT(maxten) FROM tblTrials WHERE maxten > 10) AS b
,tblRooms AS a
...
Otherwise, the count is applying across all records in the final result, because the query engine doesn't differentiate between records coming from one place or another in a COUNT.
Going back to your original query, you can get what you want if you have an identity column on the tblTrials table:
SELECT a.nom_batim, COUNT(distinct b.id) AS NumFaulty
FROM tblTrials b INNER JOIN tblRooms a
ON b.batiment = a.batiment
WHERE b.maxten > 10
GROUP BY a.nom_batim
ORDER BY a.nom_batim
I also replaced your join syntax with the correct join syntax (using the "join" keyword).
Try this:
SELECT a.nom_batim, COUNT(b.maxten) AS NumFaulty
FROM tblTrials AS b, tblRooms AS a
WHERE b.batiment = a.batiment
GROUP BY a.nom_batim
HAVING count(b.maxten) > 10
ORDER BY a.nom_batim
The best I could do so far which works:
SELECT a.nom_batim AS Building, Count(q.batiment) AS Fixes
FROM (SELECT DISTINCT nom_batim, batiment FROM tblRooms) AS a
INNER JOIN tblTrials AS q
ON a.batiment = q.batiment
WHERE q.maxten > 10
GROUP BY a.nom_batim
Seems like the SELECT DISTINCT nom_batim, batiment FROM tblRooms would be slow considering tblTrials may contain 60k entries and tblRooms may contain 10k entries.. but the entries aren't input yet so I can't really test it. Gordon made a good point that if it's returning the same stuff, it's likely ~the same speed. I do have multi-field primary keys so it might help things along as well (not as much as an ID field perhaps but what can you do).
Thanks to others that answered.
Try to use:
HAVING b.maxten>10

SQL SELECT query with JOIN, SUM and GROUP BY

I have 5 tables in a MS Access databse: tblMember, tblPoint, tblRace, tblRaceType and tblResult. (All of which have primary keys.)
tblPoint contains (RaceTypeID, Position, Points) fields.
What I want to do is look at all the races that the members participated in, see what position they came (stored in tblResult) and see if those positions score points (as defined in tblPoint). I then want to add up all the points for each member and show these, along with the member name in my query...
Is this possible? I came up with my best shot at this SQL query below:
SELECT Sum(tblPoint.Points) AS SumOfPoints, Count(tblRace.RaceID) AS CountOfRaceID,
tblMember.MemberName, tblPoint.Points
FROM ((tblRaceType INNER JOIN tblPoint ON tblRaceType.RaceTypeID = tblPoint.RaceTypeID)
INNER JOIN tblRace ON tblRaceType.RaceTypeID = tblRace.RaceTypeID) INNER JOIN
(tblMember INNER JOIN tblResult ON tblMember.MemberID = tblResult.MemberID) ON
tblRace.RaceID = tblResult.RaceID
GROUP BY tblMember.MemberName, tblPoint.Points
ORDER BY tblPoint.Points DESC;
Is anyone able to point me in the right direction at all?
I'd say this
GROUP BY tblMember.MemberName, tblPoint.Points
ORDER BY tblPoint.Points DESC;
should probably be more like this:
GROUP BY tblMember.MemberName
ORDER BY Sum(tblPoint.Points) DESC;
Also, remove tblPoint.Points at the end of your select. This is just a single point value, you want the sum.
Grouping by points means that you'll get one row per member and point value they scored - probably not what you intended.

MySQL to PostgreSQL: GROUP BY issues

So I decided to try out PostgreSQL instead of MySQL but I am having some slight conversion problems. This was a query of mine that samples data from four tables and spit them out all in on result.
I am at a loss of how to convey this in PostgreSQL and specifically in Django but I am leaving that for another quesiton so bonus points if you can Django-fy it but no worries if you just pure SQL it.
SELECT links.id, links.created, links.url, links.title, user.username, category.title, SUM(votes.karma_delta) AS karma, SUM(IF(votes.user_id = 1, votes.karma_delta, 0)) AS user_vote
FROM links
LEFT OUTER JOIN `users` `user` ON (`links`.`user_id`=`user`.`id`)
LEFT OUTER JOIN `categories` `category` ON (`links`.`category_id`=`category`.`id`)
LEFT OUTER JOIN `votes` `votes` ON (`votes`.`link_id`=`links`.`id`)
WHERE (links.id = votes.link_id)
GROUP BY votes.link_id
ORDER BY (SUM(votes.karma_delta) - 1) / POW((TIMESTAMPDIFF(HOUR, links.created, NOW()) + 2), 1.5) DESC
LIMIT 20
The IF in the select was where my first troubles began. Seems it's an IF true/false THEN stuff ELSE other stuff END IF yet I can't get the syntax right. I tried to use Navicat's SQL builder but it constantly wanted me to place everything I had selected into the GROUP BY and that I think it all kinds of wrong.
What I am looking for in summary is to make this MySQL query work in PostreSQL. Thank you.
Current Progress
Just want to thank everybody for their help. This is what I have so far:
SELECT links_link.id, links_link.created, links_link.url, links_link.title, links_category.title, SUM(links_vote.karma_delta) AS karma, SUM(CASE WHEN links_vote.user_id = 1 THEN links_vote.karma_delta ELSE 0 END) AS user_vote
FROM links_link
LEFT OUTER JOIN auth_user ON (links_link.user_id = auth_user.id)
LEFT OUTER JOIN links_category ON (links_link.category_id = links_category.id)
LEFT OUTER JOIN links_vote ON (links_vote.link_id = links_link.id)
WHERE (links_link.id = links_vote.link_id)
GROUP BY links_link.id, links_link.created, links_link.url, links_link.title, links_category.title
ORDER BY links_link.created DESC
LIMIT 20
I had to make some table name changes and I am still working on my ORDER BY so till then we're just gonna cop out. Thanks again!
Have a look at this link GROUP BY
When GROUP BY is present, it is not
valid for the SELECT list expressions
to refer to ungrouped columns except
within aggregate functions, since
there would be more than one possible
value to return for an ungrouped
column.
You need to include all the select columns in the group by that are not part of the aggregate functions.
A few things:
Drop the backticks
Use a CASE statement instead of IF() CASE WHEN votes.use_id = 1 THEN votes.karma_delta ELSE 0 END
Change your timestampdiff to DATE_TRUNC('hour', now()) - DATE_TRUNC('hour', links.created) (you will need to then count the number of hours in the resulting interval. It would be much easier to compare timestamps)
Fix your GROUP BY and ORDER BY
Try to replace the IF with a case;
SUM(CASE WHEN votes.user_id = 1 THEN votes.karma_delta ELSE 0 END)
You also have to explicitly name every column or calculated column you use in the GROUP BY clause.