SQL GROUP BY/COUNT even if no results

SQL GROUP BY/COUNT even if no results - sql

I am attempting to get the information from one table (games) and count the entries in another table (tickets) that correspond to each entry in the first. I want each entry in the first table to be returned even if there aren't any entries in the second. My query is as follows:
SELECT g.*, count(*)
FROM games g, tickets t
WHERE (t.game_number = g.game_number
OR NOT EXISTS (SELECT * FROM tickets t2 WHERE t2.game_number=g.game_number))
GROUP BY t.game_number;
What am I doing wrong?

You need to do a left-join:
SELECT g.Game_Number, g.PutColumnsHere, count(t.Game_Number)
FROM games g
LEFT JOIN tickets t ON g.Game_Number = t.Game_Number
GROUP BY g.Game_Number, g.PutColumnsHere
Alternatively, I think this is a little clearer with a correlated subquery:
SELECT g.Game_Number, G.PutColumnsHere,
(SELECT COUNT(*) FROM Tickets T WHERE t.Game_Number = g.Game_Number) Tickets_Count
FROM Games g
Just make sure you check the query plan to confirm that the optimizer interprets this well.

You need to learn more about how to use joins in SQL:
SELECT g.*, count(*)
FROM games g
LEFT OUTER JOIN tickets t
USING (game_number)
GROUP BY g.game_number;
Note that unlike some database brands, MySQL permits you to list many columns in the select-list even if you only GROUP BY their primary key. As long as the columns in your select-list are functionally dependent on the GROUP BY column, the result is unambiguous.
Other brands of database (Microsoft, Firebird, etc.) give you an error if you list any columns in the select-list without including them in GROUP BY or in an aggregate function.

"FROM games g, tickets t" is the problem line. This performs an inner join. Any where clause can't add on to this. I think you want a LEFT OUTER JOIN.

Related

List a number of copies per book

For a school project, I should do queries on this Library model.
I've done good so far, but now I'm stuck on this question: "List a number of copies per book".
Here's what've done and not working as it should:
SELECT liv_titulo, (
SELECT count(exe_cod)
FROM exemplar
GROUP BY liv_cod
ORDER BY COUNT(exe_cod) desc) Quantidade
FROM livro INNER JOIN exemplar USING (liv_cod);

I think the technique you wanted to use is a correlated subquery:
you need a WHERE clause in the subquery so it only takes into account the liv_code from the current row in the outer query
no GROUP BY clause is needed in the subquery; it should return a scalar value (just one row and one column, that contains the count of matching rows from table examplar)
there is no need for a join in the outer query
Code:
SELECT
liv_titulo,
(
SELECT count(*)
FROM exemplar e
WHERE e.liv_cod = l.liv_cod -- correlation
) Quantidade
FROM livro l;

Avoid repeated information when having multiple joins?

I have the following query that uses joins to join multiple tables
select DISTINCT
tblArticles.Article_Title,
tblArticles.Article_img,
tblArticles.Article_Content,
tblArticles.Article_Date_Created,
tblArticles.Article_Sequence,
tblWriters.Writer_Name,
tblTypes.Article_Type_Name,
tblimages.image_path as "Extra images"
from tblArticles inner join tblWriters
on tblArticles.Writer_ID_Fkey = tblWriters.Writer_ID inner join
tblArticleType on tblArticles.Article_ID = tblArticleType.Article_ID_Fkey inner join
tblTypes on tblArticleType.Article_Type_ID_Fkey = tblTypes.Article_Type_ID left outer join tblExtraImages
on tblArticles.Article_ID = tblExtraImages.Article_ID_Fkey left outer join tblimages
on tblExtraImages.image_id_fkey = tblimages.image_id
order by tblArticles.Article_Sequence, tblArticles.Article_Date_Created;
And I get the following results:
If an article has more than one type_name then I will get repeated columns for the rest of the records. Is there another way of joining these tables that would prevent that from happening?

The simplest method is to just remove column Article_Type_Name from the select clause. This allows SELECT DISTINCT to identify the rows as duplicates, and eliminate them.
Another option is to use an aggregation function on the column. In recent SQL Server versions, STRING_AGG() comes handy (you can also use MIN() or MAX()):
select
tblArticles.Article_Title,
tblArticles.Article_img,
tblArticles.Article_Content,
tblArticles.Article_Date_Created,
tblArticles.Article_Sequence,
tblWriters.Writer_Name,
string_agg(tblTypes.Article_Type_Name, ',')
within group(order by tblTypes.Article_Type_Name) Article_Type_Name_List,
tblimages.image_path as Extra_Images
from ..
group by
tblArticles.Article_Title,
tblArticles.Article_img,
tblArticles.Article_Content,
tblArticles.Article_Date_Created,
tblArticles.Article_Sequence,
tblWriters.Writer_Name,
tblimages.image_path

What you're seeing here is a Cartesian product; you've joined Tables in such a way that multiple rows from one side match with rows from the other
If you don't care about the article_type, then group the other columns and take the max(article_type), or omit it in a subquery that selects distinct records, not including the article type column, from the table that contains article type). If your SQLS is recent enough and you want to know all the article types you could STRING_AGG them into a csv list
Ultimately what you choose to do depends on what you want them for; filter the rows out, or group them down

Semi-join vs Subqueries

What is the difference between semi-joins and a subquery? I am currently taking a course on this on DataCamp and i'm having a hard time making a distinction between the two.
Thanks in advance.

A join or a semi join is required whenever you want to combine two or more entities records based on some common conditional attributes.
Unlike, Subquery is required whenever you want to have a lookup or a reference on same table or other tables
In short, when your requirement is to get additional reference columns added to existing tables attributes then go for join else when you want to have a lookup on records from the same table or other tables but keeping the same existing columns as o/p go for subquery
Also, In case of semi join it can act/used as a subquery because most of the times we dont actually join the right table instead we mantain a check via subquery to limit records in the existing hence semijoin but just that it isnt a subquery by itself

I don't really think of a subquery and a semi-join as anything similar. A subquery is nothing more interesting than a query that is used inside another query:
select * -- this is often called the "outer" query
from (
select columnA -- this is the subquery inside the parentheses
from mytable
where columnB = 'Y'
)
A semi-join is a concept based on join. Of course, joining tables will combine both tables and return the combined rows based on the join criteria. From there you select the columns you want from either table based on further where criteria (and of course whatever else you want to do). The concept of a semi-join is when you want to return rows from the first table only, but you need the 2nd table to decide which rows to return. Example: you want to return the people in a class:
select p.FirstName, p.LastName, p.DOB
from people p
inner join classes c on c.pID = p.pID
where c.ClassName = 'SQL 101'
group by p.pID
This accomplishes the concept of a semi-join. We are only returning columns from the first table (people). The use of the group by is necessary for the concept of a semi-join because a true join can return duplicate rows from the first table (depending on the join criteria). The above example is not often referred to as a semi-join, and is not the most typical way to accomplish it. The following query is a more common method of accomplishing a semi-join:
select FirstName, LastName, DOB
from people
where pID in (select pID
from class
where ClassName = 'SQL 101'
)
There is no formal join here. But we're using the 2nd table to determine which rows from the first table to return. It's a lot like saying if we did join the 2nd table to the first table, what rows from the first table would match?
For performance, exists is typically preferred:
select FirstName, LastName, DOB
from people p
where exists (select pID
from class c
where c.pID = p.pID
and c.ClassName = 'SQL 101'
)
In my opinion, this is the most direct way to understand the semi-join. There is still no formal join, but you can see the idea of a join hinted at by the usage of directly matching the first table's pID column to the 2nd table's pID column.
Final note. The last 2 queries above each use a subquery to accomplish the concept of a semi-join.

How To Get Count of the records in sql

I have 2 tables in my database one tblNews and another tblNewsComments
I want to select 10 records from tblNewsComments than have must Comments of news
I used this query but it give an error
SELECT tblNews.id,
tblNews.newsTitle,
tblNews.createdate,
tblNews.viewcount,
COUNT(tblNewsComments.id) AS comcounts
FROM tblNews
INNER JOIN tblNewsComments ON tblNews.id = tblNewsComments.newsID
GROUP BY tblNews.id

Try to replace
GROUP BY tblNews.id
With
GROUP BY tblNews.id,
tblNews.newsTitle,
tblNews.createdate,
tblNews.viewcount
All the expressions in the SELECT list should be in the GROUP BY or inside an aggregate function.

I've always found this to be an annoyance in SQL. There's nothing logically wrong with your query; you're grouping by news item and selecting various attributes of the news item, and then selecting the count of comments linked to the news item. That makes sense.
The error arises because the SQL engine isn't smart enough to realize that all the columns in tblNews are at the same data context, and that grouping by tblNews.id effectively guarantees that there will only be one newsTitle, createdate, and viewcount for each group. It should be able to realize that, I think, and carry out the query. But it doesn't do that; the only column it considers to be unique in the group data context is the exact column that you grouped by, id.
One solution, as Multisync just posted, is to group by ALL the columns you want to include in the select clause. I don't think this is the best solution, however, as you shouldn't have to specify all those columns in the group by clause, and that would force you to keep adding to that list whenever you want to add a new TblNews column to the select clause.
The solution I've always used is to wrap the column in an ineffectual aggregate function in the select clause; I always use max():
select
tblNews.id,
max(tblNews.newsTitle),
max(tblNews.createdate),
max(tblNews.viewcount),
count(tblNewsComments.id) comcounts
from
tblNews
inner join tblNewsComments on tblNews.id=tblNewsComments.newsID
group by
tblNews.id
;

Or with subquery:
SELECT n.id,
n.newsTitle,
n.createdate,
n.viewcount,
(SELECT COUNT(*) FROM tblNewsComments c ON n.id = c.newsID) AS comcounts
FROM tblNews n

you have to select one column and group by another...other column will not work as they are not in the aggregate function.
SELECT tblNews.id, COUNT(tblNewsComments.newsID) AS comcounts
FROM tblNews
INNER JOIN tblNewsComments ON tblNews.id = tblNewsComments.newsID
GROUP BY tblNews.id
Read Here

SQL count - first time

I am learning SQL (bit by bit!) trying to perform a query on our database and adding in a count function to show the total orders that appear against a customers id by counting in a inner join query.
Somehow it is pooling all the data together onto one customer with the count function though.
Can someone please suggest where I am going wrong?
SELECT tbl_customers.*, tbl_stateprov.stprv_Name, tbl_custstate.CustSt_Destination, COUNT(order_id) as total
FROM tbl_stateprov
INNER JOIN (tbl_customers
INNER JOIN (tbl_custstate
INNER JOIN tbl_orders ON tbl_orders.order_CustomerID = tbl_custstate.CustSt_Cust_ID)
ON tbl_customers.cst_ID = tbl_custstate.CustSt_Cust_ID)
ON tbl_stateprov.stprv_ID = tbl_custstate.CustSt_StPrv_ID
WHERE tbl_custstate.CustSt_Destination='BillTo'
AND cst_LastName LIKE '#URL.Alpha#%'

You need a GROUP BY clause in this statement in order to get what you want. You need to figure out what level you want to group it by in order to select which fields to add to the group by clause. If you just wanted to see it on a per customer basis, and the customers table had an id field, it would look like this (at the very end of your sql):
GROUP BY tbl_customers.id
Now you can certainly group by more fields, it just depends how you want to slice the results.

In your select statement you are using format like tableName.ColumnName but not for COUNT(order_id)
It should be COUNT(tableOrAlias.order_id)
Hope that helps.

As you are new to SQL it might also be worth considering the readability of your joins - the nested / bracketed joins you mentioned above are quite hard to read, and I would also personally alias your tables to make the query that bit more accessible:
SELECT
tbl_customers.customer_id
,tbl_stateprov.stprv_Name
,tbl_custstate.CustSt_Destination
,COUNT(order_id) as total
FROM tbl_stateprov statep
INNER JOIN tbl_custstate state ON statep.stprv_ID = state.CustSt_StPrv_ID
INNER JOIN tbl_customers customer ON customer.cst_ID = state.CustSt_Cust_ID
INNER JOIN tbl_orders orders ON orders.order_CustomerID = state.CustSt_Cust_ID
WHERE tbl_custstate.CustSt_Destination='BillTo'
AND cst_LastName LIKE '#URL.Alpha#%'
GROUP BY
tbl_customers.customer_id
,tbl_stateprov.stprv_Name
,tbl_custstate.CustSt_Destination
--And any other columns you want to include the count for

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

SQL GROUP BY/COUNT even if no results - sql

"FROM games g, tickets t" is the problem line. This performs an inner join. Any where clause can't add on to this. I think you want a LEFT OUTER JOIN.

Related

List a number of copies per book

Avoid repeated information when having multiple joins?

Semi-join vs Subqueries

How To Get Count of the records in sql

SQL count - first time

Categories

Resources