Why does adding GROUP BY cause a seemingly unrelated error? - sql

The following code works fine:
SELECT name, (SELECT count(item_id) FROM bids WHERE item_id = items.id)
FROM items;
However, when I add GROUP BY name:
SELECT name, (SELECT count(item_id) FROM bids WHERE item_id = items.id)
FROM items
GROUP BY name;
I get ERROR: subquery uses ungrouped column "items.id" from outer query
Can anyone tell me why this is happening? Thanks!

If you GROUP BY name then any other columns you select from items must have an aggregate function applied. That's what GROUP BY means.
In your case, you are using another column from items -- id -- in a correlated scalar subquery. That's not an aggregate function, and id is not in the GROUP BY clause, so you get an error.
You could instead GROUP BY name, id. That should give you the same results as the first query, and is probably pointless.
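To check the claim that GROUP BY name, id returns the same rows as the ungrouped query, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for Postgres. The sample rows are made up for illustration. SQLite is more permissive than Postgres about ungrouped columns, so this only demonstrates that the results match, not the error itself.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bids  (item_id INTEGER);
    INSERT INTO items VALUES (1, 'lamp'), (2, 'chair'), (3, 'lamp');
    INSERT INTO bids  VALUES (1), (1), (2), (3);
""")

# The original query without GROUP BY: one output row per item.
ungrouped = conn.execute("""
    SELECT name, (SELECT count(item_id) FROM bids WHERE item_id = items.id)
    FROM items
""").fetchall()

# Grouping by id as well makes every group a single item,
# so the correlated subquery only references grouped columns.
grouped = conn.execute("""
    SELECT name, (SELECT count(item_id) FROM bids WHERE item_id = items.id)
    FROM items
    GROUP BY name, id
""").fetchall()

print(sorted(ungrouped))
print(sorted(grouped))
```

Both queries still return two separate 'lamp' rows, which is why grouping by name, id is probably pointless here.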
If you actually have multiple rows in items with the same value for name, and you want to group the results of the scalar subquery for those values, you need to specify how to group them. Perhaps you want the total of the subquery results for each value of name. If so, I think you could do:
SELECT name, SUM((SELECT count(item_id) FROM bids WHERE item_id = items.id))
FROM items
GROUP BY name;
(I'm not positive about the specific syntax as I don't have a Postgres instance to test against.)
A clearer way to express it might be:
SELECT name, SUM(bid_count)
FROM (
SELECT name, (SELECT count(item_id) FROM bids WHERE item_id = items.id) AS bid_count
FROM items
) AS item_counts
GROUP BY name;
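The derived-table version can be tried out with a small sketch like the one below. It runs on SQLite through Python's sqlite3 module rather than Postgres, and the sample rows and the item_counts alias are assumptions added for illustration.

```python
import sqlite3

# Hypothetical sample data: two items share the name 'lamp',
# and 'desk' has no bids at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bids  (item_id INTEGER);
    INSERT INTO items VALUES (1, 'lamp'), (2, 'chair'), (3, 'lamp'), (4, 'desk');
    INSERT INTO bids  VALUES (1), (1), (2), (3);
""")

# Inner query: one bid count per item.
# Outer query: add those counts up per name.
totals = conn.execute("""
    SELECT name, SUM(bid_count)
    FROM (
        SELECT name,
               (SELECT count(item_id) FROM bids WHERE item_id = items.id) AS bid_count
        FROM items
    ) AS item_counts
    GROUP BY name
    ORDER BY name
""").fetchall()

print(totals)
```

With this data the two 'lamp' items (2 bids and 1 bid) collapse into a single row with a total of 3, and 'desk' is kept with a total of 0.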

Join the tables then perform the GROUP BY:
select i.name, count(b.item_id)
from items i
inner join bids b
on b.item_id = i.id
group by i.name
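One thing worth checking with the join approach is what happens to items that have no bids. The sketch below uses SQLite via Python's sqlite3 module and invented sample rows, and it compares the INNER JOIN from the answer with a LEFT JOIN variant. The LEFT JOIN is not part of the original answer; it is an assumption about what you may want.

```python
import sqlite3

# Hypothetical sample data: 'desk' has no bids.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bids  (item_id INTEGER);
    INSERT INTO items VALUES (1, 'lamp'), (2, 'chair'), (3, 'lamp'), (4, 'desk');
    INSERT INTO bids  VALUES (1), (1), (2), (3);
""")

# INNER JOIN: items without a matching bid disappear before grouping.
inner_rows = conn.execute("""
    SELECT i.name, count(b.item_id)
    FROM items i
    INNER JOIN bids b ON b.item_id = i.id
    GROUP BY i.name
    ORDER BY i.name
""").fetchall()

# LEFT JOIN: unmatched items survive with NULL bid columns,
# and count(b.item_id) ignores NULLs, so they get a count of 0.
left_rows = conn.execute("""
    SELECT i.name, count(b.item_id)
    FROM items i
    LEFT JOIN bids b ON b.item_id = i.id
    GROUP BY i.name
    ORDER BY i.name
""").fetchall()

print(inner_rows)
print(left_rows)
```

If you need the zero-bid names in the output, as the subquery versions produce, the LEFT JOIN form is the closer match.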

Related

Calculated sum/count by category and sort descending, together with an inner join

I have a simple table. One column with a variable which I want to sum or count and another one with category. I tried this:
SELECT COUNT(*) AS counted, category
FROM mytable
GROUP BY category
ORDER BY counted DESC;
Without the ORDER BY counted DESC it works, but the result is not sorted. I would like to see the maximum immediately, so I want to sort descending. However, when I run it, a message pops up and asks me to enter a value for counted. Why can't I do this in one step? Why is this not working?
Same for sum:
SELECT sum(variable) AS calcsum, category
FROM mytable
GROUP BY category
ORDER BY calcsum DESC;
Furthermore, I have the same or a similar problem when trying to do this in one step with a join. I have one table with provided IDs (a variable called keys), and another table with IDs, a category, a filter variable and a score. I want the sum of score per category, sorted descending. So far I have:
SELECT SUM(score) AS calcsum, category
FROM (
SELECT keys, category, filter, score INTO newdataset
FROM table1 INNER JOIN table2 ON table1.keys=table2.ID
WHERE table2.filter="Value")
GROUP BY category;
And I thought here again to add: ORDER BY calcsum DESC
However, even without adding the ORDER BY I get the error message "An action query cannot be used as a row source". So what is my mistake here?
Just repeat the COUNT(*) expression:
SELECT COUNT(*) AS counted, category
FROM mytable
GROUP BY category
ORDER BY COUNT(*) DESC;
EDIT:
If you want this with INTO and JOINs:
SELECT SUM(score) AS calcsum, category
INTO newdataset
FROM table1 INNER JOIN
table2
ON table1.keys =table2.ID
WHERE table2.filter = "Value"
GROUP BY category
ORDER BY SUM(score) DESC;
You can also simply use ORDER BY 1 DESC, where 1 stands for the first column in your SELECT list (the aggregate, in the queries above).
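Here is a small sketch of sorting by the aggregate, either by repeating the expression or by giving its ordinal position in the SELECT list (position 1 in the question's queries). It runs on SQLite through Python's sqlite3 module, not on Access, and the sample rows are invented. SQLite would also accept the alias in ORDER BY, so the Access prompt itself cannot be reproduced here.

```python
import sqlite3

# Hypothetical sample data for the question's mytable.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (category TEXT, variable INTEGER);
    INSERT INTO mytable VALUES
        ('a', 5), ('a', 1), ('a', 2), ('b', 10), ('c', 3), ('c', 4);
""")

# Sort by repeating the aggregate expression.
by_expr = conn.execute("""
    SELECT COUNT(*) AS counted, category
    FROM mytable
    GROUP BY category
    ORDER BY COUNT(*) DESC
""").fetchall()

# Sort by ordinal position: 1 is the first column of the SELECT list.
by_pos = conn.execute("""
    SELECT COUNT(*) AS counted, category
    FROM mytable
    GROUP BY category
    ORDER BY 1 DESC
""").fetchall()

# The same pattern for SUM.
sums = conn.execute("""
    SELECT SUM(variable) AS calcsum, category
    FROM mytable
    GROUP BY category
    ORDER BY SUM(variable) DESC
""").fetchall()

print(by_expr)
print(by_pos)
print(sums)
```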

Count() how many times a name shows up in a table with the rest of info

I have read in various websites about the count() function but I still cannot make this work.
I made a small table with (id, name, last name, age) and I need to retrieve all columns plus a new one. In this new column I want to display how many times a name shows up or repeats itself in the table.
I have run some tests and can retrieve the name column together with the count column, but I haven't been able to retrieve all the data from the table.
Currently I have this
select a.n_showsup, p.*
from [test1].[dbo].[person] p,
(select count(*) n_showsup
from [test1].[dbo].[person])a
This gives me all the data in the output, but the n_showsup column just contains the total number of rows. I know this is because I'm missing a GROUP BY, but when I write GROUP BY name it shows me a lot of records. This is an example of what I need:
You can use window functions, if your RDBMS supports them:
select t.*, count(*) over(partition by name) n_showsup
from mytable t
Alternatively, you can join the table with an aggregation query that counts the number of occurrences of each name:
select t.*, x.n_showsup
from mytable t
inner join (select name, count(*) n_showsup from mytable group by name) x
on x.name = t.name
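To compare the two approaches side by side, here is a minimal sketch on SQLite through Python's sqlite3 module. The column names (id, name, last_name, age) and the sample rows are assumptions based on the question, and the window function needs SQLite 3.25 or later.

```python
import sqlite3

# Hypothetical sample data; 'Ana' appears twice.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT, last_name TEXT, age INTEGER);
    INSERT INTO mytable VALUES
        (1, 'Ana', 'Lee', 30),
        (2, 'Bob', 'Ray', 41),
        (3, 'Ana', 'Kim', 25);
""")

# Window function: count rows per name without collapsing them.
windowed = conn.execute("""
    SELECT t.*, count(*) OVER (PARTITION BY name) AS n_showsup
    FROM mytable t
    ORDER BY t.id
""").fetchall()

# Join against a per-name aggregate: same result, two steps.
joined = conn.execute("""
    SELECT t.*, x.n_showsup
    FROM mytable t
    INNER JOIN (SELECT name, count(*) AS n_showsup FROM mytable GROUP BY name) x
        ON x.name = t.name
    ORDER BY t.id
""").fetchall()

for row in windowed:
    print(row)
```

Both queries keep every original row and append the per-name count as the last column.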
While the window function approach (@GMB's answer) is the right way to go, thinking through this from a subquery approach (like you were headed towards) would look something like:
select p.*, a.n_showsup
from [test1].[dbo].[person] p
INNER JOIN (
select name, count(*) n_showsup
from [test1].[dbo].[person]
GROUP BY name
) a ON p.name = a.name
This is VERY close to what you had, the difference is that we are grouping that subquery by name (so we get a count by name) and we can use that in the join criteria which we do with the ON clause on that INNER JOIN.
You should really never ever use a comma in your FROM clause. Instead use a JOIN.
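To see why the original comma join put the total row count on every row, here is a small sketch on SQLite via Python's sqlite3 module. The sample rows and column names are assumptions. The comma acts as a cross join, and the ungrouped count(*) subquery yields exactly one row, so that single value is paired with every person.

```python
import sqlite3

# Hypothetical sample data; 'Ana' appears twice, 3 rows in total.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, last_name TEXT, age INTEGER);
    INSERT INTO person VALUES
        (1, 'Ana', 'Lee', 30),
        (2, 'Bob', 'Ray', 41),
        (3, 'Ana', 'Kim', 25);
""")

# Comma join (cross join) with a one-row aggregate, as in the question.
cross = conn.execute("""
    SELECT a.n_showsup, p.*
    FROM person p,
         (SELECT count(*) AS n_showsup FROM person) a
    ORDER BY p.id
""").fetchall()

# Grouped subquery joined on name, as in the answer.
per_name = conn.execute("""
    SELECT a.n_showsup, p.*
    FROM person p
    INNER JOIN (
        SELECT name, count(*) AS n_showsup
        FROM person
        GROUP BY name
    ) a ON p.name = a.name
    ORDER BY p.id
""").fetchall()

print([row[0] for row in cross])     # table-wide count repeated on each row
print([row[0] for row in per_name])  # count per name
```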

Get the first instance of a row using MS Access

EDITED:
I have this query wherein I want to SELECT the first instance of a record from the table petTable.
SELECT id,
pet_ID,
FIRST(petName),
First(Description)
FROM petTable
GROUP BY pet_ID;
The problem is I have a huge number of records and this query is too slow. I discovered that GROUP BY slows down the query. Do you have any idea that could make this query faster? Or better, a query where I don't need to use GROUP BY?
"The problem is I have a huge number of records and this query is too slow. I discovered that GROUP BY slows down the query. Do you have any idea that could make this query faster?"
Add an index on pet_ID, then create and test this query:
SELECT pet_ID, Min(id) AS MinOfid
FROM petTable
GROUP BY pet_ID;
Once you have that query working, you can join it back to the original table --- then it will select only the original rows which match based on id and you can retrieve the other fields you want from those matching rows.
SELECT pt.id, pt.pet_ID, pt.petName, pt.Description
FROM
petTable AS pt
INNER JOIN
(
SELECT pet_ID, Min(id) AS MinOfid
FROM petTable
GROUP BY pet_ID
) AS sub
ON pt.id = sub.MinOfid;
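The two-step idea (find the smallest id per pet_ID, then join back for the other columns) can be sketched as follows. This runs on SQLite through Python's sqlite3 module rather than Access, and the sample rows and the index name idx_pet_id are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: pets 10 and 20 each have two rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE petTable (id INTEGER PRIMARY KEY, pet_ID INTEGER,
                           petName TEXT, Description TEXT);
    CREATE INDEX idx_pet_id ON petTable (pet_ID);
    INSERT INTO petTable VALUES
        (1, 10, 'Rex', 'first rex'),
        (2, 10, 'Rex', 'second rex'),
        (3, 20, 'Tom', 'first tom'),
        (4, 20, 'Tom', 'later tom'),
        (5, 30, 'Ivy', 'only ivy');
""")

# Step 1 (subquery): smallest id per pet_ID.
# Step 2 (outer query): join back to fetch the remaining columns of that row.
first_rows = conn.execute("""
    SELECT pt.id, pt.pet_ID, pt.petName, pt.Description
    FROM petTable AS pt
    INNER JOIN (
        SELECT pet_ID, MIN(id) AS MinOfid
        FROM petTable
        GROUP BY pet_ID
    ) AS sub
        ON pt.id = sub.MinOfid
    ORDER BY pt.id
""").fetchall()

for row in first_rows:
    print(row)
```

Here "first" means the row with the lowest id for each pet_ID, which is an assumption about what the first instance should be.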
Your Query could change as,
SELECT ID, pet_ID, petName, Description
FROM petTable
WHERE ID IN
(SELECT Min(ID) As MinID FROM petTable GROUP BY pet_ID);
Or use the TOP clause,
SELECT petTable.pet_ID, petTable.petName, petTable.[Description]
FROM petTable
WHERE petTable.ID IN
(SELECT TOP 1 ID
FROM petTable AS tmpTbl
WHERE tmpTbl.pet_ID = petTable.pet_ID
ORDER BY tmpTbl.ID)
ORDER BY petTable.pet_ID, petTable.petName, petTable.[Description];
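Both subquery variants can be compared with a sketch like this one. It runs on SQLite via Python's sqlite3 module, which has no TOP clause, so LIMIT 1 is used here as a stand-in for TOP 1. The sample rows are invented, and the correlated subquery orders by id on the assumption that the lowest id is the first instance.

```python
import sqlite3

# Hypothetical sample data: pets 10 and 20 each have two rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE petTable (id INTEGER PRIMARY KEY, pet_ID INTEGER,
                           petName TEXT, Description TEXT);
    INSERT INTO petTable VALUES
        (1, 10, 'Rex', 'first rex'),
        (2, 10, 'Rex', 'second rex'),
        (3, 20, 'Tom', 'first tom'),
        (4, 20, 'Tom', 'later tom'),
        (5, 30, 'Ivy', 'only ivy');
""")

# Variant 1: uncorrelated subquery listing the smallest id of every group.
via_min = conn.execute("""
    SELECT id, pet_ID, petName, Description
    FROM petTable
    WHERE id IN (SELECT MIN(id) FROM petTable GROUP BY pet_ID)
    ORDER BY id
""").fetchall()

# Variant 2: correlated subquery picking one id per outer row's pet_ID.
# LIMIT 1 plays the role of Access's TOP 1.
via_limit = conn.execute("""
    SELECT p.id, p.pet_ID, p.petName, p.Description
    FROM petTable AS p
    WHERE p.id IN (
        SELECT t.id
        FROM petTable AS t
        WHERE t.pet_ID = p.pet_ID
        ORDER BY t.id
        LIMIT 1
    )
    ORDER BY p.id
""").fetchall()

print(via_min)
print(via_limit)
```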

How can I use the GROUP BY SQL clause with no aggregate function?

When I try to use the following SELECT statement:
SELECT [lots of columns]
FROM Client, Customer, Document, Group
WHERE [some conditions]
GROUP BY Group.id
SQL Server complains that the columns I selected are not part of the GROUP BY statement nor an aggregate function. Am I using GROUP BY wrong? What should I be using instead?
To return all single occurrences of a group by field, together with associated field values, write a query like:
select group_field,
max(other_field1),
max(other_field2),
...
from mytable1
join mytable2 on ...
group by group_field
having count(*) = 1;
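As a rough illustration of the HAVING count(*) = 1 pattern, the sketch below uses a single made-up table instead of the join in the answer and runs on SQLite through Python's sqlite3 module. Because each surviving group has exactly one row, max() simply returns that row's value.

```python
import sqlite3

# Hypothetical sample data: group 'b' occurs twice, 'a' and 'c' once each.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (group_field TEXT, other_field1 TEXT);
    INSERT INTO mytable VALUES
        ('a', 'x'), ('b', 'y1'), ('b', 'y2'), ('c', 'z');
""")

# Keep only groups with a single row; max() over one row is that row's value.
singles = conn.execute("""
    SELECT group_field, max(other_field1)
    FROM mytable
    GROUP BY group_field
    HAVING count(*) = 1
    ORDER BY group_field
""").fetchall()

print(singles)
```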
Yes, you are using GROUP BY incorrectly. The point of using GROUP BY is to use aggregate functions. If you have no aggregate functions you probably want SELECT DISTINCT instead.
SELECT DISTINCT
col1,
col2,
-- etc
coln
FROM Client
JOIN Customer ON ...
JOIN Document ON ...
JOIN [Group] ON ...
WHERE ...
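To illustrate why SELECT DISTINCT is the natural choice when there are no aggregates, this sketch compares it with grouping by every selected column. It uses a single hypothetical table t with columns col1 and col2 and runs on SQLite through Python's sqlite3 module rather than SQL Server.

```python
import sqlite3

# Hypothetical sample data containing one duplicate row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (col1 INTEGER, col2 TEXT);
    INSERT INTO t VALUES (1, 'x'), (1, 'x'), (2, 'y'), (1, 'z');
""")

# DISTINCT removes duplicate rows from the result.
distinct_rows = conn.execute("""
    SELECT DISTINCT col1, col2
    FROM t
    ORDER BY col1, col2
""").fetchall()

# Grouping by every selected column, with no aggregates, does the same thing.
grouped_rows = conn.execute("""
    SELECT col1, col2
    FROM t
    GROUP BY col1, col2
    ORDER BY col1, col2
""").fetchall()

print(distinct_rows)
print(grouped_rows)
```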
My first guess would be that the problem is that you have a table called Group, which is a reserved word in SQL. Try wrapping the table name in square brackets: [Group].
You want to group by all the columns you are selecting that are not in an aggregate function.
SELECT ProductName, ProductCategory, SUM(ProductAmount)
FROM Products
GROUP BY ProductName, ProductCategory
This will give you a distinct result of product names and categories, with the total product amount summed over all the rows in each group.
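A quick way to see this in action is the sketch below, which runs the query on SQLite through Python's sqlite3 module. The sample products are made up, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# Hypothetical sample data: 'pen' appears in two categories.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (ProductName TEXT, ProductCategory TEXT, ProductAmount INTEGER);
    INSERT INTO Products VALUES
        ('pen', 'office', 2),
        ('pen', 'office', 3),
        ('pen', 'gift',   1),
        ('mug', 'gift',   4);
""")

# Every non-aggregated column in the SELECT list appears in GROUP BY.
totals = conn.execute("""
    SELECT ProductName, ProductCategory, SUM(ProductAmount)
    FROM Products
    GROUP BY ProductName, ProductCategory
    ORDER BY ProductName, ProductCategory
""").fetchall()

for row in totals:
    print(row)
```

The two ('pen', 'office') rows are merged into one row with a total of 5, while ('pen', 'gift') stays separate because it is a different name/category combination.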

How do I select those records where the group by clause returns 2 or more?

I'd like to return a list of items of only those that have two or more in the group:
select count(item_id) from items group by type_id;
Specifically, I'd like to know the values of item_id when the count(item_id) == 2.
You're asking for something that's not particularly possible without a subquery.
Basically, you want to list all values in a column while aggregating on that same column. You can't do this. Aggregating on a column makes it impossible to list all the individual values from that column.
What you can do is find all type_id values which have an item_id count equal to 2, then select all item_ids from records matching those type_id values:
SELECT item_id
FROM items
WHERE type_id IN (
SELECT type_id
FROM items
GROUP BY type_id
HAVING COUNT(item_id) = 2
)
This is best expressed using a join rather than a WHERE IN clause, but the idea is the same no matter how you approach it. You may also want to select distinct item_ids in which case you'll need the DISTINCT keyword before item_id in the outer query.
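For comparison, here is a sketch of both forms, the WHERE IN version and a join against the grouped subquery. It runs on SQLite through Python's sqlite3 module. The sample rows and the alias names i and g are assumptions for illustration.

```python
import sqlite3

# Hypothetical sample data: type 1 has two items, type 2 one, type 3 three.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (item_id INTEGER PRIMARY KEY, type_id INTEGER);
    INSERT INTO items VALUES (1, 1), (2, 1), (3, 2), (4, 3), (5, 3), (6, 3);
""")

# WHERE IN: find qualifying type_ids, then list their items.
via_in = conn.execute("""
    SELECT item_id
    FROM items
    WHERE type_id IN (
        SELECT type_id
        FROM items
        GROUP BY type_id
        HAVING COUNT(item_id) = 2
    )
    ORDER BY item_id
""").fetchall()

# Join: the same grouped subquery used as a derived table.
via_join = conn.execute("""
    SELECT i.item_id
    FROM items i
    INNER JOIN (
        SELECT type_id
        FROM items
        GROUP BY type_id
        HAVING COUNT(item_id) = 2
    ) g ON g.type_id = i.type_id
    ORDER BY i.item_id
""").fetchall()

print(via_in)
print(via_join)
```

Changing the HAVING condition to COUNT(item_id) >= 2 would cover the "two or more" wording from the question.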
If your SQL dialect includes GROUP_CONCAT(), that could be used to generate a list of items without the inner query. However, the results differ; the inner query returns one item id per row, where GROUP_CONCAT() returns multiple ids as a string.
SELECT type_id, GROUP_CONCAT(item_id), COUNT(item_id) as number
FROM items
GROUP BY type_id
HAVING number = 2
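SQLite happens to provide a group_concat() function as well, so the idea can be tried with Python's sqlite3 module. This is a sketch with made-up rows. It repeats the aggregate in HAVING instead of using the alias, and it makes no assumption about the order of ids inside the concatenated string.

```python
import sqlite3

# Hypothetical sample data: only type 1 has exactly two items.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (item_id INTEGER PRIMARY KEY, type_id INTEGER);
    INSERT INTO items VALUES (1, 1), (2, 1), (3, 2), (4, 3), (5, 3), (6, 3);
""")

# One row per qualifying type_id, with its item ids packed into one string.
rows = conn.execute("""
    SELECT type_id, group_concat(item_id), COUNT(item_id) AS number
    FROM items
    GROUP BY type_id
    HAVING COUNT(item_id) = 2
""").fetchall()

for type_id, id_list, number in rows:
    # Split the comma-separated string back into individual ids.
    ids = sorted(int(x) for x in id_list.split(","))
    print(type_id, ids, number)
```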
Try this sql query:
select count(item_id) from items group by type_id having count(item_id)=2;