How to use COUNT and GROUP BY in the same SELECT statement

I have an SQL SELECT query that also uses a GROUP BY.
I want to count all the records after the GROUP BY clause has filtered the result set.
Is there any way to do this directly with SQL? For example, if I have the table `user` and want to select the different towns and the total number of users:
SELECT `town`, COUNT(*)
FROM `user`
GROUP BY `town`;
I want to have a column with all the towns and another with the number of users in all rows.
An example of the result for having 3 towns and 58 users in total is:
| Town | Count |
|------------|-------|
| Copenhagen | 58 |
| New York | 58 |
| Athens | 58 |

This will do what you want (list of towns, with the number of users in each):
SELECT `town`, COUNT(`town`)
FROM `user`
GROUP BY `town`;
You can use most aggregate functions with a GROUP BY clause (COUNT, MAX, COUNT DISTINCT, etc.).
Update:
You can declare a variable for the number of users, store the result in it, and then SELECT the value of the variable:
DECLARE @numOfUsers INT
SET @numOfUsers = (SELECT COUNT(*) FROM `user`);
SELECT DISTINCT `town`, @numOfUsers FROM `user`;

You can use COUNT(DISTINCT ...) :
SELECT COUNT(DISTINCT town)
FROM user

The other way is:
/* Number of rows in a derived table called d1. */
select count(*) from
(
/* Number of times each town appears in user. */
select town, count(*)
from user
group by town
) d1

There are ten non-deleted answers, and most do not do what the user asked for. Most answers misread the question as meaning that there are 58 users in each town rather than 58 in total. Even the few that are correct are not optimal.
mysql> flush status;
Query OK, 0 rows affected (0.00 sec)
SELECT province, total_cities
FROM ( SELECT DISTINCT province FROM canada ) AS provinces
CROSS JOIN ( SELECT COUNT(*) total_cities FROM canada ) AS tot;
+---------------------------+--------------+
| province | total_cities |
+---------------------------+--------------+
| Alberta | 5484 |
| British Columbia | 5484 |
| Manitoba | 5484 |
| New Brunswick | 5484 |
| Newfoundland and Labrador | 5484 |
| Northwest Territories | 5484 |
| Nova Scotia | 5484 |
| Nunavut | 5484 |
| Ontario | 5484 |
| Prince Edward Island | 5484 |
| Quebec | 5484 |
| Saskatchewan | 5484 |
| Yukon | 5484 |
+---------------------------+--------------+
13 rows in set (0.01 sec)
SHOW session status LIKE 'Handler%';
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| Handler_commit | 1 |
| Handler_delete | 0 |
| Handler_discover | 0 |
| Handler_external_lock | 4 |
| Handler_mrr_init | 0 |
| Handler_prepare | 0 |
| Handler_read_first | 3 |
| Handler_read_key | 16 |
| Handler_read_last | 1 |
| Handler_read_next | 5484 | -- One table scan to get COUNT(*)
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_next | 15 |
| Handler_rollback | 0 |
| Handler_savepoint | 0 |
| Handler_savepoint_rollback | 0 |
| Handler_update | 0 |
| Handler_write | 14 | -- leapfrog through index to find provinces
+----------------------------+-------+
In the OP's context:
SELECT town, total_users
FROM ( SELECT DISTINCT town FROM `user` ) AS towns
CROSS JOIN ( SELECT COUNT(*) total_users FROM `user` ) AS tot;
Since there is only one row from tot, the CROSS JOIN is not as voluminous as it might otherwise be.
The usual pattern is COUNT(*) instead of COUNT(town). The latter implies checking town for being not null, which is unnecessary in this context.
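For anyone who wants to try the CROSS JOIN approach in isolation, here is a minimal sketch. The table name `users` and the sample rows are made up for illustration (the question's table is `user`, which is a reserved word in several databases), and the multi-row INSERT is assumed to be supported by your engine.

```sql
-- Hypothetical sample data: 3 towns, 5 users in total.
CREATE TABLE users (name VARCHAR(50), town VARCHAR(50));

INSERT INTO users (name, town) VALUES
    ('Anna',  'Copenhagen'),
    ('Bo',    'Copenhagen'),
    ('Carl',  'New York'),
    ('Dina',  'Athens'),
    ('Elias', 'Athens');

-- One row per distinct town, each carrying the overall user count.
-- The second derived table returns exactly one row, so the CROSS JOIN
-- simply attaches that single value to every town.
SELECT towns.town, tot.total_users
FROM (SELECT DISTINCT town FROM users) AS towns
CROSS JOIN (SELECT COUNT(*) AS total_users FROM users) AS tot
ORDER BY towns.town;
-- Athens      5
-- Copenhagen  5
-- New York    5
```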

With Oracle you could use analytic functions:
select town, count(town), sum(count(town)) over () total_count from user
group by town
Your other option is to use a subquery:
select town, count(town), (select count(town) from user) as total_count from user
group by town
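The analytic-function variant is not limited to Oracle; other engines with window-function support should accept the same idea. A minimal sketch, assuming a made-up table `app_users` and sample rows that are not from the question:

```sql
-- Hypothetical sample data: 3 towns, 5 users in total.
CREATE TABLE app_users (name VARCHAR(50), town VARCHAR(50));

INSERT INTO app_users (name, town) VALUES
    ('Anna',  'Copenhagen'),
    ('Bo',    'Copenhagen'),
    ('Carl',  'New York'),
    ('Dina',  'Athens'),
    ('Elias', 'Athens');

-- COUNT(*) is evaluated per group; the window function then sums those
-- per-group counts over the whole (grouped) result set.
SELECT town,
       COUNT(*)                 AS users_in_town,
       SUM(COUNT(*)) OVER ()    AS total_users
FROM app_users
GROUP BY town
ORDER BY town;
-- Athens      2  5
-- Copenhagen  2  5
-- New York    1  5
```

This returns both the per-town count and the grand total in one pass over the grouped rows, which may be convenient if you need both numbers.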

If you want to order by the count (it sounds simple, but I couldn't find an answer on Stack Overflow showing how to do it), you can do:
SELECT town, count(town) as total FROM user
GROUP BY town ORDER BY total DESC

You can use DISTINCT inside COUNT, as milkovsky said.
In my case:
select COUNT(distinct user_id) from answers_votes where answer_id in (694,695);
This returns the number of answer votes, counting rows that share the same user_id only once.

I know this is an old post, in SQL Server:
select isnull(town,'TOTAL') Town, count(*) cnt
from user
group by town WITH ROLLUP
Town cnt
Copenhagen 58
NewYork 58
Athens 58
TOTAL 174

If you want to select town and total user count, you can use this query below:
SELECT Town, (SELECT Count(*) FROM User) `Count` FROM user GROUP BY Town;

If you want to select all columns together with a count, try this:
select a.*, (Select count(b.name) from table_name as b where Condition) as totCount from table_name as a where Condition

Try the following code:
select ccode, count(empno)
from company_details
group by ccode;

Related

SQL: Get top records per category, per day, per country?

A little trickier than just getting the top # per category. I want the top 2 videos per artist, per day per country.
My code, which didn't give me the right results is:
Select *
From
(
Select t2.*, dense_rank() over(partition by artist order by views desc)
From
(select country, day, artist, song, sum(view) as views
From t1
Group by 1,2,3,4
) t2
)
Where rn >=5
Sample data results
| Country | Date | artist | video | views | rn |
|---------|------|----------|-------|-------|----|
| US | Jan1 | Beyonce | ab | 100 | 1 |
| US | Jan1 | Beyonce | ac | 99 | 2 |
| US | Jan2 | C. Brown | ad | 89 | 1 |
| US | Jan2 | C. Brown | ai | 103 | 2 |
| AU | Jan1 | Beyonce | bf | 99 | 1 |
| AU | Jan1 | Beyonce | bb | 89 | 2 |
I want all artists per day, per country but only 10 videos per artist..
I am kind of confused as to how to achieve this.
I generally struggle when it comes to window functions, so I would appreciate any help.
I am using Amazon Redshift
Thanks
You need to partition by all of the columns you mentioned since you are ranking views within each combination of these elements.
Because you've renamed the aggregate column as "views", you need to call it by that name.
Finally, if you want the top 2 videos/songs, use this condition: where rn <= 2
Select *
From
(
Select t2.*, dense_rank() over(partition by country, day, artist order by views desc) as rn
From
(select country, day, artist, song, sum(views) as views
From t1
Group by 1,2,3,4
) t2
) t3
Where rn <= 2
This ranks the views per artist, per day, per country and shows the top two for each group:
Select *
From
(
Select t2.*, ROW_NUMBER() over(partition by artist, day, country order by views desc) as rn
From t1 t2
) ranked
Where rn <= 2
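To make the difference between the two answers concrete, here is a small self-contained sketch. The table `video_views`, the column `view_day` (used instead of `day` to avoid keyword clashes), and all rows are hypothetical. It uses ROW_NUMBER() with a tie-breaker on `song`, so exactly two rows come back per group even when two songs have the same view total; DENSE_RANK() would return both tied songs.

```sql
-- Hypothetical sample data.
CREATE TABLE video_views (
    country  VARCHAR(2),
    view_day DATE,
    artist   VARCHAR(50),
    song     VARCHAR(50),
    views    INT
);

INSERT INTO video_views (country, view_day, artist, song, views) VALUES
    ('US', '2020-01-01', 'Beyonce', 'ab', 100),
    ('US', '2020-01-01', 'Beyonce', 'ac',  99),
    ('US', '2020-01-01', 'Beyonce', 'ae',  40),
    ('US', '2020-01-02', 'Beyonce', 'ab',  70),
    ('AU', '2020-01-01', 'Beyonce', 'bf',  99),
    ('AU', '2020-01-01', 'Beyonce', 'bb',  89),
    ('AU', '2020-01-01', 'Beyonce', 'bc',  89);

-- Aggregate first, then number the songs inside each
-- (country, day, artist) group by descending total views.
SELECT country, view_day, artist, song, total_views, rn
FROM (
    SELECT country, view_day, artist, song,
           SUM(views) AS total_views,
           ROW_NUMBER() OVER (
               PARTITION BY country, view_day, artist
               ORDER BY SUM(views) DESC, song
           ) AS rn
    FROM video_views
    GROUP BY country, view_day, artist, song
) ranked
WHERE rn <= 2
ORDER BY country, view_day, artist, rn;
-- AU  2020-01-01  Beyonce  bf   99  1
-- AU  2020-01-01  Beyonce  bb   89  2
-- US  2020-01-01  Beyonce  ab  100  1
-- US  2020-01-01  Beyonce  ac   99  2
-- US  2020-01-02  Beyonce  ab   70  1
```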

SQL - SELECT duplicates between IDs, but not show records if duplicates occur for same ID

I have the following table (simplified from the real table) at the moment:
+----+-------+-------+
| ID | Name | Phone |
+----+-------+-------+
| 1 | Tom | 123 |
| 1 | Tom | 123 |
| 1 | Tom | 123 |
| 2 | Mark | 321 |
| 2 | Mark | 321 |
| 3 | Kate | 321 |
+----+-------+-------+
My desired output in the SELECT statement is:
+----+------+-------+
| ID | Name | Phone |
+----+------+-------+
| 2 | Mark | 321 |
| 3 | Kate | 321 |
+----+------+-------+
I want to select duplicates only when they occur between two different IDs (like Mark and Kate sharing the same phone number), but not to show any records for IDs that share the same phone number with themselves only (like Tom).
Could someone advise how this can be achieved?
You can use an EXISTS condition with a correlated subquery to ensure that another record exists that has the same phone and a different id. We also need DISTINCT to remove the duplicates in the resultset.
SELECT DISTINCT id, name, phone
FROM mytable t
WHERE EXISTS (
SELECT 1
FROM mytable t1
WHERE t1.phone = t.phone AND t1.id <> t.id
)
Demo on DB Fiddle:
| id | name | phone |
| --- | ---- | ----- |
| 2 | Mark | 321 |
| 3 | Kate | 321 |
You can use window functions for this:
select t.*
from (select t.*,
row_number() over (partition by phone, name order by id) as seqnum,
min(id) over (partition by phone) as min_id,
max(id) over (partition by phone) as max_id
from t
) t
where seqnum = 1 and min_id <> max_id;
Another method uses aggregation and a window function:
select phone, name, id
from (select phone, name, id,
count(*) over (partition by phone) as num_ids
from t
group by phone, name, id
) pn
where num_ids > 1;
Both of these have the advantage over the exists solution (GMB's) that they refer to the "table" only once. That can be a big advantage if the table is a complex view or query. If performance is an issue, I would encourage you to test several variants to see which works best.
You can use a correlated subquery with GROUP BY and HAVING, as below:
Select ID, NAME, max(PHONE) From
(Select * From Table) t group by id,
name having
1= max(
case
When phone in (select phone from
table where t.id<>Id) then 1 else 0
end)
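Yet another way to express "a phone shared by more than one ID" is GROUP BY phone with HAVING COUNT(DISTINCT id) > 1, joined back to the table. This is a different technique from the answers above, shown here only as a sketch; the table name `contacts` is an assumption, and the rows mirror the question's sample data.

```sql
-- Sample data from the question, in a hypothetical table named contacts.
CREATE TABLE contacts (id INT, name VARCHAR(50), phone VARCHAR(20));

INSERT INTO contacts (id, name, phone) VALUES
    (1, 'Tom',  '123'),
    (1, 'Tom',  '123'),
    (1, 'Tom',  '123'),
    (2, 'Mark', '321'),
    (2, 'Mark', '321'),
    (3, 'Kate', '321');

-- The derived table keeps only phones that belong to at least two
-- different ids; DISTINCT then collapses the repeated rows per id.
SELECT DISTINCT c.id, c.name, c.phone
FROM contacts c
JOIN (
    SELECT phone
    FROM contacts
    GROUP BY phone
    HAVING COUNT(DISTINCT id) > 1
) shared ON shared.phone = c.phone
ORDER BY c.id;
-- 2  Mark  321
-- 3  Kate  321
```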

Aggregate functions in where clause MS Access

I have multiple tables with names and ages like this:
| Name | Age |
------------------
| Carlos | 25 |
| Mauricio | 28 |
| Cesar | 19 |
| Hernan | 7 |
And I need to retrieve all the names that are above the average Age.
I tried
select Name from Table1 where Age > avg(Age)
but I found that the where clause does not work with aggregate functions, so I tried
select Name from Table 1 having Age > avg(Age)
But it does not work either.
You can do it with the following query:
select Name from Table1 where Age > (select avg(Age) from Table1)
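Here is the same idea run against the question's sample rows. This sketch is written in generic SQL rather than Access-specific DDL (Access has its own type names and does not accept multi-row INSERT), and the table name `people` is made up.

```sql
-- Sample rows from the question, in a hypothetical table named people.
CREATE TABLE people (name VARCHAR(50), age INT);

INSERT INTO people (name, age) VALUES
    ('Carlos',   25),
    ('Mauricio', 28),
    ('Cesar',    19),
    ('Hernan',    7);

-- The scalar subquery is evaluated once and yields the average age
-- (79 / 4 = 19.75); the outer WHERE then compares each row against it.
SELECT name, age
FROM people
WHERE age > (SELECT AVG(age) FROM people)
ORDER BY name;
-- Carlos    25
-- Mauricio  28
```

The aggregate is allowed here because it lives in its own SELECT; the restriction only applies to using it directly in the outer WHERE clause.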

SQL requires aggregate function over singular value

I have table language that contains the columns country, name and percentage.
A sample set might look like this:
+---------+---------+------------+
| country | name | percentage |
+---------+---------+------------+
| usa | english | 85 |
| usa | spanish | 10 |
| usa | german | 5 |
| germany | german | 100 |
+---------+---------+------------+
I want to get
+---------+---------+------------+
| country | name | percentage |
+---------+---------+------------+
| usa | english | 85 |
| germany | german | 100 |
+---------+---------+------------+
select country, name, max(percentage) from language group by country
This tells me that I need to put all but one of the columns into either an aggregate function or the GROUP BY.
If you put the name into the GROUP BY, you get the original table, since all pairs of country and name are unique.
Name should be a single specific value since there can only be one pair of country and maximum percentage, so there's nothing to compare it to, and it's a string anyway.
I'm sure there's a simple way to resolve this, without doing any second select statements and joining tables and the like.
If you want the maximum percentage, use row_number():
select l.*
from (select l.*,
row_number() over (partition by country order by percentage desc) as seqnum
from language l
) l
where seqnum = 1;
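Applied to the sample data from the question, the row_number() approach looks like this. The table is named `country_language` here (an assumption, to stay clear of any keyword clash with `language`), and the outer query lists the columns explicitly so that the helper column `seqnum` is not returned.

```sql
-- Sample rows from the question, in a hypothetical table.
CREATE TABLE country_language (
    country    VARCHAR(50),
    name       VARCHAR(50),
    percentage INT
);

INSERT INTO country_language (country, name, percentage) VALUES
    ('usa',     'english',  85),
    ('usa',     'spanish',  10),
    ('usa',     'german',    5),
    ('germany', 'german',  100);

-- Number the languages inside each country from highest to lowest
-- percentage, then keep only the first one per country.
SELECT country, name, percentage
FROM (
    SELECT l.country, l.name, l.percentage,
           ROW_NUMBER() OVER (
               PARTITION BY l.country
               ORDER BY l.percentage DESC
           ) AS seqnum
    FROM country_language l
) ranked
WHERE seqnum = 1
ORDER BY country;
-- germany  german   100
-- usa      english   85
```

If two languages tie for the top percentage in a country, ROW_NUMBER() picks one of them arbitrarily unless you add a tie-breaker to the ORDER BY.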

How can I use a single query to list aggregate enumerated values of a single field?

The title is a bit convoluted. Here's a concrete example. I have two tables:
+-------------+--------------------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------------------------+------+-----+---------+-------+
| event | varchar(100) | NO | MUL | NULL | |
| sport | varchar(100) | NO | | NULL | |
| athleteCode | char(10) | NO | MUL | NULL | |
| medal | enum('GOLD','SILVER','BRONZE') | NO | | NULL | |
+-------------+--------------------------------+------+-----+---------+-------+
+---------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------+--------------+------+-----+---------+-------+
| name | varchar(100) | NO | | NULL | |
| code | char(10) | NO | PRI | NULL | |
| country | varchar(100) | NO | MUL | NULL | |
+---------+--------------+------+-----+---------+-------+
The first table is the medals table. The second table is the athletes table. The two tables are related via medals.athleteCode and athletes.code. I want to be able to list out a query that displays the following information:
COUNTRY | GOLD | SILVER | BRONZE | TOTAL
The only way that I've been able to do it so far is to use this query:
SELECT country, medal, COUNT(medal) as count
FROM athletes, medals
WHERE athletes.code=medals.athleteCode
GROUP BY country, medal
ORDER BY country, medal;
But after I do this query I still have to process the query (via PHP) because this only gets me every country per medal type (i.e. All of China's golds, all of China's silvers, all of China's bronzes, etc). Is there a way to create a query where each record (i.e. row) of the query is: COUNTRY | GOLDS | SILVERS | BRONZES | TOTAL? I looked at COUNT() but I'm not really sure how to use it.
2012.08.17
@"habib zare"'s solution is really close. Here's my tweak of it:
SELECT country,
(SELECT count(*) FROM medals WHERE a.code=m.athleteCode AND medal='Gold') AS Gold,
(SELECT count(*) FROM medals WHERE a.code=m.athleteCode AND medal='Silver') AS Silver,
(SELECT count(*) FROM medals WHERE a.code=m.athleteCode AND medal='Bronze') AS Bronze,
(SELECT count(*) FROM medals WHERE a.code=m.athleteCode) AS Total
FROM medals m JOIN athletes a
ON m.athleteCode=a.code
GROUP BY country
ORDER BY country, Gold DESC, Silver DESC, Bronze DESC
The problem is that the secondary SELECT statements need to select based on the country; that is I need something like:
SELECT country AS Country,
(SELECT count(*) FROM medals WHERE a.code=m.athleteCode AND medal='Gold' AND a.country=Country) AS Gold,
In MS SQL Server SUM(CASE WHEN medal = 'GOLD' THEN 1 ELSE 0 END) AS GOLDS can be used with GROUP BY country.
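Spelled out in full, the SUM(CASE ...) idea gives one row per country in a single query, and it is not specific to SQL Server. A minimal sketch follows; the table and column names follow the question, the column types are simplified, and the handful of rows is made up for illustration.

```sql
-- Simplified versions of the question's tables with hypothetical rows.
CREATE TABLE athletes (
    name    VARCHAR(100),
    code    VARCHAR(10) PRIMARY KEY,
    country VARCHAR(100)
);

CREATE TABLE medals (
    event       VARCHAR(100),
    sport       VARCHAR(100),
    athleteCode VARCHAR(10),
    medal       VARCHAR(10)
);

INSERT INTO athletes (name, code, country) VALUES
    ('habib',  '1', 'iran'),
    ('ahmad',  '2', 'azerbaijan'),
    ('mehmad', '3', 'turkey');

INSERT INTO medals (event, sport, athleteCode, medal) VALUES
    ('jh',   'swim',     '1', 'GOLD'),
    ('fg',   'err',      '1', 'GOLD'),
    ('df',   'jh',       '1', 'BRONZE'),
    ('dfg',  'ert',      '2', 'BRONZE'),
    ('sdds', 'tekvando', '3', 'GOLD'),
    ('shj',  'jsg',      '3', 'SILVER');

-- Each CASE turns a medal row into 1 or 0 for its column;
-- SUM then adds those flags up per country.
SELECT a.country,
       SUM(CASE WHEN m.medal = 'GOLD'   THEN 1 ELSE 0 END) AS gold,
       SUM(CASE WHEN m.medal = 'SILVER' THEN 1 ELSE 0 END) AS silver,
       SUM(CASE WHEN m.medal = 'BRONZE' THEN 1 ELSE 0 END) AS bronze,
       COUNT(*)                                            AS total
FROM athletes a
JOIN medals m ON m.athleteCode = a.code
GROUP BY a.country
ORDER BY gold DESC, silver DESC, bronze DESC, a.country;
-- iran        2  0  1  3
-- turkey      1  1  0  2
-- azerbaijan  0  0  1  1
```

Because this is an inner join, countries whose athletes have no medals at all do not appear; a LEFT JOIN from athletes would be needed to list them, with COUNT(m.medal) replacing COUNT(*) for the total.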
You can also use OUTER APPLY if you are using MSSQL:
Sample:
SELECT Country
,Gold = ISNULL(GoldMedals.MedalCount,0)
,Total = ISNULL(TotalMedals.MedalCount,0)
FROM athletes
OUTER APPLY (
SELECT MedalCount = COUNT(*)
FROM medals
WHERE medals.athleteCode = athletes.code
AND medals.medal = 'Gold'
) GoldMedals
OUTER APPLY (
SELECT MedalCount = COUNT(*)
FROM medals
WHERE medals.athleteCode = athletes.code
) TotalMedals
GROUP BY country
You'll probably need to add SUM instead of ISNULL but I didn't really test this code, so use it as is...
I don't fully understand what you want, but I think you want this:
medals :
medal event sport athleteCode
Gold jh swim 1
Bronze dfg ert 2
Gold fg err 1
Silver as erf 4
Bronze as erf 5
Gold df dfg 6
Gold sdds tekvando 3
Bronze df jh 1
Bronze yy jh 1
Silver ik as 1
Silver shj jsg 3
Silver shj jsg 5
Silver sjdk hgj 5
Silver wuytu wopow 5
Silver wuytu wopow 6
Silver wuytu wopow 6
and
athlete :
name code country
habib 1 iran
ahmad 2 azerbaijan
mehmad 3 turkey
so 4 iran
ghg 5 azerbaijan
yewtuuy 6 azerbaijan
my query:
SELECT country,name,
(select count(*) from medals where athleteCode=m.athleteCode and medal='Gold') as Gold,
(select count(*) from medals where athleteCode=m.athleteCode and medal='Silver') as Silver,
(select count(*) from medals where athleteCode=m.athleteCode and medal='Bronze') as Bronze,
(select count(*) from medals where athleteCode=m.athleteCode) as Total
FROM medals m join athletes a
on m.athleteCode=a.code
group by country,code
order by country,Gold desc,Silver desc,Bronze desc
the result :
country name Gold Silver Bronze Total
azerbaijan yewtuuy 1 2 0 3
azerbaijan ghg 0 3 1 4
azerbaijan ahmad 0 0 1 1
iran habib 2 1 2 5
iran so 0 1 0 1
turkey mehmad 1 1 0 2