how to use count and distinct together - sql

I have a table of country names, and the names are duplicated. For example, say there are 8 rows in the table: 5 rows with country Germany and 3 with country UK. I want to count the distinct countries in the table (here I should get the number 2), but I am unable to come up with the query.
I tried SELECT Country FROM Customers; but that gives me 8 rows. I tried SELECT DISTINCT Country FROM Customers; but that gives me 2 rows rather than the number 2. I tried using count as SELECT DISTINCT Count(Country) FROM Customers; but I get 8 (probably because DISTINCT is applied to the result set of SELECT Count(Country) FROM Customers;). How can I get 2?

You can use distinct inside count:
select count(distinct country)
from customers;
Which is equivalent to:
select count(*)
from (
select distinct country
from customers
where country is not null
) t;
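To check the equivalence above, here is a minimal sketch that runs both forms against an in-memory SQLite database through Python's sqlite3 module. The sample rows (5 Germany, 3 UK, plus one NULL added for illustration) and the id column are assumptions, not part of the question.

```python
import sqlite3

# Hypothetical sample data: 5 x Germany, 3 x UK, and one NULL country.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, country TEXT)")
rows = [("Germany",)] * 5 + [("UK",)] * 3 + [(None,)]
conn.executemany("INSERT INTO customers (country) VALUES (?)", rows)

# COUNT(DISTINCT ...) skips NULLs and counts each country once.
distinct_count = conn.execute(
    "SELECT COUNT(DISTINCT country) FROM customers"
).fetchone()[0]

# The derived-table form needs the explicit NULL filter to match.
subquery_count = conn.execute(
    """
    SELECT COUNT(*)
    FROM (SELECT DISTINCT country FROM customers WHERE country IS NOT NULL) t
    """
).fetchone()[0]

# DISTINCT placed outside COUNT only de-duplicates the single result row.
misplaced_distinct = conn.execute(
    "SELECT DISTINCT COUNT(country) FROM customers"
).fetchone()[0]

print(distinct_count, subquery_count, misplaced_distinct)  # 2 2 8
```

Without the WHERE country IS NOT NULL filter, the derived-table form would count the NULL as a third value on this data.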

Use DISTINCT inside COUNT:
SELECT count( distinct Country) FROM Customers

You can use distinct country within count as below:
SELECT count(DISTINCT country)
FROM customers;
If you also want each country name together with its number of rows, group by country instead:
SELECT count(1), country
FROM customers
GROUP BY country;

Here is one way to do this using analytic functions:
SELECT ROW_NUMBER() OVER (ORDER BY COUNT(*)) cnt
FROM customers
GROUP BY country
ORDER BY cnt DESC
LIMIT 1;

Related

output a number — the number of students whose name (first_name) matches the name of at least one other student

That is, if there are two Johns and one Quill, I need to output the number of people who share a name with someone else (here, 2). The result should be a single column with the total number of students whose names are shared.
SELECT COUNT(id) as count
FROM student
GROUP BY LOWER(first_name) HAVING COUNT(LOWER(first_name)) > 1;
This outputs the count for each name. How do I get the total?
In order to get the total, select from your query result and add the counts up.
SELECT SUM(cnt)
FROM
(
SELECT COUNT(*) AS cnt
FROM student
GROUP BY LOWER(first_name)
HAVING COUNT(*) > 1
) counts;
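As a quick check of the sum-over-subquery approach, here is a small sketch using Python's sqlite3 module. The student rows are made up for illustration: "John" and "john" share a name case-insensitively, "Ann" appears three times, and "Quill" is unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, first_name TEXT)")
# Hypothetical sample rows.
names = ["John", "john", "Quill", "Ann", "Ann", "Ann"]
conn.executemany(
    "INSERT INTO student (first_name) VALUES (?)", [(n,) for n in names]
)

# Inner query: one row per shared name with its head count.
# Outer query: add those head counts up.
total = conn.execute(
    """
    SELECT SUM(cnt)
    FROM (
        SELECT COUNT(*) AS cnt
        FROM student
        GROUP BY LOWER(first_name)
        HAVING COUNT(*) > 1
    ) counts
    """
).fetchone()[0]

print(total)  # 5: two Johns plus three Anns
```

Note that SUM returns NULL (None in Python) rather than 0 when no name is shared at all; wrapping it in COALESCE(SUM(cnt), 0) covers that case.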
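Please try the following (nesting one aggregate inside another like this works in Oracle, but most other databases reject it):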
SELECT SUM(COUNT(id)) as Total
FROM student
GROUP BY LOWER(first_name) HAVING COUNT(LOWER(first_name)) > 1;

Oracle SQL distinct Count(*) with 3 columns

I'm trying to display 3 columns in a table.
Something like:
ZIPCODE   SUBSCRIBERS   MEMBERS
12345     5             10
12346     3             8
Each row is a distinct zipcode with the number of "subscribers" within it. A subscriber is the original employee, which can be defined as DEPNO=0 (the original employee rather than a dependent). The members are simply everyone in the zipcode. I am pulling from a table called EMPDEP, and I can get the subscriber count with a statement like this:
SELECT DISTINCT ZIPCODE, COUNT(*) OVER (PARTITION BY ZIPCODE) as Subscribers FROM EMPDEP where depno=0
This statement will get me a Subscriber count but I want the total member count in there as well which would just be
SELECT DISTINCT ZIPCODE, COUNT(*) OVER (PARTITION BY ZIPCODE) as Members FROM EMPDEP
but getting all 3 of these in 1 query is killing me as I can't get the nesting down correctly, at least I'm assuming I will need that?
Any tips on how to do this?
Huh? Why are you using window functions? Just use aggregation:
SELECT ZIPCODE, COUNT(*) as Members,
SUM(CASE WHEN depno = 0 THEN 1 ELSE 0 END) as Subscribers
FROM EMPDEP
GROUP BY ZIPCODE;
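The conditional-aggregation query can be tried out with a small sketch on SQLite via Python's sqlite3 module. The rows below are invented so that the output matches the table in the question; the ORDER BY is added only to make the result order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPDEP (ZIPCODE TEXT, DEPNO INTEGER)")
# Hypothetical data: DEPNO = 0 marks the subscriber, anything else a dependent.
sample = (
    [("12345", 0)] * 5 + [("12345", 1)] * 5
    + [("12346", 0)] * 3 + [("12346", 1)] * 3 + [("12346", 2)] * 2
)
conn.executemany("INSERT INTO EMPDEP (ZIPCODE, DEPNO) VALUES (?, ?)", sample)

# One pass over the table: COUNT(*) counts everyone,
# the CASE expression counts only the DEPNO = 0 rows.
per_zip = conn.execute(
    """
    SELECT ZIPCODE,
           SUM(CASE WHEN DEPNO = 0 THEN 1 ELSE 0 END) AS Subscribers,
           COUNT(*) AS Members
    FROM EMPDEP
    GROUP BY ZIPCODE
    ORDER BY ZIPCODE
    """
).fetchall()

print(per_zip)  # [('12345', 5, 10), ('12346', 3, 8)]
```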

SQL Select column which is not used in select section of subquery which find duplicates

I am trying to find records in my database that have duplicated fields such as name, surname and type.
Example:
SELECT name, surname, type, COUNT(*)
FROM customers
GROUP BY name, surname
HAVING COUNT(*)>1
Query results:
Robb|Stark|1|2
Tyrion|Lannister|1|3
So we have duplicated customers: "Robb Stark" appears 2 times and "Tyrion Lannister" 3 times.
Now I want to know the ids of these records.
I found a similar problem described here:
Finding duplicate values in a SQL table
There is an answer there, but no example.
Use COUNT as an analytic function:
WITH cte AS (
SELECT *, COUNT(*) OVER (PARTITION BY name, surname) cnt
FROM customers
)
SELECT * -- return all columns
FROM cte
WHERE cnt > 1
ORDER BY name, surname;
The simplest way is to use EXISTS as follows:
SELECT t.*
FROM customers t
where exists
(select 1 from customers tt
where tt.name = t.name
and tt.surname = t.surname
and tt.id <> t.id)
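Here is a minimal sketch of the EXISTS approach on SQLite via Python's sqlite3 module. The customers rows and the type values are made up; only the shape of the query comes from the answer above, with an ORDER BY added for a stable result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, surname TEXT, type INTEGER)"
)
# Hypothetical rows: ids 1-2 and 3-5 are duplicates, id 6 is unique.
sample = [
    ("Robb", "Stark", 1), ("Robb", "Stark", 1),
    ("Tyrion", "Lannister", 1), ("Tyrion", "Lannister", 1), ("Tyrion", "Lannister", 1),
    ("Jon", "Snow", 1),
]
conn.executemany("INSERT INTO customers (name, surname, type) VALUES (?, ?, ?)", sample)

# A row is a duplicate if another row with a different id
# has the same name and surname.
duplicate_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT t.id
        FROM customers t
        WHERE EXISTS (
            SELECT 1 FROM customers tt
            WHERE tt.name = t.name
              AND tt.surname = t.surname
              AND tt.id <> t.id
        )
        ORDER BY t.id
        """
    )
]

print(duplicate_ids)  # [1, 2, 3, 4, 5]
```

Because the comparison uses =, rows whose name or surname is NULL never match each other here, unlike with GROUP BY, which puts NULLs in one group.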
Or use your original query in IN clause as follows:
select * from customers where (name, surname) in
(SELECT name, surname
FROM customers
GROUP BY name, surname
HAVING COUNT(*)>1)
If you want one row per group of duplicates, with the list of ids in a comma-separated string, you can just use string aggregation with your existing query:
SELECT name, surname, COUNT(*) as cnt,
STRING_AGG(id, ',') WITHIN GROUP (ORDER BY id) as all_ids
FROM customers
GROUP BY name, surname
HAVING COUNT(*) > 1

Column must appear in group by or aggregate function in nested query

I have the following table.
Fights (fight_year, fight_round, winner, fid, city, league)
I am trying to query the following:
For each year that appears in the Fights table, find the city that held the most fights. For example, if in year 1992, Jersey held more fights than any other city did, you should print out (1992, Jersey)
Here's what I have so far but I keep getting the following error. I am not sure how I should construct my group by functions.
ERROR: column, 'ans.fight_round' must appear in the GROUP BY clause or be used in an aggregate function. Line 3 from (select *
select fight_year, city, max(*)
from (select *
from (select *
from fights as ans
group by (fight_year)) as l2
group by (ans.city)) as l1;
In Postgres, I would recommend aggregation and distinct on:
select distinct on (fight_year) fight_year, city, count(*) cnt
from fights
group by fight_year, city
order by fight_year, count(*) desc
This counts how many fights each city had each year, and keeps the city with the most fights per year.
If you want to allow ties, then use window functions:
select fight_year, city, cnt
from (
select fight_year, city, count(*) cnt,
rank() over(partition by fight_year order by count(*) desc) rn
from fights
group by fight_year, city
) f
where rn = 1
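The tie-allowing variant can be sketched on SQLite (3.25 or later, for window functions) through Python's sqlite3 module, using the table and column names from the question (fights, fight_year). The sample rows are invented, and the query is split into two CTEs (count first, then rank) purely for readability; that restructuring is mine, not the answer's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fights (fid INTEGER PRIMARY KEY, fight_year INTEGER, city TEXT)"
)
# Hypothetical data: Jersey leads 1992; Vegas and Reno tie in 1993.
sample = (
    [(1992, "Jersey")] * 3 + [(1992, "Vegas")]
    + [(1993, "Vegas")] * 2 + [(1993, "Reno")] * 2
)
conn.executemany("INSERT INTO fights (fight_year, city) VALUES (?, ?)", sample)

top_cities = conn.execute(
    """
    WITH counts AS (
        -- fights per (year, city)
        SELECT fight_year, city, COUNT(*) AS cnt
        FROM fights
        GROUP BY fight_year, city
    ),
    ranked AS (
        -- RANK gives tied cities the same rank within a year
        SELECT fight_year, city, cnt,
               RANK() OVER (PARTITION BY fight_year ORDER BY cnt DESC) AS rn
        FROM counts
    )
    SELECT fight_year, city, cnt
    FROM ranked
    WHERE rn = 1
    ORDER BY fight_year, city
    """
).fetchall()

print(top_cities)  # [(1992, 'Jersey', 3), (1993, 'Reno', 2), (1993, 'Vegas', 2)]
```

Swapping RANK for ROW_NUMBER would return exactly one city per year and break the 1993 tie arbitrarily.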
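Although the window-function approach in @GMB's answer is the easiest way, you can try this alternative as well. It keeps each (city, year) pair whose fight count is at least that of every other city in the same year:
select city, fight_year
from fights f
group by city, fight_year
having count(*) >= all (select count(*) from fights f2
where f2.fight_year = f.fight_year group by f2.city)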

how to find difference between no_of_value and no_of_distinct columns values?

Let N be the number of CITY entries in STATION, and let N' be the number of distinct CITY names in STATION; query the value of N - N' from STATION. In other words, find the difference between the total number of CITY entries in the table and the number of distinct CITY entries in the table.
Input Format
The STATION table is described as follows:
[Image: STATION table description]
where LAT_N is the northern latitude and LONG_W is the western longitude.
Use distinct in count function.
select count(city) - count(distinct city)
from station
SELECT count(city) - count(DISTINCT city) FROM station;
Do not forget to add semicolon ';' after the query
You could use HAVING to filter the result on an aggregate function:
select city, count(*), count(distinct city)
from station
group by city
having count(*) <> count(distinct city)
If I understand correctly:
select count(city) - count(distinct city)
from station;
You would do this to get the number of duplicated values in the table. I might be more interested in the list of cities and the number of duplicates:
select city, count(*) - 1 as numdups
from station
group by city
having count(*) > 1;
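Both queries from this answer can be checked with a short sketch on SQLite via Python's sqlite3 module. The STATION table is reduced to an id and a CITY column, and the city names are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE station (id INTEGER PRIMARY KEY, city TEXT)")
# Hypothetical data: 6 entries, 3 distinct cities.
cities = ["Boston", "Boston", "Austin", "Denver", "Denver", "Denver"]
conn.executemany("INSERT INTO station (city) VALUES (?)", [(c,) for c in cities])

# Total entries minus distinct entries = number of surplus (duplicate) rows.
difference = conn.execute(
    "SELECT COUNT(city) - COUNT(DISTINCT city) FROM station"
).fetchone()[0]

# Per-city breakdown of the same number.
dups_per_city = conn.execute(
    """
    SELECT city, COUNT(*) - 1 AS numdups
    FROM station
    GROUP BY city
    HAVING COUNT(*) > 1
    ORDER BY city
    """
).fetchall()

print(difference)     # 3
print(dups_per_city)  # [('Boston', 1), ('Denver', 2)]
```

The per-city numdups values add up to the overall difference, which is a handy sanity check.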