COUNT of DISTINCT items in a column - sql

Here's the SQL that works (strangely) but still just returns the COUNT of all items, not the COUNT of DISTINCT items in the column.
SELECT DISTINCT(COUNT(columnName)) FROM tableName;

SELECT COUNT(*) FROM tableName
counts all rows in the table,
SELECT COUNT(columnName) FROM tableName
counts all the rows in the table where columnName is not null, and
SELECT COUNT(DISTINCT columnName) FROM tableName
counts all the rows in the table where columnName is both not null and distinct (i.e. no two the same)
SELECT DISTINCT(COUNT(columnName)) FROM tableName
is the same as the second query (returning, say, 42); the DISTINCT is applied after the rows are counted, so it only ever sees a single-row result.
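To make the difference concrete, here is a minimal sketch that runs the variants against a tiny in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in here, the sample rows are made up for illustration, and the helper name scalar is hypothetical.

```python
import sqlite3

# Hypothetical sample data: five rows, one NULL, one duplicated value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableName (columnName TEXT)")
conn.executemany(
    "INSERT INTO tableName (columnName) VALUES (?)",
    [("a",), ("b",), ("b",), ("c",), (None,)],
)


def scalar(sql):
    """Run a query that returns one row with one column and return that value."""
    return conn.execute(sql).fetchone()[0]


# Every row, including the one where columnName is NULL.
all_rows = scalar("SELECT COUNT(*) FROM tableName")
# Only rows where columnName is not NULL.
non_null = scalar("SELECT COUNT(columnName) FROM tableName")
# Distinct non-NULL values.
distinct_non_null = scalar("SELECT COUNT(DISTINCT columnName) FROM tableName")
# DISTINCT applied to the one-row result of COUNT: same as non_null.
distinct_of_count = scalar("SELECT DISTINCT(COUNT(columnName)) FROM tableName")

print(all_rows, non_null, distinct_non_null, distinct_of_count)
```

With this sample data the last query returns the same number as the plain COUNT(columnName), which is the behaviour described in the question.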

You need
SELECT COUNT(DISTINCT columnName) AS Cnt
FROM tableName;
The query in your question computes the COUNT first (a result set with one row) and then applies DISTINCT to that single-row result, which has no effect.

SELECT COUNT(*) FROM (SELECT DISTINCT columnName FROM tableName) t;

Related

PostgreSQL create count, count distinct columns

I'm fairly new to PostgreSQL and trying out a few count queries. I'm looking to count, and count distinct, all values in a table. The result I'm after is pretty straightforward:
CountD Count
351 400
With a query like this:
SELECT COUNT(*)
COUNT(id) AS count_id,
COUNT DISTINCT(id) AS count_d_id
FROM table
I see that I can create a single column this way:
SELECT COUNT(*) FROM (SELECT DISTINCT id FROM table) AS count_d_id
But the title (count_d_id) doesn't come through properly, and I'm unsure how to add an additional column. Guidance appreciated.
This is the correct syntax:
SELECT COUNT(id) AS count_id,
COUNT(DISTINCT id) AS count_d_id
FROM table
Your original query aliases the subquery rather than the column. You seem to want:
SELECT COUNT(*) AS count_d_id FROM (SELECT DISTINCT id FROM table) t
-- column alias --^ -- subquery alias --^
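As a rough check of where each alias ends up, the sketch below runs both forms against an in-memory SQLite database from Python and reads the result column names from the cursor. SQLite stands in for PostgreSQL here, and the table name items and its rows are assumptions made for the example.

```python
import sqlite3

# Hypothetical table "items" with a duplicate id and one NULL id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER)")
conn.executemany(
    "INSERT INTO items (id) VALUES (?)",
    [(1,), (1,), (2,), (3,), (None,)],
)

# All three counts in one query, each with its own column alias.
cur = conn.execute(
    """
    SELECT COUNT(*)           AS count_all,
           COUNT(id)          AS count_id,
           COUNT(DISTINCT id) AS count_d_id
    FROM items
    """
)
column_names = [d[0] for d in cur.description]
counts = cur.fetchone()

# Derived-table form: the column alias goes on COUNT(*), the subquery gets its own alias.
cur = conn.execute(
    "SELECT COUNT(*) AS count_d_id FROM (SELECT DISTINCT id FROM items) t"
)
subquery_column = cur.description[0][0]
subquery_count = cur.fetchone()[0]

print(column_names, counts, subquery_column, subquery_count)
```

Note that with a NULL id in the table the derived-table form reports one more than COUNT(DISTINCT id), because SELECT DISTINCT keeps NULL as a row of its own.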

SQL/HIVE - Distinct count query - How does SELECT COUNT (DISTINCT columns,..) differ from SELECT COUNT(*) with subquery of DISTINCT records

In Hive, I tried getting the count of distinct rows using two methods:
SELECT COUNT (*) FROM (SELECT DISTINCT columns FROM table);
SELECT COUNT (DISTINCT columns) FROM table;
Both are yielding DIFFERENT RESULTS.
The count for the first query is greater than the second query.
How are they working differently?
Thanks in advance.
Make a slight change to your query: name your subquery, e.g.:
SELECT COUNT (*) FROM (SELECT DISTINCT columns FROM table) myquery;
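COUNT(DISTINCT ...) ignores NULLs, whereas SELECT DISTINCT keeps NULL as one distinct row, which is why the first count is larger. Mapping NULL to a placeholder makes the two agree. Try this in Hive: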
SELECT COUNT (DISTINCT nvl(columns,'NA')) FROM table;
or:
SELECT COUNT (DISTINCT coalesce(columns,'NA')) FROM table;
Provided 'NA' does not occur as a real value in the column, the output of the queries above will be the same as:
SELECT COUNT (*) FROM (SELECT DISTINCT columns FROM table) myquery;
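The NULL handling behind this can be reproduced outside Hive. The sketch below uses an in-memory SQLite database through Python's sqlite3 module as a stand-in; the table name t, the column name col and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical data: two distinct non-NULL values plus two NULLs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col TEXT)")
conn.executemany(
    "INSERT INTO t (col) VALUES (?)",
    [("x",), ("x",), ("y",), (None,), (None,)],
)


def scalar(sql):
    """Return the single value produced by a one-row, one-column query."""
    return conn.execute(sql).fetchone()[0]


# SELECT DISTINCT keeps NULL as one distinct row, so it is counted.
via_subquery = scalar("SELECT COUNT(*) FROM (SELECT DISTINCT col FROM t) myquery")
# COUNT(DISTINCT col) skips NULLs entirely.
via_count_distinct = scalar("SELECT COUNT(DISTINCT col) FROM t")
# Replacing NULL with a placeholder makes COUNT(DISTINCT ...) see it as a value.
via_coalesce = scalar("SELECT COUNT(DISTINCT COALESCE(col, 'NA')) FROM t")

print(via_subquery, via_count_distinct, via_coalesce)
```

If the column could legitimately contain the placeholder string, the COALESCE variant would merge it with the NULLs, so the placeholder should be a value known not to occur in the data.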

COUNT() doesn't work with GROUP BY?

SELECT COUNT(*) FROM table GROUP BY column
I get the total number of rows from table, not the number of rows after GROUP BY. Why?
Because that is how group by works. It returns one row for each identified group of rows in the source data. In this case, it will give the count for each of those groups.
To get what you want:
select count(distinct column)
from table;
EDIT:
As a slight note, if column can be NULL, then the real equivalent is:
select (count(distinct column) +
max(case when column is null then 1 else 0 end)
)
from table;
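A small sketch of why the NULL adjustment matters, run against in-memory SQLite from Python as a stand-in for whichever database is in use. The table name t, the column name col and the rows are made up for the example.

```python
import sqlite3

# Hypothetical data: values a, a, b and two NULLs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col TEXT)")
conn.executemany(
    "INSERT INTO t (col) VALUES (?)",
    [("a",), ("a",), ("b",), (None,), (None,)],
)

# GROUP BY returns one row per group; NULLs form a group of their own.
per_group = dict(conn.execute("SELECT col, COUNT(*) FROM t GROUP BY col").fetchall())
number_of_groups = len(per_group)

# COUNT(DISTINCT col) alone ignores the NULL group.
distinct_only = conn.execute("SELECT COUNT(DISTINCT col) FROM t").fetchone()[0]

# Adding 1 when any NULL exists matches the number of groups.
adjusted = conn.execute(
    """
    SELECT COUNT(DISTINCT col)
           + MAX(CASE WHEN col IS NULL THEN 1 ELSE 0 END)
    FROM t
    """
).fetchone()[0]

print(per_group, number_of_groups, distinct_only, adjusted)
```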
Try this:
SELECT COUNT(*), column
FROM table
GROUP BY column

SELECT *, COUNT(*) in SQLite

If I perform a standard query in SQLite:
SELECT * FROM my_table
I get all records in my table as expected. If I perform the following query:
SELECT *, 1 FROM my_table
I get all records as expected, with the rightmost column holding '1' in every record. But if I perform the query:
SELECT *, COUNT(*) FROM my_table
I get only ONE row (with the rightmost column holding the correct count).
Why do I get this result? I'm not very good at SQL, so maybe this behavior is expected? It seems very strange and illogical to me :(.
SELECT *, COUNT(*) FROM my_table is not what you want, and it's not really valid SQL: you have to group by all the columns that are not aggregates.
You'd want something like
SELECT somecolumn,someothercolumn, COUNT(*)
FROM my_table
GROUP BY somecolumn,someothercolumn
If you want to count the number of records in your table, simply run:
SELECT COUNT(*) FROM your_table;
count(*) is an aggregate function. When aggregate functions are selected alongside ordinary columns, those columns need to be grouped for the results to be meaningful. You can read: count columns group by
If what you want is the total number of records in the table appended to each row you can do something like
SELECT *
FROM my_table
CROSS JOIN (SELECT COUNT(*) AS COUNT_OF_RECS_IN_MY_TABLE
FROM MY_TABLE)
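The behaviour from the question and the CROSS JOIN workaround can be checked with a short script using Python's built-in sqlite3 module. The single column name and the three rows are assumptions for illustration, as is the alias total.

```python
import sqlite3

# Hypothetical three-row table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (name TEXT)")
conn.executemany("INSERT INTO my_table (name) VALUES (?)", [("x",), ("y",), ("z",)])

# SQLite accepts bare columns next to an aggregate and collapses the
# result to a single row; which row supplies "name" is not something to rely on.
collapsed = conn.execute("SELECT *, COUNT(*) FROM my_table").fetchall()

# Cross joining with a one-row subquery appends the total to every row.
appended = conn.execute(
    """
    SELECT *
    FROM my_table
    CROSS JOIN (SELECT COUNT(*) AS total FROM my_table)
    """
).fetchall()

print(collapsed)
print(appended)
```

Other database engines may reject the first query outright instead of returning one row, so the cross join form is the more portable of the two.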

Selecting COUNT(*) with DISTINCT

In SQL Server 2005 I have a table cm_production that lists all the code that's been put into production. The table has a ticket_number, program_type, program_name and push_number along with some other columns.
GOAL: Count all the DISTINCT program names by program type and push number.
What I have so far is:
DECLARE @push_number INT;
SET @push_number = [HERE_ADD_NUMBER];
SELECT DISTINCT COUNT(*) AS Count, program_type AS [Type]
FROM cm_production
WHERE push_number=@push_number
GROUP BY program_type
This gets me partway there, but it's counting all the program names, not the distinct ones (which I don't expect it to do in that query). I guess I just can't wrap my head around how to tell it to count only the distinct program names without selecting them. Or something.
Count all the DISTINCT program names by program type and push number
SELECT COUNT(DISTINCT program_name) AS Count,
program_type AS [Type]
FROM cm_production
WHERE push_number=@push_number
GROUP BY program_type
DISTINCT COUNT(*) will return a row for each unique count. What you want is COUNT(DISTINCT <expression>): evaluates expression for each row in a group and returns the number of unique, non-null values.
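For a quick comparison of COUNT(*) and COUNT(DISTINCT program_name) per program type, here is a minimal sketch against in-memory SQLite via Python's sqlite3 module. SQLite stands in for SQL Server, so the variable becomes a ? parameter; the sample rows and the alias program_count are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE cm_production (
        ticket_number INTEGER,
        program_type  TEXT,
        program_name  TEXT,
        push_number   INTEGER
    )
    """
)
# Hypothetical rows: some programs were pushed under more than one ticket.
conn.executemany(
    "INSERT INTO cm_production VALUES (?, ?, ?, ?)",
    [
        (1, "report", "alpha", 7),
        (2, "report", "alpha", 7),
        (3, "report", "beta", 7),
        (4, "batch", "gamma", 7),
        (5, "batch", "gamma", 7),
        (6, "batch", "delta", 8),
    ],
)

push_number = 7  # plays the role of the T-SQL variable

# Counts every matching row per type, duplicates included.
all_rows = conn.execute(
    """
    SELECT COUNT(*) AS program_count, program_type
    FROM cm_production
    WHERE push_number = ?
    GROUP BY program_type
    ORDER BY program_type
    """,
    (push_number,),
).fetchall()

# Counts each program name once per type.
distinct_names = conn.execute(
    """
    SELECT COUNT(DISTINCT program_name) AS program_count, program_type
    FROM cm_production
    WHERE push_number = ?
    GROUP BY program_type
    ORDER BY program_type
    """,
    (push_number,),
).fetchall()

print(all_rows)
print(distinct_names)
```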
I needed to get the number of occurrences of each distinct value. The column contained Region info.
The simple SQL query I ended up with was:
SELECT Region, count(*)
FROM item
WHERE Region is not null
GROUP BY Region
Which would give me a list like, say:
Region, count
Denmark, 4
Sweden, 1
USA, 10
You have to create a derived table for the distinct columns and then query the count from that table:
SELECT COUNT(*)
FROM (SELECT DISTINCT column1,column2
FROM tablename
WHERE condition ) as dt
Here dt is a derived table.
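The derived-table form is mainly useful when the distinct combination spans several columns. A small sketch, using in-memory SQLite through Python as a stand-in and made-up rows, with the WHERE clause left out for brevity:

```python
import sqlite3

# Hypothetical rows: (1, 'a') appears twice, so there are three distinct pairs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (column1 INTEGER, column2 TEXT)")
conn.executemany(
    "INSERT INTO tablename (column1, column2) VALUES (?, ?)",
    [(1, "a"), (1, "a"), (1, "b"), (2, "a")],
)

# The inner query removes duplicate (column1, column2) pairs;
# the outer query counts what is left.
distinct_pairs = conn.execute(
    """
    SELECT COUNT(*)
    FROM (SELECT DISTINCT column1, column2
          FROM tablename) AS dt
    """
).fetchone()[0]

total_rows = conn.execute("SELECT COUNT(*) FROM tablename").fetchone()[0]

print(distinct_pairs, total_rows)
```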
SELECT COUNT(DISTINCT program_name) AS Count, program_type AS [Type]
FROM cm_production
WHERE push_number=@push_number
GROUP BY program_type
Try this:
SELECT
COUNT(program_name) AS [Count],program_type AS [Type]
FROM (SELECT DISTINCT program_name,program_type
FROM cm_production
WHERE push_number=@push_number
) dt
GROUP BY program_type
You can try the following query.
SELECT column1, COUNT(*) AS Count
FROM tablename
WHERE createddate >= '2022-07-01'::date
GROUP BY column1
Here is an example where you want the count for each pincode, which is stored as the last six characters of the address field:
SELECT DISTINCT
RIGHT (address, 6),
count(*) AS count
FROM
datafile
WHERE
address IS NOT NULL
GROUP BY
RIGHT (address, 6)
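The same grouping can be tried in SQLite from Python. SQLite has no RIGHT function, so this sketch swaps in substr(address, -6), which takes the last six characters; the addresses are invented, and the aliases pincode and cnt are hypothetical.

```python
import sqlite3

# Hypothetical addresses ending in a six-digit pincode, plus one NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datafile (address TEXT)")
conn.executemany(
    "INSERT INTO datafile (address) VALUES (?)",
    [
        ("12 Main Road, Pune 411001",),
        ("5 Lake View, Pune 411001",),
        ("9 Hill Street, Mumbai 400001",),
        (None,),
    ],
)

# Group on the trailing six characters and count the rows in each group.
pincode_counts = conn.execute(
    """
    SELECT substr(address, -6) AS pincode,
           COUNT(*)            AS cnt
    FROM datafile
    WHERE address IS NOT NULL
    GROUP BY substr(address, -6)
    ORDER BY pincode
    """
).fetchall()

print(pincode_counts)
```

Because GROUP BY already yields one row per pincode, the query gives the same rows with or without DISTINCT in the select list.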