SQL - Distribution Count - sql

Hi I have the following table:
crm_id | customer_id
jon 12345
jon 12346
ben 12347
sam 12348
I would like to show the following:
Crm_ID count | Number of customer_ids
1 2
2 1
Basically I want to count the number of crm_ids that have 1, 2, 3, 4, or 5+ customer_ids.
Thanks

One approach is to aggregate twice. First, aggregate over crm_id and generate counts. Then, aggregate over those counts themselves and generate a count of counts.
SELECT
    COUNT(*) AS crm_id_cnt,
    cnt AS num_customer_ids
FROM
(
    SELECT crm_id, COUNT(DISTINCT customer_id) AS cnt
    FROM yourTable
    GROUP BY crm_id
) t
GROUP BY cnt;
The query is written for MySQL, as you did not specify a particular database, though it should run on most databases.
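The question also asks for a single 5+ bucket, which the two-level aggregation can provide with a CASE expression. The following is a minimal sketch, assuming SQLite through Python's sqlite3 module; the extra crm_id "amy" with six customers is hypothetical data added only to exercise the 5+ bucket, and grouping by a column alias is accepted by SQLite and MySQL but not by every database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (crm_id TEXT, customer_id INTEGER)")

# Rows from the question.
rows = [("jon", 12345), ("jon", 12346), ("ben", 12347), ("sam", 12348)]
# Hypothetical crm_id with six customers, added only to show the 5+ bucket.
rows += [("amy", 20000 + i) for i in range(6)]
conn.executemany("INSERT INTO yourTable VALUES (?, ?)", rows)

query = """
SELECT
    COUNT(*) AS crm_id_cnt,
    CASE WHEN cnt >= 5 THEN '5+' ELSE CAST(cnt AS TEXT) END AS num_customer_ids
FROM
(
    -- First level: customers per crm_id.
    SELECT crm_id, COUNT(DISTINCT customer_id) AS cnt
    FROM yourTable
    GROUP BY crm_id
) t
-- Second level: how many crm_ids fall into each bucket.
GROUP BY num_customer_ids
ORDER BY num_customer_ids
"""
distribution = conn.execute(query).fetchall()
print(distribution)
```

With the sample rows, two crm_ids (ben, sam) land in bucket 1, one (jon) in bucket 2, and the hypothetical one in 5+.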

Related

count different column values after grouping by

Consider this table:
id name department email
1 Alex IT blah@gmail.com
1 Alex IT blah@gmail.com
2 Jay HR jay@gmail.com
2 Jay Marketing zou@gmail.com
If I group by id, name and count, I get:
id name count(*)
1 Alex 2
2 Jay 2
With this query:
select id,name,count(*) from tb group by id,name;
However, I would like to count only the records within each group whose department, email values differ from the others, so as to have:
id name count(*)
1 Alex 0
2 Jay 1
This time the count for the first group (1, Alex) is 0 because its department, email values are identical (duplicated). The count for (2, Jay) is 1 because department, email has one additional, different value.
If you meant "two different values" for "Jay", you can use distinct:
select id, name, count(*) from (select distinct * from tb) t group by id, name;
Use count(*) - 1 instead of count(*) to get the results shown in your question.
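To see how the DISTINCT subquery and the "- 1" adjustment combine, here is a small sketch that runs the query against the sample rows. It assumes SQLite via Python's sqlite3 module; the column alias extra_variants is a hypothetical name chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tb (id INTEGER, name TEXT, department TEXT, email TEXT)")
conn.executemany("INSERT INTO tb VALUES (?, ?, ?, ?)", [
    (1, "Alex", "IT", "blah@gmail.com"),
    (1, "Alex", "IT", "blah@gmail.com"),
    (2, "Jay", "HR", "jay@gmail.com"),
    (2, "Jay", "Marketing", "zou@gmail.com"),
])

query = """
SELECT id, name, COUNT(*) - 1 AS extra_variants
FROM (SELECT DISTINCT * FROM tb) t   -- collapse fully duplicated rows first
GROUP BY id, name
ORDER BY id
"""
result = conn.execute(query).fetchall()
print(result)
```

The two identical Alex rows collapse to one before counting, so Alex ends at 0, while Jay keeps two distinct rows and ends at 1.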

group by count sql

I have a script
SELECT customer_id, COUNT(*)
FROM devices
GROUP BY customer_id
ORDER BY 2 desc;
and the result is
customer_id count
1234 8
4567 7
8910 7
1112 7
1314 5
1516 5
but what I expect is
devices customer
8 1
7 3
5 2
Is there any way without using a new table?
Use your current query as a source for another:
with temp as
-- this is your current query
(select customer_id,
count(*) cnt
from devices
group by customer_id
)
select cnt as devices,
count(*) as customer
from temp
group by cnt;
The usual use of count(*) is to find out how many rows have been aggregated by the group function. What you want to do is aggregate your result in a second select, and ask for count(*) a second time to get the result you've described.
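The second aggregation can be checked against the numbers in the question. This is a sketch only, assuming SQLite via Python's sqlite3 module; the device_id column and the CTE name device_counts are hypothetical, and an ORDER BY is added because the row order of a GROUP BY result is otherwise not guaranteed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# device_id is a hypothetical surrogate key; only customer_id matters here.
conn.execute("CREATE TABLE devices (device_id INTEGER PRIMARY KEY, customer_id INTEGER)")

# Device counts per customer, as listed in the question.
per_customer = {1234: 8, 4567: 7, 8910: 7, 1112: 7, 1314: 5, 1516: 5}
for customer_id, n in per_customer.items():
    conn.executemany("INSERT INTO devices (customer_id) VALUES (?)",
                     [(customer_id,)] * n)

query = """
WITH device_counts AS (
    -- the original query: devices per customer
    SELECT customer_id, COUNT(*) AS cnt
    FROM devices
    GROUP BY customer_id
)
SELECT cnt AS devices, COUNT(*) AS customer
FROM device_counts
GROUP BY cnt
ORDER BY cnt DESC
"""
result = conn.execute(query).fetchall()
print(result)
```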

Is there an analytic function for count in oracle sql

select manager, count(*) over (partition by manager) cnt
from dbtable
group by manager
This will provide me the count of manager but if I need a count of senior_manager how will I get it?
|--------------------|------------------|
| Manager |Senior_Manager |
|--------------------|------------------|
| John |Arpit |
| John |govind |
| John |olive |
| Domnic |kelvin |
| Domnic |paul |
|--------------------|------------------|
Result
John 3
Domnic 2
Your code returns "1" for all managers -- because it counts the number of rows after the group by.
If you want to count the number of rows in the table for a given manager, then you want aggregation, not analytic functions:
Select manager, count(*) as cnt
from dbtable
group by manager;
I'm not sure if this answers your question, but it at least addresses the issue that your query does not do much that is useful.
EDIT:
For the revised question, it simply seems:
Select senior_manager, count(*) as cnt
from dbtable
group by senior_manager;
The result you wanted can be retrieved by
select manager, count(*) over (partition by manager) cnt
from dbtable
This means each row is associated with the count of rows in the partition whose manager value equals that row's manager. According to the table above, this is what you expect to get.
Your example:
select manager, count(*) over (partition by manager) cnt
from dbtable
group by manager
Yields the following results:
MANAGER CNT
Domnic 1
John 1
If you drop the group by, you get:
MANAGER CNT
Domnic 2
Domnic 2
John 3
John 3
John 3
Are those the counts you're looking for? If so, then you can eliminate the duplicate rows with distinct:
select distinct manager, count(*) over (partition by manager) cnt
from dbtable
Which gives:
MANAGER CNT
John 3
Domnic 2
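The three variants discussed in these answers can be compared side by side. The sketch below assumes SQLite 3.25 or later (the first release with window functions) through Python's sqlite3 module rather than Oracle, so it illustrates the semantics only; the helper run is a hypothetical convenience function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dbtable (manager TEXT, senior_manager TEXT)")
conn.executemany("INSERT INTO dbtable VALUES (?, ?)", [
    ("John", "Arpit"), ("John", "govind"), ("John", "olive"),
    ("Domnic", "kelvin"), ("Domnic", "paul"),
])

def run(sql):
    """Hypothetical helper: execute a query and return all rows."""
    return conn.execute(sql).fetchall()

# Window function evaluated after GROUP BY: each partition holds one row.
windowed_after_group = run(
    "SELECT manager, COUNT(*) OVER (PARTITION BY manager) AS cnt "
    "FROM dbtable GROUP BY manager ORDER BY manager")

# Window function on the raw rows, duplicates removed with DISTINCT.
windowed_distinct = run(
    "SELECT DISTINCT manager, COUNT(*) OVER (PARTITION BY manager) AS cnt "
    "FROM dbtable ORDER BY cnt DESC")

# Plain aggregation gives the same counts.
grouped = run(
    "SELECT manager, COUNT(*) AS cnt FROM dbtable "
    "GROUP BY manager ORDER BY cnt DESC")

print(windowed_after_group, windowed_distinct, grouped)
```

Replacing manager with senior_manager in the grouped query gives the per-senior-manager counts asked about in the revised question.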

How do you determine the average total of a column in Postgresql?

Consider the following Postgresql database table:
id | book_id | author_id
---------------------------
1 | 1 | 1
2 | 2 | 1
3 | 3 | 2
4 | 4 | 2
5 | 5 | 2
6 | 6 | 3
7 | 7 | 2
In this example, Author 1 has written 2 books, Author 2 has written 4 books, and Author 3 has written 1 book. How would I determine the average number of books written by an author using SQL? In other words, I'm trying to get, "An author has written an average of 2.3 books".
Thus far, attempts with AVG and COUNT have failed me. Any thoughts?
select avg(totalbooks) from
(select count(1) totalbooks from books group by author_id) bookcount
I think your example data actually only has 3 books for author id 2, so this would not return 2.3
http://sqlfiddle.com/#!15/3e36e/1
With the 4th book:
http://sqlfiddle.com/#!15/67eac/1
You'll need a subquery. The inner query will count the books with GROUP BY author; the outer query will scan the results of the inner query and avg them.
You can use a subquery in the FROM clause for this, or you can use a CTE (WITH expression).
For an average number of books per author you can do simply:
SELECT 1.0*COUNT(DISTINCT book_id)/count(DISTINCT author_id) FROM tbl;
For number of books per author:
SELECT 1.0*COUNT(DISTINCT book_id)/count(DISTINCT author_id)
FROM tbl GROUP BY author_id;
The 1.0 factor is needed to force non-integer division.
You can remove DISTINCT depending on the result you want (it matters only if one book has many authors).
As Craig Ringer rightly pointed out, two DISTINCTs may be expensive. To test performance I generated 50 000 rows and got the following results:
My query with 2 DISTINCTS: ~70ms
My query with 1 DISTINCT: ~40ms
Martin Booth's approach: ~30ms
Then I added 1 million rows and tested again:
My query with 2 DISTINCTS: ~1520ms
My query with 1 DISTINCT: ~820ms
Martin Booth's approach: ~1060ms
Then I added another 9 million rows and tested again:
My query with 2 DISTINCTS: ~17s
My query with 1 DISTINCT: ~11s
Martin Booth's approach: ~19s
So there is no universal solution: which query is fastest depends on the size of the data.
This should work:
SELECT AVG(cnt) FROM (
SELECT COUNT(*) cnt FROM t
GROUP BY author_id
) s
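As a quick check that the subquery approach and the ratio of distinct counts agree on the sample data, here is a sketch assuming SQLite via Python's sqlite3 module instead of PostgreSQL; the table name books follows the first answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (id INTEGER, book_id INTEGER, author_id INTEGER)")
conn.executemany("INSERT INTO books VALUES (?, ?, ?)", [
    (1, 1, 1), (2, 2, 1), (3, 3, 2), (4, 4, 2),
    (5, 5, 2), (6, 6, 3), (7, 7, 2),
])

# Inner query counts books per author; outer query averages those counts.
avg_subquery = conn.execute("""
    SELECT AVG(cnt) FROM (
        SELECT COUNT(*) AS cnt FROM books
        GROUP BY author_id
    ) s
""").fetchone()[0]

# Ratio of distinct books to distinct authors; 1.0 avoids integer division.
avg_ratio = conn.execute("""
    SELECT 1.0 * COUNT(DISTINCT book_id) / COUNT(DISTINCT author_id)
    FROM books
""").fetchone()[0]

print(round(avg_subquery, 1), round(avg_ratio, 1))
```

Both come out to 7 books over 3 authors. The two approaches only diverge when a book has several authors, since the ratio counts such a book once while the subquery counts it once per author.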

How to produce detail, not summary, report sorted by count(*)?

Oracle 11g:
I want the results listed by highest count, then ch_id. When I use group by to get the count, I lose the granularity of the detail. Is there an analytic function I could use?
SALES
ch_id desc customer
=========================
ANAR Anari BOB
SWIS Swiss JOE
SWIS Swiss AMY
BRUN Brunost SAM
BRUN Brunost ANN
BRUN Brunost ROB
Desired Results
count ch_id customer
===========================================
3 BRUN ANN
3 BRUN ROB
3 BRUN SAM
2 SWIS AMY
2 SWIS JOE
1 ANAR BOB
Use the analytic count(*):
select * from
(
    select count(*) over (partition by ch_id) cnt,
           ch_id, customer
    from sales
)
order by cnt desc, ch_id
select b.total, s.ch_id, s.customer
from sales s
inner join (select count(*) total, ch_id from sales group by ch_id) b
on b.ch_id = s.ch_id
order by b.total desc, s.ch_id
OK, the other answer, posted at the same time and using partition, is the better solution for Oracle. But this one works regardless of database.
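Both approaches can be run against the sample SALES rows to confirm they produce the desired ordering. This is a sketch under stated assumptions: it uses SQLite 3.25 or later through Python's sqlite3 module rather than Oracle 11g, the desc column is renamed to description because DESC is a reserved word, and customer is added as a final sort key so the order within a ch_id is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "description" stands in for the question's desc column (DESC is reserved).
conn.execute("CREATE TABLE sales (ch_id TEXT, description TEXT, customer TEXT)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("ANAR", "Anari", "BOB"),
    ("SWIS", "Swiss", "JOE"), ("SWIS", "Swiss", "AMY"),
    ("BRUN", "Brunost", "SAM"), ("BRUN", "Brunost", "ANN"),
    ("BRUN", "Brunost", "ROB"),
])

# Analytic count: every detail row carries the size of its ch_id group.
analytic = conn.execute("""
    SELECT * FROM (
        SELECT COUNT(*) OVER (PARTITION BY ch_id) AS cnt, ch_id, customer
        FROM sales
    )
    ORDER BY cnt DESC, ch_id, customer
""").fetchall()

# Portable alternative: join the detail rows to a grouped subquery.
joined = conn.execute("""
    SELECT b.total, s.ch_id, s.customer
    FROM sales s
    INNER JOIN (SELECT COUNT(*) AS total, ch_id FROM sales GROUP BY ch_id) b
        ON b.ch_id = s.ch_id
    ORDER BY b.total DESC, s.ch_id, s.customer
""").fetchall()

for row in analytic:
    print(row)
```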