MySQL query to return only duplicate entries with counts - sql

I have a legacy MySQL table called lnk_lists_addresses with columns list_id and address_id. I'd like to write a query that reports, with a count, every case where the same list_id-address_id combination appears more than once in the table.
I tried this...
SELECT count(*), list_id, address_id
FROM lnk_lists_addresses
GROUP BY list_id, address_id
ORDER BY count(*) DESC
LIMIT 20
It works, sort of, because there are fewer than 20 duplicates. But how would I return only the counts greater than 1?
I tried adding "WHERE count(*) > 1" before and after GROUP BY but got errors saying the statement was invalid.

SELECT count(*), list_id, address_id
FROM lnk_lists_addresses
GROUP BY list_id, address_id
HAVING count(*)>1
ORDER BY count(*) DESC
This combines my answer with Todd.Run's for a more "complete" answer. You want to use the HAVING clause:
http://dev.mysql.com/doc/refman/5.1/en/select.html

You want to use a "HAVING" clause. Its use is explained in the MySQL manual.
http://dev.mysql.com/doc/refman/5.1/en/select.html
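To see the difference between WHERE and HAVING in action, here is a minimal sketch that runs the query from the answers above. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, and the sample rows are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lnk_lists_addresses (list_id INTEGER, address_id INTEGER)")

# Made-up sample rows: (1, 10) appears three times, (2, 20) twice, (3, 30) once.
conn.executemany(
    "INSERT INTO lnk_lists_addresses VALUES (?, ?)",
    [(1, 10), (1, 10), (1, 10), (2, 20), (2, 20), (3, 30)],
)

# WHERE filters individual rows before grouping, so an aggregate is rejected there.
try:
    conn.execute(
        "SELECT COUNT(*), list_id, address_id FROM lnk_lists_addresses "
        "WHERE COUNT(*) > 1 GROUP BY list_id, address_id"
    )
    where_failed = False
except sqlite3.OperationalError:
    where_failed = True

# HAVING filters the groups after COUNT(*) has been computed.
duplicates = conn.execute(
    """
    SELECT COUNT(*), list_id, address_id
    FROM lnk_lists_addresses
    GROUP BY list_id, address_id
    HAVING COUNT(*) > 1
    ORDER BY COUNT(*) DESC
    """
).fetchall()

print(where_failed)  # True
print(duplicates)    # [(3, 1, 10), (2, 2, 20)]
```

The pair (3, 30) occurs only once, so HAVING drops it and only the duplicated combinations remain.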

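SELECT count(*) AS total, list_id, address_id
FROM lnk_lists_addresses
GROUP BY list_id, address_id
HAVING total > 1
ORDER BY total DESC
LIMIT 20
If you name the COUNT() field, you can use the alias later in the statement: in HAVING and ORDER BY, but not in WHERE, which is evaluated before the rows are grouped.
EDIT: switched the condition from WHERE to HAVING, which I had forgotten about (>_<)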

Related

How to query a column created by aggregate function in hive?

In Hive, I want to select the records with users >= 40. My table has a userid column, so I used
select title,sum(rating),count(userid) from table_name where count(userid)>=40
group by title order by rating desc
But it failed with an error saying that count can't be used in the where clause. I also tried using an alias, like
select title,sum(rating) as ratings,count(userid) as users where users>=40 group by title order by ratings desc
Here I also got stuck, with an error saying that users is not a column name in the table.
I need to get the titles with the highest ratings that have at least 40 users.
You want the having clause:
select title, sum(rating), count(userid)
from table_name
group by title
having count(userid) >= 40
order by sum(rating) desc;
In Hive, you may need to use a column alias, though:
select title, sum(rating) as rating, count(userid) as cnt
from table_name
group by title
having cnt >= 40
order by rating desc;

Selecting top 10 counts in SQLite

I have a table which records questions, their answers, and their authors. The column names are as follows:
id, question, answers, author
I would like to get a list of top 10 authors who have written the most questions. So it would need to first count the number of questions each author has written then sort them by the count then return the top 10.
This is in SQLite and I'm not exactly sure how to get the list of counts. The second part should be fairly simple as it's just an ORDER BY and a LIMIT 10. How can I get the counts into a list which I can select from?
SELECT COUNT(author)
,author
FROM table_name
GROUP BY author
ORDER BY COUNT(author) DESC LIMIT 10;
You can apply an order by clause to an aggregate query:
SELECT author, COUNT(*)
FROM mytable
GROUP BY author
ORDER BY 2 DESC
LIMIT 10
You could wrap your query as a subquery and then use LIMIT like this:
SELECT *
FROM (
SELECT author
,COUNT(*) AS cnt
FROM mytable
GROUP BY author
) t
ORDER BY t.cnt DESC
LIMIT 10;
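Since the question is about SQLite, the GROUP BY / ORDER BY / LIMIT approach from the answers above can be tried directly with Python's built-in sqlite3 module. This is only a sketch: the table name mytable follows the answers, the column names follow the question, and the sample rows and author names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable ("
    "id INTEGER PRIMARY KEY, question TEXT, answers TEXT, author TEXT)"
)

# Invented sample data: alice wrote 3 questions, bob 2, carol 1.
conn.executemany(
    "INSERT INTO mytable (question, answers, author) VALUES (?, ?, ?)",
    [
        ("q1", "a1", "alice"),
        ("q2", "a2", "bob"),
        ("q3", "a3", "alice"),
        ("q4", "a4", "carol"),
        ("q5", "a5", "alice"),
        ("q6", "a6", "bob"),
    ],
)

# Count questions per author, sort by that count, keep at most ten rows.
top_authors = conn.execute(
    """
    SELECT author, COUNT(*) AS cnt
    FROM mytable
    GROUP BY author
    ORDER BY cnt DESC
    LIMIT 10
    """
).fetchall()

print(top_authors)  # [('alice', 3), ('bob', 2), ('carol', 1)]
```

Authors with equal counts come back in no guaranteed order; adding a second sort key such as author to the ORDER BY would make ties deterministic.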

Using sql, how can I simultaneously count the filtered and unfiltered occurrences of an attribute?

So this is probably a simple question, but here goes. I have some filtered data that I get via a query like this:
SELECT DISTINCT account_id, count(*) as filtered_count
FROM my_table
WHERE attribute LIKE '%filter%'
GROUP BY account_id
ORDER BY account_id
This gives me an output table with two columns.
I'd like to add a third column,
count(*) as total_count
that counts the total number of occurrences of each account_id in the entire table (ignoring the filter).
How can I write the query for this three column table?
You can put a case expression inside the count function, then remove your where clause:
SELECT account_id,
count(case when attribute LIKE '%filter%' then 1 end) as filtered_count,
count(*) as total_count
FROM my_table
GROUP BY account_id
ORDER BY account_id;
Using DISTINCT, although not actually harmful to your query, was redundant because of the grouping, so I have removed it.
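As a quick check of the conditional-count idea, the following sketch runs that query against a few invented rows, assuming Python's built-in sqlite3 module rather than whatever database the question uses. The table and column names come from the question; the data is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (account_id INTEGER, attribute TEXT)")

# Invented rows: account 1 matches the filter twice out of three rows,
# account 2 once out of two, account 3 never.
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [
        (1, "has filter here"),
        (1, "other"),
        (1, "filter"),
        (2, "nothing"),
        (2, "x filter y"),
        (3, "none"),
    ],
)

# The CASE yields NULL for non-matching rows, and COUNT(expr) skips NULLs,
# so filtered_count counts only matches while COUNT(*) counts every row.
counts = conn.execute(
    """
    SELECT account_id,
           COUNT(CASE WHEN attribute LIKE '%filter%' THEN 1 END) AS filtered_count,
           COUNT(*) AS total_count
    FROM my_table
    GROUP BY account_id
    ORDER BY account_id
    """
).fetchall()

print(counts)  # [(1, 2, 3), (2, 1, 2), (3, 0, 1)]
```

Note that account 3 now appears with a filtered count of 0, whereas the original WHERE-based query would have left it out entirely.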
You'll have to use a case expression for counting with your filter:
SELECT DISTINCT account_id,
count(case when attribute LIKE '%filter%' then 1 else null end) as filtered_count,
count(*) as total_count
FROM my_table
GROUP BY account_id
ORDER BY account_id

SQL To get Distinct Name and Number from table

Looking for sql to get distinct names and count of those names from a sql table:
Structure:
id
name
other details
Do I use distinct to get each group and then count through those to get:
name1 count(name1)
name2 count(name2)
etc
Thanks
Rob.
When you want a COUNT() or a SUM(), you're using an AGGREGATE FUNCTION based on a GROUP BY clause.
As GROUP BY brings together all records with the same values specified in the GROUP BY columns, you're already getting the same effect as DISTINCT.
Except that DISTINCT doesn't allow aggregates, and GROUP BY does.
SELECT
name,
COUNT(*) AS count_of_name
FROM
yourTable
GROUP BY
name
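To illustrate the point that GROUP BY already yields one row per distinct value, here is a small sketch comparing the two. It assumes Python's built-in sqlite3 module and made-up names; yourTable and the column names follow the answer above, and the ORDER BY clauses are added here only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (id INTEGER PRIMARY KEY, name TEXT)")

# Made-up rows: 'ann' three times, 'bob' twice, 'cy' once.
conn.executemany(
    "INSERT INTO yourTable (name) VALUES (?)",
    [("ann",), ("bob",), ("ann",), ("cy",), ("ann",), ("bob",)],
)

# DISTINCT returns each name once, but carries no per-name count.
distinct_names = conn.execute(
    "SELECT DISTINCT name FROM yourTable ORDER BY name"
).fetchall()

# GROUP BY returns the same set of names and can aggregate within each group.
name_counts = conn.execute(
    """
    SELECT name, COUNT(*) AS count_of_name
    FROM yourTable
    GROUP BY name
    ORDER BY name
    """
).fetchall()

print(distinct_names)  # [('ann',), ('bob',), ('cy',)]
print(name_counts)     # [('ann', 3), ('bob', 2), ('cy', 1)]
```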
Try :
SELECT name, COUNT(*) FROM my_table GROUP BY name
Something like this?
select name,COUNT(name) FROM Persons GROUP BY name
In the end I used:
SELECT DISTINCT `school`,COUNT(`school`) AS cat_num FROM table GROUP BY school order by cat_num DESC

How can I count the non-unique combinations of values in MySQL?

I have a table with some legacy data that I suspect may be a little messed up. It is a many-to-many join table.
LIST_MEMBERSHIPS
----------------
list_id
address_id
I'd like to run a query that will count the occurrences of each list_id-address_id pair and show the occurrence count for each from highest to lowest number of occurrences.
I know it's got to involve COUNT() and GROUP BY, right?
select list_id, address_id, count(*) as count
from LIST_MEMBERSHIPS
group by 1, 2
order by 3 desc
You may find it useful to add the following between the group by and order by clauses, to keep only the duplicated pairs:
having count > 1
select count(*), list_id, address_id
from LIST_MEMBERSHIPS
group by list_id, address_id
order by count(*) desc