select and count where regexp group by count - sql

You can see what I'm trying to do here:
select *, count(*) as count
from `songs`
where `band` REGEXP '^[^[:alpha:]]'
group by `band`
order by `band` asc
bands can be:
avenged sevenfold
3 days grace
led zeppelin
98 mute
back street boys
beastie boys
I need this to select the bands whose first character is not a letter, and count how many rows exist for each band.
Unfortunately, my current query just seems to group all of the rows that match the REGEXP together.

In standard SQL you can't select columns that are neither in the GROUP BY clause nor wrapped in a group function (count, max, ...).
The WHERE clause is fine here: it discards unneeded rows before grouping, and its condition does not depend on a group value (the result of a group function).
ASC is the default sort order, so you don't need to specify it.
select band, count(*) as count
from songs
where band REGEXP '^[^[:alpha:]]'
group by band
order by band

Does doing the selection after the group help?
select `band`, count(*) as count
from `songs`
group by `band`
having `band` REGEXP '^[^[:alpha:]]'
order by `band` asc
Also you appear to be selecting columns that aren't in the group clause. Try:
select `band`, count(*) as count
from `songs`
where `band` REGEXP '^[^[:alpha:]]'
group by `band`
order by `band` asc
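For anyone who wants to try the corrected query without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. Assumptions: SQLite has no REGEXP function out of the box, so one is registered by hand; Python's re module does not understand the POSIX class [[:alpha:]], so the ASCII class [A-Za-z] is used instead; the function name count_non_alpha_bands and the sample rows are made up for illustration.

```python
import re
import sqlite3


def regexp(pattern, value):
    # SQLite rewrites "X REGEXP Y" as regexp(Y, X): the pattern comes first.
    if value is None:
        return 0
    return 1 if re.search(pattern, value) else 0


def count_non_alpha_bands(band_rows):
    """Count rows per band for bands whose first character is not a letter."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("REGEXP", 2, regexp)
    conn.execute("CREATE TABLE songs (band TEXT)")
    conn.executemany("INSERT INTO songs (band) VALUES (?)",
                     [(band,) for band in band_rows])
    rows = conn.execute(
        """
        SELECT band, COUNT(*) AS cnt
        FROM songs
        WHERE band REGEXP '^[^A-Za-z]'  -- ASCII stand-in for [[:alpha:]]
        GROUP BY band
        ORDER BY band
        """
    ).fetchall()
    conn.close()
    return rows


# Hypothetical sample data: one row per song, repeated bands have several songs.
sample = ["avenged sevenfold", "3 days grace", "3 days grace", "led zeppelin",
          "98 mute", "back street boys", "beastie boys", "beastie boys"]
print(count_non_alpha_bands(sample))
```

Selecting only the grouped column and the aggregate is what gives one row per band with its own count.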

Related

How to query a column created by aggregate function in hive?

In Hive, I want to select the records with users >= 40. My table has a userid column, so I used
select title,sum(rating),count(userid) from table_name where count(userid)>=40
group by title order by rating desc
But it showed an error saying that you can't use count in a where clause. I also tried using aliases, like
select title,sum(rating) as ratings,count(userid) as users where users>=40 group by title order by ratings desc
Here I also got stuck with an error saying that users is not a column name in the table.
I need to get the titles with the highest ratings that have at least 40 users.
You want the having clause:
select title, sum(rating), count(userid)
from table_name
group by title
having count(userid) >= 40
order by sum(rating) desc;
In Hive, you may need to use a column alias, though:
select title, sum(rating) as rating, count(userid) as cnt
from table_name
group by title
having cnt >= 40
order by rating desc;
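A quick way to check the HAVING approach without a Hive cluster is a small sketch against SQLite from Python. Assumptions: SQLite stands in for Hive here, the helper name titles_with_min_users and the sample rows are invented, and the threshold is passed as a parameter so the tiny data set can use 2 instead of 40.

```python
import sqlite3


def titles_with_min_users(rows, min_users):
    """Return (title, total rating, user count) for titles with enough users."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table_name (title TEXT, rating INTEGER, userid INTEGER)")
    conn.executemany("INSERT INTO table_name VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT title, SUM(rating) AS ratings, COUNT(userid) AS users
        FROM table_name
        GROUP BY title
        HAVING COUNT(userid) >= ?   -- filter on the aggregate after grouping
        ORDER BY SUM(rating) DESC
        """,
        (min_users,),
    ).fetchall()
    conn.close()
    return result


# Hypothetical ratings: (title, rating, userid)
sample_ratings = [("A", 5, 1), ("A", 4, 2), ("A", 3, 3),
                  ("B", 5, 1), ("B", 5, 2),
                  ("C", 1, 1)]
print(titles_with_min_users(sample_ratings, 2))
```

Title C is dropped because it has only one user; the remaining titles come back ordered by their summed rating.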

SQL: Find Top 2 and reverse order

I have the IMDB database; I am looking for the top two years in which the most movies were produced, and I have to sort them chronologically by year and print only the years.
I am trying to compute the list and then sort it 'the other way around' afterwards, but I cannot order by anything in the last ORDER BY clause, because the FROM clause does not refer to any table and instead opens the next statement. It also says "unknown column topTwo", so I cannot order my results accordingly.
What am I doing wrong?
SELECT *
FROM
(SELECT m.year, COUNT(*)
FROM movies as m
GROUP BY m.year
ORDER BY m.year DESC) AS topTwo
ORDER BY topTwo ASC
LIMIT 2;
I think you are looking for this:
SELECT topTwo.year
FROM (SELECT m.year, COUNT(*) as cnt
FROM movies m
GROUP BY m.year
ORDER BY COUNT(*) DESC
LIMIT 2
) topTwo
ORDER BY year ASC;
Notes:
The LIMIT goes in the subquery.
The COUNT(*) is given an alias.
The ORDER BY in the subquery is based on the count.
The ORDER BY in the outer query is based on the year.
You only seem to want the year, so the outer query only selects that column.
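The notes above can be tried out with a small sketch in Python's sqlite3 module. Assumptions: SQLite stands in for whatever engine hosts the IMDB data, the movies table is reduced to a single year column, and the function name top_two_years and the sample years are invented.

```python
import sqlite3


def top_two_years(years):
    """Return the two years with the most movies, in chronological order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE movies (year INTEGER)")
    conn.executemany("INSERT INTO movies (year) VALUES (?)",
                     [(y,) for y in years])
    rows = conn.execute(
        """
        SELECT topTwo.year
        FROM (SELECT m.year, COUNT(*) AS cnt
              FROM movies m
              GROUP BY m.year
              ORDER BY COUNT(*) DESC   -- rank years by number of movies
              LIMIT 2                  -- keep only the top two
             ) topTwo
        ORDER BY year ASC              -- then sort those two chronologically
        """
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


# Hypothetical data: 2005 has 4 movies, 1999 has 3, 1994 has 2, 2010 has 1.
sample_years = [1999] * 3 + [2005] * 4 + [2010] + [1994] * 2
print(top_two_years(sample_years))
```

If two years tie for second place, which one survives the LIMIT is not defined by this query; add a tie-breaker to the inner ORDER BY if that matters.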

Hive Script, DISTINCT with SUM

I am trying to find the distinct teams and then count how many teams a player played for in any single season. This is tripping me up, and of course I have a sample down below (the second query). The first query is my failed attempt.
SELECT o.id,o.year,COUNT(DISTINCT(o.team)) b JOIN
(SELECT id, year, team FROM batting
GROUP BY id,year,team
ORDER BY id DESC
LIMIT 25) o
0.id =b.id;
SELECT id, year, team FROM batting
GROUP BY id,year,team
ORDER BY id DESC
LIMIT 25;
produces
Ignore the ^A characters; I think they represent either a space or a comma, i.e. just the column separator.
Get the count of teams for each player for each year, order by the count descending, and take the first row:
SELECT id, year, COUNT(DISTINCT(team)) FROM batting
GROUP BY id,year
ORDER BY COUNT(DISTINCT(team)) DESC
LIMIT 1;
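To see what COUNT(DISTINCT ...) per player and year returns, here is a rough sketch run against SQLite from Python. Assumptions: SQLite stands in for Hive, the batting table is cut down to the three columns used in the question, and the function name most_teams_in_a_season and the sample rows are made up.

```python
import sqlite3


def most_teams_in_a_season(rows):
    """Return (id, year, distinct team count) for the busiest player-season."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE batting (id TEXT, year INTEGER, team TEXT)")
    conn.executemany("INSERT INTO batting VALUES (?, ?, ?)", rows)
    row = conn.execute(
        """
        SELECT id, year, COUNT(DISTINCT team) AS teams
        FROM batting
        GROUP BY id, year        -- one group per player per season
        ORDER BY teams DESC      -- most teams first
        LIMIT 1
        """
    ).fetchone()
    conn.close()
    return row


# Hypothetical rows; the repeated ("a", 2001, "X") is counted only once.
sample_batting = [("a", 2001, "X"), ("a", 2001, "Y"), ("a", 2001, "Z"),
                  ("a", 2001, "X"),
                  ("a", 2002, "X"), ("a", 2002, "Y"),
                  ("b", 2001, "X")]
print(most_teams_in_a_season(sample_batting))
```

Grouping by id and year only (not by team) is what lets the DISTINCT count collapse the teams inside each player-season.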

How can I count the non-unique combinations of values in MySQL?

I have a table with some legacy data that I suspect may be a little messed up. It is a many-to-many join table.
LIST_MEMBERSHIPS
----------------
list_id
address_id
I'd like to run a query that will count the occurrences of each list_id-address_id pair and show the occurrence count for each from highest to lowest number of occurrences.
I know it's got to involve COUNT() and GROUP BY, right?
select list_id, address_id, count(*) as count
from LIST_MEMBERSHIPS
group by 1, 2
order by 3 desc
You may find it useful to add
having count > 1
select count(*), list_id, address_id
from LIST_MEMBERSHIPS
group by list_id, address_id
order by count(*) desc
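Both answers can be checked with a short sketch that uses SQLite from Python in place of MySQL. Assumptions: the table is recreated in memory with just the two columns from the question, and the function name duplicate_pairs and the sample pairs are invented for illustration.

```python
import sqlite3


def duplicate_pairs(pairs):
    """Return (list_id, address_id, occurrences) for pairs seen more than once."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE list_memberships (list_id INTEGER, address_id INTEGER)")
    conn.executemany("INSERT INTO list_memberships VALUES (?, ?)", pairs)
    rows = conn.execute(
        """
        SELECT list_id, address_id, COUNT(*) AS occurrences
        FROM list_memberships
        GROUP BY list_id, address_id
        HAVING COUNT(*) > 1          -- keep only the repeated pairs
        ORDER BY COUNT(*) DESC
        """
    ).fetchall()
    conn.close()
    return rows


# Hypothetical data: (1, 10) appears three times, (2, 20) twice, (3, 30) once.
sample_pairs = [(1, 10), (1, 10), (1, 10), (2, 20), (2, 20), (3, 30)]
print(duplicate_pairs(sample_pairs))
```

Dropping the HAVING line gives the full occurrence count for every pair, which is the first form of the query asked for.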

MySQL query to return only duplicate entries with counts

I have a legacy MySQL table called lnk_lists_addresses with columns list_id and address_id. I'd like to write a query that reports all the cases where the same list_id-address_id combination appears more than once in the table with a count.
I tried this...
SELECT count(*), list_id, address_id
FROM lnk_lists_addresses
GROUP BY list_id, address_id
ORDER BY count(*) DESC
LIMIT 20
It works, sort of, because there are fewer than 20 duplicates. But how would I return only the counts greater than 1?
I tried adding "WHERE count(*) > 1" before and after GROUP BY but got errors saying the statement was invalid.
SELECT count(*), list_id, address_id
FROM lnk_lists_addresses
GROUP BY list_id, address_id
HAVING count(*)>1
ORDER BY count(*) DESC
To combine mine and Todd.Run's answers for a more "complete" answer. You want to use the HAVING clause:
http://dev.mysql.com/doc/refman/5.1/en/select.html
You want to use a "HAVING" clause. Its use is explained in the MySQL manual.
http://dev.mysql.com/doc/refman/5.1/en/select.html
SELECT count(*) AS total, list_id, address_id
FROM lnk_lists_addresses
GROUP BY list_id, address_id
HAVING total > 1
ORDER BY total DESC
LIMIT 20
If you name the COUNT() field, you can use it later in the statement (in HAVING and ORDER BY, but not in WHERE, which is evaluated before the rows are grouped).
EDIT: forgot about HAVING (>_<)
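The difference between filtering with WHERE and with HAVING can be seen in a small sketch run against SQLite from Python. Assumptions: SQLite stands in for MySQL (the exact error text differs between engines, so the sketch only records whether the query was rejected), and the helper name try_query and the sample rows are made up.

```python
import sqlite3

WHERE_SQL = """
    SELECT COUNT(*), list_id, address_id
    FROM lnk_lists_addresses
    WHERE COUNT(*) > 1
    GROUP BY list_id, address_id
"""

HAVING_SQL = """
    SELECT COUNT(*), list_id, address_id
    FROM lnk_lists_addresses
    GROUP BY list_id, address_id
    HAVING COUNT(*) > 1
"""


def try_query(sql):
    """Run sql on a tiny sample table; return its rows, or None if rejected."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE lnk_lists_addresses (list_id INTEGER, address_id INTEGER)")
    # Hypothetical data: (1, 10) is duplicated, (2, 20) is not.
    conn.executemany("INSERT INTO lnk_lists_addresses VALUES (?, ?)",
                     [(1, 10), (1, 10), (2, 20)])
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.Error:
        # An aggregate inside WHERE is refused before any row is read.
        return None
    finally:
        conn.close()


print(try_query(WHERE_SQL))
print(try_query(HAVING_SQL))
```

WHERE is applied to individual rows before grouping, so no count exists yet at that point; HAVING is applied to the groups afterwards.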