Select distinct column and get count of another column - sql

I have the following sqlfiddle:
CREATE TABLE tester(
name TEXT,
address TEXT
)
Each person in the table can have multiple addresses. I'd like to select every name that has more than one address, along with the number of addresses for that name. I have tried:
SELECT d.name, count(address) c FROM (SELECT DISTINCT ON(name) FROM tester) d
LEFT JOIN tester ON d.name = links.name
WHERE count(address) > 1
I get:
ERROR: syntax error at or near "FROM" Position: 64
I've also tried a DISTINCT ON query:
SELECT DISTINCT ON(name) name, count(address) FROM tester HAVING count(address) > 1
I get:
ERROR: column "tester.name" must appear in the GROUP BY clause or be used in an aggregate function Position: 26
I feel like I'm making this too difficult.

Simply use GROUP BY:
SELECT name, count(address)
FROM tester
GROUP BY name
HAVING count(address) > 1
GROUP BY in SQL (as in other languages) always produces one row per distinct group, so there is no need for DISTINCT in this case.
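To try the GROUP BY / HAVING query without a database server, here is a minimal sketch that runs it against an in-memory SQLite database through Python's sqlite3 module. The sample rows and the function name names_with_multiple_addresses are made up for illustration; the question itself targets PostgreSQL, where the query is written the same way.

```python
import sqlite3


def names_with_multiple_addresses(rows):
    """Return (name, address_count) for every name with more than one address."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tester (name TEXT, address TEXT)")
    conn.executemany("INSERT INTO tester VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT name, count(address) AS c
        FROM tester
        GROUP BY name              -- one output row per distinct name
        HAVING count(address) > 1  -- filter applied after aggregation
        ORDER BY name
        """
    ).fetchall()
    conn.close()
    return result


# Hypothetical sample data, not taken from the question.
sample = [
    ("alice", "1 Main St"),
    ("alice", "2 Oak Ave"),
    ("bob", "3 Elm Rd"),
    ("carol", "4 Pine Ln"),
    ("carol", "5 Birch Way"),
    ("carol", "6 Cedar Ct"),
]
print(names_with_multiple_addresses(sample))  # [('alice', 2), ('carol', 3)]
```

The filter has to go in HAVING rather than WHERE because WHERE is evaluated before the groups are formed, so an aggregate such as count(address) is not available there.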

You just need to use GROUP BY correctly, with a HAVING clause to keep only the names that have more than one row. Like this:
SELECT name, count(*)
FROM tester
GROUP BY name
HAVING count(*) > 1

Related

Perform Simple Group By in Google Big Query

I have the simplest query on Google BigQuery that keeps returning an error:
Grouping by expressions of type STRUCT is not allowed
I am simply trying to select a list of emails from two locations, union them in one CTE, and count the frequency in the CTE to identify duplicates.
This should be very easy. What am I missing?
with a as (select properties.email as email, 'loc1' as tag from `loc1.contacts`),
b as (select properties.email as email, 'loc2' as tag from `loc2.contacts`),
c as (
select * from a
union all
select * from b
)
select email, count(email) from c group by 1
sample data:
email/tag
bob@email.com/loc1
bob@email.com/loc2
expected results:
email/count
bob@email.com/2
It looks like I needed to add .value to actually get the value of the email field, since properties.email is a STRUCT rather than a plain string. The following query worked as desired:
with a as (select properties.email.value as email, 'loc1' as tag from `loc1.contacts`),
b as (select properties.email.value as email, 'loc2' as tag from `loc2.contacts`),
c as (
select * from a
union all
select * from b
)
select email, count(email) from c group by 1

Filter by number of occurrences in a SQL Table

Given the following table where the Name value might be repeated in multiple rows:
How can we determine how many times a Name value exists in the table, and can we filter on names that have a specific number of occurrences?
For instance, how can I filter this table to show only names that appear twice?
You can use GROUP BY and HAVING to show names that appear twice in the table:
select name, count(*) cnt
from mytable
group by name
having count(*) = 2
Then if you want the overall count of names that appear twice, you can add another level of aggregation:
select count(*) cnt
from (
select name
from mytable
group by name
having count(*) = 2
) t
It sounds like you're looking for a histogram of the frequency of name counts. Something like this:
with counts_cte(name, cnt) as (
select name, count(*)
from mytable
group by name)
select cnt, count(*) num_names
from counts_cte
group by cnt
order by 2 desc;
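As a rough illustration of what the histogram query returns, the sketch below runs it on an in-memory SQLite database via Python's sqlite3 module. The sample names and the function name name_count_histogram are assumptions for the example, and the result is ordered by cnt (rather than by the second column, descending) purely so the output is deterministic.

```python
import sqlite3


def name_count_histogram(names):
    """Return (cnt, num_names): how many names occur exactly cnt times."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (name TEXT)")
    conn.executemany("INSERT INTO mytable VALUES (?)", [(n,) for n in names])
    rows = conn.execute(
        """
        WITH counts_cte(name, cnt) AS (
            SELECT name, count(*)      -- occurrences per name
            FROM mytable
            GROUP BY name
        )
        SELECT cnt, count(*) AS num_names  -- names per occurrence count
        FROM counts_cte
        GROUP BY cnt
        ORDER BY cnt
        """
    ).fetchall()
    conn.close()
    return rows


# Hypothetical data: ann x2, bob x2, cy x1, dee x3.
names = ["ann", "ann", "bob", "bob", "cy", "dee", "dee", "dee"]
print(name_count_histogram(names))  # [(1, 1), (2, 2), (3, 1)]
```

The row (2, 2) answers the original question directly: two names appear exactly twice.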
You need to use a GROUP BY clause to count how many times each name is repeated:
select name, count(*) AS Repeated
from Your_Table_Name
group by name;
If you want to show only the names that are repeated more than once, use the query below:
select name, count(*) AS Repeated
from Your_Table_Name
group by name having count(*) > 1;

DISTINCT AND COUNT(*)=1 not working on SQL

I need to show the ID (which is unique in every case) and the name, which is sometimes different. In my code I only want to show the names IF they are unique.
I tried both DISTINCT and count(*)=1, but neither solves my problem.
SELECT DISTINCT id, name
FROM person
GROUP BY id, name
HAVING count(name) = 1;
The result still shows the names multiple times.
By "unique", I assume you mean names that only appear once. That is not what "distinct" means in SQL; the use of distinct is to remove duplicates (either for counting or in a result set).
If so:
SELECT MAX(id), name
FROM person
GROUP BY name
HAVING COUNT(*) = 1;
If your DBMS supports it, you can use a window function:
SELECT id, name
FROM (
SELECT id, name, COUNT(*) OVER(PARTITION BY name) AS NameCount -- get count of each name
FROM person
) src
WHERE NameCount = 1
If not, you can do:
SELECT id, name
FROM person
WHERE name IN (
SELECT name
FROM person
GROUP BY name
HAVING COUNT(*) = 1 -- Only get names that occur once
)
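The two portable approaches above (GROUP BY with MAX(id), and the IN subquery) should return the same rows, because a name that occurs once has exactly one id. Here is a small sketch that checks this on an in-memory SQLite database through Python's sqlite3 module; the sample rows and the function name unique_names are illustrative assumptions.

```python
import sqlite3


def unique_names(people):
    """Return (id, name) rows whose name occurs exactly once, computed two ways."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO person VALUES (?, ?)", people)

    # Variant 1: aggregate per name; MAX(id) is the only id in a group of one.
    via_group_by = conn.execute(
        """
        SELECT MAX(id) AS id, name
        FROM person
        GROUP BY name
        HAVING COUNT(*) = 1
        ORDER BY id
        """
    ).fetchall()

    # Variant 2: keep original rows whose name is in the set of one-off names.
    via_subquery = conn.execute(
        """
        SELECT id, name
        FROM person
        WHERE name IN (
            SELECT name FROM person GROUP BY name HAVING COUNT(*) = 1
        )
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return via_group_by, via_subquery


# Hypothetical data: 'ann' appears twice, 'bob' and 'cy' once each.
people = [(1, "ann"), (2, "bob"), (3, "ann"), (4, "cy")]
group_by_rows, subquery_rows = unique_names(people)
print(group_by_rows)  # [(2, 'bob'), (4, 'cy')]
```

Note that the question's original query grouped by id and name together; since id is unique, every group then contains exactly one row and the HAVING filter removes nothing.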

incorrect syntax near ';'

In SQL Server 2016 I get this error with this query:
select count(*)
from (select count(*), clave
from products
where state = 1
group by key
having count(*) > 1 );
I have tried copying and pasting the query into Notepad in case some invalid character or space had been inserted.
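You need an alias for the derived table, and the column inside it needs a name as well:
select count(*)
from (select count(*) as cnt -- derived-table columns must be named
from products
where [state] = 1
group by [key]
having count(*) > 1
) t; -- t alias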
Consider using only identifiers that are not reserved by SQL Server; key is reserved (which is what matters in your case), as are many others.
Second, when a query includes a GROUP BY clause, every column or expression in the SELECT list must either appear in the GROUP BY clause or be wrapped in an aggregate function.
KEY is a reserved word, so you cannot use it as an unquoted identifier; rename the column (or wrap it in brackets). The result of your subquery is also a derived table, so you should give it an alias.
Try something like
select count(*)
from (select count(*) AS cnt, key1
from products
where state = 1
group by key1
having count(*) > 1 ) AS product_alias;

Column is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause

I'm trying to select the latest date and group by name and keep other columns.
For example:
name status date
-----------------------
a l 13/19/04
a n 13/09/05
a dd 13/18/03
b l 13/01/01
b dd 13/01/02
b n 13/01/03
and I want the result like:
name status date
-----------------
a n 13/09/05
b n 13/01/03
Here's my code
SELECT
Name,
MAX(DATE) as Date,
Status
FROM
[ST].[dbo].[PS_RC_STATUS_TBL]
GROUP BY
Name
I know that I should use max(status), because there are several possible values in each group and nothing in the query makes it clear which status to choose. Is there any way to use an inner join?
It's not clear to me that you want the max or min status. Rather, it seems to me you want the name and status as of a particular date. That is, you want the rows with the latest date for each name. So ask for that:
select * from PS_RC_STATUS_TBL as T
where exists (
select 1 from PS_RC_STATUS_TBL
where name = T.name
group by name
having max(date) = T.date
)
Another way to think about it is
select T.*
from PS_RC_STATUS_TBL as T
join (
select name, max(date) as date
from PS_RC_STATUS_TBL
group by name
) as D
on T.name = D.name
and T.date = D.date
SQL Server needs to know what to do with the columns that you are not grouping on (it has values from multiple rows to show on one line, so which one should it pick?). If you have aggregated them (MIN, MAX, AVG, etc.), then you are telling it what to do with those values. If not, it will not know what to do and will give you an error like the one you are getting.
From what you are saying, though, it sounds like you do not want to group by the status. It sounds like you are not interested in that column at all. Let me know if that assumption is wrong.
SELECT
Name,
MAX(Date) AS 'Date'
FROM
PS_RC_STATUS_TBL
GROUP BY
Name
If you really do want the status, but don't want to group on it - try this:
SELECT
MyTable1.Name,
MyTable2.Status,
MyTable1.Date
FROM
(SELECT Name, MAX(Date) AS 'Date' FROM PS_RC_STATUS_TBL GROUP BY Name) MyTable1
INNER JOIN
(SELECT Name, Date, Status FROM PS_RC_STATUS_TBL) MyTable2
ON MyTable1.Name = MyTable2.Name
AND MyTable1.Date = MyTable2.Date
That gives the exact results you've asked for. So does the method below, using a CTE:
WITH cte AS (
SELECT Name, MAX(Date) AS Date
FROM PS_RC_STATUS_TBL
GROUP BY Name)
SELECT cte.Name,
tbl.Status,
cte.Date
FROM cte INNER JOIN
PS_RC_STATUS_TBL tbl ON cte.Name = tbl.Name
AND cte.Date = tbl.Date
SQLFiddle example.
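For anyone who wants to check the join-back approach against the sample data from the question, here is a minimal sketch using an in-memory SQLite database via Python's sqlite3 module. It assumes the question's dates are in yy/dd/mm form and rewrites them as ISO strings (so that MAX compares them chronologically); the lowercase table name and the function name latest_row_per_name are also assumptions made for the example.

```python
import sqlite3


def latest_row_per_name(rows):
    """Return (name, status, date) for the most recent date of each name."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ps_rc_status_tbl (name TEXT, status TEXT, date TEXT)"
    )
    conn.executemany("INSERT INTO ps_rc_status_tbl VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT t.name, t.status, t.date
        FROM ps_rc_status_tbl AS t
        JOIN (
            SELECT name, MAX(date) AS date   -- latest date per name
            FROM ps_rc_status_tbl
            GROUP BY name
        ) AS d
          ON t.name = d.name
         AND t.date = d.date                 -- keep only the matching full row
        ORDER BY t.name
        """
    ).fetchall()
    conn.close()
    return result


# The question's sample rows, with dates assumed to be yy/dd/mm and
# rewritten as ISO yyyy-mm-dd strings.
sample_rows = [
    ("a", "l", "2013-04-19"),
    ("a", "n", "2013-05-09"),
    ("a", "dd", "2013-03-18"),
    ("b", "l", "2013-01-01"),
    ("b", "dd", "2013-02-01"),
    ("b", "n", "2013-03-01"),
]
print(latest_row_per_name(sample_rows))
# [('a', 'n', '2013-05-09'), ('b', 'n', '2013-03-01')]
```

If two rows for the same name share the latest date, this approach returns both of them.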
It just means that you need to put all non-aggregated columns in the GROUP BY clause, so in this case you need to add Status as well. Note that this returns one row per (Name, Status) pair rather than one row per name:
Select Name ,
MAX(DATE) as Date ,
Status
FROM [ST].[dbo].[PS_RC_STATUS_TBL] PS
Group by Name, Status
This is a common problem with text fields in SQL aggregation scenarios. Using either MAX(Status) or MIN(Status) in your select list is a solution; usually MAX(Status), because of the lexical ordering:
"" < " " < "a"
In cases where you really need a more detailed ordering:
1. Join to a StatusOrder relation (*Status, OrderSequence) in your main query;
2. Select Max(OrderSequence) in your aggregated query; and
3. Join back to your StatusOrder relation on OrderSequence to select the correct Status value for display.
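The steps above can be sketched as follows, again using an in-memory SQLite database through Python's sqlite3 module. The table and column names (status_tbl, status_order, order_sequence), the ranking l < dd < n, and the sample rows are all hypothetical; the answer only describes the StatusOrder relation in outline.

```python
import sqlite3


def highest_ranked_status(status_rows, order_rows):
    """Return (name, status), picking each name's status with the highest rank."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE status_tbl (name TEXT, status TEXT)")
    conn.execute(
        "CREATE TABLE status_order ("
        "status TEXT PRIMARY KEY, order_sequence INTEGER UNIQUE)"
    )
    conn.executemany("INSERT INTO status_tbl VALUES (?, ?)", status_rows)
    conn.executemany("INSERT INTO status_order VALUES (?, ?)", order_rows)
    result = conn.execute(
        """
        SELECT agg.name, so.status
        FROM (
            -- Steps 1 and 2: join to the ordering table, aggregate on the rank.
            SELECT t.name, MAX(o.order_sequence) AS max_seq
            FROM status_tbl AS t
            JOIN status_order AS o ON o.status = t.status
            GROUP BY t.name
        ) AS agg
        -- Step 3: join back to turn the winning rank into its status text.
        JOIN status_order AS so ON so.order_sequence = agg.max_seq
        ORDER BY agg.name
        """
    ).fetchall()
    conn.close()
    return result


# Hypothetical ranking and data.
order_rows = [("l", 1), ("dd", 2), ("n", 3)]
status_rows = [("a", "l"), ("a", "dd"), ("b", "n"), ("b", "l")]
print(highest_ranked_status(status_rows, order_rows))  # [('a', 'dd'), ('b', 'n')]
```

With this data a plain MAX(status) would pick 'l' for name a, because 'l' sorts after 'dd' lexically, which is the situation the explicit ordering table is meant to handle.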
Every field you select, other than those inside aggregate functions, needs to be listed in the GROUP BY clause.
SELECT
gf.app_id,
ma.name as name,
count(ma.name) as count
FROM [dbo].[geo_fen_notification_table] as gf
inner join dbo.mobile_applications as ma on gf.app_id = ma.id
GROUP BY gf.app_id, ma.name
Here I'm selecting app_id and name, so I need to list both in the GROUP BY clause; otherwise the query will throw an error.