SQL select rows, where column value is unique (only appears once)

Given the table
| id | Name |
| 01 | Bob |
| 02 | Chad |
| 03 | Bob |
| 04 | Tim |
| 05 | Bob |
I want to select the name and ID from rows where the name is unique (appears only once).
This is essentially the same as "How to select unique values of a column from table?", but note that the author there doesn't need the id, so that problem can be solved with GROUP BY name HAVING COUNT(name) = 1.
However, I need to extract the entire row (possibly tens or hundreds of columns), including the id, where COUNT(name) = 1, and I cannot GROUP BY id, name because every combination of those is unique.
EDIT:
I am using Google BigQuery.
Expected results:
| id | Name |
| 02 | Chad |
| 04 | Tim |

Simply do a GROUP BY. Use HAVING to make sure a name is only there once. Use MIN() to pick the only id for the name.
select min(id), name
from tablename
group by name
having count(*) = 1
Reading the table only once will increase performance! (And don't forget to create an index on (name, id).)
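To check the GROUP BY / HAVING approach against the question's sample data, here is a minimal sketch. It assumes an in-memory SQLite database standing in for the real engine (the question targets BigQuery, which may behave differently), and it stores the ids as text to keep the leading zeros.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [("01", "Bob"), ("02", "Chad"), ("03", "Bob"), ("04", "Tim"), ("05", "Bob")],
)

# Keep only names that occur exactly once; for such a group MIN(id) is
# simply the id of its single row. ORDER BY is added only to make the
# result order deterministic for this demonstration.
unique_rows = conn.execute(
    """
    SELECT MIN(id) AS id, name
    FROM tablename
    GROUP BY name
    HAVING COUNT(*) = 1
    ORDER BY name
    """
).fetchall()

print(unique_rows)
```

On this data the query returns the Chad and Tim rows. Every extra column would need its own MIN() or similar aggregate, which is why this form gets unwieldy for wide tables.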

Use a correlated subquery:
select * from tablename a
where not exists (select 1 from tablename b where a.name=b.name having count(*)>1)
OUTPUT:
id name
2 Chad
4 Tim

You can use NOT EXISTS :
SELECT t.*
FROM table t
WHERE NOT EXISTS (SELECT 1 FROM table t1 WHERE t1.name = t.Name AND t1.id <> t.id);
On engines that support indexes, an index on table(name, id) would let the subquery find matching names quickly and produce the result set faster.
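The NOT EXISTS form returns whole rows, which is what the question asks for. The sketch below tries it on the sample data using an in-memory SQLite database; the table name `people` and the extra `city` column are hypothetical, added only to show that columns beyond id and name come through unchanged.

```python
import sqlite3

# Assumptions: SQLite stands in for the real engine; "people" and "city"
# are hypothetical names ("table" in the answer is only a placeholder).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        (1, "Bob", "Oslo"),
        (2, "Chad", "Lima"),
        (3, "Bob", "Rome"),
        (4, "Tim", "Kyiv"),
        (5, "Bob", "Oslo"),
    ],
)

# A row survives only if no other row (different id) shares its name.
singles = conn.execute(
    """
    SELECT t.*
    FROM people t
    WHERE NOT EXISTS (
        SELECT 1
        FROM people t1
        WHERE t1.name = t.name AND t1.id <> t.id
    )
    ORDER BY t.id
    """
).fetchall()

print(singles)
```

Note that this variant depends on id being unique per row; two rows with the same id and the same name would each be treated as unique.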

How about a simple aggregation?
select any_value(id), name
from t
group by name
having count(*) = 1;
BigQuery works quite well with aggregations so this might be quite efficient as well.

Use EXISTS and check that the name is unique:
select id,name
from table t1
where exists ( select 1 from table t2 where t1.name=t2.name
having count(*)=1
)

Please try this.
SELECT
DISTINCT id,NAME
FROM
tableName

You can use multiple subqueries to extract what you need.
SELECT * FROM tableName
WHERE name IN (SELECT name FROM (SELECT name, COUNT(name) FROM tableName
GROUP BY name
HAVING COUNT(name) = 1) AS subQuery)

The following is for BigQuery Standard SQL. It works for any number of columns without listing them explicitly, and does not require any joins or sub-selects:
#standardSQL
SELECT t.*
FROM (
SELECT ANY_VALUE(t) t
FROM `project.dataset.table` t
GROUP BY name
HAVING COUNT(1) = 1
)

Related

SQL - Select first group in group by

I have this table in DB2:
+----+------+--------+
| id | name | key    |
+----+------+--------+
| 1  | foo  | 111000 |
| 2  | bar  | 111000 |
| 3  | foo  | 000111 |
+----+------+--------+
When I group by name, I can extract the table grouped by name, but how can I automatically extract only the first group, to get this result:
+----+------+--------+
| id | name | key    |
+----+------+--------+
| 1  | foo  | 111000 |
| 3  | foo  | 000111 |
+----+------+--------+
How can I solve this?

The MIN function will identify which row is the first one by id, then you can use that to filter the result to show only that row.
SELECT id,name,key
FROM Table1
WHERE id IN (SELECT MIN(ID) FROM Table1 GROUP BY name,key)
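To see what the MIN(id) filter actually keeps, here is a small sketch on the question's data, assuming an in-memory SQLite database. The fourth row is a hypothetical addition (it is not in the question) so that one (name, key) pair occurs twice; `key` is quoted because it is a keyword in SQLite.

```python
import sqlite3

# Assumption: SQLite stands in for DB2. "key" is stored as text to keep
# the leading zeros. Row 4 is hypothetical, added to create a duplicate.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Table1 (id INTEGER, name TEXT, "key" TEXT)')
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [
        (1, "foo", "111000"),
        (2, "bar", "111000"),
        (3, "foo", "000111"),
        (4, "foo", "111000"),  # same (name, key) as row 1
    ],
)

# Keep the lowest id within every (name, key) group.
first_rows = conn.execute(
    """
    SELECT id, name, "key"
    FROM Table1
    WHERE id IN (SELECT MIN(id) FROM Table1 GROUP BY name, "key")
    ORDER BY id
    """
).fetchall()

print(first_rows)
```

Only the hypothetical row 4 is dropped. On the question's original three rows every (name, key) pair is already distinct, so this query would return all of them, including the bar row that the asker wanted excluded.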

You could use an inner join on a subselect aggregated by min id:
select * from mytable
inner join (
select min(id) my_id
from mytable
group by name, key
) t on t.my_id = mytable.id

It looks like you want to get all rows that have the same name as the row with the min(id). If this is correct, then the query below should work.
Otherwise, please explain what you mean by "first group" and how that is defined.
select * from table
inner join (
select name, min(id)
from table
group by name
) t on t.name = table.name

In theory, given the way the question is asked you could also just do a simple select on the name you want.
SELECT id,name,key
From Table1
Where name = 'foo'
It really depends what you mean by 'first group'. If you grouped by name and ordered ascending by name then 'bar' would actually be the 'first group', not 'foo'. Maybe if you clarify that we can give you better answers?

postgresql find duplicates in column with ID

For instance, I have a table say "Name" with duplicate records in it:
Id | Firstname
--------------------
1 | John
2 | John
3 | Marc
4 | Jammie
5 | John
6 | Marc
How can I fetch duplicate records and display them with their respective primary key ID?

Use the COUNT() OVER() window aggregate function:
Select * from
(
select Id, Firstname, count(1)over(partition by Firstname) as Cnt
from yourtable
)a
Where Cnt > 1
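The windowed count can be tried on the question's data with the sketch below. It assumes an in-memory SQLite database in place of PostgreSQL, and a SQLite build of 3.25 or newer, since older builds lack window functions.

```python
import sqlite3

# Assumption: SQLite (3.25+ for window functions) stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (Id INTEGER PRIMARY KEY, Firstname TEXT)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [(1, "John"), (2, "John"), (3, "Marc"), (4, "Jammie"), (5, "John"), (6, "Marc")],
)

# Cnt is the number of rows sharing the same Firstname; every row keeps
# its own Id, so duplicates can be listed with their primary keys.
dupes = conn.execute(
    """
    SELECT *
    FROM (
        SELECT Id, Firstname,
               COUNT(1) OVER (PARTITION BY Firstname) AS Cnt
        FROM yourtable
    ) a
    WHERE Cnt > 1
    ORDER BY Id
    """
).fetchall()

print(dupes)
```

Changing the filter to `Cnt = 1` would instead return the rows whose value appears only once, without any GROUP BY.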

SELECT t.*
FROM t
INNER JOIN
(SELECT firstname
FROM t
GROUP BY firstname
HAVING COUNT(*) > 1) sub
ON t.firstname = sub.firstname
A sub-query would do the trick. Select the first names that are found more than once in your table, t. Then join these names back to the main table to pull in the primary key.

Selecting compared pairs from table

I don't really know how to describe it. I have a table:
ID | Name | Date
-------------------------
1 | Mike | 01.01.2016
1 | Michael | 02.03.2016
2 | Samuel | 23.12.2015
2 | Sam | 05.03.2015
3 | Tony | 02.04.2012
I want to select pairs of IDs and Names with latest dates in each pair. The result here should be:
ID | Name | Date
-------------------------
1 | Michael | 02.03.2016
2 | Samuel | 23.12.2015
3 | Tony | 02.04.2012
How do I achieve this?
Oracle Database 11g

You can do it using the ROW_NUMBER() analytic function:
SELECT id, name, "date"
FROM (
SELECT t.*,
ROW_NUMBER() OVER ( PARTITION BY id ORDER BY "date" DESC ) rn
FROM table_name t
)
WHERE rn = 1
This requires only a single table scan (it does not have a self-join or correlated sub-query - i.e. IN (...) or EXISTS(...)).
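A minimal sketch of the ROW_NUMBER() approach on the question's data follows. It assumes an in-memory SQLite database (3.25 or newer) instead of Oracle, and it stores the dates as ISO 8601 text so that they sort chronologically; the question's DD.MM.YYYY strings would not sort correctly as plain text.

```python
import sqlite3

# Assumptions: SQLite (3.25+) stands in for Oracle; dates are rewritten
# as ISO 8601 strings so that text ordering matches date ordering.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE table_name (id INTEGER, name TEXT, "date" TEXT)')
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?, ?)",
    [
        (1, "Mike", "2016-01-01"),
        (1, "Michael", "2016-03-02"),
        (2, "Samuel", "2015-12-23"),
        (2, "Sam", "2015-03-05"),
        (3, "Tony", "2012-04-02"),
    ],
)

# Number the rows of each id from newest to oldest, then keep number 1.
latest = conn.execute(
    """
    SELECT id, name, "date"
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY "date" DESC) AS rn
        FROM table_name t
    )
    WHERE rn = 1
    ORDER BY id
    """
).fetchall()

print(latest)
```

ROW_NUMBER() always yields exactly one row per id; if two rows tie on the latest date, which one gets number 1 is not determined unless the ORDER BY includes a tie-breaker.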

Have a sub-select that returns each id and its max date:
select * from table
where (id, date) in (select id, max(date) from table group by id)

You can use NOT EXISTS() :
SELECT * FROM YourTable t
WHERE NOT EXISTS(SELECT 1 FROM YourTable s
WHERE t.id = s.id and s.date > t.date)
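The NOT EXISTS version behaves differently when dates tie, which the sketch below illustrates. It assumes an in-memory SQLite database in place of Oracle and ISO 8601 date strings; the "Anthony" row is hypothetical, added only to create a tie on id 3.

```python
import sqlite3

# Assumptions: SQLite stands in for Oracle; dates are ISO 8601 text.
# The "Anthony" row is hypothetical and ties with "Tony" on the date.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, name TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [
        (1, "Mike", "2016-01-01"),
        (1, "Michael", "2016-03-02"),
        (2, "Samuel", "2015-12-23"),
        (2, "Sam", "2015-03-05"),
        (3, "Tony", "2012-04-02"),
        (3, "Anthony", "2012-04-02"),
    ],
)

# Keep a row only if no row with the same id has a strictly later date.
newest = conn.execute(
    """
    SELECT *
    FROM YourTable t
    WHERE NOT EXISTS (
        SELECT 1
        FROM YourTable s
        WHERE t.id = s.id AND s.date > t.date
    )
    ORDER BY t.id, t.name
    """
).fetchall()

print(newest)
```

Both rows for id 3 are returned, because neither has a strictly later row. Whether that is desirable depends on the data; a tie-breaking column in the comparison would restore one row per id.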

Possibly the most efficient method is:
select t.*
from table t
where t.date = (select max(date) from table t2 where t2.id = t.id);
along with an index on table(id, date).
This version should scan the table and look up the correct value in the index.
Or, if there are only three columns, you can use keep:
select id, max(date) as date,
max(name) keep (dense_rank first order by date desc) as name
from table
group by id;
I have found that this version works very well in Oracle.

How to select all attributes (*) with distinct values in a particular column(s)?

Here is a link to the W3Schools database for learners:
W3School Database
If we execute the following query:
SELECT DISTINCT city FROM Customers
it returns a list of the distinct City values in the table.
What should we do if we want all the columns, as returned by SELECT * FROM Customers, but with only one row per unique City value?

DISTINCT when used with multiple columns, is applied for all the columns together. So, the set of values of all columns is considered and not just one column.
If you want to have distinct values, then concatenate all the columns, which will make it distinct.
Or, you could group the rows using GROUP BY.

You need to select all values from the customers table where city is unique. So, logically, I came up with this query:
SELECT * FROM `customers` WHERE `city` in (SELECT DISTINCT `city` FROM `customers`)

I think you want something like this:
(change PK field to your Customers Table primary key or index like Id)
In SQL Server (and standard SQL)
SELECT
*
FROM (
SELECT
*, ROW_NUMBER() OVER (PARTITION BY City ORDER BY PK) rn
FROM
Customers ) Dt
WHERE
(rn = 1)
In MySQL
SELECT
*
FROM (
SELECT
a.City, a.PK, count(*) as rn
FROM
Customers a
JOIN
Customers b ON a.City = b.City AND a.PK >= b.PK
GROUP BY a.City, a.PK ) As DT
WHERE (rn = 1)
I hope this query returns your cities distinctly and also shows the other columns.

You can use GROUP BY clause for getting distinct values in a particular column. Consider the following table - 'contact':
+---------+------+---------+
| id | name | city |
+---------+------+---------+
| 1 | ABC | Chennai |
+---------+------+---------+
| 2 | PQR | Chennai |
+---------+------+---------+
| 3 | XYZ | Mumbai |
+---------+------+---------+
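To select all columns with distinct values in the City attribute, use the following query. Note that it relies on MySQL's nonstandard handling of GROUP BY: it is rejected when ONLY_FULL_GROUP_BY is enabled, and the values of the non-grouped columns are not guaranteed to come from any particular row: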
SELECT *
FROM contact
GROUP BY city;
This will give you the output as follows:
+---------+------+---------+
| id | name | city |
+---------+------+---------+
| 1 | ABC | Chennai |
+---------+------+---------+
| 3 | XYZ | Mumbai |
+---------+------+---------+

SQL remove certain duplicate values

I have a result table like this (after a query has been run):
id | time | region
12x-4nm-334 | 16:00 | Utah
12x-4nm-334 | 17:00 | California
12x-4nm-334 | 19:00 | Missouri
12x-4nm-334 | 22:00 | California
983-n2n-aq2 | 8:00 | New York
983-n2n-aq2 | 9:00 | New York
There are a few other columns in this table, but the important thing is that I want to remove the ids that are only registered to one region from the result. So ids like "983-n2n-aq2" which only show up in a single region (regardless of time) should not be in the resulting table.
Hope this question is clear enough.

If you use MySQL:
DELETE FROM table
WHERE id IN ( SELECT x.id
FROM ( SELECT t.id
FROM table t
GROUP BY t.id
HAVING COUNT(DISTINCT region) = 1
) AS x
)
I don't know about Vertica. Hope it helps.

Try this:
SELECT id, count(DISTINCT region) as RegionCount
FROM table
GROUP BY
id
HAVING count(DISTINCT region) > 1
If your DBMS doesn't support count(DISTINCT ...), then this should work instead, since the inner query already reduces the data to one row per (id, region) pair:
SELECT id, count(*) as RegionCount
FROM (
SELECT id, region FROM table GROUP BY id, region
) as t
GROUP BY
id
HAVING count(*) > 1

Try this (borrowing from Mr Geerkens):
SELECT a.*
FROM table a INNER JOIN
(
SELECT id
FROM table
GROUP BY id
HAVING count(DISTINCT region) > 1
) b ON a.id = b.id

First get the ids that have more than one distinct region (as shown in the subquery), and then use that in a WHERE clause to filter.
SELECT id, time, region
FROM mytable
WHERE id IN
(
SELECT id
FROM mytable
GROUP BY id
HAVING count(DISTINCT region) > 1
)
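The IN filter above can be checked against the question's data with the following sketch, assuming an in-memory SQLite database as a stand-in for the asker's DBMS and times stored as plain text.

```python
import sqlite3

# Assumption: SQLite stands in for the asker's DBMS; times are plain text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id TEXT, time TEXT, region TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("12x-4nm-334", "16:00", "Utah"),
        ("12x-4nm-334", "17:00", "California"),
        ("12x-4nm-334", "19:00", "Missouri"),
        ("12x-4nm-334", "22:00", "California"),
        ("983-n2n-aq2", "8:00", "New York"),
        ("983-n2n-aq2", "9:00", "New York"),
    ],
)

# The subquery lists ids seen in more than one distinct region; the outer
# query keeps every row belonging to those ids.
multi_region = conn.execute(
    """
    SELECT id, time, region
    FROM mytable
    WHERE id IN (
        SELECT id
        FROM mytable
        GROUP BY id
        HAVING COUNT(DISTINCT region) > 1
    )
    ORDER BY time
    """
).fetchall()

print(multi_region)
```

All four rows for "12x-4nm-334" are kept, including both California rows, while "983-n2n-aq2" disappears because it only ever appears in New York.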
