Alternate way of this Query? - sql

Let's say we have a Table named Employee with column 'Name':
+---------+
| Name |
+---------+
| Jack |
+---------+
| Paul |
+---------+
| Jack |
+---------+
To have the distinct name we can run this query:
Select DISTINCT Name
from Employee
Is there any other way to retrieve the distinct value?

GROUP BY is silly, but UNION is even worse:
select name from Employee
union
select name from Employee
You can also do INTERSECT...
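To check that these alternatives really return the same names, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory copy of the Employee table. SQLite is only an assumption for the demo; the question does not name a database.

```python
import sqlite3

# In-memory copy of the Employee table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (Name TEXT)")
conn.executemany("INSERT INTO Employee (Name) VALUES (?)",
                 [("Jack",), ("Paul",), ("Jack",)])

queries = {
    "distinct": "SELECT DISTINCT Name FROM Employee",
    "group_by": "SELECT Name FROM Employee GROUP BY Name",
    "union": "SELECT Name FROM Employee UNION SELECT Name FROM Employee",
    "intersect": "SELECT Name FROM Employee INTERSECT SELECT Name FROM Employee",
}

# Sort each result so the comparison does not depend on row order.
results = {label: sorted(row[0] for row in conn.execute(sql))
           for label, sql in queries.items()}

for label, names in results.items():
    print(label, names)
```

UNION and INTERSECT remove duplicates as a side effect of being set operations, which is why they behave like DISTINCT here.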

Select Name
from Employee
Group by Name
This also gives the same result.

I don't know why you don't want to use distinct but you can GROUP BY
Select Name
from Employee
group by Name;
If you just want to have some fun:
select top 1 with ties Name
from Employee
order by ROW_NUMBER() over(partition by Name order by Name)

;with temp as (Select Name, row_number() over(partition by Name order by Name desc) as seq from Employee)
Select Name from temp where seq = 1

Related

Why no similar ids in the results set when query with a correlated query inside where clause

I have a table with columns id, forename, surname, created (date), such as the following:
ID | Forename | Surname | Created
---------------------------------
1 | Tom | Smith | 2008-01-01
1 | Tom | Windsor | 2008-02-01
2 | Anne | Thorn | 2008-01-05
2 | Anne | Baker | 2008-03-01
3 | Bill | Sykes | 2008-01-20
Basically, I want this to return the most recent name for each ID, so it would return:
ID | Forename | Surname | Created
---------------------------------
1 | Tom | Windsor | 2008-02-01
2 | Anne | Baker | 2008-03-01
3 | Bill | Sykes | 2008-01-20
I get the desired result with this query.
SELECT id, forename, surname, created
FROM name n
WHERE created = (SELECT MAX(created)
FROM name
GROUP BY id
HAVING id = n.id);
I am getting the result I want, but I fail to understand why the ids are not being repeated in the result set. As I understand it, a correlated subquery takes one row from the outer query and runs the inner subquery for it. Shouldn't "id" repeat in the result when ids repeat in the outer table? Can someone explain what exactly is happening behind the scenes?
First, your subquery does not need a GROUP BY. It is more commonly written as:
SELECT n.id, n.forename, n.surname, n.created
FROM name n
WHERE n.created = (SELECT MAX(n2.created)
FROM name n2
WHERE n2.id = n.id
);
You should get in the habit of qualifying all column references, especially when your query has multiple table references.
I think you are asking why this works. Well, each row in the outer query is tested for the condition. The condition is: "is my created the same as the maximum created for all rows in the name table with the same id". In your data, only one row per id matches that condition, so ids are not repeated.
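To make the "only one row per id matches" point concrete, here is a small sketch using Python's sqlite3 module (an assumption for the demo; the question does not name a database). It runs the correlated query on the sample data, then adds a hypothetical tied row to show that ids do repeat when two rows share the maximum created value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE name (id INTEGER, forename TEXT, surname TEXT, created TEXT)")
conn.executemany("INSERT INTO name VALUES (?, ?, ?, ?)", [
    (1, "Tom", "Smith", "2008-01-01"),
    (1, "Tom", "Windsor", "2008-02-01"),
    (2, "Anne", "Thorn", "2008-01-05"),
    (2, "Anne", "Baker", "2008-03-01"),
    (3, "Bill", "Sykes", "2008-01-20"),
])

# Each outer row is kept only if its created equals the max for its own id.
sql = """
SELECT n.id, n.surname
FROM name n
WHERE n.created = (SELECT MAX(n2.created) FROM name n2 WHERE n2.id = n.id)
ORDER BY n.id, n.surname
"""
latest = conn.execute(sql).fetchall()
print(latest)

# Hypothetical extra row (not in the question) that ties with Windsor.
conn.execute("INSERT INTO name VALUES (1, 'Tom', 'Tudor', '2008-02-01')")
with_tie = conn.execute(sql).fetchall()
print(with_tie)
```

With the tie in place, both rows for id 1 pass the test, so the uniqueness of ids in the result comes from the data, not from the query itself.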
You can also consider joining the table to a derived table holding max(created) per id, matching on both id and created:
SELECT n.id, n.forename, n.surname, n.created
FROM name n
JOIN ( SELECT id, MAX(created) AS created FROM name GROUP BY id ) t
ON n.id = t.id AND n.created = t.created;
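A caveat worth checking with the join approach: the join condition should compare id as well as created. The sketch below (Python's sqlite3 module, with one hypothetical extra row that is not in the question) shows that matching on created alone lets through an old row whose date happens to equal another id's latest date. It uses a plain inner JOIN, since older SQLite versions do not support RIGHT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE name (id INTEGER, forename TEXT, surname TEXT, created TEXT)")
conn.executemany("INSERT INTO name VALUES (?, ?, ?, ?)", [
    (1, "Tom", "Smith", "2008-01-01"),
    (1, "Tom", "Windsor", "2008-02-01"),
    (2, "Anne", "Thorn", "2008-01-05"),
    (2, "Anne", "Baker", "2008-03-01"),
    (3, "Bill", "Sykes", "2008-01-20"),
    # Hypothetical row: an older name for id 1, dated like id 3's latest row.
    (1, "Tom", "Jones", "2008-01-20"),
])

latest_per_id = "SELECT id, MAX(created) AS created FROM name GROUP BY id"

# Join on created only: the Jones row matches id 3's maximum date.
by_created_only = conn.execute(f"""
    SELECT n.id, n.surname
    FROM name n
    JOIN ({latest_per_id}) t ON n.created = t.created
    ORDER BY n.id, n.surname
""").fetchall()

# Join on id and created: one row per id.
by_id_and_created = conn.execute(f"""
    SELECT n.id, n.surname
    FROM name n
    JOIN ({latest_per_id}) t ON n.id = t.id AND n.created = t.created
    ORDER BY n.id, n.surname
""").fetchall()

print(by_created_only)
print(by_id_and_created)
```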
or using IN operator :
SELECT id, forename, surname, created
FROM name n
WHERE ( id, created ) IN (SELECT id, MAX(created)
FROM name
GROUP BY id );
or using EXISTS with a HAVING clause in the subquery (correlated on id, so another id's maximum cannot match):
SELECT id, forename, surname, created
FROM name n
WHERE EXISTS (SELECT n2.id
FROM name n2
WHERE n2.id = n.id
GROUP BY n2.id
HAVING MAX(n2.created) = n.created
);

SQL select rows, where column value is unique (only appears once)

Given the table
| id | Name |
| 01 | Bob |
| 02 | Chad |
| 03 | Bob |
| 04 | Tim |
| 05 | Bob |
I want to select the name and ID, from rows where the name is unique (only appears once)
This is essentially the same as How to select unique values of a column from table?, but notice that the author doesn't need the id, so that problem can be solved by a GROUP BY name HAVING COUNT(name) = 1
However, I need to extract the entire row (could be tens or hundreds of columns) including the id, where COUNT(name) = 1, but I cannot GROUP BY id, name as every combination of those are unique.
EDIT:
Am using Google BigQuery.
Expected results:
| id | Name |
| 02 | Chad |
| 04 | Tim |
Simply do a GROUP BY. Use HAVING to make sure a name is only there once. Use MIN() to pick the only id for the name.
select min(id), name
from tablename
group by name
having count(*) = 1
Reading the table only once will increase performance! (And don't forget to create an index on (name, id).)
Use a correlated subquery:
select * from tablename a
where not exists (select 1 from tablename b where a.name=b.name having count(*)>1)
OUTPUT:
id name
2 Chad
4 Tim
You can use NOT EXISTS:
SELECT t.*
FROM tablename t
WHERE NOT EXISTS (SELECT 1 FROM tablename t1 WHERE t1.name = t.name AND t1.id <> t.id);
An index on tablename(name, id) would help this produce the result set faster.
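As a quick check, the NOT EXISTS filter and the GROUP BY / HAVING aggregation can be run side by side on the sample data. This is a sketch using Python's sqlite3 module rather than BigQuery (an assumption for the demo), with ids stored as text to keep the leading zeros.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id TEXT, name TEXT)")
conn.executemany("INSERT INTO tablename VALUES (?, ?)", [
    ("01", "Bob"), ("02", "Chad"), ("03", "Bob"), ("04", "Tim"), ("05", "Bob"),
])

# Keep a row only if no other row (different id) has the same name.
not_exists = sorted(conn.execute("""
    SELECT t.id, t.name
    FROM tablename t
    WHERE NOT EXISTS (SELECT 1 FROM tablename t1
                      WHERE t1.name = t.name AND t1.id <> t.id)
"""))

# Aggregate per name; groups of size 1 have exactly one id, so MIN(id) is it.
grouped = sorted(conn.execute("""
    SELECT MIN(id), name
    FROM tablename
    GROUP BY name
    HAVING COUNT(*) = 1
"""))

print(not_exists)
print(grouped)
```

The NOT EXISTS form returns whole rows via t.*, which matters when there are many columns; the aggregation needs each extra column wrapped in MIN() or a similar function.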
How about a simple aggregation?
select any_value(id), name
from t
group by name
having count(*) = 1;
BigQuery works quite well with aggregations so this might be quite efficient as well.
Use EXISTS and check that the name is unique:
select id, name
from tablename t1
where exists ( select 1 from tablename t2 where t1.name = t2.name
having count(*) = 1
)
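Please try this.
SELECT
DISTINCT id, name
FROM
tableName
Note that because id differs on every row, DISTINCT id, name returns all five rows, including the Bob rows, so it does not filter out repeated names.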
You can use multiple subqueries to extract what you need.
SELECT * FROM tableName
WHERE name IN (SELECT name FROM (SELECT name, COUNT(name) FROM tableName
GROUP BY name
HAVING COUNT(name) = 1) AS subQuery)
Below is for BigQuery Standard SQL and works for any number of columns w/o explicitly calling them out and does not require any join'ing or sub-selects
#standardSQL
SELECT t.*
FROM (
SELECT ANY_VALUE(t) t
FROM `project.dataset.table` t
GROUP BY name
HAVING COUNT(1) = 1
)

SQL - Select first group in group by

I have this table in DB2:
+----+-----+----------+
| id | name| key |
+----+-----+----------+
| 1 | foo |111000 |
| 2 | bar |111000 |
| 3 | foo |000111 |
+----+-----+----------+
When I group by name I can extract the table grouped by the name, but how can I automatically extract only the first group, to get this result:
+----+-----+----------+
| id | name| key |
+----+-----+----------+
| 1 | foo |111000 |
| 3 | foo |000111 |
+----+-----+----------+
How can I solve this?
The MIN function will identify which row is the first one by id, then you can use that to filter the result to show only that row.
SELECT id,name,key
FROM Table1
WHERE id IN (SELECT MIN(ID) FROM Table1 GROUP BY name,key)
You could use an inner join on a subselect aggregated by min id:
select * from mytable
inner join (
select min(id) my_id
from mytable
group by name, key
) t on t.my_id = mytable.id
It looks like you want all rows that have the same name as the row with the min(id). If this is correct then this should work:
select m.*
from mytable m
inner join (
select name
from mytable
where id = (select min(id) from mytable)
) t on t.name = m.name
Otherwise, please explain what you mean by "first group" and how that is defined.
In theory, given the way the question is asked you could also just do a simple select on the name you want.
SELECT id,name,key
From Table1
Where name = 'foo'
It really depends what you mean by 'first group'. If you grouped by name and ordered ascending by name then 'bar' would actually be the 'first group', not 'foo'. Maybe if you clarify that we can give you better answers?
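The two readings of "first group" can be compared directly. The sketch below uses Python's sqlite3 module instead of DB2 (an assumption for the demo) and treats "first" either as the name on the row with the smallest id or as the alphabetically smallest name; both interpretations are guesses at what the question intends.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "key" is quoted because it is a keyword in some SQL dialects.
conn.execute('CREATE TABLE Table1 (id INTEGER, name TEXT, "key" TEXT)')
conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?)", [
    (1, "foo", "111000"),
    (2, "bar", "111000"),
    (3, "foo", "000111"),
])

# Reading 1: the group of the row with the smallest id.
first_by_id = conn.execute("""
    SELECT id, name, "key"
    FROM Table1
    WHERE name = (SELECT name FROM Table1 ORDER BY id LIMIT 1)
    ORDER BY id
""").fetchall()

# Reading 2: the group whose name sorts first.
first_by_name = conn.execute("""
    SELECT id, name, "key"
    FROM Table1
    WHERE name = (SELECT MIN(name) FROM Table1)
    ORDER BY id
""").fetchall()

print(first_by_id)
print(first_by_name)
```

Only the first reading reproduces the expected output from the question; the second returns the single 'bar' row.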

How to select all attributes (*) with distinct values in a particular column(s)?

Here is a link to the W3Schools database for learners:
W3School Database
If we execute the following query:
SELECT DISTINCT city FROM Customers
it returns a list of the different City values from the table.
What should we do if we want whole rows, like those returned by SELECT * FROM Customers, but with a unique City value in each row?
DISTINCT when used with multiple columns, is applied for all the columns together. So, the set of values of all columns is considered and not just one column.
If you want to have distinct values, then concatenate all the columns, which will make it distinct.
Or, you could group the rows using GROUP BY.
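A short sketch of how DISTINCT treats multiple columns, using Python's sqlite3 module. The three rows are hypothetical sample data, not taken from the W3Schools database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT, City TEXT)")
# Hypothetical rows: two customers share a city.
conn.executemany("INSERT INTO Customers VALUES (?, ?)", [
    ("Ana", "Berlin"), ("Ben", "Berlin"), ("Cid", "London"),
])

# DISTINCT on one column collapses the repeated city.
cities = sorted(conn.execute("SELECT DISTINCT City FROM Customers"))

# DISTINCT on two columns compares (CustomerName, City) pairs,
# and every pair here is different, so nothing is collapsed.
pairs = sorted(conn.execute("SELECT DISTINCT CustomerName, City FROM Customers"))

print(cities)
print(pairs)
```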
You need to select all values from customers table, where city is unique. So, logically, I came with such query:
SELECT * FROM `customers` WHERE `city` in (SELECT DISTINCT `city` FROM `customers`)
I think you want something like this:
(change PK field to your Customers Table primary key or index like Id)
In SQL Server (and standard SQL)
SELECT
*
FROM (
SELECT
*, ROW_NUMBER() OVER (PARTITION BY City ORDER BY PK) rn
FROM
Customers ) Dt
WHERE
(rn = 1)
In MySQL
SELECT
*
FROM (
SELECT
a.City, a.PK, count(*) as rn
FROM
Customers a
JOIN
Customers b ON a.City = b.City AND a.PK >= b.PK
GROUP BY a.City, a.PK ) As DT
WHERE (rn = 1)
This query, I hope, will return each City once and also show the other columns.
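Both ideas above, ROW_NUMBER() per city and the self-join that counts rows with a lower-or-equal key, can be tried on a small table. This is a sketch using Python's sqlite3 module, assuming SQLite 3.25 or later for window functions; the PK column and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (PK INTEGER, CustomerName TEXT, City TEXT)")
# Hypothetical sample rows.
conn.executemany("INSERT INTO Customers VALUES (?, ?, ?)", [
    (1, "A", "Berlin"), (2, "B", "London"), (3, "C", "Berlin"),
    (4, "D", "London"), (5, "E", "Madrid"),
])

# Window-function version: number rows within each city, keep the first.
by_row_number = conn.execute("""
    SELECT PK, CustomerName, City
    FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY City ORDER BY PK) AS rn
          FROM Customers) dt
    WHERE rn = 1
    ORDER BY PK
""").fetchall()

# Self-join version: a row is first in its city when exactly one row
# in that city (itself) has a PK less than or equal to its own.
by_self_join = conn.execute("""
    SELECT a.PK, a.CustomerName, a.City
    FROM Customers a
    JOIN Customers b ON a.City = b.City AND a.PK >= b.PK
    GROUP BY a.PK, a.CustomerName, a.City
    HAVING COUNT(*) = 1
    ORDER BY a.PK
""").fetchall()

print(by_row_number)
print(by_self_join)
```

Which row represents each city is decided by the ORDER BY inside the window (or the comparison in the join), so the choice of "first" is explicit rather than left to the database.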
You can use GROUP BY clause for getting distinct values in a particular column. Consider the following table - 'contact':
+---------+------+---------+
| id | name | city |
+---------+------+---------+
| 1 | ABC | Chennai |
+---------+------+---------+
| 2 | PQR | Chennai |
+---------+------+---------+
| 3 | XYZ | Mumbai |
+---------+------+---------+
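To select all columns with distinct values in the city attribute, use the following query (it relies on MySQL with ONLY_FULL_GROUP_BY disabled; in that mode the non-grouped column values come from an arbitrary row of each group):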
SELECT *
FROM contact
GROUP BY city;
This will give you the output as follows:
+---------+------+---------+
| id | name | city |
+---------+------+---------+
| 1 | ABC | Chennai |
+---------+------+---------+
| 3 | XYZ | Mumbai |
+---------+------+---------+
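Since SELECT * with GROUP BY city is not accepted by every database, a more portable way to get the same rows is to pick one id per city explicitly. The sketch below uses Python's sqlite3 module and chooses the smallest id in each city; that choice is an assumption, made because it matches the output shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (id INTEGER, name TEXT, city TEXT)")
conn.executemany("INSERT INTO contact VALUES (?, ?, ?)", [
    (1, "ABC", "Chennai"), (2, "PQR", "Chennai"), (3, "XYZ", "Mumbai"),
])

# Keep the row with the smallest id in each city.
one_per_city = conn.execute("""
    SELECT *
    FROM contact
    WHERE id IN (SELECT MIN(id) FROM contact GROUP BY city)
    ORDER BY id
""").fetchall()

print(one_per_city)
```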

Get list of unique records

I have the following table which lists the employees and their corresponding managers:
id | employeeid | managerid
1 | 34256 | 12789
2 | 21222 | 34256
3 | 12435 | 34256
.....
.....
What is the recommended way to list all distinct employees (ids) in a single list?
Note that not all managers may be listed under the employeeid column (a manager may not have a manager in turn).
If I understand this correctly:
This will combine all distinct employee IDs from the two columns, avoiding duplicates (UNION):
SELECT employeeid AS Employee
FROM tableA
UNION
SELECT managerid AS Employee
FROM tableA
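A small sketch of the UNION approach on the three sample rows, using Python's sqlite3 module (an assumption for the demo). It also runs DISTINCT on employeeid alone for comparison, to show which id that would leave out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (id INTEGER, employeeid INTEGER, managerid INTEGER)")
conn.executemany("INSERT INTO tableA VALUES (?, ?, ?)", [
    (1, 34256, 12789), (2, 21222, 34256), (3, 12435, 34256),
])

# UNION stacks both columns and removes duplicates across them.
everyone = sorted(row[0] for row in conn.execute("""
    SELECT employeeid AS Employee FROM tableA
    UNION
    SELECT managerid AS Employee FROM tableA
"""))

# DISTINCT on employeeid alone only sees people who have a manager.
employees_only = sorted(row[0] for row in conn.execute(
    "SELECT DISTINCT employeeid FROM tableA"))

print(everyone)
print(employees_only)
```

Manager 12789 never appears in the employeeid column, so only the UNION picks it up.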
This should do it:
SELECT DISTINCT employeeid FROM yourtablename
But seriously, by googling the keyword "distinct" you could have found this out very easily yourself! Or did I miss something?
SELECT id, employeeid, managerid
FROM
(SELECT yourtablename.*,
ROW_NUMBER() OVER (PARTITION BY managerid ORDER BY employeeid DESC) AS RN
FROM yourtablename) AS t
WHERE RN = 1
ORDER BY ID