SQL select statement avoiding duplicated rows based on primary key

I have a table employee with two columns: empid (primary key) and name. Suppose it contains the three rows below.
EmpID Name
---------------
11 Name1
12 Name2
11 Name3
How would I write a select statement that excludes every row whose empid appears more than once? I tried a query like this:
select empid, name
from(select empid, name, row_number() over(partition by empid order by empid desc) rnk
from t)a
where a.rnk=1
But this query returns
EmpID Name
---------------
11 Name1
12 Name2
as the result, whereas all I need is
EmpID Name
---------------
12 Name2

Try this query; it returns only the row 12 Name2:
select empid, name from employee a
join (
select empid , count(empid) as count1 from employee
group by empid
having count(empid)=1 ) b on a.empid=b.empid

select empid, name
from(select empid, name,count(*) over(partition by empid) cnt from t) t
where cnt=1
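A minimal sketch of the two counting approaches above, run against the question's three rows. Python's built-in sqlite3 module is only an assumed test harness here (the question does not name a DBMS), and the table is created without a primary key constraint so that the duplicated empid can exist at all. The window-function variant needs SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No PRIMARY KEY constraint: otherwise the duplicate empid could not be inserted.
conn.execute("CREATE TABLE employee (empid INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [(11, "Name1"), (12, "Name2"), (11, "Name3")],
)

# Variant 1: join back to the empids that occur exactly once.
join_rows = conn.execute("""
    SELECT a.empid, a.name
    FROM employee a
    JOIN (SELECT empid FROM employee GROUP BY empid HAVING COUNT(*) = 1) b
      ON a.empid = b.empid
""").fetchall()

# Variant 2: count rows per empid with a window function, keep count = 1.
window_rows = conn.execute("""
    SELECT empid, name
    FROM (SELECT empid, name, COUNT(*) OVER (PARTITION BY empid) AS cnt
          FROM employee) AS counted
    WHERE cnt = 1
""").fetchall()

print(join_rows, window_rows)
```

Both variants keep only the row for empid 12 on this data.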

An anti join using NOT EXISTS might be the fastest approach:
SELECT empID, Name
FROM T
WHERE NOT EXISTS (SELECT 1 FROM T AS T2 WHERE T2.EmpID = T.EmpID AND T2.Name <> T.Name);
I have not tested this, so the optimiser may well be able to generate an anti semi-join from a count = 1 operation too, but this form gives it the best possible chance of reaching that plan.
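A small sketch of how the NOT EXISTS form behaves, again using Python's sqlite3 module as an assumed harness. The two extra rows for EmpID 13 are hypothetical and not part of the question; they are there to show that this form compares on Name, so two rows with the same EmpID and the same Name are not filtered out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (EmpID INTEGER, Name TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?)",
    [
        (11, "Name1"), (12, "Name2"), (11, "Name3"),  # rows from the question
        (13, "Name4"), (13, "Name4"),                 # hypothetical exact duplicates
    ],
)

anti_join_rows = conn.execute("""
    SELECT EmpID, Name
    FROM T
    WHERE NOT EXISTS (SELECT 1 FROM T AS T2
                      WHERE T2.EmpID = T.EmpID AND T2.Name <> T.Name)
    ORDER BY EmpID
""").fetchall()

print(anti_join_rows)
```

If exact duplicates can occur and should also be dropped, one of the count-based queries is probably the safer choice.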

Would not SELECT max(empid) as empid, name from employee group by name having count(distinct empid) < 2 work?
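To check that suggestion, here is a quick sketch that runs it on the question's sample rows (Python's sqlite3 module is an assumed harness; the ORDER BY is added only to make the output order predictable). Because the grouping is by name and every name in the sample is unique, each group contains a single empid and all three rows come back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (empid INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [(11, "Name1"), (12, "Name2"), (11, "Name3")],
)

group_by_name_rows = conn.execute("""
    SELECT MAX(empid) AS empid, name
    FROM employee
    GROUP BY name
    HAVING COUNT(DISTINCT empid) < 2
    ORDER BY name
""").fetchall()

print(group_by_name_rows)
```

So on this data the query does not exclude the rows for empid 11.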

Related

SQL Server : finding duplicates based on first few characters on column

I want to find duplicates based on the first three characters of the surname. Is there a way to do that in SQL? I can compare the whole name, but how do we compare only the first few characters?
Below is my table:
custid forename surname dateofbirth
----------------------------------------
1 David John 16-09-1985
2 David Jon 16-09-1985
3 Sarah Smith 10-08-2015
4 Peter Proca 11-06-2011
5 Peter Proka 11-06-2011
This is the query I am currently running to compare:
SELECT
y.id, y.forename, y.surname
FROM
customers y
INNER JOIN
(SELECT
forename, surname, COUNT(*) AS CountOf
FROM customers
GROUP BY forename, surname
HAVING COUNT(*) > 1) dt ON y.forename = dt.forename

You can use left():
select c.*
from (select c.*, count(*) over (partition by left(surname, 3)) as cnt
from customers c
) c
where cnt > 1
order by surname;
You can include the forename as well in the partition by if you mean forename and first three letters of surname.

You can use exists as follows:
select t.* from t
Where exists
(select 1 from t tt
Where left(t.surname, 3) = left(tt.surname, 3) and t.custid <> tt.custid
)
order by t.surname;
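A minimal sketch of the prefix comparison on the question's five rows. It runs under Python's sqlite3 module (an assumption, since the question is about SQL Server), and SQLite has no left() function, so substr(surname, 1, 3) stands in for left(surname, 3).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (custid INTEGER, forename TEXT, surname TEXT, dateofbirth TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        (1, "David", "John", "16-09-1985"),
        (2, "David", "Jon", "16-09-1985"),
        (3, "Sarah", "Smith", "10-08-2015"),
        (4, "Peter", "Proca", "11-06-2011"),
        (5, "Peter", "Proka", "11-06-2011"),
    ],
)

# substr(x, 1, 3) plays the role of SQL Server's left(x, 3).
prefix_dup_ids = [
    row[0]
    for row in conn.execute("""
        SELECT t.custid
        FROM customers t
        WHERE EXISTS (SELECT 1 FROM customers tt
                      WHERE substr(t.surname, 1, 3) = substr(tt.surname, 1, 3)
                        AND t.custid <> tt.custid)
        ORDER BY t.custid
    """)
]

print(prefix_dup_ids)
```

Note that "John" and "Jon" differ in their third character ("Joh" versus "Jon"), so a three-character prefix only pairs up Proca and Proka here; a two-character prefix would be needed to catch the first pair as well.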

SQL Select column which is not used in select section of subquery which find duplicates

I am trying to find records in my database that have duplicated fields such as name, surname and type.
Example:
SELECT name, surname, type, COUNT(*)
FROM customers
GROUP BY name, surname
HAVING COUNT(*)>1
Query results:
Robb|Stark|1|2
Tyrion|Lannister|1|3
So the customer "Robb Stark" appears 2 times and "Tyrion Lannister" appears 3 times.
Now I want to know the ids of these records.
I found a similar problem described here:
Finding duplicate values in a SQL table
There is an answer there, but no example.

Use COUNT as an analytic function:
WITH cte AS (
SELECT *, COUNT(*) OVER (PARTITION BY name, surname) cnt
FROM customers
)
SELECT * -- return all columns
FROM cte
WHERE cnt > 1
ORDER BY name, surname;

The simplest way will be to use the EXISTS as follows:
SELECT t.*
FROM customers t
where exists
(select 1 from customers tt
where tt.name = t.name
and tt.surname = t.surname
and tt.id <> t.id)
Or, in a database that supports row-value comparisons (SQL Server does not), use your original query in an IN clause as follows:
select * from customers where (name, surname) in
(SELECT name, surname
FROM customers
GROUP BY name, surname
HAVING COUNT(*)>1)
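Here is a short sketch of both forms returning the ids of the duplicated customers. The ids and the extra "Jon Snow" row are hypothetical sample data, and Python's sqlite3 module is an assumed harness (SQLite accepts the row-value IN form from version 3.15).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, surname TEXT, type INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        (1, "Robb", "Stark", 1), (2, "Robb", "Stark", 1),
        (3, "Tyrion", "Lannister", 1), (4, "Tyrion", "Lannister", 1),
        (5, "Tyrion", "Lannister", 1),
        (6, "Jon", "Snow", 1),  # hypothetical non-duplicate
    ],
)

# EXISTS form: another row with the same name and surname but a different id.
exists_ids = [r[0] for r in conn.execute("""
    SELECT t.id FROM customers t
    WHERE EXISTS (SELECT 1 FROM customers tt
                  WHERE tt.name = t.name AND tt.surname = t.surname AND tt.id <> t.id)
    ORDER BY t.id
""")]

# IN form: reuse the GROUP BY ... HAVING query as a row-value subquery.
in_ids = [r[0] for r in conn.execute("""
    SELECT id FROM customers
    WHERE (name, surname) IN (SELECT name, surname FROM customers
                              GROUP BY name, surname HAVING COUNT(*) > 1)
    ORDER BY id
""")]

print(exists_ids, in_ids)
```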

If you want one row per group of duplicates, with the list of ids in a comma-separated string, you can just use string aggregation with your existing query:
SELECT name, surname, COUNT(*) as cnt,
STRING_AGG(id, ',') WITHIN GROUP (ORDER BY id) as all_ids
FROM customers
GROUP BY name, surname
HAVING COUNT(*) > 1

How to sort the row of record and update dynamically in SQL?

Clarification on SQL query syntax: we have an Employee table with two columns, Emp Id and EmpName, whose values look like this:
100, a
200, b
300, c
I have to update the EmpName for 100 with "Joe", 200 with "John", 300 with "Sam". These same 3 names then need to repeat in order for the rest of the table.
How can I pick EmpId in ascending order and update EmpName accordingly?

WITH cteRowNums AS (
SELECT EmpId,
EmpName,
ROW_NUMBER() OVER(ORDER BY EmpId) AS RowNum
FROM Employee
)
UPDATE cteRowNums
SET EmpName = CASE WHEN RowNum % 3 = 1 THEN 'Joe'
WHEN RowNum % 3 = 2 THEN 'John'
WHEN RowNum % 3 = 0 THEN 'Sam'
END;
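A sketch of the same cycling idea that can be run under Python's sqlite3 module (an assumed harness). SQLite cannot update through a CTE the way SQL Server can, so this variant swaps in a correlated COUNT(*) to compute each row's position in EmpId order; the rows beyond 300 are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (EmpId INTEGER PRIMARY KEY, EmpName TEXT)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [(100, "a"), (200, "b"), (300, "c"), (400, "d"), (500, "e"), (600, "f"), (700, "g")],
)

# The subquery counts how many rows have an EmpId <= the current one,
# which is the row's 1-based position in ascending EmpId order.
conn.execute("""
    UPDATE Employee
    SET EmpName = CASE (SELECT COUNT(*) FROM Employee e2
                        WHERE e2.EmpId <= Employee.EmpId) % 3
                      WHEN 1 THEN 'Joe'
                      WHEN 2 THEN 'John'
                      ELSE 'Sam'
                  END
""")

updated_names = conn.execute(
    "SELECT EmpId, EmpName FROM Employee ORDER BY EmpId"
).fetchall()
print(updated_names)
```

The correlated count costs more than ROW_NUMBER() on large tables, so on SQL Server the CTE form above is likely the better fit.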

Add a row_num column first, then update it using ROW_NUMBER(), something like:
Alter table emp1 add row_num varchar(10);
with cte as (
select id,name,row_num,ROW_NUMBER() over (order by [id]) as rn
from emp1
)
update cte set row_num = 'Test'+ convert(varchar,rn)
ROW_NUMBER() creates an ordered numbering over the id column ([Emp Id] in the question).
Then run an update statement against it to get the desired result.

Query returning records with duplicate data because of the wrong data in one of the columns

I have a record of an employee, but my query is returning 2 records for this employee because the address column differs between the two. How can I solve this problem? Is it something that can be done? EMP_ID, CUS_LAST_NAME, CUS_FIRST_NAME, and GUARDIAN_ADDRESS are from 3 separate tables.
Example:
ID EMP_ID CUS_LAST_NAME CUS_FIRST_NAME GUARDIAN_ADDRESS
00000000 11111111 Jackson Michael 1111 Street Apt 1
ID EMP_ID CUS_LAST_NAME CUS_FIRST_NAME GUARDIAN_ADDRESS
00000000 11111111 Jackson Michael 1111 Street

If you can't delete one of the two rows, and it doesn't matter which address the select returns, you can use an aggregate function to get only one row:
select ID , EMP_ID , EMP_LAST_NAME, EMP_FIRST_NAME, min(ADDRESS)
from my_table
group by ID , EMP_ID , EMP_LAST_NAME, EMP_FIRST_NAME
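A minimal sketch of that aggregation on the two rows from the question, using Python's sqlite3 module as an assumed harness. The table name my_table and the column names follow the answer above rather than the asker's real schema, which spans three tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE my_table (
        ID TEXT, EMP_ID TEXT, EMP_LAST_NAME TEXT, EMP_FIRST_NAME TEXT, ADDRESS TEXT
    )
""")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?, ?)",
    [
        ("00000000", "11111111", "Jackson", "Michael", "1111 Street Apt 1"),
        ("00000000", "11111111", "Jackson", "Michael", "1111 Street"),
    ],
)

# MIN(ADDRESS) picks the alphabetically first address within each group.
one_row_per_employee = conn.execute("""
    SELECT ID, EMP_ID, EMP_LAST_NAME, EMP_FIRST_NAME, MIN(ADDRESS)
    FROM my_table
    GROUP BY ID, EMP_ID, EMP_LAST_NAME, EMP_FIRST_NAME
""").fetchall()

print(one_row_per_employee)
```

MIN() chooses purely by sort order, so this only suits the case where any one of the addresses is acceptable.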

If you want to detect which employees have duplicate entries:
SELECT *
FROM employees
WHERE EMP_ID IN (
SELECT EMP_ID
FROM employees
GROUP BY EMP_ID
HAVING COUNT(*) > 1
)

--start with unique list of clients
SELECT DISTINCT a.ID, a.EMP_ID, e.EMP_LAST_NAME, e.EMP_FIRST_NAME, e.ADDRESS
FROM TABLE1 a
--attach on employee data on id
OUTER APPLY (SELECT TOP 1 b.EMP_LAST_NAME, b.EMP_FIRST_NAME, b.ADDRESS
FROM TABLE2 b
WHERE a.id = b.id
--use the order by clause to choose which employee record comes out on top
ORDER BY b.address
) e

The quick and dirty way with max():
select id, emp_id, emp_last_name, emp_first_name, max(address) as address
from employees
group by id, emp_id, emp_last_name, emp_first_name
Alternative using: top with ties
select top 1 with ties
id, emp_id, emp_last_name, emp_first_name, address
from employees
order by row_number() over (partition by emp_id order by address desc)
rextester demo for both: http://rextester.com/EGGA75008

Query to find nᵗʰ max value of a column

I want to find 2nd, 3rd, ... nth maximum value of a column.

Consider the following Employee table with a single column for salary.
+------+
| Sal |
+------+
| 3500 |
| 2500 |
| 2500 |
| 5500 |
| 7500 |
+------+
The following query will return the Nth Maximum element.
select SAL from EMPLOYEE E1 where
(N - 1) = (select count(distinct(SAL))
from EMPLOYEE E2
where E2.SAL > E1.SAL )
For example, when the second maximum value is required:
select SAL from EMPLOYEE E1 where
(2 - 1) = (select count(distinct(SAL))
from EMPLOYEE E2
where E2.SAL > E1.SAL )
+------+
| Sal |
+------+
| 5500 |
+------+
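A sketch of the correlated-count query wrapped in a helper, run on the five salaries above with Python's sqlite3 module (an assumed harness). The function name nth_max is hypothetical, and DISTINCT is added to the outer select so that tied salaries such as the two 2500 rows come back as a single value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (sal INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?)", [(3500,), (2500,), (2500,), (5500,), (7500,)]
)


def nth_max(conn, n):
    """Return the nth largest distinct sal, or None if there are fewer than n."""
    row = conn.execute(
        """
        SELECT DISTINCT sal
        FROM employee e1
        WHERE ? - 1 = (SELECT COUNT(DISTINCT sal)
                       FROM employee e2
                       WHERE e2.sal > e1.sal)
        """,
        (n,),
    ).fetchone()
    return row[0] if row else None


print([nth_max(conn, n) for n in range(1, 6)])
```

A value is the nth maximum exactly when n - 1 distinct values are larger than it, which is what the subquery counts.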

You didn't specify which database. On MySQL you can do:
SELECT column FROM table ORDER BY column DESC LIMIT 7,10;
This skips the first 7 rows and then returns the next ten highest.

You could sort the column in descending order and then take the value from the nth row.
EDIT:
Updated as per comment request. WARNING: completely untested!
SELECT DOB FROM (SELECT DOB, ROWNUM AS RN FROM (SELECT DOB FROM USERS ORDER BY DOB DESC)) WHERE RN = 6
Something like the above should work for Oracle, though you might have to adjust the syntax first. ROWNUM is assigned after the inner ORDER BY has been applied, so RN reflects the sorted position.

Again you may need to fix for your database, but if you want the top 2nd value in a dataset that potentially has the value duplicated, you'll want to do a group as well:
SELECT column
FROM table
WHERE column IS NOT NULL
GROUP BY column
ORDER BY column DESC
LIMIT 5 OFFSET 2;
This skips the first two distinct values and then returns the next five highest.

Pure SQL (note: I would recommend using SQL features specific to your DBMS, since they will likely be more efficient). This will get you the (n+1)th largest value (to get the smallest, flip the <). If you have duplicates, make it COUNT(DISTINCT value).
select id from table order by id desc limit 4 ;
+------+
| id |
+------+
| 2211 |
| 2210 |
| 2209 |
| 2208 |
+------+
SELECT yourvalue
FROM yourtable t1
WHERE EXISTS( SELECT COUNT(*)
FROM yourtable t2
WHERE t1.id <> t2.id
AND t1.yourvalue < t2.yourvalue
HAVING COUNT(*) = 3 )
+------+
| id |
+------+
| 2208 |
+------+

(Table Name=Student, Column Name= mark)
select * from(select row_number() over (order by mark desc) as t,mark from student group by mark) as td where t=4

You can find the nth largest value of column by using the following query:
SELECT * FROM TableName a WHERE
n = (SELECT count(DISTINCT(b.ColumnName))
FROM TableName b WHERE a.ColumnName <=b.ColumnName);

select column_name from table_name
order by column_name desc limit n-1,1;
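Here n = 1, 2, 3, ... selects the nth max value. MySQL does not accept an expression in LIMIT, so compute n-1 beforehand and substitute the literal number.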

Here's a method for Oracle. This example gets the 9th highest value. Simply replace the 9 with a bind variable containing the position you are looking for.
select created from (
select created from (
select created from user_objects
order by created desc
)
where rownum <= 9
order by created asc
)
where rownum = 1
If you wanted the nth unique value, you would add DISTINCT on the innermost query block.

Just dug out this question when looking for the answer myself, and this seems to work for SQL Server 2005 (derived from Blorgbeard's solution):
SELECT MIN(q.col1) FROM (
SELECT
DISTINCT TOP n col1
FROM myTable
ORDER BY col1 DESC
) q;
Effectively, that is a SELECT MIN(q.someCol) FROM someTable q, with the top n of the table retrieved by the SELECT DISTINCT... query.

Select max(sal)
from table t1
where N - 1 = (select count(distinct sal)
from table t2
where t2.sal > t1.sal)
This finds the Nth max sal.

SELECT * FROM tablename
WHERE columnname<(select max(columnname) from tablename)
order by columnname desc limit 1

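This query gets the nth highest value from a column: put n=0 for the second highest, n=1 for the third highest, and so on. It subtracts n from the maximum value itself, so it is only correct when the column holds consecutive integers.
SELECT * FROM TableName
WHERE ColumnName<(select max(ColumnName) from TableName)-n order by ColumnName desc limit 1;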

A simple SQL query to get the details of the employee who has the Nth MAX salary in the table Employee:
sql> select * from Employee order by salary desc LIMIT 1 OFFSET <N - 1>;
Consider table structure as:
Employee (
id [int primary key auto_increment],
name [varchar(30)],
salary [int] );
Example:
If you need 3rd MAX salary in the above table then, query will be:
sql> select * from Employee order by salary desc LIMIT 1 OFFSET 2;
Similarly:
If you need 8th MAX salary in the above table then, query will be:
sql> select * from Employee order by salary desc LIMIT 1 OFFSET 7;
NOTE:
To get the Nth MAX value, set the OFFSET to (N - 1).
The same approach works for salaries in ascending order.
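One thing worth checking with the OFFSET approach is how it treats tied salaries. The sketch below uses hypothetical employees, two of whom earn 90, and runs under Python's sqlite3 module (an assumed harness; SQLite accepts a bound parameter for OFFSET, whereas MySQL needs a literal or a prepared statement). Without DISTINCT the offset counts rows; with DISTINCT it counts salary values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO Employee (name, salary) VALUES (?, ?)",
    [("Ann", 100), ("Bob", 90), ("Cid", 90), ("Dee", 80)],  # hypothetical rows
)

n = 3  # looking for the "3rd MAX" salary

# Row-based: skips N - 1 rows, so the tied 90s occupy positions 2 and 3.
third_row_salary = conn.execute(
    "SELECT salary FROM Employee ORDER BY salary DESC LIMIT 1 OFFSET ?", (n - 1,)
).fetchone()[0]

# Value-based: skips N - 1 distinct salaries.
third_distinct_salary = conn.execute(
    "SELECT DISTINCT salary FROM Employee ORDER BY salary DESC LIMIT 1 OFFSET ?", (n - 1,)
).fetchone()[0]

print(third_row_salary, third_distinct_salary)
```

Which of the two is "the 3rd MAX" depends on whether ties should share a position, so it is worth deciding that before picking a query.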

mysql query:
Suppose I want to find the nth max salary from the employee table:
select salary
from employee
order by salary desc
limit n-1,1 ;

In SQL Server, just do:
select distinct top n+1 column from table order by column desc
And then throw away the first value, if you don't need it.

for SQL 2005:
SELECT col1 from
(select col1, dense_rank() over (order by col1 desc) ranking
from t1) subq where ranking between 2 and #n

MySQL:
select distinct(salary) from employee order by salary desc limit (n-1), 1;

Answer :
top second:
select * from (select * from deletetable where rownum <=2 order by rownum desc) where rownum <=1

select sal,ename from emp e where
2=(select count(distinct sal) from emp where e.sal<=emp.sal) or
3=(select count(distinct sal) from emp where e.sal<=emp.sal) or
4=(select count(distinct sal) from emp where e.sal<=emp.sal) order by sal desc;

I think the query below will work on Oracle SQL.
About this query: it uses two tables named employee and department. The employee table has the columns name (employee name), dept_id (common to employee and department) and salary.
The department table has the columns dept_id (shared with the employee table) and dept_name.
SELECT
tab.dept_name,MIN(tab.salary) AS Second_Max_Sal FROM (
SELECT e.name, e.salary, d.dept_name, dense_rank() over (partition BY d.dept_name ORDER BY e.salary DESC) AS rank FROM department d JOIN employee e USING (dept_id) ) tab
WHERE
rank BETWEEN 1 AND 2
GROUP BY
tab.dept_name

Another one for Oracle using analytic functions:
select distinct col1 --distinct is required to remove matching value of column
from
( select col1, dense_rank() over (order by col1 desc) rnk
from tbl
)
where rnk = :b1
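A runnable sketch of the DENSE_RANK() approach, using Python's sqlite3 module as an assumed stand-in for Oracle (window functions need SQLite 3.25 or newer). The table tbl and column col1 follow the answer above, the data is the salary list from the question, and a ? placeholder takes the place of the :b1 bind variable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (col1 INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?)", [(3500,), (2500,), (2500,), (5500,), (7500,)]
)

DENSE_RANK_QUERY = """
    SELECT DISTINCT col1
    FROM (SELECT col1, DENSE_RANK() OVER (ORDER BY col1 DESC) AS rnk
          FROM tbl) AS ranked
    WHERE rnk = ?
"""

third_highest = conn.execute(DENSE_RANK_QUERY, (3,)).fetchall()
fourth_highest = conn.execute(DENSE_RANK_QUERY, (4,)).fetchall()
print(third_highest, fourth_highest)
```

DENSE_RANK() gives tied values the same rank without leaving gaps, so the two 2500 rows share rank 4 and the DISTINCT collapses them into one result.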

Select min(fee)
from fl_FLFee
where fee in (Select top 4 Fee from fl_FLFee order by 1 desc)
Replace the number 4 with N.

You can simplify like this
SELECT MIN(Sal) FROM TableName
WHERE Sal IN
(SELECT TOP 4 Sal FROM TableName ORDER BY Sal DESC)
If the Sal contains duplicate values then use this
SELECT MIN(Sal) FROM TableName
WHERE Sal IN
(SELECT distinct TOP 4 Sal FROM TableName ORDER BY Sal DESC)
Here 4 stands for n; it may be any other position, such as 5 or 6.

(TableName=Student, ColumnName=Mark) :
select *
from student
where mark=(select mark
from(select row_number() over (order by mark desc) as t,
mark
from student group by mark) as td
where t=2)

In PostgreSQL, to find N-th largest salary from Employee table.
SELECT * FROM Employee WHERE salary in
(SELECT salary FROM Employee ORDER BY salary DESC LIMIT N)
ORDER BY salary ASC LIMIT 1;

Solution to find Nth Maximum value of a particular column in SQL Server:
Employee table data:
==========
Id name
=========
6 ARSHAD M
7 Manu
8 Shaji
Sales table data:
=================
id emp_id amount
=================
1 6 500
2 7 100
3 8 100
4 6 150
5 7 130
6 7 130
7 7 330
Query to find the details of the employee with the highest (or Nth highest) total sales:
select *
from (select E.Id, E.name, SUM(S.amount) AS 'total_amount'
from employee E INNER JOIN Sale S on E.Id = S.emp_id
group by S.emp_id, E.Id, E.name) AS T1
WHERE (0) = (select COUNT(DISTINCT(total_amount))
from (select E.Id, E.name, SUM(S.amount) AS 'total_amount'
from employee E INNER JOIN Sale S on E.Id = S.emp_id
group by S.emp_id, E.Id, E.name) AS T2
WHERE (T1.total_amount < T2.total_amount));
In the WHERE (0), replace 0 with n-1.
Result:
========================
id name total_amount
========================
7 Manu 690
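The same idea can be sketched with the per-employee totals factored into a common table expression, so the aggregation is written only once. This runs under Python's sqlite3 module (an assumed harness, not SQL Server) on the sample data above; the helper name nth_top_seller is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE sale (id INTEGER, emp_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)", [(6, "ARSHAD M"), (7, "Manu"), (8, "Shaji")]
)
conn.executemany(
    "INSERT INTO sale VALUES (?, ?, ?)",
    [(1, 6, 500), (2, 7, 100), (3, 8, 100), (4, 6, 150),
     (5, 7, 130), (6, 7, 130), (7, 7, 330)],
)


def nth_top_seller(conn, n):
    """Return (id, name, total_amount) rows whose total ranks nth among distinct totals."""
    return conn.execute(
        """
        WITH totals AS (
            SELECT e.id, e.name, SUM(s.amount) AS total_amount
            FROM employee e JOIN sale s ON e.id = s.emp_id
            GROUP BY e.id, e.name
        )
        SELECT id, name, total_amount
        FROM totals t1
        WHERE ? - 1 = (SELECT COUNT(DISTINCT t2.total_amount)
                       FROM totals t2
                       WHERE t2.total_amount > t1.total_amount)
        """,
        (n,),
    ).fetchall()


print(nth_top_seller(conn, 1), nth_top_seller(conn, 2))
```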

Table employee
salary
1256
1256
2563
8546
5645
You find the second max value by this query
select salary
from employee
where salary=(select max(salary)
from employee
where salary <(select max(salary) from employee));
You find the third max value by this query
select salary
from employee
where salary=(select max(salary)
from employee
where salary <(select max(salary)
from employee
where salary <(select max(salary)from employee)));
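Nesting one more subquery for each step gets unwieldy quickly. A sketch of the same "largest value below the previous maximum" idea, driven by a loop in Python with the built-in sqlite3 module (an assumption; the helper name nth_max_by_iteration is hypothetical), using the salaries listed above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?)", [(1256,), (1256,), (2563,), (8546,), (5645,)]
)


def nth_max_by_iteration(conn, n):
    """Step down from the maximum n - 1 times; return None if values run out."""
    current = conn.execute("SELECT MAX(salary) FROM employee").fetchone()[0]
    for _ in range(n - 1):
        if current is None:
            break
        # Largest salary strictly below the value found in the previous step.
        current = conn.execute(
            "SELECT MAX(salary) FROM employee WHERE salary < ?", (current,)
        ).fetchone()[0]
    return current


print([nth_max_by_iteration(conn, n) for n in range(1, 6)])
```

Each step issues one query, so this trades a single large statement for n small ones; duplicates such as the two 1256 rows are skipped naturally because the comparison is strict.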
