Retrieving data with single occurrence of repeated data

SELECT *
FROM employee
GROUP BY first_name
HAVING count(first_name) >= 1;
How can I retrieve all rows and columns, with each set of duplicates appearing only once? I want the full table contents, but repeated data should occur only once. In the table, some first_name, last_name pairs appear twice, with different values in the other columns.
Please help.

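Try this SQL query. As written, it returns only the rows whose FIRST_NAME occurs exactly once; any name that appears more than once is excluded entirely:
SELECT * FROM EMPLOYEE WHERE FIRST_NAME NOT IN
(
    SELECT FIRST_NAME FROM
    (
        SELECT ROW_NUMBER() OVER(PARTITION BY FIRST_NAME ORDER BY FIRST_NAME) RNK, FIRST_NAME FROM EMPLOYEE
    ) A WHERE A.RNK = 2
)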
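If the goal is one row per first_name, with each repeated name shown once, one possible sketch is to keep the row numbered 1 in each partition rather than filtering on row 2. The example below runs the query through Python's sqlite3 module against a made-up employee table; the id and city columns, the sample rows, and the choice to keep the lowest id are all assumptions, not part of the question.

```python
import sqlite3

def one_row_per_first_name(conn):
    """Return one row per first_name: the lowest-id row of each group (an assumed tie-break)."""
    query = """
        SELECT id, first_name, last_name, city
        FROM (
            SELECT e.*,
                   ROW_NUMBER() OVER (PARTITION BY first_name ORDER BY id) AS rn
            FROM employee e
        ) AS numbered
        WHERE rn = 1
        ORDER BY id
    """
    return conn.execute(query).fetchall()

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        (1, "Ann", "Lee", "Oslo"),
        (2, "Ann", "Lee", "Rome"),   # same name as id 1, different other info
        (3, "Bob", "Ray", "Lima"),
        (4, "Cy", "Fox", "Bonn"),
        (5, "Bob", "Ray", "Kyiv"),   # same name as id 3
    ],
)

for row in one_row_per_first_name(conn):
    print(row)
```

Which row survives in each group is decided by the ORDER BY inside OVER(...), so that ordering should name the column that expresses the preferred row.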

Related

SQL Select column which is not used in select section of subquery which find duplicates

I am trying to find records in my database that have duplicated fields such as name, surname and type.
Example:
SELECT name, surname, type, COUNT(*)
FROM customers
GROUP BY name, surname
HAVING COUNT(*)>1
Query results:
Robb|Stark|1|2
Tyrion|Lannister|1|3
So the customer named "Robb Stark" appears 2 times and "Tyrion Lannister" appears 3 times.
Now I want to know the ids of these records.
I found a similar problem described here:
Finding duplicate values in a SQL table
There is an answer there, but no example.

Use COUNT as an analytic function:
WITH cte AS (
SELECT *, COUNT(*) OVER (PARTITION BY name, surname) cnt
FROM customers
)
SELECT * -- return all columns
FROM cte
WHERE cnt > 1
ORDER BY name, surname;
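To see that the analytic COUNT keeps each row's id available, here is a small sketch that runs the same CTE shape through Python's sqlite3 module. The table contents are invented for illustration (the "Jon Snow" row and all id values are assumptions), and id is added to the ORDER BY only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, surname TEXT, type INTEGER)"
)
conn.executemany(
    "INSERT INTO customers (id, name, surname, type) VALUES (?, ?, ?, ?)",
    [
        (1, "Robb", "Stark", 1),
        (2, "Tyrion", "Lannister", 1),
        (3, "Robb", "Stark", 1),
        (4, "Jon", "Snow", 1),          # appears once, so cnt = 1 and it is filtered out
        (5, "Tyrion", "Lannister", 1),
        (6, "Tyrion", "Lannister", 1),
    ],
)

# Each row carries the size of its (name, surname) group in cnt.
duplicate_rows = conn.execute("""
    WITH cte AS (
        SELECT id, name, surname,
               COUNT(*) OVER (PARTITION BY name, surname) AS cnt
        FROM customers
    )
    SELECT id, name, surname, cnt
    FROM cte
    WHERE cnt > 1
    ORDER BY name, surname, id
""").fetchall()

for row in duplicate_rows:
    print(row)
```

Because the window function does not collapse rows the way GROUP BY does, every duplicated record comes back individually with its own id.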

The simplest way is to use EXISTS, as follows:
SELECT t.*
FROM customers t
WHERE EXISTS
    (SELECT 1 FROM customers tt
     WHERE tt.name = t.name
       AND tt.surname = t.surname
       AND tt.id <> t.id)
Or use your original query in an IN clause, as follows:
SELECT * FROM customers WHERE (name, surname) IN
    (SELECT name, surname
     FROM customers
     GROUP BY name, surname
     HAVING COUNT(*) > 1)

If you want one row per group of duplicates, with the list of ids in a comma-separated string, you can use string aggregation with your existing query (the STRING_AGG ... WITHIN GROUP form shown here is SQL Server syntax):
SELECT name, surname, COUNT(*) as cnt,
STRING_AGG(id, ',') WITHIN GROUP (ORDER BY id) as all_ids
FROM customers
GROUP BY name, surname
HAVING COUNT(*) > 1

Remove only duplicates from specific Postgresql table

I have a table that contains duplicates, and I would like to keep only one row for each group of duplicates.
I can select one row per group with this SQL command:
SELECT DISTINCT ON (email, first_name, last_name) * from customer;
But I would like to use DELETE together with that command.
This command should work, right?
DELETE FROM customer WHERE customer.id NOT IN
(SELECT id FROM
(SELECT DISTINCT ON (email, first_name, last_name) * from customer));
Is that correct?

I assume you have an id field.
delete from customer
where id not in (
select min(id)
from customer
group by email, first_name, last_name
)
The subquery finds the ids of the rows you want to keep (the lowest id in each group).
The DELETE then removes every other row.
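As a quick check of the keep-the-minimum-id approach, the sketch below runs the same DELETE through Python's sqlite3 module. SQLite stands in for PostgreSQL here, and the sample rows and the helper name delete_duplicate_customers are made up for illustration.

```python
import sqlite3

def delete_duplicate_customers(conn):
    """Delete all rows except the lowest-id row of each (email, first_name, last_name) group."""
    cur = conn.execute("""
        DELETE FROM customer
        WHERE id NOT IN (
            SELECT MIN(id)
            FROM customer
            GROUP BY email, first_name, last_name
        )
    """)
    return cur.rowcount  # number of rows removed

# Hypothetical data: ids 1, 2 and 4 are duplicates of each other.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, email TEXT, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [
        (1, "ann@example.org", "Ann", "Lee"),
        (2, "ann@example.org", "Ann", "Lee"),
        (3, "bob@example.org", "Bob", "Ray"),
        (4, "ann@example.org", "Ann", "Lee"),
    ],
)

removed = delete_duplicate_customers(conn)
remaining_ids = [row[0] for row in conn.execute("SELECT id FROM customer ORDER BY id")]
print(removed, remaining_ids)
```

Running the DELETE inside a transaction and checking the remaining rows before committing is a sensible precaution on real data.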

I can't see where your id comes from in (SELECT DISTINCT ON (email, first_name, last_name) * from customer).
DISTINCT ON returns only the first row of each group of duplicates, and without an ORDER BY it is unpredictable which row that is.

How to get all the columns, but have to group by only 2 columns in sql query

I have a table Employees, which has the following fields:
Employee_name, Employee_id, Employee_status, Employee_loc, last_update_time.
This table does not have any constraints.
I have tried the query below.
select Employee_name, count(1)
from Employees
where Employee_status = 'ACTIVE'
Group by Employee_name,Employee_loc
having count(Employee_name) > 1
order by count(Employee_name) desc
In the select, I need to get Employee_id too. Can anyone help with how to get that?

You can just add Employee_id to the query, and also add it to the GROUP BY clause. (Adding it to the grouping won't make any difference to the query results, assuming each employee name corresponds to exactly one employee id.)
If the grouping does make a difference, that implies that some combinations of employee name and location have more than one ID associated with them. Your query would therefore need to decide which ID to return, possibly by using an aggregate function.

SELECT EMPLOYEE_NAME, EMPLOYEE_ID, COUNT(1)
FROM
EMPLOYEES
WHERE
EMPLOYEE_NAME IN
(
SELECT EMPLOYEE_NAME
FROM EMPLOYEES
WHERE Employee_status = 'ACTIVE'
GROUP BY Employee_name,Employee_loc
HAVING COUNT(*) > 1
)
GROUP BY EMPLOYEE_NAME, EMPLOYEE_ID

You can also use the PARTITION BY clause and select whichever columns you want to see, irrespective of the columns you are using for aggregation.
A very short and simple explanation is here: Oracle "Partition By" Keyword
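A minimal sketch of the PARTITION BY idea applied to the Employees table from the question, run through Python's sqlite3 module. The sample rows are invented, and the subquery alias counted and the column alias cnt are illustrative names only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Employees (
        Employee_name TEXT, Employee_id INTEGER, Employee_status TEXT,
        Employee_loc TEXT, last_update_time TEXT
    )
""")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?, ?, ?)",
    [
        ("Ann", 1, "ACTIVE", "Oslo", "2020-01-01"),
        ("Ann", 2, "ACTIVE", "Oslo", "2020-01-02"),
        ("Ann", 3, "INACTIVE", "Oslo", "2020-01-03"),  # excluded by the status filter
        ("Bob", 4, "ACTIVE", "Lima", "2020-01-04"),
        ("Ann", 5, "ACTIVE", "Rome", "2020-01-05"),    # different location, so its own group
    ],
)

# The window count is computed per (name, location) but every row keeps its own Employee_id.
repeated = conn.execute("""
    SELECT Employee_name, Employee_id, cnt
    FROM (
        SELECT Employee_name, Employee_id,
               COUNT(*) OVER (PARTITION BY Employee_name, Employee_loc) AS cnt
        FROM Employees
        WHERE Employee_status = 'ACTIVE'
    ) AS counted
    WHERE cnt > 1
    ORDER BY cnt DESC, Employee_id
""").fetchall()

print(repeated)
```

The filter on cnt has to sit in an outer query because window functions cannot be referenced in the WHERE clause of the query that computes them.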

How do I pull just the records needed for display from a query?

I am using the DataTables jQuery plugin.
I would like to pull just the records needed for display with each query to the database, because there will possibly be 100s of thousands of records in the table. So instead of doing something like this and calling every record in the table and only showing a certain number of them at a time, due to pagination...
<cfquery name="get_users" datasource="dsn">
select user_id, first_name, last_name
from users
</cfquery>
<cfloop query="get_users" startrow="#startrow#" endrow="#endrow#">
...
</cfloop>
Is there a way to put the startrow and endrow within the cfquery tag or within sql somehow, to only get a certain number of records each time?

You'll want to send your starting point and number of records per "page" to the database, and have it return just those records. I don't know what database you're using, but here's an example of a query for MS SQL Server (2005+):
SELECT user_id, first_name, last_name
FROM (
SELECT ROW_NUMBER() OVER(
ORDER BY last_name, first_name
) AS rownum, user_id, first_name, last_name
FROM users
) AS users_page
WHERE rownum >= 1000 AND rownum <= 1010
ORDER BY last_name, first_name
This returns rows 1000 through 1010 inclusive, which is 11 rows.
Here's the MySQL version (it returns the 10 rows starting at row 1000):
SELECT user_id, first_name, last_name
FROM users
ORDER BY last_name, first_name
LIMIT 999, 10 /* offset is zero-indexed in mysql */
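The start position and page size would normally arrive as parameters rather than literals. Here is a hedged sketch of that in Python with sqlite3, standing in for whatever application layer and database are actually in use; the helper name fetch_page, the sample data, and the extra user_id tie-break in the ORDER BY are assumptions for illustration.

```python
import sqlite3

def fetch_page(conn, start_row, page_size):
    """Return page_size rows beginning at the 1-based position start_row."""
    query = """
        SELECT user_id, first_name, last_name
        FROM users
        ORDER BY last_name, first_name, user_id  -- user_id breaks ties so pages stay stable
        LIMIT ? OFFSET ?
    """
    # OFFSET is zero-based, so row 1 corresponds to offset 0.
    return conn.execute(query, (page_size, start_row - 1)).fetchall()

# Hypothetical table with 25 users whose names sort in user_id order.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (user_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(i, f"First{i:02d}", f"Last{i:02d}") for i in range(1, 26)],
)

page = fetch_page(conn, start_row=11, page_size=5)
print([row[0] for row in page])
```

Binding the values as parameters keeps the paging inputs out of the SQL text, which matters when they come from a client-side plugin.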

MySQL, return only rows where there are duplicates among two columns

I have a table in MySQL of contact information:
first name, last name, address, etc.
I would like to run a query on this table that will return only rows with first and last name combinations which appear in the table more than once.
I do not want to group the "duplicates" (which may only be duplicates of the first and last name, but not other information like address or birthdate) -
I want to return all the "duplicate" rows so I can look over the results and determine if they are dupes or not. This seemed like it would be a simple thing to do, but it has not been.
Every solution I can find either groups the dupes and gives me a count only (which is not useful for what I need to do with the results) or doesn't work at all.
Is this kind of logic even possible in a query ? Should I try and do this in Python or something?

You should be able to do this with the GROUP BY approach in a sub-query.
SELECT t.first_name, t.last_name, t.address
FROM your_table t
JOIN ( SELECT first_name, last_name
       FROM your_table
       GROUP BY first_name, last_name
       HAVING COUNT(*) > 1
     ) t2
ON ( t.first_name = t2.first_name AND t.last_name = t2.last_name )
The sub-query returns all names (first_name and last_name) that exist more than once, and the JOIN returns all records that match these names.
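The sub-query-plus-JOIN idea can be tried out with a few rows. This sketch uses Python's sqlite3 module in place of MySQL, with the two join conditions combined by AND; the contacts table name and its rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (first_name TEXT, last_name TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [
        ("Ann", "Lee", "1 Oak St"),
        ("Ann", "Lee", "9 Elm Rd"),   # same name as the row above, different address
        ("Bob", "Ray", "4 Pine Av"),
        ("Ann", "Ray", "7 Fir Ln"),   # shares only one name part, so not a duplicate
    ],
)

# The grouped sub-query lists repeated names; the join pulls back every full row for them.
possible_dupes = conn.execute("""
    SELECT t.first_name, t.last_name, t.address
    FROM contacts t
    JOIN (
        SELECT first_name, last_name
        FROM contacts
        GROUP BY first_name, last_name
        HAVING COUNT(*) > 1
    ) t2
      ON t.first_name = t2.first_name AND t.last_name = t2.last_name
    ORDER BY t.last_name, t.first_name, t.address
""").fetchall()

for row in possible_dupes:
    print(row)
```

Rows whose first or last name is NULL will not come back from this join, because a NULL never compares equal to another NULL.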

You could do it with GROUP BY / HAVING and a sub-select. Something like:
SELECT t.*
FROM Table t INNER JOIN
(
SELECT FirstName, LastName
FROM Table
GROUP BY FirstName, LastName
HAVING COUNT(*) > 1
) Dups ON t.FirstName = Dups.FirstName
AND t.LastName = Dups.LastName

select * from people
join (select firstName, lastName
from people
group by firstName, lastName
having count(*) > 1
) dupe
using (firstName, lastName)
