How to remove duplicate entries in SQL using selected columns only?

My data is as shown below:
id | name | date | country | vendor
1717 | CUST A | 8-Aug-1978 | INDIA | VENDOR 1
1972 | CUST B | 1-Jan-1965 | INDIA | VENDOR 2
2083 | CUST C | 1-Jan-1936 | AUSTRALIA | VENDOR 1
2189 | CUST D | 27-May-2000 | USA | VENDOR 4
2189 | CUST D | 27-May-2000 | USA | VENDOR 5
2189 | CUST D | 27-May-2000 | USA | VENDOR 6
Question:
I want to remove duplicate rows based on the columns id, name, date, gender and country only (that is, ignoring vendor).
In the above example, the 5th and 6th entries duplicate the 4th except for their vendors.
Using a SELECT query, how can I drop the 5th and 6th entries and keep only the 4th?
By keeping the 4th entry, I mean keeping the first row that the SELECT returns for each group of duplicates.

One method is group by:
select id, name, date, gender, country, min(vendor) as vendor
from t
group by id, name, date, gender, country;
This returns the smallest vendor value in each group, which is an arbitrary choice. Tables in SQL represent unordered sets, so there is no concept of a 4th, 5th or 6th row. If you want one particular vendor value, you need to specify how that value is determined.
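As a rough sketch, the GROUP BY approach can be tried against the sample rows from the question with Python's built-in sqlite3 module. The table name t comes from the answer; the gender column is left out because the sample data does not include it, and SQLite is only a stand-in for whatever database is actually in use.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database, and the
# gender column is omitted because the question's sample data lacks it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, name TEXT, date TEXT, country TEXT, vendor TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [
        (1717, "CUST A", "8-Aug-1978", "INDIA", "VENDOR 1"),
        (1972, "CUST B", "1-Jan-1965", "INDIA", "VENDOR 2"),
        (2083, "CUST C", "1-Jan-1936", "AUSTRALIA", "VENDOR 1"),
        (2189, "CUST D", "27-May-2000", "USA", "VENDOR 4"),
        (2189, "CUST D", "27-May-2000", "USA", "VENDOR 5"),
        (2189, "CUST D", "27-May-2000", "USA", "VENDOR 6"),
    ],
)

# One row per (id, name, date, country); MIN picks the smallest vendor string.
group_by_rows = conn.execute(
    """
    SELECT id, name, date, country, MIN(vendor) AS vendor
    FROM t
    GROUP BY id, name, date, country
    ORDER BY id
    """
).fetchall()

for row in group_by_rows:
    print(row)
```

The three CUST D rows collapse into one, and the vendor that survives is whichever the aggregate selects, not "the first row".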

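To list the combinations that occur more than once, filter on the aggregate with HAVING (a WHERE clause cannot follow GROUP BY or test the count):
SELECT count(vendor) AS cnt, id, name, date, gender, country
FROM TABLENAME
GROUP BY id, name, date, gender, country
HAVING count(vendor) > 1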

You can use Row_Number()
select * from (
select *, RowN = Row_Number() over(partition by id, name, date, gender, country order by vendor) -- order by vendor so the row that is kept is deterministic
from YourTable ) a where a.RowN = 1
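A minimal sketch of the ROW_NUMBER() approach, run through Python's sqlite3 module (window functions need SQLite 3.25 or newer). The ORDER BY vendor DESC tiebreaker is my own assumption, chosen to show that the ordering inside the window decides which duplicate is kept; the gender column is omitted because the sample data does not have it.

```python
import sqlite3

# Assumption: in-memory SQLite (3.25+) stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (id INTEGER, name TEXT, date TEXT, country TEXT, vendor TEXT)"
)
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?, ?, ?)",
    [
        (1717, "CUST A", "8-Aug-1978", "INDIA", "VENDOR 1"),
        (1972, "CUST B", "1-Jan-1965", "INDIA", "VENDOR 2"),
        (2083, "CUST C", "1-Jan-1936", "AUSTRALIA", "VENDOR 1"),
        (2189, "CUST D", "27-May-2000", "USA", "VENDOR 4"),
        (2189, "CUST D", "27-May-2000", "USA", "VENDOR 5"),
        (2189, "CUST D", "27-May-2000", "USA", "VENDOR 6"),
    ],
)

# Number the rows inside each duplicate group, then keep only row 1.
# ORDER BY vendor DESC (hypothetical choice) keeps the largest vendor.
row_number_rows = conn.execute(
    """
    SELECT id, name, date, country, vendor
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY id, name, date, country
                   ORDER BY vendor DESC
               ) AS rn
        FROM YourTable
    ) a
    WHERE a.rn = 1
    ORDER BY id
    """
).fetchall()

for row in row_number_rows:
    print(row)
```

Unlike the GROUP BY version, this keeps a whole original row, so any other columns come from the same record as the chosen vendor.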

If you are not interested in preserving the vendor information, you can use the DISTINCT keyword:
select distinct id, name, date, gender, country
from yourTable
This way, rows that differ only in the excluded column become identical, and DISTINCT makes the query return just one of them.
Edit
If you want to keep only the rows that have no duplicates, first select the combinations of id, name, date, gender and country that occur exactly once:
select id, name, date, gender, country, count(*)
from yourTable
group by id, name, date, gender, country
having count(*) = 1
Then use this result to filter the original table by joining the two together:
select t1.*
from yourTable t1
join (
select id, name, date, gender, country, count(*)
from yourTable
group by id, name, date, gender, country
having count(*) = 1
) t2
on t1.id = t2.id and
t1.name = t2.name and
t1.date = t2.date and
t1.gender = t2.gender and
t1.country = t2.country
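The join above can be sketched with Python's sqlite3 module against the question's sample rows. This assumes an in-memory SQLite database and drops the gender column, which the sample data does not contain.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (id INTEGER, name TEXT, date TEXT, country TEXT, vendor TEXT)"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?, ?)",
    [
        (1717, "CUST A", "8-Aug-1978", "INDIA", "VENDOR 1"),
        (1972, "CUST B", "1-Jan-1965", "INDIA", "VENDOR 2"),
        (2083, "CUST C", "1-Jan-1936", "AUSTRALIA", "VENDOR 1"),
        (2189, "CUST D", "27-May-2000", "USA", "VENDOR 4"),
        (2189, "CUST D", "27-May-2000", "USA", "VENDOR 5"),
        (2189, "CUST D", "27-May-2000", "USA", "VENDOR 6"),
    ],
)

# The subquery finds key combinations that occur exactly once; the join
# then returns the full original rows for those combinations only.
unique_rows = conn.execute(
    """
    SELECT t1.*
    FROM yourTable t1
    JOIN (
        SELECT id, name, date, country
        FROM yourTable
        GROUP BY id, name, date, country
        HAVING COUNT(*) = 1
    ) t2
      ON t1.id = t2.id
     AND t1.name = t2.name
     AND t1.date = t2.date
     AND t1.country = t2.country
    ORDER BY t1.id
    """
).fetchall()

for row in unique_rows:
    print(row)
```

Note that this drops CUST D entirely, because every one of its rows has a duplicate; it does not keep one representative.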

Related

How to check how many times some values are duplicated?

I have table like below:
city | segment
------------------
London | A
London | B
New York | A
Berlin | B
Barcelona | C
Barcelona | H
Barcelona | E
Each city should have only one segment, but as you can see, two cities (London and Barcelona) have more than one segment.
The result table must contain only the cities that have more than one segment.
I need a result like the one below, where:
city - the city from the table above
no_segments - the number of segments the city has in the table above
segments - the segments of that city in the table above
city | no_segments | segments
------------------------------
London | 2 | A
 | | B
Barcelona | 3 | C
 | | H
 | | E
How can I do that in Oracle?

You can use the analytic functions COUNT(*) OVER (...) (to get the number of segments) and ROW_NUMBER() (to decide which rows display the city and the count), such as:
WITH t1 AS
(
SELECT city,
segment,
COUNT(*) OVER (PARTITION BY city) AS no_segments,
ROW_NUMBER() OVER (PARTITION BY city ORDER BY segment) rn
FROM t
)
SELECT DECODE(rn,1,city) AS city,
DECODE(rn,1,no_segments) AS no_segments,
segment
FROM t1
WHERE no_segments > 1
ORDER BY t1.city, segment

Another way to do this is:
SELECT NULLIF(CITY, PREV_CITY) AS CITY,
SEGMENT
FROM (SELECT CITY,
LAG(CITY) OVER (ORDER BY CITY DESC) AS PREV_CITY,
SEGMENT,
COUNT(SEGMENT) OVER (PARTITION BY CITY) AS CITY_SEGMENT_COUNT
FROM CITY_SEGMENTS)
WHERE CITY_SEGMENT_COUNT > 1
Using LAG() to determine the "previous" CITY allows us to directly compare the CITY values, which in my mind is clearer than using ROW_NUMBER = 1.

;with cte as (
Select city, count(seg) as cntseg
From table1
Group by city having count(seg) > 1
)
Select a.city, b.cntseg, a.seg
From table1 as a join cte as b
On a.city = b.city
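A small sketch of this CTE-plus-join query, run with Python's sqlite3 module on the question's city data. The names table1 and seg follow the answer; the ORDER BY is my addition so the output order is predictable, and SQLite is only a stand-in for the real database.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (city TEXT, seg TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [
        ("London", "A"),
        ("London", "B"),
        ("New York", "A"),
        ("Berlin", "B"),
        ("Barcelona", "C"),
        ("Barcelona", "H"),
        ("Barcelona", "E"),
    ],
)

# The CTE keeps only cities with more than one segment; joining back to
# the table lists every segment of those cities next to the count.
multi_segment_rows = conn.execute(
    """
    WITH cte AS (
        SELECT city, COUNT(seg) AS cntseg
        FROM table1
        GROUP BY city
        HAVING COUNT(seg) > 1
    )
    SELECT a.city, b.cntseg, a.seg
    FROM table1 AS a
    JOIN cte AS b ON a.city = b.city
    ORDER BY a.city, a.seg
    """
).fetchall()

for row in multi_segment_rows:
    print(row)
```

This version repeats the city and count on every row; blanking them on all but the first row per city needs something like the ROW_NUMBER() or LAG() techniques shown in the other answers.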

Reconciliation Automation Query

I have one database, and from time to time I change part of a query as requirements change.
I want to record the results of a query both before and after the change in one table, and show the queries whose results differ.
For Example,
Consider following table
emp_id country salary
---------------------
1 usa 1000
2 uk 2500
3 uk 1200
4 usa 3500
5 usa 4000
6 uk 1100
Now, my before query is :
Before Query:
select count(emp_id) as count,country from table where salary>2000 group by country;
Before Result:
count country
2 usa
1 uk
After Query:
select count(emp_id) as count,country from table where salary<2000 group by country;
After Query Result:
count country
2 uk
1 usa
My Final Result or Table I want is:
column 1 | column 2 | column 3 | column 4 |
2 usa 2 uk
1 uk 1 usa
...... but if the query results are the same, then they shouldn't appear in this table.
Thanks in advance.

I believe that you can use the same approach as here.
select t1.*, t2.* -- if you need specific columns without rn, then you have to list them here
from
(
  select t.*, row_number() over (order by count) rn
  from
  (
    -- query #1 (no trailing semicolon inside a subquery)
    select count(emp_id) as count, country from table where salary>2000 group by country
  ) t
) t1
full join
(
  select t.*, row_number() over (order by count) rn
  from
  (
    -- query #2 (no trailing semicolon inside a subquery)
    select count(emp_id) as count, country from table where salary<2000 group by country
  ) t
) t2 on t1.rn = t2.rn

Oracle Sql : distinct value in a specific field [duplicate]

I have the following table:
Country | Name | Number
us | John | 45
us | Jeff | 35
fr | Jean | 31
it | Luigi | 25
fr | Maxime | 23
ca | Justin | 23
The table is ordered by Number. I want a query that, for each country, gives me the name with the highest number:
Country | Name | Number
us | John | 45
fr | Jean | 31
it | Luigi | 25
ca | Justin | 23
I tried to use DISTINCT, but I can't apply it to Country alone if I want to output all the columns.
Any ideas?
EDIT:
The table is obtained from a subquery.

SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE Countries AS
SELECT 'us' AS Country, 'John' AS Name, 45 AS "Number" FROM DUAL
UNION ALL SELECT 'us' AS Country, 'Jeff' AS Name, 35 AS "Number" FROM DUAL
UNION ALL SELECT 'fr' AS Country, 'Jean' AS Name, 31 AS "Number" FROM DUAL
UNION ALL SELECT 'it' AS Country, 'Luigi' AS Name, 25 AS "Number" FROM DUAL
UNION ALL SELECT 'fr' AS Country, 'Maxime' AS Name, 23 AS "Number" FROM DUAL
UNION ALL SELECT 'ca' AS Country, 'Justin' AS Name, 23 AS "Number" FROM DUAL;
Query 1:
SELECT Country,
MAX( Name ) KEEP ( DENSE_RANK FIRST ORDER BY "Number" DESC ) AS "Name",
MAX( "Number" ) AS "Number"
FROM Countries
GROUP BY Country
Results:
| COUNTRY | Name | Number |
|---------|--------|--------|
| ca | Justin | 23 |
| fr | Jean | 31 |
| it | Luigi | 25 |
| us | John | 45 |

I do not have an Oracle database handy, but I got this working in SQL Server and I think it will also work in Oracle with two adjustments: Oracle does not accept AS before a subquery alias, and NUMBER is a reserved word there, so the column is quoted as "Number":
SELECT m.Country, m.Name, m."Number"
FROM mytable m
INNER JOIN (
  SELECT Country, MAX("Number") AS max_number
  FROM mytable GROUP BY Country
) tmp ON m.Country = tmp.Country AND m."Number" = tmp.max_number
ORDER BY m."Number" DESC
This has the added benefit that it returns every tied record when two people in a given country share the same highest number.
You didn't give us a table name, so I just called it mytable.
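To illustrate the join-on-maximum idea, here is a sketch using Python's sqlite3 module and the rows from the question. The table name mytable follows the answer, max_number is a hypothetical alias, and SQLite is only a stand-in, so Oracle-specific behavior is not exercised.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE mytable (Country TEXT, Name TEXT, "Number" INTEGER)')
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("us", "John", 45),
        ("us", "Jeff", 35),
        ("fr", "Jean", 31),
        ("it", "Luigi", 25),
        ("fr", "Maxime", 23),
        ("ca", "Justin", 23),
    ],
)

# The subquery finds the highest number per country; the join returns
# the full rows that carry that number (all of them, if several tie).
top_rows = conn.execute(
    """
    SELECT m.Country, m.Name, m."Number"
    FROM mytable m
    INNER JOIN (
        SELECT Country, MAX("Number") AS max_number
        FROM mytable
        GROUP BY Country
    ) tmp
      ON m.Country = tmp.Country
     AND m."Number" = tmp.max_number
    ORDER BY m."Number" DESC
    """
).fetchall()

for row in top_rows:
    print(row)
```

The sample data has no ties within a country, so exactly one row per country comes back here.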

Try below query:
SELECT Country, MAX(numbeer) FROM Table_Name GROUP BY Country
Here is the updated query, which also returns Name:
SELECT t1.* FROM table1 t1 INNER JOIN
(SELECT country, max(numbeer) as numbeer FROM table1 GROUP BY country) t2
ON t1.country=t2.country AND t1.numbeer=t2.numbeer;

Use row_number():
select t.Country, t.Name, t.Number
from (select t.*,
row_number() over (partition by country order by number desc) as seqnum
from table t
) t
where seqnum = 1;

SELECT Top 1 ID, DISTINCT Field

I have a sample table as follows:
ID | City
--------------
1 | New York
2 | San Francisco
3 | New York
4 | Los Angeles
5 | Atlanta
I would like to select the distinct City AND the TOP ID for each. E.g., conceptually I would like to do the following
SELECT TOP 1 ID, DISTINCT City
FROM Cities
Should give me:
ID | City
--------------
1 | New York
2 | San Francisco
4 | Los Angeles
5 | Atlanta
Because New York appears twice, it's taken the first ID 1 in this instance.
But I get the error:
Column 'Cities.ID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

Try this way:
SELECT min(ID), City
FROM Cities
Group by City
The MIN function picks one ID for each city, here the smaller of the two New York IDs.

You need to have your city in a GROUP BY
SELECT MIN(ID), City
FROM Cities
GROUP BY City

A more general solution is to use row_number, which also lets you return the other columns of the table (the derived table needs an alias in SQL Server):
select * from
(select *, row_number() over(partition by City order by ID) as rn from Cities) t
where rn = 1
But for this particular table, grouping alone will do the job:
select City, Min(ID) as ID
from Cities
group by City
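As a quick check of the grouping query, here is a sketch that runs it on the question's Cities data with Python's sqlite3 module. The alias first_id and the ORDER BY are my additions for readable, predictable output; SQLite is only a stand-in for SQL Server.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Cities (ID INTEGER, City TEXT)")
conn.executemany(
    "INSERT INTO Cities VALUES (?, ?)",
    [
        (1, "New York"),
        (2, "San Francisco"),
        (3, "New York"),
        (4, "Los Angeles"),
        (5, "Atlanta"),
    ],
)

# One row per city, carrying the lowest ID seen for that city.
city_rows = conn.execute(
    """
    SELECT MIN(ID) AS first_id, City
    FROM Cities
    GROUP BY City
    ORDER BY first_id
    """
).fetchall()

for row in city_rows:
    print(row)
```

New York appears once with ID 1, matching the result the question asks for.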

If you have a complex scenario where GROUP BY cannot be used, you can use the ROW_NUMBER() function with a common table expression.
;WITH CTE AS
(
SELECT ID, City, ROW_NUMBER() OVER (PARTITION BY City ORDER BY Id) rn
FROM YourTable
)
SELECT Id, City
FROM CTE
WHERE rn = 1

select n rows in sql

I have a table
Country | Capital
----------------------
France | Paris
Germany | Berlin
USA | Washington
Russia | Moscow
I need to select all rows except the first one. The table has no primary key.
How should I do this?

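SELECT country, capital
FROM (
  SELECT country, capital, rownum AS rn
  FROM (
    SELECT country, capital
    FROM your_table
    ORDER BY country
  )
)
WHERE rn > 1
ROWNUM is assigned before ORDER BY is applied within the same query block, so the sort goes in an inner query and rownum is taken outside it. If the "first one" is not defined through sorting by country, then you need to apply a different ORDER BY in the innermost query.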
Edit
For completeness, the ANSI SQL solution to this would be:
SELECT *
FROM (
  SELECT country,
         capital,
         row_number() over (order by country) as rn
  FROM your_table
) t
WHERE rn > 1
That is a portable solution that works on almost all major DBMS
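A minimal sketch of the row_number() version, run with Python's sqlite3 module (window functions need SQLite 3.25 or newer). It assumes, as the answer does, that "first" means first when sorted by country; the table name your_table follows the answer and SQLite is only a stand-in.

```python
import sqlite3

# Assumption: in-memory SQLite (3.25+) stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (country TEXT, capital TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [
        ("France", "Paris"),
        ("Germany", "Berlin"),
        ("USA", "Washington"),
        ("Russia", "Moscow"),
    ],
)

# Number the rows by country, then drop the row numbered 1.
remaining_rows = conn.execute(
    """
    SELECT country, capital
    FROM (
        SELECT country,
               capital,
               ROW_NUMBER() OVER (ORDER BY country) AS rn
        FROM your_table
    ) t
    WHERE rn > 1
    ORDER BY rn
    """
).fetchall()

for row in remaining_rows:
    print(row)
```

Sorted by country, France is the first row, so it is the one left out.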

The way to do it with Oracle is the following:
SELECT country, capital FROM
( SELECT rownum rn, country, capital
FROM table
)
WHERE rn > 1
You cannot put a direct >N condition on rownum, because ROWNUMs are assigned when rows are fetched and your condition will never evaluate to TRUE.
Alternative is:
SELECT country, capital FROM table
MINUS
SELECT country, capital FROM table WHERE rownum <= 1
