Find duplicate entries in a column [duplicate]


This question already has answers here:
How do I find duplicate values in a table in Oracle?
(13 answers)
Closed 7 years ago.
I am writing this query to find duplicate CTN records in table1. My thinking is that if a CTN_NO appears more than twice, I want it shown at the top of my SELECT * output.
I tried the following sub-query logic, but it does not pull the rows I need:
SELECT *
table1
WHERE S_IND='Y'
and CTN_NO = (select CTN_NO
from table1
where S_IND='Y'
and count(CTN_NO) < 2);
order by 2

Using:
SELECT t.ctn_no
FROM YOUR_TABLE t
GROUP BY t.ctn_no
HAVING COUNT(t.ctn_no) > 1
...will show you the ctn_no value(s) that have duplicates in your table. Adding criteria to the WHERE will allow you to further tune what duplicates there are:
SELECT t.ctn_no
FROM YOUR_TABLE t
WHERE t.s_ind = 'Y'
GROUP BY t.ctn_no
HAVING COUNT(t.ctn_no) > 1
If you want to see the other column values associated with the duplicate, you'll want to use a self join:
SELECT x.*
FROM YOUR_TABLE x
JOIN (SELECT t.ctn_no
FROM YOUR_TABLE t
GROUP BY t.ctn_no
HAVING COUNT(t.ctn_no) > 1) y ON y.ctn_no = x.ctn_no
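The self join above can be tried without an Oracle instance. The following is a minimal sketch that runs the same GROUP BY ... HAVING pattern against an in-memory SQLite database from Python. The id column and the sample rows are made up for illustration; only table1, ctn_no and s_ind come from the question.

```python
import sqlite3

# In-memory database; the id column and all rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, ctn_no TEXT, s_ind TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (1, "A100", "Y"),
        (2, "A100", "Y"),  # A100 appears twice with s_ind = 'Y': a duplicate
        (3, "B200", "Y"),
        (4, "C300", "Y"),
        (5, "C300", "N"),  # filtered out, so C300 is not a duplicate among 'Y' rows
    ],
)

# Inner query finds the duplicated ctn_no values; the join pulls the full rows.
duplicate_rows = conn.execute(
    """
    SELECT x.id, x.ctn_no
    FROM table1 x
    JOIN (SELECT t.ctn_no
          FROM table1 t
          WHERE t.s_ind = 'Y'
          GROUP BY t.ctn_no
          HAVING COUNT(t.ctn_no) > 1) y ON y.ctn_no = x.ctn_no
    WHERE x.s_ind = 'Y'
    ORDER BY x.ctn_no, x.id
    """
).fetchall()

print(duplicate_rows)
```

Note that the s_ind filter is applied both inside the subquery and on the outer rows; otherwise the 'N' row for a duplicated ctn_no would also be returned.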

Try this query. It uses the analytic function SUM:
SELECT * FROM
(
SELECT SUM(1) OVER(PARTITION BY ctn_no) cnt, A.*
FROM table1 a
WHERE s_ind ='Y'
)
WHERE cnt > 2
I'm not sure why you identify a record as a duplicate only if the ctn_no repeats more than 2 times. For me, if it appears more than once it is a duplicate. In that case, change the last part of the query to WHERE cnt > 1

Related

How can I combine two code at the same time? [duplicate]

This question already has answers here:
Select top and bottom rows
(9 answers)
Closed 7 months ago.
Question: I need the information of the youngest and the oldest employee at the same time.
I can find the youngest or the oldest employee, but I can't get both in one result.
My SQL query:
-- The information of the oldest employee
Select top 1*
From DIP_Employees1
where DogumTarihi=
(Select Min(DogumTarihi)
From DIP_Employees1
)
-- The information of the youngest employee
Select top 1*
From DIP_Employees1
where DogumTarihi=
(Select Max(DogumTarihi)
From DIP_Employees1
)
How can I combine these two queries into one?

You can combine two queries that return the same columns using UNION ALL.
The subqueries are also unnecessary; you can just use ORDER BY:
SELECT *
FROM (
Select top 1 *
From DIP_Employees1
ORDER BY DogumTarihi ASC
) t1
UNION ALL
SELECT *
FROM (
Select top 1 *
From DIP_Employees1
ORDER BY DogumTarihi DESC
) t2

Another option:
with cte as (
select min(DogumTarihi) as mndg, max(DogumTarihi) as mxdg
from dbo.DIP_Employees1
)
select ...
from dbo.DIP_Employees1 as emp
inner join cte on emp.DogumTarihi = cte.mndg
or emp.DogumTarihi = cte.mxdg
order by ...
;
Notice that some suggestions assume birthdate is unique. However unlikely it might be that multiple persons will have the same birthdate, why assume at all?
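To see why ties matter, here is a small sketch of the min/max CTE approach run against SQLite from Python. The name column and the sample employees are hypothetical; only DIP_Employees1 and DogumTarihi come from the question. Two employees share the earliest birthdate, and both are returned, which a TOP 1 query would not do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "name" is a hypothetical column; dates are ISO strings so they sort correctly.
conn.execute("CREATE TABLE DIP_Employees1 (name TEXT, DogumTarihi TEXT)")
conn.executemany(
    "INSERT INTO DIP_Employees1 VALUES (?, ?)",
    [
        ("Ayse", "1960-03-01"),
        ("Mehmet", "1960-03-01"),  # same birthdate as Ayse: a tie for oldest
        ("Can", "1985-07-15"),
        ("Elif", "1999-12-31"),
    ],
)

# One pass computes both extremes; the join keeps every row matching either one.
oldest_and_youngest = conn.execute(
    """
    WITH cte AS (
        SELECT MIN(DogumTarihi) AS mndg, MAX(DogumTarihi) AS mxdg
        FROM DIP_Employees1
    )
    SELECT emp.name, emp.DogumTarihi
    FROM DIP_Employees1 AS emp
    INNER JOIN cte ON emp.DogumTarihi = cte.mndg
                   OR emp.DogumTarihi = cte.mxdg
    ORDER BY emp.DogumTarihi, emp.name
    """
).fetchall()

print(oldest_and_youngest)
```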

Distinct one field but wanted to display all the columns [duplicate]

This question already has answers here:
Get top 1 row of each group
(19 answers)
Closed 3 years ago.
Can anyone help me with the easiest way to get distinct values of one particular field/column while still displaying all fields/columns? Please see the attached image for the data source. I tried a query of my own, but it returns all 16 records, and I'm looking for only 6.
USE DBASE;
WITH t1 as (SELECT DISTINCT STATE
FROM DSOURCE),
t2 as (SELECT *
FROM DSOURCE)
SELECT
*
FROM
t1
LEFT JOIN t2 ON t1.STATE=t2.STATE

You want row_number() :
select d.*
from (select d.*, row_number() over (partition by d.state order by d.f) as seq
from dsource d
) d
where d.seq = 1;

row_number() is your saviour here:
;WITH CTE AS
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY STATE ORDER BY B,C,D,E,F) Corr
FROM dsource
)
SELECT *
FROM CTE
WHERE Corr = 1

You clearly want the first row from each state's data. However, your dataset doesn't have a clear indicator of what is "first". So, you need to take one of two approaches.
If your data actually has an IDENTITY column, you can approach it with a query like this:
SELECT *
FROM DSOURCE d
WHERE ID In (
SELECT MIN(ID)
FROM DSOURCE ds
GROUP BY State
)
If not, you will need to use the row_number() functionality as shown above. @yogesh-sharma has the best example of using this.
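The two approaches can pick different rows, because each one defines "first" differently. The sketch below runs both against SQLite from Python (window functions need SQLite 3.25 or later). The id and f columns and the sample rows are assumptions made for illustration; the real DSOURCE layout is only shown in the question's image.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: id plays the role of the IDENTITY column.
conn.execute("CREATE TABLE dsource (id INTEGER PRIMARY KEY, state TEXT, f TEXT)")
conn.executemany(
    "INSERT INTO dsource VALUES (?, ?, ?)",
    [(1, "CA", "x"), (2, "CA", "a"), (3, "NY", "m"), (4, "NY", "n"), (5, "TX", "z")],
)

# Approach 1: "first" means the smallest id within each state.
first_by_id = conn.execute(
    """
    SELECT id, state, f
    FROM dsource
    WHERE id IN (SELECT MIN(id) FROM dsource GROUP BY state)
    ORDER BY state
    """
).fetchall()

# Approach 2: "first" means the smallest f within each state, via ROW_NUMBER().
first_by_f = conn.execute(
    """
    SELECT id, state, f
    FROM (SELECT d.*,
                 ROW_NUMBER() OVER (PARTITION BY d.state ORDER BY d.f) AS seq
          FROM dsource d) ranked
    WHERE seq = 1
    ORDER BY state
    """
).fetchall()

print(first_by_id)
print(first_by_f)
```

Both queries return one row per state, but for CA they disagree: the lowest id is not the row with the lowest f. Whichever ordering is chosen should be stated explicitly in the query.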

how to select only max(id) if you need less columns in group by [duplicate]

This question already has answers here:
Fetch the rows which have the Max value for a column for each distinct value of another column
(35 answers)
Select First Row of Every Group in sql [duplicate]
(2 answers)
Return row with the max value of one column per group [duplicate]
(3 answers)
Get value based on max of a different column grouped by another column [duplicate]
(1 answer)
Closed 3 years ago.
The thing is that I could have values like
ID STREET_ID HOUSENUMBER POSTCODE
10000000 20512120 22 04114
11000000 20512120 22 04074
The problem is that POSTCODE has to be in the SELECT, but I need distinct STREET_ID + HOUSENUMBER combinations with the MAX ID.
For example, out of these 2 records I just want to show 11000000, 20512120, 22, 04074, because of MAX(h.ID).
This is my code:
SELECT DISTINCT
MAX(h.ID),
h.street_id,
h.houseNumber,
h.postindex AS postCode
FROM house h
WHERE
h.postindex IS NOT NULL AND
h.STREET_ID IS NOT NULL
GROUP BY
h.street_id,
h.houseNumber
ORDER BY
STREET_ID,
CAST(REGEXP_REPLACE(REGEXP_REPLACE(h.houseNumber, '(\-|\/)(.*)'), '\D+') AS NUMBER),
h.houseNumber
I get the error "ORA-00979: not a GROUP BY expression", and I understand it: POSTCODE is not in the GROUP BY. How do I deal with that?

Your requirement is a good candidate for ROW_NUMBER:
WITH cte AS (
SELECT h.*,
ROW_NUMBER() OVER (PARTITION BY h.STREET_ID, h.HOUSENUMBER ORDER BY h.ID DESC) rn
FROM house h
)
SELECT ID, STREET_ID, HOUSENUMBER, POSTCODE
FROM cte
WHERE rn = 1;
An index on (STREET_ID, HOUSENUMBER, ID) might speed up the above query, because it would let Oracle quickly find the max ID record for each street/house number.

You can do it with subquery and exists:
SELECT *
FROM house h
WHERE NOT EXISTS (SELECT 1 FROM house h2
WHERE h2.street_id = h.street_id
AND h2.houseNumber = h.houseNumber
AND h2.id > h.id)
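As a quick check of the NOT EXISTS form, the sketch below loads the two sample rows from the question into SQLite from Python, plus one extra row that is made up for illustration. Column names follow the question's own query (street_id, houseNumber, postindex).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# houseNumber and postindex are stored as text so leading zeros survive.
conn.execute(
    "CREATE TABLE house (id INTEGER, street_id INTEGER, houseNumber TEXT, postindex TEXT)"
)
conn.executemany(
    "INSERT INTO house VALUES (?, ?, ?, ?)",
    [
        (10000000, 20512120, "22", "04114"),
        (11000000, 20512120, "22", "04074"),
        (12000000, 20512121, "5", "04075"),  # hypothetical extra row
    ],
)

# Keep a row only if no other row for the same street and house has a larger id.
latest_rows = conn.execute(
    """
    SELECT h.id, h.street_id, h.houseNumber, h.postindex
    FROM house h
    WHERE NOT EXISTS (SELECT 1 FROM house h2
                      WHERE h2.street_id = h.street_id
                        AND h2.houseNumber = h.houseNumber
                        AND h2.id > h.id)
    ORDER BY h.id
    """
).fetchall()

print(latest_rows)
```

The row with id 10000000 is dropped because a row with a higher id exists for the same street and house number. No GROUP BY is involved, so the postcode can be selected freely.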

Don't aggregate. Instead, you can filter with a correlated subquery:
select h.*
from house h
where id = (
select max(h1.id)
from house h1
where h1.street_id = h.street_id and h1.houseNumber = h.houseNumber
)
I would expect this solution to be as efficient as it gets, especially with an index on (street_id, houseNumber, id).

select distinct id, street_id, houseNumber, postindex from house where id in (
select max(id) from house group by street_id, houseNumber)

top 10 rows in oracle [duplicate]

This question already has answers here:
Oracle SQL - How to Retrieve highest 5 values of a column [duplicate]
(5 answers)
Oracle SELECT TOP 10 records [duplicate]
(6 answers)
Closed 6 years ago.
I have 2 tables:
abc(CID(pk), cname,)
order(order_id(pk), CID(fk), number_of_rentals)
I want to fetch the top 10 customers based on number of rentals.
SELECT cid, sum(no_rentals) as sum
FROM orders
group by cid, no_rentals
order by no_rentals desc;
How can I use ROWNUM in the above query to fetch the desired output?

Just wrap your query in:
SELECT * FROM ( your_query ) WHERE ROWNUM <= 10;
However, your query does not look like it will do what you intend. Including no_rentals in the GROUP BY puts each distinct no_rentals value in its own group, so the values are not summed per customer; you probably don't want it there. Also, if you want to order by the total number of rentals, then you want to ORDER BY SUM( no_rentals ) (or its alias), like this:
SELECT cid,
SUM(no_rentals) as total_no_rentals
FROM orders
GROUP BY cid
ORDER BY total_no_rentals DESC;
Then you can apply the row limit like this:
SELECT *
FROM (
SELECT cid,
SUM(no_rentals) as total_no_rentals
FROM orders
GROUP BY cid
ORDER BY total_no_rentals DESC
)
WHERE ROWNUM <= 10;
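The effect of the GROUP BY fix can be checked with a small sketch run against SQLite from Python. SQLite has no ROWNUM, so LIMIT stands in for the Oracle row limit here; that swap, the sample rows, and the limit of 2 instead of 10 are assumptions made to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, cid INTEGER, no_rentals INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 5), (2, 1, 7), (3, 2, 10), (4, 3, 1), (5, 3, 2), (6, 2, 1)],
)

# Original grouping: one group per (cid, no_rentals) pair, so nothing is summed.
wrong_groups = conn.execute(
    "SELECT cid, SUM(no_rentals) FROM orders GROUP BY cid, no_rentals"
).fetchall()

# Corrected grouping: one total per customer, ordered, then limited.
# LIMIT replaces the Oracle "WHERE ROWNUM <= n" wrapper in this sketch.
top_customers = conn.execute(
    """
    SELECT cid, SUM(no_rentals) AS total_no_rentals
    FROM orders
    GROUP BY cid
    ORDER BY total_no_rentals DESC, cid
    LIMIT 2
    """
).fetchall()

print(len(wrong_groups))
print(top_customers)
```

With the original grouping there are six groups for three customers. With the corrected grouping each customer has a single total, and the limit is applied after the ordering, which is the same reason the Oracle version wraps the ordered query before filtering on ROWNUM.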

How to get total number of rows in a executed select statement? [duplicate]

This question already has answers here:
How to include the total number of returned rows in the resultset from SELECT T-SQL command?
(7 answers)
Closed 9 years ago.
How can I find out how many rows I obtained after execution?
My query is:
SELECT a.Emp,b.orders
from table as a inner join table1 b
on a.ID = B.ID
How do I find the number of rows returned in the above join?

You either have to use SELECT COUNT(*) ... with the same condition, or add a column with the row count via the ROW_NUMBER function:
SELECT a.Emp,b.orders, RN = ROW_NUMBER () OVER (ORDER BY a.Emp,b.orders)
FROM table as a inner join table1 b on a.ID=B.ID
...or use @@ROWCOUNT after the select.
Instead of ROW_NUMBER it's easier to use COUNT(*) OVER (), where each row contains the same total count, whereas ROW_NUMBER returns a sequential number where only the last record (according to the ORDER BY) has the total count.
This is what Aaron has already mentioned in his answer.

-- statement here
SELECT @@ROWCOUNT;
You can also get it on every row of the statement, but of course this is a little more expensive, e.g.
SELECT x, y, z, COUNT(*) OVER() FROM ...
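The difference between ROW_NUMBER() and COUNT(*) OVER() is easy to see side by side. This is a minimal sketch run against SQLite from Python (window functions need SQLite 3.25 or later). The table names emp_table and order_table and the sample rows are hypothetical stand-ins for the question's tables, and AS is used for the alias because the T-SQL "RN = ..." form is not portable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-ins for the question's "table" and "table1".
conn.execute("CREATE TABLE emp_table (id INTEGER, emp TEXT)")
conn.execute("CREATE TABLE order_table (id INTEGER, orders TEXT)")
conn.executemany("INSERT INTO emp_table VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany(
    "INSERT INTO order_table VALUES (?, ?)", [(1, "o1"), (1, "o2"), (2, "o3")]
)

# rn counts up row by row; total repeats the full row count on every row.
rows = conn.execute(
    """
    SELECT a.emp, b.orders,
           ROW_NUMBER() OVER (ORDER BY a.emp, b.orders) AS rn,
           COUNT(*) OVER () AS total
    FROM emp_table a
    INNER JOIN order_table b ON a.id = b.id
    ORDER BY rn
    """
).fetchall()

for row in rows:
    print(row)
```

Only the last row has rn equal to the total, while the total column can be read from any row.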
