SQL : Getting duplicate rows along with other variables

I am working with Teradata SQL. I would like to get the duplicate rows with their count and the other variables as well. I can only find ways to get the count, but not the other variables along with it.
Available input
+----+------+------------+
| id | name | Date       |
+----+------+------------+
| 1  | abc  | 21.03.2015 |
| 1  | def  | 22.04.2015 |
| 2  | ajk  | 22.03.2015 |
| 3  | ghi  | 23.03.2015 |
| 3  | ghi  | 23.03.2015 |
+----+------+------------+
Expected output :
+----+------+-------+
| id | name | count |  // Other fields
+----+------+-------+
| 1  | abc  | 2     |
| 1  | def  | 2     |
| 2  | ajk  | 1     |
| 3  | ghi  | 2     |
| 3  | ghi  | 2     |
+----+------+-------+
What I am looking for:
I am looking for all duplicate rows, where duplication is decided by ID, and I want to retrieve the duplicate rows themselves as well.
All I have so far is:
SELECT
id, name, other-variables, COUNT(*)
FROM
Table_NAME
GROUP BY
id, name
HAVING
COUNT(*) > 1
This is not showing correct data. Thank you.

You could use a window aggregate function, like this:
SELECT *
FROM (
SELECT id, name, other-variables,
COUNT(*) OVER (PARTITION BY id) AS duplicates
FROM users
) AS sub
WHERE duplicates > 1
Using a Teradata extension to ISO SQL syntax, you can simplify the above to:
SELECT id, name, other-variables,
COUNT(*) OVER (PARTITION BY id) AS duplicates
FROM users
QUALIFY duplicates > 1
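To try the window-count idea without a Teradata system, here is a rough sketch. It is only a stand-in: it assumes Python's built-in sqlite3 module with SQLite 3.25 or later (needed for window functions), a hypothetical column name dt for the date, and the sample rows from the question. SQLite has no QUALIFY, so the derived-table form is used.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Teradata (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, dt TEXT)")  # dt is a hypothetical column name
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        (1, "abc", "21.03.2015"),
        (1, "def", "22.04.2015"),
        (2, "ajk", "22.03.2015"),
        (3, "ghi", "23.03.2015"),
        (3, "ghi", "23.03.2015"),
    ],
)

# Count rows per id with a window function, then keep ids that occur more than once.
window_rows = conn.execute(
    """
    SELECT id, name, dt, duplicates
    FROM (
        SELECT id, name, dt,
               COUNT(*) OVER (PARTITION BY id) AS duplicates
        FROM users
    ) AS sub
    WHERE duplicates > 1
    ORDER BY id, name
    """
).fetchall()

for row in window_rows:
    print(row)
```

Every row keeps its own columns, and the id that occurs only once (id 2) is filtered out by the outer WHERE.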

As an alternative to the accepted and perfectly correct answer, you can use:
SELECT {all your required 'variables' (they are not variables, but attributes)}
, cnt.Count_Dups
FROM Table_NAME TN
INNER JOIN (
SELECT id
, COUNT(1) Count_Dups
FROM Table_NAME
GROUP BY id
HAVING COUNT(1) > 1 -- If you want only duplicates
) cnt
ON cnt.id = TN.id
edit: According to your edit, duplicates are on id only. Edited my query accordingly.
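A small sketch of the join approach, again assuming SQLite through Python's sqlite3 module as a stand-in for Teradata and using the sample rows from the question. The helper name rows_with_counts and its only_duplicates flag are made up for illustration. Leaving out the HAVING clause gives the count for every row, which matches the expected output in the question (including id 2 with a count of 1).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table_NAME (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO Table_NAME VALUES (?, ?)",
    [(1, "abc"), (1, "def"), (2, "ajk"), (3, "ghi"), (3, "ghi")],
)


def rows_with_counts(only_duplicates):
    """Join every row to the per-id count (hypothetical helper for this sketch)."""
    having = "HAVING COUNT(1) > 1" if only_duplicates else ""
    query = f"""
        SELECT TN.id, TN.name, cnt.Count_Dups
        FROM Table_NAME TN
        INNER JOIN (
            SELECT id, COUNT(1) AS Count_Dups
            FROM Table_NAME
            GROUP BY id
            {having}
        ) cnt ON cnt.id = TN.id
        ORDER BY TN.id, TN.name
    """
    return conn.execute(query).fetchall()


print(rows_with_counts(only_duplicates=True))
print(rows_with_counts(only_duplicates=False))
```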

Try this:
SELECT
id, COUNT(id)
FROM
Table_NAME
GROUP BY
id
HAVING
COUNT(id) > 1

Related

Oracle SQL: Counting how often an attribute occurs for a given entry and choosing the attribute with the maximum number of occurs

I have a table that has a number column and an attribute column like this:
1.
+-----+-----+
| num | att |
+-----+-----+
| 1   | a   |
| 1   | b   |
| 1   | a   |
| 2   | a   |
| 2   | b   |
| 2   | b   |
+-----+-----+
I want to make the number unique, and the attribute to be whichever attribute occurred most often for that number, like this (this is the end product I'm interested in):
2.
+-----+-----+
| num | att |
+-----+-----+
| 1   | a   |
| 2   | b   |
+-----+-----+
I have been working on this for a while and managed to write myself a query that looks up how many times an attribute occurs for a given number like this:
3.
+-----+-----+-------+
| num | att | count |
+-----+-----+-------+
| 1   | a   | 2     |
| 1   | b   | 1     |
| 2   | a   | 1     |
| 2   | b   | 2     |
+-----+-----+-------+
But I can't think of a way to select only those rows from the above table where the count is the highest (for each number, of course).
So basically, what I am asking is: given table 3, how do I select only the rows with the highest count for each number? (Of course, an answer describing a way to get from table 1 to table 2 directly also works :) )

You can use aggregation and window functions:
select num, att
from (
select num, att, row_number() over(partition by num order by count(*) desc, att) rn
from mytable
group by num, att
) t
where rn = 1
For each num, this brings the most frequent att; if there are ties, the smaller att is retained.
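Here is a runnable sketch of the same ranking idea on the question's data. It assumes SQLite 3.25 or later through Python's sqlite3 module rather than Oracle, and it is a slight variant: the counts are computed in an inner query (aliased cnt, a name chosen here) and ranked in an outer one, instead of putting count(*) directly inside the window's ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (num INTEGER, att TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "a"), (2, "a"), (2, "b"), (2, "b")],
)

# Step 1: count each (num, att) pair. Step 2: rank pairs per num by count.
# Step 3: keep the top-ranked pair; ties go to the smaller att.
most_frequent = conn.execute(
    """
    SELECT num, att
    FROM (
        SELECT num, att,
               ROW_NUMBER() OVER (PARTITION BY num ORDER BY cnt DESC, att) AS rn
        FROM (
            SELECT num, att, COUNT(*) AS cnt
            FROM mytable
            GROUP BY num, att
        ) AS counted
    ) AS ranked
    WHERE rn = 1
    ORDER BY num
    """
).fetchall()

print(most_frequent)
```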

Oracle has an aggregation function that does this, stats_mode():
select num, stats_mode(att)
from t
group by num;
In statistics, the most common value is called the mode -- hence the name of the function.

You can use group by and count as below
select id, col, count(col) as count
from
df_b_sql
group by id, col

Identify the counts of each distinct value in one column in sqlite3

I am trying to identify the count of each distinct value in one column (name) in a table called brgy.
------------------
| ID | name      |
------------------
| 1  | Alfonso   |
| 2  | Arakan    |
| 3  | Poblacion |
| 4  | Ilaya     |
| 5  | Poblacion |
------------------
I tried using this code but it keeps giving the COUNT as 1 despite Poblacion appearing twice in the name column:
SELECT name,COUNT(name) AS distinct_name
FROM (
SELECT DISTINCT name
FROM brgy
GROUP BY name
)
GROUP BY name;
The intended output should eliminate duplicate names but sum up the number of times the distinct name appears in the name column:
Expected Output is as below,
-----------------------------
| name      | distinct_name |
-----------------------------
| Alfonso   | 1             |
| Arakan    | 1             |
| Ilaya     | 1             |
| Poblacion | 2             |
-----------------------------

A simple GROUP BY name:
SELECT name, COUNT(name) AS distinct_name
FROM brgy
GROUP BY name;
You don't need the subquery:
SELECT DISTINCT name FROM brgy GROUP BY name
because GROUP BY name takes care of it.
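To see the difference side by side, here is a short sketch with Python's sqlite3 module and the brgy rows from the question. The variable names are made up for illustration. The subquery version counts rows that have already been de-duplicated, so every count is 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE brgy (ID INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO brgy VALUES (?, ?)",
    [(1, "Alfonso"), (2, "Arakan"), (3, "Poblacion"), (4, "Ilaya"), (5, "Poblacion")],
)

# The question's query: DISTINCT removes the repeats before COUNT ever sees them.
with_subquery = conn.execute(
    """
    SELECT name, COUNT(name) AS distinct_name
    FROM (SELECT DISTINCT name FROM brgy GROUP BY name)
    GROUP BY name
    ORDER BY name
    """
).fetchall()

# Grouping the base table directly counts every occurrence of each name.
plain_group_by = conn.execute(
    """
    SELECT name, COUNT(name) AS distinct_name
    FROM brgy
    GROUP BY name
    ORDER BY name
    """
).fetchall()

print(with_subquery)
print(plain_group_by)
```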

Just remove the subquery, because it returns only unique entries:
select name, count(*) as distinct_name
from brgy
group by name
order by distinct_name, name;

sql query to find unique records

I am new to SQL and need your help to achieve the following. I have tried using GROUP BY and COUNT, but the duplicated rows also end up in the unique group.
Below is my source data.
CDR_ID,TelephoneNo,Call_ID,call_Duration,Call_Plan
543,xxx-23,12,12,500
543,xxx-23,12,12,501
543,xxx-23,12,12,510
643,xxx-33,11,17,700
343,xxx-33,11,17,700
766,xxx-74,32,1,300
766,xxx-74,32,1,300
877,xxx-32,12,2,300
877,xxx-32,12,2,300
877,xxx-32,12,2,301
Please note: the source has multiple copies of each unique record, so when I do the count, the unique set does not appear with count = 1.
Example: the source data below has 60 records for each combination:
877,xxx-32,12,2,300 -- 60 records
877,xxx-32,12,2,301 -- 60 records
I am trying to get only the unique records, but the duplicate records are also coming through.
Below are the rows that should come up in the unique group. There can be multiple Call_Plans for the same combination of CDR_ID, TelephoneNo, Call_ID, call_Duration. I want to read the records for which there is only one call plan for each unique combination of CDR_ID, TelephoneNo, Call_ID, call_Duration:
CDR_ID,TelephoneNo,Call_ID,call_Duration,Call_Plan
643,xxx-33,11,17,700
343,xxx-33,11,17,700
766,xxx-74,32,1,300
Please advise on this.
Thanks and Regards

To do more complex groupings you could also use a Common Table Expression/Derived Table along with windowed functions:
declare @t table(CDR_ID int,TelephoneNo nvarchar(20),Call_ID int,call_Duration int,Call_Plan int);
insert into @t values (543,'xxx-23',12,12,500),(543,'xxx-23',12,12,501),(543,'xxx-23',12,12,510),(643,'xxx-33',11,17,700),(343,'xxx-33',11,17,700),(766,'xxx-74',32,1,300),(766,'xxx-74',32,1,300),(877,'xxx-32',12,2,300),(877,'xxx-32',12,2,300),(877,'xxx-32',12,2,301);
with cte as
(
select CDR_ID
,TelephoneNo
,Call_ID
,call_Duration
,Call_Plan
,count(*) over (partition by CDR_ID,TelephoneNo,Call_ID,call_Duration) as c
from (select distinct * from @t) a
)
select *
from cte
where c = 1;
Output:
+--------+-------------+---------+---------------+-----------+---+
| CDR_ID | TelephoneNo | Call_ID | call_Duration | Call_Plan | c |
+--------+-------------+---------+---------------+-----------+---+
| 343 | xxx-33 | 11 | 17 | 700 | 1 |
| 643 | xxx-33 | 11 | 17 | 700 | 1 |
| 766 | xxx-74 | 32 | 1 | 300 | 1 |
+--------+-------------+---------+---------------+-----------+---+

Using not exists():
select distinct *
from t
where not exists (
select 1
from t as i
where i.cdr_id = t.cdr_id
and i.telephoneno = t.telephoneno
and i.call_id = t.call_id
and i.call_duration = t.call_duration
and i.call_plan <> t.call_plan
)
rextester demo: http://rextester.com/RRNNE20636
returns:
+--------+-------------+---------+---------------+-----------+
| cdr_id | TelephoneNo | Call_id | call_Duration | Call_Plan |
+--------+-------------+---------+---------------+-----------+
| 343    | xxx-33      | 11      | 17            | 700       |
| 643    | xxx-33      | 11      | 17            | 700       |
| 766    | xxx-74      | 32      | 1             | 300       |
+--------+-------------+---------+---------------+-----------+

Basically you should try this:
SELECT A.CDR_ID, A.TelephoneNo, A.Call_ID, A.call_Duration, A.Call_Plan
FROM YOUR_TABLE A
INNER JOIN (SELECT CDR_ID,TelephoneNo,Call_ID,call_Duration
FROM YOUR_TABLE
GROUP BY CDR_ID,TelephoneNo,Call_ID,call_Duration
HAVING COUNT(*)=1
) B ON A.CDR_ID= B.CDR_ID AND A.TelephoneNo=B.TelephoneNo AND A.Call_ID=B.Call_ID AND A.call_Duration=B.call_Duration
You can write a shorter query using the window function COUNT(*) OVER ...

The query below will give you the result:
SELECT CDR_ID,TelephoneNo,Call_ID,call_Duration,Call_Plan, COUNT(*)
FROM TABLE_NAME GROUP BY CDR_ID,TelephoneNo,Call_ID,call_Duration,Call_Plan
HAVING COUNT(*) < 2;
It also gives you the count. If that is not required, you can remove it.

Select *, count(CDR_ID)
from table
group by CDR_ID, TelephoneNo, Call_ID, call_Duration, Call_Plan
having count(CDR_ID) = 1

Selecting unique records from database

Running this query,
select * from table;
Returns the following
|branch | number |
-------------------
| 1 | 123 |
| 1 | 001 |
| 2 | 123 |
| 3 | 123 |
| 4 | 123 |
| 1 | 123 |
| 1 | 789 |
| 2 | 123 |
| 3 | 123 |
| 4 | 009 |
I want to find values that are unique to ONLY branch 1
| 1 | 001 |
| 1 | 789 |
Can this be done without the data being stored in separate tables? I've tried a few "select distinct" queries & don't seem to get the results I'm expecting.

SELECT branch, number
FROM table
WHERE branch = 1
GROUP BY branch, number

If you do not need any aggregates, you can use distinct instead of group by:
select distinct branch
, number
from YourTable
where branch = 1

I guess what I'm trying to say is that I want to find all numbers that are unique to ONLY branch 1. If they are found in any other branch, I don't want to see them.
I guess this is what you want.
SELECT distinct number
FROM MyTable
WHERE branch=1 and number not in
( SELECT distinct number
FROM MyTable
WHERE branch != 1 )
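A quick way to check this against the sample data, sketched with Python's sqlite3 module. The number column is stored as text here (an assumption) so that leading zeros such as 001 survive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# number is TEXT (assumption) so values like '001' keep their leading zeros.
conn.execute("CREATE TABLE MyTable (branch INTEGER, number TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [
        (1, "123"), (1, "001"), (2, "123"), (3, "123"), (4, "123"),
        (1, "123"), (1, "789"), (2, "123"), (3, "123"), (4, "009"),
    ],
)

# Numbers seen in branch 1 that never appear in any other branch.
only_branch_1 = [
    number
    for (number,) in conn.execute(
        """
        SELECT DISTINCT number
        FROM MyTable
        WHERE branch = 1
          AND number NOT IN (
              SELECT DISTINCT number
              FROM MyTable
              WHERE branch != 1
          )
        ORDER BY number
        """
    )
]

print(only_branch_1)
```

One caveat with NOT IN: if the subquery can return a NULL number, the outer query returns no rows at all, so a NOT EXISTS rewrite is safer when the column is nullable.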

Try this:
SELECT branch, number
FROM table
GROUP BY branch, number
If you want to limit it to only branch 1, then just add a where clause.
SELECT branch, number
FROM table
WHERE branch = 1
GROUP BY branch, number

To select all values that are unique in column number and have a branch value of 1 you can use the following code:
SELECT branch, number
FROM table1
WHERE number IN (
SELECT number
FROM table1
GROUP BY number
HAVING (COUNT(number ) = 1)
)
AND branch = 1
For a demo see http://sqlfiddle.com/#!2/97145/62

... where count(col) > 1

I have a table like this:
+-----+-----+-------+
| id | fk | value |
+-----+-----+-------+
| 0 | 1 | peter |
| 1 | 1 | josh |
| 3 | 2 | marc |
| ... | ... | ... |
I'd like now to get all entries which have more than one value.
The expected result would be:
+-----+-------+
| fk | count |
+-----+-------+
| 1 | 2 |
| ... | ... |
I tried to achieve that like this:
select fk, count(value) from table where count(value) > 1;
But Oracle didn't like it.
So I tried this...
select * from (
select fk, count(value) as cnt from table
) where cnt > 1;
...with no success.
Any ideas?

Use the having clause for comparing aggregates.
Also, you need to group by what you're aggregating against for the query to work correctly. Your attempts were missing a group by clause, which is why they didn't work; the following adds one. What exactly are you trying to count?
select fk, count(value)
from table
group by fk
having count(value) > 1;
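A minimal runnable check of the group by / having version, using Python's sqlite3 module instead of Oracle. The table is called entries here (a hypothetical name), since table itself is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "entries" is a hypothetical table name; "table" is a reserved word.
conn.execute("CREATE TABLE entries (id INTEGER, fk INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO entries VALUES (?, ?, ?)",
    [(0, 1, "peter"), (1, 1, "josh"), (3, 2, "marc")],
)

# WHERE filters rows before grouping; HAVING filters the groups afterwards,
# which is why the aggregate comparison has to live in HAVING.
fk_counts = conn.execute(
    """
    SELECT fk, COUNT(value)
    FROM entries
    GROUP BY fk
    HAVING COUNT(value) > 1
    """
).fetchall()

print(fk_counts)
```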
