Count of duplicate values by two columns in SQL Server

From this table:
Number Value
1 a
2 b
3 a
2 c
2 b
3 a
2 b
I need to get the count of all duplicate rows by Number and Value, i.e. 5.
Thanks.

I think this query is what you want:
SELECT SUM(t.cnt)
FROM
(
SELECT COUNT(*) cnt
FROM table_name
GROUP BY number, value
HAVING COUNT(*) > 1
)t;
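
To check that the grouped query above returns 5 for the sample data, here is a minimal sketch that runs it through Python's built-in sqlite3 module. Using SQLite instead of SQL Server is an assumption made only so the example is self-contained; the table name table_name comes from the answer.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (number INTEGER, value TEXT)")

# Sample rows copied from the question.
rows = [(1, "a"), (2, "b"), (3, "a"), (2, "c"), (2, "b"), (3, "a"), (2, "b")]
conn.executemany("INSERT INTO table_name VALUES (?, ?)", rows)

# Sum the sizes of every (number, value) group that occurs more than once.
query = """
SELECT SUM(t.cnt)
FROM (
    SELECT COUNT(*) AS cnt
    FROM table_name
    GROUP BY number, value
    HAVING COUNT(*) > 1
) t
"""
duplicate_rows = conn.execute(query).fetchone()[0]
print(duplicate_rows)
```

The pair (2, b) appears three times and (3, a) twice, so the groups that survive HAVING add up to 3 + 2.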

Maybe something like this?
select value,number,max(cnt) as Count_distinct from (
select *,row_number () over (partition by value,number order by number) as cnt
from #sample
)t
group by value,number
Output
+-------+--------+----------------+
| Value | Number | Count_distinct |
+-------+--------+----------------+
| a     | 1      | 1              |
| b     | 2      | 3              |
| c     | 2      | 1              |
| a     | 3      | 2              |
+-------+--------+----------------+

Select
count(distinct Number) as Distinct_Numbers,
count(distinct Value) as Distinct_Values
from
Table
This shows how many distinct values are in each column. Does this help?

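Assign a row number partitioned by both columns and ordered by both columns, then count the rows whose row number is greater than 1. Note that this counts only the extra copies in each group (3 for the sample data), not every row that belongs to a duplicated group (5).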
Query
;with cte as(
select [rn] = row_number() over(
partition by [Number], [Value]
order by [Number], [Value]
), *
from [your_table_name]
)
select count(*) from cte
where [rn] > 1;
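
A quick way to see what the row-number approach counts is to run it on the question's data. The sketch below uses Python's sqlite3 module as a stand-in for SQL Server (an assumption; it needs SQLite 3.25 or later for window functions) and compares the result with total rows minus distinct pairs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table_name (Number INTEGER, Value TEXT)")
conn.executemany(
    "INSERT INTO your_table_name VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "a"), (2, "c"), (2, "b"), (3, "a"), (2, "b")],
)

# Rows numbered 2, 3, ... inside each (Number, Value) group are the extra copies.
extra_copies = conn.execute(
    """
    WITH cte AS (
        SELECT ROW_NUMBER() OVER (
                   PARTITION BY Number, Value
                   ORDER BY Number, Value
               ) AS rn
        FROM your_table_name
    )
    SELECT COUNT(*) FROM cte WHERE rn > 1
    """
).fetchone()[0]

# Cross-check: every row beyond the first of its group is an extra copy.
total_rows = conn.execute("SELECT COUNT(*) FROM your_table_name").fetchone()[0]
distinct_pairs = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT Number, Value FROM your_table_name)"
).fetchone()[0]

print(extra_copies, total_rows - distinct_pairs)
```

Both numbers come out the same, which is why this query reports fewer rows than the SUM-of-counts approach when every member of a duplicated group should be counted.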

I think you mean the number of unique Number/Value pairs. You can use:
SELECT count(*)
FROM
(SELECT ROW_NUMBER() OVER (PARTITION BY number, value ORDER BY (select 1)) AS rnk from mytable) i
where i.rnk = 1

Maybe this query will help:
select * from [dbo].[Sample_table1]
;WITH
DupContactRecords(number,value,DupsCount)
AS
(
SELECT number,value, COUNT(*) AS TotalCount FROM [Sample_table1] GROUP BY number,value HAVING COUNT(*) > 1
)
--to get the duplicates
/*select * from DupContactRecords*/
SELECT sum(DupsCount) FROM DupContactRecords

Related

Return the highest SUM value of all donors by designations

I have the following script:
SELECT DISTINCT GIFT_ID, GIFT_DESG, SUM(GIFT_AMT)
FROM GIFT_TABLE
GROUP BY GIFT_ID, GIFT_DESG
It will return something like this:
GIFT_ID GIFT_DESG SUM(GIFT_AMT)
1 A 25
1 B 500
1 C 75
2 A 100
2 B 200
2 C 300
...
My desired outcome is:
GIFT_ID GIFT_DESG SUM(GIFT_AMT)
1 B 500
2 C 300
How would I do that?
Possibly row_number() right? I think it's something with the summing of gift amounts by designation that is throwing me off.
Thank you.

If your DBMS supports the ROW_NUMBER window function, you can number the rows within each GIFT_ID by SUM(GIFT_AMT) in descending order and then keep the rows where rn = 1.
SELECT t1.GIFT_ID,t1.GIFT_DESG,t1.GIFT_AMT
FROM (
SELECT t1.*,ROW_NUMBER() OVER(PARTITION BY GIFT_ID ORDER BY GIFT_AMT DESC) rn
FROM (
SELECT GIFT_ID, GIFT_DESG, SUM(GIFT_AMT) GIFT_AMT
FROM GIFT_TABLE
GROUP BY GIFT_ID, GIFT_DESG
) t1
) t1
where rn =1
Note
Since you already use GROUP BY, the DISTINCT keyword is redundant; you can remove it from your query.
Here is a sample
CREATE TABLE T(
GIFT_ID int,
GIFT_DESG varchar(5),
GIFT_AMT int
);
insert into t values (1,'A' ,25);
insert into t values (1,'B' ,500);
insert into t values (1,'C' ,75);
insert into t values (2,'A' ,100);
insert into t values (2,'B' ,200);
insert into t values (2,'C' ,300);
Query 1:
SELECT t1.GIFT_ID,t1.GIFT_DESG,t1.GIFT_AMT
FROM (
SELECT t1.*,ROW_NUMBER() OVER(PARTITION BY GIFT_ID ORDER BY GIFT_AMT DESC) rn
FROM T t1
) t1
where rn =1
Results:
| GIFT_ID | GIFT_DESG | GIFT_AMT |
|---------|-----------|----------|
| 1 | B | 500 |
| 2 | C | 300 |
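
The sample above has one row per designation, so it never exercises the SUM step. Here is a small sketch that does, run through Python's sqlite3 module (SQLite 3.25+ assumed for window functions, standing in for the asker's DBMS). The split of gift 1/B into two rows of 200 and 300 is made-up test data, not from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE GIFT_TABLE (GIFT_ID INTEGER, GIFT_DESG TEXT, GIFT_AMT INTEGER)"
)
# Hypothetical data: designation B of gift 1 is split so that SUM matters.
conn.executemany(
    "INSERT INTO GIFT_TABLE VALUES (?, ?, ?)",
    [
        (1, "A", 25), (1, "B", 200), (1, "B", 300), (1, "C", 75),
        (2, "A", 100), (2, "B", 200), (2, "C", 300),
    ],
)

# Step 1: total per (GIFT_ID, GIFT_DESG). Step 2: rank totals inside each GIFT_ID.
top_designations = conn.execute(
    """
    SELECT GIFT_ID, GIFT_DESG, GIFT_AMT
    FROM (
        SELECT s.*,
               ROW_NUMBER() OVER (PARTITION BY GIFT_ID ORDER BY GIFT_AMT DESC) AS rn
        FROM (
            SELECT GIFT_ID, GIFT_DESG, SUM(GIFT_AMT) AS GIFT_AMT
            FROM GIFT_TABLE
            GROUP BY GIFT_ID, GIFT_DESG
        ) s
    ) ranked
    WHERE rn = 1
    ORDER BY GIFT_ID
    """
).fetchall()
print(top_designations)
```

If two designations tie for the highest total, ROW_NUMBER keeps an arbitrary one of them; RANK would keep both.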

You can do this with no subquery:
SELECT TOP (1) WITH TIES GIFT_ID, GIFT_DESG, SUM(GIFT_AMT)
FROM GIFT_TABLE
GROUP BY GIFT_ID, GIFT_DESG
ORDER BY ROW_NUMBER() OVER (PARTITION BY GIFT_ID ORDER BY SUM(GIFT_AMT) DESC);

You can also do it like this (Oracle syntax):
WITH t as (
SELECT GIFT_ID, GIFT_DESG, SUM(GIFT_AMT) AS GIFT_AMT
FROM GIFT_TABLE
GROUP BY GIFT_ID, GIFT_DESG)
SELECT GIFT_ID,
max(GIFT_DESG) KEEP (DENSE_RANK LAST ORDER BY GIFT_AMT),
max(GIFT_AMT) GIFT_AMT
FROM T
GROUP BY GIFT_ID;

Limit MAX() result to one row based on highest value in a particular field

Of course my data set is more complex, but this is essentially what I have:
+--------+--------+-------+
| SEQ_NO | FILTER | VALUE |
+--------+--------+-------+
| 1 | 'A' | 5 |
| 2 | 'A' | 10 |
| 3 | 'A' | 15 |
+--------+--------+-------+
Here is my query:
SELECT MAX(SEQ_NO)
, FILTER
, VALUE
FROM TABLE
GROUP BY FILTER
, VALUE
This returns my entire data set. How can I alter my query so that it only returns the record with the highest SEQ_NO ?

SELECT t1.*
FROM Table AS t1
INNER JOIN
(
SELECT MAX(SEQ_NO) MAXSeq
, FILTER
, VALUE
FROM TABLE
GROUP BY FILTER
, VALUE
) t2 ON t1.SEQ_NO = t2.MAXSeq
AND t1.FILTER = t2.FILTER
AND t1.VALUE = t2.VALUE
Or using row_number:
SELECT *
FROM
(
SELECT *,
row_number() over(partition by FILTER, VALUE
order by SEQ_NO desc) as rn
FROM table
) t
WHERE rn = 1

From Oracle 12C:
SELECT SEQ_NO
, FILTER
, VALUE
FROM TABLE
ORDER BY SEQ_NO DESC
FETCH FIRST 1 ROWS ONLY;

You can use ROWNUM in Oracle (Oracle does not accept AS before a table alias):
select *
from
( select *
from yourTable
order by SEQ_NO desc ) t
where ROWNUM = 1;

This should work
SELECT TOP 1 *
FROM TABLE
ORDER BY SEQ_NO DESC

If I understand correctly, you want the top SEQ_NO per filter?
I created this in SQL Server and converted it to Oracle:
SELECT a.SEQ_NO,
a.FILTER,
a.VALUE
FROM (
SELECT SEQ_NO,
FILTER,
VALUE,
MAX(SEQ_NO) OVER (PARTITION BY FILTER) m
FROM TABLE
) a
WHERE SEQ_NO = m

Using MySQL:
SELECT SEQ_NO
, VALUE
, FILTER
FROM TABLE
Order by SEQ_NO DESC LIMIT 1

Select data from Sybase database but only select the row with the highest sequence

I'm trying to select data from my database for the highest sequence number only. I have been struggling with this for a while and can't get it to work.
The database has a lot of columns with data. I only want to search in the row with the highest sequence number, because the data from lower sequences is of no value to me. Unfortunately, the rows from the lower sequences cannot be deleted.
Database looks like this:
-----------------------------
| ID | SEQ | rest of the data
-----------------------------
| 1 | 1 | ..
| 1 | 2 | ....
| 2 | 1 | ..
| 1 | 3 | ....
| 3 | 1 | ..
| 1 | 2 | ....
| 4 | 1 | ........
My question is: how can I select only the rows with the highest sequence number per ID and then search in those rows with the WHERE clause?

On Oracle 11g you can use:
SELECT *
FROM (
SELECT YOUR_TABLE.*, RANK() OVER (PARTITION BY ID ORDER BY SEQ DESC) RN
FROM YOUR_TABLE) A
WHERE RN=1;

SELECT *
FROM (
SELECT t.*,
ROW_NUMBER() OVER ( PARTITION BY ID ORDER BY SEQ DESC ) AS rn
FROM your_table t
)
WHERE rn = 1
or
SELECT ID,
MAX( seq ) AS seq,
MAX( other_column_1 ) KEEP ( DENSE_RANK LAST ORDER BY seq ) AS other_column_1,
MAX( other_column_2 ) KEEP ( DENSE_RANK LAST ORDER BY seq ) AS other_column_2
-- ...
FROM your_table
GROUP BY id
or
SELECT *
FROM your_table t
WHERE seq IN ( SELECT MAX( seq )
FROM your_table x
WHERE x.id = t.id )
or
SELECT t.*
FROM your_table t
INNER JOIN ( SELECT id, MAX( seq ) AS seq
FROM your_table
GROUP BY id ) x
ON ( x.id = t.id AND x.seq = t.seq )
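
The join variant is the most portable of these, since it needs no window functions. Below is a rough sketch of it on the question's ID/SEQ data, run through Python's sqlite3 module rather than Sybase (an assumption). The payload column and its values are hypothetical stand-ins for "rest of the data", and the extra predicate shows where a search condition would go.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "payload" is a hypothetical column standing in for the rest of the data.
conn.execute("CREATE TABLE your_table (id INTEGER, seq INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [(1, 1, "a"), (1, 2, "b"), (2, 1, "c"), (1, 3, "d"),
     (3, 1, "e"), (1, 2, "f"), (4, 1, "g")],
)

# Keep only the row whose seq equals the maximum seq of its id.
latest_sql = """
SELECT t.id, t.seq, t.payload
FROM your_table t
INNER JOIN (
    SELECT id, MAX(seq) AS seq
    FROM your_table
    GROUP BY id
) x ON x.id = t.id AND x.seq = t.seq
"""
latest_rows = conn.execute(latest_sql + " ORDER BY t.id").fetchall()

# A search condition goes in a WHERE clause after the join.
matches = conn.execute(
    latest_sql + " WHERE t.payload = ? ORDER BY t.id", ("d",)
).fetchall()

print(latest_rows)
print(matches)
```

Searching for payload 'b' this way finds nothing, because that value only exists on a superseded sequence of ID 1.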

How to find max value from each group and display their information when using "group by"

For example, I created a table of people who contributed to 2 campaigns:
+-------------------------------------+
| ID Name Campaign Amount (USD) |
+-------------------------------------+
| 1 A 1 10 |
| 2 B 1 5 |
| 3 C 2 7 |
| 4 D 2 9 |
+-------------------------------------+
Task: For each campaign, find the person (Name, ID) who contributed the most.
Expected result is
+-----------------------------------------+
| Campaign Name ID |
+-----------------------------------------+
| 1 A 1 |
| 2 D 4 |
+-----------------------------------------+
I used "group by Campaign", but the result only has the 2 columns "Campaign" and "max value", while I need "Name" and "ID".
Thanks for your help.
Edited: I fixed some values, sorry.

You can use analytic functions for this:
select name, id, amount
from (select t.*, max(amount) over (partition by campaign) as max_amount
from t
) t
where amount = max_amount;
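
Here is the analytic-function query applied to the question's four rows, as a sketch run through Python's sqlite3 module (SQLite 3.25+ assumed for MAX() OVER; the asker's actual DBMS is not stated). The table name t comes from the answer; the lowercase column names are an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT, campaign INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(1, "A", 1, 10), (2, "B", 1, 5), (3, "C", 2, 7), (4, "D", 2, 9)],
)

# Attach each campaign's maximum to every row, then keep the rows that reach it.
top_contributors = conn.execute(
    """
    SELECT campaign, name, id
    FROM (
        SELECT t.*, MAX(amount) OVER (PARTITION BY campaign) AS max_amount
        FROM t
    ) ranked
    WHERE amount = max_amount
    ORDER BY campaign
    """
).fetchall()
print(top_contributors)
```

Because the filter compares against the maximum rather than a row number, two people tied for the top amount in a campaign would both be returned.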

You can also do it by assigning a rank or row number partitioned by campaign and ordered by amount in descending order.
Query
;with cte as(
select [num] = dense_rank() over(
partition by [Campaign]
order by [Amount] desc
), *
from [your_table_name]
)
select [Campaign], [Name], [ID]
from cte
where [num] = 1;

Try the following query:
SELECT Campaign , Name , ID
FROM (
SELECT Campaign , Name , ID , MAX (Amount) AS MaxAmount
FROM MyTable
GROUP BY Campaign , Name , ID
) temp;

Simply use a WHERE clause with the max of amount grouped by Campaign.
Generic code:
select a, b , c
from tablename
where d in
(
select max(d)
from tablename
group by a
)
Demo:-
Create table #MyTable (ID int , Name char(1), Campaign int , Amount int)
go
insert into #MyTable values (1,'A',1,10)
insert into #MyTable values (2,'B',1,5)
insert into #MyTable values (3,'C',2,7)
insert into #MyTable values (4,'D',2,9)
go
select Campaign, Name , ID
from #MyTable
where Amount in
(
select max(Amount)
from #MyTable
group by Campaign
)
drop table #MyTable
Result:-

Please find the below code for the same
SELECT *
FROM #MyTable T
OUTER APPLY (
SELECT COUNT(1) record
FROM #MyTable T1
where t.Campaign = t1.Campaign
and t.amount < t1.amount
)E
where E.record = 0

Second maximum and minimum values

Given a table with multiple rows of an int field and the same identifier, is it possible to return the 2nd maximum and 2nd minimum value from the table?
A table consists of
ID | number
------------------------
1 | 10
1 | 11
1 | 13
1 | 14
1 | 15
1 | 16
Final Result would be
ID | nMin | nMax
--------------------------------
1 | 11 | 15

You can use row_number to assign a ranking per ID. Then you can group by id and pick the rows with the ranking you're after. The following example picks the second lowest and third highest :
select id
, max(case when rnAsc = 2 then number end) as SecondLowest
, max(case when rnDesc = 3 then number end) as ThirdHighest
from (
select ID
, number
, row_number() over (partition by ID order by number) as rnAsc
, row_number() over (partition by ID order by number desc) as rnDesc
from YourTable -- placeholder name; the question does not name the table
) as SubQueryAlias
group by
id
The max is just to pick out the one non-null value; you can replace it with min or even avg and it would not affect the outcome.
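
Adapted to the question's exact request (second lowest and second highest), the row-number idea could look like the sketch below. It runs through Python's sqlite3 module (SQLite 3.25+ assumed for window functions), and the table name numbers is hypothetical since the question does not name one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "numbers" is a hypothetical table name.
conn.execute("CREATE TABLE numbers (id INTEGER, number INTEGER)")
conn.executemany(
    "INSERT INTO numbers VALUES (?, ?)",
    [(1, n) for n in (10, 11, 13, 14, 15, 16)],
)

# Number the rows per id in both directions, then pivot position 2 of each
# ordering into its own column with a conditional aggregate.
result = conn.execute(
    """
    SELECT id,
           MAX(CASE WHEN rn_asc = 2 THEN number END) AS nMin,
           MAX(CASE WHEN rn_desc = 2 THEN number END) AS nMax
    FROM (
        SELECT id,
               number,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY number) AS rn_asc,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY number DESC) AS rn_desc
        FROM numbers
    ) ranked
    GROUP BY id
    """
).fetchall()
print(result)
```

If duplicate values should count as a single rank, DENSE_RANK can be swapped in for ROW_NUMBER.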

This will work, but see caveats:
SELECT Id, number
INTO #T
FROM (
SELECT 1 ID, 10 number
UNION
SELECT 1 ID, 10 number
UNION
SELECT 1 ID, 11 number
UNION
SELECT 1 ID, 13 number
UNION
SELECT 1 ID, 14 number
UNION
SELECT 1 ID, 15 number
UNION
SELECT 1 ID, 16 number
) U;
WITH EX AS (
SELECT Id, MIN(number) MinNumber, MAX(number) MaxNumber
FROM #T
GROUP BY Id
)
SELECT #T.Id, MIN(number) nMin, MAX(number) nMax
FROM #T INNER JOIN
EX ON #T.Id = EX.Id
WHERE #T.number <> MinNumber AND #T.number <> MaxNumber
GROUP BY #T.Id
DROP TABLE #T;
If you have two MAX values that are the same value, this will not pick them up. So depending on how your data is presented you could be losing the proper result.

You could select the next minimum value by using the following method:
SELECT MAX(Number)
FROM
(
SELECT top 2 (Number)
FROM table1 t1
WHERE ID = {MyNumber}
order by Number
)a
It only works if you can restrict the inner query with a WHERE clause.

This would be a better way. I quickly put this together, but if you can combine the two queries, you will get exactly what you were looking for.
select *
from
(
select
myID,
myNumber,
row_number() over (order by myID) as myRowNumber
from MyTable
) x
where x.myRowNumber = 2
select *
from
(
select
myID,
myNumber,
row_number() over (order by myID desc) as myRowNumber
from MyTable
) y
where y.myRowNumber = 2

Let the table name be tblName.
select max(number) from tblName where number not in (select max(number) from tblName);
The same works for min: just replace max with min.

As I myself learned just today the solution is to use LIMIT. You order the results so that the highest values are on top and limit the result to 2. Then you select that subselect and order it the other way round and only take the first one.
SELECT somefield FROM (
SELECT somefield from table
ORDER BY somefield DESC LIMIT 2) AS top_two
ORDER BY somefield ASC LIMIT 1
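
As a rough check of the LIMIT trick, the sketch below runs it in both directions on the numbers from the question using Python's sqlite3 module (SQLite also supports LIMIT; the answer does not say which DBMS it targets). The table name tblName is borrowed from another answer and the alias top_two is just an illustrative name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblName (number INTEGER)")
conn.executemany(
    "INSERT INTO tblName VALUES (?)",
    [(n,) for n in (10, 11, 13, 14, 15, 16)],
)

# Take the two largest values, then the smaller of those two.
second_highest = conn.execute(
    """
    SELECT number FROM (
        SELECT number FROM tblName ORDER BY number DESC LIMIT 2
    ) AS top_two
    ORDER BY number ASC LIMIT 1
    """
).fetchone()[0]

# Mirror image: take the two smallest values, then the larger of those two.
second_lowest = conn.execute(
    """
    SELECT number FROM (
        SELECT number FROM tblName ORDER BY number ASC LIMIT 2
    ) AS bottom_two
    ORDER BY number DESC LIMIT 1
    """
).fetchone()[0]

print(second_lowest, second_highest)
```

If the extreme value occurs twice, this returns that same value again; selecting DISTINCT values in the inner query avoids that.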
