Find duplicate values only if separate column id differs - sql

I have the following table:
id item
1 A
2 A
3 B
4 C
3 H
1 E
I'm looking to obtain duplicate values from the id column only when the item column differs in value. The end result should be:
1 A
1 E
3 B
3 H
I've attempted:
select id, items, count(*)
from table
group by id, items
HAVING count(*) > 1
But this is giving only duplicate values from the id column and not taking into account the items column.
Any suggestions will be greatly appreciated.

You can use a window function for this; it is generally more efficient than a self-join:
SELECT
t.id,
t.items,
t.count
from (
SELECT *,
COUNT(*) OVER (PARTITION BY t.id) AS count
FROM YourTable t
) t
WHERE t.count > 1;
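One caveat with counting rows per id: if the table can contain the same (id, item) pair more than once, that id is reported even though its items do not differ. Here is a minimal sketch of a variant that counts distinct items per id instead. It runs through Python's sqlite3 module purely as a convenient engine; the column name item follows the table in the question, and the extra (2, 'A') row is an assumption added to show the difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, item TEXT)")
rows = [(1, "A"), (2, "A"), (3, "B"), (4, "C"), (3, "H"), (1, "E"),
        (2, "A")]  # hypothetical exact duplicate, not in the original data
conn.executemany("INSERT INTO YourTable VALUES (?, ?)", rows)

query = """
SELECT DISTINCT t.id, t.item
FROM YourTable t
WHERE t.id IN (
    SELECT id
    FROM YourTable
    GROUP BY id
    HAVING COUNT(DISTINCT item) > 1   -- ids with at least two different items
)
ORDER BY t.id, t.item
"""
result = conn.execute(query).fetchall()
print(result)
```

With this data, id 2 is left out even though it has two rows, because both rows carry the same item.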

Related

How to print two columns in ascending order from 2 tables without common column

I was asked this question in an interview and could not answer it. None of the solutions I found online produced the desired output in Oracle SQL.
Input: table A has an ID column and table B has a Value column.
ID(table A) Value(table B)
1 E
2 C
3 B
4 A
5 D
Desired output:
ID Value
1 A
2 B
3 C
4 D
5 E
You can number the rows of the letters table in sorted order and then join the numbers table to it on that row number:
SELECT numbers.id, letters.value FROM
(SELECT id
FROM tableA) numbers
JOIN
(SELECT ROW_NUMBER() OVER(ORDER BY value) id, value
FROM tableB ) letters
ON numbers.id = letters.id
ORDER BY numbers.id, letters.id
This seems to need nothing more than the ROW_NUMBER window function:
SELECT ROW_NUMBER() OVER(ORDER BY value) id, value
FROM your_table
ORDER BY value;
You can also match the two tables with the ASCII() function in a filtered CROSS JOIN, ordered by id. This relies on the values being the uppercase letters A, B, C, ... so that ASCII(value) - 64 equals the id:
SELECT id, value
FROM tableA
CROSS JOIN tableB
WHERE ASCII(value)-64 = id
ORDER BY id
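The ROW_NUMBER answers above assume the ids are already 1..n. Here is a sketch that numbers both tables and joins on that number, so it does not depend on the actual id values. It runs in SQLite through Python's sqlite3 module as a stand-in for Oracle, and assumes SQLite 3.25 or newer for window functions; the aliases n, l and rn are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (id INTEGER)")
conn.execute("CREATE TABLE tableB (value TEXT)")
conn.executemany("INSERT INTO tableA VALUES (?)", [(1,), (2,), (3,), (4,), (5,)])
conn.executemany("INSERT INTO tableB VALUES (?)",
                 [("E",), ("C",), ("B",), ("A",), ("D",)])

query = """
SELECT n.id, l.value
FROM (SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS rn FROM tableA) n
JOIN (SELECT value, ROW_NUMBER() OVER (ORDER BY value) AS rn FROM tableB) l
  ON n.rn = l.rn          -- pair the k-th smallest id with the k-th smallest value
ORDER BY n.id
"""
pairs = conn.execute(query).fetchall()
print(pairs)
```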

Sum distinct by separate ID column

I have some data of the form:
ID Value
A 2
B 2
C 3
A 2
A 2
C 3
B 2
I want to sum value by distinct IDs.
select sum(distinct value) from table would give the sum of 2 and 3 = 5. I don't want that, I want the sum of value for each ID, i.e. A=2, B=2, C=3, there's 3 distinct IDs so sum(2,2,3) = 7.
In 'sql-ish' I want something like select sum(distinct value by ID) from table. Is this possible?
Get the distinct combinations of ID and Value in a subquery and then the sum of Values:
SELECT SUM(Value) sum_value
FROM (SELECT DISTINCT ID, Value FROM tablename) t
Another way to do it is with SUM() window function:
SELECT DISTINCT SUM(MAX(Value)) OVER() sum_value
FROM tablename
GROUP BY ID
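To make the difference concrete, here is a small sketch that runs both the naive SUM(DISTINCT ...) and the subquery version on the sample data from the question. Python's sqlite3 module is used only as a convenient engine; the table name tablename comes from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (ID TEXT, Value INTEGER)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [("A", 2), ("B", 2), ("C", 3), ("A", 2), ("A", 2), ("C", 3), ("B", 2)],
)

# Sums the distinct values only: 2 + 3
naive = conn.execute("SELECT SUM(DISTINCT Value) FROM tablename").fetchone()[0]

# Sums one value per distinct (ID, Value) pair: 2 + 2 + 3
per_id = conn.execute(
    "SELECT SUM(Value) FROM (SELECT DISTINCT ID, Value FROM tablename) t"
).fetchone()[0]

print(naive, per_id)
```

Note that the two answers only agree when each ID always carries the same Value. If an ID could have several different values, the DISTINCT ID, Value version adds each pair, while the MAX(Value) version keeps only the largest value per ID.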

SQL Query to get all rows with duplicate values but are not part of the same group

The database schema is organized as follows:
ID | GroupID | VALUE
--------------------
1 | 1 | A
2 | 1 | A
3 | 2 | B
4 | 3 | B
In this example, I want to GET all Rows with duplicate VALUE, but are not part of the same group. So the desired result set should be IDs (3, 4), because they are not in the same group (2, 3) but still have the same VALUE (B).
I'm having trouble writing a SQL Query and would appreciate any guidance. Thanks.
So far, I'm using SQL Count, but can't figure out what to do with the GroupId.
SELECT *
FROM TABLE T
HAVING COUNT(T.VALUE) > 1
GROUP BY ID, GroupId, VALUE
The simplest method for this is using EXISTS:
SELECT
ID
FROM
MyTable T1
WHERE
EXISTS (SELECT 1
FROM MyTable
WHERE Value = t1.Value
AND GroupID <> t1.GroupID)
Here is one method. First you have to identify the values that appear in more than one group and then use that information to find the right rows in the original table:
select *
from MyTable
where value in (SELECT value
FROM MyTable
GROUP BY value
HAVING COUNT(distinct groupid) > 1
)
order by value
Actually, I prefer a slight variant in this case, by changing the HAVING clause:
HAVING min(groupid) <> max(groupid)
This works when you are looking for more than one group and should be faster than the COUNT DISTINCT version.
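A quick sketch to check that the EXISTS version and the MIN/MAX variant agree on the sample data. It uses Python's sqlite3 module as the engine; the table name MyTable is taken from the EXISTS answer, and the alias t2 is added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID INTEGER, GroupID INTEGER, Value TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [(1, 1, "A"), (2, 1, "A"), (3, 2, "B"), (4, 3, "B")],
)

exists_sql = """
SELECT ID
FROM MyTable t1
WHERE EXISTS (SELECT 1
              FROM MyTable t2
              WHERE t2.Value = t1.Value
                AND t2.GroupID <> t1.GroupID)  -- same value, different group
ORDER BY ID
"""
having_sql = """
SELECT ID
FROM MyTable
WHERE Value IN (SELECT Value
                FROM MyTable
                GROUP BY Value
                HAVING MIN(GroupID) <> MAX(GroupID))  -- value spans 2+ groups
ORDER BY ID
"""
exists_ids = [row[0] for row in conn.execute(exists_sql)]
having_ids = [row[0] for row in conn.execute(having_sql)]
print(exists_ids, having_ids)
```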
SELECT ALL_.*
FROM (SELECT *
FROM TABLE_
GROUP BY ID, GROUPID, VALUE
ORDER BY ID) GROUPED,
TABLE_ ALL_
WHERE GROUPED.VALUE = ALL_.VALUE
AND GROUPED.GROUPID <> ALL_.GROUPID

How can I find the record with the max value for a group?

I am trying to write a query for a large dataset with many joins and having trouble accomplishing a particular piece without some sort of subquery, which I am trying to avoid.
For an example table with columns ID, Size, Item there may be multiple records with the same ID. I want to return the record per ID which has the largest Size.
ID Size Item
1 5 a
1 10 b
2 3 c
2 6 d
2 11 e
3 2 f
Expected result
ID Size Item
1 10 b
2 11 e
3 2 f
I've tried various group and having approaches without success.
Using a subquery I can do it like this, but for a large dataset I'd prefer not to:
select id, size, item
from test
where size = (select max(size) from test t2 where id = test.id)
Any suggestions?
This should satisfy your requirements: For each id, return only the row with the largest size
SELECT test.id, test.size, test.item
FROM test
INNER JOIN (
SELECT id, MAX(size) AS size
FROM test
GROUP BY id
) max_size ON max_size.id = test.id AND max_size.size = test.size
WITH T AS ( SELECT * ,
ROW_NUMBER() OVER ( PARTITION BY ID
ORDER BY Size DESC ) AS RN
FROM YourTable
)
SELECT ID ,
Size ,
Item
FROM T
WHERE RN = 1
SELECT id, item, MAX(size)
FROM Test
GROUP BY id, item
Assuming item is the same for every occurrence of that id.
select id, max(size), item
from test
group by id, item
Edit: Ah, the data you just added changes this and my above query no longer applies.
You can keep using your original query, but you should create a composite index on (id, size) for it.
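The approaches above can be compared side by side on the sample data. This is only a sketch: it runs in SQLite through Python's sqlite3 module (3.25 or newer for ROW_NUMBER), the index name idx_test_id_size is made up, and it says nothing about how the query planner of your actual database behaves on a large dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER, size INTEGER, item TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [(1, 5, "a"), (1, 10, "b"), (2, 3, "c"), (2, 6, "d"), (2, 11, "e"), (3, 2, "f")],
)
# Composite index as suggested above; the index name is hypothetical.
conn.execute("CREATE INDEX idx_test_id_size ON test (id, size)")

queries = {
    "correlated": """
        SELECT id, size, item FROM test
        WHERE size = (SELECT MAX(size) FROM test t2 WHERE t2.id = test.id)
        ORDER BY id""",
    "joined": """
        SELECT test.id, test.size, test.item
        FROM test
        JOIN (SELECT id, MAX(size) AS size FROM test GROUP BY id) max_size
          ON max_size.id = test.id AND max_size.size = test.size
        ORDER BY test.id""",
    "ranked": """
        WITH T AS (
            SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY size DESC) AS rn
            FROM test
        )
        SELECT id, size, item FROM T WHERE rn = 1 ORDER BY id""",
}
results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
print(results)
```

All three return the same rows here. They differ when two rows share the largest size for an id: the correlated and joined versions return both rows, while the ROW_NUMBER version returns exactly one.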

Fetch the row which has the Max value for a column in SQL Server

I found a question that was very similar to this one, but using features that seem exclusive to Oracle. I'm looking to do this in SQL Server.
I have a table like this:
MyTable
--------------------
MyTableID INT PK
UserID INT
Counter INT
Each user can have multiple rows, with different values for Counter in each row. I need to find the rows with the highest Counter value for each user.
How can I do this in SQL Server 2005?
The best I can come up with is a query that returns the MAX(Counter) for each UserID, but I need the entire row because of other data in this table not shown in my table definition for simplicity's sake.
EDIT: It has come to my attention from some of the answers in this post, that I forgot an important detail. It is possible to have 2+ rows where a UserID can have the same MAX counter value. Example below updated for what the expected data/output should be.
With this data:
MyTableID UserID Counter
--------- ------- --------
1 1 4
2 1 7
3 4 3
4 11 9
5 11 3
6 4 6
...
9 11 9
I want the results below. For the duplicate MAX values, select the first occurrence in whatever order SQL Server returns them. Which rows are returned isn't important in this case, as long as the UserID/Counter pairs are distinct:
MyTableID UserID Counter
--------- ------- --------
2 1 7
4 11 9
6 4 6
I like to use a Common Table Expression for that case, with a suitable ROW_NUMBER() function in it:
WITH MaxPerUser AS
(
SELECT
MyTableID, UserID, Counter,
ROW_NUMBER() OVER(PARTITION BY userid ORDER BY Counter DESC) AS 'RowNumber'
FROM dbo.MyTable
)
SELECT MyTableID, UserID, Counter
FROM MaxPerUser
WHERE RowNumber = 1
That partitions the data by UserID, orders it by Counter (descending) for each user, and then numbers the rows starting with 1 for each user. Select only the rows with RowNumber = 1 and you have the max values per user.
It's that easy :-) And I get results something like this:
MyTableID UserID Counter
2 1 7
6 4 6
4 11 9
Only one entry per user, no matter how many rows per user happen to have the same max value.
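When several rows tie on the max Counter, ROW_NUMBER() picks one of them arbitrarily. Here is a sketch that adds MyTableID as a tie-breaker so the choice is repeatable. It runs in SQLite (3.25 or newer) through Python's sqlite3 module rather than SQL Server, and it loads only the rows actually shown in the question; the elided rows are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (MyTableID INTEGER PRIMARY KEY, UserID INTEGER, Counter INTEGER)"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [(1, 1, 4), (2, 1, 7), (3, 4, 3), (4, 11, 9), (5, 11, 3), (6, 4, 6), (9, 11, 9)],
)

query = """
WITH MaxPerUser AS (
    SELECT MyTableID, UserID, Counter,
           ROW_NUMBER() OVER (
               PARTITION BY UserID
               ORDER BY Counter DESC, MyTableID ASC  -- tie-break on the lowest MyTableID
           ) AS RowNumber
    FROM MyTable
)
SELECT MyTableID, UserID, Counter
FROM MaxPerUser
WHERE RowNumber = 1
ORDER BY UserID
"""
top_rows = conn.execute(query).fetchall()
print(top_rows)
```

For user 11, rows 4 and 9 both have Counter 9; the tie-breaker makes the query always return row 4.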
I think this will help you.
SELECT DISTINCT a.UserID, MAX(a.Counter) AS Counter
FROM mytable a INNER JOIN mytable b ON a.mytableid = b.mytableid
GROUP BY a.userid
There are several ways to do this; take a look at "Including an Aggregated Column's Related Values". Several methods are shown there, including their performance differences.
Here is one example
select t1.*
from(
select UserID, max(counter) as MaxCount
from MyTable
group by UserID) t2
join MyTable t1 on t2.UserID =t1.UserID
and t1.counter = t2.MaxCount
Try this... I'm pretty sure this is the only way to truly make sure you get one row per User.
SELECT MT.*
FROM MyTable MT
INNER JOIN (
SELECT MAX(MID.MyTableId) AS MaxMyTableId,
MID.UserId
FROM MyTable MID
INNER JOIN (
SELECT MAX(Counter) AS MaxCounter, UserId
FROM MyTable
GROUP BY UserId
) AS MC
ON (MID.UserId = MC.UserId
AND MID.Counter = MC.MaxCounter)
GROUP BY MID.UserId
) AS MID
ON (MT.UserId = MID.UserId
AND MT.MyTableId = MID.MaxMyTableId)
select m.*
from MyTable m
inner join (
select UserID, max(Counter) as MaxCounter
from MyTable
group by UserID
) mm on m.UserID = mm.UserID and m.Counter = mm.MaxCounter
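Note that a plain join on MAX(Counter) returns every row that ties for the maximum, which conflicts with the one-row-per-user requirement from the question's edit. Here is a sketch showing the difference, plus one way to collapse ties by taking the highest MyTableID. It uses SQLite through Python's sqlite3 module as a stand-in for SQL Server and loads only the rows shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (MyTableID INTEGER PRIMARY KEY, UserID INTEGER, Counter INTEGER)"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [(1, 1, 4), (2, 1, 7), (3, 4, 3), (4, 11, 9), (5, 11, 3), (6, 4, 6), (9, 11, 9)],
)

max_per_user = "SELECT UserID, MAX(Counter) AS MaxCounter FROM MyTable GROUP BY UserID"

# Plain join: keeps both tied rows for user 11.
with_ties = conn.execute(f"""
    SELECT m.MyTableID, m.UserID, m.Counter
    FROM MyTable m
    JOIN ({max_per_user}) mm
      ON m.UserID = mm.UserID AND m.Counter = mm.MaxCounter
    ORDER BY m.MyTableID
""").fetchall()

# Collapse ties: keep the highest MyTableID among the max rows of each user.
one_per_user = conn.execute(f"""
    SELECT MAX(m.MyTableID) AS MyTableID, m.UserID, m.Counter
    FROM MyTable m
    JOIN ({max_per_user}) mm
      ON m.UserID = mm.UserID AND m.Counter = mm.MaxCounter
    GROUP BY m.UserID, m.Counter
    ORDER BY m.UserID
""").fetchall()

print(with_ties)
print(one_per_user)
```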