SQL: Find duplicated values associated with one ID - sql

Thanks to NoDisplayName (SQL: Query to set value based on group parameter) for getting my table to have Primary tags. But now I need help with a query to find errors in my table.
Sample Input:
Column1 | Column2 |
ID1 Primary
ID1 Primary
ID1
ID2 Primary
ID2
ID3 Primary
ID3
Specifically, what query would find cases where more than one Primary in Column2 is associated with the same value in Column1?
I just need the output to be something actionable so I can then remove the duplicated Primary tags.
Thank you!

select
column1,
column2,
count(column1)
from tablename
group by column1, column2
having count(column1) > 1

Or you could do this:
;WITH CTE AS
(SELECT Column1, Column2, ROW_NUMBER() OVER (PARTITION BY Column1, Column2 ORDER BY Column1) AS DupsWillHave2
FROM Foo)
SELECT *
FROM CTE
WHERE DupsWillHave2 > 1

I think the following is the query you want:
select column1, count(column1)
from tablename
where column2 = 'Primary'
group by column1
having count(*) > 1;
This will pick up only duplicate values of 'Primary'. It will not pick up duplicate blank values.
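As a quick check, the query above can be run against the sample input from the question. This is only a sketch: it uses Python's built-in sqlite3 module with an in-memory database rather than the asker's actual server, and it assumes the blank cells in Column2 are stored as empty strings.

```python
import sqlite3

# In-memory database standing in for the real table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (column1 TEXT, column2 TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [
        ("ID1", "Primary"), ("ID1", "Primary"), ("ID1", ""),
        ("ID2", "Primary"), ("ID2", ""),
        ("ID3", "Primary"), ("ID3", ""),
    ],
)

# Count only the 'Primary' rows per ID and keep IDs that have more than one.
duplicates = conn.execute(
    """
    SELECT column1, COUNT(*) AS primary_count
    FROM tablename
    WHERE column2 = 'Primary'
    GROUP BY column1
    HAVING COUNT(*) > 1
    ORDER BY column1
    """
).fetchall()

print(duplicates)
```

Only ID1 is reported, with a count of 2, which gives a list of IDs whose extra Primary tags need to be removed.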

Related

How to combine multiple columns into one column?

I'm writing a query and want the results in one column.
My current results look like this:
Column1 Column2 column3
1 A CAT
I want the results to return like this
Column1
1
A
CAT
SELECT Column1 FROM TableName
UNION ALL
SELECT Column2 FROM TableName
UNION ALL
SELECT Column3 FROM TableName
If you don't want duplicate values, use UNION instead of UNION ALL.
You can also do this using the UNPIVOT operator:
SELECT Column123
FROM
(
SELECT Column1, Column2, Column3
FROM TableName
) AS tmp
UNPIVOT
(
Column123 FOR ColumnAll IN (Column1, Column2, Column3)
) AS unpvt;
https://www.w3schools.com/sql/sql_union.asp
https://www.mssqltips.com/sqlservertip/3000/use-sql-servers-unpivot-operator-to-help-normalize-output/
The answer is: it depends.
If the number of columns is unknown, use UNPIVOT as UZI has suggested.
If you know all the columns and they form a small, finite set,
you can simply write:
Select
column1
from table
union all
select column2
from table
union all
select column3
from table
This takes the Cartesian product of the table #t with a derived table of 3 rows (each row of #t appears 3 times: for a=1, a=2 and a=3). In the first case we take the value from Column1,
in the second from Column2, and in the third from Column3.
Admittedly this uses both a union and a join, but in my opinion the question in the title calls for scanning the table only once.
CREATE TABLE #t (Column1 NVARCHAR(25),Column2 NVARCHAR(25), column3 NVARCHAR(25))
INSERT INTO #t
SELECT '1','A','CAT'
SELECT
CASE a WHEN 1 THEN Column1 WHEN 2 THEN Column2 ELSE column3 END col
FROM #t, (SELECT 1 a UNION ALL SELECT 2 UNION ALL SELECT 3) B
DROP TABLE #t
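To compare the two approaches, the sketch below runs the UNION ALL query and the cross-join query side by side. It is an illustration only: it uses Python's built-in sqlite3 module with an in-memory table named t in place of the #t temp table, and it writes the comma join as an explicit CROSS JOIN.

```python
import sqlite3

# In-memory table standing in for #t (assumption: SQLite instead of SQL Server).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (Column1 TEXT, Column2 TEXT, Column3 TEXT)")
conn.execute("INSERT INTO t VALUES ('1', 'A', 'CAT')")

# Approach 1: one SELECT per column, stacked with UNION ALL.
union_rows = conn.execute(
    """
    SELECT Column1 AS col FROM t
    UNION ALL SELECT Column2 FROM t
    UNION ALL SELECT Column3 FROM t
    """
).fetchall()

# Approach 2: repeat each row three times and pick a column per copy.
cross_rows = conn.execute(
    """
    SELECT CASE b.a WHEN 1 THEN Column1 WHEN 2 THEN Column2 ELSE Column3 END AS col
    FROM t
    CROSS JOIN (SELECT 1 AS a UNION ALL SELECT 2 UNION ALL SELECT 3) AS b
    """
).fetchall()

# Row order is not guaranteed without ORDER BY, so compare sorted results.
print(sorted(union_rows), sorted(cross_rows))
```

Both queries return the same three values. Neither guarantees the order of the output rows, so add an ORDER BY if the order matters.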

SQL Server : how to find ids where columns have different values

I have a table like this:
Column1 Column2
---------------
1 1
1 2
1 3
1 4
2 1
2 1
2 1
2 1
Column1 contains 2 different ids, and column2 holds the values for each id in column1.
How can I get the ids from column1 whose values in column2 are not all the same? In this instance the output should be 1, because id 1 has all different values in column2, whereas id 2 has all 1's in column2.
Just use group by and having:
select column1
from table t
group by column1
having min(column2) <> max(column2);
Note: you could also use count(distinct), but that has more overhead than min() and max().
Similar logic can be used if the second column could be NULL. That doesn't appear in the sample data so it doesn't seem worth including it in the logic unless the OP specifically says this is a possibility.
Try like this:
select Column1
from yourTable
group by Column1
having count(DISTINCT column2) > 1;
I would think something like this should do the job:
SELECT t.column1 FROM table t
GROUP BY t.column1
HAVING COUNT(DISTINCT t.column2) > 1
This approach will handle the case where a null is an acceptable value in column2.
select column1
from
(
select distinct column1, column2
from yourTable
) t
group by column1
having count(*) > 1
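The difference between these answers only shows up when column2 contains NULL. The sketch below runs all three variants with Python's built-in sqlite3 module on the sample data plus one extra id (3) that has the values 1 and NULL. That extra id is a hypothetical addition for illustration and is not part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (column1 INTEGER, column2 INTEGER)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?)",
    [
        (1, 1), (1, 2), (1, 3), (1, 4),
        (2, 1), (2, 1), (2, 1), (2, 1),
        (3, 1), (3, None),  # hypothetical NULL case, not in the question
    ],
)

def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

# MIN/MAX ignore NULL, so id 3 looks like it has a single value.
min_max = ids(
    "SELECT column1 FROM yourTable GROUP BY column1 "
    "HAVING MIN(column2) <> MAX(column2) ORDER BY column1"
)

# COUNT(DISTINCT ...) also ignores NULL.
count_distinct = ids(
    "SELECT column1 FROM yourTable GROUP BY column1 "
    "HAVING COUNT(DISTINCT column2) > 1 ORDER BY column1"
)

# SELECT DISTINCT keeps (3, NULL) as its own row, so COUNT(*) sees it.
distinct_rows = ids(
    "SELECT column1 FROM (SELECT DISTINCT column1, column2 FROM yourTable) AS t "
    "GROUP BY column1 HAVING COUNT(*) > 1 ORDER BY column1"
)

print(min_max, count_distinct, distinct_rows)
```

On this data the first two queries report only id 1, while the SELECT DISTINCT subquery also reports id 3, because it treats NULL as a value of its own.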

SQL QUERY - Omit ALL duplicate results

I need to return values in a column where only the unique values are returned. I know that DISTINCT will return only unique values; however, I need to completely omit any that are duplicated.
i.e.
Column 1 Column 2
----------------------
123456789 27/02/2014
123456789 25/02/2014
654789897 27/02/2014
To return only "654789897 27/02/2014" and omit the other results.
You want to use group by and having:
select column1, column2
from table t
group by column1, column2
having count(*) = 1;
EDIT: (based on comment by knkarthick24)
Depending on what the OP intends, this might also be correct:
select column1, max(column2)
from table t
group by column1
having count(*) = 1;
select column1, column2
from tbl
where column1 in (
select column1
from tbl
group by column1 having count(column1) = 1)
It's good to use HAVING together with GROUP BY.
Let me know if that works:)
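Which grouping is right depends on what counts as a duplicate. The sketch below runs both GROUP BY variants on the sample rows from the question, using Python's built-in sqlite3 module. The table name tbl and storing both columns as text are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (column1 TEXT, column2 TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [
        ("123456789", "27/02/2014"),
        ("123456789", "25/02/2014"),
        ("654789897", "27/02/2014"),
    ],
)

# Grouping on both columns: a row is a duplicate only if the whole pair repeats.
per_pair = conn.execute(
    "SELECT column1, column2 FROM tbl "
    "GROUP BY column1, column2 HAVING COUNT(*) = 1"
).fetchall()

# Grouping on column1 only: any repeated column1 value is dropped entirely.
per_id = conn.execute(
    "SELECT column1, MAX(column2) FROM tbl "
    "GROUP BY column1 HAVING COUNT(*) = 1"
).fetchall()

print(len(per_pair), per_id)
```

On this sample, grouping by both columns keeps all three rows because the two 123456789 rows have different dates. Grouping by column1 alone returns only the 654789897 row, which matches the output the question asks for.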

Finding rows that have many similar values and one different one

I'm trying to isolate a problem with a violation of a unique key index. I'm pretty certain the cause is rows that have the same values in 3 columns but not the same value in the 4th (when they should). As an example...
Key Column1 Column2 Column3 Column4
1 A B C D
2 A B C D
3 A B C D
4 A B C Z
I basically want to select column 4, or some way to let me identify column 4. I know it's a matter of using aggregate functions, but I'm not very familiar with them. Can anyone suggest a way to select Key, Column4 for rows that have a different column 4 value and the same column 1-3 values?
This is what you want:
select column1, column2, column3
from t
group by column1, column2, column3
having min(column4) <> max(column4)
Once you get the right values for the first three columns, you can join back in to get the specific rows.
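The join-back step could look like the sketch below, run here with Python's built-in sqlite3 module. The column name key_id stands in for Key, and the X/Y/W rows are hypothetical additions that provide a consistent group for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (key_id INTEGER, column1 TEXT, column2 TEXT, "
    "column3 TEXT, column4 TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [
        (1, "A", "B", "C", "D"),
        (2, "A", "B", "C", "D"),
        (3, "A", "B", "C", "D"),
        (4, "A", "B", "C", "Z"),
        (5, "X", "Y", "W", "Q"),  # hypothetical consistent group
        (6, "X", "Y", "W", "Q"),
    ],
)

# Find the inconsistent (column1, column2, column3) groups, then join back
# to the table to list every row that belongs to one of them.
rows = conn.execute(
    """
    SELECT t.key_id, t.column4
    FROM t
    JOIN (
        SELECT column1, column2, column3
        FROM t
        GROUP BY column1, column2, column3
        HAVING MIN(column4) <> MAX(column4)
    ) AS g
      ON t.column1 = g.column1
     AND t.column2 = g.column2
     AND t.column3 = g.column3
    ORDER BY t.key_id
    """
).fetchall()

print(rows)
```

This lists keys 1 to 4 with their column4 values, so the odd row (key 4, value Z) can be spotted next to the rows it conflicts with. The consistent group (keys 5 and 6) is left out.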
Or, you can use window functions like this:
select t.*
from (select t.*, min(column4) over (partition by column1, column2, column3) as min4,
max(column4) over (partition by column1, column2, column3) as max4
from t
) t
where min4 <> max4;
If NULL is a valid "other" value that you want to count, you will need additional logic for that.
If you want to get all columns (this would be simpler if windowed count supported distinct, but it doesn't):
with cte1 as (
select distinct * from Table1
), cte2 as (
select
*,
count(column4) over(partition by column1, column2, column3) as cnt
from cte1
)
select * from cte2 where cnt > 1;
If you just want to select the key:
select
column1, column2, column3
from Table1
group by column1, column2, column3
having count(distinct column4) > 1

Sampling unique set of records in Oracle table

I have an Oracle table from which I need to select a given percentage of records for each unique combination of a given set of columns.
For example,
SELECT distinct column1, column2, Column3 from TableX;
returns all the unique combinations of those columns in the table. I need a percentage of the rows from each such combination. Currently I am using the following query to accomplish this, which is lengthy and slow.
SELECT *
FROM tableX Sample ( 3 )
WHERE Column1 = 'value1' and
Column2 = 'value2' and
Column3 = 'value3'
UNION
SELECT *
FROM tableX Sample ( 3 )
WHERE Column1 = 'value1' and
Column2 = 'value2' and
Column3 = 'value4'
UNION
…
…
SELECT *
FROM tableX Sample ( 3 )
WHERE Column1 = 'valueP' and
Column2 = 'valueQ' and
Column3 = 'valueR'
Here each combination of values is unique for that table (obtained from the first query).
How can I shorten the query and make it faster?
Here is one approach:
select t.*
from (select t.*,
row_number() over (partition by column1, column2, column3 order by dbms_random.value
) as seqnum,
count(*) over (partition by column1, column2, column3) as totcnt
from tablex t
) t
where seqnum / totcnt <= 0.10 -- or whatever your threshold is
It uses row_number() to assign a sequential number to rows in each group, in a random order. The where clause chooses the proportion that you want.
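To see how the threshold behaves for groups of different sizes, here is a rough sketch of the same idea using Python's built-in sqlite3 module. It assumes an SQLite build with window-function support (3.25 or later), uses RANDOM() as a stand-in for Oracle's random function, and multiplies by 1.0 because SQLite would otherwise do integer division. The payload column and the group sizes (20, 10 and 5 rows) are made up for the illustration.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablex (column1 TEXT, column2 TEXT, column3 TEXT, payload INTEGER)"
)
# Three hypothetical groups with 20, 10 and 5 rows.
rows = (
    [("a", "b", "c", i) for i in range(20)]
    + [("a", "b", "d", i) for i in range(10)]
    + [("p", "q", "r", i) for i in range(5)]
)
conn.executemany("INSERT INTO tablex VALUES (?, ?, ?, ?)", rows)

# Number the rows of each group in random order, then keep the first 10%.
sampled = conn.execute(
    """
    SELECT column1, column2, column3, payload
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY column1, column2, column3
                                  ORDER BY RANDOM()) AS seqnum,
               COUNT(*) OVER (PARTITION BY column1, column2, column3) AS totcnt
        FROM tablex AS t
    )
    WHERE seqnum * 1.0 / totcnt <= 0.10
    """
).fetchall()

# Count how many rows were kept per group.
per_group = Counter((c1, c2, c3) for c1, c2, c3, _ in sampled)
print(dict(per_group))
```

Which rows are picked changes from run to run, but the number kept per group is fixed: 2 of 20 and 1 of 10. The 5-row group contributes nothing, because even its first row is 20% of the group. If every combination must be represented, the condition would need an extra clause such as `OR seqnum = 1`.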