SQL DISTINCT for 2 Columns WHERE 3rd column = value

I'm trying to select the count of distinct (Col 1, Col 2) pairs where Col 3 = 'Complete'.
First, I don't know how to make DISTINCT apply to Col 1 and Col 2 together, as opposed to each column on its own.
Second, I don't know how to exclude Col 3 from the DISTINCT.
SELECT COUNT(*) AS Count From
(Select Distinct DP, RN From ECount
Where ET = 'Complete') as rows
Any thoughts?
Example (rows marked "yes" are the ones that should be counted):
col1  col2  col3
DP01  RN01  Complete    yes
DP01  RN02  Incomplete
DP02  RN03  Complete
DP02  RN03  Incomplete
DP01  RN04  Complete    yes
DP02  RN05  Complete    yes
DP03  RN06  Incomplete
Result = 3

I don't think you need to include column 3 (i.e. ET) in the SELECT list; you can just use it directly in the WHERE clause.
So in your example:
SELECT COUNT(*) AS Count FROM
(SELECT DISTINCT DP, RN FROM ECount
WHERE ET = 'Complete'
) AS rows

Just don't select ET in the subquery:
SELECT COUNT(*) AS Count
From (
Select Distinct DP, RN
From ECount
Where ET = 'Complete'
) as rows

SELECT
    Count(*) AS Count
FROM
    (
        SELECT
            *
        FROM
            (
                SELECT
                    *
                FROM
                    `ECount`
                ORDER BY
                    col3 DESC
            ) AS StrongIncomplete
        GROUP BY
            col1,
            col2
    ) AS CompleteCut
WHERE
    CompleteCut.col3 = 'Complete'
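There are 3 nested SELECT statements, described here from the innermost outward.
The first one sorts the table so that 'Incomplete' comes before 'Complete' in col3.
The second one collapses rows that share the same col1 and col2 into a single row.
The third one removes rows where col3 = 'Incomplete'.
Note that this relies on MySQL keeping the first sorted row of each group, which MySQL does not guarantee when GROUP BY is used with non-aggregated columns.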
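The sample data suggests that a pair such as DP02/RN03, which has both a 'Complete' and an 'Incomplete' row, should not be counted. A minimal sketch of one way to express that without depending on row order is a GROUP BY with a HAVING clause. It is run here through Python's sqlite3 module purely so that it can be checked; the table and column names follow the question, everything else (variable names, the in-memory database) is an assumption made for illustration.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ECount (DP TEXT, RN TEXT, ET TEXT)")
conn.executemany(
    "INSERT INTO ECount VALUES (?, ?, ?)",
    [
        ("DP01", "RN01", "Complete"),
        ("DP01", "RN02", "Incomplete"),
        ("DP02", "RN03", "Complete"),
        ("DP02", "RN03", "Incomplete"),
        ("DP01", "RN04", "Complete"),
        ("DP02", "RN05", "Complete"),
        ("DP03", "RN06", "Incomplete"),
    ],
)

# Distinct (DP, RN) pairs that have at least one 'Complete' row.
distinct_complete = conn.execute("""
    SELECT COUNT(*)
    FROM (SELECT DISTINCT DP, RN
          FROM ECount
          WHERE ET = 'Complete') AS pairs
""").fetchone()[0]

# Pairs that have a 'Complete' row and no row with any other status.
only_complete = conn.execute("""
    SELECT COUNT(*)
    FROM (SELECT DP, RN
          FROM ECount
          GROUP BY DP, RN
          HAVING SUM(CASE WHEN ET = 'Complete' THEN 1 ELSE 0 END) > 0
             AND SUM(CASE WHEN ET <> 'Complete' THEN 1 ELSE 0 END) = 0) AS pairs
""").fetchone()[0]

print(distinct_complete, only_complete)
```

On this data the plain DISTINCT subquery counts DP02/RN03 as well, while the HAVING version gives the 3 that the question expects. Which of the two is right depends on whether mixed pairs should count.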

Related

SQL with having statement now want complete rows

Here is a mock table
MYTABLE ROWS
PKEY 1,2,3,4,5,6
COL1 a,b,b,c,d,d
COL2 55,44,33,88,22,33
I want to know which rows have duplicated COL1 values:
select col1, count(*)
from MYTABLE
group by col1
having count(*) > 1
This returns :
b,2
d,2
I now want all the rows that contain b and d. Normally I would use a WHERE ... IN statement, but with the count column involved I'm not certain what type of statement I should use.
Maybe you need:
select * from MYTABLE
where col1 in
(
select col1
from MYTABLE
group by col1
having count(*) > 1
)
Use a CTE and a windowed aggregate:
WITH CTE AS(
SELECT Pkey,
Col1,
Col2,
COUNT(1) OVER (PARTITION BY Col1) AS C
FROM dbo.YourTable)
SELECT PKey,
Col1,
Col2
FROM CTE
WHERE C > 1;
There are lots of ways to solve this; here's another:
select * from MYTABLE
join
(
select col1, count(*) as cnt -- alias needed: some databases reject unnamed derived-table columns
from MYTABLE
group by col1
having count(*) > 1
) s on s.col1 = mytable.col1;
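To compare the three approaches side by side, here is a small sketch that loads the mock table into an in-memory SQLite database from Python and runs each query. The table and column names come from the question; the use of SQLite (3.25 or later, for the window function) and the variable names are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MYTABLE (PKEY INTEGER PRIMARY KEY, COL1 TEXT, COL2 INTEGER)")
conn.executemany(
    "INSERT INTO MYTABLE VALUES (?, ?, ?)",
    [(1, "a", 55), (2, "b", 44), (3, "b", 33), (4, "c", 88), (5, "d", 22), (6, "d", 33)],
)

# 1) WHERE ... IN over the grouped subquery.
in_subquery = conn.execute("""
    SELECT PKEY FROM MYTABLE
    WHERE COL1 IN (SELECT COL1 FROM MYTABLE GROUP BY COL1 HAVING COUNT(*) > 1)
    ORDER BY PKEY
""").fetchall()

# 2) CTE with a windowed COUNT per COL1 value.
windowed = conn.execute("""
    WITH CTE AS (
        SELECT PKEY, COUNT(*) OVER (PARTITION BY COL1) AS C
        FROM MYTABLE
    )
    SELECT PKEY FROM CTE WHERE C > 1 ORDER BY PKEY
""").fetchall()

# 3) JOIN against the grouped subquery.
joined = conn.execute("""
    SELECT m.PKEY
    FROM MYTABLE m
    JOIN (SELECT COL1, COUNT(*) AS cnt
          FROM MYTABLE
          GROUP BY COL1
          HAVING COUNT(*) > 1) s ON s.COL1 = m.COL1
    ORDER BY m.PKEY
""").fetchall()

print(in_subquery, windowed, joined)
```

All three return the rows with PKEY 2, 3, 5 and 6 on this data, so the choice is mostly a matter of readability and of what the target database optimizes well.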

SQL DISTINCT based on a single column, but keep all columns as output

--mytable
col1 col2 col3
1 A red
2 A green
3 B purple
4 C blue
Let's call the table above mytable. I want to select only distinct values from col2:
SELECT DISTINCT
col2
FROM
mytable
When I do this the output looks like this, which is expected:
col2
A
B
C
But how do I perform the same type of query, yet keep all columns? The output would look like below. In essence I'm going through mytable looking at col2, and when there are multiple occurrences of a col2 value I'm only keeping the first row.
col1 col2 col3
1 A red
3 B purple
4 C blue
Do SQL functions (eg DISTINCT) have arguments I could set? I could imagine it to be something like KeepAllColumns = TRUE for this DISTINCT function? Or do I need to perform JOINs to get what I want?
You can use window functions, particularly row_number():
select t.*
from (select t.*, row_number() over (partition by col2 order by col1) as seqnum
from mytable t
) t
where seqnum = 1;
row_number() enumerates the rows within each col2 group, starting with "1". The order by inside over() decides which row gets number 1, so you can control whether you get the oldest, newest, biggest, smallest . . . Here, order by col1 keeps the row with the lowest col1.
You can use the QUALIFY clause in Teradata:
SELECT col1, col2, col3
FROM mytable
QUALIFY ROW_NUMBER() OVER(PARTITION BY col2 ORDER BY col1) = 1 -- Get 1st row per group (lowest col1)
To change which row is kept for each col2 value, just change the expression in the ORDER BY.
With NOT EXISTS:
select m.* from mytable m
where not exists (
select 1 from mytable
where col2 = m.col2 and col1 < m.col1
)
This code will return the rows for which there is not another row with the same col2 and a smaller value in col1.
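As a quick check that the row_number() and NOT EXISTS approaches agree, here is a small sketch against the sample table, run through Python's sqlite3 module. It assumes SQLite 3.25 or later (for window functions) and that "first row" means the lowest col1 per col2 value; the alias and variable names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1 INTEGER, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, "A", "red"), (2, "A", "green"), (3, "B", "purple"), (4, "C", "blue")],
)

# Number the rows inside each col2 group by col1 and keep number 1.
ranked = conn.execute("""
    SELECT col1, col2, col3
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY col2 ORDER BY col1) AS seqnum
          FROM mytable t) AS numbered
    WHERE seqnum = 1
    ORDER BY col1
""").fetchall()

# Keep a row only if no other row has the same col2 and a smaller col1.
not_exists = conn.execute("""
    SELECT m.col1, m.col2, m.col3
    FROM mytable m
    WHERE NOT EXISTS (SELECT 1
                      FROM mytable o
                      WHERE o.col2 = m.col2 AND o.col1 < m.col1)
    ORDER BY m.col1
""").fetchall()

print(ranked)
print(not_exists)
```

Both queries keep rows 1, 3 and 4. The NOT EXISTS form would return several rows per group if col1 had ties, whereas row_number() always keeps exactly one.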

SQL/Oracle return only field with identical value in 2nd column

I need to return a column 1 value only if every row for it in a repeating log has the same value in the 2nd column. If any other value is seen, that column 1 value should be excluded from the result.
A 2
A 2
A 2
A 2
A 2
Exclude
B 2
B 1
B 2
B 3
B 2
select b.column1
from
( select *
from table
where column2 != 1
) b
where b.column2 = 2
Results:
A
You could use aggregation and HAVING:
SELECT col1
FROM tab
GROUP BY col1
HAVING COUNT(DISTINCT col2) = 1;
or if you need original rows:
SELECT s.*
FROM (SELECT t.*, COUNT(DISTINCT col2) OVER(PARTITION BY col1) AS cnt
FROM tab t) s
WHERE s.cnt = 1;
If you need the original rows, I would recommend not exists:
select t.*
from t
where not exists (select 1 from t t2 where t2.col1 = t.col1 and t2.col2 <> t.col2);
If you just want the col1 values (which makes sense to me), then I would phrase the aggregation as:
select col1
from t
group by col1
having min(col2) = max(col2);
If you want to include "all-null" as a valid option, then:
having min(col2) = max(col2) or min(col2) is null
Try this query:
select column1
from (select column1, column2
from Test
group by column1, column2) a
group by column1
having count(column1) = 1;
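The aggregate variants above can be tried on the sample log with a short sketch like the following, run through Python's sqlite3 module. The table name log_tab is hypothetical; col1 and col2 stand for the two columns in the question. The windowed COUNT(DISTINCT ...) version is left out because SQLite does not accept DISTINCT in window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log_tab (col1 TEXT, col2 INTEGER)")  # hypothetical table name
conn.executemany(
    "INSERT INTO log_tab VALUES (?, ?)",
    [("A", 2)] * 5 + [("B", 2), ("B", 1), ("B", 2), ("B", 3), ("B", 2)],
)

# col1 values whose rows all share a single col2 value.
count_distinct = conn.execute("""
    SELECT col1 FROM log_tab
    GROUP BY col1
    HAVING COUNT(DISTINCT col2) = 1
""").fetchall()

# Same idea phrased with MIN/MAX.
min_max = conn.execute("""
    SELECT col1 FROM log_tab
    GROUP BY col1
    HAVING MIN(col2) = MAX(col2)
""").fetchall()

# Original rows: keep a row only if no row with the same col1 has a different col2.
original_rows = conn.execute("""
    SELECT t.col1, t.col2
    FROM log_tab t
    WHERE NOT EXISTS (SELECT 1
                      FROM log_tab t2
                      WHERE t2.col1 = t.col1 AND t2.col2 <> t.col2)
""").fetchall()

print(count_distinct, min_max, len(original_rows))
```

On this data both grouped queries return only A, and the NOT EXISTS query returns the five A rows.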

DISTINCT for only one Column and other column random?

I have a table named Demodata with two columns, col1 and col2. The table's data is:
col1 col2
1 5
1 6
2 7
3 8
3 9
4 10
After the SELECT command we need this data, with repeated col1 values left blank:
col1  col2
1     5
      6
2     7
3     8
      9
4     10
Is this possible? If so, what is the query? Please guide me.
Try this:
SELECT CASE WHEN RN > 1 THEN NULL ELSE Col1 END, Col2
FROM
(
SELECT *, Row_Number() Over(Partition by col1 order by col2) AS RN
From yourTable
) AS T
ORDER BY Col1, Col2
No, it is not possible.
SQL Server result sets are row based, not tree based. You must have a value for each column (or, alternatively, a NULL).
What you can do is group by col1 and aggregate the values of col2 into one string (for example with STUFF and FOR XML PATH).
You can do this in SQL, using row_number():
select (case when row_number() over (partition by col1 order by col2) = 1
then col1
end), col2
from Demodata t
order by col1, col2;
Notice that the ordering is important. The way you have written the result set, the data is ordered by col1 and then col2. Result sets do not have an inherent ordering, unless you include an order by clause.
Also, I have used NULL for the missing values.
And, finally, although this can be done in SQL, it is often preferable to do these types of manipulations on the client side.
What do you want to select on the duplicates, an empty string, NULL, 0, ... ?
I presume NULL, you can use a CTE with ROW_NUMBER and CASE on col1:
WITH CTE AS(
SELECT RN = ROW_NUMBER() OVER (PARTITION BY col1
ORDER BY (SELECT 1))
, col1, col2
FROM Demodata
)
SELECT col1 = CASE WHEN RN = 1 THEN col1 ELSE NULL END, col2
FROM CTE
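To see the row_number() plus CASE idea produce the requested shape, here is a minimal sketch on the Demodata sample, run through Python's sqlite3 module. It assumes SQLite 3.25 or later and that NULL is an acceptable stand-in for the blank cells; the output alias shown_col1 is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Demodata (col1 INTEGER, col2 INTEGER)")
conn.executemany(
    "INSERT INTO Demodata VALUES (?, ?)",
    [(1, 5), (1, 6), (2, 7), (3, 8), (3, 9), (4, 10)],
)

# Show col1 only on the first row of each col1 group (lowest col2);
# the CASE has no ELSE branch, so every other row gets NULL.
rows = conn.execute("""
    SELECT CASE WHEN ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col2) = 1
                THEN col1
           END AS shown_col1,
           col2
    FROM Demodata
    ORDER BY col1, col2
""").fetchall()

for shown_col1, col2 in rows:
    # Print NULL as an empty cell, as in the desired output.
    print("" if shown_col1 is None else shown_col1, col2, sep="\t")
```

The outer ORDER BY uses the underlying col1 column rather than the blanked-out alias, which is what keeps each NULL row directly under its group's first row.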

How to select all columns for rows where I check if just 1 or 2 columns contain duplicate values

I'm having difficulty with what I figure should be an easy problem. I want to select all the columns in a table for which one particular column has duplicate values.
I've been trying to use aggregate functions, but that's constraining me as I want to just match on one column and display all values. Using aggregates seems to require that I 'group by' all columns I'm going to want to display.
If I understood you correctly, this should do:
SELECT *
FROM YourTable A
WHERE EXISTS(SELECT 1
FROM YourTable
WHERE Col1 = A.Col1
GROUP BY Col1
HAVING COUNT(*) > 1)
You can join on a derived table where you aggregate and determine "col" values which are duplicated:
SELECT a.*
FROM Table1 a
INNER JOIN
(
SELECT col
FROM Table1
GROUP BY col
HAVING COUNT(1) > 1
) b ON a.col = b.col
This query lets you choose, via ORDER BY cola (ascending or descending), which row of each colb group is treated as the original. It returns only the other rows of the group, i.e. those with rn > 1.
Here's a Demo on SqlFiddle.
with cl
as
(
select *, ROW_NUMBER() OVER(partition by colb order by cola ) as rn
from tbl)
select *
from cl
where rn > 1
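The answers above do not all return the same rows: the EXISTS and JOIN versions return every row whose value is duplicated, while the rn > 1 version returns only the extra copies. A small sketch of the difference, run through Python's sqlite3 module (SQLite 3.25 or later assumed), with made-up sample data in a table tbl(cola, colb):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (cola INTEGER, colb TEXT)")
# Hypothetical sample data: 'x' appears twice, 'y' once, 'z' three times.
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [(1, "x"), (2, "x"), (3, "y"), (4, "z"), (5, "z"), (6, "z")],
)

# Every row whose colb value occurs more than once.
all_duplicated = conn.execute("""
    SELECT a.cola
    FROM tbl a
    INNER JOIN (SELECT colb
                FROM tbl
                GROUP BY colb
                HAVING COUNT(*) > 1) b ON a.colb = b.colb
    ORDER BY a.cola
""").fetchall()

# Only the rows after the first one in each colb group (first = lowest cola).
extra_copies = conn.execute("""
    WITH cl AS (
        SELECT cola, colb,
               ROW_NUMBER() OVER (PARTITION BY colb ORDER BY cola) AS rn
        FROM tbl
    )
    SELECT cola FROM cl WHERE rn > 1 ORDER BY cola
""").fetchall()

print(all_duplicated)
print(extra_copies)
```

The first form suits displaying all duplicated rows, the second suits picking the rows to delete while keeping one per group.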