SQL with HAVING clause: how to get the complete rows

Here is a mock table, MYTABLE:
PKEY COL1 COL2
1 a 55
2 b 44
3 b 33
4 c 88
5 d 22
6 d 33
I want to know which rows have duplicated COL1 values:
select col1, count(*)
from MYTABLE
group by col1
having count(*) > 1
This returns:
b,2
d,2
I now want all the complete rows whose COL1 is b or d. Normally I would use a WHERE ... IN statement, but because of the count column I'm not certain what type of statement I should use.

Maybe you need the grouped query as a subquery in a WHERE ... IN clause:
select * from MYTABLE
where col1 in
(
select col1
from MYTABLE
group by col1
having count(*) > 1
)
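
As a quick check of the subquery approach above, here is a minimal sketch that runs it against the mock table from the question. It assumes an in-memory SQLite database driven from Python as a stand-in for the real server; the column types and the ORDER BY are additions for the demo.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the asker's database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MYTABLE (PKEY INTEGER PRIMARY KEY, COL1 TEXT, COL2 INTEGER)")
conn.executemany(
    "INSERT INTO MYTABLE VALUES (?, ?, ?)",
    [(1, "a", 55), (2, "b", 44), (3, "b", 33), (4, "c", 88), (5, "d", 22), (6, "d", 33)],
)

# The inner query finds COL1 values that occur more than once;
# the outer query returns every complete row carrying one of those values.
duplicate_rows = conn.execute(
    """
    SELECT PKEY, COL1, COL2
    FROM MYTABLE
    WHERE COL1 IN (
        SELECT COL1 FROM MYTABLE GROUP BY COL1 HAVING COUNT(*) > 1
    )
    ORDER BY PKEY
    """
).fetchall()

print(duplicate_rows)
# [(2, 'b', 44), (3, 'b', 33), (5, 'd', 22), (6, 'd', 33)]
```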

Use a CTE and a windowed aggregate:
WITH CTE AS(
SELECT Pkey,
Col1,
Col2,
COUNT(1) OVER (PARTITION BY Col1) AS C
FROM dbo.YourTable)
SELECT PKey,
Col1,
Col2
FROM CTE
WHERE C > 1;

There are lots of ways to solve this; here's another, using a join:
select MYTABLE.*
from MYTABLE
join
(
select col1
from MYTABLE
group by col1
having count(*) > 1
) s on s.col1 = MYTABLE.col1;

Related

How can I use a COUNT(DISTINCT var) to return the count of unique values per group?

I need to return a count of unique values, but unique per group of the result set, not unique to the entire result set. For example I would like the following code:
SELECT col1 AS letters, count(DISTINCT col2) AS numbers
GROUP BY col1;
applied to this data:
col1 col2
a 5
a 5
a 6
b 1
b 2
b 6
To return this:
col1 col2
a 2
b 3
If the above code will not produce this, how can I accomplish this in T-SQL?
I hope this works for you: group by col1 and take the count of distinct col2 values.
SELECT
col1,
COUNT(DISTINCT col2)
FROM
count_unique_values_per_group
GROUP BY
col1
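
To confirm the grouped COUNT(DISTINCT ...) above produces the requested output, here is a small sketch on the question's data. It assumes SQLite (via Python's sqlite3) in place of SQL Server; the column types, the `numbers` alias, and the ORDER BY are added for the demo.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE count_unique_values_per_group (col1 TEXT, col2 INTEGER)")
conn.executemany(
    "INSERT INTO count_unique_values_per_group VALUES (?, ?)",
    [("a", 5), ("a", 5), ("a", 6), ("b", 1), ("b", 2), ("b", 6)],
)

# DISTINCT is applied inside each col1 group, not across the whole result set.
per_group = conn.execute(
    """
    SELECT col1, COUNT(DISTINCT col2) AS numbers
    FROM count_unique_values_per_group
    GROUP BY col1
    ORDER BY col1
    """
).fetchall()

print(per_group)
# [('a', 2), ('b', 3)]
```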
Try this:
SELECT DISTINCT col1
,dense_rank() over (partition by col1 order by col2 asc) + dense_rank() over (partition by col1 order by col2 desc) - 1
FROM my_table
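
The DENSE_RANK trick above works because, within a group, a value's ascending dense rank plus its descending dense rank always equals the number of distinct values plus one. The sketch below exposes the two ranks per row so this can be seen directly. It assumes SQLite 3.25 or later (window function support) via Python's sqlite3; the `rank_asc` and `rank_desc` aliases are illustrative. Unlike COUNT(DISTINCT col2), this approach would also count a NULL in col2 as a value.

```python
import sqlite3

# Assumption: in-memory SQLite (3.25+ for window functions) stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (col1 TEXT, col2 INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("a", 5), ("a", 5), ("a", 6), ("b", 1), ("b", 2), ("b", 6)],
)

ranked = conn.execute(
    """
    SELECT col1,
           col2,
           DENSE_RANK() OVER (PARTITION BY col1 ORDER BY col2 ASC)  AS rank_asc,
           DENSE_RANK() OVER (PARTITION BY col1 ORDER BY col2 DESC) AS rank_desc
    FROM my_table
    ORDER BY col1, col2
    """
).fetchall()

# rank_asc + rank_desc - 1 is the same for every row of a group:
# it is the number of distinct col2 values in that group.
distinct_counts = {col1: asc + desc - 1 for col1, _, asc, desc in ranked}

print(ranked)
print(distinct_counts)
# {'a': 2, 'b': 3}
```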
Apply concatenation to get the unique count. In T-SQL the + operator only concatenates strings, so cast col2 first if it is numeric:
SELECT col1, count(distinct col1 + cast(col2 as varchar(20))) FROM table_name group by col1;
or
SELECT col1, count(distinct concat(col1,col2)) FROM table_name group by col1;

SQL - avoid select if there is pair

Is it possible to write an MS SQL query for this case? If there is a pair with 1 and -1, I don't want to select those entries at all.
COL1 COL2 NOTE
A 1 I don't want to select this entry because it is paired with A -1
A -1 I don't want to select this entry because it is paired with A 1
A 1 OK to select - no pair (no -1 for this A)
B 1 OK to select - no pair
C 1 OK to select - no pair
D 1 I don't want to select this entry because it is paired with D -1
D -1 I don't want to select this entry because it is paired with D 1
As I understand it, 1 and -1 are the only possible values for col2. If that is the case, and the number of 1s and -1s per col1 differs by at most one row, then you can just add the values up:
select col1, sum(col2)
from mytable
group by col1
having sum(col2) <> 0;
If the counts can differ by more than one row, or there are other values besides 1 and -1, then we must generate row numbers:
select col1, max(col2)
from
(
select
col1,
col2,
row_number() over (partition by col1, col2 order by col2) as rn
from mytable
) numbered
group by col1, rn
having count(*) = 1;
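
To see where the two queries above differ, here is a sketch that runs both on the question's rows plus a hypothetical group E (three 1s and one -1, not in the original data). It assumes SQLite 3.25 or later via Python's sqlite3 as a stand-in for MS SQL; the ORDER BY clauses are added to make the output deterministic.

```python
import sqlite3

# Assumption: in-memory SQLite (3.25+ for ROW_NUMBER) stands in for MS SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1 TEXT, col2 INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("A", 1), ("A", -1), ("A", 1), ("B", 1), ("C", 1), ("D", 1), ("D", -1),
     # Hypothetical extra group: two unpaired 1s remain after cancelling one pair.
     ("E", 1), ("E", 1), ("E", 1), ("E", -1)],
)

# Summing collapses each col1 to a single row, so E's two leftovers become one row with 2.
by_sum = conn.execute(
    "SELECT col1, SUM(col2) FROM mytable GROUP BY col1 HAVING SUM(col2) <> 0 ORDER BY col1"
).fetchall()

# Numbering the rows per (col1, col2) pairs the n-th 1 with the n-th -1;
# only row numbers that appear once per col1 are unpaired.
by_row_number = conn.execute(
    """
    SELECT col1, MAX(col2)
    FROM (
        SELECT col1, col2,
               ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY col2) AS rn
        FROM mytable
    ) numbered
    GROUP BY col1, rn
    HAVING COUNT(*) = 1
    ORDER BY col1
    """
).fetchall()

print(by_sum)         # [('A', 1), ('B', 1), ('C', 1), ('E', 2)]
print(by_row_number)  # [('A', 1), ('B', 1), ('C', 1), ('E', 1), ('E', 1)]
```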
One method is aggregation. Assuming there are only -1 and 1 and no duplicates with the same sign:
select col1, max(col2), col3
from t
group by col1, col3
having count(*) = 1;
Alternatively, you could use not exists:
select t.*
from t
where not exists (select 1
from t t2
where t2.col3 = t.col3 and t2.col1 = t.col1 and
t2.col2 = - t.col2
);
If, for any value of col1, the sum of its 1s and -1s is not 0, that value has an unpaired row.
Try this:
select *
from t
where col1 in
(select col1 from t group by col1 having sum(col2) <> 0);
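
A short sketch of what this last query returns on the question's sample data, assuming SQLite via Python's sqlite3 in place of MS SQL and ordering by SQLite's implicit rowid purely for readability. Note that it keeps every row of a col1 whose sum is non-zero, so for A the paired 1 and -1 come back along with the unpaired 1.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MS SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("A", 1), ("A", -1), ("A", 1), ("B", 1), ("C", 1), ("D", 1), ("D", -1)],
)

# Keep all rows whose col1 group does not cancel out to zero.
unbalanced_rows = conn.execute(
    """
    SELECT col1, col2
    FROM t
    WHERE col1 IN (SELECT col1 FROM t GROUP BY col1 HAVING SUM(col2) <> 0)
    ORDER BY rowid
    """
).fetchall()

print(unbalanced_rows)
# [('A', 1), ('A', -1), ('A', 1), ('B', 1), ('C', 1)]
```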

How do I SELECT two distinct columns?

I want to select the distinct (col1, col2) pairs, ordered by id.
I'm struggling to do this because when I write the following SQL query...
SELECT DISTINCT col1, col2
FROM table
ORDER BY id
I can't ORDER BY id because it's not in the SELECT list, but if I put id in the SELECT list the query takes the DISTINCT of id, col1 and col2, which is basically the whole table since the id column is unique.
How do I do this?
You can use aggregation, and put an aggregate function in the order by clause:
select col1, col2 from mytable group by col1, col2 order by min(id) limit 10
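
Here is a minimal sketch of the GROUP BY with ORDER BY MIN(id) idea, using hypothetical rows chosen so that first-appearance order differs from alphabetical order. It assumes SQLite via Python's sqlite3; the table contents are invented for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite; the rows below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, "z", "r"), (2, "x", "p"), (3, "z", "r"), (4, "y", "q"), (5, "x", "p")],
)

# GROUP BY removes the duplicate pairs; MIN(id) orders each pair
# by the id of its first occurrence.
distinct_pairs = conn.execute(
    """
    SELECT col1, col2
    FROM mytable
    GROUP BY col1, col2
    ORDER BY MIN(id)
    LIMIT 10
    """
).fetchall()

print(distinct_pairs)
# [('z', 'r'), ('x', 'p'), ('y', 'q')]
```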
This is one way to do it:
select A.col1, A.col2
from Tablet A
join
(select min(id) id2, col1, col2
from Tablet
GROUP BY COL1, COL2) B
on A.COL1 = B.COL1 AND A.COL2 = B.COL2
where A.id = B.id2
order by A.id
LIMIT 4;

Count(*) in SQL spanning multiple columns

I have a table similar to this
Can I get help writing up a query which will join col1, col2 & col3 and give me a result as below
I've spent an hour trying to figure it out with my mediocre skills and have got to some point.
select col1, count(*)
from tableName
group by col1
But I can't figure out how to join all three cols.
Try this one:
select
col,
count(*)
from
(select
id,
col1 as col
from
<table_name>
union all
select
id,
col2
from
<table_name>
union all
select
id,
col3
from
<table_name>) t
group by
col
You need to group by col of the union of the 3 columns:
select t.col, count(*)
from (
select col1 col from tablename
union all
select col2 from tablename
union all
select col3 from tablename
) t
group by t.col
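
The question's sample table did not survive, so the sketch below uses hypothetical rows to show how stacking the three columns with UNION ALL and then grouping counts each value across all of them. It assumes SQLite via Python's sqlite3. UNION ALL matters here: plain UNION would remove duplicates before counting.

```python
import sqlite3

# Assumption: in-memory SQLite; the rows are hypothetical, not the asker's data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?, ?)",
    [(1, "a", "b", "c"), (2, "b", "c", "a"), (3, "a", "a", "d")],
)

# Stack col1, col2 and col3 into one column, then count occurrences per value.
value_counts = conn.execute(
    """
    SELECT t.col, COUNT(*)
    FROM (
        SELECT col1 AS col FROM tablename
        UNION ALL
        SELECT col2 FROM tablename
        UNION ALL
        SELECT col3 FROM tablename
    ) t
    GROUP BY t.col
    ORDER BY t.col
    """
).fetchall()

print(value_counts)
# [('a', 4), ('b', 2), ('c', 2), ('d', 1)]
```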
You should use UNION ALL to stack the values from all three columns into one column. After that, you can count the values:
SELECT
col,
count(*) as cnt
FROM
(SELECT col1 as col FROM table1
UNION ALL
SELECT col2 as col FROM table1
UNION ALL
SELECT col3 as col FROM table1) as t
GROUP BY col

How to select non-distinct rows with a distinct on multiple columns

I have found many answers on selecting non-distinct rows where they group by a single column, for example, e-mail. However, there seems to have been an issue in our system where we are getting some duplicate data in which everything is the same except the identity column.
SELECT DISTINCT
COLUMN1,
COLUMN2,
COLUMN3,
...
COLUMN14
FROM TABLE1
How can I get the non-distinct rows from the query above? Ideally it would include the identity column as currently that is obviously missing from the distinct query.
select COLUMN1,COLUMN2,COLUMN3
from TABLE_NAME
group by COLUMN1,COLUMN2,COLUMN3
having COUNT(*) > 1
With _cte (col1, col2, col3, cnt) As
(
Select col1, col2, col3, Count(*)
From mySchema.myTable
Group By Col1, Col2, Col3
Having Count(*) > 1
)
Select t.*
From _Cte As c
Join mySchema.myTable As t
On c.col1 = t.col1
And c.col2 = t.col2
And c.col3 = t.col3
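
A compact sketch of the CTE-and-join approach above, showing that joining back to the base table recovers the identity column for every duplicated row. It assumes SQLite via Python's sqlite3, a hypothetical table `myTable` with an `id` identity column, and three data columns instead of fourteen; the CTE name `dup` and the `cnt` alias are illustrative.

```python
import sqlite3

# Assumption: in-memory SQLite; table, columns and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (id INTEGER PRIMARY KEY, col1 TEXT, col2 INTEGER, col3 TEXT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?, ?)",
    [(1, "x", 1, "p"), (2, "x", 1, "p"), (3, "y", 2, "q"),
     (4, "z", 3, "r"), (5, "z", 3, "r"), (6, "z", 3, "r")],
)

# The CTE finds column combinations that occur more than once;
# the join returns every full row (including id) for those combinations.
duplicated = conn.execute(
    """
    WITH dup AS (
        SELECT col1, col2, col3, COUNT(*) AS cnt
        FROM myTable
        GROUP BY col1, col2, col3
        HAVING COUNT(*) > 1
    )
    SELECT t.id, t.col1, t.col2, t.col3
    FROM dup AS c
    JOIN myTable AS t
      ON c.col1 = t.col1 AND c.col2 = t.col2 AND c.col3 = t.col3
    ORDER BY t.id
    """
).fetchall()

duplicated_ids = [row[0] for row in duplicated]
print(duplicated_ids)
# [1, 2, 4, 5, 6]
```

Equality joins like this do not match rows where a compared column is NULL, so nullable columns would need extra handling.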
SELECT * FROM
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY COL 1, COL 2, .... COL N ORDER BY COL M
) RN
FROM TABLE_NAME
)T
WHERE T.RN>1