Getting only unique values in PostgreSQL - sql

I need something like:
SELECT * FROM TABLE WHERE <value in column1 is always unique
(if ever any value will be noticed more than once, then skip this row)>
in PostgreSQL.
So if I have these rows in the table:
1;"something";"xoxox"
2;"other";"xoxox"
3;"something";"blablabla"
and then run the query, the result should be:
2;"other";"xoxox"
Any ideas?

Use count(*) as a window function:
select t.*
from (select t.*, count(*) over (partition by col1) as cnt
from t
) t
where cnt = 1;
Alternatively, you can use not exists and the id column:
select t.*
from t
where not exists (select 1 from t t2 where t2.col1 = t.col1 and t2.id <> t.id);
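A minimal sketch to check both queries against the three sample rows from the question. It uses Python's built-in sqlite3 module rather than PostgreSQL only so it runs anywhere (window functions need SQLite 3.25 or later); the table name t and the column names id, col1 and col2 are assumptions, since the question does not name them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: id is the key, col1 is the column that must be unique.
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "something", "xoxox"), (2, "other", "xoxox"), (3, "something", "blablabla")],
)

# Window-function variant: count rows per col1 value, keep groups of size 1.
window_rows = conn.execute("""
    SELECT id, col1, col2
    FROM (SELECT t.*, COUNT(*) OVER (PARTITION BY col1) AS cnt FROM t) AS sub
    WHERE cnt = 1
    ORDER BY id
""").fetchall()

# NOT EXISTS variant: keep a row only if no other row shares its col1 value.
not_exists_rows = conn.execute("""
    SELECT t.id, t.col1, t.col2
    FROM t
    WHERE NOT EXISTS (SELECT 1 FROM t AS t2 WHERE t2.col1 = t.col1 AND t2.id <> t.id)
    ORDER BY t.id
""").fetchall()

print(window_rows)
print(not_exists_rows)
```

Both queries keep only the row with id 2, because "something" occurs twice in col1.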

You can filter on the count without needing a subquery:
SELECT t.col1
FROM t
GROUP BY col1
HAVING COUNT(*) = 1
Other columns can be added by using an aggregation function like max, since there will only be 1 row per value:
SELECT t.col1, max(t.col2), max(t.col3)
FROM t
GROUP BY col1
HAVING COUNT(*) = 1
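As a rough check of the GROUP BY / HAVING variant, here is a sketch on the question's sample rows, run through Python's sqlite3 module. The names t, id, col1 and col2 are assumptions. Because every surviving group holds exactly one row, max() simply returns that row's value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema; the question does not give column names.
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "something", "xoxox"), (2, "other", "xoxox"), (3, "something", "blablabla")],
)

# Groups with more than one row are dropped by HAVING, so MAX() over the
# remaining single-row groups just passes the original values through.
grouped_rows = conn.execute("""
    SELECT col1, MAX(id) AS id, MAX(col2) AS col2
    FROM t
    GROUP BY col1
    HAVING COUNT(*) = 1
""").fetchall()

print(grouped_rows)
```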

Related

SQL with having statement now want complete rows

Here is a mock table
MYTABLE ROWS
PKEY 1,2,3,4,5,6
COL1 a,b,b,c,d,d
COL2 55,44,33,88,22,33
I want to know which rows have duplicated COL1 values:
select col1, count(*)
from MYTABLE
group by col1
having count(*) > 1
This returns :
b,2
d,2
I now want all the rows that contain b and d. Normally I would use a WHERE ... IN clause, but with the count column involved I am not sure what kind of statement to use.
Maybe you need:
select * from MYTABLE
where col1 in
(
select col1
from MYTABLE
group by col1
having count(*) > 1
)
Use a CTE and a windowed aggregate:
WITH CTE AS(
SELECT Pkey,
Col1,
Col2,
COUNT(1) OVER (PARTITION BY Col1) AS C
FROM dbo.YourTable)
SELECT PKey,
Col1,
Col2
FROM CTE
WHERE C > 1;
There are lots of ways to solve this; here's another:
select * from MYTABLE
join
(
select col1 ,count(*)
from MYTABLE
group by col1
having count(*) > 1
) s on s.col1 = mytable.col1;
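To see that the IN form and the join form return the same complete rows, here is a small sketch that loads the mock table from the question into an in-memory SQLite database via Python's sqlite3 module. SQLite is used only for convenience; the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (pkey INTEGER PRIMARY KEY, col1 TEXT, col2 INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, "a", 55), (2, "b", 44), (3, "b", 33), (4, "c", 88), (5, "d", 22), (6, "d", 33)],
)

# Variant 1: filter with IN on the duplicated col1 values.
in_rows = conn.execute("""
    SELECT pkey, col1, col2
    FROM mytable
    WHERE col1 IN (SELECT col1 FROM mytable GROUP BY col1 HAVING COUNT(*) > 1)
    ORDER BY pkey
""").fetchall()

# Variant 2: join back to the grouped subquery.
join_rows = conn.execute("""
    SELECT m.pkey, m.col1, m.col2
    FROM mytable AS m
    JOIN (SELECT col1, COUNT(*) AS n
          FROM mytable
          GROUP BY col1
          HAVING COUNT(*) > 1) AS s ON s.col1 = m.col1
    ORDER BY m.pkey
""").fetchall()

print(in_rows)
print(join_rows)
```

Both return the rows with PKEY 2, 3, 5 and 6.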

Select duplicated data from table

Query
select * from table1
where having count(reference)>1
I want to select all rows that have duplicate data. Any idea why my query is not working?
Below is my expected result.
You can use the window function count to find the number of rows per id and reference, and then keep those whose count is greater than 1.
;with cte as (
select t.*, count(*) over (partition by id, reference) cnt
from table1 t
)
select * from cte where cnt > 1;
In the above solution, I have assumed that name and id have a one-to-one correspondence (which is true for your given data). If that's not the case, add name to the partition by clause as well:
;with cte as (
select t.*, count(*) over (partition by name, id, reference) cnt
from table1 t
)
select * from cte where cnt > 1;
I might actually approach this by using a subquery with GROUP BY:
SELECT t1.*
FROM table1 t1
INNER JOIN
(
SELECT Name, ID, reference
FROM table1
GROUP BY Name, ID, reference
HAVING COUNT(*) > 1
) t2
ON t1.Name = t2.Name AND
t1.ID = t2.ID AND
t1.reference = t2.reference
Try this: first get the count per partition, then keep the rows with count > 1.
select No, Name, ID, Reference
from (select count(*) over (partition by name, ID, reference) cnt, table1.* from table1) t
where cnt>1
The easy way (although maybe not the best for performance) would be:
select * from table1 where reference in (
select reference from table1 group by reference having count(*)>1
)
The subselect finds the duplicated references, and the outer select returns all the data for those references.
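The answers in this thread differ in what counts as a duplicate: the reference alone, or the combination of name, id and reference. A sketch on made-up rows shows the difference. The question's data is not available here, so the four rows below and the column name row_no are purely hypothetical; the queries run through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (row_no INTEGER, name TEXT, id INTEGER, reference TEXT)")
# Hypothetical data: rows 1 and 2 are identical in name, id and reference;
# row 3 shares only the reference with them.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [(1, "Ann", 10, "R1"), (2, "Ann", 10, "R1"), (3, "Bob", 20, "R1"), (4, "Cid", 30, "R2")],
)

# Duplicate means "same reference".
by_reference = [r[0] for r in conn.execute("""
    SELECT row_no FROM table1
    WHERE reference IN (SELECT reference FROM table1 GROUP BY reference HAVING COUNT(*) > 1)
    ORDER BY row_no
""")]

# Duplicate means "same name, id and reference".
by_all_three = [r[0] for r in conn.execute("""
    SELECT t1.row_no
    FROM table1 AS t1
    JOIN (SELECT name, id, reference
          FROM table1
          GROUP BY name, id, reference
          HAVING COUNT(*) > 1) AS t2
      ON t1.name = t2.name AND t1.id = t2.id AND t1.reference = t2.reference
    ORDER BY t1.row_no
""")]

print(by_reference)
print(by_all_three)
```

Which definition is right depends on the data, so it is worth deciding that before picking one of the queries.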

How to get duplicate text values from SQL query

I need to get only the rows with duplicate text values using a SQL query. I have used HAVING COUNT(columnname) > 1, but instead of only the duplicates I am getting all values.
Can anyone suggest whether I have to add anything to my query?
Thanks.
Use the query below. Put the column that contains the duplicates in the PARTITION BY clause:
with CTE_1
AS
-- No ORDER BY inside OVER: it would turn COUNT into a running count and drop the first row of each group
(SELECT *,COUNT(1) OVER(PARTITION BY LTRIM(RTRIM(REPLACE(yourDuplicateColumn,' ','')))) cnt
FROM YourTable
)
SELECT *
FROM CTE_1
WHERE cnt>1
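One detail worth checking with a windowed COUNT is whether the OVER clause contains an ORDER BY. With the default window frame, an ORDER BY makes the count a running count within the partition, so the first row of each duplicate group gets 1 and is filtered out. The sketch below illustrates this on hypothetical data (table your_table, columns id and txt) using Python's sqlite3 module; the default frame is the standard one, which SQL Server and PostgreSQL share as far as I know.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id INTEGER PRIMARY KEY, txt TEXT)")
# Hypothetical values: ids 1-3 are equal once spaces are removed.
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [(1, "foo bar"), (2, "foobar"), (3, " foo bar "), (4, "baz")],
)

def ids_with_cnt_above_one(over_clause):
    """Return the ids whose windowed count exceeds 1 for the given OVER clause."""
    sql = f"""
        SELECT id
        FROM (SELECT id, COUNT(*) OVER ({over_clause}) AS cnt FROM your_table) AS sub
        WHERE cnt > 1
        ORDER BY id
    """
    return [row[0] for row in conn.execute(sql)]

# Count over the whole partition: every member of a duplicate group is kept.
whole_partition = ids_with_cnt_above_one("PARTITION BY REPLACE(txt, ' ', '')")
# Running count: the first row of each group has cnt = 1 and is lost.
running = ids_with_cnt_above_one("PARTITION BY REPLACE(txt, ' ', '') ORDER BY id")

print(whole_partition)
print(running)
```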
Assuming id is a primary key:
select *
from myTable t1
where exists (select 1
from myTable t2
where t2.text = t1.text and t2.id != t1.id)
You can use a query similar to the following:
SELECT
column1, COUNT(*)
FROM table
GROUP BY column1
HAVING COUNT(*) > 1

SQL how to select a group of records based on some statistics of this group?

For example, I have a record set with three columns:
id,week,count
1,1,10;
1,2,20;
1,3,30;
2,1,3;
2,2,2;
2,3,15;
What I want is just the data of IDs whose average count is > 10. Then, in this example data, the data of id=1 will be selected.
Thanks.
SELECT id FROM YourTable GROUP BY id HAVING AVG(count) > 10
SELECT *
FROM YourTable
WHERE id IN (SELECT id FROM YourTable GROUP BY id HAVING AVG(count) > 10)
Or, if you are using an Access database (where IN happens to have horrendous performance for whatever reason), you can use:
SELECT t2.*
FROM (SELECT id FROM YourTable GROUP BY id HAVING AVG(count) > 10) AS t1
INNER JOIN YourTable AS t2 ON t1.id = t2.id
In most databases, you can also do this with window functions:
select t.*
from (select t.*, avg(count) over (partition by id) as avgcount
from t
) t
where avgcount > 10
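Here is a sketch that runs the IN form and the window-function form on the six sample rows from the question, using Python's sqlite3 module. The table name your_table is an assumption, and the count column is quoted because it shares its name with the aggregate function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE your_table (id INTEGER, week INTEGER, "count" INTEGER)')
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [(1, 1, 10), (1, 2, 20), (1, 3, 30), (2, 1, 3), (2, 2, 2), (2, 3, 15)],
)

# IN form: first find the ids whose average exceeds 10, then fetch their rows.
in_rows = conn.execute("""
    SELECT id, week, "count"
    FROM your_table
    WHERE id IN (SELECT id FROM your_table GROUP BY id HAVING AVG("count") > 10)
    ORDER BY id, week
""").fetchall()

# Window form: attach the per-id average to every row, then filter on it.
window_rows = conn.execute("""
    SELECT id, week, "count"
    FROM (SELECT t.*, AVG("count") OVER (PARTITION BY id) AS avgcount
          FROM your_table AS t) AS sub
    WHERE avgcount > 10
    ORDER BY id, week
""").fetchall()

print(in_rows)
print(window_rows)
```

The average is 20 for id 1 and about 6.67 for id 2, so only the three rows of id 1 come back.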

Multiple rows match, but I only want one?

Sometimes I wish to perform a join whereby I take the largest value of one column. To do this I have to use max() and a GROUP BY, which prevents me from retrieving the other columns from the row that held the max (because they are not contained in the GROUP BY or in an aggregate function).
To fix this, I join the max value back to the original data source to get the other columns. However, my problem is that this sometimes returns more than one row.
So, so far I have something like:
SELECT * FROM
(SELECT Col1, Max(Col2) FROM Table GROUP BY Col1) tab1
JOIN
(SELECT Col1, Col2 FROM Table) tab2
ON tab1.Col2 = tab2.Col2
If the above query now returns three rows (which all match the largest value for Col2), I have a bit of a headache.
If there were an extra column, Col3, and of the rows returned by the above query I only wanted the one with, say, the minimum Col3 value, how would I do this?
If you are using SQL Server 2005+, you can do it like this:
CTE way
;WITH CTE
AS
(
SELECT
ROW_NUMBER() OVER(PARTITION BY Col1 ORDER BY Col2 DESC) AS RowNbr,
table.*
FROM
table
)
SELECT
*
FROM
CTE
WHERE
CTE.RowNbr=1
Subquery way
SELECT
*
FROM
(
SELECT
ROW_NUMBER() OVER(PARTITION BY Col1 ORDER BY Col2 DESC) AS RowNbr,
table.*
FROM
table
) AS T
WHERE
T.RowNbr=1
As I understand it, it could be something like this:
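SELECT tab2.* FROM
(SELECT Col1, Max(Col2) AS Col2 FROM Table GROUP BY Col1) tab1
JOIN
(SELECT Col1, Col2, Col3 FROM Table) tab2
ON tab1.Col1 = tab2.Col1 and tab1.Col2 = tab2.Col2
-- keep only the row with the smallest Col3 among the rows tied on the maximum Col2
and tab2.Col3 = (select min(t3.Col3) from Table t3
where t3.Col1 = tab2.Col1 and t3.Col2 = tab2.Col2)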
Assuming you are using SQL Server 2005 or later, you can make use of window functions here. I have chosen ROW_NUMBER(), but it is not the only option.
;WITH T AS
( SELECT *,
ROW_NUMBER() OVER(PARTITION BY Col1 ORDER BY Col2 DESC) [RowNumber]
FROM Table
)
SELECT *
FROM T
WHERE RowNumber = 1
The PARTITION BY within the OVER clause is equivalent to the GROUP BY in your subquery, and the ORDER BY determines the order in which the rows are numbered. Here, Col2 DESC starts the numbering at the highest value of Col2 (equivalent to your MAX).
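The question also asks how to pick the row with the smallest Col3 when several rows tie on the largest Col2. One way, sketched here as an assumption rather than as part of the answers above, is to add Col3 as a second sort key to the ROW_NUMBER() ordering. The table name items and its rows are hypothetical, and the query runs through Python's sqlite3 module (SQLite 3.25 or later for window functions).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (col1 TEXT, col2 INTEGER, col3 INTEGER)")
# Hypothetical data: group 'a' has two rows tied on the maximum col2 = 5.
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [("a", 5, 30), ("a", 5, 10), ("a", 3, 1), ("b", 7, 2)],
)

# col2 DESC picks the largest col2; col3 ASC breaks ties by the smallest col3.
rows = conn.execute("""
    SELECT col1, col2, col3
    FROM (
        SELECT items.*,
               ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col2 DESC, col3 ASC) AS rn
        FROM items
    ) AS ranked
    WHERE rn = 1
    ORDER BY col1
""").fetchall()

print(rows)
```

If two rows also tie on Col3, ROW_NUMBER() still returns just one of them, but which one is not defined unless a further sort key is added.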