How to get duplicate text values from SQL query - sql

I need to return only the rows with duplicate text values using a SQL query. I used HAVING COUNT(columnname) > 1, but instead of getting only the duplicated values I get all values.
Can anyone suggest whether I have to add anything to my query?
Thanks.

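Use the query below, naming the duplicated column in the PARTITION BY clause. Do not add an ORDER BY to the window: with one, COUNT becomes a running count, and the first row of each duplicate group can get cnt = 1 and be filtered out.
WITH CTE_1
AS
(SELECT *, COUNT(1) OVER (PARTITION BY REPLACE(yourDuplicateColumn, ' ', '')) cnt -- spaces are stripped before comparing
FROM YourTable
)
SELECT *
FROM CTE_1
WHERE cnt > 1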

Assuming id is a primary key
select *
from myTable t1
where exists (select 1
from myTable t2
where t2.text = t1.text and t2.id != t1.id)

You can use a query similar to the following:
SELECT
column1, COUNT(*)
FROM table
GROUP BY column1
HAVING COUNT(*) > 1
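To see the difference between listing the duplicated values and returning the full duplicate rows, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database. The table name items and its sample rows are made up for illustration; only the GROUP BY ... HAVING pattern comes from the answers above.

```python
import sqlite3

# In-memory SQLite database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, text TEXT)")
conn.executemany(
    "INSERT INTO items (id, text) VALUES (?, ?)",
    [(1, "apple"), (2, "pear"), (3, "apple"), (4, "plum"), (5, "pear")],
)

# One row per duplicated value, with the number of occurrences.
dup_values = conn.execute(
    "SELECT text, COUNT(*) FROM items "
    "GROUP BY text HAVING COUNT(*) > 1 ORDER BY text"
).fetchall()

# Every full row whose value is duplicated: feed the grouped query into IN.
dup_rows = conn.execute(
    "SELECT id, text FROM items WHERE text IN ("
    "  SELECT text FROM items GROUP BY text HAVING COUNT(*) > 1"
    ") ORDER BY id"
).fetchall()

print(dup_values)
print(dup_rows)
```

HAVING only filters groups, so it needs a GROUP BY to work against; without one, the whole table is treated as a single group.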

Related

SQL - Removing Row Groups

I have a table with the following information:
Is there a way to remove all groups which have multiple IDs? For example group 3 would be removed because it consists of ID 1 and 2.
Thank you!
A simple, portable and efficient approach is not exists:
select t.*
from mytable t
where not exists (
select 1
from mytable t1
where t1.group = t.group and t1.id <> t.id
)
For performance, consider an index on (group, id).
Side note: group is a SQL keyword (as in group by), hence not a good choice for a column name.
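You can use the query below to remove all groups having multiple IDs:
DELETE FROM <your_table_name> WHERE "Group" IN (SELECT "Group" FROM <your_table_name> GROUP BY "Group" HAVING COUNT(DISTINCT ID) > 1)
The inner query returns every Group that has more than one distinct ID. Group is quoted because it is a reserved word; adjust the quoting to your database.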
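-- preview the rows whose group has more than one distinct id
select * from temp where "group" in (
select "group" from temp group by "group" having count(distinct id) > 1)
-- then delete them
delete from temp where "group" in (
select "group" from temp group by "group" having count(distinct id) > 1)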
Try the query below, which selects the rows whose group has exactly one distinct id:
select id,group from table where group in
(
select group from(
select group,count(distinct id) as cn from table group by 1 having cn=1) a
)
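As a rough check of the approaches in this thread, the sketch below runs the NOT EXISTS filter and a COUNT(DISTINCT id) delete against a small in-memory SQLite table. The column is named grp here (an assumption, to sidestep the reserved word), and the sample rows are invented; group 4 holds the same id twice, to show that repeated rows alone do not make a group "multi-ID".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (grp INTEGER, id INTEGER)")
conn.executemany(
    "INSERT INTO mytable (grp, id) VALUES (?, ?)",
    [(1, 1), (2, 2), (3, 1), (3, 2), (4, 5), (4, 5)],
)

# Rows whose group contains no other id (the NOT EXISTS approach).
kept = conn.execute(
    "SELECT grp, id FROM mytable t "
    "WHERE NOT EXISTS ("
    "  SELECT 1 FROM mytable t1 WHERE t1.grp = t.grp AND t1.id <> t.id"
    ") ORDER BY grp, id"
).fetchall()

# Physically remove the groups that have more than one distinct id.
conn.execute(
    "DELETE FROM mytable WHERE grp IN ("
    "  SELECT grp FROM mytable GROUP BY grp HAVING COUNT(DISTINCT id) > 1"
    ")"
)
remaining = conn.execute(
    "SELECT grp, id FROM mytable ORDER BY grp, id"
).fetchall()

print(kept)
print(remaining)
```

Both routes leave groups 1, 2 and 4 and drop group 3, the only group with two different ids.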

Select duplicated data from table

Query
select * from table1
where having count(reference)>1
I want to select all the rows that have duplicate data. Any idea why my query is not working?
Below is my expected result:
You can use the window function count to find the number of rows per id and reference, and then filter to keep those with a count greater than 1.
;with cte as (
select t.*, count(*) over (partition by id, reference) cnt
from table1 t
)
select * from cte where cnt > 1;
In the above solution, I have assumed that name and id have a one-to-one correspondence (which is true for your given data). If that's not the case, add name to the partition by clause as well:
;with cte as (
select t.*, count(*) over (partition by name, id, reference) cnt
from table1 t
)
select * from cte where cnt > 1;
I might actually approach this by using a subquery with GROUP BY:
SELECT t1.*
FROM table1 t1
INNER JOIN
(
SELECT Name, ID, reference
FROM table1
GROUP BY Name, ID, reference
HAVING COUNT(*) > 1
) t2
ON t1.Name = t2.Name AND
t1.ID = t2.ID AND
t1.reference = t2.reference
Try this: first I get the count per partition, then I keep the rows with count > 1. The derived table needs an alias in most databases.
select No, Name, ID, Reference
from (select count(*) over (partition by name, ID, reference) cnt, table1.* from table1) t
where cnt>1
The easy way (although maybe not the best for performance) would be:
select * from table1 where reference in (
select reference from table1 group by reference having count(*)>1
)
The subselect returns the duplicated references, and the outer select returns all the data for those references.

PostgreSQL how to delete duplicated values

I have a table in my Postgres database where I forgot to add a unique index. Because of that missing index, I now have duplicated values. How can I remove the duplicated values? I want to add a unique index on the fields translationset_Id and key.
I think you are asking for this:
DELETE FROM tablename
WHERE id IN (SELECT id
FROM (SELECT id,
ROW_NUMBER() OVER (partition BY column1, column2, column3 ORDER BY id) AS rnum
FROM tablename) t
WHERE t.rnum > 1);
It appears that you want to delete records which are duplicates with regard to the translationset_id and key columns. In this case, we can use Postgres' ROW_NUMBER() window function to tell duplicate rows apart, and then delete every copy after the first. The rows are identified by ctid, Postgres' physical row identifier, so that only the extra copies are removed rather than every row sharing a translationset_id.
WITH cte AS
(
SELECT ctid, ROW_NUMBER() OVER (PARTITION BY translationset_id, key) AS rnum
FROM yourTable
)
DELETE FROM yourTable
WHERE ctid IN (SELECT ctid FROM cte WHERE rnum > 1)
I think the most efficient way to do this is below.
DELETE FROM
table_name a
USING table_name b
WHERE
a.id < b.id and
a.same_column = b.same_column;
delete from mytable
where exists (select 1
from mytable t2
where t2.name = mytable.name and
t2.address = mytable.address and
t2.zip = mytable.zip and
t2.ctid > mytable.ctid
);
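The overall workflow (delete the extra copies, then add the unique index) can be sketched as follows. This uses Python's built-in sqlite3 module rather than Postgres, so it is only an approximation: the table name translations, the id primary key, and the sample rows are all assumptions, and id plays the role that ctid plays in the Postgres answers. The lowest id of each (translationset_id, key) pair is kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE translations ('
    ' id INTEGER PRIMARY KEY, translationset_id INTEGER, "key" TEXT)'
)
conn.executemany(
    "INSERT INTO translations VALUES (?, ?, ?)",
    [(1, 10, "title"), (2, 10, "title"), (3, 10, "body"),
     (4, 20, "title"), (5, 10, "title")],
)

# Delete every row for which an older row (smaller id) has the same pair.
conn.execute("""
    DELETE FROM translations
    WHERE EXISTS (
        SELECT 1 FROM translations t2
        WHERE t2.translationset_id = translations.translationset_id
          AND t2."key" = translations."key"
          AND t2.id < translations.id
    )
""")

remaining = conn.execute(
    'SELECT id, translationset_id, "key" FROM translations ORDER BY id'
).fetchall()

# With the duplicates gone, the unique index can be created.
conn.execute(
    'CREATE UNIQUE INDEX translations_set_key '
    'ON translations (translationset_id, "key")'
)

# A new duplicate is now rejected by the index.
try:
    conn.execute("INSERT INTO translations VALUES (6, 10, 'title')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(remaining, rejected)
```

On a real table it is probably worth running the delete and the index creation in one transaction, so no new duplicates can slip in between the two steps.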

MS SQL Server sum of sum fields

I have a sql statement that will give me two columns from two tables using sub query.
select
sum(field1) as f1_sum,
(select sum(field2) from table2) as f2_sum
from
table1
group by
table1.field_x
I want to get the total of f1_sum + f2_sum as a third output column from this query. It seems simple, but I can't find a way to do it. The question is how to get the sum of the two sum fields.
I am ok to write SP or a view to do this etc..
Can someone assist please ?
you can use subquery like:
SELECT t1.f1_sum+t1.f2_sum AS total_sum FROM
(select sum(field1) as f1_sum , (select sum(field2) from table2) as f2_sum
from table1
group by table1.field_x) AS t1
I would suggest doing it like this:
select t1.f1_sum, t2.f2_sum, coalesce(t1.f1_sum, 0) + coalesce(t2.f2_sum, 0)
from (select sum(field1) as f1_sum
from table1 t1
group by t1.field_x
) t1 cross join
(select sum(field2) as f2_sum from table2) t2;
When possible, I prefer to keep table references in the from clause. I added the coalesce() just in case any of the values could be NULL.
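You could also try this, aggregating each table in its own subquery first (joining the raw tables before summing would multiply each sum by the other table's row count):
SELECT a.f1_sum,
b.f2_sum,
(a.f1_sum + b.f2_sum) f3_sum
from (select sum(field1) f1_sum from table1 group by field_x) a,
(select sum(field2) f2_sum from table2) b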
You can simply write:
select sum(field1) as f1_sum
, (select sum(field2) from table2) as f2_sum
, (ISNULL(sum(field1),0) + ISNULL((select sum(field2) from table2),0)) AS Total_Sum
from table1
group by table1.field_x
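To make the arithmetic concrete, here is a small sketch with invented data, run through Python's built-in sqlite3 module (an assumption; the question is about SQL Server, but the aggregation logic is the same). It computes the per-group total by aggregating each table before combining them, and also shows what happens if the raw tables are joined before summing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (field_x TEXT, field1 INTEGER)")
conn.execute("CREATE TABLE table2 (field2 INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)", [("a", 1), ("a", 2), ("b", 10)]
)
conn.executemany("INSERT INTO table2 VALUES (?)", [(100,), (200,)])

# Aggregate each table on its own, then combine the results.
correct = conn.execute("""
    SELECT t1.f1_sum, t2.f2_sum,
           COALESCE(t1.f1_sum, 0) + COALESCE(t2.f2_sum, 0) AS total_sum
    FROM (SELECT field_x, SUM(field1) AS f1_sum
          FROM table1 GROUP BY field_x) t1
    CROSS JOIN (SELECT SUM(field2) AS f2_sum FROM table2) t2
    ORDER BY t1.field_x
""").fetchall()

# Joining the raw tables first repeats every row of one table once per row
# of the other, so both sums come out inflated.
inflated = conn.execute(
    "SELECT SUM(a.field1), SUM(b.field2) FROM table1 a, table2 b"
).fetchone()

print(correct)   # one row per field_x group
print(inflated)  # sums over the 3 x 2 cross product
```

With this data the true sums are 13 for field1 and 300 for field2, while the raw cross join reports 26 and 900.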

reuse table alias in another select

I have a sql statement:
select id from table1 t1, table t2
where.....
order by ( select count(owner_id) from t2) ASC;
What I want to do here is to select the id of the item whose owner has least number of items.
Is this possible? If not, what can I do to achieve the goal?
Thanks in advance!
You don't mention which SQL dialect you're using, but you can do this, or something similar, in PL/SQL (and MySQL, I believe). I'm assuming you're linking table 1 and 2 on id. I haven't ordered by count(owner_id) alone, as that would always be the same value. Partition by whatever gives you the count you're after.
select id
from ( select t1.id, t2.ct
from table1 t1
, ( select id, count(owner_id) over ( partition by id ) as ct
from table2 ) t2
where t1.id = t2.id ) t
order by ct ASC
;
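Since the question leaves the schema open, here is one possible reading as a sketch: a single hypothetical items table with an owner_id column, queried through Python's built-in sqlite3 module. It swaps the window function for a plain GROUP BY subquery that counts items per owner, then orders the items by that count, so the items of the owner with the fewest items come first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, owner_id INTEGER)")
conn.executemany(
    "INSERT INTO items (id, owner_id) VALUES (?, ?)",
    [(1, 7), (2, 7), (3, 8), (4, 9), (5, 9), (6, 9)],
)

# Count items per owner once, join the counts back, and sort by them.
ids = [
    row[0]
    for row in conn.execute("""
        SELECT i.id
        FROM items i
        JOIN (SELECT owner_id, COUNT(*) AS ct
              FROM items
              GROUP BY owner_id) c ON c.owner_id = i.owner_id
        ORDER BY c.ct ASC, i.id ASC
    """)
]

print(ids)
```

The ORDER BY sits on the outermost query, which is the only place where the sort order of the result is guaranteed.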