T-SQL Select full rows with same column value

I want to show all rows whose serial also exists in another row.
This query works:
SELECT
[Serial]
FROM [x].[dbo].[Devices]
GROUP BY Serial
HAVING COUNT(*) > 1
But when I add more columns to the select list:
SELECT [ID]
,[UUID]
,[Serial]
FROM [x].[dbo].[Devices]
GROUP BY Serial
HAVING COUNT(*) > 1
I get:
'ID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Why can't I select more columns?
How am I supposed to show the full rows?

You can do this with a window function:
select *
from (
select *,
count(*) over (partition by [Serial]) as serial_count
from [x].[dbo].[Devices]
) t
where serial_count > 1;
This is typically faster than joining to a sub-select with an aggregate.
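A minimal, self-contained sketch of the window-function approach, run from Python against an in-memory SQLite database as a stand-in for SQL Server (SQLite 3.25 or later is assumed for window-function support). The table contents and variable names are made up for illustration.

```python
import sqlite3

# In-memory SQLite table standing in for [x].[dbo].[Devices] (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Devices (ID INTEGER PRIMARY KEY, UUID TEXT, Serial TEXT)")
conn.executemany(
    "INSERT INTO Devices (ID, UUID, Serial) VALUES (?, ?, ?)",
    [(1, "u-1", "A100"), (2, "u-2", "B200"), (3, "u-3", "A100"), (4, "u-4", "C300")],
)

# COUNT(*) OVER (PARTITION BY Serial) attaches the per-serial row count to
# every row without collapsing the rows, so full rows can be filtered on it.
duplicate_rows = conn.execute(
    """
    SELECT ID, UUID, Serial
    FROM (
        SELECT ID, UUID, Serial,
               COUNT(*) OVER (PARTITION BY Serial) AS serial_count
        FROM Devices
    ) AS t
    WHERE serial_count > 1
    ORDER BY ID
    """
).fetchall()

print(duplicate_rows)  # [(1, 'u-1', 'A100'), (3, 'u-3', 'A100')]
```

Only the two rows sharing serial A100 come back, each with its own ID and UUID.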

This query shows every row whose serial also exists in another row:
SELECT D1.[ID]
,D1.[UUID]
,D1.[Serial]
FROM [x].[dbo].[Devices] D1
JOIN ( SELECT [Serial]
FROM [x].[dbo].[Devices]
GROUP BY Serial
HAVING COUNT(*) > 1 ) D2 ON D1.[Serial] = D2.[Serial]

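Grouping by all selected columns removes the error. Note, however, that this only finds rows that are duplicated across all three columns: if ID is unique per row, every group contains a single row and HAVING COUNT(*) > 1 returns nothing.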
SELECT [ID]
,[UUID]
,[Serial]
FROM [x].[dbo].[Devices]
GROUP BY Serial
,[ID]
,[UUID]
HAVING COUNT(*) > 1
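One caveat worth checking with the group-by-all-columns approach: when one of the grouped columns is unique per row, every group has exactly one row. The sketch below illustrates this against an in-memory SQLite database from Python; the data is hypothetical and ID is assumed to be a primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Devices (ID INTEGER PRIMARY KEY, UUID TEXT, Serial TEXT)")
conn.executemany(
    "INSERT INTO Devices (ID, UUID, Serial) VALUES (?, ?, ?)",
    [(1, "u-1", "A100"), (2, "u-2", "A100"), (3, "u-3", "B200")],
)

# ID is unique, so each (Serial, ID, UUID) group holds exactly one row
# and the HAVING filter rejects all of them.
grouped_by_all = conn.execute(
    "SELECT ID, UUID, Serial FROM Devices "
    "GROUP BY Serial, ID, UUID HAVING COUNT(*) > 1"
).fetchall()

# Grouping by Serial alone still detects the repeated serial.
grouped_by_serial = conn.execute(
    "SELECT Serial FROM Devices GROUP BY Serial HAVING COUNT(*) > 1"
).fetchall()

print(grouped_by_all)     # []
print(grouped_by_serial)  # [('A100',)]
```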

Related

How to find all matching rows in 2 columns in SQL?

My table has 2 columns containing code pairs (Parentcodes and Childcodes). The pairings are unique, but each code can be, and often is, repeated within its column. I'm trying to pull a list of each instance of each code and all of the associated values from the other column.
So basically
Select ParentCode, Childcode
from TABLE
where count(ParentCode)>1
(and vice versa)
It seems like I have to include both columns in the group by if I want them both in the select. I've tried subqueries but with no luck. I know I can set up a script in VBA to loop through each code and return the results (running a basic select where count > 1), but that seems like the least efficient approach.
Sample data:

To get the rows whose Parentcode or Childcode is repeated more than once, you can use IN:
select Parentcode, Childcode
from Table
where Parentcode in (
select Parentcode
from Table
group by Parentcode
having count(Parentcode) > 1
)
or Childcode in (
select Childcode
from Table
group by Childcode
having count(Childcode) > 1
)

You should be just about there with that.
select ParentCode, count(ParentCode) count
from TABLE
group by ParentCode
having count(Parentcode)>1

You can use EXISTS:
select t.* from tablename t
where
exists (select 1 from tablename where parentcode <> t.parentcode and childcode = t.childcode)
or
exists (select 1 from tablename where parentcode = t.parentcode and childcode <> t.childcode)
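As a rough check, the IN and EXISTS answers above can be run side by side. The sketch below uses Python with an in-memory SQLite database; the table name code_pairs and its rows are hypothetical, and the pairs are assumed to be unique as the question states.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE code_pairs (ParentCode TEXT, ChildCode TEXT)")
conn.executemany(
    "INSERT INTO code_pairs VALUES (?, ?)",
    [("P1", "C1"), ("P1", "C2"), ("P2", "C3"), ("P3", "C3"), ("P4", "C4")],
)

# IN: keep rows whose parent code or child code occurs more than once.
in_rows = conn.execute(
    """
    SELECT ParentCode, ChildCode
    FROM code_pairs
    WHERE ParentCode IN (SELECT ParentCode FROM code_pairs
                         GROUP BY ParentCode HAVING COUNT(*) > 1)
       OR ChildCode IN (SELECT ChildCode FROM code_pairs
                        GROUP BY ChildCode HAVING COUNT(*) > 1)
    ORDER BY ParentCode, ChildCode
    """
).fetchall()

# EXISTS: keep rows that share one code with another row but differ in the other.
exists_rows = conn.execute(
    """
    SELECT t.ParentCode, t.ChildCode
    FROM code_pairs AS t
    WHERE EXISTS (SELECT 1 FROM code_pairs AS o
                  WHERE o.ParentCode <> t.ParentCode AND o.ChildCode = t.ChildCode)
       OR EXISTS (SELECT 1 FROM code_pairs AS o
                  WHERE o.ParentCode = t.ParentCode AND o.ChildCode <> t.ChildCode)
    ORDER BY t.ParentCode, t.ChildCode
    """
).fetchall()

print(in_rows)  # [('P1', 'C1'), ('P1', 'C2'), ('P2', 'C3'), ('P3', 'C3')]
```

With unique pairs, both queries return the same rows; the pair (P4, C4) is left out because neither of its codes repeats.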

How to delete the duplicate data in table (Postgres)

I want to delete the duplicated data in a table. I know I can use
SELECT
fruit,
COUNT( fruit )
FROM
basket
GROUP BY
fruit
HAVING
COUNT( fruit )> 1
ORDER BY
fruit;
to find them, but I need every column's value to be equal, which means tableA.* = tableA.* (except id, which is the auto-increment primary key).
I tried this:
SELECT
*,
COUNT( * )
FROM
myTable
GROUP BY
*
HAVING
COUNT( * )> 1
ORDER BY
id;
but it says I can't use GROUP BY *. How can I find and delete the duplicated rows (every column's value must be equal except id)?

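Use DISTINCT, which removes duplicate rows from the result. List every column except id, because id is unique per row:
SELECT DISTINCT col1, col2... FROM myTable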

Try something similar to the query below. Apply PARTITION BY to the columns you want to check for duplicates, that is, every column other than Id (which is a unique, incrementing value).
Also refer to Row_Number in Postgres & Common Table expression in Postgres
WITH DuplicateTableRows AS
(
SELECT Id, Row_Number() OVER (PARTITION BY col1, col2... ORDER BY Id) AS row_num
FROM
Table1
)
DELETE FROM Table1
WHERE Id IN (SELECT Id FROM DuplicateTableRows WHERE row_num > 1)
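A small sketch of the ROW_NUMBER-based delete, run from Python against an in-memory SQLite database rather than Postgres (SQLite 3.25 or later is assumed). The basket table comes from the question; the color column and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE basket (id INTEGER PRIMARY KEY, fruit TEXT, color TEXT)")
conn.executemany(
    "INSERT INTO basket (fruit, color) VALUES (?, ?)",
    [("apple", "red"), ("apple", "red"), ("apple", "green"),
     ("pear", "green"), ("pear", "green")],
)

# Number the rows inside each group of identical non-id columns, lowest id
# first, then delete every row that is not the first of its group.
conn.execute(
    """
    DELETE FROM basket
    WHERE id IN (
        SELECT id
        FROM (
            SELECT id,
                   ROW_NUMBER() OVER (PARTITION BY fruit, color ORDER BY id) AS row_num
            FROM basket
        ) AS numbered
        WHERE row_num > 1
    )
    """
)

remaining = conn.execute("SELECT id, fruit, color FROM basket ORDER BY id").fetchall()
print(remaining)  # [(1, 'apple', 'red'), (3, 'apple', 'green'), (4, 'pear', 'green')]
```

Ordering by id inside the window keeps the oldest copy of each duplicate group; ordering by id DESC would keep the newest instead.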

You can do this using JSON:
select (to_jsonb(b) - 'id')
from basket b
group by 1
having count(*) > 1;
The result is returned as JSON. Unfortunately, to extract the values back into a record, you need to list the columns individually.

DB2 - how to find count multiple occurrences of column value

I'm new to DB2 and have tried a few things based on similar posts. I have a table where I need to find the count of IDs where status = P and (primary = 1) occurs more than once.
So my result should be 2 here: (9876, 3456).
Tried:
SELECT id, COUNT(isprimary) Counts
FROM table
GROUP BY id
HAVING COUNT(isprimary)=1;

Try the query below:
select ID as IDs,Count(isPrimary) as isPrimary
From Table
where Status = 'P'
Group by ID
Having Count(isPrimary) >1

You are close, I think all you need to do is to add a where clause like:
SELECT id, COUNT(*) as Counted
FROM table
WHERE PrimaryFlag = 1
AND status = 'P'
GROUP BY id
EDIT: if you need to count only the distinct IDs, then try:
SELECT COUNT(t.ID) FROM
(
SELECT id, COUNT(*) as Counted
FROM table
WHERE PrimaryFlag = 1
AND status = 'P'
GROUP BY id
) as t
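The question's sample table is not shown, so the sketch below invents one (table name accounts, columns id, status, isprimary, and all rows are hypothetical) chosen so that IDs 9876 and 3456 qualify. It combines the WHERE filter with HAVING COUNT(*) > 1 and then counts the qualifying IDs, using Python with an in-memory SQLite database as a stand-in for DB2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER, status TEXT, isprimary INTEGER)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [
        (9876, "P", 1), (9876, "P", 1),  # two primary rows with status P
        (3456, "P", 1), (3456, "P", 1),  # two primary rows with status P
        (1234, "P", 1), (1234, "P", 0),  # only one primary row
        (5678, "A", 1), (5678, "A", 1),  # wrong status
    ],
)

# IDs that have more than one row with status 'P' and isprimary = 1.
repeated_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM accounts "
        "WHERE status = 'P' AND isprimary = 1 "
        "GROUP BY id HAVING COUNT(*) > 1 ORDER BY id"
    )
]

# Wrap the grouped query to count how many such IDs exist.
id_count = conn.execute(
    "SELECT COUNT(*) FROM ("
    "  SELECT id FROM accounts"
    "  WHERE status = 'P' AND isprimary = 1"
    "  GROUP BY id HAVING COUNT(*) > 1"
    ") AS t"
).fetchone()[0]

print(repeated_ids, id_count)  # [3456, 9876] 2
```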

Select a Column in SQL not in Group By

I have been trying to find some info on how to select a non-aggregate column that is not contained in the Group By statement in SQL, but nothing I've found so far seems to answer my question. I have a table with three columns that I want from it. One is a create date, one is an ID that groups the records by a particular Claim ID, and the final one is the PK. I want to find the record that has the max creation date in each group of claim IDs. I am selecting the MAX(creation date) and the Claim ID (cpe.fmgcms_cpeclaimid), and grouping by the Claim ID. But I need the PK from these records (cpe.fmgcms_claimid), and if I try to add it to my select clause, I get an error. And I can't add it to my group by clause because then it will throw off my intended grouping. Does anyone know any workarounds for this? Here is a sample of my code:
Select MAX(cpe.createdon) As MaxDate, cpe.fmgcms_cpeclaimid
from Filteredfmgcms_claimpaymentestimate cpe
where cpe.createdon < 'reportstartdate'
group by cpe.fmgcms_cpeclaimid
This is the result I'd like to get:
Select MAX(cpe.createdon) As MaxDate, cpe.fmgcms_cpeclaimid, cpe.fmgcms_claimid
from Filteredfmgcms_claimpaymentestimate cpe
where cpe.createdon < 'reportstartdate'
group by cpe.fmgcms_cpeclaimid

The columns in the result set of a select query with a group by clause must be:
an expression used as one of the group by criteria, or ...
an aggregate function, or ...
a literal value
So, you can't do what you want to do in a single, simple query. The first thing to do is state your problem statement in a clear way, something like:
I want to find the individual claim row bearing the most recent
creation date within each group in my claims table
Given
create table dbo.claims_table
(
claim_id int not null ,
group_id int not null ,
date_created datetime not null ,
constraint some_table_PK primary key ( claim_id ) ,
constraint some_table_AK01 unique ( group_id , claim_id ) ,
constraint some_Table_AK02 unique ( group_id , date_created )
)
The first thing to do is identify the most recent creation date for each group:
select group_id ,
date_created = max( date_created )
from dbo.claims_table
group by group_id
That gives you the selection criteria you need (1 row per group, with 2 columns: group_id and the high-water created date) to fulfill the 1st part of the requirement (selecting the individual row from each group). That needs to be a virtual table in your final select query:
select *
from dbo.claims_table t
join ( select group_id ,
date_created = max( date_created )
from dbo.claims_table
group by group_id
) x on x.group_id = t.group_id
and x.date_created = t.date_created
If the table is not unique by date_created within group_id (AK02), you can get duplicate rows for a given group.

You can do this with PARTITION and RANK:
select * from
(
select MyPK, fmgcms_cpeclaimid, createdon,
Rank() over (Partition BY fmgcms_cpeclaimid order by createdon DESC) as Rank
from Filteredfmgcms_claimpaymentestimate
where createdon < 'reportstartdate'
) tmp
where Rank = 1

The direct answer is that you can't. You must select either an aggregate or something that you are grouping by.
So, you need an alternative approach.
1). Take your current query and join the base data back on it
SELECT
cpe.*
FROM
Filteredfmgcms_claimpaymentestimate cpe
INNER JOIN
(yourQuery) AS lookup
ON lookup.MaxDate = cpe.createdOn
AND lookup.fmgcms_cpeclaimid = cpe.fmgcms_cpeclaimid
2). Use a CTE to do it all in one go...
WITH
sequenced_data AS
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY fmgcms_cpeclaimid ORDER BY CreatedOn DESC) AS sequence_id
FROM
Filteredfmgcms_claimpaymentestimate
WHERE
createdon < 'reportstartdate'
)
SELECT
*
FROM
sequenced_data
WHERE
sequence_id = 1
NOTE: Using ROW_NUMBER() ensures just one record per fmgcms_cpeclaimid, even if multiple records are tied with the exact same createdon value. If you can have ties, and want all records with the same createdon value, use RANK() instead.
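The difference between ROW_NUMBER() and RANK() on ties can be seen in a small sketch. It uses Python with an in-memory SQLite database (3.25 or later assumed) instead of SQL Server; the claims table, its columns, and the helper latest_per_group are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE claims (claim_id INTEGER PRIMARY KEY, group_id INTEGER, created_on TEXT)"
)
conn.executemany(
    "INSERT INTO claims VALUES (?, ?, ?)",
    [
        (1, 10, "2020-01-01"),
        (2, 10, "2020-02-01"),  # tied with claim 3 for the latest date in group 10
        (3, 10, "2020-02-01"),
        (4, 20, "2020-01-15"),
    ],
)

def latest_per_group(ranking_function):
    """Return claim_ids ranked first within each group by created_on DESC."""
    sql = f"""
        SELECT claim_id
        FROM (
            SELECT claim_id,
                   {ranking_function}() OVER (
                       PARTITION BY group_id ORDER BY created_on DESC
                   ) AS seq
            FROM claims
        ) AS ranked
        WHERE seq = 1
        ORDER BY claim_id
    """
    return [row[0] for row in conn.execute(sql)]

row_number_ids = latest_per_group("ROW_NUMBER")  # one claim per group
rank_ids = latest_per_group("RANK")              # every tied claim per group

print(len(row_number_ids), rank_ids)  # 2 [2, 3, 4]
```

With ROW_NUMBER(), which of the two tied claims in group 10 is returned is not determined unless a tie-breaker such as claim_id is added to the ORDER BY.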

You can join the table on itself to get the PK. Apply the date filter inside the subquery so that the maximum is taken only over rows before the report start date:
Select cpe1.PK, cpe2.MaxDate, cpe1.fmgcms_cpeclaimid
from Filteredfmgcms_claimpaymentestimate cpe1
INNER JOIN
(
select MAX(createdon) As MaxDate, fmgcms_cpeclaimid
from Filteredfmgcms_claimpaymentestimate
where createdon < 'reportstartdate'
group by fmgcms_cpeclaimid
) cpe2
on cpe1.fmgcms_cpeclaimid = cpe2.fmgcms_cpeclaimid
and cpe1.createdon = cpe2.MaxDate

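One thing I like to do is wrap the additional columns in an aggregate function, such as MAX().
This works well when the column has a single value per group; otherwise MAX(cpe.fmgcms_claimid) is not necessarily the ID of the row with the latest createdon.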
Select MAX(cpe.createdon) As MaxDate, cpe.fmgcms_cpeclaimid, MAX(cpe.fmgcms_claimid) As fmgcms_claimid
from Filteredfmgcms_claimpaymentestimate cpe
where cpe.createdon < 'reportstartdate'
group by cpe.fmgcms_cpeclaimid

What you are asking for is covered by RedFilter's answer.
This answer also helps in understanding why GROUP BY is, in a sense, a simpler version of PARTITION BY ... OVER:
SQL Server: Difference between PARTITION BY and GROUP BY
PARTITION BY changes the way the returned value is calculated, and therefore it can (in a way) return columns that GROUP BY cannot.

You can use a query like the one below:
Select X.a, X.sum_b, Y.c from (
Select a, sum(b) as sum_b from name_table
group by a) X
left join name_table Y on Y.a = X.a
Example:
CREATE TABLE #products (
product_name VARCHAR(MAX),
code varchar(3),
list_price [numeric](8, 2) NOT NULL
);
INSERT INTO #products VALUES ('paku', 'ACE', 2000)
INSERT INTO #products VALUES ('paku', 'ACE', 2000)
INSERT INTO #products VALUES ('Dinding', 'ADE', 2000)
INSERT INTO #products VALUES ('Kaca', 'AKB', 2000)
INSERT INTO #products VALUES ('paku', 'ACE', 2000)
--SELECT * FROM #products
SELECT distinct x.code, x.SUM_PRICE, product_name FROM (SELECT code, SUM(list_price) as SUM_PRICE From #products
group by code)x
left join #products y on y.code=x.code
DROP TABLE #products

How do I find duplicate values in a table in Oracle?

What's the simplest SQL statement that will return the duplicate values for a given column and the count of their occurrences in an Oracle database table?
For example: I have a JOBS table with the column JOB_NUMBER. How can I find out if I have any duplicate JOB_NUMBERs, and how many times they're duplicated?

Aggregate the column by COUNT, then use a HAVING clause to find values that appear more than once.
SELECT column_name, COUNT(column_name)
FROM table_name
GROUP BY column_name
HAVING COUNT(column_name) > 1;
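Applied to the JOBS example from the question, the GROUP BY and HAVING pattern might look like the sketch below. It runs from Python against an in-memory SQLite database as a stand-in for Oracle, and the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, job_number INTEGER)")
conn.executemany(
    "INSERT INTO jobs (job_number) VALUES (?)",
    [(100,), (200,), (100,), (300,), (100,), (200,)],
)

# One output row per job_number that occurs more than once, with its count.
duplicates = conn.execute(
    """
    SELECT job_number, COUNT(*) AS occurrences
    FROM jobs
    GROUP BY job_number
    HAVING COUNT(*) > 1
    ORDER BY job_number
    """
).fetchall()

print(duplicates)  # [(100, 3), (200, 2)]
```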

Another way:
SELECT *
FROM TABLE A
WHERE EXISTS (
SELECT 1 FROM TABLE
WHERE COLUMN_NAME = A.COLUMN_NAME
AND ROWID < A.ROWID
)
This works fine (quickly enough) when there is an index on column_name, and it is a better way to delete or update duplicate rows.

Simplest I can think of:
select job_number, count(*)
from jobs
group by job_number
having count(*) > 1;

You don't need to even have the count in the returned columns if you don't need to know the actual number of duplicates. e.g.
SELECT column_name
FROM table
GROUP BY column_name
HAVING COUNT(*) > 1

How about:
SELECT <column>, count(*)
FROM <table>
GROUP BY <column> HAVING COUNT(*) > 1;
To answer the example above, it would look like:
SELECT job_number, count(*)
FROM jobs
GROUP BY job_number HAVING COUNT(*) > 1;

When multiple columns together identify a unique row (e.g. a relations table), you can use the following approach based on the row id.
For example, take emp_dept(empid, deptid, startdate, enddate) and suppose empid and deptid together are meant to be unique and identify a row. In that case:
select oed.empid, count(oed.empid)
from emp_dept oed
where exists ( select *
from emp_dept ied
where oed.rowid <> ied.rowid and
ied.empid = oed.empid and
ied.deptid = oed.deptid )
group by oed.empid having count(oed.empid) > 1 order by count(oed.empid);
If the table has a primary key, use the primary key instead of rowid. For example, if id is the PK:
select oed.empid, count(oed.empid)
from emp_dept oed
where exists ( select *
from emp_dept ied
where oed.id <> ied.id and
ied.empid = oed.empid and
ied.deptid = oed.deptid )
group by oed.empid having count(oed.empid) > 1 order by count(oed.empid);

Doing
select j1.job_number, j1.id, j2.id
from jobs j1 join jobs j2 on (j1.job_number = j2.job_number)
where j1.id != j2.id
will give you the duplicated rows' ids.

SELECT SocialSecurity_Number, Count(*) no_of_rows
FROM SocialSecurity
GROUP BY SocialSecurity_Number
HAVING Count(*) > 1
Order by Count(*) desc

I usually use the Oracle analytic function ROW_NUMBER().
Say you want to check for duplicates with respect to a unique index or primary key built on columns (c1, c2, c3).
You can then proceed this way, retrieving the ROWIDs of rows where the number assigned by ROW_NUMBER() is greater than 1:
Select *
From Table_With_Duplicates
Where Rowid In (Select rid
From (Select Rowid As rid, ROW_NUMBER() Over (
Partition By c1, c2, c3
Order By c1, c2, c3
) nbLines
From Table_With_Duplicates) t2
Where nbLines > 1)

I know it's an old thread, but this may help someone.
If you need to show other columns of the table while checking for duplicates, use the query below:
select * from table where column_name in
(select ing.column_name from table ing group by ing.column_name having count(*) > 1)
order by column_name desc;
You can also add additional filters in the where clause if needed.

Here is an SQL query to do that:
select column_name, count(1)
from table
group by column_name
having count (column_name) > 1;

1. solution
select * from emp
where rowid not in
(select max(rowid) from emp group by empno);

You can also try something like this to list all duplicate values in a table, say poitem:
SELECT count(poid)
FROM poitem
WHERE poid = 50
AND rownum < any (SELECT count(*) FROM poitem WHERE poid = 50)
GROUP BY poid
MINUS
SELECT count(poid)
FROM poitem
WHERE poid in (50)
GROUP BY poid
HAVING count(poid) > 1;

We Keep Coding
