Need to select ALL columns while using COUNT/Group By - sql

Ok so I have a table in which ONE of the columns have a FEW REPEATING records.
My task is to select the REPEATING records with all attributes.
CustID FN LN DOB City State
the DOB has some repeating values which I need to select from the whole table and list all columns of all records that are same within the DOB field..
My try...
Select DOB, COUNT(DOB) As 'SameDOB' from Table1
group by DOB
HAVING (COUNT(DOB) > 1)
This only returns two columns and one row 1st column is the DOB column that occurs more than once and the 2nd column gives count on how many.
I need to figure out a way to list all attributes not just these two...
Please guide me in the right direction.

I think a more general solution is to use windows functions:
select *
from (select *, count(*) over (partition by dob) as NumDOB
from table
) t
where numDOB > 1
The reason this is more general is because it is easy to change to duplicates across two or more columns.

Select *
FROM Table1 T
WHERE T.DOB IN( Select I.DOB
FROM Table1 I
GROUP BY I.DOB
HAVING COUNT(I.DOB) > 1)

Try joining with a subquery, which will also allow you to see the count
select t.*, a.SameDOB from Table1 t
join (
Select DOB, COUNT(DOB) As 'SameDOB' from Table1
group by DOB
HAVING (COUNT(DOB) > 1)
) a on a.dob = t.dob

select *
from table1, (select count(*) from table1) as cnt

Related

How to check if a person has duplicate date records?

I am looking to query my Access database from Excel (DAO) to determine if any name in the table has more than one record per date. E.g. If Bob has two records on 05/05/17 then I want to return both records as part of a recordset.
Seems like you are looking for something like:
SELECT *
FROM yourtable
INNER JOIN
(
SELECT count(*), name, date
FROM yourtable
GROUP BY name, date
HAVING COUNT(*) > 1
) multi
ON multi.name = yourtable.name
AND multi.date = yourtable.date
The inner select returns rows with more than 1 entry for the same name and date.
In Access you can do
select name, date
from your_table
group by name, date
having count(*) > 1

using group by operators in sql

i have two columns - email id and customer id, where an email id can be associated with multiple customer ids. Now, I need to list only those email ids (along with their corresponding customer ids) which are having a count of more than 1 customer id. I tried using grouping sets, rollup and cube operators, however, am not getting the desired result.
Any help or pointers would be appreciated.
SELECT emailid
FROM
( SELECT emailid, count(custid)
FROM table
Group by emailid
Having count(custid) > 1
)
I think this will get you what you want, if I am understanding you question correctly
select emailid, customerid from tablename where emailid in
(
select emailid from tablename group by emailid having count(emailid) > 1
)
Sounds like you would need to use HAVING
e.g
SELECT email_id, COUNT(customer_id)
From sometable
GROUP BY email_id
HAVING COUNT(customer_id) > 1
HAVING allows you to filter following the grouping of a particular column.
WITH email_ids AS (
SELECT email_id, COUNT(customer_id) customer_count
FROM Table
GROUP BY email_id
HAVING count(customer_id) > 1
)
SELECT t.email_id, t.customer_id
FROM Table t
INNER JOIN email_ids ei
ON ei.email_id = t.email_id
If you need a comma separated list of all of their customer id's returned with the single email id, you could use GROUP_CONCAT for that.
This would find all email_id's with at least 1 customer_id, and give you a comma separated list of all customer_ids for that email_id:
SELECT email_id, GROUP_CONCAT(customer_id)
FROM your_table
GROUP BY email_id
HAVING count(customer_id) > 1;
Assuming email_id #1 was assigned to customer_ids 1, 2, & 3, your output would look like:
email_id | customer_id
1 | 1,2,3
I didn't realize you were using MS SQL, there's a thread here about simulating GROUP_CONCAT in MS SQL: Simulating group_concat MySQL function in Microsoft SQL Server 2005?
SELECT t1.email, t1.customer
FROM table t1
INNER JOIN (
SELECT email, COUNT(customer)
FROM table
GROUP BY email
HAVING COUNT(customer)>1
) t2 on t1.email = t2.email
This should get you what your looking for.
Basically, as other ppl have stated, you can filter group by results with HAVING. But since you want the customerids afterwards, join the entire select back to your original table to get your results. Could probably be done prettier but this is easy to understand.
SELECT
email_id,
STUFF((SELECT ',' + CONVERT(VARCHAR,customer_id) FROM cust_email_table T1 WHERE T1.email_id = T2.email_id
FOR
XML PATH('')
),1,1,'') AS customer_ids
FROM
cust_email_table T2
GROUP BY email_id
HAVING COUNT(*) > 1
this would give you a single row per email id and comma seperated list of customer id's.

Can we use join with in same table while using group by function?

For instance, I have a table with columns below:
pk_id,address,first_name,last_name
and I have a query like this to display the first name ans last name that are repetitive(duplicates)
select first_name,last_name
from table
group by first_name,last_name
having count(*)>1;
but the above query just returns first and last names but I want to display pk_id and address too that are tied to these duplicate first and last names
Can we use joins to do this on the same table.Please help!!
A simple way of doing is to build a view with the pk_id and the count of duplicates. Once you have it, it is only a matter of using a JOIN on the base table, and a filter to only keep rows having a duplicate:
SELECT T.*
FROM T
JOIN (SELECT "pk_id",
COUNT(*) OVER(PARTITION BY "first_name", "last_name") cnt
FROM T) V
ON T."pk_id" = V."pk_id"
WHERE cnt > 1
See http://sqlfiddle.com/#!4/3ecd0/9
You have to call it from an outer query, like this:
select * from table
where first_name||last_name in
(select first_name||last_name from
(select first_name, last_name, count( * )
from table
group by first_name,last_name
having count( * ) > 1
)
)
note: you may not need to concatenate the 2 fields, but I haven't tested thaT.
with
my_duplicates as
(
select
first_name,
last_name
from
my_table
group by
first_name,
last_name
having
count(*) > 1
)
select
bb.pk_id,
bb.address,
bb.first_name,
bb.last_name
from
my_duplicates aa
join my_table bb on
(
aa.first_name = bb.first_name
and
aa.last_name = bb.last_name
)
order by
bb.last_name,
bb.first_name,
bb.pk_id

Find duplicated rows that are not exactly same

Can i select all rows that have same column value (for example SSN field) but display them all separably. ?
I've searched for this answer but they all have "count(*) and group by" section that demands the rows to be exactly same.
Try This:
SELECT A, B FROM MyTable
WHERE A IN
(
SELECT A FROM MyTable GROUP BY A HAVING COUNT(*)>1
)
I have done with SQL server. But hope this is what you need
Here is another approach, which only references the table once, using an analytic function instead of a subquery to get the duplicate counts It might be faster; it also might not, depending on the particular data.
SELECT * FROM (
SELECT col1, col2, col3, ssn, COUNT(*) OVER (PARTITION BY ssn) ssn_dup_count
)
WHERE ssn_dup_count > 1
ORDER BY ssn_dup_count DESC
SELECT
*
FROM
MyTable
WHERE
EXISTS
(
SELECT
NULL
FROM
MyTable MT
WHERE
MyTable.SameColumnName = MT.SameColumnName
AND MyTable.DifferentColumnName <> MT.DifferentColumnName)
This will fetch the required data and show them in order so that we can see the grouped data together.
SELECT * FROM TABLENAME
WHERE SSN IN
(
SELECT SSN FROM TABLENAMEGROUP BY SSN HAVING COUNT(SSN)>1
)
ORDER BY SSN
Here SSN is the column names fro which similar value check is done.

Get list of records with multiple entries on the same date

I need to return a list of record id's from a table that may/may not have multiple entries with that record id on the same date. The same date criteria is key - if a record has three entries on 09/10/2008, then I need all three returned. If the record only has one entry on 09/12/2008, then I don't need it.
SELECT id, datefield, count(*) FROM tablename GROUP BY datefield
HAVING count(*) > 1
Since you mentioned needing all three records, I am assuming you want the data as well. If you just need the id's, you can just use the group by query. To return the data, just join to that as a subquery
select * from table
inner join (
select id, date
from table
group by id, date
having count(*) > 1) grouped
on table.id = grouped.id and table.date = grouped.date
The top post (Leigh Caldwell) will not return duplicate records and needs to be down modded. It will identify the duplicate keys. Furthermore, it will not work if your database doesn't allows the group by to not include all select fields (many do not).
If your date field includes a time stamp then you'll need to truncate that out using one of the methods documented above ( I prefer: dateadd(dd,0, datediff(dd,0,#DateTime)) ).
I think Scott Nichols gave the correct answer and here's a script to prove it:
declare #duplicates table (
id int,
datestamp datetime,
ipsum varchar(200))
insert into #duplicates (id,datestamp,ipsum) values (1,'9/12/2008','ipsum primis in faucibus')
insert into #duplicates (id,datestamp,ipsum) values (1,'9/12/2008','Vivamus consectetuer. ')
insert into #duplicates (id,datestamp,ipsum) values (2,'9/12/2008','condimentum posuere, quam.')
insert into #duplicates (id,datestamp,ipsum) values (2,'9/13/2008','Donec eu sapien vel dui')
insert into #duplicates (id,datestamp,ipsum) values (3,'9/12/2008','In velit nulla, faucibus sed')
select a.* from #duplicates a
inner join (select id,datestamp, count(1) as number
from #duplicates
group by id,datestamp
having count(1) > 1) b
on (a.id = b.id and a.datestamp = b.datestamp)
SELECT RecordID
FROM aTable
WHERE SameDate IN
(SELECT SameDate
FROM aTable
GROUP BY SameDate
HAVING COUNT(SameDate) > 1)
If I understand your question correctly you could do something similar to:
select
recordID
from
tablewithrecords as a
left join (
select
count(recordID) as recordcount
from
tblwithrecords
where
recorddate='9/10/08'
) as b on a.recordID=b.recordID
where
b.recordcount>1
http://www.sql-server-performance.com/articles/dba/delete_duplicates_p1.aspx will get you going. Also, http://en.allexperts.com/q/MS-SQL-1450/2008/8/SQL-query-fetch-duplicate.htm
I found these by searching Google for 'sql duplicate data'. You'll see this isn't an unusual problem.
SELECT * FROM the_table WHERE ROW(record_id,date) IN
( SELECT record_id, date FROM the_table
GROUP BY record_id, date WHERE COUNT(*) > 1 )
For matching on just the date part of a Datetime:
select * from Table
where id in (
select alias1.id from Table alias1, Table alias2
where alias1.id != alias2.id
and datediff(day, alias1.date, alias2.date) = 0
)
I think. This is based on my assumption that you need them on the same day month and year, but not the same time of day, so I did not use a Group by clause. From the other posts it looks like I could have more cleverly used a Having clause. Can you use a having or group by on a datediff expression?
I'm not sure I understood your question, but maybe you want something like this:
SELECT id, COUNT(*) AS same_date FROM foo GROUP BY id, date HAVING same_date = 3;
This is just written from my mind and not tested in any way. Read the GROUP BY and HAVING section here. If this is not what you meant, please ignore this answer.
Note that there's some extra processing necessary if you're using a SQL DateTime field. If you've got that extra time data in there, then you can't just use that column as-is. You've got to normalize the DateTime to a single value for all records contained within the day.
In SQL Server here's a little trick to do that:
SELECT CAST(FLOOR(CAST(CURRENT_TIMESTAMP AS float)) AS DATETIME)
You cast the DateTime into a float, which represents the Date as the integer portion and the Time as the fraction of a day that's passed. Chop off that decimal portion, then cast that back to a DateTime, and you've got midnight at the beginning of that day.
Without knowing the exact structure of your tables or what type of database you're using it's hard to answer. However if you're using MS SQL and if you have a true date/time field that has different times that the records were entered on the same date then something like this should work:
select record_id,
convert(varchar, date_created, 101) as log date,
count(distinct date_created) as num_of_entries
from record_log_table
group by convert(varchar, date_created, 101), record_id
having count(distinct date_created) > 1
Hope this helps.
SELECT id, count(*)
INTO #tmp
FROM tablename
WHERE date = #date
GROUP BY id
HAVING count(*) > 1
SELECT *
FROM tablename t
WHERE EXISTS (SELECT 1 FROM #tmp WHERE id = t.id)
DROP TABLE tablename
TrickyNixon writes;
The top post (Leigh Caldwell) will not return duplicate records and needs to be down modded.
Yet the question doesn't ask about duplicate records. It asks about duplicate record-ids on the same date...
GROUP-BY,HAVING seems good to me. I've used it in production before.
.
Something to watch out for:
SELECT ... FROM ... GROUP BY ... HAVING count(*)>1
Will, on most database systems, run in O(NlogN) time. It's a good solution. (Select is O(N), sort is O(NlogN), group by is O(N), having is O(N) -- Worse case. Best case, date is indexed and the sort operation is more efficient.)
.
Select ... from ..., .... where a.data = b.date
Granted only idiots do a Cartesian join. But you're looking at O(N^2) time. For some databases, this also creates a "temporary" table. It's all insignificant when your table has only 10 rows. But it's gonna hurt when that table grows!
Ob link: http://en.wikipedia.org/wiki/Join_(SQL)
select id from tbl where date in
(select date from tbl group by date having count(*)>1)
GROUP BY with HAVING is your friend:
select id, count(*) from records group by date having count(*) > 1