Join Table to itself having count > 1 - sql-server-2005

I have a Personal table with a LastName column and a MaybeUniqueID.
I want to output a LastName column containing only the rows whose MaybeUniqueID value occurs in more than one row.
I would like to do everything in a single run, avoiding intermediate outputs.
I would prefer not to use temporary tables or table variables; if unavoidable, I would use at most one table variable (no temporary tables), but I think this should not be necessary.
I am using Microsoft SQL Server 2005.
I tried different approaches with SQL clauses such as HAVING and GROUP BY, but I failed to get the outcome I am looking for.
Here is a condensed version of my attempt, which does not work:
SELECT LastName
FROM Personal
JOIN
(SELECT MaybeUniqueID AS ID2,
COUNT(*) AS CNT
FROM Personal
--WHERE CNT > 1
GROUP BY MaybeUniqueID
HAVING cnt > 1
) AS MultiMaybeUniqueID
ON Personal.MaybeUniqueID = MultiMaybeUniqueID.ID2

HAVING cnt > 1 should be HAVING COUNT(*) > 1.
In SQL Server, a column alias defined in the SELECT list can be referenced only in the ORDER BY clause, not in the HAVING clause.
Alternatively, you could use a windowed count:
;WITH T AS
(
SELECT LastName,
COUNT(*) OVER (PARTITION BY MaybeUniqueID) AS Cnt
FROM Personal
)
SELECT LastName
FROM T
WHERE Cnt > 1
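For anyone who wants to try the corrected join on sample data, here is a minimal sketch. It runs the query through Python's built-in sqlite3 module rather than SQL Server, and the four sample rows are made up for illustration. SQLite is more permissive about aliases than SQL Server, so this only shows that the corrected form returns the expected rows; it does not reproduce the original error.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Personal (LastName TEXT, MaybeUniqueID INTEGER)")
# Hypothetical sample rows: only MaybeUniqueID 2 occurs more than once.
conn.executemany(
    "INSERT INTO Personal VALUES (?, ?)",
    [("Smith", 1), ("Jones", 2), ("Brown", 2), ("Taylor", 3)],
)

query = """
SELECT LastName
FROM Personal
JOIN (SELECT MaybeUniqueID AS ID2,
             COUNT(*) AS CNT
      FROM Personal
      GROUP BY MaybeUniqueID
      HAVING COUNT(*) > 1   -- repeat the aggregate instead of using the alias
     ) AS MultiMaybeUniqueID
  ON Personal.MaybeUniqueID = MultiMaybeUniqueID.ID2
ORDER BY LastName
"""
last_names = [row[0] for row in conn.execute(query)]
print(last_names)
```

Only the two people sharing MaybeUniqueID 2 come back; the rows with a unique ID are filtered out by the join.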

Related

Getting result basis on count of another SQL query

I have a table with the following columns:
bkng_date
bkng_id (varchar)
villa_id (varchar)
This query
select bkng_date,count(*) as cnt
from tab_bkng_det
group by bkng_date;
returns the number of records for each date as cnt.
Now I need to find dates in the resultset of this query where cnt = 2.
I tried a couple of subqueries but I'm not getting the desired results.
The simplest, correct, and safe solution is to add a having count(*) = 2 clause, as Gordon said.
For completeness, if you are curious how to solve it using subqueries (you didn't mention your database vendor, though it very likely supports the having clause), it would be:
select x.bkng_date, x.cnt from (
select bkng_date,count(*) as cnt
from tab_bkng_det
group by bkng_date
) x
where x.cnt = 2
or
with x as (
select bkng_date,count(*) as cnt
from tab_bkng_det
group by bkng_date
)
select * from x where cnt = 2
The best option is to use the HAVING clause, as follows:
select bkng_date,count(*) as cnt
from tab_bkng_det
group by bkng_date
having count(*) = 2
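To see that the HAVING form and the derived-table form agree, here is a small sketch run through Python's sqlite3 module. The database engine and the six booking rows are assumptions made for illustration, since the question does not name a vendor or show data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tab_bkng_det (bkng_date TEXT, bkng_id TEXT, villa_id TEXT)"
)
# Hypothetical bookings: 2 on Jan 1, 1 on Jan 2, 3 on Jan 3.
conn.executemany(
    "INSERT INTO tab_bkng_det VALUES (?, ?, ?)",
    [
        ("2020-01-01", "B1", "V1"),
        ("2020-01-01", "B2", "V2"),
        ("2020-01-02", "B3", "V1"),
        ("2020-01-03", "B4", "V1"),
        ("2020-01-03", "B5", "V2"),
        ("2020-01-03", "B6", "V3"),
    ],
)

# Filter on the aggregate directly with HAVING.
having_rows = conn.execute("""
    select bkng_date, count(*) as cnt
    from tab_bkng_det
    group by bkng_date
    having count(*) = 2
    order by bkng_date
""").fetchall()

# Same filter applied from outside a derived table.
subquery_rows = conn.execute("""
    select x.bkng_date, x.cnt
    from (
        select bkng_date, count(*) as cnt
        from tab_bkng_det
        group by bkng_date
    ) x
    where x.cnt = 2
    order by x.bkng_date
""").fetchall()

print(having_rows, subquery_rows)
```

Both queries return only the date with exactly two bookings.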

SQL Oracle Find Max of count

I have this table called item:
| PERSON_id | ITEM_id |
|-----------|---------|
| CP2       | A03     |
| CP2       | A02     |
| HB3       | A02     |
| BW4       | A01     |
I need an SQL statement that would output the person with the most Items. Not really sure where to start either.
I advise you to use an inner query for this purpose. The inner query includes the GROUP BY and ORDER BY clauses, and the outer query selects the first row, which is the person with the most items.
SELECT * FROM
(
SELECT PERSON_ID, COUNT(*) FROM TABLE1
GROUP BY PERSON_ID
ORDER BY 2 DESC
)
WHERE ROWNUM = 1
Here is the SQL Fiddle link: http://sqlfiddle.com/#!4/4c4228/5
Locating the maximum of an aggregated column requires more than a single calculation, so here you can use a "common table expression" (cte) to hold the result and then re-use that result in a where clause:
with cte as (
select
person_id
, count(item_id) count_items
from mytable
group by
person_id
)
select
*
from cte
where count_items = (select max(count_items) from cte)
Note: if more than one person shares the same maximum count, this query returns more than one row.
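The tie behaviour is easy to check with a small sketch. This one runs the CTE query through Python's sqlite3 module instead of Oracle (an assumption for portability), and adds one hypothetical row (HB3, A03) to the question's data so that two people share the maximum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (person_id TEXT, item_id TEXT)")
# The question's rows plus a hypothetical (HB3, A03) row to force a tie.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        ("CP2", "A03"),
        ("CP2", "A02"),
        ("HB3", "A02"),
        ("HB3", "A03"),
        ("BW4", "A01"),
    ],
)

top_rows = conn.execute("""
    with cte as (
        select person_id, count(item_id) as count_items
        from mytable
        group by person_id
    )
    select person_id, count_items
    from cte
    where count_items = (select max(count_items) from cte)
    order by person_id
""").fetchall()
print(top_rows)
```

With the tie in place, both CP2 and HB3 are returned, whereas a "first row only" approach would silently pick one of them.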

SQL SELECT Full Row with Duplicated Data in One Column

I am using Microsoft SQL Server 2014.
I am able to list emails which are duplicated.
But I am unable to list the entire row, which contain other fields such as EmployeeId, Username, FirstName, LastName, etc.
SELECT Email,
COUNT(Email) AS NumOccurrences
FROM EmployeeProfile
GROUP BY Email
HAVING ( COUNT(Email) > 1 )
How can I list all fields of the rows whose Email appears more than once in the table?
Thank you.
Try this:
WITH DataSource AS
(
SELECT *
,COUNT(*) OVER (PARTITION BY email) count_calc
FROM EmployeeProfile
)
SELECT *
FROM DataSource
WHERE count_calc > 1
select distinct *
from EmployeeProfile
where email in (
SELECT Email
FROM EmployeeProfile
GROUP BY Email
HAVING COUNT(*) > 1
)
SQL Fiddle
with cte as (
select *
, count(1) over (partition by email) noDuplicates
from Demo
)
select *
from cte
where noDuplicates > 1
order by Email, EmployeeId
Explanation:
I've used a common table expression (cte) here; but you could equally use a subquery; it makes no difference.
This cte/subquery fetches every row, and includes a new field called noDuplicates which says how many records have that same email address (including the record itself; so noDuplicates=1 actually means there are no duplicates; whilst noDuplicates=2 means the record itself and 1 duplicate, or 2 records with this email address). This field is calculated using an aggregate function over a window. You can read up on window functions here: https://learn.microsoft.com/en-us/sql/t-sql/queries/select-over-clause-transact-sql?view=sql-server-2017
In our outer query we then select only those records with noDuplicates greater than 1, i.e. where there are multiple records with the same email address.
Finally, I've sorted by Email and EmployeeId, so that duplicates are listed alongside one another and are presented in the sequence in which they were (presumably) created, to make life easier for whoever deals with these results.
If EmployeeId is unique, then you can use EXISTS:
SELECT ep.*
FROM EmployeeProfile ep
WHERE EXISTS (SELECT 1
FROM EmployeeProfile ep1
WHERE ep1.Email = ep.Email AND ep1.EmployeeId <> ep.EmployeeId
);
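As a quick check of the EXISTS approach, here is a sketch run through Python's sqlite3 module rather than SQL Server. The trimmed-down EmployeeProfile table (three columns only) and its rows are hypothetical, and EmployeeId is assumed to be unique, as the answer requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical version of EmployeeProfile with a unique EmployeeId.
conn.execute(
    "CREATE TABLE EmployeeProfile ("
    "EmployeeId INTEGER PRIMARY KEY, Email TEXT, LastName TEXT)"
)
conn.executemany(
    "INSERT INTO EmployeeProfile VALUES (?, ?, ?)",
    [
        (1, "a@example.com", "Smith"),
        (2, "b@example.com", "Jones"),
        (3, "a@example.com", "Brown"),
    ],
)

# A row qualifies when some OTHER row (different EmployeeId) has the same Email.
duplicate_rows = conn.execute("""
    SELECT ep.*
    FROM EmployeeProfile ep
    WHERE EXISTS (SELECT 1
                  FROM EmployeeProfile ep1
                  WHERE ep1.Email = ep.Email
                    AND ep1.EmployeeId <> ep.EmployeeId)
    ORDER BY ep.EmployeeId
""").fetchall()
print(duplicate_rows)
```

Employees 1 and 3 share an email address, so both full rows are returned and employee 2 is left out.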

sql query - filtering duplicate values to create report

I am trying to list all the duplicate records in a table. This table does not have a Primary Key and has been specifically created only for creating a report to list out duplicates. It comprises both unique and duplicate values.
The query I have so far is:
SELECT [OfficeCD]
,[NewID]
,[Year]
,[Type]
FROM [Test].[dbo].[Duplicates]
GROUP BY [OfficeCD]
,[NewID]
,[Year]
,[Type]
HAVING COUNT(*) > 1
This works right and gives me all the duplicates - that is the number of times it occurs.
But I want to display all the values in my report of all the columns. How can I do that without querying for each record separately?
For example:
Each table has 10 fields and [NewID] is the field which is occurring multiple times. I need to create a report with all the data in all the fields where NewID has been duplicated.
Please help.
Thank you.
You need a subquery:
SELECT * FROM yourtable
WHERE NewID IN (
SELECT NewID FROM yourtable
GROUP BY OfficeCD,NewID,Year,Type
HAVING Count(*)>1
)
Additionally, you might want to check your tags: you tagged mysql, but the syntax makes me think you mean sql-server.
Try this:
SELECT * FROM [Duplicates] WHERE NewID IN
(
SELECT [NewID] FROM [Duplicates] GROUP BY [NewID] HAVING COUNT(*) > 1
)
select d.*
from Duplicates d
inner join (
select NewID
from Duplicates
group by NewID
having COUNT(*) > 1
) dd on d.NewID = dd.NewID
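One thing worth noticing is that the answers above do not all group the same way: the first groups by all four columns, the other two by NewID alone. The sketch below, run through Python's sqlite3 module with a made-up four-column version of the table, suggests how the results can differ. Which one is wanted depends on whether "duplicate" means a repeated NewID or a fully repeated row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical cut-down Duplicates table (the real one has 10 fields).
conn.execute(
    "CREATE TABLE Duplicates (OfficeCD TEXT, NewID INTEGER, Year INTEGER, Type TEXT)"
)
conn.executemany(
    "INSERT INTO Duplicates VALUES (?, ?, ?, ?)",
    [
        ("A", 1, 2010, "X"),  # NewID 1 repeats, but the other columns differ
        ("B", 1, 2011, "Y"),
        ("C", 2, 2010, "X"),  # NewID 2 repeats with identical rows
        ("C", 2, 2010, "X"),
        ("D", 3, 2012, "Z"),  # NewID 3 is unique
    ],
)

# Duplicate means: the same NewID appears more than once.
by_new_id = conn.execute("""
    SELECT OfficeCD, NewID FROM Duplicates
    WHERE NewID IN (SELECT NewID FROM Duplicates
                    GROUP BY NewID
                    HAVING COUNT(*) > 1)
    ORDER BY OfficeCD, NewID
""").fetchall()

# Duplicate means: the whole (OfficeCD, NewID, Year, Type) combination repeats.
by_all_columns = conn.execute("""
    SELECT OfficeCD, NewID FROM Duplicates
    WHERE NewID IN (SELECT NewID FROM Duplicates
                    GROUP BY OfficeCD, NewID, [Year], [Type]
                    HAVING COUNT(*) > 1)
    ORDER BY OfficeCD, NewID
""").fetchall()

print(by_new_id)
print(by_all_columns)
```

Grouping by NewID alone reports four rows here, while grouping by all four columns reports only the two identical rows.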

sql query to get redundant record

I want to get redundant records from the database. Is my query correct for this?
select (fields)
from DB
group by name, city
having count(*) > 1
If wrong please let me know how can I correct this.
Also if I want to delete duplicate record will it work?
delete from tbl_name
where row_id in
(select row_id from tbl_name group by name, city having count(*) > 1)
So can I write the above query like this?
DELETE FROM tb_name where row_id not in(select min(row_id) from tb_name groupBy(name, city) having count(*)>1)
Your DELETE syntax is definitely totally wrong - that won't work ever. What it'll do is delete all rows that have more than one occurrence - not leaving any data around...
What you can do in SQL Server 2005 and up is use a CTE (Common Table Expression) and the
ROW_NUMBER() ranking function:
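;WITH Duplicates AS
(
SELECT
Name, City,
ROW_NUMBER() OVER (PARTITION BY Name, City ORDER BY City) AS RowNum
FROM dbo.YourTable
)
-- Delete through the CTE so that RowNum is visible;
-- this removes the underlying rows from dbo.YourTable.
DELETE FROM Duplicates
WHERE RowNum > 1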
You basically create "partitions" of your data by the (name, city) combo - each of those pairs will get sequential numbers from 1 on up.
Those that have more than one occurrence will also have entries in that CTE with a RowNum > 1 - just delete all of those and your duplicates are done!
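Here is a rough sketch of the same partition-and-number idea that can be run without a SQL Server instance, using Python's sqlite3 module (SQLite 3.25 or later is needed for window functions). Two things differ from the SQL Server version and are assumptions of this sketch: SQLite cannot delete through a CTE, so the delete goes via a hypothetical row_id key, and the numbering is ordered by row_id so that the lowest row_id in each (Name, City) group is the one that survives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with a unique row_id, as mentioned in the question.
conn.execute(
    "CREATE TABLE YourTable (row_id INTEGER PRIMARY KEY, Name TEXT, City TEXT)"
)
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [
        (1, "Ann", "Oslo"),
        (2, "Ann", "Oslo"),  # duplicate of row 1
        (3, "Bob", "Rome"),
        (4, "Ann", "Oslo"),  # duplicate of row 1
        (5, "Ann", "Rome"),
    ],
)

# Number the rows inside each (Name, City) partition, then delete every row
# numbered above 1, so exactly one row per combination is kept.
conn.execute("""
    WITH Duplicates AS (
        SELECT row_id,
               ROW_NUMBER() OVER (PARTITION BY Name, City
                                  ORDER BY row_id) AS RowNum
        FROM YourTable
    )
    DELETE FROM YourTable
    WHERE row_id IN (SELECT row_id FROM Duplicates WHERE RowNum > 1)
""")

remaining = conn.execute(
    "SELECT row_id, Name, City FROM YourTable ORDER BY row_id"
).fetchall()
print(remaining)
```

Rows 2 and 4 are removed, and one row remains for each distinct (Name, City) pair.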
Read about Using Common Table Expressions in SQL Server 2005 and about Ranking Functions and Performance in SQL Server 2005 (or consult the MSDN docs on those topics)
You have the syntax wrong:
select name, city, count(*) from table group by name, city having count(*) > 1
If you are not interested in the actual count, remove ", count(*)" from the query