SQL: Removing ALL rows that have repeated values - sql

I've got a table with columns:
Acct_no, PSTL_CODE, NAME, phone
I'm trying to get rid of all rows that share the same PSTL_CODE and phone (i.e. dump the ones where there's his & hers accounts, and similar scenarios)
I've pulled together the following which I think should only return rows with a unique PSTL_CODE:
SELECT * FROM Sheet1
WHERE PSTL_CODE IN
(SELECT PSTL_CODE FROM SHEET1
GROUP BY PSTL_CODE HAVING COUNT(PSTL_CODE) =1)
ORDER BY phone
and it's close-ish, but it's still returning one row where there are two accounts at the same PSTL_CODE.
and I'm stuck with Access 2007, so I can't do:
SELECT * FROM Sheet1
EXCEPT
(SELECT PSTL_CODE FROM SHEET1
GROUP BY PSTL_CODE HAVING COUNT(*) >1)
ORDER BY phone
in order to just scythe away the multiples.
Help!

Try using Exists
Considering that your unique combination is PSTL_CODE and phone
SELECT *
FROM sheet1 a
WHERE EXISTS (SELECT 1
FROM sheet1 b
WHERE a.pstl_code = b.pstl_code
AND a.phone = b.phone
HAVING Count(*) = 1)
ORDER BY phone
It is better to define a unique constraint on the PSTL_CODE and phone columns, which stops duplicate records from getting into the table in the first place.

If I understand the business logic correctly, you are attempting to eliminate couples living together from the query. You are also grouping by phone, which seems to be causing the error. Is there a logical business reason for two people to live in the same place but have different phone numbers, e.g. mobile phones?

One approach is to generate the set of (postal code, phone) pairs that occur more than once, then LEFT JOIN your base set to that generated set and keep only the rows where the joined pstl_Code is null.
Select *
FROM SHEET1 A
LEFT JOIN (
Select count(*) as cnt, pstl_Code, phone
From sheet1
group by pstl_Code, phone
having count(*) > 1) B
on A.Pstl_Code = B.Pstl_Code
and A.Phone = B.Phone
where B.pstl_Code is null and b.phone is null
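If you want to try the anti-join idea outside Access, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up for illustration; only the table and column names come from the question, and SQLite syntax may differ slightly from what Access accepts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sheet1 (Acct_no INTEGER, PSTL_CODE TEXT, NAME TEXT, phone TEXT)"
)
# Hypothetical sample rows: accounts 1 and 2 share both postal code and phone.
conn.executemany(
    "INSERT INTO Sheet1 VALUES (?, ?, ?, ?)",
    [
        (1, "AB1 2CD", "Mr Smith", "555-0100"),
        (2, "AB1 2CD", "Mrs Smith", "555-0100"),
        (3, "EF3 4GH", "Ms Jones", "555-0101"),
        (4, "AB1 2CD", "Mr Brown", "555-0102"),
    ],
)

# Build the set of repeated (PSTL_CODE, phone) pairs, then keep only the
# rows of Sheet1 that find no partner in that set.
query = """
SELECT a.Acct_no
FROM Sheet1 AS a
LEFT JOIN (
    SELECT PSTL_CODE, phone
    FROM Sheet1
    GROUP BY PSTL_CODE, phone
    HAVING COUNT(*) > 1
) AS b
  ON a.PSTL_CODE = b.PSTL_CODE AND a.phone = b.phone
WHERE b.PSTL_CODE IS NULL
ORDER BY a.Acct_no
"""
kept = [row[0] for row in conn.execute(query)]
print(kept)
```

Account 4 survives because it shares only the postal code, not the phone, with the Smith accounts.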

Related

SQL Oracle - query to return rows based on data matching rules

I have the below data
NUMBER SEQUENCE_NUMBER
CA00000045 AAD508
CA00000045 AAD508
CA00000046 AAD509
CA00000047 AAD510
CA00000047 AAD510
CA00000047 AAD511
CA00000048 AAD511
and I would like to find out which rows do not match the following rule:
NUMBER will always be the same when the SEQUENCE_NUMBER is the same.
So in the above data 'AAD508' will mean the NUMBER value will be the same on each row where the same value appears in the SEQUENCE_NUMBER.
I want to write a query that will bring me back rows where this rule is broken. So for example:
CA00000047 AAD511
CA00000048 AAD511
I don't know where to start with this one, so I have no initial SQL, I'm afraid.
Thanks
You want to self join on the data to compare each row to all others sharing the same sequence number, and then filter using a WHERE clause to only get rows with non-matching numbers. You did not give a name for the table, so I added it as "table_name" below.
SELECT
a.NUMBER,
a.SEQUENCE_NUMBER
FROM table_name a
INNER JOIN table_name b
ON a.SEQUENCE_NUMBER = b.SEQUENCE_NUMBER
WHERE a.NUMBER <> b.NUMBER
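GROUP BY a.NUMBER, a.SEQUENCE_NUMBER
Threw in the GROUP BY to act as a DISTINCT. The columns are listed by name because Oracle has traditionally not accepted positional references such as GROUP BY 1,2.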
I would simply use exists:
select t.*
from t
where exists (select 1
from t t2
where t2.sequence_number = t.sequence_number and
t2.number <> t.number
);
If each sequence_number only had up to two rows, you could get each rule-breaker on one row:
select sequence_number, min(number), max(number)
from t
group by sequence_number
having min(number) <> max(number);
Or, you could generalize this to get the list of numbers on a single row:
select sequence_number, listagg(number, ',') within group (order by number) as numbers
from t
group by sequence_number
having min(number) <> max(number);
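To check the EXISTS approach against the data in the question, here is a small sketch using Python's built-in sqlite3 module rather than Oracle. The table name t follows the answer above; the column is renamed to num here as an assumption of this sketch, to sidestep NUMBER being a reserved word in Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column "num" stands in for the question's NUMBER column (assumed name).
conn.execute("CREATE TABLE t (num TEXT, sequence_number TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("CA00000045", "AAD508"),
        ("CA00000045", "AAD508"),
        ("CA00000046", "AAD509"),
        ("CA00000047", "AAD510"),
        ("CA00000047", "AAD510"),
        ("CA00000047", "AAD511"),
        ("CA00000048", "AAD511"),
    ],
)

# A row breaks the rule if some other row has the same sequence_number
# but a different num.
query = """
SELECT t.num, t.sequence_number
FROM t
WHERE EXISTS (
    SELECT 1
    FROM t AS t2
    WHERE t2.sequence_number = t.sequence_number
      AND t2.num <> t.num
)
ORDER BY t.num
"""
broken = conn.execute(query).fetchall()
print(broken)
```

This returns the two AAD511 rows the question asks for.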

SQL: How to return records with more than 1 row of data

I'm stuck on a problem where I am creating a report and need to show records which have two or more bank accounts (some of our employees are international and get paid in more than one currency).
The report I created brings back all employees and their bank account information. However, I want this report only to bring back employees with 2 or more bank accounts.
Here is some test data below:
As you can see, Gareth has more than one bank account - what filter can I write to just bring back his record?
Assuming it is MySQL, SQLite, or even PostgreSQL (I think), it should be:
SELECT * FROM table_name WHERE "First Name" = 'Gareth';
However, if you are using the MySQL binding for Excel, be sure that "Gareth" is listed on every row for each account he owns. Leaving the cells beneath blank may look cleaner, but SQL will interpret the row beneath "Gareth" as "". This also means "First Name" can't be set as UNIQUE, so you will need a different index. Also, spaces in column names make for pretty ugly SQL syntax; it's best to use camelCase or under_scores.
Assuming Person_Number as the primary key of the table accounts
SELECT * FROM accounts WHERE accounts.First_Name in
(SELECT a.First_Name
FROM accounts a
INNER JOIN accounts b
ON a.Person_Number = b.Person_Number
WHERE a.First_Name = b.First_Name
AND a.Bank_Account_Name <> b.Bank_Account_Name
);
Try this if you are using a database like Oracle or SQL Server:
select b1.* from account_info_table b1 where exists (select 1 from account_info_table b2 where b1.first_name=b2.first_name group by b2.first_name having count(*)>=2);
or
select * from account_info_table where first_name in (select first_name from account_info_table group by first_name having count(*)>=2);
Use this code:
;WITH _CTE AS
(
SELECT *, COUNT(*) OVER (PARTITION BY FirstName) AS AccountCount FROM #Temp
)
SELECT FirstName, BankAccountName, BankBranchName, BankAccountNo, SortCode
FROM _CTE
WHERE AccountCount > 1
select empleado from empleados
group by empleado
having count(*) > 1
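To make the GROUP BY ... HAVING idea from the answers above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, its columns, and the rows are assumptions for illustration, since the question's test data is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and rows; the real report's columns will differ.
conn.execute(
    "CREATE TABLE accounts "
    "(Person_Number INTEGER, First_Name TEXT, Bank_Account_Name TEXT)"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [
        (1, "Gareth", "GBP Current"),
        (1, "Gareth", "USD Current"),
        (2, "Anna", "GBP Current"),
    ],
)

# Find the people with two or more rows, then return all of their rows.
query = """
SELECT Person_Number, First_Name, Bank_Account_Name
FROM accounts
WHERE Person_Number IN (
    SELECT Person_Number
    FROM accounts
    GROUP BY Person_Number
    HAVING COUNT(*) >= 2
)
ORDER BY Bank_Account_Name
"""
multi = conn.execute(query).fetchall()
print(multi)
```

Grouping on an employee identifier rather than the first name avoids lumping together two different people who happen to share a first name.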

How to search for matching staff number in sql

I am new to SQL and am trying to come up with a query that lists the duplicate staff created in our system.
We have one staff member created with ID 1234, and the same user has another account with staff ID 01234. Is there any way I can get the matching staff?
Once I have the correct duplicates, I want to delete the accounts that don't have "0" at the start, e.g. delete 1234 and keep only 01234.
Below is the SQL:
SELECT tps_user.tps_title AS [Name] , tps_user_type.tps_title AS [User Type]
FROM tps_user INNER JOIN
tps_user_type ON tps_user.tps_user_type_guid = tps_user_type.tps_guid
WHERE (tps_user.tps_title IN
(SELECT tps_title AS users
FROM tps_user AS t1
WHERE (tps_deleted = 0)
GROUP BY tps_title
HAVING (COUNT(tps_title) > 1))) AND (tps_user.tps_deleted = 0)
When you do your select, try this:
SELECT DISTINCT CONVERT(INT,ID)
FROM your_table
WHERE ...
OR
SELECT CONVERT(INT,ID)
FROM your_table
WHERE ...
GROUP BY CONVERT(INT,ID)
This temporarily converts all the IDs to INT, so that when the DISTINCT evaluates duplicates everything is uniform and you get an accurate picture of the duplicates.
If you don't want to convert them in place, you could convert them, insert them into a temporary table, and add a flag for the ones that have a leading zero. Or convert them and then prepend a zero after you delete the duplicates, since you want that anyway. It is easy to prepend a 0.
The query below will give you the list of duplicates with the same Name and user type:
SELECT tps_user.tps_title AS [Name] ,
tps_user_type.tps_title AS [UserType],
COUNT(*) Duplicate_Count
FROM tps_user
INNER JOIN tps_user_type
ON tps_user.tps_user_type_guid = tps_user_type.tps_guid
group by tps_user.tps_title, tps_user_type.tps_title
having COUNT(*) > 1
order by Duplicate_Count desc
Select t1.stringId
from mytable t1
inner join mytable t2 on Convert(INT, t1.intId) = CONVERT(INT, t2.intId)
and t1.stringId <> t2.stringId
where t1.stringId not like '0%'
This should list all the IDs that have a duplicate and do not start with 0. The extra condition on stringId stops each row from matching itself.
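Here is a small sketch of the same leading-zero match, written with Python's built-in sqlite3 module instead of SQL Server, so CAST(... AS INTEGER) stands in for CONVERT(INT, ...). The staff table, its staff_id column, and the sample IDs other than 1234 and 01234 are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; staff_id is stored as text so leading zeros survive.
conn.execute("CREATE TABLE staff (staff_id TEXT)")
conn.executemany(
    "INSERT INTO staff VALUES (?)",
    [("1234",), ("01234",), ("5678",), ("0999",)],
)

# Pair up rows whose IDs are numerically equal but textually different,
# and report the member of each pair that lacks the leading zero.
query = """
SELECT s1.staff_id
FROM staff AS s1
JOIN staff AS s2
  ON CAST(s1.staff_id AS INTEGER) = CAST(s2.staff_id AS INTEGER)
 AND s1.staff_id <> s2.staff_id
WHERE s1.staff_id NOT LIKE '0%'
"""
to_delete = [row[0] for row in conn.execute(query)]
print(to_delete)
```

Only 1234 is reported: 5678 has no zero-prefixed twin, and 0999 is excluded because it already starts with 0.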

Find incorrect records by Id

I am trying to find records where the personID is associated to the incorrect SoundFile(String). I am trying to search for incorrect records among all personID's, not just one specific one. Here are my example tables:
TASKS-
PersonID SoundFile(String)
123 D10285.18001231234.mp3
123 D10236.18001231234.mp3
123 D10237.18001231234.mp3
123 D10212.18001231234.mp3
123 D12415.18001231234.mp3
**126 D19542.18001231234.mp3
126 D10235.18001234567.mp3
126 D19955.18001234567.mp3
RECORDINGS-
PhoneNumber(Distinct Records)
18001231234
18001234567
So in this example, I am trying to find all records like the one that I marked with **. The majority of the soundfiles like '%18001231234%' are associated to PersonID 123, but this one record is PersonID 126. I need to find all records where, for each distinct number from the Recordings table, the PersonID is not the majority one.
Let me know if you need more information!
Thanks in advance!!
; WITH distinctRecordings AS (
SELECT DISTINCT PhoneNumber
FROM Recordings
),
PersonCounts as (
SELECT t.PersonID, dr.PhoneNumber, COUNT(*) AS num
FROM
Tasks t
JOIN distinctRecordings dr
ON t.SoundFile LIKE '%' + dr.PhoneNumber + '%'
GROUP BY t.PersonID, dr.PhoneNumber
)
SELECT t.PersonID, t.SoundFile
FROM PersonCounts pc1
JOIN PersonCounts pc2
ON pc2.PhoneNumber = pc1.PhoneNumber
AND pc2.PersonID <> pc1.PersonID
AND pc2.Num < pc1.Num
JOIN Tasks t
ON t.PersonID = pc2.PersonID
AND t.SoundFile LIKE '%' + pc2.PhoneNumber + '%'
To summarize what this does... the first CTE, distinctRecordings, is just a distinct list of the Phone Numbers in Recordings.
Next, PersonCounts is a count of phone numbers associated with the records in Tasks for each PersonID.
This is then joined to itself to find any duplicates, and selects whichever duplicate has the smaller count... this is then joined back to Tasks to get the offending soundFile for that person / phone number.
(If your schema had some minor improvements made to it, this query would have been much simpler...)
Here you go: this returns all pairs (PersonID, PhoneNumber) where the person has fewer entries with the given phone number than the person with the most entries. Note that the query doesn't cater for multiple persons tied for the maximum within a group.
select agg.pid
, agg.PhoneNumber
from (
select MAX(c) KEEP ( DENSE_RANK FIRST ORDER BY c DESC ) OVER ( PARTITION BY rt.PhoneNumber ) cmax
, rt.PhoneNumber
, rt.PersonID pid
, rt.c
from (
select r.PhoneNumber
, t.PersonID
, count(*) c
from recordings r
inner join tasks t on ( r.PhoneNumber = regexp_replace(t.SoundFile, '^[^.]+\.([^.]+)\.[^.]+$', '\1' ) )
group by r.PhoneNumber
, t.PersonID
) rt
) agg
where agg.c < agg.cmax
;
Caveat: the solution is in Oracle syntax, though the operations should be in the current SQL standard (possibly apart from regexp_replace, which might not matter too much since your sound file data seems to follow a fixed-position structure).
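For a version that runs without SQL Server or Oracle, here is a minimal sketch using Python's built-in sqlite3 module and the sample data from the question. It compares each (PersonID, PhoneNumber) count against the maximum count for that phone number, which is a slight variation on the self-join above, and it uses SQLite's || for string concatenation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tasks (PersonID INTEGER, SoundFile TEXT)")
conn.execute("CREATE TABLE Recordings (PhoneNumber TEXT)")
conn.executemany(
    "INSERT INTO Tasks VALUES (?, ?)",
    [
        (123, "D10285.18001231234.mp3"),
        (123, "D10236.18001231234.mp3"),
        (123, "D10237.18001231234.mp3"),
        (123, "D10212.18001231234.mp3"),
        (123, "D12415.18001231234.mp3"),
        (126, "D19542.18001231234.mp3"),
        (126, "D10235.18001234567.mp3"),
        (126, "D19955.18001234567.mp3"),
    ],
)
conn.executemany(
    "INSERT INTO Recordings VALUES (?)",
    [("18001231234",), ("18001234567",)],
)

# Count sound files per (person, phone), then return the files belonging to
# any person whose count is below the highest count for that phone number.
query = """
WITH person_counts AS (
    SELECT t.PersonID, r.PhoneNumber, COUNT(*) AS num
    FROM Tasks AS t
    JOIN Recordings AS r
      ON t.SoundFile LIKE '%' || r.PhoneNumber || '%'
    GROUP BY t.PersonID, r.PhoneNumber
)
SELECT t.PersonID, t.SoundFile
FROM person_counts AS pc
JOIN Tasks AS t
  ON t.PersonID = pc.PersonID
 AND t.SoundFile LIKE '%' || pc.PhoneNumber || '%'
WHERE pc.num < (
    SELECT MAX(pc2.num)
    FROM person_counts AS pc2
    WHERE pc2.PhoneNumber = pc.PhoneNumber
)
ORDER BY t.PersonID, t.SoundFile
"""
minority = conn.execute(query).fetchall()
print(minority)
```

As with the Oracle query, persons tied for the highest count on a phone number are all treated as the majority and none of their rows are returned.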

SQL Count the numbers in a returned query?

I have a query to retrieve the email addresses and the names (sometimes more than 1) associated with the email address, where the account status is not closed, in descending order:
SELECT sbg.contact_email, COUNT(DISTINCT sbg.contact_name) Num_Contact_Names
FROM SummaryBillGroup sbg
INNER JOIN Account a
ON sbg.Customer_number = a.Customer_number
WHERE a.account_status_code <> 'c'
GROUP BY sbg.contact_email
ORDER BY Num_Contact_Names DESC
This returns a list of the email addresses and the number of names associated with each email address. What I would like to do now is use that query to count up all of the returned numbers, so that I have a list of the 3's, the 2's, the 1's, etc.
You could use that same query as a derived table for another one. Like this:
select num_contact_names, count(*)
from (SELECT sbg.contact_email, COUNT(DISTINCT sbg.contact_name) Num_Contact_Names
FROM SummaryBillGroup sbg
INNER JOIN Account a
ON sbg.Customer_number = a.Customer_number
WHERE a.account_status_code <> 'c'
GROUP BY sbg.contact_email) as t
group by t.num_contact_names
order by t.num_contact_names
First row would give you 1's, second row the 2's and so on. Cheers.
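Here is a minimal sketch of the derived-table idea using Python's built-in sqlite3 module. The table and column names come from the question; the rows and email addresses are invented purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: customer 3 is closed ('c') and must be ignored.
conn.executescript("""
CREATE TABLE Account (Customer_number INTEGER, account_status_code TEXT);
CREATE TABLE SummaryBillGroup (
    Customer_number INTEGER, contact_email TEXT, contact_name TEXT
);
INSERT INTO Account VALUES (1, 'a'), (2, 'a'), (3, 'c');
INSERT INTO SummaryBillGroup VALUES
    (1, 'x@example.com', 'Ann'),
    (1, 'x@example.com', 'Bob'),
    (2, 'y@example.com', 'Cy'),
    (2, 'z@example.com', 'Di'),
    (3, 'w@example.com', 'Ed');
""")

# Inner query: names per email. Outer query: how many emails have each count.
query = """
SELECT t.Num_Contact_Names, COUNT(*) AS emails
FROM (
    SELECT sbg.contact_email,
           COUNT(DISTINCT sbg.contact_name) AS Num_Contact_Names
    FROM SummaryBillGroup AS sbg
    JOIN Account AS a
      ON sbg.Customer_number = a.Customer_number
    WHERE a.account_status_code <> 'c'
    GROUP BY sbg.contact_email
) AS t
GROUP BY t.Num_Contact_Names
ORDER BY t.Num_Contact_Names
"""
histogram = conn.execute(query).fetchall()
print(histogram)
```

Each result row pairs a number of contact names with how many email addresses have exactly that many names.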