Search for duplicate accounts - sql

I've been beating my head for two days and I finally concede that I need help doing this seemingly simple task.
I'm trying to find a way to produce a list of possible duplicate accounts in a single table. Fields are similar to AccountNumber, FirstName, LastName, DOB, SSN, Address, City, State, Zip.
I need to find a way to query the DB and find accounts that have different AccountNumbers but similar names/DOB/etc that are likely the same person.
Any help would be greatly appreciated.
Thanks!

select distinct t1.AccountNumber
from YourTable t1
join YourTable t2 on t2.FirstName = t1.FirstName and t2.LastName = t1.LastName
and t2.DOB = t1.DOB
and t2.AccountNumber <> t1.AccountNumber
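To try the self-join idea without touching a real database, here is a minimal sketch using Python's built-in sqlite3 module. The table name Accounts and the sample rows are made up for illustration, and comparing names case-insensitively with lower() is my own assumption about what "similar" should mean.

```python
import sqlite3

# Hypothetical in-memory table; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Accounts (
        AccountNumber INTEGER PRIMARY KEY,
        FirstName TEXT, LastName TEXT, DOB TEXT
    )
""")
conn.executemany(
    "INSERT INTO Accounts VALUES (?, ?, ?, ?)",
    [
        (1, "Ann", "Lee", "1980-01-02"),
        (2, "ann", "LEE", "1980-01-02"),  # same person, different casing
        (3, "Bob", "Ray", "1975-05-06"),
    ],
)

# Self-join: pair each account with every higher-numbered account that has
# the same name (ignoring case) and the same date of birth.
duplicate_pairs = conn.execute("""
    SELECT t1.AccountNumber, t2.AccountNumber
    FROM Accounts t1
    JOIN Accounts t2
      ON  lower(t2.FirstName) = lower(t1.FirstName)
      AND lower(t2.LastName)  = lower(t1.LastName)
      AND t2.DOB = t1.DOB
      AND t2.AccountNumber > t1.AccountNumber
    ORDER BY t1.AccountNumber, t2.AccountNumber
""").fetchall()

print(duplicate_pairs)
```

Using > instead of <> in the join reports each suspected pair once rather than twice.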

Look at using common table expressions (CTEs). Here's an article on them, which covers recursive queries: http://technet.microsoft.com/en-us/library/ms186243(v=sql.105).aspx
Basically, a CTE gives you a named, temporary result set that the rest of the statement can reference like a table. This lets you reuse one subselect in several places without writing it out each time.
Your query would end up looking something like the following.
With accounts as (select a, b, c from table)
Select a, b, c
From table tbl
Where exists (select 1 from accounts act where act.a like '%' + tbl.a + '%') ...
And so on for the other columns. For more information on how to check whether one column is similar to another, see "Compare Columns Where One is Similar to Part of Another".
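A rough, runnable version of the CTE plus LIKE idea, using Python's sqlite3 module. Everything here is an assumption for illustration: the table name Account, the CTE name candidates, the choice of the Address column, and the sample rows. SQLite concatenates strings with || where SQL Server uses +.

```python
import sqlite3

# Hypothetical table and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Account (AccountNumber INTEGER PRIMARY KEY, Address TEXT)")
conn.executemany(
    "INSERT INTO Account VALUES (?, ?)",
    [(1, "12 Main St"), (2, "12 Main St Apt 4"), (3, "9 Oak Ave")],
)

# List accounts whose address appears inside some other account's address.
contained_matches = conn.execute("""
    WITH candidates AS (
        SELECT AccountNumber, Address FROM Account
    )
    SELECT tbl.AccountNumber
    FROM Account tbl
    WHERE EXISTS (
        SELECT 1
        FROM candidates c
        WHERE c.AccountNumber <> tbl.AccountNumber
          AND c.Address LIKE '%' || tbl.Address || '%'
    )
    ORDER BY tbl.AccountNumber
""").fetchall()

print(contained_matches)
```

Note the match is one-directional: only the account with the shorter address is listed, because the longer address is not contained in the shorter one. Substring matching like this also tends to produce false positives on short values, so treat the result as a list of candidates to review.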

I got a query from our software vendor that did exactly what I needed it to do. Their solution is kind of a combination of the two suggestions above. It creates a temp table to put accounts that have more than one firstname lastname matching, then rates the likelihood that they are duplicates by checking for matching address, state, city. It also notes the last time each account had been used.
I don't fully understand the syntax, but that's ok, it gets the job done.
Thanks for the help!
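For anyone landing here later: I can't post the vendor's query, but the approach described above can be sketched roughly like this with Python's sqlite3 module. This is not the vendor's code. The table name Accounts, the temp table name RepeatedNames, the one-point-per-matching-field scoring, and the sample rows are all assumptions of mine, and the last-used date is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table and rows, for illustration only.
    CREATE TABLE Accounts (
        AccountNumber INTEGER PRIMARY KEY,
        FirstName TEXT, LastName TEXT,
        Address TEXT, City TEXT, State TEXT
    );
    INSERT INTO Accounts VALUES
        (1, 'Ann', 'Lee', '12 Main St', 'Salem', 'OR'),
        (2, 'Ann', 'Lee', '12 Main St', 'Salem', 'OR'),
        (3, 'Ann', 'Lee', '77 Pine Rd', 'Bend',  'OR'),
        (4, 'Bob', 'Ray', '9 Oak Ave',  'Salem', 'OR');

    -- Step 1: names that occur on more than one account.
    CREATE TEMP TABLE RepeatedNames AS
        SELECT FirstName, LastName
        FROM Accounts
        GROUP BY FirstName, LastName
        HAVING COUNT(*) > 1;
""")

# Step 2: pair up accounts that share a repeated name and add one point for
# each further field that matches (a comparison yields 1 or 0 in SQLite).
scored_pairs = conn.execute("""
    SELECT a.AccountNumber, b.AccountNumber,
           (a.Address = b.Address) + (a.City = b.City) + (a.State = b.State) AS score
    FROM RepeatedNames n
    JOIN Accounts a ON a.FirstName = n.FirstName AND a.LastName = n.LastName
    JOIN Accounts b ON b.FirstName = n.FirstName AND b.LastName = n.LastName
                   AND b.AccountNumber > a.AccountNumber
    ORDER BY score DESC, a.AccountNumber, b.AccountNumber
""").fetchall()

print(scored_pairs)
```

A higher score means more fields agree. If any compared field is NULL the score for that pair becomes NULL, so real data would need COALESCE or similar handling.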

Related

New to SQL and need some advice with queries

I have recently started learning SQL but can't seem to get my head around creating SQL statements that form relevant results from multiple tables/relations.
Given the following schema:
Account(accNumber, balance, type)
Branch(BSB, phone, streetAddress, town)
registered(accNumber*, BSB*)
I am trying to formulate some outputs:
List all the accNumber registered with a specific BSB (123) and show its listed town (Sydney).
I have tried the following statement for the first query:
SELECT accNumber, BSB, town
FROM ACCOUNT, BRANCH
WHERE BSB = 123;
However, I get every account listed even if they don't belong to the BSB, so I tried:
SELECT accNumber, BSB, town
FROM ACCOUNT, BRANCH
WHERE BSB = 123
AND Town = 'Sydney'
AND account.accNumber=registered.accNumber
AND branch.bsb=registered.bsb;
This time I get column ambiguously defined because they have the same name in the "registered" table.
I've tried making aliases in the select statement, e.g. accNumber AS ACCOUNT_NUMBER, but I'm still getting ambiguously defined errors.
I tried just listing what was in the registered table but then I do not get the town name, just the accNumber and the BSB passed in as a foreign key.
I can't seem to understand how to pull data from other tables and display them correctly and would greatly appreciate any advice!
This might help you start.
SELECT a.accNumber, b.BSB, c.town
FROM ACCOUNT as a
inner join registered as b on b.accNumber=a.accNumber
inner join BRANCH as c on c.bsb = b.bsb
WHERE b.BSB = 123
AND c.Town = 'Sydney'
So this sounds like a generic SQL question. For your query here is what you're looking at:
select a.accNumber
from account a, branch b, registered r
where a.accNumber = r.accNumber and
r.bsb = b.bsb and
b.bsb = 123;
This will get you all account numbers from the account table that are registered with BSB 123. When several tables have a column with the same name, you need to tell Oracle (or any database, for that matter) which accNumber column you mean by prefixing it with the table name or alias; otherwise the reference is ambiguous, because more than one table contains that column.
SQL is about tables and joins. You sometimes have to join several tables to get what you need, as above. If you don't join tables, as you did originally, you'll get a "cross product", which is not what you want.
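To see the difference between a cross product and a proper join, here is a small sketch you can run with Python's sqlite3 module. The schema follows the question; the sample rows (phone numbers, street addresses, the Melbourne branch) are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Account (accNumber INTEGER PRIMARY KEY, balance REAL, type TEXT);
    CREATE TABLE Branch (BSB INTEGER PRIMARY KEY, phone TEXT,
                         streetAddress TEXT, town TEXT);
    CREATE TABLE registered (accNumber INTEGER REFERENCES Account(accNumber),
                             BSB INTEGER REFERENCES Branch(BSB));
    -- Made-up sample data.
    INSERT INTO Account VALUES (1, 100.0, 'savings'), (2, 250.0, 'cheque'),
                               (3, 75.0, 'savings');
    INSERT INTO Branch VALUES (123, '555-0100', '1 George St', 'Sydney'),
                              (456, '555-0200', '2 Collins St', 'Melbourne');
    INSERT INTO registered VALUES (1, 123), (2, 456), (3, 123);
""")

# No join condition: every account gets paired with branch 123,
# whether or not it is registered there.
cross_product = conn.execute("""
    SELECT a.accNumber, b.BSB, b.town
    FROM Account a, Branch b
    WHERE b.BSB = 123
    ORDER BY a.accNumber
""").fetchall()

# Joined through the registered table: only accounts that belong to 123.
joined = conn.execute("""
    SELECT a.accNumber, b.BSB, b.town
    FROM Account a
    JOIN registered r ON r.accNumber = a.accNumber
    JOIN Branch b ON b.BSB = r.BSB
    WHERE b.BSB = 123
    ORDER BY a.accNumber
""").fetchall()

print(cross_product)
print(joined)
```

The first query returns all three accounts, including account 2, which is registered elsewhere. The second returns only accounts 1 and 3.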
I know this is very "light", but hopefully from the above answer to your question, you'll get some idea of how to do this.
I'd be happy to help you more if you have questions. Everyone is new to SQL at some point. Don't feel bad about that. It takes practice and then it becomes much easier.
-Jim

Is there a way to select automatically the row pointed by an FK on a given table?

Today, while writing one of the many queries that every developer in my company writes every day, I stumbled upon a question.
The DBMS we are using is SQL Server 2008.
Say for example I write a query like this in the usual PERSON - DEPARTMENT db example
select * from person where id = '01'
And this query returns one row:
id name fk_department
01 Joe dp_01
The question is: is there a way (maybe using an addon) to make sql server write and execute a select like this
select * from department where id = 'dp_01'
just by, for example, clicking with the mouse on the cell containing the FK value (dp_01 in the example query)? Or by right-clicking and selecting something like "Go to pointed value"?
I hope I didn't write something stupid or impossible by definition.
Not really, but that seems like a silly thing to do. Why would you want to confuse an id with a department name?
Instead, you could arrange things so you could do:
select p.*
from person p
where department = 'dp_01';
You would do this by adding a computed column department that references a scalar function that looks up the value in the department table. You can read about computed columns here.
However, a computed column would have bad performance characteristics. In particular, it would basically require a full table scan on the person table, even if that is not appropriate.
Another solution is to create a view, v_person that has the additional columns you want. Then you would do:
select p.*
from v_person p
where department = 'dp_01';
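A minimal sketch of the view approach, run through Python's sqlite3 module as a stand-in for SQL Server. The department table's name column, the department_name column in the view, and the sample rows are assumptions, since the question only shows the person table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema and rows based on the question's example.
    CREATE TABLE department (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE person (id TEXT PRIMARY KEY, name TEXT,
                         fk_department TEXT REFERENCES department(id));
    INSERT INTO department VALUES ('dp_01', 'Accounting'), ('dp_02', 'Sales');
    INSERT INTO person VALUES ('01', 'Joe', 'dp_01'), ('02', 'Sue', 'dp_02');

    -- The view exposes the FK under a friendlier name and pulls in the
    -- matching department row's name alongside it.
    CREATE VIEW v_person AS
        SELECT p.id, p.name,
               p.fk_department AS department,
               d.name AS department_name
        FROM person p
        LEFT JOIN department d ON d.id = p.fk_department;
""")

view_rows = conn.execute("""
    SELECT p.id, p.name, p.department, p.department_name
    FROM v_person p
    WHERE department = 'dp_01'
""").fetchall()

print(view_rows)
```

The LEFT JOIN keeps people whose fk_department is empty or points nowhere; an inner join would hide them.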
Why can't you write it yourself, like this:
select * from department where id =
(select fk_department from person where id = '01')
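Here is that subquery next to the equivalent join, as a quick sketch with Python's sqlite3 module. The department name column and the sample rows are assumed for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema and rows based on the question's example.
    CREATE TABLE department (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE person (id TEXT PRIMARY KEY, name TEXT, fk_department TEXT);
    INSERT INTO department VALUES ('dp_01', 'Accounting'), ('dp_02', 'Sales');
    INSERT INTO person VALUES ('01', 'Joe', 'dp_01');
""")

# Scalar subquery: look up the FK value first, then fetch the department row.
dept_via_subquery = conn.execute("""
    SELECT * FROM department
    WHERE id = (SELECT fk_department FROM person WHERE id = '01')
""").fetchall()

# The same lookup written as a join.
dept_via_join = conn.execute("""
    SELECT d.*
    FROM person p
    JOIN department d ON d.id = p.fk_department
    WHERE p.id = '01'
""").fetchall()

print(dept_via_subquery, dept_via_join)
```

Both forms return the same department row here. The = form relies on the subquery returning a single value, which holds when person.id is unique.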

(SQL) Using SELECT statements to display data with odd requirements

So I'm taking a course on learning basic SQL (using Oracle), and I felt like I had become fairly fluent with using SELECT statements (grouping, joining, having, etc), but now I'm at a loss on how to deal with this latest problem.
I need to write a statement that would only display rows with more than one piece of data. So, say I had
COMPANY PRODUCT
One Car
One Book
Two Game
it should only list company 'One'. But I can't find anything online to help me.
Select Company
From YourTableName
Group By Company
Having Count(*) > 1
A better way, which also shows the count for each company, is:
Select Company, Count(*)
From YourTableName
Group By Company
Having Count(*) > 1
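The question's sample data can be checked quickly with Python's sqlite3 module. The table name CompanyProducts is a placeholder I picked; the rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CompanyProducts (Company TEXT, Product TEXT)")
conn.executemany(
    "INSERT INTO CompanyProducts VALUES (?, ?)",
    [("One", "Car"), ("One", "Book"), ("Two", "Game")],
)

# GROUP BY collapses the rows per company; HAVING then filters the groups,
# keeping only companies with more than one row.
companies = conn.execute("""
    SELECT Company, COUNT(*)
    FROM CompanyProducts
    GROUP BY Company
    HAVING COUNT(*) > 1
""").fetchall()

print(companies)
```

HAVING is needed rather than WHERE because the filter applies to the aggregated groups, not to individual rows.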

SQL WHERE <from another table>

Say you have these tables:
PHARMACY(**___id_pharmacy___**, name, addr, tel)
PHARMACIST(**___Insurance_number___**, name, surname, qualification, **id_pharmacy**)
SELLS(**___id_pharmacy___**, **___name___**, price)
DRUG(**___Name___**, chem_formula, **id_druggistshop**)
DRUGGISTSHOP(**___id_druggistshop___**, name, address)
I think this will be more specific.
So, I'm trying to construct an SQL statement that fetches the id_pharmacy and name columns from PHARMACY, and the insurance_number, name, and surname columns from PHARMACIST, for all the pharmacies that sell the drug called Kronol.
And that's basically it. I know I'm missing the relationships in the code I wrote previously.
Note: Column names with underscores to the left and right of them are underlined (primary keys).
The query you've written won't work in any DBMS that I know of.
You'll most likely want to use some combination of JOINs.
Since the exact schema isn't provided, consider this pseudo code, but hopefully it will get you on the right track.
SELECT PH.Ph_Number, PH.Name, PHCL.Ins_Number, PHCL.Name, PHCL.Surname
FROM PH
INNER JOIN PHCL ON PHCL.PH_Number = PH.Ph_Number
INNER JOIN MLIST ON MLIST.PH_Number = PH.PH_Number
WHERE MLIST.Name = 'Andy'
I've obviously assumed some relationships between tables that may or may not exist, but hopefully this will be pretty close. The UNION operator won't work because you're selecting different columns and a different number of columns from the various tables. This is the wrong approach altogether for what you're trying to do. It's also worth mentioning that a LEFT JOIN may or may not be a better option for you, depending on the exact requirements you're trying to meet.
Ok, try this query:
SELECT A.id_pharmacy, A.name AS PharmacyName, B.Insurance_number,
B.name AS PharmacistName, B.surname AS PharmacistSurname
FROM PHARMACY A
LEFT JOIN PHARMACIST B
ON A.id_pharmacy = B.id_pharmacy
WHERE A.id_pharmacy IN (SELECT id_pharmacy FROM SELLS WHERE name = 'Kronol')
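If you want to see what that query returns, here is a small test setup with Python's sqlite3 module. The table and column names come from the question; every row of sample data is invented, including a pharmacy with no pharmacist to show the effect of the LEFT JOIN. An ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PHARMACY (id_pharmacy INTEGER PRIMARY KEY, name TEXT,
                           addr TEXT, tel TEXT);
    CREATE TABLE PHARMACIST (Insurance_number TEXT PRIMARY KEY, name TEXT,
                             surname TEXT, qualification TEXT,
                             id_pharmacy INTEGER);
    CREATE TABLE SELLS (id_pharmacy INTEGER, name TEXT, price REAL,
                        PRIMARY KEY (id_pharmacy, name));
    -- Made-up sample data.
    INSERT INTO PHARMACY VALUES (1, 'Central', '1 High St', '555-0101'),
                                (2, 'Corner',  '2 Low St',  '555-0102'),
                                (3, 'Harbour', '3 Quay Rd', '555-0103');
    INSERT INTO PHARMACIST VALUES ('INS1', 'Maria', 'Papas',  'BPharm', 1),
                                  ('INS2', 'Nick',  'Kostas', 'BPharm', 2);
    INSERT INTO SELLS VALUES (1, 'Kronol', 4.5), (2, 'Aspirin', 2.0),
                             (3, 'Kronol', 5.0);
""")

kronol_sellers = conn.execute("""
    SELECT A.id_pharmacy, A.name AS PharmacyName, B.Insurance_number,
           B.name AS PharmacistName, B.surname AS PharmacistSurname
    FROM PHARMACY A
    LEFT JOIN PHARMACIST B
      ON A.id_pharmacy = B.id_pharmacy
    WHERE A.id_pharmacy IN (SELECT id_pharmacy FROM SELLS WHERE name = 'Kronol')
    ORDER BY A.id_pharmacy
""").fetchall()

print(kronol_sellers)
```

Pharmacy 3 sells Kronol but has no pharmacist, so it still appears, with NULLs (None in Python) in the pharmacist columns. With an INNER JOIN it would be dropped.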

SQL query to get data from one table based upon a column from another table?

In my tables I have, for example,
CountyID, County and CityID in the county table, and in the city table I have
CityID and City.
How do I create a report that pulls the County from the county table and pulls the City based upon the CityID in the county table?
Thanks
Since this is quite a basic question, I'll give you a basic answer instead of the code to do it for you.
Where tables have columns that "match" each other, you can join them together on what they have in common, and query the result almost as if it was one table.
There are also different types of join based on what you want - for example it might be that some rows in one of the tables you're joining together don't have a corresponding match.
If you're sure that a city will definitely have a corresponding county, try inner joining the two tables on their matching column CityID and querying the result.
The obvious common link between both tables is CityID, so you'd be joining on that. I think you have the data organized wrong though: I'd put CountyID in the City table rather than CityID in the County table. Then, based on the CountyID selected, you can limit your query of the City table.
To follow on from Bridge's answer: you are obviously new to SQL, and there are many places to dig up how to write queries. However, the most fundamental habits to train yourself in are to always qualify columns with the table name or alias to prevent ambiguity, and to avoid column names that might be reserved words in the language... they always seem to bite people.
That said, the most basic of queries is
select
T1.field1,
T1.field2
-- etc., with more fields you want
from
FirstTable as T1
where
(some conditional criteria)
order by
(some column or columns)
Now, when dealing with multiple tables, you need JOINs... INNER and LEFT are the most common. INNER means a row MUST have a match in both tables. LEFT means every row from the table on the left side is returned regardless of whether there is a match on the right; where there is none, the right-hand columns come back as NULL... ex:
select
T1.Field1,
T2.SomeField,
T3.MaybeExistsField
from
SomeTable T1
Join SecondTable T2
on T1.SomeKey = T2.MatchingColumnInSecondTable
LEFT JOIN ThirdTable T3
on T1.AnotherKey = T3.ColumnThatMayHaveTheMatchingKey
order by
T2.SomeField DESC,
T1.Field1
From these examples, you should easily be able to incorporate your tables and their relationships to each other into your results...
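Applied to the county/city tables from the question, the INNER versus LEFT difference can be sketched like this with Python's sqlite3 module. The column names follow the question; the sample rows are invented, including a county with no city assigned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE City (CityID INTEGER PRIMARY KEY, City TEXT);
    CREATE TABLE County (CountyID INTEGER PRIMARY KEY, County TEXT,
                         CityID INTEGER REFERENCES City(CityID));
    -- Made-up sample data; Essex has no city assigned.
    INSERT INTO City VALUES (10, 'Dover');
    INSERT INTO County VALUES (1, 'Kent', 10), (2, 'Essex', NULL);
""")

# INNER JOIN: only counties whose CityID finds a match in City.
inner_rows = conn.execute("""
    SELECT co.County, ci.City
    FROM County co
    INNER JOIN City ci ON ci.CityID = co.CityID
    ORDER BY co.CountyID
""").fetchall()

# LEFT JOIN: every county, with NULL for City where there is no match.
left_rows = conn.execute("""
    SELECT co.County, ci.City
    FROM County co
    LEFT JOIN City ci ON ci.CityID = co.CityID
    ORDER BY co.CountyID
""").fetchall()

print(inner_rows)
print(left_rows)
```

Which one you want for the report depends on whether counties without a matching city should still be listed.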