SQL Query to Obtain the Oldest People

I am trying to find the oldest customers in my database. I want just their full names and their ages, but my current results are outputting all customers and their ages (not just the oldest). What am I doing wrong here?
SELECT
LTRIM(CONCAT(' ' + Prefix, ' ' + FirstName,
' ' + MiddleName, ' ' + LastName, ', ' + Suffix)),
MAX(DATEDIFF(year, BirthDate, GETDATE()))
FROM
Customers
WHERE
BirthDate is not null
GROUP BY
Prefix, FirstName, MiddleName, LastName, Suffix
ORDER BY
MAX(DATEDIFF(year, e.BirthDate, GETDATE())) desc
Note that there seem to be multiple customers with the same oldest age.

You have not defined what you mean by "oldest customers", so here are a few options you could try.
To see a list of customers with the oldest on top, use a simple query like this:
SELECT FirstName, LastName, Suffix, BirthDate
FROM Customers
WHERE BirthDate is not null
ORDER BY BirthDate asc
To restrict the result to a number of rows, for example the 10 oldest, use top 10:
SELECT top 10
FirstName, LastName, Suffix, BirthDate
FROM Customers
WHERE BirthDate is not null
ORDER BY BirthDate asc
To restrict the result to all customers born before a certain date, add a condition to the where clause:
SELECT FirstName, LastName, Suffix, BirthDate
FROM Customers
WHERE BirthDate is not null
and BirthDate < '19920101'
ORDER BY BirthDate asc

The first thing you need to do before you do anything else is define a unique numeric primary key on the Customers table.
ALTER TABLE Customers ADD Cust_Id int IDENTITY(1,1);
ALTER TABLE Customers ADD CONSTRAINT PK_Customers PRIMARY KEY (Cust_Id);
After you've done that, the following code will give you the "oldest customer (or customers) in your database".
With qry1 As (
SELECT Cust_Id,
DATEDIFF(year, BirthDate, GETDATE()) As Age
FROM Customers
WHERE BirthDate is not null
),
qry2 As (
SELECT Max(Age) As Max_Age
FROM qry1
)
SELECT Customers.Cust_Id,
Customers.Prefix,
Customers.FirstName,
Customers.MiddleName,
Customers.LastName,
Customers.Suffix,
Qry1.Age
FROM Customers
Inner Join Qry1 On Customers.Cust_Id = Qry1.Cust_Id
Inner Join Qry2 On Qry1.Age = Qry2.Max_Age
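To try the tie-handling idea above without a SQL Server instance, here is a minimal sketch in Python using the standard-library sqlite3 module. The sample rows, the fixed AS_OF reference date and the strftime-based year difference (standing in for DATEDIFF(year, ...)) are assumptions for illustration, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers ("
    "Cust_Id INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT, BirthDate TEXT)"
)
# Hypothetical sample data: two customers share the earliest birth year.
conn.executemany(
    "INSERT INTO Customers (FirstName, LastName, BirthDate) VALUES (?, ?, ?)",
    [
        ("Ann", "Lee", "1930-03-01"),
        ("Bob", "Ray", "1930-11-20"),
        ("Cy", "Fox", "1975-06-15"),
        ("Dee", "Orr", None),  # unknown birth date, excluded by the filter
    ],
)

AS_OF = "2020-01-01"  # fixed reference date so the result is reproducible

query = """
WITH ages AS (
    -- Year-boundary difference, similar in spirit to DATEDIFF(year, ...)
    SELECT Cust_Id,
           CAST(strftime('%Y', ?) AS INTEGER)
             - CAST(strftime('%Y', BirthDate) AS INTEGER) AS Age
    FROM Customers
    WHERE BirthDate IS NOT NULL
)
SELECT c.FirstName, c.LastName, a.Age
FROM Customers AS c
JOIN ages AS a ON a.Cust_Id = c.Cust_Id
WHERE a.Age = (SELECT MAX(Age) FROM ages)  -- keeps every customer tied for oldest
ORDER BY c.Cust_Id
"""
oldest = conn.execute(query, (AS_OF,)).fetchall()
print(oldest)
```

Because the filter compares each age with the maximum rather than taking a fixed number of rows, all customers tied for the oldest age are returned.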

Related

Trouble with COUNT

I have two tables, PATIENT and VISIT. One with PatientID as the primary key and one with VisitID as the primary key. I need to select the first name and last name of the patients that have visited the hospital more than twice.
I have tried DISTINCT, a nested where clause, INNER JOIN, etc.
SELECT FirstName
, LastName
, PatientID
, COUNT(*) AS total_visits
FROM VISIT
WHERE total_visits > 2;
It should just show the first and last name of the patients that have more than two occurrences in the VISIT table, but no matter how I rearrange the code it doesn't work.
Following on from Gordon's answer and your comment, I presume that PatientID in VISIT is a foreign key to the PATIENT table, so you will need to use an INNER JOIN. Your query would look something like this:
SELECT FirstName, LastName, v.PatientID, COUNT(*) AS total_visits
FROM VISIT v
INNER JOIN PATIENT p ON p.PatientID = v.PatientID
GROUP BY FirstName, LastName, v.PatientID
HAVING COUNT(*) > 2;
Note that AFAIK in Access you cannot use the alias name in the HAVING clause. You need to repeat the COUNT(*) as is.
You need GROUP BY and HAVING:
SELECT FirstName, LastName, PatientID, COUNT(*) AS total_visits
FROM VISIT
GROUP BY FirstName, LastName, PatientID
HAVING total_visits > 2;
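The join plus GROUP BY and HAVING pattern can be tried with a small sketch like the one below. It uses Python's standard-library sqlite3 module rather than Access or SQL Server, and the table contents are made up for illustration. The aggregate is repeated in HAVING instead of using the alias, which is the more portable form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PATIENT (PatientID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
CREATE TABLE VISIT (
    VisitID INTEGER PRIMARY KEY,
    PatientID INTEGER REFERENCES PATIENT(PatientID)
);
-- Hypothetical sample data: 3, 2 and 4 visits respectively.
INSERT INTO PATIENT VALUES (1, 'Ada', 'Byron'), (2, 'Alan', 'Turing'), (3, 'Grace', 'Hopper');
INSERT INTO VISIT (PatientID) VALUES (1), (1), (1), (2), (2), (3), (3), (3), (3);
""")

frequent = conn.execute("""
    SELECT p.FirstName, p.LastName, COUNT(*) AS total_visits
    FROM VISIT AS v
    INNER JOIN PATIENT AS p ON p.PatientID = v.PatientID
    GROUP BY p.PatientID, p.FirstName, p.LastName
    HAVING COUNT(*) > 2          -- filter on the aggregate, not in WHERE
    ORDER BY p.PatientID
""").fetchall()
print(frequent)
```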

No column was specified for column 1 of 'T1' when using a sub-select with a group by

I have a working query:
SELECT
COUNT(*), ACCOUNT_ID
FROM
CDS_PLAYER
GROUP BY
ACCOUNT_ID
HAVING
COUNT(*) > 1
Output:
(No column name)    Account_ID
------------------------------
2                   12345
I'm trying to add names to these accounts (all from the same table) but with no luck. The only query that gets me close is:
SELECT
LASTNAME, FIRSTNAME, COUNT(ACCOUNT_ID) AS NUMBER
FROM
(SELECT
COUNT(*), ACCOUNT_ID
FROM
CDS_PLAYER
GROUP BY
ACCOUNT_ID
HAVING
COUNT(*) > 1) AS T1
GROUP BY
LASTNAME, FIRSTNAME, PLAYER_ID
But I get an error:
No column was specified for column 1 of 'T1'
Like I said VERY NEW AT THIS. My boss of 4 months wanted me to learn and so I'm self taught (books and google). Any help at all to get me where I need to be would be appreciated!
(I'm using Windows Server 2003 and SQL Server 2000)
The error message can be resolved as below
SELECT LASTNAME, FIRSTNAME, COUNT(ACCOUNT_ID) AS NUMBER
FROM
(SELECT COUNT(*) AS Total, ACCOUNT_ID FROM CDS_PLAYER GROUP BY ACCOUNT_ID HAVING
COUNT(*) > 1) AS T1
GROUP BY LASTNAME, FIRSTNAME, PLAYER_ID
The fix is to add AS Total after the COUNT(*), so that every column of the derived table T1 has a name.
Does this do what you want?
SELECT COUNT(*), ACCOUNT_ID, LASTNAME, FIRSTNAME, PLAYER_ID
FROM CDS_PLAYER
GROUP BY ACCOUNT_ID, LASTNAME, FIRSTNAME, PLAYER_ID
HAVING COUNT(*) > 1;
You should also update your version of SQL Server. It is like 15 years out of date and hasn't been supported in many years. You can download a free version of SQL Server Express from Microsoft.
You want to select LASTNAME and FIRSTNAME, but you haven't selected them in your subselect. You can only access fields that are in its result set.
Solution: add them to the subselect's SELECT list and GROUP BY clause, and give the COUNT(*) column an alias.
For example:
SELECT
LASTNAME, FIRSTNAME, COUNT(ACCOUNT_ID) AS NUMBER
FROM
(SELECT COUNT(*) AS Total, LASTNAME, FIRSTNAME, ACCOUNT_ID
FROM CDS_PLAYER
GROUP BY ACCOUNT_ID, LASTNAME, FIRSTNAME
HAVING COUNT(*) > 1) AS T1
GROUP BY
LASTNAME, FIRSTNAME
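One way to attach names to the shared accounts is to join the aliased derived table back to the base table. The sketch below is an assumption about what the asker is after, not a confirmed solution, and it runs on SQLite through Python's sqlite3 module with made-up rows rather than on SQL Server 2000.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CDS_PLAYER (
    PLAYER_ID INTEGER PRIMARY KEY, ACCOUNT_ID TEXT, LASTNAME TEXT, FIRSTNAME TEXT
);
-- Hypothetical sample data: account 12345 is shared by two players.
INSERT INTO CDS_PLAYER VALUES
    (1, '12345', 'Smith', 'Jo'),
    (2, '12345', 'Smith', 'Joanne'),
    (3, '67890', 'Jones', 'Al');
""")

shared = conn.execute("""
    SELECT p.LASTNAME, p.FIRSTNAME, p.ACCOUNT_ID, t1.Total
    FROM CDS_PLAYER AS p
    INNER JOIN (
        SELECT ACCOUNT_ID, COUNT(*) AS Total   -- every derived column is named
        FROM CDS_PLAYER
        GROUP BY ACCOUNT_ID
        HAVING COUNT(*) > 1
    ) AS t1 ON t1.ACCOUNT_ID = p.ACCOUNT_ID
    ORDER BY p.PLAYER_ID
""").fetchall()
print(shared)
```

The derived table finds the account IDs that occur more than once, and the join then returns one row per player on those accounts, so the names do not need to appear in the inner GROUP BY.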

Listing duplicated records using T SQL

I have a database that is used to record patient information for a small clinic. We use MS SQL Server 2008 as the backend. The patient table contains the following columns:
Id int identity(1,1),
FamilyName varchar(30),
FirstName varchar (20),
DOB datetime,
AddressLine1 varchar (50),
AddressLine2 varchar (50),
State varchar (20),
Postcode varchar (4),
NextOfKin varchar (20),
Homephone varchar (20),
Mobile varchar (20)
Occasionally the staff register a new patient, unaware that the patient already has a record in the system. We have ended up with several thousand duplicate records.
What I would like to do is present a list of patients who have duplicate records for the staff to merge during quiet time. We consider 2 records to be duplicates if they have exactly the same FamilyName, FirstName and DOB. What I am doing at the moment is using a subquery to return the records as follows:
SELECT FamilyName,
FirstName,
DOB,
AddressLine1,
AddressLine2,
State,
Postcode,
NextOfKin,
HomePhone,
Mobile
FROM
Patients AS p1
WHERE Id IN
(
SELECT Max(Id)
FROM Patients AS p2
GROUP BY
FamilyName,
FirstName,
DOB
HAVING COUNT(Id) > 1
)
This produces the result, but the performance is terrible. Is there a better way to do it? The only requirement is that I need to show all the fields in the Patients table, as the users of the system want to view all the details before deciding whether to merge the records or not.
This will output every row which has a duplicate, based on firstname and lastname
SELECT DISTINCT t1.*
FROM Table AS t1
INNER JOIN Table AS t2
ON t1.firstname = t2.firstname
AND t1.lastname = t2.lastname
AND t1.id <> t2.id
I suggest you build an index on the 3 fields you use to detect duplicates,
then try this query:
with Duplicates as
(
select FamilyName, FirstName, DOB
from Patients
group by FamilyName, FirstName, DOB
having count(*) > 1
)
Select Patients.*
from Patients
inner join Duplicates
on Patients.FamilyName = Duplicates.FamilyName
And Patients.FirstName= Duplicates.FirstName
and Patients.DOB= Duplicates.DOB
WITH CTE
AS
(
SELECT Id, FamilyName, FirstName, DOB,
ROW_NUMBER() OVER(PARTITION BY FamilyName, FirstName, DOB ORDER BY Id) AS DuplicateCount
FROM PatientTable
)
select * from CTE where DuplicateCount > 1
If I were in your shoes, I'd do the following:
Add indexes on FamilyName, FirstName and DOB.
Create a view for your subquery.
Modify the query as follows:
Select p.* FROM Patients p INNER JOIN view_name v ON v.FirstName=p.Firstname AND ...
select FamilyName, FirstName, DOB
from Patients
group by FamilyName, FirstName, DOB
having count(*)>1
This will show all duplicates.
However, please consider names that are written differently but are similar. You might want to look into the topics 'data deduplication' and 'record linkage'. I solved the problem using string similarity algorithms (modified Jaro/Winkler and Levenshtein).
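To illustrate the string-similarity idea, here is a minimal sketch of plain Levenshtein edit distance in Python. It is not the modified Jaro/Winkler variant the answer mentions, and the helper name similar_names and the max_distance threshold are hypothetical choices for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions or substitutions
    needed to turn a into b (classic dynamic-programming formulation)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, free when equal
            ))
        previous = current
    return previous[-1]


def similar_names(names, max_distance=1):
    """Return pairs of names whose case-insensitive distance is small."""
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if levenshtein(first.lower(), second.lower()) <= max_distance:
                pairs.append((first, second))
    return pairs


print(similar_names(["Smith", "Smyth", "Jones"]))
```

Comparing every pair is quadratic in the number of names, so on a table with thousands of patients it would probably be applied only within candidate groups, for example records that share the same DOB.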

Deleting duplicates in a table based on a criteria only in SQL

Let's say I have a table with columns:
CustomerNumber
Lastname
Firstname
PurchaseDate
...and other columns that do not change anything in the question if they're not shown here.
In this table I could have many rows for the same customer with different purchase dates (I know, poorly designed... I'm only trying to fix an issue for reporting, not really trying to fix the root of the problem).
How, in SQL, can I keep one record per customer with the latest date, and delete the rest? A GROUP BY doesn't seem to work for my case.
;with a as
(
select row_number() over (partition by CustomerNumber, Lastname, Firstname order by PurchaseDate desc) rn
from <table>
)
delete from a where rn > 1
This worked for me (on DB2):
DELETE FROM my_table
WHERE (CustomerNumber, Lastname, Firstname, PurchaseDate)
NOT IN (
SELECT CustomerNumber, Lastname, Firstname, MAX(PurchaseDate)
FROM my_table
GROUP BY CustomerNumber, Lastname, FirstName
)
SELECT CustomerNumber, Lastname, Firstname, MAX(PurchaseDate) LatestPurchaseDate
FROM Table
GROUP BY CustomerNumber, Lastname, Firstname
The MAX will select the highest (latest) date and show that date for each unique combination of the GROUP BY columns.
EDIT: I misunderstood that you wanted to delete records for all but the latest purchase date.
WITH Keep AS
(
SELECT CustomerNumber, Lastname, Firstname, MAX(PurchaseDate) LatestPurchaseDate
FROM Table
GROUP BY CustomerNumber, Lastname, Firstname
)
DELETE FROM Table
WHERE NOT EXISTS
(
SELECT *
FROM Keep
WHERE Table.CustomerNumber = Keep.CustomerNumber
AND Table.Lastname = Keep.Lastname
AND Table.Firstname = Keep.Firstname
AND Table.PurchaseDate = Keep.LatestPurchaseDate
)
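The keep-the-latest idea can also be written as a correlated subquery. The sketch below runs on SQLite through Python's sqlite3 module. The table name Purchases, the sample rows and the assumption that CustomerNumber alone identifies a customer are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Purchases (
    CustomerNumber INTEGER, Lastname TEXT, Firstname TEXT, PurchaseDate TEXT
);
-- Hypothetical sample data with ISO dates, which sort correctly as text.
INSERT INTO Purchases VALUES
    (1, 'Lee', 'Ann', '2012-01-05'),
    (1, 'Lee', 'Ann', '2012-03-09'),
    (2, 'Ray', 'Bob', '2011-07-30'),
    (2, 'Ray', 'Bob', '2011-02-14'),
    (3, 'Fox', 'Cy',  '2012-05-01');
""")

# Delete every row that is older than the newest row for the same customer.
deleted = conn.execute("""
    DELETE FROM Purchases
    WHERE PurchaseDate < (
        SELECT MAX(newer.PurchaseDate)
        FROM Purchases AS newer
        WHERE newer.CustomerNumber = Purchases.CustomerNumber
    )
""").rowcount

remaining = conn.execute(
    "SELECT CustomerNumber, PurchaseDate FROM Purchases ORDER BY CustomerNumber"
).fetchall()
print(deleted, remaining)
```

Note that if a customer has two rows with the same latest PurchaseDate, this form keeps both. The row_number() approach shown earlier keeps exactly one row in that case.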

SELECT using GROUP BY and HAVING not returning records

I'm trying to select all records that have a duplicate value in the LASTNAME column. This is my code so far
If EXISTS( SELECT name FROM sysobjects WHERE name = 'USER_DUPLICATES' AND type = 'U' )
DROP TABLE USER_DUPLICATES
GO
CREATE TABLE USER_DUPLICATES
(
FIRSTNAME VARCHAR(MAX),
LASTNAME VARCHAR(MAX),
PHONE VARCHAR(MAX),
EMAIL VARCHAR(MAX),
TITLE VARCHAR(MAX),
LMU VARCHAR(MAX)
)
GO
INSERT INTO USER_DUPLICATES
(
FIRSTNAME,
LASTNAME,
PHONE,
EMAIL,
TITLE,
LMU
)
SELECT
FIRSTNAME,
LASTNAME,
PHONE,
EMAIL,
TITLE,
LMU
FROM TM_USER
GROUP BY
FIRSTNAME,
LASTNAME,
PHONE,
EMAIL,
TITLE,
LMU
HAVING COUNT(LASTNAME) > 1
It does not return any records. I changed the
HAVING COUNT(LASTNAME) > 1
to
HAVING COUNT(LASTNAME) > 0
and it returns all the records. I am also certain there are records with the same LASTNAME value. It is written using T-SQL on SQL Server
Try this:
SELECT
a.FIRSTNAME,
a.LASTNAME,
a.PHONE,
a.EMAIL,
a.TITLE,
a.LMU
FROM TM_USER a
INNER JOIN
(
SELECT LASTNAME
FROM TM_USER
GROUP BY LASTNAME
HAVING COUNT(1) > 1
) b ON a.LASTNAME = b.LASTNAME
Your GROUP BY clause groups by all the columns in the list. Together those columns probably identify a discrete record, so every group has a count of 1.
You will need to do something like:
Select LASTNAME from TM_USER GROUP BY LASTNAME HAVING COUNT(LASTNAME) > 1
The COUNT function is computed over each group defined by all the grouping expressions, not over LASTNAME alone.
To get the duplicated last names use
SELECT LASTNAME FROM TM_USER GROUP BY LASTNAME HAVING COUNT(LASTNAME) > 1
If you group by several columns, you get the count for each unique combination of them, even when COUNT is computed over a single column.
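The difference between the two groupings can be seen in a small sketch. It uses Python's sqlite3 module instead of SQL Server, with a cut-down TM_USER table and made-up rows, so treat it as an illustration under those assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TM_USER (FIRSTNAME TEXT, LASTNAME TEXT, EMAIL TEXT);
-- Hypothetical sample data: two users share the last name Lee.
INSERT INTO TM_USER VALUES
    ('Ann', 'Lee', 'ann@example.com'),
    ('Tom', 'Lee', 'tom@example.com'),
    ('Bob', 'Ray', 'bob@example.com');
""")

# Grouping by every column: each group is a single row, so nothing passes HAVING.
by_all_columns = conn.execute("""
    SELECT FIRSTNAME, LASTNAME, EMAIL
    FROM TM_USER
    GROUP BY FIRSTNAME, LASTNAME, EMAIL
    HAVING COUNT(LASTNAME) > 1
""").fetchall()

# Grouping by LASTNAME only, then joining back to fetch the full rows.
by_lastname = conn.execute("""
    SELECT a.FIRSTNAME, a.LASTNAME, a.EMAIL
    FROM TM_USER AS a
    INNER JOIN (
        SELECT LASTNAME FROM TM_USER GROUP BY LASTNAME HAVING COUNT(*) > 1
    ) AS b ON a.LASTNAME = b.LASTNAME
    ORDER BY a.FIRSTNAME
""").fetchall()

print(by_all_columns)
print(by_lastname)
```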