ms-access 2010: count duplicate names per household address - sql

I am currently working with a spreadsheet in MS Access 2010 which contains about 130k rows of information about people who voted in a local election recently. Each row has their residential information (street name, number, postcode etc.) and personal information (title, surname, forename, middle name, DOB etc.). Each row represents an individual person rather than a household (therefore in many cases the same residential address appears more than once as more than one person resides in a particular household).
What I want to achieve is basically to create a new field in this dataset called 'count'. I want this field to give me a count of how many different surnames reside at a single address.
Is there an SQL script that will allow me to do this in Access 2010?
+------------------+----------+-------+---------+----------+-------------+
| PROPERTYADDRESS1 | POSTCODE | TITLE | SURNAME | FORENAME | MIDDLE_NAME |
+------------------+----------+-------+---------+----------+-------------+
FAKEADDRESS1 EEE 5GG MR BLOGGS JOE N
FAKEADDRESS2 EEE 5BB MRS BLOGGS SUZANNE P
FAKEADDRESS3 EEE 5RG MS SMITH PAULINE S
FAKEADDRESS4 EEE 4BV DR JONES ANNE D
FAKEADDRESS5 EEE 3AS MR TAYLOR STUART A
The following syntax has got me close so far:
SELECT COUNT(electoral.SURNAME)
FROM electoral
GROUP BY electoral.UPRN
However, instead of returning me all 130k odd rows, it only returns me around 67k rows. Is there anything I can do to the syntax to achieve the same result, but just returning every single row?
Any help is greatly appreciated!
Thanks

You could use something like this:
select *,
count(surname) over (partition by householdName)
from myTable
If you have only one column which contains the name,
ex: Rob Adams
then you can do this to have all the surnames in a different column so it will be easier in the select:
SELECT LEFT('HELLO WORLD',CHARINDEX(' ','HELLO WORLD')-1)
in our example:
select right (surmane, charindex (' ',surname)-1) as surname
example on how to use charindex, left and right here:
http://social.technet.microsoft.com/wiki/contents/articles/17948.t-sql-right-left-substring-and-charindex-functions.aspx
if there are any questions, leave a comment.
EDIT: I edited the query, had a syntax error, please try it again. This works on sql server.
here is an example:
create table #temp (id int, PropertyAddress varchar(50), surname varchar(50), forname varchar(50))
insert into #temp values
(1, 'hiddenBase', 'Adamns' , 'Kara' ),
(2, 'hiddenBase', 'Adamns' , 'Anne' ),
(3, 'hiddenBase', 'Adamns' , 'John' ),
(4, 'QueensResidence', 'Queen' , 'Oliver' ),
(5, 'QueensResidence', 'Queen' , 'Moira' ),
(6, 'superSecretBase', 'Diggle' , 'John' ),
(7, 'NandaParbat', 'Merlin' , 'Malcom' )
select * from #temp
select *,
count (surname) over (partition by PropertyAddress) as CountMembers
from #temp
gives:
1 hiddenBase Adamns Kara 3
2 hiddenBase Adamns Anne 3
3 hiddenBase Adamns John 3
7 NandaParbat Merlin Malcom 1
4 QueensResidence Queen Oliver 2
5 QueensResidence Queen Moira 2
6 superSecretBase Diggle John 1
Your query should look like this:
select *,
count (SURNAME) over (partition by PropertyAddress) as CountFamilyMembers
from electoral
EDIT
If over partition by isn't supported, then I guess you can get to your desired result by using group by
select *,
count (SURNAME) over (partition by PropertyAddress) as CountFamilyMembers
from electoral
group by -- put here the fields in the select (one by one), however you can't write group by *

GROUP BY creates an aggregate query, so it's by design that you get fewer records (one per UPRN).
To get the count for each row in the original table, you can join the table with the aggregate query:
SELECT electoral.*, elCount.NumberOfPeople
FROM electoral
INNER JOIN
(
SELECT UPRN, COUNT(*) AS NumberOfPeople
FROM electoral
GROUP BY UPRN
) AS elCount
ON electoral.UPRN = elCount.UPRN

Given the update I want to post another answer. Try it like this:
create table #temp2 ( PropertyAddress1 varchar(50), POSTCODE varchar(20), TITLE varchar (20),
surname varchar(50), FORENAME varchar(50), MIDDLE_NAME varchar (50) )
insert into #temp2 values
('FAKEADDRESS1', 'EEE 5GG', 'MR', 'BLOGGS', 'JOE', 'N'),
('FAKEADDRESS1', 'EEE 5BB', 'MRS', 'BLOGGS', 'SUZANNE', 'P'),
('FAKEADDRESS2', 'EEE 5RG', 'MS', 'SMITH', 'PAULINE', 'S'),
('FAKEADDRESS3', 'EEE 4BV', 'DR', 'JONES', 'ANNE', 'D'),
('FAKEADDRESS4', 'EEE 3AS', 'MR', 'TAYLOR', 'STUART', 'A')
select PropertyAddress1, surname,count (#temp2.surname) as CountADD
into #countTemp
from #temp2
group by PropertyAddress1, surname
select * from #temp2 t2
left join #countTemp ct
on t2.PropertyAddress1 = ct.PropertyAddress1 and t2.surname = ct.surname
This yields:
PropertyAddress1 POSTCODE TITLE surname FORENAME MIDDLE_NAME PropertyAddress1 surname CountADD
FAKEADDRESS1 EEE 5GG MR BLOGGS JOE N FAKEADDRESS1 BLOGGS 2
FAKEADDRESS1 EEE 5BB MRS BLOGGS SUZANNE P FAKEADDRESS1 BLOGGS 2
FAKEADDRESS2 EEE 5RG MS SMITH PAULINE S FAKEADDRESS2 SMITH 1
FAKEADDRESS3 EEE 4BV DR JONES ANNE D FAKEADDRESS3 JONES 1
FAKEADDRESS4 EEE 3AS MR TAYLOR STUART A FAKEADDRESS4 TAYLOR 1

Related

Merging Duplicate Rows with SQL

I have a table that contains usernames, these names are duplicated in various forms, for example, Mr. John is replicated as John Mr. I want to combine the two rows using their unique phone numbers in SQL.
I want a new table in this form after removing the duplicates
you can do it with ROW_NUMBER window function.
First, you need to group the data by your unique column (Phone_Number), then sort by name.
Preparing the table and example data:
DECLARE #vCustomers TABLE (
Name NVARCHAR(25),
Phone_Number NVARCHAR(9),
Address NVARCHAR(25)
)
INSERT INTO #vCustomers
VALUES
('Mr John', '234881675', 'Lagos'),
('Mr Felix', '234867467', 'Atlanta'),
('Mrs Ayo', '234786959', 'Doha'),
('John Mr', '234881675', 'Lagos'),
('Mr Jude', '235689760', 'Rabat'),
('Ayo', '234786959', 'Doha'),
('Jude', '235689760', 'Rabat')
After that, removing the duplicate rows:
DELETE
vc
FROM (
SELECT
ROW_NUMBER() OVER(PARTITION BY Phone_Number ORDER BY Name DESC) AS RN
FROM #vCustomers
) AS vc
WHERE RN > 1
SELECT * FROM #vCustomers
As final, the result:
Name
Phone_Number
Address
Mr John
234881675
Lagos
Mr Felix
234867467
Atlanta
Mrs Ayo
234786959
Doha
Mr Jude
235689760
Rabat

Transform rows to comma delimited string by groups of n records

Based on this sample table
ID
NAME
ZIP
NTID
1
Juan
123
H1
1
Juan
123
H2
2
John
456
H3
2
John
456
H4
2
John
456
H5
I want to show ntid as a comma separated value but in groups of maximum 2 items
Expected result
ID
NAME
ZIP
NTID
1
Juan
123
H1, H2
2
John
456
H3, H4
2
John
456
H5
Using SQL Server, is there a way to achieve this?
By using the STRING_AGG function it concatenates all in a single row, but I need to group as groups of maximum of 2 members.
SELECT ID, NAME, ZIP, STRING_AGG(NTID,',')
FROM MyTable
GROUP BY ID, NAME, ZIP
So you can use a modified row number (divided by 2) to group your data into blocks of 2 rows, and then you can use string_agg().
You do need a way to order the rows though (if you want consistent results), I have assumed NTID will work, but you may have a better column to order by.
declare #Test table (ID int, [NAME] varchar(32), ZIP varchar(12), NTID varchar(2));
insert into #Test (ID, [NAME], ZIP, NTID)
values
(1, 'Juan', '123', 'H1'),
(1, 'Juan', '123', 'H2'),
(2, 'John', '456', 'H3'),
(2, 'John', '456', 'H4'),
(2, 'John', '456', 'H5');
with cte as (
select *
, (row_number() over (partition by ID order by NTID) - 1) / 2 rn
from #Test
)
select ID, [NAME], string_agg(NTID,',')
from cte
group by ID, [NAME], rn;
db<>fiddle here
Note: If you supply the DDL+DML as part of your question you make it much easier to answer.

How can I use SQL Pivot Table to turn my rows of data into columns

In SQL 2008, I need to flatten a table and show extra rows as columns. All I can find are queries with calculations. I just want to show the raw data. The data is like as below (simplified):
ID# Name Name_Type
1 Mary Jane Legal
1 MJ Nickname
1 Smith Maiden
2 John Legal
3 Suzanne Legal
3 Susie Nickname
I want the data to show as:
ID# Legal Nickname Maiden
1 Mary Jane MJ Smith
2 John
3 Suzanne Susie
where nothing shows in the column if there is not a row existing for that column. I'm thinking the Pivot Table method should work.
PIVOT requires you to use an aggregate. See this post for a better explanation of how it works.
CREATE TABLE #MyTable
(
ID# INT
, Name VARCHAR(50)
, Name_Type VARCHAR(50)
);
INSERT INTO #MyTable VALUES
(1, 'Mary Jane', 'Legal')
, (1, 'MJ', 'Nickname')
, (1, 'Smith', 'Maiden')
, (2, 'John', 'Legal')
, (3, 'Suzanne', 'Legal')
, (3, 'Susie', 'Nickname');
SELECT *
FROM
(
SELECT * FROM #MyTable
) AS Names
PIVOT (MAX(NAME)
FOR Name_Type IN ([Legal], [Nickname], [Maiden]))
AS PVT;
DROP TABLE #MyTable;
Try this (replace "new_Table" with your table - name and "ID_" with your id - column):
SELECT ID_ AS rootID, (
SELECT Name
FROM new_table
WHERE Name_type = 'legal'
AND new_table.ID_ = rootID
) AS legal,
(
SELECT Name
FROM new_table
WHERE Name_type = 'Nickname'
AND ID_ = rootID
) AS Nickname,
(
SELECT Name
FROM new_table
WHERE Name_type = 'Maiden'
AND ID_ = rootID
) AS Maiden
FROM new_table
GROUP BY rootID;

SQL Server : query to update record with latest entry

I have a table that maintains records of employers and employees' data. Something like this
EmployerName EmployerPhone EmployerAddress EmployeeName EmployeePhone EmployeeAddress Date
-------------------------------------------------------------------------------------------------------
John 12345 NewYork Harry 59786 NewYork 12-1-1991
Mac 22345 Bankok John 12345 Delhi 12-3-1991
Smith 54732 Arab Amar 59226 China 21-6-1991
Sarah 12345 Bhutan Mac 22345 NewYork 5-9-1991
Root 85674 NewYork Smith 54732 Japan 2-11-1991
I have another table that will have generic records on the basis of phone number (both employers and employees).
Table structure is as following
Phone Name Address
I want to put latest records according to date from Table1 to Table2 on the basis of phone..
Like this
Phone Name Address
-----------------------
59786 Harry NewYork
22345 Mac NewYork
59226 Amar China
12345 Sarah Bhutan
22345 Mac NewYork
85674 Root NewYork
54732 Smith Arab
I've written many queries but couldn't find anyone resulted as required.
Any kind of help will be appreciated.
For initialize the table without phone duplicates:
INSERT IGNORE INTO Table2 (Phone, Name, Address)
SELECT X.* FROM (
SELECT EmployeeName,EmployeePhone,EmployeeAddress FROM Table1
UNION
SELECT EmployerName,EmployerPhone,EmployerAddress FROM Table1
) X
WHERE NOT EXISTS (SELECT Phone FROM Table2 WHERE Phone=X.Phone)
I think this is what you are looking for if I understand your question correctly. Should work for a once-off
DECLARE #restbl TABLE
(
Name varchar(100),
Phone varchar(20),
Addr varchar(100),
[Date] date,
RecType varchar(100)
)
INSERT INTO #restbl
SELECT EmployerName, EmployerPhone, NULL, MAX([Date]), 'Employer'
FROM #tbl
GROUP BY EmployerName, EmployerPhone
INSERT INTO #restbl
SELECT EmployeeName, EmployeePhone, NULL, MAX([Date]), 'Employee'
FROM #tbl
GROUP BY EmployeeName, EmployeePhone;
WITH LatestData (Name, Phone, [Date])
AS
(
SELECT Name, Phone, MAX([Date])
FROM #restbl
GROUP BY Name, Phone
)
INSERT INTO FinalTable (Name, Phone, [Address])
SELECT DISTINCT ld.Name, ld.Phone, ISNULL(tEmployer.EmployerAddress, tEmployee.EmployeeAddress) AS [Address]
FROM LatestData ld
LEFT JOIN #tbl tEmployer ON ld.Name = tEmployer.EmployerName AND ld.Phone = tEmployer.EmployerPhone AND ld.Date = tEmployer.Date
LEFT JOIN #tbl tEmployee ON ld.Name = tEmployee.EmployeeName AND ld.Phone = tEmployee.EmployeePhone AND ld.Date = tEmployee.Date

MS SQL 2012 query to combine each duplicate found entry [duplicate]

This question already has answers here:
Convert multiple rows into one with comma as separator [duplicate]
(10 answers)
Closed 7 years ago.
Hey all I am wanting to create a query so that I can combine each of the found duplicates into one entry.
An example of this is:
Name | ID | Tag | Address |carNum
-------------------------------------------------------
Bob Barker |2054 |52377 |235 Some road |9874
Bill Gates |5630 |69471 |014 Washington Rd. |3700
Bob Barker |2054 |97011 |235 Some road |9874
Bob Barker |2054 |40019 |235 Some road |9874
Steve Jobs |8501 |73051 |100 Infinity St. |4901
John Doe |7149 |86740 |7105 Bull Rd. |9282
Bill Gates |5630 |55970 |014 Washington Rd. |3700
Tim Boons |6370 |60701 |852 Mnt. Creek Rd. |7059
In the example above, Bob Barker and Bill gates are both in the database more than once so I would like the output to be the following:
Bob Barker|2054|52377/97011/40019 |235 Some road |9874
Bill Gates|5630|69471/55970 |014 Washington Rd.|3700
Steve Jobs|8501|73051 |100 Infinity St. |4901
John Doe |7149|86740 |7105 Bull Rd. |9282
Tim Boons |6370|60701 |852 Mnt. Creek Rd.|7059
Notice how Bob Barker & Bill Gates appends the tag row (the duplicated data) into one row instead of having multiple rows. This is because I do not want to have to check the previous ID and see if it matches the current id and append to the data.
I am hoping a SQL query guru would have a query to do this for me!
Thanks for your time and help!
------------------------------------------------------------------------------------------------------------------------
Question has changed from ACCESS DATABASE to MS SQL SERVER 2012 database
------------------------------------------------------------------------------------------------------------------------
Use MySQL GROUP_CONCAT for Tag field and group by data with Name field.
Query:
SELECT Name, ID, GROUP_CONCAT(Tag SEPARATOR '/') AS Tag, Address, carNum
FROM users GROUP BY Name
You can do it like this:
CREATE TABLE MyTable ( Name nvarchar(50)
, ID int
, Tag int
, Address nvarchar(50)
, carNum int
)
INSERT INTO MyTable VALUES
('Bob Barker', 2054, 52377, '235 Some road' , 9874)
, ('Bill Gates', 5630, 69471, '014 Washington Rd.' , 3700)
, ('Bob Barker', 2054, 97011, '235 Some road' , 9874)
, ('Bob Barker', 2054, 40019, '235 Some road' , 9874)
, ('Steve Jobs', 8501, 73051, '100 Infinity St.' , 4901)
, ('John Doe' , 7149, 86740, '7105 Bull Rd.' , 9282)
, ('Bill Gates', 5630, 55970, '014 Washington Rd.' , 3700)
, ('Tim Boons' , 6370, 60701, '852 Mnt. Creek Rd.' , 7059)
SELECT YT.Name
, ID
, LEFT(YT.SUB, LEN(YT.SUB) - 1) AS Tags
, Address
, carNum
FROM (SELECT DISTINCT
Name
, ( SELECT CAST(ST1.Tag AS nvarchar(5)) + ',' AS [text()]
FROM MyTable ST1
WHERE ST1.ID = ST2.ID
ORDER BY ST1.Name
FOR
XML PATH('')
) SUB
, ID
, Address
, carNum
FROM MyTable ST2
) YT
DROP TABLE MyTable