MS SQL 2012 query to combine each duplicate found entry [duplicate] - sql

This question already has answers here:
Convert multiple rows into one with comma as separator [duplicate]
(10 answers)
Closed 7 years ago.
I want to create a query that combines each set of duplicate rows into a single entry.
An example of this is:
Name        | ID   | Tag   | Address            | carNum
------------|------|-------|--------------------|-------
Bob Barker  | 2054 | 52377 | 235 Some road      | 9874
Bill Gates  | 5630 | 69471 | 014 Washington Rd. | 3700
Bob Barker  | 2054 | 97011 | 235 Some road      | 9874
Bob Barker  | 2054 | 40019 | 235 Some road      | 9874
Steve Jobs  | 8501 | 73051 | 100 Infinity St.   | 4901
John Doe    | 7149 | 86740 | 7105 Bull Rd.      | 9282
Bill Gates  | 5630 | 55970 | 014 Washington Rd. | 3700
Tim Boons   | 6370 | 60701 | 852 Mnt. Creek Rd. | 7059
In the example above, Bob Barker and Bill Gates are both in the database more than once, so I would like the output to be the following:
Bob Barker  | 2054 | 52377/97011/40019 | 235 Some road      | 9874
Bill Gates  | 5630 | 69471/55970       | 014 Washington Rd. | 3700
Steve Jobs  | 8501 | 73051             | 100 Infinity St.   | 4901
John Doe    | 7149 | 86740             | 7105 Bull Rd.      | 9282
Tim Boons   | 6370 | 60701             | 852 Mnt. Creek Rd. | 7059
Notice that the Tag values for Bob Barker and Bill Gates (the only data that differs between the duplicates) are combined into one row each instead of spanning multiple rows. I want the query to do this so that I do not have to compare each row's ID with the previous one and append the data myself.
I am hoping a SQL query guru has a query that does this for me!
Thanks for your time and help!
------------------------------------------------------------------------------------------------------------------------
Question has changed from ACCESS DATABASE to MS SQL SERVER 2012 database
------------------------------------------------------------------------------------------------------------------------

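In MySQL, use GROUP_CONCAT on the Tag field and group by the remaining columns. Note that GROUP_CONCAT is not available in SQL Server 2012, so this only applies if you are on MySQL.
Query:
SELECT Name, ID, GROUP_CONCAT(Tag SEPARATOR '/') AS Tag, Address, carNum
FROM users
GROUP BY Name, ID, Address, carNum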

In SQL Server you can do it with FOR XML PATH, like this:
-- Sample data from the question
CREATE TABLE MyTable ( Name nvarchar(50)
                     , ID int
                     , Tag int
                     , Address nvarchar(50)
                     , carNum int
                     );
INSERT INTO MyTable VALUES
  ('Bob Barker', 2054, 52377, '235 Some road'     , 9874)
, ('Bill Gates', 5630, 69471, '014 Washington Rd.', 3700)
, ('Bob Barker', 2054, 97011, '235 Some road'     , 9874)
, ('Bob Barker', 2054, 40019, '235 Some road'     , 9874)
, ('Steve Jobs', 8501, 73051, '100 Infinity St.'  , 4901)
, ('John Doe'  , 7149, 86740, '7105 Bull Rd.'     , 9282)
, ('Bill Gates', 5630, 55970, '014 Washington Rd.', 3700)
, ('Tim Boons' , 6370, 60701, '852 Mnt. Creek Rd.', 7059);
-- The outer query strips the trailing separator left by the concatenation
SELECT YT.Name
     , ID
     , LEFT(YT.SUB, LEN(YT.SUB) - 1) AS Tags
     , Address
     , carNum
FROM (SELECT DISTINCT
             Name
             -- Concatenate every Tag that shares this row's ID, using '/'
             -- as the separator to match the output asked for in the question.
             -- nvarchar(11) is wide enough for any int value.
           , ( SELECT CAST(ST1.Tag AS nvarchar(11)) + '/' AS [text()]
               FROM MyTable ST1
               WHERE ST1.ID = ST2.ID
               ORDER BY ST1.Name
               FOR XML PATH('')
             ) SUB
           , ID
           , Address
           , carNum
      FROM MyTable ST2
     ) YT;
DROP TABLE MyTable;
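As a cross-check on what the query is expected to return, here is a minimal sketch of the same grouping logic in Python rather than SQL. The function name combine_tags and the tuple layout are assumptions made for illustration; they are not part of either answer.

```python
# Sample rows from the question: (Name, ID, Tag, Address, carNum)
rows = [
    ("Bob Barker", 2054, 52377, "235 Some road", 9874),
    ("Bill Gates", 5630, 69471, "014 Washington Rd.", 3700),
    ("Bob Barker", 2054, 97011, "235 Some road", 9874),
    ("Bob Barker", 2054, 40019, "235 Some road", 9874),
    ("Steve Jobs", 8501, 73051, "100 Infinity St.", 4901),
    ("John Doe", 7149, 86740, "7105 Bull Rd.", 9282),
    ("Bill Gates", 5630, 55970, "014 Washington Rd.", 3700),
    ("Tim Boons", 6370, 60701, "852 Mnt. Creek Rd.", 7059),
]


def combine_tags(rows, sep="/"):
    """Collapse rows that share every column except Tag, joining the Tags with sep."""
    grouped = {}  # dicts keep insertion order, so groups appear in first-seen order
    for name, id_, tag, address, car_num in rows:
        key = (name, id_, address, car_num)
        grouped.setdefault(key, []).append(str(tag))
    return [
        (name, id_, sep.join(tags), address, car_num)
        for (name, id_, address, car_num), tags in grouped.items()
    ]


for row in combine_tags(rows):
    print(row)
```

The first row printed is the combined Bob Barker entry with the Tag value 52377/97011/40019, which is the shape the question asks for.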

Related

Merging Duplicate Rows with SQL

I have a table that contains usernames, these names are duplicated in various forms, for example, Mr. John is replicated as John Mr. I want to combine the two rows using their unique phone numbers in SQL.
I want a new table in this form after removing the duplicates
You can do it with the ROW_NUMBER window function.
First, partition the data by your unique column (Phone_Number), then order each partition by Name. Every row after the first in a partition is a duplicate.
Preparing the table and example data:
DECLARE @vCustomers TABLE (
    Name NVARCHAR(25),
    Phone_Number NVARCHAR(9),
    Address NVARCHAR(25)
)
INSERT INTO @vCustomers
VALUES
('Mr John', '234881675', 'Lagos'),
('Mr Felix', '234867467', 'Atlanta'),
('Mrs Ayo', '234786959', 'Doha'),
('John Mr', '234881675', 'Lagos'),
('Mr Jude', '235689760', 'Rabat'),
('Ayo', '234786959', 'Doha'),
('Jude', '235689760', 'Rabat')
After that, remove the duplicate rows:
-- Number the rows within each phone number; RN = 1 is the row that is kept
DELETE vc
FROM (
    SELECT
        ROW_NUMBER() OVER(PARTITION BY Phone_Number ORDER BY Name DESC) AS RN
    FROM @vCustomers
) AS vc
WHERE RN > 1
SELECT * FROM @vCustomers
The final result:
Name     | Phone_Number | Address
---------|--------------|--------
Mr John  | 234881675    | Lagos
Mr Felix | 234867467    | Atlanta
Mrs Ayo  | 234786959    | Doha
Mr Jude  | 235689760    | Rabat
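The keep-one-row-per-partition idea can be sketched outside SQL as well. The Python below is only an illustration: dedupe_by_phone is a hypothetical helper, and Python compares strings by code point, whereas SQL Server orders them by the column's collation, so tie-breaking could differ on other data.

```python
# Sample rows from the answer: (Name, Phone_Number, Address)
customers = [
    ("Mr John", "234881675", "Lagos"),
    ("Mr Felix", "234867467", "Atlanta"),
    ("Mrs Ayo", "234786959", "Doha"),
    ("John Mr", "234881675", "Lagos"),
    ("Mr Jude", "235689760", "Rabat"),
    ("Ayo", "234786959", "Doha"),
    ("Jude", "235689760", "Rabat"),
]


def dedupe_by_phone(rows):
    """Keep one row per phone number: the one whose name sorts last.

    This mirrors ORDER BY Name DESC with RN = 1 inside each partition.
    """
    best = {}
    for name, phone, address in rows:
        if phone not in best or name > best[phone][0]:
            best[phone] = (name, phone, address)
    return list(best.values())


for row in dedupe_by_phone(customers):
    print(row)
```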

Count the Number of Cities in the column - SQL Server

I have a column with Client Labels (separated by commas) associated with each Client.
I need to find Clients who have more or fewer than 1 State in the ClientLabels column. In other words, I need to find exceptions where a State either needs to be added (when no State is specified at all) or removed (when an extra State is specified, which usually happens when a Client moved out of State or because of an employee's error).
All States we can serve: California, Arizona, Texas, Virginia and Washington.
P.S. If a Client moves to any State not in the above list, the Client becomes inactive right away, so there is no way for Alaska (for example) to be in the ClientLabels column.
CREATE TABLE Clients
(ClientId INT, ClientName VARCHAR(100), ClientLabels VARCHAR(max));
INSERT INTO Clients
VALUES
(1 , 'Justin Bieber', 'California, Musician, Male'),
(2 , 'Lionel Messi', 'Washington, Soccer Player, Male'),
(3 , 'Nicolas Cage', 'California, Actor, Male'),
(4 , 'Harry Potter', 'Fake, Male'),
(5 , 'Tom Holland', 'Arizona, Actor, California, Male'),
(6 , 'Ariana Grande', 'Texas, Musician, Female'),
(7 , 'Madonna', 'Virginia, Musician, Female'),
(8 , 'Dwayne Johnson', 'California, Actor, Male')
SELECT * FROM Clients
Output I need:
ClientId | ClientName     | ClientLabels                     | NumberOfStates
---------|----------------|----------------------------------|---------------
1        | Justin Bieber  | California, Musician, Male       | 1
2        | Lionel Messi   | Washington, Soccer Player, Male  | 1
3        | Nicolas Cage   | California, Actor, Male          | 1
4        | Harry Potter   | Fake, Male                       | 0
5        | Tom Holland    | Arizona, Actor, California, Male | 2
6        | Ariana Grande  | Texas, Musician, Female          | 1
7        | Madonna        | Virginia, Musician, Female       | 1
8        | Dwayne Johnson | California, Actor, Male          | 1
I've started the code, but don't know how to finish it:
SELECT c.*,
COUNT(c.ClientLabels) OVER(PARTITION BY c.ClientId) AS NumberOfStates
FROM Clients AS c
You may try the query below. It joins each client to a table of valid states with LIKE and counts the matches.
declare @Clients table(ClientId INT, ClientName VARCHAR(100), ClientLabels VARCHAR(max))
declare @labels table(label varchar(100))
insert into @Clients
VALUES
(1 , 'Justin Bieber', 'California, Musician, Male'),
(2 , 'Lionel Messi', 'Washington, Soccer Player, Male'),
(3 , 'Tom Holland', 'Arizona, Actor, California, Male'),
(4 , 'Harry Potter', 'Fake, Male')
insert into @labels
values('California')
,('Arizona')
,('Texas')
,('Virginia')
,('Washington')
-- A client with no matching label gets 0; otherwise count the joined rows per client
select distinct c.*,case when cl.label is null then 0
else count(*)over(partition by clientid order by clientid) end as [NumberOfStates]
from @Clients c
left join @labels cl
on c.ClientLabels like '%' + cl.label + '%'
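If your SQL Server version supports the STRING_SPLIT function (SQL Server 2016 and later), you can use STRING_SPLIT with CROSS APPLY to split ClientLabels by comma for each ClientId.
Then use a conditional aggregate to count NumberOfStates. Note that TRIM requires SQL Server 2017 or later; on 2016, use LTRIM(RTRIM(v.value)) instead.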
SELECT ClientId,
ClientName,
ClientLabels,
COUNT(CASE WHEN trim(v.value) IN ('California','Arizona', 'Texas', 'Virginia', 'Washington') THEN 1 END) NumberOfStates
FROM Clients c
CROSS APPLY STRING_SPLIT(c.ClientLabels,',') v
GROUP BY ClientId,ClientName,ClientLabels
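The split-then-count logic is easy to check by hand. Below is a minimal Python sketch of it, not a replacement for the SQL; the name number_of_states is an assumption made for illustration.

```python
# The five states listed in the question
STATES = {"California", "Arizona", "Texas", "Virginia", "Washington"}


def number_of_states(client_labels):
    """Split the comma-separated labels, trim each one, and count those that are states."""
    return sum(1 for label in client_labels.split(",") if label.strip() in STATES)


# Sample rows from the question: (ClientId, ClientName, ClientLabels)
clients = [
    (1, "Justin Bieber", "California, Musician, Male"),
    (4, "Harry Potter", "Fake, Male"),
    (5, "Tom Holland", "Arizona, Actor, California, Male"),
]

for client_id, name, labels in clients:
    print(client_id, name, number_of_states(labels))
```

Because each label is compared as a whole after trimming, a label that merely contains a state name as a substring is not counted, which is one difference from the LIKE-based approach.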

How can I use SQL Pivot Table to turn my rows of data into columns

In SQL Server 2008, I need to flatten a table and show extra rows as columns. All I can find are queries with calculations. I just want to show the raw data. The data looks like this (simplified):
ID#  Name       Name_Type
1    Mary Jane  Legal
1    MJ         Nickname
1    Smith      Maiden
2    John       Legal
3    Suzanne    Legal
3    Susie      Nickname
I want the data to show as:
ID#  Legal      Nickname  Maiden
1    Mary Jane  MJ        Smith
2    John
3    Suzanne    Susie
where nothing shows in the column if there is not a row existing for that column. I'm thinking the Pivot Table method should work.
PIVOT requires you to use an aggregate. See this post for a better explanation of how it works.
CREATE TABLE #MyTable
(
ID# INT
, Name VARCHAR(50)
, Name_Type VARCHAR(50)
);
INSERT INTO #MyTable VALUES
(1, 'Mary Jane', 'Legal')
, (1, 'MJ', 'Nickname')
, (1, 'Smith', 'Maiden')
, (2, 'John', 'Legal')
, (3, 'Suzanne', 'Legal')
, (3, 'Susie', 'Nickname');
SELECT *
FROM
(
SELECT * FROM #MyTable
) AS Names
PIVOT (MAX(NAME)
FOR Name_Type IN ([Legal], [Nickname], [Maiden]))
AS PVT;
DROP TABLE #MyTable;
Try this (replace "new_table" with your table name and "ID_" with your ID column). Each correlated subquery looks up one name type for the current ID and returns NULL when no such row exists:
SELECT t.ID_ AS rootID,
(
    SELECT n.Name
    FROM new_table n
    WHERE n.Name_type = 'Legal'
    AND n.ID_ = t.ID_
) AS Legal,
(
    SELECT n.Name
    FROM new_table n
    WHERE n.Name_type = 'Nickname'
    AND n.ID_ = t.ID_
) AS Nickname,
(
    SELECT n.Name
    FROM new_table n
    WHERE n.Name_type = 'Maiden'
    AND n.ID_ = t.ID_
) AS Maiden
FROM new_table t
GROUP BY t.ID_;
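To see why an aggregate such as MAX is harmless here, it can help to walk through the pivot by hand. The following Python sketch is an illustration only; pivot_names and COLUMNS are hypothetical names, and MAX matters only if an ID has more than one row of the same Name_Type.

```python
# Sample rows from the question: (ID, Name, Name_Type)
rows = [
    (1, "Mary Jane", "Legal"),
    (1, "MJ", "Nickname"),
    (1, "Smith", "Maiden"),
    (2, "John", "Legal"),
    (3, "Suzanne", "Legal"),
    (3, "Susie", "Nickname"),
]

COLUMNS = ("Legal", "Nickname", "Maiden")


def pivot_names(rows):
    """Turn one row per (ID, Name_Type) into one dict per ID with a key per type."""
    pivoted = {}
    for id_, name, name_type in rows:
        # Start every ID with all three columns empty (None plays the role of NULL)
        slot = pivoted.setdefault(id_, dict.fromkeys(COLUMNS))
        if name_type in slot:
            current = slot[name_type]
            # Keep the larger value, as MAX(Name) would
            slot[name_type] = name if current is None else max(current, name)
    return pivoted


for id_, names in pivot_names(rows).items():
    print(id_, names)
```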

SQL Server : query to update record with latest entry

I have a table that maintains records of employers and employees' data. Something like this
EmployerName  EmployerPhone  EmployerAddress  EmployeeName  EmployeePhone  EmployeeAddress  Date
-------------------------------------------------------------------------------------------------------
John          12345          NewYork          Harry         59786          NewYork          12-1-1991
Mac           22345          Bankok           John          12345          Delhi            12-3-1991
Smith         54732          Arab             Amar          59226          China            21-6-1991
Sarah         12345          Bhutan           Mac           22345          NewYork          5-9-1991
Root          85674          NewYork          Smith         54732          Japan            2-11-1991
I have another table that will have generic records on the basis of phone number (both employers and employees).
Table structure is as following
Phone Name Address
I want to copy the latest record for each phone number, according to Date, from Table1 into Table2.
Like this
Phone  Name   Address
-----------------------
59786  Harry  NewYork
22345  Mac    NewYork
59226  Amar   China
12345  Sarah  Bhutan
22345  Mac    NewYork
85674  Root   NewYork
54732  Smith  Arab
I've written many queries, but none of them produced the required result.
Any kind of help will be appreciated.
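To initialize the table while skipping phone numbers that Table2 already contains (INSERT IGNORE is MySQL-only syntax, so in SQL Server the NOT EXISTS check does the filtering). Note that this does not by itself pick the latest row per phone:
INSERT INTO Table2 (Phone, Name, Address)
SELECT X.Phone, X.Name, X.Address
FROM (
    -- Column order matches the INSERT column list: Phone, Name, Address
    SELECT EmployeePhone AS Phone, EmployeeName AS Name, EmployeeAddress AS Address FROM Table1
    UNION
    SELECT EmployerPhone, EmployerName, EmployerAddress FROM Table1
) X
WHERE NOT EXISTS (SELECT 1 FROM Table2 WHERE Table2.Phone = X.Phone)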
I think this is what you are looking for, if I understand your question correctly. It should work as a one-off:
DECLARE @restbl TABLE
(
    Name varchar(100),
    Phone varchar(20),
    Addr varchar(100),
    [Date] date,
    RecType varchar(100)
)
-- Latest date on which each name/phone pair appears as an employer
INSERT INTO @restbl
SELECT EmployerName, EmployerPhone, NULL, MAX([Date]), 'Employer'
FROM #tbl
GROUP BY EmployerName, EmployerPhone
-- Latest date on which each name/phone pair appears as an employee
INSERT INTO @restbl
SELECT EmployeeName, EmployeePhone, NULL, MAX([Date]), 'Employee'
FROM #tbl
GROUP BY EmployeeName, EmployeePhone;
-- Latest date per name/phone pair across both roles
WITH LatestData (Name, Phone, [Date])
AS
(
    SELECT Name, Phone, MAX([Date])
    FROM @restbl
    GROUP BY Name, Phone
)
-- Join back to the source rows to pick up the address recorded on that date
INSERT INTO FinalTable (Name, Phone, [Address])
SELECT DISTINCT ld.Name, ld.Phone, ISNULL(tEmployer.EmployerAddress, tEmployee.EmployeeAddress) AS [Address]
FROM LatestData ld
LEFT JOIN #tbl tEmployer ON ld.Name = tEmployer.EmployerName AND ld.Phone = tEmployer.EmployerPhone AND ld.Date = tEmployer.Date
LEFT JOIN #tbl tEmployee ON ld.Name = tEmployee.EmployeeName AND ld.Phone = tEmployee.EmployeePhone AND ld.Date = tEmployee.Date
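The underlying idea (treat employer and employee columns as one stream of phone records, then keep the newest per phone) can be sketched in Python to make the steps explicit. This is an illustration under assumptions: latest_by_phone is a hypothetical helper, and the dates are assumed to be day-month-year, which the question does not state outright (21-6-1991 can only be read that way).

```python
from datetime import datetime

# Rows from the question:
# (EmployerName, EmployerPhone, EmployerAddress,
#  EmployeeName, EmployeePhone, EmployeeAddress, Date)
table1 = [
    ("John", "12345", "NewYork", "Harry", "59786", "NewYork", "12-1-1991"),
    ("Mac", "22345", "Bankok", "John", "12345", "Delhi", "12-3-1991"),
    ("Smith", "54732", "Arab", "Amar", "59226", "China", "21-6-1991"),
    ("Sarah", "12345", "Bhutan", "Mac", "22345", "NewYork", "5-9-1991"),
    ("Root", "85674", "NewYork", "Smith", "54732", "Japan", "2-11-1991"),
]


def latest_by_phone(rows, date_format="%d-%m-%Y"):
    """Return {phone: (name, address)} taken from the most recent row per phone.

    date_format is an assumption (day-month-year); adjust it to the real data.
    """
    latest = {}
    for er_name, er_phone, er_addr, ee_name, ee_phone, ee_addr, date_text in rows:
        when = datetime.strptime(date_text, date_format)
        # Each source row contributes two phone records: employer and employee
        for phone, name, address in ((er_phone, er_name, er_addr),
                                     (ee_phone, ee_name, ee_addr)):
            if phone not in latest or when > latest[phone][0]:
                latest[phone] = (when, name, address)
    return {phone: (name, address) for phone, (when, name, address) in latest.items()}


for phone, (name, address) in latest_by_phone(table1).items():
    print(phone, name, address)
```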

ms-access 2010: count duplicate names per household address

I am currently working with a spreadsheet in MS Access 2010 which contains about 130k rows of information about people who voted in a local election recently. Each row has their residential information (street name, number, postcode etc.) and personal information (title, surname, forename, middle name, DOB etc.). Each row represents an individual person rather than a household (therefore in many cases the same residential address appears more than once as more than one person resides in a particular household).
What I want to achieve is basically to create a new field in this dataset called 'count'. I want this field to give me a count of how many different surnames reside at a single address.
Is there an SQL script that will allow me to do this in Access 2010?
+------------------+----------+-------+---------+----------+-------------+
| PROPERTYADDRESS1 | POSTCODE | TITLE | SURNAME | FORENAME | MIDDLE_NAME |
+------------------+----------+-------+---------+----------+-------------+
| FAKEADDRESS1     | EEE 5GG  | MR    | BLOGGS  | JOE      | N           |
| FAKEADDRESS2     | EEE 5BB  | MRS   | BLOGGS  | SUZANNE  | P           |
| FAKEADDRESS3     | EEE 5RG  | MS    | SMITH   | PAULINE  | S           |
| FAKEADDRESS4     | EEE 4BV  | DR    | JONES   | ANNE     | D           |
| FAKEADDRESS5     | EEE 3AS  | MR    | TAYLOR  | STUART   | A           |
+------------------+----------+-------+---------+----------+-------------+
The following syntax has got me close so far:
SELECT COUNT(electoral.SURNAME)
FROM electoral
GROUP BY electoral.UPRN
However, instead of returning me all 130k odd rows, it only returns me around 67k rows. Is there anything I can do to the syntax to achieve the same result, but just returning every single row?
Any help is greatly appreciated!
Thanks
You could use something like this (householdName stands for whichever column identifies the household):
select *,
count(surname) over (partition by householdName) as CountMembers
from myTable
If you have only one column that contains the full name, for example "Rob Adams", you can extract the surname into its own column so that it is easier to select. LEFT with CHARINDEX returns the text before the first space:
SELECT LEFT('HELLO WORLD',CHARINDEX(' ','HELLO WORLD')-1)
RIGHT with LEN and CHARINDEX returns the text after the first space, which in our example is the surname (fullName is a placeholder column name):
select right(fullName, len(fullName) - charindex(' ', fullName)) as surname
Examples of how to use CHARINDEX, LEFT and RIGHT are here:
http://social.technet.microsoft.com/wiki/contents/articles/17948.t-sql-right-left-substring-and-charindex-functions.aspx
If there are any questions, leave a comment.
EDIT: I edited the query, had a syntax error, please try it again. This works on sql server.
here is an example:
create table #temp (id int, PropertyAddress varchar(50), surname varchar(50), forname varchar(50))
insert into #temp values
(1, 'hiddenBase', 'Adamns' , 'Kara' ),
(2, 'hiddenBase', 'Adamns' , 'Anne' ),
(3, 'hiddenBase', 'Adamns' , 'John' ),
(4, 'QueensResidence', 'Queen' , 'Oliver' ),
(5, 'QueensResidence', 'Queen' , 'Moira' ),
(6, 'superSecretBase', 'Diggle' , 'John' ),
(7, 'NandaParbat', 'Merlin' , 'Malcom' )
select * from #temp
select *,
count (surname) over (partition by PropertyAddress) as CountMembers
from #temp
The second query gives:
id  PropertyAddress  surname  forname  CountMembers
1   hiddenBase       Adamns   Kara     3
2   hiddenBase       Adamns   Anne     3
3   hiddenBase       Adamns   John     3
7   NandaParbat      Merlin   Malcom   1
4   QueensResidence  Queen    Oliver   2
5   QueensResidence  Queen    Moira    2
6   superSecretBase  Diggle   John     1
Your query should look like this:
select *,
count (SURNAME) over (partition by PropertyAddress) as CountFamilyMembers
from electoral
EDIT
If OVER (PARTITION BY ...) isn't supported, you can get the counts with GROUP BY instead. This returns one row per address rather than one row per person:
select PropertyAddress,
count (SURNAME) as CountFamilyMembers
from electoral
group by PropertyAddress -- every non-aggregated column in the select list must appear here; you can't write group by *
GROUP BY creates an aggregate query, so it's by design that you get fewer records (one per UPRN).
To get the count for each row in the original table, you can join the table with the aggregate query:
SELECT electoral.*, elCount.NumberOfPeople
FROM electoral
INNER JOIN
(
SELECT UPRN, COUNT(*) AS NumberOfPeople
FROM electoral
GROUP BY UPRN
) AS elCount
ON electoral.UPRN = elCount.UPRN
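The join-with-aggregate pattern has two steps: count once per key, then attach the count to every original row. A minimal Python sketch of those two steps follows; the UPRN values and the helper name with_household_count are made up for illustration, since the question does not show that column's contents.

```python
from collections import Counter

# Hypothetical sample: the UPRN values are invented for this sketch
electoral = [
    {"UPRN": 1001, "SURNAME": "BLOGGS", "FORENAME": "JOE"},
    {"UPRN": 1001, "SURNAME": "BLOGGS", "FORENAME": "SUZANNE"},
    {"UPRN": 1002, "SURNAME": "SMITH", "FORENAME": "PAULINE"},
]


def with_household_count(rows, key="UPRN"):
    """Return every input row with an extra NumberOfPeople field."""
    # Step 1: the aggregate query (one count per key)
    counts = Counter(row[key] for row in rows)
    # Step 2: the join (every original row keeps its place and gains the count)
    return [{**row, "NumberOfPeople": counts[row[key]]} for row in rows]


for row in with_household_count(electoral):
    print(row)
```

The output has as many rows as the input, which is the property the question was missing from its GROUP BY attempt.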
Given the update I want to post another answer. Try it like this:
create table #temp2 ( PropertyAddress1 varchar(50), POSTCODE varchar(20), TITLE varchar (20),
surname varchar(50), FORENAME varchar(50), MIDDLE_NAME varchar (50) )
insert into #temp2 values
('FAKEADDRESS1', 'EEE 5GG', 'MR', 'BLOGGS', 'JOE', 'N'),
('FAKEADDRESS1', 'EEE 5BB', 'MRS', 'BLOGGS', 'SUZANNE', 'P'),
('FAKEADDRESS2', 'EEE 5RG', 'MS', 'SMITH', 'PAULINE', 'S'),
('FAKEADDRESS3', 'EEE 4BV', 'DR', 'JONES', 'ANNE', 'D'),
('FAKEADDRESS4', 'EEE 3AS', 'MR', 'TAYLOR', 'STUART', 'A')
select PropertyAddress1, surname,count (#temp2.surname) as CountADD
into #countTemp
from #temp2
group by PropertyAddress1, surname
select * from #temp2 t2
left join #countTemp ct
on t2.PropertyAddress1 = ct.PropertyAddress1 and t2.surname = ct.surname
This yields:
PropertyAddress1  POSTCODE  TITLE  surname  FORENAME  MIDDLE_NAME  PropertyAddress1  surname  CountADD
FAKEADDRESS1      EEE 5GG   MR     BLOGGS   JOE       N            FAKEADDRESS1      BLOGGS   2
FAKEADDRESS1      EEE 5BB   MRS    BLOGGS   SUZANNE   P            FAKEADDRESS1      BLOGGS   2
FAKEADDRESS2      EEE 5RG   MS     SMITH    PAULINE   S            FAKEADDRESS2      SMITH    1
FAKEADDRESS3      EEE 4BV   DR     JONES    ANNE      D            FAKEADDRESS3      JONES    1
FAKEADDRESS4      EEE 3AS   MR     TAYLOR   STUART    A            FAKEADDRESS4      TAYLOR   1
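The question asks how many different surnames live at each address, which is a distinct count rather than a count of people. A rough Python sketch of that distinction is below; distinct_surnames_per_address is a hypothetical name and the rows are reduced to (address, surname) pairs from the sample above.

```python
from collections import defaultdict

# (PropertyAddress1, surname) pairs taken from the sample data above
people = [
    ("FAKEADDRESS1", "BLOGGS"),
    ("FAKEADDRESS1", "BLOGGS"),
    ("FAKEADDRESS2", "SMITH"),
    ("FAKEADDRESS3", "JONES"),
    ("FAKEADDRESS4", "TAYLOR"),
]


def distinct_surnames_per_address(rows):
    """Return each input row with the number of different surnames at its address."""
    surnames = defaultdict(set)
    for address, surname in rows:
        surnames[address].add(surname)  # a set ignores repeated surnames
    return [(address, surname, len(surnames[address])) for address, surname in rows]


for row in distinct_surnames_per_address(people):
    print(row)
```

Two people named BLOGGS at FAKEADDRESS1 therefore give a count of 1 for that address, whereas a plain count of people would give 2.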