Count the Number of Cities in the column - SQL Server - sql

I have a column with client labels (comma-separated) associated with each client.
I need to find clients who have more or fewer than one state in the ClientLabels column. In other words, I need to find the exceptions where a state either needs to be added (when no state is specified at all) or removed (when an extra state is specified, which usually happens when a client moved out of state or through employee error).
The states we serve are California, Arizona, Texas, Virginia and Washington.
P.S. If a client moves to any state not in the above list, the client becomes inactive right away, so Alaska (for example) can never appear in the ClientLabels column.
CREATE TABLE Clients
(ClientId INT, ClientName VARCHAR(100), ClientLabels VARCHAR(max));
INSERT INTO Clients
VALUES
(1 , 'Justin Bieber', 'California, Musician, Male'),
(2 , 'Lionel Messi', 'Washington, Soccer Player, Male'),
(3 , 'Nicolas Cage', 'California, Actor, Male'),
(4 , 'Harry Potter', 'Fake, Male'),
(5 , 'Tom Holland', 'Arizona, Actor, California, Male'),
(6 , 'Ariana Grande', 'Texas, Musician, Female'),
(7 , 'Madonna', 'Virginia, Musician, Female'),
(8 , 'Dwayne Johnson', 'California, Actor, Male')
SELECT * FROM Clients
Output I need:
ClientId | ClientName     | ClientLabels                     | NumberOfStates
-------------------------------------------------------------------------------
1        | Justin Bieber  | California, Musician, Male       | 1
2        | Lionel Messi   | Washington, Soccer Player, Male  | 1
3        | Nicolas Cage   | California, Actor, Male          | 1
4        | Harry Potter   | Fake, Male                       | 0
5        | Tom Holland    | Arizona, Actor, California, Male | 2
6        | Ariana Grande  | Texas, Musician, Female          | 1
7        | Madonna        | Virginia, Musician, Female       | 1
8        | Dwayne Johnson | California, Actor, Male          | 1
I've started the code, but don't know how to finish it:
SELECT c.*,
COUNT(c.ClientLabels) OVER(PARTITION BY c.ClientId) AS NumberOfStates
FROM Clients AS c

You can try the query below.
declare @Clients table(ClientId INT, ClientName VARCHAR(100), ClientLabels VARCHAR(max))
declare @labels table(label varchar(100))
insert into @Clients
VALUES
(1 , 'Justin Bieber', 'California, Musician, Male'),
(2 , 'Lionel Messi', 'Washington, Soccer Player, Male'),
(3 , 'Tom Holland', 'Arizona, Actor, California, Male'),
(4 , 'Harry Potter', 'Fake, Male')
insert into @labels
values('California')
,('Arizona')
,('Texas')
,('Virginia')
,('Washington')
-- LIKE matches substrings, so any label that merely contains a state name is counted as well
select distinct c.*,case when cl.label is null then 0
else count(*)over(partition by clientid order by clientid) end as [NumberOfStates]
from @Clients c
left join @labels cl
on c.ClientLabels like '%' + cl.label + '%'

If your SQL Server version supports the STRING_SPLIT function (SQL Server 2016 and later; TRIM needs 2017 or later), you can use STRING_SPLIT with CROSS APPLY to split ClientLabels by comma for each ClientId.
Then use a conditional aggregate to count NumberOfStates:
SELECT ClientId,
ClientName,
ClientLabels,
COUNT(CASE WHEN trim(v.value) IN ('California','Arizona', 'Texas', 'Virginia', 'Washington') THEN 1 END) NumberOfStates
FROM Clients c
CROSS APPLY STRING_SPLIT(c.ClientLabels,',') v
GROUP BY ClientId,ClientName,ClientLabels
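To check the split-and-count logic outside the database, here is a minimal Python sketch of the same idea: split on commas, trim each piece, and count exact matches against the list of states. The `clients` list is a hypothetical stand-in for a few rows of the Clients table, and `number_of_states` is an illustrative helper, not part of any library.

```python
# States taken from the question; everything else here is illustrative.
STATES = {"California", "Arizona", "Texas", "Virginia", "Washington"}

# Hypothetical stand-in for a few rows of the Clients table.
clients = [
    (1, "Justin Bieber", "California, Musician, Male"),
    (4, "Harry Potter", "Fake, Male"),
    (5, "Tom Holland", "Arizona, Actor, California, Male"),
]


def number_of_states(client_labels: str) -> int:
    """Split on commas, trim each label, count exact matches against STATES."""
    labels = [label.strip() for label in client_labels.split(",")]
    return sum(1 for label in labels if label in STATES)


# ClientId -> NumberOfStates
counts = {client_id: number_of_states(labels) for client_id, _, labels in clients}

# The exceptions the question asks for: anything other than exactly one state.
exceptions = [client_id for client_id, n in counts.items() if n != 1]

print(counts)
print(exceptions)
```

Because each label is compared as a whole, a made-up label such as 'West Virginia' would not be counted as 'Virginia', which is the main difference from a LIKE '%...%' match.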

Related

SQL Server : query to update record with latest entry

I have a table that maintains records of employers and employees' data. Something like this
EmployerName EmployerPhone EmployerAddress EmployeeName EmployeePhone EmployeeAddress Date
-------------------------------------------------------------------------------------------------------
John 12345 NewYork Harry 59786 NewYork 12-1-1991
Mac 22345 Bankok John 12345 Delhi 12-3-1991
Smith 54732 Arab Amar 59226 China 21-6-1991
Sarah 12345 Bhutan Mac 22345 NewYork 5-9-1991
Root 85674 NewYork Smith 54732 Japan 2-11-1991
I have another table that holds generic records keyed by phone number (for both employers and employees).
The table structure is as follows:
Phone Name Address
I want to copy the latest record for each phone, according to date, from Table1 to Table2.
Like this:
Phone Name Address
-----------------------
59786 Harry NewYork
22345 Mac NewYork
59226 Amar China
12345 Sarah Bhutan
22345 Mac NewYork
85674 Root NewYork
54732 Smith Arab
I've written many queries, but none of them produced the required result.
Any kind of help will be appreciated.
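To initialize the table without phones that already exist in Table2 (SQL Server has no INSERT IGNORE, so the NOT EXISTS check does the filtering):
-- This only skips phones already present in Table2; it does not pick the
-- latest row when Table1 itself lists a phone more than once.
INSERT INTO Table2 (Phone, Name, Address)
SELECT X.Phone, X.Name, X.Address FROM (
SELECT EmployeePhone AS Phone, EmployeeName AS Name, EmployeeAddress AS Address FROM Table1
UNION
SELECT EmployerPhone, EmployerName, EmployerAddress FROM Table1
) X
WHERE NOT EXISTS (SELECT Phone FROM Table2 WHERE Phone=X.Phone)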
I think this is what you are looking for, if I understand your question correctly. It should work as a one-off.
-- #tbl is assumed to hold the source rows (Table1 in the question)
DECLARE @restbl TABLE
(
Name varchar(100),
Phone varchar(20),
Addr varchar(100),
[Date] date,
RecType varchar(100)
)
-- Latest date per employer name/phone
INSERT INTO @restbl
SELECT EmployerName, EmployerPhone, NULL, MAX([Date]), 'Employer'
FROM #tbl
GROUP BY EmployerName, EmployerPhone
-- Latest date per employee name/phone
INSERT INTO @restbl
SELECT EmployeeName, EmployeePhone, NULL, MAX([Date]), 'Employee'
FROM #tbl
GROUP BY EmployeeName, EmployeePhone;
WITH LatestData (Name, Phone, [Date])
AS
(
SELECT Name, Phone, MAX([Date])
FROM @restbl
GROUP BY Name, Phone
)
INSERT INTO FinalTable (Name, Phone, [Address])
SELECT DISTINCT ld.Name, ld.Phone, ISNULL(tEmployer.EmployerAddress, tEmployee.EmployeeAddress) AS [Address]
FROM LatestData ld
LEFT JOIN #tbl tEmployer ON ld.Name = tEmployer.EmployerName AND ld.Phone = tEmployer.EmployerPhone AND ld.Date = tEmployer.Date
LEFT JOIN #tbl tEmployee ON ld.Name = tEmployee.EmployeeName AND ld.Phone = tEmployee.EmployeePhone AND ld.Date = tEmployee.Date
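The "latest record per phone" rule can be sketched outside SQL as follows. This is only an illustration in Python, assuming the dates in the question are day-month-year (21-6-1991 suggests so); `latest_by_phone` is a hypothetical helper. Each source row is unpivoted into an employer entry and an employee entry, and the newest entry per phone wins.

```python
from datetime import datetime

# Rows copied from the question's first table:
# (EmployerName, EmployerPhone, EmployerAddress,
#  EmployeeName, EmployeePhone, EmployeeAddress, Date)
table1 = [
    ("John", "12345", "NewYork", "Harry", "59786", "NewYork", "12-1-1991"),
    ("Mac", "22345", "Bankok", "John", "12345", "Delhi", "12-3-1991"),
    ("Smith", "54732", "Arab", "Amar", "59226", "China", "21-6-1991"),
    ("Sarah", "12345", "Bhutan", "Mac", "22345", "NewYork", "5-9-1991"),
    ("Root", "85674", "NewYork", "Smith", "54732", "Japan", "2-11-1991"),
]


def latest_by_phone(rows):
    """Return {phone: (name, address)} taken from the newest row per phone."""
    latest = {}
    for er_name, er_phone, er_addr, ee_name, ee_phone, ee_addr, date_text in rows:
        # Assumption: dates are day-month-year.
        when = datetime.strptime(date_text, "%d-%m-%Y")
        # Unpivot: one entry for the employer side, one for the employee side.
        for phone, name, addr in ((er_phone, er_name, er_addr),
                                  (ee_phone, ee_name, ee_addr)):
            if phone not in latest or when > latest[phone][0]:
                latest[phone] = (when, name, addr)
    return {phone: (name, addr) for phone, (_, name, addr) in latest.items()}


table2 = latest_by_phone(table1)
for phone, (name, addr) in table2.items():
    print(phone, name, addr)
```

Under the day-month-year reading, phone 54732 resolves to Smith/Japan, since 2-11-1991 is later than 21-6-1991.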

MS SQL 2012 query to combine each duplicate found entry [duplicate]

Hey all, I want to create a query that combines each set of duplicates into one entry.
An example of this is:
Name | ID | Tag | Address |carNum
-------------------------------------------------------
Bob Barker |2054 |52377 |235 Some road |9874
Bill Gates |5630 |69471 |014 Washington Rd. |3700
Bob Barker |2054 |97011 |235 Some road |9874
Bob Barker |2054 |40019 |235 Some road |9874
Steve Jobs |8501 |73051 |100 Infinity St. |4901
John Doe |7149 |86740 |7105 Bull Rd. |9282
Bill Gates |5630 |55970 |014 Washington Rd. |3700
Tim Boons |6370 |60701 |852 Mnt. Creek Rd. |7059
In the example above, Bob Barker and Bill gates are both in the database more than once so I would like the output to be the following:
Bob Barker|2054|52377/97011/40019 |235 Some road |9874
Bill Gates|5630|69471/55970 |014 Washington Rd.|3700
Steve Jobs|8501|73051 |100 Infinity St. |4901
John Doe |7149|86740 |7105 Bull Rd. |9282
Tim Boons |6370|60701 |852 Mnt. Creek Rd.|7059
Notice how the Bob Barker and Bill Gates rows append the Tag values (the only data that differs) into one row instead of spanning multiple rows. I want this so that I do not have to check whether the previous ID matches the current ID and append the data myself.
I am hoping a SQL query guru would have a query to do this for me!
Thanks for your time and help!
------------------------------------------------------------------------------------------------------------------------
Question has changed from ACCESS DATABASE to MS SQL SERVER 2012 database
------------------------------------------------------------------------------------------------------------------------
In MySQL, use GROUP_CONCAT for the Tag field and group the data by the Name field. (This applies to MySQL only; SQL Server 2012 has no GROUP_CONCAT.)
Query:
SELECT Name, ID, GROUP_CONCAT(Tag SEPARATOR '/') AS Tag, Address, carNum
FROM users GROUP BY Name
You can do it like this:
CREATE TABLE MyTable ( Name nvarchar(50)
, ID int
, Tag int
, Address nvarchar(50)
, carNum int
)
INSERT INTO MyTable VALUES
('Bob Barker', 2054, 52377, '235 Some road' , 9874)
, ('Bill Gates', 5630, 69471, '014 Washington Rd.' , 3700)
, ('Bob Barker', 2054, 97011, '235 Some road' , 9874)
, ('Bob Barker', 2054, 40019, '235 Some road' , 9874)
, ('Steve Jobs', 8501, 73051, '100 Infinity St.' , 4901)
, ('John Doe' , 7149, 86740, '7105 Bull Rd.' , 9282)
, ('Bill Gates', 5630, 55970, '014 Washington Rd.' , 3700)
, ('Tim Boons' , 6370, 60701, '852 Mnt. Creek Rd.' , 7059)
SELECT YT.Name
, ID
, LEFT(YT.SUB, LEN(YT.SUB) - 1) AS Tags
, Address
, carNum
FROM (SELECT DISTINCT
Name
, ( SELECT CAST(ST1.Tag AS nvarchar(5)) + '/' AS [text()] -- '/' matches the separator in the desired output
FROM MyTable ST1
WHERE ST1.ID = ST2.ID
ORDER BY ST1.Name
FOR
XML PATH('')
) SUB
, ID
, Address
, carNum
FROM MyTable ST2
) YT
DROP TABLE MyTable
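The grouping that the FOR XML PATH query performs can be illustrated with a short Python sketch: rows that share every column except Tag are collapsed into one row, and their tags are joined with '/'. The `rows` list copies the question's sample data; `combine_tags` is a hypothetical helper written for this illustration.

```python
# Sample rows from the question: (Name, ID, Tag, Address, carNum)
rows = [
    ("Bob Barker", 2054, 52377, "235 Some road", 9874),
    ("Bill Gates", 5630, 69471, "014 Washington Rd.", 3700),
    ("Bob Barker", 2054, 97011, "235 Some road", 9874),
    ("Bob Barker", 2054, 40019, "235 Some road", 9874),
    ("Steve Jobs", 8501, 73051, "100 Infinity St.", 4901),
    ("John Doe", 7149, 86740, "7105 Bull Rd.", 9282),
    ("Bill Gates", 5630, 55970, "014 Washington Rd.", 3700),
    ("Tim Boons", 6370, 60701, "852 Mnt. Creek Rd.", 7059),
]


def combine_tags(rows, separator="/"):
    """Collapse rows that differ only in Tag, joining the tags in input order."""
    combined = {}  # dicts keep insertion order, so first appearance wins
    for name, person_id, tag, address, car_num in rows:
        key = (name, person_id, address, car_num)
        combined.setdefault(key, []).append(str(tag))
    return [
        (name, person_id, separator.join(tags), address, car_num)
        for (name, person_id, address, car_num), tags in combined.items()
    ]


result = combine_tags(rows)
for row in result:
    print(row)
```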

ms-access 2010: count duplicate names per household address

I am currently working with a spreadsheet in MS Access 2010 which contains about 130k rows of information about people who voted in a local election recently. Each row has their residential information (street name, number, postcode etc.) and personal information (title, surname, forename, middle name, DOB etc.). Each row represents an individual person rather than a household (therefore in many cases the same residential address appears more than once as more than one person resides in a particular household).
What I want to achieve is basically to create a new field in this dataset called 'count'. I want this field to give me a count of how many different surnames reside at a single address.
Is there an SQL script that will allow me to do this in Access 2010?
+------------------+----------+-------+---------+----------+-------------+
| PROPERTYADDRESS1 | POSTCODE | TITLE | SURNAME | FORENAME | MIDDLE_NAME |
+------------------+----------+-------+---------+----------+-------------+
FAKEADDRESS1 EEE 5GG MR BLOGGS JOE N
FAKEADDRESS2 EEE 5BB MRS BLOGGS SUZANNE P
FAKEADDRESS3 EEE 5RG MS SMITH PAULINE S
FAKEADDRESS4 EEE 4BV DR JONES ANNE D
FAKEADDRESS5 EEE 3AS MR TAYLOR STUART A
The following syntax has got me close so far:
SELECT COUNT(electoral.SURNAME)
FROM electoral
GROUP BY electoral.UPRN
However, instead of returning me all 130k odd rows, it only returns me around 67k rows. Is there anything I can do to the syntax to achieve the same result, but just returning every single row?
Any help is greatly appreciated!
Thanks
You could use something like this:
select *,
count(surname) over (partition by householdName)
from myTable
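If you have only one column that contains the full name,
e.g. Rob Adams
then you can extract the surname into its own column so it is easier to use in the select. LEFT with CHARINDEX returns the part before the first space:
SELECT LEFT('HELLO WORLD',CHARINDEX(' ','HELLO WORLD')-1)
and RIGHT returns the part after it, which in our example is the surname (fullname stands for the combined name column):
select right(fullname, len(fullname) - charindex(' ', fullname)) as surname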
example on how to use charindex, left and right here:
http://social.technet.microsoft.com/wiki/contents/articles/17948.t-sql-right-left-substring-and-charindex-functions.aspx
if there are any questions, leave a comment.
EDIT: I edited the query, had a syntax error, please try it again. This works on sql server.
here is an example:
create table #temp (id int, PropertyAddress varchar(50), surname varchar(50), forname varchar(50))
insert into #temp values
(1, 'hiddenBase', 'Adamns' , 'Kara' ),
(2, 'hiddenBase', 'Adamns' , 'Anne' ),
(3, 'hiddenBase', 'Adamns' , 'John' ),
(4, 'QueensResidence', 'Queen' , 'Oliver' ),
(5, 'QueensResidence', 'Queen' , 'Moira' ),
(6, 'superSecretBase', 'Diggle' , 'John' ),
(7, 'NandaParbat', 'Merlin' , 'Malcom' )
select * from #temp
select *,
count (surname) over (partition by PropertyAddress) as CountMembers
from #temp
gives:
1 hiddenBase Adamns Kara 3
2 hiddenBase Adamns Anne 3
3 hiddenBase Adamns John 3
7 NandaParbat Merlin Malcom 1
4 QueensResidence Queen Oliver 2
5 QueensResidence Queen Moira 2
6 superSecretBase Diggle John 1
Your query should look like this:
select *,
count (SURNAME) over (partition by PropertyAddress) as CountFamilyMembers
from electoral
EDIT
If OVER (PARTITION BY ...) isn't supported, you can aggregate with GROUP BY instead. Note that this returns one row per address rather than one row per person:
select PropertyAddress,
count (SURNAME) as CountFamilyMembers
from electoral
group by PropertyAddress
GROUP BY creates an aggregate query, so it's by design that you get fewer records (one per UPRN).
To get the count for each row in the original table, you can join the table with the aggregate query:
SELECT electoral.*, elCount.NumberOfPeople
FROM electoral
INNER JOIN
(
SELECT UPRN, COUNT(*) AS NumberOfPeople
FROM electoral
GROUP BY UPRN
) AS elCount
ON electoral.UPRN = elCount.UPRN
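The aggregate-then-join-back pattern can be sketched in Python as well. This version counts distinct surnames per address, which is what the question asks for, and then attaches that count to every original row so no rows are lost. The `people` rows are hypothetical sample data, not taken from the electoral table.

```python
from collections import defaultdict

# Hypothetical sample rows: (address, surname, forename)
people = [
    ("FAKEADDRESS1", "BLOGGS", "JOE"),
    ("FAKEADDRESS1", "BLOGGS", "SUZANNE"),
    ("FAKEADDRESS1", "SMITH", "PAULINE"),
    ("FAKEADDRESS2", "JONES", "ANNE"),
]

# Step 1: the "GROUP BY" part, one set of surnames per address.
surnames_by_address = defaultdict(set)
for address, surname, _ in people:
    surnames_by_address[address].add(surname)

# Step 2: the "JOIN back" part, every original row keeps its place
# and gains the count for its address.
annotated = [
    row + (len(surnames_by_address[row[0]]),)
    for row in people
]

for row in annotated:
    print(row)
```

The output has as many rows as the input, which is the property the plain GROUP BY query was missing.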
Given the update I want to post another answer. Try it like this:
create table #temp2 ( PropertyAddress1 varchar(50), POSTCODE varchar(20), TITLE varchar (20),
surname varchar(50), FORENAME varchar(50), MIDDLE_NAME varchar (50) )
insert into #temp2 values
('FAKEADDRESS1', 'EEE 5GG', 'MR', 'BLOGGS', 'JOE', 'N'),
('FAKEADDRESS1', 'EEE 5BB', 'MRS', 'BLOGGS', 'SUZANNE', 'P'),
('FAKEADDRESS2', 'EEE 5RG', 'MS', 'SMITH', 'PAULINE', 'S'),
('FAKEADDRESS3', 'EEE 4BV', 'DR', 'JONES', 'ANNE', 'D'),
('FAKEADDRESS4', 'EEE 3AS', 'MR', 'TAYLOR', 'STUART', 'A')
select PropertyAddress1, surname,count (#temp2.surname) as CountADD
into #countTemp
from #temp2
group by PropertyAddress1, surname
select * from #temp2 t2
left join #countTemp ct
on t2.PropertyAddress1 = ct.PropertyAddress1 and t2.surname = ct.surname
This yields:
PropertyAddress1 POSTCODE TITLE surname FORENAME MIDDLE_NAME PropertyAddress1 surname CountADD
FAKEADDRESS1 EEE 5GG MR BLOGGS JOE N FAKEADDRESS1 BLOGGS 2
FAKEADDRESS1 EEE 5BB MRS BLOGGS SUZANNE P FAKEADDRESS1 BLOGGS 2
FAKEADDRESS2 EEE 5RG MS SMITH PAULINE S FAKEADDRESS2 SMITH 1
FAKEADDRESS3 EEE 4BV DR JONES ANNE D FAKEADDRESS3 JONES 1
FAKEADDRESS4 EEE 3AS MR TAYLOR STUART A FAKEADDRESS4 TAYLOR 1

SQL - Distribute Same Values equally across X number of tables

I want to see if someone knows a way to evenly distribute multiple like values across "x" number of temp tables ensuring that the 'like' values (same team name in this example) never get lumped into one particular table. What I am trying to do is create heats for a race and evenly distribute teams across tables. Ex:
**Teams**
-----------
Los Angeles
New York
New York
Los Angeles
Florida
Florida
Arizona
Texas
Alabama
Alaska
New York
New York
New York
I would like the distribution to end up something like this, where teams with the same name are evenly distributed across 2 (or 3 or 4) heats:
**Heat One**
-------------
Los Angeles
New York
Florida
Arizona
Alabama
New York
New York
**Heat Two**
------------
Los Angeles
New York
Florida
Texas
Alaska
New York
Starting with SQL Server 2005, there's native functionality for bucketing data: NTILE().
The NTILE function is the fourth of four windowing functions introduced in SQL Server 2005. NTILE takes a different approach to partitioning data. ROW_NUMBER, RANK and DENSE_RANK will generate variable sized buckets of data based on the partition key(s). NTILE attempts to split the data into equal, fixed size buckets. BOL has a comprehensive page comparing the ranking functions if you want a quick visual reference on their effects.
Syntax
The syntax for NTILE differs slightly from the other window functions. It's NTILE(@BUCKET_COUNT) OVER ([PARTITION BY _] ORDER BY _), where @BUCKET_COUNT is a positive integer or bigint value.
The challenge is ensuring we get a good distribution, and that's the part that is subject to the vagaries of the random number generator (NEWID() calls or (SELECT NULL)).
Leveraging Rhys's setup
CREATE table dbo.Teams (TeamId int, TeamName varchar(32));
insert dbo.Teams values
( 1, 'Los Angeles'),
( 2, 'New York'),
( 3, 'New York'),
( 4, 'Los Angeles'),
( 5, 'Florida'),
( 6, 'Florida'),
( 7, 'Arizona'),
( 8, 'Texas'),
( 9, 'Alabama'),
(10, 'Alaska'),
(11, 'New York'),
(12, 'New York'),
(13, 'New York');
SELECT
NTILE(2) OVER (ORDER BY NEWID()) AS Heat
, NTILE(2) OVER (ORDER BY (SELECT NULL)) AS HeatAlternate
, T.TeamName
, T.TeamId
FROM
dbo.Teams AS T
ORDER BY
1,3;
One of the nicer things about this approach is that it can be switched out to make whatever bucketing size you want by simply changing the value passed to ntile. It also ought to scale better as it would only take one pass through the source table.
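To make the bucket sizes concrete, here is a small Python sketch of how NTILE numbers rows once they are ordered: rows are split into consecutive groups, and when the row count does not divide evenly, the earlier groups receive one extra row. The `ntile` function is an illustrative re-implementation, not SQL Server code.

```python
def ntile(row_count: int, buckets: int) -> list:
    """Return the bucket number (1-based) for each of row_count ordered rows."""
    base, extra = divmod(row_count, buckets)
    result = []
    for bucket in range(1, buckets + 1):
        # The first `extra` buckets get one additional row.
        size = base + (1 if bucket <= extra else 0)
        result.extend([bucket] * size)
    return result


two_heats = ntile(13, 2)
four_heats = ntile(13, 4)
print(two_heats)
print(four_heats)
```

For the 13 teams above, two heats come out as 7 and 6 rows. Note that the heats are balanced in size only; with a purely random ordering, nothing guarantees that teams sharing a name land in different heats.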
This approach doesn't sound right (having separate tables called Heat1, Heat2 etc) so you might want to re-think what you're doing, but if your circumstances dictate this is a good approach then how about allocating a random unique (but sequential) number to each team then use MOD to split the teams across heats? In order to get the 'like' teams (same teamname) into different heats they just need to be randomised together and the MOD will separate them.
create table dbo.Teams (TeamId int, TeamName varchar(32))
go
insert dbo.Teams values
( 1, 'Los Angeles'),
( 2, 'New York'),
( 3, 'New York'),
( 4, 'Los Angeles'),
( 5, 'Florida'),
( 6, 'Florida'),
( 7, 'Arizona'),
( 8, 'Texas'),
( 9, 'Alabama'),
(10, 'Alaska'),
(11, 'New York'),
(12, 'New York'),
(13, 'New York')
go
-- First get a random number per unique team name
; with cte as (
select row_number() over (order by newid()) as lrn, t.TeamName
from dbo.Teams t
group by t.TeamName
)
-- Second get a unique random number per team with like teams ordered together
select row_number() over (order by lrn, newid()) - 1 as rn, t.*
into #teams
from dbo.Teams t
join cte c on c.TeamName = t.TeamName
select 'Heat1', *
from #teams
where rn % 4 = 0
select 'Heat2', *
from #teams
where rn % 4 = 1
select 'Heat3', *
from #teams
where rn % 4 = 2
select 'Heat4', *
from #teams
where rn % 4 = 3
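The same idea, sketched in Python under the assumption that teams sharing a name only need to sit next to each other in the numbering: shuffle the distinct names, order the teams so like names are adjacent, then deal them out round-robin with the modulo operator. `assign_heats` is a hypothetical helper, and the seeded `random.Random` is only there to make the run repeatable.

```python
import random

teams = [
    "Los Angeles", "New York", "New York", "Los Angeles", "Florida",
    "Florida", "Arizona", "Texas", "Alabama", "Alaska",
    "New York", "New York", "New York",
]


def assign_heats(teams, heat_count, rng):
    """Deal teams into heat_count heats, keeping like names spread out."""
    # Random order of the distinct team names (the first CTE in the SQL).
    names = sorted(set(teams))
    rng.shuffle(names)
    rank = {name: position for position, name in enumerate(names)}

    # Like teams end up adjacent (the second row_number in the SQL).
    ordered = sorted(teams, key=rank.__getitem__)

    # Round-robin by position, the equivalent of rn % heat_count.
    heats = [[] for _ in range(heat_count)]
    for position, name in enumerate(ordered):
        heats[position % heat_count].append(name)
    return heats


heats = assign_heats(teams, 2, random.Random(0))
for number, heat in enumerate(heats, start=1):
    print("Heat", number, heat)
```

Because equal names occupy consecutive positions, each name's count differs by at most one between any two heats, whatever the shuffle produces.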

Database table with no foreign key must be joined on nvarchar field results in multiple unexisting rows

Summary
Okay, back in 2007 I was asked to produce a tiny piece of software whose objective was to record people's names, addresses and phone numbers from the local phone directory.
At that time, the only requirement was to be able to group this list by street name. Hence, the street name in the address was sufficient.
For the third year in a row, I have the yearly headache of matching the street names to sectors. A sector is nothing more than "downtown", "upper city", "125e", "over 125e" and "unknown" for streets I can't classify.
Data sample and structure
I have an initial table which was created the first time the software was delivered. I will make it SQL Server, as I imported the data into it for ease of work.
CREATE TABLE Contacts (
ContactId int not null identity(1, 1) primary key
, lastname nvarchar(50) not null
, firstname nvarchar(20) not null
, civic nvarchar(10) not null
, street nvarchar(20) not null
, city nvarchar(20) not null
, phone bigint not null
)
-- With the following sample data:
insert into Contacts (lastname, firstname, civic, street, city, phone)
values (N'LNAME-5551231234', N'A', N'89', N'MY STREET', N'SHAWINIGAN', 5551231234)
GO
insert into Contacts (lastname, firstname, civic, street, city, phone)
values (N'LNAME-5559879876', N'FNAME', N'10', N'YOUR STREET', N'SHAWINIGAN', 5559879876)
GO
insert into Contacts (lastname, firstname, civic, street, city, phone)
values (N'LNAME-5554564567', N'AFNAME', N'25', N'HIS STREET', N'SHAWINIGAN-SUD', 5554564567)
GO
Then, I added a table with the correctly spelled street names, and another for the different sectors.
-- Sectors
CREATE TABLE Sectors (
sectorId int not null identity(1, 1) primary key
, sectorName nvarchar(20) not null
)
GO
insert into Sectors (sectorName)
values (N'Downtown')
GO
insert into Sectors (sectorName)
values (N'Upper city')
GO
-- Streets
CREATE TABLE Streets (
streetId int not null identity(1, 1) primary key
, sectorId int not null references Sectors (sectorId)
, streetName nvarchar(20) not null
)
GO
insert into Streets (sectorId, streetName)
values (1, N'My St.')
GO
insert into Streets (sectorId, streetName)
values(1, N'Ur Street')
GO
insert into Streets (sectorId, streetName)
values (2, N'HIS STREET')
GO
Which would result, for the benefit of the explanation:
Sectors
sectorId | sectorName
---------------------
1 | Downtown
2 | Upper city
Streets
streetId | sectorId | streetName
--------------------------------
1 | 1 | My St.
2 | 1 | Ur Street
3 | 2 | HIS STREET
Contacts
contactId | lastname | firstname | civic | street | city | phone
--------------------------------------------------------------------------------------------
1 | LNAME-5551231234 | A | 89 | My Street | SHAWINIGAN | 5551231234
2 | LNAME-5559879876 | FNAME | 10 | Your Street | SHAWINIGAN | 5559879876
3 | LNAME-5554564567 | AFNAME | 25 | HIS STREET | SHAWINIGAN-SUD | 5554564567
Objective
I have to resolve the street name conflicts caused by spelling differences. First, it seems that only some values of Contacts.street exist in Streets.streetName. Thus, when I compare with an equals sign (=), I get only about 6,000 rows, while the population of the city is about 13,000 people.
Because of this, I tried joining the tables with a LIKE clause, but then I get about 20,000 rows, with duplicate combinations of lastname, firstname, civic and phone from Contacts.
In addition, the LIKE match seems to lack precision, and I get some strange results.
For instance, suppose I have the street 125e Rue in Streets and 12e Rue and 25e Rue in Contacts. The contact then looks duplicated because both streets meet the LIKE pattern. (It would be much easier to understand with production data, but these are people's addresses and phone numbers, so I can't share them.)
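The duplication can be reproduced with a small Python sketch of the LIKE '%' + street + '%' comparison, which is a substring test. The street values below are hypothetical, and the case-insensitive comparison is an assumption about the database collation.

```python
# Hypothetical values standing in for Streets.streetName and Contacts.street.
streets = ["125e Rue", "25e Rue", "12e Rue"]
contacts = ["12e Rue", "25e Rue"]


def like_contains(street_name: str, contact_street: str) -> bool:
    """Mimics: streetName LIKE '%' + contact_street + '%' (case-insensitive)."""
    return contact_street.lower() in street_name.lower()


# Substring matching: one contact street can hit several street names.
like_matches = {
    contact: [s for s in streets if like_contains(s, contact)]
    for contact in contacts
}

# Exact matching: at most one hit per contact street.
exact_matches = {
    contact: [s for s in streets if s.lower() == contact.lower()]
    for contact in contacts
}

print(like_matches)
print(exact_matches)
```

Here '25e Rue' matches both '125e Rue' and '25e Rue', so a join on that pattern returns the contact twice, while '12e Rue' matches only itself.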
Queries attempted so far
This query produces the kind of duplicates mentioned above, but it only duplicates information from Contacts, since Streets.streetName changes from one record to another within the result. As a result, the query makes it look as if there were multiple addresses for A LNAME-5551231234, for instance.
select c.city
, s.sectorName
, st.streetName
, c.civic
, c.lastname
, c.firstname
, c.phone
from Contacts c
inner join Streets st on st.streetName like N'%' + c.street + N'%'
inner join Sectors s on s.sectorId = st.sectorId
group by c.city
, s.sectorName
, st.streetName
, c.civic
, c.lastname
, c.firstname
, c.phone
order by c.city
, s.sectorName
, st.streetName
, c.civic
, c.lastname
Another query, from which I would have liked to take inspiration since it looks like it produces the right results, when we remove as much information as possible from the Contacts table.
Finally, I'm pretty confused myself, and I don't expect that one of you, professional developers and DBAs, can help me with one simple answer, but rather with a walkthrough and an empirical approach, so I'm willing to try anything you may think of that I have not already thought of.
Thanks for any help you provide. =)
surely it can't be this simple...
select c.city
, s.sectorName
, st.streetName
, c.civic
, c.lastname
, c.firstname
, c.phone
from Contacts c
left outer join Streets st on st.streetName = c.street
left outer join Sectors s on s.sectorId = st.sectorId -- left join so contacts without a matching street are kept
group by c.city
, s.sectorName
, st.streetName
, c.civic
, c.lastname
, c.firstname
, c.phone
order by c.city
, s.sectorName
, st.streetName
, c.civic
, c.lastname