SQL Column / Row Grouping - sql

I am new to SQL. I am looking for a simple SQL solution to combine rows whose column contains the same data, in this case a zip code. For example, the data looks like this:
state, county, city, zip, count
"CA","ALAMEDA","HAYWARD","94541",5371
"CA","ALAMEDA","HAYWARD","94542",2209
"CA","ALAMEDA","HAYWARD","94544",7179
"CA","ALAMEDA","HAYWARD","94545",4209
"CA","ALAMEDA","CASTRO VALLEY","94546",7213
"CA","ALAMEDA","HAYWARD","94546",37
"CA","ALAMEDA","LIVERMORE","94550",9809
"CA","ALAMEDA","LIVERMORE","94551",6558
"CA","ALAMEDA","CASTRO VALLEY","94552",3121
"CA","ALAMEDA","HAYWARD","94552",12
"CA","ALAMEDA","FREMONT","94555",5392
I'd like to end up with the data to look like this:
state, county, city, zip, count
"CA","ALAMEDA","HAYWARD","94541",5371
"CA","ALAMEDA","HAYWARD","94542",2209
"CA","ALAMEDA","HAYWARD","94544",7179
"CA","ALAMEDA","HAYWARD","94545",4209
"CA","ALAMEDA","CASTRO VALLEY / HAYWARD","94546",7250
"CA","ALAMEDA","LIVERMORE","94550",9809
"CA","ALAMEDA","LIVERMORE","94551",6558
"CA","ALAMEDA","CASTRO VALLEY / HAYWARD","94552",3133
"CA","ALAMEDA","FREMONT","94555",5392
You can see that in two rows the data has been combined or summed. For rows that contain the exact same zip code, the city names (both) appear in the city column and the count is the sum of the count from each row.
Is there any way to do this using SQL? Even if it requires two different SQL statements that is fine.

Assuming SQL Server, you can use FOR XML to get your desired results.
select distinct t.state,t.county,t.zip,t2.sumcount,
STUFF(
(
SELECT '/' + city AS [text()]
FROM mytable t3
WHERE t.zip = t3.zip
FOR XML PATH('')
), 1, 1, '') AS ColList
from mytable t
join (select zip, sum(count) as sumcount
from mytable
group by zip) t2 on t.zip=t2.zip
If you are using MySQL, look at using GROUP_CONCAT:
select distinct t.state,t.county,t.zip,t2.sumcount,
GROUP_CONCAT(t.city) as cities
from mytable t
join (select zip, sum(count) as sumcount
from mytable
group by zip) t2 on t.zip=t2.zip
GROUP BY t.state,t.county,t.zip,t2.sumcount
Good luck.
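For readers without SQL Server or MySQL at hand, the same idea can be tried in SQLite from Python. This is only a sketch under assumptions: it uses SQLite's GROUP_CONCAT with a ' / ' separator, loads just a subset of the question's rows, and reuses the table name mytable from the answer above. SQLite does not guarantee the order of the concatenated city names.

```python
import sqlite3

# A subset of the rows from the question (state, county, city, zip, count).
rows = [
    ("CA", "ALAMEDA", "HAYWARD", "94545", 4209),
    ("CA", "ALAMEDA", "CASTRO VALLEY", "94546", 7213),
    ("CA", "ALAMEDA", "HAYWARD", "94546", 37),
    ("CA", "ALAMEDA", "CASTRO VALLEY", "94552", 3121),
    ("CA", "ALAMEDA", "HAYWARD", "94552", 12),
    ("CA", "ALAMEDA", "FREMONT", "94555", 5392),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE mytable (state TEXT, county TEXT, city TEXT, zip TEXT, "count" INTEGER)'
)
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?, ?)", rows)

# One output row per zip: city names joined with ' / ', counts summed.
query = """
SELECT state, county, GROUP_CONCAT(city, ' / ') AS cities, zip, SUM("count") AS total
FROM mytable
GROUP BY state, county, zip
ORDER BY zip
"""
combined = conn.execute(query).fetchall()
for row in combined:
    print(row)
```

Grouping directly on state, county and zip avoids the extra join to a summed subquery; whether that is acceptable depends on whether state and county are always identical for a given zip.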

I tweaked sgeddes' superb answer to actually get the count and avoid duplicate entries, and added a support script so you can test it. This does assume SQL Server.
IF EXISTS(SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'MYTABLE') BEGIN
drop table MYTABLE;
END;
go
create table MYTABLE
(
state nvarchar(2)
,county nvarchar(100)
,city nvarchar(100)
,zip nvarchar(10)
)
go
insert into MYTABLE(state,county,city,zip) values('CA','ALAMEDA','HAYWARD','94541');
insert into MYTABLE(state,county,city,zip) values('CA','ALAMEDA','HAYWARD','94541');
insert into MYTABLE(state,county,city,zip) values('CA','ALAMEDA','HAYWARD','94544');
insert into MYTABLE(state,county,city,zip) values('CA','ALAMEDA','HAYWARD','94545');
insert into MYTABLE(state,county,city,zip) values('CA','ALAMEDA','CASTRO VALLEY','94546');
insert into MYTABLE(state,county,city,zip) values('CA','ALAMEDA','CASTRO VALLEY','94546');
insert into MYTABLE(state,county,city,zip) values('CA','ALAMEDA','CASTRO VALLEY','94546');
insert into MYTABLE(state,county,city,zip) values('CA','ALAMEDA','CASTRO VALLEY','94546');
insert into MYTABLE(state,county,city,zip) values('CA','ALAMEDA','CASTRO VALLEY','94546');
insert into MYTABLE(state,county,city,zip) values('CA','ALAMEDA','HAYWARD','94546');
insert into MYTABLE(state,county,city,zip) values('CA','ALAMEDA','HAYWARD','94546');
insert into MYTABLE(state,county,city,zip) values('CA','ALAMEDA','HAYWARD','94546');
insert into MYTABLE(state,county,city,zip) values('CA','ALAMEDA','HAYWARD','94546');
insert into MYTABLE(state,county,city,zip) values('CA','ALAMEDA','LIVERMORE','94550');
insert into MYTABLE(state,county,city,zip) values('CA','ALAMEDA','LIVERMORE','94551');
insert into MYTABLE(state,county,city,zip) values('CA','ALAMEDA','CASTRO VALLEY','94552');
insert into MYTABLE(state,county,city,zip) values('CA','ALAMEDA','HAYWARD','94552');
insert into MYTABLE(state,county,city,zip) values('CA','ALAMEDA','FREMONT','94555');
select distinct
t.state
,t.county
,t.zip
,t2.sumcount
,STUFF((
SELECT distinct '/' + city AS [text()]
FROM mytable t3
WHERE t.zip = t3.zip
FOR XML PATH('')
), 1, 1, '') AS ColList
from
mytable t
inner join
(
select zip, sum(count) as sumcount
from
(
select zip,count(*) as count
from mytable
group by zip
) x
group by zip
) t2
on t.zip=t2.zip
The output looks like this:
state  county   zip    sumcount  ColList
CA     ALAMEDA  94541  2         HAYWARD
CA     ALAMEDA  94544  1         HAYWARD
CA     ALAMEDA  94545  1         HAYWARD
CA     ALAMEDA  94546  9         CASTRO VALLEY/HAYWARD
CA     ALAMEDA  94550  1         LIVERMORE
CA     ALAMEDA  94551  1         LIVERMORE
CA     ALAMEDA  94552  2         CASTRO VALLEY/HAYWARD
CA     ALAMEDA  94555  1         FREMONT

Related

How to handle Multivalue attribute in sql server

I have a table in Data warehouse.
Create table customer
(
id int,
name varchar(30),
address varchar(50)
);
Sample data in the table:
insert into Customer values(1, 'Smith', 'abc,def, lkj');
insert into Customer values(2, 'James', 'pqr,lmn');
I want to split the address column and insert a new row for each value when there are several, like this:
1 Smith abc
1 Smith def
1 Smith lkj
2 James pqr
2 James lmn
I have 100000 records; please help me with this.
You can use string_split() (available in SQL Server 2016 and later) and adjust the insert statement:
insert into Customer (id, name, address)
select v.id, v.name, s.value
from (values (1, 'Smith', 'abc,def,lkj')
) v(id, name, address) cross apply
string_split(v.address, ',') s;
You might also want to add a check constraint on address so it does not contain a comma.
You would load from a staging table to the final table by doing:
insert into Customer (id, name, address)
select t.id, t.name, s.value
from staging t cross apply
string_split(t.address, ',') s;
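To make the intended result concrete, here is a small Python sketch of the same split, not the SQL Server implementation. The function name split_addresses is hypothetical. It also trims each piece, because the sample value 'abc,def, lkj' contains a space after a comma that a plain split would keep.

```python
def split_addresses(rows):
    """Expand each (id, name, 'a,b,c') row into one row per address part."""
    expanded = []
    for cust_id, name, address in rows:
        for part in address.split(","):
            part = part.strip()  # drop the stray space in 'abc,def, lkj'
            if part:  # skip empty pieces such as those left by a trailing comma
                expanded.append((cust_id, name, part))
    return expanded


# Sample rows from the question.
customers = [(1, "Smith", "abc,def, lkj"), (2, "James", "pqr,lmn")]
normalized = split_addresses(customers)
for row in normalized:
    print(row)
```

In the SQL version, wrapping s.value in LTRIM(RTRIM(...)) would have the same trimming effect.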

I need to migrate data from one old table to a new table by storing appropriate CityId instead CityName

I'm migrating data from one table to another in SQL Server. The old table has 10 columns, one of which is 'CityName' (a varchar). The new table has a column 'CityId', which is an integer. A third table holds the city ids and names. I need to store the appropriate CityId in the new table instead of the CityName. Please help me. Thanks in advance.
You'll need to join the source table to the CityName field in the city information table:
INSERT INTO dbo.Destination (CityID, OtherStuff)
SELECT t1.CityID, t2.OtherStuff
FROM CityInformationTable t1
INNER JOIN SourceTable t2
ON t1.CityName = t2.CityName
The example below should give you the idea: inner join to your lookup table.
declare @t_cities table (Id int, City nvarchar(20))
insert into @t_cities
(Id, City)
values
(1, 'London'),
(2, 'Dublin'),
(3, 'Paris'),
(4, 'Berlin')
declare @t table (City nvarchar(20), SomeColumn nvarchar(10))
insert into @t
values
('London', 'AaaLon'),
('Paris', 'BeePar'),
('Berlin', 'CeeBer'),
('London', 'DeeLon'),
('Dublin', 'EeeDub')
declare @finalTable table (Id int, SomeColumn nvarchar(10))
insert into @finalTable
select c.Id, t.SomeColumn
from @t t
join @t_cities c on c.City = t.City
select * from @finalTable
Output:
Id SomeColumn
1 AaaLon
3 BeePar
4 CeeBer
1 DeeLon
2 EeeDub
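One thing an inner join hides is source rows whose city name has no match in the lookup table: they are silently dropped. The sketch below, run against SQLite from Python rather than SQL Server, checks for such rows before migrating. The table names cities, old_table and new_table and the unmatched 'Madrid' row are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE cities (Id INTEGER PRIMARY KEY, City TEXT);
    CREATE TABLE old_table (City TEXT, SomeColumn TEXT);
    CREATE TABLE new_table (CityId INTEGER, SomeColumn TEXT);
    INSERT INTO cities VALUES (1, 'London'), (2, 'Dublin'), (3, 'Paris'), (4, 'Berlin');
    -- 'Madrid' is a hypothetical row with no entry in the lookup table.
    INSERT INTO old_table VALUES ('London', 'AaaLon'), ('Paris', 'BeePar'), ('Madrid', 'FffMad');
    """
)

# Source rows that the inner join would drop.
unmatched = conn.execute(
    """
    SELECT o.City, o.SomeColumn
    FROM old_table o
    LEFT JOIN cities c ON c.City = o.City
    WHERE c.Id IS NULL
    """
).fetchall()
print("unmatched:", unmatched)

# The migration itself: look up the id by name and insert it.
conn.execute(
    """
    INSERT INTO new_table (CityId, SomeColumn)
    SELECT c.Id, o.SomeColumn
    FROM old_table o
    JOIN cities c ON c.City = o.City
    """
)
migrated = conn.execute(
    "SELECT CityId, SomeColumn FROM new_table ORDER BY SomeColumn"
).fetchall()
print("migrated:", migrated)
```

The LEFT JOIN ... WHERE c.Id IS NULL check carries over to SQL Server unchanged.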

SQL to combine results into one group in the where clause

I have a query
SELECT name,
COUNT(name)
FROM employee
WHERE location LIKE '%NY%'
GROUP BY name
name count
alex m 10
alex.m 5
alex.ma 1
alex 500
How can I combine all the alex's into just one Alex
so that I get the output as
name count
alex 516
I need something like: if the name matches alex%, then treat it as alex.
Here is a dynamic solution for SQL Server.
First, let's see the sample data I worked on:
create table #temp
(name varchar(20))
insert into #temp values ('jack')
insert into #temp values ('jack rx')
insert into #temp values ('jack.a')
insert into #temp values ('jack.bb')
insert into #temp values ('jack.xy')
insert into #temp values ('brandon.12')
insert into #temp values ('brandon')
insert into #temp values ('brandon.k7s')
insert into #temp values ('brandon.bg')
insert into #temp values ('Jonathan')
Then, we need to employ string operators:
;with cte (name, charin, charin_space) as
(
select name,CHARINDEX('.',name,0) as charin, CHARINDEX(' ',name,0) as charin_space
from #temp
)
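select name,(case when charin = 0 and charin_space = 0 then name
when charin = 0 and charin_space <> 0 then SUBSTRING(name,0,charin_space)
when charin <> 0 and charin_space = 0 then SUBSTRING(name,0,charin)
-- both delimiters present: cut at whichever comes first
when charin < charin_space then SUBSTRING(name,0,charin)
else SUBSTRING(name,0,charin_space)
end) as mainName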
into #temp2
from cte
The temp table #temp2 now maps each name to its base name (jack, brandon, Jonathan). All that remains is to join the two tables and group by the base name:
select t2.MainName,COUNT(t2.MainName)
from #temp t1
inner join #temp2 t2 on t1.name = t2.name
group by t2.mainName
I hope it helps!
You need to group on part of the name. This uses LEFT, which SQL Server supports; you don't specify which DBMS you are using. The query works with your example, but it will also pick up Alexa, Alexander, ...
SELECT LEFT(name, 4),
COUNT(name)
FROM employee
WHERE location LIKE '%NY%'
GROUP BY LEFT(name, 4)
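The two answers differ in how they pick the group key: the first cuts at the first '.' or space, the second takes a fixed four-character prefix. A short Python sketch of the first rule, applied to the counts from the question, shows the intended result. The helper name base_name is hypothetical.

```python
import re
from collections import Counter


def base_name(name):
    """Return the part of the name before the first '.' or space."""
    return re.split(r"[. ]", name, maxsplit=1)[0]


# Per-name counts as shown in the question's output.
counts = {"alex m": 10, "alex.m": 5, "alex.ma": 1, "alex": 500}

totals = Counter()
for name, count in counts.items():
    totals[base_name(name)] += count

print(dict(totals))
```

Unlike the fixed-length prefix, this rule keeps a name such as alexander in its own group, because it contains no delimiter.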

SQL Extract Values from a String

How do I extract values from a string? I'm trying to separate into 3 new columns. A separate column for city, state and zipcode.
I've tried
select address2,
left(address2, charindex('',address2)-1)
from table
When I try the code below, I get "Invalid length parameter passed to the left or substring function":
,LTRIM(substring(a.Address2, CHARINDEX(' ', a.Address2)+1, CHARINDEX(' ', substring(a.address2, charindex(' ',
a.address2)+1, len(a.address2)))-1))
I can break out the city (except for West Warwick) using the following code, but not sure how to make it work for state and zip. This also removes the error.
SUBSTRING(Address2,1,CHARINDEX(' ', a.address2+ ' ')-1) as city
Any ideas what to try?
It looks like your zip codes and your states are all the same length. If that is true, you should be able to use something like this:
SELECT
LEFT(a.Address2,LEN(a.Address2) - 14) AS City,
RIGHT(LEFT(a.Address2,LEN(a.Address2) - 11),2) AS State,
RIGHT(a.Address2,10) AS Zip_Code
FROM
table AS a;
DEMO CODE
Create the table and data:
CREATE TABLE MyTable (Address2 VARCHAR(100));
INSERT INTO MyTable
VALUES
('SAN DIEGO CA 92128-1234'),
('WEST WARWICK RI 02893-1349'),
('RICHMOND IN 47374-9409');
The query:
SELECT
LEFT(Address2,LEN(Address2) - 14) AS City,
RIGHT(LEFT(Address2,LEN(Address2) - 11),2) AS State,
RIGHT(Address2,10) AS Zip_Code
FROM
MyTable;
The output:
City          State  Zip_Code
SAN DIEGO     CA     92128-1234
WEST WARWICK  RI     02893-1349
RICHMOND      IN     47374-9409
Since you only have 3 parts (City/State/Zip) you can take advantage of a function called parsename in SQL Server 2008 and later. (The original intent of the function is to parse out object names.)
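Using a combination of the replace and parsename functions lets you separate the data into 3 parts, even if the length of the State (not likely) or the Zip (more likely) changes. Note that parsename handles at most four dot-separated parts and returns NULL beyond that, so this only works when the city name has at most two words and the address contains no periods.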
Example Data:
create table #my_table
(
address2 varchar(75) not null
)
insert into #my_table values ('CONNERSVILLE IN 47331-3351')
insert into #my_table values ('WEST WARWICK RI 02893-1349')
insert into #my_table values ('RICHMOND IN 47374-9409')
insert into #my_table values ('WILLIAMSBURG IN 47393-9617')
insert into #my_table values ('FARMERSVILLE OH 45325-9226')
--this record is an example of a likely scenario for when the zip length would change.
insert into #my_table values ('WILLIAMSBURG IN 47393')
Solution:
with len_vals as
(
select t.address2
, len(parsename(replace(t.address2,' ','.'), 1)) as zip_len
, len(parsename(replace(t.address2,' ','.'), 2)) as st_len
from #my_table as t
group by t.address2
)
select left(a.address2, len(a.address2) - b.zip_len - b.st_len - 2) as city
, substring(a.address2, len(a.address2) - b.zip_len - b.st_len, b.st_len) as st
, right(a.address2, b.zip_len) as zip_code
from #my_table as a
inner join len_vals as b on a.address2 = b.address2
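Results (row order may vary):
city          st  zip_code
CONNERSVILLE  IN  47331-3351
WEST WARWICK  RI  02893-1349
RICHMOND      IN  47374-9409
WILLIAMSBURG  IN  47393-9617
FARMERSVILLE  OH  45325-9226
WILLIAMSBURG  IN  47393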
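Both answers rest on the same observation: the state and zip are always the last two space-separated tokens, and everything before them is the city. A minimal Python sketch of that rule, splitting from the right, is shown here for comparison. The function name split_address is hypothetical, and the input is assumed to have at least three tokens.

```python
def split_address(address2):
    """Split 'CITY ... ST ZIP' into (city, state, zip), working from the right."""
    # rsplit with maxsplit=2 peels off the last two tokens; the rest is the city.
    city, state, zip_code = address2.rsplit(None, 2)
    return city, state, zip_code


# Sample values taken from the answers above.
samples = [
    "SAN DIEGO CA 92128-1234",
    "WEST WARWICK RI 02893-1349",
    "WILLIAMSBURG IN 47393",
]
parsed = [split_address(s) for s in samples]
for row in parsed:
    print(row)
```

Because it splits from the right, this handles multi-word cities and both 5-digit and ZIP+4 codes; a value with fewer than three tokens raises ValueError instead of returning a partial result.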

Comparing two columns in SQL Server and returning the remaining data

I have two tables. The first is a student table, where each student can select two optional courses; the other is the current semester's list of optional courses.
Whenever the student selects a course, a row is inserted with basic details such as roll number, inserted time, selected course, and a status of "1". Whenever a selected course is deselected, the status of that row is set to "0".
Suppose the student has selected course ids 1 and 2.
Now, using this query:
select SselectedCourse AS [text()] FROM Sample.dbo.Tbl_student_details where var_rollnumber = '020803009' and status = 1 order by var_courseselectedtime desc FOR XML PATH('')
This gives me the result "12", where 1 is physics and 2 is social.
The second table holds the values 1-9.
For example, the course ids are:
1 = physics
2 = social
3 = chemistry
4 = geography
5 = computer
6 = Spoken Hindi
7 = Spoken English
8 = B.EEE
9 = B.ECE
Now the current student has selected 1 and 2. So in the first column I get "12", and in the second column I need to get "3456789" (the remaining courses).
How to write a query for this?
This is not a single query, but it is simple.
DECLARE @STUDENT AS TABLE(ID INT, COURSEID INT)
DECLARE @SEM AS TABLE (COURSEID INT, COURSE VARCHAR(100))
INSERT INTO @STUDENT VALUES(1, 1)
INSERT INTO @STUDENT VALUES(1, 2)
INSERT INTO @SEM VALUES(1, 'physics')
INSERT INTO @SEM VALUES(2, 'social')
INSERT INTO @SEM VALUES(3, 'chemistry')
INSERT INTO @SEM VALUES(4, 'geography')
INSERT INTO @SEM VALUES(5, 'computer')
INSERT INTO @SEM VALUES(6, 'Spoken Hindi')
INSERT INTO @SEM VALUES(7, 'Spoken English')
INSERT INTO @SEM VALUES(8, 'B.EEE')
INSERT INTO @SEM VALUES(9, 'B.ECE')
DECLARE @COURSEIDS_STUDENT VARCHAR(100), @COURSEIDS_SEM VARCHAR(100)
SELECT @COURSEIDS_STUDENT = COALESCE(@COURSEIDS_STUDENT, '') + CONVERT(VARCHAR(10), COURSEID) + ' ' FROM @STUDENT
SELECT @COURSEIDS_SEM = COALESCE(@COURSEIDS_SEM , '') + CONVERT(VARCHAR(10), COURSEID) + ' ' FROM @SEM WHERE COURSEID NOT IN (SELECT COURSEID FROM @STUDENT)
SELECT @COURSEIDS_STUDENT COURSEIDS_STUDENT, @COURSEIDS_SEM COURSEIDS_SEM
try this:
;WITH CTE as (select ROW_NUMBER() over (order by (select 0)) as rn,* from Sample.dbo.Tbl_student_details)
,CTE1 As(
select rn,SselectedCourse ,replace(stuff((select ''+courseid from course_details for xml path('')),1,1,''),SselectedCourse,'') as rem from CTE a
where rn = 1
union all
select c2.rn,c2.SselectedCourse,replace(rem,c2.SselectedCourse,'') as rem
from CTE1 c1 inner join CTE c2
on c2.rn=c1.rn+1
)
select STUFF((select ''+SselectedCourse from CTE1 for xml path('')),1,0,''),(select top 1 rem from CTE1 order by rn desc)
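Stripped of the string handling, the task is a set difference: all course ids for the semester minus the ids the student selected. The Python sketch below illustrates just that logic with the question's sample data. The function name selected_and_remaining is hypothetical, and the ids are concatenated without a separator, as in the question.

```python
def selected_and_remaining(selected_ids, all_ids):
    """Return (selected, remaining) course ids as concatenated strings."""
    chosen = set(selected_ids)
    remaining = [cid for cid in sorted(all_ids) if cid not in chosen]
    return (
        "".join(str(cid) for cid in selected_ids),
        "".join(str(cid) for cid in remaining),
    )


# Course list and selection from the question.
semester_courses = {
    1: "physics", 2: "social", 3: "chemistry", 4: "geography", 5: "computer",
    6: "Spoken Hindi", 7: "Spoken English", 8: "B.EEE", 9: "B.ECE",
}
first_column, second_column = selected_and_remaining([1, 2], semester_courses)
print(first_column, second_column)
```

Concatenating ids without a separator only stays unambiguous while every id is a single digit; with ids of 10 or more, a delimiter would be needed.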