Convert comma-separated IDs into their assigned values - SQL

I am writing a view for a data export feature, so basically they need to view all the columns with the data associated with them.
I have a column LanguagesSpoken in a table, and we are storing its values as a comma-separated list (1,2,3, etc.),
where 1 is English, 2 Germany, 3 Spanish, and so on. These lookup values are stored in a different table.
StaffID    LanguagesSpoken
-------    ---------------
1          1,2,3
2          3,4
3          2,5
So when we view it, the expected output should be:
StaffID    LanguagesSpoken
-------    -------------------------
1          English, Germany, Spanish
2          Spanish, Hindi
3          Germany, Arabic

You can use the following to split the LanguagesSpoken string, join it with the languages table, and use STRING_AGG to get what you want. As mentioned by others, your schema design needs to be fixed, so this will also help you get the data into the new schema:
SELECT StaffID, value
FROM StaffLanguagesSpoken
CROSS APPLY STRING_SPLIT(LanguagesSpoken, ',')
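Putting those pieces together, a sketch of the full query (assuming SQL Server 2017+ for STRING_AGG and a languages(id, name) lookup table like the one created in the next answer) could look like this:
-- split the CSV, look up each id, then re-aggregate the names per staff member
SELECT s.StaffID,
       STRING_AGG(l.name, ', ') WITHIN GROUP (ORDER BY l.id) AS LanguagesSpoken
FROM StaffLanguagesSpoken s
CROSS APPLY STRING_SPLIT(s.LanguagesSpoken, ',') sp
INNER JOIN languages l ON l.id = TRY_CAST(sp.value AS int)
GROUP BY s.StaffID;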

For a table containing the languages like this:
CREATE TABLE languages (
id INTEGER,
name VARCHAR(20)
);
INSERT INTO languages
(id, name)
VALUES
('1', 'English'),
('2', 'Germany'),
('3', 'Spanish'),
('4', 'Hindi'),
('5', 'Arabic');
you can join the tables, group by StaffID and use string_agg():
select
t.StaffID,
string_agg(l.name, ',') within group (order by l.id) LanguagesSpoken
from tablename t inner join languages l
on concat(',', t.languagesspoken, ',') like concat('%,', l.id, ',%')
group by t.StaffID
Results:
StaffID | LanguagesSpoken
------- | -----------------------
      1 | English,Germany,Spanish
      2 | Spanish,Hindi
      3 | Germany,Arabic

SQL Server: Split values from columns with multiple values, into multiple rows [duplicate]

This question already has answers here:
Turning a Comma Separated string into individual rows
(16 answers)
Closed 4 years ago.
I have data that currently looks like this (pipe indicates separate columns):
ID | Sex | Purchase | Type
1 | M | Apple, Apple | Food, Food
2 | F | Pear, Barbie, Soap | Food, Toys, Cleaning
As you can see, the Purchase and Type columns feature multiple values that are comma delimited (some of the cells in these columns actually have up to 50+ values recorded within). I want the data to look like this:
ID | Sex | Purchase | Type
1 | M | Apple | Food
1 | M | Apple | Food
2 | F | Pear | Food
2 | F | Barbie | Toys
2 | F | Soap | Cleaning
Any ideas on how I would be able to do this with SQL? Thanks for your help everyone.
Edit: Just to show that this is different from some of the other questions: the key here is that the data for each unique row is spread across two separate columns, i.e. the second word in "Purchase" should be linked with the second word in "Type" for ID #1. The other questions I've seen were about multiple values contained in just one column.
Basically you will require a delimited splitter function. There are many around. Here I am using DelimitedSplit8K from Jeff Moden: http://www.sqlservercentral.com/articles/Tally+Table/72993/
-- create the sample table
create table #sample
(
ID int,
Sex char,
Purchase varchar(20),
Type varchar(20)
)
-- insert the sample data
insert into #sample (ID, Sex, Purchase, Type) select 1, 'M', 'Apple,Apple', 'Food,Food'
insert into #sample (ID, Sex, Purchase, Type) select 2, 'F', 'Pear,Barbie,Soap', 'Food,Toys,Cleaning'
select s.ID, s.Sex, Purchase = p.Item, Type = t.Item
from #sample s
cross apply DelimitedSplit8K(Purchase, ',') p
cross apply DelimitedSplit8K(Type, ',') t
where p.ItemNumber = t.ItemNumber
drop table #sample
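If you are on SQL Server 2022 or Azure SQL Database, STRING_SPLIT also accepts a third enable_ordinal argument, so a sketch without a custom splitter (assuming the same #sample table) could be:
-- ordinal pairs the nth Purchase value with the nth Type value
select s.ID, s.Sex, Purchase = p.value, Type = t.value
from #sample s
cross apply string_split(s.Purchase, ',', 1) p
cross apply string_split(s.Type, ',', 1) t
where p.ordinal = t.ordinal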
EDIT: The original question as posted had the data as strings, with pipe characters as column delimiters and commas within the columns. The below solution works for that.
The question has since been edited to show that the input data is actually in columns, not as a single string.
I've left the solution here as an interesting answer to the original version of the question.
This is an interesting problem. I have a solution that works for a single row of your data. I don't know from the question whether you are going to process it row by row, but I assume you will.
If so, this will work. I suspect there might be a better way using XML or without the temp tables, but in any case this is one solution.
declare @row varchar(1000); set @row='2 | F | Pear, Barbie, Soap | Food, Toys, Cleaning'

declare @v table(i int identity, val varchar(1000), subval varchar(100))
insert @v select value as val, subval from STRING_SPLIT(@row,'|')
cross apply (select value as subval from STRING_SPLIT(value,',') s) subval

declare @v2 table(col_num int, subval varchar(100), correlation int)
insert @v2
select col_num, subval,
       DENSE_RANK() over (partition by v.val order by i) as correlation
from @v v
join (
    select val, row_number() over (order by fst) as Col_Num
    from (select val, min(i) as fst from @v group by val) colnum
) c on c.val=v.val
order by i

select col1.subval as ID, col2.subval as Sex, col3.subval as Purchase, col4.subval as Type
from @v2 col1
join @v2 col2 on col2.col_num=2
join @v2 col3 on col3.col_num=3
join @v2 col4 on col4.col_num=4 and col4.correlation=col3.correlation
where col1.col_num=1
Result is:
ID Sex Purchase Type
2 F Pear Food
2 F Barbie Toys
2 F Soap Cleaning

ms-access 2010: count duplicate names per household address

I am currently working with a spreadsheet in MS Access 2010 which contains about 130k rows of information about people who voted in a local election recently. Each row has their residential information (street name, number, postcode etc.) and personal information (title, surname, forename, middle name, DOB etc.). Each row represents an individual person rather than a household (therefore in many cases the same residential address appears more than once as more than one person resides in a particular household).
What I want to achieve is basically to create a new field in this dataset called 'count'. I want this field to give me a count of how many different surnames reside at a single address.
Is there an SQL script that will allow me to do this in Access 2010?
+------------------+----------+-------+---------+----------+-------------+
| PROPERTYADDRESS1 | POSTCODE | TITLE | SURNAME | FORENAME | MIDDLE_NAME |
+------------------+----------+-------+---------+----------+-------------+
| FAKEADDRESS1     | EEE 5GG  | MR    | BLOGGS  | JOE      | N           |
| FAKEADDRESS2     | EEE 5BB  | MRS   | BLOGGS  | SUZANNE  | P           |
| FAKEADDRESS3     | EEE 5RG  | MS    | SMITH   | PAULINE  | S           |
| FAKEADDRESS4     | EEE 4BV  | DR    | JONES   | ANNE     | D           |
| FAKEADDRESS5     | EEE 3AS  | MR    | TAYLOR  | STUART   | A           |
+------------------+----------+-------+---------+----------+-------------+
The following syntax has got me close so far:
SELECT COUNT(electoral.SURNAME)
FROM electoral
GROUP BY electoral.UPRN
However, instead of returning all 130k-odd rows, it only returns around 67k rows. Is there anything I can do to the syntax to achieve the same result but return every single row?
Any help is greatly appreciated!
Thanks
You could use something like this:
select *,
count(surname) over (partition by householdName)
from myTable
If you have only one column which contains the full name,
e.g. Rob Adams,
then you can do this to put the surname in a separate column, which makes the SELECT easier:
SELECT LEFT('HELLO WORLD',CHARINDEX(' ','HELLO WORLD')-1)
in our example:
select right(surname, len(surname) - charindex(' ', surname)) as surname
example on how to use charindex, left and right here:
http://social.technet.microsoft.com/wiki/contents/articles/17948.t-sql-right-left-substring-and-charindex-functions.aspx
if there are any questions, leave a comment.
EDIT: I edited the query; it had a syntax error, please try it again. This works on SQL Server.
here is an example:
create table #temp (id int, PropertyAddress varchar(50), surname varchar(50), forname varchar(50))
insert into #temp values
(1, 'hiddenBase', 'Adamns' , 'Kara' ),
(2, 'hiddenBase', 'Adamns' , 'Anne' ),
(3, 'hiddenBase', 'Adamns' , 'John' ),
(4, 'QueensResidence', 'Queen' , 'Oliver' ),
(5, 'QueensResidence', 'Queen' , 'Moira' ),
(6, 'superSecretBase', 'Diggle' , 'John' ),
(7, 'NandaParbat', 'Merlin' , 'Malcom' )
select * from #temp
select *,
count (surname) over (partition by PropertyAddress) as CountMembers
from #temp
gives:
1 hiddenBase Adamns Kara 3
2 hiddenBase Adamns Anne 3
3 hiddenBase Adamns John 3
7 NandaParbat Merlin Malcom 1
4 QueensResidence Queen Oliver 2
5 QueensResidence Queen Moira 2
6 superSecretBase Diggle John 1
Your query should look like this:
select *,
count(SURNAME) over (partition by PROPERTYADDRESS1) as CountFamilyMembers
from electoral
EDIT
If OVER (PARTITION BY ...) isn't supported (Access has no window functions), you can get the counts with GROUP BY instead, which returns one row per address:
select PROPERTYADDRESS1, count(SURNAME) as CountFamilyMembers
from electoral
group by PROPERTYADDRESS1
Note that you can't write select * with group by; every non-aggregated column you keep in the SELECT has to be listed in the GROUP BY.
GROUP BY creates an aggregate query, so it's by design that you get fewer records (one per UPRN).
To get the count for each row in the original table, you can join the table with the aggregate query:
SELECT electoral.*, elCount.NumberOfPeople
FROM electoral
INNER JOIN
(
SELECT UPRN, COUNT(*) AS NumberOfPeople
FROM electoral
GROUP BY UPRN
) AS elCount
ON electoral.UPRN = elCount.UPRN
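In Access you could also get the per-row count with the DCount domain function instead of a join; a sketch, assuming UPRN is numeric (wrap the value in quotes in the criteria string if it is text). Note that DCount runs once per row, so on 130k rows the join above will usually be faster:
SELECT electoral.*,
       DCount("SURNAME", "electoral", "UPRN=" & [UPRN]) AS NumberOfPeople
FROM electoral;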
Given the update I want to post another answer. Try it like this:
create table #temp2 ( PropertyAddress1 varchar(50), POSTCODE varchar(20), TITLE varchar (20),
surname varchar(50), FORENAME varchar(50), MIDDLE_NAME varchar (50) )
insert into #temp2 values
('FAKEADDRESS1', 'EEE 5GG', 'MR', 'BLOGGS', 'JOE', 'N'),
('FAKEADDRESS1', 'EEE 5BB', 'MRS', 'BLOGGS', 'SUZANNE', 'P'),
('FAKEADDRESS2', 'EEE 5RG', 'MS', 'SMITH', 'PAULINE', 'S'),
('FAKEADDRESS3', 'EEE 4BV', 'DR', 'JONES', 'ANNE', 'D'),
('FAKEADDRESS4', 'EEE 3AS', 'MR', 'TAYLOR', 'STUART', 'A')
select PropertyAddress1, surname,count (#temp2.surname) as CountADD
into #countTemp
from #temp2
group by PropertyAddress1, surname
select * from #temp2 t2
left join #countTemp ct
on t2.PropertyAddress1 = ct.PropertyAddress1 and t2.surname = ct.surname
This yields:
PropertyAddress1 POSTCODE TITLE surname FORENAME MIDDLE_NAME PropertyAddress1 surname CountADD
FAKEADDRESS1 EEE 5GG MR BLOGGS JOE N FAKEADDRESS1 BLOGGS 2
FAKEADDRESS1 EEE 5BB MRS BLOGGS SUZANNE P FAKEADDRESS1 BLOGGS 2
FAKEADDRESS2 EEE 5RG MS SMITH PAULINE S FAKEADDRESS2 SMITH 1
FAKEADDRESS3 EEE 4BV DR JONES ANNE D FAKEADDRESS3 JONES 1
FAKEADDRESS4 EEE 3AS MR TAYLOR STUART A FAKEADDRESS4 TAYLOR 1

pivot/cross distinct row data to columns with postgres

I have distinct data that I want to pivot/cross, for instance
Given table A with
name tag
Bob sport
Bob action
Bob comedy
Tom action
Tom drama
Sue sport
I'd like a query that transforms the data to
name sport action comedy drama
Bob 1 1 1 0
Tom 0 1 0 1
Sue 1 0 0 0
For any number n of distinct tags.
How would I create this transformation using SQL if I don't know the distinct tags before I begin?
Some simple solutions that are adequate for some cases, using this table (SQL Fiddle is not working right now):
create table a (
name text,
tag text
);
insert into a (name, tag) values
('Bob', 'sport'),
('Bob', 'action'),
('Bob', 'comedy'),
('Tom', 'action'),
('Tom', 'drama'),
('Sue', 'sport');
A simple array aggregation, if the arrays can be split apart somewhere else:
select
    name,
    array_agg(tag order by tag) as tags,
    array_agg(total order by tag) as totals
from (
    select name, tag, count(a.name) as total
    from a
    right join (
        (select distinct tag from a) t
        cross join
        (select distinct name from a) n
    ) c using (name, tag)
    group by name, tag
) s
group by name
order by 1;
 name |            tags             |  totals
------+-----------------------------+-----------
 Bob  | {action,comedy,drama,sport} | {1,1,0,1}
 Sue  | {action,comedy,drama,sport} | {0,0,0,1}
 Tom  | {action,comedy,drama,sport} | {1,0,1,0}
For JSON-aware clients, a set of JSON objects:
select format(
    '{%s:{%s}}',
    to_json(name),
    string_agg(o, ',')
)::json as o
from (
    select name,
        format(
            '%s:%s',
            to_json(tag),
            to_json(count(a.name))
        ) as o
    from a
    right join (
        (select distinct tag from a) t
        cross join
        (select distinct name from a) n
    ) c using (name, tag)
    group by name, tag
) s
group by name;
o
-----------------------------------------------------
{"Bob":{"action":1,"comedy":1,"drama":0,"sport":1}}
{"Sue":{"action":0,"comedy":0,"drama":0,"sport":1}}
{"Tom":{"action":1,"comedy":0,"drama":1,"sport":0}}
Or, with the same format() approach, a single JSON object combining all names:
select format('{%s}', string_agg(o, ','))::json as o
from (
    select format(
        '%s:{%s}',
        to_json(name),
        string_agg(o, ',')
    ) as o
    from (
        select name,
            format(
                '%s:%s',
                to_json(tag),
                to_json(count(a.name))
            ) as o
        from a
        right join (
            (select distinct tag from a) t
            cross join
            (select distinct name from a) n
        ) c using (name, tag)
        group by name, tag
    ) s
    group by name
) s;
o
---------------------------------------------------------------------------------------------------------------------------------------------------------
{"Bob":{"action":1,"comedy":1,"drama":0,"sport":1},"Sue":{"action":0,"comedy":0,"drama":0,"sport":1},"Tom":{"action":1,"comedy":0,"drama":1,"sport":0}}

Postgresql aggregate array

I have two tables
Student
--------
Id Name
1 John
2 David
3 Will
Grade
---------
Student_id Mark
1 A
2 B
2 B+
3 C
3 A
Is it possible to write a native PostgreSQL SELECT to get results like below:
Name Array of marks
-----------------------
'John', {'A'}
'David', {'B','B+'}
'Will', {'C','A'}
But not like below
Name Mark
----------------
'John', 'A'
'David', 'B'
'David', 'B+'
'Will', 'C'
'Will', 'A'
Use array_agg: http://www.sqlfiddle.com/#!1/5099e/1
SELECT s.name, array_agg(g.Mark) as marks
FROM student s
LEFT JOIN Grade g ON g.Student_id = s.Id
GROUP BY s.Id
By the way, if you are using Postgres 9.1 or later, you don't need to repeat the SELECT columns in GROUP BY; e.g. you don't need to repeat the student name in GROUP BY, you can merely GROUP BY the primary key. If you remove the primary key on student, you do need to repeat the student name in GROUP BY.
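If you also want the marks inside each array in a predictable order, array_agg accepts an ORDER BY inside the call; a small variation of the query above:
SELECT s.name, array_agg(g.Mark ORDER BY g.Mark) as marks
FROM student s
LEFT JOIN Grade g ON g.Student_id = s.Id
GROUP BY s.Id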
CREATE TABLE grade
(Student_id int, Mark varchar(2));
INSERT INTO grade
(Student_id, Mark)
VALUES
(1, 'A'),
(2, 'B'),
(2, 'B+'),
(3, 'C'),
(3, 'A');
CREATE TABLE student
(Id int primary key, Name varchar(5));
INSERT INTO student
(Id, Name)
VALUES
(1, 'John'),
(2, 'David'),
(3, 'Will');
From what I understand, you can do something like this:
SELECT Student.Name,
STRING_AGG(Grade.Mark, ',' ORDER BY Grade.Mark) As marks
FROM Student
LEFT JOIN Grade ON Grade.Student_id = Student.Id
GROUP BY Student.Name;
EDIT
I am not sure. But maybe something like this then:
SELECT Student.Name,
    array_to_string(ARRAY_AGG(Grade.Mark),';') As marks
FROM Student
LEFT JOIN Grade ON Grade.Student_id = Student.Id
GROUP BY Student.Name;
You could use the following:
SELECT Student.Name as Name,
(SELECT array(SELECT Mark FROM Grade WHERE Grade.Student_id = Student.Id))
AS ArrayOfMarks
FROM Student
As described here: http://www.mkyong.com/database/convert-subquery-result-to-array/
Michael Buen got it right. I got what I needed using array_agg.
Here just a basic query example in case it helps someone:
SELECT directory, ARRAY_AGG(file_name)
FROM table
WHERE type = 'ZIP'
GROUP BY directory;
And the result was something like:
| parent_directory         | array_agg                       |
+--------------------------+---------------------------------+
| /home/postgresql/files   | {zip_1.zip,zip_2.zip,zip_3.zip} |
| /home/postgresql/files2  | {file1.zip,file2.zip}           |
This post also helped me a lot: "Group By" in SQL and Python Pandas.
It basically says that it is more convenient to use only PSQL when possible, but that Python Pandas can be useful to achieve extra functionalities in the filtering process.

DB2 Comma Separated Output by Groups

Is there a built-in function for producing comma-separated column values in DB2 SQL?
Example: if a table has an ID column and there are three rows with the same ID but three different roles, the data should be concatenated with a comma.
ID | Role
------------
4555 | 2
4555 | 3
4555 | 4
The output should look like the following, per row:
4555 2,3,4
The LISTAGG function is a new function in DB2 LUW 9.7. See this example:
create table myTable (id int, category int);
insert into myTable values (1, 1);
insert into myTable values (2, 2);
insert into myTable values (5, 1);
insert into myTable values (3, 1);
insert into myTable values (4, 2);
example: select without any order in grouped column
select category, LISTAGG(id, ', ') as ids from myTable group by category;
result:
CATEGORY IDS
--------- -----
1 1, 5, 3
2 2, 4
example: select with order by clause in grouped column
select
category,
LISTAGG(id, ', ') WITHIN GROUP(ORDER BY id ASC) as ids
from myTable
group by category;
result:
CATEGORY IDS
--------- -----
1 1, 3, 5
2 2, 4
I think with this smaller query, you can do what you want.
This is the equivalent of MySQL's GROUP_CONCAT in DB2.
SELECT
NUM,
SUBSTR(xmlserialize(xmlagg(xmltext(CONCAT( ', ',ROLES))) as VARCHAR(1024)), 3) as ROLES
FROM mytable
GROUP BY NUM;
This will output something like:
NUM ROLES
---- -------------
1 111, 333, 555
2 222, 444
assuming your original result was something like this:
NUM ROLES
---- ---------
1 111
2 222
1 333
2 444
1 555
Depending on the DB2 version you have, you can use XML functions to achieve this.
Example table with some data
create table myTable (id int, category int);
insert into myTable values (1, 1);
insert into myTable values (2, 2);
insert into myTable values (3, 1);
insert into myTable values (4, 2);
insert into myTable values (5, 1);
Aggregate results using xml functions
select category,
xmlserialize(XMLAGG(XMLELEMENT(NAME "x", id) ) as varchar(1000)) as ids
from myTable
group by category;
results:
CATEGORY IDS
-------- ------------------------
1 <x>1</x><x>3</x><x>5</x>
2 <x>2</x><x>4</x>
Use replace to make the result look better
select category,
replace(
replace(
replace(
xmlserialize(XMLAGG(XMLELEMENT(NAME "x", id) ) as varchar(1000))
, '</x><x>', ',')
, '<x>', '')
, '</x>', '') as ids
from myTable
group by category;
Cleaned result
CATEGORY IDS
-------- -----
1 1,3,5
2 2,4
Just saw a better solution using XMLTEXT instead of XMLELEMENT here.
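A sketch of what that XMLTEXT variant might look like for the same myTable, following the SUBSTR trick from the earlier answer (this is an assumption, not the linked code):
select category,
       substr(xmlserialize(xmlagg(xmltext(',' || varchar(id))) as varchar(1000)), 2) as ids
from myTable
group by category;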
Since DB2 9.7.5 there is a function for that:
LISTAGG(colname, separator)
check this for more information: Using LISTAGG to Turn Rows of Data into a Comma Separated List
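Applied to the ID/Role data from the question (assuming the table is called myRoles), that would be something like:
SELECT ID,
       LISTAGG(Role, ',') WITHIN GROUP (ORDER BY Role) AS Roles
FROM myRoles
GROUP BY ID;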
My problem was to transpose row fields (CLOB) to a column (VARCHAR) as a CSV and use the transposed table for reporting, because transposing at the report layer slows down the report.
One way to go is to use recursive SQL. You can find many articles about that, but it's difficult and resource-consuming if you want to join all your recursive transposed columns.
I created multiple global temp tables where I stored single transposed columns with one key identifier. Eventually I had 6 temp tables for joining 6 columns, but due to limited resource allocation I wasn't able to bring all the columns together. I opted for the 3 formulas below, and then I just had to run 1 query which gave me output in 10 seconds.
I found various articles on using XML2CLOB functions and found 3 different ways.
REPLACE(VARCHAR(XML2CLOB(XMLAGG(XMLELEMENT(NAME "A", ALIASNAME.ATTRIBUTENAME)))), '</A><A>', ',') AS TRANSPOSED_OUTPUT
NVL(TRIM(',' FROM REPLACE(REPLACE(REPLACE(CAST(XML2CLOB(XMLAGG(XMLELEMENT(NAME "E", ALIASNAME.ATTRIBUTENAME))) AS VARCHAR(100)), '<E>', ' '), '</E>', ','), '<E/>', 'Nothing')), 'Nothing') as TRANSPOSED_OUTPUT
RTRIM(REPLACE(REPLACE(REPLACE(VARCHAR(XMLSERIALIZE(XMLAGG(XMLELEMENT(NAME "A", ALIASNAME.ATTRIBUTENAME) ORDER BY ALIASNAME.ATTRIBUTENAME) AS CLOB)), '</A><A>', ','), '<A>', ''), '</A>', '')) AS TRANSPOSED_OUTPUT
Make sure you are casting your "ATTRIBUTENAME" to varchar in a subquery and then calling it here.
Another possibility, with a recursive CTE:
with tablewithrank as (
    select id, category,
           rownumber() over(partition by category order by id) as rangid,
           (select count(*) from myTable f2 where f1.category=f2.category) as nbidbycategory
    from myTable f1
),
cte (id, category, rangid, nbidbycategory, rangconcat) as (
    select id, category, rangid, nbidbycategory, cast(id as varchar(500))
    from tablewithrank
    where rangid=1
    union all
    select f2.id, f2.category, f2.rangid, f2.nbidbycategory,
           cast(f1.rangconcat as varchar(500)) || ',' || cast(f2.id as varchar(500))
    from cte f1
    inner join tablewithrank f2 on f1.rangid=f2.rangid - 1 and f1.category=f2.category
)
select category, rangconcat as IDS from cte
where rangid=nbidbycategory
Try this:
SELECT GROUP_CONCAT( field1, field2, field3 ,field4 SEPARATOR ', ')