Getting similar column names and count from multiple tables - sql

So I have a table called 'SongsMetadata' in my database with 6 columns as shown below (approx. 70k records). It contains all song-related information.
It is slightly different from a regular database table. The 'File_name' column contains .csv file names. Those are the actual tables, and the values next to them are the columns in that csv file.
So for the '1001186_1_7562755270480253254.csv' record in the SongsMetadata table, '1001186_1_7562755270480253254' is the table name and its columns are '&nbsp', 'name', 'album', 'time', 'price' (these tables contain a lot of garbage values).
My goal is to compare all the tables (in this case .csv files) to get all the similar column names and their count. I already have a solution to get common column names and counts for normal tables here. Each table will be compared with every other table. However, I'm not sure how I can achieve the same with .csv tables.
The expected output is:
1001186_1_7562755270480253254.csv & 1001186_0_5503858345485431752.csv | &nbsp, name, price| 3 #common columns count
1001186_0_5503858345485431752.csv & 99524146_0_3894874701785592836.csv | &nbsp, name, price| 3
and so on...
Any suggestions are appreciated.

The following solution shows how to treat your existing table so that the wanted matching can occur efficiently. This requires an unpivot, although here the effect of an unpivot is achieved with cross apply and values, which is a simple and efficient method. After that the "matching" is shown, followed by an alternative query for details you may also find useful. Lastly, the new table is displayed just to help visualize what it is.
See this as a live demo at SQL Fiddle
Small Sample:
CREATE TABLE SongsMetadata
([file_name] varchar(7), [col1] varchar(6), [col2] varchar(6), [col3] varchar(6), [col4] varchar(6))
;
INSERT INTO SongsMetadata
([file_name], [col1], [col2], [col3], [col4])
VALUES
('abc.csv', ' ', 'name', 'price', 'artist'),
('def.csv', 'name', ' ', ' ', 'price')
;
UNPIVOT Query
This query moves the column information into a normalized structure to enable the subsequent matching to occur. It is vital to the overall solution. As an added bonus, you can mark some column names as "bad" (most likely garbage data) so that they can be ignored later.
select
file_name, column_number, column_name
, case when column_name IN (' ','</b>','other-unwanted') then 0 else 1 end as col_is_good
into SongsMetadataUpivot
from (
select file_name, column_number, column_name
from SongsMetadata
cross apply (
values
(1, col1)
, (2, col2)
, (3, col3)
, (4, col4)
) ca (column_number, column_name)
) d
;
Query 1:
This is the "matching logic" provided at http://rextester.com/TLQ28814 but applied to the unpivoted songs data, AND it has the ability to exclude column names you simply don't want to consider (col_is_good).
with fmatch as (
select
l.file_name + ' & ' + r.file_name AS comparing_files
, l.column_name
from SongsMetadataUpivot l
inner join SongsMetadataUpivot r on l.column_name = r.column_name
and l.file_name < r.file_name
and r.col_is_good = 1
where l.col_is_good = 1
)
select --* from fmatch
f.comparing_files
, STUFF((
SELECT
N', ' + column_name
FROM fmatch c
WHERE f.comparing_files = c.comparing_files
order by c.column_name
FOR xml PATH (''), TYPE
)
.value('text()[1]', 'nvarchar(max)'), 1, 2, N'') as columns
, count(*) as num_col_matches
from fmatch f
group by f.comparing_files
Results:
| comparing_files | columns | num_col_matches |
|-------------------|-------------|-----------------|
| abc.csv & def.csv | name, price | 2 |
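On SQL Server 2017 or later, the same pairing can probably be written more compactly with STRING_AGG instead of STUFF ... FOR XML PATH. The sketch below is not part of the original answer: it assumes STRING_AGG is available, and it inlines a few sample rows in a CTE named unpivoted (a hypothetical stand-in for SongsMetadataUpivot) so that it can be run on its own.

```sql
-- Sketch only: requires SQL Server 2017+ for STRING_AGG.
-- "unpivoted" is a hypothetical stand-in for the SongsMetadataUpivot table.
with unpivoted as (
    select *
    from (values
          ('abc.csv', ' ',      0)
        , ('abc.csv', 'name',   1)
        , ('abc.csv', 'price',  1)
        , ('abc.csv', 'artist', 1)
        , ('def.csv', 'name',   1)
        , ('def.csv', ' ',      0)
        , ('def.csv', 'price',  1)
    ) v (file_name, column_name, col_is_good)
)
select
      l.file_name + ' & ' + r.file_name as comparing_files
      -- one comma-separated, alphabetically ordered list per file pair
    , string_agg(l.column_name, ', ') within group (order by l.column_name) as [columns]
    , count(*) as num_col_matches
from unpivoted l
inner join unpivoted r
        on  l.column_name = r.column_name
        and l.file_name   < r.file_name   -- compare each pair of files only once
        and r.col_is_good = 1
where l.col_is_good = 1
group by l.file_name, r.file_name;
```

With these sample rows it should return the same single row as Query 1. Grouping by the two file names, rather than by the concatenated label, avoids the correlated subquery.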
Query 2:
This query simply produces the column lists, in name order, together with their respective column positions in each file.
SELECT
file_name, ca.*
from SongsMetadata f
cross apply (
select
STUFF((
SELECT
N', ' + column_name
FROM SongsMetadataUpivot c
WHERE f.file_name = c.file_name
AND c.col_is_good = 1
ORDER BY column_name
FOR xml PATH (''), TYPE
)
.value('text()[1]', 'nvarchar(max)'), 1, 2, N'')
, STUFF((
SELECT
N', ' + cast(column_number as nvarchar)
FROM SongsMetadataUpivot c
WHERE f.file_name = c.file_name
AND c.col_is_good = 1
ORDER BY column_name
FOR xml PATH (''), TYPE
)
.value('text()[1]', 'nvarchar(max)'), 1, 2, N'')
) ca (column_names, col_numbers)
Results:
| file_name | column_names | col_numbers |
|-----------|---------------------|-------------|
| abc.csv | artist, name, price | 4, 2, 3 |
| def.csv | name, price | 1, 4 |
Query 3:
This shows the "unpivoted" data so you can visualize it; the overall solution requires this step to occur.
select * from SongsMetadataUpivot
Results:
| file_name | column_number | column_name | col_is_good |
|-----------|---------------|-------------|-------------|
| abc.csv | 1 | | 0 |
| abc.csv | 2 | name | 1 |
| abc.csv | 3 | price | 1 |
| abc.csv | 4 | artist | 1 |
| def.csv | 1 | name | 1 |
| def.csv | 2 | | 0 |
| def.csv | 3 | | 0 |
| def.csv | 4 | price | 1 |

Related

SQL Server: Combine columns abort same values

I am using ADO to connect to SQL Server. I have a table and I want to combine several columns into one column. The values in the new column need to be distinct.
This is what I need.
Thanks to all!
Import your Excel file into SQL Server so you can run queries.
Then transpose your table. Transposing means swapping columns and rows, turning something like this:
+------+---------+----------+
| Name | Email1 | Email2 |
+------+---------+----------+
| A | A@a.com | A@aa.com |
+------+---------+----------+
| B | B@b.com | B@bb.com |
+------+---------+----------+
Into something like this:
+---------+---------+----------+
| Name | A | B |
+---------+---------+----------+
| Email1 | A@a.com | B@b.com |
+---------+---------+----------+
| Email2 | A@aa.com| B@bb.com |
+---------+---------+----------+
The method is described here: Simple way to transpose columns and rows in Sql?
Then you can easily SELECT DISTINCT [A] FROM [MyTable] (for each column, which is each person) one by one and insert the results into a temp table with a single column.
Then:
SELECT STUFF((
SELECT ', ' + [temptablecolumn]
FROM #temptable
FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, '')
This query gives you this result: A@a.com, A@aa.com
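You can use APPLY to convert your TMs into rows and concatenate them using the FOR XML PATH() clause:
-- [table] is a placeholder for your own table name
WITH t AS (
SELECT DISTINCT name, tm
FROM [table] src CROSS APPLY
( VALUES (TM1), (TM2), (TM3), (TM4), (TM5)
) tt (tm)
)
-- DISTINCT returns one row per name instead of one per (name, tm) pair
SELECT DISTINCT t.name,
(SELECT '' + t1.tm
FROM t t1
WHERE t1.name = t.name
FOR XML PATH('')
) AS tn
FROM t;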
One method uses a giant case expression:
select id,
(tn1 +
(case when tn2 not in (tn1) then tn2 else '' end) +
(case when tn3 not in (tn1, tn2) then tn3 else '' end) +
(case when tn4 not in (tn1, tn2, tn3) then tn4 else '' end) +
(case when tn5 not in (tn1, tn2, tn3, tn4) then tn5 else '' end)
) as tn
from t;
I will add that having multiple columns with essentially the same data is usually a sign of a bad data model. Normally, you would want a table with one row per tn and id pair.
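To make the "one row per tn and id pair" shape concrete, here is a minimal sketch. The sample rows and the column names (id, tn1, tn2, tn3) are assumptions made up for illustration; the question does not show the real table.

```sql
-- Sketch only: hypothetical sample data standing in for the wide table.
WITH src AS (
    SELECT *
    FROM (VALUES
          (1, 'a', 'b', 'a')
        , (2, 'c', 'c', 'd')
    ) v (id, tn1, tn2, tn3)
),
pairs AS (
    -- unpivot the three columns into rows, then keep each (id, tn) pair once
    SELECT DISTINCT s.id, x.tn
    FROM src s
    CROSS APPLY (VALUES (s.tn1), (s.tn2), (s.tn3)) x (tn)
    WHERE x.tn IS NOT NULL
)
SELECT id, tn
FROM pairs
ORDER BY id, tn;
```

This yields four rows: (1, a), (1, b), (2, c), (2, d). Once the data is stored in this shape, the distinct-value requirement becomes a plain DISTINCT or GROUP BY rather than a chain of CASE expressions.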

Using string_split to create rows from multiple columns

I have data that looks something like this example (on an unfortunately much larger scale):
+----+-------+--------------------+-----------------------------------------------+
| ID | Data | Cost | Comments |
+----+-------+--------------------+-----------------------------------------------+
| 1 | 1|2|3 | $0.00|$3.17|$42.42 | test test||previous thing has a blank comment |
+----+-------+--------------------+-----------------------------------------------+
| 2 | 1 | $420.69 | test |
+----+-------+--------------------+-----------------------------------------------+
| 3 | 1|2 | $3.50|$4.20 | |test |
+----+-------+--------------------+-----------------------------------------------+
Some of the columns in my table are pipe-delimited, but they are consistent within each row, so each delimited value corresponds to the same index in the other columns of the same row.
So I can do something like this which is what I want for a single column:
SELECT ID, s.value AS datavalue
FROM MyTable t CROSS APPLY STRING_SPLIT(t.Data, '|') s
and that would give me this:
+----+-----------+
| ID | datavalue |
+----+-----------+
| 1 | 1 |
+----+-----------+
| 1 | 2 |
+----+-----------+
| 1 | 3 |
+----+-----------+
| 2 | 1 |
+----+-----------+
| 3 | 1 |
+----+-----------+
| 3 | 2 |
+----+-----------+
but I also want to get the other columns as well (cost and comments in this example) so that the corresponding items are all in the same row like this:
+----+-----------+-----------+------------------------------------+
| ID | datavalue | costvalue | commentvalue |
+----+-----------+-----------+------------------------------------+
| 1 | 1 | $0.00 | test test |
+----+-----------+-----------+------------------------------------+
| 1 | 2 | $3.17 | |
+----+-----------+-----------+------------------------------------+
| 1 | 3 | $42.42 | previous thing has a blank comment |
+----+-----------+-----------+------------------------------------+
| 2 | 1 | $420.69 | test |
+----+-----------+-----------+------------------------------------+
| 3 | 1 | $3.50 | |
+----+-----------+-----------+------------------------------------+
| 3 | 2 | $4.20 | test |
+----+-----------+-----------+------------------------------------+
I'm not sure what the best or simplest way to achieve this would be.
This isn't going to be achievable with STRING_SPLIT as Microsoft refuse to supply the ordinal position as part of the result set. As a result, you'll need to use a different function which does. Personally, I recommend Jeff Moden's DelimitedSplit8k.
Then, you can do this:
CREATE TABLE #Sample (ID int,
[Data] varchar(200),
Cost varchar(200),
Comments varchar(8000));
GO
INSERT INTO #Sample
VALUES (1,'1|2|3','$0.00|$3.17|$42.42','test test||previous thing has a blank comment'),
(2,'1','$420.69','test'),
(3,'1|2','$3.50|$4.20','|test');
GO
SELECT S.ID,
DSd.Item AS DataValue,
DSc.Item AS CostValue,
DSct.Item AS CommentValue
FROM #Sample S
CROSS APPLY dbo.DelimitedSplit8K(S.[Data],'|') DSd
CROSS APPLY (SELECT *
FROM DelimitedSplit8K(S.Cost,'|') SS
WHERE SS.ItemNumber = DSd.ItemNumber) DSc
CROSS APPLY (SELECT *
FROM DelimitedSplit8K(S.Comments,'|') SS
WHERE SS.ItemNumber = DSd.ItemNumber) DSct;
GO
DROP TABLE #Sample;
GO
There is, however, only one true answer to this question: Don't store delimited values in SQL Server. Store them in a normalised manner, and you won't have this problem.
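On newer versions the ordinal limitation may no longer apply: SQL Server 2022 and Azure SQL Database accept an optional third argument to STRING_SPLIT (enable_ordinal) that adds an ordinal column to the output. The following is a sketch under that assumption, using the question's sample rows inlined as a VALUES list; it will not run on older versions.

```sql
-- Sketch only: the third argument (enable_ordinal = 1) assumes
-- SQL Server 2022+ or Azure SQL Database.
SELECT S.ID,
       d.value  AS DataValue,
       c.value  AS CostValue,
       ct.value AS CommentValue
FROM (VALUES
        (1, '1|2|3', '$0.00|$3.17|$42.42', 'test test||previous thing has a blank comment'),
        (2, '1',     '$420.69',            'test'),
        (3, '1|2',   '$3.50|$4.20',        '|test')
     ) S (ID, [Data], Cost, Comments)
CROSS APPLY STRING_SPLIT(S.[Data], '|', 1) d
-- pick the cost and comment items that share the data item's position
CROSS APPLY (SELECT x.value
             FROM STRING_SPLIT(S.Cost, '|', 1) x
             WHERE x.ordinal = d.ordinal) c
CROSS APPLY (SELECT x.value
             FROM STRING_SPLIT(S.Comments, '|', 1) x
             WHERE x.ordinal = d.ordinal) ct
ORDER BY S.ID, d.ordinal;
```

The structure mirrors the DelimitedSplit8K query above, with ordinal playing the role of ItemNumber.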
Here is an approach using a recursive CTE instead of a User Defined Function (UDF), which is useful for those without permission to create functions.
CREATE TABLE mytable(
ID INTEGER NOT NULL PRIMARY KEY
,Data VARCHAR(7) NOT NULL
,Cost VARCHAR(20) NOT NULL
,Comments VARCHAR(47) NOT NULL
);
INSERT INTO mytable(ID,Data,Cost,Comments) VALUES (1,'1|2|3','$0.00|$3.17|$42.42','test test||previous thing has a blank comment');
INSERT INTO mytable(ID,Data,Cost,Comments) VALUES (2,'1','$420.69','test');
INSERT INTO mytable(ID,Data,Cost,Comments) VALUES (3,'1|2','$3.50|$4.20','|test');
This query allows a choice of delimiter by using a variable. Then, using a common table expression, it parses each delimited string to produce a row for each portion of those strings, and retains the ordinal position of each.
declare @delimiter as varchar(1)
set @delimiter = '|'
;with cte as (
select id
, convert(varchar(max), null) as datavalue
, convert(varchar(max), null) as costvalue
, convert(varchar(max), null) as commentvalue
, convert(varchar(max), data + @delimiter) as data
, convert(varchar(max), cost + @delimiter) as cost
, convert(varchar(max), comments + @delimiter) as comments
from mytable as t
union all
select id
, convert(varchar(max), left(data, charindex(@delimiter, data) - 1))
, convert(varchar(max), left(cost, charindex(@delimiter, cost) - 1))
, convert(varchar(max), left(comments, charindex(@delimiter, comments) - 1))
, convert(varchar(max), stuff(data, 1, charindex(@delimiter, data), ''))
, convert(varchar(max), stuff(cost, 1, charindex(@delimiter, cost), ''))
, convert(varchar(max), stuff(comments, 1, charindex(@delimiter, comments), ''))
from cte
where (data like ('%' + @delimiter + '%') and cost like ('%' + @delimiter + '%')) or comments like ('%' + @delimiter + '%')
)
select id, datavalue, costvalue, commentvalue
from cte
where datavalue IS NOT NULL
order by id, datavalue
As the recursion adds new rows, it places the first portion of each delimited string into the wanted output columns using left(), then uses stuff() to remove that portion and its trailing delimiter from the source strings, so that the next row starts at the next item. Note that to initiate the extractions, the delimiter is appended to the end of each source string; this ensures the where clause does not exclude the last item of each string.
The result:
id datavalue costvalue commentvalue
---- ----------- ----------- ------------------------------------
1 1 $0.00 test test
1 2 $3.17
1 3 $42.42 previous thing has a blank comment
2 1 $420.69 test
3 1 $3.50
3 2 $4.20 test
demonstrated here at dbfiddle.uk
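The left()/stuff() step that each recursion of the CTE performs can be tried in isolation. This is a minimal sketch on a single hard-coded string; the variable names @delimiter and @s are chosen here for illustration.

```sql
-- Sketch only: one extraction step of the recursive CTE, on one string.
DECLARE @delimiter varchar(1) = '|';
DECLARE @s varchar(max) = '1|2|3' + @delimiter;   -- delimiter appended, as in the anchor row

SELECT LEFT(@s, CHARINDEX(@delimiter, @s) - 1)      AS first_item,  -- '1'
       STUFF(@s, 1, CHARINDEX(@delimiter, @s), '')  AS remainder;   -- '2|3|'
```

Feeding remainder back in as @s repeats the step, which is what the recursive member does until no delimiter is left.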

SQL SELECT: concatenated column with line breaks and heading per group

I have the following SQL result from a SELECT query:
ID | category| value | desc
1 | A | 10 | text1
2 | A | 11 | text11
3 | B | 20 | text20
4 | B | 21 | text21
5 | C | 30 | text30
This result is stored in a temporary table named #temptab. This temporary table is then used in another SELECT to build up a new column via string concatenation (don't ask me about the detailed rationale behind this; this is code I took from a colleague). Via FOR XML PATH() the output of this column is a list of the results, which is then used to send mails to customers.
The second SELECT looks as follows:
SELECT t1.column,
t2.column,
(SELECT t.category + ' | ' + t.value + ' | ' + t.desc + CHAR(9) + CHAR(13) + CHAR(10)
FROM #temptab t
WHERE t.ID = ttab.ID
FOR XML PATH(''),TYPE).value('.','NVARCHAR(MAX)') AS colname
FROM table1 t1
...
INNER JOIN #temptab ttab on ttab.ID = someOtherTable.ID
...
Without wanting to go into too much detail, the column colname becomes populated with several entries (due to multiple matches) and hence, a longer string is stored in this column (CHAR(9) + CHAR(13) + CHAR(10) is essentially a line break). The result/content of colname looks like this (it is used to send mails to customers):
A | 10 | text1
A | 11 | text11
B | 20 | text20
B | 21 | text21
C | 30 | text30
Now I would like to know, if there is a way to more nicely format this output string. The best case would be to group the same categories together and add a heading and empty line between different categories:
*A*
A | 10 | text1
A | 11 | text11
*B*
B | 20 | text20
B | 21 | text21
*C*
C | 30 | text30
My question is: How do I have to modify the above query (especially the string-concatenation-part) to achieve above formatting? I was thinking about using a GROUP BY statement, but this obviously does not yield the desired result.
Edit: I use Microsoft SQL Server 2008 R2 (SP2) - 10.50.4270.0 (X64)
Declare @YourTable table (ID int,category varchar(50),value int, [desc] varchar(50))
Insert Into @YourTable values
(1,'A',10,'text1'),
(2,'A',11,'text11'),
(3,'B',20,'text20'),
(4,'B',21,'text21'),
(5,'C',30,'text30')
Declare @String varchar(max) = ''
Select @String = @String + Case when RowNr=1 Then Replicate(char(13)+char(10),2) +'*'+Category+'*' Else '' end
+ char(13)+char(10) + category + ' | ' + cast(value as varchar(25)) + ' | ' + [desc]
From (
Select *
,RowNr=Row_Number() over (Partition By Category Order By Value)
From @YourTable
) A Order By Category, Value
Select Substring(@String,5,Len(@String))
Returns
*A*
A | 10 | text1
A | 11 | text11
*B*
B | 20 | text20
B | 21 | text21
*C*
C | 30 | text30
This should return what you want
Declare @YourTable table (ID int,category varchar(50),value int, [desc] varchar(50))
Insert Into @YourTable values
(1,'A',10,'text1'),
(2,'A',11,'text11'),
(3,'B',20,'text20'),
(4,'B',21,'text21'),
(5,'C',30,'text30');
WITH Categories AS
(
SELECT category
,'**' + category + '**' AS CatCaption
,ROW_NUMBER() OVER(ORDER BY category) AS CatRank
FROM @YourTable
GROUP BY category
)
,Grouped AS
(
SELECT c.CatRank
,0 AS ValRank
,c.CatCaption AS category
,-1 AS ID
,'' AS Value
,'' AS [desc]
FROM Categories AS c
UNION ALL
SELECT c.CatRank
,ROW_NUMBER() OVER(PARTITION BY t.category ORDER BY t.Value)
,t.category
,t.ID
,CAST(t.value AS VARCHAR(100))
,t.[desc]
FROM @YourTable AS t
INNER JOIN Categories AS c ON t.category=c.category
)
SELECT category,Value,[desc]
FROM Grouped
ORDER BY CatRank,ValRank
The result
category Value desc
**A**
A 10 text1
A 11 text11
**B**
B 20 text20
B 21 text21
**C**
C 30 text30
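Since the question builds its string with FOR XML PATH, the heading logic can probably also be kept inside that construct. The following is a sketch under that assumption, not a tested drop-in for the original query: it reuses the @YourTable sample data from the answers above and emits a heading row only for the first row of each category.

```sql
-- Sketch only: FOR XML PATH variant with a heading per category.
DECLARE @YourTable table (ID int, category varchar(50), value int, [desc] varchar(50));
INSERT INTO @YourTable VALUES
(1,'A',10,'text1'),
(2,'A',11,'text11'),
(3,'B',20,'text20'),
(4,'B',21,'text21'),
(5,'C',30,'text30');

SELECT STUFF((
    SELECT -- the first row of a category gets two line breaks and a *heading*
           CASE WHEN ROW_NUMBER() OVER (PARTITION BY category ORDER BY value) = 1
                THEN REPLICATE(CHAR(13) + CHAR(10), 2) + '*' + category + '*'
                ELSE '' END
         + CHAR(13) + CHAR(10)
         + category + ' | ' + CAST(value AS varchar(25)) + ' | ' + [desc]
    FROM @YourTable
    ORDER BY category, value
    FOR XML PATH(''), TYPE
).value('.', 'nvarchar(max)'), 1, 4, '') AS colname;   -- strip the leading two line breaks
```

In the original query this expression would replace the correlated subquery that produces colname, with the WHERE clause from the question added back in.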

Count number of values across multiple columns

I have a table with 11 columns. The first column includes the category names. The remaining 10 columns have values like white, green, big, damaged etc. and these values can change in time.
I need a SQL query to find how many times each value occurs in the table (across the 10 columns).
Table 1:
+------------+------------+
| ID | decription |
+------------+------------+
| 1 | white |
| 2 | green |
| 3 | big |
| 4 | damaged |
+------------+------------+
Table 2:
+------------+-----------+-----------+-----------+
| CATEGORY | SECTION 1 | SECTION 2 | SECTION 3 |
+------------+-----------+-----------+-----------+
| Category 1 | white | green | big |
| Category 2 | big | damaged | white |
| Category 1 | white | green | big |
| Category 3 | big | damaged | white |
+------------+-----------+-----------+-----------+
Desired result:
+------------+-------+-------+-----+---------+
| CATEGORY | White | Green | Big | Damaged |
+------------+-------+-------+-----+---------+
| Category 1 | 20 | 10 | 9 | 50 |
| Category 2 | 25 | 21 | 15 | 5 |
+------------+-------+-------+-----+---------+
Is it possible to do this dynamically with just a query?
It's on MS SQL, in Visual Studio reporting.
Thanks
You've got yourself a bit of a mess with the design and the desired result. The problem is that your table is denormalized and then the final result you want is also denormalized. You can get the final result by unpivoting your Section columns, then pivoting the values of those columns. You further add to the mess by needing to do this dynamically.
First, I'd advise you to rethink your table structure because this is far too messy to maintain.
In the meantime, before you even think about writing a dynamic version to get the result you have to get the logic correct via a static or hard-coded query. Now, you didn't state which version of SQL Server you are using but you first need to unpivot the Section columns. You can use either the UNPIVOT function or CROSS APPLY. Your query will start with something similar to the following:
select
category,
value
from yourtable
unpivot
(
value for cols in (Section1,Section2,Section3)
) u
See SQL Fiddle with Demo. This gets your data into the format:
| CATEGORY | VALUE |
|------------|---------|
| Category 1 | white |
| Category 1 | green |
| Category 1 | big |
| Category 2 | big |
| Category 2 | damaged |
| Category 2 | white |
Now you have multiple Category rows - one for each value that previously were in the Section columns. Since you want a total count of each word in the Category, you can now apply the pivot function:
select
category,
white, green, big, damaged
from
(
select
category,
value
from yourtable
unpivot
(
value for cols in (Section1,Section2,Section3)
) u
) un
pivot
(
count(value)
for value in (white, green, big, damaged)
) p;
See SQL Fiddle with Demo. This will give you the result that you want but now you need this to be done dynamically. You'll have to use dynamic SQL which will create a SQL string that will be executed giving you the final result.
If the number of columns to UNPIVOT is limited, then you will create a list of the new column values in a string and then execute it similar to:
DECLARE @query AS NVARCHAR(MAX),
@colsPivot as NVARCHAR(MAX);
select @colsPivot
= STUFF((SELECT ',' + quotename(SectionValue)
from yourtable
cross apply
(
select Section1 union all
select Section2 union all
select Section3
) d (SectionValue)
group by SectionValue
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query
= 'select category, '+@colsPivot+'
from
(
select
category,
value
from yourtable
unpivot
(
value
for cols in (Section1, Section2, Section3)
) un
) x
pivot
(
count(value)
for value in ('+ @colsPivot +')
) p'
exec sp_executesql @query
See SQL Fiddle with Demo
If you have an unknown number of columns to unpivot, then your process will be a bit more complicated. You'll need to generate a string with the columns to unpivot, you can use the sys.columns table to get this list:
select @colsUnpivot
= stuff((select ','+quotename(C.name)
from sys.columns as C
where C.object_id = object_id('yourtable') and
C.name like 'Section%'
for xml path('')), 1, 1, '')
Then you'll need to get a list of the new column values - but since these are dynamic we will need to generate this list with a bit of work. You'll need to unpivot the table to generate the list of values into a temporary table for use. Create a temp table to store the values:
create table #Category_Section
(
Category varchar(50),
SectionValue varchar(50)
);
Load the temp table with the data that you need to unpivot:
set @unpivotquery
= 'select
category,
value
from yourtable
unpivot
(
value for cols in ('+ @colsUnpivot +')
) u'
insert into #Category_Section exec(@unpivotquery);
See SQL Fiddle with Demo. You'll see that your data looks the same as the static version above. Now you need to create a string with the values from the temp table that will be used in the final query:
select @colsPivot
= STUFF((SELECT ',' + quotename(SectionValue)
from #Category_Section
group by SectionValue
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
Once you have all this you can put it together into a final query:
DECLARE @colsUnpivot AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX),
@colsPivot as NVARCHAR(MAX),
@unpivotquery AS NVARCHAR(MAX);
select @colsUnpivot
= stuff((select ','+quotename(C.name)
from sys.columns as C
where C.object_id = object_id('yourtable') and
C.name like 'Section%'
for xml path('')), 1, 1, '');
create table #Category_Section
(
Category varchar(50),
SectionValue varchar(50)
);
set @unpivotquery
= 'select
category,
value
from yourtable
unpivot
(
value for cols in ('+ @colsUnpivot +')
) u';
insert into #Category_Section exec(@unpivotquery);
select @colsPivot
= STUFF((SELECT ',' + quotename(SectionValue)
from #Category_Section
group by SectionValue
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query
= 'select category, '+@colsPivot+'
from
(
select
category,
value
from yourtable
unpivot
(
value
for cols in ('+ @colsUnpivot +')
) un
) x
pivot
(
count(value)
for value in ('+ @colsPivot +')
) p'
exec sp_executesql @query
See SQL Fiddle with Demo. All versions will get you the end result:
| CATEGORY | BIG | DAMAGED | GREEN | WHITE |
|------------|-----|---------|-------|-------|
| Category 1 | 2 | 0 | 2 | 2 |
| Category 2 | 1 | 1 | 0 | 1 |
| Category 3 | 1 | 1 | 0 | 1 |
If your values are stored in a separate table, then you would generate your list of values from that table:
DECLARE @query AS NVARCHAR(MAX),
@colsPivot as NVARCHAR(MAX);
select @colsPivot
= STUFF((SELECT ',' + quotename(decription)
from descriptions
group by decription
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query
= 'select category, '+@colsPivot+'
from
(
select
category,
value
from yourtable
unpivot
(
value
for cols in (Section1, Section2, Section3)
) un
) x
pivot
(
count(value)
for value in ('+ @colsPivot +')
) p'
exec sp_executesql @query
See SQL Fiddle with Demo and still get the same result:
| CATEGORY | BIG | DAMAGED | GREEN | WHITE |
|------------|-----|---------|-------|-------|
| Category 1 | 2 | 0 | 2 | 2 |
| Category 2 | 1 | 1 | 0 | 1 |
| Category 3 | 1 | 1 | 0 | 1 |
select category,
SUM(CASE when section1='white' then 1 when section2='white' then 1 when section3='white' then 1 else 0 end) as white,
SUM(CASE when section1='green' then 1 when section2='green' then 1 when section3='green' then 1 else 0 end) as green,
SUM(CASE when section1='damaged' then 1 when section2='damaged' then 1 when section3='damaged' then 1 else 0 end) as damaged,
SUM(CASE when section1='big' then 1 when section2='big' then 1 when section3='big' then 1 else 0 end) as big
from test
group by category
SQLFiddle
You can extend this to n section values in the same way as shown above for section1, section2, section3.
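One detail of the CASE query above may matter: a single CASE with several WHEN branches stops at the first match, so a row that holds the same value in two sections is counted once. In the question's sample data no row repeats a value, so the results agree with the pivot version. If every occurrence should be counted, one separate CASE per section can be summed instead. A minimal sketch, using one made-up row in which 'white' appears twice:

```sql
-- Sketch only: the sample row is hypothetical and chosen so that
-- 'white' appears in two sections of the same row.
SELECT category,
       -- chained WHEN: counts the row once
       SUM(CASE WHEN section1 = 'white' THEN 1
                WHEN section2 = 'white' THEN 1
                WHEN section3 = 'white' THEN 1
                ELSE 0 END) AS white_rows,
       -- one CASE per section: counts every occurrence
       SUM(CASE WHEN section1 = 'white' THEN 1 ELSE 0 END
         + CASE WHEN section2 = 'white' THEN 1 ELSE 0 END
         + CASE WHEN section3 = 'white' THEN 1 ELSE 0 END) AS white_occurrences
FROM (VALUES ('Category 1', 'white', 'green', 'white')) t (category, section1, section2, section3)
GROUP BY category;
```

For this row white_rows is 1 and white_occurrences is 2.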

How do I Pivot Vertical Data to Horizontal Data SQL with Variable Row Lengths?

Okay I have the following table.
Name ID Website
Aaron | 2305 | CoolSave1
Aaron | 8464 | DiscoWorld1
Adriana | 2956 | NewCin1
Adriana | 5991 | NewCin2
Adriana | 4563 | NewCin3
I would like to transform it into the following way.
Adriana | 2956 | NewCin1 | 5991 | NewCin2 | 4563 | NewCin3
Aaron | 2305 | CoolSave1 | 8464 | DiscoWorld | NULL | NULL
As you can see, I am trying to take the first name from the first table and make a single row with all the IDs / Websites associated with that name. The problem is that there is a variable number of websites that may be associated with each name. To handle this, I'd like to make a table with the number of fields equal to the max line item, and then for the subsequent line items, plug in a NULL where there is not enough data.
In order to get the result, you will need to apply both the UNPIVOT and the PIVOT functions to the data. The UNPIVOT will take the columns (ID, website) and convert them to rows, once this is done, then you can PIVOT the data back into columns.
The UNPIVOT code will be similar to the following:
select name,
col+'_'+cast(col_num as varchar(10)) col,
value
from
(
select name,
cast(id as varchar(11)) id,
website,
row_number() over(partition by name order by id) col_num
from yt
) src
unpivot
(
value
for col in (id, website)
) unpiv;
See SQL Fiddle with Demo. This gives a result:
| NAME | COL | VALUE |
-------------------------------------
| Aaron | id_1 | 2305 |
| Aaron | website_1 | CoolSave1 |
| Aaron | id_2 | 8464 |
| Aaron | website_2 | DiscoWorld1 |
As you can see, I applied a row_number() to the data prior to the unpivot; the row number is used to generate the new column names. The columns in the UNPIVOT must also be of the same datatype, so I applied a cast to the id column in the subquery to convert the data to a varchar prior to the unpivot.
The col values are then used in the PIVOT. Once the data has been unpivoted, you apply the PIVOT function:
select *
from
(
select name,
col+'_'+cast(col_num as varchar(10)) col,
value
from
(
select name,
cast(id as varchar(11)) id,
website,
row_number() over(partition by name order by id) col_num
from yt
) src
unpivot
(
value
for col in (id, website)
) unpiv
) d
pivot
(
max(value)
for col in (id_1, website_1, id_2, website_2, id_3, website_3)
) piv;
See SQL Fiddle with Demo.
The above version works great if you have a limited or known number of values. But if the number of rows is unknown, then you will need to use dynamic SQL to generate the result:
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX)
select @cols = STUFF((SELECT ',' + QUOTENAME( col+'_'+cast(col_num as varchar(10)))
from
(
select row_number() over(partition by name order by id) col_num
from yt
) t
cross apply
(
select 'id' col union all
select 'website'
) c
group by col, col_num
order by col_num, col
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query = 'SELECT name,' + @cols + '
from
(
select name,
col+''_''+cast(col_num as varchar(10)) col,
value
from
(
select name,
cast(id as varchar(11)) id,
website,
row_number() over(partition by name order by id) col_num
from yt
) src
unpivot
(
value
for col in (id, website)
) unpiv
) x
pivot
(
max(value)
for col in (' + @cols + ')
) p '
execute(@query);
See SQL Fiddle with Demo. Both versions give the result:
| NAME | ID_1 | WEBSITE_1 | ID_2 | WEBSITE_2 | ID_3 | WEBSITE_3 |
------------------------------------------------------------------------
| Aaron | 2305 | CoolSave1 | 8464 | DiscoWorld1 | (null) | (null) |
| Adriana | 2956 | NewCin1 | 4563 | NewCin3 | 5991 | NewCin2 |
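When the maximum number of rows per name is known in advance, the same layout can also be produced without UNPIVOT and PIVOT, using conditional aggregation over the row number. This is a different technique from the one in the answer above, shown as a sketch with the question's sample rows inlined in a CTE named yt; it keeps id as an integer rather than casting it to varchar.

```sql
-- Sketch only: conditional aggregation instead of UNPIVOT + PIVOT,
-- assuming at most three rows per name.
WITH yt AS (
    SELECT *
    FROM (VALUES
          ('Aaron',   2305, 'CoolSave1')
        , ('Aaron',   8464, 'DiscoWorld1')
        , ('Adriana', 2956, 'NewCin1')
        , ('Adriana', 5991, 'NewCin2')
        , ('Adriana', 4563, 'NewCin3')
    ) v (name, id, website)
),
numbered AS (
    -- number each name's rows in id order, as the answer above does
    SELECT name, id, website,
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS col_num
    FROM yt
)
SELECT name,
       MAX(CASE WHEN col_num = 1 THEN id END)      AS id_1,
       MAX(CASE WHEN col_num = 1 THEN website END) AS website_1,
       MAX(CASE WHEN col_num = 2 THEN id END)      AS id_2,
       MAX(CASE WHEN col_num = 2 THEN website END) AS website_2,
       MAX(CASE WHEN col_num = 3 THEN id END)      AS id_3,
       MAX(CASE WHEN col_num = 3 THEN website END) AS website_3
FROM numbered
GROUP BY name
ORDER BY name;
```

Names with fewer than three rows get NULL in the unused columns, matching the result shown above. For an unknown number of rows per name, the column list would still have to be generated with dynamic SQL.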