Group and Separate into Different Columns - sql

I'm trying to group some data and separate one column into several. The following is the kind of table I'm working with, although each of these columns comes from a separate table, and the tables are connected through an ID on each:
ParticipantId  QuestionText          QuestionAnswer
-------------  --------------------  --------------
1              What is your gender?  2
2              What is your gender?  1
3              What is your gender?  1
4              What is your gender?  2
5              What is your gender?  1
1              What is your age?     28
2              What is your age?     NULL
3              What is your age?     55
4              What is your age?     63
And this is what I want to achieve:
ParticipantId  Question1Answer  Question2Answer  Question3Answer
-------------  ---------------  ---------------  ---------------
1              2                28               3
2              1                NULL             4
I imagine this is quite a difficult thing to do, as the questionnaire contains around 100 questions. I don't think using CASE would be suitable, since I'd have to type out each QuestionID. I'm using SQL Server 2008. The following is some of the table structure I'm working with. I'm sure there's a clearer way to show it than typing it out.
The QuestionnaireQuestion table contains QuestionNumber for the sequence and joins to the Question table via QuestionID, which is the Question table's PID. The Question table contains QuestionText and links via QuestionID to the Answer table, which contains the answer field. The Answer table then goes through a link table called QuestionnaireInstance, which finally links to the PaperQuestionnaire table, which contains the ParticipantID.
That probably hasn't made it any clearer, just let me know anything else that might clear it up a bit.

In case you don't want to have to type out all of the question text each time, you could always use this:
;WITH sample_data AS
(
    SELECT
        ParticipantId
        ,QuestionText
        ,QuestionAnswer
        -- ORDER BY (SELECT NULL) numbers each participant's rows in arbitrary order;
        -- order by a question number or ID column instead if one is available
        ,ROW_NUMBER() OVER (PARTITION BY ParticipantId ORDER BY (SELECT NULL)) AS rn
    FROM yourdatatable
)
SELECT
    ParticipantId
    ,MAX(CASE WHEN rn = 1 THEN QuestionAnswer END) AS Q1
    ,MAX(CASE WHEN rn = 2 THEN QuestionAnswer END) AS Q2
    ,MAX(CASE WHEN rn = 3 THEN QuestionAnswer END) AS Q3
    ,MAX(CASE WHEN rn = 4 THEN QuestionAnswer END) AS Q4
FROM sample_data
GROUP BY ParticipantId
Although it might be better in your case to consider dynamic pivoting instead, depending on how many columns you ultimately want to end up with.
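For reference, dynamic pivoting on SQL Server 2008 could look roughly like the sketch below. It is only an illustration under assumptions: the temp table #answers and its QuestionNumber column are made-up stand-ins for whatever your joined tables return, and the Q1, Q2, ... column names are generated from those question numbers.

```sql
-- Hypothetical sample data; replace #answers with your own joined query.
CREATE TABLE #answers (ParticipantId int, QuestionNumber int, QuestionAnswer int);
INSERT INTO #answers VALUES
    (1, 1, 2), (2, 1, 1), (3, 1, 1),
    (1, 2, 28), (2, 2, NULL), (3, 2, 55);

DECLARE @cols nvarchar(max), @sql nvarchar(max);

-- Build the column list [Q1],[Q2],... from the question numbers present.
SELECT @cols = STUFF((
    SELECT ',' + QUOTENAME('Q' + CAST(QuestionNumber AS varchar(10)))
    FROM (SELECT DISTINCT QuestionNumber FROM #answers) AS q
    ORDER BY QuestionNumber
    FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 1, '');

-- Splice the column list into a PIVOT query and run it.
SET @sql = N'
SELECT ParticipantId, ' + @cols + N'
FROM (
    SELECT ParticipantId,
           ''Q'' + CAST(QuestionNumber AS varchar(10)) AS QuestionCol,
           QuestionAnswer
    FROM #answers
) AS src
PIVOT (MAX(QuestionAnswer) FOR QuestionCol IN (' + @cols + N')) AS p
ORDER BY ParticipantId;';

EXEC sp_executesql @sql;

DROP TABLE #answers;
```

Because the column list is built from the data, adding a question does not require editing the query. The trade-off is that the statement is assembled as a string, so QUOTENAME is used here to keep the generated column names safe.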

If the combination of ParticipantId and QuestionText is unique in your table, you can also use the query below to achieve the desired output:
SELECT Participantid,
MAX(CASE
WHEN Questiontext = 'What is your gender?' THEN
Questionanswer
ELSE
NULL
END) AS Question1answer,
MAX(CASE
WHEN Questiontext = 'What is your age?' THEN
Questionanswer
ELSE
NULL
END) AS Question2answer,
MAX(CASE
WHEN Questiontext = '...your third question...' THEN
Questionanswer
ELSE
NULL
END) AS Question3answer,
..
..
FROM Your_Table_Name
GROUP BY Participantid

Related

Grouping results based on CASE expression?

I'm working with a table that stores the results of a questionnaire administered to people. Each question and its result is stored as a separate record, as shown below. I've written a CASE expression that creates a simple 1/0 flag depending on people's answers to certain questions. My results look something like this.
PersonID  Question    Answer     Flag
--------  ----------  ---------  ----
1001      Question 1  yes        1
1001      Question 2  3          0
1001      Question 3  1 or more  1
1234      Question 1  no         0
1234      Question 2  2          0
1234      Question 3  none       0
My issue now is that if a person has even one flagged response, I need to flag their entire questionnaire. I've been looking around for other examples of this issue—this is dealing with almost exactly the same thing, but I don't necessarily want to actually group my results, because I still want to be able to see the flags for the individual questions ("oh, this person's questionnaire got flagged, let's see which question it was for and what their response was"). I know it's probably not the most efficient thing, but I'm hoping for results that look like this:
PersonID  Question    Answer     Flag  Overall
--------  ----------  ---------  ----  -------
1001      Question 1  yes        1     1
1001      Question 2  3          0     1
1001      Question 3  1 or more  1     1
1234      Question 1  no         0     0
1234      Question 2  2          0     0
1234      Question 3  none       0     0
This is where I'm at with my query. It works fine for flagging the individual questions, but I'm not sure what steps to take in order to flag the whole questionnaire based on the individual answers. What kind of logic/syntax should I be looking at?
SELECT
PersonID,
QuestionDescription as Question,
ResultValue as Answer,
(CASE
WHEN (QuestionDescription LIKE '%ion 1%' AND ResultValue = 'yes') THEN 1
WHEN (QuestionDescription LIKE '%ion 2%' AND ResultValue >= 5) THEN 1
WHEN (QuestionDescription LIKE '%ion 3%' AND ResultValue = '1 or more') THEN 1
ELSE 0
END) as Flag
FROM Questionnaire
ORDER BY PersonID, QuestionDescription
At its simplest, you can add up the flags within a partition for each person and see whether they sum to 0 or not:
WITH x AS
(
SELECT
PersonID,
QuestionDescription as Question,
ResultValue as Answer,
CASE
WHEN (QuestionDescription LIKE '%ion 1%' AND ResultValue = 'yes') THEN 1
WHEN (QuestionDescription LIKE '%ion 2%' AND ResultValue >= 5) THEN 1
WHEN (QuestionDescription LIKE '%ion 3%' AND ResultValue = '1 or more') THEN 1
ELSE 0
END as Flag
FROM Questionnaire
)
SELECT
*,
CASE WHEN SUM(Flag) OVER(PARTITION BY PersonID) > 0 THEN 1 ELSE 0 END as Overall
FROM
x
SUM(...) OVER(...) is a bit like doing the following:
WITH x AS (
--your existing query here
)
SELECT *, CASE WHEN SumFlag > 0 THEN 1 ELSE 0 END as Overall
FROM
x
INNER JOIN
(SELECT PersonId, SUM(Flag) AS SumFlag FROM X GROUP BY PersonId) y ON x.PersonId = y.PersonId
That is, SUM(...) OVER(PARTITION BY PersonID) groups on PersonID, sums the flags, and then automatically joins the result back to every row of the group it came from. Window functions like this are incredibly powerful and useful.
This latter form (where a separate query groups and is rejoined) also works if you can't get on with the window function (SUM OVER) approach. It's akin to what datarocker pointed to in their answer.
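Since Flag is only ever 0 or 1, a slightly shorter variant is to take the maximum flag per person rather than summing and comparing to zero. This is a sketch, not the original answer's method; the inline VALUES rows stand in for the output of your existing flag query.

```sql
-- Sample rows copied from the question; "flagged" stands in for the existing query.
WITH flagged AS (
    SELECT *
    FROM (VALUES
        (1001, 'Question 1', 'yes',       1),
        (1001, 'Question 2', '3',         0),
        (1001, 'Question 3', '1 or more', 1),
        (1234, 'Question 1', 'no',        0),
        (1234, 'Question 2', '2',         0),
        (1234, 'Question 3', 'none',      0)
    ) AS t (PersonID, Question, Answer, Flag)
)
SELECT
    PersonID, Question, Answer, Flag,
    -- 1 if any row for this person is flagged, otherwise 0
    MAX(Flag) OVER (PARTITION BY PersonID) AS Overall
FROM flagged
ORDER BY PersonID, Question;
```

With the sample rows, every row for person 1001 gets Overall = 1 and every row for person 1234 gets Overall = 0, while the per-question flags stay visible.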

Separate columns for product counts using CTEs

Asking a question again as my post did not follow community rules.
I first tried to write a PIVOT statement to get the desired output. However, I am now trying to approach this using CTEs.
Here's the raw data. Let's call it ProductMaster:
PRODUCT_NUM  CO_CD  PROD_CD     MASTER_ID    Date      ROW_NUM
-----------  -----  ----------  -----------  --------  -------
1854         MAWC   STATIONERY  10003493039  1/1/2021  1
1567         PREF   PRINTER     10003493039  2/1/2021  2
2151         MAWC   STATIONERY  10003497290  3/2/2021  1
I need the count of each product for every household from this data in separate columns: Printer_CT and Stationery_CT.
Each Master_ID represents a household, and a household can have multiple products.
So each household is one row in my final output, and I need the product counts in separate columns. There can be multiple products in each household, four or even more, but I have simplified this example.
I'm writing a query with CTEs to give me the output that I want. In my output, each row is grouped by Master ID:
ORGL_CO_CD  ORGL_PROD_CD  STATIONERY_CT  PRINTER_CT
----------  ------------  -------------  ----------
MAWC        STATIONERY    1              1
MAWC        STATIONERY    1              0
Here's my query. I'm not sure where to introduce Column 'Stationery_Ct'
WITH CTE AS
(
SELECT
CO_CD, Prod_CD, MASTER_ID,
'' as S1_CT, '' as P1_CT
FROM
ProductMaster
WHERE
ROW_NUM = 1
), CTE_2 AS
(
SELECT Prod_CD, MASTER_ID
FROM ProductMaster
WHERE ROW_NUM = 2
)
SELECT
CO_CD AS ORGL_CO_CD,
c.Prod_CD AS ORGL_PROD_CD,
(CASE WHEN c2.Prod_CD = 'PRINTER' THEN P1_CT = 1 END) PRINTER_CT
FROM
CTE AS c
LEFT OUTER JOIN
CTE_2 AS c2 ON c.MASTER_ID = c2.MASTER_ID
Any pointers would be appreciated.
Thank you!
I guess you can solve that using just GROUP BY and SUM:
-- Test data
DECLARE @ProductMaster AS TABLE (PRODUCT_NUM INT, CO_CD VARCHAR(30), PROD_CD VARCHAR(30), MASTER_ID BIGINT)
INSERT @ProductMaster VALUES (1854, 'MAWC', 'STATIONERY', 10003493039)
INSERT @ProductMaster VALUES (1567, 'PREF', 'PRINTER', 10003493039)
INSERT @ProductMaster VALUES (2151, 'MAWC', 'STATIONERY', 10003497290)
-- One row per household, one conditional SUM per product type
SELECT
    MASTER_ID,
    SUM(CASE PROD_CD WHEN 'STATIONERY' THEN 1 ELSE 0 END) AS STATIONERY_CT,
    SUM(CASE PROD_CD WHEN 'PRINTER' THEN 1 ELSE 0 END) AS PRINTER_CT
FROM @ProductMaster
GROUP BY MASTER_ID
The result is:
MASTER_ID    STATIONERY_CT  PRINTER_CT
-----------  -------------  ----------
10003493039  1              1
10003497290  1              0
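If the ORGL_CO_CD and ORGL_PROD_CD columns from the question are also needed, the same GROUP BY can pick them up from the ROW_NUM = 1 row of each household. This is a sketch that assumes, as the question's first CTE suggests, that ROW_NUM = 1 marks the original product; the table variable below just re-creates the sample data with that column added.

```sql
-- Sample data from the question, including ROW_NUM.
DECLARE @ProductMaster AS TABLE (
    PRODUCT_NUM INT, CO_CD VARCHAR(30), PROD_CD VARCHAR(30),
    MASTER_ID BIGINT, ROW_NUM INT);
INSERT @ProductMaster VALUES (1854, 'MAWC', 'STATIONERY', 10003493039, 1);
INSERT @ProductMaster VALUES (1567, 'PREF', 'PRINTER',    10003493039, 2);
INSERT @ProductMaster VALUES (2151, 'MAWC', 'STATIONERY', 10003497290, 1);

SELECT
    -- Values taken from the household's first row (assumed to be ROW_NUM = 1)
    MAX(CASE WHEN ROW_NUM = 1 THEN CO_CD END)   AS ORGL_CO_CD,
    MAX(CASE WHEN ROW_NUM = 1 THEN PROD_CD END) AS ORGL_PROD_CD,
    -- Product counts across all of the household's rows
    SUM(CASE WHEN PROD_CD = 'STATIONERY' THEN 1 ELSE 0 END) AS STATIONERY_CT,
    SUM(CASE WHEN PROD_CD = 'PRINTER'    THEN 1 ELSE 0 END) AS PRINTER_CT
FROM @ProductMaster
GROUP BY MASTER_ID
ORDER BY MASTER_ID;
```

This avoids the two CTEs and the self-join entirely, and it keeps working when a household has more than two products.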

SQL Conditional Aggregation not returning all expected rows

So I've been trying to get a conditional aggregation running on one of my tables in SQL Server Management Studio and I've run across a problem: only one row is being returned when there should be 2.
SELECT ListID,
MAX(CASE WHEN QuestionName = 'Probability Value' THEN Answer END) AS 'prob',
MAX(CASE WHEN QuestionName = 'Impact Value' THEN Answer END) As 'impa',
MAX(CASE WHEN QuestionName = 'What is the Risk Response Strategy' THEN Answer END) AS 'strat',
MAX(CASE WHEN QuestionName = 'Response Comment' THEN Answer END) AS 'rrap'
FROM table1
GROUP BY ListID
Based on the information stored in the table, it should return two rows, something like:
ListID | Prob | Impa | Strat | rrap
1      | 2    | 3    | Admin | text1
1      | 5    | 5    | Elim  | text2
but only the first row appears. I don't have any good leads at the moment, but I wonder if you good people might have spotted something obviously wrong with the initial query.
Your only GROUP BY column is ListID, and both of your rows have ListID 1; that's why they collapse into a single row.
Why do you think it should return more than 1 row? You are grouping by ListID and getting the MAX answer for all these questions.
If you want more rows returned you will have to group by other columns/expressions as well. You can't expect ListID 1 to appear more than once if you grouped by ListID only.
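As an illustration of grouping by an additional column, the sketch below assumes the table has some column that tells the two sets of answers apart. ResponseID is a hypothetical name; nothing in the question confirms such a column exists, and the inline rows are made up to match the expected output.

```sql
-- table1 is simulated here; ResponseID is a hypothetical column that
-- identifies which set of answers a row belongs to.
WITH table1 AS (
    SELECT *
    FROM (VALUES
        (1, 1, 'Probability Value',                  '2'),
        (1, 1, 'Impact Value',                       '3'),
        (1, 1, 'What is the Risk Response Strategy', 'Admin'),
        (1, 1, 'Response Comment',                   'text1'),
        (1, 2, 'Probability Value',                  '5'),
        (1, 2, 'Impact Value',                       '5'),
        (1, 2, 'What is the Risk Response Strategy', 'Elim'),
        (1, 2, 'Response Comment',                   'text2')
    ) AS t (ListID, ResponseID, QuestionName, Answer)
)
SELECT ListID,
    MAX(CASE WHEN QuestionName = 'Probability Value' THEN Answer END) AS prob,
    MAX(CASE WHEN QuestionName = 'Impact Value' THEN Answer END) AS impa,
    MAX(CASE WHEN QuestionName = 'What is the Risk Response Strategy' THEN Answer END) AS strat,
    MAX(CASE WHEN QuestionName = 'Response Comment' THEN Answer END) AS rrap
FROM table1
GROUP BY ListID, ResponseID   -- one output row per set of answers
ORDER BY ListID, ResponseID;
```

With the extra grouping column, ListID 1 appears once per ResponseID, which gives the two rows the question expects.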

In SQL, how can I count the number of values in a column and then pivot it so the column becomes the row?

I have a survey database with one column for each question and one row for each person who responds. Each question is answered with a value from 1 to 3.
Id Quality? Speed?
-- ------- -----
1 3 1
2 2 1
3 2 3
4 3 2
Now, I need to display the results as one row per question, with a column for each response number, and the value in each column being the number of responses that used that answer. Finally, I need to calculate the total score, which is the number of 1's plus two times the number of 2's plus three times the number of threes.
Question 1 2 3 Total
-------- -- -- -- -----
Quality? 0 2 2 10
Speed? 2 1 1 7
Is there a way to do this in set-based SQL? I know how to do it using loops in C# or cursors in SQL, but I'm trying to make it work in a reporting tool that doesn't support cursors.
This will give you what you're asking for:
SELECT
'quality' AS question,
SUM(CASE WHEN quality = 1 THEN 1 ELSE 0 END) AS [1],
SUM(CASE WHEN quality = 2 THEN 1 ELSE 0 END) AS [2],
SUM(CASE WHEN quality = 3 THEN 1 ELSE 0 END) AS [3],
SUM(quality) AS Total
FROM
dbo.Answers
UNION ALL
SELECT
'speed' AS question,
SUM(CASE WHEN speed = 1 THEN 1 ELSE 0 END) AS [1],
SUM(CASE WHEN speed = 2 THEN 1 ELSE 0 END) AS [2],
SUM(CASE WHEN speed = 3 THEN 1 ELSE 0 END) AS [3],
SUM(speed)
FROM
dbo.Answers
Keep in mind, though, that this will quickly balloon as you add questions or even potential answers. You might be much better off if you normalized a bit and had an Answers table with one row per answer, keyed by a question code or ID, instead of spreading the answers across columns in one table. It starts to look a little like the entity-attribute-value design, but I think it's different enough to be useful here.
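To make the normalization suggestion concrete, here is a minimal sketch of the same report against a one-row-per-answer layout. The column names Id, Question, and Value are assumptions for illustration, and the inline rows are the sample data from the question rearranged into that shape.

```sql
-- Hypothetical normalized layout: one row per respondent per question.
WITH answers AS (
    SELECT *
    FROM (VALUES
        (1, 'Quality?', 3), (1, 'Speed?', 1),
        (2, 'Quality?', 2), (2, 'Speed?', 1),
        (3, 'Quality?', 2), (3, 'Speed?', 3),
        (4, 'Quality?', 3), (4, 'Speed?', 2)
    ) AS t (Id, Question, Value)
)
SELECT
    Question,
    SUM(CASE WHEN Value = 1 THEN 1 ELSE 0 END) AS [1],
    SUM(CASE WHEN Value = 2 THEN 1 ELSE 0 END) AS [2],
    SUM(CASE WHEN Value = 3 THEN 1 ELSE 0 END) AS [3],
    SUM(Value) AS Total   -- equals 1*[1] + 2*[2] + 3*[3]
FROM answers
GROUP BY Question
ORDER BY Question;
```

In this shape a new question is just more rows, so the query needs no extra UNION ALL branch; only a new possible answer value would add a column.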
You can also leverage SQL 2005's pivoting functions to achieve what you want. This way you don't need to hard code any questions as you do in cross-tabulation. Note that I called the source table "mytable" and I used common table expressions for readability but you could also use subqueries.
WITH unpivoted AS (
SELECT id, value, question
FROM mytable a
UNPIVOT (value FOR question IN (quality,speed) ) p
)
,counts AS (
SELECT question, value, count(*) AS counts
FROM unpivoted
GROUP BY question, value
)
, repivoted AS (
SELECT question, counts, [1], [2], [3]
FROM counts
PIVOT (count(value) FOR value IN ([1],[2],[3])) p
)
SELECT question, sum(counts*[1]) AS [1], sum(counts*[2]) AS [2], sum(counts*[3]) AS [3]
,sum(counts*[1]) + 2*sum(counts*[2]) + 3*sum(counts*[3]) AS Total
FROM repivoted
GROUP BY question
Note if you don't want the breakdown the query is simpler:
WITH unpivoted AS (
SELECT id, value, question
FROM mytable a
UNPIVOT (value FOR question IN (quality,speed) ) p
)
, totals AS (
SELECT question, value, count(value)*value AS score
FROM unpivoted
GROUP BY question, value
)
SELECT question, sum(score) AS score
FROM totals
GROUP BY question

Counting values in columns

What I am looking for is to group by and count the total of different data in the same table and have them show in two different columns. Like below.
Data in table A
Fields:
Name Type
Bob 1
John 2
Bob 1
Steve 1
John 1
Bob 2
Desired result from query:
Name Type 1 Type 2
Bob 2 1
John 1 1
Steve 1 0
This will do the trick in SQL Server:
SELECT
name,
SUM( CASE type WHEN 1 THEN 1 ELSE 0 END) AS type1,
SUM( CASE type WHEN 2 THEN 1 ELSE 0 END) AS type2
FROM
myTable
GROUP BY
name
No time to write the code, but the CASE expression is what you want here. Simply have a value of 1 if the row meets the condition and 0 if it doesn't. Then you can sum the columns.
Use two separate GROUP BY subqueries.
SELECT n.Name, COALESCE(a.Count1, 0) AS Count1, COALESCE(b.Count2, 0) AS Count2
FROM (SELECT DISTINCT Name FROM myTable) AS n
-- LEFT JOIN so that names with no rows of a given type still appear with 0
LEFT JOIN
(SELECT Name, COUNT(*) AS Count1 FROM myTable WHERE Type = 1 GROUP BY Name) AS a ON a.Name = n.Name
LEFT JOIN
(SELECT Name, COUNT(*) AS Count2 FROM myTable WHERE Type = 2 GROUP BY Name) AS b ON b.Name = n.Name
You're looking for a CrossTab solution. The above solutions will work, but you'll come unstuck if you want a general solution and have N types.
A CrossTab solution will solve this for you. If this is for quickly crunching some numbers then dump your data into Excel and use the native Pivot Table feature.
If it's for an RDBMS in an app, then it depends upon the RDBMS. MS SQL 2005 and above has a crosstab syntax. See:
http://www.databasejournal.com/features/mssql/article.php/3521101/Cross-Tab-reports-in-SQL-Server-2005.htm
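For this question's data, the SQL Server PIVOT syntax could be used roughly as follows. It is a sketch: the inline rows re-create the sample table, and the helper column Hit is introduced here only so that the counted column is separate from the pivoted one.

```sql
-- Sample data from the question.
WITH myTable AS (
    SELECT *
    FROM (VALUES
        ('Bob', 1), ('John', 2), ('Bob', 1),
        ('Steve', 1), ('John', 1), ('Bob', 2)
    ) AS t (Name, Type)
),
src AS (
    -- Hit is a helper column to count; Type supplies the new column headers.
    SELECT Name, Type, 1 AS Hit
    FROM myTable
)
SELECT Name, [1] AS Type1, [2] AS Type2
FROM src
PIVOT (COUNT(Hit) FOR Type IN ([1], [2])) AS p
ORDER BY Name;
```

The list of types in the IN clause is still fixed, so for a truly general N-type solution the column list would have to be generated with dynamic SQL.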
@Seb has a good solution, but it's server-dependent. Here's an alternative using subselects that should be portable:
select
name,
(select count(type) from myTable where type=1 and name=a.name) as type1,
(select count(type) from myTable where type=2 and name=a.name) as type2
from
myTable as a
group by
name