Split Multiple Columns into Multiple Rows - sql

I have a table with this structure.
UserID | UserName | AnswerToQuestion1 | AnswerToQuestion2 | AnswerToQuestion3
1 | John | 1 | 0 | 1
2 | Mary | 1 | 1 | 0
I can't figure out what SQL query I would use to get a result set like this:
UserID | UserName | QuestionName | Response
1 | John | AnswerToQuestion1 | 1
1 | John | AnswerToQuestion2 | 0
1 | John | AnswerToQuestion3 | 1
2 | Mary | AnswerToQuestion1 | 1
2 | Mary | AnswerToQuestion2 | 1
2 | Mary | AnswerToQuestion3 | 0
I'm trying to split the three columns into three separate rows. Is this possible?

SELECT
Y.UserID,
Y.UserName,
QuestionName = 'AnswerToQuestion' + X.Which,
Response =
CASE X.Which
WHEN '1' THEN AnswerToQuestion1
WHEN '2' THEN AnswerToQuestion2
WHEN '3' THEN AnswerToQuestion3
END
FROM
YourTable Y
CROSS JOIN (SELECT '1' UNION ALL SELECT '2' UNION ALL SELECT '3') X (Which)
This performs about as well as UNPIVOT (sometimes better) and also works in SQL Server 2000.
I took advantage of the questions' similarity to create the QuestionName column, but of course this will work with varying question names.
Note that if your list of questions is long or the question names are long, you might experiment with two columns in the X table, one for the question number and one for the question name. Or, if you already have a table with the list of questions, CROSS JOIN to that. If some answers are NULL, the easiest fix is to put the above query in a CTE or derived table and then add WHERE Response IS NOT NULL.
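The two-column variant of the X table and the NULL filter can be tried without a SQL Server instance. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in, so the dialect is SQLite rather than T-SQL. The helper name unpivot_answers and the sample data (with one NULL answer) are assumptions made for illustration.

```python
import sqlite3

def unpivot_answers(conn):
    """CROSS JOIN every row to a (Which, QuestionName) list, pick the
    matching answer column with CASE, then drop NULL responses."""
    sql = """
        SELECT UserID, UserName, QuestionName, Response
        FROM (
            SELECT Y.UserID, Y.UserName, X.QuestionName,
                   CASE X.Which
                       WHEN 1 THEN Y.AnswerToQuestion1
                       WHEN 2 THEN Y.AnswerToQuestion2
                       WHEN 3 THEN Y.AnswerToQuestion3
                   END AS Response
            FROM YourTable AS Y
            CROSS JOIN (
                SELECT 1 AS Which, 'AnswerToQuestion1' AS QuestionName
                UNION ALL SELECT 2, 'AnswerToQuestion2'
                UNION ALL SELECT 3, 'AnswerToQuestion3'
            ) AS X
        ) AS T
        WHERE Response IS NOT NULL
        ORDER BY UserID, QuestionName
    """
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (UserID INT, UserName TEXT, "
    "AnswerToQuestion1 INT, AnswerToQuestion2 INT, AnswerToQuestion3 INT)"
)
# Hypothetical data: Mary's second answer is NULL to exercise the filter.
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?, ?, ?)",
    [(1, "John", 1, 0, 1), (2, "Mary", 1, None, 0)],
)
rows = unpivot_answers(conn)
for row in rows:
    print(row)
```

With this data the query returns five rows: all three for John, and only questions 1 and 3 for Mary.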

Assuming SQL Server 2005+, you can use UNPIVOT:
;with YourTable as
(
SELECT 1 UserID,'John' UserName,1 AnswerToQuestion1,0 AnswerToQuestion2,1 AnswerToQuestion3
UNION ALL
SELECT 2, 'Mary', 1, 1, 0
)
SELECT UserID, UserName, QuestionName, Response
FROM YourTable
UNPIVOT
(Response FOR QuestionName IN
(AnswerToQuestion1, AnswerToQuestion2,AnswerToQuestion3)
)AS unpvt;

According to Itzik Ben-Gan in Inside Microsoft SQL Server 2008: T-SQL Querying, SQL Server goes through three steps when unpivoting a table:
Generate copies
Extract elements
Remove rows with NULLs
Step 1: Generate copies
A virtual table is created that has a copy of each row in the original table for each column that is being unpivoted.
Also, a character string of the column name is stored in a new column (call this the QuestionName column). *Note: I modified the value in one of your columns to NULL to show the full process.
UserID UserName AnswerToQ1 AnswerToQ2 AnswerToQ3 QuestionName
1 John 1 0 1 AnswerToQuestion1
1 John 1 0 1 AnswerToQuestion2
1 John 1 0 1 AnswerToQuestion3
2 Mary 1 NULL 1 AnswerToQuestion1
2 Mary 1 NULL 1 AnswerToQuestion2
2 Mary 1 NULL 1 AnswerToQuestion3
Step 2: Extract elements
Next, a second table is built with one row per copy: for each row, the value is taken from the source column whose name matches
the character string in the QuestionName column. The value is stored in a new column (call this the Response column).
UserID UserName QuestionName Response
1 John AnswerToQuestion1 1
1 John AnswerToQuestion2 0
1 John AnswerToQuestion3 1
2 Mary AnswerToQuestion1 1
2 Mary AnswerToQuestion2 NULL
2 Mary AnswerToQuestion3 1
Step 3: Remove rows with NULLS
This step filters out any rows that were created with null values in the Response column. In other words,
if any of the AnswerToQuestion columns had a null value, it would not be represented as an unpivoted row.
UserID UserName QuestionName Response
1 John AnswerToQuestion1 1
1 John AnswerToQuestion2 0
1 John AnswerToQuestion3 1
2 Mary AnswerToQuestion1 1
2 Mary AnswerToQuestion3 1
If you follow those steps, you can get the same results without using UNPIVOT:
CROSS JOIN all rows in the table against each AnswerToQuestion column name to get row copies
Populate the Response column by matching the source column to QuestionName
Remove the NULLs
An example below:
CREATE TABLE #t1 (UserID INT, UserName VARCHAR(10), AnswerToQuestion1 INT,
AnswertoQuestion2 INT, AnswerToQuestion3 INT
)
INSERT #t1 SELECT 1, 'John', 1, 0, 1 UNION ALL SELECT 2, 'Mary', 1, NULL, 1
SELECT
UserID,
UserName,
QuestionName,
Response
FROM (
SELECT
UserID,
UserName,
QuestionName,
CASE QuestionName
WHEN 'AnswerToQuestion1' THEN AnswerToQuestion1
WHEN 'AnswerToQuestion2' THEN AnswertoQuestion2
ELSE AnswerToQuestion3
END AS Response
FROM #t1 t1
CROSS JOIN (
SELECT 'AnswerToQuestion1' AS QuestionName
UNION ALL SELECT 'AnswerToQuestion2'
UNION ALL SELECT 'AnswerToQuestion3'
) t2
) t3
WHERE Response IS NOT NULL
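The three steps can also be modelled outside the database to see what each intermediate table holds. This is only an illustrative sketch in plain Python, not a description of how SQL Server implements UNPIVOT internally. The variable names are assumptions, and None stands in for NULL.

```python
# Sample rows mirroring the walkthrough (Mary's second answer is NULL).
rows = [
    {"UserID": 1, "UserName": "John",
     "AnswerToQuestion1": 1, "AnswerToQuestion2": 0, "AnswerToQuestion3": 1},
    {"UserID": 2, "UserName": "Mary",
     "AnswerToQuestion1": 1, "AnswerToQuestion2": None, "AnswerToQuestion3": 1},
]
columns = ["AnswerToQuestion1", "AnswerToQuestion2", "AnswerToQuestion3"]

# Step 1: generate copies, one per (row, unpivoted column) pair,
# tagging each copy with the column name.
copies = [dict(row, QuestionName=col) for row in rows for col in columns]

# Step 2: extract elements, reading the value from the column that
# matches the QuestionName tag.
extracted = [
    (c["UserID"], c["UserName"], c["QuestionName"], c[c["QuestionName"]])
    for c in copies
]

# Step 3: remove rows whose Response is NULL.
unpivoted = [r for r in extracted if r[3] is not None]

for r in unpivoted:
    print(r)
```

Steps 1 and 2 each hold six rows. Step 3 leaves five, because Mary's AnswerToQuestion2 row is dropped.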

Related

CASE expression on multiple columns

I have a table with below mentioned columns and values
StudentId | Geography | History | Maths
_______________________________________________
1 | NULL | 25 | NULL
2 | 20 | 23 | NULL
3 | 20 | 22 | 21
I need the output like below:
StudentId | Subject
___________________________
1 | History
2 | Geography
2 | History
3 | Geography
3 | History
3 | Maths
Wherever the value in the subject columns (Geography, History and Maths) is not NULL, I need the 'subject' value to be the respective column name.
I have an idea to pull it for one column using CASE, but not sure how to do it for multiple columns.
Here is what I tried:
SELECT StudentId, CASE WHEN IsNUll(Geography, '#NULL#') <> '#NULL#' THEN 'Geography'
CASE WHEN IsNUll(History, '#NULL#') <> '#NULL#' THEN 'History'
CASE WHEN IsNUll(Maths, '#NULL#') <> '#NULL#' THEN 'Maths' END Subject
FROM MyTable
You need to normalise your data. You can do this with a VALUES table value constructor:
--Create sample data
WITH YourTable AS(
SELECT V.StudentID,
V.[Geography],
V.History,
V.Maths
FROM (VALUES(1,NULL,25,NULL),
(2,20,23,NULL),
(3,20,22,21))V(StudentID,[Geography], History, Maths))
--Solution
SELECT YT.StudentID,
V.[Subject]
FROM YourTable YT
CROSS APPLY (VALUES('Geography',YT.[Geography]),
('History',YT.History),
('Maths',YT.Maths))V([Subject],SubjectMark)
WHERE V.SubjectMark IS NOT NULL
ORDER BY YT.StudentID;
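Use UNION ALL, tagging each branch with the subject name and skipping NULL marks:
select StudentId, 'Geography' as Subject from MyTable where Geography is not null
union all
select StudentId, 'History' from MyTable where History is not null
union all
select StudentId, 'Maths' from MyTable where Maths is not null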
You can use UNPIVOT, which returns the grades row by row and skips NULL values. The code below works:
SELECT * FROM MyTable t
UNPIVOT
(
[Grade] FOR [Subject] IN ([Geography], [History], [Maths])
) AS u
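To check the expected output without SQL Server, here is a rough equivalent of the CROSS APPLY (VALUES ...) idea. SQLite has neither CROSS APPLY nor UNPIVOT, so this sketch substitutes a CROSS JOIN against a list of subject names plus a CASE filter. It runs through Python's sqlite3 module, and the result variable name subjects is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (StudentId INT, Geography INT, History INT, Maths INT)"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?)",
    [(1, None, 25, None), (2, 20, 23, None), (3, 20, 22, 21)],
)

# Pair every student with every subject name, then keep only the pairs
# whose corresponding mark column is not NULL.
subjects = conn.execute("""
    SELECT T.StudentId, S.Subject
    FROM MyTable AS T
    CROSS JOIN (
        SELECT 'Geography' AS Subject
        UNION ALL SELECT 'History'
        UNION ALL SELECT 'Maths'
    ) AS S
    WHERE CASE S.Subject
              WHEN 'Geography' THEN T.Geography
              WHEN 'History'   THEN T.History
              ELSE T.Maths
          END IS NOT NULL
    ORDER BY T.StudentId, S.Subject
""").fetchall()

for row in subjects:
    print(row)
```

On the sample data this yields one row for student 1, two for student 2 and three for student 3, matching the output requested in the question.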

Alternative to CASE WHEN?

I have a table in SQL where the results look something like:
Number | Name | Name 2
1 | John | Derek
1 | John | NULL
2 | Jane | Louise
2 | Jane | NULL
3 | Michael | Mark
3 | Michael | NULL
4 | Sara | Paul
4 | Sara | NULL
I want a way to say that if Number=1, return Name 2 in new column Name 3, so that the results would look like:
Number | Name | Name 2 | Name 3
1 | John | Derek | Derek
1 | John | NULL | Derek
2 | Jane | Louise | Louise
2 | Jane | NULL | Louise
3 | Michael | Mark | Mark
3 | Michael | NULL | Mark
4 | Sara | Paul | Paul
4 | Sara | NULL | Paul
The problem is that I can't say if Number=1, return Name 2 in Name 3, because my table has >100,000 records. I need it to do it automatically. More like "if Number is the same, return Name 2 in Name 3." I've tried to use a CASE statement but haven't been able to figure it out. Is there any way to do this?
Empirically, this seems to work:
SELECT
Number, Name, [Name 2],
MAX([Name 2]) OVER (PARTITION BY Number) [Name 3]
FROM yourTable;
The idea here, if I interpreted your requirements correctly, is that you want to report the non-NULL value of the second name for all records as the third name value. MAX ignores NULLs, so within each Number partition it returns the one non-NULL name.
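The windowed MAX can be tried locally as well. This sketch assumes a Python build whose bundled SQLite is version 3.25 or later (the first release with window functions) and uses SQLite's double-quoted identifiers instead of T-SQL brackets. The result variable name is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE yourTable (Number INT, Name TEXT, "Name 2" TEXT)')
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [
        (1, "John", "Derek"), (1, "John", None),
        (2, "Jane", "Louise"), (2, "Jane", None),
        (3, "Michael", "Mark"), (3, "Michael", None),
        (4, "Sara", "Paul"), (4, "Sara", None),
    ],
)

# MAX skips NULLs, so every row in a Number partition receives the
# partition's single non-NULL "Name 2" value.
result = conn.execute("""
    SELECT Number, Name, "Name 2",
           MAX("Name 2") OVER (PARTITION BY Number) AS "Name 3"
    FROM yourTable
    ORDER BY Number
""").fetchall()

for row in result:
    print(row)
```

Note that this relies on each Number having at most one distinct non-NULL second name. With several, MAX would simply pick the alphabetically last one.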
Solution 3, with group by
with maxi as(
SELECT Number, max(Name2) name3
FROM #sample
group by number, name
)
SELECT f1.*, f2.name3
FROM #sample f1 inner join maxi f2 on f1.number=f2.number
Solution 4, with cross apply
SELECT *
FROM #sample f1 cross apply
(
select top 1 f2.Name2 as Name3 from #sample f2
where f2.number=f1.number and f2.Name2 is not null
) f3
you can try this:
Solution 1, with row_number
create table #sample (Number integer, Name varchar(50), Name2 varchar(50))
insert into #sample
select 1 , 'John' , 'Derek' union all
select 1 , 'John' , NULL union all
select 2 , 'Jane' , 'Louise' union all
select 2 , 'Jane' , NULL union all
select 3 , 'Michael' , 'Mark' union all
select 3 , 'Michael' , NULL union all
select 4 , 'Sara' , 'Paul' union all
select 4 , 'Sara' , NULL ;
with tmp as (
select *, row_number() over(partition by number order by Name2 desc) rang -- non-NULL Name2 sorts first
from #sample
)
select f1.Number, f1.Name, f1.Name2, f2.Name2 as Name3
from tmp f1 inner join tmp f2 on f1.Number=f2.Number and f2.rang=1
Solution 2, with lag (requires SQL Server 2012 or later, where the LAG function is available)
SELECT
Number, Name, Name2,
isnull(Name2, lag(Name2) OVER (PARTITION BY Number order by Name2 desc)) Name3
FROM #sample;

Can't use group by in SQL Server

I am learning how to use Group By in SQL Server and I am trying to write a Query that would let me get all the information from Alumns in a table in numbers.
My table is like the following:
Name | Alumn_ID | Course | Credits | Passed
Peter 1 Math 2 YES
John 2 Math 3 YES
Thomas 3 Math 0 NO
Peter 1 English 3 YES
Thomas 2 English 2 YES
John 3 English 0 NO
The result I want is the following one:
Alumn | Total_Credits | Courses | Passed | Not_Passed
Peter 5 2 2 0
John 5 2 2 0
Thomas 0 2 0 2
I know that I have to use GROUP BY and COUNT, but I'm stuck since I'm a beginner. I really don't know how to separate Passed and Not_Passed in the result from the Passed column in the table. Thanks in advance.
SELECT t.Alumn_ID, t.Name AS Alumn,
SUM(credits) AS total_credits,
COUNT(*) AS courses,
SUM(CASE WHEN Passed = 'YES' THEN 1 ELSE 0 END) AS Passed,
SUM(CASE WHEN Passed = 'NO' THEN 1 ELSE 0 END) AS Reprobated
FROM t
GROUP BY t.Alumn_ID, t.Name
I assume reprobated means not passed.
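For a quick local check of the conditional aggregation, here is a sketch using Python's sqlite3 module (SQLite dialect, not SQL Server). It assumes the table is called Alumns and that each student keeps one Alumn_ID across both courses, which the table in the question does not do. The result variable name summary is also an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Alumns (Name TEXT, Alumn_ID INT, Course TEXT, "
    "Credits INT, Passed TEXT)"
)
# Assumed data: every student has a single, consistent Alumn_ID.
conn.executemany(
    "INSERT INTO Alumns VALUES (?, ?, ?, ?, ?)",
    [
        ("Peter", 1, "Math", 2, "YES"),
        ("John", 2, "Math", 3, "YES"),
        ("Thomas", 3, "Math", 0, "NO"),
        ("Peter", 1, "English", 3, "YES"),
        ("John", 2, "English", 2, "YES"),
        ("Thomas", 3, "English", 0, "NO"),
    ],
)

# Each CASE turns a row into 1 or 0; SUM then counts the matching rows
# per student, giving separate Passed and Not_Passed totals.
summary = conn.execute("""
    SELECT Name AS Alumn,
           SUM(Credits) AS Total_Credits,
           COUNT(*) AS Courses,
           SUM(CASE WHEN Passed = 'YES' THEN 1 ELSE 0 END) AS Passed,
           SUM(CASE WHEN Passed = 'NO'  THEN 1 ELSE 0 END) AS Not_Passed
    FROM Alumns
    GROUP BY Alumn_ID, Name
    ORDER BY Alumn_ID
""").fetchall()

for row in summary:
    print(row)
```

With consistent IDs this reproduces the result table requested in the question.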
The example below does what you asked for.
create table Alumns
(
Name varchar(30) not null
,Alumn_Id int not null
,Course varchar(30) not null
,Credits int not null
,passed varchar(3) not null
)
GO
insert into Alumns
(Name, Alumn_ID, Course, Credits, Passed)
values
('Peter', 1, 'Math', 2, 'YES')
,('John', 2, 'Math', 3, 'YES')
,('Thomas', 3, 'Math', 0, 'NO')
,('Peter', 1, 'English', 3, 'YES')
,('John', 2, 'English', 2, 'YES')
,('Thomas', 3, 'English', 0, 'NO')
GO
select al.Alumn_Id,al.Name
, Sum(al.Credits) as [Total Credits]
, Count(al.Course) as Courses
, Sum(case al.passed when 'YES' then 1 else 0 end) as Passed
, Sum(case al.passed when 'NO' then 1 else 0 end) as [Not Passed]
from dbo.Alumns al
group by al.Alumn_Id, al.Name
But note that you will get wrong results, because your data is incorrect.
Look at your own example, where the same name appears with different Ids across the Math/English rows.
That way you will never end up with the correct result, and that's why it's good practice to group by Ids.
Edit
I see you corrected your example data. Yes, that way the query will fetch the exact results you want.
You can separate Passed and Not_Passed using a CASE expression.
SELECT MAX([name]) AS [Name],
SUM(Credits) AS Total_Credits,
COUNT(Course) AS Courses,
SUM(CASE WHEN Passed='Yes' THEN 1 ELSE 0 END) AS Passed,
SUM(CASE WHEN Passed='No' THEN 1 ELSE 0 END) AS Not_Passed
FROM TableName
GROUP BY Alumn_ID
However, I do not think the values in your tables (both of them) are correct. Please check them again. For example, according to your table, John has two Alumn_IDs (both 2 and 3). If these are two different Johns, then your desired outcome should be changed.
Result
+--------+---------------+---------+--------+------------+
| Name | Total_Credits | Courses | Passed | Not_Passed |
+--------+---------------+---------+--------+------------+
| Peter | 5 | 2 | 2 | 0 |
| John | 3 | 1 | 1 | 0 |
| Thomas | 2 | 3 | 1 | 2 |
+--------+---------------+---------+--------+------------+

Data Matching with SQL and assigning Identity ID's

How can I write a query that will match data and produce an identity for it?
For Example:
RecordID | Name
1 | John
2 | John
3 | Smith
4 | Smith
5 | Smith
6 | Carl
I want a query which will assign an identity after matching exactly on Name.
Expected Output:
RecordID | Name | ID
1 | John | 1X
2 | John | 1X
3 | Smith | 1Y
4 | Smith | 1Y
5 | Smith | 1Y
6 | Carl | 1Z
Note: The ID should be unique for every match. Also, it can be numbers or varchar.
Can somebody help me with this? The main thing is to assign the ID's.
Thanks.
How about this:
with temp as
(
select 1 as id,'John' as name
union
select 2,'John'
union
select 3,'Smith'
union
select 4,'Smith'
union
select 5,'Smith'
union
select 6,'Carl'
)
SELECT *, DENSE_RANK() OVER
(ORDER BY Name) as NewId
FROM TEMP
Order by id
The first part is for testing purposes only.
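To see what DENSE_RANK assigns, and how it differs from RANK, here is a small sketch run through Python's sqlite3 module. It assumes a bundled SQLite of version 3.25 or later for window functions. The table name Records and the column aliases DenseId and RankId are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Records (RecordID INT, Name TEXT)")
conn.executemany(
    "INSERT INTO Records VALUES (?, ?)",
    [(1, "John"), (2, "John"), (3, "Smith"),
     (4, "Smith"), (5, "Smith"), (6, "Carl")],
)

# Rows with the same Name tie in the ordering, so they share one value.
# DENSE_RANK numbers the groups consecutively; RANK leaves gaps after ties.
ranked = conn.execute("""
    SELECT RecordID, Name,
           DENSE_RANK() OVER (ORDER BY Name) AS DenseId,
           RANK()       OVER (ORDER BY Name) AS RankId
    FROM Records
    ORDER BY RecordID
""").fetchall()

for row in ranked:
    print(row)
```

Both functions give every name its own value, which is all the question requires. DENSE_RANK yields 1, 2, 3 for Carl, John, Smith, while RANK yields 1, 2, 4.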
Please try:
SELECT *,
RANK() OVER (ORDER BY Name ASC) AS ID
FROM YourTable
This structure seems to work:
CREATE TABLE #Table
(
Department VARCHAR(100),
Name VARCHAR(100)
);
INSERT INTO #Table VALUES
('Sales','michaeljackson'),
('Sales','michaeljackson'),
('Sales','jim'),
('Sales','jim'),
('Sales','jill'),
('Sales','jill'),
('Sales','jill'),
('Sales','j');
WITH Cte_Rank AS
(
SELECT [Name],
rw = ROW_NUMBER() OVER (ORDER BY [Name])
FROM #Table
GROUP BY [Name]
)
SELECT a.Department,
a.Name,
b.rw
FROM #Table a
INNER JOIN Cte_Rank b
ON a.Name = b.Name;

Collapse SQL rows

Say I have this table:
id | name
-------------
1 | john
2 | steve
3 | steve
4 | john
5 | steve
I only want the rows that are unique compared to the previous row, these:
id | name
-------------
1 | john
2 | steve
4 | john
5 | steve
I can partly achieve this by using this query:
SELECT *, (
SELECT `name` FROM demotable WHERE id=t.id-1
) AS prevName FROM demotable AS t GROUP BY prevName ORDER BY id ASC
But when I am using a query with multiple UNIONs and stuff, this gets way too complicated. Is there an easy way to do this (like GROUP BY, but more specific)?
This should work, but I don't know if it's simpler:
select demotable.*
from demotable
left join demotable as prev on prev.id = demotable.id - 1
where prev.id is null or demotable.name != prev.name
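The self-join above relies on ids being consecutive (id - 1), and the first row has no predecessor to compare against. Where window functions are available (MySQL 8.0 or later, for example), LAG avoids the dependency on gap-free ids. The sketch below is an alternative, not the answerer's method. It runs on SQLite 3.25 or later through Python's sqlite3 module, and the names prev_name and collapsed are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demotable (id INT, name TEXT)")
conn.executemany(
    "INSERT INTO demotable VALUES (?, ?)",
    [(1, "john"), (2, "steve"), (3, "steve"), (4, "john"), (5, "steve")],
)

# LAG fetches the previous row's name in id order. The first row gets
# NULL and is kept explicitly, as are rows whose name differs from the
# row before them.
collapsed = conn.execute("""
    SELECT id, name
    FROM (
        SELECT id, name,
               LAG(name) OVER (ORDER BY id) AS prev_name
        FROM demotable
    ) AS t
    WHERE prev_name IS NULL OR name <> prev_name
    ORDER BY id
""").fetchall()

for row in collapsed:
    print(row)
```

On the sample table this keeps ids 1, 2, 4 and 5, dropping only the repeated steve at id 3.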