Using Pivot with non Numerical Data - sql

This is the first time I have ever tried to use PIVOT.
I am using Microsoft SQL Server.
My issue is this: I have been reading up on PIVOT and decided it would work well for a project that exports patient data to a formatted file (a report) that can be printed out.
VPatientPlusAllergyData is a view. Here is a sample of its output, with some of the columns cut out for ease of reading:
strPatientFullName   strAllergy   strAllergyMedication
------------------------------------------------------
Smith, John Henry    Dogs         Pounces
Smith, John Henry    Dogs         Orange Juice
Smith, John Henry    Mustard      Ketchup
Smith, John Henry    Mustard      Sugar
This is the result I want
strPatientFullName   strAllergy1   strAllergy1Medications   strAllergy2   strAllergy2Medications
------------------------------------------------------------------------------------------------
Smith, John Henry    Dogs          Pounces, Orange Juice    Mustard       Ketchup, Sugar
After reading W3Schools, watching a YouTube video and even reading some articles on this site, I'm wondering whether what I am trying to do is possible.
Below is a code snippet. I got stuck on what to put in the IN clause, which is when I started to question whether PIVOT is the answer to my particular problem.
GO
SELECT
strPatientFullName
,strStreetAddress
,strCity
,strState
,strZipcode
,strPrimaryPhoneNumber
,strSecondaryPhoneNumber
,blnSmoker
,decPackYears
,blnHeadOfHousehold
,dtmDateOfBirth
,strSex
,strAllergy
,strAllergyMedication
,strEmailAddress
,strRecordCreator
FROM ( SELECT * FROM VPatientPlusAllergyData ) PatientAllergyData
PIVOT
(
MAX(strAllergyMedication)
FOR strAllergy
IN ()
)
GO
I'm hoping someone more familiar with PIVOT can show me what I am missing, or point me to a more efficient solution.
Thanks for the help.
****** EDIT: I have decided that, while I would love to put this sort of operation on the server side, for my particular application it was simpler to create a number of views, run SELECT queries against them on the client side and concatenate the results there, and then implement an "EXPORT PROCESSING" screen.
I appreciate all the help. Maybe one day I will write a script and have it execute server side, but for the moment this works well enough. ******

Here's an example of how you could do something like this with STUFF and FOR XML PATH, conditional aggregation and dynamic SQL.
DECLARE @SQL NVARCHAR(MAX) = '';
-- Build one pair of output columns per allergy slot (1..N, where N is the
-- largest number of distinct allergies any single patient has).
SELECT @SQL += '
    , MAX(CASE WHEN RN = ' + RN + ' THEN strAllergy END) strAllergy' + RN + '
    , MAX(CASE WHEN RN = ' + RN + ' THEN strAllergyMedications END) strAllergyMedications' + RN
FROM (
    SELECT CAST(DENSE_RANK() OVER (PARTITION BY strPatientFullName ORDER BY strAllergy) AS VARCHAR(5)) RN
    FROM VPatientPlusAllergyData) T
GROUP BY RN;
-- Wrap the generated columns around a query that concatenates the medications
-- per patient and allergy and numbers each patient's allergies from 1.
SELECT @SQL = 'SELECT strPatientFullName' + @SQL + '
FROM (
    SELECT strPatientFullName
         , strAllergy
         , STUFF((SELECT '', '' + strAllergyMedication FROM VPatientPlusAllergyData WHERE strPatientFullName = T.strPatientFullName AND strAllergy = T.strAllergy FOR XML PATH ('''')), 1, 2, '''') strAllergyMedications
         , ROW_NUMBER() OVER (PARTITION BY strPatientFullName ORDER BY strAllergy) RN
    FROM VPatientPlusAllergyData T
    GROUP BY strPatientFullName, strAllergy) T
GROUP BY strPatientFullName;';
PRINT @SQL;
EXEC(@SQL);
As scsimon mentions in the comments, dynamic SQL may be necessary if the number of allergies is not known in advance. The FOR XML PATH subquery is one way of getting the comma-separated values into a single column, and STUFF strips the leading delimiter. The conditional aggregation works in the same way that a PIVOT normally would, but is far easier (IMO) to write and understand than a normal PIVOT statement.

So to get to what you want, you actually need the following techniques:
For strAllergyMedications, you need to concatenate rows into a delimited string.
Then, to turn your rows into columns, you need to PIVOT; but because you are pivoting two columns, you would have to PIVOT twice or use conditional aggregation.
The main trick to pulling it off is to prepare your table by doing the concatenation and coming up with a row number for each allergy. Here is an example using a common table expression (CTE), with STUFF() and a FOR XML PATH subquery to create the delimited string and DENSE_RANK() to create the row number.
DECLARE @VPatientPlusAllergyData AS TABLE (strPatientFullName VARCHAR(100), strAllergy VARCHAR(50), strAllergyMedication VARCHAR(100))
INSERT INTO @VPatientPlusAllergyData VALUES
 ('Smith, John Henry','Dogs','Pounces')
,('Smith, John Henry','Dogs','Orange Juice')
,('Smith, John Henry','Mustard','Ketchup')
,('Smith, John Henry','Mustard','Sugar')
;WITH cte AS (
    SELECT DISTINCT
        v1.strPatientFullName
        ,v1.strAllergy
        -- concatenate all medications for this patient and allergy
        ,strAllergyMedications = STUFF(
            (SELECT ', ' + v2.strAllergyMedication
             FROM
                @VPatientPlusAllergyData v2
             WHERE
                v1.strPatientFullName = v2.strPatientFullName
                AND v1.strAllergy = v2.strAllergy
             FOR XML PATH(''))
            ,1,2,'')
        -- number each patient's allergies 1, 2, 3, ...
        ,AllergyRowNum = DENSE_RANK() OVER (PARTITION BY v1.strPatientFullName ORDER BY v1.strAllergy)
    FROM
        @VPatientPlusAllergyData v1
)
SELECT
    strPatientFullName
    ,strAllergy1 = MAX(CASE WHEN AllergyRowNum = 1 THEN strAllergy END)
    ,strAllergy1Medications = MAX(CASE WHEN AllergyRowNum = 1 THEN strAllergyMedications END)
    ,strAllergy2 = MAX(CASE WHEN AllergyRowNum = 2 THEN strAllergy END)
    ,strAllergy2Medications = MAX(CASE WHEN AllergyRowNum = 2 THEN strAllergyMedications END)
FROM
    cte
GROUP BY
    strPatientFullName
And while I was preparing and posting this, @ZLK wrote a nice method to do it dynamically.
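If the server happens to be SQL Server 2017 or later (the question does not say which version is in use, so this is an assumption), the concatenation step could be written with STRING_AGG instead of STUFF and FOR XML PATH. A minimal sketch against the same sample rows; the table variable is only a stand-in for the view:

```sql
-- Assumes SQL Server 2017+ (STRING_AGG). @VPatientPlusAllergyData stands in for the view.
DECLARE @VPatientPlusAllergyData TABLE (
    strPatientFullName   VARCHAR(100),
    strAllergy           VARCHAR(50),
    strAllergyMedication VARCHAR(100));

INSERT INTO @VPatientPlusAllergyData VALUES
 ('Smith, John Henry', 'Dogs',    'Pounces')
,('Smith, John Henry', 'Dogs',    'Orange Juice')
,('Smith, John Henry', 'Mustard', 'Ketchup')
,('Smith, John Henry', 'Mustard', 'Sugar');

WITH cte AS (
    SELECT strPatientFullName
         , strAllergy
           -- one delimited string per patient and allergy
         , strAllergyMedications = STRING_AGG(strAllergyMedication, ', ')
                                   WITHIN GROUP (ORDER BY strAllergyMedication)
           -- window functions run after GROUP BY, so this numbers the grouped rows
         , AllergyRowNum = ROW_NUMBER() OVER (PARTITION BY strPatientFullName ORDER BY strAllergy)
    FROM @VPatientPlusAllergyData
    GROUP BY strPatientFullName, strAllergy
)
SELECT strPatientFullName
     , strAllergy1            = MAX(CASE WHEN AllergyRowNum = 1 THEN strAllergy END)
     , strAllergy1Medications = MAX(CASE WHEN AllergyRowNum = 1 THEN strAllergyMedications END)
     , strAllergy2            = MAX(CASE WHEN AllergyRowNum = 2 THEN strAllergy END)
     , strAllergy2Medications = MAX(CASE WHEN AllergyRowNum = 2 THEN strAllergyMedications END)
FROM cte
GROUP BY strPatientFullName;
```

Because of the WITHIN GROUP clause, the medications in each list come out in alphabetical order rather than in the order they appear in the view.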

Related

compare 2 text columns and show difference in the third cell using sql

I am trying to compare two columns and return only the difference between them. For example:
select * from table1
Column_1         column_2
---------------- ----------------------------
Swetha working   Swetha is working in Chennai
Raju 10th        Raju is studying 10th std
ranjith          Ranjith played yesterday
how to play      how to play Cricket
My name is       my name is john
Output:
Words from Column_1 should be removed even when they appear in the middle of column_2, as in rows 1 and 2:
Column_1         column_2                       column_3
---------------- ------------------------------ ----------------
Swetha working   Swetha is working in Chennai   is in Chennai
Raju 10th        Raju is studying 10th std      is studying std
ranjith          Ranjith played yesterday       played yesterday
how to play      how to play Cricket            Cricket
My name is       my name is john                john
This is much more complicated than your previous question. You can break the first column into words and then substitute them individually in the second column. To do that, though, you need a recursive CTE:
with words as (
select t.*, s.*,
max(s.seqnum) over (partition by t.id) as max_seqnum
from t cross apply
(select s.value as word,
row_number() over (order by (select null)) as seqnum
from string_split(col1, ' ') s
) s
),
cte as (
select id, col1, col2,
replace(' ' + col2 + ' ', ' ' + word + ' ', ' ') as result,
word, seqnum, max_seqnum
from words
where seqnum = 1
union all
select cte.id, cte.col1, cte.col2,
replace(cte.result, ' ' + w.word + ' ', ' '),
w.word, w.seqnum, cte.max_seqnum
from cte join
words w
on w.id = cte.id and w.seqnum = cte.seqnum + 1
)
select id, col1, col2, ltrim(rtrim(result)) as result
from cte
where max_seqnum = seqnum
order by id;
Here is a db<>fiddle.
I added an id so each row is uniquely defined. If your version of SQL Server doesn't have the built-in string_split() function, you can easily find a version that does the same thing.
One trick that this uses is for handling the first and last words in the second column. The code adds spaces at the beginning and end. That way, all words in the string are surrounded by spaces, making it easier to replace only complete words.
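To see why the padding matters, here is a small standalone sketch. The sentence and word are made-up values chosen only to show the difference between a plain REPLACE and the space-padded one:

```sql
-- Hypothetical values, chosen so that the word also occurs inside other words.
DECLARE @sentence VARCHAR(100) = 'this is his list';
DECLARE @word     VARCHAR(20)  = 'is';

SELECT
    -- plain REPLACE also eats the "is" inside "this", "his" and "list"
    REPLACE(@sentence, @word, '') AS naive_replace,
    -- padded REPLACE only matches "is" when it has a space on both sides
    LTRIM(RTRIM(REPLACE(' ' + @sentence + ' ', ' ' + @word + ' ', ' '))) AS whole_word_replace;
-- whole_word_replace returns: this his list
```

The padded form only removes the standalone word and leaves "this", "his" and "list" intact, which is what the recursive CTE relies on at each step.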
SQL Server 2016 and later have STRING_SPLIT. This approach pads each word split from Column_2 with a space on either side before locating it in the string.
Data
drop table if exists #strings;
go
create table #strings(
Id int,
Column_1 varchar(200),
Column_2 varchar(200));
go
insert #strings(Id, Column_1, Column_2) values
(1, 'Swetha', 'Swetha is working in Chennai'),
(2, 'Raju', 'Raju is studying 10 std'),
(3, 'Swetha working', 'Swetha is working in Chennai'),
(4, 'Raju 10th', 'Raju is studying 10th std');
Query
declare
    @add_delim char(1)=' ';
;with
c1_cte(split_str) as (
    select ltrim(rtrim(s.[value]))
    from
        #strings st
    cross apply
        string_split(st.Column_1, ' ') s),
c2_cte(Id, ndx, split_str) as (
    -- ndx is the position of the padded word in the padded string, used for ordering
    select Id, charindex(@add_delim + s.[value] + @add_delim, @add_delim + st.Column_2 + @add_delim), s.[value]
    from
        #strings st
    cross apply
        string_split(st.Column_2, ' ') s
    where
        -- the pattern contains two spaces: rows with consecutive spaces are skipped
        st.Column_2 not like '%  %')
select
    Id, stuff((select ' ' + c.split_str
               from c2_cte c
               where c.Id = c2.Id and not exists(select 1
                                                 from c1_cte c1
                                                 where c.split_str=c1.split_str)
               order by c.ndx FOR XML PATH('')), 1, 1, '') [new_str]
from c2_cte c2
group by Id;
Results
Id new_str
1 is in Chennai
2 is studying 10 std
3 is in Chennai
4 is studying std
Here is a solution using STRING_SPLIT, with FOR XML PATH for the concatenation, followed by a STRING_AGG variant for newer versions.
;WITH split_words
AS (
SELECT *
FROM dbo.Strings
CROSS APPLY (
SELECT VALUE
FROM STRING_SPLIT(column_2, ' ')
WHERE VALUE NOT IN (
SELECT VALUE
FROM STRING_SPLIT(column_1, ' ')
)
) a
)
SELECT *
,(
SELECT sw.VALUE + ' ' [text()]
FROM split_words sw
WHERE sw.Column_1 = s.Column_1
AND sw.Column_2 = s.Column_2
FOR XML PATH('')
,TYPE
).value('.', 'NVARCHAR(MAX)') [difference]
FROM dbo.Strings s
For SQL version 2017+ where STRING_AGG is supported
SELECT b.Column_1
,b.Column_2
,STRING_AGG(b.VALUE, ' ')
FROM (
SELECT *
FROM dbo.Strings
CROSS APPLY (
SELECT VALUE
FROM STRING_SPLIT(column_2, ' ')
WHERE VALUE NOT IN (
SELECT VALUE
FROM STRING_SPLIT(column_1, ' ')
)
) a
) b
GROUP BY b.Column_1
,b.Column_2
WITH
-- your input
input(column_1,column_2,column_3) AS (
SELECT 'Swetha working','Swetha is working in Chennai','is in Chennai'
UNION ALL SELECT 'Raju 10th','Raju is studying 10th std','is studying std'
UNION ALL SELECT 'ranjith','Rantith played yesterday','played yesterday'
UNION ALL SELECT 'how to play','how to play Cricket','Cricket'
UNION ALL SELECT 'My name is','my name is john','john'
)
,
-- need a series of integers
-- you can also try to play with the STRING_SPLIT() function
i(i) AS (
SELECT 1
UNION ALL SELECT 2
UNION ALL SELECT 3
UNION ALL SELECT 4
UNION ALL SELECT 5
)
,
-- you can also try to play with the STRING_SPLIT() function
unfound_tokens AS (
SELECT
i
, column_1
, column_2
, TOKEN(column_2,' ',i) AS token
FROM input CROSS JOIN i
WHERE TOKEN(column_2,' ',i) <> ''
AND CHARINDEX(
UPPER(TOKEN(column_2,' ',i))
, UPPER(column_1)
) = 0
)
SELECT
column_1
, column_2
, STRING_AGG(token ,' ') AS column_3
FROM unfound_tokens
GROUP BY
column_1
, column_2
-- out column_1 | column_2 | column_3
-- out ----------------+------------------------------+--------------------------
-- out My name is | my name is john | john
-- out Swetha working | Swetha is working in Chennai | is Chennai
-- out how to play | how to play Cricket | Cricket
-- out Raju 10th | Raju is studying 10th std | is studying std
-- out ranjith | Rantith played yesterday | Rantith played yesterday
I am not sure that the results will preserve the order of the words when STRING_SPLIT or STRING_AGG is used.
Look at this query, which gives a different ordering:
WITH
SS1 AS
(SELECT Id, SS.value AS COL1
FROM #strings
CROSS APPLY STRING_SPLIT(Column_1, ' ') AS SS
),
SS2 AS
(SELECT Id, SS.value AS COL2
FROM #strings
CROSS APPLY STRING_SPLIT(Column_2, ' ') AS SS
),
DIF AS
(
SELECT Id, COL2 AS COL
FROM SS2
EXCEPT
SELECT Id, COL1
FROM SS1
)
SELECT DIF.Id, Column_1, Column_2, STRING_AGG(COL, ' ')
FROM DIF
JOIN #strings AS S ON S.Id = DIF.Id
GROUP BY DIF.Id, Column_1, Column_2;
You would have to test with a very large amount of data to see whether the queries given so far suffer from side effects such as inconsistent ordering (I am fairly sure that no consistent order can be relied on, because of parallelism).
A reliable way to preserve the ordering is to write a recursive query that attaches a position index to each word in the sentence.
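On newer versions there is another way to carry the word position along. A minimal sketch, assuming SQL Server 2022 or Azure SQL Database, where STRING_SPLIT accepts a third enable_ordinal argument and returns an ordinal column; the table variable and its two rows are only sample data borrowed from the question:

```sql
-- Assumes SQL Server 2022+ / Azure SQL Database (STRING_SPLIT with enable_ordinal).
DECLARE @strings TABLE (Id INT, Column_1 VARCHAR(200), Column_2 VARCHAR(200));

INSERT INTO @strings VALUES
 (1, 'Swetha working', 'Swetha is working in Chennai'),
 (2, 'Raju 10th',      'Raju is studying 10th std');

SELECT s.Id
     , s.Column_1
     , s.Column_2
       -- put the remaining words back together in their original position order
     , STRING_AGG(w.value, ' ') WITHIN GROUP (ORDER BY w.ordinal) AS Column_3
FROM @strings AS s
CROSS APPLY STRING_SPLIT(s.Column_2, ' ', 1) AS w      -- 1 = also return the ordinal column
WHERE NOT EXISTS (SELECT 1
                  FROM STRING_SPLIT(s.Column_1, ' ') AS x
                  WHERE x.value = w.value)             -- drop words that occur in Column_1
GROUP BY s.Id, s.Column_1, s.Column_2;
-- Id 1: is in Chennai
-- Id 2: is studying std
```

The ORDER BY inside WITHIN GROUP is what makes the word order deterministic here; without it, STRING_AGG gives no ordering guarantee.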

SQL Server 2016 - Transpose column of integers to row by day

I need to transpose one of the columns in my data into a single delimited string, grouped by two other columns. My sample data consists of the following:
I need the result to look like this:
That is, all the LNs in one row per employee code, per day.
I tried the code below:
DECLARE @Process_Conditions_Loans VARCHAR(500)
SELECT
    t1.EmplCode,
    t1.LogDate,
    @Process_Conditions_Loans = CONCAT(COALESCE(@Process_Conditions_Loans + ',', ''),PS2)
FROM
    #temp t1
WHERE
    LN IS NOT NULL
GROUP BY
    EmplCode, LogDate
But I am getting an error
A SELECT statement that assigns a value to a variable must not be combined with data-retrieval operations.
I can not use group_concat since I am using SQL Server 2016.
Any help would be greatly appreciated.
Thanks,
JH
You can use the older form of string aggregation:
select emplcode, logdate,
stuff( (select concat(', ', ln)
from t
where t.emplcode = el.emplcode and t.logdate = el.logdate
order by ln
for xml path ('')
), 1, 2, ''
)
from (select distinct emplcode, logdate
from t
) el
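For something that can be run as-is, here is the same pattern against a table variable. The rows are invented, since the question's sample data did not survive, and the column types are guesses. The sketch also adds TYPE and .value(), a common variation that stops FOR XML PATH from turning characters such as & or < into XML entities; that only matters if the concatenated column can hold text rather than integers.

```sql
-- Hypothetical sample data; column names follow the question, types are assumed.
DECLARE @t TABLE (EmplCode VARCHAR(10), LogDate DATE, LN INT);

INSERT INTO @t VALUES
 ('E01', '2020-01-01', 1001),
 ('E01', '2020-01-01', 1002),
 ('E01', '2020-01-02', 1003),
 ('E02', '2020-01-01', 2001);

SELECT el.EmplCode
     , el.LogDate
     , STUFF((SELECT ', ' + CAST(t.LN AS VARCHAR(20))
              FROM @t AS t
              WHERE t.EmplCode = el.EmplCode
                AND t.LogDate  = el.LogDate
              ORDER BY t.LN
              FOR XML PATH(''), TYPE          -- TYPE returns xml instead of an encoded string
             ).value('.', 'VARCHAR(MAX)')     -- .value() reads it back as plain text
             , 1, 2, '') AS LNs               -- STUFF removes the leading ', '
FROM (SELECT DISTINCT EmplCode, LogDate
      FROM @t
      WHERE LN IS NOT NULL) AS el
ORDER BY el.EmplCode, el.LogDate;
-- E01, 2020-01-01: 1001, 1002
-- E01, 2020-01-02: 1003
-- E02, 2020-01-01: 2001
```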

How to combine return results of query in one row

I have a table that stores personnel codes.
When I select from this table I get three rows, such as:
2129,3394,3508,3534
2129,3508
4056
I want the SELECT to combine the results into one row, such as:
2129,3394,3508,3534,2129,3508,4056
or with only the distinct values, such as:
2129,3394,3508,3534,4056
You should ideally avoid storing CSV data at all in your tables. That being said, for your first result set we can try using STRING_AGG:
SELECT STRING_AGG(col, ',') AS output
FROM yourTable;
Your second requirement is trickier. We can split the CSV values into rows, remove the duplicates, and then aggregate them again:
WITH cte AS (
SELECT DISTINCT VALUE AS col
FROM yourTable t
CROSS APPLY STRING_SPLIT(t.col, ',')
)
SELECT STRING_AGG(col, ',') WITHIN GROUP (ORDER BY CAST(col AS INT)) AS output
FROM cte;
I solved this by using STUFF and FOR XML PATH:
SELECT
STUFF((SELECT ',' + US.remain_uncompleted
FROM Table_request US
WHERE exclusive = 0 AND reqact = 1 AND reqend = 0
FOR XML PATH('')), 1, 1, '')
Thank you Tim

Count the number of not null columns using a case statement

I need some help with my query...I am trying to get a count of names in each house, all the col#'s are names.
Query:
SELECT House#,
COUNT(CASE WHEN col#1 IS NOT NULL THEN 1 ELSE 0 END) +
COUNT(CASE WHEN col#2 IS NOT NULL THEN 1 ELSE 0 END) +
COUNT(CASE WHEN col#3 IS NOT NULL THEN 1 ELSE 0 END) as count
FROM myDB
WHERE House# in (house#1,house#2,house#3)
GROUP BY House#
Desired results:
house 1 - the count is 3 /
house 2 - the count is 2 /
house 3 - the count is 1
...with my current query the results for count would be just 3's
In this case, it seems that counting names is the same as counting the commas (,) plus one:
SELECT House_Name,
LEN(Names) - LEN(REPLACE(Names,',','')) + 1 as Names
FROM dbo.YourTable;
Another option since Lamak stole my thunder, would be to split it and normalize your data, and then aggregate. This uses a common split function but you could use anything, including STRING_SPLIT for SQL Server 2016+ or your own...
declare @table table (house varchar(16), names varchar(256))
insert into @table
values
('house 1','peter, paul, mary'),
('house 2','sarah, sally'),
('house 3','joe')
select
    t.house
    ,NumberOfNames = count(s.Item)
from
    @table t
    cross apply dbo.DelimitedSplit8K(names,',') s
group by
    t.house
Notice how the answers you are getting are quite complex for what they're doing? That's because relational databases are not designed to store data that way.
On the other hand, if you change your data structure to something like this:
house name
1 peter
1 paul
1 mary
2 sarah
2 sally
3 joe
The query now is:
select house, count(name)
from housenames
group by house
So my recommendation is to do that: use a design that's more suitable for SQL Server to work with, and your queries become simpler and more efficient.
One dirty trick is to replace commas with empty strings and compare the lengths:
SELECT house +
' has ' +
CAST((LEN(names) - LEN(REPLACE(names, ',', '')) + 1) AS VARCHAR) +
' names'
FROM mytable
You can parse using xml and find count as below:
Select *, a.xm.value('count(/x)','int') from (
Select *, xm = CAST('<x>' + REPLACE((SELECT REPLACE(names,', ','$$$SSText$$$') AS [*] FOR XML PATH('')),'$$$SSText$$$','</x><x>')+ '</x>' AS XML) from #housedata
) a
select House, 'has '+cast((LEN(Names)-LEN(REPLACE(Names, ',', ''))+1) as varchar)+' names'
from TempTable
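If the names really are in separate columns, as the query in the question suggests, the reason every house shows 3 is that COUNT counts every non-NULL value, and the ELSE 0 branch is non-NULL too. A minimal sketch of that case; the table variable and its column names (Name1 to Name3) are made up for illustration:

```sql
-- Hypothetical layout: one row per house, one nullable column per name.
DECLARE @Houses TABLE (House VARCHAR(16), Name1 VARCHAR(50), Name2 VARCHAR(50), Name3 VARCHAR(50));

INSERT INTO @Houses VALUES
 ('house 1', 'peter', 'paul',  'mary'),
 ('house 2', 'sarah', 'sally', NULL),
 ('house 3', 'joe',   NULL,    NULL);

SELECT House
       -- as in the question: ELSE 0 is non-NULL, so each COUNT is 1 per row and the total is always 3
     , COUNT(CASE WHEN Name1 IS NOT NULL THEN 1 ELSE 0 END)
     + COUNT(CASE WHEN Name2 IS NOT NULL THEN 1 ELSE 0 END)
     + COUNT(CASE WHEN Name3 IS NOT NULL THEN 1 ELSE 0 END) AS BrokenCount
       -- COUNT(column) already skips NULLs, so no CASE is needed
     , COUNT(Name1) + COUNT(Name2) + COUNT(Name3) AS NameCount
FROM @Houses
GROUP BY House;
-- house 1: BrokenCount 3, NameCount 3
-- house 2: BrokenCount 3, NameCount 2
-- house 3: BrokenCount 3, NameCount 1
```

Keeping the CASE but dropping the ELSE 0, or switching COUNT to SUM, would give the same corrected numbers.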

Dynamic SubSelects in SQL Select Statement

I am querying a table for some basic information, file number, case type, status, etc. In addition I need a column for every single one of 138 case status types that will display the date the case had that status. Here is a sample:
SELECT FileNum,
CaseType,
CurrentCaseStatus,
(SELECT TOP 1 EventDt FROM caseStatusHistory WHERE CaseID = c.caseID AND CaseStatus = 'CS001' ORDER BY EventDt DESC) AS [Charge - Phone],
(SELECT TOP 1 EventDt FROM caseStatusHistory WHERE CaseID = c.caseID AND CaseStatus = 'CS002' ORDER BY EventDt DESC) AS [Charge - Written],
-- 136 more just like the line above
FROM Case c
I can query another table for all the case status types:
SELECT Code, Description
FROM caseStatus
WHERE Code BETWEEN 'CS001' AND 'CS138'
ORDER BY Code
How can I dynamically create each of those columns instead of having to manually write 138 select statements?
That's going to be terribly slow -- 138 correlated subqueries. I think you can achieve the same result with an OUTER JOIN and a GROUP BY with MAX and CASE:
select c.filenum,
       c.casetype,
       c.currentcasestatus,
       max(case when csh.CaseStatus = 'CS001' then csh.EventDt end) as [Charge - Phone],
       max(case when csh.CaseStatus = 'CS002' then csh.EventDt end) as [Charge - Written]
from [Case] c   -- CASE is a reserved word, so the table name needs brackets
left join casestatushistory csh on c.caseid = csh.caseid
group by c.filenum,
         c.casetype,
         c.currentcasestatus
BTW, I would suggest just writing the statement out -- it won't take that long and it will outperform a dynamic SQL approach. I'm not completely sure how you'd get your column names with dynamic SQL either, unless Phone and Written are in another column.
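If the column names can be taken from caseStatus.Description (the question's second query suggests Code and Description live there, but that mapping is an assumption), the list of MAX(CASE ...) expressions could be generated rather than typed. A rough sketch, not a tested implementation; @cols and @sql are names introduced here:

```sql
-- Assumes caseStatus.Description holds the wanted column headings.
DECLARE @cols NVARCHAR(MAX), @sql NVARCHAR(MAX);

-- One ", MAX(CASE ...) AS [Description]" fragment per status code.
SELECT @cols = STUFF((
        SELECT N', MAX(CASE WHEN csh.CaseStatus = ''' + REPLACE(cs.Code, '''', '''''')
             + N''' THEN csh.EventDt END) AS ' + QUOTENAME(cs.Description)
        FROM caseStatus AS cs
        WHERE cs.Code BETWEEN 'CS001' AND 'CS138'
        ORDER BY cs.Code
        FOR XML PATH(''), TYPE
    ).value('.', 'NVARCHAR(MAX)'), 1, 2, N'');   -- STUFF drops the leading ', '

SET @sql = N'SELECT c.FileNum, c.CaseType, c.CurrentCaseStatus, ' + @cols + N'
FROM [Case] c
LEFT JOIN caseStatusHistory csh ON csh.CaseID = c.CaseID
GROUP BY c.FileNum, c.CaseType, c.CurrentCaseStatus;';

PRINT @sql;                     -- inspect the generated statement before running it
EXEC sys.sp_executesql @sql;
```

QUOTENAME wraps each description in brackets and escapes any closing bracket inside it, so headings such as "Charge - Phone" are safe to use as column aliases.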
Try using a PIVOT. The SQL below should work -
--Select the pivot data into a temp table
SELECT c.caseID,
c.FileNum,
c.CaseType,
c.CurrentCaseStatus,
csh.EventDt,
cs.Description
INTO #StatusDates
FROM [Case] c
LEFT JOIN caseStatusHistory csh
ON csh.caseID = c.caseID
LEFT JOIN caseStatus cs
ON cs.Code = csh.CaseStatus
--From the pivot data, get the list of field names (assumes description field is the source for the field name)
DECLARE @statusDescriptions VARCHAR(MAX)
SET @statusDescriptions = ''
SELECT @statusDescriptions = COALESCE(@statusDescriptions+'[','') + Description
FROM (
    SELECT DISTINCT Description
    FROM #StatusDates
    WHERE Description IS NOT NULL
) x
SET @statusDescriptions = REPLACE(@statusDescriptions, '[', '],[') + ']'
SET @statusDescriptions = SUBSTRING(@statusDescriptions, 3, LEN(@statusDescriptions))
--Create a SQL statement to pivot the data into the fields.
DECLARE @sql VARCHAR(MAX)
SET @sql = '
SELECT *
FROM #StatusDates
PIVOT(MIN(EventDt)
FOR Description IN (' + @statusDescriptions + '))
AS PVTTable '
PRINT @sql
EXEC(@sql)
DROP TABLE #StatusDates