Transpose rows from split_string into columns - sql

I'm stuck trying to transpose a set of rows into a table. In my stored procedure, I take a delimited string as input, and need to transpose it.
SELECT *
FROM string_split('123,4,1,0,0,5|324,2,0,0,0,4','|')
CROSS APPLY string_split(value,',')
From which I receive:
value value
123,4,1,0,0,5 123
123,4,1,0,0,5 4
123,4,1,0,0,5 1
123,4,1,0,0,5 0
123,4,1,0,0,5 0
123,4,1,0,0,5 5
324,2,0,0,0,4 324
324,2,0,0,0,4 2
324,2,0,0,0,4 0
324,2,0,0,0,4 0
324,2,0,0,0,4 0
324,2,0,0,0,4 4
The values delimited by | are client details. And within each client, there are six attributes, delimited by ,. I would like an output table of:
ClientId ClientTypeId AttrA AttrB AttrC AttrD
------------------------------------------------
123 4 0 0 0 5
324 2 0 0 0 4
What's the best way to go about this? I've been looking at PIVOT but can't make it work because it seems like I need row numbers, at least.

This answer assumes that row number function will "follow the order" of the string. If it does not you will need to write your own split that includes a row number in the resulting table. (This is asked on the official documentation page but there is no official answer given).
SELECT
MAX(CASE WHEN col = 1 THEN item ELSE null END) as ClientId,
MAX(CASE WHEN col = 2 THEN item ELSE null END) as ClientTypeId,
MAX(CASE WHEN col = 3 THEN item ELSE null END) as AttrA,
MAX(CASE WHEN col = 4 THEN item ELSE null END) as AttrB,
MAX(CASE WHEN col = 5 THEN item ELSE null END) as AttrC,
MAX(CASE WHEN col = 6 THEN item ELSE null END) as AttrD
FROM (
SELECT A.value as org, B.value as item,
ROW_NUMBER() OVER (partition by A.value) as col
FROM string_split('123,4,1,0,0,5|324,2,0,0,0,4','|') as A
CROSS APPLY string_split(A.value,',') as B
) X
GROUP BY org
You might get a message about nulls in aggregate function ignored. (I always forget which platforms care and which don't.) If you do you can replace the null with 0.
Note, this is not as fast and using a CTE to find the 5 commas in the string with CHARINDEX and then using SUBSTRING to extract the values. But I'm to lazy to write up that solution which I would need to test to get all the off by 1 issues right. Still, I suggest you do it that way if you have a big data set.

I know you already got it pretty much answered, but here you can find a PIVOT solution
select [ClientID],[ClientTypeId],[AttrA],[AttrB],[AttrC],[AttrD]
FROM
(
select case when ColumnRow = 1 then 'ClientID'
when ColumnRow = 2 then 'ClientTypeId'
when ColumnRow = 3 then 'AttrA'
when ColumnRow = 4 then 'AttrB'
when ColumnRow = 5 then 'AttrC'
when ColumnRow = 6 then 'AttrD' else null end as
ColumnRow,t.value,ColumnID from (
select ColumnID,z.value as stringsplit,b.value, cast(Row_number()
over(partition by z.value order by z.value) as
varchar(50)) as ColumnRow from (SELECT cast(Row_number() over(order by
a.value) as
varchar(50)) as ColumnID,
a.value
FROM string_split('123,4,1,0,0,5|324,2,0,0,0,4','|') a
)z
CROSS APPLY string_split(value,',') b
)t
) AS SOURCETABLE
PIVOT
(
MAX(value)
FOR ColumnRow IN ([ClientID],[ClientTypeId],[AttrA],[AttrB],[AttrC],
[AttrD])
)
AS
PivotTable

Related

Trying to combine multiples of a key ID into single row, but with different values in columns

TSQL - SQL Sever
I'm building a report to very specific requirements. I'm trying to combine multiples of a key ID into single rows, but there's different values in some of the columns, so GROUP BY won't work.
SELECT count(tt.Person_ID) as CandCount, tt.Person_ID,
CASE e.EthnicSuperCategoryID WHEN CandCount > 1 THEN 10 ELSE e.EthnicSuperCategoryID END as EthnicSuperCategoryID,
CASE e.Ethnicity_Id WHEN 1 THEN 1 ELSE 0 END as Black ,
CASE e.Ethnicity_Id WHEN 2 THEN 1 ELSE 0 END as White ,
CASE e.Ethnicity_Id WHEN 3 THEN 1 ELSE 0 END as Asian,
etc
FROM T_1 TT
JOINS
WHERE
GROUP
Msg 102, Level 15, State 1, Line 4
Incorrect syntax near '>'.
Here's the results (without the first CASE). Note person 3 stated multiple ethnicities.
SELECT count(tt.Person_ID) as CandCount, tt.Person_ID,
CASE e.Ethnicity_Id WHEN 1 THEN 1 ELSE 0 END as Black ,
CASE e.Ethnicity_Id WHEN 2 THEN 1 ELSE 0 END as White ,
CASE e.Ethnicity_Id WHEN 3 THEN 1 ELSE 0 END as Asian,
etc
FROM T_1 TT
JOINS
WHERE
GROUP
That’s expected, but the goal would be to assign multiple ethnicities to Ethnicity_Id of 10 (multiple). I also want them grouped on a single line.
So the end result would look like this:
So my issue is two fold. If the candidate has more than 2 ethnicities, assign the records to Ethnicity_Id of 10. I also need duplicated person IDs grouped into a single row, while displaying all of the results of the columns.
This should bring your desired result:
SELECT Person_ID
, ISNULL(ID_Dummy,Ethnicity_ID) Ethnicity_ID
, MAX(Black) Black
, MAX(White) White
, MAX(Asian) Asian
FROM #T T
OUTER APPLY(SELECT MAX(10) FROM #T T2
WHERE T2.Person_ID = T.Person_ID
AND T2.Ethnicity_ID <> T.Ethnicity_ID
)EthnicityOverride(ID_Dummy)
GROUP BY Person_ID, ISNULL(ID_Dummy,Ethnicity_ID)
You want conditional aggregation. Your query is incomplete, but the idea is:
select
person_id,
sum(case ethnicity_id = 1 then 1 else 0 end) as black,
sum(case ethnicity_id = 2 then 1 else 0 end) as white,
sum(case ethnicity_id = 3 then 1 else 0 end) as asian
from ...
where ...
group by person_id
You might want max() instead of sum(). Also I did not get the logic for column the second column in the desired results - maybe that's just count(*).
This would be my approach
SELECT
person_id,
CASE WHEN flag = 1 THEN Ethnicity_Id ELSE 10 END AS Ethnicity_Id,
[1] as black,
[2] as white,
[3] as asian
FROM
(
SELECT
person_id,
Ethnicity_Id as columns,
1 as n,
MAX(Ethnicity_Id) over(PARTITION BY person_id) as Ethnicity_Id,
COUNT(Ethnicity_Id) over(PARTITION BY person_id) as flag
FROM
#example
) AS SourceTable
PIVOT
(
MAX(n) FOR columns IN ([1], [2], [3])
) AS PivotTable;
Pivot the Ethnicity_Id column into multiples columns, Using constant
1 to make it complain with your expected result.
Using Max(Ethnicity_Id) with Partition By to get the original
Ethnicity_Id
Using Count(Ethnicity_Id) to flag if a need to raplace Ethnicity_Id
with 10 bc there is more that 1 row for that person_id
If you need to add more Ethnicitys add the ids in ... IN ([1], [2], [3]) ... and in the select

Create a query that counts instances in a related table

I have re-written this query about 20 times today and I keep getting close but no dice... I'm sure this is easy-peasy for y'all, but my SQL (Oracle) is pretty rusty.
Here's what I need:
PersonID Count1 Count2 Count3 Count4
1 0 0 2 1
2 1 1 1 0
3 1 1 1 2
Data is coming from several sources. I have a table People, and a table Values. People can have any number of values in that table.
PersonID Item Value
1 Check1 3
1 Check2 3
1 Check3 4
2 Check4 2
2 Check5 3
2 Check6 1
.. etc
So the query would, for each PersonID, count how many times the particular Value appears. The values are always 1, 2, 3, or 4. I tried to do 4 subqueries, but it wouldn't read the PersonID from the main query and just returned the count of all instances of value=1.
I was then thinking do a Group_By ... I don't know. Any help is appreciated!
ETA: I've deleted & re-written the query many times in many ways and unfortunately did not save any intermediate attempts. I didn't include it originally because I was in the middle of rearranging it again, and it's not runnable as-is. But here it is as it stands now:
/*sources are the tested requirements
values are the scores people received on the tested sources
people are those who were tested on the requirements */
WITH sub_query4 (
SELECT values.personid,
count (values.ID) as count4 --how many 4s
FROM values
INNER JOIN sources ON values.valueid = sources.sourceid
INNER JOIN people ON people.personid = values.personid
WHERE values.yearid = 2017
AND values.quarter = 'Q1'
AND instr (sources.identifier, 'TESTBANK.01', 1 ,1) > 0
AND values.value = '4'
GROUP_BY people.personid
)
SELECT p.first_name,
p.last_name,
p.position,
p.email,
p.locationid,
sub_query4.count4 as count4 --eventually this would repeat for 1, 2, & 3
FROM people p
WHERE p.locationid=406
AND p.position in (9,10);
values is a bad name for a table because it is a SQL keyword.
In any case, conditional aggregation should work:
select personid,
sum(case when value = 1 then 1 else 0 end) as cnt_1,
sum(case when value = 2 then 1 else 0 end) as cnt_2,
sum(case when value = 3 then 1 else 0 end) as cnt_3,
sum(case when value = 4 then 1 else 0 end) as cnt_4
from values
group by personid;
I prefer to use PIVOT for this. Here is Example SQL Fiddle
SELECT "PersonID", val1,val2,val3,val4 FROM
(
SELECT "PersonID", "Value" from VALS
)
PIVOT
(
count("Value")
FOR "Value" IN (1 as val1, 2 as val2, 3 as val3, 4 as val4)
);

Conditional SUM with SELECT statement

I like to sum values in a table based on a condition taken from the same table called. The structure of the table as per below. The table is called Data
Data
Type Value
1 5
1 10
1 15
1 25
1 15
1 20
1 5
2 10
3 5
If the Value of Type 2 is larger than the Value of Type 3 then I like to subtract the Value of Type 2 from the sum of all the Values in the table. I'm not sure how to write the IF statements using Values looked up in the table. I have tried below but it doesn't work.
SELECT SUM(Value)-IF(SELECT Value FROM Data WHERE Type=2>SELECT Value
FROM Data WHERE Type=3 THEN SELECT Value FROM Data
WHERE Type=2 ELSE SELECT Value FROM Data WHERE Type=3) FROM Data
or
SELECT SUM(d.Value)-IIF(a.type2>b.type3, a.type2, b.type3)
FROM Data d, (SELECT Value AS type2 FROM Data WHERE Type=2) a,
(SELECT Value AS type3 FROM Data WHERE Type=3) b
If I follow your logic correctly, then this would seem to do what you want:
select d.value - (case when d2.value > d3.value then d2.value else 0 end)
from data d cross join
(select value from data where type = 2) d2 cross join
(select value from data where type = 3) d3 ;
EDIT:
If you want just one number, then use conditional aggregation:
select sum(value) -
(case when sum(case when type = 2 then value else 0 end) >
sum(case when type = 3 then value else 0 end)
then sum(case when type = 2 then value else 0 end)
else 0
end)
from data;
Thanks for pointing me in the right direction. This is what I came up with in the end. It is a little bit different to the reply above since I'm using MS Access
SELECT SUM(Value)-IIf(SUM(IIf(Type=2, Value, 0)>SUM(IIf(Type=3, Value, 0), SUM(IIf(Type=2, Value, 0), SUM(IIf(Type=3, Value, 0) FROM Data
It is them same as the second suggestion above but adapted to MS Access SQL.

Counting if data exists in a row

Hey guys I have the below sample data which i want to query for.
MemberID AGEQ1 AGEQ2 AGEQ2
-----------------------------------------------------------------
1217 2 null null
58458 3 2 null
58459 null null null
58457 null 5 null
299576 6 5 7
What i need to do is to lookup the table and if any AGEx COLUMN contains any data then it counts the number of times there is data for that row in each column
Results example:
for memberID 1217 the count would be 1
for memberID 58458 the count would be 2
for memberID 58459 the count would be 0 or null
for memberID 58457 the count would be 1
for memberID 299576 the count would be 3
This is how it should look like in SQL if i query the entire table
1 Children - 2
2 Children - 1
3 Children - 1
0 Children - 1
So far i have been doing it using the following query which isnt very efficient and does give incorrect tallies as there are multiple combinations that people can answer the AGE question. Also i have to write multiple queries and change the is null to is not null depending on how many children i am looking to count a person has
select COUNT (*) as '1 Children' from Member
where AGEQ1 is not null
and AGEQ2 is null
and AGEQ3 is null
The above query only gives me an answer of 1 but i want to be able to count the other columns for data as well
Hope this is nice and clear and thank you in advance
If all of the columns are integers, you can take advantage of integer math - dividing the column by itself will yield 1, unless the value is NULL, in which case COALESCE can convert the resulting NULL to 0.
SELECT
MemberID,
COALESCE(AGEQ1 / AGEQ1, 0)
+ COALESCE(AGEQ2 / AGEQ2, 0)
+ COALESCE(AGEQ3 / AGEQ3, 0)
+ COALESCE(AGEQ4 / AGEQ4, 0)
+ COALESCE(AGEQ5 / AGEQ5, 0)
+ COALESCE(AGEQ6 / AGEQ6, 0)
FROM dbo.table_name;
To get the number of people with each count of children, then:
;WITH y(y) AS
(
SELECT TOP (7) rn = ROW_NUMBER() OVER
(ORDER BY [object_id]) - 1 FROM sys.objects
),
x AS
(
SELECT
MemberID,
x = COALESCE(AGEQ1 / AGEQ1, 0)
+ COALESCE(AGEQ2 / AGEQ2, 0)
+ COALESCE(AGEQ3 / AGEQ3, 0)
+ COALESCE(AGEQ4 / AGEQ4, 0)
+ COALESCE(AGEQ5 / AGEQ5, 0)
+ COALESCE(AGEQ6 / AGEQ6, 0)
FROM dbo.table_name
)
SELECT
NumberOfChildren = y.y,
NumberOfPeopleWithThatMany = COUNT(x.x)
FROM y LEFT OUTER JOIN x ON y.y = x.x
GROUP BY y.y ORDER BY y.y;
I'd look at using UNPIVOT. That will make your wide column into rows. Since you don't care about what value was in a column, just the presence/absence of value, this will generate a row per not-null column.
The trick then becomes mashing that into the desired output format. It could probably have been done cleaner but I'm a fan of "showing my work" so that others can conform it to their needs.
SQLFiddle
-- Using the above logic
WITH HadAges AS
(
-- Find everyone and determine number of rows
SELECT
UP.MemberID
, count(1) AS rc
FROM
dbo.Member AS M
UNPIVOT
(
ColumnValue for ColumnName in (AGEQ1, AGEQ2, AGEQ3)
) AS UP
GROUP BY
UP.MemberID
)
, NoAge AS
(
-- Account for those that didn't show up
SELECT M.MemberID
FROM
dbo.Member AS M
EXCEPT
SELECT
H.MemberID
FROM
HadAges AS H
)
, NUMBERS AS
(
-- Allowable range is 1-6
SELECT TOP 6
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS TheCount
FROM
sys.all_columns AS SC
)
, COMBINATION AS
(
-- Link those with rows to their count
SELECT
N.TheCount AS ChildCount
, H.MemberID
FROM
NUMBERS AS N
LEFT OUTER JOIN
HadAges AS H
ON H.rc = N.TheCount
UNION ALL
-- Deal with the unlinked
SELECT
0
, NA.MemberID
FROM
NoAge AS NA
)
SELECT
C.ChildCount
, COUNT(C.MemberID) AS Instances
FROM
COMBINATION AS C
GROUP BY
C.ChildCount;
Try this:
select id, a+b+c+d+e+f
from ( select id,
case when age1 is null then 0 else 1 end a,
case when age2 is null then 0 else 1 end b,
case when age3 is null then 0 else 1 end c,
case when age4 is null then 0 else 1 end d,
case when age5 is null then 0 else 1 end e,
case when age6 is null then 0 else 1 end f
from ages
) as t
See here in fiddle http://sqlfiddle.com/#!3/88020/1
To get the quantity of persons with childs
select childs, count(*) as ct
from (
select id, a+b+c+d+e+f childs
from
(
select
id,
case when age1 is null then 0 else 1 end a,
case when age2 is null then 0 else 1 end b,
case when age3 is null then 0 else 1 end c,
case when age4 is null then 0 else 1 end d,
case when age5 is null then 0 else 1 end e,
case when age6 is null then 0 else 1 end f
from ages ) as t
) ct
group by childs
order by 1
See it here at fiddle http://sqlfiddle.com/#!3/88020/24

SQL (TSQL) - Select values in a column where another column is not null?

I will keep this simple- I would like to know if there is a good way to select all the values in a column when it never has a null in another column. For example.
A B
----- -----
1 7
2 7
NULL 7
4 9
1 9
2 9
From the above set I would just want 9 from B and not 7 because 7 has a NULL in A. Obviously I could wrap this as a subquery and USE the IN clause etc. but this is already part of a pretty unique set and am looking to keep this efficient.
I should note that for my purposes this would only be a one-way comparison... I would only be returning values in B and examining A.
I imagine there is an easy way to do this that I am missing, but being in the thick of things I don't see it right now.
You can do something like this:
select *
from t
where t.b not in (select b from t where a is null);
If you want only distinct b values, then you can do:
select b
from t
group by b
having sum(case when a is null then 1 else 0 end) = 0;
And, finally, you could use window functions:
select a, b
from (select t.*,
sum(case when a is null then 1 else 0 end) over (partition by b) as NullCnt
from t
) t
where NullCnt = 0;
The query below will only output one column in the final result. The records are grouped by column B and test if the record is null or not. When the record is null, the value for the group will increment each time by 1. The HAVING clause filters only the group which has a value of 0.
SELECT B
FROM TableName
GROUP BY B
HAVING SUM(CASE WHEN A IS NULL THEN 1 ELSE 0 END) = 0
If you want to get all the rows from the records, you can use join.
SELECT a.*
FROM TableName a
INNER JOIN
(
SELECT B
FROM TableName
GROUP BY B
HAVING SUM(CASE WHEN A IS NULL THEN 1 ELSE 0 END) = 0
) b ON a.b = b.b