Oracle: pivot a table based on column value using LIKE

sample table
Child Parent
FF00001 12345
AA00002 12345
GG00003 12345
TT00003 12345
What I want is a table like this
Parent FF AA GG TT
12345 FF00001 AA00002 GG00003 TT00003
The numbers after the first 2 letters can be anything but I know that they are always AA, FF, GG, TT etc. Can I pivot on a like statement? like 'AA%'

You can use aggregation like this:
select
parent,
max(case when child like 'FF%' then child end) FF,
max(case when child like 'AA%' then child end) AA,
max(case when child like 'GG%' then child end) GG,
max(case when child like 'TT%' then child end) TT
from your_table
group by parent;
Another way is to extract the prefix in a subquery using substr and then apply pivot on it:
select *
from (
select
child,
parent,
substr(child, 1, 2) prefix
from your_table
)
pivot (
max(child) for prefix in ('FF' as FF,'AA' as AA,'GG' as GG,'TT' as TT)
)
Both produce:
PARENT FF AA GG TT
---------------------------------------
12345 FF00001 AA00002 GG00003 TT00003
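The conditional-aggregation variant is easy to try outside Oracle. Below is a minimal sketch that runs the same kind of query against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in here (an assumption made for the sake of a runnable example); it has no PIVOT clause, so only the MAX(CASE ... LIKE ...) form is shown.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the Oracle table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (child TEXT, parent INTEGER)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [("FF00001", 12345), ("AA00002", 12345),
     ("GG00003", 12345), ("TT00003", 12345)],
)

# One MAX(CASE ...) per known prefix; rows that do not match yield NULL,
# which MAX ignores, so each output column keeps the matching child.
pivot_sql = """
SELECT parent,
       MAX(CASE WHEN child LIKE 'FF%' THEN child END) AS FF,
       MAX(CASE WHEN child LIKE 'AA%' THEN child END) AS AA,
       MAX(CASE WHEN child LIKE 'GG%' THEN child END) AS GG,
       MAX(CASE WHEN child LIKE 'TT%' THEN child END) AS TT
FROM your_table
GROUP BY parent
"""
pivot_rows = conn.execute(pivot_sql).fetchall()
print(pivot_rows)
```

The list of prefixes has to be known up front, which matches the situation described in the question.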
If you have multiple values with the same prefix and you want to keep them all, use the row_number() window function in the subquery and then apply pivot:
with your_table (Child , Parent) as (
select 'FF00001', 12345 from dual union all
select 'FF00002', 12345 from dual union all
select 'AA00002', 12345 from dual union all
select 'GG00003', 12345 from dual union all
select 'TT00003', 12345 from dual
)
-- test data setup ends. See the solution below --
select parent, FF, AA, GG, TT
from (
select
child,
parent,
substr(child, 1, 2) prefix,
row_number() over (partition by parent, substr(child, 1, 2) order by child) rn
from your_table
)
pivot (
max(child) for prefix in ('FF' as FF,'AA' as AA,'GG' as GG,'TT' as TT)
)
Produces:
PARENT FF AA GG TT
---------------------------------------
12345 FF00001 AA00002 GG00003 TT00003
12345 FF00002 - - -
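To see why the row number keeps duplicates apart, here is a small sketch of the same idea in SQLite via Python's sqlite3 (a stand-in chosen so the example runs anywhere; window functions need SQLite 3.25 or newer). Since SQLite lacks PIVOT, the pivot step is written as conditional aggregation grouped by parent and the row number; the alias rn is my own naming.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (child TEXT, parent INTEGER)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [("FF00001", 12345), ("FF00002", 12345), ("AA00002", 12345),
     ("GG00003", 12345), ("TT00003", 12345)],
)

# rn numbers the children inside each (parent, prefix) bucket, so the
# second FF value lands in its own output row instead of being lost to MAX.
numbered_sql = """
SELECT parent,
       MAX(CASE WHEN prefix = 'FF' THEN child END) AS FF,
       MAX(CASE WHEN prefix = 'AA' THEN child END) AS AA,
       MAX(CASE WHEN prefix = 'GG' THEN child END) AS GG,
       MAX(CASE WHEN prefix = 'TT' THEN child END) AS TT
FROM (
    SELECT child,
           parent,
           substr(child, 1, 2) AS prefix,
           ROW_NUMBER() OVER (
               PARTITION BY parent, substr(child, 1, 2)
               ORDER BY child
           ) AS rn
    FROM your_table
)
GROUP BY parent, rn
ORDER BY parent, rn
"""
numbered_rows = conn.execute(numbered_sql).fetchall()
for row in numbered_rows:
    print(row)
```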

Related

SQL Server Order first by ParentID, then Child

I'm currently dealing with a database my company is phasing out, and we're trying to build a quick and dirty interface so that people can easily extract some data. A major problem with this database however, is that the primary assets are all recorded in one large table in order of when they were created, not how they relate to one another.
The gist of the database is shown below:
ParentAssetID ChildAssetID AssetName
------------------------------------
84 2 abc
35 1 cdf
956 35 PARENT35
84 1 ghi
956 3 PARENT3
35 3 jkl
956 84 PARENT84
3 5 mno
I would like to, using a select statement, output this ordered in such a way so that it appears as below:
ParentAssetID ChildAssetID AssetName
------------------------------------
956 3 PARENT3
3 5 mno
956 35 PARENT35
35 1 cdf
35 3 jkl
956 84 PARENT84
84 1 ghi
84 2 abc
As you can see, the data is first sorted by the ChildAssetID, and then each child of that asset is sorted below it. It's a pain to deal with, and that's one of the reasons why we're trying to get rid of it.
Currently, all I've got is the following:
select ParentAssetID, ChildAssetID, AssetName from dbo.Assets order by ParentAssetID
However, this only groups the child assets together without their parent headings at the start: the parents are all the way down at the bottom under 956, grouped with their parent's children. Is there any way to sort the table like this so it's easily human readable, or will this job have to be done by hand?
For your example this could work:
SELECT t1.*
FROM elbat t1
ORDER BY CASE
WHEN NOT EXISTS (SELECT *
FROM elbat t2
WHERE t2.childassetid = t1.parentassetid) THEN
t1.childassetid
ELSE
t1.parentassetid
END,
CASE
WHEN NOT EXISTS (SELECT *
FROM elbat t2
WHERE t2.childassetid = t1.parentassetid) THEN
0
ELSE
1
END,
t1.childassetid;
The first CASE gets all children and their parent together, the second makes sure the parent is atop and then the children are sorted. If the levels in your real table are any deeper than in the example though, this might no longer work. But maybe you can make something out of it anyways.
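As a quick check of the two-key ordering, here is a sketch that runs the same ORDER BY against the question's sample rows in an in-memory SQLite database (a stand-in for SQL Server; the table name assets is my placeholder).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE assets (parentassetid INTEGER, childassetid INTEGER, assetname TEXT)"
)
conn.executemany(
    "INSERT INTO assets VALUES (?, ?, ?)",
    [(84, 2, "abc"), (35, 1, "cdf"), (956, 35, "PARENT35"), (84, 1, "ghi"),
     (956, 3, "PARENT3"), (35, 3, "jkl"), (956, 84, "PARENT84"), (3, 5, "mno")],
)

# Key 1: the id of the group a row belongs to (its own child id for a
#        top-level row, otherwise its parent id).
# Key 2: 0 for the top-level row so it sorts above its children.
# Key 3: child id, to order the children within the group.
order_sql = """
SELECT t1.parentassetid, t1.childassetid, t1.assetname
FROM assets t1
ORDER BY CASE WHEN NOT EXISTS (SELECT 1 FROM assets t2
                               WHERE t2.childassetid = t1.parentassetid)
              THEN t1.childassetid ELSE t1.parentassetid END,
         CASE WHEN NOT EXISTS (SELECT 1 FROM assets t2
                               WHERE t2.childassetid = t1.parentassetid)
              THEN 0 ELSE 1 END,
         t1.childassetid
"""
ordered_rows = conn.execute(order_sql).fetchall()
for row in ordered_rows:
    print(row)
```

On this sample the rows come out in the order requested in the question. With more than two levels the first key no longer identifies a single top-level group, which is the limitation mentioned above.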
You can achieve this using a recursive CTE:
;with cte as
(
select
ParentAssetID,
ChildAssetID,
AssetName,
cast(row_number()over(partition by ParentAssetID order by AssetName) as varchar(max)) as [path],
0 as level,
row_number()over(partition by ParentAssetID order by AssetName) / power(10.0,0) as x
from Assets
where ParentAssetID =956
union all
select
t.ParentAssetID,
t.ChildAssetID,
t.AssetName,
[path] +'-'+ cast(row_number()over(partition by t.ParentAssetID order by t.AssetName) as varchar(max)),
level+1,
x + row_number()over(partition by t.ParentAssetID order by t.AssetName) / power(10.0,level+1)
from
cte
join Assets t on cte.ChildAssetID = t.ParentAssetID
)
select
ParentAssetID,
ChildAssetID,
AssetName,
[path],
x
from cte
order by x
Your data is a bit awkward, because "mno" has a parent of "3" and "3" is associated with two parent ids.
Other than this, you appear to want to order by the path to the top. You can do this with a recursive CTE:
with cte as (
select a.parentassetid, a.childassetid, a.assetname,
convert(varchar(max), concat(format(a.parentassetid, '0000'), format(a.childassetid, '0000'))) as path, 1 as lev
from assets a
where not exists (select 1 from assets ap where a.parentassetid = ap.childassetid)
union all
select a.parentassetid, a.childassetid, a.assetname,
convert(varchar(max), concat(cte.path, '/', format(a.childassetid, '0000'))), lev + 1
from cte join
assets a
on cte.childassetid = a.parentassetid
where lev < 10
)
select *
from cte
order by path;
This doesn't produce exactly what you want, because "mno" is duplicated. I would assume that is a transcription error.
If this is not a transcription error and you want the first time that a row occurs, you can use:
select cte.*
from (select cte.*,
row_number() over (partition by parentassetid, childassetid order by lev asc) as seqnum
from cte
) cte
where seqnum = 1
order by path
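For anyone who wants to experiment with the path idea without a SQL Server instance, here is a rough equivalent in SQLite through Python's sqlite3 (a stand-in; it needs SQLite 3.25+ for ROW_NUMBER). printf('%04d', ...) replaces FORMAT(..., '0000'), and the helper CTE name ranked is my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE assets (parentassetid INTEGER, childassetid INTEGER, assetname TEXT)"
)
conn.executemany(
    "INSERT INTO assets VALUES (?, ?, ?)",
    [(84, 2, "abc"), (35, 1, "cdf"), (956, 35, "PARENT35"), (84, 1, "ghi"),
     (956, 3, "PARENT3"), (35, 3, "jkl"), (956, 84, "PARENT84"), (3, 5, "mno")],
)

# The anchor picks rows whose parent never appears as a child (top level).
# Each recursion step appends the zero-padded child id to the path, so
# sorting by path lists every parent directly above its descendants.
path_sql = """
WITH RECURSIVE cte(parentassetid, childassetid, assetname, path, lev) AS (
    SELECT a.parentassetid, a.childassetid, a.assetname,
           printf('%04d', a.parentassetid) || printf('%04d', a.childassetid),
           1
    FROM assets a
    WHERE NOT EXISTS (SELECT 1 FROM assets ap
                      WHERE ap.childassetid = a.parentassetid)
    UNION ALL
    SELECT a.parentassetid, a.childassetid, a.assetname,
           cte.path || '/' || printf('%04d', a.childassetid),
           cte.lev + 1
    FROM cte
    JOIN assets a ON cte.childassetid = a.parentassetid
    WHERE cte.lev < 10
),
ranked AS (
    -- keep only the shallowest occurrence of each (parent, child) pair
    SELECT cte.*,
           ROW_NUMBER() OVER (PARTITION BY parentassetid, childassetid
                              ORDER BY lev) AS seqnum
    FROM cte
)
SELECT parentassetid, childassetid, assetname
FROM ranked
WHERE seqnum = 1
ORDER BY path
"""
path_rows = conn.execute(path_sql).fetchall()
for row in path_rows:
    print(row)
```

Without the seqnum filter, "mno" shows up twice (once under PARENT3 and once under "jkl", whose child id is also 3), which is the duplication discussed above.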
Testing answer from @Krishna Muppalla (https://stackoverflow.com/a/59174634/956364)
Is there a way to not use the power(10) functions? I've never seen them used to sort like this!
drop table if exists #Assets;
create table #Assets( [ParentAssetID] int, [ChildAssetID] int, [AssetName] varchar(30) );
insert into #Assets( [ParentAssetID], [ChildAssetID], [AssetName] )
select 84, 2, 'abc' union all
select 35, 1, 'cdf' union all
select 956, 35, 'PARENT35' union all
select 84, 1, 'ghi' union all
select 956, 3, 'PARENT3' union all
select 35, 3, 'jkl' union all
select 956, 84, 'PARENT84' union all
select 3, 5, 'mno';
declare @one float = 1; --I don't know if power(10.0,0) was being recomputed each call.
with [cte] as (
select
[ParentAssetID],
[ChildAssetID],
[AssetName],
0 [level],
row_number() over( partition by [ParentAssetID] order by [AssetName] ) / @one [x]
from #Assets
where [ParentAssetID] = 956 --this is bad. How do we get around this?
union all
select
t.[ParentAssetID],
t.[ChildAssetID],
t.[AssetName],
[level] + 1,
[x] + row_number() over( partition by t.[ParentAssetID] order by t.[AssetName] ) / power(10.0, [level] + 1) [x]
from [cte]
join #Assets t on cte.[ChildAssetID] = t.[ParentAssetID]
)
select
[ParentAssetID],
[ChildAssetID],
[AssetName]
,[x]
from [cte]
order by [x];
Is this what you are looking for?
select
ParentAssetID,
ChildAssetID,
AssetName
from dbo.Assets
order by ParentAssetID desc, ChildAssetID asc;

Pivoting Multiple Attributes and grouping them as a 'single' attribute (many to one)

I have a table called Value that is associated with different 'fields'. Note that some of these fields mean the same thing but are named differently. Ultimately I want these similar names to be pivoted/grouped as the same field name in the result set.
VALUE_ID VALUE_TX FIELD_NAME Version_ID
1 Yes Adult 1
2 18 Age 1
3 Black Eye Color 1
4 Yes Is_Adult 2
5 25 Years_old 2
6 Brown Color_of_Eyes 2
I have a table called Submitted that looks like the following:
Version_ID Version_Name
1 TEST_RUN
2 REAL_RUN
I need a result set that Looks like this:
Submitted_Name Adult? Age Eye_Color
TEST_RUN Yes 18 Black
REAL_RUN Yes 25 Brown
I've tried the following:
SELECT * FROM (
select value_Tx, field_name, version_id
from VALUE
)
PIVOT (max (value_tx) for field_name in (('Adult', 'Is_Adult') as 'Adult?', ('Age', 'Years_old') as 'Age', ('Eye Color', 'Color_of_Eyes') as 'Eye_Color')
);
What am I doing wrong? Please let me know if I need to add any additional details / data.
Thanks in advance!
The error message that I am getting is the following:
ORA-00907: missing right parenthesis
I would change the field names in the subquery:
SELECT *
FROM (select value_Tx,
(case when field_name in ('Adult', 'Is_Adult') then 'Adult?'
when field_name in ('Age', 'Years_old') then 'Age'
when field_name in ('Eye Color', 'Color_of_Eyes') then 'Eye_Color'
else field_name
end) as field_name, version_id
from VALUE
)
PIVOT (max(value_tx) for field_name in ('Adult?', 'Age', 'Eye_Color'));
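Here is a small runnable sketch of the "normalize the names first, then pivot" idea, using SQLite through Python's sqlite3 as a stand-in for Oracle. SQLite has no PIVOT, so the last step is written as conditional aggregation; the table name value_tab and the aliases field, adult, age and eye_color are my own placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE value_tab (value_id INTEGER, value_tx TEXT, "
    "field_name TEXT, version_id INTEGER)"
)
conn.execute("CREATE TABLE submitted (version_id INTEGER, version_name TEXT)")
conn.executemany(
    "INSERT INTO value_tab VALUES (?, ?, ?, ?)",
    [(1, "Yes", "Adult", 1), (2, "18", "Age", 1), (3, "Black", "Eye Color", 1),
     (4, "Yes", "Is_Adult", 2), (5, "25", "Years_old", 2),
     (6, "Brown", "Color_of_Eyes", 2)],
)
conn.executemany(
    "INSERT INTO submitted VALUES (?, ?)", [(1, "TEST_RUN"), (2, "REAL_RUN")]
)

# Step 1 (inner query): map every synonym to one canonical field name.
# Step 2 (outer query): pivot on the canonical name, one row per version.
normalize_sql = """
SELECT s.version_name,
       MAX(CASE WHEN n.field = 'Adult?'    THEN n.value_tx END) AS adult,
       MAX(CASE WHEN n.field = 'Age'       THEN n.value_tx END) AS age,
       MAX(CASE WHEN n.field = 'Eye_Color' THEN n.value_tx END) AS eye_color
FROM (
    SELECT version_id, value_tx,
           CASE WHEN field_name IN ('Adult', 'Is_Adult')          THEN 'Adult?'
                WHEN field_name IN ('Age', 'Years_old')           THEN 'Age'
                WHEN field_name IN ('Eye Color', 'Color_of_Eyes') THEN 'Eye_Color'
                ELSE field_name
           END AS field
    FROM value_tab
) n
JOIN submitted s ON s.version_id = n.version_id
GROUP BY s.version_id, s.version_name
ORDER BY s.version_id
"""
normalized_rows = conn.execute(normalize_sql).fetchall()
print(normalized_rows)
```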
You can use double quotes for column aliases within the pivot clause, and I think the decode function suits this question well. Consider the following query:
with value( value_id, value_tx, field_name, version_id ) as
(
select 1 ,'Yes' ,'Adult' ,1 from dual union all
select 2 ,'18' ,'Age' ,1 from dual union all
select 3 ,'Black','Eye_Color' ,1 from dual union all
select 4 ,'Yes' ,'Is_Adult' ,2 from dual union all
select 5 ,'25' ,'Years_old' ,2 from dual union all
select 6 ,'Brown','Color_of_Eyes',2 from dual
), Submitted( version_id, version_name ) as
(
select 1 ,'TEST_RUN' from dual union all
select 2 ,'REAL_RUN' from dual
)
select * from
(
select s.version_name as "Submitted_Name", v.value_Tx,
decode(v.field_name,'Adult','Is_Adult','Age','Years_old','Eye_Color',
'Color_of_Eyes',v.field_name) field_name
from value v
join Submitted s
on s.version_id = v.version_id
group by decode(v.field_name,'Adult','Is_Adult','Age','Years_old','Eye_Color',
'Color_of_Eyes',v.field_name),
v.value_Tx, s.Version_Name
)
pivot(
max(value_tx) for field_name in ( 'Is_Adult' as "Adult?", 'Years_old' as "Age",
'Color_of_Eyes' as "Eye_Color" )
);
Submitted_Name Adult? Age Eye_Color
REAL_RUN Yes 25 Brown
TEST_RUN Yes 18 Black
A shorter alternative, which only works because the value_id values in this sample follow a regular pattern, is to use modular arithmetic:
select *
from
(
select s.version_name as "Submitted_Name", v.value_Tx, mod(v.value_id,3) as value_id
from value v
join Submitted s
on s.version_id = v.version_id
group by v.value_Tx, s.version_name, mod(v.value_id,3)
)
pivot(
max(value_tx) for value_id in ( 1 as "Adult?", 2 as "Age", 0 as "Eye_Color" )
)

Find way for gathering data and replace with values from another table

I am looking for an Oracle SQL query to find a specific pattern and replace them with values from another table.
Scenario:
Table 1:
No column1
-----------------------------------------
12345 user:12345;group:56789;group:6785;...
Note: column1 may contain one or more such patterns.
Table2 :
Id name type
----------------------
12345 admin user
56789 testgroup group
The result should look like this:
No column1
-----------------------------------
12345 user: admin;group:testgroup
Logic:
First split the concatenated string into individual rows using the connect by clause and regex.
Join the newly created table (split_tab) with Table2 (tab2).
Use the listagg function to concatenate the data back into one column.
Query:
WITH tab1 AS
( SELECT '12345' NO
,'user:12345;group:56789;group:6785;' column1
FROM DUAL )
,tab2 AS
( SELECT 12345 id
,'admin' name
,'user' TYPE
FROM DUAL
UNION
SELECT 56789 id
,'testgroup' name
,'group' TYPE
FROM DUAL )
SELECT no
,listagg(category||':'||name,';') WITHIN GROUP (ORDER BY tab2.id) column1
FROM ( SELECT NO
,REGEXP_SUBSTR( column1, '(\d+)', 1, LEVEL ) id
,REGEXP_SUBSTR( column1, '([a-z]+)', 1, LEVEL ) CATEGORY
FROM tab1
CONNECT BY LEVEL <= regexp_count( column1, '\d+' ) ) split_tab
,tab2
WHERE split_tab.id = tab2.id
GROUP BY no
Output:
No Column1
12345 user:admin;group:testgroup
with t1 (no, col) as
(
-- start of test data
select 1, 'user:12345;group:56789;group:6785;' from dual union all
select 2, 'user:12345;group:56789;group:6785;' from dual
-- end of test data
)
-- the lookup table which has the substitute strings
-- nid : concatenation of name and id as in table t1 which requires the lookup
-- tname : required substitute for each nid
, t2 (id, name, type, nid, tname) as
(
select t.*, type || ':' || id, type || ':' || name from
(
select 12345 id, 'admin' name, 'user' type from dual union all
select 56789, 'testgroup', 'group' from dual
) t
)
--select * from t2;
-- cte table calculates the indexes for the substrings (eg, user:12345)
-- no : sequence no in t1
-- col : the input string in t1
-- si : starting index of each substring in the 'col' input string that needs attention later
-- ei : ending index of each substring in the 'col' input string
-- idx : the order of substring to put them together later
,cte (no, col, si, ei, idx) as
(
select no, col, 1, case when instr(col,';') = 0 then length(col)+1 else instr(col,';') end, 1 from t1 union all
select no, col, ei+1, case when instr(col,';', ei+1) = 0 then length(col)+1 else instr(col,';', ei+1) end, idx+1 from cte where ei + 1 <= length(col)
)
,coll(no, col, sstr, idx, newstr) as
(
select
a.no, a.col, a.sstr, a.idx,
-- when a substitute is not found in t2, use the same input substring (eg. group:6785)
case when t2.tname is null then a.sstr else t2.tname end
from
(select cte.*, substr(col, si, ei-si) as sstr from cte) a
-- we don't want to miss if there is no substitute available in t2 for a substring
left outer join
t2
on (a.sstr = t2.nid)
)
select no, col, listagg(newstr, ';') within group (order by no, col, idx) from coll
group by no, col;
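The two queries above differ in one detail: the first (inner join) drops tokens that have no match in the lookup table, while the second (left outer join) keeps them unchanged. The split / look up / re-join logic itself can be sketched in a few lines of plain Python. The function name replace_ids and the keep_unmatched flag are made up for illustration, and the lookup is keyed on (type, id) like the nid column of the second query.

```python
def replace_ids(column1, lookup, keep_unmatched=False):
    """Replace 'type:id' tokens with 'type:name' using a lookup dict."""
    out = []
    for token in column1.split(";"):
        if not token:
            continue  # skip the empty piece after a trailing ';'
        kind, _, ident = token.partition(":")
        name = lookup.get((kind, ident))
        if name is not None:
            out.append(f"{kind}:{name}")   # substitute found
        elif keep_unmatched:
            out.append(token)              # left-join behaviour: keep as is
    return ";".join(out)


# Lookup built from Table2: (type, id) -> name
lookup = {("user", "12345"): "admin", ("group", "56789"): "testgroup"}
source = "user:12345;group:56789;group:6785;"

inner_result = replace_ids(source, lookup)
outer_result = replace_ids(source, lookup, keep_unmatched=True)
print(inner_result)
print(outer_result)
```

Note that this sketch keeps the tokens in their original order, whereas the first query orders the output by id.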

SQL hierarchy count totals report

I'm creating a report with SQL server 2012 and Report Builder which must show the total number of Risks at a high, medium and low level for each Parent Element.
Each Element contains a number of Risks which are rated at a certain level. I need the total for the Parent Elements. The total will include the number of all the Child Elements and also the number the Element itself may have.
I am using CTEs in my query- the code I have attached isn't working (there are no errors - it's just displaying the incorrect results) and I'm not sure that my logic is correct??
Hopefully someone can help. Thanks in advance.
My table structure is:
ElementTable
ElementTableId(PK) ElementName ElementParentId
RiskTable
RiskId(PK) RiskName RiskRating ElementId(FK)
My query:
WITH cte_Hierarchy(ElementId, ElementName, Generation, ParentElementId)
AS (SELECT ElementId,
NAME,
0,
ParentElementId
FROM Extract.Element AS FirtGeneration
WHERE ParentElementId IS NULL
UNION ALL
SELECT NextGeneration.ElementId,
NextGeneration.NAME,
Parent.Generation + 1,
Parent.ElementId
FROM Extract.Element AS NextGeneration
INNER JOIN cte_Hierarchy AS Parent
ON NextGeneration.ParentElementId = Parent.ElementId),
CTE_HighRisk
AS (SELECT r.ElementId,
Count(r.RiskId) AS HighRisk
FROM Extract.Risk r
WHERE r.RiskRating = 'High'
GROUP BY r.ElementId),
CTE_LowRisk
AS (SELECT r.ElementId,
Count(r.RiskId) AS LowRisk
FROM Extract.Risk r
WHERE r.RiskRating = 'Low'
GROUP BY r.ElementId),
CTE_MedRisk
AS (SELECT r.ElementId,
Count(r.RiskId) AS MedRisk
FROM Extract.Risk r
WHERE r.RiskRating = 'Medium'
GROUP BY r.ElementId)
SELECT rd.ElementId,
rd.ElementName,
rd.ParentElementId,
Generation,
HighRisk,
MedRisk,
LowRisk
FROM cte_Hierarchy rd
LEFT OUTER JOIN CTE_HighRisk h
ON rd.ElementId = h.ElementId
LEFT OUTER JOIN CTE_MedRisk m
ON rd.ElementId = m.ElementId
LEFT OUTER JOIN CTE_LowRisk l
ON rd.ElementId = l.ElementId
WHERE Generation = 1
Edit:
Sample Data
ElementTableId(PK) ElementName ElementParentId
1 Main 0
2 Element1 1
3 Element2 1
4 SubElement1 2
RiskId(PK) RiskName RiskRating ElementId(FK)
a Financial High 2
b HR High 3
c Marketing Low 2
d Safety Medium 4
Sample Output:
ElementName High Medium Low
Main 2 1 1
Here are your sample tables:
SELECT * INTO #TABLE1
FROM
(
SELECT 1 ElementTableId, 'Main' ElementName ,0 ElementParentId
UNION ALL
SELECT 2,'Element1',1
UNION ALL
SELECT 3, 'Element2',1
UNION ALL
SELECT 4, 'SubElement1',2
)TAB
SELECT * INTO #TABLE2
FROM
(
SELECT 'a' RiskId, 'Financial' RiskName,'High' RiskRating ,2 ElementId
UNION ALL
SELECT 'b','HR','High',3
UNION ALL
SELECT 'c', 'Marketing','Low',2
UNION ALL
SELECT 'd', 'Safety','Medium',4
)TAB
We find the children of a parent and their counts of High, Medium and Low risks, then use a cross join to show the parent with the combined counts of all its children.
UPDATE
The below variable can be used to access the records dynamically.
DECLARE @ElementTableId INT;
--SET @ElementTableId = 1
And use the above variable inside the query
;WITH CTE1 AS
(
SELECT *,0 [LEVEL] FROM #TABLE1 WHERE ElementTableId = @ElementTableId
UNION ALL
SELECT E.*,e2.[LEVEL]+1 FROM #TABLE1 e
INNER JOIN CTE1 e2 on e.ElementParentId = e2.ElementTableId
AND E.ElementTableId<>@ElementTableId
)
,CTE2 AS
(
SELECT E1.*,E2.*,COUNT(RiskRating) OVER(PARTITION BY RiskRating) CNT
from CTE1 E1
LEFT JOIN #TABLE2 E2 ON E1.ElementTableId=E2.ElementId
)
,CTE3 AS
(
SELECT DISTINCT T1.ElementName,C2.RiskRating,C2.CNT
FROM #TABLE1 T1
CROSS JOIN CTE2 C2
WHERE T1.ElementTableId = @ElementTableId
)
SELECT *
FROM CTE3
PIVOT(MIN(CNT)
FOR RiskRating IN ([High], [Medium],[Low])) AS PVTTable
UPDATE 2
I am updating as per your new requirement
Here is sample table in which I have added extra data to test
SELECT * INTO #ElementTable
FROM
(
SELECT 1 ElementTableId, 'Main' ElementName ,0 ElementParentId
UNION ALL
SELECT 2,'Element1',1
UNION ALL
SELECT 3, 'Element2',1
UNION ALL
SELECT 4, 'SubElement1',2
UNION ALL
SELECT 5, 'Main 2',0
UNION ALL
SELECT 6, 'Element21',5
UNION ALL
SELECT 7, 'SubElement21',6
UNION ALL
SELECT 8, 'SubElement22',7
UNION ALL
SELECT 9, 'SubElement23',7
)TAB
SELECT * INTO #RiskTable
FROM
(
SELECT 'a' RiskId, 'Financial' RiskName,'High' RiskRating ,2 ElementId
UNION ALL
SELECT 'b','HR','High',3
UNION ALL
SELECT 'c', 'Marketing','Low',2
UNION ALL
SELECT 'd', 'Safety','Medium',4
UNION ALL
SELECT 'e' , 'Financial' ,'High' ,5
UNION ALL
SELECT 'f','HR','High',6
UNION ALL
SELECT 'g','HR','High',6
UNION ALL
SELECT 'h', 'Marketing','Low',7
UNION ALL
SELECT 'i', 'Safety','Medium',8
UNION ALL
SELECT 'j', 'Safety','High',8
)TAB
I have written the logic in query
;WITH CTE1 AS
(
-- Here you will find the level of every element in the table
SELECT *,0 [LEVEL]
FROM #ElementTable WHERE ElementParentId = 0
UNION ALL
SELECT ET.*,CTE1.[LEVEL]+1
FROM #ElementTable ET
INNER JOIN CTE1 on ET.ElementParentId = CTE1.ElementTableId
)
,CTE2 AS
(
-- Filters on the level and finds the major parent of each child
-- i.e., for 100->150->200, the major parent of 200 is 100
SELECT *,CTE1.ElementTableId MajorParentID,CTE1.ElementName MajorParentName
FROM CTE1 WHERE [LEVEL]=1
UNION ALL
SELECT CTE1.*,CTE2.MajorParentID,CTE2.MajorParentName
FROM CTE1
INNER JOIN CTE2 on CTE1.ElementParentId = CTE2.ElementTableId
)
,CTE3 AS
(
-- Since each child have columns for main parent id and name,
-- you will get the count of each element corresponding to the level you have selected directly
SELECT DISTINCT CTE2.MajorParentName,RT.RiskRating ,
COUNT(RiskRating) OVER(PARTITION BY MajorParentID,RiskRating) CNT
FROM CTE2
JOIN #RiskTable RT ON CTE2.ElementTableId=RT.ElementId
)
SELECT MajorParentName, ISNULL([High],0)[High], ISNULL([Medium],0)[Medium],ISNULL([Low],0)[Low]
FROM CTE3
PIVOT(MIN(CNT)
FOR RiskRating IN ([High], [Medium],[Low])) AS PVTTable
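If the goal is the total per top-level element (as in the question's sample output), the roll-up can be sketched more compactly: tag every element with its root ancestor in one recursive CTE, then count ratings per root. The version below runs on SQLite via Python's sqlite3 purely as a stand-in for SQL Server, uses the question's original four-element sample, and treats ElementParentId = 0 as "no parent". The CTE name tree and the output aliases are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE element (ElementTableId INTEGER, ElementName TEXT, "
    "ElementParentId INTEGER)"
)
conn.execute(
    "CREATE TABLE risk (RiskId TEXT, RiskName TEXT, RiskRating TEXT, "
    "ElementId INTEGER)"
)
conn.executemany(
    "INSERT INTO element VALUES (?, ?, ?)",
    [(1, "Main", 0), (2, "Element1", 1), (3, "Element2", 1),
     (4, "SubElement1", 2)],
)
conn.executemany(
    "INSERT INTO risk VALUES (?, ?, ?, ?)",
    [("a", "Financial", "High", 2), ("b", "HR", "High", 3),
     ("c", "Marketing", "Low", 2), ("d", "Safety", "Medium", 4)],
)

# tree maps every element (including the root itself) to its root.
# Joining risks onto that mapping and counting per root gives totals
# that include the root's own risks plus those of all descendants.
rollup_sql = """
WITH RECURSIVE tree(root_id, root_name, element_id) AS (
    SELECT ElementTableId, ElementName, ElementTableId
    FROM element
    WHERE ElementParentId = 0
    UNION ALL
    SELECT t.root_id, t.root_name, e.ElementTableId
    FROM element e
    JOIN tree t ON e.ElementParentId = t.element_id
)
SELECT t.root_name,
       SUM(CASE WHEN r.RiskRating = 'High'   THEN 1 ELSE 0 END) AS high,
       SUM(CASE WHEN r.RiskRating = 'Medium' THEN 1 ELSE 0 END) AS medium,
       SUM(CASE WHEN r.RiskRating = 'Low'    THEN 1 ELSE 0 END) AS low
FROM tree t
LEFT JOIN risk r ON r.ElementId = t.element_id
GROUP BY t.root_id, t.root_name
ORDER BY t.root_id
"""
rollup_rows = conn.execute(rollup_sql).fetchall()
print(rollup_rows)
```

To report per level-1 element instead (as the query above does), the anchor would select the level-1 rows rather than the roots.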

SELECT DISTINCT for data groups

I have following table:
ID Data
1 A
2 A
2 B
3 A
3 B
4 C
5 D
6 A
6 B
etc. In other words, I have groups of data per ID. You will notice that the data group (A, B) occurs multiple times. I want a query that can identify the distinct data groups and number them, such as:
DataID Data
101 A
102 A
102 B
103 C
104 D
So DataID 102 would resemble data (A,B), DataID 103 would resemble data (C), etc. In order to be able to rewrite my original table in this form:
ID DataID
1 101
2 102
3 102
4 103
5 104
6 102
How can I do that?
PS. Code to generate the first table:
CREATE TABLE #t1 (id INT, data VARCHAR(10))
INSERT INTO #t1
SELECT 1, 'A'
UNION ALL SELECT 2, 'A'
UNION ALL SELECT 2, 'B'
UNION ALL SELECT 3, 'A'
UNION ALL SELECT 3, 'B'
UNION ALL SELECT 4, 'C'
UNION ALL SELECT 5, 'D'
UNION ALL SELECT 6, 'A'
UNION ALL SELECT 6, 'B'
In my opinion you have to create a custom aggregate that concatenates the data (for strings, a CLR approach is recommended for performance reasons).
Then I would group by ID and select the distinct groupings, numbering them with row_number() or dense_rank(), your choice. Anyway, it should look like this:
with groupings as (
-- concat() stands for the custom concatenating aggregate described above
select distinct concat(data) as groups
from Table1
group by ID
)
select groups, row_number() over (order by groups) as DataID from groupings
The following query using CASE will give you the result shown below.
From there on, getting the distinct datagroups and proceeding further should not really be a problem.
SELECT
id,
MAX(CASE data WHEN 'A' THEN data ELSE '' END) +
MAX(CASE data WHEN 'B' THEN data ELSE '' END) +
MAX(CASE data WHEN 'C' THEN data ELSE '' END) +
MAX(CASE data WHEN 'D' THEN data ELSE '' END) AS DataGroups
FROM t1
GROUP BY id
ID DataGroups
1 A
2 AB
3 AB
4 C
5 D
6 AB
However, this kind of logic will only work if the "Data" values are both fixed and known beforehand.
In your case, you do say that is so. However, considering that you also say there are 1000 of them, this would be a ridiculous looking query for sure :-)
LuckyLuke's suggestion above would be the more generic and probably saner way to implement the solution in your case.
From your sample data (having added the missing 2,'A' tuple), the following gives the renumbered (and uniqueified) data:
with NonDups as (
select t1.id
from #t1 t1 left join #t1 t2
on t1.id > t2.id and t1.data = t2.data
group by t1.id
having COUNT(t1.data) > COUNT(t2.data)
), DataAddedBack as (
select ID,data
from #t1 where id in (select id from NonDups)
), Renumbered as (
select DENSE_RANK() OVER (ORDER BY id) as ID,Data from DataAddedBack
)
select * from Renumbered
Giving:
1 A
2 A
2 B
3 C
4 D
I think then, it's a matter of relational division to match up rows from this output with the rows in the original table.
Just to share my own dirty solution that I'm using for the moment:
SELECT DISTINCT t1.id, D.data
FROM #t1 t1
CROSS APPLY (
SELECT CAST(Data AS VARCHAR) + ','
FROM #t1 t2
WHERE t2.id = t1.id
ORDER BY Data ASC
FOR XML PATH('') )
D ( Data )
And then proceeding analogously to LuckyLuke's solution.
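The common thread in these answers (build one canonical value per ID, then number the distinct values) can be illustrated outside the database. The following is only a plain-Python sketch of that logic, not a T-SQL solution; the starting number 101 is taken from the desired output in the question, and all variable names are my own.

```python
# (id, data) pairs from the question's first table
rows = [(1, "A"), (2, "A"), (2, "B"), (3, "A"), (3, "B"),
        (4, "C"), (5, "D"), (6, "A"), (6, "B")]

# Step 1: collect the set of data values per id.
groups = {}
for id_, data in rows:
    groups.setdefault(id_, set()).add(data)

# Step 2: use the sorted tuple as the canonical form of a group and hand
# out a new DataID the first time each distinct group is seen.
data_ids = {}        # canonical group -> DataID
id_to_data_id = {}   # original id -> DataID
for id_ in sorted(groups):
    key = tuple(sorted(groups[id_]))
    if key not in data_ids:
        data_ids[key] = 101 + len(data_ids)
    id_to_data_id[id_] = data_ids[key]

print(data_ids)
print(id_to_data_id)
```

Sorting the values before comparing plays the same role as the ORDER BY inside the FOR XML PATH concatenation: two IDs get the same DataID only if they hold exactly the same set of values.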