Use Left Join Alias in Column Select in SQL Views - sql

I am working on creating a view in SQL Server in which one of the columns needs to be a comma-separated value built from a different table. Consider the tables below, for instance -
CREATE TABLE Persons
(
Id INT NOT NULL PRIMARY KEY,
Name VARCHAR (100)
)
CREATE TABLE Skills
(
Id INT NOT NULL PRIMARY KEY,
Name VARCHAR (100),
)
CREATE TABLE PersonSkillLinks
(
Id INT NOT NULL PRIMARY KEY,
SkillId INT FOREIGN KEY REFERENCES Skills(Id),
PersonId INT FOREIGN KEY REFERENCES Persons(Id),
)
Sample data
INSERT INTO Persons VALUES
(1, 'Peter'),
(2, 'Sam'),
(3, 'Chris')
INSERT INTO Skills VALUES
(1, 'Poetry'),
(2, 'Cooking'),
(3, 'Movies')
INSERT INTO PersonSkillLinks VALUES
(1, 1, 1),
(2, 2, 1),
(3, 3, 1)
What I want is something like shown in the image
While I have been able to get the results using the script below, I have a feeling that this is not the best (and certainly not the only) way to do it as far as performance goes -
CREATE VIEW vwPersonsAndTheirSkills
AS
SELECT p.Name,
ISNULL(STUFF((SELECT ', ' + s.Name FROM Skills s JOIN PersonSkillLinks psl ON s.Id = psl.SkillId WHERE psl.personId = p.Id FOR XML PATH ('')), 1, 2, ''), '') AS Skill
FROM Persons p
GO
I also tried my luck with the script below -
CREATE VIEW vwPersonsAndTheirSkills
AS
SELECT p.Name,
ISNULL(STUFF((SELECT ', ' + skill.Name FOR XML PATH ('')), 1, 2, ''), '') AS Skill
FROM persons p
LEFT JOIN
(
SELECT s.Name, psl.personid FROM Skills s
JOIN PersonSkillLinks psl ON s.Id = psl.SkillId
) skill ON skill.personId = p.Id
GO
but it does not concatenate the strings and instead returns a separate row for each skill, as shown below -
So, is my assumption about the first script correct? If so, what concept am I missing, and what is the most efficient way to achieve this?

I would try with APPLY :
SELECT p.Name, STUFF(ss.skills, 1, 2, '') AS Skill
FROM Persons p OUTER APPLY
(SELECT ', ' + s.Name
FROM Skills s JOIN
PersonSkillLinks psl
ON s.Id = psl.SkillId
WHERE psl.personId = p.Id
FOR XML PATH ('')
) ss(skills);
This way, the optimizer should call STUFF() once rather than for every row returned by the outer query.
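As for why the second script in the question returns one row per skill: its FOR XML PATH subquery has no FROM clause, so it only ever sees the single joined row it is evaluated for, and the LEFT JOIN has already multiplied the person rows. As a hedged alternative, assuming SQL Server 2017 or later (where STRING_AGG is available) and the sample tables from the question, the aggregation can be written as a plain GROUP BY. This is a sketch, not a measured performance recommendation:

```sql
-- Sketch only: assumes SQL Server 2017+ and the Persons / Skills / PersonSkillLinks
-- tables from the question.
CREATE VIEW vwPersonsAndTheirSkills
AS
SELECT p.Name,
       -- STRING_AGG skips NULLs and returns NULL when a person has no skills,
       -- so ISNULL turns that case into an empty string.
       ISNULL(STRING_AGG(s.Name, ', '), '') AS Skill
FROM Persons p
LEFT JOIN PersonSkillLinks psl ON psl.PersonId = p.Id
LEFT JOIN Skills s ON s.Id = psl.SkillId
GROUP BY p.Id, p.Name;
GO
```

Grouping by p.Id as well as p.Name keeps two people with the same name on separate rows.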

Related

'Merge Fields' - alike SQL Server function

I am trying to find a way to let the DBMS populate merge fields within a long text.
Create the structure :
CREATE TABLE [dbo].[store]
(
[id] [int] NOT NULL,
[text] [nvarchar](MAX) NOT NULL
)
CREATE TABLE [dbo].[statement]
(
[id] [int] NOT NULL,
[store_id] [int] NOT NULL
)
CREATE TABLE [dbo].[statement_merges]
(
[statement_id] [int] NOT NULL,
[merge_field] [nvarchar](30) NOT NULL,
[user_data] [nvarchar](MAX) NOT NULL
)
Now, create test values
INSERT INTO [store] (id, text)
VALUES (1, 'Waw, stackoverflow is an amazing library of lost people in the IT hell, and i have the feeling that $$PERC_SAT$$ of the users found a solution, personally I asked $$ASKED$$ questions.')
INSERT INTO [statement] (id, store_id)
VALUES (1, 1)
INSERT INTO [statement_merges] (statement_id, merge_field, user_data)
VALUES (1, '$$PERC_SAT$$', '85%')
INSERT INTO [statement_merges] (statement_id, merge_field, user_data)
VALUES (1, '$$ASKED$$', '12')
At the moment my app delivers the final statement by looping through the merges, replacing them in the stored text, and outputting:
Waw, stackoverflow is an amazing library of lost people in the IT
hell, and i have the feeling that 85% of the users found a solution,
personally I asked 12 questions.
I am trying to find a way to be code-independent and serve the output in a single query; in other words, to select a statement in which the stored text has been populated with the user data. I hope that's clear.
I looked at the TRANSLATE function, but it only does character-for-character replacement, so I see two choices:
I try a recursive function, replacing one by one until no merge_field is found in the calculated text; but I have doubts about the performance of this approach;
There is some magic to do that, but I need your knowledge...
Consider that I want this because the real texts are very long, and I don't want to store them more than once in my database. You can imagine a 3-page contract with only 12 parameters, like start date, invoiced amount, etc. Everything else can't be changed, for compliance reasons.
Thank you for your time!
EDIT :
Thanks to Randy's help, this looks to do the trick :
WITH cte_replace_tokens AS (
SELECT replace(r.text, m.merge_field, m.user_data) as [final], m.merge_field, s.id, 1 AS i
FROM store r
INNER JOIN statement s ON s.store_id = r.id
INNER JOIN statement_merges m ON m.statement_id = s.id
WHERE m.statement_id = 1
UNION ALL
SELECT replace(r.final, m.merge_field, m.user_data) as [final], m.merge_field, r.id, r.i + 1 AS i
FROM cte_replace_tokens r
INNER JOIN statement_merges m ON m.statement_id = r.id
WHERE m.merge_field > r.merge_field
)
select TOP 1 final from cte_replace_tokens ORDER BY i DESC
I will check with a bigger database whether the performance is good...
At least I can "populate" one statement; I still need to figure out how to extract a list as well.
Thanks again !
If a record is updated more than once by the same update, the last wins. None of the updates are affected by the others - no cumulative effect. It is possible to trick SQL using a local variable to get cumulative effects in some cases, but it's tricky and not recommended. (Order becomes important and is not reliable in an update.)
One alternate is recursion in a CTE. Generate a new record from the prior as each token is replaced until there are no tokens. Here is a working example that replaces 1 with A, 2 with B, etc. (I wonder if there is some tricky xml that can do this as well.)
if not object_id('tempdb..#Raw') is null drop table #Raw
CREATE TABLE #Raw(
[test] [varchar](100) NOT NULL PRIMARY KEY CLUSTERED,
)
if not object_id('tempdb..#Token') is null drop table #Token
CREATE TABLE #Token(
[id] [int] NOT NULL PRIMARY KEY CLUSTERED,
[token] [char](1) NOT NULL,
[value] [char](1) NOT NULL,
)
insert into #Raw values('123456'), ('1122334456')
insert into #Token values(1, '1', 'A'), (2, '2', 'B'), (3, '3', 'C'), (4, '4', 'D'), (5, '5', 'E'), (6, '6', 'F');
WITH cte_replace_tokens AS (
SELECT r.test, replace(r.test, l.token, l.value) as [final], l.id
FROM #Raw r
CROSS JOIN #Token l
WHERE l.id = 1
UNION ALL
SELECT r.test, replace(r.final, l.token, l.value) as [final], l.id
FROM cte_replace_tokens r
CROSS JOIN #Token l
WHERE l.id = r.id + 1
)
select * from cte_replace_tokens where id = 6
Doing this kind of task inside the SQL engine is not recommended, but if you want to, you need to do it in a loop using a cursor in a function or stored procedure, like so:
DECLARE @merge_field nvarchar(30)
, @user_data nvarchar(MAX)
, @statementid INT = 1
, @text varchar(MAX) = 'Waw, stackoverflow is an amazing library of lost people in the IT hell, and i have the feeling that $$PERC_SAT$$ of the users found a solution, personally I asked $$ASKED$$ questions.'
DECLARE merge_statements CURSOR FAST_FORWARD
FOR SELECT
sm.merge_field
, sm.user_data
FROM dbo.statement_merges AS sm
WHERE sm.statement_id = @statementid
OPEN merge_statements
FETCH NEXT FROM merge_statements
INTO @merge_field , @user_data
WHILE @@FETCH_STATUS = 0
BEGIN
set @text = REPLACE(@text , @merge_field, @user_data )
FETCH NEXT FROM merge_statements
INTO @merge_field , @user_data
END
CLOSE merge_statements
DEALLOCATE merge_statements
SELECT @text
Here is a recursive solution.
SQL Fiddle
MS SQL Server 2017 Schema Setup:
CREATE TABLE [dbo].[store]
(
[id] [int] NOT NULL,
[text] [nvarchar](MAX) NOT NULL
)
CREATE TABLE [dbo].[statement]
(
[id] [int] NOT NULL,
[store_id] [int] NOT NULL
)
CREATE TABLE [dbo].[statement_merges]
(
[statement_id] [int] NOT NULL,
[merge_field] [nvarchar](30) NOT NULL,
[user_data] [nvarchar](MAX) NOT NULL
)
INSERT INTO store (id, text)
VALUES (1, '$$(*)$$, stackoverflow...$$PERC_SAT$$...$$ASKED$$ questions.')
INSERT INTO store (id, text)
VALUES (2, 'Use The #_#')
INSERT INTO statement (id, store_id) VALUES (1, 1)
INSERT INTO statement (id, store_id) VALUES (2, 2)
INSERT INTO statement_merges (statement_id, merge_field, user_data) VALUES (1, '$$PERC_SAT$$', '85%')
INSERT INTO statement_merges (statement_id, merge_field, user_data) VALUES (1, '$$ASKED$$', '12')
INSERT INTO statement_merges (statement_id, merge_field, user_data) VALUES (1, '$$(*)$$', 'Wow')
INSERT INTO statement_merges (statement_id, merge_field, user_data) VALUES (2, ' #_#', 'Flux!')
Query 1:
;WITH Normalized AS
(
SELECT
store_id=store.id,
store.text,
sm.merge_field,
sm.user_data,
RowNumber = ROW_NUMBER() OVER(PARTITION BY store.id,sm.statement_id ORDER BY merge_field),
statement_id = st.id
FROM
store store
INNER JOIN statement st ON st.store_id = store.id
INNER JOIN statement_merges sm ON sm.statement_id = st.id
)
, Recurse AS
(
SELECT
store_id, statement_id, old_text = text, merge_field,user_data, RowNumber,
Iteration=1,
new_text = REPLACE(text, merge_field, user_data)
FROM
Normalized
WHERE
RowNumber=1
UNION ALL
SELECT
n.store_id, n.statement_id, r.old_text, n.merge_field, n.user_data,
RowNumber=r.RowNumber+1,
Iteration=Iteration+1,
new_text = REPLACE(r.new_text, n.merge_field, n.user_data)
FROM
Normalized n
INNER JOIN Recurse r ON r.RowNumber = n.RowNumber AND r.statement_id = n.statement_id
)
,ReverseOnIteration AS
(
SELECT *,
ReverseIteration = ROW_NUMBER() OVER(PARTITION BY statement_id ORDER BY Iteration DESC)
FROM
Recurse
)
SELECT
store_id, statement_id, new_text, old_text
FROM
ReverseOnIteration
WHERE
ReverseIteration=1
Results:
| store_id | statement_id | new_text | old_text |
|----------|--------------|------------------------------------------|--------------------------------------------------------------|
| 1 | 1 | Wow, stackoverflow...85%...12 questions. | $$(*)$$, stackoverflow...$$PERC_SAT$$...$$ASKED$$ questions. |
| 2 | 2 | Use TheFlux! | Use The #_# |
With the help of Randy, I think I've achieved what I wanted to do !
Given that my real case is a contract containing several statements, each of which may be:
free text
stored text without any merges
stored text with one or several merges
this CTE does the job !
WITH cte_replace_tokens AS (
-- The initial query dont join on merges neither on store because can be a free text
SELECT COALESCE(r.text, s.part_text) AS [final], CAST('' AS NVARCHAR) AS merge_field, s.id, 1 AS i, s.contract_id
FROM statement s
LEFT JOIN store r ON s.store_id = r.id
UNION ALL
-- We loop till the last merge field, output contains iteration to be able to keep the last record ( all fields updated )
SELECT replace(r.final, m.merge_field, m.user_data) as [final], m.merge_field, r.id, r.i + 1 AS i, r.contract_id
FROM cte_replace_tokens r
INNER JOIN statement_merges m ON m.statement_id = r.id
WHERE m.merge_field > r.merge_field AND r.final LIKE '%' + m.merge_field + '%'
-- spare lost replacements by forcing only one merge_field per loop
AND NOT EXISTS( SELECT mm.statement_id FROM statement_merges mm WHERE mm.statement_id = m.statement_id AND mm.merge_field > r.merge_field AND mm.merge_field < m.merge_field)
)
select s.id,
(select top 1 final from cte_replace_tokens t WHERE t.contract_id = s.contract_id AND t.id = s.id ORDER BY i DESC) as res
FROM statement s
where contract_id = 1
If the CTE solution with a cross join is too slow, an alternative would be to dynamically build a scalar function that contains every REPLACE required by the token table. One scalar function call per record is then O(N). I get the same result as before.
The function is simple and unlikely to be too long, depending on how big the token table becomes (there is a 256 MB batch limit). I've seen attempts to dynamically create queries to improve performance backfire by moving the problem to compile time. That should not be a problem here.
if not object_id('tempdb..#Raw') is null drop table #Raw
CREATE TABLE #Raw(
[test] [varchar](100) NOT NULL PRIMARY KEY CLUSTERED,
)
if not object_id('tempdb..#Token') is null drop table #Token
CREATE TABLE #Token(
[id] [int] NOT NULL PRIMARY KEY CLUSTERED,
[token] [char](1) NOT NULL,
[value] [char](1) NOT NULL,
)
insert into #Raw values('123456'), ('1122334456')
insert into #Token values(1, '1', 'A'), (2, '2', 'B'), (3, '3', 'C'), (4, '4', 'D'), (5, '5', 'E'), (6, '6', 'F');
DECLARE @sql varchar(max) = 'CREATE FUNCTION dbo.fn_ReplaceTokens(@raw varchar(8000)) RETURNS varchar(8000) AS BEGIN RETURN ';
WITH cte_replace_statement AS (
SELECT a.id, CAST('replace(@raw,''' + a.token + ''',''' + a.value + ''')' as varchar(max)) as [statement]
FROM #Token a
WHERE a.id = 1
UNION ALL
SELECT n.id, CAST(replace(l.[statement], '@raw', 'replace(@raw,''' + n.token + ''',''' + n.value + ''')') as varchar(max)) as [statement]
FROM #Token n
INNER JOIN cte_replace_statement l
ON n.id = l.id + 1
)
select @sql += [statement] + ' END' from cte_replace_statement where id = 6
print @sql
if not object_id('dbo.fn_ReplaceTokens') is null drop function dbo.fn_ReplaceTokens
execute (@sql)
SELECT r.test, dbo.fn_ReplaceTokens(r.test) as [final] FROM #Raw r

T-SQL loop through two tables

I have been unable to find a working solution for below dilemma.
I am using SQL Server 2016 and have the 2 tables shown below in a database.
Users table:
Id Name
----------
1 Lisa
2 Paul
3 John
4 Mike
5 Tom
Role table:
Id UserId Role
------------------------
1 3 Manager
2 2,4,5 Developer
3 1 Designer
I am looking for T-SQL code that loops through the Role table, extracts UserIds and retrieves associated name for each Id from the Users table.
So the looped result would look like this:
John
Paul,Mike,Tom
Lisa
FOR SQL SERVER 2017
SELECT R1.Id,STRING_AGG(U.Name , ','),R1.Role
FROM Users U
INNER JOIN
(
SELECT R.Id,S.value AS UserId,R.Role
FROM Role R
CROSS APPLY STRING_SPLIT (UserID, ',') S
) R1
ON U.Id=R1.UserId
GROUP BY R1.ID,R1.Role
ORDER BY R1.ID;
OR
FOR SQL SERVER 2016
WITH CTE AS
(
SELECT R2.ID,U.Name,R2.UserId,R2.Role
FROM Users U
INNER JOIN
(
SELECT R.Id,S.value AS UserId,R.Role
FROM Role R
CROSS APPLY STRING_SPLIT (UserID, ',') S
)R2
ON U.id=R2.UserId
)
SELECT DISTINCT R1.Id,
STUFF((
SELECT ',' + name
FROM CTE R3
WHERE R1.Role = R3.Role
FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '') AS NAME
,R1.Role
FROM CTE AS R1;
OR
For Old versions
With CTE AS
(
SELECT r.id,
u.name,
r.Role
FROM Users u
INNER JOIN Role r
ON ',' + CAST(r.Userid AS NVARCHAR(20)) + ',' like '%,' + CAST(u.id AS NVARCHAR(20)) + ',%'
)
SELECT DISTINCT id,
STUFF((
SELECT ',' + name
FROM CTE md
WHERE T.Role = md.Role
FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '') AS NAME,
Role
FROM
CTE AS T
ORDER BY id
Output
id NAME Role
1 John Manager
2 Paul,Mike,Tom Developer
3 Lisa Designer
Demo
http://sqlfiddle.com/#!18/04a2d/69
There are several problems going on here. The first issue is that you're storing your values in a delimited format. Next, because you're storing your values in a delimited format, the values are being stored as a varchar. This has problems as well: I would guess that the column Id in the table Users is an int, meaning an implicit cast is needed, which ruins any SARGability.
So, the solution is to fix the problem, in my view. Because you have a many to many relationship, you'll need an extra table. Let's design the tables as you have them right now, anyway:
CREATE TABLE Users (Id int, Name varchar(100));
CREATE TABLE Roles (Id int, UserId varchar(100), [Role] varchar(100));
INSERT INTO Users
VALUES (1,'Lisa'),
(2,'Paul'),
(3,'John'),
(4,'Mike'),
(5,'Tom');
INSERT INTO Roles
VALUES(1,'3','Manager'),
(2,'2,4,5','Developer'),
(3,'1','Designer');
Now, instead we need a new table:
CREATE TABLE UserRoles (Id int, UserID int, RoleID int);
Now, we can insert the proper rows into the database. As you're using SQL Server 2016, we can use STRING_SPLIT:
INSERT INTO UserRoles (UserID, RoleID)
SELECT SS.value, R.Id
FROM Roles R
CROSS APPLY STRING_SPLIT (UserID, ',') SS;
After this, if you want, you could drop your existing column using the following, however, I see no harm in leaving it at the moment:
ALTER TABLE Roles DROP COLUMN UserID;
Now, we can query the data correctly:
SELECT *
FROM Users U
JOIN UserRoles UR ON U.ID = UR.UserID
JOIN Roles R ON UR.RoleID = R.Id;
If you want to then delimit this data, you can use STUFF, but don't store it back; I've explained how to correct your data for a reason! :)
SELECT [Role],
STUFF((SELECT ',' + [Name]
FROM Users U
JOIN UserRoles UR ON U.Id = UR.UserID
WHERE UR.RoleID = R.Id
FOR XML PATH ('')),1,1,'') AS Users
FROM Roles R;
If you were using SQL Server 2017, you'd be able to use STRING_AGG
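As a sketch of what that could look like, assuming SQL Server 2017 or later and the Users, UserRoles and Roles tables created above, the STUFF / FOR XML PATH query could be rewritten with STRING_AGG. The LEFT JOINs are a choice made here so that a role with no users still appears (with a NULL list), as it does in the STUFF version:

```sql
-- Sketch only: assumes SQL Server 2017+ and the normalised tables defined above.
SELECT R.[Role],
       STRING_AGG(U.[Name], ',') WITHIN GROUP (ORDER BY U.Id) AS Users
FROM Roles R
LEFT JOIN UserRoles UR ON UR.RoleID = R.Id
LEFT JOIN Users U ON U.Id = UR.UserID
GROUP BY R.Id, R.[Role];
```

WITHIN GROUP makes the order of names inside each list explicit; without it the order is not guaranteed.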
Clean up script:
DROP TABLE UserRoles;
DROP TABLE Users;
DROP TABLE Roles;
Try this solution:
declare @users table (Id int, Name varchar(100))
declare @role table (Id int, UserId varchar(100), [Role] varchar(100))
insert into @users values
(1, 'Lisa'),
(2, 'Paul'),
(3, 'John'),
(4, 'Mike'),
(5, 'Tom')
insert into @role values
(1, '3', 'Manager'),
(2, '2,4,5', 'Developer'),
(3, '1', 'Designer')
select * from @role [r]
join @users [u] on
CHARINDEX(',' + cast([u].Id as varchar(3)) + ',', ',' + [r].UserId + ',', 1) > 0
I joined both tables based on the occurrence of Id in UserId. To make this possible and avoid false matches (such as 2 matching 12), I decided to match only IDs surrounded by commas. That's why the query wraps Id in commas and also wraps UserId in commas, so that IDs at the beginning and the end of UserId are matched too.
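A small illustration of why the commas matter, using a made-up list '12,4,5' (not part of the sample data) and plain CHARINDEX calls:

```sql
-- Illustrative values only; '12,4,5' is a hypothetical UserId list.
SELECT CHARINDEX('2',   '12,4,5')             AS NaiveMatch,   -- 2: false positive, '2' is found inside '12'
       CHARINDEX(',2,', ',' + '12,4,5' + ',') AS WrappedMatch, -- 0: no stand-alone 2 in the list
       CHARINDEX(',5,', ',' + '12,4,5' + ',') AS LastIdMatch;  -- 6: the last ID matches thanks to the trailing comma
```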
This query should give you a satisfying result, but to match your desired output exactly, you have to wrap it in a CTE and perform a GROUP BY with string concatenation:
;with cte as (
select [r].Id, [r].Role, [u].Name from @role [r]
join @users [u] on
CHARINDEX(',' + cast([u].Id as varchar(3)) + ',', ',' + [r].UserId + ',', 1) > 0
)
select Id,
(select Name + ',' from cte where Id = [c].Id for xml path('')) [Name],
--On SQL Server 2017+ STRING_AGG should also work; if so, just pick one column from these two
string_agg(Name, ',') [Name2],
Role
from cte [c]
group by Id, Role

T-SQL - Concatenation of names on TWO tables/orphans

I'm prepared to be crucified for asking my first question on SO and what is a potentially duplicate question, but I cannot find it for the life of me.
I have three tables, a product table, a linking table, and a child table with names. Preloaded on SQLFiddle >> if I still have your attention.
CREATE TABLE Product (iProductID int NOT NULL PRIMARY KEY
, sProductName varchar(50) NOT NULL
, iPartGroupID int NOT NULL)
INSERT INTO Product VALUES
(10001, 'Avionic Tackle', '1'),
(10002, 'Eigenspout', '2'),
(10003, 'Impulse Polycatalyst', '3'),
(10004, 'O-webbing', '2'),
(10005, 'Ultraservo', '3'),
(10006, 'Yttrium Coil', '5')
CREATE TABLE PartGroup (iPartGroupID int NOT NULL
, iChildID int NOT NULL)
INSERT INTO PartGroup VALUES
(1, 1),
(2, 2),
(3, 1),
(3, 2),
(3, 3),
(3, 4),
(4, 5),
(4, 6),
(5, 1)
CREATE TABLE PartNames (iChildID int NOT NULL PRIMARY KEY
, sPartNameText varchar(50) NOT NULL)
INSERT INTO PartNames VALUES
(1, 'Bulbcap Lube'),
(2, 'Chromium Deltaquartz'),
(3, 'Dilation Gyrosphere'),
(4, 'Fliphose'),
(5, 'G-tightener Bypass'),
(6, 'Heisenberg Shuttle')
I am trying to find out how to list all the part groups (that may or may not belong to a product), and translate their child names. That is, how do I use only the linking table and child name table to list all the translated elements of the linking table. I am trying to find orphans.
I have two queries:
SELECT P.iPartGroupID
,STUFF(
(SELECT
CONCAT(', ', PN.sPartNameText)
FROM PartGroup PG
INNER JOIN PartNames PN ON PN.iChildID = PG.iChildID
WHERE PG.iPartGroupID = P.iPartGroupID
FOR XML PATH(''), TYPE
).value('.', 'VARCHAR(MAX)')
, 1, 2, ''
) AS [Child Elements]
FROM Product P
GROUP BY P.iPartGroupID
This lists all the part groups that belong to a product, and their child elements by name. iPartGroupID = 4 is not here.
I also have:
SELECT PG.iPartGroupID
,STUFF(
(SELECT
CONCAT(', ', PGList.iChildID)
FROM PartGroup PGList
WHERE PGList.iPartGroupID = PG.iPartGroupID
FOR XML PATH(''), TYPE
).value('.', 'VARCHAR(MAX)')
, 1, 2, ''
) AS [Child Elements]
FROM PartGroup PG
GROUP BY PG.iPartGroupID
This lists all the part groups, and their child elements by code. iPartGroupID = 4 is covered here, but the names aren't translated.
What query can I use to list the orphan part groups (and also the orphan parts):
4 G-tightener Bypass, Heisenberg Shuttle
Ideally it is included in a list of all the other part groups, but if not, I can union the results.
Every other SO question I've looked up uses either 3 tables, or only 1 table, self joining with aliases. Does anyone have any ideas?
No XML in the part names, no particular preference for CONCAT or SELECT '+'.
I would link to other posts, but I can't without points :(
I'm not entirely sure what you mean, exactly, by the word "translate". And your required output seems to contradict your sample data (unless I'm missing something).
Nevertheless, try this query, maybe it's what you need:
select sq.iPartGroupID, cast((
select pn.sPartNameText + ',' as [data()] from #PartNames pn
inner join #PartGroup p on pn.iChildID = p.iChildID
where p.iPartGroupID = sq.iPartGroupID
order by pn.iChildID
for xml path('')
) as varchar(max)) as [GroupList]
from (select distinct pg.iPartGroupID from #PartGroup pg) sq
left join #Product pr on sq.iPartGroupID = pr.iPartGroupID
where pr.iProductID is null;
You can get the answer you want in the following way:
SELECT pg.iPartGroupID,
CASE COUNT(pg.iPartGroupID)
WHEN 1 THEN (
SELECT pn2.sPartNameText
FROM PartNames pn2
WHERE pn2.iChildID = MIN(pg.iChildID)
)
ELSE (
SELECT CASE ROW_NUMBER() OVER(ORDER BY(SELECT 1))
WHEN 1 THEN ''
ELSE ','
END + pn2.sPartNameText
FROM PartNames pn2
INNER JOIN PartGroup pg2
ON pg2.iChildID = pn2.iChildID
WHERE pg2.iPartGroupID = pg.iPartGroupID
FOR XML PATH('')
)
END
FROM PartGroup pg
GROUP BY
pg.iPartGroupID

Need to convert a recursive CTE query to an index friendly query

After going through all the hard work of writing a recursive CTE query to meet my needs, I realize I can't use it because it doesn't work in an indexed view. So I need something else to replace the CTE below. (Yes you can use a CTE in a non-indexed view, but that's too slow for me).
The requirements:
My ultimate goal is to have a self updating indexed view (it doesn't have to be a view, but something similar)... that is, if data changes in any of the tables the view joins on, then the view needs to update itself.
The view needs to be indexed because it has to be very fast, and the data doesn't change very frequently. Unfortunately, the non-indexed view using a CTE takes 3-5 seconds to run which is way too long for my needs. I need the query to run in milliseconds. The recursive table has a few hundred thousand records in it.
As far as my research has taken me, the best solution to meet all these requirements is an indexed view, but I'm open to any solution.
The CTE can be found in the answer to my other post.
Or here it is again:
DECLARE @tbl TABLE (
Id INT
,[Name] VARCHAR(20)
,ParentId INT
)
INSERT INTO @tbl( Id, Name, ParentId )
VALUES
(1, 'Europe', NULL)
,(2, 'Asia', NULL)
,(3, 'Germany', 1)
,(4, 'UK', 1)
,(5, 'China', 2)
,(6, 'India', 2)
,(7, 'Scotland', 4)
,(8, 'Edinburgh', 7)
,(9, 'Leith', 8)
;
DECLARE @tbl2 table (id int, abbreviation varchar(10), tbl_id int)
INSERT INTO @tbl2( Id, Abbreviation, tbl_id )
VALUES
(100, 'EU', 1)
,(101, 'AS', 2)
,(102, 'DE', 3)
,(103, 'CN', 5)
;WITH abbr AS (
SELECT a.*, isnull(b.abbreviation,'') abbreviation
FROM @tbl a
left join @tbl2 b on a.Id = b.tbl_id
), abcd AS (
-- anchor
SELECT id, [Name], ParentID,
CAST(([Name]) AS VARCHAR(1000)) [Path],
cast(abbreviation as varchar(max)) abbreviation
FROM abbr
WHERE ParentId IS NULL
UNION ALL
--recursive member
SELECT t.id, t.[Name], t.ParentID,
CAST((a.path + '/' + t.Name) AS VARCHAR(1000)) [Path],
isnull(nullif(t.abbreviation,'')+',', '') + a.abbreviation
FROM abbr AS t
JOIN abcd AS a
ON t.ParentId = a.id
)
SELECT *, [Path] + ':' + abbreviation
FROM abcd
After hitting all the roadblocks with indexed views (self joins, CTEs, UDFs accessing data, etc.), I propose the following as a solution for you.
Create support function
Based on maximum depth of 4 from root (5 total). Or use a CTE
CREATE FUNCTION dbo.GetHierPath(@hier_id int) returns varchar(max)
WITH SCHEMABINDING
as
begin
return (
select FullPath =
isnull(H5.Name+'/','') +
isnull(H4.Name+'/','') +
isnull(H3.Name+'/','') +
isnull(H2.Name+'/','') +
H1.Name
+
':'
+
isnull(STUFF(
isnull(','+A1.abbreviation,'') +
isnull(','+A2.abbreviation,'') +
isnull(','+A3.abbreviation,'') +
isnull(','+A4.abbreviation,'') +
isnull(','+A5.abbreviation,''),1,1,''),'')
from dbo.HIER H1
left join dbo.ABBR A1 on A1.hier_id = H1.Id
left join dbo.HIER H2 on H1.ParentId = H2.Id
left join dbo.ABBR A2 on A2.hier_id = H2.Id
left join dbo.HIER H3 on H2.ParentId = H3.Id
left join dbo.ABBR A3 on A3.hier_id = H3.Id
left join dbo.HIER H4 on H3.ParentId = H4.Id
left join dbo.ABBR A4 on A4.hier_id = H4.Id
left join dbo.HIER H5 on H4.ParentId = H5.Id
left join dbo.ABBR A5 on A5.hier_id = H5.Id
where H1.id = @hier_id)
end
GO
Add columns to the table itself
For example the fullpath column, if you need, add the other 2 columns in the CTE by splitting the result of dbo.GetHierPath on ':' (left=>path, right=>abbreviations)
-- index maximum key length is 900, based on your data, 400 is enough
ALTER TABLE HIER ADD FullPath VARCHAR(400)
Maintain the columns
Because of the hierarchical nature of the data, a record X could be deleted that affects a descendant Y and an ancestor Z, which is quite hard to identify in either INSTEAD OF or AFTER triggers. So the alternative approach is based on these conditions:
if data changes in any of the tables the view joins on, then the view needs to update itself.
the non-indexed view using a CTE takes 3-5 seconds to run which is way too long for my needs
We maintain the data simply by running through the entire table again, taking 3-5 seconds per update (or faster if the 5-join query works out better).
CREATE TRIGGER TG_HIER
ON HIER
AFTER INSERT, UPDATE, DELETE
AS
UPDATE HIER
SET FullPath = dbo.GetHierPath(HIER.Id)
Finally, index the new column(s) on the table itself
create index ix_hier_fullpath on HIER(FullPath)
If you intended to access the path data via the id, then it is already in the table itself without adding an additional index.
The above TSQL references these objects
Modify the table and column names to suit your schema.
CREATE TABLE dbo.HIER (Id INT Primary Key Clustered, [Name] VARCHAR(20) ,ParentId INT)
;
INSERT dbo.HIER( Id, Name, ParentId ) VALUES
(1, 'Europe', NULL)
,(2, 'Asia', NULL)
,(3, 'Germany', 1)
,(4, 'UK', 1)
,(5, 'China', 2)
,(6, 'India', 2)
,(7, 'Scotland', 4)
,(8, 'Edinburgh', 7)
,(9, 'Leith', 8)
,(10, 'Antartica', NULL)
;
CREATE TABLE dbo.ABBR (id int primary key clustered, abbreviation varchar(10), hier_id int)
;
INSERT dbo.ABBR( Id, Abbreviation, hier_id ) VALUES
(100, 'EU', 1)
,(101, 'AS', 2)
,(102, 'DE', 3)
,(103, 'CN', 5)
GO
EDIT - Possibly faster alternative
Given that all records are recalculated each time, there is no real need for a function that returns the FullPath for a single HIER.ID. The query in the support function can be used without the where H1.id = @hier_id filter at the end. Furthermore, the expression for FullPath can easily be split down the middle into PathOnly and Abbreviation. Or just use the original CTE, whichever is faster.
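A minimal sketch of that set-based variant, assuming the dbo.HIER and dbo.ABBR tables from this answer, the FullPath column added earlier, a maximum depth of 5, and at most one ABBR row per HIER row (with more than one, the value picked for a row would be arbitrary). It could serve as the body of the trigger in place of the per-row function call:

```sql
-- Sketch only: same joins as dbo.GetHierPath, minus the single-Id filter,
-- so every row's FullPath is recomputed in one statement.
UPDATE H1
SET FullPath =
      ISNULL(H5.Name + '/', '') +
      ISNULL(H4.Name + '/', '') +
      ISNULL(H3.Name + '/', '') +
      ISNULL(H2.Name + '/', '') +
      H1.Name
      + ':' +
      ISNULL(STUFF(
          ISNULL(',' + A1.abbreviation, '') +
          ISNULL(',' + A2.abbreviation, '') +
          ISNULL(',' + A3.abbreviation, '') +
          ISNULL(',' + A4.abbreviation, '') +
          ISNULL(',' + A5.abbreviation, ''), 1, 1, ''), '')
FROM dbo.HIER H1
LEFT JOIN dbo.ABBR A1 ON A1.hier_id = H1.Id
LEFT JOIN dbo.HIER H2 ON H1.ParentId = H2.Id
LEFT JOIN dbo.ABBR A2 ON A2.hier_id = H2.Id
LEFT JOIN dbo.HIER H3 ON H2.ParentId = H3.Id
LEFT JOIN dbo.ABBR A3 ON A3.hier_id = H3.Id
LEFT JOIN dbo.HIER H4 ON H3.ParentId = H4.Id
LEFT JOIN dbo.ABBR A4 ON A4.hier_id = H4.Id
LEFT JOIN dbo.HIER H5 ON H4.ParentId = H5.Id
LEFT JOIN dbo.ABBR A5 ON A5.hier_id = H5.Id;
```

Whether this beats the function or the original CTE would need to be measured on the real data.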

SQL - Ordering by multiple criteria

I have a table of categories. Each category can either be a root level category (parent is NULL), or have a parent which is a root level category. There can't be more than one level of nesting.
I have the following table structure:
Categories Table Structure http://img16.imageshack.us/img16/8569/categoriesi.png
Is there any way I could use a query which produced the following output:
Free Stuff
Hardware
Movies
CatA
CatB
CatC
Software
Apples
CatD
CatE
So the results are ordered by top level category, then after each top level category, subcategories of that category are listed?
It's not really ordering by Parent or Name, but a combo of the two. I'm using SQL Server.
It seems to me like you are looking to flatten and order your hierarchy, the cheapest way to get this ordering would be to store an additional column in the table that has the full path.
So for example:
Name | Full Path
Free Stuff | Free Stuff
aa2 | Free Stuff - aa2
Once you store the full path, you can order on it.
If you only have a depth of one you can auto generate a string to this effect with a single subquery (and order on it), but this solution does not work that easily when it gets deep.
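For the depth-of-one case, a hedged sketch of that idea, written here with a self join rather than a subquery and assuming a table categories(ID, Name, Parent) as described in the question:

```sql
-- Sketch only: assumes categories(ID, Name, Parent) with at most one level of nesting.
SELECT c.Name
FROM categories c
LEFT JOIN categories p ON p.ID = c.Parent
ORDER BY COALESCE(p.Name, c.Name),                     -- a root and its children share the root's name as sort key
         CASE WHEN c.Parent IS NULL THEN 0 ELSE 1 END, -- the root row comes before its children
         c.Name;                                       -- children sorted alphabetically
```

The first sort key plays the role of the stored full path: every row sorts under the name of its top-level category.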
Another option, is to move this all over to a temp table and calculate the full path there, on demand. But it is fairly expensive.
You could make the table look at itself, ordering by the parent Name then the child Name.
select categories.Name AS DisplayName
from categories LEFT OUTER JOIN
categories AS parentTable ON categories.Parent = parentTable.ID
order by parentTable.Name, DisplayName
Ok, here we go :
with foo as
(
select 1 as id, null as parent, 'CatA' as cat from dual
union select 2, null, 'CatB' from dual
union select 3, null, 'CatC' from dual
union select 4, 1, 'SubCatA_1' from dual
union select 5, 1, 'SubCatA_2' from dual
union select 6, 2, 'SubCatB_1' from dual
union select 7, 2, 'SubCatB_2' from dual
)
select child.cat
from foo parent right outer join foo child on parent.id = child.parent
order by case when parent.id is not null then parent.cat else child.cat end,
case when parent.id is not null then 1 else 0 end
Result :
CatA
SubCatA_1
SubCatA_2
CatB
SubCatB_1
SubCatB_2
CatC
Edit - Solution changed, inspired by van's ORDER BY! Much simpler that way.
Not entirely sure of your questions but it sounds like PARTITION BY might be useful for you. There's a good introductory post on PARTITION BY here.
Here is a complete working example using a recursive common table expression.
DECLARE @categories TABLE
(
ID INT NOT NULL,
[Name] VARCHAR(50),
Parent INT NULL
);
INSERT INTO @categories VALUES (4, 'Free Stuff', NULL);
INSERT INTO @categories VALUES (1, 'Hardware', NULL);
INSERT INTO @categories VALUES (3, 'Movies', NULL);
INSERT INTO @categories VALUES (2, 'Software', NULL);
INSERT INTO @categories VALUES (10, 'a', 0);
INSERT INTO @categories VALUES (12, 'apples', 2);
INSERT INTO @categories VALUES (8, 'catD', 2);
INSERT INTO @categories VALUES (9, 'catE', 2);
INSERT INTO @categories VALUES (5, 'catA', 3);
INSERT INTO @categories VALUES (6, 'catB', 3);
INSERT INTO @categories VALUES (7, 'catC', 3);
INSERT INTO @categories VALUES (11, 'aa2', 4);
WITH categories(ID, Name, Parent, HierarchicalName)
AS
(
SELECT
c.ID
, c.[Name]
, c.Parent
, CAST(c.[Name] AS VARCHAR(200)) AS HierarchicalName
FROM @categories c
WHERE c.Parent IS NULL
UNION ALL
SELECT
c.ID
, c.[Name]
, c.Parent
, CAST(pc.HierarchicalName + c.[Name] AS VARCHAR(200))
FROM @categories c
JOIN categories pc ON c.Parent = pc.ID
)
SELECT c.*
FROM categories c
ORDER BY c.HierarchicalName
SELECT
ID,
Name,
Parent,
RIGHT(
'000000000000000' +
CASE WHEN Parent IS NULL
THEN CONVERT(VARCHAR, Id)
ELSE CONVERT(VARCHAR, Parent)
END, 15
)
+ '_' + CASE WHEN Parent IS NULL THEN '0' ELSE '1' END
+ '_' + Name
FROM
categories
ORDER BY
4
The long padding is to account for the fact that SQL Server's INT data type ranges from -2,147,483,648 through 2,147,483,647.
You can ORDER BY the expression directly, no need to use ORDER BY 4. It was just to show what it is sorting on.
It is worth noting that this expression cannot use any index. This means sorting a large table will be slow.
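As a sketch of ordering without the helper column: the same three sort keys can be listed in ORDER BY directly. This assumes the categories(ID, Name, Parent) table from the question and non-negative IDs; with separate keys the ID is compared as a number, so the zero padding is no longer needed.

```sql
-- Sketch only: assumes categories(ID, Name, Parent) and non-negative IDs.
SELECT ID, Name, Parent
FROM categories
ORDER BY
    COALESCE(Parent, ID),                        -- a root category and its children share one key
    CASE WHEN Parent IS NULL THEN 0 ELSE 1 END,  -- the root row sorts before its children
    Name;                                        -- children sorted by name
```

Like the concatenated expression, this orders the top-level categories by ID rather than by name.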