Exclusive-Or in SQL Many to Many Relationship - sql

I have a scenario where I want to find a list of records in a table joined to another through a many to many relationship using an exclusive-or type of relationship. Given the contrived example below, I need a list of categories that are assigned to at least one article, but not all articles. I could brute force this by looping through all of the categories, but that's extremely inefficient. Is there a nice clean way to do this in T-SQL on MS SQL Server?
CREATE TABLE [dbo].[ArticleCategories](
[ArticleId] [int] NOT NULL,
[CategoryId] [int] NOT NULL,
CONSTRAINT [PK_ArticleCategories] PRIMARY KEY CLUSTERED
(
[ArticleId] ASC,
[CategoryId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
CREATE TABLE [dbo].[Articles](
[Id] [int] NOT NULL,
[Title] [nvarchar](100) NOT NULL,
CONSTRAINT [PK_Articles] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
CREATE TABLE [dbo].[Categories](
[Id] [int] NOT NULL,
[CategoryName] [nvarchar](100) NULL,
CONSTRAINT [PK_Categories] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
insert into Articles ( Id, Title ) values ( 1, 'Jon Snow')
insert into Articles ( Id, Title ) values ( 2, 'Joffry Baratheon')
insert into Articles ( Id, Title ) values ( 3, 'Cercei Lanister')
insert into Articles ( Id, Title ) values ( 4, 'Sansa Stark')
insert into Articles ( Id, Title ) values ( 5, 'Khal Drogo')
insert into Articles ( Id, Title ) values ( 6, 'Ramsey Bolton')
insert into Articles ( Id, Title ) values ( 7, 'Melisandre')
insert into Categories ( Id, CategoryName ) values ( 1, 'Orange')
insert into Categories ( Id, CategoryName ) values ( 2, 'Blue')
insert into Categories ( Id, CategoryName ) values ( 3, 'Purple')
insert into Categories ( Id, CategoryName ) values ( 4, 'Green')
insert into Categories ( Id, CategoryName ) values ( 5, 'Violet')
insert into Categories ( Id, CategoryName ) values ( 6, 'Yellow')
insert into Categories ( Id, CategoryName ) values ( 7, 'Black')
insert into ArticleCategories (ArticleId, CategoryId) values (1, 1 )
insert into ArticleCategories (ArticleId, CategoryId) values (2, 1 )
insert into ArticleCategories (ArticleId, CategoryId) values (3, 1 )
insert into ArticleCategories (ArticleId, CategoryId) values (4, 1 )
insert into ArticleCategories (ArticleId, CategoryId) values (5, 1 )
insert into ArticleCategories (ArticleId, CategoryId) values (6, 1 )
insert into ArticleCategories (ArticleId, CategoryId) values (7, 1 )
insert into ArticleCategories (ArticleId, CategoryId) values (2, 2 )
insert into ArticleCategories (ArticleId, CategoryId) values (3, 2 )
insert into ArticleCategories (ArticleId, CategoryId) values (5, 3 )
insert into ArticleCategories (ArticleId, CategoryId) values (7, 3 )
In this scenario, the query would not return the category 'Orange' because it is assigned to all of the Articles. It would return 'Blue' and 'Purple' because they are assigned to at least one article, but not all. The other categories will not return at all because they aren't assigned at all.
The expected results would be:
2|Blue
3|Purple
Updated to include sample data and expected output.

The conditions can be tested without joins. The join is only necessary for the category name in the result
SELECT AC.CategoryId, C.CategoryName
FROM
ArticleCategories AC
INNER JOIN Categories C
ON AC.CategoryId = C.CategoryId
GROUP BY AC.CategoryID
HAVING Count(*) < (SELECT Count(*) FROM Articles)
The table ArticleCategories contains only information on groups that have been assigned to an article at least once, therefore no extra condition is required for this.
Since the Primary Key of ArticleCategories includes both columns (ArticleId and CategoryId) there can be no duplicate article per category. Therefore, the count per category is equal to the number of articles this category has been assigned to.
Note that I am using the HAVING-clause, not the WHERE-clause. The WHERE-clause is applied before grouping. The HAVING-clause is applied after grouping and can refer to aggregate results.

Using your sample: http://rextester.com/THCJ13143
and a query using group by and having:
SELECT AC.CategoryID, c.categoryName
FROM ArticleCategories AC
LEFT JOIN Categories C
on C.ID = AC.CategoryID
GROUP BY AC.CategoryID, c.Categoryname
HAVING count(AC.ArticleID) < (SELECT count(*) FROM Articles)
We get:
CategoryID categoryName
2 Blue
3 Purple

Related

Filter by Id - Exact match or get the default record

Filter the table by exact match if the match is not exist filter by default Id. Consider the following Table:
Table structure:
CREATE TABLE [dbo].[ProductGroupData]
(
[Id] TINYINT NOT NULL,
[Name] NVARCHAR(50) NOT NULL,
CONSTRAINT [PK_dbo_ProductGroupData]
PRIMARY KEY CLUSTERED ([Id] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY];
CREATE TABLE [dbo].[ProductData]
(
[Id] INT NOT NULL,
[GroupId] TINYINT NOT NULL,
[TypeId] TINYINT NOT NULL,
[Product] NVARCHAR(50) NOT NULL,
CONSTRAINT [PK_dbo_ProductData]
PRIMARY KEY CLUSTERED ([Id] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY];
Sample data:
INSERT INTO [dbo].[ProductGroupData] VALUES (1, N'Apple Box');
INSERT INTO [dbo].[ProductGroupData] VALUES (2, N'Orange Box');
INSERT INTO [dbo].[ProductData] VALUES (1, 1, 1, N'Apple #1');
INSERT INTO [dbo].[ProductData] VALUES (2, 1, 3, N'Apple #3');
INSERT INTO [dbo].[ProductData] VALUES (3, 1, 4, N'Apple #4');
INSERT INTO [dbo].[ProductData] VALUES (4, 1, 5, N'Apple #5');
INSERT INTO [dbo].[ProductData] VALUES (5, 2, 1, N'Orange #1');
INSERT INTO [dbo].[ProductData] VALUES (6, 2, 5, N'Orange #5');
The [TypeId] rages from 1 to 5 and the default [TypeId] is 1; If the match is not exist need to return the result with filter [TypeId] is 1 in the Same SELECT statement. Here I projected the table in the explanatory purpose but in actual scenario I used this logic in INNER JOIN
I tried with the following scenarios
Scenario #1:
DECLARE #GroupId TINYINT = 1;
DECLARE #DefaultTypeId TINYINT = 1;
DECLARE #TypeId TINYINT = 2;
SELECT PD.*
FROM [dbo].[ProductGroupData] PGD
INNER JOIN [dbo].[ProductData] PD ON PD.[GroupId] = PGD.[Id]
WHERE PD.[GroupId] = #GroupId
AND (PD.[TypeId] = #TypeId OR PD.[TypeId] = #DefaultTypeId);
Scenario #2:
DECLARE #TypeId TINYINT = 3;
This above statement i.e., Scenario #1 works fine for the missing Id's and the default Id, If I tried the Scenario #2 SELECT statement it returns two rows:
The expected result is
How to perform this in a Single SELECT Statement. Please assist.
You can use the following query:
;WITH CTE AS
(
SELECT ID FROM [dbo].[ProductData] WHERE [GroupId] = #GroupId AND [TypeId] = #TypeId
)
SELECT PD.*
FROM [dbo].[ProductGroupData] PGD
INNER JOIN [dbo].[ProductData] PD ON PD.[GroupId] = PGD.[Id]
WHERE
(PD.ID IN (SELECT ID FROM CTE))
OR
(NOT EXISTS(SELECT 1 FROM CTE) AND PD.[GroupId] = #GroupId AND PD.[TypeId] = #DefaultTypeId)
The first filter (exact match) is applied by the CTE. The second filter (default record) is applied by the main query only when CTE returns nothing.

Write SQL to identify multiple subgroupings within a grouping

I have a program that summarizes non-normalized data in one table and moves it to another and we frequently get a duplicate key violation on the insert due to bad data. I want to create a report for the users to help them identify the cause of the error.
For example, consider the following contrived simple SQL which summarizes data in the table Companies and inserts it into CompanySum, which has a primary key of State/Zone. In order for the INSERT not to fail, there cannot be more than one distinct combinations of Company/Code for every unique primary key State/Zone combination. If there is, we want the insert to fail so that the data can be corrected.
INSERT INTO CompanySum
(
[State]
,[Zone]
,[Company]
,[Code]
,[Revenue]
)
SELECT
--Keys of target
[State]
,[Zone]
--We are expecting to have one distinct combination of these fields per key grouping
,[Company]
,[Code]
--Aggregate
,SUM([Revenue])
FROM COMPANIES
GROUP BY
[State]
,[Zone]
,[Company]
,[Code]
I would like to create a report to help the users easily identify and correct the data so that there is only one distinct Company/Code combination within a State/Zone. For each distinct State/Zone value, I would like to identify the distinct Company/Code combinations within the State/Zone. If there are more than one Company/Code combinations within a State/Zone, I would like all of the records in the State/Zone to be displayed in the output. For example, here is the sample input and desired output:
Data:
RecordNumber State Zone Company Code Revenue
------------ ----- ---- ------- ---- --------
1 CT B State of CT 65453 10
2 CT B State of CT 65453 3
3 CT B Travelers 33443 20
4 CT C Cigna 45678 24
5 CT C Cigna 45678 234
6 MI A GM 48089 100
7 MI A GM 54555 200
8 MI B Chrysler 43434 44
Desired Output:
RecordNumber State Zone Company Code Revenue
------------ ----- ---- ------- ---- --------
1 CT B State of CT 65453 10
2 CT B State of CT 65453 3
3 CT B Travelers 33443 20
6 MI A GM 48089 100
7 MI A GM 54555 200
Here is the DDL and DML needed to create this test scenario
CREATE TABLE [dbo].[Companies](
[RecordNumber] [int] NULL,
[State] [char](2) NOT NULL,
[Zone] [varchar](30) NOT NULL,
[Company] [varchar](30) NOT NULL,
[Code] [varchar](30) NOT NULL,
[Revenue] [numeric](9, 1) NULL
) ON [PRIMARY]
CREATE TABLE [dbo].[CompanySum](
[State] [char](2) NOT NULL,
[Zone] [varchar](30) NOT NULL,
[Company] [varchar](30) NOT NULL,
[Code] [varchar](30) NOT NULL,
[Revenue] [numeric](9, 1) NULL,
CONSTRAINT [PK_CompanySum] PRIMARY KEY CLUSTERED
(
[State] ASC,
[Zone] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
DELETE FROM [dbo].[Companies]
GO
INSERT [dbo].[Companies] ([RecordNumber], [State], [Zone], [Company], [Code], [Revenue]) VALUES (1, N'CT', N'B', N'State of CT', N'65453', CAST(10.0 AS Numeric(9, 1)))
GO
INSERT [dbo].[Companies] ([RecordNumber], [State], [Zone], [Company], [Code], [Revenue]) VALUES (2, N'CT', N'B', N'State of CT', N'65453', CAST(3.0 AS Numeric(9, 1)))
GO
INSERT [dbo].[Companies] ([RecordNumber], [State], [Zone], [Company], [Code], [Revenue]) VALUES (3, N'CT', N'B', N'Travelers', N'33443', CAST(20.0 AS Numeric(9, 1)))
GO
INSERT [dbo].[Companies] ([RecordNumber], [State], [Zone], [Company], [Code], [Revenue]) VALUES (4, N'CT', N'C', N'Cigna', N'45678', CAST(24.0 AS Numeric(9, 1)))
INSERT [dbo].[Companies] ([RecordNumber], [State], [Zone], [Company], [Code], [Revenue]) VALUES (5, N'CT', N'C', N'Cigna', N'45678', CAST(234.0 AS Numeric(9, 1)))
GO
INSERT [dbo].[Companies] ([RecordNumber], [State], [Zone], [Company], [Code], [Revenue]) VALUES (6, N'MI', N'A', N'GM', N'48089', CAST(100.0 AS Numeric(9, 1)))
GO
INSERT [dbo].[Companies] ([RecordNumber], [State], [Zone], [Company], [Code], [Revenue]) VALUES (7, N'MI', N'A', N'GM', N'54555', CAST(200.0 AS Numeric(9, 1)))
GO
INSERT [dbo].[Companies] ([RecordNumber], [State], [Zone], [Company], [Code], [Revenue]) VALUES (8, N'MI', N'B', N'Chrysler', N'43434', CAST(44.0 AS Numeric(9, 1)))
GO
This is a hopefully better re-construction of a previous post of mine SQL to return unique combinations of non key columns within a set of key columns where I am trying to help clarify the question and provide a simple working example that readers can use.
Please see this SQL Fiddle:
http://sqlfiddle.com/#!18/d0141/1
Is this a solution?
Fiddle: http://sqlfiddle.com/#!18/12e9a0/9
select c.*
from
Companies c
inner join (
select State, Zone
from Companies
group by State, Zone
having count(distinct Company + Code) > 1
) as dup_state_zone
on(
c.State = dup_state_zone.State
and c.Zone = dup_state_zone.Zone
)
Edited - Fix the having clause, with a little cheat...
I used windows ranking function to rank the records by state ordering by zone ascending, to get the desired output.
Suggestion: I would like to say that the insert statement of your CompanySum will ail due to your primary key constraint as you select duplicate key records. in this case you need to change your primary key constraint a little.
CONSTRAINT [PK_CompanySum] PRIMARY KEY CLUSTERED
(
[State] ASC,
[Zone] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Since State and zone both are with duplicate values this insert will fail. better add a auto increment primary key, or include RecordNumber in to Primary key constraint rather than using State and Zone to make it usnique as there are duplicate values in your desired output.
SELECT
A.[RecordNumber]
,A.[State]
,A.[Zone]
,A.[Company]
,A.Code
,A.Revenue
FROM
(
SELECT *
,RANK() OVER (PARTITION BY [STATE] ORDER BY Zone) AS [row]
FROM Companies
) AS A
WHERE [row] =1
Highlighted are duplicates which will make your insert fail.

Table not updating in single select query

i have a temporary table which i need to update, the first row is updated but the second row updates as null , please help
declare #T Table
(
ID int,
Name nvarchar(20),
rownum int
)
insert into #T(ID,rownum)
select ID, rownum = ROW_NUMBER() OVER(order by id) from testtabel4
select * from testtabel4
update #t
set Name=case when rownum>1 then (select top 1 Name from #T x where x.rownum=(y.rownum-1))
else 'first' end
from #t y
select * from #T
and here the definition of testtabel4
CREATE TABLE [dbo].[testtabel4](
[ID] [int] IDENTITY(1,1) NOT NULL,
[Name] [nvarchar](80) NULL,
PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
and here is the output
ID Name
1 first
2 NULL
I think your update would be better written with lag() and an updateable CTE.
with cte as (
select name, lag(name, 1, 'first') over(order by rownum) lag_name
from #t
)
update cte set name = lag_name
With this technique at hand, it is plain to see that don't actually need to feed the table first, then insert into it. You can do both at once, like so:
insert into #t (id, name, rownum)
select
id,
lag(name, 1, 'first') over(order by id),
row_number() over(order by id)
from testtabel4
I am not sure that you even need rownum column anymore, unless it is needed for some other purpose.
You are only inserting two columns in the #T:
insert into #T (ID, rownum)
select ID, ROW_NUMBER() OVER (order by id)
from testtabel4;
You are not inserting name so it is NULL on all rows. Hence, the then part of the case expression will always be NULL.

How to convert Rows to Columns in Sql

I've a table Columns
and a second table Response in which all data is saved.
Now I want to create a SQL View in which the result should be like this
I tried using pivot
select UserId ,FromDate, ToDate, Project, Comment
from
(
select R.UserId ,R.Text , C.ColumnName
from [Columns] C
INNER JOIN Response R ON C.Id=R.ColumnId
) d
pivot
(
max(Text)
for ColumnName in (FromDate, ToDate, Project, Comment)
) piv;
but that didn't worked for me, I also referred this Efficiently convert rows to columns in sql server but was not able to implement it. Any ideas how to achieve the same in SQL View?
Scripts for Tables:
CREATE TABLE [dbo].[Columns](
[Id] [bigint] IDENTITY(1,1) NOT NULL,
[Name] [nvarchar](1000) NULL,
[IsActive] [bit] NULL,
CONSTRAINT [PK_Columns] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
insert into [Columns] values('FromDate',1)
insert into [Columns] values('ToDate',1)
insert into [Columns] values('Project',1)
insert into [Columns] values('Comment',1)
CREATE TABLE [dbo].[Response](
[Id] [bigint] IDENTITY(1,1) NOT NULL,
[UserId] [bigint] NOT NULL,
[ColumnId] [bigint] NOT NULL,
[Text] [nvarchar](max) NULL,
[IsActive] [bit] NULL,
CONSTRAINT [PK_Response] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
insert into [Response] values(1,1,'1/1/2012',1)
insert into [Response] values(1,2,'1/2/2012',1)
insert into [Response] values(1,3,'p1',1)
insert into [Response] values(1,4,'c1',1)
insert into [Response] values(2,1,'1/1/2013',1)
insert into [Response] values(2,2,'1/2/2013',1)
insert into [Response] values(2,3,'p2',1)
insert into [Response] values(2,4,'c2',1)
insert into [Response] values(2,1,'1/1/2014',1)
insert into [Response] values(2,2,'1/2/2014',1)
insert into [Response] values(2,3,'p3',1)
insert into [Response] values(2,4,'c3',1)
insert into [Response] values(3,1,'1/1/2015',1)
insert into [Response] values(3,2,'1/2/2015',1)
insert into [Response] values(3,3,'p4',1)
insert into [Response] values(3,4,'c4',1)
Honestly, if the column types aren't going to change, or you only need a subset of them, you could just filter them out and then join on them rather than write a pivot. I wrote it using a cte, but they could just as easily be sub-queries:
;with fd as
(
select
UserID,
[Text] as FromDate,
row_number() over (partition by userID order by ID) as DEDUP
from response
where ColumnID = 1
),
td as
(
select
UserID,
[Text] as ToDate,
row_number() over (partition by userID order by ID) as DEDUP
from response
where ColumnID = 2
),
p as
(
select
UserID,
[Text] as Project,
row_number() over (partition by userID order by ID) as DEDUP
from response
where ColumnID = 3
),
c as
(
select
UserID,
[Text] as Comment,
row_number() over (partition by userID order by ID) as DEDUP
from response
where ColumnID = 4
)
select
fd.*,
td.ToDate,
p.Project,
c.Comment
from fd
inner join td
on fd.UserId = td.UserId
and fd.DEDUP = td.DEDUP
inner join p
on fd.UserId = p.UserId
and fd.DEDUP = p.DEDUP
inner join c
on fd.UserId = c.UserId
and fd.DEDUP = c.DEDUP
Try this. I worked on your answer.
select UserId ,FromDate, ToDate, Project, Comment
from
(
select R.UserId ,R.RText , C.ColumnName
from [Columns] C
INNER JOIN Response R ON C.Id=R.ColumnId
) d
pivot
(
Min(Rtext)
for ColumnName in (FromDate, ToDate, Project, Comment)
) piv
UNION
select UserId ,FromDate, ToDate, Project, Comment
from
(
select R.UserId ,R.RText , C.ColumnName
from [Columns] C
INNER JOIN Response R ON C.Id=R.ColumnId
) d
pivot
(
Max(Rtext)
for ColumnName in (FromDate, ToDate, Project, Comment)
) piv;
You can query like this
;with cte as
(
select r.*,
c.name
from Response r
inner join Columns c
on r.columnid = c.id
)
select
Userid,
max([FromDate]) as [FromDate],
max([ToDate]) as [ToDate],
max([Project]) as [Project],
max([Comment]) as [Comment]
from cte
pivot
(
max(Text) for name in ([FromDate], [ToDate], [Project], [Comment])
) p
group by userid

How to get result from parent child table

Work on SQL-Server. My table structure is below
CREATE TABLE [dbo].[AgentInfo](
[AgentID] [int] NOT NULL,
[ParentID] [int] NULL,
CONSTRAINT [PK_AgentInfo] PRIMARY KEY CLUSTERED
(
[AgentID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
INSERT [dbo].[AgentInfo] ([AgentID], [ParentID]) VALUES (1, -1)
INSERT [dbo].[AgentInfo] ([AgentID], [ParentID]) VALUES (2, -1)
INSERT [dbo].[AgentInfo] ([AgentID], [ParentID]) VALUES (3, 1)
INSERT [dbo].[AgentInfo] ([AgentID], [ParentID]) VALUES (4, 2)
Required output
Use my below syntax get required output but not satisfied. Is there any better way to get the required output
--get parent child list
---step--1
SELECT *
INTO #temp1
FROM ( SELECT a.AgentID ,
a.ParentID,
a.AgentID AS BaseAgent
FROM dbo.AgentInfo a WHERE ParentID=-1
UNION ALL
SELECT a.ParentID ,
0 as AgentID,
a.AgentID AS BaseAgent
FROM dbo.AgentInfo a WHERE ParentID!=-1
UNION ALL
SELECT a.AgentID ,
a.ParentID,
a.AgentID AS BaseAgent
FROM dbo.AgentInfo a
WHERE ParentID!=-1
) AS d
SELECT * FROM #temp1
DROP TABLE #temp1
Help me to improve my syntax. If you have any questions please ask.
You could use a recursive SELECT, see the examples in the documentation for WITH, starting with example D.
The general idea within the recursive WITH is: You have a first select that is the starting point, and then a UNION ALL and a second SELECT which describes the step from on level to the next, where the previous level can either be the result of the first select or the result of the previous run of the second SELECT.
You can try this, to get a tree of the elements:
WITH CTE_AgentInfo(AgentID, ParentID, BaseAgent)
AS(
SELECT
AgentID,
ParentID,
AgentID AS BaseAgent
FROM AgentInfo
WHERE ParentID = -1
UNION ALL
SELECT
a.AgentID,
a.ParentID,
a.AgentID AS BaseAgent
FROM AgentInfo a
INNER JOIN CTE_AgentInfo c ON
c.AgentID = a.ParentID
)
SELECT * FROM CTE_AgentInfo
And here is an SQLFiddle demo to see it.
Try something like this:
WITH Merged (AgentId, ParentId) AS (
SELECT AgentId, ParentId FROM AgentInfo WHERE ParentId = -1
UNION ALL
SELECT AgentInfo.AgentId, AgentInfo.ParentId FROM AgentInfo INNER JOIN Merged ON AgentInfo.AgentId = Merged.ParentId
)
SELECT * FROM Merged
You can use a Common Table Expression to do this.
The sql statement will then look like this:
WITH [Parents]([AgentID], [ParentID], [BaseAgent])
AS
(
SELECT
[AgentID],
[ParentID],
[AgentID] AS [BaseAgent]
FROM [AgentInfo]
WHERE [ParentID] = -1
UNION ALL
SELECT
[ai].[AgentID],
[ai].[ParentID],
[p].[BaseAgent]
FROM [AgentInfo] [ai]
INNER JOIN [Parents] [p]
ON [ai].[ParentID] = [p].[AgentID]
)
SELECT *
FROM [Parents]
ORDER BY
[BaseAgent] ASC,
[AgentID] ASC
But, the results are different from your desired output, since every Agent is only listed once.
The output is:
AGENTID PARENTID BASEAGENT
1 -1 1
3 1 1
2 -1 2
4 2 2
The Fiddle is over here.
And here is a nice post on working with hierarchies: What are the options for storing hierarchical data in a relational database?