Active Directory: Convert canonicalName node value from string to integer - sql

Are there any methods available to convert the string text value contained within the AD canonicalName attribute to an incremented integer value? Or, does this need to be performed manually?
For example:
canonicalName (what I am getting)      hierarchyNode (what I need)
\domain.com\                           /1/
\domain.com\Corporate                  /1/1/
\domain.com\Corporate\Hr               /1/1/1/
\domain.com\Corporate\Accounting       /1/1/2/
\domain.com\Users\                     /1/2/
\domain.com\Users\Sales                /1/2/1/
\domain.com\Users\Whatever             /1/2/2/
\domain.com\Security\                  /1/3/
\domain.com\Security\Servers           /1/3/1/
\domain.com\Security\Administrative    /1/3/2/
\domain.com\Security\Executive         /1/3/3/
I am extracting user objects into a SQL Server database for reporting purposes. The user objects are spread throughout multiple OUs in the forest. So, by identifying the highest node on the tree that contains users, I can then use the SQL Server IsDescendantOf() method to quickly retrieve users recursively without having to write 1 + n sub-selects.
For reference: https://learn.microsoft.com/en-us/sql/t-sql/data-types/hierarchyid-data-type-method-reference
UPDATE:
I am able to convert the canonicalName from string to integer (see below, using SQL Server 2014). However, this doesn't seem to solve my problem. I have only built the branches of the tree by stripping off the leaves, so that I can call IsDescendantOf() by tree branch. But now I cannot insert the leaves in batch, because it appears I need GetDescendant(), which seems to be built for handling inserts one at a time.
How can I build the Active Directory directory tree, which resembles file system paths, as a SQL hierarchy? All the examples treat the hierarchy as an immediate parent/child relationship and use a recursive CTE to build from the root, which requires the parent/child relationship to already be known. In my case, the parent/child relationship is known only through the '/' delimiter.
-- Drop and re-create temp table(s) that are used by this procedure.
IF OBJECT_ID(N'Tempdb.dbo.#TEMP_TreeHierarchy', N'U') IS NOT NULL
BEGIN
DROP TABLE #TEMP_TreeHierarchy
END;
-- Drop and re-create temp table(s) that are used by this procedure.
IF OBJECT_ID(N'Tempdb.dbo.#TEMP_AdTreeHierarchyNodeNames', N'U') IS NOT NULL
BEGIN
DROP TABLE #TEMP_AdTreeHierarchyNodeNames
END;
-- CREATE TEMP TABLE(s)
CREATE TABLE #TEMP_TreeHierarchy(
TreeHierarchyKey INT IDENTITY(1,1) NOT NULL
,TreeHierarchyId hierarchyid NULL
,TreeHierarchyNodeLevel int NULL
,TreeHierarchyNode varchar(255) NULL
,TreeCanonicalName varchar(255) NOT NULL
PRIMARY KEY CLUSTERED
(
TreeCanonicalName ASC
))
CREATE TABLE #TEMP_AdTreeHierarchyNodeNames (
TreeCanonicalName VARCHAR(255) NOT NULL
,TreeHierarchyNodeLevel INT NOT NULL
,TreeHierarchyNodeName VARCHAR(255) NOT NULL
,IndexValueByLevel INT NULL
PRIMARY KEY CLUSTERED
(
TreeCanonicalName ASC
,TreeHierarchyNodeLevel ASC
,TreeHierarchyNodeName ASC
))
-- Step 1.) INSERT the DISTINCT list of CanonicalName values into #TEMP_TreeHierarchy.
-- Remove the reserved character '/' that has been escaped '\/'. Note: '/' is the delimiter.
-- Remove all of the leaves from the tree, leaving only the root and the branches/nodes.
;WITH CTE1 AS (SELECT CanonicalNameParseReserveChar = REPLACE(A.CanonicalName, '\/', '') -- Remove the reserved character '/' that has been escaped '\/'.
FROM dbo.AdObjects A
)
-- Remove CN from end of string in order to get the distinct list (i.e., remove all of the leaves from the tree, leaving only the root and the branches/nodes).
-- INSERT the records INTO #TEMP_TreeHierarchy
INSERT INTO #TEMP_TreeHierarchy (TreeCanonicalName)
SELECT DISTINCT
CanonicalNameTree = REVERSE(SUBSTRING(REVERSE(C1.CanonicalNameParseReserveChar), CHARINDEX('/', REVERSE(C1.CanonicalNameParseReserveChar), 0) + 1, LEN(C1.CanonicalNameParseReserveChar) - CHARINDEX('/', REVERSE(C1.CanonicalNameParseReserveChar), 0)))
FROM CTE1 C1
-- Step 2.) Get NodeLevel and NodeName (i.e., key/value pair).
-- Get the nodes for each entry by splitting out the '/' delimiter, which provides both the NodeLevel and NodeName.
-- This table will be used as scratch to build the HierarchyNodeByLvl,
-- which is where the heavy lifting of converting the canonicalName value from string to integer occurs.
-- Note: integer is required for the node name - string values are not allowed. Thus this bridge must be built dynamically.
-- Achieve dynamic result by using CROSS APPLY to convert a single delimited row into 1 + n rows, based on the number of nodes.
-- INSERT the key/value pair results INTO a temp table.
-- Use ROW_NUMBER() to identify each NodeLevel, which is the key.
-- Use the string contained between the delimiter, which is the value.
-- Combined, these create a unique identifier that will be used to roll-up the HierarchyNodeByLevel, which is a RECURSIVE key/value pair of NodeLevel and IndexValueByLevel.
-- The rolled-up value contained in HierarchyNodeByLevel is what the SQL Server hierarchyid::Parse() function requires in order to create the hierarchyid.
-- https://blog.sqlauthority.com/2015/04/21/sql-server-split-comma-separated-list-without-using-a-function/
INSERT INTO #TEMP_AdTreeHierarchyNodeNames (TreeCanonicalName, TreeHierarchyNodeLevel, TreeHierarchyNodeName)
SELECT TreeCanonicalName
,TreeHierarchyNodeLevel = ROW_NUMBER() OVER(PARTITION BY TreeCanonicalName ORDER BY TreeCanonicalName)
,TreeHierarchyNodeName = LTRIM(RTRIM(m.n.value('.[1]','VARCHAR(MAX)')))
FROM (SELECT TH.TreeCanonicalName
,x = CAST('<XMLRoot><RowData>' + REPLACE(TH.TreeCanonicalName,'/','</RowData><RowData>') + '</RowData></XMLRoot>' AS XML)
FROM #TEMP_TreeHierarchy TH
) SUB1
CROSS APPLY x.nodes('/XMLRoot/RowData')m(n)
-- Step 3.) Get the IndexValueByLevel RECURSIVE key/value pair
-- Get the DISTINCT list of TreeHierarchyNodeLevel, TreeHierarchyNodeName first
-- Use TreeHierarchyNodeLevel as the key
-- Use ROW_NUMBER() to identify each IndexValueByLevel, which is the value.
-- Since the IndexValueByLevel exists for each level, the value for each level must be concatenated together to create the final value that is stored in TreeHierarchyNode
;WITH CTE1 AS (SELECT DISTINCT TreeHierarchyNodeLevel, TreeHierarchyNodeName
FROM #TEMP_AdTreeHierarchyNodeNames
),
CTE2 AS (SELECT C1.*
,IndexValueByLevel = ROW_NUMBER() OVER(PARTITION BY C1.TreeHierarchyNodeLevel ORDER BY C1.TreeHierarchyNodeName)
FROM CTE1 C1
)
UPDATE TMP1
SET TMP1.IndexValueByLevel = C2.IndexValueByLevel
FROM #TEMP_AdTreeHierarchyNodeNames TMP1
INNER JOIN CTE2 C2
ON TMP1.TreeHierarchyNodeLevel = C2.TreeHierarchyNodeLevel
AND TMP1.TreeHierarchyNodeName = C2.TreeHierarchyNodeName
-- Step 4.) Build the TreeHierarchyNodeByLevel.
-- Use FOR XML to roll up all duplicate keys in order to concatenate their values into one string.
-- https://www.mssqltips.com/sqlservertip/2914/rolling-up-multiple-rows-into-a-single-row-and-column-for-sql-server-data/
;WITH CTE1 AS (SELECT DISTINCT TreeCanonicalName
,TreeHierarchyNodeByLevel =
(SELECT '/' + CAST(IndexValueByLevel AS VARCHAR(10))
FROM #TEMP_AdTreeHierarchyNodeNames TMP1
WHERE TMP1.TreeCanonicalName = TMP2.TreeCanonicalName
FOR XML PATH(''))
FROM #TEMP_AdTreeHierarchyNodeNames TMP2
),
CTE2 AS (SELECT C1.TreeCanonicalName
,C1.TreeHierarchyNodeByLevel
,TreeHierarchyNodeLevel = MAX(TMP1.TreeHierarchyNodeLevel)
FROM CTE1 C1
INNER JOIN #TEMP_AdTreeHierarchyNodeNames TMP1
ON TMP1.TreeCanonicalName = C1.TreeCanonicalName
GROUP BY C1.TreeCanonicalName, C1.TreeHierarchyNodeByLevel
)
UPDATE TH
SET TH.TreeHierarchyNodeLevel = C2.TreeHierarchyNodeLevel
,TH.TreeHierarchyNode = C2.TreeHierarchyNodeByLevel + '/'
,TH.TreeHierarchyId = hierarchyid::Parse(C2.TreeHierarchyNodeByLevel + '/')
FROM #TEMP_TreeHierarchy TH
INNER JOIN CTE2 C2
ON TH.TreeCanonicalName = C2.TreeCanonicalName
INSERT INTO AD.TreeHierarchy (EffectiveStartDate, EffectiveEndDate, TreeCanonicalName, TreeHierarchyNodeLevel, TreeHierarchyNode, TreeHierarchyId)
SELECT EffectiveStartDate = CAST(GETDATE() AS DATE)
,EffectiveEndDate = '12/31/9999'
,TH.TreeCanonicalName
,TH.TreeHierarchyNodeLevel
,TH.TreeHierarchyNode
,TH.TreeHierarchyId
FROM #TEMP_TreeHierarchy TH
ORDER BY TH.TreeHierarchyKey
---- For testing purposes only.
SELECT * FROM AD.TreeHierarchy TH
SELECT * FROM #TEMP_AdTreeHierarchyNodeNames
SELECT * FROM #TEMP_TreeHierarchy
-- Clean-up. DROP TEMP TABLE(s).
DROP TABLE #TEMP_TreeHierarchy
DROP TABLE #TEMP_AdTreeHierarchyNodeNames
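On the batch-insert concern from the update: one option, sketched here under assumptions, is to skip giving each leaf its own hierarchyid. If every user row carries the canonical name of its parent OU, a join to the branch table plus IsDescendantOf() returns all users under a branch in one set-based query, so GetDescendant() is never needed. The table variables @Ou and @Users, the column ParentCanonicalName, and the sample rows are hypothetical stand-ins for AD.TreeHierarchy and the extracted user objects.

```sql
-- Hypothetical stand-in for AD.TreeHierarchy (branches only).
DECLARE @Ou TABLE (
    TreeCanonicalName varchar(255) NOT NULL PRIMARY KEY,
    TreeHierarchyId   hierarchyid  NOT NULL
);
INSERT INTO @Ou (TreeCanonicalName, TreeHierarchyId) VALUES
    ('domain.com',              hierarchyid::Parse('/1/')),
    ('domain.com/Corporate',    hierarchyid::Parse('/1/1/')),
    ('domain.com/Corporate/Hr', hierarchyid::Parse('/1/1/1/')),
    ('domain.com/Users',        hierarchyid::Parse('/1/2/')),
    ('domain.com/Users/Sales',  hierarchyid::Parse('/1/2/1/'));

-- Hypothetical stand-in for the extracted user objects (the leaves).
-- ParentCanonicalName is the canonicalName with the trailing CN stripped.
DECLARE @Users TABLE (
    UserName            varchar(100) NOT NULL,
    ParentCanonicalName varchar(255) NOT NULL
);
INSERT INTO @Users (UserName, ParentCanonicalName) VALUES
    ('alice', 'domain.com/Corporate/Hr'),
    ('bob',   'domain.com/Users'),
    ('carol', 'domain.com/Users/Sales');

-- Highest node on the tree that should be searched.
DECLARE @branch hierarchyid =
    (SELECT TreeHierarchyId FROM @Ou WHERE TreeCanonicalName = 'domain.com/Users');

-- All users in that branch or anywhere beneath it.
SELECT u.UserName, o.TreeCanonicalName
FROM @Users AS u
INNER JOIN @Ou AS o
    ON o.TreeCanonicalName = u.ParentCanonicalName
WHERE o.TreeHierarchyId.IsDescendantOf(@branch) = 1;
```

With these sample rows the query returns bob and carol. IsDescendantOf() treats a node as a descendant of itself, so users sitting directly in the chosen OU are included.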

This is where my thinking takes me.
I gave you 9 levels, but the pattern is easy to see and expand.
Without a proper sequence, I defaulted to alphabetical order by node.
It also supports multiple root nodes.
Example
Select A.*
,Nodes = concat('/',dense_rank() over (Order By N1),'/'
,left(nullif(dense_rank() over (Partition By N1 Order By N2)-1,0),5)+'/'
,left(nullif(dense_rank() over (Partition By N1,N2 Order By N3)-1,0),5)+'/'
,left(nullif(dense_rank() over (Partition By N1,N2,N3 Order By N4)-1,0),5)+'/'
,left(nullif(dense_rank() over (Partition By N1,N2,N3,N4 Order By N5)-1,0),5)+'/'
,left(nullif(dense_rank() over (Partition By N1,N2,N3,N4,N5 Order By N6)-1,0),5)+'/'
,left(nullif(dense_rank() over (Partition By N1,N2,N3,N4,N5,N6 Order By N7)-1,0),5)+'/'
,left(nullif(dense_rank() over (Partition By N1,N2,N3,N4,N5,N6,N7 Order By N8)-1,0),5)+'/'
,left(nullif(dense_rank() over (Partition By N1,N2,N3,N4,N5,N6,N7,N8 Order By N9)-1,0),5)+'/'
)
From YourTable A
Cross Apply (
Select N1 = ltrim(rtrim(xDim.value('/x[1]','varchar(max)')))
,N2 = ltrim(rtrim(xDim.value('/x[2]','varchar(max)')))
,N3 = ltrim(rtrim(xDim.value('/x[3]','varchar(max)')))
,N4 = ltrim(rtrim(xDim.value('/x[4]','varchar(max)')))
,N5 = ltrim(rtrim(xDim.value('/x[5]','varchar(max)')))
,N6 = ltrim(rtrim(xDim.value('/x[6]','varchar(max)')))
,N7 = ltrim(rtrim(xDim.value('/x[7]','varchar(max)')))
,N8 = ltrim(rtrim(xDim.value('/x[8]','varchar(max)')))
,N9 = ltrim(rtrim(xDim.value('/x[9]','varchar(max)')))
From (Select Cast('<x>' + replace((Select replace(stuff([canonicalName],1,1,''),'\','§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml) as xDim) as A
) B
Order By 1
Returns
canonicalName                          Nodes
\domain.com\                           /1/
\domain.com\Corporate                  /1/1/
\domain.com\Corporate\Accounting       /1/1/1/
\domain.com\Corporate\Hr               /1/1/2/
\domain.com\Security\                  /1/2/
\domain.com\Security\Administrative    /1/2/1/
\domain.com\Security\Executive         /1/2/2/
\domain.com\Security\Servers           /1/2/3/
\domain.com\Users\                     /1/3/
\domain.com\Users\Sales                /1/3/1/
\domain.com\Users\Whatever             /1/3/2/
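The Nodes string is already in the text format that hierarchyid::Parse() accepts, so it can be converted to a hierarchyid directly. A minimal sketch, assuming the query output has been stored somewhere; the table variable @Tree and its rows are hypothetical stand-ins copied from the result above.

```sql
-- Hypothetical stand-in for the stored output of the query above.
DECLARE @Tree TABLE (
    canonicalName varchar(255) NOT NULL,
    Nodes         varchar(255) NOT NULL
);
INSERT INTO @Tree (canonicalName, Nodes) VALUES
    ('\domain.com\',                        '/1/'),
    ('\domain.com\Corporate',               '/1/1/'),
    ('\domain.com\Security\',               '/1/2/'),
    ('\domain.com\Security\Administrative', '/1/2/1/'),
    ('\domain.com\Security\Servers',        '/1/2/3/');

-- Parse each string once, then use the hierarchyid methods on the result.
SELECT t.canonicalName,
       NodeText   = h.NodeId.ToString(),
       NodeLevel  = h.NodeId.GetLevel(),
       ParentText = h.NodeId.GetAncestor(1).ToString()
FROM @Tree AS t
CROSS APPLY (VALUES (hierarchyid::Parse(t.Nodes))) AS h (NodeId)
ORDER BY h.NodeId;   -- hierarchyid sorts depth-first
```

In a real table the parsed value would normally be stored in a hierarchyid column and indexed instead of being parsed on every query.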

Related

Does SQLite have performance shortage on recursive CTE or I write the sql wrong?

There's a tutorial: https://www.sqlservercentral.com/articles/hierarchies-on-steroids-1-convert-an-adjacency-list-to-nested-sets
USE TempDB;
--===== Conditionally drop Temp tables to make reruns easy
IF OBJECT_ID('dbo.Hierarchy','U') IS NOT NULL
DROP TABLE dbo.Hierarchy;
--===== Build the new table on-the-fly including some place holders
WITH cteBuildPath AS
( --=== This is the "anchor" part of the recursive CTE.
-- The only thing it does is load the Root Node.
SELECT anchor.EmployeeID,
anchor.ManagerID,
HLevel = 1,
SortPath = CAST(
CAST(anchor.EmployeeID AS BINARY(4))
AS VARBINARY(4000)) --Up to 1000 levels deep.
FROM dbo.Employee AS anchor
WHERE ManagerID IS NULL --Only the Root Node has a NULL ManagerID
UNION ALL
--==== This is the "recursive" part of the CTE that adds 1 for each level
-- and concatenates each level of EmployeeID's to the SortPath column.
SELECT recur.EmployeeID,
recur.ManagerID,
HLevel = cte.HLevel + 1,
SortPath = CAST( --This does the concatenation to build SortPath
cte.SortPath + CAST(Recur.EmployeeID AS BINARY(4))
AS VARBINARY(4000))
FROM dbo.Employee AS recur
INNER JOIN cteBuildPath AS cte
ON cte.EmployeeID = recur.ManagerID
) --=== This final SELECT/INTO creates the Node # in the same order as a
-- push-stack would. It also creates the final table with some
-- "reserved" columns on the fly. We'll leave the SortPath column in
-- place because we're still going to need it later.
-- The ISNULLs make NOT NULL columns.
SELECT EmployeeID = ISNULL(sorted.EmployeeID,0),
sorted.ManagerID,
HLevel = ISNULL(sorted.HLevel,0),
LeftBower = ISNULL(CAST(0 AS INT),0), --Place holder
RightBower = ISNULL(CAST(0 AS INT),0), --Place holder
NodeNumber = ROW_NUMBER() OVER (ORDER BY sorted.SortPath),
NodeCount = ISNULL(CAST(0 AS INT),0), --Place holder
SortPath = ISNULL(sorted.SortPath,sorted.SortPath)
INTO dbo.Hierarchy
FROM cteBuildPath AS sorted
OPTION (MAXRECURSION 100) --Change this IF necessary
;
--===========================================================================
-- 5. Once the data from Steps 1, 2, 3, AND 4 is complete, update that
-- data with the calculated Left Bower.
--===========================================================================
--===== Calculate the Left Bower
UPDATE dbo.Hierarchy
SET LeftBower = 2 * NodeNumber - HLevel
;
--SELECT * FROM dbo.Hierarchy
I rewrote it for SQLite:
CREATE TABLE newtable AS
WITH recursive cteBuildPath(id, parent, HLevel, SortPath) AS
( --=== This is the "anchor" part of the recursive CTE.
-- The only thing it does is load the Root Node.
SELECT anchor.id,
anchor.parent,
1 as HLevel,
printf('%08d', anchor.id) as SortPath --Up to 1000 levels deep.
FROM main AS anchor
WHERE anchor.parent IS NULL --Only the Root Node has a NULL ManagerID
UNION ALL
--==== This is the "recursive" part of the CTE that adds 1 for each level
-- and concatenates each level of EmployeeID's to the SortPath column.
SELECT recur.id,
recur.parent,
cte.HLevel + 1 as HLevel,
cte.SortPath || printf('%08d', Recur.id) as SortPath
FROM main AS recur
INNER JOIN cteBuildPath AS cte
ON cte.id = recur.parent
) --=== This final SELECT/INTO creates the Node # in the same order as a
-- push-stack would. It also creates the final table with some
-- "reserved" columns on the fly. We'll leave the SortPath column in
-- place because we're still going to need it later.
-- The ISNULLs make NOT NULL columns.
SELECT ifnull(sorted.id,0) as id,
sorted.parent,
ifnull(sorted.HLevel,0) as HLevel,
ifnull(CAST(0 AS INT),0) as LeftBower, --Place holder
ifnull(CAST(0 AS INT),0) as RightBower, --Place holder
ROW_NUMBER() OVER (ORDER BY sorted.SortPath) as NodeNumber,
ifnull(CAST(0 AS INT),0) as NodeCount, --Place holder
ifnull(sorted.SortPath,sorted.SortPath) as SortPath
FROM cteBuildPath AS sorted;
UPDATE newtable
SET LeftBower = 2 * NodeNumber - HLevel
According to the tutorial, 100 thousand nodes need only a few seconds. But when I run the test, 1,800 nodes take 18 seconds, and 160,000 nodes do not finish within 5 minutes; SQLite Expert Professional freezes and I can't tell whether it is still running.
Obviously, this is not even close to the claimed performance!
I would like to know whether this cannot be optimized because SQLite is inherently slow at recursive calculation, or whether I simply wrote it wrong.
Any information would be appreciated, because I am new to SQL and databases.
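One thing worth ruling out first, offered as an assumption about the schema and not as a measured result: the recursive step joins main on recur.parent, and if that column has no index, each step may have to scan the whole table, which grows roughly quadratically with the number of nodes. The sketch below (SQLite syntax) adds an index and asks SQLite for the plan; the index name idx_main_parent is made up, and main(id, parent) is taken from the rewrite above.

```sql
-- SQLite. Assumes a table main(id, parent) as in the rewrite above.
CREATE INDEX IF NOT EXISTS idx_main_parent ON main(parent);

-- Inspect the plan: the recursive step should now search main through
-- the index on parent instead of scanning the whole table.
EXPLAIN QUERY PLAN
WITH RECURSIVE cteBuildPath(id, parent, HLevel, SortPath) AS (
    SELECT id, parent, 1, printf('%08d', id)
    FROM main
    WHERE parent IS NULL
    UNION ALL
    SELECT recur.id,
           recur.parent,
           cte.HLevel + 1,
           cte.SortPath || printf('%08d', recur.id)
    FROM main AS recur
    INNER JOIN cteBuildPath AS cte
            ON cte.id = recur.parent
)
SELECT count(*) FROM cteBuildPath;
```

If the plan already shows an index search on parent, the slowdown is coming from somewhere else, for example the sort behind ROW_NUMBER() or the client tool itself.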

I need to be able to generate non-repetitive 8 character random alphanumeric for 2.5 million records

I need to be able to apply unique 8 character strings per row on a table that has almost 2.5 million records.
I have tried this:
UPDATE MyTable
SET [UniqueID]=SUBSTRING(CONVERT(varchar(255), NEWID()), 1, 8)
That works, but when I check the uniqueness of the IDs, I find duplicates:
SELECT [UniqueID], COUNT([UniqueID])
FROM NicoleW_CQ_2019_Audi_CR_Always_On_2019_T1_EM
GROUP BY [UniqueID]
HAVING COUNT([UniqueID]) > 1
I really would just like to update the table, as above, with just a simple line of code, if possible.
Here's a way that uses a temporary table to assure the uniqueness
Create and fill a #temporary table with unique random 8 character codes.
The SQL below uses a FOR XML trick to generate the codes in BASE62 : [A-Za-z0-9]
Examples : 8Phs7ZYl, ugCKtPqT, U9soG39q
A GUID only uses the characters [0-9A-F].
For 8 characters that can generate 16^8 = 4294967296 combinations.
While with BASE62 there are 62^8 = 2.183401056e014 combinations.
So the odds that a duplicate is generated are significantly lower with BASE62.
The temp table should have at least as many records as the destination table.
This example only generates 100000 codes, but you get the idea.
IF OBJECT_ID('tempdb..#tmpRandoms') IS NOT NULL DROP TABLE #tmpRandoms;
CREATE TABLE #tmpRandoms (
ID INT PRIMARY KEY IDENTITY(1,1),
[UniqueID] varchar(8),
CONSTRAINT UC_tmpRandoms_UniqueID UNIQUE ([UniqueID])
);
WITH DIGITS AS
(
select n
from (values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) v(n)
),
NUMS AS
(
select (d5.n*10000 + d4.n*1000 + d3.n*100 + d2.n * 10 + d1.n) as n
from DIGITS d1
cross join DIGITS d2
cross join DIGITS d3
cross join DIGITS d4
cross join DIGITS d5
)
INSERT INTO #tmpRandoms ([UniqueID])
SELECT DISTINCT LEFT(REPLACE(REPLACE((select CAST(NEWID() as varbinary(16)), n FOR XML PATH(''), BINARY BASE64),'+',''),'/',''), 8) AS [UniqueID]
FROM NUMS;
Then update your table with it
WITH CTE AS
(
SELECT ROW_NUMBER() OVER (ORDER BY ID) AS RN, [UniqueID]
FROM YourTable
)
UPDATE t
SET t.[UniqueID] = tmp.[UniqueID]
FROM CTE t
JOIN #tmpRandoms tmp ON tmp.ID = t.RN;
A test on rextester here
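To put rough numbers on the odds mentioned above, the expected number of colliding pairs among n random codes drawn from N possibilities is about n(n-1)/(2N). A small sketch of that arithmetic, assuming the codes are uniformly random (only approximately true for characters cut from a GUID); the variable names are made up for illustration.

```sql
DECLARE @n   float = 2500000;                    -- rows that need a code
DECLARE @hex float = POWER(CAST(16 AS float), 8); -- 8 hex characters
DECLARE @b62 float = POWER(CAST(62 AS float), 8); -- 8 base62 characters

-- Birthday-style estimate of how many pairs of rows share a code.
SELECT ExpectedDupPairsHex8   = @n * (@n - 1) / (2 * @hex),
       ExpectedDupPairsBase62 = @n * (@n - 1) / (2 * @b62);
```

This works out to roughly 728 expected duplicate pairs for the 8 hex characters and about 0.014 for base62, which matches the duplicates seen with SUBSTRING(NEWID()) and explains why the unique constraint plus a surplus of generated codes is still a sensible safety net.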
Can you just use numbers and assign a randomish value?
with toupdate as (
select t.*,
row_number() over (order by newid()) as random_enough
from mytable t
)
update toupdate
set UniqueID = right(concat('00000000', random_enough), 8);
See: https://social.msdn.microsoft.com/Forums/sqlserver/en-US/a289ed64-2038-415e-9f5d-ae84e50fe702/generate-random-string-of-length-5-az09?forum=transactsql
Alter DECLARE @s char(5) and SELECT TOP (5) c1 to set the length you want.

Guarantee random inserting

I am trying to pregenerate some alphanumeric strings and insert the result into a table. The length of string will be 5. Example: a5r67. Basically I want to generate some readable strings for customers so they can access their orders like
www.example.com/order/a5r67. Now I have a select statement:
;WITH
cte1 AS(SELECT * FROM (VALUES('0'),('1'),('2'),('3'),('4'),('5'),('6'),('7'),('8'),('9'),('a'),('b'),('c'),('d'),('e'),('f'),('g'),('h'),('i'),('j'),('k'),('l'),('m'),('n'),('o'),('p'),('q'),('r'),('s'),('t'),('u'),('v'),('w'),('x'),('y'),('z')) AS v(t)),
cte2 AS(SELECT * FROM (VALUES('0'),('1'),('2'),('3'),('4'),('5'),('6'),('7'),('8'),('9'),('a'),('b'),('c'),('d'),('e'),('f'),('g'),('h'),('i'),('j'),('k'),('l'),('m'),('n'),('o'),('p'),('q'),('r'),('s'),('t'),('u'),('v'),('w'),('x'),('y'),('z')) AS v(t)),
cte3 AS(SELECT * FROM (VALUES('0'),('1'),('2'),('3'),('4'),('5'),('6'),('7'),('8'),('9'),('a'),('b'),('c'),('d'),('e'),('f'),('g'),('h'),('i'),('j'),('k'),('l'),('m'),('n'),('o'),('p'),('q'),('r'),('s'),('t'),('u'),('v'),('w'),('x'),('y'),('z')) AS v(t)),
cte4 AS(SELECT * FROM (VALUES('0'),('1'),('2'),('3'),('4'),('5'),('6'),('7'),('8'),('9'),('a'),('b'),('c'),('d'),('e'),('f'),('g'),('h'),('i'),('j'),('k'),('l'),('m'),('n'),('o'),('p'),('q'),('r'),('s'),('t'),('u'),('v'),('w'),('x'),('y'),('z')) AS v(t)),
cte5 AS(SELECT * FROM (VALUES('0'),('1'),('2'),('3'),('4'),('5'),('6'),('7'),('8'),('9'),('a'),('b'),('c'),('d'),('e'),('f'),('g'),('h'),('i'),('j'),('k'),('l'),('m'),('n'),('o'),('p'),('q'),('r'),('s'),('t'),('u'),('v'),('w'),('x'),('y'),('z')) AS v(t))
INSERT INTO ProductHandles(ID, Used)
SELECT cte1.t + cte2.t + cte3.t + cte4.t + cte5.t, 0
FROM cte1
CROSS JOIN cte2
CROSS JOIN cte3
CROSS JOIN cte4
CROSS JOIN cte5
Now the problem is I need to write something like this to get a value from the table:
SELECT TOP 1 ID
FROM ProductHandles
WHERE Used = 0
I will have an index on the Used column so it will be fast. The problem with this is that the values come out in order:
00000
00001
00002
...
I know that I can order by NEWID(), but that will be much slower. I know that there is no guarantee of ordering unless we specify an ORDER BY clause. What I need is the opposite: guaranteed chaos, but without ordering by NEWID() each time a customer creates an order.
I am going to use it like:
WITH cte as (
SELECT TOP 1 * FROM ProductHandles WHERE Used = 0
--I don't want to order by newid() here as it will be slow
)
UPDATE cte
SET Used = 1
OUTPUT INSERTED.ID
If you add an identity column to the table and use order by newid() when inserting the records (that will be slow, but from what I understand it's a one-time thing done offline), then you can use order by on the identity column to select the records in the order they were inserted into the table.
From the Limitations and Restrictions part of the INSERT page in Microsoft Docs:
INSERT queries that use SELECT with ORDER BY to populate rows guarantees how identity values are computed but not the order in which the rows are inserted.
This means that by doing this you are effectively making the identity column follow the same random order in which the rows were selected in the insert...select statement.
Also, there is no need to repeat the same cte 5 times - you are already repeating the cross join:
CREATE TABLE ProductHandles(sort int identity(1,1), ID char(5), used bit)
;WITH
cte AS(SELECT * FROM (VALUES('0'),('1'),('2'),('3'),('4'),('5'),('6'),('7'),('8'),('9'),('a'),('b'),('c'),('d'),('e'),('f'),('g'),('h'),('i'),('j'),('k'),('l'),('m'),('n'),('o'),('p'),('q'),('r'),('s'),('t'),('u'),('v'),('w'),('x'),('y'),('z')) AS v(t))
INSERT INTO ProductHandles(ID, Used)
SELECT a.t + b.t + c.t + d.t + e.t, 0
FROM cte a
CROSS JOIN cte b
CROSS JOIN cte c
CROSS JOIN cte d
CROSS JOIN cte e
ORDER BY NEWID()
Then the cte can have an order by clause that guarantees the same random order as the rows returned from the select statement populating this table:
WITH cte as (
SELECT TOP 1 *
FROM ProductHandles
WHERE Used = 0
ORDER BY sort
)
UPDATE cte
SET Used = 1
OUTPUT INSERTED.ID
You can see a live demo on rextester. (with only digits since it's taking too long otherwise)
Here's a slightly different option...
Rather than trying to generate all possible values in a single sitting, you could simply generate a million or two at a time and generate more as they get used up.
Using this approach, you drastically reduce the initial creation time and eliminate the need to maintain a massive table of values, the majority of which will never be used.
CREATE TABLE dbo.ProductHandles (
rid INT NOT NULL
CONSTRAINT pk_ProductHandles
PRIMARY KEY CLUSTERED,
ID_Value CHAR(5) NOT NULL
CONSTRAINT uq_ProductHandles_IDValue
UNIQUE WITH (IGNORE_DUP_KEY = ON), -- prevents the insertion of duplicate values w/o generating any errors.
Used BIT NOT NULL
CONSTRAINT df_ProductHandles_Used
DEFAULT (0)
);
-- Create a filtered index to help facilitate fast searches
-- of unused values.
CREATE NONCLUSTERED INDEX ixf_ProductHandles_Used_rid
ON dbo.ProductHandles (Used, rid)
INCLUDE(ID_Value)
WHERE Used = 0;
--==========================================================
WHILE 1 = 1 -- The while loop will attempt to insert new rows, in 1M blocks, until required minimum of unused values are available.
BEGIN
IF (SELECT COUNT(*) FROM dbo.ProductHandles ph WHERE ph.Used = 0) > 1000000 -- the minimum num of unused ID's you want to keep on hand.
BEGIN
BREAK;
END;
ELSE
BEGIN
WITH
cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)),
cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),
cte_n3 (n) AS (SELECT 1 FROM cte_n2 a CROSS JOIN cte_n2 b),
cte_Tally (n) AS (
SELECT TOP (1000000) -- Sets the "block size" of each insert attempt.
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM
cte_n3 a CROSS JOIN cte_n3 b
)
INSERT dbo.ProductHandles (rid, ID_Value, Used)
SELECT
t.n + ISNULL((SELECT MAX(ph.rid) FROM dbo.ProductHandles ph), 0),
CONCAT(ISNULL(c1.char_1, n1.num_1), ISNULL(c2.char_2, n2.num_2), ISNULL(c3.char_3, n3.num_3), ISNULL(c4.char_4, n4.num_4), ISNULL(c5.char_5, n5.num_5)),
0
FROM
cte_Tally t
-- for each of the 5 positions, randomly generate a number between 0 & 35.
-- 0-9 are left as numbers.
-- 10 - 35 are converted to lower cased letters (10 + 87 = 97 = 'a', 35 + 87 = 122 = 'z').
CROSS APPLY ( VALUES (ABS(CHECKSUM(NEWID())) % 36) ) n1 (num_1)
CROSS APPLY ( VALUES (CHAR(CASE WHEN n1.num_1 > 9 THEN n1.num_1 + 87 END)) ) c1 (char_1)
CROSS APPLY ( VALUES (ABS(CHECKSUM(NEWID())) % 36) ) n2 (num_2)
CROSS APPLY ( VALUES (CHAR(CASE WHEN n2.num_2 > 9 THEN n2.num_2 + 87 END)) ) c2 (char_2)
CROSS APPLY ( VALUES (ABS(CHECKSUM(NEWID())) % 36) ) n3 (num_3)
CROSS APPLY ( VALUES (CHAR(CASE WHEN n3.num_3 > 9 THEN n3.num_3 + 87 END)) ) c3 (char_3)
CROSS APPLY ( VALUES (ABS(CHECKSUM(NEWID())) % 36) ) n4 (num_4)
CROSS APPLY ( VALUES (CHAR(CASE WHEN n4.num_4 > 9 THEN n4.num_4 + 87 END)) ) c4 (char_4)
CROSS APPLY ( VALUES (ABS(CHECKSUM(NEWID())) % 36) ) n5 (num_5)
CROSS APPLY ( VALUES (CHAR(CASE WHEN n5.num_5 > 9 THEN n5.num_5 + 87 END)) ) c5 (char_5);
END;
END;
After the initial creation, move the code in the WHILE loop to a stored procedure and schedule it to automatically run on a periodic basis.
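For handing out a value once the table is populated, a possible sketch follows. It assumes the column layout from the CREATE TABLE above but uses a small hypothetical temp table, #ProductHandles, with made-up sample values so it can run on its own. The UPDLOCK, READPAST and ROWLOCK hints are an addition not in the original answer; they are a common queue-style pattern that lets concurrent sessions skip a row another session is already claiming.

```sql
-- Hypothetical miniature copy of dbo.ProductHandles for illustration.
CREATE TABLE #ProductHandles (
    rid      INT     NOT NULL PRIMARY KEY CLUSTERED,
    ID_Value CHAR(5) NOT NULL UNIQUE,
    Used     BIT     NOT NULL DEFAULT (0)
);
INSERT INTO #ProductHandles (rid, ID_Value) VALUES
    (1, 'k3x9a'), (2, '0bq7m'), (3, 'zz41c');

-- Claim the next unused handle and return it in one statement.
-- The values were generated randomly, so rid order is already "chaotic"
-- and no ORDER BY NEWID() is needed at claim time.
WITH next_handle AS (
    SELECT TOP (1) ph.ID_Value, ph.Used
    FROM #ProductHandles AS ph WITH (UPDLOCK, READPAST, ROWLOCK)
    WHERE ph.Used = 0
    ORDER BY ph.rid
)
UPDATE next_handle
SET Used = 1
OUTPUT inserted.ID_Value;

DROP TABLE #ProductHandles;
```

Against the real table the filtered index on (Used, rid) should let this find the first unused row without scanning.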
If I'm understanding this right, it looks like you're attempting to separate the URL/visible data from the DB record ID, as most apps do, and provide something the user will see that is not directly related to an ID field. Converting NEWID() to a string lets you control the number of characters, so you could generate a smaller field with a smaller index by using just a portion of the full NEWID():
SELECT CONVERT(varchar(255), NEWID())
SELECT SUBSTRING(CONVERT(varchar(40), NEWID()),1,5)
You might also want to look at a checksum field; I don't know if it's faster on indexing, though. You could get crazier by combining a random NEWID() with a checksum across 2 or 3 fields.
SELECT BINARY_CHECKSUM(5 ,'EP30461105',1)

Populate the record links using Recursive CTE

I have a contact table in which each record either links to another contact record or is not linked to anything (null).
In the example below, id 21 is a parent of contact 1.
I need to populate the temp table using T-SQL (with a recursive CTE) so that it holds all the contact links for every contact id in the contact table, as below.
Since one contact id can be associated with multiple contact ids, the Link1, Link2, Link3 columns should be created dynamically if possible.
Could anybody please help me with this script?
Try this (necessary remarks in comments):
--data definition
declare @contactTable table (contactId int, linkContactId int)
insert into @contactTable values
(1,21),
(2,null),
(3,450),
(4,1),
(5,900),
(6,5),
(7,3),
(8,1)
--recursive cte
;with cte as (
(select 1 n, contactId from @contactTable
where linkContactId = 1
union
select 1, linkContactId from @contactTable
where contactId = 1)
union all
--this part might seem confusing; I tried writing the recursive part similarly to the anchor part,
--but it needed two joins, which isn't allowed in the recursive part of a cte, so I worked around it
select n + 1,
case when cte.n + 1 = t.contactId then t.linkContactId else t.contactId end
from cte join @contactTable [t] on
(cte.n + 1 = t.contactId or cte.n + 1 = t.linkContactId)
)
--grouping results by contactId concatenating all linkContacts
select n [contactId],
(select distinct cast(contactId as varchar(5)) + ',' from cte where n = c.n for xml path(''), type).value('(.)[1]', 'varchar(100)') [linkContactId]
from cte [c]
group by n
With your script above I was able to nearly get the results.
Since 4 and 8 have already been included in the first row, they should not be shown as separate records.
Can you please adjust your query so that it skips them?

Is it possible to concatenate column values into a string using CTE?

Say I have the following table:
id|myId|Name
-------------
1 | 3 |Bob
2 | 3 |Chet
3 | 3 |Dave
4 | 4 |Jim
5 | 4 |Jose
-------------
Is it possible to use a recursive CTE to generate the following output:
3 | Bob, Chet, Dave
4 | Jim, Jose
I've played around with it a bit but haven't been able to get it working. Would I do better using a different technique?
I do not recommend this, but I managed to work it out.
Table:
CREATE TABLE [dbo].[names](
[id] [int] NULL,
[myId] [int] NULL,
[name] [char](25) NULL
) ON [PRIMARY]
Data:
INSERT INTO names values (1,3,'Bob')
INSERT INTO names values (2,3,'Chet')
INSERT INTO names values (3,3,'Dave')
INSERT INTO names values (4,4,'Jim')
INSERT INTO names values (5,4,'Jose')
INSERT INTO names values (6,5,'Nick')
Query:
WITH CTE (id, myId, Name, NameCount)
AS (SELECT id,
myId,
Cast(Name AS VARCHAR(225)) Name,
1 NameCount
FROM (SELECT Row_number() OVER (PARTITION BY myId ORDER BY myId) AS id,
myId,
Name
FROM names) e
WHERE id = 1
UNION ALL
SELECT e1.id,
e1.myId,
Cast(Rtrim(CTE.Name) + ',' + e1.Name AS VARCHAR(225)) AS Name,
CTE.NameCount + 1 NameCount
FROM CTE
INNER JOIN (SELECT Row_number() OVER (PARTITION BY myId ORDER BY myId) AS id,
myId,
Name
FROM names) e1
ON e1.id = CTE.id + 1
AND e1.myId = CTE.myId)
SELECT myID,
Name
FROM (SELECT myID,
Name,
(Row_number() OVER (PARTITION BY myId ORDER BY namecount DESC)) AS id
FROM CTE) AS p
WHERE id = 1
As requested, here is the XML method:
SELECT myId,
STUFF((SELECT ',' + rtrim(convert(char(50),Name))
FROM namestable b
WHERE a.myId = b.myId
FOR XML PATH('')),1,1,'') Names
FROM namestable a
GROUP BY myId
A CTE is just a glorified derived table with some extra features (like recursion). The question is, can you use recursion to do this? Probably, but it's using a screwdriver to pound in a nail. The nice part about doing the XML path (seen in the first answer) is it will combine grouping the MyId column with string concatenation.
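As for a different technique: on SQL Server 2017 or later, STRING_AGG does the grouping and the concatenation in one step, with no recursion and no XML. A minimal sketch, assuming that version is available; the table variable @names is a hypothetical stand-in holding the sample data from the question.

```sql
-- Hypothetical stand-in for the names table from the question.
DECLARE @names TABLE (id int NOT NULL, myId int NOT NULL, name char(25) NOT NULL);
INSERT INTO @names (id, myId, name) VALUES
    (1, 3, 'Bob'), (2, 3, 'Chet'), (3, 3, 'Dave'),
    (4, 4, 'Jim'), (5, 4, 'Jose');

-- RTRIM strips the char(25) padding; WITHIN GROUP fixes the order of names.
SELECT n.myId,
       Names = STRING_AGG(RTRIM(n.name), ', ') WITHIN GROUP (ORDER BY n.id)
FROM @names AS n
GROUP BY n.myId;
```

For the sample data this returns "Bob, Chet, Dave" for myId 3 and "Jim, Jose" for myId 4. On older versions the FOR XML PATH approach remains the usual choice.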
How would you concatenate a list of strings using a CTE? I don't think that's its purpose.
A CTE is just a temporarily-created relation (tables and views are both relations) which only exists for the "life" of the current query.
I've played with the CTE names and the field names. I really don't like reusing field names like id in multiple places; I tend to think those get confusing. And since the only use for names.id is in the ORDER BY of the first ROW_NUMBER() call, I don't reuse it going forward.
WITH namesNumbered as (
select myId, Name,
ROW_NUMBER() OVER (
PARTITION BY myId
ORDER BY id
) as nameNum
FROM names
)
, namesJoined(myId, Name, nameCount) as (
SELECT myId,
Cast(Name AS VARCHAR(225)),
1
FROM namesNumbered nn1
WHERE nameNum = 1
UNION ALL
SELECT nn2.myId,
Cast(
Rtrim(nj.Name) + ',' + nn2.Name
AS VARCHAR(225)
),
nj.nameCount + 1 -- same value as nn2.nameNum, kept as int to match the anchor
FROM namesJoined nj
INNER JOIN namesNumbered nn2 ON nn2.myId = nj.myId
and nn2.nameNum = nj.nameCount + 1
)
SELECT myId, Name
FROM (
SELECT myID, Name,
ROW_NUMBER() OVER (
PARTITION BY myId
ORDER BY nameCount DESC
) AS finalSort
FROM namesJoined
) AS tmp
WHERE finalSort = 1
The first CTE, namesNumbered, returns two fields we care about and a sorting value; we can't just use names.id for this because we need, for each myId value, to have values of 1, 2, .... names.id will have 1, 2 ... for the first myId value but it will have a higher starting value for subsequent myId values.
The second CTE, namesJoined, has to have the field names specified in the CTE signature because it will be recursive. The base case (part before UNION ALL) gives us records where nameNum = 1. We have to CAST() the Name field because it will grow with subsequent passes; we need to ensure that we CAST() it large enough to handle any of the outputs; we can always TRIM() it later, if needed. We don't have to specify aliases for the fields because the CTE signature provides those. The recursive case (after the UNION ALL) joins the current CTE with the prior one, ensuring that subsequent passes use ever-higher nameNum values. We need to TRIM() the prior iterations of Name, then add the comma and the new Name. The result will be, implicitly, CAST()ed to a larger field.
The final query grabs only the fields we care about (myId, Name) and, within the subquery, pointedly re-sorts the records so that the highest namesJoined.nameCount value will get a 1 as the finalSort value. Then, we tell the WHERE clause to only give us this one record (for each myId value).
Yes, I aliased the subquery as tmp, which is about as generic as you can get. Most SQL engines require that you give a subquery an alias, even if it's the only relation visible at that point.