SQL Server 2005 recursive query with loops in data - is it possible?

SQL Server 2005 recursive query with loops in data - is it possible? - sql

I've got a standard boss/subordinate employee table. I need to select a boss (specified by ID) and all his subordinates (and their subrodinates, etc). Unfortunately the real world data has some loops in it (for example, both company owners have each other set as their boss). The simple recursive query with a CTE chokes on this (maximum recursion level of 100 exceeded). Can the employees still be selected? I care not of the order in which they are selected, just that each of them is selected once.
Added: You want my query? Umm... OK... I though it is pretty obvious, but - here it is:
with
UserTbl as -- Selects an employee and his subordinates.
(
select a.[User_ID], a.[Manager_ID] from [User] a WHERE [User_ID] = #UserID
union all
select a.[User_ID], a.[Manager_ID] from [User] a join UserTbl b on (a.[Manager_ID]=b.[User_ID])
)
select * from UserTbl
Added 2: Oh, in case it wasn't clear - this is a production system and I have to do a little upgrade (basically add a sort of report). Thus, I'd prefer not to modify the data if it can be avoided.

I know it has been a while but thought I should share my experience as I tried every single solution and here is a summary of my findings (an maybe this post?):
Adding a column with the current path did work but had a performance hit so not an option for me.
I could not find a way to do it using CTE.
I wrote a recursive SQL function which adds employeeIds to a table. To get around the circular referencing, there is a check to make sure no duplicate IDs are added to the table. The performance was average but was not desirable.
Having done all of that, I came up with the idea of dumping the whole subset of [eligible] employees to code (C#) and filter them there using a recursive method. Then I wrote the filtered list of employees to a datatable and export it to my stored procedure as a temp table. To my disbelief, this proved to be the fastest and most flexible method for both small and relatively large tables (I tried tables of up to 35,000 rows).

this will work for the initial recursive link, but might not work for longer links
DECLARE #Table TABLE(
ID INT,
PARENTID INT
)
INSERT INTO #Table (ID,PARENTID) SELECT 1, 2
INSERT INTO #Table (ID,PARENTID) SELECT 2, 1
INSERT INTO #Table (ID,PARENTID) SELECT 3, 1
INSERT INTO #Table (ID,PARENTID) SELECT 4, 3
INSERT INTO #Table (ID,PARENTID) SELECT 5, 2
SELECT * FROM #Table
DECLARE #ID INT
SELECT #ID = 1
;WITH boss (ID,PARENTID) AS (
SELECT ID,
PARENTID
FROM #Table
WHERE PARENTID = #ID
),
bossChild (ID,PARENTID) AS (
SELECT ID,
PARENTID
FROM boss
UNION ALL
SELECT t.ID,
t.PARENTID
FROM #Table t INNER JOIN
bossChild b ON t.PARENTID = b.ID
WHERE t.ID NOT IN (SELECT PARENTID FROM boss)
)
SELECT *
FROM bossChild
OPTION (MAXRECURSION 0)
what i would recomend is to use a while loop, and only insert links into temp table if the id does not already exist, thus removing endless loops.

Not a generic solution, but might work for your case: in your select query modify this:
select a.[User_ID], a.[Manager_ID] from [User] a join UserTbl b on (a.[Manager_ID]=b.[User_ID])
to become:
select a.[User_ID], a.[Manager_ID] from [User] a join UserTbl b on (a.[Manager_ID]=b.[User_ID])
and a.[User_ID] <> #UserID

You don't have to do it recursively. It can be done in a WHILE loop. I guarantee it will be quicker: well it has been for me every time I've done timings on the two techniques. This sounds inefficient but it isn't since the number of loops is the recursion level. At each iteration you can check for looping and correct where it happens. You can also put a constraint on the temporary table to fire an error if looping occurs, though you seem to prefer something that deals with looping more elegantly. You can also trigger an error when the while loop iterates over a certain number of levels (to catch an undetected loop? - oh boy, it sometimes happens.
The trick is to insert repeatedly into a temporary table (which is primed with the root entries), including a column with the current iteration number, and doing an inner join between the most recent results in the temporary table and the child entries in the original table. Just break out of the loop when ##rowcount=0!
Simple eh?

I know you asked this question a while ago, but here is a solution that may work for detecting infinite recursive loops. I generate a path and I checked in the CTE condition if the USER ID is in the path, and if it is it wont process it again. Hope this helps.
Jose
DECLARE #Table TABLE(
USER_ID INT,
MANAGER_ID INT )
INSERT INTO #Table (USER_ID,MANAGER_ID) SELECT 1, 2
INSERT INTO #Table (USER_ID,MANAGER_ID) SELECT 2, 1
INSERT INTO #Table (USER_ID,MANAGER_ID) SELECT 3, 1
INSERT INTO #Table (USER_ID,MANAGER_ID) SELECT 4, 3
INSERT INTO #Table (USER_ID,MANAGER_ID) SELECT 5, 2
DECLARE #UserID INT
SELECT #UserID = 1
;with
UserTbl as -- Selects an employee and his subordinates.
(
select
'/'+cast( a.USER_ID as varchar(max)) as [path],
a.[User_ID],
a.[Manager_ID]
from #Table a
where [User_ID] = #UserID
union all
select
b.[path] +'/'+ cast( a.USER_ID as varchar(max)) as [path],
a.[User_ID],
a.[Manager_ID]
from #Table a
inner join UserTbl b
on (a.[Manager_ID]=b.[User_ID])
where charindex('/'+cast( a.USER_ID as varchar(max))+'/',[path]) = 0
)
select * from UserTbl

basicaly if you have loops like this in data you'll have to do the retreival logic by yourself.
you could use one cte to get only subordinates and other to get bosses.
another idea is to have a dummy row as a boss to both company owners so they wouldn't be each others bosses which is ridiculous. this is my prefferd option.

I can think of two approaches.
1) Produce more rows than you want, but include a check to make sure it does not recurse too deep. Then remove duplicate User records.
2) Use a string to hold the Users already visited. Like the not in subquery idea that didn't work.
Approach 1:
; with TooMuchHierarchy as (
select "User_ID"
, Manager_ID
, 0 as Depth
from "User"
WHERE "User_ID" = #UserID
union all
select U."User_ID"
, U.Manager_ID
, M.Depth + 1 as Depth
from TooMuchHierarchy M
inner join "User" U
on U.Manager_ID = M."user_id"
where Depth < 100) -- Warning MAGIC NUMBER!!
, AddMaxDepth as (
select "User_ID"
, Manager_id
, Depth
, max(depth) over (partition by "User_ID") as MaxDepth
from TooMuchHierarchy)
select "user_id", Manager_Id
from AddMaxDepth
where Depth = MaxDepth
The line where Depth < 100 is what keeps you from getting the max recursion error. Make this number smaller, and less records will be produced that need to be thrown away. Make it too small and employees won't be returned, so make sure it is at least as large as the depth of the org chart being stored. Bit of a maintence nightmare as the company grows. If it needs to be bigger, then add option (maxrecursion ... number ...) to whole thing to allow more recursion.
Approach 2:
; with Hierarchy as (
select "User_ID"
, Manager_ID
, '#' + cast("user_id" as varchar(max)) + '#' as user_id_list
from "User"
WHERE "User_ID" = #UserID
union all
select U."User_ID"
, U.Manager_ID
, M.user_id_list + '#' + cast(U."user_id" as varchar(max)) + '#' as user_id_list
from Hierarchy M
inner join "User" U
on U.Manager_ID = M."user_id"
where user_id_list not like '%#' + cast(U."User_id" as varchar(max)) + '#%')
select "user_id", Manager_Id
from Hierarchy

The preferrable solution is to clean up the data and to make sure you do not have any loops in the future - that can be accomplished with a trigger or a UDF wrapped in a check constraint.
However, you can use a multi statement UDF as I demonstrated here: Avoiding infinite loops. Part One
You can add a NOT IN() clause in the join to filter out the cycles.

This is the code I used on a project to chase up and down hierarchical relationship trees.
User defined function to capture subordinates:
CREATE FUNCTION fn_UserSubordinates(#User_ID INT)
RETURNS #SubordinateUsers TABLE (User_ID INT, Distance INT) AS BEGIN
IF #User_ID IS NULL
RETURN
INSERT INTO #SubordinateUsers (User_ID, Distance) VALUES ( #User_ID, 0)
DECLARE #Distance INT, #Finished BIT
SELECT #Distance = 1, #Finished = 0
WHILE #Finished = 0
BEGIN
INSERT INTO #SubordinateUsers
SELECT S.User_ID, #Distance
FROM Users AS S
JOIN #SubordinateUsers AS C
ON C.User_ID = S.Manager_ID
LEFT JOIN #SubordinateUsers AS C2
ON C2.User_ID = S.User_ID
WHERE C2.User_ID IS NULL
IF ##RowCount = 0
SET #Finished = 1
SET #Distance = #Distance + 1
END
RETURN
END
User defined function to capture managers:
CREATE FUNCTION fn_UserManagers(#User_ID INT)
RETURNS #User TABLE (User_ID INT, Distance INT) AS BEGIN
IF #User_ID IS NULL
RETURN
DECLARE #Manager_ID INT
SELECT #Manager_ID = Manager_ID
FROM UserClasses WITH (NOLOCK)
WHERE User_ID = #User_ID
INSERT INTO #UserClasses (User_ID, Distance)
SELECT User_ID, Distance + 1
FROM dbo.fn_UserManagers(#Manager_ID)
INSERT INTO #User (User_ID, Distance) VALUES (#User_ID, 0)
RETURN
END

You need a some method to prevent your recursive query from adding User ID's already in the set. However, as sub-queries and double mentions of the recursive table are not allowed (thank you van) you need another solution to remove the users already in the list.
The solution is to use EXCEPT to remove these rows. This should work according to the manual. Multiple recursive statements linked with union-type operators are allowed. Removing the users already in the list means that after a certain number of iterations the recursive result set returns empty and the recursion stops.
with UserTbl as -- Selects an employee and his subordinates.
(
select a.[User_ID], a.[Manager_ID] from [User] a WHERE [User_ID] = #UserID
union all
(
select a.[User_ID], a.[Manager_ID]
from [User] a join UserTbl b on (a.[Manager_ID]=b.[User_ID])
where a.[User_ID] not in (select [User_ID] from UserTbl)
EXCEPT
select a.[User_ID], a.[Manager_ID] from UserTbl a
)
)
select * from UserTbl;
The other option is to hardcode a level variable that will stop the query after a fixed number of iterations or use the MAXRECURSION query option hint, but I guess that is not what you want.

Related

Populate the record links using Recursive CTE

I have the contact table records which has a link of other contact record or the contact record is not linked to anything (null)
As per below example id 21 is a parent for contact 1
I need to populate the temptable using T-SQL records (Using the recursive CTE) with all the contact links for the each and every contact id in contact table as below
As one contact id is associated with multiple contact ids, the Link1,Link2,link3 columns should be dynamically created if possible.
Could anybody please help me with this script

Try this (necessary remarks in comments):
--data definition
declare #contactTable table (contactId int, linkContactId int)
insert into #contactTable values
(1,21),
(2,null),
(3,450),
(4,1),
(5,900),
(6,5),
(7,3),
(8,1)
--recursive cte
;with cte as (
(select 1 n, contactId from #contactTable
where linkContactId = 1
union
select 1, linkContactId from #contactTable
where contactId = 1)
union all
--this part might seem confusing, I tried writing recursive part similairly as anchor part,
--but it needed to joins, which isn't allowed in recursive part of cte, so I worked around it
select n + 1,
case when cte.n + 1 = t.contactId then t.linkContactId else t.contactId end
from cte join #contactTable [t] on
(cte.n + 1 = t.contactId or cte.n + 1 = t.linkContactId)
)
--grouping results by contactId concatenating all linkContacts
select n [contactId],
(select distinct cast(contactId as varchar(5)) + ',' from cte where n = c.n for xml path(''), type).value('(.)[1]', 'varchar(100)') [linkContactId]
from cte [c]
group by n

As per your above script i was able to nearly get the results
As 4,8 have already been included in the first row, it should not be shown as seperate record/records
Can you please adjust your query and please provide me the skipping script

More efficient way of doing multiple joins to the same table and a "case when" in the select

At my organization clients can be enrolled in multiple programs at one time. I have a table with a list of all of the programs a client has been enrolled as unique rows in and the dates they were enrolled in that program.
Using an External join I can take any client name and a date from a table (say a table of tests that the clients have completed) and have it return all of the programs that client was in on that particular date. If a client was in multiple programs on that date it duplicates the data from that table for each program they were in on that date.
The problem I have is that I am looking for it to only return one program as their "Primary Program" for each client and date even if they were in multiple programs on that date. I have created a hierarchy for which program should be selected as their primary program and returned.
For Example:
1.)Inpatient
2.)Outpatient Clinical
3.)Outpatient Vocational
4.)Outpatient Recreational
So if a client was enrolled in Outpatient Clinical, Outpatient Vocational, Outpatient Recreational at the same time on that date it would only return "Outpatient Clinical" as the program.
My way of thinking for doing this would be to join to the table with the previous programs multiple times like this:
FROM dbo.TestTable as TestTable
LEFT OUTER JOIN dbo.PreviousPrograms as PreviousPrograms1
ON TestTable.date = PreviousPrograms1.date AND PreviousPrograms1.type = 'Inpatient'
LEFT OUTER JOIN dbo.PreviousPrograms as PreviousPrograms2
ON TestTable.date = PreviousPrograms2.date AND PreviousPrograms2.type = 'Outpatient Clinical'
LEFT OUTER JOIN dbo.PreviousPrograms as PreviousPrograms3
ON TestTable.date = PreviousPrograms3.date AND PreviousPrograms3.type = 'Outpatient Vocational'
LEFT OUTER JOIN dbo.PreviousPrograms as PreviousPrograms4
ON TestTable.date = PreviousPrograms4.date AND PreviousPrograms4.type = 'Outpatient Recreational'
and then do a condition CASE WHEN in the SELECT statement as such:
SELECT
CASE
WHEN PreviousPrograms1.name IS NOT NULL
THEN PreviousPrograms1.name
WHEN PreviousPrograms1.name IS NULL AND PreviousPrograms2.name IS NOT NULL
THEN PreviousPrograms2.name
WHEN PreviousPrograms1.name IS NULL AND PreviousPrograms2.name IS NULL AND PreviousPrograms3.name IS NOT NULL
THEN PreviousPrograms3.name
WHEN PreviousPrograms1.name IS NULL AND PreviousPrograms2.name IS NULL AND PreviousPrograms3.name IS NOT NULL AND PreviousPrograms4.name IS NOT NULL
THEN PreviousPrograms4.name
ELSE NULL
END as PrimaryProgram
The bigger problem is that in my actual table there are a lot more than just four possible programs it could be and the CASE WHEN select statement and the JOINs are already cumbersome enough.
Is there a more efficient way to do either the SELECTs part or the JOIN part? Or possibly a better way to do it all together?
I'm using SQL Server 2008.

You can simplify (replace) your CASE by using COALESCE() instead:
SELECT
COALESCE(PreviousPrograms1.name, PreviousPrograms2.name,
PreviousPrograms3.name, PreviousPrograms4.name) AS PreviousProgram
COALESCE() returns the first non-null value.
Due to your design, you still need the JOINs, but it would be much easier to read if you used very short aliases, for example PP1 instead of PreviousPrograms1 - it's just a lot less code noise.

You can simplify the Join by using a bridge table containing all the program types and their priority (my sql server syntax is a bit rusty):
create table BridgeTable (
programType varchar(30),
programPriority smallint
);
This table will hold all the program types and the program priority will reflect the priority you've specified in your question.
As for the part of the case, that will depend on the number of records involved. One of the tricks that I usually do is this (assuming programPriority is a number between 10 and 99 and no type can have more than 30 bytes, because I'm being lazy):
Select patient, date,
substr( min(cast(BridgeTable.programPriority as varchar) || PreviousPrograms.type), 3, 30)
From dbo.TestTable as TestTable
Inner Join dbo.BridgeTable as BridgeTable
Left Outer Join dbo.PreviousPrograms as PreviousPrograms
on PreviousPrograms.type = BridgeTable.programType
and TestTable.date = PreviousPrograms.date
Group by patient, date

You can achieve this using sub-queries, or you could refactor it to use CTEs, take a look at the following and see if it makes sense:
DECLARE #testTable TABLE
(
[id] INT IDENTITY(1, 1),
[date] datetime
)
DECLARE #previousPrograms TABLE
(
[id] INT IDENTITY(1,1),
[date] datetime,
[type] varchar(50)
)
INSERT INTO #testTable ([date])
SELECT '2013-08-08'
UNION ALL SELECT '2013-08-07'
UNION ALL SELECT '2013-08-06'
INSERT INTO #previousPrograms ([date], [type])
-- a sample user as an inpatient
SELECT '2013-08-08', 'Inpatient'
-- your use case of someone being enrolled in all 3 outpation programs
UNION ALL SELECT '2013-08-07', 'Outpatient Recreational'
UNION ALL SELECT '2013-08-07', 'Outpatient Clinical'
UNION ALL SELECT '2013-08-07', 'Outpatient Vocational'
-- showing our workings, this is what we'll join to
SELECT
PPP.[date],
PPP.[type],
ROW_NUMBER() OVER (PARTITION BY PPP.[date] ORDER BY PPP.[Priority]) AS [RowNumber]
FROM (
SELECT
[type],
[date],
CASE
WHEN [type] = 'Inpatient' THEN 1
WHEN [type] = 'Outpatient Clinical' THEN 2
WHEN [type] = 'Outpatient Vocational' THEN 3
WHEN [type] = 'Outpatient Recreational' THEN 4
ELSE 999
END AS [Priority]
FROM #previousPrograms
) PPP -- Previous Programs w/ Priority
SELECT
T.[date],
PPPO.[type]
FROM #testTable T
LEFT JOIN (
SELECT
PPP.[date],
PPP.[type],
ROW_NUMBER() OVER (PARTITION BY PPP.[date] ORDER BY PPP.[Priority]) AS [RowNumber]
FROM (
SELECT
[type],
[date],
CASE
WHEN [type] = 'Inpatient' THEN 1
WHEN [type] = 'Outpatient Clinical' THEN 2
WHEN [type] = 'Outpatient Vocational' THEN 3
WHEN [type] = 'Outpatient Recreational' THEN 4
ELSE 999
END AS [Priority]
FROM #previousPrograms
) PPP -- Previous Programs w/ Priority
) PPPO -- Previous Programs w/ Priority + Order
ON T.[date] = PPPO.[date] AND PPPO.[RowNumber] = 1
Basically we have our deepest sub-select giving all PreviousPrograms a priority based on type, then our wrapping sub-select gives them row numbers per date so we can select only the ones with a row number of 1.
I am guessing you would need to include a UR Number or some other patient identifier, simply add that as an output to both sub-selects and change the join.

SQL Server: row present in one query, missing in another

Ok so I think I must be misunderstanding something about SQL queries. This is a pretty wordy question, so thanks for taking the time to read it (my problem is right at the end, everything else is just context).
I am writing an accounting system that works on the double-entry principal -- money always moves between accounts, a transaction is 2 or more TransactionParts rows decrementing one account and incrementing another.
Some TransactionParts rows may be flagged as tax related so that the system can produce a report of total VAT sales/purchases etc, so it is possible that a single Transaction may have two TransactionParts referencing the same Account -- one VAT related, and the other not. To simplify presentation to the user, I have a view to combine multiple rows for the same account and transaction:
create view Accounting.CondensedEntryView as
select p.[Transaction], p.Account, sum(p.Amount) as Amount
from Accounting.TransactionParts p
group by p.[Transaction], p.Account
I then have a view to calculate the running balance column, as follows:
create view Accounting.TransactionBalanceView as
with cte as
(
select ROW_NUMBER() over (order by t.[Date]) AS RowNumber,
t.ID as [Transaction], p.Amount, p.Account
from Accounting.Transactions t
inner join Accounting.CondensedEntryView p on p.[Transaction]=t.ID
)
select b.RowNumber, b.[Transaction], a.Account,
coalesce(sum(a.Amount), 0) as Balance
from cte a, cte b
where a.RowNumber <= b.RowNumber AND a.Account=b.Account
group by b.RowNumber, b.[Transaction], a.Account
For reasons I haven't yet worked out, a certain transaction (ID=30) doesn't appear on an account statement for the user. I confirmed this by running
select * from Accounting.TransactionBalanceView where [Transaction]=30
This gave me the following result:
RowNumber Transaction Account Balance
-------------------- ----------- ------- ---------------------
72 30 23 143.80
As I said before, there should be at least two TransactionParts for each Transaction, so one of them isn't being presented in my view. I assumed there must be an issue with the way I've written my view, and run a query to see if there's anything else missing:
select [Transaction], count(*)
from Accounting.TransactionBalanceView
group by [Transaction]
having count(*) < 2
This query returns no results -- not even for Transaction 30! Thinking I must be an idiot I run the following query:
select [Transaction]
from Accounting.TransactionBalanceView
where [Transaction]=30
It returns two rows! So select * returns only one row and select [Transaction] returns both. After much head-scratching and re-running the last two queries, I concluded I don't have the faintest idea what's happening. Any ideas?
Thanks a lot if you've stuck with me this far!
Edit:
Here are the execution plans:
select *
select [Transaction]
1000 lines each, hence finding somewhere else to host.
Edit 2:
For completeness, here are the tables I used:
create table Accounting.Accounts
(
ID smallint identity primary key,
[Name] varchar(50) not null
constraint UQ_AccountName unique,
[Type] tinyint not null
constraint FK_AccountType foreign key references Accounting.AccountTypes
);
create table Accounting.Transactions
(
ID int identity primary key,
[Date] date not null default getdate(),
[Description] varchar(50) not null,
Reference varchar(20) not null default '',
Memo varchar(1000) not null
);
create table Accounting.TransactionParts
(
ID int identity primary key,
[Transaction] int not null
constraint FK_TransactionPart foreign key references Accounting.Transactions,
Account smallint not null
constraint FK_TransactionAccount foreign key references Accounting.Accounts,
Amount money not null,
VatRelated bit not null default 0
);

Demonstration of possible explanation.
Create table Script
SELECT *
INTO #T
FROM master.dbo.spt_values
CREATE NONCLUSTERED INDEX [IX_T] ON #T ([name] DESC,[number] DESC);
Query one (Returns 35 results)
WITH cte AS
(
SELECT *, ROW_NUMBER() OVER (ORDER BY NAME) AS rn
FROM #T
)
SELECT c1.number,c1.[type]
FROM cte c1
JOIN cte c2 ON c1.rn=c2.rn AND c1.number <> c2.number
Query Two (Same as before but adding c2.[type] to the select list makes it return 0 results)
;
WITH cte AS
(
SELECT *, ROW_NUMBER() OVER (ORDER BY NAME) AS rn
FROM #T
)
SELECT c1.number,c1.[type] ,c2.[type]
FROM cte c1
JOIN cte c2 ON c1.rn=c2.rn AND c1.number <> c2.number
Why?
row_number() for duplicate NAMEs isn't specified so it just chooses whichever one fits in with the best execution plan for the required output columns. In the second query this is the same for both cte invocations, in the first one it chooses a different access path with resultant different row_numbering.
Suggested Solution
You are self joining the CTE on ROW_NUMBER() over (order by t.[Date])
Contrary to what may have been expected the CTE will likely not be materialised which would have ensured consistency for the self join and thus you assume a correlation between ROW_NUMBER() on both sides that may well not exist for records where a duplicate [Date] exists in the data.
What if you try ROW_NUMBER() over (order by t.[Date], t.[id]) to ensure that in the event of tied dates the row_numbering is in a guaranteed consistent order. (Or some other column/combination of columns that can differentiate records if id won't do it)

If the purpose of this part of the view is just to make sure that the same row isn't joined to itself
where a.RowNumber <= b.RowNumber
then how does changing this part to
where a.RowNumber <> b.RowNumber
affect the results?

It seems you read dirty entries. (Someone else deletes/insertes new data)
try SET TRANSACTION ISOLATION LEVEL READ COMMITTED.
i've tried this code (seems equal to yours)
IF object_id('tempdb..#t') IS NOT NULL DROP TABLE #t
CREATE TABLE #t(i INT, val INT, acc int)
INSERT #t
SELECT 1, 2, 70
UNION ALL SELECT 2, 3, 70
;with cte as
(
select ROW_NUMBER() over (order by t.i) AS RowNumber,
t.val as [Transaction], t.acc Account
from #t t
)
select b.RowNumber, b.[Transaction], a.Account
from cte a, cte b
where a.RowNumber <= b.RowNumber AND a.Account=b.Account
group by b.RowNumber, b.[Transaction], a.Account
and got two rows
RowNumber Transaction Account
1 2 70
2 3 70

Make SQL Select same row multiple times

I need to test my mail server. How can I make a Select statement
that selects say ID=5469 a thousand times.

If I get your meaning then a very simple way is to cross join on a derived query on a table with more than 1000 rows in it and put a top 1000 on that. This would duplicate your results 1000 times.
EDIT: As an example (This is MSSQL, I don't know if Access is much different)
SELECT
MyTable.*
FROM
MyTable
CROSS JOIN
(
SELECT TOP 1000
*
FROM
sysobjects
) [BigTable]
WHERE
MyTable.ID = 1234

You can use the UNION ALL statement.
Try something like:
SELECT * FROM tablename WHERE ID = 5469
UNION ALL
SELECT * FROM tablename WHERE ID = 5469
You'd have to repeat the SELECT statement a bunch of times but you could write a bit of VB code in Access to create a dynamic SQL statement and then execute it. Not pretty but it should work.

Create a helper table for this purpose:
JUST_NUMBER(NUM INT primary key)
Insert (with the help of some (VB) script) numbers from 1 to N. Then execute this unjoined query:
SELECT MYTABLE.*
FROM MYTABLE,
JUST_NUMBER
WHERE MYTABLE.ID = 5469
AND JUST_NUMBER.NUM <= 1000

Here's a way of using a recursive common table expression to generate some empty rows, then to cross join them back onto your desired row:
declare #myData table (val int) ;
insert #myData values (666),(888),(777) --some dummy data
;with cte as
(
select 100 as a
union all
select a-1 from cte where a>0
--generate 100 rows, the max recursion depth
)
,someRows as
(
select top 1000 0 a from cte,cte x1,cte x2
--xjoin the hundred rows a few times
--to generate 1030301 rows, then select top n rows
)
select m.* from #myData m,someRows where m.val=666
substitute #myData for your real table, and alter the final predicate to suit.

easy way...
This exists only one row into the DB
sku = 52 , description = Skullcandy Inkd Green ,price = 50,00
Try to relate another table in which has no constraint key to the main table
Original Query
SELECT Prod_SKU , Prod_Descr , Prod_Price FROM dbo.TB_Prod WHERE Prod_SKU = N'52'
The Functional Query ...adding a not related table called 'dbo.TB_Labels'
SELECT TOP ('times') Prod_SKU , Prod_Descr , Prod_Price FROM dbo.TB_Prod,dbo.TB_Labels WHERE Prod_SKU = N'52'

In postgres there is a nice function called generate_series. So in postgreSQL it is as simple as:
select information from test_table, generate_series(1, 1000) where id = 5469
In this way, the query is executed 1000 times.
Example for postgreSQL:
CREATE EXTENSION IF NOT EXISTS "uuid-ossp"; --To be able to use function uuid_generate_v4()
--Create a test table
create table test_table (
id serial not null,
uid UUID NOT NULL,
CONSTRAINT uid_pk PRIMARY KEY(id));
-- Insert 10000 rows
insert into test_table (uid)
select uuid_generate_v4() from generate_series(1, 10000);
-- Read the data from id=5469 one thousand times
select id, uid, uuid_generate_v4() from test_table, generate_series(1, 1000) where id = 5469;
As you can see in the result below, the data from uid is read 1000 times as confirmed by the generation of a new uuid at every new row.
id |uid |uuid_generate_v4
----------------------------------------------------------------------------------------
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"5630cd0d-ee47-4d92-9ee3-b373ec04756f"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"ed44b9cb-c57f-4a5b-ac9a-55bd57459c02"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"3428b3e3-3bb2-4e41-b2ca-baa3243024d9"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"7c8faf33-b30c-4bfa-96c8-1313a4f6ce7c"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"b589fd8a-fec2-4971-95e1-283a31443d73"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"8b9ab121-caa4-4015-83f5-0c2911a58640"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"7ef63128-b17c-4188-8056-c99035e16c11"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"5bdc7425-e14c-4c85-a25e-d99b27ae8b9f"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"9bbd260b-8b83-4fa5-9104-6fc3495f68f3"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"c1f759e1-c673-41ef-b009-51fed587353c"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"4a70bf2b-ddf5-4c42-9789-5e48e2aec441"
Of course other DBs won't necessarily have the same function but it could be done:
See here.

If your are doing this in sql Server
declare #cnt int
set #cnt = 0
while #cnt < 1000
begin
select '12345'
set #cnt = #cnt + 1
end
select '12345' can be any expression

Repeat rows based on column value of TestTable. First run the Create table and insert statement, then run the following query for the desired result.
This may be another solution:
CREATE TABLE TestTable
(
ID INT IDENTITY(1,1),
Col1 varchar(10),
Repeats INT
)
INSERT INTO TESTTABLE
VALUES ('A',2), ('B',4),('C',1),('D',0)
WITH x AS
(
SELECT TOP (SELECT MAX(Repeats)+1 FROM TestTable) rn = ROW_NUMBER()
OVER (ORDER BY [object_id])
FROM sys.all_columns
ORDER BY [object_id]
)
SELECT * FROM x
CROSS JOIN TestTable AS d
WHERE x.rn <= d.Repeats
ORDER BY Col1;

This trick helped me in my requirement.
here, PRODUCTDETAILS is my Datatable
and orderid is my column.
declare #Req_Rows int = 12
;WITH cte AS
(
SELECT 1 AS Number
UNION ALL
SELECT Number + 1 FROM cte WHERE Number < #Req_Rows
)
SELECT PRODUCTDETAILS.*
FROM cte, PRODUCTDETAILS
WHERE PRODUCTDETAILS.orderid = 3

create table #tmp1 (id int, fld varchar(max))
insert into #tmp1 (id, fld)
values (1,'hello!'),(2,'world'),(3,'nice day!')
select * from #tmp1
go
select * from #tmp1 where id=3
go 1000
drop table #tmp1

in sql server try:
print 'wow'
go 5
output:
Beginning execution loop
wow
wow
wow
wow
wow
Batch execution completed 5 times.

The easy way is to create a table with 1000 rows. Let's call it BigTable. Then you would query for the data you want and join it with the big table, like this:
SELECT MyTable.*
FROM MyTable, BigTable
WHERE MyTable.ID = 5469

Select products where the category belongs to any category in the hierarchy

I have a products table that contains a FK for a category, the Categories table is created in a way that each category can have a parent category, example:
Computers
Processors
Intel
Pentium
Core 2 Duo
AMD
Athlon
I need to make a select query that if the selected category is Processors, it will return products that is in Intel, Pentium, Core 2 Duo, Amd, etc...
I thought about creating some sort of "cache" that will store all the categories in the hierarchy for every category in the db and include the "IN" in the where clause. Is this the best solution?

The best solution for this is at the database design stage. Your categories table needs to be a Nested Set. The article Managing Hierarchical Data in MySQL is not that MySQL specific (despite the title), and gives a great overview of the different methods of storing a hierarchy in a database table.
Executive Summary:
Nested Sets
Selects are easy for any depth
Inserts and deletes are hard
Standard parent_id based hierarchy
Selects are based on inner joins (so get hairy fast)
Inserts and deletes are easy
So based on your example, if your hierarchy table was a nested set your query would look something like this:
SELECT * FROM products
INNER JOIN categories ON categories.id = products.category_id
WHERE categories.lft > 2 and categories.rgt < 11
the 2 and 11 are the left and right respectively of the Processors record.

Looks like a job for a Common Table Expression.. something along the lines of:
with catCTE (catid, parentid)
as
(
select cat.catid, cat.catparentid from cat where cat.name = 'Processors'
UNION ALL
select cat.catid, cat.catparentid from cat inner join catCTE on cat.catparentid=catcte.catid
)
select distinct * from catCTE
That should select the category whose name is 'Processors' and any of it's descendents, should be able to use that in an IN clause to pull back the products.

I have done similar things in the past, first querying for the category ids, then querying for the products "IN" those categories. Getting the categories is the hard bit, and you have a few options:
If the level of nesting of categories is known or you can find an upper bound: Build a horrible-looking SELECT with lots of JOINs. This is fast, but ugly and you need to set a limit on the levels of the hierarchy.
If you have a relatively small number of total categories, query them all (just ids, parents), collect the ids of the ones you care about, and do a SELECT....IN for the products. This was the appropriate option for me.
Query up/down the hierarchy using a series of SELECTs. Simple, but relatively slow.
I believe recent versions of SQLServer have some support for recursive queries, but haven't used them myself.
Stored procedures can help if you don't want to do this app-side.

What you want to find is the transitive closure of the category "parent" relation. I suppose there's no limitation to the category hierarchy depth, so you can't formulate a single SQL query which finds all categories. What I would do (in pseudocode) is this:
categoriesSet = empty set
while new.size > 0:
new = select * from categories where parent in categoriesSet
categoriesSet = categoriesSet+new
So just keep on querying for children until no more are found. This behaves well in terms of speed unless you have a degenerated hierarchy (say, 1000 categories, each a child of another), or a large number of total categories. In the second case, you could always work with temporary tables to keep data transfer between your app and the database small.

Maybe something like:
select *
from products
where products.category_id IN
(select c2.category_id
from categories c1 inner join categories c2 on c1.category_id = c2.parent_id
where c1.category = 'Processors'
group by c2.category_id)
[EDIT] If the category depth is greater than one this would form your innermost query. I suspect that you could design a stored procedure that would drill down in the table until the ids returned by the inner query did not have children -- probably better to have an attribute that marks a category as a terminal node in the hierarchy -- then perform the outer query on those ids.

CREATE TABLE #categories (id INT NOT NULL, parentId INT, [name] NVARCHAR(100))
INSERT INTO #categories
SELECT 1, NULL, 'Computers'
UNION
SELECT 2, 1, 'Processors'
UNION
SELECT 3, 2, 'Intel'
UNION
SELECT 4, 2, 'AMD'
UNION
SELECT 5, 3, 'Pentium'
UNION
SELECT 6, 3, 'Core 2 Duo'
UNION
SELECT 7, 4, 'Athlon'
SELECT *
FROM #categories
DECLARE #id INT
SET #id = 2
; WITH r(id, parentid, [name]) AS (
SELECT id, parentid, [name]
FROM #categories c
WHERE id = #id
UNION ALL
SELECT c.id, c.parentid, c.[name]
FROM #categories c JOIN r ON c.parentid=r.id
)
SELECT *
FROM products
WHERE p.productd IN
(SELECT id
FROM r)
DROP TABLE #categories
The last part of the example isn't actually working if you're running it straight like this. Just remove the select from the products and substitute with a simple SELECT * FROM r

This should recurse down all the 'child' catagories starting from a given catagory.
DECLARE #startingCatagoryId int
DECLARE #current int
SET #startingCatagoryId = 13813 -- or whatever the CatagoryId is for 'Processors'
CREATE TABLE #CatagoriesToFindChildrenFor
(CatagoryId int)
CREATE TABLE #CatagoryTree
(CatagoryId int)
INSERT INTO #CatagoriesToFindChildrenFor VALUES (#startingCatagoryId)
WHILE (SELECT count(*) FROM #CatagoriesToFindChildrenFor) > 0
BEGIN
SET #current = (SELECT TOP 1 * FROM #CatagoriesToFindChildrenFor)
INSERT INTO #CatagoriesToFindChildrenFor
SELECT ID FROM Catagory WHERE ParentCatagoryId = #current AND Deleted = 0
INSERT INTO #CatagoryTree VALUES (#current)
DELETE #CatagoriesToFindChildrenFor WHERE CatagoryId = #current
END
SELECT * FROM #CatagoryTree ORDER BY CatagoryId
DROP TABLE #CatagoriesToFindChildrenFor
DROP TABLE #CatagoryTree

i like to use a stack temp table for hierarchal data.
here's a rough example -
-- create a categories table and fill it with 10 rows (with random parentIds)
CREATE TABLE Categories ( Id uniqueidentifier, ParentId uniqueidentifier )
GO
INSERT
INTO Categories
SELECT NEWID(),
NULL
GO
INSERT
INTO Categories
SELECT TOP(1)NEWID(),
Id
FROM Categories
ORDER BY Id
GO 9
DECLARE #lvl INT, -- holds onto the level as we move throught the hierarchy
#Id Uniqueidentifier -- the id of the current item in the stack
SET #lvl = 1
CREATE TABLE #stack (item UNIQUEIDENTIFIER, [lvl] INT)
-- we fill fill this table with the ids we want
CREATE TABLE #tmpCategories (Id UNIQUEIDENTIFIER)
-- for this example we’ll just select all the ids
-- if we want all the children of a specific parent we would include it’s id in
-- this where clause
INSERT INTO #stack SELECT Id, #lvl FROM Categories WHERE ParentId IS NULL
WHILE #lvl > 0
BEGIN -- begin 1
IF EXISTS ( SELECT * FROM #stack WHERE lvl = #lvl )
BEGIN -- begin 2
SELECT #Id = [item]
FROM #stack
WHERE lvl = #lvl
INSERT INTO #tmpCategories
SELECT #Id
DELETE FROM #stack
WHERE lvl = #lvl
AND item = #Id
INSERT INTO #stack
SELECT Id, #lvl + 1
FROM Categories
WHERE ParentId = #Id
IF ##ROWCOUNT > 0
BEGIN -- begin 3
SELECT #lvl = #lvl + 1
END -- end 3
END -- end 2
ELSE
SELECT #lvl = #lvl - 1
END -- end 1
DROP TABLE #stack
SELECT * FROM #tmpCategories
DROP TABLE #tmpCategories
DROP TABLE Categories
there is a good explanation here link text

My answer to another question from a couple days ago applies here... recursion in SQL
There are some methods in the book which I've linked which should cover your situation nicely.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

SQL Server 2005 recursive query with loops in data - is it possible? - sql

Related

Populate the record links using Recursive CTE

More efficient way of doing multiple joins to the same table and a "case when" in the select

SQL Server: row present in one query, missing in another

Make SQL Select same row multiple times

Select products where the category belongs to any category in the hierarchy

Categories

Resources