Using WHILE LOOP to insert into Junction Table - sql

I'm using the following query to insert into a junction table that links two tables (Listing and Categories) for a yellow-pages-style directory.
Any given listing can be assigned one or more categories.
My junction table has 3 fields (ID, Junc_LID, Junc_CatID); Junc_LID holds the Listing ID and Junc_CatID holds the Category ID from the two other tables.
The query inserts 3 records, but each one has the whole string 1,2,43,34 in the CatID field, rather than inserting 4 records, each with a different CatID.
So if a user selects 10 categories in my web form, the query should loop 10 times and insert 10 rows into the junction table, one per selected category, with a single CatID per insert rather than the comma-delimited string of all the categories as it does now.
DECLARE @cnt INT = 0;
WHILE @cnt < 3
BEGIN
    INSERT INTO BND_ListingJunction_testing (Junc_LID,Junc_CatID)
    Values ('[PulledLID]','[CatID]')
    SET @cnt = @cnt + 1;
END;
--------------------------------------UPDATE
Here is your query modified for my tokens. It works, but it adds some extra rows to my junction table, and I have no idea where they are coming from.
DECLARE @CatIDStr VARCHAR(100) = '[CatID]',@CatID VARCHAR(100) = ''
WHILE LEN(@CatIDStr) > 0
BEGIN
    IF CHARINDEX(',',@CatIDStr) = 0
    BEGIN
        SET @CatID = @CatIDStr
        SET @CatIDStr = ''
    END
    ELSE
    BEGIN
        SELECT @CatID = SUBSTRING(@CatIDStr,0,CHARINDEX(',',@CatIDStr))
        SELECT @CatIDStr=SUBSTRING(@CatIDStr,CHARINDEX(',',@CatIDStr)+1,LEN(@CatIDStr))
    END
    INSERT INTO BND_ListingJunction_testing (Junc_LID,Junc_CatID)
    Values ('[ScopedLID]',@CatID)
END
Rows 1-7 are all from the same insert executed only once.

Try this
DECLARE @CatIDStr VARCHAR(100) = '1,2,43,34',@CatID VARCHAR(100) = ''
DECLARE @PulledLID INT = 1
WHILE LEN(@CatIDStr) > 0
BEGIN
    IF CHARINDEX(',',@CatIDStr) = 0
    BEGIN
        -- No comma left: the remainder is the last ID
        SET @CatID = @CatIDStr
        SET @CatIDStr = ''
    END
    ELSE
    BEGIN
        -- Take the text before the first comma, then drop it from the list
        SELECT @CatID = SUBSTRING(@CatIDStr,0,CHARINDEX(',',@CatIDStr))
        SELECT @CatIDStr=SUBSTRING(@CatIDStr,CHARINDEX(',',@CatIDStr)+1,LEN(@CatIDStr))
    END
    INSERT INTO BND_ListingJunction_testing (Junc_LID,Junc_CatID)
    Values (@PulledLID,@CatID)
END
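If the server happens to be SQL Server 2016 or later (an assumption, since the question does not say which version is in use), the same rows could probably be inserted without a loop by using the built-in STRING_SPLIT function. This is only a sketch: it reuses the variable names from the answer above and assumes Junc_CatID is an integer column.

```sql
-- Sketch only: requires SQL Server 2016+ (database compatibility level 130 or higher).
-- Assumes Junc_CatID is an INT column; this is not stated in the question.
DECLARE @CatIDStr VARCHAR(100) = '1,2,43,34';
DECLARE @PulledLID INT = 1;

-- STRING_SPLIT returns one row per list element in a column named "value".
INSERT INTO BND_ListingJunction_testing (Junc_LID, Junc_CatID)
SELECT @PulledLID, TRY_CAST(s.value AS INT)
FROM STRING_SPLIT(@CatIDStr, ',') AS s
WHERE TRY_CAST(s.value AS INT) IS NOT NULL;  -- skip empty or non-numeric pieces
```

With the sample string this inserts four rows, one per category ID, in a single statement.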

Related

Generating dummy data from existing data set is slow using cursor

I'm trying to generate dummy data from the existing data I have in the tables. All I want is to increase the number of records in Table1 to N specified amount. The other tables should increase based on the foreign key references.
The tables have one-to-many relationships: one record in table 1 can have multiple entries in table 2, and table 3 can have many records per ID in table 2.
Since the IDs are primary keys, I either capture the new ID with
SET @NEWLY_INSERTED_ID = SCOPE_IDENTITY()
after inserting into table 1 and use it in the insert for table 2, or I insert the IDs into a temp table and join on it to achieve the same result for table 3.
Here's the approach I'm taking with the CURSOR.
DECLARE @MyId as INT;
DECLARE @myCursor as CURSOR;
DECLARE @DESIRED_ROW_COUNT INT = 70000
DECLARE @ROWS_INSERTED INT = 0
DECLARE @CURRENT_ROW_COUNT INT = 0
DECLARE @NEWLY_INSERTED_ID INT
DECLARE @LANGUAGE_PAIR_IDS TABLE ( LangugePairId INT, NewId INT, SourceLanguage varchar(100), TargetLangauge varchar(100) )
WHILE (@ROWS_INSERTED < @DESIRED_ROW_COUNT)
BEGIN
    SET @myCursor = CURSOR FOR
    SELECT Id FROM MyTable
    SET @CURRENT_ROW_COUNT = (SELECT COUNT(ID) FROM MyTable)
    OPEN @myCursor;
    FETCH NEXT FROM @myCursor INTO @MyId;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        IF ((@CURRENT_ROW_COUNT < @DESIRED_ROW_COUNT) AND (@ROWS_INSERTED < @DESIRED_ROW_COUNT))
        BEGIN
            -- Copy one row of table 1 with a new random Column1
            INSERT INTO [dbo].[MyTable]
                ([Column1]
                ,[Column2]
                ,[Column3]
                )
            SELECT
                convert(numeric(9,0),rand() * 899999999) + 100000000
                ,Column2
                ,Column3
            FROM MyTable
            WHERE Id = @MyId
            SET @NEWLY_INSERTED_ID = SCOPE_IDENTITY()
            -- Copy the table 2 rows and remember the new language pair IDs
            INSERT INTO [dbo].[Language]
                ([MyTable1Id]
                ,[Target]
                ,[Source])
            OUTPUT inserted.Id, inserted.MyTable1Id, inserted.Source, inserted.[Target] INTO @LANGUAGE_PAIR_IDS (LangugePairId, NewId, SourceLanguage, TargetLangauge)
            SELECT
                @NEWLY_INSERTED_ID
                ,[Target]
                ,[Source]
            FROM [dbo].[Language]
            WHERE MyTable1Id = @MyId
            ORDER BY Id
            -- Map old language pair IDs to the new ones
            DECLARE @tbl AS TABLE (newLanguageId INT, oldLanguageId INT, sourceLanguage VARCHAR(100), targetLanguage VARCHAR(100))
            INSERT INTO @tbl (newLanguageId, oldLanguageId, sourceLanguage, targetLanguage)
            SELECT 0, Id, [Source], [Target] FROM Language WHERE MyTable1Id = @MyId ORDER BY Id
            UPDATE t
            SET t.newlanguageid = lp.LangugePairId
            FROM @tbl t
            JOIN @LANGUAGE_PAIR_IDS lp
                ON t.sourceLanguage = lp.SourceLanguage
                AND t.targetLanguage = lp.TargetLangauge
            -- Copy the table 3 rows, pointing them at the new language pair IDs
            INSERT INTO [dbo].[Manager]
                ([LanguagePairId]
                ,[UserId]
                ,[MyDate])
            SELECT
                tbl.newLanguageId
                ,m.[UserId]
                ,m.[MyDate]
            FROM Manager m
            INNER JOIN @tbl tbl
                ON m.LanguagePairId = tbl.oldLanguageId
            WHERE m.LanguagePairId in (SELECT Id FROM Language WHERE MyTable1Id = @MyId) -- returns the old language pair id
            SET @ROWS_INSERTED += 1
            SET @CURRENT_ROW_COUNT +=1
        END
        ELSE
        BEGIN
            PRINT 'REACHED EXIT'
            SET @ROWS_INSERTED = @DESIRED_ROW_COUNT
            BREAK
        END
        FETCH NEXT FROM @myCursor INTO @MyId;
    END
    CLOSE @myCursor
    DEALLOCATE @myCursor
END
The above code works! It generates the data I need. However, it's very slow. For comparison, the initial data load was ~60,000 records in Table1, ~74,000 in Table2 and ~3,400 in Table3.
I tried to insert 9,000 rows into Table1. With the above code, it took 17:05:01 to complete.
Any suggestions on how I can optimize the query to run a little faster? My goal is to insert 1-2 million records into Table1 without having to wait for days. I'm not tied to the CURSOR; I'm OK with achieving the same result any other way.
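One set-based direction, offered only as a sketch: a MERGE with an always-false join condition behaves like an INSERT, but its OUTPUT clause may reference source columns, so the old-to-new ID mapping can be captured for a whole batch at once instead of calling SCOPE_IDENTITY() once per row. The names @BatchSize and @IdMap are introduced here for illustration, the sketch assumes MyTable.Id is an identity column, and it only covers the first two tables.

```sql
-- Sketch only: @BatchSize and @IdMap are hypothetical names; assumes
-- dbo.MyTable.Id is an IDENTITY column, as the SCOPE_IDENTITY() call suggests.
DECLARE @BatchSize INT = 9000;
DECLARE @IdMap TABLE (OldId INT PRIMARY KEY, NewId INT NOT NULL);

-- Copy a batch of rows in one statement and record old Id -> new Id.
MERGE INTO dbo.MyTable AS tgt
USING (
    SELECT TOP (@BatchSize)
           Id,
           -- RAND() without a seed yields one value per statement, so seed it per row.
           CONVERT(NUMERIC(9,0), RAND(CHECKSUM(NEWID())) * 899999999) + 100000000 AS NewColumn1,
           Column2,
           Column3
    FROM dbo.MyTable
    ORDER BY Id
) AS src
ON 1 = 0                      -- never matches, so every source row is inserted
WHEN NOT MATCHED THEN
    INSERT (Column1, Column2, Column3)
    VALUES (src.NewColumn1, src.Column2, src.Column3)
OUTPUT src.Id, inserted.Id INTO @IdMap (OldId, NewId);

-- Copy the child rows for the whole batch with a join on the mapping.
INSERT INTO dbo.Language (MyTable1Id, [Target], [Source])
SELECT m.NewId, l.[Target], l.[Source]
FROM dbo.Language AS l
INNER JOIN @IdMap AS m ON m.OldId = l.MyTable1Id;
```

The Manager table could plausibly be handled the same way, with a second MERGE on Language that captures the old and new language pair IDs.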

How to get records that are between m records back and n records forward from a reference row - Non-sequential data

My scenario is as follows:
I have a reference record, say, ProductId = 1
The records each have a non-unique ItemTypeId
I would like to fetch the records that exist between the following points
START POINT being 2 records BACKWARDS of type ItemTypeId = 1, from record of ProductId =1
END POINT being 3 records FORWARDS of type ItemTypeId = 1, from record of ProductId = 1
The query should get ALL data between the two points, inclusively
Here's a picture that illustrates this better than my words:
How would I structure my query to do this?
Any better way to do it without temp tables?
Thank-you!
Note that for this to work at all, you need that record ID to be an actual column in the table. Rows have no inherent order in a table.
With that in place, you can use LAG and LEAD to get what you want:
CREATE TABLE #t
(
RecordId INT IDENTITY(1,1),
ProductId INT,
ItemType INT
);
INSERT INTO #t(ProductId, ItemType)
VALUES
(5,1),(3,1),(7,3),(6,1),(2,7),
(1,1),(7,3),(8,1),(10,3),(9,5),
(11,1),(19,1),(17,4),(13,3);
WITH c1 AS
(
SELECT ProductId,
RecordId,
LAG(RecordId,2) OVER (ORDER BY RecordId) AS Back2,
LEAD(RecordId,3) OVER (ORDER BY RecordId) AS Forward3
FROM #t
WHERE ItemType = (SELECT ItemType FROM #t WHERE ProductId = 1)
),c2 AS
(
SELECT c1.Back2, c1.Forward3 FROM c1
WHERE c1.ProductId = 1
)
SELECT #t.*
FROM #t
INNER JOIN c2 ON #t.RecordId BETWEEN c2.Back2 AND c2.Forward3;
If you want to do it without temp tables, as you ask, the following solution works.
I agree it is not very nice, though.
This is what I did:
CREATE DATABASE TEST;
USE TEST
CREATE TABLE PRODUCT
(
ProductId INT,
ItemType INT
)
INSERT INTO PRODUCT
VALUES
(5,1),
(3,1),
(7,3),
(6,1),
(2,7),
(1,1),
(7,3),
(8,1),
(10,3),
(9,5),
(11,1),
(19,1),
(17,4),
(13,3)
DECLARE product_cursor CURSOR FOR
SELECT * FROM PRODUCT;
OPEN product_cursor
DECLARE
    @ProductId INT,
    @ItemId INT,
    @END_FETCH INT,
    @countFrom INT,
    @countTo INT
DECLARE @TableResult TABLE
(
    RProductId INT,
    RItemId INT
)
FETCH NEXT FROM product_cursor
INTO @ProductId, @ItemId
SET @END_FETCH = 0
SET @countFrom = 0
SET @countTo = 0
WHILE @@FETCH_STATUS = 0 AND @END_FETCH = 0
BEGIN
    IF @ItemId = 1 AND (@countFrom = 0 AND @countTo = 0)
    BEGIN
        SET @countFrom = 3
        SET @countTo = 3
    END
    ELSE
    BEGIN
        IF @countFrom > 0
        BEGIN
            --SELECT 'INSERTION : ' ,@ProductId,@ItemId
            INSERT INTO @TableResult VALUES(@ProductId, @ItemId)
            IF @ItemId = 1
            BEGIN
                SET @countFrom -= 1
                --SELECT 'CountFrom : ', @countFrom
            END
        END
        ELSE
        BEGIN
            IF @countTo > 0
            BEGIN
                --SELECT 'INSERTION : ' ,@ProductId,@ItemId
                INSERT INTO @TableResult VALUES(@ProductId, @ItemId)
                IF @ItemId = 1
                BEGIN
                    SET @countTo -= 1
                    --SELECT 'CountTO : ', @countTo
                END
            END
            ELSE
            BEGIN
                SET @END_FETCH = 1
            END
        END
    END
    FETCH NEXT FROM product_cursor
    INTO @ProductId, @ItemId
END
CLOSE product_cursor
DEALLOCATE product_cursor
SELECT * FROM @TableResult
And this is the result I got:
RProductId RItemId
3 1
7 3
6 1
2 7
1 1
7 3
8 1
10 3
9 5
11 1
19 1
But I prefer @James Casey's solution.
By the way, why don't you want to use a temp table?

Loop for each row

I have two tables with FOREIGN KEY([Table_ID])
Columns
ID Table_ID ActiveFlag
1 1 0
2 2 1
3 1 1
4 3 0
Sys_Tables
Table_ID Name
1 Request
2 Plan
3 Contecst
I'm writing a stored procedure that returns the columns for each table.
Example output for the values above:
--first output table
ID Table_ID ActiveFlag
1 1 0
3 1 1
--second output table
ID Table_ID ActiveFlag
2 2 1
--third output table
ID Table_ID ActiveFlag
4 3 0
My idea is this
Select c.*
from Ccolumns c
inner join Sys_tables t
    on t.Table_ID = c.Table_ID and t.Table_ID = @Parameter
My problem is that I don't know how to loop over each row, and I'd like the best way to do it. For example, I could use the following loop:
DECLARE @i int = 0
DECLARE @count int;
select @count = count(t.Table_ID)
from Sys_tables t
WHILE @i < @count BEGIN
    SET @i = @i + 1
    --DO ABOVE SELECT
END
But this is not entirely correct, because the data in my Sys_tables may look like this:
Table_ID Name
1 Request
102 Plan
1001 Contecst
Do you have any idea?
There are a couple of ways you can achieve that, loops and cursors, but first you should know that it's a bad idea: both are very slow. Anyway, here's a sample loop:
declare @row_ids table (
    id INT IDENTITY (1, 1),
    rid INT
);
insert into @row_ids (rid) select someIdField from SomeTable
declare @cnt INT = @@ROWCOUNT
declare @currentRow INT = 1
WHILE (@currentRow <= @cnt)
BEGIN
    SELECT rid FROM @row_ids WHERE id = @currentRow
    SET @currentRow = @currentRow + 1
END
I guess you're using SQL Server, right?
Then, you can use a CURSOR as here: How to write a cursor inside a stored procedure in SQL Server 2008
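For the non-contiguous IDs in the question (1, 102, 1001), a counter from 1 to COUNT(*) does not line up with Table_ID. One way to sketch a loop that follows the actual key values is to step to the next-larger Table_ID on each pass. The variable @TableId is a name introduced here for illustration, and Ccolumns is the table name taken from the question's own query.

```sql
-- Sketch only: walks Sys_Tables by key value, so gaps in Table_ID do not matter.
DECLARE @TableId INT;

-- Start at the smallest Table_ID (NULL if the table is empty).
SELECT @TableId = MIN(Table_ID) FROM Sys_Tables;

WHILE @TableId IS NOT NULL
BEGIN
    -- One result set per table, as in the example output.
    SELECT c.*
    FROM Ccolumns AS c
    WHERE c.Table_ID = @TableId;

    -- Move to the next-larger key; MIN over no rows yields NULL and ends the loop.
    SELECT @TableId = MIN(Table_ID)
    FROM Sys_Tables
    WHERE Table_ID > @TableId;
END;
```

Because the aggregate always returns exactly one row, @TableId is reliably set to NULL after the last key, which is what terminates the loop.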

Deduplication of imported records in SQL server

I have the following T-SQL stored procedure that currently takes up 50% of the total time needed to run all processes on newly imported records in our backend analysis suite. Unfortunately, this data needs to be imported every time, and it is becoming a bottleneck as our DB size grows.
Basically, we are trying to identify all duplicates among the records and keep only one of each.
DECLARE @status INT
SET @status = 3
DECLARE @contactid INT
DECLARE @email VARCHAR (100)
--Contacts
DECLARE email_cursor CURSOR FOR
SELECT email FROM contacts WHERE (reference = @reference AND status = 1 ) GROUP BY email HAVING (COUNT(email) > 1)
OPEN email_cursor
FETCH NEXT FROM email_cursor INTO @email
WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @email
    UPDATE contacts SET duplicate = 1, status = @status WHERE email = @email and reference = @reference AND status = 1
    SELECT TOP 1 @contactid = id FROM contacts where reference = @reference and email = @email AND duplicate = 1
    UPDATE contacts SET duplicate =0, status = 1 WHERE id = @contactid
    FETCH NEXT FROM email_cursor INTO @email
END
CLOSE email_cursor
DEALLOCATE email_cursor
I have added all the indexes I can see from query execution plans, but it may be possible to update the entire SP to run differently, as I have managed to do with others.
Use this single query to de-dup.
;with tmp as (
select *
,rn=row_number() over (partition by email, reference order by id)
,c=count(1) over (partition by email, reference)
from contacts
where status = 1
)
update tmp
set duplicate = case when rn=1 then 0 else 1 end
,status = case when rn=1 then 1 else 3 end
where c > 1
;
It will only de-dup among the records where status=1, and considers rows with the same (email,reference) combination as dups.

SQL Query return values in a set sequence

I have been trying for a while now to return data from the database with the ID(int) values in the following order.
3, 6, 1, 9, 2, 5.
Is there any way this can be done?
EDIT: OK, I made a bit of a mess of my post. The IDs above are just an example.
I am trying to do this dynamically, based on how many records from another table are linked to the record I want to pull out. E.g. I host 3 branches and each branch has a group of shops; how would I determine which has the most?
I hope this helps.
Yes, something like this:
select ID from tablename
order by
CASE WHEN ID = 3 THEN 1
WHEN ID = 6 THEN 2
WHEN ID = 1 THEN 3
WHEN ID = 9 THEN 4
WHEN ID = 2 THEN 5
WHEN ID = 5 THEN 6
ELSE 7 END, ID ASC
This will put 3,6,1,9,2,5 first, and afterwards the other numbers in ascending order.
select cols from table
order by
case ID when 3 then 0
when 6 then 1
when 1 then 2
when 9 then 3
...
end
You get the idea...
Create a table for the sorting.
CREATE TABLE SortPriority (
SourceID int NULL,
Priority int NULL)
Populate it with the IDs and the order they should show up in. Join to the table and use SortPriority.Priority in your sorting.
You can more easily change the sorting around this way. You would just need to modify the data. You can also later write scripts to populate the table to handle predictable needs in the changing of the sorting.
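As a rough sketch of that approach, using the example IDs from the question: the table name tablename is borrowed from the CASE answer above and is only a placeholder, and SortPriority is assumed to already exist as defined in this answer.

```sql
-- Sketch only: assumes SortPriority exists as created above and that
-- "tablename" (a placeholder) has an integer ID column.
INSERT INTO SortPriority (SourceID, Priority)
VALUES (3, 1), (6, 2), (1, 3), (9, 4), (2, 5), (5, 6);

SELECT t.ID
FROM tablename AS t
LEFT JOIN SortPriority AS sp
    ON sp.SourceID = t.ID
ORDER BY
    CASE WHEN sp.Priority IS NULL THEN 1 ELSE 0 END,  -- prioritized IDs first
    sp.Priority,                                      -- then in the configured order
    t.ID;                                             -- remaining IDs ascending
```

Changing the order later then only means updating rows in SortPriority, not editing the query.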
A split function like this one:
CREATE FUNCTION fnSplit(@str varchar(max), @dlm char(1))
RETURNS @result TABLE (id int, value varchar(50))
AS BEGIN
    DECLARE
        @id int, @value varchar(50),
        @lastpos int, @pos int, @len int;
    SET @id = 0;
    SET @len = LEN(@str);
    SET @lastpos = 1;
    SET @pos = CHARINDEX(@dlm, @str + @dlm);
    IF @pos <> 0
        WHILE 1 = 1 BEGIN
            SET @value = SUBSTRING(@str, @lastpos, @pos - @lastpos);
            IF @value <> '' BEGIN
                SET @id = @id + 1;
                INSERT INTO @result VALUES (@id, @value);
            END;
            IF @pos > @len BREAK;
            SET @lastpos = @pos + 1;
            SET @pos = CHARINDEX(@dlm, @str + @dlm, @lastpos);
        END;
    RETURN;
END
would return a row set containing not only the values, but also their indexes within the list. You could then use the function in this way:
SELECT
…
FROM atable t
LEFT JOIN dbo.fnSplit('3,6,1,9,2,5', ',') s ON t.Value = s.Value
ORDER BY
CASE WHEN s.id IS NULL THEN 2147483647 ELSE s.id END