Fetch set of records in sequence until n rows - SQL

I have a table with almost 1.7M rows. I have to fetch a set of 100 rows, perform an operation on them, and once the first set of 100 is done, perform the same operation on rows 101 to 200, and so on for all the rows in the table. I also have a RowNumber column. What is the best approach to accomplish this?

I had the same problem in an SSIS package, and I solved it like this:

DECLARE @i INT, @RowsLimit INT, @Count INT
SET @i = 1
SET @RowsLimit = 100
SELECT @Count = COUNT(*) FROM yourTable

WHILE @i <= CEILING(@Count * 1.0 / @RowsLimit) -- or hardcode @i <= 17000
BEGIN
    SELECT *
    FROM yourTable
    ORDER BY yourKeyColumn -- OFFSET/FETCH requires an ORDER BY; use your key column
    OFFSET (@RowsLimit * (@i - 1)) ROWS
    FETCH NEXT @RowsLimit ROWS ONLY

    SET @i = @i + 1
END

Inside the loop you can place your logic.
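If the table already has the RowNumber column mentioned in the question, a keyset-style loop may be cheaper than OFFSET, since OFFSET re-reads all the skipped rows on every iteration. A minimal sketch, assuming RowNumber is indexed (yourTable and the processing step are placeholders):

DECLARE @From INT = 0, @To INT, @BatchSize INT = 100

WHILE 1 = 1
BEGIN
    -- find the upper bound of the next batch; NULL means nothing is left
    SELECT @To = MAX(RowNumber)
    FROM (SELECT TOP (@BatchSize) RowNumber
          FROM yourTable
          WHERE RowNumber > @From
          ORDER BY RowNumber) AS batch

    IF @To IS NULL BREAK

    -- the next set of up to 100 rows to operate on
    SELECT *
    FROM yourTable
    WHERE RowNumber > @From AND RowNumber <= @To

    SET @From = @To
END

Each iteration seeks directly to RowNumber > @From instead of counting past all earlier rows, so the cost per batch stays roughly constant.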

Repeat query if no results came up

Could someone please advise how to repeat the query if it returns no results? I am trying to draw a random person out of the DB using RAND, but only one whose number was not drawn previously (that info is stored in the column "allready_drawn").
At this point, when the query lands on a number that was drawn before, the second condition, "is null", filters it out and no result is displayed.
I need the query to re-run until it comes up with a number.
DECLARE @min INTEGER;
DECLARE @max INTEGER;
set @min = (select top 1 id from [dbo].[persons] where sector = 8 order by id ASC);
set @max = (select top 1 id from [dbo].[persons] where sector = 8 order by id DESC);

select ordial, name_surname
from [dbo].[persons]
where id = ROUND(((@max - @min) * RAND() + @min), 0) and allready_drawn is NULL
(Screenshots of the two possible outcomes omitted.)
Any suggestion is appreciated and I would like to thank everyone in advance.
Just try this, which removes the "id" filter so you only have to run it once:
select TOP 1 ordial, name_surname
from [dbo].[persons]
where allready_drawn is NULL
ORDER BY NEWID()
@gbn that's a correct solution, but possibly too expensive, since ORDER BY NEWID() sorts the whole table. For very large tables with dense keys, randomly picking a key value between the min and max and re-picking until you find a match is also fair, and cheaper than sorting the whole table.
Also, there's a bug in the original post: the min and max rows will be selected only half as often as the others, as each maps to a smaller interval. To fix it, generate a random number from @min to @max + 1 and truncate rather than round. That way the interval [N, N+1) maps to N, ensuring a fair chance for each N.
For this selection method, here's how to repeat until you find a match.
--drop table persons
go
create table persons(id int, ordial int, name_surname varchar(2000), sector int, allready_drawn bit)
insert into persons(id, ordial, name_surname, sector, allready_drawn)
values (1,1,'foo',8,null),(2,2,'foo2',8,null),(100,100,'foo100',8,null)
go
declare @min int = (select top 1 id from [dbo].[persons] where sector = 8 order by id ASC);
declare @max int = 1 + (select top 1 id from [dbo].[persons] where sector = 8 order by id DESC);
set nocount on
declare @results table(ordial int, name_surname varchar(2000))
declare @i int = 0
declare @selected bit = 0

while @selected = 0
begin
    set @i += 1

    insert into @results(ordial, name_surname)
    select ordial, name_surname
    from [dbo].[persons]
    where id = ROUND(((@max - @min) * RAND() + @min), 0, 1) -- third argument truncates instead of rounding
      and allready_drawn is NULL

    if @@ROWCOUNT > 0
    begin
        select *, @i tries from @results
        set @selected = 1
    end
end

How to optimize this T-SQL script by avoiding a loop?

I use the following SQL query to update MyTable. The code takes between 5 and 15 minutes to update MyTable as long as ROWS <= 100000000, but when ROWS > 100000000 it takes exponentially longer. How can I change this code to be set-based instead of using a while loop?
DECLARE @startTime DATETIME
DECLARE @batchSize INT
DECLARE @iterationCount INT
DECLARE @i INT
DECLARE @from INT
DECLARE @to INT

SET @batchSize = 10000
SET @i = 0

SELECT @iterationCount = COUNT(*) / @batchSize
FROM MyTable
WHERE LitraID = 8175
  AND id BETWEEN 100000000 AND 300000000

WHILE @i <= @iterationCount BEGIN
    BEGIN TRANSACTION T
    SET @startTime = GETDATE()
    SET @from = @i * @batchSize
    SET @to = (@i + 1) * @batchSize - 1

    ;WITH data AS (
        SELECT DoorsReleased, ROW_NUMBER() OVER (ORDER BY id) AS Row
        FROM MyTable
        WHERE LitraID = 8175
          AND id BETWEEN 100000000 AND 300000000
    )
    UPDATE data
    SET DoorsReleased = ~DoorsReleased
    WHERE Row BETWEEN @from AND @to

    SET @i = @i + 1
    COMMIT TRANSACTION T
END
One of your issues is that the SELECT statement in the loop fetches all records for LitraID = 8175 and numbers them, and only then does the UPDATE filter by row number. This happens on every iteration.
One way around this would be to get all the ids for the update before entering the loop and store them in a temporary table. Then you can write a query similar to the one you have, but joining to this table of ids, as in the sketch below.
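A minimal sketch of that idea (the temp table name, index, and batch bookkeeping are illustrative, not from the original answer):

-- number the qualifying ids once, up front
SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS Row
INTO #idsToUpdate
FROM MyTable
WHERE LitraID = 8175
  AND id BETWEEN 100000000 AND 300000000

CREATE UNIQUE CLUSTERED INDEX IX_Row ON #idsToUpdate (Row)

DECLARE @batchSize INT = 10000
DECLARE @i INT = 0
DECLARE @total INT = (SELECT COUNT(*) FROM #idsToUpdate)

WHILE @i * @batchSize < @total
BEGIN
    -- each batch touches only its own slice of pre-numbered ids
    UPDATE m
    SET DoorsReleased = ~DoorsReleased
    FROM MyTable m
    JOIN #idsToUpdate t ON t.id = m.id
    WHERE t.Row BETWEEN @i * @batchSize + 1 AND (@i + 1) * @batchSize

    SET @i = @i + 1
END

DROP TABLE #idsToUpdate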
However, there is an even easier way if you know approximately how many records have LitraID = 8175, and they are spread throughout the table rather than bunched together with similar ids.
DECLARE @batchSize INT
DECLARE @minId INT
DECLARE @maxId INT

SET @batchSize = 10000 -- adjust according to how frequently LitraID = 8175; use larger numbers if infrequent
SET @minId = 100000000

WHILE @minId <= 300000000 BEGIN
    SET @maxId = @minId + @batchSize - 1
    IF @maxId > 300000000 BEGIN
        SET @maxId = 300000000
    END

    BEGIN TRANSACTION T
    UPDATE MyTable
    SET DoorsReleased = ~DoorsReleased
    WHERE id BETWEEN @minId AND @maxId
    COMMIT TRANSACTION T

    SET @minId = @maxId + 1
END
This uses the value of id to control the loop, so you don't need the extra step of calculating @iterationCount. It uses small batches so that the table isn't locked for long periods, it has no unnecessary SELECT statements, and the WHERE clause in the UPDATE is efficient assuming id has an index.
It won't update exactly the same number of records in every transaction, but there's no reason it needs to.
This will eliminate the loop:

UPDATE MyTable
SET DoorsReleased = ~DoorsReleased
WHERE LitraID = 8175
  AND id BETWEEN 100000000 AND 300000000
  AND DoorsReleased IS NOT NULL -- if DoorsReleased is nullable

If you are set on looping, note that the version below will NOT work. I originally thought ~ was part of the column name, but it is the bitwise NOT operator, so DoorsReleased <> ~DoorsReleased is always true for non-null values: the filter never shrinks the set and the loop never terminates.
select 1; -- seed @@ROWCOUNT so the loop is entered
WHILE (@@ROWCOUNT > 0)
BEGIN
    UPDATE TOP (100000) MyTable
    SET DoorsReleased = ~DoorsReleased
    WHERE LitraID = 8175
      AND id BETWEEN 100000000 AND 300000000
      AND ( DoorsReleased <> ~DoorsReleased
            OR ( DoorsReleased IS NULL AND ~DoorsReleased IS NOT NULL ) )
END
Inside a transaction I don't think looping would have value, as the transaction log cannot clear. And a batch size of 10,000 is small.
As stated in a comment, if you want to loop, then try using id rather than row_number(); all those loops are expensive. You might also be able to use OFFSET, as in the sketch below.
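A possible shape for that (a sketch, not from the original answer): use OFFSET/FETCH only to locate the id boundaries of each batch, then update by id range, assuming id is indexed.

DECLARE @batchSize INT = 100000
DECLARE @offset INT = 0
DECLARE @fromId INT, @toId INT

WHILE 1 = 1
BEGIN
    -- find the id range of the next batch of qualifying rows
    SELECT @fromId = MIN(id), @toId = MAX(id)
    FROM (SELECT id
          FROM MyTable
          WHERE LitraID = 8175
            AND id BETWEEN 100000000 AND 300000000
          ORDER BY id
          OFFSET @offset ROWS FETCH NEXT @batchSize ROWS ONLY) AS batch

    IF @fromId IS NULL BREAK -- no qualifying rows left

    UPDATE MyTable
    SET DoorsReleased = ~DoorsReleased
    WHERE LitraID = 8175
      AND id BETWEEN @fromId AND @toId

    SET @offset = @offset + @batchSize
END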

Getting one record at a time in a loop takes a long time. Is there a way around it?

This is what I have, and it takes forever. I can't figure out a different way of doing this.
I tried cursors; that is even slower. Any ideas? Thank you.
declare @i int
declare @customer_sk int
declare @numrows int
declare @iprotable TABLE (idx int Primary Key IDENTITY(1,1), customer_sk int)

INSERT @iprotable
select distinct ipro.Customer_SK
from IproProfile ipro
inner join vwUsageLast60Days usage on usage.Customer_SK = ipro.Customer_SK

SET @i = 1
SET @numrows = (SELECT COUNT(*) FROM @iprotable)

IF @numrows > 0
    WHILE (@i <= (SELECT MAX(idx) FROM @iprotable))
    BEGIN
        SET @customer_sk = (SELECT customer_sk FROM @iprotable WHERE idx = @i)

        update IproProfile set TopClassification = x.clas
        from
        (
            select top 1 website web, classification clas, COUNT(classification) cnt, customer_sk cust_sk
            from vwusagelast60days
            group by website, customer_sk, classification
            having Customer_SK = @customer_sk
            order by cnt desc
        ) x
        where Customer_SK = x.cust_sk

        set @i = @i + 1
    end
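A set-based alternative (a sketch, using the same IproProfile and vwUsageLast60Days names as the question): rank each customer's classifications once with ROW_NUMBER() and update from the top-ranked row, which removes the per-customer loop entirely.

;WITH ranked AS (
    -- same grouping as the original TOP 1 subquery, computed for all customers at once
    SELECT customer_sk,
           classification,
           ROW_NUMBER() OVER (PARTITION BY customer_sk
                              ORDER BY COUNT(classification) DESC) AS rn
    FROM vwUsageLast60Days
    GROUP BY website, customer_sk, classification
)
UPDATE ipro
SET TopClassification = r.classification
FROM IproProfile ipro
INNER JOIN ranked r ON r.customer_sk = ipro.Customer_SK
WHERE r.rn = 1

This mirrors the top 1 ... order by cnt desc subquery above, but evaluates it in a single statement instead of once per customer.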

Loop for each row

I have two tables linked with FOREIGN KEY([Table_ID]):

Columns

ID    Table_ID    ActiveFlag
1     1           0
2     2           1
3     1           1
4     3           0

Sys_Tables

Table_ID    Name
1           Request
2           Plan
3           Contecst
I'm writing a stored procedure that returns the Columns rows for each table, one result set per table.
Example output for the values above:

--first output table
ID    Table_ID    ActiveFlag
1     1           0
3     1           1

--second output table
ID    Table_ID    ActiveFlag
2     2           1

--third output table
ID    Table_ID    ActiveFlag
4     3           0
My idea is this:

Select c.*
from Ccolumns c
inner join Sys_tables t
    on t.Table_ID = c.Table_ID and t.Table_ID = @Parameter

My problem is that I don't know how to make a loop over each row, and I need the best way to do it. For example, I could use the following loop:
DECLARE @i int = 0
DECLARE @count int;

select @count = count(t.Table_ID)
from Sys_tables t

WHILE @i < @count BEGIN
    SET @i = @i + 1
    --DO ABOVE SELECT
END

But this is not entirely correct, because the Table_ID values are not necessarily contiguous. For example, my Sys_Tables data may be:

Table_ID    Name
1           Request
102         Plan
1001        Contecst

Do you have any idea?
There are a couple of ways you can achieve that: loops and cursors. But first of all you need to know that it's a bad idea: both are very slow. Anyway, here's a loop sample:
declare @row_ids table (
    id INT IDENTITY (1, 1),
    rid INT
);

insert into @row_ids (rid) select someIdField from SomeTable

declare @cnt INT = @@ROWCOUNT
declare @currentRow INT = 1

WHILE (@currentRow <= @cnt)
BEGIN
    -- fetch the current id; run your per-table SELECT here
    SELECT rid FROM @row_ids WHERE id = @currentRow
    SET @currentRow = @currentRow + 1
END
I guess you're using SQL Server, right?
Then you can use a CURSOR, as in: How to write a cursor inside a stored procedure in SQL Server 2008. For instance, something like the sketch below.
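A minimal cursor sketch for this case (assuming the Sys_Tables and Ccolumns names from the question; one result set is returned per table):

DECLARE @tableId INT

DECLARE table_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT Table_ID FROM Sys_Tables

OPEN table_cursor
FETCH NEXT FROM table_cursor INTO @tableId

WHILE @@FETCH_STATUS = 0
BEGIN
    -- one result set per Table_ID, regardless of gaps in the numbering
    SELECT c.*
    FROM Ccolumns c
    WHERE c.Table_ID = @tableId

    FETCH NEXT FROM table_cursor INTO @tableId
END

CLOSE table_cursor
DEALLOCATE table_cursor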

Query not working correctly in a while loop

I have a While loop where I am trying to insert.
DECLARE @CurrentOffer int = 121
DECLARE @OldestOffer int = 115
DECLARE @MinClubcardID bigint = 0
DECLARE @MaxClubcardID bigint = 1000

WHILE 1 = 1
BEGIN
    INSERT INTO Temp WITH (TABLOCK)
    SELECT top (100) clubcard from TempClub with (nolock)
    where ID between @MinClubcardID and @MaxClubcardID

    declare @sql varchar (8000)

    while @OldestOffer <= @CurrentOffer
    begin
        print @CurrentOffer
        print @OldestOffer

        set @sql = 'delete from Temp where Clubcard
        in (select Clubcard from ClubTransaction_' + convert(varchar, @CurrentOffer) + ' with (nolock))'
        print (@sql)
        exec (@sql)

        SET @CurrentOffer = @CurrentOffer - 1

        IF @OldestOffer = @CurrentOffer
        begin
            -- my logic
        end
    end
end
My TempClub table always checks only the first 100 records. My TempClub table has 3000 records.
I need to check all 3000 clubcard records against the ClubTransaction_121, ClubTransaction_120, and ClubTransaction_119 tables.
The SELECT query returns only the top 100 rows:
SELECT top (100) clubcard from TempClub ...
If you want to retrieve all items, remove the top (100) part of your statement:
SELECT clubcard from TempClub ...
In order to do batch-type processing, you need to set @MinClubcardID to the last ID processed plus 1, and include an ORDER BY ID to ensure that the records are returned in order.
But... I wouldn't use the approach of using the primary key as my "index". What you're looking for is a basic pagination pattern. In SQL Server 2005+, Microsoft introduced the row_number() function, which makes pagination a lot easier.
For example:
DECLARE @T TABLE (clubcard INT)
DECLARE @start INT
SET @start = 0

WHILE (1=1)
BEGIN
    INSERT @T (clubcard)
    SELECT TOP 100 clubcard FROM
    (
        SELECT clubcard,
               ROW_NUMBER() OVER (ORDER BY ID) AS num
        FROM dbo.TempClub
    ) AS t
    WHERE num > @start

    IF (@@ROWCOUNT = 0) BREAK;

    -- update counter
    SET @start = @start + 100

    -- process records found

    -- make sure temp table is empty
    DELETE FROM @T
END
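For completeness, the last-ID approach mentioned above might look like this sketch (assuming TempClub.ID is an indexed key; the other names follow the question):

DECLARE @T TABLE (ID bigint, clubcard INT)
DECLARE @lastId bigint = 0

WHILE 1 = 1
BEGIN
    DELETE FROM @T

    -- seek straight to the rows after the last processed ID
    INSERT @T (ID, clubcard)
    SELECT TOP (100) ID, clubcard
    FROM dbo.TempClub
    WHERE ID > @lastId
    ORDER BY ID

    IF @@ROWCOUNT = 0 BREAK

    -- process the batch in @T here

    SET @lastId = (SELECT MAX(ID) FROM @T)
END

Unlike the row_number() version, this never re-numbers the whole table on each pass.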