Split query result by half in TSQL (obtain 2 resultsets/tables) - sql-server-2005

I have a query that returns a large number of heavy rows.
When I transform these rows into a list of CustomObject I get a big memory peak, and the transformation is done by a custom .NET framework that I can't modify.
I need to retrieve fewer rows at a time so that I can do the transformation in two passes and avoid the memory peak.
How can I split the result of a query in half? I need to do it in the DB layer. I thought of using "TOP count(*)/2", but how do I get the other half?
Thank you!

If you have an identity field in the table, select the even ids first, then the odd ones.
select * from Table where Id % 2 = 0
select * from Table where Id % 2 = 1
You should get roughly 50% of the rows in each set.
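A small variation on the even/odd split, offered as a sketch rather than a tested solution: in T-SQL the % operator returns a result with the sign of the dividend, so a negative odd Id yields -1 and would match neither `= 0` nor `= 1`. If the identity column can hold negative values (an assumption; most identity columns start at 1), testing `<> 0` for the second half covers them. dbo.YourTable and Id are placeholder names.

```sql
-- First half: even ids (including zero and negative even values)
SELECT * FROM dbo.YourTable WHERE Id % 2 = 0;

-- Second half: everything else; <> 0 also matches negative odd ids, where Id % 2 = -1
SELECT * FROM dbo.YourTable WHERE Id % 2 <> 0;
```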

Here is another way to do it, from http://www.tek-tips.com/viewthread.cfm?qid=1280248&page=5. I think it's more efficient:
Declare @Rows Int
Declare @TopRows Int
Declare @BottomRows Int
Select @Rows = Count(*) From TableName
If @Rows % 2 = 1
Begin
    Set @TopRows = @Rows / 2
    Set @BottomRows = @TopRows + 1
End
Else
Begin
    Set @TopRows = @Rows / 2
    Set @BottomRows = @TopRows
End
Set RowCount @TopRows
Select * From TableName Order By DisplayOrder
Set RowCount @BottomRows
Select * From TableName Order By DisplayOrder DESC
Set RowCount 0 -- reset so later statements are not limited
--- old answer below ---
Is this a stored procedure call or dynamic SQL? Can you use temp tables?
If so, something like this would work:
select row_number() OVER(order by yourorderfield) as rowNumber, *
INTO #tmp
FROM dbo.yourtable
declare @rowCount int
SELECT @rowCount = count(1) from #tmp
SELECT * from #tmp where rowNumber <= @rowCount / 2
SELECT * from #tmp where rowNumber > @rowCount / 2
DROP TABLE #tmp

SELECT TOP 50 PERCENT WITH TIES ... ORDER BY SomeThing
then
SELECT TOP 50 PERCENT ... ORDER BY SomeThing DESC
However, unless you snapshot the data first, a row in the middle may slip through or be processed twice
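One way to take such a snapshot, as a sketch only: copy the rows into a temp table once, tagging each row with NTILE(2), then read each half from the copy. dbo.YourTable, SomeThing, Half and #snapshot are placeholder names, not taken from the question. NTILE is available from SQL Server 2005 onward; when the row count is odd, the first group receives the extra row.

```sql
-- Snapshot the rows once and tag each one with its half (1 or 2)
SELECT NTILE(2) OVER (ORDER BY SomeThing) AS Half, *
INTO #snapshot
FROM dbo.YourTable;

-- Both reads come from the same frozen copy, so no row is skipped or returned twice
SELECT * FROM #snapshot WHERE Half = 1;
SELECT * FROM #snapshot WHERE Half = 2;

DROP TABLE #snapshot;
```

The extra Half column appears in both result sets; list the columns explicitly in the final SELECTs if the consumer should not see it.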

I don't think you should do that in SQL, because there will always be a chance of getting the same record twice.
I would do it in application code rather than in SQL: Java, .NET, C++, etc.

Related

SQL Server run SELECT for each in list

I won't be surprised if SQL just doesn't work this way at all, but:
If we run two SELECT statements in a query, we get a split "Results" pane. I'm wondering if I can add variables to a list, and then have the number of result pane splits match the length of that list.
If I were to mix languages:
id_list = [26275, 54374, 84567]
for i in id_list:
SELECT * FROM table WHERE id = i
I'm just trying to easily compare results of a query while keeping distinct groups, with a changing number of variables. Since loops never seem to be the answer in SQL, I'd be just as happy inserting something like a blank line or horizontal rule, etc. Not sure if that's possible either though...
There is no concept of "lists" (as a separate data structure) in T-SQL. Does this do what you want?
SELECT *
FROM table
WHERE id IN (26275, 54374, 84567);
declare @i int = 0;
declare @Id int;
declare @Ids table (Id int);
insert @Ids select Id from (values (26275), (54374), (84567)) t(Id);
-- OR: insert @Ids select * from string_split('26275, 54374, 84567', ',');
declare @Count int = (select count(*) from @Ids);
while @i < @Count
begin
    select @Id = Id, @i = @i + 1
    from @Ids order by Id
    offset @i rows fetch next 1 rows only;
    select * from dbo.MyTable where Id = @Id;
end
You can use UNION ALL:
SELECT * FROM table WHERE id = 26275
UNION ALL
SELECT * FROM table WHERE id = 54374
UNION ALL
SELECT * FROM table WHERE id = 84567

Dynamic TOP N / TOP 100 PERCENT in a single query based on condition

I have a local variable @V_COUNT INT. If @V_COUNT is 0 (zero), return all the records from the table; otherwise return @V_COUNT records from the table. For example, if @V_COUNT = 50, return the TOP 50 records. If @V_COUNT is 0, return TOP 100 PERCENT of the records. Can we achieve this in a single query?
Sample query:
DECLARE @V_COUNT INT = 0
SELECT TOP (CASE WHEN @V_COUNT > 0 THEN @V_COUNT ELSE 100 PERCENT END) *
FROM MY_TABLE
ORDER BY COL1
This fails with: Incorrect syntax near the keyword 'percent'
A better solution would be to not use TOP at all - but ROWCOUNT instead:
SET ROWCOUNT stops processing after the specified number of rows.
...
To return all rows, set ROWCOUNT to 0.
Please note that using ROWCOUNT is recommended only with SELECT statements:
Important
Using SET ROWCOUNT will not affect DELETE, INSERT, and UPDATE statements in a future release of SQL Server. Avoid using SET ROWCOUNT with DELETE, INSERT, and UPDATE statements in new development work, and plan to modify applications that currently use it. For a similar behavior, use the TOP syntax.
DECLARE @V_COUNT INT = 0
SET ROWCOUNT @V_COUNT -- 0 means return all rows...
SELECT *
FROM MY_TABLE
ORDER BY COL1
SET ROWCOUNT 0 -- Avoid side effects...
This eliminates the need to know how many rows there are in the table.
Be sure to reset ROWCOUNT back to 0 after the query to avoid side effects (good point by Shnugo in the comments).
Instead of 100 percent you can write a very big number that is sure to exceed the number of rows the query can return, e.g. the maximum int value, 2147483647.
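A minimal sketch of that idea, reusing the names from the question (MY_TABLE, COL1, @V_COUNT). It assumes the table holds fewer than 2147483647 rows, which is what makes the large constant behave like "all rows".

```sql
DECLARE @V_COUNT INT = 0;  -- 0 means "return everything"

-- TOP accepts a parenthesised expression, so the CASE picks the limit at run time
SELECT TOP (CASE WHEN @V_COUNT > 0 THEN @V_COUNT ELSE 2147483647 END) *
FROM MY_TABLE
ORDER BY COL1;
```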
You can do something like:
DECLARE @V_COUNT INT = 0
SELECT TOP (CASE WHEN @V_COUNT > 0 THEN @V_COUNT ELSE (SELECT COUNT(1) FROM MY_TABLE) END) *
FROM MY_TABLE
DECLARE @V_COUNT int = 3
select *
from
MY_TABLE
ORDER BY
Service_Id asc
offset case when @V_COUNT > 0 then ((select count(*) from MY_TABLE) - @V_COUNT) else @V_COUNT end rows
SET ROWCOUNT forces you into procedural logic. Furthermore, you'll have to provide an absolute number. PERCENT would need some kind of computation...
You might try this:
DECLARE @Percent FLOAT = 50;
SELECT TOP (SELECT CAST(CAST((SELECT COUNT(*) FROM sys.objects) AS FLOAT)/100.0 * CASE WHEN @Percent=0 THEN 100 ELSE @Percent END +1 AS INT)) o.*
FROM sys.objects o
ORDER BY o.[name];
This looks a bit clumsy, but the computation will be done once within microseconds...

Repeat query if no results came up

Could someone please advise on how to repeat a query if it returns no results? I am trying to pick a random person from the DB using RAND, but only if that person has not been drawn previously (that information is stored in the column "allready_drawn").
At the moment, when the query lands on a number that was drawn before, the second condition ("is NULL") means it returns no result.
I need the query to re-run until it comes up with a number.
DECLARE @min INTEGER;
DECLARE @max INTEGER;
set @min = (select top 1 id from [dbo].[persons] where sector = 8 order by id ASC);
set @max = (select top 1 id from [dbo].[persons] where sector = 8 order by id DESC);
select
ordial,
name_surname
from [dbo].[persons]
where id = ROUND(((@max - @min) * RAND() + @min), 0) and allready_drawn is NULL
The results (two possible outcomes):
Any suggestion is appreciated and I would like to thank everyone in advance.
Just remove the "id" filter, so that you only have to run the query once:
select TOP 1
ordial,
name_surname
from [dbo].[persons]
where allready_drawn is NULL
ORDER BY NEWID()
@gbn, that's a correct solution, but it may be too expensive. For very large tables with dense keys, randomly picking a key value between the min and max and re-picking until you find a match is also fair, and cheaper than sorting the whole table.
There is also a bug in the original post: the min and max rows will be selected only half as often as the others, because each maps to a smaller interval. To fix this, generate a random number from @min to @max + 1 and truncate rather than round. That way you map the interval [N,N+1) to N, ensuring a fair chance for each N.
For this selection method, here's how to repeat until you find a match.
--drop table persons
go
create table persons(id int, ordial int, name_surname varchar(2000), sector int, allready_drawn bit)
insert into persons(id,ordial,name_surname,sector, allready_drawn)
values (1,1,'foo',8,null),(2,2,'foo2',8,null),(100,100,'foo100',8,null)
go
declare @min int = (select top 1 id from [dbo].[persons] where sector = 8 order by id ASC);
declare @max int = 1+ (select top 1 id from [dbo].[persons] where sector = 8 order by id DESC);
set nocount on
declare @results table(ordial int, name_surname varchar(2000))
declare @i int = 0
declare @selected bit = 0
while @selected = 0
begin
    set @i += 1
    insert into @results(ordial,name_surname)
    select
        ordial,
        name_surname
    from [dbo].[persons]
    where id = ROUND(((@max - @min) * RAND() + @min), 0, 1) and allready_drawn is NULL
    if @@ROWCOUNT > 0
    begin
        select *, @i tries from @results
        set @selected = 1
    end
end

How to do Select query range by range on a particular table

I have a temp table which contains more than 80K rows.
In Aqua Data Studio I am unable to run select * on this table, I guess because of a space/memory limitation.
select * from #tmp
Is there any way to run the select query range by range?
For example: give me the first 10000 records, then the next 10000, and so on until the end.
Note:
1) I am using Aqua Data Studio, where I am restricted to selecting a maximum of 5000 rows in one select query.
2) I am using Sybase, which somehow doesn't allow 'except' or the 'select top @var from table' syntax, and ROWNUM() is not available.
Thanks!!
You can use something like the following in SQL Server. Just update @FirstRow for each new iteration.
declare @FirstRow int = 0
declare @Rows int = 10000
select top (@FirstRow+@Rows) * from Table
except
select top (@FirstRow) * from Table
set @FirstRow = @FirstRow + @Rows
select top (@FirstRow+@Rows) * from Table
except
select top (@FirstRow) * from Table
Could you use a WHERE clause on some id column in the table? Something like:
select top n * from table where some_id > current_iteration_starting_point order by some_id
e.g.
select top 200 * from tablename where some_id > 1 order by some_id
and keep increasing the starting point, say from 1 to 201 in the next iteration, and so on.
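The iteration described here could be sketched as a loop. This is written for SQL Server and is only an outline; Sybase syntax may differ. tablename and some_id come from the example, while @last_id, @next_id and the derived-table alias batch are hypothetical names. The batch size of 5000 matches the limit mentioned in the question, and the sketch assumes some_id is unique and always greater than 0.

```sql
DECLARE @last_id INT;
DECLARE @next_id INT;
SET @last_id = 0;  -- assumes every some_id is greater than 0

WHILE 1 = 1
BEGIN
    -- Highest key in the next batch of up to 5000 rows; NULL when nothing is left
    SELECT @next_id = MAX(some_id)
    FROM (SELECT TOP 5000 some_id
          FROM tablename
          WHERE some_id > @last_id
          ORDER BY some_id) AS batch;

    IF @next_id IS NULL BREAK;

    -- Return the batch itself as its own result set
    SELECT *
    FROM tablename
    WHERE some_id > @last_id AND some_id <= @next_id
    ORDER BY some_id;

    SET @last_id = @next_id;
END
```

Because each batch is bounded by key values rather than by row positions, the ids do not need to be dense for the ranges to line up.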
Here is documentation on how to increase the memory capacity of Aqua Data Studio:
https://www.aquaclusters.com/app/home/project/public/aquadatastudio/wikibook/Documentation16/page/50/Launcher-Memory-Configuration

Use top based on condition

I have a parameter in my stored procedure that specifies the number of rows to select. (Possible values: 0-100. 0 means select all rows.)
For example @Rows = 5;
Then I can do this:
Insert into @MyTableVar
Select Top(@Rows) *
from myTable
Now, as I said before, if 0 is supplied I need to return all rows.
This is pseudo-code for what I need:
if (@Rows=0) then select * else select top(@Rows) *
I found out that there's SET ROWCOUNT, which accepts 0 to return ALL rows, but I need to do an insert into a table variable, which is not supported by ROWCOUNT.
Is it possible to achieve this without dynamic SQL?
(I understand that I can write a simple if/else statement and duplicate the query, but I have pretty complex queries and there are lots of them; I just want to avoid code duplication.)
One way is to just put a big number in:
set @Rows = 5;
declare @RowsToUse int = (case when @Rows = 0 then 1000000000 else @Rows end);
select top(@RowsToUse) * from myTable
First of all, you are missing the ORDER BY clause, since you are using TOP. You could do this:
SET @Rows = 5;
WITH CTE AS
(
    SELECT *,
           RN = ROW_NUMBER() OVER(ORDER BY Id) --put the right order here
    FROM myTable
)
INSERT INTO @MyTableVar
SELECT YourColumns
FROM CTE
WHERE RN <= @Rows OR @Rows = 0