How to do a SELECT query range by range on a particular table - SQL

I have one temp table (#tmp) which consists of more than 80K rows.
In Aqua Data Studio I am unable to do a SELECT * on this table, due to a space/memory limitation I guess.
select * from #tmp
Is there any way to do the SELECT query range by range?
For example: give me the first 10000 records, then the next 10000, and the next 10000, until the end.
Note:
1) I am using Aqua Data Studio, where I am restricted to selecting at most 5000 rows in one SELECT query.
2) I am using Sybase, which somehow doesn't allow EXCEPT or 'select top @var from table' syntax, and ROWNUM() is not available.
Thanks!!

You can use something like the following in SQL Server. Just update @FirstRow for each new iteration.
declare @FirstRow int = 0
declare @Rows int = 10000

-- first batch
select top (@FirstRow + @Rows) * from Table
except
select top (@FirstRow) * from Table

-- next batch
set @FirstRow = @FirstRow + @Rows
select top (@FirstRow + @Rows) * from Table
except
select top (@FirstRow) * from Table

Can you not use a WHERE clause on some id column in the table?
select top n * from table where some_id > current_iteration_starting_point
e.g.
select top 200 * from tablename where some_id > 1
Then keep increasing the iteration starting point, say from 1 to 201 in the next iteration, and so on.
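A minimal sketch of that keyset approach, assuming #tmp has an indexed some_id column (the column name and the batch size of 5000 are placeholders); since Sybase rejects 'top @var', the TOP value stays a literal and only the key boundary changes between batches:
-- run repeatedly; after each batch, set @last_id to the highest some_id returned,
-- and stop once fewer than 5000 rows come back
declare @last_id int
select @last_id = 0

select top 5000 *
from #tmp
where some_id > @last_id
order by some_id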

Here is documentation on how to increase the memory capacity of Aqua Data Studio:
https://www.aquaclusters.com/app/home/project/public/aquadatastudio/wikibook/Documentation16/page/50/Launcher-Memory-Configuration

Related

Dynamic TOP N / TOP 100 PERCENT in a single query based on condition

I have a local variable @V_COUNT INT. If @V_COUNT is 0 (zero), return all the records from the table; otherwise return the top @V_COUNT records. For example, if @V_COUNT = 50, return the TOP 50 records; if @V_COUNT is 0, return TOP 100 PERCENT of the records. Can we achieve this in a single query?
Sample query :
DECLARE @V_COUNT INT = 0
SELECT TOP (CASE WHEN @V_COUNT > 0 THEN @V_COUNT ELSE 100 PERCENT END) *
FROM MY_TABLE
ORDER BY COL1
This fails with: Incorrect syntax near the keyword 'PERCENT'
A better solution would be to not use TOP at all - but ROWCOUNT instead:
SET ROWCOUNT stops processing after the specified number of rows.
...
To return all rows, set ROWCOUNT to 0.
Please note that ROWCOUNT is recommended for use only with SELECT statements:
Important
Using SET ROWCOUNT will not affect DELETE, INSERT, and UPDATE statements in a future release of SQL Server. Avoid using SET ROWCOUNT with DELETE, INSERT, and UPDATE statements in new development work, and plan to modify applications that currently use it. For a similar behavior, use the TOP syntax.
DECLARE @V_COUNT INT = 0
SET ROWCOUNT @V_COUNT -- 0 means return all rows...
SELECT *
FROM MY_TABLE
ORDER BY COL1
SET ROWCOUNT 0 -- Avoid side effects...
This will eliminate the need to know how many rows there are in the table.
Be sure to re-set the ROWCOUNT back to 0 after the query, to avoid side effects (Good point by Shnugo in the comments).
Instead of 100 PERCENT you can write some very big number, which will surely be bigger than the possible number of rows returned by the query, e.g. max int, which is 2147483647.
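A minimal sketch of that idea, reusing the sample table and variable from the question:
DECLARE @V_COUNT INT = 0
-- 2147483647 (max int) stands in for "all rows" when @V_COUNT is 0
SELECT TOP (CASE WHEN @V_COUNT > 0 THEN @V_COUNT ELSE 2147483647 END) *
FROM MY_TABLE
ORDER BY COL1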
You can do something like:
DECLARE @V_COUNT INT = 0
SELECT TOP (CASE WHEN @V_COUNT > 0 THEN @V_COUNT ELSE (SELECT COUNT(1) FROM MY_TABLE) END) *
FROM MY_TABLE
DECLARE @V_COUNT INT = 3
SELECT *
FROM MY_TABLE
ORDER BY Service_Id ASC
OFFSET CASE WHEN @V_COUNT > 0 THEN ((SELECT COUNT(*) FROM MY_TABLE) - @V_COUNT) ELSE @V_COUNT END ROWS
SET ROWCOUNT forces you into procedural logic. Furthermore, you'll have to provide an absolute number. PERCENT would need some kind of computation...
You might try this:
DECLARE @Percent FLOAT = 50;
SELECT TOP (SELECT CAST(CAST((SELECT COUNT(*) FROM sys.objects) AS FLOAT) / 100.0 * CASE WHEN @Percent = 0 THEN 100 ELSE @Percent END + 1 AS INT)) o.*
FROM sys.objects o
ORDER BY o.[name];
This looks a bit clumsy, but the computation will be done once within microseconds...

SQL Server recent rows

I'm sure this is easy, but I have googled a lot and searched.
OK, I have a table WITHOUT dates etc. with 100000000000000000 records.
I want to see the latest entries, i.e.
Select top 200 *
from table
BUT I want to see the latest entries. Is there a row identifier that I could use in a table?
i.e.
select top 200 *
from table
order by rowidentifier Desc
Thanks
Is there a row identifier that I could use in a table, i.e. select top 200 * from table order by rowidentifier desc?
As already stated in the comments, there is not. The best way is to have an identity, a timestamp, or some other form of identifying the record. Here is an alternative way using EXCEPT to get what you need, but the execution plan isn't the best... Play around with it and change as needed.
--testing purposes...
DECLARE @tbl TABLE(FirstName VARCHAR(50))
DECLARE @count INT = 0
WHILE (@count <= 12000)
BEGIN
    INSERT INTO @tbl(FirstName)
    SELECT 'RuPaul ' + CAST(@count AS VARCHAR(5))
    SET @count += 1
END

--adjust how many records you would like, example is 200
SELECT *
FROM @tbl
EXCEPT (SELECT TOP (SELECT COUNT(*) - 200 FROM @tbl) * FROM @tbl)

--faster than above
DECLARE @tblCount AS INT = (SELECT COUNT(*) FROM @tbl) - 200
SELECT *
FROM @tbl
EXCEPT (SELECT TOP (@tblCount) * FROM @tbl)
On another note, you could create another table variable that has an ID column plus the other columns, insert the records you need into it, and then perform other operations against that table, for example ORDER BY, etc...
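A minimal sketch of that idea, reusing the test table from above (the column list is an assumption):
-- capture the rows into a table variable with an IDENTITY column, which
-- gives every captured row a numeric key you can order and filter by
DECLARE @ordered TABLE (ID INT IDENTITY(1,1), FirstName VARCHAR(50))

INSERT INTO @ordered (FirstName)
SELECT FirstName FROM @tbl

-- e.g. the last 200 rows as captured
SELECT TOP 200 * FROM @ordered ORDER BY ID DESC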
What you could also do:
ALTER TABLE TABLENAME
ADD ID INT IDENTITY
This will add another column "ID" to the table and automatically populate it for every row. Then you have an identifier you can use...
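For example (a sketch; TABLENAME is a placeholder, and note that the values an added IDENTITY column assigns to already-existing rows do not necessarily reflect their original insert order):
-- rows inserted after the column is added get ever-increasing IDs,
-- so the most recent inserts have the highest ID values
SELECT TOP 200 *
FROM TABLENAME
ORDER BY ID DESC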
Nope, in short, there is none, if you don't have a column dedicated as one (i.e. an IDENTITY, a SEQUENCE, or something similar). If you did, then you could get an ordered result back.

SQL Server while loop

I have a select query that returns about 10 million rows, and I then need to insert them into a new table.
I want the performance to be OK, so I want to insert them into the new table in batches of 10000. To give an example, I created the simple select query below:
Insert into new_table
Select top 10000 * from applications
But now I need to get the next 10000 rows and insert them. Is there a way to iterate through the 10 million rows and insert them in batches of 10000? I'm using SQL Server 2008.
It will probably not be faster to batch it up. Probably the opposite: one statement is the fastest version most of the time. It might just require a lot of temp space and log, but it is the fastest measured by the wall clock.
The reason is that SQL Server automatically builds a good plan that efficiently batches up all the work at once.
To answer your question: the statement as you wrote it returns undefined rows, because a table has no inherent order. You should probably add a clustering key such as an ID column. That way you can walk along the table with a WHILE loop, each time executing the following:
INSERT ...
SELECT TOP 10000 *
FROM T
WHERE ID > @lastMaxID
ORDER BY ID
Note that the ORDER BY is required for correctness.
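A minimal sketch of that loop, assuming an indexed ID column and placeholder table names (dbo.applications as the source, dbo.new_table as the empty target):
DECLARE @lastMaxID INT = 0
DECLARE @batchRows INT = 1

WHILE @batchRows > 0
BEGIN
    INSERT INTO dbo.new_table
    SELECT TOP 10000 *
    FROM dbo.applications
    WHERE ID > @lastMaxID
    ORDER BY ID

    -- remember how many rows this batch copied and where it ended
    SET @batchRows = @@ROWCOUNT
    SELECT @lastMaxID = MAX(ID) FROM dbo.new_table
END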
I wouldn't batch 10 million records.
If you are batching an insert, use an indexed field to define your batches.
DECLARE @intFlag INT
SET @intFlag = 1
WHILE (@intFlag <= 10000000)
BEGIN
    INSERT INTO yourTable
    SELECT *
    FROM applications
    WHERE ID BETWEEN @intFlag AND @intFlag + 9999
    SET @intFlag = @intFlag + 10000
END
GO
Use a CTE or a WHILE loop to insert in batches like this:
;WITH q (n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1
    FROM q
    WHERE n < 10000
)
INSERT INTO table1
SELECT * FROM q
OPTION (MAXRECURSION 0) -- the default recursion limit of 100 is too low for 10000 levels
OR
DECLARE @batch INT,
        @rowcounter INT,
        @maxrowcount INT
SET @batch = 10000
SET @rowcounter = 1
SELECT @maxrowcount = MAX(id) FROM table1
WHILE @rowcounter <= @maxrowcount
BEGIN
    INSERT INTO table2 (col1)
    SELECT col1
    FROM table1
    WHERE 1 = 1
      AND id BETWEEN @rowcounter AND (@rowcounter + @batch)
    -- Set @rowcounter to the next batch start
    SET @rowcounter = @rowcounter + @batch + 1;
END
As an option, you can export the query result to a flat file with bcp and BULK INSERT it into a table.
The BULK INSERT statement has a BATCHSIZE option to limit the number of rows per batch.
In your case BATCHSIZE = 10000 will work.
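A minimal sketch of that route; the file path, target table, and terminators are assumptions:
-- load the bcp export in batches of 10,000 rows; each batch is committed
-- as a separate transaction
BULK INSERT dbo.new_table
FROM 'C:\export\applications.dat'
WITH (
    BATCHSIZE = 10000,
    FIELDTERMINATOR = '\t',
    ROWTERMINATOR = '\n'
)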
There is another option: create an SSIS package. Select fast load in the OLE DB destination and set "Rows per batch" to 10000. It is probably the easiest solution.

How can I extend this SQL query to find the k nearest neighbors?

I have a database full of two-dimensional data - points on a map. Each record has a field of the geometry type. What I need to be able to do is pass a point to a stored procedure which returns the k nearest points (k would also be passed to the sproc, but that's easy). I've found a query at http://blogs.msdn.com/isaac/archive/2008/10/23/nearest-neighbors.aspx which gets the single nearest neighbour, but I can't figure how to extend it to find the k nearest neighbours.
This is the current query - T is the table, g is the geometry field, @x is the point to search around, Numbers is a table with integers 1 to n:
DECLARE @start FLOAT = 1000;
WITH NearestPoints AS
(
    SELECT TOP(1) WITH TIES *, T.g.STDistance(@x) AS dist
    FROM Numbers JOIN T WITH(INDEX(spatial_index))
    ON T.g.STDistance(@x) < @start*POWER(2,Numbers.n)
    ORDER BY n
)
SELECT TOP(1) * FROM NearestPoints
ORDER BY n, dist
The inner query selects the nearest non-empty region and the outer query then selects the top result from that region; the outer query can easily be changed to (e.g.) SELECT TOP(20), but if the nearest region only contains one result, you're stuck with that.
I figure I probably need to recursively search for the first region containing k records, but without using a table variable (which would cause maintenance problems, as you have to create the table structure and it's liable to change - there are lots of fields), I can't see how.
What happens if you remove TOP (1) WITH TIES from the inner query, and set the outer query to return the top k rows?
I'd also be interested to know whether this amendment helps at all. It ought to be more efficient than using TOP:
DECLARE @start FLOAT = 1000
       ,@k INT = 20
       ,@p FLOAT = 2;
WITH NearestPoints AS
(
    SELECT *
          ,T.g.STDistance(@x) AS dist
          ,ROW_NUMBER() OVER (ORDER BY T.g.STDistance(@x)) AS rn
    FROM Numbers
    JOIN T WITH(INDEX(spatial_index))
      ON T.g.STDistance(@x) < @start*POWER(@p,Numbers.n)
     AND (Numbers.n - 1 = 0
          OR T.g.STDistance(@x) >= @start*POWER(@p,Numbers.n - 1)
         )
)
SELECT *
FROM NearestPoints
WHERE rn <= @k;
NB - untested - I don't have access to SQL 2008 here.
Quoted from Inside Microsoft® SQL Server® 2008: T-SQL Programming. Section 14.8.4.
The following query will return the 10 points of interest nearest to @input:
DECLARE @input GEOGRAPHY = 'POINT (-147 61)';
DECLARE @start FLOAT = 1000;
WITH NearestNeighbor AS(
    SELECT TOP 10 WITH TIES
        *, b.GEOG.STDistance(@input) AS dist
    FROM Nums n JOIN GeoNames b WITH(INDEX(geog_hhhh_16_sidx)) -- index hint
      ON b.GEOG.STDistance(@input) < @start*POWER(CAST(2 AS FLOAT),n.n)
     AND b.GEOG.STDistance(@input) >=
         CASE WHEN n = 1 THEN 0 ELSE @start*POWER(CAST(2 AS FLOAT),n.n-1) END
    WHERE n <= 20
    ORDER BY n
)
SELECT TOP 10 geonameid, name, feature_code, admin1_code, dist
FROM NearestNeighbor
ORDER BY n, dist;
Note: Only part of this query's WHERE clause is supported by the spatial index. However, the query optimizer correctly evaluates the supported part (the "<" comparison) using the index. This restricts the number of rows for which the ">=" part must be tested, and the query performs well. Changing the value of @start can sometimes speed up the query if it is slower than desired.
Listing 2-1. Creating and Populating Auxiliary Table of Numbers
SET NOCOUNT ON;
USE InsideTSQL2008;
IF OBJECT_ID('dbo.Nums', 'U') IS NOT NULL DROP TABLE dbo.Nums;
CREATE TABLE dbo.Nums(n INT NOT NULL PRIMARY KEY);
DECLARE @max AS INT, @rc AS INT;
SET @max = 1000000;
SET @rc = 1;
INSERT INTO Nums VALUES(1);
WHILE @rc * 2 <= @max
BEGIN
    INSERT INTO dbo.Nums SELECT n + @rc FROM dbo.Nums;
    SET @rc = @rc * 2;
END
INSERT INTO dbo.Nums
    SELECT n + @rc FROM dbo.Nums WHERE n + @rc <= @max;

Split query result by half in TSQL (obtain 2 resultsets/tables)

I have a query that returns a large number of heavy rows.
When I transform these rows into a list of CustomObject I get a big memory peak, and this transformation is done by a custom .NET framework that I can't modify.
I need to retrieve a smaller number of rows at a time, so that I can do "the transform" in two passes and avoid the memory peak.
How can I split the result of a query in half? I need to do it in the DB layer. I was thinking of a "TOP count(*)/2", but how do I get the other half?
Thank you!
If you have an identity field in the table, select the even ids first, then the odd ones.
select * from Table where Id % 2 = 0
select * from Table where Id % 2 = 1
You should have roughly 50% of the rows in each set.
Here is another way to do it, from http://www.tek-tips.com/viewthread.cfm?qid=1280248&page=5. I think it's more efficient:
Declare @Rows Int
Declare @TopRows Int
Declare @BottomRows Int

Select @Rows = Count(*) From TableName

If @Rows % 2 = 1
Begin
    Set @TopRows = @Rows / 2
    Set @BottomRows = @TopRows + 1
End
Else
Begin
    Set @TopRows = @Rows / 2
    Set @BottomRows = @TopRows
End

Set RowCount @TopRows
Select * From TableName Order By DisplayOrder

Set RowCount @BottomRows
Select * From TableName Order By DisplayOrder DESC

Set RowCount 0 -- reset, to avoid side effects on later statements
--- old answer below ---
Is this a stored procedure call or dynamic SQL? Can you use temp tables?
If so, something like this would work:
select row_number() OVER (order by yourorderfield) as rowNumber, *
INTO #tmp
FROM dbo.yourtable

declare @rowCount int
SELECT @rowCount = count(1) from #tmp

SELECT * from #tmp where rowNumber <= @rowCount / 2
SELECT * from #tmp where rowNumber > @rowCount / 2

DROP TABLE #tmp
SELECT TOP 50 PERCENT WITH TIES ... ORDER BY SomeThing
then
SELECT TOP 50 PERCENT ... ORDER BY SomeThing DESC
However, unless you snapshot the data first, a row in the middle may slip through or be processed twice
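A concrete sketch of that pair of queries; the table and ordering column names are placeholders, and note that because TOP ... PERCENT rounds up, the middle row of an odd-sized table appears in both halves:
-- first half, in ascending order of the chosen key
SELECT TOP 50 PERCENT WITH TIES *
FROM dbo.BigTable
ORDER BY SomeThing

-- second half, taken from the other end by reversing the sort
SELECT TOP 50 PERCENT *
FROM dbo.BigTable
ORDER BY SomeThing DESC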
I don't think you should do that in SQL, unless you can live with the possibility of getting the same record twice.
I would do it in an application programming language, not SQL: Java, .NET, C++, etc...