Performance gap using a subquery with STGeomFromText

I'm using a spatial table that contains polygons.
The problem is that the point I'm trying to match is stored in a table, and I can't get the same performance with one query as I get with two:
-- this is the base / best time
SELECT *
FROM dbo.table1
WHERE geomField.STContains(
GEOMETRY::STGeomFromText(
'POINT(6.82 7.21)'
,0)
) = 1
-- this is more or less the same as the previous
DECLARE @g GEOMETRY = GEOMETRY::STGeomFromText(
(select top 1 'POINT(6.82 7.21)')
,0);
SELECT *
FROM dbo.table1
WHERE geomField.STContains(@g) = 1
-- this is slow as hell
SELECT *
FROM dbo.table1
WHERE geomField.STContains(
GEOMETRY::STGeomFromText(
(select top 1 'POINT(6.82 7.21)')
,0)
) = 1
Is there any way to improve the last one? (I'm using EXEC sp_executesql in the backend, and the second option would mean a stored procedure.)

How to make efficient pagination with total count

We have a web application which helps organize biological experiments (users describe an experiment and upload experiment data). On the main page, we show the first 10 experiments and then, below them: Previous Next 1 2 3 .. 30.
It bugs me how to do an efficient total count and pagination. Currently:
select count(id) from experiments; -- not very efficient on large datasets
But how does this scale when dealing with more than 200,000 records? I tried importing random experiments into the table, but it still performs quite OK (0.6 s for 300,000 experiments).
The other alternative I thought about is to add an additional table, statistics (column tableName, column recordsCount). After each insert into the experiments table I would increase recordsCount in statistics (this means inserting into one table and updating another, in a SQL transaction of course). The reverse goes for a delete statement (recordsCount--).
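The counter-table idea above can be sketched roughly as follows. This is only an illustration using Python's built-in sqlite3 module rather than the application's real database; the helper names (make_db, add_experiment, delete_experiment, total_count) are made up for the example, and only the table and column names come from the description above.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with the experiments and statistics tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE experiments (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE statistics ("
        "tableName TEXT PRIMARY KEY, recordsCount INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO statistics VALUES ('experiments', 0)")
    conn.commit()
    return conn


def add_experiment(conn: sqlite3.Connection, name: str) -> None:
    # "with conn" commits both statements together, or rolls both back on error.
    with conn:
        conn.execute("INSERT INTO experiments (name) VALUES (?)", (name,))
        conn.execute(
            "UPDATE statistics SET recordsCount = recordsCount + 1 "
            "WHERE tableName = 'experiments'"
        )


def delete_experiment(conn: sqlite3.Connection, experiment_id: int) -> None:
    with conn:
        cur = conn.execute(
            "DELETE FROM experiments WHERE id = ?", (experiment_id,)
        )
        # Decrement by the number of rows actually deleted (0 if the id is unknown).
        conn.execute(
            "UPDATE statistics SET recordsCount = recordsCount - ? "
            "WHERE tableName = 'experiments'",
            (cur.rowcount,),
        )


def total_count(conn: sqlite3.Connection) -> int:
    """Read the maintained counter instead of running COUNT over the table."""
    row = conn.execute(
        "SELECT recordsCount FROM statistics WHERE tableName = 'experiments'"
    ).fetchone()
    return row[0]
```

The read becomes a single-row lookup, at the price of an extra write on every insert and delete.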
For pagination, the most efficient way is to use where id > last_id, since SQL can then use the index. Is there a better way?
If the results are to be filtered, e.g. select * from experiment where name like 'name%', the statistics-table option fails. We then need to get the total count as: select count(id) from experiment where name like 'name%'.
The application was developed using Laravel 3, in case it makes any difference.
I would like to develop pagination that always performs the same. The number of records must affect neither pagination nor the total count.
Try a query like the one below:
CREATE PROCEDURE [GetUsers]
(
@Inactive Bit = NULL,
@Name Nvarchar(500),
@Culture VarChar(5) = NULL,
@SortExpression VarChar(50),
@StartRowIndex Int,
@MaxRowIndex Int,
@Count INT OUTPUT
)
AS
BEGIN
SELECT ROW_NUMBER()
OVER
(
ORDER BY
CASE WHEN @SortExpression = 'Name' THEN [User].[Name] END,
CASE WHEN @SortExpression = 'Name DESC' THEN [User].[Name] END DESC
) AS RowIndex, [User].*
INTO #tmpTable
FROM [User] WITH (NOLOCK)
WHERE (@Inactive IS NULL OR [User].[Inactive] = @Inactive)
AND (@Culture IS NULL OR [User].[DefaultCulture] = @Culture)
AND [User].Name LIKE '%' + @Name + '%'
SELECT *
FROM #tmpTable WITH (NOLOCK)
WHERE #tmpTable.RowIndex > @StartRowIndex
AND #tmpTable.RowIndex < (@StartRowIndex + @MaxRowIndex + 1)
SELECT @Count = COUNT(*) FROM #tmpTable
IF OBJECT_ID('tempdb..#tmpTable') IS NOT NULL DROP TABLE #tmpTable;
END
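The shape of that procedure (one page of rows plus the total count under the same filter) can be tried out with a small sketch. It uses Python's sqlite3 module and LIMIT/OFFSET instead of ROW_NUMBER and a temp table, so it shows the idea and is not a translation of the T-SQL; the users table, make_users and fetch_page are hypothetical names.

```python
import sqlite3


def make_users(conn: sqlite3.Connection, n: int) -> None:
    """Create a hypothetical users table with n zero-padded names."""
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO users (name) VALUES (?)",
        [(f"user{i:02d}",) for i in range(n)],
    )
    conn.commit()


def fetch_page(conn, name_filter, start_row_index, max_rows):
    """Return (rows, total) for users whose name contains name_filter.

    start_row_index is the number of rows to skip and max_rows is the page
    size, mirroring @StartRowIndex and @MaxRowIndex in the procedure above.
    """
    pattern = f"%{name_filter}%"
    # The total has to use the same filter as the page query.
    total = conn.execute(
        "SELECT COUNT(*) FROM users WHERE name LIKE ?", (pattern,)
    ).fetchone()[0]
    rows = conn.execute(
        "SELECT id, name FROM users WHERE name LIKE ? "
        "ORDER BY name, id LIMIT ? OFFSET ?",
        (pattern, max_rows, start_row_index),
    ).fetchall()
    return rows, total
```

Note that the count still scans every matching row, so this does not make the filtered total cheaper; it only keeps the page and the count consistent with each other.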

How to do Select query range by range on a particular table

I have a temp table which contains more than 80K rows.
In Aqua I am unable to do select * on this table, due to a space/memory limitation I guess.
select * from #tmp
Is there any way to run the select query range by range?
For example: give me the first 10000 records, then the next 10000, and so on till the end.
Note:
1) I am using Aqua Data Studio, where I am restricted to selecting at most 5000 rows in one select query.
2) I am using Sybase, which somehow doesn't allow 'except' or the 'select top @var from table' syntax, and ROWNUM() is not available.
Thanks!!
You can use something like the following in SQL Server. Just update @FirstRow for each new iteration.
declare @FirstRow int = 0
declare @Rows int = 10000
select top (@FirstRow+@Rows) * from Table
except
select top (@FirstRow) * from Table
set @FirstRow = @FirstRow + @Rows
select top (@FirstRow+@Rows) * from Table
except
select top (@FirstRow) * from Table
Can you not use a where clause on some id column in the table?
select top n * from table where some_id > current_iteration_starting_point order by some_id
e.g.
select top 200 * from tablename where some_id > 1 order by some_id
and then move the starting point to the last some_id returned (from 1 to 201 in the next iteration if the ids are contiguous), and so on.
Here is documentation on how to increase the memory capacity of Aqua Data Studio:
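The "where some_id > starting point" loop can be sketched like this. It is a minimal illustration with Python's sqlite3 module (LIMIT stands in for TOP), not Sybase code; iter_batches and the payload column are made up for the example, and it assumes some_id is unique, indexed and positive.

```python
import sqlite3


def iter_batches(conn: sqlite3.Connection, batch_size: int):
    """Yield the table in batches, resuming after the last id seen."""
    last_id = 0  # assumption: all ids are greater than 0
    while True:
        rows = conn.execute(
            "SELECT some_id, payload FROM tablename "
            "WHERE some_id > ? ORDER BY some_id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        # Continue from the last id actually returned, so gaps in the
        # id sequence do not cause rows to be skipped or repeated.
        last_id = rows[-1][0]
```

The ORDER BY matters: without it the database is free to return any matching rows, and the "last id" would not be a reliable place to resume from.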
https://www.aquaclusters.com/app/home/project/public/aquadatastudio/wikibook/Documentation16/page/50/Launcher-Memory-Configuration

How to optimize stored procedures?

Following is my Stored Proc.
ALTER PROCEDURE [GetHomePageObjectPageWise]
@PageIndex INT = 1
,@PageSize INT = 10
,@PageCount INT OUTPUT
,@AccountID INT
,@Interests Varchar(3000)
AS
BEGIN
SET NOCOUNT ON;
SELECT StoryID
, AlbumID
, StoryTitle
, CAST(NULL as varchar) AS AlbumName
, (SELECT URL FROM AlbumPictures WHERE (AlbumID = Stories.AlbumID) AND (AlbumCover = 'True')) AS AlbumCover
, Votes
, CAST(NULL as Int) AS PictureId
, 'stories' AS tableName
, (SELECT CASE WHEN EXISTS (
SELECT NestedStories.StoryID FROM NestedStories WHERE (StoryID = Stories.StoryID) AND (AccountID=@AccountID)
)
THEN CAST(1 AS BIT)
ELSE CAST(0 AS BIT) END) AS Flag
, (SELECT UserName FROM UserAccounts WHERE Stories.AccountID=UserAccounts.AccountID) AS Username
INTO #Results1
FROM Stories WHERE FREETEXT(Stories.Tags,@Interests) AND AccountID <> @AccountID AND IsActive='True' AND Abused < 10
I have 7 more SELECT statements (not included in the question for brevity) in the stored proc, similar to the SELECT StoryID statement, which I combine with UNION ALL like this:
SELECT * INTO #Results9 FROM #Results1
UNION ALL
SELECT * FROM #Results2
UNION ALL
SELECT * FROM #Results3
UNION ALL
SELECT * FROM #Results4
UNION ALL
SELECT * FROM #Results5
UNION ALL
SELECT * FROM #Results6
UNION ALL
SELECT * FROM #Results7
UNION ALL
SELECT * FROM #Results8
SELECT ROW_NUMBER() OVER
(
ORDER BY [tableName] DESC
)AS RowNumber
, * INTO #Results
FROM #Results9
DECLARE @RecordCount INT
SELECT @RecordCount = COUNT(*) FROM #Results
SET @PageCount = CEILING(CAST(@RecordCount AS DECIMAL(10, 2)) / CAST(@PageSize AS DECIMAL(10, 2)))
SELECT * FROM #Results
WHERE RowNumber BETWEEN(@PageIndex -1) * @PageSize + 1 AND(((@PageIndex -1) * @PageSize + 1) + @PageSize) - 1
DROP TABLE #Results
DROP TABLE #Results1
DROP TABLE #Results2
DROP TABLE #Results3
DROP TABLE #Results4
END
This takes around 6 seconds to return the result. How can I improve this stored proc? I have very little knowledge of stored procedures.
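For reference, the paging arithmetic at the end of the procedure (the page count and the RowNumber range for a page) works out as in the sketch below. It is plain Python with made-up function names, written only to make the formulas easy to check.

```python
def page_count(record_count: int, page_size: int) -> int:
    """Number of pages needed, i.e. CEILING(record_count / page_size)."""
    # Integer ceiling division avoids any floating-point rounding.
    return (record_count + page_size - 1) // page_size


def page_bounds(page_index: int, page_size: int) -> tuple:
    """First and last RowNumber (1-based, inclusive) for a 1-based page index."""
    first = (page_index - 1) * page_size + 1
    last = first + page_size - 1
    return first, last
```

With a page size of 10, page 3 covers RowNumber 21 to 30, and 95 records need 10 pages.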
Create a nonclustered index on the columns used in the WHERE clause: IsActive, AccountID and Abused.
Well, you can only optimize it by getting rid of the temporary tables. Your approach sucks not because it is a stored procedure (the SP part is simply irrelevant) but because you do a lot of temporary-table work that forces linear execution and makes it hard for the query optimizer to find a better way forward.
In this particular case, it may be that your DB design is horrifically bad (why #Results1 to #Results8 to start with?), and then you have tons of "copy into temp table" steps in every stored procedure.
Query optimization in SQL works statement by statement, and execution is never parallelized between statements, so the temp-table stuff really gets in your way here. Get rid of the temp tables.
Never use SELECT * INTO #temp directly.
Instead, always create the #temp table first and then INSERT INTO #temp.
In my experience this can reduce query execution time by around 70%.
It can be frustrating to create a #temp table with the exact structure, so here is a shortcut, which only needs to be done once:
Create dbo.tableName by using SELECT * INTO dbo.tableName FROM your calling query,
then
sp_help tableName will show you the structure.
Then create the #temp table in the stored procedure.
I optimized a query for one of our clients that was taking 45 minutes to execute just by switching to this approach, and it worked!
Now it takes 5 minutes.

How can I extend this SQL query to find the k nearest neighbors?

I have a database full of two-dimensional data - points on a map. Each record has a field of the geometry type. What I need to be able to do is pass a point to a stored procedure which returns the k nearest points (k would also be passed to the sproc, but that's easy). I've found a query at http://blogs.msdn.com/isaac/archive/2008/10/23/nearest-neighbors.aspx which gets the single nearest neighbour, but I can't figure how to extend it to find the k nearest neighbours.
This is the current query: T is the table, g is the geometry field, @x is the point to search around, and Numbers is a table with the integers 1 to n:
DECLARE @start FLOAT = 1000;
WITH NearestPoints AS
(
SELECT TOP(1) WITH TIES *, T.g.STDistance(@x) AS dist
FROM Numbers JOIN T WITH(INDEX(spatial_index))
ON T.g.STDistance(@x) < @start*POWER(2,Numbers.n)
ORDER BY n
)
SELECT TOP(1) * FROM NearestPoints
ORDER BY n, dist
The inner query selects the nearest non-empty region and the outer query then selects the top result from that region; the outer query can easily be changed to (e.g.) SELECT TOP(20), but if the nearest region only contains one result, you're stuck with that.
I figure I probably need to recursively search for the first region containing k records, but I can't see how to do that without using a table variable (which would cause maintenance problems, as you have to create the table structure and it's liable to change, since there are lots of fields).
What happens if you remove TOP (1) WITH TIES from the inner query, and set the outer query to return the top k rows?
I'd also be interested to know whether this amendment helps at all. It ought to be more efficient than using TOP:
DECLARE @start FLOAT = 1000
,@k INT = 20
,@p FLOAT = 2;
WITH NearestPoints AS
(
SELECT *
,T.g.STDistance(@x) AS dist
,ROW_NUMBER() OVER (ORDER BY T.g.STDistance(@x)) AS rn
FROM Numbers
JOIN T WITH(INDEX(spatial_index))
ON T.g.STDistance(@x) < @start*POWER(@p,Numbers.n)
AND (Numbers.n - 1 = 0
OR T.g.STDistance(@x) >= @start*POWER(@p,Numbers.n - 1)
)
)
SELECT *
FROM NearestPoints
WHERE rn <= @k;
NB - untested - I don't have access to SQL 2008 here.
Quoted from Inside Microsoft® SQL Server® 2008: T-SQL Programming. Section 14.8.4.
The following query will return the 10 points of interest nearest to @input:
DECLARE @input GEOGRAPHY = 'POINT (-147 61)';
DECLARE @start FLOAT = 1000;
WITH NearestNeighbor AS(
SELECT TOP 10 WITH TIES
*, b.GEOG.STDistance(@input) AS dist
FROM Nums n JOIN GeoNames b WITH(INDEX(geog_hhhh_16_sidx)) -- index hint
ON b.GEOG.STDistance(@input) < @start*POWER(CAST(2 AS FLOAT),n.n)
AND b.GEOG.STDistance(@input) >=
CASE WHEN n = 1 THEN 0 ELSE @start*POWER(CAST(2 AS FLOAT),n.n-1) END
WHERE n <= 20
ORDER BY n
)
SELECT TOP 10 geonameid, name, feature_code, admin1_code, dist
FROM NearestNeighbor
ORDER BY n, dist;
Note: Only part of this query’s WHERE clause is supported by the spatial index. However, the query optimizer correctly evaluates the supported part (the "<" comparison) using the index. This restricts the number of rows for which the ">=" part must be tested, and the query performs well. Changing the value of @start can sometimes speed up the query if it is slower than desired.
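The doubling-ring idea behind that query can be sketched outside the database. The Python function below is a hypothetical illustration on plain 2-D points with Euclidean distance. It scans every point for each ring, whereas the real query relies on the spatial index for the "<" test. The defaults for start and max_steps mirror @start = 1000 and n <= 20 from the quoted query.

```python
import math


def k_nearest(points, query, k, start=1000.0, max_steps=20):
    """Return up to k points nearest to query, searching doubling rings."""

    def dist(p):
        return math.hypot(p[0] - query[0], p[1] - query[1])

    found = []
    lower = 0.0
    for n in range(1, max_steps + 1):
        upper = start * 2.0 ** n
        # Ring n holds the points with lower <= distance < upper, so the
        # rings are disjoint and every ring is farther than the previous one.
        ring = sorted((dist(p), p) for p in points if lower <= dist(p) < upper)
        found.extend(ring)
        if len(found) >= k:
            break  # enough candidates; the outer rings cannot be closer
        lower = upper
    # found is already ordered by (ring, distance), like ORDER BY n, dist.
    return [p for _, p in found[:k]]
```

If the first non-empty ring holds fewer than k points, the loop simply moves on to the next ring, which is the step the single-neighbour query was missing.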
Listing 2-1. Creating and Populating Auxiliary Table of Numbers
SET NOCOUNT ON;
USE InsideTSQL2008;
IF OBJECT_ID('dbo.Nums', 'U') IS NOT NULL DROP TABLE dbo.Nums;
CREATE TABLE dbo.Nums(n INT NOT NULL PRIMARY KEY);
DECLARE @max AS INT, @rc AS INT;
SET @max = 1000000;
SET @rc = 1;
INSERT INTO Nums VALUES(1);
WHILE @rc * 2 <= @max
BEGIN
INSERT INTO dbo.Nums SELECT n + @rc FROM dbo.Nums;
SET @rc = @rc * 2;
END
INSERT INTO dbo.Nums
SELECT n + @rc FROM dbo.Nums WHERE n + @rc <= @max;
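The listing fills the table by doubling it on every pass, so it needs only about log2(max) insert statements. A small Python sketch of the same loop (build_nums is a made-up name, and a list stands in for dbo.Nums) makes that easy to verify:

```python
def build_nums(max_n: int) -> list:
    """Build [1, 2, ..., max_n] by repeated doubling, as the listing does."""
    nums = [1]
    rc = 1  # number of rows currently in the "table"
    while rc * 2 <= max_n:
        # Append a shifted copy of everything so far: rc rows become 2 * rc.
        nums.extend([n + rc for n in nums])
        rc *= 2
    # Final partial pass: add only the values that do not exceed max_n.
    nums.extend([n + rc for n in nums if n + rc <= max_n])
    return nums
```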

Split query result by half in TSQL (obtain 2 resultsets/tables)

I have a query that returns a large number of heavy rows.
When I transform these rows into a list of CustomObject I get a big memory peak, and the transformation is done by a custom .NET framework that I can't modify.
I need to retrieve fewer rows at a time so that I can do the transformation in two passes and avoid the memory peak.
How can I split the result of a query in half? I need to do it in the DB layer. I thought of doing a "TOP count(*)/2", but how do I get the other half?
Thank you!
If you have an identity field in the table, select the even ids first, then the odd ones.
select * from Table where Id % 2 = 0
select * from Table where Id % 2 = 1
You should have roughly 50% of the rows in each set.
Here is another way to do it (from http://www.tek-tips.com/viewthread.cfm?qid=1280248&page=5). I think it's more efficient:
Declare @Rows Int
Declare @TopRows Int
Declare @BottomRows Int
Select @Rows = Count(*) From TableName
If @Rows % 2 = 1
Begin
Set @TopRows = @Rows / 2
Set @BottomRows = @TopRows + 1
End
Else
Begin
Set @TopRows = @Rows / 2
Set @BottomRows = @TopRows
End
Set RowCount @TopRows
Select * From TableName Order By DisplayOrder
Set RowCount @BottomRows
Select * From TableName Order By DisplayOrder DESC
-- reset so that later statements are not limited
Set RowCount 0
--- old answer below ---
Is this a stored procedure call or dynamic SQL? Can you use temp tables?
If so, something like this would work:
select row_number() OVER(order by yourorderfield) as rowNumber, *
INTO #tmp
FROM dbo.yourtable
declare @rowCount int
SELECT @rowCount = count(1) from #tmp
SELECT * from #tmp where rowNumber <= @rowCount / 2
SELECT * from #tmp where rowNumber > @rowCount / 2
DROP TABLE #tmp
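The snapshot-then-split approach above can be tried out with the sketch below. It uses Python's sqlite3 module in place of SQL Server; split_in_halves and the id and payload columns are assumptions made for the example, and the row number comes from an INTEGER PRIMARY KEY filled in insertion order rather than from row_number().

```python
import sqlite3


def split_in_halves(conn: sqlite3.Connection):
    """Snapshot yourtable into a temp table, then return its two halves."""
    # The INTEGER PRIMARY KEY is assigned 1, 2, 3, ... as rows are inserted.
    conn.execute(
        "CREATE TEMP TABLE snap (rowNumber INTEGER PRIMARY KEY, id, payload)"
    )
    conn.execute(
        "INSERT INTO snap (id, payload) "
        "SELECT id, payload FROM yourtable ORDER BY id"
    )
    row_count = conn.execute("SELECT COUNT(*) FROM snap").fetchone()[0]
    half = row_count // 2
    # Both halves read from the same snapshot, so no row is missed or
    # returned twice even if yourtable changes between the two reads.
    first = conn.execute(
        "SELECT id, payload FROM snap WHERE rowNumber <= ? ORDER BY rowNumber",
        (half,),
    ).fetchall()
    second = conn.execute(
        "SELECT id, payload FROM snap WHERE rowNumber > ? ORDER BY rowNumber",
        (half,),
    ).fetchall()
    conn.execute("DROP TABLE snap")
    return first, second
```

With an odd number of rows the second half gets the extra row, the same as the integer division in the T-SQL version.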
SELECT TOP 50 PERCENT WITH TIES ... ORDER BY SomeThing
then
SELECT TOP 50 PERCENT ... ORDER BY SomeThing DESC
However, unless you snapshot the data first, a row in the middle may slip through or be processed twice.
I don't think you should do that in SQL; otherwise there will always be a chance of getting the same record twice.
I would do it in a general-purpose programming language rather than SQL: Java, .NET, C++, etc.