SQL Linked server query very very slow - sql

I am extracting a large amount of data from views via a linked server. I am using SQL Server 2012, and the linked server is SQL Server 2008.
My select statement is:
SELECT * INTO MY_LOCAL_TABLE
FROM
( SELECT * FROM LINKEDSERVER.DB.TABLE.VIEW
WHERE DATE>'2012-01-01' AND ID IN (SELECT ID FROM MY_LOCAL_VIEW)
) Q
I am expecting 300K rows for 700+ IDs. It used to take a couple of hours, but now it takes more than 20 hours!
Could you please suggest an alternative solution for this pain?
Many thanks in advance!

When you use a 4-part name such as [server].db.dbo.table, especially in a join, the entire table is often copied over the wire to the local machine, which is obviously not ideal.
A better approach is to use OPENQUERY, which is handled at the source (the linked server).
Try:
SELECT *
FROM OPENQUERY([LINKEDSERVER], 'SELECT * FROM DB.TABLE.VIEW WHERE DATE>''2012-01-01''')
WHERE ID IN (SELECT ID FROM MY_LOCAL_VIEW)
With this approach the linked server returns all rows for date > x, and the local server then filters them by the IDs in your local view.
Of course, indexing will still play a part in running SELECT * FROM DB.TABLE.VIEW WHERE DATE>'2012-01-01'.
Another approach, which I use on large subsets, is to dump the local ID's to the remote server, THEN handle it all remotely, such as:
-- copy local table to linked server by executing remote query
-- ([SERVER] is the local machine, defined as a linked server on the remote machine)
DECLARE @SQL NVARCHAR(MAX)
SET @SQL = 'SELECT ID INTO db.dbo.tmpTable FROM [SERVER].DB.DBO.MY_LOCAL_VIEW'
EXEC(@SQL) AT [LINKEDSERVER]
-- index remote table?!?
SET @SQL = 'CREATE INDEX [IXTMP] ON db.dbo.tmpTable (ID)'
EXEC(@SQL) AT [LINKEDSERVER]
-- run query on local machine against both remote tables
SELECT *
-- INTO sometable
FROM OPENQUERY([LINKEDSERVER], 'SELECT *
FROM DB.TABLE.VIEW
WHERE DATE>''2012-01-01''
AND ID IN (SELECT ID FROM db.dbo.tmpTable)')
-- now drop remote temp table of id's
SET @SQL = 'DROP TABLE db.dbo.tmpTable'
EXEC(@SQL) AT [LINKEDSERVER]
If the local view is also large, then you may consider executing a remote query that uses an openquery back to the local machine (assuming the remote machine has the local as a link).
-- copy local table to linked server by executing remote query
DECLARE @SQL NVARCHAR(MAX)
SET @SQL = 'SELECT ID INTO db.dbo.tmpTable FROM OPENQUERY([SERVER], ''SELECT ID FROM DB.DBO.MY_LOCAL_VIEW'')'
EXEC(@SQL) AT [LINKEDSERVER]
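EXEC (...) AT [LINKEDSERVER], as used in the snippets above, only works when the linked server has the RPC Out option enabled. A minimal sketch for checking and enabling it, assuming the linked server is registered under the name LINKEDSERVER and that you have permission to change its options:

```sql
-- Check whether RPC Out is already enabled for the linked server.
SELECT name, is_rpc_out_enabled
FROM sys.servers
WHERE name = N'LINKEDSERVER';

-- Enable it if the query above shows 0.
EXEC sp_serveroption
    @server   = N'LINKEDSERVER',
    @optname  = N'rpc out',
    @optvalue = N'true';
```

If changing server options is not possible in your environment, the OPENQUERY-only variant does not depend on this setting.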

Others have already suggested indexing, so I won't go there. Here is another option, if you can change that inner query
SELECT * FROM LINKEDSERVER.DB.TABLE.VIEW
WHERE DATE>'2012-01-01' AND ID IN (SELECT ID FROM MY_LOCAL_VIEW)
to a join using INNER JOIN, since you said the IN list has 700+ elements. Give it a try:
SELECT lnv.* FROM LINKEDSERVER.DB.TABLE.VIEW lnv
inner join MY_LOCAL_VIEW mcv
on lnv.ID = mcv.ID
and lnv.DATE > '2012-01-01'


Store a database name in variable & then using it dynamically

I have a table in my database that holds the names of all the databases on my server.
The table looks like:
create Table #db_name_list(Did INT IDENTITY(1,1), DNAME NVARCHAR(100))
INSERT INTO #db_name_list
SELECT 'db_One ' UNION ALL
SELECT 'db_Two' UNION ALL
SELECT 'db_Three' UNION ALL
SELECT 'db_four' UNION ALL
SELECT 'db_five'
select * from #db_name_list
I have many stored procedures in my database that use multiple tables and join them.
At present I am using SQL code like:
Select Column from db_One..Table1
Left outer join db_two..Table2
on ....some Condition ....
REQUIREMENT
But I do not want to hardcode the database name.
I want to store the database name in a variable and use that.
Reason: I want to restore the same database under a different name and run those stored procedures. At present we can't do that, because I have used db_One..Table1
or db_two..Table2.
I want something like:
/* SAMPLE SP */
CREATE PROCEDURE LOAD_DATA
AS
BEGIN
DECLARE @dbname nvarchar(500), @dbname2 nvarchar(500)
set @dbname=( SELECT DNAME FROM #db_name_list WHERE Did=1)
set @dbname2=( SELECT DNAME FROM #db_name_list WHERE Did=2)
PRINT @dbname
SELECT * FROM @dbname..table1
/* or */
SELECT * FROM @dbname2.dbo.table1
END
i.e. using a variable instead of the database name.
But it throws the error
"Incorrect syntax near '.'."
P.S. This was posted by someone else on MSDN, but the answer there was not clear and I had the same doubt, so please help.
You can't use a variable like this in a static SQL query. You have to use the variable in dynamic SQL instead, in order to build the query you want to execute, like:
DECLARE @sql nvarchar(500) = 'SELECT * FROM ' + @dbname + '.dbo.mytable'
EXEC(@sql);
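A slightly more defensive sketch of the same idea, for cases where the database name comes from a table or from user input. The value db_One and the table dbo.mytable are placeholders taken from this thread, not a known schema:

```sql
DECLARE @dbname sysname = N'db_One';   -- placeholder; would normally come from the lookup table
DECLARE @sql nvarchar(max);

-- Refuse names that are not real databases on this instance.
IF DB_ID(@dbname) IS NULL
BEGIN
    RAISERROR(N'Database %s does not exist.', 16, 1, @dbname);
    RETURN;
END

-- QUOTENAME wraps the name in brackets and escapes any ] inside it.
SET @sql = N'SELECT * FROM ' + QUOTENAME(@dbname) + N'.dbo.mytable;';
EXEC sys.sp_executesql @sql;
```

Checking the name with DB_ID and quoting it with QUOTENAME keeps an unexpected value from being executed as part of the statement.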
There seem to be a couple of options for you depending on your circumstances.
1. Simple - Generalise your procedures
Simply take out the database references in your stored procedure, as there is no need to have an explicit reference to the database if it is running against the database it is stored in. Your select queries will look like:
SELECT * from schema.table WHERE x = y
Rather than
SELECT * from database.schema.table WHERE x = y
Then just create the stored procedure in the new database and away you go. Simply connect to the new database and run the SP. This method would also allow you to promote the procedure to being a system stored procedure, which would mean they were automatically available in every database without having to run CREATE beforehand. For more details, see this article.
2. Moderate - Dynamic SQL
Change your stored procedure to take a database name as a parameter, such as this example:
CREATE PROCEDURE example (@DatabaseName VARCHAR(200))
AS
BEGIN
DECLARE @SQL VARCHAR(MAX) = 'SELECT * FROM ['+@DatabaseName+'].schema.table WHERE x = y'
EXEC (@SQL)
END

Querying the same table for a list of databases in MS SQL Server

This is my first time posting on SO, so please go easy!
I'm attempting to write a SQL script that queries the same table for a list of databases in a single SQL Server instance.
I have successfully queried the list of databases that I required using the following, and inserting this data into a temp table.
Select name Into #Versions
From sys.databases
Where name Like 'Master%'
Master is suffixed with numerical values to identify different environments.
Select * From #Versions
Drop Table #Versions
The table name I am trying to query, is the same in each of the databases, and I want to extract the newest value from this table and insert it into the temp table for each of the database names returned.
I have tried researching this but to no avail. I am fairly comfy with SQL but I fear I could be out of my depth here.
You can do the following. Once you have the list of your databases, you can build up the query (you need to edit it for your purpose).
Select name Into #Versions
From sys.databases
Where name Like 'test%'
declare @sql as varchar(max) = ''
select @sql = @sql + 'INSERT INTO sometable SELECT TOP 1 * FROM ' + name + '..sourcetable ORDER BY somedate DESC; '
FROM #Versions
exec (@sql)
Drop Table #Versions
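A variation on the query above, sketched under a few assumptions: sourcetable lives in the dbo schema of every matching database, it has a datetime column named somedate, and only that column is collected. QUOTENAME guards against database names that contain spaces or brackets, and the results go into a temp table so they can be read back with the database name attached:

```sql
CREATE TABLE #Latest (dbname sysname, somedate datetime);

DECLARE @sql nvarchar(max) = N'';

-- Build one INSERT per matching database.
SELECT @sql = @sql
    + N'INSERT INTO #Latest (dbname, somedate) '
    + N'SELECT TOP (1) N' + QUOTENAME(name, '''') + N', somedate '
    + N'FROM ' + QUOTENAME(name) + N'.dbo.sourcetable '
    + N'ORDER BY somedate DESC; '
FROM sys.databases
WHERE name LIKE N'Master%';

-- The temp table created above is visible inside the dynamic batch.
EXEC sys.sp_executesql @sql;

SELECT dbname, somedate FROM #Latest;
DROP TABLE #Latest;
```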
Look at the undocumented sp_MSforeachdb procedure.

Statement 'SELECT INTO' is not supported in this version of SQL Server - SQL Azure

I am getting
Statement 'SELECT INTO' is not supported in this version of SQL Server
in SQL Server
for the below query inside stored procedure
-- @cityId is a parameter of the enclosing stored procedure
DECLARE @sql NVARCHAR(MAX)
,@sqlSelect NVARCHAR(MAX) = ''
,@sqlFrom NVARCHAR(MAX) = ''
,@sqlTempTable NVARCHAR(MAX) = '#itemSearch'
,@sqlInto NVARCHAR(MAX) = ''
,@params NVARCHAR(MAX)
SET @sqlSelect ='SELECT
IT.ITEMNR
,IT.USERNR
,IT.ShopNR
,IT.ITEMID'
SET @sqlFrom =' FROM dbo.ITEM AS IT'
SET @sqlInto = ' INTO ' + @sqlTempTable + ' ';
IF (@cityId > 0)
BEGIN
SET @sqlFrom = @sqlFrom +
' INNER JOIN dbo.CITY AS CI2
ON CI2.CITYID = @cityId'
SET @sqlSelect = @sqlSelect +
'
,CI2.LATITUDE AS CITYLATITUDE
,CI2.LONGITUDE AS CITYLONGITUDE'
END
SELECT @params =N'@cityId int '
SET @sql = @sqlSelect +@sqlInto +@sqlFrom
-- the declared parameter must also be passed a value
EXEC sp_executesql @sql,@params,@cityId = @cityId
I have around 50,000 records, so I decided to use a temp table, but I was surprised to see this error.
How can I achieve the same in SQL Azure?
Edit: The blog post at http://blogs.msdn.com/b/sqlazure/archive/2010/05/04/10007212.aspx suggests creating a table inside the stored procedure to store the data instead of using a temp table. Is that safe under concurrency? Will it hurt performance?
Adding some points taken from http://blog.sqlauthority.com/2011/05/28/sql-server-a-quick-notes-on-sql-azure/
Each table must have a clustered index. Tables without a clustered index are not supported.
Each connection can use a single database. Multiple databases in a single transaction are not supported.
‘USE DATABASE’ cannot be used in Azure.
Global temp tables (or temp objects) are not supported.
As there is no concept of cross-database connections, linked servers do not exist in Azure at this moment.
SQL Azure is a shared environment, and because of that there is no concept of Windows logins.
Always drop TempDB objects when they are no longer needed, as they create pressure on TempDB.
During bulk insert, use the batch size option to limit the number of rows to be inserted. This will limit the usage of transaction log space.
Avoid unnecessary grouping or blocking ORDER BY operations, as they lead to high memory usage.
SELECT INTO is one of the many things that you can unfortunately not perform in SQL Azure.
What you'd have to do is first create the temporary table, then perform the insert. Something like:
CREATE TABLE #itemSearch (ITEMNR INT, USERNR INT, ShopNR INT, ITEMID INT)
INSERT INTO #itemSearch
SELECT IT.ITEMNR, IT.USERNR, IT.ShopNR ,IT.ITEMID
FROM dbo.ITEM AS IT
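Applied to the dynamic SQL in the question, this could look roughly like the sketch below. The column types are assumptions, since the definition of dbo.ITEM is not shown, and @cityId is declared locally only to stand in for the procedure parameter. A temp table created before sp_executesql is called is visible inside the dynamic batch, so only the INTO clause has to go:

```sql
DECLARE @cityId int = 0;   -- stands in for the stored procedure parameter
DECLARE @sql nvarchar(max);

-- Create the temp table up front instead of relying on SELECT ... INTO.
CREATE TABLE #itemSearch (ITEMNR int, USERNR int, ShopNR int, ITEMID int);

SET @sql = N'INSERT INTO #itemSearch (ITEMNR, USERNR, ShopNR, ITEMID)
SELECT IT.ITEMNR, IT.USERNR, IT.ShopNR, IT.ITEMID
FROM dbo.ITEM AS IT';

-- Same optional join as in the question.
IF (@cityId > 0)
    SET @sql = @sql + N'
INNER JOIN dbo.CITY AS CI2 ON CI2.CITYID = @cityId';

EXEC sys.sp_executesql @sql, N'@cityId int', @cityId = @cityId;

SELECT COUNT(*) AS rows_found FROM #itemSearch;
DROP TABLE #itemSearch;
```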
The new Azure DB Update preview has this problem resolved:
The V12 preview enables you to create a table that has no clustered
index. This feature is especially helpful for its support of the T-SQL
SELECT...INTO statement which creates a table from a query result.
http://azure.microsoft.com/en-us/documentation/articles/sql-database-preview-whats-new/
Create the table using the # prefix, e.g. create table #itemsearch, then use insert into. The scope of the temp table is limited to the session, so there will be no concurrency problems.
As we all know, a SQL Azure table must have a clustered index, which is why SELECT INTO fails to copy data from one table into another.
If you want to migrate, you must first create a table with the same structure and then execute an INSERT INTO statement.
For a temporary table, whose name starts with #, you don't need to create an index.
How do I create an index and execute INSERT INTO for a temp table?
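One possible answer to the question above, as a sketch: a temp table is indexed and filled like any other table. The column types and the index name are assumptions, and the index is optional:

```sql
CREATE TABLE #itemSearch (ITEMNR int, USERNR int, ShopNR int, ITEMID int);

-- Optional: a clustered index on the column used for later lookups.
CREATE CLUSTERED INDEX IX_itemSearch_ITEMID ON #itemSearch (ITEMID);

INSERT INTO #itemSearch (ITEMNR, USERNR, ShopNR, ITEMID)
SELECT IT.ITEMNR, IT.USERNR, IT.ShopNR, IT.ITEMID
FROM dbo.ITEM AS IT;
```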

Finding number of columns returned by a query

How can I get the number of columns returned by an SQL query using SQL Server?
For example, if I have a query like following:
SELECT *
FROM A1, A2
It should return the total number of columns in table A1 + total number of columns in table A2. But the query might be more complicated.
Here is one method:
select top 0 *
into _MYLOCALTEMPTABLE
from (your query here) t
select count(*)
from Information_Schema.Columns c
where table_name = '_MYLOCALTEMPTABLE'
You can do something similar by creating a view.
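A sketch of the view variant, using a hypothetical view name. GO is the batch separator used by SSMS and sqlcmd; CREATE VIEW has to be the first statement in its batch:

```sql
-- CREATE VIEW fails if two output columns share a name, so this only
-- works when A1 and A2 have no column names in common.
CREATE VIEW dbo.vColumnCountProbe
AS
SELECT A1.*, A2.*
FROM A1 CROSS JOIN A2;
GO

-- Count the columns the view exposes.
SELECT COUNT(*) AS column_count
FROM sys.columns
WHERE object_id = OBJECT_ID(N'dbo.vColumnCountProbe');
GO

DROP VIEW dbo.vColumnCountProbe;
GO
```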
You didn't specify your SQL Server version but I'm assuming it's not 2012. However, future readers of this question might be on 2012+ so I'm posting this answer for them.
SQL Server 2012 provides a set of procedures to provide more meta-data about queries and parameters. In this case, the stored procedure sp_describe_first_result_set will provide a handy tabular form.
There is also a DMO function, sys.dm_exec_describe_first_result_set, to provide similar content which is what you'd want to use in your example
DECLARE
-- Your query goes here
@query nvarchar(4000) = N'SELECT * FROM mdm.tblStgBatch AS TSB';
-- Tabular results
EXECUTE sys.sp_describe_first_result_set @tsql = @query;
-- Simple column count
SELECT
COUNT(1) AS column_count
FROM
sys.dm_exec_describe_first_result_set(@query, NULL, 0);
The new metadata discovery options are replacing FMTONLY which is how one would solve this problem prior to 2012. My TSQL chops are apparently not strong enough to do anything useful with it and instead I'd have to bail out to a .NET language to work with the output of FMTONLY.
SET FMTONLY ON;
SELECT *
FROM A1, A2;
SET FMTONLY OFF;
Try this;
--Insert into a temp table (this could be any query)
SELECT *
INTO #temp
FROM [yourTable]
--Select from temp table
SELECT * FROM #temp
--List of columns
SELECT COUNT(name) NumOfColumns FROM tempdb.sys.columns WHERE object_id =
object_id('tempdb..#temp');
--drop temp table
DROP TABLE #temp
Ugly I know:
SELECT COUNT(*) +
(
SELECT COUNT(*)
FROM information_schema.columns
WHERE table_name = 'A1'
)
FROM information_schema.columns
WHERE table_name = 'A2'

How to create a "materialized something" that accesses different tables, depending on a specific setting

I want a program to access a table/view/stored procedure, etc. (something materialized, let's call it X) that abstracts the real location of the data contained in three basic tables (the tables have the same definition in all locations).
I would want X to fetch the server name, catalog name and table name from somewhere (a table, probably) and access the specific three basic tables. The caller of X would not know which specific tables were being called.
How can I do this in SQL Server (2008)?
Like a function, a view can't use dynamic SQL - it can't go find some metadata reference somewhere and adjust accordingly.
I think the closest thing to what you want is a synonym. Let's say you have three different databases, A, B and C. In A the table you want the view to reference is dbo.foo, in B it is dbo.bar, and in C it is dbo.splunge. So then you could create a synonym like so in each database:
USE A;
GO
CREATE SYNONYM dbo.YourCommonViewName FOR dbo.foo;
GO
USE B;
GO
CREATE SYNONYM dbo.YourCommonViewName FOR dbo.bar;
GO
USE C;
GO
CREATE SYNONYM dbo.YourCommonViewName FOR dbo.splunge;
GO
Now this technically isn't a view, but in each database you can say...
SELECT <cols> FROM dbo.YourCommonViewName;
...and it will return the data from the database-specific table.
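Pointing the common name at a different location later means dropping and recreating the synonym. A sketch, assuming a linked server named serverB with a database named databaseB (both hypothetical); a synonym can reference an object through a four-part name:

```sql
USE A;
GO
-- 'SN' is the object type code for synonyms.
IF OBJECT_ID(N'dbo.YourCommonViewName', N'SN') IS NOT NULL
    DROP SYNONYM dbo.YourCommonViewName;

CREATE SYNONYM dbo.YourCommonViewName FOR [serverB].[databaseB].dbo.bar;
GO
```

The target of a synonym is only resolved when it is used, so a wrong name shows up at query time rather than when the synonym is created.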
To do this in a stored procedure would be much simpler. Say you store the server, database and table name in some table, e.g. dbo.lookup:
CREATE TABLE dbo.lookup
(
id INT PRIMARY KEY,
[server] SYSNAME,
[database] SYSNAME,
[table] SYSNAME,
active BIT NOT NULL DEFAULT (0)
);
-- you may want a constraint or trigger to ensure
-- only one row can be active at any one time.
INSERT dbo.lookup(id, [server], [database], [table])
SELECT 1,N'serverA',N'databaseA',N'tableA'
UNION ALL SELECT 2,N'serverB',N'databaseB',N'tableB';
Now your program can say:
UPDATE dbo.lookup SET active = 1 WHERE ... ?
And your stored procedure can be:
CREATE PROCEDURE dbo.whatever
AS
BEGIN
SET NOCOUNT ON;
DECLARE @sql NVARCHAR(MAX);
SELECT @sql = N'SELECT <cols> FROM ' + QUOTENAME([server])
+ '.' + QUOTENAME([database]) + '.dbo.' + QUOTENAME([table])
FROM dbo.lookup WHERE active = 1;
EXEC sp_executesql @sql;
END
GO
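The comment in the dbo.lookup definition above mentions a constraint or trigger to keep only one row active. One way to get that, sketched here with a hypothetical index name, is a filtered unique index, which is available from SQL Server 2008 onward:

```sql
-- At most one row may have active = 1; rows with active = 0 are not indexed.
CREATE UNIQUE INDEX UX_lookup_one_active
    ON dbo.lookup (active)
    WHERE active = 1;
GO

-- Switch the active location in a single statement, so there is never
-- a moment with two active rows.
UPDATE dbo.lookup
SET active = CASE WHEN id = 2 THEN 1 ELSE 0 END;
```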
I still don't understand the point, and I don't know what you're planning to do when two different users expect to call your program at the same time, and they each should get results from a different location.
I agree with Aaron that views and functions cannot use dynamic SQL.
What you can still do is build a CLR table-valued function. In it you can use .NET code to query whatever you want, then build your data accordingly and output what you need.
So instead of querying the data like
select * from myview
you can query it
select * from dbo.clr_mymockupview()
Create SYNONYMs to your remote servers.
Create your VIEW to concatenate your locations together using UNION ALL.
Since you said "tables", join your tables before the UNION ALL and hopefully, MS will perform the JOIN remotely.
Use a union query with parameters for database, server, and catalog:
Select col1, col2, <etc.>, 'table1' as tablename, 'server1' as servername, 'catalog1' as catname from server1.catalog1.table1
Union Select col1, col2, <etc.>, 'table2' as tablename, 'server2' as servername, 'catalog2' as catname from server2.catalog2.table2
Union Select col1, col2, <etc.>, 'table3' as tablename, 'server3' as servername, 'catalog3' as catname from server3.catalog3.table3
Then filter based on your 3 criteria. This probably won't be blazing fast, but it will work with standard SQL.