I'm developing an SSIS package that needs to pull data from ServerA based on data in a DB table on ServerB. I'm a DB admin on ServerB, but I have very limited access to ServerA.
The query I need to execute, ideally using an OleDB source component, is like this:
SELECT
Blah
FROM ServerA.Database1.dbo.TableA
WHERE Something IN (SELECT foo FROM ServerB.Database2.dbo.TableB)
Is it possible to do this, or do I need to take a different approach?
EDIT: I need to run this query every ten minutes 24x7, and I don't want to pull the data from ServerA as there are millions of rows in the table, which is part of a business critical app which cannot be overloaded.
Pull from serverA into a third data source, pull from serverB into the same source, then use that source to apply your where clause.
OR, pull from serverA to serverB and apply your where clause on serverB.
In response to a comment:
OR, pull from serverB to serverA and apply your where clause on serverA. That's really where you want the join done, not in your SSIS package.
Also, see if you can filter out most of the rows on serverA using some criteria independent of B, or limit the amount of data from serverB that needs to exist on A for a rough cut, before handing off to the SSIS package.
I'm also wondering if they could link serverB to serverA for you...
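If ServerB does get linked on ServerA, the join described above could be written roughly like this; this is a sketch run on ServerA, reusing the names from the question:

SELECT Blah
FROM Database1.dbo.TableA
WHERE Something IN (SELECT foo FROM ServerB.Database2.dbo.TableB)

Only TableB's filter values (the small side) cross the wire; the scan of the large TableA stays local to ServerA.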
I am transferring 90 million rows from a source server to my staging area on the destination server.
From the staging area I transfer 20 million rows further up the ETL process using a WHERE EXISTS filter (on ID) against a table located on the destination server.
Since that table exists only on the destination server and not on the source server, is it possible to apply the filter when I pull the rows directly from the source server (so I only transfer 20 million rows from the source server to my destination server)?
Besides creating a linked server on the Source server, there are two pure SSIS approaches:
1. Create a temporary staging table on the Destination server, copy all records from Source into it, and apply the WHERE EXISTS filter there.
2. In the Data Flow, create a Lookup transformation whose lookup set fetches the IDs from the table on the Destination server, then proceed with matched records only. For performance reasons, you may run the Lookup in either full cache or partial cache mode; only performance testing can tell which mode is better.
Hadi's recommendation to use a Linked Server is fine and will work. The pure SSIS approach has the advantage that it makes no changes to the Source server; all connection configuration lives inside SSIS. In some cases that can be beneficial. Its disadvantage is that performance can be worse than with a Linked Server.
If you need to transfer as few rows from Source as possible, the simplest way is the Linked Server approach. Otherwise, you can create a table on the Source server (it can even be a global temporary ## table created from an SSIS package task) and copy the filter IDs into it from the Destination server. The temp table has to be global (##) because it is filled in one task and used in subsequent tasks. Then filter records with an EXISTS clause on the Source server.
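A rough sketch of that global ## temp table approach, with placeholder table and column names:

-- Execute SQL Task, on the Source server connection:
CREATE TABLE ##FilterIDs (ID int PRIMARY KEY);

-- Data Flow Task: copy the ID list from the Destination server into ##FilterIDs

-- Extraction query on the Source server, now filtered before transfer:
SELECT s.*
FROM dbo.SourceTable AS s
WHERE EXISTS (SELECT 1 FROM ##FilterIDs AS f WHERE f.ID = s.ID);

For this to work across tasks you'd typically set RetainSameConnection=True on the Source connection manager, since the ## table disappears once the session that created it (and any sessions referencing it) close.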
You can do that by creating a linked server on the source machine. Linked servers allow you to join tables across different instances:
How to create and configure a linked server in SQL Server Management Studio
Create Linked Servers (SQL Server Database Engine)
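For reference, a minimal linked-server setup might look like this (instance, database, and table names below are placeholders, not taken from the question):

EXEC sp_addlinkedserver
    @server = N'ServerB',
    @srvproduct = N'',
    @provider = N'SQLNCLI',
    @datasrc = N'ServerB';

-- Four-part names then work across the instances:
SELECT a.*
FROM Database1.dbo.TableA AS a
WHERE a.Something IN (SELECT b.foo FROM ServerB.Database2.dbo.TableB AS b);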
I want to move records from one server to another server based on certain criteria.
Note
I don't want to move all the records; I will apply a filter to pick the ones I want.
I have to move the records on a daily basis.
That server is not on the local network.
If I write a stored procedure using a linked server, it is possible to move the records, but I don't think that is a good way. Is there any other way to solve this issue?
UPDATE
What about the BCP utility?
I'm not familiar with it. Does it perform well for exporting and importing bulk data?
Do the following things:
1. Create a linked server
2. Write the query
Say Server1 has IP 172.16.9.13 and Server2 has IP 172.16.9.14. To move data from Server1 to Server2, first add Server2 as a linked server on Server1.
Then write the query like this (a server name containing dots must be bracketed, and a four-part name needs the database and schema; SomeDatabase is a placeholder):
INSERT INTO [172.16.9.14].SomeDatabase.dbo.SomeTable
SELECT * FROM SomeDatabase.dbo.SomeTable WHERE isactive = 1
====================Create Linked Server =====================
http://sqlserverplanet.com/dba/how-to-add-a-linked-server
You can add a linked server and create a procedure that moves records according to your filter criteria, then schedule a SQL Agent job to run it on a daily basis.
Sample Link for Creating Job
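The procedure such a job would call could be sketched like this (all object names are placeholders, and the remote server must already be linked):

CREATE PROCEDURE dbo.MoveFilteredRecords
AS
BEGIN
    INSERT INTO [172.16.9.14].SomeDatabase.dbo.SomeTable (Col1, Col2)
    SELECT Col1, Col2
    FROM SomeDatabase.dbo.SomeTable
    WHERE isactive = 1;
END

A SQL Agent job with a daily schedule then just runs EXEC dbo.MoveFilteredRecords.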
Second option:
Create a web service that implements this functionality: first fetch the data from the target server, then insert it into the source server. Run this web service on a daily basis using a timer or Hangfire.
Server1: Prod, hosting DB1
Server2: Dev, hosting DB2
Is there a way to query databases living on two different servers with the same select query? I need to bring all the new rows from Prod to Dev, using a query like the one below. I will be using SQL Server DTS (the import/export data utility) to do this.
Insert into Dev.db1.dbo.table1
Select *
from Prod.db1.dbo.table1
where table1.PK not in (Select table1.PK from Dev.db1.dbo.table1)
Creating a linked server is the only approach I am aware of for this. If you are simply trying to add all new rows from prod to dev, why not just create a backup of that one particular table, pull it into the dev environment, and then write the query against the same server and database?
Granted, this is a one-time-use approach and a pain for recurring loads, but if it is a one-time thing then I would recommend doing that. Otherwise create a linked server between the two.
To back up a single table, use the SQL Server Import and Export Wizard: select the prod database as your data source, select only the prod table as your source table, and create a new table in the dev environment as your destination table.
This should get you what you are looking for.
You say you're using DTS; the modern equivalent would be SSIS.
Typically you'd use a data flow task in an SSIS package to pull all the information from the live system into a staging table on the target, then load it from there. This is a pretty standard operation when data warehousing.
There are plenty of different approaches to save you copying all the data across (e.g. use a timestamp, use rowversion, use Change Data Capture, or make use of the fact that your primary key only ever gets bigger). Or you could just do what you want with a Lookup directly in the SSIS data flow...
The best approach will depend on many things: how much data you've got, what data transfer speed you have between the servers, your key types, etc.
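For example, if the primary key only ever grows, the incremental load reduces to something like this sketch (run on the Dev side with Prod linked; column and table names follow the question but the pattern is general):

DECLARE @maxPK int = (SELECT ISNULL(MAX(PK), 0) FROM db1.dbo.table1);

INSERT INTO db1.dbo.table1
SELECT *
FROM Prod.db1.dbo.table1
WHERE PK > @maxPK;

This avoids both the NOT IN subquery and re-copying rows you already have.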
When your servers are all in one Active Directory domain and you use Windows Authentication, all you need is an account that has appropriate rights on all the databases!
You can then simply reference all tables like server.database.schema.table
For example:
insert into server1.db1.dbo.tblData1 (...)
select ... from server2.db2.dbo.tblData2;
I'm trying to merge two tables from two databases on two different servers.
For now, I create a linked server on one of the servers and I use a query like this:
MERGE INTO tablename1 AS T1
USING linkedservername.dbname.dbo.tablename2 AS T2 ON ...
WHEN MATCHED THEN
UPDATE SET ...
WHEN NOT MATCHED THEN
INSERT ...
I would like to know if there is a solution that does not require creating a linked server.
There are three general ways to do this in SSIS. But there is a lot more information if you check online.
Whichever way you choose, you first need to create a connection manager in SSIS pointing directly at your second server. Start with that.
Then create a data flow task where you select from dbname.tablename2 in a data flow source
Then you can do it a few ways:
A. Staging Table
Dump that result into a staging table then run your merge statement locally in a subsequent SQL Task. This is usually the quickest (and simplest) way unless you aren't allowed to create tables/data in the target.
B. Lookup
Use a Lookup in your data flow to identify whether each record exists or not, followed by an OLEDB destination (for inserts) or an OLEDB Command (for updates).
This is generally slow because both the lookup and update are inefficient.
C. Row-level merge
Feed the result into an OLEDB Command and put your merge statement directly in there.
This is probably the slowest.
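For option A, the merge that runs in the Execute SQL Task after the staging load might look like this sketch (the staging table name and the ID/Col1 columns are assumptions, not from the question):

MERGE INTO dbo.tablename1 AS T1
USING dbo.Staging_tablename2 AS T2
    ON T1.ID = T2.ID
WHEN MATCHED THEN
    UPDATE SET T1.Col1 = T2.Col1
WHEN NOT MATCHED THEN
    INSERT (ID, Col1) VALUES (T2.ID, T2.Col1);

Because both tables are now local, the optimizer can pick a proper join strategy instead of doing row-by-row remote lookups.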
If you want more info, get your connection manager sorted and post back.
How can I get #tempTable to stay in memory after switching servers? Is this possible?
Select *
into #tempTable
from dbo.table
I have data on server 1 that I want to compare against data on server 2, but I only have read-only access to server 2 (so I can't just move my data there), and the table on server 2 is too big to move to server 1. This is why I want to know how to keep a temp table in memory after connecting to a new server.
Any help would be appreciated, thanks.
What you wrote is possible, but it just creates a temporary table on the server where the source table is.
You probably want:
select *
into #Server2Table
from server2.database.dbo.table
You can then use #Server2Table in additional queries on the same connection where you copied it (such as the same window in SSMS or the same job step or the same stored procedure). If you need it in a more permanent location, either use a global temporary table (starts with ##) or a "real" table.
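For example, still on that same connection, you could compare the copied data against a local table (table and column names here are placeholders):

SELECT t1.*
FROM dbo.LocalTable AS t1
LEFT JOIN #Server2Table AS t2 ON t2.ID = t1.ID
WHERE t2.ID IS NULL;   -- rows on server1 with no match in the server2 copy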
This requires the ability to link servers, using something like:
EXEC sp_addlinkedserver @server = N'server2';
You would run this on server1. Perhaps your DBA will need to set it up.
I have found that queries often run faster when loading cross-server tables into a temporary table. This is not because temporary tables are stored in memory. This is because there is more information available about a table on a local server for the SQL optimizer to take advantage of.
You could export the data from your query and import it into a database on your first server using the SQL Server Import/Export Wizard. It's a bit convoluted, but if this happens a lot you can use SSIS to automate the move, or even simply tick the 'Save this package' box. Once you have the data on your first server you can do whatever you like with it.