How to invoke SELECT over DBLINK over DBLINK? - sql

In Oracle 11G I can easily invoke:
SELECT * FROM TABLE@DB_LINK_NAME;
But how invoke SELECT over DB_LINK that is on another DB_LINK?
Something like this:
SELECT * FROM TABLE@REMOTE_DB_LINK_NAME@DB_LINK_NAME;

First off, architecturally, I'd be pretty leery of any design that involved pulling data over multiple database links. I've seen it done when the eventual source is some ancient version of Oracle that the target database cannot connect to directly so an intermediate database running an intermediate version of Oracle was used. That is very rare in practice, though.
From a performance perspective, this sort of approach is gravely problematic. There is, of course, the issue that the data is going to be sent over the network twice. But more worryingly, you are taking a difficult problem, optimizing distributed SQL statements, and making it nearly intractable. You'd basically have to either guarantee that you never query local data and remote data in the same statement, or live with the resulting performance when Oracle settles on a poor query plan, because the set of tools left to you for optimizing this sort of query is minimal.
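One of the few levers Oracle does give you for a single link is the DRIVING_SITE hint, which nominates the site that should execute the join. A minimal sketch, with made-up table and link names:
-- Execute the join at the remote site: the (small) local table is shipped
-- across and only the joined result comes back, instead of the whole
-- remote table being pulled over the link.
SELECT /*+ DRIVING_SITE(r) */ l.id, r.big_column
FROM local_table l
JOIN remote_table@b_link r ON r.id = l.id;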
That being said, the intermediate database would need to have synonyms or views that abstract away the database link. So
On A:
create database link to B
On B:
create database link to C
create synonym table for table@C
On A, you can then
SELECT *
FROM table@B
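Spelled out a little more concretely (all object names, credentials, and TNS aliases below are placeholders):
On B (the intermediate database):
-- Link from B to C, plus a synonym that hides the second hop from callers.
CREATE DATABASE LINK c_link
  CONNECT TO some_user IDENTIFIED BY some_password
  USING 'C_TNS_ALIAS';
CREATE SYNONYM remote_table FOR real_table@c_link;
On A:
-- A only knows about B; it never needs to know that C exists.
CREATE DATABASE LINK b_link
  CONNECT TO some_user IDENTIFIED BY some_password
  USING 'B_TNS_ALIAS';
SELECT * FROM remote_table@b_link;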

Related

sql temp table join between servers

So I have a summary I need to return to the end-user application.
It should accept 3 parameters DateType, StartDate, EndDate.
Date Type will determine the date field I use to filter the data.
The way I accomplished this was by putting all the IDs of the records for a date type into a TEMP table and then joining my summary to the list of IDs.
This worked fine when running the query on the SQL server that houses the data.
However, that is a replicated server, so when I compiled it into a stored proc on the server with the rest of the application data, the query slowed down, i.e. 2 seconds vs. 50 seconds.
I think the cross-server join from the temp table created on the SQL server to the tables on the replication server is causing the slowdown.
Are there any methods or techniques that I can use to get around this and build this all in one stored procedure?
If I create 3 stored procedures with their own date range, then they are fast again. However, this means maintaining multiple stored procs for the same thing.
First off, if you are running a version of SQL Server older than 2012 SP1, one problem is that users who aren't allowed to run DBCC SHOW_STATISTICS (which is most users who aren't sysadmins, see the "Permissions" section in the documentation) don't get access to statistics on remote tables. This can severely cripple the optimizer's ability to generate a good execution plan. Upgrading SQL Server or granting more permissions can help there.
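For example, pre-2012 SP1 reading statistics requires owning the table or membership in sysadmin, db_owner, or db_ddladmin, so a somewhat heavy-handed fix on the remote server (the user name is a placeholder) would be:
-- Run in the remote database; 'linked_login_user' stands in for the database
-- user that the linked server's login maps to.
EXEC sp_addrolemember 'db_ddladmin', 'linked_login_user';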
If your query involves filtering or joining on a character column, make sure the remote server is flagged in the linked server options as "collation compatible". If this option is off, SQL Server can't assume strings can be compared across the servers and it will start pumping entire tables up and down just to make sure the data ends up where the comparison has to be made.
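If the collations genuinely do match, the option can be switched on with sp_serveroption (the server name is a placeholder):
-- Only enable this when the remote collation really is compatible, otherwise
-- string comparisons can silently return wrong results.
EXEC sp_serveroption @server = N'RemoteServer',
                     @optname = N'collation compatible',
                     @optvalue = N'true';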
If the execution plan is as good as it gets and it's still not good enough, one general (lame) technique is to transfer all data locally first (SELECT * INTO #localtable FROM remote.db.schema.table), then run the query as a non-distributed query. Obviously, in order for this to work, the remote table cannot be "too big" and in some cases this actually has worse performance, depending on how many rows are involved. But it's always worth considering, because the optimizer does a better job with local tables.
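A rough sketch of that pattern, with hypothetical linked server, database, and table names:
-- Pull the (hopefully small) remote table across once...
SELECT RemoteKey, RemoteValue
INTO #RemoteCopy
FROM LinkedSrv.RemoteDb.dbo.RemoteTable
WHERE SomeFilter = 1;

-- ...then run the rest of the work as a purely local query.
SELECT l.*, r.RemoteValue
FROM dbo.LocalTable AS l
JOIN #RemoteCopy AS r
  ON r.RemoteKey = l.LocalKey;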
Another approach that avoids pulling tables together across servers is packing up data in parameters to remote stored procedure calls. Entire tables can be passed as XML through an NVARCHAR(MAX) parameter, since neither XML columns nor table-valued parameters are supported in distributed queries. The basic idea is the same: avoid the need for the optimizer to figure out an efficient distributed query. The best approach greatly depends on your data and your query, obviously.
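As a sketch of that idea (the remote procedure name and the shape of the XML are made up for illustration):
-- Serialise the rows locally; FOR XML without the TYPE directive produces a
-- string that fits in NVARCHAR(MAX).
DECLARE @payload NVARCHAR(MAX) =
    (SELECT Id, Name
     FROM dbo.LocalTable
     FOR XML PATH('row'), ROOT('rows'));

-- Hypothetical remote procedure that takes NVARCHAR(MAX), casts it back to
-- XML and shreds it with .nodes() on its own side.
EXEC LinkedSrv.RemoteDb.dbo.usp_ImportRows @rowsXml = @payload;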

I'd like to merge data sets using an SQL query from different servers (one Sybase, the other MS).

Is that possible? I'm using Aquadesk and I can't get it to work. The tables have a matching unique identifier and I'm wondering if I can match them up in some way.
What you need, I think, are "Federated Servers" (databases); you can look this up.
The basic idea is that you can create (catalog) a table in your local database that actually resides on another database (or server, or even another DB system, though that depends on your SQL system and version); that is definitely a question for your DBAs.
You get a table like "MYSQL"."PERSONS" that resides remotely (e.g. "BASE"."PERSDATA"), so you can use it in a query such as:
SELECT *
FROM "LOCALNAME"."USERS" usr
JOIN "MYSQL"."PERSONS" pers
  ON usr.user_id = pers.id
So you can select and join across different databases (and servers).
I have only used this with IBM UDB, but it works really well and has fair performance (although that depends heavily on your statement).
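If the Microsoft side is SQL Server, the closest equivalent there is a linked server. A rough sketch, with placeholder names and an ODBC DSN, since the exact provider depends on which Sybase driver is installed:
-- Register the Sybase server under the name SYB, going through a hypothetical
-- ODBC DSN called 'SybaseDSN'.
EXEC sp_addlinkedserver
     @server     = N'SYB',
     @srvproduct = N'Sybase',
     @provider   = N'MSDASQL',
     @datasrc    = N'SybaseDSN';

EXEC sp_addlinkedsrvlogin
     @rmtsrvname  = N'SYB',
     @useself     = N'false',
     @rmtuser     = N'some_user',
     @rmtpassword = N'some_password';

-- Join a local table to the remote one through a four-part name.
SELECT usr.user_id, pers.name
FROM dbo.USERS AS usr
JOIN SYB.persdb.dbo.PERSONS AS pers
  ON pers.id = usr.user_id;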

SQL Server 2008, Sybase - large select queries over low bandwidth

I need to pull a large amount of data from various tables across a line that has very low bandwidth. I need to minimize the amount of data that gets sent to and fro.
On that side is a Sybase database, on this side SQL Server 2008.
What I need is to pull all the tables from the Sybase database that have to do with this office. Let's say I have the following tables as an example:
Farm
Tree
Branch
etc.
(one farm has many trees, one tree has many branches etc.)
Let's say the "Farm" table has a field called "CountryID", and I only want the data for where CountryID=12. The actual table structures I am looking at are very complex (and I am also not very familiar with them) so I want to try to keep the queries simple.
So I am thinking of setting up a series of views:
CREATE VIEW vw_Farm AS
SELECT * from Farm where CountryID=12
CREATE VIEW vw_Tree AS
SELECT * from Tree where FarmID in (SELECT FarmID FROM vw_Farm)
CREATE VIEW vw_Branch AS
SELECT * from Branch where TreeID in (SELECT TreeID FROM vw_Tree)
etc.
To then pull the actual data across I would then do:
SELECT * INTO localDb.Farm FROM vw_Farm
SELECT * INTO localDb.Tree FROM vw_Tree
SELECT * INTO localDb.Branch FROM vw_Branch
etc.
Simple enough to set up. I am wondering how this will perform though? Will it perform all the SELECT statements on the Sybase side and then just send back the result? Also, since this will be an iterative process, is it possible to index the views for subsequent calls?
Any other optimisation suggestions would also be welcome!
Thanks
Karl
EDIT: Just to clarify, the views will be set up in SQL Server. I am using a linked server to Sybase ASE to set up those views. What is worrying me in particular is whether the fact that the view is in SQL Server on this side and not in Sybase on that side will mean that for each iteration the data from the preceding view will get pulled across to SQL Server first before the calculations get executed. I want Sybase to do all the calcs and just pass the results across.
It's difficult to be certain without testing, but my somewhat-relevant experience (using linked servers to platforms other than Sybase, and on SQL Server 2005) has been that using subqueries (such as your code for vw_Tree and vw_Branch) more or less guarantees that SQL Server will pull all the data for the outer table into a local temp table, then match it to the results of the inner query.
The problem is that SQL Server has no access to the linked server's table statistics, so can make no meaningful decisions about how to optimise the query.
If you want to be sure to have the work done on the Sybase server, your best bet will be to write code (could be views or stored procedures) on the Sybase side and reference them from SQL Server.
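One way to reference them from the SQL Server side is OPENQUERY, which ships the inner statement to the linked server verbatim, so Sybase does the filtering and only the result set comes back. A sketch with placeholder linked-server and object names:
SELECT *
INTO localDb.dbo.Farm
FROM OPENQUERY(SYBASE_SRV,
               'SELECT * FROM mydb.dbo.Farm WHERE CountryID = 12');
The same pattern works for pulling the output of a view or stored procedure you have created on the Sybase side.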
Linked server connections are, in my experience, not particularly resilient over flaky networks. If it's available, you could consider using Integration Services rather than linked-server queries - but even that may not be much better. You may need to consider falling back on moving text files with robocopy and bcp.

Move Data from Oracle to SQL Server

I would like to copy parts of an Oracle DB to a SQL Server DB. I need to move the data because the Oracle box is being decommissioned. I only need the data for reference purposes so don't need indexes or stored procedures or constraints, etc. All I need is the data.
I have a link to the Oracle DB in SQL Server. I have tested the following query, which seemed to work just fine:
select *
into NewTableName
from linkedserver.OracleTable
I was wondering if there are any potential issues with using this approach?
Using SSIS (SQL Server Integration Services) may be a good alternative, especially if your table names are the same on both servers. Use the import wizard and it should create the destination tables for you and let you edit any mappings.
The only issue I see with that approach is that you will, of course, need to execute it for each and every table you need. Glad you are decommissioning the Oracle server :-). Otherwise, if you are not concerned with indexes or any of the existing sprocs, I don't see any issue in what you are doing.
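If there are many tables, a small driver script can generate the per-table statements. A rough sketch, assuming a linked server called ORA_LINK and an Oracle schema called HR (both placeholders):
-- List of Oracle tables to copy; fill in the real names.
DECLARE @tables TABLE (name SYSNAME);
INSERT INTO @tables VALUES ('EMPLOYEES'), ('DEPARTMENTS');

DECLARE @name SYSNAME, @sql NVARCHAR(MAX);
DECLARE table_cursor CURSOR FOR SELECT name FROM @tables;
OPEN table_cursor;
FETCH NEXT FROM table_cursor INTO @name;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- SELECT ... INTO creates the local table and copies the data in one go.
    SET @sql = N'SELECT * INTO dbo.' + QUOTENAME(@name)
             + N' FROM ORA_LINK..HR.' + QUOTENAME(@name) + N';';
    EXEC sys.sp_executesql @sql;
    FETCH NEXT FROM table_cursor INTO @name;
END
CLOSE table_cursor;
DEALLOCATE table_cursor;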
The "select " approach could be very slow if tables are large. Consider writing pro*C in that case or use Fastreader http://www.wisdomforce.com/products-FastReader.html
A faster and easier approach might be to use the Data Transformation Services, depending on the number of objects you're trying to copy over.

SQL Passthrough in SAS

Are there any advantages of using SQL Passthrough facility along with SAS?
Although this question is overly broad, I can provide an overly broad answer.
The pass-through SQL in SAS allows you to communicate directly with a database. This becomes very advantageous when you are using database specific functions. An example would be Oracle's stats functions. You do not have to worry about how SAS will handle your coding or translate your SQL.
Additionally, it has been a benefit to us that pass-through SQL requires very little processing on the SAS side. If you have an extremely busy SAS box, you can opt to send the processing logic directly to the database. This is possible without using pass-through SQL, but you have a higher degree of control when utilizing it.
This is by no means an exhaustive list of benefits, simply a few high level perks to using pass-through SQL. If you have a more concrete use case, we can discuss the specific differences in coding techniques.
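To illustrate the database-specific-function point above, here is a small pass-through query that leans on an aggregate with no direct PROC SQL equivalent (the connection details and the emp table are placeholders):
proc sql;
   connect to oracle (user=scott password=tiger path="oraserver");
   create table work.dept_stats as
   select * from connection to oracle
      ( select deptno,
               median(sal) as median_sal   /* evaluated by Oracle, not SAS */
        from emp
        group by deptno );
   disconnect from oracle;
quit;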
PROC SQL will try to pass as much of the logic as it can to the database, but there are various times when it cannot. Using SAS functions that do not have an equivalent in the database (or in the SAS/ACCESS engine for the database) will prevent the whole query from being passed to the database. When the query is not fully passed to the database, the data is pulled into SAS and processed there. The more complicated your SQL is, the more likely it will end up being processed in SAS. Here is a case that makes a larger difference than you might realize.
libname db <database> path=dbserver user=... password=...;
proc sql;
create table db.new as
select * from db.largedata where flag=1;
quit;
This will actually (at least through SAS 9.1.3) pull all the data that matches flag=1 down to SAS and then load it back into the database. If this is millions of rows, it really slows down.
You would find explicit pass through much faster in this case.
proc sql;
connect to dbase (server=dbserver user=... password=...);
execute (create table db.new as
select * from db.largedata where flag=1) by dbase;
disconnect from dbase;
quit;
I recently did an example using Oracle and a table with about 250,000 rows. The first way took 20 seconds and the second way took 2 seconds.
If you don't use pass-through, then you have to import all the records (that you need for the processing) from the database into SAS. By using pass-through, you can have some processing done on the database side and bring over only the resulting records into SAS. The difference (in terms of processing time and network usage) can vary from tiny to huge, depending on what you do.
There are advantages to using passthrough, but it depends on what you're trying to accomplish. Generally, I use standard proc sql without the passthrough when doing queries. Recently, however, I've used it to generate some stored procs.
proc sql;
connect to mysql(user = 'xxxxx' pass = 'xxxxx' server = 'localhost');
execute(set @id = &id.) by mysql;
execute(select (@lit:=image_text) from quality.links_image_text where image_id = @id) by mysql;
execute(set @lidx = locate('ninja',@lit)) by mysql;
execute(set @lidx2 = locate(' ',@lit,@lidx)) by mysql;
execute(set @lidxd = @lidx2 - @lidx) by mysql;
execute(set @lf = substr(@lit,@lidx,@lidxd)) by mysql;
create table asdf as
select &id. as id, a as ws from connection to mysql
(select @lf as a)
;
disconnect from mysql;
quit;
Clearly, that's not something that can be done outside of passthrough (at least not that I know of). So yea ... it all depends on what it is you're trying to accomplish.
Put simply, SQL pass-through statements give you more control over what gets sent to the database.