SAS Pass-through SQL - Multiple DBs - sql

I want to retrieve from DB2 the list of records that matches the identifiers in a DB1 table, like a regular SAS subquery. How can I perform that with SAS pass-through SQL?
Running the (long and complex) SQL on db1 through regular SAS SQL is too slow, which is why I am resorting to pass-through SQL instead.
I tried the following but no luck:
proc sql;
connect to db1 as A (user=&userid. password=&userpw. database=MY_DB);
connect to db2 as B (user=&userid. password=&userpw. database=MY_DB);
create table test as
select * from connection to B (
select * from schema.table
Where ID_NUM =
(select * from connection to A
(select ID_NUM from schema2.table2)
);
);
disconnect from A;
disconnect from B;
quit;

If you're connecting to a single DB2 instance and joining two tables in different schemas/databases, the following should work for you:
proc sql;
connect to db2 as B (user=&userid. password=&userpw. database=MY_DB);
create table test as
select * from connection to B (
/* here we're in DB2 SQL */
select t1.* from schema.table as t1
inner join schema2.table2 as t2
on t1.ID_NUM = t2.ID_NUM
);
/* automatic disconnect at PROC SQL boundary */
quit;
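If you are talking to two different servers or two user accounts, a heterogeneous join without pass-through (through libname engines) could be used instead. In that case the expected number of ID_NUM values matters, because it determines how much data SAS has to move between the two systems.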

You can't perform a pass-through query to another pass-through query, unless your two databases are naturally connected in some way that you could take advantage of in the native system.
The only way to do something like this would be to perform the connection to A query and store that result in a macro variable (the list of ID_NUMs), and then insert that macro variable into the query for connection to B.
It might well be better to not explicitly use passthrough here, but instead to use libname and execute the query as you would normally. SAS may well help you out here and do the work for you without actually copying all of B's rows in first.
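A minimal sketch of the macro-variable approach described above, reusing the connection options from the question. It assumes ID_NUM is numeric (character values would need to be quoted before being joined into the list) and that the distinct values fit in a single macro variable, which SAS limits to 65,534 characters. The macro variable name id_list is made up for illustration.

```sas
proc sql noprint;
  /* db1 and db2 stand for the engine names, as in the question */
  connect to db1 as A (user=&userid. password=&userpw. database=MY_DB);

  /* Step 1: pull the identifiers from A into a comma-separated macro variable */
  select ID_NUM into :id_list separated by ','
    from connection to A
      (select distinct ID_NUM from schema2.table2);

  disconnect from A;

  connect to db2 as B (user=&userid. password=&userpw. database=MY_DB);

  /* Step 2: use the list as a literal IN list in the query sent to B */
  create table test as
    select * from connection to B
      (select * from schema.table
       where ID_NUM in (&id_list.));

  disconnect from B;
quit;
```

Because &id_list. is resolved by SAS before the text is sent to the DBMS, the second query reaches B as an ordinary IN list. For a long list of identifiers the libname route is likely the safer choice.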

Related

Passthrough Query from one SQL Server to another that drops a table on the source

I have a SQL Server in Spain and one in the US and there is a domain trust between the two with linked servers on each for access to the other.
I would like to be able to run the below query on the US SQL Server without having to maintain a stored proc on the US Server in order to run it. Is there a way to create a passthrough query from the SQL Server in Spain? I've already tried using OPENQUERY and OPENROWSET and it's just not working as they only seem to work with select statements that return results:
DROP TABLE IF EXISTS [Global].[dbo].[WW_Customer_Receivables]
SELECT *
INTO [global].[dbo].[ww_customer_receivables]
FROM
[LinkedServerObject-Spain].[global].dbo.ww_customer_receivables
If you want to execute a DDL statement on your linked server with OPENQUERY, you can use the following trick:
SELECT *
FROM OPENQUERY(linkedserver, '
DROP TABLE IF EXISTS [Global].[dbo].[WW_Customer_Receivables]
SELECT @@ROWCOUNT')
The SELECT @@ROWCOUNT returns a result set, so OPENQUERY works.
This trick works for all DDL operations such as CREATE, ALTER and DROP, and also for inserts, updates and deletes that don't return a result set.
The same trick is used elsewhere with a dummy SELECT that returns a placeholder value.
If you want to run the SELECT ... INTO with OPENQUERY as the source, you can do it like this:
DROP TABLE IF EXISTS [Global].[dbo].[WW_Customer_Receivables]
SELECT *
INTO [global].[dbo].[ww_customer_receivables]
FROM OPENQUERY([LinkedServerObject-US], '
SELECT *
FROM [global].dbo.ww_customer_receivables')

Connecting to Teradata via SAS (SQL Explicit Passthrough), for data pull, is it recommended to use execute statement?

I have seen two options:
1) Not using the execute statement
libname lib "/dir";
run;
proc sql ;
CONNECT TO TeraData (Server = 'edw' User =&tduser pass=&tdpass Database = UDW Mode = TeraData);
create table lib.datanew as
select * from connection to teradata
(select a.name,b.age from table1 a left join table2 b on a.pkey=b.pkey);
disconnect from teradata;
quit;
2) Using execute to create a multiset volatile table in Teradata and then bringing it into a SAS library
libname lib "/dir";
run;
proc sql;
CONNECT TO TeraData (Server = 'edw' User =&tduser pass=&tdpass Database = UDW Mode = TeraData);
execute( create multiset volatile table datanew as
(select a.name,b.age from table1 a left join table2 b on a.pkey=b.pkey)
with data primary index (name) on commit preserve rows
)
BY TeraData;
CREATE TABLE lib.datanew AS (SELECT * FROM CONNECTION TO TeraData (SELECT * FROM datanew));
disconnect from teradata;
quit;
I just want to understand if one way or the other can be faster? If so, why?
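When you use the execute statement, you are using the pass-through facility (see the SAS docs).
That means your SQL code is executed directly in the DBMS, and SAS only receives the result table. Note that select ... from connection to is also explicit pass-through; execute is meant for statements that return no rows, such as create table.
More examples can be found in the linked PDF.
The SAS Community also has a discussion of this.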
Answering part of my own question:
What I found is that when I need to create multiple volatile tables within Teradata, each building on the ones already created, option 2) is the natural choice. I can run the proc sql commands as if I were in the Teradata SQL Assistant environment.

Passthrough query in R

In SAS I'm used to PROC SQL, which I can use to query the database and return data into SAS, or to execute SQL code in the database. For example, from SAS I can use PROC SQL to run a passthrough query that creates a new table on my database from another table on the database.
proc sql;
connect to netezza ( SERVER=X DATABASE=Z  AUTHDOMAIN="Y");
execute
(
create table B as
select * from A
)
by netezza
;
quit;
In R I'm able to connect and query a database using RODBC and some code like:
connect1 <- odbcConnect("NZ",believeNRows=FALSE)
query1 <- "SELECT * FROM A"
df_imp <- sqlQuery(connect1, query1)
But how do I go about doing something similar to the SAS code above?
You can use the same sqlQuery:
sqlQuery(connect1, "CREATE TABLE b as SELECT * FROM a")
IBM provides several interfaces for using R with Netezza, including running R within the appliance.
These are provided for free under the GPL. Originally this was only available via Revolution for a fee, but that changed over a year ago.
You must register with IBM Developerworks (www.ibm.com/developerworks).
Once registered you can download the software and installation directions.
See http://ibm.co/XOC1q3
On this wiki (under the How To section), there are several documents and labs regarding the use of R with Netezza.

Multiple databases of Teradata and SAS in UNIX

I would like to know how to query multiple databases on the same Teradata server from SAS (UNIX). I can do it for one database, but my queries involve a few different databases. The only related article I found was "SAS connection to Teradata Database using Teradata ODBC", but I could not get the right answer from it. Could you please share the syntax or a snippet? Any comment is appreciated.
Thanks!
Jas
Edits:
Please see the below script, I want to do something like this.
libname lib 'mylibraryPath\';
proc sql;
connect to teradata (user="userid" password="pwaaaowrd" mode=teradata database=DB1 database=DB2 database=DB3 tdpid="MyServer");
execute (
create volatile table lib.tab1 as
(
Select statements and several joins of different tables from different databases (server is same)
)
WITH DATA
PRIMARY INDEX (abcd)
ON COMMIT PRESERVE ROWS;
)
By Teradata;
execute (commit work) by Teradata;
disconnect from teradata;
quit;
As Chris wrote in the question you linked, you could use so-called implicit pass-through, defining a libname for every Teradata database you need to point to:
libname db1 teradata user=xxx1 password=yyy1 database=zzz1;
libname db2 teradata user=xxx2 password=yyy2 database=zzz2;
Then you can use these inside Data steps or SQL queries as if they were standard SAS libraries:
data join;
merge db1.table1 db2.table2;
by id;
run;
or
proc sql;
select *
from db1.table1 t1
left join db2.table2 t2
on t1.id=t2.id;
quit;
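If a single login can see all of the databases, another option is to stay with explicit pass-through and qualify each table with its database name inside the Teradata SQL. A sketch under that assumption; the column some_col and the output table work.joined are hypothetical, and the credentials are placeholders:

```sas
proc sql;
  /* One connection is enough; no database= option is needed when
     every table is written as database.table */
  connect to teradata (user="userid" password="password"
                       mode=teradata tdpid="MyServer");

  create table work.joined as
    select * from connection to teradata (
      /* Teradata SQL: DB1 and DB2 are databases on the same server */
      select t1.id, t2.some_col
      from DB1.table1 t1
      left join DB2.table2 t2
        on t1.id = t2.id
    );

  disconnect from teradata;
quit;
```

Here the join runs entirely inside Teradata, so only the joined rows travel back to SAS.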

SAS SQL Query - linking to SAS data set stored on UNIX folder

I have a SAS dataset consisting of CUSTOMER IDS stored on my UNIX folder (TABLE NAME IS CUSTOMERID) . I need to get information about all these CUSTOMER IDS from CUSTOMER INFORMATION TABLE
I use the query:
SELECT * FROM CUSTOMERINFORMATION
WHERE CUSTOMER_ID IN (select * from CUSTOMERID)
I get an error because the TABLE CUSTOMER ID is on UNIX while the QUERY is running on ORACLE (using SAS)
Any idea ?
I'm assuming you're running the SQL query as pass-through SQL. In that case, you can't directly access SAS data. You need to either construct the entire SQL query using LIBNAME access, or upload your CUSTOMERID table to Oracle.
That is, if you have
proc sql;
connect to oracle (connection string);
select * from connection to oracle (
select * from customerinformation where customerID in []
);
quit;
You could convert that to
libname ora oracle (connection string); *oracle or OLEDB or ODBC or etc.;
proc sql;
select * from ora.customerinformation where customerId in
(select * from unix.customerID);
quit;
libname ora clear;
Or you could load the table to Oracle, using the same libname method.
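A rough sketch of that upload route, assuming you have permission to create tables in your Oracle schema. The macro variables &orauser, &orapw and &orapath, the table name customerid_tmp, and the column name customer_id in the SAS dataset are all hypothetical; the unix libref is taken from the example above and is assumed to be assigned already.

```sas
/* Placeholder credentials: &orauser, &orapw and &orapath are hypothetical */
libname ora oracle user=&orauser. password=&orapw. path=&orapath.;

/* Copy the SAS dataset of IDs into Oracle (table name is made up) */
data ora.customerid_tmp;
  set unix.customerID;
run;

proc sql;
  connect to oracle (user=&orauser. password=&orapw. path=&orapath.);

  /* Both tables now live in Oracle, so the subquery runs there */
  create table work.custinfo as
    select * from connection to oracle (
      select ci.*
      from customerinformation ci
      where ci.customer_id in (select customer_id from customerid_tmp)
    );

  disconnect from oracle;
quit;

/* Clean up the uploaded table and release the libref */
proc sql;
  drop table ora.customerid_tmp;
quit;
libname ora clear;
```

This keeps the filtering inside Oracle, which tends to matter more as the list of IDs grows.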
Finally, if the list of CustomerIDs is small enough, you could store it in a macro variable that contained a comma delimited list of IDs; macro text will resolve as long as it's not in single quotes.