Passthrough query in R - sql

In SAS I'm used to PROC SQL, which I can use to query a database and return data into SAS, or to execute SQL code in the database. For example, from SAS I can use PROC SQL to run a passthrough query that creates a new table on my database from another table on the database.
proc sql;
connect to netezza ( SERVER=X DATABASE=Z  AUTHDOMAIN="Y");
execute
(
create table B as
select * from A
)
by netezza
;
quit;
In R I'm able to connect and query a database using RODBC and some code like:
connect1 <- odbcConnect("NZ",believeNRows=FALSE)
query1 <- "SELECT * FROM A"
df_imp <- sqlQuery(connect1, query1)
But how do I go about doing something similar to the SAS code above?

You can use the same sqlQuery function to run the statement in the database:
sqlQuery(connect1, "CREATE TABLE b as SELECT * FROM a")

IBM provides several interfaces for using R with Netezza, including running R within the appliance.
These are provided for free under the GPL. Originally this was only available from Revolution for a fee, but that changed over a year ago.
You must register with IBM Developerworks (www.ibm.com/developerworks).
Once registered you can download the software and installation directions.
See http://ibm.co/XOC1q3
On this wiki (under the How To section), there are several documents and labs regarding the use of R with Netezza.

Related

How to create a table from a linked server into the local machine

I need to copy tables from a linked server onto my local machine. I am working in SQL Server Management Studio. The linked server is Oracle based. My end goal is to set up a stored proc that deletes a table if it exists and creates a new table in its place with refreshed data. This will be done for many tables as needed. The issue with the code below is that I get the error:
Incorrect syntax near the keyword 'SELECT'.
I am stuck at creating the table.
CREATE TABLE test AS
SELECT DUMMY
FROM OPENQUERY (LServer, '
Select *
from sourceT
');
The data in the dummy table is just one column with a single value "x". I have seen posts that suggest using four-part naming for the linked server table, like <server.database.schema.tablename>, but this doesn't seem to work, even if I just run the SELECT statement on its own. If I just run the SELECT part of the script above using OPENQUERY, that does work.
CREATE TABLE test AS
is valid in Oracle but not in SQL Server.
You want:
-- if the table already exists drop it
DROP TABLE IF EXISTS test;
-- now create a table and load into it
SELECT DUMMY
INTO test
FROM OPENQUERY (LServer, '
Select *
from sourceT')

Connecting to Teradata via SAS (SQL explicit passthrough): for a data pull, is it recommended to use an execute statement?

I have seen two options:
1) Not using an execute statement:
libname lib "/dir";
run;
proc sql ;
CONNECT TO TeraData (Server = 'edw' User =&tduser pass=&tdpass Database = UDW Mode = TeraData);
create table lib.datanew as
select * from connection to teradata
(select a.name, b.age from table1 a left join table2 b on a.pkey=b.pkey);
disconnect from teradata;
quit;
2) Using execute to create a multiset volatile table in Teradata and then bringing it into a SAS library:
libname lib "/dir";
run;
proc sql;
CONNECT TO TeraData (Server = 'edw' User =&tduser pass=&tdpass Database = UDW Mode = TeraData);
execute( create multiset volatile table datanew as
(select a.name,b.age from table1 a left join table2 b on a.pkey=b.pkey)
with data primary index (name) on commit preserve rows
)
BY TeraData;
CREATE TABLE lib.datanew AS (SELECT * FROM CONNECTION TO TeraData (SELECT * FROM datanew));
disconnect from teradata;
quit;
I just want to understand if one way or the other can be faster? If so, why?
So, when you use the execute statement, you are using the SQL pass-through facility (Docs).
That means your SQL code is executed directly in the DBMS, and SAS only receives the result table.
You can find more examples in the pdf.
The SAS Community also has a discussion of this.
Answering part of my own question:
What I found is that when I need to create multiple volatile tables within Teradata, building on ones already created, option 2) is the normal choice; I can run the PROC SQL execute statements as if I were in the Teradata SQL Assistant environment.
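For illustration, here is a minimal sketch of that chaining pattern (the step1/step2 table names and the filter in the second step are hypothetical). Every execute block and the final connection to query run on the same Teradata session, so later volatile tables can reference earlier ones, and only the final result is pulled back into SAS:
proc sql;
connect to teradata (server='edw' user=&tduser password=&tdpass database=UDW mode=teradata);
/* first volatile table, built from permanent Teradata tables */
execute (
create multiset volatile table step1 as
(select a.name, b.age from table1 a left join table2 b on a.pkey=b.pkey)
with data primary index (name) on commit preserve rows
) by teradata;
/* second volatile table, built from the first one in the same session */
execute (
create multiset volatile table step2 as
(select name, age from step1 where age >= 18)
with data primary index (name) on commit preserve rows
) by teradata;
/* only the final result comes back into SAS */
create table work.datanew as
select * from connection to teradata (select * from step2);
disconnect from teradata;
quit;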

SSIS (SQL Server Integration Services) & Teradata Volatile Tables (Teradata SQL tuning)

I would like to understand what connection modalities SQL Server Integration Services (SSIS) uses to connect to Teradata 14:
ODBC
.NET
OLE DB
Are these the options, or are there more or fewer? My main question is: how do I implement the Teradata volatile table CREATE syntax in an SSIS package? Which of the above support it, and how is it done? Thank you.
As an alternative to using a SQL task, you could use a global temporary table. That way you still have a session-specific temp table, but you don't have to create it on the fly.
Shouldn't it be as easy as a SQL task with the table SQL?
CREATE VOLATILE TABLE table_1
(
column1 datatype,
column2 datatype,
.
.
columnN datatype
) ON COMMIT PRESERVE ROWS;

Multiple databases of Teradata and SAS in UNIX

I would like to know how to query multiple databases on the same Teradata server from SAS (UNIX). I can do it for one database, but there are a few different databases involved in my queries. The only related article I found was "SAS connection to Teradata Database using Teradata ODBC", but I could not get the right answer from it. Could you please share syntax or a snippet? Any comment is appreciated.
Thanks!
Jas
Edit:
Please see the script below; I want to do something like this.
libname lib 'mylibraryPath\';
proc sql;
connect to teradata (user="userid" password="pwaaaowrd" mode=teradata database=DB1 database=DB2 database=DB3 tdpid="MyServer");
execute (
create volatile table lib.tab1 as
(
Select statements and several joins of different tables from different databases (server is same)
)
WITH DATA
PRIMARY INDEX (abcd)
ON COMMIT PRESERVE ROWS;
)
By Teradata;
execute (commit work) by Teradata;
disconnect from teradata;
quit;
As written by Chris in the question you linked, you could use so-called implicit pass-through, defining a libname for every Teradata database you need to point to:
libname db1 teradata user=xxx1 password=yyy1 database=zzz1;
libname db2 teradata user=xxx2 password=yyy2 database=zzz2;
Then you can use these inside Data steps or SQL queries as if they were standard SAS libraries:
data join;
merge db1.table1 db2.table2;
by id;
run;
or
proc sql;
select *
from db1.table1 t1
left join db2.table2 t2
on t1.id=t2.id;
quit;

SAS Pass-through SQL - Multiple DBs

I want to retrieve from DB2 the list of records that match the identifiers in a DB1 table, like a regular SAS subquery. How can I perform that with SAS pass-through SQL?
Performing the (long and complex) SQL on DB1 is too slow using regular SAS SQL, which is why I am resorting to pass-through SQL instead.
I tried the following but no luck:
proc sql;
connect to db1 as A (user=&userid. password=&userpw. database=MY_DB);
connect to db2 as B (user=&userid. password=&userpw. database=MY_DB);
create table test as
select * from connection to B (
select * from schema.table
Where ID_NUM =
(select * from connection to A
(select ID_NUM from schema2.table2)
);
);
disconnect from A;
disconnect from B;
quit;
If you're connecting to a single DB2 instance and joining two tables in different schemas/databases, the following should work for you:
proc sql;
connect to db2 as B (user=&userid. password=&userpw. database=MY_DB);
create table test as
select * from connection to B (
/* here we're in DB2 SQL */
select t1.* from schema.table as t1
inner join schema2.table2 as t2
on t1.ID_NUM = t2.ID_NUM
);
/* automatic disconnect at PROC SQL boundary */
quit;
If you are talking to two different servers or two user accounts, a heterogeneous join without pass-through could be used. In that case the expected number of ID_NUM values would be important.
You can't nest one pass-through query inside another pass-through query, unless your two databases are naturally connected in some way that you could take advantage of in the native system.
The only way to do something like this would be to run the connection to A query first and store the result (the list of ID_NUMs) in a macro variable, and then insert that macro variable into the query for connection to B, as in the sketch below.
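As a minimal sketch of that macro-variable approach (connection options and table names are copied from the question; keep in mind that a macro variable holds at most roughly 64K characters, so this only works when the list of IDs is reasonably short):
proc sql noprint;
connect to db1 as A (user=&userid. password=&userpw. database=MY_DB);
/* pull the IDs back into SAS and pack them into a comma-separated macro variable */
/* (assumes ID_NUM is numeric; character IDs would also need quoting)             */
select ID_NUM into :id_list separated by ','
from connection to A (select ID_NUM from schema2.table2);
disconnect from A;
connect to db2 as B (user=&userid. password=&userpw. database=MY_DB);
/* splice the list into the pass-through query for the second database */
create table test as
select * from connection to B (
select * from schema.table
where ID_NUM in (&id_list.)
);
disconnect from B;
quit;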
It might well be better not to use explicit pass-through here at all, but instead to define libnames and run the query as you normally would. SAS may well help you out here and do the work for you without actually copying in all of B's rows first.
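A rough illustration of that libname route (the odbc engine, DSN names, and schema= options below are placeholders; substitute the SAS/ACCESS engine and connection details that actually match your two databases):
/* hypothetical libnames: replace the engine and options with whatever */
/* SAS/ACCESS engine and connection details your two databases use     */
libname dbone odbc noprompt="dsn=DB1;uid=&userid.;pwd=&userpw." schema=schema2;
libname dbtwo odbc noprompt="dsn=DB2;uid=&userid.;pwd=&userpw." schema=schema;
proc sql;
create table test as
select t1.*
from dbtwo.table1 as t1   /* stands in for schema.table in the question */
where t1.ID_NUM in (select ID_NUM from dbone.table2);
quit;
How much of this SAS can push down to the databases depends on the engines involved, so it is worth checking the generated SQL with options sastrace=',,,d' sastraceloc=saslog; before relying on it for the full-size tables.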