I would like to know how to query multiple databases on the same Teradata server in SAS (Unix). I can do it for one database, but a few different databases are involved in my queries. The only related article I found was "SAS connection to Teradata Database using Teradata ODBC", but I could not get the right answer from it. Could you please share syntax or a snippet? Any comment is appreciated.
Thanks!
Jas
Edit:
Please see the script below; I want to do something like this.
libname lib 'mylibraryPath\';
proc sql;
  connect to teradata (user="userid" password="password" mode=teradata
    database=DB1 database=DB2 database=DB3 tdpid="MyServer");
  execute (
    create volatile table lib.tab1 as
    (
      /* select statements and several joins of tables from different databases (same server) */
    )
    with data
    primary index (abcd)
    on commit preserve rows;
  ) by teradata;
  execute (commit work) by teradata;
  disconnect from teradata;
quit;
As written by Chris in the question you linked, you can use so-called implicit pass-through, defining a libname for every Teradata database you need to point to:
/* server= added so the librefs are complete; point both at the same Teradata server */
libname db1 teradata user=xxx1 password=yyy1 database=zzz1 server="MyServer";
libname db2 teradata user=xxx2 password=yyy2 database=zzz2 server="MyServer";
Then you can use these inside DATA steps or SQL queries as if they were standard SAS libraries:
data join;
  /* both input data sets must be sorted (or indexed) by id for merge to work */
  merge db1.table1 db2.table2;
  by id;
run;
or
proc sql;
select *
from db1.table1 t1
left join db2.table2 t2
on t1.id=t2.id;
quit;
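Since all the databases live on the same Teradata server, another option is a single explicit pass-through connection: connect with one default database and qualify tables from the other databases inside the pushed-down SQL. A minimal sketch, reusing the hypothetical names from the question (DB1 as the default, DB2/DB3 qualified inline):
proc sql;
  connect to teradata (user="userid" password="password" mode=teradata database=DB1 tdpid="MyServer");
  execute (
    create volatile table tab1 as
    (
      /* unqualified names resolve to DB1; other databases are qualified explicitly */
      select a.abcd, b.col1, c.col2
      from tab_a a
      join DB2.tab_b b on a.abcd = b.abcd
      join DB3.tab_c c on a.abcd = c.abcd
    )
    with data
    primary index (abcd)
    on commit preserve rows
  ) by teradata;
  create table lib.tab1 as
    select * from connection to teradata (select * from tab1);
  disconnect from teradata;
quit;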
I have seen two options:
1) Not using an execute statement:
libname lib "/dir";
run;
proc sql;
  connect to teradata (server='edw' user=&tduser pass=&tdpass database=UDW mode=teradata);
  create table lib.datanew as
    select * from connection to teradata
    (select a.name, b.age from table1 a left join table2 b on a.pkey = b.pkey);
  disconnect from teradata;
quit;
2) Using execute to create a multiset volatile table in Teradata and then bringing it into a SAS library:
libname lib "/dir";
run;
proc sql;
  connect to teradata (server='edw' user=&tduser pass=&tdpass database=UDW mode=teradata);
  execute (
    create multiset volatile table datanew as
    (select a.name, b.age from table1 a left join table2 b on a.pkey = b.pkey)
    with data primary index (name) on commit preserve rows
  ) by teradata;
  create table lib.datanew as
    select * from connection to teradata (select * from datanew);
  disconnect from teradata;
quit;
I just want to understand whether one way is faster than the other. If so, why?
So, when you use an execute statement, you are using the SQL pass-through facility (see the SAS/ACCESS documentation).
That means your SQL code is executed directly in the DBMS, and SAS only retrieves the result table.
The SAS documentation PDFs have more examples, and there is a related discussion on SAS Communities as well.
Answering part of my question.
What I found is that when I need to create multiple volatile tables within Teradata, building on the ones already created, option 2) is the natural choice; I can run the PROC SQL commands as if I were in the Teradata SQL Assistant environment.
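For illustration, a minimal sketch of chaining volatile tables inside one pass-through session, following the pattern of option 2) above (table and column names are hypothetical):
proc sql;
  connect to teradata (server='edw' user=&tduser pass=&tdpass database=UDW mode=teradata);
  execute (
    create multiset volatile table stage1 as
    (select a.name, b.age from table1 a left join table2 b on a.pkey = b.pkey)
    with data primary index (name) on commit preserve rows
  ) by teradata;
  /* stage2 can reference stage1 because both live in the same Teradata session */
  execute (
    create multiset volatile table stage2 as
    (select name, max(age) as max_age from stage1 group by name)
    with data primary index (name) on commit preserve rows
  ) by teradata;
  create table lib.final as
    select * from connection to teradata (select * from stage2);
  disconnect from teradata;
quit;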
I would like to understand which connection options SQL Server Integration Services (SSIS) uses to connect to Teradata 14:
ODBC
.NET
OLE DB
Are there more or fewer than these? My main question is: how do I implement Teradata volatile table creation syntax in an SSIS package? Which of the above support it, and how is it done? Thank you
As an alternative to using a SQL task, you could use a global temporary table. That way you still have a session-specific temp table, but you don't have to create it on the fly.
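For reference, a minimal sketch of a Teradata global temporary table (names and types are hypothetical). The definition is created once and persists in the data dictionary, while the rows are private to each session:
CREATE GLOBAL TEMPORARY TABLE work_table
(
  id INTEGER,
  name VARCHAR(50)
) ON COMMIT PRESERVE ROWS;
Each SSIS session that inserts into work_table then sees only its own rows, with no run-time CREATE statement needed.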
Shouldn't it be as easy as an Execute SQL Task with the table DDL?
CREATE VOLATILE TABLE table_1
(
  column1 datatype,
  column2 datatype,
  ...
  columnN datatype
) ON COMMIT PRESERVE ROWS;
In SAS I'm used to PROC SQL, which I can use to query the database and return data into SAS, or to execute SQL code in-database. For example, from SAS I can use PROC SQL to run a pass-through query that creates a new table on my database from another table on the database.
proc sql;
  connect to netezza (server=X database=Z authdomain="Y");
  execute (
    create table B as
    select * from A
  ) by netezza;
quit;
In R I'm able to connect and query a database using RODBC and some code like:
library(RODBC)
connect1 <- odbcConnect("NZ", believeNRows = FALSE)
query1 <- "SELECT * FROM A"
df_imp <- sqlQuery(connect1, query1)
But how do I go about doing something similar to the SAS code above?
You can use the same sqlQuery:
sqlQuery(connect1, "CREATE TABLE b as SELECT * FROM a")
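If you want the whole round trip in one R script (a small sketch under the same DSN assumption as above), remember to close the connection when you're done:
library(RODBC)
connect1 <- odbcConnect("NZ", believeNRows = FALSE)
# the DDL runs in-database; any error messages come back as character strings
sqlQuery(connect1, "CREATE TABLE B AS SELECT * FROM A")
odbcClose(connect1)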
IBM provides several interfaces for using R with Netezza, including running R within the appliance.
These are provided for free under the GPL. Originally this was only available from Revolution Analytics for a fee, but that changed over a year ago.
You must register with IBM developerWorks (www.ibm.com/developerworks).
Once registered you can download the software and installation directions.
See http://ibm.co/XOC1q3
On this wiki (under the How To section), there are several documents and labs regarding the use of R with Netezza.
I want to retrieve from DB2 the list of records that match the identifiers in a DB1 table, like a regular SAS subquery. How can I do that with SAS pass-through SQL?
Performing the (long and complex) SQL through regular SAS SQL is too slow, which is why I am resorting to pass-through SQL instead.
I tried the following but no luck:
proc sql;
  connect to db1 as A (user=&userid. password=&userpw. database=MY_DB);
  connect to db2 as B (user=&userid. password=&userpw. database=MY_DB);
  create table test as
    select * from connection to B (
      select * from schema.table
      where ID_NUM =
        (select * from connection to A
          (select ID_NUM from schema2.table2)
        );
    );
  disconnect from A;
  disconnect from B;
quit;
If you're connecting to a single DB2 instance and joining two tables in different schemas/databases, the following should work for you:
proc sql;
connect to db2 as B (user=&userid. password=&userpw. database=MY_DB);
create table test as
select * from connection to B (
/* here we're in DB2 SQL */
select t1.* from schema.table as t1
inner join schema2.table2 as t2
on t1.ID_NUM = t2.ID_NUM
);
/* automatic disconnect at PROC SQL boundary */
quit;
If you're talking to two different servers or two user accounts, a heterogeneous join without pass-through could be used. Then the expected number of ID_NUM values becomes important.
You can't nest one pass-through query inside another, unless your two databases are naturally connected in some way that you could take advantage of in the native system.
The only way to do something like this would be to run the connection to A query first, store the result (the list of ID_NUMs) in a macro variable, and then insert that macro variable into the query for connection to B, as sketched below.
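A minimal sketch of that macro-variable approach, assuming both connections are to the same DB2 instance as in the answer above, and that the ID list is small enough to fit in a macro variable (if ID_NUM is character, the values would also need quoting):
proc sql noprint;
  connect to db2 as A (user=&userid. password=&userpw. database=MY_DB);
  /* pull the key list from A into a comma-separated macro variable */
  select ID_NUM into :idlist separated by ','
    from connection to A (select distinct ID_NUM from schema2.table2);
  connect to db2 as B (user=&userid. password=&userpw. database=MY_DB);
  create table test as
    select * from connection to B (
      select * from schema.table
      where ID_NUM in (&idlist.)
    );
  disconnect from A;
  disconnect from B;
quit;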
It might well be better not to use pass-through explicitly here, but instead to use a libname and run the query as you normally would. SAS may well help you out and do the work without actually copying all of B's rows in first.
I am trying to write a SQL query that needs to be compatible with both a Sybase and an Oracle database. The query looks like the following:
SELECT *
INTO new_table
FROM other_table
This query works fine on a Sybase database but not on an Oracle one. I found the Oracle equivalent:
CREATE table new_table AS
SELECT *
FROM other_table
Is there a way to write a third query that would do the same and that can be executed on a Sybase and on an Oracle database?
As you found, Oracle supports INTO but doesn't use it the way Sybase/SQL Server do. Likewise, Sybase doesn't support Oracle's CREATE TABLE ... AS extension.
The most reliable means of creating a table & importing data between the systems is to use two statements:
CREATE TABLE new_table (
  ...columns...
)

INSERT INTO new_table
SELECT *
FROM old_table
Even then, the syntax differs: Oracle requires each statement to be terminated with a semicolon, while T-SQL doesn't require one.
Creating a table and importing all the data from another table is a red flag to me; this is not something you'd do in a stored procedure for a production system. T-SQL and PL/SQL are very different, so I'd expect separate scripts for DML changes.
There is no query that would do what you want. These environments are very different.
It is possible.
SELECT * INTO new_table FROM existing_table;
Thanks