Huge condition with connect by in Oracle - sql

I have an optimization problem with an Oracle 11g database. I have a query with the structure below:
select
(....)
from
(select ...)
where
(...)
CONNECT BY
PRIOR J.JDN_ID_POD = J.JDN_ID_NAD and (HUGE_CONDITION)
START WITH
J.JDN_ID_NAD = 1 and (HUGE_CONDITION)
Because HUGE_CONDITION is really huge (~3500 characters) and appears in two places, the query is very slow. Is there another way to write it?

Related

Optimization when merging from Oracle datalink

I am trying to write an Oracle procedure to merge data from a remote datalink into a local table. Individually the pieces work quickly, but together they time out. Here is a simplified version of what I am trying.
What works:
Select distinct ProjectID from Project where LastUpdated < (sysdate - 6/24);
--Works in split second.
Merge into project
using (select /*+DRIVING_SITE(remoteCompData)*/
rp.projectID,
rp.otherdata
FROM Them.Remote_Data@DBLink rd
WHERE rd.projectID in (1,2,3)) sourceData -- hardcoded IDs
On (rd.projectID = project.projectID)
When matched...
-- Merge statement works quickly when the IDs are hard coded
What doesn't work: Combining the two statements above.
Merge into project
using (select /*+DRIVING_SITE(rd)*/ -- driving site helps when this piece is extracted from the larger statement
rp.projectID,
rp.otherdata
FROM Them.Remote_Data@DBLink rd
WHERE rd.projectID in --in statement that works quickly by itself.
(Select distinct ProjectID from Project where LastUpdated < (sysdate - 6/24))
-- The select in the IN clause returns 10 rows. It's a test database.
On (rd.projectID = project.projectID)
)
When matched...
-- When I run this statement in SQL Developer, this is all that I get without the data updating
Connecting to the database local.
Process exited.
Disconnecting from the database local.
I also tried pulling out the in statement into a with statement hoping it would execute differently, but it had no effect.
Any direction for paths to pursue would be appreciated.
Thanks.
The /*+DRIVING_SITE(rd)*/ hint doesn't work with MERGE because the operation must run in the database where the merged table sits, which in this case is the local database. That means the whole result set from the remote table is pulled across the database link and then filtered against the data from the local table.
So, discard the hint. I also suggest you convert the IN clause into a join:
Merge into project p
using (select rd.projectID,
rd.otherdata
FROM Project ld
inner join Them.Remote_Data@DBLink rd
on rd.projectID = ld.projectID
where ld.LastUpdated < (sysdate - 6/24)) q
On (q.projectID = p.projectID)
When matched...
Please bear in mind that answers to performance tuning questions without sufficient detail are just guesses.
I found your question while having the same problem. Yes, the hint is ignored when the query is placed in the USING clause of a MERGE command.
In my case I created a work table, say w_remote_data for your example, and split the merge into two commands: (1) fill the work table, (2) run the merge using the work table.
The pitfall is that we cannot simply use either create table w_remote_data as select /*+DRIVING_SITE(rd)*/ ... or insert into w_remote_data select /*+DRIVING_SITE(rd)*/ ... to fill the work table. Both commands are valid but slow: the hint is not applied there either, so the problem remains. The solution is in PL/SQL: collect the result of the query into an intermediate collection first. See the example (for simplicity I assume w_remote_data has the same structure as remote_data; otherwise we would have to define a custom record type instead of using %rowtype):
declare
type ct is table of w_remote_data%rowtype;
c ct;
i pls_integer;
begin
execute immediate 'truncate table w_remote_data';
select /*+DRIVING_SITE(rd)*/ *
bulk collect into c
from Them.Remote_Data@DBLink rd ...;
if c.count > 0 then
forall i in c.first..c.last
insert into w_remote_data values c(i);
end if;
merge into project p using (select * from w_remote_data) ...;
execute immediate 'truncate table w_remote_data';
end;
My case was an ETL script that I could rely on not running in parallel. Otherwise we would have to use temporary (session-private) tables; I didn't test whether this works with them.

Optimize view that dynamically choose a table or another

The problem is that I have three huge tables with the same structure, and I need to show the results of one of them depending on the result of another query.
My order table looks like this:
code order
A 0
B 2
C 1
And I need to retrieve data from t_results
My approach (which is working) looks like this:
select *
from t_results_a
where 'A' in (
select code
from t_order
where order = 0
)
UNION ALL
select *
from t_results_b
where 'B' in (
select code
from t_order
where order = 0
)
UNION ALL
select *
from t_results_c
where 'C' in (
select code
from t_order
where order = 0
)
Is there any way to avoid scanning all three tables? I am working with Athena, so I can't program.
I presume that changing your database schema is not an option.
If it were, you could use one database table and add a CODE column whose value would be either A, B or C.
Basically the result of the SQL query on your ORDER table determines which other database table you need to query. For example, if CODE in table ORDER is A, then you have to query table T_RESULTS_A.
You wrote in your question: "I am working with Athena so I can't program".
I see that there is both an ODBC driver and a JDBC driver for Athena, so you can program with either .NET or Java. So you could write code that queries the ORDER table and use the result of that query to build another query string to query just the relevant table.
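The two-step idea described above can be sketched as follows. This is only an illustration under stated assumptions: it is written in Python rather than .NET or Java, it uses the standard-library sqlite3 module as a stand-in for a real Athena driver, and the helper names (ALLOWED_TABLES, fetch_active_results, demo) and the single val column are hypothetical. The table and column names t_order, code, order and t_results_a/b/c come from the question.

```python
import sqlite3

# Hypothetical whitelist mapping each code in t_order to its results table.
ALLOWED_TABLES = {"A": "t_results_a", "B": "t_results_b", "C": "t_results_c"}


def fetch_active_results(conn):
    """Look up the code whose order is 0, then query only the matching table."""
    # "order" is a reserved word, so it is quoted here.
    row = conn.execute('SELECT code FROM t_order WHERE "order" = 0').fetchone()
    if row is None:
        return []
    # Resolve the table name through the whitelist instead of splicing
    # the raw value from the database into the SQL string.
    table = ALLOWED_TABLES[row[0]]
    return conn.execute(f"SELECT * FROM {table}").fetchall()


def demo():
    # In-memory stand-in for the real database (assumption: one text column).
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE t_order (code TEXT, "order" INTEGER)')
    conn.executemany(
        "INSERT INTO t_order VALUES (?, ?)", [("A", 0), ("B", 2), ("C", 1)]
    )
    for suffix in "abc":
        conn.execute(f"CREATE TABLE t_results_{suffix} (val TEXT)")
        conn.execute(
            f"INSERT INTO t_results_{suffix} VALUES (?)", (f"row from {suffix}",)
        )
    return fetch_active_results(conn)


if __name__ == "__main__":
    print(demo())
```

Only one of the three results tables is read, because the second query is built after the first one has returned. With a real Athena connection the same structure should apply, but the driver calls would differ.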
Another thought I had was dynamic SQL. Oracle database supports it. I can create a string containing variables where one variable is the database table name and have Oracle interpret the string as SQL and execute it. I briefly searched the Internet to see whether Athena supports this (as I have no experience with Athena) but found nothing - which doesn't mean to say that it does not exist.

Sub-Queries in Sybase SQL

We have an application which indexes data using user-written SQL statements. We place those statements within parentheses so we can limit the query to certain criteria. For example:
select * from (select F_Name from table_1)q where ID > 25
However, we have discovered that this format does not work on a Sybase database: it reports a syntax error around the parentheses. I've tried playing around on a test instance but haven't been able to find a way to achieve this result. I'm not directly involved in the development and my SQL knowledge is limited. I'm assuming the 'q' is there to give the subquery result an alias for the application to use.
Does Sybase have a specific syntax? If so, how could this query be adapted for it?
Thanks in advance.
Sybase ASE is case sensitive with respect to all identifiers, so the query should work once the case matches.
As per @HannoBinder's comment, select id from ... is not the same as select ID from ..., so check the case.
Also make sure that the column ID is returned by the Q subquery so that it can be used in the where clause.
If the table and column names are in upper case, the following query should work:
select * from (select F_NAME, ID from TABLE_1) Q where ID > 25

understanding existing SQL query

I am trying to read some existing SQL queries written for MS SQL Server.
I don't have access to the database, table names, etc., just the raw queries, and I need to do some analysis on the fields required.
I need some help understanding what certain statements are doing, such as in the following block:
select FIELD1, x2.FIELD2
into #temp
from #temp1 x1 join #temp2 x2
on x1.FIELD1 = x2.FIELD2
and x1.FIELD3 = x2.MAXOCCUR
I have a basic understanding of SQL, but I need to understand a couple of things. Why do the table names in the 'into' and 'from' clauses have a '#' in front of them? What are x1 and x2 in this case? Why not just say
temp1.FIELD1 = temp2.FIELD2 instead of
x1.FIELD1 = x2.FIELD2
Am I missing something, or is this query written strangely to begin with? I understand joins etc.
Can someone help me out?
Thanks
That selects from two existing temp tables into a new temp table. The x1 in x1.FIELD1 is a table alias. Aliases are used so you don't have to type full table names when writing the query.
As mentioned, the # signs indicate a TEMPORARY table.
x1 and x2 are used as table aliases in this query. Yes, you could write
#temp1.FIELD1 = #temp2.FIELD2 instead of x1.FIELD1 = x2.FIELD2
but consider tables with long names: using an alias makes the query easier to read (for humans; the computer doesn't really care).

SAS Pass-through SQL - Multiple DBs

I want to retrieve from DB2 the list of records that match the identifiers in a DB1 table, like a regular SAS subquery. How can I do that with SAS pass-through SQL?
Running the (long and complex) SQL on db1 is too slow using regular SAS SQL, which is why I am resorting to pass-through SQL instead.
I tried the following but no luck:
proc sql;
connect to db1 as A (user=&userid. password=&userpw. database=MY_DB);
connect to db2 as B (user=&userid. password=&userpw. database=MY_DB);
create table test as
select * from connection to B (
select * from schema.table
Where ID_NUM =
(select * from connection to A
(select ID_NUM from schema2.table2)
);
);
disconnect from A;
disconnect from B;
quit;
If you're connecting to a single DB2 instance and joining two tables in different schemas/databases, the following should work for you:
proc sql;
connect to db2 as B (user=&userid. password=&userpw. database=MY_DB);
create table test as
select * from connection to B (
/* here we're in DB2 SQL */
select t1.* from schema.table as t1
inner join schema2.table2 as t2
on t1.ID_NUM = t2.ID_NUM
);
/* automatic disconnect at PROC SQL boundary */
quit;
If you are talking to two different servers or two user accounts, a heterogeneous join without pass-through could be used. In that case the expected number of ID_NUM values would be important.
You can't perform a pass-through query to another pass-through query, unless your two databases are naturally connected in some way that you could take advantage of in the native system.
The only way to do something like this would be to perform the connection to A query and store that result in a macro variable (the list of ID_NUMs), and then insert that macro variable into the query for connection to B.
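The two-step pattern just described (read the ID list from A, then splice it into the query sent to B) can be sketched outside SAS as well. The following is a minimal illustration in Python, not SAS macro code: the standard-library sqlite3 module stands in for both databases, the table names table1 and table2 and the payload column are simplified assumptions, and fetch_matching_rows and demo are hypothetical helpers. Only ID_NUM comes from the question.

```python
import sqlite3


def fetch_matching_rows(conn_a, conn_b):
    """Step 1: read the ID list from A. Step 2: use it in the query against B."""
    ids = [row[0] for row in conn_a.execute("SELECT ID_NUM FROM table2")]
    if not ids:
        return []
    # One bound placeholder per ID; a SAS macro variable would instead
    # splice the literal comma-separated list into the query text.
    placeholders = ",".join("?" for _ in ids)
    sql = f"SELECT * FROM table1 WHERE ID_NUM IN ({placeholders}) ORDER BY ID_NUM"
    return conn_b.execute(sql, ids).fetchall()


def demo():
    # Two separate in-memory databases stand in for connections A and B.
    conn_a = sqlite3.connect(":memory:")
    conn_b = sqlite3.connect(":memory:")
    conn_a.execute("CREATE TABLE table2 (ID_NUM INTEGER)")
    conn_a.executemany("INSERT INTO table2 VALUES (?)", [(2,), (3,)])
    conn_b.execute("CREATE TABLE table1 (ID_NUM INTEGER, payload TEXT)")
    conn_b.executemany(
        "INSERT INTO table1 VALUES (?, ?)", [(1, "x"), (2, "y"), (3, "z")]
    )
    return fetch_matching_rows(conn_a, conn_b)


if __name__ == "__main__":
    print(demo())
```

As with the macro variable approach, this is practical only while the list of ID_NUM values stays small enough to fit in a single IN list.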
It might well be better to not explicitly use passthrough here, but instead to use libname and execute the query as you would normally. SAS may well help you out here and do the work for you without actually copying all of B's rows in first.