Performance Tuning in ODI 12c - sql

I want to migrate more than 10 million rows of data from a source (Oracle) to a target (Oracle) in a shorter time in Oracle Data Integrator 12c. I have tried creating multiple scenarios, assigning each scenario a million records, and running a package of the 10 scenarios. The time was reduced, but is there any other way I can increase the performance of my ODI mapping with more than 10 million records?
I expect less execution time for the mapping, for better performance.

To add further: in ODI 12c you have the option to create source-side indexes. If the query is performing badly on the source side, you can enable that option under the joins section of the Physical tab; alternatively, you can also apply optimizer hints in the Physical tab. Please let me know if these work.
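As a rough sketch of what hint placement can do here (the schema/table names and the degree of parallelism are made up for illustration, and the exact SQL depends on the knowledge module you use), the statement ODI generates on the target side might end up looking roughly like this once APPEND and PARALLEL hints are supplied in the Physical tab:

-- parallel DML also needs: alter session enable parallel dml
insert /*+ APPEND PARALLEL(TGT, 8) */ into TARGET_SCHEMA.CUSTOMER_TGT TGT (CUST_ID, CUST_NAME, LOAD_DATE)
select /*+ PARALLEL(SRC, 8) */ SRC.CUST_ID, SRC.CUST_NAME, SYSDATE
from SOURCE_SCHEMA.CUSTOMER_SRC SRC
where SRC.LOAD_FLAG = 'Y';

APPEND enables a direct-path insert on the target and PARALLEL spreads the work across parallel execution servers, which is usually where the biggest gains for 10+ million rows come from.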

Related

How can I find the SQL query for an execution plan?

A program generates and sends queries to SQL Server (a high-load production system). I want to get the plan for a specific query against a specific table. I started Profiler with the "Showplan XML" event and set filters on TextData (like %MyTable%) and DatabaseName. It shows rows with XML in TextData that describe execution plans (for all queries against my table). But I know there are 5 different SQL queries for this table.
How can I match a specific query to its corresponding plan without using statistics?
Is there a reason this has to be done on the production environment? Most really bad execution plans (missing indexes causing table scans etc.) will be obvious enough on a dev environment where you can use all the diagnostics you want.
Otherwise, querying the plan cache (as in the linked question someone else mentioned) will probably have the lowest impact, as it just queries system views rather than adding diagnostics to every query.
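A minimal sketch of that plan-cache approach, assuming you have VIEW SERVER STATE permission (the TOP count and the LIKE filter are just illustrative):

select top (50)
       st.text            as sql_text,
       qp.query_plan      as plan_xml,
       qs.execution_count,
       qs.total_elapsed_time
from sys.dm_exec_query_stats qs
cross apply sys.dm_exec_sql_text(qs.sql_handle) st
cross apply sys.dm_exec_query_plan(qs.plan_handle) qp
where st.text like '%MyTable%'
order by qs.total_elapsed_time desc;

Because the full statement text comes back alongside each cached plan, you can match each of the 5 distinct queries to its own plan directly.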

Performance issues with outer joins to view in Oracle 12c

Two of my clients have recently upgraded to Oracle 12c 12.1.0.2. Since the upgrade I am experiencing significant performance degradation on queries using views with outer joins. Below is an example of a simple query that runs in seconds on the old Oracle 11g 11.2.0.2 database but takes several minutes on the new 12c database. Even more perplexing, this query runs reasonably fast (but not as fast) on one of the 12c databases, but not at all on the other. The performance is so bad on the one 12c database that the reporting I've developed is unusable.
I've compared indexes and system parameters between the 11g and two 12c databases, and have not found any significant differences. There is a difference between the Execution Plans, however. On 11g the outer join is represented as VIEW PUSHED PREDICATE but on 12c it is represented as a HASH JOIN without the PUSHED PREDICATE.
When I add the hint /*+ NO_MERGE(pt) PUSH_PRED(pt) */ to the query on the 12c database, then the performance is within seconds.
Adding a hint to the SQL is not an option within our Crystal Reports (at least I don't believe so, and there are also several reports), so I am hoping we can figure out why performance is acceptable on one 12c database but not on the other.
My team and I are stumped at what to try next, and particularly why the response would be so different between the two 12c databases. We have researched several articles on performance degradation in 12c, but nothing appears particularly applicable to this specific issue. As an added note, queries using tables instead of views are returning results within an acceptable timeframe. Any insights or suggestions would be greatly appreciated.
Query:
select pi.*
, pt.*
from policyissuance_oasis pi
, policytransaction_oasis pt
where
pi.newTranKeyJoin = pt.polTranKeyJoin(+)
and pi.policyNumber = '1-H000133'
and pi.DateChars='08/10/2017 09:24:51' -- 2016 data
--and pi.DateChars = '09/26/2016 14:29:37' --2013 data
order by pi.followup_time
As krokodilko says, run these:
explain plan for
select pi.*
, pt.*
from policyissuance_oasis pi
, policytransaction_oasis pt
where
pi.newTranKeyJoin = pt.polTranKeyJoin(+)
and pi.policyNumber = '1-H000133'
and pi.DateChars='08/10/2017 09:24:51' -- 2016 data
--and pi.DateChars = '09/26/2016 14:29:37' --2013 data
order by pi.followup_time;
select * from table(dbms_xplan.display());
and then you will probably see this at the bottom of the plan:
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
Here, the dynamic sampling concept should be the center of attention for this performance problem (level=2 is the default value; the level ranges from 0 to 11).
In fact, Dynamic sampling (DS) was introduced to improve the optimizer's ability to generate good execution plans. This feature was enhanced and renamed Dynamic Statistics in Oracle Database 12c. The most common misconception is that DS can be used as a substitute for optimizer statistics, whereas the goal of DS is to augment optimizer statistics; it is used when regular statistics are not sufficient to get good quality cardinality estimates.
For serial SQL statements the dynamic sampling level is controlled by the optimizer_dynamic_sampling parameter, but note that from Oracle Database 12c Release 1 the existence of SQL plan directives can also initiate dynamic statistics gathering when a query is compiled. This is a feature of adaptive statistics and is controlled by the database parameter optimizer_adaptive_features (OAF) in Oracle Database 12c Release 1 and optimizer_adaptive_statistics (OAS) in Oracle Database 12c Release 2.
In other words, from Oracle Database 12c Release 1 (we also use 12.1.0.2 at the office) onwards, DS will be used if certain adaptive features are enabled by setting the relevant parameter to TRUE.
Serial statements are typically short-running, and any DS overhead at compile time can have a large impact on overall system performance (if statements are frequently hard parsed). For systems that match this profile, setting OAF=FALSE is recommended (alter session set optimizer_adaptive_features=FALSE; note that you should alter the session, not the system).
For Oracle Database 12c Release 2 onwards, using the default OAS=FALSE is recommended.
Parallel statements are generally more resource intensive, so it's often worth investing in additional overhead at compile time to potentially find a better SQL execution plan.
For serial SQL statements, you may try manually setting the value of optimizer_dynamic_sampling (assuming there are no relevant SQL plan directives). If we were to issue a similar style of query against a larger table that had the parallel attribute set, we would see dynamic sampling kicking in.
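For example, a minimal sketch of setting it manually (the level 4 here is just an illustrative choice; re-run the explain plan afterwards to compare):

-- session scope
alter session set optimizer_dynamic_sampling = 4;

-- or statement scope, via the cursor-level DYNAMIC_SAMPLING hint
select /*+ DYNAMIC_SAMPLING(4) */ pi.*, pt.*
from policyissuance_oasis pi, policytransaction_oasis pt
where pi.newTranKeyJoin = pt.polTranKeyJoin(+)
and pi.policyNumber = '1-H000133'
order by pi.followup_time;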
When should you use dynamic sampling? DS is typically recommended when you know you are getting a bad execution plan due to complex predicates. But it shouldn't be set system-wide, as I mentioned before.
When is it not a good idea to use dynamic sampling?
When query compile times need to be as fast as possible, for example for unrepeated OLTP queries where you can't amortize the additional cost of compilation over many executions.
As a last word, for your case it could be beneficial to set the optimizer_adaptive_features parameter to FALSE for individual SQL statements and see the results you gain.
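One possible way to scope that to an individual statement is the OPT_PARAM hint. Note that optimizer_adaptive_features is not in the documented list of parameters OPT_PARAM supports, so treat this purely as an assumption to test on your patch level; if the hint is ignored, ALTER SESSION SET optimizer_adaptive_features=FALSE is the session-level equivalent.

select /*+ OPT_PARAM('optimizer_adaptive_features' 'false') */ pi.*, pt.*
from policyissuance_oasis pi, policytransaction_oasis pt
where pi.newTranKeyJoin = pt.polTranKeyJoin(+)
and pi.policyNumber = '1-H000133'
order by pi.followup_time;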
We discovered the cause of the performance issue. The following 2 system parameters were changed at the system level by the DBAs for the main application that uses our client's Oracle server:
_optimizer_cost_based_transformation = OFF
_optimizer_reduce_groupby_key = FALSE
When we changed them at the session level to the following, the query above that joins the 2 views returns results in less than 2 seconds:
alter session set "_optimizer_cost_based_transformation"=LINEAR;
alter session set "_optimizer_reduce_groupby_key" = TRUE;
COMMIT;
Changing this parameter did not have an impact on performance:
alter session set optimizer_adaptive_features=FALSE;
COMMIT;
We also found that changing the following parameter improved performance even more for our more complex views:
alter session set optimizer_features_enable="11.2.0.3";
COMMIT;
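To confirm what a given session is actually running with, a quick check of the visible parameters might look like this (hidden underscore parameters such as _optimizer_cost_based_transformation do not appear in v$parameter, so only the documented ones are listed here):

select name, value, isdefault
from v$parameter
where name in ('optimizer_adaptive_features', 'optimizer_features_enable');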

How to optimize the downloading of data to the server in SSIS

Good day.
I need to get records from an Oracle database into a SQL Server database. The data source is an ODBC source that runs a SQL command, in which I make use of all the indexes available for my requirement. The process runs fine; the problem is that it takes a long time and I need it to be quicker. The process does not need a lookup, merge, or merge join; it simply loads a table from Oracle into SQL Server under certain conditions.
Thank you for your help
Check what your limiting factor is. Generally there are 3 points to check:
1. The source DB (remote server) is slow. It can run low on memory, read speed, or free CPU. Substitute your query with a straight SELECT statement with no WHERE clause or JOINs and see if your SSIS package runs faster.
2. The target DB. You may have indexes enabled, high write latency on the HDD, or not enough CPU. Run a plain INSERT into your target table and see how long it takes.
3. The problem may be in the middle: the transfer between the 2 servers. The network is usually the main bottleneck. Is SSIS hosted on the same server as SQL Server? If not, you have 2 network connections plus a possible hardware bottleneck on the dedicated SSIS machine.
Depending on the bottleneck there are different solutions.
If you have network capacity and the bottleneck is 1 CPU per query on Oracle, then you can partition your data horizontally (IDs 1 to 100, 101 to 200, etc.), establish multiple connections to Oracle, and load the data in several streams. The number of streams is one less than the number of CPUs on Oracle, SSIS, or SQL Server (whichever is smaller).
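As a rough illustration of the horizontal split (the table name, key column, and range boundaries are made up), each parallel data flow would run its own range-bounded source query, for example:

-- stream 1
select * from SOURCE_SCHEMA.BIG_TABLE where ID >= 1 and ID < 1000000;
-- stream 2
select * from SOURCE_SCHEMA.BIG_TABLE where ID >= 1000000 and ID < 2000000;
-- ... one disjoint range per stream

Keeping the ranges disjoint means no stream reads another stream's rows, so they can run concurrently without contending on the same data.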

Oracle Performance Issues

I'm new to Oracle. I've worked with Microsoft SQL Server for years, though. I was brought into a project that was already overdue and over budget, and I need to be "the Oracle expert." I've got a table that has 14 million rows in it. And I've got a complicated query that updates the table. But I'm going to start with the simple queries. When I issue a simple update that modifies a small number of records (100 to maybe 10,000 or so) it takes no more than 2 to 5 minutes to table scan the table and update the affected records. (Less time if the query can use an index.) But if I update the entire table with:
UPDATE MyTable SET MyFlag = 1;
Then it takes 3 hours!
If the table scan completes in minutes, why should this take hours? I could certainly use some advice on how to troubleshoot this, since I don't have enough experience with Oracle to know what diagnostic queries to run. (I'm on Oracle 11g and using Oracle SQL Developer as the client.)
Thanks.
When you do the UPDATE in Oracle, the changes you make are written sequentially to the redo log, and the modified blocks are later written out to the data files by the database writer as part of the CHECKPOINT process.
In addition, the old version of the data is copied into UNDO space to support possible transaction rollbacks and access to the old version of the data by concurrent sessions.
All of this can take significantly more time than pure read operations, which don't modify data.
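A minimal sketch for seeing this overhead directly (assuming you can query v$mystat and v$statname): snapshot the session's redo and undo statistics before and after the statement and compare the values.

select sn.name, ms.value
from v$mystat ms
join v$statname sn on sn.statistic# = ms.statistic#
where sn.name in ('redo size', 'undo change vector size');

UPDATE MyTable SET MyFlag = 1;

select sn.name, ms.value
from v$mystat ms
join v$statname sn on sn.statistic# = ms.statistic#
where sn.name in ('redo size', 'undo change vector size');

For a 14-million-row update, the difference between the two snapshots shows how much redo and undo the statement generated, which is work a plain full table scan never has to do.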

Fastest way to insert in parallel to a single table

My company is cursed by a symbiotic partnership turned parasitic. To get our data from the parasite, we have to use a painfully slow odbc connection. I did notice recently though that I can get more throughput by running queries in parallel (even on the same table).
There is a particularly large table that I want to extract data from and move it into our local table. Running queries in parallel I can get data faster, but I also imagine that this could cause issues with trying to write data from multiple queries into the same table at once.
What advice can you give me on how to best handle this situation so that I can take advantage of the increased speed of using queries in parallel?
EDIT: I've gotten some great feedback here, but I think I wasn't completely clear on the fact that I'm pulling data via a linked server (which uses the odbc drivers). In other words that means I can run normal INSERT statements and I believe that would provide better performance than either SqlBulkCopy or BULK INSERT (actually, I don't believe BULK INSERT would even be an option).
Have you read Load 1TB in less than 1 hour?
- Run as many load processes as you have available CPUs. If you have 32 CPUs, run 32 parallel loads. If you have 8 CPUs, run 8 parallel loads.
- If you have control over the creation of your input files, make them of a size that is evenly divisible by the number of load threads you want to run in parallel. Also make sure all records belong to one partition if you want to use the switch partition strategy.
- Use BULK insert instead of BCP if you are running the process on the SQL Server machine.
- Use table partitioning to gain another 8-10%, but only if your input files are GUARANTEED to match your partitioning function, meaning that all records in one file must be in the same partition.
- Use TABLOCK to avoid row at a time locking.
- Use ROWS PER BATCH = 2500, or something near this if you are importing multiple streams into one table.
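A minimal sketch of one such load stream, tying together the TABLOCK and ROWS PER BATCH advice in the list above (the file path, table name, and terminators are made up):

-- one of N parallel BULK INSERT streams into the same target table
BULK INSERT dbo.TargetTable
FROM 'D:\loads\stream_01.dat'
WITH (
    TABLOCK,                 -- bulk-update lock; allows minimally logged, parallel loads into a heap
    ROWS_PER_BATCH = 2500,
    FIELDTERMINATOR = '|',
    ROWTERMINATOR = '\n'
);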
For SQL Server 2008, there are certain circumstances where you can utilize minimal logging for a standard INSERT SELECT:
SQL Server 2008 enhances the methods that it can handle with minimal logging. It supports minimally logged regular INSERT SELECT statements. In addition, turning on trace flag 610 lets SQL Server 2008 support minimal logging against a nonempty B-tree for new key ranges that cause allocations of new pages.
If you're looking to do this in code, i.e. C#, there is the option to use SqlBulkCopy (in the System.Data.SqlClient namespace), and as this article suggests it's possible to do this in parallel.
http://www.adathedev.co.uk/2011/01/sqlbulkcopy-to-sql-server-in-parallel.html
If by any chance you've upgraded to SQL 2014, you can insert in parallel (compatibility level must be 110). See this:
http://msdn.microsoft.com/en-us/library/bb510411%28v=sql.120%29.aspx
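If that refers to the parallel SELECT ... INTO support added in SQL Server 2014 (available when the database compatibility level is 110 or higher), a sketch would be (table and column names are made up):

-- SELECT ... INTO can get a parallel plan on SQL Server 2014+ at compatibility level 110+
SELECT src.*
INTO   dbo.TargetTable_stage
FROM   dbo.SourceTable AS src
WHERE  src.LoadDate >= '2014-01-01';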