Never-ending query - SQL

My query worked fine until today.
I suspect there is a problem with an index.
Note: there is an index on the type column in table_1, and type is not unique in table_1.
Here is what the query looks like:
--works fine (finishes in 5 seconds)
select *
from table_1 a, table_2 b
where a.id = b.id
and a.date < date '2014-06-30'
and a.type = 2
When I increase the date range (including one more month, that's about 1000 records more) it doesn't finish, so I have to stop it.
--never ending
select *
from table_1 a, table_2 b
where a.id = b.id
and a.date < date '2014-07-31'
and a.type = 2
But when I omit the indexed column, it's OK:
--works fine
select *
from table_1 a, table_2 b
where a.id = b.id
and a.date < date '2014-07-31'
I would be grateful for any hints.

Try disabling the index:
ALTER INDEX idxname DISABLE;
(Note: in Oracle, DISABLE applies to function-based indexes; for a regular index use ALTER INDEX idxname UNUSABLE instead.)
You can also rebuild the index:
ALTER INDEX idxname REBUILD;
or gather statistics on the table:
EXEC DBMS_STATS.GATHER_TABLE_STATS ('yourschema', 'table');
But be careful: this could take a long time!

You can use the query plan to see what is happening.
Before you gather statistics or disable the index, record the query plan first; then compare it with the plan you get after gathering statistics or disabling the index. That way you can figure out for yourself what is happening in your database.
You should know that not every index helps.
Sometimes, without an index, the DBMS will choose a full table scan and a hash join to improve performance.
But with some indexes, the DBMS may use a nested loop join instead; that type of join may have a lower estimated cost, but it can execute serially.
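For example, a minimal sketch in Oracle, using the slow query from the question, to capture the plan before and after a change:
explain plan for
select *
from table_1 a, table_2 b
where a.id = b.id
and a.date < date '2014-07-31'
and a.type = 2;

select * from table(dbms_xplan.display);
Record this output before and after gathering statistics or disabling the index, so you can compare access paths and cost estimates.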

Related

SQL Oracle - better way to write update and delete query

I have around 270 million rows indexed on month||id_nr, and the update/delete queries below take around 4 hours to complete.
I was wondering if there is any other way to do the update/delete that would be faster.
Update query:
update table_A
set STATUS='Y'
where
month||id_nr in (select distinct month||id_nr from table_A where STATUS='Y');
Delete query:
Delete from table_B
where
month||id_nr in (select distinct month||id_nr from table_A where STATUS='Y');
Why the string concatenation? And never try to force the DBMS to make rows distinct in an IN clause. Let the DBMS decide what it considers the best approach to look up the data.
So, just:
where (month, id_nr) in (select month, id_nr from table_A where status = 'Y');
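Applied to the statements from the question, that gives (a sketch, keeping the original table and column names):
update table_A
set status = 'Y'
where (month, id_nr) in (select month, id_nr from table_A where status = 'Y');

delete from table_B
where (month, id_nr) in (select month, id_nr from table_A where status = 'Y');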
I suppose that id_nr is not a unique ID in table_b, for otherwise you wouldn't have to look at it combined with the month. An appropriate index would hence be:
create index idx_b on table_b (id_nr, month);
Or maybe, if you work a lot with the month, it may be a good idea to even partition the table by month. This could speed up queries, updates, and deletes immensely.
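A hedged sketch of what list partitioning by month could look like; the column types and partition values here are assumptions, not taken from the question:
create table table_b_part (
  id_nr  number,
  month  varchar2(6),  -- assumed format, e.g. '202401'
  status char(1)
)
partition by list (month) (
  partition p_202401 values ('202401'),
  partition p_202402 values ('202402'),
  partition p_rest   values (default)
);
With a layout like this, a delete restricted to one month only touches that month's partition.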
For table_a I suggest
create index idx_a on table_a (status, id_nr, month);
which is a covering index. The first column will help find the desired rows quickly; the other two columns will be available without having to read the table row.

Tuning for Oracle query

I need to increase performance of this query:
select t.*
     , (select max(1)
        from schema1.table_a t1
        where 1=1
        and to_date(t.misdate, 'YYYYMMDD') between t1.startdateref and t1.enddateref
        and sysdate between t1.startdatevalue and t1.enddatevalue
        and t1.idpma = t.idpm)
from schema2.table_b t
Any ideas?
Thanks
Well, you don't have any filtering condition on table_b. This means the best plan includes a full table scan on table_b, and that would be optimal.
Having said that, you now need to focus on table_a. It should be accessed using an index range scan on either:
idpma, then startdateref,
or idpma, then startdatevalue.
Yes, it's one or the other. For Oracle's cost-based optimizer (CBO) to pick the best plan, you'll need to add the following indexes:
create index ix1 on schema1.table_a (idpma, startdateref);
create index ix2 on schema1.table_a (idpma, startdatevalue);
Try with these and see how it works.

Oracle picks wrong index when joining table

I have a long query, but similar to the short version here:
select * from table_a a
left join table_b b on
b.id = a.id and
b.name = 'CONSTANT';
There are two indexes on table_b, one on id and one on name; idx_id has fewer duplicates and idx_name has a lot of duplicates. This is quite a large table (20M+ records), and the join is taking 10+ minutes.
A simple explain plan shows a lot of memory use on the join step, and it shows that the index on name is used rather than the one on id.
How can I solve this? How can I force it to use the idx_id index?
I was thinking of moving b.name = 'CONSTANT' to the WHERE clause, but this is a left join, and a WHERE filter would remove the table_a rows that have no match in table_b.
Update: explain plans added below; sorry, I cannot paste the whole plans.
Explain plan with b.name = 'CONSTANT': [plan not included]
Explain plan with the b.name clause commented out: [plan not included]
Add an optimizer hint to your query.
Without knowing your 'long' query, it's difficult to say whether Oracle is using the wrong index, or whether your reasoning (index b is smaller than index a, therefore it must be quicker for this query) is correct.
To add a hint, the syntax is:
select /*+ index(table_name index_name) */ * from ....;
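For the query in the question, that might look like the following; idx_id is the index name you mentioned, and note that the hint must reference the table alias, not the table name:
select /*+ index(b idx_id) */ *
from table_a a
left join table_b b
  on b.id = a.id
 and b.name = 'CONSTANT';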
What is the size of TABLE_A relative to TABLE_B? It wouldn't make sense to use the ID index unless TABLE_A had significantly fewer rows than TABLE_B.
Index range scans are generally only useful when they access a small percentage of the rows in a table. Oracle reads the index one-block-at-a-time, and then still has to pull the relevant row from the table. If the index isn't very selective, that process can be slower than a multi-block full table scan.
Also, it might help if you can post the full explain plan using this text format:
explain plan for select ... ;
select * from table(dbms_xplan.display);

Indexing Null Values in PostgreSQL

I have a query of the form:
select m.id from mytable m
left outer join othertable o on o.m_id = m.id
and o.col1 is not null and o.col2 is not null and o.col3 is not null
where o.id is null
The query returns a few hundred records, although the tables have millions of rows, and it takes forever to run (around an hour).
When I check my index statistics using:
select * from pg_stat_all_indexes
where schemaname <> 'pg_catalog' and (indexrelname like 'othertable_%' or indexrelname like 'mytable_%')
I see that only the index for othertable.m_id is being used, and that the indexes for col1..3 are not being used at all. Why is this?
I've read in a few places that PG has traditionally not been able to index NULL values. However, I've read this has supposedly changed since PG 8.3? I'm currently using PostgreSQL 8.4 on Ubuntu 10.04. Do I need to make a "partial" or "functional" index specifically to speed up IS NOT NULL queries, or is it already indexing NULLs and I'm just misunderstanding the problem?
You could try a partial index:
CREATE INDEX idx_partial ON othertable (m_id)
WHERE (col1 is not null and col2 is not null and col3 is not null);
From the docs: http://www.postgresql.org/docs/current/interactive/indexes-partial.html
Partial indexes aren't going to help you here as they'll only find the records you don't want. You want to create an index that contains the records you do want.
CREATE INDEX findDaNulls ON othertable ((COALESCE(col1,col2,col3,'Empty')))
WHERE col1 IS NULL AND col2 IS NULL AND col3 IS NULL;
SELECT *
FROM mytable m
JOIN othertable o ON m.id = o.m_id
WHERE COALESCE(col1,col2,col3,'Empty') = 'Empty';
By the way, searching for nulls via a left join generally isn't as fast as using EXISTS or NOT EXISTS in Postgres.
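For the query in the question, a NOT EXISTS version would be a sketch like this (same tables and columns as above):
select m.id
from mytable m
where not exists (
  select 1
  from othertable o
  where o.m_id = m.id
    and o.col1 is not null
    and o.col2 is not null
    and o.col3 is not null
);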
A single index on m_id, col1, col2 and col3 would be my first thought for this query.
And use EXPLAIN on this query to see how it is executed and what takes so much time. You could post the results here so we can help you further.
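For example (plain EXPLAIN shows the plan without running the query; EXPLAIN ANALYZE also executes it, which here would take an hour):
explain
select m.id from mytable m
left outer join othertable o on o.m_id = m.id
and o.col1 is not null and o.col2 is not null and o.col3 is not null
where o.id is null;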
A partial index seems the right way here:
"If you have a table that contains both billed and unbilled orders, where the unbilled orders take up a small fraction of the total table and yet those are the most-accessed rows, you can improve performance by creating an index on just the unbilled rows."
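The index the docs pair with that passage looks like this (orders and billed are the docs' example names, not from this question):
CREATE INDEX orders_unbilled_index ON orders (order_nr)
WHERE billed is not true;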
Perhaps those nullable columns (col1, col2, col3) act in your scenario as some kind of flag to distinguish a subclass of records in your table (for example, some sort of logical deletion)? In that case, besides the partial index solution, you might prefer to rethink your design and put them in different physical tables (perhaps using inheritance): one for the "live" records, the other for the "historical" records, and access the full set (only when needed) through a view.
Did you try to create a combined index on othertable(m_id, col1, col2, col3)?
You should also check the execution plan (using EXPLAIN) rather than checking the system tables for the index usage.
PostgreSQL 9.0 (currently in beta) will be able to use an index for an IS NULL condition; that feature had been postponed from earlier releases.

Why is this query doing a full table scan?

The query:
SELECT tbl1.*
FROM tbl1
JOIN tbl2
ON (tbl1.t1_pk = tbl2.t2_fk_t1_pk
AND tbl2.t2_strt_dt <= sysdate
AND tbl2.t2_end_dt >= sysdate)
JOIN tbl3 on (tbl3.t3_pk = tbl2.t2_fk_t3_pk
AND tbl3.t3_lkup_1 = 2577304
AND tbl3.t3_lkup_2 = 1220833)
where tbl2.t2_lkup_1 = 1020000002981587;
Facts:
Oracle XE
tbl1.t1_pk is a primary key.
tbl2.t2_fk_t1_pk is a foreign key on that t1_pk column.
tbl2.t2_lkup_1 is indexed.
tbl3.t3_pk is a primary key.
tbl2.t2_fk_t3_pk is a foreign key on that t3_pk column.
Explain plan on a database with 11,000 rows in tbl1 and 3500 rows in
tbl2 shows that it's doing a full table scan on tbl1. Seems to me that
it should be faster if it could do a index query on tbl1.
Update: I tried the hint a few of you suggested, and the explain cost got much worse! Now I'm really confused.
Further update: I finally got access to a copy of the production database, and "explain plan" showed it using indexes, with a much lower query cost. I guess having more data (over 100,000 rows in tbl1 and 50,000 rows in tbl2) was what it took to make it decide that indexes were worth it. Thanks to everybody who helped. I still think Oracle performance tuning is a black art, but I'm glad some of you understand it.
Further update: I've updated the question at the request of my former employer. They don't like their table names showing up in google queries. I should have known better.
The easy answer: because the optimizer expects to find more rows than it actually does.
Check the statistics: are they up to date?
Check the expected cardinalities in the explain plan: do they match the actual results? If not, fix the statistics relevant to that step.
Histograms on the joined columns might help; Oracle will use them to estimate the cardinality resulting from a join.
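A hedged sketch of the histogram suggestion; the schema name and the choice of column are assumptions:
begin
  -- gather statistics with a histogram on the join column
  dbms_stats.gather_table_stats(
    ownname    => 'YOURSCHEMA',
    tabname    => 'TBL2',
    method_opt => 'FOR COLUMNS T2_FK_T1_PK SIZE 254',
    cascade    => TRUE);
end;
/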
Of course, you can always force index usage with a hint.
It would be useful to see the optimizer's row count estimates, which are not in the SQL Developer output you posted.
I note that the two index lookups it is doing are RANGE SCANs, not UNIQUE SCANs, so its estimates of how many rows are being returned could easily be far off (whether statistics are up to date or not).
My guess is that its estimate of the final row count from the TABLE ACCESS of TBL2 is fairly high, so it thinks it will find a large number of matches in TBL1 and therefore decides on a full scan/hash join rather than a nested loop/index scan.
For some real fun, you could run the query with event 10053 enabled and get a trace showing the calculations performed by the optimizer.
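If you want to try that, a sketch (run in a session with suitable privileges; the trace file appears in the session's trace directory):
alter session set events '10053 trace name context forever, level 1';
-- run the problem query here, then turn the event off:
alter session set events '10053 trace name context off';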
Oracle tries to return the result set with the least amount of I/O required, which makes sense because I/O is slow. Index access takes at least two I/O calls: one for the index and one for the table. Usually more, depending on the size of the index, the size of the table, the number of records returned, where they are in the datafile, and so on.
This is where statistics come in. Let's say your query is estimated to return 10 records. The optimizer may calculate that using an index will take 10 I/O calls. Let's say your table, according to its statistics, resides in 6 blocks in the data file. It will be faster for Oracle to do a full scan (6 I/Os) than to read the index, read the table, read the index again for the next matching key, read the table, and so on.
So in your case, the table may be really small, or the statistics may be off.
I use the following to gather statistics and customize it for my exact needs:
begin
  DBMS_STATS.GATHER_TABLE_STATS(ownname => '&owner', tabname => '&table_name',
    estimate_percent => dbms_stats.AUTO_SAMPLE_SIZE, granularity => 'ALL',
    cascade => TRUE);
  -- DBMS_STATS.GATHER_TABLE_STATS(ownname => '&owner', tabname => '&table_name',
  --   partname => '&partition_name', granularity => 'PARTITION',
  --   estimate_percent => dbms_stats.AUTO_SAMPLE_SIZE, cascade => TRUE);
  -- DBMS_STATS.GATHER_TABLE_STATS(ownname => '&owner', tabname => '&table_name',
  --   partname => '&partition_name', granularity => 'PARTITION',
  --   estimate_percent => dbms_stats.AUTO_SAMPLE_SIZE, cascade => TRUE,
  --   method_opt => 'for all indexed columns size 254');
end;
You can only tell by looking at the query plan the SQL optimizer/executor creates. It will be at least partially based on index statistics, which cannot be predicted from just the definitions (and can, therefore, change over time).
SQL Server Management Studio for SQL Server 2005/2008, Query Analyzer for earlier versions.
(Can't recall the right tool names for Oracle.)
Try adding an index hint.
SELECT /*+ index(tbl1 tbl1_index_name) */ .....
Sometimes Oracle just doesn't know which index to use.
Apparently this query gives the same plan:
SELECT tbl1.*
FROM tbl1
JOIN tbl2 ON (tbl1.t1_pk = tbl2.t2_fk_t1_pk)
JOIN tbl3 on (tbl3.t3_pk = tbl2.t2_fk_t3_pk)
where tbl2.t2_lkup_1 = 1020000002981587
AND tbl2.t2_strt_dt <= sysdate
AND tbl2.t2_end_dt >= sysdate
AND tbl3.t3_lkup_1 = 2577304
AND tbl3.t3_lkup_2 = 1220833;
What happens if you rewrite this query to:
SELECT tbl1.*
FROM tbl1
, tbl2
, tbl3
where tbl2.t2_lkup_1 = 1020000002981587
AND tbl1.t1_pk = tbl2.t2_fk_t1_pk
AND tbl3.t3_pk = tbl2.t2_fk_t3_pk
AND tbl2.t2_strt_dt <= sysdate
AND tbl2.t2_end_dt >= sysdate
AND tbl3.t3_lkup_1 = 2577304
AND tbl3.t3_lkup_2 = 1220833;
Depending on your expected result size, you can play around with some session parameters:
SHOW PARAMETER optimizer_index_cost_adj;
[...]
ALTER SESSION SET optimizer_index_cost_adj = 10;
SHOW PARAMETER OPTIMIZER_MODE;
[...]
ALTER SESSION SET OPTIMIZER_MODE=FIRST_ROWS_100;
And don't forget to check the real execution time; sometimes the plan is not the real world ;)
It looks like a suitable index is not being picked up for this query. Make sure
you have an index on the t2_lkup_1 column, and that it is not part of a multi-column index; otherwise the index is not applicable.
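A minimal sketch of such an index (the index name is hypothetical):
create index ix_tbl2_lkup1 on tbl2 (t2_lkup_1);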
(In addition to Matt's comment:)
From your query, I believe you're joining because you want to filter out
records, not to do a JOIN that may increase the cardinality of the result set from
tbl1 if there are duplicate matches in tbl2. See Jeff Atwood's comment.
Try this, which uses EXISTS with a join inside the subquery (which is really fast on Oracle). Note that the tbl2 filter has to move inside the subquery, since tbl2 is not visible in the outer query:
select *
from tbl1
where exists (
  select *
  from tbl2, tbl3
  where tbl2.t2_fk_t1_pk = tbl1.t1_pk and
        tbl2.t2_fk_t3_pk = tbl3.t3_pk and
        tbl2.t2_lkup_1 = 1020000002981587 and
        sysdate between tbl2.t2_strt_dt and tbl2.t2_end_dt and
        tbl3.t3_lkup_1 = 2577304 and
        tbl3.t3_lkup_2 = 1220833);