Find time duration between indexes - indexing

I have a table with 3 columns: index, status, t_stamp. With every status change, the table adds a row with the next index value and the current timestamp. I cannot figure out the SQL query to compute the time difference at each index. Suggestions please, or a pointer in the right direction.
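No answer is included in this extract, but the usual approach is the LAG analytic function. A minimal sketch, assuming Oracle, a hypothetical table name status_log, and a column named idx (INDEX itself is a reserved word, so the real column would need quoting):

SELECT idx,
       status,
       t_stamp,
       -- With DATE columns the difference is in days; with TIMESTAMP it is an INTERVAL.
       t_stamp - LAG(t_stamp) OVER (ORDER BY idx) AS time_since_prev_change
FROM   status_log
ORDER BY idx;

Each row then carries the time spent since the previous status change; the first row gets NULL because there is no prior row.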

Related

poorly performing query on order lines table

I have this query on the order lines table. It's a fairly large table. I am trying to get the quantity shipped by item in the last 365 days. The query works, but it is very slow to return results. Should I use a function-based index for this? I have read a bit about them, but haven't worked with them much at all.
How can I make this query faster?
select OOL.INVENTORY_ITEM_ID,
       SUM(nvl(OOL.shipped_QUANTITY, 0)) shipped_QUANTITY_Last_365
from oe_order_lines_all OOL
where ool.actual_shipment_date >= trunc(sysdate) - 365
  and cancelled_flag = 'N'
  and fulfilled_flag = 'Y'
group by ool.inventory_item_id;
Explain plan:
Stats are up to date; we regather them once a week.
The query takes 30+ minutes to finish.
UPDATE
After adding this index:
The explain plan shows the query is using the index now:
The query runs faster, but not "fast," completing in about 6 minutes.
UPDATE2
I created a covering index as suggested by Matthew and Gordon:
The query now completes in less than 1 second.
Explain Plan:
I still wonder whether a function-based index would also have been a viable solution, but I don't have time to play with it right now.
As a rule, using an index that accesses a "significant" percentage of the rows in your table is slower than a full table scan. Depending on your system, "significant" could be as low as 5% or 10%.
So, think about your data for a minute...
How many rows in OE_ORDER_LINES_ALL are cancelled? (Hopefully not many...)
How many rows are fulfilled? (Hopefully almost all of them...)
How many rows were shipped in the last year? (Unless you have more than 10 years of history in your table, more than 10% of them...)
Put that all together and your query is probably going to have to read at least 10% of the rows in your table. This is very near the threshold where an index is going to be worse than a full table scan (or, at least not much better than one).
Now, if you need to run this query a lot, you have a few options.
Materialized view, possibly for the prior 11 months together with a live query against OE_ORDER_LINES_ALL for the current month-to-date (a sketch follows this list).
A covering index (see below).
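A minimal sketch of the materialized view option; the view name and monthly granularity are illustrative, and the refresh strategy would depend on your load windows:

CREATE MATERIALIZED VIEW mv_shipped_by_item_month
BUILD IMMEDIATE
REFRESH COMPLETE ON DEMAND
AS
SELECT inventory_item_id,
       TRUNC(actual_shipment_date, 'MM') AS ship_month,
       SUM(NVL(shipped_quantity, 0))     AS shipped_quantity
FROM   oe_order_lines_all
WHERE  cancelled_flag = 'N'
AND    fulfilled_flag = 'Y'
GROUP BY inventory_item_id, TRUNC(actual_shipment_date, 'MM');

The report would then sum the closed months from the view and union in a live query against OE_ORDER_LINES_ALL for the current month-to-date.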
You can improve the performance of an index, even one accessing a significant percentage of the table rows, by making it include all the information required by the query -- allowing Oracle to avoid accessing the table at all.
CREATE INDEX idx1 ON OE_ORDER_LINES_ALL
( actual_shipment_date,
cancelled_flag,
fulfilled_flag,
inventory_item_id,
shipped_quantity ) ONLINE;
With an index like that, Oracle can satisfy the query by just reading the index (which is faster because it's much smaller than the table).
For this query:
select OOL.INVENTORY_ITEM_ID,
SUM(OOL.shipped_QUANTITY) as shipped_QUANTITY_Last_365
from oe_order_lines_all OOL
where ool.actual_shipment_date >= trunc(sysdate) - 365 and
cancelled_flag = 'N' and
fulfilled_flag = 'Y'
group by ool.inventory_item_id;
I would recommend starting with an index on oe_order_lines_all(cancelled_flag, fulfilled_flag, actual_shipment_date). That should do a good job in identifying the rows.
You can add the additional columns inventory_item_id and shipped_quantity to the index as well.
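A sketch of the corresponding DDL (the index name is illustrative):

CREATE INDEX oe_lines_cov_ix
    ON oe_order_lines_all (cancelled_flag, fulfilled_flag, actual_shipment_date,
                           inventory_item_id, shipped_quantity);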
Let's recapitulate the facts:
a) You access about 300K rows from your table (see the cardinality in the 3rd line of the execution plan).
b) You use a FULL TABLE SCAN to get the data.
c) The query is very slow.
The first thing is to check why the FULL TABLE SCAN is so slow - if the table is extremely large (check the BYTES in user_segments), you need to optimize the access to your data.
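A quick way to check, assuming the segment name matches the table name:

SELECT segment_name, ROUND(bytes / 1024 / 1024) AS size_mb
FROM   user_segments
WHERE  segment_name = 'OE_ORDER_LINES_ALL';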
But remember, no index will help you get 300K rows from, say, 30M total rows.
Index access to 300K rows can take a quarter of an hour or even more if the index is not much used and a large part of it is on disk.
What you need is partitioning - in your case, range partitioning on actual_shipment_date - for your data size, on a monthly or yearly basis.
This will eliminate the need to scan the old data (partition pruning) and make the query much more effective.
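A sketch of monthly interval partitioning (Oracle 11g+); the column list here is a hypothetical minimum, the real table has far more columns:

CREATE TABLE oe_order_lines_part (
    line_id              NUMBER,
    inventory_item_id    NUMBER,
    shipped_quantity     NUMBER,
    cancelled_flag       VARCHAR2(1),
    fulfilled_flag       VARCHAR2(1),
    actual_shipment_date DATE
)
PARTITION BY RANGE (actual_shipment_date)
INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))   -- Oracle auto-creates a partition per month
(PARTITION p_initial VALUES LESS THAN (DATE '2010-01-01'));

With this in place, the predicate actual_shipment_date >= trunc(sysdate) - 365 prunes the scan to roughly 13 monthly partitions.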
Other possibility: if the number of rows is small but the table size is very large, you need to reorganize the table to get better full-scan times.

Possibilities of Query tuning in my case using SQL Server 2012

I have 2 tables called Sales and SalesDetails; SalesDetails has 90 million rows.
When I want to retrieve all records for 1 year, it runs for almost 15 minutes and still has not completed.
I tried to retrieve records for 1 month; it took 1 minute 20 seconds and returned around 2.5 million records. I know that's huge.
Is there any solution to reduce the execution time?
Note
I don't want to create any index, because it already has enough indexes by default
I don't know what you mean when you say that you have indices "by default." As far as I know, creating the two tables you showed us above would not create any indices by default (other than maybe the clustered index).
That being said, your query is tough to optimize, because you are aggregating and taking sums. This behavior generally requires touching every record, so an index may not be usable. However, we may still be able to speed up the join using something like this:
CREATE INDEX idx ON sales (ID, Invoice) INCLUDE (Date, Register, Customer)
Assuming SQL Server chooses to use this index, it could scan salesDetails and then quickly look up every record against this index (instead of the sales table itself) to complete the join. Note that the index covers all columns required by the select statement.

Best Index For Partitioned Table

I am querying a fairly large table that has been range partitioned (by someone else) by date into one partition per day. On average there are about 250,000 records per day. Queries are frequently for a range of days - usually one day, a 7-day week, or a calendar month. Right now, querying for more than 2 weeks does not perform well; I have a normal date index created. If I query for more than 5 days, the optimizer doesn't use the index; with an index hint it performs OK from about 5 days to 14 days, but beyond that the hint doesn't help much.
Given that the hint does better than the optimizer, I am gathering statistics on the table.
However, my question going forward is, in general, if I wanted to create an index on the date field in the table, is it best to create a range partitioned index? Is it best to create a range index with a daily range similar to the table partition? What would be the best strategy?
This is Oracle 11g.
Thanks,
Related to your question: the partitioning strategy will depend on how you are going to query the data, and the best strategy is to query as few partitions as possible. E.g., if you are going to run monthly reports, you'd rather create monthly range partitioning, not daily range partitioning. If all your queries will be over data from within a couple of days, then daily range partitioning would be fine.
Given the numbers you provided, in my opinion you over-partition the data.
P.S. Querying each partition requires additional reading (compared to just one partition), so the optimizer opts for a full table access to reduce the reading of indexes.
Try to create a global index on the date column. If the index is partitioned and you select, let's say, 14 days, then Oracle has to read 14 index partitions. Having a single index on the entire table, i.e. a "global index," it has to read only 1 index.
Note that when you truncate or drop a partition, you have to rebuild the index afterwards.
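A sketch, assuming the partitioned table is called trx with date column trx_date (names borrowed from the answer below); on a partitioned table, a plain CREATE INDEX is global and non-partitioned by default:

-- Global (non-partitioned) index:
CREATE INDEX trx_date_gix ON trx (trx_date);

-- The local alternative, equipartitioned with the table
-- (an alternative, not an addition: 11g does not allow two
-- indexes on the same column list):
CREATE INDEX trx_date_lix ON trx (trx_date) LOCAL;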
I'm guessing that you could be writing your SQL wrong.
You said you're querying by date. If your date column has a time part and you want to extract records from one day, from a specific time of the day, e.g. 20:00-21:00, then yes, an index would be beneficial, and I would recommend a local index for this (partitioned by day, just like the table).
But since your queries span a range of days, it seems this is not the case and you just want all data (maybe filtered by some other attributes). If so, a partition full scan will always be much faster than index access... provided you benefit from partition pruning! Because if not - and you're actually performing a full table scan - this is expected to be very, very slow (in most cases).
So what could go wrong? Are you using a plain date in the WHERE clause? Note that:
SELECT * FROM trx WHERE trx_date = to_date('2014-04-03', 'YYYY-MM-DD');
will scan only one partition, whereas:
SELECT * FROM trx WHERE trunc(trx_date) = to_date('2014-04-03', 'YYYY-MM-DD');
will scan all partitions, because you apply a function to the partitioning key and the optimizer can no longer determine which partitions to scan.
It would be much easier to tell for sure if you provided table definition, total number of partitions, sample data and your queries with explain plans. If possible, please edit your question and include more details.

Oracle sql statement on very large table

I'm relatively new to SQL and I have a statement which takes forever to run.
SELECT sum(a.amountcur)
FROM custtrans a
WHERE a.transdate <= '2013-12-31';
It's a large table, but the statement takes about 6 minutes!
Any ideas why?
Your select, as you post it, will read 99% of the whole table (2013-12-31 is just a week ago, and I assume most entries are before that date and only very few after). If your table has many large columns (like varchar2(4000)), all that data will be read as well when Oracle scans the table. So you might read several KB per row just to get the 30 bytes you need for amountcur and transdate.
If you have this scenario, create a combined index on transdate and amountcur:
CREATE INDEX myindex ON custtrans(transdate, amountcur)
With the combined index, oracle can read the index to fulfill your query and doesn't have to touch the main table at all, which might result in considerably less data that needs to be read from disk.
Make sure the table has an index on transdate.
create index custtrans_idx on custtrans (transdate);
Also if this field is defined as a date in the table then do
SELECT sum(a.amountcur)
FROM custtrans a
WHERE a.transdate <= to_date('2013-12-31', 'yyyy-mm-dd');
If the table is really large, the query has to scan every row with transdate below the given date.
Even if you have an index on transdate and it helps to stop the scan early (which it may not), when the number of matching rows is very high, it will take considerable time to scan them all and sum the values.
To speed things up, you could calculate partial sums, e.g. for each past month, assuming that your data is historical and the past does not change. Then you'd need to scan custtrans for only 1-2 months, quickly scan the table of monthly sums, and add the results.
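A sketch of the idea; the summary table name and the cutoff date are illustrative:

-- Pre-aggregate closed months once (or refresh monthly):
CREATE TABLE custtrans_month_sums AS
SELECT TRUNC(transdate, 'MM') AS month_start,
       SUM(amountcur)         AS month_sum
FROM   custtrans
GROUP BY TRUNC(transdate, 'MM');

-- Total up to a cutoff: closed months come from the small summary
-- table, plus a live scan of the current (partial) month only:
SELECT (SELECT NVL(SUM(month_sum), 0)
        FROM   custtrans_month_sums
        WHERE  month_start < TRUNC(DATE '2013-12-31', 'MM'))
     + (SELECT NVL(SUM(amountcur), 0)
        FROM   custtrans
        WHERE  transdate >= TRUNC(DATE '2013-12-31', 'MM')
        AND    transdate <= DATE '2013-12-31') AS total_amount
FROM dual;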
Try to create an index only on the column amountcur:
CREATE INDEX myindex ON custtrans(amountcur)
In this case Oracle would most probably read only the index (Index Full Scan), nothing else.
Correction, as mentioned in a comment: it must be a composite index:
CREATE INDEX myindex ON custtrans(transdate, amountcur)
But maybe it is a bit useless to create an index just for a single select statement.
One option is to create an index on the column used in the WHERE clause (this is useful if you want to retrieve only 10-15% of the rows via the indexed column).
Another option is to partition your table if it has millions of rows. In this case too, if you try to retrieve 70-80% of the data, it won't help.
The best option is to first analyze your requirements and then make a choice.
Whenever you deal with dates, it's better to use the to_date() function; do not rely on implicit data type conversion.

Why can an SQL query take so long to return results?

I have an SQL query as simple as:
select * from recent_cases where user_id=1000000 and case_id=10095;
It takes up to 0.4 seconds to execute in Oracle, and when I do 20 requests in a row, it takes > 10 s.
The table 'recent_cases' has 4 columns: ID, USER_ID, CASE_ID and VISITED_DATE. Currently there are only 38 records in this table.
Also, there are 3 indexes on this table: on the ID column, on the USER_ID column, and on the (USER_ID, CASE_ID) column pair.
Any ideas?
One theory -- the table has a very large data segment and high water mark near the end, but the statistics are not prompting the optimiser to use an index. Therefore you're getting a slow full table scan. You could ALTER TABLE ... MOVE and rebuild the indexes to fix such a problem, or COALESCE it.
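A sketch of that repair (the index name is hypothetical; a MOVE marks every index on the table UNUSABLE, so each must be rebuilt):

ALTER TABLE recent_cases MOVE;          -- rewrites the table below a new high water mark
ALTER INDEX recent_cases_pk REBUILD;    -- repeat for each index on the table

-- Or shrink in place instead (requires row movement and an ASSM tablespace):
ALTER TABLE recent_cases ENABLE ROW MOVEMENT;
ALTER TABLE recent_cases SHRINK SPACE;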
Oracle has an "analyze table" command. Gathering statistics this way can speed up select statements a lot, even if there are just a few rows in the table.
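For example (DBMS_STATS is the recommended interface on recent versions; ANALYZE is the legacy form):

BEGIN
   DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'RECENT_CASES');
END;
/

-- Legacy form:
ANALYZE TABLE recent_cases COMPUTE STATISTICS;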
Here are some links which might help you:
http://www.dba-oracle.com/t_oracle_analyze_table.htm
http://docs.oracle.com/cd/B28359_01/server.111/b28310/general002.htm