How come using an index isn't affecting my query duration? - sql

I'm trying to grasp the concept of indexing. I'm using the Adventureworks2016 database in SQL Server. Adventureworks comes with the
IX_SalesOrderDetail_ProductID index in the SalesOrderDetail table
and the
IX_PurchaseOrderDetail_ProductID index in the PurchaseOrderDetail table.
I ran the query below and it took 13 seconds on my computer.
SELECT *
FROM [AdventureWorks2016].[Sales].[SalesOrderDetail] sod
INNER JOIN [Purchasing].[PurchaseOrderDetail] pod ON sod.ProductID = pod.ProductID
I then removed both the IX_SalesOrderDetail_ProductID and the IX_PurchaseOrderDetail_ProductID indexes and it still took 13 seconds.
This is just one example; I've tried this with other tables and the same thing happened. Why would the query take the same amount of time with the index as without it? I thought it was supposed to be quicker. Maybe someone can clear this up for me and possibly provide a query where I might see a noticeable difference with and without the index.
Thanks in advance
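For what it's worth, a join on ProductID with no other filter returns essentially every row of both tables, so the engine scans them regardless of the index. A query where the index usually does change the plan is a selective lookup; a sketch (the ProductID value 707 is just an arbitrary example):

```sql
-- With IX_SalesOrderDetail_ProductID in place this can use an index seek;
-- after dropping the index the engine falls back to scanning the table.
SELECT SalesOrderID, OrderQty, UnitPrice
FROM [AdventureWorks2016].[Sales].[SalesOrderDetail]
WHERE ProductID = 707;
```

Comparing the estimated execution plans with and without the index should show the seek turn into a scan.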

Related

How to Performance tune a query that has Between statement for range of dates

I am working on performance tuning all of our slow-running queries. I am new to Oracle but have been using SQL Server for a while. Can someone help me tune this query to make it run faster?
SELECT DISTINCT x.a, x.b
FROM xyz_view x
WHERE x.date_key BETWEEN 20101231 AND 20160430
Appreciate any help or suggestions
First, I'd start by looking at why the DISTINCT is there. In my experience many developers tack on the DISTINCT because they know that they need unique results, but don't actually understand why they aren't already getting them.
Second, a clustered index on the column would be ideal for this specific query because it puts all of the rows right next to each other on disk and the server can just grab them all at once. The problem is, that might not be possible because you already have a clustered index that's good for other uses. In that case, try a non-clustered index on the date column and see what that does.
Keep in mind that indexing has wide-ranging effects, so using a single query to determine indexing isn't a good idea.
I would also add if you are pulling from a VIEW, you should really investigate the design of the view. It typically has a lot of joins that may not be necessary for your query. In addition, if the view is needed, you can look at creating an indexed view which can be very fast.
There is not much more you can do to optimize this query so long as you have established that the DISTINCT is really needed.
You can add a NOLOCK hint to the FROM clause if reading uncommitted data is not an issue.
You can also check whether a time component is being stored along with the date and, if so, whether it is really relevant; if not, set the time to midnight, which makes an index on the column more effective.
The biggest improvement I've seen is dividing the date field in the table into three fields, one for each date part. This can really improve performance.
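The non-clustered index suggested above could be sketched like this, in SQL Server syntax (the base table name xyz_base is an assumption, since the question only shows the view):

```sql
-- Non-clustered index on the range column; INCLUDE covers the selected
-- columns so the range scan needs no lookups back into the base table.
CREATE NONCLUSTERED INDEX IX_xyz_base_date_key
    ON xyz_base (date_key)
    INCLUDE (a, b);
```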

Speed Up Processing of SQL in Access 2007

I've been trying to get this SQL query I am using in an Access 2007 database to execute faster. I've already eliminated any distinct queries involved -- and Access doesn't really give me too much information on where the hang-up is. The query is pulling about 150,000 rows and is taking about 2 minutes to complete.
I'm still learning the syntax for SQL in Access, but I think I have it set up correctly. I'd appreciate any insights or hints on what I might be missing.
SELECT OLS_UNITS_GROSS_ACRES.AGMT_NUM,
MAX(OLS_UNITS_GROSS_ACRES.UNIT_GROSS_ACRES) AS UNIT_GROSS,
SUM(MIS_ACREAGES.ACRE_AMT) AS [RELATED ACRES]
FROM ((OLS_UNITS_GROSS_ACRES
INNER JOIN MIS_XREFERENCED_AGMTS_M ON OLS_UNITS_GROSS_ACRES.ARRG_KEY = MIS_XREFERENCED_AGMTS_M.ACTIVE_ARRG_KEY)
INNER JOIN ALL_AGMTS1 ON MIS_XREFERENCED_AGMTS_M.RELATED_ARRG_KEY = ALL_AGMTS1.ARRG_KEY)
INNER JOIN MIS_ACREAGES ON ALL_AGMTS1.ARRG_KEY = MIS_ACREAGES.ARRG_KEY
WHERE (((ALL_AGMTS1.SUBJ_CODE)="LSE")
AND (((MIS_ACREAGES.ACRE_TYPE_CODE)="CNT")
OR ((MIS_ACREAGES.ACRE_TYPE_CODE)="STN")
OR ((MIS_ACREAGES.ACRE_TYPE_CODE)="DNT")))
GROUP BY OLS_UNITS_GROSS_ACRES.AGMT_NUM;
There are several things you could try. Eliminating any DISTINCT queries is a good start. Another way to speed it up is to create an index on fields that are frequently queried and/or sorted. This will speed up the SELECT query; however, it does slow down any INSERT commands, since the information now has to be inserted into the index as well as the table.
If your database design is stable and its objects will not be renamed, you can safely turn off Name AutoCorrect to improve performance.
Over time, the performance of a database file can degrade because of space that remains allocated to deleted or temporary objects. The Compact and Repair command removes this wasted space and can help a database run faster and more efficiently.
I got most of this from the following link, which had several other ideas, but these were the best ones: http://office.microsoft.com/en-us/access-help/help-access-run-faster-HA010235589.aspx
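For the query above, indexing the frequently queried fields would mean indexing the join keys and filter columns, roughly like this (the index names are made up, and some of these columns may already be indexed as primary keys):

```sql
-- Access DDL: index the join keys and the filtered columns used in the query.
CREATE INDEX idxXrefActiveArrg  ON MIS_XREFERENCED_AGMTS_M (ACTIVE_ARRG_KEY);
CREATE INDEX idxXrefRelatedArrg ON MIS_XREFERENCED_AGMTS_M (RELATED_ARRG_KEY);
CREATE INDEX idxAcreagesArrg    ON MIS_ACREAGES (ARRG_KEY);
CREATE INDEX idxAgmtsSubjCode   ON ALL_AGMTS1 (SUBJ_CODE);
```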

Slow SQL Queries, Order Table by Date?

I have a SQL Server 2008 database that I query regularly and that has over 30 million entries (joy!). Unfortunately this database cannot be drastically changed, because it is still in use for R/D.
When I query this database, it takes FOREVER. By that I mean I haven't been patient enough to wait for results (after 2 minutes I have to cancel to avoid locking the R/D department out). Even if I use a short date range (no more than a few months), it is basically impossible to get any results from it. I am querying with conditions on 4 of the columns and unfortunately have to use an inner join with another table (which I've been told is very costly in terms of query efficiency, but it is unavoidable). This inner-joined table has fewer than 100k entries.
What I was wondering is: is it possible to organize the table so that it is ordered by date by default, to reduce the number of rows it has to search through?
If this is not possible, is there anything I can do to reduce query times? Is there any other useful information that could assist me in coming up with a solution?
I have included a sample of the query that I use:
SELECT DISTINCT N.TestName
FROM [DalsaTE].[dbo].[ResultsUut] U
INNER JOIN [DalsaTE].[dbo].[ResultsNumeric] N
ON N.ModeDescription = 'Mode 8: Low Gain - Green-Blue'
AND N.ResultsUutId = U.ResultsUutId
WHERE U.DeviceName = 'BO-32-3HK60-00-R'
AND U.StartDateTime > '2011-11-25 01:10:10.001'
ORDER BY N.TestName
Any help or suggestions are appreciated!
It sounds like the datetime may be a text-based field, and consequently an index isn't being used?
Could you try the following to see if you have any speed improvement:
select distinct N.TestName
from [DalsaTE].[dbo].[ResultsUut] U
inner join [DalsaTE].[dbo].[ResultsNumeric] N
on N.ModeDescription = 'Mode 8: Low Gain - Green-Blue'
and N.ResultsUutId = U.ResultsUutId
where U.DeviceName = 'BO-32-3HK60-00-R'
and U.StartDateTime > cast('2011-11-25 01:10:10.001' as datetime)
order by N.TestName
It would also be worth trying changing your inner join to a left outer join, as those occasionally perform faster for no obvious reason (at least none that I'm aware of).
You can add an index based on your date column, which should improve your query time. You can either use an ALTER TABLE command, or use the table designer.
Is the sole purpose of the join to provide sorting? If so, a quick thing to try would be to remove this, and see how much of a difference it makes - at least then you'll know where to focus your attention.
Finally, SQL server management studio has some useful tools such as execution plans that can help diagnose performance issues. Good luck!
There are a number of problems which may be causing delays in the execution of your query.
Indexes (except the primary key) do not reorder the data, they merely create an index (think phonebook) which orders a number of values and points back to the primary key.
Without seeing the type of data or the existing indexes, it's difficult, but at the very least, the following ASCENDING indexes might help:
[DalsaTE].[dbo].[ResultsNumeric] ModeDescription and ResultsUutId and TestName
[DalsaTE].[dbo].[ResultsUut] StartDateTime and DeviceName and ResultsUutId
With the indexes above, the sample query you gave can be completed without performing a single lookup on the actual table data.
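As DDL, the suggested indexes might look like this (the index names are invented; the column order follows the lists above):

```sql
CREATE INDEX IX_ResultsNumeric_Mode_Uut_Test
    ON [DalsaTE].[dbo].[ResultsNumeric] (ModeDescription, ResultsUutId, TestName);
CREATE INDEX IX_ResultsUut_Start_Device_Uut
    ON [DalsaTE].[dbo].[ResultsUut] (StartDateTime, DeviceName, ResultsUutId);
```

Because each index contains every column the query touches in its table, the query can be answered from the indexes alone (covering indexes), with no lookups into the table data.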

SQL Server: How can a table scan be so expensive on a tiny table?

I'm looking at an execution plan from a troublesome query.
I can see that 45% of the plan is taken up doing a table scan on a table with seven (7) rows of data.
I am about to put a clustered index to cover the columns in my query on a table with seven rows and it feels...wrong. How can this part of my query take up so much of the plan given the table is so tiny?
I was reading up here and I feel it might just be because of non-contiguous data; there are no indexes at all on the table in question. Overall, though, our database is large-ish (7GB) and busy.
I'd love to know what others think - thanks!
EDIT:
The query is run very frequently and was involved in deadlock (and chosen as the victim). Right now it's taking between 300ms and 500ms to run, but will take longer when the database is busier.
The query:
select l.team1Score, l.team2Score, ls.team1ExternalID, ls.team2ExternalID, et.eventCategoryID, e.eventID, ls.statusCode
from livescoretracking l (nolock)
inner join liveScores ls (nolock) on l.liveScoreID = ls.liveScoreID
inner join db1.dbo.events e on e.gameid = ls.gameid
inner join db1.dbo.eventtype et (nolock) on e.eventTypeID = et.eventTypeID
inner join eventCategoryPayTypeMappings ecb (nolock) on ( et.eventCategoryID = ecb.eventCategoryID and e.payTypeID = ecb.payTypeID and ecb.mainEvent = 1 )
where ls.gameID = 286711 order by l.dateinserted
The problem table is the eventCategoryPayTypeMappings table - thanks!
A percentage cost is meaningless without knowing the total cost in real terms. E.g. if the query takes 1 ms to execute, a 45% cost for a table scan is 0.45 of a millisecond, which is not worth trying to optimise; if the query takes 10 seconds to execute, then the 45% cost is significant and worth optimising.
A table scan on a seven row table is not expensive. Barring query hints, the query engine will use a table scan on such a small table no matter what indexes exist. Can you show us more about the query in question and the problem with the execution plan?
If there are no indexes on the table, the query engine will always have to do a table scan. There's no other way it can process the data.
Many RDBMS platforms will do a table scan on a table that small even if there are indexes. (I'm not sure about SQL Server specifically.)
I would be more concerned about the actual numbers in the query plan.
Deadlocks are usually more indicative of a resource access ordering issue than a problem with query design in particular. I would look at the other participant(s) in the deadlock and take a look at what objects each transaction had locked that were required by the other(s). If you can reorder to ensure consistent access order you may be able to avoid contention issues entirely.
It really depends how long the query takes from start to finish. 45% doesn't mean it's taking a long time if the query only takes, say, 10ms. All it really says is that most of the time is spent doing the table scan, which is understandable.
Having an index may help when the table grows and is probably not a bad idea unless you know this table is not going to grow. However you will find that adding an index to a table with 7 records makes little to no difference to performance.
A table scan on a small table is not a bad thing - If it fits in a single read into the cache the optimizer will calculate that a table scan costs less than reading through an index chain.
I would only recommend a clustered index if you want to help ensure that the contents will 'tend' to be sorted that way (though you will still need an explicit ORDER BY to guarantee it).

Need tips for optimizing SQL Query using a JOIN

The Query I'm writing runs fine when looking at the past few days, once I go over a week it crawls (~20min). I am joining 3 tables together. I was wondering what things I should look for to make this run faster. I don't really know what other information is needed for the post.
EDIT: More info: db is Sybase 10. Query:
SELECT a.id, a.date, a.time, a.signal, a.noise,
b.signal_strength, b.base_id, b.firmware,
a.site, b.active, a.table_key_id
FROM adminuser.station AS a
JOIN adminuser.base AS b
ON a.id = b.base_id
WHERE a.site = 1234 AND a.date >= '2009-03-20'
I also took out the 3rd JOIN and it still runs extremely slow. Should I try another JOIN method?
I don't know Sybase 10 that well, but try running that query for, say, a 10-day period, and then 10 times, once for each day in that period, and compare the times. If the time in the first case is much higher, you've probably hit the database cache limits.
The solution is then to simply run queries for shorter periods in a loop (in the program, not in SQL). This works especially well if table A is partitioned by date.
You can get a lot of information (assuming you're using MSSQL here) by running your query in SQL Server Management Studio with the Include Actual Execution Plan option set (in the Query menu).
This will show you a diagram of the steps that SQLServer performs in order to execute the query - with relative costs against each step.
The next step is to rework the query a little (try doing it a different way) then run the new version and the old version at the same time. You will get two execution plans, with relative costs not only against each step, but against the two versions of the query! So you can tell objectively if you are making progress.
I do this all the time when debugging/optimizing queries.
Make sure you have indexes on the foreign keys.
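For the query above, indexes on the foreign keys would mean something like this (index names are invented; station.site and station.date are included as well since they drive the WHERE clause):

```sql
-- Index the join key on the right-hand table and the filter columns
-- on the left-hand table.
CREATE INDEX idx_base_base_id      ON adminuser.base (base_id);
CREATE INDEX idx_station_site_date ON adminuser.station (site, date);
```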
It sounds more like you have a memory leak or aren't closing database connections in your client code than that there's anything wrong with the query.
[edit]
Never mind: you meant querying over a date range rather than the duration the server has been active. I'll leave this up to help others avoid the same confusion.
Also, it would help if you could post the sql query, even if you need to obfuscate it some first, and it's a good bet to check if there's an index on your date column and the number of records returned by the longer range.
You may want to look into using a PARTITION for the date ranges, if your DB supports it. I've heard this can help significantly.
Grab the book "Professional SQL Server 2005 Performance Tuning"; it's pretty great.
You didn't mention your database. If it's not SQL Server, the specifics of how to get the data might be different, but the advice is fundamentally the same.
Look at indexing, for sure, but the first thing to do is to follow Blorgbeard's advice and scan for execution plans using Management Studio (again, if you are running SQL Server).
What I'm guessing you'll see is that for small date ranges, the optimizer picks a reasonable query plan, but that when the date range is large, it picks something completely different, likely involving either table scans or index scans, and possibly joins that lead to very large temporary recordsets. The execution plan analyzer will reveal all of this.
A scan means that the optimizer thinks that grinding over the whole table or the whole index is cheaper for what you are trying to do than seeking specific values.
What you eventually want to do is get indexes and the syntax of your query set up such that you keep index seeks in the query plan for your query regardless of the date range, or, failing that, that the scans you require are filtered as well as you can manage to minimize temporary recordset size and thereby avoid excessive reads and I/O.
SELECT
a.id, a.date, a.time, a.signal, a.noise,a.site, b.active, a.table_key_id,
b.signal_strength, b.base_id, b.firmware
FROM
( SELECT * FROM adminuser.station
WHERE site = 1234 AND date >= '2009-03-20') AS a
JOIN
adminuser.base AS b
ON
a.id = b.base_id
I kind of rewrote the query so as to first filter the desired rows and then perform the join, rather than perform the join and then filter the result.
Rather than pulling * from the sub-query, you can select just the columns you want, which might help a little in speeding things up.
While this is valid in MySQL, I am not sure of the Sybase syntax, though.
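Selecting only the needed columns in the derived table, as suggested, would look like this (a sketch; untested against Sybase):

```sql
SELECT a.id, a.date, a.time, a.signal, a.noise, a.site,
       b.active, a.table_key_id,
       b.signal_strength, b.base_id, b.firmware
FROM ( SELECT id, date, time, signal, noise, site, table_key_id
       FROM adminuser.station
       WHERE site = 1234 AND date >= '2009-03-20' ) AS a
JOIN adminuser.base AS b
  ON a.id = b.base_id
```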