Why is my index not automatically used? - sql

I have a table
Archive(VarId SMALLINT, Timestamp DATETIME, Value FLOAT)
VarId is not unique. The table contains measurements. I have a clustered index on Timestamp. Now I have the requirement of finding a measurement for a specific VarId before a specific date. So I do:
SELECT TOP(1) *
FROM Archive
WHERE VarId = 135
AND Timestamp < '2012-06-01 14:21:00'
ORDER BY Timestamp DESC;
If there is no such measurement this query searches the whole table. So I introduced another index on (VarId, Timestamp).
My problem is: SQL Server doesn't seem to care about it, the query still takes forever. When I explicitly state 'WITH (INDEX = <id>)' it works as it should. What can I do so SQL Server uses my index automatically?
I'm using SQL Server 2005.

There are different possibilities with this.
I'll try to help you isolate them:
1. It could be that SQL Server is favouring your Clustered Index (very likely it's the Primary Key) over your newly created index. One way to solve this is to have a NonClustered Primary Key and cluster the index on the other two fields (varid and timestamp). That is, if you don't want varid and timestamp to be the PK.
2. Also, looking at the (estimated) execution plan might help.
But I believe #1 only works nicely if those two fields are the most commonly used (queried) index. To find out if this is the case, it would be good to analyse which indexes users are most likely to use (from http://sqlblog.com/blogs/louis_davidson/archive/2007/07/22/sys-dm-db-index-usage-stats.aspx):
select
ObjectName = object_schema_name(indexes.object_id) + '.' + object_name(indexes.object_id),
indexes.name,
case when is_unique = 1 then 'UNIQUE ' else '' end + indexes.type_desc,
ddius.user_seeks,
ddius.user_scans,
ddius.user_lookups,
ddius.user_updates
from
sys.indexes
left join sys.dm_db_index_usage_stats ddius on (
indexes.object_id = ddius.object_id
and indexes.index_id = ddius.index_id
and ddius.database_id = db_id()
)
WHERE
object_schema_name(indexes.object_id) != 'sys' -- exclude sys objects
AND object_name(indexes.object_id) LIKE 'Archive'
order by
ddius.user_seeks + ddius.user_scans + ddius.user_lookups
desc
Good luck

My guess is that your index design is the issue. You have a CLUSTERED index on a DATETIME field and I suspect that it is not unique data, much like VarId, and hence you did not declare it as UNIQUE. Because it is not unique there is a hidden, 4-byte "uniqueifier" field (so that each row can be physically unique regardless of you not giving it unique data) and the rows with the same DATETIME value are essentially random within the group of same DATETIME values (so even narrowing down a time still requires scanning through that grouping). You also have a NONCLUSTERED index on VarId, Timestamp. NONCLUSTERED indexes include the key of the CLUSTERED index, so your NONCLUSTERED index would carry Timestamp (plus the uniqueifier) even if you had not listed it. So you could have left off the Timestamp column in the NONCLUSTERED index and it would have been much the same to the optimizer.
So your physical layout is based on a date while the VarId values are spread across those dates. Hence VarId = 135 can be spread very far apart in terms of data pages. Yes, your non-clustered index does group them together, but the optimizer is probably looking at the fact that you are wanting all fields (the "SELECT *" part) and the Timestamp < '2012-06-01 14:21:00' condition in addition to that seems to get most of what you need as opposed to finding a few rows and doing a bookmark lookup to get the "Value" field to fulfill the "SELECT *". Quite possibly if you do just "SELECT TOP(1) VarId, Timestamp" it would more likely use your NONCLUSTERED index without needing the "INDEX =" hint.
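As a rough sketch of how the bookmark lookup could be avoided without giving up "SELECT *": a nonclustered index that also carries Value covers every column of the table. The index name below is made up for illustration, and whether the optimizer picks it without a hint still depends on its statistics and estimates.

```sql
-- Hypothetical index name. INCLUDE is available from SQL Server 2005 onward.
-- Key columns match the WHERE / ORDER BY of the query; Value is stored only
-- at the leaf level so that no lookup into the clustered index is needed.
CREATE NONCLUSTERED INDEX IX_Archive_VarId_Timestamp_Covering
ON Archive (VarId ASC, [Timestamp] DESC)
INCLUDE (Value);
```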
Another issue affecting performance overall could be that the ORDER BY is requesting the Timestamp in DESC order and if you have the CLUSTERED index in ASC order then it would be the opposite direction of what you are looking for (at least in this query). Of course, in that case then it might be ok to have Timestamp in the NONCLUSTERED index if it was in DESC order.
My advice is to rethink the CLUSTERED index. Judging on just this query alone (other queries/uses might alter the recommendation), try dropping the NONCLUSTERED index and recreating the CLUSTERED index with the Timestamp field first, in DESC order, and also with the VarId so it can be declared UNIQUE. So:
CREATE UNIQUE CLUSTERED INDEX [UIX_Archive_Timestamp_VarId]
ON Archive (Timestamp DESC, VarId ASC)
This, of course, assumes that the Timestamp and VarId combination is unique. If not, then still try this without the UNIQUE keyword.
Update:
To pull all of this info and advice together:
When designing indexes you need to consider the distribution of the data and the use-cases for interacting with it. More often than not there is A LOT to consider and several different approaches will appear good in theory. You need to try a few approaches, profile/test them, and see which works best in reality. There is no "always do this" approach without knowing all aspects of what you are doing and what else is going on and what else is planned to use and/or modify this table which I suspect has not been presented in the original question.
So to start the journey, you are ordering records by date and are looking at ranges of dates AND dates naturally occur in order so putting Timestamp first benefits more of what you are doing and has less fragmentation, especially if defined as DESC in the CREATE. Having an NC index on just VarId at that point will then be fine, even if spread out, for looking at a set of rows for a particular VarId. So maybe start there (change order of direction of CLUSTERED index and remove Timestamp from the NC index). See how those changes compare to the existing structure. Then try moving the VarId field into the CLUSTERED index and remove the NC index. You say that the combination is also not unique but does increase the predictability of the ordering of the rows. See how that works. Does this table ever get updated? If not and if the Value field along with Timestamp and VarId would be unique, then try adding that to the CLUSTERED index and be sure to create with the UNIQUE keyword. See how these different approaches work by looking at the Actual Execution Plan and use SET STATISTICS IO ON before running the query and see how the logical reads between the different approaches compare.
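A minimal way to run that comparison, using the query from the question. Run it once per index layout and compare the "logical reads" figure printed in the Messages output:

```sql
-- Report page reads for each statement in this session
SET STATISTICS IO ON;

SELECT TOP(1) *
FROM Archive
WHERE VarId = 135
AND [Timestamp] < '2012-06-01 14:21:00'
ORDER BY [Timestamp] DESC;

SET STATISTICS IO OFF;
```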
Hope this helps :)

You might need to analyze your table to collect statistics, so the optimizer can determine whether to use the index or not.
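In SQL Server the equivalent of "analyzing" a table is UPDATE STATISTICS. A minimal example for the table in the question; FULLSCAN reads every row, so on a large table a sampled update may be preferable:

```sql
-- Rebuild all statistics on the table from a full scan of its rows
UPDATE STATISTICS Archive WITH FULLSCAN;
```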

Related

MS SQL: Performance for querying ID descending

This question relates to a table in Microsoft SQL Server which is usually queried with ORDER BY Id DESC.
Would there be a performance benefit from setting the primary key to PRIMARY KEY CLUSTERED (Id DESC)? Or would there be a need for an index? Or is it as fast as it gets without any of it?
Table:
CREATE TABLE [dbo].[Items] (
[Id] INT IDENTITY (1, 1) NOT NULL,
[Category] INT NOT NULL,
[Name] NVARCHAR(255) NULL,
CONSTRAINT [PK_Items] PRIMARY KEY CLUSTERED ([Id] ASC)
)
Query:
SELECT TOP 1 * FROM [dbo].[Items]
WHERE Category = 123
ORDER BY [Id] DESC
Would there be a performance benefit from setting the primary key to PRIMARY KEY
CLUSTERED (Id DESC)?
Given what you show: IT DEPENDS.
The filter is on Category = 123. To find all entries of Category 123, because there is NO INDEX defined, the server has to do a table scan. Unless you have a VERY large result set, and/or some comically badly configured tempdb and very low memory (because disk is only used for the sort when memory runs out), the sorting of the result will be irrelevant compared to the table scan.
You are literally following the wrong trail. You are WAY more likely to speed up the query by adding a non-unique index on Category so that the query can prefilter the data fast based on your query condition.
If you analyzed the query plan for this query (which you should - technically we should not even ANSWER this question without you showing SOME effort, and a look at the query plan is like the FIRST thing you do) you would very likely see that the time is spent on the scan, NOT the result sort.
Creating an index in asc or desc order does not make a big difference in “ORDER BY” when there is only one column, but when there is a need to sort data in two different directions one column in ascending order and the other column in descending order the way the index is created does make a big difference.
See this article, which works through many examples:
https://www.mssqltips.com/sqlservertip/1337/building-sql-server-indexes-in-ascending-vs-descending-order/
In your scenario I advise you to create an index on the Category column without including “Id”, because the clustered index key is always included in a non-clustered index.
There is no difference according to the following
I'd suggest defining an index on (category, id desc).
It will give you best performance for your query.
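A sketch of that index against the table from the question (the index name is an assumption). With it, the TOP 1 query can seek to Category = 123 and read the first entry, which is the highest Id for that category; the remaining columns still come from a key lookup into the clustered index.

```sql
-- Hypothetical index name
CREATE NONCLUSTERED INDEX IX_Items_Category_Id
ON [dbo].[Items] ([Category] ASC, [Id] DESC);

SELECT TOP 1 * FROM [dbo].[Items]
WHERE [Category] = 123
ORDER BY [Id] DESC;
```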
As others have indicated, an index on Category (assuming you don't have one) is the biggest performance boost possible here.
But as for your actual question: for a query with a single ORDER BY column like yours, it does not matter whether the index is ordered desc or asc as far as performance goes. SQL Server can swap those easily (starting at the beginning or the end of the data structure).
Where it becomes an issue for performance is when you:
Have more than one ORDER BY column
Your index has more than one column
Your order by is opposing the order on the index.
So, say your Primary Key had ID asc and Category asc, and then you order by ID asc and Category desc. Then SQL Server can't use the order of the index to satisfy the ORDER BY and has to sort the rows instead.
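To make the mixed-direction case concrete, here is a small sketch on the Items table from the question (the index name is an assumption). An index can be read forwards or backwards, so it matches an ORDER BY whose directions are all the same as, or all the opposite of, the index definition:

```sql
-- Hypothetical index with mixed directions
CREATE NONCLUSTERED INDEX IX_Items_Category_Asc_Id_Desc
ON [dbo].[Items] ([Category] ASC, [Id] DESC);

-- Matches the index order (forward read), so no sort is required
SELECT [Category], [Id] FROM [dbo].[Items]
ORDER BY [Category] ASC, [Id] DESC;

-- Exact reverse of the index order (backward read), also no sort required
SELECT [Category], [Id] FROM [dbo].[Items]
ORDER BY [Category] DESC, [Id] ASC;

-- Neither the index order nor its reverse: this index cannot supply the ordering
SELECT [Category], [Id] FROM [dbo].[Items]
ORDER BY [Category] ASC, [Id] ASC;
```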
There are a few caveats and gotchas. After searching a bit, this answer seems to have them listed:
SQL Server indexes - ascending or descending, what difference does it make?

SQL index for date range query

For a few days, I've been struggling with improving the performance of my database and there are some issues that I'm still kind of confused about regarding indexing in a SQL Server database.
I'll try to be as informative as I can.
My database currently contains about 100k rows and will keep growing, therefore I'm trying to find a way to make it work faster.
I'm also writing to this table, so if your suggestion will drastically slow down writes please let me know.
Overall goal is to select all rows with a specific name that are in a date range.
That will usually be to select over 3,000 rows out of a lot lol ...
Table schema:
CREATE TABLE [dbo].[reports]
(
[id] [int] IDENTITY(1,1) NOT NULL,
[IsDuplicate] [bit] NOT NULL,
[IsNotValid] [bit] NOT NULL,
[Time] [datetime] NOT NULL,
[ShortDate] [date] NOT NULL,
[Source] [nvarchar](350) NULL,
[Email] [nvarchar](350) NULL,
CONSTRAINT [PK_dbo.reports]
PRIMARY KEY CLUSTERED ([id] ASC)
) ON [PRIMARY]
This is the SQL query I'm using:
SELECT *
FROM [db].[dbo].[reports]
WHERE Source = 'name1'
AND ShortDate BETWEEN '2017-10-13' AND '2017-10-15'
As I understood, my best approach to improve efficiency without hurting the writing time too much would be to create a nonclustered index on Source and ShortDate.
Which I did like such, index schema:
CREATE NONCLUSTERED INDEX [Source&Time]
ON [dbo].[reports]([Source] ASC, [ShortDate] ASC)
Now we are getting to the tricky part which got me completely lost, the index above sometimes works, sometime half works and sometime doesn't work at all....
(not sure if it matters but currently 90% of the database rows has the same Source, although this won't stay like that for long)
With the query below, the index isn't used at all, I'm using SQL Server 2014 and in the Execution Plan it says it only uses the clustered index scan:
SELECT *
FROM [db].[dbo].[reports]
WHERE Source = 'name1'
AND ShortDate BETWEEN '2017-10-10' AND '2017-10-15'
With this query, the index isn't used at all, although I'm getting a suggestion from SQL Server to create an index with the date first and source second... I read that the index should be made by the order the query is? Also it says to include all the columns I'm selecting, is that a must?... again I read that I should include in the index only the columns I'm searching.
SELECT *
FROM [db].[dbo].[reports]
WHERE Source = 'name1'
AND ShortDate = '2017-10-13'
SQL Server index suggestion -
/* The Query Processor estimates that implementing the following
index could improve the query cost by 86.2728%. */
/*
USE [db]
GO
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[reports] ([ShortDate], [Source])
INCLUDE ([id], [IsDuplicate], [IsNotValid], [Time], [Email])
GO
*/
Now I tried creating the index SQL Server suggested and it works, seems like it uses 100% of the nonclustered index for both of the queries above.
I tried to use this index without the included columns and it doesn't work... seems like I must include in the index all the columns I'm selecting?
BTW it also works when using the index I made if I include all the columns.
To summarize: seems like the order of the index didn't matter, as it worked both when creating Source + ShortDate and ShortDate + Source
But for some reason it's a must to include all the columns... (which will drastically affect the writing to this table?)
Thanks a lot for reading, My goal is to understand why this stuff happens and what I should do otherwise (not just the solution as I'll need to apply it on other projects as well ).
Cheers :)
Indexing in SQL Server is part know-how from long experience (and many hours of frustration), and part black magic. Don't beat yourself up over that too much - that's what a place like SO is ideal for - lots of brains, lots of experience from many hours of optimizing, that you can tap into.
I read that the index should be made by the order the query is?
If you read this - it is absolutely NOT TRUE - the order of the columns is relevant - but in a different way: a compound index (made up from multiple columns) will only ever be considered if you specify the n left-most columns in the index definition in your query.
Classic example: a phone book with an index on (city, lastname, firstname). Such an index might be used:
in a query that specifies all three columns in its WHERE clause
in a query that uses city and lastname (find all "Miller" in "Detroit")
or in a query that only filters by city
but it can NEVER EVER be used if you want to search only for firstname ..... that's the trick about compound indexes you need to be aware of. But if you always use all columns from an index, their ordering is typically not really relevant - the query optimizer will handle this for you.
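The phone book analogy can be sketched in T-SQL. The table, index name and values below are hypothetical and only illustrate the left-most-columns rule:

```sql
-- Hypothetical phone book table
CREATE TABLE dbo.PhoneBook (
    City      NVARCHAR(100) NOT NULL,
    LastName  NVARCHAR(100) NOT NULL,
    FirstName NVARCHAR(100) NOT NULL,
    Phone     VARCHAR(30)   NOT NULL
);

CREATE NONCLUSTERED INDEX IX_PhoneBook_City_LastName_FirstName
ON dbo.PhoneBook (City, LastName, FirstName);

-- Uses the two left-most key columns: an index seek is possible
SELECT Phone FROM dbo.PhoneBook
WHERE City = N'Detroit' AND LastName = N'Miller';

-- Uses only the left-most key column: an index seek is still possible
SELECT Phone FROM dbo.PhoneBook
WHERE City = N'Detroit';

-- Skips City and LastName: no seek on this index is possible,
-- at best the whole index (or table) has to be scanned
SELECT Phone FROM dbo.PhoneBook
WHERE FirstName = N'Anna';
```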
As for the included columns - those are stored only in the leaf level of the nonclustered index - they are NOT part of the search structure of the index, and you cannot specify filter values for those included columns in your WHERE clause.
The main benefit of these included columns is this: if you search in a nonclustered index, and in the end, you actually find the value you're looking for - what do you have available at that point? The nonclustered index will store the columns in the non-clustered index definition (ShortDate and Source), and it will store the clustering key (if you have one - and you should!) - but nothing else.
So in this case, once a match is found, and your query wants everything from that table, SQL Server has to do what is called a Key lookup (often also referred to as a bookmark lookup) in which it takes the clustered key and then does a Seek operation against the clustered index, to get to the actual data page that contains all the values you're looking for.
If you have included columns in your index, then the leaf level page of your non-clustered index contains
the columns as defined in the nonclustered index
the clustering key column(s)
all those additional columns as defined in your INCLUDE statement
If those columns "cover" your query, e.g. provide all the values that your query needs, then SQL Server is done once it finds the value you searched for in the nonclustered index - it can take all the values it needs from that leaf-level page of the nonclustered index, and it does NOT need to do another (expensive) key lookup into the clustering index to get the actual values.
Because of this, trying to always explicitly specify only those columns you really need in your SELECT can be beneficial - in this case, you might be able to create an efficient covering index that provides all the values for your SELECT - always using SELECT * makes that really hard or next to impossible.....
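As an illustration for the reports table from the question, assuming the application only needs id, Time and Email (that column list is an assumption, as is the index name): the index below covers the query, so no key lookup should be needed. The id column does not have to be listed because it is the clustering key.

```sql
-- Hypothetical covering index: key columns for the search,
-- INCLUDE columns only to satisfy the SELECT list
CREATE NONCLUSTERED INDEX IX_reports_Source_ShortDate_Covering
ON [dbo].[reports] ([Source] ASC, [ShortDate] ASC)
INCLUDE ([Time], [Email]);

SELECT [id], [Time], [Email]
FROM [dbo].[reports]
WHERE [Source] = N'name1'
AND [ShortDate] BETWEEN '2017-10-13' AND '2017-10-15';
```

Every included column makes the index wider and adds some cost to writes, which is the trade-off the question asks about.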
In general, you want the index to be from most selective (i.e. filtering out the most possible records) to least selective; if a column has low cardinality, the query optimizer may ignore it.
That makes intuitive sense - if you have a phone book, and you're looking for people called "smith", with the initial "A", you want to start with searching for "smith" first, and then the "A"s, rather than all people whose initial is "A" and then filter out those called "Smith". After all, the odds are that one in 26 people have the initial "A".
So, in your example, I guess you have a wide range of values in short date - so that's the first column the query optimizer is trying to filter out. You say you have few different values in "source", so the query optimizer may decide to ignore it; in that case, the second column in that index is no use either.
The order of the conditions in the WHERE clause is irrelevant - you can swap them round and achieve the exact same results, so the query optimizer ignores their order.
EDIT:
So, yes, make the index. Imagine you have a pile of cards to sort - in your first run, you want to remove as many cards as possible. Assuming it's all evenly spread - if you have 1000 separate short_dates over a million rows, that means you end up with 1000 items if your first run starts on short_date; if you sort by source, you have 100000 rows.
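Before deciding which column goes first, it may help to measure how many distinct values each candidate column actually has. A simple check against the table from the question:

```sql
-- More distinct values relative to TotalRows means a more selective column
SELECT COUNT(*) AS TotalRows,
       COUNT(DISTINCT [ShortDate]) AS DistinctShortDates,
       COUNT(DISTINCT [Source]) AS DistinctSources
FROM [dbo].[reports];
```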
The included columns of an index are for the columns you are selecting.
Because you do select * (which isn't good practice), the index is unlikely to be used, because SQL Server would have to look up each matching row in the table to get the values for the remaining columns.
For your scenario, I would drop the default clustered index (if there is one) and create a new clustered index with the following statement:
USE [db]
GO
CREATE CLUSTERED INDEX CIX_reports
ON [dbo].[reports] ([ShortDate],[Source])
GO

Getting RID Lookup instead of Table Scan?

SQL Fiddle: http://sqlfiddle.com/#!3/23cf8
In this query, when I have an In clause on an Id, and then also select other columns, the In is evaluated first, and then the Details column and other columns are pulled in via a RID Lookup:
--In production and in SQL Fiddle, Details is grabbed via a RID Lookup after the In clause is evaluated
SELECT [Id]
,[ForeignId]
,Details
--Generate a numbering(starting at 1)
--,Row_Number() Over(Partition By ForeignId Order By Id Desc) as ContactNumber --Desc because older posts should be numbered last
FROM SupportContacts
Where foreignId In (1,2,3,5)
With this query, the Details are being pulled in via a Table Scan.
With NumberedContacts AS
(
SELECT [Id]
,[ForeignId]
--Generate a numbering(starting at 1)
,Row_Number() Over(Partition By ForeignId Order By Id Desc) as ContactNumber --Desc because older posts should be numbered last
FROM SupportContacts
Where ForeignId In (1,2,3,5)
)
Select nc.[Id]
,nc.[ForeignId]
,sc.[Details]
From NumberedContacts nc
Inner Join SupportContacts sc on nc.Id = sc.Id
Where nc.ContactNumber <= 2 --Only grab the last 2 contacts per ForeignId
;
In SqlFiddle, the second query actually gets a RID Lookup, whereas in production with a million records it produces a Table Scan (the IN clause eliminates 99% of the rows)
Otherwise the query plan shown in SQL Fiddle is identical, the only difference being that for the second query the RID Lookup in SQL Fiddle, is a Table Scan in production :(
I would like to understand possibilities that would cause this behavior? What kinds of things would you look at to help determine the cause of it using a table scan here?
How can I influence it to use a RID Lookup there?
From looking at operation costs in the actual execution plan, I believe I can get the second query very close in performance to the first query if I can get it to use a RID Lookup. If I don't select the Detail column, then the performance of both queries is very close in production. It is only after adding other columns like Detail that performance degrades significantly for the second query. When I put it in SQL Fiddle and saw that the execution plan used an RID Lookup, I was surprised but slightly confused...
It doesn't have a clustered index because in testing with different clustered indexes, there was slightly worse performance for this and other queries. That was before I began adding other columns like Details though, and I can experiment with that more, but would like to have an understanding of what is going on now before I start shooting in the dark with random indexes.
What if you would change your main index to include the Details column?
If you use:
CREATE NONCLUSTERED INDEX [IX_SupportContacts_ForeignIdAsc_IdDesc]
ON SupportContacts ([ForeignId] ASC, [Id] DESC)
INCLUDE (Details);
then neither a RID lookup nor a table scan would be needed, since your query could be satisfied from just the index itself....
The differences in the query plans will be dependent on the types of indexes that exist and the statistics of the data for those tables in the different environments.
The optimiser uses the statistics (histograms of data frequency, mostly) and the available indexes to decide which execution plan is going to be the quickest.
So, for example, you have noticed that the performance degrades when the 'Details' column is included. This is an almost sure sign that either the 'Details' column is not part of an index, or if it is part of an index, the data in that column is mostly unique such that the index accesses would be equivalent (or almost equivalent) to a table scan.
Often when this situation arises, the optimiser will choose a table scan over the index access, as it can take advantage of things like block reads to access the table records faster than perhaps a fragmented read of an index.
To influence the path that will be chosen by the optimiser, you would need to look at possible indexes that could be added/modified to make an index access more efficient, but this should be done with care as it can adversely affect other queries as well as possibly degrading insert performance.
The other important activity you can do to help the optimiser is to make sure the table statistics are kept up to date and refreshed at a frequency that is appropriate to the rate of change of the frequency distribution in the table data.
If it's true that 99% of the rows would be omitted if it performed the query using the relevant index + RID then the likeliest problem in your production environment is that your statistics are out of date and the optimiser doesn't realise that ForeignID in (1,2,3,5) would limit the result set to 1% of the total data.
Here's a good link for discovering more about statistics from Pinal Dave: http://blog.sqlauthority.com/2010/01/25/sql-server-find-statistics-update-date-update-statistics/
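A minimal check along those lines, using the table name from the question: list each statistics object with the date it was last updated, then refresh them if they look stale.

```sql
-- When was each statistics object on the table last updated?
SELECT s.name AS StatisticsName,
       STATS_DATE(s.object_id, s.stats_id) AS LastUpdated
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID('SupportContacts');

-- Refresh all statistics on the table
UPDATE STATISTICS SupportContacts;
```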
As for forcing the optimiser to follow the correct path WITHOUT updating the statistics, you could use a table hint - if you know the index that your plan should be using which contains the ID and ForeignID columns then stick that in your query as a hint and force SQL optimiser to use the index:
http://msdn.microsoft.com/en-us/library/ms187373.aspx
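For reference, an index hint looks roughly like this. The index name is an assumption (it reuses the name proposed in an earlier answer); substitute whichever index on ForeignId actually exists in production.

```sql
-- Force a specific index; the name here is assumed, not taken from production
SELECT [Id], [ForeignId], [Details]
FROM SupportContacts WITH (INDEX(IX_SupportContacts_ForeignIdAsc_IdDesc))
WHERE ForeignId IN (1, 2, 3, 5);
```

A hint overrides the optimiser's own costing, so it is usually better kept as a diagnostic tool or a last resort than as a permanent fix.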
FYI, if you want the best performance from your second query, use this index and avoid the headache you're experiencing altogether:
create index ix1 on SupportContacts(ForeignID, Id DESC) include (Details);

Why am I getting a Clustered Index Scan when the column is indexed?

So, we have a table, InventoryListItems, that has several columns. Because we're going to be looking for rows at times based on a particular column (g_list_id, a foreign key), we have that foreign key column placed into a non-clustered index we'll call MYINDEX.
So when I search for data like this:
-- fake data for example
DECLARE @ListId uniqueidentifier
SELECT @ListId = '7BCD0E9F-28D9-4F40-BD67-803005179B04'
SELECT *
FROM [dbo].[InventoryListItems]
WHERE [g_list_id] = @ListId
I expected that it would use the MYINDEX index to find just the needed rows, and then look up the information in those rows. So not as good as just finding everything we need in the index itself, but still a big win over doing a full scan of the table.
But instead it seems that I'm still getting a clustered index scan. I can't figure out why that would happen.
If I do something like SELECTing only the values in the included columns of the index, it does what I would expect, an index seek, and just pulls everything from the index.
But if I SELECT *, why does it just bail on the index and do a scan when it seems like it would still benefit greatly from using it because it's referenced in the WHERE clause?
Since you're doing a SELECT * and thus you retrieve all columns, SQL Server's query optimizer may have decided it's easier and more efficient to just do a clustered index scan - since it needs to go to the clustered index leaf level to get all the columns anyway (and doing a seek first, and then a key lookup to actually get the whole data page, is quite an expensive operation - scan might just be more efficient in this setup).
I'm almost sure if you try
SELECT g_list_id
FROM [dbo].[InventoryListItems]
WHERE [g_list_id] = @ListId
then there will be an index seek instead (since you're only retrieving a single column - not everything).
That's one of the reasons why I would recommend to be extra careful when using SELECT * .... - try to avoid it if ever possible.
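One way to check whether the optimizer's choice is reasonable is to run the query twice, once as written and once forced onto MYINDEX, and compare the logical reads each reports. This is only a diagnostic sketch, not a suggestion to keep the hint:

```sql
DECLARE @ListId uniqueidentifier;
SET @ListId = '7BCD0E9F-28D9-4F40-BD67-803005179B04';

SET STATISTICS IO ON;

-- Plan chosen by the optimizer (the clustered index scan from the question)
SELECT *
FROM [dbo].[InventoryListItems]
WHERE [g_list_id] = @ListId;

-- Same query forced to seek MYINDEX and do a key lookup per matching row
SELECT *
FROM [dbo].[InventoryListItems] WITH (INDEX(MYINDEX))
WHERE [g_list_id] = @ListId;

SET STATISTICS IO OFF;
```

If the forced version reads far fewer pages, outdated statistics or a poor row estimate are likely suspects.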

Smart choice for primary key and clustered index on a table in SQL 2005 to boost performance of selecting single record or multiple records

EDIT: I have added "Slug" column to address performance issues on specific record selection.
I have following columns in my table.
Id Int - Primary key (identity, clustered by default)
Slug varchar(100)
...
EntryDate DateTime
Most of the time, I'm ordering the select statement by EntryDate like below.
Select T.Id, T.Slug, ..., T.EntryDate
From (
Select Id, Slug, ..., EntryDate,
Row_Number() Over (Order By EntryDate Desc, Id Desc) AS RowNum
From TableName
Where ...
) As T
Where T.RowNum Between ... And ...
I'm ordering it by EntryDate and Id in case there are duplicate EntryDates.
When I'm selecting A record, I do the following.
Select Id, Slug, ..., EntryDate
From TableName
Where Slug = @slug And Year(EntryDate) = @entryYear
And Month(EntryDate) = @entryMonth
I have a unique key of Slug & EntryDate.
What would be a smart choice of keys and indexes in my situation? I'm facing performance issues probably because I'm ordering by a column that is not clustered indexed.
Should I have Id set as non-clustered primary key and EntryDate as clustered index?
I appreciate all your help. Thanks.
EDIT:
I haven't tried adding non-clustered index on the EntryDate. Data inserted from back-end, so performance for insert isn't a big deal for me. Also, EntryDate is not always the date when it is inserted. It can be a past date. Back-end user picks the date.
Based on the current table layout you want some indexes like this.
CREATE INDEX IX_YourTable_1 ON dbo.YourTable
(EntryDate, Id)
INCLUDE (Slug)
WITH (FILLFACTOR=90)
CREATE INDEX IX_YourTable_2 ON dbo.YourTable
(EntryDate, Slug)
INCLUDE (Id)
WITH (FILLFACTOR=80)
Add any other columns you are returning to the INCLUDE line.
Change your second query to something like this.
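Select Id, Slug, ..., EntryDate
From TableName
Where Slug = @slug
-- Half-open range covering the requested month; EntryDate itself stays unwrapped so an index seek is possible.
-- DATEADD(month, n, 0) counts n months from 1900-01-01 and also works on SQL Server 2005 (no DATE type needed).
AND EntryDate >= DATEADD(month, (@entryYear - 1900) * 12 + @entryMonth - 1, 0)
AND EntryDate < DATEADD(month, (@entryYear - 1900) * 12 + @entryMonth, 0)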
The way your second query is currently written, the index will never be used for a seek, because EntryDate is wrapped in Year() and Month(). If you can move the Slug column to a related table it will increase your performance and decrease your storage requirements.
Have you tried simply adding a non-clustered index on the entrydate to see what kind of performance gain you get?
Also, how often is new data added? and will new data that is added always be >= the last EntryDate?
You want to keep ID as a clustered index, as you will most likely join to the table off your id, and not entry date.
A simple non clustered index with just the date field would be fine to speed things up.
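Such an index could look like the sketch below (the index name is an assumption). Because Id is the clustering key, it is carried in the nonclustered index automatically and does not need to be listed.

```sql
-- Hypothetical index name; DESC matches the "Order By EntryDate Desc" in the paging query
CREATE NONCLUSTERED INDEX IX_TableName_EntryDate
ON TableName (EntryDate DESC);
```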
Clustering is a bit like "index paging", the index is "chunked" instead of simply being a long list. This is helpful when you've got a lot of data. The DB can search within cluster ranges, then find the individual record. It makes the index smaller, therefore faster to search, but less specific. Once it finds the correct spot in the cluster it then needs to search within the cluster.
It's faster with a lot of data, but slower with smaller data sets.
If you're not searching a lot using the primary key, then cluster the date and leave the primary key non-clustered. It really depends on how complex your queries are with joining other tables.
A clustered index will only make any difference at all if you are returning a bunch of records and some of the fields you return are not part of the index. Otherwise there's no benefit.
You need first to find out what the query plan tells you about why your current queries are slow. Without that, it's mostly idle speculation (which is usually counterproductive when optimizing queries.)
I wouldn't try anything (suggested by me or anyone else) without having a solid queryplan to compare with, to at least know if you're doing good or harm.
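One way to capture a plan for comparison is SET SHOWPLAN_XML, which returns the estimated plan without executing the query. The query and slug value below are simplified placeholders rather than the exact statements from the question, and GO is the batch separator used by SSMS and sqlcmd:

```sql
-- SET SHOWPLAN_XML must be the only statement in its batch
SET SHOWPLAN_XML ON;
GO

-- Placeholder query; returns the estimated plan as XML instead of rows
SELECT Id, Slug, EntryDate
FROM TableName
WHERE Slug = 'example-slug';
GO

SET SHOWPLAN_XML OFF;
GO
```

Saving the plan before and after each index change gives a baseline to compare against.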