Why Sql Indexed View always use Clustered Index - sql-server-2005

I need some pointer on how to debug the following problem.
Environment: SQL Server 2005 Enterprise.
I have an indexed view with contains clustered index and multiple non-unique, non-clustered index. However when I execute the query, SQL server always perform Clustered index scan instead of index seek on my key.
Here is a simplify version.
CREATE VIEW MyIndexedView WITH SCHEMABINDING
SELECT a.Col1, b.Col2, c.Col3, d.Col4
FROM a JOIN b on a.id = b.id
JOIN c on a.id = c.id
JION d on c.id = d.id
There is a clustered index on Col1, and non-unique, non-clustered on Col2, Col3.
When I run the following query
SELECT a.Col1, b.Col2, c.Col3 FROM MyIndexedView WITH(NOEXPAND) WHERE b.Col2='blah'
and look at execution plan, I see SQL server run Clustered index scan on a.Col1 instead of perform index seek on Col2.
I tried to recreate the view and index.
Updated:
I did some additional testing and running these two queries side by side in Query Analyzer.
a) SELECT a.Col1, b.Col2, c.Col3
FROM MyIndexedView WITH(NOEXPAND) WHERE b.Col2='blah'
b) SELECT a.Col1, b.Col2, c.Col3
FROM MyIndexedView WHERE b.Col2 = 'blah'
Query 'a' will take 95% of the time and use Cluster Indexed scan. Query 'b' will only take 5% of the time and use Index Seek on col2. I try to swap the order of queries (run b first and a later) yield the same percentage.
This little experiment confirm that if sql use index seek it will be faster then cluster index scan.
Second I though if I don't include "WITH(NOEXPAND)" then SQL server will not use index on Indexed view. (Maybe I should start another question on the exact step to create indexed view).

I reproduced your sample and came up with the expected results with the index seek on Col2. The only way I was able to get it to do the clustered index scan was if I disabled the index. So first try rebuilding the index on Col2 to make sure it is actually enabled (or check the "Use Index" checkbox in index properties - options).
Here are the scripts I used to create the tables, view & indexes
CREATE TABLE [dbo].[a](
[id] [int] IDENTITY(1,1) NOT NULL,
[Col1] [varchar](100) NOT NULL,
CONSTRAINT [PK_a] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[b](
[id] [int] IDENTITY(1,1) NOT NULL,
[Col2] [varchar](100) NOT NULL,
CONSTRAINT [PK_b] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[c](
[id] [int] IDENTITY(1,1) NOT NULL,
[Col3] [varchar](100) NOT NULL,
CONSTRAINT [PK_c] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[d](
[id] [int] IDENTITY(1,1) NOT NULL,
[Col4] [varchar](100) NOT NULL,
CONSTRAINT [PK_d] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE VIEW [dbo].[MyIndexedView] WITH SCHEMABINDING
AS
SELECT a.Col1, b.Col2, c.Col3, d.Col4
FROM dbo.a JOIN dbo.b on a.id = b.id
JOIN dbo.c on a.id = c.id
JOIN dbo.d on c.id = d.id
GO
/****** Object: Index [IX] Script Date: 11/13/2009 21:50:01 ******/
CREATE UNIQUE CLUSTERED INDEX [IX] ON [dbo].[MyIndexedView]
(
[Col1] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
/****** Object: Index [IX2] Script Date: 11/13/2009 21:50:39 ******/
CREATE NONCLUSTERED INDEX [IX2] ON [dbo].[MyIndexedView]
(
[Col2] ASC,
[Col3] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
And I populated the tables like this:
declare #x int
SET #x = 0
while #x < 10
begin
INSERT INTO a (Col1 ) VALUES (newid())
INSERT INTO b (Col2 ) VALUES (newid())
INSERT INTO c (Col3 ) VALUES (newid())
INSERT INTO d (Col4 ) VALUES (newid())
SET #x=#x+1
end
Executing your query
SELECT Col1, Col2, Col3 FROM MyIndexedView WITH(NOEXPAND) WHERE Col2='blah'
shows an index seek on IX2
but if I disable that index
ALTER INDEX [IX2] ON [dbo].[MyIndexedView] DISABLE
and rerun, I see the clustered index scan on MyIndexedView.IX

How many records are there in your view?
If the result of the join is small then it is most cost efficient to scan the clustered index than seek another.

Related

WITH (UPDLOCK, HOLDLOCK) and NON-unique index - does it lock a whole table?

I have a table below with NON-unique index on the regionId column:
CREATE TABLE [dbo].[Localizations]
(
[id] [int] IDENTITY(1,1) NOT NULL,
[name] [nvarchar](50) NOT NULL,
[regionId] [int] NOT NULL,
CONSTRAINT [PK_Localizations]
PRIMARY KEY CLUSTERED ([id] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [RegionId_index]
ON [dbo].[Localizations] ([regionId] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
I am executing the following query:
BEGIN TRAN;
IF NOT EXISTS(SELECT 1
FROM [dbo].[Localizations] WITH (UPDLOCK, HOLDLOCK)
WHERE regionId = 1)
BEGIN
WAITFOR DELAY '00:00:10';
INSERT INTO [dbo].[Localizations]
VALUES ('aa', 1);
END;
COMMIT;
And in the meanwhile the following query:
UPDATE [dbo].[Localizations]
SET name = 'bb'
WHERE regionid = 2
The second query waits till the first query ends. If I change the non-unique index of regionId column to the unique index of regionId column then the second query executes immediately. :O Why??
Does it mean the non-unique index of the regionId column together with WITH (UPDLOCK, HOLDLOCK) locks a whole table?

SQL Server slow select from large Table

i have 2 Really Big sql server Database tables for IOT Project
First TABLE IS Message (rows count 7,423,889,085 rows)
CREATE TABLE [aymax].[Message](
[MessageId] [bigint] IDENTITY(1,1) NOT NULL,
[ObjectId] [int] NOT NULL,
[TimeStamp] [datetime] NOT NULL CONSTRAINT [DF__Message__TimeSta__3B75D760] DEFAULT (getdate()),
[GpsTime] [datetime] NOT NULL,
[VisibleSatelites] [int] NOT NULL,
[X] [float] NOT NULL,
[Y] [float] NOT NULL,
CONSTRAINT [Message_PK] PRIMARY KEY NONCLUSTERED
(
[MessageId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
Second table is MessageSensors , row count (26,359,568,037 rows) , this table have value for each sensor in message table
CREATE TABLE [aymax].[MessageSensors](
[MessageId] [bigint] NOT NULL,
[DataSourceId] [int] NOT NULL,
[Value] [float] NOT NULL CONSTRAINT [DF__AnalogDat__Value__5812160E] DEFAULT ((0)),
CONSTRAINT [AnalogData_PK] PRIMARY KEY CLUSTERED
(
[MessageId] ASC,
[DataSourceId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
my problem that seek by time interval between 2 date time is really slow , also it became more slow if i select with message sensor data , also when i use sp_BlitzIndex check from brentozar.com it say that i have
"Indexaphobia: High value missing index"
[aymax].[MessageSensors] (EQUALITY: [DataSourceId], [Value] INCLUDES: [MessageId] )
[aymax].[MessageSensors] EQUALITY: [Value] INCLUDES: [MessageId], [DataSourceId]
I belive that create this 2 index is will increase storage alot , also will take too much time to be created , i need your advice for both table regarding index
my current indexes
1-
CREATE NONCLUSTERED INDEX [IX_gpstime_objectid] ON [aymax].[Message]
(
[GpsTime] ASC
)
INCLUDE ( [MessageId],
[ObjectId]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
2-
alter TABLE [aymax].[Message] ADD CONSTRAINT [Message_PK] PRIMARY KEY NONCLUSTERED
(
[MessageId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
3rd-
ALTER TABLE [aymax].[MessageSensors] ADD CONSTRAINT [AnalogData_PK] PRIMARY KEY CLUSTERED
(
[MessageId] ASC,
[DataSourceId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
4-
CREATE NONCLUSTERED INDEX [MessageData_DataSourceId_IDX] ON [aymax].[MessageSensors]
(
[DataSourceId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
any help please , i need to make a fast retrieve from message , and message sensor
update
while doing some investigate i found that select float value will slow up the result too much , from 1 second to 3 minutes
SELECT m.messageid,
m.objectid,
m.gpstime,
m.x,
m.y,
-- slow is here if i replace md.value with md.messageId will return fast , md.value is float
md.Value ,
0
FROM aymax.[message] m WITH (nolock)
left JOIN aymax.MessageSensors md WITH (nolock)
ON m.messageid = md.messageid
AND md.datasourceid = 425732
WHERE m.objectid = 14099
AND m.gpstime BETWEEN '2017-04-01 19:46:18.607' AND '2017-04-10 19:05:18.607'
Possible solutions:
Filtered index (filter by date and do not index old data)
https://learn.microsoft.com/en-us/sql/relational-databases/indexes/create-filtered-indexes.
Clustered index on GpsTime, MessageId (Espessially if you have no plans about another indexes). Requires rebuild your table.
Partitions (see #Siyaul's comments)

Why is index scan used instead of index seek?

I created unique, clustered indices on two large tables that I have to join, but those indices are not used as evidenced by the query plan. When I force the use of those indices with a hint, an index scan is used and performance gets much worse. The unique key of one table is the foreign key of the second table, so this puzzles me. Here is the schema. The two tables are LOC and POL. LOC has 7 million odd rows, and POL has over 6 million rows.
CREATE TABLE [dbo].[LOC](
[acct_num] [char](30) NOT NULL,
[cntr_num] [char](30) NOT NULL,
[lob_cde] [char](2) NOT NULL,
[ste_locn_nme] [char](30) NOT NULL,
[buldg_num] [char](20) NOT NULL,
[prctr_cde] [char](3) NULL,
...more fields...
) ON [PRIMARY]
CREATE NONCLUSTERED INDEX [IDX_All] ON [dbo].[LOC]
(
[acct_num] ASC,
[cntr_num] ASC,
[lob_cde] ASC,
[buldg_num] ASC,
[prctr_cde] ASC,
[spcl_cond_1_id] ASC,
[spcl_cond_2_id] ASC,
[spcl_cond_3_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
CREATE UNIQUE CLUSTERED INDEX [IDX_LOC_PKEY] ON [dbo].[LOC]
(
[acct_num] ASC,
[lob_cde] ASC,
[prctr_cde] ASC,
[cntr_num] ASC,
[ste_locn_nme] ASC,
[buldg_num] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
... more non-unique indices on LOC, each on single columns: buldg_num, prctr_cde, acct_num
CREATE TABLE [dbo].[POL](
[acct_num] [char](30) NOT NULL,
[cntr_num] [char](30) NOT NULL,
[lob_cde] [char](2) NOT NULL,
[prctr_cde] [char](3) NULL,
...more fields...
) ON [PRIMARY]
CREATE NONCLUSTERED INDEX [IDX_All] ON [dbo].[POL]
(
[acct_num] ASC,
[cntr_num] ASC,
[lob_cde] ASC,
[prctr_cde] ASC,
[acct_nme] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
CREATE UNIQUE CLUSTERED INDEX [IDX_POL_PKEY] ON [dbo].[POL]
(
[acct_num] ASC,
[lob_cde] ASC,
[prctr_cde] ASC,
[cntr_num] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
... more non-unique indices on POL, each on single columns: cntr_num, prctr_cde
As you can see, the four fields of the POL table's unique key (acct_num, lob_cde, prctr_cde, cntr_num) are the first four columns of the LOC table's primary key. Here is one query I want to run, like many that will join these two tables:
select
[Easy matches] = COUNT(*)
FROM LOC INNER JOIN POL ON (
LOC.acct_num = POL.acct_num
AND LOC.lob_cde = POL.lob_cde
AND LOC.prctr_cde = POL.prctr_cde
AND LOC.cntr_num = POL.cntr_num)
Without hints, this likes to use the IDX_prctr_cde index from each table. The prctr_cde column is not very selective; there are only seven different values in the LOC or POL tables. If I hint that the query should use IDX_cntr_num index, I get good performance, since it is a highly selective column (over 6 million distinct values in each table). acct_num is almost as selective as cntr_num, also with over 6 million distinct values.
Why is a non-selective index used by default? Why is switching to using the unique, clustered index making the query run much slower? (10x, 20x or even 30x slower.)
Note: The hint I used was:
OPTION (
TABLE HINT(POL, INDEX (IDX_POL_PKEY)),
TABLE HINT(LOC, INDEX (IDX_LOC_PKEY))
)
Note: I am using SQL Server 2005 and SQL Server 2008.

Why isn't the clustered index being used in this query?

I have the following table:
CREATE TABLE [Cache].[Marker](
[ID] [int] NOT NULL,
[SubID] [varchar](15) NOT NULL,
[ReadTime] [datetime] NOT NULL,
[EquipmentID] [varchar](25) NULL,
[Sequence] [int] NULL
) ON [PRIMARY]
With the following clustered index:
CREATE UNIQUE CLUSTERED INDEX [IX_Marker_EquipmentID_ReadTime_SubID] ON [Cache].[Marker]
(
[EquipmentID] ASC,
[ReadTime] ASC,
[SubID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
And this query:
Declare #EquipmentId nvarchar(50)
Set #EquipmentId = 'KLM52B-MARKER'
SELECT TOP 1
cr.C44DistId,
cr.C473RightLotId
From Cache.Marker m
INNER JOIN Cache.vwCoaterRecipe AS cr ON cr.MarkerId = m.ID
Where m.EquipmentID = #EquipmentId And m.ReadTime >= '3/1/2013'
ORDER BY m.Id desc
Here is the query plan being generated:
My question is this. Why isn't the clustered index on the Cache.Marker table being used with a seek instead of a scan on another index? Furthermore, SSMS query analyzer is suggesting I add an index on Marker.ReadTime with ID and EquipmentID columns included.
There are roughly 1M rows in the Cache.Marker table.
How many unique equipment ID's do you have? It's probably decided date is a better first lookup (perhaps mistakenly). You can force it to use your index though with the WITH( INDEX() ) statement. FORCESEEK can help as well. I highly recommend this because then index behavior is predictable as databases grow to large sizes.
SELECT TOP 1
cr.C44DistId,
cr.C473RightLotId
From Cache.Marker m
WITH ( INDEX( IX_Marker_EquipmentID_ReadTime_SubID ), FORCESEEK )
INNER JOIN Cache.vwCoaterRecipe AS cr
ON cr.MarkerId = m.ID
Where m.EquipmentID = #EquipmentId And m.ReadTime >= '3/1/2013'
ORDER BY m.Id desc

Optimizing CLUSTERED INDEX for use with JOIN

table optin_channel_1 (for each 'channel' there's a dedicated table)
CREATE TABLE [dbo].[optin_channel_1](
[key_id] [bigint] NOT NULL,
[valid_to] [datetime] NOT NULL,
[valid_from] [datetime] NOT NULL,
[key_type_id] [int] NOT NULL,
[optin_flag] [tinyint] NOT NULL,
[source_proc_id] [int] NOT NULL,
[date_inserted] [datetime] NOT NULL
) ON [PRIMARY]
CREATE CLUSTERED INDEX [ix_id] ON [dbo].[optin_channel_1]
(
[key_type_id] ASC,
[key_id] ASC,
[valid_to] ASC,
[valid_from] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
table profile_conns
CREATE TABLE [dbo].[profile_conns](
[profile_key_id] [bigint] NOT NULL,
[valid_to] [datetime] NOT NULL,
[valid_from] [datetime] NOT NULL,
[conn_key_id] [bigint] NOT NULL,
[conn_key_type_id] [int] NOT NULL,
[conn_type_id] [int] NOT NULL,
[source_proc_id] [int] NOT NULL,
[date_inserted] [datetime] NOT NULL
) ON [PRIMARY]
CREATE CLUSTERED INDEX [ix_id] ON [dbo].[profile_conns]
(
[profile_key_id] ASC,
[conn_key_type_id] ASC,
[conn_key_id] ASC,
[valid_to] ASC,
[valid_from] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
table lu_channel_conns
CREATE TABLE [dbo].[lu_channel_conns](
[channel_id] [int] NOT NULL,
[conn_type_id] [int] NOT NULL,
CONSTRAINT [PK_lu_channel_conns] PRIMARY KEY CLUSTERED
(
[channel_id] ASC,
[conn_type_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
table lu_conn_type
CREATE TABLE [dbo].[lu_conn_type](
[conn_type_id] [int] NOT NULL,
[default_key_type_id] [int] NOT NULL,
[master_key_type_id] [int] NOT NULL,
[date_inserted] [datetime] NOT NULL,
CONSTRAINT [PK_lu_conns] PRIMARY KEY CLUSTERED
(
[conn_type_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
view v_source_proc_id_by_group_id
SELECT DISTINCT x.source_proc_id, x.source_proc_group_id
FROM lu_source_proc x INNER JOIN lu_source_proc_group y ON x.source_proc_group_id = y.group_id
There's a dynamic SQL statement going to be executed:
SET #sql_str='SELECT #ret=MAX(o.optin_flag)
FROM optin_channel_'+CAST(#channel_id AS NVARCHAR(100))+' o
INNER HASH JOIN dbo.v_source_proc_id_by_group_id y ON o.source_proc_id=y.source_proc_id AND y.source_proc_group_id=#source_proc_group_id
INNER HASH JOIN profile_conns z ON z.profile_key_id=cast(#profile_key_id AS NVARCHAR(100)) AND z.conn_key_type_id=o.key_type_id AND z.conn_key_id=o.[key_id] AND z.valid_to=''01.01.3000''
INNER HASH JOIN lu_channel_conns x ON x.channel_id=#channel_id AND z.conn_type_id=x.conn_type_id
INNER HASH JOIN lu_conn_type ct ON ct.conn_type_id=x.conn_type_id AND ct.default_key_type_id=o.key_type_id'
SET #param='#channel_id INT, #profile_key_id INT, #source_proc_group_id INT, #ret NVARCHAR(400) OUTPUT'
EXEC sp_executesql #sql_str,#param,#channel_id,#profile_key_id,#source_proc_group_id,#ret OUTPUT
I.e. this gives:
SELECT #ret=MAX(o.optin_flag) AS optin_flag
FROM optin_channel_1 o
INNER HASH JOIN dbo.v_source_proc_id_by_group_id y
ON o.source_proc_id=y.source_proc_id
AND y.source_proc_group_id=5
INNER HASH JOIN profile_conns z
ON z.profile_key_id=1
AND z.conn_key_type_id=o.key_type_id
AND z.conn_key_id=o.[key_id]
AND z.valid_to='01.01.3000'
INNER HASH JOIN lu_channel_conns x
ON x.channel_id=1
AND z.conn_type_id=x.conn_type_id
INNER HASH JOIN lu_conn_type ct
ON ct.conn_type_id=x.conn_type_id
AND ct.default_key_type_id=o.key_type_id
These tables are used for an optin database. optin_flag could be 0 or 1. With the last statement I want to get a 1 as optin_flag from optin_channel_1 for the given channel_id=1 for user with profile_key_id=1, when optin was inserted into database by process belonging to source_proc_group_id=5. I hope this is enough to comprehend what's going on.
Is this the best way to use the CLUSTERED INDEX'es? Or would it be better to remove profile_key_id from index on profile_conns and put z.profile_key_id=1 in a WHERE clause?
May be there's a much better way for optimizing this select (changes in database schema is not possible, only changes on indexes and modifing statement).
Without knowing the size of the tables and the sort of data stored in it them it is difficult to gauge.
Assuming optin_channel_1 has a lot of data and profile_cons has a lot of data I would try the following:
Clustered index on optin_channel_1(key_id) or key_type_id depending on which field has the most distinct values. (since you don't have a covering index)
Clustered index on profile_conns (cons_key_id) or cons_key_type_id depending on what you have chosen in optin_channel_1
etc...
Basically, if your table profile_conns table has not much data, I would put the clustered index on the most fragmented "filter" field (I suspect profile_key_id). If the table has a lot of data I would aim for a hash/merge join and match the clustered index with the clustered index of the optin_channel_1 table.
I would also rewrite the query as such:
SELECT #ret = MAX(o.optin_flag) AS optin_flag
FROM optin_channel_1 o
JOIN dbo.v_source_proc_id_by_group_id y
ON o.source_proc_id = y.source_proc_id
JOIN profile_conns z
ON z.conn_key_type_id = o.key_type_id
AND z.conn_key_id = o.[key_id]
JOIN lu_channel_conns x
ON z.conn_type_id = x.conn_type_id
JOIN lu_conn_type ct
ON ct.conn_type_id = x.conn_type_id
AND ct.default_key_type_id=o.key_type_id
WHERE y.source_proc_group_id = 5
AND z.profile_key_id = 1
AND x.channel_id = 1
AND z.valid_to = '01.01.3000'
The query changed this way because:
Putting the filter conditions in the where clause shows you what are relevant fields to aim for a hash/merge join
Putting join hints is rarely a good idea. It is very hard to beat the query governor to determine the best query plan. A bad plan usually indicates you have an issue with your indexes/statistics.
So as summary:
small table joined to big table ==> go for nested loops & focus your clustered index on the "filter" field in the small table & the join field in the big table.
big table joined to big table => go for hash/merge join and put the clustered index on the matching field on both sides
multi-field indexes usually only a good idea when they are "covering", this means all the fields you query are included in the index. (or are included with the include() clause)