Indexed view in SQL Server not using indexes

My schema:
I need to get the count of comments for each tag.
I created a view:
create view dbo.UserTagCommentCount
with schemabinding
as
select
c.UserPK, t.PK TagPK, count_big(*) Count
from
dbo.Comments c
join
dbo.Posts p on p.PK = c.PostPK
join
dbo.PostTags pt on pt.PostPK = p.PK
join
dbo.Tags t on t.PK = pt.TagPK
group by
t.PK, c.UserPK
go
and I created a unique clustered index on this view:
create unique clustered index PK_UserTagCommentCount
on dbo.UserTagCommentCount(UserPK, TagPK)
go
But when I select rows by UserPK, this clustered index is not used:
select *
from UserTagCommentCount
where UserPK = 19146
order by Count desc
OK, so I created a simple index:
create index IX_UserTagCommentCount_UserPK
on UserTagCommentCount(UserPK)
go
and ran the select with an index hint:
select *
from UserTagCommentCount with(index(IX_UserTagCommentCount_UserPK))
where UserPK = 19146
order by Count desc
but I see the same plan.
Any ideas? Why are the indexes not used when selecting from this view?
SQL Server 2019 development
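One thing worth trying, offered as an assumption about the cause and not a confirmed diagnosis: the optimizer is free to expand a view into its base tables, and as far as I know an index hint on a view is only honored when it is combined with NOEXPAND. A minimal sketch against the view and indexes created above:

```sql
-- Assumption: dbo.UserTagCommentCount and both indexes exist as created above.
-- NOEXPAND asks the optimizer to treat the indexed view like a table
-- instead of expanding it into Comments / Posts / PostTags / Tags.
select UserPK, TagPK, [Count]
from dbo.UserTagCommentCount with (noexpand)
where UserPK = 19146
order by [Count] desc;

-- An index hint on the view, combined with NOEXPAND:
select UserPK, TagPK, [Count]
from dbo.UserTagCommentCount with (noexpand, index(IX_UserTagCommentCount_UserPK))
where UserPK = 19146
order by [Count] desc;
```

If the plan for these queries shows a seek on the view's index, view expansion was the reason the original queries ignored it.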

Related

Query performance: CTE using ROW_NUMBER() to select first row

We have three environments. In two of them my SQL query takes 30 to 38 seconds to run, but in the third it never completes and I have to cancel it. The query has two parts, a CTE and a very simple select, and both read from the same table.
Could you please tell me why it takes so long? How can I improve the query?
ALTER VIEW [fact].[vPurchase]
AS
WITH VKPL AS
(
SELECT *
FROM
(SELECT
iv.[Delivery_FK],
1 AS column2,
ROW_NUMBER() OVER(PARTITION BY [Delivery_FK] ORDER BY iv.UpdateDate) AS rk
FROM
[fact].[KRMFact] iv
LEFT JOIN
[dimension].[Product] pr ON iv.Product_FK =pr.Product_SK
LEFT JOIN
[dimension].[Delivery] le ON le.Delivery_FK = iv.Delivery_FK
WHERE
pr.Product_Key = '740') X
WHERE
rk = 1
)
SELECT
-- .... here are some columns
Delivery_FK,
Product_FK,
CAST(column2 AS VARCHAR) AS column2,
f.[UpdateDate] AS [Update date]
FROM
[fact].[KRMFact] f
LEFT JOIN
VKPL v ON f.Delivery_FK = v.Delivery_FK
This is guesswork.
I guess the environment where this query is slow is the one with lots of production data in it.
I guess some index on your KRMFact table will, maybe, help you. Here's how to figure out what you need: SQL Server Management Studio (SSMS) has a feature to show you a query's execution plan. Put your query (not simplified, please, the actual query) into SSMS, right click and choose "Include Actual Execution Plan." Then run the query. The execution plan display may recommend an index for you to create to get this query to run faster.
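In the environment where the query never finishes, an actual plan will never be displayed. A sketch of requesting the estimated plan instead, which compiles the statement without running it (the select against the view is a stand-in for your real query):

```sql
-- SET SHOWPLAN_XML must be the only statement in its batch, hence the GO lines.
SET SHOWPLAN_XML ON;
GO
-- Returns the estimated plan as XML; the query itself is not executed.
SELECT * FROM [fact].[vPurchase];
GO
SET SHOWPLAN_XML OFF;
GO
```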
I guess you're trying to find rows with the earliest values of UpdateDate.
Your subquery
SELECT *
FROM
(SELECT
iv.[Delivery_FK],
1 AS column2,
ROW_NUMBER() OVER(PARTITION BY [Delivery_FK] ORDER BY iv.UpdateDate) AS rk
FROM
[fact].[KRMFact] iv
LEFT JOIN
[dimension].[Product] pr ON iv.Product_FK =pr.Product_SK
LEFT JOIN
[dimension].[Delivery] le ON le.Delivery_FK = iv.Delivery_FK
WHERE
pr.Product_Key = '740') X
WHERE
rk = 1
looks like it picks out the row with the earliest KRMFact.UpdateDate for each value of KRMFact.Delivery_FK. That's what the ROW_NUMBER() OVER... WHERE rk=1 language does.
If my guess about that is correct you can do that a different way, which may be more efficient.
SELECT *
FROM
(SELECT
iv.[Delivery_FK],
1 AS column2,
1 AS rk
FROM
[fact].[KRMFact] iv
JOIN ( SELECT Delivery_FK, MIN(UpdateDate) first_update
FROM [fact].[KRMFact]
GROUP BY Delivery_FK
) first_update ON iv.Delivery_FK = first_update.Delivery_FK
AND iv.UpdateDate = first_update.first_update
LEFT JOIN
[dimension].[Product] pr ON iv.Product_FK =pr.Product_SK
LEFT JOIN
[dimension].[Delivery] le ON le.Delivery_FK = iv.Delivery_FK
WHERE
pr.Product_Key = '740') X
WHERE
rk = 1
You should try both the old and new versions of the subquery to check whether they yield the same results. They can differ: the MIN() here is taken over all rows for a delivery, not only those with Product_Key = '740', and ties on UpdateDate return more than one row per delivery.
If you use the subquery I suggest, this index will help make it run faster by optimizing the new sub-sub-query's MIN() ... GROUP BY operation.
CREATE INDEX x_KRMFact_Delivery_Update
ON [fact].[KRMFact]
([Delivery_FK],[UpdateDate])
By the way, WHERE pr.Product_Key = '740' turns your LEFT JOIN [dimension].[Product] operation into an ordinary inner JOIN.
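A short sketch of that last point, reduced to the two tables involved. Which variant is right depends on whether unmatched KRMFact rows should survive, which only you can decide:

```sql
-- As written: the WHERE filter discards rows where pr.Product_Key is NULL,
-- so KRMFact rows without a matching product disappear (inner join behavior).
SELECT iv.Delivery_FK
FROM [fact].[KRMFact] iv
LEFT JOIN [dimension].[Product] pr ON iv.Product_FK = pr.Product_SK
WHERE pr.Product_Key = '740';

-- Same results, with the intent stated directly:
SELECT iv.Delivery_FK
FROM [fact].[KRMFact] iv
JOIN [dimension].[Product] pr ON iv.Product_FK = pr.Product_SK
WHERE pr.Product_Key = '740';

-- Different results: keeps every KRMFact row, with pr.* NULL where the
-- product is missing or is not '740'. Use only if that is what you want.
SELECT iv.Delivery_FK
FROM [fact].[KRMFact] iv
LEFT JOIN [dimension].[Product] pr
       ON iv.Product_FK = pr.Product_SK AND pr.Product_Key = '740';
```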

SQL How to optimize insert to table from temporary table

I created a procedure that dynamically collects records from various projects (databases) into a temporary table, and then inserts from that temporary table into a table, using a WHERE clause. Unfortunately, the execution plan shows that this part of the query carries a lot of the load. How can I optimize the INSERT or its WHERE clause?
INSERT INTO dbo.PROJECTS_TESTS ( PROJECTID, ANOTHERTID, DOMAINID, is_test)
SELECT * FROM #temp_Test AS tC
WHERE NOT EXISTS (SELECT TOP 1 1
FROM dbo.PROJECTS_TESTS AS ps WITH (NOLOCK)
WHERE ps.PROJECTID = tC.projectId
AND ps.ANOTHERTID = tC.anotherLink
AND ps.DOMAINID = tC.DOMAINID
AND ps.is_test = tC.test_project
)
I think you'd be better served by doing a JOIN than EXISTS. Depending on the cardinality of your join condition (currently in your WHERE) you might need DISTINCT in there too.
INSERT INTO dbo.PROJECTS_TESTS ( PROJECTID, ANOTHERTID, DOMAINID, is_test)
SELECT /* DISTINCT, if needed */ tC.* FROM #temp_Test AS tC
LEFT OUTER JOIN dbo.PROJECTS_TESTS AS ps ON
ps.PROJECTID = tC.projectId
AND ps.ANOTHERTID = tC.anotherLink
AND ps.DOMAINID = tC.DOMAINID
AND ps.is_test = tC.test_project
WHERE ps.PROJECTID IS NULL
or something like that.
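Whichever form you use, the lookup against dbo.PROJECTS_TESTS compares four columns, and without an index on them each check may have to scan the table. A sketch, assuming no such index exists yet (the index name is hypothetical):

```sql
-- Hypothetical index name; the columns are the ones compared in the
-- NOT EXISTS / LEFT JOIN condition above.
CREATE INDEX IX_PROJECTS_TESTS_lookup
ON dbo.PROJECTS_TESTS (PROJECTID, ANOTHERTID, DOMAINID, is_test);
```

Compare the execution plan before and after to see whether the scan on PROJECTS_TESTS turns into a seek.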

Clustered and non-clustered index seeking increase execution time in stored procedure

I have a stored procedure which takes over 3 minutes to execute. When I look at the execution plan, I find clustered index seeks and non-clustered index seeks.
My query:
SELECT distinct
[tbl_worflowprocess].[currenttid]
,USR2.[firstname] AS [prev_action_user_name]
,USR3.[firstname] AS [current_action_user_name]
,COD2.[Code] AS [reasontext]
,[tbl_application_details].[application_id] AS [ApplicationId]
,[tbl_application_details].[application_number] AS [ApplicationNumber]
,[dbo].[fn_app_GetApplicationId]([tbl_application_details].[link_application_id]) AS [LinkApplicationId]
,[tbl_application_details].[link_type] AS [LinkType]
,[dbo].[fn_app_CountProductsInApplication]([tbl_application_details].[application_id]) AS [ProductsCount]
,[tbl_application_details].[submission_date] AS [SubmissionDate]
,[tbl_jurisdiction].[jurisdictionname]
,[tbl_devicetype].[devicetype]
,COD1.[Code] AS [ClassificationName]
,EST1.[name] AS [ApplicantName]
,EST2.[name] AS [ManufacturerName]
,[dbo].[fnGetApplicationStatusFromTaskId]([tbl_worflowprocess].[currenttid]) AS [AppStatus]
,[dbo].[fnGetApplicationStatusText](@pLoggedInUserRoleId,[tbl_worflowprocess].[currenttid]) AS [StatusText]
,[Paid] = (CASE [tbl_application_details].[paid] WHEN 1 THEN 'Yes' ELSE 'No' END)
,[CreationDate] = [tbl_worflowprocess].[creationdate]
,[CommentText] =
(select CommentText
from dbo.tbl_application_comments
where Id = (select max(Id) from dbo.tbl_application_comments
where ApplicationId= [tbl_application_details].[application_id] and UserId = @pLoggedInUserID ))
,[LastCab] = (select isnull(dbo.fnGetLastCabForApplication([tbl_application_details].[application_id]),'-'))
,[tbl_application_details].[ArExpired]
FROM
[tbl_worflowprocess]
INNER JOIN
(SELECT
[application_id], [actionbyuser_id],
[actionbyrole_id], [reason_id], createddate
FROM
[tbl_applicationworkflowhistory]
INNER JOIN
(SELECT
[application_id] AS C1, MAX([version]) AS C2
FROM
[tbl_applicationworkflowhistory]
WHERE
(@pCurrentRoleId IS NULL
OR [application_id] IN (SELECT [application_id]
FROM [tbl_applicationworkflowhistory]
INNER JOIN [tbl_worflowprocess] ON [tbl_applicationworkflowhistory].[application_id] = [tbl_worflowprocess].[applicationid]
WHERE
(@pSearchInHistory = 0 OR [tbl_applicationworkflowhistory].[actionbyrole_id] = @pCurrentRoleId OR [tbl_worflowprocess].[currentroleid] = @pCurrentRoleId)
AND (@pSearchInHistory = 1 OR [tbl_worflowprocess].[currentroleid] = @pCurrentRoleId)
)
) AND
(@pCurrentUserId IS NULL OR [application_id] IN (
SELECT [application_id]
FROM [tbl_applicationworkflowhistory]
INNER JOIN [tbl_worflowprocess] ON [tbl_applicationworkflowhistory].[application_id]=[tbl_worflowprocess].[applicationid]
WHERE
(@pSearchInHistory=0 OR [tbl_applicationworkflowhistory].[actionbyuser_id] =@pCurrentUserId OR [tbl_worflowprocess].[currentuserid]=@pCurrentUserId)
AND (@pSearchInHistory=1 OR [tbl_worflowprocess].[currentuserid]=@pCurrentUserId)
)
) AND
(@pCurrentEstablishmentId IS NULL OR [application_id] IN (
SELECT [application_id]
FROM [tbl_applicationworkflowhistory]
INNER JOIN [tbl_worflowprocess] ON [tbl_applicationworkflowhistory].[application_id]=[tbl_worflowprocess].[applicationid]
WHERE
(@pSearchInHistory=0 OR [tbl_applicationworkflowhistory].[actionbyuser_id] IN
(SELECT [user_id] FROM [tbl_user] WHERE [establishment_id]=@pCurrentEstablishmentId) OR [tbl_worflowprocess].[currentuserid] IN
(SELECT [user_id] FROM [tbl_user] WHERE [establishment_id]=@pCurrentEstablishmentId))
AND (@pSearchInHistory=1 OR [tbl_worflowprocess].[currentuserid] IN
(SELECT [user_id] FROM [tbl_user] WHERE [establishment_id]=@pCurrentEstablishmentId))
)
)
GROUP BY [application_id]
)AS T1 ON ([tbl_applicationworkflowhistory].[application_id]=T1.C1 AND [tbl_applicationworkflowhistory].[version]=T1.C2)
) AS T2 ON([tbl_worflowprocess].[applicationid]=T2.[application_id])
INNER JOIN [tbl_application_details] ON [tbl_application_details].[application_id]=[tbl_worflowprocess].[applicationid]
INNER JOIN [tbl_user] USR1 ON USR1.[user_id]=[tbl_application_details].[responsible_user_id]
INNER JOIN [tbl_establishments] EST1 on EST1.[establishment_id] = USR1.[establishment_id]
LEFT OUTER JOIN [tbl_user] USR2 ON USR2.[user_id]=T2.[actionbyuser_id]
LEFT OUTER JOIN [tbl_user] USR3 ON USR3.[user_id]=[tbl_worflowprocess].[currentuserid]
LEFT OUTER JOIN [tbl_establishments] EST2 on EST2.[establishment_id] = [tbl_application_details].[manufacturer_id]
LEFT OUTER JOIN [tbl_jurisdiction] ON [tbl_jurisdiction].[jurisdiction_id]=[tbl_application_details].[jurisdiction_id]
LEFT OUTER JOIN [tbl_devicetype] ON [tbl_devicetype].[devicetype_id]=[tbl_application_details].[device_type_id]
LEFT OUTER JOIN [tbl_codes] COD1 ON COD1.[code_id]=[tbl_application_details].[device_classification_id]
LEFT OUTER JOIN [tbl_codes] COD2 ON COD2.[code_id]=T2.[reason_id]
LEFT OUTER JOIN [tbl_certificates] CERTF ON CERTF.[application_id]=[tbl_application_details].[application_id]
WHERE
(@pWFTasks IS NULL OR
[tbl_worflowprocess].[currenttid] IN (SELECT item
FROM [dbo].[fnSplit](@pWFTasks,',')))
Any way to improve my query?
Create indexes on the tables based on the query. Use the suggested missing-index improvements if there are any, as long as they do not interfere with the rest of your DB.
If you see a table scan in the execution plan even though an index on that field already exists, try changing the index to include the columns you select.
If you can, avoid UDFs when the query returns many rows.
If you can pre-calculate something, do so using table variables or even CTEs. For example, this (if it returns more than one value) can be stored in a table variable: SELECT [user_id] FROM [tbl_user] WHERE [establishment_id]=@pCurrentEstablishmentId
Subqueries such as "select max(Id) from dbo.tbl_application_comments" can be improved by computing the value into a simple variable before the query.
Use SNAPSHOT or READ UNCOMMITTED isolation, or at least (nolock).
Make sure the statistics on the tables are up to date!
Check that you are using left joins correctly (vs. inner joins, which are faster).
Limit the number of rows for each join as much as possible; use WHERE clauses to cut the data.
More advice could be given; query optimization is an interesting field.
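The pre-calculation tip can be sketched as follows. The int types, the sample value, and the table variable name @EstablishmentUsers are assumptions for illustration; inside the stored procedure the parameter already exists and would not be declared again.

```sql
-- Stand-in for the procedure parameter pCurrentEstablishmentId;
-- the int type and the value 1 are assumptions.
DECLARE @pCurrentEstablishmentId int = 1;

-- Assumption: user_id is an int and unique in tbl_user.
DECLARE @EstablishmentUsers TABLE ([user_id] int PRIMARY KEY);

-- Run the lookup once, up front.
INSERT INTO @EstablishmentUsers ([user_id])
SELECT [user_id]
FROM [tbl_user]
WHERE [establishment_id] = @pCurrentEstablishmentId;

-- The main query can then use
--   IN (SELECT [user_id] FROM @EstablishmentUsers)
-- wherever it currently repeats the tbl_user subquery.
```

The query in the question repeats that subquery four times, so this at least removes the duplication; whether it is faster should be checked against the execution plan.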

SQL select from table - only including data in specific filegroup

I followed this article:
http://www.mssqltips.com/sqlservertip/1796/creating-a-table-with-horizontal-partitioning-in-sql-server/
Which in essence does the following:
Creates a database with three filegroups, call them A, B, and C
Creates a partition scheme, mapping to the three filegroups
Creates table - SalesArchival, using the partition scheme
Inserts a few rows into the table, split over the filegroups.
I'd like to perform a query like this (excuse my pseudo-code)
select * from SalesArchival
where data in filegroup('A')
Is there a way of doing this? If not, how do I go about it?
What I want to accomplish is to have a batch run every day that moves data older than 90 days to a different file group, and perform my front end queries only on the 'current' file group.
To get at a specific filegroup, always rely on partition elimination: filter on the partitioning column in your predicates so that only the relevant partitions are read. This is essential if you are to get any benefit from partitioning.
For archival, I think you're looking for how to split and merge ranges. You should always keep the first and last partitions empty, but this should give you an idea of how to use partitions for archiving. FYI, moving data from one filegroup to another is very resource intensive. Additionally, results will be slightly different if you use a RANGE RIGHT partition function. Since you are doing partitioning, hopefully you've read up on best practices.
DO NOT RUN ON PRODUCTION. THIS IS ONLY AN EXAMPLE TO LEARN FROM.
This example assumes you have 4 filegroups (FG1,FG2,FG3, & [PRIMARY]) defined.
IF EXISTS(SELECT NULL FROM sys.tables WHERE name = 'PartitionTest')
DROP TABLE PartitionTest;
IF EXISTS(SELECT NULL FROM sys.partition_schemes WHERE name = 'PS')
DROP PARTITION SCHEME PS;
IF EXISTS(SELECT NULL FROM sys.partition_functions WHERE name = 'PF')
DROP PARTITION FUNCTION PF;
CREATE PARTITION FUNCTION PF (datetime) AS RANGE LEFT FOR VALUES ('2012-02-05', '2012-05-10','2013-01-01');
CREATE PARTITION SCHEME PS AS PARTITION PF TO (FG1,FG2,FG3,[PRIMARY]);
CREATE TABLE PartitionTest( Id int identity(1,1), DT datetime) ON PS(DT);
INSERT PartitionTest (DT)
SELECT '2012-02-05' --FG1
UNION ALL
SELECT '2012-02-06' --FG2(This is the one 90 days old to archive into FG1)
UNION ALL
SELECT '2012-02-07' --FG2
UNION ALL
SELECT '2012-05-05' --FG2 (This represents a record entered recently)
Check the filegroup associated with each record:
SELECT O.name TableName, fg.name FileGroup, ps.name PartitionScheme,pf.name PartitionFunction, ISNULL(prv.value,'Undefined') RangeValue,p.rows
FROM sys.objects O
INNER JOIN sys.partitions p on P.object_id = O.object_id
INNER JOIN sys.indexes i on p.object_id = i.object_id and p.index_id = i.index_id
INNER JOIN sys.data_spaces ds on i.data_space_id = ds.data_space_id
INNER JOIN sys.partition_schemes ps on ds.data_space_id = ps.data_space_id
INNER JOIN sys.partition_functions pf on ps.function_id = pf.function_id
LEFT OUTER JOIN sys.partition_range_values prv on prv.function_id = ps.function_id and p.partition_number = prv.boundary_id
INNER JOIN sys.allocation_units au on p.hobt_id = au.container_id
INNER JOIN sys.filegroups fg ON au.data_space_id = fg.data_space_id
WHERE o.name = 'PartitionTest' AND i.type IN (0,1) --Remove nonclustereds. 0 for heap, 1 for BTree
ORDER BY O.name, fg.name, prv.value
This proves that 2012-02-05 is in FG1 while the rest are in FG2.
In order to archive, your first instinct is to move the data. When partitioning though, you actually have to slide the partition function range value.
Now let's move 2012-02-06 (90 days or older in your case) into FG1:
--Move 2012-02-06 from FG2 to FG1
ALTER PARTITION SCHEME PS NEXT USED FG1;
ALTER PARTITION FUNCTION PF() SPLIT RANGE ('2012-02-06');
Rerun the filegroup query to verify that 2012-02-06 got moved into FG1.
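For the front-end queries that should touch only the "current" data, partition elimination comes from filtering on the partitioning column. A sketch against the PartitionTest example above, assuming the split has already been applied:

```sql
-- Filtering on the partitioning column (DT) lets SQL Server read only the
-- partition(s) covering this range. $PARTITION.PF(DT) is included just to
-- show which partition each returned row lives in.
SELECT Id, DT, $PARTITION.PF(DT) AS PartitionNumber
FROM PartitionTest
WHERE DT > '2012-02-06' AND DT <= '2012-05-10';
```

The range here matches the boundaries of a single partition of PF, so the rows returned should all report the same PartitionNumber.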
The $PARTITION function (Transact-SQL) should do what you want.
Run the following to see the ID and row count of each partition:
USE AdventureWorks2012;
GO
SELECT $PARTITION.TransactionRangePF1(TransactionDate) AS Partition,
COUNT(*) AS [COUNT] FROM Production.TransactionHistory
GROUP BY $PARTITION.TransactionRangePF1(TransactionDate)
ORDER BY Partition ;
GO
and the following should give you data from given partition id:
SELECT * FROM Production.TransactionHistory
WHERE $PARTITION.TransactionRangePF1(TransactionDate) = 5 ;
No. You need to use the exact condition that you use in your partition function, which is probably something like
where keyCol between 3 and 7

Error adding an index to a view

I have created a view using the following code
CREATE VIEW dbo.two_weeks_performance WITH SCHEMABINDING
AS
SELECT dbo.day_dim.date_time AS Date,
dbo.order_dim.quantity AS Target_Acheived
FROM dbo.day_dim
JOIN dbo.order_fact ON dbo.day_dim.day_id = dbo.order_fact.day_id
JOIN dbo.branch_dim ON dbo.order_fact.branch_id = dbo.branch_dim.branch_id
JOIN dbo.order_dim ON dbo.order_fact.order_id = dbo.order_dim.order_id
GROUP BY dbo.order_dim.quantity, dbo.day_dim.date_time
Now when I use:
CREATE UNIQUE CLUSTERED INDEX two_weeks_performance_I ON two_weeks_performance (Date)
I am getting an error:
Cannot create index because its select list does not use the correct usage of COUNT_BIG(). Consider adding COUNT_BIG(*) to the select.
Please help me solve this issue.
The error tells you exactly what you have to do - add COUNT_BIG(*) to your select list.
From Creating Indexed Views:
If GROUP BY is specified, the view
select list must contain a
COUNT_BIG(*) expression, and the view
definition cannot specify HAVING,
ROLLUP, CUBE, or GROUPING SETS.
CREATE VIEW dbo.two_weeks_performance WITH SCHEMABINDING
AS
SELECT dbo.day_dim.date_time AS Date,
dbo.order_dim.quantity AS Target_Acheived,
COUNT_BIG(*) as Cnt
FROM dbo.day_dim
JOIN dbo.order_fact ON dbo.day_dim.day_id = dbo.order_fact.day_id
JOIN dbo.branch_dim ON dbo.order_fact.branch_id = dbo.branch_dim.branch_id
JOIN dbo.order_dim ON dbo.order_fact.order_id = dbo.order_dim.order_id
GROUP BY dbo.order_dim.quantity, dbo.day_dim.date_time
GO
CREATE UNIQUE CLUSTERED INDEX two_weeks_performance_I ON two_weeks_performance (Date)
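One further point, which depends on your data: the view groups by both quantity and date_time, so the same Date can appear in several rows. If it does, a unique index on Date alone fails with a duplicate key error. A sketch that keys the index on both grouping columns, whose combination is unique by construction:

```sql
-- Both columns come from the view's GROUP BY, so each (Date, Target_Acheived)
-- pair appears in at most one row of the view.
CREATE UNIQUE CLUSTERED INDEX two_weeks_performance_I
ON dbo.two_weeks_performance ([Date], Target_Acheived);
```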