Improving Query performance - T-SQL - sql

I have a table that is effectively ordered by a 'datetime' column: each row stores the UTC date at insert time, so the values only increase. The table is very large, and I am trying to improve query performance, if that is possible.
When I filter with something like WHERE columnDateTime > dateToSearch, it takes too long to return the rows. Since my table is already ordered by columnDateTime, what can I do to improve this query's performance? For example, when a table is ordered by a cod column and you search for cod > 40, the T-SQL optimizer will stop searching when it finds cod = 41 and return the rest of the table, because it knows the table is ordered by that index. Is there a way to tell T-SQL that my table is already ordered by columnDateTime too?

Inserting the data in order doesn't mean it is saved in order. Without getting too technical, you have two options for faster performance:
Create a CLUSTERED INDEX on that column. This requires that there is no other clustered index on your table, which means it either has no PRIMARY KEY or has one declared NONCLUSTERED (which is not the default). With a clustered index, the engine can seek to the first row matching > datetimeValue and read the range from there (instead of scanning the whole table), and it doesn't need to access additional pages for the data, since the leaves of a clustered index are the data.
Create a NONCLUSTERED INDEX on that column. There are no restrictions on this option (at least for this case), but for each row that matches your date filter, the engine will need to access another page to fetch the requested columns, unless you INCLUDE them when creating your index. Keep in mind that included columns increase the size of the index and add maintenance work, for example whenever an included column is modified.
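As a rough sketch of the two options above: the table name dbo.YourTable, the index names, and the included columns Col1 and Col2 are placeholders, not names from the question. You would normally pick one of the two statements, not both.

```sql
-- Option 1: clustered index on the datetime column.
-- Only one clustered index can exist per table, so this fails
-- if the table already has one (e.g. a default clustered PRIMARY KEY).
CREATE CLUSTERED INDEX CIX_YourTable_columnDateTime
    ON dbo.YourTable (columnDateTime);

-- Option 2: nonclustered index that also carries the columns the query returns,
-- so the engine does not have to look up each matching row in the base table.
-- Col1 and Col2 are hypothetical; list the columns your SELECT actually needs.
CREATE NONCLUSTERED INDEX IX_YourTable_columnDateTime
    ON dbo.YourTable (columnDateTime)
    INCLUDE (Col1, Col2);
```

Building either index on a very large table can take a while and uses extra space, so it is worth trying on a copy of the data first.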
That aside, you should check your query plan; if you have joins, function calls or additional conditions, the SQL engine might not use the indexes even if they exist. Many things can make a query run slowly, so you will have to post the full query and its execution plan (for a start) for anyone to check the details.
You can use this query to check if your table already has indexes:
-- Local variables start with @ in T-SQL (a leading # denotes a temporary table)
DECLARE @table_name VARCHAR(200) = 'YourTableName'
SELECT
    SchemaName = SCHEMA_NAME(t.schema_id),
    TableName = t.name,
    IndexName = ind.name,
    IndexType = CASE ind.index_id WHEN 0 THEN 'Heap' WHEN 1 THEN 'Clustered' ELSE 'Nonclustered' END,
    Disabled = ind.is_disabled,
    ColumnOrder = ic.index_column_id,
    ColumnName = col.name,
    ColumnType = y.name,
    ColumnLength = y.max_length,
    ColumnIncluded = ic.is_included_column
FROM
    sys.indexes ind
    INNER JOIN sys.index_columns ic ON ind.object_id = ic.object_id AND ind.index_id = ic.index_id
    INNER JOIN sys.columns col ON ic.object_id = col.object_id AND ic.column_id = col.column_id
    INNER JOIN sys.tables t ON ind.object_id = t.object_id
    INNER JOIN sys.types y ON y.user_type_id = col.user_type_id
WHERE
    t.is_ms_shipped = 0 AND
    t.name = @table_name
ORDER BY
    SchemaName,
    t.name,
    ind.name,
    ic.index_column_id
You need to make sure that there is at least one index that has your datetimeColumn with ColumnOrder = 1 and it's not disabled. If it already exists then your problem lies elsewhere and we won't be able to help much without more detail.

Related

TSQL see if column is part of unique index without using dynamic management function

I want to write a query to see information about whether a column in a table is part of a unique index or not.
Usually I write the following to get the information I need:
SELECT name
, is_part_of_unique_key
FROM sys.dm_exec_describe_first_result_set(N'SELECT * FROM dbo.Department', NULL, 1)
WHERE is_hidden = 0
According to Microsoft, "is_part_of_unique_key" in the above query does the following:
Returns 1 if the column is part of a unique index (including unique
and primary constraints) and 0 if it is not. Returns NULL if it cannot
be determined that the column is part of a unique index. Is only
populated if browsing information is requested.
Link: https://learn.microsoft.com/en-us/sql/relational-databases/system-dynamic-management-views/sys-dm-exec-describe-first-result-set-transact-sql?view=sql-server-2017
However, I am in a situation where I don't have SELECT permission on the above-mentioned table, so the query above will not work.
I need to find out whether the column is part of a unique index some other way, e.g. by looking at the INFORMATION_SCHEMA views.
I have permission to run the following, for instance, but it doesn't give me any information about unique indexes:
SELECT
    COLUMN_NAME AS name
    ,DATA_TYPE AS system_type_name
    ,CHARACTER_MAXIMUM_LENGTH
    ,IS_NULLABLE
FROM
    INFORMATION_SCHEMA.COLUMNS
WHERE
    TABLE_SCHEMA = 'dbo'
    AND TABLE_NAME = 'Department'
I am not sure whether sys.indexes will give me the correct results, since I am interested in whether the "column is part of a unique index (including unique and primary constraints)".
So I am only interested in unique indexes. How would I write a query to see whether the column is part of a unique index with the restricted permissions I have?
Help is much appreciated!
Let's assume you have this table:
CREATE TABLE MyIndexTestTable
(
MyIndexTestTableId INT NOT NULL IDENTITY(1,1),
code VARCHAR(10) NOT NULL UNIQUE,
otherUnique VARCHAR(10) NOT NULL,
name VARCHAR(100) NULL,
CONSTRAINT PK_MyIndexTestTable PRIMARY KEY(MyIndexTestTableId),
CONSTRAINT UQ2_MyIndexTestTable UNIQUE (otherUnique)
)
Then the following statement returns all columns that have a UNIQUE constraint or are part of the primary key:
SELECT COLUMN_NAME AS uniqueColumns FROM sys.objects c JOIN sys.objects t ON c.parent_object_id = t.object_id
JOIN INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE cu ON cu.TABLE_NAME = t.name AND cu.CONSTRAINT_NAME = c.name
WHERE c.type IN ('PK', 'UQ')
AND t.name = 'MyIndexTestTable'
This answer does not cover cases in which the primary key (or a unique constraint) contains multiple columns; in that case each column on its own is not unique. To handle it, you have to alter the query to exclude the columns of any key that has more than one column. Please comment if you also need this example.
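One possible way to handle the multi-column case is sketched below. It is not the answerer's query: it switches to the catalog views sys.indexes and sys.index_columns and keeps only unique indexes that have exactly one key column. It assumes the MyIndexTestTable example above lives in the dbo schema, and that your permissions make the table's metadata visible in the catalog views.

```sql
-- Columns that are unique on their own: the single key column
-- of a unique index (primary keys and UNIQUE constraints are
-- backed by unique indexes, so they are covered too).
SELECT c.name AS uniqueColumn,
       i.name AS indexName
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
    ON ic.object_id = i.object_id
   AND ic.index_id = i.index_id
JOIN sys.columns AS c
    ON c.object_id = ic.object_id
   AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID(N'dbo.MyIndexTestTable')
  AND i.is_unique = 1
  AND ic.is_included_column = 0
  -- keep only indexes whose key consists of exactly one column
  AND 1 = (SELECT COUNT(*)
           FROM sys.index_columns AS k
           WHERE k.object_id = i.object_id
             AND k.index_id = i.index_id
             AND k.is_included_column = 0);
```

For the example table this should list MyIndexTestTableId, code and otherUnique; a column that only appears in a two-column key would be left out.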
Something like sp_help has the column index_description. You can pull the logic for it out of the procedure itself:
sp_helptext sp_help
Grab it from the section of the procedure marked -- DISPLAY TABLE INDEXES & CONSTRAINTS.
This may work for you:
sp_help 'MyExampleTable'
The following query will give you a list of unique indexes and columns involved with that index. Modify the where clause accordingly:
SELECT sys.indexes.name, sys.columns.name
FROM sys.tables
INNER JOIN sys.indexes ON sys.tables.object_id = sys.indexes.object_id
INNER JOIN sys.index_columns ON sys.indexes.object_id = sys.index_columns.object_id
AND sys.indexes.index_id = sys.index_columns.index_id
INNER JOIN sys.columns ON sys.index_columns.object_id = sys.columns.object_id
AND sys.index_columns.column_id = sys.columns.column_id
WHERE sys.tables.name = 'table name'
AND sys.indexes.is_unique = 1

Improving the performance of an SQL query

I have a SQL Server table containing around 50,000,000 rows and I need to run the following two queries on it:
SELECT Count(*) AS Total
FROM TableName WITH(NOLOCK)
WHERE Col1 = 'xxx' AND Col2 = 'yyy'
then
SELECT TOP 1 Col3
FROM TableName WITH(NOLOCK)
WHERE Col1 = 'xxx' AND Col2 = 'yyy'
ORDER BY TableNameId DESC
The table has the following structure:
dbo.TableName
TableNameId int PK
Col1 varchar(12)
Col2 varchar(256)
Col3 int
Timestamp datetime
As well as running queries on it, there are loads of inserts every second going into the table hence the NOLOCK. I've tried creating the following index:
NONCLUSTERED INDEX (Col1, Col2) INCLUDE (TableNameId, Col3)
I need these queries to return results as quick as possible (1 second max). At this stage, I have the ability to restructure the table as the data isn't live yet, and I can also get rid of the Timestamp field if I need to.
Firstly, make TableNameId a key column of your index rather than an included column; then you can specify descending order for it in the index.
That should speed things up for your TOP 1 ... ORDER BY TableNameId DESC query.
Secondly - check up on how much time is I/O for example (SET STATISTICS IO ON) and how much is CPU (SET STATISTICS TIME ON).
If it is I/O there's not much you can do because you do have to move through a lot of data.
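A minimal sketch of both suggestions, using the table and column names from the question; the index name is made up for illustration.

```sql
-- TableNameId becomes the last key column, sorted descending, so
-- TOP 1 ... ORDER BY TableNameId DESC can read the first matching index row.
-- Col3 stays an included column because it is only returned, never filtered on.
CREATE NONCLUSTERED INDEX IX_TableName_Col1_Col2_TableNameId
    ON dbo.TableName (Col1, Col2, TableNameId DESC)
    INCLUDE (Col3);

-- Measure where the time goes; the figures appear in the Messages tab in SSMS.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT COUNT(*) AS Total
FROM dbo.TableName
WHERE Col1 = 'xxx' AND Col2 = 'yyy';

SELECT TOP 1 Col3
FROM dbo.TableName
WHERE Col1 = 'xxx' AND Col2 = 'yyy'
ORDER BY TableNameId DESC;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Compare the logical reads and CPU time before and after creating the index to see whether the change actually helps on your data.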
If you'd like a very fast estimate of how many rows are in the table, try querying sys.dm_db_partition_stats:
-- Report how many rows are in each table
SELECT o.name as [Table Name], ddps.row_count
FROM sys.indexes (nolock) AS i
INNER JOIN sys.objects (nolock) AS o ON i.OBJECT_ID = o.OBJECT_ID
INNER JOIN sys.dm_db_partition_stats (nolock) AS ddps ON i.OBJECT_ID = ddps.OBJECT_ID
AND i.index_id = ddps.index_id
WHERE i.index_id < 2
AND o.is_ms_shipped = 0 -- Remove to include system objects
AND ddps.row_count > 0
ORDER BY ddps.row_count DESC
If you have multiple partitions, this may not work. You might need to get the SUM of the row_count.
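For partitioned tables, a variant along these lines sums the per-partition counts. It is only a sketch: it reads sys.dm_db_partition_stats directly and filters system objects with OBJECTPROPERTY instead of joining sys.objects.

```sql
-- One row per table: add up row_count over all partitions
-- of the heap (index_id = 0) or clustered index (index_id = 1).
SELECT OBJECT_SCHEMA_NAME(ddps.object_id) AS SchemaName,
       OBJECT_NAME(ddps.object_id)        AS TableName,
       SUM(ddps.row_count)                AS TotalRows
FROM sys.dm_db_partition_stats AS ddps
WHERE ddps.index_id < 2
  AND OBJECTPROPERTY(ddps.object_id, 'IsMSShipped') = 0
GROUP BY ddps.object_id
ORDER BY TotalRows DESC;
```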
However, if you need an accurate count, you will need to count the rows, and this will take a while. Also, you may get the error "Could not continue scan with NOLOCK due to data movement."
You didn't mention how long your indexed query is running. The index and your query look fine to me.

What makes count(*) query to run for 30 sec? [duplicate]

I have a MS SQL table with over 250 million rows. Whenever I execute the following query
SELECT COUNT(*) FROM table_name
it takes over 30 seconds to return the output. Why does it take so long? Does it actually count the rows every time I query? Until now I assumed that the count is stored somewhere (probably in the table metadata, though I'm not sure whether table metadata even exists).
Also, I would like to know whether this query is I/O-, processor-, or memory-intensive.
Thanks
Every time you execute SELECT COUNT(*) against a table, SQL Server actually goes through the table and counts all rows. To get an estimated row count for one or more tables, you can run the following query, which reads stored information and returns in under a second.
SELECT OBJECT_NAME(OBJECT_ID) TableName, st.row_count
FROM sys.dm_db_partition_stats st
WHERE index_id < 2
ORDER BY st.row_count DESC
Read more about it here http://technet.microsoft.com/en-us/library/ms187737.aspx
No, SQL Server doesn't store this information. It computes it on every query. But it can cache the execution plan to improve performance. So, if you want to get results quickly, you need a primary key at least.
As for what SQL server is doing and how expensive it is, you can look this up yourself. In SSMS enable the execution plan button for the query and run a select count(*). You will see that the server actually does an index scan (full table scan). (I would have expected the PK to be used for that, but in my test case it used some other non-clustered index.).
To get a feel for the cost right-click your query editor window, select Query Options... -> Execution -> Advanced and activate the check boxes for SET STATISTICS TIME and SET STATISTICS IO. The messages tab will contain information regarding IO and timing after you re-executed the select statement.
Also note that a select count(*) is quite aggressive in terms of shared locks it uses. To guarantee the result the whole table will be locked with a shared lock.
A very fast, lock-free alternative is to use the meta data for the table. The count you get from the meta-data is almost always accurate, but there is no guarantee.
USE <database_name,,>
GO
SELECT ddps.row_count
FROM sys.indexes AS i
INNER JOIN sys.objects AS o
ON i.object_id = o.object_id
AND o.name = '<your_table,,>'
INNER JOIN sys.dm_db_partition_stats AS ddps
ON i.object_id = ddps.object_id
AND i.index_id = ddps.index_id
WHERE i.index_id < 2 -- 0 = heap, 1 = clustered index
This is an SSMS template. Copy it into a query window and hit CTRL+SHIFT+M to get a dialog that asks you for values for database_name and your_table.
If you are looking for approximate row counts for tables and you are on SQL Server 2005 or later, you can simply use:
SELECT t.NAME AS 'TableName'
,s.Name AS 'TableSchema'
,p.rows AS 'RowCounts'
FROM sys.tables t
INNER JOIN sys.schemas s
ON t.schema_id = s.schema_id
INNER JOIN sys.indexes i
ON t.OBJECT_ID = i.object_id
INNER JOIN sys.partitions p
ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
WHERE
t.is_ms_shipped = 0
GROUP BY
t.NAME, s.Name, p.Rows
ORDER BY
s.Name, t.Name
Doing a count(*) would only consume a small amount of memory/processor. It isn't that big of an operation in terms of database functions.

How can you identify the PK columns in a View

I used to use 'GetSchemaTable' to read schema information, but it was missing some 'stuff', so I wrote a big query, referencing, among other columns, sys.columns,sys.index_columns, and sys.indexes (and other tables) to return the same information I used to get from GetSchemaTable and also return the other pieces of information I want.
Problem is that GetSchemaTable will tell me if a column returned from a view is a Key column from the underlying tables but my new query does not. It'll give me the right answer all day long for tables, but not for views.
Does anyone have a solution to this? I'd hate to have to go back to GetSchemaTable just for that one bit of information when I'm examining a view. (Plus, I really just want a SQL-based solution, ideally.)
Thanks!
Unfortunately in SQL Server 2005 this is not very easy. I have played with this a bit, and it is very close, but it relies on the fact that you name your columns in your view exactly the same as they are named in the base table. This is because the now-deprecated-in-SQL-Server-2008 view sys.sql_dependencies does not properly store the referencing column_id, so there is no way to match this up with the actual columns in the view. I think SQL Server 2008 will have better options for you as they have yet again introduced a new set of dependency objects. I also didn't chase down any paths with INFORMATION_SCHEMA.KEY_COLUMN_USAGE but since these views rely solely on names and not id's of any kind you are likely in the same pickle there. So maybe this can be a start for you but like I said this will only cover the simple cases. If you alias your columns you will be out of luck. Maybe someone else with some insight into the intricacies of how these things are referenced will pull a rabbit out and figure out how to reference mismatched columns...
-- very simple; one-column key:
CREATE TABLE dbo.boo
(
far INT PRIMARY KEY
);
GO
CREATE VIEW dbo.view_boo
AS
SELECT far FROM dbo.boo;
GO
-- slightly more complex. Two-column key,
-- not all columns are in key, view columns
-- are in different order:
CREATE TABLE dbo.foo
(
splunge INT,
a INT,
mort INT,
PRIMARY KEY(splunge, mort)
);
GO
CREATE VIEW dbo.view_foo
AS
SELECT
splunge,
mort,
a
FROM
dbo.foo;
GO
SELECT
QUOTENAME(OBJECT_SCHEMA_NAME(v.[object_id])) + '.'
+ QUOTENAME(v.name) + '.' + QUOTENAME(vc.name)
+ ' references '
+ QUOTENAME(OBJECT_SCHEMA_NAME(t.[object_id]))
+ '.' + QUOTENAME(t.name) + '.' + QUOTENAME(tc.name)
FROM
sys.views AS v
INNER JOIN
sys.sql_dependencies AS d
ON v.[object_id] = d.[object_id]
INNER JOIN
sys.tables AS t
ON d.referenced_major_id = t.[object_id]
INNER JOIN
sys.columns AS tc
ON tc.[object_id] = t.[object_id]
INNER JOIN
sys.index_columns AS ic
ON tc.[object_id] = ic.[object_id]
AND tc.column_id = ic.column_id
AND tc.column_id = d.referenced_minor_id
INNER JOIN
sys.columns AS vc
ON vc.[object_id] = v.[object_id]
AND vc.name = tc.name -- the part I don't like
INNER JOIN
sys.indexes AS i
ON ic.[object_id] = i.[object_id]
AND ic.index_id = i.index_id -- match the index itself, not just the table
AND i.is_primary_key = 1
ORDER BY
t.name,
ic.key_ordinal;
GO
DROP VIEW dbo.view_boo, dbo.view_foo;
DROP TABLE dbo.foo, dbo.boo;

How can I tell if an index contains a column of type varchar(max)?

I'm working on my MSSQL index defragmentation script. Certain kinds of indexes can be rebuilt online, and other kinds can't.
For clustered indexes, it's easy enough to see if the table contains any LOB columns, but for a non-clustered index I need to specifically know if there is any LOB columns covered by that specific index.
I used to be able to do this by looking at the alloc_unit_type_desc in dm_db_index_physical_stats, but this doesn't work for columns of type varchar(max) and xml.
This isn't for my database, so I don't want to get into a discussion over whether or not the index is appropriate, let's just accept that it exists and that I'd like the script to be able to handle this situation.
Does anyone know what kind of SQL I can write to check for this? Assume I have all the relevant object ids and object names in scalar variables.
If you have a varchar or nvarchar column declared with max length, it will have an entry in the sys.columns view with the appropriate system type id for the column and -1 as the max length.
So, if you want to find all the indexes that contain a varchar(max) column (system type id 167), you would do this:
select distinct
si.*
from
sys.indexes as si
inner join sys.index_columns as ic on
ic.object_id = si.object_id and
ic.index_id = si.index_id
inner join sys.columns as sc on
sc.object_id = ic.object_id and
sc.column_id = ic.column_id
where
sc.system_type_id = 167 and
sc.max_length = -1
I think that for "max" columns, the length or size field in the sys.columns view should be -1. I don't have the documentation in front of me, but let me know if this works.
Be careful, folks. The Clustered Index is a different animal when it comes to LOBs. Let's do a test to see what I mean.
First, let's setup a test table. No data is needed for this test but we do have a Clustered Index (IndexID=1) as the PK. We also have a Non Clustered Index (IndexID=2) that contains no LOB columns as an INCLUDE and we also have a Non Clustered Index that does contain an LOB column as an INCLUDE. Here's the test setup code...
--========================================================================
-- Test Setup
--========================================================================
--===== If the test table already exists,
-- drop it to make reruns in SSMS easier.
IF OBJECT_ID('dbo.IndexTest','U') IS NOT NULL
DROP TABLE dbo.IndexTest
;
GO
--===== Create the test table
CREATE TABLE dbo.IndexTest
(
SomeID INT IDENTITY(1,1)
,SomeInt INT
,SomeLOB1 VARCHAR(MAX)
,CONSTRAINT PK_IndexTest_Has_LOB
PRIMARY KEY CLUSTERED (SomeID)
)
;
--===== Add an index that has no INCLUDE of a LOB
CREATE INDEX IX_Has_No_LOB
ON dbo.IndexTest (SomeInt)
;
--===== Add an index that INCLUDEs a LOB
CREATE INDEX IX_Includes_A_LOB
ON dbo.IndexTest (SomeInt) INCLUDE (SomeLOB1)
;
Now, let's try the code that uses sys.index_columns to find indexes that contain LOBs. I've commented out the system_type_id in the WHERE clause to open it up a bit...
--========================================================================
-- Test for LOBs using sys.index_columns.
--========================================================================
select distinct
si.*
from
sys.indexes as si
inner join sys.index_columns as ic on
ic.object_id = si.object_id and
ic.index_id = si.index_id
inner join sys.columns as sc on
sc.object_id = ic.object_id and
sc.column_id = ic.column_id
where
--sc.system_type_id = 167 and
sc.max_length = -1
;
Here's the output from the run above...
object_id name index_id type type_desc ...
----------- ----------------- ----------- ---- ------------ ...
163204448 IX_Includes_A_LOB 3 2 NONCLUSTERED ...
It couldn't tell that the Clustered Index contains an LOB because the LOB is not one of the index columns. Trying to rebuild this Clustered Index will cause a failure.
ALTER INDEX PK_IndexTest_Has_LOB
ON dbo.IndexTest REBUILD WITH (ONLINE = ON)
;
Msg 2725, Level 16, State 2, Line 1 Online index operation cannot be
performed for index 'PK_IndexTest_Has_LOB' because the index contains
column 'SomeLOB1' of data type text, ntext, image, varchar(max),
nvarchar(max), varbinary(max) or xml. For non-clustered index the
column could be an include column of the index, for clustered index it
could be any column of the table. In case of drop_existing the column
could be part of new or old index. The operation must be performed
offline.
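As the error message says, for a clustered index any LOB column of the table counts, not just the index columns. A simple way to cover that case, sketched here against the dbo.IndexTest table from the setup above, is to check every column of the table in sys.columns rather than going through sys.index_columns:

```sql
-- Any row returned means the clustered index (or heap) of this table
-- carries LOB data, whether or not the column is an index key.
SELECT c.name                        AS ColumnName,
       TYPE_NAME(c.system_type_id)   AS TypeName,
       c.max_length
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'dbo.IndexTest')
  AND (   c.max_length = -1   -- varchar(max), nvarchar(max), varbinary(max), xml
       OR TYPE_NAME(c.system_type_id) IN ('text', 'ntext', 'image'));
```

For the test table this should return the SomeLOB1 column.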
With a tip of the hat to Remus Rusanu (system wouldn't let me post the link)...
... we can try something a bit different. Every index (clustered, non-clustered, or HEAP) shows up as an allocation unit and will also identify in-row data, out-of-row data, and LOBs. The following code finds ALL indexes that have an LOB associated with them... even the Clustered Index.
--===== Find all indexes that contain any type of LOB
SELECT SchemaName = OBJECT_SCHEMA_NAME(p.object_id)
,ObjectName = OBJECT_NAME(p.object_id)
,IndexName = si.name
,p.object_id
,p.index_id
,au.type_desc
FROM sys.system_internals_allocation_units au --Has allocation type
JOIN sys.system_internals_partitions p --Has an Index_ID
ON au.container_id = p.partition_id
JOIN sys.indexes si --For the name of the index
ON si.object_id = p.object_id
AND si.index_id = p.index_id
WHERE p.object_id = OBJECT_ID('IndexTest')
AND au.type_desc = 'LOB_DATA'
;
That produces the following output for this particular test. Notice that it did pick up on the Clustered Index by object_id and index_id where the code based on sys.index_columns didn't.
SchemaName ObjectName IndexName object_id index_id type_desc
---------- ---------- -------------------- --------- -------- ---------
dbo IndexTest PK_IndexTest_Has_LOB 163204448 1 LOB_DATA
dbo IndexTest IX_Includes_A_LOB 163204448 3 LOB_DATA
You could also check the DMV sys.dm_db_index_physical_stats.
It has an alloc_unit_type_desc column that tells you whether the index has LOB_DATA or not.
SELECT S.name as 'Schema',
T.name as 'Table',
I.name as 'Index',
DDIPS.avg_fragmentation_in_percent,
DDIPS.page_count,
DDIPS.alloc_unit_type_desc
FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, 'LIMITED') AS DDIPS
INNER JOIN sys.tables T on T.object_id = DDIPS.object_id
INNER JOIN sys.schemas S on T.schema_id = S.schema_id
INNER JOIN sys.indexes I ON I.object_id = DDIPS.object_id
AND DDIPS.index_id = I.index_id
WHERE DDIPS.database_id = DB_ID()
and I.name is not null
and DDIPS.alloc_unit_type_desc = 'LOB_DATA'