Sitecore IDTable primary key - sql

In Sitecore there's a table called IDTable with the following structure:
CREATE TABLE [dbo].[IDTable](
[ID] [uniqueidentifier] NOT NULL DEFAULT (newid()),
[Prefix] [varchar](255) NOT NULL,
[Key] [varchar](255) NOT NULL,
[ParentID] [uniqueidentifier] NOT NULL,
[CustomData] [varchar](255) NOT NULL
) ON [PRIMARY]
It has the following indexes:
CREATE UNIQUE NONCLUSTERED INDEX [ndxID] ON [dbo].[IDTable]
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
CREATE NONCLUSTERED INDEX [ndxPrefixKey] ON [dbo].[IDTable]
(
[Prefix] ASC,
[Key] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
It is used to map IDs to keys for a custom data provider, so the table is hit almost constantly when custom items are requested, and very heavily when indexes are being built. One of the most common lookups is by the ID column.
My question is: why would it have been decided that this table needs no primary key? And what are the pros and cons of adding one?

You're asking a "why" question that could be difficult to answer. But I'll offer some information that might help you out.
<IDTable type="Sitecore.Data.$(database).$(database)IDTable, Sitecore.Kernel" singleInstance="true">
<param connectionStringName="master"/>
<param desc="cacheSize">500KB</param>
</IDTable>
First of all, the IDTable isn't taking the direct hit from the lookups all the time. Sitecore has a caching mechanism around the IDTable, defined in web.config as shown above. You can and should increase this cache size if your data provider serves large quantities of data.
That being said, it probably wouldn't be harmful to add an index to this table. I just don't think you would gain all that much. It depends, I guess: how big is the data set your data provider serves?
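For reference, here is a minimal sketch of what adding a primary key could look like. The constraint name PK_IDTable is hypothetical, and changing a vendor-owned schema is an assumption you would want to verify against your Sitecore version first. Because ndxID is already a unique index on ID, a nonclustered primary key on the same column enforces nothing new. The real trade-off is whether the table should get a clustered index, and a clustered key on a random newid() value tends to cause page splits on insert.

```sql
-- Hypothetical constraint name; not part of the stock Sitecore schema.
-- NONCLUSTERED keeps the table a heap, so random newid() values do not
-- force rows into key order on insert.
ALTER TABLE [dbo].[IDTable]
ADD CONSTRAINT [PK_IDTable] PRIMARY KEY NONCLUSTERED ([ID]);

-- The primary key builds its own unique index on [ID], so the existing
-- unique index becomes a duplicate and could be dropped afterwards.
DROP INDEX [ndxID] ON [dbo].[IDTable];
```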

How to turn off Sorting in group by function as it's taking much resource & not required in my case

I have a SQL query that uses GROUP BY, but I don't need the result sorted.
I am joining two of these three tables:
roleMstr
aclMstr
roleAclMap
'roleMstr' and 'aclMstr' are many-to-many, and the mapping is stored in 'roleAclMap'.
I am trying to fetch the ACLs that are assigned to a role and are active (rows are soft-deleted), and group them to find the assigned and total counts.
When I checked in SQL Server Profiler, the sort takes 68% and the index scan 32%.
BEGIN
DECLARE @RoleCode VARCHAR(20);
SET @RoleCode = 'CLN';
SELECT am.aclGroup, am.subAclGroup, SUM(CASE ram.roleCode WHEN @RoleCode THEN 1 ELSE 0 END) AS assignedChild, COUNT(*) AS totalChild
FROM aclMstr am
LEFT JOIN roleAclMap ram
ON am.aclCode = ram.aclCode AND ram.roleCode = @RoleCode
WHERE am.isActive = 1
group by am.aclGroup, am.subAclGroup;
END
Query plan executor text link1, link2
I am counting how many ACLs are assigned to each ACL group for a given role code.
The result also comes back sorted by the grouping columns.
roleMstr contains roleCode
CONSTRAINT [pk_rolemstr_rolecode] PRIMARY KEY CLUSTERED
(
[roleCode] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
aclMstr contains aclCode, isActive, aclGroup, subAclGroup
CONSTRAINT [pk_aclmstr_aclcode] PRIMARY KEY CLUSTERED
(
[aclCode] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[roleAclMap](
[id] [int] IDENTITY(1,1) NOT NULL,
[roleCode] [varchar](20) NOT NULL,
[aclCode] [varchar](50) NOT NULL,
CONSTRAINT [pk_roleaclmap_id] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY],
CONSTRAINT [uk_roleaclmap_rolecode_aclcode] UNIQUE NONCLUSTERED
(
[roleCode] ASC,
[aclCode] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[roleAclMap] WITH CHECK ADD CONSTRAINT [fk_roleaclmap_aclmstr_aclcode] FOREIGN KEY([aclCode])
REFERENCES [dbo].[aclMstr] ([aclCode])
GO
ALTER TABLE [dbo].[roleAclMap] CHECK CONSTRAINT [fk_roleaclmap_aclmstr_aclcode]
GO
ALTER TABLE [dbo].[roleAclMap] WITH CHECK ADD CONSTRAINT [fk_roleaclmap_rolemstr_rolecode] FOREIGN KEY([roleCode])
REFERENCES [dbo].[roleMstr] ([roleCode])
GO
ALTER TABLE [dbo].[roleAclMap] CHECK CONSTRAINT [fk_roleaclmap_rolemstr_rolecode]
GO
If the sort could somehow be turned off in the query, it would run in about half the time.
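A sketch of one thing to try, assuming the sort in the plan feeds a Stream Aggregate for the GROUP BY: the HASH GROUP query hint asks the optimizer to aggregate with a hash table, which needs no sorted input. The column name aclCode is taken from the table definitions above. Whether this is actually faster depends on the data, and GROUP BY never guarantees output order either way.

```sql
DECLARE @RoleCode VARCHAR(20);
SET @RoleCode = 'CLN';

SELECT am.aclGroup,
       am.subAclGroup,
       SUM(CASE ram.roleCode WHEN @RoleCode THEN 1 ELSE 0 END) AS assignedChild,
       COUNT(*) AS totalChild
FROM aclMstr am
LEFT JOIN roleAclMap ram
    ON am.aclCode = ram.aclCode AND ram.roleCode = @RoleCode
WHERE am.isActive = 1
GROUP BY am.aclGroup, am.subAclGroup
OPTION (HASH GROUP);  -- hash aggregate instead of sort + stream aggregate
```

Another option, equally untested here, would be an index on aclMstr leading with (aclGroup, subAclGroup), so that rows already arrive in grouping order and no sort is needed.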

SQL one to many relation table design, the right way

Plot: I need to book an Order which relies on 3 different types of Factory.
I have individual tables for both Order and Factory. Now I need to model a one-to-many relation between an Order and the Factories booked for it.
Order Table:
CREATE TABLE [dbo].[tbl_OrderInformation](
[OrderInformationId] [int] IDENTITY(1,1) NOT NULL,
[OrderId] [int] NOT NULL,
[OrderNo] [nvarchar](50) NOT NULL,
CONSTRAINT [PK_tbl_OrderInformation] PRIMARY KEY CLUSTERED
( [OrderInformationId] ASC)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE =
OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Factory Table:
CREATE TABLE [dbo].[tbl_Factory](
[FactoryId] [int] IDENTITY(1,1) NOT NULL,
[FactoryName] [nvarchar](50) NOT NULL,
[FactoryType] [nvarchar](50) NOT NULL,
CONSTRAINT [PK_tbl_Factory] PRIMARY KEY CLUSTERED
( [FactoryId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Sample Order Data
Sample Factory Data
Now, an Order relies on multiple garments, dyeing, and printing factories.
Suppose Order C101 relies on Garments-A, Dyeing-A, Printing-A, Printing-B, Printing-C.
I can design the OrderBooking table in 2 ways.
CREATE TABLE [dbo].[tbl_OrderBooking_1](
[OrderBookingId] [INT] IDENTITY(1,1) NOT NULL,
[OrderId] [INT] NOT NULL,
[FactoryId] [INT] NULL,
[FactoryType] [NVARCHAR](50) NULL,
CONSTRAINT [PK_tbl_OrderBooking_1] PRIMARY KEY CLUSTERED
(
[OrderBookingId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
With this design the data will look like below:
And the second way:
CREATE TABLE [dbo].[tbl_OrderBooking_2](
[OrderBookingId] [INT] IDENTITY(1,1) NOT NULL,
[OrderId] [INT] NULL,
[garmentsFactoryId] [INT] NULL,
[dyeingFactoryId] [INT] NULL,
[printingFactoryId] [INT] NULL,
CONSTRAINT [PK_tbl_OrderBooking_2] PRIMARY KEY CLUSTERED
(
[OrderBookingId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Here the data will look like this:
Now, which approach to designing the OrderBooking table is more appropriate, and why?
Please keep in mind that the number of factory types is fixed at 3, and that the OrderBooking table will grow quite large over time and so will see heavy read and write operations.
A link table between Order and Factory (the first design) would be the best approach.
You will get better performance, and you can create indexes on the numeric key columns.
Going forward, if you have a new factory, it is also easy to insert without any issue.
The link table will also keep you aligned with the normalization rules, so my suggestion is to go with that.
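A minimal sketch of such a link table, under a few assumptions: the table and constraint names are hypothetical, FactoryType is left out because it can be read from tbl_Factory, and the unique constraint assumes a factory is booked at most once per order. No foreign key to tbl_OrderInformation is shown because OrderId is not declared unique there.

```sql
-- Hypothetical names; one row per (order, factory) booking.
CREATE TABLE [dbo].[tbl_OrderBooking](
    [OrderBookingId] [INT] IDENTITY(1,1) NOT NULL,
    [OrderId] [INT] NOT NULL,
    [FactoryId] [INT] NOT NULL,
    CONSTRAINT [PK_tbl_OrderBooking] PRIMARY KEY CLUSTERED ([OrderBookingId] ASC),
    CONSTRAINT [FK_tbl_OrderBooking_Factory] FOREIGN KEY ([FactoryId])
        REFERENCES [dbo].[tbl_Factory] ([FactoryId]),
    -- Doubles as the index used to look up all factories for an order
    CONSTRAINT [UQ_tbl_OrderBooking_Order_Factory] UNIQUE NONCLUSTERED ([OrderId] ASC, [FactoryId] ASC)
) ON [PRIMARY];
GO

-- All factories booked for one order (101 is a made-up OrderId),
-- with the type read from tbl_Factory instead of being stored twice.
SELECT b.[OrderId], f.[FactoryType], f.[FactoryName]
FROM [dbo].[tbl_OrderBooking] AS b
INNER JOIN [dbo].[tbl_Factory] AS f ON f.[FactoryId] = b.[FactoryId]
WHERE b.[OrderId] = 101;
```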

Index fragmentation SQL Server

I have a problem with the indexes on two tables.
Here is code for creating the tables:
CREATE TABLE [dbo].[Table]
(
[ID] [uniqueidentifier] NOT NULL,
[IP] [nvarchar](15) NULL,
[Referrer] [nvarchar](1000) NULL,
[Domain] [nvarchar](100) NULL,
[RegID] [int] NULL,
[Agent] [nvarchar](500) NULL,
CONSTRAINT [PK_Table] PRIMARY KEY CLUSTERED
([ID] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[Table]
ADD CONSTRAINT [DF_Table_ID] DEFAULT (newsequentialid()) FOR [ID]
GO
And index
CREATE NONCLUSTERED INDEX [Reg_ID] ON [dbo].[Table]
(
[RegID] ASC
)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
And another table with index
CREATE TABLE [dbo].[Table2]
(
[Table2_ID] [int] IDENTITY(1,1) NOT NULL,
[TracID] [uniqueidentifier] NOT NULL,
[F_URL] [nvarchar](1500) NULL,
[S_URL] [nvarchar](100) NULL,
[Time] [datetime] NULL,
CONSTRAINT [PK_Table2] PRIMARY KEY CLUSTERED ([Table2_ID] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[Table2] WITH CHECK
ADD CONSTRAINT [FK_Table2_Table]
FOREIGN KEY([TracID]) REFERENCES [dbo].[Table] ([ID])
GO
ALTER TABLE [dbo].[Table2] CHECK CONSTRAINT [FK_Table2_Table]
GO
Index
CREATE NONCLUSTERED INDEX [IX_TracID] ON [dbo].[Table2]
(
[TracID] ASC
)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
The first table has about 6M rows and the second about 8M rows (each grows by a couple of thousand rows a day).
My problem is that the indexes become up to 99% fragmented within 4 hours.
I queried sys.columns to get the column sizes in bytes; here are the results:
Table 1                  Table 2
name       bytes         name     bytes
ID         16            ID       4
IP         30            TracID   16
Referrer   2000          F_URL    3000
Domain     200           S_URL    200
RegID      4             Time     8
Agent      1000
Does anyone have an idea that could help me fix this fragmentation?
Are you SURE you need to defragment? With proper hardware, fragmentation rarely matters anymore. Many old-school SQL people still recommend it, but in most cases it is a relic of the past.
There are two reasons it has become largely irrelevant. First, all reads should be served from cache in RAM (if not, you need more RAM; it's cheap and will give you WAY more bang for the buck than effort spent defragmenting). Second, SSDs all but eliminate seek times anyway, so fragmentation costs very little. As a result of these two changes, the time spent defragmenting is usually wasted.
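To judge whether the reported fragmentation is worth acting on, it can help to look at the page count next to the percentage, since a high percentage on a small index means little. A minimal sketch using the sys.dm_db_index_physical_stats function, assuming it is run in the database that holds the table from the question:

```sql
-- 'LIMITED' is the cheapest scan mode; it reads only the upper index levels.
SELECT i.name AS index_name,
       ps.index_type_desc,
       ps.avg_fragmentation_in_percent,
       ps.page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'[dbo].[Table]'), NULL, NULL, 'LIMITED') AS ps
INNER JOIN sys.indexes AS i
    ON i.object_id = ps.object_id
   AND i.index_id = ps.index_id
ORDER BY ps.page_count DESC;
```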

SQL Table Performance - Large Table

I have a very large table ~55,000,000 records.
Indexes have been added to the most commonly used columns, but the table is still very slow.
Are there any suggestions as to how the table's performance could be improved?
I have thought about partitioning the table, but was not sure it was necessary.
--Table
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[EngineRecord](
[Id] [uniqueidentifier] NOT NULL,
[CreateDate] [datetime] NOT NULL,
[ChangeDate] [datetime] NOT NULL,
[CompanyId] [uniqueidentifier] NOT NULL,
[DriverEmployeeId] [uniqueidentifier] NOT NULL,
[EobrDeviceId] [uniqueidentifier] NOT NULL,
[EobrTimestampUtc] [datetime] NOT NULL,
[EobrOverallStatus] [int] NOT NULL,
[Speedometer] [decimal](14, 4) NOT NULL,
[Odometer] [decimal](14, 4) NOT NULL,
[Tachometer] [decimal](14, 4) NOT NULL,
[GpsTimestampUtc] [datetime] NULL,
[GpsLatitude] [decimal](18, 8) NULL,
[GPSLongitude] [decimal](18, 8) NULL,
[RecordType] [int] NOT NULL,
[FuelEconomyAverage] [decimal](8, 4) NOT NULL,
[FuelEconomyInstant] [decimal](8, 4) NOT NULL,
[FuelUseTotal] [decimal](14, 4) NOT NULL,
[BrakePressure] [decimal](8, 4) NOT NULL,
[CruiseControlSet] [bit] NOT NULL,
[TransmissionAttained] [nvarchar](2) NULL,
[TransmissionSelected] [nvarchar](2) NULL,
[IsProcessed] [bit] NOT NULL,
[LastChangedByUserId] [uniqueidentifier] NOT NULL,
CONSTRAINT [PK_EngineRecord] PRIMARY KEY NONCLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = ON, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY],
CONSTRAINT [NK_EngineRecord] UNIQUE CLUSTERED
(
[CompanyId] ASC,
[EobrDeviceId] ASC,
[EobrTimestampUtc] ASC
)WITH (PAD_INDEX = ON, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[EngineRecord] WITH NOCHECK ADD CONSTRAINT [FK_EngineRecord_CompanyLevel] FOREIGN KEY([CompanyId])
REFERENCES [dbo].[CompanyLevel] ([Id])
GO
ALTER TABLE [dbo].[EngineRecord] CHECK CONSTRAINT [FK_EngineRecord_CompanyLevel]
GO
ALTER TABLE [dbo].[EngineRecord] WITH NOCHECK ADD CONSTRAINT [FK_EngineRecord_Employee] FOREIGN KEY([DriverEmployeeId])
REFERENCES [dbo].[Employee] ([Id])
ON DELETE CASCADE
GO
ALTER TABLE [dbo].[EngineRecord] CHECK CONSTRAINT [FK_EngineRecord_Employee]
GO
ALTER TABLE [dbo].[EngineRecord] WITH NOCHECK ADD CONSTRAINT [FK_EngineRecord_EobrDevice] FOREIGN KEY([EobrDeviceId])
REFERENCES [dbo].[EobrDevice] ([Id])
GO
ALTER TABLE [dbo].[EngineRecord] CHECK CONSTRAINT [FK_EngineRecord_EobrDevice]
GO
---------------------
--Indexes/Constraints
---------------------
ALTER TABLE [dbo].[EngineRecord] ADD CONSTRAINT [PK_EngineRecord] PRIMARY KEY NONCLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = ON, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [NC_EngineRecord_Employee] ON [dbo].[EngineRecord]
(
[DriverEmployeeId] ASC
)WITH (PAD_INDEX = ON, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [NC_RecordType] ON [dbo].[EngineRecord]
(
[RecordType] ASC
)WITH (PAD_INDEX = ON, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]
GO
ALTER TABLE [dbo].[EngineRecord] ADD CONSTRAINT [NK_EngineRecord] UNIQUE CLUSTERED
(
[CompanyId] ASC,
[EobrDeviceId] ASC,
[EobrTimestampUtc] ASC
)WITH (PAD_INDEX = ON, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_EngineRecord_DBA] ON [dbo].[EngineRecord]
(
[CompanyId] ASC,
[GpsLatitude] ASC,
[GPSLongitude] ASC
)
INCLUDE ( [EobrDeviceId],
[EobrTimestampUtc]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [NC_IsProcessed] ON [dbo].[EngineRecord]
(
[IsProcessed] ASC
)WITH (PAD_INDEX = ON, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]
GO
EDIT:
Here is a sproc that takes some time to run that is used often.
CREATE PROCEDURE [dbo].[EngineRecord__GetEobrListToProcessByRecordType]
@RecordTypeEnum int
AS
DECLARE @ChangeHistory bit -- dummy variable for VS 2008 database project
SET NOCOUNT ON
SELECT EobrDevice.[Id] as EobrDeviceId,
EobrDevice.[UnitId],
CompanyGroupRoot.[Id] as CGRootId,
CompanyGroup.[Id] as CompanyGroupId,
EobrDevice.[CompanyId]
FROM dbo.EobrDevice
INNER JOIN dbo.CompanyLevel ON EobrDevice.[CompanyId] = CompanyLevel.[Id]
INNER JOIN dbo.CompanyGroup ON CompanyLevel.ParentGroupId = CompanyGroup.[Id]
INNER JOIN dbo.CompanyGroupRoot ON CompanyGroup.CGRootId = CompanyGroupRoot.[Id]
WHERE EobrDevice.[Id] IN ( SELECT DISTINCT EngineRecord.EobrDeviceId FROM dbo.EngineRecord WHERE IsProcessed = 0 AND RecordType = @RecordTypeEnum )
AND EobrDevice.UnitId IS NOT NULL
EDIT 2:
This is something we run every night to purge out old records. This always takes a lot of time.
DECLARE @dt6MonthsPrior datetime
SET @dt6MonthsPrior = DATEADD(m, -6, getdate())
SELECT * FROM EngineRecord
WHERE EngineRecord.EobrTimeStampUtc < @dt6MonthsPrior
ORDER BY EobrTimestampUtc ASC
None of the fields in your WHERE criteria are contained in an index. Indexing those fields will help. The efficacy of your other indices is impossible to determine without a more thorough understanding of how the table is used.
If you really wanted this query to fly you could have a clustered index on Odometer and Tachometer, but that's probably not reasonable given the table's other uses.
Update:
Your 2nd stored proc doesn't seem like it should be terribly slow. The only thing that looks likely to help it is an index on the date.
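The index on the date could be sketched as follows. The index name is hypothetical. The batched DELETE is also an assumption: the question only shows the SELECT that finds the old rows, and the batch size of 5000 is an arbitrary starting point.

```sql
-- Hypothetical index name; lets the purge seek on the timestamp alone
-- instead of scanning the clustered index (which leads with CompanyId).
CREATE NONCLUSTERED INDEX [IX_EngineRecord_EobrTimestampUtc]
ON [dbo].[EngineRecord] ([EobrTimestampUtc] ASC);
GO

DECLARE @dt6MonthsPrior datetime;
SET @dt6MonthsPrior = DATEADD(m, -6, GETDATE());

-- Delete in small batches so each transaction stays short.
WHILE 1 = 1
BEGIN
    DELETE TOP (5000)
    FROM [dbo].[EngineRecord]
    WHERE [EobrTimestampUtc] < @dt6MonthsPrior;

    -- Stop once a pass finds nothing left to delete
    IF @@ROWCOUNT = 0 BREAK;
END
```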
55 million records isn't that big these days. I'm no expert on partitioning, but I don't think you'd see much, if any, improvement from partitioning your table. I usually don't bother unless I expect a table to exceed a few hundred million records, although in a production environment partitioning has other benefits.
Are you certain hardware is not responsible for the poor performance you're seeing? There are a host of settings/features in SQL Server that affect performance as well.
An index like this might help this specific query:
CREATE INDEX x ON dbo.EngineRecord(Odometer, Tachometer) WHERE FuelUseTotal IS NOT NULL;
This will help most if you stop ordering by the timestamp.
Do you know how to get execution plans? You have no index on tach or odo or FuelUse, so your sample query will result in a full table scan. In SQL Server Management Studio, right-click in the query window, select "Include Actual Execution Plan" and then run your query. You will see output that explains the steps SQL Server has to perform to actually run your query. This can be very instructive once you take the time to understand an execution plan.
Also, you might want to investigate covering indexes. These can make a dramatic difference if you have some queries that you use frequently. Of course, like any index, they add overhead when you insert or delete rows.
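As an illustration of a covering index, here is a sketch aimed at the subquery in the stored procedure from the question, which filters on IsProcessed and RecordType and returns only EobrDeviceId. The index name is hypothetical, and whether it beats the two existing single-column indexes would need to be checked in the execution plan.

```sql
-- Hypothetical name. Both filter columns form the key, so the subquery
-- can be answered by a single seek without touching the base table.
CREATE NONCLUSTERED INDEX [IX_EngineRecord_RecordType_IsProcessed]
ON [dbo].[EngineRecord] ([RecordType] ASC, [IsProcessed] ASC)
-- EobrDeviceId is part of the clustered key NK_EngineRecord, so it is
-- stored in every nonclustered index anyway; INCLUDE just states the intent.
INCLUDE ([EobrDeviceId]);
```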
Indexing the fields in the WHERE clause, as Goat CO suggests, is a good start. I would also advise moving the WHERE condition into the first INNER JOIN, so that the intermediate result used for further processing is already much smaller after the first INNER JOIN (I've seen it do wonders for performance):
SELECT EobrDevice.[Id] as EobrDeviceId,
EobrDevice.[UnitId],
CompanyGroupRoot.[Id] as CGRootId,
CompanyGroup.[Id] as CompanyGroupId,
EobrDevice.[CompanyId]
FROM dbo.EobrDevice
INNER JOIN dbo.CompanyLevel
ON EobrDevice.UnitId IS NOT NULL
AND EobrDevice.[CompanyId] = CompanyLevel.[Id]
AND EobrDevice.[Id] IN (
SELECT DISTINCT EngineRecord.EobrDeviceId
FROM dbo.EngineRecord
WHERE IsProcessed = 0
AND RecordType = @RecordTypeEnum
)
INNER JOIN dbo.CompanyGroup ON CompanyLevel.ParentGroupId = CompanyGroup.[Id]
INNER JOIN dbo.CompanyGroupRoot ON CompanyGroup.CGRootId = CompanyGroupRoot.[Id]
I've also moved the EobrDevice.UnitId IS NOT NULL condition to be checked first so that checking with other tables and running the subquery only happens when that condition is met.
Partitioning the indexes should have an impact on your performance, but the partitions have to be placed on appropriate separate drives. You give no information about your hardware (what are you using: NAS? SAS drives? ...).
Also, normalization is not always the best choice given the objectives of your process, especially for analytics purposes. Denormalizing some fields (CompanyLevel, CompanyGroup) into your main table would have a better impact on your selection.
Well, every master chef has his own kitchen, so let's skip this discussion...
The way your indexes are built does not fit the way you purge your data. You may get better performance if you change
[EobrTimestampUtc] ASC
to
[EobrTimestampUtc] DESC
which will affect the index seek for EngineRecord.EobrTimeStampUtc < @dt6MonthsPrior

Is this index redundant?

A recent review of a fairly high traffic table defined as follows:
CREATE TABLE [dbo].[SomeTable](
[Id] [bigint] IDENTITY(1,1) NOT NULL,
[SomeId] [bigint] NOT NULL,
[Time] [time](0) NOT NULL,
[InsertTime] [datetime] NOT NULL,
[SequenceNumber] [int] NOT NULL,
[OtherId] [int] NULL,
CONSTRAINT [PK_Tracks] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
reveals the following index definitions:
CREATE NONCLUSTERED INDEX [i1] ON [dbo].[SomeTable]
(
[SomeId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
-and-
CREATE NONCLUSTERED INDEX [i2] ON [dbo].[SomeTable]
(
[SomeId] ASC,
[OtherId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
This area isn't really my strength, but isn't index i1 superfluous?
Maybe, maybe not. If you were the optimizer, which index would you use for the following query:
select [SomeId]
from [dbo].[SomeTable]
If that query is the kind that's critical to your application and the table is large, having that targeted index could be useful. But you're right in that any query that could be satisfied by i1 could also be satisfied (perhaps more expensively) by i2.
Yes, it is redundant. You'll often end up with this kind of situation when someone adds a new index without reviewing whether it makes any existing indices redundant.
You'll find it worthwhile reading this post that describes circumstances where the apparently redundant index would be useful. However, since your table doesn't contain large columns, it won't apply to you.
For reference, this blog describes how you can eliminate redundant indices from your databases.
When weighing whether the index is redundant, you should consider not only the width of the extra column in i2 but also its granularity relative to the first column.
If OtherId takes many distinct values for each value of SomeId, queries that filter only on SomeId will take longer without the seemingly redundant index than with it.
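Before dropping i1, it may be worth checking whether the optimizer actually uses it. A minimal sketch based on the sys.dm_db_index_usage_stats view, assuming it is run in the database that holds the table. The counters reset when the instance restarts, so they only reflect activity since then.

```sql
-- An index with NULL or zero seeks/scans/lookups but a growing
-- user_updates count is being maintained without being read.
SELECT i.name AS index_name,
       us.user_seeks,
       us.user_scans,
       us.user_lookups,
       us.user_updates
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS us
    ON us.object_id = i.object_id
   AND us.index_id = i.index_id
   AND us.database_id = DB_ID()
WHERE i.object_id = OBJECT_ID(N'[dbo].[SomeTable]')
  AND i.name IN (N'i1', N'i2');
```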