I recently discovered included columns in SQL Server indexes. Do included columns in an index take up extra memory or are they stored on disk?
Also, can someone point me to the performance implications of including columns of differing data types as included columns in a primary key, which in my case is typically an int?
Thanks.
I don't fully understand the question: "Do included columns in an index take up extra memory or are they stored on disk?" Indexes are both stored on disk (for persistence) and in memory (for performance when being used).
The answer to your question is that the non-key columns are stored in the index, and hence are stored both on disk and in memory, along with the rest of the index. Included columns do have a significant advantage over key columns in the index. To understand this advantage, you have to understand that key values may be stored more than once in a B-tree index structure: they are used both as "nodes" in the tree and as "leaves" (the latter point to the actual records in the table). Non-key values are stored only in the leaves, which can be a big savings in storage.
Such savings mean that more of the index can be kept in memory in a memory-limited environment, and that the index takes up less memory overall, freeing memory for other things.
The use of included columns is to allow the index to be a "covering" index for queries, with a minimum of additional overhead. An index "covers" a query when all the columns needed for the query are in the index, so the index can be used instead of the original data pages. This can be a significant performance savings.
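As a minimal illustration of the syntax (the table and column names here are hypothetical):
-- Key column used for seeking; the INCLUDE list makes the index "covering"
-- for queries that touch only CustomerID, OrderDate and TotalAmount
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
ON dbo.Orders (CustomerID)
INCLUDE (OrderDate, TotalAmount)
GO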
The place to go to learn more about them is the Microsoft documentation.
In SQL Server 2005 and later, you can extend the functionality of nonclustered indexes by adding nonkey columns to the leaf level of the nonclustered index.
By including nonkey columns, you can create nonclustered indexes that cover more queries. This is because the nonkey columns have the following benefits:
• They can be data types not allowed as index key columns. (All data types are allowed except text, ntext, and image.)
• They are not considered by the Database Engine when calculating the number of index key columns or index key size. You can include nonkey columns in a nonclustered index to avoid exceeding the current index size limitations of a maximum of 16 key columns and a maximum index key size of 900 bytes.
An index with included nonkey columns can significantly improve query performance when all columns in the query are included in the index either as key or nonkey columns. Performance gains are achieved because the query optimizer can locate all the column values within the index; table or clustered index data is not accessed, resulting in fewer disk I/O operations.
Example:
Create Table Script
CREATE TABLE [dbo].[Profile](
[EnrollMentId] [int] IDENTITY(1,1) NOT NULL,
[FName] [varchar](50) NULL,
[MName] [varchar](50) NULL,
[LName] [varchar](50) NULL,
[NickName] [varchar](50) NULL,
[DOB] [date] NULL,
[Qualification] [varchar](50) NULL,
[Profession] [varchar](50) NULL,
[MaritalStatus] [int] NULL,
[CurrentCity] [varchar](50) NULL,
[NativePlace] [varchar](50) NULL,
[District] [varchar](50) NULL,
[State] [varchar](50) NULL,
[Country] [varchar](50) NULL,
[UIDNO] [int] NOT NULL,
[Detail1] [varchar](max) NULL,
[Detail2] [varchar](max) NULL,
[Detail3] [varchar](max) NULL,
[Detail4] [varchar](max) NULL,
PRIMARY KEY CLUSTERED
(
[EnrollMentId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
Stored procedure script
CREATE Proc [dbo].[InsertIntoProfileTable]
As
BEGIN
SET NOCOUNT ON
Declare @currentRow int
Declare @Details varchar(Max)
Declare @dob Date
set @currentRow =1;
set @Details ='Let''s think about the book. Every page in the book has the page number. All information in this book is presented sequentially based on this page number. Speaking in the database terms, page number is the clustered index. Now think about the glossary at the end of the book. This is in alphabetical order and allow you to quickly find the page number specific glossary term belongs to. This represents non-clustered index with glossary term as the key column. Now assuming that every page also shows "chapter" title at the top. If you want to find in what chapter is the glossary term, you have to lookup what page # describes glossary term, next - open corresponding page and see the chapter title on the page. This clearly represents key lookup - when you need to find the data from non-indexed column, you have to find actual data record (clustered index) and look at this column value. Included column helps in terms of performance - think about glossary where each chapter title includes in addition to glossary term. If you need to find out what chapter the glossary term belongs - you don''t need to open actual page - you can get it when you lookup the glossary term. So included column are like those chapter titles. Non clustered Index (glossary) has addition attribute as part of the non-clustered index. Index is not sorted by included columns - it just additional attributes that helps to speed up the lookup (e.g. you don''t need to open actual page because information is already in the glossary index).'
while(@currentRow <=200000)
BEGIN
insert into dbo.Profile values( 'FName'+ Cast(@currentRow as varchar), 'MName' + Cast(@currentRow as varchar), 'LName' + Cast(@currentRow as varchar), 'NickName' + Cast(@currentRow as varchar), DATEADD(DAY, ROUND(10000*RAND(),0),'01-01-1980'),NULL, NULL, @currentRow%3, NULL,NULL,NULL,NULL,NULL, 1000+@currentRow,@Details,@Details,@Details,@Details)
set @currentRow +=1;
END
SET NOCOUNT OFF
END
GO
Using the above stored procedure you can insert 200,000 records in one run.
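To populate the table, simply execute it (this takes a while at that row count):
EXEC dbo.InsertIntoProfileTable
GO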
You can see that there is a clustered index on the column EnrollMentId.
Now create a nonclustered index on the UIDNO column.
Script
CREATE NONCLUSTERED INDEX [NonClusteredIndex-20140216-223309] ON [dbo].[Profile]
(
[UIDNO] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
Now run the following query:
select UIDNO,FName,DOB, MaritalStatus, Detail1 from dbo.Profile --Takes about 30-50 seconds and returns 200,000 results.
Query 2
select UIDNO,FName,DOB, MaritalStatus, Detail1 from dbo.Profile
where DOB between '01-01-1980' and '01-01-1985'
--Takes about 10-15 seconds and returns 36,479 records.
Now drop the above non-clustered index and re-create it with the following script:
CREATE NONCLUSTERED INDEX [NonClusteredIndex-20140216-231011] ON [dbo].[Profile]
(
[UIDNO] ASC,
[FName] ASC,
[DOB] ASC,
[MaritalStatus] ASC,
[Detail1] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
It will throw the following error
Msg 1919, Level 16, State 1, Line 1
Column 'Detail1' in table 'dbo.Profile' is of a type that is invalid for use as a key column in an index.
This is because varchar(max) cannot be used as an index key column.
Now create a non-clustered index with included columns using the following script:
CREATE NONCLUSTERED INDEX [NonClusteredIndex-20140216-231811] ON [dbo].[Profile]
(
[UIDNO] ASC
)
INCLUDE ( [FName],
[DOB],
[MaritalStatus],
[Detail1]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
Now run the following query again:
select UIDNO,FName,DOB, MaritalStatus, Detail1 from dbo.Profile --Takes about 20-30 seconds and returns 200,000 results.
Query 2
select UIDNO,FName,DOB, MaritalStatus, Detail1 from dbo.Profile
where DOB between '01-01-1980' and '01-01-1985'
--Takes about 3-5 seconds and returns 36,479 records.
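One way to verify the covering behaviour yourself (a sketch; the exact numbers will vary by machine):
SET STATISTICS IO ON
GO
select UIDNO,FName,DOB, MaritalStatus, Detail1 from dbo.Profile
where DOB between '01-01-1980' and '01-01-1985'
-- With the included-columns index in place, the logical reads reported in the
-- Messages tab drop sharply, and the execution plan no longer shows a Key Lookup.
GO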
Included columns provide functionality similar to a clustered index, where the row contents are kept in the leaf node of the primary index. In addition to the key columns of the index, additional attributes are kept in the index's leaf nodes.
This permits immediate access to the column values without having to visit another page in the database. There is a trade-off: increased index size and general storage against the improved response from not having to follow a page reference out of the index. The impact is likely similar to that of adding multiple indexes to a table.
From here:
An index with nonkey columns can significantly improve query performance when all columns in the query are included in the index either as key or nonkey columns. Performance gains are achieved because the query optimizer can locate all the column values within the index; table or clustered index data is not accessed, resulting in fewer disk I/O operations.
Related
I have the following table structure:
CREATE TABLE [dbo].[TableABC](
[Id] [bigint] IDENTITY(1,1) NOT NULL,
[FieldA] [nvarchar](36) NULL,
[FieldB] [int] NULL,
[FieldC] [datetime] NULL,
[FieldD] [nvarchar](255) NULL,
[FieldE] [decimal](19, 5) NULL,
PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
I do two types of CRUD operations with this table.
SELECT * FROM [dbo].[TableABC] WHERE FieldA = @FieldA
INSERT INTO [dbo].[TableABC](FieldA,FieldB,FieldC,FieldD,FieldE) VALUES (@FieldA,@FieldB,@FieldC,@FieldD,@FieldE)
FieldA has a unique value, but there is no constraint in the table.
Currently there are 6,070,755 rows in the table. As the data grows, performance is getting slower.
Any suggestions on how to improve performance? How can I make the CREATE and READ operations faster?
The problem I now face is that the SELECT and INSERT take too long, sometimes more than 60 seconds.
Read up on SQL basics - and indices DEFINITELY are one. If you have a unique value and no index on the field (the constraint is irrelevant; a unique index is good enough) - yes, that will get slower. SQL Server has to check the whole table.
So:
Add a unique index to FieldA.
Given your 2 statements and the remark "FieldA has a unique value, but there is no constraint in the table", I assume you are trying to enforce unique values there by selecting first. This will slow you down.
Instead make the index, and then try/catch the duplicate-key SQL errors - WAY faster. The index will make the insert a LITTLE slower, but it saves you the very slow SELECT entirely. A sketch of both steps follows.
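A minimal sketch of both steps, using the table from the question (the index name is illustrative; 2601 and 2627 are the error numbers SQL Server raises for duplicate keys):
-- Enforce uniqueness with an index instead of a pre-check SELECT
CREATE UNIQUE NONCLUSTERED INDEX UX_TableABC_FieldA
ON [dbo].[TableABC] (FieldA)
GO

-- Insert first and handle the duplicate, instead of checking first
BEGIN TRY
    INSERT INTO [dbo].[TableABC] (FieldA, FieldB, FieldC, FieldD, FieldE)
    VALUES (@FieldA, @FieldB, @FieldC, @FieldD, @FieldE)
END TRY
BEGIN CATCH
    IF ERROR_NUMBER() NOT IN (2601, 2627) -- anything other than a duplicate key
        THROW;
    -- else: the row already exists, so ignore it or handle as needed
END CATCH
The same index also turns SELECT ... WHERE FieldA = @FieldA into an index seek instead of a full table scan.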
First off, I am not a database programmer.
I have built the following table for stock market tick data:
CREATE TABLE [dbo].[Tick]
(
[trade_date] [int] NOT NULL,
[delimiter] [tinyint] NOT NULL,
[time_stamp] [int] NOT NULL,
[exchange] [tinyint] NOT NULL,
[symbol] [varchar](10) NOT NULL,
[price_field] [tinyint] NOT NULL,
[price] [int] NOT NULL,
[size_field] [tinyint] NOT NULL,
[size] [int] NOT NULL,
[exchange2] [tinyint] NOT NULL,
[trade_condition] [tinyint] NOT NULL
) ON [PRIMARY]
GO
The table will store 6 years of data to begin with. At an average of 300 million ticks per day that would be about 450 billion rows.
A common query on this table is to get all the ticks for some symbol(s) over a date range:
SELECT
trade_date, time_stamp, symbol, price, size
FROM
[dbo].[Tick]
WHERE
trade_date > 20160101 AND trade_date < 20170101
AND symbol = 'AAPL'
AND price_field = 0
ORDER BY
trade_date, time_stamp
This is my first attempt at an index:
CREATE UNIQUE CLUSTERED INDEX [ClusteredIndex-20180324-183113]
ON [dbo].[Tick]
(
[trade_date] ASC,
[symbol] ASC,
[time_stamp] ASC,
[price_field] ASC,
[delimiter] ASC,
[exchange] ASC,
[price] ASC,
[size_field] ASC,
[size] ASC,
[exchange2] ASC,
[trade_condition] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
First, I put date before symbol because there are fewer days than symbols, so the shorter path is to get to the date first.
I have included all the columns I would potentially need to retrieve. When I tested building it for one day's worth of data, the size of the index was relatively large: about 4 GB for a 20 GB table.
Two questions:
Is my not including a primary key to save space a wise choice assuming my query requirements don't change?
Would I save space if I only included trade_date and symbol in the index? And how would that affect performance? I've been told I need to include all the columns I need in the index, otherwise retrieval would be very slow because it would have to go back to the primary key to find the values of columns not included in the index. If this is true, how would that even work when my table doesn't have a primary key?
Your unique clustered index should contain the minimum number of columns necessary to uniquely identify a row in your table. If that means almost every column in your table, I would think you should add an artificial primary key. Cutting an artificial primary key to save space is a poor decision IMO; only cut it if you can create a natural primary key out of your data.
The clustered index is essentially where all your data is stored. The leaf nodes of the index contain all the data for that row; the columns that make up the index determine how to reach those leaf nodes.
Including extra columns in your index to speed up queries only applies to NONCLUSTERED indexes, because there the leaf node generally contains only a lookup value. For these indexes, the way to include extra columns is to use the INCLUDE clause, not to list them all as part of the index key. For example:
CREATE NONCLUSTERED INDEX [IX_TickSummary] ON [dbo].[Tick]
(
[trade_date] ASC,
[symbol] ASC
)
INCLUDE (
[time_stamp],
[price],
[size],
[price_field]
)
This is a concept known as a covering index, where the index itself contains all the columns needed to process your query, so no additional lookup into the data table is needed. The upside is increased speed. The downside is that the INCLUDEd columns are essentially duplicated, resulting in a larger index that eats more space.
Include columns that are used very frequently, such as those used to generate summary listings. Columns that are queried infrequently, such as those only needed in detailed views, should be left out of the index to save space.
Potentially helpful reading: Using Covering Indexes to Improve Query Performance
Looking at your most common query, you should create a composite index based first on the columns involved in the WHERE clause:
trade_date, symbol, price_field
then on the columns in the SELECT list:
time_stamp, price, size
This way the index can serve both the WHERE filtering and the SELECT column retrieval, avoiding access to the data table:
trade_date, symbol, price_field, time_stamp, price, size
In your sequence you have time_stamp before price_field, i.e. a SELECT column before a WHERE column; this doesn't let the database engine use the full power of the index. A sketch of the suggested index follows.
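A sketch of that index, with the WHERE columns leading and the SELECT columns after them (the index name is illustrative):
CREATE NONCLUSTERED INDEX IX_Tick_WhereThenSelect
ON [dbo].[Tick]
(
    [trade_date] ASC,
    [symbol] ASC,
    [price_field] ASC,
    [time_stamp] ASC,
    [price] ASC,
    [size] ASC
)
GO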
Table Props already has a non-clustered index for column 'CreatedOn' but this index doesn't include certain other columns that are required in order to significantly improve the query performance of a frequently run query.
To fix this, is it best to:
1. create an additional non-clustered index with the included columns, or
2. alter the existing index to add the other columns as included columns?
In addition:
- How will my decision affect the performance of other queries currently using the non-clustered index?
- If it is best to alter the existing index, should it be dropped and re-created, or altered in place to add the included columns?
A simplified version of the table is below along with the index in question:
CREATE TABLE dbo.Props(
PropID int NOT NULL,
Reference nchar(10) NULL,
CustomerID int NULL,
SecondCustomerID int NULL,
PropStatus smallint NOT NULL,
StatusLastChanged datetime NULL,
PropType smallint NULL,
CreatedOn datetime NULL,
CreatedBy int NULL,
CONSTRAINT PK_Props PRIMARY KEY CLUSTERED
(
PropID ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY]
GO
Current index:
CREATE NONCLUSTERED INDEX idx_CreatedOn ON dbo.Props
(
CreatedOn ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
GO
All five of the columns required in the new or altered index are foreign key columns: a mixture of smallint and int, nullable and non-nullable.
In the example the columns to include are: CustomerID, SecondCustomerID, PropStatus, PropType and CreatedBy.
As always... It depends...
As a general rule having redundant indexes is not desirable. So, in the absence of other information, you'd be better off adding the included columns, making it a covering index.
That said, the original index was likely built for another "high frequency" query... So now you have to determine whether or not the increased index page count is going to adversely affect the existing queries that use the index in its current state.
You'd also want to look at the expense of doing a key lookup in relation to the rest of the query. If the key lookup is only a minor part of the total cost, it's unlikely that the performance gains will offset the expense of maintaining the larger index. If widening the index does win, the rebuild can be done in one step, as sketched below.
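A minimal sketch using DROP_EXISTING = ON, which rebuilds the index in a single operation instead of a separate drop and create (the column list is taken from the question):
-- Rebuild idx_CreatedOn in place, adding the covering columns
CREATE NONCLUSTERED INDEX idx_CreatedOn ON dbo.Props
(
    CreatedOn ASC
)
INCLUDE (CustomerID, SecondCustomerID, PropStatus, PropType, CreatedBy)
WITH (DROP_EXISTING = ON, FILLFACTOR = 90) ON [PRIMARY]
GO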
I have a database table with 5 million rows. The clustered index is an auto-increment identity column. The PK is a code-generated nvarchar(256) holding a SHA256 hash of a URL; it has a non-clustered index on the table.
The table is as follows:
CREATE TABLE [dbo].[store_image](
[imageSHAID] [nvarchar](256) NOT NULL,
[imageGUID] [uniqueidentifier] NOT NULL,
[imageURL] [nvarchar](2000) NOT NULL,
[showCount] [bigint] NOT NULL,
[imageURLIndex] AS (CONVERT([nvarchar](450),[imageURL],(0))),
[autoIncID] [bigint] IDENTITY(1,1) NOT NULL,
CONSTRAINT [PK_imageSHAID] PRIMARY KEY NONCLUSTERED
(
[imageSHAID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE CLUSTERED INDEX [autoIncPK] ON [dbo].[store_image]
(
[autoIncID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
imageSHAID is a SHA256 hash of an image URL, e.g. "http://blah.com/image1.jpg"; it is hashed into an nvarchar of length 256.
imageGUID is a code-generated GUID by which I identify the image (it will be used as an index later, but for now I have omitted this column as an index)
imageURL is the full URL of the image (up to 2000 characters)
showCount is the number of times the image is shown, this is incremented each time this particular image is shown.
imageURLIndex is a computed column limited to 450 characters, which allows me to do text searches on the imageURL should I choose to; it is indexable (again, the index is omitted for brevity)
autoIncID is the clustered index, which should allow faster inserting of data.
Periodically I merge from a temp table into the store_image table. The temp table structure is as follows (very similar to the store_image table):
CREATE TABLE [dbo].[store_image_temp](
[imageSHAID] [nvarchar](256) NULL,
[imageURL] [nvarchar](2000) NULL,
[showCount] [bigint] NULL
) ON [PRIMARY]
GO
When the merge process is run, I write a DataTable to the temp table using the following code:
using (SqlBulkCopy bulk = new SqlBulkCopy(storeConn, SqlBulkCopyOptions.KeepIdentity | SqlBulkCopyOptions.KeepNulls, null))
{
bulk.DestinationTableName = "[dbo].[store_image_temp]";
bulk.WriteToServer(imageTableUpsetDataTable);
}
I then run the merge command to update the showCount in the store_image table by merging from the temp table based on the imageSHAID. If the image doesn't currently exist in the store_image table, I create it:
merge into store_image as Target using [dbo].[store_image_temp] as Source
on Target.imageSHAID=Source.imageSHAID
when matched then update set
Target.showCount=Target.showCount+Source.showCount
when not matched then insert values (Source.imageSHAID,NEWID(), Source.imageURL, Source.showCount);
I'm typically trying to merge 2k-5k rows from the temp table into the store_image table in any one merge run.
I used to run this DB on an SSD (connected via SATA 1 only) and it was very fast (under 200 ms). I ran out of room on the SSD, so I moved the DB to a 1 TB 7200 rpm spinning disk; since then, completion times are 6-100 seconds (6,000-100,000 ms). While the bulk insert is running I can see disk activity of around 1-2 MB/sec and low CPU usage.
Is this a typical write time for this amount of data? It seems a little slow to me; what is causing the slow performance? Surely with imageSHAID being indexed we should expect quicker seek times than this?
Any help would be appreciated.
Thanks for your time.
Your UPDATE clause in the MERGE updates showCount. This requires a key lookup on the clustered index.
However, the clustered index is also declared non-unique. This withholds useful information from the optimiser, even though the underlying column is in fact unique.
So, I'd make these changes:
• the clustered primary key to be autoIncID
• the current PK on imageSHAID to be a standalone unique index (not a constraint), with an INCLUDE for showCount; unique constraints can't have INCLUDEs
A sketch of both changes follows.
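A minimal migration sketch of those two changes (the new constraint and index names are illustrative):
-- Remove the current clustered index and the nonclustered PK
DROP INDEX [autoIncPK] ON [dbo].[store_image]
ALTER TABLE [dbo].[store_image] DROP CONSTRAINT [PK_imageSHAID]
GO

-- Clustered primary key on the identity column
ALTER TABLE [dbo].[store_image]
ADD CONSTRAINT [PK_store_image] PRIMARY KEY CLUSTERED ([autoIncID])
GO

-- Standalone unique index (not a constraint) so it can carry an INCLUDE;
-- the MERGE can then update showCount without a key lookup
CREATE UNIQUE NONCLUSTERED INDEX [UX_store_image_imageSHAID]
ON [dbo].[store_image] ([imageSHAID])
INCLUDE ([showCount])
GO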
More observations:
• You don't need nvarchar for the hash or URL columns; these are not unicode.
• A hash is also fixed length, so it can be char(64) (64 hex characters for SHA-256).
The length of a column defines how much memory to assign to the query. See this for more: is there an advantage to varchar(500) over varchar(8000)?
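A sketch of the tighter column definitions under those observations (this assumes a hex-encoded SHA-256; note that any index or computed column referencing these columns must be dropped and re-created around the ALTER):
-- Fixed-length, non-unicode storage for a hex-encoded SHA-256 hash
ALTER TABLE dbo.store_image ALTER COLUMN imageSHAID char(64) NOT NULL
GO
-- URLs here are effectively ASCII, so varchar halves the storage
-- (the imageURLIndex computed column depends on imageURL)
ALTER TABLE dbo.store_image ALTER COLUMN imageURL varchar(2000) NOT NULL
GO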
I'm creating a database that holds yield values of electric engines. The yield values are stored in an Excel file which I have to transfer to the database. Each test for an engine has 42 rows (torque) and 42 columns (power in kw) with the values stored in these cells.
      (kw)   1,0    1,2   ... (42 columns)
(rpm) 2000   76,2   77,0
      2100   76,7   77,6
      ...
      (42 rows)
Well, I thought of creating a column for engine_id, a column for test_id (each engine can have more than one test), and 42 columns for the corresponding yield values. For each test I would have to add 42 rows for one single engine with the yield values. This seems neither efficient nor easy to implement to me.
If there are 42 records (rows) for a single engine, in time the database will hold several thousand rows, and searching for a specific engine with the corresponding values will be an exhausting task.
If I created a separate table for each test of a specific engine, I would again probably have thousands of tables after some time. So what should I go for: a table with thousands of records, or a table with 42 columns and 42 rows? Either way, I still have redundant records.
A database is definitely the answer (searching through many millions, or hundreds of millions, of rows is pretty easy once you get the hang of SQL, the language for interacting with databases). I would recommend a table structure of:
EngineId, TestId, TorqueId, PowerId, YieldValue
Which would have values...
Engine1, Test1, 2000, 1.0, 73.2
So only 5 columns. This will give you the flexibility to add more yield results in the future should it be required (and even if it's not, it's just an easier schema anyway). You will need to learn SQL, however, to realise the power of the database over a spreadsheet. Also, there are many techniques for importing Excel data to SQL, so you should investigate that (Google it). If you find you are transferring all that data by hand then you are doing something wrong (not wrong really, but inefficient!).
Further to your comments, here is the exact schema with index (in MS SQL Server)
CREATE TABLE [dbo].[EngineTestResults](
[EngineId] [varchar](50) NOT NULL,
[TestId] [varchar](50) NOT NULL,
[Torque] [int] NOT NULL,
[Power] [decimal](18, 4) NOT NULL,
[Yield] [decimal](18, 4) NOT NULL,
CONSTRAINT [PK_EngineTestResults] PRIMARY KEY CLUSTERED
(
[EngineId] ASC,
[TestId] ASC,
[Torque] ASC,
[Power] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
/****** Object: Index [IX_EngineTestResults] Script Date: 01/14/2012 14:26:21 ******/
CREATE NONCLUSTERED INDEX [IX_EngineTestResults] ON [dbo].[EngineTestResults]
(
[EngineId] ASC,
[TestId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
So note that there is no incrementing primary key...the key is (EngineId, TestId, Torque, Power). To get the results for a particular engine you would run a query like the following:
Select * from EngineTestResults where engineId = 'EngineABC' and TestId = 'TestA'
Note that I have added an index for that set of criteria.
The strength of a relational database is the ability to normalize data across multiple tables, so you could have one table for engines, one for tests and one for results. Something like the following:
CREATE TABLE tbl__engines (
`engine_id` SMALLINT UNSIGNED NOT NULL,
`name` VARCHAR(255) NOT NULL,
PRIMARY KEY(engine_id)
);
CREATE TABLE tbl__tests (
`test_id` INT UNSIGNED NOT NULL,
`engine_id` SMALLINT UNSIGNED NOT NULL,
PRIMARY KEY(test_id),
FOREIGN KEY(engine_id) REFERENCES tbl__engines(engine_id)
);
CREATE TABLE tbl__test_result (
`result_id` INT UNSIGNED NOT NULL,
`test_id` INT UNSIGNED NOT NULL,
`torque` INT NOT NULL,
`power` DECIMAL(6,2) NOT NULL,
`yield` DECIMAL(6,2) NOT NULL,
FOREIGN KEY(test_id) REFERENCES tbl__tests(test_id)
);
Then you can simply perform a join across these three tables to return the required results. Something like:
SELECT
*
FROM `tbl__engines` e
INNER JOIN `tbl__tests` t ON e.engine_id = t.engine_id
INNER JOIN `tbl__test_result` r ON r.test_id = t.test_id;