indexing of Temptables


Good day to everyone!
I have a migration process that fetches data through a remote query and stores it in a #temp table.
The question is, which is better: creating the index right after the #temp table is created, or inserting the data into the #temp table first and creating the index afterwards? And why? Or is it better to process the data in the remote query itself, before inserting it into the #temp table?
Example:
Select * into #BiosData
from sometable a
where (a.Status between 3 and 5)
CREATE CLUSTERED INDEX IDX_MAINID ON #BiosData([MAINID])
**Process the data retrieved above....**
OR this?
select A.MAINID into #BiosData
from table a
inner join Transactions.sometable c
on a.ID= c.fld_ID
inner join Reference.sometable b
on cast(a.[ID]/1000000000000 as decimal (38,0)) = b.fld_ID
where a.version > b.fld_version
and (a.Status between 3 and 5)
Thank you for your tips and suggestions :) I'm a newbie in SQL, please be gentle with me :)

As a generic rule:
If you create a fresh table, are going to insert data into it, and it needs an index, then it is faster to insert the data first and create the index afterwards. Why: creating an index on existing data means calculating it once, whereas inserting data into an already indexed table continuously reshuffles the index contents, which also need to be written. By creating the index afterwards you avoid the overhead of updating the index during the inserts.
Exception 1: if you want the index combined with the data, so that a read of the index to find a particular value also has the row data available in the same read operation. Oracle calls this an index-organized table; in SQL Server the equivalent is a clustered index.
Exception 2: if your index is used to enforce some constraint then creating the index first is a good option to make sure that during the inserts the constraint is maintained.
In your case: I notice that in the complex query there is an additional where clause: it may result in fewer inserts hence faster processing, however if the tables used in the complex query have additional indexes which speed up the query, make sure similar indices are also created on the temp table.
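Finally: indexes are typically used to reduce disk I/O. In SQL Server, temporary tables live in tempdb, and their pages are cached in memory like those of any other table, so a small temp table may never cause a disk read. Adding an index is therefore not guaranteed to increase speed...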
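
A minimal T-SQL sketch of the load-then-index order described in the generic rule, reusing the table and column names from the question. It assumes that sometable has MAINID and Status columns. The SET STATISTICS TIME calls are only there so the ordering can be measured on real data rather than taken on faith.

```sql
SET STATISTICS TIME ON;

-- Drop any leftover copy from an earlier run in this session.
IF OBJECT_ID('tempdb..#BiosData') IS NOT NULL
    DROP TABLE #BiosData;

-- 1) Load first: SELECT ... INTO creates the temp table as a heap.
SELECT a.*
INTO   #BiosData
FROM   sometable AS a
WHERE  a.Status BETWEEN 3 AND 5;

-- 2) Index afterwards: the index is built once over the loaded rows.
CREATE CLUSTERED INDEX IDX_MAINID ON #BiosData (MAINID);

SET STATISTICS TIME OFF;
```

To compare against the other order, create the table with CREATE TABLE, add the index while it is still empty, then load it with INSERT ... SELECT and compare the elapsed times reported in the Messages tab.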

Related

Creating a non clustered index on a table with existing 1mln records affects that data immediately?

I have a table with 1 million records. If I create a nonclustered index on column 'A' and then filter by that column, should I immediately notice that the query takes much less time? Or should I create the index on the empty table first and only then add the data, in order to feel the power of the index?

I cannot explain why you would or would not feel that a query is taking too much time.
But, once you have added an index -- and the statement completes -- then the index is available for any query that is compiled after that point in time.
As a rule, we can think that creating an index will remove the plan from the query cache. This is effectively what happens, but the actual sequence of events is that the next execution of the query will replace the plan. You can think of this as "delayed removal".
Creating an index on a table when the table itself is created means that the index will be available for all queries on the table.
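
A small sketch of how this could be checked, assuming a hypothetical table dbo.BigTable with a numeric column A (neither name comes from the question). The logical reads reported by SET STATISTICS IO for the two SELECT statements show whether the second one, compiled after the index exists, does less work.

```sql
-- Hypothetical names: dbo.BigTable and column A stand in for the asker's table.
SET STATISTICS IO ON;

-- Before the index exists: the whole table has to be scanned.
SELECT A FROM dbo.BigTable WHERE A = 42;

-- Build the index over the existing rows.
CREATE NONCLUSTERED INDEX IX_BigTable_A ON dbo.BigTable (A);

-- Compiled after the index exists, so the optimizer can consider it.
SELECT A FROM dbo.BigTable WHERE A = 42;

SET STATISTICS IO OFF;
```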

Creating index for a query

I have one table Person with two columns Name and Gender and suppose in my application if I have a query which is called frequently :
select * from Person where Gender = 'M'
So is it advisable to create an index on the column Gender?

It's not advisable unless there are loads of one value and only a few of the other, and your query only looks for the few. Otherwise a full table scan would give you a much more efficient result than diving through an index. In fact, even if you created the index, it's highly unlikely the optimiser would use it.

Below points might give you the idea:
From Documentation
In general, index access paths are more efficient for statements that retrieve a small subset of table rows, whereas full table scans are more efficient when accessing a large portion of a table.
Do not index columns that are modified frequently. UPDATE statements that modify indexed columns and INSERT and DELETE statements that modify indexed tables take longer than if there were no index. Such SQL statements must modify data in indexes as well as data in tables. They also generate additional undo and redo.
When choosing to index a key, consider whether the performance gain for queries is worth the performance loss for INSERTs, UPDATEs, and DELETEs and the use of the space required to store the index. You might want to experiment by comparing the processing times of the SQL statements with and without indexes. You can measure processing time with the SQL trace facility.
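
Whether the Gender query retrieves "a small subset of table rows" can be checked directly before deciding on the index. A minimal sketch using the Person table from the question; the column aliases are made up for illustration.

```sql
-- Share of the table that each Gender value accounts for.
SELECT Gender,
       COUNT(*)                                 AS row_count,
       100.0 * COUNT(*) / SUM(COUNT(*)) OVER () AS pct_of_table
FROM   Person
GROUP  BY Gender
ORDER  BY row_count DESC;
```

If 'M' turns out to be around half of the table, the index is unlikely to be used for the query in the question. If it is a very small share, the index may be worth testing.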

Indexing a table with duplicate records

I have a SQL Server table with around 50,000 rows. The table gets updated once in a day by some upstream process.
The following query has been fired from application:
SELECT * FROM Table1 WHERE Field1 = 'somevalue'
The "Field1" column contains duplicate values. I am trying to improve the performance of the above query. I cannot modify the code on the application side, so limiting the columns instead of using "SELECT *" is not possible. I am planning to index the table. Should I define a NON-CLUSTERED index on the "Field1" column to improve performance? Or would some other kind of indexing help? Are there any other ways to improve performance from the DB side?

Yes, a non-clustered index on Field1 should serve your purposes...
For example,
CREATE NONCLUSTERED INDEX Idx_Table1_Field1 ON Table1 (Field1)

The best thing you can do is run sp_BlitzIndex by Brent Ozar to get a better picture of your entire database index setup (including this table).
http://www.brentozar.com/blitzindex/
If your table already has a clustered index (which it should - apply one following these principles), you should first look at the execution plan to see what it is advocating.
Further, if the table is only updated once a day, presumably during off hours, you can easily compress it. Given that it holds mostly repetitive data, you may save over 50% of I/O and space on the query and incur a small CPU overhead. Table compression has no effect on the data itself, only on the space it occupies. Before SQL Server 2016 SP1 this feature was only available in Enterprise edition; from 2016 SP1 onward it is available in all editions.
Last but not least, are your data types properly set, i.e. are you pulling from datetime when the column could easily be date, or are you pulling from bigint when the column could easily be int.
Asking how to create an index really isn't a proper question for Stack, i.e.
CREATE NONCLUSTERED INDEX Idx_Table1_Field1 ON Table1 (Field1)
That is already covered on MSDN, and the index can even be created in SSMS by right-clicking the Indexes node under the table and choosing New Index. The question you should be asking is how to properly address indexing-related performance improvements in your environment. Finally, analyze whether your end query result really needs a SELECT *. This is a common oversight in data display: all 30 columns of a table are selected when the developer only plans to show 5 of them. Selecting just those 5 would read roughly one sixth of the data, assuming the columns are of similar width.
Please also note the famous index maintenance script by Ola Hallengren.
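
A hedged sketch of the compression suggestion, using Table1 and the index name from this thread. The dbo schema and the choice of PAGE compression are assumptions. The savings depend entirely on the data, so the estimate is run first.

```sql
-- Estimate how much space PAGE compression would save (all indexes, all partitions).
EXEC sp_estimate_data_compression_savings
     @schema_name      = 'dbo',
     @object_name      = 'Table1',
     @index_id         = NULL,
     @partition_number = NULL,
     @data_compression = 'PAGE';

-- If the estimate looks worthwhile, rebuild the table (heap or clustered index) compressed.
ALTER TABLE dbo.Table1 REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Nonclustered indexes are compressed separately.
ALTER INDEX Idx_Table1_Field1 ON dbo.Table1
    REBUILD WITH (DATA_COMPRESSION = PAGE);
```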

Creating a clustered index on this heap?

So I am curious to know if it is worth creating a clustered index on a heap table that has about 30M rows of data. Before now, it wasn't going to be used in any application that we have but now we are creating an app to query that table.
The reason why I ask if it is worth it is because the application we are creating is basically doing this type of query.
SELECT *
FROM [table];
I am leaving the * in to represent that we are basically pulling all fields.
So my question is, is it worth creating a clustered index on a table that does not have one even though we are going to be selecting all fields and rows for our application?
Thanks for any info/advice.

No, it is not worth it. If you are going to run a select without a WHERE clause, a clustered index will just add more pages to the table (how many depends on what you choose for the index key and on your data), which makes for a larger scan. A heap actually performs better in many situations (when you are just getting all rows from a table without joins, WHERE clauses, or other filters), because it is stored in fewer pages.
A clustered index that isn't used also carries some overhead: statistics on the table have to be created and updated, and inserts can cause page splits.
So if you aren't going to use the index and aren't going to filter on your table, you are better off without it.

how to optimize sql server table for faster response?

I found that a table with 50 thousand records takes one minute to return its data when I just issue a SQL query against it. The table has a primary key, which means a clustered index is already there, so I do not understand why it takes one minute. Besides indexes, what other ways are there to optimize a table so that the data comes back faster? What do I need to do in this situation for a faster response? Also, how can I make sure I always write optimized SQL? Please tell me all the steps for optimization in detail.
Thanks.

The fastest way to optimize the indexes on a table is to use the SQL Server Database Engine Tuning Advisor. Take a look here: http://www.youtube.com/watch?v=gjT8wL92mqE

Select only the columns you need, rather than select *. If your table has some large columns e.g. OLE types or other binary data (maybe used for storing images etc) then you may be transferring vastly more data off disk and over the network than you need.
As others have said, an index is no help to you when you are selecting all rows (no where clause). Using an index would be slower in such cases because of the index read and table lookup for each row, vs full table scan.

If you are running select * from employee (as per question comment) then no amount of indexing will help you. It's an "Every column for every row" query: there is no magic for this.
Adding a WHERE usually won't help for a select * query either.
What you can check is index and statistics maintenance. Do you do any? Here's a Google search
Or change how you use the data...
Edit:
Why a WHERE clause usually won't help...
If you add a WHERE that is not the PK..
you'll still need to scan the table unless you add an index on the searched column
then you'll need a key/bookmark lookup unless you make it covering
with SELECT * you need to add all columns to the index to make it covering
for many hits, the index will probably be ignored to avoid key/bookmark lookups.
Unless there is a network issue or such, the issue is reading all columns not lack of WHERE
If you did SELECT col13 FROM MyTable and had an index on col13, the index will probably be used.
For a SELECT * FROM MyTable WHERE DateCol < '20090101' with an index on DateCol that matches 40% of the table, the index will probably be ignored, or you'd have expensive key/bookmark lookups.
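
The difference between a plain and a covering index can be sketched as follows, using MyTable, DateCol and col13 from the examples above. The index names are made up for illustration.

```sql
-- Plain index: the engine can seek on DateCol, but SELECT * still needs
-- a key/bookmark lookup for every matching row to fetch the other columns.
CREATE NONCLUSTERED INDEX IX_MyTable_DateCol
    ON MyTable (DateCol);

-- Covering index for a narrower query: col13 is stored in the index leaf,
-- so the query below can be answered from the index alone.
CREATE NONCLUSTERED INDEX IX_MyTable_DateCol_col13
    ON MyTable (DateCol)
    INCLUDE (col13);

SELECT col13 FROM MyTable WHERE DateCol < '20090101';
```

Making an index cover SELECT * would mean including every column, which amounts to storing a second copy of the table.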

Irrespective of the merits of returning the whole table to your application that does sound an unexpectedly long time to retrieve just 50000 rows of employee data.
Does your query have an ORDER BY or is it literally just select * from employee?
What is the definition of the employee table? Does it contain any particularly wide columns? Are you storing binary data such as their CVs or employee photo in it?
How are you issuing the SQL and retrieving the results?
What isolation level are your select statements running at? (You can use SQL Profiler to check this.)
Are you encountering blocking? Does adding NOLOCK to the query speed things up dramatically?
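
A few of these checks can be run from a query window. This is a rough sketch, assuming the employee table from the question and a login with permission to read the dynamic management views (typically VIEW SERVER STATE).

```sql
-- Session settings, including the isolation level of the current session.
DBCC USEROPTIONS;

-- Requests that are currently waiting on another session.
SELECT session_id, blocking_session_id, wait_type, wait_time
FROM   sys.dm_exec_requests
WHERE  blocking_session_id <> 0;

-- Diagnostic only: NOLOCK allows dirty reads, so use it to test for
-- blocking, not as a permanent fix.
SELECT * FROM employee WITH (NOLOCK);
```

If the NOLOCK version returns dramatically faster, the one-minute delay is more likely caused by blocking than by the table or its indexes.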
