I understand that it is not advisable to create indexes on tables that will be frequently updated. However, does the same hold true for other DML operations? Is it recommended to create an index on a table that will have frequent INSERT and DELETE operations performed on it?
Indexing overhead is highly dependent on table size, complexity and the number and size of the various INSERT/UPDATE/DELETE operations.
Sometimes it's faster to drop the indexes, perform the operations, and then recreate the indexes than it is to perform the operations with the indexes intact.
Other times it's slower.
You also need to weigh this against the impact on any SELECT operations that would be going on at the same time.
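As a rough sketch of the drop-and-recreate approach (the table and index names here are invented for illustration, and DROP INDEX syntax varies by engine):

```sql
-- Drop the secondary index so the bulk operation doesn't maintain it row by row.
-- (SQL Server/MySQL use DROP INDEX idx_orders_customer ON orders;
--  PostgreSQL/Oracle just name the index.)
DROP INDEX idx_orders_customer;

-- Perform the large batch operation.
INSERT INTO orders (customer_id, total)
SELECT customer_id, total
FROM staging_orders;

-- Rebuild the index once, after the data is in place.
CREATE INDEX idx_orders_customer ON orders (customer_id);
```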
"Premature optimization is the root of all evil (or at least most of
it) in programming" (Knuth, 1974 Turing Award Lecture).
Until you're faced with actual performance problems that can't be fixed by improving your queries, I'd ignore fringe, last-ditch options like "not having indexes". Having the right indexes is almost always a performance improvement in normal operations.
For what practical purposes would I potentially need to add an index to columns in my table? What are indexes typically needed for?
Indexes are database structures that improve the speed of retrieving data from the columns they are applied to. The Wikipedia article on the subject gives a pretty good overview without going into too many implementation-specific details.
Basic indexes have two common uses.
They speed up queries.
They implement unique constraints (and hence help define primary keys).
In addition, specialized indexes can enable functionality in some databases, in particular, text search and GIS queries.
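As a quick illustration of the two basic uses above (object names are just for the example):

```sql
-- Use 1: speed up queries that filter on customer_id.
CREATE INDEX idx_orders_customer ON orders (customer_id);

-- Use 2: enforce uniqueness; the database backs the constraint with a unique index.
ALTER TABLE users
    ADD CONSTRAINT uq_users_email UNIQUE (email);
```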
Indexing columns speeds up queries on tables with many rows.
Indexes allow your database to search for the desired row using searching algorithms like binary search.
This is only helpful once you have a reasonably large number of rows, say 16 or more (this number is borrowed from quicksort implementations, which switch to insertion sort when sorting 16 or fewer items). Below that, the gain over a plain linear search is negligible.
If a table had 100 rows and you wanted to find the 80th one, a search without an index might take up to 80 operations. With an index that enables something like binary search, you could find that row in roughly log2(100), i.e. about 7, comparisons.
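In SQL terms, that difference comes down to whether an index exists on the column you search by (a minimal sketch with invented names):

```sql
-- Without an index on username, this has to scan every row of users.
SELECT id, email
FROM users
WHERE username = 'alice';

-- With an index, the database can walk a tree structure to the matching
-- row in roughly logarithmic time instead of reading the whole table.
CREATE INDEX idx_users_username ON users (username);
```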
Optimization was never one of my areas of expertise. I have a users table, and every user has many followers. I'm wondering whether I should add a counter column in case some user has a million followers: instead of counting rows in a whole table of relations, shouldn't I just keep a counter?
I'm working with a SQL database.
Update 1
Right now I'm only planning how I should build my site; I haven't written the code yet. I don't know whether I'll run into slow performance, which is why I'm asking.
You should certainly not introduce a counter right away. The counter is redundant data and it will complicate everything. You will have to master the additional complexity and it'll slow down the development process.
Better to start with a normalized model and see how it works. If you really run into performance problems, solve them then.
Remember: premature optimization is the root of all evil.
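A minimal sketch of what that normalized starting point could look like for the followers case (table and column names are assumptions):

```sql
-- One row per follow relationship; no duplicated counter anywhere.
CREATE TABLE followers (
    user_id     INT NOT NULL,  -- the user being followed
    follower_id INT NOT NULL,  -- the user who follows them
    PRIMARY KEY (user_id, follower_id)
);

-- Because the primary key leads on user_id, counting one user's followers
-- is an index range scan rather than a scan of the whole relations table.
SELECT COUNT(*) AS follower_count
FROM followers
WHERE user_id = 42;
```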
It's generally good practice to avoid duplicating data, such as storing a summary of one table's data in another table.
It depends on what this is for. If this is for reporting, speed is usually not an issue and you can use a join.
If it's for the application itself and you're running into performance issues with the join or a computed column, you may want to consider a summary table generated on a schedule, as sketched below.
If you're not seeing a performance issue, leave it alone.
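If you do end up needing that scheduled summary, it could look roughly like this (a sketch using the same assumed followers table; the refresh would be driven by whatever job scheduler you use):

```sql
-- Reporting-only summary table, rebuilt on a schedule (e.g. hourly or nightly).
CREATE TABLE follower_counts (
    user_id        INT PRIMARY KEY,
    follower_count INT NOT NULL,
    refreshed_at   TIMESTAMP NOT NULL
);

-- The scheduled job re-derives the counts from the source-of-truth table.
DELETE FROM follower_counts;

INSERT INTO follower_counts (user_id, follower_count, refreshed_at)
SELECT user_id, COUNT(*), CURRENT_TIMESTAMP
FROM followers
GROUP BY user_id;
```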
I would like to speed up our SQL queries. I have started reading a book on data warehousing, where you have a separate database with the data split into different tables, etc. The problem is that I do not want to create a separate reporting database for each of our clients, for a few reasons:
We have over 200 clients; maintenance on these databases is already enough work
Reporting data must be available immediately
I was wondering whether I could simply denormalize the tables that I report on, as currently there are a lot of JOINs and I believe these are expensive (about 20,000,000 rows in the tables). If I copied the data into multiple tables, would this improve performance by a fair bit? I know there are issues with data being copied all over the place, but it could also be useful from a history point of view.
Denormalization is no guarantee of an improvement in performance.
Have you considered tuning your application's queries? Take a look at what reports are running, identify places where you can add indexes and partitioning. Perhaps most reports only look at the last month of data - you could partition the data by month, so only a small amount of the table needs to be read when queried. JOINs are not necessarily expensive if the alternative is a large denormalized table that requires a huge full table scan instead of a few index scans...
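For example, monthly range partitioning might look roughly like this (PostgreSQL-style syntax shown as one possibility; the table and column names are invented):

```sql
-- Fact table partitioned by month so report queries touch only recent partitions.
CREATE TABLE report_facts (
    event_date DATE    NOT NULL,
    client_id  INT     NOT NULL,
    amount     NUMERIC NOT NULL
) PARTITION BY RANGE (event_date);

CREATE TABLE report_facts_2024_01 PARTITION OF report_facts
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

CREATE TABLE report_facts_2024_02 PARTITION OF report_facts
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');

-- A report filtered on event_date is pruned to the matching partition(s)
-- instead of scanning all 20,000,000 rows.
SELECT client_id, SUM(amount)
FROM report_facts
WHERE event_date >= '2024-02-01' AND event_date < '2024-03-01'
GROUP BY client_id;
```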
Your question is much too general - talk with your DBA about doing some traces on the report queries (and look at the plans) to see what you can do to help improve report performance.
The question is very general. It is hard to answer whether denormalization will increase performance.
Basically, it CAN. But personally, I wouldn't consider denormalizing as a solution to reporting issues. In my experience, business people love to build huge reports that kill the OLTP database at the least appropriate time. I would continue reading about data warehousing :)
Yes, for an OLAP application your performance will improve with denormalization, but if you use the same denormalized tables for your OLTP application you will see a performance bottleneck there. I suggest you create new denormalized tables or a materialized view for your reporting purposes; you can also incrementally fast-refresh the MV so you get reporting data immediately.
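A rough Oracle-flavored sketch of that materialized view idea (object names are invented; fast refresh also requires materialized view logs on the base table, and fast-refreshable aggregate MVs have extra requirements such as including COUNT columns):

```sql
-- Materialized view log so the MV can be refreshed incrementally.
CREATE MATERIALIZED VIEW LOG ON orders
    WITH ROWID, SEQUENCE (customer_id, amount)
    INCLUDING NEW VALUES;

-- Denormalized reporting summary, incrementally refreshed as transactions commit,
-- so reporting data stays available immediately.
CREATE MATERIALIZED VIEW mv_customer_totals
    REFRESH FAST ON COMMIT
AS
SELECT customer_id,
       COUNT(*)      AS order_count,
       COUNT(amount) AS amount_count,
       SUM(amount)   AS total_amount
FROM orders
GROUP BY customer_id;
```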
Every 5 seconds I want to insert around 10k rows into a table. The table is unnormalized and has no primary key or indexes. I noticed that insert performance is very slow - 10k rows in 20 seconds, which is unacceptable for me.
In my understanding, indexing only improves search performance, not inserts. Is that true? Do you have any suggestions for how to improve insert performance?
Besides what Miky is suggesting, you can also improve performance by optimizing your DB structure, for example by reducing the length of varchar fields, using enums instead of text columns, and so on. It is also related to referential integrity; first of all, I think you should normalize the database anyway. Then you can go on to optimizing the queries.
You're right in that indexing will do nothing to improve the insert performance (if anything it might hurt it due to extra overhead).
If inserts are slow it could be due to external factors such as the IO performance of the hardware running your SQL Server instance or it could be contention at the database or table level due to other queries. You'll need to get the performance profiler running to determine the cause.
If you're performing the inserts sequentially, you may want to look into performing a bulk insert operation instead which will have better performance characteristics.
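On SQL Server, for instance, a set-based load can replace thousands of single-row statements (the file path and table name below are placeholders):

```sql
-- One batched load per 5-second window instead of 10k individual INSERT statements.
-- From .NET code, SqlBulkCopy achieves the same thing without an intermediate file.
BULK INSERT dbo.Measurements
FROM 'C:\data\measurements_batch.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR   = '\n',
    TABLOCK  -- allows minimal logging when the usual conditions are met
);
```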
And finally, some food for thought, if you're doing 10K inserts every 5 seconds you might want to consider a NoSQL database for bulk storage since they tend to have better performance characteristics for this type of application where you have large and frequent writes.
I know this could be a stupid question, but as a beginner I must ask it of the experts to clear my doubt.
When we use Entity Framework to query data from the database by joining multiple tables, it creates a SQL query, and this query is then sent to the database to fetch the records.
"We know that if we execute large query from .net code it will
increase network traffic and performance will be down. So instead of
writing large query we create and execute stored procedure and that
significantly increases the performance."
My question is: isn't EF using the same old approach of building one large query, which would degrade performance?
Experts please clear my doubts.
Thanks.
Contrary to popular myth, stored procedures are not any faster than a regular query. There are some slight possible direct performance improvements when using stored procedures (execution plan caching, precompilation), but with a modern caching environment and newer query optimizers the benefits are small at best. These potential optimizations were always only a small part of producing a query result anyway; the most time-intensive part is the actual collection, seeking, sorting, and merging of data. That makes these stored procedure advantages downright irrelevant.
Now, one other point. There is absolutely no way, ever, that sending 500 bytes of query text versus 50 bytes for the name of a stored procedure is going to have any noticeable effect on a 100 Mb/s link.
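To make the comparison concrete, here is roughly what the two forms look like on SQL Server (the procedure name and query are invented for the example); both get a cached, reusable execution plan:

```sql
-- What an ORM such as Entity Framework typically sends: a parameterized query.
-- Its plan is cached and reused just like a stored procedure's.
EXEC sp_executesql
    N'SELECT o.Id, o.Total FROM dbo.Orders o WHERE o.CustomerId = @custId',
    N'@custId INT',
    @custId = 42;

-- The equivalent stored procedure call: a few hundred bytes shorter on the wire,
-- which is negligible on any modern network link.
EXEC dbo.GetOrdersByCustomer @CustomerId = 42;
```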