Is it quicker to insert sorted data into a Sybase table?

A table in Sybase has a unique varchar(32) column, and a few other columns. It is indexed on this column too.
At regular intervals, I need to truncate it, and repopulate it with fresh data from other tables.
insert into MyTable
select list_of_columns
from OtherTable
where some_simple_conditions
order by MyUniqueId
If we are dealing with a few thousand rows, would it help speed up the insert if we have the order by clause for the select? If so, would this gain in time compensate for the extra time needed to order the select query?
I could try this out, but currently my data set is small and the results don’t say much.

With only a few thousand rows, you're not likely to see much difference even if it is a little faster. If you anticipate approaching 10,000 rows or so, that's when you'll probably start seeing a noticeable difference -- try creating a large test data set and doing a benchmark to see if it helps.
Since you're truncating, though, dropping the index before the load and recreating it afterwards should be faster than inserting into a table with an existing index. Again, for a relatively small table, it shouldn't matter -- if everything can fit comfortably in the amount of RAM you have available, then it's going to be pretty quick.
One other thought -- depending on how Sybase does its indexing, passing a sorted list could slow it down. Try benchmarking against an ORDER BY RANDOM() to see if this is the case.
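If you do want to benchmark it, a minimal timing harness along these lines works in Sybase ASE / SQL Server T-SQL; the table and column names are hypothetical stand-ins for the ones in the question:

declare @start datetime
select @start = getdate()

truncate table MyTable

insert into MyTable (MyUniqueId, SomeValue)
select o.MyUniqueId, o.SomeValue
from OtherTable o
where o.SomeFlag = 1
order by o.MyUniqueId    -- run once with and once without this line

select datediff(ms, @start, getdate()) as elapsed_ms

Run each variant a few times against a realistically sized data set and compare the averages, since a single run can be skewed by caching.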

I don't believe ordering the rows speeds up an INSERT, so don't add an ORDER BY in a vain attempt to improve performance.

I'd say that it doesn't really matter in which order you insert the rows.
Just use a normal INSERT INTO ... SELECT, and do the rest afterwards.

I can't speak for Sybase, but MS SQL Server inserts faster if the records are sorted carefully. Sorting can minimize the number of index page splits. As you know, it is better to populate the table and then create the index; sorting the data before insertion has a similar effect.

The order in which you insert data will generally not improve performance. The issues that affect insert speed have more to do with your database's mechanisms for data storage than with the order of the inserts.
One performance problem you may experience when inserting a lot of data into a table is the time it takes to update the indexes on the table. However, again, in this case the order in which you insert data will not help you.
If you have a lot of data, and by a lot I mean hundreds of thousands, perhaps millions, of records, you could consider dropping the indexes on the table, inserting the records, then recreating the indexes.
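As a hedged sketch of that drop / reload / recreate pattern for the table in the question (index and column names are hypothetical; the DROP INDEX syntax shown is Sybase-style, SQL Server writes it as DROP INDEX ix_MyUniqueId ON MyTable):

drop index MyTable.ix_MyUniqueId

truncate table MyTable

insert into MyTable (MyUniqueId, SomeValue)
select o.MyUniqueId, o.SomeValue
from OtherTable o
where o.SomeFlag = 1

create unique index ix_MyUniqueId on MyTable (MyUniqueId)

Recreating the index after the load lets the server build it in one pass over the loaded data instead of maintaining it row by row.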

Dropping and recreating indexes (at least in SQL server) is by far the best way to do the inserts. At least some of the time ;-) Seriously though, if you aren't noticing any major performance problems, don't mess with it.

Related

Does a unique column constraint affect the performance of INSERT operations in SQL Server

I am wondering if having a UNIQUE column constraint will slow down the performance of insert operations on a table.
Yes, indexes will slow it down marginally (perhaps not even noticeably).
HOWEVER, do not forego proper database design because you want it to be as fast as possible. Indexes will slow down an insert a tiny amount; if this amount is unacceptable, your design is almost certainly wrong in the first place and you are attacking the issue from the wrong angle.
When in doubt, test. If you need to be able to insert 100,000 rows a minute, test that scenario.
I would say, yes.
The difference between inserting into a heap with and without such a constraint will be visible, since the general rule applies: more indexes, slower inserts. A unique index also has to be checked to see whether the value (or combination of values) already exists before a row can be inserted, so it is extra work.
The slowdown will be more visible on bulk inserts of large numbers of rows; conversely, for occasional single-row inserts the impact will be smaller.
On the other hand, unique constraints and indexes help the query optimizer build better SELECT plans...
Typically no. SQL Server creates an index and can quickly determine whether the value already exists. Maybe in an enormous table (billions of rows), but I have never seen it. Unique constraints are a very useful and convenient way to guarantee data consistency.
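If you want to measure the overhead for your own workload, one rough approach is to load the same data into two otherwise identical tables, one with the unique constraint and one without, and compare timings. A minimal SQL Server sketch, where all object names are hypothetical and dbo.Numbers is an assumed numbers/tally table:

create table dbo.WithUnique (Id int not null, Val varchar(32) not null constraint UQ_WithUnique_Val unique)
create table dbo.WithoutUnique (Id int not null, Val varchar(32) not null)

declare @start datetime

set @start = getdate()
insert into dbo.WithUnique (Id, Val)
select n, 'val' + cast(n as varchar(10)) from dbo.Numbers
select datediff(ms, @start, getdate()) as with_unique_ms

set @start = getdate()
insert into dbo.WithoutUnique (Id, Val)
select n, 'val' + cast(n as varchar(10)) from dbo.Numbers
select datediff(ms, @start, getdate()) as without_unique_ms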

How to Performance tune a query that has Between statement for range of dates

I am working on performance tuning all the slow running queries. I am new to Oracle; I have been using SQL Server for a while. Can someone help me tune the query to make it run faster?
Select distinct x.a, x.b
from xyz_view x
where x.date_key between 20101231 AND 20160430
Appreciate any help or suggestions
First, I'd start by looking at why the DISTINCT is there. In my experience many developers tack on the DISTINCT because they know that they need unique results, but don't actually understand why they aren't already getting them.
Second, a clustered index on the column would be ideal for this specific query because it puts all of the rows right next to each other on disk and the server can just grab them all at once. The problem is, that might not be possible because you already have a clustered index that's good for other uses. In that case, try a non-clustered index on the date column and see what that does.
Keep in mind that indexing has wide-ranging effects, so using a single query to determine indexing isn't a good idea.
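If a non-clustered index is the route you take, a minimal sketch looks like this (xyz_base stands in for whatever base table underlies xyz_view, and the index names are hypothetical):

create index ix_xyz_date_key on xyz_base (date_key)

-- or a composite index that also covers the selected columns a and b,
-- so the query can be answered from the index alone
create index ix_xyz_date_key_a_b on xyz_base (date_key, a, b)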
I would also add that if you are pulling from a VIEW, you should really investigate the design of the view. It typically has a lot of joins that may not be necessary for your query. In addition, if the view is needed, you can look at creating an indexed view, which can be very fast.
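A hedged sketch of what an indexed view looks like in SQL Server terms (Oracle's rough equivalent is a materialized view); the base table and columns are hypothetical:

create view dbo.v_xyz with schemabinding
as
select a, b, date_key, count_big(*) as row_cnt
from dbo.xyz_base
group by a, b, date_key
go

create unique clustered index ix_v_xyz on dbo.v_xyz (a, b, date_key)
go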
There is not much more you can do to optimize this query so long as you have established that the DISTINCT is really needed.
You can add a NOLOCK hint to the FROM clause if reading uncommitted pages is not an issue.
You can also check whether a time component is being stored along with the date and, if so, whether it is really relevant; if not, setting the time to midnight will make the index more effective.
One of the biggest improvements I've seen is splitting the date field in the table into three fields, one for each date part. This can really improve performance.

is performance of sql insert dependent on number of rows in table

I want to log some information from the users of a system to a special statistics table.
There will be a lot of inserts into this table, but no reads (not by the users; only I will read it).
Will I get better performance by having two tables, where I move the rows into a table I can use for querying, just to keep the "insert" table as small as possible?
E.g. if the user behavior results in 20,000 inserts per day, the table will grow rapidly, and I am afraid the inserts will get slower and slower as more and more rows are inserted.
Inserts get slower if SQL Server has to update indexes. If your indexes mean reorganisation isn't required then inserts won't be particularly slow even for large amounts of data.
I don't think you should prematurely optimise in the way you're suggesting unless you're totally sure it's necessary. Most likely you're fine just taking a simple approach, then possibly adding a periodically-updated stats table for querying if you don't mind that being somewhat out of date ... or something like that.
The performance of the inserts depends on whether there are indexes on the table. You say that users aren't going to be reading from the table. If you yourself only need to query it rarely, then you could just leave off the indexes entirely, in which case insert performance won't change.
If you do need indexes on the table, you're right that performance will start to suffer. In this case, you should look into partitioning the table and performing minimally-logged inserts. That's done by inserting the new records into a new table that has the same structure as your main table but without indexes on it. Then, once the data has been inserted, adding it to the main table becomes nothing more than a metadata operation; you simply redefine that table as a new partition of your main table.
Here's a good writeup on table partitioning.
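A hedged sketch of the switch-in step described above (SQL Server syntax; the partitioned main table, its partition function/scheme, and all object names here are hypothetical):

-- load into a staging table that matches the main table's structure
insert into dbo.StatsStaging (UserId, EventType, LoggedAt)
select UserId, EventType, LoggedAt
from dbo.IncomingEvents

-- after adding the indexes/constraints the target partition requires,
-- switching the staging table in is a metadata-only operation
alter table dbo.StatsStaging
switch to dbo.StatsLog partition 42    -- 42 stands for the target partition number

One caveat: for the switch to succeed, the staging table has to match the target partition's structure, indexes and constraints, so the index build happens on the staging table after the bulk load rather than during it.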

how to optimize sql server table for faster response?

I found that a table has 50 thousand records, and it takes one minute to fetch the data from the SQL Server table just by issuing a SQL statement. There is one primary key, which means a clustered index is already there. I just do not understand why it takes one minute. Besides an index, what are the ways out there to optimize a table to get the data faster? What do I need to do in this situation for a faster response? Also, tell me how we can always write optimized SQL. Please tell me all the steps in detail for optimization.
Thanks.
The fastest way to optimize the indexes on a table is to use the SQL Server Tuning Advisor. Take a look here: http://www.youtube.com/watch?v=gjT8wL92mqE
Select only the columns you need, rather than select *. If your table has some large columns e.g. OLE types or other binary data (maybe used for storing images etc) then you may be transferring vastly more data off disk and over the network than you need.
As others have said, an index is no help to you when you are selecting all rows (no where clause). Using an index would be slower in such cases because of the index read and table lookup for each row, vs full table scan.
If you are running select * from employee (as per question comment) then no amount of indexing will help you. It's an "Every column for every row" query: there is no magic for this.
Adding a WHERE usually won't help for a SELECT * query either.
What you can check is index and statistics maintenance. Do you do any? Here's a Google search
Or change how you use the data...
Edit:
Why a WHERE clause usually won't help...
If you add a WHERE that is not on the PK:
you'll still need to scan the table unless you add an index on the searched column
then you'll need a key/bookmark lookup unless you make it covering
with SELECT * you need to add all columns to the index to make it covering
for many hits, the index will probably be ignored to avoid key/bookmark lookups.
Unless there is a network issue or such, the issue is reading all columns, not the lack of a WHERE.
If you did SELECT col13 FROM MyTable and had an index on col13, the index will probably be used.
For a SELECT * FROM MyTable WHERE DateCol < '20090101' with an index on DateCol that matched 40% of the table, the index would probably be ignored, or you'd have expensive key/bookmark lookups.
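To illustrate the covering point, a short sketch (SQL Server syntax; MyTable, col13 and DateCol are taken from the examples above, while col14 and the index names are hypothetical):

-- serves SELECT col13 FROM MyTable
create index ix_MyTable_col13 on MyTable (col13)

-- covers a narrow query such as SELECT col13, col14 FROM MyTable WHERE DateCol < '20090101'
create index ix_MyTable_DateCol on MyTable (DateCol) include (col13, col14)

With SELECT * neither index covers the query, which is why the lack of a WHERE clause isn't the real problem here.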
Irrespective of the merits of returning the whole table to your application, that does sound like an unexpectedly long time to retrieve just 50,000 rows of employee data.
Does your query have an ORDER BY or is it literally just select * from employee?
What is the definition of the employee table? Does it contain any particularly wide columns? Are you storing binary data such as their CVs or employee photo in it?
How are you issuing the SQL and retrieving the results?
What isolation level are your select statements running at? (You can use SQL Profiler to check this.)
Are you encountering blocking? Does adding NOLOCK to the query speed things up dramatically?

Sql optimization Techniques

I want to know optimization techniques for a database that has nearly 80,000 records, and a list of possibilities for optimizing it.
I am using it for my mobile project on the Android platform.
I use SQLite, and it takes a lot of time to retrieve the data.
Thanks
Well, with only 80,000 records and assuming your database is well designed and normalized, just adding indexes on the columns that you frequently use in your WHERE or ORDER BY clauses should be sufficient.
There are other more sophisticated techniques you can use (such as denormalizing certain tables, partitioning, etc.) but those normally only start to come into play when you have millions of records to deal with.
ETA:
I see you updated the question to mention that this is on a mobile platform - that could change things a bit.
Assuming you can't pare down the data set at all, one thing you might be able to do would be to try to partition the database a bit. The idea here is to take your one large table and split it into several smaller identical tables that each hold a subset of the data.
Which of those tables a given row would go into would depend on how you choose to partition it. For example, if you had a "customer_id" field that could range from 0 to 10,000 you might put customers 0 - 2500 in table1, 2,500 - 5,000 in table2, etc. splitting the one large table into 4 smaller ones. You would then have logic in your app that would figure out which table (or tables) to query to retrieve a given record.
You would want to partition your data in such a way that you generally only need to query one of the partitions at a time. Exactly how you would partition the data would depend on what fields you have and how you are using them, but the general idea is the same.
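A minimal SQLite sketch of that manual partitioning idea, with hypothetical table and column names and the ranges from the example above:

-- one table per customer_id range; the app decides which table to query
create table customers_p1 (customer_id integer primary key, name text);  -- ids 0 - 2500
create table customers_p2 (customer_id integer primary key, name text);  -- ids 2501 - 5000
create table customers_p3 (customer_id integer primary key, name text);  -- ids 5001 - 7500
create table customers_p4 (customer_id integer primary key, name text);  -- ids 7501 - 10000

-- e.g. a customer with id 3100 falls in the second range, so the app runs:
select name from customers_p2 where customer_id = 3100;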
Create indexes
Delete indexes
Normalize
DeNormalize
80k rows isn't many rows these days. Clever index(es), with queries that utilise those indexes, will serve you well.
Learn how to display query execution plans, then learn to understand what they mean, then optimize your indices, tables and queries accordingly.
Such a wide topic, which does depend on what you want to optimise for. But the basics:
indexes. A good indexing strategy is important, indexing the right columns that are frequently queried on/ordered by is important. However, the more indexes you add, the slower your INSERTs and UPDATEs will be so there is a trade-off.
maintenance. Keep indexes defragged and statistics up to date
optimised queries. Identify queries that are slow (using the profiler/built-in information available from SQL 2005 onwards) and see if they could be written more efficiently (e.g. avoid CURSORs, use set-based operations where possible).
parameterisation/SPs. Use parameterised SQL to query the db instead of ad hoc SQL with hardcoded search values. This will allow better execution plan caching and reuse (see the sketch after this list).
start with a normalised database schema, and then de-normalise if appropriate to improve performance
80,000 records is not much, so I'll stop there (for large DBs, with millions of data rows, I'd have suggested partitioning the data).
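A hedged T-SQL sketch for the maintenance and parameterisation points above; the commands are SQL Server ones to match the SQL 2005 references, and all object names are hypothetical (SQLite has its own equivalents, e.g. ANALYZE and bound parameters in the Android API):

-- index/statistics maintenance
alter index all on dbo.Orders rebuild
update statistics dbo.Orders

-- parameterised query instead of hardcoding the search value into the SQL string
exec sp_executesql
    N'select OrderId, Total from dbo.Orders where CustomerId = @cust',
    N'@cust int',
    @cust = 42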
You really have to be more specific with respect to what you want to do. What is your mix of operations? What is your table structure? The generic advice is to use indices as appropriate but you aren't going to get much help with such a generic question.
Also, 80,000 records is nothing. It is a moderate-sized table and should not make any decent database break a sweat.
First of all, indexes are really a necessity if you want a well-performing database.
Besides that, though, the techniques depend on what you need to optimize for: Size, speed, memory, etc?
One thing that is worth knowing is that using a function in the WHERE clause on an indexed field will cause the index not to be used.
Example (Oracle):
SELECT indexed_text FROM your_table WHERE upper(indexed_text) = 'UPPERCASE TEXT';
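One common fix in Oracle is a function-based index, which lets the optimizer use an index even though the WHERE clause applies a function; a minimal sketch reusing the names from the example above:

CREATE INDEX idx_upper_indexed_text ON your_table (UPPER(indexed_text));

-- the original query can now be satisfied from the index
SELECT indexed_text FROM your_table WHERE upper(indexed_text) = 'UPPERCASE TEXT';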