Completion latency to copy a SQL Azure database? - sql

I am wondering how much time it takes to complete a database copy on SQL Azure. I am considering a scenario where:
a single database is populated first, and then stays read-only.
a set of copies is created.
an embarrassingly parallel task is run against each copy (read-only).
the copies are deleted to lower the hosting cost.
Such a scenario makes sense if the database copy on SQL Azure is reasonably fast.
Does anyone have information about the latency to complete a copy of a SQL Azure database, perhaps as a function of the GB size of the database (assuming that smaller DBs get copied faster than big ones)?
Subsidiary question: if 10 copies of the DB are triggered at the same time, will it take 10x longer to complete the 10th copy, or does SQL Azure support some level of parallelization for such an operation?

I have no empirical numbers for how long it takes to copy DBs of various sizes, but in my experience the time is usually minutes. For the DBs that I regularly work with, which are less than 100MB, I allow 5 minutes, but this is probably quite generous. I've occasionally copied larger databases and it doesn't seem to take much longer than that; I suspect a lot of the time is actually spent provisioning the new database rather than copying data.
I'm only guessing at what would happen if you initiated multiple copies, but given the SQL Azure infrastructure I'd be surprised if there was much of a slowdown when several copies are started at the same time.
I don't know how long you want the whole process to take, but I think it's basically a good idea. I'd highly recommend doing some of your own benchmarking though.
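A minimal benchmarking sketch, run against the server's master database; the database names here are placeholders, not anything from the question:

-- Note the wall-clock start time, then kick off the copy (it runs asynchronously).
SELECT GETUTCDATE() AS copy_started;
CREATE DATABASE SourceDb_Copy1 AS COPY OF SourceDb;

-- Poll this until percent_complete reaches 100 (or the row disappears once the copy finishes).
SELECT database_id, start_date, modify_date, percent_complete
FROM sys.dm_database_copies;

-- Drop the copy afterwards if it was only a timing test.
-- DROP DATABASE SourceDb_Copy1;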

I've found it particularly slow. I just copied a tiny 3.45MB database, and it took in excess of 5 minutes. It started at 6:42 and finished at 6:49.
This was just using the SQL command-line CREATE ... AS COPY OF. E.g.:
CREATE DATABASE NewDB AS COPY OF OldDB;
I'm not sure what the deal was - whenever I went to check progress, it wasn't showing anything, then suddenly it was just done.
Eg:
SELECT * FROM sys.dm_database_copies c
JOIN sys.databases d ON c.database_id = d.database_id
WHERE d.name = 'NewDB';
The percent_complete column was null each time I looked. I was actually concerned I'd done something wrong...
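As a side note (an observation about the copy behavior, not something stated above): the target database also shows up in sys.databases while the operation is in flight, which can be easier to interpret than a NULL percent_complete.

SELECT name, state_desc
FROM sys.databases
WHERE name = 'NewDB';   -- state_desc reads COPYING during the copy, ONLINE when it is done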

Related

Synchronize SQL Server databases

I have a new idea and a question about it that I would like to ask you.
We have a CRM application on-premise / in house. We use that application more or less 24x7. We also do billing and payroll on the same CRM database, which is OLTP, and the same goes for SSRS reports.
It looks like whenever we do an operation in the front end that inserts and updates a couple of entities at the same time, our application freezes until that process finishes, e.g. extracting payroll for 500 employees for their activities during the last 2 weeks. Basically it summarizes total working hours, pulls those numbers from the database, and writes/updates the record where it says the extract has been accomplished. So for 500 employees we are looking at around 40K-50K rows of Insert/Select/Update statements together.
Nobody can do anything while this process runs! We are considering the following options to take care of this issue.
Running this process in off-hours
OR making a copy of the Dynamics CRM database and doing these operations (extracting thousands of records and running multiple reports) on the copy.
My questions are:
How do we create the copy in the first place, and where should we create it (best practices)?
How do we keep it synchronized in real time?
If we only run SELECT statements against the copy DB, that's fine, but if we do any insert/update on the copy, how do we reflect that on the actual live DB? In short, how do we make sure the original and the copy stay synchronized with each other in real time?
I know I am asking a lot of questions, but being a SQL person stepping into the CRM team to provide suggestions, you know what I am trying to say.
Thanks folks for any suggestions in advance.
Well, to answer your question regarding a live "copy" of a database, a good solution is an AlwaysOn Availability Group.
https://blogs.technet.microsoft.com/canitpro/2013/08/19/step-by-step-creating-a-sql-server-2012-alwayson-availability-group/
Though I don't think that is what you are going to want in this situation. AlwaysOn Availability Groups are typically for database instances that require very low failover time frames. For example: if the primary DB server in the cluster goes down, it fails over to a secondary within a second or two at most, and the end users only notice a slight hiccup.
What I think you would find better is to look at the insert statements that are hitting your database server and see why they are preventing you from pulling data. If they are truly locking the table, changing a large portion of your reads to "NOLOCK" reads might help remedy your situation.
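For illustration, a hedged sketch of what such a NOLOCK read might look like; the table and column names are made-up placeholders, and dirty reads are the trade-off:

SELECT EmployeeId, SUM(HoursWorked) AS TotalHours
FROM dbo.EmployeeActivity WITH (NOLOCK)   -- hint skips shared locks, so the read may see uncommitted rows
WHERE ActivityDate >= DATEADD(WEEK, -2, GETDATE())
GROUP BY EmployeeId;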
It would also be helpful to know what kind of resources you have allocated and whether you have proper indexing on the core tables of your DB. If you don't have proper indexing, a lot of the queries can take longer than normal, causing the locking you're seeing.
Finally, I would recommend table partitioning if the tables you are pulling against are too large. This can potentially help with a lot of disk speed issues and also optimize your queries if you partition by time segment (i.e. make a new partition every X months, so a query that pulls from one time segment only reads that one data file); see the sketch after the link below.
https://msdn.microsoft.com/en-us/library/ms190787.aspx
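A rough sketch of the time-segment partitioning idea, with placeholder names and quarterly boundaries picked only for illustration:

-- Partition function: each quarter of data goes into its own partition.
CREATE PARTITION FUNCTION pfActivityByQuarter (datetime)
AS RANGE RIGHT FOR VALUES ('2016-01-01', '2016-04-01', '2016-07-01', '2016-10-01');

-- Partition scheme: map every partition to a filegroup (all to PRIMARY here for simplicity).
CREATE PARTITION SCHEME psActivityByQuarter
AS PARTITION pfActivityByQuarter ALL TO ([PRIMARY]);

-- Table created on the scheme; queries filtered on ActivityDate touch only the relevant partition(s).
CREATE TABLE dbo.EmployeeActivity
(
    ActivityId   bigint        NOT NULL,
    EmployeeId   int           NOT NULL,
    ActivityDate datetime      NOT NULL,
    HoursWorked  decimal(5,2)  NOT NULL
) ON psActivityByQuarter (ActivityDate);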
I would say you need to focus on efficiency more than on a "copy database", as your volumes aren't high enough to need anything like that, from the sounds of it. I currently have a SQL Server transaction database taking 10 million+ inserts a day, and I still run live reports against it. You just need the resources and proper indexing to accommodate it.

What's the easiest way to add an index on a live MyISAM table?

I have a MyISAM table running in production on MySQL, and by doing a few tests, we've found we can tremendously speed up a query by adding a certain compound index. So far so good. However, I am not really sure about the best way to add this index in a production environment without locking the table for a long time (it's got 27GB of data, so not that much, but it does take a while).
Do you have any tips? If this was a more sophisticated setup of course we'd have a live replica of all of the data on another machine, and we could safely switch. Unfortunately, we're not there yet, and I would like to speed up this query as soon as possible (it's causing big customer headaches). Is there some simple way to replicate the data and then do a swap-out trick? Some other tricks that I am missing?
UPDATE: Reading about "Online Index Operations" in SQL Server makes me very jealous http://msdn.microsoft.com/en-us/library/ms191261.aspx :)
Thanks!
You can use replication to get downtime on the order of a couple of minutes, instead of the hours it might take to create an index on that table.
To set up the slave, see http://dev.mysql.com/doc/refman/5.0/en/replication-howto-existingdata.html
A recommendation I can make to help speed up the process: in step 2, follow the "Creating a Data Snapshot Using Raw Data Files" method, but instead of copying over the wire to the slave, copy to a different location on the master, and bring the master back up as soon as the copy is done and you've made the necessary changes to the config file (set server-id and enable binary logging). This will minimize your downtime to just a minute or two. Once the server is back up, you can copy the copied files to the slave box.
Once you have the slave up and running and you have verified everything is replicating properly, you can pause the slave. Create the index on the slave. When the index creation is complete, resume the slave; this will catch the slave up to the master. On the master, use FLUSH TABLES WITH READ LOCK. Check the slave status to make sure the log positions on the master and the slave match. If they do, shut down the slave and copy the files for that table back to the master.
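A condensed sketch of the statements involved; the table and index names are placeholders:

-- On the slave: pause replication, build the index, then resume.
STOP SLAVE;
ALTER TABLE mytable ADD INDEX idx_col1_col2 (col1, col2);
START SLAVE;

-- Once the slave has caught up: freeze writes on the master and compare positions.
FLUSH TABLES WITH READ LOCK;   -- run on the master
SHOW MASTER STATUS;            -- master's binlog file and position
SHOW SLAVE STATUS;             -- on the slave: check that Exec_Master_Log_Pos matches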
I'm with Randy. We've been in a similar situation, and there are two ways in MySQL to accomplish something like this:
Take down the server while it runs. This is what you'll probably do. It's simple, it's easy, it works. Time to do? Maybe half an hour to 45 minutes, depending on disk bandwidth. See below.
Make a new table with the new index, copy all the data over, pause the server, delete the first table, rename the new one to the old name, start the server. Downtime? 10 minutes, maybe, but really complicated.
Option two works, and saves you the downtime of creating the index (if it takes a long time). But it takes more space, it's more complicated (since you have to deal with the new records inserted off the main table), and it will probably lock the table on MyISAM while copying the data out. Deleting a table will take some time, renaming the new table to the old name will take some time. It's just really complicated; a rough sketch follows. If you had a 2TB table this might be useful, but for 27G it's probably overkill.
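A rough sketch of option two in MySQL, with placeholder names; note that it ignores the tricky part (rows written to the original table while the copy runs):

CREATE TABLE mytable_new LIKE mytable;                         -- same structure, no data yet
ALTER TABLE mytable_new ADD INDEX idx_col1_col2 (col1, col2);
INSERT INTO mytable_new SELECT * FROM mytable;                 -- the long, disk-bound step
RENAME TABLE mytable TO mytable_old, mytable_new TO mytable;   -- atomic swap
-- DROP TABLE mytable_old;                                     -- once you're sure nothing is missing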
Do you have a second server that is close in specifications to your production server? Load up your most recent backup and do the index there, so you know about how long it will take to add. Then plan for downtime.
InnoDB is better about many things, but new indexes still lock the table. The ability that MSSQL (and I think PostgreSQL) has to do those kinds of things without locking would be great.
Find your low usage window and take your application offline during the index build. Since you don't have replication or a multimaster or whatever, you're just going to have to bite the bullet on this one. See you at 1am. :-)
Not much you can do with one server here.
If you copy the table and do a dry run, at least you'll find out how long it's going to take without locking the live table, so you can schedule some maintenance time if necessary, or make a decision whether you can just push the button and leave users hanging for a couple of minutes :)
Or schedule it for a quiet time...
echo "/usr/bin/mysql -uXXX -pXXX -e 'alter table mytable add key(col1, col2)'" | at 04:00

How to replicate database A to B, then truncate data on database A, leaving B alone?

I am having a problem with my SQL Server 2005 database. The database must handle 1000 inserts a second constantly. This is proving to be very difficult when the database must also handle reporting on the data, and thus indexing. It seems to slow down after a couple of days, only achieving 300 inserts per second. By 10 days it is almost non-functional.
The requirement is to store 14 days worth of data. So far I can only manage 3 or 4 before everything falls apart. Is there a simple solution to this problem?
I was thinking that I could replicate the primary database, letting the new database be the reporting database that stores the 14 days worth of data, and then truncate the primary database daily. Would this work?
It is unlikely you will want reporting running against a database capturing 1000 records per second. I'd suggest two databases, one handling the constant stream of inserts and a second reporting database that only loads records at an interval, either by querying the first for a finite set since the last load or by caching the incoming data and loading it separately.
However, reporting in near real time against a database capturing 86 million rows per day and carrying approximately 1.2 billion rows will require significant planning and hardware demands. Further, on the backend as you reach day 14 and start to remove old data you will put more load on the database. If you can run without logging that will help the primary system, but the reporting system with indexing demands and such will require some pretty significant performance considerations.
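For illustration, a hypothetical interval load from the capture database into the reporting database; all object names are placeholders and it assumes an ever-increasing key such as an IDENTITY column:

DECLARE @LastLoaded bigint;
SELECT @LastLoaded = ISNULL(MAX(EventId), 0) FROM ReportingDb.dbo.Events;

-- Pull only the rows that arrived since the last load.
INSERT INTO ReportingDb.dbo.Events (EventId, EventTime, Payload)
SELECT EventId, EventTime, Payload
FROM CaptureDb.dbo.Events
WHERE EventId > @LastLoaded;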
If the server has multiple hard drives I would try to split the database (or even the tables) into partitions.
Yeah, you don't need to copy a database over and then truncate/delete the live database on the fly. My guess is that the slowness is because your transaction logs are growing like crazy?
I think you are trying to say that you want to "shrink" the database periodically. If you are on the FULL recovery model, backing up the transaction logs once in a while should shrink things down to normal again.
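For illustration, the log-backup idea in T-SQL; the database name and path are placeholders:

BACKUP LOG MyDatabase TO DISK = 'D:\Backups\MyDatabase_log.trn';
-- Once log backups run regularly the log space is reused; a one-off shrink only if
-- the file has already ballooned:
-- DBCC SHRINKFILE (MyDatabase_Log, 1024);   -- logical log file name, target size in MB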

SQL Server 2000 - tempdb growing very large

We have a SQL Server 2000 production environment where suddenly (i.e. over the last 3 days) something has caused the tempdb data file to grow very large (45 GB for a database which is only 10 GB).
Yesterday, after it happened again we shrank the database and ran the major batch processes individually without any problems. However, this morning the database was back up to 45 gigs.
Is there a simple way to find out what is causing this database to grow so large? Ideally, something which could be looked at today but if that is not available something which can be set to get that information tomorrow.
BTW: Shrinking the database gets back the space within a few seconds.
Agreed with Jimmy: you need to use SQL Profiler to find which temporary objects are being created so intensively. These may be temporary tables used by some reports or something similar.
I wanted to thank everyone for their answers as they definitely led to the cause of the problem.
We turned on SQL Profiler and sure enough a large bulk load showed up. As we are working on a project to move the "offending" job over to MySQL anyway, we will probably just watch things for now.
Do you have a job running that rebuilds indexes? It is possible that it uses SORT_IN_TEMPDB. Any other large queries that do sorting might also expand tempdb.
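For reference, on SQL Server 2000 the SORT_IN_TEMPDB option appears at index-creation time; a sketch with placeholder names:

CREATE INDEX ix_Orders_OrderDate ON dbo.Orders (OrderDate)
WITH SORT_IN_TEMPDB;

-- Maintenance-job style rebuilds of all indexes on a table look like:
DBCC DBREINDEX ('dbo.Orders');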
This may have something to do with the recovery model that the TempDB is set to. It may be set to FULL instead of BULK-LOGGED. FULL recovery increases the transaction log size until a backup is performed.
Look at the data file size vs. the transaction log size.
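A couple of quick checks along those lines, as a sketch in SQL Server 2000 syntax:

-- Data vs. log file sizes inside tempdb (size is stored in 8 KB pages).
SELECT name, size * 8 / 1024 AS size_mb
FROM tempdb..sysfiles;

-- Recovery model of tempdb.
SELECT DATABASEPROPERTYEX('tempdb', 'Recovery') AS recovery_model;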
I'm not a DBA but some thoughts:
Is it possible that there are temp tables being created but not dropped? ##tempTable?
Is it possible that there is a large temp table being created (and dropped) but the space isn't reclaimed?
Are you doing any sort of bulk loading where the system might use the temp table?
(I'm not sure if you can) but can you turn on Auto-Shrink for the tempdb?
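One rough way to check the first couple of points above (a SQL Server 2000 era sketch): leftover # and ## tables show up in tempdb's sysobjects as user tables with their creation time.

SELECT name, crdate
FROM tempdb..sysobjects
WHERE type = 'U'
ORDER BY crdate DESC;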

mysql slow on first query, then fast for related queries

I have been struggling with a problem that only happens when the database has been idle for a period of time for the data queried. The first query will be extremely slow, on the order of 30 seconds and then related queries will be fast like 0.1 seconds. I am assuming this is related to caching, but I have been unable to find the cause of it.
Changing the mysql variables tmp_table_size, max_heap_table_size to a larger size had no effect except to create the temp tables in memory.
I do not think this is related to the query itself as it is well indexed and after the first slow query, variants of the same query do not show up in the slow query log. I am most interested in trying to determine the cause of this or a way to reset the offending cache so I can troubleshoot the issue.
Pages of the innodb data files get cached in the innodb buffer pool. This is what you'd expect. Reading files is slow, even on good hard drives, especially random reads which is mostly what databases see.
It may be that your first query is doing some kind of table scan which pulls a lot of pages into the buffer pool, then accessing them is fast. Or something similar.
This is what I'd expect.
Ideally, use the same engine for all tables (exceptions: system tables, temporary tables (perhaps) and very small tables or short-lived ones). If you don't do this then they have to fight for ram.
Assuming all your tables are innodb, make the buffer pool use up to 75% of the server's physical ram (assuming you don't run too many other tasks on the machine).
Then you will be able to fit around 12G of your database into ram, so once it's "warmed up", the "most used" 12G of your database will be in ram, where accessing it is nice and fast.
Some users of mysql tend to "warm up" production servers following a restart by sending them queries copied from another machine for a while (these will be replication slaves) until they add them into their production pool. This avoids the extreme slowness seen while the cache is cold. For example, Youtube does this (or at least it used to; Google bought them and they may now use Google-fu)
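As a very rough illustration of that warm-up idea (the table and column names are placeholders; replaying real production queries warms the working set far more faithfully than a blanket scan):

SELECT COUNT(*) FROM mytable;                       -- pulls (at least) one of the table's indexes into the buffer pool
SELECT AVG(LENGTH(some_text_column)) FROM mytable;  -- forces a read of the full rows as well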
MySQL Workbench:
The below isn't 100% related to this SO question, but the symptoms are very related and this is the first SO result when searching for "mysql workbench slow" or related terms, so hopefully it's useful for others.
Clear the query history! Following the process at "MySql workbench query history ( last executed query / queries ) i.e. create / alter table, select, insert update queries" to clear MySQL Workbench's query history really sped up the program for me.
In summary: change the Output pane to History Output, right click on a Date and select Delete All Logs.
The issue I was experiencing was "slow first query" in that it would take a few seconds to load the results even though the duration/fetch were well under 1 second. After clearing my query history, the duration/fetch times stayed the same (well under 1 second, so no DB behavior actually changed), but now the results loaded instantly rather than after a few second delay.
Is anything else running on your mysql server? My thought is that maybe after the first query, your table is still cached in memory. Once it's idle, another process is causing it to be de-cached. Just a guess though.
How much memory do you have, and what else is running?
I had an SSIS package that was timing out. The query was very simple, from a single MySQL table, but it sometimes returned a lot of records and would sometimes take a few minutes initially to execute, then only a few milliseconds afterwards if I wanted to query it again. We were stuck with the ADO connection, which meant it would time out after 30 seconds, so about half the databases we were trying to load were failing.
After beating my head against the wall I tried performing an initial query first; very simple and only returning a few rows. Since it was so simple it executed fast and set the table in the cache for faster querying. In the next step of the package I would do the more complex query which returned the large data set that kept timing out. Problem solved - all tables loaded. I may start doing this on a regular basis, the complex queries execute much faster by doing a simple query first.
Try comparing the output of "vmstat 1" on the Linux command line when running the query after a period of time vs. when you re-run it and get results fast. Specifically check the "bi" column (that's the KB read from disk per second).
You may find the operating system is caching the disk blocks in the fast case (and thus a low "bi" figure), but not in the slow case (and hence a large "bi" figure).
You might also find that vmstat shows high/low cpu usage in either case. If it's low when fast, and disk throughput is also low, then your system may still be returning a cached query, even though you've indicated the relevant config value is set to zero. Perhaps check the output of show engine innodb status and SHOW VARIABLES and confirm.
innodb_buffer_pool_size may also be set high (it should be...), which would cache the blocks even before the OS can return them.
You might also find that "key_buffer" is set high - this would cache the keys in the indexes, which could make your select blindingly fast.
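Spelled out, the checks suggested above look something like this (a sketch of what to inspect, not a fix in itself):

SHOW VARIABLES LIKE 'query_cache%';            -- confirm the query cache really is disabled
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
SHOW VARIABLES LIKE 'key_buffer_size';         -- MyISAM index cache
SHOW ENGINE INNODB STATUS;                     -- buffer pool usage and pages read from disk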
Try checking the MySQL Performance Blog site for lots of useful info.
I had an issue where MySQL 5.6 was slow on the first query after an idle period. This was a connection problem, not a MySQL instance problem: e.g. if you run MySQL Query Browser, execute "select * from some_queue", leave it alone for a couple of hours, then execute any query, it runs slow, while at the same time processes on the server or a new instance of the Browser will select from the same tables instantly.
Adding skip-host-cache and skip-name-resolve to the my.ini file solved this problem.
I don't know why that is. Why I tried this: MySQL 5.1 without those options was slow establishing connections from other networks (e.g. the server is 192.168.1.100, 192.168.1.101 connects fast, 192.168.2.100 connects slowly); MySQL 5.6 didn't have such a problem to start with, so we didn't add these to my.ini initially.
UPD: that solved half the cases, actually. Setting wait_timeout to the maximum integer fixed the other half. Maybe I can now even remove skip-host-cache and skip-name-resolve and it won't slow down in 100% of the cases.
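For reference, the wait_timeout change can also be tried at runtime before committing it to my.ini; the exact allowed maximum differs by platform, and a large value has the same practical effect:

SET GLOBAL wait_timeout = 31536000;   -- applies to connections opened after this statement
SHOW VARIABLES LIKE 'wait_timeout';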