I'm running Firebird 2.5 (and have also tried earlier versions) on Windows. Every day after 12:00 PM, insert/update queries on one specific table hang, but they complete successfully around 12:35 or so, no matter when they were started. It seems that Firebird is doing some kind of maintenance on the table that takes half an hour to complete, during which time the table cannot be written to (but reads are fast). The table itself is really small, some 10,000 rows, compared to the millions of rows we have in other tables - and the other tables do not get stuck.
I haven't been able to find any reason or solution. I tried dumping the table and restoring it, which didn't help; I tried switching between SuperServer and Classic, and changed versions, with no success.
Has anyone experienced a problem like this?
No. Firebird doesn't have any internal maintenance procedures bound to a specific time of day. It seems there is some task on your server scheduled to run at 12:00 PM, or there are network users of the server who start some heavy access at 12:00 PM.
The only maintenance FB does is "garbage collection" (getting rid of old record versions), and this is done on a "when needed" basis (usually when records are selected; see GCPolicy in firebird.conf), not at some predefined time.
Do you experience this hang only during these certain hours, or is it always slow to insert into that table? Have you checked the server load during the slowdown (i.e. in Task Manager, is the CPU maxed out)? Anyway, here are some ideas to check:
What constraints / triggers do you have on the table? If they involve some extensive checks (e.g. against other tables which contain millions of rows), this could be the reason inserts take so long.
Perhaps there is some other service which is triggered at that time? E.g. do you have a scheduled job to back up the DB at that time? Or perhaps some other system service which runs at that time with higher priority slows down the server?
Do you have the trace service active for the table? See fbtrace.conf in the Firebird root directory. If it is active, extensive logging might be the cause of the slowdown; if it isn't active, using it might help you find the cause.
What are the settings for ForcedWrites / UnflushedWrites (see firebird.conf)? Does changing them make a difference?
Is there something logged for this troublesome timeframe in firebird.log?
To me it looks like you have a process which starts at 12:00 and does something which locks the entire table. Use the monitoring table or the trace manager to see if there is any connection or active transaction which looks suspicious.
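A minimal sketch against the Firebird 2.5 monitoring tables (run it around 12:00): it lists attachments that own currently active transactions, which is where a long-running lock holder would show up:

-- Active transactions (MON$STATE = 1) and the attachments that own them.
SELECT a.MON$ATTACHMENT_ID, a.MON$USER, a.MON$REMOTE_ADDRESS,
       t.MON$TRANSACTION_ID, t.MON$TIMESTAMP
FROM MON$ATTACHMENTS a
JOIN MON$TRANSACTIONS t ON t.MON$ATTACHMENT_ID = a.MON$ATTACHMENT_ID
WHERE t.MON$STATE = 1
ORDER BY t.MON$TIMESTAMP;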
I also think your own transactions are started with the WAIT clause without a LOCK TIMEOUT; you might want to change this to NO WAIT or WAIT with a LOCK TIMEOUT, so that your transactions either fail immediately or after the timeout.
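For example (a hedged sketch of Firebird 2.x syntax; these are alternatives, pick one):

-- Fail immediately if a row is locked:
SET TRANSACTION READ WRITE NO WAIT;
-- ...or give up after 10 seconds instead of hanging until the lock clears:
SET TRANSACTION READ WRITE WAIT LOCK TIMEOUT 10;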
My suggestion is to use the TRACE API in 2.5 to track down what is happening near or around that time. That should help get you more information as to what is happening.
I use this for debugging: http://upscene.com/products.misc.fbtm.php. It's kind of buggy itself, but when it is working it is a godsend.
Are some client connections going down at 12:00 PM? I had a similar problem on a table of about 70,000 records:
Client "A" has a permanently open DB connection running something like "select * from TABLE". This is a "read only transaction", but it is reason enough for the server to keep record versions around. Why?
Client "B" makes massive updates to this table, and the server tries to preserve the world as it was when "A" started its "select". This is normal for transactional DB servers, and it is implemented by creating copies of the record data before it is updated.
So in my case, 170,000 record versions existed for this TABLE. You can measure this by
gstat -r -t TABLE db.fdb | grep versions
If Client "B" goes down, the count of Record-Versions is NOT growing any more. Client "A" is the guilty one, freezing all this versions, forces the server to hold it. Finally if Client "A" goes down (or for example a firewall rule cuts all pending connections) Firebird is happy to start the process of getting rid of the now useless Record-Versions.
This "sweep"?! is bad programmed (even 2.5.2) cpu is 3% it do only <10.000 Versions / Minute so this TABLE has a performance of about 2%.
I have two databases on different servers - center_db on siglv01\sql2008 and center_db on sig\sql2008.
Can I restart replication without needing to reinitialize it? The connection dropped more than 3 days ago and is now too slow, so I want to start replication again without a reinitialize.
Based on the brief conversation above, I don't think you can do this without a re-init. Specifically, the distribution database only keeps so many commands before it starts trimming. The default is 72 hours. If the last command delivered to all of your subscribers is older than that, the distribution database doesn't have what it needs to play forward all of the activity that has happened since then.
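If you want to check (or later lengthen) that retention window, here is a hedged sketch run on the distributor, assuming the default distribution database name:

-- Show the distribution database properties, including max_distretention (hours).
EXEC sp_helpdistributiondb @database = N'distribution';
-- To widen the window for the future, e.g. to 96 hours:
EXEC sp_changedistributiondb @database = N'distribution',
     @property = N'max_distretention', @value = N'96';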
Your only hope would be if the distribution agent is still running (it knows when the above situation happens and will give you an error saying as much). If so, try to figure out why delivery is slow (troubleshoot this like any other "slow application"; replication isn't magic) and see if it can get caught up that way. Depending on how many commands remain undelivered, it may be faster to just re-init.
On our production server, at a specific time of day the thread count climbs and climbs to the point that, even though CPU utilization is normal (30-50%), queries start to run slowly and we see a lot more blocking statements.
I am not sure where to look. Basically, when our site runs normally the thread count is around 150 threads, but during a specific time of day (between 1:30 and 2:30) it goes up to 270 threads. There are no extra SQL transactions going on; everything is as normal as it was before, but the thread count grows and SQL starts behaving very, very slowly.
After restarting the SQL service the thread count immediately comes back to normal, and our site functions fine for another 24 hours.
We are using SQL Server 2005 on a 24-core machine.
Any ideas?
Blocking statements steal workers (sys.dm_os_workers), so the server will spawn more workers to handle the incoming tasks. At 24 cores you'll have some 700 max worker threads out of the box by default, so seeing 270 'threads' is not an issue; it is well within normal operating parameters.
Your real problem must be the blocking, and you have to investigate it accordingly: who is blocking whom and why. My bet is that you have a job running between 1:30 and 2:30 that is locking large portions of the database (a delete job perhaps?) and your queries block on locked rows. You'll have to investigate, find the root cause, and act accordingly. Rebooting is not a solution, nor is blaming unrelated components (thread count). Use Activity Monitor, use Who Is Active, and follow the methodical approach of the Waits and Queues methodology. There are plenty of ways to identify the real problem. SQL Server will never appear slow due to thread count. It simply doesn't work like that.
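A hedged starting point for the "who is blocking whom" question, using DMVs that exist on SQL Server 2005:

-- Requests that are currently blocked, the session blocking them, and their SQL text.
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time,
       t.text AS query_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;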
You can control the degree of parallelism using the MAXDOP query hint. For more details please check this article:
http://blog.sqlauthority.com/2010/03/15/sql-server-maxdop-settings-to-limit-query-to-run-on-specific-cpu/
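A hedged illustration of the hint (the table name here is hypothetical):

-- Limit this one query to 4 CPUs regardless of the server-wide setting.
SELECT customer_id, COUNT(*) AS order_count
FROM orders_summary
GROUP BY customer_id
OPTION (MAXDOP 4);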
Thanks for your valuable feedback. Yes, it's true that it was not SQL behaving weirdly on its own; our site, which is based on Ektron CMS, was responsible. One piece of Ektron CMS functionality (PageBuilder), while operating on this piece of content, was holding locks on the table badly. We have around 10 million users on our site, and probably because this was blocking the tables, SQL Server went nuts and did not respond very well.
We have finally eliminated the issue.
A colleague of mine (I promise it was a colleague!) has left an update running on our main SQL Server since last Thursday (yes that's right folks, we're pushing 100 hours now!). The SQL in question (in one transaction, I might add) is:
update daily_prices
set min_date = (select min(a.date)
                from daily_prices a
                where a.key = daily_prices.key
                  and a.iid = daily_prices.iid)
(Yeah I know, heinous...)
The total cost in the query plan is coming out as 22186.7, the estimated number of rows to update is around 151 million.
We obviously need to resolve this query one way or another. We realise that if we kill the query, we're going to generate some brutal rollback, but we've got no way of knowing how far it has gotten. The only thing we do know is this entry from sys.dm_exec_requests:
session_id  status     query_text              cpu_time  total_elapsed_time  reads     writes    logical_reads
52          suspended  update daily_prices...  2328469   408947075           13831137  42458588  151809497
So my question is, what would be our best course of action?
wait it out
kill it and roll back, and hope that it rolls back before the next ice age
something else?
I personally would want to wait it out unless I thought it had no chance of finishing this week; the rollback at this stage could take far longer than the query has run to date. If it's a production server, I really wouldn't take option 2 and kill it unless I absolutely had to.
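For what it's worth, if you do end up killing it, SQL Server can at least report rollback progress. A hedged example using the session id from your sys.dm_exec_requests output:

-- Reports estimated rollback completion percentage and time remaining;
-- it does not issue a second kill. Only meaningful once the session is rolling back.
KILL 52 WITH STATUSONLY;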
In terms of regaining some control / a working system: if you have suitable backups, bring another database online and restore the full backup / t-log backups, but you will not want to restore to a point beyond when the transaction was started (or it will still have to be rolled back). This at least gives you a system you could continue dev work against, but it is unlikely to be the ideal situation for a prod system.
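A minimal sketch of that restore path; the file names, logical names, and stop time below are all hypothetical, since the transaction's actual start time isn't known:

-- Restore the last full backup without recovering, then roll the log forward
-- to just before the runaway update began.
RESTORE DATABASE prices_copy
    FROM DISK = N'D:\backups\prices_full.bak'
    WITH MOVE N'prices' TO N'D:\data\prices_copy.mdf',
         MOVE N'prices_log' TO N'D:\data\prices_copy.ldf',
         NORECOVERY;
RESTORE LOG prices_copy
    FROM DISK = N'D:\backups\prices_log.trn'
    WITH STOPAT = N'2012-01-05T08:55:00', RECOVERY;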
If it's a production server, have some kind words with the individual as to the suitability of testing queries and query plans prior to executing them. I am sure many DBAs can suggest less polite methods of instruction :)
So we got fed up with waiting for our transaction to complete (after a full week on one piece of SQL, who wouldn't?), and as it was interfering with our backup process, we thought killing it was a necessary evil.
The database started to roll back the transaction.
5 days passed.
We noted from some posts elsewhere on the internet that sometimes some magic happened when the database was restarted and the transaction would "go away". Although these claims are generally debunked* and it makes no sense, we thought we had nothing left to lose, so we gave it a go. We knew the database would go into recovery mode, but the database was becoming increasingly sick and unable to run anything but its current rollback work anyway, and we've seen SQL Server misbehave by hogging system resources and not diverting them to where it needs to do the work.
(* we also know enough database theory to know that the DB wouldn't just "forget" about a transaction in progress, but we were also seeing stack dumps in the SQL Server error logs, which kind of told us that SQL Server was getting increasingly grumpy at the amount of rollback it was having to undertake)
So we restarted the database.
Sure enough, the database went into recovery mode. However, the SQL Server event log was now giving us an update every 20 seconds or so as to how long it was going to take (in all, it reckoned about 25 hours from the log messages, but it ended up being just an hour and a half (!)).
Whether this method of recovery/rollback is faster, I would strongly doubt (as I expect SQL Server had to do the same level of work to unwind the transaction as before); however, it did finish within an hour and a half. Either way, I don't want to make a habit of restarting my production database when it is halfway through a rollback. The update messages in the event log were an absolute godsend, as anyone who has written a batch program will tell you; however inaccurate they turned out to be, at least they gave us a worst case.
As we had the luxury of being the only two people using this production box, choosing to send the database into recovery mode worked for us, and gave us informational messages we didn't have access to with just our previous rollback state (or at least nothing we could interpret given our lacking DBA skills). Would I recommend doing this in future? ....Absolutely not. However, hopefully the concerned parties have learnt their lesson, and we can ask the board for some money for a proper development server! (epic Joel-Test fail!)
I ran a T-SQL query on a SQL Server 2005 DB. It was supposed to run for 4 days, but when I checked after 4 days I found the machine had been shut down abruptly. So I want to trace what happened with the query, or in other words, what was the status of the query when the machine went down.
Is there a mechanism to check this?
It may be very difficult to do such a post-mortem. But while it may not be possible to determine what portion of the long-running query was active when the shutdown occurred, it is important to try and find the root cause of the problem!
Depending on the way the query is written, SQL will attempt to roll back any SQL that wasn't completed (or any uncommitted transaction, if transactions were used). This will happen the first time the server is restarted; if you have any desire to analyze the SQL transaction log (yuck!), make a copy first.
Before getting to the part of the query which may have been running at the time of the shutdown, it may be preferable to look into the SQL logs (I mean application logs, with informational messages, timestamps and such, not the SQL transaction log for a given database), in search of basic facts, and possibly of an error message happening some time prior to the shutdown that could plausibly be the origin of the problem. Do also check the Windows event logs, as they may indicate abnormal system-wide conditions that could be a cause or consequence of the SQL shutdown.
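A hedged way of skimming those server logs from T-SQL (sp_readerrorlog is undocumented but commonly available; it wraps xp_readerrorlog):

-- Arguments: error log number (0 = current), 1 = SQL Server log (2 = Agent log),
-- optional text filter. Adjust or drop the filter to browse around the shutdown.
EXEC sp_readerrorlog 0, 1, N'shutdown';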
The SQL transaction log can be "decompiled" and used to understand what was actually updated/inserted/deleted in the database; however, this type of info for a 4-day run may be buried quite deep and possibly interspersed with unrelated queries. In fact, running out of disk space for the SQL log could possibly cause the server to become erratic (I would expect a more elegant handling of such a situation, but if somehow the drive in question was also the system drive, bad things could happen...).
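If you do decide to poke at the log, a hedged starting point (fn_dblog is undocumented, and on a 4-day run the output can be enormous, so filter aggressively):

-- A first look at log records, their transaction IDs and begin times.
SELECT TOP (100) [Current LSN], Operation, [Transaction ID], [Begin Time]
FROM fn_dblog(NULL, NULL);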
Finally, by analyzing the "4 days long" SQL script, it may be possible to infer which parts were completed, thanks to various side effects of the script. If nothing else, this review may allow putting the database back into a coherent state if necessary, or truncating the script to exclude the parts already completed, in preparation for a new run to finish the work.
I have been struggling with a problem that only happens when the database has been idle for a period of time for the data being queried. The first query will be extremely slow, on the order of 30 seconds, and then related queries will be fast, like 0.1 seconds. I am assuming this is related to caching, but I have been unable to find the cause of it.
Changing the MySQL variables tmp_table_size and max_heap_table_size to a larger size had no effect except to create the temp tables in memory.
I do not think this is related to the query itself as it is well indexed and after the first slow query, variants of the same query do not show up in the slow query log. I am most interested in trying to determine the cause of this or a way to reset the offending cache so I can troubleshoot the issue.
Pages of the innodb data files get cached in the innodb buffer pool. This is what you'd expect. Reading files is slow, even on good hard drives, especially random reads which is mostly what databases see.
It may be that your first query is doing some kind of table scan which pulls a lot of pages into the buffer pool, then accessing them is fast. Or something similar.
This is what I'd expect.
Ideally, use the same engine for all tables (exceptions: system tables, temporary tables (perhaps), and very small or short-lived tables). If you don't do this, then they have to fight for RAM.
Assuming all your tables are InnoDB, make the buffer pool use up to 75% of the server's physical RAM (assuming you don't run too many other tasks on the machine).
Then you will be able to fit around 12G of your database into RAM, so once it's "warmed up", the "most used" 12G of your database will be in RAM, where accessing it is nice and fast.
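A quick way to check what you currently have, with the corresponding my.cnf line shown as a comment (the 12G figure assumes roughly 16G of physical RAM, as implied above):

-- Current buffer pool size in bytes (12G = 12884901888).
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
-- In my.cnf / my.ini under [mysqld] (a server option, not a statement, and it
-- requires a restart on MySQL versions of this era):
--   innodb_buffer_pool_size = 12G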
Some users of MySQL tend to "warm up" production servers following a restart by sending them queries copied from another machine for a while (these will be replication slaves) until they add them into their production pool. This avoids the extreme slowness seen while the cache is cold. For example, YouTube does this (or at least it used to; Google bought them and they may now use Google-fu).
MySQL Workbench:
The below isn't 100% related to this SO question, but the symptoms are very related and this is the first SO result when searching for "mysql workbench slow" or related terms, so hopefully it's useful for others.
Clear the query history! Following the process at "MySql workbench query history (last executed query / queries), i.e. create / alter table, select, insert update queries" to clear MySQL Workbench's query history really sped up the program for me.
In summary: change the Output pane to History Output, right click on a Date and select Delete All Logs.
The issue I was experiencing was "slow first query", in that it would take a few seconds to load the results even though the duration/fetch were well under 1 second. After clearing my query history, the duration/fetch times stayed the same (well under 1 second, so no DB behavior actually changed), but the results now loaded instantly rather than after a delay of a few seconds.
Is anything else running on your mysql server? My thought is that maybe after the first query, your table is still cached in memory. Once it's idle, another process is causing it to be de-cached. Just a guess though.
How much memory do you have, and what else is running?
I had an SSIS package that was timing out. The query was very simple, from a single MySQL table, but it sometimes returned a lot of records and would sometimes take a few minutes initially to execute, then only a few milliseconds afterwards if I wanted to query it again. We were stuck with the ADO connection, which meant it would time out after 30 seconds, so about half the databases we were trying to load were failing.
After beating my head against the wall I tried performing an initial query first; very simple and only returning a few rows. Since it was so simple it executed fast and set the table in the cache for faster querying. In the next step of the package I would do the more complex query which returned the large data set that kept timing out. Problem solved - all tables loaded. I may start doing this on a regular basis, the complex queries execute much faster by doing a simple query first.
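A rough sketch of that trick with hypothetical table and column names (the actual package's query isn't shown); the cheap first query forces a scan that pulls the table's pages into the cache before the heavy one runs:

-- Step 1: cheap warm-up; COUNT(*) scans the table into the buffer pool.
SELECT COUNT(*) FROM staging_orders;
-- Step 2: the heavier extraction query, now served largely from memory.
SELECT order_id, customer_id, total, updated_at
FROM staging_orders
WHERE updated_at >= CURRENT_DATE - INTERVAL 7 DAY;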
Try comparing the output of "vmstat 1" on the Linux command line when running the query after a period of time, versus when you re-run it and get results fast. Specifically check the "bi" column (that's the KB read from disk per second).
You may find the operating system is caching the disk blocks in the fast case (and thus a low "bi" figure), but not in the slow case (and hence a large "bi" figure).
You might also find that vmstat shows high/low cpu usage in either case. If it's low when fast, and disk throughput is also low, then your system may still be returning a cached query, even though you've indicated the relevant config value is set to zero. Perhaps check the output of show engine innodb status and SHOW VARIABLES and confirm.
innodb_buffer_pool_size may also be set high (it should be...), which would cache the blocks even before the OS can return them.
You might also find that "key_buffer" is set high - this would cache the keys in the indexes, which could make your select blindingly fast.
Try checking the MySQL Performance Blog site for lots of useful info.
I had an issue where MySQL 5.6 was slow on the first query after an idle period. This was a connection problem, not a MySQL instance problem; e.g. if you run MySQL Query Browser, execute "select * from some_queue", leave it alone for a couple of hours, and then execute any query, it runs slowly, while at the same time processes on the server or a new instance of the Browser will select from the same tables instantly.
Adding skip-host-cache and skip-name-resolve to the my.ini file solved this problem.
I don't know why that is. Why I tried this: MySQL 5.1 without those options was slow to establish connections from other networks (e.g. the server is 192.168.1.100; 192.168.1.101 connects fast, 192.168.2.100 connects slowly). MySQL 5.6 didn't have such a problem to start with, so we hadn't added these to my.ini initially.
UPD: That solved half the cases, actually. Setting wait_timeout to the maximum integer fixed the other half. Maybe I can even remove skip-host-cache and skip-name-resolve now and it won't slow down in 100% of the cases.
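For reference, a hedged version of those two steps run from a client (the exact maximum varies by platform; MySQL documents 31536000 seconds for non-Windows builds):

-- Check, then raise, the idle-connection timeout. Existing connections keep
-- their old session value, so reconnect after changing the global.
SHOW VARIABLES LIKE 'wait_timeout';
SET GLOBAL wait_timeout = 31536000;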