I've got a pretty simple update statement in my application in one of the APIs.
UPDATE "table" SET "column" = "column" + 1 WHERE "id" = $1
The problem is when this API incurs a sudden load it starts to fail. There is no increase in DB CPU only the number of active sessions shoot up. When checking the RDS performance insights this is what I see.
The number of updates to this table doesn't shoot up more than 300. I don't think postgres should be behaving this way with such low number of updates (time spent in waits). It'll be helpful if I could have suggestions what might be going wrong here? I understand what does wait in tuple and transactionid mean but how can I reduce these waits?
Thanks
The problem is obvious: all these sessions are waiting for a row lock. If all these sessions are running the statement you quote with the same id (or only a low number of different ids), then they are blocking each other.
Since the sessions don't seem to do any work (if I read the graph right, they are all waiting), I'd say that the problem is that your transactions are taking too long. PostgreSQL holds the row lock, like all other user-facing locks, until the transaction ends, so having that UPDATE in your workload effectively serializes processing.
You could improve the situation by performing the UPDATE at the very end of the transaction, so that the lock is not held for a long time. In addition, you should reduce your connection pool size considerably. I don't think that your database can deal with 1000 active connections.
Reduce the maximum number of parallel HTTP requests your web server accepts (assuming HTTP(S)) so you don't overload the database. This will cause the API requests to your service to queue waiting for a TCP connection. This will result in improved performance rather than worse at time of peak loads.
Related
Am doing some POC with real-time scenarios for SaaS product to handle high volume of message, this will reach peak within few seconds(send/process) and listener side processing message then storing that computed data into Azure SQL Database(Separate Elastic Pool, 100 eDTU with Standard subscription), to mimic this am sending & processing message in parallel with few nodes and threads, in this case am facing some slowness in first few seconds of database operation when DTU reached maximum level the query execution is normal
Is this expected behavior?
What will happen if executes query during scaling of DTU?
How to avoid this?
When you scale up or down the service tier of an Azure SQL Database open transactions are rolled back, server logins may be disconnected, query plans may vary because the number of threads available for query changes, and the data cache and query cache will be cleared.
Since the data cache is empty, the first time you run a query it has to do a lot of physical IO, memory allocation raises and it's slow. You may take a look at queries performing slow and they may be showing the PAGEIOLATCH_SH and MEMORY_ALLOCATION_EXT waits and that corresponds to pages being pulled from disk to the buffer. The second time you run the query the data is stored on the data cache and it runs faster.
If the database faces high DTU usage for a good period of time throttling may see connection timeouts and poor performance on queries.
Is it possible to specify the maximum amount of time a SELECT query may take with SQLITE?
The situation would be useful where you have big tables and the users have to possibility to enter free search terms. If the searched for terms is not quickly found then the entire table is scanned which can take a very long time as indices cannot generally be used.
So having SQLITE give up after a few seconds would be useful.
I am using SQLITE via the System.Data.Sqlite and it seemed that SqliteCommand.CommandTimeout would be what I want, but setting this seem to have no effect for some reason. Perhaps I'm missing something.
For a simple select query, no, there doesn't appear to be a way to set a timeout, or maximum time to execute, on SQLite itself. The only mention of timeout in the documentation is the busy timeout. So, if you need to limit the maximum amount of time a select query can take, you'll need to wrap your connection with a timeout in the application level, and cancel/close your connection if that timeout is exceeded. How to do that would obviously be application/language specific.
For a busy timeout (the timeout before the connection stops waiting for locks to clear) you can do it through the provided C interface, or through the SQLite driver provided to your application.
We have one table having continuous inserts from 3 windows services in SQL Server 2008 with same SQL user. This makes table heavily loaded and retrieval operation and IO becomes slow.
We have decided to split this table in to two i.e. one for latest data and another for history data. I have one question here is, whether I gain performance benefit if I create one separate user for each windows service, so total 3 user in our case, for insert operation. I think there will be 3 session here i.e. separate session for each user and that might improve performance.
Am I right?
3 sessions don't necessarily require 3 different user id's. Each of the three data loading processes could establish a session using the same credentials, and you'd still have 3 sessions.
However, you may now run into contention, where each process locks the other out, which may result in slower overall performance. This could be avoided by configuring row-level locking on the table space, which itself will usually cost a slight performance hit.
You might still get better performance by batching your inserts into groups of say 5, 10 or 25 records before committing the operation. The downside to this approach is that on exception when you have to rollback and re-do, it takes longer because the unit of work is bigger.
I don't think there will be any performance increase by using different users. There will still be three different database sessions.
Depending on your setup and database, I think you may alliviate IO load with one or several of the following tips:
Batch inserts. Build a single service that can batch inserts. Less insert operations is way better (IO wise) than many insert operations.
Depending on your scenario, you may gain performance by lowering transaction isolation level for reads and inserts to that table.
Minimize the number of indexes on that table. Inserts to tables with indexes are more expensive.
Make sure the tables are stored on a disk that is fast enough for the IO throughput you need. Make sure the disk is not being used by other services.
I hope that helps.
I have been struggling with a problem that only happens when the database has been idle for a period of time for the data queried. The first query will be extremely slow, on the order of 30 seconds and then related queries will be fast like 0.1 seconds. I am assuming this is related to caching, but I have been unable to find the cause of it.
Changing the mysql variables tmp_table_size, max_heap_table_size to a larger size had no effect except to create the temp tables in memory.
I do not think this is related to the query itself as it is well indexed and after the first slow query, variants of the same query do not show up in the slow query log. I am most interested in trying to determine the cause of this or a way to reset the offending cache so I can troubleshoot the issue.
Pages of the innodb data files get cached in the innodb buffer pool. This is what you'd expect. Reading files is slow, even on good hard drives, especially random reads which is mostly what databases see.
It may be that your first query is doing some kind of table scan which pulls a lot of pages into the buffer pool, then accessing them is fast. Or something similar.
This is what I'd expect.
Ideally, use the same engine for all tables (exceptions: system tables, temporary tables (perhaps) and very small tables or short-lived ones). If you don't do this then they have to fight for ram.
Assuming all your tables are innodb, make the buffer pool use up to 75% of the server's physical ram (assuming you don't run too many other tasks on the machine).
Then you will be able to fit around 12G of your database into ram, so once it's "warmed up", the "most used" 12G of your database will be in ram, where accessing it is nice and fast.
Some users of mysql tend to "warm up" production servers following a restart by sending them queries copied from another machine for a while (these will be replication slaves) until they add them into their production pool. This avoids the extreme slowness seen while the cache is cold. For example, Youtube does this (or at least it used to; Google bought them and they may now use Google-fu)
MySQL Workbench:
The below isn't 100% related to this SO question, but the symptoms are very related and this is the first SO result when searching for "mysql workbench slow" or related terms, so hopefully it's useful for others.
Clear the query history! - following the process at MySql workbench query history ( last executed query / queries ) i.e. create / alter table, select, insert update queries to clear MySQL Workbench's query history really sped up the program for me.
In summary: change the Output pane to History Output, right click on a Date and select Delete All Logs.
The issue I was experiencing was "slow first query" in that it would take a few seconds to load the results even though the duration/fetch were well under 1 second. After clearing my query history, the duration/fetch times stayed the same (well under 1 second, so no DB behavior actually changed), but now the results loaded instantly rather than after a few second delay.
Is anything else running on your mysql server? My thought is that maybe after the first query, your table is still cached in memory. Once it's idle, another process is causing it to be de-cached. Just a guess though.
How much memory do you have any what else is running?
I had an SSIS package that was timing out. The query was very simple, from a single MySQL table, but it sometimes returned a lot of records and would sometimes take a few minutes initially to execute, then only a few milliseconds afterwards if I wanted to query it again. We were stuck with the ADO connection, which meant it would time out after 30 seconds, so about half the databases we were trying to load were failing.
After beating my head against the wall I tried performing an initial query first; very simple and only returning a few rows. Since it was so simple it executed fast and set the table in the cache for faster querying. In the next step of the package I would do the more complex query which returned the large data set that kept timing out. Problem solved - all tables loaded. I may start doing this on a regular basis, the complex queries execute much faster by doing a simple query first.
Ttry and compare the output of "vmstat 1" on the linux command line when running the query after a period of time, vs when you re-run it and get results fast. Specifically check the "bi" column (that's the kb read from disk per second).
You may find the operating system is caching the disk blocks in the fast case (and thus a low "bi" figure), but not in the slow case (and hence a large "bi" figure).
You might also find that vmstat shows high/low cpu usage in either case. If it's low when fast, and disk throughput is also low, then your system may still be returning a cached query, even though you've indicated the relevant config value is set to zero. Perhaps check the output of show engine innodb status and SHOW VARIABLES and confirm.
innodb_buffer_pool_size may also be set high (it should be...), which would cache the blocks even before the OS can return them.
You might also find that "key_buffer" is set high - this would cache the keys in the indexes, which could make your select blindingly fast.
Try check the mysql performance blog site for lots of useful info.
I had issue when MySQL 5.6 was slow on first query after idle period. This was a connection problem, not MySQL instance problem, e.g. if you run MYSQL Query Browser execute "select * from some_queue", leave it alone for a couple of hours, then execute any query, it runs slow, while at the same time processes on server or new instance of Browser will select from same tables instantly.
Adding skip-host-cache, skip-name-resolve to my.ini file solved this problem.
I don't know why is that. Why I tried this: MySQL 5.1 without those options was slowly establishing connections from other networks (e.g. server is 192.168.1.100, 192.168.1.101 connects fast, 192.168.2.100 connects slow), MySQL 5.6 didn't have such problem to start with so we didn't add these to my.ini initially.
UPD: Solved half the cases, actually. Setting wait_timeout to maximum integer fixed the other half. Maybe I even now can remove skip-host-cache, skip-name-resolve and it won't slow down in 100% of the cases
Our database architecture consists of two Sql Server 2005 servers each with an instance of the same database structure: one for all reads, and one for all writes. We use transactional replication to keep the read database up-to-date.
The two servers are very high-spec indeed (the write server has 32GB of RAM), and are connected via a fibre network.
When deciding upon this architecture we were led to believe that the latency for data to be replicated to the read server would be in the order of a few milliseconds (depending on load, obviously). In practice we are seeing latency of around 2-5 seconds in even the simplest of cases, which is unsatisfactory. By a simplest case, I mean updating a single value in a single row in a single table on the write db and seeing how long it takes to observe the new value in the read database.
What factors should we be looking at to achieve latency below 1 second? Is this even achievable?
Alternatively, is there a different mode of replication we should consider? What is the best practice for the locations of the data and log files?
Edit
Thanks to all for the advice and insight - I believe that the latency periods we are experiencing are normal; we were mis-led by our db hosting company as to what latency times to expect!
We're using the technique described near the bottom of this MSDN article (under the heading "scaling databases"), and we'd failed to deal properly with this warning:
The consequence of creating such specialized databases is latency: a write is now going to take time to be distributed to the reader databases. But if you can deal with the latency, the scaling potential is huge.
We're now looking at implementing a change to our caching mechanism that enforces reads from the write database when an item of data is considered to be "volatile".
No. It's highly unlikely you could achieve sub-1s latency times with SQL Server transactional replication even with fast hardware.
If you can get 1 - 5 seconds latency then you are doing well.
From here:
Using transactional replication, it is
possible for a Subscriber to be a few
seconds behind the Publisher. With a
latency of only a few seconds, the
Subscriber can easily be used as a
reporting server, offloading expensive
user queries and reporting from the
Publisher to the Subscriber.
In the following scenario (using the
Customer table shown later in this
section) the Subscriber was only four
seconds behind the Publisher. Even
more impressive, 60 percent of the
time it had a latency of two seconds
or less. The time is measured from
when the record was inserted or
updated at the Publisher until it was
actually written to the subscribing
database.
I would say it's definately possible.
I would look at:
Your network
Run ping commands between the two servers and see if there are any issues
If the servers are next to each other you should have < 1 ms.
Bottlenecks on the server
This could be network traffic (volume)
Like network cards not being configured for 1GB/sec
Anti-virus or other things
Do some analysis on some queries and see if you can identify indexes or locking which might be a problem
See if any of the selects on the read database might be blocking the writes.
Add with (nolock), and see if this makes a difference on one or two queries you're analyzing.
Essentially you have a complicated system which you have a problem with, you need to determine which component is the problem and fix it.
Transactional replication is probably best if the reports / selects you need to run need to be up to date. If they don't you could look at log shipping, although that would add some down time with each import.
For data/log files, make sure they're on seperate drives so the performance is maximized.
Something to remember about transaction replication is that a single update now requires several operations to happen for that change to occur.
First you update the source table.
Next the log readers sees the change and writes the change to the distribution database.
Next the distribution agent sees the new entry in the distribution database and reads that change, then runs the correct stored procedure on the subscriber to update the row.
If you monitor the statement run times on the two servers you'll probably see that they are running in just a few milliseconds. However it is the lag time while waiting for the log reader and distribution agent to see that they need to do something which is going to kill you.
If you truly need sub second processing time then you will want to look into writing your own processing engine to handle data moving from one server to another. I would recommend using SQL Service Broker to handle this as this way everything is native to SQL Server and no third party code has to be written.