How to determine buffer size - ruby-on-rails-3

Let me explain what information I need for this:
I have several concurrent users hitting the same record at once. This means there will be some queuing and locking going on at the DB end. How big is this buffer? Can the queuing and locking hold up for 200+ concurrent users?
How can I determine the size of this buffer in our setup? Is there a default setting?

There is no query queue ("buffer") in the database.
Each concurrent connection to the database can have one query in flight. Other queries cannot be queued up behind it*.
Since this is Rails, your application probably uses an internal connection pool, so you can have as many queries waiting as you have slots in the connection pool.
If you have an external connection pool like PgBouncer proxying between your app and PostgreSQL, you can have more queries queued. PgBouncer connections are lightweight, so the app can use a much larger pool size when connecting to PgBouncer, and PgBouncer will service those requests on a smaller number of real connections to PostgreSQL. Used this way, PgBouncer is effectively a query queue (though not necessarily a FIFO queue). However, because those queries don't actually hit Pg when they're issued, they don't take locks while waiting in PgBouncer. This could be important for some concurrency designs.
* OK, so you can send multiple semicolon separated queries at once, but not in series like a queue.
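To make the "the pool is the only queue" point concrete, here is a minimal sketch in Python rather than Rails. `TinyPool` and `simulate` are hypothetical names, the "connections" are plain integers, and the query is a short sleep. In Rails the corresponding knobs are, as far as I know, `pool` and `checkout_timeout` in database.yml.

```python
import queue
import threading
import time


class TinyPool:
    """Hypothetical fixed-size pool; the 'connections' are plain integers."""

    def __init__(self, size):
        self._free = queue.Queue()
        for conn_id in range(size):
            self._free.put(conn_id)

    def run(self, work, timeout):
        # Callers block here until a connection is free. This wait is the
        # only "queue": the database itself never sees the waiting callers.
        conn = self._free.get(timeout=timeout)  # raises queue.Empty on timeout
        try:
            return work(conn)
        finally:
            self._free.put(conn)


def simulate(pool_size, callers, query_seconds=0.01):
    """Run `callers` threads through the pool; return peak queries in flight."""
    pool = TinyPool(pool_size)
    lock = threading.Lock()
    in_flight = 0
    peak = 0

    def query(conn):
        nonlocal in_flight, peak
        with lock:
            in_flight += 1
            peak = max(peak, in_flight)
        time.sleep(query_seconds)  # stand-in for one query on one connection
        with lock:
            in_flight -= 1

    threads = [
        threading.Thread(target=pool.run, args=(query, 5.0))
        for _ in range(callers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak
```

With `simulate(3, 20)`, twenty callers share three connections, so no more than three queries are ever in flight and the rest wait inside the application. Whether 200+ users fit therefore depends on the pool size and on how long callers are allowed to wait for a connection, not on a database-side buffer.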

Related

Why is there more DB connection pool than main thread in Webflux?

I use Webflux R2DBC.
I know that the default configuration creates one thread per CPU core.
However, I also know that the default size of R2DBC's connection pool is 10.
I can't understand that.
DB connections operate asynchronously, so DB connections don't need a large pool.
Because a DB connection only sends and receives, it can do other work (other sending or receiving) while waiting.
Rather, since the main threads (1 per CPU core) perform various calculations, the number of main threads should be large.
What am I missing?
I think you kind of have it backwards.
"Since the main threads (1 per CPU core) perform various calculations, the number of main threads should be large."
A (logical) CPU can perform only one task at a time, and since all calls in a reactive application are non-blocking, threads should never just wait. So it doesn't make sense to have more threads than CPUs.
"DB connections operate asynchronously, so DB connections don't need a large pool."
Correct, you don't need a thread pool for your database connectivity. But a connection pool isn't a thread pool. It holds connections. Database connections are still expensive to create, so you want to reuse them. Database transactions are also bound to a database connection, so you need multiple connections in order to process different requests in separate transactions.
Of course how big a connection pool should be is a completely different and rather complex question.
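A rough way to see the difference between the two pools, sketched with Python's asyncio instead of WebFlux/R2DBC: a single event-loop thread serves every request, while each in-flight "transaction" holds its own connection. The names `handle_request`, `main` and `run_demo`, the string "connections" and the sleep standing in for a database round trip are all assumptions made for illustration.

```python
import asyncio
import threading


async def handle_request(pool, stats):
    # Wait for a free connection without blocking the thread.
    conn = await pool.get()
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    stats["threads"].add(threading.get_ident())
    try:
        # Stand-in for a non-blocking round trip inside one transaction.
        # The transaction is tied to `conn` for its whole duration.
        await asyncio.sleep(0.01)
    finally:
        stats["in_flight"] -= 1
        pool.put_nowait(conn)


async def main(pool_size, requests):
    pool = asyncio.Queue()
    for i in range(pool_size):
        pool.put_nowait(f"conn-{i}")  # placeholder connection objects
    stats = {"in_flight": 0, "peak": 0, "threads": set()}
    await asyncio.gather(*(handle_request(pool, stats) for _ in range(requests)))
    return stats


def run_demo(pool_size=10, requests=50):
    return asyncio.run(main(pool_size, requests))
```

In `run_demo()` the set of thread ids contains exactly one entry, yet up to ten transactions are open at the same time. The thread count follows the CPU count; the connection count follows how many transactions you want in progress concurrently.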

Service to accept SQL queries and run in the background

Is there a service to accept large numbers of SQL queries and run them in the background with retries and logging?
I have multiple clients running large numbers of queries directly against a SQL Server database. Because they’re only inserts, it would be far more efficient to post the queries to some service that can run them offline in transactions, freeing the clients from having to wait for the queries to finish and reducing the number of connections to the database.
Because the result isn’t needed by the application, I’d like to “fire and forget” the SQL statements knowing they’ll eventually complete, even if they need to retry due to timeouts or network issues.
Does such a service exist?
"Does such a service exist?"
There is no such service out of the box. As suggested by Gordon Linhoff, you can SEND the batches into a Service Broker Queue, or INSERT them into a regular table, and have a background process run them.
In the case of Service Broker, the setup, programming, and troubleshooting are a bit trickier, but you get Internal Activation to trigger a stored procedure you write when messages appear on the queue.
With a regular table you would just write a SQL Agent job (or similar) that runs in a loop and looks for new rows in the target table, runs the batches it finds, and deletes (or marks) the batches as complete. You don't get the low latency and automatic scale-out that Service Broker Activation provides, but it's much simpler to implement.

Persistent connections to 100K of devices

The server needs to push data to 100K clients, which cannot be reached directly because the machines are inside a private network. We are currently thinking of using RabbitMQ: each client subscribes to a separate queue, and when the server has data to push to a client, it publishes the data to the corresponding queue. Are there any issues with this approach? The number of clients may go up to 100K. Based on a spike, I expect about 20 GB of memory to be needed for maintaining the connections. We can still go ahead with this approach as long as memory does not grow beyond 30 GB.
The question is too generic.
I suggest reading this: RabbitMQ - How many queues RabbitMQ can handle on a single server?
Then you should consider using a cluster to scale the number of queues.

Low performance ActiveMQ

I am performance testing a piece of my code that works with ActiveMQ.
I use virtual topics there. When I send about 1000 concurrent requests to enqueue my messages, it takes ages to enqueue all of them, and sometimes it just hangs partway through and resumes after some time.
I am using the JDBC message store; I know some of the performance impact might be because of that.
Is this performance hit mainly due to virtual topics? I ask because the ActiveMQ website specifies very high performance for topics (under ideal conditions, of course).
P.S.: One message takes almost 13-15 milliseconds to be enqueued and dequeued, which is far higher than the performance ActiveMQ claims to have.
http://activemq.apache.org/performance.html
The performance hit is mainly because of the JDBC message store. Virtual Topics do not differ much in performance compared to durable subscriptions.
Please use LevelDB or KahaDB if you want performance. The JDBC store is mainly there for compatibility with setups that already use fail-over secured databases with backups etc. and want to use them for messages as well. You won't come even close to the numbers on the performance page with plain JDBC.

MSSQL multi-user access

I am experiencing problems with an MSSQL instance, where deadlocks occur from time to time. I have a table A, which holds temperature measurements. My application contains 1-10 worker threads, which collect measurements via TCP from remote locations and then store them in the database. Of course these workers use transactions to conduct their tasks. The IsolationLevel of the transactions is set to ReadCommitted. Still, deadlocks occur and the CPU load of the database server is up at 100%. Can anyone tell me what I have to consider to get this working? I thought the database system would do the multi-user synchronization for me. At least this is what I learned at university.
My suggestion is to create another thread that will handle your updates into the database. Have the worker threads add their measurements to a shared collection in a thread-safe manner, and let one worker thread do the updates/inserts into the table. You can even concatenate 10-30 of these statements and execute them together.
This is what we did on an SMS sender where we used up to 50 threads, each sending an SMS every 100 ms. It worked brilliantly for us.
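A minimal sketch of that single-writer idea, in Python with sqlite3 standing in for MSSQL so it can run as-is. The `measurements` table, the `STOP` sentinel and the function names are invented for the example, and the batch size of 20 is just a value inside the 10-30 range mentioned above.

```python
import os
import queue
import sqlite3
import tempfile
import threading

STOP = object()  # sentinel telling the writer to finish


def writer(db_path, inbox, batch_size=20):
    """The only thread that touches the table; writes in small batches."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS measurements (sensor TEXT, celsius REAL)")
    running = True
    while running:
        batch = [inbox.get()]  # block until at least one item arrives
        while len(batch) < batch_size:
            try:
                batch.append(inbox.get_nowait())
            except queue.Empty:
                break
        rows = [item for item in batch if item is not STOP]
        running = len(rows) == len(batch)  # False once STOP was seen
        if rows:
            with conn:  # one transaction per batch
                conn.executemany("INSERT INTO measurements VALUES (?, ?)", rows)
    conn.close()


def run_demo(workers=10, per_worker=30):
    """Start collectors plus one writer; return the number of stored rows."""
    inbox = queue.Queue()  # thread-safe hand-off between collectors and writer
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")
        writer_thread = threading.Thread(target=writer, args=(db_path, inbox))
        writer_thread.start()

        def collect(worker_id):
            # Stand-in for reading measurements from a TCP source.
            for i in range(per_worker):
                inbox.put((f"sensor-{worker_id}", 20.0 + i))

        collectors = [threading.Thread(target=collect, args=(n,)) for n in range(workers)]
        for t in collectors:
            t.start()
        for t in collectors:
            t.join()
        inbox.put(STOP)  # sent only after every collector has finished
        writer_thread.join()

        conn = sqlite3.connect(db_path)
        try:
            return conn.execute("SELECT COUNT(*) FROM measurements").fetchone()[0]
        finally:
            conn.close()
```

Because only one thread ever holds a transaction on the table, the collectors cannot deadlock against each other in the database; they only contend on the in-memory queue.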