Does UniData only allow one command or query per connection at a time? - unidata

Does UniData (Rocket U2) only allow one query or command to run at a time per connection?
I know each connection has a process or two (udapi_server/slave/etc. I believe), and we pool connections via UniObjects connection pooling. I know there have been optimizations to the UniRPC service in recent releases allowing threaded connection acceptance, but my suspicion is that even with that only one query is executed at a time on each connection synchronously.
I.e., if you have a maximum of 10 pooled connections and 10 long-running queries, nothing else even starts until one process completes its query - even if they are all I/O-bound and CPU-idle.

In connection pooling the process using the connection needs to finish prior to another request being accepted and using the same CP process. Connection pooling is ideal for small requests. Large queries returning the entire result set at once will tie up the CP process, and cause other requests for a connection to queue up.
Ian provided one possible solution, and I expect there are many ways you could break up the request to smaller requests that would not tie up the connection pooling licenses for prolonged periods of time.
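That one-request-per-connection serialization is easy to see with a generic simulation; nothing here is UniData-specific, it's just a bounded pool modeled as a semaphore (a sketch, with the pool size and request names made up for illustration):

```python
import threading
import time

POOL_SIZE = 2  # stand-in for the number of pooled CP processes
pool = threading.Semaphore(POOL_SIZE)
completed = []

def run_request(name, seconds):
    # A request holds a pooled connection for its full duration;
    # no other request can use that connection meanwhile.
    with pool:
        time.sleep(seconds)
        completed.append(name)

# Two long "queries" occupy the whole pool first.
long_threads = [
    threading.Thread(target=run_request, args=(f"long-{i}", 0.2))
    for i in (1, 2)
]
for t in long_threads:
    t.start()
time.sleep(0.05)  # by now both pool slots are held by the long queries

# A cheap request still has to wait for a slot to free up.
short = threading.Thread(target=run_request, args=("short", 0.01))
short.start()
for t in long_threads + [short]:
    t.join()

print(completed)  # the short request finishes last despite being cheap
```

Even though the short request is I/O-trivial, it queues behind the long ones - which is exactly why breaking large requests into smaller ones keeps the pool responsive.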

Related

Why is there more DB connection pool than main thread in Webflux?

I use Webflux R2DBC.
I know, the default value creates 1 thread per CPU core.
However, I know that the default value of R2DBC's connection pool is 10.
I can't understand that...
DB connections operate asynchronously, so DB connections shouldn't need a big thread pool.
Because the DB connection only sends and receives, it can do other work (other sending or receiving) while waiting.
Rather, since the main threads (1 per CPU core) perform various calculations, the number of main threads should be large.
What am I missing?
I think you kind of have it backwards.
since the main threads (1 per CPU core) perform various calculations, the number of main threads should be large.
A (logical) CPU can perform only one task at a time and since threads in a reactive application should never just wait, since all calls are non-blocking, it doesn't make sense to have more threads than CPU's.
DB connections operate asynchronously, so DB connections shouldn't need a big thread pool.
Correct, you don't need a thread pool for your database connectivity. But a connection pool isn't a thread pool: it holds connections. Database connections are still expensive to create, so you want to reuse them, and database transactions are bound to a database connection, so you need multiple connections in order to process different requests in separate transactions.
Of course how big a connection pool should be is a completely different and rather complex question.
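The connections-are-not-threads point can be sketched in plain Python: a single event-loop thread can keep several pooled "connections" busy at once, because each in-flight transaction occupies a connection, not a thread. (This is a toy asyncio queue, not R2DBC; the names and sizes are illustrative.)

```python
import asyncio

POOL_SIZE = 4  # more connections than threads: this whole demo runs on 1 thread

async def main():
    pool = asyncio.Queue()
    for i in range(POOL_SIZE):
        pool.put_nowait(f"conn-{i}")   # toy stand-ins for real DB connections

    results = []

    async def handle_request(req_id):
        conn = await pool.get()        # check a connection out of the pool
        try:
            await asyncio.sleep(0.01)  # simulated non-blocking DB round trip
            results.append((req_id, conn))
        finally:
            pool.put_nowait(conn)      # return it for the next request

    # 8 concurrent "transactions" share 4 connections on a single thread.
    await asyncio.gather(*(handle_request(i) for i in range(8)))
    return results

results = asyncio.run(main())
print(len(results))  # all 8 requests served
```

One thread, four connections, eight concurrent requests - which is why the pool size and the thread count are tuned independently.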

Pooling, Client Checked out, idleTimeoutMillis

This is my understanding after reading the Documents:
Pooling: like many other DBs, we have only a limited number of allowed connections, so everyone lines up and waits for a free connection to be returned to the pool. (A connection is like a token, in a sense.)
At any given time, the number of active and/or available connections is kept in the range 0-max.
idleTimeoutMillis is described as "milliseconds a client must sit idle in the pool and not be checked out before it is disconnected from the backend and discarded." I'm not clear on this. My general understanding: when a client (say a web app) has done its CRUD but not returned the connection voluntarily, it is considered idle. node-postgres starts the clock, and once the configured milliseconds are reached it takes the connection back into the pool for the next client. So what does "not be checked out before it is disconnected from the backend and discarded" mean?
Say idleTimeoutMillis: 100 - does it mean this connection will literally be disconnected (logged out) after being idle for 100 milliseconds? If so, it's not returning to the pool, and that will result in frequent reconnections, as the doc says below:
Connecting a new client to the PostgreSQL server requires a handshake
which can take 20-30 milliseconds. During this time passwords are
negotiated, SSL may be established, and configuration information is
shared with the client & server. Incurring this cost every time we
want to execute a query would substantially slow down our application.
Thanks in advance for the stupid questions.
Sorry this question was not answered for so long, but I recently came across a bug that made me question my understanding of this library too.
Essentially, when you're pooling you're telling the library it can have a maximum of X connections to the database open simultaneously. Every request that comes into a CRUD API, for example, takes a connection, so you can have at most X requests in flight at once. As soon as a request comes in, it "checks out" a connection from the pool, which means another request cannot use that connection while it is held.
So in order to "reuse" the same connection, when one request is done with it you have to release it (check it back in), marking it ready for use again. When another request comes in, it can then check out that connection and run its query.
idleTimeoutMillis took me a while to get my head around. When an open connection has been released back to the pool, it is in an IDLE state: anyone wanting to make a request can check it out, as it is not being used. This variable says how long we wait before closing a connection that has been sitting in that IDLE state. This is useful for various reasons. Open DB connections consume memory and other resources, so closing them can be beneficial. It also matters when autoscaling: if you've been at peak requests/second and were using all your DB connections, it's useful to keep IDLE connections open for a bit in case load returns. But if the timeout is too long and you scale down, each lingering IDLE connection keeps holding memory.
The benefit of all this is that when you send a query over an already-open connection, you don't need to re-authenticate with the DB; it's authenticated and ready to go.
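A rough model of what an idle timeout does (the names here are illustrative, not the node-postgres internals): each connection sitting unused in the pool records when it became idle, and a reaper discards any that have idled past the limit.

```python
import time

IDLE_TIMEOUT = 0.1  # seconds; plays the role of idleTimeoutMillis

class ToyPool:
    def __init__(self):
        self.idle = []  # list of (connection, time it became idle)

    def release(self, conn):
        # Request is done: connection returns to the pool and starts idling.
        self.idle.append((conn, time.monotonic()))

    def reap(self):
        # Disconnect and discard connections idle longer than the timeout.
        now = time.monotonic()
        closed = [c for c, since in self.idle if now - since > IDLE_TIMEOUT]
        self.idle = [(c, s) for c, s in self.idle if now - s <= IDLE_TIMEOUT]
        return closed

pool = ToyPool()
pool.release("conn-A")
time.sleep(0.15)          # conn-A idles past the timeout
pool.release("conn-B")    # conn-B only just became idle
closed = pool.reap()
print(closed)             # only the long-idle connection is discarded
```

So yes - a connection idle past the timeout really is closed, and the next burst of traffic pays the handshake cost again; that's the trade-off the variable tunes.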

Service to accept SQL queries and run in the background

Is there a service to accept large numbers of SQL queries and run them in the background with retries and logging?
I have multiple clients running large numbers of queries directly against a SQL Server database but because they’re only inserts it would be far more efficient to post the queries to some service which can run them offline in transactions freeing the clients from having to wait for the queries to finish and reducing the connections to the database.
Because the result isn’t needed by the application, I’d like to “fire and forget” the SQL statements knowing they’ll eventually complete, even if they need to retry due to timeouts or network issues.
Does such a service exist?
Does such a service exist?
There is no such service out of the box. As suggested by Gordon Linhoff, you can SEND the batches to a Service Broker queue, or INSERT them into a regular table, and have a background process run them.
In the case of Service Broker, the setup, programming, and troubleshooting are a bit trickier, but you get Internal Activation to trigger a stored procedure you write whenever messages appear on the queue.
With a regular table you would just write a SQL Agent job (or similar) that runs in a loop and looks for new rows in the target table, runs the batches it finds, and deletes (or marks) the batches as complete. You don't get the low latency and automatic scale-out that Service Broker Activation provides, but it's much simpler to implement.
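The "regular table plus polling job" pattern is easy to sketch. This uses SQLite (Python stdlib) purely for illustration - in the real setup the worker would be a SQL Agent job and the statements T-SQL - but the shape is the same: clients INSERT batches and forget them, and a background loop runs them and marks them complete.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE work_queue (
    id INTEGER PRIMARY KEY,
    sql_batch TEXT NOT NULL,
    done INTEGER NOT NULL DEFAULT 0)""")
db.execute("CREATE TABLE target (val INTEGER)")

def enqueue(batch):
    # Clients "fire and forget": they only enqueue, never wait for execution.
    db.execute("INSERT INTO work_queue (sql_batch) VALUES (?)", (batch,))
    db.commit()

def drain():
    # Background worker: run pending batches in order, then mark them done.
    rows = db.execute(
        "SELECT id, sql_batch FROM work_queue WHERE done = 0 ORDER BY id"
    ).fetchall()
    for row_id, batch in rows:
        db.execute(batch)
        db.execute("UPDATE work_queue SET done = 1 WHERE id = ?", (row_id,))
    db.commit()

enqueue("INSERT INTO target (val) VALUES (1)")
enqueue("INSERT INTO target (val) VALUES (2)")
drain()
print(db.execute("SELECT COUNT(*) FROM target").fetchone()[0])  # 2
```

Marking rows done (rather than deleting) also gives you a free audit log, and a retry is just leaving done = 0 when a batch fails.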

How to determine buffer size

Let me explain what information I need for this:
I have several concurrent users hitting the same record at once. This means, there will be some queuing and locking going on, on the db end. How big is this buffer? Can queue and locking hold for 200+ concurrent users?
How can I determine the size of this buffer in our setup? Is there a default setting?
There is no query queue ("buffer") in the database.
Each concurrent connection to the database can have one query in flight. Other queries cannot be queued up behind it*.
Your application probably uses an internal connection pool, being Rails, so you can have however many queries waiting as you have slots in the connection pool.
If you have an external connection pool like PgBouncer proxying between your app and PostgreSQL then you can have more queries queued because you can have a much larger pool size in the app when connecting to pgbouncer as pgbouncer connections are so lightweight. PgBouncer will service those requests on a smaller number of real connections to PostgreSQL. That effectively makes PgBouncer a query queue (though not necessarily a FIFO queue) when used this way. HOWEVER because those queries don't actually hit Pg when they're issued they don't take locks while waiting in PgBouncer. This could be important for some concurrency designs.
* OK, so you can send multiple semicolon separated queries at once, but not in series like a queue.
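The PgBouncer arrangement described above can be sketched generically: many cheap client-side slots multiplexed onto a few "real" server connections, with waiting queries held at the proxy (a semaphore here stands in for the real connection pool; sizes are made up).

```python
import asyncio

REAL_CONNECTIONS = 3   # small pool of actual server connections
CLIENT_SLOTS = 12      # many cheap proxy-side client slots, PgBouncer-style

async def main():
    server = asyncio.Semaphore(REAL_CONNECTIONS)
    in_flight = 0
    peak = 0

    async def query(i):
        nonlocal in_flight, peak
        # The client connected to the "proxy" instantly; the query itself
        # waits here until a real server connection frees up - holding no
        # server-side locks while it waits.
        async with server:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)   # pretend server-side work
            in_flight -= 1

    await asyncio.gather(*(query(i) for i in range(CLIENT_SLOTS)))
    return peak

peak = asyncio.run(main())
print(peak)  # concurrency never exceeds the real connection count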

how to deal with read() timeout in Redis client?

Assume my client sends an INCR command to the Redis server, but the response packet is lost, so my client's read() times out. The client is not able to tell whether the INCR operation has been performed by the server.
What should it do next: resend the INCR, or continue with the next command? If the client resends the INCR, but Redis had already carried out the INCR on the server side, the key will be incremented twice, which is not what we want.
This is not a problem specific to Redis: it applies to any other data store (including transactional ones). There is no solution to this problem: you can only hope to minimize the issue.
For instance, some people tend to put very aggressive values for their timeout thinking that Redis is supposed to be a soft real-time data store. Redis is fast, but you also need to consider the network, and the system itself. Network related problems may generate high latencies. If the system starts swapping, it will very seriously impact Redis response times.
I tend to think that putting a timeout under 2 seconds is nonsense on any Unix/Linux system, and if a network is involved, I am much more comfortable with 10 seconds. People set very low values because they want to prevent their application from blocking: it is a mistake. Rather than setting very low timeouts and keeping the application synchronous, they should design the application to be asynchronous and set sensible timeouts.
After a timeout, a client should never "continue" with the next command. It should close the connection, and try to open a new one. If a reply (or a query) has been lost, it is unlikely that the client and the server can resynchronize. It is safer to close the connection.
Should you try to issue the INCR again after the reconnection? It is really up to you. But if a read timeout has just been triggered, there is a good chance the reconnection will time out as well. Redis being single-threaded, when it is slow for one connection, it is slow for all connections simultaneously.
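The close-and-reconnect-on-timeout advice, plus only auto-retrying commands that are safe to repeat, can be sketched with a toy client (no real Redis involved; the class below is a stand-in just to show the control flow):

```python
# Toy stand-in for a Redis connection: no network, just the control flow.
class ToyConnection:
    def __init__(self):
        self.closed = False

    def send(self, command):
        if command == "SLOW":
            raise TimeoutError("read timed out")
        return "OK"

    def close(self):
        self.closed = True

# INCR is deliberately absent: retrying it after a timeout can double-apply.
IDEMPOTENT = {"GET", "SET"}

def execute(conn_factory, conn, command, retries=1):
    try:
        return conn, conn.send(command)
    except TimeoutError:
        conn.close()            # never keep using a desynchronized connection
        conn = conn_factory()   # reconnect on a fresh connection
        if command in IDEMPOTENT and retries > 0:
            return execute(conn_factory, conn, command, retries - 1)
        # Non-idempotent (e.g. INCR): surface the uncertainty to the caller.
        return conn, None

conn1, r1 = execute(ToyConnection, ToyConnection(), "GET")
print(r1)  # "OK"
conn2, r2 = execute(ToyConnection, ToyConnection(), "SLOW")
print(r2)  # None: timed out, and not safe to blindly retry
```

For the INCR case specifically, the caller has to decide: either tolerate an occasional double increment, or restructure the operation to be idempotent (for example, SET with a computed value guarded by a version check).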