Pooling, Client Checked out, idleTimeoutMillis - node-postgres

This is my understanding after reading the documentation:
Pooling: as with many other databases, only a limited number of connections is allowed, so everyone lines up and waits for a free connection to be returned to the pool (a connection is like a token, in a sense).
At any given time, the number of active and/or available connections is kept in the range 0 to max.
idleTimeoutMillis is described as "milliseconds a client must sit idle in the pool and not be checked out before it is disconnected from the backend and discarded." I'm not clear on this. My general assumption was: when a client (say, a web app) has done its CRUD but hasn't voluntarily returned the connection, it is considered idle; node-postgres starts the clock, and once it reaches the given number of milliseconds it takes the connection back into the pool for the next client. So what does "not be checked out before it is disconnected from the backend and discarded" mean?
Say idleTimeoutMillis: 100 - does it mean the connection will literally be disconnected (logged out) after sitting idle for 100 milliseconds? If so, the connection is not returned to the pool, and that would mean frequent reconnects, which the docs warn about below:
Connecting a new client to the PostgreSQL server requires a handshake
which can take 20-30 milliseconds. During this time passwords are
negotiated, SSL may be established, and configuration information is
shared with the client & server. Incurring this cost every time we
want to execute a query would substantially slow down our application.
Thanks in advance for the stupid questions.

Sorry this question went unanswered for so long, but I recently came across a bug that challenged my understanding of this library too.
Essentially, when you're pooling, you're telling the library it may have at most X connections to the database open simultaneously. When a request comes into a CRUD API, for example, it 'checks out' a connection from the pool, so at most X requests can hold connections at the same time. While a request holds a connection, no other request can use that connection; it is blocked until released.
So to 'reuse' a connection, when one request is done with it you have to release it, marking it ready for use again. When another request comes in, it can check out that connection and run its query, as sketched below.
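A minimal sketch of that checkout/release cycle with node-postgres (the users table, the getUser function, and the DATABASE_URL environment variable are illustrative assumptions, not from the original question):

import { Pool } from 'pg';

// max caps how many connections the pool will open simultaneously
const pool = new Pool({ connectionString: process.env.DATABASE_URL, max: 10 });

async function getUser(id: number) {
  const client = await pool.connect(); // checks out a connection; waits if all 10 are busy
  try {
    const res = await client.query('SELECT * FROM users WHERE id = $1', [id]);
    return res.rows[0];
  } finally {
    client.release(); // returns the connection to the pool for the next request
  }
}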
idleTimeoutMillis took me a while to get my head around. When an open connection has been released back to the pool, it is in an IDLE state: anyone wanting to make a request can check it out, because it is not currently being used. This variable says how long we wait, while a connection sits IDLE, before closing it. This serves various purposes: open DB connections consume memory and other resources on both client and server, so closing unused ones can be beneficial. It also matters when autoscaling: say you have been at peak requests per second and were using all your DB connections; then it is useful to keep IDLE connections open for a while in case the load returns. But if the timeout is too long and you scale down, each lingering IDLE connection keeps holding memory.
The benefit of keeping connections open is that sending a query over an existing connection requires no re-authentication with the DB; it is already authenticated and ready to go.
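And a sketch of how idleTimeoutMillis fits in (the values here are arbitrary choices for illustration; node-postgres defaults idleTimeoutMillis to 10 seconds):

import { Pool } from 'pg';

const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 10,
  // once a released connection has sat idle this long, the pool closes it;
  // the next checkout after that pays the 20-30 ms handshake again
  idleTimeoutMillis: 30000,
});

// pool.query() checks a client out, runs the query, and releases it automatically
const { rows } = await pool.query('SELECT now()');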

Related

Does UniData only allow one command or query per connection at a time?

Does UniData (Rocket U2) only allow one query or command to run at a time per connection?
I know each connection has a process or two (udapi_server/slave/etc. I believe), and we pool connections via UniObjects connection pooling. I know there have been optimizations to the UniRPC service in recent releases allowing threaded connection acceptance, but my suspicion is that even with that only one query is executed at a time on each connection synchronously.
i.e. if you have a maximum of 10 pooled connections and 10 long-running queries, nothing else even starts until one process completes its query - even if they are all I/O bound and the CPU is idle.
In connection pooling the process using the connection needs to finish prior to another request being accepted and using the same CP process. Connection pooling is ideal for small requests. Large queries returning the entire result set at once will tie up the CP process, and cause other requests for a connection to queue up.
Ian provided one possible solution, and I expect there are many ways you could break up the request into smaller requests that would not tie up the connection pooling licenses for prolonged periods of time.

Performance difference between global database connection and opening connection every time on Golang

In my current project I was opening a new database connection every time a user makes a request. For example:
func login(w http.ResponseWriter, r *http.Request) {
    ...
    db, err := sqlx.Connect("postgres", "user=postgres password=*** dbname=postgres")
    if err != nil {
        ErrorWithJSON(w, err.Error(), http.StatusBadRequest)
        return
    }
    db.SetMaxIdleConns(0)
    db.SetConnMaxLifetime(time.Second * 30)
    user, err := loginManager(db, m)
    ...
    err = db.Close()
}
When I looked at other people's code, I saw that most developers create a global variable for the database connection, set it in main, and use that variable across the entire project.
I was wondering: is there any difference between these approaches? If I use a global variable, will there be any latency when 5 different users make register/login requests at once? If so, should I create multiple database connections, store them in a slice, and pick one at random when a user makes a request, like a simple load balancer? I don't know.
Sorry for multiple questions. Thank you!
Yes, there might be a huge performance difference (possibly several orders of magnitude, depending on the nature of the queries you run and on system and server configuration).
The sqlx.DB type wraps (embeds) an sql.DB type, which manages a pool of connections:
DB is a database handle representing a pool of zero or more underlying connections. It's safe for concurrent use by multiple goroutines.
The sql package creates and frees connections automatically; it also maintains a free pool of idle connections. If the database has a concept of per-connection state, such state can only be reliably observed within a transaction.
Every time you open a new connection, a lot of things have to happen in the "background": the connection string has to be parsed, a TCP connection has to be established, authentication / authorization must be performed, resources must be allocated on both sides (client and server), etc. These are just the main, obvious things. Even though some of these steps may be optimized or cached, there is still significant overhead compared to having a single DB instance with multiple established, authenticated connections ready in a pool, waiting to be used.
Also quoting from sql.Open():
The returned DB is safe for concurrent use by multiple goroutines and maintains its own pool of idle connections. Thus, the Open function should be called just once. It is rarely necessary to close a DB.
sqlx.Connect() which you used calls sqlx.Open() which is "the same as sql.Open, but returns an *sqlx.DB instead".
So, all in all: use a single, global sqlx.DB or sql.DB instance, and share / use that everywhere. It gives you automatic connection and connection-pool management, and it will give you the best performance. You may fine-tune the pool with the DB.SetConnMaxLifetime(), DB.SetMaxIdleConns() and DB.SetMaxOpenConns() methods.
Idle connections (DB.SetMaxIdleConns()) are those not currently in use, but sitting in the pool waiting for someone to pick them up. You should definitely have some of these, e.g. 5 or 10 of them, or even more. DB.SetConnMaxLifetime() controls how long a connection may be used; once it grows older than this, it will be closed (and a new one opened if needed). You shouldn't change this: the default behavior is to never expire connections. Basically all the defaults are sensible; only play with them if you experience performance problems. Also, read the docs of these methods to get a clear picture.
See this similar, possible duplicate question:
mgo - query performance seems consistently slow (500-650ms)

When to open and close the connection to DB

I'm coding in Java EE and I have a class that manages all the interactions with my DB.
I was asking myself when I should open/close the connection to the DB.
Is it better to open and close it in each method?
Or is it better to open it in the constructor and close it when I've finished using my class?
Thx
There is no generic solution; every decision depends on the concrete task. You should keep the following in mind:
every connection to the database costs application time. If your method is called too often, your application will waste a lot of time on connect and disconnect tasks, and much more so over a slow network. If it is called rarely, this matters far less;
if your method connects to the database in the constructor and then sits for a long time without any operations, the connection may be dropped. This is not guaranteed to happen, but can be caused by network issues or database connection policies. So before every query, the connection should be checked with a fast, simple operation like select 'some random text' from dual;
database resources are not infinite, and the total number of connections is limited. The limit can be very large, but it still exists. So if your application can run in parallel many (hundreds, thousands of) times, permanent connections may hit that limit.
If you have no information about the future usage of the method, I advise a time-limited permanent connection: open it with the first query, and close it on a timer if the method has run no queries through the connection for some time, say 3-5 seconds. Of course, every query should check the connection status first, reopen the connection if it is closed, and touch the closing timer. And don't forget to explicitly close the connection in the destructor. A sketch of this pattern follows below.
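A minimal sketch of that time-limited connection idea (illustrated in TypeScript with the node-postgres client for brevity; the 5-second idle window, the function names, and DATABASE_URL are all assumptions):

import { Client } from 'pg';

let client: Client | null = null;
let idleTimer: NodeJS.Timeout | null = null;

const IDLE_MS = 5000; // assumed window: close the connection after ~5 s without queries

async function getConnection(): Promise<Client> {
  if (client === null) {
    // open lazily, on the first query
    client = new Client({ connectionString: process.env.DATABASE_URL });
    await client.connect();
  }
  touchIdleTimer(); // every query resets the closing timer
  return client;
}

function touchIdleTimer(): void {
  if (idleTimer !== null) clearTimeout(idleTimer);
  idleTimer = setTimeout(() => {
    // no queries for IDLE_MS: close now and let the next query reopen
    void client?.end();
    client = null;
  }, IDLE_MS);
}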
I would argue that you shouldn't be managing this in code. Every EE server I've worked on has connection pooling to remove the need for you to care about it. Basically, you "open" the connection when you need it and "close" it when you're done. Those words are in quotes because it is up to the pool to manage when a connection is truly opened and closed.
From a design perspective, then, use the connection only when you need it. Holding it from object construction doesn't make sense: what if a method in the class doesn't get called for an hour? What is the purpose of having the connection open when you don't need it? So if a method needs a connection, open and close it in that method.

If you have two distinct Data Source Connections in ColdFusion with the same settings do they share the same pool?

I have created 2 distinct data source connections (to MS SQL Server 2008) in the ColdFusion Administrator that have exactly the same settings except for the actual name of the connection. My question is: will this create two distinct connection pools, or will they share one?
They will have different pools. The pools are defined at the data source level and you have two distinct data sources as far as ColdFusion is concerned. Why would you have two different data sources with the exact same settings? I guess if you wanted to force them to use different connection pools. I can't think of a reason why though.
I found this page that documents how database connections are handled in ColdFusion. Note that the "Maintain Database Connections" setting is set for each data source.
Here is the section related to connection pooling from that page (in case it goes away):
If the "Maintain Database Connections" is set for a data source, how does ColdFusion Server maintain the connection pool?
When "Maintain Database Connections" is set for a data source, ColdFusion keeps the connection open after its first connection to the database. It does not log out of the database after this first connection. You can change this setting according to the instructions in step d above. Another setting in the ColdFusion Administrator, called "Limit cached database connection inactive time to X minutes," closes a "maintained" database connection after X inactive minutes. This setting is server wide and determines when a "maintained" connection is finally closed. You can modify this setting by going to the "Caching" tab within the ColdFusion Administrator. The interface for modifying the "Limit cached database connection inactive time to X minutes" looks like the following:
If a request is using a data source connection that is already opened, and another request to the data source comes in, a new connection is established. Since only one request can use a connection at any time, the simultaneous request will open up a new connection because no idle cached connections are available. The connection pool can increase up to the setting for simultaneous connections limit which is set for each data source. This setting, called, "Limit Connections," is in the ColdFusion Administrator. Click on one of the data source tabs and then click on one of your data sources. Click on "CF Settings" and put a check next to "Limit Connections" and enter a number in the sentence, "Enable the limit of X simultaneous connections." Please note that if you do not set this under the data source setting, ColdFusion Server will use the server wide "Simultaneous Requests" setting.
At this point, there is a pool of two database connections that ColdFusion Server maintains. Each connection remains in the pool until either the "Connection Timeout" period is reached or exceeds the inactivity time. If neither of the first two options are implemented, the connections remain in the pool until ColdFusion is restarted.
The "Connection Timeout" setting closes the connection and eliminates it from the pool whether or not it has been active or inactive. If the process is active, it will not terminate the connection. You can change this setting by going to "CF Settings" for your data source in the ColdFusion Administrator. Note: Only the "Cached database connection inactive time" setting will end the connection and eliminate it from the pool if it hasn't been used. You can also use the "Connection Timeout" to override the"Cached database connection inactive" setting as it applies only to a single data source, not all data sources.
They have different pools. Pooling is implemented by ColdFusion's Java code (or was that part in the JRun code?). It doesn't use any JDBC-based pooling. CF10 could have switched to JDBC-based pooling, but I doubt it.
As a test:
Set the 'verify connection' SQL to waitfor delay '00:01:00' or similar (wait for one minute) on both data sources. Pool access is single-threaded for each pool, including the time taken to run the verify. Make two pages, each accessing a different data source, and request both. If both complete after 1 minute, there are two pools; if one page takes 1 minute and the other takes 2 minutes, it's one pool.
As a side note: if during this 1-minute verify you yank out the network cable (causing the JDBC socket to stay open forever waiting for a response), your thread pool is now dead and you need to restart CF.
Try to create a temporary table via the two different data sources: if you get an error on the second query, they use the same pool; if it runs perfectly fine, they use different pools.

How long should you keep an ODBC SQL connection open?

A long-running application uses the MS C ODBC API to create and use SQL connections to an Oracle DB. The application was originally designed to establish an ODBC connection at startup and keep that connection indefinitely as the application runs, potentially for weeks or months.
We're seeing very infrequent cases where a connection suddenly dies and I wondered if that's because we're using them wrong, or if it's considered OK to hold a connection like this. Can anyone point me to some definitive information on the subject?
I'm not sure there is definitive information regarding this, but with long-running programs you always have to be prepared for this kind of incident; they just happen (and not only with DB connections but also with sockets that remain open for extended periods). I have no experience with Oracle, but I have a very similar setup with Informix, and this (in pseudocode) is what we do:
while (program_is_supposed_to_run) {
    open_db();
    do {
        your_activities();
    } while (db_is_ok);
    close_db_and_cleanup();
}
As long as you are able to correctly detect that the connection died and are able to resume processing without losing data you should be OK.
I'm not familiar with ODBC, but such use cases are best handled with a connection pool. Your application can simply request a connection from the pool when it has work to do and release it as soon as it's done; the pool takes care of actually (re)connecting to the database.
A quick search for ODBC connection pooling brought this up: Driver Manager Connection Pooling