JedisConnectionException: Read timed out intermittently - Redis

My application is running on an ECS cluster, and Redis runs as part of a Docker container on ECS.
The application runs fine for a week or more, but then it suddenly starts throwing a timeout exception.
The issue is reported in the query block below:
api.query("MATCH (ag:dGrp{v:" + rec.DocGroupId + "}),(pg:resUGrp{v:" + rec.userGroupUID + "}) CREATE (pg)-[:dgE{ppv:" + ppv + ","+IdentifierFlag+":1}]->(ag)");
Full stack trace:
redis.clients.jedis.exceptions.JedisConnectionException: java.net.SocketTimeoutException: Read timed out
at redis.clients.jedis.util.RedisInputStream.ensureFill(RedisInputStream.java:205)
at redis.clients.jedis.util.RedisInputStream.readByte(RedisInputStream.java:43)
at redis.clients.jedis.Protocol.process(Protocol.java:155)
at redis.clients.jedis.Protocol.read(Protocol.java:220)
at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:283)
at redis.clients.jedis.Connection.getOne(Connection.java:261)
at redis.clients.jedis.Jedis.sendCommand(Jedis.java:4119)
at com.redislabs.redisgraph.impl.api.ContextedRedisGraph.sendQuery(ContextedRedisGraph.java:52)
at com.redislabs.redisgraph.impl.api.RedisGraph.sendQuery(RedisGraph.java:68)
at com.redislabs.redisgraph.impl.api.AbstractRedisGraph.query(AbstractRedisGraph.java:46)
and this
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.net.SocketInputStream.read(SocketInputStream.java:127)
at redis.clients.jedis.util.RedisInputStream.ensureFill(RedisInputStream.java:199)
When we restart our ECS task the problem disappears, then it comes back after about a week.
We have increased the maximum number of connections to 160.
We also tried to reproduce this issue in a lower environment by sending heavy, bulk requests for a week, but we were not able to reproduce it.
We are using the Jedis Redis client, version 3.5.1:
jedis redis-client.jar (version 3.5.1)
Redis timeout configuration set in the redis.conf file:
timeout 0
bind 127.0.0.1
tcp-backlog 511
tcp-keepalive 300
lua-time-limit 5000
loadmodule /usr/lib64/redis/modules/redisgraph.so
Socket timeout in my code:
socket.setSoTimeout(60000); // keep connection for max 60s when idle
This timeout is set via java.net.Socket.
RedisGraph Query
At the RedisGraph query level we use this method, which does not take a timeout parameter, so I assume it uses the default:
/**
 * Execute a Cypher query.
 * @param graphId a graph to perform the query on
 * @param query Cypher query
 * @return a result set
 */
public ResultSet query(String graphId, String query) {
    return sendQuery(graphId, query);
}
But it also has a method that takes a timeout parameter:
/**
 * Execute a Cypher query with timeout.
 * @param graphId a graph to perform the query on
 * @param timeout
 * @param query Cypher query
 * @return a result set
 */
@Override
public ResultSet query(String graphId, String query, long timeout) {
    return sendQuery(graphId, query, timeout);
}

I see you are doing a graph query, which can take more time than general Redis commands, so you can try increasing the socket timeout.
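For example, a minimal sketch using redis.clients.jedis.JedisPool and the com.redislabs.redisgraph classes from your stack trace (the pool-based RedisGraph constructor, host, port, graph name, and the concrete timeout values are assumptions; the three-argument query overload is the one quoted above):
JedisPoolConfig poolConfig = new JedisPoolConfig();
poolConfig.setMaxTotal(160);
// The 4th argument is the socket (read) timeout in milliseconds; keep it above the
// slowest RedisGraph query you expect. Host and port are placeholders.
JedisPool pool = new JedisPool(poolConfig, "localhost", 6379, 120000);
// Assumes your JRedisGraph version exposes a Pool-based constructor.
RedisGraph api = new RedisGraph(pool);
// Alternatively, pass a per-query timeout (in milliseconds) via the overload shown above.
ResultSet rs = api.query("myGraph", "MATCH (n) RETURN count(n)", 60000);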

Related

Database timeout in Azure SQL

We have a .Net Core API accessing Azure SQL (Gen5, 4 vCores)
For quite some time,
the API keeps throwing the below exception for a specific READ operation:
Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout
Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
The READ operation has code to read rows of data and convert an XML column into a specific output format.
Most of the read operations extract hardly 4-5 rows at a time.
The tables involved in the query have ~500,000 rows.
We are clueless about the root cause of this issue.
Any hints on where to start looking for the root cause?
Any pointer would be highly appreciated.
NOTE : Connection string has following settings, apart from others
MultipleActiveResultSets=True;Connection Timeout=60
Overall code looks something like this.
HINT: The above timeout exception comes at ConvertHistory, when the 2nd table is being read.
[HttpGet]
public async Task<IEnumerable<SalesOrder>> GetNewSalesOrders()
{
    var SalesOrders = await _db.SalesOrders.Where(o => o.IsImported == false).OrderBy(o => o.ID).ToListAsync();
    var orders = new List<SalesOrder>();
    foreach (var so in SalesOrders)
    {
        var order = ConvertSalesOrder(so);
        orders.Add(order);
    }
    return orders;
}
private SalesOrder ConvertSalesOrder(SalesOrder o)
{
    var newOrder = new SalesOrder();
    var oXml = o.XMLContent.LoadFromXMLString<SalesOrder>();
    ...
    newOrder.BusinessUnit = oXml.BusinessUnit;
    var history = ConvertHistory(o.ID);
    newOrder.history = history;
    return newOrder;
}
private SalesOrderHistory[] ConvertHistory(string id)
{
    var history = _db.OrderHistory.Where(o => o.ID == id);
    ...
}
Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
From the Microsoft documentation:
You will get this error in two cases: a connection timeout, or a query/command timeout. First identify which one it is from the call stack of the error message.
If it is a connection issue, you can increase the connection timeout parameter. If you are still getting the same error, it is likely caused by a network issue.
From the information you provided, it is a query/command timeout error. To work around it you can set CommandTimeout on the query or command:
command.CommandTimeout = 10;
The default timeout value is 30 seconds; if the timeout value is set to 0 (no time limit), the query will continue to run until it finishes.
For more information, refer to "Troubleshoot query time-out errors" provided by Microsoft.

RabbitMQ and queue data

I have an application with RabbitMQ where I get the number of messages in a Rabbit queue using the HTTP API (/api/queues/vhost/name).
However, it appears that this information is refreshed from time to time (by default every 5 seconds). I thought the information was always up to date, and that it was only the administration page that was refreshed at a given interval.
Is there any way to get the number of messages in a queue with real-time information?
Thank you
The management database is updated every 5 seconds by default.
Use the command line rabbitmqctl list_queues for real-time values.
Try to use:
channel.messageCount(your_queue)
See if it works for you.
/**
 * Returns the number of messages in a queue ready to be delivered
 * to consumers. This method assumes the queue exists. If it doesn't,
 * the channel will be closed with an exception.
 * @param queue the name of the queue
 * @return the number of messages in ready state
 * @throws IOException Problem transmitting method.
 */
long messageCount(String queue) throws IOException;
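If it helps, here is a minimal Java sketch of that call with the RabbitMQ Java client (host and queue name are placeholders; messageCount counts only messages in the ready state, not unacked ones):
ConnectionFactory factory = new ConnectionFactory();
factory.setHost("localhost"); // adjust host/credentials for your broker
try (Connection conn = factory.newConnection();
     Channel channel = conn.createChannel()) {
    long ready = channel.messageCount("your_queue");
    System.out.println("Ready messages: " + ready);
}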

AWS RDS PostgreSQL error "remaining connection slots are reserved for non-replication superuser connections"

In the dashboard I see there are currently 22 open connections to the DB instance, blocking new connections with the error:
remaining connection slots are reserved for non-replication superuser connections.
I'm accessing the DB from a web service API running on an EC2 instance and always follow the best practice of:
Connection connection = DriverManager.getConnection(URL, USER_NAME, PASSWORD);
Class.forName(DB_CLASS);
Statement statement = connection.createStatement();
ResultSet resultSet = statement.executeQuery(SQL_Query_String);
...
resultSet.close();
statement.close();
connection.close();
Can I do something else in the code?
Should I do something else in the DB management?
Is there a way to periodically close connections?
Amazon sets the maximum number of connections for each instance model based on the amount of memory that model is entitled to:
MODEL max_connections innodb_buffer_pool_size
--------- --------------- -----------------------
t1.micro 34 326107136 ( 311M)
m1-small 125 1179648000 ( 1125M, 1.097G)
m1-large 623 5882511360 ( 5610M, 5.479G)
m1-xlarge 1263 11922309120 (11370M, 11.103G)
m2-xlarge 1441 13605273600 (12975M, 12.671G)
m2-2xlarge 2900 27367833600 (26100M, 25.488G)
m2-4xlarge 5816 54892953600 (52350M, 51.123G)
But if you want, you can change the maximum connection count to a custom value:
from the RDS Console > Parameter Groups > Edit Parameters,
you can change the value of the max_connections parameter.
To close connections periodically, you can set up a cron job running something like this:
select pg_terminate_backend(procpid)
from pg_stat_activity
where usename = 'yourusername'
and current_query = '<IDLE>'
and query_start < current_timestamp - interval '5 minutes';
I'm using Amazon RDS, Scala, PostgreSQL & Slick. First of all, the number of available connections in RDS depends on the amount of available RAM, i.e. the size of the RDS instance. It's best not to change the default connection number.
You can check the max connection number by executing the following SQL statement on your RDS DB instance:
show max_connections;
Check your Spring configuration to see how many threads you're spawning:
database {
  dataSourceClass = org.postgresql.ds.PGSimpleDataSource
  properties = {
    url = "jdbc:postgresql://test.cb1111.us-east-2.rds.amazonaws.com:6666/dbtest"
    user = "youruser"
    password = "yourpass"
  }
  numThreads = 90
}
All of the connections ARE made upon Spring Boot initialization, so beware not to cross the RDS limit. That includes other services that connect to the DB. In this case the number of connections will be 90+.
The current limit for db.t2.small is 198 (4GB of RAM)
You can change idle_in_transaction_session_timeout in the parameter group to remove idle connections.
idle_in_transaction_session_timeout (integer)
Terminate any session with an open transaction that has been idle for
longer than the specified duration in milliseconds. This allows any
locks held by that session to be released and the connection slot to
be reused; it also allows tuples visible only to this transaction to
be vacuumed. See Section 24.1 for more details about this.
The default value of 0 disables this feature.
The current value in AWS RDS is 86400000 which when converted to hours (86400000/1000/60/60) is 24 hours.
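If you want to try the setting out before touching the parameter group, a hypothetical JDBC snippet that applies it per session (connection details are placeholders):
try (Connection conn = DriverManager.getConnection(URL, USER_NAME, PASSWORD);
     Statement stmt = conn.createStatement()) {
    // Terminate transactions that stay idle for more than 5 minutes, for this session only;
    // the parameter group setting applies the same idea server-wide.
    stmt.execute("SET idle_in_transaction_session_timeout = '300000'"); // milliseconds
    // ... run your queries on conn ...
}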
You can change the max connections in the Parameter Group for your RDS instance. Try to increase it.
Or you can try to upgrade your instance, as the max connections value is set to {DBInstanceClassMemory/31457280}.
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_WorkingWithParamGroups.html

BigQuery API Java client intermittently returning bad results

I am executing some long-running queries using the BigQuery Java client.
I construct a BigQuery job and execute it like this:
val queryRequest = new QueryRequest().setQuery(query)
val queryJob = client.jobs().query(ProjectId, queryRequest)
queryJob.execute()
The problem I am facing is that, for the same query, the client returns before the job is complete, i.e. the number of rows in the result is zero.
I tried printing the response and it shows:
{"jobComplete":false,"jobReference":{"jobId":"job_bTLRGrw5_xR26i9Li3a9EQvuA6c","projectId":"analytics-production"},"kind":"bigquery#queryResponse"}
From that I can see that the job is not complete. Then why did the client return before the job was complete?
While building the client, I use an HttpRequestInitializer, and in the initialize method I provide the timeout parameters:
override def initialize(request: HttpRequest): Unit = {
request.setConnectTimeout(...)
request.setReadTimeout(...)
}
I tried giving high values for the timeout, like 240 seconds, but no luck. The behavior is still the same; it fails intermittently.
Make sure you set the timeout on the BigQuery request body, and not on the HTTP object.
val queryRequest = new QueryRequest().setQuery(query).setTimeoutMs(10000) //10 seconds
The param is timeoutMs. This is documented here: https://cloud.google.com/bigquery/docs/reference/v2/jobs/query
Please also read the docs regarding this field: How long to wait for the query to complete, in milliseconds, before the request times out and returns. Note that this is only a timeout for the request, not the query. If the query takes longer to run than the timeout value, the call returns without any results and with the 'jobComplete' flag set to false. You can call GetQueryResults() to wait for the query to complete and read the results. The default value is 10000 milliseconds (10 seconds).
More about Synchronous queries here
https://cloud.google.com/bigquery/querying-data#syncqueries

Read Timed Out: synchronous query via BigQuery Java API

We are using the BigQuery Java API to retrieve results for our analytics reporting frontend. We are trying to retrieve the results synchronously. A lot of times we get a Read timed out error, even before the query timeout specified in the parameters. Here's the stack trace for a sample failure:
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at com.sun.net.ssl.internal.ssl.InputRecord.readFully(InputRecord.java:293)
at com.sun.net.ssl.internal.ssl.InputRecord.read(InputRecord.java:331)
at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:830)
at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:787)
at com.sun.net.ssl.internal.ssl.AppInputStream.read(AppInputStream.java:75)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:697)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:640)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1195)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:379)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:318)
at com.google.api.client.http.javanet.NetHttpResponse.<init>(NetHttpResponse.java:36)
at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:94)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:965)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:410)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:343)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:460)
I am not able to retrieve the job id of the resulting job, as the error occurs before I can retrieve a JobReference object. The timeout specified in this case was 300 seconds. The query failed well before that. The query contains three JOINs and several GROUP EACH BY clauses. Can you suggest a possible way to debug this?
Adding the code snippet:
QueryRequest queryInfo = new QueryRequest().setQuery(sql)
.setTimeoutMs(timeOutInSec * 1000);
// get project id
BQGameConnectionDetails details = Config
.getBQConnectionDetails(gameId);
String projectId = details.getProjectId();
Bigquery.Jobs.Query queryRequest = getInstance(gameId).jobs()
.query(projectId, queryInfo);
QueryResponse response = queryRequest.execute();
There are two timeouts involved. The first timeout is in the HTTP request you've sent to bigquery. The second is in the bigquery request timeout. It sounds like you've set the latter to a large value, but the former is likely the timeout that you're hitting. If the HTTP request times out before the BigQuery timeout, the connection will be closed and BigQuery won't have a chance to respond.
There are two options: the first is to increase the HTTP request timeout (which depends on the libraries you're using, but this page here may be helpful). The second is to decrease the BigQuery timeout. This means you'll have to use jobs.getQueryResults() to read the actual results, but this is a more robust method because it doesn't matter how long the query takes; you can just call getQueryResults() in a loop. I would post a link to a good Java sample that does this, but I don't know that one exists, unfortunately.
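To make that concrete, here is a rough Java sketch of both options with the v2 BigQuery client (bigquery, sql, projectId and the timeout values are placeholders; jobs.getQueryResults is the documented call for polling results):
// 1) Raise the HTTP read timeout (the timeout from the stack trace above).
HttpRequestInitializer initializer = new HttpRequestInitializer() {
    @Override
    public void initialize(HttpRequest request) throws IOException {
        request.setConnectTimeout(60 * 1000);
        request.setReadTimeout(360 * 1000); // keep this above the BigQuery timeoutMs
    }
};
// 2) Or keep the BigQuery timeout short and poll for completion instead.
QueryRequest queryInfo = new QueryRequest().setQuery(sql).setTimeoutMs(10000L);
QueryResponse response = bigquery.jobs().query(projectId, queryInfo).execute();
String jobId = response.getJobReference().getJobId();
GetQueryResultsResponse results = bigquery.jobs().getQueryResults(projectId, jobId).execute();
while (!Boolean.TRUE.equals(results.getJobComplete())) {
    Thread.sleep(1000); // back off between polls
    results = bigquery.jobs().getQueryResults(projectId, jobId).execute();
}
// results.getRows() now holds the data; page through with setPageToken if needed.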