Redis timeout issue

I am getting a timeout exception with Redis. Here is the exception:
System.TimeoutException: Timeout performing EVAL, inst: 1, mgr: ProcessReadQueue, err: never, queue: 7, qu: 0, qs: 7, qc: 0, wr: 0, wq: 0, in: 507, ar: 1, clientName: EG-APP04, serverEndpoint: Unspecified/####, keyHashSlot: 12598, IOCP: (Busy=0,Free=1600,Min=800,Max=1600), WORKER: (Busy=187,Free=1413,Min=800,Max=1600), Local-CPU: unavailable
Configuration: Connection string
connectionString="####,connectRetry=10, abortConnect=false, allowAdmin=true, ssl=false"
Am I missing something?

Try increasing your syncTimeout to a higher value, as below. syncTimeout bounds how long a synchronous operation such as this EVAL may block before timing out, and its default is only a few seconds:
connectionString="####,connectRetry=10, abortConnect=false, allowAdmin=true, ssl=false, syncTimeout=30000"


Catch Scrapy exception when crawling from Airflow

I'm trying to catch the exception that occurs in my spider so that I can mark the task instance as failed. Currently the task finishes and is marked as succeeded. I'm calling crawl() from a PythonOperator in Airflow, as follows:
with DAG(
        'MySpider',
        default_args=default_args,
        schedule_interval=None) as dag:

    t1 = PythonOperator(
        task_id="crawler_task",
        python_callable=run_crawler,
        op_kwargs=dag_kwargs
    )
Here is my run_crawler() method:
def run_crawler(**kwargs):
    project_settings = set_project_settings({
        'FEEDS': {
            f'{kwargs["bucket"]}%(time)s.{kwargs["format"]}': {
                'format': kwargs["format"],
                'encoding': 'utf8',
                'store_empty': kwargs["store_empty"]
            }
        }
    })
    print("Project settings: ")
    pprint(project_settings.attributes.items())
    set_connection("airflow", kwargs["gcs_connection_id"])
    process = CrawlerProcess(project_settings)
    process.crawl(spider.MySpider)
    print("Starting crawler...")
    process.start()
When running, I have problems with GCS credentials, which leads to an exception, as follows:
google.auth.exceptions.DefaultCredentialsError: The file /tmp/file_my_credentials.json does not have a valid type. Type is None, expected one of ('authorized_user', 'service_account', 'external_account', 'external_account_authorized_user', 'impersonated_service_account', 'gdch_service_account').
{logging_mixin.py:115} WARNING - [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 21087,
'downloader/request_count': 68,
'downloader/request_method_count/GET': 68,
'downloader/response_bytes': 1863876,
'downloader/response_count': 68,
'downloader/response_status_count/200': 68,
'elapsed_time_seconds': 25.647386,
'feedexport/failed_count/GCSFeedStorage': 1,
'httpcompression/response_bytes': 9212776,
'httpcompression/response_count': 68,
'item_scraped_count': 66,
'log_count/DEBUG': 136,
'log_count/ERROR': 1,
'log_count/INFO': 10,
'log_count/WARNING': 3,
'memusage/max': 264441856,
'memusage/startup': 264441856,
'request_depth_max': 1,
'response_received_count': 68,
'scheduler/dequeued': 68,
'scheduler/dequeued/memory': 68,
'scheduler/enqueued': 68,
'scheduler/enqueued/memory': 68,
[2032-13-13, 09:04:28 UTC] {engine.py:389} INFO - Spider closed (finished)
[2032-13-13, 09:04:28 UTC] {logging_mixin.py:115} WARNING -
[scrapy.core.engine] INFO: Spider closed (finished)
[2032-13-13, 09:04:28 UTC] {python.py:173} INFO - Done. Returned value was: None
[2032-13-13, 09:04:28 UTC] {taskinstance.py:1408} INFO - Marking task as SUCCESS. dag_id=MySpider, task_id=crawler_task, execution_date=2032-13-13, start_date=2032-13-13, end_date=2032-13-13
[2032-13-13, 09:04:28 UTC] {local_task_job.py:156} INFO - Task exited with return code 0
[2032-13-13, 09:04:28 UTC] {local_task_job.py:279} INFO - 0 downstream tasks scheduled from follow-on schedule check
As you can see, even with this exception, the task itself is marked as "SUCCESS". Is it possible to catch the exception and mark the task as FAILED, so that we can follow it in the Airflow (Composer) interface?
Thank you
I don't understand why in this case the exception doesn't break the task.
You can add a try/except in the run_crawler method and then raise your own exception in the except block:
import logging

class CustomException(Exception):
    pass

def run_crawler(**kwargs):
    try:
        project_settings = set_project_settings({
            'FEEDS': {
                f'{kwargs["bucket"]}%(time)s.{kwargs["format"]}': {
                    'format': kwargs["format"],
                    'encoding': 'utf8',
                    'store_empty': kwargs["store_empty"]
                }
            }
        })
        print("Project settings: ")
        pprint(project_settings.attributes.items())
        set_connection("airflow", kwargs["gcs_connection_id"])
        process = CrawlerProcess(project_settings)
        process.crawl(spider.MySpider)
        print("Starting crawler...")
        process.start()
    except Exception as err:
        logging.exception("Error in the Airflow task!")
        raise CustomException("Error in the Airflow task!") from err
When your custom exception is raised, it will break the Airflow task and mark it as failed.
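As a side note (my addition, not part of the original answer): Airflow also ships its own exception type, so you can re-raise with AirflowException instead of defining a custom class. A minimal sketch, where crawl_with_scrapy is a hypothetical helper wrapping the CrawlerProcess logic above:

import logging

from airflow.exceptions import AirflowException

def run_crawler(**kwargs):
    try:
        crawl_with_scrapy(**kwargs)  # hypothetical helper wrapping the CrawlerProcess logic
    except Exception as err:
        logging.exception("Crawler failed")
        # Any uncaught exception fails the task; AirflowException just makes the intent explicit.
        raise AirflowException("Crawler failed") from err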

SSL handshake 20s delay to complete process

We are experiencing a delay when accessing an https:// connection to our server. It takes around 20 s for the page content to be displayed, which we traced to the SSL handshake.
Using
strace -o /tmp/curl.trace.log -f -tt curl -kvv --trace-time https://10.20.23.7:
We traced it to:
91445 09:22:03.711590 getpeername(3, {sa_family=AF_INET, sin_port=htons(9070), sin_addr=inet_addr("10.20.23.7")}, [16]) = 0
91445 09:22:03.773153 sendto(3, "\26\3\1\0\220\1\0\0\214\3\3\332\3n\204\371\2472v\341E[\2247=2;\336\214\266}\4"..., 149, 0, NULL, 0) = 149
91445 09:22:03.773267 recvfrom(3, 0x1c2b258, 5, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
91445 09:22:03.773315 poll([{fd=3, events=POLLIN|POLLPRI}], 1, 5000) = 0 (Timeout)
91445 09:22:08.778047 poll([{fd=3, events=POLLIN|POLLPRI}], 1, 5000) = 0 (Timeout)
91445 09:22:13.783040 poll([{fd=3, events=POLLIN|POLLPRI}], 1, 5000) = 0 (Timeout)
91445 09:22:18.787090 poll([{fd=3, events=POLLIN|POLLPRI}], 1, 5000) = 1 ([{fd=3, revents=POLLIN}])
91445 09:22:20.183606 recvfrom(3, "\26\3\3\n\264", 5, 0, NULL, NULL) = 5
91445 09:22:20.183724 recvfrom(3, "\2\0\0M\3\3]\312\300\34\367\363\256%V\25s)\tSVX\317*\272=\205(\311\1<\345"..., 2740, 0, NULL, NULL) = 2740
91445 09:22:20.184242 fcntl(4, F_SETLK, {type=F_RDLCK, whence=SEEK_SET, start=1073741824, len=1}) = 0
As I understand it, recvfrom is not able to read the response to the request that was sent with sendto(...).
What does this sendto line mean? Where was the request sent?
91445 09:22:03.773153 sendto(3, "\26\3\1\0\220\1\0\0\214\3\3\332\3n\204\371\2472v\341E[\2247=2;\336\214\266}\4"..., 149, 0, NULL, 0) = 149
Thank you for the info.
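For what it's worth, that sendto call is curl writing the TLS ClientHello (the record starts with \26\3\1, i.e. a TLS handshake record), and the repeated poll timeouts show the client then waiting roughly 16 s before the server's reply arrives, which suggests the server is slow to answer the handshake. To time this independently of curl, here is a minimal Python sketch (host and port taken from the getpeername line in the trace; adjust as needed):

import socket
import ssl
import time

HOST, PORT = "10.20.23.7", 9070  # endpoint from the getpeername line

t0 = time.monotonic()
sock = socket.create_connection((HOST, PORT), timeout=30)
t1 = time.monotonic()

# Skip certificate verification, like curl -k
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

tls_sock = ctx.wrap_socket(sock)  # blocks until the handshake completes
t2 = time.monotonic()

print(f"TCP connect:   {t1 - t0:.3f} s")
print(f"TLS handshake: {t2 - t1:.3f} s")
tls_sock.close()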

RabbitMQ - vhost '/' is down for user 'XYZ', even though the user has all access

I am using RabbitMQ version 3.7.17.
My AWS hard disk was completely occupied (100% full), due to which all the services stopped working.
I extended the AWS server's storage and then tried to start all the API services, after which they started throwing this error:
Connection.open: (541) INTERNAL_ERROR - access to vhost '/' refused for user 'XYZ': vhost '/' is down
I restarted the RabbitMQ server using the command below, but it still gave the error:
sudo service rabbitmq-server restart
I checked the permissions for my user using:
sudo rabbitmqctl list_permissions --vhost /
The response shows that the user has full access:
Listing permissions for vhost "/" ...
user configure write read
XYZ .* .* .*
Thank You.
Because the disk was full, RabbitMQ could not finish what it was processing, which left the vhost in a broken state.
When I tried to restart the vhost with sudo rabbitmqctl restart_vhost, it failed with:
ERROR:
Failed to start vhost '/' on node 'rabbit#ip-172-31-16-172'Reason: {:shutdown, {:failed_to_start_child, :rabbit_vhost_process, {:error, {{{:function_clause, [{:rabbit_queue_index, :journal_minus_segment1, [{{true, <<230, 140, 82, 5, 193, 81, 136, 75, 11, 91, 31, 232, 119, 30, 99, 112, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 144>>, <<131, 104, 6, 100, 0, 13, 98, 97, 115, 105, 99, 95, 109, 101, 115, 115, 97, 103, 101, 104, 4, 100, 0, 8, 114, 101, 115, 111, 117, 114, ...>>}, :no_del, :no_ack}, {{true, <<230, 140, 82, 5, 193, 81, 136, 75, 11, 91, 31, 232, 119, 30, 99, 112, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 144>>, <<131, 104, 6, 100, 0, 13, 98, 97, 115, 105, 99, 95, 109, 101, 115, 115, 97, 103, 101, 104, 4, 100, 0, 8, 114, 101, 115, 111, 117, ...>>}, :del, :no_ack}], [file: 'src/rabbit_queue_index.erl', line: 1231]}, {:rabbit_queue_index, :"-journal_minus_segment/3-fun-0-", 4, [file: 'src/rabbit_queue_index.erl', line: 1208]}, {:array, :sparse_foldl_3, 7, [file: 'array.erl', line: 1684]}, {:array, :sparse_foldl_2, 9, [file: 'array.erl', line: 1678]}, {:rabbit_queue_index, :"-recover_journal/1-fun-0-", 1, [file: 'src/rabbit_queue_index.erl', line: 915]}, {:lists, :map, 2, [file: 'lists.erl', line: 1239]}, {:rabbit_queue_index, :segment_map, 2, [file: 'src/rabbit_queue_index.erl', line: 1039]}, {:rabbit_queue_index, :recover_journal, 1, [file: 'src/rabbit_queue_index.erl', line: 906]}]}, {:gen_server2, :call, [#PID<10397.473.0>, :out, :infinity]}}, {:child, :undefined, :msg_store_persistent, {:rabbit_msg_store, :start_link, [:msg_store_persistent, '/var/lib/rabbitmq/mnesia/rabbit#ip-172-31-16-172/msg_stores/vhosts/628WB79CIFDYO9LJI6DKMI09L', [], {#Function<2.32138423/1 in :rabbit_queue_index>, {:start, [{:resource, "/", :queue, "xx_queue"}, {:resource, "/", :queue, "app_xxx_queue"}, {:resource, "/", :queue, "default"}, {:resource, "/", :queue, "xx_priority_queue"}, {:resource, "/", :queue, "xxx_queue"}, {:resource, "/", :queue, "xxxx_queue"}, {:resource, "/", :queue, "yyy_queue"}, {:resource, "/", :queue, "zzz_queue"}, {:resource, "/", :queue, "aaa_queue"}]}}]}, :transient, 30000, :worker, [:rabbit_msg_store]}}}}}
STEPS TO SOLVE IT
1. Stop your app node with the command below:
sudo rabbitmqctl stop_app
2. Reset your node with the command below. This removes the node from any cluster it belongs to, removes all data from the management database (such as configured users and vhosts), and deletes all persistent messages, so be careful with it. To back up your data before the reset, look here.
sudo rabbitmqctl reset
3. Start your node with the command below:
sudo rabbitmqctl start_app
4. Restart your vhost with the command below:
sudo rabbitmqctl restart_vhost
If you are using an application that depends on RabbitMQ (in my case, Celery), you will have to restart it as well.
This was the link that helped me to solve it.
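Once the vhost is back up, a quick connectivity check before restarting the dependent services can save a round of debugging. A minimal sketch using the pika client (my addition; the credentials and host are hypothetical placeholders):

import pika

# Hypothetical placeholders - use your real credentials, host and vhost
credentials = pika.PlainCredentials("XYZ", "password")
params = pika.ConnectionParameters(host="localhost", virtual_host="/",
                                   credentials=credentials)

connection = pika.BlockingConnection(params)  # raises if the vhost is still down
print("vhost '/' is reachable:", connection.is_open)
connection.close()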

Can we use pytorch scatter_ on GPU

I'm trying to do one-hot encoding on some data with PyTorch in GPU mode, but it keeps giving me an exception. Can anybody help me?
Here's one example:
def char_OneHotEncoding(x):
    coded = torch.zeros(x.shape[0], x.shape[1], 101)
    for i in range(x.shape[1]):
        coded[:,i] = scatter(x[:,i])
    return coded

def scatter(x):
    return torch.zeros(x.shape[0], 101).scatter_(1, x.view(-1,1), 1)
So if I give it a tensor on the GPU, it fails like this:
x_train = [[ 0,  0,  0,  0,  0],
           [ 0,  0,  0,  0,  0],
           [ 0,  0,  0,  0,  0],
           [14, 13, 83, 18, 14],
           [ 0,  0,  0,  0,  0]]
print(char_OneHotEncoding(torch.tensor(x_train, dtype=torch.long).cuda()).shape)
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-62-95c0c4ade406> in <module>()
4 [14, 13, 83, 18, 14],
5 [ 0, 0, 0, 0, 0]]
----> 6 print(char_OneHotEncoding(torch.tensor(x_train, dtype=torch.long).cuda()).shape)
7 x_train[:5, maxlen:maxlen+5]
<ipython-input-53-055f1bf71306> in char_OneHotEncoding(x)
2 coded = torch.zeros(x.shape[0], x.shape[1], 101)
3 for i in range(x.shape[1]):
----> 4 coded[:,i] = scatter(x[:,i])
5 return coded
6
<ipython-input-53-055f1bf71306> in scatter(x)
7
8 def scatter(x):
----> 9 return torch.zeros(x.shape[0], 101).scatter_(1, x.view(-1,1), 1)
RuntimeError: Expected object of backend CPU but got backend CUDA for argument #3 'index'
BTW, if we simply remove the .cuda() here, everything works fine:
print(char_OneHotEncoding(torch.tensor(x_train, dtype=torch.long)).shape)
torch.Size([5, 5, 101])
Yes, it is possible. You have to make sure all the tensors are on the GPU. In particular, by default, constructors like torch.zeros allocate on the CPU, which leads to this kind of mismatch. Your code can be fixed by constructing with device=x.device, as below:
import torch

def char_OneHotEncoding(x):
    coded = torch.zeros(x.shape[0], x.shape[1], 101, device=x.device)
    for i in range(x.shape[1]):
        coded[:,i] = scatter(x[:,i])
    return coded

def scatter(x):
    return torch.zeros(x.shape[0], 101, device=x.device).scatter_(1, x.view(-1,1), 1)

x_train = torch.tensor([
    [ 0,  0,  0,  0,  0],
    [ 0,  0,  0,  0,  0],
    [ 0,  0,  0,  0,  0],
    [14, 13, 83, 18, 14],
    [ 0,  0,  0,  0,  0]
], dtype=torch.long, device='cuda')

print(char_OneHotEncoding(x_train).shape)
Another alternative is the family of constructors named xxx_like, for instance zeros_like, though in this case, since you need a different shape than x, I found device=x.device more readable.
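For completeness, a tiny sketch of the _like variant (my illustration, not from the original answer):

import torch

x = torch.tensor([1, 2, 3], device='cuda')
z = torch.zeros_like(x)  # same shape, dtype and device as x
print(z.device)          # prints cuda:0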

How to redirect GAP output to a text file on a local drive?

Example
m1;
[ [ -1, 0, 0, 0, 0, 0 ], [ 1, 1, -1, 0, 0, 0 ], [ 0, 0, -1, 0, 0, 0 ], [ 0, 0, 0, -1, 0, 0 ], [ 0, 0, 0, 1, 1, 0 ],
[ 0, 0, 0, 0, 0, -1 ] ]
In the Windows version of the GAP system, how do I redirect output to a text file on a local drive?
You may use the LogTo command to save the inputs and outputs of the whole GAP session, or PrintTo to print a single object to a text file. Enter ?LogTo and ?PrintTo in GAP to see the documentation.
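For example, a minimal sketch (the file paths are placeholders of my choosing; on Windows, GAP accepts forward-slash paths):

LogTo("C:/tmp/session.log");    # everything typed and printed from now on goes to the log
PrintTo("C:/tmp/m1.txt", m1);   # write just the object m1 to a file
LogTo();                        # calling LogTo with no argument stops logging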
P.S. If you prefer to ask questions about GAP within the Stack Exchange framework, I'd recommend asking them at the Mathematics Q&A site here.