Reducers failing - apache

We are using a 3-machine cluster and the mapreduce.tasktracker.reduce.tasks.maximum property is set to 9. When I set the number of reducers to 9 or fewer, the job succeeds, but if I set it higher than 9, it fails with the exception "Task attempt_201701270751_0001_r_000000_0 failed to ping TT for 60 seconds. Killing!". Can anyone tell me what the problem might be?

There seems to be a bug in Hadoop 0.20.
https://issues.apache.org/jira/browse/MAPREDUCE-1905 (for reference).
Can you please try increasing the task timeout?
(Set mapreduce.task.timeout to a higher value, in milliseconds; 0 disables the timeout.)

Related

Circuit breaker config not working properly

I am setting up a new Moleculer project and trying to configure the circuit breaker in moleculer.config.js, with windowOpen set to 6 sec. But when I perform any operation and throw an error, the circuit doesn't break. I am not able to find any solution for this.
Need help.
Here you can see the parameters: https://moleculer.services/docs/0.13/fault-tolerance.html#Settings
The default values don't trip the CB after the first error: the number of requests has to reach minRequestCount first. If you want to trip the CB after the first error, set threshold: 1 and minRequestCount: 0.
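To illustrate why a single error does not trip the breaker with the defaults, here is a minimal sketch in Python. It is a simplified model of a failure-rate circuit breaker, not Moleculer's implementation: the class name, the field names and the default numbers are assumptions for illustration (check the linked settings page for the real defaults), and the time window and half-open state are left out.

```python
from dataclasses import dataclass


@dataclass
class CircuitBreaker:
    """Simplified failure-rate breaker (hypothetical model, not Moleculer's code)."""

    threshold: float = 0.5       # failure ratio at which the breaker trips (assumed)
    min_request_count: int = 20  # requests needed before the ratio is checked (assumed)
    requests: int = 0
    failures: int = 0
    is_open: bool = False

    def record(self, success: bool) -> None:
        """Count one finished request; trip the breaker on a qualifying failure."""
        self.requests += 1
        if success:
            return
        self.failures += 1
        # The failure ratio only matters once enough requests have been seen.
        enough_requests = self.requests >= self.min_request_count
        if enough_requests and self.failures / self.requests >= self.threshold:
            self.is_open = True


# With the illustrative defaults, one error is not enough to open the circuit.
default_cb = CircuitBreaker()
default_cb.record(success=False)

# With threshold 1 and no minimum request count, the first error opens it.
eager_cb = CircuitBreaker(threshold=1, min_request_count=0)
eager_cb.record(success=False)
```

In this model `default_cb.is_open` stays false after the single failure, while `eager_cb.is_open` becomes true, which mirrors the behaviour described in the answer.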

Login fail attempt delay

I've read a bit about login security and found a good practice for preventing rapid-fire login attempts. The idea is to apply a short time delay that increases with the number of failed attempts, like:
1 failed attempt = no delay
2 failed attempts = 2 sec delay
3 failed attempts = 4 sec delay
4 failed attempts = 8 sec delay
5 failed attempts = 16 sec delay
etc.
I understand the idea, but I would like to know how to code this.
Where and how should I put the delay? In the backend or in the frontend? I think it should be in the backend, but how could I do that? How can I pause the current attempt for a few seconds and then continue? Any ideas?
Thanks!
I found that I should put it in the backend, using some method that delays the current thread, as seen here.
If I do that, it won't affect the other users, will it?
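A minimal backend-side sketch of the delay schedule from the question, in Python. Note that it swaps in a different technique from sleeping the request thread: it records, per user, the time at which the next attempt is allowed, so other users are not affected and no thread is blocked. `login_delay_seconds` and `LoginThrottle` are hypothetical names, and the in-memory dictionaries are an assumption; a real deployment would likely need shared storage.

```python
import time


def login_delay_seconds(failed_attempts: int) -> int:
    """Delay from the schedule in the question: 0 s after the first failure, then 2, 4, 8, ..."""
    if failed_attempts <= 1:
        return 0
    return 2 ** (failed_attempts - 1)


class LoginThrottle:
    """Tracks failed logins per user in memory (hypothetical helper, not a library API)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._failures = {}       # username -> consecutive failed attempts
        self._blocked_until = {}  # username -> time at which the next attempt is allowed

    def seconds_until_allowed(self, username: str) -> float:
        """How long this user still has to wait; 0.0 means an attempt is allowed now."""
        return max(0.0, self._blocked_until.get(username, 0.0) - self._clock())

    def record_failure(self, username: str) -> int:
        """Register a failed attempt and return the delay that now applies."""
        count = self._failures.get(username, 0) + 1
        self._failures[username] = count
        delay = login_delay_seconds(count)
        self._blocked_until[username] = self._clock() + delay
        return delay

    def record_success(self, username: str) -> None:
        """A successful login resets the user's failure history."""
        self._failures.pop(username, None)
        self._blocked_until.pop(username, None)
```

A login handler would call `seconds_until_allowed(username)` first and reject the attempt if the result is greater than zero, then call `record_failure` or `record_success` depending on the password check.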

What is the race condition for Redis INCR Rate Limiter 2?

I have read the INCR documentation here, but I could not understand why Rate limiter 2 has a race condition.
In addition, what does the documentation mean by "the key will be leaked until we'll see the same IP address again"?
Can anyone help explain? Thank you very much!
You are talking about the following code, which has two problems in a multi-threaded environment.
1.  FUNCTION LIMIT_API_CALL(ip):
2.      current = GET(ip)
3.      IF current != NULL AND current > 10 THEN
4.          ERROR "too many requests per second"
5.      ELSE
6.          value = INCR(ip)
7.          IF value == 1 THEN
8.              EXPIRE(ip,1)
9.          END
10.         PERFORM_API_CALL()
11.     END
the key will be leaked until we'll see the same IP address again
If the client dies (e.g. it is killed or its machine goes down) before executing LINE 8, the key ip never gets an expiration. If we never see this ip again, the key persists in the Redis database forever, i.e. it is leaked.
Rate limiter 2 has a race condition
Suppose the key ip doesn't exist in the database. If more than 10 clients, say 20, execute LINE 2 simultaneously, all of them get a NULL current and go into the ELSE clause. Eventually all of these clients execute LINE 10, and the API is called more than 10 times.
This solution fails because there's a time window between the GET on LINE 2 and the INCR on LINE 6: the check and the increment are not one atomic step.
A Correct Solution
value = INCR(ip)
IF value == 1 THEN
    EXPIRE(ip, 1)
END
IF value <= 10 THEN
    return true
ELSE
    return false
END
Wrap the above code into a Lua script to ensure it runs atomically. If this script returns true, perform the API call. Otherwise, do nothing.
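The INCR-first logic above can be modelled in a few lines of Python. This is only a sketch: `InMemoryCounterStore` is a hypothetical stand-in for Redis, where a lock plays the role of the atomic Lua script, and the IP address and the frozen clock are made up for the simulation. It shows that when 20 clients hit the limiter at once, only 10 are let through.

```python
import threading
import time


class InMemoryCounterStore:
    """Hypothetical stand-in for Redis: INCR and EXPIRE run under one lock."""

    def __init__(self, clock=time.monotonic):
        self._lock = threading.Lock()
        self._values = {}      # key -> current count
        self._expires_at = {}  # key -> absolute expiry time
        self._clock = clock

    def incr_with_expire(self, key, ttl_seconds):
        """Increment key and set its TTL on first use, as one atomic step."""
        with self._lock:
            now = self._clock()
            # Drop the key once its TTL has passed, as an expired Redis key would be.
            if key in self._expires_at and self._expires_at[key] <= now:
                del self._values[key]
                del self._expires_at[key]
            value = self._values.get(key, 0) + 1
            self._values[key] = value
            if value == 1:
                self._expires_at[key] = now + ttl_seconds
            return value


def allow_request(store, ip, limit=10, window_seconds=1):
    """Return True if this call is within the limit for the current window."""
    return store.incr_with_expire(ip, window_seconds) <= limit


def simulate(clients=20):
    """Fire `clients` concurrent requests from one IP and count the allowed ones."""
    store = InMemoryCounterStore(clock=lambda: 0.0)  # frozen clock: a single window
    results = []
    results_lock = threading.Lock()

    def worker():
        allowed = allow_request(store, "203.0.113.7")
        with results_lock:
            results.append(allowed)

    threads = [threading.Thread(target=worker) for _ in range(clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)
```

Because every caller increments first and decides afterwards, each one sees a distinct counter value, so there is no gap between checking and counting. Setting the expiry in the same atomic step as the first increment also avoids the leaked key.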

Mono error when load testing

During load testing (using Load UI) of a new .NET Web API running on Mono, hosted on a medium-sized Amazon server, I'm receiving the following results (in chronological order over the course of about ten minutes):
5 connections per second for 60 seconds
No errors
50 connections per second for 60 seconds
No errors
100 connections per second for 60 seconds
Received 3 errors, appearing later during the run
2014-02-07 00:12:10Z Error HttpResponseExtensions Error occured while Processing Request: [IOException] Write failure Write failure|The socket has been shut down
2014-02-07 00:12:10Z Info HttpResponseExtensions Failed to write error to response: {0} Cannot be changed after headers are sent.
5 connections per second for 60 seconds
No errors
100 connections per second for 30 seconds
No errors
100 connections per second for 60 seconds
Received 1 error same as above, appearing later during the run
100 connections per second for 45 seconds
No errors
From some research, this error seems to be the standard one raised when a client closes the connection. As it only occurs during the heavier load tests, I am wondering whether the server instance is simply reaching the upper limit of what it can support. If not, any suggestions on hunting down the source of the errors?

errors while running hive queries

I am trying to run Hive queries, but I am getting the following errors:
hive> FROM (
> FROM t1
> MAP t1.patient_mrn, t1.encounter_date
> USING 'retrieve'
> AS mp1, mp2
> CLUSTER BY mp1) map_output
> INSERT OVERWRITE TABLE t3
> REDUCE map_output.mp1, map_output.mp2
> USING 'q1.txt'
> AS reducef1, reducef2;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapred.reduce.tasks=<number>
Starting Job = job_201112281627_0097, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201112281627_0097
Kill Command = /home/hadoop/hadoop-0.20.2-cdh3u2//bin/hadoop job -Dmapred.job.tracker=localhost:54311 -kill job_201112281627_0097
2011-12-31 03:10:46,391 Stage-1 map = 0%, reduce = 0%
2011-12-31 03:11:29,794 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201112281627_0097 with errors
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
hive>
Without knowing a lot more, the best advice I can give is where to find the error logs. Go to your JobTracker's web page, find the page for that job, and drill down to the error logs.
Look for any "failed" tasks and click through to the page for that specific task.
You'll eventually get to the page containing the task-specific log, and that should help you diagnose the problem.
This could happen in any number of scenarios. Rerun the query, check the JobTracker for the failed/killed attempts, and go through the logs for the exact reason.