Timeout while waiting for a reply from the bitbake server (60s) (Toaster tool) - embedded

Bitbake build is failing with timeout error if toaster is running. Build is going successfully without toaster.
Seems like it is failing here https://github.com/openembedded/bitbake/blob/master/lib/bb/server/process.py#L403
As a workaround I am able to make it work by increasing the timeout limit to 5 min.
bitbake-cookerdaemon.log

Related

Is there a way to run command lines asynchronously in Azure-DevOps Build Pipeline?

I'm setting up an Azure-DevOps pipeline in which I want to include automated tests via the Newman CLI.
Imagine a pipeline like this.
Build Project
Copy build to test folder
Run the application => (API-Server)
Run Newman
Kill API Server Process
On Success Copy Build to another folder.
My Problem is that my server application is in a waiting state after it's initialization.
The next Task in my build pipeline won't start.
Is there a way to run multiple command lines asynchronously in Azure-DevOps?
Starting the process in via "start" won't work since it throws me an
ERROR: Input redirection is not supported, exiting the process immediately.
start "%TESTDIR%\foo\bar.exe"
timeout 10

TFS - 'Run SSH task' option times out

In TFS, Am using SSH task with 'Commands' option to connect to a remote machine and run a set of few commands. Am using cd to a particular folder and running a shell script using 'sh '
This script usually takes around 2 hours to finish execution. The ssh task timesout after 15 minutes and exits the task. But when I see in the machine manually, the process is running.
Why doesn't the ssh task wait until the script finishes completely
According to your description, you may encountered a time out limitation of SSH task or build definition.
First, please double check the time out setting under control options.
Specifies the maximum time, in minutes, that a task is allowed to
execute before being cancelled by server. A zero value indicates an
infinite timeout.
Another place to check is build job time out, under the settings of your build definition: Option ->Build job timeout in minutes.
Specifies the maximum time a build job is allowed to execute on an
agent before being canceled by the server.
An empty or zero value indicates an infinite timeout.
If both set properly and you still get the time out, please attach more detail related build failed log with Verbose Debug Mode by setting system.debug=true for troubleshooting.

Flink job started from another program on YARN fails with "JobClientActor seems to have died"

I'm new flink user and I have the following problem.
I use flink on YARN cluster to transfer related data extracted from RDBMS to HBase.
I write flink batch application on java with multiple ExecutionEnvironments (one per RDB table to transfer table rows in parrallel) to transfer table by table sequentially (because call of env.execute() is blocking).
I start YARN session like this
export YARN_CONF_DIR=/etc/hadoop/conf
export FLINK_HOME=/opt/flink-1.3.1
export FLINK_CONF_DIR=$FLINK_HOME/conf
$FLINK_HOME/bin/yarn-session.sh -n 1 -s 4 -d -jm 2048 -tm 8096
Then I run my application on YARN session started via shell script transfer.sh. Its content is here
#!/bin/bash
export YARN_CONF_DIR=/etc/hadoop/conf
export FLINK_HOME=/opt/flink-1.3.1
export FLINK_CONF_DIR=$FLINK_HOME/conf
$FLINK_HOME/bin/flink run -p 4 transfer.jar
When I start this script from command line manually it works fine - jobs are submitted to YARN session one by one without errors.
Now I should be able to run this script from another java program.
For this aim I use
Runtime.exec("transfer.sh");
(maybe are there better ways to do this? I have seen at REST API but there are some difficulties because job manager is proxied by YARN).
At the beginning is works as usually - first several jobs are submitted to session and finished successfully. But the following jobs are not submitted to YARN session.
In /opt/flink-1.3.1/log/flink-tsvetkoff-client-hadoop-dev1.log I see error (and no another errors found in DEBUG level)
The program execution failed: JobClientActor seems to have died before the JobExecutionResult could be retrieved.
I have tried to analyse this problem by myself and found out that this error has occurred in JobClient class while sending ping request with timeout to JobClientActor (i.e. YARN cluster).
I tried to increase multiple heartbeat and timeout options like akka.*.timeout, akka.watch.heartbeat.* and yarn.heartbeat-delay options but it doesn't solve the problem - new jobs are not submit to YARN session from CliFrontend.
The environment for both case (manual call and call from another program) is the same. When I call
$ ps axu | grep transfer
it will give me output
/usr/lib/jvm/java-8-oracle/bin/java -Dlog.file=/opt/flink-1.3.1/log/flink-tsvetkoff-client-hadoop-dev1.log -Dlog4j.configuration=file:/opt/flink-1.3.1/conf/log4j-cli.properties -Dlogback.configurationFile=file:/opt/flink-1.3.1/conf/logback.xml -classpath /opt/flink-1.3.1/lib/flink-metrics-graphite-1.3.1.jar:/opt/flink-1.3.1/lib/flink-python_2.11-1.3.1.jar:/opt/flink-1.3.1/lib/flink-shaded-hadoop2-uber-1.3.1.jar:/opt/flink-1.3.1/lib/log4j-1.2.17.jar:/opt/flink-1.3.1/lib/slf4j-log4j12-1.7.7.jar:/opt/flink-1.3.1/lib/flink-dist_2.11-1.3.1.jar:::/etc/hadoop/conf org.apache.flink.client.CliFrontend run -p 4 transfer.jar
I also tried to update flink to 1.4.0 release or change parallelism of job (even to -p 1) but error has still occurred.
I have no idea what could be different? Is any workaround by the way?
Thank you for any help.
Finally I find out how to resolve that error
Just replace Runtime.exec(...) with new ProcessBuilder(...).inheritIO().start().
I really don't know why the call of inheritIO helps in that case because as I understand it just redirects IO streams from child process to parent process.
But I have checked that if I comment out this line of code the program begins to fall again.

Whether Drone support configuring timeout value for build

I setup a local drone server for our CI. And our project is a java project managed by maven. When run the mvn clean install command, maven will download all dependencies into ~/.m2 directory. The first time run this command will download a huge of data from maven remote repository which may take a very long time. In this case, I got below error on drone CI.
ERROR: terminal inactive for 15m0s, build cancelled
I understand that this message means there is no output on the console for 15 minutes. But it is a normal case in my build environment. I wander whether I can configure the 15m to be a larger value so I can build our project.
My previous answer is outdated. You can now change the default timeout for each individual repository from the repository settings screen. This setting is only available to system administrators.
You can increase the terminal inactive timeout by passing DRONE_TIMEOUT=<duration> to each of your agents.
docker run -e DRONE_TIMEOUT=15m drone/drone:0.5 agent
The timeout value may be any valid Go duration string [1].
# 30 minute timeout
DRONE_TIMEOUT=30m
# 1 hour timeout
DRONE_TIMEOUT=1h
# 1 hour, 30 minute timeout
DRONE_TIMEOUT=1h30m
[1] https://golang.org/pkg/time/#ParseDuration
Looking at the drone source code, it looks like they have environment variables DRONE_TIMEOUT and DRONE_TIMEOUT_INACTIVITY that you can use to configure the inactivity timeout. I tried putting it in my .drone.yml file and it didn't seem to do anything, so this may only be available at a higher level.
Here is the reference to the environment variable DRONE_TIMEOUT_INACTIVITY:
https://github.com/drone/drone/blob/17e5eb50363f3fcdc0a0461162bee93041d600b7/drone/exec.go#L62
Here is the reference to the environment variable DRONE_TIMEOUT:
https://github.com/drone/drone/blob/eee4fd1fd2556ac9e4115c746ce785c7364a6f12/drone/agent/agent.go#L95
Here is where the error is thrown:
https://github.com/drone/drone/blob/5abb5de44aa11ea546db1d3846d603eacef7f0d9/agent/agent.go#L206

StatLight hangs when run from TeamCity as single command

I'm running TeamCity 6.5 on a Windows Server, with a couple of build agents on the same server (all running as the system user as services). I had been building SilverLight projects and running the StatLight (v 1.4.4147) tests previously under Jenkins with no problems. On Jenkins, I called the StatLight test in a custom script as follows:
StatLight.exe -x="Tests.xap"
StatLight.exe -x="MoreTests.xap"
StatLight.exe -x="EvenMoreTests.xap"
... etc., but when I migrated my build jobs to TeamCity, I also changed these into a single command line step as follows:
StatLight.exe --teamcity -x="Tests.xap" -x="MoreTests.xap" -x="EvenMoreTests.xap"
This works about 50% of the time, but when it fails, there's no output in the build log to tell me why - I just get:
[11:41:18]: [MyProject\bin\Release\MoreTests.xap] Tests.ExtensionsTests.WatchObservableCollection
[11:41:18]: [MyProject\bin\Release\MoreTests.xap] Tests.SubscribingModelBaseTests.DisposeIsCalled
[11:41:18]: [MyProject\bin\Release\MoreTests.xap] --- Completed Test Run at: 28/09/2011 11:41:18. Total Run Time: 00:00:11.8125000
[11:41:19]: [MyProject\bin\Release\MoreTests.xap] Test run results: Total 6, Successful 6, Failed 0,
[11:41:19]: [Step 5/6] MyProject\bin\Release\EvenMoreTests.xap (9m:42s)
... and then nothing more. The time reported in that last line just goes up and up until I kill the the build job. Adding the --debug switch to StatLight doesn't improve the above output either.
Right now, I've switched the TeamCity build step to call each test individually as I was in Jenkins, but this is more of a workaround than a proper solution. And of course, I may still run into the above problem - I've yet to find out.
What I'd like to know is what steps I can take to debug this issue properly, or whether there are known issues that can cause the above behaviour?
There was one issue fixed in the 1.5 version relating to teamcity. http://statlight.codeplex.com/workitem/13654
I'm not sure it will fix your issue, but would you mind upgrading, trying and reporting back?