SSH command step not working for one command - in Jmeter - ssh

I have a unique problem using jmeter SSH command.
I use this step to run spark jobs.
the problem is that one of the commands not working, to clarify it connects and not get response and just wait and wait for hours, and nothing displayed on screen.
I know how to work with the tool, and this behavior is special for this script alone.
All other script worked, I duplicate one that worked for example
sudo /run_stg.sh this command worked
sudo /run_off2-stg.sh this command not worked
if I run the job manually via jenkins it worked
if I entered to command line and use plik ssh it worked,
the problem is just Jmeter, that is waiting and waiting and I can not understand for what?
the job is about 3 minutes, and I wait for response in Jmeter for 4 hours and nothing Jmeter just waiting.
in the console log I set to trace level and nothing, absolutely no idea how to start handle this issue in Jmeter.
an anyone please assists how to make Jmeter to write what happened?
or just to know if he connect or anything
since this behavior all the test can not be performed

Most probably you are as usual misconfiguring the SSH Command sampler.
The idea is not to run the script per se, you need to delegate the script execution to the Unix Shell, for example Bash this way you will be able to combine several commands together, see the output, amend debugging level, etc.
So I would recommend setting your command to something like /bin/bash -c -x /your/script.sh
Another guess, given you use sudo it might be the case that the sudo command simply waits for the password (which JMeter never provides), if this is the case try amending your script permissions using chmod command and allowing your user its execution without root privileges.
And finally, given you're able to run your command using "plik ssh" (whatever it is) you can run it using OS Process Sampler
More information: How to Run External Commands and Programs Locally and Remotely from JMeter

Related

Execute a shell command outside of a sandbox while in a sandbox

I'm using singularity to run python in an environnement deprived of python. I'm also running a mysql instance as explained by the IOWA state university (running an instance of mysql, and closing it when done).
For clarity, I'm using a bash script to open mysql, then do what i have to do (a python script) and close mysql, and it works fine. But Python's only way to stop if an error occured is sys.exit([value]) and this not only stops the python script, but also the bash script that ran it. This makes it impossible for me to manage the errors and close the instance of mysql if the python script exits.
My question is : Is there a way for me to execute a 'singularity instance stop mysql' while being in the python sandbox. Something to tell singularity "hey, this command here must be used on the host !" ?
I keep searching but can't find anything.
I only tried to execute it with subprocess like any other command, but it returned an error message because I don't have this instance inside the python sandbox. I don't even have singularity in this sandbox.
For any clarifications, just ask me, I'm trying to be clear but I'm pretty sure it's not very clear.
Thanks a lot !
Generally speaking, it would be a big security issue if a process could be initiated from inside a container (docker or singularity) but run in the host OS's namespace.
If the bash script is exiting on the python failure, it sounds like you're using set -e or #!/bin/bash -e. This causes the script to abort if any command returns non-zero. It's commonly recommended for safer processing, but can cause problems like this at times. To bypass that for the python step you can modify your script:
# start mysql, do some stuff
set +x # disable abort on non-zero return
python my_script.py
set -x # re-enable abort on non-zero
# shut down mysql, do other stuff

Flink job started from another program on YARN fails with "JobClientActor seems to have died"

I'm new flink user and I have the following problem.
I use flink on YARN cluster to transfer related data extracted from RDBMS to HBase.
I write flink batch application on java with multiple ExecutionEnvironments (one per RDB table to transfer table rows in parrallel) to transfer table by table sequentially (because call of env.execute() is blocking).
I start YARN session like this
export YARN_CONF_DIR=/etc/hadoop/conf
export FLINK_HOME=/opt/flink-1.3.1
export FLINK_CONF_DIR=$FLINK_HOME/conf
$FLINK_HOME/bin/yarn-session.sh -n 1 -s 4 -d -jm 2048 -tm 8096
Then I run my application on YARN session started via shell script transfer.sh. Its content is here
#!/bin/bash
export YARN_CONF_DIR=/etc/hadoop/conf
export FLINK_HOME=/opt/flink-1.3.1
export FLINK_CONF_DIR=$FLINK_HOME/conf
$FLINK_HOME/bin/flink run -p 4 transfer.jar
When I start this script from command line manually it works fine - jobs are submitted to YARN session one by one without errors.
Now I should be able to run this script from another java program.
For this aim I use
Runtime.exec("transfer.sh");
(maybe are there better ways to do this? I have seen at REST API but there are some difficulties because job manager is proxied by YARN).
At the beginning is works as usually - first several jobs are submitted to session and finished successfully. But the following jobs are not submitted to YARN session.
In /opt/flink-1.3.1/log/flink-tsvetkoff-client-hadoop-dev1.log I see error (and no another errors found in DEBUG level)
The program execution failed: JobClientActor seems to have died before the JobExecutionResult could be retrieved.
I have tried to analyse this problem by myself and found out that this error has occurred in JobClient class while sending ping request with timeout to JobClientActor (i.e. YARN cluster).
I tried to increase multiple heartbeat and timeout options like akka.*.timeout, akka.watch.heartbeat.* and yarn.heartbeat-delay options but it doesn't solve the problem - new jobs are not submit to YARN session from CliFrontend.
The environment for both case (manual call and call from another program) is the same. When I call
$ ps axu | grep transfer
it will give me output
/usr/lib/jvm/java-8-oracle/bin/java -Dlog.file=/opt/flink-1.3.1/log/flink-tsvetkoff-client-hadoop-dev1.log -Dlog4j.configuration=file:/opt/flink-1.3.1/conf/log4j-cli.properties -Dlogback.configurationFile=file:/opt/flink-1.3.1/conf/logback.xml -classpath /opt/flink-1.3.1/lib/flink-metrics-graphite-1.3.1.jar:/opt/flink-1.3.1/lib/flink-python_2.11-1.3.1.jar:/opt/flink-1.3.1/lib/flink-shaded-hadoop2-uber-1.3.1.jar:/opt/flink-1.3.1/lib/log4j-1.2.17.jar:/opt/flink-1.3.1/lib/slf4j-log4j12-1.7.7.jar:/opt/flink-1.3.1/lib/flink-dist_2.11-1.3.1.jar:::/etc/hadoop/conf org.apache.flink.client.CliFrontend run -p 4 transfer.jar
I also tried to update flink to 1.4.0 release or change parallelism of job (even to -p 1) but error has still occurred.
I have no idea what could be different? Is any workaround by the way?
Thank you for any help.
Finally I find out how to resolve that error
Just replace Runtime.exec(...) with new ProcessBuilder(...).inheritIO().start().
I really don't know why the call of inheritIO helps in that case because as I understand it just redirects IO streams from child process to parent process.
But I have checked that if I comment out this line of code the program begins to fall again.

Jenkins SSH remote process is getting killed as soon as the Jenkins SSH plugin returns back

Jenkins version: 1.574
I created a simple job which performs the following:
Using "Execute shell script on remote host using SSH" as one of the BUILD steps, I'm just calling a shell script. This shell script performs stop and start operations on Tomcat to restart an application on the target machine.
I have a valid username, password, port defined for the target SSH server in Jenkins Global settings.
I saw this behavior that when I run a Jenkins job and call the restart script (which gets the application name as parameter $1), it works fine, but as soon as "Execute shell script on remote host using SSH" step completes, I see the new process dies on the remote/target application server.
If I run the script from the target/remote server itself, everything works fine and the new process/PID remains live forever, but running the same script from Jenkins, though I don't see any errors and everything works as expected, the new process dies as soon as the above mentioned SSH step is complete and control comes back to the next BUILD step in Jenkins job OR the Jenkins job is complete.
I saw a few posts/blogs and tried setting: BUILD_ID=dontKillMe in the Jenkins job (in various places i.e. Prepare Environment variables and also using Inject Environment variables...). When the job's particular build# is complete, I can see Environment Variables for that build# does say BUILD_ID=dontKillMe as its value (instead of the default Timestamp tag value).
I tried putting nohup before calling the restart script, i.e.,
nohup restart_tomcat.sh "${app}"
I also tried:
BUILD_ID=dontKillMe nohup restart_tomcat.sh "${app}"
This doesn't give any error and creates a nohup.out file on the remote server (but I'm not worried about it as the restart_tomcat.sh script itself creates its own LOG file which I'm "cat"ing after the restart_tomcat.sh script is complete. cat'ing on the log file is performed using another "Execute shell script on remote host using SSH" build step, and it successfully shows the log file created by the restart script).
I don't know what I'm missing at this point, but as soon as the restart_tomcat.sh step is complete, the new PID/process on the remote/target server dies.
How can I fix this?
I've been through this myself.
On my first iteration, before I knew about Jenkins ProcessTreeKiller, I ended up just daemonizing Tomcat. The Apache Tomcat documentation includes a section on running as a daemon.
You can also try disabling the ProcessTreeKiller for your whole Jenkins instance, if it's relatively small (read the first link for information).
The BUILD_ID=dontKillMe should be passed to the shell, and therefore it should be in your command line, not in Jenkins global configuration or job parameters.
BUILD_ID=dontKillMe restart_tomcat.sh "${app}" should have worked without problems.
You can also try nohup restart_tomcat.sh "${app}" & with the & at the end.
My solution (it worked after trying everything else) in Ubuntu 14.04 (Trusty Tahr) (Amazon AWS - Amazon EC2), Jenkins 1.601:
Exec command: (setsid COMMAND < /dev/null > /dev/null 2>&1 &);
Exec in PTY: DISABLED
// Example COMMAND=socat TCP4-LISTEN:1337,fork TCP4:127.0.0.1:1338
I created this Transfer as my last one.
#!/bin/ksh
export BUILD_ID=dontKillMe
I added the above line to the start of my script and the issue was resolved.

Start program via ssh in Jenkins and using it in Jenkins build

Hello people.
I'm using Jenkins as CI server and I need to run some performance test using Jmeter. I've setup the plugin and configured my workspace and everything works ok, but I have to do some steps manually and I want a bit more of "automation".
Currently i have some small programs in a remote server. These programs make some specific validations, for instance (just to explain): validates e-mail addresses, phone numbers, etc.
So, before I run the build in jenkins, I have to manually start the program (file.sh) I want:
I have to use putty (or any othe ssh client) to conect to the server and then run, for instance, the command
./email_validation.sh
And the Jmeter test runs in a correct way, and when the test is done I have to manually "shut down" the program I started. But what I want is trying to start the program I need in Jenkins configuration (not manually outside Jenkins, but in "execute shell" or "execute remote shell using ssh" build step).
I have tried to start it, but it get stuck, because when Jenkins build finds the command
./email_validation.sh
the build stops, it waits for the command to finish and then it will continue the other build steps, but obviously, I need this step not to finish until the test is executed.
Is there a way to achieve this? Thanks
Run your command as a background process by adding the & symbol at the end of the command and use the nohup command in case the parent process gets a hangup signal, e.g.
nohup /path/to/email_validation.sh &
If the script produces any output, it will go by default to the file nohup.out in the current directory when the script was launched.
You can kill the process at the end of the build by running:
pkill email_validation.sh

Can't write in tcl command line unless sending echo, after using PLINK

I'm having a confusing problem running a variety of programs from a tcl script. Here's the story: I have a script (in tcl) which executes plink to establish a remote connection on a Linux computer. I basically use eval for calling plink, sending as parameters some ssh commands and info, and also a bash file to be executed on the Linux computer.
So far, that works fine, or at least it does what I intend it to do. The issue here is that after calling this procedure, my prompt stops working the normal way. I can type, but it doesn't appear on screen unless the command I send is "echo" (without ""). If so, I get the "ECHO is on" message and the prompt continues to work normally.
Does anyone have any idea why this could be happening? I thought about just patch it and add the "echo" command inside my script, but it says that it's an invalid command in this case...
Well, thanks for the help!