I have a program where I ssh into a server and get data. Here is the code... I fork, the child executes the query, and the parent waits for the child for a predetermined amount of time (in the function timeout) and then kills the child. I did that because sometimes, I am not exactly sure why, the ssh connection stalls and does not exit. That is, there is an "ssh -oConnectTimeout=60 blah blah" in the process list for a long time and the timeout function doesn't seem to work. What am I doing wrong here? The last time this problem occurred, there was an ssh in the process list for 5 days; it still did not time out, and the program had stopped because it was waiting for the child. The extra wait() calls are there because previously I was getting a lot of defunct processes, a.k.a. zombies. So I took the easy way out...
c = fork();
if(c==0) {
close(fd[READ]);
if (dup2(fd[WRITE],STDOUT_FILENO) != -1)
execlp("ssh", "ssh -oConnectTimeout=60", serverDetails.c_str(), NULL);
_exit(1);
}else{
if(timeout(c) == 1){
kill(c,SIGTERM);
waitpid(c, &exitStatus, WNOHANG);
wait(&exitStatus);
return 0;
}
wait(&exitStatus);
}
This is the timeout function.
int timeout(int childPID)
{
int times = 0, max_times = 10, status, rc;
while (times < max_times){
sleep(5);
rc = waitpid(childPID, &status, WNOHANG);
if(rc < 0){
perror("waitpid");
exit(1);
}
if(WIFEXITED(status) || WIFSIGNALED(status)){
/* child exits */
break;
}
times++;
}
if (times >= max_times){
return 1;
}
else return 0;
}
SIGTERM only asks the process to terminate politely. If the process is stuck, it may not respond to that, and you'll need to use SIGKILL to kill it, probably after trying SIGTERM and waiting a little while.
The other possibility is that the child is blocked waiting for room in the output pipe to the parent process: maybe there's enough output to fill the pipe buffer, and the child is waiting on that rather than on the network.
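A minimal sketch of that escalation, assuming a grace period counted in whole seconds. The names stop_child and reap_within are made up for illustration and are not part of the question's code. It decides whether the child is gone from waitpid's return value, because the status word is only meaningful when waitpid returns the child's pid; with WNOHANG it returns 0 while the child is still running and leaves status untouched.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Poll once per second for up to `seconds`.
 * Returns 1 if the child was reaped, 0 if it is still running, -1 on error. */
static int reap_within(pid_t pid, int seconds, int *status)
{
    for (int i = 0; i <= seconds; i++) {
        pid_t rc = waitpid(pid, status, WNOHANG);
        if (rc == pid)
            return 1;               /* child exited; *status is valid */
        if (rc < 0)
            return -1;
        if (i < seconds)
            sleep(1);
    }
    return 0;
}

/* Ask politely with SIGTERM, then force with SIGKILL if the child
 * has not gone away after `grace_seconds`. Returns 0 once the child
 * has been reaped, -1 on error. */
int stop_child(pid_t pid, int grace_seconds, int *status)
{
    if (kill(pid, SIGTERM) < 0 && errno != ESRCH)
        return -1;

    int r = reap_within(pid, grace_seconds, status);
    if (r != 0)
        return r < 0 ? -1 : 0;

    kill(pid, SIGKILL);             /* cannot be caught or ignored */

    pid_t rc;
    do {
        rc = waitpid(pid, status, 0);   /* blocking wait is safe now */
    } while (rc < 0 && errno == EINTR);
    return rc == pid ? 0 : -1;
}
```

In the question's parent branch this would take the place of the kill(c,SIGTERM) followed by the two wait calls. It does not address the full-pipe case: for that, the parent also has to keep reading from fd[READ] while it waits.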
I'm currently trying to test auto-generated code for a controller.
The test will be done in CANoe with CAPL.
I've already tried a lot of things and it's working well, but now I want to
test a "message lost" case.
I need something like this:
CAN 1 sends a test message 10 times. 3 of those times, the message is lost.
CAN 2, which receives the signals, has to react to this with a specific value.
I need something like WaitForMessage(int aTimeOut, Message yourMessage), which returns, for example, 0 when the message was received successfully or -1 on timeout.
on timer sendMessage
{
if(anzahlAnBotschaften > 0) // amount of sent Messages
{
if(anzahlAnBotschaften % 3 == 0) // 3 times message lost
{
botschaftWirdGesendet = 0;
lRet = ???? here is the part where i want to wait for a an answer from CAN2
if(lRet != 0)
{
TestStepPass("010.1", "SNA was triggered");
}
else
{
TestStepFail("010.1", "Timeout was triggered, but no SNA was found");
}
}
else
{
botschaftWirdGesendet = 1;
output(sendingCan_BrkSys);
lRet = TestGetWaitEventMsgData(receivingCan_aMessage);
if(lRet == 0)
{
// same for the positive case
}
}
anzahlAnBotschaften -- ;
setTimer(botschaftsAusfall,20);
}
}
What's the problem? Just use the CAPL function testWaitForMessage as described in the help.
You are using a test node, since there are TestStepFail/TestStepPass calls in your code, so everything you need to control your test sequence begins with test...
P.S. Something else: I doubt that with this code you can detect what is described in the comment
if(anzahlAnBotschaften % 3 == 0) // 3 times message lost
anzahlAnBotschaften is German for "number of messages", i.e. the count of received messages. So when, as described above, you receive 7 of 10 messages (anzahlAnBotschaften == 7), this condition is false.
I have looked all over and I cannot seem to figure out how to do this.
I have a parent process that has created a pipe()
Now, I want to fork() the parent and then execlp() and pass the pipe() to the new program as a command line argument.
Then from inside the new program I need to be able to read the pipefd.
I've seen a bunch of stuff on how to do it from inside the same process, but nothing on how to do it like this.
Edit: Initial post is/was rather vague.
What I have so far is:
int pfd[2];
if(pipe(pfd) == -1) {
perror("Creating pipe\n");
exit(1);
}
pid_t pid = fork();
if(pid == -1) {
fprintf (stderr, "Initiator Error Message : fork failed\n");
return -1;
}
else if(pid == 0) { // child process
close(pfd[1]); // close(write);
execlp("program", "program", pfd[0], NULL);
}
but then I don't really understand what I should do from inside "program" to get the FD. I tried assigning it to all sorts of things, but they all seem to error.
Thank you in advance!
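A forked and exec'ed child automatically inherits the open pipe descriptors, and the read end of the pipe is usually made the child's standard input, so a command-line argument to locate the pipe is largely redundant:
int pipefd[2];
if (pipe(pipefd) == 0)
    switch (fork()) {
    case 0:  /* child: the read end becomes standard input */
        close(pipefd[1]);
        if (dup2(pipefd[0], STDIN_FILENO) != -1)
            execlp("cat", "cat", "-n", "/dev/fd/0", (char *)NULL);
        _exit(127);
    case -1: perror("fork"); break;
    default: /* parent: write, then close so the child sees EOF */
        write(pipefd[1], "OK\n", 3);
        close(pipefd[1]);
    }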
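If the descriptor number does need to go on the command line, as the question asks, the point to notice is that execlp takes strings, so the int has to be formatted into text by the parent and parsed back by the new program. The sketch below assumes the new program reads the number from argv[1]. The helper names spawn_reader and parse_fd_arg are made up for illustration.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Parent side: fork and exec `prog`, passing the pipe's read end
 * as the decimal string argv[1]. Returns the child's pid, or -1. */
pid_t spawn_reader(const char *prog, int pfd[2])
{
    char fdarg[16];
    snprintf(fdarg, sizeof fdarg, "%d", pfd[0]);

    pid_t pid = fork();
    if (pid == 0) {
        close(pfd[1]);                       /* child only reads */
        execlp(prog, prog, fdarg, (char *)NULL);
        _exit(127);                          /* exec failed */
    }
    return pid;
}

/* Child side: turn argv[1] back into a descriptor number.
 * Returns -1 if the text is not a non-negative integer. */
int parse_fd_arg(const char *arg)
{
    char *end;
    errno = 0;
    long v = strtol(arg, &end, 10);
    if (errno != 0 || end == arg || *end != '\0' || v < 0 || v > INT_MAX)
        return -1;
    return (int)v;
}
```

Inside "program", something like int fd = parse_fd_arg(argv[1]); followed by read(fd, buf, sizeof buf) would then work, because descriptors created by pipe() stay open across exec unless FD_CLOEXEC has been set on them.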
I'm building a DNS client. A child process handles the request through a UDP socket, while the parent handles the reply. I need the parent to know how many bytes were sent, in order to print the URLs. I tried the following approach with pipe():
childPID = fork();
pipe(fd);
if(childPID == 0){
close(fd[0]);
sent_bytes = sendDNS(sock_udp, &serverAddr, argv[2]);
memcpy(in_buf, &sent_bytes, sizeof(sent_bytes));
write(fd[1], in_buf, sizeof(sent_bytes));
exit(0);
}
else{
close(fd[1]);
int inBytes = -1;
struct sockaddr reply_addr;
n = sizeof(reply_addr);
while(inBytes < 0){
inBytes = recvfrom(sock_udp, buffer, DNS_MAX_RESPONSE, 0, &reply_addr, (socklen_t*)&n);
read(fd[0], out_buf, sizeof(sent_bytes));
memcpy(pipe_msg, out_buf, sizeof(sent_bytes));
printDNSmsg((struct dnsReply*)buffer);
}
}
But GDB shows a SIGPIPE received on the child process. What am I missing?
How would you print a DNS reply (variable length buffer)?
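You need to call pipe() before fork(), of course. Because pipe() is called after fork() here, each process creates its own private pipe; the child then closes the only read end of its pipe and writes to it, which raises SIGPIPE. But you're not actually using the information anywhere. Why do you care how many bytes were sent, as long as you got a reply? And why would you do a UDP send in a separate thread, let alone a separate process? It all seems completely pointless.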
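For reference, a minimal sketch of the corrected ordering, with the pipe created before the fork so that both processes share it. The function name report_from_child and its value parameter are placeholders for illustration: in the question, the value would be the return of sendDNS.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that reports one ssize_t back to the parent.
 * Returns 0 and stores the value in *out on success, -1 on failure. */
int report_from_child(ssize_t value, ssize_t *out)
{
    int fd[2];
    if (pipe(fd) == -1)             /* create the pipe BEFORE fork() */
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    if (pid == 0) {                 /* child: writer */
        close(fd[0]);
        ssize_t sent = value;       /* placeholder for the real work */
        ssize_t w = write(fd[1], &sent, sizeof sent);
        _exit(w == (ssize_t)sizeof sent ? 0 : 1);
    }

    close(fd[1]);                   /* parent: reader */
    ssize_t n = read(fd[0], out, sizeof *out);
    close(fd[0]);
    waitpid(pid, NULL, 0);          /* reap the child, no zombie */
    return n == (ssize_t)sizeof *out ? 0 : -1;
}
```

Writing the integer directly also removes the need for the intermediate memcpy into in_buf and out_buf.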
I'm trying to share an unnamed Mach semaphore between two processes.
I can create one and wait on it in the same process.
semaphore_t semaphore = 0;
mach_error_t err = semaphore_create(mach_task_self(), &semaphore, SYNC_POLICY_FIFO, 0);
...
semaphore_wait(semaphore);
But I want to send it to another process (of which I only have the mach_port_t) and then let it semaphore_signal my own process.
I already tried things like:
mach_port_allocate(target, MACH_PORT_RIGHT_RECEIVE, targetSemaphore)
mach_port_insert_right(target, targetSemaphore, semaphore, MACH_MSG_TYPE_COPY_SEND)
Which yields an error because the port name already exists in the target process, or an "unknown failure" if I don't allocate it in the target process.
And even:
mach_msg_send
mach_msg_receive
But I can't even get a port right from one process to another to send anything.
What am I doing wrong and is it even possible?
I figured it out:
mach_port_extract_right
is the correct call, instead of:
mach_port_insert_right
Then the following does the job:
semaphore_t semaphore = 0;
mach_error_t err = semaphore_create(mach_task_self(), &semaphore, SYNC_POLICY_FIFO, 0);
err = mach_port_allocate(target, MACH_PORT_RIGHT_RECEIVE, &receivePort);
mach_msg_type_name_t type;
semaphore_t sendPort = 0;
err = mach_port_extract_right(target, receivePort, MACH_MSG_TYPE_MAKE_SEND, &sendPort, &type);
//Send semaphore using port
mach_msg_send(&msg.header);
This is a follow-up to this previous question of mine, for which the conclusion was that the program was erroneous, and therefore the expected behavior was undefined.
What I'm trying to create here is a simple error-handling mechanism, for which I use that Irecv request for the empty message as an "abort handle", attaching it to my normal MPI_Wait call (and turning it into MPI_Waitany), in order to allow me to unblock process 1 in case an error occurs on process 0 and it can no longer reach the point where it's supposed to post the matching MPI_Recv.
What's happening is that, due to internal message buffering, the MPI_Isend may succeed right away, without the other process being able to post the matching MPI_Recv. So there's no way of canceling it anymore.
I was hoping that once all processes call MPI_Comm_free I can just forget about that message once and for all, but, as it turns out, that's not the case. Instead, it's being delivered to the MPI_Recv in the following communicator.
So my questions are:
Is this also an erroneous program, or is it a bug in the MPI implementation (Intel MPI 4.0.3)?
If I turn my MPI_Isend calls into MPI_Issend, the program works as expected - can I at least in that case rest assured that the program is correct?
Am I reinventing the wheel here? Is there a simpler way to achieve this?
Again, any feedback is much appreciated!
#include "stdio.h"
#include "unistd.h"
#include "mpi.h"
#include "time.h"
#include "stdlib.h"
int main(int argc, char* argv[]) {
int rank, size;
MPI_Group group;
MPI_Comm my_comm;
srand(time(NULL));
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_group(MPI_COMM_WORLD, &group);
MPI_Comm_create(MPI_COMM_WORLD, group, &my_comm);
if (rank == 0) printf("created communicator %d\n", my_comm);
if (rank == 1) {
MPI_Request req[2];
int msg = 123, which;
MPI_Isend(&msg, 1, MPI_INT, 0, 0, my_comm, &req[0]);
MPI_Irecv(NULL, 0, MPI_INT, 0, 0, my_comm, &req[1]);
MPI_Waitany(2, req, &which, MPI_STATUS_IGNORE);
MPI_Barrier(my_comm);
if (which == 0) {
printf("rank 1: send succeed; cancelling abort handle\n");
MPI_Cancel(&req[1]);
MPI_Wait(&req[1], MPI_STATUS_IGNORE);
} else {
printf("rank 1: send aborted; cancelling send request\n");
MPI_Cancel(&req[0]);
MPI_Wait(&req[0], MPI_STATUS_IGNORE);
}
} else {
MPI_Request req;
int msg, r = rand() % 2;
if (r) {
printf("rank 0: receiving message\n");
MPI_Recv(&msg, 1, MPI_INT, 1, 0, my_comm, MPI_STATUS_IGNORE);
} else {
printf("rank 0: sending abort message\n");
MPI_Isend(NULL, 0, MPI_INT, 1, 0, my_comm, &req);
}
MPI_Barrier(my_comm);
if (!r) {
MPI_Cancel(&req);
MPI_Wait(&req, MPI_STATUS_IGNORE);
}
}
if (rank == 0) printf("freeing communicator %d\n", my_comm);
MPI_Comm_free(&my_comm);
sleep(2);
MPI_Comm_create(MPI_COMM_WORLD, group, &my_comm);
if (rank == 0) printf("created communicator %d\n", my_comm);
if (rank == 0) {
MPI_Request req;
MPI_Status status;
int msg, cancelled;
MPI_Irecv(&msg, 1, MPI_INT, 1, 0, my_comm, &req);
sleep(1);
MPI_Cancel(&req);
MPI_Wait(&req, &status);
MPI_Test_cancelled(&status, &cancelled);
if (cancelled) {
printf("rank 0: receive cancelled\n");
} else {
printf("rank 0: STRAY MESSAGE RECEIVED!!!\n");
}
}
if (rank == 0) printf("freeing communicator %d\n", my_comm);
MPI_Comm_free(&my_comm);
MPI_Finalize();
return 0;
}
outputs:
created communicator -2080374784
rank 0: sending abort message
rank 1: send succeed; cancelling abort handle
freeing communicator -2080374784
created communicator -2080374784
rank 0: STRAY MESSAGE RECEIVED!!!
freeing communicator -2080374784
As mentioned in one of the above comments by @kraffenetti, this is an erroneous program because the sent messages are not being matched by receives. Even though the messages are cancelled, they still need a matching receive on the remote side, because the cancel may not succeed for messages that were already sent before the cancel could complete (which is the case here).
This question started a discussion on a ticket for MPICH, which you can find here and which has more details.
I tried to build your code using Open MPI and it did not work. mpicc complained about status.cancelled:
error: ‘MPI_Status’ has no member named ‘cancelled’
I suppose this is a feature of Intel MPI. What happens if you switch to:
...
int flag;
MPI_Test_cancelled(&status, &flag);
if (flag) {
...
This gives the expected output using Open MPI (and it makes your code less dependent on one implementation). Is that also the case using Intel MPI?
We need an expert to tell us what status.cancelled is in Intel MPI, because I don't know anything about it!
Edit: I tested my answer many times and found that the output was random, sometimes correct, sometimes not. Sorry for that... It is as if something in status was not set. Part of the answer may be in the documentation of MPI_Wait(), http://www.mpich.org/static/docs/v3.1/www3/MPI_Wait.html :
"The MPI_ERROR field of the status return is only set if the return from the MPI routine is MPI_ERR_IN_STATUS. That error class is only returned by the routines that take an array of status arguments (MPI_Testall, MPI_Testsome, MPI_Waitall, and MPI_Waitsome). In all other cases, the value of the MPI_ERROR field in the status is unchanged. See section 3.2.5 in the MPI-1.1 specification for the exact text." If MPI_Test_cancelled() makes use of MPI_ERROR, things might go wrong.
So here is the trick: use MPI_Waitall(1, &req, &status)! With that, the output is correct at last!