Thursday, December 2, 2021

[SOLVED] Signal handlers of C++ application within a docker container are not working

December 02, 2021 c++, docker, linux, signals

Issue

I have the following C++ code:

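#include <signal.h>   // sigaction, siginfo_t, raise
#include <unistd.h>   // _exit
#include <cstdio>     // puts
#include <cstdlib>    // EXIT_FAILURE
#include <cstring>    // memset
#include <iostream>
#include <vector>

using std::cout;
using std::endl;
using std::vector;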

void sig_handler(int signo, siginfo_t *info, void *_ctx) {
      
    cout << "HANDLE AND RESET!!" << endl;

    // try to forward the signal.
    raise(info->si_signo);

    // terminate the process immediately.
    puts("watf? exit");
    _exit(EXIT_FAILURE);
}


void registerSignalHandlers() {
    vector<int> signals = {
      // Signals for which the default action is "Core".
      SIGABRT, // Abort signal from abort(3)
      SIGBUS,  // Bus error (bad memory access)
      SIGFPE,  // Floating point exception
      SIGILL,  // Illegal Instruction
      SIGIOT,  // IOT trap. A synonym for SIGABRT
      SIGQUIT, // Quit from keyboard
      SIGSEGV, // Invalid memory reference
      SIGSYS,  // Bad argument to routine (SVr4)
      SIGTRAP, // Trace/breakpoint trap
      SIGXCPU, // CPU time limit exceeded (4.2BSD)
      SIGXFSZ, // File size limit exceeded (4.2BSD)
      SIGTERM
    };
    
    
    for (size_t i = 0; i < signals.size(); ++i) {
      struct sigaction action;
      memset(&action, 0, sizeof action);
      action.sa_flags = static_cast<int>(SA_SIGINFO | SA_ONSTACK | SA_NODEFER | SA_RESETHAND);
      sigfillset(&action.sa_mask);
      sigdelset(&action.sa_mask, signals[i]);
      action.sa_sigaction = &sig_handler;

      int r = sigaction(signals[i], &action, nullptr);
    }

}



int main(int argc, char *argv[]) {
    
    registerSignalHandlers();

    //rest of the code goes here

    return 0;
}
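One caveat about the handler above, independent of the Docker question: std::cout and puts are not on the POSIX list of async-signal-safe functions, so calling them from a handler for SIGSEGV can itself misbehave. The following is a minimal sketch, not a confirmed fix for the CentOS 7 problem, of a handler that restricts itself to write, raise and _exit. The names safe_write, safe_sig_handler and install_safe_handler are hypothetical and do not come from the original code.

```cpp
#include <signal.h>
#include <unistd.h>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Write a string literal to stderr with write(2), which POSIX lists as
// async-signal-safe. The length comes from the array type, not from strlen.
template <std::size_t N>
void safe_write(const char (&msg)[N]) {
    ssize_t ignored = write(STDERR_FILENO, msg, N - 1);
    (void)ignored;  // nothing useful can be done about a failed write here
}

// Hypothetical handler: log, then re-raise so the default action
// (termination, possibly with a core dump) still happens.
// It relies on SA_RESETHAND | SA_NODEFER being set at install time.
void safe_sig_handler(int signo, siginfo_t* /*info*/, void* /*ctx*/) {
    safe_write("caught fatal signal, re-raising with default action\n");
    raise(signo);
    _exit(EXIT_FAILURE);  // reached only if the re-raised signal did not end the process
}

// Install the handler for one signal; returns true on success.
bool install_safe_handler(int signo) {
    struct sigaction action;
    std::memset(&action, 0, sizeof action);
    action.sa_flags = static_cast<int>(SA_SIGINFO | SA_NODEFER | SA_RESETHAND);
    sigfillset(&action.sa_mask);
    sigdelset(&action.sa_mask, signo);
    action.sa_sigaction = &safe_sig_handler;
    return sigaction(signo, &action, nullptr) == 0;
}
```

Calling install_safe_handler(SIGSEGV) from main would mirror what registerSignalHandlers does for a single signal, while also reporting whether sigaction succeeded.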

I run my application in a Docker container based on the debian:stretch image, with tini as PID 1 (invoked directly, not through bash). When I run the container on my machine (MacBook Pro), everything works. To test the handler, I deliberately cause a segmentation fault (SIGSEGV) in my code. On my machine the handler fires as expected, but when I run the exact same container on the server (CentOS 7), the handler does not run at all.

By "not working at all" I mean that the signal is never received by my application. I tried sending the signal manually from inside the container: with kill -15 PID_OF_APPLICATION the handler runs as it should, but with kill -11 PID_OF_APPLICATION the handler does not run, and I have no idea why.
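One thing that could be checked here (an assumption on my part, not something confirmed in this thread) is whether SIGSEGV is blocked or ignored in the process as it is started on the server, since the signal mask and ignored dispositions are inherited across exec. The sketch below reports how the calling process currently treats a signal. The function name describe_signal_state is hypothetical, and the check order assumes a platform such as Linux where sa_handler and sa_sigaction share storage.

```cpp
#include <signal.h>
#include <string>

// Describe how the calling process currently treats `signo`:
// whether it is blocked, and whether it is ignored, left at the
// default action, or routed to a handler.
std::string describe_signal_state(int signo) {
    sigset_t blocked;
    sigemptyset(&blocked);
    // With a null `set` argument, sigprocmask only reports the current mask.
    if (sigprocmask(SIG_BLOCK, nullptr, &blocked) != 0) return "error";

    struct sigaction current;
    // With a null `act` argument, sigaction only reports the current disposition.
    if (sigaction(signo, nullptr, &current) != 0) return "error";

    std::string state =
        sigismember(&blocked, signo) == 1 ? "blocked, " : "unblocked, ";
    if (current.sa_handler == SIG_IGN) {
        state += "ignored";
    } else if (current.sa_handler == SIG_DFL) {
        state += "default action";
    } else {
        state += "handler installed";
    }
    return state;
}
```

Printing describe_signal_state(SIGSEGV) at the top of main, before and after registerSignalHandlers(), would show whether the process starts with the signal blocked on one host but not the other. The same information is exposed on Linux through the SigBlk, SigIgn and SigCgt fields of /proc/PID/status.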

Using strace, I checked whether the signal is raised by my code, and I could see that SIGSEGV is raised.

I also tried running my application from a wrapper script that traps the signal. The script received the signal, but the handler in my application still was not triggered.

I'm not sure whether I'm missing some Docker configuration (I'm using docker-compose), but I think my setup is correct, since the same docker-compose file starts the container on my machine and everything works there.

Are signals raised by the application inside the container also handled through PID 1 (tini, in my case)?

Any help is highly appreciated.

UPDATE

If I set the entrypoint to sleep infinity, get a shell inside the container with docker exec -it container_id bash, and then start my application manually as a foreground process, the handlers work with no problems.

Solution

I've been there before. It's a real struggle, especially when things work on your machine but not on others.

I'll describe the steps I took for my own problem; maybe they will point you toward a solution.

Things were working perfectly on macOS, so I compared the Docker engine versions on my machine and on the server (CentOS 7), but there was no difference.

Then I ran the same Docker image with docker-compose on Ubuntu instead of CentOS 7 and, to my surprise, it worked there too, so my problem occurred only on CentOS 7. I recommend you do the same: run your Docker image on another OS such as Ubuntu to confirm that the problem is not related to your application itself.

Try to run your app with a cronjob

Yes, try to run it as a cronjob. I'm not sure about the root cause of the problem, but this worked for me. I think it's related to how Docker proxies signals depending on how you run the container. The README linked from the original answer ends with a useful conclusion about how signals behave depending on how you run your container.

Another possible reason is that when the application runs in the background, the signals are somehow not proxied to it, so running it as a cronjob would behave differently from a regular background process.

You can turn this approach into a complete solution that preserves the container's response to your app (including a crash) as follows:

1. Use tini as the entrypoint of your container.
2. Have tini run your script as the CMD.
3. Your script updates the crontab file with your cronjob (the run frequency is up to you).
4. The added cronjob runs your actual run script.
5. Your run script should have a trap function. Why? Once your app crashes, you can send a KILL signal to your entrypoint script (point #2), which will kill the Docker container. This way you keep the behaviour of running your app as the entrypoint.
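The answer describes the run script as a shell script with a trap. To stay in the language of the question, here is a rough C++ analogue of that last step, offered as a sketch under assumptions rather than the answer's actual script: a small launcher starts the app, waits for it, and detects whether it was terminated by a signal. The names ChildOutcome, run_and_wait and notify_entrypoint_on_crash are hypothetical, and how the launcher learns the entrypoint script's PID (the entrypoint_pid parameter) is left open.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cerrno>
#include <string>
#include <vector>

// Outcome of one run: either a normal exit status or a fatal signal.
struct ChildOutcome {
    bool signaled;  // true if the child was terminated by a signal
    int value;      // signal number if signaled, otherwise the exit status
};

// Start `args[0]` with the given arguments and wait until it finishes.
ChildOutcome run_and_wait(const std::vector<std::string>& args) {
    if (args.empty()) return {false, -1};
    std::vector<char*> argv;
    for (const std::string& a : args) argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);

    pid_t pid = fork();
    if (pid < 0) return {false, -1};
    if (pid == 0) {
        execvp(argv[0], argv.data());
        _exit(127);  // exec failed
    }
    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) return {false, -1};
    }
    if (WIFSIGNALED(status)) return {true, WTERMSIG(status)};
    return {false, WEXITSTATUS(status)};
}

// Step 5 of the list: if the app crashed, send KILL to the entrypoint
// script. `entrypoint_pid` is an assumed input, not something tini provides.
bool notify_entrypoint_on_crash(const ChildOutcome& outcome, pid_t entrypoint_pid) {
    if (!outcome.signaled) return false;
    return kill(entrypoint_pid, SIGKILL) == 0;
}
```

A launcher built this way would call run_and_wait with the path of the real application and pass the result to notify_entrypoint_on_crash, so a crash of the app still brings the container down as it would if the app were the entrypoint.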

Hope this is helpful for your case.

Answered By - Mazen Ak

This answer, collected from Stack Overflow and tested by PythonFixing community admins, is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.