Issue
I have the following C++ code:
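#include <signal.h>
#include <unistd.h>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <vector>
using namespace std;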
void sig_handler(int signo, siginfo_t *info, void *_ctx) {
cout << "HANDLE AND RESET!!" << endl;
// try to forward the signal.
raise(info->si_signo);
// terminate the process immediately.
puts("watf? exit");
_exit(EXIT_FAILURE);
}
void registerSignalHandlers() {
vector<int> signals = {
// Signals for which the default action is "Core".
SIGABRT, // Abort signal from abort(3)
SIGBUS, // Bus error (bad memory access)
SIGFPE, // Floating point exception
SIGILL, // Illegal Instruction
SIGIOT, // IOT trap. A synonym for SIGABRT
SIGQUIT, // Quit from keyboard
SIGSEGV, // Invalid memory reference
SIGSYS, // Bad argument to routine (SVr4)
SIGTRAP, // Trace/breakpoint trap
SIGXCPU, // CPU time limit exceeded (4.2BSD)
SIGXFSZ, // File size limit exceeded (4.2BSD)
SIGTERM
};
for (size_t i = 0; i < signals.size(); ++i) {
struct sigaction action;
memset(&action, 0, sizeof action);
action.sa_flags = static_cast<int>(SA_SIGINFO | SA_ONSTACK | SA_NODEFER | SA_RESETHAND);
sigfillset(&action.sa_mask);
sigdelset(&action.sa_mask, signals[i]);
action.sa_sigaction = &sig_handler;
int r = sigaction(signals[i], &action, nullptr);
}
}
int main(int argc, char *argv[]) {
registerSignalHandlers();
//rest of the code goes here
return 0;
}
I run my application within a Docker container based on the debian:stretch image, with tini as PID 1 (used directly, not through bash). When I run the container on my device (MacBook Pro), everything works fine. To test the handler, I deliberately cause a segmentation fault (SIGSEGV) in my code. On my device this works perfectly, but when I run the exact same container on the server (CentOS 7), the handler does not work at all.
By "not working at all" I mean that the signal is never received by my application. I tried sending the signal manually from inside the container: with kill -15 PID_OF_APPLICATION the handler worked just as it should, but with kill -11 PID_OF_APPLICATION the handler does not run, and I have no idea why!
I used strace to check whether the signal is being raised by my code, and I could see that SIGSEGV is raised.
I also tried running my application from a script that uses trap to catch the signal it receives. The signal was received in the script, but the handler was still not triggered.
I'm not sure whether I'm missing something in the configuration of the Docker container (I'm using docker-compose), but I think I'm doing everything correctly, since the same docker-compose file starts the container on my device and it works with no issues.
Are signals raised by the application within the container also handled through PID 1 (tini, in my case)?
Any help is highly appreciated.
UPDATE
If I set the entrypoint to sleep infinity, get inside the container with docker exec -it container_id bash, and then start my application manually as a foreground process, the handlers work with no problems.
Solution
I've been there before. It's a real struggle, especially when things work on your device but not on others.
I'll describe the steps I took for my own problem; maybe they will point you towards a solution.
Things were working perfectly on macOS, so I compared the Docker engine versions between my device and the server, which was CentOS 7, but there was no difference!
Then I ran the same Docker image using docker-compose on Ubuntu instead of CentOS 7, and here was the surprise: it worked there too, so my problem was only on CentOS 7. I would recommend you do the same: run your Docker image on another OS such as Ubuntu, just to make sure the problem is not related to your actual application.
Try to run your app with a cronjob
Yes, try to run it as a cronjob. I'm not sure about the root cause of the problem, but this worked for me. I think it's related to how Docker proxies signals when you run the container. Here you will find, at the end of the README file, a useful conclusion about signals depending on how you run your container.
Another possible reason is that when the application is running in the background, the signals are somehow not proxied to it, so running it as a cronjob would behave differently from a regular background process.
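One thing that may be worth ruling out while investigating, although I can't say it is the cause here: the set of blocked signals is inherited from the parent process across fork and exec, and so is an "ignore" disposition, so how the process is launched can affect what it receives. The sketch below queries both for the current process. The helper names isSignalBlocked and isSignalIgnored are hypothetical, and it assumes a single-threaded POSIX program (a multi-threaded one would use pthread_sigmask).

```cpp
#include <signal.h>

// Hypothetical helper: true if `signo` is in the calling process's blocked mask.
bool isSignalBlocked(int signo) {
    sigset_t current;
    sigemptyset(&current);
    // With a null `set` argument, sigprocmask only reads the current mask.
    if (sigprocmask(SIG_BLOCK, nullptr, &current) != 0) {
        return false;
    }
    return sigismember(&current, signo) == 1;
}

// Hypothetical helper: true if the current disposition of `signo` is SIG_IGN.
bool isSignalIgnored(int signo) {
    struct sigaction current;
    // With a null `act` argument, sigaction only reads the current disposition.
    if (sigaction(signo, nullptr, &current) != 0) {
        return false;
    }
    return current.sa_handler == SIG_IGN;
}
```

Logging isSignalBlocked(SIGSEGV) and isSignalIgnored(SIGSEGV) at the very start of main, before registering any handlers, in both the working and the failing environment would show whether the process starts out with different signal state in each.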
You can turn this approach into a full solution that keeps the Docker container responsive to your app (including a crash) as follows:
1. Use tini as the entrypoint of your container.
2. Make tini run your script as CMD.
3. Your script updates the crontab file with your cronjob (it's up to you to define how often it runs).
4. The added cronjob runs your actual run script.
5. Your run script should have a trap function. Why? Once your app has crashed, you can send a KILL signal to your entrypoint script (point #2), which will kill the Docker container. This way you keep the behaviour of running your app as an entrypoint.
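In my setup the run script was a shell script using trap. Purely as an illustration of the same idea in C++ (a swapped-in technique, not what I actually ran), the sketch below starts the app as a child process and reports whether it died from a signal. The names ChildResult, spawnApp and waitForChild and the exit code 127 for a failed exec are assumptions made for this example.

```cpp
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical result type for this sketch.
struct ChildResult {
    bool killedBySignal;  // true if the child was terminated by a signal
    int code;             // signal number if killedBySignal, else exit status
};

// Starts `path` as a child process; returns the child's PID, or -1 on error.
pid_t spawnApp(const char* path, char* const argv[]) {
    pid_t pid = fork();
    if (pid == 0) {
        execv(path, argv);
        _exit(127);  // only reached if execv failed
    }
    return pid;
}

// Waits for the child and classifies how it ended.
ChildResult waitForChild(pid_t pid) {
    int status = 0;
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR) {
            return {false, -1};  // waitpid itself failed
        }
    }
    if (WIFSIGNALED(status)) {
        return {true, WTERMSIG(status)};
    }
    return {false, WEXITSTATUS(status)};
}
```

A wrapper built on this could call spawnApp with the app's path, then waitForChild, and if killedBySignal is true (for example with code equal to SIGSEGV) use kill(2) to signal the entrypoint script's PID, which corresponds to the last step above.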
I hope this is helpful for your case.
Answered By - Mazen Ak