Issue
I'm currently working on a project where a parent process sets up a socketpair, forks, and then uses this socketpair to communicate with the child. If the child wants to open a file (or any other file-descriptor-based resource), it should always go to the parent, request the resource, and get the fd sent via the socketpair. Furthermore, I want to prevent the child from opening any file descriptor by itself.
I stumbled over setrlimit, which successfully prevents the child from opening new file descriptors, but it also seems to invalidate any file descriptors sent over the initial socket connection. Is there any method on Linux that allows a single process to open any file, send its file descriptor to other processes, and let them use it, without allowing these other processes to open any file descriptor by themselves?
For my use case that can be any kernel configuration, system call, etc. as long as it can be applied after fork and as long as it applies to all file descriptors (not just files but also sockets, socketpairs, etc.).
Solution
What you have here is exactly the use case of seccomp.
Using seccomp, you can filter syscalls in different ways. What you want to do in this situation is to install a seccomp filter, right after fork(), that disallows the use of open(2), openat(2), socket(2) (and more).
To accomplish this, you can do the following:
- First, create a seccomp context using seccomp_init(3) with the default behavior of SCMP_ACT_ALLOW.
- Then add a rule to the context using seccomp_rule_add(3) for each syscall that you want to deny. You can use SCMP_ACT_KILL to kill the process if the syscall is attempted, SCMP_ACT_ERRNO(val) to make the syscall fail returning the specified errno value, or any other action value defined in the manual page.
- Load the context using seccomp_load(3) to make it effective.
IMPORTANT!
Note that a blacklist approach like this one is in general weaker than a whitelist approach. It allows any syscall that is not explicitly disallowed, which could result in a bypass of the filter. If you believe that the child process you want to execute could be maliciously trying to avoid the filter, or if you already know which syscalls the children will need, a whitelist approach is better, and you should do the opposite of the above: create a filter with the default action of SCMP_ACT_KILL and allow the needed syscalls with SCMP_ACT_ALLOW. In terms of code the difference is minimal (the whitelist is probably longer, but the steps are the same).
Here's an example of the blacklist approach outlined in the steps above (I'm doing exit(-1) in case of error just for simplicity's sake):
#include <stdlib.h>
#include <seccomp.h>

static void secure(void) {
    int err;
    scmp_filter_ctx ctx;

    int blacklist[] = {
        SCMP_SYS(open),
        SCMP_SYS(openat),
        SCMP_SYS(creat),
        SCMP_SYS(socket),
        SCMP_SYS(open_by_handle_at),
        // ... possibly more ...
    };

    // Create a new seccomp context, allowing every syscall by default.
    ctx = seccomp_init(SCMP_ACT_ALLOW);
    if (ctx == NULL)
        exit(-1);

    /* Now add a filter for each syscall that you want to disallow.
       In this case, we'll use SCMP_ACT_KILL to kill the process if it
       attempts to execute the specified syscall. */
    for (unsigned i = 0; i < sizeof(blacklist) / sizeof(blacklist[0]); i++) {
        err = seccomp_rule_add(ctx, SCMP_ACT_KILL, blacklist[i], 0);
        if (err)
            exit(-1);
    }

    // Load the context making it effective.
    err = seccomp_load(ctx);
    if (err)
        exit(-1);

    seccomp_release(ctx);
}
Now, in your program, you can call the above function to apply the seccomp filter right after the fork(), like this:
child_pid = fork();
if (child_pid == -1)
    exit(-1);

if (child_pid == 0) {
    secure();
    // Child code here...
    exit(0);
} else {
    // Parent code here...
}
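For completeness, the descriptor hand-off mentioned in the question is usually done with an SCM_RIGHTS control message on the Unix socket. The following is a minimal sketch of that mechanism; send_fd and recv_fd are hypothetical helper names, not part of any library, and error handling is reduced to returning -1.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send the descriptor fd over the Unix-domain socket sock.
   Returns 0 on success, -1 on failure. */
static int send_fd(int sock, int fd) {
    assert(sock >= 0 && fd >= 0);
    char dummy = 'F'; /* one byte of ordinary data accompanies the fd */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union { /* the union keeps the control buffer suitably aligned */
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } u;
    struct msghdr msg;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock.
   Returns the new descriptor, or -1 on failure. */
static int recv_fd(int sock) {
    assert(sock >= 0);
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } u;
    struct msghdr msg;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    int fd;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

In this setup the parent would call send_fd() on its end of the socketpair after opening the resource, and the filtered child would call recv_fd() on the other end. This only works as long as the filter still permits recvmsg(2), which is the case for the blacklist shown above.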
A few important notes on seccomp:
- A seccomp filter, once applied, cannot be removed or altered by the process.
- If fork(2) or clone(2) are allowed by the filter, any child processes will be constrained by the same filter.
- If execve(2) is allowed, the existing filter will be preserved across a call to execve(2).
- If the prctl(2) syscall is allowed, the process is able to apply further filters.
Answered By - Marco Bonelli Answer Checked By - David Marino (WPSolving Volunteer)