Unprivileged Linux Network Namespaces, Part 2
Go read part 1 for an introduction to network namespaces and the unshare() and setns() syscalls. I ended that post with a complaint about using the unshare and nsenter programs - they are great programs, just not tailored to my preferred workflow.
The problem is persistence. I don’t want to worry about tracking the PID of the shell that executes unshare -Urn or having to keep that one shell running so that the PID I use to reach the namespace stays valid (the namespace continues to exist as long as any process is attached to it). So in a weird way, I want to use a daemon process as a named, persistent handle to the namespace.
Minimal Process
What is the best way to write a daemon program that (almost) never exits?
while (true) {}
This would work, but one of my CPUs might consider mutiny. I could put a sleep in there, but that still means the program is waking up every so often. To be fair, that would be good enough, but now I’m just curious.
Given what I know about Linux today, a signal seems like the best way to trigger the program’s termination. That way deleting the namespace is as simple as
$ kill -TERM <pid>
Attempt 1: signalfd
I’m just going to start by saying that I find signal handling in Linux to be pretty complicated. The simple cases like catching a Ctrl+C aren’t too tough, but there are so many syscalls for advanced cases. I could try a signalfd() [1] and block on a read() call.
    const std = @import("std");
    const debug = std.debug;
    const os = std.os;
    const linux = os.linux;

    pub fn main() !void {
        // Ask for SIGTERM to be reported through a file descriptor.
        var mask = os.empty_sigset;
        linux.sigaddset(&mask, linux.SIG.TERM);
        const fd = try os.signalfd(-1, &mask, 0);
        defer os.close(fd);

        // Blocking read: sleeps until a signalfd_siginfo record is available.
        var buf = [_]u8{0} ** @sizeOf(linux.signalfd_siginfo);
        const n = try os.read(fd, &buf);
        debug.assert(n == @sizeOf(linux.signalfd_siginfo));
        const info: *linux.signalfd_siginfo = @ptrCast(@alignCast(&buf));
        debug.print("received {d}\n", .{info.signo});
    }
$ zig build
$ ./zig-out/bin/sigfd
The program appears to hang, but that’s expected because I haven’t used O_NONBLOCK and thus read() is blocking.
$ ps -p $(pidof sigfd) -o %cpu,%mem,cmd
%CPU %MEM CMD
0.0 0.0 ./zig-out/bin/sigfd
$ cat /proc/$(pidof sigfd)/status | rg State
State: S (sleeping)
$ kill $(pidof sigfd)
This looks good. The process is in a sleeping state and consuming no resources (more or less - it is a beefy machine). But there’s a problem: other signals such as SIGINT will terminate the process without returning from read(), meaning the message is not printed. It’s not much of a problem in this example, but I do want to be able to run cleanup code. Here’s how it looks in strace:
read(3, 0x7ffda4722590, 128) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
The SA_RESTART flag can be used with sigaction() [2] to make Linux retry the read() call after a signal handler runs rather than return from it early. I opted for signalfd() in the first place to avoid using sigaction() and having to use a handler to modify some global state, so this is no good.
Attempt 2: pause
I searched around and learned of a syscall named pause() [3]. Unfortunately, pause() isn’t defined as a syscall on arm64, and I do a lot of work on a RockPro64 board running Linux. But pause() is both a syscall and a function defined by the POSIX specification. The glibc source [4] has:
    int
    __libc_pause (void)
    {
    #ifdef __NR_pause
      return SYSCALL_CANCEL (pause);
    #else
      return SYSCALL_CANCEL (ppoll, NULL, 0, NULL, NULL);
    #endif
    }
I don’t think using ppoll() is going to be any different than blocking on read(), so I’m not going to bother chasing that down.
Attempt 3: sigsuspend
Continuing my search yielded sigsuspend(), which both
- temporarily replaces the signal mask of the calling process
- suspends the process until a signal is delivered
That sounds perfect as it may solve the SIGINT problem.
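    const std = @import("std");
    const debug = std.debug;
    const os = std.os;
    const linux = os.linux;

    // sigsuspend() only returns once a handler has run, so install an empty one.
    fn sig_noop(_: c_int) callconv(.C) void {}

    // Thin wrapper around the raw rt_sigsuspend syscall.
    fn sigsuspend(mask: *const linux.sigset_t) void {
        const rc = linux.syscall2(.rt_sigsuspend, @intFromPtr(mask), linux.NSIG / 8);
        switch (linux.getErrno(rc)) {
            .SUCCESS => return,
            .INVAL => unreachable,
            .FAULT => unreachable,
            .INTR => return, // the normal case: a caught signal ended the wait
            else => unreachable,
        }
    }

    pub fn main() !void {
        // Signals to block while the SIGTERM handler itself runs.
        var handler_mask = os.empty_sigset;
        linux.sigaddset(&handler_mask, linux.SIG.TERM);
        try os.sigaction(linux.SIG.TERM, &linux.Sigaction{
            .handler = .{ .handler = sig_noop },
            .mask = handler_mask,
            .flags = 0,
        }, null);

        // While suspended, block everything except SIGTERM.
        var mask = linux.all_mask;
        // Assumption: this Zig version's std.os.linux provides sigdelset
        // (the original snippet called it unqualified).
        linux.sigdelset(&mask, linux.SIG.TERM);
        sigsuspend(&mask);
        debug.print("awoke from suspend\n", .{});
    }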
Clearly this is a bit more complicated, and I had to create my own wrapper for rt_sigsuspend since one isn’t provided by the standard library. The sigaction() call is necessary because, according to the man page, sigsuspend() only returns if the signal is caught and after the handler runs. It doesn’t bother me much since the handler can be an empty function.
At first this seemed to work - the program did not terminate when I sent SIGINT!
$ cat /proc/$(pidof sigsuspend)/status | rg State
State: S (sleeping)
However, when I then sent SIGTERM, the program exited without printing my message. Ugh. By looking at strace, I can see that SIGINT is delivered after SIGTERM, even though I sent them in the opposite order. I suppose that blocked signals stay pending and are still delivered once they become unblocked, which happens here when sigsuspend() returns and the original signal mask is restored.
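That guess about pending signals can be checked in isolation. Here is a minimal sketch in C rather than Zig, since that is the language the man pages use for these calls. The names blocked_signal_is_deferred and on_sigint are made up for illustration. It blocks SIGINT, raises it, confirms with sigpending() that the signal is only pending, and then unblocks it, at which point the handler finally runs.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Set by the handler so the caller can tell when delivery happened. */
static volatile sig_atomic_t handled = 0;

static void on_sigint(int signo) {
    (void)signo;
    handled = 1;
}

/* Hypothetical helper: returns 1 if a SIGINT raised while blocked stayed
 * pending and was delivered only after being unblocked, 0 if not,
 * and -1 if one of the setup calls failed. */
int blocked_signal_is_deferred(void) {
    struct sigaction sa = {0};
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGINT, &sa, NULL) != 0) return -1;

    sigset_t block, pending;
    sigemptyset(&block);
    sigaddset(&block, SIGINT);
    if (sigprocmask(SIG_BLOCK, &block, NULL) != 0) return -1;

    handled = 0;
    if (raise(SIGINT) != 0 || sigpending(&pending) != 0) return -1;
    int was_pending = sigismember(&pending, SIGINT) == 1;
    assert(handled == 0); /* the handler must not run while SIGINT is blocked */

    /* Unblocking delivers the pending SIGINT before sigprocmask() returns. */
    if (sigprocmask(SIG_UNBLOCK, &block, NULL) != 0) return -1;
    return was_pending && handled == 1;
}
```

In the sigsuspend() program there is no SIGINT handler, so the same late delivery triggers the default action and terminates the process before the message is printed.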
Attempt 4: sigprocmask
This syscall is referenced in the man pages for both signalfd() and sigsuspend(). I can use it to block all signals in a mask, which in this context is everything except SIGTERM. Of course, it won’t work with SIGKILL or SIGSTOP, but that’s fine by me.
By placing this right before the call to sigsuspend()
    os.sigprocmask(linux.SIG.SETMASK, &mask, null);
I can rerun the previous code and see that the final message is printed even when sending SIGINT before SIGTERM. But at this point, shouldn’t blocking signals via sigprocmask() be just as effective with signalfd() as it is with sigsuspend()? And by observation it is, so I’m back to square one, but at least it’s working now.
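For reference, the sigprocmask() plus signalfd() combination can be sketched in C as follows. This is not a translation of the Zig program, just one plausible shape for it; the function name wait_for_sigterm is hypothetical. One assumption differs from the mask used above: this sketch blocks every signal, SIGTERM included, because the signalfd(2) man page says a signal must be blocked for its default disposition to be suppressed while it is read from the descriptor.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <unistd.h>

/* Hypothetical helper: blocks every signal, then sleeps in read() until
 * SIGTERM arrives. Returns the signal number read from the signalfd, or
 * -1 on error. All signals remain blocked when it returns. */
int wait_for_sigterm(void) {
    sigset_t all, term;

    /* SIGKILL and SIGSTOP cannot be blocked; the kernel silently skips them. */
    sigfillset(&all);
    if (sigprocmask(SIG_SETMASK, &all, NULL) != 0) return -1;

    /* Only SIGTERM is reported through the descriptor. */
    sigemptyset(&term);
    sigaddset(&term, SIGTERM);
    int fd = signalfd(-1, &term, SFD_CLOEXEC);
    if (fd < 0) return -1;

    /* Blocking read: the process sleeps here until SIGTERM is pending. */
    struct signalfd_siginfo info;
    ssize_t n = read(fd, &info, sizeof info);
    close(fd);
    if (n < 0) return -1;
    assert(n == (ssize_t)sizeof info);
    return (int)info.ssi_signo;
}
```

A main() that calls wait_for_sigterm() and then runs its cleanup code would ignore a stray SIGINT (it stays pending and blocked) and wake up only for kill -TERM.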
This was a bit of a runaround for me, and I’m still not terribly confident I’ve truly found the best way to do this. At least I learned a few things along the way and have a path forward.