Quite a while ago I wrote portreserve, a utility to prevent ports getting stolen at boot time by portmap. This would happen with CUPS, for example: portmap starts first (to allow for NFS-mounted filesystems), and calls bindresvport(). If the privileged port it allocates (i.e. one in the range 512-1023) happens to be 631, then when CUPS starts and tries to bind that port, it fails. This didn’t just affect CUPS, but any service with a well-known port in the privileged range.
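To make the theft concrete, here is a minimal sketch of a bindresvport()-style scan. It is not glibc's implementation: the function name `bind_first_free` and the arbitrary port range are assumptions made so that it can run without root. The point is only that such a scan takes whichever port is free, with no knowledge of which ports some other service expects to own.

```python
import errno
import socket


def bind_first_free(start, end, host="127.0.0.1"):
    """Bind to the first free TCP port in [start, end) and return the socket.

    Illustrative only: the real bindresvport() works on privileged ports
    and needs root; this accepts any range so it can run unprivileged.
    """
    for port in range(start, end):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            s.bind((host, port))
            return s  # first free port wins, whoever it "belongs" to
        except OSError as e:
            s.close()
            # Skip ports that are in use or not permitted; re-raise anything else.
            if e.errno not in (errno.EADDRINUSE, errno.EACCES):
                raise
    raise OSError(errno.EADDRINUSE, "no free port in range")
```

If the range includes a well-known port that nobody has bound yet, nothing stops the scan from returning it, which is what happens to CUPS when portmap gets there first.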
The solution we’ve been using for a while is portreserve, and it works by having each privileged service (e.g. cupsd, spamd) provide a file in /etc/portreserve saying which ports it uses. At boot portreserve starts before portmap and binds to those ports, to prevent portmap (or anything else) getting them. When each privileged service starts, it calls portrelease first in order to tell portreserve to let go of its ports.
This works well enough from a simplistic point of view, but doesn’t go far enough. There is a race condition when a privileged service starts: between calling portrelease and the service actually binding to the now-freed port, the port could be grabbed by some other service. Also, portreserve is a once-per-boot thing: if you stop or restart a protected service after boot, there is no protection for its ports.
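The window can be shown in a few lines. This is a sketch under assumptions, not portreserve's code: a plain socket stands in for portreserve, closing it stands in for portrelease, and an unprivileged port chosen by the kernel stands in for 631. The names `bind_port` and `demo_race` are made up for the illustration.

```python
import errno
import socket

HOST = "127.0.0.1"


def bind_port(port):
    """Bind a TCP socket to HOST:port and return it (port 0 = kernel picks)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((HOST, port))
    except OSError:
        s.close()
        raise
    return s


def demo_race():
    """Walk through reserve, release, steal; return what happened."""
    events = []
    reserver = bind_port(0)  # stand-in for portreserve holding the port
    port = reserver.getsockname()[1]

    try:
        bind_port(port).close()
    except OSError as e:
        if e.errno != errno.EADDRINUSE:
            raise
        events.append("intruder blocked while reserved")

    reserver.close()  # stand-in for portrelease: the port is free again

    intruder = bind_port(port)  # anything may bind during this window
    events.append("intruder bound after release")

    try:
        bind_port(port).close()  # the real service arrives too late
    except OSError as e:
        if e.errno != errno.EADDRINUSE:
            raise
        events.append("service failed to bind")

    intruder.close()
    return events
```

In real life the intruder only wins occasionally, because the window is short, but nothing in the reserve-then-release design closes it.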
A really race-proof solution now appears to exist in systemd. It provides port-based socket activation, meaning that it can allocate ports that will be required later during boot, stopping portmap from getting them. When the relevant service starts, systemd hands the socket file descriptor directly to the service, with no race condition. It even retains the port when the service stops.
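On the service side, the hand-over looks roughly like this. The sketch follows the convention documented for sd_listen_fds(3), as I understand it: systemd sets LISTEN_PID and LISTEN_FDS in the environment, and the inherited descriptors start at 3. The function name `listen_fds` and its parameters are my own, and it ignores details such as LISTEN_FDNAMES and unsetting the variables afterwards.

```python
import os

SD_LISTEN_FDS_START = 3  # first inherited descriptor, per sd_listen_fds(3)


def listen_fds(environ=None, pid=None):
    """Return the file descriptors passed in by the service manager.

    An empty list means this process was not socket-activated (or the
    variables were meant for a different process).
    """
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    try:
        # LISTEN_PID guards against the variables leaking to child processes.
        if int(environ.get("LISTEN_PID", "")) != pid:
            return []
        count = int(environ.get("LISTEN_FDS", ""))
    except ValueError:
        return []  # variables missing or malformed
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))
```

A service would then wrap each descriptor with something like socket.socket(fileno=fd) and start accepting on it. Because the socket was already bound before the service existed, there is no moment at which the port is free for anyone else to take.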
What’s the problem? Services don’t always want to be activated on demand. In the case of CUPS, there are two ports: TCP 631 (for IPP) and UDP 631 (for CUPS Browsing). The UDP port is simply for listening out for periodic announcements of network-shared CUPS queues. When a packet arrives, there is no need to start CUPS, but the port still needs to be protected from portmap, or else CUPS Browsing will mysteriously fail from time to time.
Proposal: separate activation from port reservation
My proposal to fix this is for systemd to separate this socket activation feature from the more fundamental one of reserving ports for services. One way of doing this would be to add ListenStreamNoActivate and ListenDatagramNoActivate configuration directives.
4 responses to “The portreserve problem: is systemd the solution?”
The right way to do this should be to never allow those ports to be assigned to any service they shouldn’t belong to. SELinux can do this today, or if we need new kernel facilities that’s that. There is no inherent reason why these sockets should have to be grabbed beforehand, just as long as the filter is set up.
You do realise that it is possible to fix the port numbers the kernel uses for NFS?
Check the following kernel options:
and the appropriate userspace RPC daemon options:
rpc.mountd -p #
rpc.statd -p # -o #
rpc.rquotad -p #
I’ve been using those ever since I first hit this problem… totally simple solution.
equinox: that solution doesn’t go far enough. It locks down the ports for NFS, but there are other bindresvport()-using services too. From an architectural point of view a solution attacking that side of the problem will never be sufficient, as there may be 3rd party services installed which use bindresvport().
hpa: interesting, I hadn’t thought of using SELinux for that. Not sure how that would work in practice as the reserved privileged ports will vary from system to system depending on which packages are installed. I guess policy could be adjusted when the package is installed or uninstalled.
The “grabbing ports beforehand” idea came from Ulrich Drepper, as a counter-suggestion to the plan of changing glibc’s bindresvport() implementation so as to avoid certain ports being assigned (see bug #103401, where this all started).
Honestly, people should just stop using bindresvport(). It’s such a useless, broken API. I think the right fix would be to just use a normal bind() instead. That way NFS/portmap would get high ports assigned, but that shouldn’t really be a problem these days, should it?
But anyway, the idea sounds good to me. I added it to our systemd TODO list.