Skip to content

sysctl: Enable ping(8) inside rootless Podman containers#13141

Merged
poettering merged 1 commit intosystemd:masterfrom
debarshiray:wip/rishi/enable-ping-group-range
Jul 24, 2019
Merged

sysctl: Enable ping(8) inside rootless Podman containers#13141
poettering merged 1 commit intosystemd:masterfrom
debarshiray:wip/rishi/enable-ping-group-range

Conversation

@debarshiray
Copy link
Contributor

This makes ping(8) work without CAP_NET_ADMIN and CAP_NET_RAW because
those aren't effective inside rootless Podman containers.

It's quite useful when using OSTree based operating systems like Fedora
Silverblue, where development environments are often set up using
rootless Podman containers with helpers like Toolbox [1]. Not having
a basic network utility like ping(8) work inside the development
environment can be inconvenient.

See:
https://lwn.net/Articles/422330/
http://man7.org/linux/man-pages/man7/icmp.7.html
containers/podman#1550

[1] https://github.com/debarshiray/toolbox

@debarshiray
Copy link
Contributor Author

I am confused about the size of a group identifier on Linux these days. Wikipedia seems to suggest 32 bits, but on Fedora 29, an upper value of 4294967295 kept getting rejected. Should it be detected at build-time instead of being hard coded?

@poettering
Copy link
Member

hmm, i really don't understand the implications of this I must say, and the fact this uses 2^16-1 instead of 2^32-1 (i.e. the maximum of gid_t) appears quite strange to me... Binding this to group range membership sounds quite surprising to me I must say

@poettering
Copy link
Member

(note that I am not against this, I just want to understand this first)

@rhatdan
Copy link
Contributor

rhatdan commented Jul 23, 2019

The beauty of this is you can remove file capabilities from Ping.

# getcap /usr/bin/ping
/usr/bin/ping = cap_net_admin,cap_net_raw+p

If we had a default user set for all real users on Linux then we could make this a little more secure, say any user added by useradd went in the user group, then you could just setup the range by default to be for the user group.

Bottom line, is I believe users of a linux system expect to be able to ping without requiring root, and running ping in a container requires this access versus the file capabilities.

I have no idea what defaults Ubuntu has for this.

@rhatdan
Copy link
Contributor

rhatdan commented Jul 23, 2019

@debarshiray I think this should be marked as a change request for Fedora to allow security types to have their say.
If we go forward with this, then we should also suggest that the file capabilities of ping be dropped.

@debarshiray
Copy link
Contributor Author

the fact this uses 2^16-1 instead of 2^32-1 (i.e. the maximum of gid_t)
appears quite strange to me

Like I mentioned earlier, I had expected 2^32-1 to work, but it gets rejected on both Fedoras 30 and 29 even though __GID_T_TYPE is defined as an unsigned 32-bit integer in /usr/include/bits/typesizes.h. I haven't yet been able to dig up the canonical location where the maximum value of the group identifier is defined.

@giuseppe
Copy link
Contributor

@teknoraver what are the security implications of enabling net.ipv4.ping_group_range? Does it allow more than ping with file capabilities (such as crafting custom packets not handled correctly by the kernel)?

@poettering
Copy link
Member

so 2^31-1 seems to work, 2^31 does not.

@poettering
Copy link
Member

poettering commented Jul 23, 2019

hmm, this limit comes from include/net/ping.h:

#define GID_T_MAX (((gid_t)~0U) >> 1)

which appears to be simply wrong... somebody should file a bug...

that said, maybe this is not too bad for us, given that the upper uids are currently defined as "HIC SVNT LEONES" anyway in systemd:

https://systemd.io/UIDS-GIDS.html#summary

@debarshiray
Copy link
Contributor Author

@debarshiray I think this should be marked as a change request for Fedora to allow
security types to have their say.
If we go forward with this, then we should also suggest that the file capabilities of ping be dropped.

I wrote https://fedoraproject.org/wiki/Changes/EnableSysctlPingGroupRange

@teknoraver
Copy link
Contributor

teknoraver commented Jul 23, 2019

@teknoraver what are the security implications of enabling net.ipv4.ping_group_range? Does it allow more than ping with file capabilities (such as crafting custom packets not handled correctly by the kernel)?

the ICMP address family had a lot of security issues in the past, mostly because it has very few users.
It allows only to create ping packets, any other ICMP code is discarded:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/ipv4/ping.c#n461

/*
 * Is this a supported type of ICMP message?
 */

static inline int ping_supported(int family, int type, int code)
{
	return (family == AF_INET && type == ICMP_ECHO && code == 0) ||
	       (family == AF_INET6 && type == ICMPV6_ECHO_REQUEST && code == 0);
}

A partial list of past exploitable bug is:

$ git log --grep='^Fixes: c319b4d76b9e' --oneline
torvalds/linux@1b97013bfb11 ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg
torvalds/linux@77d4b1d36926 net: ping: do not abuse udp_poll()
torvalds/linux@43a6684519ab ping: implement proper locking
torvalds/linux@73d2c6678e6c ping: fix a null pointer dereference
torvalds/linux@0eab121ef875 net: ping: check minimum size on ICMP header length
torvalds/linux@f1d8cba61c3c inet: fix possible seqlock deadlocks

@debarshiray
Copy link
Contributor Author

so 2^31-1 seems to work, 2^31 does not.

Thanks for digging into it, @poettering I have now bumped the upper limit to 2147483647 (= 2^31 - 1).

@poettering
Copy link
Member

@debarshiray also needs an announcement in NEWS I figure...

@teknoraver I am not too interested in historical issues with it, more what the current state is, and whether there are negative security implications today if we just turn this on.

@teknoraver
Copy link
Contributor

@teknoraver I am not too interested in historical issues with it, more what the current state is, and whether there are negative security implications today if we just turn this on.

No contraindications other than not being much tested and used.

@poettering
Copy link
Member

Well, if we make the change here upstream in systemd it will get some testing I am sure.

@debarshiray can you add a NEWS entry for this change, please? looks ok to merge then.

@debarshiray
Copy link
Contributor Author

hmm, this limit comes from include/net/ping.h:

#define GID_T_MAX (((gid_t)~0U) >> 1)

which appears to be simply wrong... somebody should file a bug...

There's this comment above that #define, which was added as part of the initial commit.

/*
 * gid_t is either uint or ushort.  We want to pass it to
 * proc_dointvec_minmax(), so it must not be larger than MAX_INT
 */ 

I am not qualified to judge it, but I thought I'd just mention it here. :)

@debarshiray
Copy link
Contributor Author

@debarshiray can you add a NEWS entry for this change, please? looks ok to merge then.

Added a blurb to NEWS.

I wasn't sure of the right order. I stuck it at the top since it starts with E, so it's very roughly alphabetical.

This makes ping(8) work without CAP_NET_ADMIN and CAP_NET_RAW because
those aren't effective inside rootless Podman containers.

It's quite useful when using OSTree based operating systems like Fedora
Silverblue, where development environments are often set up using
rootless Podman containers with helpers like Toolbox [1]. Not having
a basic network utility like ping(8) work inside the development
environment can be inconvenient.

See:
https://lwn.net/Articles/422330/
http://man7.org/linux/man-pages/man7/icmp.7.html
containers/podman#1550

The upper limit of the range of group identifiers is set to 2147483647,
which is 2^31-1. Values greater than that get rejected by the kernel
because of this definition in linux/include/net/ping.h:
  #define GID_T_MAX (((gid_t)~0U) >> 1)

That's not so bad because values between 2^31 and 2^32-1 are reserved
on systemd-based systems anyway [2].

[1] https://github.com/debarshiray/toolbox
[2] https://systemd.io/UIDS-GIDS.html#summary
@poettering poettering added the good-to-merge/waiting-for-ci 👍 PR is good to merge, but CI hasn't passed at time of review. Please merge if you see CI has passed label Jul 24, 2019
@poettering poettering merged commit 90ce762 into systemd:master Jul 24, 2019
@debarshiray debarshiray deleted the wip/rishi/enable-ping-group-range branch July 24, 2019 14:52
@evverx
Copy link
Contributor

evverx commented Jul 24, 2019

systemd-sysctl has started to fail in nspawn containers launched with -U (which turns on user namespaces): #13177

@evverx
Copy link
Contributor

evverx commented Jul 26, 2019

The PR was reverted in #13187. We keep discussing what to do about user namespaces (where it's not always possible to write 2147483647 because of https://lore.kernel.org/patchwork/patch/959770/) in #13177.

@poettering
Copy link
Member

I posted #13191 now, to revert the revert, and make it work everywhere correctly now

MichaIng added a commit to MichaIng/DietPi that referenced this pull request May 31, 2020
+ Network | Grant all users "ping" access without the need for sudo, setuid, CAP_NET_RAW or CAP_NET_ADMIN. Further infos and discussion about this:
  - https://fedoraproject.org/wiki/Changes/EnableSysctlPingGroupRange
  - systemd/systemd#13141
  - #1012
jian-lin added a commit to linj-fork/nixpkgs that referenced this pull request Sep 21, 2023
From systemd 243 release note[1]:

This release enables unprivileged programs (i.e. requiring neither
setuid nor file capabilities) to send ICMP Echo (i.e. ping) requests
by turning on the "net.ipv4.ping_group_range" sysctl of the Linux
kernel for the whole UNIX group range, i.e. all processes.

So this wrapper is not needed any more.

See also [2] and [3].

This patch also removes:
- apparmor profiles in NixOS for ping itself and the wrapped one
- other references for the wrapped ping

[1]: https://github.com/systemd/systemd/blob/8e2d9d40b33bc8e8f5d3479fb075d3fab32a4184/NEWS#L6457-L6464
[2]: systemd/systemd#13141
[3]: https://fedoraproject.org/wiki/Changes/EnableSysctlPingGroupRange
TheBrainScrambler pushed a commit to TheBrainScrambler/nixpkgs that referenced this pull request Dec 23, 2023
From systemd 243 release note[1]:

This release enables unprivileged programs (i.e. requiring neither
setuid nor file capabilities) to send ICMP Echo (i.e. ping) requests
by turning on the "net.ipv4.ping_group_range" sysctl of the Linux
kernel for the whole UNIX group range, i.e. all processes.

So this wrapper is not needed any more.

See also [2] and [3].

This patch also removes:
- apparmor profiles in NixOS for ping itself and the wrapped one
- other references for the wrapped ping

[1]: https://github.com/systemd/systemd/blob/8e2d9d40b33bc8e8f5d3479fb075d3fab32a4184/NEWS#L6457-L6464
[2]: systemd/systemd#13141
[3]: https://fedoraproject.org/wiki/Changes/EnableSysctlPingGroupRange
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

good-to-merge/waiting-for-ci 👍 PR is good to merge, but CI hasn't passed at time of review. Please merge if you see CI has passed merged-but-needs-fixing sysctl

Development

Successfully merging this pull request may close these issues.

6 participants