Drop DOCKER-ISOLATION rules #49981
@robmry I see the milestone v29; does this one have to wait for that, or was that just "need some milestone"?
This one should wait for v29.
checkHTTP := func(t *testing.T, addr string, expResp bool) {
	t.Parallel()
	t.Helper()
	url := "http://" + net.JoinHostPort(addr, "80")
	res := container.RunAttach(ctx, t, c,
		container.WithNetworkMode(clientNetName),
		container.WithCmd("wget", "-O-", "-T3", url),
	)
	if expResp {
		// 404 Not Found means the server responded, but it's got nothing to serve.
		assert.Check(t, is.Contains(res.Stderr.String(), "404 Not Found"), "url: %s", url)
	} else {
		assert.Check(t, is.Contains(res.Stderr.String(), "download timed out"), "url: %s", url)
	}
}
t.Run("w", func(t *testing.T) { // Wait for the parallel tests to complete.
	t.Run("ipv4/pub", func(t *testing.T) { checkHTTP(t, pub4, tc.expPubResp) })
	t.Run("ipv6/pub", func(t *testing.T) { checkHTTP(t, pub6, tc.expPubResp) })
	t.Run("ipv4/unpub", func(t *testing.T) { checkHTTP(t, unpub4, tc.expUnpubResp) })
	t.Run("ipv6/unpub", func(t *testing.T) { checkHTTP(t, unpub6, tc.expUnpubResp) })
})
Suggested change:
checkHTTP := func(addr string, expResp bool) func(*testing.T) {
	return func(t *testing.T) {
		t.Parallel()
		t.Helper()
		url := "http://" + net.JoinHostPort(addr, "80")
		res := container.RunAttach(ctx, t, c,
			container.WithNetworkMode(clientNetName),
			container.WithCmd("wget", "-O-", "-T3", url),
		)
		if expResp {
			// 404 Not Found means the server responded, but it's got nothing to serve.
			assert.Check(t, is.Contains(res.Stderr.String(), "404 Not Found"), "url: %s", url)
		} else {
			assert.Check(t, is.Contains(res.Stderr.String(), "download timed out"), "url: %s", url)
		}
	}
}
t.Run("w", func(t *testing.T) { // Wait for the parallel tests to complete.
	t.Run("ipv4/pub", checkHTTP(pub4, tc.expPubResp))
	t.Run("ipv6/pub", checkHTTP(pub6, tc.expPubResp))
	t.Run("ipv4/unpub", checkHTTP(unpub4, tc.expUnpubResp))
	t.Run("ipv6/unpub", checkHTTP(unpub6, tc.expUnpubResp))
})
Let me temporarily move to draft to prevent trigger-happy
Rebased to resolve conflicts.
The Inter-Network Communication rules in the iptables chains DOCKER-ISOLATION-STAGE-1 / DOCKER-ISOLATION-STAGE-2 (which are called from filter-FORWARD) currently:
- Block access from containers in one bridge network, to ports published to host addresses by containers in other bridge networks, when the userland-proxy is disabled.
  - But, that access is allowed when the proxy is enabled.
- Block access to all ports on container addresses in gateway mode "nat-unprotected" networks.
  - But, those ports can be accessed from anywhere else, including other hosts. Just not other bridge networks.
- Allow access from containers in "nat" bridge networks to published ports on container addresses in "routed" networks. But, to do that, extra INC rules are added for the routed network.

The INC rules are no longer needed to block access from containers in one network to unpublished ports on container addresses in other networks.

Direct routing to containers in NAT networks is blocked by the "raw-PREROUTING" rules that block access from untrusted interfaces (all interfaces apart from the network's own bridge).

Drop these INC rules to resolve the inconsistencies listed above, with this change:
- Published ports on host addresses can be accessed from containers in other networks (even without the userland-proxy).
- The rules for direct routing between bridge networks are the same as the rules for direct routing from outside the Docker host (allowed for gw modes "routed" and "nat-unprotected", disallowed for "nat").

Fewer rules, so it's simpler, and perhaps slightly faster.

Internal networks (with no access to networks outside the host) are also implemented using rules in the DOCKER-ISOLATION chains. This change moves those rules to a new chain, DOCKER-INTERNAL, and drops the DOCKER-ISOLATION chains.

Signed-off-by: Rob Murray <rob.murray@docker.com>
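The direct-routing outcome stated in the commit message can be summarised as a small decision function. The sketch below is only a model of that one sentence: `directRoutingAllowed` is a hypothetical helper, not code from this PR, and it ignores port-level filtering and any gateway modes the commit message does not mention.

```go
package main

import "fmt"

// directRoutingAllowed models the rule described in the commit message:
// after the change, direct routing to a container address is treated the
// same whether it comes from another bridge network or from outside the
// Docker host. It is allowed for gateway modes "routed" and
// "nat-unprotected", and disallowed for "nat".
//
// Hypothetical helper for illustration only; other gateway modes are
// reported as errors because the commit message does not cover them.
func directRoutingAllowed(gwMode string) (bool, error) {
	switch gwMode {
	case "routed", "nat-unprotected":
		return true, nil
	case "nat":
		return false, nil
	default:
		return false, fmt.Errorf("gateway mode %q is not covered by this sketch", gwMode)
	}
}

func main() {
	for _, mode := range []string{"nat", "nat-unprotected", "routed"} {
		allowed, err := directRoutingAllowed(mode)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-16s direct routing allowed: %v\n", mode, allowed)
	}
}
```

The point of the model is that the source of the traffic is no longer an input: before the change, the INC rules made the answer depend on whether the client was in another bridge network.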
Marked as ready for review again ... the master branch is now 29.x, and we'll cherry-pick for 28.x releases.
- What I did
gateway_mode=routed
#49509 (comment)
Simplify iptables rules by dropping Inter-Network Communication (INC) rules. This makes behaviour more consistent, and it avoids carrying inconsistent behaviour forward into nftables without introducing semantic differences between iptables and nftables networks.
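With this change:
- Published ports on host addresses can be accessed from containers in other networks (even without the userland-proxy).
- The rules for direct routing between bridge networks are the same as the rules for direct routing from outside the Docker host (allowed for gw modes "routed" and "nat-unprotected", disallowed for "nat").
Fewer rules, so it's simpler, and perhaps slightly faster.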
Background ...
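The Inter-Network Communication rules in the iptables chains DOCKER-ISOLATION-STAGE-1 / DOCKER-ISOLATION-STAGE-2 (which are called from filter-FORWARD) currently:
- Block access from containers in one bridge network, to ports published to host addresses by containers in other bridge networks, when the userland-proxy is disabled.
  - But, that access is allowed when the proxy is enabled.
- Block access to all ports on container addresses in gateway mode "nat-unprotected" networks.
  - But, those ports can be accessed from anywhere else, including other hosts. Just not other bridge networks.
- Allow access from containers in "nat" bridge networks to published ports on container addresses in "routed" networks. But, to do that, extra INC rules are added for the routed network.
Since #48724, the INC rules are no longer needed to block access from containers in one network to unpublished ports on container addresses in other networks.
Since #49325, direct routing to containers in NAT networks is blocked by the "raw-PREROUTING" rules that block access from untrusted interfaces (all interfaces apart from the network's own bridge).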
- How I did it
Dropped the INC rules to resolve the inconsistencies listed above.
Internal networks (with no access to networks outside the host) were also implemented using rules in the DOCKER-ISOLATION chains. This change moves those rules to a new chain, DOCKER-INTERNAL, and drops the DOCKER-ISOLATION chains.
- How to verify it
New and updated tests.
Also, started a daemon without this change, added some networks to create INC rules in the DOCKER-ISOLATION chains, for normal and internal networks. Stopped that daemon, started one with the change, checked that the DOCKER-ISOLATION chains were removed and the internal network's rules migrated to DOCKER-INTERNAL. (With and without live-restore enabled.)
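The manual check described above could be partly scripted. The sketch below assumes the ruleset has been captured as text in `iptables -S` format, where each user-defined chain appears on a `-N <chain>` line. `declaredChains` and `checkMigration` are hypothetical helpers, the sample ruleset is hand-written rather than taken from a real daemon, and the check assumes an internal network exists so that DOCKER-INTERNAL is expected to be present.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// declaredChains returns the user-defined chains found in `iptables -S`
// style output, identified by their "-N <chain>" lines.
func declaredChains(rules string) map[string]bool {
	chains := make(map[string]bool)
	sc := bufio.NewScanner(strings.NewReader(rules))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 2 && fields[0] == "-N" {
			chains[fields[1]] = true
		}
	}
	return chains
}

// checkMigration reports problems with a ruleset captured after the
// upgraded daemon has started: the old DOCKER-ISOLATION chains should be
// gone, and (assuming an internal network exists) DOCKER-INTERNAL should
// be present. An empty result means the check passed.
func checkMigration(rules string) []string {
	chains := declaredChains(rules)
	var problems []string
	for _, old := range []string{"DOCKER-ISOLATION-STAGE-1", "DOCKER-ISOLATION-STAGE-2"} {
		if chains[old] {
			problems = append(problems, "stale chain: "+old)
		}
	}
	if !chains["DOCKER-INTERNAL"] {
		problems = append(problems, "missing chain: DOCKER-INTERNAL")
	}
	return problems
}

func main() {
	// Hand-written sample, not output from a real daemon.
	sample := "-N DOCKER\n-N DOCKER-INTERNAL\n"
	if problems := checkMigration(sample); len(problems) > 0 {
		for _, p := range problems {
			fmt.Println(p)
		}
		return
	}
	fmt.Println("migration check passed")
}
```

This only checks that the chains exist or are absent. It does not verify that the internal network's rules inside DOCKER-INTERNAL match what the DOCKER-ISOLATION chains previously held.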
- Human readable description for the release notes