Fix container name creation to generate valid DNS hostnames #2078

Closed
ramirogm wants to merge 2 commits

ramirogm

Fixes #2077

Change-type: patch
Signed-off-by: Ramiro Gonzalez [email protected]

If this is a regression, consider adding it to #1898!

Description

Fix container name creation to generate valid DNS hostnames

Fixes #2077

Type of change

Patch

How Has This Been Tested?

Unit tests added

@pipex
Contributor

pipex commented Dec 12, 2022

@ramirogm why do you need this change? Could you describe the problem you are trying to solve? Who is trying to use the full container name as a hostname? The supervisor already configures the serviceName as an alias for all services, see

v.aliases.push(serviceName);

The commit in the container name is used in several places in the code, including to identify releases when patching the current state, so the way you are doing the transformation might not work. The web terminal also uses the container name to access the services, although that is easier to change.

FWIW, I hope that we can remove the serviceId and imageId from the name in the future, making the container name much shorter.

@cywang117
Contributor

cywang117 commented Dec 12, 2022

@pipex This looks to be in response to #2077, which was created from the attached JF ticket in that issue. I'll upload a docker-compose.yml with this message shortly with a local reproduction, if possible.

EDIT: I've verified that this can't be reproduced locally @ramirogm. Take the following docker-compose.yml for instance:

version: '2.3'

services:
  one:
    image: alpine:latest
    command: sleep infinity
    stop_signal: SIGKILL
  very-very-very-very-very-very-very-very-very-very-very-long:
    image: alpine:latest
    command: sleep infinity
    stop_signal: SIGKILL

The second service name has 59 chars by itself and, after the Supervisor adds service|imageId and commit, the container name goes beyond 63 chars. However, Felipe is right that services are created with aliases equal to their service names. Setting the device in local mode and pushing the above docker-compose.yml, we can see the following:

  • Pinging the long service from the short service only works when using the alias, and errors due to length when using the full name:
balena exec one_1_1_localrelease ping very-very-very-very-very-very-very-very-very-very-very-long
PING very-very-very-very-very-very-very-very-very-very-very-long (172.17.0.2): 56 data bytes
64 bytes from 172.17.0.2: seq=0 ttl=64 time=0.130 ms

...

balena exec one_1_1_localrelease ping very-very-very-very-very-very-very-very-very-very-very-long_2_1_localrelease
ping: bad address 'very-very-very-very-very-very-very-very-very-very-very-long_2_1_localrelease'
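
The length arithmetic in this comment can be checked with a minimal sketch. The layout of the name (service name, two numeric ids, then a commit-like suffix, joined by underscores) is inferred from names shown in this thread such as `one_1_1_localrelease`, not taken from the Supervisor source; `composeContainerName` and `MAX_DNS_LABEL` are hypothetical names.

```typescript
// Maximum length of a single DNS label (RFC 1035).
const MAX_DNS_LABEL = 63;

// Hypothetical helper: joins the parts the way the names in this thread
// appear to be built. This is an assumption, not the Supervisor's code.
function composeContainerName(
  serviceName: string,
  firstId: number,
  secondId: number,
  commit: string,
): string {
  return `${serviceName}_${firstId}_${secondId}_${commit}`;
}

// The long service name from the docker-compose.yml above: 59 characters.
const longService = "very-".repeat(11) + "long";
const fullName = composeContainerName(longService, 2, 1, "localrelease");

console.log(longService.length, longService.length <= MAX_DNS_LABEL); // 59 true
console.log(fullName.length, fullName.length <= MAX_DNS_LABEL); // 76 false
```

The alias fits in one DNS label, while the full container name does not, which matches the two ping results.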

I definitely learned about the limits on container name length, so thank you for that, and great work on diving right into the Supervisor codebase! Unfortunately, it looks like the container name length is not the culprit here :(

@pipex
Contributor

pipex commented Dec 12, 2022

Oh, thank you @cywang117, I'll reply on the issue

@ramirogm
Author

ramirogm commented Dec 13, 2022

@pipex Hi Felipe! Thanks for looking into this. I've just added more context on #2077 (comment)

This is an issue we found with container names being used by the Docker embedded DNS server to perform reverse DNS lookups. Some apps rely on that, and ping uses it too, which is how our balena user reproduced the issue.

Then I went ahead and created this PR with a candidate solution, just to get this going. The main goal is to get the hostname to be a valid DNS hostname, meaning 1 <= length <= 63. I don't have much knowledge about the dependencies of other projects on that, so I chose a solution that didn't break what's here.

The solution is basically to leave the release and imageId as they currently are, shorten the commit to 12 characters (which should be enough for an abbreviated git commit), and then shorten the serviceName if needed.
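
A minimal sketch of that shortening rule, under stated assumptions: this is not the code from the PR, `shortenContainerName` is a hypothetical name, and the order of the two ids in the name is assumed.

```typescript
const MAX_LABEL_LENGTH = 63; // DNS label limit
const SHORT_COMMIT_LENGTH = 12; // abbreviated commit length proposed above

// Hypothetical sketch: keep the ids untouched, shorten the commit, then trim
// the service name only as much as needed to fit in one DNS label.
function shortenContainerName(
  serviceName: string,
  imageId: number,
  releaseId: number,
  commit: string,
): string {
  const shortCommit = commit.slice(0, SHORT_COMMIT_LENGTH);
  const suffix = `_${imageId}_${releaseId}_${shortCommit}`;
  // Room left for the service name once the fixed suffix is accounted for.
  const room = MAX_LABEL_LENGTH - suffix.length;
  if (room < 1) {
    throw new Error("ids and commit alone exceed the DNS label limit");
  }
  return serviceName.slice(0, room) + suffix;
}

console.log(
  shortenContainerName(
    "zookeeperXXXX64",
    5841322,
    2408231,
    "29796bdae4079eecb9e82d78a18c0156",
  ),
); // zookeeperXXXX64_5841322_2408231_29796bdae407
```

One caveat of trimming the service name is that two services sharing a long common prefix could end up with the same shortened prefix, so the ids are what keep the names distinct.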

About the specific questions:

"Who is trying to use the full container name as a hostname? The supervisor already configures the serviceName as an alias for all services"

It would be the embedded DNS resolver. Example:

On a test device, on a container running ubuntu that resolves names using the embedded DNS server:

root@e404e104187f:/# cat /etc/resolv.conf
nameserver 127.0.0.11
options timeout:15 ndots:0

We get the IP from a container with a long but legal name:

root@e404e104187f:/# nslookup zookeeperXX62
Server:         127.0.0.11
Address:        127.0.0.11#53

Non-authoritative answer:
Name:   zookeeperXX62
Address: 172.17.0.4

and if we do the reverse lookup we get the container name, not the alias:

root@e404e104187f:/# nslookup -x 172.17.0.4
4.0.17.172.in-addr.arpa name = zookeeperXX62_5841324_2408231_29796bdae4079eecb9e82d78a18c0156.ae8c6ddc272547a49531149bd2dd187f_default.
note: the answer splits into two labels:
zookeeperXX62_5841324_2408231_29796bdae4079eecb9e82d78a18c0156 (hostname: 62 chars)
ae8c6ddc272547a49531149bd2dd187f_default (domain: 40 chars)
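
To make the reverse lookup concrete, here is a small sketch of how the PTR query name is derived from the container IP and how the label lengths in the answer can be measured. Both function names are hypothetical; the `in-addr.arpa` construction is the standard one for IPv4.

```typescript
// Builds the PTR query name for an IPv4 address by reversing its octets.
function reverseLookupName(ipv4: string): string {
  const octets = ipv4.split(".");
  const valid =
    octets.length === 4 &&
    octets.every((o) => /^\d{1,3}$/.test(o) && Number(o) <= 255);
  if (!valid) {
    throw new Error(`not an IPv4 address: ${ipv4}`);
  }
  return octets.reverse().join(".") + ".in-addr.arpa";
}

// Splits a PTR answer into labels and returns the length of each one.
function labelLengths(fqdn: string): number[] {
  return fqdn
    .replace(/\.$/, "") // drop the trailing root dot, if present
    .split(".")
    .map((label) => label.length);
}

console.log(reverseLookupName("172.17.0.4")); // 4.0.17.172.in-addr.arpa
console.log(
  labelLengths(
    "zookeeperXX62_5841324_2408231_29796bdae4079eecb9e82d78a18c0156.ae8c6ddc272547a49531149bd2dd187f_default.",
  ),
); // [ 62, 40 ]
```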

If I do the same with a container whose name is 64 bytes, the embedded DNS server logs an error.

root@29eba2b:~# balena ps | grep zookeeperXXXX64
57c4afad8df6   dc8eeceb4a7c                                                     "tail -f /dev/null"      36 minutes ago   Up 36 minutes                    zookeeperXXXX64_5841322_2408231_29796bdae4079eecb9e82d78a18c0156

note: zookeeperXXXX64_5841322_2408231_29796bdae4079eecb9e82d78a18c0156 64 chars.

root@e404e104187f:/# nslookup  zookeeperxxxx64
Server:         127.0.0.11
Address:        127.0.0.11#53

Non-authoritative answer:
Name:   zookeeperxxxx64
Address: 172.17.0.3

# this hangs, and I have to CTRL-C it
root@e404e104187f:/# nslookup -x 172.17.0.3
^C

while on journalctl:

Dec 13 01:43:14 29eba2b balenad[1343]: time="2022-12-13T01:43:14.566083455Z" level=error msg="[resolver] error writing resolver resp, dns: bad data"
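
The log does not say why a 64-character name produces `dns: bad data`. One explanation consistent with the 63-character limit mentioned earlier is the DNS wire format: each label is preceded by a one-octet length whose two high bits are reserved, so a label can be at most 63 octets (RFC 1035). The sketch below only illustrates that limit; it is not the resolver's code, `encodeDnsName` is a hypothetical helper, and it assumes ASCII names.

```typescript
// Encodes a domain name as length-prefixed labels, as in the DNS wire format.
// Assumes ASCII input, which holds for the container names in this thread.
function encodeDnsName(domain: string): number[] {
  const bytes: number[] = [];
  for (const label of domain.replace(/\.$/, "").split(".")) {
    if (label.length < 1 || label.length > 63) {
      throw new RangeError(`label length ${label.length} is outside 1..63`);
    }
    bytes.push(label.length); // one-octet length prefix
    for (let i = 0; i < label.length; i++) {
      bytes.push(label.charCodeAt(i));
    }
  }
  bytes.push(0); // the zero-length root label terminates the name
  return bytes;
}

const domainPart = "ae8c6ddc272547a49531149bd2dd187f_default";

// 62-character host label: encodes without error.
encodeDnsName(
  `zookeeperXX62_5841324_2408231_29796bdae4079eecb9e82d78a18c0156.${domainPart}`,
);

// 64-character host label: rejected by the length check.
try {
  encodeDnsName(
    `zookeeperXXXX64_5841322_2408231_29796bdae4079eecb9e82d78a18c0156.${domainPart}`,
  );
} catch (err) {
  console.log((err as Error).message); // label length 64 is outside 1..63
}
```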

"The commit in the container name is used in a bunch of places in the code, including to identify the releases when patching the current state so the way you are doing the transformation might not work."

OK, if we decide to shorten it to 12 characters, the truncation should be applied consistently. Something like https://pkg.go.dev/github.com/docker/docker/pkg/stringid#TruncateID
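
A rough TypeScript equivalent of such a helper might look like the sketch below. It assumes the Go function drops any algorithm prefix such as `sha256:` and keeps the first 12 characters; `truncateId` and `sameCommit` are hypothetical names, not existing Supervisor functions.

```typescript
const SHORT_ID_LENGTH = 12;

// Returns a short form of an id: strips an optional "algo:" prefix and keeps
// at most the first 12 characters.
function truncateId(id: string): string {
  const colon = id.indexOf(":");
  const bare = colon >= 0 ? id.slice(colon + 1) : id;
  return bare.slice(0, SHORT_ID_LENGTH);
}

// Compares two commits by their short form, so code that matches releases by
// commit keeps working whether it sees the full or the shortened value.
function sameCommit(a: string, b: string): boolean {
  return truncateId(a) === truncateId(b);
}

console.log(truncateId("29796bdae4079eecb9e82d78a18c0156")); // 29796bdae407
console.log(sameCommit("29796bdae4079eecb9e82d78a18c0156", "29796bdae407")); // true
```

Routing every comparison through one helper like this is one way to keep the shortened commit consistent across the places that read it from the container name.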

@cywang117 About

ping: bad address 'very-very-very-very-very-very-very-very-very-very-very-long_2_1_localrelease'

I think this doesn't faithfully reproduce the issue, because the error we found is with the reverse lookup.

@pipex
Contributor

pipex commented Jun 19, 2023

Closing as this should be improved by #2136

@pipex pipex closed this Jun 19, 2023
@ramirogm
Author

ramirogm commented Jun 20, 2023 via email

Development

Successfully merging this pull request may close these issues.

supervisor generates invalid DNS names that break reverse DNS lookups
3 participants