Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use a NodePort service as ExternalName in another service type ExternalName for configuring default-backend #12158

Closed
chessman opened this issue Oct 10, 2024 · 25 comments · May be fixed by #12480
Labels
needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-priority needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@chessman
Copy link
Contributor

What happened:

I'm trying to use an ExternalName service as a default backend for an Ingress to handle situation when the deployment has no active endpoints. I get 504 Gateway Time-out error. In controller logs:

W1010 15:04:46.362237       7 controller.go:1216] Service "default/web" does not have any active Endpoint.
2024/10/10 15:05:07 [error] 1294#1294: *41828 [lua] balancer.lua:348: balance(): error while setting current upstream peer [echo.default.svc.cluster.local]:8080: invalid IPv6 address while connecting to upstream, client: 192.168.49.1, server: hello-world.example, request: "GET / HTTP/1.1", host: "hello-world.example"
2024/10/10 15:05:12 [error] 1294#1294: *41828 upstream timed out (110: Operation timed out) while connecting to upstream, client: 192.168.49.1, server: hello-world.example, request: "GET / HTTP/1.1", upstream: "http://0.0.0.1:80/", host: "hello-world.example"

What you expected to happen:

The request should be passed to the default backend service. It should not treat the external name as an IPv6 address. AFAIK, it works when an external name service is used in Ingress. Why is default backend handled differently?

NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):
v1.11.2

Kubernetes version (use kubectl version):
v1.31.0

Environment:

Initially I experienced this in EKS, then I reproduced it in Minikube.

  • Cloud provider or hardware configuration: Minikube v1.34.0

  • OS (e.g. from /etc/os-release): Ubuntu 24.04

  • Kernel (e.g. uname -a): 6.8.0-45

  • How was the ingress-nginx-controller installed:
    minikube addons enable ingress

How to reproduce this issue:

Install minikube

minikube start

Install the ingress controller

minikube addons enable ingress

Install hello-app with 0 replicas

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: web
  name: web
  namespace: default
spec:
  replicas: 0
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - image: gcr.io/google-samples/hello-app:1.0
        imagePullPolicy: IfNotPresent
        name: hello-app

kubectl expose deployment web --type=NodePort --port=8080

Install echo server

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: echo
  name: echo
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: echo
  template:
    metadata:
      labels:
        app: echo
    spec:
      containers:
      - image: jmalloc/echo-server
        imagePullPolicy: IfNotPresent
        name: echo
kubectl expose deployment echo --type=NodePort --port=8080

Create ExternalName service that points to echo service

apiVersion: v1
kind: Service
metadata:
  name: echo-en
  namespace: default
spec:
  externalName: echo.default.svc.cluster.local
  type: ExternalName
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 8080

Create an ingress for hello-app with echo service external name as default backend

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/default-backend: echo-en
  name: example-ingress
spec:
  ingressClassName: nginx
  rules:
    - host: hello-world.example
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web
                port:
                  number: 8080

Send a request

❯ curl --resolve "hello-world.example:80:$( minikube ip )" -i http://hello-world.example
HTTP/1.1 504 Gateway Time-out
Date: Thu, 10 Oct 2024 15:35:28 GMT
Content-Type: text/html
Content-Length: 160
Connection: keep-alive

<html>
<head><title>504 Gateway Time-out</title></head>
<body>
<center><h1>504 Gateway Time-out</h1></center>
<hr><center>nginx</center>
</body>
</html>
@chessman chessman added the kind/bug Categorizes issue or PR as related to a bug. label Oct 10, 2024
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Oct 10, 2024
@k8s-ci-robot
Copy link
Contributor

This issue is currently awaiting triage.

If Ingress contributors determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@longwuyuan
Copy link
Contributor

/remove-kind bug
The project does not support your expectation

/close

@k8s-ci-robot k8s-ci-robot added needs-kind Indicates a PR lacks a `kind/foo` label and requires one. and removed kind/bug Categorizes issue or PR as related to a bug. labels Oct 10, 2024
@k8s-ci-robot
Copy link
Contributor

@longwuyuan: Closing this issue.

In response to this:

/remove-kind bug
The project does not support your expectation

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@chessman
Copy link
Contributor Author

/reopen

@k8s-ci-robot k8s-ci-robot reopened this Oct 11, 2024
@k8s-ci-robot
Copy link
Contributor

@chessman: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@longwuyuan
Copy link
Contributor

/retitle use a NodePort service as ExternalName in another service type ExternalName for configuring default-backend

@k8s-ci-robot k8s-ci-robot changed the title ExternalName service cannot be used as default backend use a NodePort service as ExternalName in another service type ExternalName for configuring default-backend Oct 11, 2024
@longwuyuan
Copy link
Contributor

@chessman thanks for your comments. Using data to discuss helps everyone get on the same page.

  • Use the data that you provided in the issue.

  • As per the data in the issue you have the echo-service(default-backend) and the hello-app ingress in the same namespace(default)

  • So you can just do kubectl expose deployment echo --port=8080 instead of creating service --type NodePort

  • Then you can set the default backend in the ingress you showed there simply like this "nginx.ingress.kubernetes.io/default-backend: echo"

    • Note the the expose command without --name will create a service of --type ClusterIP with same name as deploment "echo"
  • You also have options to set the global default-backend https://kubernetes.github.io/ingress-nginx/examples/customization/custom-errors/

@longwuyuan
Copy link
Contributor

@chessman please discuss the issue here and not in the PR

I am copy/pasting the docs for the annotation default-backend
image

You can see clearly that we support another service in the same namespace. There is no reference of supporting a service from another namespace as default-backend. So this is not a bug.

And there are no resources or interests to start supporting service from another namespace as default-backend. You do have the option to set a global custom-backend https://kubernetes.github.io/ingress-nginx/examples/customization/custom-errors/

@chessman
Copy link
Contributor Author

@longwuyuan <svc name> in the docs is an ExternalName service in the same namespace. The doc does not say that it cannot be ExternalName, that's why I think it is a bug. It points to another in-cluster service in my case but can point to any address in general.

So you can just do kubectl expose deployment echo --port=8080 instead of creating service --type NodePort

Then you can set the default backend in the ingress you showed there simply like this "nginx.ingress.kubernetes.io/default-backend: echo"

Yes, I'm aware of it. I did it just to create a simple reproduction. My use case is more complex, I need to set default backend in different namespaces to one in-cluster service.

You do have the option to set a global custom-backend https://kubernetes.github.io/ingress-nginx/examples/customization/custom-errors/

This is not ideal solution in my case. I need to handle only "no active endpoints" with the service, it can be done only with default backend set in annotations. With global backend I cannot distinguish "no active endpoints" from 503 returned by a service.

@longwuyuan
Copy link
Contributor

@chessman I completely agree with you on all your latest comments above. I also wish that this can be solved for you and that your requirements for using a service from another namespace are met.

But it seems my earlier comments were not precise or informative. So I will try to rephrase my earlier comments.

  • This is a community project maintained by volunteers so the context is mutual benefit of maximum number of users in a conventional approach. Also any approaches that are implied by the upstream K8S Ingress-API specs are acceptable.
  • The relevance here is that even though it is not explicit, the convention implies that the service-type a user may configure will be a service of --type ClusterIP, as it works and has worked for everyone for a long long time.
  • Agreed it is not explicit in the docs, but the practical possibility is that we can edit the docs to get explicit, but the project is hardly likely to change its code to support just one single user, and start supporting service --type ExternalName for that default-backend.
  • There are many reasons for this direction the project has taken, that I already stated earlier. Like lack of resources and a lack of any interest in adding new enhanced capabilities that you are expecting.
  • We are actually deprecating popular fully-working features as its not feasible to support & maintain them, at the cost of security & stability. Hence the suggestion that if you want, then you can fork the project and implement your changes there.
  • Your change may not be in lua part of code. But I was referring to the error message hinting that the lua-balancer was referenced. I am not a developer so wait for comments from developers. If a developer accepts your changes, then it will be communicated here or in the PR obviously.
  • Just FYI, if your change avoids using endpointSlices to connect to default-backend, then the change you are expecting could very likely be hitting the lua codepath as that is where the lua-balancer reference in the error log is. Also if you are skipping endpointSlices, then I suspect the change you are proposing is not at all trivial.
  • I don't see any tests or description of impact on security so that is not helping. Going across namespace is frowned upon and project will get a CVE for such a behaviour

@chessman
Copy link
Contributor Author

@longwuyuan BTW, we can consider the opposite case - ExternalName service in Ingress and ClusterIP in default backend. Due to the fact that currently a wrong service is used for default backend, it will treat the ClusterIP service as an external service. I'm not sure that it works correctly and probably it also entails some security risks.

@chessman
Copy link
Contributor Author

@longwuyuan Also, probably it works without my change if both Ingress and default backend are external services. If you think that this is a security issue, then you need to explicitly validate default backend. But again, I don't understand why it is ok in Ingress and not ok in default backend.

@longwuyuan
Copy link
Contributor

  • There are many production users right now, who configured a ingress resource backend as service --type ExternalName. This is not a topic of debate
  • Your definition of "wrong" so far is based on your problem but not on any data as to what exactly is "wrong". We do not support cross-namespace in ingress. Simple. Clear. Its a security issue and will not change this
  • If you mean ingress backend as ExternalName and default-backend as a ClusterIP service in same namespace as ingress resource, then try it. The response from the ExternalName backend needs to be trapped by default-backend. If it does not work, it is not a problem in the controller-code. I think another user reported that the controller was not using EndpointSlices but was using DNS name and failing. This is also is not a problem with controller code as controller does not write any DNS client code
  • If you want to use default-backend as a URL outside the cluster, then we don't test or support it. Simple. Clear. It will not change due to unknown security risks

Practically, you can try to create default-backend in each namespace where you use a ingress resource. It means deploying same workload many times. But default-backend can be very very light. Our example custom-backend is very very small.

  • If you want to trap 4XX but not trap 5XX, there is no solution AFAIK. BUT, maybe you can try the example I linked from docs and only custom-errors for 4XX and not 5XX and see what happens. I may test it myself in future

@chessman
Copy link
Contributor Author

By "wrong" here I meant how default backend is created:
d3cc0a8#diff-4198ec010671801881244a8052177f31bcbc682c99fbd7391bceb136025c0568R938-R940
As you can see, it uses DeepCopy to make an exact copy of upstream (ingress). Then it replaces name and endpoints to default backend ones but forgets to set the correct service, so the further behavior might be incorrect because it depends on the wrong service. It can lead to other bugs, which are completely unrelated to what we discussed. If the code is using the service reference, then its behavior is incorrect because the service is wrong. I'm not familiar with the code to predict all problems but you can imagine: if it uses a service name - name is wrong, if it needs a service type - type is wrong, and so on.

@longwuyuan
Copy link
Contributor

I am not a developer so I could be talking crap here.
If one endpoint can be used as destination for routing, then what is the requirement to "SET" a service ?

@longwuyuan
Copy link
Contributor

you want to "SET" a service because the ExternalName destination is not a EndPoint obtained from the EndpointSlice that the controller looks at ?

@chessman
Copy link
Contributor Author

The endpoints are correct, they are returned from here https://github.com/kubernetes/ingress-nginx/blob/main/internal/ingress/controller/endpointslices.go#L55-L77
This code constructs a virtual Endpoint from ExternalName.
I need to set the correct service so that the lua part knows that it needs to resolve the name in the Endpoint https://github.com/kubernetes/ingress-nginx/blob/main/rootfs/etc/nginx/lua/balancer.lua#L112-L114

@longwuyuan
Copy link
Contributor

And if the name in the endpoint is a name that is internal to the cluster and only resolved by CoreDNS in the cluster, then the lua_balancer is not using CoreDNS as nameserver but some other nameserver outside the cluster ?

@chessman
Copy link
Contributor Author

Sorry, this I don't get. I tested after applying the patch, lua_balancer can resolve the internal name. Without the patch, lua_balancer is not trying to resolve.

@longwuyuan
Copy link
Contributor

then back to original question, why do you need to "SET" service ? The error messages you posted are ipv6 related or the unlikely ipv4 address 0.0.0.1 ?

If you "SET" service, then do you suddenly get ipv6 or a different address as compared to 0.0.0.1 ?

@longwuyuan
Copy link
Contributor

longwuyuan commented Oct 11, 2024

In any case, the basic idea of connecting to another namespace is a non starter. The project will not even discuss allowing cross-namespace, as it is literally a CVE.

But all these are my opinions, I have no more input. Please wait for other comments.

@chessman
Copy link
Contributor Author

then back to original question, why do you need to "SET" service ? The error messages you posted are ipv6 related or the unlikely ipv4 address 0.0.0.1 ?

If you "SET" service, then do you suddenly get ipv6 or a different address as compared to 0.0.0.1 ?

If I don't set the service, lua code passes the address "as is" to https://github.com/openresty/lua-resty-core/blob/master/lib/ngx/balancer.lua, which does not expect FQDN, so it tries to interpret it as IPv6. 0.0.0.1 comes from there as well.

If I set the service, lua wrapper resolves the address before passing it to ngx.balancer.

@longwuyuan
Copy link
Contributor

Similar questions here #12173

@longwuyuan
Copy link
Contributor

/close

Closing this because this topic i sbeing discussed in #12173

@k8s-ci-robot
Copy link
Contributor

@longwuyuan: Closing this issue.

In response to this:

/close

Closing this because this topic i sbeing discussed in #12173

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-priority needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.
Projects
3 participants