AKS HTTP Application Routing issues with newer Kubernetes Ingress

Note: It turns out that this also applies if you run Application Gateway Ingress Controller (AGIC) version 1.4.0 and use v1 Ingress resources. The exact same solution applies, but with the AGIC service account instead of the one used in this guide. The issue is corrected in v1.5.0-rc1.
Note 2: This is not the only way of fixing this. You could also download the chart and add the correct rights there, but the approach I describe leaves the addon's own resources untouched and only adds the missing rights on top.

I was working with a client that used the HTTP Application Routing addon for AKS, which basically just creates a DNS zone with a fancy generated domain and an NGINX Ingress Controller. Obviously not what you want for production workloads, but it's great if you're creating test deployments and just want an easy Ingress that is completely automatic.

Well, it would be great if we could get it to work. I had never used this addon before, and when I read through the docs there were no special steps needed, but for some reason it was not working.

It turns out the Service Account used to update some of the internal components did not have enough access. I tried both Azure RBAC and normal Kubernetes RBAC, but it still did not work. When you create a new Ingress, it should get updated with the external IP pretty quickly, but that never happened:

I had a couple of Ingresses that had existed for almost an hour with no address set. The class column was also empty, but that was only because the class was set through annotations.

When I checked the logs for the Ingress Controller, I saw these messages:

W0920 17:43:32.171988       7 status.go:288] error updating ingress rule: ingresses.networking.k8s.io "cm-acme-http-solver-99lwz" is forbidden: User "system:serviceaccount:kube-system:addon-http-application-routing-nginx-ingress-serviceaccount" cannot update resource "ingresses/status" in API group "networking.k8s.io" in the namespace "ris-pullrequest1628": Azure does not have opinion for this user.
W0920 17:43:32.172332       7 status.go:288] error updating ingress rule: ingresses.networking.k8s.io "cm-acme-http-solver-slr4j" is forbidden: User "system:serviceaccount:kube-system:addon-http-application-routing-nginx-ingress-serviceaccount" cannot update resource "ingresses/status" in API group "networking.k8s.io" in the namespace "ris-dev": Azure does not have opinion for this user.
W0920 17:43:32.366825       7 status.go:288] error updating ingress rule: ingresses.networking.k8s.io "cm-acme-http-solver-5ffk2" is forbidden: User "system:serviceaccount:kube-system:addon-http-application-routing-nginx-ingress-serviceaccount" cannot update resource "ingresses/status" in API group "networking.k8s.io" in the namespace "rstest01": Azure does not have opinion for this user.
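
To pull only these RBAC denials out of a noisy controller log, a small filter helps. This is a sketch rather than anything the addon ships: `rbac_denials` is a hypothetical helper name, and the Deployment name in the usage comment is an assumption based on the addon's naming, so check it with `kubectl get deploy -n kube-system` first.

```shell
# Hypothetical helper: read controller log lines on stdin and keep only
# the ones where the API server rejected a request as forbidden.
rbac_denials() {
  grep -F 'is forbidden'
}

# Usage against a live cluster (the Deployment name is an assumption):
#   kubectl logs -n kube-system deploy/addon-http-application-routing-nginx-ingress-controller | rbac_denials
```

Each surviving line names the user, the resource, and the API group that was denied, which is all you need to write the missing rule.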

That led me to the ClusterRole created by Azure for this addon:

The role has the right to update status for Ingresses, but only in the deprecated API group extensions (extensions/v1beta1), whose Ingress API is going to be removed in Kubernetes 1.22. When we are using Cert-Manager, this is a problem, as the newer versions create their Ingress resources for ACME using the stable API version, so even if we updated the code to use the old API group we still wouldn't get it to work.
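
If you want to look at the role on your own cluster, something like the following should do. It is a sketch under assumptions: both function names are hypothetical, and matching on the substring `http-application-routing` assumes the addon's RBAC objects follow the same naming as its service account.

```shell
# List the ClusterRoles and ClusterRoleBindings that look like they belong
# to the addon. Matching by substring is an assumption about their names.
list_addon_rbac() {
  kubectl get clusterroles,clusterrolebindings -o name | grep -F 'http-application-routing'
}

# Print the rules and labels of one object, given in the TYPE/NAME form
# that list_addon_rbac prints.
show_rbac_object() {
  kubectl describe "$1"
}

# Usage against a live cluster:
#   list_addon_rbac
#   show_rbac_object <one of the names printed above>
```

In the rules, check which API groups are listed next to `ingresses` and `ingresses/status`.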

After reading up on the Addon Manager, I saw that the resources carried a label, addonmanager.kubernetes.io/mode=Reconcile. I could remove the label, but that would alter the reconcile process, and in my experience that usually just ends up in unexpected behavior. So what I tried next was creating my own role and role binding that I could add the service account to.

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: addon-http-app-routing-fix
rules:
- apiGroups:
  - "networking.k8s.io"
  resources: 
  - "ingresses/status"
  verbs:
  - "update"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: addon-http-app-routing-fix-clusterrolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: addon-http-app-routing-fix
subjects:
  - kind: ServiceAccount
    name: addon-http-application-routing-nginx-ingress-serviceaccount
    namespace: kube-system
---

This worked pretty well, but one problem remained. The DNS zone didn't get updated based on the Ingresses like it should. When I checked the rights for the service account used for this, it too was referencing the old API group. So I updated the rules to also allow the role to read Ingress objects, and bound the role to that service account as well.

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: addon-http-app-routing-fix
rules:
- apiGroups:
  - "networking.k8s.io"
  resources:
  - "ingresses/status"
  verbs:
  - "update"
- apiGroups:
  - "networking.k8s.io"
  resources:
  - "ingresses"
  verbs:
  - "get"
  - "watch"
  - "list"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: addon-http-app-routing-fix-clusterrolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: addon-http-app-routing-fix
subjects:
  - kind: ServiceAccount
    name: addon-http-application-routing-nginx-ingress-serviceaccount
    namespace: kube-system
  - kind: ServiceAccount
    name: addon-http-application-routing-external-dns
    namespace: kube-system
---
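
To confirm that the binding took effect, `kubectl auth can-i` can impersonate the two service accounts. This is a sketch: it assumes the manifest above has been applied and that your own user is allowed to impersonate service accounts. `verify_fix` and the variable names are hypothetical.

```shell
# Full user names of the two service accounts in the ClusterRoleBinding.
NGINX_SA="system:serviceaccount:kube-system:addon-http-application-routing-nginx-ingress-serviceaccount"
DNS_SA="system:serviceaccount:kube-system:addon-http-application-routing-external-dns"

# Ask the API server whether each account now has the access it was missing.
verify_fix() {
  # The Ingress Controller must be able to update ingresses/status.
  kubectl auth can-i update ingresses.networking.k8s.io --subresource=status \
    --all-namespaces --as="$NGINX_SA"
  # External DNS must be able to read Ingress objects.
  kubectl auth can-i list ingresses.networking.k8s.io \
    --all-namespaces --as="$DNS_SA"
}

# Usage against a live cluster:
#   verify_fix
```

Each check should answer `yes` once the role and binding are in place. A `no` for either account means that account is still missing from the binding or the rule.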

And that solves it. With the removal of the old beta APIs closing in, this should have been solved by now, but apparently not.

Roberth Strand
