Does it make sense to use ServiceAccounts for custom resources in the context of workload identity? #131740

Open
matheuscscp opened this issue May 13, 2025 · 24 comments
Labels
sig/auth Categorizes an issue or PR as relevant to SIG Auth. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@matheuscscp

matheuscscp commented May 13, 2025

Hi 👋

In CNCF Flux we have this RFC.

The TL;DR for this RFC: we plan to introduce spec.serviceAccountName fields for our custom resources. We want to issue JWTs through the TokenRequest API and exchange those JWTs for access tokens in the Security Token Services of cloud providers. In this context, we do not bind the JWT to any Pod or Node, because that would not make sense here. The only candidate would be the Flux controller's own Pod or Node, but the TokenRequest API rejects such requests: the Flux controller Pod does not use the ServiceAccount specified in the CR object. It uses its own ServiceAccount, which we ship in the Flux distribution, and that is the one it should use.

This works, I have tested it thoroughly for EKS, AKS and GKE (and for kind clusters as well!).

Is this an abuse, though? Does Kubernetes agree with this approach?
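
For readers unfamiliar with the API, the request described above can be sketched as the following TokenRequest body, which a controller would POST to the `serviceaccounts/token` subresource of the ServiceAccount named in the CR. This is a minimal illustration, not Flux's implementation: the audience is a placeholder that depends on how the cloud provider's STS is configured, and `boundObjectRef` is left out because the token is not bound to a Pod, Node, or Secret. The API server rejects `expirationSeconds` values below 600 (ten minutes), which limits how short-lived such a token can be.

```yaml
# Sketch only. Request body for:
#   POST /api/v1/namespaces/<namespace>/serviceaccounts/<name>/token
apiVersion: authentication.k8s.io/v1
kind: TokenRequest
spec:
  audiences:
  - sts.example.com        # assumption: placeholder audience expected by the cloud STS
  expirationSeconds: 600   # the API server does not accept less than 10 minutes
```

A roughly equivalent manual check is `kubectl create token <name> --namespace <namespace> --audience sts.example.com --duration 10m`.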

@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 13, 2025
@matheuscscp
Author

/sig auth

@k8s-ci-robot k8s-ci-robot added sig/auth Categorizes an issue or PR as relevant to SIG Auth. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels May 13, 2025
@enj enj moved this to Needs Triage in SIG Auth May 13, 2025
@liggitt
Member

liggitt commented May 13, 2025

If you want to bind a token to an object to allow revocation without deleting the service account, you can bind it to a Secret object. You could set up a Secret with an OwnerReference pointing to your custom resource, so that when your custom resource is deleted, garbage collection would automatically delete the Secret as well, then bind the token to that secret.
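
The suggestion above could look roughly like the two objects below. This is a sketch under stated assumptions: every name, the namespace, the owning CR's API version, and both UIDs are hypothetical placeholders that a controller would fill in from the live objects. The Secret carries no data; it exists only as a revocation handle, and it must live in the same namespace as the custom resource that owns it.

```yaml
# Sketch only: names, namespace, and UIDs are hypothetical placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: my-bucket-token-binding
  namespace: tenant-a
  ownerReferences:
  - apiVersion: source.toolkit.fluxcd.io/v1     # assumption: API version of the owning CR
    kind: Bucket
    name: my-bucket
    uid: 00000000-0000-0000-0000-000000000000   # UID of the owning custom resource
---
# Body of the TokenRequest sent to the serviceaccounts/token subresource.
apiVersion: authentication.k8s.io/v1
kind: TokenRequest
spec:
  audiences:
  - sts.example.com                             # assumption: placeholder audience
  expirationSeconds: 600
  boundObjectRef:
    apiVersion: v1
    kind: Secret
    name: my-bucket-token-binding
    uid: 11111111-1111-1111-1111-111111111111   # UID of the Secret above
```

Once garbage collection deletes the Secret along with its owner, the API server stops accepting any token bound to it.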

@matheuscscp
Author

Thanks! That's useful information 👍

In our case the idea is for the tokens to be short-lived anyway, we exchange them for cloud access tokens right away and never use that KSA JWT again, so it's used only once. I think we're good with setting a very short expiration 👍

My question is more about: Is it okay to use ServiceAccounts for Custom Resources? I heard ServiceAccounts are only meant to be used by Pods. Is that accurate?

@stealthybox
Member

secret-bound tokens are interesting -- what other users of them exist?

@liggitt
Member

liggitt commented May 13, 2025

> Is it okay to use ServiceAccounts for Custom Resources?

It's not clear what you mean by that... a declarative resource doesn't use something, a program does :)

It's fine for a program to obtain / use service account credentials... plenty of things that aren't necessarily pods do that. For example, kube-controller-manager creates a service account and credential for every controller loop it runs.

@liggitt
Member

liggitt commented May 13, 2025

> secret-bound tokens are interesting -- what other users of them exist?

a few examples: https://github.com/search?q=language%3AGo+%2F%28%3F-i%29%22Secret%22%2F+%22BoundObjectRef%3A%22+-path%3Atest&type=code

@matheuscscp
Author

> It's fine for a program to obtain / use service account credentials... plenty of things that aren't necessarily pods do that. For example, kube-controller-manager creates a service account and credential for every controller loop it runs.
>
> a few examples: https://github.com/search?q=language%3AGo+%2F%28%3F-i%29%22Secret%22%2F+%22BoundObjectRef%3A%22+-path%3Atest&type=code

Thanks very much, that's exactly what I wanted to hear!

> It's not clear what you mean by that... a declarative resource doesn't use something, a program does :)

What I mean is that my controller will do what I said in the issue description: it will use the ServiceAccount on behalf of the reconciliation of the CR object. You can think of it as impersonation, which we have done for a long time in Flux's kustomize-controller. When reconciling a Flux Kustomization object we allow users to set spec.serviceAccountName, and kustomize-controller then impersonates this SA when reconciling that object and doing server-side apply against the Kubernetes API. The difference is that in this case the controller Pod needs different RBAC, something like this:

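# Lets the controller impersonate ServiceAccounts in any namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: flux-impersonate-serviceaccounts  # hypothetical name
rules:
- apiGroups: [""]  # core API group
  resources:
  - serviceaccounts
  verbs:
  - impersonate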

While in this new feature from the RFC we are impersonating a cloud provider identity, and to do that we need to call TokenRequest explicitly, requiring this RBAC instead:

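# Lets the controller call TokenRequest for ServiceAccounts in any namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: flux-create-serviceaccount-tokens  # hypothetical name
rules:
- apiGroups: [""]  # core API group
  resources:
  - serviceaccounts/token
  verbs:
  - create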

One final question: Since we use the JWT immediately and only once, is it okay to simply set a very short expiration and avoid the overhead of creating a Secret?

@stealthybox
Member

> a declarative resource doesn't use something, a program does

The lines are blurred here with Pods, SAs, and the Kubelet.
Pods within the same namespace declare that they should use an SA credential.
In this case, the Kubelet is the program that promises a Pod-bound token be generated by kube-apiserver for it.

I don't believe RBAC in its current state can do anything with the pod claim within Pod-bound KSA JWTs.
It doesn't look like it shows up in UserInfo via AdmissionReview. It's fuzzy in my head right now what exactly within Kubernetes is validating or constraining behavior on this Pod claim within the SA JWT.

Our use-case is trading SA tokens for cloud-credentials.
With AWS IRSA, Azure Workload Identity, GKE Workload Identity, and Federated Workload Identity on GCP, the services are currently happy to perform that trade for something that can either impersonate a cloud identity or be bound to the cloud's IAM directly.

However, EKS Pod Identity is semantically and behaviorally only binding to and trading for Pod-bound KSA JWTs, and this shows how strongly coupled the identity is.
If other cloud providers follow that pattern, what we build into Flux may break in the future.

Historically, it used to be possible for controllers to steal the ServiceAccount Secret in a namespace (intended for Pods) to do other things, but we avoided that hack in Flux and instead opted for impersonation directly with the kube API.
When SA Secrets were deprecated, it didn't affect us.

Basically, our concern is: if we generate ServiceAccount tokens this way, will the ecosystem follow Kubernetes' direction of pod-bound workload identity and further constrain what is allowed? This is an ecosystem question as much as it is a project architecture/guidance question.

We would happily carve out a different section of the User namespace if there were Kubernetes APIs to issue those tokens and cloud APIs ready to trade for them, but TokenRequest is purely for ServiceAccounts. This maybe works with CertificateSigningRequest, but it's a different vehicle.

@stealthybox
Member

Thank you for having this async convo -- there are so many details and nuances here, and I'm sure this takes non-trivial energy among other priorities.

@liggitt
Member

liggitt commented May 13, 2025

#131740 (comment)
> What I mean is that my controller will do what I said in the issue description: it will use the ServiceAccount on behalf of the reconciliation of the CR object. You can think of it as impersonation, which we have done for a long time in Flux's kustomize-controller. When reconciling a Flux Kustomization object we allow users to set spec.serviceAccountName, and kustomize-controller then impersonates this SA when reconciling that object and doing server-side apply against the Kubernetes API. The difference is that in this case the controller Pod needs different RBAC, something like this:
>
> apiVersion: rbac.authorization.k8s.io/v1
> kind: ClusterRole
> rules:
> - apiGroups: [""]
>   resources:
>   - serviceaccounts
>   verbs:
>   - impersonate

The permissions required to let a controller either impersonate or obtain credentials for arbitrary service accounts are sort of scary... even kube-controller-manager doesn't get credentials for arbitrary service accounts, but has fixed ones it sets up for each controller loop. If a user can create a CR, specify an arbitrary service account, and then get flux to impersonate or obtain credentials for it and act as it, and flux is granted broad permissions so it can do that, that seems like a good attack vector for a confused deputy attack.

> Since we use the JWT immediately and only once, is it okay to simply set a very short expiration and avoid the overhead of creating a Secret?

If you have a reconciling controller, would it be creating a new token and using it every time through the sync loop?

#131740 (comment)
> I don't believe RBAC in its current state can do anything with the pod claim within Pod-bound KSA JWTs.
> It doesn't look like it shows up in UserInfo via AdmissionReview. It's fuzzy in my head right now what exactly within Kubernetes is validating or constraining behavior on this Pod claim within the SA JWT.

The claims are validated at authentication time, and if the bound pod or secret is missing, the token is treated as invalid and the request rejected.

If the token is valid, the pod and associated node info is then copied into extra keys in the user info, which is visible to admission. See https://kubernetes.io/docs/reference/access-authn-authz/service-accounts-admin/#additional-metadata-in-pod-bound-tokens as an example:

    extra:
      authentication.kubernetes.io/credential-id:
      - JTI=7ee52be0-9045-4653-aa5e-0da57b8dccdc
      authentication.kubernetes.io/node-name:
      - kind-control-plane
      authentication.kubernetes.io/node-uid:
      - 497e9d9a-47aa-4930-b0f6-9f2fb574c8c6
      authentication.kubernetes.io/pod-name:
      - test-pod
      authentication.kubernetes.io/pod-uid:
      - e87dbbd6-3d7e-45db-aafb-72b24627dff5

https://github.com/kubernetes/kubernetes/blob/master/test/e2e/auth/e2edata/per_node_validatingadmissionpolicy.yaml is an example of a validating admission policy making use of that node name.
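
To make the point about admission concrete, here is a hedged sketch of a policy that reads those extra keys. It is not the linked e2e policy: the policy and binding names, the tenant-a namespace, and the choice of ConfigMaps as the guarded resource are all hypothetical. It rejects writes from ServiceAccounts in that namespace unless the presented token is Pod-bound, which is exactly the property a token issued without a boundObjectRef would lack.

```yaml
# Sketch only: names, namespace, and the guarded resource are hypothetical.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: require-pod-bound-token.example.com
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: [""]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["configmaps"]
  matchConditions:
  # Only evaluate requests made by ServiceAccounts of the tenant namespace.
  - name: tenant-service-accounts
    expression: "request.userInfo.username.startsWith('system:serviceaccount:tenant-a:')"
  validations:
  # Pod-bound tokens carry the pod-name key in the user info extras.
  - expression: "has(request.userInfo.extra) && 'authentication.kubernetes.io/pod-name' in request.userInfo.extra"
    message: "requests from tenant-a ServiceAccounts must use a Pod-bound token"
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: require-pod-bound-token-binding.example.com
spec:
  policyName: require-pod-bound-token.example.com
  validationActions: ["Deny"]
```

The policy has no effect until a binding such as the second object exists.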

> Basically, our concern is: if we generate ServiceAccount tokens this way, will the ecosystem follow Kubernetes' direction of pod-bound workload identity and further constrain what is allowed? This is an ecosystem question as much as it is a project architecture/guidance question.

For workload identity specifically, tying to pods / nodes makes more sense because those workloads are typically running on nodes, and it is more sensible to want to know which node a credential came from and possibly connect it to or intersect it with cloud permissions related to that node.

@matheuscscp
Author

matheuscscp commented May 13, 2025

Our idea is to use workload identity for CRs. Let's say you have a GCS bucket and you want to use it with the Flux Bucket API. Instead of relying on the GCP identity associated with Flux's source-controller Pod, we want to allow you to set spec.serviceAccountName in the Bucket object and source-controller will use that to get a GCP access token for an identity that is only meant for this Flux object. Does this sound reasonable?

This is definitely possible right now, I have tested it many times. Is this wrong, though? Will Kubernetes make this impossible in the future?
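
As an illustration of the Bucket use case described above, the pair of objects might look like the sketch below. The `spec.serviceAccountName` field is the one proposed in the RFC; everything else (object names, namespace, project, bucket name, and the Bucket API version) is a hypothetical placeholder. The annotation shown is the one GKE Workload Identity uses to link a Kubernetes ServiceAccount to a GCP service account.

```yaml
# Sketch only: names, namespace, project, and bucket are hypothetical placeholders.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: bucket-reader
  namespace: tenant-a
  annotations:
    iam.gke.io/gcp-service-account: bucket-reader@example-project.iam.gserviceaccount.com
---
apiVersion: source.toolkit.fluxcd.io/v1   # assumption: Bucket API version
kind: Bucket
metadata:
  name: my-bucket
  namespace: tenant-a
spec:
  provider: gcp
  bucketName: example-bucket
  endpoint: storage.googleapis.com
  interval: 5m
  serviceAccountName: bucket-reader       # field proposed in the RFC
```

With this layout, the GCP identity only needs read access to one bucket, rather than source-controller's own identity needing access to every tenant's buckets.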

@liggitt
Member

liggitt commented May 13, 2025

> Instead of relying on the GCP identity associated with Flux's source-controller Pod, we want to allow you to set spec.serviceAccountName in the Bucket object and source-controller will use that to get a GCP access token for an identity that is only meant for this Flux object. Does this sound reasonable?

I don't know enough about flux to know if it is assumed that whoever has write access to any flux-managed CRs in a namespace should have access to whatever any service account in the namespace can do. If so, that might be reasonable. If not, letting the user tell flux to use a random service account in the namespace to do things seems questionable.

> This is definitely possible right now, I have tested it many times. Is this wrong, though? Will Kubernetes make this impossible in the future?

The Kubernetes service account --> cloud credential integrations are added on top of Kubernetes by cloud providers ... there's nothing in Kubernetes directly that would prevent that, but I wouldn't assume you can go directly from kubernetes service account to cloud identity credential... as mentioned earlier, most of those seem to tie into workload identity, and I wouldn't be surprised if node / pod info was assumed / required for that.

@stefanprodan

stefanprodan commented May 13, 2025

> I don't know enough about flux to know if it is assumed that whoever has write access to any flux-managed CRs in a namespace should have access to whatever any service account in the namespace can do. If so, that might be reasonable.

Flux allows cluster admins to assign a namespace to a Git repository with Kubernetes manifests. The assumption is that the team that has write access to the repo is the admin of the namespace. Flux is designed to impersonate any identity in that namespace to perform operations against the Kubernetes API and cloud APIs. We assume that all the identities (SAs, IAM bindings) in a namespace belong to a single entity (the team with write access to the repo).

@stealthybox
Member

> If a user can create a CR, specify an arbitrary service account, and then get flux to impersonate or obtain credentials for it and act as it, and flux is granted broad permissions so it can do that, that seems like a good attack vector for a confused deputy attack.

Flux always namespaces the SA to the same NS as the CR.
However, this is still dangerous, and it does cause confused deputy problems.
Example: fluxcd/helm-controller#498

One of the reasons we chose the serviceaccounts resource instead of users is so that somebody could not compromise the controller and then start acting as someone's gmail user or other external OIDC identity.

ResourceNames is omitted so that users do not have to modify the Flux RBAC to have self-service multi-tenancy, but in hardened installations, cluster administrators should restrict the impersonation permissions of Flux via resourceNames, similar to the kube-controller-manager.
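
A hardened setup along those lines could be sketched as a namespaced Role that names the allowed ServiceAccounts explicitly. The Role and binding names, the tenant-a namespace, and the bucket-reader ServiceAccount are hypothetical; the subject assumes a default Flux install where kustomize-controller runs under a ServiceAccount of the same name in flux-system.

```yaml
# Sketch only: names and namespaces are assumptions, not Flux's shipped RBAC.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: flux-impersonate-tenant-sa
  namespace: tenant-a
rules:
- apiGroups: [""]
  resources:
  - serviceaccounts
  verbs:
  - impersonate
  resourceNames:
  - bucket-reader            # only this ServiceAccount may be impersonated
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: flux-impersonate-tenant-sa
  namespace: tenant-a
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: flux-impersonate-tenant-sa
subjects:
- kind: ServiceAccount
  name: kustomize-controller # assumption: default Flux controller ServiceAccount
  namespace: flux-system
```

The trade-off is the one noted above: each new tenant ServiceAccount then requires an RBAC change by a cluster administrator.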

@stealthybox
Member

Off-topic, on the history of why Flux uses kube SAs:

We originally considered impersonating usernames in the form of system:fluxcd:helmrelease:namespace:name but decided against it because:

  • it adds a non-standard namespace under the Kubernetes system: prefix
  • RoleBinding subjects would not be namespace-relative like ServiceAccounts
  • we can't prefix-pattern match for system:fluxcd:.* resourceNames on the impersonate verb (this was years ago, but we could probably accomplish this now with CEL and dynamic admission)

@stealthybox
Member

It's not a feature that exists today, but it would be nice if we could TokenRequest for CustomResource-bound tokens.
This could form a basis for fine-grained workload/reconciler identity for controllers.
This would immediately be available in AdmissionReview (we could write a policy that Flux can only change resources when using a HelmRelease- or Kustomization-bound token).
Cloud IAM implementations could opt in to supporting it formally.

@stealthybox
Member

Ideally, the token username is scoped to the object's name itself (system:fluxcd:helmrelease:namespace:name).
Azure, AWS IAM, and GCP federation support binding to these kinds of specifics to varying degrees, but there's already a whole ecosystem that's binding cloud identities to ServiceAccounts and not just usernames.

@liggitt
Member

liggitt commented May 13, 2025

> It's not a feature that exists today, but it would be nice if we could TokenRequest for CustomResource-bound tokens.
> This could form a basis for fine-grained workload/reconciler identity for controllers.

Supporting arbitrary types as boundObjectRefs was a non-goal... that requires authentication to either keep informer-fed watches or request arbitrary APIs at authentication time to validate the bindings.

@matheuscscp
Author

As explained above, impersonating a cloud identity from a Flux controller on behalf of the reconciliation of a Flux CR object is completely acceptable in the Flux multi-tenancy model. We would like to seek (continued!) support from the cloud providers to enable this use case. I know Kubernetes does not control what cloud providers do, but I expect they would at least like to be informed about what Kubernetes thinks about this subject. We will try to talk to the cloud providers about this. If we show them this thread, should they conclude that Kubernetes supports our use case or not?

Just to be clear once again about the use case: we want to impersonate a cloud identity using a ServiceAccount defined on the CR object.

To give an example, external-secrets has had this feature for a while:

https://external-secrets.io/latest/provider/azure-key-vault/

https://external-secrets.io/latest/provider/aws-secrets-manager/

https://external-secrets.io/latest/provider/google-secrets-manager/

We just want to offer the same.

@stealthybox
Member

cross-posting @enj's comment about Azure Workload Identity for posterity:

"Future iterations of workload identity between AKS <-> Azure will be far more tightly constrained to pods, nodes, networks, TPM, etc so I would not recommend relying on the current lax STS semantics" - @enj Yesterday at 11:43 PM

@matheuscscp
Author

matheuscscp commented May 14, 2025

Then I hope Azure rolls out the iteration without breaking changes, like EKS did:

IRSA -> Pod Identity

IRSA still works.

Setup of Azure WI today is identical to IRSA: you have to create the OIDC provider yourself, using the issuer URL of the cluster.

@matheuscscp
Author

matheuscscp commented May 14, 2025

I'm just thinking about how we can improve the security/least privilege principle here while still giving the best possible experience to Kubernetes users when consuming cloud provider services.

  • Users configure SAs for Pods to talk to the Kubernetes API.
  • Users configure SAs for Pods to talk to cloud APIs.
  • Users configure SAs for CRs (the respective controller) to talk to the Kubernetes API.

Why should we teach Kubernetes users to configure a different thing for CRs (the respective controller) to talk to cloud APIs?

I thought SAs were the common tongue for authenticating workloads in Kubernetes. A controller reconciling a CR is a workload too, a special kind, but still a workload. Why should it not be treated as a workload? Controllers usually have powerful RBAC, that's completely expected e.g. in Flux. It should be okay for a controller to impersonate the SAs users configure in the CRs, including outside the cluster if there's a link between the SA and an external identity.

The goal of the approach proposed here is to give users a way to have fine-grained permissions for their namespaces in a way that is consistent with what they already understand about workload identity. The current alternative is giving the controller Pod access to the resources belonging to all namespaces/tenants, creating a cloud identity that is too powerful. I think the best solution would be one that allows cloud identities with fewer permissions to be created and assigned to the respective tenants that need them, while not increasing the cognitive load on Kubernetes users to configure such things.

I'd be really happy to hear what is wrong with this goal.

@ibihim ibihim moved this from Needs Triage to In Progress in SIG Auth May 19, 2025
@ibihim
Contributor

ibihim commented May 19, 2025

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 19, 2025
@matheuscscp
Author

Hey @enj @liggitt @aramase @deads2k @ahmedtd, I'd be immensely grateful if you could please share your thoughts here:

fluxcd/flux2#5359

@matheuscscp matheuscscp changed the title Does it make sense to use ServiceAccounts for custom resources? Does it make sense to use ServiceAccounts for custom resources in the context of workload identity? May 22, 2025
Projects
Status: In Progress
Development

No branches or pull requests

6 participants