GCP Model Engine Support by arniechops · Pull Request #782 · scaleapi/llm-engine

arniechops · 2026-03-18T16:10:13Z

Pull Request Summary

What is this PR changing? Why is this change being made? Any caveats you'd like to highlight? Link any relevant documents, links, or screenshots here if applicable.

Test Plan and Usage Guide

How did you validate that your PR works correctly? How do you run or demo the code? Provide enough detail so a reviewer can reasonably reproduce the testing procedure. Paste example command line invocations if applicable.

Greptile Summary

This PR adds GCP Workload Identity Federation (WIF) support to the Helm chart by annotating both Kubernetes Service Accounts with the required iam.gke.io/gcp-service-account annotation.

Key changes:

service_account.yaml: Fixes the annotation guard so the annotations: block renders whenever any of $annotations, .Values.azure, or .Values.gcp is set — correctly addressing the previously identified bug where the GCP annotation was silently dropped if no base annotations were configured.
service_account_inference.yaml: Adds GCP WIF support for the inference SA with a sensible two-level lookup: gcp.inference_service_account takes priority, falling back to the shared gcp.iam_service_account. The block is placed inside the existing {{- with $annotations }} scope, which is safe since the outer if guard on line 1 already requires serviceTemplate.serviceAccountAnnotations to be present.

One asymmetry to be aware of: the service_account_inference.yaml outer if still requires serviceTemplate.serviceAccountAnnotations to render the inference SA at all. GCP-only deployments without any base annotations to set on the inference SA will need to supply a non-empty serviceAccountAnnotations (e.g., a placeholder) to enable the inference SA. This is a pre-existing structural constraint — not introduced by this PR — but worth documenting in values files for GCP deployers.

Confidence Score: 4/5

This PR is safe to merge; the main SA annotation fix is correct and the inference SA changes are logically sound, with one pre-existing usability caveat documented above.
The previously reported critical bug (GCP annotation silently dropped on the main SA) is properly fixed. The inference SA GCP block is correctly placed and uses a sensible fallback. The only concern is a pre-existing structural constraint in the inference SA outer guard that may require a non-obvious workaround for pure-GCP deployments — but this is not a regression introduced by this PR. Score reflects a well-targeted fix with no new logic bugs.
No files require special attention — both templates look correct.

Important Files Changed

Filename	Overview
charts/model-engine/templates/service_account.yaml	Fixes the previously reported GCP annotation silently-dropped bug by restructuring the outer condition to `{{- if or $annotations .Values.azure .Values.gcp }}`, ensuring the `annotations:` block renders whenever GCP, Azure, or base annotations are configured — regardless of whether base annotations are present.
charts/model-engine/templates/service_account_inference.yaml	Adds GCP Workload Identity annotation to the inference SA, with a sensible fallback from `gcp.inference_service_account` → `gcp.iam_service_account`. The block is correctly placed inside the existing `{{- with $annotations }}` scope, which is always truthy when reached due to the outer guard requiring `serviceAccountAnnotations`.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Helm Template Render] --> B{service_account.yaml}
    A --> C{service_account_inference.yaml}

    B --> D{"or $annotations\n.Values.azure\n.Values.gcp"}
    D -- "true" --> E[Render annotations block]
    D -- "false" --> F[No annotations key]

    E --> G{"$annotations set?"}
    G -- "yes" --> H[Render base annotations]
    G -- "no" --> I[Skip]

    E --> J{".Values.azure?"}
    J -- "yes" --> K["azure.workload.identity/client-id"]
    J -- "no" --> L[Skip]

    E --> M{".Values.gcp?"}
    M -- "yes" --> N{".Values.gcp.iam_service_account?"}
    N -- "yes" --> O["iam.gke.io/gcp-service-account\n= iam_service_account"]
    N -- "no" --> P[Skip GCP annotation]
    M -- "no" --> Q[Skip]

    C --> R{"outer if: serviceTemplate\n+ createServiceAccount\n+ serviceAccountAnnotations\n+ serviceAccountName\n+ endpoint_namespace"}
    R -- "ALL true" --> S["{{- with $annotations }}"]
    R -- "any false" --> T[No SA rendered]

    S --> U[Render base annotations]
    S --> V{".Values.gcp?"}
    V -- "yes" --> W{".Values.gcp.inference_service_account?"}
    W -- "yes" --> X["iam.gke.io/gcp-service-account\n= inference_service_account"]
    W -- "no" --> Y{".Values.gcp.iam_service_account?"}
    Y -- "yes" --> Z["iam.gke.io/gcp-service-account\n= iam_service_account"]
    Y -- "no" --> AA[Skip]
    V -- "no" --> AB[Skip]

_{Reviews (2): Last reviewed commit: "fix" | Re-trigger Greptile}

Add GCP Workload Identity annotation support to both main and inference service accounts, following the same pattern as Azure PR #762. Changes: - service_account.yaml: Add iam.gke.io/gcp-service-account annotation using gcp.iam_service_account value - service_account_inference.yaml: Add iam.gke.io/gcp-service-account annotation with fallback from gcp.inference_service_account to gcp.iam_service_account (allows separate SA for inference pods) This enables proper GCP Workload Identity binding for model-engine pods on GKE clusters. Implements SGPINF-1123

charts/model-engine/templates/service_account.yaml

arniechops force-pushed the arnavchopra/gcp-model-engine-support branch from a84ead4 to 28df380 Compare March 18, 2026 16:12

greptile-apps bot reviewed Mar 18, 2026

View reviewed changes

charts/model-engine/templates/service_account.yaml Show resolved Hide resolved

fix

4709967

lilyz-ai approved these changes Mar 24, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GCP Model Engine Support#782

GCP Model Engine Support#782
arniechops wants to merge 2 commits intomainfrom
arnavchopra/gcp-model-engine-support

arniechops commented Mar 18, 2026 •

edited by greptile-apps bot

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

arniechops commented Mar 18, 2026 • edited by greptile-apps bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Pull Request Summary

Test Plan and Usage Guide

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Flowchart

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

arniechops commented Mar 18, 2026 •

edited by greptile-apps bot

Loading