CKS cluster goes into alert state when worker node is unmanaged #12783

@kiranchavala

problem

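A CKS cluster goes into the Alert state when one of its worker nodes is unmanaged. Deleting the cluster, stopping it, and adding the node back then all fail.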

versions

ACS 4.22, KVM hypervisor

The steps to reproduce the bug

  1. Create a K8s cluster with 1 controller and 3 workers

  2. Make sure the CKS cluster is in Running state

  3. Navigate to one of the worker instances and unmanage it

  4. The CKS cluster goes into Alert state

  5. Delete the CKS cluster; the following error is issued

Cannot invoke "com.cloud.vm.VMInstanceVO.getBackupOfferingId()" because "vm" is null

logs

2026-03-10 11:39:56,869 ERROR [c.c.a.ApiAsyncJobDispatcher] (API-Job-Executor-55:[ctx-53050b87, job-73]) (logid:755e27ea) Unexpected exception while executing org.apache.cloudstack.api.command.user.kubernetes.cluster.DeleteKubernetesClusterCmd java.lang.NullPointerException: Cannot invoke "com.cloud.vm.VMInstanceVO.getBackupOfferingId()" because "vm" is null
        at com.cloud.kubernetes.cluster.KubernetesClusterManagerImpl.checkIfVmsAssociatedWithBackupOffering(KubernetesClusterManagerImpl.java:2018)

  6. Stop the CKS cluster; the following error is issued

Failed to find all VMs in Kubernetes cluster :

logs

2026-03-10 11:40:47,436 ERROR [c.c.k.c.a.KubernetesClusterStopWorker] (API-Job-Executor-56:[ctx-37106f8f, job-74, ctx-f1370fcc]) (logid:eeffc52f) Failed to find all VMs in Kubernetes cluster : test

  7. Import the unmanaged instance back using the API, selecting the same CKS network

https://cloudstack.apache.org/api/apidocs-4.22/apis/importUnmanagedInstance.html

  8. Add the imported worker node instance back to the CKS cluster; the operation fails

logs

2026-03-10 12:52:22,913 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-61:[ctx-3eab1e04, job-81]) (logid:fba281a3) Complete async job-81, jobStatus: FAILED, resultCode: 530, result: org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":"530","errortext":"Failed to add nodes to cluster ID 1 due to: No valid nodes found to be added to the Kubernetes cluster"}
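
The delete failure in the steps above is a `NullPointerException` raised inside `checkIfVmsAssociatedWithBackupOffering`, which suggests that the lookup for one of the cluster's VMs returned null, presumably for the unmanaged worker. The following is a minimal sketch of a null-tolerant version of such a check. `VmRecord`, `BackupOfferingCheckSketch`, and the `lookup` function are hypothetical stand-ins for illustration, not the actual CloudStack classes, and the real method at `KubernetesClusterManagerImpl.java:2018` may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for a VM database row; NOT the real VMInstanceVO.
record VmRecord(long id, Long backupOfferingId) {}

final class BackupOfferingCheckSketch {

    /**
     * Returns the IDs of cluster VMs that have a backup offering assigned.
     * IDs whose lookup returns null (for example, a worker that was
     * unmanaged) are appended to missingVmIds instead of being dereferenced.
     */
    static List<Long> vmsWithBackupOffering(List<Long> clusterVmIds,
                                            Function<Long, VmRecord> lookup,
                                            List<Long> missingVmIds) {
        List<Long> withOffering = new ArrayList<>();
        for (Long vmId : clusterVmIds) {
            VmRecord vm = lookup.apply(vmId);
            if (vm == null) {
                // Record no longer exists: note it and keep going.
                missingVmIds.add(vmId);
                continue;
            }
            if (vm.backupOfferingId() != null) {
                withOffering.add(vm.id());
            }
        }
        return withOffering;
    }
}
```

With a check shaped like this, the caller could decide whether missing VM records should block the delete or only be logged, rather than failing with an unexpected exception.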

Other related CKS Alert issues

#12641
#12633
#11581

What to do about it?

CloudStack should not allow an instance to be unmanaged if it is part of a CKS cluster,

or a CloudStack CKS cluster should support adding a previously unmanaged instance back to the cluster.
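
As a rough illustration of the first option, the unmanage operation could run a guard like the one below before proceeding. This is only a sketch under stated assumptions: `UnmanageGuardSketch` and its VM-to-cluster map are hypothetical, and the actual CloudStack code would need to consult its own cluster-to-VM mapping.

```java
import java.util.Map;

final class UnmanageGuardSketch {
    // Hypothetical index: VM ID -> name of the CKS cluster that owns it.
    private final Map<Long, String> cksClusterByVmId;

    UnmanageGuardSketch(Map<Long, String> cksClusterByVmId) {
        this.cksClusterByVmId = cksClusterByVmId;
    }

    /** Throws if the VM is a node of a CKS cluster; otherwise returns normally. */
    void checkCanUnmanage(long vmId) {
        String clusterName = cksClusterByVmId.get(vmId);
        if (clusterName != null) {
            throw new IllegalStateException(
                "Instance " + vmId + " is part of CKS cluster '" + clusterName
                + "' and cannot be unmanaged");
        }
    }
}
```

Rejecting the request up front would keep the cluster out of the Alert state, at the cost of requiring the node to be removed from the cluster before it can be unmanaged.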
