2 Fast 2 MCM

2025-01-15

These visual representations help explain the complex workflows within the Machine Controller Manager.

Machine Controller Manager Architecture

- The system consists of three main controllers working in concert
- Each controller handles specific aspects of machine lifecycle management
- Interfaces with both cloud providers and Kubernetes clusters
- Manages the full lifecycle of machines from creation to deletion

Let’s start with an overview of the main components and their interactions:

stateDiagram-v2
    direction TB

    state "Machine Controller Manager" as MCM {
        state "Machine Controller" as MC
        state "Safety Controller" as SC
        state "MCM Controller" as MCMC

        [*] --> MC
        [*] --> SC
        [*] --> MCMC
    }

    state "Cloud Provider" as CP {
        VMs
        API
    }

    state "Kubernetes Cluster" as K8S {
        state "Control Plane" as CP_K8S {
            API_Server
            etcd
        }

        state "Node Components" as NC {
            kubelet
            container_runtime
        }
    }

    MCM --> CP : Manages VMs
    MCM --> K8S : Manages Nodes

    note right of MCM
        Handles:
        - Machine lifecycle
        - Safety checks
        - Deployments/Sets
    end note

Machine Controller Core Flows

Now, let’s dive into the Machine Controller’s core reconciliation flows for different resources. It handles three main types of reconciliation:

- Secret Reconciliation: Manages secrets referenced by MachineClasses
- MachineClass Reconciliation: Handles machine class lifecycle
- Machine Reconciliation: Core machine lifecycle management

---
  config:
    layout: elk
---
stateDiagram-v2
    state "Machine Controller" as MC {
        state "Secret Reconciliation" as SR {
            [*] --> FetchSecret
            FetchSecret --> GetMachineClass
            GetMachineClass --> CheckReferences
            CheckReferences --> FinalizerAdd : Has References
            CheckReferences --> FinalizerRemove : No References
            FinalizerAdd --> [*]
            FinalizerRemove --> [*]
        }

        state "MachineClass Reconciliation" as MCR {
            [*] --> FetchClass
            FetchClass --> GetMachines
            GetMachines --> CheckMachines
            CheckMachines --> AddFinalizer : Has Machines
            CheckMachines --> RemoveFinalizer : No Machines
            AddFinalizer --> EnqueueMachines
            EnqueueMachines --> [*]
            RemoveFinalizer --> [*]
        }

        state "Machine Reconciliation" as MR {
            [*] --> FetchMachine
            FetchMachine --> CheckFrozen

            CheckFrozen --> ValidateMachine : Not Frozen
            CheckFrozen --> RetryLater : Frozen

            ValidateMachine --> ValidateMachineClass
            ValidateMachineClass --> DeletionTimestamp

            DeletionTimestamp --> DeletionFlow : Deletion Requested
            DeletionTimestamp --> AddFinalizers : No Deletion

            AddFinalizers --> CheckPhase&NodeLabel

            CheckPhase&NodeLabel --> ReconcileHealth : Has Node & Non-empty phase
            CheckPhase&NodeLabel --> CreationFlow : No Node or<br>CrashLoopBackOff<br>or EmptyPhase

            ReconcileHealth --> SyncNodeName
            SyncNodeName --> SyncTemplates
            SyncTemplates --> [*]

            CreationFlow --> [*]
            DeletionFlow --> [*]
        }
    }
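
The Machine Reconciliation branch of the diagram above can be sketched as a single decision function. This is a minimal illustration, not the controller's actual code: `MachineView`, `Phase`, and `nextFlow` are hypothetical names, and the struct carries only the fields the branches depend on.

```go
package main

// Phase is a simplified stand-in for the machine phase shown in the diagram.
type Phase string

const (
	PhaseEmpty            Phase = ""
	PhaseCrashLoopBackOff Phase = "CrashLoopBackOff"
	PhaseRunning          Phase = "Running"
)

// MachineView holds only what the branch decision needs (hypothetical type).
type MachineView struct {
	Frozen            bool
	DeletionRequested bool
	NodeName          string
	Phase             Phase
}

// nextFlow mirrors the diagram: frozen machines are retried later, a deletion
// request wins over everything else, and a machine without a node (or with an
// empty or CrashLoopBackOff phase) goes through the creation flow.
func nextFlow(m MachineView) string {
	switch {
	case m.Frozen:
		return "RetryLater"
	case m.DeletionRequested:
		return "DeletionFlow"
	case m.NodeName == "" || m.Phase == PhaseEmpty || m.Phase == PhaseCrashLoopBackOff:
		return "CreationFlow"
	default:
		return "ReconcileHealth"
	}
}
```

For example, `nextFlow(MachineView{NodeName: "node-1", Phase: PhaseRunning})` selects the health reconciliation path.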

Machine Creation

Machine Creation Flow:

- Complex process involving multiple status checks
- Handles initialization and error cases
- Includes node verification and cleanup of stale resources
- Multiple retry mechanisms for resilience

---
  config:
    look: handDrawn
---
stateDiagram-v2
    classDef imp font-weight:bold,stroke-width:5px;
        state "From CreateResponse: Assign Node Name & ProviderID" as ANPIDCMR
        state "From GetMachineStatusResponse: Assign Node Name & ProviderID" as ANPIDGMS
        state "From GetMachineStatusResponse: Assign Node Name & ProviderID" as ANPIDGMSR
        state "Assign Node Name<br>from Machine label" as ANML
        state "Phase: Pending<br>State: Processing<br>OpType: Create" as CPPP
        state "State: Failed<br>OpType: Create" as SFFF

        [*] --> AddBootToken&MachineName
        AddBootToken&MachineName --> GetMachineStatus:::imp

        GetMachineStatus:::imp --> ANPIDGMS : Success
        ANPIDGMS --> UpdateAnnotationsLabels
        UpdateAnnotationsLabels --> CPPP : Phase ""(empty) or CrashLoopBackOff
        CPPP --> StatusUpdate
        StatusUpdate --> [*]

        GetMachineStatus:::imp --> CheckNodeExists : NotFound or Unimplemented
        CheckNodeExists --> ANML : Node Exists
        ANML --> UpdateAnnotationsLabels

        CheckNodeExists --> CreateMachine:::imp : No Node
        CreateMachine:::imp --> ANPIDCMR : Successful creation
        CreateMachine:::imp --> CheckFailurePhase : Creation Error
        ANPIDCMR --> SetUninitialized : Node name is Machine Name
        SetUninitialized --> UpdateAnnotationLabel
        UpdateAnnotationLabel --> InitializeMachine:::imp
        InitializeMachine:::imp --> [*]

        ANPIDCMR --> DeleteMachine:::imp : Stale Node<br>NodeName is not MachineName
        DeleteMachine:::imp --> SFFF: "VM using old node obj"

        GetMachineStatus:::imp --> ANPIDGMSR : Uninitialized
        ANPIDGMSR --> SetUninitialized

        GetMachineStatus:::imp --> CheckFailurePhase : Other Errors
        CheckFailurePhase --> Failed : Timeout
        CheckFailurePhase --> CrashLoopBackOff : Not timed out
        Failed --> SFFF
        CrashLoopBackOff --> SFFF

        SFFF --> [*]
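
The `CheckFailurePhase` step in the diagram above reduces to one comparison. A minimal sketch, assuming the caller already knows how long the creation attempt has been running; `failurePhase` and its parameters are hypothetical names:

```go
package main

import "time"

// failurePhase picks the phase recorded after a creation error: Failed once
// the creation timeout has elapsed, CrashLoopBackOff while there is still
// time left to retry.
func failurePhase(elapsed, creationTimeout time.Duration) string {
	if elapsed > creationTimeout {
		return "Failed"
	}
	return "CrashLoopBackOff"
}
```

A machine in CrashLoopBackOff re-enters the creation flow on the next reconcile, which is why the core flow diagram routes that phase back to `CreationFlow`.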

Health Check

---
  config:
    layout: elk
---
stateDiagram-v2
    state "Health Reconciliation" as HR {
        state "Phase: Unknown<br>State: Processing<br>LastOp: HealthChk" as PUSP
        state "Phase: Failed<br>State: Failed" as PFSF
        state "LastOp State: Successful<br>Phase: Running" as SSPR

        [*] --> GetMachineNode
        GetMachineNode --> PUSP : Not Found & RunningPhase<br>Node object missing
        GetMachineNode --> Found

        Found --> MachineCondSetToNodeCond : NodeCondition != MachineCondition
        Found --> isHealthy : TODO (isHealthy)

        GetMachineNode --> CreationTimeout : PendingPhase
        GetMachineNode --> HealthTimeout : UnknownPhase

        CreationTimeout --> PFSF : Now - LastUpdateTime > Timeout
        HealthTimeout --> GetDeploymentName : Now - LastUpdateTime > Timeout
        CreationTimeout --> EnqueueAfter : Not timed out
        HealthTimeout --> EnqueueAfter : Not timed out


        GetDeploymentName --> RegisterPermit
        RegisterPermit --> TryMarkingMachineFailed
        TryMarkingMachineFailed --> InProgressMachines++ : Phase not<br>Unknown or Running<br>Machines "getting replaced"
        InProgressMachines++ --> PFSF:  InProgressMachines < MaxReplacements(1)

        MachineCondSetToNodeCond --> isHealthy
        isHealthy --> PUSP: Not Healthy & RunningPhase
        isHealthy --> CheckLastOp : Healthy & NotRunningPhase &<br>NoCriticalComponentNotReadyTaint

        CheckLastOp --> DeleteBootstrapToken: TypeCreate &<br>State is not Successful<br>(Machine creation happened)
        CheckLastOp --> LastOpType=HealthChk: Not Create<br>(Machine re-joined)

        DeleteBootstrapToken --> SSPR
        LastOpType=HealthChk --> SSPR

        SSPR --> UpdateStatus
        PUSP --> UpdateStatus
        PFSF --> UpdateStatus

        UpdateStatus --> [*]
        EnqueueAfter --> [*]
    }
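
The two timeout branches in the diagram above (`CreationTimeout` for Pending machines, `HealthTimeout` for Unknown ones) can be sketched together. This is an illustration under assumptions: `timeoutOutcome` is a hypothetical name, and what happens when no replacement permit is free is not shown in the diagram, so the sketch simply requeues.

```go
package main

import "time"

// timeoutOutcome mirrors the timeout checks of the health reconciliation.
// A Pending machine past its timeout is marked Failed directly. An Unknown
// machine past its timeout is marked Failed only while fewer than
// maxReplacements machines are already being replaced.
func timeoutOutcome(phase string, sinceLastUpdate, timeout time.Duration, inProgress, maxReplacements int) string {
	if sinceLastUpdate <= timeout {
		return "EnqueueAfter"
	}
	switch phase {
	case "Pending":
		return "Failed"
	case "Unknown":
		if inProgress < maxReplacements {
			return "Failed"
		}
		return "EnqueueAfter" // assumption: wait until a permit is free
	}
	return "NoChange"
}
```

With `maxReplacements` set to 1, as in the diagram, at most one unhealthy machine per deployment is replaced at a time.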

Machine Deletion

Machine Deletion Flow:

- Carefully orchestrated process to ensure clean resource cleanup
- Involves multiple phases from drain to final cleanup
- Handles volume attachments and node cleanup
- Includes finalizer management for resource protection

---
  config:
    layout: elk
---
stateDiagram-v2
    state "Deletion Flow" as DF {
        direction LR
        state "ProcessPhase" as PP
        state "UpdateStatus" as US

        [*] --> CheckFinalizers
        CheckFinalizers --> SetTerminating
        SetTerminating --> PP

        PP --> GetVMStatus
        GetVMStatus --> [*]
        PP --> InitiateDrain
        InitiateDrain --> [*]
        PP --> DeleteVolumeAttachments
        DeleteVolumeAttachments --> [*]
        PP --> InitiateVMDeletion
        InitiateVMDeletion --> [*]
        PP --> InitiateNodeDeletion
        InitiateNodeDeletion --> [*]
        PP --> RemoveFinalizers
        RemoveFinalizers --> [*]
        PP --> US
        US --> [*]
    }

---
  config:
    layout: elk
---
stateDiagram-v2
    state "Initiate Drain" as ND {
        [*] --> ValidateNode
        state "UpdateStatus" as USD
        state "State: Processing<br>Type: Delete" as SPTD
        state "CheckNodeCondition<br>'Ready' or 'Read-only FS'" as CNC
        state "Phase is not Terminating" as NAT
        state "Terminating<br>Reason: Unhealthy" as TRU
        state "Terminating<br>Reason: ScaleDown" as TRSD
        state "SkipDrain<br>State: Failed" as CUFail
        state "State: Processing<br>Desc: DelVolAttachments" as SPDDVA
        state "State: Processing<br>Desc: InitVMDeletion" as SPDIVD
        state "State: Failed<br>Desc: InitiateDrain" as SFDID

        ValidateNode --> SPTD : NodeName is empty
        SPTD --> USD
        ValidateNode --> CNC
        CNC --> ForceDeletion : Read-Only/NotReady &<br>Last-transition Timeout
        CNC --> NormalDrain : Healthy
        CNC --> ForceDeletion : "force-delete" label on machine or Drain<br>Timeout on deletion

        ForceDeletion --> UpdateTerminationCondition
        NormalDrain --> UpdateTerminationCondition

        UpdateTerminationCondition --> RunDrain : Phase is empty or CrashLoopBackOff
        UpdateTerminationCondition --> NAT : Non-creation Phase
        NAT --> TRU : Phase is failed
        NAT --> TRSD : Phase not failed
        TRU --> TerminationConditionUpdate
        TRSD --> TerminationConditionUpdate

        TerminationConditionUpdate --> CUFail : Update failure<br>during NormalDrain
        TerminationConditionUpdate --> RunDrain : Update failure<br>during ForceDeletion
        TerminationConditionUpdate --> RunDrain : Update Successful
        CUFail --> USD

        RunDrain --> SPDDVA : Drain successful<br>during ForceDeletion
        RunDrain --> SPDIVD : Drain successful<br>during NormalDrain
        RunDrain --> SPDDVA : Drain failed<br>"force-delete" label present
        RunDrain --> SFDID : Drain failed<br>"force-delete" label absent

        SPDDVA --> USD
        SPDIVD --> USD
        SFDID --> USD

        USD --> [*]
    }
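
Two decisions in the Initiate Drain diagram above lend themselves to small functions: choosing between a forced and a normal drain, and choosing the state recorded once the drain returns. The sketch below is illustrative only; the function and parameter names are hypothetical, and treating an unhealthy node that has not yet timed out as a normal drain is an assumption the diagram does not spell out.

```go
package main

import "time"

// useForceDeletion reports whether the drain takes the ForceDeletion branch:
// either the machine carries the "force-delete" label or its drain timed out,
// or the node has been NotReady / read-only for longer than the timeout.
func useForceDeletion(nodeUnhealthy bool, sinceTransition, timeout time.Duration, forceLabel, drainTimedOut bool) bool {
	if forceLabel || drainTimedOut {
		return true
	}
	return nodeUnhealthy && sinceTransition > timeout
}

// afterDrain maps the drain result to the state and description that are
// written to the machine status, following the four RunDrain transitions.
func afterDrain(drainOK, forceDeletion, forceLabel bool) (state, desc string) {
	switch {
	case drainOK && forceDeletion:
		return "Processing", "DelVolAttachments"
	case drainOK:
		return "Processing", "InitVMDeletion"
	case forceLabel:
		return "Processing", "DelVolAttachments"
	default:
		return "Failed", "InitiateDrain"
	}
}
```

Only a failed drain without the "force-delete" label stops the deletion from progressing; every other combination moves on to the next step.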

Let’s visualize the Node Drain process, which is a critical part of machine deletion:

- Sophisticated pod eviction handling
- Supports both forced and normal drain scenarios
- Handles PDB (Pod Disruption Budget) violations
- Includes parallel and serial eviction strategies

---
  config:
    layout: elk
---
stateDiagram-v2
    state "RunDrain" as Normal {
        state "CordonNode (Sealing off)<br>(Set Unschedulable to true)" as CN
        [*] --> CN
        CN --> WaitForPodCacheSync
        WaitForPodCacheSync --> GetPodsForDeletion : TODO

        %% http://localhost:3000/machine-controller/node_drain.html#drainoptionsgetpodsfordeletion
        %% mirrorPodFilter: pod doesn't have MirrorPodAnnotation (set by kubelet when creating mirror pods)
        %% localStorageFilter
        %% unreplicatedFilter
        %% daemonSetFilter

        GetPodsForDeletion --> DeleteOrEvictPods

        DeleteOrEvictPods --> UpdateNodeCondition
        UpdateNodeCondition --> [*]

        state "DeleteOrEvictPods" as EP {
            [*] --> CheckEvictionSupport

            CheckEvictionSupport --> ParallelEviction : ForceDeletion
            CheckEvictionSupport --> MixedEviction : NormalDrain

            MixedEviction --> ParallelEvictNoPV
            MixedEviction --> SerialEvictWithPV

            ParallelEvictNoPV --> WaitForEviction
            SerialEvictWithPV --> WaitForEviction
            ParallelEviction --> WaitForEviction
            WaitForEviction --> HandlePDBViolation
            HandlePDBViolation --> RetryEviction
            RetryEviction --> [*]
        }
}
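
The `MixedEviction` branch in the diagram above splits the pods into two groups before evicting them. A minimal sketch of that split, using a hypothetical `Pod` type that records only whether the pod mounts a persistent volume:

```go
package main

// Pod is a minimal stand-in for a pod scheduled on the node being drained.
type Pod struct {
	Name  string
	HasPV bool
}

// splitByPV separates the pods for a normal drain: pods without persistent
// volumes can be evicted in parallel, while pods with persistent volumes are
// evicted one after another so that volume detachment can be awaited.
func splitByPV(pods []Pod) (parallel, serial []Pod) {
	for _, p := range pods {
		if p.HasPV {
			serial = append(serial, p)
		} else {
			parallel = append(parallel, p)
		}
	}
	return parallel, serial
}
```

During a forced deletion this split is skipped and all pods go through the parallel path.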

---
title: EvictPodsNoPV
---
stateDiagram-v2
    classDef imp font-weight:bold,stroke-width:5px;
        state "Retry count >= MaxEvictRetries" as Term
        state "Set attemptEvict as False" as AEF
        state "Sleep(EvictRetryInterval)" as SRC

        [*] --> Term:::imp

        Term:::imp --> CheckAttemptEvict : No
        Term:::imp --> AEF : Yes
        AEF --> CheckAttemptEvict

        CheckAttemptEvict --> EvictPod : True
        CheckAttemptEvict --> DeletePod : False

        EvictPod --> CheckErr
        DeletePod --> CheckErr

        CheckErr --> BreakLoop:::imp : nil
        CheckErr --> LogEvict : notFound
        CheckErr --> EvictFailErr : AttemptEvict is False
        CheckErr --> PDBViolation : APIErr too many req

        PDBViolation --> GetPDB

        GetPDB --> SRC : No PDB
        GetPDB --> CheckMisconfigured : PDB exists

        CheckMisconfigured --> MisconfigErr : Generation is ObservedGen<br>HealthyPods >= ExpectedPods<br>DisruptionsAllowed is 0
        CheckMisconfigured --> SRC : No

        SRC:::imp --> Term : count++


        BreakLoop:::imp --> ReturnSuccess:::imp : ForceDeletion
        BreakLoop:::imp --> GetTerminationGracePeriod : NormalDrain

        GetTerminationGracePeriod --> SetToTimeout : GracePeriod > Timeout
        GetTerminationGracePeriod --> WaitForDeletion : Grace < Timeout
        SetToTimeout --> WaitForDeletion

        WaitForDeletion --> TimeoutErr : timeout &<br>pod exists
        WaitForDeletion --> WaitErr : err
        WaitForDeletion --> ReturnSuccess:::imp : timeout &<br>pod deleted

        LogEvict --> [*]
        EvictFailErr --> [*]
        MisconfigErr --> [*]
        TimeoutErr --> [*]
        WaitErr --> [*]
        ReturnSuccess:::imp --> [*]
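
The retry loop at the top of the EvictPodsNoPV diagram can be sketched as follows. This is a simplified illustration: `evictWithRetries` and `errTooManyRequests` are hypothetical names, the eviction and deletion calls are injected as functions, and the sleep between attempts as well as the PDB lookup are left out.

```go
package main

import "errors"

// errTooManyRequests stands in for the API error returned when an eviction
// would violate a PodDisruptionBudget.
var errTooManyRequests = errors.New("too many requests")

// evictWithRetries tries evict until it succeeds or maxRetries attempts were
// rejected with errTooManyRequests, then falls back to a single delete call.
// It returns the number of calls made and whether delete was used.
func evictWithRetries(maxRetries int, evict, del func() error) (attempts int, usedDelete bool, err error) {
	for count := 0; ; count++ {
		attemptEvict := count < maxRetries
		if attemptEvict {
			err = evict()
		} else {
			err = del()
		}
		attempts++
		if err == nil || !attemptEvict {
			return attempts, !attemptEvict, err
		}
		if !errors.Is(err, errTooManyRequests) {
			return attempts, false, err
		}
		// PDB violation: the flow in the diagram sleeps for
		// EvictRetryInterval here before the next attempt.
	}
}
```

The switch from eviction to deletion is what lets the drain make progress when a PodDisruptionBudget keeps rejecting evictions.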

---
title: EvictPodsWithPV
config:
  layout: elk
---
stateDiagram-v2
    classDef imp font-weight:bold,stroke-width:5px;
        state "Retry count < MaxEvictRetries" as Term
        state "Sleep(EvictRetryInterval)" as SRC
        state "CheckRemainingPods" as CRP

        [*] --> SortPodsByPriority
        SortPodsByPriority --> podVolumeInfoMap
        note left of podVolumeInfoMap
            Creates a map from pod to list of attached PVs (VolName, VolID -> GetVolumeID)
        end note

        podVolumeInfoMap --> AttemptEvict
        AttemptEvict --> evictPodPVInternal(Delete):::imp : false
        AttemptEvict --> Term:::imp : true
        Term:::imp --> evictPodPVInternal(Evict):::imp : true
        evictPodPVInternal(Evict):::imp --> break:::imp : FastTrack or<br>All pods evicted
        evictPodPVInternal(Evict):::imp --> SRC : Not FastTrack and<br>Pods Remaining
        SRC --> Term:::imp : count++

        Term:::imp --> evictPodPVInternal(Delete):::imp : false<br>Not FastTrack and<br>Pods Remaining
        break:::imp --> [*] : All pods evicted

        break:::imp --> CRP : FastTrack
        evictPodPVInternal(Delete):::imp --> CRP

        CRP --> Success:::imp : Node Not Found
        CRP --> ChkAttemptEvict
        ChkAttemptEvict --> EvictErr : True
        ChkAttemptEvict --> DeleteErr : False

---
title: EvictPodsWithPVInternal
config:
  layout: elk
---
stateDiagram-v2
    classDef imp font-weight:bold,stroke-width:5px;
        state "Add Pod to RetryPods" as Retry
        state "Log NotFound<br>DeleteWorker" as LogNotFound
        [*] --> SelectPod : Start Eviction Process

        SelectPod --> CheckContextTimeout:::imp

        CheckContextTimeout:::imp --> AbortProcess : Context Done
        CheckContextTimeout:::imp --> AddWorker(AttachmentHandler) : Context Not Done

        AddWorker(AttachmentHandler) --> EvictOrDelete

        EvictOrDelete --> CheckEvictionResult:::imp

        CheckEvictionResult:::imp --> EvictionFailed
        EvictionFailed --> PDBViolation : Eviction Attempted &<br>TooManyRequests
        EvictionFailed --> PodAlreadyGone : Pod Not Found
        EvictionFailed --> EvictionError : Other Errors
        CheckEvictionResult:::imp --> WaitForVolumeDetach : Successful Eviction

        PDBViolation --> GetPDB
        GetPDB --> CheckMisconfigured : PDB Exists
        GetPDB --> Retry : NoPDB
        CheckMisconfigured --> MisconfigErr : Generation is ObservedGen<br>HealthyPods >= ExpectedPods<br>DisruptionsAllowed is 0
        CheckMisconfigured --> Retry:::imp : NotMisconfig
        MisconfigErr --> DeleteWorker

        PodAlreadyGone --> DeleteWorker

        EvictionError --> Retry:::imp

        WaitForVolumeDetach --> CheckDetachResult:::imp : TerminationGracePeriod + DetachTimeout

        CheckDetachResult:::imp --> LogNotFound : Node Not Found
        CheckDetachResult:::imp --> DetachError : Detach Failed
        CheckDetachResult:::imp --> WaitForReattach : Successful Detach

        LogNotFound --> AbortProcess
        DetachError --> DeleteWorker

        WaitForReattach --> CheckReattachResult:::imp : PvReattachTimeout

        CheckReattachResult:::imp --> ReattachTimeout : Timeout
        CheckReattachResult:::imp --> LogError : Reattach Failed
        CheckReattachResult:::imp --> SuccessfulEviction:::imp : Successful Reattach

        ReattachTimeout --> DeleteWorker : TODO IsThisCorrect?
        LogError --> DeleteWorker
        SuccessfulEviction:::imp --> DeleteWorker : Pod Processed

        DeleteWorker --> [*]
        Retry:::imp --> DeleteWorker
        AbortProcess --> Exit:::imp : Terminate (FastTrack)<br>Return Remaining Pods
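
The `CheckMisconfigured` step that appears in both eviction diagrams combines three conditions on the PodDisruptionBudget. A minimal sketch, using a hypothetical `PDBStatus` struct whose field names loosely follow the PodDisruptionBudget status fields:

```go
package main

// PDBStatus carries only the values compared by the misconfiguration check.
type PDBStatus struct {
	Generation         int64
	ObservedGeneration int64
	CurrentHealthy     int32
	ExpectedPods       int32
	DisruptionsAllowed int32
}

// isMisconfigured reports whether retrying the eviction is pointless: the
// budget is up to date, all expected pods are healthy, and still no
// disruption is allowed.
func isMisconfigured(p PDBStatus) bool {
	return p.Generation == p.ObservedGeneration &&
		p.CurrentHealthy >= p.ExpectedPods &&
		p.DisruptionsAllowed == 0
}
```

When the check is true, waiting cannot help, so the flow returns an error instead of adding the pod to the retry list.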

Safety Controller

Orphan VM Check:
- Runs periodically (every 15 minutes) to detect and clean up orphaned VMs
- Lists all VMs in the cloud provider matching the cluster’s tag
- Maps VMs to machine objects using ProviderID
- Handles nodes without machine objects:
  - Adds `NotManagedByMCM` annotation after timeout
  - Removes annotation if machine object is found
- Logs all cleanup operations for audit purposes
API Server Safety:
- Monitors connectivity to both control and target API servers
- Implements a freezing mechanism when API servers are unreachable
- Manages machine controller state based on API server health:
  - Freezes operations if timeout exceeded
  - Unfreezes when API servers become available
- Handles machine status updates during API server recovery
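
The freeze and unfreeze behaviour described above can be sketched as a small state holder. This is an illustration under assumptions: `apiSafety` and `observe` are hypothetical names, and the sketch accumulates downtime per check instead of storing an inactive timestamp.

```go
package main

import "time"

// apiSafety tracks how long the API servers have been unreachable and
// whether the machine controller is currently frozen.
type apiSafety struct {
	frozen      bool
	inactiveFor time.Duration
}

// observe records the result of one connectivity check. A reachable API
// server clears the downtime and unfreezes; an unreachable one adds the
// interval since the last check and freezes once the timeout is exceeded.
func (s *apiSafety) observe(apiUp bool, sinceLastCheck, timeout time.Duration) {
	if apiUp {
		s.frozen = false
		s.inactiveFor = 0
		return
	}
	s.inactiveFor += sinceLastCheck
	if s.inactiveFor > timeout {
		s.frozen = true
	}
}
```

Freezing on a timeout rather than on the first failed check avoids reacting to short network blips.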

---
  config:
    layout: elk
---
stateDiagram-v2
    state "Safety Controller" as SC {
        state "Orphan VM Check" as OVC {
            [*] --> ListCloudVMs
            ListCloudVMs --> MapToMachines
            MapToMachines --> CheckOrphans

            state "CheckOrphans" as CO {
                [*] --> NoMachineObject
                NoMachineObject --> ConfirmDeletion
                ConfirmDeletion --> DeleteVM
                DeleteVM --> LogDeletion
            }

            CheckOrphans --> AnnotateNodes

            state "AnnotateNodes" as AN {
                [*] --> CheckNodeMachine
                CheckNodeMachine --> MultipleMatch : Multiple Machines
                CheckNodeMachine --> NoMatch : No Machine
                CheckNodeMachine --> SingleMatch : One Machine

                NoMatch --> TimeoutCheck
                TimeoutCheck --> AddAnnotation : Timeout Exceeded

                SingleMatch --> RemoveAnnotation : Has Annotation

                AddAnnotation --> UpdateNode
                RemoveAnnotation --> UpdateNode
            }
        }

        state "API Server Safety" as ASS {
            [*] --> CheckFrozen
            CheckFrozen --> CheckAPIServer : Frozen
            CheckFrozen --> MonitorAPI : Not Frozen

            CheckAPIServer --> Unfreeze : API Up
            CheckAPIServer --> Requeue : API Down

            MonitorAPI --> SetInactiveTime : API Down
            MonitorAPI --> ClearInactiveTime : API Up

            SetInactiveTime --> CheckTimeout
            CheckTimeout --> Freeze : Timeout Exceeded

            Unfreeze --> UpdateMachines
            UpdateMachines --> ResetTimeout
        }
    }
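
The `MapToMachines` and `CheckOrphans` steps in the diagram above amount to a set difference on provider IDs. A minimal sketch with hypothetical names; real provider IDs and the confirmation step before deletion are omitted:

```go
package main

import "sort"

// orphanVMs returns the provider IDs of cloud VMs that have no matching
// machine object, sorted for stable output.
func orphanVMs(cloudVMs, machineProviderIDs []string) []string {
	known := make(map[string]struct{}, len(machineProviderIDs))
	for _, id := range machineProviderIDs {
		known[id] = struct{}{}
	}
	var orphans []string
	for _, id := range cloudVMs {
		if _, ok := known[id]; !ok {
			orphans = append(orphans, id)
		}
	}
	sort.Strings(orphans)
	return orphans
}
```

Building the lookup set first keeps the check linear in the number of VMs and machines.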

MachineSet Controller

Core Reconciliation:
- Validates MachineSet specifications
- Manages finalizers for proper cleanup
- Implements machine ownership through controller references
- Synchronizes node templates and configurations
Replica Management:
- Implements sophisticated scaling logic:
  - Slow-start batching for scale-up operations
  - Prioritized scale-down based on machine health
- Handles stale machine cleanup
- Maintains desired replica count
- Updates status to reflect current state
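
Slow-start batching, mentioned in the list above, creates machines in batches that grow only while earlier batches succeed. The sketch below computes the batch sizes for the all-success case; `slowStartBatches` is a hypothetical name, and the initial batch size and the doubling factor are assumptions.

```go
package main

// minInt returns the smaller of two ints.
func minInt(a, b int) int {
	if a < b {
		return a
	}
	return b
}

// slowStartBatches splits count creations into batches that start at
// initialBatch and double each round, assuming every batch succeeds.
func slowStartBatches(count, initialBatch int) []int {
	var batches []int
	remaining := count
	for size := minInt(remaining, initialBatch); size > 0; size = minInt(2*size, remaining) {
		batches = append(batches, size)
		remaining -= size
	}
	return batches
}
```

Scaling up by ten machines with an initial batch of one yields batches of 1, 2, 4, and 3. If the first creation fails, for example because of an exhausted quota, only one request has been wasted.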

---
  config:
    layout: elk
---
stateDiagram-v2
    state "MachineSet Controller" as MSC {
        state "Sync MachineSet<br>NodeTemplate<br>to Machine" as SyncNodeTemplates
        state "Sync MachineSet<br>MachineConfiguration<br>to Machine" as SyncMachineConfig
        state "Sync MachineSet<br>MachineClass.Kind<br>to Machine" as SyncMachineKind

        [*] --> FetchMachineSet
        FetchMachineSet --> ValidateSpec
        ValidateSpec --> AddFinalizers : Deletion Not Requested

        AddFinalizers --> ClaimMachines

        state "ClaimMachines (Returns filtered machines)" as CM {
            [*] --> CreateControllerRefMgr
            CreateControllerRefMgr --> GetControllerRef
            GetControllerRef --> Orphan : Nil<br>(No Owner)
            GetControllerRef --> CheckUID : Not Nil<br>(Owner Exists)

            CheckUID --> Ignore : Mismatch<br>(Wrong Owner)
            CheckUID --> MatchSelector : UID Same
            Orphan --> CheckDeletion
            CheckDeletion --> SelectorMatch : No Deletion
            SelectorMatch --> AdoptOrphan : Selector Match

            MatchSelector --> KeepClaim : Selector Match<br>Already Owned
            MatchSelector --> DeletionCheck : Selector Mismatch
            DeletionCheck --> AttemptRelease : No Deletion

            KeepClaim --> AddToClaimed
            AdoptOrphan --> AddToClaimed
            AttemptRelease --> RemoveFromClaimed
        }

        ClaimMachines --> SyncNodeTemplates
        SyncNodeTemplates --> SyncMachineConfig
        SyncMachineConfig --> SyncMachineKind
        SyncMachineKind --> CheckFilteredMachines : Deletion Requested
        SyncMachineKind --> ManageReplicas : No Deletion

        CheckFilteredMachines --> RemoveFinalizers : Zero Owned Machines
        CheckFilteredMachines --> CheckFinalizerPresent : Backed Machines
        CheckFinalizerPresent --> TerminateMachines
        RemoveFinalizers --> UpdateStatus
        TerminateMachines --> UpdateStatus

        state "ManageReplicas" as MR {
            [*] --> CheckMachinePhase
            CheckMachinePhase --> ActiveMachines : Phase<br>NotFailedOrTerminating
            CheckMachinePhase --> StaleMachines : PhaseFailed

            ActiveMachines --> CheckDiff
            StaleMachines --> TerminateStale
            TerminateStale --> CheckDiff

            CheckDiff --> ScaleUp : ActiveMachines<br>Less than<br>Replica Count
            CheckDiff --> ScaleDown : ActiveMachines<br>More than<br>Replica Count

            ScaleUp --> NotFrozenAnd<br>NotToBeDeleted
            NotFrozenAnd<br>NotToBeDeleted --> SlowStartBatch : TODO Expectations
            SlowStartBatch --> CreateMachines

            ScaleDown --> SortMachines
            SortMachines --> DeleteExcess
        }

        ManageReplicas --> UpdateStatus
        UpdateStatus --> [*]
    }
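
The `ClaimMachines` state in the diagram above decides, per machine, whether the MachineSet keeps, adopts, releases, or ignores it. The following sketch condenses those branches into one function. All names are hypothetical, and `deletionPending` collapses the two deletion checks in the diagram into a single flag, which is a simplification.

```go
package main

// claimDecision returns what a MachineSet with UID selfUID does with a
// machine whose controller reference points at ownerUID ("" means no owner).
func claimDecision(ownerUID, selfUID string, selectorMatches, deletionPending bool) string {
	if ownerUID == "" {
		// Orphan: adopt only if nothing is being deleted and labels match.
		if !deletionPending && selectorMatches {
			return "Adopt"
		}
		return "Ignore"
	}
	if ownerUID != selfUID {
		return "Ignore" // owned by another controller
	}
	if selectorMatches {
		return "Keep"
	}
	if !deletionPending {
		return "Release"
	}
	return "Ignore"
}
```

Machines for which the result is "Adopt" or "Keep" end up in the claimed list that the later sync steps operate on.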

MachineDeployment Controller

Deployment Management:

- Handles multiple MachineSets for a deployment
- Maintains deployment history through revisions
- Supports pausing and resuming deployments
- Implements rollback functionality

Deployment Strategies:
- Recreate Strategy:
  - Scales down old MachineSets completely
  - Creates and scales up new MachineSet
  - Ensures clean cutover between versions
- Rolling Update Strategy:
  - Gradually scales up new MachineSet
  - Gradually scales down old MachineSets
  - Maintains availability during updates
  - Handles surge and unavailability constraints
Scaling Operations:
- Detects and handles scaling events
- Manages desired replica counts across MachineSets
- Updates annotations for autoscaler integration
- Ensures proper resource cleanup

---
  config:
    layout: elk
---
stateDiagram-v2
    state "TODO MachineDeployment Controller" as MDC {
        [*] --> FetchDeployment
        FetchDeployment --> LogFrozenOrTBD
        LogFrozenOrTBD --> ValidateSpec
        ValidateSpec --> CheckDeletion

        state "GetMachineSets" as GMS {
            [*] --> CreateControllerRefMgr
            CreateControllerRefMgr --> GetControllerRef
            GetControllerRef --> Orphan : Nil<br>(No Owner)
            GetControllerRef --> CheckUID : Not Nil<br>(Owner Exists)

            CheckUID --> Ignore : Mismatch<br>(Wrong Owner)
            CheckUID --> MatchSelector : UID Same
            Orphan --> CheckDelete
            CheckDelete --> SelectorMatch : No Deletion
            SelectorMatch --> AdoptOrphan : Selector Match

            MatchSelector --> KeepClaim : Selector Match<br>Already Owned
            MatchSelector --> DeletionCheck : Selector Mismatch
            DeletionCheck --> AttemptRelease : No Deletion

            KeepClaim --> AddToClaimed
            AdoptOrphan --> AddToClaimed
            AttemptRelease --> RemoveFromClaimed
        }

        CheckDeletion --> AddFinalizer : No Deletion
        AddFinalizer --> StatusUpdate
        StatusUpdate --> GetMachineSets

        GetMachineSets --> BuildMachineMap<br>MSetUIDToMachines
        BuildMachineMap<br>MSetUIDToMachines --> DeleteChk
        DeleteChk --> CheckPausedCond : No Deletion
        DeleteChk --> ProcessDeletion : Deletion Requested

        state "Process Deletion" as DC {
            [*] --> Exit : Finalizer<br>NotPresent
            [*] --> RemoveFinalizers : NoBackingMS
            [*] --> TerminateMachineSets : BackingMS

            TerminateMachineSets --> SyncStatusOnly<br>UpdateMcdStatus
            RemoveFinalizers --> Exit
        }

        state "Check Paused Condition" as CPC {
            [*] --> GetCondition<br>TypeProcessing

            GetCondition<br>TypeProcessing --> [*] : CondReason<br>TimeOut
            GetCondition<br>TypeProcessing --> ExistingPaused : CondReason<br>Paused
            GetCondition<br>TypeProcessing --> NotExistingPaused : Else

            NotExistingPaused --> Spec.Paused
            Spec.Paused --> SetPausedCondition : true

            ExistingPaused --> SpecPaused
            SpecPaused --> SetResumedCondition : False

            SetPausedCondition --> UpdateMcdStatus
            SetResumedCondition --> UpdateMcdStatus

            UpdateMcdStatus --> [*]
        }

        CheckPausedCond --> SetPrioAnnotation : TODO

        SetPrioAnnotation --> Sync : Spec.Paused true<br>TODO
        SetPrioAnnotation --> CheckRollbackTo : Spec.Paused false

        state "Rollback" as RB {
            [*] --> FindRevision
            FindRevision --> FindMatchingMS : RollbackTo.Revision<br>Present
            FindRevision --> ClearRollbackTo : No last revision

            FindMatchingMS --> Remove<br>PreferNoSched<br>Taint : MSRevisionAnnotation<br>same as<br>RollbackTo Revision
            FindMatchingMS --> ClearRollbackTo : NoMachineSetFound

            Remove<br>PreferNoSched<br>Taint --> UpdateMcdTemplate
            UpdateMcdTemplate --> UpdateMcdAnnotations : Copy MS template<br>Remove label<br>machine-template-hash

            UpdateMcdAnnotations --> ClearRollbckTo
            ClearRollbckTo --> EmitRollbackEvent
        }

        CheckRollbackTo --> Rollback : Rollback Requested
        CheckRollbackTo --> IsScalingEvent : No Rollback

        state "Is Scaling Event" as SC {
            [*] --> GetMS<br>SyncRev
            GetMS<br>SyncRev --> NotScaling : err
            GetMS<br>SyncRev --> NotScaling : No New MS

            GetMS<br>SyncRev --> CheckActiveMS : MS Replicas > 0
            CheckActiveMS --> ScalingEvent : NoActiveMS &<br>MCD Replicas > 0<br>(ScaleFromZero)

            CheckActiveMS --> GetMSDesiredReplica<br>Annotation
            GetMSDesiredReplica<br>Annotation --> ScalingEvent : Desired not equal<br>to MCD Replicas

            CheckActiveMS --> NotScaling : NoActiveMS or<br>Desired = MCD Replicas<br>(For all active)
        }

        IsScalingEvent --> Sync : Scale Event
        IsScalingEvent --> DeployStrategy : No Scale Event

        state "Sync" as SN {
            [*] --> GetMS<br>SyncRevision
            GetMS<br>SyncRevision --> Scale
            Scale --> CleanMCD : Paused and<br>No RollbackTo
            Scale --> SyncMCDStatus

            state "Find Active or Latest MS" as ALMS {
            [*] --> SortMS by CreationTime<br>FilterActiveMS
            }

            state "TODO Scale" as SCC {
                state "ReplicasToAdd<br>AllowedSize - AllMSReplicaCnt" as ReplicasToAdd

                [*] --> GetActiveOrLatestMS
                GetActiveOrLatestMS --> CheckActiveMSReplicas : not nil
                GetActiveOrLatestMS --> CheckNewMS<br>Saturated

                CheckActiveMSReplicas --> FIXME : ActiveMSRep = mcdRep

                CheckNewMS<br>Saturated --> ScaleDownOldMS : true
                CheckNewMS<br>Saturated --> IsRollingUpdate : false

                IsRollingUpdate --> FilterActiveMS : true
                FilterActiveMS --> GetReplicaCount<br>AllMS

                GetReplicaCount<br>AllMS --> FindAllowedSize

                FindAllowedSize --> Zero : MCD Replicas <= 0
                FindAllowedSize --> McdReplicas+MaxSurge : MCD Replicas > 0

                Zero --> ReplicasToAdd
                McdReplicas+MaxSurge --> ReplicasToAdd

                ReplicasToAdd --> ScaleUp : more than 0
                ReplicasToAdd --> ScaleDown : < 0

                ScaleUp --> map[name]=NewRep : oldMS = Replicas
                ScaleUp --> map[name]=NewRep : newMS = Rep+RepToAdd

            }
        }

        state "TODO DeployStrategy" as DS {
            state "Recreate" as RC {
                [*] --> OldScaleDown
                OldScaleDown --> CreateNew
                CreateNew --> NewScaleUp
            }

            state "RollingUpdate" as RU {
                [*] --> ScaleUpNew
                [*] --> ScaleDownOld
                ScaleDownOld --> CleanupOld
            }
        }

        DeployStrategy --> UpdateStatus
        UpdateStatus --> [*]
    }
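
The `FindAllowedSize` and `ReplicasToAdd` states in the Scale part of the diagram above describe a short calculation for rolling updates. A minimal sketch with hypothetical names, taking the replica counts of all active MachineSets as a slice:

```go
package main

// replicasToAdd computes how many replicas may still be added during a
// rolling update. The allowed size is the deployment's replica count plus
// maxSurge, or zero when the deployment is scaled to zero. A negative
// result means the MachineSets together must scale down.
func replicasToAdd(mcdReplicas, maxSurge int32, msReplicas []int32) int32 {
	var allowed int32
	if mcdReplicas > 0 {
		allowed = mcdReplicas + maxSurge
	}
	var total int32
	for _, r := range msReplicas {
		total += r
	}
	return allowed - total
}
```

For a deployment with 5 replicas, a surge of 2, and MachineSets currently at 3 and 2 replicas, the function returns 2, so the `ScaleUp` branch is taken.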

Summary

Each of these controllers implements sophisticated error handling and retry mechanisms:

Error Handling:
- Categorizes errors into recoverable and non-recoverable
- Implements exponential backoff for retries
- Maintains error counters and conditions
- Updates status to reflect error states
Resource Protection:
- Uses finalizers to prevent premature deletion
- Implements owner references for proper garbage collection
- Maintains consistent state through careful status updates
- Handles race conditions through proper locking
Performance Considerations:
- Implements work queues for efficient processing
- Uses informers for efficient cache handling
- Batches operations when possible
- Implements rate limiting for API calls
Monitoring and Metrics:
- Tracks operation durations
- Records error counts and types
- Provides health metrics
- Implements proper logging for debugging
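
As an illustration of the exponential backoff mentioned under Error Handling, the delay before a retry can be computed as shown below. This is a generic sketch, not the controllers' actual rate limiter; `backoffDelay`, the base delay, and the cap are assumptions.

```go
package main

import "time"

// backoffDelay returns base doubled once per previous failure, capped at
// max. attempt is the number of failures seen so far.
func backoffDelay(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt; i++ {
		d *= 2
		if d >= max {
			return max
		}
	}
	if d > max {
		return max
	}
	return d
}
```

With a base of 5 ms and a cap of 1 s, three failures give a delay of 40 ms, and the delay stops growing once it reaches the cap.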

The entire system works together to provide:

- Reliable machine lifecycle management
- Proper cleanup of resources
- Scaling capabilities
- Rolling updates and rollbacks
- Protection against race conditions and API server issues
- Efficient resource utilization
- Proper monitoring and debugging capabilities

This comprehensive system ensures robust machine management while maintaining high availability and proper resource utilization. The controllers work together to maintain the desired state while handling various edge cases and failure scenarios.