2 Fast 2 MCM
These visual representations helps understand the complex workflows within the Machine Controller Manager.
Machine Controller Manager Architecture
- The system consists of three main controllers working in concert
- Each controller handles specific aspects of machine lifecycle management
- Interfaces with both cloud providers and Kubernetes clusters
- Manages the full lifecycle of machines from creation to deletion
Let’s start with an overview of the main components and their interactions:
stateDiagram-v2 direction TB state "Machine Controller Manager" as MCM { state "Machine Controller" as MC state "Safety Controller" as SC state "MCM Controller" as MCMC [*] --> MC [*] --> SC [*] --> MCMC } state "Cloud Provider" as CP { VMs API } state "Kubernetes Cluster" as K8S { state "Control Plane" as CP_K8S { API_Server etcd } state "Node Components" as NC { kubelet container_runtime } } MCM --> CP : Manages VMs MCM --> K8S : Manages Nodes note right of MCM Handles: - Machine lifecycle - Safety checks - Deployments/Sets end note
Machine Controller Core Flows
Now, let’s dive into the Machine Controller’s core reconciliation flows for different resources. It handles three main types of reconciliation:
- Secret Reconciliation: Manages secrets referenced by MachineClasses
- MachineClass Reconciliation: Handles machine class lifecycle
- Machine Reconciliation: Core machine lifecycle management
--- config: layout: elk --- stateDiagram-v2 state "Machine Controller" as MC { state "Secret Reconciliation" as SR { [*] --> FetchSecret FetchSecret --> GetMachineClass GetMachineClass --> CheckReferences CheckReferences --> FinalizerAdd : Has References CheckReferences --> FinalizerRemove : No References FinalizerAdd --> [*] FinalizerRemove --> [*] } state "MachineClass Reconciliation" as MCR { [*] --> FetchClass FetchClass --> GetMachines GetMachines --> CheckMachines CheckMachines --> AddFinalizer : Has Machines CheckMachines --> RemoveFinalizer : No Machines AddFinalizer --> EnqueueMachines EnqueueMachines --> [*] RemoveFinalizer --> [*] } state "Machine Reconciliation" as MR { [*] --> FetchMachine FetchMachine --> CheckFrozen CheckFrozen --> ValidateMachine : Not Frozen CheckFrozen --> RetryLater : Frozen ValidateMachine --> ValidateMachineClass VaildateMachineClass --> DeletionTimestamp DeletionTimestamp --> DeletionFlow : Deletion Requested DeletionTimestamp --> AddFinalizers : No Deletion AddFinalizers --> CheckPhase&NodeLabel CheckPhase&NodeLabel --> ReconcileHealth : Has Node & Non-empty phase CheckPhase&NodeLabel --> CreationFlow : No Node or<br/>CrashLoopBackOff<br/>or EmptyPhase ReconcileHealth --> SyncNodeName SyncNodeName --> SyncTemplates SyncTemplates --> [*] CreationFlow --> [*] DeletionFlow --> [*] } }
Machine Creation
Machine Creation Flow:
- Complex process involving multiple status checks
- Handles initialization and error cases
- Includes node verification and cleanup of stale resources
- Multiple retry mechanisms for resilience
--- config: look: handDrawn --- stateDiagram-v2 classDef imp font-weight:bold,stroke-width:5px; state "From <u>CreateResponse</u>: Assign Node Name & ProviderID" as ANPIDCMR state "From <u>GetMachineStatusResponse</u>: Assign Node Name & ProviderID" as ANPIDGMS state "From <u>GetMachineStatusResponse</u>: Assign Node Name & ProviderID" as ANPIDGMSR state "Assign Node Name<br/>from Machine label" as ANML state "Phase: <i>Pending</i><br/>State: <i>Processing</i><br/>OpType: Create" as CPPP state "State: <i>Failed</i><br/>OpType: <i>Create</i>" as SFFF [*] --> AddBootToken&MachineName AddBootToken&MachineName --> GetMachineStatus:::imp GetMachineStatus:::imp --> ANPIDGMS : Success ANPIDGMS --> UpdateAnnotationsLabels UpdateAnnotationsLabels --> CPPP : Phase <i>""(empty) or CrashLoopBackOff</i> CPPP --> StatusUpdate StatusUpdate --> [*] GetMachineStatus:::imp --> CheckNodeExists : NotFound or Unimplemented CheckNodeExists --> ANML : Node Exists ANML --> UpdateAnnotationsLabels CheckNodeExists --> CreateMachine:::imp : No Node CreateMachine:::imp --> ANPIDCMR : Successful creation CreateMachine:::imp --> CheckFailurePhase : Creation Error ANPIDCMR --> SetUninitialized : Node name is Machine Name SetUninitialized --> UpdateAnnotationLabel UpdateAnnotationLabel --> InitializeMachine:::imp InitializeMachine:::imp --> [*] ANPIDCMR --> DeleteMachine:::imp : <u>Stale Node</u><br/>NodeName is not MachineName DeleteMachine:::imp --> SFFF: "VM using old node obj" GetMachineStatus:::imp --> ANPIDGMSR : Uninitialized ANPIDGMSR --> SetUninitialized GetMachineStatus:::imp --> CheckFailurePhase : Other Errors CheckFailurePhase --> Failed : Timeout CheckFailurePhase --> CrashLoopBackOff : Not timed out Failed --> SFFF CrashLoopBackOff --> SFFF SFFF --> [*]
Health Check
--- config: layout: elk --- stateDiagram-v2 state "Health Reconciliation" as HR { state "Phase: <i>Unknown</i><br/>State: <i>Processing</i><br/>LastOp: <i>HealthChk</i>" as PUSP state "Phase: <i>Failed</i><br/>State: <i>Failed</i>" as PFSF state "LastOp State: Successful<br/>Phase: Running" as SSPR [*] --> GetMachineNode GetMachineNode --> PUSP : Not Found & RunningPhase<br/>Node object missing GetMachineNode --> Found Found --> MachineCondSetToNodeCond : NodeCondition != MachineCondition Found --> isHealthy : TODO (isHealthy) GetMachineNode --> CreationTimeout : PendingPhase GetMachineNode --> HealthTimeout : UnknownPhase CreationTimeout --> PFSF : Now - LastUpdateTime > Timeout HealthTimeout --> GetDeploymentName : Now - LastUpdateTime > Timeout CreationTimeout --> EnqueueAfter : Not timed out HealthTimeout --> EnqueueAfter : Not timed out GetDeploymentName --> RegisterPermit RegisterPermit --> TryMarkingMachineFailed TryMarkingMachineFailed --> InProgressMachines++ : Phase not<br/>Unknown or Running<br/>Machines "getting replaced" InProgressMachines++ --> PFSF: InProgressMachines < MaxReplacements(1) MachineCondSetToNodeCond --> isHealthy isHealthy --> PUSP: Not Healthy & RunningPhase isHealthy --> CheckLastOp : Healthy & NotRunningPhase &<br/>NoCriticalComponentNotReadyTaint CheckLastOp --> DeleteBootstrapToken: TypeCreate &<br/> State is not Successful<br/>(Machine creation happened) CheckLastOp --> LastOpType=HealthChk: Not Create<br/>(Machine re-joined) DeleteBootstrapToken --> SSPR LastOpType=HealthChk --> SSPR SSPR --> UpdateStatus PUSP --> UpdateStatus PFSF --> UpdateStatus UpdateStatus --> [*] EnqueueAfter --> [*] }
Machine Deletion
Machine Deletion Flow:
- Carefully orchestrated process to ensure clean resource cleanup
- Involves multiple phases from drain to final cleanup
- Handles volume attachments and node cleanup
- Includes finalizer management for resource protection
--- config: layout: elk --- stateDiagram-v2 state "Deletion Flow" as DF { direction LR state "ProcessPhase" as PP state "UpdateStatus" as US [*] --> CheckFinalizers CheckFinalizers --> SetTerminating SetTerminating --> PP PP --> GetVMStatus GetVMStatus --> [*] PP --> InitiateDrain InitiateDrain --> [*] PP --> DeleteVolumeAttachments DeleteVolumeAttachments --> [*] PP --> InitiateVMDeletion InitiateVMDeletion --> [*] PP --> InitiateNodeDeletion InitiateNodeDeletion --> [*] PP --> RemoveFinalizers RemoveFinalizers --> [*] PP --> US US --> [*] }
--- config: layout: elk --- stateDiagram-v2 state "Initiate Drain" as ND { [*] --> ValidateNode state "UpdateStatus" as USD state "State: Processing<br/>Type: Delete" as SPTD state "CheckNodeCondition<br/>'Ready' or 'Read-only FS'" as CNC state "Phase is not Terminating" as NAT state "Terminating<br/>Reason: Unhealthy" as TRU state "Terminating<br/>Reason: ScaleDown" as TRSD state "SkipDrain<br/>State: Failed" as CUFail state "State: Processing<br/>Desc: DelVolAttachments" as SPDDVA state "State: Processing<br/>Desc: InitVMDeletion" as SPDIVD state "State: Failed<br/>Desc: InitiateDrain" as SFDID ValidateNode --> SPTD : NodeName is empty SPTD --> USD ValidateNode --> CNC CNC --> ForceDeletion : Read-Only/NotReady &<br/>Last-transition Timeout CNC --> NormalDrain : Healthy CNC --> ForceDeletion : "force-delete" label on machine or Drain<br/> Timeout on deletion ForceDeletion --> UpdateTerminationCondition NormalDrain --> UpdateTerminationCondition UpdateTerminationCondition --> RunDrain : Phase is empty or CrashLoopBackOff UpdateTerminationCondition --> NAT : Non-creation Phase NAT --> TRU : Phase is failed NAT --> TRSD : Phase not failed TRU --> TerminationConditionUpdate TRSD --> TerminationConditionUpdate TerminationConditionUpdate --> CUFail : Update failure<br/>during NormalDrain TerminationConditionUpdate --> RunDrain : Update failure<br/>during ForceDeletion TerminationConditionUpdate --> RunDrain : Update Successful CUFail --> USD RunDrain --> SPDDVA : Drain successful<br/>during ForceDeletion RunDrain --> SPDIVD : Drain successful<br/>during NormalDrain RunDrain --> SPDDVA : Drain failed<br/>"force-delete" label present RunDrain --> SFDID : Drain failed<br/>"force-delete" label absent SPDDVA --> USD SPDIVD --> USD SFDID --> USD USD --> [*] }
Let’s visualize the Node Drain process, which is a critical part of machine deletion:
- Sophisticated pod eviction handling
- Supports both forced and normal drain scenarios
- Handles PDB (Pod Disruption Budget) violations
- Includes parallel and serial eviction strategies
--- config: layout: elk --- stateDiagram-v2 state "RunDrain" as Normal { state "CordonNode (Sealing off)<br/>(Set Unschedulable to true)" as CN [*] --> CN CN --> WaitForPodCacheSync WaitForPodCacheSync --> GetPodsForDeletion : TODO %% http://localhost:3000/machine-controller/node_drain.html#drainoptionsgetpodsfordeletion %% mirrorPodFilter: pod doesnt have MirrorPodAnnotation (set by kubelet when creating mirror pods) %% localStorageFilter %% unreplicatedFilter %% daemonSetFilter GetPodsForDeletion --> DeleteOrEvictPods DeleteOrEvictPods --> UpdateNodeCondition UpdateNodeCondition --> [*] state "DeleteOrEvictPods" as EP { [*] --> CheckEvictionSupport CheckEvictionSupport --> ParallelEviction : ForceDeletion CheckEvictionSupport --> MixedEviction : NormalDrain MixedEviction --> ParallelEvictNoPV MixedEviction --> SerialEvictWithPV ParallelEvictNoPV --> WaitForEviction SerialEvictWithPV --> WaitForEviction ParallelEviction --> WaitForEviction WaitForEviction --> HandlePDBViolation HandlePDBViolation --> RetryEviction RetryEviction --> [*] } }
--- title: EvictPodsNoPV --- stateDiagram-v2 classDef imp font-weight:bold,stroke-width:5px; state "Retry count >= MaxEvictRetries" as Term state "Set attemptEvict as False" as AEF state "Sleep(EvictRetryInterval)" as SRC [*] --> Term:::imp Term:::imp --> CheckAttemptEvict : No Term:::imp --> AEF : Yes AEF --> CheckAttemptEvict CheckAttemptEvict --> EvictPod : True CheckAttemptEvict --> DeletePod : False EvictPod --> CheckErr DeletePod --> CheckErr CheckErr --> BreakLoop:::imp : nil CheckErr --> LogEvict : notFound CheckErr --> EvictFailErr : AttemptEvict is False CheckErr --> PDBViolation : APIErr too many req PDBViolation --> GetPDB GetPDB --> SRC : No PDB GetPDB --> CheckMisconfigured : PDB exists CheckMisconfigured --> MisconfigErr : Generation is ObserverGen<br/>HealthyPods >= ExpectedPods<br/>DisruptionsAllowed is 0 CheckMisconfigured --> SRC : No SRC:::imp --> Term : count++ BreakLoop:::imp --> ReturnSuccess:::imp : ForceDeletion BreakLoop:::imp --> GetTerminationGracePeriod : NormalDrain GetTerminationGracePeriod --> SetToTimeout : GracePeriod > Timeout GetTerminationGracePeriod --> WaitForDeletion : Grace < Timeout SetToTimeout --> WaitForDeletion WaitForDeletion --> TimeoutErr : timeout &<br/>pod exists WaitForDeletion --> WaitErr : err WaitForDeletion --> ReturnSuccess:::imp : timeout &<br/>pod deleted LogEvict --> [*] EvictFailErr --> [*] MisconfigErr --> [*] TimeoutErr --> [*] WaitErr --> [*] ReturnSuccess:::imp --> [*]
--- title: EvictPodsWithPV config: layout: elk --- stateDiagram-v2 classDef imp font-weight:bold,stroke-width:5px; state "Retry count < MaxEvictRetries" as Term state "Sleep(EvictRetryInterval)" as SRC state "CheckRemainingPods" as CRP [*] --> SortPodsByPriority SortPodsByPriority --> podVolumeInfoMap note left of podVolumeInfoMap Creates a map from pod to list of attached PVs (VolName, VolID -> GetVolumeID) end note podVolumeInfoMap --> AttemptEvict AttemptEvict --> evictPodPVInternal(Delete):::imp : false AttemptEvict --> Term:::imp : true Term:::imp --> evictPodPVInternal(Evict):::imp : true evictPodPVInternal(Evict):::imp --> break:::imp : FastTrack or<br/>All pods evicted evictPodPVInternal(Evict):::imp --> SRC : Not FastTrack and<br/>Pods Remaining SRC --> Term:::imp : count++ Term:::imp --> evictPodPVInternal(Delete):::imp : false<br/>Not FastTrack and<br/>Pods Remaining break:::imp --> [*] : All pods evicted break:::imp --> CRP : FastTrack evictPodPVInternal(Delete):::imp --> CRP CRP --> Success:::imp : Node Not Found CRP --> ChkAttemptEvict ChkAttemptEvict --> EvictErr : True ChkAttemptEvict --> DeleteErr : False
--- title: EvictPodsWithPVInternal config: layout: elk --- stateDiagram-v2 classDef imp font-weight:bold,stroke-width:5px; state "Add Pod to RetryPods" as Retry state "Log NotFound<br/>DeleteWorker" as LogNotFound [*] --> SelectPod : Start Eviction Process SelectPod --> CheckContextTimeout:::imp CheckContextTimeout:::imp --> AbortProcess : Context Done CheckContextTimeout:::imp --> AddWorker(AttachmentHandler) : Context Not Done AddWorker(AttachmentHandler) --> EvictOrDelete EvictOrDelete --> CheckEvictionResult:::imp CheckEvictionResult:::imp --> EvictionFailed EvictionFailed --> PDBViolation : Eviction Attempted &<br/>TooManyRequests EvictionFailed --> PodAlreadyGone : Pod Not Found EvictionFailed --> EvictionError : Other Errors CheckEvictionResult:::imp --> WaitForVolumeDetach : Successful Eviction PDBViolation --> GetPDB GetPDB --> CheckMisconfigured : PDB Exists GetPDB --> Retry : NoPDB CheckMisconfigured --> MisconfigErr : Generation is ObserverGen<br/>HealthyPods >= ExpectedPods<br/>DisruptionsAllowed is 0 CheckMisconfigured --> Retry:::imp : NotMisconfig MisconfigErr --> DeleteWorker PodAlreadyGone --> DeleteWorker EvictionError --> Retry:::imp WaitForVolumeDetach --> CheckDetachResult:::imp : TerminationGracePeriod + DetachTimeout CheckDetachResult:::imp --> LogNotFound : Node Not Found CheckDetachResult:::imp --> DetachError : Detach Failed CheckDetachResult:::imp --> WaitForReattach : Successful Detach LogNotFound --> AbortProcess DetachError --> DeleteWorker WaitForReattach --> CheckReattachResult:::imp : PvReattachTimeout CheckReattachResult:::imp --> ReattachTimeout : Timeout CheckReattachResult:::imp --> LogError : Reattach Failed CheckReattachResult:::imp --> SuccessfulEviction:::imp : Successful Reattach ReattachTimeout --> DeleteWorker : TODO IsThisCorrect? LogError --> DeleteWorker SuccessfulEviction:::imp --> DeleteWorker : Pod Processed DeleteWorker --> [*] Retry:::imp --> DeleteWorker AbortProcess --> Exit:::imp : Terminate (FastTrack)<br/>Return Remaining Pods
Safety Controller
-
Orphan VM Check:
- Runs periodically (every 15 minutes) to detect and clean up orphaned VMs
- Lists all VMs in the cloud provider matching the cluster’s tag
- Maps VMs to machine objects using ProviderID
- Handles nodes without machine objects:
- Adds
NotManagedByMCM
annotation after timeout - Removes annotation if machine object is found
- Adds
- Logs all cleanup operations for audit purposes
-
API Server Safety:
- Monitors connectivity to both control and target API servers
- Implements a freezing mechanism when API servers are unreachable
- Manages machine controller state based on API server health:
- Freezes operations if timeout exceeded
- Unfreezes when API servers become available
- Handles machine status updates during API server recovery
--- config: layout: elk --- stateDiagram-v2 state "Safety Controller" as SC { state "Orphan VM Check" as OVC { [*] --> ListCloudVMs ListCloudVMs --> MapToMachines MapToMachines --> CheckOrphans state "CheckOrphans" as CO { [*] --> NoMachineObject NoMachineObject --> ConfirmDeletion ConfirmDeletion --> DeleteVM DeleteVM --> LogDeletion } CheckOrphans --> AnnotateNodes state "AnnotateNodes" as AN { [*] --> CheckNodeMachine CheckNodeMachine --> MultipleMatch : Multiple Machines CheckNodeMachine --> NoMatch : No Machine CheckNodeMachine --> SingleMatch : One Machine NoMatch --> TimeoutCheck TimeoutCheck --> AddAnnotation : Timeout Exceeded SingleMatch --> RemoveAnnotation : Has Annotation AddAnnotation --> UpdateNode RemoveAnnotation --> UpdateNode } } state "API Server Safety" as ASS { [*] --> CheckFrozen CheckFrozen --> CheckAPIServer : Frozen CheckFrozen --> MonitorAPI : Not Frozen CheckAPIServer --> Unfreeze : API Up CheckAPIServer --> Requeue : API Down MonitorAPI --> SetInactiveTime : API Down MonitorAPI --> ClearInactiveTime : API Up SetInactiveTime --> CheckTimeout CheckTimeout --> Freeze : Timeout Exceeded Unfreeze --> UpdateMachines UpdateMachines --> ResetTimeout } }
MachineSet Controller
-
Core Reconciliation:
- Validates MachineSet specifications
- Manages finalizers for proper cleanup
- Implements machine ownership through controller references
- Synchronizes node templates and configurations
-
Replica Management:
- Implements sophisticated scaling logic:
- Slow-start batching for scale-up operations
- Prioritized scale-down based on machine health
- Handles stale machine cleanup
- Maintains desired replica count
- Updates status to reflect current state
- Implements sophisticated scaling logic:
--- config: layout: elk --- stateDiagram-v2 state "MachineSet Controller" as MSC { state "Sync MachineSet<br/>NodeTemplate<br/>to Machine" as SyncNodeTemplates state "Sync MachineSet<br/>MachineConfiguration<br/>to Machine" as SyncMachineConfig state "Sync MachineSet<br/>MachineClass.Kind<br/>to Machine" as SyncMachineKind [*] --> FetchMachineSet FetchMachineSet --> ValidateSpec ValidateSpec --> AddFinalizers : Deletion Not Requested AddFinalizers --> ClaimMachines state "ClaimMachines (Returns filtered machines)" as CM { [*] --> CreateControllerRefMgr CreateControllerRefMgr --> GetControllerRef GetControllerRef --> Orphan : Nil<br/>(No Owner) GetControllerRef --> CheckUID : Not Nil<br/>(Owner Exists) CheckUID --> Ignore : Mismatch<br/>(Wrong Owner) CheckUID --> MatchSelector : UID Same Orphan --> CheckDeletion CheckDeletion --> SelectorMatch : No Deletion SelectorMatch --> AdoptOrphan : Selector Match MatchSelector --> KeepClaim : Selector Match<br/>Already Owned MatchSelector --> DeletionCheck : Selector Mismatch DeletionCheck --> AttemptRelease : No Deletion KeepClaim --> AddToClaimed AdoptOrphan --> AddToClaimed AttemptRelease --> RemoveFromClaimed } ClaimMachines --> SyncNodeTemplates SyncNodeTemplates --> SyncMachineConfig SyncMachineConfig --> SyncMachineKind SyncMachineKind --> CheckFilteredMachines : Deletion Requested SyncMachineKind --> ManageReplicas : No Deletion CheckFilteredMachines --> RemoveFinalizers : Zero Owned Machines CheckFilteredMachines --> CheckFinalizerPresent : Backed Machines CheckFinalizerPresent --> TerminateMachines RemoveFinalizers --> UpdateStatus TerminateMachines --> UpdateStatus state "ManageReplicas" as MR { [*] --> CheckMachinePhase CheckMachinePhase --> ActiveMachines : Phase<br/>NotFailedOrTerminating CheckMachinePhase --> StaleMachines : PhaseFailed ActiveMachines --> CheckDiff StaleMachines --> TerminateStale TerminateStale --> CheckDiff CheckDiff --> ScaleUp : ActiveMachines<br/>Less than<br/>Replica Count CheckDiff --> ScaleDown : ActiveMachines<br/>More than<br/>Replica Count ScaleUp --> NotFrozenAnd<br/>NotToBeDeleted NotFrozenAnd<br/>NotToBeDeleted --> SlowStartBatch : TODO Expectations SlowStartBatch --> CreateMachines ScaleDown --> SortMachines SortMachines --> DeleteExcess } ManageReplicas --> UpdateStatus UpdateStatus --> [*] }
MachineDeployment Controller
Deployment Management:
- Handles multiple MachineSets for a deployment
- Maintains deployment history through revisions
- Supports pausing and resuming deployments
- Implements rollback functionality
-
Deployment Strategies:
-
Recreate Strategy:
- Scales down old MachineSets completely
- Creates and scales up new MachineSet
- Ensures clean cutover between versions
-
Rolling Update Strategy:
- Gradually scales up new MachineSet
- Gradually scales down old MachineSets
- Maintains availability during updates
- Handles surge and unavailability constraints
-
-
Scaling Operations:
- Detects and handles scaling events
- Manages desired replica counts across MachineSets
- Updates annotations for autoscaler integration
- Ensures proper resource cleanup
--- config: layout: elk --- stateDiagram-v2 state "TODO MachineDeployment Controller" as MDC { [*] --> FetchDeployment FetchDeployment --> LogFrozenOrTBD LogFrozenOrTBD --> ValidateSpec ValidateSpec --> CheckDeletion state "GetMachineSets" as GMS { [*] --> CreateControllerRefMgr CreateControllerRefMgr --> GetControllerRef GetControllerRef --> Orphan : Nil<br/>(No Owner) GetControllerRef --> CheckUID : Not Nil<br/>(Owner Exists) CheckUID --> Ignore : Mismatch<br/>(Wrong Owner) CheckUID --> MatchSelector : UID Same Orphan --> CheckDelete CheckDelete --> SelectorMatch : No Deletion SelectorMatch --> AdoptOrphan : Selector Match MatchSelector --> KeepClaim : Selector Match<br/>Already Owned MatchSelector --> DeletionCheck : Selector Mismatch DeletionCheck --> AttemptRelease : No Deletion KeepClaim --> AddToClaimed AdoptOrphan --> AddToClaimed AttemptRelease --> RemoveFromClaimed } CheckDeletion --> AddFinalizer : No Deletion AddFinalizer --> StatusUpdate StatusUpdate --> GetMachineSets GetMachineSets --> BuildMachineMap<br/>MSetUIDToMachines BuildMachineMap<br/>MSetUIDToMachines --> DeleteChk DeleteChk --> CheckPausedCond : No Deletion DeleteChk --> ProcessDeletion : Deletion Requested state "Process Deletion" as DC { [*] --> Exit : Finalizer<br/>NotPresent [*] --> RemoveFinalizers : NoBackingMS [*] --> TerminateMachineSets : BackingMS TerminateMachineSets --> SyncStatusOnly<br/>UpdateMcdStatus RemoveFinalizers --> Exit } state "Check Paused Condition" as CPC { [*] --> GetCondition<br/>TypeProcessing GetCondition<br/>TypeProcessing --> [*] : CondReason<br/>TimeOut GetCondition<br/>TypeProcessing --> ExistingPaused : CondReason<br/>Paused GetCondition<br/>TypeProcessing --> NotExistingPaused : Else NotExistingPaused --> Spec.Paused Spec.Paused --> SetPausedCondition : true ExistingPaused --> SpecPaused SpecPaused --> SetResumedCondition : False SetPausedCondition --> UpdateMcdStatus SetResumedCondition --> UpdateMcdStatus UpdateMcdStatus --> [*] } CheckPausedCond --> SetPrioAnnotation : TODO SetPrioAnnotation --> Sync : Spec.Paused true<br/>TODO SetPrioAnnotation --> CheckRollbackTo : Spec.Paused false state "Rollback" as RB { [*] --> FindRevision FindRevision --> FindMatchingMS : RollbackTo.Revision<br/>Present FindRevision --> ClearRollbackTo : No last revision FindMatchingMS --> Remove<br/>PreferNoSched<br/>Taint : MSRevisionAnnotation<br/>same as<br/>RollbackTo Revision FindMatchingMS --> ClearRollbackTo : NoMachineSetFound Remove<br/>PreferNoSched<br/>Taint --> UpdateMcdTemplate UpdateMcdTemplate --> UpdateMcdAnnotations : Copy MS template<br/>Remove label<br/>machine-template-hash UpdateMcdAnnotations --> ClearRollbckTo ClearRollbckTo --> EmitRollbackEvent } CheckRollbackTo --> Rollback : Rollback Requested CheckRollbackTo --> IsScalingEvent : No Rollback state "Is Scaling Event" as SC { [*] --> GetMS<br/>SyncRev GetMS<br/>SyncRev --> NotScaling : err GetMS<br/>SyncRev --> NotScaling : No New MS GetMS<br/>SyncRev --> CheckActiveMS : MS Replicas > 0 CheckActiveMS --> ScalingEvent : NoActiveMS &<br/>MCD Replicas > 0<br/>(ScaleFromZero) CheckActiveMS --> GetMSDesiredReplica<br/>Annotation GetMSDesiredReplica<br/>Annotation --> ScalingEvent : Desired not equal<br/>to MCD Replicas CheckActiveMS --> NotScaling : NoActiveMS or<br/>Desired = MCD Replicas<br/>(For all active) } IsScalingEvent --> Sync : Scale Event IsScalingEvent --> DeployStrategy : No Scale Event state "Sync" as SN { [*] --> GetMS<br/>SyncRevision GetMS<br/>SyncRevision --> Scale Scale --> CleanMCD : Paused and<br/>No RollbackTo Scale --> SyncMCDStatus state "Find Active or Latest MS" as ALMS { [*] --> SortMS by CreationTime<br/>FilterActiveMS } state "TODO Scale" as SCC { state "ReplicasToAdd<br/>AllowedSize - AllMSReplicaCnt" as ReplicasToAdd [*] --> GetActiveOrLatestMS GetActiveOrLatestMS --> CheckActiveMSReplicas : not nil GetActiveOrLatestMS --> CheckNewMS<br/>Saturated CheckActiveMSReplicas --> FIXME : ActiveMSRep = mcdRep CheckNewMS<br/>Saturated --> ScaleDownOldMS : true CheckNewMS<br/>Saturated --> IsRollingUpdate : false IsRollingUpdate --> FilterActiveMS : true FilterActiveMS --> GetReplicaCount<br/>AllMS GetReplicaCount<br/>AllMS --> FindAllowedSize FindAllowedSize --> Zero : MCD Replicas <= 0 FindAllowedSize --> McdReplicas+MaxSurge : MCD Replicas > 0 Zero --> ReplicasToAdd McdReplicas+MaxSurge --> ReplicasToAdd ReplicasToAdd --> ScaleUp : more than 0 ReplicasToAdd --> ScaleDown : < 0 ScaleUp --> map[name]=NewRep : oldMS = Replicas ScaleUp --> map[name]=NewRep : newMS = Rep+RepToAdd } } state "TODO DeployStrategy" as DS { state "Recreate" as RC { [*] --> OldScaleDown OldScaleDown --> CreateNew CreateNew --> NewScaleUp } state "RollingUpdate" as RU { [*] --> ScaleUpNew [*] --> ScaleDownOld ScaleDownOld --> CleanupOld } } DeployStrategy --> UpdateStatus UpdateStatus --> [*] }
Summary
Each of these controllers implements sophisticated error handling and retry mechanisms:
-
Error Handling:
- Categorizes errors into recoverable and non-recoverable
- Implements exponential backoff for retries
- Maintains error counters and conditions
- Updates status to reflect error states
-
Resource Protection:
- Uses finalizers to prevent premature deletion
- Implements owner references for proper garbage collection
- Maintains consistent state through careful status updates
- Handles race conditions through proper locking
-
Performance Considerations:
- Implements work queues for efficient processing
- Uses informers for efficient cache handling
- Batches operations when possible
- Implements rate limiting for API calls
-
Monitoring and Metrics:
- Tracks operation durations
- Records error counts and types
- Provides health metrics
- Implements proper logging for debugging
The entire system works together to provide:
- Reliable machine lifecycle management
- Proper cleanup of resources
- Scaling capabilities
- Rolling updates and rollbacks
- Protection against race conditions and API server issues
- Efficient resource utilization
- Proper monitoring and debugging capabilities
This comprehensive system ensures robust machine management while maintaining high availability and proper resource utilization. The controllers work together to maintain the desired state while handling various edge cases and failure scenarios.