
How to fix an HCI host that fails to drain even though there are no roles on it

John Rodger 0 Reputation points
2026-04-14T15:36:26.2933333+00:00

How do I fix an HCI host that fails to drain even though there are no roles on it, and even after a reboot?

Azure Local

2 answers

  1. Alex Burlachenko 20,665 Reputation points MVP Volunteer Moderator
    2026-04-15T09:47:07.5066667+00:00

    Hi John Rodger,

    tl;dr: something is still owned by or running on the node, even if you don't see it.

    This is almost never about roles; it's stuck cluster state. The node still "owns" something even if the UI shows nothing, usually storage jobs, CSV ownership, or hidden cluster resources. Try these in order:

    - Check Get-ClusterGroup and Get-ClusterResource to see whether anything is still tied to that node.
    - Check for storage jobs with Get-StorageJob; if anything is running, the drain will fail.
    - Check CSV ownership with Get-ClusterSharedVolume and move any CSVs off the node.
    - Sometimes the node is stuck in a draining/paused state, so run Resume-ClusterNode -Name <node> -Failback Immediate.
    - If it is still blocked, force-move everything off the node: Get-ClusterGroup | Move-ClusterGroup -Node <other-node>.
    - If that still doesn't work, it's stale cluster state: restart the Cluster service on the node, or as a last resort use Clear-ClusterNode.
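    As a rough sketch, those checks can be run in one pass. "<node>" and "<other-node>" are placeholders for your actual node names, and the commands assume the FailoverClusters and Storage modules are available:

    ```powershell
    # Anything still owned by the affected node?
    Get-ClusterGroup    | Where-Object { $_.OwnerNode -eq "<node>" }
    Get-ClusterResource | Where-Object { $_.OwnerNode -eq "<node>" }

    # Running storage jobs will block the drain
    Get-StorageJob | Where-Object { $_.JobState -ne "Completed" }

    # Move any CSVs the node still owns to another node
    Get-ClusterSharedVolume | Where-Object { $_.OwnerNode -eq "<node>" } |
        Move-ClusterSharedVolume -Node "<other-node>"

    # Clear a stuck paused/draining state, then force-move everything off
    Resume-ClusterNode -Name "<node>" -Failback Immediate
    Get-ClusterGroup | Move-ClusterGroup -Node "<other-node>"
    ```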

    rgds,

    Alex

    P.S. Please accept my answer if it helps.
    

  2. Ankit Yadav 14,165 Reputation points Microsoft External Staff Moderator
    2026-04-14T16:20:18.9166667+00:00

    In Azure Local, draining is based on cluster ownership, not just visible VM roles. A node can remain in the Draining state if the cluster still considers it the owner of infrastructure resources, or if maintenance activity hasn't fully completed.

    Use the steps below to identify and clear the condition.

    1. Check whether the node still owns any cluster groups

    Even if no VMs are present, the node may still own CSV or other cluster groups.

    Get-ClusterGroup | Format-Table Name, OwnerNode, State
    

    What to look for: Any group where OwnerNode is the affected host. If ownership exists, drain will not complete.
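    To surface only the groups that still block the drain, you can filter on the affected host ("<NodeName>" is a placeholder for your node name):

    ```powershell
    # List only the cluster groups still owned by the node being drained
    Get-ClusterGroup | Where-Object { $_.OwnerNode -eq "<NodeName>" } |
        Format-Table Name, OwnerNode, State
    ```

    An empty result here means ownership is clear and the drain should be able to proceed.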

    2. Move any remaining groups off the node

    If ownership is found, move the group manually:

    Move-ClusterGroup -Name "<GroupName>" -Node "<OtherNodeName>"
    

    Re‑run step 1 and confirm the node no longer owns any groups.

    3. Re‑attempt the supported drain operation

    Once ownership is clear, retry the documented drain action:

    Suspend-ClusterNode -Name "<NodeName>" -Drain
    

    What to expect: The node should transition to Paused after draining completes.
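    You can confirm the transition from the node object itself; DrainStatus reports values such as NotInitiated, InProgress, Completed, or Failed:

    ```powershell
    # Verify the node reached Paused and the drain completed
    Get-ClusterNode -Name "<NodeName>" |
        Format-List Name, State, DrainStatus, DrainTarget
    ```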

    4. Resume and drain again if the state appears stuck

    If the node remains in Draining despite no owned resources, reset the pause state and retry:

    Resume-ClusterNode -Name "<NodeName>"
    Suspend-ClusterNode -Name "<NodeName>" -Drain
    

    This re-applies the supported pause/drain workflow without using undocumented force actions.

    5. Check for active storage jobs (S2D environments)

    Background storage maintenance or repair activity can delay maintenance transitions.

    Get-StorageJob
    

    What to look for: Any running jobs. Allow them to complete before retrying drain.
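    If repair or rebalance jobs are present, one option is to poll until they finish before retrying the drain (a simple sketch; the 60-second interval is arbitrary):

    ```powershell
    # Wait for all running storage jobs (e.g. S2D repair) to finish
    while (Get-StorageJob | Where-Object { $_.JobState -eq "Running" }) {
        Get-StorageJob | Format-Table Name, JobState, PercentComplete
        Start-Sleep -Seconds 60
    }
    ```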

    6. Restart the cluster service on the node (if quorum is safe)

    If the node is still stuck and no groups or storage jobs are present, restart the cluster service on that node only:

    Stop-ClusterNode -Name "<NodeName>"
    Start-ClusterNode -Name "<NodeName>"
    

    Important: Only do this if the cluster will remain in quorum.
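    As a quick sanity check before restarting the service, count the nodes that are Up and confirm the cluster keeps a majority with this node down (a rough check; the witness configuration also matters):

    ```powershell
    # Nodes currently up; the cluster must retain quorum with one fewer
    $up    = (Get-ClusterNode | Where-Object { $_.State -eq "Up" }).Count
    $total = (Get-ClusterNode).Count
    "Up: $up of $total nodes"
    Get-ClusterQuorum   # shows the quorum type and any witness resource
    ```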

    7. Validate cluster health if the issue persists

    Run cluster validation to surface configuration or health issues that may block maintenance:

    Test-Cluster -Node "<NodeName>"
    

    Review storage, network, and system results.

