This article covers common issues you might encounter when deploying, setting up, or using Container Network Insights Agent on AKS. Each section follows a Symptom → Cause → Resolution format.
For deployment instructions, see Deploy and use Container Network Insights Agent on AKS.
Extension installation fails
Symptom: The az k8s-extension create command fails, or the extension provisioning state shows Failed.
Cause: The cluster is in a sovereign cloud region (the extension is supported only in Azure public regions), required cluster features are missing, or your account has insufficient permissions.
Resolution:
Check the extension provisioning state for details:
az k8s-extension show \
  --cluster-name $CLUSTER_NAME \
  --resource-group $RESOURCE_GROUP \
  --cluster-type managedClusters \
  --name containernetworkingagent \
  --query "{state:provisioningState, statuses:statuses}" -o json
Verify your cluster is in an Azure public region. The extension is available in all Azure public regions where AKS is supported, but isn't available in Azure Government, Microsoft Azure operated by 21Vianet, or other sovereign clouds.
Verify your cluster has workload identity and OIDC issuer enabled:
az aks show \
  --resource-group $RESOURCE_GROUP \
  --name $CLUSTER_NAME \
  --query "{oidcEnabled:oidcIssuerProfile.enabled, workloadIdentityEnabled:securityProfile.workloadIdentity.enabled}"
Check that you have Contributor and User Access Administrator roles on the resource group.
If you already ran az k8s-extension create once, running it again returns an error because the extension already exists. Use az k8s-extension update to change configuration settings on an existing extension:
az k8s-extension update \
  --cluster-name $CLUSTER_NAME \
  --resource-group $RESOURCE_GROUP \
  --cluster-type managedClusters \
  --name containernetworkingagent \
  --configuration-settings config.SOME_SETTING=new-value
Identity and permissions errors
Symptom: The agent pod starts but returns 401 Unauthorized or 403 Forbidden errors when processing requests. Pod logs show authentication or authorization failures.
Cause: The managed identity is missing required RBAC role assignments, or the federated credential subject doesn't match the agent's service account.
Resolution:
Verify the managed identity has all four required role assignments:
az role assignment list --assignee <identity-principal-id> --all -o table
Confirm these roles are present:
| Role | Scope |
|---|---|
| Cognitive Services OpenAI User | Azure OpenAI resource |
| Azure Kubernetes Service Cluster User Role | AKS cluster |
| Azure Kubernetes Service Contributor Role | AKS cluster |
| Reader | Resource group |
Verify workload identity is enabled on the cluster:
az aks show \
  --resource-group $RESOURCE_GROUP \
  --name $CLUSTER_NAME \
  --query "securityProfile.workloadIdentity.enabled"
Verify the federated credential subject matches the service account:
az identity federated-credential list \
  --identity-name $IDENTITY_NAME \
  --resource-group $RESOURCE_GROUP
The subject field should be system:serviceaccount:kube-system:container-networking-agent-reader.
Verify the Kubernetes service account has the correct workload identity annotation:
kubectl get serviceaccount container-networking-agent-reader -n kube-system -o yaml
The azure.workload.identity/client-id annotation must match your managed identity's client ID. If it doesn't match, correct it and restart the pod:
kubectl annotate serviceaccount container-networking-agent-reader \
  -n kube-system \
  azure.workload.identity/client-id=$IDENTITY_CLIENT_ID \
  --overwrite
kubectl rollout restart deployment container-networking-agent -n kube-system
Tip
Azure RBAC role assignments can take up to 10 minutes to propagate. If you see 401 or 403 errors immediately after setup, wait a few minutes and restart the pod.
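The federated credential subject check in this section can be scripted. The following is a minimal sketch, not part of the extension: the helper names federated_subject and subject_matches are hypothetical, and the sketch assumes the workload identity subject layout system:serviceaccount:<namespace>:<service-account-name> that the expected value in this section follows.

```shell
# Build the subject string a federated credential needs for a service account.
# $1 = namespace, $2 = service account name
federated_subject() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Succeed only if the actual subject ($1) equals the subject expected
# for the given namespace ($2) and service account name ($3).
subject_matches() {
  [ "$1" = "$(federated_subject "$2" "$3")" ]
}
```

To compare against the live value, one option is to read the subject with az identity federated-credential list --identity-name $IDENTITY_NAME --resource-group $RESOURCE_GROUP --query "[0].subject" -o tsv and pass the result as the first argument to subject_matches, with kube-system and container-networking-agent-reader as the remaining arguments. This assumes the identity has a single federated credential.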
Azure OpenAI connectivity issues
Symptom: The agent pod starts but chat requests fail. Pod logs show 401 Unauthorized, 404 Not Found, or connection errors referencing the Azure OpenAI endpoint.
Cause: The Azure OpenAI endpoint, deployment name, or managed identity credentials are misconfigured, or network traffic to the endpoint is blocked.
Resolution:
Check pod logs for specific error patterns:
| Log message | Cause | Fix |
|---|---|---|
| 401 Unauthorized | Managed identity missing Cognitive Services OpenAI User role | Assign the role on the OpenAI resource |
| 404 Not Found | Wrong endpoint URL or deployment name | Verify AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_DEPLOYMENT |
| Connection refused / Name resolution failed | Network or DNS issue | Check NSG/firewall rules and verify the endpoint hostname |
| Token acquisition failed | Workload identity not configured | Check service account annotation and federated credential |
Verify the managed identity has the Cognitive Services OpenAI User role on the Azure OpenAI resource:
az role assignment list \
  --assignee <managed-identity-principal-id> \
  --scope /subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.CognitiveServices/accounts/<openai-resource-name> \
  --output table
If you use network policies, Azure Firewall, or NSGs, ensure outbound HTTPS traffic (port 443) is allowed from the kube-system namespace to your Azure OpenAI endpoint. Verify no network policies are blocking outbound traffic:
kubectl get networkpolicies -n kube-system
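The log-message lookup in this section can be applied mechanically to a single log line. The sketch below is illustrative only: classify_openai_log is a hypothetical helper, and it assumes the messages appear in the logs with exactly the wording listed in this section, which may differ between agent versions.

```shell
# Print the suggested fix for one log line, based on the patterns listed
# in this section. Prints "unknown" when no pattern matches.
classify_openai_log() {
  case "$1" in
    (*"401 Unauthorized"*)
      echo "Assign the Cognitive Services OpenAI User role on the OpenAI resource" ;;
    (*"404 Not Found"*)
      echo "Verify AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_DEPLOYMENT" ;;
    (*"Connection refused"*|*"Name resolution failed"*)
      echo "Check NSG/firewall rules and verify the endpoint hostname" ;;
    (*"Token acquisition failed"*)
      echo "Check service account annotation and federated credential" ;;
    (*)
      echo "unknown" ;;
  esac
}
```

For example, piping the output of kubectl logs -n kube-system -l app=container-networking-agent --tail=200 through a while read loop that calls classify_openai_log on each line gives a quick first pass before reading the logs in full.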
App Registration and Entra ID authentication errors
Symptom: The Microsoft Entra ID (MSAL) login flow fails, login redirects return errors, or the pod logs show the placeholder value 44444444-4444-4444-4444-444444444444 for ENTRA_CLIENT_ID.
Cause: The App Registration isn't configured correctly, or the ENTRA_CLIENT_ID wasn't set during extension deployment.
Resolution:
If pod logs show the placeholder value 44444444-4444-4444-4444-444444444444, update the extension with your actual App Registration client ID:
az k8s-extension update \
  --cluster-name $CLUSTER_NAME \
  --resource-group $RESOURCE_GROUP \
  --cluster-type managedClusters \
  --name containernetworkingagent \
  --configuration-settings config.ENTRA_CLIENT_ID=<your-app-registration-client-id>
If the login callback fails with a redirect_uri mismatch error, verify the redirect URI in the Azure portal under App Registrations > Your App > Authentication > Redirect URIs. For port-forwarded local access, the URI must be http://localhost:8080/auth/callback.
Note
Only localhost redirect URIs are currently supported. Public LoadBalancer URLs aren't supported for redirect URIs.
Ensure the App Registration has the required Microsoft Graph delegated permissions: openid, profile, User.Read, offline_access. If admin consent is required, grant it:
az ad app permission admin-consent --id <app-registration-object-id>
Check pod logs for authentication-specific errors:
kubectl logs -n kube-system -l app=container-networking-agent | grep -i "auth\|msal\|entra"
Missing environment variables at startup
Symptom: The agent pod crashes immediately on startup with:
RuntimeError: Missing required Azure OpenAI environment variable(s): AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_DEPLOYMENT, AZURE_OPENAI_API_VERSION.
Cause: One or more required configuration values weren't set when the extension was deployed.
Resolution:
Check the ConfigMap for placeholder values or missing settings:
kubectl get configmap -n kube-system -l app=container-networking-agent -o yaml
Confirm these required variables are set with real values (not placeholders like 00000000-0000-0000-0000-000000000000):
| Variable | Description | Example |
|---|---|---|
| AZURE_OPENAI_ENDPOINT | Azure OpenAI resource endpoint | https://your-instance.openai.azure.com/ |
| AZURE_OPENAI_DEPLOYMENT | Model deployment name | gpt-4o |
| AZURE_OPENAI_API_VERSION | API version | 2025-03-01-preview |
| AZURE_CLIENT_ID | Managed identity client ID | UUID |
| AZURE_TENANT_ID | Azure tenant ID | UUID |
| AZURE_SUBSCRIPTION_ID | Azure subscription ID | UUID |
| AKS_CLUSTER_NAME | AKS cluster name | Your cluster name |
| AKS_RESOURCE_GROUP | Cluster resource group | Your resource group |
If values show placeholders, update the extension with the correct settings:
az k8s-extension update \
  --cluster-name $CLUSTER_NAME \
  --resource-group $RESOURCE_GROUP \
  --cluster-type managedClusters \
  --name containernetworkingagent \
  --configuration-settings config.AZURE_OPENAI_ENDPOINT=<your-endpoint> \
  --configuration-settings config.AZURE_OPENAI_DEPLOYMENT=<your-deployment>
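Spotting placeholder values by eye is error-prone. The following sketch shows one way to flag them. It is an assumption-laden illustration: is_placeholder_value is a hypothetical helper, and it treats a value as a placeholder when it is empty or is a UUID-shaped string that repeats a single character, which covers the two placeholder values this article mentions but possibly not every placeholder the extension uses.

```shell
# Succeed (exit 0) when the value looks like a placeholder:
# empty, or a UUID-shaped string made of one repeated character.
is_placeholder_value() (
  value=$1
  # Empty values count as missing configuration.
  if [ -z "$value" ]; then
    exit 0
  fi
  # Only strings in the 8-4-4-4-12 UUID layout are examined further.
  case "$value" in
    (????????-????-????-????-????????????) ;;
    (*) exit 1 ;;
  esac
  compact=$(printf '%s' "$value" | tr -d '-')
  first=$(printf '%s' "$compact" | cut -c1)
  # Removing every copy of the first character leaves nothing
  # only when the whole value repeats that one character.
  rest=$(printf '%s' "$compact" | tr -d "$first")
  [ -z "$rest" ]
)
```

A value such as 00000000-0000-0000-0000-000000000000 or 44444444-4444-4444-4444-444444444444 is reported as a placeholder, while a deployment name such as gpt-4o is not.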
Agent pod not running or crashing
Symptom: The agent pod is in CrashLoopBackOff, Error, or Pending state.
Cause: Misconfiguration, missing Azure OpenAI connectivity, or insufficient cluster resources.
Resolution:
Check pod events for immediate errors:
kubectl describe pod -n kube-system -l app=container-networking-agent
Check pod logs for error messages:
kubectl logs -n kube-system -l app=container-networking-agent --tail=200
Match log messages to known causes:
| Log message | Cause | Fix |
|---|---|---|
| Missing required Azure OpenAI environment variable(s) | ConfigMap has placeholder values | Set correct values via az k8s-extension update |
| bootstrap.validation_agent_failed | Can't connect to Azure OpenAI | Check network, endpoint URL, and managed identity RBAC |
| AKS MCP binary not found | Binary missing from image | Use the official extension image from acnpublic.azurecr.io |
| FailedMount / volume mount error | Missing Hubble certificate secrets | Deploy with hubble.enabled=false or ensure ACNS is enabled |
| Token acquisition failed | Workload identity not configured | Check service account annotation and federated credential |
Verify that the Azure OpenAI endpoint is reachable from the cluster. If you use egress restrictions, ensure outbound HTTPS (port 443) is allowed from the kube-system namespace to your Azure OpenAI endpoint.
Readiness probe failures
Symptom: The pod is Running but shows 0/1 ready status. The /ready endpoint returns HTTP 503.
Cause: One or more startup checks haven't completed: the warmup agent pool isn't initialized yet, cluster properties have errors, or there are no pre-warmed agents available.
Resolution:
Wait up to 2-3 minutes after deployment for the warmup pool to create pre-warmed agents.
Check the readiness response for specific failure reasons:
kubectl port-forward svc/container-networking-agent-service -n kube-system 8080:80
curl -s http://localhost:8080/ready | jq
Check pod logs for warmup-related issues:
kubectl logs -n kube-system -l app=container-networking-agent | grep -i "warmup\|ready\|error"
If cluster properties are failing, verify that AKS_CLUSTER_NAME, AKS_RESOURCE_GROUP, and AZURE_SUBSCRIPTION_ID are correctly set in the extension configuration.
Warmup pool keeps failing
Symptom: The pod is Running but never becomes ready. Pod logs show repeated "Failed to create warmed agent" errors even after waiting several minutes.
Cause: The background warmup pool is failing to create pre-warmed agent instances. This is typically caused by an unresolved Azure OpenAI connectivity issue or an MCP initialization failure that prevents agents from being created.
Resolution:
Check logs for the specific underlying error:
kubectl logs -n kube-system -l app=container-networking-agent | grep -i "warmup\|Failed to create"
Match the error to its fix:
| Error in logs | Fix |
|---|---|
| 401 Unauthorized or 403 Forbidden | See Azure OpenAI connectivity issues and verify the managed identity role assignment |
| Token acquisition failed | See Identity and permissions errors |
| 404 Not Found on endpoint | Verify AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_DEPLOYMENT in the ConfigMap |
| AKS MCP binary not found | See Agent pod not running or crashing |
Once the underlying issue is resolved, the warmup pool retries automatically. You don't need to restart the pod unless the error persists after fixing the root cause.
Hubble commands fail
Symptom: The agent reports errors for Hubble-related diagnostics, or Hubble flow analysis isn't available.
Cause: The cluster doesn't have Advanced Container Networking Services (ACNS) or the Cilium dataplane enabled.
Resolution:
If your cluster doesn't use ACNS, deploy the extension with hubble.enabled=false and config.AKS_MCP_ENABLED_COMPONENTS=kubectl. The agent still provides DNS, packet drop, and standard Kubernetes networking diagnostics without Hubble.
To enable Hubble, your cluster must use Azure CNI powered by Cilium with Advanced Container Networking Services (ACNS) enabled.
Verify Hubble is running on your cluster:
kubectl get pods -n kube-system -l k8s-app=hubble-relay
If no pods return, Hubble isn't enabled. Enable ACNS on your cluster or set hubble.enabled=false in the extension configuration.
Chat rate limiting
Symptom: Chat requests return HTTP 429 with X-RateLimit-* or X-LLM-RateLimit-* response headers.
Cause: The built-in rate limiter is throttling requests to protect the service.
Resolution:
Container Network Insights Agent has three rate limiting layers:
| Rate limiter | Default | Behavior |
|---|---|---|
| Chat | 13 requests/second, burst of 13 | Per-session throttle on chat messages |
| Auth | 1 request/second, burst of 20 | Throttle on login and callback endpoints |
| LLM (adaptive) | 100 requests/second global, shared across users | Global throughput control with fair share per active user |
- For chat 429 errors: reduce message frequency and wait for the rate limit bucket to refill.
- For LLM 429 errors: check your Azure OpenAI Tokens-Per-Minute (TPM) quota in the Azure portal. Request a quota increase under Cognitive Services > Quotas if you need higher throughput.
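To estimate how long to wait before a bucket refills, the rate and burst figures in the rate limiter table can be plugged into a simple token bucket model. This is only a sketch: the source doesn't describe the limiter's internals, so the classic token bucket behavior, the whole-second resolution, and the helper names refill_tokens and seconds_until_available are all assumptions.

```shell
# Print the token count after ELAPSED seconds of refilling.
# $1 = refill rate (tokens/second), $2 = burst (bucket capacity),
# $3 = current tokens, $4 = elapsed seconds
refill_tokens() (
  rate=$1; burst=$2; tokens=$3; elapsed=$4
  tokens=$((tokens + rate * elapsed))
  # The bucket never holds more than its burst capacity.
  if [ "$tokens" -gt "$burst" ]; then
    tokens=$burst
  fi
  echo "$tokens"
)

# Print the whole seconds to wait until NEEDED tokens are available.
# $1 = refill rate (tokens/second), $2 = current tokens, $3 = needed tokens
seconds_until_available() (
  rate=$1; tokens=$2; needed=$3
  if [ "$tokens" -ge "$needed" ]; then
    echo 0
    exit 0
  fi
  deficit=$((needed - tokens))
  # Integer ceiling of deficit / rate.
  echo $(( (deficit + rate - 1) / rate ))
)
```

Under this model, an empty chat bucket (13 requests/second, burst of 13) is full again after one second, while an empty auth bucket (1 request/second, burst of 20) needs 20 seconds to regain its full burst.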
Chat message sent but no response
Symptom: A chat message is sent but no response appears. The request hangs or eventually returns a timeout error.
Cause: Azure OpenAI may be rate-limited or unreachable, no pre-warmed agents may be available yet, or a long-running diagnostic command is still executing.
Resolution:
Check whether the pod has active sessions and whether an agent is assigned:
kubectl port-forward svc/container-networking-agent-service -n kube-system 8080:80
curl -s http://localhost:8080/api/status/sessions | jq
Check pod logs for error patterns:
kubectl logs -n kube-system -l app=container-networking-agent --tail=50
| Log indicator | Cause | Fix |
|---|---|---|
| 429 errors | Azure OpenAI rate limited | Wait for the rate limit to reset; check your TPM quota |
| "No pre-warmed agents available" | Warmup pool not ready | Wait for initialization; see Warmup pool keeps failing |
| Connection timeouts | Network or NSG issue | Check pod network, DNS, and NSG rules |
If the request is still pending after 2 minutes, start a new conversation and send a simple query first (for example, "list pods in the default namespace") to verify the agent is responding before asking a complex diagnostic question.
Slow first request
Symptom: The first chat message after deployment or pod restart takes 10-30 seconds to respond.
Cause: Container Network Insights Agent maintains a pool of pre-warmed agents to reduce latency. After a pod restart, the warmup pool needs time to initialize each agent, which requires MCP plugin startup, Azure credential setup, and AI framework initialization.
Resolution: This is expected behavior. Wait for the /ready endpoint to return HTTP 200 before sending requests — that confirms at least one pre-warmed agent is available. Subsequent requests use the pre-warmed pool and respond faster (typically 5-10 seconds for simple queries).
kubectl port-forward svc/container-networking-agent-service -n kube-system 8080:80
curl -s http://localhost:8080/ready | jq
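Rather than checking /ready by hand, the wait can be scripted. The sketch below is one possible approach, assuming the port-forward above is already running in another terminal; wait_for_ready and probe_ready are hypothetical helper names, and the attempt count and delay are illustrative.

```shell
# wait_for_ready ATTEMPTS DELAY_SECONDS PROBE [ARGS...]
# PROBE must print an HTTP status code. Succeeds once PROBE prints 200,
# fails after ATTEMPTS unsuccessful tries.
wait_for_ready() (
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # A failing probe (for example, connection refused) counts as not ready.
    status=$("$@" 2>/dev/null || true)
    if [ "$status" = "200" ]; then
      exit 0
    fi
    # Pause between attempts, but not after the final one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  exit 1
)

# Probe the port-forwarded /ready endpoint and print only the status code.
probe_ready() {
  curl -s -o /dev/null -w '%{http_code}' http://localhost:8080/ready
}
```

For example, wait_for_ready 18 10 probe_ready polls every 10 seconds for about three minutes and returns success as soon as the endpoint answers with HTTP 200.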
Slow responses for complex diagnostics
Symptom: Diagnostic responses take 30 seconds to 2 minutes to complete.
Cause: Multi-step diagnostics involve sequential operations: an initial LLM classification call to Azure OpenAI, multiple kubectl/cilium/hubble commands run against the cluster, and a final LLM analysis of the collected evidence. Each step adds latency.
Resolution: This is expected for complex diagnostics. The following table shows typical response times:
| Query type | Expected time |
|---|---|
| Simple cluster queries (listing pods, services) | 5–10 seconds |
| Single-domain diagnostics (specific pod DNS check, service endpoint check) | 15–30 seconds |
| Multi-node packet drop analysis or broad networking diagnostics | 30–120 seconds |
To reduce latency:
- Use a specific query that targets a known symptom instead of a broad question. For example, "check DNS resolution for service my-svc in namespace my-ns" is faster than "diagnose all networking issues."
- Ensure your Azure OpenAI resource is in the same Azure region as your AKS cluster to minimize network round-trip time.
- Check your Azure OpenAI TPM quota — higher quota allows more parallel token processing.
Diagnostic commands time out
Symptom: The agent reports a command timed out, or the chat stops responding for more than 10 minutes before returning an error.
Cause: The default timeout for diagnostic commands (kubectl, cilium, hubble) is 600 seconds (10 minutes). Broad queries — such as collecting statistics from every node in a large cluster — can exceed this limit.
Resolution:
Scope your query to a specific node, pod, or namespace instead of the entire cluster. For example:
- Instead of: "Check packet drops across all nodes"
- Ask: "Check packet drops on node <specific-node-name>"
If timeouts happen consistently on a type of query, the cluster may have performance or connectivity issues that are independently slowing down command responses.
Check pod logs for timeout-related entries:
kubectl logs -n kube-system -l app=container-networking-agent | grep -i "timeout\|timed out"
Session data lost after pod restart
Symptom: All chat history and active sessions disappear after the pod restarts.
Cause: Session data is stored in-memory only. All data is lost when the pod restarts.
Resolution: This is expected behavior for the current architecture. Start a new session after a pod restart.
Session expires unexpectedly
Symptom: You are logged out without warning during an active session, or your session ends after a period of inactivity even though you were using the extension.
Cause: Container Network Insights Agent enforces session timeouts for security. Two independent limits apply:
| Timeout type | Default | Behavior |
|---|---|---|
| Idle timeout | 30 minutes | Session ends if there's no activity for 30 minutes |
| Absolute timeout | 8 hours | Session ends regardless of activity after 8 hours |
Resolution: Log in again to start a new session. Chat history from the expired session isn't recoverable.
Note
Session data is stored in-memory only. Even within an active session, a pod restart clears all session history.
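The interaction between the two limits can be illustrated with a small sketch. It is not the agent's implementation: session_state is a hypothetical helper, timestamps are plain seconds, and treating a session as expired exactly at the limit (rather than just after it) is an assumption.

```shell
# Defaults taken from the timeout table in this section.
IDLE_TIMEOUT_SECONDS=$((30 * 60))
ABSOLUTE_TIMEOUT_SECONDS=$((8 * 60 * 60))

# Print "active", "expired-idle", or "expired-absolute".
# $1 = current time, $2 = session creation time, $3 = last activity time
# (all in seconds on the same clock)
session_state() (
  now=$1; created=$2; last_activity=$3
  if [ $((now - created)) -ge "$ABSOLUTE_TIMEOUT_SECONDS" ]; then
    # The absolute limit applies regardless of recent activity.
    echo "expired-absolute"
  elif [ $((now - last_activity)) -ge "$IDLE_TIMEOUT_SECONDS" ]; then
    echo "expired-idle"
  else
    echo "active"
  fi
)
```

A session created at time 0 with activity at 28,700 seconds is still reported as expired-absolute at 28,800 seconds, which matches the case where you are logged out while actively using the extension.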
Chat context appears lost after many exchanges
Symptom: After approximately 15 exchanges, the agent seems to forget earlier parts of the conversation or doesn't reference context from earlier in the session.
Cause: Container Network Insights Agent summarizes conversation history to stay within the Azure OpenAI token limit. When the context window reaches approximately 15 messages, older messages are replaced by an automatically generated summary. The most recent messages and the summary are retained and passed to the model.
Resolution: This is expected behavior. The summarization preserves key diagnostic context while managing Azure OpenAI token limits. If you need to reference something from much earlier in the conversation:
- Repeat the relevant context: "Earlier you found X — can you investigate further?"
- Start a new conversation with a concise recap of the known findings.
Conversation limit reached
Symptom: The interface shows an error that you can't create a new conversation, or the oldest conversations disappear without being explicitly deleted.
Cause: Each user account is limited to 20 active conversations. Automatic cleanup starts when the count reaches 18 (90% of the 20-conversation limit) and removes the two least-recently-used conversations to make room.
Resolution: This automatic cleanup is expected behavior. If you can't create a new conversation, wait briefly for the background cleanup to run, then try again. The two least-recently-used conversations are removed automatically.
Note
Conversations are stored in-memory per pod. All conversations are lost if the pod restarts, regardless of how many exist.
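The cleanup rule described in this section can be sketched as follows. This is an illustration under stated assumptions, not the agent's code: evict_least_recently_used is a hypothetical helper, and the input format (one conversation per line, a numeric last-used timestamp followed by an ID) is invented for the example.

```shell
# Threshold and eviction count taken from the cleanup rule in this section.
CLEANUP_THRESHOLD=18
EVICT_COUNT=2

# Read "LAST_USED_TIMESTAMP CONVERSATION_ID" lines on stdin.
# At or above the threshold, print the list sorted by timestamp with the
# two least-recently-used entries dropped; otherwise print it unchanged.
evict_least_recently_used() (
  input=$(cat)
  count=$(printf '%s\n' "$input" | grep -c . || true)
  if [ "$count" -ge "$CLEANUP_THRESHOLD" ]; then
    # Oldest timestamps sort first, so skipping the first lines evicts them.
    printf '%s\n' "$input" | sort -n | tail -n +"$((EVICT_COUNT + 1))"
  else
    printf '%s\n' "$input"
  fi
)
```

With 18 conversations as input, the output contains 16 entries and the two with the smallest timestamps are gone, which is why the oldest conversations can disappear without being explicitly deleted.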
Debug DaemonSet persists after a crash
Symptom: The rx-troubleshooting-debug DaemonSet remains in the kube-system namespace after a diagnostic session.
Cause: Container Network Insights Agent deploys a lightweight debug DaemonSet during packet drop diagnostics. If the agent pod crashes unexpectedly during this diagnostic, the cleanup step doesn't run.
Resolution: Manually delete the DaemonSet:
kubectl delete ds rx-troubleshooting-debug -n kube-system
Packet drop diagnostic fails
Symptom: When asking the agent to investigate packet drops, it reports errors deploying diagnostic pods or cannot collect node-level statistics.
Cause: Packet drop diagnostics deploy a lightweight DaemonSet (rx-troubleshooting-debug) to each node to collect host-level network statistics (ethtool stats, softnet counters, ring buffer state). Failures occur if the agent's service account doesn't have permission to create DaemonSets in kube-system, or if nodes block the required privileged access to collect host network statistics.
Resolution:
Check whether the DaemonSet was created:
kubectl get daemonset -n kube-system rx-troubleshooting-debug
If it doesn't exist, the deployment step failed. Check pod logs:
kubectl logs -n kube-system -l app=container-networking-agent | grep -i "daemonset\|rx\|packet\|error"
If the DaemonSet was created but its pods aren't starting, describe them to find the cause:
kubectl describe pods -n kube-system -l app=cna-diagnostic
Verify the ClusterRole assigned to the agent includes DaemonSet creation permissions:
kubectl get clusterrole -l app=container-networking-agent -o yaml | grep -A2 daemonset
If the DaemonSet is left over from a failed run, delete it manually and ask the agent to retry:
kubectl delete daemonset -n kube-system -l app=cna-diagnostic
DNS diagnostics return incomplete or no results
Symptom: When troubleshooting a DNS issue, the agent returns partial diagnostic data, reports errors running DNS checks, or exits the investigation without results.
Cause: The agent's DNS diagnostic tools run resolution tests and inspect CoreDNS from inside the cluster. Incomplete results can occur if the agent's service account lacks cluster-level read access, CoreDNS pods aren't accessible, or individual commands hit the 30-second per-command timeout.
Resolution:
Verify CoreDNS is running:
kubectl get pods -n kube-system -l k8s-app=kube-dns
If CoreDNS pods aren't running, that's the root cause. Describe them for details:
kubectl describe pods -n kube-system -l k8s-app=kube-dns
Verify the managed identity has the Azure Kubernetes Service Cluster User Role assignment on the cluster. This role allows the agent to retrieve kubeconfig and run kubectl commands:
az role assignment list --assignee <identity-principal-id> --all -o table
If the agent reports command timeouts during DNS checks, narrow the scope of your question. For example, instead of "diagnose all DNS issues," ask "check DNS resolution for pod <pod-name> in namespace <namespace>."
Agent stops mid-investigation
Symptom: The agent begins a diagnostic investigation but stops before completing it, without providing a root cause analysis or a final report.
Cause: Several factors can interrupt a multi-step investigation:
- A diagnostic command timed out.
- The Azure OpenAI rate limit or token limit was hit mid-investigation.
- The conversation history reached its summarization threshold, causing the agent to lose track of the current plan.
Resolution:
- Ask the agent to continue in the same conversation: "Please continue the investigation" or "What other checks can you run?" The agent can resume from the current state.
- If timeouts are the cause, scope the next query more narrowly. For example, "check the specific namespace <name>" rather than the full cluster.
- If the investigation stopped due to rate limiting, wait a minute and ask the agent to proceed.
- For a fresh start, open a new conversation and provide a concise summary of what was already found: "I've confirmed DNS resolution is failing in namespace X. Can you investigate the NetworkPolicy for that namespace?"
Workload identity not enabled on the cluster
Symptom: Federated credential setup fails, or the agent pod cannot authenticate to Azure. Pod logs show "Failed to acquire token" or "AADSTS..." errors.
Cause: The AKS cluster was created without the OIDC issuer or workload identity enabled.
Resolution: Enable both features on your existing cluster using the az aks update command:
az aks update \
--resource-group $RESOURCE_GROUP \
--name $CLUSTER_NAME \
--enable-oidc-issuer \
--enable-workload-identity
After enabling, re-run the federated credential setup steps from the deployment guide to link the managed identity to the Kubernetes service account.
Azure OpenAI model not available in the selected region
Symptom: Azure OpenAI deployment creation fails, or the Container Network Insights Agent startup fails with an endpoint or model error immediately after deployment.
Cause: The Azure OpenAI model you selected isn't available in your chosen Azure region.
Resolution:
Check which models are available in your region:
az cognitiveservices model list -l <your-region> --output table
Use a region where your target model is available. Consult the Azure OpenAI model region support reference for current availability.
Verify your subscription has sufficient Tokens-Per-Minute (TPM) quota for the model. If model deployment fails with a quota error, request a quota increase in the Azure portal under Cognitive Services > Quotas.
Quick diagnostic commands
Use these commands to quickly diagnose common issues:
# ──── Pod Status ────
kubectl get pods -n kube-system -l app=container-networking-agent
kubectl describe pod -n kube-system -l app=container-networking-agent
kubectl top pod -n kube-system -l app=container-networking-agent
# ──── Application Logs ────
kubectl logs -n kube-system -l app=container-networking-agent --tail=200
kubectl logs -n kube-system -l app=container-networking-agent -f # Stream live
kubectl logs -n kube-system -l app=container-networking-agent | grep ERROR # Errors only
# ──── Health Checks (requires port-forward) ────
kubectl port-forward svc/container-networking-agent-service -n kube-system 8080:80
curl -s http://localhost:8080/ready | jq
curl -s http://localhost:8080/live | jq
curl -s http://localhost:8080/api/status/sessions | jq
# ──── Configuration ────
kubectl get configmap -n kube-system -l app=container-networking-agent -o yaml
kubectl get serviceaccount container-networking-agent-reader -n kube-system -o yaml
# ──── Workload Identity ────
kubectl describe serviceaccount container-networking-agent-reader -n kube-system
az identity show --name $IDENTITY_NAME -g $RESOURCE_GROUP --query "{clientId:clientId, principalId:principalId}"
# ──── RBAC ────
az role assignment list --assignee <principal-id> --output table
# ──── Extension Status ────
az k8s-extension show \
--cluster-name $CLUSTER_NAME \
--resource-group $RESOURCE_GROUP \
--cluster-type managedClusters \
--name containernetworkingagent \
--query "{state:provisioningState, version:version}" -o table
# ──── Cleanup Stuck Resources ────
kubectl delete daemonset rx-troubleshooting-debug -n kube-system # Leftover diagnostic DaemonSet
kubectl delete pod -n kube-system -l app=container-networking-agent # Force pod restart