"Error from server: error dialing backend: dial tcp" message

Summary

This article explains how to troubleshoot the "Error from server: error dialing backend: dial tcp" error message that you might receive when you use kubectl commands or other tools to connect to the API server. This error message indicates that the API server can't connect to an upstream component, such as a Kubernetes service or the kubelet, that is required for certain operations.

Symptoms

You receive an "Error from server: error dialing backend: dial tcp" error message when you take one of the following actions:

  • Use any of the kubectl logs, exec, attach, top, or port-forward commands.
  • Use third-party Kubernetes client tools to achieve the same functionality as the commands in the previous list item.

Why the error occurs

The Kubernetes API server has to forward API requests to an upstream component for several use cases. This error occurs if the API server can't establish a Transmission Control Protocol (TCP) connection to the upstream component. Examples of such upstream components include Kubernetes services that are inside the cluster and the kubelet on each node.

If the issue persists, a network blockage is likely the cause. To identify the responsible network configuration, first determine the scope of the problem.

Narrowing down: Are all kubectl subcommands failing?

Try to run at least the kubectl exec, kubectl logs <podname>, and kubectl top pods commands.

If only kubectl logs <podname> or kubectl exec fails, check whether the issue also occurs for pods that run on different nodes.

If only kubectl top pods fails, check whether the issue occurs for pods on all nodes or only for pods on one node.
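The scoping steps above can be sketched as a small shell helper. This is a minimal sketch, not an official tool: the function name probe_backend_paths is hypothetical, and the exec probe assumes that the container image includes a true binary.

```shell
# Hypothetical helper: run each API-server-proxied command once and report
# which ones fail. Usage: probe_backend_paths <namespace> <podname>
probe_backend_paths() {
  ns="$1"
  pod="$2"

  # kubectl logs is served through the kubelet on the pod's node.
  if kubectl logs "$pod" -n "$ns" --tail=1 >/dev/null 2>&1; then
    echo "logs: ok"
  else
    echo "logs: FAILED"
  fi

  # kubectl exec also goes through the kubelet.
  # Assumption: the container provides a "true" binary.
  if kubectl exec "$pod" -n "$ns" -- true >/dev/null 2>&1; then
    echo "exec: ok"
  else
    echo "exec: FAILED"
  fi

  # kubectl top is served through the metrics-server service.
  if kubectl top pods -n "$ns" >/dev/null 2>&1; then
    echo "top: ok"
  else
    echo "top: FAILED"
  fi

  # Show the node that hosts the pod, so results can be compared across nodes.
  kubectl get pod "$pod" -n "$ns" -o wide
}
```

For example, running probe_backend_paths default <podname> against pods on two different nodes shows whether the failure follows a single node or affects the whole cluster.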

Cause 1: kubelet port (node:10250) blocked

Pod-specific commands, such as kubectl logs and kubectl exec, fail if the API server can't reach the node on port 10250 to access the kubelet API. A Network Security Group (NSG) or firewall that blocks the connection can cause this failure.

To resolve the issue, check whether the NSG on the node subnet includes an inbound rule that might block TCP port 10250.
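One way to review the NSG is to list its inbound Deny rules by using the Azure CLI. The following is a minimal sketch: the function name list_inbound_deny_rules is hypothetical, and the resource group and NSG names are placeholders that you supply. Rules that specify several ports store them in destinationPortRanges instead, so an empty ports column doesn't mean that the rule ignores port 10250.

```shell
# Hypothetical helper: list inbound Deny rules of an NSG so that they can be
# checked against TCP port 10250.
# Usage: list_inbound_deny_rules <resource-group> <nsg-name>
list_inbound_deny_rules() {
  az network nsg rule list \
    --resource-group "$1" \
    --nsg-name "$2" \
    --query "[?direction=='Inbound' && access=='Deny'].{name:name, priority:priority, protocol:protocol, ports:destinationPortRange}" \
    --output table
}
```

Any listed rule whose protocol and port range cover TCP 10250, and whose priority number is lower than that of the matching Allow rule, is a candidate for the blockage.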

Cause 2: specific service failure

The API server calls the metrics-server service (svc/metrics-server) in the kube-system namespace to serve kubectl top commands. In other scenarios, such as admission webhooks, the API server must also reach other in-cluster services. The exact error message varies, depending on how the service fails.

To troubleshoot the issue, check the error message to identify which service is affected and review the status of the related pods, services, and endpoints.
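For the metrics-server case, the review of pods, services, and endpoints might look like the following sketch. The function name check_metrics_server is hypothetical, and the k8s-app=metrics-server label is an assumption based on the upstream metrics-server manifests. Verify the label in your cluster before you rely on it.

```shell
# Hypothetical helper: inspect the objects behind "kubectl top".
check_metrics_server() {
  # APIService registration that routes metrics requests to the service.
  kubectl get apiservice v1beta1.metrics.k8s.io

  # The service and its endpoints. An empty endpoints list means that no
  # ready pod currently backs the service.
  kubectl get service,endpoints metrics-server -n kube-system

  # Pods behind the service (label is an assumption, see the text above).
  kubectl get pods -n kube-system -l k8s-app=metrics-server -o wide
}
```

For a different service, such as an admission webhook backend, substitute the service name and namespace that appear in the error message.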

Cause 3: Konnectivity or tunnel failure

When API Server VNet Integration isn't enabled, AKS deploys a tunnel solution that proxies API server requests to in-cluster networking locations. Most AKS clusters use the Konnectivity solution. Konnectivity doesn't require that you open special ports on the API server. For more information, see AKS required network rules.

To resolve the issue, check whether the konnectivity-agent pods in the kube-system namespace are running and whether their containers are in a ready state. Then try to delete the pods, and check whether the connection recovers after the replacement pods are ready.
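The check and the pod deletion can be sketched as follows. Both function names are hypothetical. The sketch assumes that the agent runs as a deployment named konnectivity-agent and that its pods carry the app=konnectivity-agent label. Confirm both in your cluster before you delete anything.

```shell
# Hypothetical helper: show the state of the Konnectivity agent.
check_konnectivity_agent() {
  # Assumption: the agent is a deployment named konnectivity-agent.
  kubectl get deployment konnectivity-agent -n kube-system
  # Assumption: agent pods are labeled app=konnectivity-agent.
  kubectl get pods -n kube-system -l app=konnectivity-agent -o wide
}

# Hypothetical helper: delete the agent pods and wait for replacements.
recreate_konnectivity_agent_pods() {
  kubectl delete pods -n kube-system -l app=konnectivity-agent
  # Wait until the deployment reports that the new pods are ready.
  kubectl rollout status deployment/konnectivity-agent -n kube-system --timeout=120s
}
```

After the replacement pods are ready, rerun the failing kubectl command to confirm whether the tunnel has recovered.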

Third-party information disclaimer

The third-party products that this article discusses are manufactured by companies that are independent of Microsoft. Microsoft makes no warranty, implied or otherwise, about the performance or reliability of these products.