
K8s network in detail

In fact, before operating here, it is necessary to really understand the network mechanism and basic structure of K8s; otherwise, when you actually hit problems, troubleshooting will be all the more frustrating.

First of all, understand what K8s is for: it orchestrates and manages containers. Its smallest component is actually not the container but the pod; a physical or virtual machine is called a node. The pod is the basic unit, and a pod can hold one container or several. Containers in the same pod share the network and host configuration, which means they can communicate with each other directly over localhost, much like processes on the same machine, so isolation and security between them are not a concern; viewed from the outside, a pod is a single environment, and the pod is the business entity of that environment. So, the first question: can different containers of the same pod sit on different nodes? Of course not, they must be on the same node, precisely because they share the host and network. Then how do you know whether a pod has more than one container, and can you specify which container you want when you run kubectl? Of course you can; refer to the following commands:
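As a sketch, with `mypod` and `app` standing in for real pod and container names:

```bash
# List the containers inside a pod:
kubectl get pod mypod -o jsonpath='{.spec.containers[*].name}'

# Run a command in one specific container of a multi-container pod (-c):
kubectl exec -it mypod -c app -- /bin/sh

# Likewise, pull logs from just that container:
kubectl logs mypod -c app
```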

So, from here on you can set aside the concept of containers and think in terms of pods alone; after all, the pod is the smallest scheduling unit of K8s. Then how do pods communicate with one another?

Pod communication is inseparable from the K8s network model:

Flannel forms one large, flat layer-2 network: pod IPs are allocated uniformly by flannel, and pod-to-pod traffic also goes through the flannel bridge.

Each node creates a flannel0 virtual NIC for communication across nodes, so containers can talk to each other directly using pod IPs.

When communicating across nodes, data on the sending node is routed from docker0 to the flannel0 virtual NIC; on the receiving node it travels the reverse path, from flannel0 back to docker0.
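You can watch this path on a node directly. A minimal sketch, assuming the classic UDP-backend setup described above (newer flannel backends name the NIC flannel.1 instead of flannel0, and the 10.244.* subnets are just common defaults, not values from this article):

```bash
# The two NICs involved in cross-node pod traffic:
ip addr show docker0
ip addr show flannel0

# The route table: other nodes' pod subnets go out via flannel0,
# while the local pod subnet stays on docker0. Typical entries:
#   10.244.0.0/16 dev flannel0    (cross-node pod traffic)
#   10.244.1.0/24 dev docker0     (this node's pod subnet)
ip route
```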

If a Pod were nothing more than a collection of application containers, then Service would indeed be meaningless; its value shows once the application needs load balancing and full lifecycle tracking and management. So Service is an abstract concept: it defines a logical set of Pods and a policy for accessing them.
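As a minimal sketch (the names my-svc and app: my-app are illustrative), a Service declares its Pod set via a label selector and the port it exposes:

```bash
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: my-svc
spec:
  selector:
    app: my-app        # the logical set of Pods this Service fronts
  ports:
    - port: 80         # the Service's ClusterIP port
      targetPort: 8080 # the container port on each selected Pod
EOF
```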

A very common scenario: a Pod stops running for some reason, and a new Pod is started according to the deployment spec to take over the old Pod's duties, but flannel assigns the new Pod a new IP address. By itself this would mean a lot of effort, because many configuration items of the application services would have to be adjusted; with Service, this is no longer a problem. Let's look at how Service works.

The operation mechanism of Service is as follows. When Service A is created, the Service Controller and the Endpoints Controller are triggered to update related resources: based on the Pod selector configured in the Service, an EndPoint resource recording the matching Pods is created and stored in etcd. kube-proxy, in turn, updates the iptables chains, generating rules that link the Service's Cluster IP to the corresponding Pods. From then on, when a pod in the cluster wants to access the service, it simply sends requests to cluster ip:port, and the iptables rules steer each request to one of the backing Pods. Two algorithms are available for picking the Pod at this layer: Round Robin and Session Affinity.

Besides this iptables mode, there is a more primitive userspace forwarding mode, in which kube-proxy listens on a random port (the Proxy Port) for each Service and adds an iptables rule: packets from a client to ClusterIP:Port are redirected to the Proxy Port, and kube-proxy receives them and distributes them to the corresponding Pods via Round Robin or Session Affinity, where with session affinity requests from the same client IP keep going to the same Pod.
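To see these pieces on a live cluster, reusing the illustrative my-svc from above (a sketch; KUBE-SERVICES is the chain kube-proxy maintains in iptables mode):

```bash
# The Endpoints resource tracks the current pod IPs behind the Service:
kubectl get endpoints my-svc

# On a node, kube-proxy's rules live in the nat table (root required);
# the comments kube-proxy adds make the Service easy to grep for:
sudo iptables -t nat -L KUBE-SERVICES -n | grep my-svc

# Session affinity is opted into per Service:
kubectl patch svc my-svc -p '{"spec":{"sessionAffinity":"ClientIP"}}'
```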

Of course, newer versions of K8s are replacing iptables with IPVS, but the overall pattern is similar to the iptables mode.
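On a kubeadm-style cluster you can check and switch the mode roughly like this (a sketch; the configmap name kube-proxy is the kubeadm default and may differ elsewhere):

```bash
# See which proxy mode kube-proxy is configured with:
kubectl -n kube-system get configmap kube-proxy -o yaml | grep mode

# Setting mode: "ipvs" in that configuration (and restarting kube-proxy)
# makes it program IPVS virtual servers instead of iptables chains;
# the resulting table can be inspected with:
sudo ipvsadm -Ln
```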



IPVS, part of the LVS project, is a Layer 4 load balancer that runs in the Linux kernel and performs exceptionally well: with a tuned kernel it can easily handle more than 100,000 forwarding requests per second. IPVS is currently widely used in medium and large Internet deployments to take traffic at the web entry layer.
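For a feel of what kube-proxy programs under the hood, here is a minimal hand-built sketch with ipvsadm (addresses are illustrative; run as root): a virtual service on a VIP, round-robined across two real servers in NAT mode, analogous to a ClusterIP fronting two pods.

```bash
# Create a TCP virtual service on the VIP with round-robin scheduling:
sudo ipvsadm -A -t 10.0.0.100:80 -s rr

# Attach two real servers (backends) in NAT/masquerade mode (-m):
sudo ipvsadm -a -t 10.0.0.100:80 -r 10.0.0.11:8080 -m
sudo ipvsadm -a -t 10.0.0.100:80 -r 10.0.0.12:8080 -m

# ipvsadm -Ln (as above) now lists the virtual service and its backends.
```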