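How kubelet calls the network
When kubelet provisions a pod, it has to interact with the network components to give the pod networking.
kubelet/runtime
The network call flow is initiated by the container runtime. This walkthrough uses the docker runtime (dockershim) as the example. dockerService manages networking through a network plugin manager (network.PluginManager). The code is in kubelet/dockershim/docker_service.go.
Initializing the network plugin manager
dockerService has the fields below; network is the network plugin manager, which does its work through the NetworkPlugin it wraps.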
type dockerService struct {
...
network *network.PluginManager
// Map of podSandboxID :: network-is-ready
networkReady map[string]bool
networkReadyLock sync.Mutex
}
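When a dockerService is instantiated, it initializes the corresponding network plugin manager, and the plugin that the manager actually uses is cniNetworkPlugin. networkplugin/cniNetworkPlugin can be seen as the adapter between kubelet and the CNI framework.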
// dockershim currently only supports CNI plugins.
pluginSettings.PluginBinDirs = cni.SplitDirs(pluginSettings.PluginBinDirString)
cniPlugins := cni.ProbeNetworkPlugins(pluginSettings.PluginConfDir, pluginSettings.PluginBinDirs)
cniPlugins = append(cniPlugins, kubenet.NewPlugin(pluginSettings.PluginBinDirs))
netHost := &dockerNetworkHost{
&namespaceGetter{ds},
&portMappingGetter{ds},
}
plug, err := network.InitNetworkPlugin(cniPlugins, pluginSettings.PluginName, netHost, pluginSettings.HairpinMode, pluginSettings.NonMasqueradeCIDR, pluginSettings.MTU)
if err != nil {
return nil, fmt.Errorf("didn't find compatible CNI plugin with given settings %+v: %v", pluginSettings, err)
}
ds.network = network.NewPluginManager(plug)
klog.Infof("Docker cri networking managed by %v", plug.Name())
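To make the selection step more concrete, here is a minimal sketch of the idea behind network.InitNetworkPlugin: pick the candidate whose name matches the configured plugin name, initialize it, and fall back to a no-op plugin when no name is given. The types networkPlugin and namedPlugin and the function selectPlugin are hypothetical names invented for this sketch; the real function takes more parameters (host, hairpin mode, MTU and so on).

```go
package main

import (
	"errors"
	"fmt"
)

// networkPlugin is a hypothetical, trimmed-down version of the NetworkPlugin interface.
type networkPlugin interface {
	Name() string
	Init() error
}

// namedPlugin is a toy plugin used only for this sketch.
type namedPlugin struct {
	name   string
	inited bool
}

func (p *namedPlugin) Name() string { return p.name }
func (p *namedPlugin) Init() error  { p.inited = true; return nil }

// selectPlugin picks the plugin whose name matches wanted and initializes it.
// An empty name falls back to a no-op plugin. This mirrors the idea, not the
// exact code, of network.InitNetworkPlugin.
func selectPlugin(candidates []networkPlugin, wanted string) (networkPlugin, error) {
	if wanted == "" {
		noop := &namedPlugin{name: "noop"}
		return noop, noop.Init()
	}
	for _, p := range candidates {
		if p.Name() == wanted {
			if err := p.Init(); err != nil {
				return nil, fmt.Errorf("plugin %q failed init: %w", wanted, err)
			}
			return p, nil
		}
	}
	return nil, errors.New("network plugin " + wanted + " not found")
}

func main() {
	candidates := []networkPlugin{&namedPlugin{name: "cni"}, &namedPlugin{name: "kubenet"}}
	plug, err := selectPlugin(candidates, "cni")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("networking managed by", plug.Name())
}
```

Only the selected plugin is initialized, which is why the error message in the excerpt above talks about not finding a compatible plugin for the given settings.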
Invocation
When dockerService creates a pod sandbox, it calls the network plugin manager to set up networking.
File: kubelet/dockershim/docker_sandbox.go
- Joining the network
func (ds *dockerService) RunPodSandbox(ctx context.Context, r *runtimeapi.RunPodSandboxRequest) (*runtimeapi.RunPodSandboxResponse, error) {
....
err = ds.network.SetUpPod(config.GetMetadata().Namespace, config.GetMetadata().Name, cID, config.Annotations, networkOptions)
....
}
RunPodSandbox is a CRI method. In it, dockerService calls network (the network plugin manager) to attach the pod to the network.
- Leaving the network
func (ds *dockerService) StopPodSandbox(ctx context.Context, r *runtimeapi.StopPodSandboxRequest) (*runtimeapi.StopPodSandboxResponse, error) {
....
err := ds.network.TearDownPod(namespace, name, cID)
....
}
StopPodSandbox is a CRI method. In it, dockerService calls network (the network plugin manager) to detach the pod from the network.
networkpluginmanager
The network plugin manager is fairly simple: it is a wrapper around a NetworkPlugin. The code is in kubelet/dockershim/network/plugins.go.
// The PluginManager wraps a kubelet network plugin and provides synchronization
// for a given pod's network operations. Each pod's setup/teardown/status operations
// are synchronized against each other, but network operations of other pods can
// proceed in parallel.
type PluginManager struct {
// Network plugin being wrapped
plugin NetworkPlugin
// Pod list and lock
podsLock sync.Mutex
pods map[string]*podLock
}
The network plugin manager mainly implements two methods, both of which delegate to the wrapped NetworkPlugin.
func (pm *PluginManager) SetUpPod(podNamespace, podName string, id kubecontainer.ContainerID, annotations, options map[string]string) error
func (pm *PluginManager) TearDownPod(podNamespace, podName string, id kubecontainer.ContainerID) error
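The comment on PluginManager says that operations on one pod are serialized while other pods proceed in parallel. The following is a minimal sketch of how such per-pod locking can work, using a reference-counted mutex per pod. The names plugin, manager, lockPod, unlockPod and recorder are hypothetical and the plugin interface is heavily trimmed; this illustrates the technique rather than reproducing the kubelet code.

```go
package main

import (
	"fmt"
	"sync"
)

// plugin is a hypothetical, trimmed-down stand-in for NetworkPlugin.
type plugin interface {
	SetUpPod(namespace, name, sandboxID string) error
	TearDownPod(namespace, name, sandboxID string) error
}

// podLock pairs a mutex with a reference count so unused entries can be dropped.
type podLock struct {
	refcount uint
	mu       sync.Mutex
}

type manager struct {
	plugin   plugin
	podsLock sync.Mutex // guards the pods map only
	pods     map[string]*podLock
}

func newManager(p plugin) *manager {
	return &manager{plugin: p, pods: make(map[string]*podLock)}
}

// lockPod returns the (not yet locked) mutex for a pod and takes a reference on it.
func (m *manager) lockPod(key string) *sync.Mutex {
	m.podsLock.Lock()
	defer m.podsLock.Unlock()
	l, ok := m.pods[key]
	if !ok {
		l = &podLock{}
		m.pods[key] = l
	}
	l.refcount++
	return &l.mu
}

// unlockPod releases the pod's mutex and drops the entry once no one references it.
func (m *manager) unlockPod(key string) {
	m.podsLock.Lock()
	defer m.podsLock.Unlock()
	l, ok := m.pods[key]
	if !ok {
		return
	}
	l.refcount--
	l.mu.Unlock()
	if l.refcount == 0 {
		delete(m.pods, key)
	}
}

func (m *manager) SetUpPod(namespace, name, sandboxID string) error {
	key := namespace + "/" + name
	m.lockPod(key).Lock() // blocks only callers working on the same pod
	defer m.unlockPod(key)
	return m.plugin.SetUpPod(namespace, name, sandboxID)
}

func (m *manager) TearDownPod(namespace, name, sandboxID string) error {
	key := namespace + "/" + name
	m.lockPod(key).Lock()
	defer m.unlockPod(key)
	return m.plugin.TearDownPod(namespace, name, sandboxID)
}

// recorder is a toy plugin that records the calls it receives.
type recorder struct {
	mu    sync.Mutex
	calls []string
}

func (r *recorder) SetUpPod(ns, name, id string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.calls = append(r.calls, "setup "+ns+"/"+name)
	return nil
}

func (r *recorder) TearDownPod(ns, name, id string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.calls = append(r.calls, "teardown "+ns+"/"+name)
	return nil
}

func main() {
	rec := &recorder{}
	m := newManager(rec)
	_ = m.SetUpPod("default", "web", "sandbox-web")
	_ = m.TearDownPod("default", "web", "sandbox-web")
	fmt.Println(rec.calls, "pod locks left:", len(m.pods))
}
```

The map lock is held only while looking up or releasing an entry, never across the plugin call, so a slow setup for one pod does not delay other pods.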
network plugin
NetworkPlugin is the network plugin interface in kubelet. It provides the methods below; the code is in kubelet/dockershim/network/plugins.go.
// NetworkPlugin is an interface to network plugins for the kubelet
type NetworkPlugin interface {
Init(host Host, hairpinMode kubeletconfig.HairpinMode, nonMasqueradeCIDR string, mtu int) error
Event(name string, details map[string]interface{})
Name() string
Capabilities() utilsets.Int
SetUpPod(namespace string, name string, podSandboxID kubecontainer.ContainerID, annotations, options map[string]string) error
TearDownPod(namespace string, name string, podSandboxID kubecontainer.ContainerID) error
GetPodNetworkStatus(namespace string, name string, podSandboxID kubecontainer.ContainerID) (*PodNetworkStatus, error)
Status() error
}
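cniNetworkPlugin and kubenetNetworkPlugin are both concrete implementations of this interface.
Note that cniNetworkPlugin is a strategy for implementing networking as a whole, not a concrete mechanism. It interacts with a concrete, out-of-tree network mechanism such as calico to provide the actual networking.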
cniNetworkPlugin
cniNetworkPlugin is one implementation of NetworkPlugin. The code is in the kubelet/dockershim/network/cni package.
type cniNetworkPlugin struct {
network.NoopNetworkPlugin
loNetwork *cniNetwork
sync.RWMutex
defaultNetwork *cniNetwork
host network.Host
execer utilexec.Interface
nsenterPath string
confDir string
binDirs []string
podCidr string
}
type cniNetwork struct {
name string
NetworkConfig *libcni.NetworkConfigList
CNIConfig libcni.CNI
}
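cniNetworkPlugin holds a network of type cniNetwork named defaultNetwork.
cniNetwork holds a concrete CNI network configuration, NetworkConfig, and a CNIConfig that implements the libcni.CNI interface. NetworkConfig carries the concrete network type, for example calico, while CNIConfig holds the directories searched for plugin binaries, such as /opt/cni/bin/.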
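As a rough illustration of what a network configuration list contains, the sketch below parses a small JSON document with the standard library and prints the network name and the plugin types in its chain. The confList and pluginConf types and the sampleConf document are assumptions made for this example: they model only a few fields and are not the libcni types or a configuration taken from a real cluster.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// confList models only the fields of a CNI configuration list that this sketch needs.
type confList struct {
	CNIVersion string       `json:"cniVersion"`
	Name       string       `json:"name"`
	Plugins    []pluginConf `json:"plugins"`
}

// pluginConf is one entry in the plugin chain; Type names the plugin binary.
type pluginConf struct {
	Type         string          `json:"type"`
	Capabilities map[string]bool `json:"capabilities,omitempty"`
}

// sampleConf is an illustrative configuration, not one taken from a real cluster.
const sampleConf = `{
  "cniVersion": "0.3.1",
  "name": "k8s-pod-network",
  "plugins": [
    {"type": "calico"},
    {"type": "portmap", "capabilities": {"portMappings": true}}
  ]
}`

// parseConfList decodes a configuration list and rejects one with an empty chain.
func parseConfList(data []byte) (*confList, error) {
	var c confList
	if err := json.Unmarshal(data, &c); err != nil {
		return nil, err
	}
	if len(c.Plugins) == 0 {
		return nil, fmt.Errorf("network %q has no plugins", c.Name)
	}
	return &c, nil
}

func main() {
	c, err := parseConfList([]byte(sampleConf))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("network:", c.Name)
	for _, p := range c.Plugins {
		fmt.Println("plugin type:", p.Type)
	}
}
```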
So what is the libcni.CNI interface?
cni
CNI interface definition
type CNI interface {
AddNetworkList(ctx context.Context, net *NetworkConfigList, rt *RuntimeConf) (types.Result, error)
CheckNetworkList(ctx context.Context, net *NetworkConfigList, rt *RuntimeConf) error
DelNetworkList(ctx context.Context, net *NetworkConfigList, rt *RuntimeConf) error
AddNetwork(ctx context.Context, net *NetworkConfig, rt *RuntimeConf) (types.Result, error)
CheckNetwork(ctx context.Context, net *NetworkConfig, rt *RuntimeConf) error
DelNetwork(ctx context.Context, net *NetworkConfig, rt *RuntimeConf) error
GetNetworkCachedResult(net *NetworkConfig, rt *RuntimeConf) (types.Result, error)
ValidateNetworkList(ctx context.Context, net *NetworkConfigList) ([]string, error)
ValidateNetwork(ctx context.Context, net *NetworkConfig) ([]string, error)
}
CNIConfig is a concrete implementation of CNI.
type CNIConfig struct {
Path []string
exec invoke.Exec
}
// CNIConfig implements the CNI interface
var _ CNI = &CNIConfig{}
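CNIConfig.Path is the list of directories searched for plugin binaries, and under the CNI specification a plugin is an executable that receives its parameters through CNI_* environment variables (with the network configuration passed as JSON on stdin). The sketch below shows those two steps in isolation: locating a plugin by its type name and assembling the environment. findInPath and buildEnv are hypothetical helpers written for this example, and nothing is actually executed.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// findInPath returns the first regular file named plugin under the given directories.
func findInPath(plugin string, dirs []string) (string, error) {
	for _, dir := range dirs {
		full := filepath.Join(dir, plugin)
		if info, err := os.Stat(full); err == nil && info.Mode().IsRegular() {
			return full, nil
		}
	}
	return "", fmt.Errorf("failed to find plugin %q in path %v", plugin, dirs)
}

// buildEnv assembles the CNI_* environment variables handed to a plugin binary.
func buildEnv(command, containerID, netns, ifName, args string, dirs []string) []string {
	return []string{
		"CNI_COMMAND=" + command,
		"CNI_CONTAINERID=" + containerID,
		"CNI_NETNS=" + netns,
		"CNI_IFNAME=" + ifName,
		"CNI_ARGS=" + args,
		"CNI_PATH=" + strings.Join(dirs, string(os.PathListSeparator)),
	}
}

func main() {
	// Create a throwaway directory with a fake "calico" binary to search for.
	dir, err := os.MkdirTemp("", "cni-bin")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(dir)
	if err := os.WriteFile(filepath.Join(dir, "calico"), []byte("#!/bin/sh\n"), 0o755); err != nil {
		fmt.Println("error:", err)
		return
	}

	path, err := findInPath("calico", []string{"/nonexistent", dir})
	fmt.Println("resolved:", path, "err:", err)
	for _, kv := range buildEnv("ADD", "abc123", "/proc/1234/ns/net", "eth0", "K8S_POD_NAME=web-0", []string{dir}) {
		fmt.Println(kv)
	}
}
```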
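cniNetworkPlugin and CNI
cniNetworkPlugin is a concrete implementation of NetworkPlugin, so it has to implement the NetworkPlugin methods, including SetUpPod and TearDownPod.
SetUpPod
SetUpPod calls plugin.addToNetwork to add the current container to a network.
addToNetwork builds the runtimeConf and netConf arguments and calls the standard CNI interface AddNetworkList.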
func (plugin *cniNetworkPlugin) addToNetwork(network *cniNetwork, podName string, podNamespace string, podSandboxID kubecontainer.ContainerID, podNetnsPath string, annotations, options map[string]string) (cnitypes.Result, error) {
rt, err := plugin.buildCNIRuntimeConf(podName, podNamespace, podSandboxID, podNetnsPath, annotations, options)
....
netConf, cniNet := network.NetworkConfig, network.CNIConfig
....
res, err := cniNet.AddNetworkList(netConf, rt)
....
}
TearDownPod
TearDownPod calls plugin.deleteFromNetwork to remove the current container from a network. deleteFromNetwork builds the runtimeConf and netConf arguments and calls the standard CNI interface DelNetworkList.
func (plugin *cniNetworkPlugin) deleteFromNetwork(network *cniNetwork, podName string, podNamespace string, podSandboxID kubecontainer.ContainerID, podNetnsPath string, annotations map[string]string) error {
rt, err := plugin.buildCNIRuntimeConf(podName, podNamespace, podSandboxID, podNetnsPath, annotations, nil)
....
netConf, cniNet := network.NetworkConfig, network.CNIConfig
....
err = cniNet.DelNetworkList(netConf, rt)
....
}
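As the analysis above shows, kubelet creates and deletes pod networking by calling cniNetworkPlugin, and cniNetworkPlugin does so by calling the CNI API to interact with the CNI framework. cniNetworkPlugin is the adapter between kubelet and a concrete CNI network solution such as calico.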
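The adapter relationship can be sketched in a few lines: kubelet-style SetUpPod/TearDownPod calls go in, CNI-style AddNetworkList/DelNetworkList calls come out. Everything here is a simplified assumption for illustration: runtimeConf and cniAPI are trimmed stand-ins for libcni.RuntimeConf and libcni.CNI, fakeCNI replaces a real network such as calico, and adapter plays the role of cniNetworkPlugin.

```go
package main

import (
	"errors"
	"fmt"
)

// runtimeConf is a hypothetical, trimmed-down stand-in for libcni.RuntimeConf.
type runtimeConf struct {
	ContainerID string
	NetNS       string
	IfName      string
}

// cniAPI is a hypothetical, trimmed-down stand-in for libcni.CNI.
type cniAPI interface {
	AddNetworkList(netName string, rt *runtimeConf) error
	DelNetworkList(netName string, rt *runtimeConf) error
}

// fakeCNI stands in for a concrete network and tracks attachments in memory.
type fakeCNI struct {
	attached map[string]string // container ID -> network name
}

func (f *fakeCNI) AddNetworkList(netName string, rt *runtimeConf) error {
	if rt.NetNS == "" {
		return errors.New("missing network namespace path")
	}
	f.attached[rt.ContainerID] = netName
	return nil
}

func (f *fakeCNI) DelNetworkList(netName string, rt *runtimeConf) error {
	delete(f.attached, rt.ContainerID)
	return nil
}

// adapter plays the role of cniNetworkPlugin in this sketch.
type adapter struct {
	netName string
	cni     cniAPI
}

// SetUpPod translates the kubelet-side call into a CNI "add" call.
func (a *adapter) SetUpPod(namespace, name, sandboxID, netnsPath string) error {
	rt := &runtimeConf{ContainerID: sandboxID, NetNS: netnsPath, IfName: "eth0"}
	if err := a.cni.AddNetworkList(a.netName, rt); err != nil {
		return fmt.Errorf("adding pod %s/%s to network %q: %w", namespace, name, a.netName, err)
	}
	return nil
}

// TearDownPod translates the kubelet-side call into a CNI "delete" call.
func (a *adapter) TearDownPod(namespace, name, sandboxID, netnsPath string) error {
	rt := &runtimeConf{ContainerID: sandboxID, NetNS: netnsPath, IfName: "eth0"}
	if err := a.cni.DelNetworkList(a.netName, rt); err != nil {
		return fmt.Errorf("removing pod %s/%s from network %q: %w", namespace, name, a.netName, err)
	}
	return nil
}

func main() {
	fake := &fakeCNI{attached: map[string]string{}}
	a := &adapter{netName: "k8s-pod-network", cni: fake}
	fmt.Println(a.SetUpPod("default", "web-0", "abc123", "/proc/1234/ns/net"), fake.attached)
	fmt.Println(a.TearDownPod("default", "web-0", "abc123", "/proc/1234/ns/net"), fake.attached)
}
```

Because adapter depends only on the cniAPI interface, the concrete network behind it can be swapped without touching the kubelet-facing methods.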
buildCNIRuntimeConf
The runtime builds the RuntimeConf through the buildCNIRuntimeConf method.
rt := &libcni.RuntimeConf{
ContainerID: podSandboxID.ID,
NetNS: podNetnsPath,
IfName: network.DefaultInterfaceName,
CacheDir: plugin.cacheDir,
Args: [][2]string{
{"IgnoreUnknown", "1"},
{"K8S_POD_NAMESPACE", podNs},
{"K8S_POD_NAME", podName},
{"K8S_POD_INFRA_CONTAINER_ID", podSandboxID.ID},
},
}
- ContainerID: the ID of the pod's sandbox container.
- NetNS: the path of the pod's network namespace.
- IfName: the name of the interface, for example eth0.
Args carries orchestrator-specific information:
- K8S_POD_NAMESPACE
- K8S_POD_NAME
- K8S_POD_INFRA_CONTAINER_ID
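Plugins receive these key/value pairs through the CNI_ARGS environment variable, as a semicolon-separated list of KEY=VALUE entries. A minimal sketch of that encoding follows; encodeArgs is a hypothetical helper, and the pod name, namespace and container ID are made-up sample values.

```go
package main

import (
	"fmt"
	"strings"
)

// encodeArgs joins key/value pairs as "K1=V1;K2=V2", the format used for CNI_ARGS.
func encodeArgs(args [][2]string) string {
	parts := make([]string, 0, len(args))
	for _, kv := range args {
		parts = append(parts, kv[0]+"="+kv[1])
	}
	return strings.Join(parts, ";")
}

func main() {
	// Sample values only; in kubelet these come from the pod and its sandbox.
	args := [][2]string{
		{"IgnoreUnknown", "1"},
		{"K8S_POD_NAMESPACE", "default"},
		{"K8S_POD_NAME", "web-0"},
		{"K8S_POD_INFRA_CONTAINER_ID", "abc123"},
	}
	fmt.Println("CNI_ARGS=" + encodeArgs(args))
}
```

IgnoreUnknown=1 tells a plugin to ignore keys it does not recognize, so plugins that know nothing about Kubernetes can still run with these arguments present.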
CapabilityArgs in RuntimeConf carries information such as portMappings, bandwidth, ipRanges, and dns.
rt.CapabilityArgs = map[string]interface{}{
"portMappings": portMappingsParam,
}
...
rt.CapabilityArgs["bandwidth"] = bandwidthParam
...
rt.CapabilityArgs["ipRanges"] = [][]cniIPRange{{{Subnet: plugin.podCidr}}}
...
rt.CapabilityArgs["dns"] = *dnsParam
An entry in CapabilityArgs only takes effect if the concrete network described by NetworkConfig supports the corresponding capability.
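The filtering can be sketched as follows: a plugin declares the capabilities it supports in its configuration, and only the matching CapabilityArgs entries are passed on to it. filterCapabilityArgs is a hypothetical function that follows the general approach libcni takes when it builds a plugin's runtimeConfig; it is a simplified illustration, not the library code, and the argument values are made up.

```go
package main

import "fmt"

// filterCapabilityArgs keeps only the runtime arguments whose capability the
// plugin has declared as true in its configuration.
func filterCapabilityArgs(declared map[string]bool, capabilityArgs map[string]interface{}) map[string]interface{} {
	runtimeConfig := make(map[string]interface{})
	for name, enabled := range declared {
		if !enabled {
			continue // declared but switched off
		}
		if value, ok := capabilityArgs[name]; ok {
			runtimeConfig[name] = value
		}
	}
	return runtimeConfig
}

func main() {
	// What the runtime offers (sample values).
	capabilityArgs := map[string]interface{}{
		"portMappings": []map[string]interface{}{{"hostPort": 8080, "containerPort": 80, "protocol": "tcp"}},
		"bandwidth":    map[string]int{"ingressRate": 1000000},
		"dns":          map[string][]string{"servers": {"10.96.0.10"}},
	}
	// What a portmap-like plugin declares in its configuration.
	declared := map[string]bool{"portMappings": true}

	fmt.Println(filterCapabilityArgs(declared, capabilityArgs))
}
```

In this example the bandwidth and dns entries are dropped, because the plugin did not declare those capabilities.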