Effective Management Practices for Kubernetes Cluster Infrastructure
How We Manage Our Widely Varied Kubernetes Infrastructures at Alibaba

Agenda
- Background
- Alibaba Kubernetes Architecture
- Infrastructure Management
- CI/CD Pipelines
- Quick Demo

Background
- Who are we?
- Scale of Alibaba Kubernetes clusters (hundreds of internal clusters, 5k-10k nodes each)
- Variety of cluster infrastructures (200+ addons)
- Significance of keeping large-scale clusters stable

Architecture of Alibaba Kubernetes Infrastructure
[Architecture diagram. Labels: Tenant Cluster, Meta Cluster, Multi-tenant]
- Data Plane node stacks:
  - Kubelet, Pouch-Container, Pods; CNI: alinet; ultron-plugin
  - Alibaba ECS: Kubelet, containerd, Pods; CNI: terway; csi-plugin
  - Bare Metal: Kubelet, containerd, kata; CNI: alinet; CSI: ultron-plugin
- Control Plane: kube-apiserver, kube-controller-manager, kube-scheduler, etcd, alpha, Customized Scheduler
- Add-ons: Network Controller, Storage Controller, Kruise, Defender Operator, KubeNode Operator, Alert Operator, Monitoring Operator, Metrics Operator, Repair Operator, Customized Operator

Infrastructure Management - Master

    apiVersion: /v1alpha1
    kind: Cluster
    metadata:
      labels:
        cluster.id: c3f1b726caecf4d0ca076f73ee781e312
      name: kubernetes-cluster
      namespace: c3f1b726caecf4d0ca076f73ee781e312
    spec:
      kubernetes:
        kcm:
          commit: 0bfce06
          name: kubernetes.kdm.kcm
          replicas: 3
          version: v1.16.3-alibaba.2
        kore:
          name: kubernetes.kdm.korepanel
          replicas: 3
          version: v1.16.3-alibaba.2
        rols:
          name: kubernetes.kdm.roles
          version: v1.16.3-alibaba.2
        scheduler:
          commit: 0bfce06
          name: kubernetes.kdm.scheduler
          replicas: 3
          version: v1.16.3-alibaba.2

[Diagram. Labels: Kubernetes Version, Ops, CI, Cluster API, Cluster Spec, Kube-Apiserver, Kube-Controller-Manager, Kube-Scheduler]
Simplified logic of managing master versions:
- 1. Push the k8s version
- 2. Update the cluster spec
- 3. Upgrade the master version (Cluster API watches the cluster spec)
- Use Cluster API to manage master versions
- Operator Manager manages the infrastructure

Infrastructure Management - Addon
[Diagram. Labels: Ops, CI, Operator-Manager, Old Pod, Canary Pod, New Pod]
Simplified logic of managing addon versions:
- 1. Push the operator version
- 2. Call Operator Manager to trigger the canary (gray) release
- 3. Operate the canary pod: first create a canary pod, then update the operator rules and watch the canary pod status
- 4. Upgrade the version: call UpdateOperatorRule to empty the rules and delete the canary pod, then upgrade to the new version

Infrastructure Management - Dataplane
[Diagram. Labels: Ops, CI, Machine Operator, MachineComponentSet, kube-node-agent, RPM]
Simplified logic of managing data plane versions:
- 1. Push the rpm version
- 2. Create a machine component set
- Machine Operator watches the MachineComponentSet
- Call the kubenode agent to upgrade the rpm version
- Upgrade the rpm
- Use partition to control the batches of the gray release
- KubeNode: upgrade a dataplane component

"Philosophy"
- Components vary across clusters: how to manage components
- Always provide a stable component version: how to make stable releases
- Continuous and non-disruptive cluster delivery: how to build safe delivery pipelines

Component Management
- Image-oriented
  - Only patch the container image
  - Simple, but does not fit all cases
- YAML-oriented
  - Helm template
  - Separate image and meta-config
  - Designed for CI
- Helm + version control

Component Management
YAML (template):

    apiVersion: apps.kruise.io/v1alpha1
    kind: DaemonSet
    metadata:
      name: asi-proxy-ds-1
      namespace: kube-system
    spec:
      template:
        spec:
          containers:
          - image: {{ .image.nginx.repository }}:{{ .image.nginx.tag }}
            resource: {{ toYaml .resource | indent 8 }}
          tolerations: {{ toYaml .tolerations | indent 8 }}

Image and meta-config (values):

    image:
      nginx:
        repository: nginx
        tag: latest
    resource:
      requests:
        cpu: 1
        memory: 2Gi
      limit:
        cpu: 2
        memory: 4Gi
    tolerations:
    - operator: Exists

- Infrastructure Components = YAML = Template + Image + Meta-Config
- Template: constants that never change
- Image: expected to be the same across clusters
- Meta-Config: varies from cluster to cluster

Component Management
- Do what kubectl apply does
  - Compare with the current spec / cluster spec
  - PATCH the diff to the apiserver
- Filter out dangerous fields
- Three-way diff
- Trigger operators to reconcile
[Diagram. Labels: Template, Image, Meta; DB Spec, Target Spec, Cluster Spec, Resource Diff, New Cluster Spec; Recorded Image Version, New Meta Version; pods; example fields: replicas: 3, cpu: 1, mem: 2Gi]

Version Release & Testing
- Branch update -> Run e2e tests -> Release and delivery
[Diagram. Labels: Dev branch, New features & fixes, Deploy to e2e Cluster (control plane, addons), Release v1.0.0, Clusters]

Version Release & Testing
Kubernetes Conformance e2e
- Whitebox Testing
  - e2e-cluster-1: apiserver, kcm, scheduler, cni-service, extension-webhook, extension-controller, Pouch
  - e2e-cluster-2: apiserver, kcm, scheduler, kube-proxy, coredns, containerd
  - e2e-cluster-3: apiserver, kcm, scheduler, terway, cloud-controller-manager, containerd
- Blackbox Testing
  - Operators on guest-cluster-1: Canary Test Sets 1
  - Operators on guest-cluster-2: Canary Test Sets 2
  - Operators on guest-cluster-3: Canary Test Sets 3

e2e testing is not enough
- Canary tests run continuously
- Create/delete pod/sts/deploy
- Upgrade sts/deploy
- Scale up/down sts/deploy
- Create Job
- Create CustomResource

Intra-cluster upgrade
- Rolling updates for Kubernetes workloads
  - Deployment (Kruise)
  - StatefulSet (Kruise)
  - DaemonSet (Kruise)
  - Dataplane components (KubeNode)
- Rollout Policy
- Pause/Resume
- Max unavailable

                     Deployment                    StatefulSet                   DaemonSet       Dataplane Components
    Rollout Policy   RollingUpdate, Canary Deploy  RollingUpdate, Canary Deploy  RollingUpdate   RollingUpdate
    Pause/Resume     Yes                           Yes                           Yes             Yes
    Max unavailable  Not yet                       Yes                           Yes             Yes
    Partition        No                            No                            Yes             Yes

/openkruise/kruise

Rollout for operators
- Enhance the ability of Operators (StatefulSet / Deployment)
- Implement operators the way kubebuilder does
- Sidecar container that contains the clientset, informer and plugins
- Serve the operator with gRPC requests

Rollout for operators
- Canary deploy for Operators
- Flow control on a monolithic manager
- Flow slices controlled by rules (Custom Resource)
- Rolling update

Rollout for DaemonSet
- Original DaemonSet
  - Lacks the ability to do a rolling update
  - Always updates all pods once the image changes
  - OnDelete?
[Diagram. Replicas: 5, Updated Replicas: 0 -> Replicas: 5, Updated Replicas: 5]

Rollout for DaemonSet
- Kruise: enhance the ability of DaemonSets
- Partition: the number of pods that remain on the old version
- MaxUnavailable: the maximum number of pods that can be unavailable during a rolling update
[Diagram. Replicas: 5, Updated Replicas: 0, Partition: 5 -> Replicas: 5, Updated Replicas: 5, Partition: 4 -> Replicas: 5, Updated Replicas: 5, Partition: 2]

Rollout for Dataplane
- Kubelet / Pouch / containerd
- Similar to Kruise DaemonSet in partition control
- NodeSet: a group of nodes with the same characteristics; the minimum rollout unit
- Rolling update within each NodeSet
- Upgrade NodeSets sequentially
[Diagram. Cluster - 1: NS - 1, NS - 2, NS - 3; Cluster - 2: NS - 4, NS - 5, NS - 6; Bake time]

Inter-cluster upgrades
- Inter-cluster rollout pipelines
- Orchestrate clusters by scale / importance of the upper-layer business apps
- Build a gray release pipeline
- Tekton-like implementation
- Testing Cluster -> Canary Cluster -> Smallflow Cluster -> Production Cluster
- Contains a small percentage of service invocations

Inter-cluster upgrades
- Inter-cluster rollout pipelines
- Silent period between clusters
- Pre-checking and post-checking
  - Time window checker / rule-based blocker
  - Metric monitoring / health checks
- Cluster - 1 -> Cluster - 2 -> Cluster - 3 -> Cluster - 4

Pipelines inter-cluster
Put all things together, here comes our pipeline journey!
Source Code -> Unit Test -> Build Image -> e2e Cluster -> Ca...
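
The inter-cluster rollout described above (ordered clusters, pre-checking and post-checking, a silent period between clusters) can be illustrated with a minimal sketch. It is not the implementation from the talk: `Stage`, `run_pipeline`, the check signature, and the silent-period values are all hypothetical names and numbers chosen for illustration, and the real checks (time window checker, rule-based blocker, metric monitoring) are reduced to plain callables.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical check signature: returns True when the rollout may proceed.
Check = Callable[[str], bool]


@dataclass
class Stage:
    cluster: str
    silent_period_s: float  # wait after this cluster before the next one


def run_pipeline(
    stages: List[Stage],
    upgrade: Callable[[str], None],
    pre_checks: List[Check],
    post_checks: List[Check],
    sleep: Callable[[float], None],
) -> List[str]:
    """Upgrade clusters in order and stop at the first failed check.

    Returns the clusters that were upgraded and passed their post-checks.
    """
    done: List[str] = []
    for i, stage in enumerate(stages):
        # Pre-checking: e.g. a time-window or rule-based blocker.
        if not all(check(stage.cluster) for check in pre_checks):
            break  # blocked before touching the cluster
        upgrade(stage.cluster)
        # Post-checking: e.g. metric monitoring or health checks.
        if not all(check(stage.cluster) for check in post_checks):
            break  # halt so that later clusters stay untouched
        done.append(stage.cluster)
        if i < len(stages) - 1:
            sleep(stage.silent_period_s)  # silent period between clusters
    return done


# Stage order follows the slide; the silent periods are arbitrary examples.
STAGES = [
    Stage("testing", 60),
    Stage("canary", 600),
    Stage("smallflow", 3600),
    Stage("production", 0),
]
```

Passing `sleep` as a parameter (for example `time.sleep` in a real run, a no-op in tests) keeps the sketch testable without waiting. Stopping at the first failed post-check is one possible policy; it reflects the idea that a problem found in an early, less critical cluster should never reach production.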