Jump to content

Kubernetes

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by LordOfPens (talk | contribs) at 21:39, 16 May 2019. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Kubernetes
Original author(s)Google
Developer(s)Cloud Native Computing Foundation
Initial release7 June 2014; 10 years ago (2014-06-07)[1]
Stable release
1.14[2] / March 25, 2019; 5 years ago (2019-03-25)
Repository
Written inGo
TypeCluster management software
LicenseApache License 2.0
Websitekubernetes.io

Kubernetes (commonly stylized as k8s[3]) is an open-source container-orchestration system for automating application deployment, scaling, and management.[4] It was originally designed by Google, and is now maintained by the Cloud Native Computing Foundation. It aims to provide a "platform for automating deployment, scaling, and operations of application containers across clusters of hosts".[3] It works with a range of container tools, including Docker.[5] Many cloud services offer a Kubernetes-based platform or infrastructure as a service (PaaS or IaaS) on which Kubernetes can be deployed as a platform-providing service. Many vendors also provide their own branded Kubernetes distributions.

History

Google Container Engine talk at Google Cloud Summit

Kubernetes (κυβερνήτης, Greek for "governor", "helmsman" or "captain")[3] was founded by Joe Beda, Brendan Burns and Craig McLuckie,[6] who were quickly joined by other Google engineers including Brian Grant and Tim Hockin, and was first announced by Google in mid-2014.[7] Its development and design are heavily influenced by Google's Borg system,[8][9] and many of the top contributors to the project previously worked on Borg. The original codename for Kubernetes within Google was Project Seven of Nine, a reference to a Star Trek character that is a "friendlier" Borg.[10] The seven spokes on the wheel of the Kubernetes logo are a reference to that codename. The original Borg project was written entirely in C++[8], but the rewritten Kubernetes system is implemented in Go.

Kubernetes v1.0 was released on July 21, 2015.[11] Along with the Kubernetes v1.0 release, Google partnered with the Linux Foundation to form the Cloud Native Computing Foundation (CNCF)[12] and offered Kubernetes as a seed technology. On March 6, 2018, Kubernetes Project reached ninth place in commits at GitHub, and second place in authors and issues to the Linux kernel.[13]

Kubernetes Objects

Kubernetes defines a set of building blocks ("primitives"), which collectively provide mechanisms that deploy, maintain, and scale applications based on CPU, memory[14] or custom metrics.[15] Kubernetes is loosely coupled and extensible to meet different workloads. This extensibility is provided in large part by the Kubernetes API, which is used by internal components as well as extensions and containers that run on Kubernetes.[16]. The platform exerts its control over compute and storage resources by defining resources as Objects, which can then be managed as such. The key objects are:

Pods

The basic scheduling unit in Kubernetes is a pod.[17] It adds a higher level of abstraction by grouping containerized components. A pod consists of one or more containers that are guaranteed to be co-located on the host machine and can share resources.[16]

Each pod in Kubernetes is assigned a unique Pod IP address within the cluster, which allows applications to use ports without the risk of conflict.[18] Within the pod, all containers can reference each other on localhost, but a container within one pod has no way of directly addressing another container within another pod; for that, it has to use the Pod IP Address. An application developer should never use the Pod IP Address though, to reference / invoke a capability in another pod, as Pod IP addresses are ephemeral - the specific pod that they are referencing may be assigned to another Pod IP address on restart. Instead, they should use a reference to a Service, which holds a reference to the target pod at the specific Pod IP Address.

A pod can define a volume, such as a local disk directory or a network disk, and expose it to the containers in the pod.[19] Pods can be managed manually through the Kubernetes API, or their management can be delegated to a controller.[16] Such volumes are also the basis for the Kubernetes features of ConfigMaps (to provide access to configuration through the filesystem visible to the container) and Secrets (to provide access to credentials needed to access remote resources securely, by providing those credentials on the filesystem visible only to authorized containers).

Services

Simplified view showing how Services interact with Pod networking in a Kubernetes cluster

A Kubernetes service is a set of pods that work together, such as one tier of a multi-tier application. The set of pods that constitute a service are defined by a label selector.[16] Kubernetes provides two modes of service discovery, using environmental variables or using Kubernetes DNS.[20] Service discovery assigns a stable IP address and DNS name to the service, and load balances traffic in a round-robin manner to network connections of that IP address among the pods matching the selector (even as failures cause the pods to move from machine to machine).[18] By default a service is exposed inside a cluster (e.g. back end pods might be grouped into a service, with requests from the front-end pods load-balanced among them), but a service can also be exposed outside a cluster (e.g. for clients to reach frontend pods).[21]

Volumes

Filesystems in the Kubernetes container provide ephemeral storage, by default. This means that a restart of the container will wipe out any data on such containers, and therefore, this form of storage is quite limiting in anything but trivial applications. A Kubernetes Volume provides persistent storage that exists for the lifetime of the pod itself. This storage can also be used as shared disk space for containers within the pod. Volumes are mounted at specific mount points within the container, which are defined by the pod configuration, and cannot mount onto other volumes or link to other volumes. The same volume can be mounted at different points in the filesystem tree by different containers.

Namespaces

Kubernetes provides a partitioning of the resources it manages into non-overlapping sets called namespaces. They are intended for use in environments with many users spread across multiple teams, or projects, or even separating environments like development, test, and production.

Managing Kubernetes Objects

Kubernetes provides some mechanisms that allow one to manage, select, or manipulate its objects.

Labels and selectors

Kubernetes enables clients (users or internal components) to attach key-value pairs called "labels" to any API object in the system, such as pods and nodes. Correspondingly, "label selectors" are queries against labels that resolve to matching objects.[16] When a service is defined, one can define the label selectors that will be used by the service router / load balancer to select the pod instances that the traffic will be routed to. Thus, simply changing the labels of the pods or changing the label selectors on the service, can be used to control which pods get traffic and which don't, which can be used to support various deployment patterns like blue-green deployments or A-B testing. This capability to dynamically control how services utilize implementing resources provides a loose coupling within the infrastructure.

For example, if an application's pods have labels for a system tier (with values such as front-end, back-end, for example) and a release_track (with values such as canary, production, for example), then an operation on all of back-end and canary nodes can use a label selector, such as:[22]

tier=back-end AND release_track=canary

Field Selectors

Just like labels, Field selectors also lets one select Kubernetes resources. Unlike labels, the selection is based on the attribute values inherent to the resource being selected, rather than user defined categorization. metadata.name and metadata.namespace are fields selectors that will be present on all Kubernetes objects. Other selectors that can be used depend on the object / resource type.

Architecture

Kubernetes architecture diagram

Kubernetes follows the primary/replica architecture. The components of Kubernetes can be divided into those that manage an individual node and those that are part of the control plane.[16][23]

Kubernetes control plane (primary)

The Kubernetes primary is the main controlling unit of the cluster, managing its workload and directing communication across the system. The Kubernetes control plane consists of various components, each its own process, that can run both on a single primary node or on multiple primaries supporting high-availability clusters.[23] The various components of Kubernetes control plane are as follows:

  • etcd: etcd[24] is a persistent, lightweight, distributed, key-value data store developed by CoreOS that reliably stores the configuration data of the cluster, representing the overall state of the cluster at any given point of time. Just like Apache ZooKeeper, etcd is a system that favors Consistency over Availability in the event of a network partition (see CAP theorem). This consistency is crucial for correctly scheduling and operating services. The Kubernetes API Server uses etcd's watch API to monitor the cluster and roll out critical configuration changes or simply restore any divergences of the state of the cluster, back to what was declared by the deployer. As an example, if the deployer specified that three instances of a particular pod need to be running, this fact is stored in etcd. If it is found that only two instances are running, this delta will be detected by comparison with etcd data, and Kubernetes will use this to schedule the creation of an additional instance of that pod.[23]
  • API server: The API server is a key component and serves the Kubernetes API using JSON over HTTP, which provides both the internal and external interface to Kubernetes.[16][25] The API server processes and validates REST requests and updates state of the API objects in etcd, thereby allowing clients to configure workloads and containers across Worker nodes.[26]
  • Scheduler: The scheduler is the pluggable component that selects which node an unscheduled pod (the basic entity managed by the scheduler) runs on, based on resource availability. Scheduler tracks resource use on each node to ensure that workload is not scheduled in excess of available resources. For this purpose, the scheduler must know the resource requirements, resource availability, and other user-provided constraints and policy directives such as quality-of-service, affinity/anti-affinity requirements, data locality, and so on. In essence, the scheduler's role is to match resource "supply" to workload "demand".[27]
  • Controller manager: A controller is a reconciliation loop that drives actual cluster state toward the desired cluster state.[28] It does this by managing a set of controllers. One kind of controller is a replication controller, which handles replication and scaling by running a specified number of copies of a pod across the cluster. It also handles creating replacement pods if the underlying node fails.[28] Other controllers that are part of the core Kubernetes system include a "DaemonSet Controller" for running exactly one pod on every machine (or some subset of machines), and a "Job Controller" for running pods that run to completion, e.g. as part of a batch job.[29] The set of pods that a controller manages is determined by label selectors that are part of the controller's definition.[22]
The controller manager is a process that runs core Kubernetes controllers like DaemonSet Controller and Replication Controller. The controllers communicate with the API server to create, update, and delete the resources they manage (pods, service endpoints, etc.).[25]

Kubernetes node

The Node, also known as Worker or Minion, is a machine where containers (workloads) are deployed. Every node in the cluster must run a container runtime such as Docker, as well as the below-mentioned components, for communication with primary for network configuration of these containers.

  • Kubelet: Kubelet is responsible for the running state of each node, ensuring that all containers on the node are healthy. It takes care of starting, stopping, and maintaining application containers organized into pods as directed by the control plane.[16][30]
Kubelet monitors the state of a pod, and if not in the desired state, the pod re-deploys to the same node. Node status is relayed every few seconds via heartbeat messages to the primary. Once the primary detects a node failure, the Replication Controller observes this state change and launches pods on other healthy nodes.[citation needed]
  • Kube-proxy: The Kube-proxy is an implementation of a network proxy and a load balancer, and it supports the service abstraction along with other networking operation.[16] It is responsible for routing traffic to the appropriate container based on IP and port number of the incoming request.
  • Container runtime: A container resides inside a pod. The container is the lowest level of a micro-service that holds the running application, libraries, and their dependencies. Containers can be exposed to the world through an external IP address. Kubernetes supports Docker containers since its first version, and in July 2016 rkt container engine was added.[31]

Add-ons

Add-ons operate just like any other application running within the cluster: they are implemented via pods and services, and are only different in that they implement features of the Kubernetes cluster. The pods may be managed by Deployments, ReplicationControllers, and so on. There are many add-ons, and the list is growing. Some of the more important are:

  • DNS: All Kubernetes clusters should have cluster DNS; it is a mandatory feature. Cluster DNS is a DNS server, in addition to the other DNS server(s) in your environment, which serves DNS records for Kubernetes services. Containers started by Kubernetes automatically include this DNS server in their DNS searches.
  • Web UI: This is a general purpose, web-based UI for Kubernetes clusters. It allows users to manage and troubleshoot applications running in the cluster, as well as the cluster itself.
  • Container Resource Monitoring: Providing a reliable application runtime, and being able to scale it up or down in response to workloads, means being able to continuously and effectively monitor workload performance. Container Resource Monitoring provides this capability by recording metrics about containers in a central database, and provides a UI for browsing that data. The cAdvisor is a component on a slave node that provides a limited metric monitoring capability. There are full metrics pipelines as well, such as Prometheus, which can meet most monitoring needs.
  • Cluster-level logging: Logs should have a separate storage and lifecycle independent of nodes, pods, or containers. Otherwise, node or pod failures can cause loss of event data. The ability to do this is called cluster-level logging, and such mechanisms are responsible for saving container logs to a central log store with search/browsing interface. Kubernetes provides no native storage solution for log data, but one can integrate many existing logging solutions into the Kubernetes cluster.

Microservices

Kubernetes is commonly used as a way to host a microservice based implementation, because it and its associated ecosystem of tools provide all the capabilities needed to address key concerns of any microservice architecture.

See also

References

  1. ^ "First GitHub commit for Kubernetes". github.com. 2014-06-07. Archived from the original on 2017-03-01. {{cite web}}: Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
  2. ^ "GitHub Releases page". github.com. 2019-03-25. {{cite web}}: Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
  3. ^ a b c "What is Kubernetes?". Kubernetes. Retrieved 2017-03-31.
  4. ^ "kubernetes/kubernetes". GitHub. Archived from the original on 2017-04-21. Retrieved 2017-03-28. {{cite web}}: Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
  5. ^ https://kubernetes.io/blog/2018/10/10/kubernetes-v1.12-introducing-runtimeclass/
  6. ^ "Google Made Its Secret Blueprint Public to Boost Its Cloud". Archived from the original on 2016-07-01. Retrieved 2016-06-27. {{cite web}}: Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
  7. ^ "Google Open Sources Its Secret Weapon in Cloud Computing". Wired. Archived from the original on 10 September 2015. Retrieved 24 September 2015. {{cite web}}: Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
  8. ^ a b Abhishek Verma; Luis Pedrosa; Madhukar R. Korupolu; David Oppenheimer; Eric Tune; John Wilkes (April 21–24, 2015). "Large-scale cluster management at Google with Borg". Proceedings of the European Conference on Computer Systems (EuroSys). Archived from the original on 2017-07-27. {{cite journal}}: Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
  9. ^ "Borg, Omega, and Kubernetes - ACM Queue". queue.acm.org. Archived from the original on 2016-07-09. Retrieved 2016-06-27. {{cite web}}: Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
  10. ^ "Early Stage Startup Heptio Aims to Make Kubernetes Friendly". Retrieved 2016-12-06.
  11. ^ "As Kubernetes Hits 1.0, Google Donates Technology To Newly Formed Cloud Native Computing Foundation". TechCrunch. Archived from the original on 23 September 2015. Retrieved 24 September 2015. {{cite web}}: Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
  12. ^ "Cloud Native Computing Foundation". Archived from the original on 2017-07-03. {{cite web}}: Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
  13. ^ Conway, Sarah. "Kubernetes Is First CNCF Project To Graduate". Cloud Native Computing Foundation. Archived from the original (html) on 29 October 2018. Retrieved 3 December 2018. Compared to the 1.5 million projects on GitHub, Kubernetes is No. 9 for commits and No. 2 for authors/issues, second only to Linux. {{cite web}}: Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
  14. ^ Sharma, Priyanka (13 April 2017). "Autoscaling based on CPU/Memory in Kubernetes—Part II". Powerupcloud Tech Blog. Medium. Retrieved 27 December 2018.
  15. ^ "Configure Kubernetes Autoscaling With Custom Metrics". Bitnami. BitRock. 15 November 2018. Retrieved 27 December 2018.
  16. ^ a b c d e f g h i "An Introduction to Kubernetes". DigitalOcean. Archived from the original on 1 October 2015. Retrieved 24 September 2015. {{cite web}}: Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
  17. ^ https://kubernetes.io/docs/concepts/workloads/pods/pod/
  18. ^ a b Langemak, Jon (2015-02-11). "Kubernetes 101 – Networking". Das Blinken Lichten. Archived from the original on 2015-10-25. Retrieved 2015-11-02. {{cite web}}: Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
  19. ^ Strachan, James (2015-05-21). "Kubernetes for Developers". Medium (publishing platform). Archived from the original on 2015-09-07. Retrieved 2015-11-02. {{cite web}}: Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
  20. ^ https://kubernetes.io/docs/concepts/services-networking/service/#discovering-services
  21. ^ Langemak, Jon (2015-02-15). "Kubernetes 101 – External Access Into The Cluster". Das Blinken Lichten. Archived from the original on 2015-10-26. Retrieved 2015-11-02. {{cite web}}: Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
  22. ^ a b "Intro: Docker and Kubernetes training - Day 2". Red Hat. 2015-10-20. Archived from the original on 2015-10-29. Retrieved 2015-11-02. {{cite web}}: Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
  23. ^ a b c "Kubernetes Infrastructure". OpenShift Community Documentation. OpenShift. Archived from the original on 6 July 2015. Retrieved 24 September 2015. {{cite web}}: Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
  24. ^ Container Linux by CoreOS: Cluster infrastructure
  25. ^ a b Marhubi, Kamal (2015-09-26). "Kubernetes from the ground up: API server". kamalmarhubi.com. Archived from the original on 2015-10-29. Retrieved 2015-11-02. {{cite web}}: Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
  26. ^ Ellingwood, Justin (2 May 2018). "An Introduction to Kubernetes". DigitalOcean. Archived from the original on 5 July 2018. Retrieved 20 July 2018. One of the most important primary services is an API server. This is the main management point of the entire cluster as it allows a user to configure Kubernetes' workloads and organizational units. It is also responsible for making sure that the etcd store and the service details of deployed containers are in agreement. It acts as the bridge between various components to maintain cluster health and disseminate information and commands.
  27. ^ "The Three Pillars of Kubernetes Container Orchestration - Rancher Labs". rancher.com. 18 May 2017. Archived from the original on 24 June 2017. Retrieved 22 May 2017. {{cite web}}: Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
  28. ^ a b "Overview of a Replication Controller". Documentation. CoreOS. Archived from the original on 2015-09-22. Retrieved 2015-11-02. {{cite web}}: Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
  29. ^ Sanders, Jake (2015-10-02). "Kubernetes: Exciting Experimental Features". Livewyer. Archived from the original on 2015-10-20. Retrieved 2015-11-02. {{cite web}}: Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
  30. ^ Marhubi, Kamal (2015-08-27). "What [..] is a Kubelet?". kamalmarhubi.com. Archived from the original on 2015-11-13. Retrieved 2015-11-02. {{cite web}}: Unknown parameter |deadurl= ignored (|url-status= suggested) (help)
  31. ^ https://kubernetes.io/blog/2016/07/rktnetes-brings-rkt-container-engine-to-kubernetes/

External links