11/23/2019
Kubernetes Production Readiness and Best Practices Checklist
Availability
- Configured liveness and readiness probes?
- A liveness probe is the Kubernetes equivalent of “have you tried turning it off and on again”. Liveness probes detect containers that cannot recover from a failed state and restart them. They are a great tool for building automatic recovery into production Kubernetes deployments. You can create liveness probes based on exec (command), HTTP or TCP checks.
- Readiness probes detect whether a container is temporarily unable to receive traffic and will mitigate these situations by stopping traffic flow to it. Readiness probes will also detect whether new pods are ready to receive traffic, before allowing traffic flow, during deployment updates.
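As a minimal sketch of the two probe types described above, the manifest below attaches an HTTP liveness probe and a TCP readiness probe to a single container. The pod name, image, port and timing values are assumptions chosen for illustration and should be tuned to your application.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo              # hypothetical name
spec:
  containers:
    - name: web
      image: nginx:1.25         # assumed image; any HTTP server works
      ports:
        - containerPort: 80
      livenessProbe:            # container is restarted if this keeps failing
        httpGet:
          path: /
          port: 80
        initialDelaySeconds: 10
        periodSeconds: 10
      readinessProbe:           # pod receives Service traffic only while this passes
        tcpSocket:
          port: 80
        initialDelaySeconds: 5
        periodSeconds: 5
```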
- Distributed worker nodes across zones?
- Worker nodes should be distributed across availability zones so that the loss of a single zone does not take down all of your workloads.
- Configured Autoscaling for worker nodes?
- When using the cloud, the best practice is to place worker nodes in autoscaling groups. Autoscaling groups will automatically bring up a node in the event of termination.
- Configured the correct number of pod replicas for high availability?
- To ensure highly available Kubernetes workloads, pods should also be replicated using Kubernetes controllers like ReplicaSets, Deployments and StatefulSets.
- Both Deployments and StatefulSets are central to the concept of high availability and will ensure that the desired number of pods is always maintained. The number of replicas is usually dictated by application requirements.
- Kubernetes recommends using Deployments over ReplicaSets for pod replication since they are declarative and allow you to roll back to previous versions easily. However, if your use case requires custom update orchestration or does not require updates at all, you can still use ReplicaSets.
- Not spinning up any naked pods?
- Are all your pods part of a ReplicaSet or Deployment? Naked pods are not rescheduled in case of node failure or shutdown. Therefore, it is best practice to always spin up pods as part of a ReplicaSet or Deployment.
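A minimal sketch of a replicated workload, assuming a stateless web application: the Deployment below keeps three pods running and recreates them elsewhere if a node fails. The name, label and image are placeholders.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                     # hypothetical name
spec:
  replicas: 3                   # desired pod count; set per application requirements
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web                # must match the selector above
    spec:
      containers:
        - name: web
          image: nginx:1.25     # assumed image
```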
- Setup Ingress?
- Ingress allows HTTP and HTTPS traffic from the outside internet to services inside the cluster. Ingress can also be used for load balancing, terminating SSL and to give services externally reachable URLs.
- For Ingress to work, your cluster needs an ingress controller. As of this writing, the Kubernetes project officially supports the GCE and NGINX controllers.
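The following is a sketch of an Ingress that terminates TLS and routes a hostname to a Service. It assumes a cluster recent enough to serve the `networking.k8s.io/v1` API (older clusters used `networking.k8s.io/v1beta1` with a different backend schema); the host, Secret and Service names are hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress             # hypothetical name
spec:
  tls:
    - hosts:
        - example.com
      secretName: example-tls   # assumed TLS Secret holding the certificate and key
  rules:
    - host: example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web       # assumed Service in the same namespace
                port:
                  number: 80
```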
Resource Management
- Configured resource requests and limits for containers?
- Resource requests and limits help you manage resource consumption by individual containers. A request is the amount of a resource that the scheduler reserves for a container when placing its pod on a node; the container may use more if the node has spare capacity. A limit is the maximum amount of a resource that the container can consume.
- Resource requests and limits can be set for CPU, memory and ephemeral storage resources. Setting resource requests and limits is a Kubernetes best practice and will help avoid containers getting throttled due to lack of resources or going berserk and hogging resources.
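For illustration, a container spec with requests and limits for all three resource types might look like the sketch below. The values are arbitrary assumptions, not recommendations.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resources-demo          # hypothetical name
spec:
  containers:
    - name: app
      image: nginx:1.25         # assumed image
      resources:
        requests:               # what the scheduler reserves on the node
          cpu: 250m
          memory: 128Mi
          ephemeral-storage: 1Gi
        limits:                 # hard ceiling for the container
          cpu: 500m
          memory: 256Mi
          ephemeral-storage: 2Gi
```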
- Created separate namespaces for your Teams/Products?
- Kubernetes namespaces are virtual partitions of your Kubernetes clusters. It is recommended best practice to create separate namespaces for individual teams, projects or customers. Examples include Dev, production, frontend etc. You can also create separate namespaces based on custom application or organizational requirements.
- Configured default resource requests and limits for namespaces?
- Default requests and limits specify the default values for memory and CPU resources for all containers inside a namespace. In situations where resource request and limit values are not specifically defined for a container created inside a namespace with default values, that container will automatically inherit the default values. Configuring default values on a namespace level is a best practice to ensure that all containers created inside that namespace get assigned both request and limit values.
- Configured limit ranges for namespaces?
- Limit ranges also work on the namespace level and allow us to specify the minimum and maximum CPU and memory resources that can be consumed by individual containers inside a namespace.
- Whenever a container is created inside a namespace with limit ranges, its resource requests must be equal to or higher than the minimum value defined in the limit range. Its CPU and memory limits must also be equal to or lower than the maximum value defined in the limit range.
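Both namespace defaults and minimum/maximum bounds are expressed with a LimitRange object. The sketch below assumes a namespace called `team-a` and uses illustrative values.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: container-limits        # hypothetical name
  namespace: team-a             # assumed namespace
spec:
  limits:
    - type: Container
      defaultRequest:           # applied when a container sets no request
        cpu: 250m
        memory: 128Mi
      default:                  # applied when a container sets no limit
        cpu: 500m
        memory: 256Mi
      min:                      # requests below this are rejected
        cpu: 100m
        memory: 64Mi
      max:                      # limits above this are rejected
        cpu: "1"
        memory: 512Mi
```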
- Specified Resource Quotas for namespaces?
- Resource quotas also work on the namespace level and provide another layer of control over cluster resource usage. Resource Quotas limit the total amount of CPU, memory and storage resources that can be consumed by all containers running in a namespace.
- Consumption of storage resources by persistent volume claims can also be limited based on individual storage class. Kubernetes administrators can define storage classes based on the quality of service levels or backup policies.
- Configured pod and API Quotas for namespaces?
- Pod quotas allow you to restrict the total number of pods that can run inside a namespace. API quotas let you set limits for other API objects like PersistentVolumeClaims, Services and ReplicaSets.
- Pod and API quotas are a good way to manage resource usage on a namespace level.
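Compute, storage, pod and object-count quotas can all live in one ResourceQuota. The following sketch assumes a namespace `team-a` and a storage class named `gold`; every number is a placeholder.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota            # hypothetical name
  namespace: team-a             # assumed namespace
spec:
  hard:
    requests.cpu: "4"           # total CPU requests across the namespace
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    requests.storage: 100Gi     # total storage requested by all PVCs
    gold.storageclass.storage.k8s.io/requests.storage: 50Gi  # cap for the assumed "gold" class
    pods: "20"                  # pod quota
    persistentvolumeclaims: "10"
    services: "10"
    count/replicasets.apps: "20"
```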
- Attached labels to Kubernetes objects?
- Labels allow Kubernetes objects to be queried and operated upon in bulk. They can also be used to identify and organize Kubernetes objects into groups. As such, defining labels should figure right at the top of any Kubernetes best practices list. Kubernetes documents a set of recommended labels, prefixed with app.kubernetes.io/, that should be defined for every deployment.
- Limited the number of pods that can run on a node?
- This will help avoid scenarios where rogue or misconfigured jobs create pods in such large numbers as to overwhelm system pods.
- Reserved compute resources for system daemons?
- Another best practice is to reserve resources for the system daemons that both the OS and Kubernetes itself need to run. All three resource types (CPU, memory and ephemeral storage) can be reserved for system daemons. Once reserved, these resources are deducted from node capacity, and the remainder is exposed as the node's allocatable resources. Below are the kubelet flags that can be used to reserve resources for system daemons:
- --kube-reserved: allows you to reserve resources for Kubernetes system daemons like the kubelet, container runtime and node problem detector.
- --system-reserved: allows you to reserve resources for OS system daemons like sshd and udev.
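The same settings can also be expressed in a kubelet configuration file instead of command-line flags. The sketch below caps the number of pods per node and reserves resources for Kubernetes and OS daemons; all values are illustrative assumptions and depend on node size.

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
maxPods: 110                    # upper bound on pods scheduled to this node
kubeReserved:                   # equivalent of --kube-reserved
  cpu: 500m
  memory: 1Gi
  ephemeral-storage: 1Gi
systemReserved:                 # equivalent of --system-reserved
  cpu: 500m
  memory: 1Gi
  ephemeral-storage: 1Gi
```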
- Configured out of resource handling?
- Make sure you configure out of resource handling to prevent unused images and dead pods and containers taking up too much unnecessary space on the node.
- Out of resource handling specifies Kubelet behaviour when the node starts to run low on resources. In such cases, the Kubelet will first try to reclaim resources by deleting dead pods (and their containers) and unused images. If it cannot reclaim sufficient resources, it will then start evicting pods.
- You can influence when the Kubelet kicks into action by configuring eviction thresholds for eviction signals.
- Thresholds can be configured for the nodefs.available, nodefs.inodesFree, imagefs.available and imagefs.inodesFree eviction signals through kubelet flags or the kubelet configuration file.
For example:
    nodefs.available<10%
    nodefs.inodesFree<5%
    imagefs.available<15%
    imagefs.inodesFree<20%
- Doing this will ensure that unused images and dead containers and pods do not take up unnecessary disk space.
- You should also consider specifying a threshold for the memory.available signal. This will ensure that the Kubelet kicks into action when free memory on the node falls below your desired level.
- Another best practice is to pass the --eviction-minimum-reclaim flag to the Kubelet. Without it, the Kubelet may reclaim only a small amount of resources and then hover just around the eviction threshold, triggering evictions repeatedly. With it, once an eviction threshold is triggered, the Kubelet keeps reclaiming until the configured minimum amount has been freed.
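Eviction thresholds and minimum-reclaim amounts can be set together in the kubelet configuration file. This is a sketch under assumed values; the memory and reclaim figures in particular are hypothetical and should be sized to your nodes.

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:                   # thresholds that trigger eviction
  memory.available: "500Mi"     # assumed value
  nodefs.available: "10%"
  nodefs.inodesFree: "5%"
  imagefs.available: "15%"
  imagefs.inodesFree: "20%"
evictionMinimumReclaim:         # reclaim at least this much beyond the threshold
  memory.available: "0Mi"
  nodefs.available: "500Mi"     # assumed value
  imagefs.available: "2Gi"      # assumed value
```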
- Using recommended settings for Persistent Volumes?
- Always include Persistent Volume Claims in the config
- Never include PVs in the config
- Always create a default storage class
- Give the user the option of providing a storage class name
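In practice these recommendations mean shipping a PersistentVolumeClaim and leaving the storage class optional. A minimal sketch, with a hypothetical claim name and size:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data                    # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  # storageClassName: fast      # optional; omit to fall back to the cluster's default class
  resources:
    requests:
      storage: 10Gi             # assumed size
```

Leaving `storageClassName` unset lets the default storage class provision the volume, while users who need a specific class can uncomment the field and supply its name.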
- Enabled log rotation?
- If you have node-level logging enabled, make sure you also enable log rotation to avoid logs consuming all available storage on the node. Enable the logrotate tool for clusters deployed by the kube-up.sh script on GCP or Docker’s log-opt for all other environments.
- Prevented Kubelet from setting or modifying label keys?
- If you are using labels and label selectors to target pods to specific nodes for security or regulatory purposes, make sure you also choose label keys that cannot be modified by the Kubelet. This will ensure that compromised nodes cannot use their Kubelet credentials to label their node object and schedule pods.
- Make sure you use the Node authorizer, enable the NodeRestriction admission plugin, always prefix node labels with node-restriction.kubernetes.io/ and use the same prefix in your label selectors.
Security
- Using the latest Kubernetes version?
- Kubernetes regularly pushes out new versions with critical bug fixes and new security features. Make sure you have upgraded to the latest version to take advantage of these features.
- Enabled RBAC (Role-Based Access Control)?
- Kubernetes RBAC allows you to regulate access to your Kubernetes environment. Using RBAC, Kubernetes admins can define policies authorizing user access as well as the extent of this access to resources. Make sure you use the four high-level RBAC API objects (Role, ClusterRole, RoleBinding and ClusterRoleBinding) to secure your Kubernetes environment.
- Following user access best practices?
- Make sure you limit the scope of user permissions by chopping up your Kubernetes environment into separate namespaces. This will isolate resources and help contain the damage in case of security misconfigurations or malicious activity. You can do this by using Roles (which grant access to resources within a single namespace) as opposed to ClusterRoles (which grant access to resources cluster-wide).
- Another best practice is to limit user permissions based on the resources they need access to. For example, you can limit a Role to a specific namespace and a set of resources e.g. pods.
- Avoid giving admin access as much as possible.
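As a sketch of namespace-scoped, resource-scoped access, the Role below grants read-only access to pods in one namespace and the RoleBinding assigns it to a single user. The namespace `team-a` and the user `jane` are hypothetical.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: team-a             # assumed namespace
  name: pod-reader
rules:
  - apiGroups: [""]             # "" is the core API group
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: team-a
  name: read-pods
subjects:
  - kind: User
    name: jane                  # hypothetical user
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```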
- Enabled audit logging?
- Kubernetes audits are performed by the kube-apiserver and allow you to record the sequence of activities that users or other system components perform. You should also monitor audit logs so that you can react quickly to malicious activity or authorization failures.
- You can define rules about which events to record and what data to log in an audit policy. Here is a minimal audit policy which will log all metadata related to requests:
    # Log all requests at the Metadata level.
    apiVersion: audit.k8s.io/v1
    kind: Policy
    rules:
    - level: Metadata
- You can also log at the Request level, which records the request body in addition to metadata, or at the RequestResponse level, which records the response body as well.
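An audit policy can mix levels per resource, with the first matching rule winning. The sketch below is one possible arrangement, not a recommended baseline: the choice of which resources get which level is an assumption.

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Keep Secret and ConfigMap payloads out of the log: metadata only.
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets", "configmaps"]
  # Record full request and response bodies for pod changes.
  - level: RequestResponse
    resources:
      - group: ""
        resources: ["pods"]
  # Everything else: metadata plus request body.
  - level: Request
```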
- Enabled AlwaysPullImages in admission controller?
- AlwaysPullImages is an admission controller which ensures that images are always pulled with the correct authorization and cannot be re-used by other pods without first providing credentials. This is very useful in a multi-tenant environment and ensures that images can only be used by users with the correct credentials.
- Besides these, you should also enable other recommended Kubernetes admission controllers including:
- NamespaceLifecycle
- LimitRanger
- ServiceAccount
- DefaultStorageClass
- DefaultTolerationSeconds
- MutatingAdmissionWebhook
- ValidatingAdmissionWebhook
- Priority
- ResourceQuota
- Chosen a Network plugin and configured network policies?
- Network policies allow you to configure how the various components of your Kubernetes environment communicate with each other and with outside network endpoints. Policies select pods by label and can allow traffic to or from other pods, namespaces and IP ranges. They are only enforced if your network plugin supports them.
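A common pattern, sketched here under assumed names, is to deny all ingress in a namespace by default and then open specific paths. The namespace, labels and port are hypothetical.

```yaml
# Deny all ingress traffic to every pod in the namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: team-a             # assumed namespace
spec:
  podSelector: {}               # empty selector matches all pods
  policyTypes:
    - Ingress
---
# Allow only frontend pods to reach api pods on TCP 8080.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-api
  namespace: team-a
spec:
  podSelector:
    matchLabels:
      app: api                  # assumed label
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend     # assumed label
      ports:
        - protocol: TCP
          port: 8080
```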
- Implemented authentication for kubelet?
- The kubelet allows anonymous requests to its HTTPS endpoint by default. You can restrict this by enabling RBAC, or disable it outright by starting the kubelet with the flag --anonymous-auth=false.
- You can then enable X509 client certificate authentication by starting the kubelet with the --client-ca-file flag and the API server with the --kubelet-client-certificate and --kubelet-client-key flags.
- You can also enable API bearer tokens by ensuring the authentication.k8s.io/v1beta1 API group is enabled in the API server and starting the kubelet with the --authentication-token-webhook and --kubeconfig flags.
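The kubelet-side settings can also be written in the kubelet configuration file rather than as flags. This sketch assumes the cluster CA bundle lives at the path shown, which varies by installer.

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
authentication:
  anonymous:
    enabled: false                            # equivalent of --anonymous-auth=false
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt  # assumed path; equivalent of --client-ca-file
  webhook:
    enabled: true                             # equivalent of --authentication-token-webhook
```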
- Configured Kubernetes secrets?
- Sensitive information related to your Kubernetes environment, such as a password, token or key, should always be stored in a Kubernetes Secret object.
- Enabled data encryption at rest?
- Encrypting data at rest is another security best practice. In Kubernetes, data can be encrypted using any of these four providers: aescbc, secretbox, aesgcm or kms.
- Encryption can be enabled by passing the --encryption-provider-config flag to the kube-apiserver process.
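The file passed to the API server is an EncryptionConfiguration. The sketch below encrypts Secrets with the aescbc provider; the key name is hypothetical and the key value is a placeholder that must be replaced with your own base64-encoded 32-byte key.

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:                 # first provider is used for writes
          keys:
            - name: key1        # hypothetical key name
              secret: <BASE64_ENCODED_32_BYTE_KEY>   # placeholder, not a real key
      - identity: {}            # allows reading data written before encryption was enabled
```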
- Disabled default service account?
- All newly created pods and containers without a service account are automatically assigned the default service account.
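- Because the default service account's token is automatically mounted into each of these pods, any permissions bound to that account are shared by all of them. Automatic mounting of its token should therefore be disabled.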
- You can do this by setting automountServiceAccountToken: false on the service account.
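A minimal sketch of that setting, applied to the default service account of an assumed namespace `team-a`:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: default
  namespace: team-a                     # assumed namespace
automountServiceAccountToken: false     # pods no longer get this account's token by default
```

Pods that genuinely need API access can then run under a dedicated service account with only the permissions they require.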
- Scanned containers for security vulnerabilities?
- Another security best practice is to scan your container images for known security vulnerabilities.
- You can do this using open source tools like Anchore and Clair which will help you identify common vulnerabilities and exposures (CVEs) and mitigate them.
- Configured security context for pods, containers and volumes?
- Security context specifies privilege and access control settings for pods and containers. Pod security context can be defined by including the securityContext field in the pod specification. Once a security context has been specified for a pod it automatically propagates to all the containers that belong to the pod.
- A best practice when setting the security context is to set runAsNonRoot to true and, in each container's security context, to set readOnlyRootFilesystem to true and allowPrivilegeEscalation to false. This adds more layers of defense to your container and Kubernetes environment and prevents privilege escalation.
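The sketch below shows where each of these fields lives: `runAsNonRoot` at the pod level, and `readOnlyRootFilesystem` and `allowPrivilegeEscalation` at the container level. The pod name, image, command and user ID are assumptions.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hardened                # hypothetical name
spec:
  securityContext:              # pod level: inherited by all containers
    runAsNonRoot: true
    runAsUser: 1000             # assumed non-root UID
  containers:
    - name: app
      image: busybox:1.36       # assumed image
      command: ["sleep", "3600"]
      securityContext:          # container level
        readOnlyRootFilesystem: true
        allowPrivilegeEscalation: false
```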
- Enabled Kubernetes logging?
- Kubernetes logs will help you understand what is happening inside your cluster as well as debug problems and monitor activity. Logs for containerized applications are usually written to the standard output and standard error streams.
- A best practice when implementing Kubernetes logging is to configure a separate lifecycle and storage for logs from pods, containers and nodes. You can do this by implementing a cluster level logging architecture.
Scalability
- Configured the horizontal autoscaler?
- The horizontal pod auto-scaler (HPA) automatically scales the number of pods in a deployment or replica set based on CPU utilization or custom metrics.
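A minimal CPU-based HPA, sketched with the `autoscaling/v1` API and assuming a Deployment named `web` exists; the replica bounds and target utilization are illustrative.

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: web                     # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                   # assumed Deployment
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70   # scale to keep average CPU near 70% of requests
```

Utilization is measured against each container's CPU request, so the target workload needs resource requests set for this to work.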
- Configured vertical pod autoscaler?
- The vertical pod auto-scaler (VPA) automatically sets resource requests and limits for containers and pods based on resource utilization metrics.
- The VPA can change resource limits and requests and can do this for new pods as well as existing pods.
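The VPA is installed as a separate add-on with its own custom resource. Assuming that add-on is present and serves the `autoscaling.k8s.io/v1` API, a sketch targeting a hypothetical Deployment named `web` looks like this:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web                     # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                   # assumed Deployment
  updatePolicy:
    updateMode: "Auto"          # apply recommendations to new and existing pods
```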
- Configured cluster autoscaler?
- The cluster autoscaler (CA) automatically scales cluster size based on two signals: whether there are any pending pods and how heavily the nodes are utilized.
- If the CA detects any pending pods during its periodic checks, it requests more nodes from the cloud provider. The CA will also downscale the cluster and remove idle nodes if they are underutilized.
Monitoring
- Set up a monitoring pipeline?
- There are a number of open source monitoring tools that you can use to monitor your Kubernetes clusters.
- Prometheus combined with Grafana is one of the most widely used monitoring toolsets among DevOps teams.
- Selected a list of metrics to monitor?
- Setting up a metrics pipeline also involves identifying a list of metrics that you want to track.
- In the context of resource management, the most useful metrics to track include usage and utilization metrics for CPU, memory and filesystem. These can be tracked at many different levels of abstraction, from clusters and namespaces to pods and nodes.
Bonus
- Run an end-to-end (e2e) test?
- End-to-end tests are a great way to ensure that your Kubernetes environment will behave in a consistent and reliable manner when pushed into production. End-to-end tests will also enable developers to identify bugs before pushing their application out to end users.
- You can run an e2e test by installing kubetest:
    go get -u k8s.io/test-infra/kubetest
    kubetest --build --up --test --down
- Mapped external services?
- Another best practice is to provision a DNS server as a cluster add-on. A DNS server is the recommended method for service discovery in Kubernetes. The DNS server will monitor the Kubernetes API for new Services and will create a set of DNS entries for each. Kubernetes recommends installing CoreDNS as the DNS server.
- The DNS add-on makes it easier for pods to connect to services by doing a DNS query for the service name alone (within the same namespace) or for servicename.namespacename.