Let's discuss the pod-to-node relationship and how you can restrict which pods are placed on which nodes. The concept of taints and tolerations can be a bit confusing at first.
Let's take an example, using the analogy of a bug approaching a person.
To prevent the bug from landing on the person, we spray the person with a repellent spray, or a taint as we will call it. The bug is intolerant to the smell.
The taint applied on the person throws the bug off. However, there could be other bugs that are tolerant to the smell, so the taint does not really affect them. Those bugs will end up landing on the person.
So there are two things that decide if a bug can land on a person.
- The taint on the person.
- The bug's toleration level to that particular taint.
Let's map this scenario to Kubernetes. The person is a node and the bugs are pods. Taints and tolerations are used to set restrictions on which pods can be scheduled on which nodes.
Let's start with a simple cluster with three worker nodes.
When pods are created, the Kubernetes scheduler tries to place them on the available worker nodes. As of now there are no restrictions or limitations, so the scheduler places the pods across all the nodes to balance them out equally.
Let's assume that node 1 has dedicated resources for a particular use case or application. So we would like only the pods that belong to this application to be placed on node 1.
First we prevent all pods from being placed on the node by placing a taint on the node.
By default, pods have no tolerations. This means that unless specified otherwise, none of the pods can tolerate any taint. So in this case, no pod can be placed on node 1, as none of them can tolerate the taint.
This solves half of our requirement: no unwanted pods will be placed on this node. The other half is to enable certain pods to be placed on this node. For that, we must specify which pods are tolerant to this particular taint.
In our case, we would like to allow only pod D to be placed on this node, so we add a toleration to pod D.
Now when the scheduler tries to place this pod on node 1, it goes through. Node 1 can only accept pods that can tolerate the taint. When the scheduler tries to place pod A on node 1, the taint throws it off and it goes to node 2 instead. Likewise, the scheduler places the remaining pods accordingly.
So remember: taints are set on nodes, and tolerations are set on pods.
Use the kubectl taint nodes command to taint a node. Specify the name of the node to taint, followed by the taint itself, which is a key-value pair.
kubectl taint nodes node-name key=value:taint-effect
The taint-effect defines what happens to pods that do not tolerate the taint. There are three main effects.
- NoSchedule - pods will not be scheduled on the node.
- PreferNoSchedule - the system will try to avoid placing a pod on the node, but that is not guaranteed.
- NoExecute - new pods will not be scheduled on the node, and existing pods on the node, if any, will be evicted if they do not tolerate the taint. These pods may have been placed on the node before the taint was applied.
kubectl taint nodes node1 app=blue:NoSchedule
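If you later need to undo this, the same command with a trailing minus sign on the effect removes the taint. A quick sketch, assuming the taint applied above:

```shell
# Remove the app=blue:NoSchedule taint from node1
kubectl taint nodes node1 app=blue:NoSchedule-
```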
Tolerations are added to the pod definition file, under the spec section. A toleration that matches the taint we applied above looks like this:
tolerations:
- key: "app"
  operator: "Equal"
  value: "blue"
  effect: "NoSchedule"
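For context, here is a complete pod definition carrying such a toleration, showing where the tolerations field sits within spec. This is a sketch; the pod name and image are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod            # placeholder pod name
spec:
  containers:
  - name: nginx-container    # placeholder container
    image: nginx
  tolerations:               # matches the taint app=blue:NoSchedule
  - key: "app"
    operator: "Equal"
    value: "blue"
    effect: "NoSchedule"
```

Note that the values under tolerations are strings; quoting them is a safe convention in YAML.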
Remember, taints and tolerations are only meant to restrict nodes from accepting certain pods. In our case, node 1 can only accept pod D. It does not guarantee that pod D will always be placed on node 1. Since there are no taints or restrictions on nodes 2 and 3, pod D may end up on one of those nodes instead.
In other words, taints and tolerations do not tell a pod to go to a particular node. Instead, they tell a node to only accept pods with certain tolerations.
If your requirement is to restrict a pod to certain nodes, that is achieved through another concept called node affinity, which I will discuss later.
There is another interesting fact here. A Kubernetes cluster also has master nodes, which are technically just other nodes with all the capabilities to accept pods, plus they run all the management software. Yet, if you have noticed, the scheduler does not place any pods on the master nodes.
When a Kubernetes cluster is first set up, a taint is automatically set on the master node, which prevents any pods from being scheduled on it. You can see this, as well as modify this behavior if required. However, the best practice is to not deploy application workloads on a master node.
You can see this taint with the following command.
kubectl describe node kubemaster | grep Taint
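On a cluster set up with kubeadm, the output typically shows a taint keyed on the node's role. The exact key depends on the Kubernetes version, so take this expected output as a sketch: older releases used node-role.kubernetes.io/master, while newer ones use node-role.kubernetes.io/control-plane.

```shell
kubectl describe node kubemaster | grep Taint
# Taints:  node-role.kubernetes.io/control-plane:NoSchedule   (key varies by version)
```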