Jobs

There are different types of workloads that a container can serve.
Workloads such as web servers continue to run for a long period of time, until manually taken down. There are other kinds of workloads such as batch processing, analytics or reporting, that are meant to carryout specific tasks and finish.
Example: Performing a computation, processing an image, performing an analytic task, sending an email etc,. These are meant for short period of time.

Let us first see how these workloads work in docker.
In Docker,
docker run ubuntu expr 3+2 (performs a simple math operation)
In this case, docker container comes up, perform the simple operation, prints the output and then exists.
when we run docker ps -a command, we see the container in exit state.

In Kubernetes,
expr

Though the container computes output and exits, kubernetes continues to restart the container and bring it up again. This continues to happen until a threshold is reached.

Kubernetes wants our applications to live forever. The default behaviour of pods is to attempt to restart the container to keep it running.
This behaviour is defined by the property restartPolicy set on the pod which is by default set to always. Therefore, the pod always recreates the container when it exits. We can overwrite this behaviour by setting this property to Never or Onfailure. Thus, the kubernetes does not restart the container once the job is finished.

pod-definition file

apiVersion: v1
kind: Pod
metadata: 
  name: math-pod
spec:
  containers: 
    - name: math-add
      image: ubuntu
      command: ['expr', '3', '+', '2']

  restartPolicy: Never

While replicaset is used to make sure a specified number of pods are running at all time, a job is used to run a set of pods to perform a given task to completion.

We create a job using a definition file.

job-definition file

apiVersion: batch/v1
kind: Job
metadata:
  name: math-add-job
spec:
  template:
  # pod-definition specification
    spec:
      containers:
        - name: math-add
          image: ubuntu
          command: ['expr', '2', '+', '3']
      restartPolicy: Never

To create a job kubectl create -f job-definition.yaml

To view the jobs
kubectl get jobs

The standard ouput of the container can be seen in container logs
kubectl logs <podname>

To delete the job
kubectl delete job <jobname>

job

Job with multiple pods

To run multiple pods we set a value for completions under the job specification.

multiple-pods

By default, the pods are created one after the other. The second pod is created only after the first is finished.

If the pod fails, the job tries to create new pods until it has three successful completions, and that completes the job.

failed

Instead of getting the pods created sequentially, we can get them created in parallel. For this add the property called parallelism to the job specification. We set it to 3, to create 3 pods in parallel.

parallel

The job first creates 3 pods at once, two of which completes successfully. Now, we only need one more, it is intelligent enough to create one pod at a time, until we get a total 3 successful pods.

CronJobs

A CronJob is a job that can be scheduled. We can schedule and run a CronJob Periodically. Example: A job to generate a report and send an email.

cronjob-definition file

apiVersion: batch/v1beta1
kind: CronJob
metadata: 
  name: reporting-cron-job
spec:
  # schedule option takes cron like format string, where it takes the time when the job is to be run
  schedule: "*/1 * * * *"
  jobTemplate:
    # Job spec
    spec: 
      completions: 3
      parallelism: 3
      template:
        spec:
          containers: 
            - name: reporting-tool
              image: reporting-tool
          restartPolicy: Never

schedule

To create cronjob
kubectl create -f <cronjob-definition-file>

To view the available cronjobs kubectl get cronjob