luigi.contrib.batch module

AWS Batch wrapper for Luigi

From the AWS website:

AWS Batch enables you to run batch computing workloads on the AWS Cloud.

Batch computing is a common way for developers, scientists, and engineers to access large amounts of compute resources, and AWS Batch removes the undifferentiated heavy lifting of configuring and managing the required infrastructure. AWS Batch is similar to traditional batch computing software. This service can efficiently provision resources in response to jobs submitted in order to eliminate capacity constraints, reduce compute costs, and deliver results quickly.

See AWS Batch User Guide for more details.

To use AWS Batch, you create a jobDefinition JSON that defines a docker run command, and then submit this JSON to the API to queue up the task. Behind the scenes, AWS Batch auto-scales a fleet of EC2 Container Service instances, monitors the load on these instances, and schedules the jobs.

This boto3-powered wrapper allows you to create Luigi Tasks that submit Batch jobDefinitions. You can pass either a dict (mapping directly to the jobDefinition JSON) or an Amazon Resource Name (ARN) for a previously registered jobDefinition.
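As a rough illustration of the dict form, the sketch below builds a minimal container job definition and serializes it to JSON. The job name, image, and command are hypothetical placeholders, and the field names are assumed to follow the AWS Batch RegisterJobDefinition request; check them against the AWS documentation before use.

```python
import json

# Hypothetical job definition: the name, image, and command are placeholders.
job_definition = {
    "jobDefinitionName": "example-job",
    "type": "container",
    "containerProperties": {
        "image": "python:3.11-slim",
        "vcpus": 1,
        "memory": 512,  # MiB
        # "Ref::message" is a placeholder filled from the job's parameters.
        "command": ["echo", "Ref::message"],
    },
    # Default parameter values, which a submitted job may override.
    "parameters": {"message": "hello"},
}

job_definition_json = json.dumps(job_definition, indent=2)
print(job_definition_json)
```

The same dict could be written to a file and registered once, after which tasks would refer to it by name or ARN.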


Requires:

  • boto3 package
  • Amazon AWS credentials discoverable by boto3 (e.g., by using aws configure from awscli)
  • An enabled AWS Batch job queue configured to run on a compute environment.

Written and maintained by Jake Feala (@jfeala) for Outlier Bio (@outlierbio)

exception luigi.contrib.batch.BatchJobException

Bases: exceptions.Exception

class luigi.contrib.batch.BatchClient(poll_time=10)

Bases: object


Get name of first active job queue
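A minimal sketch of how the first active queue might be selected, assuming a response shaped like the one returned by the boto3 Batch client's describe_job_queues call. The helper name first_active_queue and the sample data are hypothetical, and treating "active" as state ENABLED with status VALID is an assumption.

```python
def first_active_queue(response):
    """Return the name of the first enabled, valid queue, or None."""
    for queue in response.get("jobQueues", []):
        # Assumption: "active" means state ENABLED and status VALID.
        if queue.get("state") == "ENABLED" and queue.get("status") == "VALID":
            return queue["jobQueueName"]
    return None


# Sample data standing in for a describe_job_queues response.
sample_response = {
    "jobQueues": [
        {"jobQueueName": "paused", "state": "DISABLED", "status": "VALID"},
        {"jobQueueName": "default", "state": "ENABLED", "status": "VALID"},
    ]
}

print(first_active_queue(sample_response))
```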


Retrieve the first job ID matching the given name


Retrieve task statuses from ECS API

Parameters: job_id (str) – AWS Batch job uuid


get_logs(log_stream_name, get_last=50)

Retrieve log stream from CloudWatch

submit_job(job_definition, parameters, job_name=None, queue=None)

Wrap submit_job with useful defaults
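The defaults are not spelled out here, so the following is only a sketch of one plausible approach: generate a job name when none is given and fall back to a default queue. The helper build_submit_kwargs and its default_queue argument are hypothetical; the returned keys are the keyword arguments accepted by the boto3 Batch client's submit_job call.

```python
import uuid


def build_submit_kwargs(job_definition, parameters, job_name=None,
                        queue=None, default_queue="default"):
    """Assemble keyword arguments for a Batch submit_job request."""
    if job_name is None:
        # Assumption: a random, unique name is an acceptable default.
        job_name = "job-" + uuid.uuid4().hex[:8]
    return {
        "jobName": job_name,
        "jobQueue": queue or default_queue,
        "jobDefinition": job_definition,
        "parameters": parameters,
    }


kwargs = build_submit_kwargs("example-job", {"message": "hello"})
print(kwargs)
```

With a real client, these could be passed as client.submit_job(**kwargs).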


Poll task status until STOPPED
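The polling pattern can be sketched as below. This is not the library's implementation: the function wait_on_status is hypothetical, the status lookup is injected as a callable so the loop runs without AWS access, and the terminal states are assumed to be the AWS Batch job statuses SUCCEEDED and FAILED.

```python
import time

# Assumption: AWS Batch jobs end in one of these two statuses.
TERMINAL_STATES = {"SUCCEEDED", "FAILED"}


def wait_on_status(get_status, poll_time=10, sleep=time.sleep):
    """Call get_status() every poll_time seconds until a terminal status."""
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        sleep(poll_time)


# Simulated status sequence in place of real API calls.
statuses = iter(["RUNNABLE", "RUNNING", "SUCCEEDED"])
final_status = wait_on_status(lambda: next(statuses), poll_time=0,
                              sleep=lambda seconds: None)
print(final_status)
```

Injecting get_status and sleep keeps the loop easy to test; a caller would typically raise an error such as BatchJobException when the final status is FAILED.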


Register a job definition with AWS Batch, using a JSON

class luigi.contrib.batch.BatchTask(*args, **kwargs)

Bases: luigi.task.Task

Base class for an Amazon Batch job

AWS Batch requires you to register “job definitions”, which are JSON descriptions of how to issue the docker run command. This Luigi Task requires the name of a pre-registered Batch jobDefinition, passed as a Parameter.

  • job_definition (str) – name of pre-registered jobDefinition
  • job_name – name of specific job, for tracking in the queue and logs.
  • job_queue – name of job queue where job is going to be submitted.
job_definition = Parameter
job_name = OptionalParameter (defaults to None)
job_queue = OptionalParameter (defaults to None)
poll_time = IntParameter (defaults to 10)

The task run method, to be overridden in a subclass.



Override to return a dict of parameters for the Batch Task