luigi.interface module

This module contains the bindings for command line integration and dynamic loading of tasks.

If you don’t want to run luigi from the command line, you may use the methods defined in this module to run luigi programmatically.

class luigi.interface.core(*args, **kwargs)[source]

Bases: Config

Keeps track of a bunch of environment params.

Uses the internal luigi parameter mechanism. The nice thing is that we can instantiate this class and get an object with all the environment variables set. This is arguably a bit of a hack.

use_cmdline_section = False
ignore_unconsumed = {'autoload_range', 'no_configure_logging'}
local_scheduler = BoolParameter (defaults to False): Use an in-memory central scheduler. Useful for testing.
scheduler_host = Parameter (defaults to localhost): Hostname of machine running remote scheduler
scheduler_port = IntParameter (defaults to 8082): Port of remote scheduler api process
scheduler_url = Parameter (defaults to ): Full path to remote scheduler
lock_size = IntParameter (defaults to 1): Maximum number of workers running the same command
no_lock = BoolParameter (defaults to False): Ignore if similar process is already running
lock_pid_dir = Parameter (defaults to /tmp/luigi): Directory to store the pid file
take_lock = BoolParameter (defaults to False): Signal other processes to stop getting work if already running
workers = IntParameter (defaults to 1): Maximum number of parallel tasks to run
logging_conf_file = Parameter (defaults to ): Configuration file for logging
log_level = ChoiceParameter (defaults to DEBUG): Default log level to use when logging_conf_file is not set. Choices: {INFO, NOTSET, WARNING, CRITICAL, DEBUG, ERROR}
module = Parameter (defaults to ): Used for dynamic loading of modules
parallel_scheduling = BoolParameter (defaults to False): Use multiprocessing to do scheduling in parallel.
parallel_scheduling_processes = IntParameter (defaults to 0): The number of processes to use for scheduling in parallel. By default the number of available CPUs will be used
assistant = BoolParameter (defaults to False): Run any task from the scheduler.
help = BoolParameter (defaults to False): Show most common flags and all task-specific flags
help_all = BoolParameter (defaults to False): Show all command line flags

exception luigi.interface.PidLockAlreadyTakenExit[source]

Bases: SystemExit

The exception thrown by luigi.run() when the lock file is inaccessible.

luigi.interface.run(*args, **kwargs)[source]

Please don’t use this. Use the luigi binary instead.

Run from cmdline using argparse.

Parameters:

use_dynamic_argparse – Deprecated and ignored

luigi.interface.build(tasks, worker_scheduler_factory=None, detailed_summary=False, **env_params)[source]

Run internally, bypassing the cmdline parsing.

Useful if you have some luigi code that you want to run internally. Example:

luigi.build([MyTask1(), MyTask2()], local_scheduler=True)

One notable difference is that build defaults to not using the identical process lock. Otherwise, build would only be callable once from each process.

Parameters:
  • tasks

  • worker_scheduler_factory

  • env_params

Returns:

True if there were no scheduling errors, even if some tasks failed.