luigi.contrib.external_program module
Template tasks for running external programs as luigi tasks.
This module is primarily intended for when you need to call a single external program or shell script, and it’s enough to specify program arguments and environment variables.
If you need to run multiple commands, chain them together or pipe output
from one command to the next, you’re probably better off using something like
plumbum, and wrapping plumbum commands in normal luigi Tasks.

class luigi.contrib.external_program.ExternalProgramTask(*args, **kwargs)
Bases: luigi.task.Task
Template task for running an external program in a subprocess.
The program is run using subprocess.Popen, with args passed as a list generated by program_args() (where the first element should be the executable). See subprocess.Popen for details.
You must override program_args() to specify the arguments you want, and you can optionally override program_environment() if you want to control the environment variables (see ExternalPythonProgramTask for an example).
By default, the output (stdout and stderr) of the external program is captured and displayed after execution has ended. This behaviour can be overridden by passing --capture-output False.
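The mechanism described above can be sketched with the standard library alone. This is only an illustration of the pattern (an argument list whose first element is the executable, handed to subprocess.Popen, with stdout and stderr captured), not luigi's actual implementation. The helper name run_program is hypothetical.

```python
import subprocess
import sys


def run_program(args, env=None):
    """Run args (a list, executable first) and capture its output.

    Hypothetical helper for illustration; not part of luigi.
    """
    proc = subprocess.Popen(
        args,
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode captured bytes to str
    )
    stdout, stderr = proc.communicate()  # waits for the program to exit
    return proc.returncode, stdout, stderr


# Use the current interpreter as the "external program" so the
# sketch does not depend on any particular tool being installed.
returncode, stdout, stderr = run_program(
    [sys.executable, "-c", "print('hello from a subprocess')"]
)
```

Passing the arguments as a list rather than as a single shell string means no shell is involved, so arguments containing spaces or quotes need no extra escaping.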

capture_output = Insignificant BoolParameter (defaults to True)

stream_for_searching_tracking_url = Insignificant ChoiceParameter (defaults to none). Choices: {none, stderr, stdout}
Defines which stream is searched for the tracking URL; may be set to ‘stdout’, ‘stderr’ or ‘none’.
The default value is ‘none’, so URL tracking is not performed.

tracking_url_pattern = Insignificant OptionalParameter (defaults to None)
Regex pattern used for searching for a URL in the logs of the external program.
If a log line matches the regex, the first group of the match is set as the tracking URL for the job in the web UI. Example: ‘Job UI is here: (https?://.*)’.
The default value is None, so URL tracking is not performed.
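To see what "the first group" means for the example pattern, here is a small sketch using Python's re module. The log lines and the URL are made up for illustration, and applying the pattern with re.search is an assumption about how the matching is done rather than a statement about luigi's internals.

```python
import re

# Example pattern quoted in the parameter description above.
tracking_url_pattern = r"Job UI is here: (https?://.*)"

# Hypothetical log lines; the URL is invented for this sketch.
log_lines = [
    "Starting job",
    "Job UI is here: http://example.com/jobs/42",
    "Job finished",
]

tracking_url = None
for line in log_lines:
    match = re.search(tracking_url_pattern, line)
    if match:
        tracking_url = match.group(1)  # first group holds the URL
        break
```

Only the parenthesised part of the pattern ends up in tracking_url, so the surrounding text ("Job UI is here: ") is not part of the URL.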

program_args()
Override this method to map your task parameters to the program arguments.
Returns: list to pass as args to subprocess.Popen

program_environment()
Override this method to control environment variables for the program.
Returns: dict mapping environment variable names to values
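As a rough illustration of the kind of dict such a method might return, the sketch below copies the current environment, adds one variable, and passes the result to a subprocess. It uses only the standard library and is not luigi code. The variable name GREETING is made up.

```python
import os
import subprocess
import sys

# Start from the current environment and add one variable.
# GREETING is a made-up name used only for this sketch.
env = os.environ.copy()
env["GREETING"] = "hello"

# The child process sees exactly the mapping passed as env.
output = subprocess.check_output(
    [sys.executable, "-c", "import os; print(os.environ['GREETING'])"],
    env=env,
    universal_newlines=True,
)
```

Copying os.environ first keeps variables such as PATH available to the child. Passing a dict that contains only the new variable would replace the environment entirely.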

always_log_stderr
When True, stderr will be logged even if program execution succeeded.
Override to False to log stderr only when program execution fails.

exception luigi.contrib.external_program.ExternalProgramRunError(message, args, env=None, stdout=None, stderr=None)
Bases: exceptions.RuntimeError

class luigi.contrib.external_program.ExternalPythonProgramTask(*args, **kwargs)
Bases: luigi.contrib.external_program.ExternalProgramTask
Template task for running an external Python program in a subprocess.
Simple extension of ExternalProgramTask, adding two luigi.parameter.Parameters for setting a virtualenv and for extending the PYTHONPATH.

virtualenv = Parameter (defaults to None): path to the virtualenv directory to use. It should point to the directory containing the bin/activate file used for enabling the virtualenv.

extra_pythonpath = Parameter (defaults to None): extend the search path for modules by prepending this value to the PYTHONPATH environment variable.
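The effect of these two parameters on the environment can be approximated as follows. This is a sketch under stated assumptions, not luigi's implementation. The helper python_environment is hypothetical, and treating "using a virtualenv" as prepending its bin directory to PATH and setting VIRTUAL_ENV is an assumption based on what an activate script normally does. The paths are made up.

```python
import os


def python_environment(base_env, virtualenv=None, extra_pythonpath=None):
    """Return a copy of base_env adjusted for a virtualenv and PYTHONPATH.

    Hypothetical helper; luigi's own logic may differ in detail.
    """
    env = dict(base_env)
    if extra_pythonpath:
        # Prepend, so the extra path takes precedence over existing entries.
        existing = env.get("PYTHONPATH")
        parts = [extra_pythonpath] + ([existing] if existing else [])
        env["PYTHONPATH"] = os.pathsep.join(parts)
    if virtualenv:
        # Assumed approximation of the main effects of bin/activate.
        bin_dir = os.path.join(virtualenv, "bin")
        existing_path = env.get("PATH")
        path_parts = [bin_dir] + ([existing_path] if existing_path else [])
        env["PATH"] = os.pathsep.join(path_parts)
        env["VIRTUAL_ENV"] = virtualenv
    return env


# Made-up paths, for illustration only.
env = python_environment(
    {"PATH": "/usr/bin", "PYTHONPATH": "/opt/lib"},
    virtualenv="/srv/venv",
    extra_pythonpath="/srv/app",
)
```

Because the extra path is prepended rather than appended, modules found there shadow same-named modules later on the search path.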