luigi.contrib.presto module

class luigi.contrib.presto.presto(*args, **kwargs)[source]

Bases: luigi.task.Config

host = Parameter (defaults to localhost): Presto host
port = IntParameter (defaults to 8090): Presto port
user = Parameter (defaults to anonymous): Presto user
catalog = Parameter (defaults to hive): Default catalog
password = Parameter (defaults to None): User password
protocol = Parameter (defaults to https): Presto connection protocol
poll_interval = FloatParameter (defaults to 1.0): how often, in seconds, to ask the Presto REST interface for a progress update
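
As an illustration of the parameters above, the following sketch parses a sample configuration section with the standard library only. It assumes the usual Luigi convention that a Config class reads its values from a section named after the class (here `[presto]`). The host name and other values are placeholders, and the `settings` dictionary is a hypothetical helper, not part of the module.

```python
import configparser

# Hypothetical luigi.cfg contents; the values are placeholders, not recommendations.
SAMPLE_CFG = """
[presto]
host = presto.example.internal
port = 8090
user = etl
catalog = hive
protocol = https
poll_interval = 2.5
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_CFG)
section = parser["presto"]

# Fall back to the documented defaults and convert to the documented types
# (IntParameter for port, FloatParameter for poll_interval).
settings = {
    "host": section.get("host", fallback="localhost"),
    "port": section.getint("port", fallback=8090),
    "user": section.get("user", fallback="anonymous"),
    "catalog": section.get("catalog", fallback="hive"),
    "password": section.get("password", fallback=None),
    "protocol": section.get("protocol", fallback="https"),
    "poll_interval": section.getfloat("poll_interval", fallback=2.5),
}
print(settings)
```

Options left out of the section (such as password here) take the defaults listed above.
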
class luigi.contrib.presto.PrestoClient(connection, sleep_time=1)[source]

Helper class wrapping pyhive.presto.Connection for executing Presto queries and tracking progress

percentage_progress
Returns: percentage of overall query progress
info_uri
Returns: query UI link
execute(query, parameters=None, mode=None)[source]
Parameters:
  • query – query to run
  • parameters – parameters to inject into the query
  • mode – “fetch” yields rows, “watch” yields log entries
Returns: generator of rows or log entries, depending on mode
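
The difference between the two modes can be sketched with a stand-in cursor. This is a simplified illustration under stated assumptions, not the library's code: `FakeCursor`, the free function `execute`, its argument order, and the shape of the status dictionaries are all hypothetical.

```python
from contextlib import closing
from time import sleep


class FakeCursor:
    """Hypothetical stand-in for a cursor that can be polled for status."""

    def __init__(self, statuses, rows):
        self._statuses = list(statuses)
        self._rows = list(rows)

    def execute(self, query, parameters=None):
        self.query = query

    def poll(self):
        # Return the next status dict, or None once the query has finished.
        return self._statuses.pop(0) if self._statuses else None

    def fetchall(self):
        return self._rows

    def close(self):
        pass


def execute(cursor, query, mode, parameters=None, sleep_time=0.0):
    """Yield status entries in "watch" mode, result rows in "fetch" mode."""
    if mode not in ("watch", "fetch"):
        raise ValueError("unknown mode: %r" % mode)
    with closing(cursor):
        cursor.execute(query, parameters)
        while True:
            sleep(sleep_time)  # wait between progress polls
            status = cursor.poll()
            if status is None:
                break
            if mode == "watch":
                yield status
        if mode == "fetch":
            for row in cursor.fetchall():
                yield row


statuses = [{"progress": 50.0}, {"progress": 100.0}]
rows = [(1,), (2,)]
watched = list(execute(FakeCursor(statuses, rows), "SELECT 1", "watch"))
fetched = list(execute(FakeCursor(statuses, rows), "SELECT 1", "fetch"))
print(watched)
print(fetched)
```

In this sketch both modes poll until the query finishes; only what is yielded to the caller differs.
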

class luigi.contrib.presto.WithPrestoClient[source]

Bases: luigi.task_register.Register

A metaclass for injecting PrestoClient as a _client field into a new instance of class T. Presto connection options are taken from the fields of the T instance. Fields should have the same names as in pyhive.presto.Cursor.
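
The injection pattern described above can be sketched with a plain metaclass. This is a generic illustration, not the module's implementation: `WithClient`, `FakeClient`, `Job`, and the `OPTION_NAMES` tuple are hypothetical, and the sketch assumes that options left unset on the instance are simply skipped.

```python
class FakeClient:
    """Hypothetical stand-in for a client built from connection options."""

    def __init__(self, **options):
        self.options = options


# Hypothetical option names; in the real module they would mirror the
# argument names of the wrapped cursor class.
OPTION_NAMES = ("host", "port", "user", "catalog")


class WithClient(type):
    """Metaclass that adds a read-only `_client` property to each new class."""

    def __new__(mcs, name, bases, attrs):
        def _client(self):
            # Collect only the options set to a truthy value on the instance.
            options = {
                key: getattr(self, key)
                for key in OPTION_NAMES
                if getattr(self, key, None)
            }
            return FakeClient(**options)

        attrs["_client"] = property(_client)
        return super().__new__(mcs, name, bases, attrs)


class Job(metaclass=WithClient):
    host = "localhost"
    port = 8090
    user = None  # unset, so it is not passed to the client


client = Job()._client
print(client.options)
```

Because `_client` is a property, the client is built from whatever values the instance fields hold at access time.
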

class luigi.contrib.presto.PrestoTarget(client, catalog, database, table, partition=None)[source]

Bases: luigi.target.Target

Target for Presto-accessible tables

count()[source]
exists()[source]
Returns: True if the given table exists and the given partition contains at least one row; False if the partition has no rows or the table is absent
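
The truth table for exists() can be illustrated with an in-memory stand-in. Everything in this sketch is hypothetical (`FakeCatalog`, `TableMissing`, and the free function `exists`); it only mirrors the documented outcome and assumes a missing table is reported as an error that gets translated into False.

```python
class TableMissing(Exception):
    """Hypothetical error raised by the stub when a table is absent."""


class FakeCatalog:
    """Hypothetical in-memory stand-in for a Presto-accessible catalog."""

    def __init__(self, tables):
        # Maps table name -> {partition value -> row count}.
        self._tables = tables

    def count(self, table, partition):
        if table not in self._tables:
            raise TableMissing(table)
        return self._tables[table].get(partition, 0)


def exists(catalog, table, partition):
    """True only if the table is present and the partition has rows."""
    try:
        return catalog.count(table, partition) > 0
    except TableMissing:
        return False


catalog = FakeCatalog({"events": {"2024-01-01": 42, "2024-01-02": 0}})
results = [
    exists(catalog, "events", "2024-01-01"),  # table present, rows present
    exists(catalog, "events", "2024-01-02"),  # table present, partition empty
    exists(catalog, "clicks", "2024-01-01"),  # table absent
]
print(results)
```
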
class luigi.contrib.presto.PrestoTask(*args, **kwargs)[source]

Bases: luigi.contrib.rdbms.Query

Task for executing Presto queries. During execution, the tracking URL and percentage progress are set.

host
port
user
username
schema
password
catalog
poll_interval
source
partition
protocol
session_props
requests_session
requests_kwargs
query = None
run()[source]

The task run method, to be overridden in a subclass.

See Task.run

output()[source]

Override with an RDBMS Target (e.g. PostgresTarget or RedshiftTarget) to record execution in a marker table