luigi.contrib.rdbms module
A common module for Postgres-like databases, such as Postgres or Redshift.
-
class luigi.contrib.rdbms.CopyToTable(*args, **kwargs)
Bases: luigi.task.MixinNaiveBulkComplete, luigi.contrib.rdbms._MetadataColumnsMixin, luigi.task.Task
An abstract task for inserting a data set into RDBMS.
Usage:
Subclass and override the following attributes:
- host
- database
- user
- password
- table
- columns
- port
-
host
-
database
-
user
-
password
-
table
-
port
-
columns = []
-
null_values = (None,)
-
column_separator = '\t'
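The defaults above suggest how a row could be turned into one line of copy input. The following is a minimal sketch under stated assumptions, not the module's own code: `format_row` and `null_marker` are hypothetical names, and the `\N` marker is assumed because it is the default null string of the PostgreSQL COPY text format. Values are not escaped here.

```python
def format_row(row, column_separator="\t", null_values=(None,), null_marker="\\N"):
    """Join one row into a single line of text (hypothetical helper)."""
    cells = []
    for value in row:
        if value in null_values:
            # Any value listed in null_values is written as the null marker.
            cells.append(null_marker)
        else:
            cells.append(str(value))
    return column_separator.join(cells)


line = format_row((1, None, "alice"))
```

A real implementation would also need to escape separators and newlines that occur inside values.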
-
create_table(connection)
Override to provide code for creating the target table.
By default, the table is created using the types (optionally) specified in columns.
If overridden, use the provided connection object to set up the table, so that the table is created and the data is inserted in the same transaction.
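As an illustration of creating a table from typed columns on a caller-supplied connection, here is a minimal sketch. It assumes that columns is a list of (name, type) pairs and that the connection follows the Python DB-API. `build_create_table_sql` and the `events` table are hypothetical, and sqlite3 is used only as a stand-in so the example runs without a database server.

```python
import sqlite3


def build_create_table_sql(table, columns):
    """Build a CREATE TABLE statement from (name, type) pairs."""
    if not columns or any(len(col) != 2 for col in columns):
        raise ValueError("every column needs a (name, type) pair")
    coldefs = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return f"CREATE TABLE {table} ({coldefs})"


def create_table(connection, table, columns):
    # Run the DDL on the caller's connection rather than opening a new one,
    # so that later inserts go through the same connection.
    connection.cursor().execute(build_create_table_sql(table, columns))


connection = sqlite3.connect(":memory:")
create_table(connection, "events", [("id", "INTEGER"), ("name", "TEXT")])
connection.execute("INSERT INTO events VALUES (1, 'signup')")
connection.commit()
```

The table and column names are interpolated directly into the statement, so they should come from trusted code, not from user input.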
-
update_id
This update ID is a unique identifier for this insert on this table.
-
output()
The output that this Task produces.
The output of the Task determines whether the Task needs to be run: the task is considered finished if and only if all of its outputs exist. Subclasses should override this method to return a single Target or a list of Target instances.
Implementation note
If running multiple workers, the output must be a resource that is accessible by all workers, such as a DFS or database. Otherwise, workers might compute the same output since they don’t see the work done by other workers.
See Task.output
-
init_copy(connection)
Override to perform custom queries.
Any code here will be performed in the same transaction as the main copy, just prior to copying data. Example use cases include truncating the table or removing all data older than X in the database to keep a rolling window of data available in the table.
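The rolling-window use case might look roughly like the sketch below. It is written as a free function, not as a method on a Luigi task. The `visits` table, its `day` column, and the `cutoff` argument are hypothetical, and sqlite3 stands in for the real database so the example is runnable.

```python
import sqlite3


def init_copy(connection, table, cutoff):
    # Delete rows older than the cutoff. No commit happens here, so the
    # delete and the copy that follows are committed together.
    connection.cursor().execute(f"DELETE FROM {table} WHERE day < ?", (cutoff,))


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE visits (day TEXT, hits INTEGER)")
connection.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [("2024-01-01", 5), ("2024-01-09", 7)],
)
connection.commit()

init_copy(connection, "visits", "2024-01-05")
# Stand-in for the main copy step.
connection.execute("INSERT INTO visits VALUES (?, ?)", ("2024-01-10", 3))
connection.commit()

remaining = [row[0] for row in connection.execute("SELECT day FROM visits ORDER BY day")]
```

Because the delete is not committed on its own, a failure during the copy can be rolled back without losing the old rows.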
-
class luigi.contrib.rdbms.Query(*args, **kwargs)
Bases: luigi.task.MixinNaiveBulkComplete, luigi.task.Task
An abstract task for executing an RDBMS query.
Usage:
Subclass and override the following attributes:
- host
- database
- user
- password
- table
- query
Optionally override:
- port
- autocommit
- update_id
Subclass and override the following methods:
- run
- output
-
host
Host of the RDBMS. Implementations should support the hostname:port form to encode the port.
-
port
Override to specify the port separately from host.
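One way an implementation could honor both conventions is sketched below. `split_host_port` is a hypothetical helper, not part of the module. The fallback of 5432 (the PostgreSQL default port) and the rule that a separately specified port takes precedence are assumptions. IPv6 literals are not handled.

```python
def split_host_port(host, port=None, default_port=5432):
    """Return (hostname, port) from a host string and an optional separate port."""
    hostname, sep, encoded = host.rpartition(":")
    if not (sep and encoded.isdigit()):
        # No trailing ":<digits>" part, so the whole string is the hostname.
        hostname, encoded = host, None
    if port is not None:
        # Assumption: a separately specified port wins over an encoded one.
        return hostname, int(port)
    if encoded is not None:
        return hostname, int(encoded)
    return hostname, default_port
```

For example, `split_host_port("db.example.com:5439")` yields the hostname and the encoded port, while a bare hostname falls back to the default.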
-
database
-
user
-
password
-
table
-
query
-
autocommit
-
update_id
Override to create a custom marker table ‘update_id’ signature for Query subclass task instances.