luigi.contrib.salesforce

Functions

`ensure_utf`(value)
`get_soql_fields`(soql)	Gets queried columns names.
`parse_results`(fields, data)	Traverses ordered dictionary, calls _traverse_results() to recursively read into the dictionary depth of data

Classes

`QuerySalesforce`(args, *kwargs)
`SalesforceAPI`(username, password, security_token)	Class used to interact with the SalesforceAPI.
`salesforce`(args, *kwargs)	Config system to get config vars from 'salesforce' section in configuration file.

luigi.contrib.salesforce.get_soql_fields(soql)[source]: Gets queried columns names.

luigi.contrib.salesforce.ensure_utf(value)[source]

luigi.contrib.salesforce.parse_results(fields, data)[source]: Traverses ordered dictionary, calls _traverse_results() to recursively read into the dictionary depth of data

class luigi.contrib.salesforce.salesforce(*args, **kwargs)[source]

Config system to get config vars from ‘salesforce’ section in configuration file.

Did not include sandbox_name here, as the user may have multiple sandboxes.

username

Parameter whose value is a str, and a base class for other parameter types.

Parameters are objects set on the Task class level to make it possible to parameterize tasks. For instance:

class MyTask(luigi.Task):
    foo = luigi.Parameter()

class RequiringTask(luigi.Task):
    def requires(self):
        return MyTask(foo="hello")

    def run(self):
        print(self.requires().foo)  # prints "hello"

This makes it possible to instantiate multiple tasks, eg MyTask(foo='bar') and MyTask(foo='baz'). The task will then have the foo attribute set appropriately.

When a task is instantiated, it will first use any argument as the value of the parameter, eg. if you instantiate a = TaskA(x=44) then a.x == 44. When the value is not provided, the value will be resolved in this order of falling priority:

Any value provided on the command line:

To the root task (eg. --param xyz)

Then to the class, using the qualified task name syntax (eg. --TaskA-param xyz).

With [TASK_NAME]>PARAM_NAME: <serialized value> syntax. See Parameters from config Ingestion

Any default value set using the default flag.

Parameter objects may be reused, but you must then set the positional=False flag.

password

Parameter whose value is a str, and a base class for other parameter types.

Parameters are objects set on the Task class level to make it possible to parameterize tasks. For instance:

class MyTask(luigi.Task):
    foo = luigi.Parameter()

class RequiringTask(luigi.Task):
    def requires(self):
        return MyTask(foo="hello")

    def run(self):
        print(self.requires().foo)  # prints "hello"

This makes it possible to instantiate multiple tasks, eg MyTask(foo='bar') and MyTask(foo='baz'). The task will then have the foo attribute set appropriately.

When a task is instantiated, it will first use any argument as the value of the parameter, eg. if you instantiate a = TaskA(x=44) then a.x == 44. When the value is not provided, the value will be resolved in this order of falling priority:

Any value provided on the command line:

To the root task (eg. --param xyz)

Then to the class, using the qualified task name syntax (eg. --TaskA-param xyz).

With [TASK_NAME]>PARAM_NAME: <serialized value> syntax. See Parameters from config Ingestion

Any default value set using the default flag.

Parameter objects may be reused, but you must then set the positional=False flag.

security_token

Parameter whose value is a str, and a base class for other parameter types.

Parameters are objects set on the Task class level to make it possible to parameterize tasks. For instance:

class MyTask(luigi.Task):
    foo = luigi.Parameter()

class RequiringTask(luigi.Task):
    def requires(self):
        return MyTask(foo="hello")

    def run(self):
        print(self.requires().foo)  # prints "hello"

This makes it possible to instantiate multiple tasks, eg MyTask(foo='bar') and MyTask(foo='baz'). The task will then have the foo attribute set appropriately.

When a task is instantiated, it will first use any argument as the value of the parameter, eg. if you instantiate a = TaskA(x=44) then a.x == 44. When the value is not provided, the value will be resolved in this order of falling priority:

Any value provided on the command line:

To the root task (eg. --param xyz)

Then to the class, using the qualified task name syntax (eg. --TaskA-param xyz).

With [TASK_NAME]>PARAM_NAME: <serialized value> syntax. See Parameters from config Ingestion

Any default value set using the default flag.

Parameter objects may be reused, but you must then set the positional=False flag.

sb_security_token

Parameter whose value is a str, and a base class for other parameter types.

Parameters are objects set on the Task class level to make it possible to parameterize tasks. For instance:

class MyTask(luigi.Task):
    foo = luigi.Parameter()

class RequiringTask(luigi.Task):
    def requires(self):
        return MyTask(foo="hello")

    def run(self):
        print(self.requires().foo)  # prints "hello"

This makes it possible to instantiate multiple tasks, eg MyTask(foo='bar') and MyTask(foo='baz'). The task will then have the foo attribute set appropriately.

When a task is instantiated, it will first use any argument as the value of the parameter, eg. if you instantiate a = TaskA(x=44) then a.x == 44. When the value is not provided, the value will be resolved in this order of falling priority:

Any value provided on the command line:

To the root task (eg. --param xyz)

Then to the class, using the qualified task name syntax (eg. --TaskA-param xyz).

With [TASK_NAME]>PARAM_NAME: <serialized value> syntax. See Parameters from config Ingestion

Any default value set using the default flag.

Parameter objects may be reused, but you must then set the positional=False flag.

class luigi.contrib.salesforce.QuerySalesforce(*args, **kwargs)[source]

abstract property object_name: Override to return the SF object we are querying. Must have the SF “__c” suffix if it is a customer object.

property use_sandbox: Override to specify use of SF sandbox. True iff we should be uploading to a sandbox environment instead of the production organization.

property sandbox_name: Override to specify the sandbox name if it is intended to be used.

abstract property soql: Override to return the raw string SOQL or the path to it.

property is_soql_file: Override to True if soql property is a file path.

property content_type: Override to use a different content type. Salesforce allows XML, CSV, ZIP_CSV, or ZIP_XML. Defaults to CSV.

run()[source]

The task run method, to be overridden in a subclass.

See Task.run

merge_batch_results(result_ids)[source]: Merges the resulting files of a multi-result batch bulk query.

class luigi.contrib.salesforce.SalesforceAPI(username, password, security_token, sb_token=None, sandbox_name=None)[source]

Class used to interact with the SalesforceAPI. Currently provides only the methods necessary for performing a bulk upload operation.

API_VERSION = 34.0

SOAP_NS = '{urn:partner.soap.sforce.com}'

API_NS = '{http://www.force.com/2009/06/asyncapi/dataload}'

start_session()[source]: Starts a Salesforce session and determines which SF instance to use for future requests.

has_active_session()[source]

query(query, **kwargs)[source]

Return the result of a Salesforce SOQL query as a dict decoded from the Salesforce response JSON payload.

Parameters:: query – the SOQL query to send to Salesforce, e.g. “SELECT id from Lead WHERE email = ‘a@b.com’”

query_more(next_records_identifier, identifier_is_url=False, **kwargs)[source]

Retrieves more results from a query that returned more results than the batch maximum. Returns a dict decoded from the Salesforce response JSON payload.

Parameters:

next_records_identifier – either the Id of the next Salesforce object in the result, or a URL to the next record in the result.
identifier_is_url – True if next_records_identifier should be treated as a URL, False if next_records_identifer should be treated as an Id.

query_all(query, **kwargs)[source]

Returns the full set of results for the query. This is a convenience wrapper around query(…) and query_more(…). The returned dict is the decoded JSON payload from the final call to Salesforce, but with the totalSize field representing the full number of results retrieved and the records list representing the full list of records retrieved.

Parameters:: query – the SOQL query to send to Salesforce, e.g. SELECT Id FROM Lead WHERE Email = “waldo@somewhere.com”

restful(path, params)[source]: Allows you to make a direct REST call if you know the path Arguments: :param path: The path of the request. Example: sobjects/User/ABC123/password’ :param params: dict of parameters to pass to the path

create_operation_job(operation, obj, external_id_field_name=None, content_type=None)[source]

Creates a new SF job that for doing any operation (insert, upsert, update, delete, query)

Parameters:

operation – delete, insert, query, upsert, update, hardDelete. Must be lowercase.
obj – Parent SF object
external_id_field_name – Optional.

get_job_details(job_id)[source]

Gets all details for existing job

Parameters:: job_id – job_id as returned by ‘create_operation_job(…)’
Returns:: job info as xml

abort_job(job_id)[source]

Abort an existing job. When a job is aborted, no more records are processed. Changes to data may already have been committed and aren’t rolled back.

Parameters:: job_id – job_id as returned by ‘create_operation_job(…)’
Returns:: abort response as xml

close_job(job_id)[source]

Closes job

Parameters:: job_id – job_id as returned by ‘create_operation_job(…)’
Returns:: close response as xml

create_batch(job_id, data, file_type)[source]

Creates a batch with either a string of data or a file containing data.

If a file is provided, this will pull the contents of the file_target into memory when running. That shouldn’t be a problem for any files that meet the Salesforce single batch upload size limit (10MB) and is done to ensure compressed files can be uploaded properly.

Parameters:

job_id – job_id as returned by ‘create_operation_job(…)’
data

Returns:

Returns batch_id

block_on_batch(job_id, batch_id, sleep_time_seconds=5, max_wait_time_seconds=-1)[source]: Blocks until @batch_id is completed or failed. :param job_id: :param batch_id: :param sleep_time_seconds: :param max_wait_time_seconds:

get_batch_results(job_id, batch_id)[source]: DEPRECATED: Use get_batch_result_ids

get_batch_result_ids(job_id, batch_id)[source]

Get result IDs of a batch that has completed processing.

Parameters:

job_id – job_id as returned by ‘create_operation_job(…)’
batch_id – batch_id as returned by ‘create_batch(…)’

Returns:

list of batch result IDs to be used in ‘get_batch_result(…)’

get_batch_result(job_id, batch_id, result_id)[source]: Gets result back from Salesforce as whatever type was originally sent in create_batch (xml, or csv). :param job_id: :param batch_id: :param result_id: