Runner

Module contents

Submodules

django_analyses.runner.queryset_runner module

Definition of the QuerySetRunner class.

class django_analyses.runner.queryset_runner.QuerySetRunner

Bases: object

Base class for batch queryset processing.

Example

For a general usage example, see the QuerySet Processing section in the documentation.

ANALYSIS_CONFIGURATION = None

Common input specification dictionary as it is expected to be defined in the required Node’s configuration field. If none is provided, defaults to _NO_CONFIGURATION.

ANALYSIS_TITLE = ''

Required Analysis instance title.

ANALYSIS_VERSION_TITLE = ''

Required AnalysisVersion instance title.

BASE_QUERY = None

Query to use when retrieving the base queryset.

BASE_QUERY_END = '{n_instances} instances found.'
BASE_QUERY_START = 'Querying {model_name} instances...'
BATCH_RUN_START = '\x1b[4m\x1b[95m\x1b[1m{analysis_version}\x1b[0m\x1b[4m\x1b[95m: Batch Execution\x1b[0m'
DATA_MODEL = None

QuerySet model.

DEFAULT_QUERYSET_QUERY = '\n\x1b[94m🔎 Default execution queryset generation:\x1b[0m'
EXECUTION_STARTED = '\n\x1b[92m🚀Successfully started {analysis_version} execution over {n_instances} {model_name} instances🚀\x1b[0m'
FILTER_QUERYSET_END = '{n_candidates} execution candidates found.'
FILTER_QUERYSET_START = 'Filtering queryset...'
INPUT_GENERATION = '\n🔀 \x1b[94mGenerating input specifications:\x1b[0m'
INPUT_GENERATION_FINISHED = '{n_inputs} input specifications prepared.'
INPUT_GENERATION_PROGRESSBAR_KWARGS = {'desc': 'Preparing inputs', 'unit': 'instance'}

A dictionary used for tqdm progressbar customization during input generation.

INPUT_KEY = ''

Key of the associated AnalysisVersion instance’s InputDefinition, which is used to query pending runs and to execute them.

INPUT_QUERYSET_VALIDATION = '\n\x1b[94m🔎 Input queryset validation:\x1b[0m'
INPUT_QUERY_END = '{n_existing} runs found.'
INPUT_QUERY_START = 'Querying existing runs...'
NONE_PENDING = '\x1b[92mCongratulations! No pending {model_name} instances were detected in the database 👏\x1b[0m'
NONE_PENDING_IN_QUERYSET = '\x1b[92mAll {n_instances} provided {model_name} instances have been processed already 👑\x1b[0m'
NO_CANDIDATES = '\x1b[93mNo execution candidates detected in {model_name} queryset!\x1b[0m'
PENDING_FOUND = '{n_existing} existing runs found.\n\x1b[1m{n_pending}\x1b[0m instances pending execution.'
PENDING_QUERY_START = '\n⚖ \x1b[94mChecking execution status for the {queryset_description} queryset:\n\x1b[0mFiltering existing runs...\n(large querysets might take a few moments to be evaluated)'
PREPROCESSING_FAILURE = '\x1b[93mFailed to preprocess {model_name} #{instance_id}!\x1b[0m'
PREPROCESSING_FAILURE_REPORT = '\x1b[93m\x1b[1m{n_invalid} of {n_total} {model_name} instances failed to be preprocessed for input generation.\x1b[0m'
STATUS_QUERY_PROGRESSBAR_KWARGS = {'desc': None, 'unit': 'instance'}

A dictionary used for tqdm progressbar customization during the execution status query.
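The message attributes listed above contain `{placeholder}` fields and ANSI styling sequences, so they appear to be intended as `str.format` templates. The following is a minimal sketch of how such a template could be rendered, with or without the styling. The `render` helper and the `"Scan"` model name are hypothetical and not part of django_analyses; only the two template strings are copied from the listing.

```python
import re

# Two of the message templates listed above, copied verbatim.
PENDING_FOUND = (
    "{n_existing} existing runs found.\n"
    "\x1b[1m{n_pending}\x1b[0m instances pending execution."
)
NONE_PENDING_IN_QUERYSET = (
    "\x1b[92mAll {n_instances} provided {model_name} instances "
    "have been processed already 👑\x1b[0m"
)

# Matches ANSI SGR (colour/style) sequences such as "\x1b[92m" or "\x1b[0m".
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def render(template: str, plain: bool = False, **fields) -> str:
    """Hypothetical helper: fill the placeholders, optionally strip styling."""
    message = template.format(**fields)
    return ANSI_SGR.sub("", message) if plain else message


# "Scan" is a made-up model name used only for illustration.
pending_message = render(PENDING_FOUND, plain=True, n_existing=12, n_pending=3)
done_message = render(
    NONE_PENDING_IN_QUERYSET, plain=True, n_instances=15, model_name="Scan"
)
```

Stripping the styling in this way can be useful when the messages are written to a log file rather than a terminal.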

analysis

Returns the required analysis.

Returns:Analysis to be executed
Return type:Analysis

See also

query_analysis()

analysis_version

Returns the required analysis version.

Returns:Analysis version to be executed
Return type:AnalysisVersion

configuration

Returns the configuration dictionary for the execution node.

Returns:Node configuration
Return type:dict

create_configuration() → dict

Returns the configuration dictionary for the execution node.

Returns:Node configuration
Return type:dict

create_input_specification(instance: django.db.models.base.Model) → dict

Returns an input specification dictionary with the given data instance as input.

Parameters:instance (Model) – Data instance to be processed
Returns:Input specification dictionary
Return type:dict

See also

create_inputs()

create_inputs(queryset: django.db.models.query.QuerySet, progressbar: bool = True, max_total: int = None) → List[Dict[str, List[str]]]

Returns a list of dictionary input specifications.

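Parameters:
  • queryset (QuerySet) – Batch of instances to run the analysis over
  • progressbar (bool, optional) – Whether to display a progressbar, by default True
  • max_total (int, optional) – Maximal total number of input specifications to generate, by default None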
Returns:

Input specifications

Return type:

List[Dict[str, List[str]]]
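The relationship between create_input_specification() and create_inputs() can be illustrated with plain Python. This is only a sketch under stated assumptions: it supposes that each input specification combines the shared node configuration with the instance’s representation stored under INPUT_KEY, and that max_total caps the number of specifications. The values `"in_file"` and `{"threshold": 0.5}` and the string-based representation are hypothetical, and the functions below are stand-ins rather than the library’s implementation.

```python
from typing import Any, Iterable, List, Optional

# Hypothetical stand-ins for the documented class attributes.
INPUT_KEY = "in_file"
CONFIGURATION = {"threshold": 0.5}


def get_instance_representation(instance: Any) -> Any:
    # Assumption: an instance is represented by its string form.
    return str(instance)


def create_input_specification(instance: Any) -> dict:
    # Assumption: shared configuration plus the instance under INPUT_KEY.
    return {**CONFIGURATION, INPUT_KEY: get_instance_representation(instance)}


def create_inputs(
    instances: Iterable[Any], max_total: Optional[int] = None
) -> List[dict]:
    specifications = []
    for instance in instances:
        # Stop once the optional cap on the number of specifications is hit.
        if max_total is not None and len(specifications) >= max_total:
            break
        specifications.append(create_input_specification(instance))
    return specifications


inputs = create_inputs(["scan_1", "scan_2", "scan_3"], max_total=2)
```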

evaluate_queryset(queryset: django.db.models.query.QuerySet, apply_filter: bool = True, log_level: int = 20) → django.db.models.query.QuerySet

Evaluates a provided queryset by applying any required filters or generating the default queryset (if None).

Parameters:
  • queryset (QuerySet) – Provided queryset
  • apply_filter (bool) – Whether to pass the queryset through filter_queryset() or not
  • log_level (int, optional) – Logging level to use, by default 20 (INFO)
Returns:

Evaluated execution queryset

Return type:

QuerySet
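The integer log_level defaults used throughout this class correspond to the level constants of Python’s standard logging module. A short sketch showing the mapping and how a logger’s threshold decides whether a message at a given level is emitted; the logger name is hypothetical.

```python
import logging

# The integer defaults in the signatures map onto the standard level names.
info_name = logging.getLevelName(20)
debug_name = logging.getLevelName(10)

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("queryset_runner_example")
logger.setLevel(logging.INFO)

# With an INFO threshold, level 20 messages pass and level 10 messages do not.
info_enabled = logger.isEnabledFor(20)
debug_enabled = logger.isEnabledFor(10)
```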

filter_queryset(queryset: django.db.models.query.QuerySet, log_level: int = 20) → django.db.models.query.QuerySet

Applies any custom filtering to a given data model’s queryset.

Parameters:
  • queryset (QuerySet) – A collection of the data model’s instances
  • log_level (int, optional) – Logging level to use, by default 20 (INFO)
Returns:

Filtered queryset

Return type:

QuerySet

get_base_queryset(log_level: int = 20) → django.db.models.query.QuerySet

Returns the base queryset of the data model’s instances.

Parameters:log_level (int, optional) – Logging level to use, by default 20 (INFO)
Returns:All data model instances
Return type:QuerySet

get_instance_representation(instance: django.db.models.base.Model) → Any

Returns the representation of a single instance from the queryset as an Input instance’s value.

Parameters:instance (Model) – Instance to be represented as input value
Returns:Input value
Return type:Any

get_or_create_node() → django_analyses.models.pipeline.node.Node

Get or create the required execution node according to the specified analysis configuration.

Returns:Execution node
Return type:Node

has_run(instance: django.db.models.base.Model) → bool

Checks whether the provided instance has an existing run in the database.

Parameters:instance (Model) – Data instance to check
Returns:Whether the data instance has an existing run or not
Return type:bool

input_definition

Returns the data instance’s matching input definition.

Returns:Data instance input definition
Return type:InputDefinition

input_set

Returns a queryset of existing Input instances for the executed node.

Returns:Existing inputs
Return type:QuerySet

log_base_query_end(n_instances: int, log_level: int = 20) → None

Logs the result of querying the base queryset.

Parameters:
  • n_instances (int) – Number of instances found by the base query
  • log_level (int, optional) – Logging level to use, by default 20 (INFO)

log_base_query_start(log_level: int = 20) → None

Logs the generation of the base queryset.

Parameters:log_level (int, optional) – Logging level to use, by default 20 (INFO)

log_execution_start(n_instances: int, log_level: int = 20) → None

Log the start of a batch execution over some queryset.

Parameters:
  • n_instances (int) – Number of instances in the queryset
  • log_level (int, optional) – Logging level to use, by default 20 (INFO)

See also

run()

log_filter_end(n_candidates: int, log_level: int = 20) → None

Logs the result of queryset filtering prior to execution.

Parameters:
  • n_candidates (int) – Number of execution candidates in queryset after filtering
  • log_level (int, optional) – Logging level to use, by default 20 (INFO)

log_filter_start(log_level: int = 20) → None

Logs the beginning of queryset filtering prior to execution.

Parameters:log_level (int, optional) – Logging level to use, by default 20 (INFO)

log_none_pending(queryset: django.db.models.query.QuerySet, log_level: int = 20) → None

Log an empty queryset of pending instances.

Parameters:
  • queryset (QuerySet) – Provided or generated execution queryset
  • log_level (int, optional) – Logging level to use, by default 20 (INFO)

See also

query_progress()

log_pending(existing: django.db.models.query.QuerySet, pending: django.db.models.query.QuerySet, log_level: int = 20) → None

Log the number of pending vs. existing instances.

Parameters:
  • existing (QuerySet) – Instances with existing runs
  • pending (QuerySet) – Instances pending execution
  • log_level (int, optional) – Logging level to use, by default 20 (INFO)

log_progress_query_end(queryset: django.db.models.query.QuerySet, existing: django.db.models.query.QuerySet, pending: django.db.models.query.QuerySet, log_level: int = 20) → None

Logs the execution progress query’s result.

Parameters:
  • queryset (QuerySet) – Full execution queryset
  • existing (QuerySet) – Instances with existing results
  • pending (QuerySet) – Instances pending execution
  • log_level (int, optional) – Logging level to use, by default 20 (INFO)

See also

query_progress()

log_progress_query_start(log_level: int = 20) → None

Logs the beginning of the execution progress query.

Parameters:log_level (int, optional) – Logging level to use, by default 20 (INFO)

See also

query_progress()

log_run_start(log_level: int = 20) → None

Logs the beginning of the run() method’s execution.

Parameters:log_level (int, optional) – Logging level to use, by default 20 (INFO)

See also

run()

node

Returns the required execution node.

Returns:Node to be executed
Return type:Node

query_analysis() → django_analyses.models.analysis.Analysis

Returns the analysis to be executed.

Returns:Executed analysis
Return type:Analysis

query_analysis_version() → django_analyses.models.analysis_version.AnalysisVersion

Returns the analysis version to be executed.

Returns:Executed analysis version
Return type:AnalysisVersion

query_input_definition() → django_analyses.models.input.definitions.input_definition.InputDefinition

Returns the input definition which corresponds to the queryset instances.

Returns:Instance input definition
Return type:InputDefinition

query_input_set(log_level: int = 10) → django.db.models.query.QuerySet

Returns a queryset of existing Input instances of the execution node.

Parameters:log_level (int, optional) – Logging level to use, by default 10 (DEBUG)
Returns:Existing inputs
Return type:QuerySet

query_progress(queryset: django.db.models.query.QuerySet = None, apply_filter: bool = True, log_level: int = 20, progressbar: bool = True) → Tuple[django.db.models.query.QuerySet, django.db.models.query.QuerySet]

Splits the queryset into instances with and without existing runs. If no queryset is provided, generates the default execution queryset.

Parameters:
  • queryset (QuerySet, optional) – Queryset to split by run status, by default None
  • apply_filter (bool) – Whether to pass the queryset through filter_queryset() or not
  • log_level (int, optional) – Logging level to use, by default 20 (INFO)
  • progressbar (bool, optional) – Whether to display a progressbar, by default True
Returns:

Instances with existing runs, instances pending execution

Return type:

Tuple[QuerySet, QuerySet]
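The split described for query_progress() can be pictured with ordinary Python collections. The sketch below is an illustration only, not the library’s implementation: it assumes that an instance counts as having an existing run when its representation is found among the values of existing inputs, and the `split_by_run_status` function and its arguments are hypothetical.

```python
from typing import Any, Callable, Iterable, List, Tuple


def split_by_run_status(
    instances: Iterable[Any],
    existing_inputs: Iterable[Any],
    represent: Callable[[Any], Any] = str,
) -> Tuple[List[Any], List[Any]]:
    """Hypothetical helper: partition instances into (existing, pending)."""
    seen = set(existing_inputs)
    existing, pending = [], []
    for instance in instances:
        # Assumption: a run exists if the representation is a known input value.
        if represent(instance) in seen:
            existing.append(instance)
        else:
            pending.append(instance)
    return existing, pending


existing, pending = split_by_run_status([1, 2, 3, 4], existing_inputs={"1", "3"})
```

Only the pending part would then need input specifications and execution.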

run(queryset: django.db.models.query.QuerySet = None, max_total: int = None, prep_progressbar: bool = True, log_level: int = 20, dry: bool = False)

Executes this class’s node in batch over all data instances in queryset. If none is provided, queries a default execution queryset.

Parameters:
  • queryset (QuerySet, optional) – Queryset to run, by default None
  • max_total (int, optional) – Maximal total number of runs, by default None
  • prep_progressbar (bool, optional) – Whether to display a progressbar for input generation, by default True
  • log_level (int, optional) – Logging level to use, by default 20 (INFO)
  • dry (bool, optional) – Whether this is a dry run (no execution) or not, by default False