`QuerySet` Processing¶

Minimal Example¶

This section describes the recommended way to create an interface that runs a node in batch over a default or provided queryset of a data-representing model.

The QuerySetRunner base class provides a reusable abstraction for the general process of executing some Node instance with inputs generated from a queryset.

For example, let us assume a Scan model that stores data scans in the database, and a "Scan Preprocessing" analysis of version "1.0" that we would like to run routinely with the following configuration:
{"harder": True, "better": 100, "stronger": "faster"}.
In addition, we know the analysis receives the Scan model’s path field’s value as its "input_file". The resulting subclass will look like:

from django_analyses.runner import QuerySetRunner
from myapp.models.scan import Scan

class ScanPreprocessingRunner(QuerySetRunner):
    DATA_MODEL = Scan
    ANALYSIS = "Scan Preprocessing"
    ANALYSIS_VERSION = "1.0"
    ANALYSIS_CONFIGURATION = {
        "harder": True,
        "better": 100,
        "stronger": "faster",
    }
    INPUT_KEY = "input_file"

    def get_instance_representation(self, instance: Scan) -> str:
        return str(instance.path)

And that’s it!

Note

The get_instance_representation() method will, if not overridden, return the instance as it is.

Using model instances as inputs is a fairly advanced usage scenario and outside the scope of this tutorial. The minimal example therefore includes this modification.
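The difference between the default behaviour and the override can be illustrated with a minimal, framework-free sketch. The BaseRunner and PathRunner classes and the FakeScan dataclass below are hypothetical stand-ins and not part of django_analyses; only the get_instance_representation() method name comes from the text above.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class FakeScan:
    """Hypothetical stand-in for a model instance with a `path` field."""

    path: Path


class BaseRunner:
    """Hypothetical base class mimicking the default hook behaviour."""

    def get_instance_representation(self, instance):
        # Default: hand the instance over unchanged.
        return instance


class PathRunner(BaseRunner):
    def get_instance_representation(self, instance: FakeScan) -> str:
        # Override: represent the instance by its file path, as a string.
        return str(instance.path)


scan = FakeScan(path=Path("/data/scan_001.nii"))
default_repr = BaseRunner().get_instance_representation(scan)
path_repr = PathRunner().get_instance_representation(scan)
```

Here default_repr is the FakeScan object itself, whereas path_repr is a plain string, which is the kind of value an analysis expecting a file path would accept.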

To run the specified node over all Scan instances in the database:

>>> runner = ScanPreprocessingRunner()
>>> runner.run()
Scan Preprocessing v1.0: Batch Execution

🔎 Default execution queryset generation:
Querying Scan instances...
1000 instances found.

⚖ Checking execution status for the input queryset:
Filtering existing runs...
20 existing runs found.
980 instances pending execution.

🔀 Generating input specifications:
980 input specifications prepared.

🚀Successfully started Scan Preprocessing v1.0 execution over 980 Scan instances🚀

QuerySetRunner took care of querying all instances of the Scan model, checking for pending runs, generating the required input specifications, and running them in the background.

To run over a particular queryset, simply pass the queryset to the run() method.
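The overall flow (use the default collection unless one is provided, skip items that already have a run, and build one input specification per pending item) can be sketched in plain Python. This is an illustration only and not how django_analyses is implemented: the in-memory lists and the prepare_inputs() function are hypothetical stand-ins for the database and the runner.

```python
from typing import Iterable, List, Optional

# Hypothetical in-memory stand-ins for database content.
ALL_SCAN_PATHS = ["/data/a.nii", "/data/b.nii", "/data/c.nii", "/data/d.nii"]
EXISTING_RUN_INPUTS = {"/data/a.nii"}  # inputs that already have a run

INPUT_KEY = "input_file"


def prepare_inputs(paths: Optional[Iterable[str]] = None) -> List[dict]:
    """Return input specifications for items that still need a run."""
    # Fall back to the default collection when none is provided.
    candidates = list(ALL_SCAN_PATHS if paths is None else paths)
    # Skip items for which a run already exists.
    pending = [path for path in candidates if path not in EXISTING_RUN_INPUTS]
    # Build one input specification per pending item.
    return [{INPUT_KEY: path} for path in pending]


default_inputs = prepare_inputs()
subset_inputs = prepare_inputs(["/data/a.nii", "/data/b.nii"])
```

The first call mirrors running over the default queryset, and the second mirrors passing a particular queryset to run(); in both cases the item with an existing run is excluded.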

Default `QuerySet` Filtering¶

To apply custom filtering to the data model’s queryset, override the filter_queryset() method. For example, if we would like to process only scans with "anatomical" in their description:

import logging
from django.db.models import QuerySet
from django_analyses.runner import QuerySetRunner
from myapp.models.scan import Scan

class ScanPreprocessingRunner(QuerySetRunner):
    DATA_MODEL = Scan
    ANALYSIS = "Scan Preprocessing"
    ANALYSIS_VERSION = "1.0"
    ANALYSIS_CONFIGURATION = {
        "harder": True,
        "better": 100,
        "stronger": "faster",
    }
    INPUT_KEY = "input_file"
    FILTER_QUERYSET_START = "Filtering anatomical scans..."

    def get_instance_representation(self, instance: Scan) -> str:
        return str(instance.path)

    def filter_queryset(
        self, queryset: QuerySet, log_level: int = logging.INFO
    ) -> QuerySet:
        queryset = super().filter_queryset(queryset, log_level=log_level)
        self.log_filter_start(log_level)
        queryset = queryset.filter(description__icontains="anatomical")
        self.log_filter_end(n_candidates=queryset.count(), log_level=log_level)
        return queryset

This time, when we run ScanPreprocessingRunner, we get the result:

>>> runner = ScanPreprocessingRunner()
>>> runner.run()
Scan Preprocessing v1.0: Batch Execution

🔎 Default execution queryset generation:
Querying Scan instances...
1000 instances found.
Filtering anatomical scans...
500 execution candidates found.

⚖ Checking execution status for the input queryset:
Filtering existing runs...
20 existing runs found.
480 instances pending execution.

🔀 Generating input specifications:
480 input specifications prepared.

🚀Successfully started Scan Preprocessing v1.0 execution over 480 Scan instances🚀

Note

Filtering is applied to provided querysets as well, not just the default.
super().filter_queryset(queryset) is called to apply any preceding filtering.
The log message is replaced by overriding the FILTER_QUERYSET_START class attribute, which filter_queryset() uses automatically to log the filtering of the input queryset.
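The role of the super() call can be seen in a small, framework-free sketch of chained filtering. The classes and the filter_items() method below are hypothetical and operate on plain lists rather than querysets; they only illustrate how each subclass applies the preceding filtering before adding its own.

```python
from typing import List


class BaseFilter:
    """Hypothetical base class; its filter keeps every item."""

    def filter_items(self, items: List[str]) -> List[str]:
        return list(items)


class NonEmptyFilter(BaseFilter):
    def filter_items(self, items: List[str]) -> List[str]:
        # Apply any preceding filtering first, then this class's own.
        items = super().filter_items(items)
        return [item for item in items if item]


class AnatomicalFilter(NonEmptyFilter):
    def filter_items(self, items: List[str]) -> List[str]:
        items = super().filter_items(items)
        # Case-insensitive containment, similar in spirit to `icontains`.
        return [item for item in items if "anatomical" in item.lower()]


descriptions = ["Anatomical T1", "", "functional run", "anatomical T2"]
filtered = AnatomicalFilter().filter_items(descriptions)
```

Omitting the super() call in AnatomicalFilter would silently drop the filtering defined by its parent classes, which is why the tutorial's override starts by delegating to super().filter_queryset().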

The QuerySetRunner class provides a wide range of utility attributes and functions that enable the automation of highly customized queryset processing. For more information, see the QuerySetRunner class reference.
