Workflows and Pipelines

What is a workflow?

Written by Gwenn Berry
Updated over a week ago

What's a Workflow?

A workflow is an implementation of a pipeline: it defines the actual nuts and bolts that take you from inputs to results. A single pipeline can have several different workflows. Commonly, a pipeline has at least two: the default Manual Upload Workflow and one or more integrated workflows under test (WUT).

Workflow Scaffolds

Underlying any workflow is a Workflow Scaffold: a series of one or more steps, each of which points to a Component. When you create a workflow for your pipeline, you tell Miqa which files and properties of your inputs the workflow steps should use, and how the outputs should be post-processed so that they can be evaluated within your pipeline. Scaffolds can be reused across multiple pipelines and with different input and output combinations. After initial setup, you will usually work directly with a Workflow, without having to think about its underlying scaffold.
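As a rough illustration of how a scaffold relates to a workflow, the sketch below treats a scaffold as an ordered series of steps that point to Components, and a workflow as that scaffold plus input and output handling. This is not Miqa's actual data model or API: every class name, field name, and example value here is a hypothetical stand-in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One scaffold step; `component` names the Component it points to."""
    name: str
    component: str


@dataclass(frozen=True)
class WorkflowScaffold:
    """An ordered series of one or more steps (hypothetical structure)."""
    steps: tuple[Step, ...]

    def __post_init__(self):
        if not self.steps:
            raise ValueError("a scaffold needs at least one step")


@dataclass(frozen=True)
class Workflow:
    """A scaffold bound to one pipeline's inputs and output post-processing."""
    pipeline: str
    scaffold: WorkflowScaffold
    input_files: tuple[str, ...]  # which input files the steps should use
    output_postprocessing: str    # how outputs are prepared for evaluation


# One scaffold reused by two workflows in different pipelines,
# each with its own input combination (all values are made up).
scaffold = WorkflowScaffold(steps=(Step("align", "aligner"), Step("call", "caller")))
wf_a = Workflow("pipeline_a", scaffold, ("reads_1.fastq", "reads_2.fastq"), "vcf_summary")
wf_b = Workflow("pipeline_b", scaffold, ("sample.bam",), "vcf_summary")
```

Keeping the scaffold separate from the input and output bindings is what lets the same series of steps serve more than one pipeline.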

What's a Workflow Version?

Workflows describe the systems that get you from inputs to results, but they don't execute on their own. A Workflow Version pairs the series of steps defined in its parent Workflow with the artifact information needed to execute them (for example, a Docker hash, a gzipped compiled executable, or a versioned API endpoint). A Workflow Version may have one or more Executors, which combine the versioned information of the Workflow Version with any parameter sets defined in Workflow Variants.
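The relationship between these entities can be sketched as plain data. The example below uses a pared-down Workflow that holds only step names, and it assumes one Executor per Workflow Variant, which is an illustrative simplification rather than documented Miqa behavior. The class names, fields, and placeholder artifact references are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Workflow:
    """Describes the steps from inputs to results; not executable by itself."""
    name: str
    steps: list[str]


@dataclass
class WorkflowVersion:
    """The parent Workflow's steps plus the artifact needed to run each one."""
    workflow: Workflow
    artifacts: dict[str, str]  # step name -> artifact reference (placeholder format)

    def __post_init__(self):
        missing = [s for s in self.workflow.steps if s not in self.artifacts]
        if missing:
            raise ValueError(f"no artifact for step(s): {missing}")


@dataclass
class Executor:
    """A Workflow Version combined with one variant's parameter set."""
    version: WorkflowVersion
    variant: str
    parameters: dict[str, str] = field(default_factory=dict)


def build_executors(version, variants):
    """Create one Executor per variant (assumed mapping: name -> parameters)."""
    return [Executor(version, name, dict(params)) for name, params in variants.items()]


workflow = Workflow("example", ["align", "call"])
version = WorkflowVersion(
    workflow,
    {"align": "sha256:<align-image>", "call": "sha256:<call-image>"},
)
executors = build_executors(version, {"default": {}, "strict": {"min_quality": "30"}})
```

Here the same versioned artifacts back both Executors; only the parameter sets differ.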

Worked Example

For example, the following Workflow represents the GATK Best Practices germline variant calling pipeline:

This workflow is made up of four steps, each of which is executed by a distinct Component. The Component entity sets the defaults for how the component is to be called. Combining these instructions with the actual versioned artifact information creates a Component Version, the actual executable unit.

Thus, a Workflow Version for the GATK Germline Variant Calling workflow would look exactly like the Workflow above, just with version and executable information for each of its component steps.
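To make the Component and Component Version distinction concrete, here is a minimal sketch in which a Component carries only calling defaults and a Component Version adds one versioned artifact. The names `Component`, `ComponentVersion`, and `pin`, along with the example arguments and artifact strings, are assumptions made for illustration and do not come from Miqa.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """Defaults for how a step is called (hypothetical fields)."""
    name: str
    default_args: tuple[str, ...] = ()


@dataclass(frozen=True)
class ComponentVersion:
    """A Component combined with one versioned artifact: the executable unit."""
    component: Component
    artifact: str  # e.g. a docker hash or a versioned API endpoint

    def describe(self) -> str:
        args = " ".join(self.component.default_args)
        return f"{self.component.name} [{self.artifact}] {args}".strip()


def pin(components, artifacts):
    """Build one ComponentVersion per Component, preserving step order."""
    return [ComponentVersion(c, artifacts[c.name]) for c in components]


components = [Component("align", ("--threads", "4")), Component("call")]
artifacts = {"align": "sha256:<align-image>", "call": "https://example.invalid/api/v2"}
versions = pin(components, artifacts)
```

The list of pinned Component Versions mirrors the Workflow step for step, which is why a Workflow Version looks like its Workflow with version information filled in.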

While bwa versions may change fairly infrequently, the workflow will be updated as new releases of the GATK tools come out. So an initial Workflow Version based on GATK 4.1.8.1:

...may be upgraded to a Workflow Version based on GATK 4.1.9.0:
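In data terms, the difference between these two Workflow Versions might look like the sketch below: the GATK-backed steps move to the new release while the bwa step is left alone. The step names and the bwa version are illustrative assumptions, as is the `upgrade_tool` helper. Only the two GATK version numbers come from the text above.

```python
def upgrade_tool(
    pinned: dict[str, tuple[str, str]], tool: str, new_version: str
) -> dict[str, tuple[str, str]]:
    """Return a new mapping with every step that uses `tool` moved to `new_version`."""
    return {
        step: (name, new_version if name == tool else version)
        for step, (name, version) in pinned.items()
    }


# step name -> (tool, version); step names and the bwa version are placeholders.
v1 = {
    "align": ("bwa", "0.7.17"),
    "mark_duplicates": ("gatk", "4.1.8.1"),
    "recalibrate": ("gatk", "4.1.8.1"),
    "call_variants": ("gatk", "4.1.8.1"),
}
v2 = upgrade_tool(v1, "gatk", "4.1.9.0")

# Steps whose pinned version differs between the two Workflow Versions.
changed = sorted(step for step in v1 if v1[step] != v2[step])
```

Returning a new mapping instead of editing the old one keeps the earlier Workflow Version available for side-by-side comparison.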

If you're using open-source or other third-party tools such as GATK in your analytical workflow, you'll want to evaluate, benchmark, and verify each new version before incorporating it and shipping it out. You'll want to ensure that the results for your use case, on the types of data that matter to you, are at least as good as the results from the previous version. If you're developing tools in-house, or combining in-house tools with third-party or open-source software, you'll likely need these evaluations even more frequently as you make your own code changes.
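The "at least as good as the previous version" check can be expressed as a simple comparison of metrics between two versions. The following is a minimal sketch, not how Miqa performs its evaluations: the function name, the metric names, and the numbers are all invented for illustration, and it assumes that higher values are better for every metric.

```python
from typing import Optional


def regressions(
    baseline: dict[str, float],
    candidate: dict[str, float],
    tolerance: float = 0.0,
) -> dict[str, tuple[float, Optional[float]]]:
    """Return metrics where the candidate falls below baseline by more than `tolerance`.

    A metric missing from the candidate is reported as a regression with value None.
    """
    worse = {}
    for metric, old in baseline.items():
        new = candidate.get(metric)
        if new is None or new < old - tolerance:
            worse[metric] = (old, new)
    return worse


# Made-up numbers, not real benchmark results.
previous_version = {"snp_recall": 0.995, "snp_precision": 0.998}
new_version = {"snp_recall": 0.996, "snp_precision": 0.997}

strict = regressions(previous_version, new_version)
lenient = regressions(previous_version, new_version, tolerance=0.002)
```

With these invented numbers, the strict check flags `snp_precision`, while a tolerance of 0.002 lets the new version through. Choosing that tolerance is a judgment about your own use case and data.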

And of course, that's what we're here for!
