Configuring Performance Comparisons

From the Reporting Landing page, click Comparisons under the Performance Reports section to go to the Comparison Configurer.

The Workflow Versions available to compare are listed in the first column. Drag workflow versions of interest into the second column to select them. You must select at least two to perform a comparison.

The available Specifications are listed in the third column. They are colored and labeled to indicate the type of specification. Specifications are based on comparing thresholds to summary statistics that are either:

Aggregate ("Proportion Stats") or
Overall ("Basic Stats")

Summary statistics may also be truth-based (computed by comparing results to truths, e.g., precision, recall, F1, or the average deviation of a matched result property) or metadata-based (derived from the output files, e.g., the number and types or rates of variants, or error rates in a BAM file).
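As a rough illustration of the truth-based statistics named above, the sketch below computes precision, recall, and F1 from match counts. The function name and the counts are hypothetical and are not taken from the product; it only shows the standard definitions.

```python
def truth_based_stats(true_positives, false_positives, false_negatives):
    """Return precision, recall, and F1 from match counts against a truth set."""
    called = true_positives + false_positives      # everything the workflow reported
    expected = true_positives + false_negatives    # everything in the truth set
    precision = true_positives / called if called else 0.0
    recall = true_positives / expected if expected else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for one workflow version on one datasource.
stats = truth_based_stats(true_positives=90, false_positives=10, false_negatives=30)
print(stats)
```

A specification would then compare one of these values to its threshold.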

Drag the specifications of interest from the third column into the fourth column to select them. You must select at least one specification to generate a comparison report.

The summary at the top of your screen will tell you the number of versions and specifications you have selected for reporting. It also gives a toggle option (top right of the page) to show:

Only Shared Samples for "apples-to-apples" comparisons, or
All Samples to include all analyzed datasources.
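The difference between the two toggle settings can be pictured as a set operation. This is a minimal sketch with made-up version and sample names, assuming each workflow version has a known set of analyzed datasources; it is not the product's implementation.

```python
def samples_for_comparison(samples_by_version, shared_only=True):
    """Pick the datasources each version contributes to the comparison."""
    if len(samples_by_version) < 2:
        raise ValueError("select at least two workflow versions")
    if not shared_only:
        # "All Samples": every version keeps everything it analyzed.
        return {v: set(s) for v, s in samples_by_version.items()}
    # "Only Shared Samples": keep datasources analyzed by every version.
    shared = set.intersection(*(set(s) for s in samples_by_version.values()))
    return {v: set(shared) for v in samples_by_version}


# Hypothetical versions and sample names.
analyzed = {
    "version-A": {"S1", "S2", "S3"},
    "version-B": {"S2", "S3", "S4"},
}
shared_view = samples_for_comparison(analyzed, shared_only=True)
all_view = samples_for_comparison(analyzed, shared_only=False)
```

With these inputs, the shared view holds S2 and S3 for both versions, while the all-samples view keeps each version's full set.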

Understanding the Performance Comparison View

Comparing Aggregate Specifications

Aggregate statistics are summarized and compared to thresholds in the Performance Comparison View on both the aggregate and per-datasource level.

The stat listed above the graph, the thicker dashed line, and the larger points all represent each Workflow Version's aggregate result (in this case, 100% of sources in both versions have an Indel Rate of <= 0.25).

The points will be shown in green or red if the results are higher or lower, respectively, than the threshold in the specification. In the example, both points are green as they are both higher than the 90% threshold.
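The aggregate result in this example can be reproduced with a short sketch. The function names are hypothetical, and treating a value exactly at the threshold as passing is an assumption, since the text only describes results that are higher or lower than the threshold.

```python
def aggregate_proportion(values_by_source, stat_threshold, lower_is_better=True):
    """Fraction of datasources whose stat meets the per-datasource threshold."""
    if not values_by_source:
        raise ValueError("no datasources to aggregate")
    if lower_is_better:
        passing = [v <= stat_threshold for v in values_by_source.values()]
    else:
        passing = [v >= stat_threshold for v in values_by_source.values()]
    return sum(passing) / len(passing)


def point_color(proportion, required_proportion):
    # Assumption: a result exactly at the threshold counts as passing.
    return "green" if proportion >= required_proportion else "red"


# Indel Rate per datasource, using the values from the example in this section.
indel_rates = {
    "Freebayes": {"sample-1": 0.02},
    "bcftools": {"sample-1": 0.125},
}
proportions = {
    version: aggregate_proportion(rates, stat_threshold=0.25)
    for version, rates in indel_rates.items()
}
colors = {
    version: point_color(p, required_proportion=0.90)
    for version, p in proportions.items()
}
```

Both versions come out at a proportion of 1.0 (100%), which is above the 90% threshold, so both points are green.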

The dashed line connecting the points is colored grey to indicate that there is no change between the two versions.

The thinner dashed line and small points show the per-datasource results that are rolled up into this aggregate statistic (in this case, the single sample that has an Indel Rate of 0.02 in Freebayes and 0.125 in bcftools). Both points are colored green because they are below the threshold of 0.25. The dashed line connecting the points is colored red, indicating that the bcftools result was higher for that sample, which counts as a negative change for this "lower-better" statistic.
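The coloring of the connecting line can be sketched as a comparison between the two versions' values. The function name is hypothetical; grey for no change and red for a worse result follow the description here, while green for an improvement is an assumption.

```python
def trend_color(first_value, second_value, lower_is_better=True):
    """Color of the line connecting one version's result to the next."""
    if second_value == first_value:
        return "grey"  # no change between versions
    if lower_is_better:
        improved = second_value < first_value
    else:
        improved = second_value > first_value
    # Assumption: an improvement is drawn in green; the text only shows grey and red.
    return "green" if improved else "red"


# Per-datasource Indel Rate (lower is better): Freebayes 0.02, bcftools 0.125.
per_source_line = trend_color(0.02, 0.125, lower_is_better=True)

# Aggregate proportion (higher is better): 100% in both versions.
aggregate_line = trend_color(1.0, 1.0, lower_is_better=False)
```

For the example values, the per-datasource line is red and the aggregate line is grey, matching the graph described in this section.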

Get further information on the individual data points (stat value, datasource) by hovering or clicking on the points.

Note that the per-datasource results are graphed on the secondary axis.

Comparing Overall Specifications

Overall Specifications are summarized in the Performance Comparison View by rolling up the performance stats for all results meeting the filtering criteria in any analyzed datasource. The datasources whose results are included in the summary stat are shown to the right of the graphs. In the example shown below, datasources are only considered if they were analyzed by both workflow versions, for an "apples-to-apples" comparison. You can also choose to show all datasources analyzed for each version.
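One way to picture an overall roll-up is to pool the results from every included datasource before computing the statistic. The sketch below does this for precision using hypothetical counts and names; whether the product pools counts in exactly this way is an assumption.

```python
def overall_precision(counts_by_source, included_sources):
    """Pool match counts across the included datasources, then compute precision."""
    tp = sum(counts_by_source[s]["tp"] for s in included_sources if s in counts_by_source)
    fp = sum(counts_by_source[s]["fp"] for s in included_sources if s in counts_by_source)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical counts for one workflow version.
counts = {
    "S1": {"tp": 80, "fp": 20},
    "S2": {"tp": 95, "fp": 5},
}

# Datasources analyzed by both versions ("apples-to-apples").
pooled_shared = overall_precision(counts, included_sources=["S1", "S2"])

# A different selection of datasources gives a different summary value.
pooled_s2_only = overall_precision(counts, included_sources=["S2"])
```

Here the pooled value over S1 and S2 is 0.875, while S2 alone gives 0.95, which is why the list of included datasources shown next to the graph matters when reading the summary stat.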

Click any datapoint on the graph to get further information, including the value and version. Click the Explore link in the tooltip to open a new tab pivoting on just the results that went into that summary data point.
