Basics#
Template matching produces similarity estimates (scores) between the target and a set of rotations of the template. Local maxima in those scores correspond to putative occurrences of the template in the target. Identifying such local maxima is challenging because of:
False positives: Other cellular features or sample preparation artifacts that score highly
Wide peaks: High scores spread over multiple voxels around each particle
Edge artifacts: Inflated scores near tomogram boundaries
Variable backgrounds: Different score distributions across the tomogram
Peak Calling Strategies#
pytme implements different peak calling algorithms for specific scenarios.
Algorithm | Best For | Key Features |
---|---|---|
 | General use | Fast, reliable local maxima detection |
 | Crowded environments | Uses template mask to restrict potential matches |
 | Well-separated broad peaks | Robust, but may miss overlapping peaks |
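The mask-based strategy for crowded environments can be illustrated with a minimal sketch: repeatedly take the global maximum, then suppress the voxels covered by a mask centered on it. This is not pytme's implementation. The function name `call_peaks_by_masking` and the spherical `footprint` standing in for the template mask are assumptions made for illustration.

```python
import numpy as np


def call_peaks_by_masking(scores, footprint, num_peaks=100, min_score=0.0):
    """Pick peaks greedily and mask out the footprint around each pick."""
    work = np.array(scores, dtype=float)  # working copy, input stays intact
    half = np.array(footprint.shape) // 2
    peaks = []
    for _ in range(num_peaks):
        idx = np.unravel_index(np.argmax(work), work.shape)
        value = work[idx]
        if value <= min_score:
            break
        peaks.append((tuple(int(i) for i in idx), float(value)))

        # Clip the footprint where it would extend past the array edges.
        start = np.array(idx) - half
        stop = start + np.array(footprint.shape)
        lo = np.maximum(start, 0)
        hi = np.minimum(stop, work.shape)
        sub = tuple(slice(a, b) for a, b in zip(lo, hi))
        fp = tuple(slice(a - s, b - s) for a, b, s in zip(lo, hi, start))

        # work[sub] is a view, so the assignment modifies work in place.
        work[sub][footprint[fp]] = -np.inf
        work[idx] = -np.inf  # guarantee progress even for odd footprints
    return peaks


# Synthetic score volume with two nearby maxima and one isolated maximum.
scores = np.zeros((20, 20, 20))
scores[10, 10, 10] = 1.0
scores[10, 10, 12] = 0.9  # inside the footprint of the first peak
scores[3, 3, 3] = 0.8

# Spherical footprint of radius 3 voxels (stand-in for a template mask).
grid = np.indices((7, 7, 7)) - 3
footprint = (grid ** 2).sum(axis=0) <= 9

peaks = call_peaks_by_masking(scores, footprint)
```

Because each pick removes exactly the region the particle would occupy, neighboring particles can sit closer together than a fixed cubic exclusion window would allow.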
Key Parameters
--min-distance: Minimum separation between peaks in voxels (prevents multiple picks per particle)
--mask-edges: Automatically exclude tomogram boundaries
--min-boundary-distance: Minimum distance from tomogram boundaries in voxels
--num-peaks: Maximum number of peaks to identify
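To make the effect of these parameters concrete, the following sketch performs local maxima detection with `scipy.ndimage.maximum_filter`. It is an illustration under simplifying assumptions, not pytme's code: the function `call_peaks` and its keyword names are hypothetical and only mirror the command-line options, and the minimum distance is enforced with a cubic window rather than a Euclidean radius.

```python
import numpy as np
from scipy.ndimage import maximum_filter


def call_peaks(scores, min_distance=5, min_boundary_distance=0,
               num_peaks=1000, min_score=0.0):
    """Return coordinates and scores of local maxima, best first."""
    # A voxel is a candidate if it equals the maximum of the cubic
    # window of half-width min_distance around it.
    size = 2 * min_distance + 1
    local_max = maximum_filter(scores, size=size, mode="constant", cval=-np.inf)
    is_peak = (scores == local_max) & (scores > min_score)

    # Drop candidates closer than min_boundary_distance to any edge.
    if min_boundary_distance > 0:
        interior = np.zeros(scores.shape, dtype=bool)
        inner = tuple(
            slice(min_boundary_distance, s - min_boundary_distance)
            for s in scores.shape
        )
        interior[inner] = True
        is_peak &= interior

    # Keep at most num_peaks candidates, sorted by decreasing score.
    coords = np.argwhere(is_peak)
    values = scores[is_peak]
    order = np.argsort(values)[::-1][:num_peaks]
    return coords[order], values[order]


scores = np.zeros((32, 32, 32))
scores[5, 5, 5] = 1.0
scores[5, 5, 7] = 0.8     # within min_distance of the peak above
scores[20, 20, 20] = 0.9
scores[1, 1, 1] = 0.95    # too close to the boundary

coords, values = call_peaks(
    scores, min_distance=3, min_boundary_distance=2, num_peaks=5, min_score=0.1
)
```

Note that voxels on a plateau of exactly equal scores would all be reported by this sketch; a production implementation would need a tie-breaking rule.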
Number of Peaks#
The number of peaks is generally unknown, but we can determine suitable score cutoffs based on statistical properties of the cross-correlation. This approach sets the threshold automatically so that the expected number of false positives stays at or below the requested value:
postprocess.py \
--input-file results.pickle \
--n-false-positives 5 \
--output-format orientations
Alternatively, score cutoffs can be set using --min-score and --max-score. The maximum number of peaks is always given by --num-peaks.
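One way to derive a cutoff from an expected number of false positives is sketched below. It assumes that background scores are approximately Gaussian and that voxels are independent, which may differ from the statistics postprocess.py actually uses. The function name `score_cutoff` is hypothetical.

```python
import numpy as np
from scipy.stats import norm


def score_cutoff(scores, n_false_positives=5):
    """Cutoff at which a Gaussian background yields n_false_positives hits."""
    n_voxels = scores.size
    mean, std = scores.mean(), scores.std()
    # Solve n_voxels * P(score > cutoff) = n_false_positives for the cutoff.
    z = norm.isf(n_false_positives / n_voxels)
    return mean + z * std


# Synthetic background-only score volume (assumption: pure Gaussian noise).
rng = np.random.default_rng(0)
scores = rng.normal(loc=0.0, scale=0.1, size=(64, 64, 64))

cutoff = score_cutoff(scores, n_false_positives=5)
n_above = int((scores > cutoff).sum())
```

On a real score map the true particles inflate the mean and standard deviation slightly, which makes the cutoff conservative.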
Background Correction#
Cellular environments contain complex backgrounds that can yield high scores for regions that do not represent the template. Background correction avoids picking such regions. Conceptually, we compute template matching scores between the target and a different template, then exclude regions where both the template of interest and the alternative template score highly.
Single Background#
In the simplest case, a control template (for instance from running match_template.py with --scramble-phases) can be used to model the score background:
postprocess.py \
--input-file signal.pickle \
--background-file noise.pickle \
--output-format orientations
Multiple Background#
We can also account for different cellular components simultaneously:
postprocess.py \
--input-file ribosome.pickle \
--background-file noise.pickle membrane.pickle \
--output-format orientations
Multiple Entities#
When analyzing multiple templates simultaneously, we can distinguish between different macromolecular species:
postprocess.py \
--input-file ribosome.pickle proteasome.pickle \
--background-file noise.pickle \
--output-format orientations
In all cases, the tool will report statistics for foreground, background, and normalized scores:
> Foreground mean 0.125, std 0.087, max 0.445
> Background mean 0.089, std 0.023, max 0.234
> Normalized mean 0.067, std 0.078, max 0.298
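The idea behind the three scenarios above can be sketched with NumPy. The normalization shown here (subtract the voxel-wise maximum over all competing score maps, then clip at zero) is an assumption chosen for clarity and is not necessarily the formula postprocess.py applies. The function name `normalize_scores` and the tiny arrays are hypothetical.

```python
import numpy as np


def normalize_scores(foregrounds, backgrounds):
    """Subtract the strongest competing score from each foreground map."""
    foregrounds = [np.asarray(f, dtype=float) for f in foregrounds]
    backgrounds = [np.asarray(b, dtype=float) for b in backgrounds]
    normalized = []
    for i, foreground in enumerate(foregrounds):
        # Other foreground templates compete with this one, just like
        # the explicit backgrounds do.
        others = [f for j, f in enumerate(foregrounds) if j != i]
        competitors = np.stack(backgrounds + others)
        background = competitors.max(axis=0)
        # Voxels where a competitor scores higher are set to zero.
        normalized.append(np.clip(foreground - background, 0.0, None))
    return normalized


# Toy 2x2 score maps standing in for full tomogram-sized arrays.
ribosome = np.array([[0.5, 0.2], [0.1, 0.4]])
proteasome = np.array([[0.1, 0.3], [0.1, 0.1]])
noise = np.full((2, 2), 0.2)

ribosome_norm, proteasome_norm = normalize_scores(
    [ribosome, proteasome], [noise]
)
```

With a single foreground and a single background this reduces to the single-background case, and passing several backgrounds covers the multiple-background case.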
Local Optimization and Refinement#
For high-precision applications (e.g., fitting atomic structures), pytme offers local optimization:
postprocess.py \
--input-file results.pickle \
--local-optimization \
--peak-oversampling 2 \
--output-format alignment
Peak Oversampling: Achieves sub-voxel precision by interpolating score maxima. A factor of 2 provides half-voxel precision.
Local Optimization: Uses basin-hopping optimization to refine translation and rotation parameters around initial peaks. Most useful when analyzing small numbers of high-quality candidates.
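Sub-voxel refinement by oversampling can be sketched as follows: the score map is interpolated on a grid with spacing 1/factor around an integer peak, and the position of the interpolated maximum is returned. This is a generic illustration using `scipy.ndimage.map_coordinates`, not pytme's implementation. The function name `refine_peak` and the `radius` parameter are hypothetical.

```python
import numpy as np
from scipy.ndimage import map_coordinates


def refine_peak(scores, peak, factor=2, radius=1):
    """Locate the score maximum on a grid with spacing 1 / factor."""
    peak = np.asarray(peak, dtype=float)
    # Offsets from -radius to +radius in steps of 1 / factor.
    offsets = np.arange(-radius * factor, radius * factor + 1) / factor
    grids = np.meshgrid(*[p + offsets for p in peak], indexing="ij")
    coords = np.stack([g.ravel() for g in grids])
    # Cubic spline interpolation of the scores at the fine grid points.
    values = map_coordinates(scores, coords, order=3, mode="nearest")
    return coords[:, np.argmax(values)]


# Synthetic Gaussian score peak centered between voxels.
center = np.array([10.5, 12.0, 8.5])
grid = np.indices((24, 24, 24)).astype(float)
dist2 = sum((g - c) ** 2 for g, c in zip(grid, center))
scores = np.exp(-dist2 / (2 * 2.0 ** 2))

integer_peak = np.unravel_index(np.argmax(scores), scores.shape)
refined = refine_peak(scores, integer_peak, factor=2)
```

The achievable precision is bounded by the grid spacing 1/factor, which is why a factor of 2 corresponds to half-voxel precision.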
Coordinate System#
Our convention follows the schematics outlined in [1]. We use a right-handed coordinate system with orthogonal X, Y and Z axes. Euler angles are expressed counter-clockwise using the intrinsic ZYZ convention, with the first rotation around the Z-axis, the second around the new Y-axis and the third around the new Z-axis (see euler_to_rotationmatrix). The default orientation is the z-unit vector (0, 0, 1).
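The convention can be checked numerically with SciPy, where uppercase axis letters denote intrinsic rotations. This sketch assumes that SciPy's "ZYZ" definition matches the convention described above; results should be compared against euler_to_rotationmatrix before relying on them. The helper `orientation_vector` is hypothetical.

```python
import numpy as np
from scipy.spatial.transform import Rotation


def orientation_vector(euler_zyz_degrees):
    """Rotate the default orientation (0, 0, 1) by intrinsic ZYZ angles."""
    rotation = Rotation.from_euler("ZYZ", euler_zyz_degrees, degrees=True)
    return rotation.as_matrix() @ np.array([0.0, 0.0, 1.0])


# A 90 degree rotation about Y tips the z-unit vector onto the x-axis.
tilted = orientation_vector([0, 90, 0])

# A preceding 90 degree rotation about Z then carries it onto the y-axis.
tilted_and_turned = orientation_vector([90, 90, 0])

# Rotations about Z alone leave the default orientation unchanged.
unchanged = orientation_vector([45, 0, 30])
```

Only the first two angles change the direction of the orientation vector; the third angle spins the particle about its own axis.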
Details for Developers#
The output of match_template.py is a pickle file that can be read using load_pickle. All but the last element correspond to the return value of a given analyzer’s merge method. For the default analyzer MaxScoreOverRotations, the pickle file contains:
Scores: An array with scores mapped to translations.
Offset: Offset informing about shifts in coordinate systems.
Rotations: An array of optimal rotation indices for each translation.
Rotation Dictionary: Mapping of rotation indices to rotation matrices.
Sum of Squares: Sum of squares of scores for statistics.
Metadata: Coordinate system information and parameters for reproducibility.
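How the score array, the rotation index array and the rotation dictionary relate to each other can be sketched with synthetic stand-ins. The arrays below are fabricated for illustration and are not read from a real pickle file. The dictionary layout follows the description above (index to matrix), the helper `best_match` is hypothetical, and handling of the offset is omitted.

```python
import numpy as np

# Stand-ins for the first, third and fourth elements listed above.
scores = np.zeros((4, 4, 4))
scores[1, 2, 3] = 0.7
rotations = np.zeros((4, 4, 4), dtype=int)
rotations[1, 2, 3] = 1
rotation_dictionary = {
    0: np.eye(3),
    # 90 degree counter-clockwise rotation around the Z-axis.
    1: np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]),
}


def best_match(scores, rotations, rotation_dictionary):
    """Return translation, score and rotation matrix of the top voxel."""
    translation = np.unravel_index(np.argmax(scores), scores.shape)
    rotation_index = int(rotations[translation])
    rotation_matrix = rotation_dictionary[rotation_index]
    return translation, float(scores[translation]), rotation_matrix


translation, score, rotation_matrix = best_match(
    scores, rotations, rotation_dictionary
)
```

Storing one rotation index per voxel instead of a full matrix keeps the output compact, since the matrices are looked up only for the voxels that end up being picked.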
However, when the -p flag is used, the output structure differs:
Translations: A numpy array containing translations of peaks.
Rotations: A numpy array containing rotations of peaks.
Scores: Score of each peak.
Details: Additional information regarding each peak.
Metadata: Coordinate system information and parameters for reproducibility.