Summary
The postprocess.py tool analyzes results generated by match_template.py to identify and characterize top-scoring peaks from template matching.
postprocess.py --help
Tip
From version 0.3.0 onwards, postprocessing supports multiple input files and multiple background corrections. Multiple input files can be specified to distinguish between different macromolecular species; the corresponding class identifiers are made available in the orientations and RELION output formats. Additionally, multiple background corrections can be applied simultaneously via --background_file, so that users can account for several noise sources rather than a single background (e.g., from --scramble_phases) and incorporate complex cellular environments such as membrane backgrounds.
Depending on the subsequent use case, different output_format options are available, as outlined below.
A tab-separated file output.tsv containing eight columns will be created. The x, y and z columns correspond to the translation, and the euler_x, euler_y and euler_z columns to the rotation used to obtain the value in the score column. The detail column contains peak-caller-specific information.
postprocess.py \
--input_file output.pickle \
--output_prefix output \
--output_format orientations \
--mask_edges \
--min_boundary_distance 20 \
--num_peaks 1000
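The tab-separated orientations file can be inspected with a few lines of Python. The sketch below is only an illustration: it assumes the file starts with a header row that uses the eight column names listed above, the sample rows are invented, and the helper names read_orientations and top_scoring are hypothetical rather than part of the tool.

```python
import csv
import io


def read_orientations(handle):
    """Parse a tab-separated orientations table into a list of dicts.

    Assumes a header row with the columns x, y, z, euler_x, euler_y,
    euler_z, score and detail (the names given in the text above).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    rows = []
    for row in reader:
        rows.append(
            {
                "translation": tuple(float(row[k]) for k in ("x", "y", "z")),
                "rotation": tuple(
                    float(row[k]) for k in ("euler_x", "euler_y", "euler_z")
                ),
                "score": float(row["score"]),
                # Peak-caller-specific; kept as text because its meaning varies.
                "detail": row["detail"],
            }
        )
    return rows


def top_scoring(rows, n):
    """Return the n rows with the highest score, best first."""
    return sorted(rows, key=lambda r: r["score"], reverse=True)[:n]


# Invented sample data standing in for output.tsv.
SAMPLE = (
    "x\ty\tz\teuler_x\teuler_y\teuler_z\tscore\tdetail\n"
    "10\t20\t30\t0.0\t90.0\t0.0\t0.42\t-1\n"
    "55\t12\t8\t45.0\t30.0\t10.0\t0.87\t-1\n"
)

rows = read_orientations(io.StringIO(SAMPLE))
best = top_scoring(rows, 1)[0]
print(best["translation"], best["score"])
```

To read a real file, pass an open file handle, for example open("output.tsv", newline=""), instead of the in-memory sample.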
These options generate STAR files compatible with RELION 4 and 5. Both formats contain particle coordinates, Euler angles, scores, and source file references in a format that RELION can directly import. The underlying coordinates and angles are identical to those from output_format orientations. The created files differ in version headers and in the coordinate system: RELION 4 uses voxel coordinates, whereas RELION 5 centers the coordinates and scales them by the voxel size.
postprocess.py \
--input_file output.pickle \
--output_prefix output \
--output_format relion4 \
--mask_edges \
--min_boundary_distance 20 \
--num_peaks 1000
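The difference between the two coordinate systems can be illustrated with a small conversion. This is a sketch of the idea only, not the tool's implementation: the function name to_centered_coordinates is hypothetical, and taking the volume center as half the shape along each axis is an assumption (the exact center convention may differ).

```python
def to_centered_coordinates(voxel_coords, volume_shape, voxel_size):
    """Center voxel coordinates and scale them by the voxel size.

    Assumption: the center is shape / 2 along each axis. The convention
    actually used when writing RELION 5 files may differ.
    """
    center = [size / 2 for size in volume_shape]
    return tuple(
        (coord - mid) * voxel_size for coord, mid in zip(voxel_coords, center)
    )


# A peak at voxel (100, 50, 25) in a 200 x 200 x 50 volume, voxel size 2.0.
centered = to_centered_coordinates((100, 50, 25), (200, 200, 50), 2.0)
print(centered)
```

In this picture, the RELION 4 style output would keep (100, 50, 25) unchanged, while the centered form expresses the same position relative to the middle of the volume in physical units.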
The command below calls peaks in the same way as Orientations, but additionally applies each identified orientation to the template and writes the result to disk using the naming pattern {output_prefix}_{index}.{extension}. Index 0 corresponds to the highest-scoring orientation.
postprocess.py \
--input_file output.pickle \
--output_prefix output \
--output_format alignment \
--mask_edges \
--min_boundary_distance 20 \
--num_peaks 10
The command below calls peaks in the same way as Orientations, but additionally extracts subsets centered on each peak with the specified box size. The generated files follow the naming pattern {output_prefix}_{index}.mrc, where index 0 corresponds to the highest observed score.
postprocess.py \
--input_file output.pickle \
--output_prefix output \
--output_format extraction \
--mask_edges \
--min_boundary_distance 20 \
--num_peaks 100
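Extracting a box centered on a peak amounts to computing one index range per axis. The following sketch shows that arithmetic under stated assumptions: the helper box_slices is hypothetical, boxes that would extend past the volume are simply rejected, and how the tool itself treats peaks near the boundary is not covered here.

```python
def box_slices(center, box_size, volume_shape):
    """Return one slice per axis for a cubic box centered on `center`.

    Returns None if the box would extend beyond the volume. Rejecting such
    boxes is a choice made for this illustration only.
    """
    half = box_size // 2
    slices = []
    for coord, size in zip(center, volume_shape):
        start = coord - half
        stop = start + box_size
        if start < 0 or stop > size:
            return None
        slices.append(slice(start, stop))
    return tuple(slices)


inside = box_slices((10, 10, 10), 4, (20, 20, 20))
outside = box_slices((1, 10, 10), 4, (20, 20, 20))
print(inside, outside)
```

The returned tuple can be used directly to index a three-dimensional array, for example volume[inside] with a NumPy array named volume.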
The command below calls peaks in the same way as Orientations and computes a simple average based on the identified orientations.
postprocess.py \
--input_file output.pickle \
--output_prefix average \
--output_format average \
--mask_edges \
--min_boundary_distance 20 \
--num_peaks 100
The command below applies background correction and creates a new pickle file containing the corrected scores. The output is intended for visual assessment of the normalization procedure and can be reused as input for postprocessing.
postprocess.py \
--input_file output.pickle \
--output_prefix output_new \
--output_format pickle \
--background_file background1.pickle background2.pickle
Note
Orientations follow the conventions outlined in [1]. We use a right-handed coordinate system with orthogonal X, Y and Z axes. Euler angles are expressed using the intrinsic ZYZ convention, with the first rotation around the Z-axis, the second around the new Y-axis and the third around the new Z-axis (see euler_to_rotationmatrix). The default orientation is the z-unit vector (0, 0, 1).
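The intrinsic ZYZ convention described in this note can be written out as a product of three elementary rotations, R = Rz(alpha) Ry(beta) Rz(gamma). The sketch below is a textbook construction in plain Python and is not a copy of euler_to_rotationmatrix. The function names are hypothetical, and treating the angles as degrees is an assumption.

```python
import math


def rot_z(angle):
    """Right-handed rotation about the Z-axis (angle in radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_y(angle):
    """Right-handed rotation about the Y-axis (angle in radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def zyz_intrinsic(alpha, beta, gamma, degrees=True):
    """Rotation matrix for intrinsic ZYZ Euler angles.

    Intrinsic rotations compose left to right: Rz(alpha) @ Ry(beta) @ Rz(gamma).
    Assumption: angles are given in degrees unless degrees=False.
    """
    if degrees:
        alpha, beta, gamma = (math.radians(v) for v in (alpha, beta, gamma))
    return matmul(matmul(rot_z(alpha), rot_y(beta)), rot_z(gamma))


def apply(matrix, vector):
    """Apply a 3x3 matrix to a 3-vector."""
    return [sum(matrix[i][k] * vector[k] for k in range(3)) for i in range(3)]


# Rotate the default orientation (0, 0, 1).
rotated = apply(zyz_intrinsic(90, 90, 0), [0.0, 0.0, 1.0])
print([round(v, 6) for v in rotated])
```

Because the third rotation is about the Z-axis of the rotated frame, it leaves the direction of the default z-unit vector unchanged and only spins the object about that axis.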