The goal is to provide a solid basis for more general spine image analysis problems.
We propose a novel deep learning architecture, the Transformed Deep Convolution Network (TDCN), for multi-modal vertebra recognition. The architecture fuses image features from different modalities in an unsupervised manner and automatically corrects vertebra pose. The TDCN-based recognition system simultaneously identifies the locations, labels, and poses of vertebra structures in both MR and CT images.[1]
The task of automatic vertebra recognition is to identify global spine and local vertebra structural information, such as spine shape, vertebra location, and vertebra pose.
We propose a novel anatomy-inspired Hierarchical Deformable Model (HDM) that implements a comprehensive cross-modality vertebra recognition framework. The framework provides simultaneous identification of local and global spine information in arbitrary image views. The HDM simulates the local/global structures of the spine to perform deformable matching of spine images.[2]
A deep-learning-based supervised detection approach
Advantages:
(1) cross-modality
(2) feature enhancement
Transformed DCN: a novel invariant deep network
The TDCN automatically extracts the most representative and invariant features for MR/CT. It employs MR–CT feature fusion to enhance feature discriminability, and applies alignment transforms to the input data to generate invariant representations. This resolves the modality and pose variation problems in vertebra recognition.[1]
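As a minimal sketch of the fusion idea (not the papers' actual network): features are extracted per modality and concatenated into one fused descriptor. The random-filter "extractor" below is a toy stand-in for learned convolutional features.

```python
import numpy as np

def conv_features(patch, kernels):
    # Toy feature extractor: mean-pooled responses of valid 2D correlations.
    kh, kw = kernels.shape[1:]
    windows = np.lib.stride_tricks.sliding_window_view(patch, (kh, kw))
    responses = np.einsum('ijkl,nkl->nij', windows, kernels)
    return responses.mean(axis=(1, 2))  # one pooled value per filter

def fuse_mr_ct(mr_patch, ct_patch, kernels):
    # Fuse modalities by concatenating the per-modality feature vectors.
    return np.concatenate([conv_features(mr_patch, kernels),
                           conv_features(ct_patch, kernels)])

rng = np.random.default_rng(0)
kernels = rng.standard_normal((4, 3, 3))   # 4 random 3x3 "filters" (toy)
mr_patch = rng.random((32, 32))
ct_patch = rng.random((32, 32))
fused = fuse_mr_ct(mr_patch, ct_patch, kernels)
print(fused.shape)   # (8,): 4 features per modality, concatenated
```

In the actual TDCN the per-modality features are learned without supervision and the fused representation feeds the downstream classifier; the concatenation step above is only the fusion skeleton.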
A computational anatomy approach: Hierarchical Deformable Model
A three-stage recognition approach: landmark detection, global shape registration, and local pose adjustment. This comprehensive method provides simultaneous identification of local and global spine structures, with each stage implemented by the Hierarchical Deformable Model.
Tri-planar template matching (Step A)
(1) Apply 2D template matching on the sagittal, axial, and coronal views using deep features
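The per-view matching step can be sketched with plain normalized cross-correlation (NCC); this is a generic stand-in using raw intensities, whereas the paper matches in a learned deep-feature space.

```python
import numpy as np

def ncc_match(image, template):
    # Slide `template` over `image`; return the normalized cross-correlation
    # map and the top-left corner of the best match.
    th, tw = template.shape
    t = (template - template.mean()) / (template.std() + 1e-8)
    windows = np.lib.stride_tricks.sliding_window_view(image, (th, tw))
    w = windows - windows.mean(axis=(2, 3), keepdims=True)
    denom = windows.std(axis=(2, 3)) * th * tw + 1e-8
    score = np.einsum('ijkl,kl->ij', w, t) / denom
    best = np.unravel_index(score.argmax(), score.shape)
    return score, best

# Plant a synthetic "vertebra" patch and recover its location.
image = np.zeros((20, 20))
template = np.arange(25, dtype=float).reshape(5, 5)
image[5:10, 7:12] = template
score, best = ncc_match(image, template)
print(tuple(map(int, best)))  # -> (5, 7)
```

Running this on each of the three orthogonal views, with one template per vertebra, yields the per-view landmark candidates that the tri-planar step combines.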
Global shape registration (Step B)
(1) Reduce the problem to point-set registration
(2) Registration is adaptively driven by matching with the tri-planar models
(3) 'Anchor' vertebrae (e.g., S1) prevent translational misalignment of the other vertebrae
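A toy version of the registration step: fit a weighted 2D rigid transform (Kabsch algorithm) from model vertebra centers to detected ones, with a large weight on the anchor vertebra so it effectively pins the alignment. This is a simplified stand-in for the paper's adaptive, matching-driven registration.

```python
import numpy as np

def weighted_rigid_fit(src, dst, weights):
    # Weighted Kabsch: find R, t minimizing sum_i w_i * ||R @ src_i + t - dst_i||^2.
    w = np.asarray(weights, float)
    w = w / w.sum()
    mu_s = w @ src                      # weighted centroids
    mu_d = w @ dst
    H = (src - mu_s).T @ ((dst - mu_d) * w[:, None])
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, d]) @ U.T  # proper rotation (det = +1)
    t = mu_d - R @ mu_s
    return R, t

# Synthetic spine: S1 first, then three higher vertebra centers.
model = np.array([[0.0, 0.0], [0.2, 1.0], [0.5, 2.0], [1.0, 3.0]])
theta = 0.2
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
detected = model @ R_true.T + np.array([3.0, -1.0])
weights = np.array([10.0, 1.0, 1.0, 1.0])   # heavily weight the S1 anchor
R, t = weighted_rigid_fit(model, detected, weights)
print(np.allclose(R, R_true), np.allclose(t, [3.0, -1.0]))  # True True
```

The anchor weight is what keeps the fitted translation tied to S1, illustrating why a confidently detected anchor vertebra prevents translational drift of the rest of the column.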
Local pose alignment (Step C)
(1) Congealing across multiple slices and different image views
(2) Recover 3D poses via back-projection of the aligned 2D planar poses
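The back-projection step can be illustrated with a toy convention: treat the in-plane angle recovered on the sagittal view as a rotation about the left-right (x) axis and the axial in-plane angle as a rotation about the superior-inferior (z) axis, then compose them into one 3D pose. The axis assignment here is an assumed convention for illustration, not the papers' exact formulation.

```python
import numpy as np

def rot_x(deg):
    a = np.deg2rad(deg); c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_z(deg):
    a = np.deg2rad(deg); c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def pose_from_planar_angles(sagittal_deg, axial_deg):
    # Toy back-projection: compose the in-plane angles measured on two
    # orthogonal views into a single 3D rotation (assumed axis convention).
    return rot_z(axial_deg) @ rot_x(sagittal_deg)

R = pose_from_planar_angles(10.0, 5.0)
print(np.allclose(R @ R.T, np.eye(3)))  # a valid rotation: True
```

Congealing supplies the aligned per-slice 2D angles; a composition like this lifts them back to a 3D vertebra pose.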
Validation was performed on cross-modality MR–CT datasets containing a total of 150 volumes with varying pathologies. An SVM was trained for vertebra/non-vertebra classification, and a set of 1150 patches, sampled from a combined total of 10 MR and CT volumes, was used to train the TDCN system. Ground-truth annotation of the testing data and slice selection followed the standard radiology protocol of spine physicians and were performed as separate manual processes. Testing used 110 MR and CT sagittal slices from 90 MR–CT volumes, excluding the training volumes. Both lumbar scans and whole-spine MR and CT slices were tested to demonstrate the generality of the method.[1]
Validation was performed on T1/T2 MR and CT modalities using a combined total of 140 MR and CT samples from three different datasets. The data cover lumbar, thoracic, cervical, and whole-spine scans. The initial HDM model was constructed from MR and CT image patches collected from different spine sections and views. The deep network of the local appearance module was trained using randomly sampled planar patches, and the HDM planar templates were constructed from lumbar and thoracic patches. The HDM 3D spine model was built manually. Single-slice processing was tested, and specific slices were sampled for 3D volume data and multi-slice data. Pose accuracy was evaluated against ground-truth values, along with the correct labelling rate and the vertebra/non-vertebra classification rate.[2]
1. Yunliang Cai, Mark Landis, David T. Laidley, Anat Kornecki, Andrea Lum, and Shuo Li. Multi-Modal Vertebrae Recognition using Transformed Deep Convolution Network. CMIG. 2016.
2. Yunliang Cai, Said Osman, Manas Sharma, Mark Landis, and Shuo Li. Multi-Modality Vertebra Recognition in Arbitrary Views using 3D Deformable Hierarchical Model. IEEE TMI. 2015.