Model Fitting using Microstructure Imaging of Crossing (MIX): DIPY
Diffusion MRI measures water diffusion in biological tissue, which can be used to probe its microstructure. The most common model for water diffusion in tissue is the diffusion tensor (DT), which assumes a Gaussian distribution. This assumption of Gaussian diffusion oversimplifies the diffusive behavior of water in complex media and is known experimentally to break down for relatively large b-values. DT derived indices, such as mean diffusivity or fractional anisotropy, can correlate with major tissue damage, but lack sensitivity and specificity to subtle pathological changes.
Microstructure Imaging of Crossing (MIX) is versatile and thus suitable to a broad range of generic multicompartment models, in particular for brain areas where axonal pathways cross.
Multicompartment models (assess the variability of diffusion in sub-voxel regions) enable the estimation of more specific indices, such as axon diameter, density, orientation, and permeability, and so potentially give much greater insight into tissue architecture and sensitivity to pathology.
The goal of Model Fitting: Identity which model compartments are essential to explain the data and parameters that are potentially estimable from a particular experiment.
As a part of GSoC, I worked a lot with Model Fitting using the Neurite Orientation Dispersion and Density Imaging (NODDI) model with its implementation using the MIX framework.
Achievements and Benchmarks
The MIX framework for Microstructure Imaging is a very novel and advanced technique, however, required a higher fitting time. As reported by the MIX paper the MATLAB implementation took 191.8 seconds to fit, whereas, the DIPY implementation that I worked on is significantly faster, taking only 10-14 seconds to fit.
This basically means that the current implementation is ~ 20x faster than the state-of-the-art.
Pull Requests with Detailed Descriptions:
https://github.com/nipy/dipy/pull/1600 -> Branch: https://github.com/ShreyasFadnavis/dipy/tree/noddix_speed
https://github.com/nipy/dipy/pull/1614 -> Branch: https://github.com/ShreyasFadnavis/dipy/tree/noddix_gsoc
The above mentioned PRs contain: Main Code for the NODDIx model (contained in: dipy/dipy/reconst) Simulation and Fitting Simulated Signal (contained in: dipy/dipy/sims) Testing the model (contained in: dipy/dipy/reconst/tests) Example for NODDIx using HCP example (contained in: dipy/docs/examples [only present in the noddix_speed branch])
Link to submission via the PSF GSoC blog : https://blogs.python-gsoc.org/shreyas-fadnavis/2018/08/10/the-gsoc-experience-and-project-summary/ Challenging aspects of the work
My project was particularly challenging as it was geared more towards research based on Mathematical Optimization and Model Fitting which I had not worked on before.
I feel that working with DIPY under PSF was one of the most amazing experiences as I really learnt how to overcome the following challenges and contribute quality code to DIPY:
Optimization in Scientific Computing using Approximations and not losing out on model accuracy at the same time. Better coding practices and Software Design for new Models. Fitting the Model Parameters in Less time so that it can be used from an analytical standpoint. Understanding and implementing Cython modules to remove the Python Bottlenecks. Line-by-line profiling to understand and remove bottlenecks from the code. As a part of this project, I have worked on [links to the Blogs written at each step]:
- Implementing the NODDIx model with 2 fiber orientation crossings https://blogs.python-gsoc.org/shreyas-fadnavis/2018/05/24/neurite-orientation-dispersion-and-density-imaging-using-mix-noddix/
- Simulating the signal from the NODDIx model https://blogs.python-gsoc.org/shreyas-fadnavis/2018/05/22/simulating-dmri-data-using-camino/
- Fitting the Simulated Signal https://blogs.python-gsoc.org/shreyas-fadnavis/2018/06/05/simulating-and-fitting-the-signal-using-noddix/
- Speeding up the Model Fitting process and Mathematical Optimizations https://blogs.python-gsoc.org/shreyas-fadnavis/2018/07/14/cythonizing-bottlenecks-in-the-noddix-models/
- Legendre Polynomials and Numerical Methods in Cython for Speedups https://blogs.python-gsoc.org/shreyas-fadnavis/2018/06/27/speeding-up-the-legendre-gauss-integration/
Creating Error Functions for the model and rigorous testing for edge cases
Implementing the NODDI model for fitting 2 Fiber Crossings using MIX
The NODDI model consists of normalized signals from intracellular and extracellular compartments.
The estimated dMRI signal Ŝ comprises of the normalized signals from the following three compartments:
Ŝ = (1−v_iso)(v_ic S_ic (𝑂𝐷,𝜃,𝜑) + (1−v_ic)S_ec(d,𝜃,𝜑))+ v_iso S_iso
where S_ic and S_ec are the normalized signals from intracellular and extracellular compartments respectively.
Parameters to be estimated: Six (v_ic, v_iso, 𝑂𝐷, d, 𝜃, 𝜑)
Noise: Rician noise needs is added to signal for each substrate for different noise realizations.
To visualize each of the compartments mentioned in the formulation above, please take a look at the following figure for the compartments:
The above diagram is representative of a single fiber.
To further expand the above model to 2 fiber orientation crossings, the following formulation has been implemented:
Parameters to be estimated: 13 but for implementation, we use only 11 since some of the parameters are reused.
The parameters of the model can be visualized as:
The signal was estimated with the model above and then fitted with SHORE (an analytical basis). We could then visualize the fiber orientations using SHORE’s Orientation Distribution Functions (ODFs) as follows:
[Note: the steps to implement the above simulations and visualize the signals have been explained in detail below]
Implementation of NODDIx using MIX framework of Optimization
The Microstructure Imaging of Crossings is a novel and robust method using a 3 step optimization process. It enables to fit existing biophysical models with improved accuracy by utilizing the Variable Separation Method (VSM) to distinguish parameters that enter in both, linear and non-linear manner, in the model (Methods). The estimation of non-linear parameters is a non-convex problem and is handled first. This is done by using Differential Evolution since it is more effective in approximating exponential time series models.
The task to estimate linear parameters amounts to a convex problem and can be solved using standard least squares techniques. These parameter estimates provide a starting point for a Trust Region method in search for a refined solution.
4 Steps involved in Implementing MIX:
- Step 1 – Variable Separation: The objective function has a separable structure which can be exploited to separate the variables by Variable Separation (VarPro) method. We can rewrite our objective function as a projection using the Moore-Penrose Inverse (Pseudoinverse) and get the variable projection functional.
This is a rather advanced and mathematically well-formed method which makes use of variable projections to transform the complicated computations between the variables of the NODDIx model into a space where they can be fit individually.
Taking advantage of this special structure of the model, the method of variable projections eliminates the linear variables obtaining a somewhat more complicated function that involves only the nonlinear parameters.
This procedure not only reduces the dimension of the parameter space but also results in a better-conditioned problem. The same optimization method applied to the original and reduced problems will always converge faster for the latter.
Further literature for this method can be found here:
- Step 2 – Stochastic search for non-linear parameters ‘x’: The objective function is non-convex, particularly of non-linear least-square form. Any gradient based method employed to estimate the parameters will have critical dependence on a good starting point, which is unknown. Alternative approach can be regular grid search, which is time consuming and adds computational burden. This particular type of problem therefore points towards considering stochastic search methods like Differential Evolution (DE). In case of time series analysis, DE can be used efficiently for sum of exponentials functions. DE parameters can be varied for each selected biophysical model and time complexity may change with each choice.
I have written a different blog post for implementation of DE with a detailed explanation of its working and its nature for NODDIx.
Step 3 – Constrained search for linear parameters ‘f ’: After estimating the parameters ‘x’, estimation of linear parameters ‘f ’ is a constrained linear least-squares estimation problem. I have made use of the cvxpy optimizer to perform the constrained search to find the f’s of the model compartments.
Step 4 – Non-Linear Least Squares Estimation (NLLS) using Trust Region Method: Step 2 and step 3 give a reliable initial guess of both ‘x’ and ‘f ’ as initial value for Trust Region method. This method has been implemented using the Levenberg Marquardt method to perform (NLLS) fitting from SciPy: Optimize module.
Testing and Running my code for NODDIx model in DIPY
Here is the link to the code for NODDIx implemented in DIPY: https://github.com/ShreyasFadnavis/dipy/tree/noddix_speed