CDE Quick Start: Mixture Density Networks (MDN) =============================================== This is a quick-start guide for using Mixture Density Networks (MDN) for parameter inference. For comprehensive documentation covering both MDN and MAF methods, see :doc:`inference_cde_guide`. .. note:: **Looking for complete API documentation?** See :doc:`inference_cde_guide` for detailed coverage of both MDN and MAF methods, including all parameters, methods, and best practices. What is MDN? ------------ Mixture Density Networks (MDN) combine neural networks with Gaussian mixture models to learn conditional probability distributions. In the context of brain model inference: - **Input**: Features extracted from brain model simulations (e.g., power spectrum, connectivity) - **Output**: A mixture of Gaussians representing p(θ|x), the probability of parameters θ given features x **How MDN Works:** 1. Neural network takes features x as input 2. Network outputs mixture parameters: weights, means, and covariances 3. These define a Gaussian mixture model that approximates p(θ|x) 4. You can sample from this mixture to get probable parameter values **Key Idea:** Instead of predicting a single parameter value, MDN predicts a full probability distribution, capturing uncertainty in the inference. Why Use MDN? ------------ **Advantages:** ✅ **Lightweight**: No PyTorch or SBI dependencies ✅ **Fast Training**: Typically converges in minutes ✅ **Fast Inference**: Generate thousands of samples in milliseconds ✅ **Interpretable**: Mixture components are easy to visualize and understand ✅ **Flexible**: Works with various brain model architectures **Limitations:** ⚠️ **Limited expressiveness**: Gaussian mixtures can't capture all distribution types ⚠️ **Independence assumption**: Parameters are modeled as independent within each component ⚠️ **Dimensionality**: Works best with < 10 parameters **When to use MDN vs MAF:** - **Use MDN when**: You have < 10 parameters, want fast training, need interpretability - **Use MAF when**: You have > 10 parameters, need to capture parameter correlations, have complex posteriors **When to use CDE vs SBI:** - **Use CDE when**: You want lightweight inference, have limited computational resources, or prefer mathematical transparency - **Use SBI when**: You need state-of-the-art neural architectures or are working with very high-dimensional problems Quick Start Example ------------------- Here's a minimal working example: .. code-block:: python import numpy as np from vbi.cde import MDNEstimator # Initialize MDN estimator mdn = MDNEstimator( param_dim=2, # Dimension of parameters θ feature_dim=2, # Dimension of observations x n_components=5, # Number of mixture components hidden_sizes=(64, 64) # Hidden layer dimensions ) # Train the estimator loss_history = mdn.train( params=theta_train, # Shape: (N, 2) features=x_train, # Shape: (N, 2) n_iter=2000, learning_rate=1e-3 ) # Sample from posterior posterior_samples = mdn.sample( features=x_observed, # Shape: (1, 2) n_samples=2000 ) Tutorial Workflow ----------------- 1. **Prepare Data**: Generate or load simulation parameters and features 2. **Initialize MDN**: Set parameter/feature dimensions and network architecture 3. **Train Model**: Fit the MDN to learn the conditional density p(θ|x) 4. **Perform Inference**: Sample from posterior for observed data 5. **Analyze Results**: Visualize and evaluate posterior distributions Complete Examples ----------------- See these Jupyter notebooks for detailed tutorials: - :doc:`examples/damp_oscillator_cde` - Damped oscillator with CDE - :doc:`examples/jansen_rit_sde_numba_cde` - Jansen-Rit neural mass model - :doc:`examples/vep_sde_numba_cde` - Visual evoked potential model Next Steps ---------- - **Full Documentation**: See :doc:`inference_cde_guide` for comprehensive API reference - **Try Examples**: Download complete notebooks from the examples directory - **Compare Methods**: Learn about MAF (Masked Autoregressive Flows) in :doc:`inference_cde_guide` - **Brain Models**: Apply CDE to real neural mass models