ML for Brain AVM Rupture Prediction

Existing clinical scoring systems for brain arteriovenous malformations leave a decision gap: they cannot capture nonlinear risk patterns or estimate when rupture is likely to occur. I contributed to both modeling workstreams in this project, the classification and time-to-event analyses, in collaboration with a Johns Hopkins team spanning the Whiting School of Engineering and School of Medicine. The data covered 1,065 patients from the JHU bAVM Registry (1990–2023). For cross-sectional classification of hemorrhagic presentation, eight classifiers were trained on 13 clinical and angiographic features extending the original R2eD predictor set. CatBoost performed best (AUROC 0.801, sensitivity 87.4%), significantly outperforming the existing clinical score (DeLong test, p=0.034). SHAP analysis identified nidus size, deep brain location, age, and single feeding artery as the leading predictors.
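
As a reading aid for the two headline classification metrics, the following is a minimal sketch of how AUROC and sensitivity can be computed from predicted probabilities. It is not the project's pipeline: the function names, the 0.5 threshold, and the six-patient toy data are assumptions made purely for illustration, and none of the values come from the registry.

```python
def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.
    Tied scores count as half a win (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity(labels, scores, threshold):
    """Fraction of true positives flagged at the given score threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    if tp + fn == 0:
        raise ValueError("no positive cases")
    return tp / (tp + fn)


# Hypothetical toy data: 1 = hemorrhagic presentation, 0 = unruptured.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]

print(round(auroc(labels, scores), 3))             # 0.889
print(round(sensitivity(labels, scores, 0.5), 3))  # 0.667
```

AUROC does not depend on a threshold, whereas sensitivity does, so a reported sensitivity always implies a specific operating point chosen on the score scale.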

For time-to-rupture prediction, a ridge-penalized Cox regression model incorporating age, prior rupture, and exclusively deep nidus location achieved a concordance index of 0.727 (95% CI 0.664–0.794) across 1,022 patients with follow-up data. Risk tertiles separated clearly (log-rank p=3.0×10⁻⁶), and predicted 2-year rupture probabilities were well calibrated. Together, the two frameworks address complementary clinical questions: whether a lesion is likely to present with hemorrhage, and how rupture risk evolves during surveillance. Both remain interpretable enough for clinical communication. This work is currently being prepared for publication.
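
For readers unfamiliar with the concordance index reported for the time-to-rupture model, the sketch below shows one common way to compute it (Harrell's C) for right-censored data. It is an illustrative simplification rather than the project's code: the function name and the five-patient toy data are hypothetical, and pairs with tied follow-up times are skipped for brevity.

```python
def concordance_index(times, events, risks):
    """Harrell's C for right-censored survival data.

    A pair (i, j) is comparable when subject i had an observed event and
    subject j was still under follow-up at a strictly later time. The pair
    is concordant when the model assigned the higher risk to subject i.
    Tied risk scores count as half concordant; tied times are ignored.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[j] > times[i]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up in years, 1 = rupture observed,
# 0 = censored, and a model risk score where higher means riskier.
times = [2, 4, 5, 7, 9]
events = [1, 0, 1, 1, 0]
risks = [0.9, 0.3, 0.6, 0.7, 0.1]

print(round(concordance_index(times, events, risks), 3))  # 0.857
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of rupture times by predicted risk, which is the scale on which the 0.727 above should be read.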