Video Compression & Surgical AI

Surgical AI systems depend on large annotated video datasets whose storage and compute demands are a real barrier to deployment in clinical settings. As project lead, I coordinated this study across collaborators at JHU and UT Southwestern while also contributing directly to the technical work: I built the video compression pipeline and led the pilot study benchmarking ResNet, ResNet+LSTM, and ResNet+Transformer across compression levels. That pilot informed the full study design, which expanded to three more state-of-the-art architectures evaluated at four compression levels on the Cholec80 dataset, producing a standardized compressed benchmark and a full train-test compression matrix for each model.
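As a rough illustration of what a CRF-based compression step and a train-test compression matrix can look like, here is a minimal Python sketch. It is not the study's actual pipeline: the libx264 codec, the specific CRF values, the directory layout, and the names `ffmpeg_command` and `compression_matrix` are all assumptions made for the example.

```python
from itertools import product
from pathlib import Path

# Hypothetical CRF levels; the study used four levels, but the exact
# values here are placeholders chosen for illustration.
CRF_LEVELS = [23, 28, 35, 51]


def ffmpeg_command(src, dst_dir, crf, codec="libx264"):
    """Build an ffmpeg argument list that re-encodes `src` at the given CRF.

    The codec (libx264) and the per-CRF output folder are assumptions,
    not details taken from the original pipeline.
    """
    src = Path(src)
    dst = Path(dst_dir) / f"crf{crf}" / src.name
    return [
        "ffmpeg", "-y",        # overwrite existing output
        "-i", str(src),        # input video
        "-c:v", codec,         # video encoder
        "-crf", str(crf),      # constant rate factor (higher = smaller file)
        "-an",                 # drop audio, which frame-based models do not use
        str(dst),
    ]


def compression_matrix(levels):
    """Enumerate every (train_crf, test_crf) pair for one model."""
    return list(product(levels, repeat=2))
```

Each command can be passed to `subprocess.run(cmd, check=True)` once the output folder exists. With four levels, `compression_matrix(CRF_LEVELS)` gives 16 train-test pairs per model, which is the sense in which the matrix is "full".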

The core finding is that moderate compression (constant rate factor, CRF, of 28–35) reduces storage by up to 85% with minimal accuracy loss and occasionally improves performance through implicit regularization effects, whereas extreme compression (CRF 51) causes significant degradation across all architectures. Temporal models proved more robust to compression than frame-level baselines, since sequential context helps compensate for degraded individual frames. These results suggest that compression-aware dataset design can enable scalable surgical AI deployment without sacrificing model performance. This work is currently being prepared for publication.
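For readers who want the storage figure made concrete, the reduction is the fraction of the original size that compression removes. The sketch below shows that arithmetic; the function name `storage_reduction` and the sizes in the usage note are made up for illustration and are not measurements from the study.

```python
def storage_reduction(original_bytes, compressed_bytes):
    """Return the fraction of storage saved by compression (0.0 to 1.0)."""
    if original_bytes <= 0:
        raise ValueError("original_bytes must be positive")
    return 1.0 - compressed_bytes / original_bytes
```

For example, a hypothetical dataset that shrinks from 100 GB to 15 GB gives `storage_reduction(100e9, 15e9)`, or roughly 0.85, which corresponds to an 85% reduction.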