Research
Efficient Adaptation of Foundation Models
Parameter-efficient fine-tuning with factorized latent spaces. My work on FVAE-LoRA (NeurIPS 2025) factorizes task-relevant and residual features in low-rank adapters to improve robustness across modalities.
Speech & Language with LLMs
Understanding how LLMs can be coupled with speech encoders for ASR. I study robustness under domain shift, prompt sensitivity, and modality compression in SpeechLLM architectures.
Multitask & Unified Speech Models
Building single models that jointly handle transcription, speaker change detection, endpointing, and entity recognition, replacing fragile cascaded pipelines (TokenVerse, EMNLP 2024; TokenVerse++, ASRU 2025).
Sequence Alignment with Optimal Transport
A differentiable sequence-alignment framework based on 1D optimal transport that lets a single model learn the alignment and perform ASR end to end.
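To make the idea concrete, here is a minimal sketch of a 1D optimal-transport plan between speech frames and text tokens. This is an illustration, not the framework's actual implementation: the function name `ot_plan_1d` and the uniform frame/token weights are my assumptions, and the exact plan below is computed greedily rather than learned differentiably.

```python
import numpy as np

def ot_plan_1d(a, b):
    """North-west-corner transport plan between two 1D histograms.
    For sorted 1D supports this greedy plan is the optimal coupling,
    so it yields a monotone alignment between two sequences.
    NOTE: illustrative sketch only; the real framework learns a
    differentiable alignment inside the ASR model."""
    a = np.asarray(a, dtype=float); a = a / a.sum()
    b = np.asarray(b, dtype=float); b = b / b.sum()
    n, m = len(a), len(b)
    P = np.zeros((n, m))  # P[i, j] = mass moved from frame i to token j
    i = j = 0
    ra, rb = a[0], b[0]   # remaining mass at current frame / token
    while True:
        moved = min(ra, rb)
        P[i, j] = moved
        ra -= moved
        rb -= moved
        if ra <= 1e-12:
            i += 1
            if i == n:
                break
            ra = a[i]
        if rb <= 1e-12:
            j += 1
            if j == m:
                break
            rb = b[j]
    return P

# Align 4 speech frames (uniform mass) to 2 text tokens (uniform mass):
P = ot_plan_1d(np.ones(4), np.ones(2))
# Each token receives mass from a contiguous run of frames,
# i.e. a monotone frame-to-token alignment.
```

Because the 1D plan is monotone, each token attends to a contiguous span of frames, which is exactly the structure ASR alignments require.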
News
- Mar 2026 1st place in 3 of 4 tracks at the DiSPLACE-M Challenge 2026, including diarization and Hindi Devanagari ASR for medical conversations. [Paper]
- Jan 2026 Paper on reducing prompt sensitivity in LLM-based ASR accepted at ICASSP 2026. [Paper]
- Sep 2025 FVAE-LoRA accepted at NeurIPS 2025. Also received the Idiap PhD Paper Award. [Paper]
- Apr 2025 Best Paper Award at the SALMA Workshop (ICASSP 2025) for our evaluation of SLAM-ASR. [Paper]
- 2025 Presented at NeurIPS 2025 and Interspeech 2025. Reviewer for Interspeech 2025 and ARR.
- Sep 2024 TokenVerse accepted at EMNLP 2024. [Paper]
Selected Publications
For a full list, see my Google Scholar profile.