Sreetama Sarkar


I am a fourth-year PhD student in the EESSC lab at the University of Southern California, advised by Prof. Peter Beerel. My research interests involve energy efficiency and trustworthiness in multi-modal models. My recent work includes efficient fine-tuning and inference of Vision Transformers (ViTs) and Vision Language Models (VLMs), as well as hallucination mitigation in VLMs.

Before this, I completed a Master of Science in Communication Engineering at the Technical University of Munich. I conducted my Master's thesis on robustness-aware pruning methods for Convolutional Neural Networks in the Autonomous Driving Group at BMW. I completed my BTech with a gold medal in Electronics and Communications Engineering at the National Institute of Technology, Durgapur, India.

In my free time, I enjoy cooking as it helps me unwind, and I love to try out different cuisines. I also enjoy swimming, biking and playing badminton. My creative pursuits include painting and dancing.

news

May 18, 2026 Joined Dolby Laboratories as a PhD Research Intern in the Multimodal Perception Lab.
May 04, 2026 🏆 Honored to be awarded the USC WiSE Merit Award 2026-2027! 🌟
Apr 15, 2026 Successfully passed my PhD Qualifying Exam! ✨
Apr 15, 2026 🏆 Honored to be selected to participate in the 13th Heidelberg Laureate Forum 2026! 🌟
Aug 20, 2025 Our paper Mitigating Hallucinations in Vision-Language Models through Image-Guided Head Suppression has been accepted at EMNLP Main 2025!
May 19, 2025 Joined Samsung Research America as a Research Scientist Intern in the Visual Display Intelligence Lab! My research focuses on improving efficiency in Vision Language Models.
Jan 21, 2025 Our paper Region Masking to Accelerate Video Processing on Neuromorphic Hardware, in collaboration with Intel Labs, has been accepted for oral presentation at ISQED 2025!
Oct 29, 2024 MaskVD accepted at WACV 2025!
Aug 15, 2024 🏆 Awarded the Annenberg Endowed Graduate Fellowship 2024-2025 at USC!
Aug 05, 2024 FixPix accepted at ICPR 2024! See you in Kolkata ☺️

selected publications

  1. CVPR
    RedVTP: Training-Free Acceleration of Diffusion Vision-Language Models Inference via Masked Token-Guided Visual Token Pruning
    Jingqi Xu, Jingxi Lu, Chenghao Li, and 3 more authors
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Findings, Jun 2026
  2. EMNLP
    Mitigating Hallucinations in Vision-Language Models through Image-Guided Head Suppression
    Sreetama Sarkar, Yue Che, Alex Gavin, and 2 more authors
    Jun 2025
  3. WACV
    MaskVD: Region Masking for Efficient Video Object Detection
    Sreetama Sarkar, Gourav Datta, Souvik Kundu, and 3 more authors
    Jun 2025
  4. CVPRW
    Block Selective Reprogramming for On-device Training of Vision Transformers
    Sreetama Sarkar, Souvik Kundu, Kai Zheng, and 1 more author
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Jun 2024
  5. CVPRW
    RLNet: Robust Linearized Networks for Efficient Private Inference
    Sreetama Sarkar, Souvik Kundu, and Peter A. Beerel
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Jun 2024
  6. DAC
    Accelerating and Pruning CNNs for Semantic Segmentation on FPGA
    Pierpaolo Morì, Manoj-Rohit Vemparala, Nael Fasfous, and 8 more authors
    In Proceedings of the 59th ACM/IEEE Design Automation Conference, Jun 2022
  7. CVPRW
    Adversarial Robust Model Compression Using In-Train Pruning
    Manoj-Rohit Vemparala, Nael Fasfous, Alexander Frickenstein, and 8 more authors
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Jun 2021