Sreetama Sarkar

I am a 4th-year PhD student in the EESSC lab at the University of Southern California, advised by Prof. Peter Beerel. My research interests involve energy efficiency and trustworthiness in multi-modal models. My recent work includes efficient fine-tuning and inference of Vision Transformers (ViTs) and Vision Language Models (VLMs), as well as hallucination mitigation in VLMs.

Before this, I completed a Master of Science in Communication Engineering at the Technical University of Munich. I conducted my Master’s thesis on robustness-aware pruning methods for Convolutional Neural Networks in the Autonomous Driving Group at BMW. I completed my BTech with a gold medal in Electronics and Communications Engineering at the National Institute of Technology, Durgapur, India.

In my free time, I enjoy cooking as it helps me unwind, and I love to try out different cuisines. I also enjoy swimming, biking and playing badminton. My creative pursuits include painting and dancing.

news

Aug 20, 2025	Our paper Mitigating Hallucinations in Vision-Language Models through Image-Guided Head Suppression has been accepted at EMNLP Main 2025!
May 19, 2025	Joined Samsung Research America as a Research Scientist Intern in the Visual Display Intelligence Lab! My research focuses on improving efficiency in Vision Language Models.
Jan 21, 2025	Our paper Region Masking to Accelerate Video Processing on Neuromorphic Hardware, in collaboration with Intel Labs, accepted for ORAL presentation at ISQED 2025!
Oct 29, 2024	MaskVD accepted at WACV 2025!
Aug 15, 2024	Awarded the Annenberg Endowed Graduate Fellowship 2024-2025 at USC!
Aug 05, 2024	FixPix accepted at ICPR 2024! See you in Kolkata!
Jul 16, 2024	Our recent work MaskVD, where we explore region masking for efficient video inference, is now on arXiv!
Jun 27, 2024	My video won the 2-minute Video Contest at DAC Young Fellows Program 2024!
Apr 10, 2024	Nominated as Outstanding Mentor in the Viterbi Graduate Mentorship Program!
Apr 08, 2024	Two papers, RLNet and BSR, accepted for ORAL presentation at CVPR Workshops TCV and ECV (overall acceptance rate: 32.6%)!

selected publications

EMNLP

Mitigating Hallucinations in Vision-Language Models through Image-Guided Head Suppression

Sreetama Sarkar, Yue Che, Alex Gavin, and 2 more authors

2025

@inproceedings{sarkar2025spin,
  title = {Mitigating Hallucinations in Vision-Language Models through Image-Guided Head Suppression},
  author = {Sarkar, Sreetama and Che, Yue and Gavin, Alex and Beerel, Peter A. and Kundu, Souvik},
  year = {2025},
  booktitle = {EMNLP Main 2025},
}

WACV

MaskVD: Region Masking for Efficient Video Object Detection

Sreetama Sarkar, Gourav Datta, Souvik Kundu, and 3 more authors

2025

@misc{sarkar2024maskvd,
  title = {MaskVD: Region Masking for Efficient Video Object Detection},
  author = {Sarkar, Sreetama and Datta, Gourav and Kundu, Souvik and Zheng, Kai and Bhattacharyya, Chirayata and Beerel, Peter A.},
  year = {2025},
  eprint = {2407.12067},
  archiveprefix = {arXiv},
  primaryclass = {cs.CV},
}

CVPRW

Block Selective Reprogramming for On-device Training of Vision Transformers

Sreetama Sarkar, Souvik Kundu, Kai Zheng, and 1 more author

In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024

@inproceedings{sarkarECV24,
  author = {Sarkar, Sreetama and Kundu, Souvik and Zheng, Kai and Beerel, Peter},
  title = {Block Selective Reprogramming for On-device Training of Vision Transformers},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
  year = {2024},
}

CVPRW

RLNet: Robust Linearized Networks for Efficient Private Inference

Sreetama Sarkar, Souvik Kundu, and Peter A. Beerel

In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024

@inproceedings{sarkar2024rlnet,
  author = {Sarkar, Sreetama and Kundu, Souvik and Beerel, Peter A.},
  title = {RLNet: Robust Linearized Networks for Efficient Private Inference},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
  year = {2024},
  pages = {244--253},
}

DAC

Accelerating and pruning CNNs for semantic segmentation on FPGA

Pierpaolo Morì, Manoj-Rohit Vemparala, Nael Fasfous, and 8 more authors

In Proceedings of the 59th ACM/IEEE Design Automation Conference, 2022

@inproceedings{mori2022accelerating,
  title = {Accelerating and pruning {CNNs} for semantic segmentation on {FPGA}},
  author = {Mor{\`\i}, Pierpaolo and Vemparala, Manoj-Rohit and Fasfous, Nael and Mitra, Saptarshi and Sarkar, Sreetama and Frickenstein, Alexander and Frickenstein, Lukas and Helms, Domenik and Nagaraja, Naveen Shankar and Stechele, Walter and others},
  booktitle = {Proceedings of the 59th ACM/IEEE Design Automation Conference},
  pages = {145--150},
  year = {2022},
}

CVPRW

Adversarial robust model compression using in-train pruning

Manoj-Rohit Vemparala, Nael Fasfous, Alexander Frickenstein, and 8 more authors

In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021

@inproceedings{vemparala2021adversarial,
  title = {Adversarial robust model compression using in-train pruning},
  author = {Vemparala, Manoj-Rohit and Fasfous, Nael and Frickenstein, Alexander and Sarkar, Sreetama and Zhao, Qi and Kuhn, Sabine and Frickenstein, Lukas and Singh, Anmol and Unger, Christian and Nagaraja, Naveen-Shankar and others},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages = {66--75},
  year = {2021},
}