Our recent work, MaskVD, in which we explore region masking for efficient video inference, is now on arXiv!