This project is continuously updated. Stars ⭐, comments 💹, and shares 😀 are welcome!
Other awesome projects: Awesome-Referring-Video-Object-Segmentation

Model | Title | Venue | Type | Paper | Code |
---|---|---|---|---|---|
OOKD | Offline-to-Online Knowledge Distillation for Video Instance Segmentation | WACV | Online | ||
MobileInst | MobileInst: Video Instance Segmentation on the Mobile | AAAI | Online | ||
LBVQ | Learning Better Video Query with SAM for Video Instance Segmentation | TCSVT | Offline | Code | |
CLIP-VIS | CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation | TCSVT | Online | Code | |
QCEN | Video Instance Segmentation Without Using Mask and Identity Supervision | TMM | Online | ||
STFormer | STFormer: Spatial-Temporal-Aware Transformer for Video Instance Segmentation | TNNLS | Semi-Online | ||
OV2Seg+ | OV-VIS: Open-Vocabulary Video Instance Segmentation | IJCV | Online | Code | |
OW-VISFormer | Video Instance Segmentation in an Open-World | IJCV | Offline | Code | |
OMG-Seg | OMG-Seg: Is One Model Good Enough For All Segmentation? | CVPR | Semi-Online | Code | |
UniVS | UniVS: Unified and Universal Video Segmentation with Prompts as Queries | CVPR | Online | Code | |
GLEE | General Object Foundation Model for Images and Videos at Scale | CVPR | Offline | Code | |
UVIS | UVIS: Unsupervised Video Instance Segmentation | CVPRW | Online | ||
PointVIS | What is Point Supervision Worth in Video Instance Segmentation? | CVPRW | Online | ||
DVIS-DAQ | DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries | ECCV | Online/Offline | Code | |
VISAGE | VISAGE: Video Instance Segmentation with Appearance-Guided Enhancement | ECCV | Online | Code | |
OVFormer | Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation | ECCV | Semi-Online | Code | |
GvSeg | General and Task-Oriented Video Segmentation | ECCV | Semi-Online | Code | |
S-AModal | Foundation Models for Amodal Video Instance Segmentation in Automated Driving | ECCVW | Online | Code | |
SyncVIS | SyncVIS: Synchronized Video Instance Segmentation | NeurIPS | Offline | Code | |
OW-VISCapTor | OW-VISCapTor: Abstractors for Open-World Video Instance Segmentation and Captioning | NeurIPS | Online | Code | |
RT-VIS | RT-VIS: Real-Time Video Instance Segmentation with Light-Weight Decoupled Framework | PRCV | Online | Code | |
Cluster2Former | Cluster2Former: Semisupervised Clustering Transformers for Video Instance Segmentation | Sensors | Offline | ||
UPVIS | UPVIS: upsampled video query for offline video instance segmentation | MTA | Offline | ||
KeyVIS | Improving Weakly-supervised Video Instance Segmentation Using Keypoints Consistency | CVIU | Offline | Code | |
Gp3Former | Gp3Former: Gaussian Prior Tri-Cascaded Transformer for Video Instance Segmentation in Livestreaming Scenarios | ETCI | Offline | ||
PM-VIS+ | PM-VIS+: High-Performance Video Instance Segmentation without Video Annotation | MIPR | Online | Code | |
RAP-SAM | RAP-SAM: Towards Real-Time All-Purpose Segment Anything | Arxiv | Online | Code | |
BriVIS | Instance Brownian Bridge as Texts for Open-vocabulary Video Instance Segmentation | Arxiv | Offline | Code | |
InstFormer | OpenVIS: Open-vocabulary Video Instance Segmentation | Arxiv | Online | ||
PM-VIS | PM-VIS: High-Performance Box-Supervised Video Instance Segmentation | Arxiv | Online | ||
UVOSAM | UVOSAM: A Mask-free Paradigm for Unsupervised Video Object Segmentation via Segment Anything Model | Arxiv | Online | Code | |
CAVIS | CAVIS: Context-Aware Video Instance Segmentation | Arxiv | Online/Offline | Code | |
ViLLa | ViLLa: Video Reasoning Segmentation with Large Language Model | Arxiv | Offline | Code | |
Eigen-Cluster VIS | Improving Weakly-supervised Video Instance Segmentation by Leveraging Spatio-temporal Consistency | Arxiv | Offline | Code | |
SDI-Paste | SDI-Paste: Synthetic Dynamic Instance Copy-Paste for Video Instance Segmentation | Arxiv | Online | ||
SSL-VIS | Self-supervised Video Instance Segmentation Can Boost Geographic Entity Alignment in Historical Maps | Arxiv | Offline | ||
A2VIS | A2VIS: Amodal-Aware Approach to Video Instance Segmentation | Arxiv | Offline | ||
TROY-VIS | Towards Real-Time Open-Vocabulary Video Instance Segmentation | Arxiv | Online | Code | |
O2VIS | O2VIS: Occupancy-aware Object Association for Temporally Consistent Video Instance Segmentation | Arxiv | Online |

Model | Title | Venue | Type | Paper | Code |
---|---|---|---|---|---|
InstanceFormer | InstanceFormer: An Online Video Instance Segmentation Framework | AAAI | Online | Code | |
GenVIS | A Generalized Framework for Video Instance Segmentation | CVPR | Online/Semi-Online | Code | |
MDQE | MDQE: Mining Discriminative Query Embeddings to Segment Occluded Instances on Challenging Videos | CVPR | Semi-Online | Code | |
Mask-Free VIS | Mask-Free Video Instance Segmentation | CVPR | Online | Code | |
InstMove | InstMove: Instance Motion for Object-centric Video Segmentation | CVPR | Online | Code | |
VideoCutLER | VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation | CVPR | Offline | Code | |
TarViS | TarViS: A Unified Approach for Target-based Video Segmentation | CVPR | Offline | Code | |
CAROQ | Context-Aware Relative Object Queries To Unify Video Instance and Panoptic Segmentation | CVPR | Online | ||
UNINEXT | Universal Instance Perception as Object Discovery and Retrieval | CVPR | Offline | Code | |
CTVIS | CTVIS: Consistent Training for Online Video Instance Segmentation | ICCV | Online | Code | |
DVIS | DVIS: Decoupled Video Instance Segmentation Framework | ICCV | Online/Offline | Code | |
OV2Seg | Towards Open-Vocabulary Video Instance Segmentation | ICCV | Online | Code | |
TCOVIS | TCOVIS: Temporally Consistent Online Video Instance Segmentation | ICCV | Online | Code | |
Tube-Link | Tube-Link: A Flexible Cross Tube Baseline for Universal Video Segmentation | ICCV | Semi-Online | Code | |
TMT-VIS | TMT-VIS: Taxonomy-aware Multi-dataset Joint Training for Video Instance Segmentation | NeurIPS | Offline | Code | |
NOVIS | NOVIS: A Case for End-to-End Near-Online Video Instance Segmentation | ICML | Semi-Online | ||
TIVE | TIVE: A Toolbox for Identifying Video Instance Segmentation Errors | Neurocomputing | Toolbox | Code | |
VLKP | VLKP: Video Instance Segmentation with Visual-Linguistic Knowledge Prompts | ICASSP | Offline | ||
IAST | IAST: Instance Association Relying on Spatio-Temporal Features for Video Instance Segmentation | ICASSP | Offline | Code | |
HEVis* | Coarse-to-Fine Video Instance Segmentation With Factorized Conditional Appearance Flows | JAS | Offline | Code | |
GRAtt-VIS | GRAtt-VIS: Gated Residual Attention for Auto Rectifying Video Instance Segmentation | ICPR | Online | Code | |
VATrack | End-to-end Amodal Video Instance Segmentation | BMVC | Online | Code | |
TAFormer | Towards Robust Video Instance Segmentation with Temporal-Aware Transformer | Arxiv | Offline | ||
UVOSAM | UVOSAM: A Mask-free Paradigm for Unsupervised Video Object Segmentation via Segment Anything Model | Arxiv | Online | ||
RefineVIS | RefineVIS: Video Instance Segmentation with Temporal Attention Refinement | Arxiv | Online | ||
BoxVIS | BoxVIS: Video Instance Segmentation with Box Annotations | Arxiv | Online | Code | |
OW-VISFormer | Video Instance Segmentation in an Open-World | Arxiv | Offline | Code | |
DVIS++ | DVIS++: Improved Decoupled Framework for Universal Video Segmentation | Arxiv | Online/Offline | Code | |
VIS-Survey | Deep Learning Techniques for Video Instance Segmentation: A Survey | Arxiv | Survey |

Model | Title | Venue | Type | Paper | Code |
---|---|---|---|---|---|
HIATF | Hybrid Instance-Aware Temporal Fusion for Online Video Instance Segmentation | AAAI | Online | ||
Mask2former-VIS | Mask2former for Video Instance Segmentation | CVPR | Offline | Code | |
Video K-Net | Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation | CVPR | Offline | Code | |
VISOLO | VISOLO: Grid-Based Space-Time Aggregation for Efficient Online Video Instance Segmentation | CVPR | Online | Code | |
TeViT | Temporally Efficient Vision Transformer for Video Instance Segmentation | CVPR | Offline | Code | |
EfficientVIS | Efficient Video Instance Segmentation via Tracklet Query and Proposal | CVPR | Online | Code | |
SeqFormer | SeqFormer: Sequential Transformer for Video Instance Segmentation | ECCV | Offline | Code | |
IDOL | In Defense of Online Models for Video Instance Segmentation | ECCV | Online | Code | |
MS-STS VIS | Video Instance Segmentation via Multi-scale Spatio-temporal Split Attention Transformer | ECCV | Offline | Code | |
Self-Shot VIS | Less than Few: Self-Shot Video Instance Segmentation | ECCV | Offline | ||
VMT | Video Mask Transfiner for High-Quality Video Instance Segmentation | ECCV | Offline | Code | |
STC | STC: Spatio-Temporal Contrastive Learning for Video Instance Segmentation | ECCV | Online | ||
IAI | Instance As Identity: A Generic Online Paradigm for Video Instance Segmentation | ECCV | Online | Code | |
VITA | VITA: Video Instance Segmentation via Object Token Association | NeurIPS | Offline | Code | |
MinVIS | MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training | NeurIPS | Online | Code | |
InsPro | InsPro: Propagating Instance Query and Proposal for Online Video Instance Segmentation | NeurIPS | Online | ||
SipMaskv2 | SipMaskv2: Enhanced Fast Image and Video Instance Segmentation | TPAMI | Online | Code | |
TPR | Improving Video Instance Segmentation via Temporal Pyramid Routing | TPAMI | Online | Code | |
IFA | Video Instance Segmentation by Instance Flow Assembly | TMM | Online | ||
DefVIS | Deformable VisTR: Spatio temporal deformable attention for video instance segmentation | ICASSP | Offline | Code | |
TBA | Tag-Based Attention Guided Bottom-Up Approach for Video Instance Segmentation | ICPR | Offline | ||
DeVIS | DeVIS: Making Deformable Transformers Work for Video Instance Segmentation | Arxiv | Offline | Code | |
RCF | Online Video Instance Segmentation via Robust Context Fusion | Arxiv | Online | ||
IFR | Consistent Video Instance Segmentation with Inter-Frame Recurrent Attention | Arxiv | Offline | ||
ROVIS | Robust Online Video Instance Segmentation with Track Queries | Arxiv | Online | Code | |
CiCo | One-stage Video Instance Segmentation: From Frame-in Frame-out to Clip-in Clip-out | Arxiv | Offline | Code | |
TLTM | Two-Level Temporal Relation Model for Online Video Instance Segmentation | Arxiv | Online | Code |

Model | Title | Venue | Type | Paper | Code |
---|---|---|---|---|---|
CompFeat | CompFeat: Comprehensive Feature Aggregation for Video Instance Segmentation | AAAI | Online | Code | |
VisTR | End-to-End Video Instance Segmentation with Transformers | CVPR | Offline | Code | |
SG-Net | SG-Net: Spatial Granularity Network for One-Stage Video Instance Segmentation | CVPR | Online | Code | |
STMask | Spatial Feature Calibration and Temporal Fusion for Effective One-Stage Video Instance Segmentation | CVPR | Online | Code | |
CrossVIS | Crossover Learning for Fast Online Video Instance Segmentation | ICCV | Online | Code | |
Propose-Reduce | Video Instance Segmentation with a Propose-Reduce Paradigm | ICCV | Offline | Code | |
VisSTG | End-to-end Video Instance Segmentation via Spatial-Temporal Graph Neural Networks | ICCV | Online | Code | |
QueryInst | Instances as Queries | ICCV | Online | Code | |
HEVis | Learning Hierarchical Embedding for Video Instance Segmentation | ACM MM | Offline | Code | |
SRNet | SRNet: Spatial Relation Network for Efficient Single-stage Instance Segmentation in Videos | ACM MM | Online | ||
IFC | Video Instance Segmentation using Inter-Frame Communication Transformers | NeurIPS | Offline | Code | |
PCAN | Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation | NeurIPS | Online | Code | |
CMaskTrack R-CNN | Occluded Video Instance Segmentation: A Benchmark | IJCV | Online | Dataset | |
RGNNVIS++ | Recurrent Graph Neural Networks for Video Instance Segmentation | IJCV | Online | Code |

Model | Title | Venue | Type | Paper | Code |
---|---|---|---|---|---|
MaskProp | Classifying, Segmenting, and Tracking Object Instances in Video with Mask Propagation | CVPR | Offline | ||
VAE | Video Instance Segmentation Tracking with a Modified VAE Architecture | CVPR | Online | ||
SipMask | SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation | ECCV | Online | Code | |
STEm-Seg | STEm-Seg: Spatio-temporal Embeddings for Instance Segmentation in Videos | ECCV | Offline | Code | |
RGNNVIS | Learning Video Instance Segmentation with Recurrent Graph Neural Networks | GCPR | Online | Code |

Model | Title | Venue | Type | Paper | Code |
---|---|---|---|---|---|
MaskTrack R-CNN | Video instance segmentation | ICCV | Online | Code |