# Robust Video Matting (RVM)

![Teaser](/documentation/image/teaser.gif)

Official repository for the paper [Robust High-Resolution Video Matting with Temporal Guidance](https://peterl1n.github.io/RobustVideoMatting/). RVM is designed specifically for robust human video matting. Unlike existing neural models that process frames as independent images, RVM uses a recurrent neural network to process videos with temporal memory. RVM can perform matting in real time on any video without additional inputs. It achieves **76 FPS at 4K** and **104 FPS at HD** on an Nvidia GTX 1080 Ti GPU. The project was developed at [ByteDance Inc.](https://www.bytedance.com/)

| Framework | Download | Notes |
| --- | --- | --- |
| PyTorch | rvm_mobilenetv3.pth, rvm_resnet50.pth | Official weights for PyTorch. Doc |
| TorchHub | Nothing to download. | Easiest way to use our model in your PyTorch project. Doc |
| TorchScript | rvm_mobilenetv3_fp32.torchscript, rvm_mobilenetv3_fp16.torchscript, rvm_resnet50_fp32.torchscript, rvm_resnet50_fp16.torchscript | For inference on mobile, consider exporting int8 quantized models yourself. Doc |
| ONNX | rvm_mobilenetv3_fp32.onnx, rvm_mobilenetv3_fp16.onnx, rvm_resnet50_fp32.onnx, rvm_resnet50_fp16.onnx | Tested on ONNX Runtime with CPU and CUDA backends. Provided models use opset 12. Doc, Exporter |
| TensorFlow | rvm_mobilenetv3_tf.zip, rvm_resnet50_tf.zip | TensorFlow 2 SavedModel. Doc |
| TensorFlow.js | rvm_mobilenetv3_tfjs_int8.zip | Run the model on the web. Demo, Starter Code |
| CoreML | rvm_mobilenetv3_1280x720_s0.375_fp16.mlmodel, rvm_mobilenetv3_1280x720_s0.375_int8.mlmodel, rvm_mobilenetv3_1920x1080_s0.25_fp16.mlmodel, rvm_mobilenetv3_1920x1080_s0.25_int8.mlmodel | CoreML does not support dynamic resolution. You can export other resolutions yourself. Models require iOS 13+. `s` denotes `downsample_ratio`. Doc, Exporter |
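
The CoreML file names above encode the fixed input resolution, the `downsample_ratio` (`s`), and the weight precision. The following is a minimal sketch of how those fields could be read back from a file name. The helper names `parse_coreml_name` and `downsampled_size` are hypothetical and not part of this repository. The sketch also assumes that `downsample_ratio` simply scales the input width and height, rounded to the nearest pixel; the actual rounding used by the model may differ.

```python
import re
from typing import NamedTuple, Tuple


class CoreMLVariant(NamedTuple):
    """Fields encoded in a CoreML file name (hypothetical helper type)."""
    backbone: str
    width: int
    height: int
    downsample_ratio: float
    precision: str


# Matches names such as rvm_mobilenetv3_1280x720_s0.375_fp16.mlmodel
_NAME_PATTERN = re.compile(
    r"^rvm_(?P<backbone>[a-z0-9]+)_(?P<w>\d+)x(?P<h>\d+)"
    r"_s(?P<s>\d*\.?\d+)_(?P<precision>fp16|int8)\.mlmodel$"
)


def parse_coreml_name(filename: str) -> CoreMLVariant:
    """Split a CoreML file name from the table into its encoded fields."""
    match = _NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognized CoreML model name: {filename!r}")
    return CoreMLVariant(
        backbone=match["backbone"],
        width=int(match["w"]),
        height=int(match["h"]),
        downsample_ratio=float(match["s"]),
        precision=match["precision"],
    )


def downsampled_size(variant: CoreMLVariant) -> Tuple[int, int]:
    """Resolution after applying downsample_ratio.

    Assumption: the ratio scales width and height directly and the result
    is rounded to the nearest pixel.
    """
    return (
        round(variant.width * variant.downsample_ratio),
        round(variant.height * variant.downsample_ratio),
    )


if __name__ == "__main__":
    for name in (
        "rvm_mobilenetv3_1280x720_s0.375_fp16.mlmodel",
        "rvm_mobilenetv3_1920x1080_s0.25_int8.mlmodel",
    ):
        variant = parse_coreml_name(name)
        print(name, "->", variant, downsampled_size(variant))
```

Under that assumption, both listed resolutions scale to the same 480x270 size, which suggests the two `s` values were chosen to keep the downsampled resolution roughly constant across input sizes.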