Vision
: A Self-supervised Vision Transformer Model
A family of foundation models producing universal features suitable for image-level visual tasks (image classification, instance retrieval, video understanding) as well as pixel-level visual tasks (depth estimation, semantic segmentation).
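
As a rough illustration of the image-level versus pixel-level distinction, the sketch below assumes a Vision Transformer that outputs one class token followed by one token per image patch in row-major order. That token layout, the helper name `split_vit_tokens`, and the random stand-in data are assumptions made for illustration; the description above does not specify them.

```python
import random


def split_vit_tokens(tokens, grid_h, grid_w):
    """Split a ViT token sequence into an image-level vector and a patch grid.

    Assumed (hypothetical) layout: [class token, patch_0, ..., patch_{H*W-1}],
    with patch tokens stored in row-major order.
    """
    if len(tokens) != 1 + grid_h * grid_w:
        raise ValueError("expected 1 class token plus grid_h * grid_w patch tokens")
    # Image-level feature: a single vector summarizing the whole image,
    # the kind of input a classifier or retrieval index would consume.
    image_feature = tokens[0]
    # Pixel-level features: one vector per patch, arranged as a 2D grid,
    # the kind of input a depth or segmentation head would consume.
    patch_tokens = tokens[1:]
    patch_grid = [
        patch_tokens[row * grid_w:(row + 1) * grid_w] for row in range(grid_h)
    ]
    return image_feature, patch_grid


# Random numbers standing in for real model output (no model is loaded here).
random.seed(0)
GRID_H, GRID_W, DIM = 2, 3, 4
tokens = [[random.random() for _ in range(DIM)] for _ in range(1 + GRID_H * GRID_W)]

image_feature, patch_grid = split_vit_tokens(tokens, GRID_H, GRID_W)
print(len(image_feature), len(patch_grid), len(patch_grid[0]))
```

Under these assumptions, the same forward pass could feed both kinds of task: the single vector goes to an image-level head, and the grid goes to a dense prediction head.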