Natural Language Processing (NLP) Models: Comprehensive Summary Table (HF-based) – Video Type Models

| Name | Full Name | Architecture | Base Model | Developed | Training Dataset | Lib. & Framework | Use Cases | HF URL | GitHub URL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| TimeSformer | TimeSformer (Time-Space Transformer) | Transformer | Vision Transformer (ViT) | 2021 | Evaluated on datasets like Kinetics-400 and Kinetics-600 | PyTorch | Video classification and action recognition tasks | | https://github.com/facebookresearch/TimeSformer |
| VideoMAE | Video Masked Autoencoders | Masked autoencoder | Vision Transformer (ViT) | 2022 | Pre-trained on large-scale video datasets; specifics vary by implementation | PyTorch | Video classification, action recognition, and efficient video representation learning | https://huggingface.co/docs/transformers/en/model_doc/videomae | |
| ViViT | Video Vision Transformer | Pure transformer-based model | Vision Transformer (ViT) | 2021 | Trained and evaluated on datasets such as Kinetics-400, Kinetics-600, Epic Kitchens, Something-Something V2, and Moments in Time | TensorFlow and JAX | Video classification and action recognition tasks | https://huggingface.co/docs/transformers/en/model_doc/vivit | https://github.com/google- |
