VideoMem: Constructing, Analyzing, Predicting Short-Term and Long-Term Video Memorability

Published in ICCV, 2019

Abstract

Humans share a strong tendency to memorize or forget some of the visual information they encounter. This paper focuses on understanding the intrinsic memorability of visual content. To address this challenge, we introduce a large-scale dataset (VideoMem) composed of 10,000 videos with memorability scores. In contrast to previous work on image memorability, where memorability was measured only a few minutes after memorization, memory performance is measured twice here: a few minutes after memorization and again 24-72 hours later. Hence, the dataset comes with both short-term and long-term memorability annotations. After an in-depth analysis of the dataset, we investigate various deep neural network-based models for the prediction of video memorability. Our best model, trained with a ranking loss, achieves a Spearman's rank correlation of 0.494 for short-term memorability prediction (respectively 0.256 for long-term), while our model with an attention mechanism provides insight into what makes a content memorable. The VideoMem dataset with pre-extracted features is publicly available.
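
As a pointer for readers reproducing the evaluation, the sketch below shows how a Spearman's rank correlation between predicted and ground-truth memorability scores can be computed with SciPy's spearmanr. The variable names and toy values are illustrative only, not the paper's data or official evaluation code.

from scipy.stats import spearmanr

# Ground-truth and predicted memorability scores for the same set of videos
# (toy values for illustration; memorability scores lie in [0, 1]).
ground_truth = [0.92, 0.75, 0.81, 0.60, 0.88]
predictions  = [0.85, 0.70, 0.78, 0.65, 0.90]

# Spearman's rho compares the rankings induced by the two score lists,
# which is the metric reported in the paper (0.494 short-term, 0.256
# long-term for the best ranking-loss model).
rho, p_value = spearmanr(ground_truth, predictions)
print(f"Spearman's rho = {rho:.3f} (p = {p_value:.3f})")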

Paper Data

Citation

If you found this work useful, please cite the associated paper:

R. Cohendet, C.-H. Demarty, N. Q. K. Duong, and M. Engilberge, “VideoMem: Constructing, Analyzing, Predicting Short-Term and Long-Term Video Memorability,” in Proceedings of the IEEE International Conference on Computer Vision, 2019, pp. 2531–2540.

BibTex:

@inproceedings{cohendetVideoMem2019,
  title = {VideoMem: Constructing, Analyzing, Predicting Short-Term and Long-Term Video Memorability},
  booktitle = {Proceedings of the IEEE International Conference on Computer Vision},
  author = {Cohendet, Romain and Demarty, Claire-Helene and Duong, Ngoc Q. K. and Engilberge, Martin},
  year = {2019},
  pages = {2531--2540}
}