The Video2GIF dataset contains over 100,000 pairs of GIFs and their source videos. The GIFs were collected from two popular GIF websites (makeagif.com, gifsoup.com), and the corresponding source videos were collected from YouTube in Summer 2015. We provide IDs and URLs of the GIFs and the videos, along with the temporal alignment of GIF segments to their source videos. The dataset is intended for training models for GIF creation and video highlight detection. In addition to the 100K GIF-video pairs, the dataset contains 357 pairs of GIFs and their source videos as the test set. The 357 videos come with a Creative Commons CC-BY license, which allows us to redistribute the material with appropriate credit. We provide this test set to make results reproducible even when some of the videos become unavailable. If you use the dataset, please cite the following paper: Michael Gygli, Yale Song, Liangliang Cao, "Video2GIF: Automatic Generation of Animated GIFs from Video," IEEE CVPR 2016. If you have any questions regarding the dataset, please contact Michael Gygli (gygli@vision.ee.ethz.ch).
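Since the dataset ships as IDs, URLs, and temporal alignments rather than raw video, a typical first step is parsing the alignment metadata. The column layout below (YouTube ID, segment start, segment end) is an assumption for illustration, not the dataset's documented file format:

```python
# Hypothetical parser for Video2GIF-style alignment metadata.
# The (youtube_id, start, end) tab-separated layout is assumed for
# illustration; consult the released files for the actual format.
import csv
import io

def parse_alignments(tsv_text):
    """Parse tab-separated (youtube_id, start, end) rows into dicts."""
    rows = []
    for youtube_id, start, end in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        rows.append({
            "youtube_id": youtube_id,
            "start": float(start),  # GIF segment start within the source video
            "end": float(end),      # GIF segment end
        })
    return rows

# Placeholder IDs, not real dataset entries.
sample = "vid001\t0.0\t5.0\nvid002\t3.5\t9.25\n"
print(parse_alignments(sample))
```

Keeping the alignment as (start, end) pairs per video ID makes it straightforward to cut highlight segments out of the downloaded source videos later.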
The crowd datasets are collected from a variety of sources, such as UCF and data-driven crowd datasets. The sequences are diverse, representing dense cr...
video, pedestrian, scene, crowd, human, understanding, anomaly, detection

Scene understanding has (again) become a focus of computer vision research, leveraging advances in detection, context modeling, and tracking. ...
scene, segmentation, pedestrian, 3d, classification, understanding, car, semantic

The All I Have Seen (AIHS) dataset was created to study the properties of total visual input in humans: for around two weeks, Nebojsa Jojic wore a camera ...
similarity, scene, summary, user, indoor, outdoor, video, 3d, clustering, study

Salient Montages is a human-centric video summarization dataset from the paper [1]. In [1], we present a novel method to generate salient montages...
video, saliency, wearable, montage, summarization, human

The VSUMM (Video SUMMarization) dataset consists of 50 videos from the Open Video Project. All videos are in MPEG-1 format (30 fps, 352 x 240 pixels), in color and with s...
similarity, type, summary, user, video, static, keyframe, study

Scene Parsing Benchmark: scene parsing data and part segmentation data derived from the ADE20K dataset can be downloaded from the MIT Scene Parsing Benchmark. ...
segmentation, annotation, benchmark, semantic, scene, recognition

The Places205 dataset contains 2.5 million images from 205 scene categories, released for academic use. More precisely, it contains 2,448,873 images from 205 ...
urban, learning, scene, feature, place, recognition

We present a dataset to address the problem of visual privacy, where users unintentionally leak private information when sharing personal images online...
multilabel, privacy, classification, flickr, scene, regression

The UMD Dynamic Scene Recognition dataset consists of 13 classes with 10 videos per class and is used to classify dynamic scenes. The dataset has been ...
video, motion, dynamic, classification, scene, recognition

The CIFAR-10 dataset consists of 60,000 32x32 colour images in 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images.
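The python version of the CIFAR-10 batches stores each image as a flat 3072-byte row: 1024 red values, then 1024 green, then 1024 blue, each channel in row-major 32x32 order. A minimal sketch of decoding one such row into a displayable HxWxC array:

```python
# Sketch: decode one CIFAR-10 row (3072 bytes, channel-planar) into a
# 32x32x3 uint8 image. The synthetic row below stands in for a real
# entry from an unpickled data batch.
import numpy as np

def row_to_image(row):
    """Convert a flat 3072-byte CIFAR-10 row to a 32x32x3 image."""
    assert row.shape == (3072,)
    return row.reshape(3, 32, 32).transpose(1, 2, 0)  # CHW -> HWC

row = np.arange(3072, dtype=np.uint8)  # values wrap modulo 256; shape is what matters
img = row_to_image(row)
print(img.shape)  # (32, 32, 3)
```

The same reshape applies row-wise to the whole (10000, 3072) array of a batch, yielding a (10000, 32, 32, 3) image tensor.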
object, color, patch, scene, tiny, image classification

The SUNCG dataset is a large 3D model repository for indoor scenes. SUNCG is an ongoing effort to establish a richly-annotated, large-scale dataset ...
scene, layout, recognition, indoor, object, segmentation, rendering, 3d, realism, room, synthetic

The Caltech-256 dataset by Griffin, Holub, and Perona contains 30,607 images in 256 categories.
object, detection, image, centered, classification, scene

ScanNet is an RGB-D video dataset containing 2.5 million views in more than 1,500 scans, annotated with 3D camera poses, surface reconstructions, and instance-level semantic segmentations.
scene, layout, recognition, indoor, object, cad, segmentation, rendering, 3d, realism, room, synthetic

The domain-specific personal video highlights dataset accompanies the paper [1], which describes a fully automatic method to train domain-specific highlight ranker f...
saliency, domain, wearable, human, recognition, action, video, summarization

The CALTECH 101 dataset by Li Fei-Fei contains images for 101 categories, with about 40 to 800 images per category. Most categories have about 50 images ...
object, natural-image, centered, scene, image classification

SceneNet RGB-D is a dataset comprising 5 million photorealistic images of synthetic indoor trajectories with ground truth. It expands the previous work ...
trajectory, reconstruction, scene, slam, lighting, indoor, segmentation, robot, rendering, 3d, synthetic, navigation

The Video Summarization (SumMe) dataset consists of 25 videos, each annotated with at least 15 human summaries (390 in total). The data consists of vide...
video, benchmark, summary, event, human, groundtruth, action
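Results on summarization benchmarks like SumMe are commonly reported as an overlap F-measure between a predicted summary and each human summary. A minimal sketch over binary frame-selection vectors (the example vectors are illustrative, not dataset values):

```python
# Sketch: frame-level F1 between a predicted summary and one human
# summary, both given as binary frame-selection vectors.
import numpy as np

def summary_f1(pred, gt):
    """F1 score of two equal-length binary frame-selection vectors."""
    pred, gt = np.asarray(pred, bool), np.asarray(gt, bool)
    if pred.sum() == 0 or gt.sum() == 0:
        return 0.0
    overlap = np.logical_and(pred, gt).sum()
    precision = overlap / pred.sum()
    recall = overlap / gt.sum()
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

pred = [1, 1, 0, 0, 1, 0]  # illustrative machine summary
gt   = [1, 0, 0, 1, 1, 0]  # illustrative human summary
print(round(summary_f1(pred, gt), 3))  # 0.667
```

With multiple human summaries per video, the per-summary scores are typically aggregated (e.g. by taking the mean or the maximum over annotators).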