The CALTECH 101 dataset by Li Fei-Fei contains images for 101 categories with about 40 to 800 images per category. Most categories have about 50 images at roughly 300 x 200 pixels. Interesting note: Andrea Vedaldi has a single Matlab script for evaluating the CALTECH 101 classes using PHOW features and SVM classification.
Pictures of objects belonging to 101 categories.
The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test ima...
object, color, patch, scene, tiny, image classification
The CALTECH 256 dataset by Li Fei-Fei contains 30607 images for 256 categories.
object, detection, image, centered, classification, scene
ScanNet is an RGB-D video dataset containing 2.5 million views in more than 1500 scans, annotated with 3D camera poses, surface reconstructions, and ins...
scene, layout, recognition, indoor, object, cad, segmentation, rendering, 3d, realism, room, synthetic
The UK Bench dataset from Henrik Stewenius and David Nister contains 10200 images in N=2550 groups of four images each, at size 640x480. The images are...
image retrieval, object, rotation, centered
The SUNCG dataset is a large 3D model repository for indoor scenes. SUNCG is an ongoing effort to establish a richly-annotated, large-scale dataset ...
scene, layout, recognition, indoor, object, segmentation, rendering, 3d, realism, room, synthetic
Microsoft COCO (mscoco) is an image recognition and segmentation dataset that contains more than 300k images for more than 70 categories. Other featur...
object, segmentation, benchmark, semantic, context, recognition, detection
A collection of 9 million URLs to images that have been annotated with labels spanning over 6,000 categories under Creative Commons.
natural-image
Our repetitive pattern dataset with 106 images of approx. 30 buildings from Pankrac, Prague and Marseille appearing in more than one image, number of appea...
image retrieval, urban, symmetry, repetition, image classification
Many different labeled video datasets have been collected over the past few years, but it is hard to compare them at a glance. So we have created a hand...
video, object, benchmark, classification, recognition, detection, action
STL-10 is an image recognition dataset for developing unsupervised feature learning, deep learning, and self-taught learning algorithms. Like CIFAR-10 with some mo...
natural-image
For the first few decades of the field's existence, computer vision has been focused on algorithmic, logical approaches to perception. But it was only wi...
object, 3d, kinect, reconstruction, depth, recognition, indoor
The Freiburg-Berkeley Motion Segmentation Dataset (FBMS-59) is an extension of the BMS dataset with 33 additional video sequences. A total of 720 frames...
motion, benchmark, video, object, pedestrian, segmentation, tracking, groundtruth
This data set comprises 144 images of an edge profile cutting head of a milling machine. The head tool contains a total of 30 cutting inserts. The cutti...
profile, head, cutting, edge, tools, inserts, object, tool, milling, localization, wear, monitoring
Satellite shots of the entire Earth surface, updated every several weeks.
natural-image, geospatial
The Sheffield Building Image Dataset consists of over 3,000 low-resolution images of forty different buildings, typically between 70 and 120 images per buil...
image retrieval, image classification, urban, sheffield
This material is supplementary to Michael Stark, Bernt Schiele. How Good are Local Features for Classes of Geometric Objects. Eleventh IEEE Internat...
object, binary, tool, classification, shape
The Tiny Images dataset consists of 79,302,017 images, each being a 32x32 color image. This data is stored in the form of large binary files which can b...
image retrieval, image classification, color, tiny
The ImageNet dataset is the latest dataset by Li Fei-Fei, containing various datasets ranging from 1000 to 10000 categories.
image classification, object segmentation, retrieval
The ICG Lab 6 (Multi-Camera Multi-Object Tracking) dataset contains 6 indoor people tracking scenarios recorded at our laboratory using 4 static Axis P1...
evaluation, graz, object, laboratory, pedestrian, segmentation, multiview, tracking, camera, detection, calibration
The Video Segmentation Benchmark (VSB100) provides ground truth annotations for the Berkeley Video Dataset, which consists of 100 HD quality videos divi...
video, object, segmentation, motion, pedestrian, benchmark, tracking, groundtruth
The Daimler Mono Pedestrian Detection Benchmark dataset contains a large training and test set. The training set contains 15,560 pedestrian samples (ima...
object, mono, urban, pedestrian, outdoor, scale, detection
Scene Parsing Benchmark. Scene parsing data and part segmentation data derived from the ADE20K dataset can be downloaded from the MIT Scene Parsing Benchmark. ...
segmentation, annotation, benchmark, semantic, scene, recognition
Some datasets and evaluation tools are provided on this page for four different computer vision and computer graphics problems. Population counting L...
urban, surface, reconstruction, pointcloud, object, road, pedestrian, network, line, 3d, crowd, counting, detection, groundtruth
32x32 color images with 10 / 100 categories. Not commonly used anymore, though, once again, it can be an interesting sanity check.
natural-image
The UCF Person and Car VideoSeg dataset consists of six videos with groundtruth for video object segmentation. Surfing, jumping, skiing, sliding, big ...
video, object, segmentation, motion, model, camera, groundtruth
Scene understanding with many ancillary tasks (room layout estimation, saliency prediction, etc.) and an associated competition.
natural-image
We present the 2017 DAVIS Challenge, a public competition specifically designed for the task of video object segmentation. Following the footsteps of ot...
code, quality, benchmark, video segmentation, object, segmentation, hd, tracking, resolution
The Farman Institute 3D Point Sets dataset contains 11 objects scanned by a 3D laser scanner. This dataset was peer-reviewed by Image Processing On Line: Farman...
object, scanner, 3d, reconstruction, point, model, laser
This dataset package contains the software and data used for Detection-based Object Labeling on the RGB-D Scenes Dataset as implemented in the paper: ...
object, 3d, kinect, reconstruction, depth, recognition, indoor
The KU Leuven Facade dataset is used for architectural style classification. M. Mathias, A. Martinovic, J. Weissenberg, S. Haegler, L. Van Gool: Auto...
image classification, urban, architecture, procedural reconstruction
The GaTech VideoSeg dataset consists of two (waterski and yunakim?) video sequences for object segmentation. There exists no groundtruth segmentation ...
video, object, segmentation, motion, model, camera
The All I Have Seen (AIHS) dataset was created to study the properties of total visual input in humans. For around two weeks, Nebojsa Jojic wore a camera ...
similarity, scene, summary, user, indoor, outdoor, video, 3d, clustering, study
A dataset acquired with 3 synchronized sensors (Primesense Carmine 1.09, Microsoft Kinect v2, Canon IXUS 950 IS), featuring: * 30 industry-relevant ob...
object, rgbd, 3d, estimation, pose, texture-less
The Video2GIF dataset contains over 100,000 pairs of GIFs and their source videos. The GIFs were collected from two popular GIF websites (, ...
gif, scene, summarization, summary, video highlight detection, understanding
The Places205 database contains 2.5 million images from 205 scene categories for the academic public. The image dataset contains 2,448,873 images from 205 ...
urban, learning, scene, feature, place, recognition
We present a dataset to address the problem of visual privacy - where users unintentionally leak private information when sharing personal images online...
multilabel, privacy, classification, flickr, scene, regression
The TRaffic ANd COngestionS (TRANCOS) dataset is a novel benchmark for (extremely overlapping) vehicle counting in traffic congestion situations. It consi...
urban, highway, spain, object, traffic, transportation, vehicle, detection, car
The VOT2016 pixel-wise annotations dataset contains pixel-wise per-frame annotations for sequences from the VOT2016 dataset. The annotation is in the form of ...
object, segmentation, annotation, mask, visual, tracking
The ICG Multi-Camera and Virtual PTZ dataset contains the video streams and calibrations of several static Axis P1347 cameras and one panoramic video fr...
graz, outdoor, video, object, panorama, pedestrian, network, crowd, multiview, tracking, camera, multitarget, detection, calibration
The de-facto image dataset for new algorithms. Many image API companies have labels from their REST interfaces that are suspiciously close to the 1000 c...
natural-image
SceneNet RGB-D is a dataset comprising 5 million photorealistic images of synthetic indoor trajectories with ground truth. It expands the previous work ...
trajectory, reconstruction, scene, slam, lighting, indoor, segmentation, robot, rendering, 3d, synthetic, navigation
The Visual Attributes dataset contains visual attribute annotations for over 500 object classes (animate and inanimate) which are all represented in Ima...
object, recognition, attribute, classification, imagenet
The Daimler Mono Pedestrian Classification Benchmark dataset consists of two parts: a base data set. The base data set contains a total of 4000 pedest...
illumination, object, urban, pedestrian, classification, outdoor, scale
Generic image segmentation / classification: not terribly useful for building real-world image annotation, but great for baselines.
natural-image
The dataset contains 15 documentary films that were downloaded from YouTube, whose durations vary from 9 minutes to as long as 50 minutes, and the total ...
video, object, detection
The COIL-100 (Columbia University Image Library) consists of 100 objects. For formal documentation look at the corresponding compressed technical report...
image retrieval, image classification
House numbers from Google Street View. Think of this as recurrent MNIST in the wild.
natural-image
The Comprehensive Cars (CompCars) dataset contains data from two scenarios, including images from web-nature and surveillance-nature. The web-nature dat...
object, urban, fine-grained, classification, recognition, vehicle, car, attribute
LASIESTA is composed of many real indoor and outdoor sequences organized in different categories, each one covering a specific challenge in moving ob...
motion, subtraction, dataset, background, object, stationary, foreground, camera, challenge, detection, groundtruth
The Aspect Layout dataset is designed to allow evaluation of object detection for aspect ratios in perspective images. Author text: In this project ...
object, detection, aspect, perspective, ratio, layout
Abstract: Scene understanding has (again) become a focus of computer vision research, leveraging advances in detection, context modeling, and tracking. ...
scene, segmentation, pedestrian, 3d, classification, understanding, car, semantic
The UMD Dynamic Scene Recognition dataset consists of 13 classes and 10 videos per class and is used to classify dynamic scenes. The dataset has been ...
video, motion, dynamic, classification, scene, recognition
Vector data for the entire planet under a free license. It contains (an older version of) the US Census Bureau's data.
natural-image, geospatial
MNIST: handwritten digits: The most commonly used sanity check. Dataset of 28x28, centered, B&W handwritten digits. It is an easy task just because some...
natural-image
The Multi-illuminant Image Sequences dataset contains 16 video sequences (13 with single light source and 3 with two global light sources), recorded wi...
constancy, color, white, chromaticity, physics, nature, dichromatic, illumination, object, balance, light
The PASCAL VOC is augmented with segmentation annotation for semantic parts of objects. For example, for the person category, we provide segmentation ma...
part, human, recognition, object, pedestrian, segmentation, pascal, detection, semantic
Pictures of objects belonging to 256 categories.
natural-image, classification
The YouTube-Objects dataset is composed of videos collected from YouTube by querying for the names of 10 object classes. It contains between 9 and 24 vi...
video, object, flow, segmentation, detection, optical
The ICG Multi-Camera datasets consist of: Easy Data Set (just one person), Medium Data Set (3-5 persons, used for the experiments), Hard Data Set (cro...
graz, indoor, video, object, pedestrian, multiview, tracking, camera, multitarget, detection, calibration
The KTH Multiview Football dataset contains 771 images of football players, including images taken from 3 views at 257 time instances, with 14 annotated body jo...
recognition, soccer, outdoor, object, pedestrian, game, pose, multiview, tracking, camera, multitarget, detection
The SegTrack dataset consists of six videos (five are used) with ground truth pixelwise segmentation (6th penguin is not usable). The dataset is used fo...
motion, video, object, proposal, flow, segmentation, stationary, model, camera, optical, groundtruth
Daimler Multi-Cue, Occluded Pedestrian Classification Benchmark. Training and test samples have a resolution of 48 x 96 pixels with a 12-pixel border a...
image classification, urban, pedestrian, object detection
The crowd datasets are collected from a variety of sources, such as UCF and data-driven crowd datasets. The sequences are diverse, representing dense cr...
video, object, egocentric, 3d, interaction, pose, tracking
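
Several of the counts quoted in the listing above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the `CountCheck` class and the `raw_size_bytes` helper are hypothetical names, the figures are the ones quoted in the CIFAR-10, UK Bench and Tiny Images entries, and the raw-size estimate assumes uncompressed 8-bit RGB storage with no headers, which the listing does not state.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CountCheck:
    """A quoted total and the parts that should add up to it (hypothetical helper)."""
    label: str
    total: int
    parts: Tuple[int, ...]

    def is_consistent(self) -> bool:
        # The quoted total should equal the sum of its quoted parts.
        return sum(self.parts) == self.total


# Figures taken from the listing entries above.
CHECKS = [
    CountCheck("CIFAR-10: 10 classes x 6000 images", 60000, (6000,) * 10),
    CountCheck("CIFAR-10: training + test images", 60000, (50000, 10000)),
    CountCheck("UK Bench: 2550 groups x 4 images", 10200, (4,) * 2550),
]


def raw_size_bytes(num_images: int, width: int, height: int, channels: int = 3) -> int:
    """Estimate raw storage, ASSUMING one byte per channel and no file headers."""
    return num_images * width * height * channels


if __name__ == "__main__":
    for check in CHECKS:
        status = "ok" if check.is_consistent() else "MISMATCH"
        print(f"{check.label}: {status}")

    # Tiny Images: 79,302,017 colour images of 32x32 pixels (size is an estimate).
    size = raw_size_bytes(79_302_017, 32, 32)
    print(f"Tiny Images, estimated raw size: {size / 2**30:.1f} GiB")
```

A check like this only confirms that the quoted numbers agree with each other; it says nothing about the actual files, whose layout should be taken from each dataset's own documentation.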