A communal biometrics framework supporting the development of open algorithms and reproducible evaluations. OpenBR is a framework for investigating new m…
age estimation, biometric, face detection, gender estimationAudioSet consists of an expanding ontology of 632 audio event classes and a collection of 2,084,320 human-labeled 10-second sound clips drawn from YouTube…
google, music, speech, vehicleUber Movement provides anonymized data from over two billion trips to help urban planning around the world. You need to sign up to download this data.
trips, uber, urban planningMNIST: handwritten digits: The most commonly used sanity check. Dataset of 25x25, centered, B&W handwritten digits. It is an easy taskjust because somethi…
natural-imageThe WikiText language modeling dataset is a collection of over 100 million tokens extracted from the set of verified Good and Featured articles on Wikiped…
language modeling, wikiNetflix released an anonymized version of their movie rating dataset; it consists of 100 million ratings, done by 480,000 users who have rated between 1 a…
movie, rankingIt is hard to find the relevant datasets for a machine learning problem you are working on. We believe researchers should focus on improving the models and innovating in AI. We are here to help you with the time consuming peice of finding datasets.
Launching with 1000+ datasets across multiple fields with a goal to continously increase this number.
A community for AI researchers & developers to learn and share their knowledge of datasets and models.
We are focussed on increasing the searchability of datasets, both by better underlying metadata and better UI.
Coming Soon! We are exploring an option to create a dataset marketplace to further simply the process of acquiring datasets.
If you have any feedback or any features you will like us build on, please send an email to mldta@hcode.tech
In order to facilitate a new text detection research, we introduce the Total-Text dataset (ICDAR-17 paper), which is more comprehensive than the existing …
In order to facilitate a new object detection and image enhancement research, we introduce the Exclusively Dark (ExDark) dataset (CVIU2019). The Exclusive…
This database can be used for food recognition and segmentation. The database is composed of 1,027 tray images with multiple foods and containing 73 food …
Food recognition, food segmentationThe Raw Food Texture database (RawFooT) has been specially designed to investigate the robustness of descriptors and classification methods with respect t…
food, textureThis dataset consists of RGB-D videos in indoor scenes, and has dense, per-frame segmentation labels for both object and material categories. Moreover, au…
Audio, Audio-visual, Depth, Materials, Objects, Places, RGB-D, Scene Understanding, Scenes, Semantic SegmentationMapillary Vistas is the currently largest, publicly available street view image dataset. Stats: 25,000 Images | 100 Categories | 60 Instance-wise Cat…
Signup to start to collaborating, contributing and participating in the dataset discussions! You can always browse the datasets without signing up.