UT Zappos50K (UT-Zap50K) is a large shoe dataset consisting of 50,025 catalog images collected from Zappos.com. The images are divided into four major categories (shoes, sandals, slippers, and boots), followed by functional types and individual brands. The shoes are centered on a white background and pictured in the same orientation for convenient analysis. The dataset was created in the context of an online shopping task, where users pay special attention to fine-grained visual differences. For instance, a shopper is more likely to be deciding between two pairs of similar men's running shoes than between a woman's high heel and a man's slipper. GIST and LAB color features are provided. In addition, each image has 8 associated metadata labels (gender, materials, etc.) that are used to filter the shoes on Zappos.com. Reference: Aron Yu and Kristen Grauman, "Fine-Grained Visual Comparisons with Local Learning," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, June 2014.
Large, metadata-rich, open-source dataset on Kaggle, useful for experimenting with hybrid recommendation systems.
ranking
The Webcam Interestingness dataset consists of 20 different webcam streams, with 159 images each. It is annotated with interestingness ground truth, acq...
video, interest, retrieval, classification, weather, ranking, webcam
The Comprehensive Cars (CompCars) dataset contains data from two scenarios, including images from web-nature and surveillance-nature. The web-nature dat...
object, urban, fine-grained, classification, recognition, vehicle, car, attribute
Fine-Grained Visual Classification of Aircraft (FGVC-Aircraft) is a benchmark dataset for the fine-grained visual categorization of aircraft. Data, an...
benchmark, evaluation, fine-grained, classification, aircraft, airplane, recognition
Movie ratings dataset from the MovieLens website, in various sizes ranging from demo to mid-size.
ranking
Netflix released an anonymized version of their movie rating dataset; it consists of 100 million ratings, done by 480,000 users who have rated between 1...
ranking, movie
Music recommendation dataset with access to underlying social network and other metadata that can be useful for hybrid systems.
ranking
Book ratings dataset from the Book-Crossing community. Contains 1,149,780 ratings of 271,379 books, provided by 278,858 users.
ranking