The videos in this dataset were retrieved from YouTube using the 400 most popular queries from Google Zeitgeist. Each year, Google examines the billions of queries that people around the world have typed into Google Search to capture the zeitgeist, and the results are saved in the Google Zeitgeist Archives. The owners of the dataset collected the Google Zeitgeist Archives from 2004 to 2009 and chose the 400 most popular queries to search YouTube, downloading up to 1,000 videos per query. They crawled more than 200K YouTube videos from July 2010 to September 2010. After filtering out the videos whose sizes are greater than 10 MB, the Combined Dataset contains 156,823 videos in total. They further extracted 2,907,447 keyframes from these videos. The dataset is released to the public so that other researchers can use it as a test bed.

Related publication: Jingkuan Song, Yi Yang, Zi Huang, Heng Tao Shen, Richang Hong: Multiple feature hashing for real-time large scale near-duplicate video retrieval. ACM Multimedia, pages 423-432, 2011.