Others

96 Datasets

Statlog (Vehicle Silhouettes)

The purpose is to classify a given silhouette as one of four types of vehicle, using a set of features extracted from the silhouette. The vehicle may b...

multivariate, classification

Connectionist Bench (Nettal...

This is an updated and corrected version of the data set used by Sejnowski and Rosenberg in their influential study of speech generation using a neural ...

multivariate

Connectionist Bench (Vowel ...

The problem is specified by the accompanying data file, "vowel.data". This consists of a three-dimensional array: voweldata[speaker, vowel, input]. Th...

classification

Dodgers Loop Sensor

This loop sensor data was collected for the Glendale on-ramp for the 101 North freeway in Los Angeles. It is close enough to the stadium to see unusual...

time-series, multivariate

Bag of Words

For each text collection, D is the number of documents, W is the number of words in the vocabulary, and N is the total number of words in the collection...

text, clustering

Hill-Valley

Each record represents 100 points on a two-dimensional graph. When plotted in order (from 1 through 100) as the Y co-ordinate, the points will create ei...

classification, sequential

Dexter

The original data were formatted by Thorsten Joachims in the bag-of-words representation. There were 9947 features (of which 2562 are always zeros for a...

multivariate, classification

Madelon

MADELON is an artificial dataset containing data points grouped in 32 clusters placed on the vertices of a five-dimensional hypercube and randomly label...

multivariate, classification

USPTO Algorithm Challenge, ...

USPTO Algorithm Challenge, run by NASA-Harvard Tournament Lab and TopCoder. Problem: Patent Labeling.

domain-theory, classification

Libras Movement

The dataset (movement_libras) contains 15 classes of 24 instances each, where each class refers to a hand movement type in LIBRAS. In the video pre...

multivariate, classification, clustering, sequential