Life Sciences

100 Datasets

Horse Colic

Two data files: horse-colic.data (300 training instances) and horse-colic.test (68 test instances). Possible class attributes: 24 (whether le...

multivariate, classification

ICU

Please see documentation

time-series, multivariate

Iris

This is perhaps the best known database to be found in the pattern recognition literature. Fisher's paper is a classic in the field and is referenced f...

multivariate, classification

Liver Disorders

The first 5 variables are all blood tests which are thought to be sensitive to liver disorders that might arise from excessive alcohol consumption. Each...

multivariate

Lung Cancer

This data was used by Hong and Young to illustrate the power of the optimal discriminant plane even in ill-posed settings. Applying the KNN method in th...

multivariate, classification

Lymphography

This is one of three domains provided by the Oncology Institute that has repeatedly appeared in the machine learning literature. (See also breast-cancer...

multivariate, classification

Molecular Biology (Promoter...

This dataset has been developed to help evaluate a "hybrid" learning algorithm ("KBANN") that uses examples to inductively refine preexisting knowledge....

domain-theory, classification, sequential

Molecular Biology (Protein ...

This is a data set used by Ning Qian and Terry Sejnowski in their study using a neural net to predict the secondary structure of certain globular protei...

classification, sequential

Molecular Biology (Splice-j...

Problem Description: Splice junctions are points on a DNA sequence at which "superfluous" DNA is removed during the process of protein creation in hig...

domain-theory, classification, sequential

Pima Indians Diabetes

Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here are females at least 21 ye...

multivariate, classification