Description

Extraction was done by Barry Becker from the 1994 Census database. A set of reasonably clean records was extracted using the following conditions: ((AAGE > 16) && (AGI > 100) && (AFNLWGT > 1) && (HRSWK > 0)). The prediction task is to determine whether a person makes over 50K a year.
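The extraction filter and the prediction target can be reproduced programmatically. The sketch below is illustrative only: it assumes pandas is available, that the data file is the standard comma-separated release named adult.data, and that the column names follow the documented attribute order; the AGI (adjusted gross income) field used in the original extraction is not among the released columns, so that condition is noted but not applied.

    import pandas as pd

    # Column names assumed to follow the documented attribute order of the release.
    columns = [
        "age", "workclass", "fnlwgt", "education", "education-num",
        "marital-status", "occupation", "relationship", "race", "sex",
        "capital-gain", "capital-loss", "hours-per-week", "native-country", "income",
    ]
    df = pd.read_csv("adult.data", names=columns, skipinitialspace=True)

    # Re-apply the extraction conditions that map onto released fields:
    # AAGE > 16 -> age, AFNLWGT > 1 -> fnlwgt, HRSWK > 0 -> hours-per-week.
    # (AGI > 100 has no corresponding released column.)
    clean = df[(df["age"] > 16) & (df["fnlwgt"] > 1) & (df["hours-per-week"] > 0)]

    # Binary prediction target: does the person make over 50K a year?
    y = (clean["income"] == ">50K").astype(int)
    X = clean.drop(columns=["income"])
    print(X.shape, y.mean())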

Related Papers

  • Omid Madani and David M. Pennock and Gary William Flake. Co-Validation: Using Model Disagreement to Validate Classification Algorithms. Yahoo! Research Labs. [link]
  • S. V. N. Vishwanathan and Alexander J. Smola and M. Narasimha Murty. SimpleSVM. Machine Learning Program, National ICT Australia. [link]
  • Luc Hoegaerts and J. A. K. Suykens and J. Vandewalle and Bart De Moor. Primal Space Sparse Kernel Partial Least Squares Regression for Large Scale Problems. Katholieke Universiteit Leuven, Department of Electrical Engineering, ESAT-SCD-SISTA. [link]
  • Luc Hoegaerts and J. A. K. Suykens and J. Vandewalle and Bart De Moor. Subset Based Least Squares Subspace Regression in RKHS. Katholieke Universiteit Leuven, Department of Electrical Engineering, ESAT-SCD-SISTA. [link]
  • Gary M. Weiss and Haym Hirsh. A Quantitative Study of Small Disjuncts: Experiments and Results. Department of Computer Science Rutgers University. 2000. [link]
  • Rich Caruana and Alexandru Niculescu-Mizil. An Empirical Evaluation of Supervised Learning for ROC Area. ROCAI. 2004. [link]
  • Bart Hamers and J. A. K. Suykens and Bart De Moor. Coupled Transductive Ensemble Learning of Kernel Models. 2003. [link]
  • Wei-Chun Kao and Kai-Min Chung and Lucas Assun and Chih-Jen Lin. Decomposition Methods for Linear Support Vector Machines. Neural Computation, 16. 2004. [link]
  • William W. Cohen and Yoram Singer. A Simple, Fast, and Effective Rule Learner. AT&T Labs-Research, Shannon Laboratory. [link]
  • Rich Caruana and Alexandru Niculescu-Mizil and Geoff Crew and Alex Ksikes. Ensemble selection from libraries of models. ICML. 2004. [link]
  • Chris Giannella and Bassem Sayrafi. An Information Theoretic Histogram for Single Dimensional Selectivity Estimation. Department of Computer Science, Indiana University Bloomington. [link]
  • I. Yoncaci. Maximum a Posteriori Tree Augmented Naive Bayes Classifiers. Institut d'Investigació en Intel·ligència Artificial, CSIC. 2003. [link]
  • Bernhard Pfahringer and Geoffrey Holmes and Richard Kirkby. Optimizing the Induction of Alternating Decision Trees. PAKDD. 2001. [link]
  • Stephen D. Bay. Multivariate Discretization for Set Mining. Knowl. Inf. Syst, 3. 2001. [link]
  • Kuan-ming Lin and Chih-Jen Lin. A Study on Reduced Support Vector Machines. Department of Computer Science and Information Engineering National Taiwan University. [link]
  • Ayhan Demiriz and Kristin P. Bennett and John Shawe-Taylor and I. Nouretdinov V. Linear Programming Boosting via Column Generation. Dept. of Decision Sciences and Eng. Systems, Rensselaer Polytechnic Institute. [link]
  • Saharon Rosset. Model selection via the AUC. ICML. 2004. [link]
  • Petri Kontkanen and Jussi Lahtinen and Petri Myllymaki and Tomi Silander and Henry Tirri. Proceedings of Pre- and Post-processing in Machine Learning and Data Mining: Theoretical Aspects and Applications, a workshop within Machine Learning and Applications. Complex Systems Computation Group (CoSCo). 1999. [link]
  • Josep Roure Alcobe. Incremental Hill-Climbing Search Applied to Bayesian Network Structure Learning. Escola Universitària Politècnica de Mataró. [link]
  • Zhiyuan Chen and Johannes Gehrke and Flip Korn. Query Optimization In Compressed Database Systems. SIGMOD Conference. 2001. [link]
  • Luca Zanni. An Improved Gradient Projection-based Decomposition Technique for Support Vector Machines. Dipartimento di Matematica, Università di Modena e Reggio Emilia. [link]
  • Stephen D. Bay and Michael J. Pazzani. Detecting Group Differences: Mining Contrast Sets. Data Min. Knowl. Discov, 5. 2001. [link]
  • Rakesh Agrawal and Ramakrishnan Srikant and Dilys Thomas. Privacy Preserving OLAP. SIGMOD Conference. 2005. [link]
  • Ykä Huhtala and Juha Kärkkäinen and Pasi Porkka and Hannu Toivonen. Efficient Discovery of Functional and Approximate Dependencies Using Partitions. ICDE. 1998. [link]
  • Alexander J. Smola and Vishy Vishwanathan and Eleazar Eskin. Laplace Propagation. NIPS. 2003. [link]
  • Petri Kontkanen and Jussi Lahtinen and Petri Myllymaki and Tomi Silander and Henry Tirri. Using Bayesian Networks for Visualizing High-Dimensional Data. Complex Systems Computation Group (CoSCo). [link]
  • Jie Cheng and Russell Greiner. Learning Bayesian Belief Network Classifiers: Algorithms and System. Canadian Conference on AI. 2001. [link]
  • Ahmed Hussain Khan. Multiplier-Free Feedforward Networks. 174. [link]
  • Haixun Wang and Philip S. Yu. SSDT-NN: A Subspace-Splitting Decision Tree Classifier with Application to Target Selection. IBM T. J. Watson Research Center. [link]
  • S. Sathiya Keerthi and Kaibo Duan and Shirish Krishnaj Shevade and Aun Neow Poo. A Fast Dual Algorithm for Kernel Logistic Regression. ICML. 2002. [link]
  • Kristin P. Bennett and Ayhan Demiriz and John Shawe-Taylor. A Column Generation Algorithm For Boosting. ICML. 2000. [link]
  • David R. Musicant and Alexander Feinberg. Active Set Support Vector Regression. [link]
  • Dmitry Pavlov and Jianchang Mao and Byron Dom. Scaling-Up Support Vector Machines Using Boosting Algorithm. ICPR. 2000. [link]
  • Shi Zhong and Weiyu Tang and Taghi M. Khoshgoftaar. Boosted Noise Filters for Identifying Mislabeled Data. Department of Computer Science and Engineering Florida Atlantic University. [link]
  • Thomas Serafini and Gaetano Zanghirati and Luca Zanni. Gradient Projection Methods for Quadratic Programs and Applications in Training Support Vector Machines. Dipartimento di Matematica, Università di Modena e Reggio Emilia. 2003. [link]
  • Ron Kohavi and Barry G. Becker and Dan Sommerfield. Improving Simple Bayes. Data Mining and Visualization Group, Silicon Graphics, Inc. [link]
  • Jeff G. Schneider and Andrew W. Moore. Active Learning in Discrete Input Spaces. School of Computer Science Carnegie Mellon University. [link]
  • Bianca Zadrozny. Learning and evaluating classifiers under sample selection bias. ICML. 2004. [link]
  • Ron Kohavi. Scaling Up the Accuracy of Naive-Bayes Classifiers: A Decision-Tree Hybrid. KDD. 1996. [link]
  • Andrew W. Moore and Weng-Keen Wong. Optimal Reinsertion: A New Search Operator for Accelerated and More Accurate Bayesian Network Structure Learning. ICML. 2003. [link]
  • Christopher R. Palmer and Christos Faloutsos. Electricity Based External Similarity of Categorical Attributes. PAKDD. 2003. [link]
  • Jie Cheng and Russell Greiner. Comparing Bayesian Network Classifiers. UAI. 1999. [link]
  • David R. Musicant. Data Mining via Mathematical Programming and Machine Learning. PhD dissertation (Computer Sciences), University of Wisconsin-Madison. [link]
  • Ramesh Natarajan and Edwin P D Pednault. Segmented Regression Estimators for Massive Data Sets. SDM. 2002. [link]
  • Nitesh V. Chawla and Kevin W. Bowyer and Lawrence O. Hall and W. Philip Kegelmeyer. SMOTE: Synthetic Minority Over-sampling Technique. J. Artif. Intell. Res. (JAIR), 16. 2002. [link]
  • Rong-En Fan and P.-H. Chen and C.-J. Lin. Working Set Selection Using the Second Order Information for Training SVM. Department of Computer Science and Information Engineering, National Taiwan University. [link]
  • Grigorios Tsoumakas and Ioannis P. Vlahavas. Fuzzy Meta-Learning: Preliminary Results. Greek Secretariat for Research and Technology. [link]
  • S. Sathiya Keerthi and Chih-Jen Lin. Asymptotic Behaviors of Support Vector Machines with Gaussian Kernel. Neural Computation, 15. 2003. [link]
  • Dmitry Pavlov and Darya Chudova and Padhraic Smyth. Towards scalable support vector machines using squashing. KDD. 2000. [link]
  • Bianca Zadrozny and Charles Elkan. Transforming classifier scores into accurate multiclass probability estimates. KDD. 2002. [link]
  • John C. Platt. Using Analytic QP and Sparseness to Speed Training of Support Vector Machines. NIPS. 1998. [link]

Related datasets