Combining Pattern Classifiers: Methods and Algorithms

Bibliographic Details
Main Author: Kuncheva, Ludmila I. (Ludmila Ilieva), 1959-
Corporate Author: Ebooks Corporation
Format: Electronic eBook
Language: English
Published: Hoboken, New Jersey : John Wiley & Sons, Inc., [2014]
Edition: Second edition.
Online Access: Connect to this title online (unlimited simultaneous users allowed; 325 uses per year)
Table of Contents:
  • 1. Fundamentals of Pattern Recognition
  • 1.1. Basic Concepts: Class, Feature, Data Set
  • 1.1.1. Classes and Class Labels
  • 1.1.2. Features
  • 1.1.3. Data Set
  • 1.1.4. Generate Your Own Data
  • 1.2. Classifier, Discriminant Functions, Classification Regions
  • 1.3. Classification Error and Classification Accuracy
  • 1.3.1. Where Does the Error Come From? Bias and Variance
  • 1.3.2. Estimation of the Error
  • 1.3.3. Confusion Matrices and Loss Matrices
  • 1.3.4. Training and Testing Protocols
  • 1.3.5. Overtraining and Peeking
  • 1.4. Experimental Comparison of Classifiers
  • 1.4.1. Two Trained Classifiers and a Fixed Testing Set
  • 1.4.2. Two Classifier Models and a Single Data Set
  • 1.4.3. Two Classifier Models and Multiple Data Sets
  • 1.4.4. Multiple Classifier Models and Multiple Data Sets
  • 1.5. Bayes Decision Theory
  • 1.5.1. Probabilistic Framework
  • 1.5.2. Discriminant Functions and Decision Boundaries
  • 1.5.3. Bayes Error
  • 1.6. Clustering and Feature Selection
  • 1.6.1. Clustering
  • 1.6.2. Feature Selection
  • 1.7. Challenges of Real-Life Data
  • Appendix
  • 1.A.1. Data Generation
  • 1.A.2. Comparison of Classifiers
  • 1.A.2.1. MATLAB Functions for Comparing Classifiers
  • 1.A.2.2. Critical Values for Wilcoxon and Sign Test
  • 1.A.3. Feature Selection
  • 2. Base Classifiers
  • 2.1. Linear and Quadratic Classifiers
  • 2.1.1. Linear Discriminant Classifier
  • 2.1.2. Nearest Mean Classifier
  • 2.1.3. Quadratic Discriminant Classifier
  • 2.1.4. Stability of LDC and QDC
  • 2.2. Decision Tree Classifiers
  • 2.2.1. Basics and Terminology
  • 2.2.2. Training of Decision Tree Classifiers
  • 2.2.3. Selection of the Feature for a Node
  • 2.2.4. Stopping Criterion
  • 2.2.5. Pruning of the Decision Tree
  • 2.2.6. C4.5 and ID3
  • 2.2.7. Instability of Decision Trees
  • 2.2.8. Random Trees
  • 2.3. Naive Bayes Classifier
  • 2.4. Neural Networks
  • 2.4.1. Neurons
  • 2.4.2. Rosenblatt's Perceptron
  • 2.4.3. Multi-Layer Perceptron
  • 2.5. Support Vector Machines
  • 2.5.1. Why Would It Work?
  • 2.5.2. Classification Margins
  • 2.5.3. Optimal Linear Boundary
  • 2.5.4. Parameters and Classification Boundaries of SVM
  • 2.6. k-Nearest Neighbor Classifier (k-nn)
  • 2.7. Final Remarks
  • 2.7.1. Simple or Complex Models?
  • 2.7.2. Triangle Diagram
  • 2.7.3. Choosing a Base Classifier for Ensembles
  • Appendix
  • 2.A.1. MATLAB Code for the Fish Data
  • 2.A.2. MATLAB Code for Individual Classifiers
  • 2.A.2.1. Decision Tree
  • 2.A.2.2. Naive Bayes
  • 2.A.2.3. Multi-Layer Perceptron
  • 2.A.2.4. 1-nn Classifier
  • 3. Overview of the Field
  • 3.1. Philosophy
  • 3.2. Two Examples
  • 3.2.1. Wisdom of the "Classifier Crowd"
  • 3.2.2. Power of Divide-and-Conquer
  • 3.3. Structure of the Area
  • 3.3.1. Terminology
  • 3.3.2. Taxonomy of Classifier Ensemble Methods
  • 3.3.3. Classifier Fusion and Classifier Selection
  • 3.4. Quo Vadis?
  • 3.4.1. Reinventing the Wheel?
  • 3.4.2. Illusion of Progress?
  • 3.4.3. Bibliometric Snapshot
  • 4. Combining Label Outputs
  • 4.1. Types of Classifier Outputs
  • 4.2. Probabilistic Framework for Combining Label Outputs
  • 4.3. Majority Vote
  • 4.3.1. "Democracy" in Classifier Combination
  • 4.3.2. Accuracy of the Majority Vote
  • 4.3.3. Limits on the Majority Vote Accuracy: An Example
  • 4.3.4. Patterns of Success and Failure
  • 4.3.5. Optimality of the Majority Vote Combiner
  • 4.4. Weighted Majority Vote
  • 4.4.1. Two Examples
  • 4.4.2. Optimality of the Weighted Majority Vote Combiner
  • 4.5. Naive-Bayes Combiner
  • 4.5.1. Optimality of the Naive Bayes Combiner
  • 4.5.2. Implementation of the NB Combiner
  • 4.6. Multinomial Methods
  • 4.7. Comparison of Combination Methods for Label Outputs
  • Appendix
  • 4.A.1. Matan's Proof for the Limits on the Majority Vote Accuracy
  • 4.A.2. Selected MATLAB Code
  • 5. Combining Continuous-Valued Outputs
  • 5.1. Decision Profile
  • 5.2. How Do We Get Probability Outputs?
  • 5.2.1. Probabilities Based on Discriminant Scores
  • 5.2.2. Probabilities Based on Counts: Laplace Estimator
  • 5.3. Nontrainable (Fixed) Combination Rules
  • 5.3.1. Generic Formulation
  • 5.3.2. Equivalence of Simple Combination Rules
  • 5.3.3. Generalized Mean Combiner
  • 5.3.4. Theoretical Comparison of Simple Combiners
  • 5.3.5. Where Do They Come From?
  • 5.4. Weighted Average (Linear Combiner)
  • 5.4.1. Consensus Theory
  • 5.4.2. Added Error for the Weighted Mean Combination
  • 5.4.3. Linear Regression
  • 5.5. Classifier as a Combiner
  • 5.5.1. Supra Bayesian Approach
  • 5.5.2. Decision Templates
  • 5.5.3. Linear Classifier
  • 5.6. Example of Nine Combiners for Continuous-Valued Outputs
  • 5.7. To Train or Not to Train?
  • Appendix
  • 5.A.1. Theoretical Classification Error for the Simple Combiners
  • 5.A.1.1. Set-up and Assumptions
  • 5.A.1.2. Individual Error
  • 5.A.1.3. Minimum and Maximum
  • 5.A.1.4. Average (Sum)
  • 5.A.1.5. Median and Majority Vote
  • 5.A.1.6. Oracle
  • 5.A.2. Selected MATLAB Code
  • 6. Ensemble Methods
  • 6.1. Bagging
  • 6.1.1. Origins: Bagging Predictors
  • 6.1.2. Why Does Bagging Work?
  • 6.1.3. Out-of-bag Estimates
  • 6.1.4. Variants of Bagging
  • 6.2. Random Forests
  • 6.3. AdaBoost
  • 6.3.1. AdaBoost Algorithm
  • 6.3.2. arc-x4 Algorithm
  • 6.3.3. Why Does AdaBoost Work?
  • 6.3.4. Variants of Boosting
  • 6.3.5. Famous Application: AdaBoost for Face Detection
  • 6.4. Random Subspace Ensembles
  • 6.5. Rotation Forest
  • 6.6. Random Linear Oracle
  • 6.7. Error Correcting Output Codes (ECOC)
  • 6.7.1. Code Designs
  • 6.7.2. Decoding
  • 6.7.3. Ensembles of Nested Dichotomies
  • Appendix
  • 6.A.1. Bagging
  • 6.A.2. AdaBoost
  • 6.A.3. Random Subspace
  • 6.A.4. Rotation Forest
  • 6.A.5. Random Linear Oracle
  • 6.A.6. ECOC
  • 7. Classifier Selection
  • 7.1. Preliminaries
  • 7.2. Why Classifier Selection Works
  • 7.3. Estimating Local Competence Dynamically
  • 7.3.1. Decision-Independent Estimates
  • 7.3.2. Decision-Dependent Estimates
  • 7.4. Pre-Estimation of the Competence Regions
  • 7.4.1. Bespoke Classifiers
  • 7.4.2. Clustering and Selection
  • 7.5. Simultaneous Training of Regions and Classifiers
  • 7.6. Cascade Classifiers
  • Appendix: Selected MATLAB Code
  • 7.A.1. Banana Data
  • 7.A.2. Evolutionary Algorithm for a Selection Ensemble for the Banana Data
  • 8. Diversity in Classifier Ensembles
  • 8.1. What Is Diversity?
  • 8.1.1. Diversity for a Point-Value Estimate
  • 8.1.2. Diversity in Software Engineering
  • 8.1.3. Statistical Measures of Relationship
  • 8.2. Measuring Diversity in Classifier Ensembles
  • 8.2.1. Pairwise Measures
  • 8.2.2. Nonpairwise Measures
  • 8.3. Relationship Between Diversity and Accuracy
  • 8.3.1. Example
  • 8.3.2. Relationship Patterns
  • 8.3.3. Caveat: Independent Outputs ≠ Independent Errors
  • 8.3.4. Independence Is Not the Best Scenario
  • 8.3.5. Diversity and Ensemble Margins
  • 8.4. Using Diversity
  • 8.4.1. Diversity for Finding Bounds and Theoretical Relationships
  • 8.4.2. Kappa-error Diagrams and Ensemble Maps
  • 8.4.3. Overproduce and Select
  • 8.5. Conclusions: Diversity of Diversity
  • Appendix
  • 8.A.1. Derivation of Diversity Measures for Oracle Outputs
  • 8.A.1.1. Correlation ρ
  • 8.A.1.2. Interrater Agreement κ
  • 8.A.2. Diversity Measure Equivalence
  • 8.A.3. Independent Outputs ≠ Independent Errors
  • 8.A.4. Bound on the Kappa-Error Diagram
  • 8.A.5. Calculation of the Pareto Frontier
  • 9. Ensemble Feature Selection
  • 9.1. Preliminaries
  • 9.1.1. Right and Wrong Protocols
  • 9.1.2. Ensemble Feature Selection Approaches
  • 9.1.3. Natural Grouping
  • 9.2. Ranking by Decision Tree Ensembles
  • 9.2.1. Simple Count and Split Criterion
  • 9.2.2. Permuted Features or the "Noised-up" Method
  • 9.3. Ensembles of Rankers
  • 9.3.1. Approach
  • 9.3.2. Ranking Methods (Criteria)
  • 9.4. Random Feature Selection for the Ensemble
  • 9.4.1. Random Subspace Revisited
  • 9.4.2. Usability, Coverage, and Feature Diversity
  • 9.4.3. Genetic Algorithms
  • 9.5. Nonrandom Selection
  • 9.5.1. "Favorite Class" Model
  • 9.5.2. Iterative Model
  • 9.5.3. Incremental Model
  • 9.6. Stability Index
  • 9.6.1. Consistency Between a Pair of Subsets
  • 9.6.2. Stability Index for K Sequences
  • 9.6.3. Example of Applying the Stability Index
  • Appendix
  • 9.A.1. MATLAB Code for the Numerical Example of Ensemble Ranking
  • 9.A.2. MATLAB GA Nuggets
  • 9.A.3. MATLAB Code for the Stability Index
  • 10. Final Thought