Combining pattern classifiers : methods and algorithms
Format: Electronic eBook
Language: English
Published: Hoboken, New Jersey : John Wiley & Sons, Inc., [2014]
Edition: Second edition
Table of Contents:
- 1. Fundamentals of Pattern Recognition
- 1.1. Basic Concepts: Class, Feature, Data Set
- 1.1.1. Classes and Class Labels
- 1.1.2. Features
- 1.1.3. Data Set
- 1.1.4. Generate Your Own Data
- 1.2. Classifier, Discriminant Functions, Classification Regions
- 1.3. Classification Error and Classification Accuracy
- 1.3.1. Where Does the Error Come From? Bias and Variance
- 1.3.2. Estimation of the Error
- 1.3.3. Confusion Matrices and Loss Matrices
- 1.3.4. Training and Testing Protocols
- 1.3.5. Overtraining and Peeking
- 1.4. Experimental Comparison of Classifiers
- 1.4.1. Two Trained Classifiers and a Fixed Testing Set
- 1.4.2. Two Classifier Models and a Single Data Set
- 1.4.3. Two Classifier Models and Multiple Data Sets
- 1.4.4. Multiple Classifier Models and Multiple Data Sets
- 1.5. Bayes Decision Theory
- 1.5.1. Probabilistic Framework
- 1.5.2. Discriminant Functions and Decision Boundaries
- 1.5.3. Bayes Error
- 1.6. Clustering and Feature Selection
- 1.6.1. Clustering
- 1.6.2. Feature Selection
- 1.7. Challenges of Real-Life Data
- Appendix
- 1.A.1. Data Generation
- 1.A.2. Comparison of Classifiers
- 1.A.2.1. MATLAB Functions for Comparing Classifiers
- 1.A.2.2. Critical Values for Wilcoxon and Sign Test
- 1.A.3. Feature Selection
- 2. Base Classifiers
- 2.1. Linear and Quadratic Classifiers
- 2.1.1. Linear Discriminant Classifier
- 2.1.2. Nearest Mean Classifier
- 2.1.3. Quadratic Discriminant Classifier
- 2.1.4. Stability of LDC and QDC
- 2.2. Decision Tree Classifiers
- 2.2.1. Basics and Terminology
- 2.2.2. Training of Decision Tree Classifiers
- 2.2.3. Selection of the Feature for a Node
- 2.2.4. Stopping Criterion
- 2.2.5. Pruning of the Decision Tree
- 2.2.6. C4.5 and ID3
- 2.2.7. Instability of Decision Trees
- 2.2.8. Random Trees
- 2.3. Naive Bayes Classifier
- 2.4. Neural Networks
- 2.4.1. Neurons
- 2.4.2. Rosenblatt's Perceptron
- 2.4.3. Multi-Layer Perceptron
- 2.5. Support Vector Machines
- 2.5.1. Why Would It Work?
- 2.5.2. Classification Margins
- 2.5.3. Optimal Linear Boundary
- 2.5.4. Parameters and Classification Boundaries of SVM
- 2.6. k-Nearest Neighbor Classifier (k-nn)
- 2.7. Final Remarks
- 2.7.1. Simple or Complex Models?
- 2.7.2. Triangle Diagram
- 2.7.3. Choosing a Base Classifier for Ensembles
- Appendix
- 2.A.1. MATLAB Code for the Fish Data
- 2.A.2. MATLAB Code for Individual Classifiers
- 2.A.2.1. Decision Tree
- 2.A.2.2. Naive Bayes
- 2.A.2.3. Multi-Layer Perceptron
- 2.A.2.4. 1-nn Classifier
- 3. Overview of the Field
- 3.1. Philosophy
- 3.2. Two Examples
- 3.2.1. Wisdom of the "Classifier Crowd"
- 3.2.2. Power of Divide-and-Conquer
- 3.3. Structure of the Area
- 3.3.1. Terminology
- 3.3.2. Taxonomy of Classifier Ensemble Methods
- 3.3.3. Classifier Fusion and Classifier Selection
- 3.4. Quo Vadis?
- 3.4.1. Reinventing the Wheel?
- 3.4.2. Illusion of Progress?
- 3.4.3. Bibliometric Snapshot
- 4. Combining Label Outputs
- 4.1. Types of Classifier Outputs
- 4.2. Probabilistic Framework for Combining Label Outputs
- 4.3. Majority Vote
- 4.3.1. "Democracy" in Classifier Combination
- 4.3.2. Accuracy of the Majority Vote
- 4.3.3. Limits on the Majority Vote Accuracy: An Example
- 4.3.4. Patterns of Success and Failure
- 4.3.5. Optimality of the Majority Vote Combiner
- 4.4. Weighted Majority Vote
- 4.4.1. Two Examples
- 4.4.2. Optimality of the Weighted Majority Vote Combiner
- 4.5. Naive-Bayes Combiner
- 4.5.1. Optimality of the Naive Bayes Combiner
- 4.5.2. Implementation of the NB Combiner
- 4.6. Multinomial Methods
- 4.7. Comparison of Combination Methods for Label Outputs
- Appendix
- 4.A.1. Matan's Proof for the Limits on the Majority Vote Accuracy
- 4.A.2. Selected MATLAB Code
- 5. Combining Continuous-Valued Outputs
- 5.1. Decision Profile
- 5.2. How Do We Get Probability Outputs?
- 5.2.1. Probabilities Based on Discriminant Scores
- 5.2.2. Probabilities Based on Counts: Laplace Estimator
- 5.3. Nontrainable (Fixed) Combination Rules
- 5.3.1. Generic Formulation
- 5.3.2. Equivalence of Simple Combination Rules
- 5.3.3. Generalized Mean Combiner
- 5.3.4. Theoretical Comparison of Simple Combiners
- 5.3.5. Where Do They Come From?
- 5.4. Weighted Average (Linear Combiner)
- 5.4.1. Consensus Theory
- 5.4.2. Added Error for the Weighted Mean Combination
- 5.4.3. Linear Regression
- 5.5. Classifier as a Combiner
- 5.5.1. Supra Bayesian Approach
- 5.5.2. Decision Templates
- 5.5.3. Linear Classifier
- 5.6. Example of Nine Combiners for Continuous-Valued Outputs
- 5.7. To Train or Not to Train?
- Appendix
- 5.A.1. Theoretical Classification Error for the Simple Combiners
- 5.A.1.1. Set-up and Assumptions
- 5.A.1.2. Individual Error
- 5.A.1.3. Minimum and Maximum
- 5.A.1.4. Average (Sum)
- 5.A.1.5. Median and Majority Vote
- 5.A.1.6. Oracle
- 5.A.2. Selected MATLAB Code
- 6. Ensemble Methods
- 6.1. Bagging
- 6.1.1. Origins: Bagging Predictors
- 6.1.2. Why Does Bagging Work?
- 6.1.3. Out-of-bag Estimates
- 6.1.4. Variants of Bagging
- 6.2. Random Forests
- 6.3. AdaBoost
- 6.3.1. AdaBoost Algorithm
- 6.3.2. arc-x4 Algorithm
- 6.3.3. Why Does AdaBoost Work?
- 6.3.4. Variants of Boosting
- 6.3.5. Famous Application: AdaBoost for Face Detection
- 6.4. Random Subspace Ensembles
- 6.5. Rotation Forest
- 6.6. Random Linear Oracle
- 6.7. Error Correcting Output Codes (ECOC)
- 6.7.1. Code Designs
- 6.7.2. Decoding
- 6.7.3. Ensembles of Nested Dichotomies
- Appendix
- 6.A.1. Bagging
- 6.A.2. AdaBoost
- 6.A.3. Random Subspace
- 6.A.4. Rotation Forest
- 6.A.5. Random Linear Oracle
- 6.A.6. ECOC
- 7. Classifier Selection
- 7.1. Preliminaries
- 7.2. Why Classifier Selection Works
- 7.3. Estimating Local Competence Dynamically
- 7.3.1. Decision-Independent Estimates
- 7.3.2. Decision-Dependent Estimates
- 7.4. Pre-Estimation of the Competence Regions
- 7.4.1. Bespoke Classifiers
- 7.4.2. Clustering and Selection
- 7.5. Simultaneous Training of Regions and Classifiers
- 7.6. Cascade Classifiers
- Appendix: Selected MATLAB Code
- 7.A.1. Banana Data
- 7.A.2. Evolutionary Algorithm for a Selection Ensemble for the Banana Data
- 8. Diversity in Classifier Ensembles
- 8.1. What Is Diversity?
- 8.1.1. Diversity for a Point-Value Estimate
- 8.1.2. Diversity in Software Engineering
- 8.1.3. Statistical Measures of Relationship
- 8.2. Measuring Diversity in Classifier Ensembles
- 8.2.1. Pairwise Measures
- 8.2.2. Nonpairwise Measures
- 8.3. Relationship Between Diversity and Accuracy
- 8.3.1. Example
- 8.3.2. Relationship Patterns
- 8.3.3. Caveat: Independent Outputs ≠ Independent Errors
- 8.3.4. Independence Is Not the Best Scenario
- 8.3.5. Diversity and Ensemble Margins
- 8.4. Using Diversity
- 8.4.1. Diversity for Finding Bounds and Theoretical Relationships
- 8.4.2. Kappa-error Diagrams and Ensemble Maps
- 8.4.3. Overproduce and Select
- 8.5. Conclusions: Diversity of Diversity
- Appendix
- 8.A.1. Derivation of Diversity Measures for Oracle Outputs
- 8.A.1.1. Correlation ρ
- 8.A.1.2. Interrater Agreement κ
- 8.A.2. Diversity Measure Equivalence
- 8.A.3. Independent Outputs ≠ Independent Errors
- 8.A.4. Bound on the Kappa-Error Diagram
- 8.A.5. Calculation of the Pareto Frontier
- 9. Ensemble Feature Selection
- 9.1. Preliminaries
- 9.1.1. Right and Wrong Protocols
- 9.1.2. Ensemble Feature Selection Approaches
- 9.1.3. Natural Grouping
- 9.2. Ranking by Decision Tree Ensembles
- 9.2.1. Simple Count and Split Criterion
- 9.2.2. Permuted Features or the "Noised-up" Method
- 9.3. Ensembles of Rankers
- 9.3.1. Approach
- 9.3.2. Ranking Methods (Criteria)
- 9.4. Random Feature Selection for the Ensemble
- 9.4.1. Random Subspace Revisited
- 9.4.2. Usability, Coverage, and Feature Diversity
- 9.4.3. Genetic Algorithms
- 9.5. Nonrandom Selection
- 9.5.1. "Favorite Class" Model
- 9.5.2. Iterative Model
- 9.5.3. Incremental Model
- 9.6. Stability Index
- 9.6.1. Consistency Between a Pair of Subsets
- 9.6.2. Stability Index for K Sequences
- 9.6.3. Example of Applying the Stability Index
- Appendix
- 9.A.1. MATLAB Code for the Numerical Example of Ensemble Ranking
- 9.A.2. MATLAB GA Nuggets
- 9.A.3. MATLAB Code for the Stability Index
- 10. Final Thought.