Supervised, Semi-supervised, and Unsupervised Learning

Kernel Based Algorithms for Mining Huge Data Sets

TABLE OF CONTENTS

1 Introduction

 

2 Support Vector Machines in Classification and Regression - An Introduction

 

3 Iterative Single Data Algorithm for Kernel Machines from Huge Data Sets: Theory and Performance

 

4 Feature Reduction with Support Vector Machines and Application in DNA Microarray Analysis

 

5 Semi-supervised Learning and Applications

 

6 Unsupervised Learning by Principal and Independent Component Analysis

 

Appendices

 

A Support Vector Machines

 

B Matlab Code for ISDA Classification

 

C Matlab Code for ISDA Regression

 

D Matlab Code for Conjugate Gradient Method with Box Constraints

 

E Uncorrelatedness and Independence

 

F Independent Component Analysis by Empirical Estimation of Score Functions, i.e., Probability Density Functions

 

G SemiL User Guide

 

1 Introduction

1.1 An Overview of Machine Learning

1.2 Challenges in Machine Learning

1.2.1 Solving Large-Scale SVMs

1.2.2 Feature Reduction with Support Vector Machines

1.2.3 Graph-Based Semi-supervised Learning Algorithms

1.2.4 Unsupervised Learning Based on Principle of Redundancy Reduction

2 Support Vector Machines in Classification and Regression - An Introduction

2.1 Basics of Learning from Data

2.2 Support Vector Machines in Classification and Regression

2.2.1 Linear Maximal Margin Classifier for Linearly Separable Data

2.2.2 Linear Soft Margin Classifier for Overlapping Classes

2.2.3 The Nonlinear SVMs Classifier

2.2.4 Regression by Support Vector Machines

2.3 Implementation Issues


 

3 Iterative Single Data Algorithm for Kernel Machines from Huge Data Sets: Theory and Performance

3.1 Introduction

3.2 Iterative Single Data Algorithm for Positive Definite Kernels without Bias Term b

3.2.1 Kernel AdaTron in Classification

3.2.2 SMO without Bias Term b in Classification

3.2.3 Kernel AdaTron in Regression

3.2.4 SMO without Bias Term b in Regression

3.2.5 The Coordinate Ascent Based Learning for Nonlinear Classification and Regression Tasks

3.2.6 Discussion on ISDA Without a Bias Term b

3.3 Iterative Single Data Algorithm with an Explicit Bias Term b

3.3.1 Iterative Single Data Algorithm for SVMs Classification with a Bias Term b

3.4 Performance of the Iterative Single Data Algorithm and Comparisons

3.5 Implementation Issues

3.5.1 Working-set Selection and Shrinking of ISDA for Classification

3.5.2 Computation of the Kernel Matrix and Caching of ISDA for Classification

3.5.3 Implementation Details of ISDA for Regression

3.6 Conclusions

 

4 Feature Reduction with Support Vector Machines and Application in DNA Microarray Analysis

4.1 Introduction

4.2 Basics of Microarray Technology

4.3 Some Prior Work

4.3.1 Recursive Feature Elimination with Support Vector Machines

4.3.2 Selection Bias and How to Avoid It

4.4 Influence of the Penalty Parameter C in RFE-SVMs

4.5 Gene Selection for the Colon Cancer and the Lymphoma Data Sets

4.5.1 Results for Various C Parameters

4.5.2 Simulation Results with Different Preprocessing Procedures

4.6 Comparison between RFE-SVMs and the Nearest Shrunken Centroid Method

4.6.1 Basic Concept of Nearest Shrunken Centroid Method

4.6.2 Results on the Colon Cancer Data Set and the Lymphoma Data Set

4.7 Comparison of Genes’ Ranking with Different Algorithms

4.8 Conclusions

 


 

5 Semi-supervised Learning and Applications

5.1 Introduction

5.2 Gaussian Random Fields Model and Consistency Method

5.2.1 Gaussian Random Fields Model

5.2.2 Global Consistency Model

5.2.3 Random Walks on Graph

5.3 An Investigation of the Effect of Unbalanced Labeled Data on CM and GRFM Algorithms

5.3.1 Background and Test Settings

5.3.2 Results on the Rec Data Set

5.3.3 Possible Theoretical Explanations on the Effect of Unbalanced Labeled Data

5.4 Classifier Output Normalization: A Novel Decision Rule for Semi-supervised Learning Algorithm

5.5 Performance Comparison of Semi-supervised Learning Algorithms

5.5.1 Low Density Separation: Integration of Graph-Based Distances and rTSVM

5.5.2 Combining Graph-Based Distance with Manifold Approaches

5.5.3 Test Data Sets

5.5.4 Performance Comparison Between the LDS and the Manifold Approaches

5.5.5 Normalization Steps and the Effect of σ

5.6 Implementation of the Manifold Approaches

5.6.1 Variants of the Manifold Approaches Implemented in the Software Package SemiL

5.6.2 Implementation Details of SemiL

5.6.3 Conjugate Gradient Method with Box Constraints

5.6.4 Simulation Results on the MNIST Data Set

5.7 An Overview of Text Classification

5.8 Conclusions

 

6 Unsupervised Learning by Principal and Independent Component Analysis

6.1 Principal Component Analysis

6.2 Independent Component Analysis

6.3 Concluding Remarks

 

A Support Vector Machines

A.1 L2 Soft Margin Classifier

A.2 L2 Soft Regressor

A.3 Geometry and the Margin

 

B Matlab Code for ISDA Classification

 

C Matlab Code for ISDA Regression

 

D Matlab Code for Conjugate Gradient Method with Box Constraints

 

E Uncorrelatedness and Independence

 

F Independent Component Analysis by Empirical Estimation of Score Functions, i.e., Probability Density Functions

 

G SemiL User Guide

G.1 Installation

G.2 Input Data Format

G.2.1 Raw Data Format

G.3 Getting Started

G.3.1 Design Stage

Index

References

 


Copyright Huang, Kecman and Kopriva © 2006 All Rights Reserved