DATA MINING

Answer the following questions:

- What are the various types of classifiers?
- What is a rule-based classifier?
- What is the difference between nearest neighbor and naïve Bayes classifiers?
- What is logistic regression?

Research

Watch:

- **https://www.youtube.com/watch?v=yFKVI7vgPPs**

Read:

- ch. 4 in textbook: Classification: Alternative Techniques

Classification: Alternative Techniques

Lecture Notes for Chapter 4 of Introduction to Data Mining, by Tan, Steinbach, Kumar

Types of Classifiers

- Binary vs. Multiclass
- Deterministic vs. Probabilistic
- Linear vs. Nonlinear
- Global vs. Local
- Generative vs. Discriminative

Rule-Based Classifiers

- How does it work?
- Properties of a Rule Set
- Direct Methods for Rule Extraction
- Learn-One-Rule function
- Instance Elimination
- Indirect Methods for Rule Extraction
- Characteristics of Rule-Based Classifiers
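
As a concrete illustration of how a rule-based classifier works, a rule set can be sketched as an ordered list of (condition, class) pairs with a default class for records no rule covers. The attribute names and rules here are hypothetical toy examples, not from the textbook:

```python
# Each rule is a (predicate, class) pair; the first matching rule fires.
# A default class handles records not covered by any rule.
rules = [
    (lambda r: r["outlook"] == "sunny" and r["humidity"] == "high", "no"),
    (lambda r: r["outlook"] == "rain", "yes"),
]

def rule_classify(record, rules, default="yes"):
    for cond, label in rules:
        if cond(record):
            return label
    return default

print(rule_classify({"outlook": "sunny", "humidity": "high"}, rules))  # → "no"
```

Ordering the rules (a decision list) resolves conflicts when a record matches more than one rule.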

Nearest Neighbor Classifiers

- Algorithm
- Computes the distance or similarity between each test instance and all training examples.
- Characteristics – Review 4.3.2
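
The algorithm above, distance to all training examples followed by a majority vote among the k nearest, can be sketched directly; the points and labels are made up for illustration:

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, test_point, k=3):
    """Classify test_point by majority vote among its k nearest training examples."""
    # Euclidean distance from the test instance to every training example
    dists = [(math.dist(x, test_point), label) for x, label in zip(train_X, train_y)]
    dists.sort(key=lambda d: d[0])
    # Majority vote among the k closest labels
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

X = [(1, 1), (1, 2), (6, 6), (7, 7)]
y = ["a", "a", "b", "b"]
print(knn_predict(X, y, (2, 2), k=3))  # → "a"
```

Because all work happens at prediction time, this is a "lazy" learner: there is no training step beyond storing the data.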

Naïve Bayes Classifier

- Basics of Probability Theory
- Bayes Theorem
- Bayes theorem provides the statistical principle for combining evidence from multiple sources with prior beliefs to arrive at predictions. In brief, for a class Y and observed attributes X: P(Y | X) = P(X | Y) P(Y) / P(X).
- Classification
- Class conditional
- Generative classification
- Prior probability
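
A minimal categorical naïve Bayes ties these pieces together: estimate the prior P(y) and the class-conditional P(x_i | y) from counts, then apply Bayes theorem under the naive independence assumption. The weather-style toy data is invented, and the Laplace smoothing is an added assumption, not something the outline specifies:

```python
from collections import Counter, defaultdict

def train_nb(X, y):
    """Estimate prior P(y) and class-conditional P(x_i | y) from counts."""
    priors = Counter(y)
    cond = defaultdict(Counter)  # (feature index, class) -> value counts
    for xs, label in zip(X, y):
        for i, v in enumerate(xs):
            cond[(i, label)][v] += 1
    return priors, cond

def predict_nb(priors, cond, xs):
    """Pick the class maximizing P(y) * prod_i P(x_i | y) (naive independence)."""
    n = sum(priors.values())
    best, best_p = None, -1.0
    for label, count in priors.items():
        p = count / n  # prior probability
        for i, v in enumerate(xs):
            c = cond[(i, label)]
            p *= (c[v] + 1) / (sum(c.values()) + 2)  # Laplace smoothing
        if p > best_p:
            best, best_p = label, p
    return best

X = [("sunny", "hot"), ("sunny", "mild"), ("rain", "mild"), ("rain", "cool")]
y = ["no", "no", "yes", "yes"]
priors, cond = train_nb(X, y)
print(predict_nb(priors, cond, ("rain", "mild")))  # → "yes"
```

This is a generative classifier: it models how each class generates the attributes, rather than modeling the decision boundary directly.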

Bayesian Network

- Graphical Representation
- Conditional Independence
- Joint Probability
- Use of Hidden Variables
- Inference and Learning
- Variable Elimination
- Sum-Product Algorithm for Trees
- Generalizations for Non-Tree Graphs
- Learning Model Parameters
- Characteristics of Bayesian Networks
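
Joint probability and variable elimination can be illustrated on the smallest possible network, a two-node chain A → B, with made-up conditional probability tables:

```python
# Chain network A -> B: the joint factorizes as P(A, B) = P(A) * P(B | A)
P_A = {True: 0.3, False: 0.7}
P_B_given_A = {True: {True: 0.9, False: 0.1},
               False: {True: 0.2, False: 0.8}}

def joint(a, b):
    """Joint probability from the network's factored form."""
    return P_A[a] * P_B_given_A[a][b]

# Variable elimination: sum out A to get the marginal P(B = True)
P_B_true = sum(joint(a, True) for a in (True, False))
print(P_B_true)  # 0.3*0.9 + 0.7*0.2 = 0.41
```

In larger networks the same idea, pushing sums inside products, is what the sum-product algorithm exploits.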

Logistic Regression

- Generalized Linear Model
- Learning Model Parameters
- Characteristics
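
A bare-bones logistic regression, fit by stochastic gradient descent on the cross-entropy loss, can be sketched as follows; the learning rate, epoch count, and toy 1-D data are arbitrary choices for this sketch:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.1, epochs=1000):
    """Fit weights and bias by stochastic gradient descent on cross-entropy loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xs, t in zip(X, y):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, xs)) + b)
            err = p - t  # gradient of cross-entropy w.r.t. the linear score
            w = [wi - lr * err * xi for wi, xi in zip(w, xs)]
            b -= lr * err
    return w, b

# Toy 1-D data: class 1 for larger x
X = [(0.0,), (1.0,), (2.0,), (3.0,)]
y = [0, 0, 1, 1]
w, b = train_logreg(X, y)
print(sigmoid(w[0] * 2.5 + b) > 0.5)  # → True
```

The sigmoid link over a linear score is what makes this a generalized linear model; the output is a probability, not just a class label.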

Artificial Neural Network (ANN)

- Perceptron
- Learning the Perceptron
- Multi-layer Neural Network
- Learning Model Parameters
- Characteristics of ANN
- Universal approximators
- Review 4.7.3
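
The perceptron learning rule, nudging the weights toward each misclassified example, fits in a few lines; the AND-style data set is a standard linearly separable toy example, not taken from the chapter:

```python
def train_perceptron(X, y, lr=0.1, epochs=20):
    """Perceptron learning rule: update weights only on misclassified examples."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xs, t in zip(X, y):  # target t in {-1, +1}
            pred = 1 if sum(wi * xi for wi, xi in zip(w, xs)) + b > 0 else -1
            if pred != t:  # mistake-driven update toward the true class
                w = [wi + lr * t * xi for wi, xi in zip(w, xs)]
                b += lr * t
    return w, b

# Linearly separable AND-style data
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [-1, -1, -1, 1]
w, b = train_perceptron(X, y)
preds = [1 if sum(wi * xi for wi, xi in zip(w, xs)) + b > 0 else -1 for xs in X]
print(preds)  # → [-1, -1, -1, 1]
```

On linearly separable data the perceptron is guaranteed to converge; multi-layer networks drop that guarantee but gain the universal-approximation property noted above.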

Deep Learning

- Using Synergistic Loss Functions
- Saturation of outputs and cross-entropy loss function
- Using Responsive Activation Functions
- Vanishing gradient problem and ReLU
- Regularization
- Dropout
- Initialization of Model Parameters
- Supervised and unsupervised pretraining
- Use of autoencoders and hybrid pretraining
- Characteristics of Deep Learning
- Review 4.8.5
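
The vanishing-gradient contrast between a saturating sigmoid and ReLU can be shown numerically; the 1e-4 cutoff below is just an illustrative threshold:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)  # at most 0.25; near zero when |z| is large (saturation)

def relu_grad(z):
    return 1.0 if z > 0 else 0.0  # constant slope for active units

z = 10.0  # a strongly activated unit
print(sigmoid_grad(z) < 1e-4)  # → True: the saturated sigmoid barely passes gradient
print(relu_grad(z))            # → 1.0: ReLU passes the gradient unchanged
```

Stacking many saturated sigmoid layers multiplies these tiny gradients together, which is the vanishing gradient problem ReLU is meant to mitigate.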

Support Vector Machine (SVM)

- Margin of a Separating Hyperplane
- Rationale for maximum margin
- Linear SVM
- Learning model parameters
- Soft-margin SVM
- Regularizer of Hinge Loss
- Nonlinear SVM
- Attribute transformation
- Learning a non-linear SVM Model
- Characteristics of SVM
- Review Section 4.9.5
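
The soft-margin objective, a regularizer plus the hinge loss, can be evaluated directly for a fixed weight vector. This sketch only computes the loss on toy 1-D data; it does not solve the SVM optimization problem:

```python
def hinge_loss(w, b, X, y, C=1.0):
    """Soft-margin SVM objective: ||w||^2 / 2 + C * sum of hinge losses."""
    margin_term = 0.5 * sum(wi * wi for wi in w)
    hinge = sum(max(0.0, 1 - t * (sum(wi * xi for wi, xi in zip(w, xs)) + b))
                for xs, t in zip(X, y))
    return margin_term + C * hinge

# Both points lie on the correct side, outside the margin,
# so only the regularizer contributes
X = [(2.0,), (-2.0,)]
y = [1, -1]
print(hinge_loss([1.0], 0.0, X, y))  # → 0.5
```

The parameter C trades margin width against margin violations: large C penalizes violations heavily, small C prefers a wider margin.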

Ensemble Methods

- Rationale for Ensemble Methods
- Methods for Constructing an Ensemble Classifier
- Bias-Variance Decomposition
- Bagging
- Boosting
- AdaBoost
- Random Forests
- Empirical Comparison among Ensemble methods
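
Bagging's two ingredients, bootstrap sampling and majority voting, can be sketched as follows; the three stand-in "models" are constant functions used only to demonstrate the vote:

```python
import random
from collections import Counter

def bootstrap_sample(X, y, rng):
    """Draw a bootstrap replicate: n examples sampled with replacement."""
    idx = [rng.randrange(len(X)) for _ in range(len(X))]
    return [X[i] for i in idx], [y[i] for i in idx]

def bagging_predict(models, xs):
    """Majority vote over the base classifiers' predictions."""
    votes = Counter(m(xs) for m in models)
    return votes.most_common(1)[0][0]

rng = random.Random(0)
X = [(1,), (2,), (3,), (4,)]
y = ["a", "a", "b", "b"]
Xb, yb = bootstrap_sample(X, y, rng)  # training set for one base classifier

# Stand-ins for classifiers trained on different bootstrap samples
models = [lambda xs: "a", lambda xs: "b", lambda xs: "a"]
print(bagging_predict(models, (1.0,)))  # → "a"
```

Averaging many high-variance base classifiers trained on different bootstrap samples is what lets bagging reduce the variance term of the bias-variance decomposition.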

Class Imbalance Problem

- Building Classifiers with Class Imbalance
- Oversampling and undersampling
- Assigning scores to test instances
- Evaluating Performance with Class Imbalance
- Finding an Optimal Score Threshold
- Aggregate Evaluation of Performance
- ROC Curve
- Precision-Recall Curve
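
Assigning scores to test instances and thresholding them, as above, yields one (precision, recall) point per threshold; sweeping the threshold traces the precision-recall curve. The scores and labels below are invented for illustration:

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall of the positive class at a given score threshold."""
    tp = sum(1 for s, t in zip(scores, labels) if s >= threshold and t == 1)
    fp = sum(1 for s, t in zip(scores, labels) if s >= threshold and t == 0)
    fn = sum(1 for s, t in zip(scores, labels) if s < threshold and t == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [1, 0, 1, 0, 0]
print(precision_recall(scores, labels, 0.5))  # → (0.666..., 1.0)
```

With imbalanced classes this is more informative than accuracy, since a classifier that always predicts the majority class scores zero recall on the rare class.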

Multiclass Problem

- A multiclass problem is one in which the instances are divided into more than two classes.
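
One standard reduction, one-vs-rest, turns a k-class problem into k binary problems, one classifier per class; the labels here are made up for illustration:

```python
# One-vs-rest: build one binary target vector per class,
# marking that class as 1 and every other class as 0
classes = ["cat", "dog", "bird"]
y = ["cat", "dog", "cat", "bird"]

binary_targets = {c: [1 if t == c else 0 for t in y] for c in classes}
print(binary_targets["cat"])  # → [1, 0, 1, 0]
```

A binary classifier is then trained on each target vector, and a test instance is assigned the class whose classifier gives the highest score.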