CS667
The Perceptron Learning Algorithm
- Learning
- Supervised learning
- Learning by example
- Preparation of training set often most expensive part of project
- In N-dimensional space, M patterns means M N-vectors plus M identifications (labels); see the sketch after this section
- Must find W such that perceptron is correct for all M patterns
- Learning by query (oracle)
- Unsupervised learning
- Clustering, vector quantization
- Learning the underlying probability distributions
- Reinforcement learning
- Good dog - bad dog!
- Less expensive than LBE
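
A minimal sketch (in Python, with hypothetical data and names) of what a supervised training set of M patterns in N-dimensional space looks like, and what "correct for all M patterns" means for a simple threshold perceptron:

    import numpy as np

    # Hypothetical training set: M patterns in N-dimensional space,
    # each an N-vector X[i] plus an identification (label) y[i] in {-1, +1}.
    N, M = 3, 4
    X = np.array([[+1., +1., -1.],
                  [+1., -1., -1.],
                  [-1., +1., +1.],
                  [-1., -1., +1.]])        # M rows, each an N-vector
    y = np.array([+1., -1., +1., -1.])     # M identifications

    def perceptron_output(W, x):
        """Threshold unit: O = +1 if W . x >= 0, else -1."""
        return 1.0 if np.dot(W, x) >= 0 else -1.0

    def correct_on_all_patterns(W, X, y):
        """True iff the perceptron with weights W classifies all M patterns correctly."""
        return all(perceptron_output(W, x) == t for x, t in zip(X, y))

    # Supervised learning means finding such a W; here we only test a candidate.
    W = np.array([0.0, 1.0, 0.0])
    print(correct_on_all_patterns(W, X, y))   # True for this toy data
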
- LBE for perceptron
- Turning the tables on McCulloch and Pitts
- Easy case - solve linear equations
- Linear equations are solved by techniques of linear algebra
- Only for small number of patterns (small alpha)
- General case - solve linear inequalities
- Linear inequalities are solved by techniques of linear programming (see the sketch after this section)
- Simplex, Khachiyan, and Karmarkar algorithms
- Normally very complex
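
A hedged sketch of the general case: the inequalities y_i (W · X_i) > 0 can be handed to a linear-programming solver by asking for a feasible W with margin 1. The sketch below uses SciPy's linprog; the data and variable names are illustrative only, not from the lecture.

    import numpy as np
    from scipy.optimize import linprog

    # Illustrative patterns and labels (separable: label = sign of first component)
    X = np.array([[+1., +1.],
                  [+1., -1.],
                  [-1., +1.],
                  [-1., -1.]])
    y = np.array([+1., +1., -1., -1.])

    # Want y_i * (W . X_i) >= 1, i.e. -(y_i * X_i) . W <= -1 for every i.
    # With a zero objective this is a pure feasibility problem.
    A_ub = -(y[:, None] * X)
    b_ub = -np.ones(len(y))
    c = np.zeros(X.shape[1])

    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * X.shape[1])
    print(res.status, res.x)   # status 0 means a separating W was found
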
- Rosenblatt Perceptron Learning Algorithm (PLA)
- Iterative algorithm
- Start with arbitrary perceptron
- Loop on patterns, correct perceptron for every misclassified pattern (see the sketch after this section)
- If there is a solution, PLA will find one in a finite number of iterations!
- If no solution, then loop forever
- How to select the next pattern
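
A minimal sketch of the PLA as outlined above (start from an arbitrary W, sweep over the patterns in a fixed order, correct on every misclassification). The data and the epoch cap are illustrative assumptions.

    import numpy as np

    def pla(X, y, max_epochs=1000):
        """Rosenblatt's Perceptron Learning Algorithm on patterns X with +/-1 labels y."""
        W = np.zeros(X.shape[1])          # arbitrary starting perceptron
        for _ in range(max_epochs):       # cap only guards against non-separable data
            errors = 0
            for x, t in zip(X, y):        # loop on patterns
                if np.sign(np.dot(W, x)) != t:   # misclassified pattern
                    W = W + t * x         # correct the perceptron toward label t
                    errors += 1
            if errors == 0:               # a full error-free pass: solution found
                return W
        raise RuntimeError("no solution found within max_epochs")

    # Hypothetical separable example: the label is the sign of the first input.
    X = np.array([[+1., +2.], [+2., -1.], [-1., +1.], [-2., -2.]])
    y = np.array([+1., +1., -1., -1.])
    print(pla(X, y))
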
- Geometric interpretation of the PLA
- Misclassified pattern on wrong side of present hyperplane
- Rotate hyperplane in correct direction
- Even if the pattern is now classified correctly, subsequent steps may undo the correction (see the worked step after this section)
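
A short worked step (in LaTeX, using the standard bipolar convention, which may differ slightly from the lecture's notation): for a misclassified pattern X with label y we have y (W · X) <= 0, and after the correction

    \[
      W' \;=\; W + y\,X
      \quad\Longrightarrow\quad
      y\,(W'\cdot X) \;=\; y\,(W\cdot X) + y^{2}\,\|X\|^{2}
      \;=\; y\,(W\cdot X) + \|X\|^{2},
    \]

so the margin of that pattern improves by ||X||^2 > 0: the hyperplane rotates in the correct direction. For another pattern X_j the same update changes y_j (W · X_j) by y y_j (X · X_j), which can be negative, which is why a later correction may undo an earlier one.
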
- Perceptron Learning Convergence Theorem
- We know that there is a solution W*
- At iteration n we have the present perceptron W
- Observe cosine of angle between W and W*
- Numerator increases linearly with n
- Denominator increases as square root of n
- Thus the cosine grows at least as fast as the square root of n
- Since a cosine can never exceed 1, the number of iterations must remain finite!
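
A compact version of the argument above (in LaTeX, starting from W_0 = 0; here δ = min_i y_i (W* · X_i) > 0 and R = max_i ||X_i|| are properties of the training set, not quantities named in the lecture). Each correction adds at least δ to W · W* and at most R² to ||W||², so after n corrections

    \[
      \cos\theta_n \;=\; \frac{W_n \cdot W^{*}}{\|W_n\|\,\|W^{*}\|}
      \;\ge\; \frac{n\,\delta}{\sqrt{n}\,R\,\|W^{*}\|}
      \;=\; \sqrt{n}\,\frac{\delta}{R\,\|W^{*}\|}
      \;\le\; 1
      \quad\Longrightarrow\quad
      n \;\le\; \left(\frac{R\,\|W^{*}\|}{\delta}\right)^{2}.
    \]
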
Assignment
- Prove that the majority function
(O=+1 iff a majority of the inputs are +1)
is linearly separable.
- Write a program implementing the majority function for odd N
(N=3, 5, 7, ...)
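
One possible reading of the programming part, as a plain (non-perceptron) sketch in Python; the linear-separability proof is left as stated.

    def majority(inputs):
        """Majority function: O = +1 iff a majority of the +/-1 inputs are +1 (odd N)."""
        if len(inputs) % 2 == 0:
            raise ValueError("N must be odd")
        return +1 if sum(1 for v in inputs if v == +1) > len(inputs) // 2 else -1

    print(majority([+1, -1, +1]))            # +1
    print(majority([-1, -1, +1, -1, +1]))    # -1
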