CS667
The Perceptron Learning Algorithm
- Learning
- Supervised learning
- Learning by example
- Preparation of training set often most expensive part of project
- In N-dimensional space, M patterns means M N-vectors plus M identifications (labels); see the sketch after this section
- Must find W such that perceptron is correct for all M patterns
- Learning by query (oracle)
- Unsupervised learning
- Clustering, vector quantization
- Learning the underlying probability distributions
- Reinforcement learning
- Good dog - bad dog!
- Less expensive than LBE
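
A minimal sketch (in Python, with hypothetical data and names) of what a supervised training set of M patterns in N-dimensional space looks like, and what "correct for all M patterns" means for a simple threshold perceptron:

    import numpy as np

    # Hypothetical training set: M patterns in N-dimensional space,
    # each an N-vector X[i] plus an identification (label) y[i] in {-1, +1}.
    N, M = 3, 4
    X = np.array([[+1., +1., -1.],
                  [+1., -1., -1.],
                  [-1., +1., +1.],
                  [-1., -1., +1.]])        # M rows, each an N-vector
    y = np.array([+1., -1., +1., -1.])     # M identifications

    def perceptron_output(W, x):
        """Threshold unit: O = +1 if W . x >= 0, else -1."""
        return 1.0 if np.dot(W, x) >= 0 else -1.0

    def correct_on_all_patterns(W, X, y):
        """True iff the perceptron with weights W classifies all M patterns correctly."""
        return all(perceptron_output(W, x) == t for x, t in zip(X, y))

    # Supervised learning means finding such a W; here we only test a candidate.
    W = np.array([0.0, 1.0, 0.0])
    print(correct_on_all_patterns(W, X, y))   # True for this toy data
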
- LBE for perceptron
- Turning the tables on McCulloch and Pitts
- Easy case - solve linear equations
- Linear equations are solved by techniques of linear algebra
- Only for small number of patterns (small alpha)
- General case - solve linear inequalities
- Linear inequalities are solved by techniques of linear programming (see the sketch after this section)
- Simplex, Khachiyan, and Karmarkar algorithms
- Normally very complex
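
A hedged sketch of the general case: the inequalities y_i (W · X_i) > 0 can be handed to a linear-programming solver by asking for a feasible W with margin 1. The sketch below uses SciPy's linprog; the data and variable names are illustrative only, not from the lecture.

    import numpy as np
    from scipy.optimize import linprog

    # Illustrative patterns and labels (separable: label = sign of first component)
    X = np.array([[+1., +1.],
                  [+1., -1.],
                  [-1., +1.],
                  [-1., -1.]])
    y = np.array([+1., +1., -1., -1.])

    # Want y_i * (W . X_i) >= 1, i.e. -(y_i * X_i) . W <= -1 for every i.
    # With a zero objective this is a pure feasibility problem.
    A_ub = -(y[:, None] * X)
    b_ub = -np.ones(len(y))
    c = np.zeros(X.shape[1])

    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * X.shape[1])
    print(res.status, res.x)   # status 0 means a separating W was found
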
- Rosenblatt Perceptron Learning Algorithm (PLA)
- Iterative algorithm
- Start with arbitrary perceptron
- Loop on patterns, correct perceptron for every misclassified pattern (see the sketch after this section)
- If there is a solution, PLA will find one in a finite number of iterations!
- If no solution, then loop forever
- How to select the next pattern
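
A minimal sketch of the PLA as outlined above (start from an arbitrary W, sweep over the patterns in a fixed order, correct on every misclassification). The data and the epoch cap are illustrative assumptions.

    import numpy as np

    def pla(X, y, max_epochs=1000):
        """Rosenblatt's Perceptron Learning Algorithm on patterns X with +/-1 labels y."""
        W = np.zeros(X.shape[1])          # arbitrary starting perceptron
        for _ in range(max_epochs):       # cap only guards against non-separable data
            errors = 0
            for x, t in zip(X, y):        # loop on patterns
                if np.sign(np.dot(W, x)) != t:   # misclassified pattern
                    W = W + t * x         # correct the perceptron toward label t
                    errors += 1
            if errors == 0:               # a full error-free pass: solution found
                return W
        raise RuntimeError("no solution found within max_epochs")

    # Hypothetical separable example: the label is the sign of the first input.
    X = np.array([[+1., +2.], [+2., -1.], [-1., +1.], [-2., -2.]])
    y = np.array([+1., +1., -1., -1.])
    print(pla(X, y))
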
- Geometric interpretation of the PLA
- Misclassified pattern on wrong side of present hyperplane
- Rotate hyperplane in correct direction
- Even if the pattern is now classified correctly, subsequent steps may undo the correction (see the worked step after this section)
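
A short worked step (in LaTeX, using the standard bipolar convention, which may differ slightly from the lecture's notation): for a misclassified pattern X with label y we have y (W · X) <= 0, and after the correction

    \[
      W' \;=\; W + y\,X
      \quad\Longrightarrow\quad
      y\,(W'\cdot X) \;=\; y\,(W\cdot X) + y^{2}\,\|X\|^{2}
      \;=\; y\,(W\cdot X) + \|X\|^{2},
    \]

so the margin of that pattern improves by ||X||^2 > 0: the hyperplane rotates in the correct direction. For another pattern X_j the same update changes y_j (W · X_j) by y y_j (X · X_j), which can be negative, which is why a later correction may undo an earlier one.
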
- Perceptron Learning Convergence Theorem
- We know that there is a solution W*
- At iteration n we have the present perceptron W
- Observe cosine of angle between W and W*
- Numerator increases linearly with n
- Denominator increases as square root of n
- Thus the cosine grows at least as fast as the square root of n
- Since a cosine can never exceed 1, the number of iterations must remain finite!
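
A compact version of the argument above (in LaTeX, starting from W_0 = 0; here δ = min_i y_i (W* · X_i) > 0 and R = max_i ||X_i|| are properties of the training set, not quantities named in the lecture). Each correction adds at least δ to W · W* and at most R² to ||W||², so after n corrections

    \[
      \cos\theta_n \;=\; \frac{W_n \cdot W^{*}}{\|W_n\|\,\|W^{*}\|}
      \;\ge\; \frac{n\,\delta}{\sqrt{n}\,R\,\|W^{*}\|}
      \;=\; \sqrt{n}\,\frac{\delta}{R\,\|W^{*}\|}
      \;\le\; 1
      \quad\Longrightarrow\quad
      n \;\le\; \left(\frac{R\,\|W^{*}\|}{\delta}\right)^{2}.
    \]
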
Assignment
- Prove that the majority function
(O=+1 iff a majority of the inputs are +1)
is linearly separable.
- Write a program implementing the majority function for odd N
(N=3, 5, 7, ...)
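
One possible reading of the programming part, as a plain (non-perceptron) sketch in Python; the linear-separability proof is left as stated.

    def majority(inputs):
        """Majority function: O = +1 iff a majority of the +/-1 inputs are +1 (odd N)."""
        if len(inputs) % 2 == 0:
            raise ValueError("N must be odd")
        return +1 if sum(1 for v in inputs if v == +1) > len(inputs) // 2 else -1

    print(majority([+1, -1, +1]))            # +1
    print(majority([-1, -1, +1, -1, +1]))    # -1
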