# The Perceptron Learning Algorithm

1. Learning
  • Supervised learning
    • Learning by example (LBE)
      • Preparation of the training set is often the most expensive part of a project
      • N-dimensional space, M patterns means M N-vectors plus M identifications
      • Must find W such that the perceptron is correct for all M patterns
    • Learning by query (oracle)
  • Unsupervised learning
    • Clustering, vector quantization
    • Learning the underlying probability distributions
  • Reinforcement learning
    • Good dog - bad dog!
    • Less expensive than LBE
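The training-set bullet above (M N-vectors plus M identifications) can be made concrete. A minimal sketch in Python; the names `patterns`, `targets`, and `classifies_all`, and the data values, are illustrative, not from the notes:

```python
# M = 4 patterns in N = 3 dimensional space:
# M N-vectors plus M identifications (+/-1 targets)
patterns = [(1, -1, 1), (-1, -1, 1), (1, 1, -1), (-1, 1, 1)]
targets  = [1, -1, 1, -1]

def classifies_all(W, theta, patterns, targets):
    """Check whether weight vector W and threshold theta are
    correct for all M patterns: sign(W . x - theta) == target."""
    for x, t in zip(patterns, targets):
        s = sum(wi * xi for wi, xi in zip(W, x)) - theta
        if (1 if s > 0 else -1) != t:
            return False
    return True
```

For this toy data, W = (1, 0, 0) with threshold 0 is one weight vector that is correct for all M patterns.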
2. LBE for the perceptron
  • Turning the tables on McCulloch and Pitts
  • Easy case - solve linear equations
    • Linear equations are solved by the techniques of linear algebra
    • Works only for a small number of patterns (small alpha = M/N)
  • General case - solve linear inequalities
    • Linear inequalities are solved by the techniques of linear programming
    • Simplex, Khachiyan, Karmarkar algorithms
    • Normally very complex
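The easy case can be sketched with numpy (the data values are illustrative): with M = N patterns the linear system X W = y has an exact solution, and since each output then equals its +/-1 target, the signs come out correct automatically:

```python
import numpy as np

# M = N = 2: two patterns in 2-dimensional space, one per row
X = np.array([[1.0, 2.0],
              [2.0, 1.0]])
y = np.array([1.0, -1.0])    # desired +/-1 outputs

# Solve X W = y exactly by linear algebra; because each output
# equals its target exactly, sign(X W) = y as well.
W = np.linalg.solve(X, y)
print(np.sign(X @ W))
```

For M > N the system is overdetermined and generally has no exact solution, which is why the general case must fall back on linear inequalities.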
3. Rosenblatt Perceptron Learning Algorithm (PLA)
  • Iterative algorithm
    • Loop over the patterns, correcting the perceptron for every misclassified pattern
    • If a solution exists, the PLA will find one in a finite number of iterations!
    • If no solution exists, the loop runs forever
    • How to select the next pattern?
  • Geometric interpretation of the PLA
    • A misclassified pattern lies on the wrong side of the present hyperplane
    • Rotate the hyperplane in the correct direction
    • Even if the pattern is now classified correctly, subsequent corrections may undo this
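The iterative loop above can be sketched in pure Python (the function name `pla` and the AND-function training data are illustrative; a constant input of 1 is appended so the threshold is learned as an extra weight):

```python
def pla(patterns, labels, max_epochs=100):
    """Rosenblatt PLA sketch: loop over the patterns and correct
    the weights for every misclassified pattern."""
    w = [0.0] * (len(patterns[0]) + 1)
    for _ in range(max_epochs):
        errors = 0
        for x, y in zip(patterns, labels):
            xa = list(x) + [1.0]                 # augmented input
            s = sum(wi * xi for wi, xi in zip(w, xa))
            if (1 if s > 0 else -1) != y:        # misclassified?
                # rotate the hyperplane toward the correct side
                w = [wi + y * xi for wi, xi in zip(w, xa)]
                errors += 1
        if errors == 0:                          # all patterns correct
            return w
    return None                                  # no solution found in budget

# learn logical AND in +/-1 encoding (linearly separable)
X = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
Y = [-1, -1, -1, 1]
w = pla(X, Y)
```

Note that a correction can break a previously correct pattern; only when a full pass produces no errors has a solution been found.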
4. Perceptron Learning Convergence Theorem
  • We know that there is a solution W*
  • At iteration n we have the present perceptron W
  • Observe the cosine of the angle between W and W*
    • The numerator grows at least linearly with n
    • The denominator grows as the square root of n
    • Thus the cosine grows as the square root of n
  • Since a cosine can never exceed 1, the number of iterations must remain finite!
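The cosine argument above can be written out. A sketch of the standard bound, under the usual normalizing assumptions (patterns with norm at most 1, and a unit-norm solution W* with margin delta > 0):

```latex
% Assume \lVert x \rVert \le 1 for all patterns, and a solution W^*
% with \lVert W^* \rVert = 1 and margin y\,(W^* \cdot x) \ge \delta > 0.
% Each correction on a misclassified pattern is W_n = W_{n-1} + y\,x.
%
% Numerator: grows at least linearly with n.
W_n \cdot W^* \;\ge\; W_{n-1} \cdot W^* + \delta \;\ge\; n\,\delta
%
% Denominator: grows at most as the square root of n, because the
% pattern was misclassified, so y\,(W_{n-1} \cdot x) \le 0.
\lVert W_n \rVert^2
  \;=\; \lVert W_{n-1} \rVert^2 + 2\,y\,(W_{n-1} \cdot x) + \lVert x \rVert^2
  \;\le\; \lVert W_{n-1} \rVert^2 + 1
  \;\le\; n
%
% Hence the cosine grows as the square root of n:
\cos\theta_n \;=\; \frac{W_n \cdot W^*}{\lVert W_n \rVert \, \lVert W^* \rVert}
  \;\ge\; \frac{n\,\delta}{\sqrt{n}} \;=\; \delta\,\sqrt{n}
%
% Since \cos\theta_n \le 1, the number of corrections is finite:
n \;\le\; \frac{1}{\delta^2}
```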

## Assignment

1. Prove that the majority function (O=+1 iff a majority of the inputs are +1) is linearly separable.
2. Write a program implementing the majority function for odd N (N = 3, 5, 7, ...).
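One possible sketch toward both parts (not the only approach): a perceptron with all weights +1 and threshold 0 computes the majority function, since for odd N the sum of +/-1 inputs is never zero and its sign is the majority vote. Exhibiting these weights is also the core of the separability proof in part 1:

```python
from itertools import product

def majority_perceptron(x):
    """Perceptron with weight vector W = (1, ..., 1) and threshold 0.
    For odd N the weighted sum of +/-1 inputs is never zero, and its
    sign equals the majority vote."""
    s = sum(x)                      # W . x with all weights equal to 1
    return 1 if s > 0 else -1

# exhaustive check against the majority definition for N = 3, 5, 7
for n in (3, 5, 7):
    for x in product((-1, 1), repeat=n):
        expected = 1 if x.count(1) > n // 2 else -1
        assert majority_perceptron(x) == expected
```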