Principal Components

Forrest Young's Notes

Copyright © 1999 by Forrest W. Young.


Overview of Principal Components

Principal Components
Principal Components is a statistical technique that summarizes the variance in a group of variables by linear combinations of those variables. The linear combinations are computed to maximize the variance accounted for.
    Purpose
    To summarize the variation in several numeric variables by a smaller number of variables called components.
    Linear Combinations
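    The components are the linear combinations (weighted sums) of the original variables that explain the maximum variance. The weights are scaled to unit length, since otherwise the variance could be made arbitrarily large simply by inflating the weights.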

    The First Principal Component
    The first principal component summarizes as much variance in the variables as can be summarized by any single linear combination.
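
    A minimal sketch of this idea, assuming NumPy and a small simulated data set (the data, the variable names, and the choice of an eigendecomposition of the covariance matrix are illustrative assumptions, not part of these notes). It computes the first component as a unit-length linear combination and compares its variance with that of many random unit-length combinations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 observations on 3 correlated numeric variables.
latent = rng.normal(size=(200, 1))
X = latent @ np.array([[2.0, 1.0, 0.5]]) + rng.normal(scale=0.5, size=(200, 3))

# Center each variable so variance is measured about the mean.
Xc = X - X.mean(axis=0)

# Sample covariance matrix of the variables (columns).
S = np.cov(Xc, rowvar=False)

# eigh returns eigenvalues of a symmetric matrix in ascending order,
# so the last eigenvector is the direction of largest variance.
eigvals, eigvecs = np.linalg.eigh(S)
w1 = eigvecs[:, -1]

# Scores on the first principal component: one weighted sum per observation.
pc1 = Xc @ w1
pc1_var = pc1.var(ddof=1)

# Variances of 1000 random unit-length linear combinations, for comparison.
W = rng.normal(size=(1000, 3))
W /= np.linalg.norm(W, axis=1, keepdims=True)
random_vars = (Xc @ W.T).var(axis=0, ddof=1)

print("variance of first component:", pc1_var)
print("largest variance among random combinations:", random_vars.max())
```

    None of the random combinations should exceed the variance of the first component, which equals the largest eigenvalue of the covariance matrix.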

    Additional Principal Components
    Additional principal components are at right angles to all preceding ones (are "orthogonal" to them) and account for as much of the remaining variance as possible.
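
    The orthogonality and the decreasing variances can be checked numerically. The following is a sketch under the same assumptions as before (NumPy, simulated data, hypothetical variable names); it is one common way to obtain the components, not necessarily the procedure used elsewhere in these notes.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 150 observations on 4 numeric variables,
# mixed by a random matrix so that the variables are correlated.
X = rng.normal(size=(150, 4)) @ rng.normal(size=(4, 4))
Xc = X - X.mean(axis=0)

S = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(S)

# Reorder so the component with the largest variance comes first.
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], eigvecs[:, order]

# The weight vectors are mutually orthogonal with unit length: V'V = I.
gram = V.T @ V

# The component scores are uncorrelated: their covariance matrix is
# diagonal, with the component variances (eigenvalues) on the diagonal.
scores = Xc @ V
score_cov = np.cov(scores, rowvar=False)

print(np.round(gram, 6))
print(np.round(score_cov, 6))
```

    The first printed matrix should be the identity, and the second should be diagonal with entries that decrease from the first component to the last.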

    The first several principal components

    1. explain as much variation in the raw data as can be explained by that many orthogonal linear combinations.
    2. represent the longest directions in the raw data (those along which the data are most spread out) that are mutually at right angles to each other.
    3. form a rigid, orthogonal rotation of the original raw data into an orientation where the new dimensions have maximum variance.
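
    The three properties listed above can be illustrated with a short sketch, again assuming NumPy and simulated data (all names are hypothetical). Here the components come from a singular value decomposition of the centered data, which is an assumption of this example. Note that the orthogonal transformation returned by the SVD may include a reflection as well as a rotation; either way, distances between observations are unchanged.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: 100 observations on 3 correlated numeric variables.
mix = np.array([[3.0, 0.0, 0.0],
                [1.0, 1.0, 0.0],
                [0.5, 0.2, 0.3]])
X = rng.normal(size=(100, 3)) @ mix
Xc = X - X.mean(axis=0)

# SVD of the centered data: the rows of Vt are the component weight
# vectors, already ordered by decreasing singular value.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt.T

# Rigid: the distance between any two observations is unchanged.
d_raw = np.linalg.norm(Xc[0] - Xc[1])
d_rot = np.linalg.norm(scores[0] - scores[1])

# The total variance is unchanged; it is only redistributed so that the
# first components carry as much of it as possible.
total_raw = Xc.var(axis=0, ddof=1).sum()
comp_var = scores.var(axis=0, ddof=1)
total_rot = comp_var.sum()

# Cumulative proportion of variance explained by the first k components.
cum_prop = np.cumsum(comp_var) / total_raw

print("distance before and after:", d_raw, d_rot)
print("total variance before and after:", total_raw, total_rot)
print("cumulative proportion explained:", cum_prop)
```

    The cumulative proportions show how much of the variation the first one, two, or three components explain; the last entry is 1 because keeping every component loses nothing.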