The PCA Algorithm for Machine Learning: Principal Component Analysis

Foreword: Among the algorithms of artificial intelligence and machine learning, three main categories are usually distinguished: 1) classification; 2) regression; 3) clustering. Today we focus on the PCA algorithm.

PCA (Principal Component Analysis) is one of the ten classic machine learning algorithms. It is a multivariate statistical method proposed by Pearson in 1901 and later developed by Hotelling in 1933.

For data with many dimensions, the first thing to do is reduce the dimensionality while preserving the essence of the data. Dimensionality reduction is a data preprocessing technique, often applied before the data is fed to other algorithms. It removes redundant information and noise, making the data simpler and more efficient to process, thereby improving processing speed and saving a great deal of time and cost. It has become a very widely used preprocessing method. There are many dimensionality reduction techniques, such as singular value decomposition (SVD), principal component analysis (PCA), factor analysis (FA), and independent component analysis (ICA). Today we focus on principal component analysis (PCA).

The purpose of the PCA (Principal Component Analysis) algorithm is to convert high-dimensional data into low-dimensional data while losing as little "information" as possible. The extracted principal components, which display the largest individual differences, can also be used to reduce the number of variables in regression analysis and cluster analysis, thereby reducing the amount of computation.

PCA (Principal Component Analysis) is commonly used for the exploration and visualization of high-dimensional data sets, and can also be used for data compression and data preprocessing.

Analysis of the PCA algorithm for machine learning

PCA algorithm concept:

PCA (Principal Component Analysis), also known as the Karhunen-Loeve Transform (KLT), is a technique used to explore the structure of high-dimensional data.

PCA is one of the most commonly used dimensionality reduction techniques. The idea of PCA is to map the original n-dimensional features onto k new dimensions (k < n), which are brand-new orthogonal features. These k-dimensional features, called principal components, are reconstructed from the original features rather than simply selected from them. In PCA, the data is converted from the original coordinate system to a new coordinate system, and the choice of the new coordinate system is determined by the data itself. The first new coordinate axis is the direction of largest variance in the original data; the second new coordinate axis is the direction orthogonal to the first with the largest remaining variance; and this process repeats as many times as there are features in the original data. Most of the variance is contained in the first few new axes, so the remaining axes can be ignored, which performs the dimensionality reduction on the data.

The essence of the PCA algorithm:

The essence of the PCA algorithm is to find projection directions such that the variance of the data along these directions is maximal and the directions are mutually orthogonal. This is really the process of finding a new orthogonal basis. We compute the variance of the original data projected onto each basis vector: the larger the variance, the more information the corresponding basis vector carries. The larger an eigenvalue of the covariance matrix of the original data, the larger the corresponding variance, and the more information is captured by projecting onto the corresponding eigenvector. Conversely, small eigenvalues mean that the data projected onto those eigenvectors carries very little information, so the directions corresponding to the small eigenvalues can be discarded, achieving dimensionality reduction.
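A minimal NumPy sketch (synthetic data, purely illustrative) that checks this correspondence between the eigenvalues of the covariance matrix and the variance of the projections:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 2-D data with correlated features (illustration only)
X = rng.normal(size=(500, 2)) @ np.array([[3.0, 1.0], [1.0, 0.5]])
X_centered = X - X.mean(axis=0)          # center each feature

C = np.cov(X_centered, rowvar=False)     # covariance matrix of the features
eigvals, eigvecs = np.linalg.eigh(C)     # eigh: C is symmetric

# Variance of the data projected onto each eigenvector
proj_var = np.var(X_centered @ eigvecs, axis=0, ddof=1)
print(np.allclose(proj_var, eigvals))    # True: eigenvalue == projected variance
```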

PCA synthesizes potentially high-dimensional variables into linearly independent low-dimensional variables, called principal components. The new low-dimensional data set retains as much of the variance of the original data as possible.

In short, PCA essentially takes the directions of largest variance as the main features and decorrelates the data along each orthogonal direction, that is, it makes the data uncorrelated across the different orthogonal directions.

Terms in the PCA algorithm:

1. Sample "information amount"

The "information amount" of a sample refers to the variance of the sample's projection in the feature direction. The larger the variance, the greater the difference in the features of the sample, so the more important the feature. In the classification problem, the larger the sample variance, the easier it is to distinguish samples of different categories.

2. Variance

We want the projected values to be as spread out as possible after projection, and this degree of dispersion is measured by the variance. In statistics, the variance describes how each observation deviates from the population mean. Here, the variance of a field can be taken as the mean of the squared differences between each element and the field's mean, namely:

$$\mathrm{Var}(a) = \frac{1}{m}\sum_{i=1}^{m}(a_i - \mu)^2$$

3. Covariance

For the problem of reducing two dimensions to one, it is enough to find the direction that maximizes the variance. For higher-dimensional problems, however, the covariance is also needed to express the correlation between features, namely:

$$\mathrm{Cov}(a, b) = \frac{1}{m}\sum_{i=1}^{m}(a_i - \mu_a)(b_i - \mu_b)$$

When the covariance is zero, the two features are linearly uncorrelated; PCA chooses each new axis to be uncorrelated with the axes already chosen.
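As a quick check of these two formulas, a small NumPy sketch with made-up numbers:

```python
import numpy as np

a = np.array([2.0, 4.0, 6.0, 8.0])
b = np.array([1.0, 3.0, 2.0, 5.0])
m = len(a)

var_a = np.sum((a - a.mean()) ** 2) / m               # Var(a), population form
cov_ab = np.sum((a - a.mean()) * (b - b.mean())) / m  # Cov(a, b)

print(var_a, cov_ab)   # 5.0 and 2.75; matches np.cov(a, b, ddof=0)
```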

PCA theoretical basis:

1) Maximum variance theory.

2) Minimum error theory.

3) Coordinate axis correlation theory.

PCA algorithm process:

1) Center the data, that is, subtract from each feature its own mean;

2) Compute the covariance matrix;

3) Compute the eigenvalues and eigenvectors of the covariance matrix;

4) Sort the eigenvalues from largest to smallest;

5) Retain the eigenvectors corresponding to the largest eigenvalues;

6) Transform the data into the new space constructed from the retained eigenvectors, as in the sketch below.
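A minimal from-scratch sketch of these six steps in NumPy (the input X and the target dimension k are illustrative placeholders):

```python
import numpy as np

def pca(X: np.ndarray, k: int) -> np.ndarray:
    """Reduce X (n_samples x n_features) to k dimensions via PCA."""
    # 1) Center: subtract each feature's mean
    X_centered = X - X.mean(axis=0)
    # 2) Covariance matrix of the features
    cov = np.cov(X_centered, rowvar=False)
    # 3) Eigendecomposition (eigh: cov is symmetric)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # 4) Sort eigenvalues from largest to smallest
    order = np.argsort(eigvals)[::-1]
    # 5) Keep the eigenvectors of the k largest eigenvalues
    components = eigvecs[:, order[:k]]
    # 6) Project the centered data onto the new basis
    return X_centered @ components

# Illustrative usage with random data
X = np.random.default_rng(1).normal(size=(100, 5))
X_reduced = pca(X, k=2)
print(X_reduced.shape)   # (100, 2)
```

Using np.linalg.eigh rather than np.linalg.eig exploits the symmetry of the covariance matrix and guarantees real eigenvalues.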

PCA dimensionality reduction criteria:

1) Nearest reconstruction: over all points in the sample set, the total error between each reconstructed point and its original point should be as small as possible (measured in the sketch below).
2) Maximum separability: the projections of the samples in the low-dimensional space are kept as spread out as possible.
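To make the first criterion concrete, the following sketch (synthetic data again) reconstructs the data from the top two principal components and measures the total squared reconstruction error, which the principal directions minimize among all rank-2 linear projections:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 4)) @ rng.normal(size=(4, 4))  # correlated features
Xc = X - X.mean(axis=0)

eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
W = eigvecs[:, np.argsort(eigvals)[::-1][:2]]   # top-2 principal directions

Z = Xc @ W                # project down to 2 dimensions
X_rec = Z @ W.T           # reconstruct back in the original space

# Total squared reconstruction error over all sample points
print(np.sum((Xc - X_rec) ** 2))
```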

Advantages of PCA algorithm:

1) Make the data set easier to use;

2) Reduce the computational cost of the algorithm;

3) Remove noise;

4) Make the results easy to understand;

5) It is completely parameter-free; the result depends only on the data, not on user-set parameters.

Disadvantages of PCA algorithm:

1) If the user has prior knowledge of the observed objects and has already mastered some characteristics of the data, but cannot intervene in the processing through parameterization or similar means, the expected result may not be obtained and efficiency may suffer;

2) Eigenvalue decomposition has limitations; for example, the decomposed matrix must be square (in practice this is handled by applying SVD to the centered data matrix, as sketched after this list);

3) For data that is not Gaussian-distributed, the principal components obtained by the PCA method may not be optimal.
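Because the data matrix is generally rectangular rather than square, implementations often compute PCA through the singular value decomposition (SVD) of the centered data matrix instead of eigendecomposing the covariance matrix. A minimal NumPy sketch of this alternative (synthetic data, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(150, 6))            # 150 samples, 6 features: not square
Xc = X - X.mean(axis=0)

# SVD of the centered data; right singular vectors = principal directions
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 2
X_reduced = Xc @ Vt[:k].T                # same projection eigendecomposition gives
explained_var = S**2 / (len(X) - 1)      # singular values -> covariance eigenvalues
print(X_reduced.shape, explained_var[:k])
```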

PCA algorithm application:

The PCA algorithm has been widely used in the exploration and visualization of high-dimensional data sets, and can also be used in data compression, data preprocessing, and other fields. In machine learning it is applied to image, speech, and communication analysis, among others. The main purpose of the PCA algorithm is dimensionality reduction: removing redundant information and noise from the data, making the data simpler and more efficient, and improving the computational efficiency of other machine learning tasks.
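In practice one would typically reach for a library implementation rather than hand-rolling the decomposition; a minimal sketch using scikit-learn's PCA (random data as a stand-in for a real data set):

```python
import numpy as np
from sklearn.decomposition import PCA

# Example: compress 10-dimensional data to 2 components for visualization
X = np.random.default_rng(4).normal(size=(300, 10))

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)              # centers and projects in one call

print(X_2d.shape)                        # (300, 2)
print(pca.explained_variance_ratio_)     # fraction of variance kept per axis
```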

Conclusion:

PCA is a commonly used data analysis method. Through a linear transformation, PCA converts the original data into a set of linearly independent coordinates, which can be used to identify and extract the main feature components of the data: the coordinate axes are rotated toward the most important directions of the data (those of maximum variance); eigenvalue analysis then determines how many principal components to retain, and the remaining components are discarded, realizing the dimensionality reduction. Dimensionality reduction makes data simpler and more efficient, speeding up data processing and saving a great deal of time and cost, and it has become a very widely used data preprocessing method. The PCA algorithm is widely used in the exploration and visualization of high-dimensional data sets, and can also be used in data compression, data preprocessing, and in image, speech, and communication analysis, among other fields.
