
Introduction of artificial intelligence algorithm

Algorithms, data, and computing power are the three cornerstones of artificial intelligence, and algorithms are central among them. So which algorithms does artificial intelligence involve, and which scenarios suit each of them?

1. By training method, machine learning algorithms fall into four categories: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning.

Common supervised learning algorithms include the following categories:

(1) Artificial neural networks: backpropagation, Boltzmann machine, convolutional neural network (CNN), Hopfield network, multilayer perceptron (MLP), radial basis function network (RBFN), restricted Boltzmann machine (RBM), recurrent neural network (RNN), self-organizing map (SOM), spiking neural network, etc.

(2) Bayesian: naive Bayes, Gaussian naive Bayes, multinomial naive Bayes, averaged one-dependence estimators (AODE), Bayesian belief network (BBN), Bayesian network (BN), etc.

(3) Decision trees: classification and regression tree (CART), iterative dichotomiser 3 (ID3), C4.5 algorithm, C5.0 algorithm, chi-squared automatic interaction detection (CHAID), decision stump, random forest, SLIQ (Supervised Learning In Quest), etc.

(4) Linear classifiers: Fisher's linear discriminant, linear regression, logistic regression, multinomial logistic regression, naive Bayes classifier, perceptron, support vector machine (SVM), etc.
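As a concrete illustration of one of these linear classifiers, here is a minimal perceptron in pure Python. The toy dataset, learning rate, and epoch count are assumptions for the example, not from the article:

```python
# Minimal perceptron sketch: learns a linear decision boundary on a
# small, linearly separable toy dataset.

def perceptron_train(X, y, epochs=20, lr=0.1):
    """Labels in y are +1 / -1; returns weights w and bias b."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            # A point is misclassified if the score's sign disagrees with yi.
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score <= 0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def perceptron_predict(w, b, xi):
    return 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else -1

# Toy data: points with large x0 + x1 are labeled +1.
X = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 0.8]]
y = [-1, -1, 1, 1]
w, b = perceptron_train(X, y)
```

On separable data like this, the perceptron update rule is guaranteed to converge to some separating boundary, though not a maximum-margin one as an SVM would find.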

Common unsupervised learning algorithms include:

(1) Artificial neural networks: generative adversarial network (GAN), feedforward neural network, logic learning machine, self-organizing map, etc.

(2) Association rule learning: Apriori algorithm, Eclat algorithm, FP-Growth algorithm, etc.
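A tiny sketch of the Apriori idea from the list above, in pure Python: grow candidate itemsets level by level and prune any whose support falls below a threshold. The transactions and threshold are illustrative assumptions, and the candidate-generation step is simplified relative to the full Apriori join/prune procedure:

```python
from itertools import combinations

def apriori(transactions, min_support=2):
    """Return a dict mapping each frequent itemset to its support count."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({i for t in transactions for i in t})

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t)

    frequent = {}
    # Level 1: frequent single items.
    current = [frozenset([i]) for i in items
               if support(frozenset([i])) >= min_support]
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        # Simplified join step: size-(k+1) candidates from the items
        # appearing in the current frequent itemsets.
        k += 1
        units = sorted({i for s in current for i in s})
        current = [frozenset(c) for c in combinations(units, k)
                   if support(frozenset(c)) >= min_support]
    return frequent

freq = apriori([["bread", "milk"],
                ["bread", "butter"],
                ["bread", "milk", "butter"],
                ["milk"]])
```

The key Apriori insight is that any superset of an infrequent itemset is itself infrequent, which is what justifies pruning level by level.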

(3) Hierarchical clustering: single-linkage clustering, conceptual clustering, etc.

(4) Cluster analysis: BIRCH algorithm, DBSCAN algorithm, expectation maximization (EM), fuzzy clustering, K-means clustering, K-medians clustering, mean-shift algorithm, etc.

(5) Anomaly detection: K-nearest neighbors (KNN) algorithm and local outlier factor (LOF) algorithm.
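The KNN side of this pair can be sketched very simply: score each point by its mean distance to its k nearest neighbors, so isolated points get high scores. The data and k are illustrative assumptions (LOF refines this idea by also normalizing against each neighbor's local density):

```python
import math

def knn_score(points, x, k=2):
    """Mean distance from x to its k nearest neighbors.

    Note: excluding neighbors by value (p != x) assumes points are unique.
    """
    dists = sorted(math.dist(x, p) for p in points if p != x)
    return sum(dists[:k]) / k

# Four points in a tight cluster plus one far-away outlier.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
scores = {p: knn_score(data, p) for p in data}
```

The outlier at (5.0, 5.0) receives a far larger score than any cluster member, so a simple threshold on the score flags it.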

Common semi-supervised learning approaches include generative models, low-density separation, graph-based methods, co-training, and so on.

Common reinforcement learning algorithms include Q-learning, state-action-reward-state-action (SARSA), deep Q-network (DQN), policy gradient methods, model-based reinforcement learning, and temporal difference (TD) learning.
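Tabular Q-learning, the first algorithm listed, can be sketched on a toy chain environment: states 0..3, actions move left or right, and reaching state 3 pays reward 1. The environment, seed, and hyperparameters are all illustrative assumptions:

```python
import random

random.seed(0)
N_STATES, ACTIONS = 4, (-1, +1)
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.2

def step(s, a):
    # Move along the chain, clipped to [0, N_STATES - 1]; reward 1 at the goal.
    s2 = min(max(s + a, 0), N_STATES - 1)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

for _ in range(500):                     # training episodes
    s = 0
    while s != N_STATES - 1:
        # Epsilon-greedy action selection.
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda a2: Q[(s, a2)])
        s2, r = step(s, a)
        # Q-learning update: bootstrap with the best next-state action.
        best_next = max(Q[(s2, a2)] for a2 in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

# Greedy policy for the non-terminal states.
policy = [max(ACTIONS, key=lambda a2: Q[(s, a2)]) for s in range(N_STATES - 1)]
```

After training, the greedy policy moves right (+1) in every state, since discounted future reward is maximized by heading straight for the goal.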

Common deep learning algorithms include deep belief networks (DBN), deep convolutional neural networks, deep recurrent neural networks, hierarchical temporal memory (HTM), deep Boltzmann machines (DBM), stacked autoencoders, generative adversarial networks, etc.

2. By the task to be solved, algorithms can be roughly divided into five types: binary classification, multi-class classification, regression, clustering, and anomaly detection.

1. Binary classification

(1) Two-class SVM: suitable for scenarios with many data features and a linear model.

(2) Two-class averaged perceptron: suitable for scenarios requiring short training time and a linear model.

(3) Two-class logistic regression: suitable for scenarios requiring short training time and a linear model.

(4) Two-class Bayes point machine: suitable for scenarios requiring short training time and a linear model.

(5) Two-class decision forest: suitable for scenarios requiring short training time and high accuracy.

(6) Two-class boosted decision tree: suitable for scenarios requiring short training time and high accuracy, at the cost of a large memory footprint.

(7) Two-class decision jungle: suitable for scenarios requiring short training time, high accuracy, and a small memory footprint.

(8) Two-class locally deep SVM: suitable for scenarios with many data features.

(9) Two-class neural network: suitable for scenarios requiring high accuracy where long training time is acceptable.

2. Multi-class classification

There are usually three schemes for solving a multi-class problem: first, adapt binary classifiers to the multi-class case through how the data set is split and applied; second, directly use a classifier with native multi-class ability; third, extend binary classifiers into multi-class classifiers.

Common algorithms:

(1) Multi-class logistic regression: suitable for scenarios requiring short training time and a linear model.

(2) Multi-class neural network: suitable for scenarios requiring high accuracy where long training time is acceptable.

(3) Multi-class decision forest: suitable for scenarios requiring high accuracy and short training time.

(4) Multi-class decision jungle: suitable for scenarios requiring high accuracy and a small memory footprint.

(5) One-vs-all (one-vs-rest): its performance depends on the underlying binary classifiers.
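The one-vs-rest reduction can be sketched in pure Python. Here a crude nearest-centroid scorer stands in for a real two-class learner (an SVM or logistic regression would be used in practice): one scorer is built per class, and the class whose scorer responds most strongly wins. The data are illustrative assumptions:

```python
def centroid(points):
    # Coordinate-wise mean of a list of points.
    return tuple(sum(c) / len(points) for c in zip(*points))

def score(center, x):
    # Higher score = closer to the class centroid (negated squared distance).
    return -sum((ci - xi) ** 2 for ci, xi in zip(center, x))

def ovr_fit(X, y):
    # One scorer per class: that class against everything else.
    return {c: centroid([xi for xi, yi in zip(X, y) if yi == c])
            for c in set(y)}

def ovr_predict(model, x):
    # The class whose scorer is most confident wins.
    return max(model, key=lambda c: score(model[c], x))

X = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0), (11.0, 0.0)]
y = ["a", "a", "b", "b", "c", "c"]
model = ovr_fit(X, y)
```

Swapping in a stronger binary learner changes only `ovr_fit` and `score`; the argmax-over-classes structure is what makes this a one-vs-rest scheme.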

3. Regression

Regression problems are usually used to predict specific numerical values rather than class labels; apart from the form of the result, the approach is similar to classification. Predicting a quantitative output, or continuous variable, is called regression; predicting a qualitative output, or discrete variable, is called classification. Common algorithms are:

(1) Ordinal regression: suitable for ranking data.

(2) Poisson regression: suitable for predicting event counts.

(3) Fast forest quantile regression: suitable for predicting distributions.

(4) Linear regression: suitable for scenarios requiring short training time and a linear model.

(5) Bayesian linear regression: suitable for linear models and scenarios with little training data.

(6) Neural network regression: suitable for scenarios requiring high accuracy where long training time is acceptable.

(7) Decision forest regression: suitable for scenarios requiring high accuracy and short training time.

(8) Boosted decision tree regression: suitable for scenarios requiring high accuracy and short training time, at the cost of a large memory footprint.
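The simplest of these, ordinary linear regression, has a closed-form solution. A one-variable sketch in pure Python, fitting y = a·x + b by least squares on data that follow y = 2x + 1 exactly (the numbers are illustrative assumptions):

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Slope = covariance(x, y) / variance(x); intercept from the means.
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    b = my - a * mx
    return a, b

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = fit_line(xs, ys)
```

Because the data lie exactly on a line, the fit recovers the coefficients a = 2 and b = 1; with noisy data the same formulas give the least-squares estimates.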

4. Clustering

The goal of clustering is to discover the underlying patterns and structure of data. Clustering is usually used to describe and measure the similarity between data points and to divide the data into clusters.

(1) Hierarchical clustering: suitable for scenarios requiring short training time on large data.

(2) K-means algorithm: suitable for scenarios requiring high accuracy and short training time.

(3) Fuzzy C-means (FCM): suitable for scenarios requiring high accuracy and short training time.

(4) SOM (self-organizing map): suitable for long-running scenarios.
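K-means, the second algorithm above, alternates two steps: assign each point to its nearest center, then move each center to its cluster's mean. A minimal pure-Python sketch; the fixed initial centers and toy data are illustrative assumptions (real runs use random restarts such as k-means++):

```python
import math

def kmeans(points, centers, iters=10):
    """Alternate assignment and centroid-update steps."""
    clusters = [[] for _ in centers]
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)),
                    key=lambda j: math.dist(p, centers[j]))
            clusters[i].append(p)
        # Update step: move each center to its cluster's mean
        # (empty clusters keep their old center).
        centers = [tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else ctr
                   for cl, ctr in zip(clusters, centers)]
    return centers, clusters

points = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5),
          (9.0, 9.0), (9.5, 9.0), (9.0, 9.5)]
centers, clusters = kmeans(points, centers=[(0.0, 0.0), (9.0, 9.0)])
```

On this well-separated toy data the algorithm splits the points into the two obvious clusters of three and places each center at its cluster mean.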

5. Anomaly detection

Anomaly detection refers to detecting and marking abnormal or atypical points in data; it is sometimes called deviation detection.

Anomaly detection looks like a supervised learning problem: both are classification problems that predict and judge sample labels. In fact the two differ greatly, because in anomaly detection the positive samples (anomalies) are very few. Commonly used algorithms are:

(1) One-class SVM: suitable for scenarios with many data features.

(2) PCA-based anomaly detection: suitable for scenarios requiring short training time.
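A sketch of the PCA-based approach, under illustrative assumptions: inliers lie near the line y = x, one synthetic outlier sits far off it, and a point is scored by its reconstruction error after projecting onto the first principal component:

```python
import numpy as np

rng = np.random.default_rng(0)
# 50 inliers near the line y = x, plus one outlier (the last row).
t = rng.normal(size=50)
X = np.column_stack([t, t + 0.05 * rng.normal(size=50)])
X = np.vstack([X, [0.0, 4.0]])

# Center the data and take the top principal component from the SVD.
mu = X.mean(axis=0)
Xc = X - mu
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
pc = Vt[0]

# Reconstruction error after projecting onto the first component:
# inliers project almost perfectly, the outlier does not.
proj = (Xc @ pc[:, None]) * pc[None, :]
errors = np.linalg.norm(Xc - proj, axis=1)
outlier_index = int(np.argmax(errors))
```

The training step is just one SVD of the data matrix, which is why this method suits the short-training-time scenarios mentioned above.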

Common transfer learning categories include inductive transfer learning, transductive transfer learning, and unsupervised transfer learning.

Applicable scenarios of an algorithm: the factors to consider when choosing one are:

(1) The size, quality, and characteristics of the data.

(2) The nature of the problem in the specific business scenario that machine learning is to solve.

(3) The acceptable computation time.

(4) The required accuracy of the algorithm.

————————————————

Original link:/nfzhlk/article/details/82725769