How Artificial Intelligence Differs from Machine Learning
Artificial intelligence is not a new term; it has been around for decades. It took off in the early 1980s, when computer scientists designed algorithms that could learn and mimic human behavior. On the learning side, the most important algorithm was the neural network, which was not very successful because the models were too powerful for the data available to support them. Nevertheless, the idea of using data to fit a function for narrower, more specific tasks was a huge success, and it forms the basis of machine learning. On the imitation side, AI has been applied widely in image recognition, speech recognition and natural language processing. Experts spent a great deal of time crafting edge detection, color profiles, N-gram language models, syntax trees and the like, with unsurprisingly mediocre results.
Traditional Machine Learning
Machine Learning (ML) techniques play an important role in prediction. ML has gone through multiple generations, with a complete set of model structures such as:
- Linear Regression
- Logistic Regression
- Decision Trees
- Support Vector Machines
- Bayesian Models
- Regularized Models
- Ensemble Models
- Neural Networks
Each predictive model is based on a certain algorithmic structure with parameters that can be adjusted. Training a predictive model involves the following steps (a short sketch follows the list):
1. Selecting a model structure (e.g., logistic regression, random forest, etc.).
2. Feed the model with training data (inputs and outputs).
3. The learning algorithm will output the optimal model (i.e., the model with specific parameters that minimize the training error).
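As a minimal sketch of these three steps, assuming scikit-learn and a small synthetic dataset (both are illustrative choices, not prescribed by this article):

```python
# A minimal sketch of the three training steps, using an illustrative
# synthetic dataset and scikit-learn.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Training data for step 2: inputs X and outputs y.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Step 1: select a model structure (here, logistic regression).
model = LogisticRegression()

# Steps 2-3: feed the data to the learning algorithm, which returns the
# fitted parameters that minimize the training error.
model.fit(X, y)

print(model.coef_, model.intercept_)   # the learned parameters
print(model.score(X, y))               # accuracy on the training data
```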
Each model has its own characteristics, performing well on some tasks and not so well on others. But in general, we can categorize models into low-power (simple) models and high-power (complex) models. Choosing between them is a very tricky issue. Traditionally, it was better to use a low-power/simple model rather than a high-power/complex model, for the following reasons:
- Without a lot of processing power, training high-power models takes a long time.
- Without a huge amount of data, training high-power models leads to overfitting: because high-power models have a rich set of parameters that can adapt to a wide range of data shapes, we may end up with a model that fits the current training data very closely but predicts poorly on future data (see the small sketch after this list).
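Here is a toy illustration of that overfitting risk, assuming only NumPy and a made-up noisy linear trend (none of the numbers come from the article):

```python
# A toy illustration of overfitting: a handful of noisy points from a
# simple linear trend, fit by a low-power line and a high-power
# degree-9 polynomial.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 12)
y = 2 * x + rng.normal(scale=0.1, size=x.size)   # simple trend plus noise

low = np.polyfit(x, y, deg=1)    # low-power model: a straight line
high = np.polyfit(x, y, deg=9)   # high-power model: many free parameters

# Evaluate both on fresh points from the same underlying trend: the
# high-power fit typically tracks the training noise and predicts worse
# between the training points.
x_new = np.linspace(0.05, 0.95, 100)
y_new = 2 * x_new
print("line error:", np.mean((np.polyval(low, x_new) - y_new) ** 2))
print("degree-9 error:", np.mean((np.polyval(high, x_new) - y_new) ** 2))
```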
However, choosing a low-power model runs into the "underfitting" problem: the model structure is too simple to fit the training data when the underlying relationship is more complex. (Suppose the data follows the quadratic relationship y = 5*x²; there is no way for a linear regression y = a*x + b to fit it, no matter what a and b we choose.)
To mitigate the underfitting problem, data scientists often apply their "domain knowledge" to create input features that are more directly related to the output. (Returning to the quadratic relationship y = 5*x², if we introduce the engineered feature z = x², we can then fit the linear regression y = a*z + b by picking a = 5 and b = 0.)
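A small sketch of this feature-engineering fix, assuming NumPy (the numbers follow the article's y = 5*x² example; the variable names are mine):

```python
# Feature engineering lets a plain linear model represent y = 5*x**2.
import numpy as np

x = np.linspace(-3, 3, 50)
y = 5 * x ** 2                      # the true quadratic relationship

# Without the engineered feature, y = a*x + b underfits badly.
a_raw, b_raw = np.polyfit(x, y, deg=1)

# With the engineered feature z = x**2, the same linear model fits exactly.
z = x ** 2
a_feat, b_feat = np.polyfit(z, y, deg=1)

print(a_raw, b_raw)    # poor linear fit on raw x
print(a_feat, b_feat)  # approximately a = 5, b = 0
```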
One of the major obstacles to machine learning is this feature engineering step, which requires domain experts to identify important signals before entering the training process. The feature engineering step is very manual and requires a great deal of domain expertise, making it a major bottleneck for most machine learning tasks today. In other words, if we don't have enough processing power and enough data, then we have to use low-power/simple models, which require a lot of time and effort to create appropriate input features. This is what most data scientists spend their time doing.
The Return of Neural Networks
In the early 2000s, the era of big data brought massive collection of fine-grained event data, and machine processing power increased dramatically with advances in cloud computing and massively parallel processing infrastructure. We are no longer limited to low-power/simple models. For example, two of the most popular mainstream machine learning models today are Random Forests and Gradient Boosted Trees. However, even though both are very powerful nonlinear models for fitting training data, data scientists still need to carefully craft features to achieve good performance.
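For illustration, here is a brief sketch of fitting both ensemble models, assuming scikit-learn and a synthetic dataset; note that the input columns are still hand-chosen features:

```python
# Fitting the two mainstream ensemble models on an illustrative dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for model in (RandomForestClassifier(random_state=0),
              GradientBoostingClassifier(random_state=0)):
    # Both are powerful nonlinear models, but the columns of X are still
    # features someone had to choose.
    model.fit(X_train, y_train)
    print(type(model).__name__, model.score(X_test, y_test))
```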
Meanwhile, computer scientists revisited neural networks with many layers for these human-mimicking tasks. This gave the reborn DNNs (deep neural networks) a major breakthrough in the tasks of image classification and speech recognition.
The main difference with DNNs is that you can feed the raw signal (e.g., RGB pixel values) directly into the network without creating any domain-specific input features. Through multiple layers of neurons (which is why it is called a "deep" neural network), the network automatically generates the appropriate features layer by layer and finally produces a good prediction. This greatly reduces the effort of "feature engineering", one of the main bottlenecks for data scientists.
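As a minimal sketch of feeding raw pixels straight into a deep network, assuming PyTorch; the 28x28 input size, layer widths and 10 output classes are illustrative choices, not taken from this article:

```python
# Raw pixel values go straight into a stack of layers; no hand-crafted
# input features are required.
import torch
import torch.nn as nn

# Raw signal: a batch of 32 grayscale images, flattened to 784 pixel values.
pixels = torch.rand(32, 28 * 28)

# Several stacked layers ("deep"): each layer learns its own intermediate
# representation of the signal.
dnn = nn.Sequential(
    nn.Linear(28 * 28, 256), nn.ReLU(),
    nn.Linear(256, 64), nn.ReLU(),
    nn.Linear(64, 10),            # e.g., scores for 10 classes
)

logits = dnn(pixels)
print(logits.shape)   # torch.Size([32, 10])
```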
DNNs have also evolved into many different network architectures, so we have CNNs (Convolutional Neural Networks), RNNs (Recurrent Neural Networks), LSTMs (Long Short-Term Memory), GANs (Generative Adversarial Networks), Transfer Learning, Attention Models and more. The whole spectrum is known as "Deep Learning", and it is the focus of today's entire machine learning community.
Reinforcement Learning
Another key component is mimicking how a person (or animal) learns. Think of the very natural perception/action/reward cycle in animal behavior: a person or animal first understands the environment by sensing what "state" it is in; based on that, it chooses an "action" that takes it to another "state"; it then receives a "reward"; and the cycle repeats.
This approach to learning (called reinforcement learning) is very different from the curve-fitting approach of traditional supervised machine learning. In particular, learning happens very quickly, because each new piece of feedback (performing an action and receiving a reward) is immediately used to influence subsequent decisions. Reinforcement learning has had huge success in self-driving cars as well as in AlphaGo (the Go-playing program).
Reinforcement learning also provides a smooth integration of prediction and optimization: it maintains a belief about the current state and the probabilities of possible transitions under different actions, and then decides which action will bring the best outcome.
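To make the perception/action/reward loop concrete, here is a toy sketch of tabular Q-learning on a hand-made five-state corridor; the environment, rewards and hyperparameters are all invented for illustration:

```python
# A toy sense/act/reward loop: the agent starts at state 0 and earns a
# reward only when it reaches state 4 at the right end of the corridor.
import random

n_states, n_actions = 5, 2                 # actions: 0 = left, 1 = right
Q = [[0.0] * n_actions for _ in range(n_states)]
alpha, gamma, epsilon = 0.1, 0.9, 0.2      # learning rate, discount, exploration

for episode in range(300):
    state = 0
    while state != n_states - 1:
        # Sense the current state, then pick an action (epsilon-greedy,
        # breaking ties at random).
        if random.random() < epsilon:
            action = random.randrange(n_actions)
        else:
            best = max(Q[state])
            action = random.choice([a for a in range(n_actions) if Q[state][a] == best])
        # Act: move left or right; the reward arrives immediately.
        next_state = max(0, state - 1) if action == 0 else min(n_states - 1, state + 1)
        reward = 1.0 if next_state == n_states - 1 else 0.0
        # Each piece of feedback updates the value estimates right away,
        # influencing the very next decision.
        Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
        state = next_state

print([round(max(q), 2) for q in Q])   # learned values rise toward the goal state
```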
Deep Learning + Reinforcement Learning = Artificial Intelligence
Deep learning provides more powerful predictive models than classical machine learning techniques and usually produces good predictions. Reinforcement learning provides a faster learning mechanism and is more adaptive to changes in the environment than classical optimization models.