What is big data?

The difference between traditional data and big data

First, before big data appeared, computer science relied heavily on models and algorithms. To reach an accurate conclusion, people had to build a model that described the problem, work out the logic, understand cause and effect, and design a sophisticated algorithm whose results came close to reality. Whether a problem could be solved well therefore depended on whether the modeling was reasonable, and competition among algorithms became the key to success or failure.

The emergence of big data has fundamentally changed this dependence on modeling and algorithms. Suppose two algorithms, A and B, solve the same problem. On a small amount of data, algorithm A clearly outperforms algorithm B; in other words, judged on the algorithm alone, A gives better results. As the amount of data grows, however, algorithm B running on a large amount of data turns out to give better results than algorithm A running on a small amount of data. This finding was a landmark for computer science and the fields derived from it: as data grows larger, the data itself, rather than the algorithms and models used to study it, increasingly underpins the validity of the analysis. Even without a precise algorithm, enough data can yield a conclusion close to the facts. For this reason, data has been called a new productive force.
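The A-versus-B comparison can be made concrete with a small simulation. The sketch below is only an illustration under stated assumptions, not a result from the article: the task (estimating a noisy quantity), the two estimators, the sample sizes, and the names `algorithm_a`, `algorithm_b`, and `rmse` are all hypothetical. Algorithm A averages every observation, while the cruder algorithm B averages only every fourth one.

```python
import math
import random


def algorithm_a(samples):
    """The 'better' algorithm: average every observation."""
    return sum(samples) / len(samples)


def algorithm_b(samples):
    """The 'cruder' algorithm: average only every fourth observation."""
    subset = samples[::4]
    return sum(subset) / len(subset)


def rmse(estimator, n_samples, trials=200, true_value=5.0, noise=2.0, seed=0):
    """Root-mean-square error of an estimator over repeated simulated datasets."""
    rng = random.Random(seed)  # fixed seed so the comparison is repeatable
    squared_errors = []
    for _ in range(trials):
        samples = [rng.gauss(true_value, noise) for _ in range(n_samples)]
        squared_errors.append((estimator(samples) - true_value) ** 2)
    return math.sqrt(sum(squared_errors) / trials)


a_small = rmse(algorithm_a, n_samples=20)    # good algorithm, little data
b_small = rmse(algorithm_b, n_samples=20)    # crude algorithm, little data
b_large = rmse(algorithm_b, n_samples=2000)  # crude algorithm, much more data

print(f"A on 20 samples:   error {a_small:.3f}")
print(f"B on 20 samples:   error {b_small:.3f}")
print(f"B on 2000 samples: error {b_large:.3f}")
```

On equal data A has the smaller error, but B given a hundred times more data has a smaller error than A on the small dataset, which is the pattern the paragraph describes.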

Second, if there is enough data, we can draw a conclusion without knowing the specific causal relationship.

For example, when Google translates for users, it does not set up grammar and translation rules. Instead, it compares the input against the vocabulary habits of all users collected in Google's database and recommends the most commonly used translation. In this process the computer may not know the logic behind the problem, but as more and more user behavior is recorded, it can provide the most reliable result without that knowledge. Massive data, together with the analytical tools for processing it, thus provides a brand-new way to understand the world.
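The idea of recommending the most commonly used translation, without any grammar rules, can be sketched with plain frequency counts. This is a toy illustration and makes no claim about how Google's system is actually built; the phrase pairs and the helpers `build_table` and `recommend` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical (source phrase, translation) pairs standing in for recorded usage.
observed_pairs = [
    ("bonjour", "hello"),
    ("bonjour", "hello"),
    ("bonjour", "good morning"),
    ("merci", "thank you"),
    ("merci", "thanks"),
    ("merci", "thank you"),
]


def build_table(pairs):
    """Count how often each translation was used for each source phrase."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source][target] += 1
    return table


def recommend(table, source):
    """Return the most frequently used translation, or None if the phrase is unseen."""
    counts = table.get(source)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


table = build_table(observed_pairs)
print(recommend(table, "bonjour"))
print(recommend(table, "merci"))
```

Nothing in the code knows why a translation is correct; it only reflects what was used most often, so its recommendations improve as more usage is recorded.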

Third, because it can handle a variety of data structures, big data can make the fullest use of the human behavior data recorded on the Internet. Before big data emerged, any data a computer was to process first had to be structured and stored in a corresponding database. Big data technology greatly relaxes these structural requirements. The many kinds of information people leave on the Internet, such as social information, geographic location, behavioral habits, and preferences, can be processed in real time, so that the characteristics of each individual can be sketched out in a complete, multidimensional way.
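As a rough illustration of working with loosely structured records, the sketch below groups events of different shapes into one profile per user without forcing them into a fixed schema first. The event format, the field names (`user`, `type`, and so on), and the function `build_profiles` are assumptions made for this example, and it processes a small batch rather than a real-time stream.

```python
import json
from collections import defaultdict

# Hypothetical events: each record carries different fields depending on its type.
raw_events = [
    '{"user": "u1", "type": "post", "text": "Great hiking trail!"}',
    '{"user": "u1", "type": "location", "lat": 39.9, "lon": 116.4}',
    '{"user": "u2", "type": "like", "item": "jazz playlist"}',
    '{"user": "u1", "type": "like", "item": "trail shoes"}',
]


def build_profiles(lines):
    """Group heterogeneous events by user and by event type."""
    profiles = defaultdict(lambda: defaultdict(list))
    for line in lines:
        event = json.loads(line)
        user = event.pop("user")
        kind = event.pop("type")
        # Whatever fields remain are kept as-is; no shared schema is required.
        profiles[user][kind].append(event)
    return profiles


profiles = build_profiles(raw_events)
for user, by_kind in profiles.items():
    print(user, {kind: len(events) for kind, events in by_kind.items()})
```

Each user's profile accumulates social, location, and preference records side by side, which is the multidimensional picture the paragraph refers to.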

One tool that was developed early and has done well in the field of big data is Octopus Collector.