Data mining and big data, OLAP, data statistics

In the field of big data, terms such as data mining, OLAP and data statistics come up constantly, yet many people are unsure what they mean. This article introduces data statistics, OLAP, data mining and big data to give you a preliminary understanding of these technologies.

1. Data analysis level

Data analysis is a broad concept. In theory, any process that computes over and processes data to draw meaningful conclusions counts as data analysis. Judged by the complexity of the data itself and by the complexity and depth of the processing, data analysis can be divided into four levels: data statistics, OLAP, data mining and big data.

2. Data statistics

Data statistics is the most basic and traditional form of data analysis and has existed since ancient times. It refers to sorting, screening (filtering), calculating and counting data with statistical methods in order to draw meaningful conclusions.
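As a minimal sketch, the four operations mentioned above (sorting, screening, calculating and counting) can be shown with the Python standard library. The student records, the `summarize` function and the passing score of 60 are all made up for illustration and do not come from any real data set.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical records: (student, class, score). Invented for illustration.
RECORDS = [
    ("Ann", "A", 82),
    ("Bob", "A", 67),
    ("Cai", "B", 91),
    ("Dee", "B", 58),
    ("Eli", "A", 75),
]


def summarize(rows, passing=60):
    """Apply sorting, screening, calculating and counting to the rows."""
    ranked = sorted(rows, key=lambda r: r[2], reverse=True)  # sorting
    passed = [r for r in rows if r[2] >= passing]            # screening
    scores = [r[2] for r in rows]
    return {
        "top": ranked[0][0],                                 # highest scorer
        "passed": len(passed),                               # rows that pass
        "mean": mean(scores),                                # calculating
        "median": median(scores),
        "per_class": Counter(r[1] for r in rows),            # counting
    }


summary = summarize(RECORDS)
print(summary)
```

Everything here is a simple aggregate over one flat table, which is what separates this level from the multidimensional and algorithmic levels that follow.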

3. OLAP

OLAP stands for online analytical processing, which refers to online multidimensional statistical analysis built on a data warehouse. It lets users examine an indicator interactively from multiple dimensions, which supports decision-making. Beyond reporting what has already happened, OLAP is meant to help answer what is likely to happen next and what would happen if a particular action were taken.
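Viewing one indicator from several dimensions can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how a real OLAP engine or data warehouse is implemented. The fact table, the dimension names (`year`, `region`, `product`) and the helper functions `roll_up` and `slice_cube` are hypothetical.

```python
from collections import defaultdict

# Hypothetical fact table: three dimensions and one indicator ("sales").
FACTS = [
    {"year": 2023, "region": "North", "product": "Books", "sales": 120},
    {"year": 2023, "region": "South", "product": "Books", "sales": 80},
    {"year": 2023, "region": "North", "product": "Pens", "sales": 40},
    {"year": 2024, "region": "North", "product": "Books", "sales": 150},
    {"year": 2024, "region": "South", "product": "Pens", "sales": 60},
]


def roll_up(facts, dims, measure="sales"):
    """Sum the indicator, grouped by the chosen dimensions."""
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[d] for d in dims)
        totals[key] += row[measure]
    return dict(totals)


def slice_cube(facts, **fixed):
    """Keep only rows where each named dimension equals the given value."""
    return [r for r in facts if all(r[k] == v for k, v in fixed.items())]


# The same indicator seen along different dimensions.
by_year = roll_up(FACTS, ["year"])
by_year_region = roll_up(FACTS, ["year", "region"])
north_by_product = roll_up(slice_cube(FACTS, region="North"), ["product"])

print(by_year)
print(by_year_region)
print(north_by_product)
```

Changing the `dims` list corresponds to rolling up or drilling down, and `slice_cube` fixes one dimension to a single value before aggregating.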

4. Data mining technology

Data mining refers to discovering previously unknown, hidden and potentially useful patterns in massive data. Algorithms such as association analysis, cluster analysis and time series analysis can reveal underlying causes that cannot be found by looking at charts alone, so that targeted management measures can then be taken.
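To give a feel for one of the algorithm families named above, the sketch below computes support and confidence for item pairs, the basic quantities behind association rule mining. It assumes that the analysis of relationships between items mentioned in the text means association analysis. The baskets, the function name `pair_rules` and the 0.4 support threshold are invented for illustration, and real data mining works on far larger data with more efficient algorithms.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions, each a set of items bought together.
BASKETS = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"bread", "eggs"},
    {"milk", "eggs"},
    {"milk", "bread", "butter"},
]


def pair_rules(baskets, min_support=0.4):
    """Return rules (a -> b) for item pairs that meet the support threshold.

    support(a, b)     = share of baskets containing both a and b
    confidence(a->b)  = share of baskets with a that also contain b
    """
    n = len(baskets)
    item_count = Counter(item for b in baskets for item in b)
    pair_count = Counter(
        pair for b in baskets for pair in combinations(sorted(b), 2)
    )
    rules = {}
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue  # pair is too rare to be of interest
        rules[(a, b)] = {"support": support, "confidence": count / item_count[a]}
        rules[(b, a)] = {"support": support, "confidence": count / item_count[b]}
    return rules


rules = pair_rules(BASKETS)
for (lhs, rhs), stats in sorted(rules.items()):
    print(f"{lhs} -> {rhs}: {stats}")
```

A rule with high confidence, such as one item frequently appearing alongside another, is the kind of pattern that is hard to spot in a chart but falls out of the counts directly.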

5. Big data

Big data refers to data sets so large that they are difficult to collect, store, manage, analyze and use with existing computer software and hardware. Big data is characterized by large scale, diverse types, high speed and low value density. The "big" in big data is relative, and there is no fixed standard. If a threshold must be given, 10-100 TB is usually cited.

Seen from this data analysis perspective, most school data application products are currently still at the stage of data statistics and report analysis. Few achieve effective OLAP analysis or data mining, and fewer still reach the stage of big data applications, at least in the sense that they do not yet use genuinely big data sets.

This has been a brief introduction to data mining, big data, OLAP and data statistics. In practice, these topics are not as simple as presented here, and understanding them in depth is necessary to understand and master data analysis.