What are the core technologies of big data
1. Data collection and preprocessing: Flume NG is a real-time log collection system that supports custom data senders within the logging system for gathering data; ZooKeeper is an open-source, distributed coordination service for distributed applications that provides data synchronization services (see the ZooKeeper sketch after this list).
2. Data storage: Hadoop is an open-source framework designed for offline, large-scale data analysis, and HDFS, its core storage engine, is widely used for data storage. HBase is a distributed, column-oriented, open-source NoSQL database that can be viewed as a wrapper around HDFS for data storage (see the HBase sketch after this list).
3. Data cleansing: MapReduce, Hadoop's computing engine, is used for parallel computation over large-scale data sets, which suits cleansing work such as filtering malformed records and deduplicating (see the Hadoop Streaming sketch after this list).
4. Data query and analysis: Hive's core job is to translate SQL statements into MapReduce programs; it maps structured data onto database tables and provides HQL (Hive SQL) query functionality. Spark keeps distributed datasets in memory, so in addition to interactive querying it can also optimize iterative workloads (see the Spark SQL sketch after this list).
5. Data visualization: connect the analysis results to BI platforms so they can be presented visually in support of guided decision-making (see the plotting sketch after this list).
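Below are small, self-contained sketches for the steps above. First, step 1's coordination service: a minimal sketch of ZooKeeper-based data synchronization using the kazoo Python client; the ensemble address and znode paths are placeholders, not part of the original text.

```python
from kazoo.client import KazooClient

# Connect to a ZooKeeper ensemble (placeholder address; point at your cluster).
zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()

# Publish a piece of shared configuration so distributed collectors
# (e.g. Flume NG agents) all see the same synchronized value.
zk.ensure_path("/config/collectors")
zk.create("/config/collectors/batch_size", b"1000")

# Any node in the cluster can now read the same value.
value, stat = zk.get("/config/collectors/batch_size")
print(value.decode(), stat.version)

zk.stop()
```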
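For step 2, a minimal sketch of writing to and reading from HBase with the happybase Python client, assuming a running HBase Thrift server and an existing table web_logs with column family cf (the host, table, and family names are hypothetical).

```python
import happybase

# Connect through the HBase Thrift gateway (placeholder host).
connection = happybase.Connection("127.0.0.1")
table = connection.table("web_logs")  # hypothetical table with family 'cf'

# HBase stores cells as (row key, family:qualifier) -> bytes; this
# column-oriented key-value layout sits on top of HDFS files.
table.put(b"row-20240101", {b"cf:url": b"/index.html", b"cf:status": b"200"})

# Point lookup by row key.
row = table.row(b"row-20240101")
print(row[b"cf:url"].decode())

connection.close()
```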
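For step 3, a minimal cleansing job written for Hadoop Streaming, which lets plain Python scripts serve as the mapper and reducer of a MapReduce job; the record format (three tab-separated fields, with the first field as the key) is an assumption for illustration.

```python
#!/usr/bin/env python3
# mapper.py -- emit "key<TAB>record" for well-formed lines only,
# silently dropping malformed records (the cleansing step).
import sys

for line in sys.stdin:
    fields = line.rstrip("\n").split("\t")
    if len(fields) == 3:                 # keep only complete records
        print(f"{fields[0]}\t{line.rstrip()}")
```

```python
#!/usr/bin/env python3
# reducer.py -- keep one record per key (deduplication).
import sys

previous_key = None
for line in sys.stdin:
    key, _, record = line.rstrip("\n").partition("\t")
    if key != previous_key:              # shuffle sorts input by key
        print(record)
        previous_key = key
```

These would be submitted with the streaming jar that ships with Hadoop, along the lines of `hadoop jar hadoop-streaming.jar -input /raw -output /clean -mapper mapper.py -reducer reducer.py -file mapper.py -file reducer.py` (paths are placeholders).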
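For step 4, a minimal PySpark sketch showing both halves of that paragraph: an HQL-style query against a Hive-registered table, and in-memory caching, which is what speeds up iterative workloads. The table page_views is hypothetical.

```python
from pyspark.sql import SparkSession

# enableHiveSupport() lets Spark read tables registered in the Hive metastore.
spark = (SparkSession.builder
         .appName("query-analysis")
         .enableHiveSupport()
         .getOrCreate())

# An HQL-style query; Hive itself would translate this into MapReduce
# programs, while Spark runs it on its own in-memory engine.
views = spark.sql("SELECT url, count(*) AS hits FROM page_views GROUP BY url")

# cache() keeps the distributed dataset in memory, so repeated
# interactive or iterative queries avoid re-reading from disk.
views.cache()
views.orderBy("hits", ascending=False).show(10)

spark.stop()
```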
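Finally, for step 5, a hand-rolled stand-in for a BI dashboard: a minimal pandas/matplotlib sketch that charts aggregated analysis output. The input file hits_by_url.csv and its columns are assumptions, not something named in the original text.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Aggregated output exported from the query/analysis layer
# (hypothetical file with columns: url, hits).
df = pd.read_csv("hits_by_url.csv")

# A simple bar chart of the top pages -- the kind of view a BI
# platform would render for decision makers.
top = df.nlargest(10, "hits")
top.plot.bar(x="url", y="hits", legend=False)
plt.ylabel("hits")
plt.tight_layout()
plt.savefig("top_pages.png")
```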