What are the data visualization and analysis tools?

1. Hadoop

Hadoop is a software framework for the distributed processing of large amounts of data, and it does so in a reliable, efficient, and scalable way. Hadoop is reliable because it assumes that compute and storage elements will fail, so it maintains multiple working copies of the data and can redistribute work away from failed nodes. It is efficient because it processes data in parallel, which greatly speeds up computation. It is also scalable and can handle petabytes of data. In addition, Hadoop runs on commodity servers, so it is relatively inexpensive and can be used by anyone.
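To make the parallel model concrete, below is a minimal sketch of the classic MapReduce word count, written against the standard Hadoop MapReduce Java API (org.apache.hadoop.mapreduce); the class names and input/output paths are illustrative. The mapper emits (word, 1) pairs in parallel across input splits, and the reducer sums the counts for each word.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every token in its input split.
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reducer: sums the counts for each word; also usable as a combiner.
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);   // local pre-aggregation before the shuffle
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output directory
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Packaged as a JAR, this would typically be submitted with `hadoop jar wordcount.jar WordCount <input> <output>`, and the framework takes care of splitting the input, scheduling tasks, and re-running any that fail.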

2. HPCC

HPCC is an acronym for High Performance Computing and Communications. In 1993, the U.S. Federal Coordinating Council for Science, Engineering, and Technology submitted to Congress a report titled "Grand Challenges: High Performance Computing and Communications," which became known as the HPCC program. A presidential science strategy initiative, it aimed to solve a number of important scientific and technological challenges by strengthening research and development. HPCC was the program through which the United States pursued the information superhighway, and its implementation was expected to cost tens of billions of dollars. Its main goals were to develop scalable computing systems and related software to support terabit-level network transmission performance, to develop gigabit networking technologies, and to expand the connectivity of research and educational institutions and their networks.

3. Storm

Storm is free, open source software: a distributed, fault-tolerant real-time computation system. Storm can very reliably process unbounded streams of data, doing for real-time processing what Hadoop does for batch processing. Storm is simple, supports many programming languages, and is fun to use. Storm was open sourced by Twitter; other well-known adopters include Groupon, Taobao, Alipay, Alibaba, Happy Elements, AdMaster, and more.
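As a sketch of how a Storm computation is structured, the following hypothetical topology wires a spout (a stream source) to a bolt (a processing step) that keeps running word counts. It assumes Storm 2.x package names (org.apache.storm) and runs in an in-process LocalCluster for testing; the component names and word list are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class WordCountTopology {

  // Spout: emits a random word roughly ten times per second.
  public static class WordSpout extends BaseRichSpout {
    private SpoutOutputCollector collector;
    private final String[] words = {"apple", "banana", "cherry"};
    private final Random random = new Random();

    @Override
    public void open(Map<String, Object> conf, TopologyContext context,
                     SpoutOutputCollector collector) {
      this.collector = collector;
    }

    @Override
    public void nextTuple() {
      org.apache.storm.utils.Utils.sleep(100);
      collector.emit(new Values(words[random.nextInt(words.length)]));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
      declarer.declare(new Fields("word"));
    }
  }

  // Bolt: keeps a running count per word and emits (word, count).
  public static class CountBolt extends BaseBasicBolt {
    private final Map<String, Integer> counts = new HashMap<>();

    @Override
    public void execute(Tuple tuple, BasicOutputCollector collector) {
      String word = tuple.getStringByField("word");
      int count = counts.merge(word, 1, Integer::sum);
      collector.emit(new Values(word, count));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
      declarer.declare(new Fields("word", "count"));
    }
  }

  public static void main(String[] args) throws Exception {
    TopologyBuilder builder = new TopologyBuilder();
    builder.setSpout("words", new WordSpout(), 1);
    builder.setBolt("counter", new CountBolt(), 2)
           .fieldsGrouping("words", new Fields("word")); // same word -> same bolt task

    try (LocalCluster cluster = new LocalCluster()) {     // in-process test cluster
      cluster.submitTopology("word-count", new Config(), builder.createTopology());
      Thread.sleep(10_000);                               // let it run briefly
    }
  }
}
```

The fields grouping is the key design choice here: it routes every tuple for a given word to the same bolt instance, so each counter sees a consistent partition of the stream even when the bolt runs with multiple parallel tasks.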

4. Apache Drill

To help business users find more effective ways to speed up queries over Hadoop data, the Apache Software Foundation has launched an open source project called Drill. Drill is an open source implementation of Google's Dremel, the tool Google uses internally for interactive analysis of very large data sets, and it aims to help Hadoop users query massive data sets much faster.
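As an illustration of what "faster querying" looks like in practice, here is a hedged sketch of querying Drill from Java over JDBC. It assumes a Drillbit running in embedded mode (the documented `jdbc:drill:zk=local` connection URL) with Drill's JDBC driver on the classpath, and it uses the `employee.json` sample data set that ships with Drill via the `cp` (classpath) storage plugin.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class DrillQueryExample {
  public static void main(String[] args) throws Exception {
    // "zk=local" targets a Drillbit running in embedded mode on this machine;
    // requires the drill-jdbc-all driver JAR on the classpath.
    try (Connection conn = DriverManager.getConnection("jdbc:drill:zk=local");
         Statement stmt = conn.createStatement();
         // Drill queries files in place with ANSI SQL; no schema needs to be
         // defined up front. employee.json is sample data bundled with Drill.
         ResultSet rs = stmt.executeQuery(
             "SELECT full_name, salary FROM cp.`employee.json` LIMIT 5")) {
      while (rs.next()) {
        System.out.println(rs.getString("full_name") + "\t" + rs.getDouble("salary"));
      }
    }
  }
}
```

The point of the example is that Drill behaves like an ordinary SQL database to client code, while the engine itself plans and executes the query directly against raw files in Hadoop or on the local filesystem.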

5. RapidMiner

RapidMiner is one of the world's leading data mining solutions and incorporates a great deal of advanced technology. It covers a wide range of data mining tasks and simplifies the design and evaluation of data mining processes.

6. Pentaho BI

The Pentaho BI platform differs from traditional BI products in that it is a process-centric, solution-oriented framework. Its purpose is to integrate a series of enterprise-class BI products, open source software, APIs, and other components to facilitate the development of business intelligence applications. Its emergence allows a series of previously independent business intelligence products, such as JFree and Quartz, to be integrated into complex, complete business intelligence solutions.