How do you build a data warehouse?
1. First, choose a modeling approach. Is your priority to integrate data from many systems so that it can serve analysis and decision-making, or to meet specific analysis and decision-making requirements quickly?
If it is the former, the ER modeling method is generally chosen for the data warehouse.
If it is the latter, the dimensional modeling method is generally chosen.
ER modeling: entity-relationship modeling, the approach advocated for data warehouses by Bill Inmon, often called the father of the data warehouse. The core idea is to design a third normal form (3NF) model from the perspective of the whole enterprise and to describe the business in terms of entities and their relationships. It favors a top-down architecture that consolidates data from different OLTP systems into a subject-oriented data warehouse.
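To make the 3NF idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables (customer, product, sales_order, order_line), their columns, and the sample rows are all hypothetical and are not taken from any particular warehouse. The point is only that each fact is stored once and reached through keys.

```python
import sqlite3

# Hypothetical 3NF schema: every attribute lives in exactly one table
# and other tables refer to it through a key.
SCHEMA_3NF = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    unit_price REAL NOT NULL
);
CREATE TABLE sales_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    order_date  TEXT NOT NULL
);
CREATE TABLE order_line (
    order_id   INTEGER NOT NULL REFERENCES sales_order(order_id),
    product_id INTEGER NOT NULL REFERENCES product(product_id),
    quantity   INTEGER NOT NULL,
    PRIMARY KEY (order_id, product_id)
);
"""


def build_3nf_warehouse():
    """Create the normalized tables in memory and load one sample order."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA_3NF)
    conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
    conn.execute("INSERT INTO product VALUES (10, 'Widget', 2.5)")
    conn.execute("INSERT INTO sales_order VALUES (100, 1, '2024-01-15')")
    conn.execute("INSERT INTO order_line VALUES (100, 10, 4)")
    conn.commit()
    return conn


def revenue_by_customer(conn):
    """Answering even a simple question requires joining several tables."""
    return conn.execute(
        """
        SELECT c.name, SUM(l.quantity * p.unit_price)
        FROM order_line l
        JOIN sales_order o ON o.order_id = l.order_id
        JOIN customer c ON c.customer_id = o.customer_id
        JOIN product p ON p.product_id = l.product_id
        GROUP BY c.name
        """
    ).fetchall()
```

The number of joins in revenue_by_customer illustrates the usual trade-off: a normalized model integrates data cleanly, but analytical queries tend to be more involved.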
Dimensional modeling: proposed by Ralph Kimball. The core idea is to build the model starting from analysis and decision-making requirements. The model consists of fact tables and dimension tables, arranged as a star schema or a snowflake schema. Kimball favors a bottom-up architecture: build data marts for individual departments first, then extend them incrementally and integrate them into a data warehouse.
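A star schema can be sketched in the same way. The example below again uses sqlite3, and the names dim_date, dim_product, and fact_sales, along with the sample rows, are illustrative assumptions. One fact table holds the measures, and each dimension table is a single denormalized table joined directly to it.

```python
import sqlite3

# Hypothetical star schema: one fact table surrounded by dimension tables.
STAR_SCHEMA = """
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,   -- e.g. 20240115
    year     INTEGER NOT NULL,
    month    INTEGER NOT NULL
);
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    category    TEXT NOT NULL       -- kept inside the dimension (star, not snowflake)
);
CREATE TABLE fact_sales (
    date_key    INTEGER NOT NULL REFERENCES dim_date(date_key),
    product_key INTEGER NOT NULL REFERENCES dim_product(product_key),
    quantity    INTEGER NOT NULL,
    amount      REAL NOT NULL
);
"""


def build_star():
    """Create the star schema in memory and load a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(STAR_SCHEMA)
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(20240115, 2024, 1), (20240210, 2024, 2)],
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Widget", "Hardware"), (2, "Manual", "Books")],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [(20240115, 1, 4, 10.0), (20240115, 2, 1, 5.0), (20240210, 1, 2, 5.0)],
    )
    conn.commit()
    return conn


def sales_by_category_month(conn):
    """Aggregate the fact table, sliced by attributes of two dimensions."""
    return conn.execute(
        """
        SELECT d.year, d.month, p.category, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_date d ON d.date_key = f.date_key
        JOIN dim_product p ON p.product_key = f.product_key
        GROUP BY d.year, d.month, p.category
        ORDER BY d.year, d.month, p.category
        """
    ).fetchall()
```

In a snowflake variant of this sketch, category would move out of dim_product into its own table referenced by a key, which normalizes the dimension at the cost of one more join.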
2. Second, conduct in-depth business research and data research.
Business research: in-depth business research clarifies the purpose of building the warehouse and also helps the later modeling and design work. As the research deepens, it becomes clearer how to abstract real business entities into the data warehouse model.
Data research: understand the current state of the data in each department or unit, including data classification, storage methods, data volume, and the specific content of the data. This is the necessary basis for later master data integration and for making dimensions consistent across sources.
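Part of the data research step can be automated. The sketch below is one possible way to profile extracted rows and to spot keys that two source systems do not share. The function names profile_records and unmatched_codes are hypothetical helpers, and the code assumes each source has already been loaded as a list of dictionaries with hashable values.

```python
from collections import defaultdict


def profile_records(records):
    """Summarize rows: row count plus per-column nulls, distinct values and types."""
    columns = defaultdict(lambda: {"nulls": 0, "values": set(), "types": set()})
    for row in records:
        for col, value in row.items():
            stats = columns[col]
            if value is None or value == "":
                # Treat None and empty strings as missing values.
                stats["nulls"] += 1
            else:
                stats["values"].add(value)
                stats["types"].add(type(value).__name__)
    return {
        "row_count": len(records),
        "columns": {
            col: {
                "nulls": s["nulls"],
                "distinct": len(s["values"]),
                "types": sorted(s["types"]),
            }
            for col, s in columns.items()
        },
    }


def unmatched_codes(source_a, source_b, key):
    """Return key values that appear in only one of the two sources."""
    a = {row[key] for row in source_a}
    b = {row[key] for row in source_b}
    return sorted(a ^ b)
```

For example, with two made-up extracts:

```python
crm = [
    {"customer_id": "C1", "region": "North"},
    {"customer_id": "C2", "region": None},
]
billing = [
    {"customer_id": "C2", "region": "north"},
    {"customer_id": "C3", "region": "South"},
]

report = profile_records(crm)                               # nulls, distinct counts, types
orphans = unmatched_codes(crm, billing, "customer_id")      # ids present in only one system
```

Keys that exist in only one system, or codes spelled differently between systems (such as "North" and "north" here), are the kind of findings that feed into master data integration and dimension consistency work.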
3. Third, select the tools for the data warehouse.
Traditional data warehouse: typically a database from a commercial vendor together with its supporting ETL tools. Vendor support makes this option relatively dependable, but the drawbacks are equally clear: vendor lock-in and high cost.
NoSQL data warehouse: typically a data warehouse built on the Hadoop ecosystem. The Hadoop ecosystem is already very capable, and open source components can be found to support most data warehouse needs. The drawbacks are that you need to hire specialists to work it out, and that some risks will remain unknown until they surface.
4. Finally, design and implement the warehouse.
Design: on the data architecture side, this covers dividing the data into layers and designing the concrete models; on the program architecture side, it covers data quality management, metadata management, and scheduling management.
Implementation: follow standard project management practice, but remember that a data warehouse is not a project; it is a process.
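Two of the design concerns above, scheduling management and data quality management, can be sketched briefly. The job names and the ods/dwd/dws layer prefixes below are assumptions chosen for illustration (the text does not name specific layers), and check_quality is a hypothetical helper. The scheduling part uses graphlib.TopologicalSorter from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical jobs, one per table; each maps to the jobs it depends on.
# The prefixes stand for assumed layers: raw (ods), detail (dwd), summary (dws).
JOB_DEPENDENCIES = {
    "ods_orders": set(),
    "ods_customers": set(),
    "dwd_order_detail": {"ods_orders", "ods_customers"},
    "dws_daily_sales": {"dwd_order_detail"},
}


def run_order(dependencies):
    """Return the jobs in an order where every job follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


def check_quality(rows, required_columns):
    """Return a list of problems found; an empty list means the batch passed."""
    problems = []
    if not rows:
        problems.append("batch is empty")
    for i, row in enumerate(rows):
        for col in required_columns:
            if row.get(col) is None:
                problems.append(f"row {i}: missing {col}")
    return problems
```

A scheduler built along these lines would run each job in the order returned by run_order and apply something like check_quality to a layer's output before starting the jobs that depend on it.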