1. The enterprise has already failed twice at building a data warehouse and is now on its third attempt, which most people expect to fail as well. What should the project manager do?
First, decide what you are trying to achieve: do you want to integrate the data of the various systems to serve data analysis and decision-making, or do you want to satisfy specific analysis and decision-making requirements quickly?
If it is the former, ER modeling is generally chosen for the data warehouse.
If it is the latter, dimensional modeling is generally chosen.
ER modeling: entity-relationship modeling, advocated by Bill Inmon, the father of the data warehouse. The core idea is to design a third normal form (3NF) model from the perspective of the whole enterprise and to describe the enterprise's business with entities and relationships. It advocates a top-down architecture, which consolidates data from different OLTP systems into a subject-oriented data warehouse.
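As a rough illustration of what a third normal form model looks like, the sketch below uses Python's built-in `sqlite3` module. The retail tables (`category`, `product`, `customer`, `sales_order`, `order_line`) are hypothetical and not taken from any real system; the point is only that each fact is stored once and related through keys.

```python
import sqlite3

# Hypothetical retail schema in third normal form; all names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE category (
    category_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE
);
CREATE TABLE product (
    product_id  INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    category_id INTEGER NOT NULL REFERENCES category(category_id)
);
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE sales_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    order_date  TEXT NOT NULL
);
CREATE TABLE order_line (
    order_id   INTEGER NOT NULL REFERENCES sales_order(order_id),
    product_id INTEGER NOT NULL REFERENCES product(product_id),
    quantity   INTEGER NOT NULL,
    PRIMARY KEY (order_id, product_id)
);
""")

conn.execute("INSERT INTO category VALUES (1, 'Tea')")
conn.execute("INSERT INTO product VALUES (1, 'Green tea', 1)")

# The category name lives in exactly one place; a product pointing at a
# non-existent category is rejected by the foreign key.
try:
    conn.execute("INSERT INTO product VALUES (2, 'Mug', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

product_with_category = conn.execute("""
SELECT p.name, c.name
FROM product AS p
JOIN category AS c ON c.category_id = p.category_id
""").fetchall()
print(orphan_rejected, product_with_category)
```

The normalization keeps integrated data consistent, at the price of more joins when the data is later queried for analysis.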
Dimensional modeling: proposed by Ralph Kimball. The core idea is to build the model starting from the requirements of analysis and decision-making. The model consists of fact tables and dimension tables, arranged as a star schema or a snowflake schema. Kimball advocates a bottom-up architecture: build data marts for individual departments first, then extend them incrementally and consolidate them into a data warehouse.
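A minimal star schema sketch, again with Python's built-in `sqlite3` module, may make the fact table and dimension table idea concrete. The tables `fact_sales`, `dim_date` and `dim_product` and their sample rows are assumptions made up for illustration.

```python
import sqlite3

# Hypothetical star schema: one fact table surrounded by two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,   -- surrogate key such as 20240101
    year     INTEGER NOT NULL,
    month    INTEGER NOT NULL
);
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    category    TEXT NOT NULL       -- denormalized: category is stored on the dimension row
);
CREATE TABLE fact_sales (
    date_key    INTEGER NOT NULL REFERENCES dim_date(date_key),
    product_key INTEGER NOT NULL REFERENCES dim_product(product_key),
    quantity    INTEGER NOT NULL,
    amount      REAL NOT NULL
);
""")
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                 [(20240101, 2024, 1), (20240201, 2024, 2)])
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                 [(1, "Green tea", "Tea"), (2, "Mug", "Tableware")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                 [(20240101, 1, 2, 20.0),
                  (20240101, 2, 1, 8.0),
                  (20240201, 1, 3, 30.0)])

# A typical analytical query: aggregate the facts, grouped by a dimension attribute.
sales_by_category = conn.execute("""
SELECT p.category, SUM(f.amount) AS total_amount
FROM fact_sales AS f
JOIN dim_product AS p ON p.product_key = f.product_key
GROUP BY p.category
ORDER BY p.category
""").fetchall()
print(sales_by_category)
```

In a snowflake schema, `category` would be moved out of `dim_product` into its own table referenced by a key.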
2. Next, conduct in-depth business research and data research.
Business research: in-depth business research makes the purpose of building the warehouse clearer, and it also benefits the subsequent modeling and design. As the research deepens, it becomes clearer how to abstract the real-world business into a data warehouse model.
Data research: understand the current state of the data in each department, including data classification, storage methods, data volume, and the specific data content. This is the necessary basis for the subsequent master data integration and for making dimensions consistent across sources.
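Part of the data research described above can be sketched as a small profiling routine. The sample records, the `profile` helper and the two "departments" below are hypothetical; a real survey would read from the actual source systems.

```python
def profile(rows):
    """Return per-column null rate and distinct-value count for a list of dict records."""
    columns = sorted({key for row in rows for key in row})
    total = len(rows)
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        non_null = [v for v in values if v not in (None, "")]
        report[col] = {
            "null_rate": 1 - len(non_null) / total if total else 0.0,
            "distinct": len(set(non_null)),
        }
    return report

# Hypothetical extracts of the same customer attribute from two departments.
sales_rows = [
    {"customer_id": "C001", "region": "North"},
    {"customer_id": "C002", "region": "South"},
    {"customer_id": "C003", "region": None},
]
finance_rows = [
    {"customer_id": "C001", "region": "N"},
    {"customer_id": "C002", "region": "S"},
]

sales_profile = profile(sales_rows)

# Region codes that appear in only one system indicate a dimension whose
# values must be mapped to a shared standard before the sources can be joined.
sales_regions = {r["region"] for r in sales_rows if r["region"]}
finance_regions = {r["region"] for r in finance_rows if r["region"]}
unmatched = sales_regions ^ finance_regions
print(sales_profile)
print(sorted(unmatched))
```

Here every region code is unmatched, which suggests the two departments encode regions differently and a mapping table would be needed.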
3. Then select the tools for the data warehouse.
Traditional data warehouse: a database from a third-party vendor and its supporting ETL tools are generally chosen. Vendor support makes this route relatively safe, but the drawbacks are also obvious: you are tied to the vendor, and the cost is high.
NoSQL data warehouse: generally a data warehouse built on the Hadoop ecosystem. The Hadoop ecosystem is already very powerful, and open source components can be found for every part of a data warehouse. The disadvantage is that you need to recruit specialists to work things out, and there will be some unknown risks.
4. Finally, design and implement.
Design: the data architecture covers the division of data into layers and the concrete model design; the program architecture covers data quality management, metadata management, and scheduling management.
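The following sketch shows one way these design elements might fit together: a raw layer, a cleaned detail layer, a summary layer, a data quality rule, and a dependency-based run order. The layer names, the sample orders, and the quality rule are assumptions chosen for illustration, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Raw layer: records as they arrive from a hypothetical source system.
raw_orders = [
    {"order_id": 1, "amount": "20.0", "category": "Tea"},
    {"order_id": 2, "amount": "8.0", "category": "Tableware"},
    {"order_id": 2, "amount": "8.0", "category": "Tableware"},  # duplicate row
    {"order_id": 3, "amount": None, "category": "Tea"},         # missing amount
]

def build_detail(rows):
    """Detail layer: drop duplicates and rows without an amount, cast amount to float."""
    seen, detail = set(), []
    for row in rows:
        if row["order_id"] in seen or row["amount"] is None:
            continue
        seen.add(row["order_id"])
        detail.append({**row, "amount": float(row["amount"])})
    return detail

def check_quality(detail):
    """Quality rule: order_id is unique and amount is non-negative."""
    ids = [r["order_id"] for r in detail]
    return len(ids) == len(set(ids)) and all(r["amount"] >= 0 for r in detail)

def build_summary(detail):
    """Summary layer: total amount per category."""
    totals = {}
    for row in detail:
        totals[row["category"]] = totals.get(row["category"], 0.0) + row["amount"]
    return totals

# Scheduling: each task maps to the set of tasks it depends on.
dependencies = {
    "detail": {"raw"},
    "quality_check": {"detail"},
    "summary": {"quality_check"},
}
run_order = list(TopologicalSorter(dependencies).static_order())

detail = build_detail(raw_orders)
quality_ok = check_quality(detail)
summary = build_summary(detail) if quality_ok else {}
print(run_order)
print(summary)
```

Keeping the quality check as its own task in the dependency graph means the summary layer is only built from data that passed the rule.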
Implementation: carry out the work with standardized project management, but remember that a data warehouse is not a project; it is a process.