An easy guide to data warehouse planning and construction strategy
As the foundation of a decision support system (DSS), a data warehouse is subject-oriented, integrated, non-volatile and time-variant. These characteristics mean that a data warehouse differs considerably from a conventional operational database in how data is organized and processed, so designing a data warehouse system calls for a design method suited to it. In ordinary system development planning, the functions of the system are determined first, usually by analyzing user requirements. The users of a data warehouse, however, are DSS analysts, typically middle and senior managers in the enterprise, whose decision support needs cannot be specified in advance and can only be described to designers in abstract terms.
Designers therefore have to clarify the requirements gradually and keep refining the system through continuous communication with users. Data warehouse planning is thus a process in which users and designers come to understand the system, become familiar with it and improve it together. Planning the development and application of the data warehouse is the first task of any data warehouse project: only with a sound plan can the organization devote its main effort to developing and applying the data warehouse in an orderly way. Data warehouse planning usually involves four steps: selecting an implementation strategy, determining the development goals and implementation scope, selecting the data warehouse architecture, and establishing the usage plan and project budget. Once planning is complete, a data warehouse planning document should be prepared. It explains the relationship between the data warehouse and the enterprise strategy, the fairly limited development opportunities that the enterprise urgently needs to address, the key functional departments to be supported, recommendations for the future development of the data warehouse, the actual usage plan and the development budget, and it serves as the basis for the actual development work.
1. Select the data warehouse implementation strategy.
There are three main development strategies for a data warehouse: top-down, bottom-up, and a combination of the two. In the top-down strategy, the goals are fixed before development begins and no new applications are pursued once those goals have been reached, which makes it the more strategic way of applying a data warehouse. The strategy is hard to apply in practice because a data warehouse provides decision support, and the scope of that function is often difficult to pin down within the enterprise strategy: the opportunities for applying a data warehouse frequently extend beyond the enterprise's current business scope. Where it can be applied, it is an effective strategy, because the implementation scope is known before development starts and the benefits and goals of the system can be described clearly to decision makers and to the enterprise. It requires developers with extensive experience of top-down system development, and decision makers and managers who fully understand the predetermined goals of the data warehouse and know which decisions it can support.
The bottom-up strategy generally starts from a data warehouse prototype: a few specific management problems that enterprise managers know well are chosen as the first development targets, and the data warehouse is built out from there. This strategy is therefore often used to develop data marts, executive information systems or departmental data warehouses. Its advantage is that the enterprise can obtain substantial benefits from data warehouse applications with a small investment, and results come easily with few staff. The risk is that the failure of one project may delay the development of the whole data warehouse system. The strategy is typically chosen when an enterprise wants to evaluate data warehouse technology in order to decide how, where and when to apply it, when it wants to understand the costs of building and running a data warehouse, or when the application goals of the data warehouse and its impact on the decision-making process are still unclear.
In the top-down strategy, the data warehouse can be developed with structured or object-oriented methods, following the stages of data warehouse planning, requirements definition, system analysis, system design, system integration, system testing and trial operation. In bottom-up development, a spiral prototyping method can be used, which lets users revise the system in trial operation as new requirements emerge. Spiral prototyping requires that data warehouse systems with progressively more functions be produced quickly, within a relatively short time. It is mainly suited to situations where market trends and the enterprise's requirements are unpredictable, where time to market is an important part of delivering the product, where the system must be adjusted continually as the enterprise and the market change, and where lasting competitive advantage comes from continuous improvement driven by what users discover as they use the system.
Combining the top-down and bottom-up strategies offers the advantages of both: the data warehouse can be developed and put to use quickly, and the resulting design still has long-term value. It is often hard to carry out in practice, however. It usually requires experienced developers who can build, apply and maintain enterprise models, data models and technical architectures, and who can move fluently from the concrete (such as metadata in the business systems) to the abstract (logical models based only on the nature of the business rather than on the technology used to implement the system). The enterprise also needs an experienced development team of end users and information systems staff who can state clearly how the data warehouse will support strategic decisions.
2. Determine the development goal and implementation scope of data warehouse.
To determine the development goals and scope of the data warehouse, first explain to enterprise managers and other prospective users how data warehouses are applied in enterprise management and where the field is heading, and explain why organizing and using data to support cross-functional systems and business strategy matters. The development goals can then be agreed. At this stage, confirm the business requirements that relate to using the data warehouse. These requirements should support only the most important functional departments and concentrate on the business areas with obvious benefits, so that the data warehouse delivers visible results quickly. Its application should not be spread across every business area, which would consume too much effort.
Once the development goals and scope have been determined, a requirements document should be written as the basis for later development. The primary goal of data warehouse development is to determine the range of information needed to support users' decisions, and which data sources are required in each subject and indicator area. This means answering several questions. What data do users need? What supporting data does a subject-oriented data warehouse need? What business knowledge and background knowledge do developers need in order to deliver data to users successfully? It is therefore necessary to define the overall requirements, document the existing systems of record and the system environment, identify and rank the candidate applications that will use data from the data warehouse, build the data transfer model, and determine the scale, the facts and the timestamp algorithm, so that information can be extracted from the source systems and loaded into the data warehouse. Fixing the range of information gives developers and users a sound basis for analyzing together what information the data warehouse needs and what data the business activities require. Developers and users can then refine the requirements further, for example the grading levels of the data, the level of aggregation, the loading frequency and the schedule to be maintained.
Another important goal of data warehouse development is to determine which methods and tools will be used to access and navigate the data. All users need to access and retrieve the contents of the data warehouse, but at different granularities: some need detailed records, others summarized or highly summarized records. These different degrees of summarization place different requirements on the aggregation and summarization tools of the data warehouse.
The data warehouse also provides access to and retrieval of charts, predefined reports, multidimensional data, summary data and detailed records. Users should obtain information from the data warehouse with the help of spreadsheets, statistical analysis tools and analytical processors that support multidimensional analysis, so that they can interpret and analyze its contents and generate and test different market hypotheses, recommendations and decision options. To present recommendations and decision options clearly to users, powerful presentation tools such as reports, charts and images are needed. Another goal of data warehouse development is to determine the volume of data in the data warehouse. The warehouse holds not only current data but also many years of historical data, and the degree of summarization determines how far that data can be compressed and condensed. If the data warehouse is to support decision making and queries over historical records, it must be able to manage large volumes of data. Data volume directly affects not only the response time of decision queries but also the quality of the enterprise's decisions.
Further development goals include: determining the meaning of the data in the data warehouse from users' basic requirements; determining the quality of the warehouse content, and hence how far its use, its analyses and its recommendations can be trusted; deciding what kind of data warehouse can meet end users' needs and what functions it should have; and deciding what metadata is needed and how data from the data sources will be used. The development goals are diverse and complex, so developers and users must keep interacting and refining them throughout development and use. The plan therefore also has to fix the development scope of the data warehouse. Developers can then proceed step by step according to the importance of the requirements and goals, and learn from each stage, preparing the technical ground for full deployment of the data warehouse in the enterprise. Once the overall direction and goals have been set, a limited scope of use should be chosen that can show the benefits of the data warehouse quickly. The scope is analyzed mainly in terms of the number and type of departments, the number of data sources, the subset of the enterprise model covered, the budget allocation and the time required for the project.
These factors can be analyzed from the user's point of view and from the technical point of view. From the user's point of view: which departments should use the data warehouse first? Who will use it, and for what purpose? Which decision queries should it satisfy first? These decision queries usually determine the dimensions of the data and the types of report, and so determine the quantitative relationships needed when the data warehouse is defined. The more specific the query format, the easier it is to describe the dimensions, aggregation and summarization of the data warehouse in the plan. From the technical point of view, the size of the metabase of the data warehouse should be determined. The metabase is the model that stores the data definitions of the data warehouse. The definitions are held in the warehouse manager's catalog, which all query and reporting tools can use as the basis for building and querying the data warehouse. The size of the metabase directly indicates how much data the data warehouse must manage, so bounding the size of the metabase in effect determines the volume of data to be managed.
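To make the idea of a metabase more concrete, here is a minimal Python sketch of a catalog that stores data definitions and reports its own size. It is only an illustration under stated assumptions: the class names (`Metabase`, `TableDefinition`, `ColumnDefinition`), the sample tables and the choice to measure size as the number of column definitions are all hypothetical and do not come from any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnDefinition:
    name: str
    data_type: str
    description: str


@dataclass
class TableDefinition:
    name: str
    subject_area: str   # hypothetical subject, e.g. "sales"
    grain: str          # what one row represents
    columns: list = field(default_factory=list)


class Metabase:
    """In-memory stand-in for the catalog that holds data definitions."""

    def __init__(self):
        self._tables = {}

    def register(self, table):
        self._tables[table.name] = table

    def describe(self, table_name):
        return self._tables[table_name]

    def tables_for_subject(self, subject_area):
        # Query and reporting tools could use this to find relevant tables.
        return sorted(
            t.name for t in self._tables.values()
            if t.subject_area == subject_area
        )

    def size(self):
        # Assumption: size is measured as the number of column definitions.
        return sum(len(t.columns) for t in self._tables.values())


# Hypothetical definitions, for illustration only.
metabase = Metabase()
metabase.register(TableDefinition(
    "fact_sales", "sales", "one row per order line",
    [
        ColumnDefinition("order_date", "date", "date the order was placed"),
        ColumnDefinition("region", "text", "sales region"),
        ColumnDefinition("amount", "decimal", "order line value"),
    ],
))
metabase.register(TableDefinition(
    "dim_customer", "customer", "one row per customer",
    [
        ColumnDefinition("customer_id", "integer", "customer key"),
        ColumnDefinition("segment", "text", "customer segment"),
    ],
))

print(metabase.tables_for_subject("sales"))
print(metabase.size())
```

Counting definitions in this way gives planners a rough, early indicator of how much the warehouse will have to manage before any data is loaded.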
3. Structure selection of data warehouse
The structure of a data warehouse can be chosen flexibly: the various platforms used by the organization can be divided up as appropriate, and the data sources, the data warehouse and the workstations used by end users can be designed separately.
(1) The application structure of the data warehouse
The first option is a data warehouse based on the business processing system. In this structure, the data warehouse reads the operational data in read-only applications and does not modify it. The metabase of a data warehouse with this structure is a virtual database rather than the metadata of the data warehouse itself. Guided directly by the metabase, a data warehouse query simply extracts data from the database.
Simple data warehouse
Source data is cleansed, integrated, summarized and consolidated, and in this way transferred from the business processing systems to a centralized data warehouse. Every department runs its data warehouse applications against this one warehouse only. This structure typically arises when many departments but few users use the data warehouse. The centralization here is only logical; physically the data may be distributed.
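The flow described above, in which source data is cleaned, integrated and summarized before it reaches the centralized warehouse, might be sketched as follows. This is a minimal illustration in plain Python, not a real ETL tool: the function names, the two source systems and their records are hypothetical, and the cleaning rules (drop records with no amount, normalize region names) are assumptions chosen to keep the example short.

```python
from collections import defaultdict


def cleanse(records):
    """Drop records without an amount and normalize region names."""
    cleaned = []
    for record in records:
        if record.get("amount") is None:
            continue
        cleaned.append({**record, "region": record["region"].strip().title()})
    return cleaned


def integrate(*named_sources):
    """Merge records from several systems, tagging each with its origin."""
    merged = []
    for source_name, records in named_sources:
        for record in records:
            merged.append({**record, "source": source_name})
    return merged


def summarize(records):
    """Aggregate amounts by (region, month)."""
    totals = defaultdict(float)
    for record in records:
        totals[(record["region"], record["month"])] += record["amount"]
    return dict(totals)


# Hypothetical extracts from two business processing systems.
orders_system = [
    {"region": " north ", "month": "2024-01", "amount": 100.0},
    {"region": "South", "month": "2024-01", "amount": None},
]
billing_system = [
    {"region": "NORTH", "month": "2024-01", "amount": 50.0},
    {"region": "south", "month": "2024-02", "amount": 30.0},
]

warehouse_summary = summarize(
    integrate(
        ("orders", cleanse(orders_system)),
        ("billing", cleanse(billing_system)),
    )
)
print(warehouse_summary)
```

Departmental applications would then read only from the summarized warehouse data rather than from the source systems.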
Simple data mart
A data mart is a data warehouse used by an individual department. Each functional department in an enterprise has its own particular needs, and a single unified data warehouse may not meet them. This architecture typically arises when individual departments are interested in data warehouse applications while the rest of the organization is indifferent, so the enthusiastic departments adopt it on their own.
Data warehouse and data mart
Every department of the enterprise has a data mart that meets its own needs. Each data mart obtains its data from the enterprise data warehouse, which collects data from the enterprise's various data sources and distributes it. This is a relatively complete data warehouse architecture, and it usually appears when the whole organization is interested in data warehouse applications.
(2) The technical platform structure of the data warehouse
Single-tier structure
In a single-tier structure, the data sources and the data warehouse share one platform, or the data sources, data warehouse, data marts and end-user workstations all use the same platform. A shared platform reduces the complexity of data extraction and data conversion, but it may run into performance and management problems in use. This architecture is generally adopted when the data warehouse is small and the organization's business system platform has plenty of spare capacity.
Client/server two-tier structure
One tier is the client and the other is the server. End-user access tools run on the client tier, while the data sources, data warehouse and data marts reside on the server. This arrangement is usually used for data warehouses of ordinary size.
Three-tier client/server structure
This structure consists of a workstation-based client tier, a server-based middle tier and a host-based third tier. The host tier manages the data sources and, optionally, the transformation of source data. The server runs the data warehouse and data mart software and stores the warehouse data. Client workstations run query and reporting applications and can also store local data downloaded from the data marts or the data warehouse. When the data warehouse grows somewhat larger, the two-tier structure can no longer meet clients' needs. The three-tier structure can be adopted when the data storage management, application processing and client applications of the data warehouse are separated.
Multi-tier structure
This structure is developed from the three-tier structure. From the innermost data tier to the outermost client tier, it comprises: an independent data warehouse storage tier, a data warehouse service tier that manages the data warehouse and data marts, a query service tier that processes data warehouse queries, an application service tier that carries out data warehouse application processing, and a client tier for end users. The architecture may have as many as five tiers and is generally used in very large data warehouse systems.
4. Data warehouse usage plan and project planning budget
The actual usage plan and the development budget are the last matters to settle in data warehouse planning. Because a data warehouse mainly supports the decisions of enterprise managers, its practical usefulness is essential, so end users must take part in its functional design. They do so through the actual usage plan, which is a very important requirements model. The usage plan must help clarify what end users need from the data warehouse. Some of these needs can largely be met from suitable data sources inside the enterprise, while others require data sources from outside it, so the usage plan has to tie these different needs together. The usage plan also links end users' decision support needs to the technical requirements of the data warehouse: when users settle their final requirements, they set a boundary on the scope of the metabase. The amount of historical information needed can be determined at the same time. When a data warehouse is planned around specific users, the dimensions that end users care about (time, location, business unit, manufacturer) can be identified. Because dimensions are closely tied to the summarization operations required, the dimensions chosen must be meaningful to end users, such as "month", "quarter" and "year". Finally, the structural requirements for the data marts and the data warehouse can be determined, so that designers can decide whether to adopt the simple data warehouse structure, the simple data mart structure or a combination of the two.
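As an illustration of how a time dimension with levels such as "month", "quarter" and "year" relates to summarization, the following Python sketch rolls the same facts up to each level. It is a minimal example under assumptions: the function names `time_keys` and `roll_up` and the sample sales figures are hypothetical.

```python
from collections import defaultdict
from datetime import date


def time_keys(day):
    """Map a calendar date to its month, quarter and year labels."""
    quarter = (day.month - 1) // 3 + 1
    return {
        "month": f"{day.year}-{day.month:02d}",
        "quarter": f"{day.year}-Q{quarter}",
        "year": str(day.year),
    }


def roll_up(facts, level):
    """Sum amounts at the requested level of the time dimension."""
    totals = defaultdict(float)
    for day, amount in facts:
        totals[time_keys(day)[level]] += amount
    return dict(totals)


# Hypothetical daily sales facts: (date, amount).
sales = [
    (date(2024, 1, 15), 100.0),
    (date(2024, 2, 3), 50.0),
    (date(2024, 4, 20), 70.0),
    (date(2025, 1, 5), 10.0),
]

print(roll_up(sales, "month"))
print(roll_up(sales, "quarter"))
print(roll_up(sales, "year"))
```

Choosing the levels up front tells designers which aggregates the warehouse has to be able to produce for end users.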
Once the actual usage plan has been settled, the budget for the development plan must be estimated and the investment in the project determined. The investment can be estimated from the cost of previous software development, but such an estimate is rough. Another method is to estimate cost by structure: break the data warehouse down into the components identified in the actual usage plan and estimate the budget from the cost of each component. The components of a data warehouse include the data sources, the data warehouse, data marts, end-user access, data management, metadata management, the transfer infrastructure and so on. Some of these components already exist in the enterprise's current information systems, some can be bought as commercial products, and some have to be developed in-house. Taking the source of each component into account gives a more accurate budget. When data warehouse planning is complete, a data warehouse development document should be prepared. It explains the relationship between the system and the strategic objectives of the enterprise, the fairly limited development opportunities that the enterprise urgently needs to address, the business opportunities envisaged, the objectives and tasks in outline, the key functional departments, and recommendations for future work. A data warehouse project should start with a clear business value plan that sets out the expected tangible and intangible benefits. Intangible benefits include being able to make faster and better decisions by using the data warehouse.
The business value plan is best completed by the manager of the target business area, because a data warehouse is user-driven and users should take an active part in building it. The plan should set out the development goals, scope, structure, usage plan and development budget of the data warehouse.
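The component-based budget estimate described in this section can be sketched in a few lines of Python. This is only an illustration: the `Component` class, the three sourcing labels and every cost figure below are hypothetical placeholders, not data from a real project.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    sourcing: str   # "existing", "purchased" or "custom" (assumed labels)
    cost: float     # estimated cost in an arbitrary currency unit


def estimate_budget(components):
    """Return the cost per sourcing category and the overall total."""
    by_sourcing = {}
    for component in components:
        by_sourcing[component.sourcing] = (
            by_sourcing.get(component.sourcing, 0.0) + component.cost
        )
    return by_sourcing, sum(by_sourcing.values())


# Hypothetical component list with made-up figures.
components = [
    Component("data sources", "existing", 0.0),
    Component("data warehouse", "purchased", 120_000.0),
    Component("data mart", "custom", 40_000.0),
    Component("end-user access", "purchased", 30_000.0),
    Component("metadata management", "custom", 25_000.0),
]

by_sourcing, total = estimate_budget(components)
print(by_sourcing)
print(total)
```

Grouping by sourcing makes it easy to see how much of the budget depends on purchased products and how much on in-house development.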