Three approaches to data integration

Current approaches to data integration generally fall into three patterns: federated, middleware, and data warehouse.

(1) Federated pattern. This pattern builds a data integration system out of multiple autonomous database systems that cooperate, with each data source providing interfaces through which the others can access it. An integration system with this architecture combines the data views of the heterogeneous data sources into a global schema. The global schema describes the structure, semantics, and operations of the heterogeneous data sources; it is a virtual view of their data that lets users access the data transparently. Users submit access requests against the global schema, and the system translates these requests into operations that each heterogeneous data source can execute within its own autonomous system. The two key problems in this pattern are constructing the mapping between the global schema and the data views of the heterogeneous sources, and processing user queries posed against the global schema.
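The two problems named for the federated pattern, schema mapping and query translation, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two in-memory sources, their field names, the MAPPINGS table, and the query_global function are hypothetical and stand in for real autonomous database systems.

```python
# Hypothetical sources: each keeps its own local schema and stays autonomous.
CRM_ROWS = [
    {"cust_id": 1, "full_name": "Ada", "town": "Paris"},
    {"cust_id": 2, "full_name": "Lin", "town": "Oslo"},
]
BILLING_ROWS = [
    {"id": 7, "name": "Bo", "city": "Oslo"},
]
SOURCES = {"crm": CRM_ROWS, "billing": BILLING_ROWS}

# Mapping from global-schema attributes to each source's local attributes.
MAPPINGS = {
    "crm": {"customer_id": "cust_id", "name": "full_name", "city": "town"},
    "billing": {"customer_id": "id", "name": "name", "city": "city"},
}


def query_global(attribute, value):
    """Answer an equality query phrased against the global schema."""
    results = []
    for source_name, rows in SOURCES.items():
        mapping = MAPPINGS[source_name]
        # Rewrite the predicate into this source's local attribute name.
        local_attribute = mapping[attribute]
        for row in rows:
            if row[local_attribute] == value:
                # Translate the matching local row back into the global view.
                global_row = {g: row[l] for g, l in mapping.items()}
                global_row["source"] = source_name
                results.append(global_row)
    return results


oslo_customers = query_global("city", "Oslo")
```

Here the caller only ever names global attributes such as city; the per-source names town and city stay hidden behind the mapping, which is the transparency the global schema is meant to provide.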

(2) Middleware pattern. In a middleware-based data integration system, the middleware generally sits between the data layer and the application layer. Downward, it coordinates the different database systems; upward, it provides unified access interfaces and data schemas to the applications. The middleware mainly provides a unified retrieval service over multiple heterogeneous data sources in a distributed environment, while each data source retains its independence. The architecture usually combines mediators and wrappers. The mediator decomposes a query against the global schema into sub-queries for the individual heterogeneous data sources, hands them to the wrappers for execution, and, once the query has finished, returns the results of all sub-queries to the user in a unified format. The wrapper for each data source converts that source's heterogeneous data into a unified format that the integration system can process.

(3) Data warehouse pattern. A data warehouse is a subject-oriented, integrated, time-variant collection of data in which the data is organized into broad, functionally independent, non-overlapping subjects. It serves data analysis and decision support systems, and it is also a data integration method proposed for enterprise applications. This pattern stores copies of the data from multiple heterogeneous data sources in a single data warehouse. Periodically, ETL (Extract, Transform, Load) tools extract and transform the data from the different sources and load it into the warehouse. A data management system built on top of the warehouse then handles users' data access requests.
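The extract, transform, and load steps of the data warehouse approach can be sketched as follows. This is a minimal illustration under stated assumptions, not a production pipeline: the two sources (a CSV export and a list of records), the sales table, and all function names are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a CSV export with amounts stored in cents.
ORDERS_CSV = "order_id,customer,amount_cents\n1,Ada,1250\n2,Lin,300\n"
# Hypothetical source 2: records with amounts already in currency units.
SHOP_RECORDS = [{"id": "A9", "buyer": "ada", "total": 4.5}]


def extract():
    """Read the raw rows from each heterogeneous source."""
    csv_rows = list(csv.DictReader(io.StringIO(ORDERS_CSV)))
    return csv_rows, SHOP_RECORDS


def transform(csv_rows, shop_records):
    """Convert both sources into one unified row format."""
    unified = []
    for row in csv_rows:
        amount = int(row["amount_cents"]) / 100  # cents to currency units
        unified.append(("csv", row["order_id"], row["customer"].title(), amount))
    for rec in shop_records:
        unified.append(("shop", rec["id"], rec["buyer"].title(), float(rec["total"])))
    return unified


def load(conn, rows):
    """Store the unified copies in the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(source TEXT, order_id TEXT, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")  # stand-in for the data warehouse
load(warehouse, transform(*extract()))

# User requests are answered from the warehouse copy, not the sources.
totals = dict(
    warehouse.execute(
        "SELECT customer, SUM(amount) FROM sales GROUP BY customer"
    ).fetchall()
)
```

In a real deployment the load step would run on a schedule, and queries such as the final one never touch the original sources, which is the main difference from the federated and middleware patterns.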