
What is distributed storage?

What is a distributed storage system?

It stores data across multiple devices.

What is distributed storage? What kind of distributed storage is better?

A distributed storage system stores data across multiple devices. A traditional network storage system keeps all data on a centralized storage server, which becomes a performance bottleneck as well as a single point of concern for reliability and security, so it cannot meet the needs of large-scale storage applications. A distributed network storage system adopts a scalable architecture: multiple storage servers share the storage load, and location servers track where the data is stored. This improves the reliability, availability, and access efficiency of the system, and also makes it easy to expand.
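The split between storage servers and a location server can be illustrated with a minimal Python sketch. The class names `StorageServer` and `LocationServer`, the in-memory dictionaries, and the hash-based placement rule are all assumptions made for illustration; real systems persist this metadata and replicate it.

```python
import hashlib


class StorageServer:
    """Hypothetical storage server that keeps objects in memory."""

    def __init__(self, name):
        self.name = name
        self.objects = {}


class LocationServer:
    """Hypothetical location server: records which storage server holds each key."""

    def __init__(self, servers):
        self.servers = servers
        self.locations = {}  # key -> server name

    def place(self, key):
        # Pick a server from a stable hash of the key so the load is spread out.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.servers)
        server = self.servers[index]
        self.locations[key] = server.name
        return server

    def lookup(self, key):
        # Find the server recorded for this key (raises KeyError if unknown).
        name = self.locations[key]
        return next(s for s in self.servers if s.name == name)


def put(locator, key, data):
    locator.place(key).objects[key] = data


def get(locator, key):
    return locator.lookup(key).objects[key]


servers = [StorageServer(f"node-{i}") for i in range(3)]
locator = LocationServer(servers)
for i in range(30):
    put(locator, f"file-{i}", f"payload-{i}".encode())
```

A client never needs to know which machine holds `file-7`; it asks the location server and then reads from the storage server it is pointed to.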

Lenovo's ThinkCloud AIO hyper-converged cloud all-in-one machine is a core product that Lenovo launched for enterprise users. It integrates the cloud management platform, computing, network, and storage systems into a one-stop infrastructure-as-a-service solution, and gives users a highly simplified infrastructure cloud platform. This shortens business deployment from several weeks to several days and is fully decoupled from enterprise application software, middleware, and database software, which can improve the efficiency of IT infrastructure operation and maintenance and the performance of key applications.

What is distributed data storage?

Definition:

A distributed database uses a high-speed computer network to connect many physically dispersed data storage units into one logically unified database. The basic idea is to spread the data of the original centralized database over multiple storage nodes connected through the network, so as to obtain larger storage capacity and higher concurrent access. In recent years, with the rapid growth of data, distributed database technology has developed rapidly, and traditional relational databases have begun to move from a centralized model to a distributed architecture. A distributed database built on a relational database moves from centralized storage to distributed storage, and from centralized computing to distributed computing, while retaining the data model and basic characteristics of traditional databases.

Features:

1. High scalability: The distributed database must have high scalability, and can dynamically add storage nodes to realize linear expansion of storage capacity.

2. High concurrency: The distributed database must respond to the read and write requests of large numbers of users in time, and must be able to read and write massive data randomly.

3. High availability: The distributed database must provide fault-tolerant mechanism, which can realize redundant backup of data and ensure high reliability of data and services.
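One way to picture how storage nodes can be added dynamically while data stays redundantly stored is a consistent-hash ring with replicas. This technique is not named in the text above; it is swapped in here purely as an illustration, and `HashRing`, `vnodes`, and the node names are hypothetical.

```python
import bisect
import hashlib


def _hash(value):
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Illustrative consistent-hash ring; not any specific product's design."""

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self.ring = []  # sorted list of (hash, node) points
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each node owns several points on the ring to even out the load.
        for i in range(self.vnodes):
            bisect.insort(self.ring, (_hash(f"{node}#{i}"), node))

    def replicas(self, key, count=2):
        """Return `count` distinct nodes, walking clockwise from the key's hash."""
        start = bisect.bisect(self.ring, (_hash(key), ""))
        found = []
        for offset in range(len(self.ring)):
            node = self.ring[(start + offset) % len(self.ring)][1]
            if node not in found:
                found.append(node)
                if len(found) == count:
                    break
        return found


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"row-{i}" for i in range(1000)]
before = {k: ring.replicas(k)[0] for k in keys}
ring.add_node("node-d")  # scale out by one storage node
after = {k: ring.replicas(k)[0] for k in keys}
moved = sum(1 for k in keys if before[k] != after[k])
```

Only the keys that now belong to the new node change their primary location, so capacity grows without reshuffling everything, and `replicas` returns a second node that can hold the redundant copy.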

What is the difference between distributed block storage and distributed file storage?

Both a distributed file system (DFS) and a distributed database support storing, retrieving, and deleting data. A distributed file system is coarse-grained, however: data is accessed much like key/value pairs, with a path as the key and the file contents as the value. A distributed database works with refined, structured data. A traditional distributed relational database defines a schema for its data tuples, so reads and deletes operate at a much finer granularity.
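The difference in granularity can be shown with a small single-machine sketch. A plain dictionary stands in for a file store and the standard-library `sqlite3` module stands in for a relational database; both are stand-ins chosen for illustration, not distributed systems, and the table and path names are made up.

```python
import sqlite3

# File-system style: the store only knows opaque bytes addressed by a path.
file_store = {}
file_store["/logs/users.csv"] = b"1,alice,30\n2,bob,25\n"

# To change one field, the caller reads, rewrites, and stores the whole object.
text = file_store["/logs/users.csv"].decode()
rows = [line.split(",") for line in text.splitlines()]
for row in rows:
    if row[1] == "bob":
        row[2] = "26"
file_store["/logs/users.csv"] = ("\n".join(",".join(r) for r in rows) + "\n").encode()

# Database style: a schema describes each tuple, so a single field of a single
# row can be updated or deleted directly.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
db.executemany("INSERT INTO users VALUES (?, ?, ?)", [(1, "alice", 30), (2, "bob", 25)])
db.execute("UPDATE users SET age = 26 WHERE name = ?", ("bob",))
db.execute("DELETE FROM users WHERE id = ?", (1,))
remaining = db.execute("SELECT id, name, age FROM users").fetchall()
```

In the first half the storage layer never learns what a "user" is; in the second half the schema lets the store address one row and one column.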

Well-known distributed file systems today include GFS (not open source) and HDFS (Hadoop Distributed File System). Well-known distributed databases include HBase and OceanBase. HBase is built on HDFS, while OceanBase uses an internally implemented distributed file system. In other words, a distributed database uses a distributed file system as its underlying storage.

Differences between Unified Storage, Converged Storage and Distributed Storage

Specific concept of unified storage:

Unified storage is essentially a network storage architecture that supports both file-based network attached storage (NAS) and block-based storage area networks (SAN). Because it supports different storage protocols when providing data storage to host systems, it is also called multi-protocol storage.

Basic introduction:

Unified storage (sometimes called network unified storage or NUS) is a storage system that can run and manage files and applications on a single device. Therefore, the unified storage system integrates file-based and block-based access on a single storage platform, and supports Fibre Channel-based SAN, IP-based SAN(iSCSI) and NAS (Network Attached Storage).

Working mode:

Because it is a centralized disk array, it lets host systems access data at file level over an IP network, or at block level over the Fibre Channel protocol in a SAN. iSCSI is also a very common IP-based protocol, but it provides block-level data access. The disk array is equipped with multi-port storage controllers and a management interface, which allow storage administrators to create storage pools or spaces on demand and present them to host systems with different access types. The most common combinations are NAS with FC, or iSCSI with FC. All three protocols can also be supported at the same time, but storage administrators typically choose either FC or iSCSI for block-level access and combine it with file-level access (NAS) to form unified storage.
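As a rough sketch of one storage pool serving both access types, the following Python models block-level access by block number and file-level access by name on top of the same blocks. `StoragePool`, `FileService`, and the 16-byte block size are hypothetical and greatly simplified; no real FC, iSCSI, or NAS protocol is involved.

```python
BLOCK_SIZE = 16


class StoragePool:
    """Hypothetical pool of fixed-size blocks shared by both access paths."""

    def __init__(self, block_count):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(block_count)]
        self.free = list(range(block_count))

    # Block-level access (the FC or iSCSI side): address data by block number.
    def write_block(self, number, data):
        if len(data) > BLOCK_SIZE:
            raise ValueError("data larger than one block")
        self.blocks[number] = data.ljust(BLOCK_SIZE, b"\0")

    def read_block(self, number):
        return self.blocks[number]


class FileService:
    """Hypothetical file-level front end (the NAS side): address data by name."""

    def __init__(self, pool):
        self.pool = pool
        self.index = {}  # name -> (list of block numbers, length in bytes)

    def write_file(self, name, data):
        numbers = []
        for start in range(0, len(data), BLOCK_SIZE):
            number = self.pool.free.pop(0)  # take the next free block
            self.pool.write_block(number, data[start:start + BLOCK_SIZE])
            numbers.append(number)
        self.index[name] = (numbers, len(data))

    def read_file(self, name):
        numbers, length = self.index[name]
        raw = b"".join(self.pool.read_block(n) for n in numbers)
        return raw[:length]  # drop the zero padding of the last block


pool = StoragePool(block_count=8)
nas = FileService(pool)
nas.write_file("report.txt", b"unified storage serves files and blocks")
lun_block = pool.free.pop(0)  # a block-level host claims a raw block
pool.write_block(lun_block, b"raw-db-page")
```

The file client only ever sees names, the block client only ever sees block numbers, and the administrator carves both out of the same pool.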

Distributed storage supports multiple nodes. Is a node a disk or a master node?

"Node" is short for storage node, which is generally a storage server (with a controller). The servers are interconnected by a high-speed network.

More and more storage servers now combine ARM CPUs with a disk array to save energy and improve the "capacity-to-energy ratio".

What are the main categories of distributed file systems?

Distributed storage performs strongly in big data, cloud computing, and virtualization scenarios, and it is also important in most other scenarios. Munity.emc/message/655951 briefly introduces the development history of distributed file systems on *nix platforms:

1. Standalone (single-machine) file system

Local storage of operating systems and applications.

2. Network File System (NAS for short)

Built on the existing Ethernet architecture, it lets different servers share the data of a traditional file system.

3. Cluster file system

On top of shared storage, different servers can use the same traditional file system, coordinated by cluster locks.

4. Distributed file system

On top of a traditional file system, additional modules distribute data across servers and integrate RAID-style protection, so that multiple servers can access and modify the same file system at the same time. This offers excellent performance, good scalability, and low cost.
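The idea of spreading data across servers with RAID-style protection can be sketched with simple XOR parity, in the spirit of RAID 5 but without rotating the parity. The function names and the three-data-server layout are assumptions for illustration only.

```python
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def stripe_with_parity(data, data_servers):
    """Split data into equal chunks, one per data server, plus one XOR parity chunk."""
    chunk_len = -(-len(data) // data_servers)  # ceiling division
    padded = data.ljust(chunk_len * data_servers, b"\0")
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(data_servers)]
    parity = reduce(xor_bytes, chunks)
    return chunks + [parity]


def recover(chunks, lost_index):
    """Rebuild the chunk at lost_index by XOR-ing all surviving chunks."""
    survivors = [c for i, c in enumerate(chunks) if i != lost_index]
    return reduce(xor_bytes, survivors)


original = b"distributed file systems spread data across servers"
servers = stripe_with_parity(original, data_servers=3)  # 3 data chunks + 1 parity
lost = servers[1]
servers[1] = None                 # simulate losing one server
rebuilt = recover(servers, 1)
servers[1] = rebuilt
restored = b"".join(servers[:3])[:len(original)]
```

Because the parity chunk is the XOR of all data chunks, any single missing chunk equals the XOR of the remaining ones, which is why one server can fail without data loss.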

What is distributed storage? Explain its basic implementation principle.

The DFS2000 series from China Yunke DCNNCS is a big-data-oriented storage system. It adopts a truly distributed, fully symmetric cluster architecture and combines modular storage nodes with data and storage management software. It balances client connection load across nodes, automatically balances capacity and performance, optimizes cluster resources, and scales seamlessly from 3 to 144 nodes, increasing capacity and performance as it grows.

What is Hadoop Distributed File System?

A distributed file system is one in which the physical storage resources managed by the file system are not necessarily attached directly to the local node, but are connected to nodes through a computer network.

Hadoop is an open-source parallel computing framework and distributed file system developed by the Apache Software Foundation, modeled on Google's MapReduce and the Google File System.

HDFS (Hadoop Distributed File System) is one of its components.

What is the way of distributed file storage system?

Several implementations of distributed sessions:

1. Session sharing based on a database.

2. Session sharing based on an NFS shared file system.

3. Session sharing based on memcached. How can the high availability of memcached itself be ensured?

4. Session replication provided by the Resin/Tomcat web container itself.

5. Session sharing based on TT/Redis or JBoss Cache.

6. Session sharing based on cookies.

Alternatively, they can be grouped into three management modes:

1. Session replication: the session data on one machine is broadcast and copied to the other machines in the cluster. Usage scenario: few machines and little network traffic. Advantages: simple to implement, little configuration, and the failure of one machine does not affect user access. Disadvantages: broadcasting copies to the other machines takes some time and adds network overhead.

2. Sticky session: once a user has visited a machine in the cluster, all subsequent requests are forced onto that same machine. Usage scenario: a moderate number of machines and modest stability requirements. Advantages: simple to implement, easy to configure, and no additional network overhead. Disadvantages: when a machine goes down, its user sessions are lost, so it is prone to a single point of failure.

3. Centralized cache: sessions are stored on the machines of a distributed cache cluster. When users reach different nodes, each node first fetches the session information from the cache. Usage scenario: many machines in the cluster and a complex network environment. Advantages: good reliability. Disadvantages: the implementation is complicated, stability depends on the stability of the cache, and there must be a reasonable strategy for writing session information into the cache.
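The trade-off between sticky sessions and a centralized cache can be sketched in a few lines of Python. `AppServer`, `sticky_route`, the client address, and the plain dictionary standing in for a cache cluster are all hypothetical; a real deployment would use a load balancer and a cache such as Redis or memcached.

```python
import hashlib
import uuid


class AppServer:
    """Hypothetical web server that keeps sessions locally or in a shared cache."""

    def __init__(self, name, shared_cache=None):
        self.name = name
        self.local_sessions = {}          # used in sticky-session mode
        self.shared_cache = shared_cache  # used in centralized-cache mode

    def _store(self):
        if self.shared_cache is not None:
            return self.shared_cache
        return self.local_sessions

    def login(self, user):
        session_id = uuid.uuid4().hex
        self._store()[session_id] = {"user": user}
        return session_id

    def whoami(self, session_id):
        session = self._store().get(session_id)
        return session["user"] if session else None


def sticky_route(client_id, servers):
    """Sticky mode: the same client is always sent to the same server."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


# Sticky mode: the session lives on one machine only.
sticky = [AppServer("web-1"), AppServer("web-2")]
home = sticky_route("203.0.113.7", sticky)
sid = home.login("alice")
other = next(s for s in sticky if s is not home)

# Centralized-cache mode: every machine reads the same session store.
cache = {}  # stand-in for a distributed cache cluster
pooled = [AppServer("web-1", cache), AppServer("web-2", cache)]
shared_sid = pooled[0].login("bob")
```

In sticky mode `other.whoami(sid)` finds nothing, which is exactly the lost-session risk when `home` goes down; in cache mode any server can answer, at the cost of depending on the cache.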
The difference and connection between Session and Cookie, and how Session is implemented:

1. A session is stored on the server, and the client does not know what it contains; cookies are stored on the client, and the server can read them.

2. A session stores objects, while a cookie stores strings.

3. A session cannot distinguish paths: during one user's visit to a website, all sessions can be accessed from anywhere. If the path parameter is set on a cookie, cookies under different paths on the same website cannot access each other.

4. A session needs cookies to work properly. If the client disables cookies completely, the session will be invalid. HTTP is a stateless protocol. Every time a client reads a web page, the server opens a new session. ......
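The relationship described here, with objects kept on the server and only a string identifier kept in the cookie, can be sketched with the standard-library `http.cookies` module. The cookie name `SESSIONID`, the in-memory `sessions` dictionary, and the helper names are assumptions for illustration, not any particular container's behavior.

```python
import secrets
from http.cookies import SimpleCookie

sessions = {}  # server-side store: session id -> arbitrary Python objects


def start_session(data):
    """Create a session and return the value for a Set-Cookie header."""
    session_id = secrets.token_hex(16)
    sessions[session_id] = data
    cookie = SimpleCookie()
    cookie["SESSIONID"] = session_id       # hypothetical cookie name
    cookie["SESSIONID"]["path"] = "/"
    cookie["SESSIONID"]["httponly"] = True
    return cookie["SESSIONID"].OutputString()


def load_session(cookie_header):
    """Look up the session from the Cookie header sent back by the client."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("SESSIONID")
    return sessions.get(morsel.value) if morsel else None


set_cookie = start_session({"user": "alice", "cart": [1, 2]})
client_cookie = set_cookie.split(";")[0]   # the client echoes back only name=value
```

The client holds nothing but an opaque string; without that cookie, `load_session` has no way to tie the request to the stored objects.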