[Theoretical Research on Cloud Computing IT Infrastructure] 05 - Hyperconvergence Technology

Strictly speaking, hyperconvergence does not fit neatly into any one layer of cloud computing IT infrastructure. Call it distributed storage, and it is actually hardware servers plus storage; call it hardware, and it cannot work without distributed storage software.

Traditional IT infrastructure is divided into three main layers: network, computing, and storage. With the development of cloud computing and distributed storage technology, and the standardization of x86 servers, a hyper-converged architecture that integrates computing and storage in the same nodes has gradually emerged. Hyperconvergence reduces the three-tier IT infrastructure to two tiers.

In Gartner's Magic Quadrant for hyperconverged infrastructure published in November 2019, five vendors sit in the Leaders quadrant: Nutanix, Dell, VMware, Cisco, and HPE. (The distributed storage software in Dell's VxRail appliance is VMware's vSAN; VMware also offers vSAN as a pure software solution.)

That Nutanix has become a leader in hyperconvergence has naturally been fully validated and recognized by the market. And because its public documentation (the Nutanix Bible) is relatively complete, we can get a good view of hyperconvergence through Nutanix.

There is no need to reproduce it here; searching for "Nutanix Bible" or "Nutanix-Bible" in any search engine will turn up the corresponding official document.

Quoted from the Nutanix Bible: "The Nutanix solution converges storage and compute resources. It is an integrated software-and-hardware platform that provides 2 or 4 nodes in 2U of rack space.

Each node runs a hypervisor (ESXi, KVM, and Hyper-V are supported) and a Nutanix Controller VM (CVM). The Nutanix CVM runs the Nutanix core software and serves all of the virtual machines and their I/O operations.

With Intel VT-d (VM Direct Path) technology, the SCSI controller of a Nutanix node running VMware vSphere (which manages the solid-state and hard disk devices) is passed through directly to the CVM."

Personal summary: according to the official document above, 2 to 4 Nutanix nodes can be installed in 2U of space (each node is equivalent to one physical server), so equipment density is very high. Virtualization software is installed on each node, and a Nutanix Controller VM (CVM) runs on the virtualization layer; it is mainly responsible for control-plane communication between the Nutanix nodes. Each node is fitted with SSDs and HDDs as storage in place of a disk array, and each node has its own CPU and memory, serving as a compute node.

1. Infrastructure

Take three Nutanix nodes as an example. Each node runs a hypervisor hosting guest virtual machines, plus a Nutanix Controller VM (CVM). Each node is fitted with two SSDs and four HDDs, which are read and written through the SCSI controller.
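
To make the layout concrete, here is a minimal sketch in Python of the node and cluster structure described above. All class and field names are illustrative assumptions, not Nutanix object models or APIs.

```python
# Minimal sketch of the example configuration; names are illustrative.
from dataclasses import dataclass, field

@dataclass
class Disk:
    kind: str          # "SSD" or "HDD"
    capacity_gb: int

@dataclass
class Node:
    name: str
    hypervisor: str    # e.g. "ESXi", "KVM", "Hyper-V"
    disks: list = field(default_factory=list)
    cvm: str = "CVM"   # each node runs one Controller VM that owns the
                       # local disks (SCSI pass-through) and serves VM I/O

def make_node(name: str, hypervisor: str) -> Node:
    # Per the example above: two SSDs plus four HDDs per node
    # (capacities are assumed values for illustration).
    disks = [Disk("SSD", 800) for _ in range(2)] + \
            [Disk("HDD", 4000) for _ in range(4)]
    return Node(name, hypervisor, disks)

cluster = [make_node(f"node-{i}", "ESXi") for i in range(1, 4)]
for n in cluster:
    ssds = sum(d.kind == "SSD" for d in n.disks)
    hdds = sum(d.kind == "HDD" for d in n.disks)
    print(f"{n.name}: hypervisor={n.hypervisor}, CVM=1, SSD x{ssds}, HDD x{hdds}")
```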

2. Data protection

Unlike a traditional disk array, which protects data through RAID and LVM, Nutanix, like most distributed storage, protects data by creating replicas and copying them to other Nutanix nodes for storage. Nutanix calls the number of replicas the replication factor, RF (generally RF is 2 to 3).

When a guest writes data into a virtual machine (process "1a" in the figure), the data is first written to the OpLog area carved out of the local Nutanix node's SSD (which plays the role of a cache). Then, in process "1b", the local node's CVM replicates the data from the local SSD's OpLog to the OpLogs of other nodes; the number of replicas depends on RF. Once the CVMs of the other nodes confirm that the data has been written, process "1c" returns a write-complete acknowledgment to the VM. Data protection is thus achieved through replication.

The data is later written asynchronously from the OpLog on SSD to the extent store spanning SSD and HDD according to specific rules. See the next section for details.
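
The write path above can be sketched as follows in Python: a synchronous OpLog write on the local SSD, replication to RF-1 peers, an acknowledgment, and a later asynchronous drain to the extent store. The class and function names are assumptions for illustration; this is not Nutanix code.

```python
# Hedged sketch of the replicated write path described above.
RF = 2  # replication factor (typically 2 or 3)

class NodeStore:
    def __init__(self, name):
        self.name = name
        self.oplog = []         # SSD-backed write log (cache role)
        self.extent_store = []  # persistent SSD+HDD tier

    def write_oplog(self, block):
        self.oplog.append(block)

    def drain(self):
        # Asynchronously move acknowledged writes to the extent store.
        self.extent_store.extend(self.oplog)
        self.oplog.clear()

def vm_write(block, local, peers, rf=RF):
    local.write_oplog(block)          # "1a": write the local OpLog
    for peer in peers[: rf - 1]:      # "1b": replicate to RF-1 peers
        peer.write_oplog(block)
    return "ack"                      # "1c": acknowledge the VM write

nodes = [NodeStore(f"node-{i}") for i in range(1, 4)]
print(vm_write(b"data", nodes[0], nodes[1:]))  # -> ack
nodes[0].drain()                               # later, in the background
```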

3. Storage Tiering

Nutanix writes data according to a core principle of data locality: writes land on the local node's own disks.

When a guest VM writes data, the local SSD is used first (provided SSD usage has not reached the threshold). If the local SSD is full, its coldest data is migrated to the SSDs of other nodes in the cluster to free local SSD space for new writes. The point of data locality is to make VM access to storage as fast as possible, so that a local VM never has to reach across nodes for its data. (This is arguably the biggest difference from vSAN and other distributed file systems.)

When SSD usage across the whole cluster reaches a threshold (generally 75%), each node migrates data from its SSDs to its own HDDs.

When migrating data from SSD to HDD, not everything is moved: the data is sorted by how hot it is, and the cold, rarely accessed data is migrated to HDD first.

If SSD utilization reaches 95%, 20% of the cold data is migrated to HDD; if SSD utilization reaches 80%, by default 15% of the cold data is migrated to HDD.
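
A small Python sketch of these tiering rules, using the thresholds quoted above (80% -> move 15% of cold data, 95% -> move 20%). The function names and the idea of ranking coldness by last access time are illustrative assumptions.

```python
# Sketch of threshold-driven down-migration from SSD to HDD.
def ssd_migration_fraction(utilization: float) -> float:
    """Fraction of coldest SSD data to migrate to HDD."""
    if utilization >= 0.95:
        return 0.20
    if utilization >= 0.80:
        return 0.15
    return 0.0

def migrate_cold_data(blocks, utilization):
    # blocks: list of (block_id, last_access_time); the smaller the
    # timestamp, the colder the block. Coldest blocks migrate first.
    frac = ssd_migration_fraction(utilization)
    n = int(len(blocks) * frac)
    coldest_first = sorted(blocks, key=lambda b: b[1])
    return coldest_first[:n]  # these blocks move to the HDD tier

ssd_blocks = [(f"b{i}", i * 7 % 10) for i in range(10)]
print(migrate_cold_data(ssd_blocks, utilization=0.96))
# -> the two least recently accessed blocks
```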

4. Data reading and migration

Quoted from the Nutanix Bible: "When a virtual machine is migrated from one node to another (or an HA failover occurs), the VM's data will be served by the CVM that is now local to it. When old data (stored on the previous node) is read, the I/O request is forwarded by the local CVM to the remote CVM. All write I/O is completed by the local CVM. When the DSF detects that I/O requests are landing on another node, it automatically moves the data to the local node in the background, so that all read I/O is served locally. Data is moved only when it is read, which avoids putting excessive pressure on the network."

Personal summary: in general, a virtual machine reads and writes data on its local node's disks. If the local node's disks do not hold some of the data, that data is first copied from other nodes onto the local node's disks and then served to the local VM, rather than having the VM access other nodes directly. In other words, the core idea of data locality is enforced throughout.
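
A short Python sketch of this read path: serve locally if possible, otherwise proxy the read to the remote CVM and localize the data in the background so later reads are local. All names here are illustrative assumptions, not Nutanix code.

```python
# Sketch of read-triggered data localization after a VM migration.
class CVM:
    def __init__(self, name):
        self.name = name
        self.local_data = {}

    def read(self, key, remote=None):
        if key in self.local_data:
            return self.local_data[key]   # served locally
        # Forward the read to the remote CVM that still holds the data,
        # then migrate it so subsequent reads are local. Migration only
        # happens on read, avoiding needless network traffic.
        value = remote.local_data[key]
        self.local_data[key] = value
        del remote.local_data[key]
        return value

old_node, new_node = CVM("node-1"), CVM("node-2")
old_node.local_data["vm-disk-block"] = b"payload"
print(new_node.read("vm-disk-block", remote=old_node))  # remote, then localized
print(new_node.read("vm-disk-block"))                   # now purely local
```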

5. Advantages and disadvantages of the Nutanix solution

Advantages of the Nutanix solution:

1) The data-locality write strategy guarantees the speed of VM storage access: all data written by a VM lands on the disks of its own physical node, which avoids cross-node storage access, guarantees access speed, and reduces network pressure.

2) SSDs are used as the data cache, which greatly improves IO performance;

As the table above shows, for random reads and writes the IO and bandwidth performance of an SSD is roughly 1000 times that of a SATA disk. Combined with Nutanix's data-locality write strategy, VM data writes use only the two local SSDs as the data cache.

Because a single SSD delivers roughly 1000 times the IO of a traditional array's SATA disks, IO performance improves enormously. (Two SSDs provide IO performance similar to a RAID group of more than 2,000 SATA drives.)
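
The arithmetic behind that parenthetical can be checked with a few lines of Python. The absolute SATA IOPS figure is an assumed illustrative value; only the ~1000x ratio comes from the text.

```python
# Back-of-the-envelope check of the "~2,000 SATA drives" estimate.
sata_iops = 150                  # assumed random IOPS of one SATA disk
ssd_iops = sata_iops * 1000      # the ~1000x ratio claimed above

local_cache_iops = 2 * ssd_iops  # two local SSDs as the write cache
equivalent_sata_disks = local_cache_iops // sata_iops
print(f"Two SSDs ~ {local_cache_iops:,} IOPS "
      f"~ a RAID group of {equivalent_sata_disks:,} SATA disks")
# -> a RAID group of 2,000 SATA disks, matching the estimate above
```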

3) Writes always go to SSD first, guaranteeing high IO performance.

Data never needs to be written directly to HDD. Even if the local SSD is full, cold data is migrated to the SSDs of other nodes in the cluster, and reads and writes continue on SSD to keep IO high. The cold SSD data is then migrated asynchronously to HDD.

4) Tiered storage of hot and cold data

Cold data is stored on HDD and hot data on SSD, guaranteeing high read IO for hot data.

5) High equipment density, saving rack space in the machine room.

2U can hold four nodes providing both storage and compute, which saves a great deal of space compared with the earlier rack/blade server plus disk array solutions.

Disadvantages of the Nutanix solution:

1) The data-locality and SSD-caching scheme guarantees high IO, but disk bandwidth is not guaranteed.

In a traditional disk array, multiple SATA/SAS disks form a RAID group. When data is written, a file is split into multiple blocks that are distributed across the disks, and all disks in the RAID group participate simultaneously in reading and writing the file's blocks. Parallel reads and writes across many disks improve both IO and bandwidth.
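
The striping idea can be illustrated with a few lines of Python: consecutive blocks of a file are dealt round-robin across the disks of a RAID group, so a full-file read engages every disk at once. This is illustrative pseudocode, not a real RAID implementation.

```python
# Round-robin striping of a file's blocks across a RAID group.
def stripe(data: bytes, num_disks: int, block_size: int = 2):
    """Distribute consecutive blocks across disks round-robin."""
    disks = [[] for _ in range(num_disks)]
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    for i, block in enumerate(blocks):
        disks[i % num_disks].append(block)
    return disks

layout = stripe(b"ABCDEFGHIJKLMNOP", num_disks=4)
for i, d in enumerate(layout):
    print(f"disk {i}: {d}")
# Each disk holds every 4th block, so aggregate read/write bandwidth
# scales with the number of disks in the group.
```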

In the Nutanix solution, reads and writes of a single file follow the data-locality policy, so a file is no longer split across multiple disks for parallel access; only the SSDs of the local node write the file.

Although an SSD's IO is several hundred times that of a SATA/SAS disk, its bandwidth is only 2 to 3 times higher, and the traditional RAID approach reads and writes across many disks in parallel. Although RAID's IO falls short of SSD, its aggregate bandwidth is far higher than that of one or two SSDs.

Therefore, the Nutanix solution suits workloads with high IO demands, but by the nature of its read/write path it is not suitable for workloads with low IO but high bandwidth demands.

2) Comparison with industry competitors:

VMware EVO:RAIL bundle: VMware itself does not sell hardware products, but qualified EVO:RAIL partners may use the EVO:RAIL bundle. Partners sell the hardware together with the integrated EVO:RAIL software and provide all hardware and software support to customers.

The core of EVO:RAIL is essentially a packaging of vSphere virtualization software plus vSAN software.

The biggest difference between vSAN and Nutanix, however, is that vSAN does not have to follow Nutanix's strict data-locality policy. By setting the stripe width, a local VM's reads and writes can be spread across the disks of multiple nodes. The default stripe width is 1 and the maximum is 12, meaning that when one VM writes data, the SSDs of up to 12 nodes can read and write in parallel.
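
A minimal Python sketch of what stripe width means for placement: width 1 keeps an object's blocks on one component, while larger widths (vSAN allows up to 12) spread them across components on multiple nodes that can then serve I/O in parallel. The function is an illustrative assumption, not the vSAN API.

```python
# Sketch of stripe-width placement: block index -> stripe component.
def place_object(blocks, stripe_width):
    placement = {}
    for i, block in enumerate(blocks):
        placement.setdefault(i % stripe_width, []).append(block)
    return placement

blocks = [f"blk{i}" for i in range(8)]
print(place_object(blocks, stripe_width=1))  # all blocks on one component
print(place_object(blocks, stripe_width=4))  # spread over 4 components,
                                             # which read/write in parallel
```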

In this way, vSAN can to some extent make up for the Nutanix solution's weakness with workloads that need high bandwidth but little IO.

However, with large numbers of virtual machines this cross-node access traffic will inevitably put pressure on the network, and network bandwidth may become the next bottleneck.

Second, vSAN is integrated into the hypervisor layer itself, so it does not need to run a controller virtual machine (CVM) on top of the hypervisor the way Nutanix does.

Third, Nutanix supports KVM, Hyper-V, ESXi, and other hypervisors, whereas vSAN supports only VMware's own ESXi.

Still to be added: since no actual deployment test of vSAN has been carried out yet, this remains a study of its principles; the vSAN section will be expanded after online testing on a subsequent platform is completed.