Five core technologies of cloud computing


The five core technologies of cloud computing systems

A cloud computing system uses many technologies; among the most critical are the programming model, data management technology, data storage technology, virtualization technology, and cloud computing platform management technology.

(1) Programming model

MapReduce is a programming model developed by Google, with implementations available in Java, Python, and C++. It is a simplified distributed programming model and an efficient task-scheduling model for parallel operations on large-scale data sets (larger than 1 TB). The strictness of the model makes programming in a cloud computing environment very simple. The idea of MapReduce is to decompose the problem to be solved into a map step and a reduce step: the map program first splits the data into independent blocks, which are distributed (scheduled) to a large number of computers for processing, achieving the effect of distributed computing; the reduce program then aggregates the results and outputs them.
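As a toy illustration of the map-then-reduce idea (a minimal sketch, not Google's implementation), consider counting words across independently processed input blocks:

```python
from collections import defaultdict

def map_phase(chunk):
    # map: emit a (word, 1) pair for every word in one input block
    return [(word, 1) for word in chunk.split()]

def reduce_phase(pairs):
    # reduce: sum the counts emitted for each key
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)

# split the input into independent blocks, map each block, then reduce
chunks = ["the cloud", "the grid and the cloud"]
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
print(reduce_phase(mapped))  # {'the': 3, 'cloud': 2, 'grid': 1, 'and': 1}
```

Because each block is mapped independently, the map calls can run on many machines at once; only the final reduce needs to see all intermediate pairs.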

(2) Massive data distributed storage technology

A cloud computing system is composed of a large number of servers and serves a large number of users simultaneously, so it stores data with distributed storage and uses redundant storage to ensure data reliability. The data storage systems widely used in cloud computing are GFS, developed by Google, and HDFS, the Hadoop team's open-source implementation of GFS.

GFS, the Google File System, is a scalable distributed file system for large-scale distributed applications that access large amounts of data. Its design differs from that of traditional file systems: it is built for large-scale data processing and for the characteristics of Google's applications. It runs on cheap commodity hardware yet provides fault tolerance and delivers high aggregate performance to a large number of users.

A GFS cluster is composed of a single master server and a large number of chunkservers, and is accessed by many clients. The master server stores all file system metadata, including the namespace, access-control information, the mapping from files to chunks, and the current locations of chunks. It also controls system-wide activities such as chunk lease management, garbage collection of orphaned chunks, and chunk migration between chunkservers. The master communicates periodically with each chunkserver through heartbeat messages, passing it instructions and collecting its state. Files in GFS are divided into 64 MB chunks and stored redundantly, with each chunk kept in three or more replicas.

Exchanges between clients and the master server are limited to metadata operations; all data communication goes directly to the chunkservers. This greatly improves system efficiency and prevents the master server from being overloaded.
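This separation of metadata from data can be sketched as follows (a conceptual toy with hypothetical names, not the real GFS API): the client asks the master only where a chunk lives, then fetches the bytes straight from a chunkserver.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # GFS files are split into 64 MB chunks

class Master:
    """Holds metadata only: file path -> ordered chunk IDs -> replica locations."""
    def __init__(self):
        self.file_chunks = {"/logs/web.log": ["chunk-0", "chunk-1"]}
        self.chunk_locations = {          # three replicas per chunk
            "chunk-0": ["cs1", "cs2", "cs3"],
            "chunk-1": ["cs2", "cs3", "cs4"],
        }

    def lookup(self, path, offset):
        # translate (file, byte offset) into a chunk ID plus its replica list
        chunk_id = self.file_chunks[path][offset // CHUNK_SIZE]
        return chunk_id, self.chunk_locations[chunk_id]

master = Master()
chunk_id, replicas = master.lookup("/logs/web.log", 70 * 1024 * 1024)
# the client now contacts one of `replicas` directly for the data;
# no file contents ever pass through the master
print(chunk_id, replicas)  # chunk-1 ['cs2', 'cs3', 'cs4']
```

Because the master handles only these small lookups, it can coordinate a very large cluster without sitting on the data path.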

(3) Massive data management technology

Cloud computing must process and analyze distributed, massive data, so its data management technology must be able to manage large amounts of data efficiently. The data management technologies in cloud computing systems are mainly Google's BigTable (BT) and HBase, the open-source data management module developed by the Hadoop team.

BigTable is a large distributed database built on GFS, a scheduler, a lock service, and MapReduce. Unlike a traditional relational database, it treats all data as objects, forming one huge table for the distributed storage of large-scale structured data.

Many Google projects use BigTable to store data, including web indexing, Google Earth, and Google Finance. These applications place very different demands on BigTable: different data sizes (from URLs to web pages to satellite imagery) and different latency requirements (from back-end batch processing to real-time data serving). BigTable successfully meets these varied requirements and provides a flexible, efficient service.
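The data model behind this flexibility can be sketched as a sparse table whose cells are addressed by row key, column, and timestamp (a toy sketch, not Google's actual API):

```python
class TinyTable:
    """Toy sparse table keyed by (row, column, timestamp), BigTable-style."""
    def __init__(self):
        self.cells = {}  # (row_key, column) -> {timestamp: value}

    def put(self, row_key, column, timestamp, value):
        self.cells.setdefault((row_key, column), {})[timestamp] = value

    def get(self, row_key, column):
        # return the most recent version of the cell, if any exists
        versions = self.cells.get((row_key, column), {})
        return versions[max(versions)] if versions else None

t = TinyTable()
# BigTable stores web pages under reversed URLs so pages from one
# site sort next to each other; multiple versions coexist by timestamp
t.put("com.example/index", "contents:", 1, "<html>v1</html>")
t.put("com.example/index", "contents:", 2, "<html>v2</html>")
print(t.get("com.example/index", "contents:"))  # <html>v2</html>
```

Because rows are sparse and versioned, the same table shape serves both tiny URL records and large page bodies.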

(4) Virtualization technology

Virtualization technology isolates software applications from the underlying hardware. It includes a splitting mode, which divides a single resource into multiple virtual resources, and an aggregation mode, which combines multiple resources into one virtual resource. Virtualization can be divided into storage virtualization, compute virtualization, network virtualization, and so on; compute virtualization can in turn be divided into system-level virtualization, application-level virtualization, and desktop virtualization.
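The splitting mode can be illustrated with a conceptual sketch (hypothetical classes, not any real hypervisor API) in which one physical host's CPU and memory are carved into virtual machines:

```python
class PhysicalHost:
    """Splitting mode: carve one physical resource into virtual slices."""
    def __init__(self, cpus, ram_gb):
        self.free_cpus, self.free_ram = cpus, ram_gb
        self.vms = []

    def create_vm(self, name, cpus, ram_gb):
        # refuse allocations that exceed the remaining physical capacity
        if cpus > self.free_cpus or ram_gb > self.free_ram:
            raise RuntimeError("host capacity exhausted")
        self.free_cpus -= cpus
        self.free_ram -= ram_gb
        self.vms.append(name)

host = PhysicalHost(cpus=16, ram_gb=64)
host.create_vm("web", cpus=4, ram_gb=8)
host.create_vm("db", cpus=8, ram_gb=32)
print(host.free_cpus, host.free_ram)  # 4 12... remaining capacity: 4 CPUs, 24 GB
```

The aggregation mode is the mirror image: many physical disks or hosts presented to the application as one large virtual resource.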

(5) Cloud computing platform management technology

Cloud computing resources are large in scale: large numbers of servers are distributed across different locations, and hundreds of applications run at the same time. Effectively managing these servers and ensuring that the whole system provides uninterrupted service is a huge challenge.

The platform management technology of a cloud computing system makes large numbers of servers work together, facilitates the deployment and provisioning of services, quickly detects and recovers from system faults, and, through automated and intelligent means, achieves reliable operation of large-scale systems. (End)
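Automated fault detection and recovery of this kind is often built on heartbeats, as in the GFS master described above. A minimal sketch under assumed names (no real platform API): servers that stop reporting within a timeout are presumed dead, and their services are rescheduled onto healthy machines.

```python
HEARTBEAT_TIMEOUT = 10.0  # seconds of silence before a server is presumed dead

def find_failed(last_heartbeat, now):
    # a server that has not reported within the timeout is treated as failed
    return [s for s, t in last_heartbeat.items() if now - t > HEARTBEAT_TIMEOUT]

def recover(assignments, failed, healthy):
    # move services off failed servers onto the first healthy one
    for service, server in assignments.items():
        if server in failed:
            assignments[service] = healthy[0]
    return assignments

now = 100.0
last_heartbeat = {"s1": 99.0, "s2": 80.0}  # s2 last reported 20 s ago
assignments = {"web": "s1", "db": "s2"}
failed = find_failed(last_heartbeat, now)
print(recover(assignments, failed, healthy=["s1"]))  # db moves to s1
```

A production scheduler would also rebalance load and respect capacity, but the loop is the same: observe heartbeats, detect silence, reschedule.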

Copyright © 2011 JIN SHI