The Principle and Development Prospects of P2P Technology

Principles of P2P technology

P2P technology is a kind of overlay network and an alternative to the client/server (C/S) mode of exchanging information over a network. In C/S mode, data is distributed by a dedicated server from which multiple clients obtain it. The advantage of this mode is that data consistency is easy to control and the system is easy to manage. The disadvantages are twofold. First, because there is only one server (or, at best, a limited number of them), the system is prone to a single point of failure. Second, a single server facing many clients is constrained by its CPU capacity, memory size, and network bandwidth, so the number of clients it can serve at the same time is limited and scalability is poor. P2P technology is a peer-to-peer network structure proposed to solve these problems. In a P2P network, each node can obtain services from other nodes and also provide services to them. In this way the large pool of terminal resources is put to use, and both drawbacks of the C/S mode are addressed.

Basic structure of a peer-to-peer network

(1) Centralized peer-to-peer network (Napster, QQ)

A centralized peer-to-peer network is built around a central directory server that provides directory query services to every peer in the network, while the content itself does not pass through the central server. This kind of network has a relatively simple structure, and the burden on the central server is greatly reduced compared with the C/S mode. However, because there is still a central node, it can easily become a bottleneck, and scalability is poor, so this design is not suitable for large networks. On the other hand, because the directory is managed centrally, it is a viable option for small networks that need management and control.
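The directory lookup described above can be sketched in a few lines. This is a minimal illustration, not Napster's or QQ's actual protocol: the `DirectoryServer` class, its method names, and the peer addresses are all hypothetical.

```python
from collections import defaultdict


class DirectoryServer:
    """Hypothetical central directory: maps file names to peer addresses."""

    def __init__(self):
        self._index = defaultdict(set)

    def register(self, peer_addr, file_names):
        # A peer announces the files it is willing to share.
        for name in file_names:
            self._index[name].add(peer_addr)

    def query(self, file_name):
        # Only directory information is returned; the file content
        # never passes through the server.
        return sorted(self._index.get(file_name, set()))


server = DirectoryServer()
server.register("10.0.0.1:6699", ["song.mp3", "notes.txt"])
server.register("10.0.0.2:6699", ["song.mp3"])
print(server.query("song.mp3"))
```

After the query, the requesting peer would contact one of the returned addresses directly to transfer the file.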

(2) Unstructured distributed network (Gnutella)

The most significant difference between an unstructured distributed network and a centralized one is that it has no central server; every node joins the network by communicating with its neighboring nodes. Nodes search for the resources they need by means of query packets. Specifically, a node sends a query packet containing the query content to its adjacent nodes, and the packet then spreads through the network by diffusion. Left uncontrolled, this would flood the network with messages, so an appropriate time to live (TTL) is generally set and decremented at each hop. When the TTL reaches 0, the packet is no longer forwarded.

This unstructured organization is relatively loose, and nodes can join and leave relatively freely. Popular content is easy to find, but unpopular content is hard to find with a small TTL, while a large TTL can easily generate heavy query traffic. Once the network grows beyond a certain scale, traffic increases dramatically even if the TTL is kept small. However, when the network contains some resource-rich, server-like nodes, query efficiency can be significantly improved.
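The TTL-limited flooding described above, and the trade-off between reach and traffic, can be illustrated with a small simulation. This is a sketch under simplifying assumptions: the overlay graph, the node names, and the `flood_query` function are hypothetical, and duplicate suppression is modeled with a simple set of already-visited nodes.

```python
from collections import deque


def flood_query(graph, holders, start, ttl):
    """Flood a query from `start`; return (holders reached, messages sent)."""
    seen = {start}
    hits = set()
    messages = 0
    queue = deque([(start, ttl)])
    while queue:
        node, remaining = queue.popleft()
        if node in holders:
            hits.add(node)
        if remaining == 0:
            continue  # TTL exhausted: do not forward any further
        for neighbor in graph[node]:
            messages += 1  # every forwarded copy costs one message
            if neighbor not in seen:  # duplicate copies are dropped on arrival
                seen.add(neighbor)
                queue.append((neighbor, remaining - 1))
    return hits, messages


# Hypothetical overlay: a chain A-B-C-D-E with an extra link B-D.
graph = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
holders = {"E"}  # only node E holds the (unpopular) resource
print(flood_query(graph, holders, "A", ttl=2))
print(flood_query(graph, holders, "A", ttl=3))
```

In this toy overlay, a TTL of 2 sends 4 messages and misses the resource, while a TTL of 3 finds it at the cost of 9 messages.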

(3) Structured distributed network (third-generation P2P: Pastry, Tapestry, Chord, CAN)

Structured distributed networks are the result of recent research based on Distributed Hash Table (DHT) technology. The basic idea is to organize all the resources in the network into one huge table that contains the keywords of the resources and the addresses of the nodes that store them, and then to split the table and store the pieces on the individual nodes of the network. When a user searches for a resource, the network first locates the node that stores the hash table entry for the corresponding keyword; that entry holds the addresses of the nodes containing the required resource. Using this address information, the user's node connects to those nodes and transfers the resource. This is a technically advanced form of peer-to-peer network: it is highly structured and highly scalable, and nodes are free to join and leave. The approach is suitable for larger networks.
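The two-step lookup described above (keyword to index node, then index entry to the addresses of the nodes holding the resource) can be sketched as follows. This is a deliberately simplified illustration: the `TinyDHT` class is hypothetical, the partitions live in one process, and plain modulo placement is assumed in place of the routing schemes used by Pastry, Tapestry, Chord, or CAN.

```python
import hashlib


def key_slot(keyword, num_nodes):
    # Hash the keyword and map it to one of the node slots.
    digest = hashlib.sha1(keyword.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_nodes


class TinyDHT:
    """Hypothetical table split across `num_nodes` in-memory partitions."""

    def __init__(self, num_nodes):
        self.partitions = [dict() for _ in range(num_nodes)]

    def publish(self, keyword, holder_addr):
        # Store the holder's address on the node responsible for the keyword.
        part = self.partitions[key_slot(keyword, len(self.partitions))]
        part.setdefault(keyword, []).append(holder_addr)

    def lookup(self, keyword):
        # Step 1: find the responsible partition. Step 2: read the addresses.
        part = self.partitions[key_slot(keyword, len(self.partitions))]
        return part.get(keyword, [])


dht = TinyDHT(num_nodes=4)
dht.publish("movie", "10.0.0.5:4000")
dht.publish("movie", "10.0.0.9:4000")
print(dht.lookup("movie"))
```

The requesting node would then connect to the returned addresses to transfer the resource itself.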

P2P networks have three popular organizational structures, which are used in different P2P applications.

(1) DHT structure

The Distributed Hash Table (DHT) [1] is a powerful tool, and its proposal set off a wave of DHT research in academia. Although DHT has various implementations, they typically share a ring topology in which each node has a unique node identifier (ID), and the node ID is a 128-bit hash value. Each node keeps the IDs of its predecessors and successors in its routing table, as shown in Figure 1(a). With this routing information, other nodes can be found easily. This structure is mostly used for file sharing and as an underlying structure for streaming media transmission [2].
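The ring of node IDs with predecessors and successors can be sketched as below. This is an illustration only: the function names are hypothetical, a truncated SHA-256 digest is assumed as the source of 128-bit IDs, and the whole ring is held in one sorted list rather than in per-node routing tables.

```python
import bisect
import hashlib


def ring_id(name):
    # Assumption: take the first 16 bytes of SHA-256 to get a 128-bit ID.
    digest = hashlib.sha256(name.encode("utf-8")).digest()[:16]
    return int.from_bytes(digest, "big")


def successor(sorted_ids, ident):
    # First node ID at or after `ident`, wrapping around the ring.
    i = bisect.bisect_left(sorted_ids, ident)
    return sorted_ids[i % len(sorted_ids)]


def neighbors(sorted_ids, node_id):
    # Predecessor and successor that a node would keep in its routing table.
    i = sorted_ids.index(node_id)
    n = len(sorted_ids)
    return sorted_ids[(i - 1) % n], sorted_ids[(i + 1) % n]


# Small hand-picked IDs make the wrap-around easy to follow.
demo_ids = [10, 40, 90]
print(successor(demo_ids, 41))   # 90
print(successor(demo_ids, 95))   # 10 (wraps around the ring)
print(neighbors(demo_ids, 10))   # (90, 40)

# The same functions work on hashed 128-bit IDs.
node_ids = sorted(ring_id(n) for n in ["node-a", "node-b", "node-c"])
owner = successor(node_ids, ring_id("some-file-key"))
```

Here a key is assigned to the first node whose ID follows the key's hash on the ring, which is one common convention and is assumed rather than taken from the text.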

(2) Tree structure

The tree structure of the P2P network is shown in Figure 1(b). In this structure, all nodes are organized as a tree: the root has only child nodes, the leaves have only parent nodes, and every other node has both a parent node and child nodes. Information flows along the branches. The basic tree structure is mostly used for P2P live streaming [3-4].

(3) Mesh structure

The mesh structure, also called unstructured, is shown in Figure 1(c). As the name suggests, all nodes in this structure are connected randomly, with no stable relationships and no parent-child relationships. The mesh structure [5] gives P2P the greatest fault tolerance and dynamic adaptability, and it has achieved great success in streaming media and on-demand applications. When the network becomes very large, the concept of a supernode is often introduced. Supernodes can be combined with any one or more of these structures to form a new structure, as in KaZaA [6].

Applications of P2P technology

(1) Distributed scientific computing

P2P technology can pool the CPU resources of many terminals to serve a common computation. Such computations are generally scientific workloads that involve a huge amount of calculation, large volumes of data, and long running times. For each computation, the task (including its logic and data) is divided into multiple slices that are assigned to the P2P nodes participating in the computation. Without affecting normal use of the host computers, the scattered CPU resources complete the calculation and return their results to one or more servers, which integrate the partial results into the final result.
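The split, compute, and integrate pattern described above can be sketched as follows. This is a toy illustration: threads stand in for remote peer machines, the sum-of-squares workload is an arbitrary stand-in for a scientific calculation, and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, num_slices):
    # Divide the input into nearly equal contiguous slices.
    size, extra = divmod(len(data), num_slices)
    slices, start = [], 0
    for i in range(num_slices):
        end = start + size + (1 if i < extra else 0)
        slices.append(data[start:end])
        start = end
    return slices


def node_task(chunk):
    # Work done by one participating node: here, a sum of squares.
    return sum(x * x for x in chunk)


def run_job(data, num_nodes):
    slices = split(data, num_nodes)
    # Threads stand in for remote peer machines in this sketch.
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(node_task, slices))
    # The coordinating server integrates the partial results.
    return sum(partials)


print(run_job(list(range(1, 11)), num_nodes=3))  # 385
```

A real deployment would also have to handle nodes that leave before returning a result, for example by reassigning their slices.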

(2) File sharing

BitTorrent is an unstructured network protocol. Besides BitTorrent, there are many other well-known unstructured P2P file-sharing protocols, typically Gnutella [8] and KaZaA [6].

(3) Live streaming

(4) Streaming media on demand

(5) IP layer voice communication

Skype adopts a KaZaA-like topology and selects some nodes in the network as super nodes. When the direct connection between two communicating parties performs poorly, suitable super nodes act as relay nodes: they create a relay connection for the two parties and forward the voice packets.

Mechanism Analysis of Typical P2P Applications

3.1 BitTorrent

A BitTorrent user first obtains the seed (torrent) file for the desired download from a web server. The seed file contains the name of the file to be downloaded and the hash values of its data, as well as the addresses of one or more index servers (trackers). The process works as follows. The client sends a Hypertext Transfer Protocol (HTTP) GET request to the index server, placing its own information and the hash value of the file to be downloaded in the GET parameters. The index server looks up the requested hash value in its internal data dictionary and returns a random set of nodes that are downloading the file. The client then connects to these nodes and downloads the file pieces it needs. The BitTorrent download process can therefore be divided into two parts: HTTP communication with the index server, and a protocol for communicating with other clients and transferring data, which we call the BitTorrent peer-to-peer protocol. The working principle of BitTorrent is shown in Figure 4. The BitTorrent protocol also continues to evolve: instead of relying only on the original HTTP method, a client can now obtain information about available peers through the User Datagram Protocol (UDP) and DHT. This makes BitTorrent applications more flexible and improves the user's download experience.
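The exchange with the index server described above can be sketched as follows. This is a simplified model, not a working client: `ToyTracker` is a hypothetical in-memory stand-in for the index server, the tracker URL is a placeholder, and the hash is computed from arbitrary bytes rather than from a real seed file. The GET parameter names `info_hash`, `peer_id`, and `port` follow the commonly documented tracker request.

```python
import hashlib
import random
from urllib.parse import urlencode


def build_announce_url(tracker_url, info_hash, peer_id, port):
    # The client's own information and the file hash go into GET parameters.
    params = {"info_hash": info_hash, "peer_id": peer_id, "port": port}
    return tracker_url + "?" + urlencode(params)


class ToyTracker:
    """Hypothetical in-memory index server keyed by the file hash."""

    def __init__(self, rng=None):
        self.swarms = {}  # file hash -> set of peer addresses
        self.rng = rng or random.Random()

    def announce(self, info_hash, peer_addr, max_peers=50):
        swarm = self.swarms.setdefault(info_hash, set())
        others = sorted(swarm - {peer_addr})
        swarm.add(peer_addr)
        # Return a random subset of the peers already downloading the file.
        return self.rng.sample(others, min(max_peers, len(others)))


# Stand-in hash; a real client derives it from the seed file's contents.
info_hash = hashlib.sha1(b"example file description").digest()
url = build_announce_url(
    "http://tracker.example/announce", info_hash, "-PY0001-000000000001", 6881
)

tracker = ToyTracker(random.Random(0))
first = tracker.announce(info_hash, "10.0.0.1:6881")
second = tracker.announce(info_hash, "10.0.0.2:6881")
print(first, second)
```

The first peer to announce receives an empty list; later peers receive addresses they can contact directly using the peer-to-peer protocol.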

3.2 eMule

eMule is based on an improved version of the eDonkey protocol and remains compatible with it. Each eMule client is preconfigured with a list of servers and a list of locally shared files. The eMule server does not store any files; it is only a central index of file location information. Once started, the eMule client automatically connects to an eMule server over the Transmission Control Protocol (TCP) and logs in. The server gives the client a client identification (ID) that is valid only for the lifetime of that client-server connection. After the connection is established, the client sends a list of its shared files to the server, which saves the list in its internal database. The client also sends the server requests for the files it wants to download, and the server returns a list of the clients that can provide those files. The client then actively establishes connections to them and downloads the file. A client can download the same file from multiple other eMule clients, getting different pieces of data from different clients. eMule also extends eDonkey's capabilities by allowing clients to exchange information about servers, other clients, and files. Figure 5 shows how eMule works.
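The login, source lookup, and multi-source download described above can be sketched as follows. This is an illustration only: `IndexServer` and `assign_parts` are hypothetical names, the file hashes are placeholder strings, and the round-robin assignment of parts to sources is an assumption rather than eMule's actual scheduling.

```python
class IndexServer:
    """Hypothetical central index: records who shares which file, not the files."""

    def __init__(self):
        self.sources = {}  # file hash -> set of client IDs
        self.next_id = 1

    def login(self, shared_files):
        # Assign an ID for this connection and record the client's shared list.
        client_id = self.next_id
        self.next_id += 1
        for file_hash in shared_files:
            self.sources.setdefault(file_hash, set()).add(client_id)
        return client_id

    def find_sources(self, file_hash):
        # Return the clients that can provide the requested file.
        return sorted(self.sources.get(file_hash, set()))


def assign_parts(num_parts, source_ids):
    # Spread the parts of one file over the available sources round-robin.
    return {part: source_ids[part % len(source_ids)] for part in range(num_parts)}


server = IndexServer()
alice = server.login(["hash-f1", "hash-f2"])
bob = server.login(["hash-f1"])
sources = server.find_sources("hash-f1")
print(assign_parts(5, sources))
```

The downloading client would then connect to each assigned source and request only the parts mapped to it.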

The basic principle of eMule is similar to that of BitTorrent: the client obtains file download information through an index server. The two differ in a few respects. eMule allows server information to be passed between clients, whereas a BitTorrent client can obtain this information only through an index server or DHT. eMule shares an entire file directory, whereas BitTorrent shares only download tasks, which makes BitTorrent better suited to distributing popular files. eMule, by contrast, tends to be used for files of more ordinary popularity.

3.3 Thunder

Thunder is a download program based on multi-resource, multi-threaded technology. It is claimed to download 7 to 10 times faster than the download software commonly in use. Thunder's technology has two main parts. The first is the search for and integration of existing download resources on the Internet: the resources are verified, and the uniform resource locator (URL) information of resources with the same check value is aggregated. When a user clicks a download link, the Thunder server returns a subset of the aggregated URL information according to a certain policy, and the user's information is reported back to the Thunder server. The second part is the Thunder client, which raises the download rate by fetching the required file from multiple resources with multiple threads. The fundamental reason Thunder downloads are fast and stable is that it integrates multiple stable server resources to transfer data from multiple resources over multiple threads. This also lets Thunder balance load across server resources without degrading the user experience, effectively reducing the load on each server.
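The two parts described above, aggregating URLs that share a check value and then splitting one download across several sources, can be sketched as follows. This is a hedged illustration, not Thunder's implementation: the function names, the example URLs, and the equal-sized byte-range policy are all assumptions, and no network transfer is performed.

```python
from collections import defaultdict


def aggregate_by_checksum(resources):
    # resources: iterable of (url, checksum) pairs; URLs with equal
    # checksums are treated as mirrors of the same file.
    groups = defaultdict(list)
    for url, checksum in resources:
        groups[checksum].append(url)
    return dict(groups)


def plan_ranges(file_size, urls):
    # Give each source one contiguous byte range (inclusive bounds).
    base, extra = divmod(file_size, len(urls))
    plan, start = [], 0
    for i, url in enumerate(urls):
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue  # more sources than bytes: skip the surplus sources
        plan.append((url, start, start + length - 1))
        start += length
    return plan


resources = [
    ("http://a.example/f.zip", "abc"),
    ("http://b.example/f.zip", "abc"),
    ("http://c.example/g.zip", "xyz"),
]
groups = aggregate_by_checksum(resources)
print(plan_ranges(10, groups["abc"]))
```

Each planned range could then be fetched by its own thread and the pieces written into the output file at the corresponding offsets.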

When a user downloads a file from the Internet, the download is recorded on the Thunder server. If other users later download the same file, the Thunder server searches its database for users who have already downloaded it and connects the new downloaders to them. The record of the file in a user's download list is used to check whether the file still exists on that user's machine (the record becomes invalid if the file has been renamed or moved). If it does, that user acts as an intermediate download source and uploads the file without being aware of it.
