P2P caching

From Wikipedia, the free encyclopedia

Peer-to-peer caching (P2P caching) is a computer network traffic management technology used by Internet Service Providers (ISPs) to accelerate content delivered over peer-to-peer (P2P) networks while reducing related bandwidth costs.

P2P caching is similar in principle to the content caching long used by ISPs to accelerate Web (HTTP) content. P2P caching temporarily stores popular content that is flowing into an ISP’s network. If the content requested by a subscriber is available from a cache, the cache satisfies the request from its temporary storage, eliminating data transfer through expensive transit links and reducing network congestion. This approach may expose ISPs to legal liability, because a significant portion of the files shared over P2P systems infringes copyright.[1]

P2P content responds well to caching because it has high reuse patterns reflecting a Zipf-like distribution.[2][3][4] Different P2P communities have different Zipf parameters,[4] which determine what fraction of files is requested multiple times. For example, one P2P community may request 75% of content multiple times while another may request only 10%.
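The effect of the Zipf parameter on reuse can be illustrated with a small calculation. The sketch below is not taken from the cited studies: it assumes an idealized catalog whose file popularity follows p(k) ∝ 1/k^s, independent requests, and hypothetical function names and parameter values. It estimates the expected share of requested files that are requested more than once, and the expected share of requests that repeat an earlier request (the share an ideal, unlimited cache could serve).

```python
def zipf_probabilities(num_files, exponent):
    """Request probability for each popularity rank under a Zipf-like law."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_files + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def share_of_files_requested_repeatedly(num_files, exponent, num_requests):
    """Expected share of requested files that are requested at least twice."""
    at_least_once = 0.0
    at_least_twice = 0.0
    for p in zipf_probabilities(num_files, exponent):
        p_zero = (1.0 - p) ** num_requests
        p_one = num_requests * p * (1.0 - p) ** (num_requests - 1)
        at_least_once += 1.0 - p_zero
        # Guard against tiny negative values caused by floating-point rounding.
        at_least_twice += max(0.0, 1.0 - p_zero - p_one)
    return at_least_twice / at_least_once


def share_of_repeat_requests(num_files, exponent, num_requests):
    """Expected share of requests for a file that was already requested."""
    expected_distinct = sum(
        1.0 - (1.0 - p) ** num_requests
        for p in zipf_probabilities(num_files, exponent)
    )
    return (num_requests - expected_distinct) / num_requests


if __name__ == "__main__":
    # Illustrative values only: 10,000 files, 10,000 requests, two exponents.
    for s in (0.6, 1.2):
        files = share_of_files_requested_repeatedly(10_000, s, 10_000)
        repeats = share_of_repeat_requests(10_000, s, 10_000)
        print(f"s={s}: files requested repeatedly={files:.2f}, "
              f"repeat requests={repeats:.2f}")
```

Under these assumptions a more skewed community (larger exponent) produces a larger share of repeat requests, which is the traffic a cache can absorb.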

Some P2P caching devices can also accelerate HTTP video streaming traffic from YouTube, Facebook, RapidShare, MegaUpload, Google, AOL Video, MySpace and other web video-sharing sites.[5]

How P2P caching works

P2P caching involves creating a cache or temporary storage space for P2P data, using specialized communications hardware, disk storage and associated software. This cache is placed in the ISP’s network, either co-located with the Internet transit links or placed at key aggregation points or at each cable head-end.

Once a P2P cache is established, the network will transparently redirect P2P traffic to the cache, which either serves the file directly or passes the request on to a remote P2P user and simultaneously caches that data for the next user. How beneficial the caching is depends on how similar the content interests of the ISP's customers are. Because the number of content items shared in P2P systems is relatively small (compared to the Web) and users share semantic, geographic, and organizational interests,[4] the sharing ratio in P2P caching can be significantly higher than in HTTP/Web caching[citation needed].
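The serve-or-forward behaviour described above can be sketched as follows. This is a minimal illustration, not any vendor's implementation: the class name `P2PCache`, the content identifier, and the `fetch_from_remote_peer` callback are hypothetical, storage is an unbounded in-memory dictionary with no eviction, and the data is stored after the fetch completes rather than simultaneously with it.

```python
from typing import Callable, Dict


class P2PCache:
    """Toy cache keyed by a content identifier (for example, a piece hash)."""

    def __init__(self, fetch_from_remote_peer: Callable[[str], bytes]) -> None:
        self._store: Dict[str, bytes] = {}
        self._fetch = fetch_from_remote_peer  # hypothetical upstream fetch
        self.hits = 0
        self.misses = 0

    def handle_request(self, content_id: str) -> bytes:
        data = self._store.get(content_id)
        if data is not None:
            # Cache hit: serve locally, no transit link is used.
            self.hits += 1
            return data
        # Cache miss: pass the request on to a remote peer and keep a copy
        # so the next subscriber asking for the same content is served locally.
        self.misses += 1
        data = self._fetch(content_id)
        self._store[content_id] = data
        return data


if __name__ == "__main__":
    def fake_remote_peer(content_id: str) -> bytes:
        # Stand-in for a transfer over a transit link.
        return ("payload-for-" + content_id).encode()

    cache = P2PCache(fake_remote_peer)
    for requested in ["piece-a", "piece-b", "piece-a", "piece-a"]:
        cache.handle_request(requested)
    print(f"hits={cache.hits} misses={cache.misses}")
```

The more the subscribers' interests overlap, the more often `handle_request` takes the hit branch, which is why the benefit depends on how similar those interests are.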

P2P caching typically works with a network traffic-mitigation technology called Deep Packet Inspection (DPI). DPI technology is used by service providers to understand what traffic is running across their networks and to separate it and treat it for the most efficient delivery. DPI products identify and pass P2P packets to the P2P caching system so it can cache the traffic and accelerate it.
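As a rough illustration of the identification step, the sketch below classifies a packet payload by checking for the plaintext BitTorrent peer handshake, which begins with the byte 19 followed by the string "BitTorrent protocol". The function names and the two routing labels are hypothetical, and real DPI products track whole flows, recognize many protocols, and cannot match encrypted traffic with a simple signature like this one.

```python
# Start of a plaintext BitTorrent peer handshake: length byte 19, then the
# protocol string. The rest of the handshake (reserved bytes, info hash,
# peer ID) is not inspected in this sketch.
BITTORRENT_HANDSHAKE_PREFIX = bytes([19]) + b"BitTorrent protocol"


def looks_like_bittorrent_handshake(payload: bytes) -> bool:
    """Return True if the payload starts with the BitTorrent handshake."""
    return payload.startswith(BITTORRENT_HANDSHAKE_PREFIX)


def route_packet(payload: bytes) -> str:
    """Decide where a packet goes (labels are hypothetical)."""
    if looks_like_bittorrent_handshake(payload):
        return "p2p_cache"
    return "normal_path"


if __name__ == "__main__":
    handshake = BITTORRENT_HANDSHAKE_PREFIX + bytes(48)  # zero-filled remainder
    print(route_packet(handshake))
    print(route_packet(b"GET / HTTP/1.1\r\n"))
```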

PeerApp Ltd. holds the first patent[6] for P2P caching technology, which was filed in 2000.

The P2P bandwidth problem

In 2008, peer-to-peer traffic was estimated to account for 50% of all Internet traffic, and was expected to quadruple between 2008 and 2013, reaching 3.3 exabytes per month, or the equivalent of 500 million DVDs each month.[7] However, this trend did not continue: by 2016 global P2P traffic had begun to decline, showing a 6% decrease between 2016 and 2021.[8] These statistics may be explained by the growing popularity of video-on-demand services, which have so far used a centralized architecture for data distribution.
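The figures in the 2008 estimate can be unpacked with a short calculation. It assumes steady compound growth over the five years and decimal units (1 exabyte = 10^18 bytes); neither assumption is stated in the source.

```python
# Quadrupling between 2008 and 2013, assuming steady compound growth.
growth_factor = 4.0
years = 2013 - 2008
annual_growth_rate = growth_factor ** (1.0 / years) - 1.0

# 3.3 exabytes per month spread over 500 million DVDs (decimal units assumed).
monthly_bytes = 3.3e18
dvd_count = 500e6
gigabytes_per_dvd = monthly_bytes / dvd_count / 1e9

print(f"Implied annual growth: {annual_growth_rate:.1%}")
print(f"Implied data per DVD: {gigabytes_per_dvd:.1f} GB")
```

Under these assumptions the forecast corresponds to growth of roughly 32% per year and about 6.6 GB per DVD.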

Increasing P2P traffic has created problems for ISPs. Networks can become saturated with P2P traffic, creating congestion for other types of Internet use. The cost of P2P traffic is disproportionate to the amount of revenue ISPs make from these customers because of the flat-rate packages of bandwidth commonly sold. To prevent P2P traffic from degrading service for all subscribers, ISPs typically face three choices:

  • Invest in additional bandwidth and equipment. Unfortunately, increasing bandwidth often does not solve the problem, because P2P applications inherently tend to consume as much bandwidth as is available.
  • Implement stricter byte caps, policies, or P2P traffic-shaping, limiting the speed of P2P traffic. The difficulty is that P2P packets are becoming increasingly hard to identify, especially with the introduction of encryption (such as BitTorrent protocol encryption). Traffic shaping can also generate negative publicity and customer reactions.
  • Implement a form of P2P caching.

Caching relieves the bandwidth demand on critical Internet links and improves the experience for all users: P2P users, whose file sharing is accelerated by the cache, and non-P2P users, who get better performance from networks that are no longer congested by P2P traffic.

The initial adopters of P2P caching have been ISPs in Asia, the Pacific Rim, Latin America, the Caribbean and the Middle East, whose subscribers are heavy users of P2P networks and where providing the additional bandwidth to handle P2P data is very costly due to the expense of international transit links.

P2P caching is expected to become an increasingly essential technology for ISPs and MSOs (multiple system operators) worldwide, particularly with the growing popularity of P2P content among broadband subscribers and the adoption of P2P as a content-distribution strategy by mainstream content providers such as the BBC.

P2P caching implementations

  • PeerApp UltraBand Media Caching Software [1]
  • Corelli[9] is a community-based P2P caching system that operates in a decentralized way across multiple peers. This allows a caching service to be realised in environments that do not possess fixed caching infrastructure, e.g. a wireless ad hoc network.
  • Community Caching is a P2P community-interest-aware, distributed caching solution for structured (DHT-based) P2P systems. It alleviates the overhead caused by isolating P2P communities and the loss of content popularity caused by aggregating content from multiple communities.[4]


  1. ^ Jacob, Assaf M.; Zoe Argento (1 Sep 2010). "To Cache or Not to Cache – That is the Question; P2P 'System Caching' – The Copyright Dilemma". Whittier Law Review. 31: 421-. SSRN 1670289.
  2. ^ Sripanidkulchai, K. "The popularity of Gnutella queries and its implications on scalability". Retrieved 6 January 2012.
  3. ^ Klemm, A.; C. Lindemann; M. K. Vernon; O. P. Waldhorst (2004). Characterizing the query behavior in peer-to-peer file sharing systems (PDF). 4th ACM SIGCOMM Conf. on Internet Measurement.
  4. ^ a b c d Bandara, H. M. N. Dilum; A. P. Jayasumana (June 2011). Exploiting communities for enhancing lookup performance in structured P2P systems. IEEE Int. Conf. on Communications (ICC '11). doi:10.1109/icc.2011.5962882.
  5. ^ "Archived copy". Archived from the original on 2010-06-09. Retrieved 2010-05-23.
  6. ^ U.S. Patent No. 7,203,741 B2
  7. ^ Cisco. "Approaching the Zettabyte Era". Cisco. Retrieved 6 January 2012.
  8. ^ Cisco. "Cisco Visual Networking Index: Forecast and Methodology, 2016–2021". Cisco. Retrieved 17 August 2018.
  9. ^ Gareth Tyson, Andreas Mauthe, Sebastian Kaune, Mu Mu and Thomas Plagemann. "Corelli: A Peer-to-Peer Dynamic Replication Service for Supporting Latency-Dependent Content in Community Networks" (PDF). Archived from the original (PDF) on 2015-06-18. Retrieved 2012-04-26.