PAST System Review

Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility

PAST

Quick Summary:

Purpose of PAST: “PAST is intended as an archival storage and content distribution utility and not as a general-purpose filesystem.”

Security in PAST: “Each PAST node and each user of the system hold a smartcard (read-only clients do not need a card).”

1. Storage Management in PAST

Two key characteristics of the PAST storage management system are (1) graceful degradation as the system approaches its maximum storage utilization and (2) replicas of each file maintained by the k nodes whose nodeIDs are closest to the fileID. To guarantee the first characteristic while avoiding the load imbalance that the second can cause, the authors introduced replica diversion.

Replica Diversion / File Diversion:

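The purpose of replica diversion is to balance the remaining free storage space among the nodes in a leaf set (a group of nodes with adjacent nodeIDs). Each file has a fileID and each node has a nodeID; a file is stored on the nodes whose nodeIDs are closest to its fileID. When a node A among the k closest is unable to store its replica, it diverts the replica to a node B that is not among the k closest nodes to the fileID. The replica is then stored on node B, and node A issues a store receipt and keeps a pointer to node B. If B fails, a replacement replica is made. The replica stored on B is meant to survive a failure of node A. File diversion is the fallback when replica diversion is not enough: if the nodes around a fileID cannot accommodate the file, the client retries the insertion under a different fileID, which places the file in a different part of the nodeID space.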
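
The placement and diversion steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's algorithm: the ID space is a toy 16-bit ring, the `nodes` list stands in for a leaf set, a node refuses a replica only when it lacks the free space (PAST's actual acceptance policy is not modeled), and the diversion target is simply the eligible node with the most free space. The names `Node`, `k_closest` and `insert_file` are hypothetical.

```python
from dataclasses import dataclass, field

ID_SPACE = 2 ** 16  # toy circular ID space (assumption; real IDs are much larger)


def ring_distance(a: int, b: int) -> int:
    """Distance between two IDs on the circular ID space."""
    d = abs(a - b) % ID_SPACE
    return min(d, ID_SPACE - d)


@dataclass
class Node:
    node_id: int
    free: int                                      # free storage, arbitrary units
    replicas: dict = field(default_factory=dict)   # file_id -> size
    pointers: dict = field(default_factory=dict)   # file_id -> node_id holding a diverted replica

    def store(self, file_id: int, size: int) -> bool:
        """Store a replica if it fits (simplified acceptance rule)."""
        if size > self.free:
            return False
        self.free -= size
        self.replicas[file_id] = size
        return True


def k_closest(nodes, file_id, k):
    """The k nodes whose nodeIDs are closest to file_id."""
    return sorted(nodes, key=lambda n: ring_distance(n.node_id, file_id))[:k]


def insert_file(nodes, file_id, size, k):
    """Try to place k replicas; return (responsible, holder) receipts or None."""
    primaries = k_closest(nodes, file_id, k)
    primary_ids = {n.node_id for n in primaries}
    holders, receipts = [], []
    for a in primaries:
        if a.store(file_id, size):
            holders.append(a)
            receipts.append((a.node_id, a.node_id))
            continue
        # Replica diversion: A cannot hold the replica, so pick a node B that
        # is outside the k closest and does not already hold this file.
        candidates = [n for n in nodes
                      if n.node_id not in primary_ids
                      and file_id not in n.replicas
                      and n.free >= size]
        if not candidates:
            # Undo partial work; a caller would fall back to file diversion.
            for n in holders:
                n.free += n.replicas.pop(file_id)
            for p in primaries:
                p.pointers.pop(file_id, None)
            return None
        b = max(candidates, key=lambda n: n.free)
        b.store(file_id, size)
        holders.append(b)
        a.pointers[file_id] = b.node_id   # A keeps a pointer to B
        receipts.append((a.node_id, b.node_id))
    return receipts
```

Each receipt pairs the node responsible for a replica with the node that actually holds it, so a diverted replica shows up as a pair of two different nodeIDs.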

When a new node joins, it receives a nodeID and takes its place among the nodes with adjacent nodeIDs (recall the Pastry protocols). At that point the storage invariant requires the new node to acquire replicas of the files for which it is now among the k closest nodes, taking over from nodes that have failed or are no longer among the k closest, so that the k copies stay on the nodes whose nodeIDs are closest to the fileID. To ensure balanced, decentralized management of storage, PAST also maintains storage quotas.
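
To make the join case concrete, the sketch below works out which files a newly joined node becomes responsible for. It is a simplified model, not PAST's protocol: it assumes a toy 16-bit ring, global knowledge of all nodeIDs and fileIDs, and that the node displaced from a file's k-closest set is the one the new node takes over from. The function names are hypothetical.

```python
ID_SPACE = 2 ** 16  # toy circular ID space (assumption)


def ring_distance(a: int, b: int) -> int:
    """Distance between two IDs on the circular ID space."""
    d = abs(a - b) % ID_SPACE
    return min(d, ID_SPACE - d)


def k_closest_ids(node_ids, file_id, k):
    """The k nodeIDs closest to file_id; ties broken by the smaller nodeID."""
    return sorted(node_ids, key=lambda n: (ring_distance(n, file_id), n))[:k]


def files_to_acquire(node_ids, new_id, file_ids, k):
    """Map each file the new node must now replicate to the nodeID it displaces.

    The value is None when no node is displaced (fewer than k nodes existed).
    """
    moves = {}
    for f in file_ids:
        before = k_closest_ids(node_ids, f, k)
        after = k_closest_ids(list(node_ids) + [new_id], f, k)
        if new_id in after:
            displaced = set(before) - set(after)
            moves[f] = displaced.pop() if displaced else None
    return moves


# Example: four existing nodes, k = 2, and a new node with nodeID 320.
moves = files_to_acquire([100, 200, 300, 400], 320, [150, 310, 390], k=2)
```

Files whose k-closest set does not change (fileID 150 in the example) are left alone, which is why a join only moves a small share of the stored data.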

2. Caching in PAST

Caching at nodes is what keeps client access latencies (fetch distance) low, and it is designed to degrade gracefully as the system fills up. For highly popular files, it is advantageous to keep more than k replicas, ideally a copy near each cluster of interested clients. These cached copies are stored in the unused portion of a node's storage and can be discarded at any time. As the storage utilization of the system increases, less space is left for caching, so cache performance degrades gracefully.

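The cache replacement policy used in PAST is GreedyDual-Size (GD-S). GD-S gives each cached file d a weight H_d = c(d)/s(d), where c(d) is a cost assigned to the file and s(d) is its size. When space is needed, it evicts the file with the smallest weight and subtracts that weight from the weights of the remaining files; a cache hit resets the file's weight. The effect is that recently used and small files tend to stay cached, while rarely used or large files are replaced first.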
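
A minimal sketch of a GreedyDual-Size cache may help here. In GD-S each cached file d carries a weight H_d = c(d)/s(d); the file with the smallest weight is evicted, that weight is subtracted from the remaining files, and a hit resets the file's weight. The class below is an illustration only: `GDSCache` and its method names are hypothetical, the cost defaults to 1 (my reading of the setting used for PAST, which favors hit rate), and PAST's rule for deciding which files are worth caching at all is not modeled.

```python
class GDSCache:
    """Toy GreedyDual-Size cache keyed by file ID."""

    def __init__(self, capacity: int):
        self.capacity = capacity   # total cache space, arbitrary units
        self.used = 0
        self.weight = {}           # file_id -> current weight H_d
        self.size = {}             # file_id -> size s(d)

    def access(self, file_id, size: int, cost: float = 1.0) -> bool:
        """Record an access to file_id; return True on a cache hit."""
        if file_id in self.weight:
            # Hit: reset the weight to c(d)/s(d).
            self.weight[file_id] = cost / self.size[file_id]
            return True
        if size > self.capacity:
            return False           # file can never fit; do not cache it
        while self.used + size > self.capacity:
            # Evict the file with the smallest weight ...
            victim = min(self.weight, key=self.weight.get)
            h_min = self.weight.pop(victim)
            self.used -= self.size.pop(victim)
            # ... and age every remaining file by the victim's weight.
            for f in self.weight:
                self.weight[f] -= h_min
        self.weight[file_id] = cost / size
        self.size[file_id] = size
        self.used += size
        return False


cache = GDSCache(capacity=10)
cache.access("a", 4)   # miss, cached with weight 1/4
cache.access("b", 5)   # miss, cached with weight 1/5
cache.access("a", 4)   # hit, weight reset to 1/4
cache.access("c", 3)   # miss; "b" has the smallest weight and is evicted
```

Subtracting the victim's weight on every eviction is the simplest way to express the aging step; implementations usually keep a running offset instead, so that an eviction does not have to touch every entry.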

End Note:

This is a roughly 10-year-old paper that focuses on the storage management and caching of PAST. As we’ve already seen, PAST is a very scalable storage system based on Pastry. This review covered key points that are not covered in the P2P storage survey.

Advantages:

  • Read – write
  • Overall reliable storage system
  • Highly scalable

Disadvantages:

Users who write to the system must hold a physical smartcard, which is a big liability.

Volunteers:

There’s no specific reason why it should not be used for Volunteers. Therefore, PAST is indifferent towards the issue of Volunteer vs. P2P.

decentralized_storage_systems/past.txt · Last modified: 2012/04/23 00:59 by julia
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 3.0 Unported