Pastis: a Highly-Scalable Multi-User Peer-to-Peer File System

Pastis

Quick Summary:

The authors propose Pastis, a highly scalable P2P file system that is simpler and faster than Ivy and Oceanstore. Pastis is only 1.4 to 1.8 times slower than NFS, whereas Ivy and Oceanstore are two to three times slower.

Ivy: stores all file system data in a set of logs kept in the DHash distributed hash table; each user has their own logs. This ensures security but limits scalability.

Oceanstore: maintains a point of centralization called the primary tier, which updates replicas using a Byzantine-fault tolerant (BFT) algorithm. However, BFT is very expensive, makes the system too complex, and is not suitable for cooperative users.

Pastis: uses Pastry to route messages in the Past network. Pastry is a P2P key-based routing substrate; Past is a highly scalable P2P storage service built on it that provides a DHT abstraction. Files and directories are organized much as in Unix: each file’s metadata is stored in an inode object, which contains a list of pointers to the blocks in which the file’s contents are stored. Benefits of Pastis: security, scalability, self-organization, and locality. A Java prototype of Pastis is implemented on top of FreePastry, an open-source implementation of Pastry and Past.
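
To make the inode-and-pointers layout concrete, here is a minimal sketch in Python. It is not the Pastis code: ToyDHT, Inode, write_file, read_file and the tiny BLOCK_SIZE are hypothetical names and values chosen for illustration, the DHT is a plain in-memory dictionary standing in for Past, and keying each block by the SHA-1 of its content is an assumption.

```python
import hashlib
from dataclasses import dataclass, field

BLOCK_SIZE = 8  # hypothetical, tiny on purpose so the split is visible


class ToyDHT:
    """In-memory stand-in for a DHT such as Past: a key -> value map."""

    def __init__(self):
        self._store = {}

    def put(self, key, value):
        self._store[key] = value

    def get(self, key):
        return self._store[key]


@dataclass
class Inode:
    """File metadata plus the ordered keys of the blocks holding its contents."""
    name: str
    size: int = 0
    block_keys: list = field(default_factory=list)


def write_file(dht, name, data):
    """Split data into fixed-size blocks and store each under its content hash."""
    inode = Inode(name=name, size=len(data))
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        key = hashlib.sha1(block).hexdigest()  # assumed: key derived from content
        dht.put(key, block)
        inode.block_keys.append(key)
    return inode


def read_file(dht, inode):
    """Follow the inode's pointers, checking each block against its key."""
    parts = []
    for key in inode.block_keys:
        block = dht.get(key)
        if hashlib.sha1(block).hexdigest() != key:
            raise ValueError("block does not match its content hash")
        parts.append(block)
    return b"".join(parts)
```

Because a block's key depends on its content, a reader can detect a corrupted block without trusting the node that served it; only the inode itself needs to be modifiable.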

Design Aspects of Pastis:

Updates/Conflicts: Each inode is stored in a User Certificate Block (UCB), and file/directory contents are stored in fixed-size DHT blocks called Content Hash Blocks (CHB). When a user modifies a file or directory, the inode is updated: its version number is incremented and it is stamped with the unique id of the user issuing the update. A conflict can arise if two users modify the same inode concurrently, in which case the version numbers and user ids are compared with those of the older replicas in the Past network.
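
The version-number and user-id comparison can be sketched as follows. This is an illustration only: InodeVersion, stamp and pick_winner are hypothetical names, and the exact rule (highest version number wins, ties broken by comparing user ids) is an assumption rather than something these notes spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InodeVersion:
    version: int   # incremented on every update
    user_id: str   # id of the user who issued the update
    payload: str   # stand-in for the rest of the inode


def stamp(inode, user_id, payload):
    """Produce the next version of an inode, stamped with the writer's id."""
    return InodeVersion(inode.version + 1, user_id, payload)


def pick_winner(replicas):
    """Assumed rule: highest version wins; equal versions are ordered by user id."""
    return max(replicas, key=lambda r: (r.version, r.user_id))


base = InodeVersion(3, "alice", "v3 contents")
by_alice = stamp(base, "alice", "alice's edit")
by_bob = stamp(base, "bob", "bob's edit")  # concurrent update, same version number
winner = pick_winner([base, by_alice, by_bob])
```

The point of the tie-break is that every node comparing the same set of replicas reaches the same decision without any coordination.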

Consistency: maintained by close-to-open (CTO) and read-your-writes (RYW) mechanisms. CTO is meant for applications whose files are shared frequently, and RYW for applications whose files are seldom shared. CTO works by retrieving the latest inode from the network when the file is opened and keeping a cached copy and a local CHB buffer. When the file is closed, all cached data are flushed to the network and removed from the local buffer. RYW, a more relaxed model, works by guaranteeing that the version you read is not older than the version you previously wrote. Its advantage: fewer accesses to the DHT than CTO.
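
The difference in DHT accesses between the two models can be sketched roughly as below. This is a simplification under stated assumptions: replicas are modeled as bare version numbers, open_cto and open_ryw are hypothetical helpers, and letting RYW stop at the first acceptable replica is an assumed strategy, not necessarily what the prototype does.

```python
def open_cto(replicas):
    """Close-to-open: inspect every replica and return the newest version.

    Returns (version, number_of_replica_fetches).
    """
    fetched = list(replicas)  # one fetch per live replica
    return max(fetched), len(fetched)


def open_ryw(replicas, last_written):
    """Read-your-writes: accept the first replica that is not older than
    the version this client last wrote.

    Returns (version, number_of_replica_fetches).
    """
    fetches = 0
    for version in replicas:
        fetches += 1
        if version >= last_written:
            return version, fetches
    raise LookupError("no replica is as recent as the client's own write")


replicas = [5, 7, 6]            # versions held by three replicas of one inode
print(open_cto(replicas))       # newest version, after fetching all replicas
print(open_ryw(replicas, 5))    # first acceptable version, fewer fetches
```

In this model RYW may return a version that is not the newest in the network, but never one older than the client's own last write.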

Security: Pastis provides write access control and data integrity, but does not provide read access control. For write access control, the owner of a file issues write certificates to selected trustees; when modifying content, a trustee must properly sign the new version of the inode and present their write certificate. This model assumes that all users allowed to write to a given file trust one another.
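
The write-certificate check can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the certificate layout, the helper names (make_key, issue_write_cert, accept_update) and above all the signature scheme, which is textbook RSA over well-known Mersenne primes. It is insecure and is used only because it fits in the Python standard library (Python 3.8+ for the modular inverse).

```python
import hashlib

E = 65537  # public exponent shared by all toy keys


def make_key(p, q):
    """Textbook RSA from two distinct primes: returns (n, d). Toy only, NOT secure."""
    n, phi = p * q, (p - 1) * (q - 1)
    return n, pow(E, -1, phi)  # modular inverse, Python 3.8+


def digest(message, n):
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message, key):
    n, d = key
    return pow(digest(message, n), d, n)


def verify(message, signature, n):
    return pow(signature, E, n) == digest(message, n)


def issue_write_cert(file_id, trustee_n, owner_key):
    """The owner signs the pair (file id, trustee public key)."""
    body = f"{file_id}:{trustee_n}".encode()
    return {"file_id": file_id, "trustee_n": trustee_n, "sig": sign(body, owner_key)}


def accept_update(file_id, new_inode, inode_sig, cert, owner_n):
    """Accept only if the owner issued the certificate for this file and
    the certified trustee signed the new inode."""
    body = f"{cert['file_id']}:{cert['trustee_n']}".encode()
    return (cert["file_id"] == file_id
            and verify(body, cert["sig"], owner_n)
            and verify(new_inode, inode_sig, cert["trustee_n"]))


owner = make_key(2**61 - 1, 2**31 - 1)     # (n, d); n is the public part
trustee = make_key(2**89 - 1, 2**61 - 1)
cert = issue_write_cert("file-42", trustee[0], owner)
new_inode = b"inode version 4"
ok = accept_update("file-42", new_inode, sign(new_inode, trustee), cert, owner[0])
```

Note that nothing in this check prevents one legitimate writer from overwriting another's changes, which is why the model has to assume that all writers of a file trust one another.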

FreePastry

The prototype runs on any platform supporting Java VM 5.0; its higher layers are indifferent to whether they run on a simulator or in a real environment. The prototype was tested on a simulated network using the Andrew Benchmark, and the results are as follows:

  • Scalability: Pastis scales as well as Past and Pastry networks
  • Updates/Conflicts: Pastis scales better than Ivy when files are modified concurrently. Pastis: 11.3% for up to 16 concurrent clients. Ivy: 70% for up to 4 clients.
  • Consistency & NFS comparison: Pastis execution time is less than twice that of NFS under the close-to-open consistency model and 40% slower than NFS under the read-your-writes model. The difference arises because, under close-to-open, the live replicas of a given UCB must be retrieved to determine the latest version.

Other Important Points:

  • Multi-writer designs must face a number of issues not found in read-only systems:
    • Maintaining consistency between replicas
    • Enforcing access control
    • Guaranteeing that update requests are authenticated and correctly processed
    • Dealing with conflicting updates
  • One important feature of Pastry is that it takes into account network locality to optimize overlay routes
  • Phases of the Andrew Benchmark used for performance evaluation:
    • Create directories
    • Copy files
    • Read file attributes
    • Read file contents
    • Run a make command
  • Strict consistency models can impair performance, hence a large-scale P2P file system should offer a range of consistency levels.

Advantages:

  • Highly scalable
  • Read-write
  • Up to twice as fast as Ivy or Oceanstore
  • Supports multiple consistency models

Disadvantages:

  • Security is based on trust between parties, which is hard to achieve

Volunteer systems:

  • Due to its security model, Pastis may be more applicable to volunteer systems than to regular P2P systems, given the right incentives for volunteer peers.
decentralized_storage_systems/pastis.txt · Last modified: 2012/04/23 00:59 by julia
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 3.0 Unported