The OceanStore Project Pond: the OceanStore Prototype

OceanStore

OceanStore is a two-tiered, fully read/write distributed storage system for any type of file or application. The inner tier consists of well-connected servers that hold the primary replicas; the outer tier consists of loosely connected client computers used for storage and archives. The design assumes that all of the infrastructure is untrusted.

Characteristics:

OceanStore uses erasure coding to make archival copies of files durable. Erasure coding is a mathematical technique in which a file is encoded into n fragments, and each fragment is stored on a different storage unit. When a file is requested, any m fragments (m < n) collected from peers are enough to reconstruct it. This gives high fault tolerance at a lower storage cost than keeping full replicas. Users can also cache files locally.
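
The "any m of n fragments" property can be illustrated with a toy code. The sketch below is not OceanStore's actual coding scheme; it is a minimal polynomial-interpolation code over the prime field GF(257), and the names encode, decode and _interpolate are made up for this example. It needs Python 3.8 or later for the modular inverse via pow.

```python
# Toy (m, n) erasure code over GF(257). Illustrative assumption only:
# this is not the code used by OceanStore or Pond.
P = 257  # smallest prime above 255, so every byte value is a field element


def _interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through points (mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data: bytes, m: int, n: int):
    """Encode data into n fragments (ids 1..n); any m of them recover the data."""
    if not 1 <= m <= n < P:
        raise ValueError("need 1 <= m <= n < 257")
    padded = data + bytes(-len(data) % m)  # zero-pad to a multiple of m
    fragments = {x: [] for x in range(1, n + 1)}
    for start in range(0, len(padded), m):
        group = padded[start:start + m]
        # The m data bytes are the polynomial's values at x = 1..m.
        points = list(zip(range(1, m + 1), group))
        for x in fragments:
            fragments[x].append(_interpolate(points, x))
    return fragments, len(data)


def decode(available: dict, m: int, length: int) -> bytes:
    """Rebuild the original bytes from any m fragments (fragment id -> symbols)."""
    ids = sorted(available)[:m]
    if len(ids) < m:
        raise ValueError("need at least m fragments")
    out = bytearray()
    for g in range(len(available[ids[0]])):
        points = [(x, available[x][g]) for x in ids]
        # Re-evaluate the polynomial at x = 1..m to get the data bytes back.
        out.extend(_interpolate(points, x) for x in range(1, m + 1))
    return bytes(out[:length])
```

For example, encode(b"OceanStore", 3, 6) produces six fragments, and passing any three of them to decode returns the original bytes. Each fragment is roughly one third of the file, so the six fragments together cost about twice the file size while tolerating the loss of any three storage units.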

Each file has a single primary replica, held by the inner-tier servers. An update travels from the client to the primary replica. The primary replica serializes the update, generates a heartbeat that certifies the latest version, and notifies the other replicas through the dissemination tree. At the same time, the new version is erasure-coded and sent to archival storage servers.
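
The update path described above can be sketched as a small simulation. All class and field names here (Heartbeat, Replica, Primary, archive) are hypothetical. The heartbeat is modelled as an unsigned version number plus a SHA-256 digest, whereas a real deployment would sign it, and the archive is a plain list standing in for erasure-coded archival storage.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Heartbeat:
    version: int  # sequence number assigned by the primary
    digest: str   # hash of the content at that version


@dataclass
class Replica:
    """A secondary replica; its children form the dissemination tree below it."""
    name: str
    children: List["Replica"] = field(default_factory=list)
    content: bytes = b""
    heartbeat: Optional[Heartbeat] = None

    def receive(self, content: bytes, heartbeat: Heartbeat) -> None:
        # Accept only versions newer than the one already held.
        if self.heartbeat is None or heartbeat.version > self.heartbeat.version:
            self.content, self.heartbeat = content, heartbeat
            for child in self.children:
                child.receive(content, heartbeat)


class Primary:
    """Serializes updates, issues heartbeats, and notifies the tree roots."""

    def __init__(self, tree_roots: List[Replica]):
        self.version = 0
        self.content = b""
        self.tree_roots = tree_roots
        self.archive = []  # stand-in for the archival storage servers

    def update(self, new_content: bytes) -> Heartbeat:
        # Updates are applied one at a time, so versions are totally ordered.
        self.version += 1
        self.content = new_content
        hb = Heartbeat(self.version, hashlib.sha256(new_content).hexdigest())
        for root in self.tree_roots:
            root.receive(new_content, hb)
        # A real system would erasure-code the new version before archiving it.
        self.archive.append((hb.version, new_content))
        return hb
```

Building a tree such as Primary([Replica("a", [Replica("b")])]) and calling update twice leaves every replica in the tree holding the second version and its heartbeat.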

Another characteristic is the use of a Byzantine fault-tolerant agreement algorithm, motivated by the assumption that no part of the infrastructure can be trusted, including the inner-tier servers. In essence, Byzantine agreement holds that a decision is legitimate as long as more than 2/3 of the participating members agree on it; in other words, fewer than 1/3 of the peers may be faulty. The downside is cost: the protocol becomes increasingly complex and expensive as the number of participants grows. Hence it is used only in the inner tier. Files in the outer tier are updated using aggressive replication.
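
The 2/3 rule can be made concrete with a little arithmetic. The sketch below assumes the standard bound for Byzantine agreement, n >= 3f + 1 members to tolerate f faulty ones. It only counts votes and does not implement an agreement protocol, and the function names are invented for this example.

```python
from collections import Counter


def max_faulty(n: int) -> int:
    """Largest number of faulty members f that n members can tolerate (n >= 3f + 1)."""
    if n < 1:
        raise ValueError("need at least one member")
    return (n - 1) // 3


def quorum(n: int) -> int:
    """Smallest vote count such that any two quorums share at least one correct member."""
    f = max_faulty(n)
    return (n + f + 2) // 2  # integer form of ceil((n + f + 1) / 2)


def decided_value(votes: dict, n: int):
    """Return the value backed by a quorum of the n members, or None if there is none."""
    if not votes:
        return None
    value, count = Counter(votes.values()).most_common(1)[0]
    return value if count >= quorum(n) else None
```

With four inner-tier servers, max_faulty(4) is 1 and quorum(4) is 3, so three matching votes decide a value even if the fourth server misbehaves. Because every member exchanges messages with every other member in protocols of this kind, adding members raises the cost quickly, which is consistent with keeping the agreement group small.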

OceanStore uses threshold cryptography and public-key cryptography so that the inner ring can push updates to peers without each replica having to be authenticated individually. The same kind of technique can also let hosts store secure files without knowing their contents (as in Publius).

OceanStore uses Tapestry for locality. Tapestry is a scalable overlay network built on top of TCP/IP; a different overlay network could have been used in its place.

Advantages:

  • Read/write
  • High fault tolerance
  • Security: the design assumes that no node can be trusted.
  • Supports any type of files and applications

Disadvantages:

Writes significantly reduce system performance.

Volunteer Systems:

Because of its high fault tolerance, security, and support for any type of file, OceanStore is well suited to volunteer computing systems whose nodes do not need much write access.

decentralized_storage_systems/oceanstore.txt · Last modified: 2012/04/23 00:55 by julia
 