Mar 08 2013
 

Dynamo is a highly available key-value storage system that sacrifices consistency under certain failure scenarios. Conflict resolution is placed mostly on the application side, and versioning is used heavily to support it. The main contribution of the system is a highly decentralized, loosely coupled, service-oriented architecture with hundreds of services, built by combining several well-known techniques.

A combination of different techniques is used to reach the defined level of availability and scalability: partitioning and replication are based on consistent hashing, and consistency is facilitated by object versioning. Consistency among replicas during updates is maintained by quorum-like techniques, while failure detection relies on gossip-based protocols.

Architecture

Simple read and write operations are uniquely identified by a key, and no relational schema is supported. Dynamo does not provide any isolation guarantees and permits only single-key updates. It supports an always-writable design, as its applications require it; this way, conflict resolution is pushed to reads. Incremental scalability, symmetry, and heterogeneity are key properties of the system.

Only two operations are exposed: get and put. Get returns the object together with its version, while put takes this version as one of its parameters.
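To make the two-operation interface concrete, here is a minimal sketch; the class name `Store` and the integer version context are my own illustrative simplifications, not Dynamo's real API:

```python
class Store:
    """Toy single-node sketch of a get/put key-value interface."""

    def __init__(self):
        self._data = {}  # key -> (value, version)

    def get(self, key):
        # Returns the object together with its version context.
        return self._data.get(key, (None, 0))

    def put(self, key, context, value):
        # The caller passes back the version obtained from a prior get,
        # so the store can relate this write to what the client saw.
        self._data[key] = (value, context + 1)

s = Store()
value, ctx = s.get("cart")
s.put("cart", ctx, ["book"])
```

The real system would use the passed-in context to detect concurrent writes rather than simply incrementing a counter.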

Partitioning of the data relies on consistent hashing, so load is distributed across hosts. Moreover, each node is mapped to multiple points in the ring (virtual nodes). Replication is done on multiple hosts across the ring, ensuring that each replica lands on a distinct host, while the number of replicas is configurable. A preference list is used to store the replica information for each key.
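A rough sketch of this scheme, with each host mapped to several virtual points and the preference list built by walking clockwise until enough distinct hosts are found (the hash function, vnode count, and class names are assumptions for illustration):

```python
import bisect
import hashlib

def _hash(key):
    # Any uniform hash works; MD5 is used here purely for illustration.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    """Consistent-hash ring; each node owns several virtual points."""

    def __init__(self, nodes, vnodes=8):
        self._ring = sorted((_hash(f"{n}#{i}"), n)
                            for n in nodes for i in range(vnodes))
        self._points = [h for h, _ in self._ring]

    def preference_list(self, key, n_replicas=3):
        # Walk clockwise from the key's position, collecting distinct hosts
        # so that replicas never land on the same physical node twice.
        idx = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        seen, prefs = set(), []
        while len(prefs) < n_replicas:
            node = self._ring[idx % len(self._ring)][1]
            if node not in seen:
                seen.add(node)
                prefs.append(node)
            idx += 1
        return prefs

ring = Ring(["A", "B", "C", "D"])
print(ring.preference_list("user:42"))  # three distinct hosts
```

Virtual nodes also make rebalancing smoother: adding or removing a host moves only a small, evenly spread fraction of the key space.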

Concurrent updates are resolved by versioning, so updates can be propagated to all replicas. To reconcile updates made at different sites, vector clocks are adopted; this way, causality between different versions can be tracked. So each time an object is to be updated, the version number obtained by an earlier read must be specified.
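The causality check boils down to comparing per-node counters. A minimal sketch (the helper names are mine) of how vector clocks detect whether one version supersedes another or the two conflict:

```python
def descends(vc_a, vc_b):
    """True if vc_a causally descends from vc_b (vc_a supersedes it)."""
    return all(vc_a.get(node, 0) >= count for node, count in vc_b.items())

def merge(vc_a, vc_b):
    """Element-wise max: the clock of a reconciled version."""
    return {node: max(vc_a.get(node, 0), vc_b.get(node, 0))
            for node in set(vc_a) | set(vc_b)}

v1 = {"A": 1}            # object written through node A
v2 = {"A": 1, "B": 1}    # later update through node B, based on v1
fork = {"A": 2}          # concurrent update through node A, also based on v1

assert descends(v2, v1)                                    # v2 replaces v1
assert not descends(v2, fork) and not descends(fork, v2)   # true conflict
assert merge(v2, fork) == {"A": 2, "B": 1}                 # after reconciliation
```

When neither clock descends from the other, both versions are returned to the application on the next get, and the reconciled write carries the merged clock.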

Consistency among replicas is maintained with a quorum-like mechanism, where W and R, the write and read quorum sizes respectively, are configurable. On an update (put), the coordinator generates a vector clock and writes the new version of the data. Similarly, on a get, the coordinator requests all existing versions for the key. Most of the time, however, a "sloppy quorum" is used, where all read and write operations are performed on the first N healthy nodes in the preference list.
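The interplay of the three parameters can be stated in one line: a read quorum and a write quorum are guaranteed to overlap on at least one replica exactly when R + W > N. A small sketch of that rule and the trade-off it encodes:

```python
def quorums_overlap(n, r, w):
    """R + W > N guarantees every read quorum intersects every write quorum,
    so a read sees at least one up-to-date replica."""
    return r + w > n

# A commonly cited configuration (values are illustrative):
N, R, W = 3, 2, 2
assert quorums_overlap(N, R, W)

# Lowering W favors write availability and latency, but reads may
# then miss the latest write entirely:
assert not quorums_overlap(3, 1, 1)
```

Since W and R are user-configurable, applications can slide along this spectrum, trading consistency for availability per workload.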

This mix of techniques has proved to supply a highly available and scalable data store, while consistency can be sacrificed in some failure scenarios. Moreover, all the parameters, such as the read and write quorum sizes and the number of replicas, can be configured by the user.

Pros

  • Cool
  • Inspiring
  • Scalable
  • Available
  • Writable

Cons

  • Sacrifices consistency
  • Hashing for load balancing and replication
  • Nothing more than get and put
  • Quite slow with all its conflict resolution and failure detection/handling
  • Targets write-heavy applications
  • No security guarantees
  • No isolation guarantees
  • Only single key update
Jan 23 2013
 

NASDAQ OMX (N) is the world’s largest stock exchange technology provider. Around ten percent of all the world’s securities transactions happen there. More importantly, N’s multi-asset trading and clearing system is the world’s fastest and most scalable. This enables the trading of any asset in any place in the world. The company’s core system for trading and clearing is called Genium INET; it is very complex, optimized, and customized for different customers. N owns 70 stock exchanges in 50 countries and serves multiple asset classes, including equities, derivatives, debt, commodities, structured products, and ETFs.

  • N unites 3,900 companies in 39 countries with $4.5 trillion in total market value.
  • N is the world’s largest IT industry exchange.
  • N is the world’s largest biotech industry exchange.
  • N is the world’s largest provider of technology for the exchange industry.
  • Over 700 companies use more than 1,500 different corporate products and services.
  • N data is accessed by more than two million users in 83 countries.
  • The average transaction latency on N in 2010 was less than 100 microseconds.
  • The N implementation of INET is able to handle more than one million transactions per second.

The main issues faced are maintaining a high level of reliability, low latency, and security in the system. This niche is unique and normally not covered by conventional IT vendors and consultancies. Among the existing N systems are the following: trading systems, clearing systems, market data systems, surveillance systems, and advisory systems.

Obviously, all the data in the system must be stored reliably, with extremely fast responses. Moreover, requests and queries can be performed over the data, in addition to processing the incoming data on the fly and storing the results.

A variety of challenges appears in designing and implementing such a system, among them load balancing of the incoming requests to store the information, fault tolerance to maintain 24/7 availability, and the ability to hot-update without losing availability.