Jan 15, 2013

My third day at my thesis host company – NASDAQ OMX, Stockholm.

Big eyes.

Happy face.

Hands ready to code.

This is the spirit!

Feels like I will have a crazy 5 months of work, an amazing experience, and hands/head ready to handle any problem.

Finally, down to the thing I’m working on, the thesis topic:

GENIUM DataStore: Distributed Data Store and Data Processing System.

The system must face very strict requirements of extremely low latency, high availability, and scalability. When I talk about the low latency requirement, I mean a latency of 100 microseconds per operation and 250,000 transactions per second, which for now sounds quite insane:). Two main paradigms will be supported: storing and processing the data. Storing will be a kind of integration layer, whose API lets information be stored to the DBs already used internally, both relational and key-value. The processing part is similar to a stream processing system and will be integrated with the storage part.
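Just to sanity-check those two numbers against each other (my own back-of-the-envelope, not from the project spec), Little’s law says concurrency = throughput × latency:

```python
# If every operation takes 100 us and we want 250,000 tx/s,
# Little's law tells us how many operations must be in flight at once.
latency_s = 100e-6      # 100 microseconds per operation
throughput = 250_000    # target transactions per second

in_flight = throughput * latency_s
print(in_flight)  # -> 25.0
```

So the targets are at least self-consistent: roughly 25 operations concurrently in flight, which a pipelined or multi-partition design can plausibly sustain.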

Building a system with high requirements on consistency (C) and fault tolerance (FT) is supposed to be a pain in the rear part of the body, but using the internally developed libraries for C and FT support will make my life way easier. However, the performance goals can be a real challenge.

The main goal is not to make a system that has a variety of features and is an “almost stable” version, but one that has fewer features and is a stable release. Another side of the work here is that whatever you are building should be well thought out from the perspective of handling critical data, with great responsibility for any mistakes. Hooorayyy…

Thesis structure. Preliminary thoughts.

I was also already thinking about the structure of the final document, and it is actually quite hard to come up with one at this point. I’m still sure that my vision will change; however, here it is:

  • Abstract (Cap. Obvious :D)
  • Introduction

Structure of the Document

  • Background and Related Work

NASDAQ INET (just some words to give at least a rough picture of the base system for GENIUM DS)
Distributed Data Storage Systems
Distributed Data Stream Processing Systems

  • GENIUM DataStore

NASDAQ OMX Domain (What is NASDAQ, what data is used, volume of the data to be processed)
Architecture (and Reasoning for such architecture)
Fault Resilience
Have no idea what more… but for sure there should be something

  • Implementation Details (I’m sure this part will be necessary, as coding will be my main occupation these days:))

Failure Scenarios

  • Experimental Results

Set Up
Scalability and Performance

  • Discussion

Main findings
Low Latency
Positioning??? (can’t find any more suitable word… will think on it… one day:))
Comparison to other existing systems (if possible)
Future Work

  • Conclusions
  • References

Let’s see what is going to be in 1 month:)

Finally, I will hope not to become a turtle below and behave:)

Julia is not a turtle 🙂

Nov 08, 2012

In the field of distributed data storage it is almost impossible to come up with a universal system that satisfies all needs. That’s why, recently, various distributed storage systems have appeared that address different needs and use different approaches.

DynamoDB uses a key-value interface with replication only within a region. I haven’t checked the latency range myself, but according to its website latency varies within single-digit milliseconds, which is at least 10 times more than I want to reach in the thesis system.

Megastore doesn’t reach great performance because it is based on Bigtable (with high communication cost); however, it is scalable and consistent. Synchronization for wide-area replication is done with Paxos. Since scalability, consistency, and fault tolerance are prioritized, latency is sacrificed and is within 100–400 milliseconds for reads and writes.

Scatter is a DHT-based key-value store that layers transactions on top of consistent replication (it uses a low-level interface). Even though it provides high availability and scales well, latencies for operations are still within milliseconds.

VoltDB is an in-memory DB that supports master-slave replication over a wide area.

Cassandra is a column-based storage system developed and used by Facebook, with reads within milliseconds.

Spanner provides a semi-relational data model, high performance, a high-level interface, general-purpose transactions, and external consistency (using GPS and atomic clocks with a new concept of time leases: TrueTime). Spanner also integrates concurrency control with replication. The main contribution of the paper is that the system solves the problem of wide-area replication and implements globally synchronized timestamps (supporting strong consistency and linearizability for writes, and snapshot isolation for reads). Good: TrueTime. Interleaving data. Atomic schema change. Snapshot reads in the past. Weak: possible clock uncertainty. Paxos groups are not reconfigurable. Read-only transactions use a trivial solution for executing reads (if there are a few Paxos groups, Spanner does not communicate within these groups and simply applies the latest timestamp to the read). Typical reads are near 10 ms and writes average 100 ms.
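The TrueTime idea can be sketched in a toy way (my own illustration, not Spanner’s actual API; `EPSILON` and all names here are made up): clocks return an uncertainty *interval*, and a write’s commit is acknowledged only after its timestamp has definitely passed on every clock (“commit wait”), which is what gives external consistency.

```python
import time
from dataclasses import dataclass

EPSILON = 0.007  # assumed clock-uncertainty bound in seconds (~7 ms order of magnitude)

@dataclass
class TTInterval:
    earliest: float
    latest: float

def tt_now() -> TTInterval:
    """TrueTime-style now(): an interval guaranteed to contain real time."""
    t = time.time()
    return TTInterval(t - EPSILON, t + EPSILON)

def tt_after(t: float) -> bool:
    """True once t is definitely in the past, despite clock uncertainty."""
    return tt_now().earliest > t

def commit_wait(commit_ts: float) -> None:
    """Block until commit_ts has certainly passed, so timestamp order
    matches real-time order (external consistency)."""
    while not tt_after(commit_ts):
        time.sleep(EPSILON / 10)

# A write takes a timestamp at the top of its uncertainty interval,
# then waits it out (about 2 * EPSILON) before acknowledging the commit.
s = tt_now().latest
commit_wait(s)
```

The price is that every commit pays roughly 2 × EPSILON of waiting, which is exactly why tight clock bounds (GPS + atomic clocks) matter for Spanner’s latency.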

Which characteristics can be sacrificed in order to reach specific goals? The answer: the system should be adapted as much as possible to the needs. It’s another thing when you are actually chasing latencies… Most probably a rare DB will fit your requirements…

If it is not 90% well suited – let the fun part start -> Do it yourself 🙂 Like me:))))