GENIUM Data Store (GDS) is a system I am working on for my Master thesis in NASDAQ.
GENIUM INET messaging bus
To achieve performance goals, GDS leverages the high-performance of GENIUM INET messaging bus. GENIUM INET messaging bus is based on UDP multicasting and is made reliable by totally ordered sequence of backed up messages with a gap-fill mechanism using rewinders. As it is well known, a total order broadcast algorithms are fundamental building blocks for fault-tolerant systems construction. The purpose of such algorithms is to provide a communication primitives that allows processes to agree on the set of messages and order that deliver. NASDAQ OMX implementation of this abstraction assumes a perfect failure detector, e.g. it forces a process to fail if it was considered faulty. Moreover, uniform reliable total order is preserved, where a process is not allowed to deliver any message out of order, even if it faulty.
The receivers of the ordered messages should guarantee exactly-once delivery to the applications for each message, this way uniform integrity is guaranteed. Across the cluster of applications/clients and servers connected to the message stream, messages should be delivered in the same order.
The message stream can be configured without restriction, so that it can contain any number of various clients that can be placed on the same or different server, with defined replication level. The server/sequencer is responsible for the sequenced/ordered stream creation as an output from received clients’ messages.
Failure resilience is provided by the following:
- All message callbacks are fully deterministic and replayable (if single-threaded), as the incoming stream is identical each time it is received.
- Replication can be adopted by installing the same receivers at multiple servers.
- As long as the new primary rewound to the same point as the failed one, the message stream is sufficient to synchronize state.
GDS uses the privileges of the described above infrastructure and a priory maintain a fault-tolerant real-time system. Moreover, the distributed state is made consistent by adhering to the sequenced numbering implied by the message stream.
In a GENIUM Data Store, transactions are submitted through the MoldUDP, therefor it ensures the lowest possible transaction latency.
MoldUDP is a networking protocol that makes transmission of data messages efficient and scalable in a scenario where one transmitter and many listeners are present. It is a lightweight protocol that is built on top of UDP where missed packets can be easily traced and detected, but retransmission is not supported.
Some optimization can be applied to make this protocol more efficient: (a) multiple messages are aggregated into a single packet – to reduce network traffic, (b) caching Re-request Server is placed near remote receiver – to reduce the latency and bandwidth.
MoldUDP presumes that system consists of listeners, which are subscribed to some multicast groups, and server, which transmits on those multicast groups. MoldUDP server transmits downstream packets through UDP multicast to send normal data stream addressed to listeners. MoldUDP server sends heartbeats periodically to clients, so they can retrieve information about packet loss if it takes place. Moreover, listeners should be configured with IP and port to which they can submit the requests.
Note: message in this context is an atomic piece of information that is carried by the MoldUDP protocol from 0 to 64 KB.
In GENIUM Data Store, read query support will be maintained. That is why, TCP-like protocol is intended to be used to stream the data to the client in response to the submitted query.
SoupTCP is a lightweight point-to-point protocol build on top of TCP/IP sockets. This protocol allows to deliver a set of sequenced messages from a server to a client. It guarantees that the client will receive all messages sent from a server strictly in sequence even when failures occur.
Server functionality with SoupTCP includes: (a) clients authentication on login and (b) delivery of a logical stream of sequenced messages to a client in a real-time scenario. Clients sends messages to a server which are not guaranteed to be delivered in case of failures. That’s why the client will need to resubmit the request to the server.
- Client opens a TCP/IP socket to the server with login request.
- If the login information is valid – server responds with accept and starts to send sequenced data.
- Both client and server compute message number locally by simple counting of messages and the first message in a session is always 1.
- Link failure detected by the hearbeating. Both server and client send these messages. Former is required to notify a client in case of failure to reconnect to another socket. Later is necessary to close the existing socket with failed client and listen for a new connection.