Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

...

What is Hydra?

Hydra is a unified real-time streaming platform that enables the ingestion, storage, analysis and distribution of massive amounts of data in near real-time. In other words, Hydra is built for storing large amounts of data very efficiently, and with the capability to query and replicate that data in a streaming-oriented fashionHydra consists of several interconnected systems that work together to provide easier data capture, data access and data replication. At the heart of the system is a distributed messaging system based on logging semantics called Kafka. Kafka provides a high performance, highly-available distributed data stream. In the remainder of this document the terms log, stream, data stream and topic may be used interchangeably and represent the fundamental abstraction of the DVS, the movement of data around various systems in near real-time. Layered on top of Kafka is a custom-developed data transformation, transport and routing platform named Hydra. Hydra simplifies and abstracts creating, manipulating and storing data streams.

We are able to do this by leveraging technologies like:

  • Akka and the Actor Model for distributed and parallel computation
  • Scala, which, along with Akka, gives us the power to leverage asynchronous processing
  • Kafka for distributed data storage, although Hydra can be configured to work with any fault-tolerant "sink" or destination
  • Apache Avro for validation and serialization of incoming data
  • A combination of application-defined metrics collected through Kamon and Prometheus for monitoring system performance and alerting

In a nutshell, Hydra is:

...

But Why?

Why Akka?

We highly recommend reading at least the introduction to akka to understand why it is useful to use for a streaming platform capable of handling huge amounts of data. In short, Akka allows us to build a multi-threaded and distributed application in a safe, developer-friendly way. This helps us avoid many of the common challenges with programming concurrent systems while delivering a high level of quality, efficiency, and speed.

...