HOW IT WORKS
Reactor is a Java-based, integrated data and application framework that layers on top of Apache Hadoop, HBase, and other Hadoop ecosystem components. It surfaces capabilities of the infrastructure through simple Java and REST APIs, shielding you from unnecessary complexity.
COLLECT » PROCESS » STORE » QUERY
Rather than assembling your own Big Data infrastructure stack, Reactor provides an integrated way to compose the different elements of your application: collecting, processing, storing, and querying data.
Reactor provides both Java APIs to build Apps and REST APIs to interact with them externally.
Elements are the individual components you use to build Big Data Applications. There are several kinds of Elements, each supporting a different aspect of a Big Data App: collect, process, store, and query. You can use Continuuity’s pre-built Elements to accelerate development or develop your own.
Streams are the primary means of sending data from external systems into the Reactor. You can write to Streams easily using REST or command-line tools, either one operation at a time or in batches. Each individual signal sent to a Stream is stored as an Event, which consists of a body (a blob of arbitrary binary data) and headers (a map of strings for metadata).
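An Event can be pictured as a small value object: a binary body plus string headers. The sketch below models that shape in plain Java; the class and method names are illustrative only, not the actual Reactor API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of a stream Event: a binary body plus string headers.
// Not the actual Reactor API; all names here are hypothetical.
public class StreamEventSketch {
    private final byte[] body;                  // blob of arbitrary binary data
    private final Map<String, String> headers;  // string metadata

    public StreamEventSketch(byte[] body, Map<String, String> headers) {
        this.body = body;
        this.headers = new HashMap<>(headers);
    }

    public byte[] getBody() { return body; }

    public Map<String, String> getHeaders() {
        return Collections.unmodifiableMap(headers);
    }

    public static void main(String[] args) {
        StreamEventSketch event = new StreamEventSketch(
            "temperature=72".getBytes(StandardCharsets.UTF_8),
            Map.of("origin", "sensor-17"));
        System.out.println(new String(event.getBody(), StandardCharsets.UTF_8));
        System.out.println(event.getHeaders().get("origin"));
    }
}
```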
Flows are user-implemented real-time stream processors. They are composed of one or more Flowlets wired together into a DAG (Directed Acyclic Graph). Flowlets pass Data Objects between one another; each Flowlet can perform custom logic and execute data operations for each individual Data Object it processes. All data operations happen in a consistent and durable way.
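The Flowlet pattern can be sketched in plain Java as a tiny two-node DAG: one "flowlet" splits input into words and emits each one downstream, where a second "flowlet" counts them. This is only an illustration of the wiring idea under assumed names, not the Reactor Flow API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Illustrative two-flowlet pipeline: a splitter emits words to a counter.
// This mimics the Flow/Flowlet DAG pattern; it is not the Reactor API.
public class FlowSketch {
    // "Flowlet" 1: splits a line into words and passes each downstream.
    static void splitter(String line, Consumer<String> downstream) {
        for (String word : line.split("\\s+")) {
            downstream.accept(word);
        }
    }

    public static void main(String[] args) {
        // "Flowlet" 2: counts each word it receives.
        Map<String, Integer> counts = new HashMap<>();
        Consumer<String> counter = word -> counts.merge(word, 1, Integer::sum);

        // Wire the two flowlets together and push one event through the DAG.
        splitter("big data big apps", counter);
        System.out.println(counts.get("big"));  // 2
    }
}
```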
Batch programs allow you to process data with MapReduce. You can write your MapReduce jobs in the same way as you would with a conventional Hadoop system. In addition, you can use Datasets as both input and output with MapReduce jobs. While a Flow processes data as it arrives, Batch programs wait for a large amount of data to be collected and subsequently process that data in bulk.
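The MapReduce model described above can be illustrated with an in-memory word count: a map phase emits (word, 1) pairs over the whole batch, and a reduce phase sums them per word. This is a plain-Java simulation of the programming model, not a real Hadoop job, which would implement the Hadoop Mapper and Reducer APIs instead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory illustration of the MapReduce model: a word count over a
// small batch. A real job would use the Hadoop Mapper/Reducer APIs.
public class BatchSketch {
    // Map phase: emit a (word, 1) pair for every word in every input line.
    static List<Map.Entry<String, Integer>> map(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.split("\\s+")) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Reduce phase: sum the emitted counts for each word.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new HashMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<String, Integer> result = reduce(map(List.of("a b a", "b c")));
        System.out.println(result.get("a"));  // 2
    }
}
```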
Datasets are Big Data building blocks: sharable, queryable implementations of common data patterns, such as counters or time series. A Dataset consists of a set of data (like a table) and the methods to manipulate it (the API). The data is stored in the DataFabric and can be accessed by multiple Apps on a Reactor or via REST.
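A counter is one of the data patterns mentioned above; the sketch below shows how a counter-style Dataset couples its data (a key/value "table") with the API that manipulates it. Here the table is an in-memory map for illustration, whereas a real Dataset would be backed by the DataFabric.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a counter-style Dataset: the data (a key/value "table")
// plus the API that manipulates it. A real Dataset would be backed by
// the DataFabric (HBase), not an in-memory map, and this class is an
// illustrative stand-in rather than the Reactor Dataset API.
public class CounterDatasetSketch {
    private final Map<String, Long> table = new HashMap<>();

    public void increment(String key, long amount) {
        table.merge(key, amount, Long::sum);
    }

    public long get(String key) {
        return table.getOrDefault(key, 0L);
    }

    public static void main(String[] args) {
        CounterDatasetSketch pageViews = new CounterDatasetSketch();
        pageViews.increment("/index.html", 1);
        pageViews.increment("/index.html", 2);
        System.out.println(pageViews.get("/index.html"));  // 3
    }
}
```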
Procedures allow you to make synchronous calls into the Reactor from external systems and perform server-side processing on demand, similar to a stored procedure in a traditional database. A Procedure implements and exposes a very simple API: a method name (string) and arguments (map of strings). This implementation is then bound to a REST endpoint and can be called from any external system.
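The Procedure contract is just that pair: a method name plus a map of string arguments. The plain-Java handler below illustrates dispatching on that contract; in the Reactor the handler would be bound to a REST endpoint, and the class and method names here are hypothetical.

```java
import java.util.Map;

// Sketch of the Procedure contract: dispatch on a method name (string)
// with arguments (map of strings) and return a response. In Reactor the
// handler is bound to a REST endpoint; this is an illustrative stand-in.
public class ProcedureSketch {
    public String handle(String method, Map<String, String> args) {
        if ("getCount".equals(method)) {
            // A real Procedure would read from a Dataset here.
            return "count for " + args.get("word");
        }
        return "unknown method: " + method;
    }

    public static void main(String[] args) {
        ProcedureSketch proc = new ProcedureSketch();
        System.out.println(proc.handle("getCount", Map.of("word", "hadoop")));
    }
}
```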
Reactor was designed with extensibility in mind. What started out with Streams, Flows and Datasets quickly included MapReduce and Procedures. As we develop the platform and as the Big Data ecosystem evolves, more and more capabilities will be added to expand the platform.
Reactor Core comprises the AppFabric and the DataFabric, which allow the Reactor to harness the power of the Big Data ecosystem by integrating with Hadoop, HBase, ZooKeeper, and other components.
Based on Apache Hadoop YARN, the Continuuity AppFabric is a scalable, highly available resource manager, lifecycle manager, and runtime environment for your Applications, whether you’re running in the cloud with a Hosted Reactor or on-premises with an Enterprise Reactor.
Based on Apache HBase, the Continuuity DataFabric stores all your data, which is accessed and manipulated with Datasets.