Getting to know Archipelago

This is the first article in a series that explains what Archipelago is and how you can use it to your advantage.

Although our website already has plenty of information about the Archipelago platform, here we take a different, more direct approach. This series also includes a quick start guide to the product.

So, what is Archipelago?

Archipelago can be thought of as a library for the JVM (Java Virtual Machine). It is written in Scala, which compiles to JVM bytecode, so it is fully compatible with Java.

Archipelago is a system that helps you write other applications or systems on top of it. It helps you write them faster, so projects take less time to complete. How is this achieved? Simple: Archipelago comes with a lot of code already written for you, distributed as components that can be plugged together. You join those components together like Lego bricks and write only the bits that are specific to your domain.

To make joining the components as easy as possible, Archipelago provides a DSL (Domain Specific Language). You can think of it as a higher-level language that lets you achieve more while writing less.


What does this language look like?

OK, we will show you an example. But first let us introduce a few basic notions:

  • Archipelago's components are called Cells. They are execution units that tend to be no longer than an A4 page of code and are usually highly specialized in a single task. For example, a Cell can load data from a file, and it does just that: it loads the raw data as byte buffers without trying to interpret the contents.
  • Reactors are functional groups of components. A Reactor encloses one or more Cells to provide a composed function, for instance reading data from a file and interpreting its contents as CSV (Comma Separated Values).
  • To define a system in Archipelago, we create an object that extends SystemDef and give it a name. Inside this system definition you can subdivide your application further with nested Subsystems, place your reactor definitions directly, or both. Think of each subdivision as a higher-level black-box component built from smaller parts.
  • Every part in Archipelago (Subsystems, Reactors and Cells) has a contract. The contract specifies which types of messages the part can receive and which it can send. The only difference is that Cells have their contracts fixed at implementation time, whereas Reactors and Subsystems have theirs specified at definition time (using the DSL), and the messages they receive or send are delegated to internal elements. Cells are therefore ultimately the only components that send or receive messages.

So, enough for now. Have a look at the following snippet, which shows the definition of a Reconciliation Service. Only the first Reactor, which implements a text file reader, is shown in full; the other reactors are elided.

object ReconciliationSvc extends SystemDef("TradeReconciliationService") {

  reactor("FileLoaderReactor") {
    contract {
      receives[FileLoadRequest] via "FileLoader"
      sends[String] via "BufferStrUnmarshaller"
      sends[String] via "PartialBufferMatcher"
    }

    routing {
      route[String] to "MatchingService"
    }

    cell[FileLoader]("FileLoader") {
      cellConfig() := new Config { override def bufferSize: Int = 65536 }
      route[IndexedByteBuffer] to Next
      route[Tagged[IndexedByteBuffer]] to Nowhere
    }

    cell[BufferStringUnmarshaller]("BufferStrUnmarshaller") {
      cellConfig := new BufferStringUnmarshallerComp.Config {}
      route[IndexedByteBuffer] to Next
      route[IndexedStartOfByteBuffer, IndexedEndOfByteBuffer] to Next
    }

    cell[PartialBufferStringMatcher]("PartialBufferMatcher") {
      cellConfig := new PartialBufferStringMatcherComp.Config {}
    }
  }

  reactor("MatchingService") {
    ...
  }

  reactor("FileStringWriter") {
    ...
  }

  reactor("Orchestrator") {
    ...
  }
}


As the example above shows, the first reactor is called "FileLoaderReactor". In its contract section it declares that it receives FileLoadRequest messages via the FileLoader cell and that it sends out Strings via two of its internal Cells, BufferStrUnmarshaller and PartialBufferMatcher.

In the routing section, the reactor declares that it routes the Strings specified in the contract to another reactor called "MatchingService" (not fully specified in the example).

The reactor then declares three Cells: "FileLoader", "BufferStrUnmarshaller" (of type BufferStringUnmarshaller) and "PartialBufferMatcher" (of type PartialBufferStringMatcher).

The FileLoader cell ultimately receives the FileLoadRequest, as declared in the Reactor contract section (receives[FileLoadRequest] via "FileLoader"). Upon reception, the FileLoader cell starts reading the file and sends out the bytes as IndexedByteBuffer messages to the next declared cell (route[IndexedByteBuffer] to Next).
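
To make the buffer-by-buffer reading concrete, here is a minimal sketch in plain Scala that uses only the JDK. It is not Archipelago's FileLoader implementation: ChunkedRead and readChunks are hypothetical names, and the default of 65536 bytes simply mirrors the bufferSize in the example configuration.

```scala
import java.nio.file.{Files, Path}

// Hypothetical helper for illustration only; not part of Archipelago.
object ChunkedRead {
  // Reads the file at `path` in chunks of at most `bufferSize` bytes,
  // without interpreting the contents in any way.
  def readChunks(path: Path, bufferSize: Int = 65536): Vector[Array[Byte]] = {
    require(bufferSize > 0, "bufferSize must be positive")
    val in = Files.newInputStream(path)
    try {
      Iterator
        .continually {
          val buffer = new Array[Byte](bufferSize)
          val read = in.read(buffer) // -1 signals end of file
          (buffer, read)
        }
        .takeWhile { case (_, read) => read != -1 }
        .map { case (buffer, read) => buffer.take(read) } // drop unused space
        .toVector // forces all reads before the stream is closed
    } finally in.close()
  }
}
```

Each element of the result plays the role of one raw buffer: a later stage decides what the bytes mean.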

The next cell in the reactor, "BufferStrUnmarshaller", receives those buffers and interprets the data they contain. In this case the cell inspects the bytes in each buffer to find line separators. When it finds two separators, it interprets the bytes between them as a line and sends it out as a String, which in turn is routed by the reactor as specified in the contract (sends[String] via "BufferStrUnmarshaller").

When this cell cannot decide whether it has found a line, it delegates the task to the next cell, "PartialBufferMatcher". This happens, for instance, when the end of one buffer and the beginning of the next one together form a line. In that case the "BufferStrUnmarshaller" cell sends those parts to the "PartialBufferMatcher" cell (route[IndexedStartOfByteBuffer, IndexedEndOfByteBuffer] to Next).

Finally, the last cell, "PartialBufferMatcher", matches those fragments from the beginning and end of the buffers to form the remaining lines and sends them out as Strings via the Reactor, as specified in the reactor contract (sends[String] via "PartialBufferMatcher").
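
The line-assembly idea can be sketched in plain Scala, independently of Archipelago. The sketch below is a single-threaded simplification that merges the roles of the two cells into one function: complete lines are emitted immediately, and the trailing fragment of each buffer is carried over and joined with the beginning of the next one. LineAssembly, feed and assemble are hypothetical names, buffers are modelled as Strings rather than byte buffers, and "\n" is assumed to be the line separator.

```scala
// Hypothetical illustration only; none of these names come from Archipelago.
object LineAssembly {
  // Joins the pending fragment with the new chunk and returns the complete
  // lines plus the fragment that follows the last separator.
  def feed(pending: String, chunk: String): (Vector[String], String) = {
    // A limit of -1 keeps trailing empty strings, so a chunk that ends
    // exactly on a separator leaves an empty pending fragment.
    val parts = (pending + chunk).split("\n", -1)
    (parts.init.toVector, parts.last)
  }

  // Folds over all chunks, carrying the unfinished fragment from one
  // chunk to the next.
  def assemble(chunks: Seq[String]): Vector[String] = {
    val (lines, rest) = chunks.foldLeft((Vector.empty[String], "")) {
      case ((acc, pending), chunk) =>
        val (complete, tail) = feed(pending, chunk)
        (acc ++ complete, tail)
    }
    // A final fragment without a trailing separator still counts as a line.
    if (rest.isEmpty) lines else lines :+ rest
  }
}
```

For example, the chunks "alpha\nbe", "ta\ngam" and "ma\n" yield the lines alpha, beta and gamma: "be" and "ta" meet across the first buffer boundary, "gam" and "ma" across the second.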

As you can see, this first reactor specifies a flow of data: it starts at the file by reading raw bytes, moves to the next stage, which transforms those bytes into Strings (lines), and eventually forwards those Strings to another Reactor.

The great thing is that all these components are already written for you; you just need to plug one into another to form those flows. Any component can be plugged into any other as long as their contracts match. That is, a Cell 'A' can be connected to a Cell 'B' if Cell 'A' sends the same messages that Cell 'B' can receive. Isn't that great?
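
As a rough mental model of contract matching, consider the following sketch. It is not Archipelago's API: Contract, Contracts and canConnect are hypothetical names, message types are represented by plain strings, and "contracts match" is read here as "every message type A sends is one that B can receive", which is an assumption.

```scala
// Hypothetical model for illustration; not Archipelago's actual contract types.
final case class Contract(receives: Set[String], sends: Set[String])

object Contracts {
  // Under the assumption stated above, A can feed B when A sends something
  // and everything it sends is accepted by B.
  def canConnect(a: Contract, b: Contract): Boolean =
    a.sends.nonEmpty && a.sends.subsetOf(b.receives)
}

object ContractExample {
  // Message type names borrowed from the reactor definition in this article.
  val loader       = Contract(receives = Set("FileLoadRequest"), sends = Set("IndexedByteBuffer"))
  val unmarshaller = Contract(receives = Set("IndexedByteBuffer"), sends = Set("String"))

  val forward  = Contracts.canConnect(loader, unmarshaller) // loader -> unmarshaller
  val backward = Contracts.canConnect(unmarshaller, loader) // unmarshaller -> loader
}
```

In this model the loader can be plugged into the unmarshaller, but not the other way round, because the loader does not accept String messages.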

Right, this is a lot to digest for a simple presentation. In the next articles of this series we will explore other aspects of Archipelago in detail and, later, how the connections that form those flows work.

To conclude, let us tell you a little secret: every cell in an Archipelago system runs asynchronously, so the system makes full use of parallelism. That is why Archipelago systems are not only easy to create but also provide superb performance!
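
Archipelago's own runtime is not shown in this article, but the general idea of asynchronous stages can be illustrated with the Futures from the Scala standard library. In the sketch below, AsyncStages, readLine and toUpper are hypothetical names standing in for two processing stages; nothing here reflects how Archipelago schedules its cells.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical stand-ins for two stages; not Archipelago cells.
object AsyncStages {
  // Each stage returns immediately with a Future of its result.
  def readLine(id: Int): Future[String] = Future(s"line-$id")
  def toUpper(line: String): Future[String] = Future(line.toUpperCase)

  // The two stages are chained per item, while different items may be
  // processed concurrently on the global execution context.
  def run(ids: List[Int]): Future[List[String]] =
    Future.traverse(ids)(id => readLine(id).flatMap(toUpper))
}

object AsyncStagesDemo {
  // Blocking is acceptable in a demo; real code would keep composing Futures.
  def result: List[String] = Await.result(AsyncStages.run(List(1, 2, 3)), 5.seconds)
}
```

Future.traverse keeps the results in input order even though the individual computations may finish in any order.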