Note: Kafka is no longer used in Openmesh's data collection pipeline due to its centralized architecture; see: Openmesh Core

Data connectors are the first step in the Openmesh data pipeline. They connect to data sources, both on- and off-chain, to ingest granular data in real time for processing by the rest of the pipeline.

Each connector is containerised and orchestrated by a Kubernetes cluster. Connectors are stateless, so they can be started and stopped quickly and safely. They are also isolated from one another: if a single exchange connector fails catastrophically, every other connector is completely unaffected. This isolation also means we can add more sources and data without fear that the system won't scale.
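As a sketch of this architecture, each connector can run as its own stateless Kubernetes Deployment, one per source, so that restarting or removing one never touches the others. The names and image below are hypothetical, not Openmesh's actual manifests:

```yaml
# Hypothetical manifest: one isolated, stateless Deployment per connector.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: binance-connector          # one Deployment per exchange/source
spec:
  replicas: 1
  selector:
    matchLabels:
      app: binance-connector
  template:
    metadata:
      labels:
        app: binance-connector
    spec:
      containers:
        - name: connector
          image: openmesh/connector:latest   # hypothetical image name
          env:
            - name: EXCHANGE
              value: binance
```

Because the container keeps no state, Kubernetes can reschedule or scale it freely, and adding a new source is just another Deployment.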

Precautions have been taken to ensure that data isn't missed when an unexpected event happens on an exchange. Many possible scenarios and status codes are accounted for; for example, in the event of a 429 (rate limit) response, the connector waits a certain amount of time before reconnecting. In the event of a stale connection, a background process detects that no data has been sent over the connection and immediately raises an alert that the connection should be restarted.
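The two safeguards above can be sketched as a pair of small helpers. This is an illustrative sketch, not Openmesh's actual code: the function names, backoff parameters, and staleness threshold are assumptions.

```python
# Illustrative sketch of the two safeguards described above:
# exponential backoff after a 429 response, and a staleness check
# that a background watchdog could run against each connection.

RATE_LIMITED = 429  # HTTP status code returned when rate limited

def backoff_seconds(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Wait time before reconnecting: 1s, 2s, 4s, ... capped at `cap`."""
    return min(cap, base * (2 ** attempt))

def should_restart(last_message_at: float, now: float,
                   stale_after: float = 30.0) -> bool:
    """Flag the connection as stale when no data has arrived
    within `stale_after` seconds."""
    return (now - last_message_at) > stale_after
```

In practice the watchdog would run `should_restart` periodically against the timestamp of the last received message and alert (or restart the connector) when it returns `True`.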

For blockchain connectors, we connect directly to nodes via JSON-RPC. Over WebSockets, the connectors call subscription methods to receive an update whenever a new block is minted on the chain. Then, over HTTP, methods are called to fetch information about the block, including all transactions that have occurred and all smart contract events. As an open source project, Openmesh has been designed to work with any connection to these nodes – it doesn't rely on existing APIs. Users who wish to run a connector themselves can run their own nodes or use any provider without worrying about incompatibility.
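The two-step flow above can be illustrated with the standard Ethereum JSON-RPC methods (`eth_subscribe` over WebSockets, then `eth_getBlockByNumber` over HTTP). This sketch only builds the request payloads, so it works against any node or provider; the request ids are arbitrary:

```python
import json

# Step 1 (WebSocket): subscribe to notifications for newly minted blocks.
def subscribe_new_heads(request_id: int = 1) -> str:
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_subscribe",
        "params": ["newHeads"],
    })

# Step 2 (HTTP): fetch full block details. The trailing True asks the
# node to include full transaction objects, not just their hashes.
def get_block_by_number(block_number: int, request_id: int = 2) -> str:
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_getBlockByNumber",
        "params": [hex(block_number), True],
    })
```

Because these are plain JSON-RPC payloads, the same requests can be sent to a self-hosted node or to any provider's endpoint.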

The codebase has been developed to support arbitrary data sources in the future – its modular, flexible structure means the community can extend Openmesh to connect to any source of crypto data. As an open source project, the community is encouraged to add any data source that could augment our coverage, contributing to the network and making even more data available. In the future, the process for adding data sources will be streamlined, and users will be able to monitor the status of all connectors in a dashboard that showcases the full development lifecycle. A public ledger will also be published to enforce the immutability of the collected data, providing a layer of trustless security and giving the cryptocurrency community a guarantee that our data is valid and accurate.
