This is a brief overview of the Modio systems architecture and how we use Kubernetes.
Our embedded devices in the field are the source of most of our data, so I'll explain a bit about how they function.
Our devices run a custom Linux distribution for data collection and logging, batching and streaming data based on network availability. We do not use "realtime" techniques, preferring continuous small batches to improve network efficiency, and because networks turned out to be not very reliable in practice.
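The batching logic can be sketched roughly as follows; the class name, thresholds, and callback are illustrative, not our actual code:

```python
import time

class BatchUploader:
    """Accumulates data points and flushes them in small batches.

    Illustrative sketch only: flush_cb stands in for the actual
    network upload, and the thresholds are made up.
    """

    def __init__(self, flush_cb, max_points=100, max_age_s=30.0):
        self.flush_cb = flush_cb
        self.max_points = max_points
        self.max_age_s = max_age_s
        self.buffer = []
        self.oldest = None

    def add(self, key, value, timestamp):
        if self.oldest is None:
            self.oldest = time.monotonic()
        self.buffer.append((key, value, timestamp))
        # Flush when the batch is big enough or old enough:
        if (len(self.buffer) >= self.max_points
                or time.monotonic() - self.oldest >= self.max_age_s):
            self.flush()

    def flush(self):
        if self.buffer:
            self.flush_cb(self.buffer)  # real code would retry on network failure
            self.buffer = []
            self.oldest = None
```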
Different protocols are integrated and decoded on the edge, mapped to a normalized format, and stored as a tuple of key, value, timestamp.
Each key maps distinctly to all of the following:
- Data type (down to bits of precision expected from hardware)
- Name (human readable, default)
- Description (also for human consumption)
- Alerting rules
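As a sketch, the per-key metadata could be modeled like this; the field names, the example key, and the rule format are made up for illustration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class KeyInfo:
    """Metadata that every key maps to. Field names are illustrative."""
    data_type: str      # e.g. "int16", down to expected bits of precision
    name: str           # human-readable default name
    description: str    # also for human consumption
    alert_rules: tuple  # alerting rules attached to this key

# Hypothetical registry entry for one sensor key:
REGISTRY = {
    "1234.temp.indoor": KeyInfo(
        data_type="int16",
        name="Indoor temperature",
        description="Temperature reported by the indoor sensor",
        alert_rules=("above:50", "below:-30"),
    ),
}
```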
This device is then responsible for normalizing the data it collects from meters, and storing (a hopefully small amount) for batch processing.
We currently use "modern" TLS with HTTP/2 plus client & server certificates from a separate PKI (there is no public CA trust root on the devices), and each device has its own key for identification.
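A minimal sketch of how a device-side client could set up mutual TLS against a private PKI, using Python's ssl module. The function name is made up, and the paths are optional here only so the sketch runs without real certificate files:

```python
import ssl

def build_device_context(ca_path=None, cert_path=None, key_path=None):
    """Build a client-side TLS context for mutual authentication.

    Trusts only a private PKI root (no public CA bundle) and presents
    the device's own certificate. Illustrative sketch, not our code.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    if ca_path is not None:
        # Trust only the private root, not the system store:
        ctx.load_verify_locations(cafile=ca_path)
    if cert_path is not None:
        # The device's own key identifies it to the server:
        ctx.load_cert_chain(certfile=cert_path, keyfile=key_path)
    return ctx
```

With `PROTOCOL_TLS_CLIENT`, certificate verification and hostname checking are on by default, which matches the "no unauthenticated peers" posture described above.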
This is a rough outline of our processing pipeline, not a complete description.
Ingress / Load Balancer
Nginx acts as the load balancer for a multi-provider Kubernetes cluster, splitting data-processing jobs geographically and logically.
Data is processed in batches via the submit service, which can be scaled out to be geographically close to devices and customers in order to improve network latency.
The data submit process includes:
- Validating each data point according to type rules
- Ensuring processed values are accessible
- Handling notifications & alerting systems
Data is then pushed into storage tier one, PostgreSQL.
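The validation step above can be sketched as follows; the registry layout and the type rules are illustrative assumptions, not our actual schema:

```python
def validate_point(point, registry):
    """Validate one (key, value, timestamp) tuple against type rules.

    Returns an error string, or None when the point is valid.
    """
    key, value, timestamp = point
    info = registry.get(key)
    if info is None:
        return f"unknown key: {key}"
    if info["type"] == "int16":
        # Check down to the precision expected from the hardware:
        if not isinstance(value, int) or not -2**15 <= value < 2**15:
            return f"value out of range for int16: {value!r}"
    if not isinstance(timestamp, (int, float)) or timestamp <= 0:
        return f"bad timestamp: {timestamp!r}"
    return None

def validate_batch(points, registry):
    """Split a submitted batch into accepted points and rejections."""
    accepted, rejected = [], []
    for p in points:
        err = validate_point(p, registry)
        if err is None:
            accepted.append(p)
        else:
            rejected.append((p, err))
    return accepted, rejected
```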
The api service provides our REST API, primarily giving access to historical data, events, and trends. It is also deployed in Kubernetes, and here we also have a cache layer, as the API is designed to serve data using cache-friendly request patterns.
Any data you want to store, you want to store more than once.
We use tiered PostgreSQL storage, with replication for data availability (backups, off-site) and hot+cold tiers for archival data, using a Foreign Data Wrapper (FDW) to reach older data on secondary storage. This allows us to store large amounts of historical data accurately, saving up to 90% of disk capacity for older data.
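The hot/cold split can be illustrated with a routing function that picks a table by data age. The table names and cutoff are assumptions for the sketch, not our actual schema:

```python
import time

CUTOFF_S = 365 * 24 * 3600  # illustrative: one year of "hot" data

def table_for(timestamp, now=None):
    """Pick the storage tier for a data point by its age.

    Recent rows live in a local PostgreSQL table; older rows are
    reached through an FDW-backed foreign table on secondary storage.
    """
    now = time.time() if now is None else now
    if now - timestamp < CUTOFF_S:
        return "datapoints_hot"       # local table, fast disks
    return "datapoints_archive_fdw"   # foreign table via postgres_fdw
```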
Redis is used as a cluster-internal service to be able to quickly answer the
question "what was the last value for sensor
xxx.yy.zzz", which is one of our
most common queries, and something that is surprisingly difficult to answer
efficiently in a relational database.
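The pattern can be sketched with a plain dict standing in for the Redis hash (the real service would use Redis commands such as HSET and HGET); the class and its update rule are illustrative:

```python
class LastValueCache:
    """Answer "what was the last value for sensor X?" quickly.

    Sketch of the pattern we use Redis for, with a dict standing in
    for the Redis store. Illustrative, not our actual code.
    """

    def __init__(self):
        self._last = {}

    def update(self, key, value, timestamp):
        # Keep only the newest point per key; late or out-of-order
        # batch submissions must not move the "last value" backwards.
        cur = self._last.get(key)
        if cur is None or timestamp >= cur[1]:
            self._last[key] = (value, timestamp)

    def last(self, key):
        """Return (value, timestamp) for the key, or None if unseen."""
        return self._last.get(key)
```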
Multi cloud & k8s native
Data storage (Postgres + offloading) is not managed in k8s, but past that, the entire server-side stack is horizontally scalable and easy to split geographically. We've always had a multi-cloud approach to managing our k8s environment for availability and cost reasons, and our existing clusters span multiple providers.
Our story with Kubernetes began three years ago, after a few outages caused by restarting Docker daemons during updates: no containers could be started again until the machine's container storage had been wiped and re-initialized, something that required manual attention during the New Year's Eve weekend.
We originally migrated to k8s mostly as a glorified container launcher, and have gradually come to adopt more and more of the functionality available from the platform.
Where originally our containers were "OS-level" (including init systems and more), a way to package an OS stack and deploy it consistently, they have since turned into a fairly standard Kubernetes workload of multiple individually scalable components.
Things not in Kubernetes
We do not keep databases and data storage in Kubernetes: we try to keep all our Kubernetes nodes stateless, storing no data in volumes or similar, to make the architecture more maintainable and manageable.
Neither are VPN and static IP endpoints in Kubernetes, again because of stateful loads and availability. They could in theory be migrated, but it was deemed not worth it in practice.