We’re excited to announce Maestro, a new open source project that provides a declarative approach to building production-grade Kubernetes Operators covering the entire application lifecycle. Ever since the Operator concept was introduced two years ago, we’ve seen a number of projects for automating many different distributed systems, created by the makers of those systems as well as by community members. Many Operators that exist today handle the initial deployment of the software, but they don’t automate Day 2 operations tasks like binary upgrades, configuration updates, and failure recovery.
Operators combine advanced Kubernetes concepts such as custom resources and controllers, and therefore require deep Kubernetes expertise to build. Implementing a production-grade controller for a complex workload typically requires thousands of lines of code and many months of development. As a result, the quality of the Operators available today varies. Some Operators that can be found online are mere scripts that cannot handle the complex failure scenarios that occur in distributed systems.
At Mesosphere, we’ve been working with our partners for years to automate the complex lifecycle of databases, message queues, file systems, and other stateful distributed systems. About three years ago, we started capturing the patterns we learned from building frameworks for technologies like Apache Kafka, Apache Cassandra, Elastic, and more in the DC/OS Commons SDK. The SDK exposes high-level, human-centered concepts that, in most cases, enable anyone to build service automation using just a declarative spec. Before the SDK existed, creating a framework required deep expertise in Apache Mesos and distributed systems operations, and software vendors had to write thousands of lines of code. Looking at the Operators available today reminded us of those early days. Since the main concepts of the DC/OS SDK aren’t tied to Mesos, we knew that we could eventually create a version of the SDK for Kubernetes. This is how Maestro was born.
How Maestro Helps To Create Operators
Maestro generates many of the artifacts that are required to build and package an Operator, such as Helm charts. Without Maestro, Operator authors have to learn advanced Kubernetes concepts and the configuration formats of several tools and libraries before they can ship a deployable package. Maestro can generate all of these artifacts from a single spec.
Production-grade Universal Operator
Maestro includes a Universal Operator implementation that is based on a state machine, saving authors from writing thousands of lines of code and reinventing the wheel. The Universal Operator can reconcile even complex state changes resulting from failures with the desired system state. The result is that even people new to Kubernetes can create a production-grade Operator easily.
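To make the state machine idea concrete, here is a minimal sketch of a reconcile step that compares observed state with desired state and moves between lifecycle phases. It is not Maestro’s implementation: the phase names, the `TRANSITIONS` table, and the `reconcile` function are hypothetical and chosen only for illustration.

```python
from enum import Enum


class Phase(Enum):
    # Hypothetical lifecycle phases; Maestro's real states may differ.
    PENDING = "pending"
    DEPLOYING = "deploying"
    RUNNING = "running"
    RECOVERING = "recovering"


# Allowed transitions of the toy state machine.
TRANSITIONS = {
    Phase.PENDING: {Phase.DEPLOYING},
    Phase.DEPLOYING: {Phase.RUNNING},
    Phase.RUNNING: {Phase.RECOVERING},
    Phase.RECOVERING: {Phase.RUNNING},
}


def next_phase(current, desired_replicas, ready_replicas):
    """Choose the next phase from the observed state."""
    if current is Phase.PENDING:
        return Phase.DEPLOYING
    if ready_replicas == desired_replicas:
        return Phase.RUNNING
    if current is Phase.RUNNING:
        # Replicas were lost after the system had been healthy.
        return Phase.RECOVERING
    # Still deploying or recovering: stay in the current phase.
    return current


def reconcile(current, desired_replicas, ready_replicas):
    """One reconcile step: compute the target phase and validate it."""
    target = next_phase(current, desired_replicas, ready_replicas)
    if target is not current and target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current} -> {target}")
    return target
```

Calling `reconcile` repeatedly with fresh observations walks the system toward the desired state. An Operator author using a universal implementation like this would supply only the desired state, not the loop itself.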
Best practices built-in
The best practices learned from creating Operators for various software are continuously contributed back to Maestro, so Maestro-based Operators can leverage them by simply upgrading to a new version.
Easy to test
Maestro ships with tools that make it easy to test Operators.
Operators built with Maestro have consistent API endpoints and a common packaging format, simplifying the experience for users who are running multiple different Operators.
The core of the Maestro Universal Operator is the “Framework” CRD. This CRD is a declarative spec that describes the implementation details of a framework, such as the deployment plan, the components required, and other information needed to start a package. Plans are a key concept of Maestro: they allow authors to model any lifecycle event of a system, such as the initial deployment, configuration updates, binary upgrades, and recovery from failures. They provide a higher-level abstraction on top of Kubernetes that feels familiar to people who use runbooks and is easy for novices to grok.
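The runbook analogy can be sketched as a plan made of ordered steps, where a failed step halts the plan. This is a toy model under stated assumptions: the `Step`, `Plan`, and `Framework` classes, their fields, and the step names below are hypothetical and do not reflect the actual Framework CRD schema.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Step:
    name: str
    action: Callable[[], bool]  # returns True on success


@dataclass
class Plan:
    name: str
    steps: List[Step] = field(default_factory=list)

    def run(self) -> List[str]:
        """Run steps in order and stop at the first failure."""
        completed = []
        for step in self.steps:
            if not step.action():
                break
            completed.append(step.name)
        return completed


@dataclass
class Framework:
    name: str
    plans: Dict[str, Plan] = field(default_factory=dict)


def example_framework() -> Framework:
    """Build a hypothetical framework with a deploy and an upgrade plan."""
    deploy = Plan("deploy", [
        Step("create-config", lambda: True),
        Step("start-brokers", lambda: True),
    ])
    upgrade = Plan("upgrade", [
        Step("drain-broker", lambda: True),
        Step("replace-binary", lambda: False),  # simulated failure
        Step("restart-broker", lambda: True),
    ])
    return Framework("kafka", {"deploy": deploy, "upgrade": upgrade})
```

Each lifecycle event gets its own named plan, so an operator can report exactly which step of which plan a service has reached, much as a human would tick off a runbook.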
Creating a Framework triggers the operator to create package-specific CRDs. Users then create instances of these CRDs, which Maestro watches in order to create the service according to the Framework CRD’s specification. Initially, only a few lifecycle hooks will be supported. The roadmap includes the implementation of arbitrary plan specs, as well as the ability to create plan overriders with custom code. This will enable support for more complicated lifecycles, such as those needed to manage ZooKeeper or etcd clusters.
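The flow described here (a Framework registers a package-specific kind, and instances of that kind result in services) can be illustrated with an in-memory toy. Everything in this sketch is an assumption made for illustration: `ToyOperator`, its methods, and the `defaults` key are hypothetical, and no real Kubernetes API is called.

```python
class ToyOperator:
    """In-memory stand-in for the Framework-to-instance flow."""

    def __init__(self):
        self.kinds = {}     # package-specific kind -> framework spec
        self.services = []  # services created for instances

    def create_framework(self, name, spec):
        """Register a framework and expose a package-specific kind."""
        kind = name.capitalize()
        self.kinds[kind] = spec
        return kind

    def create_instance(self, kind, instance_name, params=None):
        """Create a service for an instance of a registered kind."""
        if kind not in self.kinds:
            raise KeyError(f"no Framework registered for kind {kind!r}")
        spec = self.kinds[kind]
        # Instance parameters override the framework's defaults.
        config = {**spec.get("defaults", {}), **(params or {})}
        service = {"name": instance_name, "kind": kind, "config": config}
        self.services.append(service)
        return service
```

For example, registering a `kafka` framework yields the kind `Kafka`, and creating an instance named `orders` with `{"brokers": 5}` produces a service whose configuration overrides the framework default.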