How to work with shared dev clusters (and why) – Part III: What works, and what doesn’t work

This is the last article in this three-part series. In the first article, you learned about the challenges developers face when they try to run all services of a Kubernetes-based application locally. In the second one, I showed you how quickly factorials grow when services aren’t shareable, why you most likely already have a complex setup (it’s just less visible when you only look at the level of individual developers), and what the real costs of running everything locally are.

Now we take a look at the solutions.

What works?

Of the many developers I have talked to, the only ones who were happy were those who:

  • Were able to build locally and validate their changes before pushing to staging,
  • Had full access to the services they were working on, either by running them locally (majority) or having full access to a remote deployment (minority),
  • Had access to highly automated deployment, available to every developer, that removed (or at least reduced) the need to deal with the factorial complexity of their deployments.

How to get there

Getting to this point means:

  1. Having automation that is capable of dealing with factorial complexity,
  2. Having remote computing resources available to enable deployments for each developer,
  3. Making services shareable so that dev deployments are affordable.

Fortunately, 1 & 2 are things you can buy. Unfortunately, 3 is something that you have to take care of internally: Your software needs to be able to support sharing services. Otherwise, the cost of providing dev deployments can be prohibitively high.

However, as mentioned before, this is something that can be worked on iteratively, one service at a time. This will require a change of thinking: Let backwards compatibility go out the window and focus on building something that will work in the future as well.

1: Dynamic configuration management + modular automation

Automation that is capable of dealing with factorial complexity means two things:

  • The ability to express multi-faceted constellations of dependencies and constraints in a maintainable way, which is the basis for creating configurations for specific deployments dynamically
  • Automation that is modular, allowing the re-use of automation steps to create many different outcomes (i.e. deployments)

Dynamic configuration management is an approach to managing configuration that makes it possible to deal with factorial complexity without having hundreds of different config files flying around that are all slightly different and hard to maintain. Instead, you have a central system – a configuration database or something similar – that allows you to define a data model describing your dependencies and constraints.

Modular automation tied to a dynamic configuration management system allows you to automatically deploy a large number of different configurations, making factorial complexity manageable. It ensures that all dependencies and constraints are taken into account, and makes fully automatic deployment available to every developer.
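
To make this more concrete, here is a minimal Python sketch of what such a data model and the resolution step could look like. The service names, versions and compatibility rules are invented, and this is not Cloudomation’s actual configuration model – it only illustrates the principle of defining constraints once and deriving valid configurations from them:

```python
# Illustrative toy model: define constraints once, derive valid configurations from them.
from dataclasses import dataclass, field

@dataclass
class Service:
    name: str
    shareable: bool                                       # can one instance serve several developers?
    compatible_with: dict = field(default_factory=dict)   # e.g. {"billing": ["2.3", "2.4"]}

# A small "configuration database": all constraints live in one place.
CATALOG = {
    "frontend": Service("frontend", shareable=False,
                        compatible_with={"billing": ["2.3", "2.4"]}),
    "billing":  Service("billing", shareable=True),
}

def resolve(requested: dict) -> dict:
    """Turn a request like {'frontend': '1.8', 'billing': '2.4'} into a validated
    deployment configuration, or fail early if a constraint is violated."""
    for name, version in requested.items():
        for dep, allowed in CATALOG[name].compatible_with.items():
            if dep in requested and requested[dep] not in allowed:
                raise ValueError(f"{name} {version} is incompatible with {dep} {requested[dep]}")
    return {name: {"version": version, "shareable": CATALOG[name].shareable}
            for name, version in requested.items()}

print(resolve({"frontend": "1.8", "billing": "2.4"}))
```

The point is not this particular data structure, but that every new dependency or constraint is added in exactly one place instead of in dozens of slightly different config files.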

Cloudomation Engine is an automation platform with a built-in dynamic configuration management system

Cloudomation Engine is an automation platform that is built for precisely this: it has a built-in configuration database that allows you to define custom configuration data models that describe your deployments. You can start with templates that describe common deployment scenarios and extend them to your own needs. You can also ask us to create a deployment data model for you based on your existing configuration files and deployment scripts.

In this data model you could, for example, define which versions of which services are compatible with each other, which services are shareable and which aren’t, and any additional deployment options – such as whether your software can be deployed to an EKS or a GKE cluster, or anything else that is relevant for the deployment of your software.

The next step is to connect this data model, which describes your deployments, with modular automation that can deploy the software in any of the possible configurations.

In a first step, your data model can be small and express only the most common configurations, for which you probably already have deployment scripts. These existing scripts can be referenced, allowing you to get started quickly by reusing existing scripts and configs.

Over time, you can separate your scripts into more modular automation steps that allow you to dynamically create more and more different deployment options. The data model can be extended in lockstep, allowing you to extend your automatic deployment capabilities iteratively.
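
As a rough sketch of what such modular steps could look like – assuming a Kubernetes deployment with Helm charts and one legacy deployment script that is still referenced as-is; the chart paths and script arguments are placeholders, not a real setup:

```python
# Illustrative only: small, atomic deployment steps that can be recombined,
# plus one transitional step that wraps an existing script unchanged.
import subprocess

def step_create_namespace(namespace: str) -> None:
    subprocess.run(["kubectl", "create", "namespace", namespace], check=True)

def step_run_legacy_script(script_path: str, namespace: str) -> None:
    # Transitional: reference an existing deployment script instead of rewriting it.
    subprocess.run(["bash", script_path, namespace], check=True)

def step_deploy_service(name: str, version: str, namespace: str) -> None:
    subprocess.run(["helm", "upgrade", "--install", name, f"charts/{name}",
                    "--namespace", namespace, "--set", f"image.tag={version}"], check=True)

def deploy(config: dict, namespace: str) -> None:
    """Compose the same steps into different deployments, driven by the
    resolved configuration from the data model."""
    step_create_namespace(namespace)
    for name, opts in config.items():
        if "legacy_script" in opts:
            step_run_legacy_script(opts["legacy_script"], namespace)
        else:
            step_deploy_service(name, opts["version"], namespace)
```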

In addition, Cloudomation DevStack is a platform for remote development environments that allows you to combine the automatic deployment of your software with automatic deployment of development tools. This is exposed to developers via a self-service portal that allows each developer to deploy full development environments that include the software they work on.

If developers want to deploy some services locally and connect them to a remote cluster, DevStack supports them by providing relevant config files and scripts that are tailored to the specific deployment they need. Where required, DevStack automatically deploys the remote cluster or remote services to an existing cluster, or just provides the relevant configurations if all required remote resources already exist.
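
As a rough illustration of the kind of tailored configuration such a portal could hand to a developer, the sketch below writes a per-developer env file that points locally run services at localhost and everything else at a shared remote deployment. The hostnames, service names and file format are invented for the example and are not DevStack’s actual output:

```python
# Illustrative only: generate a per-developer env file for a hybrid local/remote setup.
def write_dev_env(developer: str, local_services: list,
                  remote_base: str = "dev.example.com") -> None:
    lines = []
    for name in ("frontend", "billing", "search"):       # placeholder service list
        if name in local_services:
            lines.append(f"{name.upper()}_URL=http://localhost:8080/{name}")
        else:
            # Shared, remotely deployed instance of the service.
            lines.append(f"{name.upper()}_URL=https://{developer}.{remote_base}/{name}")
    with open(f".env.{developer}", "w") as f:
        f.write("\n".join(lines) + "\n")

# Alice runs only the frontend locally; everything else is consumed remotely.
write_dev_env("alice", local_services=["frontend"])
```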

Cloudomation Engine and Cloudomation DevStack make it possible for developers to work on complex Kubernetes-based software without having to deal with the complexity of running all services locally, or of running some services locally and connecting them to a remote cluster.

2: Sufficient compute resources

Compute resources to run the application developers work on can be provided either locally – by buying developers really beefy laptops – or remotely.

Remote computation becomes the only option once the application becomes too heavy-duty for laptops.

Fortunately, buying remote computation is easy. The problem is not buying compute from a cloud provider, but the fact that it can quickly become very expensive. At the point where developers are no longer able to run the application locally, the compute resources the application requires are large enough to represent a significant cost in a cloud environment.

There are ways to manage cost even in such cases:

  • Downscaling, hibernating and removing unused resources automatically and swiftly (a sketch of this follows at the end of this section),
  • Leaving as much locally as possible, for example by deploying non-shareable services locally and providing only a smaller subset of shareable services remotely,
  • Buying hardware instead of renting cloud computation: Running a local dev cluster in your office can be a lot cheaper than renting the same computation in the cloud.

All of this requires deployment automation to already be in place, so that developers can actually use and manage the remote compute resources efficiently.
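
As a sketch of the first point – automatically downscaling and hibernating unused resources – the following script scales all deployments in dev namespaces down to zero outside working hours. It assumes a hypothetical "purpose=dev" namespace label and a nightly cron job; adapt both to your own conventions:

```python
# Illustrative only: hibernate dev deployments outside working hours.
import json
import subprocess
from datetime import datetime

def dev_namespaces() -> list:
    # Assumes dev namespaces carry a (hypothetical) "purpose=dev" label.
    out = subprocess.run(
        ["kubectl", "get", "namespaces", "-l", "purpose=dev", "-o", "json"],
        check=True, capture_output=True, text=True).stdout
    return [item["metadata"]["name"] for item in json.loads(out)["items"]]

def hibernate(namespace: str) -> None:
    # Scaling to zero keeps the configuration but frees the compute resources.
    subprocess.run(["kubectl", "scale", "deployment", "--all",
                    "--replicas=0", "-n", namespace], check=True)

if __name__ == "__main__":
    hour = datetime.now().hour
    if hour >= 20 or hour < 6:          # e.g. triggered by a nightly cron job
        for ns in dev_namespaces():
            hibernate(ns)
```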

3: Making services shareable so that remote compute becomes affordable

As described above, the best way to achieve long-term cost efficiency while still giving developers the ability to validate their work is to work on the shareability of your services.
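
What “shareable” means in practice varies from service to service, but the core pattern is always the same: stop assuming that one running instance belongs to one developer (or customer), and scope every request and every piece of state to a tenant identifier instead. Below is a minimal sketch of that pattern, using Flask and a made-up "X-Tenant-Id" header purely for illustration – in a real service the identifier would more likely come from authentication claims, and the state would live in a tenant-aware data store:

```python
# Illustrative only: scope all state to a tenant/developer identifier so that
# many developers can share one running instance of the service.
from flask import Flask, jsonify, request

app = Flask(__name__)
_state: dict = {}   # in-memory stand-in for a tenant-scoped data store

def tenant_id() -> str:
    # Hypothetical header; in practice this would come from auth claims.
    return request.headers.get("X-Tenant-Id", "default")

@app.post("/items/<name>")
def put_item(name: str):
    _state.setdefault(tenant_id(), {})[name] = request.get_json(silent=True)
    return jsonify(status="stored", tenant=tenant_id())

@app.get("/items/<name>")
def get_item(name: str):
    # Each tenant only ever sees their own data, even on a shared instance.
    return jsonify(_state.get(tenant_id(), {}).get(name))
```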

Summary

There are two constraints for developers working on complex microservices architectures:

  1. Factorial complexity of running the software
  2. Limited local computing resources

Both problems increase with the age and complexity of the software being developed. For small and medium-sized software companies with software of medium size and complexity, this leads to a lot of developers’ time (10–25%) being spent on managing local deployments.

For larger software companies with complex and large software products, this often means that developers are simply not able to validate their changes before they push them to a shared repository. The result is costly quality issues in the software.

To solve this, both constraints have to be addressed.

Factorial complexity can be solved with dynamic configuration management and modular automation.

Dynamic configuration management allows you to automatically create and maintain configurations for complex, multi-faceted constellations of dependencies and constraints. Dependencies and constraints are formulated once, and configurations can then be created automatically based on the defined rules.

Modular automation means automating each individual step of a deployment atomically, so that the steps can be combined dynamically to create different deployments.

Our products, Cloudomation Engine and DevStack, are built to do precisely this: they take existing scripts and configuration files and extract a model of your deployments from them that clearly shows dependencies and constraints, which can then be extended and adapted as needed. These deployments can then be created automatically, initially reusing as much of your existing automation as possible while iteratively moving towards modular automation that is maintainable and usable long-term.

The second problem, limited local computing resources, can be addressed in the time-tested way of “throwing resources at the problem”: by simply buying the required computation from cloud providers. Depending on the resource requirements of the software, however, this can be prohibitively expensive.

To solve this, microservices need to become shareable, i.e. multi-tenant capable. To get your software to this point, you will likely have to let go of backwards compatibility and make some fundamental changes to your microservices architecture.

Fortunately, this also pays off elsewhere: it vastly increases the scalability of your software, reduces cost in production, and reduces complexity in development, making your developers faster and allowing you to bring new features to market more quickly.

Mid- to long-term, it will most likely also increase the quality of your software. As shown in an example in part II of this blog post series, backwards compatibility offers far smaller cost savings than service shareability anyway, making a clear case for investing in shareability rather than backwards compatibility.

Margot Mückstein
CEO & co-founder of Cloudomation