Why Cloudomation went RDE

Remote Development Environments (RDEs) / Cloud Development Environments (CDEs) seem to be all the rage lately. Or maybe it is just selective attention on my side that makes me see RDEs as a topic everywhere since I started thinking about them seriously about six months ago, in the summer of 2022.

Back then, I had only a vague understanding of what an RDE is. But even without knowing many details, it was clear to me that RDEs are the future. After all, I had long wondered why it is that the most tech-savvy people – software developers – are the people who are still the most dependent on their laptops, preaching the end of manually configured server infrastructure while continuing to insist on manually configured local development environments. It is obvious to me that the days of the manually configured developer laptop are numbered, and RDEs are where things are moving.

That realisation alone didn’t yet trigger any further action on my part: after all, electric cars are the future and I haven’t bought one yet either. But a lot has happened since then. Most of it happened in my head. Here is the story of how I first started thinking about RDEs as something that we might need, to the point where we as a company decided to develop an RDE product: Cloudomation DevStack.

Why local development environments are important

We are a software startup, founded in 2019. We are a team of 8 people, of whom 5 are software developers. We develop a fairly complex application: the Cloudomation automation and integration platform, which consists of several containerised components.

Our software developers all run Cloudomation locally on their laptops. They do this to be able to write code and get fast feedback on the changes they make. Developers iterate a lot over their code and build their local instances several times before committing code changes to the central repository. By the time she commits, a developer should be sure that her code at least doesn’t break the build or the application.

Developing without having Cloudomation locally would be like writing code blindly – not something anybody would recommend. We have a centralised and fully automated CI/CD pipeline which is triggered by each git push. It runs not just the build but a full test set, and deploys the new version to a feature branch system or directly into the shared development instance. Using only that pipeline for development, without a local build system, would however mean

waiting until the pipeline finishes before being able to see your changes (or knowing whether the build was successful),
waiting even longer when other developers’ builds queue up in the pipeline,
having only limited ability to inspect the application, because the code is minified during the production build,
having to connect to a remote deployment via ssh in order to see logs, and
putting up with a long tail of smaller inconveniences, which together make it unlikely that our developers would ever be willing to give up running Cloudomation locally.

More importantly, being able to run Cloudomation locally and having a developer-optimised local build which is different from the production build makes our developers a lot more productive.

The problem: local development environments are hard

However, running Cloudomation – or any complex software – locally is also difficult. Even though Cloudomation is fully containerised, running several Docker containers locally and managing local builds and dependencies turns out to be quite complex, and is becoming more so as our software becomes more sophisticated.

We develop an automation platform because we are obsessed with automation – and as such, we invested a lot into automating our development processes and building advanced developer tools early on, when our team was even smaller. Our team still consists of “only” five developers (of whom only two work full time), but we are unsatisfied with the amount of time developers spend on manually troubleshooting their local development environments. Every time a new person joins, it becomes obvious again how painful it still is to set up a local development environment and to run Cloudomation locally.

In addition, in too many sprint retrospectives, developers tell stories about issues they had with their local development environments – from issues with Docker to issues with local builds and anything in between. Sometimes, they are nice stories of overcoming adversity and helping each other out in the team. But sometimes, the troubleshooting stories do not end with a joke, but rather with a frustrated statement like: “Then I just dropped it all for the day because I simply didn’t have the patience anymore”.

As CEO, I sit and listen. I have come to think of this kind of troubleshooting as team building or learning exercises – just to find something positive about it. It hurts me that after all the time we have invested in building good tools, so many issues remain unresolved and so much time is still spent troubleshooting local development environments. And the return on additional time invested keeps diminishing: adding a new feature to our local build tool to help resolve one potential issue for one operating system is simply not worth my people’s time, because it will not be used much, if at all. There is a good chance that this same issue will simply never occur again in our small team. The sad truth is that the range of issues that can occur is too large, and the entire ecosystem too changeable, to pre-empt them all.

And so we plod on: developers spending time troubleshooting, me listening and trying to gauge how much frustration they are willing to accept, and how much of a drag on their productivity I’m willing and able to sponsor.

I’m not a perfectionist. No entrepreneur can afford perfectionism for long. It’s perfectly okay that we will not be able to resolve all issues and get our developers to 100% productivity. But I do have high standards. Wasting the time of my most expensive people on things that they do not enjoy doing and which provide no direct value simply does not sit well with me. I estimate that about 10% of my developers’ time is spent on troubleshooting development tools, their laptops, and their local Cloudomation instance – i.e. troubleshooting their local development environment – as well as waiting for local builds and sometimes testing tools. Even if it cannot get to 0, 10% is simply too much.

There are of course attempts to resolve this issue.

Solution attempts: help but don’t go far enough

Talking to our developers, I learned that containerisation already helps a lot. Those in our team who have worked with non-containerised applications in previous jobs say that the situation there was way worse, and try to reassure me that we are actually a highly efficient team. Apparently, the amount of time our team collectively spends on troubleshooting local environments is small compared to what other development teams have to deal with. Some tell stories of a day or more per week (20%+!) spent waiting on builds, fixing local builds, searching for root causes of local build failures, etc.

Talking to other software companies, I heard a lot of sighs: it’s a very common problem. Most teams have simply accepted the status quo and resigned themselves to frustration and inefficiency. It pains me to remember some of the conversations I had.

Here is what I heard others have tried in order to get a grip on the time developers spend troubleshooting their environments:

The most common one is standardisation: forcing developers into a tight corset of one operating system with a fixed set of tools that they are allowed to use, which have been tested to work well together. Sure, this also helps – a bit. It also is very difficult to enforce, creates a lot of frustration with developers, and leaves too many issues still unaddressed.
Containerisation is another common approach. It helps, as our own developers confirm, but doesn’t solve everything because running containers is in itself a complex thing to do. Containerisation can also be very hard to achieve for software which was not originally designed to run in containers.
I also heard CEOs, CIOs and even heads of development put a lot of faith into automated CI/CD pipelines. “Once we have the CI/CD pipeline up and running, all these issues will disappear. Developers won’t need to run the application locally anymore.” I’ve heard variations of this many times now. Unfortunately, it is not true.
I highly recommend fully automated CI/CD pipelines. I also highly recommend taking care of that before trying to address issues with local builds and deployments, because those issues are harder to resolve. But it is illusory to think that a CI/CD pipeline will make local builds obsolete. On the contrary: the CI/CD pipeline is something that will have to be maintained separately and which needs to be different from local build tools. It serves a different – very important – purpose, and therefore needs different features. A CI/CD pipeline will therefore add to the effort you’ll have to invest in DevOps, not decrease it. It will also be a significant boon to the quality of your software and is a very worthy investment.

So – we were left thinking about standardisation as the only approach we hadn’t yet tried to reduce the amount of time spent troubleshooting. We went as far as forbidding our developers to use Windows, which had caused the most issues. We didn’t want to go any further. Our developers are highly competent, motivated and intelligent people. None of them have issues with their IDE or other tools they choose themselves: they chose them for a reason. They have issues with the parts that we give them: Cloudomation, the local build tool, Docker, and our new frontend testing tool, which turns out to be a resource hog. Standardising other parts of their local dev environments would bring no benefit but create a lot of friction.

We started buying stronger laptops to deal with this resource hog of a test tool. We also cheered ourselves up with the thought that stronger laptops will also mean faster local builds and better experience overall. It still feels like a very crude approach that doesn’t really solve anything but just reduces the symptoms of a bad situation.

A new idea: Remote Development Environments (RDEs)

Then we started thinking about standardised remote development environments. We started talking about this during the pandemic, when it occurred to us how weird it is that many people are now doing more and more of their work in the cloud, but software developers are stubbornly sticking to working locally. Software developers might be among the group of professionals most affected by losing their laptop. Me, CEO, I can do my work from pretty much any device. Everything is in the cloud. Strategy, marketing, sales, product management, accounting and finance, communication, everything: I need a browser and my login, that’s it. My brother, co-founder, CTO: crash his laptop and he will need a day or two before he can start productively developing software again. And he’s by far the most skilled and fastest in setting up his tooling.

Why?

Because the tools for developers to work remotely simply don’t exist yet. There are some attempts and approaches, but nothing that comes close enough in functionality and comfort to working locally.

It took cloud-based office tools at least a decade from their first emergence in the early 2010s to become mature enough for a broad spectrum of people to be willing and able to use them as their main work tools. Even though the pandemic gave cloud workspaces a big push, the majority of people still refuse to let go of their locally installed PowerPoint, Excel, Word. Why should it be any different for developers, who are a discerning bunch and who do complex work that needs complex tools?

They are also a much smaller group than “general office workers”, so the financial incentives were clearly much larger to develop something like Office365 or Google Workspace, rather than proper tooling for remote work for software developers.

But that is changing. Software developers are not just growing in number; the share of output they produce is increasing even faster than their number. And the lack of skilled software developers means that their time is becoming more and more valuable.

It took us a long time to make the connection between the ability to work remotely – which was the core feature of remote development environments (RDEs) as we saw them at first – and the potential of RDEs to significantly reduce the time and effort spent on maintaining local development environments. Finding the right line of separation between what is kept locally and what is provided remotely could finally represent a realistic solution to drastically reduce the issues our developers face with local development environments, while retaining the level of control and comfort that they have gotten used to with their local tools (when they work).

The thought stuck with us. With time, we started to see more and more benefits: the ability to provide really beefy RDEs to speed up builds and tests, which can be scaled down or deleted when not needed. Developers wouldn’t need beefy laptops. They wouldn’t even need proper laptops: they could use a tablet if they like. They could use Windows, if it makes them happy. They wouldn’t need to keep any source code locally. They could run tests without having test data on local machines. We don’t use PII for testing, but for other applications, this could be a huge boon. Switching branches would become really easy – or rather, it would become obsolete. You’d simply spin up an RDE for a different branch or version and work in parallel. What a relief!

Keeping a clear head through the initial excitement was hard. Of course RDEs will not solve all our problems, and of course any first iteration of an RDE would most likely provide worse UX than working locally. But we felt that we had finally found a path forward, a way to address the root causes of all the time wasted on troubleshooting local development environments.

Decision: Cloudomation RDEs as a new product

We are starting our first RDE experiment right now, in February 2023. With Cloudomation Engine, we already have the backbone in place: an automation tool that allows us to deploy custom environments with developer tooling. There are some additional components we need, but for a bare-bones MVP, we were ready to go within a few days.

From the conversations I’ve had over the last months about RDEs with a range of people, I’ve come to realise that problems with local development environments are ubiquitous among software developers. I’ve also come to realise that the tools on the market are all limited and at a very early stage. There seems to be an explosion in interest right now, with many large players (e.g. Gitpod, GitLab) announcing moves towards RDE products at the end of 2022.

So we decided to build RDEs not just as an addition to our internal developer tooling but as a product that others can use as well. I’d like to invite you to join us on that journey and become one of the first test users of our new RDEs. As with Cloudomation Engine, we want to develop our RDE product based on real-world requirements and feedback from users (including ourselves). If you want to be one of those users and get an RDE product built to your specifications, drop me a line at margot@cloudomation.com or reach out to me on LinkedIn: https://www.linkedin.com/in/margot-mückstein/. I’m excited!