The remote dev decision: part one

November 15, 2022

Author: Stytch Team

Here at Stytch, we’re focused on building great developer experiences — for our customers and our coworkers. Engineers’ time is a precious resource, and we want to do everything we can to make sure it’s spent efficiently and effectively. 

For customers, we’ve created a suite of developer-friendly auth solutions with our API and SDKs, which allow developers to experiment with and compare authentication solutions easily, and get up and running in a matter of hours. 

In house, we need an equally strong developer experience to deliver on continuous improvements and features. We owe a lot of our agility, responsiveness, and speed to working in a remote development environment. 

But it wasn’t always that way! In this series, we’ll go over some initial challenges our engineering team faced with local development, why we moved to the cloud, and some of our favorite fixes and tools for making remote dev work as efficiently as possible. 

In this first installment, we’ll go over:

  • What remote dev is, its pros and cons
  • Stytch’s challenges with local dev
  • The reasons we ultimately went remote

Overview: remote dev pros and cons

What is remote development?

Remote development requires moving your team’s development environment outside of your local machine. Typically, this means moving the development process into a container or virtual machine (VM) in the cloud, and having a tool to sync the code from a developer’s machine to that remote environment. From there, a developer can build and run their service without being bogged down by the limits of their local machine. 

Pros and cons

The biggest benefits of remote dev are centralization, scale, and efficiency. Remote dev centralizes tool versioning and updates, as well as idempotent resources, so there are no surprise edge cases on individual devs’ machines. Remote dev also helps maximize developer productivity by no longer limiting engineers to the RAM or CPU of their individual machines, by minimizing differences between development and production, and by creating consistent and repeatable steps for developers. 

But migrating to remote dev is not without its challenges. Many companies delay their migration to remote dev because doing so adds a lot of initial friction to the development flow and requires a lot of time and person-power to set up. This is why it’s more common for large enterprises like Google and GitHub to operate in a remote environment, but less common for startups like Stytch. For many, the upfront costs are too great to justify when a company is getting off the ground. 

We can think of the tradeoff between local dev and remote dev as a function of two variables: the size of your team and the size of your software services. As both of those sizes increase, the scale tips towards remote dev as the more productive way to develop.

Early days: local and friction-filled

In the first few months of Stytch’s life, we had one goal and one goal only: ship. A couple of months after we launched our first product, Email Magic Links, we took a step back to evaluate our development cycle. We wanted to take a strategic stance on what kind of developer experience we wanted to build in-house, and what we needed to meet our goal of becoming a preferred auth provider on the market.

Some things were apparent from the outset, just from roadblocks we’d hit in our day-to-day flow. 

For example, doing something as simple as editing our email template in our dashboard required running every service. This turned what should have been small tasks into time- and resource-intensive undertakings. And when we did ship features, we couldn’t guarantee that the development process was consistent. We even shipped a bug very early on because an engineer was writing a feature in the API with Redis disabled on their local machine. Local development was starting to slow us down. 

Before we took on our first customers, we wanted to minimize delays and mishaps like this as much as possible. As we looked at the bigger picture, we identified five main productivity drags in our dev environment we wanted to address:

  • Too many services: We had to run six different terminal windows and two databases to develop our dashboard and any example applications. 
  • Extra context, wasted time: The frontend team needed to understand how to run our backend code, and the backend team to understand the frontend code. That meant a lot of extra work on both sides. 
  • Time- and people-intensive: There was a fifteen-step process to set up the dashboard, and any changes to the process required coordinating with the other team members. The process would take over an hour to set up and required insider knowledge. New engineers needed context on code bases that they didn’t need to edit. 
  • Local configs vs. production configs: Some features were developed with a different configuration than production, which often caused issues in production even when the features worked locally.
  • Hard to actually test: Testing across services was challenging as the team would need to make sure all the versions of the different services were aligned. 
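
The gap between local and production configs is the kind of thing a small check can surface before it ships. Here is a minimal sketch, assuming each environment's settings can be loaded as a flat dictionary. The function name `config_drift` and the keys `redis_enabled`, `base_url`, and `log_level` are hypothetical illustrations, not Stytch's actual settings or tooling.

```python
# Sketch: report settings that differ between a local and a production config.
# All names and values below are hypothetical examples.

MISSING = "<missing>"  # placeholder for a key that one side does not define


def config_drift(local, production, ignore=()):
    """Return {key: (local_value, production_value)} for every differing key."""
    drift = {}
    for key in sorted(set(local) | set(production)):
        if key in ignore:
            continue  # some settings are expected to differ (e.g. log level)
        local_value = local.get(key, MISSING)
        prod_value = production.get(key, MISSING)
        if local_value != prod_value:
            drift[key] = (local_value, prod_value)
    return drift


local_cfg = {
    "redis_enabled": False,
    "base_url": "http://localhost:3000",
    "log_level": "debug",
}
prod_cfg = {
    "redis_enabled": True,
    "base_url": "https://example.com",
    "log_level": "info",
}

drift = config_drift(local_cfg, prod_cfg, ignore={"log_level"})
for key, (local_value, prod_value) in drift.items():
    print(f"{key}: local={local_value!r} production={prod_value!r}")
```

A check like this only flags the difference. Deciding which differences are acceptable still takes the kind of shared context that was hard to maintain across individual machines.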

With these five factors in mind, we set to work defining our ideal development environment.

What we needed

To meet our goals of scale and deployment speed, we had clear specifications for what we needed from our development environment. It needed to:

  • Allow engineers to update, upgrade, or experiment without disrupting other engineers or workflows or requiring extra work or syncing to avoid shipping bugs. 
  • Centralize tool versioning and updates: No more worrying about or managing weird edge cases in a lone engineer’s machine. 
  • Provide a seamless transition between development and production: Because most local development is done over HTTP (but we use HTTPS in production), any feature work that depended on the domain, like cookies and local storage, was needlessly challenging. We wanted to minimize that challenge so we could get to production as easily as possible. 
  • Offer robust processing power: Stytch quickly grew from requiring six microservices to run to eight, in addition to two databases. To run and make changes to all of those and test end-to-end, we needed more raw computing power than we could milk from our local machines. 
  • Be easy to test: “Fail fast” is one of our core values at Stytch, but to fail fast you need to be able to test end-to-end rapidly. We never wanted to hear the phrase “Well, it worked on my machine” ever again.
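
To make the HTTP vs. HTTPS point concrete: a cookie marked `Secure` is only sent back over HTTPS, so a session flow that works in production can silently break on a plain-HTTP local server. The sketch below uses Python's standard `http.cookies` module to build such a cookie, plus a deliberately simplified model of the browser rule. The helper names `build_session_cookie` and `would_send` and the URLs are assumptions for illustration, and the model ignores the exceptions some browsers make for localhost.

```python
from http.cookies import SimpleCookie
from urllib.parse import urlsplit


def build_session_cookie(value):
    """Build a session cookie the way a production service typically would."""
    cookie = SimpleCookie()
    cookie["session"] = value
    cookie["session"]["secure"] = True    # only send over HTTPS
    cookie["session"]["httponly"] = True  # hide from page JavaScript
    cookie["session"]["path"] = "/"
    return cookie


def would_send(cookie, name, url):
    """Simplified browser rule: a Secure cookie is sent only on https URLs."""
    is_secure = bool(cookie[name]["secure"])
    return (not is_secure) or urlsplit(url).scheme == "https"


cookie = build_session_cookie("abc123")
print(cookie.output())  # the Set-Cookie header a server would emit

print(would_send(cookie, "session", "https://dev.example.test/login"))  # True
print(would_send(cookie, "session", "http://dev.example.test/login"))   # False
```

Developing against an environment that already serves HTTPS removes the need to special-case cookies like this during feature work.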

With these requirements, we first considered improving our local development environment, to defer the leap to the cloud. We saw three options:

  1. Run all the services in Docker on our developers’ machines. This reduced the work to run the services, but didn’t solve for load or keeping all our engineers’ work in sync. 
  2. Have the service in development hit our staging environment for the other services. This simplified development, but complicated our staging environment’s data and made it difficult to test a service connected to a local machine. 
  3. Mock out the endpoints for other services. While this is a common approach, we ran into problems where the system worked for a mock, but the real service returned different data.
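
The failure mode in the third option, where a mock and the real service disagree, can be illustrated with a small structural comparison. This is a sketch under stated assumptions: both responses are already parsed JSON, the function name `shape_diff` is ours, and the field names (`user_id`, `emails`, `status`, `created_at`) are hypothetical and not taken from the Stytch API.

```python
# Sketch: find structural differences between a mocked response and a real one.
# It compares dictionary keys and value types only; list contents are not
# inspected, and actual values are never compared.

def shape_diff(mock, real, path="$"):
    """Yield a description of each key or type mismatch between two payloads."""
    if isinstance(mock, dict) and isinstance(real, dict):
        for key in sorted(set(mock) | set(real)):
            if key not in real:
                yield f"{path}.{key}: only in mock"
            elif key not in mock:
                yield f"{path}.{key}: only in real response"
            else:
                yield from shape_diff(mock[key], real[key], f"{path}.{key}")
    elif type(mock) is not type(real):
        yield (
            f"{path}: mock has {type(mock).__name__}, "
            f"real has {type(real).__name__}"
        )


# Hypothetical payloads: the mock was written by hand and has drifted.
mock_response = {
    "user_id": 123,
    "emails": ["a@example.com"],
    "status": "active",
}
real_response = {
    "user_id": "user-test-123",
    "emails": ["a@example.com"],
    "status": "active",
    "created_at": "2022-11-15T00:00:00Z",
}

diffs = list(shape_diff(mock_response, real_response))
for line in diffs:
    print(line)
```

A check like this catches shape drift, but it has to be run against the real service to be useful, which brings back the problem of needing every service available during development.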

In the end, we realized we needed to move ALL development into the cloud to meet our full requirements. Whatever initial setup or resourcing might be required up front, we felt it was well worth it. Having seen other startups defer the decision and pay for it, we wanted to save ourselves those kinds of headaches down the road. 

Today, we couldn’t be happier with our decision. But there were some fun challenges along the way we needed to solve in order to make sure remote dev at Stytch could reach its full potential. Namely:

  • Scheduling all the different microservices and making them easily discoverable
  • Replacing a remote service with a local version for development 
  • Load balancing and routing traffic through DNS records
  • Enabling consistency in developer tools and configs
  • Creating valid certificates for HTTPS
  • Automating the provisioning and syncing of new environments
  • Provisioning external cloud resources from AWS to use during development
  • Fostering a culture that rewards developer productivity and innovation

In the rest of this series, we’ll tackle these challenges one by one, breaking down each one and the steps required to solve it.

If tackling problems like this interests you, check out our career page!
