Reduce Cycle Time

Case study

A major auto manufacturer reduces cycle time by testing faster

One member of Launchable’s beta program is a leading car manufacturer. We'll call them RocketCar. The central tooling team at RocketCar serves approximately 150 developers.

Key challenges: Slow software delivery times. Slow test feedback is a major component of their cycle time challenges. Without Launchable, their only path forward is to add expensive testing hardware.

Employees: 100,000+
Mode of Development: Embedded software
Scenario: Pre-merge tests
Languages & Tooling: Python, C, C++, Jenkins, Gerrit, Nosetest, Yocto Linux

Summary

How does Launchable help? Launchable brings machine learning to test automation tooling to identify the tests that matter for each change developers make. This AI-based approach to test automation helps the team test faster and reduces cycle time.

RocketCar can make the following improvements with Launchable:

  • Reduce test feedback time: We calculated the impact of slow feedback from tests (35k-50k developer hours per year), backing up their suspicion that tests are the bottleneck with hard data. It turns out Launchable can reduce their test time by 40%.
  • Maximize hardware capacity: Launchable can help them get 2.7x more juice out of their existing hardware, delaying the need for additional machines.
  • Reduce wait times for hardware: Launchable can reduce the wait time for test hardware and free up almost 68%.

The bottom line? Launchable can help RocketCar save around $1.1M per year by slashing feedback time by 40% (17.4k hours) and reducing the need for costly hardware.

About RocketCar: an embedded story

RocketCar builds some of the world’s top luxury vehicles: sedans, SUVs, and sports cars. The team at RocketCar is a central tooling team that supports around a thousand developers (internal + vendors). For the beta program, they chose a project that supports 100+ developers.

The team builds software for in-car dashboard systems. The test automation platform thus tests builds on a custom hardware device which costs a few thousand dollars. They maintain an internal lab to house the hardware.  

At its most basic, the division at RocketCar is about delivering software on embedded devices. The challenges of slow feedback and optimal hardware utilization are applicable to any company delivering software on embedded devices. RocketCar is similar to several Launchable beta partners who fit this mode of development. 

The key challenge: slow feedback for developers

Their primary challenge is slow feedback caused by long build times. Thus, the end-to-end cycle time for delivering features is very high. The tools team faces a couple of executive escalations a month from their stakeholder development teams. Worse yet, this team cannot quantify the organizational cost of slow feedback.

Cycle time was just brought up in an escalation in a meeting. Developers are frustrated that things are not moving fast enough…There is not enough hardware, and we can't ramp up fast enough. Not enough hardware drives up execution time. This is the biggest pain in our team.

Engineering Manager

RocketCar


Procuring and maintaining hardware

At heart, the division at RocketCar is about delivering software on an embedded device. The challenges of slow feedback, cycle time reduction, optimal hardware procurement & utilization are applicable to any company delivering software on devices.

RocketCar is archetypical of a few Launchable beta partners who fit this mode of software development. 

Developers have to wait for the hardware to be available before new workloads can be tested. This is where the lack of testing hardware hits the team hard.

The current mitigation path is to add new hardware, but they can only do so during the annual budget process. The annual process means that it is hard to course-correct during the year.

Adding more capacity is ideal although we have a steep ramp up for hardware. Our FY21 goal is 100% growth in capacity to meet demand… the hardware might be $100s of dollars, but if you include all other costs, like housing that gets into $1,000-$2,000 per unit.

Technical Lead

RocketCar


An interesting side effect of this process is that new projects are starved for resources until new hardware is ready. Therefore, newer projects suffer a higher wait time than older projects.

Maintaining the hardware and the corresponding lab is a non-trivial cost. A dedicated team manages the lab. Additional hardware comes with additional maintenance overhead.

At this very moment, due to COVID-19, infrastructure for tests is hard to maintain because it involves physical equipment.

Engineering Manager

RocketCar


Cycle time components

We can break RocketCar's test run time into several components:

[Figure: Cycle Time Components]

Each build takes about two hours to complete, so feedback time for developers is also about two hours per change. The fixed overhead and optimizable components are split evenly in the build.


The cost of slow feedback

Launchable brought insight into the aggregate time spent waiting across the division. The team runs between 316 and 476 builds per week. With this and the build time data, we were able to compute the daily and yearly hours all developers spend waiting, and the corresponding dollar impact. Slow feedback costs this team of 100-200 developers approximately 35k-50k hours per year.
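The arithmetic behind an estimate like this can be sketched as follows. The exact inputs behind the 35k-50k figure are not given here, so the sketch assumes that each build keeps one developer waiting for its full duration, that builds run 52 weeks a year, and that every build takes the roughly two hours quoted earlier. The function and constant names are illustrative only.

```python
# Illustrative estimate of yearly developer hours spent waiting on builds.
# Assumptions (not stated in the case study): one developer waits per build,
# builds run 52 weeks a year, and each build takes about two hours.
WEEKS_PER_YEAR = 52
BUILD_HOURS = 2.0


def yearly_wait_hours(builds_per_week: int, build_hours: float = BUILD_HOURS) -> float:
    """Hours per year that developers spend waiting for build feedback."""
    return builds_per_week * build_hours * WEEKS_PER_YEAR


low = yearly_wait_hours(316)   # quiet week
high = yearly_wait_hours(476)  # busy week
print(f"Estimated wait: {low:,.0f} to {high:,.0f} hours per year")
```

Under these assumptions the estimate comes out at roughly 33k to 50k hours per year, in the same ballpark as the figure above.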


[Figure: The Cost of Slow Feedback]

Improving developer feedback and cycle time by 39%

The machine-learning-based testing algorithm analyzed builds, code changes, and test results to create what we call a "Confidence Curve" for RocketCar (shown below). This curve represents how quickly a developer finds out that her changes have a problem. It shows the percentage of tests that Launchable needs to run (x-axis) to achieve a given confidence level (y-axis). You can think of confidence as the predicted share of regressions that will be found.

[Figure: Improve Developer Cycle Time]

The dotted line shows the pre-Launchable performance of the system. The key number on this curve is 90% confidence: without Launchable, 75% of the tests must run to reach 90% confidence (dotted line). With Launchable, only 20% of the tests must run to reach the same 90% confidence (red line).

By using Launchable to run only the pertinent 20% subset of tests, the test execution time drops by 50 minutes. The build time falls from 127 minutes to 77 minutes, which includes 65 minutes of overhead that we cannot influence.

This is a 39% reduction in build time for developers!
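As a quick check of that percentage, here is a minimal sketch using only the two build times quoted above. The helper name is illustrative.

```python
# Build-time figures quoted in the text, in minutes.
BUILD_BEFORE_MIN = 127
BUILD_AFTER_MIN = 77


def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, as a percentage."""
    return (before - after) / before * 100


saved_min = BUILD_BEFORE_MIN - BUILD_AFTER_MIN
reduction = percent_reduction(BUILD_BEFORE_MIN, BUILD_AFTER_MIN)
print(f"{saved_min} minutes saved, {reduction:.0f}% shorter builds")
```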

Furthermore, Launchable allows RocketCar to choose any number on the red curve to optimize between feedback time and confidence. For example:

Test Execution Time      Confidence
1.5 minutes (2.5%)       80%
6 minutes (10%)          87.5%
11 minutes (20%)         90%
29 minutes (50%)         95%
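One way to read this trade-off is as a lookup: given a target confidence, pick the shortest-running point that still meets it. The sketch below is not Launchable's API. It hard-codes the four points from the table, and `OperatingPoint` and `fastest_point` are hypothetical names used for illustration.

```python
from typing import NamedTuple, Optional


class OperatingPoint(NamedTuple):
    minutes: float         # test execution time
    subset_pct: float      # share of the test suite that runs
    confidence_pct: float  # confidence reached at this point


# The four points listed in the table above.
CURVE = [
    OperatingPoint(1.5, 2.5, 80.0),
    OperatingPoint(6, 10, 87.5),
    OperatingPoint(11, 20, 90.0),
    OperatingPoint(29, 50, 95.0),
]


def fastest_point(target_confidence: float) -> Optional[OperatingPoint]:
    """Return the shortest-running point that meets the target, or None."""
    candidates = [p for p in CURVE if p.confidence_pct >= target_confidence]
    return min(candidates, key=lambda p: p.minutes) if candidates else None


print(fastest_point(90))
```

A team that asks for 90% confidence lands on the 11-minute point. A target above 95% returns `None`, because the table lists no point that high.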

Tripling hardware capacity

By testing what matters, the pressure on hardware units is reduced, and the net effect is that capacity increases by 2.7x. The team can defer adding more hardware. Getting more juice out of the same hardware means that the associated maintenance costs don’t grow. It is also worth calling out that smaller projects no longer starve for resources.

Here are the numbers after choosing to optimize for 20% tests at 90% confidence (on the red curve from the earlier section).

Every test session occupies a single hardware unit. First, about 15 minutes are spent to reset and load the software onto the testing hardware. It then takes about 57 minutes to run the full test suite.

But by only testing what matters, that 57 minutes is cut down to 11 minutes. Now, a test session only occupies a hardware unit for 26 minutes instead of 72! This means that the same hardware pool can now accept 2.7x the previous workload, resulting in shorter cycle time and faster time to market. This stretching of the capacity comes in really handy early in the development cycle when hardware units are expensive and time consuming to manufacture.
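The capacity figure follows from the session times above. Here is a minimal sketch, assuming a hardware unit is busy only for setup plus test execution. The constant and function names are illustrative.

```python
# Session timings quoted in the text, in minutes.
SETUP_MIN = 15       # reset and load software onto the hardware unit
FULL_SUITE_MIN = 57  # running the full test suite
SUBSET_MIN = 11      # running the 20% subset at 90% confidence


def session_minutes(test_minutes: float) -> float:
    """Total time one test session occupies a hardware unit."""
    return SETUP_MIN + test_minutes


before = session_minutes(FULL_SUITE_MIN)
after = session_minutes(SUBSET_MIN)
capacity_gain = before / after  # sessions the same pool can absorb, relative to today

print(f"{before} min -> {after} min per session, {capacity_gain:.2f}x capacity")
```

The ratio of 72 to 26 minutes is about 2.77, which the text quotes as 2.7x.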


[Figure: Tripling hardware capacity]

Reducing queue wait times

A developer must wait for test hardware to become available before her tests can run. Today, RocketCar over-provisions hardware to keep the queue wait time reasonable for most cases. 50% of test sessions get the hardware unit they need within 1 minute.

But you might be surprised to hear that the average wait time is much worse: 5 minutes. This is because when things get busy, the queue grows much longer. The worst-case scenario goes north of 60 minutes.

Yet those are precisely the times that developers need test hardware the most!

The issue for the team is that most people remember the one time that they had a bad experience, and that takes away from all the excellent work this team is doing.

By reducing the time a test session occupies a hardware unit, we can reduce the queue wait time by at least 63%. The impact is even bigger during crunch time, when the queue starts to back up.


[Figure: Reducing queue wait times]

The bottom line impact: cost savings

There are two ways to think about the benefits to RocketCar – bottom and top line impact. 

Here, we’ve focused on the bottom line impact, which is cost savings in terms of time spent waiting and hardware reduction, because it is easier to quantify. However, we posit that the bigger impact is on the top line, because faster cycle times with higher quality shippable code translate to more benefits delivered by RocketCar to its customers. 

The bottom line impact is $1.1M saved each year.

We have a two-fold impact here:

  • Saving developer hours waiting for feedback or saving hours wasted by developers doing something other than their primary task. We think of this as “leaks” or “waste” in an organization that have always been hard to quantify, let alone get a handle on. 
  • Hardware cost and maintenance cost reduction.

Massive drop in costs waiting for results

A 40% drop in test run time drives a corresponding drop in yearly wait time: a staggering 17.4k hours saved on average (range 12.9k-22.4k) for a group of 100-200 engineers. The potential for further savings is enormous because this team serves about 500-1000 engineers across all projects in the division.

Saving 40% of the time implies a 40% dollar saving, or roughly $1M. This saving easily exceeds the goal the team stated when they began their engagement with us.

If we could drop execution time to 20-30%, that would be great for us.

Engineering Manager

RocketCar

[Figure: Massive drop in costs waiting for results]

Reduction in hardware purchases

The workload that used to require 11 hardware units can now be served by only 4. At $2K per unit, this translates into a $14K CapEx saving as well as a $3K/yr OpEx saving, even assuming a conservative $35 per hardware unit (HU) per month. (OpEx pricing is based on Mac Mini colocation hosting.)
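The hardware savings can be reproduced from the unit counts and prices quoted above. This is a minimal sketch with illustrative constant names.

```python
# Figures quoted in the text.
UNITS_BEFORE = 11
UNITS_AFTER = 4
CAPEX_PER_UNIT_USD = 2_000        # purchase cost per hardware unit
OPEX_PER_UNIT_MONTH_USD = 35      # conservative hosting cost per unit per month

units_saved = UNITS_BEFORE - UNITS_AFTER
capex_saving = units_saved * CAPEX_PER_UNIT_USD
opex_saving_per_year = units_saved * OPEX_PER_UNIT_MONTH_USD * 12

print(f"{units_saved} units saved: ${capex_saving:,} CapEx, "
      f"${opex_saving_per_year:,}/yr OpEx")
```

Seven fewer units gives $14,000 in CapEx and $2,940 per year in OpEx, which rounds to the $3K/yr quoted above.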

The team’s test suite grows at the rate of 9% per month. This implies that the team is in a race to add more hardware. More hardware requires more lab space, cooling units, and personnel to manage this lab. By reducing hardware, the team can delay these costs.
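To get a feel for how long such a delay might last, the sketch below compounds the 9% monthly growth. It rests on two assumptions that the case study does not state: hardware demand grows in proportion to the test suite, and the capacity gain from the earlier session times (72 minutes down to 26) stays constant as the suite grows.

```python
import math

MONTHLY_GROWTH = 0.09     # test suite growth rate quoted in the text
CAPACITY_GAIN = 72 / 26   # assumed constant; taken from the session times

# How much larger the suite is after twelve months of compounding growth.
yearly_growth = (1 + MONTHLY_GROWTH) ** 12

# Months until compounding growth uses up the one-time capacity gain.
months_of_headroom = math.log(CAPACITY_GAIN) / math.log(1 + MONTHLY_GROWTH)

print(f"Suite grows {yearly_growth:.1f}x per year; "
      f"headroom lasts about {months_of_headroom:.0f} months")
```

Under these assumptions the suite grows about 2.8x in a year, so the one-time capacity gain buys about a year of headroom.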


The top-line impact: testing faster to deliver more customer facing value

RocketCar isn’t a company that uses dated software delivery practices. However, the limitations imposed by the underlying hardware mean that they cannot use modern development practices to their full extent. These practices recommend pushing smaller changes through the pipeline more often, which is hard in their environment.

This is where Launchable shines. We reduce software delivery cycle times because we can identify the right tests to verify changes. Developers getting faster feedback means their eventual changes are higher quality. Consequently, developers can push through smaller changes faster and more often. The ultimate winner is the customer, who now gets value delivered to her earlier.


Happier developers

Waiting for tests is one of the biggest hurdles in a developer's day. Long wait times mean longer days or even weekends at work.

This team at RocketCar is responsible for helping 100-200 developers deliver code. Today, the central team sees a few executive escalations per month because of slow cycle times. An occasional wait time of one hour during crunch time is what people remember, even if wait times are usually low.

This is where Launchable shines! Shorter test cycles with less context switching mean developers can pick up a task and keep working on it without interruption until it is done. This raises employee satisfaction and, ultimately, helps engineering leaders drive down attrition. This is a potential benefit of bringing in Launchable for every company.

By adding Launchable, the tools team can make a massive impact across the division.
