The First Steps in Our DevOps Transformation

In the second chapter of this series, DevOps for the Win, we explore the crucial initial steps of a DevOps transformation journey. The first steps in any transformation are often the hardest. In DevOps, identifying the right initial steps, ones that are both impactful and achievable, is key to driving meaningful progress.

Many organizations make the mistake of blindly implementing best practices without considering their unique constraints, which often leads to frustration and failure fatigue. A more strategic approach is required: one that identifies bottlenecks and systematically removes them.

How to Identify the First Steps?

A powerful principle for identifying where to start comes from The Goal: A Process of Ongoing Improvement by Eliyahu M. Goldratt, first published in 1984. The idea is simple: identify the primary bottleneck, the limiting factor that restricts system performance. In our case, the bottleneck became clear when we saw work piling up with the development team. Tasks were being started but not completed, and the backlog kept growing.

To uncover the root causes, we organized deep-dive sessions with development teams, tech leads, and product managers. These discussions surfaced several key issues:

Teams were taking on new tasks before completing existing ones.
No clear task completion criteria existed: what developers considered “done” often differed from stakeholder expectations.
There were no objective quality measures to check before marking work complete, leading to defects and rework.
Frequent reassignments disrupted team stability, causing inefficiencies.

Establishing Stable Teams

One of the first corrective measures was restructuring teams for consistency and focus. Instead of treating groups of individuals as ad hoc task forces, we formed stable, long-lived teams with clearly defined ownership.

Setting up some teams, like development teams, was relatively straightforward. However, organizing support teams, such as infrastructure, project management, product management, and business analysis, proved trickier. We went through several iterations to find the right combinations.

Following the team-first thinking principle outlined in Team Topologies by Matthew Skelton and Manuel Pais, we:

Created small, autonomous teams (4–7 members) to promote deep domain expertise and accountability.
Eliminated cross-team dependencies by ensuring individuals were fully allocated to a single team.
Iterated on team structures, particularly for support functions like infrastructure and product management, to optimize flow.

Leveraging Tools for Visibility and Management

Once the teams were structured, we moved to Azure DevOps for backlog and work management. A unified platform enabled:

Better work visibility so teams could track progress transparently.
Standardized definitions of “done” to align expectations across functions.
Improved backlog management, reducing scope creep and prioritization conflicts.

Enforcing Objective Quality Metrics

To address the issue of unclear quality measures before marking tasks as “ready for testing,” we integrated a static code analysis tool, providing objective insights into:

Code coverage
Security vulnerabilities
Code smells

We also enhanced our definition of “Done” to make quality checks a mandatory step before marking work items complete.
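As a rough illustration of what a mandatory quality check might look like, the sketch below gates a work item on the three metrics listed above. The `QualityReport` fields, the thresholds, and the `gate_failures` helper are all hypothetical; they are not taken from any particular analysis tool or from our actual configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityReport:
    """Hypothetical summary of one analysis run for a work item."""
    coverage_percent: float
    open_vulnerabilities: int
    new_code_smells: int


def gate_failures(report, min_coverage=80.0, max_new_smells=0):
    """Return the reasons a work item may not be marked done (empty list = pass)."""
    failures = []
    if report.coverage_percent < min_coverage:
        failures.append(
            f"coverage {report.coverage_percent:.1f}% is below {min_coverage:.1f}%"
        )
    if report.open_vulnerabilities > 0:
        failures.append(f"{report.open_vulnerabilities} open security vulnerabilities")
    if report.new_code_smells > max_new_smells:
        failures.append(
            f"{report.new_code_smells} new code smells (limit {max_new_smells})"
        )
    return failures
```

Returning the list of reasons, rather than a bare pass/fail flag, keeps the gate objective while still telling the developer what to fix.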

Managing Work in Progress (WIP)

One of the most impactful changes was monitoring and limiting Work in Progress (WIP). High WIP indicated bottlenecks, allowing us to proactively address areas where work was stalling.
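A minimal sketch of this kind of WIP monitoring follows, assuming work items can be exported as (id, stage) pairs. The stage names, the limits, and the `stages_over_limit` helper are illustrative assumptions, not values from our boards.

```python
from collections import Counter

# Hypothetical per-stage WIP limits; real values would be tuned per team.
WIP_LIMITS = {"In Development": 4, "In Testing": 3}


def stages_over_limit(items, limits=WIP_LIMITS):
    """Return {stage: (count, limit)} for each stage whose WIP exceeds its limit.

    `items` is an iterable of (work_item_id, stage) pairs.
    """
    counts = Counter(stage for _, stage in items)
    return {
        stage: (counts[stage], limit)
        for stage, limit in limits.items()
        if counts[stage] > limit
    }


board = [
    (101, "In Development"), (102, "In Development"),
    (103, "In Testing"), (104, "In Testing"),
    (105, "In Testing"), (106, "In Testing"),
    (107, "Done"),
]
print(stages_over_limit(board))  # {'In Testing': (4, 3)}
```

A stage that repeatedly shows up in this report is a candidate for the next bottleneck to investigate.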

This systematic approach, rooted in The Goal and Team Topologies, laid the foundation for continuous improvement, reducing dependencies and improving accountability.

One of the most significant realizations in our DevOps transformation was that, after these initial improvements, the primary constraint to flow shifted to software design, specifically the testability of our code. Once we had improved team setup, agreed on a “Definition of Done,” and implemented code analysis, work began piling up in the testing phase: code developed by scrum teams often waited in a queue for the QA members of those same teams.

Addressing the Next Bottleneck: The Testing Gap

With the initial bottlenecks addressed, a new constraint emerged: testing delays. While development tasks were progressing efficiently, testing became a roadblock, preventing faster releases. Upon investigation, we identified key challenges slowing down the process:

Manual testing reliance: Most tests were executed manually, leading to slow feedback loops and delayed defect detection.
UI dependency in testing: Since UI components were typically completed last, testing couldn’t start until late in the development cycle.
Lack of proactive automation: Automated tests, such as Selenium-based UI tests, were only written after manual testing, limiting their effectiveness in early-stage validation.

Shifting to Test-First Approaches

To break this bottleneck, we prioritized unit tests and API tests over UI tests. The shift required fundamental changes in development mindset and software design:

API Testing Improvements

To make API testing efficient:

Well-Defined APIs: APIs needed to be simple, well-documented, and available early in the development phase so that testers could create test cases proactively.
Avoiding the Database as the Integration Point:
- Using databases as the integration point between teams created dependencies that slowed down testing.
- Test case creation required setting up databases with complex data, which increased setup time and limited the ability to test multiple scenarios.
- By moving to API-based integration (primarily REST over HTTP and, more recently, GraphQL), we significantly reduced the delays and complexities caused by database dependencies.

This shift enabled faster progress in automating test cases for APIs, showing that focusing on API-first designs improved both testability and efficiency.
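To illustrate why an API contract is easier to test than a shared database, here is a small sketch of an API-level test that needs no database setup. It uses only the Python standard library; the `/orders/42` route, the response shape, and the `OrderHandler` stand-in service are hypothetical and exist only to make the example self-contained.

```python
import json
import threading
import unittest
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen


class OrderHandler(BaseHTTPRequestHandler):
    """Stand-in for a service under test; the /orders/42 route is hypothetical."""

    def do_GET(self):
        if self.path == "/orders/42":
            body = json.dumps({"id": 42, "status": "shipped"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep test output quiet


class OrderApiTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Port 0 asks the OS for any free port.
        cls.server = HTTPServer(("127.0.0.1", 0), OrderHandler)
        threading.Thread(target=cls.server.serve_forever, daemon=True).start()
        cls.base_url = f"http://127.0.0.1:{cls.server.server_port}"

    @classmethod
    def tearDownClass(cls):
        cls.server.shutdown()
        cls.server.server_close()

    def test_get_order_returns_status(self):
        with urlopen(f"{self.base_url}/orders/42") as response:
            self.assertEqual(response.status, 200)
            self.assertEqual(json.load(response)["status"], "shipped")
```

Saved as a test module, this can be run with `python -m unittest`. In a real project the test would target the team's actual service, and the test case could be written as soon as the API contract is agreed, before any UI exists.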

Strengthening Unit Testing Practices

Unit testing posed a bigger challenge:

Rapid Progress, Then Coverage Theater: Initially, we achieved high levels of code coverage, but code reviews revealed that many unit tests were superficial, written just to meet coverage targets. These tests failed to check meaningful functionality, cover failure scenarios, or validate edge cases.
Changing Mindsets: Developers needed to understand the value of unit tests, not just for improving code quality but also for accelerating development by catching issues early.
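The difference between a coverage-driven test and a meaningful one can be sketched as follows. The `apply_discount` function is a hypothetical example, not code from our products. Both test classes execute the function, but only the second checks results, boundaries, and failure cases.

```python
import unittest


def apply_discount(price, percent):
    """Hypothetical function under test."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


class SuperficialTest(unittest.TestCase):
    def test_runs(self):
        # Executes the happy path (raising coverage) but asserts nothing
        # about the result, so a wrong calculation would still pass.
        apply_discount(100.0, 10)


class MeaningfulTest(unittest.TestCase):
    def test_applies_percentage(self):
        self.assertEqual(apply_discount(200.0, 25), 150.0)

    def test_boundaries(self):
        self.assertEqual(apply_discount(80.0, 0), 80.0)
        self.assertEqual(apply_discount(80.0, 100), 0.0)

    def test_rejects_out_of_range_percent(self):
        for bad in (-1, 101):
            with self.assertRaises(ValueError):
                apply_discount(80.0, bad)
```

A coverage report alone cannot distinguish the two styles, which is why we relied on code reviews to catch the first one.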

Challenges in Writing Testable Code

The biggest roadblock to effective unit testing was the lack of testability in the codebase itself. Key issues included:

Large, monolithic functions.
Tightly coupled components.
Poor separation of concerns.
Insufficient abstraction.

These issues made it difficult to isolate and test individual units effectively. To address this, we:

Provided Guidelines: We shared best practices on:
- Selecting functions for unit testing.
- Refactoring code to improve testability.
- Designing and using mocks to facilitate testing.
Focused on Incremental Improvements: Developers were encouraged to make small, meaningful changes to improve testability over time.
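One common refactoring of this kind is to pass a dependency in rather than create it inside the function, so that a mock can replace it in tests. The sketch below assumes a hypothetical `InvoiceService` and a payment gateway with a `charge(customer_id, amount)` method; neither comes from our codebase.

```python
from unittest.mock import Mock


class InvoiceService:
    """Hypothetical service; the gateway is passed in rather than created inside."""

    def __init__(self, gateway):
        # Any object with a charge(customer_id, amount) method will do.
        self._gateway = gateway

    def bill(self, customer_id, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        ok = self._gateway.charge(customer_id, amount)
        return "paid" if ok else "failed"


def test_bill_reports_failed_charge():
    gateway = Mock()
    gateway.charge.return_value = False  # simulate a declined payment
    assert InvoiceService(gateway).bill("c-1", 50.0) == "failed"
    gateway.charge.assert_called_once_with("c-1", 50.0)


def test_bill_rejects_non_positive_amount_without_charging():
    gateway = Mock()
    try:
        InvoiceService(gateway).bill("c-1", 0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
    gateway.charge.assert_not_called()
```

The test functions are plain functions with assert statements, so they can be called directly or collected by a runner such as pytest. Because the gateway is injected, the failure path can be exercised without a real payment system, which is exactly the kind of scenario that is hard to reach in tightly coupled code.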

Progress, Challenges, and the Road Ahead

For legacy products, improving testability remained a slow process due to architectural constraints. However, these actions weren’t just aimed at improving the current codebase; they were an investment in the future:

Developers built better habits, embedding testability into new codebases.
Teams became more self-sufficient, reducing reliance on external QA efforts.
The organization avoided repeating past mistakes, ensuring future products were easier to maintain and evolve.

The key takeaway? DevOps isn’t about implementing tools or checklists. It’s about continuously improving the flow of work, one constraint at a time.

In the next chapter of this series, we’ll explore “Designing for Testability”. We’ll dive deeper into how improving software design can remove testing constraints, enabling faster feedback loops and higher software quality. By addressing constraints one at a time, first in team structures, then in workflow management, and finally in testability, we set the foundation for a sustainable DevOps transformation.