From Technology‑Driven Delivery to Value‑Driven AO

An IELTS Modernization Case Study

1. Background and Challenge

In 2023, the IELTS modernization programme set out to accelerate delivery and improve quality for a complex, enterprise‑scale IT landscape. The organization adopted SAFe and operated with multiple PODs. There was strong architectural governance and substantial investment in tooling and QA. Yet, the volume of business functionality delivered remained below expectations.

Delivery suffered from long lead times, frequent rework, and low predictability, especially when changes were pushed late in the process. Despite "doing agile" from a process perspective, there was no shared, quantified view of system performance and no coherent operating model aligned with value streams, which made focused improvement difficult.

The AO‑method (Agile Organization method) was selected to:

  • Diagnose the current operating system using a systemic, quantitative lens.
  • Design a value‑driven organizational model and transformation backlog.
  • Validate the new model via controlled experiments with selected PODs.

2. AO‑Method Assessment – AO scorecard baseline

The first AO assessment, conducted in early May 2023, evaluated five dimensions: Deliver, Demand, Capacity, Capability, and Organization.

On a 0–10 scale, IELTS scored:

  • Deliver: 0
  • Demand: 3
  • Capacity: 1
  • Capability: 1
  • Organization: 0

The AO goal state for the coming quarter was defined as Deliver 5, Demand 10, Capacity 10, Capability 10, and Organization 10, making the performance gap explicit for both leadership and teams.
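
A scorecard like this becomes most useful when it is treated as data rather than decoration. Below is a minimal sketch (in Python, purely illustrative) of computing the per-dimension gap between baseline and goal state; the dimension names and the 0–10 scale come from the assessment above, while the plain-dictionary representation is an assumption.

```python
# Illustrative AO scorecard gap analysis. Dimension names and the 0-10 scale
# come from the assessment; the plain-dict representation is an assumption.

BASELINE = {"Deliver": 0, "Demand": 3, "Capacity": 1, "Capability": 1, "Organization": 0}
GOAL     = {"Deliver": 5, "Demand": 10, "Capacity": 10, "Capability": 10, "Organization": 10}

def score_gaps(baseline: dict, goal: dict) -> dict:
    """Per-dimension gap between the quarterly goal state and the baseline."""
    return {dim: goal[dim] - baseline[dim] for dim in goal}

# Largest gaps first, so improvement work can be focused deliberately.
for dim, gap in sorted(score_gaps(BASELINE, GOAL).items(), key=lambda kv: -kv[1]):
    print(f"{dim:<12} {gap:>2}")
```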

Key findings by AO dimension
  • Deliver: Releases were irregular and often tied to big-bang events. Most PODs delivered into sandbox or test environments rather than production-like systems. CI/CD existed in parts of the landscape, but organizational conditions (dependencies, approvals, test environments) prevented continuous delivery.
  • Demand: Multiple product and feature backlogs co-existed, mixing functional and technical items. Business value was not consistently used as the primary prioritization criterion, which led to last-minute decisions, frequent backlog pushes into running sprints, and weak alignment of team backlogs with programme-level roadmaps.
  • Capacity: Teams were not stable, and people were frequently disturbed during sprints. Scrum Masters struggled to protect teams from organizational noise. The AO assessment showed that teams were far from the "pizza team" ideal of clear boundaries and stable allocation.
  • Capability: Developers were largely specialized by component or technology and did not own work end-to-end, which limited self-commitment and slowed flow. User stories were often too coarse, lacking acceptance criteria and a Definition of Done (DoD). Estimation sessions turned into capacity negotiations.
  • Organization: Architecture, QA, and support functions were organized as central service providers with high approval power, which created long dependency chains and blocked flow. Value streams were not structurally reflected in the organization, and PODs often supported other teams instead of owning outcomes.

Kanban and Lean lens

Beyond the scorecard, AO applied a Kanban and Lean waste lens:

  • Work was not fully visualized end-to-end, policies (e.g. DoD, capacity allocation, risk rules) were implicit, and WIP limits were absent or merely local (see the sketch after this list).
  • Feedback loops around quality, financial, and flow metrics (CFDs, control charts) were sporadic, and improvement routines (True North, target conditions, PDCA) were weak or missing.
  • The eight classic wastes were clearly visible: transportation and waiting due to big-bang releases and late testing, overprocessing through repeated test cycles, and underused talent as a result of siloed experts and central approvals.
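
Making flow and policies explicit, as the Kanban lens demands, can start very small. The sketch below (illustrative only) derives a daily WIP snapshot per workflow state from ticket status-transition events, the same raw material a CFD is built from, and checks it against explicit WIP limits; the event format and the limit values are assumptions, not the programme's actual data model.

```python
# Illustrative sketch: a daily WIP snapshot per workflow state, derived from
# ticket status-transition events (the raw input behind a CFD), checked
# against explicit WIP limits. Event format and limits are assumptions.

from collections import Counter
from datetime import date

# (ticket, state, entered_on): one row per status transition
transitions = [
    ("IELTS-101", "In Progress", date(2023, 5, 2)),
    ("IELTS-101", "Test",        date(2023, 5, 9)),
    ("IELTS-102", "In Progress", date(2023, 5, 3)),
    ("IELTS-103", "In Progress", date(2023, 5, 8)),
]

def wip_on(day: date) -> Counter:
    """Tickets per state on a given day (latest transition up to that day wins)."""
    latest = {}
    for ticket, state, entered in sorted(transitions, key=lambda t: t[2]):
        if entered <= day:
            latest[ticket] = state
    return Counter(latest.values())

WIP_LIMITS = {"In Progress": 1, "Test": 2}  # an explicit policy instead of an implicit one

snapshot = wip_on(date(2023, 5, 10))
print(dict(snapshot))
for state, limit in WIP_LIMITS.items():
    if snapshot[state] > limit:
        print(f"WIP limit breached in '{state}': {snapshot[state]} > {limit}")
```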

3. Systemic Diagnosis: “Not a Team Problem”

The AO‑method reframed the situation from “teams not performing” to “system not designed for flow and learning”.

Technology‑driven model

The existing working model was heavily technology‑driven:

  • Architecture and QA were in the lead; development PODs essentially executed their decisions and handled defect backlogs.
  • Several teams (Hydra, Inception, Transformers, Decepticons) delivered mostly non‑functional or support work, indicating that development teams were not end‑to‑end.
  • PI planning did not consistently include all relevant teams and functions, which delayed architectural decisions and increased misalignment risk.

The programme behaved like one large, monthly Scrum with strong central control rather than a network of autonomous, value-oriented PODs.

Events and artifacts symptoms

AO uncovered characteristic anti‑patterns around agile ceremonies and artifacts:

  • Sprint Planning often mixed functional and technical decomposition; stories did not meet the INVEST criteria, and the DoD was not used as the central contract. Developers tended to be told what to do rather than committing to clear, negotiable backlog items.
  • Sprint Reviews focused on code demos instead of working functionality and UAT, which weakened empirical control of product value.
  • Retrospectives were high-level and management-oriented, producing few concrete experiments for the PODs’ way of working.
  • Release backlogs and cross-team dependencies were not clearly visible. Jira was not yet used as a single source of truth across the programme and teams (a minimal query sketch follows below).

From an AO perspective, these symptoms pointed to an operating system incapable of supporting continuous delivery, learning, and local decision‑making.
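
A practical counter-move to the "no single source of truth" symptom is one shared, programme-wide query that makes cross-team dependencies visible in one place. The sketch below uses Jira's standard REST search endpoint; the instance URL, credentials, and the label convention in the JQL are placeholders, not the programme's actual setup.

```python
# Hedged sketch: one programme-wide Jira query that surfaces cross-team
# dependencies. Instance URL, credentials, and the JQL/label convention are
# placeholders, not the programme's actual configuration.

import requests

JIRA_BASE = "https://example.atlassian.net"  # hypothetical instance
JQL = 'labels = "cross-team-dependency" AND resolution is EMPTY ORDER BY priority DESC'

def open_dependencies(session: requests.Session) -> list:
    """Fetch unresolved dependency issues via Jira's standard search endpoint."""
    resp = session.get(
        f"{JIRA_BASE}/rest/api/2/search",
        params={"jql": JQL, "fields": "summary,status"},
    )
    resp.raise_for_status()
    return resp.json()["issues"]

with requests.Session() as session:
    session.auth = ("service-account@example.com", "api-token")  # placeholder credentials
    for issue in open_dependencies(session):
        print(issue["key"], "|", issue["fields"]["status"]["name"], "|", issue["fields"]["summary"])
```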

4. AO‑Based Intervention Design

The next step was to design interventions directly mapped to the AO dimensions and scorecard gaps.

AO scorecard‑driven goals

The AO scorecard was used to define explicit quarterly targets:

  • Deliver: move from big-bang/quarterly releases to "deliver every sprint", leveraging existing CI/CD capabilities, and progressively move towards DevOps "on-the-go".
  • Demand: converge onto a functional product backlog, prioritized by business value, with clear PI‑level objectives and roadmaps decomposed into MVPs.
  • Capacity: stabilize PODs, reduce disturbances, and strengthen Scrum Masters in their protective and coaching role, moving towards true pizza teams.
  • Capability: improve story slicing, adopt a DoD at all levels (story, sprint, release, PI; see the sketch after this list), and build more cross-functional, end-to-end PODs.
  • Organization: realign the PI organization around value streams. Introduce an Executive Action Team (EAT) and Scrum‑of‑Scrums rhythm. Clarify the role of architecture and testing as enabling partners.
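
A DoD "at all levels" only works if each level's criteria are explicit and checkable. The following minimal sketch shows one way to encode a multi-level DoD as data with a simple gate function; the four levels mirror the story/sprint/release/PI structure named above, while the example criteria are illustrative assumptions.

```python
# Minimal sketch of a multi-level Definition of Done encoded as explicit,
# checkable policy. The four levels mirror the case study; the example
# criteria are illustrative assumptions.

DOD = {
    "story":   {"acceptance criteria met", "code reviewed", "automated tests green"},
    "sprint":  {"all stories meet story DoD", "increment on a production-like environment"},
    "release": {"regression suite green", "deployed to pre-production"},
    "pi":      {"PI objectives demonstrated", "stakeholder acceptance in review"},
}

def is_done(level: str, satisfied: set) -> bool:
    """Done at a level means every criterion for that level is satisfied."""
    return DOD[level] <= satisfied  # subset check

print(is_done("story", {"acceptance criteria met", "code reviewed"}))  # False: tests missing
```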

Roll-out model and AO backlogs

The AO-method proposed a staged roll-out model, "deliver early and often value that matters", supported by four explicit backlogs.

  1. Vision & Strategy Backlog
    • Clarify IELTS vision and PI objectives from a business perspective.
    • Define a business plan and a technical plan for PI14, including a PI‑level DoD.
  2. Execution Backlog
    • Implement Scrum “by the book” in selected PODs (clear Sprint Goals, ready stories, DoD‑based commitments, protected sprints).
    • Ensure usage of Jira, Confluence and Miro as integrated tools supporting transparency and collaboration.
  3. Architecture Backlog
    • Shift architecture from lead/approval to consult/enabling, providing clear architectural DoD, platforms, simulators, and stable environments (including pre‑production).
    • Reduce reliance on late testing by supporting test‑driven and shift‑left practices in PODs.
  4. Organization Backlog
    • Redesign the PI organization around value streams, with architecture and testing each having a dedicated POD and Product Owner.
    • Establish governance via EAT and structured PI planning that covers the entire flow from demand to production.

To support this, AO recommended engaging senior agile coaches at both programme and team level, and co-creating a transformation backlog owned by Product Owners, Scrum Masters, and programme leadership.

5. AO in Practice: Panda and Incredibles POC (Swarm)

To demonstrate the viability of the AO‑aligned working model, two PODs—Panda and Incredibles—were selected for an experiment.

Experiment design

The AO‑method prescribed:

  • Sprint Planning: stories had to be ready, including a description, acceptance criteria, and the team-defined DoD; non-ready items were pushed back to the Product Owner. The Sprint Goal was defined collaboratively, and developers committed only to clear, negotiable work (see the readiness-check sketch after this list).
  • Daily Scrum: each team member used Jira to show current tasks, plan the day, and highlight blockers. The team did not accept scope changes except after story slicing.
  • Sprint Review: focused on working software, inviting stakeholders to inspect and adapt outcomes; code demos were not considered sufficient.
  • Sprint Retrospective: emphasized concrete changes to the way of working, using the Sprint Review and sprint experience as inputs.
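
As a concrete illustration of the readiness gate referenced above, here is a minimal sketch that separates ready stories from those pushed back to the Product Owner. The Story shape and the example items are assumptions for illustration; the criteria themselves (description, acceptance criteria, agreed DoD) come from the experiment design.

```python
# Sketch of the Sprint Planning readiness gate: stories without a description,
# acceptance criteria, or an agreed DoD go back to the Product Owner. The
# Story shape and example items are assumptions for illustration.

from dataclasses import dataclass, field

@dataclass
class Story:
    key: str
    description: str = ""
    acceptance_criteria: list = field(default_factory=list)
    dod_agreed: bool = False

def is_ready(story: Story) -> bool:
    """Only clear, negotiable, DoD-backed stories may enter the sprint."""
    return bool(story.description) and bool(story.acceptance_criteria) and story.dod_agreed

backlog = [
    Story("IELTS-201", "Candidate uploads ID document", ["PDF or JPEG", "max 5 MB"], True),
    Story("IELTS-202"),  # neither described nor sliced: pushed back
]
sprint = [s.key for s in backlog if is_ready(s)]
pushed_back = [s.key for s in backlog if not is_ready(s)]
print("into sprint:", sprint, "| back to PO:", pushed_back)
```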

Measured impact via AO scorecard

Using the AO scorecard, the experiment showed a significant performance improvement for Panda and Incredibles:

  • In October 2023, they achieved Deliver 8, Demand 7, Capacity 10, Capability 9, and Organization 0 (out of a maximum of 10 per dimension).
  • Compared to the initial May baseline (0–3–1–1–0), this demonstrated substantial improvement, particularly in Deliver, Capacity and Capability.
  • Other teams, still operating in the old model, lagged behind and continued to suffer from organizational and demand-side issues.

Sprint 5 for Panda and Incredibles showed increased productivity once the PODs were undisturbed by scope changes and had a clear sense of purpose. This provided evidence that AO-compatible conditions (end-to-end scope, DoD-based commitment, protected sprints) could significantly improve delivery without changing the underlying technology stack.

6. Constraints and Organizational Response

Despite the encouraging POC results, several AO‑critical changes were not adopted at scale.

PI planning and governance

The PI14 planning event was a missed opportunity:

  • Many AO recommendations (end-to-end PI planning, an explicit PI DoD, organizational updates to reduce dependencies) were not implemented.
  • The event resembled a management meeting rather than a true collective planning session focused on reciprocal commitment.
  • Dependencies and environmental issues raised by PODs remained unresolved, and scope changes continued to be pushed into teams.

From an AO perspective, this indicated that the operating system at leadership and governance level remained largely unchanged, which limited the impact of team-level improvements.

Structural risks and recommendation

Given the lack of systemic adoption, the AO assessment concluded that:

  • Confidence in achieving PI14 goals was below 5/10; delivery would be possible only at the cost of scope cuts and quality compromises.
  • Without organizational commitment to the AO-aligned model, especially for architecture, testing, and PI governance, further steps toward an agile organization would be constrained.

The final recommendation was that, if the client organization did not commit to the AO-based organizational model, the supplier teams should be repositioned as a genuine service provider with clear performance and quality metrics. This would allow measurement and transparency even within a suboptimal system.

7. Lessons Learned for AO‑Method

The IELTS case yields several key lessons for practitioners of the AO‑method:

  1. Quantitative AO scorecards create a shared reality.
    Scoring Deliver, Demand, Capacity, Capability and Organization over time and across teams made systemic issues visible and transformed subjective complaints into a common fact base for leadership and teams.
  2. End‑to‑end PODs are non‑negotiable.
    As long as PODs act as support teams in a technology-driven structure, value flow remains fragile and dependent on central functions. The Panda and Incredibles experiment showed that true end-to-end PODs, empowered with a DoD and clear scope, can quickly improve performance.
  3. PI planning and governance are part of the operating system.
    Without AO‑compatible PI planning (vision, PI DoD, dependency management, reciprocal commitment), team‑level improvements remain local optimizations.
  4. AO‑style experiments are powerful change levers.
    Carefully designed experiments with selected PODs provide evidence that the target model works, reduce perceived risk, and give leadership a concrete reference.
  5. Organizational will is the ultimate constraint.
    AO can diagnose the system, propose a coherent model, and demonstrate its effectiveness locally. Sustainable change, however, requires leadership to adopt the new operating system, not just team-level practices.

