Why CROs Are Slow, Part 1: The Kinetics of Busy Systems
Assays are fast; so why does it take so long to get results from a CRO?
Consider a routine enzyme-linked immunosorbent assay (ELISA). While protocols vary, the core workflow consists of microtiter plate preparation, incubations, washes, detection, and analysis. The hands-on time is typically a couple of hours; including long incubations, the entire assay fits comfortably in one working day.
But at most contract research organizations (CROs), the timeline for executing that same assay is quoted in months. The disconnect is so common that it’s become baked into how sponsors interact with CROs: program managers build buffers around it and include those buffers in development timelines. Meanwhile, the CRO teams are working hard, and staff and instruments are fully utilized, in many instances over-utilized.
So why does outsourcing an assay to a specialized organization seem to make it slower, not faster?
We’ve thought a lot about this paradox at Dash. The explanation, I believe, lies in the structure of busy systems. This article is part 1 of 2 on why CROs are slow; here I’ll focus on the CRO operating model.
The Kingman Equation
In 1961, British mathematician John Kingman published a paper, “The Single Server Queue in Heavy Traffic”, describing how waiting time behaves in a busy system.
In simplified form, his result can be expressed as:

$$w \;\approx\; \left(\frac{u}{1-u}\right)\, v^2\, s$$
The interpretation is more important than the algebra: the time you wait (w) is proportional to the service time itself, s, and two additional factors: how much of the system’s capacity is already in use (u, utilization), and how much variation there is in when work arrives and how long it takes to complete (v², variability).
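Here’s a minimal sketch of the approximation as code. The split of v² into arrival-side and service-side coefficients of variation is the standard form of Kingman’s formula; the variable names are mine, not from this post.

```python
def kingman_wait(s, u, ca, cs):
    """Approximate queue wait via Kingman's formula.

    s  -- mean service time (e.g., days of assay work)
    u  -- utilization, the fraction of capacity in use (0 < u < 1)
    ca -- coefficient of variation of inter-arrival times
    cs -- coefficient of variation of service times
    """
    v2 = (ca**2 + cs**2) / 2   # combined variability term
    return (u / (1 - u)) * v2 * s
```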
Imagine waiting for your meal in a busy restaurant. Each server is covering as many tables as possible (utilization is high); customers arrive unpredictably, and some dishes take longer than others (variability is high). As you wait to be seated, then for your meal, then for the check, your frustration builds. But the system isn’t broken. On the contrary: Kingman shows that the system is behaving exactly as busy systems do. The same is true at the grocery store on a Sunday afternoon or at the DMV on a Saturday morning. (That's the RMV for those of us in Massachusetts.)
CROs are no exception. A contract research organization is, like the other examples, a queueing system. Work arrives at irregular intervals, takes variable time to complete, and competes for the same pool of resources.
Let’s consider first what happens when you maximize utilization. Here I’ve plotted waiting time as utilization increases for three values of variability.
On the left-hand side of this graph, when a small fraction of total capacity is in use, wait times remain low irrespective of variability. But as utilization climbs, wait times explode, and they explode faster the more variable the process is. So in response to a growing queue of projects, the CRO behaves exactly as a busy system should: wait times increase with higher utilization.
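To put rough numbers on that explosion, here’s a quick sweep of the formula, with three hypothetical variability levels standing in for the plotted curves (wait expressed as a multiple of service time):

```python
# Kingman's wait as a multiple of service time: w/s = (u / (1 - u)) * v2.
for v2 in (0.25, 1.0, 4.0):  # hypothetical low / moderate / high variability
    sweep = ", ".join(f"u={u:.2f}: w/s={(u / (1 - u)) * v2:6.1f}"
                      for u in (0.50, 0.80, 0.90, 0.95, 0.99))
    print(f"v2={v2}: {sweep}")
```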
This feels counterintuitive: why does simply being efficient and making full use of your resources actually lead to longer wait times? At a high utilization rate, scientists and instruments are busy a high percentage of the time. In order for the utilization to remain at that high level, there must be work waiting to be started once the previous project is completed. If there’s ever an empty queue where no project is waiting when the prior finishes or is paused, the resource goes idle and utilization drops. So as a CRO pushes utilization upward and seeks to maintain that level, it has to carry a larger backlog of projects queued up to fill the next opening. This happens not just at the initial assignment of a project but also throughout the process whenever there is a handoff or pause. It’s these queues of projects that sponsors experience as “waiting.”
Now introduce variability. Even at lower utilization, some assays take longer than planned; projects arrive in bursts; some require rework. When utilization is high, there is no slack capacity to absorb those fluctuations. Small variations in service times or in onboarding projects accumulate. A queue forms; without excess capacity, it persists.
Let’s make this concrete with a specific example. Take a realistic scenario where a CRO runs at 80% utilization and both project arrivals and service times vary with a standard deviation of 50% of their mean. In this case, the math shows that our one-day ELISA incurs a full additional day of waiting, doubling the end-to-end turnaround time. Move to 95% utilization and the waiting time grows to a full workweek. The delays that have become a natural part of CRO work emerge from this interaction between utilization and variability.
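For those who want to check the arithmetic, here is one reading of those numbers: taking the 50% figure as a coefficient of variation of 0.5 for both arrivals and service times gives $v^2 = (0.5^2 + 0.5^2)/2 = 0.25$, and with $s = 1$ day:

$$w_{80\%} = \frac{0.80}{1-0.80}\times 0.25 \times 1\ \text{day} = 1\ \text{day},\qquad w_{95\%} = \frac{0.95}{1-0.95}\times 0.25 \times 1\ \text{day} = 4.75\ \text{days}$$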
The Consequences of Specialization
Some of this variability is unavoidable, but some of it is structural.
The way most CROs are organized feeds this variability: projects typically pass through separate study directors, technicians, analysts, report writers, and QA functions. Splitting a project into components that are each performed by specialists is a natural and logical decision, but it turns out that this specialization exacerbates variability and, in turn, increases wait times.
I’ll spare you the math, but it turns out that if you split a process into n parts, the variability increases roughly by that same factor of n. The reason is that when one resource runs a multi-step process end-to-end, the variability in one step is absorbed by the others: one step might take longer than usual and another less, so the total variability decreases. In contrast, if you split the process across resources, the variability increases.
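For the curious, a one-line version of the math I promised to spare you, under the simplifying assumption of $n$ independent steps with equal mean $\mu$ and standard deviation $\sigma$:

$$\mathrm{CV}^2_{\text{end-to-end}} \;=\; \frac{\operatorname{Var}\!\left(\sum_{i=1}^{n} T_i\right)}{\left(\mathbb{E}\!\left[\sum_{i=1}^{n} T_i\right]\right)^2} \;=\; \frac{n\sigma^2}{(n\mu)^2} \;=\; \frac{\mathrm{CV}^2_{\text{step}}}{n}$$

Pooling the steps into one end-to-end process divides the squared coefficient of variation by $n$; each specialist station, by contrast, is exposed to the full per-step $\mathrm{CV}^2$, so relative to the pooled process the split roughly multiplies variability by $n$.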
We can see how this plays out with a Monte Carlo simulation. Imagine a process that takes 20 days on average, with 4 people who can work on projects, and let’s model two scenarios. Scenario 1 is the “Generalist Model”: all 4 people process a project end-to-end before starting the next. Scenario 2 is the “Specialist Model”: we specialize each person and split each project into 4 unit operations, with each person dedicated to a given unit operation (imagine: technicians, analysts, report writers, QA). With projects arriving randomly every 6 days on average, you get a resource utilization like this when you run 25 projects through:
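If you’d like to reproduce something like this, here’s a minimal discrete-event sketch of the two scenarios. The 20-day total, 4 workers, 6-day average arrival gap, and 25 projects come from the setup above; the exponential step durations and the seed are my own assumptions, since the underlying distributions aren’t specified here.

```python
import random
import statistics

random.seed(0)             # arbitrary seed, for reproducibility

N_PROJECTS = 25
N_WORKERS = 4              # people available in both scenarios
N_STEPS = 4                # unit operations in the specialist model
MEAN_TOTAL = 20.0          # average end-to-end processing days per project
MEAN_GAP = 6.0             # average days between project arrivals

# Poisson arrivals: exponential inter-arrival times.
arrivals, t = [], 0.0
for _ in range(N_PROJECTS):
    t += random.expovariate(1.0 / MEAN_GAP)
    arrivals.append(t)

# Each project is the same bundle of 4 steps in both scenarios,
# so the total processing time per project is identical.
steps = [[random.expovariate(N_STEPS / MEAN_TOTAL) for _ in range(N_STEPS)]
         for _ in range(N_PROJECTS)]

def generalist(arrivals, steps):
    """4 workers; each takes one whole project end-to-end (pooled queue)."""
    free_at = [0.0] * N_WORKERS               # when each worker is next free
    turnaround = []
    for a, s in zip(arrivals, steps):
        w = min(range(N_WORKERS), key=lambda i: free_at[i])
        start = max(a, free_at[w])            # wait if everyone is busy
        free_at[w] = start + sum(s)           # run all steps back-to-back
        turnaround.append(free_at[w] - a)
    return turnaround

def specialist(arrivals, steps):
    """4 single-person stations in series; every project visits each one."""
    station_free = [0.0] * N_STEPS
    turnaround = []
    for a, s in zip(arrivals, steps):
        clock = a
        for k in range(N_STEPS):
            start = max(clock, station_free[k])   # queue at each handoff
            station_free[k] = start + s[k]
            clock = station_free[k]
        turnaround.append(clock - a)
    return turnaround

for name, fn in (("generalist", generalist), ("specialist", specialist)):
    tats = fn(arrivals, steps)
    print(f"{name:10s} mean turnaround {statistics.mean(tats):5.1f} days, "
          f"max {max(tats):5.1f} days")
```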
Notice that the specialist solution is slower. A lot slower. Many of the individual projects start later and take longer to complete, and the full scope of the work finishes an entire month later than in the generalist model. This is despite the fact that the total processing time of each project is identical and the utilization of the resources is the same in both scenarios. And this is just a small example with 25 projects. If we let the simulation run for 50,000 projects, broad statistical trends emerge.
It's not just that the specialist model is slower; the variation in project turnaround times is much higher as well. Some specialist projects take an order of magnitude longer to complete than a typical generalist project.
So what’s going on? The issue is synchronization. In the generalist model, a single operator can take a project from start to finish with no delays in the middle. In the specialist model, a project is handed to the next specialist, who is likely working on another project at the time, so the project is queued. When we split a project into parts, we’re unwittingly multiplying the opportunities for queueing, exactly as Kingman’s formula predicts.
This slowdown due to specialization is something you’re very familiar with if you’ve ever been to the doctor’s office or a hospital. Healthcare is an incredibly specialized field, with nurses, PAs, and countless specialized doctors. The staff is always rushing from patient to patient, and yet it seems like the primary activity for patients is waiting. Specialization is the cause.
This kind of specialization is also extremely common at CROs. So why do CROs split work this way? In some cases, components of the work are genuinely specialized. In others, regulatory guidelines demand it, such as the required independence of quality assurance. But in part, it’s a response to labor constraints. At most CROs today, staffing is constrained, turnover is high, and manual work is grueling. Narrowing roles (technicians pipette, analysts analyze, writers write) is a rational way to stretch limited expertise across more projects. But while it makes staffing easier, it also multiplies handoffs, increases variability, and pushes more work into queues. The system becomes busy, fragmented, and painfully slow.
Engineering Around the Queue
The important conclusion of this analysis, though, is that this isn’t fate. The equations we’ve been using tell us exactly what matters. If we want to speed up turnaround time, we need to reduce service time, keep utilization from getting too high, and drive down variability.
We’ve built our highly automated and integrated operating model at Dash specifically to address these problems. It’s not just that robots and algorithms run assays faster; they remove the variability of manual labor and replace entire chains of operator-to-operator transfers. This model allows us to hire exceptional scientists and equip them with tools that take on the rote, manual work, acting as a force multiplier for their talents while eliminating the waste that plagues the traditional manual approach.
The result is not just that individual steps are quicker, but that work flows end-to-end with far less work-in-progress trapped between stages. In the language of Kingman’s equation, we are directly reducing the terms that make wait times explode.
But operations are only half the story.
In Part 2, I’ll turn to the business model of CROs, the incentives that make delay a rational outcome, and the structural changes required to engineer around it.