Image
Nate Craddock Headshot

Nate Craddock

Builds engineering teams and weird creative projects, sometimes at the same time.

A 90-day reflection on a new engineering team, the data that proved my instincts wrong, and what AI actually changed.

Ninety days ago, I rebuilt my engineering team from scratch. Four engineers, one shared start date, a clean slate. I had spent weeks on the hiring: more than forty candidates screened against a deliberate read of how each one actually worked with AI, take-home challenges, the whole process. I went in with high expectations and a healthy fear that I had over-promised them to myself.

Thirty days in, the team had already shipped a real release. Not a demo, not a staging branch nobody trusts. Production. And the volume and quality of what they were putting out was higher than any team I have led.

Here is the part I am not proud of. My first reaction was not pride. It was suspicion.

I have been doing this long enough to have a reflex, and the reflex said: output this good, this fast, this early, is output you are borrowing against. Velocity at that level usually means a corner is being cut somewhere you cannot see yet. Speed today, repaid as incidents two quarters from now. And when a team has AI in the loop, the reflex gets louder. The easy story is that the AI is writing checks the codebase will eventually have to cash, and that the bill comes due right around the time everyone has stopped paying attention.

So I watched. Closely. Looking, if I am honest, for the thing that would prove the speed was a mirage.

It never showed up. Instead, the numbers got better the longer I looked.

Across the quarter, merged throughput roughly doubled. That alone would have fed my suspicion. But the rest of the picture is what changed my mind. Review coverage went from 24% to 85%, which is to say the team went from mostly self-merging to actually reviewing each other's work. The revert and hotfix rate more than halved. The share of changes that touched tests tripled. We went from no systematic per-feature documentation to a written trail behind every feature, produced as a byproduct of how the work now flows. The output did not just grow. It grew while getting more rigorous, not less.

My favorite number is the one that looks bad until you understand it. Median time-to-merge went up, from a few minutes to most of a day. Before, a typical change was merged in minutes because no one was reviewing it. After, changes wait because they are being reviewed. Slower in that one metric is the entire point.

I had seen the opposite of this before, and that comparison is what finally landed it for me. At a previous company, standing up real CI/CD and end-to-end testing from a ramshackle starting point took us the better part of a year. This team did the equivalent in under a month. I want to be careful about what that does and does not mean, because the engineers I worked with before were sharp, and this is not a story about them being slower or worse. The difference was not the caliber of the people. The difference is that this time, the team had Claude to pair with, and smart engineers were finally able to move at the speed of their own thinking.

That reframe is the whole lesson, so let me be plain about it. I had read the speed as fragility. The speed was the payoff of structure. The thing I almost slowed down in order to feel in control was the thing my own architecture had made possible.

AI does not make a team fast in a way that is inherently dangerous. It makes a team fast, full stop. Whether that speed compounds or collapses depends entirely on the structure it runs through. Give it specs, comprehension gates, and a team that owns the why, and the velocity is durable. The reason this team can safely change code that AI helped write is that no stage advances until the person responsible can explain it. That is not bureaucracy. That is the thing standing between "fast" and "fast until it isn't."

My own job through all of this was smaller than it sounds and harder than it looks: set the structure, then get out of the way and clear whatever was blocking them. Most days, the highest-leverage thing I did was unblock someone, not out-produce them.

My distrust was the wrong instinct. But interrogating it was the right move, because it forced me to find the actual reason the team was working instead of either congratulating myself or panicking, both of which would have taught me nothing. The lesson I am taking out of these ninety days is not "trust your team," even though I should have. It is more specific. When good work shows up faster than your experience says it should, the question is not "what is about to break." It is "what did we build that made this possible, and how do we protect it." That second question is the entire job now.

And it is where this goes next. We spent a quarter using AI to rebuild how we ship. The natural next move is to turn that same expertise on the product itself, and build the kind of AI-driven capability into what we sell that we just spent ninety days learning how to wield internally. The force multiplier did not stop at the team. It is becoming the product.

I spent thirty days waiting for a team this good to fall apart. It didn't, because falling apart was never in the architecture. I just had to stop distrusting the evidence long enough to read it.