Something predictable happens when organizations introduce AI tools to a team. A portion of the group resists – hard. Another portion embraces it – completely. Both responses feel decisive. Neither is.
The resistant group holds the line until something changes their mind. Often it’s a demonstration. AI produces something impressive, and it produces it fast. The wall comes down. The same person who was most skeptical last quarter is now the least careful this quarter. The speed of what they witnessed felt like evidence that the tool could be trusted. It wasn’t – not on its own.
The enthusiastic group never built that wall in the first place. They saw possibility – correctly – but skipped the step where possibility gets defined and bounded. If AI can do so much, the question of what it should do, and who remains responsible for checking, gets deferred indefinitely.
Both groups end up in the same place: using powerful tools without having asked the questions that would make them useful.
The questions most teams never ask
Before a team begins working with AI in any serious capacity, someone at the leadership level needs to answer a short list of questions – not once and quietly, but clearly enough that everyone on the team knows the answers.
Who checks the work? AI produces output quickly. That speed creates pressure to move on before anyone has verified that what was produced is actually correct. In most organizations, no one has explicitly assigned this responsibility. Everyone assumes someone else is doing it.
Who is responsible when something goes wrong? This sounds like a legal question. It isn’t – it’s a clarity question. If an AI-assisted decision leads to a bad outcome, the team needs to know in advance whose judgment was supposed to catch it. The absence of that clarity doesn’t eliminate accountability. It just makes it impossible to learn from the failure.
Who decides what AI should and shouldn’t do? This is a strategic question dressed as a technical one. The answer involves values, risk tolerance, and organizational priorities – none of which the tool can determine for you.
These aren’t questions that individual contributors should answer for themselves. They’re leadership work. And in most organizations right now, nobody’s done it.
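One way to keep the answers from staying implicit is to write them down where the team actually works. Here is a minimal sketch of what that might look like – every owner and duty below is a hypothetical placeholder, not a recommendation for any particular org chart. The shape is the point: each question has exactly one named answer.

```python
# A sketch of a role charter written down rather than assumed.
# All owners and duties are hypothetical placeholders.

ROLE_CHARTER = {
    "who checks the work": {
        "owner": "the engineer who pulls the task",
        "duty": "verify AI output before it is declared done",
    },
    "who is responsible when something goes wrong": {
        "owner": "whoever signed off on the output",
        "duty": "own the outcome and bring the failure to the next retro",
    },
    "who decides what AI should and shouldn't do": {
        "owner": "team leadership",
        "duty": "set boundaries from values, risk tolerance, and priorities",
    },
}

# The test of the charter is that this lookup never comes back empty.
for question, answer in ROLE_CHARTER.items():
    assert answer["owner"], f"unowned question: {question}"
```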
The check that used to happen automatically
Work that looks finished and work that is finished are not the same thing. This has always been true. What AI changes is how easy it becomes to confuse the two.
When a person does a piece of work, they trace through it. They notice the places where something seems off. That tracing is imperfect – people miss things – but it creates a natural checkpoint before anything is declared complete.
AI produces output without tracing. The result can look exactly like finished work. It often is. But the verification step that used to happen naturally now needs to happen intentionally, and someone needs to own it. In the absence of a clear role charter, that step gets skipped – not out of negligence, but because no one knew it was theirs to do.
The standard for what counts as done hasn’t changed. What’s changed is that meeting the standard now requires a deliberate decision about who’s responsible for checking.
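Some teams make that decision concrete by encoding it into their definition of done. A minimal sketch of the idea, with hypothetical names throughout – nothing counts as finished until a named person has traced through it:

```python
# A sketch of making the verification step explicit rather than assumed.
# Names are hypothetical; the shape is what matters.

from dataclasses import dataclass


@dataclass
class WorkItem:
    description: str
    ai_assisted: bool
    verified_by: str | None = None  # nobody has checked it yet

    def sign_off(self, reviewer: str) -> None:
        """Record who actually traced through the output."""
        self.verified_by = reviewer

    @property
    def done(self) -> bool:
        # Looking finished is not the bar; having been checked is.
        return self.verified_by is not None


item = WorkItem("draft migration script", ai_assisted=True)
assert not item.done      # output exists, but no one has owned the check
item.sign_off("jordan")   # hypothetical reviewer
assert item.done
```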
The meeting nobody is having
The retrospective exists to surface what actually happened – not the version that looked good in the status update, but the version where the team examines what worked, what didn’t, and what to do differently.
Most teams are holding those conversations right now without the most active participant in the room.
If AI is doing meaningful work, it also has observations worth surfacing. Where things stalled. Where it produced something that needed correction. Where the instructions it was given were unclear. Those observations exist. Right now, they simply aren’t being gathered.
A retrospective that includes AI as a contributor looks different from one that merely reviews what AI produced. The agent can bring a record of where things went sideways since the last retrospective – missteps, corrections, moments of confusion – and that material belongs on the retro board alongside what the humans experienced. The findings that come out can be fed back to the agent: not to overwhelm it, but to put the right insights in the right places so it works better going forward.
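The mechanics of that loop are genuinely small. A minimal sketch, assuming the agent appends its missteps to a log file during the sprint and reads a curated instructions file on every run – the file names and log format are assumptions for illustration, not any particular tool’s convention:

```python
# A sketch of the retro loop: gather the agent's observations for the
# retro board, then write the curated findings back where the agent
# will see them. File names and formats are hypothetical.

import json
from pathlib import Path

AGENT_LOG = Path("agent_observations.jsonl")  # appended to during the sprint
AGENT_NOTES = Path("agent_instructions.md")   # read by the agent each run


def observations_for_retro() -> list[dict]:
    """Collect the agent's record of missteps, corrections, and confusion
    so it lands on the retro board alongside what the humans experienced."""
    if not AGENT_LOG.exists():
        return []
    return [
        json.loads(line)
        for line in AGENT_LOG.read_text().splitlines()
        if line.strip()
    ]


def close_the_loop(findings: list[str]) -> None:
    """Feed the retro's findings back to the agent – a short, curated
    list of insights, not a dump of everything that was said."""
    with AGENT_NOTES.open("a") as notes:
        for finding in findings:
            notes.write(f"- {finding}\n")
```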
This isn’t a technical project. It’s a design question and a leadership question: who is responsible for making sure the retrospective is actually complete? Who closes the loop?
You cannot hold any contributor to a higher standard if they’re never part of the conversation about what to improve. That was true of people long before it was true of AI.