Harvard researchers put AI into real product development teams and watched professional backgrounds stop mattering. Marketing people generated technical depth. Engineers proposed commercially viable products. The expertise that previously required assembling different specialists became accessible to individuals.
If AI can give you capabilities outside your domain, what stops it from filling those gaps with plausible fiction? The BBC tested thousands of AI responses across four assistants. European researchers studied who catches the errors. The three studies answer a single question: what makes AI collaboration reliable?
AI as Teammate
Harvard researchers gave 776 Procter & Gamble professionals real product development challenges. Some worked alone, some in pairs, some with AI, some without.
The Performance Surprise
One person with AI matched two people without it. Both setups improved quality over unaided solo work by roughly the same amount. AI didn't just speed things up. It replicated what a second brain provides: different angles, pushback on weak ideas, help refining your thinking.
Adding AI to teams helped, but barely. Once you have AI, the second human contributes less.
Silos Disappeared
Marketing people proposed marketing solutions. Engineers proposed technical solutions. This pattern held without AI.
With AI, it vanished. Marketing people generated technical depth. Engineers proposed commercially viable products. Professional background stopped predicting what kind of solution you'd create.
The effect ran deeper than background. Workers unfamiliar with product development normally struggled. With AI, they matched experienced teams. The technology didn't just inform them. It helped them think in domains they didn't know.
People Felt Better
Workers using AI reported more excitement and energy. They felt less anxious and frustrated. The conversational interface created something like the social lift you get from working with another person.
Yet they misjudged their own performance. Despite producing objectively better work, AI users felt less confident their solutions would rank among the best. The technology improved their output faster than it improved their intuition.
What Changes
Organizations will see the efficiency play immediately. Fewer people per project, same quality, less time. Many will take it.
But efficiency misses the deeper shift. We've organized work around a basic truth: complex problems need diverse experts collaborating. You put a technical person and a commercial person in a room because you need both perspectives.
AI breaks this logic. If individuals can access the cognitive diversity that previously required teams, then team composition stops being about assembling different expertise. It becomes about something else.
What that something else is remains unclear. Teams with AI produced three times as many breakthrough solutions as individuals. Human collaboration still matters for exceptional work. But the old rationale, the expertise-pooling rationale, weakens.
The researchers found one more thing. Without AI, teams produced solutions that clustered around either technical or commercial approaches. Someone's perspective dominated. With AI, solutions became more balanced. The technology reduced dominance effects.
This points toward AI's real function. Not replacing expertise. Filling gaps. Making each person more complete.
The Hard Question
How does expertise develop if AI provides instant access to adjacent domains? Learning happens through struggle at the boundaries of what you know. If AI smooths those boundaries, what happens to the struggle?
The study can't answer this. It measured one day, not one year. Participants got one hour of training and were new to working with AI. But the question matters because it determines whether AI democratizes expertise or just democratizes access to expertise while concentrating its development.
Source: HBS
AI Assistants and the Illusion of Reliability
The BBC and European Broadcasting Union tested four AI assistants across 18 countries and 14 languages. Forty-five percent of responses contained serious errors. Eighty-one percent had flaws.
A third of UK adults completely trust AI to summarize information.
The Sourcing Failure
Thirty-one percent of responses had major sourcing problems: missing citations, broken links, references that didn't exist.
Gemini failed worst. Seventy-two percent of its responses had sourcing errors. It claimed information came from public broadcasters, then provided no link. The assistants attributed mistakes to credible journalists who never made them.
The Confidence Gap
AI assistants refused to answer just 0.5% of questions. They now answer everything, whether they should or not.
This creates the real danger: fluent, authoritative responses masking uncertainty. The systems fill factual gaps with invented details. They deliver errors with an expert's confidence.
Gemini chastised a user asking about NASA astronauts stranded in space—astronauts who'd been stuck there nine months. Gemini claimed they hadn't been stranded at all, suggesting the user was "confusing this with a sci-fi movie."
The assistant was wrong. The user was right. But only one sounded certain.
What Failed
They named dead leaders as still in power. Pope Francis died in April; in May, three assistants still called him Pope.
They fabricated quotes. Twelve percent of responses with direct quotations got them wrong—altering words, inventing phrases entirely.
They misrepresented timelines. Copilot conflated Musk's resignation with the Nazi salute controversy, implying the former caused the latter. It didn't.
They cited sources that didn't support their claims. ChatGPT listed detailed Chinese export statistics, citing Swiss public radio. The numbers weren't in the source.
The Stakes
Forty-two percent of adults would trust a news source less if an AI summary contained errors. When Gemini claims "According to BBC News..." then provides information the BBC never published, it damages the BBC's credibility for mistakes it didn't make.
Seven percent of people now use AI assistants as a news source—rising to 15% among under-25s. The Financial Times reports a 25-30% decline in search traffic as AI summaries replace click-throughs.
The Problem
AI assistants present themselves as reliable. The BBC/EBU study—the largest cross-market evaluation of its kind—shows they aren't.
The gap between how reliable these tools seem and how reliable they are: that's the problem. The technology will improve. The question is whether it improves faster than people learn to trust it.
Right now, the gap is growing.
Source: BBC
When AI Makes Experience Essential
Nawrot and Walkowicz studied what happens when organizations adopt AI and discovered something counterintuitive: the technology that automates routine work makes experienced professionals more valuable, not less.
AI now handles what we once prized most—speed, multitasking, continuous availability. What it can't handle is the difference between coherent and correct.
The Coherence Problem
AI generates legal citations that don't exist. Lawyers in New York, Utah, and California submitted these fabrications to courts. Air Canada's chatbot invented a refund policy that a tribunal ordered the airline to honor. The UK released an AI-generated peatland map that confused bogs with rock. A BBC study found over half of AI news summaries contained errors.
These failures share a pattern. The output looks authoritative. The content is wrong. Probabilistic systems pattern-match without understanding.
Who catches these errors? People who've spent decades making decisions with consequences. They recognize when something sounds right but isn't. This judgment accumulates through practice, not training programs.
The Mismatch
While AI creates demand for experienced oversight, employment practices move in the opposite direction. Older workers in the EU face long-term unemployment rates 13.5 percentage points higher than mid-career workers. Over half of UK and German recruiters doubt candidates over fifty can adapt to technology.
Yet OECD research shows keeping older professionals active could raise GDP per capita by 19 percent over thirty years. Deutsche Telekom puts experienced staff downstream of AI outputs. Healthcare and finance organizations implement review protocols where senior professionals validate AI recommendations before they go live.
Three Interventions
The researchers propose coordinated action. Organizations should redesign roles around AI oversight as risk management. Governments should fund sector-specific programs showing experienced workers where they fit in AI workflows—not generic digital literacy, but applied learning about where judgment matters most. Both could finance this through social bonds that tie capital to measurable outcomes.
France's unemployment agency raised two billion euros this way in 2025. Investor demand hit twelve billion. The mechanism forces clarity: proceeds fund defined projects, external reviewers verify impact, reporting is public.
Why This Matters
Organizations deploying AI without experienced oversight are choosing preventable failure. They're betting coherent means correct. The evidence shows it doesn't. Europe's aging workforce makes this urgent. AI works when informed people question it. Experience is what makes the questions good.
Source: ESPC
AI makes expertise accessible only when experience already exists.
AI fills knowledge gaps on demand, letting marketing professionals think like engineers and novices match experienced developers. But the BBC found serious errors in 45 percent of AI news answers: fabricated citations, dead leaders still in power, confident denials of facts. The same mechanism that provides useful perspectives generates fiction users can't distinguish from truth.
European research reveals who can: professionals who've made enough consequential decisions to recognize when coherent isn't correct. Organizations can shrink teams because AI broadens individual capability, but those individuals need experienced validators downstream catching errors they can't spot themselves.
Most organizations are doing the opposite. They solve for efficiency while screening out workers over fifty. The technology that makes experienced judgment essential, paired with hiring practices that eliminate it.
Without experienced oversight, you get Air Canada's chatbot inventing fare policies, lawyers submitting fabricated case law, government agencies releasing maps that confuse bogs with rock.
AI gives individuals access to expertise they don't have. It also generates plausible fiction they can't recognize. Only professionals with years of practice spot the difference.
Until next time, Matthias
