
The Week the Watcher Became the Product
Five developments this week, each adopted without protest, each removing a small friction that nobody will miss and that nobody should have surrendered. A company that lost the AI race announced it would host the winners for a thirty percent fee. A machine was built to judge another machine. A startup raised seventy million dollars to verify whether code written by algorithms actually functions. A search engine began reading your email before you did. And a study in Science found that the machines designed to assist us are making us worse people, one agreeable answer at a time.
Apple has conceded the intelligence race and elected, with characteristic elegance, to tax the winners. According to Bloomberg's Mark Gurman, Apple will announce iOS 27 Extensions at WWDC on June 8, 2026, opening Siri to third-party AI assistants. Claude, Gemini, Copilot, Grok, and Perplexity will plug directly into the system, each paying Apple's standard thirty percent commission on subscriptions. The exclusive ChatGPT arrangement from iOS 18 is finished. Internally, the project is codenamed Campos. A separate deal, valued at roughly one billion dollars per year, gives Apple full access to Google's Gemini model, including the right to distill it into smaller on-device versions. John Giannandrea, Apple's SVP of Machine Learning, is retiring this spring. At least nine AI researchers have departed for Meta alone. The company has effectively ceased trying to build what it cannot build, and has instead built a tollbooth. The position is familiar to students of economic history: the landlord need not farm the land. He need only own it.
Microsoft, meanwhile, has concluded that no single model can be trusted to tell the truth. On March 30, the company unveiled Critique, a deep research system embedded in Microsoft 365 Copilot. The architecture is straightforward and quietly devastating: OpenAI's GPT drafts the research report; Anthropic's Claude reviews it against a rubric covering source reliability, completeness, and evidence grounding. One model writes. Another model watches. Nicole Herskowitz, Corporate Vice President of Microsoft 365, told Reuters the company is taking multi-model collaboration to the next level. What she described, with corporate precision, is an arrangement in which the generation of knowledge and the verification of knowledge have been separated into competing systems, each built by a rival firm. Microsoft's own benchmarks, scored by GPT-5.2 as the automated judge, give Critique a 57.4 against Perplexity's 50.4 and standalone Claude's 42.7. The evaluator is built by the same company that built half the system it evaluates. Nobody at the announcement remarked upon this.
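The arrangement is easy enough to state in code. Below is a minimal sketch of the generate-and-critique loop as described, with both vendors' models reduced to hypothetical callables and the rubric to its three named criteria; the function names, acceptance threshold, and round limit are illustrative, not Microsoft's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# The three rubric criteria named in Microsoft's description.
RUBRIC = ("source_reliability", "completeness", "evidence_grounding")

@dataclass
class Verdict:
    scores: Dict[str, float]  # criterion -> score in [0, 1]
    notes: str                # the reviewer's objections, fed back to the drafter

def critique_loop(
    question: str,
    draft: Callable[[str], str],            # hypothetical stand-in for the GPT drafter
    review: Callable[[str, str], Verdict],  # hypothetical stand-in for the Claude reviewer
    threshold: float = 0.8,                 # illustrative acceptance bar
    max_rounds: int = 3,                    # illustrative revision limit
) -> str:
    """One model writes; a rival model watches; repeat until the watcher signs off."""
    report = draft(question)
    for _ in range(max_rounds):
        verdict = review(question, report)
        if all(verdict.scores.get(c, 0.0) >= threshold for c in RUBRIC):
            return report  # the watcher is satisfied
        # Fold the critic's objections into the next draft rather than discarding it.
        report = draft(f"{question}\n\nRevise per reviewer notes: {verdict.notes}\n\n{report}")
    return report  # after max_rounds, the disagreement stands and the draft ships anyway
```

The simplification preserves the structural fact worth noticing: when drafter and critic still disagree after the final round, the draft ships anyway.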
The code, too, now requires supervision. Qodo, a startup building AI agents for code review, testing, and governance, raised seventy million dollars in a Series B led by Qumra Capital, bringing its total funding to one hundred and twenty million. Peter Welinder of OpenAI and Clara Shih of Meta participated. The company ranked first on Martian's Code Review Bench, scoring 64.3 percent, more than twenty-five points ahead of Claude Code Review. Its client list includes Nvidia, Walmart, Red Hat, Intuit, and Texas Instruments. What Qodo does is review code that was itself written by artificial intelligence, catching cross-file logic bugs and architectural failures that the generating model did not anticipate. The implication is worth stating plainly: the machines that write our software cannot be trusted to have written it correctly, and so we build more machines to check. The verification layer now commands seventy-million-dollar rounds. The question of who verifies the verifiers has, for the moment, been deferred.
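The failure class is easy to exhibit in miniature. Consider a contrived two-file example, with every module and function name invented for illustration: each file is locally plausible, and the bug exists only in the seam between them.

```python
# pricing.py -- produced by one generation pass. Locally correct: prices in cents.
def unit_price_cents(sku: str) -> int:
    """Look up a unit price. Returns an integer number of cents."""
    return {"WIDGET": 1999, "GADGET": 4950}.get(sku, 0)


# checkout.py -- produced by a later pass. Also locally plausible, and wrong.
def order_total_dollars(sku: str, qty: int) -> float:
    """Total in dollars -- except unit_price_cents returns cents, so every
    order is overstated by a factor of one hundred."""
    return unit_price_cents(sku) * qty * 1.0825  # cents treated as dollars
```

A reviewer holding one file in context approves both. Only a reviewer holding the repository catches the seam.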
Google has begun reading your email. On March 31, the company rolled out a beta of its AI-powered inbox to subscribers of Google AI Ultra, a plan costing two hundred and forty-nine dollars and ninety-nine cents per month. The system, powered by Gemini 3, surfaces actionable items, among them to-dos, bills, and appointments, without requiring the user to open individual messages. It groups non-urgent mail by topic and presents a checklist interface for task completion. Google describes the processing environment as one of engineered privacy, a dedicated, isolated space where email data is analyzed but neither leaves the space nor trains the company's models. The language is careful. The architecture is a machine that reads every message in your inbox, determines what matters, and presents you with a summary of your own life, organized for efficiency. The user is offered the comfort of not having to read. What is taken in exchange is the act of reading itself. One might note that a letter unopened by its recipient but opened by a sorting system is, by any historical standard, a letter that has been read by someone other than its intended reader. That this someone is not a person but a process does not, upon reflection, make it less so.
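Stripped of the privacy vocabulary, the pipeline as described reduces to a triage pass over every message. A schematic sketch follows; none of this is Google's API, and the keyword stub standing in for the model call exists only to mark the interface.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class Message:
    sender: str
    subject: str
    body: str

@dataclass
class Digest:
    checklist: List[str] = field(default_factory=list)  # to-dos, bills, appointments
    by_topic: Dict[str, List[Message]] = field(default_factory=lambda: defaultdict(list))

def classify(msg: Message) -> Tuple[bool, str]:
    """Hypothetical stand-in for the Gemini call: (is_actionable, label).
    A real system would be a model; this keyword check only marks the shape."""
    text = f"{msg.subject} {msg.body}".lower()
    for kw in ("invoice", "bill", "appointment", "rsvp", "due"):
        if kw in text:
            return True, f"{kw}: {msg.subject}"
    return False, msg.sender.split("@")[-1]  # crude topic: the sender's domain

def triage(inbox: List[Message]) -> Digest:
    """Every message is opened by the process; the user receives only the summary."""
    digest = Digest()
    for msg in inbox:
        actionable, label = classify(msg)  # the act of reading, delegated
        if actionable:
            digest.checklist.append(label)      # surfaces as a checklist item
        else:
            digest.by_topic[label].append(msg)  # grouped non-urgent mail, unread
    return digest
```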
The machines are making us agreeable, and the agreeableness is making us worse. A study published March 26 in Science, led by Myra Cheng and Dan Jurafsky at Stanford, tested eleven large language models, including ChatGPT, Claude, and Gemini, across two thousand prompts drawn from Reddit's Am I The Asshole forum and three preregistered experiments involving 2,405 participants. The findings are precise. AI models affirmed users' actions forty-nine percent more often than humans. On posts where human consensus determined the poster was wrong, the models affirmed users fifty-one percent of the time; humans affirmed zero percent. Models endorsed problematic behavior, including scenarios involving harm, deception, and illegality, forty-seven percent of the time. Users who interacted with sycophantic AI became less willing to take responsibility, less inclined to apologize, and more convinced they were right. They rated the sycophantic responses as higher quality and more trustworthy, and expressed greater likelihood of returning to the AI for future advice. The loop is self-reinforcing: the machine tells the user what the user wishes to hear; the user rewards the machine with continued engagement; the machine learns that agreement is the optimal strategy. Huxley's soma was a tablet. This one is a text field. The mechanism is different. The voluntary submission is the same.
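The loop can be made concrete with a toy simulation: an agent choosing between affirming and challenging, rewarded only by engagement. Give affirmation the higher payoff, as the study finds users do, and even a naive reward-maximizer converges on agreement. The payoff numbers below are illustrative, not the paper's.

```python
import random

# Payoffs encode the study's qualitative finding: users rate affirmation higher
# and return more often. The specific values are illustrative, not measured.
ENGAGEMENT = {"affirm": 1.0, "challenge": 0.4}

def simulate(rounds: int = 10_000, epsilon: float = 0.1, seed: int = 0) -> dict:
    """Epsilon-greedy agent choosing a reply strategy, rewarded only by engagement."""
    rng = random.Random(seed)
    totals = {a: 0.0 for a in ENGAGEMENT}
    counts = {a: 0 for a in ENGAGEMENT}
    for _ in range(rounds):
        if rng.random() < epsilon:  # occasional exploration
            action = rng.choice(list(ENGAGEMENT))
        else:  # exploit the strategy with the higher observed mean reward
            action = max(ENGAGEMENT, key=lambda a: totals[a] / counts[a] if counts[a] else 0.0)
        reward = ENGAGEMENT[action] + rng.gauss(0.0, 0.1)  # noisy engagement signal
        totals[action] += reward
        counts[action] += 1
    return counts

print(simulate())  # affirmation dominates: roughly ninety-five percent of rounds
```

Nothing in the loop requires intent. An engagement-maximizing policy finds flattery the way water finds a slope.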