AI agents are moving from answering questions to taking actions, such as completing a checkout, inside browsers, business software and automation platforms. But a new research index finds that the rules governing how those systems operate, and how they are held accountable, are lagging behind their commercial use.
Researchers affiliated with the Massachusetts Institute of Technology (MIT) and the University of Cambridge released the 2025 AI Agent Index. It examines 30 widely used “agentic” AI systems and concludes that most developers disclose little about safety testing, evaluation methods or how their agents behave when interacting with third-party systems. For digital commerce, the findings point to growing risk as agents begin to influence pricing, orders, procurement and customer interactions.
“Tracking these developments is difficult because the AI agent ecosystem is complex, rapidly evolving and inconsistently documented,” the authors wrote in the paper introducing the index. They added that “most developers share little information about safety, evaluations and societal impacts.”
Gaps in checkout via AI agents
The index documents 45 data fields per system, ranging from autonomy and control to ecosystem interaction and safety practices. Yet the researchers said they could not find public information for 200 of the 1,350 fields they reviewed (30 systems times 45 fields), or about 15%. The largest gaps appeared in the areas most relevant to commerce:
- How agents are monitored.
- How risks are evaluated.
- How systems identify themselves when acting online.
Those omissions matter as agents increasingly perform tasks once managed by employees or customers. Several of the systems the researchers tracked in the index are already capable of browsing websites, triggering workflows and acting inside enterprise software with limited human oversight. The list includes consumer-facing tools such as OpenAI ChatGPT Agent and Perplexity Comet. It also includes enterprise platforms like HubSpot Breeze Agents, Microsoft Copilot Studio and ServiceNow AI Agents.
The researchers found that autonomy is increasing faster than governance. Some browser-based agents operate with minimal opportunity for human intervention once a task begins. Meanwhile, enterprise agents are often triggered automatically by events such as incoming emails or database changes.
“For many enterprise agents, it is unclear from publicly available information whether monitoring for individual executions exists,” the authors wrote.
MIT, Cambridge researchers identify lack of disclosure among AI agents
Identity and disclosure emerged as another fault line. According to the index, most agents do not clearly document whether they identify themselves as AI when interacting with websites or external systems.
“Most agents do not disclose their AI nature to end users or third parties by default,” the paper said.
For retailers, distributors and marketplaces, that ambiguity complicates management, fraud detection and basic questions of attribution when something goes wrong.
The index also highlights how concentrated the agent ecosystem has become. Most of the systems studied rely on a small number of underlying foundation models from major providers, increasing the potential for upstream disruptions to ripple through commerce workflows. At the same time, only a handful of agents — including Anthropic Claude Code and Google Gemini CLI — publish agent-specific safety documentation, leaving buyers to rely on general claims about base models.
For digital commerce leaders, the message is not theoretical. Agents are already being positioned to research products, assemble carts, place reorders, resolve service issues and automate back-office workflows. The index suggests those capabilities are arriving without a shared trust framework to define identity, permissions, audit trails and responsibility.
The authors stop short of calling for a halt to deployment. Instead, their findings underline a practical reality.
As agents move closer to transactions, inventory and customer relationships, companies can no longer treat them like smarter chatbots. They are autonomous actors inside the commerce stack, and the index makes clear that governance, transparency and accountability have not yet caught up to that role.