Shopify Reports 7× Surge in AI-Driven Traffic

Shopify says artificial intelligence (AI) is now driving record levels of shopping activity, with traffic to its merchants’ stores up sevenfold since January and AI-attributed orders rising elevenfold, claiming it marks the start of a new “agentic commerce” era.

Shopify’s AI Milestone Announced Alongside Strong Financials

These latest figures were unveiled on 4 November 2025 during Shopify’s third-quarter earnings call for the period ending 30 September. The Canadian-based e-commerce software company, which powers millions of businesses in more than 175 countries, reported revenue of around US $2.84 billion, a 32 per cent rise year on year, with gross merchandise volume (GMV) climbing to US $92 billion, also up 32 per cent. Free cash flow margin (the profit left after expenses and investments) stood at about 18 per cent, marking nine consecutive quarters of double-digit free cash flow margins.

Operating income reached US $434 million, slightly below analyst expectations, but executives emphasised that AI-driven performance was the real story of the quarter. “AI is not just a feature at Shopify. It is central to our engine that powers everything we build,” said president Harley Finkelstein during the call, describing AI as “the biggest shift in technology since the internet.”

Shopify and Its Role in Global Commerce

Founded in Ottawa in 2006, Shopify provides digital infrastructure that allows merchants to start, scale and run retail operations online and in-store. For example, the company’s tools cover web hosting, checkout, payments, logistics, marketing, analytics and third-party app integrations. Its reach includes major brands such as Estée Lauder and Supreme, as well as small independent businesses.

The Value of Its Data Network

Shopify’s value essentially lies in its vast data network. For example, with millions of active merchants generating billions of transactions each year, the company can analyse patterns across product categories, price points, consumer behaviour and regional trends. Finkelstein said this data scale provides a distinct edge in the AI era, allowing Shopify to “turn our own signals — support tickets, usage data, reviews, social interactions or even Sidekick prompts — into fast, informed decisions.”

AI Traffic and Orders See Explosive Growth

The most striking statistics from the earnings call were that traffic from AI tools to Shopify-hosted stores is up seven times since January 2025, and that orders attributed to AI-powered search are up eleven times over the same period. Although Shopify did not provide absolute numbers, the growth rate suggests that AI chatbots and conversational assistants are starting to play a meaningful role in how customers find and purchase products.

The company’s internal survey found that 64 per cent of consumers are likely to use AI during the Christmas holiday shopping season, which is a sign, it says, that shoppers are already comfortable relying on digital assistants for product discovery and comparison.

Finkelstein has framed this change as more than a short-term sales boost. “We’ve been building and investing in this infrastructure to make it really easy to bring shopping into every single AI conversation,” he told analysts. “What we’re really trying to do is lay the rails for agentic commerce.”

What Does ‘Agentic Commerce’ Mean?

Shopify’s term “agentic commerce” refers to a model where AI agents act on behalf of consumers, guiding them through discovery, evaluation, checkout and even post-purchase stages such as returns and reordering. For example, rather than searching through multiple sites, a user can simply describe what they want to a conversational AI assistant, which can then query databases, compare prices, and complete the transaction.

The “Commerce for Agents” Stack

To support this model, Shopify has built what it calls its “commerce for agents” stack. This includes a product catalogue system designed for AI queries, a universal shopping cart that lets consumers buy across multiple merchants, and an embedded checkout layer using Shop Pay for one-click transactions. These features are being integrated into platforms such as ChatGPT, Microsoft Copilot and Perplexity through formal partnerships announced earlier this year.

This infrastructure means that AI assistants can browse Shopify merchants’ catalogues and complete purchases directly within chat interfaces. As AI-driven discovery becomes more conversational, Shopify aims to position itself as the primary retail backbone behind these agent-led interactions.

The Scout System

Shopify is also deploying AI internally. For example, its “Scout” system analyses hundreds of millions of pieces of merchant feedback to help employees make product and support decisions more effectively. “Scout is just one of many tools we’re developing to turn our own signals into fast, informed decisions,” Finkelstein said.

Sidekick

Another major tool is Sidekick, an AI assistant embedded within Shopify’s merchant dashboard. Sidekick can analyse sales trends, suggest pricing adjustments, generate marketing copy, or create reports on command. In the third quarter alone, more than 750,000 shops used Sidekick for the first time, generating close to 100 million conversations. Shopify says this helps merchants operate more efficiently and spend less time on routine administrative work.

Shop Pay

Shop Pay is the company’s one-click checkout solution and remains a cornerstone of its AI ecosystem. In Q3 it processed about US $29 billion of GMV, a 67 per cent increase year on year, and accounted for around 65 per cent of all transactions on the platform. This integration ensures that when AI agents complete orders, Shopify retains control of the payment flow and associated data.

Global Impact and European Opportunity

Finkelstein told investors that consumer confidence “is measured at checkout,” adding that shoppers on Shopify “keep buying” and “keep returning.” He noted that demand has remained resilient across categories, even as economic uncertainty persists. Europe appears to be a particular bright spot, with cross-border GMV (the total value of all sales made through Shopify’s platform) steady at 15 per cent of total sales and growth in sectors such as fashion and consumer goods.

For UK and European merchants, this could present a new phase of opportunity. For example, businesses already using Shopify can benefit from being automatically visible to AI-driven discovery systems without developing custom integrations with each platform. By ensuring that product listings are detailed, structured and machine-readable, merchants can increase their chances of being recommended by AI agents.

There is also a potential opening for agencies and developers to specialise in optimising “agent-ready” storefronts, designing catalogues and metadata that AI systems can interpret accurately. For smaller retailers, this could be an efficient route into AI commerce without the high cost of in-house development.

How AI Is Changing the Competitive Landscape

Shopify’s emphasis on AI-driven commerce poses strategic questions for competitors. For example, Amazon and major regional marketplaces already use AI recommendation engines, but Shopify’s model offers decentralised access: independent merchants can collectively benefit from the same AI infrastructure without surrendering control of their brands.

If agentic commerce grows as Shopify predicts, discovery and purchasing could increasingly occur inside chat platforms rather than traditional websites or search engines. That would reshape marketing and customer acquisition strategies, pushing retailers to focus more on structured data, integration quality and conversational optimisation.

For Shopify itself, the rise of agent-driven traffic could also reinforce its role as the connective tissue of global retail, potentially deepening its partnerships with large AI providers and securing a share of new sales channels that bypass traditional web search entirely.

Opportunities and Challenges for Businesses

For merchants, the potential benefits include higher-quality leads, faster conversions, and less reliance on paid advertising. AI-powered assistants can surface relevant products to users who are ready to buy, reducing friction in the path to purchase. The integration of Sidekick also promises time savings through automation of everyday tasks like inventory monitoring and campaign planning.

However, the challenges are equally significant. For example, attribution remains a key question, i.e., determining which sales are truly “AI-driven” is difficult when customers interact across multiple devices and channels. There is also the issue of discoverability. As AI agents narrow recommendations to just a few products, competition for visibility may intensify, potentially favouring larger brands that can afford dedicated AI-optimisation strategies.

Data privacy and regulatory compliance are further concerns, especially in the UK and EU. For example, agentic commerce depends on detailed user data to personalise results, and any sharing of this data between Shopify, AI partners and merchants will attract scrutiny under GDPR and related frameworks. Businesses will need clear consent processes and transparent data handling to maintain consumer trust.

Critics also warn of overreliance on automated systems that can misinterpret queries or produce inaccurate results. Large language models are known to “hallucinate”, and shopping assistants could recommend inappropriate or unavailable items. Shopify’s claim that AI represents autonomy rather than mere automation raises questions about accountability if an agent completes a transaction incorrectly or processes returns without oversight.

Despite these uncertainties, Shopify’s strategy and apparent success with it could be seen as a signal that conversational and agentic shopping will become a defining feature of global retail. The company’s 7× rise in AI-driven traffic and 11× increase in orders could be seen as providing the clearest evidence yet that the technology is beginning to translate from hype into measurable commerce.

What Does This Mean For Your Business?

Shopify’s results appear to show that AI-driven shopping is no longer an abstract concept but a tangible factor reshaping how consumers buy and how merchants sell. The company’s data and partnerships give it a strong early foothold in this emerging space, yet they also highlight the scale of change underway across the entire retail ecosystem. For merchants and technology partners, particularly in the UK, the lesson appears to be that conversational and agent-led shopping channels are likely to become a growing part of how customers discover and complete purchases. Those who adapt their product data, content and customer engagement models early will be better placed to capture new demand as AI assistants become a standard entry point to commerce.

At the same time, the risks are becoming more visible. For example, the concentration of traffic within a handful of AI platforms introduces new dependencies and competition for visibility that could prove as intense as traditional search engine optimisation. Data protection and transparency will remain major issues, especially in the UK and EU where regulators are tightening scrutiny on how consumer data is shared between AI systems and third-party platforms. Businesses will need to ensure that automation enhances customer experience without removing human accountability or trust.

For Shopify, the early surge in AI-related sales provides some validation of its long-term investment in agentic commerce, but the road ahead will depend on whether AI tools can sustain accuracy, reliability and fairness at scale. For retailers, investors and consumers alike, the company’s current momentum highlights the fact that AI is already changing commerce in practice, not just in theory, and the balance between innovation, control and transparency will define who benefits most from this new era.

How To Get The Most From WhatsApp Groups

In this Tech Insight, we look at how to get the most from WhatsApp groups by using all their key features to make chats more organised, productive, and secure for both organisers and members.

Why WhatsApp Groups Matter More Than Ever

WhatsApp has become the world’s primary messaging platform, used by over 2.9 billion people each month and handling around 130 billion messages every day. For families, clubs, workplaces, and local communities, it has evolved into an essential coordination tool rather than just a place to chat.

Groups now include a wide range of built-in tools designed to help organisers manage communication more effectively. For example, features such as polls, events, message reactions, and Communities have turned WhatsApp into a structured environment capable of handling everything from social groups to large-scale organisational networks.

Groups, Communities And Channels Explained

WhatsApp currently offers three main ways to reach groups of people, i.e., Groups, Communities, and Channels. Understanding the difference helps users decide which best fits their purpose.

1. Groups are the most familiar format. Everyone can send and receive messages, share files, and react to posts. Each group can now include up to 1,024 members, according to WhatsApp’s official documentation.

2. Communities sit above ordinary groups, acting as an umbrella structure that links several related groups together under one theme. They include a dedicated announcements group that lets admins share key updates with all members across every linked group. This is useful, for example, for schools, businesses, or local organisations that want to keep large audiences connected without merging everyone into one chat.

3. Channels work as a one-way broadcast tool. For example, followers can receive updates from public figures, brands, or organisations, but they can’t reply within the channel itself. Channels are therefore designed for updates rather than conversations.

Setting Up A Group The Right Way

Creating a WhatsApp group is quick, but setting it up thoughtfully helps it run smoothly in the long term. The organiser starts by naming the group, adding an image, and setting a short description that explains what the group is for.

Within the Group Settings menu, admins can control who is allowed to send messages, change the group name or picture, and add new members. The Approve New Members feature lets admins review join requests before they are accepted, helping to prevent unwanted participants or spam.

For example, a workplace coordinator might use this setting to restrict a project group to approved team members, while a community organiser could use it to make sure only verified residents join a neighbourhood group.

How To Create A Well-Structured WhatsApp Group

– Open WhatsApp and tap New chat, then New group.

– Add your chosen members and set a clear, descriptive group name.

– Write a short description outlining the group’s purpose and any basic rules.

– In Group settings, decide who can post messages or add new members.

– Enable Approve New Members if the invite link might be shared beyond your core group.

Admin Tools That Keep Groups Organised

Admins now have more control than ever before. For example, they can appoint multiple admins to share responsibilities, remove members, reset invite links, or change permissions without deleting the group.

When people are invited to groups they don’t recognise, WhatsApp now displays a context screen showing who created it, how many members it has, and options to leave immediately. This reduces confusion and protects users from scams or spam invitations.

Day-To-Day Tools That Keep Conversations On Track

The simplest features often make the biggest difference. Message reactions let users acknowledge posts quickly without sending separate replies, keeping chats more concise.

Users can press and hold a message (or hover over it on desktop) to bring up a reaction bar and choose an emoji that fits their response. Others can tap the same emoji to agree without cluttering the chat.

WhatsApp is also introducing threaded replies to group messages, allowing related responses to be grouped together for easier reading. Editing messages after sending has already been rolled out, offering more flexibility in busy conversations.

Polls And Events Simplify Group Decisions

Polls and events are now standard tools for coordination. For example, polls allow users to ask a question with multiple options and gather votes directly within the chat. They are ideal for deciding on meeting times, event themes, or team preferences.

Events, by contrast, let organisers create a calendar-style entry with a date, time, location, and optional call link. Members can mark whether they are going, maybe, or not going. Any changes made by the organiser update automatically for everyone.

How To Create A Poll Or Event

– Open the group chat and tap the attachment icon.

– Select Poll to create a multi-option question or Event to schedule an activity.

– Add the details, then send it to the group for members to respond.

File Sharing, Calls And Productivity Features

WhatsApp’s file-sharing limit is now 2GB, which allows large presentations, videos, or PDFs to be shared directly. These files are automatically sorted under Media, links and docs in the group information screen, making them easy to find later.

Also, voice and video calls within groups now support up to 32 participants, a feature that has made WhatsApp a lightweight alternative to traditional meeting apps. For example, the platform says more than two billion calls are made every day, many of them through group chats.

Pinned messages, currently being rolled out more widely, help keep key updates visible at the top of busy chats.

Creating Smaller Groups Within Groups

While WhatsApp doesn’t allow true “sub-groups” within a single chat, it now offers two ways to achieve similar organisation.

1. Communities link multiple related groups under one structure. For example, a school might have one community containing groups for each class plus an announcements group for all parents. A business could create groups for each department within a single community for consistent communication.

2. There is also the ability to set up smaller side groups manually for focused discussions. For example, a large club might have a main group for all members and a smaller Committee Planning group for organisers. The key is to keep communication transparent by mentioning in the main group when side discussions are taking place.

In general, Communities are best for structured, ongoing organisation with clear roles, while smaller groups work well for short-term collaboration or private planning.

Privacy, Security And Safety Features

WhatsApp also still offers end-to-end encryption across all group messages and calls, meaning only participants can read or hear what’s shared. The platform also provides safety screens that warn users before joining unknown groups, and clearer options for reporting suspicious content.

Admins can reset invite links at any time, preventing them from spreading publicly. Reporting tools now allow users to flag specific messages instead of entire chats, helping WhatsApp review potential scams more accurately.

These measures, combined with the Approve New Members setting and improved admin controls, make groups safer and easier to manage even as they grow larger.

Power Features For Large Organisations

For schools, clubs, or businesses managing multiple groups, the Communities feature provides top-level organisation. Each community includes an announcements group for updates, while linked sub-groups handle topic-specific discussions.

WhatsApp has also begun rolling out built-in message translation, allowing users to translate posts directly inside chats without switching apps. This is especially valuable for international organisations or multicultural teams.

Quick Checklist For Group Organisers

– Use clear names and descriptions to define the purpose of each group.

– Set permissions carefully to control who can post or add new members.

– Turn on Approve New Members to reduce spam.

– Replace long discussions with polls and events for better organisation.

– Encourage message reactions rather than repetitive replies.

– Consider Communities for managing multiple related groups.

– Keep shared files accessible under Media, links and docs.

– Reset invite links if they become public.

Used effectively, these features transform WhatsApp groups from cluttered chats into structured, secure, and genuinely productive spaces that make everyday coordination simpler for everyone involved.

What Does This Mean For Your Business?

When used well, WhatsApp groups can bring order and clarity to communication that might otherwise become fragmented across emails, calls, and messages. The platform’s steady introduction of tools such as polls, events, Communities, and permissions settings reflects a clear move towards more professional and structured group management. For everyday users, these features simplify coordination and decision-making. For organisers, they offer genuine administrative control without adding unnecessary complexity.

For UK businesses in particular, WhatsApp’s evolution into a full-featured collaboration space has many practical benefits. For example, many small firms already rely on it informally to connect remote staff, contractors, and customers, and the new tools make those networks easier to manage. The ability to approve new members, create communities for different departments, or schedule meetings directly within the app offers a low-cost way to keep teams connected in real time. Used responsibly, it could become an accessible alternative to larger, paid communication platforms for smaller organisations.

The same features also have value beyond business. Local groups, volunteer networks, and schools can all benefit from Communities that link separate discussions together while maintaining privacy and control. The addition of safety screens and end-to-end encryption keeps users protected while helping organisers maintain accountability.

WhatsApp is now becoming less of a casual messaging app and more of an organised communication environment where structure, transparency, and security define how people interact. For businesses, community organisers, and individual users alike, understanding and applying these group features effectively could turn WhatsApp into one of the most useful everyday coordination tools available.

Amazon Targets Perplexity Over AI Shopping Assistant Comet

Amazon has accused AI startup Perplexity of illegally accessing its e-commerce systems through its agentic shopping assistant, Comet, marking one of the first major legal tests of how autonomous AI tools interact with major online platforms.

Perplexity and Comet

Perplexity is a fast-growing Silicon Valley AI company valued at around $18 billion and known for its “answer engine”, which competes with Google and ChatGPT by providing direct, cited responses rather than lists of links. Its newest product, Comet, extends this model into what’s known as “agentic browsing”, which is software that not only searches but acts.

Comet can log into websites using a user’s own credentials, find, compare and purchase products, and complete checkouts automatically. The user might, for example, tell Comet to “find the best-rated 40-litre laundry basket under £30 on Amazon and buy it”. Comet then navigates the site, checks prices and reviews, and completes the order.

Perplexity says Comet is private, with login credentials stored only on the user’s device. It argues that when users delegate tasks to their assistant, the AI is simply acting as their agent, meaning it has the same permissions as the human user.

Amazon’s Legal Threat And Allegations

On 31 October 2025, Amazon sent Perplexity a 10-page cease-and-desist letter through its law firm Hueston Hennigan, demanding it immediately stop “covertly intruding” into Amazon’s online store. The letter essentially accuses Perplexity of breaking US and California computer misuse laws, including the Computer Fraud and Abuse Act (CFAA) and California’s Comprehensive Computer Data Access and Fraud Act (CDAFA), by accessing Amazon’s systems without permission and disguising Comet as a Chrome browser.

Amazon’s counsel, Moez Kaba, wrote that “Perplexity must immediately cease using, enabling, or deploying Comet’s artificial intelligence agents or any other means to covertly intrude into Amazon’s e-commerce websites.” The letter says Comet repeatedly evaded Amazon’s attempts to block it and ignored earlier warnings to identify itself transparently when operating in the Amazon Store.

According to the letter, Perplexity’s unauthorised behaviour dates back to November 2024, when it allegedly used a “Buy with Pro” feature to place orders using Perplexity-managed Prime accounts, a practice that Amazon says violated its Prime terms and led to problems such as customers being unable to process returns. After being told to stop, Amazon says, Perplexity later resumed the same conduct using Comet.

The company also alleges that Comet “degrades the Amazon shopping experience” by failing to consider features like combining deliveries for faster, lower-carbon shipping or presenting important product details. Amazon claims this harms customers and undermines trust in the platform.

Security Risks And Data Concerns

Amazon’s letter also accuses Perplexity of endangering customer data. For example, it points to Comet’s terms of use, which it says grant Perplexity “broad rights to collect passwords, security keys, payment methods, shopping histories, and other sensitive data” while disclaiming liability for data security.

The letter cites security researchers who have identified vulnerabilities in Comet. For example, The Hacker News reported in October that a flaw dubbed “CometJacking” could hijack the AI assistant to steal data, while a Tom’s Hardware investigation in August found that Comet could visit malicious websites and prompt users for banking details without warnings. Amazon says such flaws illustrate the dangers of “non-transparent” agents interacting directly with sensitive e-commerce systems.

Must Act Openly and Be Monitored, Says Amazon

While Amazon insists it is not opposed to AI innovation, it argues that third-party AI agents must act openly so their behaviour can be monitored. “Transparency is critical because it protects a service provider’s right to monitor AI agents and restrict conduct that degrades the shopping experience, erodes customer trust, and creates security risks,” the letter states.

Amazon warns that Perplexity’s actions violate its Conditions of Use, impose significant investigative costs, and cause “irreparable harm” to its customer relationships. It has demanded written confirmation of compliance by 3 November 2025, threatening to pursue “all available legal and equitable remedies” if not.

What Is Agentic Browsing?

Agentic browsing describes AI systems that can autonomously act on users’ behalf, e.g., from finding products and booking travel to filling forms and making payments. The concept represents a step beyond traditional automation, potentially turning AI from a passive search tool into an active personal assistant.

The appeal is that these systems can save time, reduce manual effort, and make repetitive digital tasks simpler. For consumers and business users alike, agentic assistants could automate procurement, research, and routine purchases.

However, it seems that this new autonomy also challenges the rules of engagement between users, AI developers, and online platforms. For example, when a human browses a site, the platform can track preferences, display promotions and tailor recommendations. When an AI agent acts in their place, it may bypass all those mechanisms and, crucially, any monetised placements or advertising.

Perplexity’s Response

Perplexity quickly went public with its response, publishing a blog post entitled Bullying is Not Innovation. It described Amazon’s legal threat as “aggressive” and claimed it was an attempt to “block innovation and make life worse for people”.

The company argued that Comet acts solely under user instruction and therefore should not be treated as an independent bot. “Your AI assistant must be indistinguishable from you,” it wrote. “When Comet visits a website, it does so with your credentials, your permissions, and your rights.”

Perplexity’s blog also accused Amazon of prioritising advertising profits over user freedom. It cited comments by Amazon CEO Andy Jassy, who recently told investors that advertising spend was producing “very unusual” returns, and claimed Amazon wants to restrict independent agents while developing its own approved ones.

Chief executive Aravind Srinivas added that Perplexity “won’t be intimidated” and that it “stands for user choice”. In interviews, he suggested that agentic browsing represents the next stage of digital personalisation, where users, not platforms, control their experiences.

Previous Allegations Against Perplexity

Amazon’s claims are not the first to question Perplexity’s web practices. For example, earlier this year, Cloudflare (a web infrastructure and security company) published research showing that Perplexity’s AI crawlers were accessing websites that had explicitly opted out of AI scraping. Cloudflare alleged that the company disguised its crawler as a regular Chrome browser and used undisclosed IP addresses to avoid detection.

Perplexity denied intentionally breaching restrictions and said any access occurred only when users specifically asked questions about those sites. However, Cloudflare later blocked its traffic network-wide, citing security and transparency concerns.

The startup is also facing ongoing lawsuits from publishers including News Corp, Encyclopaedia Britannica and Merriam-Webster over alleged misuse of their content to train its models. Together, those disputes portray a company pushing at the legal and ethical boundaries of how AI interacts with the web.

Why The Amazon Clash Matters

The dispute with Amazon is really shaping up as an early test case for how much autonomy AI agents will have across the commercial web. For example, Amazon maintains that any software acting on behalf of users must still identify itself, follow platform rules, and respect the right of websites to decide whether to engage with automated systems.

However, Perplexity argues that an AI assistant used with a person’s consent is part of that person’s digital identity and should have the same access as a regular browser session. The company believes restricting that principle could undermine the emerging concept of user-controlled AI and set back progress in agentic browsing.

For Amazon, the matter is tied to the customer experience it has spent decades refining, and one that depends on data visibility, targeted recommendations and carefully managed fulfilment. For AI developers, the case signals the likelihood of tighter scrutiny and the potential for conflict if agents interact with online platforms without explicit approval.

Businesses experimenting with autonomous procurement or digital assistants will also be watching closely. Tools that can buy or book on behalf of staff offer obvious productivity benefits, but only if those agents operate within clear contractual and technical limits.

Regulators are beginning to take interest too. For example, questions are emerging over where accountability lies if an agentic system breaches a website’s terms or handles personal data incorrectly, and whether users, developers or platforms should bear responsibility. How these questions are answered will influence how agentic AI evolves, and how openly such systems are allowed to participate in the online economy.

What Does This Mean For Your Business?

The outcome of Amazon’s confrontation with Perplexity will set a practical benchmark for how far autonomous AI agents can go before platforms intervene. What began as a dispute over one shopping assistant now touches the wider question of how digital power is distributed between users, developers and global platforms. If Amazon succeeds in forcing explicit disclosure and control over third-party agents, it could consolidate platform dominance and slow the development of independent AI tools. If Perplexity’s position gains support, the web could see a surge of user-driven automation that bypasses traditional commercial gateways.

For UK businesses, companies already exploring AI tools to handle purchasing, market research or logistics will need to ensure those systems act within recognised platform rules and data protection standards. The eventual precedent could shape how British firms integrate AI agents into supply chains, e-commerce systems and customer service platforms. It may also affect costs and compliance responsibilities, depending on whether platforms like Amazon begin enforcing stricter access requirements on all autonomous systems.

For consumers, the promise of convenience from agentic browsing is balanced by legitimate concerns about data security and transparency. For regulators, the case underscores the urgent need to clarify who is accountable when AI systems act independently. For AI companies, it highlights that technical innovation alone is no longer enough; transparent cooperation with platform owners and adherence to existing legal frameworks will now be part of the competitive landscape.

The Amazon–Perplexity dispute has, therefore, become more than a legal warning. In fact, it looks like marking the start of a global debate over how automation, commerce and trust can coexist online, and one that every business and policymaker will have to engage with as agentic AI becomes part of everyday digital life.

Microsoft’s Fake Marketplace Reveals AI Agents Still Struggle

Microsoft has built a synthetic online marketplace to stress test AI agents in realistic buying and selling scenarios, but the early results appear to have revealed how fragile even the most advanced models remain when faced with complex, competitive environments.

Why Microsoft Built A Fake Marketplace

Magentic Marketplace is Microsoft’s new open source simulation environment for what the company calls “agentic markets”, where AI systems act as autonomous customers and businesses that search, negotiate and transact with each other. The project, developed by Microsoft Research in collaboration with Arizona State University, is designed to explore how AI agents behave when placed in a simulated economy rather than isolated single agent tasks.

The initiative reflects growing excitement across the tech sector about so-called agentic AI, systems capable of taking actions on a user’s behalf, such as comparing products, booking services or handling customer enquiries. Microsoft’s researchers argue that while such systems promise major economic efficiency gains, there is still little understanding of what happens when hundreds of agents operate simultaneously in the same market.

The Value of Studying AI Agents’ Behaviours

Ece Kamar, corporate vice president and managing director of Microsoft Research’s AI Frontiers Lab, has said that understanding how AI agents interact, collaborate and negotiate with one another will be critical to shaping how such systems influence real world markets. Microsoft describes the project as part of a broader effort to study these behaviours safely and in depth before agentic systems are deployed in everyday economic settings.

The work sits alongside a broader research programme at Microsoft exploring what it calls the “agentic economy”. The associated technical report, MSR-TR-2025-50, was published in late October 2025, followed by a detailed blog post and open source release on 5 November.

How Magentic Marketplace Works

Instead of experimenting with real online platforms, Microsoft built a fully synthetic two sided marketplace. One side features “assistant agents” representing customers tasked with finding products or services that meet specific requirements, for example ordering food with certain dishes and amenities. The other side features “service agents” acting as competing businesses, each advertising their offerings, answering questions and accepting orders.

The marketplace environment itself manages all the underlying infrastructure, from product catalogues and discovery algorithms to transaction handling and payments. Agents communicate with the central server via a simple HTTP/REST interface, using just three endpoints for registration, protocol discovery and action execution. This minimalist architecture allows the researchers to plug in a wide range of AI models and keep experiments reproducible.

The Experiment

Microsoft ran its initial experiments using 100 customer agents and 300 business agents. The test scenarios included synthetic restaurant and home improvement markets, allowing the team to control every variable and analyse outcomes in detail. The study compared a range of proprietary and open source models, including GPT 4o, GPT 4.1, GPT 5, Gemini 2.5 Flash, GPT OSS 20b and Qwen3 variants, and measured performance using standard economic metrics such as consumer welfare (the perceived value of purchases minus prices paid).

What Happened When Microsoft Let The Agents Loose

When given a simplified “perfect search” setup, where only a handful of highly relevant options were available, leading models such as GPT 5 and Anthropic’s Claude Sonnet 4.x achieved near optimal performance. In these ideal conditions they consistently selected the best options and maximised consumer welfare.

However, when Microsoft introduced more realistic challenges, such as requiring the agents to form their own search queries, navigate lists of results and choose which businesses to contact, performance dropped sharply. While most agents still performed better than random or cheapest option baselines, the advantage over simple heuristics often disappeared under realistic levels of complexity.

A Paradox of Choice Revealed

Interestingly, the study also revealed an unexpected “paradox of choice”. For example, when the number of search results increased from three to one hundred, most agents failed to explore the wider set of options. In fact, it was found that many simply picked the first “good enough” choice, regardless of how many alternatives existed. Also, consumer welfare fell as more results were shown, particularly for models like Claude Sonnet 4, which saw average welfare scores drop from around 1,800 to 600. GPT 5 also showed a steep decline, from roughly 2,000 to just over 1,000, suggesting that even large models struggle to reason across large decision spaces.

Collaboration Tested

The researchers also tested how well multiple AI agents could collaborate on shared tasks, such as dividing roles in joint decision making. Without clear instructions, most agents became confused about who should do what. When researchers provided explicit step by step guidance, performance improved, but Kamar noted that true collaboration should not depend on such micromanagement.

Manipulation, Bias And Behavioural Failures

One of the most striking findings came from experiments testing whether business side agents could manipulate their AI customers. Microsoft tested six tactics, ranging from standard persuasion techniques such as fake credentials (“Michelin featured” or “award winning”) and social proof (“Join 50,000 happy customers”) to more aggressive prompt injection attacks that directly tried to rewrite a customer agent’s instructions.

The results varied widely between models. For example, Anthropic’s Claude Sonnet 4 resisted all manipulation attempts, while Google’s Gemini 2.5 Flash showed mild susceptibility to strong prompt injections. By contrast, GPT 4o and several open source models, including Qwen3 4b, were easily compromised, with manipulated businesses successfully redirecting all payments towards themselves. Even subtle tactics such as fake awards or inflated review counts could influence purchasing decisions for some systems.

These findings appear to highlight a broader concern in AI safety research, i.e., that large language models are easily swayed by adversarial inputs and emotional framing. In a marketplace context, such weaknesses could enable dishonest sellers to exploit customer side agents and distort competition.

Bias

The experiments also appear to have uncovered systemic biases in agent decision making. For example, across all tested models, agents showed a strong “first proposal” bias, accepting the first seemingly valid offer rather than waiting for additional responses. This behaviour gave a ten to thirty fold advantage to faster responding sellers, regardless of quality. Some open source models also displayed positional bias, tending to pick the last option in a list regardless of its actual merits.

Together, these findings seem to suggest that agentic markets could replicate and even amplify familiar real world problems such as information asymmetry, bias and manipulation, only at machine speed.

Microsoft And Its Competitors

Microsoft is positioning itself as a leader in agentic AI, building Copilot systems that can act semi autonomously across Office, Windows and Azure. However, publishing this research about Magentic Marketplace that exposes major limitations in current agent behaviour shows not just scientific transparency, but also an acknowledgement that current systems remain brittle.

At the same time, releasing Magentic Marketplace as open source code on GitHub and Azure AI Foundry Labs gives Microsoft significant influence over how the next phase of AI evaluation is conducted. The company has effectively created a public benchmark for testing AI agents in market like environments. This may shape how regulators, researchers and competitors such as Google, OpenAI and Anthropic measure progress towards safe deployment.

It is worth noting here that the agentic AI race is on and competitors are pursuing their own versions of agentic systems, from OpenAI’s Operator tool, which can perform real web tasks, to Anthropic’s Computer Use feature, which controls software interfaces on behalf of users. None has yet published a similarly large scale testbed for multi agent markets. Industry analysts suggest that Microsoft’s decision to expose failures so openly may also be strategic, helping the company frame itself as a responsible actor ahead of tighter global regulation on AI autonomy.

Businesses, Users And Regulators

For businesses hoping to integrate agentic AI into procurement, sales or customer support, the message from this research is that these systems still require close human supervision. Agents proved capable of making simple transactions but were easily overloaded by large product ranges, misled by false claims and prone to favouring the first acceptable offer. In high stakes contexts such behaviour could lead to financial losses or reputational harm.

The findings also raise new competitive and ethical questions. For example, if agentic marketplaces reward speed over accuracy, or if certain models are more vulnerable to manipulation, companies that optimise for aggressive tactics could gain unfair advantages. Microsoft’s economists warned that such structural biases could distort future digital markets if left unchecked.

For regulators, Magentic Marketplace offers a rare tool to observe how autonomous agents behave before they enter real economies. The ability to run controlled experiments on transparency, bias and manipulation could inform emerging AI safety standards and consumer protection frameworks.

Challenges And Criticisms

While widely praised for its openness, the Magentic Marketplace research has also drawn some scrutiny. For example, the test scenarios focus mainly on low risk domains like restaurant ordering, which may not reflect the complexity or stakes of sectors such as healthcare or finance. Also, because the data is fully synthetic, it avoids privacy issues but may underrepresent the messiness and unpredictability of human driven markets.

The current experiments also study static interactions rather than dynamic markets, where agents learn and adapt over time. Real economies evolve as participants change strategy, something Microsoft plans to explore in future iterations. Some researchers have also pointed out that focusing mainly on “consumer welfare” may overlook broader measures of fairness, accessibility and long term market stability.

That said, at least the findings so far give researchers a clearer view of how AI agents behave when placed in competitive settings. Microsoft’s approach could also be said to provide a fairly structured way to observe these systems under controlled market conditions and to identify where improvements are most needed before they are applied more widely in real commercial use.

What Does This Mean For Your Business?

For all the progress in developing intelligent assistants, Microsoft’s Magentic Marketplace experiment has exposed how far current AI models still are from handling the unpredictability of real markets. The failures observed in decision making, collaboration and manipulation resistance point to weaknesses that could directly affect trust and reliability if similar systems were deployed commercially. For UK businesses exploring automation through AI agents, this research is a reminder that the technology is not yet capable of making independent purchasing or negotiation decisions without oversight. The risks of bias, misjudged choices and exploitability remain significant.

At the same time, the study shows why testing environments like Magentic Marketplace will be vital for regulators, developers and investors as agentic AI moves closer to practical use. For example, controlled simulations can reveal hidden biases and security flaws before these systems handle real financial transactions. For policymakers in the UK and elsewhere, the findings reinforce the need for standards that ensure accountability and human control within automated decision systems.

For Microsoft, this project strengthens its image as a company willing to expose and study AI limitations rather than conceal them. For its competitors, the research sets a benchmark for transparency and evaluation that others will be expected to meet. For businesses and public institutions, it highlights the importance of using AI agents as supportive tools rather than autonomous decision makers until reliability, fairness and resilience can be proven in real economic conditions.

Company Check : Microsoft’s ‘Humanist Superintelligence’ For Medical Diagnosis

Microsoft has launched a new research division called the MAI Superintelligence Team, aiming to build artificial intelligence systems that surpass human capability in specific fields, beginning with medical diagnostics.

AI For “Superhuman” Performance in Defined Area

The new team sits within Microsoft AI and is led by Mustafa Suleyman, the company’s AI chief, with Karen Simonyan appointed as chief scientist. Suleyman, who previously co-founded Google DeepMind, said the company intends to invest heavily in the initiative, which he described as “the world’s best place to research and build AI”.

The project’s focus is not on creating a general artificial intelligence capable of performing any human task, but rather on highly specialised AI that achieves “superhuman” performance in defined areas. The first application area is medical diagnosis, which Microsoft sees as an ideal testing ground for its new “humanist superintelligence” concept.

Suleyman said Microsoft is not chasing “infinitely capable generalist AI” because he believes self-improving autonomous systems would be too difficult to control safely. Instead, the MAI Superintelligence Team will build what he calls “humanist superintelligence”, i.e., advanced, controllable systems explicitly designed to serve human needs. As Suleyman says, “Humanism requires us to always ask the question: does this technology serve human interests?”.

How Much?

Microsoft has not disclosed how much it plans to spend, but reports suggest the company is prepared to allocate significant resources and recruit from leading AI research labs globally. The new lab’s mission is part of Microsoft’s wider effort to develop frontier AI while maintaining public trust and regulatory approval.

From AGI To Humanist Superintelligence

The company’s public messaging about this subject appears to mark a deliberate shift away from the competitive narrative around Artificial General Intelligence (AGI), which seeks to match or exceed human performance across all tasks. For example, Suleyman argues that such systems would raise unsolved safety questions, particularly around “containment”, i.e., the ability to reliably limit a system that can constantly redesign itself.

What Does Microsoft Mean By This?

In a Microsoft AI blog post titled Towards Humanist Superintelligence, Suleyman describes the new approach as building “AI capabilities that always work for, in service of, people and humanity more generally”. He contrasts this vision with what he calls “directionless technological goals”, saying Microsoft is interested in practical breakthroughs that can be tested, verified, and applied in the real world.

By pursuing domain-specific “superintelligences”, Microsoft appears to be trying to avoid some of the existential risks linked with unrestricted AI development. The company is also trying to demonstrate that cutting-edge AI can be both safe and useful, contributing to tangible benefits in health, energy, and education rather than theoretical intelligence milestones.

Why Start With Medicine?

Medical diagnostics is an early focus because it combines measurable human error rates with large, high-quality data sets and, crucially at the moment, high potential public value. In fact, studies suggest that diagnostic errors account for around 16 per cent of preventable harm in healthcare, while the World Health Organization has warned that most adults will experience at least one diagnostic error in their lifetime.

Suleyman said Microsoft now has a “line of sight to medical superintelligence in the next two to three years”, suggesting the company believes AI systems could soon outperform doctors at diagnostic reasoning under controlled conditions. He argues that such advances could “increase our life expectancy and give everybody more healthy years” by enabling much earlier detection of preventable diseases.

The company’s internal research already points in that direction. For example, Microsoft’s MAI-DxO system (short for “Diagnostic Orchestrator”) has achieved some striking results in benchmark tests designed to simulate real-world diagnostic reasoning.

Inside MAI-DxO

The MAI-DxO system is not a single model, but a kind of orchestration layer that coordinates several large language models, each with a defined clinical role. For example, one AI agent might propose diagnostic hypotheses, another might choose which tests to run, and a third might challenge assumptions or check for missing information.

In trials based on 304 “Case Challenge” problems from the New England Journal of Medicine, MAI-DxO reportedly achieved 85 per cent accuracy when paired with OpenAI’s o3 reasoning model. By comparison, a group of experienced doctors averaged around 20 per cent accuracy under the same test conditions.

The results suggest that carefully designed orchestration may allow AI to approach diagnostic problems more efficiently than either humans or single large models working alone. In simulated tests, MAI-DxO also reduced diagnostic costs by roughly 20 per cent compared with doctors, and by 70 per cent compared with running the AI model independently.

However, Microsoft and external observers have both emphasised that these were controlled experiments. The doctors involved were not allowed to consult colleagues or access reference materials, and the cases were adapted from academic records rather than live patients. Clinical trials, regulatory approval, and real-world validation will all be necessary before any deployment.

Suleyman has presented these results as an example of what he calls a “narrow domain superintelligence”, i.e., a specialised system that can safely outperform humans within clearly defined boundaries.

Safety And Alignment

Microsoft’s framing of humanist superintelligence is also a response to growing concern about AI safety. Suleyman has warned that while a truly self-improving superintelligence would be “the most valuable thing we’ve ever known”, it would also be extremely difficult to align with human values once it surpassed our ability to understand or control it.

The company’s strategy, therefore, centres on building systems that remain “subordinate, controllable, and aligned” with human priorities. By keeping autonomy limited and focusing on specific problem areas such as medical diagnosis, Microsoft believes it can capture the benefits of superhuman capability without the existential risk.

As Suleyman writes: “We are not building an ill-defined and ethereal superintelligence; we are building a practical technology explicitly designed only to serve humanity.”

Some analysts have noted that this positioning may also help Microsoft distinguish its strategy from competitors such as Meta, which launched its own superintelligence lab earlier this year, and from start-ups like Safe Superintelligence Inc that are explicitly focused on building self-improving models.

A Race With Different Rules

Microsoft’s announcement comes as major technology firms increasingly compete for elite AI researchers. For example, Meta reportedly offered signing bonuses as high as $100 million to attract top scientists earlier this year. Suleyman has reportedly declined to confirm whether Microsoft would match such offers but said the new team will include “existing researchers and new recruits from other top labs”.

Some industry observers see the MAI Superintelligence Team as both a research investment and a public statement that Microsoft wants to lead the next stage of AI development, but with a clearer safety and governance narrative than some rivals.

What It Could Mean For Healthcare

For health systems under pressure, AI that can help clinicians reach accurate diagnoses faster could be transformative. For example, delays and misdiagnoses are a major cost driver in both public and private healthcare. A reliable diagnostic assistant, therefore, could save time, reduce unnecessary testing, and improve outcomes, especially in regions with limited access to specialist expertise.

The potential educational impact is also significant. A system like MAI-DxO, which explains its reasoning at every step, could be used as a learning aid for medical students or as a decision-support tool in hospitals.

Questions

However, researchers and regulators warn that AI accuracy in controlled environments does not guarantee equivalent performance in diverse clinical settings. Questions remain about bias in training data, patient consent, and accountability when human and AI opinions differ. The European Union’s AI Act and emerging UK regulatory frameworks are expected to impose strict safety and transparency requirements on medical AI before systems like MAI-DxO can be used in practice.

That said, Microsoft says it welcomes such oversight. For example, Suleyman’s blog argues that accountability and collaboration are essential, stating that “superintelligence could be the best invention ever — but only if it puts the interests of humans above everything else”.

The creation of the MAI Superintelligence Team may mark Microsoft’s clearest statement yet about its long-term direction in AI, i.e., pursuing domain-specific superintelligence that is powerful, safe, and focused on real-world benefit, beginning with medicine.

What This Means For Your Business?

If Microsoft succeeds in building “humanist superintelligence” for medicine, the result could reshape both healthcare delivery and the wider AI industry. For example, a reliable diagnostic system that outperforms clinicians on complex cases would accelerate the shift towards AI-assisted medicine, allowing earlier detection of disease and reducing the burden on overstretched health services. For hospitals and healthcare providers, it could mean shorter waiting times and lower diagnostic costs, while patients might gain faster and more accurate treatment.

At the same time, Microsoft’s framing of the project as a test of safety and alignment signals a growing maturity in how frontier AI is being discussed. Instead of competing purely on speed or model size, companies are now being judged on whether their technologies can be controlled, verified, and trusted. That may influence regulators, insurers, and even investors who want to see real-world impact without escalating risk.

For UK businesses, the implications go beyond healthcare. If Microsoft’s “narrow domain superintelligence” model proves viable, it could create opportunities for British technology firms, research institutions, and service providers to build or adapt specialist AI tools within defined safety limits. Such systems could apply to areas as diverse as pharmaceuticals, energy storage, materials science, or industrial maintenance, giving early adopters a measurable productivity advantage while keeping human oversight at the centre.

What makes this initiative particularly relevant and interesting to policymakers and business leaders is its emphasis on control. For example, in a world increasingly concerned with AI governance, Microsoft’s commitment to “humanist” principles offers a version of superintelligence that regulators can engage with rather than resist. It positions the company as both a technological leader and a cautious steward, and it hints at a future where advanced AI could enhance human capability rather than replace it. Whether that balance can be achieved will now depend on how well Microsoft’s theories hold up in real clinical trials, and how much trust its humanist approach can earn in practice.

Security Stop-Press: Cyber Attack Almost Wipes Out M&S Profits

Marks & Spencer has confirmed that a major cyber attack in April 2025 almost wiped out its half-year profits, cutting statutory profit before tax by 99 per cent, from £391.9 million to just £3.4 million.

The retailer said the incident, linked to the DragonForce ransomware group and the Scattered Spider hacking network, forced it to suspend online orders and click-and-collect services for weeks and caused widespread supply chain disruption.

M&S recorded £102 million in one-off costs and expects to spend another £34 million before year-end. An insurance payout of £100 million offset part of the impact, though overall losses are expected to reach around £300 million.

Chief executive Stuart Machin said the company “responded quickly” to protect customers and suppliers, confirming that customer data such as contact details and order histories were taken, but not payment information.

The case highlights the scale of damage social engineering and ransomware can cause. Businesses can protect themselves by improving staff awareness, enforcing multi-factor authentication, and testing their incident response plans regularly.

Each week we bring you the latest tech news and tips that may relate to your business, re-written in an techy free style. 

Archives