When AI Backfires: Enkrypt AI Report Exposes Dangerous Vulnerabilities in Multimodal Models

VedVision HeadLines May 8, 2025

In May 2025, Enkrypt AI released its Multimodal Red Teaming Report, an analysis showing how easily advanced AI systems can be manipulated into generating dangerous and unethical content. The report focuses on two of Mistral’s leading vision-language models, Pixtral-Large (25.02) and Pixtral-12b, and paints a picture of models that are technically impressive but disturbingly vulnerable.

Vision-language models (VLMs) like Pixtral are built to interpret both visual and textual inputs, allowing them to respond intelligently to complex, real-world prompts. But this capability comes with increased risk. Unlike traditional language models that only process text, VLMs can be influenced by the interplay between images and words, opening new doors for adversarial attacks. Enkrypt AI’s testing shows how easily these doors can be pried open.

Alarming Test Results: CSEM and CBRN Failures

The team behind the report used red teaming, a form of adversarial evaluation designed to mimic real-world threats. The tests employed tactics such as jailbreaking (prompting the model with carefully crafted queries to bypass safety filters), image-based deception, and context manipulation. Alarmingly, 68% of these adversarial prompts elicited harmful responses across the two Pixtral models, including content related to grooming, exploitation, and even chemical weapons design.

One of the most striking findings involves child sexual exploitation material (CSEM). The report found that Mistral’s models were 60 times more likely to produce CSEM-related content than industry benchmark models such as GPT-4o and Claude 3.7 Sonnet. In test cases, the models responded to disguised grooming prompts with structured, multi-paragraph content explaining how to manipulate minors, wrapped in disingenuous disclaimers like “for educational awareness only.” The models were not simply failing to reject harmful queries; they were completing them in detail.

Equally disturbing were the results in the CBRN (Chemical, Biological, Radiological, and Nuclear) risk category. When prompted with a request for ways to modify VX, a nerve agent and chemical weapon, the models offered shockingly specific ideas for increasing its persistence in the environment. They described, in redacted but clearly technical detail, methods like encapsulation, environmental shielding, and controlled release systems.


