Unveiling the Secrets of Google’s Gemini: A Journey into AI Security

Introduction

In the ever-evolving landscape of artificial intelligence, security remains a paramount concern. As tech giants race to develop the most advanced AI models, the question of safety becomes increasingly critical. Join us as we embark on a captivating journey with a team of ethical hackers who dared to challenge Google’s Gemini AI at the LLM bugSWAT event in Las Vegas.

The Wild West of Generative AI

The realm of Generative Artificial Intelligence (GenAI) and Large Language Models (LLMs) is akin to the Wild West of technology. With companies like Meta, Microsoft, and Google vying for dominance, new players such as Anthropic and DeepSeek are shaking up the industry. Amidst this fierce competition, the question arises: are these developments truly secure?

The LLM bugSWAT Challenge

In 2024, our team, consisting of Roni “Lupin” Carta, Joseph “rez0” Thacker, and Justin “Rhynorater” Gardner, returned to Las Vegas to participate in Google’s LLM bugSWAT event. Our mission was to uncover vulnerabilities in Gemini, Google’s AI model, and our efforts were rewarded with the prestigious Most Valuable Hacker (MVH) award.

Exploring Gemini’s New Features

Google granted us early access to the latest Gemini update, complete with detailed documentation. Our task was to scrutinize these features from an attacker’s perspective. It all began with a simple prompt: “run hello world in python3”. This led us to discover a curious “Run in Sandbox” button, sparking our investigation.

The Python Sandbox: A Safe Haven?

Gemini’s Python Sandbox Interpreter, built on Google’s gVisor, is designed as a secure environment for running Python code. However, our experience as security researchers has taught us that even the most fortified sandboxes can harbor vulnerabilities.

Understanding gVisor

Google’s gVisor acts as a mediator between containerized applications and the host operating system, intercepting system calls to create strict security boundaries. This innovative approach enhances container security, making gVisor a crucial tool for safely running and managing containerized workloads.
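
To make that concrete, here is a minimal sketch of how a workload is typically run under gVisor outside of Google: Docker is pointed at gVisor’s runsc runtime so that the container’s system calls are intercepted by gVisor instead of reaching the host kernel directly. The image, the command, and the assumption that runsc is already registered with the Docker daemon are ours; this is not how Gemini’s sandbox is wired internally.

```python
# Minimal sketch: running untrusted Python under gVisor via Docker's runsc runtime.
# Assumes Docker is installed and runsc is already registered with the daemon
# (e.g. in /etc/docker/daemon.json); it does not reflect Gemini's internal setup.
import subprocess

untrusted_code = 'print("hello from inside gVisor")'

subprocess.run(
    [
        "docker", "run", "--rm",
        "--runtime=runsc",            # route the container's syscalls through gVisor
        "python:3.12-slim",           # any Python image works here
        "python3", "-c", untrusted_code,
    ],
    check=True,
)
```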

Uncovering Vulnerabilities

Our exploration revealed that while escaping the sandbox was a daunting task, leaking data from inside it was possible. Using Python’s os library, we examined the file system and uncovered a binary file at /usr/bin/entry/entry_point.
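
The snippet below is a minimal sketch of that kind of standard-library reconnaissance from inside the sandbox. The entry_point path comes from our exploration; the other directories and the exact calls are illustrative rather than the precise commands we ran.

```python
# Minimal sketch: poking at the sandbox file system with only the standard library.
import os

# List a few directories to get a feel for the environment.
for root in ("/", "/usr/bin", "/usr/bin/entry"):
    try:
        print(root, "->", os.listdir(root))
    except OSError as exc:            # some paths may be unreadable or missing
        print(root, "->", exc)

# Inspect the binary found at /usr/bin/entry/entry_point.
target = "/usr/bin/entry/entry_point"
if os.path.exists(target):
    info = os.stat(target)
    print(f"{target}: {info.st_size} bytes, mode {oct(info.st_mode)}")
```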

Extracting the Entry Point

To extract this 579 MB file, we employed a methodical approach, reading and encoding it in chunks using base64. This allowed us to reconstruct the file locally for further analysis.
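
The sketch below illustrates the chunked read-and-encode idea. The chunk size and output handling are assumptions rather than our exact script; the point is that base64 output printed inside the sandbox can be captured in order and decoded locally to rebuild the binary.

```python
# Minimal sketch: read a large binary in chunks and print base64 so the output
# can be captured outside the sandbox and reassembled into the original file.
import base64

CHUNK_SIZE = 3 * 1024 * 1024   # a multiple of 3 keeps encoded chunks concatenable

with open("/usr/bin/entry/entry_point", "rb") as f:
    index = 0
    while True:
        chunk = f.read(CHUNK_SIZE)
        if not chunk:
            break
        # Each line is captured from the model's output and appended, in order,
        # to a local file; concatenating and decoding the chunks rebuilds the binary.
        print(index, base64.b64encode(chunk).decode("ascii"))
        index += 1
```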

Delving into the Binary

Upon examining the binary, we discovered references to google3, Google’s internal repository, hinting at traces of proprietary software. Using tools like Binwalk, we extracted the file structure, revealing a wealth of internal code.
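
As a simple illustration of how such traces surface, the sketch below scans the recovered binary for printable ASCII strings and flags the ones mentioning google3, much like the classic strings utility would. It is a stand-in for the actual analysis, which also relied on Binwalk and manual review.

```python
# Minimal sketch: scan the recovered binary for printable strings mentioning
# google3 (Google's internal repository), similar to `strings` piped to `grep`.
import re

MIN_LEN = 6                                   # ignore very short string fragments
printable = re.compile(rb"[\x20-\x7e]{%d,}" % MIN_LEN)

with open("entry_point", "rb") as f:          # the file reconstructed locally
    data = f.read()

for match in printable.finditer(data):
    text = match.group().decode("ascii")
    if "google3" in text:
        print(text)
```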

The ReAct Framework

Our investigation led us to the ReAct framework, an approach in which a language model interleaves reasoning traces with concrete actions. This interplay improves accuracy and makes the model’s decision-making more transparent.
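
As a rough illustration (not Gemini’s implementation), a ReAct-style loop can be sketched as follows: the model emits a thought, optionally an action that is executed by a tool, and the tool’s observation is fed back in until the model commits to an answer. The canned model and the single toy tool here are purely illustrative.

```python
# Highly simplified, hypothetical ReAct-style loop: the model interleaves
# reasoning ("Thought"), tool use ("Action"), and tool results ("Observation")
# until it commits to an answer. Not Gemini's internals.

def canned_model(transcript: str) -> str:
    """Stand-in for an LLM call; emits a fixed Thought/Action/Answer sequence."""
    if "Observation:" in transcript:
        return "Answer: 4"
    if "Thought:" in transcript:
        return "Action: python_sandbox print(2 + 2)"
    return "Thought: I should compute this in the sandbox."

TOOLS = {
    # A toy "sandbox" tool; a real agent would dispatch to an actual interpreter.
    "python_sandbox": lambda code: "4",
}

def react_loop(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = canned_model(transcript)
        transcript += step + "\n"
        if step.startswith("Answer:"):
            return transcript
        if step.startswith("Action:"):
            tool, _, arg = step[len("Action:"):].strip().partition(" ")
            observation = TOOLS.get(tool, lambda a: "unknown tool")(arg)
            transcript += f"Observation: {observation}\n"
    return transcript

print(react_loop("What is 2 + 2?"))
```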

The Potential for Exploitation

Our findings suggested that Gemini’s planning phase could be exploited to reach a more privileged sandbox. With the assistance of Google’s security team, we tested this theory and uncovered a potential vulnerability.

Conclusion

Our journey into Gemini’s security landscape was both challenging and rewarding. It underscored the importance of rigorous testing before deploying advanced AI systems. As we continue to push the boundaries of AI security, one thing is clear: thorough testing is not just recommended—it’s essential.

Final Thoughts

The thrill of uncovering vulnerabilities and expanding Gemini’s sandbox capabilities was an exhilarating experience. As AI continues to evolve, so too must our efforts to ensure its safety and reliability. Stay curious, stay vigilant, and keep exploring the fascinating world of AI security.