Google’s AI Takes a Dive into Dolphin Communication with ‘DolphinGemma’

Dolphins have long fascinated scientists due to their remarkable intelligence and social behaviors. Known for their ability to cooperate, learn, and even recognize their own reflections, dolphins also possess a complex system of whistles and clicks that hint at a unique form of communication. Now, researchers may be closer than ever to decoding it—with help from Google’s latest AI innovation.

Google has teamed up with the Wild Dolphin Project (WDP), a research group that has studied a community of Atlantic spotted dolphins since 1985. Using non-invasive techniques, WDP collects extensive video and audio data on dolphin behavior. The collaboration aims to apply generative AI to this rich dataset in an effort to better understand dolphin vocalizations and potentially discover whether their communication qualifies as language.

One of WDP’s key research focuses is how specific dolphin sounds correspond to social interactions. Over the years, researchers have identified recurring sound patterns, such as signature whistles that function much like individual names and “squawks” exchanged during conflicts. However, understanding the full structure and nuance of dolphin vocal behavior remains a significant challenge.

To bridge this gap, WDP and Google have developed a new AI model called DolphinGemma, built on Google’s open Gemma family of language models. The tool uses SoundStream, Google’s neural audio tokenizer, to convert dolphin vocalizations into discrete tokens the model can analyze. Much as a text model predicts the next word, DolphinGemma predicts the next sound in a sequence and can generate new, dolphin-like audio of its own.
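
To make that tokenize-then-predict loop concrete, here is a minimal sketch in Python. The tokenizer and model below are toy stand-ins, not DolphinGemma’s or SoundStream’s actual APIs; what mirrors the description above is the structure: audio in, discrete tokens out, then autoregressive prediction over those tokens.

```python
import numpy as np

VOCAB_SIZE = 1024  # assumed codebook size, for illustration only

def tokenize(audio: np.ndarray) -> list[int]:
    """Toy stand-in for a SoundStream-style tokenizer: bucket each
    frame's energy into one of VOCAB_SIZE discrete codes."""
    frames = audio.reshape(-1, 160)            # ~10 ms frames at 16 kHz
    energy = (frames ** 2).mean(axis=1)
    scaled = energy / (energy.max() + 1e-9)
    return [int(t) for t in scaled * (VOCAB_SIZE - 1)]

def predict_next(tokens: list[int]) -> int:
    """Toy stand-in for the Gemma-style model: a real system would run
    a transformer over the context and score every candidate token."""
    rng = np.random.default_rng(seed=sum(tokens) % 2**32)
    return int(rng.integers(VOCAB_SIZE))

# Autoregressive generation: tokenize the audio, then extend the
# sequence one predicted token at a time, feeding each prediction back in.
audio = np.random.randn(16000)                 # one second of stand-in audio
context = tokenize(audio)
for _ in range(5):
    context.append(predict_next(context))
print(context[-5:])                            # the newly generated tokens
```

The design point worth noticing is that once audio is reduced to discrete tokens, the same next-token machinery that powers text language models applies essentially unchanged.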

The model was trained using WDP’s comprehensive audio archives, and early testing shows promise. Researchers hope it can uncover intricate sound patterns and help build a “shared vocabulary” that humans can begin to interpret. Google notes that manually analyzing this volume of acoustic data would take researchers years—AI, by contrast, could do it much faster and more efficiently.

What makes this AI tool especially practical is its compatibility with smartphones. The WDP team already uses Pixel phones in the field, and DolphinGemma was specifically designed to run efficiently on mobile devices. At about 400 million parameters, it is small by modern language-model standards, light enough to run on a phone rather than in a data center.
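
A quick back-of-envelope calculation shows why that parameter count matters on-device. The storage precisions below are assumptions for illustration; Google has not published DolphinGemma’s deployment format.

```python
params = 400_000_000  # approximate DolphinGemma parameter count

# Assumed storage precisions; the actual deployment format is not public.
for name, bytes_per_param in [("int8", 1), ("fp16", 2), ("fp32", 4)]:
    gigabytes = params * bytes_per_param / 1e9
    print(f"{name}: ~{gigabytes:.1f} GB of weights")

# Prints roughly: int8 ~0.4 GB, fp16 ~0.8 GB, fp32 ~1.6 GB
```

Even at 16-bit precision the weights fit comfortably alongside a phone’s other workloads, which would not be true of multi-billion-parameter models.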

In parallel, WDP uses a device known as CHAT (Cetacean Hearing Augmentation Telemetry), developed at Georgia Tech and powered by Pixel phones. CHAT produces synthetic dolphin-like sounds and listens for dolphins to mimic them back. The next generation of the device, built around the Pixel 9, is expected to run deep-learning models and sound pattern matching on the phone itself, in real time.
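
That play-then-listen interaction can be sketched as a simple detection loop. Everything below (the threshold, the fingerprinting method, the helper names) is illustrative, assumed for this sketch rather than taken from CHAT’s actual implementation.

```python
import numpy as np

SAMPLE_RATE = 16_000       # assumed capture rate
MATCH_THRESHOLD = 0.8      # illustrative similarity cutoff, not CHAT's

def embed(clip: np.ndarray) -> np.ndarray:
    """Toy acoustic fingerprint: a unit-normalized magnitude spectrum.
    A real system would use a learned embedding or template matching."""
    spectrum = np.abs(np.fft.rfft(clip))
    return spectrum / (np.linalg.norm(spectrum) + 1e-9)

def similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b))  # cosine similarity of unit vectors

# One round of the play-then-listen protocol, with simulated audio.
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
synthetic_whistle = np.sin(2 * np.pi * 5_000 * t)   # 5 kHz reference tone
reference = embed(synthetic_whistle)

# Pretend a dolphin mimicked the whistle back, plus ambient noise.
recorded = synthetic_whistle + 0.1 * np.random.randn(SAMPLE_RATE)
if similarity(reference, embed(recorded)) >= MATCH_THRESHOLD:
    print("Possible mimic detected: flag this clip for the researchers")
```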

While researchers are not expecting DolphinGemma to enable fluent conversations with dolphins anytime soon, they believe it could lead to limited interactions in the near future—such as basic signals or responses.

Importantly, DolphinGemma will be released as an open-source model this summer, allowing researchers worldwide to use and adapt it. Although it’s currently trained on Atlantic spotted dolphin vocalizations, it may also be fine-tuned to understand the communication of other marine species.

This groundbreaking work opens the door to a deeper understanding of animal intelligence and interspecies communication—one whistle at a time.