New generative media models and tools, built with and for creators

We’re introducing Veo, our most capable model for generating high-definition video, and Imagen 3, our highest quality text-to-image model. We’re also sharing new demo recordings created with our Music AI Sandbox.

Over the past year, we’ve made incredible progress in enhancing the quality of our generative media technologies. We’ve been working closely with the creative community to explore how generative AI can best support the creative process, and to make sure our AI tools are as useful as possible at each stage.

Today, we’re introducing Veo, our latest and most advanced video generation model, and Imagen 3, our highest quality text-to-image model yet.

We’re also sharing some of our recent collaborations with filmmaker Donald Glover and his creative studio, Gilga, and new demo recordings being released by artists Wyclef Jean and Marc Rebillet and songwriter Justin Tranter, made with help from our Music AI Sandbox.

Veo: our most capable video generation model

Veo generates high-quality 1080p videos that can run beyond a minute, in a wide range of cinematic and visual styles. With an advanced understanding of natural language and visual semantics, it generates video that closely represents a user’s creative vision, accurately capturing a prompt’s tone and rendering details in longer prompts.

The model provides an unprecedented level of creative control, and understands cinematic terms like “timelapse” or “aerial shots of a landscape”. Veo creates footage that’s consistent and coherent, so people, animals and objects move realistically throughout shots.

Examples of Veo’s high-quality video generation capabilities. All videos were generated by Veo and have not been modified.

To discover how Veo can best support the storyteller’s creative process, we’re inviting a range of filmmakers and creators to experiment with the model. These collaborations also help us improve the way we design, build and deploy our technologies to make sure creators have a voice in how they’re developed.

Here's a preview of our work with filmmaker Donald Glover and his creative studio, Gilga, who experimented with Veo for a film project.

https://www.youtube-nocookie.com/embed/dKAVFLB75xs?enablejsapi=1&origin=https%3A%2F%2Fblog.google&widgetid=1

Veo builds upon years of our generative video model work, including Generative Query Network (GQN), DVD-GAN, Imagen-Video, Phenaki, WALT, VideoPoet and Lumiere, combining architecture, scaling laws and other novel techniques to improve quality and output resolution.

With Veo, we’ve improved techniques for how the model learns to understand what's in a video, renders high-definition images, simulates the physics of our world and more. These learnings will fuel advances across our AI research and enable us to build even more useful products that help people interact and communicate in new ways.

Starting today, Veo is available to select creators in private preview in VideoFX by joining our waitlist. In the future, we’ll also bring some of Veo’s capabilities to YouTube Shorts and other products.

Learn more about Veo’s capabilities.

Imagen 3: our highest quality text-to-image model

Over the last year, we’ve made incredible progress improving the quality and fidelity of our image generation models and tools.

Imagen 3 is our highest quality text-to-image model. It generates an incredible level of detail, producing photorealistic, lifelike images, with far fewer distracting visual artifacts than our prior models.

A close-up portrait of a gray wolf with intense yellow eyes. The wolf has a thick, gray and brown fur coat and a black nose. It is looking directly at the viewer with a calm but alert expression. The background is a blurred blue and gray sky. Prompt: A close up of a sleek wolf perched regally in front of gray background, in a high-resolution photograph with detailed fine details, isolated on a plain stock photo with color grading in the style of a hyper-realistic style.

A large jellyfish with long, flowing tentacles drifts through the ocean. The jellyfish has a round, translucent bell with brown stripes and a cluster of frilly oral arms underneath. It is surrounded by blue water and a coral reef is visible in the background. Prompt: Close-up of a jellyfish pulsating through crystal-clear water, tentacles trailing, vibrant coral reef background, macro photography, stock photo, high resolution, very detailed, soft lighting, professional color grading, shallow depth of field, sharp focus, taken with a DSLR camera in the style of professional photographers.

A wide river winds through a deep gorge carved into a lush, green mountain range under a clear blue sky. The river is calm and reflects the surrounding landscape. The sun shines brightly, casting shadows on the slopes and highlighting the textures of the rocks. Prompt: View from above of beautiful river canyon with trees, showcasing its stunning natural beauty with green mountains and blue waters. The photo captures the vastness of nature's creation in the style of its creation.

Three hot air balloons float in the sky above a rugged landscape of rock formations. The balloons are colorful and have a basket hanging below them. The sun is shining and the sky is blue. Prompt: Shot in the style of DSLR camera with the polarizing filter. A photo of two hot air balloons floating over the unique rock formations in Cappadocia, Turkey. The colors and patterns on these balloons contrast beautifully against the earthy tones of the landscape below. This shot captures the sense of adventure that comes with enjoying such an experience.

A curious squirrel peeks out from a muddy hiking boot, set against a blurred background of mountains. Prompt: A pair of well-worn hiking boots, caked in mud and resting on a rocky trail. The head of a squirrel is poking out of one of the boots, and it looks lazily at the camera, a little king of its shoe. The laces of both boots fall loosely to the ground. There's a mountainous landscape in the background. Cinematic movie still, high quality DSLR photo.

Three young women are standing in a circle and happily laughing. Behind them, the sun is setting creating a lens flare and imbuing the image with a warm glow. Prompt: Three women stand together laughing, with one woman slightly out of focus in the foreground. The sun is setting behind the women, creating a lens flare and a warm glow that highlights their hair and creates a bokeh effect in the background. The photography style is candid and captures a genuine moment of connection and happiness between friends. The warm light of golden hour lends a nostalgic and intimate feel to the image.

Imagen 3 better understands natural language and the intent behind your prompt, and it incorporates small details from longer prompts. The model’s advanced understanding helps it master a range of styles.

  • A photo of a black man with short hair and beard smiling. In background there are blurry trees and buildings. Prompt: A photo of a man with short hair and beard smiling at the camera. The background is blurry and it shows trees and buildings in light colors.
  • A person’s hand as they hold a small clay figurine of a bird in one hand and sculpt it with a modeling tool in the other. Their hands are covered in clay dust. The sculptor is wearing a gray fleece jacket and a brown and burgundy scarf. Prompt: A view of a person's hand as they hold a little clay figurine of a bird in their hand and sculpt it with a modeling tool in their other hand. You can see the sculptor's scarf. Their hands are covered in clay dust. a macro DSLR image highlighting the texture and craftsmanship.
  • A charcoal sketch of a female dancer capturing her in the middle of a dynamic movement. The sketch is rendered on aged parchment paper. Prompt: Abstract sketch: A blur of expressive lines and energy captures the dynamic movement of a dancer in a gestural charcoal drawing. Sketch on aged parchment paper.
  • A small, gray crocheted elephant toy stands on a dirt path in a grassy field. The elephant has white tusks and toenails and black eyes. The background is a blur of green and brown foliage, with the sun setting in the distance. Prompt: Elephant amigurumi walking in savanna, a professional photograph, blurry background.
  • An image in the style of anime showing a girl in a white dress standing on the bank of an expansive lake, holding flowers and looking at the sky full of pink clouds. The sky is reflected by the water surface. Around her there are small hills covered in wildflowers. Prompt: The girl in white dress stood on the bank of an endless lake, holding flowers and looking at the sky full of pink clouds. The sky is reflected by the water surface, creating a beautiful anime scene. There were small hills covered with wildflowers around her, adding to its beauty. Anime style background, purple blue tone, soft light, warm colors, dreamy atmosphere, and romantic emotions.
  • A moss-covered wooden robot stands in a field of wildflowers, holding out its hand to a small bluebird perched on it. A waterfall flows down a cliff in the background. Prompt: A weathered, wooden mech robot covered in flowering vines stands peacefully in a field of tall wildflowers, with a small bluebird resting on its outstretched hand. Digital cartoon, with warm colors and soft lines. A large cliff with waterfall looms behind.

It’s also our best model yet for rendering text, which has been a challenge for image generation models. This capability opens up possibilities for generating personalized birthday messages, title slides in presentations and more.

The entrance to a grand, stone building with the words "Central Library" engraved above the doorway. The doorway is framed by two columns and features a set of large wooden doors with glass panes. Prompt: A photograph of a stately library entrance with the words "Central Library" carved into the stone.

A detailed origami owl, made of brown paper, perches on a pine branch with closed eyes. Its feathers are intricately folded, and it has a serene expression. The background is a blur of green foliage. Prompt: An origami owl made of brown paper is perched on a branch of an evergreen tree. The owl is facing forward with its eyes closed, giving it a peaceful appearance. The background is a blur of green foliage, creating a natural and serene setting.

A felt robot stands in a sunlit forest clearing, with a felt owl perched on its shoulder and a felt fox sitting at its feet. The robot is grey, with large round eyes and a slightly worried expression. The owl has large, orange eyes and brown feathers. The fox has red fur and a bushy tail. The forest floor is covered in green moss and fallen leaves. Prompt: Photo of a felt puppet diorama scene of a tranquil nature scene of a secluded forest clearing with a large friendly, rounded robot is rendered in a risograph style. An owl sits on the robots shoulders and a fox at its feet. Soft washes of color, 5 color, and a light-filled palette create a sense of peace and serenity, inviting contemplation and the appreciation of natural beauty.

A pixel art illustration of the Space Shuttle STS-1 launching into a blue sky, leaving a trail of smoke and flames. The text "STS-1" is at the bottom of the image. Prompt: Pixel art of a space shuttle blasting of. Cape Canaveral in the background, blue skies, with plumes of smoke billowing out. "STS-1" is written below it.

The word "light" formed from colorful feathers arranged on a black background. Prompt: Word “light” made from various colorful feathers, black background.

A scene made entirely of clay depicting an elderly woman wearing a flowing red top and taupe skirt. She is walking on a straight path in a garden, with lush plants growing on either side of the path. She is holding a large orange watering can in her right hand and is using it to water the plants. Prompt: Claymation scene. A medium wide shot of an elderly woman. She is wearing flowing clothing. She is standing in a lush garden watering the plants with an orange watering can.

Starting today, Imagen 3 is available to select creators in private preview in ImageFX, and by joining our waitlist. Imagen 3 will be coming soon to Vertex AI.

Learn more about Imagen 3’s capabilities.

Our collaborations with the music community

As part of our continued exploration into the role AI can play in art and music creation, we’re collaborating, in partnership with YouTube, with some amazing musicians, songwriters and producers.

These collaborations are also informing the development of our generative music technologies, including Lyria, our most advanced model for AI music generation.

As part of this work, we’ve been developing a suite of music AI tools called Music AI Sandbox. These tools are designed to open a new playground for creativity, allowing people to create new instrumental sections from scratch, transform sound in new ways and much more.

https://www.youtube-nocookie.com/embed/-dPqc7l2zu8?enablejsapi=1&origin=https%3A%2F%2Fblog.google&widgetid=2

We're partnering with musicians, songwriters, and producers to investigate the exciting role artificial intelligence can have in the music creation process.

Today, we’re continuing that experimentation in music with Grammy-winning musician Wyclef Jean, Grammy-nominated songwriter Justin Tranter and electronic musician Marc Rebillet — who are releasing new demo recordings on their YouTube channels, created with help from our music AI tools.

https://www.youtube-nocookie.com/embed/?listType=playlist&list=PLqYmG7hTraZA7o7KkLWoVscoELWRGu3Xg&enablejsapi=1&origin=https%3A%2F%2Fblog.google&widgetid=3

Wyclef Jean, Justin Tranter, and Marc Rebillet are the first to release new demos using the Music AI Sandbox, and each demo is now available for listening on their YouTube channels.

Responsible from design to deployment

We’re mindful about not only advancing the state of the art, but doing so responsibly. So we’re taking measures to address the challenges raised by generative technologies and helping people and organizations work responsibly with AI-generated content.

For each of these technologies, we’ve been working with the creative community and other external stakeholders, gathering insights and listening to feedback to help us improve and deploy our technologies in safe and responsible ways.

We’ve been conducting safety tests, applying filters, setting guardrails, and putting our safety teams at the center of development. Our teams are also pioneering tools, such as SynthID, which can embed imperceptible digital watermarks into AI-generated images, audio, text and video. And starting today, all videos generated by Veo on VideoFX will be watermarked by SynthID.

The creative potential for generative AI is immense and we can’t wait to see how people around the world will bring their ideas to life with our new models and tools.
