The capabilities of multimodal AI | Gemini Demo
Our natively multimodal AI model Gemini is capable of reasoning across text, images, audio, video and code. Here are some favorite moments with Gemini. Learn more and try the model: https://deepmind.google/gemini
Explore Gemini: https://goo.gle/how-its-made-gemini
For the purposes of this demo, latency has been reduced and Gemini outputs have been shortened for brevity.

Subscribe to our Channel: https://www.youtube.com/google
Tweet with us on X: https://twitter.com/google
Follow us on Instagram: https://www.instagram.com/google
Join us on Facebook: https://www.facebook.com/Google

0:00 Intro
0:19 Multimodal Dialogue
1:32 Multilinguality
2:04 Game Creation
2:31 Visual Puzzles
3:17 Making Connections
3:39 Image & Text Generation
4:06 Logic & Spatial Reasoning
4:55 Translating Visuals
5:27 Cultural Understanding