The end of YouTube tutorials? The new GEMINI changes everything

Xavier Mitjana · 16m 26s

Google's new Gemini 3.1 Flash Live model introduces real-time multimodal AI that can see your screen and provide voice guidance, potentially replacing YouTube tutorials. The presenter demonstrates using it as a personal tutor for software like Premiere Pro and as a voice assistant for websites.

Summary

The video introduces Google's Gemini 3.1 Flash Live, a new multimodal AI model that addresses a common frustration: scrubbing through lengthy YouTube tutorials for one specific piece of software help. The presenter argues this technology could change how people learn to use programs by allowing real-time screen sharing with an AI that can see what you see and provide step-by-step voice guidance.

Two main use cases are demonstrated. In the first, the AI acts as a personal tutor for Adobe Premiere Pro, successfully guiding the user through fixing multi-camera sequence duration issues and adjusting color grading with the Lumetri panel. While viewing the user's screen in real time, it can identify interface elements, navigate menus, and give specific instructions.

The second use case shows how to integrate the technology as a voice assistant on websites, demonstrated with a travel booking site where the AI understands the page content and helps users navigate through voice interaction. The presenter explains how to access these features through Google AI Studio's Playground section and configure settings such as voice selection, image-processing resolution, and system instructions.

The video also covers Google AI Studio's Build section for rapid web development with integrated voice assistants. A sponsored segment promotes Internxt cloud storage as a solution for file management and collaboration.
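The Playground settings the presenter walks through (voice selection, image-processing resolution, system instructions) correspond to per-session options in Gemini's Live API. As a minimal sketch, the helper below bundles them into a config dict; the field names are modeled on the google-genai SDK's `LiveConnectConfig`, and the voice name, resolution value, and tutor instruction are illustrative assumptions, not values confirmed in the video:

```python
# Hypothetical sketch: assemble the Playground-style settings (voice,
# screen-frame resolution, system instructions) into the kind of config
# a Live API session accepts. Field names follow the google-genai SDK's
# LiveConnectConfig shape but should be treated as illustrative.

def build_live_config(voice: str, resolution: str, instructions: str) -> dict:
    """Bundle Playground-style settings for a Live API session."""
    return {
        "response_modalities": ["AUDIO"],  # reply with voice, as in the demo
        "speech_config": {
            "voice_config": {
                "prebuilt_voice_config": {"voice_name": voice}
            }
        },
        "media_resolution": resolution,    # trades screen-frame detail vs. token cost
        "system_instruction": instructions,
    }

config = build_live_config(
    voice="Puck",                          # assumed prebuilt voice name
    resolution="MEDIA_RESOLUTION_LOW",     # assumed enum value
    instructions="Act as a patient Premiere Pro tutor.",
)
# A session would then be opened with something like (not run here):
#   async with client.aio.live.connect(model=MODEL_ID, config=config) as session: ...
```

Keeping the settings in one plain dict makes it easy to swap the system instruction per tutorial topic without touching the session code.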

Key Insights

  • The presenter claims Gemini 3.1 Flash Live could replace traditional YouTube tutorials by providing real-time, personalized guidance while viewing the user's actual screen and software interface
  • Google has created a multimodal model that can simultaneously process text, voice, and visual input, enabling it to see what users see on their screens and respond with voice guidance
  • The AI demonstrated ability to identify specific interface elements in complex software like Adobe Premiere Pro and guide users through multi-step technical processes like color grading and timeline editing
  • The presenter argues this technology enables rapid creation of voice-enabled web assistants that can understand full page context and help users navigate through natural voice interaction
  • Google AI Studio's Build section allows users to create functional websites and integrate voice assistants through simple text prompts, with the AI automatically generating appropriate code and API implementations
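The website use case above hinges on the assistant "understanding full page context". One way to provide that context before opening a Live session is to extract the page's visible text and pass it along with the user's voice input. A self-contained sketch, where the extractor and the sample travel-site HTML are hypothetical illustrations rather than the presenter's actual implementation:

```python
# Hypothetical sketch: collect a page's visible text (skipping script/style)
# so it can be handed to the voice assistant as context for navigation help.
from html.parser import HTMLParser

class PageTextExtractor(HTMLParser):
    """Accumulate visible text, ignoring script and style blocks."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())

def page_context(html: str) -> str:
    """Return the page's visible text as one string of context."""
    parser = PageTextExtractor()
    parser.feed(html)
    return " ".join(parser.chunks)

# Illustrative travel-booking markup, standing in for the demo site.
html = ("<html><head><script>var x=1;</script></head>"
        "<body><h1>Trips to Rome</h1><p>Book a 3-night stay.</p></body></html>")
print(page_context(html))  # → Trips to Rome Book a 3-night stay.
```

The resulting string could then be placed in the session's system instruction so the assistant can answer questions about what is currently on screen.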

Topics

  • Gemini 3.1 Flash Live AI model
  • Real-time screen sharing with AI
  • Voice-guided software tutorials
  • Website voice assistant integration
  • Google AI Studio platform

Full transcript available for MurmurCast members


MurmurCast summarizes your YouTube channels, podcasts, and newsletters into one daily email digest.