Technical, Funny

I cloned myself with Gemini Omni in 15 minutes (and it's terrifyingly good)

How I AI, June 3, 2026

Claire Ho from the 'How I AI' podcast demonstrates creating an AI video avatar of herself using Google Flow and the Gemini Omni model in approximately 15 minutes. She generates a full storyboard, produces multiple AI video scenes using her avatar, and stitches them into a one-minute hype video for her podcast. Despite imperfections like inconsistent backgrounds and uncanny valley moments, she considers the result a success given the minimal time and effort involved.

Summary

In this episode of 'How I AI,' host Claire Ho conducts a live experiment to create an AI video avatar of herself using Google Flow and the new Gemini Omni video generation model, completing the entire process in roughly 15 minutes. She begins by scanning a QR code to take photos of her face from multiple angles, which Flow uses to generate a personalized avatar. Despite having failed at this process on a previous attempt, the avatar generation succeeds this time, producing what she describes as a 'fish eye lens version' of herself.

Using Flow's built-in AI assistant, Claire brainstorms a storyboard for a hype video for her podcast. She describes her desired aesthetic (a dark home office with green walls, AI books, posters, and a 'hacker vibe'), and the AI generates a seven-scene storyboard including shots like extreme close-ups of keyboard typing, a spinning chair reveal, a heads-up display overlay, and a call-to-action segment. She then generates a video for each scene by referencing her avatar character, recovering from a minor mistake in which she accidentally generated images instead of videos.

The resulting videos show mixed but impressive results. Roughly 50% of the time the avatar closely resembles Claire's actual face, capturing details like her sun damage and side profiles. However, inconsistencies emerge across scenes, including varying background colors, changing books on shelves, and anachronistic AI tropes like a giant iPad showing schematics. One scene depicting her laughing is described as deep in the 'uncanny valley.' After stitching the best video clips together in Flow's in-browser editor, she premieres the final one-minute hype video and expresses genuine excitement about the outcome.

Claire reflects on the broader significance of generative video AI, noting that it enables her — someone without video production skills — to solo-produce creative content she never could have before. She concludes that while the result is roughly 50% polished, achieving it with no prior tool knowledge in 15 minutes represents a meaningful milestone in accessible AI-driven content creation.

Key Insights

Claire notes that generative video AI unlocks creative abilities she previously lacked entirely — she says she would never have been able to solo-produce a hype video before, as she wouldn't know how to frame it, brainstorm it, or block it.
Claire observes that the avatar captured background details from her photo session — including her poster — because those objects were visible behind her when she took the reference photos, showing that the model incorporates environmental context from avatar capture.
Claire finds that character consistency is a significant weakness, noting that approximately 50% of the time the avatar resembles her actual face and 50% of the time it looks like an uncanny version, with background elements like shelf books and wall color also changing between scenes.
Claire points out that video generation models default to early-2000s stereotypes of what 'impressive AI technology' looks like — she is shown holding a 24-inch iPad displaying a church schematic and getting a heads-up display overlaid on her face while apparently coding a robot.
Claire argues that the entire workflow — recording her avatar, learning the tool, building a storyboard, generating all videos, and stitching them together — took roughly 15 minutes total, and she considers the 50%-polished result a genuine success given zero prior knowledge of the tool.

Topics

AI video avatar creation with Google Flow
Gemini Omni multimodal video generation model
AI-assisted storyboarding and creative production
Character consistency challenges in generative video
Accessibility of AI tools for non-technical creative tasks

Transcript

[0:00] Today, I am doing a very strange episode where I'm going to create a video avatar of myself and in about 15 minutes get to a full minute long video starring none other than your favorite podcast host, Claire Ho. Let's get to it. This episode is brought to you by Merge. Building an AI product is one thing. The hard part is everything around it. Connecting to the tools your team and customers rely on, letting agents take action with the right permissions, and keeping [music] [0:30] everything reliable and cost-efficient once you're in production. Most teams end up piecing that [music] together themselves. So, instead of building the product you actually care about, you get pulled into…
