Google has unveiled Whisk, its latest generative AI image creation tool. Whisk offers a user-friendly design to make AI image generation accessible to everyone, including those without any previous knowledge of AI.
Instead of typing out long, detailed text prompts, users create images in Whisk by dragging and dropping images that serve as the prompts.
Generative AI uses deep-learning models to create new content based on patterns learned from existing data. However, text prompts sometimes fail to capture certain design elements. Whisk addresses this by letting users supply separate image inputs for the main subject, the scene, and the preferred art style.
Google’s Gemini model generates a detailed caption for each selected image, and those captions are then fed into Google’s Imagen 3 image generation model. This process captures the subject’s “essence” rather than producing an exact replica.
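The caption-then-generate flow described above can be illustrated with a minimal Python sketch. It is not Google's implementation and calls no real Gemini or Imagen API: the slot names, the build_prompt function, and the fake_captioner stand-in are all hypothetical, and the way the captions are joined into one sentence is an assumption made for illustration.

```python
from typing import Callable, Mapping

# Hypothetical slot names mirroring the three inputs the article describes.
SLOTS = ("subject", "scene", "style")


def build_prompt(images: Mapping[str, bytes], captioner: Callable[[bytes], str]) -> str:
    """Caption each input image, then merge the captions into one text prompt."""
    captions = {slot: captioner(images[slot]).strip() for slot in SLOTS}
    # The joining template below is an assumption, not Whisk's actual wording.
    return (
        f"{captions['subject']}, placed in {captions['scene']}, "
        f"in the style of {captions['style']}"
    )


def fake_captioner(image: bytes) -> str:
    """Stand-in for a captioning model: treats the bytes as a UTF-8 caption."""
    return image.decode("utf-8")


prompt = build_prompt(
    {
        "subject": b"a plush toy fox",
        "scene": b"a snowy forest",
        "style": b"a watercolor painting",
    },
    fake_captioner,
)
print(prompt)
```

In a real pipeline the captioner would be a call to an image-captioning model, and the resulting text would be passed to an image generation model. Because the intermediate result is plain text, it can be inspected and edited before generation.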
Note that Whisk extracts only a few key characteristics from each image, so the resulting content will not always match expectations. However, users can view and edit the underlying prompts at any time to achieve the desired results.
For more information about Whisk, visit https://labs.google/fx/tools/whisk.