Today in the US, we’re launching our newest experiment in generative AI: Whisk. Instead of generating images with long, detailed text prompts, Whisk lets you prompt with images. Simply drag in images, and start creating.
Whisk lets you input one image for the subject, another for the scene and a third for the style. Then, you can remix them to create something uniquely your own, from a digital plushie to an enamel pin or sticker.
Behind the scenes, the Gemini model automatically writes a detailed caption for each of your images. Whisk then feeds those captions into Google’s latest image generation model, Imagen 3. This process captures your subject's essence rather than producing an exact replica. That way, you can easily remix your subjects, scenes and styles in novel ways.
Since Whisk extracts only a few key characteristics from your image, it might generate images that differ from your expectations. For example, the generated subject might have a different height, weight, hairstyle or skin tone. We understand these features may be crucial for your project and Whisk may miss the mark, so we let you view and edit the underlying prompts at any time.
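For readers who like to see the idea in code, here is a minimal sketch of the caption-then-generate flow described above, including the step where you edit a caption before generating. It is an illustration under stated assumptions, not Whisk's actual implementation: the `WhiskSketch` class, the `fake_captioner` and `fake_generator` stand-ins and the prompt template are all hypothetical, and no real Gemini or Imagen API is called.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# The three image roles named in the post.
ROLES = ("subject", "scene", "style")


@dataclass
class WhiskSketch:
    """Hypothetical model of the flow: caption each image, then generate from text."""

    captioner: Callable[[bytes, str], str]  # stand-in for a captioning model
    generator: Callable[[str], bytes]       # stand-in for an image generation model
    captions: Dict[str, str] = field(default_factory=dict)

    def add_image(self, role: str, image: bytes) -> None:
        """Caption the image and keep only the text; the pixels are not reused."""
        if role not in ROLES:
            raise ValueError(f"unknown role: {role!r}")
        self.captions[role] = self.captioner(image, role)

    def edit_caption(self, role: str, text: str) -> None:
        """Let the user overwrite an automatic caption that missed the mark."""
        if role not in ROLES:
            raise ValueError(f"unknown role: {role!r}")
        self.captions[role] = text

    def build_prompt(self) -> str:
        """Combine the three captions into one prompt (template is an assumption)."""
        missing = [r for r in ROLES if r not in self.captions]
        if missing:
            raise ValueError(f"missing captions for: {missing}")
        return "{subject}, in {scene}, rendered as {style}".format(**self.captions)

    def generate(self) -> bytes:
        """Send the combined prompt to the generator."""
        return self.generator(self.build_prompt())


def fake_captioner(image: bytes, role: str) -> str:
    # Placeholder: a real captioner would describe the image content.
    return f"caption of {role} image ({len(image)} bytes)"


def fake_generator(prompt: str) -> bytes:
    # Placeholder: a real generator would return image data.
    return prompt.encode("utf-8")


if __name__ == "__main__":
    whisk = WhiskSketch(captioner=fake_captioner, generator=fake_generator)
    for role in ROLES:
        whisk.add_image(role, b"abc")
    # Correct the subject caption before generating.
    whisk.edit_caption("subject", "a plush toy fox with orange fur")
    print(whisk.build_prompt())
```

Because only the captions reach the generator in this sketch, any detail the caption omits is lost, which is why editing the text is the natural way to steer the result.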
In our early testing with artists and creatives, people have been describing Whisk as a new type of creative tool — not a traditional image editor. We built it for rapid visual exploration, not pixel-perfect edits. It’s about exploring ideas in new and creative ways, allowing you to work through dozens of options and download the ones you love.
If you are based in the US, you can try it out today at labs.google/whisk and tell us what you think.
Google Labs is where we cook up experiments with the latest generative AI models like Gemini, Imagen and Veo. Our goal is to get feedback on new products and features as we work to shape technology together. You can stay up to date on Whisk and other experiments by signing up for our newsletter and following Google Labs on X, Reddit and Discord.