Google's text-to-image generator takes you on a truly wild ride

Imagen can conjure snakes made out of corn and raccoon kings using just a little bit of word salad.


Google recently unveiled Imagen, its very own text-to-image generator, which uses machine learning and AI to produce photorealistic images from a text prompt.

The text used for this one: “A giant cobra snake on a farm. The snake is made out of corn.” (Image: Google)

Imagen arrives just a month and change after OpenAI unveiled its DALL-E 2 system.

Like Google, OpenAI has implemented safeguards for its own image generator: users cannot generate explicit imagery, nor distinct faces based on a name.

The text used for this one: “A strawberry mug filled with white sesame seeds. The mug is floating in a dark chocolate sea.” (Image: Google)

As Google notes, recent algorithmic advances have been driven partly by the availability of enormous, uncurated datasets that help fuel sophisticated machine learning systems. Because these datasets are scraped from the internet, they reflect some of our worst tendencies, like “social stereotypes, and oppressive viewpoints.”

The text used for this one: “A dog looking curiously in the mirror, seeing a cat.” (Image: Google)
The text used for this one: “A robot couple fine dining with Eiffel Tower in the background.” (Image: Google)

Using a self-developed benchmark known as DrawBench, Google found that Imagen was rated more favorably than existing text-to-image AI, like the aforementioned DALL-E 2 system. DrawBench is simply a side-by-side human evaluation, with the same text prompt fed to both Imagen and the system it is being compared against. Raters were asked to evaluate both images, and Imagen was largely preferred.
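
For a rough sense of how a side-by-side evaluation like this could be scored, here is a minimal Python sketch. It is not Google's code. The function name, the "A", "B", and "tie" labels, and the sample votes are all assumptions made up for illustration, and the numbers say nothing about how Imagen actually performed.

```python
from collections import Counter


def preference_rates(votes):
    """Return the share of votes each label received.

    `votes` holds one label per rater judgment. The labels used here
    ("A", "B", "tie") are hypothetical, not part of DrawBench itself.
    """
    counts = Counter(votes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


# Made-up votes for illustration only; this is not real evaluation data.
# Each entry is one rater's pick after seeing two images for the same prompt.
example_votes = ["A", "A", "B", "tie", "A", "B", "A", "A"]

rates = preference_rates(example_votes)
for label, share in sorted(rates.items()):
    print(f"{label}: {share:.1%}")
```

Each vote stands for one rater comparing two images generated from the same prompt. Tallying the share of votes per system is the simplest way to summarize which one was preferred.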