Much has been written about text-to-image AI systems of late. However, while the likes of OpenAI and Meta AI have shown off their AI models or even made them publicly available, Google has stayed comparatively quiet about its efforts – until last week.
Now available… sort of
According to Google, its Imagen text-to-image model will finally be made available to the public – albeit in a very limited fashion, through its AI Test Kitchen app, as a way to get early feedback about the technology.
AI Test Kitchen was introduced at Google I/O earlier this year as a way for people to learn about, experience, and offer feedback on emerging AI technology. In August, Google announced that members of the public could register their interest as the app is rolled out to “small groups of users in the US”.
But why the interest in Imagen? For a start, a paper released in May this year showed output comparable to that of DALL-E 2 and Stable Diffusion. However, Google appears to have paid additional attention to boosting image fidelity and alignment with text inputs by increasing the size of the language model in Imagen.
Indeed, Google says that human raters prefer Imagen over other models in side-by-side comparisons, both in terms of sample quality and image-text alignment. And the samples shown in its white paper appear to bear this out – though Google is presumably only showing off the very best images there.
Powered by Imagen
As reported by The Verge, selected users will be able to interact with Imagen through two interfaces in AI Test Kitchen: “City Dreamer” and “Wobble”.
The former will allow users to ask the model to generate elements of a city designed around a theme of their choice. The designs will then appear as isometric models, SimCity-style. In Wobble, users can create a monster by deciding the material it is made of and dressing it in the clothing of their choice.
This is far more constrained than the likes of DALL-E 2, which allows users to generate just about anything within seconds, or Stable Diffusion, which can be downloaded and run locally on a machine with just one GPU.
Why the limit? Google’s explanation for not open-sourcing its code and demos gives us a hint: it is concerned about the potential for misuse. However, it has promised to explore a framework for “responsible externalization that balances the value of external auditing with the risks of unrestricted open access.”
For now, those interested in giving Google’s text-to-image service a spin can apply via the AI Test Kitchen app, which is available for download on Google Play and the iOS App Store.
You can also read more about Imagen on Google’s research site or in the white paper.
Image credit: Google Imagen