Imagen (text-to-image model)

Developer(s): Google DeepMind

Initial release: May 2022

Stable release: Imagen 4 / 20 May 2025

Type: Text-to-image model

Image-generating machine learning model

Imagen is a series of text-to-image models developed by Google DeepMind. It was originally created by Google Brain, before Google Brain and DeepMind merged in 2023.¹ It generates images from text prompts, similar to Stability AI's Stable Diffusion, OpenAI's DALL-E, and Midjourney. Imagen was first described in a May 2022 paper.² It is available to anyone with a Google account through platforms such as Gemini, ImageFX, and Vertex AI.³

History

The original Imagen was presented in a paper published in May 2022 and could generate high-fidelity images from natural-language prompts.⁴ The second version, Imagen 2, was released in December 2023;⁵ its standout feature was text and logo generation.⁶ Imagen 3 was released in August 2024.⁷ Google claims that this version renders better detail and lighting in generated images.⁸ On 20 May 2025, at Google I/O 2025, the company released an improved model, Imagen 4.⁹

Technology

Imagen combines two key technologies. The first is a transformer-based large language model, notably T5, which encodes the text prompt for image synthesis. The second is a cascade of diffusion models for high-fidelity image generation. Images are generated in three stages: a base image at 64×64 pixels, which is then upsampled to 256×256 and finally to 1024×1024.¹⁰
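
The three-stage cascade described above can be sketched as a simple pipeline. This is only an illustration of the data flow, not Imagen's implementation: the stage names, the `run_cascade` and `upsample_nearest` functions, and the way the prompt is turned into a pixel value are all hypothetical stand-ins. Nearest-neighbour pixel repetition takes the place of the super-resolution diffusion models, and a flat single-colour image takes the place of the text encoder and base diffusion model. Only the three resolutions come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str      # illustrative label, not an official component name
    out_size: int  # stage output is out_size x out_size pixels


# Resolutions as stated in the text: 64 -> 256 -> 1024.
CASCADE = (
    Stage("base", 64),
    Stage("super_resolution_1", 256),
    Stage("super_resolution_2", 1024),
)


def upsample_nearest(image, factor):
    """Placeholder for a super-resolution diffusion model.

    Repeats every pixel `factor` times horizontally and vertically.
    """
    out = []
    for row in image:
        wide = [px for px in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def run_cascade(prompt):
    """Return the final image and the (height, width) after each stage."""
    # Placeholder for text encoding + base diffusion: derive one grey
    # level from the prompt bytes and fill a 64x64 image with it.
    grey = sum(prompt.encode("utf-8")) % 256
    size = CASCADE[0].out_size
    image = [[grey] * size for _ in range(size)]
    shapes = [(len(image), len(image[0]))]

    # Each later stage upsamples the previous stage's output.
    for prev, stage in zip(CASCADE, CASCADE[1:]):
        factor = stage.out_size // prev.out_size  # 4 for both steps
        image = upsample_nearest(image, factor)
        shapes.append((len(image), len(image[0])))
    return image, shapes


if __name__ == "__main__":
    _, shapes = run_cascade("a corgi riding a bicycle")
    print(shapes)
```

Both upsampling steps use the same factor of 4, so the cascade multiplies the side length by 16 overall.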

Capabilities

Imagen can generate photorealistic images from text prompts.¹¹ It can also render images in various styles, such as cinematic, 35mm film, illustration, and surreal. The model can generate images in five aspect ratios: 9:16, 3:4, 1:1, 4:3, and 16:9. Imagen can also refine previously generated images when the user edits the existing text prompt.¹²
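
As a small illustration of working with the five aspect ratios listed above, the following sketch picks the supported ratio closest to a desired width and height. The helper names `parse_ratio` and `nearest_supported_ratio` are hypothetical and not part of any Google API. Only the list of ratios comes from the text, and "closest" is taken here to mean the smallest absolute difference in width divided by height, which is one assumption among several reasonable choices.

```python
from fractions import Fraction

# The five aspect ratios listed in the text, written as width:height.
SUPPORTED_ASPECT_RATIOS = ("9:16", "3:4", "1:1", "4:3", "16:9")


def parse_ratio(label):
    """Convert a label such as '16:9' into an exact Fraction (width/height)."""
    width, height = label.split(":")
    return Fraction(int(width), int(height))


def nearest_supported_ratio(width, height):
    """Return the supported ratio label closest to width/height."""
    target = Fraction(width, height)
    return min(
        SUPPORTED_ASPECT_RATIOS,
        key=lambda label: abs(parse_ratio(label) - target),
    )


if __name__ == "__main__":
    print(nearest_supported_ratio(1920, 1080))
    print(nearest_supported_ratio(1080, 1920))
```

Using `Fraction` keeps the comparison exact, so a request that matches a supported ratio precisely always maps to that ratio.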

External links

Imagen website

References

1. Roth, Emma; Peters, Jay (April 20, 2023). "Google's big AI push will combine Brain and DeepMind into one team". The Verge. Archived from the original on April 20, 2023. Retrieved March 18, 2025. https://www.theverge.com/2023/4/20/23691468/google-ai-deepmind-brain-merger
2. Saharia, Chitwan; Chan, William; Saxena, Saurabh; Li, Lala; Whang, Jay; Denton, Emily; Seyed Kamyar Seyed Ghasemipour; Burcu Karagol Ayan; Sara Mahdavi, S.; Rapha Gontijo Lopes; Salimans, Tim; Ho, Jonathan; David J Fleet; Norouzi, Mohammad (2022). "Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding". arXiv:2205.11487 [cs.CV].
3. Peterson, Jake (2024-08-16). "Anyone With a Google Account Can Try Google's Latest AI Image Generator Right Now". Lifehacker. Retrieved 2025-03-18. https://lifehacker.com/tech/you-can-try-googles-latest-ai-image-generator-right-now
4. Saharia, Chitwan; Chan, William; Saxena, Saurabh; Li, Lala; Whang, Jay; Denton, Emily; Seyed Kamyar Seyed Ghasemipour; Burcu Karagol Ayan; Sara Mahdavi, S.; Rapha Gontijo Lopes; Salimans, Tim; Ho, Jonathan; David J Fleet; Norouzi, Mohammad (2022). "Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding". arXiv:2205.11487 [cs.CV].
5. "Imagen 2 - our most advanced text-to-image technology". Google DeepMind. 2025-03-12. Retrieved 2025-03-18. https://deepmind.google/technologies/imagen-2/
6. Wiggers, Kyle (2023-12-13). "Google debuts Imagen 2 with text and logo generation". TechCrunch. Retrieved 2025-03-18. https://techcrunch.com/2023/12/13/google-debuts-imagen-2-with-text-and-logo-generation/
7. Schoon, Ben (2024-08-16). "Google opens access to Imagen 3, its latest model for AI image generation". 9to5Google. Archived from the original on 2024-08-18. Retrieved 2025-03-18. http://web.archive.org/web/20240818012446/https://9to5google.com/2024/08/16/google-imagen-3-launch/
8. Rowlands, Christian (2025-02-26). "Some of the most realistic AI images you'll see were created with this free tool". TechRadar. Retrieved 2025-03-18. https://www.techradar.com/computing/artificial-intelligence/what-is-imagen-3-everything-you-need-to-know-about-googles-text-to-image-model
9. Wiggers, Kyle (2025-05-20). "Imagen 4 is Google's newest AI image generator". TechCrunch. Retrieved 2025-03-18. https://techcrunch.com/2025/05/20/imagen-4-is-googles-newest-ai-image-generator/
10. Saharia, Chitwan; Chan, William; Saxena, Saurabh; Li, Lala; Whang, Jay; Denton, Emily; Seyed Kamyar Seyed Ghasemipour; Burcu Karagol Ayan; Sara Mahdavi, S.; Rapha Gontijo Lopes; Salimans, Tim; Ho, Jonathan; David J Fleet; Norouzi, Mohammad (2022). "Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding". arXiv:2205.11487 [cs.CV].
11. Peterson, Jake (2024-08-16). "Anyone With a Google Account Can Try Google's Latest AI Image Generator Right Now". Lifehacker. Retrieved 2025-03-18. https://lifehacker.com/tech/you-can-try-googles-latest-ai-image-generator-right-now
12. Rowlands, Christian (2025-02-26). "Some of the most realistic AI images you'll see were created with this free tool". TechRadar. Retrieved 2025-03-18. https://www.techradar.com/computing/artificial-intelligence/what-is-imagen-3-everything-you-need-to-know-about-googles-text-to-image-model
