Image Generation: Creating with AI
AI doesn't just understand images — it can create entirely new ones from text descriptions. Learn how image generation works, when to use it, and what GPT-image-1.5 can do.
How does AI create images?
Image generation AI works like a sculptor carving a figure out of a block of random noise.
Imagine starting with TV static — pure random dots. The model refines that noise step by step until a clear image emerges that matches your text description. It’s like watching a photo develop in a darkroom, except the development is guided by your words.
You type “a fluffy orange cat wearing a tiny top hat, watercolour style” → the model generates a completely new image that’s never existed before.
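The step-by-step refinement can be illustrated with a toy loop — purely illustrative, since a real diffusion model predicts and removes noise using a trained neural network guided by your text prompt, not a fixed target:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
target = np.full((8, 8), 0.75)   # stand-in for "the image the prompt describes"
image = rng.random((8, 8))       # step 0: pure "TV static"

for step in range(50):
    # each step strips away a little noise, nudging the static toward the target
    image += 0.1 * (target - image)

remaining_noise = float(np.abs(image - target).mean())
print(f"mean distance from target after 50 steps: {remaining_noise:.4f}")
```

After enough steps the random static has almost entirely converged on the target — the same gradual noise-to-image trajectory a diffusion model follows, just without the learned guidance.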
Computer vision vs image generation
| Feature | Computer Vision | Image Generation |
|---|---|---|
| Direction | Image → Understanding | Text → Image |
| Input | An existing image | A text description (prompt) |
| Output | Text (labels, descriptions, extracted text) | A new image |
| Example | "This image contains a cat and a dog" | Creates a new image of "a cat and a dog playing in a park" |
| Azure service | Azure AI Vision (Foundry Tools) | GPT-image-1.5 via Azure OpenAI |
GPT-image-1.5: Azure’s image generation model
GPT-image-1.5 is OpenAI’s latest image generation model, available in Microsoft Foundry.
Note: The previous model (DALL-E 3) was retired on March 4, 2026. GPT-image-1.5 is the current GA replacement with improved capabilities.
Key capabilities:
- Generate images from text prompts
- Edit existing images with text instructions
- Specify image size and quality settings
- Generate multiple images per request
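These options map onto a single request. Here is a minimal sketch — the deployment name is a placeholder assumption, and the actual `client.images.generate(...)` call (via the `openai` package's `AzureOpenAI` client) is left commented out because it requires a real endpoint and API key:

```python
def build_image_request(prompt: str, n: int = 1,
                        size: str = "1024x1024",
                        quality: str = "high") -> dict:
    """Assemble keyword arguments for an image generation call.

    `model` names your Azure deployment (placeholder here); `size` and
    `quality` are the settings listed above, and `n` requests multiple
    images in one call.
    """
    return {
        "model": "my-gpt-image-deployment",  # placeholder deployment name
        "prompt": prompt,
        "n": n,
        "size": size,
        "quality": quality,
    }

request = build_image_request(
    "a fluffy orange cat wearing a tiny top hat, watercolour style", n=2)

# With credentials configured, the call itself would look like:
#   from openai import AzureOpenAI
#   client = AzureOpenAI(azure_endpoint=..., api_key=..., api_version=...)
#   result = client.images.generate(**request)
```

Editing an existing image follows the same pattern with an image-edit call, passing the source image alongside the text instruction.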
Use cases:
| Scenario | Prompt Example |
|---|---|
| Marketing | "Professional product photo of a smartwatch on a white background" |
| Concept art | "Futuristic city skyline at sunset, cyberpunk style" |
| Education | "Diagram showing how a solar panel converts sunlight to electricity" |
| Prototyping | "Mobile app mockup for a fitness tracker, dark mode, minimal design" |
GreenLeaf scenario: GreenLeaf uses GPT-image-1.5 to generate visuals for their sustainability reports — illustrations of farming practices, infographics about crop yields, and concept images for new products — all without hiring a designer.
Responsible AI in image generation
Image generation has unique responsible AI considerations:
| Consideration | How Azure Handles It |
|---|---|
| Harmful content | Content filters block violent, sexual, or harmful image generation |
| Deepfakes | Azure AI embeds C2PA provenance metadata in generated images to prove AI origin |
| Bias | Models are tested for demographic bias in generated faces and scenarios |
| Copyright | Users should not generate images of copyrighted characters or trademarks |
| Transparency | Generated images should be labelled as AI-created |
What is C2PA?
C2PA (Coalition for Content Provenance and Authenticity) is an open standard for embedding provenance metadata into digital content. When GPT-image generates an image, it includes C2PA metadata that records the content’s origin — that it was AI-generated, by which service, and when.
This supports the transparency responsible AI principle — tools that read C2PA metadata can verify whether content is AI-generated, even if the creator doesn’t visually label it.
🎬 Video walkthrough
🎬 Video coming soon
Image Generation — AI-901 Module 10
Knowledge Check
GreenLeaf wants to create professional illustrations for their annual sustainability report without hiring a graphic designer. Which AI capability should they use?
A news organisation uses GPT-image-1.5 to generate images for articles. Which responsible AI concern is most critical for this use case?
Next up: Information Extraction — how AI pulls structured data from chaotic documents, images, audio, and video.