Image Generation: Creating with AI
AI doesn't just understand images — it can create entirely new ones from text descriptions. Learn how image generation works, when to use it, and what GPT-image-1.5 can do.
How does AI create images?
Image generation AI works like a sculptor carving a figure out of a block of random noise.
Imagine starting with TV static — pure random dots. The model refines that noise step by step until a clear image emerges that matches your text description. It’s like watching a photo develop in a darkroom, except the development is guided by your words.
You type “a fluffy orange cat wearing a tiny top hat, watercolour style” → the model generates a completely new image that’s never existed before.
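The step-by-step refinement can be illustrated with a toy loop — purely illustrative, since a real diffusion model predicts and removes noise using a trained neural network guided by your text prompt, not a fixed target:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
target = np.full((8, 8), 0.75)   # stand-in for "the image the prompt describes"
image = rng.random((8, 8))       # step 0: pure "TV static"

for step in range(50):
    # each step strips away a little noise, nudging the static toward the target
    image += 0.1 * (target - image)

remaining_noise = float(np.abs(image - target).mean())
print(f"mean distance from target after 50 steps: {remaining_noise:.4f}")
```

After enough steps the random static has almost entirely converged on the target — the same gradual noise-to-image trajectory a diffusion model follows, just without the learned guidance.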
Computer vision vs image generation
| Feature | Computer Vision | Image Generation |
|---|---|---|
| Direction | Image → Understanding | Text → Image |
| Input | An existing image | A text description (prompt) |
| Output | Text (labels, descriptions, extracted text) | A new image |
| Example | "This image contains a cat and a dog" | Creates a new image of "a cat and a dog playing in a park" |
| Azure service | Azure AI Vision (Foundry Tools) | GPT-image-1.5 via Azure OpenAI |
GPT-image-1.5: Azure’s image generation model
GPT-image-1.5 is OpenAI’s latest image generation model, available in Microsoft Foundry.
Note: The previous model (DALL-E 3) was retired on March 4, 2026. GPT-image-1.5 is the current GA replacement with improved capabilities.
Key capabilities:
- Generate images from text prompts
- Edit existing images with text instructions
- Specify image size and quality settings
- Generate multiple images per request
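These options map onto a single request. Here is a minimal sketch — the deployment name is a placeholder assumption, and the actual `client.images.generate(...)` call (via the `openai` package's `AzureOpenAI` client) is left commented out because it requires a real endpoint and API key:

```python
def build_image_request(prompt: str, n: int = 1,
                        size: str = "1024x1024",
                        quality: str = "high") -> dict:
    """Assemble keyword arguments for an image generation call.

    `model` names your Azure deployment (placeholder here); `size` and
    `quality` are the settings listed above, and `n` requests multiple
    images in one call.
    """
    return {
        "model": "my-gpt-image-deployment",  # placeholder deployment name
        "prompt": prompt,
        "n": n,
        "size": size,
        "quality": quality,
    }

request = build_image_request(
    "a fluffy orange cat wearing a tiny top hat, watercolour style", n=2)

# With credentials configured, the call itself would look like:
#   from openai import AzureOpenAI
#   client = AzureOpenAI(azure_endpoint=..., api_key=..., api_version=...)
#   result = client.images.generate(**request)
```

Editing an existing image follows the same pattern with an image-edit call, passing the source image alongside the text instruction.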
Use cases:
| Scenario | Prompt Example |
|---|---|
| Marketing | "Professional product photo of a smartwatch on a white background" |
| Concept art | "Futuristic city skyline at sunset, cyberpunk style" |
| Education | "Diagram showing how a solar panel converts sunlight to electricity" |
| Prototyping | "Mobile app mockup for a fitness tracker, dark mode, minimal design" |
GreenLeaf scenario: GreenLeaf uses GPT-image-1.5 to generate visuals for their sustainability reports — illustrations of farming practices, infographics about crop yields, and concept images for new products — all without hiring a designer.
Responsible AI in image generation
Image generation has unique responsible AI considerations:
| Consideration | How Azure Handles It |
|---|---|
| Harmful content | Content filters block violent, sexual, or harmful image generation |
| Deepfakes | Azure AI embeds C2PA provenance metadata in generated images to prove AI origin |
| Bias | Models are tested for demographic bias in generated faces and scenarios |
| Copyright | Users should not generate images of copyrighted characters or trademarks |
| Transparency | Generated images should be labelled as AI-created |
What is C2PA?
C2PA (Coalition for Content Provenance and Authenticity) is an open standard for embedding provenance metadata into digital content. When GPT-image generates an image, it includes C2PA metadata that records the content’s origin — that it was AI-generated, by which service, and when.
This supports the transparency responsible AI principle — tools that read C2PA metadata can verify whether content is AI-generated, even if the creator doesn’t visually label it.
🎬 Video walkthrough
🎬 Video coming soon
Image Generation — AI-901 Module 10
Knowledge Check
GreenLeaf wants to create professional illustrations for their annual sustainability report without hiring a graphic designer. Which AI capability should they use?
A news organisation uses GPT-image-1.5 to generate images for articles. Which responsible AI concern is most critical for this use case?
Next up: Information Extraction — how AI pulls structured data from chaotic documents, images, audio, and video.