Image & Video Generation
From text prompts to stunning visuals. Learn how to generate images and videos, edit with inpainting and masks, and apply the right generation controls for quality and safety.
Creating visual content with AI
Image generation is like describing a painting to an artist — you write what you want, and the AI creates it. Video generation does the same but with moving pictures.
Beyond creating from scratch, you can also edit existing images: fill in removed areas (inpainting), change specific parts using masks, or modify elements with text instructions. The AI handles the pixel-level work.
Image generation capabilities
| Capability | What It Does | Use Case |
|---|---|---|
| Text-to-image | Generate an image from a text prompt | “A professional office meeting with diverse team members” |
| Image variation | Generate variations of a reference image | Create 5 alternatives of a product photo concept |
| Inpainting | Replace masked areas with new generated content | Remove background objects, change clothing colour |
| Mask-based editing | Extend or modify the composition by masking regions on a larger canvas (often called outpainting) | Expand a portrait to include more background |
| Style-directed generation | Prompt the model for a specific visual style | “A product photo in watercolour style” (achieved through prompt wording, not a separate API) |
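Because style is expressed through prompt wording rather than a separate API, a style-directed request can be sketched as plain string composition. The helper name `styled_prompt` and its phrasing template are illustrative assumptions, not part of any particular service.

```python
def styled_prompt(subject: str, style: str = "") -> str:
    """Append a style phrase to the subject; return the subject alone if no style is given."""
    subject = subject.strip().rstrip(".")
    style = style.strip()
    return f"{subject}, in {style} style" if style else subject


prompt = styled_prompt("A product photo", "watercolour")
```

The resulting string would be sent as the ordinary text-to-image prompt; no extra endpoint or parameter is involved.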
Image editing with masks
| Edit Type | How It Works | Example |
|---|---|---|
| Mask-based inpainting | Define an area (mask), AI fills it with new content | Mask the sky, generate a sunset instead of grey clouds |
| Prompt-driven modification | Describe what to change, AI modifies the image | “Change the car colour from red to blue” |
| Object removal | Mask an object, AI fills with matching background | Remove a person from a product photo |
| Object replacement | Mask an object, describe replacement | “Replace the chair with a modern standing desk” |
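The mask-based edits in the table share one mechanic: a mask marks which pixels may change, and everything outside it is kept. The library-free sketch below shows only that compositing step. The tiny grayscale grids and the helpers `rect_mask` and `composite` are illustrative assumptions; a real service generates the fill content itself and works on full-resolution images.

```python
from typing import List

Grid = List[List[int]]  # rows of pixel values (grayscale for simplicity)


def rect_mask(width: int, height: int, left: int, top: int, right: int, bottom: int) -> Grid:
    """Return a mask that is 1 inside the rectangle [left, right) x [top, bottom) and 0 elsewhere."""
    return [
        [1 if left <= x < right and top <= y < bottom else 0 for x in range(width)]
        for y in range(height)
    ]


def composite(original: Grid, generated: Grid, mask: Grid) -> Grid:
    """Take generated pixels where the mask is 1 and original pixels where it is 0."""
    return [
        [g if m else o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


original = [[10] * 4 for _ in range(3)]    # hypothetical 4x3 source image
generated = [[200] * 4 for _ in range(3)]  # stand-in for model output
mask = rect_mask(4, 3, left=1, top=0, right=3, bottom=2)
edited = composite(original, generated, mask)
```

Mask conventions vary between tools (some mark the editable region with white pixels, others with transparency), so it is worth checking which value means "change this" before building a mask.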
Video generation and editing
| Feature | Description | Control Options |
|---|---|---|
| Text-to-video | Generate video clips from text prompts | Duration, resolution, aspect ratio |
| Reference-based | Generate video matching a reference image or clip | Style, motion, subject consistency |
| Video editing | Modify specific segments of generated video | Text instructions for changes |
| Generation controls | Platform-provided quality and safety settings | Content filters, watermarks, resolution limits |
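The text-to-video controls (duration, resolution, aspect ratio) can be checked before a request is sent. The sketch below is a minimal illustration: the `VideoRequest` shape, the `validate` helper, and the limits of 10 seconds and 1080 pixels are all assumptions for the example, not values from any real platform.

```python
from dataclasses import dataclass
from math import gcd
from typing import List


@dataclass(frozen=True)
class VideoRequest:
    """Hypothetical request shape; field names are not taken from any real API."""
    prompt: str
    duration_seconds: int
    width: int
    height: int

    @property
    def aspect_ratio(self) -> str:
        """Reduce width:height to its simplest form, for example 1920x1080 to 16:9."""
        divisor = gcd(self.width, self.height)
        return f"{self.width // divisor}:{self.height // divisor}"


def validate(request: VideoRequest, max_seconds: int = 10, max_height: int = 1080) -> List[str]:
    """Return a list of problems; an empty list means the request fits the assumed limits."""
    problems = []
    if not request.prompt.strip():
        problems.append("prompt is empty")
    if not 0 < request.duration_seconds <= max_seconds:
        problems.append(f"duration must be between 1 and {max_seconds} seconds")
    if request.height > max_height:
        problems.append(f"height exceeds {max_height} pixels")
    return problems


clip = VideoRequest("Drone shot over a coastline at dawn", duration_seconds=8, width=1920, height=1080)
issues = validate(clip)
```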
Real-world example: MediaForge's content pipeline
MediaForge uses image generation for client marketing campaigns:
- Brief → concept images: Client brief describes “modern tech office, diverse team, warm lighting” → generate 10 concept images
- Selection → variations: Client picks favourite → generate 5 variations with different compositions
- Refinement → inpainting: Client wants the window view changed → mask the window, prompt “city skyline at sunset”
- Final → style application: Apply brand-consistent colour grading to the final image
Total time: 20 minutes. Traditional photography: 2 days + $5,000.
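The four stages of this pipeline can be sketched as a chain of calls. Every function below is a hypothetical stand-in that returns a text label instead of an image, so the sketch shows only how the stages feed into each other, not how any real generation service is invoked.

```python
from typing import List


def generate_images(prompt: str, count: int) -> List[str]:
    """Stand-in for a text-to-image call; returns labels instead of pixels."""
    return [f"{prompt} #{i + 1}" for i in range(count)]


def generate_variations(image: str, count: int) -> List[str]:
    """Stand-in for an image-variation call."""
    return [f"{image} (variation {i + 1})" for i in range(count)]


def inpaint(image: str, masked_region: str, prompt: str) -> str:
    """Stand-in for a mask-based inpainting call."""
    return f"{image} [{masked_region} -> {prompt}]"


def apply_style(image: str, style: str) -> str:
    """Stand-in for the final colour-grading step."""
    return f"{image} + {style}"


brief = "modern tech office, diverse team, warm lighting"
concepts = generate_images(brief, count=10)             # brief -> concept images
favourite = concepts[2]                                 # client's pick (index is arbitrary here)
variations = generate_variations(favourite, count=5)    # selection -> variations
refined = inpaint(variations[0], "window", "city skyline at sunset")  # refinement
final_image = apply_style(refined, "brand colour grading")            # final styling
```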
Generation controls
| Control | What It Does | When to Use |
|---|---|---|
| Content filters | Block generation of unsafe content | Always enabled; add custom filters on top for brand safety |
| Watermarks | Add invisible or visible watermarks to generated content | Compliance with AI content disclosure requirements |
| Resolution | Set output image/video dimensions | Match target platform requirements (social, print, web) |
| Seed | Fix the random starting point so the same prompt and settings reproduce the same or a very similar result | A/B testing, consistent brand imagery |
| Quality settings | Standard vs HD generation | Standard for prototyping, HD for final production |
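The controls in the table can be pictured as one settings bundle passed with each request. The sketch below is a rough illustration under stated assumptions: `GenerationControls` and its field names are hypothetical, and `initial_noise` is a toy stand-in for the random starting point that many image generators (diffusion models, for example) begin from. It is only meant to show why a fixed seed makes a run repeatable.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GenerationControls:
    """Hypothetical settings bundle mirroring the controls table."""
    content_filter: bool = True
    watermark: bool = True
    width: int = 1024
    height: int = 1024
    seed: int = 0
    quality: str = "standard"  # "standard" for prototyping, "hd" for final production


def initial_noise(controls: GenerationControls, samples: int = 4) -> List[float]:
    """Draw starting noise from a generator seeded by the controls.

    Real models use far larger noise tensors; this toy version only shows
    that the same seed yields the same starting point.
    """
    rng = random.Random(controls.seed)
    return [rng.random() for _ in range(samples)]


draft = GenerationControls(seed=42)
production = GenerationControls(seed=42, quality="hd")
alternative = GenerationControls(seed=7)

same_start = initial_noise(draft) == initial_noise(production)
different_start = initial_noise(draft) != initial_noise(alternative)
```

Keeping the seed fixed while changing one other setting, as `draft` and `production` do, is what makes A/B comparisons meaningful: differences in the output come from the changed setting rather than from a new random start.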
Exam tip: Generation controls are about safety AND quality
The exam tests both:
- Safety controls: content filters, watermarks, prohibited content detection
- Quality controls: resolution, seed for reproducibility, style parameters
When a question asks about “appropriate generation controls,” consider both dimensions.
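One way to internalise the two dimensions is a quick check that a proposed configuration touches both. The sketch below uses the control names from the two bullets above; the set names and the `missing_dimensions` helper are illustrative assumptions.

```python
from typing import List, Set

SAFETY_CONTROLS = {"content_filter", "watermark", "prohibited_content_detection"}
QUALITY_CONTROLS = {"resolution", "seed", "style"}


def missing_dimensions(configured: Set[str]) -> List[str]:
    """Return which dimensions (safety, quality) have no configured control."""
    missing = []
    if not configured & SAFETY_CONTROLS:
        missing.append("safety")
    if not configured & QUALITY_CONTROLS:
        missing.append("quality")
    return missing


quality_only = missing_dimensions({"resolution", "seed"})
balanced = missing_dimensions({"watermark", "content_filter", "resolution"})
```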
Knowledge check
MediaForge needs to replace the background in a product photo — keeping the product but changing the background from a studio to a beach scene. Which technique should they use?
NeuralMed generates anatomical diagrams for patient education materials. Which generation control is MOST important to configure?