Image & Video Generation

Creating visual content with AI

Simple explanation

Image generation is like describing a painting to an artist — you write what you want, and the AI creates it. Video generation does the same but with moving pictures.

Beyond creating from scratch, you can also edit existing images: fill in removed areas (inpainting), change specific parts using masks, or modify elements with text instructions. The AI handles the pixel-level work.

Image generation capabilities

Capability	What It Does	Use Case
Text-to-image	Generate an image from a text prompt	“A professional office meeting with diverse team members”
Image variation	Generate variations of a reference image	Create 5 alternatives of a product photo concept
Inpainting	Replace masked areas with new generated content	Remove background objects, change clothing colour
Mask-based editing	Extend or modify composition via masks on a larger canvas	Expand a portrait to include more background
Style-directed generation	Prompt the model for a specific visual style	“A product photo in watercolour style” (achieved through prompt wording, not a separate API)
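
Because style-directed generation is only a matter of prompt wording, it can be illustrated without calling any image service. The sketch below is a minimal, hypothetical helper (the name build_prompt and the phrasing template are assumptions, not part of any real API) that appends a style phrase to a subject description.

```python
from typing import Optional


def build_prompt(subject: str, style: Optional[str] = None) -> str:
    """Return a text-to-image prompt, optionally steering the visual style.

    Hypothetical helper: the style is expressed purely in the prompt text,
    not through a separate API parameter.
    """
    subject = subject.strip()
    if not subject:
        raise ValueError("subject must not be empty")
    if style and style.strip():
        return f"{subject}, in {style.strip()} style"
    return subject


if __name__ == "__main__":
    print(build_prompt("A product photo of a ceramic mug", "watercolour"))
```

The resulting string would then be passed as the ordinary prompt to whichever text-to-image service is in use.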

Image editing with masks

Edit Type	How It Works	Example
Mask-based inpainting	Define an area (mask), AI fills it with new content	Mask the sky, generate a sunset instead of grey clouds
Prompt-driven modification	Describe what to change, AI modifies the image	“Change the car colour from red to blue”
Object removal	Mask an object, AI fills with matching background	Remove a person from a product photo
Object replacement	Mask an object, describe replacement	“Replace the chair with a modern standing desk”
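
To make the idea of a mask concrete, the following sketch shows how a mask decides which pixels come from the original image and which come from newly generated content. It is a toy illustration using nested lists of labels instead of real pixels. The convention that 1 marks the area to regenerate is an assumption (tools differ on this), and real services perform this blending internally, usually with softer edges.

```python
def composite(original, generated, mask):
    """Keep original pixels where mask is 0; take generated pixels where mask is 1."""
    if not (len(original) == len(generated) == len(mask)):
        raise ValueError("original, generated and mask must have the same height")
    result = []
    for orig_row, gen_row, mask_row in zip(original, generated, mask):
        if not (len(orig_row) == len(gen_row) == len(mask_row)):
            raise ValueError("original, generated and mask must have the same width")
        # Per pixel: a truthy mask value selects the generated content.
        result.append([g if m else o for o, g, m in zip(orig_row, gen_row, mask_row)])
    return result


# Toy 2x3 "image": the top row is sky, the bottom row is the scene to keep.
original = [["grey", "grey", "grey"], ["car", "car", "road"]]
generated = [["sunset", "sunset", "sunset"], ["noise", "noise", "noise"]]
mask = [[1, 1, 1], [0, 0, 0]]  # assumption: 1 = regenerate, 0 = keep

edited = composite(original, generated, mask)
print(edited)
```

Only the masked top row changes; the unmasked row is carried over untouched, which is why masking suits targeted edits such as replacing a sky.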

Video generation and editing

Feature	Description	Control Options
Text-to-video	Generate video clips from text prompts	Duration, resolution, aspect ratio
Reference-based	Generate video matching a reference image or clip	Style, motion, subject consistency
Video editing	Modify specific segments of generated video	Text instructions for changes
Generation controls	Platform-provided quality and safety settings	Content filters, watermarks, resolution limits
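
The control options in the table can be pictured as fields on a request object that is checked before submission. The sketch below is hypothetical: the VideoRequest class, its field names, and the default limits (10 seconds, 1080 pixels) are placeholders chosen for illustration, not values from any particular platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoRequest:
    """Hypothetical text-to-video request; field names are illustrative."""
    prompt: str
    duration_seconds: int
    height: int
    aspect_ratio: str = "16:9"

    def width(self) -> int:
        """Derive the frame width from the height and the aspect ratio."""
        w, h = (int(part) for part in self.aspect_ratio.split(":"))
        return self.height * w // h


def validate(request: VideoRequest, max_duration_seconds: int = 10,
             max_height: int = 1080) -> list:
    """Return a list of problems; the limits are placeholder assumptions."""
    problems = []
    if not request.prompt.strip():
        problems.append("prompt is empty")
    if not 0 < request.duration_seconds <= max_duration_seconds:
        problems.append("duration outside allowed range")
    if not 0 < request.height <= max_height:
        problems.append("height outside allowed range")
    return problems


request = VideoRequest("A drone shot over a coastline", duration_seconds=5, height=1080)
print(request.width(), validate(request))
```

Checking limits on the client side like this is a design choice; the platform's own resolution limits and content filters would still apply on top.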

Real-world example: MediaForge's content pipeline

MediaForge uses image generation for client marketing campaigns:

Brief → concept images: Client brief describes “modern tech office, diverse team, warm lighting” → generate 10 concept images
Selection → variations: Client picks favourite → generate 5 variations with different compositions
Refinement → inpainting: Client wants the window view changed → mask the window, prompt “city skyline at sunset”
Final → style application: Apply brand-consistent colour grading to the final image

Total time: about 20 minutes, compared with roughly 2 days and $5,000 for a traditional photo shoot.
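
The four stages above can be sketched as a chain of function calls. Everything in this example is a stand-in: the functions generate_images, generate_variations, inpaint, and apply_style are hypothetical stubs that return descriptive strings rather than calling a real image service, so the sketch only shows how the output of each stage feeds the next.

```python
def generate_images(prompt: str, count: int) -> list:
    """Stub for text-to-image: returns labels instead of real images."""
    return [f"{prompt} [concept {i}]" for i in range(1, count + 1)]


def generate_variations(image: str, count: int) -> list:
    """Stub for image variation."""
    return [f"{image} [variation {i}]" for i in range(1, count + 1)]


def inpaint(image: str, mask_region: str, prompt: str) -> str:
    """Stub for mask-based inpainting of one named region."""
    return f"{image} [{mask_region} -> {prompt}]"


def apply_style(image: str, style: str) -> str:
    """Stub for the final brand styling step."""
    return f"{image} [style: {style}]"


def run_campaign(brief: str, pick_concept: int, pick_variation: int) -> str:
    concepts = generate_images(brief, 10)            # Brief -> concept images
    chosen = concepts[pick_concept]                  # Client picks a favourite
    variations = generate_variations(chosen, 5)      # Selection -> variations
    refined = inpaint(variations[pick_variation],    # Refinement -> inpainting
                      "window", "city skyline at sunset")
    return apply_style(refined, "brand colour grading")  # Final -> style


final = run_campaign("modern tech office, diverse team, warm lighting", 2, 0)
print(final)
```

In a real pipeline the two pick arguments would come from client feedback rather than fixed indices.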

Generation controls

Control	What It Does	When to Use
Content filters	Block generation of unsafe content	Always enabled; add custom filters for brand safety
Watermarks	Add invisible or visible watermarks to generated content	Compliance with AI content disclosure requirements
Resolution	Set output image/video dimensions	Match target platform requirements (social, print, web)
Seed	Reproduce similar results from the same prompt	A/B testing, consistent brand imagery
Quality settings	Standard vs HD generation	Standard for prototyping, HD for final production
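
The role of the seed can be shown with an ordinary pseudo-random generator. The sketch below is an analogy, not a real image model: fake_generate is a hypothetical function that returns a short list of numbers standing in for pixels. Here the same prompt and seed give exactly the same output, whereas real generation services may only promise very similar results.

```python
import random


def fake_generate(prompt: str, seed: int, size: int = 8) -> list:
    """Stand-in for an image model: output depends only on prompt and seed."""
    rng = random.Random(f"{prompt}|{seed}")  # seeded, so the sequence is repeatable
    return [rng.randint(0, 255) for _ in range(size)]


first = fake_generate("brand hero image", seed=42)
second = fake_generate("brand hero image", seed=42)
other = fake_generate("brand hero image", seed=43)

print(first == second)  # same prompt and seed: identical output
print(first == other)   # different seed: a different output
```

Recording the seed alongside the prompt is what makes an A/B comparison or a later regeneration possible.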

Exam tip: Generation controls are about safety AND quality

The exam tests both:

Safety controls: content filters, watermarks, prohibited content detection
Quality controls: resolution, seed for reproducibility, style parameters

When a question asks about “appropriate generation controls,” consider both dimensions.
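
One way to keep both dimensions in view is to review a configuration against two separate checklists. The sketch below is illustrative only: the control names and the review_controls function are assumptions made for this example, and the grouping simply mirrors the safety and quality lists above.

```python
SAFETY_CONTROLS = {"content_filter", "watermark"}
QUALITY_CONTROLS = {"resolution", "seed", "quality"}


def review_controls(config: dict) -> dict:
    """Report which safety and quality controls are not configured.

    A control counts as configured unless its value is None or False,
    so a seed of 0 is still treated as set.
    """
    configured = {
        name for name, value in config.items()
        if value is not None and value is not False
    }
    return {
        "missing_safety": sorted(SAFETY_CONTROLS - configured),
        "missing_quality": sorted(QUALITY_CONTROLS - configured),
    }


report = review_controls({"content_filter": True, "resolution": "1024x1024", "seed": 7})
print(report)
```

A configuration is only "appropriate" in the sense used here when both lists in the report come back empty.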

Key terms

Question

What is inpainting?

Answer

An image editing technique where you mask (select) an area of an image, and AI generates new content to fill that area. Used for object removal, background replacement, or targeted edits.

Question

What is a generation seed?

Answer

A numerical value that makes image generation reproducible. Using the same prompt + seed produces very similar images each time. Useful for A/B testing and maintaining visual consistency.

Question

What is text-to-video generation?

Answer

Creating video clips from text descriptions. The AI generates frames and motion based on the prompt, with controls for duration, resolution, and style. It can also use reference images for visual consistency.

Knowledge check

MediaForge needs to replace the background in a product photo — keeping the product but changing the background from a studio to a beach scene. Which technique should they use?

NeuralMed generates anatomical diagrams for patient education materials. Which generation control is MOST important to configure?