Text-to-Image Prompting

Craft effective prompts for AI image generators like DALL-E, Midjourney, and Stable Diffusion.

Text-to-image prompting is fundamentally different from text-to-text prompting. Instead of giving instructions, you're providing a description that the model translates into visual content. The more precisely you describe the image — including style, composition, lighting, and mood — the closer the output matches your vision.

Effective image prompts generally follow this structure: Subject + Action/Pose + Environment + Style + Technical Details. Each element adds specificity that reduces randomness in the output.

Weak prompt: "A cat"
→ Random cat image, any style, any setting

Strong prompt: "A tabby cat sitting on a windowsill at sunset, looking outside at falling autumn leaves. Warm golden hour lighting. Shot from slightly below. Photorealistic style, shallow depth of field, 85mm lens look."
→ Specific, vivid, far more predictable result

  1. Subject: What is in the image? Be specific about appearance, pose, expression, clothing.
  2. Environment/setting: Where is it? Indoor, outdoor, time of day, weather, background details.
  3. Style: Photorealistic, illustration, watercolor, oil painting, digital art, anime, 3D render, etc.
  4. Lighting: Golden hour, dramatic shadows, soft diffused, neon, backlit, studio lighting.
  5. Composition: Close-up, wide shot, aerial view, rule of thirds, symmetrical, Dutch angle.
  6. Mood/atmosphere: Peaceful, dramatic, mysterious, energetic, melancholic.
  7. Technical: Camera lens (35mm, 85mm), film stock, resolution, aspect ratio.
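As a minimal sketch, the seven elements above can be kept as separate fields and joined into one prompt string. The `ImagePrompt` class and its field names are hypothetical, chosen only for illustration; no image generator requires this structure.

```python
from dataclasses import dataclass


@dataclass
class ImagePrompt:
    """One field per element in the checklist; only the subject is required."""
    subject: str
    environment: str = ""
    style: str = ""
    lighting: str = ""
    composition: str = ""
    mood: str = ""
    technical: str = ""

    def render(self) -> str:
        # Keep the checklist order and skip any element left empty.
        parts = [
            self.subject, self.environment, self.style, self.lighting,
            self.composition, self.mood, self.technical,
        ]
        return ", ".join(p.strip() for p in parts if p.strip())


cat = ImagePrompt(
    subject="a tabby cat sitting on a windowsill, looking outside",
    environment="sunset, falling autumn leaves",
    style="photorealistic",
    lighting="warm golden hour lighting",
    composition="shot from slightly below",
    technical="shallow depth of field, 85mm lens look",
)
print(cat.render())
```

Keeping the elements separate makes it easy to see which ones a prompt is still missing (here, mood) before sending it to a generator.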

  • DALL-E 3 (via ChatGPT): Accepts natural language descriptions. Best for quick, clear concepts. Rewrites your prompt internally — be descriptive but not overly technical.
  • Midjourney: Responds well to artistic references and style keywords. Use parameters like --ar 16:9, --style raw, --v 6. Comma-separated keywords often work better than sentences.
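  • Stable Diffusion: Offers the most control, through separate positive and negative prompts plus generation parameters. Token weighting such as (keyword:1.5), supported by common front ends like the AUTOMATIC1111 web UI, raises or lowers the emphasis on a keyword. Usually needs negative prompts to avoid common artifacts.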
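The platform differences above can be sketched as two small string helpers. This is an illustration only: `weight` and `midjourney_prompt` are hypothetical names, the helpers just build text, and whether a given tool or version honors each parameter should be checked against that tool's own documentation.

```python
def weight(keyword: str, strength: float) -> str:
    """Wrap a keyword in the (keyword:1.5) emphasis form used with Stable Diffusion."""
    return f"({keyword}:{strength:g})"


def midjourney_prompt(keywords: list[str], aspect_ratio: str = "16:9",
                      raw_style: bool = False, version: str = "6") -> str:
    """Join comma-separated keywords, then append --ar, --style raw, and --v."""
    params = [f"--ar {aspect_ratio}"]
    if raw_style:
        params.append("--style raw")
    params.append(f"--v {version}")
    return ", ".join(keywords) + " " + " ".join(params)


print(weight("golden hour", 1.5))
print(midjourney_prompt(
    ["tabby cat on a windowsill", "golden hour", "85mm"], raw_style=True
))
```

The first call prints `(golden hour:1.5)`; the second prints the keyword list followed by `--ar 16:9 --style raw --v 6`.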

Product Photography Prompt

Generates product photography prompts for e-commerce and marketing.

[PRODUCT NAME], professional product photography, centered on a [SURFACE: marble/wooden/minimal white] surface. [LIGHTING: soft studio lighting / dramatic side light / natural window light]. Clean background, [COLOR]. Sharp focus, high resolution. Style: modern commercial photography, [BRAND MOOD: luxury/playful/minimalist/bold].

Negative prompts matter: telling the model what NOT to include (blurry, distorted hands, text, watermark) is as important as what to include. Midjourney uses the --no parameter; Stable Diffusion has a dedicated negative prompt field.
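A rough sketch of how the same exclusion list ends up in different places on the two platforms. The function name, the platform labels, and the `prompt` / `negative_prompt` dictionary keys are assumptions made for this example, not any tool's required format.

```python
def with_negatives(prompt: str, avoid: list[str], platform: str) -> dict[str, str]:
    """Attach a list of unwanted elements in the way each platform expects."""
    unwanted = ", ".join(avoid)
    if platform == "midjourney":
        # Midjourney takes exclusions inline, after the --no parameter.
        return {"prompt": f"{prompt} --no {unwanted}"}
    if platform == "stable-diffusion":
        # Stable Diffusion keeps exclusions in a separate negative prompt field.
        return {"prompt": prompt, "negative_prompt": unwanted}
    raise ValueError(f"unsupported platform: {platform}")


avoid = ["blurry", "distorted hands", "text", "watermark"]
print(with_negatives("studio photo of a ceramic mug", avoid, "midjourney"))
print(with_negatives("studio photo of a ceramic mug", avoid, "stable-diffusion"))
```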

Rarely does the first image generation match your vision perfectly. Treat image prompting as iterative: generate, evaluate, adjust specific elements, and regenerate. Keep what works and refine what doesn't. Most professional-quality results take 3-5 iterations.
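One way to keep iterations disciplined is to change a single element per round and keep a history of drafts. The sketch below assumes a hypothetical `Draft` record with three fields and uses the standard library's `dataclasses.replace`; the "dull lighting" evaluation is an invented scenario.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Draft:
    subject: str
    lighting: str
    style: str

    def render(self) -> str:
        return f"{self.subject}, {self.lighting}, {self.style}"


v1 = Draft("a tabby cat on a windowsill", "flat midday light", "photorealistic")

# Suppose the first image had the right subject and style but dull lighting:
# adjust only that element and keep everything that already works.
v2 = replace(v1, lighting="warm golden hour lighting")

history = [v1, v2]
for round_number, draft in enumerate(history, start=1):
    print(round_number, draft.render())
```

Because each round differs from the last in one field, it is clear which change caused any difference in the generated image.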

Prompt Templates

Blog Header Image

Generates blog header images with clean, professional aesthetics.

A [ABSTRACT/CONCEPTUAL] illustration representing [CONCEPT]. Modern flat design with [COLOR PALETTE: blues and greens / warm oranges / corporate navy]. Clean, minimal composition with plenty of whitespace. Suitable as a blog header at 1200x630 pixels. Professional, contemporary style. No text in the image.

Portrait Style

Creates professional portrait-style images with controlled composition.

Professional headshot-style portrait of a [DESCRIPTION OF PERSON]. [EXPRESSION: friendly smile / thoughtful / confident]. Shot against a [BACKGROUND]. Soft, diffused studio lighting from the left. Shallow depth of field. 85mm lens equivalent. Color palette: [WARM/COOL/NEUTRAL]. Modern corporate photography style.
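The bracketed placeholders in these templates can be filled programmatically. The following is a minimal sketch, assuming placeholders always take the form `[NAME]` or `[NAME: suggested options]`; the `fill` helper, the shortened template, and the sample product are hypothetical.

```python
import re

# Matches [NAME] or [NAME: option / option]; group 1 captures NAME only.
PLACEHOLDER = re.compile(r"\[([^\]:]+)(?::[^\]]*)?\]")


def fill(template: str, values: dict[str, str]) -> str:
    """Replace each bracketed placeholder with the value supplied for its name."""
    def substitute(match: re.Match) -> str:
        name = match.group(1).strip()
        if name not in values:
            raise KeyError(f"no value supplied for [{name}]")
        return values[name]
    return PLACEHOLDER.sub(substitute, template)


# Shortened version of the product photography template.
template = (
    "[PRODUCT NAME], professional product photography, centered on a "
    "[SURFACE: marble/wooden/minimal white] surface. "
    "[LIGHTING: soft studio lighting / dramatic side light / natural window light]."
)
print(fill(template, {
    "PRODUCT NAME": "Ceramic pour-over coffee set",
    "SURFACE": "marble",
    "LIGHTING": "Natural window light",
}))
```

Raising an error on a missing value is a deliberate choice here: a leftover `[COLOR]` in a prompt would be passed to the image model as literal text.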

Key Takeaways

  • Image prompting is descriptive, not instructive — you describe what you want to see
  • Follow the formula: Subject + Action/Pose + Environment + Style + Technical Details
  • Negative prompts prevent common artifacts: specify what to avoid wherever the platform supports it
  • Different platforms (DALL-E, Midjourney, Stable Diffusion) have different optimal prompt styles
  • Treat image generation as iterative: 3-5 rounds of refinement is normal