GPT-4o Image

By OpenAI — Native multimodal image generation with industry-leading text rendering

サインインして生成
GET STARTED

How It Works

1

Describe Your Image

Enter a detailed text prompt describing the image you want to create. You can also upload a reference image for editing, style transfer, or transformation. GPT-4o's advanced language understanding ensures even complex, nuanced descriptions are captured accurately.

2

AI Generation

GPT-4o processes your prompt using its native autoregressive image generation, producing high-quality images with accurate text rendering, precise spatial composition, and faithful adherence to your instructions. Select from square (1:1), portrait (2:3), or landscape (3:2).

3

Refine & Download

Review your generated image and refine with follow-up prompts. Change colors, add elements, modify text, or alter the style. Once satisfied, download your final image in high resolution.

About the Model

What Is GPT-4o Image?

GPT-4o Image Generation is OpenAI's native image creation capability built directly into the GPT-4o multimodal model. Released in March 2025, it represents a fundamentally different approach to AI image generation by producing images autoregressively as part of GPT-4o's unified text-and-image output, rather than relying on a separate diffusion model like DALL-E.

Unlike traditional diffusion-based image generators, GPT-4o generates images token-by-token using the same autoregressive transformer architecture that powers its text generation. This unified approach means the model has a deep, integrated understanding of both text and visual content. The model supports three output resolutions (1024x1024, 1024x1536, and 1536x1024) and can produce up to 5 images per request. It includes built-in safety systems with C2PA metadata for content provenance.

What truly sets GPT-4o Image apart is its exceptional text rendering — it can accurately place readable text on signs, labels, posters, and within complex designs, something diffusion models have historically struggled with. Combined with conversational editing, where users can iteratively refine images through follow-up prompts, it offers an intuitive creative workflow that bridges the gap between ideation and final output.

Capabilities

Key Features

GPT-4o Image brings native multimodal generation with unmatched text rendering and prompt fidelity.

Native Multimodal

Image generation built directly into the GPT-4o language model, enabling seamless interleaving of text and image outputs without a separate diffusion pipeline.

Superior Text Rendering

Excels at rendering accurate, legible text within images — signs, labels, logos, memes, and typographic designs — a major weakness of traditional diffusion models.

Conversational Editing

Supports iterative, multi-turn image editing through natural language instructions, allowing users to refine images through follow-up messages.

Image-to-Image

Accepts input images and transforms them based on text prompts, supporting style transfer, modifications, and creative remixing while preserving key visual elements.

Batch Generation

Generate up to 5 images in a single request, enabling efficient batch workflows for creative and production use cases.

Spatial Reasoning

Leverages GPT-4o's language understanding for precise prompt adherence, accurate object placement, and complex compositional scenes.

AIクリエイティブツール

AI動画&画像生成スイート

動画エフェクト、バーチャル試着、着せ替え、フォトスタジオ、背景除去など——最新AIモデルで実現。

AIキス動画

二人の写真をロマンチックなAIキスアニメーション動画に変換。リアルな表情と自然な動き。

試してみる

AIキスラッシュ動画

自撮りをアップロードすると、魅力的な人々が両側から現れてキスするユニークな動画を生成。

試してみる

AIハグ動画

写真を心温まるAIハグ動画に変換。自然な抱擁アニメーションと感情豊かな表情。

試してみる

AIビキニジェネレーター

スタイリッシュなビキニや水着への滑らかで自然な着替え動画を生成。

試してみる

AI筋肉ジェネレーター

写真からドラマチックな筋肉成長変身動画を作成。リアルなボディアニメーション。

試してみる

AIダンス動画

誰でもエネルギッシュなダンスを踊らせる。流れるような体の動きとリズミカルなモーション。

試してみる

Ghibli AI動画

写真を魔法のようなスタジオGhibli風アニメーションに変換。柔らかな水彩テクスチャと幻想的な動き。

試してみる

AI GIFジェネレーター

静止画からシームレスにループするアニメーションGIFを作成。SNSやメッセージに最適。

試してみる

AI漢服変身

写真を美しい伝統的な中国漢服の変身動画に変換。流れるシルクの長衣と精巧な刺繍付き。

試してみる

AI着せ替え動画

シームレスな衣装チェンジ動画を作成 — AIが5〜6種類のスタイリッシュな衣装を循環。アイデンティティを保持。

試してみる

Frequently Asked Questions

Everything you need to know about GPT-4o Image

START CREATING

Create Images with
GPT-4o's Native Generation

Industry-leading text rendering, conversational editing, and precise prompt adherence powered by OpenAI's multimodal architecture.