Category · 95 models

Make Pictures

Type a description, get an image — for marketing, design, and ideation.

What it is

You describe what you want in words and the AI paints it. Useful for marketing visuals, product mockups, social posts, or just exploring ideas without hiring a designer for every iteration.

Real-world examples

  • Create a hero image for a landing page
  • Design a logo concept in five styles
  • Generate product photos in different settings
  • Illustrate a children's book

What to look for

  • How well it follows the prompt
  • Whether it can render legible text in images
  • Consistency across a series

95 models in this category

GPT-image-1

OpenAI

AIDB93

Production-grade image generation API with strong text rendering.

Image Generation
Image · Proprietary

DALL·E 3

OpenAI

AIDB91

Prompt-faithful image generator integrated across ChatGPT.

Image Generation
Image · Proprietary

Imagen 4

Google DeepMind

AIDB92

Photoreal image model with sharp typography and detail.

Image Generation
Image · Proprietary

Midjourney v7

Midjourney

AIDB84

Aesthetic-first image model beloved by designers and concept artists.

Image Generation
Image · Proprietary

Stable Diffusion 3.5

Stability AI

AIDB90

Open-weights image generator with strong fine-tuning ecosystem.

Image Generation
Image · Open Weights

FLUX.1

Black Forest Labs

AIDB90

State-of-the-art open image model from ex-Stable Diffusion researchers.

Image Generation
Image · Open Weights

Ideogram 2.0

Ideogram

AIDB83

Image model specialised in legible in-image typography and logos.

Image Generation
Image · Proprietary

DINOv3

Meta

AIDB92

Self-supervised vision foundation model for image features.

Image Understanding · Embeddings
Image · Open

Aurora

xAI

AIDB87

Photoreal autoregressive image generation model.

Image Generation
Image · Proprietary

Recraft V3

Recraft

AIDB85

Image model designed for brand & vector-style design assets.

Image Generation
Image · Proprietary

Adobe Firefly Image 4

Adobe

AIDB92

Commercially-safe image model trained on licensed data.

Image Generation
Image · Proprietary

Leonardo Phoenix

Leonardo.Ai

AIDB82

In-house foundation model with strong prompt adherence.

Image Generation
Image · Proprietary

Playground v3

Playground

AIDB83

Image model focused on graphic design and typography.

Image Generation
Image · Proprietary

FLUX.1 Kontext

Black Forest Labs

AIDB88

Image editing model with character & style consistency.

Image Generation
Image · Open Weights

HiDream-I1

HiDream

AIDB87

Open 17B image generation model topping benchmarks.

Image Generation
Image · Open Weights

Apple Intelligence

Apple

AIDB91

On-device + private cloud generative AI across iPhone, iPad and Mac.

Multimodal · Agents · Image Generation
On-device · Proprietary

Samsung Galaxy AI

Samsung

AIDB87

Suite of on-device + cloud AI features for Galaxy phones (translate, edit, summarize).

Multimodal · Audio / Speech · Image Generation
Hybrid · Proprietary

Samsung Gauss2

Samsung

AIDB82

Samsung's in-house generative model family for Galaxy products.

Text Generation · Code · Image Generation
Text + Image · Proprietary

Amazon Nova

AWS

AIDB94

Amazon's foundation model family (text, image, video) on Bedrock.

Multimodal · Image Generation · Video Generation
Text + Image + Video · Proprietary

Adobe Firefly

Adobe

AIDB89

Commercially-safe generative-AI models for image, vector and video.

Image Generation · Video Generation
Image + Video · Proprietary

Canva Magic Studio

Canva

AIDB87

Suite of AI design tools for image, video, copy and presentations.

Image Generation · Text Generation
SaaS · Proprietary

Azure OpenAI Service

Microsoft

AIDB93

Enterprise access to GPT, o-series and DALL·E models on Azure.

Text Generation · Image Generation
API · Proprietary

Nano Banana Pro

Google DeepMind

AIDB94

Gemini-powered flagship image generation and editing model with best-in-class text.

Image Generation
Image · Proprietary

Shopify Magic

Shopify

AIDB87

Generative AI across the Shopify admin — product descriptions, emails, blog posts and image edits.

Text Generation · Image Generation
SaaS · Proprietary

Cloudflare Workers AI

Cloudflare

AIDB88

Serverless GPU inference platform running open models at the edge.

Text Generation · Embeddings · Image Generation
Platform · Proprietary

Pinterest Performance+

Pinterest

AIDB87

GenAI ads platform that builds creative and optimizes targeting automatically.

Image Generation · Agents
SaaS · Proprietary

Stable Diffusion 3

Stability AI

AIDB87

Diffusion models create data from noise by inverting the forward paths of data towards noise and have emerged as a powerful generative modeling technique for high-dimensional, perceptual data such as images and videos.

Image Generation
Image · Proprietary

Claude 3 Sonnet

Anthropic

AIDB92

Anthropic's multimodal, language, vision model tracked by Epoch, focused on chat.

Code · Image Generation · Multimodal
Image · Proprietary

Claude 3 Opus

Anthropic

AIDB92

Anthropic's multimodal, language, vision model tracked by Epoch, focused on chat.

Code · Image Generation · Multimodal
Image · Proprietary

GPT-4 Turbo (Apr 2024)

OpenAI

AIDB92

Today, we shared dozens of new additions and improvements, and reduced pricing across many parts of our platform.

Image Generation · Multimodal · Text Generation
Image · Proprietary

Reka Core

Reka AI

AIDB81

We introduce Reka Core, Flash, and Edge, a series of powerful multimodal language models trained from scratch by Reka.

Audio / Speech · Code · Image Generation
Audio · Proprietary

VILA1.5-13B

NVIDIA

AIDB90

Visual language models (VLMs) rapidly progressed with the recent success of large language models.

Image Generation · Multimodal · Text Generation
Video · Open Weights

Claude 3.5 Sonnet

Anthropic

AIDB94

This addendum to our Claude 3 Model Card describes Claude 3.5 Sonnet, a new model which outperforms our previous most capable model, Claude 3 Opus, while operating faster and at a lower cost.

Code · Image Generation · Multimodal
Image · Proprietary

Ernie 4.0 Turbo

Baidu

AIDB82

Baidu's multimodal, language, vision model tracked by Epoch, focused on vision-language generation.

Image Generation · Multimodal · Text Generation
Image · Proprietary

SenseChat 5.5

SenseTime

AIDB81

SenseTime's multimodal, language, vision model tracked by Epoch, focused on vision-language generation.

Image Generation · Multimodal · Reasoning
Image · Proprietary

LLaVA-OV-72B

ByteDance

AIDB83

We present LLaVA-OneVision, a family of open large multimodal models (LMMs) developed by consolidating our insights into data, models, and visual representations in the LLaVA-NeXT blog series.

Image Generation · Multimodal · Text Generation
Video · Open Weights

GPT-4o (Aug 2024)

OpenAI

AIDB91

We’re announcing GPT-4o, our new flagship model that can reason across audio, vision, and text in real time.

Audio / Speech · Image Generation · Multimodal
Audio · Proprietary

Grok-2

xAI

AIDB89

Grok-2 is our frontier language model with state-of-the-art reasoning capabilities.

Code · Image Generation · Multimodal
Image · Proprietary

Oryx 34B

Tsinghua University

AIDB86

Visual data comes in various forms, ranging from small icons of just a few pixels to long videos spanning hours.

3D · Image Generation · Multimodal
3D · Open Weights

PixelDance

ByteDance

AIDB84

PixelDance V1.4 is a video generation model developed by the ByteDance Research team, using the DiT structure.

Image Generation · Video Generation
Video · Proprietary

Movie Gen Video

Meta

AIDB90

We present Movie Gen, a cast of foundation models that generates high-quality, 1080p HD videos with different aspect ratios and synchronized audio.

Image Generation · Video Generation
Video · Proprietary

NVLM-X 72B

NVIDIA

AIDB92

NVIDIA's vision, language model tracked by Epoch, focused on language modeling/generation.

Code · Image Generation · Reasoning
Image · Proprietary

NVLM-H 72B

NVIDIA

AIDB89

NVIDIA's vision, language model tracked by Epoch, focused on language modeling/generation.

Code · Image Generation · Reasoning
Image · Proprietary

NVLM-D 72B

NVIDIA

AIDB87

NVIDIA's vision, language model tracked by Epoch, focused on language modeling/generation.

Code · Image Generation · Reasoning
Image · Open Weights

SeedEdit

ByteDance

AIDB84

We introduce SeedEdit, a diffusion model that is able to revise a given image with any text prompt.

Image Generation
Image · Proprietary

GPT-4o (Nov 2024)

OpenAI

AIDB92

We’re announcing GPT-4o, our new flagship model that can reason across audio, vision, and text in real time.

Audio / Speech · Image Generation · Multimodal
Audio · Proprietary

Amazon Nova Pro

Amazon

AIDB93

A highly capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks.

Code · Image Generation · Multimodal
Video · Proprietary

NVILA 15B

NVIDIA

AIDB91

Visual language models (VLMs) have made significant advances in accuracy in recent years.

Image Generation · Multimodal · Text Generation
Video · Open Weights

Infinity

ByteDance

AIDB85

We present Infinity, a Bitwise Visual AutoRegressive Modeling approach capable of generating high-resolution, photorealistic images following language instructions.

Image Generation
Image · Open Weights

Sora Turbo

OpenAI

AIDB92

Our video generation model is rolling out at sora.com.

Image Generation · Video Generation
Video · Proprietary

Gemini 2.0 Pro

Google DeepMind

AIDB94

Today, we’re releasing an experimental version of Gemini 2.0 Pro that responds to that feedback.

Audio / Speech · Code · Image Generation
Audio · Proprietary

Veo 2

Google DeepMind

AIDB93

Google DeepMind's video, vision model tracked by Epoch, focused on video generation.

Image Generation · Video Generation
Video · Proprietary

Kimi k1.5

Moonshot AI

AIDB84

Language model pretraining with next token prediction has proved effective for scaling compute but is limited to the amount of available training data.

Code · Image Generation · Multimodal
Image · Proprietary

GPT-4o (Jan 2025)

OpenAI

AIDB92

We’re announcing GPT-4o, our new flagship model that can reason across audio, vision, and text in real time.

Audio / Speech · Image Generation · Multimodal
Audio · Proprietary

Grok 3

xAI

AIDB93

We are pleased to introduce Grok 3, our most advanced model yet: blending strong reasoning with extensive pretraining knowledge.

Code · Image Generation · Multimodal
Image · Proprietary

GPT-4.5

OpenAI

AIDB92

We advance AI capabilities by scaling two complementary paradigms: unsupervised learning and reasoning.

Code · Image Generation · Multimodal
Image · Proprietary

Mistral OCR

Mistral AI

AIDB88

Mistral OCR is an Optical Character Recognition API that sets a new standard in document understanding.

Image Generation · Multimodal · Text Generation
Image · Proprietary

ERNIE-4.5-VL-424B-A47B (文心大模型4.5)

Baidu

AIDB85

In this report, we introduce ERNIE 4.5, a new family of large-scale multimodal models comprising 10 distinct variants.

Code · Image Generation · Multimodal
Video · Open Weights

Gemini 2.5 Pro (Mar 2025)

Google DeepMind

AIDB92

Gemini 2.5 Pro Experimental is our most advanced model for complex tasks.

Audio / Speech · Code · Image Generation
Audio · Proprietary

GPT-4o (Mar 2025)

OpenAI

AIDB95

We’re announcing GPT-4o, our new flagship model that can reason across audio, vision, and text in real time.

Audio / Speech · Image Generation · Multimodal
Audio · Proprietary

Llama 4 Scout

Meta

AIDB91

We’re sharing the first models in the Llama 4 herd, which will enable people to build more personalized multimodal experiences.

Code · Image Generation · Multimodal
Image · Open Weights

Llama 4 Maverick

Meta

AIDB90

We’re sharing the first models in the Llama 4 herd, which will enable people to build more personalized multimodal experiences.

Code · Image Generation · Multimodal
Image · Open Weights

Llama 4 Behemoth (preview)

Meta

AIDB92

We’re sharing the first models in the Llama 4 herd, which will enable people to build more personalized multimodal experiences.

Code · Image Generation · Multimodal
Image · Proprietary

Gemini 2.5 Pro (May 2025)

Google DeepMind

AIDB95

Gemini 2.5 Pro Experimental is our most advanced model for complex tasks.

Audio / Speech · Code · Image Generation
Audio · Proprietary

Seed1.5-VL

ByteDance

AIDB86

We present Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning.

Image Generation · Multimodal · Text Generation
Video · Proprietary

Claude Sonnet 4

Anthropic

AIDB89

Claude Sonnet 4 can understand nuanced instructions and context, recognize and correct its own mistakes, and create sophisticated analysis and insights from complex data.

Agents · Code · Image Generation
Image · Proprietary

Gemini 2.5 Pro (Jun 2025)

Google DeepMind

AIDB95

Gemini 2.5 Pro Experimental is our most advanced model for complex tasks.

Audio / Speech · Code · Image Generation
Audio · Proprietary

Seed-1.6-Thinking

ByteDance

AIDB83

Seed1.6 is the latest general-purpose model series unveiled by the ByteDance Seed team.

Image Generation · Multimodal · Text Generation
Image · Proprietary

Gemini 2.5 Deep Think

Google DeepMind

AIDB94

To advance Gemini’s capabilities towards solving hard reasoning problems, we developed a novel reasoning approach, called Deep Think, that naturally blends in parallel thinking techniques during response generation.

Audio / Speech · Code · Image Generation
Audio · Proprietary

Qwen Image

Alibaba

AIDB88

We present Qwen-Image, an image generation foundation model in the Qwen series that achieves significant advances in complex text rendering and precise image editing.

Image Generation
Image · Open Weights

Claude Opus 4.1

Anthropic

AIDB92

Today we're releasing Claude Opus 4.1, an upgrade to Claude Opus 4 on agentic tasks, real-world coding, and reasoning.

Agents · Code · Image Generation
Image · Proprietary

GPT-5 nano

OpenAI

AIDB94

OpenAI's multimodal, language, vision model tracked by Epoch, focused on language modeling/generation.

Image Generation · Multimodal · Text Generation
Image · Proprietary

GPT-5 mini

OpenAI

AIDB93

OpenAI's multimodal, language, vision model tracked by Epoch, focused on language modeling/generation.

Image Generation · Multimodal · Text Generation
Image · Proprietary

Gemini 2.5 Flash Image (Nano Banana)

Google

AIDB94

Text-to-Image: Generate high-quality images from simple or complex text descriptions.

Image Generation
Image · Proprietary

Qwen3-Omni-30B-A3B

Alibaba

AIDB87

We present Qwen3-Omni, a single multimodal model that, for the first time, maintains state-of-the-art performance across text, image, audio, and video without any degradation relative to single-modal counterparts.

Audio / Speech · Image Generation · Multimodal
Audio · Open Weights

Gemini Robotics-ER 1.5

Google DeepMind

AIDB94

Our most capable vision-language model (VLM) reasons about the physical world, natively calls digital tools and creates detailed, multi-step plans to complete a mission.

Audio / Speech · Image Generation · Text Generation
Audio · Proprietary

GPT-5 Pro

OpenAI

AIDB92

OpenAI's multimodal, language, vision model tracked by Epoch, focused on language modeling/generation.

Image Generation · Multimodal · Text Generation
Image · Proprietary

Veo 3.1

Google DeepMind

AIDB94

We’re also introducing Veo 3.1, which brings richer audio, more narrative control, and enhanced realism that captures true-to-life textures.

Image Generation · Video Generation
Video · Proprietary

GPT-5.1 Instant

OpenAI

AIDB91

Today we’re upgrading the GPT-5 series with the release of GPT-5.1 Instant: our most-used model, now warmer, more intelligent, and better at following your instructions.

Image Generation · Multimodal · Text Generation
Image · Proprietary

Gemini 3 Pro Image (Nano Banana Pro)

Google DeepMind

AIDB93

Today, we’re introducing Nano Banana Pro (Gemini 3 Pro Image), our new state-of-the-art image generation and editing model.

Image Generation
Image · Proprietary

GPT-5.2 Pro

OpenAI

AIDB92

OpenAI's multimodal, language, vision model tracked by Epoch, focused on language modeling/generation.

Image Generation · Multimodal · Text Generation
Image · Proprietary

HyperCLOVA X SEED 32B Think

NAVER

AIDB84

Developed by Naver, South Korea’s leading AI research lab, this cutting-edge language model supports multimodal inputs and advanced reasoning.

Image Generation · Multimodal · Text Generation
Image · Proprietary

Seedance 2.0

ByteDance

AIDB82

ByteDance's image generation, video, audio model tracked by Epoch, focused on video generation.

Audio / Speech · Image Generation · Video Generation
Audio · Proprietary

Qwen3.5 397B-A17B

Alibaba

AIDB87

We are delighted to announce the official release of Qwen3.5, introducing the open weights of the first model in the Qwen3.5 series, Qwen3.5-397B-A17B.

Image Generation · Text Generation
Image · Open Weights

Gemini 3.1 Pro

Google DeepMind

AIDB94

Last week, we released a major update to Gemini 3 Deep Think to solve modern challenges across science, research and engineering.

Image Generation · Text Generation
Image · Proprietary

GPT-5.4 Pro

OpenAI

AIDB94

OpenAI's multimodal, language, vision model tracked by Epoch, focused on language modeling/generation.

Image Generation · Multimodal · Text Generation
Image · Proprietary

GPT-5.4

OpenAI

AIDB92

OpenAI's multimodal, language, vision model tracked by Epoch, focused on language modeling/generation.

Image Generation · Multimodal · Text Generation
Image · Proprietary

GPT Image 2

OpenAI

AIDB91

OpenAI's image generation model tracked by Epoch, focused on image generation.

Image Generation
Image · Proprietary

GPT-5.5 Pro

OpenAI

AIDB91

OpenAI's multimodal, language, vision model tracked by Epoch, focused on language modeling/generation.

Image Generation · Multimodal · Text Generation
Image · Proprietary

GPT-5.5

OpenAI

AIDB92

OpenAI's multimodal, language, vision model tracked by Epoch, focused on language modeling/generation.

Image Generation · Multimodal · Text Generation
Image · Proprietary

Krea 1

Krea AI

AIDB81

Krea's in-house image model tuned for aesthetic control and real-time iteration.

Image Generation
Image · Proprietary

Mirage

Decart

AIDB83

Real-time generative world model that re-skins live video streams with text prompts.

Video Generation · Image Generation
Video · Proprietary

Lens

Microsoft

AIDB93

Microsoft's open-source 3.8B text-to-image model focused on efficient training, fast high-res generation, and strong prompt adherence.

Image Generation
Image · Open Weights

Gemini Omni Flash

Google DeepMind

AIDB94

Google DeepMind's closed-source multimodal video creation and editing model that generates or edits video from text, image, video, and audio references.

Video Generation · Image Generation · Multimodal
Text + Image + Video + Audio · Proprietary

OmniCraft Texture Generator

Deemos Technologies

AIDB84

Hyper3D OmniCraft Texture generates photorealistic, seamless, tileable PBR textures for 3D assets and design pipelines.

3D · Image Generation
3D + Image · Proprietary
