Search the AI Database — Find Any AI Model

GPT-5

OpenAI

OpenAI's flagship multimodal reasoning model with long-context tool use.

Text Generation, Reasoning, Multimodal

Text + Image + Audio

GPT-4o

OpenAI

Real-time omni-model handling text, vision and voice in a single network.

Multimodal, Audio / Speech, Image Understanding

Text + Image + Audio

o3

OpenAI

Frontier reasoning model tuned for math, science and coding workflows.

Reasoning, Code

Text

Claude Sonnet 4.5

Anthropic

Anthropic's best coding and agentic model, strong at long autonomous tasks.

Code, Reasoning, Agents

Text + Image

Claude Opus 4

Anthropic

Top-tier reasoning model for research, analysis and complex writing.

Reasoning, Text Generation

Text + Image

Gemini 2.5 Pro

Google DeepMind

Long-context multimodal model with native tool use and 1M+ token window.

Multimodal, Reasoning, Code

Text + Image + Video + Audio

Gemini 2.5 Flash

Google DeepMind

Fast, cheap multimodal model optimised for high-volume production use.

Text Generation, Multimodal

Text + Image

Grok 4

xAI

xAI's flagship reasoning model with real-time X knowledge and tool use.

Reasoning, Text Generation, Agents

Text + Image

Llama 4

Meta

Meta's open-weights multimodal MoE family (Scout & Maverick).

Multimodal, Text Generation, Code

Text + Image

DeepSeek-V3

DeepSeek

High-performance open MoE LLM rivaling closed frontier models on benchmarks.

Text Generation, Reasoning, Code

Text

DeepSeek-R1

DeepSeek

Open reasoning model trained with RL, competitive with o1-class systems.

Reasoning, Code

Text

Mistral Large 2

Mistral AI

European frontier LLM strong at code, math and multilingual tasks.

Text Generation, Code

Text

Qwen3

Alibaba

Open multilingual model family with hybrid thinking modes.

Text Generation, Reasoning, Code

Text + Image

Command R+

Cohere

Enterprise-grade RAG and tool-use model for business workloads.

Text Generation, Agents

Text

Phi-4

Microsoft

Small language model punching above its weight on reasoning benchmarks.

Reasoning, Text Generation

Text

GPT-image-1

OpenAI

Production-grade image generation API with strong text rendering.

Image Generation

Image

DALL·E 3

OpenAI

Prompt-faithful image generator integrated across ChatGPT.

Image Generation

Image

Imagen 4

Google DeepMind

Photoreal image model with sharp typography and detail.

Image Generation

Image

Midjourney v7

Midjourney

Aesthetic-first image model beloved by designers and concept artists.

Image Generation

Image

Stable Diffusion 3.5

Stability AI

Open-weights image generator with strong fine-tuning ecosystem.

Image Generation

Image

FLUX.1

Black Forest Labs

State-of-the-art open image model from ex-Stable Diffusion researchers.

Image Generation

Image

Ideogram 2.0

Ideogram

Image model specialised in legible in-image typography and logos.

Image Generation

Image

Sora

OpenAI

Text-to-video model producing minute-long cinematic clips.

Video Generation

Video

Veo 3

Google DeepMind

High-fidelity video generation with native synchronised audio.

Video Generation, Audio / Speech

Video + Audio

Runway Gen-4

Runway

Pro video generation with consistent characters and worlds.

Video Generation

Video

Kling 2.0

Kuaishou

Chinese text-to-video model with strong physical realism.

Video Generation

Video

Pika 2.0

Pika Labs

Creative video generator with scene ingredients and edits.

Video Generation

Video

Whisper v3

OpenAI

Open multilingual speech recognition and translation model.

Audio / Speech

Audio

ElevenLabs v3

ElevenLabs

Best-in-class expressive TTS and voice cloning across 70+ languages.

Audio / Speech

Audio

Suno v4

Suno

Generates full songs with vocals from a text prompt.

Music

Audio

Udio

Text-to-music model focused on production-quality tracks.

Music

Audio

Claude Code

Anthropic

Agentic coding tool that lives in your terminal and edits real codebases.

Code, Agents

Text

GitHub Copilot

GitHub / OpenAI

In-IDE pair programmer powering code completion and chat.

Code, Agents

Text

Cursor

Anysphere

AI-first code editor with multi-file edits and background agents.

Code, Agents

Text

Devin

Cognition

Autonomous software engineer that plans, codes and ships PRs.

Agents, Code

Text

Codestral

Mistral AI

Code-specialised open model covering 80+ programming languages.

Code

Text

AlphaFold 3

Google DeepMind / Isomorphic

Predicts the structure and interactions of life's molecules.

Reasoning

Structured

Med-PaLM 2

Google

Medical LLM achieving expert-level performance on USMLE-style questions.

Reasoning, Text Generation

Text

Evo 2

Arc Institute

Genome-scale foundation model spanning DNA, RNA and proteins.

Reasoning

Sequence

BloombergGPT

Bloomberg

Finance-domain LLM trained on decades of market and news data.

Text Generation, Reasoning

Text

Harvey

Generative AI platform purpose-built for elite law firms.

Text Generation, Agents

Text

Khanmigo

Khan Academy

AI tutor that guides students with Socratic questioning.

Text Generation, Agents

Text

Perplexity

Answer engine combining LLMs with cited live web search.

Agents, Text Generation

Text

NotebookLM

Google

Source-grounded research assistant with audio overviews.

Text Generation, Audio / Speech

Text + Audio

text-embedding-3-large

OpenAI

High-dimensional embeddings for search, RAG and clustering.

Embeddings

Text

Voyage-3

Voyage AI

Top-ranked retrieval embeddings, optimised for RAG quality.

Embeddings

Text

SAM 2

Meta

Segment Anything for images and video, in real time.

Image Understanding

Image + Video

RT-2

Google DeepMind

Vision-language-action model that controls robots from web knowledge.

Multimodal, Agents

Vision + Action

Genesis

Genesis Embodied AI

Generative physics platform for robotics simulation and 4D worlds.

3D, Agents

Simulation

TripoSR

Stability AI / Tripo

Fast single-image to 3D mesh reconstruction model.

3D

3D

Jasper

Marketing copilot for brand-aware content at enterprise scale.

Text Generation, Agents

Text

Operator

OpenAI

Browser-using agent that performs tasks on the open web.

Agents

Web

GPT-4.1

OpenAI

Improved GPT-4 series model with stronger coding and instruction following.

Text Generation, Code, Reasoning

Text + Image

GPT-4.1 mini

OpenAI

Smaller, faster GPT-4.1 for production workloads.

Text Generation, Code

Text + Image

GPT-4.1 nano

OpenAI

Cheapest, fastest GPT-4.1 tier for high-volume tasks.

Text Generation

Text

o4-mini

OpenAI

Compact reasoning model balancing cost and quality.

Reasoning, Code

Text + Image

GPT-4o mini

OpenAI

Cost-efficient multimodal small model.

Text Generation, Multimodal

Text + Image

GPT-OSS 120B

OpenAI

Larger of OpenAI's first open-weights models since GPT-2, built for reasoning.

Reasoning, Text Generation

Text

TTS-1 / GPT-4o Voice

OpenAI

OpenAI text-to-speech voices via the audio API.

Audio / Speech

Audio

Claude Haiku 4.5

Anthropic

Anthropic's fastest and cheapest frontier-class small model.

Text Generation, Code

Text + Image

Claude 3.7 Sonnet

Anthropic

Hybrid reasoning model with extended thinking mode.

Reasoning, Code

Text + Image

Claude 3.5 Haiku

Anthropic

Fast, low-cost model for everyday tasks.

Text Generation

Text + Image

Gemini 2.5 Flash-Lite

Google DeepMind

Smallest, cheapest Gemini for high-volume tasks.

Text Generation

Text

Gemini 2.0 Flash

Google DeepMind

Multimodal model with native tool use and live API.

Multimodal, Agents

Text + Image + Audio

Gemma 3

Google

Open-weights model family in 1B–27B sizes for on-device & server.

Text Generation

Text + Image

PaliGemma 2

Google

Open vision-language models for fine-tuning.

Image Understanding

Image + Text

Lyria 2

Google DeepMind

Google's professional music generation model.

Music

Audio

Chirp 3

Google

High-fidelity expressive TTS voices on Google Cloud.

Audio / Speech

Audio

Llama 3.3 70B

Meta

Open-weights instruct model competitive with much larger LLMs.

Text Generation, Code

Text

Llama 3.2 Vision

Meta

Open multimodal model in 11B and 90B sizes.

Multimodal, Image Understanding

Text + Image

DINOv3

Meta

Self-supervised vision foundation model for image features.

Image Understanding, Embeddings

Image

Seamless M4T v2

Meta

Multilingual speech-to-speech and speech-to-text translation.

Audio / Speech

Audio + Text

MusicGen

Meta

Open text-to-music model from AudioCraft.

Music

Audio

Grok 4 Heavy

xAI

Multi-agent variant of Grok 4 for the hardest problems.

Reasoning, Agents

Text + Image

Grok Code Fast 1

xAI

Speed-optimised code model for agentic IDE workflows.

Code, Agents

Text

Aurora

xAI

Photoreal autoregressive image generation model.

Image Generation

Image

Mistral Medium 3

Mistral AI

Frontier-class performance at a fraction of the cost.

Text Generation, Code

Text + Image

Mistral Small 3.2

Mistral AI

Fast open-weights small model with strong reasoning.

Text Generation

Text

Pixtral Large

Mistral AI

124B multimodal model with state-of-the-art image understanding.

Multimodal, Image Understanding

Text + Image

Mixtral 8x22B

Mistral AI

Sparse mixture-of-experts open model.

Text Generation, Code

Text

Devstral

Mistral AI

Open agentic coding model built with All Hands AI.

Code, Agents

Text

Qwen3-Coder

Alibaba

Open agentic coding model in the Qwen3 family.

Code, Agents

Text

Qwen2.5-VL

Alibaba

Open vision-language model with strong document understanding.

Multimodal, Image Understanding

Text + Image

Wan 2.2

Alibaba

Open text-to-video and image-to-video model.

Video Generation

Video

GLM-4.5

Zhipu AI

Open agentic foundation model from Zhipu's GLM family.

Reasoning, Agents

Text

Kimi K2

Moonshot AI

Trillion-parameter open MoE model with strong agentic skills.

Reasoning, Agents, Code

Text

MiniMax-M1

MiniMax

Open reasoning model with 1M-token context.

Reasoning, Text Generation

Text

Hailuo 02

MiniMax

Cinematic text-to-video generator.

Video Generation

Video

Yi-Lightning

01.AI

Fast, low-cost frontier-tier LLM from 01.AI.

Text Generation, Reasoning

Text

Hunyuan-Large

Tencent

Open MoE model with 389B params from Tencent.

Text Generation

Text

ERNIE 4.5

Baidu

Baidu's flagship multimodal foundation model.

Multimodal, Reasoning

Text + Image

Doubao 1.5 Pro

ByteDance

ByteDance's flagship LLM, widely deployed in China.

Text Generation, Multimodal

Text + Image

Seedance 1.0

ByteDance

ByteDance Seed video generation model.

Video Generation

Video

Recraft V3

Recraft

Image model designed for brand & vector-style design assets.

Image Generation

Image

Adobe Firefly Image 4

Adobe

Commercially-safe image model trained on licensed data.

Image Generation

Image

Leonardo Phoenix

Leonardo.Ai

In-house foundation model with strong prompt adherence.

Image Generation

Image

Playground v3

Playground

Image model focused on graphic design and typography.

Image Generation

Image

FLUX.1 Kontext

Black Forest Labs

Image editing model with character & style consistency.

Image Generation

Image

HiDream-I1

HiDream

Open 17B image generation model topping benchmarks.

Image Generation

Image

Luma Ray 2

Luma AI

Large video generative model with realistic motion.

Video Generation

Video

HeyGen Avatar IV

HeyGen

AI avatar video generator for marketing and training.

Video Generation, Audio / Speech

Video

Synthesia

Enterprise AI video platform with realistic avatars.

Video Generation

Video

HunyuanVideo

Tencent

Open 13B text-to-video model.

Video Generation

Video

Mochi 1

Genmo

Open-source video generation model.

Video Generation

Video

LTX Video

Lightricks

Real-time open video generation model.

Video Generation

Video

Cartesia Sonic

Cartesia

Ultra-low-latency state-space TTS model.

Audio / Speech

Audio

PlayHT 3.0

PlayHT

Conversational TTS optimised for AI agents.

Audio / Speech

Audio

Resemble AI

Voice cloning and real-time speech synthesis platform.

Audio / Speech

Audio

Deepgram Nova-3

Deepgram

Production-grade streaming speech-to-text model.

Audio / Speech

Audio

AssemblyAI Universal-2

AssemblyAI

Highly accurate speech recognition with rich audio intelligence.

Audio / Speech

Audio

Moonshine

Useful Sensors

Open ASR model optimised for real-time edge inference.

Audio / Speech

Audio

Stable Audio 2.0

Stability AI

Generates full-length audio tracks from text.

Music

Audio

Riffusion

AI music generation with vocal & instrumental control.

Music

Audio

Lovable

AI fullstack builder that ships production web apps from prompts.

Code, Agents

Text

Bolt.new

StackBlitz

Browser-based AI agent that builds and runs full-stack apps.

Code, Agents

Text

v0

Vercel

Generative UI tool that produces React + Tailwind components.

Code

Text

Replit Agent

Replit

Agent that creates, edits and deploys apps inside Replit.

Code, Agents

Text

Windsurf (Cascade)

Codeium

Agentic IDE with deep multi-file flows.

Code, Agents

Text

Tabnine

Privacy-first AI code assistant for the enterprise.

Code

Text

Amazon Q Developer

AWS

Coding & cloud assistant deeply integrated with AWS.

Code, Agents

Text

Sourcegraph Cody

Sourcegraph

Code AI with deep codebase context across repos.

Code

Text

Aider

Open-source CLI pair-programmer using your favorite LLM.

Code, Agents

Text

Continue

Open-source AI code assistant for VS Code & JetBrains.

Code

Text

ChatGPT Agent

OpenAI

ChatGPT mode that browses, codes and acts on your behalf.

Agents

Web

Manus

Butterfly Effect

General-purpose autonomous agent that executes long workflows.

Agents

Web

Google AI Mode

Google

Conversational AI search experience in Google.

Agents, Text Generation

Text

You.com

AI assistant combining web search with multi-model chat.

Agents

Text

Brave Leo

Brave

Privacy-respecting AI assistant built into the Brave browser.

Text Generation

Text

Cohere Embed v4

Cohere

Multimodal multilingual embeddings for enterprise RAG.

Embeddings

Text + Image

Jina Embeddings v3

Jina AI

Multilingual long-context embedding model.

Embeddings

Text

BGE-M3

BAAI

Open multi-functional, multilingual embedding model.

Embeddings

Text

Nomic Embed v2

Nomic

Open MoE multilingual text embeddings.

Embeddings

Text

ESM3

EvolutionaryScale

Frontier protein language model for biology design.

Reasoning

Sequence

RFdiffusion

Baker Lab

Open diffusion model for de novo protein design.

Reasoning

Sequence

Boltz-1

MIT / Recursion

Open AlphaFold3-class biomolecular structure prediction model.

Reasoning

Structured

GraphCast

Google DeepMind

Best-in-class medium-range global weather forecasting AI.

Reasoning

Structured

GenCast

Google DeepMind

Probabilistic AI weather forecasting beating ENS.

Reasoning

Structured

Tx-LLM

Google

LLM tuned for therapeutic and drug development tasks.

Reasoning

Text

π0 (Pi-Zero)

Physical Intelligence

Generalist vision-language-action model for robots.

Multimodal, Agents

Vision + Action

Helix

Figure

Vision-language-action model for humanoid robot control.

Multimodal, Agents

Vision + Action

GR00T N1

NVIDIA

Open foundation model for humanoid robots.

Multimodal, Agents

Vision + Action

Cosmos

NVIDIA

World foundation models for physical AI simulation.

3D, Agents

Simulation

Genie 3

Google DeepMind

Real-time interactive world model from a text prompt.

3D, Agents

Simulation

Meshy 5

Meshy

Text & image to 3D model generator for creators.

3D

3D

Rodin Gen-2

Hyper3D

High-fidelity 3D asset generation with PBR textures.

3D

3D

Tripo 3.0

Tripo AI

Production-quality text and image to 3D.

3D

3D

CoCounsel

Thomson Reuters

Generative AI legal assistant for lawyers.

Text Generation, Agents

Text

Hebbia Matrix

Hebbia

Agentic research platform for finance and legal teams.

Agents, Text Generation

Text

Glean Assistant

Glean

Enterprise AI assistant grounded in company knowledge.

Agents, Text Generation

Text

Microsoft 365 Copilot

Microsoft

AI assistant across Word, Excel, Outlook, PowerPoint and Teams.

Text Generation, Agents

Text

Gemini for Workspace

Google

AI in Gmail, Docs, Sheets, Slides and Meet.

Text Generation

Text

Notion AI

Notion

AI for writing, search and meeting notes inside Notion.

Text Generation, Agents

Text

Copy.ai

GTM AI platform for marketing and sales workflows.

Text Generation, Agents

Text

Writer Palmyra X5

Writer

Enterprise LLM family powering Writer's generative platform.

Text Generation

Text

Pi

Inflection

Personal AI focused on emotionally intelligent conversation.

Text Generation

Text

Character.AI

Platform for creating and chatting with AI characters.

Text Generation

Text

Duolingo Max

Duolingo

AI-powered language tutoring features.

Text Generation, Audio / Speech

Text + Audio

OLMo 2

Allen Institute (AI2)

Fully open language model with training data and code released.

Text Generation

Text

Falcon 3

TII

Open LLM family from the Technology Innovation Institute.

Text Generation

Text

SmolLM3

Hugging Face

Compact open model strong in its size class.

Text Generation

Text

Reka Flash 3

Reka

Multimodal frontier model with open weights.

Multimodal, Reasoning

Text + Image + Video + Audio

Nemotron 4

NVIDIA

NVIDIA's open LLM family for synthetic data and reasoning.

Text Generation, Reasoning

Text

IBM Granite 3

IBM

Open enterprise-ready foundation models.

Text Generation, Code

Text

Snowflake Arctic

Snowflake

Open enterprise LLM optimised for SQL and coding.

Code, Text Generation

Text

Databricks DBRX

Databricks

Open MoE LLM tuned for enterprise tasks.

Text Generation

Text

Dell AI Factory (with NVIDIA)

Dell Technologies

End-to-end AI infrastructure stack combining PowerEdge servers, storage, networking and NVIDIA AI Enterprise.

Agents, Reasoning

Infrastructure

Dell Pro AI Studio

Dell Technologies

Toolkit for deploying on-device AI models to Dell AI PCs at scale.

Agents

On-device

Lenovo AI Now

Lenovo

On-device AI assistant running locally on Lenovo AI PCs for private productivity.

Agents, Text Generation

On-device

Lenovo Hybrid AI Advantage

Lenovo

Hybrid AI platform spanning ThinkSystem servers, ThinkEdge devices and managed services.

Agents

Infrastructure

AMD Instinct MI350

AMD

Datacenter accelerator (CDNA 4) for training and inference of frontier models.

Reasoning

Accelerator

AMD Ryzen AI 300

AMD

Laptop CPU with XDNA 2 NPU delivering 50+ TOPS for Copilot+ AI PCs.

Agents

On-device

AMD ROCm

AMD

Open software stack for GPU compute and AI on Instinct & Radeon hardware.

Code

Stack

NVIDIA DGX Cloud

NVIDIA

Managed AI training service on dedicated NVIDIA Hopper/Blackwell clusters.

Reasoning

Cloud

NVIDIA Blackwell B200

NVIDIA

Flagship datacenter GPU for trillion-parameter AI training and inference.

Reasoning

Accelerator

NVIDIA NIM Microservices

NVIDIA

Containerized inference microservices for deploying optimized AI models anywhere.

Agents

Service

NVIDIA Jetson Thor

NVIDIA

Edge robotics platform built on Blackwell for humanoid and physical AI.

Agents

Edge

NVIDIA DRIVE Thor

NVIDIA

Centralized car computer for AV, cockpit AI and infotainment.

Agents, Multimodal

Edge

Intel Gaudi 3

Intel

Datacenter AI accelerator targeting price/performance vs H100.

Reasoning

Accelerator

Intel Core Ultra (Lunar Lake)

Intel

Laptop CPU with integrated NPU powering Copilot+ on-device AI.

Agents

On-device

Intel OpenVINO

Intel

Open toolkit to optimize and deploy AI inference across Intel CPUs, GPUs and NPUs.

Code

Toolkit

Qualcomm Snapdragon X Elite

Qualcomm

Arm laptop SoC with 45 TOPS Hexagon NPU for Copilot+ PCs.

Agents

On-device

Qualcomm AI Hub

Qualcomm

Library of optimized AI models ready to deploy on Snapdragon devices.

Multimodal, Code

On-device

Apple Intelligence

Apple

On-device + private cloud generative AI across iPhone, iPad and Mac.

Multimodal, Agents, Image Generation

On-device

Apple Neural Engine (M4)

Apple

38 TOPS NPU integrated in Apple Silicon for on-device ML workloads.

Multimodal

Accelerator

Google Tensor G5

Google

Pixel SoC powering on-device Gemini Nano features.

Multimodal, Audio / Speech

On-device

Google TPU v5p / Trillium

Google

Custom AI accelerators powering Gemini training and Google Cloud AI.

Reasoning

Accelerator

Gemini Nano (on Pixel)

Google

Smallest Gemini model running fully on-device on Pixel and Android.

Text Generation, Multimodal

On-device

Samsung Galaxy AI

Samsung

Suite of on-device + cloud AI features for Galaxy phones (translate, edit, summarize).

Multimodal, Audio / Speech, Image Generation

Hybrid

Samsung Gauss2

Samsung

Samsung's in-house generative model family for Galaxy products.

Text Generation, Code, Image Generation

Text + Image

HPE Private Cloud AI

Hewlett Packard Enterprise

Turnkey on-prem AI cloud built with NVIDIA, co-engineered for enterprises.

Agents, Reasoning

Infrastructure

HP AI Companion

HP

Local AI assistant bundled with HP AI PCs for private document Q&A.

Text Generation, Agents

On-device

IBM watsonx

IBM

Enterprise AI & data platform with model studio, governance and runtime.

Agents, Text Generation

Platform

Cisco AI Defense

Cisco

Security platform that protects AI applications from misuse and attacks.

Agents

Security

Cisco AI Pods

Cisco

Pre-validated infrastructure stacks for inference at the edge of the enterprise.

Agents

Infrastructure

Pure Storage AIRI//S

Pure Storage

AI-ready storage stack co-engineered with NVIDIA for training pipelines.

Reasoning

Infrastructure

NetApp AIPod

NetApp

Converged AI infrastructure with ONTAP storage and NVIDIA compute.

Reasoning

Infrastructure

Supermicro SuperCluster

Supermicro

Liquid-cooled GPU SuperClusters for trillion-parameter LLM training.

Reasoning

Infrastructure

Cerebras WSE-3 / CS-3

Cerebras

Wafer-scale AI processor delivering record-breaking inference throughput.

Reasoning

Accelerator

Groq LPU

Groq

Language Processing Unit delivering ultra-low-latency LLM inference.

Reasoning

Accelerator

SambaNova Suite

SambaNova Systems

Full-stack AI platform with Reconfigurable Dataflow Units (RDUs).

Reasoning, Agents

Platform

Tenstorrent Wormhole

Tenstorrent

Open RISC-V based AI accelerator from Jim Keller's team.

Reasoning

Accelerator

AWS Trainium2

AWS

Custom AWS chip purpose-built for training large language models.

Reasoning

Accelerator

AWS Inferentia2

AWS

Cost-optimized AWS chip for high-throughput LLM inference.

Reasoning

Accelerator

Amazon Bedrock

AWS

Managed service to build agents using foundation models from many vendors.

Agents, Text Generation

Platform

Amazon Nova

AWS

Amazon's foundation model family (text, image, video) on Bedrock.

Multimodal, Image Generation, Video Generation

Text + Image + Video

Azure AI Foundry

Microsoft

Unified platform to design, customize and operate enterprise AI agents.

Agents

Platform

Microsoft Copilot+ PC

Microsoft

Windows AI PC category with NPU-powered features like Recall and Live Captions.

Agents, Image Understanding

On-device

Tesla FSD v13

Tesla

End-to-end neural network for autonomous driving on Tesla vehicles.

Multimodal, Agents

Vision + Action

Mercedes MB.OS with Google Cloud AI

Mercedes-Benz

In-car operating system with conversational AI assistant powered by Google Cloud.

Multimodal, Agents

In-vehicle

BMW Intelligent Personal Assistant

BMW

Voice-first in-car AI assistant integrating Alexa LLM features.

Audio / Speech, Agents

In-vehicle

Mobileye Chauffeur

Mobileye

Eyes-off AV system combining EyeQ chips, surround sensing and REM mapping.

Multimodal, Agents

Vision + Action

Waymo Driver

Waymo

Full-stack autonomous driving system deployed in robotaxis.

Multimodal, Agents

Vision + Action

John Deere See & Spray

John Deere

Computer vision system that targets herbicide only at weeds in real time.

Image Understanding, Agents

Edge Vision

Boston Dynamics Orbit (with AI)

Boston Dynamics

Software platform managing Spot robots with AI-driven inspection routines.

Agents, Image Understanding

Robotics Platform

Figure 02

Figure

Humanoid robot powered by the Helix VLA model for general-purpose work.

Multimodal, Agents

Humanoid

Tesla Optimus

Tesla

General-purpose humanoid robot using Tesla's autonomy stack.

Multimodal, Agents

Humanoid

Unitree G1

Unitree

Affordable humanoid robot platform with onboard AI.

Agents

Humanoid

Rabbit R1

Rabbit

Pocket AI device built around the Large Action Model paradigm.

Agents, Audio / Speech

Device

Humane AI Pin

Humane

Wearable AI assistant with laser projection and voice-first UI.

Agents, Multimodal

Wearable

Meta Ray-Ban (with Meta AI)

Meta

Smart glasses with multimodal Meta AI for live look-and-ask.

Multimodal, Audio / Speech

Wearable

CrowdStrike Charlotte AI

CrowdStrike

Generative AI security analyst built into the Falcon platform.

Agents

SaaS

Microsoft Security Copilot

Microsoft

Generative AI assistant for SOC analysts and IT admins.

Agents

SaaS

Palo Alto AI Access Security

Palo Alto Networks

Discovery and protection for employee use of generative AI apps.

Agents

SaaS

Salesforce Agentforce

Salesforce

Platform for building autonomous AI agents on top of CRM data.

Agents

Platform

ServiceNow Now Assist

ServiceNow

Generative AI built into the ServiceNow workflow platform.

Agents, Text Generation

Platform

Oracle AI Agents

Oracle

AI agents embedded across Oracle Fusion Cloud apps and OCI.

Agents

Platform

SAP Joule

SAP

Generative AI copilot embedded across SAP business applications.

Agents, Text Generation

Platform

John Deere See & Spray

John Deere

Computer-vision sprayer that targets weeds in real time, cutting herbicide use up to 60%.

Image Understanding, Agents

Hardware + Vision

Climate FieldView

Bayer / Climate

Digital agronomy platform with AI-driven yield, planting and nitrogen recommendations.

Agents

SaaS

Plantix

PEAT

Mobile crop-disease diagnosis from a single leaf photo, used by 30M+ smallholders.

Image Understanding

Mobile

Blue River Technology

John Deere

Robotics + computer vision for precision weeding and crop care.

Image Understanding, Agents

Hardware + Vision

Taranis

Aerial leaf-level imagery with AI for early pest, disease and nutrient detection.

Image Understanding

Aerial Imagery

CropX

Soil-sensor + AI agronomic platform for irrigation and nitrogen optimization.

Agents

IoT + SaaS

AGCO Fuse Smart Farming

AGCO

Connected farm AI platform for fleets of Massey Ferguson and Fendt machinery.

Agents

Platform

Carbon Robotics LaserWeeder

Carbon Robotics

AI-guided lasers identify and zap weeds in row crops without chemicals.

Image Understanding, Agents

Hardware

Indigo Carbon

Indigo Ag

AI-powered soil carbon and regenerative-ag marketplace.

Agents

SaaS

Palantir AIP for Gov

Palantir

AI Platform deploying LLMs against classified and government datasets with audit and policy controls.

Agents, Reasoning

Platform

GovGPT (Microsoft Azure Gov)

Microsoft

Azure OpenAI Service in Azure Government for IL5 / FedRAMP High workloads.

Text Generation, Agents

Cloud

AWS GovCloud Bedrock

AWS

Amazon Bedrock foundation models in AWS GovCloud for US public-sector workloads.

Text Generation, Multimodal

Cloud

Google Public Sector Gemini

Google

Gemini and Vertex AI tailored for federal, state and local government.

Multimodal, Agents

Cloud

Oracle Government AI

Oracle

Generative AI in Oracle Cloud for Government across HCM, ERP and citizen services.

Agents

Cloud

ServiceNow Citizen Engagement

ServiceNow

AI-powered citizen-services workflows for federal and local agencies.

Agents

SaaS

Veritone Public Sector

Veritone

AI for evidence redaction, transcription and investigations for law enforcement.

Audio / Speech, Image Understanding

SaaS

Anduril Lattice

Anduril

AI command-and-control mesh fusing sensors, drones and effectors across the battlespace.

Agents, Multimodal

Platform

Shield AI Hivemind

Shield AI

Autonomy stack flying GPS- and comms-denied missions on V-BAT and F-16.

Agents

Autonomy Stack

Helsing

European defense AI for sensor fusion, electronic warfare and autonomous strike.

Agents, Multimodal

Platform

Palantir Maven Smart System

Palantir

AI-driven targeting and ISR fusion deployed across US combatant commands.

Agents, Image Understanding

Platform

Scale Donovan

Scale AI

LLM-powered decision-making platform for defense and intelligence analysts.

Agents, Reasoning

Platform

Lockheed Martin Astris AI

Lockheed Martin

Defense-grade AI infrastructure subsidiary supporting national security missions.

Agents

Platform

BAE Systems FAST Labs AI

BAE Systems

Autonomous and AI/ML systems for ISR, EW and mission systems.

Agents, Image Understanding

Platform

Saab Loke

Saab

AI-enabled command-and-control suite for joint and combined operations.

Agents

Platform

Rebellion Defense

AI products for ISR, mission planning and computer vision in defense.

Image Understanding, Agents

Platform

Vannevar Labs

AI for open-source intelligence and non-traditional collection.

Reasoning, Agents

SaaS

Booking.com AI Trip Planner

Booking.com

Conversational trip planner suggesting destinations, hotels and itineraries.

Agents, Text Generation

SaaS

Expedia Romie

Expedia

Group-travel AI assistant integrated in chat for planning and booking.

Agents

SaaS

Kayak Ask Kayak

Kayak

ChatGPT-powered travel concierge for searching flights, hotels and cars.

Agents

SaaS

TripAdvisor AI Trips

TripAdvisor

AI-generated personalized itineraries from reviews and traveler data.

Text Generation, Agents

SaaS

Hopper GPT

Hopper

Conversational price-prediction and booking assistant for flights and hotels.

Agents

Mobile

Hilton ConfirmedConnectingRooms AI

Hilton

AI-driven family-room matching and stay personalization across Hilton brands.

Agents

SaaS

Marriott Renai

Marriott

Generative-AI experiences for guest service and personalization.

Agents

SaaS

Airbnb AI Search

Airbnb

AI-powered search, photo-tour categorization and host customer-service agent.

Agents, Image Understanding

SaaS

GM Super Cruise

General Motors

Hands-free driver-assistance system using AI perception and HD-map fusion.

Agents, Image Understanding

ADAS

Ford BlueCruise

Ford

Hands-free highway driving with AI driver monitoring.

Agents

ADAS

Hyundai Pleos

Hyundai Motor

AI-powered software-defined vehicle OS with voice and personalization.

Agents, Audio / Speech

Vehicle OS

Stellantis STLA AutoDrive

Stellantis

Level-3 autonomous-driving stack across Stellantis brands.

Agents

ADAS

Cruise (GM)

Cruise

Driverless robotaxi platform built on GM vehicles.

Agents

Autonomy Stack

Zoox

Amazon

Purpose-built autonomous robotaxi with end-to-end AI driving stack.

Agents

Autonomy Stack

Wayve GAIA-2

Wayve

Generative world model for end-to-end embodied driving.

Video Generation, Agents

World Model

Siemens Industrial Copilot

Siemens

Generative AI copilot for engineers across PLC, design and operations.

Agents, Code

Platform

Rockwell FactoryTalk Optix AI

Rockwell Automation

AI-enabled HMI and analytics for industrial automation.

Agents

PlatformOpen

GE Vernova AI

GE Vernova

AI for grid orchestration, wind-turbine optimization and power generation.

Agents

PlatformOpen

Schneider Electric EcoStruxure AI

Schneider Electric

AI for energy management and industrial automation across EcoStruxure.

Agents

PlatformOpen

ABB Ability Genix

ABB

Industrial analytics and AI platform for process and discrete manufacturing.

Agents

PlatformOpen

Honeywell Forge

Honeywell

Industrial AI for buildings, aerospace and process industries.

Agents

PlatformOpen

SLB Lumi

SLB (Schlumberger)

Generative-AI platform for energy operations across exploration and production.

Agents

PlatformOpen

Baker Hughes Leucipa

Baker Hughes

Autonomous field-operations AI for oil & gas production.

Agents

PlatformOpen

Tapestry (X / Alphabet)

Alphabet X

AI-driven virtualization platform for the electric grid.

Agents

PlatformOpen

Octopus Kraken

Octopus Energy

AI-native customer and grid platform powering 60M+ energy accounts.

Agents

SaaSOpen

Ericsson Cognitive Network Solutions

Ericsson

AI/ML for autonomous 5G network operations and energy savings.

Agents

PlatformOpen

Nokia MX Industrial Edge AI

Nokia

Edge AI platform for private 5G and industrial automation.

Agents

EdgeOpen

AT&T Ask AT&T

AT&T

Internal generative-AI assistant built on Azure OpenAI for 80k+ employees.

Agents

SaaSOpen

Verizon Personal Research Assistant

Verizon

Generative-AI agent for customer-service and field operations.

Agents

SaaSOpen

Vodafone TOBi

Vodafone

AI customer-service chatbot serving hundreds of millions of subscribers.

Agents

SaaSOpen

Amazon Rufus

Amazon

Generative-AI shopping assistant inside the Amazon app.

Agents

MobileOpen

Shopify Sidekick

Shopify

AI commerce assistant that runs the merchant's store via natural language.

Agents

SaaSOpen

Walmart Sparky

Walmart

Generative-AI shopping assistant in the Walmart app.

Agents

MobileOpen

Klarna AI Assistant

Klarna

OpenAI-powered customer-service agent handling two-thirds of Klarna chats.

Agents

SaaSOpen

Instacart Ask Instacart

Instacart

ChatGPT-powered grocery search and meal planning.

Agents

MobileOpen

Maersk Captain Peter

Maersk

AI-powered remote container monitoring across reefer fleets.

Agents

IoT + SaaSOpen

FedEx Surround

FedEx

AI logistics intelligence platform built with Microsoft for shipment visibility.

Agents

PlatformOpen

UPS DeliveryDefense

UPS

Machine-learning system scoring delivery-success likelihood for shippers.

Agents

APIOpen

Project44 Movement GPT

Project44

Generative-AI supply-chain assistant on top of real-time visibility data.

Agents

SaaSOpen

Blue Yonder Cognitive Solutions

Blue Yonder

AI/ML supply-chain planning, forecasting and execution.

Agents

SaaSOpen

Zillow Zestimate (Neural)

Zillow

Neural-network home-value estimation across 100M+ US homes.

Agents

APIOpen

Procore Copilot

Procore

Generative-AI assistant for construction project management.

Agents

SaaSOpen

Autodesk AI

Autodesk

Generative design and AI across AutoCAD, Revit, Forma and Fusion.

Agents3D

PlatformOpen

HPE Aruba Networking Central AI

HPE Aruba Networking

AIOps for wired, wireless and SD-WAN with predictive issue resolution.

Agents

SaaSOpen

Aruba Networking AI Assistant

HPE Aruba Networking

Conversational AI inside Aruba Central for network troubleshooting.

Agents

SaaSOpen

Cisco AI Assistant

Cisco

Cross-portfolio AI assistant for security, networking and collaboration.

Agents

SaaSOpen

Cisco Hypershield

Cisco

AI-native distributed security fabric for data centers and clouds.

Agents

PlatformOpen

Juniper Mist AI / Marvis

Juniper Networks

AI-driven networking and Marvis virtual network assistant.

Agents

SaaSOpen

Arista CloudVision AVA

Arista Networks

Autonomous Virtual Assist AI for network operations and security.

Agents

PlatformOpen

Extreme AI Expert

Extreme Networks

GenAI assistant for network operations across the Extreme platform.

Agents

SaaSOpen

F5 AI Gateway

F5

Application-delivery and security AI gateway for LLM apps.

Agents

GatewayOpen

Fortinet FortiAI

Fortinet

GenAI security analyst across the Fortinet Security Fabric.

Agents

SaaSOpen

Zscaler ZDX Copilot

Zscaler

Generative-AI copilot for digital experience and zero-trust operations.

Agents

SaaSOpen

Palo Alto Strata Copilot

Palo Alto Networks

GenAI copilot for network security across the Strata portfolio.

Agents

SaaSOpen

Palo Alto Cortex XSIAM

Palo Alto Networks

AI-driven SOC platform unifying SIEM, EDR and SOAR.

Agents

PlatformOpen

Check Point Infinity AI Copilot

Check Point

Generative-AI assistant for security administration and threat analysis.

Agents

SaaSOpen

SentinelOne Purple AI

SentinelOne

Generative-AI threat-hunting analyst across the Singularity platform.

Agents

SaaSOpen

Darktrace ActiveAI

Darktrace

Self-learning AI platform for autonomous response across email, network and cloud.

Agents

PlatformOpen

Vectra AI Platform

Vectra AI

AI-driven threat detection and response across hybrid cloud.

Agents

PlatformOpen

Veeam Data Intelligence

Veeam

AI-powered data resilience, anomaly detection and recovery analytics.

Agents

PlatformOpen

Commvault Cloud Arlie

Commvault

GenAI assistant for cyber resilience, recovery and data protection.

Agents

SaaSOpen

Rubrik Ruby

Rubrik

Generative-AI assistant for cyber recovery investigations and remediation.

Agents

SaaSOpen

Cohesity Gaia

Cohesity

RAG-based AI search and insights over enterprise backup data.

AgentsEmbeddings

SaaSOpen

Pure Storage AIRI

Pure Storage

AI-ready infrastructure (with NVIDIA DGX) for training and inference at scale.

Agents

Reference ArchitectureOpen

NetApp AIPod

NetApp

Converged AI infrastructure with NVIDIA for enterprise model training.

Agents

Reference ArchitectureOpen

Dell PowerScale for AI

Dell Technologies

Scale-out file storage tuned for large-scale AI training and RAG.

Agents

StorageOpen

VAST Data Platform

VAST Data

Unified data platform for AI with embedded vector database and compute.

EmbeddingsAgents

PlatformOpen

Weka AI

WEKA

High-performance data platform for GPU-accelerated AI pipelines.

Agents

StorageOpen

DDN A³I

DDN

Reference AI storage architecture co-engineered with NVIDIA DGX SuperPOD.

Agents

StorageOpen

Hitachi Vantara iQ

Hitachi Vantara

Industry-tailored generative AI solutions on Hitachi infrastructure.

Agents

PlatformOpen

IBM watsonx.ai

IBM

Enterprise studio for building, training and deploying foundation models.

Text GenerationAgents

PlatformOpen

IBM Granite

IBM

Open enterprise LLM family for code, language and time series.

Text GenerationCode

TextOpen

Snowflake Cortex AI

Snowflake

Managed LLMs and agents running directly inside Snowflake.

Text GenerationAgents

PlatformOpen

Databricks Mosaic AI

Databricks

End-to-end platform for building and serving custom AI agents on the lakehouse.

AgentsEmbeddings

PlatformOpen

DBRX

Databricks

Open MoE LLM by Databricks for enterprise customization.

Text GenerationCode

TextOpen

Pinecone

Managed vector database powering RAG and semantic-search apps.

Embeddings

DatabaseOpen

Weaviate

Open-source vector database with hybrid search and modules.

Embeddings

DatabaseOpen

Elastic AI Assistant

Elastic

GenAI assistant across Elastic Search, Observability and Security.

AgentsEmbeddings

PlatformOpen

Splunk AI Assistant

Splunk (Cisco)

GenAI assistant for SPL, observability and security operations.

Agents

SaaSOpen

New Relic AI

New Relic

Generative-AI observability assistant for engineers.

Agents

SaaSOpen

Datadog Bits AI

Datadog

Generative-AI assistant across Datadog observability and security.

Agents

SaaSOpen

Dynatrace Davis CoPilot

Dynatrace

Hypermodal AI combining causal, predictive and generative AI for observability.

Agents

PlatformOpen

Workday Illuminate

Workday

AI agents for HR, finance and planning across Workday.

Agents

SaaSOpen

Atlassian Rovo

Atlassian

Enterprise search and AI agents across Jira, Confluence and third-party SaaS.

AgentsEmbeddings

SaaSOpen

Notion AI

Notion

Built-in AI for writing, search and Q&A across Notion workspaces.

Text GenerationAgents

SaaSOpen

Zoom AI Companion

Zoom

AI assistant for meeting summaries, chat and email across Zoom.

AgentsAudio / Speech

SaaSOpen

Cisco Webex AI Assistant

Cisco

GenAI assistant for meetings, contact center and collaboration.

AgentsAudio / Speech

SaaSOpen

Intuit Assist

Intuit

GenAI financial assistant across TurboTax, QuickBooks, Credit Karma and Mailchimp.

Agents

SaaSOpen

Adobe Firefly

Adobe

Commercially-safe generative-AI models for image, vector and video.

Image GenerationVideo Generation

Image + VideoOpen

Adobe GenStudio

Adobe

Enterprise generative-AI platform for marketing content production.

Agents

PlatformOpen

Canva Magic Studio

Canva

Suite of AI design tools for image, video, copy and presentations.

Image GenerationText Generation

SaaSOpen

HubSpot Breeze

HubSpot

AI agents and copilots across marketing, sales and service.

Agents

SaaSOpen

Glean

Enterprise-search and work-AI assistant across SaaS data.

EmbeddingsAgents

SaaSOpen

Writer Palmyra

Writer

Enterprise LLM family and generative-AI platform for regulated industries.

Text GenerationAgents

PlatformOpen

Jasper

AI marketing platform for brand-aligned content generation.

Text GenerationAgents

SaaSOpen

Now Assist

ServiceNow

GenAI assistant embedded across ITSM, CSM, HRSD and creator workflows.

AgentsText Generation

SaaSOpen

AI Agents (ServiceNow)

ServiceNow

Autonomous AI agents for IT, HR, customer service and security operations.

Agents

SaaSOpen

Now LLM

ServiceNow

Domain-specific large language models tuned for the Now Platform.

Text GenerationReasoning

TextOpen

Workday AI Agents

Workday

Role-based AI agents for recruiting, payroll, expenses and contracts.

Agents

SaaSOpen

Workday Agent System of Record

Workday

Central system to manage, govern and orchestrate AI agents across the enterprise.

Agents

PlatformOpen

Oracle AI Agents

Oracle

Prebuilt AI agents across Oracle Fusion Cloud HCM, ERP, SCM and CX.

Agents

SaaSOpen

Oracle Generative AI Service

Oracle

Managed LLM service on OCI featuring Cohere and Meta Llama models.

Text GenerationEmbeddings

PlatformOpen

Oracle Digital Assistant

Oracle

Conversational AI platform for building enterprise assistants.

Agents

PlatformOpen

Oracle Code Assist

Oracle

GenAI coding companion optimized for Java, SQL and OCI.

Code

IDEOpen

Oracle Health Clinical AI Agent

Oracle

Voice-enabled clinical documentation agent for clinicians.

AgentsAudio / Speech

SaaSOpen

SAP Joule

SAP

Generative-AI copilot embedded across the SAP application portfolio.

AgentsText Generation

SaaSOpen

SAP Business AI

SAP

Portfolio of AI capabilities and agents across SAP business processes.

Agents

PlatformOpen

SAP AI Core

SAP

Runtime and lifecycle management for AI workloads on SAP BTP.

AgentsEmbeddings

PlatformOpen

Agentforce 3

Salesforce

Platform for building, deploying and governing autonomous AI agents.

Agents

PlatformOpen

Einstein GPT

Salesforce

Generative-AI layer across Sales, Service, Marketing and Commerce Clouds.

Text GenerationAgents

SaaSOpen

Salesforce Data Cloud + Einstein

Salesforce

Unified customer data foundation powering Einstein and Agentforce.

EmbeddingsAgents

PlatformOpen

Slack AI

Salesforce / Slack

AI summaries, search and recap built into Slack channels and DMs.

Text GenerationAgents

SaaSOpen

Tableau Pulse / Einstein

Salesforce / Tableau

Generative analytics and natural-language insights inside Tableau.

AgentsReasoning

SaaSOpen

Microsoft 365 Copilot

Microsoft

Generative-AI assistant across Word, Excel, PowerPoint, Outlook and Teams.

AgentsText Generation

SaaSOpen

Copilot Studio

Microsoft

Low-code platform for building and orchestrating custom AI agents.

Agents

PlatformOpen

Dynamics 365 Copilot

Microsoft

Role-based AI copilots for sales, service, finance, supply chain and HR.

Agents

SaaSOpen

Azure AI Foundry

Microsoft

Unified platform to build, evaluate and deploy AI agents and models on Azure.

AgentsEmbeddings

PlatformOpen

Azure OpenAI Service

Microsoft

Enterprise access to GPT, o-series and DALL·E models on Azure.

Text GenerationImage Generation

APIOpen

Azure AI Search

Microsoft

Vector and hybrid retrieval engine for grounding LLMs on enterprise data.

Embeddings

PlatformOpen

GitHub Copilot

Microsoft / GitHub

AI pair-programmer for code completion, chat, reviews and agent mode.

CodeAgents

IDEOpen

Microsoft Fabric Copilot

Microsoft

GenAI copilots for data engineering, science and Power BI inside Fabric.

AgentsCode

SaaSOpen

Power Platform AI Builder

Microsoft

AI models and prebuilt skills for Power Apps and Power Automate.

Agents

PlatformOpen

Amazon Q Business

AWS

Generative-AI assistant grounded on enterprise data and SaaS connectors.

AgentsEmbeddings

SaaSOpen

Amazon Q Developer

AWS

AI coding and operations assistant across the developer lifecycle on AWS.

CodeAgents

IDEOpen

Amazon Bedrock AgentCore

AWS

Secure runtime for deploying and scaling production AI agents on Bedrock.

Agents

PlatformOpen

Amazon SageMaker AI

AWS

End-to-end platform to build, train and deploy ML and foundation models.

AgentsEmbeddings

PlatformOpen

Amazon Connect AI

AWS

GenAI for contact-center agents, self-service and analytics.

AgentsAudio / Speech

SaaSOpen

AWS HealthScribe

AWS

HIPAA-eligible service that generates clinical notes from patient conversations.

Audio / SpeechText Generation

APIOpen

Vertex AI

Google Cloud

Unified platform for Gemini, Model Garden, agents and ML on GCP.

AgentsEmbeddings

PlatformOpen

Vertex AI Agent Builder

Google Cloud

Build, deploy and manage multi-agent systems grounded on enterprise data.

Agents

PlatformOpen

Gemini for Google Workspace

Google

AI assistance across Gmail, Docs, Sheets, Slides, Meet and Drive.

AgentsText Generation

SaaSOpen

Gemini Code Assist

Google

AI coding assistant with enterprise context across IDEs and Google Cloud.

CodeAgents

IDEOpen

Customer Engagement Suite (CCAI)

Google Cloud

Generative contact-center AI for virtual agents, agent assist and insights.

AgentsAudio / Speech

SaaSOpen

BigQuery ML / Gemini in BigQuery

Google Cloud

In-warehouse ML and GenAI directly on BigQuery data via SQL.

ReasoningEmbeddings

PlatformOpen

IBM watsonx.ai

IBM

Studio to train, tune and deploy foundation models including Granite.

Text GenerationEmbeddings

PlatformOpen

IBM watsonx Orchestrate

IBM

Build and orchestrate AI agents across HR, procurement and sales workflows.

Agents

SaaSOpen

IBM watsonx.data

IBM

Open data lakehouse optimized for AI workloads and RAG.

Embeddings

PlatformOpen

IBM watsonx.governance

IBM

AI governance, risk and compliance for foundation models and agents.

Agents

PlatformOpen

Informatica CLAIRE GPT

Informatica

Generative-AI assistant for data management, integration and governance.

AgentsEmbeddings

SaaSOpen

Informatica IDMC AI Agents

Informatica

AI agents across the Intelligent Data Management Cloud for pipelines and quality.

Agents

PlatformOpen

Adobe Experience Platform AI Assistant

Adobe

GenAI assistant for marketers across Adobe Experience Cloud applications.

Agents

SaaSOpen

Adobe Acrobat AI Assistant

Adobe

Conversational AI to summarize, query and draft from PDF documents.

Text Generation

SaaSOpen

Snowflake Cortex Agents

Snowflake

Build agentic apps grounded on governed Snowflake data with hosted LLMs.

AgentsEmbeddings

PlatformOpen

Snowflake Copilot

Snowflake

Natural-language SQL and analytics assistant inside Snowflake.

CodeReasoning

SaaSOpen

Databricks Mosaic AI Agent Framework

Databricks

Tooling to build, evaluate and govern compound AI agents on the lakehouse.

AgentsEmbeddings

PlatformOpen

Databricks Genie

Databricks

Conversational analytics over governed lakehouse data.

ReasoningAgents

SaaSOpen

VMware Private AI

Broadcom / VMware

On-prem GenAI reference architecture co-engineered with NVIDIA and IBM.

AgentsEmbeddings

PlatformOpen

Box AI

Box

AI for content Q&A, summarization and metadata extraction in Box.

Text GenerationAgents

SaaSOpen

Dropbox Dash

Dropbox

Universal search and AI assistant across SaaS apps and content.

EmbeddingsAgents

SaaSOpen

DocuSign IAM with AI

DocuSign

AI-powered Intelligent Agreement Management for contract data and workflows.

Agents

SaaSOpen

Zendesk AI Agents

Zendesk

Autonomous and copilot AI agents for customer service.

Agents

SaaSOpen

Freshworks Freddy AI

Freshworks

Generative-AI assistants and agents across CX, ITSM and CRM.

Agents

SaaSOpen

ZoomInfo Copilot

ZoomInfo

GenAI go-to-market copilot for sellers, grounded on B2B data.

Agents

SaaSOpen

Gong AI

Gong

Revenue AI for call insights, forecasting and deal execution.

AgentsAudio / Speech

SaaSOpen

Pega GenAI

Pegasystems

GenAI Blueprint and agents for case management and CRM workflows.

Agents

PlatformOpen

UiPath Autopilot

UiPath

Agentic automation copilot across the UiPath platform for citizen and professional developers.

Agents

PlatformOpen

Automation Anywhere AI Agent Studio

Automation Anywhere

Build and govern AI agents that combine LLMs with enterprise automation.

Agents

PlatformOpen

Talend Data Fabric AI

Qlik / Talend

AI-assisted data integration, quality and governance.

Agents

PlatformOpen

Qlik Answers

Qlik

Generative analytics service delivering trusted answers from unstructured data.

ReasoningAgents

SaaSOpen

SAS Viya with GenAI

SAS

Analytics and AI platform with embedded LLM orchestration and copilots.

AgentsReasoning

PlatformOpen

TIBCO / Cloud Software Group AI

Cloud Software Group

AI across Spotfire, integration and data virtualization products.

Agents

PlatformOpen

Teradata AI Unlimited / ClearScape

Teradata

In-database analytics and GenAI orchestration on Teradata VantageCloud.

ReasoningEmbeddings

PlatformOpen

MongoDB Atlas Vector Search

MongoDB

Native vector search in MongoDB Atlas for RAG and semantic apps.

Embeddings

DatabaseOpen

Redis AI / Vector

Redis

Low-latency vector database and semantic cache for GenAI apps.

Embeddings

DatabaseOpen

Twilio CustomerAI

Twilio

GenAI and predictive AI across Twilio messaging, voice and Segment.

AgentsAudio / Speech

PlatformOpen

Asana AI

Asana

AI teammates and copilots for work management and goals.

Agents

SaaSOpen

Monday AI

monday.com

AI assistant and blocks for automating Work OS workflows.

Agents

SaaSOpen

Smartsheet AI

Smartsheet

GenAI formulas, summaries and content generation in Smartsheet.

AgentsText Generation

SaaSOpen

Coupa AI

Coupa

Community-powered AI and agents for spend management.

Agents

SaaSOpen

GitLab Duo

GitLab

AI assistant across the GitLab DevSecOps platform with code, chat and security.

CodeAgents

PlatformOpen

Atlassian Intelligence

Atlassian

AI features and agents across Jira, Confluence, Bitbucket and Loom.

Agents

SaaSOpen

Claude Opus 4.5

Anthropic

Anthropic's most intelligent model — state-of-the-art on coding, agents and computer use.

ReasoningCodeAgents

Text + ImageOpen

GPT-5.1-Codex-Max

OpenAI

Frontier agentic coding model for long-horizon software engineering inside Codex.

CodeAgentsReasoning

TextOpen

GPT-5.1

OpenAI

Updated GPT-5 with warmer tone, adaptive reasoning and stronger instruction following.

Text GenerationReasoningMultimodal

Text + Image + AudioOpen

Gemini 3 Pro

Google DeepMind

Leads LMArena Text, WebDev and Vision — Google's flagship multimodal reasoning model.

ReasoningMultimodalCode

Text + Image + Video + AudioOpen

Gemini 3 Deep Think

Google DeepMind

Extended-thinking variant of Gemini 3 for hardest math, science and research problems.

Reasoning

Text + ImageOpen

Nano Banana Pro

Google DeepMind

Gemini-powered flagship image generation and editing model with best-in-class text.

Image Generation

ImageOpen

SAM 3D

Meta

Segment Anything 3D — reconstructs objects, scenes and human bodies from a single image.

3DImage Understanding

Image → 3DOpen

Olmo 3

Ai2

Fully open model flow with training data, checkpoints and recipes for reproducible AI.

Text GenerationReasoning

TextOpen

Grok 4.1

xAI

Refresh of Grok 4 with stronger reasoning, lower hallucination and faster tool use.

ReasoningAgentsText Generation

Text + ImageOpen

Claude Haiku 4.5

Anthropic

Fast, cheap Claude tier matching prior Sonnet-class quality for high-volume agents.

Text GenerationAgentsCode

Text + ImageOpen

Mistral Medium 3

Mistral AI

Cost-efficient enterprise model with frontier-class performance for business workloads.

Text GenerationReasoningCode

Text + ImageOpen

Qwen3-Max

Alibaba

Alibaba's trillion-parameter flagship multilingual reasoning model.

ReasoningText GenerationCode

Text + ImageOpen

Kimi K2

Moonshot AI

Long-context open agentic model from Moonshot, strong on tool use and coding.

AgentsReasoningCode

TextOpen

GLM-4.6

Zhipu AI

Open bilingual frontier model from Zhipu, competitive on coding and reasoning.

Text GenerationCodeReasoning

TextOpen

VISTA-R1

Eigen AI

Agentic RL vision-language model for tool-integrated visual reasoning.

ReasoningImage UnderstandingAgents

Text + ImageOpen

Shopify Magic

Shopify

Generative AI across the Shopify admin — product descriptions, emails, blog posts and image edits.

Text GenerationImage Generation

SaaSOpen

Shopify Semantic Search

Shopify

Embeddings-based product search powering natural-language storefront discovery.

Embeddings

SaaSOpen

Shop App AI

Shopify

Personal shopping assistant in the Shop app, recommending and tracking orders across merchants.

Agents

SaaSOpen

Zendesk Copilot

Zendesk

Agent-side AI copilot suggesting replies, summaries and next actions in real time.

AgentsText Generation

SaaSOpen

Zendesk Resolution Platform

Zendesk

Agentic CX platform (post-Ultimate.ai) for end-to-end automated customer resolutions.

Agents

PlatformOpen

Zendesk QA (Klaus)

Zendesk

AutoQA AI that scores 100% of support conversations across voice and chat.

ReasoningAudio / Speech

SaaSOpen

Twilio Voice Intelligence

Twilio

Speech-to-text, summaries and language operators that analyze every call in real time.

Audio / SpeechReasoning

PlatformOpen

Twilio AI Assistants

Twilio

Build conversational AI agents over SMS, voice and WhatsApp grounded in Segment data.

AgentsAudio / Speech

PlatformOpen

Segment Linked Audiences

Twilio Segment

AI-powered CDP predictions joining warehouse data to real-time activation.

EmbeddingsAgents

PlatformOpen

Symantec AI for DLP

Broadcom / Symantec

AI-driven data loss prevention classifying sensitive content across cloud, email and endpoints.

Agents

PlatformOpen

Broadcom Rally AI

Broadcom

GenAI for agile planning — story generation, sprint summaries and risk forecasting.

AgentsText Generation

SaaSOpen

VMware Cloud Foundation AI Services

Broadcom / VMware

Private AI services for VCF — model serving, RAG and vector DB on-prem.

EmbeddingsAgents

PlatformOpen

Microsoft Sales Copilot

Microsoft

Role-based Copilot inside Outlook and Teams pulling CRM context from Dynamics 365 and Salesforce.

AgentsText Generation

SaaSOpen

Microsoft Service Copilot

Microsoft

Frontline copilot for contact center agents inside Dynamics 365 Customer Service.

AgentsText Generation

SaaSOpen

Dragon Copilot

Microsoft / Nuance

Ambient AI scribe for clinicians that drafts notes and orders from doctor-patient conversations.

Audio / SpeechText Generation

SaaSOpen

GitHub Copilot Workspace

GitHub / Microsoft

Agentic dev environment that plans, edits and tests entire features from a GitHub issue.

CodeAgents

SaaSOpen

Prisma AIRS

Palo Alto Networks

AI Runtime Security — protects models, agents and data across enterprise AI deployments.

Agents

PlatformOpen

Cortex Cloud

Palo Alto Networks

Unified AI-driven CNAPP + CDR converging Prisma Cloud and Cortex into one platform.

Agents

PlatformOpen

Cloudflare Workers AI

Cloudflare

Serverless GPU inference platform running open models at the edge.

Text GenerationEmbeddingsImage Generation

PlatformOpen

Cloudflare AI Gateway

Cloudflare

Observability, caching and rate-limiting proxy for any LLM provider.

Agents

PlatformOpen

Akamai Cloud Inference

Akamai

Distributed-edge inference platform built on the Akamai Connected Cloud.

Agents

PlatformOpen

Stripe Radar

Stripe

ML-based fraud detection trained on the global Stripe payments network.

Reasoning

SaaSOpen

PayPal Smart Receipts

PayPal

Personalized AI recommendations and cashback on merchant receipts.

Agents

SaaSOpen

Block Square AI

Block

AI assistant for sellers — answers business questions from Square sales data.

Agents

SaaSOpen

Coinbase AgentKit

Coinbase

Toolkit letting AI agents transact on-chain with wallets, USDC and smart contracts.

Agents

PlatformOpen

Robinhood Cortex

Robinhood

AI investing companion delivering market insights to Robinhood Gold customers.

ReasoningAgents

SaaSOpen

Spotify AI DJ

Spotify

Personalized AI DJ that curates and narrates listening sessions in a realistic voice.

Audio / SpeechAgents

SaaSOpen

Reddit Answers

Reddit

Conversational search that synthesizes answers from authentic Reddit discussions.

Text GenerationAgents

SaaSOpen

Snap My AI

Snap

GPT-powered chatbot inside Snapchat with vision and Snap Map awareness.

AgentsMultimodal

SaaSOpen

Pinterest Performance+

Pinterest

GenAI ads platform that builds creative and optimizes targeting automatically.

Image GenerationAgents

SaaSOpen

Uber AI Assistant

Uber

In-app GenAI assistant guiding riders and drivers through Uber and Uber Eats workflows.

Agents

SaaSOpen

DoorDash SafeChat AI

DoorDash

Real-time AI moderation that detects harassment across Dasher-customer chats in 99 languages.

Text Generation

SaaSOpen

Synopsys.ai

Synopsys

AI suite (DSO.ai, VSO.ai, TSO.ai) optimizing chip design across the EDA flow.

AgentsReasoning

PlatformOpen

Cadence Cerebrus / JedAI

Cadence

Generative AI for digital chip implementation and verification across the Cadence flow.

AgentsReasoning

PlatformOpen

Ansys SimAI

Ansys

Cloud generative-AI app delivering near-instant simulation predictions for engineers.

Reasoning

SaaSOpen

Veeva AI

Veeva Systems

Embedded AI agents and shortcuts across Veeva Vault and Commercial Cloud for life sciences.

Agents

SaaSOpen

Hunyuan T1

Tencent

Tencent's deep-reasoning model, Mamba-based and tuned for complex multi-step problems.

Reasoning

TextOpen

Baidu Apollo ADFM

Baidu

Autonomous Driving Foundation Model powering Apollo Go robotaxis across China.

MultimodalAgents

Vision + ActionOpen

ByteDance Coze

ByteDance

No-code bot platform for building, publishing and monetizing AI agents.

Agents

PlatformOpen

Palmyra X 003

Writer

Palmyra X 003 is a top-performing instruct model built specifically for structured text completion rather than conversational use.

Text Generation

TextOpen

Kimi Explorer

Moonshot AI

Moonshot AI's language model tracked by Epoch, focused on language modeling/generation.

Text Generation

TextOpen

AlphaGeometry

Google DeepMind

Google DeepMind's mathematics model tracked by Epoch, focused on geometry.

Reasoning

TextOpen

Qwen-VL-Max

Alibaba

Alibaba's multimodal vision-language model tracked by Epoch, focused on chat.

Image UnderstandingMultimodalText Generation

Text + ImageOpen

Qwen1.5-72B

Alibaba

In recent months, our focus has been on developing a “good” model while optimizing the developer experience.

CodeReasoningText Generation

TextOpen

Aya

Cohere for AI

Recent breakthroughs in large language models (LLMs) have centered around a handful of data-rich languages.

Text Generation

TextOpen

Gemini 1.5 Pro

Google DeepMind

Google DeepMind's multimodal language model tracked by Epoch, focused on language modeling.

MultimodalText Generation

Text + ImageOpen

Stable Diffusion 3

Stability AI

Diffusion models create data from noise by inverting the forward paths of data towards noise and have emerged as a powerful generative modeling technique for high-dimensional, perceptual data such as images and videos.

Image Generation

ImageOpen

MegaScale (Production)

ByteDance

We present the design, implementation and engineering experience in building and deploying MegaScale, a production system for training large language models (LLMs) at the scale of more than 10,000 GPUs.

Text Generation

TextOpen

Mistral Large

Mistral AI

Mistral AI's language model tracked by Epoch, focused on chat.

Text Generation

TextOpen

Claude 3 Sonnet

Anthropic

Anthropic's multimodal vision-language model tracked by Epoch, focused on chat.

CodeImage GenerationMultimodal

ImageOpen

Claude 3 Opus

Anthropic

Anthropic's multimodal vision-language model tracked by Epoch, focused on chat.

CodeImage GenerationMultimodal

ImageOpen

Aramco Metabrain AI

Saudi Aramco

Saudi Aramco's language model tracked by Epoch, focused on language modeling/generation.

Text Generation

TextOpen

Inflection-2.5

Inflection AI

At Inflection, our mission is to create a personal AI for everyone.

Text Generation

TextOpen

ManiGaussian

Tsinghua University

Performing language-conditioned robotic manipulation tasks in unstructured environments is highly demanded for general intelligent robots.

AgentsImage UnderstandingMultimodal

Vision + ActionOpen

MM1-30B

Apple

In this work, we discuss building performant Multimodal Large Language Models (MLLMs).

Image UnderstandingMultimodalText Generation

Text + ImageOpen

ReALM

Apple

Reference resolution is an important problem, one that is essential to understand and successfully handle context of different kinds.

Text Generation

TextOpen

GPT-4 Turbo (Apr 2024)

OpenAI

Today, we shared dozens of new additions and improvements, and reduced pricing across many parts of our platform.

Image GenerationMultimodalText Generation

ImageOpen

Reka Core

Reka AI

We introduce Reka Core, Flash, and Edge, a series of powerful multimodal language models trained from scratch by Reka.

Audio / SpeechCodeImage Generation

AudioOpen

Llama 3-70B

Meta

Meta's language model tracked by Epoch, focused on chat.

CodeText Generation

TextOpen

VILA1.5-13B

NVIDIA

Visual language models (VLMs) rapidly progressed with the recent success of large language models.

Image GenerationMultimodalText Generation

VideoOpen

Yi-Large

01.AI

01.AI's language model tracked by Epoch, focused on chat.

Text Generation

TextOpen

Octo-Base

University of California (UC) Berkeley

University of California (UC) Berkeley's robotics model tracked by Epoch, focused on robotic manipulation.

AgentsMultimodal

Vision + ActionOpen

GLM-4 (0520)

Zhipu AI

We introduce ChatGLM, an evolving family of large language models that we have been developing over time.

CodeReasoningText Generation

TextOpen

ALLaM adapted 70B

Saudi Data and Artificial Intelligence Authority

We present ALLaM: Arabic Large Language Model, a series of large language models to support the ecosystem of Arabic Language Technologies (ALT).

Text Generation

TextOpen

Qwen2-72B

Alibaba

After months of efforts, we are pleased to announce the evolution from Qwen1.5 to Qwen2.

Text Generation

TextOpen

Llama-3.1-Nemotron-70B-Instruct

NVIDIA

High-quality preference datasets are essential for training reward models that can effectively guide large language models (LLMs) in generating high-quality responses aligned with human preferences.

Text Generation

TextOpen

OpenVLA

Stanford University

Stanford University's vision-language robotics model tracked by Epoch, focused on robotic manipulation.

AgentsImage UnderstandingMultimodal

Vision + ActionOpen

Nemotron-4 340B

NVIDIA

We release the Nemotron-4 340B model family, including Nemotron-4-340B-Base, Nemotron-4-340B-Instruct, and Nemotron-4-340B-Reward.

Text Generation

TextOpen

DeepSeek-Coder-V2 236B

DeepSeek

We present DeepSeek-Coder-V2, an open-source Mixture-of-Experts (MoE) code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks.

CodeText Generation

TextOpen

Claude 3.5 Sonnet

Anthropic

This addendum to our Claude 3 Model Card describes Claude 3.5 Sonnet, a new model which outperforms our previous most capable model, Claude 3 Opus, while operating faster and at a lower cost.

CodeImage GenerationMultimodal

ImageOpen

Cambrian-1-34B

New York University (NYU)

We introduce Cambrian-1, a family of multimodal LLMs (MLLMs) designed with a vision-centric approach.

Image UnderstandingMultimodalText Generation

Text + ImageOpen

ESM3 (98B)

EvolutionaryScale

More than three billion years of evolution have produced an image of biology encoded into the space of natural proteins.

Text Generation

TextOpen

Ernie 4.0 Turbo

Baidu

Baidu's multimodal, language, vision model tracked by Epoch, focused on vision-language generation.

Image GenerationMultimodalText Generation

ImageOpen

SenseChat 5.5

SenseTime

SenseTime's multimodal, language, vision model tracked by Epoch, focused on vision-language generation.

Image GenerationMultimodalReasoning

ImageOpen

Mathstral

Mistral AI

We're contributing Mathstral to the science community to bolster efforts in advanced mathematical problems requiring complex, multi-step logical reasoning.

ReasoningText Generation

TextOpen

DeepL LLM

DeepL

DeepL's language model tracked by Epoch, focused on translation.

Text Generation

TextOpen

Llama 3.1-405B

Meta

Modern artificial intelligence (AI) systems are powered by foundation models.

CodeReasoningText Generation

TextOpen

AFM-server

Apple

Apple's language model tracked by Epoch, focused on language modeling/generation.

Text Generation

TextOpen

AFM-on-device

Apple

Apple's language model tracked by Epoch, focused on language modeling/generation.

Text Generation

TextOpen

LLaVA-OV-72B

ByteDance

We present LLaVA-OneVision, a family of open large multimodal models (LMMs) developed by consolidating our insights into data, models, and visual representations in the LLaVA-NeXT blog series.

Image GenerationMultimodalText Generation

VideoOpen

GPT-4o (Aug 2024)

OpenAI

We’re announcing GPT-4o, our new flagship model that can reason across audio, vision, and text in real time.

Audio / SpeechImage GenerationMultimodal

AudioOpen

Table Tennis Agent

Google DeepMind

Achieving human-level speed and performance on real world tasks is a north star for the robotics research community.

AgentsMultimodal

Vision + ActionOpen

Grok-2

xAI

Grok-2 is our frontier language model with state-of-the-art reasoning capabilities.

CodeImage GenerationMultimodal

ImageOpen

Jamba 1.5-Large

AI21 Labs

We present Jamba-1.5, new instruction-tuned large language models based on our Jamba architecture.

Text Generation

TextOpen

Hairuo

Inspur

Inspur's language model tracked by Epoch, focused on language modeling/generation.

Text Generation

TextOpen

GLM-4-Plus

Zhipu AI

At the KDD International Conference on Data Mining and Knowledge Discovery, the Zhipu GLM team unveiled the new generation of base large model—GLM-4-Plus.

Text Generation

TextOpen

Hunyuan Turbo

Tencent

Tencent's language model tracked by Epoch, focused on language modeling/generation.

Text Generation

TextOpen

Harrison.rad.1

Harrison.ai

Harrison.ai's vision, medicine, language, multimodal model tracked by Epoch, focused on visual question answering.

Image UnderstandingMultimodalText Generation

Text + ImageOpen

AlphaProteo

Google DeepMind

Computational design of protein-binding proteins is a fundamental capability with broad utility in biomedical research and biotechnology.

Text Generation

TextOpen

DeepSeek-V2.5

DeepSeek

DeepSeek's language model tracked by Epoch, focused on language modeling/generation.

CodeText Generation

TextOpen

o1-preview

OpenAI

We've developed a new series of AI models designed to spend more time thinking before they respond.

CodeReasoningText Generation

TextOpen

o1-mini

OpenAI

We've developed a new series of AI models designed to spend more time thinking before they respond.

CodeReasoningText Generation

TextOpen

Qwen2.5-32B

Alibaba

In the past three months since Qwen2’s release, numerous developers have built new models on the Qwen2 language models, providing us with valuable feedback.

ReasoningText Generation

TextOpen

Qwen2.5-72B

Alibaba

In the past three months since Qwen2’s release, numerous developers have built new models on the Qwen2 language models, providing us with valuable feedback.

ReasoningText Generation

TextOpen

Qwen2.5 Instruct (72B)

Alibaba

Qwen2.5 is the latest series of Qwen large language models.

CodeReasoningText Generation

TextOpen

Oryx 34B

Tsinghua University

Visual data comes in various forms, ranging from small icons of just a few pixels to long videos spanning hours.

3DImage GenerationMultimodal

3DOpen

Telechat2-115B

China Telecom

China Telecom's language model tracked by Epoch, focused on language modeling/generation.

CodeReasoningText Generation

TextOpen

PixelDance

ByteDance

PixelDance V1.4 is a video generation model developed by the ByteDance Research team, using the DiT structure.

Image GenerationVideo Generation

VideoOpen

Llama 3.2 11B

Meta

Meta's multimodal, vision, language model tracked by Epoch, focused on visual question answering.

Image UnderstandingMultimodalText Generation

Text + ImageOpen

Movie Gen Video

Meta

We present Movie Gen, a cast of foundation models that generates high-quality, 1080p HD videos with different aspect ratios and synchronized audio.

Image GenerationVideo Generation

VideoOpen

GR-2

ByteDance

We present GR-2, a state-of-the-art generalist robot agent for versatile and generalizable robot manipulation.

AgentsMultimodal

Vision + ActionOpen

Palmyra X 004

Writer

Palmyra X4 boasts state-of-the-art reasoning through novel training techniques.

CodeText Generation

TextOpen

RDT-1B

Tsinghua University

Tsinghua University's robotics model tracked by Epoch, focused on robotic manipulation.

AgentsMultimodal

Vision + ActionOpen

CHAI-1

Chai Discovery

We introduce Chai-1, a multi-modal foundation model for molecular structure prediction that performs at the state-of-the-art across a variety of tasks relevant to drug discovery.

Text Generation

TextOpen

NVLM-X 72B

NVIDIA

NVIDIA's vision, language model tracked by Epoch, focused on language modeling/generation.

CodeImage GenerationReasoning

ImageOpen

NVLM-H 72B

NVIDIA

NVIDIA's vision, language model tracked by Epoch, focused on language modeling/generation.

CodeImage GenerationReasoning

ImageOpen

NVLM-D 72B

NVIDIA

NVIDIA's vision, language model tracked by Epoch, focused on language modeling/generation.

CodeImage GenerationReasoning

ImageOpen

Doubao-pro

ByteDance

A professional-grade, self-developed LLM supporting up to 128k tokens, enabling fine-tuning across the entire series.

Text Generation

TextOpen

SeedEdit

ByteDance

We introduce SeedEdit, a diffusion model that is able to revise a given image with any text prompts.

Image Generation

ImageOpen

Gemini-Exp-1114

Google DeepMind

Google DeepMind's language model tracked by Epoch, focused on language modeling.

Text Generation

TextOpen

k0-math

Moonshot AI

Artificial general intelligence start-up Kimi, owned by Chinese AI start-up Moonshot AI, on Saturday launched its first reasoning AI model k0-math.

ReasoningText Generation

TextOpen

GPT-4o (Nov 2024)

OpenAI

We’re announcing GPT-4o, our new flagship model that can reason across audio, vision, and text in real time.

Audio / SpeechImage GenerationMultimodal

AudioOpen

Fugatto 1

NVIDIA

Fugatto is a versatile audio synthesis and transformation model capable of following free-form text instructions with optional audio inputs.

Audio / SpeechMultimodalText Generation

AudioOpen

Amazon Nova Pro

Amazon

A highly capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks.

CodeImage GenerationMultimodal

VideoOpen

o1

OpenAI

We've developed a new series of AI models designed to spend more time thinking before they respond.

CodeMultimodalReasoning

Text + ImageOpen

NVILA 15B

NVIDIA

Visual language models (VLMs) have made significant advances in accuracy in recent years.

Image GenerationMultimodalText Generation

VideoOpen

Infinity

ByteDance

We present Infinity, a Bitwise Visual AutoRegressive Modeling capable of generating high-resolution, photorealistic images following language instruction.

Image Generation

ImageOpen

Sora Turbo

OpenAI

Our video generation model is rolling out at sora.com.

Image GenerationVideo Generation

VideoOpen

EXAONE 3.5 32B

LG AI Research

This technical report introduces the EXAONE 3.5 instruction-tuned language models, developed and released by LG AI Research.

Text Generation

TextOpen

Gemini 2.0 Pro

Google DeepMind

Today, we’re releasing an experimental version of Gemini 2.0 Pro that responds to that feedback.

Audio / SpeechCodeImage Generation

AudioOpen

Apollo 7B

Meta AI

Despite the rapid integration of video perception capabilities into Large Multimodal Models (LMMs), the underlying mechanisms driving their video understanding remain poorly understood.

MultimodalText GenerationVideo Generation

VideoOpen

Veo 2

Google DeepMind

Google DeepMind's video, vision model tracked by Epoch, focused on video generation.

Image GenerationVideo Generation

VideoOpen

STORM-B/8

University of Southern California

We present STORM, a spatio-temporal reconstruction model designed for reconstructing dynamic outdoor scenes from sparse observations.

3D

3DOpen

Stable Point Aware 3D (SPAR3D)

Stability AI

We study the problem of single-image 3D object reconstruction.

3D

3DOpen

INTELLECT-MATH

Prime Intellect

INTELLECT-MATH is a 7B parameter model optimized for mathematical reasoning.

Reasoning

TextOpen

Eagle 2

NVIDIA

Recently, promising progress has been made by open-source vision-language models (VLMs) in bringing their capabilities closer to those of proprietary frontier models.

AgentsImage UnderstandingMultimodal

Vision + ActionOpen

Kimi k1.5

Moonshot AI

Language model pretraining with next token prediction has proved effective for scaling compute but is limited to the amount of available training data.

CodeImage GenerationMultimodal

ImageOpen

Computer-Using Agent (CUA)

OpenAI

Today we introduced a research preview of Operator, an agent that can go to the web to perform tasks for you.

AgentsImage UnderstandingMultimodal

Text + ImageOpen

GPT-4o (Jan 2025)

OpenAI

We’re announcing GPT-4o, our new flagship model that can reason across audio, vision, and text in real time.

Audio / SpeechImage GenerationMultimodal

AudioOpen

o3-mini

OpenAI

We’re releasing OpenAI o3-mini, the newest, most cost-efficient model in our reasoning series, available in both ChatGPT and the API today.

CodeReasoningText Generation

TextOpen

Eurus-2-7B-PRIME

Tsinghua University

Tsinghua University's mathematics model tracked by Epoch, focused on mathematical reasoning.

CodeReasoning

TextOpen

Grok 3

xAI

We are pleased to introduce Grok 3, our most advanced model yet: blending strong reasoning with extensive pretraining knowledge.

CodeImage GenerationMultimodal

ImageOpen

Mercury

Inception Labs

Today, we’re excited to announce that Mercury, our first general chat model, is available to support a wider range of text generation applications.

CodeText Generation

TextOpen

GPT-4.5

OpenAI

We advance AI capabilities by scaling two complementary paradigms: unsupervised learning and reasoning.

CodeImage GenerationMultimodal

ImageOpen

QwQ-32B

Alibaba

QwQ is the reasoning model of the Qwen series.

CodeReasoningText Generation

TextOpen

Mistral OCR

Mistral AI

Mistral OCR is an Optical Character Recognition API that sets a new standard in document understanding.

Image GenerationMultimodalText Generation

ImageOpen

Hunyuan-TurboS

Tencent

As Large Language Models (LLMs) rapidly advance, we introduce Hunyuan-TurboS, a novel large hybrid Transformer-Mamba Mixture of Experts (MoE) model.

CodeReasoningText Generation

TextOpen

EXAONE Deep 32B

LG AI Research

We present EXAONE Deep series, which exhibits superior capabilities in various reasoning tasks, including math and coding benchmarks.

CodeReasoningText Generation

TextOpen

ERNIE-4.5-VL-424B-A47B (文心大模型4.5)

Baidu

In this report, we introduce ERNIE 4.5, a new family of large-scale multimodal models comprising 10 distinct variants.

CodeImage GenerationMultimodal

VideoOpen

o1-pro

OpenAI

We've developed a new series of AI models designed to spend more time thinking before they respond.

CodeMultimodalReasoning

Text + ImageOpen

Diffusion Renderer

NVIDIA

Understanding and modeling lighting effects are fundamental tasks in computer vision and graphics.

Video Generation

VideoOpen

DeepSeek-V3 (Mar 2025)

DeepSeek

We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token.

CodeReasoningText Generation

TextOpen

Gemini 2.5 Pro (Mar 2025)

Google DeepMind

Gemini 2.5 Pro Experimental is our most advanced model for complex tasks.

Audio / SpeechCodeImage Generation

AudioOpen

GPT-4o (Mar 2025)

OpenAI

We’re announcing GPT-4o, our new flagship model that can reason across audio, vision, and text in real time.

Audio / SpeechImage GenerationMultimodal

AudioOpen

Llama 4 Scout

Meta

We’re sharing the first models in the Llama 4 herd, which will enable people to build more personalized multimodal experiences.

CodeImage GenerationMultimodal

ImageOpen

Llama 4 Maverick

Meta

We’re sharing the first models in the Llama 4 herd, which will enable people to build more personalized multimodal experiences.

CodeImage GenerationMultimodal

ImageOpen

Llama 4 Behemoth (preview)

Meta

We’re sharing the first models in the Llama 4 herd, which will enable people to build more personalized multimodal experiences.

CodeImage GenerationMultimodal

ImageOpen

Pangu Ultra

Huawei

We present Pangu Ultra, a Large Language Model (LLM) with 135 billion parameters and dense Transformer modules trained on Ascend Neural Processing Units (NPUs).

CodeText Generation

TextOpen

Qwen3-235B-A22B

Alibaba

Today, we are excited to announce the release of Qwen3, the latest addition to the Qwen family of large language models.

CodeReasoningText Generation

TextOpen

Gemini 2.5 Pro (May 2025)

Google DeepMind

Gemini 2.5 Pro Experimental is our most advanced model for complex tasks.

Audio / SpeechCodeImage Generation

AudioOpen

Seed1.5-VL

ByteDance

We present Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning.

Image GenerationMultimodalText Generation

VideoOpen

Claude Sonnet 4

Anthropic

Claude Sonnet 4 can understand nuanced instructions and context, recognize and correct its own mistakes, and create sophisticated analysis and insights from complex data.

AgentsCodeImage Generation

ImageOpen

DeepSeek-R1 (May 2025)

DeepSeek

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1.

CodeReasoningText Generation

TextOpen

Qwen3 Embedding

Alibaba

In this work, we introduce the Qwen3 Embedding series, a significant advancement over its predecessor, the GTE-Qwen series, in text embedding and reranking capabilities, built upon the Qwen3 foundation models.

Text Generation

TextOpen

Gemini 2.5 Pro (Jun 2025)

Google DeepMind

Gemini 2.5 Pro Experimental is our most advanced model for complex tasks.

Audio / SpeechCodeImage Generation

AudioOpen

Seed-1.6-Thinking

ByteDance

Seed1.6 is the latest general-purpose model series unveiled by the ByteDance Seed team.

Image GenerationMultimodalText Generation

ImageOpen

FGN

Google DeepMind

Google DeepMind's earth science model tracked by Epoch, focused on weather forecasting.

Text Generation

TextOpen

EXAONE Path 2.0

LG AI Research

LG AI Research's vision, medicine model tracked by Epoch, focused on cancer diagnosis.

Image Understanding

Text + ImageOpen

Gemini Embedding

Google DeepMind

In this report, we introduce Gemini Embedding, a state-of-the-art embedding model leveraging the power of Gemini, Google's most capable large language model.

Text Generation

TextOpen

EXAONE 4.0 (32B)

LG AI Research

This technical report introduces EXAONE 4.0, which integrates a Non-reasoning mode and a Reasoning mode to achieve both the excellent usability of EXAONE 3.5 and the advanced reasoning abilities of EXAONE Deep.

CodeReasoningText Generation

TextOpen

Qwen3-Coder-480B-A35B

Alibaba

Today, we're announcing Qwen3-Coder, our most agentic code model to date.

AgentsCodeText Generation

TextOpen

Qwen3-235B-A22B-Thinking (Jul 2025)

Alibaba

Today, we are excited to announce the release of Qwen3, the latest addition to the Qwen family of large language models.

CodeReasoningText Generation

TextOpen

Qwen3-235B-A22B (Jul 2025)

Alibaba

Today, we are excited to announce the release of Qwen3, the latest addition to the Qwen family of large language models.

CodeReasoningText Generation

TextOpen

MindLink-72B

Kunlun Inc.

We introduce MindLink, a new family of large language models developed by Kunlun Inc.

CodeReasoningText Generation

TextOpen

Gemini 2.5 Deep Think

Google DeepMind

To advance Gemini’s capabilities towards solving hard reasoning problems, we developed a novel reasoning approach, called Deep Think, that naturally blends in parallel thinking techniques during response generation.

Audio / SpeechCodeImage Generation

AudioOpen

Qwen Image

Alibaba

We present Qwen-Image, an image generation foundation model in the Qwen series that achieves significant advances in complex text rendering and precise image editing.

Image Generation

ImageOpen

Hierarchical Reasoning Model (HRM)

Sapient Intelligence

Reasoning, the process of devising and executing complex goal-oriented action sequences, remains a critical challenge in AI.

Image UnderstandingMultimodalText Generation

Text + ImageOpen

gpt-oss-20b

OpenAI

OpenAI's language model tracked by Epoch, focused on language modeling/generation.

Text Generation

TextOpen

Claude Opus 4.1

Anthropic

Today we're releasing Claude Opus 4.1, an upgrade to Claude Opus 4 on agentic tasks, real-world coding, and reasoning.

AgentsCodeImage Generation

ImageOpen

GPT-5 nano

OpenAI

OpenAI's multimodal, language, vision model tracked by Epoch, focused on language modeling/generation.

Image GenerationMultimodalText Generation

ImageOpen

GPT-5 mini

OpenAI

OpenAI's multimodal, language, vision model tracked by Epoch, focused on language modeling/generation.

Image GenerationMultimodalText Generation

ImageOpen

Gemini 2.5 Flash Image (Nano Banana)

Google

Text-to-Image: Generate high-quality images from simple or complex text descriptions.

Image Generation

ImageOpen

LongCat-Flash

Meituan Inc

We introduce LongCat-Flash, a 560-billion-parameter Mixture-of-Experts (MoE) language model designed for both computational efficiency and advanced agentic capabilities.

CodeReasoningText Generation

TextOpen

AgentFounder-30B

Alibaba

Large language models (LLMs) have evolved into agentic systems capable of autonomous tool use and multi-step reasoning for complex problem-solving.

AgentsCodeReasoning

TextOpen

Qwen3-Omni-30B-A3B

Alibaba

We present Qwen3-Omni, a single multimodal model that, for the first time, maintains state-of-the-art performance across text, image, audio, and video without any degradation relative to single-modal counterparts.

Audio / SpeechImage GenerationMultimodal

AudioOpen

Gemini Robotics-ER 1.5

Google DeepMind

Our most capable vision-language model (VLM) reasons about the physical world, natively calls digital tools and creates detailed, multi-step plans to complete a mission.

Audio / SpeechImage GenerationText Generation

AudioOpen

Sora 2.0

OpenAI

Our latest video generation model is more physically accurate, realistic, and more controllable than prior systems.

Video Generation

VideoOpen

GPT-5 Pro

OpenAI

OpenAI's multimodal, language, vision model tracked by Epoch, focused on language modeling/generation.

Image GenerationMultimodalText Generation

ImageOpen

Ling-1T

Ant Group

Ling-1T is the first flagship non-thinking model in the Ling 2.0 series, featuring 1 trillion total parameters with ≈ 50 billion active parameters per token.

CodeReasoningText Generation

TextOpen

Veo 3.1

Google DeepMind

We’re also introducing Veo 3.1, which brings richer audio, more narrative control, and enhanced realism that captures true-to-life textures.

Image GenerationVideo Generation

VideoOpen

MiniMax-M2

MiniMax

Today, we are officially open-sourcing and launching MiniMax M2, a model born for Agents and code.

AgentsCodeText Generation

TextOpen

Tongyi DeepResearch

Alibaba

We present Tongyi DeepResearch, an agentic large language model, which is specifically designed for long-horizon, deep information-seeking research tasks.

AgentsReasoningText Generation

TextOpen

Kimi K2 Thinking

Moonshot AI

Today, we are introducing Kimi K2 Thinking, our best open-source thinking model.

CodeReasoningText Generation

TextOpen

Meta's Generative Ads Model (GEM)

Meta

Meta's recommendation model tracked by Epoch, focused on recommender systems.

Embeddings

TextOpen

GPT-5.1-Codex

OpenAI

OpenAI's language model tracked by Epoch, focused on language modeling/generation.

Text Generation

TextOpen

GPT-5.1 Instant

OpenAI

Today we’re upgrading the GPT‑5 series with the release of GPT‑5.1 Instant: our most-used model, now warmer, more intelligent, and better at following your instructions.

Image GenerationMultimodalText Generation

ImageOpen

π0.6 (pi-0.6)

Physical Intelligence

We study how vision-language-action (VLA) models can improve through real-world deployments via reinforcement learning (RL).

AgentsImage UnderstandingMultimodal

Vision + ActionOpen

P1-235B-A22B

Shanghai AI Lab

Recent progress in large language models (LLMs) has moved the frontier from puzzle-solving to science-grade reasoning: the kind needed to tackle problems whose answers must stand against nature, not merely fit a rubric.

Text Generation

TextOpen

Grok 4.1 Fast

xAI

Today, we’re excited to launch two powerful new additions to the xAI API: Grok 4.1 Fast, our best tool-calling model with a 2M context window.

CodeReasoningText Generation

TextOpen

Gemini 3 Pro Image (Nano Banana Pro)

Google DeepMind

Today, we’re introducing Nano Banana Pro (Gemini 3 Pro Image), our new state-of-the-art image generation and editing model.

Image Generation

ImageOpen

DeepSeekMath-V2

DeepSeek

DeepSeek's language model tracked by Epoch, focused on mathematical reasoning.

ReasoningText Generation

TextOpen

SIMA 2

Google DeepMind

We introduce SIMA 2, a generalist embodied agent that understands and acts in a wide variety of 3D virtual worlds.

3D

3DOpen

GPT-5.2 Pro

OpenAI

OpenAI's multimodal, language, vision model tracked by Epoch, focused on language modeling/generation.

Image GenerationMultimodalText Generation

ImageOpen

GPT-5.2

OpenAI

OpenAI's model tracked by Epoch, focused on language modeling/generation.

Text Generation

TextOpen

Nemotron 3-Nano-30B-A3B

NVIDIA

We present Nemotron 3 Nano 30B-A3B, a Mixture-of-Experts hybrid Mamba-Transformer language model.

Text Generation

TextOpen

GPT-5.2 Codex

OpenAI

OpenAI's language model tracked by Epoch, focused on language modeling/generation.

Text Generation

TextOpen

GLM-4.7

Zhipu AI

Zhipu AI's language model tracked by Epoch, focused on language modeling/generation.

CodeText Generation

TextOpen

MiniMax-M2.1

MiniMax

MiniMax's language model tracked by Epoch, focused on chat.

AgentsCodeText Generation

TextOpen

HyperCLOVA X SEED 32B Think

NAVER

Developed by Naver, South Korea’s leading AI research lab, this cutting-edge language model supports multimodal inputs and advanced reasoning.

Image GenerationMultimodalText Generation

ImageOpen

VAETKI

NC AI

NC AI's language model tracked by Epoch, focused on language modeling/generation.

Text Generation

TextOpen

A.X K1

SK Telecom

SK Telecom's language model tracked by Epoch, focused on code generation.

CodeReasoningText Generation

TextOpen

Solar Open 100B

Upstage

Solar Open is Upstage's flagship 102B-parameter large language model, trained entirely from scratch and released under the Solar-Apache License 2.0.

Text Generation

TextOpen

K-EXAONE

LG AI Research

K-EXAONE is a large-scale multilingual language model developed by LG AI Research.

CodeReasoningText Generation

TextOpen

Qwen3-Max-Thinking

Alibaba

We present Qwen3-Max-Thinking, our latest flagship reasoning model.

Text Generation

TextOpen

Qwen3-Coder-Next

Alibaba

Today, we're announcing Qwen3-Coder-Next, an open-weight language model designed specifically for coding agents and local development.

CodeText Generation

TextOpen

Kimi K2.5

Moonshot AI

We introduce Kimi K2.5, an open-source multimodal agentic model designed to advance general agentic intelligence.

Text Generation

TextOpen

GPT-5.3 Codex

OpenAI

OpenAI's language model tracked by Epoch, focused on language modeling/generation.

Text Generation

TextOpen

Seedance 2.0

ByteDance

ByteDance's image generation, video, audio model tracked by Epoch, focused on video generation.

Audio / SpeechImage GenerationVideo Generation

AudioOpen

Qwen3.5 397B-A17B

Alibaba

We are delighted to announce the official release of Qwen3.5, introducing the open-weight of the first model in the Qwen3.5 series, namely Qwen3.5-397B-A17B.

Image GenerationText Generation

ImageOpen

Grok 4.20

xAI

xAI's language model tracked by Epoch, focused on language modeling/generation.

Text Generation

TextOpen

GLM-5

Zhipu AI

We present GLM-5, a next-generation foundation model designed to transition the paradigm of vibe coding to agentic engineering.

Text Generation

TextOpen

Gemini 3.1 Pro

Google DeepMind

Last week, we released a major update to Gemini 3 Deep Think to solve modern challenges across science, research and engineering.

Image GenerationText Generation

ImageOpen

Qwen3.5-122B-A10B

Alibaba

Over recent months, we have intensified our focus on developing foundation models that deliver exceptional utility and performance.

Text Generation

TextOpen

SWE 1.6

Cognition

We are sharing an early preview of our ongoing SWE-1.6 training run.

AgentsCodeText Generation

TextOpen

Gemini 3.0 Flash-lite

Google DeepMind

Today, we're introducing Gemini 3.1 Flash-Lite, our fastest and most cost-efficient Gemini 3 series model.

Text Generation

TextOpen

GPT-5.4 Pro

OpenAI

OpenAI's multimodal, language, vision model tracked by Epoch, focused on language modeling/generation.

Image GenerationMultimodalText Generation

ImageOpen

GPT-5.4

OpenAI

OpenAI's multimodal, language, vision model tracked by Epoch, focused on language modeling/generation.

Image GenerationMultimodalText Generation

ImageOpen

Nemotron 3 Super

NVIDIA

NVIDIA's language model tracked by Epoch, focused on language modeling/generation.

CodeText Generation

TextOpen

Composer 2

Anysphere

Composer 2 is a specialized model designed for agentic software engineering.

CodeText Generation

TextOpen

GLM-5.1

Zhipu AI

Zhipu AI's language model tracked by Epoch, focused on language modeling/generation.

Text Generation

TextOpen

Gemini Flash 3.1 TTS

Google DeepMind

Google DeepMind's audio model tracked by Epoch, focused on audio generation.

Audio / Speech

AudioOpen

Claude Opus 4.7

Anthropic

Anthropic's language model tracked by Epoch, focused on question answering.

Text Generation

TextOpen

Kimi K2.6

Moonshot AI

Moonshot AI's language model tracked by Epoch, focused on language modeling/generation.

Text Generation

TextOpen

GPT Image 2

OpenAI

OpenAI's image generation model tracked by Epoch, focused on image generation.

Image Generation

ImageOpen

GPT-5.5 Pro

OpenAI

OpenAI's multimodal, language, vision model tracked by Epoch, focused on language modeling/generation.

Image GenerationMultimodalText Generation

ImageOpen

GPT-5.5

OpenAI

OpenAI's multimodal, language, vision model tracked by Epoch, focused on language modeling/generation.

Image GenerationMultimodalText Generation

ImageOpen

DeepSeek-V4-Pro

DeepSeek

DeepSeek's language model tracked by Epoch, focused on language modeling/generation.

Text Generation

TextOpen

DeepSeek-V4-Flash

DeepSeek

DeepSeek's language model tracked by Epoch, focused on language modeling/generation.

Text Generation

TextOpen

Composer 2.5

Anysphere

Anysphere's language model tracked by Epoch, focused on coding.

CodeText Generation

TextOpen

Hume EVI 3

Hume AI

Empathic voice interface that perceives and generates emotional speech in real time.

Audio / SpeechMultimodal

VoiceOpen

Krea 1

Krea AI

Krea's in-house image model tuned for aesthetic control and real-time iteration.

Image Generation

ImageOpen

Marble

World Labs

Fei-Fei Li's World Labs spatial intelligence model that generates explorable 3D worlds from a single image.

3DImage Understanding

Image + 3DOpen

LFM2

Liquid AI

Liquid AI's second-generation efficient foundation models built on liquid neural networks for on-device use.

Text GenerationReasoning

TextOpen

DiscoPOP

Sakana AI

LLM-discovered preference optimization algorithm from Sakana's evolutionary research line.

Reasoning

TextOpen

Magic LTM-2-Mini

Magic.dev

100M-token context model purpose-built for whole-repository software synthesis.

CodeReasoning

TextOpen

Poolside Malibu

Poolside

Code-first foundation model trained with reinforcement learning from code-execution feedback.

CodeReasoning

TextOpen

Mirage

Decart

Real-time generative world model that re-skins live video streams with text prompts.

Video GenerationImage Generation

VideoOpen

Sonar Large

Perplexity

Perplexity's in-house search-grounded LLM powering the Perplexity answer engine.

Text GenerationReasoningAgents

TextOpen

Pi 3.0

Inflection AI

Inflection's empathetic conversational assistant tuned for personal, supportive dialogue.

Text GenerationAudio / Speech

Text + VoiceOpen

Kling 2.5

Kuaishou

Kuaishou's flagship text-to-video model with strong motion coherence and 1080p output.

Video Generation

VideoOpen

Imbue 70B

Imbue

Imbue's research model trained from scratch for robust agentic reasoning and code.

ReasoningCodeAgents

TextOpen

Sarvam 1

Sarvam AI

First Indic-first foundation model optimized for 10 Indian languages and English.

Text Generation

TextOpen

Aya Expanse 32B

Cohere

Massively multilingual open-weights model covering 23 languages from Cohere For AI.

Text Generation

TextOpen

Mercury Coder

Inception Labs

Diffusion-based LLM that generates code in parallel for order-of-magnitude latency gains.

CodeText Generation

TextOpen

Nous Hermes 4

Nous Research

Open-source aligned LLM family known for steerable, uncensored research use.

Text GenerationReasoning

TextOpen

Aleph 2

Runway

Runway's closed-source in-context video editing model that modifies existing videos while preserving untouched regions.

Video GenerationMultimodal

VideoOpen

LongCat Video Avatar 1.5

Meituan

Meituan LongCat's open-source audio-driven avatar video model for single- and multi-character human video generation.

Video GenerationMultimodal

Video + AudioOpen

Hy-MT2

Tencent

Tencent Hunyuan's open-source multilingual translation family for fast, instruction-following translation across 33 languages.

Text Generation

Text

Qwen 3.7 Max

Alibaba

Alibaba Cloud's closed-source trillion-parameter flagship LLM for coding, reasoning, and enterprise agentic workflows.

Reasoning, Code, Agents

Text + Image

Lens

Microsoft

Microsoft's open-source 3.8B text-to-image model focused on efficient training, fast high-res generation, and strong prompt adherence.

Image Generation

Image

Stable Audio 3 Medium

Stability AI

Stability AI's 2B text-to-audio diffusion model for higher-capacity music, sound-effect generation, and audio editing.

Music, Audio / Speech

Audio

Command A+ W4A4

Cohere

Cohere's open-source W4A4-quantized vision-language reasoning model for agentic, multilingual, tool-use enterprise tasks.

Reasoning, Multimodal, Agents

Text + Image

Gemini Omni Flash

Google DeepMind

Google DeepMind's closed-source multimodal video creation and editing model that generates or edits video from text, image, video, and audio references.

Video Generation, Image Generation, Multimodal

Text + Image + Video + Audio

OmniCraft Texture Generator

Deemos Technologies

Hyper3D OmniCraft Texture generates photorealistic, seamless, tileable PBR textures for 3D assets and design pipelines.

3D, Image Generation

3D + Image

Gemini 3.5 Flash

Google DeepMind

Google DeepMind's closed-source natively multimodal reasoning model for fast, high-capability agentic and coding tasks.

Reasoning, Multimodal, Code

Text + Image + Audio

Qwen3.5 LiveTranslate Flash

Alibaba

Alibaba's vision-enhanced real-time audio/video translation model for live multilingual interpretation across 60 languages.

Audio / Speech, Multimodal

Audio + Video + Text

Nemotron Labs Diffusion 14B

NVIDIA

NVIDIA's open 14B text-generation LM supporting autoregressive, diffusion-style parallel, and self-speculative decoding.

Text Generation, Reasoning

Text

Mirelo SFX 1.6

Mirelo AI

Mirelo's text-to-sound-effects model for production-ready Foley, ambience, and SFX generation.

Audio / Speech

Audio

WavFlow

Meta

Meta's audio generation model focused on high-fidelity waveform synthesis and speech-music co-generation.

Audio / Speech, Music

Audio

Lance

ByteDance

ByteDance's foundation model for fast multimodal content creation across short-form video pipelines.

Multimodal, Video Generation

Text + Image + Video

Agora 1

Odyssey

Odyssey's interactive world model for real-time AI-generated explorable video environments.

Video Generation, Agents, 3D

Interactive Video

HRM Text 1B

Sapient Intelligence

Sapient's 1B Hierarchical Reasoning Model for compact, structured chain-of-thought text generation.

Reasoning, Text Generation

Text

Dramabox

Resemble AI

Resemble AI's expressive multi-character voice acting model for long-form dramatic dialogue and narration.

Audio / Speech

Audio

Stable Audio 3 Small SFX

Stability AI

Stability AI's compact text-to-sound-effects diffusion model optimized for low-latency on-device SFX generation.

Audio / Speech

Audio

Stable Audio 3 Small Music

Stability AI

Stability AI's compact text-to-music diffusion model tuned for short, license-friendly musical loops and stems.

Music

Audio