Catalog Search · 705 models indexed

Search

Find any AI model in AIDB by name, maker, capability, industry or modality.


GPT-5

OpenAI

OpenAI's flagship multimodal reasoning model with long-context tool use.

Text Generation · Reasoning · Multimodal
Text + Image + Audio

GPT-4o

OpenAI

Real-time omni-model handling text, vision and voice in a single network.

Multimodal · Audio / Speech · Image Understanding
Text + Image + Audio

o3

OpenAI

Frontier reasoning model tuned for math, science and coding workflows.

Reasoning · Code
Text

Claude Sonnet 4.5

Anthropic

Anthropic's best coding and agentic model, strong at long autonomous tasks.

Code · Reasoning · Agents
Text + Image

Claude Opus 4

Anthropic

Top-tier reasoning model for research, analysis and complex writing.

Reasoning · Text Generation
Text + Image

Gemini 2.5 Pro

Google DeepMind

Long-context multimodal model with native tool use and 1M+ token window.

Multimodal · Reasoning · Code
Text + Image + Video + Audio

Gemini 2.5 Flash

Google DeepMind

Fast, cheap multimodal model optimised for high-volume production use.

Text Generation · Multimodal
Text + Image

Grok 4

xAI

xAI's flagship reasoning model with real-time X knowledge and tool use.

Reasoning · Text Generation · Agents
Text + Image

Llama 4

Meta

Meta's open-weights multimodal MoE family (Scout & Maverick).

Multimodal · Text Generation · Code
Text + Image

DeepSeek-V3

DeepSeek

High-performance open MoE LLM rivaling closed frontier models on benchmarks.

Text Generation · Reasoning · Code
Text

DeepSeek-R1

DeepSeek

Open reasoning model trained with RL, competitive with o1-class systems.

Reasoning · Code
Text

Mistral Large 2

Mistral AI

European frontier LLM strong at code, math and multilingual tasks.

Text Generation · Code
Text

Qwen3

Alibaba

Open multilingual model family with hybrid thinking modes.

Text Generation · Reasoning · Code
Text + Image

Command R+

Cohere

Enterprise-grade RAG and tool-use model for business workloads.

Text Generation · Agents
Text

Phi-4

Microsoft

Small language model punching above its weight on reasoning benchmarks.

Reasoning · Text Generation
Text

GPT-image-1

OpenAI

Production-grade image generation API with strong text rendering.

Image Generation
Image

DALL·E 3

OpenAI

Prompt-faithful image generator integrated across ChatGPT.

Image Generation
Image

Imagen 4

Google DeepMind

Photoreal image model with sharp typography and detail.

Image Generation
Image

Midjourney v7

Midjourney

Aesthetic-first image model beloved by designers and concept artists.

Image Generation
Image

Stable Diffusion 3.5

Stability AI

Open-weights image generator with strong fine-tuning ecosystem.

Image Generation
Image

FLUX.1

Black Forest Labs

State-of-the-art open image model from the researchers behind the original Stable Diffusion.

Image Generation
Image

Ideogram 2.0

Ideogram

Image model specialised in legible in-image typography and logos.

Image Generation
Image

Sora

OpenAI

Text-to-video model producing minute-long cinematic clips.

Video Generation
Video

Veo 3

Google DeepMind

High-fidelity video generation with native synchronised audio.

Video Generation · Audio / Speech
Video + Audio

Runway Gen-4

Runway

Pro video generation with consistent characters and worlds.

Video Generation
Video

Kling 2.0

Kuaishou

Chinese text-to-video model with strong physical realism.

Video Generation
Video

Pika 2.0

Pika Labs

Creative video generator with scene ingredients and edits.

Video Generation
Video

Whisper v3

OpenAI

Open multilingual speech recognition and translation model.

Audio / Speech
Audio

ElevenLabs v3

ElevenLabs

Best-in-class expressive TTS and voice cloning across 70+ languages.

Audio / Speech
Audio

Suno v4

Suno

Generates full songs with vocals from a text prompt.

Music
Audio

Udio

Udio

Text-to-music model focused on production-quality tracks.

Music
Audio

Claude Code

Anthropic

Agentic coding tool that lives in your terminal and edits real codebases.

Code · Agents
Text

GitHub Copilot

GitHub / OpenAI

In-IDE pair programmer powering code completion and chat.

Code · Agents
Text

Cursor

Anysphere

AI-first code editor with multi-file edits and background agents.

Code · Agents
Text

Devin

Cognition

Autonomous software engineer that plans, codes and ships PRs.

Agents · Code
Text

Codestral

Mistral AI

Code-specialised open model covering 80+ programming languages.

Code
Text

AlphaFold 3

Google DeepMind / Isomorphic

Predicts the structure and interactions of life's molecules.

Reasoning
Structured

Med-PaLM 2

Google

Medical LLM achieving expert-level performance on USMLE-style questions.

Reasoning · Text Generation
Text

Evo 2

Arc Institute

Genome-scale foundation model spanning DNA, RNA and proteins.

Reasoning
Sequence

BloombergGPT

Bloomberg

Finance-domain LLM trained on decades of market and news data.

Text Generation · Reasoning
Text

Harvey

Harvey

Generative AI platform purpose-built for elite law firms.

Text Generation · Agents
Text

Khanmigo

Khan Academy

AI tutor that guides students with Socratic questioning.

Text Generation · Agents
Text

Perplexity

Perplexity

Answer engine combining LLMs with cited live web search.

Agents · Text Generation
Text

NotebookLM

Google

Source-grounded research assistant with audio overviews.

Text Generation · Audio / Speech
Text + Audio

text-embedding-3-large

OpenAI

High-dimensional embeddings for search, RAG and clustering.

Embeddings
Text

Voyage-3

Voyage AI

Top-ranked retrieval embeddings, optimised for RAG quality.

Embeddings
Text

SAM 2

Meta

Segment Anything for images and video, in real time.

Image Understanding
Image + Video

RT-2

Google DeepMind

Vision-language-action model that controls robots from web knowledge.

Multimodal · Agents
Vision + Action

Genesis

Genesis Embodied AI

Generative physics platform for robotics simulation and 4D worlds.

3D · Agents
Simulation

TripoSR

Stability AI / Tripo

Fast single-image to 3D mesh reconstruction model.

3D
3D

Jasper

Jasper

Marketing copilot for brand-aware content at enterprise scale.

Text Generation · Agents
Text

Operator

OpenAI

Browser-using agent that performs tasks on the open web.

Agents
Web

GPT-4.1

OpenAI

Improved GPT-4 series model with stronger coding and instruction following.

Text Generation · Code · Reasoning
Text + Image

GPT-4.1 mini

OpenAI

Smaller, faster GPT-4.1 for production workloads.

Text Generation · Code
Text + Image

GPT-4.1 nano

OpenAI

Cheapest, fastest GPT-4.1 tier for high-volume tasks.

Text Generation
Text

o4-mini

OpenAI

Compact reasoning model balancing cost and quality.

Reasoning · Code
Text + Image

GPT-4o mini

OpenAI

Cost-efficient multimodal small model.

Text Generation · Multimodal
Text + Image

GPT-OSS 120B

OpenAI

Open-weights reasoning model, part of OpenAI's first open-weights language model release since GPT-2.

Reasoning · Text Generation
Text

TTS-1 / GPT-4o Voice

OpenAI

OpenAI text-to-speech voices via the audio API.

Audio / Speech
Audio

Claude Haiku 4.5

Anthropic

Anthropic's fastest and cheapest frontier-class small model.

Text Generation · Code
Text + Image

Claude 3.7 Sonnet

Anthropic

Hybrid reasoning model with extended thinking mode.

Reasoning · Code
Text + Image

Claude 3.5 Haiku

Anthropic

Fast, low-cost model for everyday tasks.

Text Generation
Text + Image

Gemini 2.5 Flash-Lite

Google DeepMind

Smallest, cheapest Gemini for high-volume tasks.

Text Generation
Text

Gemini 2.0 Flash

Google DeepMind

Multimodal model with native tool use and live API.

Multimodal · Agents
Text + Image + Audio

Gemma 3

Google

Open-weights model family in 1B–27B sizes for on-device & server.

Text Generation
Text + Image

PaliGemma 2

Google

Open vision-language models for fine-tuning.

Image Understanding
Image + Text

Lyria 2

Google DeepMind

Google's professional music generation model.

Music
Audio

Chirp 3

Google

High-fidelity expressive TTS voices on Google Cloud.

Audio / Speech
Audio

Llama 3.3 70B

Meta

Open-weights instruct model competitive with much larger LLMs.

Text Generation · Code
Text

Llama 3.2 Vision

Meta

Open multimodal model in 11B and 90B sizes.

Multimodal · Image Understanding
Text + Image

DINOv3

Meta

Self-supervised vision foundation model for image features.

Image Understanding · Embeddings
Image

Seamless M4T v2

Meta

Multilingual speech-to-speech and speech-to-text translation.

Audio / Speech
Audio + Text

MusicGen

Meta

Open text-to-music model from AudioCraft.

Music
Audio

Grok 4 Heavy

xAI

Multi-agent variant of Grok 4 for the hardest problems.

Reasoning · Agents
Text + Image

Grok Code Fast 1

xAI

Speed-optimised code model for agentic IDE workflows.

Code · Agents
Text

Aurora

xAI

Photoreal autoregressive image generation model.

Image Generation
Image

Mistral Medium 3

Mistral AI

Frontier-class performance at a fraction of the cost.

Text Generation · Code
Text + Image

Mistral Small 3.2

Mistral AI

Fast open-weights small model with strong reasoning.

Text Generation
Text

Pixtral Large

Mistral AI

124B multimodal model with state-of-the-art image understanding.

Multimodal · Image Understanding
Text + Image

Mixtral 8x22B

Mistral AI

Sparse mixture-of-experts open model.

Text Generation · Code
Text

Devstral

Mistral AI

Open agentic coding model built with All Hands AI.

Code · Agents
Text

Qwen3-Coder

Alibaba

Open agentic coding model in the Qwen3 family.

Code · Agents
Text

Qwen2.5-VL

Alibaba

Open vision-language model with strong document understanding.

Multimodal · Image Understanding
Text + Image

Wan 2.2

Alibaba

Open text-to-video and image-to-video model.

Video Generation
Video

GLM-4.5

Zhipu AI

Open agentic foundation model from Zhipu's GLM family.

Reasoning · Agents
Text

Kimi K2

Moonshot AI

Trillion-parameter open MoE model with strong agentic skills.

Reasoning · Agents · Code
Text

MiniMax-M1

MiniMax

Open reasoning model with 1M-token context.

Reasoning · Text Generation
Text

Hailuo 02

MiniMax

Cinematic text-to-video generator.

Video Generation
Video

Yi-Lightning

01.AI

Fast, low-cost frontier-tier LLM from 01.AI.

Text Generation · Reasoning
Text

Hunyuan-Large

Tencent

Open MoE model with 389B params from Tencent.

Text Generation
Text

ERNIE 4.5

Baidu

Baidu's flagship multimodal foundation model.

Multimodal · Reasoning
Text + Image

Doubao 1.5 Pro

ByteDance

ByteDance's flagship LLM, widely deployed in China.

Text Generation · Multimodal
Text + Image

Seedance 1.0

ByteDance

ByteDance Seed video generation model.

Video Generation
Video

Recraft V3

Recraft

Image model designed for brand & vector-style design assets.

Image Generation
Image

Adobe Firefly Image 4

Adobe

Commercially-safe image model trained on licensed data.

Image Generation
Image

Leonardo Phoenix

Leonardo.Ai

In-house foundation model with strong prompt adherence.

Image Generation
Image

Playground v3

Playground

Image model focused on graphic design and typography.

Image Generation
Image

FLUX.1 Kontext

Black Forest Labs

Image editing model with character & style consistency.

Image Generation
Image

HiDream-I1

HiDream

Open 17B image generation model topping benchmarks.

Image Generation
Image

Luma Ray 2

Luma AI

Large video generative model with realistic motion.

Video Generation
Video

HeyGen Avatar IV

HeyGen

AI avatar video generator for marketing and training.

Video Generation · Audio / Speech
Video

Synthesia

Synthesia

Enterprise AI video platform with realistic avatars.

Video Generation
Video

HunyuanVideo

Tencent

Open 13B text-to-video model.

Video Generation
Video

Mochi 1

Genmo

Open-source video generation model.

Video Generation
Video

LTX Video

Lightricks

Real-time open video generation model.

Video Generation
Video

Cartesia Sonic

Cartesia

Ultra-low-latency state-space TTS model.

Audio / Speech
Audio

PlayHT 3.0

PlayHT

Conversational TTS optimised for AI agents.

Audio / Speech
Audio

Resemble AI

Resemble AI

Voice cloning and real-time speech synthesis platform.

Audio / Speech
Audio

Deepgram Nova-3

Deepgram

Production-grade streaming speech-to-text model.

Audio / Speech
Audio

AssemblyAI Universal-2

AssemblyAI

Highly accurate speech recognition with rich audio intelligence.

Audio / Speech
Audio

Moonshine

Useful Sensors

Open ASR model optimised for real-time edge inference.

Audio / Speech
Audio

Stable Audio 2.0

Stability AI

Generates full-length audio tracks from text.

Music
Audio

Riffusion

Riffusion

AI music generation with vocal & instrumental control.

Music
Audio

Lovable

Lovable

AI full-stack builder that ships production web apps from prompts.

Code · Agents
Text

Bolt.new

StackBlitz

Browser-based AI agent that builds and runs full-stack apps.

Code · Agents
Text

v0

Vercel

Generative UI tool that produces React + Tailwind components.

Code
Text

Replit Agent

Replit

Agent that creates, edits and deploys apps inside Replit.

Code · Agents
Text

Windsurf (Cascade)

Codeium

Agentic IDE with deep multi-file flows.

Code · Agents
Text

Tabnine

Tabnine

Privacy-first AI code assistant for the enterprise.

Code
Text

Amazon Q Developer

AWS

Coding & cloud assistant deeply integrated with AWS.

Code · Agents
Text

Sourcegraph Cody

Sourcegraph

Code AI with deep codebase context across repos.

Code
Text

Aider

Aider

Open-source CLI pair-programmer using your favorite LLM.

Code · Agents
Text

Continue

Continue

Open-source AI code assistant for VS Code & JetBrains.

Code
Text

ChatGPT Agent

OpenAI

ChatGPT mode that browses, codes and acts on your behalf.

Agents
Web

Manus

Butterfly Effect

General-purpose autonomous agent that executes long workflows.

Agents
Web

Google AI Mode

Google

Conversational AI search experience in Google.

Agents · Text Generation
Text

You.com

You.com

AI assistant combining web search with multi-model chat.

Agents
Text

Brave Leo

Brave

Privacy-respecting AI assistant built into the Brave browser.

Text Generation
Text

Cohere Embed v4

Cohere

Multimodal multilingual embeddings for enterprise RAG.

Embeddings
Text + Image

Jina Embeddings v3

Jina AI

Multilingual long-context embedding model.

Embeddings
Text

BGE-M3

BAAI

Open multi-functional, multilingual embedding model.

Embeddings
Text

Nomic Embed v2

Nomic

Open MoE multilingual text embeddings.

Embeddings
Text

ESM3

EvolutionaryScale

Frontier protein language model for biological design.

Reasoning
Sequence

RFdiffusion

Baker Lab

Open diffusion model for de novo protein design.

Reasoning
Sequence

Boltz-1

MIT / Recursion

Open AlphaFold3-class biomolecular structure prediction model.

Reasoning
Structured

GraphCast

Google DeepMind

Best-in-class medium-range global weather forecasting AI.

Reasoning
Structured

GenCast

Google DeepMind

Probabilistic AI weather forecasting model that outperforms ECMWF's ENS ensemble forecast.

Reasoning
Structured

Tx-LLM

Google

LLM tuned for therapeutic and drug development tasks.

Reasoning
Text

π0 (Pi-Zero)

Physical Intelligence

Generalist vision-language-action model for robots.

Multimodal · Agents
Vision + Action

Helix

Figure

Vision-language-action model for humanoid robot control.

Multimodal · Agents
Vision + Action

GR00T N1

NVIDIA

Open foundation model for humanoid robots.

Multimodal · Agents
Vision + Action

Cosmos

NVIDIA

World foundation models for physical AI simulation.

3D · Agents
Simulation

Genie 3

Google DeepMind

Real-time interactive world model from a text prompt.

3D · Agents
Simulation

Meshy 5

Meshy

Text & image to 3D model generator for creators.

3D
3D

Rodin Gen-2

Hyper3D

High-fidelity 3D asset generation with PBR textures.

3D
3D

Tripo 3.0

Tripo AI

Production-quality text and image to 3D.

3D
3D

CoCounsel

Thomson Reuters

Generative AI legal assistant for lawyers.

Text Generation · Agents
Text

Hebbia Matrix

Hebbia

Agentic research platform for finance and legal teams.

Agents · Text Generation
Text

Glean Assistant

Glean

Enterprise AI assistant grounded in company knowledge.

Agents · Text Generation
Text

Microsoft 365 Copilot

Microsoft

AI assistant across Word, Excel, Outlook, PowerPoint and Teams.

Text Generation · Agents
Text

Gemini for Workspace

Google

AI in Gmail, Docs, Sheets, Slides and Meet.

Text Generation
Text

Notion AI

Notion

AI for writing, search and meeting notes inside Notion.

Text Generation · Agents
Text

Copy.ai

Copy.ai

GTM AI platform for marketing and sales workflows.

Text Generation · Agents
Text

Writer Palmyra X5

Writer

Enterprise LLM family powering Writer's generative platform.

Text Generation
Text

Pi

Inflection

Personal AI focused on emotionally intelligent conversation.

Text Generation
Text

Character.AI

Character.AI

Platform for creating and chatting with AI characters.

Text Generation
Text

Duolingo Max

Duolingo

AI-powered language tutoring features.

Text Generation · Audio / Speech
Text + Audio

OLMo 2

Allen Institute (AI2)

Fully open language model with training data and code released.

Text Generation
Text

Falcon 3

TII

Open LLM family from the Technology Innovation Institute.

Text Generation
Text

SmolLM3

Hugging Face

Compact open model strong in its size class.

Text Generation
Text

Reka Flash 3

Reka

Multimodal frontier model with open weights.

Multimodal · Reasoning
Text + Image + Video + Audio

Nemotron 4

NVIDIA

NVIDIA's open LLM family for synthetic data and reasoning.

Text Generation · Reasoning
Text

IBM Granite 3

IBM

Open enterprise-ready foundation models.

Text Generation · Code
Text

Snowflake Arctic

Snowflake

Open enterprise LLM optimised for SQL and coding.

Code · Text Generation
Text

Databricks DBRX

Databricks

Open MoE LLM tuned for enterprise tasks.

Text Generation
Text

Dell AI Factory (with NVIDIA)

Dell Technologies

End-to-end AI infrastructure stack combining PowerEdge servers, storage, networking and NVIDIA AI Enterprise.

Agents · Reasoning
Infrastructure

Dell Pro AI Studio

Dell Technologies

Toolkit for deploying on-device AI models to Dell AI PCs at scale.

Agents
On-device

Lenovo AI Now

Lenovo

On-device AI assistant running locally on Lenovo AI PCs for private productivity.

Agents · Text Generation
On-device

Lenovo Hybrid AI Advantage

Lenovo

Hybrid AI platform spanning ThinkSystem servers, ThinkEdge devices and managed services.

Agents
Infrastructure

AMD Instinct MI350

AMD

Datacenter accelerator (CDNA 4) for training and inference of frontier models.

Reasoning
Accelerator

AMD Ryzen AI 300

AMD

Laptop CPU with XDNA 2 NPU delivering 50+ TOPS for Copilot+ AI PCs.

Agents
On-device

AMD ROCm

AMD

Open software stack for GPU compute and AI on Instinct & Radeon hardware.

Code
Stack

NVIDIA DGX Cloud

NVIDIA

Managed AI training service on dedicated NVIDIA Hopper/Blackwell clusters.

Reasoning
Cloud

NVIDIA Blackwell B200

NVIDIA

Flagship datacenter GPU for trillion-parameter AI training and inference.

Reasoning
Accelerator

NVIDIA NIM Microservices

NVIDIA

Containerized inference microservices for deploying optimized AI models anywhere.

Agents
Service

NVIDIA Jetson Thor

NVIDIA

Edge robotics platform built on Blackwell for humanoid and physical AI.

Agents
Edge

NVIDIA DRIVE Thor

NVIDIA

Centralized car computer for AV, cockpit AI and infotainment.

Agents · Multimodal
Edge

Intel Gaudi 3

Intel

Datacenter AI accelerator positioned on price/performance against NVIDIA's H100.

Reasoning
Accelerator

Intel Core Ultra (Lunar Lake)

Intel

Laptop CPU with integrated NPU powering Copilot+ on-device AI.

Agents
On-device

Intel OpenVINO

Intel

Open toolkit to optimize and deploy AI inference across Intel CPUs, GPUs and NPUs.

Code
Toolkit

Qualcomm Snapdragon X Elite

Qualcomm

Arm laptop SoC with 45 TOPS Hexagon NPU for Copilot+ PCs.

Agents
On-device

Qualcomm AI Hub

Qualcomm

Library of optimized AI models ready to deploy on Snapdragon devices.

Multimodal · Code
On-device

Apple Intelligence

Apple

On-device + private cloud generative AI across iPhone, iPad and Mac.

Multimodal · Agents · Image Generation
On-device

Apple Neural Engine (M4)

Apple

38 TOPS NPU integrated in Apple Silicon for on-device ML workloads.

Multimodal
Accelerator

Google Tensor G5

Google

Pixel SoC powering on-device Gemini Nano features.

Multimodal · Audio / Speech
On-device

Google TPU v5p / Trillium

Google

Custom AI accelerators powering Gemini training and Google Cloud AI.

Reasoning
Accelerator

Gemini Nano (on Pixel)

Google

Smallest Gemini model running fully on-device on Pixel and Android.

Text Generation · Multimodal
On-device

Samsung Galaxy AI

Samsung

Suite of on-device + cloud AI features for Galaxy phones (translate, edit, summarize).

Multimodal · Audio / Speech · Image Generation
Hybrid

Samsung Gauss2

Samsung

Samsung's in-house generative model family for Galaxy products.

Text Generation · Code · Image Generation
Text + Image

HPE Private Cloud AI

Hewlett Packard Enterprise

Turnkey on-prem AI cloud built with NVIDIA, co-engineered for enterprises.

Agents · Reasoning
Infrastructure

HP AI Companion

HP

Local AI assistant bundled with HP AI PCs for private document Q&A.

Text Generation · Agents
On-device

IBM watsonx

IBM

Enterprise AI & data platform with model studio, governance and runtime.

Agents · Text Generation
Platform

Cisco AI Defense

Cisco

Security platform that protects AI applications from misuse and attacks.

Agents
Security

Cisco AI Pods

Cisco

Pre-validated infrastructure stacks for inference at the edge of the enterprise.

Agents
Infrastructure

Pure Storage AIRI//S

Pure Storage

AI-ready storage stack co-engineered with NVIDIA for training pipelines.

Reasoning
Infrastructure

NetApp AIPod

NetApp

Converged AI infrastructure with ONTAP storage and NVIDIA compute.

Reasoning
Infrastructure

Supermicro SuperCluster

Supermicro

Liquid-cooled GPU SuperClusters for trillion-parameter LLM training.

Reasoning
Infrastructure

Cerebras WSE-3 / CS-3

Cerebras

Wafer-scale AI processor delivering record-breaking inference throughput.

Reasoning
Accelerator

Groq LPU

Groq

Language Processing Unit delivering ultra-low-latency LLM inference.

Reasoning
Accelerator

SambaNova Suite

SambaNova Systems

Full-stack AI platform with Reconfigurable Dataflow Units (RDUs).

Reasoning · Agents
Platform

Tenstorrent Wormhole

Tenstorrent

Open RISC-V-based AI accelerator from Jim Keller's team.

Reasoning
Accelerator

AWS Trainium2

AWS

Custom AWS chip purpose-built for training large language models.

Reasoning
Accelerator

AWS Inferentia2

AWS

Cost-optimized AWS chip for high-throughput LLM inference.

Reasoning
Accelerator

Amazon Bedrock

AWS

Managed service to build agents using foundation models from many vendors.

Agents · Text Generation
Platform

Amazon Nova

AWS

Amazon's foundation model family (text, image, video) on Bedrock.

Multimodal · Image Generation · Video Generation
Text + Image + Video

Azure AI Foundry

Microsoft

Unified platform to design, customize and operate enterprise AI agents.

Agents
Platform

Microsoft Copilot+ PC

Microsoft

Windows AI PC category with NPU-powered features like Recall and Live Captions.

Agents · Image Understanding
On-device

Tesla FSD v13

Tesla

End-to-end neural network for autonomous driving on Tesla vehicles.

Multimodal · Agents
Vision + Action

Mercedes MB.OS with Google Cloud AI

Mercedes-Benz

In-car operating system with conversational AI assistant powered by Google Cloud.

Multimodal · Agents
In-vehicle

BMW Intelligent Personal Assistant

BMW

Voice-first in-car AI assistant integrating Alexa LLM features.

Audio / Speech · Agents
In-vehicle

Mobileye Chauffeur

Mobileye

Eyes-off AV system combining EyeQ chips, surround sensing and REM mapping.

Multimodal · Agents
Vision + Action

Waymo Driver

Waymo

Full-stack autonomous driving system deployed in robotaxis.

Multimodal · Agents
Vision + Action

John Deere See & Spray

John Deere

Computer vision system that targets herbicide only at weeds in real time.

Image Understanding · Agents
Edge Vision

Boston Dynamics Orbit (with AI)

Boston Dynamics

Software platform managing Spot robots with AI-driven inspection routines.

Agents · Image Understanding
Robotics Platform

Figure 02

Figure

Humanoid robot powered by the Helix VLA model for general-purpose work.

Multimodal · Agents
Humanoid

Tesla Optimus

Tesla

General-purpose humanoid robot using Tesla's autonomy stack.

Multimodal · Agents
Humanoid

Unitree G1

Unitree

Affordable humanoid robot platform with onboard AI.

Agents
Humanoid

Rabbit R1

Rabbit

Pocket AI device built around the Large Action Model paradigm.

Agents · Audio / Speech
Device

Humane AI Pin

Humane

Wearable AI assistant with laser projection and voice-first UI.

Agents · Multimodal
Wearable

Meta Ray-Ban (with Meta AI)

Meta

Smart glasses with multimodal Meta AI for live look-and-ask.

Multimodal · Audio / Speech
Wearable

CrowdStrike Charlotte AI

CrowdStrike

Generative AI security analyst built into the Falcon platform.

Agents
SaaS

Microsoft Security Copilot

Microsoft

Generative AI assistant for SOC analysts and IT admins.

Agents
SaaS

Palo Alto AI Access Security

Palo Alto Networks

Discovery and protection for employee use of generative AI apps.

Agents
SaaS

Salesforce Agentforce

Salesforce

Platform for building autonomous AI agents on top of CRM data.

Agents
Platform

ServiceNow Now Assist

ServiceNow

Generative AI built into the ServiceNow workflow platform.

Agents · Text Generation
Platform

Oracle AI Agents

Oracle

AI agents embedded across Oracle Fusion Cloud apps and OCI.

Agents
Platform

SAP Joule

SAP

Generative AI copilot embedded across SAP business applications.

Agents · Text Generation
Platform

John Deere See & Spray

John Deere

Computer-vision sprayer that targets weeds in real time, cutting herbicide use up to 60%.

Image Understanding · Agents
Hardware + Vision

Climate FieldView

Bayer / Climate

Digital agronomy platform with AI-driven yield, planting and nitrogen recommendations.

Agents
SaaS

Plantix

PEAT

Mobile crop-disease diagnosis from a single leaf photo, used by 30M+ smallholders.

Image Understanding
Mobile

Blue River Technology

John Deere

Robotics + computer vision for precision weeding and crop care.

Image Understanding · Agents
Hardware + Vision

Taranis

Taranis

Aerial leaf-level imagery with AI for early pest, disease and nutrient detection.

Image Understanding
Aerial Imagery

CropX

CropX

Soil-sensor + AI agronomic platform for irrigation and nitrogen optimization.

Agents
IoT + SaaS

AGCO Fuse Smart Farming

AGCO

Connected farm AI platform for fleets of Massey Ferguson and Fendt machinery.

Agents
Platform

Carbon Robotics LaserWeeder

Carbon Robotics

AI-guided lasers identify and zap weeds in row crops without chemicals.

Image Understanding · Agents
Hardware

Indigo Carbon

Indigo Ag

AI-powered soil carbon and regenerative-ag marketplace.

Agents
SaaS

Palantir AIP for Gov

Palantir

AI Platform deploying LLMs against classified and government datasets with audit and policy controls.

Agents · Reasoning
Platform

GovGPT (Microsoft Azure Gov)

Microsoft

Azure OpenAI Service in Azure Government for IL5 / FedRAMP High workloads.

Text Generation · Agents
Cloud

AWS GovCloud Bedrock

AWS

Amazon Bedrock foundation models in AWS GovCloud for US public-sector workloads.

Text Generation · Multimodal
Cloud

Google Public Sector Gemini

Google

Gemini and Vertex AI tailored for federal, state and local government.

Multimodal · Agents
Cloud

Oracle Government AI

Oracle

Generative AI in Oracle Cloud for Government across HCM, ERP and citizen services.

Agents
Cloud

ServiceNow Citizen Engagement

ServiceNow

AI-powered citizen-services workflows for federal and local agencies.

Agents
SaaS

Veritone Public Sector

Veritone

AI for evidence redaction, transcription and investigations for law enforcement.

Audio / Speech · Image Understanding
SaaS

Anduril Lattice

Anduril

AI command-and-control mesh fusing sensors, drones and effectors across the battlespace.

Agents · Multimodal
Platform

Shield AI Hivemind

Shield AI

Autonomy stack flying GPS- and comms-denied missions on V-BAT and F-16.

Agents
Autonomy Stack

Helsing

Helsing

European defense AI for sensor fusion, electronic warfare and autonomous strike.

Agents · Multimodal
Platform

Palantir Maven Smart System

Palantir

AI-driven targeting and ISR fusion deployed across US combatant commands.

Agents · Image Understanding
Platform

Scale Donovan

Scale AI

LLM-powered decision-making platform for defense and intelligence analysts.

Agents · Reasoning
Platform

Lockheed Martin Astris AI

Lockheed Martin

Defense-grade AI infrastructure subsidiary supporting national security missions.

Agents
Platform

BAE Systems FAST Labs AI

BAE Systems

Autonomous and AI/ML systems for ISR, EW and mission systems.

Agents · Image Understanding
Platform

Saab Loke

Saab

AI-enabled command-and-control suite for joint and combined operations.

Agents
Platform

Rebellion Defense

Rebellion Defense

AI products for ISR, mission planning and computer vision in defense.

Image Understanding · Agents
Platform

Vannevar Labs

Vannevar Labs

AI for open-source intelligence and non-traditional collection.

Reasoning · Agents
SaaS

Booking.com AI Trip Planner

Booking.com

Conversational trip planner suggesting destinations, hotels and itineraries.

Agents · Text Generation
SaaS

Expedia Romie

Expedia

Group-travel AI assistant integrated in chat for planning and booking.

Agents
SaaS

Kayak Ask Kayak

Kayak

ChatGPT-powered travel concierge for searching flights, hotels and cars.

Agents
SaaS

TripAdvisor AI Trips

TripAdvisor

AI-generated personalized itineraries from reviews and traveler data.

Text Generation · Agents
SaaS

Hopper GPT

Hopper

Conversational price-prediction and booking assistant for flights and hotels.

Agents
Mobile

Hilton ConfirmedConnectingRooms AI

Hilton

AI-driven family-room matching and stay personalization across Hilton brands.

Agents
SaaS

Marriott Renai

Marriott

Generative-AI experiences for guest service and personalization.

Agents
SaaS

Airbnb AI Search

Airbnb

AI-powered search, photo-tour categorization and host customer-service agent.

Agents · Image Understanding
SaaS

GM Super Cruise

General Motors

Hands-free driver-assistance system using AI perception and HD-map fusion.

Agents · Image Understanding
ADAS

Ford BlueCruise

Ford

Hands-free highway driving with AI driver monitoring.

Agents
ADAS

Hyundai Pleos

Hyundai Motor

AI-powered software-defined vehicle OS with voice and personalization.

Agents · Audio / Speech
Vehicle OS

Stellantis STLA AutoDrive

Stellantis

Level-3 autonomous-driving stack across Stellantis brands.

Agents
ADAS

Cruise (GM)

Cruise

Driverless robotaxi platform built on GM vehicles.

Agents
Autonomy Stack

Zoox

Amazon

Purpose-built autonomous robotaxi with end-to-end AI driving stack.

Agents
Autonomy Stack

Wayve GAIA-2

Wayve

Generative world model for end-to-end embodied driving.

Video Generation · Agents
World Model

Siemens Industrial Copilot

Siemens

Generative AI copilot for engineers across PLC, design and operations.

Agents · Code
Platform

Rockwell FactoryTalk Optix AI

Rockwell Automation

AI-enabled HMI and analytics for industrial automation.

Agents
PlatformOpen

GE Vernova AI

GE Vernova

AI for grid orchestration, wind-turbine optimization and power generation.

Agents
PlatformOpen

Schneider Electric EcoStruxure AI

Schneider Electric

AI for energy management and industrial automation across EcoStruxure.

Agents
PlatformOpen

ABB Ability Genix

ABB

Industrial analytics and AI platform for process and discrete manufacturing.

Agents
PlatformOpen

Honeywell Forge

Honeywell

Industrial AI for buildings, aerospace and process industries.

Agents
PlatformOpen

SLB Lumi

SLB (Schlumberger)

Generative-AI platform for energy operations across exploration and production.

Agents
PlatformOpen

Baker Hughes Leucipa

Baker Hughes

Autonomous field-operations AI for oil & gas production.

Agents
PlatformOpen

Tapestry (X / Alphabet)

Alphabet X

AI-driven virtualization platform for the electric grid.

Agents
PlatformOpen

Octopus Kraken

Octopus Energy

AI-native customer and grid platform powering 60M+ energy accounts.

Agents
SaaSOpen

Ericsson Cognitive Network Solutions

Ericsson

AI/ML for autonomous 5G network operations and energy savings.

Agents
PlatformOpen

Nokia MX Industrial Edge AI

Nokia

Edge AI platform for private 5G and industrial automation.

Agents
EdgeOpen

AT&T Ask AT&T

AT&T

Internal generative-AI assistant built on Azure OpenAI for 80k+ employees.

Agents
SaaSOpen

Verizon Personal Research Assistant

Verizon

Generative-AI agent for customer-service and field operations.

Agents
SaaSOpen

Vodafone TOBi

Vodafone

AI customer-service chatbot serving hundreds of millions of subscribers.

Agents
SaaSOpen

Amazon Rufus

Amazon

Generative-AI shopping assistant inside the Amazon app.

Agents
MobileOpen

Shopify Sidekick

Shopify

AI commerce assistant that runs the merchant's store via natural language.

Agents
SaaSOpen

Walmart Sparky

Walmart

Generative-AI shopping assistant in the Walmart app.

Agents
MobileOpen

Klarna AI Assistant

Klarna

OpenAI-powered customer-service agent handling two-thirds of Klarna chats.

Agents
SaaSOpen

Instacart Ask Instacart

Instacart

ChatGPT-powered grocery search and meal planning.

Agents
MobileOpen

Maersk Captain Peter

Maersk

AI-powered remote container monitoring across reefer fleets.

Agents
IoT + SaaSOpen

FedEx Surround

FedEx

AI logistics intelligence platform built with Microsoft for shipment visibility.

Agents
PlatformOpen

UPS DeliveryDefense

UPS

Machine-learning system scoring delivery-success likelihood for shippers.

Agents
APIOpen

Project44 Movement GPT

Project44

Generative-AI supply-chain assistant on top of real-time visibility data.

Agents
SaaSOpen

Blue Yonder Cognitive Solutions

Blue Yonder

AI/ML supply-chain planning, forecasting and execution.

Agents
SaaSOpen

Zillow Zestimate (Neural)

Zillow

Neural-network home-value estimation across 100M+ US homes.

Agents
APIOpen

Procore Copilot

Procore

Generative-AI assistant for construction project management.

Agents
SaaSOpen

Autodesk AI

Autodesk

Generative design and AI across AutoCAD, Revit, Forma and Fusion.

Agents3D
PlatformOpen

HPE Aruba Networking Central AI

HPE Aruba Networking

AIOps for wired, wireless and SD-WAN with predictive issue resolution.

Agents
SaaSOpen

Aruba Networking AI Assistant

HPE Aruba Networking

Conversational AI inside Aruba Central for network troubleshooting.

Agents
SaaSOpen

Cisco AI Assistant

Cisco

Cross-portfolio AI assistant for security, networking and collaboration.

Agents
SaaSOpen

Cisco Hypershield

Cisco

AI-native distributed security fabric for data centers and clouds.

Agents
PlatformOpen

Juniper Mist AI / Marvis

Juniper Networks

AI-driven networking and Marvis virtual network assistant.

Agents
SaaSOpen

Arista CloudVision AVA

Arista Networks

Autonomous Virtual Assist AI for network operations and security.

Agents
PlatformOpen

Extreme AI Expert

Extreme Networks

GenAI assistant for network operations across the Extreme platform.

Agents
SaaSOpen

F5 AI Gateway

F5

Application-delivery and security AI gateway for LLM apps.

Agents
GatewayOpen

Fortinet FortiAI

Fortinet

GenAI security analyst across the Fortinet Security Fabric.

Agents
SaaSOpen

Zscaler ZDX Copilot

Zscaler

Generative-AI copilot for digital experience and zero-trust operations.

Agents
SaaSOpen

Palo Alto Strata Copilot

Palo Alto Networks

GenAI copilot for network security across the Strata portfolio.

Agents
SaaSOpen

Palo Alto Cortex XSIAM

Palo Alto Networks

AI-driven SOC platform unifying SIEM, EDR and SOAR.

Agents
PlatformOpen

Check Point Infinity AI Copilot

Check Point

Generative-AI assistant for security administration and threat analysis.

Agents
SaaSOpen

SentinelOne Purple AI

SentinelOne

Generative-AI threat-hunting analyst across the Singularity platform.

Agents
SaaSOpen

Darktrace ActiveAI

Darktrace

Self-learning AI platform for autonomous response across email, network and cloud.

Agents
PlatformOpen

Vectra AI Platform

Vectra AI

AI-driven threat detection and response across hybrid cloud.

Agents
PlatformOpen

Veeam Data Intelligence

Veeam

AI-powered data resilience, anomaly detection and recovery analytics.

Agents
PlatformOpen

Commvault Cloud Arlie

Commvault

GenAI assistant for cyber resilience, recovery and data protection.

Agents
SaaSOpen

Rubrik Ruby

Rubrik

Generative-AI assistant for cyber recovery investigations and remediation.

Agents
SaaSOpen

Cohesity Gaia

Cohesity

RAG-based AI search and insights over enterprise backup data.

AgentsEmbeddings
SaaSOpen

Pure Storage AIRI

Pure Storage

AI-ready infrastructure (with NVIDIA DGX) for training and inference at scale.

Agents
Reference ArchitectureOpen

NetApp AIPod

NetApp

Converged AI infrastructure with NVIDIA for enterprise model training.

Agents
Reference ArchitectureOpen

Dell PowerScale for AI

Dell Technologies

Scale-out file storage tuned for large-scale AI training and RAG.

Agents
StorageOpen

VAST Data Platform

VAST Data

Unified data platform for AI with embedded vector database and compute.

EmbeddingsAgents
PlatformOpen

Weka AI

WEKA

High-performance data platform for GPU-accelerated AI pipelines.

Agents
StorageOpen

DDN A³I

DDN

Reference AI storage architecture co-engineered with NVIDIA DGX SuperPOD.

Agents
StorageOpen

Hitachi Vantara iQ

Hitachi Vantara

Industry-tailored generative AI solutions on Hitachi infrastructure.

Agents
PlatformOpen

IBM watsonx.ai

IBM

Enterprise studio for building, training and deploying foundation models.

Text GenerationAgents
PlatformOpen

IBM Granite

IBM

Open enterprise LLM family for code, language and time series.

Text GenerationCode
TextOpen

Snowflake Cortex AI

Snowflake

Managed LLMs and agents running directly inside Snowflake.

Text GenerationAgents
PlatformOpen

Databricks Mosaic AI

Databricks

End-to-end platform for building and serving custom AI agents on the lakehouse.

AgentsEmbeddings
PlatformOpen

DBRX

Databricks

Open MoE LLM by Databricks for enterprise customization.

Text GenerationCode
TextOpen

Pinecone

Pinecone

Managed vector database powering RAG and semantic-search apps.

Embeddings
DatabaseOpen

Weaviate

Weaviate

Open-source vector database with hybrid search and modules.

Embeddings
DatabaseOpen

Elastic AI Assistant

Elastic

GenAI assistant across Elastic Search, Observability and Security.

AgentsEmbeddings
PlatformOpen

Splunk AI Assistant

Splunk (Cisco)

GenAI assistant for SPL, observability and security operations.

Agents
SaaSOpen

New Relic AI

New Relic

Generative-AI observability assistant for engineers.

Agents
SaaSOpen

Datadog Bits AI

Datadog

Generative-AI assistant across Datadog observability and security.

Agents
SaaSOpen

Dynatrace Davis CoPilot

Dynatrace

Hypermodal AI combining causal, predictive and generative AI for observability.

Agents
PlatformOpen

Workday Illuminate

Workday

AI agents for HR, finance and planning across Workday.

Agents
SaaSOpen

Atlassian Rovo

Atlassian

Enterprise search and AI agents across Jira, Confluence and 3rd-party SaaS.

AgentsEmbeddings
SaaSOpen

Notion AI

Notion

Built-in AI for writing, search and Q&A across Notion workspaces.

Text GenerationAgents
SaaSOpen

Zoom AI Companion

Zoom

AI assistant for meeting summaries, chat and email across Zoom.

AgentsAudio / Speech
SaaSOpen

Cisco Webex AI Assistant

Cisco

GenAI assistant for meetings, contact center and collaboration.

AgentsAudio / Speech
SaaSOpen

Intuit Assist

Intuit

GenAI financial assistant across TurboTax, QuickBooks, Credit Karma and Mailchimp.

Agents
SaaSOpen

Adobe Firefly

Adobe

Commercially-safe generative-AI models for image, vector and video.

Image GenerationVideo Generation
Image + VideoOpen

Adobe GenStudio

Adobe

Enterprise generative-AI platform for marketing content production.

Agents
PlatformOpen

Canva Magic Studio

Canva

Suite of AI design tools for image, video, copy and presentations.

Image GenerationText Generation
SaaSOpen

HubSpot Breeze

HubSpot

AI agents and copilots across marketing, sales and service.

Agents
SaaSOpen

Glean

Glean

Enterprise-search and work-AI assistant across SaaS data.

EmbeddingsAgents
SaaSOpen

Writer Palmyra

Writer

Enterprise LLM family and generative-AI platform for regulated industries.

Text GenerationAgents
PlatformOpen

Jasper

Jasper

AI marketing platform for brand-aligned content generation.

Text GenerationAgents
SaaSOpen

Now Assist

ServiceNow

GenAI assistant embedded across ITSM, CSM, HRSD and creator workflows.

AgentsText Generation
SaaSOpen

AI Agents (ServiceNow)

ServiceNow

Autonomous AI agents for IT, HR, customer service and security operations.

Agents
SaaSOpen

Now LLM

ServiceNow

Domain-specific large language models tuned for the Now Platform.

Text GenerationReasoning
TextOpen

Workday AI Agents

Workday

Role-based AI agents for recruiting, payroll, expenses and contracts.

Agents
SaaSOpen

Workday Agent System of Record

Workday

Central system to manage, govern and orchestrate AI agents across the enterprise.

Agents
PlatformOpen

Oracle AI Agents

Oracle

Prebuilt AI agents across Oracle Fusion Cloud HCM, ERP, SCM and CX.

Agents
SaaSOpen

Oracle Generative AI Service

Oracle

Managed LLM service on OCI featuring Cohere and Meta Llama models.

Text GenerationEmbeddings
PlatformOpen

Oracle Digital Assistant

Oracle

Conversational AI platform for building enterprise assistants.

Agents
PlatformOpen

Oracle Code Assist

Oracle

GenAI coding companion optimized for Java, SQL and OCI.

Code
IDEOpen

Oracle Health Clinical AI Agent

Oracle

Voice-enabled clinical documentation agent for clinicians.

AgentsAudio / Speech
SaaSOpen

SAP Joule

SAP

Generative-AI copilot embedded across the SAP application portfolio.

AgentsText Generation
SaaSOpen

SAP Business AI

SAP

Portfolio of AI capabilities and agents across SAP business processes.

Agents
PlatformOpen

SAP AI Core

SAP

Runtime and lifecycle management for AI workloads on SAP BTP.

AgentsEmbeddings
PlatformOpen

Agentforce 3

Salesforce

Platform for building, deploying and governing autonomous AI agents.

Agents
PlatformOpen

Einstein GPT

Salesforce

Generative-AI layer across Sales, Service, Marketing and Commerce Clouds.

Text GenerationAgents
SaaSOpen

Salesforce Data Cloud + Einstein

Salesforce

Unified customer data foundation powering Einstein and Agentforce.

EmbeddingsAgents
PlatformOpen

Slack AI

Salesforce / Slack

AI summaries, search and recap built into Slack channels and DMs.

Text GenerationAgents
SaaSOpen

Tableau Pulse / Einstein

Salesforce / Tableau

Generative analytics and natural-language insights inside Tableau.

AgentsReasoning
SaaSOpen

Microsoft 365 Copilot

Microsoft

Generative-AI assistant across Word, Excel, PowerPoint, Outlook and Teams.

AgentsText Generation
SaaSOpen

Copilot Studio

Microsoft

Low-code platform for building and orchestrating custom AI agents.

Agents
PlatformOpen

Dynamics 365 Copilot

Microsoft

Role-based AI copilots for sales, service, finance, supply chain and HR.

Agents
SaaSOpen

Azure AI Foundry

Microsoft

Unified platform to build, evaluate and deploy AI agents and models on Azure.

AgentsEmbeddings
PlatformOpen

Azure OpenAI Service

Microsoft

Enterprise access to GPT, o-series and DALL·E models on Azure.

Text GenerationImage Generation
APIOpen

Azure AI Search

Microsoft

Vector and hybrid retrieval engine for grounding LLMs on enterprise data.

Embeddings
PlatformOpen

GitHub Copilot

Microsoft / GitHub

AI pair-programmer for code completion, chat, reviews and agent mode.

CodeAgents
IDEOpen

Microsoft Fabric Copilot

Microsoft

GenAI copilots for data engineering, science and Power BI inside Fabric.

AgentsCode
SaaSOpen

Power Platform AI Builder

Microsoft

AI models and prebuilt skills for Power Apps and Power Automate.

Agents
PlatformOpen

Amazon Q Business

AWS

Generative-AI assistant grounded on enterprise data and SaaS connectors.

AgentsEmbeddings
SaaSOpen

Amazon Q Developer

AWS

AI coding and operations assistant across the developer lifecycle on AWS.

CodeAgents
IDEOpen

Amazon Bedrock AgentCore

AWS

Secure runtime for deploying and scaling production AI agents on Bedrock.

Agents
PlatformOpen

Amazon SageMaker AI

AWS

End-to-end platform to build, train and deploy ML and foundation models.

AgentsEmbeddings
PlatformOpen

Amazon Connect AI

AWS

GenAI for contact-center agents, self-service and analytics.

AgentsAudio / Speech
SaaSOpen

AWS HealthScribe

AWS

HIPAA-eligible service that generates clinical notes from patient conversations.

Audio / SpeechText Generation
APIOpen

Vertex AI

Google Cloud

Unified platform for Gemini, Model Garden, agents and ML on GCP.

AgentsEmbeddings
PlatformOpen

Vertex AI Agent Builder

Google Cloud

Build, deploy and manage multi-agent systems grounded on enterprise data.

Agents
PlatformOpen

Gemini for Google Workspace

Google

AI assistance across Gmail, Docs, Sheets, Slides, Meet and Drive.

AgentsText Generation
SaaSOpen

Gemini Code Assist

Google

AI coding assistant with enterprise context across IDEs and Google Cloud.

CodeAgents
IDEOpen

Customer Engagement Suite (CCAI)

Google Cloud

Generative contact-center AI for virtual agents, agent assist and insights.

AgentsAudio / Speech
SaaSOpen

BigQuery ML / Gemini in BigQuery

Google Cloud

In-warehouse ML and GenAI directly on BigQuery data via SQL.

ReasoningEmbeddings
PlatformOpen

IBM watsonx.ai

IBM

Studio to train, tune and deploy foundation models including Granite.

Text GenerationEmbeddings
PlatformOpen

IBM watsonx Orchestrate

IBM

Build and orchestrate AI agents across HR, procurement and sales workflows.

Agents
SaaSOpen

IBM watsonx.data

IBM

Open data lakehouse optimized for AI workloads and RAG.

Embeddings
PlatformOpen

IBM watsonx.governance

IBM

AI governance, risk and compliance for foundation models and agents.

Agents
PlatformOpen

Informatica CLAIRE GPT

Informatica

Generative-AI assistant for data management, integration and governance.

AgentsEmbeddings
SaaSOpen

Informatica IDMC AI Agents

Informatica

AI agents across the Intelligent Data Management Cloud for pipelines and quality.

Agents
PlatformOpen

Adobe Experience Platform AI Assistant

Adobe

GenAI assistant for marketers across Adobe Experience Cloud applications.

Agents
SaaSOpen

Adobe Acrobat AI Assistant

Adobe

Conversational AI to summarize, query and draft from PDF documents.

Text Generation
SaaSOpen

Snowflake Cortex Agents

Snowflake

Build agentic apps grounded on governed Snowflake data with hosted LLMs.

AgentsEmbeddings
PlatformOpen

Snowflake Copilot

Snowflake

Natural-language SQL and analytics assistant inside Snowflake.

CodeReasoning
SaaSOpen

Databricks Mosaic AI Agent Framework

Databricks

Tooling to build, evaluate and govern compound AI agents on the lakehouse.

AgentsEmbeddings
PlatformOpen

Databricks Genie

Databricks

Conversational analytics over governed lakehouse data.

ReasoningAgents
SaaSOpen

VMware Private AI

Broadcom / VMware

On-prem GenAI reference architecture co-engineered with NVIDIA and IBM.

AgentsEmbeddings
PlatformOpen

Box AI

Box

AI for content Q&A, summarization and metadata extraction in Box.

Text GenerationAgents
SaaSOpen

Dropbox Dash

Dropbox

Universal search and AI assistant across SaaS apps and content.

EmbeddingsAgents
SaaSOpen

DocuSign IAM with AI

DocuSign

AI-powered Intelligent Agreement Management for contract data and workflows.

Agents
SaaSOpen

Zendesk AI Agents

Zendesk

Autonomous and copilot AI agents for customer service.

Agents
SaaSOpen

Freshworks Freddy AI

Freshworks

Generative-AI assistants and agents across CX, ITSM and CRM.

Agents
SaaSOpen

ZoomInfo Copilot

ZoomInfo

GenAI go-to-market copilot for sellers, grounded on B2B data.

Agents
SaaSOpen

Gong AI

Gong

Revenue AI for call insights, forecasting and deal execution.

AgentsAudio / Speech
SaaSOpen

Pega GenAI

Pegasystems

GenAI Blueprint and agents for case management and CRM workflows.

Agents
PlatformOpen

UiPath Autopilot

UiPath

Agentic automation copilot across the UiPath platform for citizens and developers.

Agents
PlatformOpen

Automation Anywhere AI Agent Studio

Automation Anywhere

Build and govern AI agents that combine LLMs with enterprise automation.

Agents
PlatformOpen

Talend Data Fabric AI

Qlik / Talend

AI-assisted data integration, quality and governance.

Agents
PlatformOpen

Qlik Answers

Qlik

Generative analytics service delivering trusted answers from unstructured data.

ReasoningAgents
SaaSOpen

SAS Viya with GenAI

SAS

Analytics and AI platform with embedded LLM orchestration and copilots.

AgentsReasoning
PlatformOpen

TIBCO / Cloud Software Group AI

Cloud Software Group

AI across Spotfire, integration and data virtualization products.

Agents
PlatformOpen

Teradata AI Unlimited / ClearScape

Teradata

In-database analytics and GenAI orchestration on Teradata VantageCloud.

ReasoningEmbeddings
PlatformOpen

MongoDB Atlas Vector Search

MongoDB

Native vector search in MongoDB Atlas for RAG and semantic apps.

Embeddings
DatabaseOpen

Redis AI / Vector

Redis

Low-latency vector database and semantic cache for GenAI apps.

Embeddings
DatabaseOpen

Twilio CustomerAI

Twilio

GenAI and predictive AI across Twilio messaging, voice and Segment.

AgentsAudio / Speech
PlatformOpen

Asana AI

Asana

AI teammates and copilots for work management and goals.

Agents
SaaSOpen

Monday AI

monday.com

AI assistant and blocks for automating Work OS workflows.

Agents
SaaSOpen

Smartsheet AI

Smartsheet

GenAI formulas, summaries and content generation in Smartsheet.

AgentsText Generation
SaaSOpen

Coupa AI

Coupa

Community-powered AI and agents for spend management.

Agents
SaaSOpen

GitLab Duo

GitLab

AI assistant across the GitLab DevSecOps platform with code, chat and security.

CodeAgents
PlatformOpen

Atlassian Intelligence

Atlassian

AI features and agents across Jira, Confluence, Bitbucket and Loom.

Agents
SaaSOpen

Claude Opus 4.5

Anthropic

Anthropic's most intelligent model — state-of-the-art on coding, agents and computer use.

ReasoningCodeAgents
Text + ImageOpen

GPT-5.1-Codex-Max

OpenAI

Frontier agentic coding model for long-horizon software engineering inside Codex.

CodeAgentsReasoning
TextOpen

GPT-5.1

OpenAI

Updated GPT-5 with warmer tone, adaptive reasoning and stronger instruction following.

Text GenerationReasoningMultimodal
Text + Image + AudioOpen

Gemini 3 Pro

Google DeepMind

Leads LMArena Text, WebDev and Vision — Google's flagship multimodal reasoning model.

ReasoningMultimodalCode
Text + Image + Video + AudioOpen

Gemini 3 Deep Think

Google DeepMind

Extended-thinking variant of Gemini 3 for hardest math, science and research problems.

Reasoning
Text + ImageOpen

Nano Banana Pro

Google DeepMind

Gemini-powered flagship image generation and editing model with best-in-class text.

Image Generation
ImageOpen

SAM 3D

Meta

Segment Anything 3D — reconstructs objects, scenes and human bodies from a single image.

3DImage Understanding
Image → 3DOpen

Olmo 3

Ai2

Fully open model flow with training data, checkpoints and recipes for reproducible AI.

Text GenerationReasoning
TextOpen

Grok 4.1

xAI

Refresh of Grok 4 with stronger reasoning, lower hallucination and faster tool use.

ReasoningAgentsText Generation
Text + ImageOpen

Claude Haiku 4.5

Anthropic

Fast, cheap Claude tier matching prior Sonnet-class quality for high-volume agents.

Text GenerationAgentsCode
Text + ImageOpen

Mistral Medium 3

Mistral AI

Cost-efficient enterprise model with frontier-class performance for business workloads.

Text GenerationReasoningCode
Text + ImageOpen

Qwen3-Max

Alibaba

Alibaba's trillion-parameter flagship multilingual reasoning model.

ReasoningText GenerationCode
Text + ImageOpen

Kimi K2

Moonshot AI

Long-context open agentic model from Moonshot, strong on tool use and coding.

AgentsReasoningCode
TextOpen

GLM-4.6

Zhipu AI

Open bilingual frontier model from Zhipu, competitive on coding and reasoning.

Text GenerationCodeReasoning
TextOpen

VISTA-R1

Eigen AI

Agentic RL vision-language model for tool-integrated visual reasoning.

ReasoningImage UnderstandingAgents
Text + ImageOpen

Shopify Magic

Shopify

Generative AI across the Shopify admin — product descriptions, emails, blog posts and image edits.

Text GenerationImage Generation
SaaSOpen

Shopify Semantic Search

Shopify

Embeddings-based product search powering natural-language storefront discovery.

Embeddings
SaaSOpen

Shop App AI

Shopify

Personal shopping assistant in the Shop app, recommending and tracking orders across merchants.

Agents
SaaSOpen

Zendesk Copilot

Zendesk

Agent-side AI copilot suggesting replies, summaries and next actions in real time.

AgentsText Generation
SaaSOpen

Zendesk Resolution Platform

Zendesk

Agentic CX platform (post-Ultimate.ai) for end-to-end automated customer resolutions.

Agents
PlatformOpen

Zendesk QA (Klaus)

Zendesk

AutoQA AI that scores 100% of support conversations across voice and chat.

ReasoningAudio / Speech
SaaSOpen

Twilio Voice Intelligence

Twilio

Speech-to-text, summaries and language operators that analyze every call in real time.

Audio / SpeechReasoning
PlatformOpen

Twilio AI Assistants

Twilio

Build conversational AI agents over SMS, voice and WhatsApp grounded in Segment data.

AgentsAudio / Speech
PlatformOpen

Segment Linked Audiences

Twilio Segment

AI-powered CDP predictions joining warehouse data to real-time activation.

EmbeddingsAgents
PlatformOpen

Symantec AI for DLP

Broadcom / Symantec

AI-driven data loss prevention classifying sensitive content across cloud, email and endpoints.

Agents
PlatformOpen

Broadcom Rally AI

Broadcom

GenAI for agile planning — story generation, sprint summaries and risk forecasting.

AgentsText Generation
SaaSOpen

VMware Cloud Foundation AI Services

Broadcom / VMware

Private AI services for VCF — model serving, RAG and vector DB on-prem.

EmbeddingsAgents
PlatformOpen

Microsoft Sales Copilot

Microsoft

Role-based Copilot inside Outlook & Teams pulling CRM context from Dynamics 365 and Salesforce.

AgentsText Generation
SaaSOpen

Microsoft Service Copilot

Microsoft

Frontline copilot for contact center agents inside Dynamics 365 Customer Service.

AgentsText Generation
SaaSOpen

Dragon Copilot

Microsoft / Nuance

Ambient AI scribe for clinicians that drafts notes and orders from doctor-patient conversations.

Audio / SpeechText Generation
SaaSOpen

GitHub Copilot Workspace

GitHub / Microsoft

Agentic dev environment that plans, edits and tests entire features from a GitHub issue.

CodeAgents
SaaSOpen

Prisma AIRS

Palo Alto Networks

AI Runtime Security — protects models, agents and data across enterprise AI deployments.

Agents
PlatformOpen

Cortex Cloud

Palo Alto Networks

Unified AI-driven CNAPP + CDR converging Prisma Cloud and Cortex into one platform.

Agents
PlatformOpen

Cloudflare Workers AI

Cloudflare

Serverless GPU inference platform running open models at the edge.

Text GenerationEmbeddingsImage Generation
PlatformOpen

Cloudflare AI Gateway

Cloudflare

Observability, caching and rate-limiting proxy for any LLM provider.

Agents
PlatformOpen

Akamai Cloud Inference

Akamai

Distributed-edge inference platform built on the Akamai Connected Cloud.

Agents
PlatformOpen

Stripe Radar

Stripe

ML-based fraud detection trained on the global Stripe payments network.

Reasoning
SaaSOpen

PayPal Smart Receipts

PayPal

Personalized AI recommendations and cashback on merchant receipts.

Agents
SaaSOpen

Block Square AI

Block

AI assistant for sellers — answers business questions from Square sales data.

Agents
SaaSOpen

Coinbase AgentKit

Coinbase

Toolkit letting AI agents transact on-chain with wallets, USDC and smart contracts.

Agents
PlatformOpen

Robinhood Cortex

Robinhood

AI investing companion delivering market insights to Robinhood Gold customers.

ReasoningAgents
SaaSOpen

Spotify AI DJ

Spotify

Personalized AI DJ that curates and narrates listening sessions in a realistic voice.

Audio / SpeechAgents
SaaSOpen

Reddit Answers

Reddit

Conversational search that synthesizes answers from authentic Reddit discussions.

Text GenerationAgents
SaaSOpen

Snap My AI

Snap

GPT-powered chatbot inside Snapchat with vision and Snap Map awareness.

AgentsMultimodal
SaaSOpen

Pinterest Performance+

Pinterest

GenAI ads platform that builds creative and optimizes targeting automatically.

Image GenerationAgents
SaaSOpen

Uber AI Assistant

Uber

In-app GenAI assistant guiding riders and drivers through Uber and Uber Eats workflows.

Agents
SaaSOpen

DoorDash SafeChat AI

DoorDash

Real-time AI moderation that detects harassment across Dasher-customer chats in 99 languages.

Text Generation
SaaSOpen

Synopsys.ai

Synopsys

AI suite (DSO.ai, VSO.ai, TSO.ai) optimizing chip design across the EDA flow.

AgentsReasoning
PlatformOpen

Cadence Cerebrus / JedAI

Cadence

Generative AI for digital chip implementation and verification across the Cadence flow.

AgentsReasoning
PlatformOpen

Ansys SimAI

Ansys

Cloud generative-AI app delivering near-instant simulation predictions for engineers.

Reasoning
SaaSOpen

Veeva AI

Veeva Systems

Embedded AI agents and shortcuts across Veeva Vault and Commercial Cloud for life sciences.

Agents
SaaSOpen

Hunyuan T1

Tencent

Tencent's deep-reasoning model, Mamba-based and tuned for complex multi-step problems.

Reasoning
TextOpen

Baidu Apollo ADFM

Baidu

Autonomous Driving Foundation Model powering Apollo Go robotaxis across China.

MultimodalAgents
Vision + ActionOpen

ByteDance Coze

ByteDance

No-code bot platform for building, publishing and monetizing AI agents.

Agents
PlatformOpen

Palmyra X 003

Writer

Palmyra X 003 is a top-performing instruct model built specifically for structured text completion rather than conversational use.

Text Generation
TextOpen

Kimi Explorer

Moonshot AI

Moonshot AI's language model tracked by Epoch, focused on language modeling/generation.

Text Generation
TextOpen

AlphaGeometry

Google DeepMind

Google DeepMind's mathematics model tracked by Epoch, focused on geometry.

Reasoning
TextOpen

Qwen-VL-Max

Alibaba

Alibaba's multimodal vision-language model tracked by Epoch, focused on chat.

Image UnderstandingMultimodalText Generation
Text + ImageOpen

Qwen1.5-72B

Alibaba

In recent months, our focus has been on developing a “good” model while optimizing the developer experience.

CodeReasoningText Generation
TextOpen

Aya

Cohere for AI

Recent breakthroughs in large language models (LLMs) have centered around a handful of data-rich languages.

Text Generation
TextOpen

Gemini 1.5 Pro

Google DeepMind

Google DeepMind's multimodal language model tracked by Epoch, focused on language modeling.

MultimodalText Generation
Text + ImageOpen

Stable Diffusion 3

Stability AI

Diffusion models create data from noise by inverting the forward paths of data towards noise and have emerged as a powerful generative modeling technique for high-dimensional, perceptual data such as images and videos.

Image Generation
ImageOpen

MegaScale (Production)

ByteDance

We present the design, implementation and engineering experience in building and deploying MegaScale, a production system for training large language models (LLMs) at the scale of more than 10,000 GPUs.

Text Generation
TextOpen

Mistral Large

Mistral AI

Mistral AI's language model tracked by Epoch, focused on chat.

Text Generation
TextOpen

Claude 3 Sonnet

Anthropic

Anthropic's multimodal vision-language model tracked by Epoch, focused on chat.

CodeImage UnderstandingMultimodal
Text + ImageOpen

Claude 3 Opus

Anthropic

Anthropic's multimodal vision-language model tracked by Epoch, focused on chat.

CodeImage UnderstandingMultimodal
Text + ImageOpen

Aramco Metabrain AI

Saudi Aramco

Saudi Aramco's language model tracked by Epoch, focused on language modeling/generation.

Text Generation
TextOpen

Inflection-2.5

Inflection AI

At Inflection, our mission is to create a personal AI for everyone.

Text Generation
TextOpen

ManiGaussian

Tsinghua University

Performing language-conditioned robotic manipulation tasks in unstructured environments is highly demanded for general intelligent robots.

AgentsImage UnderstandingMultimodal
Vision + ActionOpen

MM1-30B

Apple

In this work, we discuss building performant Multimodal Large Language Models (MLLMs).

Image UnderstandingMultimodalText Generation
Text + ImageOpen

ReALM

Apple

Reference resolution is an important problem, one that is essential to understand and successfully handle context of different kinds.

Text Generation
TextOpen

GPT-4 Turbo (Apr 2024)

OpenAI

Today, we shared dozens of new additions and improvements, and reduced pricing across many parts of our platform.

Image UnderstandingMultimodalText Generation
Text + ImageOpen

Reka Core

Reka AI

We introduce Reka Core, Flash, and Edge, a series of powerful multimodal language models trained from scratch by Reka.

Audio / SpeechCodeImage Generation
AudioOpen

Llama 3-70B

Meta

Meta's language model tracked by Epoch, focused on chat.

CodeText Generation
TextOpen

VILA1.5-13B

NVIDIA

Visual language models (VLMs) rapidly progressed with the recent success of large language models.

Image GenerationMultimodalText Generation
VideoOpen

Yi-Large

01.AI

01.AI's language model tracked by Epoch, focused on chat.

Text Generation
TextOpen

Octo-Base

University of California (UC) Berkeley

University of California (UC) Berkeley's robotics model tracked by Epoch, focused on robotic manipulation.

AgentsMultimodal
Vision + ActionOpen

GLM-4 (0520)

Zhipu AI

We introduce ChatGLM, an evolving family of large language models that we have been developing over time.

CodeReasoningText Generation
TextOpen

ALLaM adapted 70B

Saudi Data and Artificial Intelligence Authority

We present ALLaM: Arabic Large Language Model, a series of large language models to support the ecosystem of Arabic Language Technologies (ALT).

Text Generation
TextOpen

Qwen2-72B

Alibaba

After months of efforts, we are pleased to announce the evolution from Qwen1.5 to Qwen2.

Text Generation
TextOpen

Llama-3.1-Nemotron-70B-Instruct

NVIDIA

High-quality preference datasets are essential for training reward models that can effectively guide large language models (LLMs) in generating high-quality responses aligned with human preferences.

Text Generation
TextOpen

OpenVLA

Stanford University

Stanford University's vision-language robotics model tracked by Epoch, focused on robotic manipulation.

AgentsImage UnderstandingMultimodal
Vision + ActionOpen

Nemotron-4 340B

NVIDIA

We release the Nemotron-4 340B model family, including Nemotron-4-340B-Base, Nemotron-4-340B-Instruct, and Nemotron-4-340B-Reward.

Text Generation
TextOpen

DeepSeek-Coder-V2 236B

DeepSeek

We present DeepSeek-Coder-V2, an open-source Mixture-of-Experts (MoE) code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks.

CodeText Generation
TextOpen

Claude 3.5 Sonnet

Anthropic

This addendum to our Claude 3 Model Card describes Claude 3.5 Sonnet, a new model which outperforms our previous most capable model, Claude 3 Opus, while operating faster and at a lower cost.

CodeImage GenerationMultimodal
ImageOpen

Cambrian-1-34B

New York University (NYU)

We introduce Cambrian-1, a family of multimodal LLMs (MLLMs) designed with a vision-centric approach.

Image UnderstandingMultimodalText Generation
Text + ImageOpen

ESM3 (98B)

EvolutionaryScale

More than three billion years of evolution have produced an image of biology encoded into the space of natural proteins.

Text Generation
TextOpen

Ernie 4.0 Turbo

Baidu

Baidu's multimodal, language, vision model tracked by Epoch, focused on vision-language generation.

Image GenerationMultimodalText Generation
ImageOpen

SenseChat 5.5

SenseTime

SenseTime's multimodal, language, vision model tracked by Epoch, focused on vision-language generation.

Image GenerationMultimodalReasoning
ImageOpen

Mathstral

Mistral AI

We're contributing Mathstral to the science community to bolster efforts in advanced mathematical problems requiring complex, multi-step logical reasoning.

ReasoningText Generation
TextOpen

DeepL LLM

DeepL

DeepL's language model tracked by Epoch, focused on translation.

Text Generation
TextOpen

Llama 3.1-405B

Meta

Modern artificial intelligence (AI) systems are powered by foundation models.

CodeReasoningText Generation
TextOpen

AFM-server

Apple

Apple's language model tracked by Epoch, focused on language modeling/generation.

Text Generation
TextOpen

AFM-on-device

Apple

Apple's language model tracked by Epoch, focused on language modeling/generation.

Text Generation
TextOpen

LLaVA-OV-72B

ByteDance

We present LLaVA-OneVision, a family of open large multimodal models (LMMs) developed by consolidating our insights into data, models, and visual representations in the LLaVA-NeXT blog series.

Image GenerationMultimodalText Generation
VideoOpen

GPT-4o (Aug 2024)

OpenAI

We’re announcing GPT-4o, our new flagship model that can reason across audio, vision, and text in real time.

Audio / SpeechImage GenerationMultimodal
AudioOpen

Table Tennis Agent

Google DeepMind

Achieving human-level speed and performance on real world tasks is a north star for the robotics research community.

AgentsMultimodal
Vision + ActionOpen

Grok-2

xAI

Grok-2 is our frontier language model with state-of-the-art reasoning capabilities.

CodeImage GenerationMultimodal
ImageOpen

Jamba 1.5-Large

AI21 Labs

We present Jamba-1.5, new instruction-tuned large language models based on our Jamba architecture.

Text Generation
TextOpen

Hairuo

Inspur

Inspur's language model tracked by Epoch, focused on language modeling/generation.

Text Generation
TextOpen

GLM-4-Plus

Zhipu AI

At the KDD International Conference on Data Mining and Knowledge Discovery, the Zhipu GLM team unveiled its new-generation base large model, GLM-4-Plus.

Text Generation
TextOpen

Hunyuan Turbo

Tencent

Tencent's language model tracked by Epoch, focused on language modeling/generation.

Text Generation
TextOpen

Harrison.rad.1

Harrison.ai

Harrison.ai's vision, medicine, language, multimodal model tracked by Epoch, focused on visual question answering.

Image UnderstandingMultimodalText Generation
Text + ImageOpen

AlphaProteo

Google DeepMind

Computational design of protein-binding proteins is a fundamental capability with broad utility in biomedical research and biotechnology.

Text Generation
TextOpen

DeepSeek-V2.5

DeepSeek

DeepSeek's language model tracked by Epoch, focused on language modeling/generation.

CodeText Generation
TextOpen

o1-preview

OpenAI

We've developed a new series of AI models designed to spend more time thinking before they respond.

CodeReasoningText Generation
TextOpen

o1-mini

OpenAI

We've developed a new series of AI models designed to spend more time thinking before they respond.

CodeReasoningText Generation
TextOpen

Qwen2.5-32B

Alibaba

In the past three months since Qwen2’s release, numerous developers have built new models on the Qwen2 language models, providing us with valuable feedback.

ReasoningText Generation
TextOpen

Qwen2.5-72B

Alibaba

In the past three months since Qwen2’s release, numerous developers have built new models on the Qwen2 language models, providing us with valuable feedback.

ReasoningText Generation
TextOpen

Qwen2.5 Instruct (72B)

Alibaba

Qwen2.5 is the latest series of Qwen large language models.

CodeReasoningText Generation
TextOpen

Oryx 34B

Tsinghua University

Visual data comes in various forms, ranging from small icons of just a few pixels to long videos spanning hours.

3DImage GenerationMultimodal
3DOpen

Telechat2-115B

China Telecom

China Telecom's language model tracked by Epoch, focused on language modeling/generation.

CodeReasoningText Generation
TextOpen

PixelDance

ByteDance

PixelDance V1.4 is a video generation model developed by the ByteDance Research team, using the DiT structure.

Image GenerationVideo Generation
VideoOpen

Llama 3.2 11B

Meta

Meta's multimodal, vision, language model tracked by Epoch, focused on visual question answering.

Image UnderstandingMultimodalText Generation
Text + ImageOpen

Movie Gen Video

Meta

We present Movie Gen, a cast of foundation models that generates high-quality, 1080p HD videos with different aspect ratios and synchronized audio.

Image GenerationVideo Generation
VideoOpen

GR-2

ByteDance

We present GR-2, a state-of-the-art generalist robot agent for versatile and generalizable robot manipulation.

AgentsMultimodal
Vision + ActionOpen

Palmyra X 004

Writer

Palmyra X4 boasts state-of-the-art reasoning through novel training techniques.

CodeText Generation
TextOpen

RDT-1B

Tsinghua University

Tsinghua University's robotics model tracked by Epoch, focused on robotic manipulation.

AgentsMultimodal
Vision + ActionOpen

CHAI-1

Chai Discovery

We introduce Chai-1, a multi-modal foundation model for molecular structure prediction that performs at the state-of-the-art across a variety of tasks relevant to drug discovery.

Text Generation
TextOpen

NVLM-X 72B

NVIDIA

NVIDIA's vision, language model tracked by Epoch, focused on language modeling/generation.

CodeImage GenerationReasoning
ImageOpen

NVLM-H 72B

NVIDIA

NVIDIA's vision, language model tracked by Epoch, focused on language modeling/generation.

CodeImage GenerationReasoning
ImageOpen

NVLM-D 72B

NVIDIA

NVIDIA's vision, language model tracked by Epoch, focused on language modeling/generation.

CodeImage GenerationReasoning
ImageOpen

Doubao-pro

ByteDance

A professional-grade, self-developed LLM supporting up to 128k tokens, enabling fine-tuning across the entire series.

Text Generation
TextOpen

SeedEdit

ByteDance

We introduce SeedEdit, a diffusion model that is able to revise a given image with any text prompts.

Image Generation
ImageOpen

Gemini-Exp-1114

Google DeepMind

Google DeepMind's language model tracked by Epoch, focused on language modeling.

Text Generation
TextOpen

k0-math

Moonshot AI

Artificial general intelligence start-up Kimi, owned by Chinese AI start-up Moonshot AI, on Saturday launched its first reasoning AI model k0-math.

ReasoningText Generation
TextOpen

GPT-4o (Nov 2024)

OpenAI

We’re announcing GPT-4o, our new flagship model that can reason across audio, vision, and text in real time.

Audio / SpeechImage GenerationMultimodal
AudioOpen

Fugatto 1

NVIDIA

Fugatto is a versatile audio synthesis and transformation model capable of following free-form text instructions with optional audio inputs.

Audio / SpeechMultimodalText Generation
AudioOpen

Amazon Nova Pro

Amazon

A highly capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks.

CodeImage GenerationMultimodal
VideoOpen

o1

OpenAI

We've developed a new series of AI models designed to spend more time thinking before they respond.

CodeMultimodalReasoning
Text + ImageOpen

NVILA 15B

NVIDIA

Visual language models (VLMs) have made significant advances in accuracy in recent years.

Image GenerationMultimodalText Generation
VideoOpen

Infinity

ByteDance

We present Infinity, a bitwise visual autoregressive model capable of generating high-resolution, photorealistic images following language instructions.

Image Generation
ImageOpen

Sora Turbo

OpenAI

Our video generation model is rolling out at sora.com.

Image GenerationVideo Generation
VideoOpen

EXAONE 3.5 32B

LG AI Research

This technical report introduces the EXAONE 3.5 instruction-tuned language models, developed and released by LG AI Research.

Text Generation
TextOpen

Gemini 2.0 Pro

Google DeepMind

Today, we’re releasing an experimental version of Gemini 2.0 Pro that responds to that feedback.

Audio / SpeechCodeImage Generation
AudioOpen

Apollo 7B

Meta AI

Despite the rapid integration of video perception capabilities into Large Multimodal Models (LMMs), the underlying mechanisms driving their video understanding remain poorly understood.

MultimodalText GenerationVideo Generation
VideoOpen

Veo 2

Google DeepMind

Google DeepMind's video, vision model tracked by Epoch, focused on video generation.

Image GenerationVideo Generation
VideoOpen

STORM-B/8

University of Southern California

We present STORM, a spatio-temporal reconstruction model designed for reconstructing dynamic outdoor scenes from sparse observations.

3D
3DOpen

Stable Point Aware 3D (SPAR3D)

Stability AI

We study the problem of single-image 3D object reconstruction.

3D
3DOpen

INTELLECT-MATH

Prime Intellect

INTELLECT-MATH is a 7B parameter model optimized for mathematical reasoning.

Reasoning
TextOpen

Eagle 2

NVIDIA

Recently, promising progress has been made by open-source vision-language models (VLMs) in bringing their capabilities closer to those of proprietary frontier models.

AgentsImage UnderstandingMultimodal
Vision + ActionOpen

Kimi k1.5

Moonshot AI

Language model pretraining with next token prediction has proved effective for scaling compute but is limited to the amount of available training data.

CodeImage GenerationMultimodal
ImageOpen

Computer-Using Agent (CUA)

OpenAI

Today we introduced a research preview of Operator, an agent that can go to the web to perform tasks for you.

AgentsImage UnderstandingMultimodal
Text + ImageOpen

GPT-4o (Jan 2025)

OpenAI

We’re announcing GPT-4o, our new flagship model that can reason across audio, vision, and text in real time.

Audio / SpeechImage GenerationMultimodal
AudioOpen

o3-mini

OpenAI

We’re releasing OpenAI o3-mini, the newest, most cost-efficient model in our reasoning series, available in both ChatGPT and the API today.

CodeReasoningText Generation
TextOpen

Eurus-2-7B-PRIME

Tsinghua University

Tsinghua University's mathematics model tracked by Epoch, focused on mathematical reasoning.

CodeReasoning
TextOpen

Grok 3

xAI

We are pleased to introduce Grok 3, our most advanced model yet: blending strong reasoning with extensive pretraining knowledge.

CodeImage GenerationMultimodal
ImageOpen

Mercury

Inception Labs

Today, we’re excited to announce that Mercury, our first general chat model, is available to support a wider range of text generation applications.

CodeText Generation
TextOpen

GPT-4.5

OpenAI

We advance AI capabilities by scaling two complementary paradigms: unsupervised learning and reasoning.

CodeImage GenerationMultimodal
ImageOpen

QwQ-32B

Alibaba

QwQ is the reasoning model of the Qwen series.

CodeReasoningText Generation
TextOpen

Mistral OCR

Mistral AI

Mistral OCR is an Optical Character Recognition API that sets a new standard in document understanding.

Image GenerationMultimodalText Generation
ImageOpen

Hunyuan-TurboS

Tencent

As Large Language Models (LLMs) rapidly advance, we introduce Hunyuan-TurboS, a novel large hybrid Transformer-Mamba Mixture of Experts (MoE) model.

CodeReasoningText Generation
TextOpen

EXAONE Deep 32B

LG AI Research

We present EXAONE Deep series, which exhibits superior capabilities in various reasoning tasks, including math and coding benchmarks.

CodeReasoningText Generation
TextOpen

ERNIE-4.5-VL-424B-A47B (文心大模型4.5)

Baidu

In this report, we introduce ERNIE 4.5, a new family of large-scale multimodal models comprising 10 distinct variants.

CodeImage GenerationMultimodal
VideoOpen

o1-pro

OpenAI

We've developed a new series of AI models designed to spend more time thinking before they respond.

CodeMultimodalReasoning
Text + ImageOpen

Diffusion Renderer

NVIDIA

Understanding and modeling lighting effects are fundamental tasks in computer vision and graphics.

Video Generation
VideoOpen

DeepSeek-V3 (Mar 2025)

DeepSeek

We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token.

CodeReasoningText Generation
TextOpen

Gemini 2.5 Pro (Mar 2025)

Google DeepMind

Gemini 2.5 Pro Experimental is our most advanced model for complex tasks.

Audio / SpeechCodeImage Generation
AudioOpen

GPT-4o (Mar 2025)

OpenAI

We’re announcing GPT-4o, our new flagship model that can reason across audio, vision, and text in real time.

Audio / SpeechImage GenerationMultimodal
AudioOpen

Llama 4 Scout

Meta

We’re sharing the first models in the Llama 4 herd, which will enable people to build more personalized multimodal experiences.

CodeImage GenerationMultimodal
ImageOpen

Llama 4 Maverick

Meta

We’re sharing the first models in the Llama 4 herd, which will enable people to build more personalized multimodal experiences.

CodeImage GenerationMultimodal
ImageOpen

Llama 4 Behemoth (preview)

Meta

We’re sharing the first models in the Llama 4 herd, which will enable people to build more personalized multimodal experiences.

CodeImage GenerationMultimodal
ImageOpen

Pangu Ultra

Huawei

We present Pangu Ultra, a Large Language Model (LLM) with 135 billion parameters and dense Transformer modules trained on Ascend Neural Processing Units (NPUs).

CodeText Generation
TextOpen

Qwen3-235B-A22B

Alibaba

Today, we are excited to announce the release of Qwen3, the latest addition to the Qwen family of large language models.

CodeReasoningText Generation
TextOpen

Gemini 2.5 Pro (May 2025)

Google DeepMind

Gemini 2.5 Pro Experimental is our most advanced model for complex tasks.

Audio / SpeechCodeImage Generation
AudioOpen

Seed1.5-VL

ByteDance

We present Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning.

Image GenerationMultimodalText Generation
VideoOpen

Claude Sonnet 4

Anthropic

Claude Sonnet 4 can understand nuanced instructions and context, recognize and correct its own mistakes, and create sophisticated analysis and insights from complex data.

AgentsCodeImage Generation
ImageOpen

DeepSeek-R1 (May 2025)

DeepSeek

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1.

CodeReasoningText Generation
TextOpen

Qwen3 Embedding

Alibaba

In this work, we introduce the Qwen3 Embedding series, a significant advancement over its predecessor, the GTE-Qwen series, in text embedding and reranking capabilities, built upon the Qwen3 foundation models.

Text Generation
TextOpen

Gemini 2.5 Pro (Jun 2025)

Google DeepMind

Gemini 2.5 Pro Experimental is our most advanced model for complex tasks.

Audio / SpeechCodeImage Generation
AudioOpen

Seed-1.6-Thinking

ByteDance

Seed1.6 is the latest general-purpose model series unveiled by the ByteDance Seed team.

Image GenerationMultimodalText Generation
ImageOpen

FGN

Google DeepMind

Google DeepMind's earth science model tracked by Epoch, focused on weather forecasting.

Text Generation
TextOpen

EXAONE Path 2.0

LG AI Research

LG AI Research's vision, medicine model tracked by Epoch, focused on cancer diagnosis.

Image Understanding
Text + ImageOpen

Gemini Embedding

Google DeepMind

In this report, we introduce Gemini Embedding, a state-of-the-art embedding model leveraging the power of Gemini, Google's most capable large language model.

Text Generation
TextOpen

EXAONE 4.0 (32B)

LG AI Research

This technical report introduces EXAONE 4.0, which integrates a Non-reasoning mode and a Reasoning mode to achieve both the excellent usability of EXAONE 3.5 and the advanced reasoning abilities of EXAONE Deep.

CodeReasoningText Generation
TextOpen

Qwen3-Coder-480B-A35B

Alibaba

Today, we're announcing Qwen3-Coder, our most agentic code model to date.

AgentsCodeText Generation
TextOpen

Qwen3-235B-A22B-Thinking (Jul 2025)

Alibaba

Today, we are excited to announce the release of Qwen3, the latest addition to the Qwen family of large language models.

CodeReasoningText Generation
TextOpen

Qwen3-235B-A22B (Jul 2025)

Alibaba

Today, we are excited to announce the release of Qwen3, the latest addition to the Qwen family of large language models.

CodeReasoningText Generation
TextOpen

MindLink-72B

Kunlun Inc.

We introduce MindLink, a new family of large language models developed by Kunlun Inc.

CodeReasoningText Generation
TextOpen

Gemini 2.5 Deep Think

Google DeepMind

To advance Gemini’s capabilities towards solving hard reasoning problems, we developed a novel reasoning approach, called Deep Think, that naturally blends in parallel thinking techniques during response generation.

Audio / SpeechCodeImage Generation
AudioOpen

Qwen Image

Alibaba

We present Qwen-Image, an image generation foundation model in the Qwen series that achieves significant advances in complex text rendering and precise image editing.

Image Generation
ImageOpen

Hierarchical Reasoning Model (HRM)

Sapient Intelligence

Reasoning, the process of devising and executing complex goal-oriented action sequences, remains a critical challenge in AI.

Image UnderstandingMultimodalText Generation
Text + ImageOpen

gpt-oss-20b

OpenAI

OpenAI's language model tracked by Epoch, focused on language modeling/generation.

Text Generation
TextOpen

Claude Opus 4.1

Anthropic

Today we're releasing Claude Opus 4.1, an upgrade to Claude Opus 4 on agentic tasks, real-world coding, and reasoning.

AgentsCodeImage Generation
ImageOpen

GPT-5 nano

OpenAI

OpenAI's multimodal, language, vision model tracked by Epoch, focused on language modeling/generation.

Image GenerationMultimodalText Generation
ImageOpen

GPT-5 mini

OpenAI

OpenAI's multimodal, language, vision model tracked by Epoch, focused on language modeling/generation.

Image GenerationMultimodalText Generation
ImageOpen

Gemini 2.5 Flash Image (Nano Banana)

Google

Text-to-Image: Generate high-quality images from simple or complex text descriptions.

Image Generation
ImageOpen

LongCat-Flash

Meituan Inc

We introduce LongCat-Flash, a 560-billion-parameter Mixture-of-Experts (MoE) language model designed for both computational efficiency and advanced agentic capabilities.

CodeReasoningText Generation
TextOpen

AgentFounder-30B

Alibaba

Large language models (LLMs) have evolved into agentic systems capable of autonomous tool use and multi-step reasoning for complex problem-solving.

AgentsCodeReasoning
TextOpen

Qwen3-Omni-30B-A3B

Alibaba

We present Qwen3-Omni, a single multimodal model that, for the first time, maintains state-of-the-art performance across text, image, audio, and video without any degradation relative to single-modal counterparts.

Audio / SpeechImage GenerationMultimodal
AudioOpen

Gemini Robotics-ER 1.5

Google DeepMind

Our most capable vision-language model (VLM) reasons about the physical world, natively calls digital tools and creates detailed, multi-step plans to complete a mission.

Audio / SpeechImage GenerationText Generation
AudioOpen

Sora 2.0

OpenAI

Our latest video generation model is more physically accurate, realistic, and more controllable than prior systems.

Video Generation
VideoOpen

GPT-5 Pro

OpenAI

OpenAI's multimodal, language, vision model tracked by Epoch, focused on language modeling/generation.

Image GenerationMultimodalText Generation
ImageOpen

Ling-1T

Ant Group

Ling-1T is the first flagship non-thinking model in the Ling 2.0 series, featuring 1 trillion total parameters with ≈ 50 billion active parameters per token.

CodeReasoningText Generation
TextOpen

Veo 3.1

Google DeepMind

We’re also introducing Veo 3.1, which brings richer audio, more narrative control, and enhanced realism that captures true-to-life textures.

Image GenerationVideo Generation
VideoOpen

MiniMax-M2

MiniMax

Today, we are officially open-sourcing and launching MiniMax M2, a model born for Agents and code.

AgentsCodeText Generation
TextOpen

Tongyi DeepResearch

Alibaba

We present Tongyi DeepResearch, an agentic large language model, which is specifically designed for long-horizon, deep information-seeking research tasks.

AgentsReasoningText Generation
TextOpen

Kimi K2 Thinking

Moonshot AI

Today, we are introducing Kimi K2 Thinking, our best open-source thinking model.

CodeReasoningText Generation
TextOpen

Meta's Generative Ads Model (GEM)

Meta

Meta's recommendation model tracked by Epoch, focused on recommender systems.

Embeddings
TextOpen

GPT-5.1-Codex

OpenAI

OpenAI's language model tracked by Epoch, focused on language modeling/generation.

Text Generation
TextOpen

GPT-5.1 Instant

OpenAI

Today we're upgrading the GPT-5 series with the release of GPT-5.1 Instant: our most-used model, now warmer, more intelligent, and better at following your instructions.

Image GenerationMultimodalText Generation
ImageOpen

π0.6 (pi-0.6)

Physical Intelligence

We study how vision-language-action (VLA) models can improve through real-world deployments via reinforcement learning (RL).

AgentsImage UnderstandingMultimodal
Vision + ActionOpen

P1-235B-A22B

Shanghai AI Lab

Recent progress in large language models (LLMs) has moved the frontier from puzzle-solving to science-grade reasoning, the kind needed to tackle problems whose answers must stand against nature, not merely fit a rubric.

Text Generation
TextOpen

Grok 4.1 Fast

xAI

Today, we’re excited to launch two powerful new additions to the xAI API: Grok 4.1 Fast, our best tool-calling model with a 2M context window.

CodeReasoningText Generation
TextOpen

Gemini 3 Pro Image (Nano Banana Pro)

Google DeepMind

Today, we’re introducing Nano Banana Pro (Gemini 3 Pro Image), our new state-of-the-art image generation and editing model.

Image Generation
ImageOpen

DeepSeekMath-V2

DeepSeek

DeepSeek's language model tracked by Epoch, focused on mathematical reasoning.

ReasoningText Generation
TextOpen

SIMA 2

Google DeepMind

We introduce SIMA 2, a generalist embodied agent that understands and acts in a wide variety of 3D virtual worlds.

3D
3DOpen

GPT-5.2 Pro

OpenAI

OpenAI's multimodal, language, vision model tracked by Epoch, focused on language modeling/generation.

Image GenerationMultimodalText Generation
ImageOpen

GPT-5.2

OpenAI

OpenAI's model tracked by Epoch, focused on language modeling/generation.

Text Generation
TextOpen

Nemotron 3-Nano-30B-A3B

NVIDIA

We present Nemotron 3 Nano 30B-A3B, a Mixture-of-Experts hybrid Mamba-Transformer language model.

Text Generation
TextOpen

GPT-5.2 Codex

OpenAI

OpenAI's language model tracked by Epoch, focused on language modeling/generation.

Text Generation
TextOpen

GLM-4.7

Zhipu AI

Zhipu AI's language model tracked by Epoch, focused on language modeling/generation.

CodeText Generation
TextOpen

MiniMax-M2.1

MiniMax

MiniMax's language model tracked by Epoch, focused on chat.

AgentsCodeText Generation
TextOpen

HyperCLOVA X SEED 32B Think

NAVER

Developed by Naver, South Korea’s leading AI research lab, this cutting-edge language model supports multimodal inputs and advanced reasoning.

Image GenerationMultimodalText Generation
ImageOpen

VAETKI

NC AI

NC AI's language model tracked by Epoch, focused on language modeling/generation.

Text Generation
TextOpen

A.X K1

SK Telecom

SK Telecom's language model tracked by Epoch, focused on code generation.

CodeReasoningText Generation
TextOpen

Solar Open 100B

Upstage

Solar Open is Upstage's flagship 102B-parameter large language model, trained entirely from scratch and released under the Solar-Apache License 2.0.

Text Generation
TextOpen

K-EXAONE

LG AI Research

K-EXAONE is a large-scale multilingual language model developed by LG AI Research.

CodeReasoningText Generation
TextOpen

Qwen3-Max-Thinking

Alibaba

We present Qwen3-Max-Thinking, our latest flagship reasoning model.

Text Generation
TextOpen

Qwen3-Coder-Next

Alibaba

Today, we're announcing Qwen3-Coder-Next, an open-weight language model designed specifically for coding agents and local development.

CodeText Generation
TextOpen

Kimi K2.5

Moonshot AI

We introduce Kimi K2.5, an open-source multimodal agentic model designed to advance general agentic intelligence.

Text Generation
TextOpen

GPT-5.3 Codex

OpenAI

OpenAI's language model tracked by Epoch, focused on language modeling/generation.

Text Generation
TextOpen

Seedance 2.0

ByteDance

ByteDance's image generation, video, audio model tracked by Epoch, focused on video generation.

Audio / SpeechImage GenerationVideo Generation
AudioOpen

Qwen3.5 397B-A17B

Alibaba

We are delighted to announce the official release of Qwen3.5, introducing the open weights of the first model in the Qwen3.5 series, namely Qwen3.5-397B-A17B.

Image GenerationText Generation
ImageOpen

Grok 4.20

xAI

xAI's language model tracked by Epoch, focused on language modeling/generation.

Text Generation
TextOpen

GLM-5

Zhipu AI

We present GLM-5, a next-generation foundation model designed to transition the paradigm of vibe coding to agentic engineering.

Text Generation
TextOpen

Gemini 3.1 Pro

Google DeepMind

Last week, we released a major update to Gemini 3 Deep Think to solve modern challenges across science, research and engineering.

Image GenerationText Generation
ImageOpen

Qwen3.5-122B-A10B

Alibaba

Over recent months, we have intensified our focus on developing foundation models that deliver exceptional utility and performance.

Text Generation
TextOpen

SWE 1.6

Cognition

We are sharing an early preview of our ongoing SWE-1.6 training run.

AgentsCodeText Generation
TextOpen

Gemini 3.0 Flash-lite

Google DeepMind

Today, we're introducing Gemini 3.1 Flash-Lite, our fastest and most cost-efficient Gemini 3 series model.

Text Generation
TextOpen

GPT-5.4 Pro

OpenAI

OpenAI's multimodal, language, vision model tracked by Epoch, focused on language modeling/generation.

Image GenerationMultimodalText Generation
ImageOpen

GPT-5.4

OpenAI

OpenAI's multimodal, language, vision model tracked by Epoch, focused on language modeling/generation.

Image GenerationMultimodalText Generation
ImageOpen

Nemotron 3 Super

NVIDIA

NVIDIA's language model tracked by Epoch, focused on language modeling/generation.

CodeText Generation
TextOpen

Composer 2

Anysphere

Composer 2 is a specialized model designed for agentic software engineering.

CodeText Generation
TextOpen

GLM-5.1

Zhipu AI

Zhipu AI's language model tracked by Epoch, focused on language modeling/generation.

Text Generation
TextOpen

Gemini Flash 3.1 TTS

Google DeepMind

Google DeepMind's audio model tracked by Epoch, focused on audio generation.

Audio / Speech
AudioOpen

Claude Opus 4.7

Anthropic

Anthropic's language model tracked by Epoch, focused on question answering.

Text Generation
TextOpen

Kimi K2.6

Moonshot AI

Moonshot AI's language model tracked by Epoch, focused on language modeling/generation.

Text Generation
TextOpen

GPT Image 2

OpenAI

OpenAI's image generation model tracked by Epoch, focused on image generation.

Image Generation
ImageOpen

GPT-5.5 Pro

OpenAI

OpenAI's multimodal, language, vision model tracked by Epoch, focused on language modeling/generation.

Image GenerationMultimodalText Generation
ImageOpen

GPT-5.5

OpenAI

OpenAI's multimodal, language, vision model tracked by Epoch, focused on language modeling/generation.

Image GenerationMultimodalText Generation
ImageOpen

DeepSeek-V4-Pro

DeepSeek

DeepSeek's language model tracked by Epoch, focused on language modeling/generation.

Text Generation
TextOpen

DeepSeek-V4-Flash

DeepSeek

DeepSeek's language model tracked by Epoch, focused on language modeling/generation.

Text Generation
TextOpen

Composer 2.5

Anysphere

Anysphere's language model tracked by Epoch, focused on coding.

CodeText Generation
TextOpen

Hume EVI 3

Hume AI

Empathic voice interface that perceives and generates emotional speech in real time.

Audio / SpeechMultimodal
VoiceOpen

Krea 1

Krea AI

Krea's in-house image model tuned for aesthetic control and real-time iteration.

Image Generation
ImageOpen

Marble

World Labs

Fei-Fei Li's World Labs spatial intelligence model that generates explorable 3D worlds from a single image.

3DImage Understanding
Image + 3DOpen

LFM2

Liquid AI

Liquid AI's second-generation efficient foundation models built on liquid neural networks for on-device use.

Text GenerationReasoning
TextOpen

DiscoPOP

Sakana AI

LLM-discovered preference optimization algorithm from Sakana's evolutionary research line.

Reasoning
TextOpen

Magic LTM-2-Mini

Magic.dev

100M-token context model purpose-built for whole-repository software synthesis.

CodeReasoning
TextOpen

Poolside Malibu

Poolside

Code-first foundation model trained with reinforcement learning from code-execution feedback.

CodeReasoning
TextOpen

Mirage

Decart

Real-time generative world model that re-skins live video streams with text prompts.

Video GenerationImage Generation
VideoOpen

Sonar Large

Perplexity

Perplexity's in-house search-grounded LLM powering the Perplexity answer engine.

Text GenerationReasoningAgents
TextOpen

Pi 3.0

Inflection AI

Inflection's empathetic conversational assistant tuned for personal, supportive dialogue.

Text GenerationAudio / Speech
Text + VoiceOpen

Kling 2.5

Kuaishou

Kuaishou's flagship text-to-video model with strong motion coherence and 1080p output.

Video Generation
VideoOpen

Imbue 70B

Imbue

Imbue's research model trained from scratch for robust agentic reasoning and code.

ReasoningCodeAgents
TextOpen

Sarvam 1

Sarvam AI

First Indic-first foundation model optimized for 10 Indian languages and English.

Text Generation
TextOpen

Aya Expanse 32B

Cohere

Massively multilingual open-weights model covering 23 languages from Cohere For AI.

Text Generation
TextOpen

Mercury Coder

Inception Labs

Diffusion-based LLM that generates code in parallel for order-of-magnitude latency gains.

CodeText Generation
TextOpen

Nous Hermes 4

Nous Research

Open-source aligned LLM family known for steerable, uncensored research use.

Text GenerationReasoning
TextOpen

Aleph 2

Runway

Runway's closed-source in-context video editing model that modifies existing videos while preserving untouched regions.

Video GenerationMultimodal
VideoOpen

LongCat Video Avatar 1.5

Meituan

Meituan LongCat's open-source audio-driven avatar video model for single- and multi-character human video generation.

Video GenerationMultimodal
Video + AudioOpen

Hy-MT2

Tencent

Tencent Hunyuan's open-source multilingual translation family for fast, instruction-following translation across 33 languages.

Text Generation
TextOpen

Qwen 3.7 Max

Alibaba

Alibaba Cloud's closed-source trillion-parameter flagship LLM for coding, reasoning, and enterprise agentic workflows.

ReasoningCodeAgents
Text + ImageOpen

Lens

Microsoft

Microsoft's open-source 3.8B text-to-image model focused on efficient training, fast high-res generation, and strong prompt adherence.

Image Generation
ImageOpen

Stable Audio 3 Medium

Stability AI

Stability AI's higher-capacity 2B text-to-audio diffusion model for music generation, sound-effect generation, and audio editing.

Music · Audio / Speech
Audio

Command A+ W4A4

Cohere

Cohere's open-source W4A4-quantized vision-language reasoning model for agentic, multilingual, tool-use enterprise tasks.

Reasoning · Multimodal · Agents
Text + Image

Gemini Omni Flash

Google DeepMind

Google DeepMind's closed-source multimodal video creation and editing model that generates or edits video from text, image, video, and audio references.

Video Generation · Image Generation · Multimodal
Text + Image + Video + Audio

OmniCraft Texture Generator

Deemos Technologies

Hyper3D OmniCraft Texture generates photorealistic, seamless, tileable PBR textures for 3D assets and design pipelines.

3D · Image Generation
3D + Image

Gemini 3.5 Flash

Google DeepMind

Google DeepMind's closed-source natively multimodal reasoning model for fast, high-capability agentic and coding tasks.

Reasoning · Multimodal · Code
Text + Image + Audio

Qwen3.5 LiveTranslate Flash

Alibaba

Alibaba's vision-enhanced real-time audio/video translation model for live multilingual interpretation across 60 languages.

Audio / Speech · Multimodal
Audio + Video + Text

Nemotron Labs Diffusion 14B

NVIDIA

NVIDIA's open 14B text-generation LM supporting autoregressive, diffusion-style parallel, and self-speculative decoding.

Text Generation · Reasoning
Text

Mirelo SFX 1.6

Mirelo AI

Mirelo's text-to-sound-effects model for production-ready Foley, ambience, and SFX generation.

Audio / Speech
Audio

WavFlow

Meta

Meta's audio generation model focused on high-fidelity waveform synthesis and speech-music co-generation.

Audio / Speech · Music
Audio

Lance

ByteDance

ByteDance's foundation model for fast multimodal content creation across short-form video pipelines.

Multimodal · Video Generation
Text + Image + Video

Agora 1

Odyssey

Odyssey's interactive world model for real-time AI-generated explorable video environments.

Video Generation · Agents · 3D
Interactive Video

HRM Text 1B

Sapient Intelligence

Sapient's 1B Hierarchical Reasoning Model for compact, structured chain-of-thought text generation.

Reasoning · Text Generation
Text

Dramabox

Resemble AI

Resemble AI's expressive multi-character voice acting model for long-form dramatic dialogue and narration.

Audio / Speech
Audio

Stable Audio 3 Small SFX

Stability AI

Stability AI's compact text-to-sound-effects diffusion model optimized for low-latency on-device SFX generation.

Audio / Speech
Audio

Stable Audio 3 Small Music

Stability AI

Stability AI's compact text-to-music diffusion model tuned for short, license-friendly musical loops and stems.

Music
Audio