Every AI Model on Promptha: LLMs, Image, Video, Audio & More

Promptha gives you access to 80+ AI models from leading providers—all through a single interface. No separate accounts. No managing multiple API keys. Just pick the best model for your task and go.

This guide covers every model available, organized by category. Whether you need text generation, image creation, video production, or audio synthesis, you'll find the right model here.

LLM Models (Text Generation)
Image Generation Models
Video Generation Models
Audio & Music Models
3D Generation Models
Utility Models
How to Choose the Right Model
Model Tiers Explained

LLM Models (Text Generation)

Large Language Models handle text generation, analysis, coding, and reasoning. Promptha offers models from four major providers.

OpenAI Models

Model	Best For	Context Window	Tier
GPT-4o	General-purpose, vision, multimodal	128K	Premium
GPT-4o Mini	Fast, affordable tasks	128K	Budget
GPT-4 Turbo	Complex reasoning, JSON mode	128K	Premium
GPT-3.5 Turbo	Simple tasks, high volume	16K	Budget

GPT-4o is OpenAI's flagship. It handles text and images, follows instructions precisely, and excels at coding. Use GPT-4o Mini for simpler tasks where speed and cost matter more than maximum capability.

Anthropic Models (Claude)

Model	Best For	Context Window	Tier
Claude Sonnet 4	Latest flagship, reasoning, coding	200K	Premium
Claude 3.5 Sonnet	Balanced performance, vision	200K	Premium
Claude 3.5 Haiku	Fast responses, high volume	200K	Budget
Claude 3 Opus	Deep analysis, research	200K	Premium

Claude Sonnet 4 is Anthropic's newest and most capable model. Claude models excel at nuanced writing, careful analysis, and following complex instructions. The 200K context window means they can process entire books or large codebases at once.

Google Models (Gemini)

Model	Best For	Context Window	Tier
Gemini 3 Flash	Latest Google AI, multimodal	1M	Standard
Gemini 2.5 Pro	Complex analysis, research	1M	Premium
Gemini 2.5 Flash	Production, high volume	1M	Standard
Gemini 2.0 Flash	Fast multimodal tasks	1M	Standard
Gemini 1.5 Pro	Long document analysis	1M	Premium

Gemini 3 Flash is Google's latest. The standout feature is the 1 million token context window—you can analyze entire codebases, book series, or video transcripts in a single prompt. Gemini models also understand images, video, and audio natively.

DeepSeek Models

Model	Best For	Context Window	Tier
DeepSeek Chat	Affordable reasoning	32K	Budget
DeepSeek Coder	Code generation, analysis	16K	Budget

DeepSeek offers excellent value. These models perform well on reasoning and coding tasks at a fraction of the cost of premium models. Great for high-volume applications or when budgets are tight.
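Context window is often the first filter when picking an LLM. The sketch below is one way to shortlist models for a given document, using the window sizes from the tables above. It is illustrative only: the four-characters-per-token estimate, the output reserve, and the names `estimate_tokens` and `models_that_fit` are assumptions, not part of Promptha.

```python
# Context windows in tokens, transcribed from the tables above.
# "K" is read as 1,000 and "M" as 1,000,000, which is an approximation.
CONTEXT_WINDOWS = {
    "GPT-4o": 128_000,
    "GPT-4o Mini": 128_000,
    "GPT-4 Turbo": 128_000,
    "GPT-3.5 Turbo": 16_000,
    "Claude Sonnet 4": 200_000,
    "Claude 3.5 Sonnet": 200_000,
    "Claude 3.5 Haiku": 200_000,
    "Claude 3 Opus": 200_000,
    "Gemini 3 Flash": 1_000_000,
    "Gemini 2.5 Pro": 1_000_000,
    "Gemini 2.5 Flash": 1_000_000,
    "Gemini 2.0 Flash": 1_000_000,
    "Gemini 1.5 Pro": 1_000_000,
    "DeepSeek Chat": 32_000,
    "DeepSeek Coder": 16_000,
}

CHARS_PER_TOKEN = 4  # assumption: rough heuristic for English text


def estimate_tokens(text: str) -> int:
    """Rough token estimate; real tokenizers vary by model and language."""
    return len(text) // CHARS_PER_TOKEN + 1


def models_that_fit(token_count: int, reserve_for_output: int = 4_000) -> list[str]:
    """Return models whose window holds the prompt plus a reserved output budget."""
    needed = token_count + reserve_for_output
    return [name for name, window in CONTEXT_WINDOWS.items() if window >= needed]


if __name__ == "__main__":
    document = "word " * 120_000  # 600,000 characters of sample text
    tokens = estimate_tokens(document)
    print(tokens, models_that_fit(tokens))
```

With this estimate, a 600,000-character document is too large for the 128K models but fits the Claude and Gemini models listed above. For real workloads, count tokens with the provider's own tokenizer rather than a character heuristic.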

Image Generation Models

Promptha connects to the best image generation models through Fal.ai and Replicate.

Flux Family (Black Forest Labs)

Model	Best For	Speed	Tier
Flux 2 Pro	Maximum quality	Slow	Premium
Flux 2	Next-gen quality	Medium	Premium
Flux Pro 1.1 Ultra	Ultra high resolution	Slow	Premium
Flux Kontext	Character consistency	Medium	Premium
Flux Realism	Photorealistic images	Medium	Standard
Flux 1.1 Pro	High quality generation	Medium	Premium
Flux Dev	Development/testing	Fast	Standard
Flux Schnell	Fast iterations	Very Fast	Budget

Flux models are among the strongest options for image quality. Flux 2 Pro targets maximum quality, while Flux Schnell is the fastest choice for quick iterations. Use Flux Kontext when you need consistent characters across multiple generations.

Ideogram

Model	Best For	Speed	Tier
Ideogram V3	Text in images, logos	Medium	Standard
Ideogram V3 Turbo	Fast text rendering	Fast	Budget
Ideogram V2	General graphics	Medium	Standard

Ideogram excels at rendering text in images. If you need logos, posters, or graphics with readable text, Ideogram is often the best choice.

Stable Diffusion

Model	Best For	Speed	Tier
SD 3.5 Large	High quality, 8B params	Slow	Premium
Stable Diffusion 3	Improved text rendering	Medium	Standard
SDXL	General purpose	Fast	Budget

Stable Diffusion models are reliable workhorses. SDXL is open-source and versatile—great for experimentation and when you need fine-tuned control.

Google Imagen

Model	Best For	Speed	Tier
Imagen 4	Premium quality	Medium	Premium
Imagen 4 Fast	Quick iterations	Fast	Standard

Imagen 4 from Google offers exceptional prompt understanding. It follows complex instructions well and produces high-quality photorealistic images.

Other Image Models

Model	Provider	Best For	Tier
Recraft V3	Recraft	Vector/design style	Standard
Recraft V3 SVG	Recraft	Scalable graphics	Standard
Nano Banana Pro	Fal.ai	Fast, efficient	Standard
Seedream 4.5	ByteDance	4K images, editing	Premium
Gen-4 Image	Runway	Advanced editing	Premium
LongCat Image	Fal.ai	Multilingual text	Standard
ImagineArt 1.5	ImagineArt	Photorealism	Standard

Video Generation Models

Video AI has advanced rapidly. Promptha offers the latest models from multiple providers.

OpenAI Sora

Model	Best For	Tier
Sora 2	Flagship text-to-video	Premium
Sora 2 I2V	Image-to-video	Premium

Sora 2 is OpenAI's flagship video model. It creates high-quality, realistic videos with complex scenes and motion. Use Sora 2 I2V to animate still images.

Google Veo

Model	Best For	Tier
Veo 3.1	High-fidelity video	Premium
Veo 3.1 Fast	Quick video generation	Standard
Veo 3.1 I2V	Image animation	Premium

Veo 3.1 from Google understands physics and motion exceptionally well. It produces high-fidelity videos with natural movement. Use the Fast variant for quick iterations.

Kling (Kuaishou)

Model	Best For	Tier
Kling Video 2.6	Character consistency	Standard
Kling Video I2V	Photo animation	Standard

Kling excels at maintaining character consistency across video frames. Great for character animations and story-based content.

LTX (Lightricks)

Model	Best For	Tier
LTX-2	Fast, affordable video	Budget
LTX-2 I2V	Quick image animation	Budget

LTX-2 is the budget-friendly option. When you need video quickly without premium cost, LTX delivers good results.

Other Video Models

Model	Provider	Best For	Tier
Hailuo Video	MiniMax	Realistic motion	Standard
Hailuo 2.3	MiniMax	Improved quality	Standard
Hunyuan Video	Tencent	Open-source quality	Standard
PixVerse 5.5	PixVerse	Artistic/stylized	Standard
Pika 2.2	Pika Labs	Creative effects	Standard
Wan 2.5 T2V	Wan Video	Text-to-video	Standard
Wan 2.5 I2V	Wan Video	Image animation	Standard

Audio & Music Models

Generate speech, music, and sound effects with specialized audio models.

Text-to-Speech

Model	Provider	Best For	Tier
Maya TTS	Fal.ai	Expressive narration	Standard
Chatterbox TTS	Fal.ai	Fun voices, games	Budget
Speech 02 HD	MiniMax	Professional quality	Premium
Kokoro 82M	Jaaari	Lightweight, fast	Budget

Maya TTS produces expressive, natural-sounding speech. Chatterbox TTS is great for entertainment—memes, games, AI agents. Speech 02 HD offers premium professional quality.

Music Generation

Model	Provider	Best For	Tier
Lyria 2	Google	Original compositions	Premium
MiniMax Music V2	MiniMax	Background music	Standard
Music 1.5	MiniMax	Royalty-free tracks	Standard
Beatoven Music	Beatoven	Instrumental music	Standard

Lyria 2 from Google is the premium choice for AI-composed music. Beatoven Music generates royalty-free instrumentals for videos and podcasts.

Sound Effects

Model	Provider	Best For	Tier
Beatoven SFX	Beatoven	Sound effects	Standard

Beatoven SFX creates sound effects for games, videos, and multimedia projects.

3D Generation Models

Create 3D models from text or images.

Model	Provider	Best For	Tier
Rodin	Hyper3D	Text/image to 3D	Premium
SAM 3 3D Objects	Fal.ai	Object reconstruction	Premium
SAM 3 3D Body	Fal.ai	Human body modeling	Premium

Rodin generates 3D models from text descriptions or images—useful for game assets, product visualization, and 3D printing. SAM 3 variants reconstruct accurate 3D models from photographs.

Utility Models

Specialized models for specific tasks.

Upscaling

Model	Provider	Best For	Tier
Crystal Upscaler	Clarity AI	Image enhancement	Standard

Crystal Upscaler increases image resolution while preserving detail and color fidelity.

Background Removal

Model	Provider	Best For	Tier
Bria Background Remove	Bria	Remove backgrounds	Budget

Bria Background Remove extracts subjects from images with high accuracy—essential for e-commerce and product photography.

Avatar & Lipsync

Model	Provider	Best For	Tier
Creatify Aurora	Creatify	Speaking avatars	Premium
OmniHuman 1.5	ByteDance	Human animation	Premium
Sync Lipsync V2	Fal.ai	Audio-video sync	Standard

Creatify Aurora generates talking avatar videos from text. Sync Lipsync V2 synchronizes lip movements to audio—useful for dubbing and localization.

How to Choose the Right Model

For Text Generation

General tasks with images: GPT-4o or Claude Sonnet 4
Long documents (100K+ tokens): Gemini models (1M context)
Budget-conscious: GPT-4o Mini, Claude 3.5 Haiku, DeepSeek Chat
Coding focus: Claude Sonnet 4, DeepSeek Coder
Deep analysis: Claude 3 Opus, Gemini 2.5 Pro

For Image Generation

Maximum quality: Flux 2 Pro, Flux Pro 1.1 Ultra
Text in images: Ideogram V3
Fast iterations: Flux Schnell, Ideogram V3 Turbo
Photorealistic: Flux Realism, Imagen 4
Vector/design: Recraft V3, Recraft V3 SVG
Consistent characters: Flux Kontext

For Video Generation

Premium quality: Sora 2, Veo 3.1
Character consistency: Kling Video 2.6
Budget-friendly: LTX-2
Image animation: Sora 2 I2V, Veo 3.1 I2V, Pika 2.2
Quick iterations: Veo 3.1 Fast, LTX-2

For Audio

Professional voiceover: Maya TTS, Speech 02 HD
Fun/games: Chatterbox TTS
Background music: Beatoven Music, MiniMax Music V2
Premium compositions: Lyria 2
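Taken together, the recommendations in this section amount to a lookup from category and need to a short list of models. A minimal sketch of that lookup follows. The structure and the `recommend` helper are hypothetical; only the model names and needs come from the lists above.

```python
# Recommendations transcribed from the lists in this section.
RECOMMENDATIONS: dict[str, dict[str, list[str]]] = {
    "text": {
        "general tasks with images": ["GPT-4o", "Claude Sonnet 4"],
        "long documents": ["Gemini models"],
        "budget-conscious": ["GPT-4o Mini", "Claude 3.5 Haiku", "DeepSeek Chat"],
        "coding focus": ["Claude Sonnet 4", "DeepSeek Coder"],
        "deep analysis": ["Claude 3 Opus", "Gemini 2.5 Pro"],
    },
    "image": {
        "maximum quality": ["Flux 2 Pro", "Flux Pro 1.1 Ultra"],
        "text in images": ["Ideogram V3"],
        "fast iterations": ["Flux Schnell", "Ideogram V3 Turbo"],
        "photorealistic": ["Flux Realism", "Imagen 4"],
        "vector/design": ["Recraft V3", "Recraft V3 SVG"],
        "consistent characters": ["Flux Kontext"],
    },
    "video": {
        "premium quality": ["Sora 2", "Veo 3.1"],
        "character consistency": ["Kling Video 2.6"],
        "budget-friendly": ["LTX-2"],
        "image animation": ["Sora 2 I2V", "Veo 3.1 I2V", "Pika 2.2"],
        "quick iterations": ["Veo 3.1 Fast", "LTX-2"],
    },
    "audio": {
        "professional voiceover": ["Maya TTS", "Speech 02 HD"],
        "fun/games": ["Chatterbox TTS"],
        "background music": ["Beatoven Music", "MiniMax Music V2"],
        "premium compositions": ["Lyria 2"],
    },
}


def recommend(category: str, need: str) -> list[str]:
    """Look up the guide's suggestions; raise ValueError for unknown keys."""
    try:
        return RECOMMENDATIONS[category.lower()][need.strip().lower()]
    except KeyError:
        # List the valid needs for a known category, else the valid categories.
        known = ", ".join(RECOMMENDATIONS.get(category.lower(), RECOMMENDATIONS))
        raise ValueError(f"unknown request; try one of: {known}") from None


if __name__ == "__main__":
    print(recommend("image", "text in images"))
    print(recommend("video", "quick iterations"))
```

Keeping the guidance as plain data makes it easy to update when models are added or retired.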

Model Tiers Explained

Promptha organizes models into three tiers:

Premium Tier

Highest quality output
Best for production work
Higher cost per generation
Examples: Flux 2 Pro, Sora 2, Claude Sonnet 4

Standard Tier

Good quality-to-cost ratio
Suitable for most tasks
Balanced performance
Examples: Ideogram V3, Kling Video 2.6, Gemini 2.5 Flash

Budget Tier

Fastest generation
Lowest cost
Great for iterations and testing
Examples: Flux Schnell, LTX-2, GPT-4o Mini

Use premium models for final output. Use budget models for exploration and iteration. Standard models work well when you need both quality and volume.
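The rule of thumb above can be written as a small mapping from workflow stage to tier. This is only a sketch: the `Stage` enum and `suggest_tier` function are hypothetical names introduced for illustration, and the example models use the names that appear in the model tables.

```python
from enum import Enum


class Stage(Enum):
    EXPLORATION = "exploration"  # trying ideas, iterating on prompts
    VOLUME = "volume"            # many generations at acceptable quality
    FINAL = "final"              # output that ships


# Stage-to-tier mapping taken from the guidance above.
TIER_FOR_STAGE = {
    Stage.EXPLORATION: "Budget",
    Stage.VOLUME: "Standard",
    Stage.FINAL: "Premium",
}

# Example models per tier, using the names from the model tables.
TIER_EXAMPLES = {
    "Premium": ["Flux 2 Pro", "Sora 2", "Claude Sonnet 4"],
    "Standard": ["Ideogram V3", "Kling Video 2.6", "Gemini 2.5 Flash"],
    "Budget": ["Flux Schnell", "LTX-2", "GPT-4o Mini"],
}


def suggest_tier(stage: Stage) -> tuple[str, list[str]]:
    """Return the tier suggested for a stage, plus example models in that tier."""
    tier = TIER_FOR_STAGE[stage]
    return tier, TIER_EXAMPLES[tier]


if __name__ == "__main__":
    for stage in Stage:
        print(stage.value, suggest_tier(stage))
```

A typical workflow would draft with the Budget suggestion and rerun only the winning prompt on a Premium model.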

Using Models in Promptha

In Fabrics

When you use a Fabric, the creator has already selected the optimal model for that task. You can often switch models in the settings if you prefer a different option.

In Ask

Ask assistants support dynamic model switching. Start with one model, then switch to another mid-conversation as your needs change.

In AskGL

AskGL gives you direct model control with the @provider.model syntax:

/image @fal.flux-schnell sunset over mountains
/write @anthropic.claude-sonnet-4 blog post about AI
/video @fal.veo-3.1 cinematic drone shot
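If you generate AskGL commands from scripts, it can help to build and check them programmatically. The sketch below parses and formats commands of the shape shown in the three examples above. The grammar is inferred from those examples alone, so treat it as an assumption and not as Promptha's actual parser; `AskGLCommand`, `parse_askgl`, and `format_askgl` are hypothetical names.

```python
import re
from typing import NamedTuple


class AskGLCommand(NamedTuple):
    command: str   # e.g. "image", "write", "video"
    provider: str  # e.g. "fal", "anthropic"
    model: str     # e.g. "veo-3.1" (may itself contain dots)
    prompt: str


# Assumed shape: /command @provider.model prompt
# The provider ends at the FIRST dot, so model names like "veo-3.1" stay intact.
_PATTERN = re.compile(
    r"^/(?P<command>\w+)\s+@(?P<provider>[\w-]+)\.(?P<model>\S+)\s+(?P<prompt>.+)$"
)


def parse_askgl(line: str) -> AskGLCommand:
    """Split an AskGL-style line into its parts, or raise ValueError."""
    match = _PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"not an AskGL-style command: {line!r}")
    return AskGLCommand(**match.groupdict())


def format_askgl(cmd: AskGLCommand) -> str:
    """Render a command back into the /command @provider.model prompt form."""
    return f"/{cmd.command} @{cmd.provider}.{cmd.model} {cmd.prompt}"


if __name__ == "__main__":
    cmd = parse_askgl("/video @fal.veo-3.1 cinematic drone shot")
    print(cmd)
    print(format_askgl(cmd._replace(model="flux-schnell", command="image")))
```

Splitting on the first dot only is the one design choice worth noting: a naive split on every dot would break model identifiers that include version numbers.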

What's Next?

Now that you know what models are available:

What is AskGL? - Control models with command syntax
What is a Fabric? - Use pre-configured AI tools
What is Ask? - Conversational AI assistants
Claude vs GPT-4 vs Gemini - LLM comparison
Image Generation Models Compared - Deep dive on image AI

Promptha brings together the best AI models in one platform. Instead of managing multiple accounts and APIs, you access everything through a unified interface. Pick the right model for each task, and let the platform handle the rest.
