Every AI Model on Promptha: LLMs, Image, Video, Audio & More
Promptha gives you access to 80+ AI models from leading providers—all through a single interface. No separate accounts. No managing multiple API keys. Just pick the best model for your task and go.
This guide covers every model available, organized by category. Whether you need text generation, image creation, video production, or audio synthesis, you'll find the right model here.
Table of Contents
- LLM Models (Text Generation)
- Image Generation Models
- Video Generation Models
- Audio & Music Models
- 3D Generation Models
- Utility Models
- How to Choose the Right Model
- Model Tiers Explained
- Using Models in Promptha
LLM Models (Text Generation)
Large Language Models handle text generation, analysis, coding, and reasoning. Promptha offers models from four major providers.
OpenAI Models
| Model | Best For | Context Window (tokens) | Tier |
|---|---|---|---|
| GPT-4o | General-purpose, vision, multimodal | 128K | Premium |
| GPT-4o Mini | Fast, affordable tasks | 128K | Budget |
| GPT-4 Turbo | Complex reasoning, JSON mode | 128K | Premium |
| GPT-3.5 Turbo | Simple tasks, high volume | 16K | Budget |
GPT-4o is OpenAI's flagship. It handles text and images, follows instructions precisely, and excels at coding. Use GPT-4o Mini for simpler tasks where speed and cost matter more than maximum capability.
Anthropic Models (Claude)
| Model | Best For | Context Window (tokens) | Tier |
|---|---|---|---|
| Claude Sonnet 4 | Latest flagship, reasoning, coding | 200K | Premium |
| Claude 3.5 Sonnet | Balanced performance, vision | 200K | Premium |
| Claude 3.5 Haiku | Fast responses, high volume | 200K | Budget |
| Claude 3 Opus | Deep analysis, research | 200K | Premium |
Claude Sonnet 4 is Anthropic's newest and most capable model. Claude models excel at nuanced writing, careful analysis, and following complex instructions. With a 200K-token context window, they can take in a full-length book or a large codebase in a single prompt.
Google Models (Gemini)
| Model | Best For | Context Window (tokens) | Tier |
|---|---|---|---|
| Gemini 3 Flash | Latest Google AI, multimodal | 1M | Standard |
| Gemini 2.5 Pro | Complex analysis, research | 1M | Premium |
| Gemini 2.5 Flash | Production, high volume | 1M | Standard |
| Gemini 2.0 Flash | Fast multimodal tasks | 1M | Standard |
| Gemini 1.5 Pro | Long document analysis | 1M | Premium |
Gemini 3 Flash is Google's latest. The standout feature is the 1 million token context window—you can analyze entire codebases, book series, or video transcripts in a single prompt. Gemini models also understand images, video, and audio natively.
DeepSeek Models
| Model | Best For | Context Window (tokens) | Tier |
|---|---|---|---|
| DeepSeek Chat | Affordable reasoning | 32K | Budget |
| DeepSeek Coder | Code generation, analysis | 16K | Budget |
DeepSeek offers excellent value. These models perform well on reasoning and coding tasks at a fraction of the cost of premium models. Great for high-volume applications or when budgets are tight.
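To shortlist text models by context window, you can estimate a prompt's token count and compare it with the figures in the tables above. The sketch below is illustrative only. The window sizes are copied from the tables (reading 128K as 128,000 tokens, and so on), the four-characters-per-token ratio is a rough rule of thumb for English text rather than any provider's tokenizer, and `estimate_tokens` and `models_that_fit` are hypothetical helpers, not part of Promptha.

```python
from __future__ import annotations

# Context windows in tokens, copied from the tables above ("K" read as 1,000).
CONTEXT_WINDOWS = {
    "GPT-4o": 128_000,
    "GPT-3.5 Turbo": 16_000,
    "Claude Sonnet 4": 200_000,
    "Gemini 2.5 Pro": 1_000_000,
    "DeepSeek Chat": 32_000,
    "DeepSeek Coder": 16_000,
}

CHARS_PER_TOKEN = 4  # assumption: rough heuristic for English text


def estimate_tokens(text: str) -> int:
    """Very rough token estimate; real tokenizers will give different counts."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def models_that_fit(text: str, reserve_for_output: int = 2_000) -> list[str]:
    """Return models whose window holds the prompt plus a budget for the reply."""
    needed = estimate_tokens(text) + reserve_for_output
    return [name for name, window in CONTEXT_WINDOWS.items() if window >= needed]


if __name__ == "__main__":
    document = "word " * 100_000  # 500,000 characters, roughly 125,000 tokens
    print(models_that_fit(document))
```

Reserving some of the window for the reply matters because the prompt and the response generally share the same context budget. For a real decision, count tokens with the provider's own tokenizer.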
Image Generation Models
Promptha connects to the best image generation models through Fal.ai and Replicate.
Flux Family (Black Forest Labs)
| Model | Best For | Speed | Tier |
|---|---|---|---|
| Flux 2 Pro | Maximum quality | Slow | Premium |
| Flux 2 | Next-gen quality | Medium | Premium |
| Flux Pro 1.1 Ultra | Ultra high resolution | Slow | Premium |
| Flux Kontext | Character consistency | Medium | Premium |
| Flux Realism | Photorealistic images | Medium | Standard |
| Flux 1.1 Pro | High quality generation | Medium | Premium |
| Flux Dev | Development/testing | Fast | Standard |
| Flux Schnell | Fast iterations | Very Fast | Budget |
Flux models are the current leaders in image quality. Flux 2 Pro produces the best results, while Flux Schnell is blazing fast for quick iterations. Use Flux Kontext when you need consistent characters across multiple generations.
Ideogram
| Model | Best For | Speed | Tier |
|---|---|---|---|
| Ideogram V3 | Text in images, logos | Medium | Standard |
| Ideogram V3 Turbo | Fast text rendering | Fast | Budget |
| Ideogram V2 | General graphics | Medium | Standard |
Ideogram excels at rendering text in images. If you need logos, posters, or graphics with readable text, Ideogram is often the best choice.
Stable Diffusion
| Model | Best For | Speed | Tier |
|---|---|---|---|
| SD 3.5 Large | High quality, 8B params | Slow | Premium |
| Stable Diffusion 3 | Improved text rendering | Medium | Standard |
| SDXL | General purpose | Fast | Budget |
Stable Diffusion models are reliable workhorses. SDXL is open-source and versatile—great for experimentation and when you need fine-tuned control.
Google Imagen
| Model | Best For | Speed | Tier |
|---|---|---|---|
| Imagen 4 | Premium quality | Medium | Premium |
| Imagen 4 Fast | Quick iterations | Fast | Standard |
Imagen 4 from Google offers exceptional prompt understanding. It follows complex instructions well and produces high-quality photorealistic images.
Other Image Models
| Model | Provider | Best For | Tier |
|---|---|---|---|
| Recraft V3 | Recraft | Vector/design style | Standard |
| Recraft V3 SVG | Recraft | Scalable graphics | Standard |
| Nano Banana Pro | Fal.ai | Fast, efficient | Standard |
| Seedream 4.5 | ByteDance | 4K images, editing | Premium |
| Gen-4 Image | Runway | Advanced editing | Premium |
| LongCat Image | Fal.ai | Multilingual text | Standard |
| ImagineArt 1.5 | ImagineArt | Photorealism | Standard |
Video Generation Models
Video AI has advanced rapidly. Promptha offers the latest models from multiple providers.
OpenAI Sora
| Model | Best For | Tier |
|---|---|---|
| Sora 2 | Flagship text-to-video | Premium |
| Sora 2 I2V | Image-to-video | Premium |
Sora 2 is OpenAI's flagship video model. It creates high-quality, realistic videos with complex scenes and motion. Use Sora 2 I2V to animate still images.
Google Veo
| Model | Best For | Tier |
|---|---|---|
| Veo 3.1 | High-fidelity video | Premium |
| Veo 3.1 Fast | Quick video generation | Standard |
| Veo 3.1 I2V | Image animation | Premium |
Veo 3.1 from Google understands physics and motion exceptionally well. It produces high-fidelity videos with natural movement. Use the Fast variant for quick iterations.
Kling (Kuaishou)
| Model | Best For | Tier |
|---|---|---|
| Kling Video 2.6 | Character consistency | Standard |
| Kling Video I2V | Photo animation | Standard |
Kling excels at maintaining character consistency across video frames. Great for character animations and story-based content.
LTX (Lightricks)
| Model | Best For | Tier |
|---|---|---|
| LTX-2 | Fast, affordable video | Budget |
| LTX-2 I2V | Quick image animation | Budget |
LTX-2 is the budget-friendly option. When you need video quickly without premium cost, LTX delivers good results.
Other Video Models
| Model | Provider | Best For | Tier |
|---|---|---|---|
| Hailuo Video | MiniMax | Realistic motion | Standard |
| Hailuo 2.3 | MiniMax | Improved quality | Standard |
| Hunyuan Video | Tencent | Open-source quality | Standard |
| PixVerse 5.5 | PixVerse | Artistic/stylized | Standard |
| Pika 2.2 | Pika Labs | Creative effects | Standard |
| Wan 2.5 T2V | Wan Video | Text-to-video | Standard |
| Wan 2.5 I2V | Wan Video | Image animation | Standard |
Audio & Music Models
Generate speech, music, and sound effects with specialized audio models.
Text-to-Speech
| Model | Provider | Best For | Tier |
|---|---|---|---|
| Maya TTS | Fal.ai | Expressive narration | Standard |
| Chatterbox TTS | Fal.ai | Fun voices, games | Budget |
| Speech 02 HD | MiniMax | Professional quality | Premium |
| Kokoro 82M | Jaaari | Lightweight, fast | Budget |
Maya TTS produces expressive, natural-sounding speech. Chatterbox TTS is great for entertainment—memes, games, AI agents. Speech 02 HD offers premium professional quality.
Music Generation
| Model | Provider | Best For | Tier |
|---|---|---|---|
| Lyria 2 | Google | Original compositions | Premium |
| MiniMax Music V2 | MiniMax | Background music | Standard |
| Music 1.5 | MiniMax | Royalty-free tracks | Standard |
| Beatoven Music | Beatoven | Instrumental music | Standard |
Lyria 2 from Google is the premium choice for AI-composed music. Beatoven Music generates royalty-free instrumentals for videos and podcasts.
Sound Effects
| Model | Provider | Best For | Tier |
|---|---|---|---|
| Beatoven SFX | Beatoven | Sound effects | Standard |
Beatoven SFX creates sound effects for games, videos, and multimedia projects.
3D Generation Models
Create 3D models from text or images.
| Model | Provider | Best For | Tier |
|---|---|---|---|
| Rodin | Hyper3D | Text/image to 3D | Premium |
| SAM 3 3D Objects | Fal.ai | Object reconstruction | Premium |
| SAM 3 3D Body | Fal.ai | Human body modeling | Premium |
Rodin generates 3D models from text descriptions or images—useful for game assets, product visualization, and 3D printing. SAM 3 variants reconstruct accurate 3D models from photographs.
Utility Models
Specialized models for specific tasks.
Upscaling
| Model | Provider | Best For | Tier |
|---|---|---|---|
| Crystal Upscaler | Clarity AI | Image enhancement | Standard |
Crystal Upscaler increases image resolution while preserving detail and color fidelity.
Background Removal
| Model | Provider | Best For | Tier |
|---|---|---|---|
| Bria Background Remove | Bria | Remove backgrounds | Budget |
Bria Background Remove extracts subjects from images with high accuracy—essential for e-commerce and product photography.
Avatar & Lipsync
| Model | Provider | Best For | Tier |
|---|---|---|---|
| Creatify Aurora | Creatify | Speaking avatars | Premium |
| OmniHuman 1.5 | ByteDance | Human animation | Premium |
| Sync Lipsync V2 | Fal.ai | Audio-video sync | Standard |
Creatify Aurora generates talking avatar videos from text. Sync Lipsync V2 synchronizes lip movements to audio—useful for dubbing and localization.
How to Choose the Right Model
For Text Generation
- General tasks with images: GPT-4o or Claude Sonnet 4
- Long documents (100K+ tokens): Gemini models (1M context)
- Budget-conscious: GPT-4o Mini, Claude 3.5 Haiku, DeepSeek Chat
- Coding focus: Claude Sonnet 4, DeepSeek Coder
- Deep analysis: Claude 3 Opus, Gemini 2.5 Pro
For Image Generation
- Maximum quality: Flux 2 Pro, Flux Pro 1.1 Ultra
- Text in images: Ideogram V3
- Fast iterations: Flux Schnell, Ideogram V3 Turbo
- Photorealistic: Flux Realism, Imagen 4
- Vector/design: Recraft V3, Recraft V3 SVG
- Consistent characters: Flux Kontext
For Video Generation
- Premium quality: Sora 2, Veo 3.1
- Character consistency: Kling Video 2.6
- Budget-friendly: LTX-2
- Image animation: Sora 2 I2V, Veo 3.1 I2V, Pika 2.2
- Quick iterations: Veo 3.1 Fast, LTX-2
For Audio
- Professional voiceover: Maya TTS, Speech 02 HD
- Fun/games: Chatterbox TTS
- Background music: Beatoven Music, MiniMax Music V2
- Premium compositions: Lyria 2
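If you route requests programmatically, the recommendations above fit naturally into a lookup table. This is a minimal sketch: the model names are copied from the lists above (only some of the needs are included), while the `RECOMMENDATIONS` structure, the key strings, and the `recommend` function are assumptions made for illustration, not a Promptha API.

```python
from __future__ import annotations

# Category -> need -> suggested models, transcribed from the lists above.
RECOMMENDATIONS = {
    "text": {
        "general tasks with images": ["GPT-4o", "Claude Sonnet 4"],
        "budget-conscious": ["GPT-4o Mini", "Claude 3.5 Haiku", "DeepSeek Chat"],
        "coding focus": ["Claude Sonnet 4", "DeepSeek Coder"],
        "deep analysis": ["Claude 3 Opus", "Gemini 2.5 Pro"],
    },
    "image": {
        "maximum quality": ["Flux 2 Pro", "Flux Pro 1.1 Ultra"],
        "text in images": ["Ideogram V3"],
        "fast iterations": ["Flux Schnell", "Ideogram V3 Turbo"],
        "consistent characters": ["Flux Kontext"],
    },
    "video": {
        "premium quality": ["Sora 2", "Veo 3.1"],
        "character consistency": ["Kling Video 2.6"],
        "budget-friendly": ["LTX-2"],
    },
    "audio": {
        "professional voiceover": ["Maya TTS", "Speech 02 HD"],
        "fun/games": ["Chatterbox TTS"],
        "premium compositions": ["Lyria 2"],
    },
}


def recommend(category: str, need: str) -> list[str]:
    """Look up suggested models; raise a readable error for unknown keys."""
    try:
        return RECOMMENDATIONS[category][need]
    except KeyError:
        options = sorted(RECOMMENDATIONS.get(category, RECOMMENDATIONS))
        raise KeyError(f"unknown {category!r}/{need!r}; try one of {options}") from None


if __name__ == "__main__":
    print(recommend("image", "text in images"))
```

Keeping the table as plain data makes it easy to update when the model lineup changes.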
Model Tiers Explained
Promptha organizes models into three tiers:
Premium Tier
- Highest quality output
- Best for production work
- Higher cost per generation
- Examples: Flux 2 Pro, Sora 2, Claude Sonnet 4
Standard Tier
- Good quality-to-cost ratio
- Suitable for most tasks
- Balanced performance
- Examples: Ideogram V3, Kling Video 2.6, Gemini 2.5 Flash
Budget Tier
- Fastest generation
- Lowest cost
- Great for iterations and testing
- Examples: Flux Schnell, LTX-2, GPT-4o Mini
Use premium models for final output. Use budget models for exploration and iteration. Standard models work well when you need both quality and volume.
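The guidance above can be expressed as a small stage-to-tier mapping. The sketch below uses a handful of image models and their tiers from the tables earlier in this guide. The stage names (`explore`, `volume`, `final`) and the `models_for_stage` helper are invented for illustration and are not Promptha features.

```python
from __future__ import annotations

# (model, tier) pairs copied from the image generation tables in this guide.
IMAGE_MODELS = [
    ("Flux 2 Pro", "Premium"),
    ("Imagen 4", "Premium"),
    ("Ideogram V3", "Standard"),
    ("Flux Realism", "Standard"),
    ("Flux Schnell", "Budget"),
    ("SDXL", "Budget"),
]

# Hypothetical stage names mapped to tiers, following the guidance above.
TIER_FOR_STAGE = {
    "explore": "Budget",    # exploration and iteration
    "volume": "Standard",   # quality and volume together
    "final": "Premium",     # final output
}


def models_for_stage(stage: str) -> list[str]:
    """Return the image models in the tier suggested for a workflow stage."""
    tier = TIER_FOR_STAGE[stage]
    return [name for name, model_tier in IMAGE_MODELS if model_tier == tier]


if __name__ == "__main__":
    for stage in TIER_FOR_STAGE:
        print(stage, models_for_stage(stage))
```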
Using Models in Promptha
In Fabrics
When you use a Fabric, the creator has already selected the optimal model for that task. You can often switch models in the settings if you prefer a different option.
In Ask
Ask assistants support dynamic model switching. Start with one model, then switch to another mid-conversation as your needs change.
In AskGL
AskGL gives you direct model control with the @provider.model syntax:
/image @fal.flux-schnell sunset over mountains
/write @anthropic.claude-sonnet-4 blog post about AI
/video @fal.veo-3.1 cinematic drone shot
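To make the shape of these commands concrete, here is a small Python sketch that splits a line into its command, provider, model, and prompt. The pattern is inferred only from the three examples above, not from an AskGL specification, so the real grammar may differ. It assumes the provider is everything between `@` and the first dot, which keeps a model name such as `veo-3.1` intact. `AskGLCommand` and `parse_askgl` are hypothetical names.

```python
import re
from dataclasses import dataclass

# Assumed shape: /command @provider.model prompt
# The provider stops at the first dot; the model runs to the next whitespace.
_COMMAND = re.compile(
    r"^/(?P<command>\w+)\s+@(?P<provider>[\w-]+)\.(?P<model>\S+)\s+(?P<prompt>.+)$"
)


@dataclass(frozen=True)
class AskGLCommand:
    command: str
    provider: str
    model: str
    prompt: str


def parse_askgl(line: str) -> AskGLCommand:
    """Split an AskGL-style line into its four parts, or raise ValueError."""
    match = _COMMAND.match(line.strip())
    if match is None:
        raise ValueError(f"not in '/command @provider.model prompt' form: {line!r}")
    return AskGLCommand(**match.groupdict())


if __name__ == "__main__":
    print(parse_askgl("/video @fal.veo-3.1 cinematic drone shot"))
```

Reading the examples this way, `fal` and `anthropic` are the provider part and the text after the first dot is the model identifier.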
What's Next?
Now that you know what models are available:
- What is AskGL? - Control models with command syntax
- What is a Fabric? - Use pre-configured AI tools
- What is Ask? - Conversational AI assistants
- Claude vs GPT-4 vs Gemini - LLM comparison
- Image Generation Models Compared - Deep dive on image AI
Promptha brings together the best AI models in one platform. Instead of managing multiple accounts and APIs, you access everything through a unified interface. Pick the right model for each task, and let the platform handle the rest.