
Model Catalog

High-performance models optimized for low-latency inference.

All models are optimized for fast response times. Speed is the standard, not a premium feature.

Available Models

Infe Pulse

Free Tier

Ultra-fast reasoning optimized for speed and efficiency

infe-pulse
Context Window: 131K tokens
Input Cost: 0.00011 IU / 1K tokens
Output Cost: 0.00034 IU / 1K tokens
Capabilities: Chat, Streaming, Tools, JSON Mode
Best for: Quick responses, simple tasks, high-volume applications
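The per-token rates above translate directly into a per-request cost. A minimal sketch, using the listed infe-pulse rates; the helper function is illustrative and not part of any official SDK:

```python
# Estimate the cost of one infe-pulse request from the catalog rates.
# Rates are quoted in IU per 1K tokens; this helper is illustrative only.

INFE_PULSE_INPUT_RATE = 0.00011   # IU per 1K input tokens
INFE_PULSE_OUTPUT_RATE = 0.00034  # IU per 1K output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in IU for a single request."""
    return (input_tokens / 1000) * INFE_PULSE_INPUT_RATE \
         + (output_tokens / 1000) * INFE_PULSE_OUTPUT_RATE

# Example: 2,000 input tokens and 500 output tokens
cost = estimate_cost(2000, 500)  # 0.00022 + 0.00017 = 0.00039 IU
```

Because output tokens cost roughly three times as much as input tokens on this model, trimming verbose completions saves more than trimming prompts.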

Infe Titan

Starter

Maximum intelligence for complex reasoning and multimodal tasks

infe-titan
Context Window: 131K tokens
Input Cost: 0.00019 IU / 1K tokens
Output Cost: 0.00077 IU / 1K tokens
Capabilities: Chat, Streaming, Tools, JSON Mode, Vision
Best for: Complex reasoning, multimodal tasks, production apps
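The Streaming and JSON Mode capabilities listed above are typically toggled per request. A sketch of what such a request payload might look like, assuming an OpenAI-style chat-completions schema; the field names (`messages`, `stream`, `response_format`) are assumptions, not confirmed by this catalog:

```python
# Build a chat request payload for infe-titan. The shape below follows the
# common OpenAI-style chat-completions convention; the actual Infe API
# schema is an assumption here, not documented on this page.
import json

payload = {
    "model": "infe-titan",
    "messages": [
        {"role": "user", "content": "Summarize this contract in three bullet points."}
    ],
    "stream": True,                              # streaming capability
    "response_format": {"type": "json_object"},  # JSON Mode capability
}

body = json.dumps(payload)  # serialized request body, ready to POST
```

The payload is built and serialized locally; how it is sent (endpoint URL, auth headers) depends on the API reference, which this catalog does not cover.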

Infe Core 20B

Pro

Balanced performance and speed for general tasks

infe-core-20b
Context Window: 32K tokens
Input Cost: 0.0002 IU / 1K tokens
Output Cost: 0.0002 IU / 1K tokens
Capabilities: Chat, Streaming, Tools, JSON Mode
Best for: General-purpose tasks, cost-effective pro tier access

Infe Core 120B

Pro

Massive parameter count for complex reasoning

infe-core-120b
Context Window: 32K tokens
Input Cost: 0.0012 IU / 1K tokens
Output Cost: 0.0012 IU / 1K tokens
Capabilities: Chat, Streaming, Tools, JSON Mode
Best for: Research, complex analysis, enterprise workloads

Coming Soon

Infe Echo (Speech-to-Text)

High-accuracy transcription

Infe Voice (Text-to-Speech)

Natural speech synthesis

Infe Embed (Embeddings)

Semantic search & RAG

Infe Canvas (Image Generation)

State-of-the-art images

Infe Guard (Moderation)

Content safety filtering


Tier Access

Plan: Models
Free: infe-pulse
Starter: infe-pulse, infe-titan
Pro: All models (infe-pulse, infe-titan, infe-core-20b, infe-core-120b)
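The tier-access table can be encoded as a simple lookup for client-side checks. The plan and model IDs come straight from the table; the helper function itself is an illustrative sketch, not an official SDK call:

```python
# Map each plan to the model IDs it can access, per the tier table above.
TIER_MODELS = {
    "free": ["infe-pulse"],
    "starter": ["infe-pulse", "infe-titan"],
    "pro": ["infe-pulse", "infe-titan", "infe-core-20b", "infe-core-120b"],
}

def can_access(plan: str, model: str) -> bool:
    """Return True if the given plan includes the given model."""
    return model in TIER_MODELS.get(plan.lower(), [])
```

Checking access locally before dispatching a request avoids a round trip that would fail with an authorization error.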