
Model Catalog

High-performance models optimized for low-latency inference.

All models are optimized for fast response times. Speed is the standard, not a premium feature.

Available Models

Infe Pulse

Free Tier

Ultra-fast reasoning optimized for speed and efficiency

infe-pulse
Context Window: 131K tokens
Input Cost: 0.00011 IU / 1K tokens
Output Cost: 0.00034 IU / 1K tokens
Capabilities: Chat, Streaming, Tools, JSON Mode
Best for: Quick responses, simple tasks, high-volume applications
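The per-token rates above translate directly into a per-request cost. A minimal sketch, using the listed infe-pulse rates; the helper function is illustrative and not part of any official SDK:

```python
# Estimate the cost of one infe-pulse request from the catalog rates.
# Rates are quoted in IU per 1K tokens; this helper is illustrative only.

INFE_PULSE_INPUT_RATE = 0.00011   # IU per 1K input tokens
INFE_PULSE_OUTPUT_RATE = 0.00034  # IU per 1K output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in IU for a single request."""
    return (input_tokens / 1000) * INFE_PULSE_INPUT_RATE \
         + (output_tokens / 1000) * INFE_PULSE_OUTPUT_RATE

# Example: 2,000 input tokens and 500 output tokens
cost = estimate_cost(2000, 500)  # 0.00022 + 0.00017 = 0.00039 IU
```

Because output tokens cost roughly three times as much as input tokens on this model, trimming verbose completions saves more than trimming prompts.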

Infe Titan

Starter

Maximum intelligence for complex reasoning and multimodal tasks

infe-titan
Context Window: 131K tokens
Input Cost: 0.00019 IU / 1K tokens
Output Cost: 0.00077 IU / 1K tokens
Capabilities: Chat, Streaming, Tools, JSON Mode, Vision
Best for: Complex reasoning, multimodal tasks, production apps
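The Streaming and JSON Mode capabilities listed above are typically toggled per request. A sketch of what such a request payload might look like, assuming an OpenAI-style chat-completions schema; the field names (`messages`, `stream`, `response_format`) are assumptions, not confirmed by this catalog:

```python
# Build a chat request payload for infe-titan. The shape below follows the
# common OpenAI-style chat-completions convention; the actual Infe API
# schema is an assumption here, not documented on this page.
import json

payload = {
    "model": "infe-titan",
    "messages": [
        {"role": "user", "content": "Summarize this contract in three bullet points."}
    ],
    "stream": True,                              # streaming capability
    "response_format": {"type": "json_object"},  # JSON Mode capability
}

body = json.dumps(payload)  # serialized request body, ready to POST
```

The payload is built and serialized locally; how it is sent (endpoint URL, auth headers) depends on the API reference, which this catalog does not cover.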

Infe Core 20B

Pro

Balanced performance and speed for general tasks

infe-core-20b
Context Window: 32K tokens
Input Cost: 0.0002 IU / 1K tokens
Output Cost: 0.0002 IU / 1K tokens
Capabilities: Chat, Streaming, Tools, JSON Mode
Best for: General-purpose tasks, cost-effective pro tier access

Infe Core 120B

Pro

Massive parameter count for complex reasoning

infe-core-120b
Context Window: 32K tokens
Input Cost: 0.0012 IU / 1K tokens
Output Cost: 0.0012 IU / 1K tokens
Capabilities: Chat, Streaming, Tools, JSON Mode
Best for: Research, complex analysis, enterprise workloads

Coming Soon

Infe Echo (Speech-to-Text)

High-accuracy transcription

Infe Voice (Text-to-Speech)

Natural speech synthesis

Infe Embed (Embeddings)

Semantic search & RAG

Infe Canvas (Image Generation)

State-of-the-art images

Infe Guard (Moderation)

Content safety filtering


Tier Access

Plan: Models
Free: infe-pulse
Starter: infe-pulse, infe-titan
Pro: All models (infe-pulse, infe-titan, infe-core-20b, infe-core-120b)
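The tier-access table can be encoded as a simple lookup for client-side checks. The plan and model IDs come straight from the table; the helper function itself is an illustrative sketch, not an official SDK call:

```python
# Map each plan to the model IDs it can access, per the tier table above.
TIER_MODELS = {
    "free": ["infe-pulse"],
    "starter": ["infe-pulse", "infe-titan"],
    "pro": ["infe-pulse", "infe-titan", "infe-core-20b", "infe-core-120b"],
}

def can_access(plan: str, model: str) -> bool:
    """Return True if the given plan includes the given model."""
    return model in TIER_MODELS.get(plan.lower(), [])
```

Checking access locally before dispatching a request avoids a round trip that would fail with an authorization error.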