Inference · Ollama

Ollama

Run open-weight LLMs locally with one command. OpenAI-compatible API.

FREEMIUMOpen sourceLocalmacOSWindowsLinuxCLIAPI

The de-facto way to pull and run open-weight models (Llama, Qwen, Gemma, DeepSeek, gpt-oss) on your own machine — no API key, no data leaving the device. Ships native macOS/Windows/Linux apps, an OpenAI-compatible server, and official Python/JS libraries. MIT-licensed and free locally; an optional paid Ollama Cloud runs larger models.

Model support

Multi-model

Llama
Qwen
Gemma
DeepSeek
Mistral
gpt-oss

Runs open-weight models locally; OpenAI-compatible API. Optional cloud for larger models.

Where it runs

macOS
Windows
Linux
CLI
API

Ollama

Multi-model

Together AI

fal

Groq

LM Studio