A large language model whose weights are publicly released, allowing anyone to download, run, fine-tune, and deploy the model on their own infrastructure. Examples include Meta's Llama, Mistral, DeepSeek, and Qwen. Open-source LLMs offer full data control (nothing leaves your servers), customizability (fine-tune for your domain), and no per-token API costs, but they require infrastructure expertise and GPU investment. They power self-hosted AI agents for organizations with strict data residency, compliance, or cost requirements.
Example
A financial services firm deploys Llama on its own GPU cluster to power an internal document analysis agent. Customer data never leaves the firm's network, satisfying regulatory requirements that prohibit sending financial data to third-party APIs.
Frequently asked questions
Are open-source LLMs as good as proprietary models?
The gap has narrowed significantly. Frontier open-source models (Llama 3, DeepSeek V3, Qwen 2.5) approach proprietary model performance on many benchmarks. However, proprietary models still lead on complex reasoning, instruction following, and safety. Open-source models work well for focused, domain-specific agents; proprietary models are better for general-purpose agents handling diverse, complex requests.
How much does it cost to run an open-source LLM?
GPU costs vary widely with model size. A 7B-parameter model runs on a single consumer GPU ($1,000-2,000). A 70B model needs 2-4 A100s ($15,000-60,000 to buy, or $2-8/hour to rent in the cloud). At high volume, self-hosting becomes cheaper than API pricing; at low volume, APIs are more cost-effective. The crossover point is typically 1-10 million tokens per day, depending on the API price and how fully the GPUs are used.
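The arithmetic behind these figures can be sketched in a few lines of Python. This is a rough estimate under stated assumptions, not a sizing tool: it assumes 16-bit weights (2 bytes per parameter), 80 GB of memory per GPU, and a hypothetical 1.2x overhead factor for the KV cache and activations. The $4/hour cluster price is picked from the range above, and the $10 per million tokens API price is purely illustrative. All function names are made up for this example.

```python
import math


def weight_memory_gb(params_billion: float, bytes_per_param: float = 2.0) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Assumes 2 bytes per parameter (fp16/bf16) unless told otherwise.
    """
    return params_billion * bytes_per_param


def gpus_needed(
    params_billion: float,
    gpu_memory_gb: float = 80.0,   # assumption: 80 GB per GPU
    bytes_per_param: float = 2.0,
    overhead: float = 1.2,         # assumption: 20% extra for KV cache etc.
) -> int:
    """Number of GPUs needed to hold the weights plus the assumed overhead."""
    total_gb = weight_memory_gb(params_billion, bytes_per_param) * overhead
    return math.ceil(total_gb / gpu_memory_gb)


def break_even_tokens_per_day(
    cluster_cost_per_hour: float,
    api_price_per_million_tokens: float,
) -> float:
    """Daily token volume at which an always-on rented cluster costs the
    same as paying the API per token. Ignores staff and engineering time."""
    daily_cluster_cost = cluster_cost_per_hour * 24
    return daily_cluster_cost * 1_000_000 / api_price_per_million_tokens


if __name__ == "__main__":
    print("7B weights (GB):", weight_memory_gb(7))
    print("70B weights (GB):", weight_memory_gb(70))
    print("GPUs for 70B:", gpus_needed(70))
    # Hypothetical prices: $4/hour cluster, $10 per million API tokens.
    print("Break-even tokens/day:", break_even_tokens_per_day(4.0, 10.0))
```

Under these assumptions a 70B model needs about 140 GB for weights alone, which is why it does not fit on a single GPU, and the break-even volume lands inside the 1-10 million tokens per day range. Quantizing to 4-bit weights (`bytes_per_param=0.5`) or a cheaper API price moves the result considerably, so the inputs matter more than the formula.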