Stop Paying for Proprietary Models.
|
|
Open-weight models beat proprietary models on cost. And they run on your GPU.
|
|
Qwen 3.6 (Apache 2.0) runs on an RTX 3060 with 12GB and scores 73.4% on SWE-bench Verified — versus 80.8% for proprietary frontier models at US$ 75/million output tokens. DeepSeek V4 Flash (MIT) surpasses 79% and runs on 2 RTX 4090s. GPT-OSS-120B (Apache 2.0) runs in the cloud for pennies. The math doesn't work for proprietary models.
|
|
|
Real benchmark — SWE-bench Verified 2026
Code that works. Or it doesn't.
500 real GitHub issues. The model must understand the bug, write the patch, and pass the tests. No shortcuts. Source: swebench.com.
|
|
|
|
|
|
Qwen 3.6-27B (dense) |
|
77.2% |
|
|
Qwen 3.6-35B-A3B ⚡️ RTX 3060 |
|
73.4% |
|
|
|
|
Data: swebench.com. Qwen 3.6-35B-A3B runs on RTX 3060 12GB with quantization (Apache 2.0). DeepSeek V4 Flash (MIT) requires 2 RTX 4090. Proprietary frontier models cost US$ 75/1M output tokens. *GPT-OSS-120B: partial benchmark, an open model (Apache 2.0), 5.1B active of 120B MoE, runs in cloud. June 2026.
|
|
|
Real cost
Frontier performance. Pocket change.
The cost gap between proprietary and open models is abysmal. And with your own GPU, token cost is zero.
|
| Model |
Input/1M |
Output/1M |
SWE-Ver. |
Runs on |
|
| Proprietary Frontier |
US$ 15 |
US$ 75 |
80.8% |
API only |
|
| DeepSeek V4 Flash (API) |
US$ 0.14 |
US$ 0.28 |
79% |
2 RTX 4090 |
|
| Qwen 3.6-27B (API) |
~US$ 0.50 |
~US$ 2 |
77.2% |
RTX 4090 |
|
| Qwen 3.6-35B-A3B |
US$ 0 |
US$ 0 |
73.4% |
RTX 3060 |
|
| DeepSeek V4 Flash |
US$ 0 |
US$ 0 |
79% |
2× RTX 4090 |
|
|
Sources: deepseek.com, qwen.alibaba.com, swebench.com, openai.com/gpt-oss. Official API prices June/2026. Qwen 3.6-35B-A3B on RTX 3060, DeepSeek V4 Flash on 2× RTX 4090, GPT-OSS-120B on 1× H100.
|
|
|
“While you pay US$ 75 per million output tokens for proprietary APIs, your competitor runs Qwen 3.6 on a 12GB RTX 3060 — for free, without sending any data to anyone.”
— The math that doesn't add up. June 2026.
|
|
|
|
Your scenario
Four paths. One is yours.
From the simplest to the most sovereign. ALL models below are open weight (Apache 2.0 or MIT).
|
|
EXIT 01 — API SWAP
DeepSeek V4 Flash or Qwen 3.6 via API
Swap endpoints, zero code changes. OpenAI-compatible API. Cost 50-250x lower than proprietary models, equivalent coding performance. Results in days. DeepSeek V4 Flash: US$ 0.28/1M output. Qwen 3.6: ~US$ 2/1M output.
|
|
|
EXIT 02 — CLOUD GPU
GPT-OSS-120B, Qwen 3.6 or DeepSeek V4 Flash in the cloud
Rent a GPU (H100, A100) on AWS, Azure, RunPod or Spheron. Run vLLM with OpenAI-compatible API. GPT-OSS-120B (5.1B active, 120B MoE) fits on 1× H100. DeepSeek V4 Flash 2× H100. Full data control. Predictable cost. Used in production by General Bots.
|
|
|
EXIT 03 — ON-PREMISE (YOUR GPU)
Qwen 3.6-35B-A3B on RTX 3060 (12GB)
Qwen 3.6-35B-A3B: activates only 3B parameters per token (MoE). With 4-bit quantization, it fits in 4-6GB of VRAM. Runs on your RTX 3060 with 12GB. 73.4% on SWE-bench Verified. DeepSeek V4 Flash (284B MoE, 13B active): 2× RTX 4090. 79% on SWE-bench Verified, 1M context. Inference cost: ZERO. LGPD/GDPR compliance automatic — data never leaves your machine.
|
|
|
EXIT 04 — LAST RESORT
Legacy API — if you really have no alternative
If you don't have a GPU, can't use the cloud, and don't want to switch APIs, legacy API access is still cheaper than frontier. But it's plan Z. Start with any of the three exits above.
|
|
|
|
REAL READING (NO BULLSHIT)
|
|
|
|
|
|
pragmatismo.com.br
Escape from BigTech
TCO comparison: open source saves up to 87.5% over 5 years vs proprietary stacks. The numbers of freedom.
|
|
|
|
|
|
NEXT STEP
Free assessment with Pragmatismo
We analyze your current stack — model, GPU, volume, cost — and map which of the four exits makes sense. No commitment. No sales pitch.
|
| • Real monthly cost mapping by model and volume |
| • Architecture and GPU recommendation (RTX 3060, 4090 or cloud) |
| • Monthly savings estimate and ROI |
| • LGPD/GDPR compliance analysis for LLMs |
|
|
|
|
Or directly: contato@pragmatismo.com.br
|
|
|
|
|