US Startup Arcee AI Challenges China’s Lead in Open-Source AI with New Trinity Models


For much of 2025, China has dominated the development of cutting-edge open-weight language models. Labs such as Alibaba's Qwen team, DeepSeek, and Baidu have rapidly advanced Mixture-of-Experts (MoE) models, often pairing permissive licenses with strong benchmark performance. Now, a U.S. company, Arcee AI, is launching a direct challenge to this trend with its new "Trinity" family of open-weight models.

The Rise of Open-Weight AI in China

Chinese research labs have taken the lead in developing large-scale, open MoE models, helped by permissive licensing and strong benchmark performance. OpenAI released its own open-weight model, gpt-oss, but adoption has been slow because better-performing alternatives exist. The trend raises questions about whether the U.S. can compete in open-source AI, and why the most significant advances are happening abroad.

Arcee AI’s Trinity Models: A U.S.-Built Alternative

Today, Arcee AI announced the release of Trinity Mini and Trinity Nano Preview, the first two models in its new "Trinity" family. Both are fully trained in the United States and released under the enterprise-friendly Apache 2.0 license. Users can try Trinity Mini in a chatbot at chat.arcee.ai, and developers can download the weights from Hugging Face for modification and fine-tuning.
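For developers who want to experiment locally, the sketch below shows one plausible way to load the weights with Hugging Face transformers. The repository name "arcee-ai/Trinity-Mini" is an assumption for illustration, not a confirmed identifier; check Arcee's Hugging Face organization for the actual repo before running.

```python
# Hedged sketch: loading Trinity Mini from Hugging Face with transformers.
# The repo id below is assumed for illustration; verify the real name first.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "arcee-ai/Trinity-Mini"  # assumed repo id, not confirmed by the article

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype="auto",       # pick the checkpoint's native precision
    device_map="auto",        # requires `accelerate`; places layers across devices
    trust_remote_code=True,   # a new architecture like AFMoE may ship custom code
)

prompt = "Explain mixture-of-experts routing in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```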

While smaller than the largest models, these releases represent the first U.S. attempt to build end-to-end open-weight models at scale, trained from scratch on American infrastructure with a U.S.-curated dataset. According to Arcee CTO Lucas Atkins, “I’m experiencing a combination of extreme pride in my team and crippling exhaustion, so I’m struggling to put into words just how excited I am to have these models out.”

A third model, Trinity Large, is already in training: a 420B parameter model with 13B active parameters per token, scheduled to launch in January 2026.

Trinity’s Technical Edge: AFMoE Architecture

Arcee’s Trinity models utilize a new Attention-First Mixture-of-Experts (AFMoE) architecture. This design combines global sparsity, local/global attention, and gated attention techniques to improve stability and efficiency at scale.

AFMoE differs from traditional MoE models in how it decides which "experts" to consult: instead of a hard top-k cut over ranked scores, it uses smoother sigmoid gating, allowing a more graceful blending of multiple experts. The "attention-first" approach means the model invests more heavily in how it attends to different parts of the conversation, improving long-context reasoning.
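Arcee has not published AFMoE's router internals in this announcement, so the snippet below is only a generic illustration of the difference between conventional softmax top-k routing and sigmoid gating; the token and expert counts are placeholders, not Trinity's actual configuration.

```python
import torch
import torch.nn.functional as F

def topk_softmax_router(scores: torch.Tensor, k: int = 2):
    """Conventional MoE routing: softmax over expert scores, keep the top k."""
    probs = F.softmax(scores, dim=-1)              # [tokens, n_experts]
    weights, experts = probs.topk(k, dim=-1)       # hard cut-off at rank k
    weights = weights / weights.sum(dim=-1, keepdim=True)
    return weights, experts

def sigmoid_router(scores: torch.Tensor, k: int = 2):
    """Sigmoid-gated routing: each expert gets an independent 0..1 gate,
    so the mixture weights change smoothly as scores shift."""
    gates = torch.sigmoid(scores)                  # independent per-expert gates
    weights, experts = gates.topk(k, dim=-1)
    weights = weights / (weights.sum(dim=-1, keepdim=True) + 1e-9)
    return weights, experts

# Toy example: 4 tokens routed over 8 experts.
scores = torch.randn(4, 8)
print(topk_softmax_router(scores))
print(sigmoid_router(scores))
```

The practical difference is at the margins: with softmax top-k, a small change in scores can abruptly swap which experts are selected and how much weight they carry, whereas independent sigmoid gates degrade more gradually, which is the kind of stability benefit Arcee attributes to the design.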

Trinity Mini is a 26B-parameter model with 3B active parameters per token, designed for high-throughput reasoning, function calling, and tool use. Trinity Nano Preview is a 6B-parameter model with roughly 800M active non-embedding parameters, a more experimental, chat-focused model with a stronger personality but lower reasoning robustness.

Performance and Access

Trinity Mini performs competitively with larger models across reasoning tasks, including outperforming gpt-oss on the SimpleQA benchmark, MMLU, and BFCL V3. Reported scores include:

  • MMLU (zero-shot): 84.95
  • Math-500: 92.10
  • GPQA-Diamond: 58.55
  • BFCL V3: 59.67

The model sustains throughput of more than 200 tokens per second with sub-three-second end-to-end latency, making it viable for interactive applications. Both models are released under the Apache 2.0 license and available via Hugging Face, OpenRouter, and Arcee's website. API pricing for Trinity Mini via OpenRouter is $0.045 per million input tokens and $0.15 per million output tokens.
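Since OpenRouter exposes an OpenAI-compatible endpoint, calling Trinity Mini over the API would look roughly like the sketch below. The model slug "arcee-ai/trinity-mini" is a guess for illustration; confirm the actual identifier on openrouter.ai before use.

```python
# Minimal sketch of calling Trinity Mini through OpenRouter's
# OpenAI-compatible API. The model slug is assumed, not confirmed.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_API_KEY",
)

response = client.chat.completions.create(
    model="arcee-ai/trinity-mini",  # hypothetical slug; verify on openrouter.ai
    messages=[
        {"role": "user", "content": "Summarize the AFMoE architecture in two sentences."},
    ],
)
print(response.choices[0].message.content)
```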

Partnerships for Data and Infrastructure

Arcee's success relies on strategic partnerships. DatologyAI, a data curation startup, filters, deduplicates, and augments the training data to keep quality high while reducing legal and bias risks. DatologyAI has constructed a 10-trillion-token curriculum for Trinity, spanning general data, high-quality text, and STEM-heavy material.

Prime Intellect provides the infrastructure: 512 H200 GPUs in a custom bf16 pipeline for training Trinity Mini and Nano. Prime Intellect is also hosting the 2,048 B300 GPU cluster for the upcoming Trinity Large.

The Future of U.S. AI: Model Sovereignty

Arcee’s push into full pretraining reflects a broader strategy: owning the entire training loop for compliance and control, especially as AI systems become more autonomous. The company argues that controlling the weights and training pipeline is crucial for building reliable, adaptable AI products.

If Trinity Large lands as planned in January 2026, it would be one of the only fully open-weight, U.S.-trained frontier-scale models, positioning Arcee as a key player in the open ecosystem.

Arcee’s Trinity launch signals a renewed effort to reclaim ground for transparent, U.S.-controlled model development, showing that smaller companies can still push boundaries in an open fashion.
