Zyphra Releases ZAYA1-8B, a Compact MoE Model Trained Entirely on AMD Hardware
Zyphra has released ZAYA1-8B, a Mixture-of-Experts (MoE) model that activates fewer than 1 billion parameters per token yet reportedly competes with far larger open and proprietary models on math, coding, and reasoning benchmarks. The model introduces a Markovian recursive self-aggregation (RSA) test-time compute method that generates multiple reasoning traces in parallel and recursively aggregates them without unbounded context growth. Notably, ZAYA1-8B was trained entirely on AMD Instinct MI300X GPUs, demonstrating that large-scale model training is no longer locked to NVIDIA's stack. The model is available under the Apache-2.0 license, with weights on Hugging Face and hosted inference via Zyphra Cloud.
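The announcement describes the RSA method only at a high level, but the stated procedure maps onto a short loop: sample several reasoning traces in parallel, then repeatedly aggregate fixed-size groups of candidates until one answer remains, so no prompt ever grows with the number of rounds. The Python sketch below illustrates that idea under those assumptions; the function name markovian_rsa, the prompt wording, and the n_traces/group_size defaults are illustrative, not Zyphra's published implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def markovian_rsa(
    generate: Callable[[str], str],  # user-supplied model call (e.g. a client for a hosted endpoint)
    question: str,
    n_traces: int = 8,    # hypothetical default: traces sampled in the first round
    group_size: int = 4,  # hypothetical default: candidates aggregated per call
) -> str:
    """Sketch of parallel sampling plus recursive aggregation with bounded context."""
    # Round 0: sample independent reasoning traces in parallel.
    with ThreadPoolExecutor() as pool:
        traces = list(pool.map(generate, [question] * n_traces))

    # Later rounds: aggregate fixed-size groups of candidates until one remains.
    # Each aggregation prompt contains only the current group (the "Markovian"
    # property): context is bounded by group_size traces, not by the full
    # history of earlier rounds.
    while len(traces) > 1:
        groups = [traces[i:i + group_size] for i in range(0, len(traces), group_size)]
        prompts = [
            f"{question}\n\nCandidate solutions:\n" + "\n---\n".join(group)
            + "\n\nSynthesize these into a single, improved solution."
            for group in groups
        ]
        with ThreadPoolExecutor() as pool:
            traces = list(pool.map(generate, prompts))
    return traces[0]
```

Because each aggregation call sees at most group_size candidates, the context per call stays roughly constant even as the total test-time compute scales with n_traces, which is the bounded-context behavior the announcement attributes to the method.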