Zyphra ZAYA1, a Mixture-of-Experts model, sets training milestones using AMD GPUs and exceeds performance of several competitive models.
Quiver AI Summary
AMD announced that Zyphra has developed ZAYA1, the first large-scale Mixture-of-Experts model trained entirely on AMD technology, including Instinct MI300X GPUs, Pensando networking, and ROCm software. ZAYA1-base matches or exceeds models such as Llama-3-8B and Qwen3-4B across various benchmarks, while the high memory capacity of the MI300X eliminated the need for complex sharding and enabled significantly faster training. The achievement underscores AMD's role in advancing AI capabilities through partnerships, as Zyphra emphasizes its commitment to efficiency and the co-design of model architectures with hardware. The collaboration is set to continue with work on multimodal foundation models, positioning AMD and Zyphra at the forefront of AI development.
Potential Positives
- Zyphra ZAYA1 is the first large-scale Mixture-of-Experts model trained entirely on AMD technology, highlighting AMD's capabilities in large-scale AI model training.
- ZAYA1 outperforms notable models such as Llama-3-8B and OLMoE, showcasing competitive performance and enhancing Zyphra's market position in AI development.
- The AMD Instinct MI300X GPUs enabled over 10x faster model save times, significantly improving training efficiency and offering a clear advantage in AI model development.
- The collaboration between Zyphra, AMD, and IBM reflects a strategic partnership that strengthens Zyphra's innovation capabilities in building advanced multimodal foundation models.
Potential Negatives
- Dependence on AMD technology could be a risk if AMD's performance or market position declines, potentially impacting Zyphra's continued success and innovation.
- The announcement highlights other competitive models, indicating a crowded market where Zyphra must continuously prove its superiority to maintain relevance.
- There is no mention of immediate commercial applications or customer adoption for ZAYA1, which may raise questions about its market viability.
FAQ
What is ZAYA1 in the AI landscape?
ZAYA1 is the first large-scale Mixture-of-Experts model trained entirely on AMD technology, demonstrating advanced AI capabilities.
How does ZAYA1 perform compared to other AI models?
ZAYA1 outperforms models like Llama-3-8B and OLMoE, matching or exceeding the performance of Qwen3-4B and Gemma3-12B.
What technology powered the training of ZAYA1?
The training utilized AMD Instinct™ MI300X GPUs and AMD Pensando™ networking, leveraging the ROCm open software stack for efficiency.
What efficiency gains were reported with ZAYA1?
Zyphra achieved 10x faster model save times and improved throughput, enhancing training reliability and efficiency.
How can I learn more about ZAYA1's technical details?
For comprehensive information, read the Zyphra technical report and blogs on Zyphra and AMD's websites.
Disclaimer: This is an AI-generated summary of a press release distributed by GlobeNewswire. The model used to summarize this release may make mistakes. See the full release here.
$AMD Congressional Stock Trading
Members of Congress have traded $AMD stock 7 times in the past 6 months. Of those trades, 7 have been purchases and 0 have been sales.
Here’s a breakdown of recent trading of $AMD stock by members of Congress over the last 6 months:
- REPRESENTATIVE DAN NEWHOUSE purchased up to $15,000 on 08/18.
- REPRESENTATIVE CLEO FIELDS has traded it 3 times, making 3 purchases worth up to $150,000 in total (on 08/15, 08/01, and 07/30) and 0 sales.
- SENATOR JOHN BOOZMAN purchased up to $15,000 on 08/07.
- SENATOR ANGUS S. KING JR. purchased up to $15,000 on 07/21.
- REPRESENTATIVE LISA C. MCCLAIN purchased up to $15,000 on 06/17.
To track congressional stock trading, check out Quiver Quantitative's congressional trading dashboard.
$AMD Insider Trading Activity
$AMD insiders have traded $AMD stock on the open market 43 times in the past 6 months. Of those trades, 0 have been purchases and 43 have been sales.
Here’s a breakdown of recent trading of $AMD stock by insiders over the last 6 months:
- LISA T SU (Chair, President & CEO) has made 0 purchases and 4 sales selling 225,000 shares for an estimated $36,893,351.
- MARK D PAPERMASTER (Chief Technology Officer & EVP) has made 0 purchases and 26 sales selling 103,006 shares for an estimated $18,362,738.
- FORREST EUGENE NORROD (EVP & GM DESG) has made 0 purchases and 10 sales selling 38,900 shares for an estimated $7,600,589.
- PAUL DARREN GRASBY (EVP & CSO) sold 10,000 shares for an estimated $1,732,100.
- AVA HAHN (SVP, GC & Corporate Secretary) has made 0 purchases and 2 sales selling 3,011 shares for an estimated $670,804.
To track insider transactions, check out Quiver Quantitative's insider trading dashboard.
$AMD Hedge Fund Activity
We have seen 1,509 institutional investors add shares of $AMD stock to their portfolio, and 1,147 decrease their positions in their most recent quarter.
Here are some of the largest recent moves:
- UBS AM, A DISTINCT BUSINESS UNIT OF UBS ASSET MANAGEMENT AMERICAS LLC added 14,398,783 shares (+61.8%) to their portfolio in Q3 2025, for an estimated $2,329,579,101
- KINGSTONE CAPITAL PARTNERS TEXAS, LLC removed 6,868,568 shares (-100.0%) from their portfolio in Q3 2025, for an estimated $1,111,265,616
- PRICE T ROWE ASSOCIATES INC /MD/ removed 6,000,374 shares (-22.8%) from their portfolio in Q3 2025, for an estimated $970,800,509
- FMR LLC removed 5,788,751 shares (-39.8%) from their portfolio in Q3 2025, for an estimated $936,562,024
- JENNISON ASSOCIATES LLC added 3,874,319 shares (new position) to their portfolio in Q3 2025, for an estimated $626,826,071
- WELLINGTON MANAGEMENT GROUP LLP added 3,735,807 shares (+335.9%) to their portfolio in Q3 2025, for an estimated $604,416,214
- WINSLOW CAPITAL MANAGEMENT, LLC added 3,448,089 shares (new position) to their portfolio in Q3 2025, for an estimated $557,866,319
To track hedge funds' stock portfolios, check out Quiver Quantitative's institutional holdings dashboard.
$AMD Analyst Ratings
Wall Street analysts have issued reports on $AMD in the last several months. We have seen 18 firms issue buy ratings on the stock, and 0 firms issue sell ratings.
Here are some recent analyst ratings:
- Evercore ISI Group issued an "Outperform" rating on 11/12/2025
- Mizuho issued an "Outperform" rating on 11/12/2025
- Wells Fargo issued an "Overweight" rating on 11/12/2025
- Wedbush issued an "Outperform" rating on 11/10/2025
- CICC issued an "Outperform" rating on 11/07/2025
- Stifel issued a "Buy" rating on 11/05/2025
- Rosenblatt issued a "Buy" rating on 11/05/2025
To track analyst ratings and price targets for $AMD, check out Quiver Quantitative's $AMD forecast page.
$AMD Price Targets
Multiple analysts have issued price targets for $AMD recently. We have seen 29 analysts offer price targets for $AMD in the last 6 months, with a median target of $280.0.
Here are some recent targets:
- Matt Bryson from Wedbush set a target price of $290.0 on 11/12/2025
- Joseph Moore from Morgan Stanley set a target price of $260.0 on 11/12/2025
- Kevin Cassidy from Rosenblatt set a target price of $300.0 on 11/12/2025
- Suji Desilva from Roth Capital set a target price of $300.0 on 11/12/2025
- Harsh Kumar from Piper Sandler set a target price of $280.0 on 11/12/2025
- Aaron Rakers from Wells Fargo set a target price of $345.0 on 11/12/2025
- Mark Lipacis from Evercore ISI Group set a target price of $283.0 on 11/12/2025
Full Release
News Highlights:
- Zyphra ZAYA1 becomes the first large-scale Mixture-of-Experts model trained entirely on AMD Instinct™ MI300X GPUs, AMD Pensando™ networking and ROCm open software.
- ZAYA1-base outperforms Llama-3-8B and OLMoE across multiple benchmarks and rivals the performance of Qwen3-4B and Gemma3-12B.
- Memory capacity of AMD Instinct MI300X helped Zyphra simplify its training capabilities, while achieving 10x faster model save times.
SANTA CLARA, Calif., Nov. 24, 2025 (GLOBE NEWSWIRE) -- AMD (NASDAQ: AMD) announced that Zyphra has achieved a major milestone in large-scale AI model training with the development of ZAYA1, the first large-scale Mixture-of-Experts (MoE) foundation model trained using an AMD GPU and networking platform. Using AMD Instinct™ MI300X GPUs and AMD Pensando™ networking and enabled by the AMD ROCm™ open software stack, the achievement is detailed in a Zyphra technical report published today.
Results from Zyphra show that the model delivers competitive or superior performance to leading open models across reasoning, mathematics, and coding benchmarks—demonstrating the scalability and efficiency of AMD Instinct GPUs for production-scale AI workloads.
“AMD leadership in accelerated computing is empowering innovators like Zyphra to push the boundaries of what’s possible in AI,” said Emad Barsoum, corporate vice president of AI and engineering, Artificial Intelligence Group, AMD. “This milestone showcases the power and flexibility of AMD Instinct GPUs and Pensando networking for training complex, large-scale models.”
“Efficiency has always been a core guiding principle at Zyphra. It shapes how we design model architectures, develop algorithms for training and inference, and choose the hardware with the best price-performance to deliver frontier intelligence to our customers,” said Krithik Puthalath, CEO of Zyphra. “ZAYA1 reflects this philosophy and we are thrilled to be the first company to demonstrate large-scale training on an AMD platform. Our results highlight the power of co-designing model architectures with silicon and systems, and we’re excited to deepen our collaboration with AMD and IBM as we build the next generation of advanced multimodal foundation models.”
Efficient Training at Scale, Powered by AMD Instinct GPUs
The AMD Instinct MI300X GPU’s 192 GB of high-bandwidth memory enabled efficient large-scale training, avoiding costly expert or tensor sharding, which reduced complexity and improved throughput across the full model stack. Zyphra also reported more than 10x faster model save times using AMD optimized distributed I/O, further enhancing training reliability and efficiency. With only a fraction of its parameters active, ZAYA1-Base (8.3B total, 760M active) matches or exceeds the performance of models such as Qwen3-4B (Alibaba), Gemma3-12B (Google), Llama-3-8B (Meta), and OLMoE.¹
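The "total vs. active parameters" split is the defining trait of Mixture-of-Experts models: a learned router sends each token to only a few experts, so most weights sit idle on any given forward pass. The sketch below is a hypothetical, simplified illustration of top-k expert routing; it is not Zyphra's actual ZAYA1 architecture, and the function names and gate values are invented for illustration.

```python
# Illustrative sketch of Mixture-of-Experts (MoE) top-k routing.
# Hypothetical and simplified -- not the ZAYA1 implementation.

def top_k_route(gate_scores, k=2):
    """Pick the k highest-scoring experts and renormalize their weights."""
    ranked = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:k]
    total = sum(gate_scores[i] for i in chosen)
    weights = [gate_scores[i] / total for i in chosen]
    return chosen, weights

def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and combine their outputs by gate weight."""
    chosen, weights = top_k_route(gate_scores, k)
    return sum(w * experts[i](x) for i, w in zip(chosen, weights))

# Toy example: 8 experts, each a trivial scalar function.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
gate = [0.05, 0.10, 0.40, 0.05, 0.25, 0.05, 0.05, 0.05]
out = moe_forward(1.0, experts, gate, k=2)  # only experts 2 and 4 execute
```

With k=2 of 8 experts active per token, only a quarter of the expert weights participate in each step, which is how a model like ZAYA1 can have 8.3B total parameters while activating only 760M per token.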
Building on prior collaborative work, Zyphra worked closely with AMD and IBM to design and deploy a large-scale training cluster powered by AMD Instinct™ GPUs with AMD Pensando™ networking interconnect. The jointly engineered AMD and IBM system, announced earlier this quarter, combines AMD Instinct™ MI300X GPUs with IBM Cloud’s high-performance fabric and storage architecture, providing the foundation for ZAYA1’s large-scale pretraining.
For further details on the results, read the Zyphra technical report, the Zyphra blog, and the AMD blog for comprehensive overviews of the ZAYA1 model architecture, training methodology, and the AMD technologies that enabled its development.
Supporting Resources
- Follow AMD on LinkedIn
- Follow AMD on Twitter
- Read more about AMD Instinct GPUs here
- Learn more about how AMD is advancing AI innovation at www.amd.com/aiapplications
About AMD
For more than 50 years AMD has driven innovation in high-performance computing, graphics, and visualization technologies. Billions of people, leading Fortune 500 businesses, and cutting-edge scientific research institutions around the world rely on AMD technology daily to improve how they live, work, and play. AMD employees are focused on building leadership high-performance and adaptive products that push the boundaries of what is possible. For more information about how AMD is enabling today and inspiring tomorrow, visit the AMD (NASDAQ: AMD) website, blog, LinkedIn, and X pages.
Contact:
David Szabados
AMD Communications
+1 408-472-2439
[email protected]
Liz Stine
AMD Investor Relations
+1 720-652-3965
[email protected]
_________________________
¹ Testing by Zyphra as of November 14, 2025, measuring the aggregate throughput of training iterations across the full Zyphra cluster, measured in quadrillion floating point operations per second (PFLOPs). The workload was training a model comprising a set of subsequent MLPs in BFLOAT16 across the full cluster of (128) compute nodes, each containing (8) AMD Instinct™ MI300X GPUs and (8) Pensando™ Pollara 400 Interconnects running a proprietary training stack created by Zyphra. Server manufacturers may vary configurations, yielding different results. Performance may vary based on use of the latest drivers and optimizations. This benchmark was collected with AMD ROCm 6.4.