
AWS and Cerebras Collaboration Aims to Set a New Standard for AI Inference Speed and Performance in the Cloud


2026-03-13 23:07 Last Updated At:23:20

SEATTLE & SUNNYVALE, Calif.--(BUSINESS WIRE)--Mar 13, 2026--

Amazon Web Services, Inc. (AWS), an Amazon.com, Inc. company (NASDAQ: AMZN), and Cerebras Systems today announced a collaboration that will, in the coming months, deliver the fastest AI inference solutions available for generative AI applications and LLM workloads. The solution, to be deployed on Amazon Bedrock in AWS data centers, combines AWS Trainium-powered servers, Cerebras CS-3 systems, and Elastic Fabric Adapter (EFA) networking. Later this year, AWS will also offer leading open-source LLMs and Amazon Nova using Cerebras hardware.

This press release features multimedia. View the full release here: https://www.businesswire.com/news/home/20260313406341/en/

Amazon is deploying Cerebras Wafer Scale Engines in AWS data centers. Ultra-fast inference will be available through Amazon Bedrock, bringing industry-leading performance to the largest hyperscale cloud.

“Inference is where AI delivers real value to customers, but speed remains a critical bottleneck for demanding workloads like real-time coding assistance and interactive applications,” said David Brown, Vice President, Compute & ML Services, AWS. “What we're building with Cerebras solves that: by splitting the inference workload across Trainium and CS-3, and connecting them with Amazon’s Elastic Fabric Adapter, each system does what it's best at. The result will be inference that's an order of magnitude faster and higher performance than what's available today.”

“Partnering with AWS to build a disaggregated inference solution will bring the fastest inference to a global customer base,” said Andrew Feldman, Founder and CEO of Cerebras Systems. “Every enterprise around the world will be able to benefit from blisteringly fast inference within their existing AWS environment.”

How It Works: Inference Disaggregation

The Trainium + CS-3 solution enables “inference disaggregation,” a technique which separates AI inference into two stages: prompt processing, or “prefill,” and output generation, or “decode.” These two stages have profoundly different computational characteristics. Prefill is natively parallel, computationally intensive, and requires moderate memory bandwidth. Decode, on the other hand, is inherently serial, computationally light, and memory bandwidth intensive. Decode typically represents the majority of inference time in these scenarios because each output token must be generated sequentially.
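The claim that decode dominates inference time can be illustrated with a toy cost model. The sketch below is not from the release: the function name, its parameters, and every number in it are hypothetical placeholders chosen only to show the shape of the argument, with prefill modeled as parallel throughput and decode as a per-token cost that accumulates.

```python
def stage_latency_ms(prompt_tokens, output_tokens,
                     prefill_tokens_per_ms, decode_ms_per_token):
    """Return (prefill_ms, decode_ms) under a deliberately simple toy model."""
    # Prefill: prompt tokens are processed together, so model it as throughput.
    prefill_ms = prompt_tokens / prefill_tokens_per_ms
    # Decode: each output token waits for the previous one, so cost adds up per token.
    decode_ms = output_tokens * decode_ms_per_token
    return prefill_ms, decode_ms


# Hypothetical placeholder values, not measurements of any real system.
prefill_ms, decode_ms = stage_latency_ms(
    prompt_tokens=2000, output_tokens=500,
    prefill_tokens_per_ms=10.0, decode_ms_per_token=20.0,
)
decode_share = decode_ms / (prefill_ms + decode_ms)
print(f"prefill={prefill_ms:.0f} ms, decode={decode_ms:.0f} ms, decode share={decode_share:.0%}")
```

Under these assumed values the sequential stage accounts for nearly all of the request time, which is the pattern the release describes. Real ratios depend on the model, hardware, and request mix.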

Because each stage has a different computational challenge, they each benefit from different compute architectures and low-latency, high-bandwidth EFA networking between them. By strategically disaggregating the inference problem — with Trainium optimized for prefill and the Cerebras CS-3 optimized for decode — the two different computational challenges can be optimized in a specialized way.
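The split described above can be sketched as a two-stage pipeline. This is a minimal illustration under stated assumptions, not the AWS or Cerebras implementation: the class names, the `KVCache` stand-in, and the arithmetic used in place of a real model are all hypothetical. It shows only the control flow, with one worker processing the whole prompt at once and a second worker generating tokens one at a time from the handed-off state.

```python
from dataclasses import dataclass, field


@dataclass
class KVCache:
    """Hypothetical stand-in for the per-request state handed from prefill to decode."""
    entries: list = field(default_factory=list)


class PrefillWorker:
    """Toy prefill stage: every prompt token is known up front."""

    def run(self, prompt):
        # Each entry depends only on its own token, so a real system could
        # compute them in parallel. The doubling is placeholder arithmetic.
        return KVCache(entries=[token * 2 for token in prompt])


class DecodeWorker:
    """Toy decode stage: each new token depends on all state so far."""

    def run(self, cache, max_new_tokens):
        output = []
        for _ in range(max_new_tokens):
            # Placeholder "model": derive the next token from the full cache.
            next_token = sum(cache.entries) % 100
            output.append(next_token)
            # The new token extends the cache before the next step can start,
            # which is what makes this loop inherently serial.
            cache.entries.append(next_token * 2)
        return output


def serve(prompt, max_new_tokens):
    cache = PrefillWorker().run(prompt)
    # In a disaggregated deployment the cache would cross the network here,
    # from the prefill hardware to the decode hardware.
    return DecodeWorker().run(cache, max_new_tokens)


tokens = serve([1, 2, 3], max_new_tokens=3)
print(tokens)
```

Keeping the handoff as an explicit object makes the design constraint visible: the two stages share nothing except the cache, so the link that carries it has to be fast enough not to cancel out the gains from specialization.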

Built on the AWS Nitro System — the foundation of AWS's secure, high-performance cloud infrastructure — the new solution will ensure that Cerebras CS-3 systems and Trainium-powered instances operate with the same security, isolation, and operational consistency customers expect from AWS.

AWS Trainium for Prefill and Cerebras CS-3 for Decode

Trainium is Amazon's purpose-built AI chip, designed to deliver scalable performance and cost efficiency for training and inference across a broad range of generative AI workloads. Two of the world's leading AI labs—Anthropic and OpenAI—are committed to Trainium. Anthropic has named AWS its primary training partner and is using Trainium to train and deploy its models, while OpenAI will consume 2 gigawatts of Trainium capacity through AWS infrastructure to support demand for Stateful Runtime Environment, frontier models, and other advanced workloads. Since its recent release, Trainium3 has seen strong customer adoption, with organizations across industries committing significant capacity.

Cerebras' CS-3 is the world's fastest AI inference system. It delivers thousands of times greater memory bandwidth than the fastest GPU. Reasoning models now represent a majority of inference compute and generate more tokens per request as they “think” through problems, so the need to accelerate this portion of the workflow has grown accordingly. OpenAI, Cognition, Mistral, and others use Cerebras to accelerate their most demanding workloads, especially agentic coding, where developer productivity is constrained by inference speed.

In the disaggregated solution, CS-3 will be fully dedicated to decoding acceleration, enabling dramatically higher capacity for fast output tokens. With Trainium handling prefill, the CS-3 handling decode operations, and high-speed EFA networking connecting them, each processor will deliver maximum token capacity for its focused part of the workload.

About Amazon Web Services

Amazon Web Services (AWS) is guided by customer obsession, pace of innovation, commitment to operational excellence, and long-term thinking. By democratizing technology for nearly two decades and making cloud computing and generative AI accessible to organizations of every size and industry, AWS has built one of the fastest-growing enterprise technology businesses in history. Millions of customers trust AWS to accelerate innovation, transform their businesses, and shape the future. With the most comprehensive AI capabilities and global infrastructure footprint, AWS empowers builders to turn big ideas into reality. Learn more at aws.amazon.com and follow @AWSNewsroom.

About Cerebras Systems

Cerebras Systems builds the fastest AI infrastructure in the world. We are a team of pioneering computer architects, computer scientists, AI researchers, and engineers of all types. We have come together to make AI blisteringly fast through innovation and invention because we believe that when AI is fast it will change the world. Our flagship technology, the Wafer Scale Engine 3 (WSE-3), is the world’s largest and fastest AI processor. 56 times larger than the largest GPU, the WSE uses a fraction of the power per unit compute while delivering inference and training more than 20 times faster than the competition. Leading corporations, research institutes and governments on four continents have chosen Cerebras to run their AI workloads. Cerebras solutions are available on premise and in the cloud. For further information, visit cerebras.ai or follow us on LinkedIn, X and/or Threads.

This press release contains forward-looking statements, including statements regarding the expected benefits of our products and the transaction described herein. These statements are subject to risks and uncertainties that could cause actual results to differ materially. Neither we nor any other person assumes responsibility for the accuracy and completeness of forward-looking statements. The forward-looking statements included in this press release relate only to events and information as of the date hereof. Cerebras undertakes no obligation to update or revise any forward-looking statement as a result of new information, future events or otherwise, except as otherwise required by law.


Stock indexes on Wall Street are losing ground in morning trading Friday, as the fallout from the war with Iran keeps pressure on oil prices, destabilizing the global economy.

The S&P 500 was down 0.2% after having been up as much as 0.9% in the early going. The Dow Jones Industrial Average was up 34 points, or 0.1%, as of 11:06 a.m. Eastern time, and the Nasdaq composite was 0.4% lower.

The latest choppy trading follows heavy turbulence in the market earlier in the week, which has the major indexes headed for their third straight losing week.

In the energy market, which has been roiled by the Iran war and its impact on supplies of crude oil and gas, the price of a barrel of Brent crude, the international standard, was above $100 per barrel, though still 0.2% below its $100.46 closing price on Thursday. It’s up more than 37% for the month.

U.S. crude oil was up 0.1% to $95.83 a day after settling at $95.73 per barrel. It’s up around 43% this month.

Oil prices have been volatile since the Iran war began. Iran’s actions have effectively stopped cargo traffic through the narrow Strait of Hormuz, where a fifth of the world’s oil typically sails. That has oil producers cutting production because their crude has nowhere to go.

If the war continues to hamper the production and transportation of oil from the Persian Gulf, it could cause a surge in inflation that could hurt the global economy. Analysts have said that if the Strait of Hormuz remains closed, oil prices could jump to $150 relatively quickly.

While the International Energy Agency said Wednesday its members would make a record 400 million barrels of oil available from their emergency reserves, some economists believe that would do little to reassure markets.

President Donald Trump signaled earlier this week that he would take more action to address the squeeze on oil flows. The move follows the administration’s decision to grant temporary permission for India to buy Russian oil.

A new snapshot of consumer spending Friday shows inflation crept higher in January, even before the Iran war caused oil and gas prices to spike.

The Commerce Department said prices rose 2.8% in January compared with a year earlier. But excluding the volatile food and energy categories — which the Federal Reserve pays closer attention to — core prices rose 3.1%, up from 3% in the prior month and the highest in nearly two years.

Even so, consumers still lifted their spending at a solid 0.4% pace in January, with their incomes rising at the same pace, according to the report.

Consumer spending powers about two-thirds of the economy, which is why economists keep a close watch on trends in incomes and spending.

The University of Michigan's latest gauge of consumer sentiment, released Friday, showed sentiment declined slightly to its lowest reading of the year as gasoline prices have risen since the start of the war in Iran.

Meanwhile, the Labor Department said Friday U.S. job openings jumped to nearly 7 million in January, topping economists’ forecasts.

Wall Street also got an update on how U.S. economic growth fared in the October-December quarter. The economy, hobbled by last fall’s 43-day government shutdown, grew at a sluggish 0.7% annual rate, a downgrade from its initial estimate last month.

“GDP and the job market have been expanding, but the rate of change has been slowing, which leads to concerns about the overall economy -- and that was even before we started a war in the Middle East, which spiked the price of oil,” Chris Zaccarelli, chief investment officer for Northlight Asset Management, said in an email.

Most of the sectors in the S&P 500 were rising Friday, with financial and health care stocks driving most of the gains. JPMorgan rose 1.1% and Eli Lilly added 1.6%.

Software maker Adobe fell 6% even after it beat Wall Street’s sales and profit forecasts. Investors were likely underwhelmed by the company’s forecast for its recurring subscription revenue.

Ulta Beauty slid 10.5% for the biggest decline among S&P 500 stocks after the beauty and makeup retailer's latest quarterly results fell short of analysts’ profit targets. Ulta’s profit was dinged by a 23% increase in selling, general and administrative expenses, which jumped to $1 billion in the period.

Bitcoin rose 4.6% to just around $72,777, boosting companies that trade or hoard the cryptocurrency. Coinbase Global rose 2.4% and Strategy gained 4.9%.

In the bond market, the yield on the 10-year Treasury fell to 4.25% from 4.26% late Thursday. It was just 3.97% before the war started.

Higher yields help make all kinds of borrowing more expensive, such as mortgages for potential U.S. homebuyers and bond offerings for companies looking to expand. They also push down on prices for all kinds of investments, from stocks to crypto.

In stock markets abroad, indexes rose in Europe after falling in Asia.

In early European trading, Britain’s FTSE 100 rose 0.2%, Germany’s DAX added 0.2% and France’s CAC 40 gained 0.4%.

Tokyo’s Nikkei 225 index slipped 1.2%. Technology-related stocks saw some of the bigger losses, with SoftBank Group falling 4.5%.

Ryan Falvey works on the floor at the New York Stock Exchange in New York, Tuesday, March 10, 2026. (AP Photo/Seth Wenig)

A motorist fills up the tank of a vehicle at a Costco gasoline station Thursday, March 12, 2026, in east Denver. (AP Photo/David Zalubowski)

The per-gallon price for premium unleaded fuel is displayed electronically on a pump at a Costco gasoline station Thursday, March 12, 2026, in east Denver. (AP Photo/David Zalubowski)

A person walks in front of an electronic stock board showing Japan's Nikkei index at a securities firm Friday, March 13, 2026, in Tokyo. (AP Photo/Eugene Hoshiko)

Gregg Maloney works on the floor at the New York Stock Exchange in New York, Tuesday, March 10, 2026. (AP Photo/Seth Wenig)
