Amazon Web Services has scored another major win for its custom AWS Trainium accelerators after striking a deal with AI video startup Decart. The partnership will see Decart optimise its flagship Lucy model on AWS Trainium3 to support real-time video generation, and it highlights the growing appeal of custom AI accelerators as alternatives to Nvidia’s graphics processing units.
Decart is essentially going all-in on AWS, and as part of the deal, the company will also make its models available through the Amazon Bedrock platform. Developers can integrate Decart’s real-time video generation capabilities into almost any cloud application without worrying about underlying infrastructure.
Distribution through Bedrock adds to AWS’s plug-and-play capabilities and signals Amazon’s confidence in growing demand for real-time AI video. It also allows Decart to expand its reach and grow adoption among developers. AWS Trainium gives Lucy the extra processing power needed to generate high-fidelity video without sacrificing latency.
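For context, the sketch below shows what calling a Bedrock-hosted model typically looks like from application code, using boto3’s standard InvokeModel API. The model ID and request/response fields shown for Lucy are hypothetical placeholders, since Decart has not published its Bedrock schema; only the boto3 calls themselves are standard.

```python
# Minimal sketch: invoking a Bedrock-hosted model with boto3.
# The model ID "decart.lucy-v1" and the request/response fields are
# hypothetical placeholders; check the Bedrock model catalogue for the
# actual identifiers and schema once Decart's models are listed.
import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

request_body = {
    "prompt": "A drone shot over a coastline at sunset",  # hypothetical field
    "duration_seconds": 5,                                # hypothetical field
}

response = bedrock.invoke_model(
    modelId="decart.lucy-v1",            # hypothetical model ID
    contentType="application/json",
    accept="application/json",
    body=json.dumps(request_body),
)

result = json.loads(response["body"].read())
print(list(result.keys()))
```

The point of Bedrock’s managed approach is that the same few lines work for any hosted model: the developer swaps the model ID and payload rather than provisioning or managing the underlying Trainium infrastructure.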
Custom AI accelerators like Trainium provide an alternative to Nvidia’s GPUs for AI workloads. While Nvidia still dominates the AI market, with its GPUs processing the vast majority of AI workloads, it faces a growing threat from custom processors.
Why all the fuss over AI accelerators?
AWS Trainium isn’t the only option developers have. Google’s Tensor Processing Unit (TPU) line and Meta’s Training and Inference Accelerator (MTIA) chips are other examples of custom silicon, and each shares a similar advantage over Nvidia’s GPUs: an application-specific integrated circuit (ASIC) architecture. As the name suggests, ASIC hardware is engineered to handle one kind of application and to do so more efficiently than general-purpose processors.
While central processing units are generally considered the Swiss Army knife of the computing world because they can handle many kinds of applications, GPUs are more akin to a powerful electric drill. They are built to process massive volumes of repetitive, parallel computations, which makes them far better suited than CPUs to AI workloads and graphics rendering.
If the GPU is a power drill, the ASIC might be considered a scalpel, designed for extremely precise work. When building ASICs, chipmakers strip out the functional units that are irrelevant to the target task, so every part of the chip is dedicated to the job.
The approach can yield major performance and energy-efficiency gains over GPUs for the targeted workload, which helps explain the growing popularity of custom accelerators. A case in point is Anthropic, which has partnered with AWS on Project Rainier, an enormous cluster made up of hundreds of thousands of Trainium2 processors.
Anthropic says Project Rainier will provide it with hundreds of exaflops of computing power to run its most advanced AI models, including Claude Opus 4.5.
AI coding startup Poolside is also using AWS Trainium2 to train its models and plans to use the same infrastructure for inference in future. Meanwhile, Anthropic is hedging its bets and is also looking to train future Claude models on a cluster of up to one million Google TPUs. Meta Platforms is reportedly collaborating with Broadcom to develop a custom AI processor to train and run its Llama models, and OpenAI has similar plans.
The Trainium advantage
Decart chose AWS Trainium2 for its performance, which lets the company achieve the low latency that real-time video models require. Lucy has a time-to-first-frame of 40ms, meaning it begins generating video almost immediately after receiving a prompt. By streamlining video processing on Trainium, Lucy can also match the quality of much slower, more established video models like OpenAI’s Sora 2 and Google’s Veo 3, with Decart generating output at up to 30 frames per second (fps).
Decart believes Lucy will keep improving. As part of its agreement with AWS, the company has obtained early access to the newly announced Trainium3 processor, which it expects will push Lucy’s output to as much as 100 fps at lower latency. “Trainium3’s next-generation architecture delivers higher throughput, lower latency, and greater memory efficiency – allowing us to achieve up to 4x faster frame generation at half the cost of GPUs,” said Decart co-founder and CEO Dean Leitersdorf in a statement.
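To put those figures in perspective, here is a quick back-of-the-envelope check, using only the numbers quoted above, of the per-frame time budget each frame rate implies.

```python
# Back-of-the-envelope per-frame budgets from the figures quoted above.
time_to_first_frame_ms = 40           # Lucy's reported time-to-first-frame

for fps in (30, 100):                 # today's rate vs. the Trainium3 target
    per_frame_budget_ms = 1000 / fps  # milliseconds available per frame
    print(f"{fps} fps -> {per_frame_budget_ms:.1f} ms per frame")
```

At 30 fps each frame has roughly 33ms to be generated; at 100 fps that shrinks to just 10ms, which is why the lower per-step latency Leitersdorf describes matters as much as raw throughput.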
Nvidia might not be too worried about custom AI processors. The AI chip giant is reported to be designing its own ASICs to rival those built by the cloud providers. Moreover, ASICs are unlikely to replace GPUs entirely, because each chip has its own strengths. The flexibility of GPUs keeps them the default choice for large general-purpose models such as GPT-5, and they still dominate AI training. However, many AI applications have stable, predictable processing requirements, which makes them particularly well suited to running on ASICs.
The rise of custom AI processors is expected to have a profound impact on the industry. By pushing chip design towards greater customisation and boosting the performance of specialised applications, custom accelerators are setting the stage for a new wave of AI innovation, with real-time video at the forefront.
Photo courtesy of AWS re:Invent

