Scalable Diffusion Models with Transformers

In this work, the researchers explore a new class of diffusion models based on the transformer architecture. They train latent diffusion models of images, replacing the commonly used U-Net backbone with a transformer that operates on latent patches, and analyze how the resulting Diffusion Transformers (DiTs) scale.

Sophia · Jan 3, 2023

Abstract

We explore a new class of diffusion models based on the transformer architecture. We train latent diffusion models of images, replacing the commonly-used U-Net backbone with a transformer that operates on latent patches. We analyze the scalability of our Diffusion Transformers (DiTs) through the lens of forward pass complexity as measured by Gflops. We find that DiTs with higher Gflops -- through increased transformer depth/width or increased number of input tokens -- consistently have lower FID. In addition to possessing good scalability properties, our largest DiT-XL/2 models outperform all prior diffusion models on the class-conditional ImageNet 512x512 and 256x256 benchmarks, achieving a state-of-the-art FID of 2.27 on the latter.
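
To make the link between patch size and the "number of input tokens" concrete, here is a minimal sketch of the token arithmetic for a transformer that operates on latent patches. It is an illustration, not the authors' code: the 32x32 latent with 4 channels and the patch sizes 8, 4, and 2 are assumptions chosen for the example, and `num_tokens` and `token_dim` are hypothetical helper names. The sketch only counts tokens; it does not estimate Gflops or FID.

```python
def num_tokens(latent_size: int, patch_size: int) -> int:
    """Count the tokens produced by cutting a square latent into
    non-overlapping square patches (one token per patch)."""
    if latent_size % patch_size != 0:
        raise ValueError("patch_size must divide latent_size evenly")
    return (latent_size // patch_size) ** 2


def token_dim(patch_size: int, channels: int) -> int:
    """Raw length of one flattened patch, before any learned projection."""
    return patch_size * patch_size * channels


# Assumed example values, not taken from the abstract.
LATENT_SIZE = 32
CHANNELS = 4

for patch_size in (8, 4, 2):
    tokens = num_tokens(LATENT_SIZE, patch_size)
    dim = token_dim(patch_size, CHANNELS)
    print(f"patch size {patch_size}: {tokens} tokens, {dim} values per token")
```

Under these assumptions, halving the patch size quadruples the token count, which is one of the two ways the abstract says a DiT can be given more compute (the other being a deeper or wider transformer).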
