
[MAX] Add AutoencoderKL VAE implementation for Flux.2 pipeline #5889

Open
byungchul-sqzb wants to merge 15 commits into modular:main from SqueezeBits:add/flux2-pipeline/models-vae

Conversation

@byungchul-sqzb
Contributor

Overview

This PR adds the VAE (Variational Autoencoder) model stack required for the Flux2 pipeline. It introduces a new AutoencoderKLFlux2 implementation with Flux2-specific configuration, encoder/decoder components, and supporting layers for image-to-latent encoding and latent-to-image decoding.

In particular, it implements the VAE encoder functionality needed for image-conditioned generation in the Flux2 pipeline.

What's included

  • New AutoencoderKLFlux2 architecture (autoencoder_kl_flux2.py):

    • Support for BatchNorm statistics for latent patchification
    • Image encoding to latents and decoding from latents
  • Enhanced VAE components (vae.py):

    • Extended Encoder and Decoder implementations
    • Support for Flux2's latent patchification process
    • Image conditional encoding logic
  • New downsampling layer (layers/downsampling.py):

    • Downsample2D module for spatial resolution reduction
    • Optional convolution-based downsampling
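The latent patchification mentioned above can be sketched as a pure-array reshape. This is an illustrative sketch only; the function name and the patch size of 2 are assumptions, not the PR's actual API:

```python
import numpy as np

def patchify_latents(latents: np.ndarray, patch_size: int = 2) -> np.ndarray:
    """Fold spatial patches into channels: [N, C, H, W] -> [N, C*p*p, H//p, W//p]."""
    n, c, h, w = latents.shape
    p = patch_size
    # Split H and W into (H//p, p) and (W//p, p) blocks...
    x = latents.reshape(n, c, h // p, p, w // p, p)
    # ...then move the per-patch axes next to the channel axis.
    x = x.transpose(0, 1, 3, 5, 2, 4)
    return x.reshape(n, c * p * p, h // p, w // p)
```

With a 2x2 patch, a 16-channel latent becomes a 64-channel tensor at half the spatial resolution, which is the general shape transformation Flux-style transformers expect.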

NOTES

  • This model implementation does NOT serve as a standalone executable model. It is expected to be executed within the Flux2 pipeline.
  • The VAE encoder supports image-conditioning logic for text-to-image generation workflows.
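As a rough illustration of what the Downsample2D module does, here is a self-contained sketch. The real module presumably uses a learned stride-2 convolution for its conv path; this stand-in substitutes average pooling so the sketch runs without weights, and the class and flag names are assumptions:

```python
import numpy as np

class Downsample2D:
    """Sketch of a 2x spatial downsampler with an optional conv-style path."""

    def __init__(self, use_conv: bool = False):
        self.use_conv = use_conv

    def __call__(self, x: np.ndarray) -> np.ndarray:
        # x: [N, C, H, W] with even H and W.
        if self.use_conv:
            # Placeholder for a learned stride-2 convolution; approximated
            # here by 2x2 average pooling to keep the sketch self-contained.
            n, c, h, w = x.shape
            return x.reshape(n, c, h // 2, 2, w // 2, 2).mean(axis=(3, 5))
        # Non-conv path: plain stride-2 subsampling.
        return x[:, :, ::2, ::2]
```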

@byungchul-sqzb byungchul-sqzb marked this pull request as ready for review February 3, 2026 07:13
Contributor

@katelyncaldwell katelyncaldwell left a comment


A few comments to address, but then good to merge on my end!

@@ -0,0 +1,325 @@
# ===----------------------------------------------------------------------=== #
# Copyright (c) 2025, Modular Inc. All rights reserved.
Contributor


2026 😄

Contributor Author


Done! Commit


return converted_weights

def load_model(self) -> Any:
Contributor


A lot of this function seems directly copied from BaseAutoencoderModel. Would it make sense to refactor and avoid the duplication?

Contributor Author


I’ve now refactored it to eliminate the redundancy as suggested:

  • Weight Handling: Moved weight dtype check and conversion logic into the BaseAutoencoderModel so it’s handled centrally.
  • Encoder Logic: Integrated quant_conv logic into the common VAE Encoder class, controlled by a boolean flag.
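The flag-controlled pattern described in the second bullet might look roughly like this sketch. The class and method names are placeholders standing in for the PR's actual code, with the layer internals stubbed out:

```python
class Encoder:
    """Sketch of a shared VAE encoder with an optional quant_conv stage."""

    def __init__(self, use_quant_conv: bool = False):
        self.use_quant_conv = use_quant_conv
        self.quant_conv_called = False  # instrumentation for this sketch only

    def __call__(self, x):
        h = self.encode_blocks(x)
        # The Flux2-specific variant skips this projection; the classic
        # AutoencoderKL-style variant enables it via the constructor flag.
        if self.use_quant_conv:
            h = self.quant_conv(h)
        return h

    def encode_blocks(self, x):
        # Placeholder for the shared conv/attention encoder stack.
        return x

    def quant_conv(self, h):
        # Placeholder for the optional 1x1 projection some VAEs apply.
        self.quant_conv_called = True
        return h
```

Pushing the branch into a constructor flag keeps a single Encoder class serving both model families instead of two near-identical subclasses.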

return self.decoder(z, temb)


class BatchNormStats:
Contributor


Should this be a @dataclass?

Contributor Author


Done! Commit
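The @dataclass form the review settles on might look like this; the field names are assumptions, since the PR's actual statistics fields aren't shown here:

```python
from dataclasses import dataclass

@dataclass
class BatchNormStats:
    """Running statistics used to normalize patchified latents (sketch)."""

    mean: float
    var: float
    eps: float = 1e-6

    def normalize(self, x: float) -> float:
        # Standard batch-norm style normalization with numerical epsilon.
        return (x - self.mean) / (self.var + self.eps) ** 0.5
```

A dataclass buys an auto-generated `__init__`, `__repr__`, and `__eq__` for what is essentially a plain statistics container.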

autoencoder_class=AutoencoderKLFlux2,
)

def convert_weights_to_target_dtype(
Contributor


Consider adding more dtype checks to ensure we're only casting between float types (e.g. if we want to support a quantized dtype, this weight adapter should not be a source of subtle dtype conversion bugs)

Contributor Author

@byungchul-sqzb byungchul-sqzb Feb 5, 2026


Done! Commit
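A float-only cast guard along the lines the reviewer suggests could look like this sketch (the function signature and variable names are assumptions, not the PR's actual implementation):

```python
import numpy as np

# Float dtypes between which casting is considered safe.
_FLOAT_TYPES = (np.float16, np.float32, np.float64)

def convert_weights_to_target_dtype(weights: dict, target_dtype) -> dict:
    """Cast float weights to target_dtype; leave non-float weights untouched."""
    target_is_float = np.dtype(target_dtype).type in _FLOAT_TYPES
    converted = {}
    for name, w in weights.items():
        if target_is_float and w.dtype.type in _FLOAT_TYPES:
            converted[name] = w.astype(target_dtype)
        else:
            # e.g. quantized int8 weights pass through unchanged, so the
            # adapter can never silently dequantize or re-quantize them.
            converted[name] = w
    return converted
```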


Args:
hidden_states: Input tensor of shape [N, C, H, W].
*args: Additional positional arguments (ignored, kept for compatibility).
Contributor


I would prefer to avoid this pattern if possible. I am assuming this is for compatibility with diffusers? Can we remove it or will that lead to challenges?

Contributor Author


Done! Commit

@byungchul-sqzb byungchul-sqzb force-pushed the add/flux2-pipeline/models-vae branch from 0950704 to 999f8dc Compare February 5, 2026 04:07
Contributor

@katelyncaldwell katelyncaldwell left a comment


looks great! I am going to merge 😄

@katelyncaldwell
Contributor

!sync

@modularbot modularbot added the imported-internally Signals that a given pull request has been imported internally. label Feb 5, 2026