
feat: Implement ZSTD compression for DA blobs #3104

Draft

alpe wants to merge 7 commits into main from alex/blob-compression

Conversation


@alpe alpe commented Feb 24, 2026

Resolves #2532

Overview

  • Compress blobs via ZSTD
  • Select the compression level dynamically based on load

Ref: Zstd benchmarks in ethereum-optimism/optimistic-specs#174

Arbitrum uses Brotli with dynamic compression levels: level 11 for low traffic; levels 0-1 during spikes.
https://docs.arbitrum.io/how-arbitrum-works/inside-arbitrum-nitro#:~:text=The%20data%20then%20undergoes%20compression,toward%20speed%20during%20heavy%20backlogs.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added compression support for data availability submissions, automatically selecting compression levels based on batch size and providing detailed compression metrics.
  • Documentation

    • Updated guidelines for composing error messages with explicit naming conventions.
  • Chores

    • Updated klauspost/compress library dependency to v1.18.4 across all modules.


coderabbitai bot commented Feb 24, 2026

Important

Review skipped

Draft detected.

📝 Walkthrough

This pull request implements blob compression before submission to reduce bandwidth usage and storage costs. It introduces zstd-based compression infrastructure in a new module, integrates compression into the DA client with dynamic level selection based on batch size, updates dependencies to klauspost/compress v1.18.4, adds a documentation guideline, and includes comprehensive test coverage.

Changes

Cohort / File(s) Summary
Documentation & Conventions
CLAUDE.md
Added error handling guideline prohibiting error messages starting with "failed to".
Dependency Updates
go.mod, apps/evm/go.mod, apps/grpc/go.mod, apps/testapp/go.mod, execution/evm/test/go.mod, test/e2e/go.mod
Bumped klauspost/compress from v1.18.0 to v1.18.4 (indirect to direct in root; indirect in submodules).
Compression Module
pkg/da/compression.go
Implemented zstd-based compression with three levels (Fastest, Default, Best), magic-header prefix detection, and backward-compatible decompression for uncompressed data.
Compression Tests
pkg/da/compression_test.go
Added 12+ unit tests covering round-trip compression/decompression, edge cases (nil/empty inputs), error handling, data validation, and compression effectiveness across all levels.
DA Client Integration
block/internal/da/client.go
Integrated compression into Submit method: dynamic level selection, per-blob compression with logging, maximum size enforcement, and decompression on data retrieval with error propagation. Updated Submit signature to include options parameter.

Sequence Diagram

sequenceDiagram
    participant Client as DA Client
    participant Compress as Compression Module
    participant Submit as Blob Submission
    participant Retrieve as Blob Retrieval
    participant Decompress as Decompression Module

    Client->>Client: Calculate compression level<br/>(batch size dependent)
    Client->>Compress: Compress(blob, level)
    Compress-->>Client: Compressed blob + magic prefix
    Client->>Client: Validate compressed size<br/>≤ max limit
    Client->>Submit: Submit compressed blob
    Submit-->>Client: Submission result

    Retrieve->>Retrieve: Fetch blob data
    Retrieve->>Decompress: Decompress(blob data)
    Decompress->>Decompress: Check magic prefix
    alt Compressed Data
        Decompress->>Decompress: Decompress zstd payload
        Decompress-->>Retrieve: Original data
    else Uncompressed Data
        Decompress-->>Retrieve: Data unchanged
    end
    Retrieve-->>Client: Retrieved data

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 Blobs now compress so tight and neat,
With zstd magic, less bytes to treat!
Faster networks, storage's lean,
The smallest bundles ever seen! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Docstring Coverage: ⚠️ Warning. Docstring coverage is 18.75%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (4 passed)
Title check: ✅ Passed. The title 'feat: Implement ZSTD compression for DA blobs' clearly and specifically summarizes the main change. It follows semantic commit conventions and is concise.
Description check: ✅ Passed. The PR description includes an overview explaining the implementation (ZSTD compression with dynamic levels), references the resolved issue (#2532), and provides relevant context with benchmarks and operational references.
Linked Issues check: ✅ Passed. The PR fully addresses issue #2532: it implements ZSTD compression, adds dynamic level selection based on load, integrates compression into the DA client, and maintains backward compatibility through transparent compression/decompression.
Out of Scope Changes check: ✅ Passed. All changes are directly related to the compression implementation: the new compression package, DA client integration, klauspost/compress dependency updates, the CLAUDE.md documentation update, and tests. No unrelated modifications detected.



claude bot commented Feb 24, 2026

Claude finished @alpe's task in 2m 5s.


Code Review: feat: Implement ZSTD compression for DA blobs

  • Gather context and read changed files
  • Review compression module (pkg/da/compression.go)
  • Review DA client integration (block/internal/da/client.go)
  • Review tests (pkg/da/compression_test.go)
  • Post final review

Overall this is a well-thought-out implementation with good security hardening (frame header pre-check, memory cap, decompression timeout). The backward compatibility via magic prefix is clean. A few issues worth addressing before merging.


pkg/da/compression.go

Goroutine leak on decompression timeout — when decompressTimeout (500ms) fires, the DecodeAll goroutine continues running to completion in the background. For a malicious 7 MiB compressed payload that is deliberately slow to decode, this could accumulate goroutines. The channel is buffered so they won't leak permanently, but you're still doing the work you wanted to abort. Consider documenting this limitation or using a context-aware streaming decoder.

Missing "Layer 2" label (compression.go:119): the comment reads // Layer 3: but there is no Layer 1 comment in the function body and no Layer 2 at all.

Two allocations in Compress (compression.go:83-90): the code calls EncodeAll(data, nil) and then copies the result into a new slice with the magic prefix prepended. This can be reduced to a single allocation:

dst := make([]byte, len(magic))
copy(dst, magic)
return encoders[level].EncodeAll(data, dst), nil

IsCompressed idiom (compression.go:145-148): the manual byte comparison is correct, but bytes.HasPrefix(data, magic) is more idiomatic.


block/internal/da/client.go

Error message violates project convention (client.go:133): the new "failed to build blob %d: %v" directly violates the CLAUDE.md guideline added in this same PR: "Never start an error message with 'failed to'". Should be e.g. "build blob %d: %v".

Silent blob dropping in Get (client.go:404-411): when decompression fails in Get, the blob is silently skipped. Unlike Retrieve, where skipping is acceptable (the caller just gets what's on-chain), Get is called with a specific list of IDs. Silently returning fewer blobs than requested could cause subtle bugs for callers that don't check counts. Consider returning an error here instead.

Magic threshold numbers (client.go:96-99): the values 10 and 3 are undocumented policy thresholds. Named constants with a doc comment explaining the intended batching semantics would make the policy tunable and self-documenting.

blobSize reports uncompressed bytes (client.go:76-78): blobSize is calculated before compression, so ResultSubmit.BlobSize reflects the original data size, not the bytes actually submitted to the DA layer. This can mislead cost estimation. At minimum, add a comment clarifying that this is the uncompressed size.


pkg/da/compression_test.go

TestDecompress_ContextCanceled can be non-deterministic (compression_test.go:190-202): the test pre-cancels the context and decompresses a tiny 34-byte payload. DecodeAll on such a small payload will almost certainly complete before the goroutine's select runs, meaning both ch and ctx.Done() are ready simultaneously. Go's select picks randomly, so this test can flake. Use a payload large enough that DecodeAll realistically takes longer, or restructure the test to not rely on timing.

sizes slice shared across subtests (compression_test.go:63-83): appending to sizes inside t.Run callbacks would be a data race if t.Parallel() were added. Consider pre-allocating with make([]int, len(levels)) and using index assignment instead of append.


Positive Highlights

  • Excellent security posture: frame header pre-check (zero-allocation early rejection) + WithDecoderMaxMemory + decompression timeout guards against bomb attacks in layers.
  • max(len(raw), 1) at line 116 correctly prevents NaN in the compression ratio log — nice.
  • Backward compatibility through magic-prefix passthrough in Decompress is clean.
  • The init() panic on encoder/decoder creation is appropriate — these are singleton infrastructure that must be available.


github-actions bot commented Feb 24, 2026

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

Build: ✅ passed | Format: ⏩ skipped | Lint: ✅ passed | Breaking: ✅ passed | Updated (UTC): Feb 25, 2026, 9:50 AM

@gemini-code-assist

Summary of Changes

Hello @alpe, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request integrates ZSTD compression for Data Availability (DA) blobs, enhancing the efficiency of data handling by reducing the size of blobs transmitted and stored. The changes involve updating the klauspost/compress library, introducing a dedicated compression utility package, and modifying the DA client to transparently apply compression during submission and decompression during retrieval. This optimization aims to decrease operational costs and improve performance without altering the external interface for blob operations.

Highlights

  • ZSTD Compression for DA Blobs: Implemented ZSTD compression for Data Availability (DA) blobs, significantly reducing their size before submission and transparently decompressing them on retrieval, to optimize bandwidth and storage.
  • Dependency Update: Updated the klauspost/compress Go module dependency across various applications to version v1.18.4 to support the new compression features.
  • New Compression Package: Introduced a new pkg/da/compression package containing utility functions for ZSTD compression, decompression, and a custom magic prefix check for identifying compressed data.
  • DA Client Integration: Modified the DA client (block/internal/da/client.go) to automatically compress blobs before submission and decompress them upon retrieval, including logging compression ratios for monitoring.
  • Comprehensive Testing: Added comprehensive unit tests for the new compression and decompression logic to ensure data integrity, round-trip correctness, and efficiency across various data types and sizes.
  • Documentation Update: Updated development guidelines in CLAUDE.md to include a new rule against starting error messages with 'failed to'.


Changelog
  • CLAUDE.md
    • Added a new guideline for error message formatting.
  • apps/evm/go.mod
    • Updated github.com/klauspost/compress dependency to v1.18.4.
  • apps/evm/go.sum
    • Updated checksums for github.com/klauspost/compress.
  • apps/grpc/go.mod
    • Updated github.com/klauspost/compress dependency to v1.18.4.
  • apps/grpc/go.sum
    • Updated checksums for github.com/klauspost/compress.
  • apps/testapp/go.mod
    • Updated github.com/klauspost/compress dependency to v1.18.4.
  • apps/testapp/go.sum
    • Updated checksums for github.com/klauspost/compress.
  • block/internal/da/client.go
    • Integrated ZSTD compression for outgoing blobs and decompression for incoming blobs.
    • Added error handling for compression and decompression failures.
    • Included logging for compression ratios during blob submission.
  • execution/evm/go.sum
    • Updated checksums for github.com/klauspost/compress.
  • execution/evm/test/go.mod
    • Updated github.com/klauspost/compress dependency to v1.18.4.
  • execution/evm/test/go.sum
    • Updated checksums for github.com/klauspost/compress.
  • go.mod
    • Added github.com/klauspost/compress v1.18.4 as a direct dependency.
    • Removed github.com/klauspost/compress v1.18.0 as an indirect dependency.
  • go.sum
    • Updated checksums for github.com/klauspost/compress.
  • pkg/da/compression.go
    • Added a new file defining ZSTD compression and decompression functions.
    • Implemented a custom 4-byte magic prefix ('ZSTD') for compressed data identification.
    • Provided IsCompressed function to check for the magic prefix.
  • pkg/da/compression_test.go
    • Added unit tests to verify the correctness and efficiency of the ZSTD compression and decompression logic.
    • Included tests for edge cases like empty data, uncompressed passthrough, and corrupt data.
  • test/e2e/go.mod
    • Updated github.com/klauspost/compress dependency to v1.18.4.
  • test/e2e/go.sum
    • Updated checksums for github.com/klauspost/compress.


codecov bot commented Feb 24, 2026

Codecov Report

❌ Patch coverage is 66.66667% with 24 lines in your changes missing coverage. Please review.
✅ Project coverage is 60.96%. Comparing base (67e18bd) to head (c875c0b).

Files with missing lines Patch % Lines
block/internal/da/client.go 41.17% 16 Missing and 4 partials ⚠️
pkg/da/compression.go 89.47% 2 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3104      +/-   ##
==========================================
+ Coverage   60.91%   60.96%   +0.04%     
==========================================
  Files         113      114       +1     
  Lines       11617    11685      +68     
==========================================
+ Hits         7077     7124      +47     
- Misses       3742     3757      +15     
- Partials      798      804       +6     
Flag Coverage Δ
combined 60.96% <66.66%> (+0.04%) ⬆️


@gemini-code-assist bot left a comment

Code Review

This pull request introduces ZSTD compression for DA blobs, which is a great feature for reducing bandwidth and storage costs. The implementation is well-contained within a new pkg/da package and includes a comprehensive set of tests. The changes in block/internal/da/client.go correctly integrate the compression and decompression logic. I have a few suggestions to improve the efficiency, readability, and style of the new compression package.

Comment on lines 39 to 46
compressed := encoder.EncodeAll(data, nil)

// Prepend magic prefix
result := make([]byte, len(magic)+len(compressed))
copy(result, magic)
copy(result[len(magic):], compressed)

return result, nil
Severity: medium

The current implementation of Compress can be made more memory-efficient. It currently involves at least two allocations. You can reduce this by creating a destination buffer with the magic prefix and then having EncodeAll append the compressed data to it.

Suggested change:

// before
compressed := encoder.EncodeAll(data, nil)

// Prepend magic prefix
result := make([]byte, len(magic)+len(compressed))
copy(result, magic)
copy(result[len(magic):], compressed)

return result, nil

// after
dst := make([]byte, len(magic))
copy(dst, magic)
return encoder.EncodeAll(data, dst), nil

// Strip magic prefix and decompress
decompressed, err := decoder.DecodeAll(data[len(magic):], nil)
if err != nil {
return nil, fmt.Errorf("compression: zstd decompress failed: %w", err)
Severity: medium

The new error handling guideline in CLAUDE.md suggests avoiding "failed to" in error messages. While this message doesn't use that exact phrase, "zstd decompress failed" can be improved to be more direct. The err != nil check already implies failure. Consider rephrasing to state the operation and wrap the underlying error.

Suggested change:

// before
return nil, fmt.Errorf("compression: zstd decompress failed: %w", err)
// after
return nil, fmt.Errorf("compression: zstd decompression: %w", err)
References
  1. The project's coding guidelines in CLAUDE.md state to 'Never start an error message with "failed to"'. (link)

Comment on lines +71 to +74
return data[0] == magic[0] &&
data[1] == magic[1] &&
data[2] == magic[2] &&
data[3] == magic[3]
Severity: medium

The manual byte-by-byte comparison to check for the magic prefix is correct, but using bytes.HasPrefix is more idiomatic and readable. You'll need to add import "bytes" to the file.

Suggested change
return data[0] == magic[0] &&
data[1] == magic[1] &&
data[2] == magic[2] &&
data[3] == magic[3]
return bytes.HasPrefix(data, magic)

data[i] = b.Data()
decompressed, decompErr := da.Decompress(b.Data())
if decompErr != nil {
return datypes.ResultRetrieve{
Member: For backward compat, we should still fall back to the uncompressed data.

alpe (author):
👍 This is implemented in the Decompress method. It checks for a "magic byte" prefix and returns raw data when not found.

alpe added 2 commits February 24, 2026 12:04
…on based on batch size.

(cherry picked from commit 3ff6211ec69d482437de79a617b5b39a88c91835)
@alpe alpe marked this pull request as ready for review February 24, 2026 15:28
@alpe alpe marked this pull request as draft February 24, 2026 15:29

alpe commented Feb 24, 2026

E2E tests failed. I am not sure yet how this relates to the changes.

@coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (3)
block/internal/da/client.go (2)

74-79: BlobSize reports uncompressed size — document this or add a compressed-size field.

blobSize is calculated from the original data (pre-compression), but the bytes actually submitted to the DA layer are the compressed blobs. For operators relying on BlobSize for cost estimation or quota tracking, this mismatch can be misleading. Consider either documenting that BlobSize is the uncompressed size, or adding a separate field (e.g., CompressedBlobSize) so callers can distinguish.

Also applies to: 200-210


91-100: Consider extracting compression-level thresholds as named constants.

The magic numbers 10 and 3 silently encode the backlog-pressure policy. Named constants (or a small config struct) would make the policy self-documenting and easier to tune without reading the switch logic.

Example
+const (
+	// highBacklogThreshold is the batch-size above which we use fastest compression.
+	highBacklogThreshold = 10
+	// moderateBacklogThreshold is the batch-size above which we use default compression.
+	moderateBacklogThreshold = 3
+)
+
 	compLevel := da.LevelBest
 	switch {
-	case len(data) > 10:
+	case len(data) > highBacklogThreshold:
 		compLevel = da.LevelFastest
-	case len(data) > 3:
+	case len(data) > moderateBacklogThreshold:
 		compLevel = da.LevelDefault
 	}
pkg/da/compression_test.go (1)

61-81: Shared sizes slice across subtests is fragile if parallelized.

sizes is appended inside t.Run callbacks and then read after the loop. This works today because subtests run sequentially, but adding t.Parallel() in the future would introduce a data race. Consider collecting sizes outside the subtest or adding a brief comment warning against parallelizing these subtests.

One alternative: move the cross-level assertion into its own subtest
-	var sizes []int
 	for _, lvl := range levels {
+		lvl := lvl
 		t.Run(lvl.name, func(t *testing.T) {
 			compressed, err := Compress(data, lvl.level)
 			require.NoError(t, err)
 			assert.True(t, IsCompressed(compressed))
 
-			sizes = append(sizes, len(compressed))
-
 			decompressed, err := Decompress(compressed)
 			require.NoError(t, err)
 			assert.Equal(t, data, decompressed)
 
 			t.Logf("level=%s compressed=%d ratio=%.4f", lvl.name, len(compressed), float64(len(compressed))/float64(len(data)))
 		})
 	}
 
-	// Best should produce equal or smaller output than Fastest
-	if len(sizes) == 3 {
-		assert.LessOrEqual(t, sizes[2], sizes[0], "LevelBest should compress at least as well as LevelFastest")
-	}
+	t.Run("best_leq_fastest", func(t *testing.T) {
+		fastest, err := Compress(data, LevelFastest)
+		require.NoError(t, err)
+		best, err := Compress(data, LevelBest)
+		require.NoError(t, err)
+		assert.LessOrEqual(t, len(best), len(fastest),
+			"LevelBest should compress at least as well as LevelFastest")
+	})
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@block/internal/da/client.go`:
- Around line 314-323: The decompression-error branch in Retrieve (where
da.Decompress returns decompErr) returns a ResultRetrieve without setting
BaseResult.Timestamp; update that error return to set Timestamp to the existing
block time (use the blockTime variable if available, otherwise time.Now()) so
the ResultRetrieve mirrors other error branches; modify the return constructing
datypes.ResultRetrieve / datypes.BaseResult in that decompErr branch to include
Timestamp: blockTime (or time.Now()) to ensure non-zero timestamps for callers.
- Around line 113-118: The log computes a compression ratio using
float64(len(compressed))/float64(len(raw)) which yields NaN for zero-length raw;
fix by computing the ratio before logging and guarding when len(raw) == 0 (e.g.,
set ratio := 0.0 or omit the field), then call
c.logger.Debug().Int("original_size", len(raw)).Int("compressed_size",
len(compressed)).Float64("ratio", ratio).Int("level",
int(compLevel)).Msg("compressed blob for DA submission"); this ensures no
division by zero produces NaN in the logs while keeping the same c.logger.Debug
call and the raw/compressed/compLevel identifiers.

In `@pkg/da/compression.go`:
- Around line 47-51: The zstd decoder is created without a max-memory cap
(decoder, err = zstd.NewReader(nil)), which risks OOM on untrusted DA blobs;
update the decoder initialization in pkg/da/compression.go to pass
zstd.WithDecoderMaxMemory(...) (e.g., 7*1024*1024 for a 7MB cap matching the DA
max blob size) so the decoder enforces a safe decompression memory limit.


ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 67e18bd and c875c0b.

⛔ Files ignored due to path filters (7)
  • apps/evm/go.sum is excluded by !**/*.sum
  • apps/grpc/go.sum is excluded by !**/*.sum
  • apps/testapp/go.sum is excluded by !**/*.sum
  • execution/evm/go.sum is excluded by !**/*.sum
  • execution/evm/test/go.sum is excluded by !**/*.sum
  • go.sum is excluded by !**/*.sum
  • test/e2e/go.sum is excluded by !**/*.sum
📒 Files selected for processing (10)
  • CLAUDE.md
  • apps/evm/go.mod
  • apps/grpc/go.mod
  • apps/testapp/go.mod
  • block/internal/da/client.go
  • execution/evm/test/go.mod
  • go.mod
  • pkg/da/compression.go
  • pkg/da/compression_test.go
  • test/e2e/go.mod

Comment on lines 47 to 51
var err error
decoder, err = zstd.NewReader(nil)
if err != nil {
panic(fmt.Sprintf("compression: create zstd decoder: %v", err))
}
⚠️ Potential issue | 🟠 Major

Add decompression size limit to prevent memory exhaustion from untrusted DA blobs.

The Decompress function processes untrusted data from the DA layer without constraining decompression output. With the default zstd limit of 64 GiB, a maliciously crafted compressed blob could allocate far more memory than the actual maximum blob size (5 MB). Add zstd.WithDecoderMaxMemory() when initializing the decoder at line 48, aligned with your maximum DA blob size to prevent DoS attacks.

decoder, err = zstd.NewReader(nil, zstd.WithDecoderMaxMemory(7*1024*1024)) // 7 MB to match max blob size



Development

Successfully merging this pull request may close these issues.

Add Compression to Blobs Before Submission
