-
Notifications
You must be signed in to change notification settings - Fork 1.4k
better ssl write handling #6763
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 Walkthrough🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Code has been automatically formatted The code in this PR has been formatted using:
git pull origin ssl-write |
f602c12 to
ff4e0bf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@crates/stdlib/src/ssl/compat.rs`:
- Around line 1687-1788: ssl_write currently always calls writer.write_all(data)
which appends plaintext to rustls on every retry, causing duplicate application
data; add a per-socket flag on PySSLSocket (e.g., plaintext_buffered_for_write)
that ssl_write sets the first time it writes the given data (after calling
writer.write_all) and skips subsequent writer.write_all calls when that flag is
true, still proceeding to flush_pending_tls_output/send_all_bytes and handle
WANT_READ/WANT_WRITE; clear the flag once conn.wants_write() is false and after
the final flush_pending_tls_output completes (so future ssl_write calls will
write new plaintext again); update logic in ssl_write to consult this flag
(respecting is_bio mode) and ensure the flag is cleared on successful completion
or on errors that indicate the buffered plaintext was consumed or should be
discarded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@crates/stdlib/src/ssl/compat.rs`:
- Around line 1787-1812: The timeout arm in the match on send_all_bytes
currently clears the buffered write state leading to duplicate plaintext
re-buffering on retry; modify the SslError::Timeout branch to behave like
WantWrite/WantRead by returning Err(SslError::Timeout) without clearing
socket.write_buffered_len (so pending_tls_output remains intact) and preserve
the existing semantics for is_bio vs socket-mode handling; locate the match in
the write path around send_all_bytes and update the Timeout case to not reset
write_buffered_len so retries won’t duplicate sends.
- Around line 1738-1759: Add a hash guard to detect retries with different
content: add a write_buffered_hash: PyMutex<Option<u64>> field to PySSLSocket
and initialize it to None in both PySSLSocket constructors; when you buffer new
plaintext in the block that currently sets *socket.write_buffered_len.lock() =
data.len(), compute a stable u64 hash of data and store it in
socket.write_buffered_hash.lock(); update the retry check to require both the
length and the stored hash to match (i.e., treat it as a bad write retry if
already_buffered != data.len() || buffered_hash != Some(data_hash)); and when
you clear the buffer state (where you set *socket.write_buffered_len.lock() = 0)
also clear socket.write_buffered_hash to None.
🧹 Nitpick comments (1)
crates/vm/src/stdlib/thread.rs (1)
428-454: Panic safety:run_threadlacksscopeguardprotection unlikestart_joinable_thread.If a Rust panic occurs (not a Python exception), the cleanup code at lines 443-453 including
fetch_sub(1)won't execute, leavingthread_countpermanently elevated. Thestart_joinable_threadspawned thread correctly usesscopeguard::defer!for this.Consider wrapping the cleanup in a
scopeguard::defer!for consistency:♻️ Suggested refactor
fn run_thread(func: ArgCallable, args: FuncArgs, vm: &VirtualMachine) { // Increment thread count when thread actually starts executing vm.state.thread_count.fetch_add(1); + scopeguard::defer! { + for lock in SENTINELS.take() { + if lock.mu.is_locked() { + unsafe { lock.mu.unlock() }; + } + } + // Clean up thread-local storage while VM context is still active + cleanup_thread_local_data(); + // Clean up frame tracking + crate::vm::thread::cleanup_current_thread_frames(vm); + vm.state.thread_count.fetch_sub(1); + } + match func.invoke(args, vm) { Ok(_obj) => {} Err(e) if e.fast_isinstance(vm.ctx.exceptions.system_exit) => {} Err(exc) => { vm.run_unraisable( exc, Some("Exception ignored in thread started by".to_owned()), func.into(), ); } } - for lock in SENTINELS.take() { - if lock.mu.is_locked() { - unsafe { lock.mu.unlock() }; - } - } - // Clean up thread-local storage while VM context is still active - cleanup_thread_local_data(); - // Clean up frame tracking - crate::vm::thread::cleanup_current_thread_frames(vm); - vm.state.thread_count.fetch_sub(1); }
| // Check if we already have data buffered from a previous retry | ||
| // (prevents duplicate writes when retrying after WantWrite/WantRead) | ||
| let already_buffered = *socket.write_buffered_len.lock(); | ||
|
|
||
| // Only write plaintext if not already buffered | ||
| if already_buffered == 0 { | ||
| // Write plaintext to rustls (= SSL_write_ex internal buffer write) | ||
| { | ||
| let mut writer = conn.writer(); | ||
| use std::io::Write; | ||
| writer | ||
| .write_all(data) | ||
| .map_err(|e| SslError::Syscall(format!("Write failed: {e}")))?; | ||
| } | ||
| // Mark data as buffered | ||
| *socket.write_buffered_len.lock() = data.len(); | ||
| } else if already_buffered != data.len() { | ||
| // Caller is retrying with different data - this is a protocol error | ||
| // Clear the buffer state and return an SSL error (bad write retry) | ||
| *socket.write_buffered_len.lock() = 0; | ||
| return Err(SslError::Ssl("bad write retry".to_string())); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# First, let's check if the file exists and view the specific lines mentioned
head -20 crates/stdlib/src/ssl/compat.rs | head -5
wc -l crates/stdlib/src/ssl/compat.rsRepository: RustPython/RustPython
Length of output: 186
🏁 Script executed:
# Look at the specific lines mentioned (1738-1759)
sed -n '1738,1759p' crates/stdlib/src/ssl/compat.rsRepository: RustPython/RustPython
Length of output: 186
🏁 Script executed:
# Search for PySSLSocket struct definition to see current fields
rg -n "struct PySSLSocket" crates/stdlib/src/ssl/compat.rs -A 30Repository: RustPython/RustPython
Length of output: 186
🏁 Script executed:
# Search for write_buffered_len references to understand current usage
rg -n "write_buffered_len" crates/stdlib/src/ssl/compat.rs -B 2 -A 2Repository: RustPython/RustPython
Length of output: 1733
🏁 Script executed:
# Search for PySSLSocket struct definition
rg -n "struct PySSLSocket" crates/stdlib/src/ssl/compat.rs -B 5 -A 50Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Also search for #[pyclass] annotations for PySSLSocket
rg -n "#\[pyclass\]" crates/stdlib/src/ssl/compat.rs -B 2 -A 50 | head -150Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Search for write_buffered_len field definition
rg -n "write_buffered_len" crates/stdlib/src/ssl/compat.rs | head -20Repository: RustPython/RustPython
Length of output: 522
🏁 Script executed:
# Try broader search for PySSLSocket without struct keyword
rg -n "PySSLSocket" crates/stdlib/src/ssl/compat.rs | head -20Repository: RustPython/RustPython
Length of output: 456
🏁 Script executed:
# Look for pyclass macro definition
rg -n "pyclass.*PySSLSocket" crates/stdlib/src/ssl/compat.rs -B 2 -A 100 | head -200Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Search more broadly in the file for where PySSLSocket is declared
head -500 crates/stdlib/src/ssl/compat.rs | tail -400Repository: RustPython/RustPython
Length of output: 16044
🏁 Script executed:
# PySSLSocket is imported from parent module, let's find the _ssl module
fd "_ssl" crates/stdlib/src/ssl/ -type fRepository: RustPython/RustPython
Length of output: 235
🏁 Script executed:
# Look for PySSLSocket struct in the _ssl module
rg -n "struct PySSLSocket" crates/stdlib/src/ssl/_ssl.rs -B 5 -A 50 | head -100Repository: RustPython/RustPython
Length of output: 134
🏁 Script executed:
# Search for write_buffered_len field in _ssl.rs
rg -n "write_buffered_len" crates/stdlib/src/ssl/_ssl.rsRepository: RustPython/RustPython
Length of output: 134
🏁 Script executed:
# Check what fields currently exist in PySSLSocket
rg -n "write_buffered" crates/stdlib/src/ssl/_ssl.rs -B 2 -A 2Repository: RustPython/RustPython
Length of output: 134
🏁 Script executed:
# Find all files in the ssl directory
ls -la crates/stdlib/src/ssl/Repository: RustPython/RustPython
Length of output: 413
🏁 Script executed:
# Search for PySSLSocket definition across the entire ssl directory
rg -n "struct PySSLSocket" crates/stdlib/src/ssl/Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Search for where write_buffered_len is first created/initialized
rg -n "write_buffered_len.*=" crates/stdlib/src/ssl/ -B 3 -A 1Repository: RustPython/RustPython
Length of output: 1544
🏁 Script executed:
# Search for the parent module structure
ls -la crates/stdlib/src/ssl/mod.rsRepository: RustPython/RustPython
Length of output: 140
🏁 Script executed:
# Look for PySSLSocket in the mod.rs file
rg -n "PySSLSocket" crates/stdlib/src/ssl/mod.rs -B 5 -A 50 | head -200Repository: RustPython/RustPython
Length of output: 133
🏁 Script executed:
# Search for #[pyclass] in all ssl files to find PySSLSocket definition
rg -n "#\[pyclass\]" crates/stdlib/src/ssl/ -A 30 | grep -A 30 "PySSLSocket"Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Try to find where PySSLSocket is actually defined (might be inline in compat.rs)
rg -n "^pub.*struct PySSLSocket" crates/stdlib/src/ssl/Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Look for where PySSLSocket is defined in the entire ssl directory
rg -n "struct PySSLSocket" crates/stdlib/src/ssl/ -B 5 -A 50Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Search in compat.rs itself for PySSLSocket struct definition or pyclass
grep -n "PySSLSocket" crates/stdlib/src/ssl/compat.rs | head -20Repository: RustPython/RustPython
Length of output: 456
🏁 Script executed:
# Look at the import line and surrounding context in compat.rs
head -50 crates/stdlib/src/ssl/compat.rsRepository: RustPython/RustPython
Length of output: 2184
🏁 Script executed:
# Search for pyclass definition in compat.rs
rg -n "#\[pyclass\]" crates/stdlib/src/ssl/compat.rs -A 20 | head -150Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Look at the directory structure of the ssl parent
ls -la crates/stdlib/src/Repository: RustPython/RustPython
Length of output: 3425
🏁 Script executed:
# Search for all Python SSL-related files
find crates/stdlib/src -name "*ssl*" -o -name "*_ssl*"Repository: RustPython/RustPython
Length of output: 291
🏁 Script executed:
# Look for PySSLSocket definition more broadly
rg -n "PySSLSocket" crates/stdlib/src/ --type rust -B 5 | grep -A 10 "struct PySSLSocket\|#\[pyclass\].*PySSLSocket" | head -100Repository: RustPython/RustPython
Length of output: 637
🏁 Script executed:
# Now let's look at the PySSLSocket struct definition in ssl.rs
sed -n '2302,2370p' crates/stdlib/src/ssl.rsRepository: RustPython/RustPython
Length of output: 3244
🏁 Script executed:
# Search for write_buffered_len field specifically
rg -n "write_buffered_len" crates/stdlib/src/ssl.rs -B 2 -A 2Repository: RustPython/RustPython
Length of output: 1012
🏁 Script executed:
# Get more context around the struct definition
sed -n '2300,2380p' crates/stdlib/src/ssl.rsRepository: RustPython/RustPython
Length of output: 3734
🏁 Script executed:
# Now let's see how the function uses this in context - look at the full function
sed -n '1700,1830p' crates/stdlib/src/ssl/compat.rsRepository: RustPython/RustPython
Length of output: 4509
🏁 Script executed:
# Search for test cases related to write_buffered_len or SSL_write
rg -n "test.*write\|write.*test" crates/stdlib/src/ssl/ -i | head -20Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Check if there are any tests in the ssl.rs file
rg -n "#\[test\]" crates/stdlib/src/ssl.rs | head -10Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Look for how CPython or OpenSSL handles this scenario
rg -n "bad write retry\|write retry" crates/stdlib/src/ssl/ -B 3 -A 3Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Check how the code currently handles data - is there any copy of the data?
sed -n '1712,1830p' crates/stdlib/src/ssl/compat.rs | grep -A 5 -B 5 "data\|write_all"Repository: RustPython/RustPython
Length of output: 3631
🏁 Script executed:
# Look for similar patterns in the codebase where data is stored/compared
rg -n "hash.*=.*new\|DefaultHasher\|Hash.*Hasher" crates/stdlib/src/ssl/ | head -10Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Check if DefaultHasher is already used anywhere in the ssl module
rg -n "use std::hash" crates/stdlib/src/ssl.rsRepository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Let's check the broader context of how write operations are handled
rg -n "fn ssl_write" crates/stdlib/src/ssl/compat.rs -A 5Repository: RustPython/RustPython
Length of output: 486
🏁 Script executed:
# Check if there are any benchmarks or tests that would demonstrate the vulnerability
rg -n "SSL_write\|ssl_write" crates/stdlib/src/ssl.rs -B 2 -A 2 | head -50Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Let's verify the actual vulnerability - check how write data flows through rustls
# Look at the comment about "internal buffer write"
sed -n '1745,1760p' crates/stdlib/src/ssl/compat.rsRepository: RustPython/RustPython
Length of output: 767
🏁 Script executed:
# Check if rustls::Writer::write_all stores data internally
# Look for any documentation or comments about rustls buffering
rg -n "rustls.*buffer\|buffer.*rustls\|writer.*buffer" crates/stdlib/src/ssl/compat.rs | head -20Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# See if there's any example or comment about the retry logic
rg -n "WantWrite\|WantRead" crates/stdlib/src/ssl/compat.rs -B 3 -A 3 | head -80Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Verify the vulnerability is real by understanding rustls buffering
# Look for how rustls Writer works in the broader context
rg -n "conn.writer()" crates/stdlib/src/ssl/compat.rs -B 2 -A 5 | head -40Repository: RustPython/RustPython
Length of output: 738
🏁 Script executed:
# Check if there's any comment about the hash approach feasibility
rg -n "hash\|Hash\|Hasher" crates/stdlib/src/ssl.rs | head -20Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Check what imports are already in ssl.rs for hashing
head -100 crates/stdlib/src/ssl.rs | grep -i "use.*hash\|use std"Repository: RustPython/RustPython
Length of output: 119
🏁 Script executed:
# Verify the suggested fix would pass cargo clippy by checking similar patterns
rg -n "DefaultHasher\|std::hash" crates/stdlib/src/ | head -10Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Verify that hash-based approach is standard by checking similar uses
rg -n "use std::hash::" crates/ | head -10Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Confirm the suggested fix would work with current types
# Check if PyMutex<Option<u64>> is already used elsewhere
rg -n "PyMutex<Option<" crates/stdlib/src/ssl.rs | head -10Repository: RustPython/RustPython
Length of output: 181
🏁 Script executed:
# Review the full context one more time - look at initialization pattern
sed -n '1900,1910p' crates/stdlib/src/ssl.rsRepository: RustPython/RustPython
Length of output: 582
🏁 Script executed:
# Verify there are no other places where write_buffered_len is manipulated incorrectly
rg -n "write_buffered_len" crates/stdlib/src/ssl.rsRepository: RustPython/RustPython
Length of output: 240
Detect "bad write retry" when content changes but length matches.
The retry guard only checks if already_buffered != data.len(). If the caller retries with different data of the same length, the code silently proceeds with the old buffered plaintext rather than raising "bad write retry," which deviates from OpenSSL behavior and risks sending corrupted application data. Add a hash-based check to validate that retry data matches the originally buffered content.
The fix requires:
- Add
write_buffered_hash: PyMutex<Option<u64>>field toPySSLSocketstruct - Initialize the field to
Nonein both struct constructors - Compute and store the hash when buffering new data
- Compare both length AND hash in the retry condition:
already_buffered != data.len() || buffered_hash != Some(data_hash) - Clear the hash when clearing the buffer state
🤖 Prompt for AI Agents
In `@crates/stdlib/src/ssl/compat.rs` around lines 1738 - 1759, Add a hash guard
to detect retries with different content: add a write_buffered_hash:
PyMutex<Option<u64>> field to PySSLSocket and initialize it to None in both
PySSLSocket constructors; when you buffer new plaintext in the block that
currently sets *socket.write_buffered_len.lock() = data.len(), compute a stable
u64 hash of data and store it in socket.write_buffered_hash.lock(); update the
retry check to require both the length and the stored hash to match (i.e., treat
it as a bad write retry if already_buffered != data.len() || buffered_hash !=
Some(data_hash)); and when you clear the buffer state (where you set
*socket.write_buffered_len.lock() = 0) also clear socket.write_buffered_hash to
None.
.
Summary by CodeRabbit
Bug Fixes
Improvements
New Features
✏️ Tip: You can customize this high-level summary in your review settings.