buffer: improve btoa performance#52427
Conversation
|
@anonrig I recommend trying something like this... We have three cases: one external one-byte string (easy), one non-external one byte string (we use WriteOneByte) and the general case where we go to 16-bit words, we try converting to latin1 and then we see what happens. In this scenario, there might be an error condition, currently in the code I suggest, it returns the empty string. static void Btoa(const FunctionCallbackInfo<Value>& args) {
CHECK_EQ(args.Length(), 1);
Environment* env = Environment::GetCurrent(args);
THROW_AND_RETURN_IF_NOT_STRING(env, args[0], "argument");
Local<String> input = args[0].As<String>();
MaybeStackBuffer<char> buffer;
size_t written;
if (input->IsExternalOneByte()) { // 8-bit case
auto ext = input->GetExternalOneByteStringResource();
size_t expected_length = simdutf::base64_length_from_binary(ext->length());
buffer.AllocateSufficientStorage(expected_length + 1);
buffer.SetLengthAndZeroTerminate(expected_length);
written = simdutf::binary_to_base64(
ext->data(), ext->length(), buffer.out(), simdutf::base64_default);
} if(input->IsOneByte()) {
MaybeStackBuffer<uint8_t> stack_buf(input->Length());
input->WriteOneByte(env->isolate(),
stack_buf.out(),
0,
input->Length(),
String::NO_NULL_TERMINATION);
size_t expected_length = simdutf::base64_length_from_binary(input->Length());
buffer.AllocateSufficientStorage(expected_length + 1);
buffer.SetLengthAndZeroTerminate(expected_length);
written = simdutf::binary_to_base64(
reinterpret_cast<const char*>(*stack_buf), input->Length(), buffer.out());
} else {
String::Value value(env->isolate(), input);
MaybeStackBuffer<char> stack_buf(value.length());
size_t out_len = simdutf::convert_utf16_to_latin1(reinterpret_cast<const char16_t *>(*value), value.length(), stack_buf.out());
if(out_len == 0) { // error
return args.GetReturnValue().SetEmptyString();
}
size_t expected_length = simdutf::base64_length_from_binary(out_len);
buffer.AllocateSufficientStorage(expected_length + 1);
buffer.SetLengthAndZeroTerminate(expected_length);
written = simdutf::binary_to_base64(
*stack_buf, out_len, buffer.out());
}
auto value =
String::NewFromOneByte(env->isolate(),
reinterpret_cast<const uint8_t*>(buffer.out()),
NewStringType::kNormal,
written)
.ToLocalChecked();
return args.GetReturnValue().Set(value);
}This should be considered ChatGTP-quality code: I did not even try to test it... It is only as an idea. |
|
@anonrig Lint. Lint. Lint. |
|
@lemire I've updated the benchmark results as well :-) |
|
@anonrig My proposed code would return the empty string on error, but we need to throw, I pushed a commit that does that. (Very minor change.) |
Commit Queue failed- Loading data for nodejs/node/pull/52427 ✔ Done loading data for nodejs/node/pull/52427 ----------------------------------- PR info ------------------------------------ Title buffer: improve `btoa` performance (#52427) Author Yagiz Nizipli (@anonrig) Branch anonrig:improve-btoa -> nodejs:main Labels buffer, c++, needs-ci Commits 2 - buffer: improve `btoa` performance - fix: return exception Committers 2 - Yagiz Nizipli - Daniel Lemire PR-URL: https://github.com/nodejs/node/pull/52427 Reviewed-By: Daniel Lemire Reviewed-By: James M Snell ------------------------------ Generated metadata ------------------------------ PR-URL: https://github.com/nodejs/node/pull/52427 Reviewed-By: Daniel Lemire Reviewed-By: James M Snell -------------------------------------------------------------------------------- ℹ This PR was created on Mon, 08 Apr 2024 17:24:56 GMT ✔ Approvals: 2 ✔ - Daniel Lemire (@lemire): https://github.com/nodejs/node/pull/52427#pullrequestreview-1992265694 ✔ - James M Snell (@jasnell) (TSC): https://github.com/nodejs/node/pull/52427#pullrequestreview-1993089038 ⚠ GitHub cannot link the author of 'fix: return exception' to their GitHub account. ⚠ Please suggest them to take a look at https://github.com/nodejs/node/blob/99b1ada/doc/guides/contributing/pull-requests.md#step-1-fork ✔ Last GitHub CI successful ℹ Last Full PR CI on 2024-04-11T02:25:23Z: https://ci.nodejs.org/job/node-test-pull-request/58258/ - Querying data for job/node-test-pull-request/58258/ ✔ Last Jenkins CI successful -------------------------------------------------------------------------------- ✔ No git cherry-pick in progress ✔ No git am in progress ✔ No git rebase in progress -------------------------------------------------------------------------------- - Bringing origin/main up to date... From https://github.com/nodejs/node * branch main -> FETCH_HEAD ✔ origin/main is now up-to-date - Downloading patch for 52427 From https://github.com/nodejs/node * branch refs/pull/52427/merge -> FETCH_HEAD ✔ Fetched commits as ee4fa77624f1..6900b0dce7cf -------------------------------------------------------------------------------- [main 50c914c521] buffer: improve `btoa` performance Author: Yagiz Nizipli Date: Mon Apr 8 13:23:56 2024 -0400 3 files changed, 81 insertions(+), 7 deletions(-) create mode 100644 benchmark/buffers/buffer-btoa.js [main de7c8dcb6a] fix: return exception Author: Daniel Lemire Date: Wed Apr 10 09:44:11 2024 -0400 2 files changed, 6 insertions(+), 2 deletions(-) ✔ Patches applied There are 2 commits in the PR. Attempting autorebase. Rebasing (2/4)https://github.com/nodejs/node/actions/runs/8645651485 |
|
Landed in 21211a3 |
PR-URL: #52427 Reviewed-By: Daniel Lemire <daniel@lemire.me> Reviewed-By: James M Snell <jasnell@gmail.com>
PR-URL: #52427 Reviewed-By: Daniel Lemire <daniel@lemire.me> Reviewed-By: James M Snell <jasnell@gmail.com>
Improves the performance of the legacy
btoafunction. It's not recommended to be used in production.Disclaimer
"btoa" is a legacy functionality, and not recommended to be used in production applications.
Thanks Matthieu Riegler for the poster