
deps: prefer cpuinfo_cur_freq over scaling_cur_freq in uv_cpu_info() #62018

Closed
RajeshKumar11 wants to merge 1 commit into nodejs:main from RajeshKumar11:fix/61998-cpufreq-amd-perf-clean

Conversation

@RajeshKumar11
Contributor

Summary

  • In uv_cpu_info() on Linux, reading /sys/devices/system/cpu/cpuN/cpufreq/scaling_cur_freq for each CPU core triggers a slow ACPI/SMM round-trip on some modern AMD CPUs inside containers — up to ~20ms per core, making os.cpus() take 500–600ms on a 32-core system
  • cpuinfo_cur_freq exposes the same current-frequency information but reads from a hardware-cached register, avoiding the ACPI round-trip
  • This patch prefers cpuinfo_cur_freq and falls back to scaling_cur_freq for systems that do not expose it

Root cause

uv_cpu_info() in deps/uv/src/unix/linux.c iterates over every online CPU and opens /sys/devices/system/cpu/cpu%u/cpufreq/scaling_cur_freq sequentially. On AMD EPYC and Ryzen 9000-series CPUs running inside a Linux container, each open() + read() of that file causes the kernel cpufreq driver to issue an ACPI call into System Management Mode (SMM) to query the live clock frequency. SMM is a privileged firmware context that suspends all other CPU execution while it runs; inside a container the latency is amplified and regularly reaches 20 ms per core.

With 32 cores, the worst case is 32 × 20 ms = 640 ms spent inside a single os.cpus() call.

The regression was introduced when libuv switched from cpuinfo_cur_freq to scaling_cur_freq (libuv issue libuv/libuv#4098). The libuv project closed the upstream issue as NOT_PLANNED.

Confirmed by comparing node:20.2.0-alpine (19 ms) vs node:20.3.0-alpine (356 ms) on the same AMD host.
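To reproduce this kind of comparison on a given host, a small timing script can be run under each Node.js version. The following is a minimal sketch; timeMs and estimateWorstCaseMs are hypothetical helper names, and the 20 ms per-core figure is just the upper bound quoted above, not a measured constant.

```javascript
'use strict';
const os = require('node:os');
const { performance } = require('node:perf_hooks');

// Hypothetical helper: runs fn once and returns the elapsed wall time in ms.
function timeMs(fn) {
  const start = performance.now();
  fn();
  return performance.now() - start;
}

// Hypothetical helper: upper bound if every core pays the same per-read cost.
function estimateWorstCaseMs(cores, perCoreMs) {
  return cores * perCoreMs;
}

const cpusMs = timeMs(() => os.cpus());
console.log(`os.cpus() took ${cpusMs.toFixed(1)} ms`);
console.log(`worst case for 32 cores at 20 ms each: ${estimateWorstCaseMs(32, 20)} ms`);
```

A single run is noisy, so repeating the measurement a few times gives a more trustworthy picture than one sample.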

Fix

deps/uv/src/unix/linux.c — in the per-CPU frequency loop starting around line 1884:

  • Try cpuinfo_cur_freq first; it is a cached readout of the actual hardware frequency reported by the CPU's own registers (via /sys/bus/cpu/drivers/acpi-cpufreq or equivalent) and does not trigger SMM
  • If that file is absent (e.g. non-x86 platforms, or cpufreq not loaded), fall back to scaling_cur_freq — identical behaviour to today

The change is purely a path-preference inside the existing if (fp == NULL) continue; fallback pattern already used in the function, so it is safe on all platforms that currently work.
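For readers who want to see the preference order in isolation, here is a rough JavaScript analogue of the lookup. It is not the C patch itself: readCpuFreqKHz and its sysfsRoot parameter are hypothetical names introduced for illustration, and the only behaviour it models is "try cpuinfo_cur_freq, then scaling_cur_freq".

```javascript
'use strict';
const fs = require('node:fs');
const path = require('node:path');

// Candidate files in preference order, mirroring the order described above.
const FREQ_FILES = ['cpuinfo_cur_freq', 'scaling_cur_freq'];

// Hypothetical helper: returns the frequency in kHz for one CPU, or null
// when neither file can be read. sysfsRoot is a parameter only so the
// function can be exercised against a temporary directory.
function readCpuFreqKHz(cpu, sysfsRoot = '/sys/devices/system/cpu') {
  for (const name of FREQ_FILES) {
    const file = path.join(sysfsRoot, `cpu${cpu}`, 'cpufreq', name);
    let text;
    try {
      text = fs.readFileSync(file, 'utf8');
    } catch {
      continue; // absent or unreadable: try the next candidate
    }
    const khz = Number.parseInt(text, 10);
    if (Number.isFinite(khz)) return khz;
  }
  return null;
}
```

Calling readCpuFreqKHz(0) on a Linux host returns whichever file is readable first, so a system without cpuinfo_cur_freq behaves exactly as if only scaling_cur_freq had been consulted.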

Test

os.cpus() correctness (structure, types, speed field) is already covered by test/parallel/test-os.js. No new test is added — a timing assertion for this fix would be flaky in CI and the slow path requires a specific AMD CPU model inside a Linux container.

Related

Fixes: #61998

On Linux, uv_cpu_info() reads the current CPU frequency for each core
from /sys/devices/system/cpu/cpuN/cpufreq/scaling_cur_freq. On some
modern AMD CPUs (EPYC, Ryzen 9000 series) running inside containers,
reading scaling_cur_freq triggers an ACPI/SMM round-trip that can take
~20ms per core, making os.cpus() take 500-600ms on a 32-core system.

The file cpuinfo_cur_freq exposes the same current frequency information
but reads from a hardware-cached register, avoiding the costly ACPI
round-trip. Prefer it when available and fall back to scaling_cur_freq
for systems that do not expose cpuinfo_cur_freq.

Fixes: nodejs#61998
@nodejs-github-bot
Collaborator

Review requested:

  • @nodejs/security-wg

@nodejs-github-bot added the libuv (Issues and PRs related to the libuv dependency or the uv binding) and needs-ci (PRs that need a full CI run) labels on Feb 27, 2026
@Renegade334
Member

Changes to libuv should be proposed upstream (https://github.com/libuv/libuv).

Member

@richardlau richardlau left a comment

This would need to be changed in upstream libuv.

@RajeshKumar11
Contributor Author

Thank you for the feedback @Renegade334 and @richardlau.

Understood — the correct path is to land this upstream in libuv first. I'm opening a PR at https://github.com/libuv/libuv now.

For context: the libuv issue (libuv/libuv#4098) was previously closed as NOT_PLANNED, with the maintainers' position being that uv_cpu_info() shouldn't be called frequently enough for this to matter. However, os.cpus().length is a widely-used idiom and calling it once at startup on a 32-core AMD system now costs ~600 ms, which is hard to justify. I'll make that case in the upstream PR.

I'll update this PR with the libuv upstream PR link once it's open.

@RajeshKumar11
Contributor Author

Upstream PR is now open: libuv/libuv#5036

I've submitted the same fix to libuv's v1.x branch. Once it lands there and libuv cuts a release, this can be picked up via the normal libuv upgrade path.

@saghul
Member

saghul commented Feb 27, 2026

However, os.cpus().length is a widely-used idiom

Out of curiosity, why is that?

If it's to compute the available number of cores, is availableParallelism faster?

@RajeshKumar11
Contributor Author

@saghul Good question — os.availableParallelism() is faster and is the right choice when all you need is the core count. It maps to uv_available_parallelism() which reads from sched_getaffinity() / cgroup quota and does not touch cpufreq at all.

The reason I still think fixing uv_cpu_info() matters:

  1. process.report.getReport() calls uv_cpu_info() directly in src/node_report.cc — users cannot substitute availableParallelism() there. Diagnostic reports in production (e.g. on crash or signal) should not take 600 ms just to gather CPU info.

  2. The speed field has legitimate callers — code that tunes thread-pool sizes or task granularity based on CPU frequency cannot use availableParallelism() as a drop-in.

  3. Existing codebases: os.cpus().length is pervasive. os.availableParallelism() was added in v18.14/v19.4; a lot of code predates it.
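The cost on the diagnostic-report path from point 1 can be checked directly. This is a minimal sketch, assuming the report header exposes a cpus array as current Node.js reports do; timeReportMs is a hypothetical helper name.

```javascript
'use strict';
const { performance } = require('node:perf_hooks');

// Hypothetical helper: times one in-memory diagnostic report and counts the
// CPU entries it contains. The header.cpus shape is an assumption, so the
// code tolerates its absence.
function timeReportMs() {
  const start = performance.now();
  const report = process.report.getReport();
  const elapsedMs = performance.now() - start;
  const header = report && report.header;
  const cpus = header && Array.isArray(header.cpus) ? header.cpus.length : 0;
  return { elapsedMs, cpus };
}

const result = timeReportMs();
console.log(`getReport() took ${result.elapsedMs.toFixed(1)} ms, ${result.cpus} CPU entries`);
```

The measured time covers the whole report, not only the CPU section, so it is an upper bound on the cpufreq cost rather than an isolated figure.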

That said, I agree availableParallelism() should be the documented recommendation for the "how many workers should I spawn?" pattern. I'm happy to add a note in the Node.js docs for os.cpus() pointing users there if that would help.
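For the worker-count pattern specifically, a feature-detecting helper keeps older runtimes working while avoiding os.cpus() where possible. This is a sketch rather than a recommendation from the Node.js docs; defaultWorkerCount is a hypothetical name.

```javascript
'use strict';
const os = require('node:os');

// Hypothetical helper: picks a worker count, preferring the API that does
// not build the per-CPU info array.
function defaultWorkerCount() {
  if (typeof os.availableParallelism === 'function') {
    return os.availableParallelism();
  }
  // Older Node.js releases: fall back to the legacy idiom, guarding against
  // an empty array on platforms where CPU info is unavailable.
  return Math.max(1, os.cpus().length);
}

console.log(`spawning ${defaultWorkerCount()} workers`);
```

The two paths can return different numbers on hosts with CPU affinity masks or cgroup limits, which is usually the desired behaviour for sizing a worker pool.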

The upstream fix is at libuv/libuv#5036.

@Renegade334
Member

Closing this PR; feel free to continue discussion in the appropriate venue.


Development

Successfully merging this pull request may close these issues.

os.cpus().length performance on modern AMD CPUs inside containers

5 participants