Conversation
Adds support for OpenTelemetry tracing, as described in the included docs. Custom instrumentation is not yet supported.
|
Review requested:
|
legendecas
left a comment
There was a problem hiding this comment.
AFAICT, it would create confusion that the built-in support does not work with the @opentelemetry/api package, which is intended to be the global provider for otel components, like propagators and exporters.
|
I personally find the opentelemetry apis really cumbersome to work with. Maybe there is something better that can be done. But I also concur that some kind of high level compatibility api would need to exist. |
|
It would be really nice to get ahold of https://www.npmjs.com/package/otel and add this new core module without the prefix. |
| otel.start({ endpoint: 'http://collector.example.com:4318' }); | ||
|
|
||
| // ... application code ... | ||
| otel.stop(); |
There was a problem hiding this comment.
More a style nit... feel free to ignore. I find this start()/stop() pattern a bit cumbersome. I'd be happier something like...
const stop = otel.start();
stop();which also allows for...
{
using session = otel.start();
// automatically stops
}| * `options` {Object} | ||
| * `endpoint` {string} The OTLP collector endpoint URL. | ||
| **Default:** `'http://localhost:4318'`. | ||
| * `filter` {string|string\[]} Optional comma-separated string or array of |
There was a problem hiding this comment.
Nit: can we pick one? I'd prefer just string[].
Agree that an OpenTelemetry SDK is not built-in to the Node.js instrumentation. However, this implementation is not fully compliant with the OpenTelemetry design. The instrumentation should not implement the OpenTelemetry SDK components, like exporters and tracer SDK. The three OpenTelemetry layers, API, SDK, and instrumentations, have their clear scopes and allows open customization without any vendor lock-in. That is, instrumentations will use the API to create spans and metrics, but do not directly depend on the SDK. This allows instrumentations to be vendor-agnostic. And observability vendors could provide their SDK (or just an exporter) to do whatever propriatary works, like exporting. I think a Node.js built-in |
Qard
left a comment
There was a problem hiding this comment.
Some small questions/ideas, but otherwise LGTM!
| ### `OTEL_SERVICE_NAME` | ||
|
|
||
| Standard OpenTelemetry environment variable used to set the service name in | ||
| exported resource attributes. Defaults to `node-<pid>`. |
There was a problem hiding this comment.
We already have to load the package.json during startup, don't we? So perhaps we could make this use the name field from it, if present, and fallback to that?
| let timeOriginNs; | ||
| function getTimeOriginNs() { | ||
| if (timeOriginNs === undefined) { | ||
| // getTimeOriginTimestamp() returns ms since Unix epoch. | ||
| // Multiply by 1e6 to get nanoseconds. | ||
| const originMs = getTimeOriginTimestamp(); | ||
| timeOriginNs = BigInt(MathRound(originMs * 1e6)); | ||
| } | ||
| return timeOriginNs; | ||
| } | ||
|
|
||
| function hrTimeToNanos() { | ||
| const relativeMs = now(); | ||
| const ns = getTimeOriginNs() + BigInt(MathRound(relativeMs * 1e6)); | ||
| return `${ns}`; | ||
| } |
There was a problem hiding this comment.
I'm wondering a bit about the perf of this. Did you try a high/low pair like process.hrtime() returns to be able to do the 64-bit timestamp rather than this string approach? I'm a bit worried about the perf implications of lots of bigint timestamps being generated all the time. 🤔
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #61907 +/- ##
==========================================
+ Coverage 89.73% 89.75% +0.01%
==========================================
Files 675 680 +5
Lines 204674 205883 +1209
Branches 39330 39565 +235
==========================================
+ Hits 183667 184781 +1114
- Misses 13283 13387 +104
+ Partials 7724 7715 -9
🚀 New features to boost your workflow:
|
| urlStr + '/v1/traces'; | ||
| } | ||
|
|
||
| const parsed = new URL(urlStr); |
There was a problem hiding this comment.
Could this condition be simplified to the following?:
const parsed = new URL('/v1/traces', endpoint);Tests with the Node REPL:
> new URL("/v1/traces", "http://localhost:4318/hello?foo=bar").href
'http://localhost:4318/v1/traces'
> new URL("/v1/traces", "http://localhost:4318/").href
'http://localhost:4318/v1/traces'
> new URL("/v1/traces", "http://localhost:4318").href
'http://localhost:4318/v1/traces'
> new URL("/v1/traces", "http://localhost:4318/v1").href
'http://localhost:4318/v1/traces'
> new URL("/v1/traces", "http://localhost:4318/v1/traces").href
'http://localhost:4318/v1/traces'
> new URL("/v1/traces", "http://localhost:4318/v1/traces/").href
'http://localhost:4318/v1/traces'|
I have a lot of thoughts about this so I'll try to keep it succinct and direct. In the interest of full disclosure, I have been one of the maintainers of OTel JS for years and that will likely color some of my opinions. I am not on the OTel Governance Committee and the opinions here should be taken as my own and not as the official stance of the project. I'll start with the things I like. First, I think having built-in support for OTel is great, and focusing on HTTP first makes sense. The lack of interoperability with existing instrumentation means the use of this will be very limited for now, but that's probably fine. Anyone who needs other instrumentation has the option to use the existing otel libraries. Second, i think it is reasonable to only support OTLP. Supporting other export formats is a well intentioned part of OTel, but it was a lot more important in the earlier days when OTLP support was limited. Now all major vendors support OTLP, and the collector serves as a mature routing layer for people who need to use other formats. I would prefer to have OTLP protobuf as an option since it is by far the most used OTLP format, but JSON makes sense as the de facto standard format for JS remote calls and I understand the desire to avoid the protobuf dependency. Third, I do see the value in a reimplementation which is node-specific and avoids the need for various platform-specific workarounds. Because this is shipped with node itself, there is no need to retain workarounds for old node versions or non-node platforms like browsers. Finally, this seems really easy to use. The lack of configuration, customization, and manual instrumentation definitely makes this simpler to understand and use for end users, and because it is built in there is no need to make sure your otel dependency versions align with each other. I do have a couple issues with the way this is implemented. This seems to have completely done away with most of the features required by the OTel specification. For example, it is missing all of the normal extension points (configuration, propagators, samplers, id generators, span processors, exporters, resource, entities). I'm not sure if this was an unintentional side-effect of vibe coding a replacement SDK or an intentional choice to "avoid bloat," but it really undercuts a lot of the value of OTel. I might even go so far as to say it's not quite accurate to call this "OpenTelemetry support" without those things. It might be better to call this "OTLP span export" or something like that. At the very least I'd make sure you reach out to the OTel Governance Committee to obtain permission to use the OpenTelemetry trademark. I'm not a lawyer and not on the GC, but this is an area where I think it pays to be careful. Finally, and this is my biggest concern, I agree with @legendecas that this seems like it would fracture the otel ecosystem on node. I am most concerned about the complete lack of interoperability with existing otel libraries. It seems to me that there is definitely a way that this could be done that would retain interoperability and allow a gradual transition to a built in implementation without such a fracture. I understand that obviously node can't take a dependency on an NPM module, but I'm certain the otel JS SDK would not have any issue taking a dependency on built-in modules where the experience could be improved for users. |
Adds support for OpenTelemetry tracing, as described in the included docs. Custom instrumentation is not yet supported.
This PR is intended to lay the groundwork for future PRs where API support for custom instrumentations is added. The OpenTelemetry SDK is explicitly not used, mainly to avoid bloat.
A programmatic API to enable and disable tracing is added as a new module,
node:otel, but I'm very open to suggestions in terms of where to add it instead, keeping in mind that the intent is to eventually expose an API for custom instrumentation alongside it.Disclaimer: As an experiment, Claude Code was used to build this PR, based on a hand-written prototype I built (but did not publish) about a year ago. Everything was thoroughly reviewed by me prior to PR submission, including some hand-edits.