V4 worker pipeline package #1516

Open
nico-martin wants to merge 8 commits into v4 from v4-worker-pipeline-package

@nico-martin
Collaborator

Add Web Worker Pipeline Support

This PR reminded me that other libraries offer a fairly simple way to interact with web workers.
This PR therefore adds webWorkerPipeline, which can be used in place of pipeline on the main thread, and webWorkerPipelineHandler, which handles the corresponding requests inside the web worker.

Subpackage

I deliberately implemented this feature as a subpackage, the first one in this repository. It uses the same package commands as the main library, so commands such as pnpm dev or pnpm build work for both packages.

We will incorporate deployment to npm at a later date.

Usage

Main Thread:

import { webWorkerPipeline } from '@huggingface/transformers-webworker';

const worker = new Worker('worker.js', { type: 'module' });

const pipe = await webWorkerPipeline(
  worker,
  'background-removal',
  'Xenova/modnet',
  {
    device: 'webgpu',
    progress_callback: (e) => {
      console.log('Loading:', e.file);
    },
  }
);

const result = await pipe(image);

Worker Thread (worker.js):

import { webWorkerPipelineHandler } from '@huggingface/transformers-webworker';

const handler = webWorkerPipelineHandler();
self.onmessage = handler.onmessage;

Options and Limitations

Function Callbacks

Function callbacks like progress_callback are automatically handled via a callback bridge and will execute in the main thread:

const pipe = await webWorkerPipeline(worker, 'text-generation', 'model', {
  progress_callback: (progress) => {
    console.log('Loading:', progress);
  }
});
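A minimal sketch of how such a callback bridge could work, assuming functions are swapped for `{ __fn: id }` placeholders before `postMessage` and revived as stubs inside the worker. The `__fn` marker, the `functionId` field, and the `callback_bridge:invoke` message type appear in the review comments on this PR; the helper names `serializeOptions`, `deserializeOptions`, and `handleInvocation`, and the `cb_<key>` id scheme, are hypothetical and not the package's actual API.

```javascript
// Hypothetical sketch of a callback bridge; not the package's actual implementation.
const INVOKE = 'callback_bridge:invoke';

// Main thread: swap each function for a placeholder and remember it by id.
function serializeOptions(options, registry) {
  const out = {};
  for (const [key, value] of Object.entries(options)) {
    if (typeof value === 'function') {
      const id = `cb_${key}`; // assumed id scheme, for illustration only
      registry.set(id, value);
      out[key] = { __fn: id };
    } else {
      out[key] = value;
    }
  }
  return out;
}

// Worker: turn each placeholder into a stub that posts an invocation message.
function deserializeOptions(options, post) {
  const out = {};
  for (const [key, value] of Object.entries(options)) {
    const isPlaceholder =
      value !== null && typeof value === 'object' && typeof value.__fn === 'string';
    out[key] = isPlaceholder
      ? (...args) => post({ type: INVOKE, functionId: value.__fn, args })
      : value;
  }
  return out;
}

// Main thread: run the real callback when an invocation message arrives.
function handleInvocation(msg, registry) {
  if (msg?.type !== INVOKE) return false;
  const fn = registry.get(msg.functionId);
  if (fn) fn(...msg.args);
  return true;
}
```

In a real setup, `post` would be `self.postMessage` in the worker and `handleInvocation` would be called from a `message` listener on the main thread, which is why the callback runs there and not in the worker.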

Note: session_options cannot contain GPU devices, WebNN contexts, or typed arrays, as these are not serializable across worker boundaries.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Copilot AI left a comment

Pull request overview

Adds a new @huggingface/transformers-webworker subpackage to enable running pipeline(...) in a Web Worker with a main-thread API wrapper and a callback-bridge for function options (e.g. progress_callback).

Changes:

  • Introduces webWorkerPipeline (main thread) and webWorkerPipelineHandler (worker thread) plus message constants.
  • Adds a callback bridge implementation to serialize/deserialize function options across the worker boundary.
  • Adds a full subpackage setup (build/dev scripts, TS config, Jest config, tests, and README) and updates the lockfile.

Reviewed changes

Copilot reviewed 21 out of 22 changed files in this pull request and generated 15 comments.

| File | Description |
| --- | --- |
| pnpm-lock.yaml | Adds the new workspace package and its dev dependencies. |
| packages/transformers-webworker/package.json | Defines the new subpackage (exports, scripts, deps). |
| packages/transformers-webworker/tsconfig.json | Type declaration build configuration for the subpackage. |
| packages/transformers-webworker/src/index.ts | Public entrypoint exporting the two main helpers. |
| packages/transformers-webworker/src/constants.ts | Defines message type constants for worker communication. |
| packages/transformers-webworker/src/webWorkerPipeline.ts | Main-thread wrapper that posts requests to a worker and awaits results. |
| packages/transformers-webworker/src/webWorkerPipelineHandler.ts | Worker-side handler that creates/caches pipelines and executes requests. |
| packages/transformers-webworker/src/utils/callback-bridge/* | Implements callback serialization + invocation plumbing. |
| packages/transformers-webworker/scripts/** | Adds esbuild + typegen dev/build tooling for the subpackage. |
| packages/transformers-webworker/jest.config.mjs | Adds Jest + ts-jest configuration for subpackage tests. |
| packages/transformers-webworker/tests/*.test.ts | Adds tests for the worker handler and main-thread wrapper. |
| packages/transformers-webworker/README.md | Documents usage, callback handling, and limitations. |
| packages/transformers-webworker/.gitignore | Ignores build artifacts and coverage output. |
Files not reviewed (1)
  • pnpm-lock.yaml: Language not supported

Comment on lines +98 to +128
it("should handle callback invocations from worker", async () => {
  const callback = jest.fn();
  const options = {
    progress_callback: callback,
  };

  setTimeout(() => {
    // First, send init response
    mockWorker.onmessage?.({
      data: { id: "init", type: RESPONSE_MESSAGE_TYPE_RESULT },
    } as MessageEvent);

    // Then simulate callback invocation
    setTimeout(() => {
      mockWorker.onmessage?.({
        data: {
          type: RESPONSE_MESSAGE_TYPE_INVOKE_CALLBACK,
          functionId: "cb_progress_callback",
          args: [{ status: "progress", progress: 50 }],
        },
      } as MessageEvent);
    }, 10);
  }, 0);

  await webWorkerPipeline(mockWorker as any, "text-classification", "test-model", options);

  // Wait for callback to be invoked
  await new Promise((resolve) => setTimeout(resolve, 20));

  expect(callback).toHaveBeenCalledWith({ status: "progress", progress: 50 });
});

Copilot AI Feb 13, 2026

The test simulates a worker message via mockWorker.onmessage?.(...), but the implementation listens for callback invocations via worker.addEventListener('message', ...) inside CallbackBridgeClient. Since the mock Worker doesn’t implement addEventListener, callback invocation handling won’t be covered accurately; add an addEventListener/removeEventListener mock (and trigger those listeners) or adjust the implementation to use onmessage consistently.
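One way to follow the first option is a mock worker that supports both `onmessage` and registered listeners. The sketch below is a hypothetical test helper, not code from this PR; `createMockWorker`, `emit`, and `posted` are names made up for illustration, and in a Jest suite the methods could additionally be wrapped in `jest.fn()`.

```javascript
// Hypothetical test helper: a mock Worker that supports both onmessage and listeners.
function createMockWorker() {
  const listeners = new Set();
  const worker = {
    onmessage: null,
    posted: [], // every message the code under test sent to the "worker"
    postMessage(message) {
      worker.posted.push(message);
    },
    addEventListener(type, listener) {
      if (type === 'message') listeners.add(listener);
    },
    removeEventListener(type, listener) {
      if (type === 'message') listeners.delete(listener);
    },
    // Simulate a message arriving from the worker thread.
    emit(data) {
      const event = { data };
      worker.onmessage?.(event);
      // Copy the set so listeners may remove themselves while being called.
      for (const listener of [...listeners]) listener(event);
    },
  };
  return worker;
}
```

A test would then call `mockWorker.emit({...})` where it currently calls `mockWorker.onmessage?.({ data: {...} })`, so that both delivery paths are exercised.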

Comment on lines +6 to +10
"types": "./types/src/index.d.ts",
"type": "module",
"exports": {
".": {
"types": "./types/src/index.d.ts",

Copilot AI Feb 13, 2026

The types entry points to ./types/src/index.d.ts, but with outDir: "types" and include: ["src/**/*"] the declaration output will typically be types/index.d.ts (no src/ segment). Align types/exports.types with the actual emitted declaration path to avoid broken typings for consumers.

Suggested change
- "types": "./types/src/index.d.ts",
- "type": "module",
- "exports": {
-   ".": {
-     "types": "./types/src/index.d.ts",
+ "types": "./types/index.d.ts",
+ "type": "module",
+ "exports": {
+   ".": {
+     "types": "./types/index.d.ts",

Comment on lines +14 to +22
const key = JSON.stringify({ task, model_id, options });
let pipe = pipelines.get(key);
if (!pipe) {
pipe = await pipeline(task, model_id, callbackBridge.deserialize(options));
pipelines.set(key, pipe);
}
self.postMessage({ id, type: RESPONSE_READY });
const result = data ? await pipe(data, pipeOptions) : null;
self.postMessage({ id, type: RESPONSE_RESULT, result });

Copilot AI Feb 13, 2026

The handler doesn’t catch exceptions from pipeline(...) or pipe(...), so a failure will neither post a RESPONSE_RESULT with an error nor reject on the main thread (and may crash the worker). Wrap the body in try/catch and post an error payload back with the same id when failures occur.

Suggested change
- const key = JSON.stringify({ task, model_id, options });
- let pipe = pipelines.get(key);
- if (!pipe) {
-   pipe = await pipeline(task, model_id, callbackBridge.deserialize(options));
-   pipelines.set(key, pipe);
- }
- self.postMessage({ id, type: RESPONSE_READY });
- const result = data ? await pipe(data, pipeOptions) : null;
- self.postMessage({ id, type: RESPONSE_RESULT, result });
+ try {
+   const key = JSON.stringify({ task, model_id, options });
+   let pipe = pipelines.get(key);
+   if (!pipe) {
+     pipe = await pipeline(task, model_id, callbackBridge.deserialize(options));
+     pipelines.set(key, pipe);
+   }
+   self.postMessage({ id, type: RESPONSE_READY });
+   const result = data ? await pipe(data, pipeOptions) : null;
+   self.postMessage({ id, type: RESPONSE_RESULT, result });
+ } catch (err) {
+   const error =
+     err instanceof Error
+       ? { name: err.name, message: err.message, stack: err.stack }
+       : { name: 'Error', message: String(err) };
+   self.postMessage({ id, type: RESPONSE_RESULT, error });
+ }
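
On the receiving side, the plain `{ name, message, stack }` payload from the suggestion above could be turned back into an `Error` before the pending promise is rejected, so that callers get a value that works with `instanceof Error`. This is a sketch under that assumption; `toErrorPayload` and `fromErrorPayload` are hypothetical helper names, not part of the PR.

```javascript
// Worker side: reduce any thrown value to a plain object that survives postMessage.
function toErrorPayload(err) {
  return err instanceof Error
    ? { name: err.name, message: err.message, stack: err.stack }
    : { name: 'Error', message: String(err) };
}

// Main thread: rebuild an Error so callers can rely on the usual try/catch handling.
function fromErrorPayload(payload) {
  const error = new Error(payload.message);
  error.name = payload.name;
  if (payload.stack) error.stack = payload.stack; // keep the worker-side stack if present
  return error;
}
```

The main-thread handler could then reject with `fromErrorPayload(msg.error)` where it would otherwise pass `msg.error` through unchanged.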

Comment on lines +13 to +18
const { id, data, task, model_id, options, pipeOptions = {} } = event.data;
const key = JSON.stringify({ task, model_id, options });
let pipe = pipelines.get(key);
if (!pipe) {
pipe = await pipeline(task, model_id, callbackBridge.deserialize(options));
pipelines.set(key, pipe);

Copilot AI Feb 13, 2026

Pipeline caching key includes the fully serialized options (including callback functionIds). Since the client generates a new functionId per serialize call, the cache will miss and reload pipelines unnecessarily. Consider hashing only stable, non-callback option fields (or stripping __fn entries) when building the cache key.
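
A possible shape for such a key, sketched under the assumption that serialized callbacks appear as `{ __fn: id }` objects: placeholders are dropped and object keys are sorted before stringifying. `stripCallbacks` and `pipelineCacheKey` are hypothetical names used only for illustration.

```javascript
// Hypothetical helper: drop callback placeholders and sort keys so the key is stable.
function stripCallbacks(value) {
  if (Array.isArray(value)) return value.map(stripCallbacks);
  if (value !== null && typeof value === 'object') {
    if (typeof value.__fn === 'string') return undefined; // callback placeholder
    const out = {};
    for (const key of Object.keys(value).sort()) {
      const stripped = stripCallbacks(value[key]);
      if (stripped !== undefined) out[key] = stripped;
    }
    return out;
  }
  return value;
}

function pipelineCacheKey(task, model_id, options = {}) {
  return JSON.stringify({ task, model_id, options: stripCallbacks(options) });
}
```

One consequence to weigh: a pipeline served from the cache was created with the callbacks of the first request, so callbacks passed with later requests for the same key would not be attached to it.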

Comment on lines 14 to 56
const messagesResolversMap = new Map<number | 'init', { resolve: Function; reject: Function }>();
let messageIdCounter = 0;

const originalOnMessage = worker.onmessage;
worker.onmessage = (e) => {
  const msg = e.data;
  if (msg?.type === RESPONSE_RESULT) {
    if (msg?.id === 'init') {
      resolve((data: PayloadType, pipeOptions: Record<string, any>) => {
        return new Promise<any>((resolve, reject) => {
          const id = messageIdCounter++;
          messagesResolversMap.set(id, { resolve, reject });
          worker.postMessage({
            id,
            type: REQUEST,
            data,
            task,
            model_id,
            options: options ? callbackBridge.serialize(options) : {},
            pipeOptions,
          });
        });
      });
    } else {
      const resolver = messagesResolversMap.get(msg.id);
      if (resolver) {
        if (msg.error) resolver.reject(msg.error);
        else resolver.resolve(msg.result);
        messagesResolversMap.delete(msg.id);
      }
    }
  }
};

messagesResolversMap.set('init', { resolve, reject });
worker.postMessage({
  id: 'init',
  type: REQUEST,
  data: null,
  task: task ?? '',
  model_id: model_id ?? '',
  options: options ? callbackBridge.serialize(options) : {},
});

Copilot AI Feb 13, 2026

messagesResolversMap.set('init', ...) is never read (init resolution happens via the outer resolve(...)), which makes the map misleading. Either handle init via the map (including init error rejection) or remove the unused 'init' entry and related typing.
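
The first option could look roughly like the sketch below, where the `'init'` request and later numeric requests share one resolver map, so an init failure rejects the same way as any other request. `createResolverRegistry`, `register`, and `settle` are hypothetical names, not the PR's code.

```javascript
// Hypothetical sketch: one resolver map for both the 'init' request and later calls.
function createResolverRegistry() {
  const resolvers = new Map();
  return {
    register(id, resolve, reject) {
      resolvers.set(id, { resolve, reject });
    },
    // Settle the pending request that matches msg.id; returns false if none matched.
    settle(msg) {
      const resolver = resolvers.get(msg.id);
      if (!resolver) return false;
      resolvers.delete(msg.id);
      if (msg.error) resolver.reject(msg.error);
      else resolver.resolve(msg.result);
      return true;
    },
    get size() {
      return resolvers.size;
    },
  };
}
```

With this shape, the `resolve` registered under `'init'` would be the one that hands back the callable pipeline function, and the `onmessage` handler would reduce to a single `settle(msg)` call for result messages.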

import type { PipelineType } from "@huggingface/transformers";

const REQUEST_MESSAGE_TYPE = "transformersjs_worker_pipeline";
const RESPONSE_MESSAGE_TYPE_INVOKE_CALLBACK = "transformersjs_worker_invokeCallback";

Copilot AI Feb 13, 2026

These constants don’t match the implementation: callback invocations use RESPONSE_CALLBACK_INVOCATION (currently 'callback_bridge:invoke'), not 'transformersjs_worker_invokeCallback'. As written, the test will never exercise the callback bridge behavior.

Suggested change
- const RESPONSE_MESSAGE_TYPE_INVOKE_CALLBACK = "transformersjs_worker_invokeCallback";
+ const RESPONSE_MESSAGE_TYPE_INVOKE_CALLBACK = "callback_bridge:invoke";

nico-martin and others added 4 commits February 13, 2026 08:39
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…backBridgeClient.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
3 participants