@waleedlatif1
Collaborator

Summary

  • Added Reducto and Pulse tools for OCR

Type of Change

  • New feature

Testing

Tested manually

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@vercel

vercel bot commented Jan 16, 2026

The latest updates on your projects.

| Project | Deployment | Review | Updated (UTC) |
|---|---|---|---|
| docs | Ready | Preview, Comment | Jan 16, 2026 2:01am |

@greptile-apps
Contributor

greptile-apps bot commented Jan 16, 2026

Greptile Summary

This PR adds two new OCR tools, Reducto and Pulse, for extracting text from documents. Both implementations follow the established patterns in the codebase, with authentication, file access verification, and type definitions for API inputs and outputs.

Key additions:

  • API routes with hybrid auth and presigned URL generation for workspace files
  • Parser tools with input validation, URL format checking, and response transformation
  • Block definitions with file upload/URL options and customizable parameters
  • Complete TypeScript type definitions for API inputs and outputs
  • Documentation files with usage instructions and parameter references
  • SVG icon components for both tools
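As a rough illustration of the URL format checking mentioned in the list above, a parser tool might reject anything that is not an absolute HTTP(S) URL before calling the parse endpoint. This is a minimal sketch, not code from the PR: the function name `validateDocumentUrl`, the `UrlCheck` shape, and the error wording are all hypothetical.

```typescript
// Hypothetical helper: the name, result shape, and messages are assumptions,
// not taken from the PR.
interface UrlCheck {
  ok: boolean
  url?: string
  error?: string
}

function validateDocumentUrl(input: string): UrlCheck {
  const trimmed = input.trim()
  if (trimmed.length === 0) {
    return { ok: false, error: 'A file URL is required' }
  }

  let parsed: URL
  try {
    // The URL constructor throws on relative or malformed input.
    parsed = new URL(trimmed)
  } catch {
    return { ok: false, error: `Invalid URL format: ${trimmed}` }
  }

  // Only plain web URLs are accepted in this sketch.
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    return { ok: false, error: `Unsupported protocol: ${parsed.protocol}` }
  }

  return { ok: true, url: parsed.toString() }
}
```

Returning a result object instead of throwing lets the caller surface the message to the user without a try/catch at every call site; the actual tools may handle this differently.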

Implementation highlights:

  • Proper security with verifyFileAccess for workspace files
  • 5-minute presigned URL expiry for temporary file access
  • Google Drive link validation with helpful error messages
  • Reducto uses a JSON body with Bearer auth; Pulse uses FormData with header-based auth
  • Both tools support selective page processing to optimize costs
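The difference between the two request styles noted above can be sketched as follows. This only builds the request objects and sends nothing. The endpoints are supplied by the caller, and the header name `x-api-key` and the field names `fileUrl` and `pages` are placeholders chosen for illustration; they are not the real Reducto or Pulse API contracts.

```typescript
// Sketch only: header names and field names below are placeholders,
// not the real Reducto or Pulse API contracts.
interface OutboundRequest {
  url: string
  init: {
    method: 'POST'
    headers: Record<string, string>
    body: string | FormData
  }
}

// Reducto-style call: JSON body, API key sent as a Bearer token.
function buildJsonBearerRequest(
  endpoint: string,
  apiKey: string,
  fileUrl: string,
  pages?: number[]
): OutboundRequest {
  const payload: Record<string, unknown> = { fileUrl }
  // Only include pages when a subset was requested, so the default is "all pages".
  if (pages && pages.length > 0) payload.pages = pages
  return {
    url: endpoint,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(payload),
    },
  }
}

// Pulse-style call: multipart form body, API key sent in a dedicated header.
function buildFormDataHeaderRequest(
  endpoint: string,
  apiKey: string,
  fileUrl: string,
  pages?: string
): OutboundRequest {
  const form = new FormData()
  form.append('fileUrl', fileUrl)
  if (pages) form.append('pages', pages)
  return {
    url: endpoint,
    init: {
      method: 'POST',
      // No Content-Type here: fetch sets the multipart boundary from the FormData body.
      headers: { 'x-api-key': apiKey },
      body: form,
    },
  }
}
```

Either result can be passed to `fetch(request.url, request.init)`. Keeping the builders separate from the network call makes the auth and body differences easy to unit test.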

The code is well-structured, follows existing patterns, and includes appropriate error handling.

Confidence Score: 5/5

  • This PR is safe to merge with no blocking issues
  • The implementation follows established codebase patterns, includes proper authentication and authorization checks, has comprehensive error handling, and adds complete type definitions. The code is clean, well-organized, and tested manually as indicated.
  • No files require special attention

Important Files Changed

| Filename | Overview |
|---|---|
| apps/sim/app/api/tools/reducto/parse/route.ts | Added API route for Reducto OCR with proper auth, file access verification, and presigned URL generation |
| apps/sim/app/api/tools/pulse/parse/route.ts | Added API route for Pulse OCR with auth, file access checks, and FormData handling for external API |
| apps/sim/tools/reducto/parser.ts | Implemented Reducto parser tool with URL validation, file upload handling, and response transformation |
| apps/sim/tools/pulse/parser.ts | Implemented Pulse parser tool with comprehensive parameter handling and native API response passthrough |
| apps/sim/blocks/blocks/reducto.ts | Added Reducto block definition with file upload/URL options, page selection, and table format config |
| apps/sim/blocks/blocks/pulse.ts | Added Pulse block definition with multi-format document support, chunking options, and page range selection |

Sequence Diagram

sequenceDiagram
    participant User
    participant Block as Reducto/Pulse Block
    participant Parser as Parser Tool
    participant APIRoute as API Route
    participant Auth as Authentication
    participant Storage as File Storage
    participant OCR as OCR API

    User->>Block: Configure with file and settings
    Block->>Parser: Invoke tool with parameters
    Parser->>Parser: Validate inputs
    Parser->>APIRoute: POST to parse endpoint
    
    APIRoute->>Auth: Verify user authentication
    Auth-->>APIRoute: Authentication successful
    
    alt Internal workspace file
        APIRoute->>Storage: Extract storage information
        APIRoute->>Auth: Check file permissions
        Auth-->>APIRoute: Permission granted
        APIRoute->>Storage: Generate temporary download URL
        Storage-->>APIRoute: Return temporary URL
    else External file URL
        APIRoute->>APIRoute: Use provided URL
    end
    
    APIRoute->>OCR: Send file for processing
    OCR-->>APIRoute: Return extracted content
    APIRoute-->>Parser: Return results
    Parser->>Parser: Transform response
    Parser-->>Block: Deliver output
    Block-->>User: Show extracted text
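The `alt` branch in the diagram above (internal workspace file versus external URL) could look roughly like the sketch below. The summary names `verifyFileAccess` and a 5-minute presigned URL expiry, but the signatures here, the `generatePresignedUrl` helper, the `RouteDeps` interface, and the `/api/files/serve/` prefix used to detect internal files are assumptions made for illustration.

```typescript
// Hypothetical dependencies: these signatures are assumptions for the sketch,
// not the actual helpers used in the PR.
interface RouteDeps {
  verifyFileAccess(storageKey: string, userId: string): Promise<boolean>
  generatePresignedUrl(storageKey: string, expiresInSeconds: number): Promise<string>
}

// 5-minute expiry, as stated in the review summary.
const PRESIGNED_URL_EXPIRY_SECONDS = 5 * 60

// Assumed convention for this sketch: internal workspace files share a path prefix.
const INTERNAL_PREFIX = '/api/files/serve/'

async function resolveFileUrl(
  filePath: string,
  userId: string,
  deps: RouteDeps
): Promise<string> {
  // External file URL: use it as provided.
  if (!filePath.startsWith(INTERNAL_PREFIX)) {
    return filePath
  }

  // Internal workspace file: extract the storage key, then check permissions
  // before handing out a temporary download URL.
  const storageKey = decodeURIComponent(filePath.slice(INTERNAL_PREFIX.length))
  const allowed = await deps.verifyFileAccess(storageKey, userId)
  if (!allowed) {
    throw new Error('File not found or access denied')
  }
  return deps.generatePresignedUrl(storageKey, PRESIGNED_URL_EXPIRY_SECONDS)
}
```

Passing the storage helpers in as `deps` keeps the sketch self-contained and testable with fakes; the real routes presumably import their helpers directly.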

@greptile-apps greptile-apps bot left a comment

18 files reviewed, 1 comment

@waleedlatif1
Collaborator Author

@greptile
