Explore exreg pure-Ruby regex library for NFA expressions #9172

@headius

Description

The exreg gem created by @kddnewton is an NFA-based regex engine implemented in pure Ruby. JRuby's current regex engine, Joni, is based on Oniguruma, but it lacks the improvements CRuby added to its own engine to provide linear-time guarantees and avoid regex denial-of-service (ReDoS) attacks.

We should explore whether exreg performs well enough to use for the expressions it supports, falling back on Joni otherwise. Anything exreg can compile should match in linear time, which would let us implement `Regexp.linear_time?` better than simply returning false, and we'd be able to match CRuby for ReDoS robustness.
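
The fallback idea above could look roughly like the sketch below. It is only an illustration: `HybridRegexp`, `UnsupportedPattern`, and `STAND_IN_COMPILER` are hypothetical names, and the stand-in compiler deliberately does not call exreg, whose actual API is not assumed here. It simply rejects patterns with numbered backreferences to simulate "this engine cannot compile that".

```ruby
# Hypothetical sketch: prefer a linear-time engine and fall back to the
# default engine when the pattern is not supported. None of these names
# come from exreg or JRuby.
class UnsupportedPattern < StandardError; end

class HybridRegexp
  # linear_compiler: a callable taking the pattern source and returning an
  # object that responds to #match?, or raising UnsupportedPattern.
  def initialize(source, linear_compiler:)
    @fallback = Regexp.new(source)
    @linear = begin
      linear_compiler.call(source)
    rescue UnsupportedPattern
      nil
    end
  end

  # True only when the linear-time engine accepted the pattern.
  def linear_time?
    !@linear.nil?
  end

  def match?(string)
    (@linear || @fallback).match?(string)
  end
end

# Stand-in for a real linear-time compiler (assumption for illustration):
# it refuses numbered backreferences and otherwise just wraps Regexp.
STAND_IN_COMPILER = lambda do |source|
  raise UnsupportedPattern, source if source.match?(/\\\d/)
  Regexp.new(source)
end

plain   = HybridRegexp.new('ab+c', linear_compiler: STAND_IN_COMPILER)
backref = HybridRegexp.new('(a)\1', linear_compiler: STAND_IN_COMPILER)
puts plain.linear_time?
puts backref.linear_time?
```

With a design like this, the answer to "is this expression linear time?" falls out of which engine accepted the pattern, rather than being hard-coded.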

Challenges:

  • It's pure-Ruby, so performance will not be as good as Java-based Joni. It may be acceptable performance, though, and the linear time guarantees may be worth any degradation.
  • Being pure-Ruby, it may be tricky to integrate at parse time into our AST and IR.
  • Using it from Java would require some additional adapter code.

Work should start with a performance evaluation to see how well exreg performs and to implement low-hanging optimizations. If that looks promising, we can proceed from there to integrating it into JRuby.
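
As a starting point for that kind of measurement, a small harness along these lines might help. It is a sketch under assumptions: `time_matches` and the chosen input sizes are made up for illustration, and only the built-in `Regexp` is timed here. An exreg-backed matcher could be passed in as another callable once its API is wired up.

```ruby
require "benchmark"

# A classic pattern with nested quantifiers, which backtracking engines
# can handle badly on inputs that almost match.
PATTERN = /\A(a+)+\z/

# Times matcher.call(input) for inputs of the form "aaa...ab".
# Returns an array of [n, seconds] pairs.
def time_matches(matcher, sizes)
  sizes.map do |n|
    input = ("a" * n) + "b"
    seconds = Benchmark.realtime { matcher.call(input) }
    [n, seconds]
  end
end

# Matcher for the built-in engine; sizes are kept small on purpose so the
# script finishes quickly even on an engine that backtracks exponentially.
builtin = ->(input) { PATTERN.match?(input) }

results = time_matches(builtin, [8, 12, 16])
results.each { |n, s| puts format("n=%-3d %.6fs", n, s) }
```

Comparing how the timings grow with `n` for each engine gives a rough first signal before investing in deeper integration work.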

An alternative approach would be to simply ship the exreg gem in JRuby and allow users to opt into it for Regexp. They could set a flag or env var, or rewrite code to use exreg directly when they know they need to guarantee linear-time execution, or for user-facing expressions that may be at risk of ReDoS.
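
A minimal sketch of the env var variant, assuming a made-up variable name (`LINEAR_REGEXP`) and made-up helper names; nothing here reflects an existing JRuby setting:

```ruby
# Hypothetical opt-in switch. The variable name LINEAR_REGEXP and both
# helper methods are invented for illustration only.
def linear_regexp_opt_in?(env = ENV)
  %w[1 true yes].include?(env.fetch("LINEAR_REGEXP", "").downcase)
end

# Returns a symbol naming which engine a caller would use.
def choose_engine(env = ENV)
  linear_regexp_opt_in?(env) ? :linear : :default
end

puts choose_engine("LINEAR_REGEXP" => "true")
puts choose_engine({})
```

Taking the environment as a parameter keeps the switch easy to test without mutating the real process environment.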
