Skip to content

Performance: Method to reduce key compare times #2552

@jwinkler2083233

Description

@jwinkler2083233

I have a modification that I'm using, and it dramatically cuts the operations needed to compare keys and boosts performance. Since this library is all about performance, I think you'll find this interesting.

Consider what's at the core of each json string key comparison: memcmp(ptrA, ptrB, len);

The gap that I saw is this: If you know the length, 'len', at compile-time, then you can pass in 'len' as a const int at compile-time. When that happens, the compiler can inline it and emit code that is exactly correct for that length. g++ is especially good at this. If the compiler knows that the comparison length is exactly 4, for example, then the compiler will automatically cast 2 32-bit integers, and do an integer comparison. It's the least amount of effort. If the compiler doesn't know the length, on the other hand, then it has to call the runtime version of memcmp(). That version has loops and branches and checks, and does significantly more work.

To be clear, here's the 'key_equals()' method:


inline bool object::iterator::key_equals(std::string_view o) const noexcept {
  // We use the fact that the key length can be computed quickly
  // without access to the string buffer.
  const uint32_t len = key_length();
  if(o.size() == len) {
    // We avoid construction of a temporary string_view instance.
    return (memcmp(o.data(), key_c_str(), len) == 0);
  }
  return false;
}

Here's the addition I propose. If we use templates (which are already in use in simdjson), then we can get the compile-time length of a literal string. With that, we can start with a common field lookup, like:

'fields["fieldname"]'

When that string literal is in the code, then templates can capture the literal, and pass it and its constant length all the way through to the key comparator, like this:

template<size_t N>
inline bool key_equals_n(std::string_view o) const noexcept
{
    // The desired key length is N
    constexpr uint32_t len = N;

    const uint32_t actual_key_len = key_length();

    if (actual_key_len != len)
    {
        return false;
    }

    // a standard memcmp, which might still be intrinsic
    return (std::memcmp(o.data(), key_c_str(), len) == 0);
}

That triggers the compiler to generate the optimal code for memcmp(), when optimizations are enabled.

Here's how I capture a string literal using templates and SFINAE, so that the 'operator[]' method is bifurcated based on whether a string literal is detected. 'at_key_n<>()' is a template, so that the constant length can continue to be passed to other template functions, eventually reaching 'key_equals_n<>()':


template<size_t N>
simdjson_result<dom::element> operator[](const char(&key)[N]) const noexcept {
    // N - 1 is the length, excluding the null terminator.
    if (error()) { return error(); }
    return first.at_key_n<N - 1>(key);
}

template <typename T = const char*,
    typename std::enable_if_t<std::is_pointer_v<T> && !std::is_array_v<T> > >
inline simdjson_result<dom::element> operator[](T key) const noexcept {
    if (error()) { return error(); }
    return first[key];
}

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions