Developer Guide for Implementing Spark Built-in SQL Functions in Velox

In Velox, most SQL functions live in two folders, prestosql and sparksql, for Presto and Spark respectively. Gluten asks Velox to register the prestosql functions first and the sparksql functions second. So if prestosql and sparksql both provide a function with the same signature, the sparksql function overwrites the corresponding prestosql function. If a required function is missing from both folders (apart from some common functions defined elsewhere, such as cast), it needs to be implemented in the sparksql folder. A prestosql function may also differ semantically from the corresponding Spark function even though the two share the same name and function signature. In that case, too, an implementation is needed in the sparksql folder, generally based on the original prestosql implementation.
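The effect of the registration order can be pictured with a small stand-alone model. This is only a sketch: `ToyRegistry`, `buildToyRegistry`, and the function names `shared_fn`, `presto_only_fn`, and `spark_only_fn` are hypothetical and are not part of Velox or Gluten. It assumes nothing more than "a later registration with the same signature replaces the earlier one".

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a function registry; NOT Velox's real API.
// It maps a signature string to the folder that supplied the implementation.
class ToyRegistry {
 public:
  // Registering an existing signature again replaces the earlier entry.
  void registerFunction(const std::string& signature, const std::string& folder) {
    functions_[signature] = folder;
  }

  // Returns the folder whose implementation would be used for this signature.
  const std::string& lookup(const std::string& signature) const {
    return functions_.at(signature);
  }

 private:
  std::map<std::string, std::string> functions_;
};

// Registers in the order the guide describes: prestosql first, then sparksql.
inline ToyRegistry buildToyRegistry() {
  ToyRegistry registry;
  // Step 1: prestosql functions (made-up names).
  registry.registerFunction("shared_fn(bigint,bigint)", "prestosql");
  registry.registerFunction("presto_only_fn(bigint)", "prestosql");
  // Step 2: sparksql functions; the shared signature is overwritten here.
  registry.registerFunction("shared_fn(bigint,bigint)", "sparksql");
  registry.registerFunction("spark_only_fn(bigint)", "sparksql");
  return registry;
}
```

In this model, looking up `shared_fn(bigint,bigint)` resolves to the sparksql entry, while `presto_only_fn(bigint)` still resolves to prestosql because nothing replaced it.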

A few Spark functions behave differently in certain special cases depending on whether ANSI mode is on or off. Currently, Gluten does NOT support ANSI mode, so only the ANSI-off behavior needs to be considered when implementing Spark built-in functions in Velox.
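As an illustration of such a special case, consider an integer remainder with a zero divisor: with ANSI off, Spark returns NULL, whereas with ANSI on it raises an error. The sketch below models both in plain C++ without any Velox types. The function names `remainderAnsiOff` and `remainderAnsiOn` are hypothetical, NULL is modeled as `std::nullopt`, and the ANSI-on variant is shown only for contrast.

```cpp
#include <cstdint>
#include <optional>
#include <stdexcept>

// ANSI off: a zero divisor yields NULL, modeled here as std::nullopt.
// This is the only behavior that currently matters for Gluten.
inline std::optional<int32_t> remainderAnsiOff(int32_t a, int32_t b) {
  if (b == 0) {
    return std::nullopt;
  }
  if (b == -1) {
    // Any integer is divisible by -1; handling it here avoids the
    // overflow that INT32_MIN % -1 would cause in C++.
    return int32_t{0};
  }
  return a % b;
}

// ANSI on (for contrast only): a zero divisor is an error.
inline int32_t remainderAnsiOn(int32_t a, int32_t b) {
  if (b == 0) {
    throw std::domain_error("division by zero");
  }
  return b == -1 ? 0 : a % b;
}
```

A Velox implementation targeting Gluten would only need the first variant.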

Take BitwiseAndFunction as an example:

template <typename T>
struct BitwiseAndFunction {
  // A void return type indicates that a non-null input never produces a null result.
  // A bool return type indicates that a non-null input can produce a null result
  // (return false to mark the result as null).
  template <typename TInput>
  FOLLY_ALWAYS_INLINE void call(TInput& result, TInput a, TInput b) {
    result = a & b;
  }
};

Both the struct and its call function are templated so that multiple types can be supported. In the above implementation, the result is null whenever any input is null. Use callNullable instead if you need different behavior for null input, e.g., a non-null result for null input. Also see callNullFree in the Velox documentation. It enables a fast path: evaluation is skipped and null is returned when any input is null or, for complex types, contains a null.
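The null-handling rules above can be made concrete with a stand-alone model. This is a sketch under stated assumptions, not Velox code: the drivers `evalCall` and `evalNullable` are hypothetical stand-ins for what the framework does around a function, nulls are modeled with `std::optional`, and `AddOrNull` and `FirstNonNull` are made-up functions. The one Velox detail assumed here is that callNullable receives its arguments as pointers, with a null pointer standing for a null input.

```cpp
#include <cstdint>
#include <limits>
#include <optional>
#include <type_traits>

// void call(): a non-null input always gives a non-null result.
struct BitwiseAnd {
  void call(int64_t& result, int64_t a, int64_t b) { result = a & b; }
};

// bool call(): returning false marks the result as null (made-up semantics:
// null when the addition would overflow).
struct AddOrNull {
  bool call(int64_t& result, int64_t a, int64_t b) {
    const int64_t max = std::numeric_limits<int64_t>::max();
    const int64_t min = std::numeric_limits<int64_t>::min();
    if ((b > 0 && a > max - b) || (b < 0 && a < min - b)) {
      return false;
    }
    result = a + b;
    return true;
  }
};

// callNullable(): sees null inputs (as nullptr) and may still return a value.
struct FirstNonNull {
  bool callNullable(int64_t& result, const int64_t* a, const int64_t* b) {
    if (a != nullptr) { result = *a; return true; }
    if (b != nullptr) { result = *b; return true; }
    return false;  // both inputs null, so the result is null
  }
};

// Hypothetical driver for call(): any null input short-circuits to null.
template <typename Fn>
std::optional<int64_t> evalCall(std::optional<int64_t> a, std::optional<int64_t> b) {
  if (!a || !b) {
    return std::nullopt;  // default null behavior; call() is never invoked
  }
  Fn fn;
  int64_t result = 0;
  if constexpr (std::is_void_v<decltype(fn.call(result, *a, *b))>) {
    fn.call(result, *a, *b);
    return result;
  } else {
    if (!fn.call(result, *a, *b)) {
      return std::nullopt;
    }
    return result;
  }
}

// Hypothetical driver for callNullable(): null inputs are passed through.
template <typename Fn>
std::optional<int64_t> evalNullable(std::optional<int64_t> a, std::optional<int64_t> b) {
  Fn fn;
  int64_t result = 0;
  const int64_t* pa = a ? &*a : nullptr;
  const int64_t* pb = b ? &*b : nullptr;
  if (!fn.callNullable(result, pa, pb)) {
    return std::nullopt;
  }
  return result;
}
```

With these definitions, `evalCall<BitwiseAnd>(std::nullopt, 3)` is null without `call` ever running, `evalCall<AddOrNull>` is null on overflow even though both inputs are non-null, and `evalNullable<FirstNonNull>(std::nullopt, 9)` returns a value despite a null input.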

The code below registers the implemented function for all integer types. The specified name, bitwise_and (prepended with prefix), is the name actually used to call the function.

registerBinaryIntegral<BitwiseAndFunction>({prefix + "bitwise_and"});
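To show what "all integer types" amounts to, here is a stand-alone model of such a helper. It is a sketch, not Velox's implementation: `toyRegistry`, `toyRegisterOne`, and `toyRegisterBinaryIntegral` are hypothetical names, the registry is a plain map, and the sketch assumes the four integer widths tinyint, smallint, integer, and bigint. The Velox-specific inline macro is dropped from the function so the block compiles on its own.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <string>

// Same shape as the guide's example, without the Velox-specific macro.
template <typename T>
struct BitwiseAndFunction {
  template <typename TInput>
  void call(TInput& result, TInput a, TInput b) {
    result = a & b;
  }
};

// Hypothetical registry: "name(type,type)" -> type-erased implementation.
using ErasedBinary = std::function<int64_t(int64_t, int64_t)>;
inline std::map<std::string, ErasedBinary>& toyRegistry() {
  static std::map<std::string, ErasedBinary> entries;
  return entries;
}

// Registers one instantiation of Fn::call for the C++ type TInput.
template <template <class> class Fn, typename TInput>
void toyRegisterOne(const std::string& name, const std::string& type) {
  toyRegistry()[name + "(" + type + "," + type + ")"] =
      [](int64_t a, int64_t b) {
        TInput result{};
        // TInput is deduced from the arguments, as in the guide's example.
        Fn<void>().call(result, static_cast<TInput>(a), static_cast<TInput>(b));
        return static_cast<int64_t>(result);
      };
}

// Hypothetical counterpart of the helper: one entry per integer width.
template <template <class> class Fn>
void toyRegisterBinaryIntegral(const std::string& name) {
  toyRegisterOne<Fn, int8_t>(name, "tinyint");
  toyRegisterOne<Fn, int16_t>(name, "smallint");
  toyRegisterOne<Fn, int32_t>(name, "integer");
  toyRegisterOne<Fn, int64_t>(name, "bigint");
}
```

Calling `toyRegisterBinaryIntegral<BitwiseAndFunction>(prefix + "bitwise_and")` with an empty `prefix` leaves four entries in the toy registry, one per integer type, all backed by the same templated `call`.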

Functions for complex types have similar implementations. See ArrayAverageFunction in velox/functions/prestosql/ArrayFunctions.h.

Reference:

Velox’s official developer guide:



Copyright © 2024 The Apache Software Foundation, Licensed under the Apache License, Version 2.0. Apache Gluten, Gluten, Apache, the Apache feather logo, and the Apache Gluten project logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.
