Velox Backend Troubleshooting

Fatal error after native exception is thrown

We depend on checking exceptions thrown from native code to validate whether a spark plan can be really offloaded to native engine. But if libunwind-dev is installed, native exception will not be caught and will interrupt the program. So far, we observed this fatal error can happen only on Ubuntu 20.04. Please remove libunwind-dev and then re-build the project to address this issue.

sudo apt-get purge --auto-remove libunwind-dev

Jar conflict issue

With the latest version of Gluten, there should not be any jar conflict issue anymore. If you still get hit with such issue, please follow the below instructions.

The potentially conflicting libraries include protobuf (Both Velox and CK backend), flatbuffers (Velox backend), and arrow-* (Velox backend). These libraries are compiled from source and packed into Gluten’s jar. Jvm should search them from Gluten.jar firstly and load them. But for some reason jvm loads the jars from spark_home/jars which causes conflict. You may use below commands to remove the jars from spark_home/jars. We are still investigating the root cause. Welcome to share if you have good solution.

rm -rf $SPARK_HOME/jars/protobuf-*
# velox backend only
rm -rf $SPARK_HOME/jars/flatbuffers-*
rm -rf $SPARK_HOME/jars/arrow-*

Incompatible class error when using native writer

Gluten native writer overwrite some vanilla spark classes. Therefore, when running a program that uses gluten, it is essential to ensure that the gluten jar is loaded prior to the vanilla spark jar. In this section, we will provide some configuration settings in $SPARK_HOME/conf/spark-defaults.conf for Yarn client, Yarn cluster, and Local&Standalone mode to guarantee that the gluten jar is prioritized.

Configurations for Yarn Client mode

// spark will upload the gluten jar to hdfs and then the nodemanager will fetch the gluten jar before start the executor process. Here also can set the spark.jars.
spark.files = {absolute_path}/gluten-<spark-version>-<gluten-version>-SNAPSHOT-jar-with-dependencies.jar
// The absolute path on running node
spark.driver.extraClassPath={absolute_path}/gluten-<spark-version>-<gluten-version>-SNAPSHOT-jar-with-dependencies.jar
// The relative path under the executor working directory
spark.executor.extraClassPath=./gluten-<spark-version>-<gluten-version>-SNAPSHOT-jar-with-dependencies.jar

Configurations for Yarn Cluster mode

spark.driver.userClassPathFirst = true
spark.executor.userClassPathFirst = true

spark.files = {absolute_path}/gluten-<spark-version>-<gluten-version>-SNAPSHOT-jar-with-dependencies.jar
// The relative path under the executor working directory
spark.driver.extraClassPath=./gluten-<spark-version>-<gluten-version>-SNAPSHOT-jar-with-dependencies.jar
// The relative path under the executor working directory
spark.executor.extraClassPath=./gluten-<spark-version>-<gluten-version>-SNAPSHOT-jar-with-dependencies.jar

Configurations for Local & Standalone mode

// The absolute path on running node
spark.driver.extraClassPath={absolute_path}/gluten-<spark-version>-<gluten-version>-SNAPSHOT-jar-with-dependencies.jar
// The absolute path on running node
spark.executor.extraClassPath={absolute_path}/gluten-<spark-version>-<gluten-version>-SNAPSHOT-jar-with-dependencies.jar

Invalid pointer error

If the below error is reported at runtime, please re-build gluten with --compile_arrow_java=ON, then redeploy Gluten jar.

*** Error in `/usr/local/jdk1.8.0_381/bin/java': free(): invalid pointer: 0x00007f36cb5cec80 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x7d1fd)[0x7f38c29da1fd]
/lib64/libstdc++.so.6(_ZNSt6locale5_Impl16_M_install_facetEPKNS_2idEPKNS_5facetE+0x142)[0x7f36cb3370d2]
/lib64/libstdc++.so.6(_ZNSt6locale5_ImplC1Em+0x1e3)[0x7f36cb337523]
/lib64/libstdc++.so.6(+0x71495)[0x7f36cb338495]
/lib64/libpthread.so.0(pthread_once+0x50)[0x7f38c3147be0]
/lib64/libstdc++.so.6(+0x714e1)[0x7f36cb3384e1]
/lib64/libstdc++.so.6(_ZNSt6localeC2Ev+0x13)[0x7f36cb338523]
/lib64/libstdc++.so.6(_ZNSt8ios_base4InitC2Ev+0xbc)[0x7f36cb33537c]
/tmp/jnilib-645156599284574767.tmp(+0x2a90)[0x7f375d235a90]
/lib64/ld-linux-x86-64.so.2(+0xf4e3)[0x7f38c33664e3]
/lib64/ld-linux-x86-64.so.2(+0x13b04)[0x7f38c336ab04]
/lib64/ld-linux-x86-64.so.2(+0xf2f4)[0x7f38c33662f4]
/lib64/ld-linux-x86-64.so.2(+0x1321b)[0x7f38c336a21b]
/lib64/libdl.so.2(+0x102b)[0x7f38c2d1f02b]
/lib64/ld-linux-x86-64.so.2(+0xf2f4)[0x7f38c33662f4]
/lib64/libdl.so.2(+0x162d)[0x7f38c2d1f62d]
/lib64/libdl.so.2(dlopen+0x31)[0x7f38c2d1f0c1]
/usr/local/jdk1.8.0_381/jre/lib/amd64/server/libjvm.so(+0x9292b1)[0x7f38c22732b1]
/usr/local/jdk1.8.0_381/jre/lib/amd64/server/libjvm.so(JVM_LoadLibrary+0xa1)[0x7f38c205e0c1]
/usr/local/jdk1.8.0_381/jre/lib/amd64/libjava.so(Java_java_lang_ClassLoader_00024NativeLibrary_load+0x1ac)
...

Back to top

Copyright © 2024 The Apache Software Foundation, Licensed under the Apache License, Version 2.0. Apache Gluten, Gluten, Apache, the Apache feather logo, and the Apache Gluten project logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.

Apache Gluten is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.

Privacy Policy