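Problem description
With a Hadoop cluster (HDFS and YARN) configured and Spark on YARN set up, submitted jobs failed: the containers exited abnormally and a core dump was reported. Changing YARN's resource settings did not help; the containers still crashed.
Container logs are removed automatically as soon as the job finishes, so the only thing available for analysis was the console output of the running job. That output mentions a core dump and gives the exact path of the Java error report, /tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1523510046494_0001/container_1523510046494_0001_01_000002/hs_err_pid4883.log, but the file was no longer there when I went to look for it.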
# To suppress the following error report, specify this argument
# after -XX: or in .hotspotrc: SuppressErrorAt=/memnode.cpp:2307
#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (/builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/openjdk/hotspot/src/share/vm/opto/memnode.cpp:2307), pid=4883, tid=0x00007f930d7f6700
# assert(Opcode() == mem->Opcode() || phase->C->get_alias_index(adr_type()) == Compile::AliasIdxRaw) failed: no mismatched stores, except on raw memory
#
# JRE version: OpenJDK Runtime Environment (8.0_161-b14) (build 1.8.0_161-debug-b14)
# Java VM: OpenJDK 64-Bit Server VM (25.161-b14-debug mixed mode linux-amd64 compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1523510046494_0001/container_1523510046494_0001_01_000002/hs_err_pid4883.log
#
# Compiler replay data is saved as:
# /tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1523510046494_0001/container_1523510046494_0001_01_000002/replay_pid4883.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
#
Current thread is 140269563373312
Dumping core …
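The interesting fields in a banner like the one above are the JVM source location of the failed assertion, the pid, and the two file paths. A minimal Python sketch for pulling them out of captured console output follows. The function name parse_crash_banner and the SAMPLE text are illustrative only; the regular expressions assume the banner layout shown above and may need adjusting for other JVM versions.

```python
import re

# A few lines copied from the console output above, used as sample input.
SAMPLE = """\
# Internal Error (/builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/openjdk/hotspot/src/share/vm/opto/memnode.cpp:2307), pid=4883, tid=0x00007f930d7f6700
# An error report file with more information is saved as:
# /tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1523510046494_0001/container_1523510046494_0001_01_000002/hs_err_pid4883.log
# Compiler replay data is saved as:
# /tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1523510046494_0001/container_1523510046494_0001_01_000002/replay_pid4883.log
"""


def parse_crash_banner(text):
    """Extract key fields from a HotSpot fatal-error banner (hypothetical helper)."""
    info = {}
    # "Internal Error (<source file>:<line>), pid=<pid>, tid=<tid>"
    m = re.search(r"Internal Error \((\S+):(\d+)\), pid=(\d+)", text)
    if m:
        info["source_file"] = m.group(1)
        info["source_line"] = int(m.group(2))
        info["pid"] = int(m.group(3))
    # hs_err_pid<pid>.log is the error report file.
    m = re.search(r"(/\S*hs_err_pid\d+\.log)", text)
    if m:
        info["error_report"] = m.group(1)
    # replay_pid<pid>.log holds the compiler replay data.
    m = re.search(r"(/\S*replay_pid\d+\.log)", text)
    if m:
        info["replay_file"] = m.group(1)
    return info


if __name__ == "__main__":
    for key, value in parse_crash_banner(SAMPLE).items():
        print(key, "=", value)
```

Both paths sit inside the container's directory under nm-local-dir, which is why they vanish together with the container once the job ends.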
Solution
First, set 'yarn.nodemanager.delete.debug-delay-sec' so that the container's files, including the error report, are kept after the job finishes.
In yarn-site.xml, set yarn.nodemanager.delete.debug-delay-sec to 360000 (the value is in seconds, so 100 hours).
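In yarn-site.xml the setting is an ordinary property element with a name and a value child. As a rough sketch, the Python snippet below adds or updates such a property in a configuration document using only the standard library. The helper name set_property is made up for this example, and the default ElementTree parser does not preserve XML comments, so treat it as an illustration of the file layout rather than a tool to run against a production config.

```python
import xml.etree.ElementTree as ET


def set_property(config_xml, name, value):
    """Return config_xml with the named <property> set to value (added if absent)."""
    root = ET.fromstring(config_xml)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            # Property already present: overwrite its value.
            node = prop.find("value")
            if node is None:
                node = ET.SubElement(prop, "value")
            node.text = value
            break
    else:
        # Property not found: append a new <property> element.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # An empty configuration stands in for a real yarn-site.xml here.
    print(set_property(
        "<configuration></configuration>",
        "yarn.nodemanager.delete.debug-delay-sec",
        "360000",
    ))
```

With the delay in place, the hs_err_pid and replay_pid files named in the console output should still be on disk after the job has ended.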
With the file retained, the error report showed the error information, which pointed into the source code of the JVM itself:
nosuchmethoderror openjdk/hotspot/src/share/vm/prims/jni.cpp
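At this point I began to suspect the JDK: the installed package was a debug build. The crash header agrees with this, since the build string is 1.8.0_161-debug-b14 and the failure is an assert in memnode.cpp, a check that is compiled into debug builds of HotSpot. The java-1.8.0-openjdk packages listed below are available for installation; after reinstalling with java-1.8.0-openjdk-devel.x86_64 the problem was solved. The program now runs stably, and it is faster as well.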
java-1.8.0-openjdk.x86_64 : OpenJDK Runtime Environment
java-1.8.0-openjdk-devel.x86_64 : OpenJDK Development Environment
java-1.8.0-openjdk-debug.x86_64 : OpenJDK Runtime Environment with full debug on
java-1.8.0-openjdk-devel-debug.x86_64 : OpenJDK Development Environment with full debug on
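To check which kind of build a node is actually running, one option is to look for the same "-debug" marker that appears in the crash header (build 1.8.0_161-debug-b14). The Python sketch below does this. The names is_debug_build and java_version_text are hypothetical, and it is an assumption that the output of java -version carries the same marker as the crash header; verify this on your own installation.

```python
import subprocess


def is_debug_build(version_text):
    """True if the text carries the '-debug' marker seen in the crash header."""
    return "-debug" in version_text


def java_version_text(java="java"):
    """Run `java -version` and return what it prints."""
    # `java -version` writes its output to stderr, not stdout.
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    return result.stderr


if __name__ == "__main__":
    text = java_version_text()
    print(text)
    print("debug build" if is_debug_build(text) else "no -debug marker found")
```

Running it on every NodeManager host is worthwhile, because the containers use the JVM installed on the node where they are launched.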