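A Java crash encountered with Spark-on-YARN

Problem description

After setting up a Hadoop cluster (HDFS and YARN) and configuring Spark on YARN, I submitted a job and found that the container exited abnormally, with a core dump reported. Changing YARN's resource settings made no difference; it still crashed.

Container logs are removed automatically as soon as the job finishes, so the only thing left to analyze was the console output. It mentions a core dump and gives the exact location of the Java error report, /tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1523510046494_0001/container_1523510046494_0001_01_000002/hs_err_pid4883.log, but when I went to look, the file was not there.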

# To suppress the following error report, specify this argument
# after -XX: or in .hotspotrc:  SuppressErrorAt=/memnode.cpp:2307
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/openjdk/hotspot/src/share/vm/opto/memnode.cpp:2307), pid=4883, tid=0x00007f930d7f6700
#  assert(Opcode() == mem->Opcode() || phase->C->get_alias_index(adr_type()) == Compile::AliasIdxRaw) failed: no mismatched stores, except on raw memory
#
# JRE version: OpenJDK Runtime Environment (8.0_161-b14) (build 1.8.0_161-debug-b14)
# Java VM: OpenJDK 64-Bit Server VM (25.161-b14-debug mixed mode linux-amd64 compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1523510046494_0001/container_1523510046494_0001_01_000002/hs_err_pid4883.log
#
# Compiler replay data is saved as:
# /tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1523510046494_0001/container_1523510046494_0001_01_000002/replay_pid4883.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#
Current thread is 140269563373312
Dumping core ...
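Since the console output is the only place the report location survives, it can help to pull the path out of captured output automatically. The following is a minimal sketch, assuming the console output has been saved as text; the function name find_error_report_paths is made up for illustration and is not part of any Hadoop or JDK tooling.

```python
import re
from typing import List


def find_error_report_paths(console_output: str) -> List[str]:
    """Return every hs_err_pid<N>.log path mentioned in captured console output."""
    # An absolute path ending in hs_err_pid<digits>.log; \S* stops at whitespace.
    pattern = re.compile(r"/\S*hs_err_pid\d+\.log")
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(pattern.findall(console_output)))


# Two lines copied from the crash output shown above.
sample = (
    "# An error report file with more information is saved as:\n"
    "# /tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/"
    "application_1523510046494_0001/container_1523510046494_0001_01_000002/"
    "hs_err_pid4883.log\n"
)

for path in find_error_report_paths(sample):
    print(path)
```

The path only tells you where the JVM wrote the report; whether the file still exists depends on when the NodeManager cleans up the container directory.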

Solution

First, set 'yarn.nodemanager.delete.debug-delay-sec' so that the container's files are kept after the job ends.

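In yarn-site.xml:  yarn.nodemanager.delete.debug-delay-sec = 360000  (the number of seconds the NodeManager waits after an application finishes before deleting its local directories and logs)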
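To double-check that the setting is really present in the file the NodeManager reads, the property can be looked up programmatically. This is only a sketch: the embedded XML is a hypothetical, stripped-down yarn-site.xml that shows the usual property/name/value layout, and get_property is a helper name invented for this example.

```python
import xml.etree.ElementTree as ET
from typing import Optional

PROPERTY = "yarn.nodemanager.delete.debug-delay-sec"

# Hypothetical yarn-site.xml content; a real file holds many more properties.
SAMPLE_YARN_SITE = """\
<configuration>
  <property>
    <name>yarn.nodemanager.delete.debug-delay-sec</name>
    <value>360000</value>
  </property>
</configuration>
"""


def get_property(xml_text: str, name: str) -> Optional[str]:
    """Return the <value> of the named Hadoop property, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


delay = get_property(SAMPLE_YARN_SITE, PROPERTY)
print(PROPERTY, "=", delay)
```

For a real cluster, read the text of the actual yarn-site.xml instead of the sample string. The NodeManagers typically need a restart before a change to this file takes effect.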

The error report shows the error message and the JVM source file that was being executed:

nosuchmethoderror openjdk/hotspot/src/share/vm/prims/jni.cpp

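This made me suspect the JDK itself: the installed package was a debug build. The crash output above is consistent with that, since the build string reads 1.8.0_161-debug-b14 and the failure is an assert(), which HotSpot only checks in debug builds. After reinstalling with java-1.8.0-openjdk-devel.x86_64 the problem was solved; the job now runs stably and is also faster. The java-1.8.0-openjdk packages available for installation are: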

java-1.8.0-openjdk.x86_64 : OpenJDK Runtime Environment

java-1.8.0-openjdk-devel.x86_64 : OpenJDK Development Environment

java-1.8.0-openjdk-debug.x86_64 : OpenJDK Runtime Environment with full debug on

java-1.8.0-openjdk-devel-debug.x86_64 : OpenJDK Development Environment with full debug on
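A quick way to spot this situation earlier is to check the version lines of the crash header for a debug marker. The sketch below is a heuristic only: it assumes debug builds carry "-debug" in the "JRE version" or "Java VM" line, as in the log above. The function name is_debug_build and the release-build header are hypothetical, written for this example.

```python
def is_debug_build(crash_header: str) -> bool:
    """Heuristic: True if the JRE/VM version lines of a crash header mention "-debug"."""
    for line in crash_header.splitlines():
        # Crash header lines start with "#"; drop that prefix and surrounding spaces.
        text = line.lstrip("# ").strip()
        if text.startswith(("JRE version:", "Java VM:")) and "-debug" in text:
            return True
    return False


# Version lines copied from the crash output in this article.
DEBUG_HEADER = (
    "# JRE version: OpenJDK Runtime Environment (8.0_161-b14) (build 1.8.0_161-debug-b14)\n"
    "# Java VM: OpenJDK 64-Bit Server VM (25.161-b14-debug mixed mode linux-amd64 compressed oops)\n"
)

# Hypothetical header from a non-debug build, for comparison only.
RELEASE_HEADER = (
    "# JRE version: OpenJDK Runtime Environment (8.0_161-b14) (build 1.8.0_161-b14)\n"
    "# Java VM: OpenJDK 64-Bit Server VM (25.161-b14 mixed mode linux-amd64 compressed oops)\n"
)

print(is_debug_build(DEBUG_HEADER))
print(is_debug_build(RELEASE_HEADER))
```

The same marker appears in the installed package names listed above (the -debug and -devel-debug variants), so checking which of those packages is installed is an equally valid route.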

    Original author: 北水南调
    Original URL: https://www.jianshu.com/p/43a361900ee6