My Hadoop job ran fine on Amazon Elastic MapReduce AMI 3.7.0. But after I upgraded to AMI version 3.8.0,
the toString method of the java.net.URL class started throwing a NullPointerException:
java.lang.NullPointerException
at java.net.URL.toExternalForm(URL.java:925)
at java.net.URL.toString(URL.java:911)
at com.snowplowanalytics.iglu.client.repositories.HttpRepositoryRef.lookupSchema(HttpRepositoryRef.scala:602)
at com.snowplowanalytics.iglu.client.Resolver.recurse$1(Resolver.scala:236)
at com.snowplowanalytics.iglu.client.Resolver.lookupSchema(Resolver.scala:247)
at com.snowplowanalytics.iglu.client.validation.ValidatableJsonMethods$$anonfun$verifySchemaAndValidate$2$$anonfun$apply$6$$anonfun$apply$7.apply(validatableJson.scala:171)
at com.snowplowanalytics.iglu.client.validation.ValidatableJsonMethods$$anonfun$verifySchemaAndValidate$2$$anonfun$apply$6$$anonfun$apply$7.apply(validatableJson.scala:170)
at scalaz.Validation$class.flatMap(Validation.scala:141)
at scalaz.Success.flatMap(Validation.scala:347)
at com.snowplowanalytics.iglu.client.validation.ValidatableJsonMethods$$anonfun$verifySchemaAndValidate$2$$anonfun$apply$6.apply(validatableJson.scala:170)
at com.snowplowanalytics.iglu.client.validation.ValidatableJsonMethods$$anonfun$verifySchemaAndValidate$2$$anonfun$apply$6.apply(validatableJson.scala:169)
at scalaz.Validation$class.flatMap(Validation.scala:141)
at scalaz.Success.flatMap(Validation.scala:347)
at com.snowplowanalytics.iglu.client.validation.ValidatableJsonMethods$$anonfun$verifySchemaAndValidate$2.apply(validatableJson.scala:169)
at com.snowplowanalytics.iglu.client.validation.ValidatableJsonMethods$$anonfun$verifySchemaAndValidate$2.apply(validatableJson.scala:166)
at scalaz.Validation$class.flatMap(Validation.scala:141)
at scalaz.Success.flatMap(Validation.scala:347)
at com.snowplowanalytics.iglu.client.validation.ValidatableJsonMethods$.verifySchemaAndValidate(validatableJson.scala:166)
at com.snowplowanalytics.iglu.client.validation.ValidatableJsonNode.verifySchemaAndValidate(validatableJson.scala:244)
at com.snowplowanalytics.snowplow.enrich.common.utils.shredder.Shredder$$anonfun$extractAndValidateJson$1$$anonfun$apply$8.apply(Shredder.scala:267)
at com.snowplowanalytics.snowplow.enrich.common.utils.shredder.Shredder$$anonfun$extractAndValidateJson$1$$anonfun$apply$8.apply(Shredder.scala:266)
at scalaz.Validation$class.flatMap(Validation.scala:141)
at scalaz.Success.flatMap(Validation.scala:347)
at com.snowplowanalytics.snowplow.enrich.common.utils.shredder.Shredder$$anonfun$extractAndValidateJson$1.apply(Shredder.scala:266)
at com.snowplowanalytics.snowplow.enrich.common.utils.shredder.Shredder$$anonfun$extractAndValidateJson$1.apply(Shredder.scala:264)
at scala.Option.map(Option.scala:145)
at com.snowplowanalytics.snowplow.enrich.common.utils.shredder.Shredder$.extractAndValidateJson(Shredder.scala:264)
at com.snowplowanalytics.snowplow.enrich.common.utils.shredder.Shredder$.extractContexts$1(Shredder.scala:101)
at com.snowplowanalytics.snowplow.enrich.common.utils.shredder.Shredder$.shred(Shredder.scala:108)
at com.snowplowanalytics.snowplow.enrich.hadoop.ShredJob$$anonfun$loadAndShred$1.apply(ShredJob.scala:83)
at com.snowplowanalytics.snowplow.enrich.hadoop.ShredJob$$anonfun$loadAndShred$1.apply(ShredJob.scala:80)
at scalaz.Validation$class.flatMap(Validation.scala:141)
at scalaz.Success.flatMap(Validation.scala:347)
at com.snowplowanalytics.snowplow.enrich.hadoop.ShredJob$.loadAndShred(ShredJob.scala:80)
at com.snowplowanalytics.snowplow.enrich.hadoop.ShredJob$$anonfun$5.apply(ShredJob.scala:170)
at com.snowplowanalytics.snowplow.enrich.hadoop.ShredJob$$anonfun$5.apply(ShredJob.scala:169)
at com.twitter.scalding.MapFunction.operate(Operations.scala:58)
at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:99)
at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:39)
at cascading.flow.stream.SourceStage.map(SourceStage.java:102)
at cascading.flow.stream.SourceStage.run(SourceStage.java:58)
at cascading.flow.hadoop.FlowMapper.run(FlowMapper.java:130)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:452)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:166)
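The URL on which the method is called is not null. The exception is thrown from inside the class's own toExternalForm method.
Why is this happening?
Here is the output of java -version on the AMI 3.8.0 cluster (on both the master and core nodes):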
[hadoop@ip-xxx-xx-xx-xx ~]$ java -version
java version "1.7.0_76"
Java(TM) SE Runtime Environment (build 1.7.0_76-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.76-b04, mixed mode)
And for AMI 3.7.0 (on both the master and core nodes):
[hadoop@ip-xxx-xx-xx-xx ~]$ java -version
java version "1.7.0_71"
Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
Java HotSpot(TM) 64-Bit Server VM (build 24.71-b01, mixed mode)
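Could the different JRE versions be to blame?
Best answer:
As reluctant as I am to make the claim, this looks like a JVM bug. In the OpenJDK source for java.net.URL, the entire toExternalForm() method is a delegation to handler, which is a transient field: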
    public String toExternalForm() {
        return handler.toExternalForm(this);
    }
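The only way this can throw an NPE is if handler is null. As far as I can tell, every constructor path and the readObject(ObjectInputStream) method either set the handler field or throw an exception (MalformedURLException or IOException) when they cannot. For example: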
    private synchronized void readObject(java.io.ObjectInputStream s)
        throws IOException, ClassNotFoundException
    {
        s.defaultReadObject(); // read the fields
        if ((handler = getURLStreamHandler(protocol)) == null) {
            throw new IOException("unknown protocol: " + protocol);
        }
        ...
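One way to check this reasoning on a given node is to push a URL through Java serialization and call toString() on the copy. The following is only a minimal sketch: the class name UrlRoundTrip and the example URL are made up for illustration and are not part of Snowplow, Iglu, or the JDK. On a healthy JRE the deserialized URL should print normally, so an NPE here would point at the runtime rather than at the job code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.net.URL;

// Hypothetical check class; the name and the example URL are assumptions.
final class UrlRoundTrip {

    // Serialize the URL to bytes and read it back. This exercises the
    // deserialization path that is expected to restore the transient handler.
    static URL roundTrip(URL original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return (URL) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        URL original = new URL("http://example.com/schemas/example/1-0-0");
        URL copy = roundTrip(original);
        // If the handler was not restored, this call is where an NPE would surface.
        System.out.println(copy.toString());
    }
}
```

Running this with the same java binary that the Hadoop tasks use, on both AMI versions, would show whether the two JREs behave differently outside the job.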
I notice there is a public JRE 7u79 release, and I would suggest trying that if upgrading to Java 8 is not feasible.
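If changing the JRE is not immediately possible, a stopgap in calling code could be to build the string form from the URL's public getters when toString() fails. This is a sketch under the assumption that the NPE really comes from a null handler. SafeUrlStrings, safeToString, and manualExternalForm are hypothetical names, not an API of the JDK or of the Iglu client. The manual form mirrors the layout the default stream handler produces (protocol, authority, path, query, fragment).

```java
import java.net.URL;

// Hypothetical helper; not part of the JDK or the Iglu client.
final class SafeUrlStrings {

    // Rebuild "protocol://authority/path?query#ref" from public getters only,
    // so no stream handler is involved.
    static String manualExternalForm(URL url) {
        StringBuilder result = new StringBuilder();
        result.append(url.getProtocol()).append(':');
        String authority = url.getAuthority();
        if (authority != null && !authority.isEmpty()) {
            result.append("//").append(authority);
        }
        if (url.getPath() != null) {
            result.append(url.getPath());
        }
        if (url.getQuery() != null) {
            result.append('?').append(url.getQuery());
        }
        if (url.getRef() != null) {
            result.append('#').append(url.getRef());
        }
        return result.toString();
    }

    // Prefer the normal toString(); fall back only if it throws the NPE
    // described in the question.
    static String safeToString(URL url) {
        try {
            return url.toString();
        } catch (NullPointerException e) {
            return manualExternalForm(url);
        }
    }

    public static void main(String[] args) throws Exception {
        URL url = new URL("http://example.com:8080/a/b?x=1#frag");
        System.out.println(safeToString(url));
    }
}
```

Catching NullPointerException hides the underlying problem, so this only makes sense as a temporary measure while the runtime is being fixed.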