Stack trace for the "Accept timed out" exception
18/07/13 23:14:35 ERROR PythonRDD: Error while sending iterator
java.net.SocketTimeoutException: Accept timed out
    at java.net.PlainSocketImpl.socketAccept(Native Method)
    at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
    at java.net.ServerSocket.implAccept(ServerSocket.java:545)
    at java.net.ServerSocket.accept(ServerSocket.java:513)
    at org.apache.spark.api.python.PythonRDD$$anon$1.run(PythonRDD.scala:397)
18/07/13 23:22:46 INFO ContextCleaner: Cleaned accumulator 1265984
18/07/13 23:22:46 INFO ContextCleaner: Cleaned accumulator 1261984
18/07/13 23:22:46 INFO ContextCleaner: Cleaned accumulator 1291349
18/07/13 23:22:46 INFO ContextCleaner: Cleaned accumulator 1261975
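Cause
Spark treats different host aliases that map to the same IP address as different hosts. For example, the spark-streaming node had this entry in its hosts file:
    127.0.0.1 localhost.localdomain localhost4 localhost4.localdomain4
With that entry, executors intermittently fail with the "Accept timed out" error.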
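To check whether a node is affected, it can help to list the names that its hosts file maps to 127.0.0.1. The following is a minimal sketch, not part of the original fix: the helper name loopback_names is hypothetical, and it parses the hosts content passed in as a string rather than reading /etc/hosts directly.

```python
def loopback_names(hosts_text, ip="127.0.0.1"):
    """Return every hostname mapped to `ip`, in file order.

    Hypothetical helper for illustration; `hosts_text` is the content
    of a hosts file, passed in as a string.
    """
    names = []
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if fields[0] == ip:
            # Everything after the address is a hostname or alias.
            names.extend(fields[1:])
    return names


# The entry from the affected node described above.
sample = "127.0.0.1 localhost.localdomain localhost4 localhost4.localdomain4\n"
names = loopback_names(sample)
print(names)
# ['localhost.localdomain', 'localhost4', 'localhost4.localdomain4']
print("localhost" in names)
# False
```

On a real node, the string could come from reading /etc/hosts (the usual path on Linux, assumed here). If plain localhost is missing, or several aliases share the loopback address, the node matches the situation described above.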
Solution
In the hosts file, change
    127.0.0.1 localhost.localdomain localhost4 localhost4.localdomain4
to
    127.0.0.1 localhost
That resolved the problem.
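If the same change has to be made on several nodes, the edit can be scripted. This is a rough sketch under some assumptions: the function name normalize_loopback is hypothetical, it only transforms text and never writes to /etc/hosts itself, and it drops any additional 127.0.0.1 lines so that a single entry remains. Back up the real file before applying anything like this.

```python
def normalize_loopback(hosts_text, ip="127.0.0.1", name="localhost"):
    """Rewrite the entry for `ip` so that it maps to `name` only.

    Hypothetical helper for illustration. All other lines, including
    comments and IPv6 entries, are kept unchanged.
    """
    out = []
    replaced = False
    for line in hosts_text.splitlines():
        # Ignore trailing comments when looking at the address field.
        fields = line.split("#", 1)[0].split()
        if fields and fields[0] == ip:
            if not replaced:
                out.append(f"{ip} {name}")
                replaced = True
            # Further lines for the same address are dropped.
            continue
        out.append(line)
    return "\n".join(out) + "\n"


before = (
    "127.0.0.1 localhost.localdomain localhost4 localhost4.localdomain4\n"
    "::1 localhost6\n"
)
print(normalize_loopback(before), end="")
# 127.0.0.1 localhost
# ::1 localhost6
```

Working on a string keeps the sketch side-effect free; writing the result back to the hosts file requires root privileges and is left to the operator.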