1. Scenario:
Apache Flume agent: [http -> memory -> hdfs (CDH4)] (events arrive over HTTP, pass through a memory channel, and are written to HDFS on CDH4).
The machine running this Flume agent only has the CDH4 environment files (no Apache Hadoop installation), hence:
JAVA_HOME=/opt/cloudera/parcels/CDH/lib/hadoop
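The agent described above can be sketched as a Flume configuration. This is a reconstruction, not the original config: the agent/component names (a1, r1, c1, k1), the port, the capacity, and the path pattern are assumptions inferred from the log output below.

```properties
# Hypothetical sketch of the http -> memory -> hdfs agent described above.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# HTTP source: accepts events via HTTP POST (port is an assumption)
a1.sources.r1.type = http
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 5140
a1.sources.r1.channels = c1

# In-memory channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000

# HDFS sink: path pattern reconstructed from the log line
# "Creating hdfs://...:8022/testwjp/2016-05-21/19/FlumeData..."
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = hdfs://alish1-dataservice-01.mypna.cn:8022/testwjp/%Y-%m-%d/%H
a1.sinks.k1.hdfs.useLocalTimeStamp = true
```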
2. Error:
2016-05-21 19:31:27,756 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO – org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:234)] Creating hdfs://alish1-dataservice-01.mypna.cn:8022/testwjp/2016-05-21/19/FlumeData.1463830281582.tmp
2016-05-21 19:31:27,791 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR – org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:459)] process failed
java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses.
at com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
at org.apache.hadoop.hdfs.protocol.proto.HdfsProtos$FsPermissionProto.getSerializedSize(HdfsProtos.java:5407)
at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
3. Analysis:
java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses.
at com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
This points to a protobuf version conflict. In protobuf-java 2.5.0, GeneratedMessage.getUnknownFields() throws exactly this exception unless the generated subclass overrides it, and classes generated with protoc 2.4.x (such as CDH4's HdfsProtos) do not override it. So the classpath most likely contains two different protobuf versions.
4. Verification:
[root@xxx-01 ~]# find / -name protobuf*.jar
/usr/lib/hadoop-hdfs/protobuf-java-2.4.0a.jar
/usr/share/cmf/lib/cdh4/protobuf-java-2.4.0a.jar
/data/01/local/apache-flume-1.6.0-bin/lib/protobuf-java-2.5.0.jar
/data/01/local/apache-tomcat-7.0.42/webapps/logshedcollector/WEB-INF/lib/protobuf-java-2.4.0a.jar
/opt/cloudera/parcels/CDH-4.7.1-1.cdh4.7.1.p0.47/lib/flume-ng/lib/protobuf-java-2.4.1.jar
/opt/cloudera/parcels/CDH-4.7.1-1.cdh4.7.1.p0.47/lib/oozie/libtools/protobuf-java-2.4.0a.jar
/opt/cloudera/parcels/CDH-4.7.1-1.cdh4.7.1.p0.47/lib/hadoop-0.20-mapreduce/lib/protobuf-java-2.4.0a.jar
/opt/cloudera/parcels/CDH-4.7.1-1.cdh4.7.1.p0.47/lib/hbase/lib/protobuf-java-2.4.0a.jar
/opt/cloudera/parcels/CDH-4.7.1-1.cdh4.7.1.p0.47/lib/hadoop-httpfs/webapps/webhdfs/WEB-INF/lib/protobuf-java-2.4.0a.jar
/opt/cloudera/parcels/CDH-4.7.1-1.cdh4.7.1.p0.47/lib/mahout/lib/protobuf-java-2.4.0a.jar
/opt/cloudera/parcels/CDH-4.7.1-1.cdh4.7.1.p0.47/lib/hadoop-hdfs/lib/protobuf-java-2.4.0a.jar
/opt/cloudera/parcels/CDH-4.7.1-1.cdh4.7.1.p0.47/lib/hadoop/client-0.20/protobuf-java-2.4.0a.jar
/opt/cloudera/parcels/CDH-4.7.1-1.cdh4.7.1.p0.47/lib/hadoop/client-0.20/protobuf-java.jar
/opt/cloudera/parcels/CDH-4.7.1-1.cdh4.7.1.p0.47/lib/hadoop/lib/protobuf-java-2.4.0a.jar
/opt/cloudera/parcels/CDH-4.7.1-1.cdh4.7.1.p0.47/lib/hadoop/client/protobuf-java-2.4.0a.jar
/opt/cloudera/parcels/CDH-4.7.1-1.cdh4.7.1.p0.47/lib/hadoop/client/protobuf-java.jar
/opt/cloudera/parcels/CDH-4.7.1-1.cdh4.7.1.p0.47/lib/hadoop-yarn/lib/protobuf-java-2.4.0a.jar
/opt/cloudera/parcels/CDH-4.7.1-1.cdh4.7.1.p0.47/lib/hcatalog/share/webhcat/svr/lib/protobuf-java-2.4.0a.jar
/opt/cloudera/parcels/CDH-4.7.1-1.cdh4.7.1.p0.47/lib/hadoop-mapreduce/lib/protobuf-java-2.4.0a.jar
Sure enough, there are two different versions:
/data/01/local/apache-flume-1.6.0-bin/lib/protobuf-java-2.5.0.jar
/opt/cloudera/parcels/CDH-4.7.1-1.cdh4.7.1.p0.47/lib/hadoop/lib/protobuf-java-2.4.0a.jar
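The conflict can be confirmed mechanically by reducing the two paths above to their version strings; a minimal sketch over the two conflicting entries:

```shell
#!/bin/sh
# Extract the distinct protobuf-java versions from the two conflicting
# paths listed above. More than one distinct version reachable from the
# same JVM classpath is the red flag.
versions=$(printf '%s\n' \
  /data/01/local/apache-flume-1.6.0-bin/lib/protobuf-java-2.5.0.jar \
  /opt/cloudera/parcels/CDH-4.7.1-1.cdh4.7.1.p0.47/lib/hadoop/lib/protobuf-java-2.4.0a.jar \
  | sed 's#.*/protobuf-java-##; s#\.jar$##' \
  | sort -u)
echo "$versions"
# prints:
# 2.4.0a
# 2.5.0
```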
5. Solution
According to http://caiguangguang.blog.51cto.com/1652935/1592804
the message "This is supposed to be overridden by subclasses" only appears in the newer protobuf-java-2.5.0.jar. So move the newer jar out of the way, restart Flume, and replay a test HTTP request; events are then written to HDFS successfully.
mv /data/01/local/apache-flume-1.6.0-bin/lib/protobuf-java-2.5.0.jar /data/01/local/apache-flume-1.6.0-bin/lib/protobuf-java-2.5.0.jar.bak
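Once the jar is moved aside and the agent restarted, the test HTTP request can be replayed. A hedged sketch: the host/port below are assumptions that must match the agent's source config, and Flume's default JSONHandler for the HTTP source expects a JSON array of events, each with "headers" and "body":

```shell
#!/bin/sh
# Build the JSON payload Flume's HTTP source (JSONHandler) expects:
# a JSON array of {headers, body} events.
payload='[{"headers":{"host":"test"},"body":"hello hdfs"}]'
echo "$payload"

# Replay against the restarted agent (uncomment once the agent is up;
# host and port are assumptions):
# curl -s -X POST -H 'Content-Type: application/json' \
#      -d "$payload" http://localhost:5140
```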
6. Afterthought: one remaining concern
flume 1.6.0 -> ships protobuf-java-2.5.0.jar
cdh 4.8.6   -> ships protobuf-java-2.4.0a.jar
cdh 5.4.8   -> ships protobuf-java-2.5.0.jar
With Flume's own protobuf-java-2.5.0.jar moved aside, Flume now effectively runs against the 2.4.0a jar. The concern is that some code path in Flume 1.6.0 may call a method that only exists in protobuf 2.5.0, which would throw at runtime; for now we can only wait and see.
7. Recommendation
If you use CDH, it is best to run the Flume build that ships with that CDH release.
See http://blog.itpub.net/30089851/viewspace-2092318/ for how to look up the matching version on Cloudera's site:
CDH 4.8.6 -> flume 1.4.0
CDH 5.4.8 -> flume 1.5.0
If you use Apache Hadoop, the same applies: run the Flume version that matches your Hadoop version.