Custom Data Types in MapReduce

Built-in data types (all of them implement the Writable interface)

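BooleanWritable     boolean value
ByteWritable        single-byte value
DoubleWritable      double-precision floating-point number
FloatWritable       single-precision floating-point number
IntWritable         32-bit integer
LongWritable        64-bit long integer
Text                text stored as UTF-8
NullWritable        empty placeholder, used when no key or value is needed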

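Because the shuffle phase sorts records by key, a custom type that is used as a key must implement both Writable and Comparable, that is, the WritableComparable interface. A type that is only used as a value needs just Writable.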
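As a rough illustration of what "sorted by key" means, the sketch below sorts a few keys with their own compareTo(). It uses only the JDK: UserKey is a hypothetical stand-in that implements plain Comparable, whereas a real key would implement org.apache.hadoop.io.WritableComparable and be sorted by the framework during the shuffle.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class KeyOrderSketch {
    // Hypothetical stand-in for a custom key (assumption: not a Hadoop class).
    static final class UserKey implements Comparable<UserKey> {
        final int ip;
        final String name;

        UserKey(int ip, String name) {
            this.ip = ip;
            this.name = name;
        }

        // Compare by ip first, then by name when the ips are equal.
        @Override
        public int compareTo(UserKey other) {
            int byIp = Integer.compare(ip, other.ip);
            return byIp != 0 ? byIp : name.compareTo(other.name);
        }

        @Override
        public String toString() {
            return ip + "\t" + name;
        }
    }

    // Returns a sorted copy; the order is whatever compareTo() defines.
    static List<UserKey> sortedKeys(List<UserKey> keys) {
        List<UserKey> copy = new ArrayList<>(keys);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<UserKey> keys = new ArrayList<>();
        keys.add(new UserKey(2, "bob"));
        keys.add(new UserKey(1, "carol"));
        keys.add(new UserKey(1, "alice"));
        for (UserKey key : sortedKeys(keys)) {
            System.out.println(key);
        }
    }
}
```

With these three keys, the two entries with ip 1 come first (alice before carol), followed by the entry with ip 2.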

Writable

write()         serializes the object's fields to the output stream
readFields()    deserializes bytes from the input stream into the object's fields
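The round trip through these two methods can be sketched with nothing but java.io. Record below is a hypothetical class that has the same two method signatures as Writable but does not implement the Hadoop interface, so the example runs without Hadoop on the classpath; the serialize() and deserialize() helpers are likewise made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class WritableRoundTripSketch {
    // Hypothetical record (assumption: mirrors Writable's two methods only).
    static final class Record {
        int id;
        String name;

        // Write the fields in a fixed order: id, then name.
        void write(DataOutput out) throws IOException {
            out.writeInt(id);
            out.writeUTF(name);
        }

        // Read the fields back in exactly the same order.
        void readFields(DataInput in) throws IOException {
            id = in.readInt();
            name = in.readUTF();
        }
    }

    // Serializes a record into an in-memory byte array.
    static byte[] serialize(Record record) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            record.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Rebuilds a record from bytes produced by serialize().
    static Record deserialize(byte[] bytes) {
        Record record = new Record();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            record.readFields(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return record;
    }

    public static void main(String[] args) {
        Record original = new Record();
        original.id = 42;
        original.name = "alice";
        Record copy = deserialize(serialize(original));
        System.out.println(copy.id + "\t" + copy.name);
    }
}
```

If readFields() read the name before the id, the bytes would be misinterpreted, which is why both methods have to agree on the field order.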

Custom data type example

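1. Declare the private fields
2. Add setter and getter methods
3. Add a no-argument constructor and a constructor that takes arguments (Hadoop creates instances through the no-argument one)
4. Add a set() method that the constructor uses to initialize the fields (a convention Hadoop favors)
5. Override hashCode() and equals()
6. Override toString()
7. Implement Writable and provide write() and readFields()
8. For a key type, implement WritableComparable (which extends Writable) and provide compareTo()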

package com.cenzhongman.io;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.WritableComparable;

// Custom key type: serializable by Hadoop (Writable) and sortable in the shuffle (Comparable).
public class UserWritable implements WritableComparable<UserWritable> {
    private int ip;
    private String name;

    public UserWritable() {
    }

    public UserWritable(int ip, String name) {
        this.set(ip, name);
    }

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + ip;
        result = prime * result + ((name == null) ? 0 : name.hashCode());
        return result;
    }

    @Override
    public String toString() {
        return ip + "\t" + name;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        UserWritable other = (UserWritable) obj;
        if (ip != other.ip)
            return false;
        if (name == null) {
            if (other.name != null)
                return false;
        } else if (!name.equals(other.name))
            return false;
        return true;
    }

    public void set(int ip, String name) {
        this.setIp(ip);
        this.setName(name);
    }

    public int getIp() {
        return ip;
    }

    public void setIp(int ip) {
        this.ip = ip;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    // readFields() and write() must handle the fields in the same order.
    @Override
    public void readFields(DataInput in) throws IOException {
        this.ip = in.readInt();
        this.name = in.readUTF();
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(ip);
        out.writeUTF(name);
    }

    @Override
    public int compareTo(UserWritable o) {
        // Order by ip first; fall back to name only when the ips are equal.
        int comp = Integer.compare(this.getIp(), o.getIp());

        if (comp != 0) {
            return comp;
        }
        return this.getName().compareTo(o.getName());
    }
}
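
One reason hashCode() matters for a key type: Hadoop's default HashPartitioner picks the reducer from the key's hash code, so equal keys must produce equal hashes to land on the same reducer. The sketch below reproduces that arithmetic with plain Java. PartitionSketch, partitionFor() and userHash() are hypothetical names; userHash() simply repeats the hashCode() computation from the listing above so that the example does not need Hadoop.

```java
public class PartitionSketch {
    // Same formula as the default hash partitioner: clear the sign bit,
    // then take the remainder by the number of reduce tasks.
    static int partitionFor(int keyHash, int numReduceTasks) {
        return (keyHash & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Repeats the arithmetic of UserWritable.hashCode() (assumption: kept
    // separate here only so the sketch compiles without Hadoop).
    static int userHash(int ip, String name) {
        final int prime = 31;
        int result = 1;
        result = prime * result + ip;
        result = prime * result + ((name == null) ? 0 : name.hashCode());
        return result;
    }

    public static void main(String[] args) {
        int reducers = 4;
        int first = partitionFor(userHash(7, "alice"), reducers);
        int second = partitionFor(userHash(7, "alice"), reducers);
        // Equal keys hash equally, so they always go to the same partition.
        System.out.println(first == second);
    }
}
```

Masking with Integer.MAX_VALUE keeps the result non-negative even when the hash code itself is negative.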
Original author: MapReduce
Original URL: https://www.cnblogs.com/cenzhongman/p/7133904.html