An In-Depth Look at How the Spark WordCount Program Works

WordCount source code:

import org.apache.spark.{SparkConf, SparkContext}

/**
  * Counts how many times each word appears in a text file.
  *
  * @author Sunny
  * @version 1.0
  * @CreateDate 2018-03-03 10:19
  * @see com.spark.ruizhe
  */
object WordCount {
  def main(args: Array[String]): Unit = {
    // "local" runs Spark in-process with a single worker thread.
    val sparkConf = new SparkConf().setAppName("WordCount").setMaster("local")
    val sparkContext = new SparkContext(sparkConf)
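    // Read the input file as an RDD of lines.
    val lines = sparkContext.textFile("E:\\workspace\\workspace_spark\\workspace_spark_scala\\test.txt")
    // Split each line on spaces and flatten the results into an RDD of words.
    val words = lines.flatMap(line => line.split(" "))
    // Pair every word with an initial count of 1.
    val pairs = words.map(word => (word, 1))
    // Sum the counts of identical words.
    val wordsCount = pairs.reduceByKey(_ + _)
    // Print each (word, count) pair; with a "local" master the output appears on this console.
    wordsCount.foreach(tuple => println(tuple._1 + " appears " + tuple._2 + " times"))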
    println("finished!!")
    // Release the resources held by the SparkContext.
    sparkContext.stop()
  }
}
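
To see what each transformation produces, the same pipeline can be traced on a tiny in-memory input using ordinary Scala collections. This is only a sketch and not Spark code: the object name `WordCountTrace`, the sample lines, and the use of `groupBy` as a stand-in for `reduceByKey` are all assumptions made for illustration.

```scala
object WordCountTrace {
  // Hypothetical sample input standing in for the lines of test.txt.
  val lines: Seq[String] = Seq("hello spark", "hello scala")

  // Same shape as lines.flatMap(line => line.split(" ")) in the Spark job.
  val words: Seq[String] = lines.flatMap(line => line.split(" "))

  // Same shape as words.map(word => (word, 1)).
  val pairs: Seq[(String, Int)] = words.map(word => (word, 1))

  // Stand-in for pairs.reduceByKey(_ + _): group by key, then fold the values with _ + _.
  val wordsCount: Map[String, Int] =
    pairs.groupBy(_._1).map { case (word, group) => (word, group.map(_._2).reduce(_ + _)) }

  def main(args: Array[String]): Unit = {
    println(words)
    println(pairs)
    // Sort by word so the printed order is stable.
    wordsCount.toSeq.sortBy(_._1).foreach { case (word, count) =>
      println(word + " appears " + count + " times")
    }
  }
}
```

Unlike this local trace, the Spark transformations are lazy: nothing is read or computed until an action such as `foreach` is called on the final RDD.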

The in-depth analysis is shown in the figure:

[Figure: "An In-Depth Look at How the Spark WordCount Program Works" (image.png)]
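
As a rough stand-in for this kind of analysis, the sketch below imitates in plain Scala how a `reduceByKey` job is usually described as running: each partition first sums its own pairs, the partial counts are routed to a reducer by key hash, and each reducer merges what it receives. The partition contents, the number of reducers, and all names (`ShuffleSketch`, `combineLocally`, `reducerFor`) are assumptions for illustration; this is not Spark's actual implementation.

```scala
object ShuffleSketch {
  // Hypothetical input: two partitions, each holding some lines of the file.
  val partitions: Seq[Seq[String]] = Seq(
    Seq("hello spark", "hello scala"),
    Seq("hello world", "spark world")
  )
  val numReducers = 2 // assumed number of reduce-side partitions

  // Map side: split into words, pair each with 1, and sum per word inside one partition.
  def combineLocally(lines: Seq[String]): Map[String, Int] =
    lines.flatMap(_.split(" ")).map(word => (word, 1))
      .groupBy(_._1).map { case (word, ones) => (word, ones.map(_._2).sum) }

  // Shuffle routing: a non-negative hash of the key modulo the reducer count.
  def reducerFor(word: String): Int =
    ((word.hashCode % numReducers) + numReducers) % numReducers

  // Reduce side: collect the partial counts sent to each reducer and merge them with +.
  def run(): Map[Int, Map[String, Int]] = {
    val partials: Seq[(String, Int)] = partitions.flatMap(p => combineLocally(p).toSeq)
    partials.groupBy { case (word, _) => reducerFor(word) }.map { case (reducer, received) =>
      (reducer, received.groupBy(_._1).map { case (word, counts) => (word, counts.map(_._2).sum) })
    }
  }

  def main(args: Array[String]): Unit =
    run().toSeq.sortBy(_._1).foreach { case (reducer, counts) =>
      println("reducer " + reducer + ": " + counts.toSeq.sortBy(_._1).mkString(", "))
    }
}
```

Because each partition pre-aggregates before the shuffle, at most one record per distinct word leaves a partition, rather than one record per occurrence.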

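    Original author: SunnyMore
    Original article: https://www.jianshu.com/p/234fecfe98e9
    This article is reposted from the web solely to share knowledge. If it infringes on your rights, please contact the blogger to have it removed.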