Spark_SQL - Creating a Temporary Table

April 11, 2023. 438 views. Source: 蠟筆小噺没有烦恼

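When developing a Spark program, you usually need some data to serve as temporary tables. In production the data is fetched directly with HiveSQL, but development is typically done on a single machine that cannot connect to Hive, so temporary tables have to stand in for the Hive tables. When the program is deployed to the server, you only need to replace the SQLContext with a HiveContext; because HiveContext is a subclass of SQLContext, the rest of the code can stay as it is.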
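The switch between the two contexts can be kept in one place. The following is only a minimal sketch for the Spark 1.x API used in this post: the object name `ContextFactory` and the `useHive` flag are hypothetical, and `HiveContext` requires the spark-hive module on the classpath.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.hive.HiveContext

// Hypothetical helper, not part of Spark: picks the context type from a flag.
object ContextFactory {
  // HiveContext extends SQLContext, so callers can depend on SQLContext only.
  def create(sc: SparkContext, useHive: Boolean): SQLContext =
    if (useHive) new HiveContext(sc) // on the server, with Hive available
    else new SQLContext(sc)          // on a single development machine
}
```

During development you would call `ContextFactory.create(sc, useHive = false)` and register temporary tables as shown below; on the server the flag is set to `true` and the same SQL runs against Hive tables.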

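import org.apache.spark.sql.Row
import org.apache.spark.sql.types._
//1. Define the structure of the temporary table: field name + field type + whether the field may be null
val productSchema = StructType(List(StructField("product_id", LongType, true), StructField("product_title", StringType, true), StructField("extend_info", StringType, true)))
//2. Create an RDD from the given location; the file used here separates fields with a single space
val productRDD = sc.textFile(file_path).map(_.split(" "))
//3. Turn every line of the RDD into a Row
val productRowRDD = productRDD.map(u => Row(u(0).toLong, u(1).trim, u(2).trim))
//4. Convert the RDD to a DataFrame, since only a DataFrame can be registered as a temporary table. Pass in the Row RDD and the table structure
val productDataFrame = sqlContext.createDataFrame(productRowRDD, productSchema)
//5. Register it as a temporary table; the argument is the table name
//   (registerTempTable is deprecated since Spark 2.0 in favor of createOrReplaceTempView)
productDataFrame.registerTempTable("tableName")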
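
The five steps above can also be sketched end to end for Spark 2.0 and later, where `SparkSession` is the entry point and `createOrReplaceTempView` replaces `registerTempTable`. This is an illustration under stated assumptions, not the original author's code: the object name `TempViewSketch`, the view name `product`, and the in-memory sample lines (used instead of `sc.textFile(file_path)`) are all hypothetical.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType}

object TempViewSketch {
  // Same three-column schema as in step 1.
  val productSchema: StructType = StructType(List(
    StructField("product_id", LongType, nullable = true),
    StructField("product_title", StringType, nullable = true),
    StructField("extend_info", StringType, nullable = true)))

  // Builds the temporary view and reads product titles back through SQL.
  def run(spark: SparkSession): Array[String] = {
    // Hypothetical sample lines standing in for the space-delimited file.
    val lines = spark.sparkContext.parallelize(
      Seq("1 pen blue", "2 book hardcover", "3 lamp led"))
    // Steps 2 and 3: split each line and wrap the fields in a Row.
    val rows = lines.map(_.split(" ")).map(u => Row(u(0).toLong, u(1).trim, u(2).trim))
    // Step 4: combine the Row RDD with the schema.
    val df = spark.createDataFrame(rows, productSchema)
    // Step 5: register the DataFrame under a name that SQL can see.
    df.createOrReplaceTempView("product")
    spark.sql("SELECT product_title FROM product WHERE product_id > 1 ORDER BY product_id")
      .collect()
      .map(_.getString(0))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("temp-view-sketch").getOrCreate()
    try println(run(spark).mkString(", ")) finally spark.stop()
  }
}
```

With this API, the counterpart of switching to a HiveContext is calling `enableHiveSupport()` on the `SparkSession` builder when running on the server.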

    Original author: 蠟筆小噺没有烦恼
    Original URL: https://www.jianshu.com/p/171423134707