
Only the Scala version is covered here.


If you also need the Java, Python, or R versions, see the reference link at the bottom of this post.


SparkConf (Spark 1.6 / Spark 2.x)

You will continue to use these classes (via the sparkContext accessor of SparkSession) to perform operations that require the Spark Core API, such as working with accumulators, broadcast variables, or low-level RDDs. However, you will not need to create them manually.

// Spark 1.6
import org.apache.spark.SparkConf

val sparkConf = new SparkConf().setMaster("local[*]")
sparkConf.set("spark.files", "file.txt")
 
// Spark 2.x
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.master("local[*]").getOrCreate()
spark.conf.set("spark.files", "file.txt")
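
To make the point about the sparkContext accessor concrete, here is a minimal sketch (the names and data are made up) that reaches the underlying SparkContext from a SparkSession and uses it for Spark Core API work such as a broadcast variable and a low-level RDD.

import org.apache.spark.sql.SparkSession

// Assumed local session, purely for illustration
val spark = SparkSession.builder.master("local[*]").getOrCreate()

// The SparkContext is created for you; reach it through the accessor
val sc = spark.sparkContext

// Spark Core API operations still go through SparkContext
val lookup = sc.broadcast(Map("a" -> 1, "b" -> 2))        // broadcast variable
val rdd = sc.parallelize(Seq("a", "b", "b"))              // low-level RDD
val counts = rdd.map(k => (k, lookup.value.getOrElse(k, 0))).collect()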


SQLContext (Spark 1.6 / Spark 2.x)

The SQLContext is completely superseded by SparkSession. Most Dataset and DataFrame operations are directly available in SparkSession. Operations related to table and database metadata are now encapsulated in a Catalog (via the catalog accessor).

// Spark 1.6
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sparkConf = new SparkConf()
val sc = new SparkContext(sparkConf)
val sqlContext = new SQLContext(sc)
val df = sqlContext.read.json("data.json")
val tables = sqlContext.tables()
 
// Spark 2.x
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.getOrCreate()
val df = spark.read.json("data.json")
val tables = spark.catalog.listTables()
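
For a fuller view of the Catalog accessor mentioned above, the following sketch (the file and view names are hypothetical) registers a temporary view and then inspects it through spark.catalog.

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.getOrCreate()

// Hypothetical input file; any DataFrame can be registered the same way
val df = spark.read.json("data.json")
df.createOrReplaceTempView("people")

// Table and database metadata now live on the Catalog
spark.catalog.listTables().show()             // includes the "people" temp view
spark.catalog.listColumns("people").show()    // column-level metadata
println(spark.catalog.currentDatabase)        // current database name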


HiveContext (Spark 1.6 / Spark 2.x)

The HiveContext is completely superseded by SparkSession. You will need to enable Hive support when you create your SparkSession and include the necessary Hive library dependencies in your classpath.

// Spark 1.6
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val sparkConf = new SparkConf()
val sc = new SparkContext(sparkConf)
val hiveContext = new HiveContext(sc)
val df = hiveContext.sql("SELECT * FROM hiveTable")
 
// Spark 2.x
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.enableHiveSupport().getOrCreate()
val df = spark.sql("SELECT * FROM hiveTable")
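
As a hedged follow-up, the sketch below (the table name and warehouse path are assumptions) shows that a Hive-enabled SparkSession can also take Hive-related configuration through the builder and persist a DataFrame as a Hive table.

import org.apache.spark.sql.SparkSession

// The warehouse directory is an assumed value for this example
val spark = SparkSession.builder
  .enableHiveSupport()
  .config("spark.sql.warehouse.dir", "/tmp/spark-warehouse")
  .getOrCreate()

val df = spark.sql("SELECT * FROM hiveTable")             // query via HiveQL
df.write.mode("overwrite").saveAsTable("hiveTableCopy")   // persist as a Hive table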


ref : https://sparkour.urizone.net/recipes/understanding-sparksession/
