How to import org.apache.spark.sql.SQLContext.implicits in Spark 1.6: error “value toDF is not a member of org.apache.spark.rdd.RDD”

Note: If you’re using Spark 2.2, please read this post

I am doing a mini project for my company using Spark/Scala and have been stuck with the error mentioned in the title for a couple of days. Googling that error suggested to import org.apache.spark.sql.SQLContext.implicits, and that’s what I did:

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.sql._
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.SQLContext.implicits
import org.apache.spark.SparkConf
object TestSQLContext {
[…..]
def main(args:Array[String]) {
[…..]
}
}

And that was the start of the problem: my application started to give a new error:

object SQLContext is not a member of package org.apache.spark.sql
[error] Note: class SQLContext exists, but it has no companion object.

The problem is, none of those online posts mention that we need to create an instance of org.apache.spark.sql.SQLContext before being able to use its members and methods. This is the right way to do it:

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.sql._
import org.apache.spark.SparkConf
object Hi {

case class DimC(ID:Int, Name:String, City:String, EffectiveFrom:Int, EffectiveTo:Int)

def main(args:Array[String]) {
val conf = new SparkConf().setAppName(“LoadDW”)
val sc = new SparkContext(conf)
val sqlContext= new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._
val fDimCustomer = sc.textFile(“DimCustomer.txt”)

var dimCustomer1 = fDimCustomer.map(_.split(‘,’)).map(r=>DimC(r(0).toInt,r(1),r(2),r(3).toInt,r(4).toInt)).toDF

dimCustomer1.registerTempTable(“Cust_1”)

val customers = sqlContext.sql(“select * from Cust_1”)

customers.show()
}
}

Hope this post helps and please do not hesitate to ask your questions in comments section.

Cheers.

11 thoughts on “How to import org.apache.spark.sql.SQLContext.implicits in Spark 1.6: error “value toDF is not a member of org.apache.spark.rdd.RDD””

Are you writing your Scala directly on your Spark cluster?

Not sure what you mean, but you can submit your scala code on any machine that is connected to the cluster, via ssh for example.

Great! Your post helps me! It saved my day! Thank you very much!

Thank you for the detailed explanation. This article is really helpful to understand the problem while importing “org.apache.spark.sql.SQLContext.implicits” in Spark. Kudos!