Spark SQL on a JSON dataset and the Spark Web UI

To use Spark SQL from PySpark, first import the SQLContext class.


from pyspark.sql import SQLContext

Create the SQL context from the existing SparkContext (sc). From this point on we are working in the SQL domain.


sqlContext=SQLContext(sc)

The input is a JSON file describing a handful of people.
The people.json file looks like this:

{"name":"Michael"}
{"name":"Andy", "age":30}
{"name":"Justin", "age":19}
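Note that the file is not a single JSON array: each line is a separate, self-contained JSON object, and a field such as age may be missing from a record. A minimal sketch for creating such a file with Python's standard json module is shown here. The helper name write_json_lines is only illustrative and is not part of Spark.

```python
import json

# Sample records from the text; Michael deliberately has no "age" field.
people = [
    {"name": "Michael"},
    {"name": "Andy", "age": 30},
    {"name": "Justin", "age": 19},
]

def write_json_lines(records, path):
    """Write one JSON object per line (hypothetical helper, not a Spark API)."""
    with open(path, "w") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")

# Creates people.json in the current working directory.
write_json_lines(people, "people.json")
```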


users=sqlContext.jsonFile("people.json")

Register the loaded data as a temporary table named users so that it can be queried with SQL.


users.registerTempTable("users")

Select the name and age of the users who are older than 21. Nothing is executed yet, because Spark evaluates the query lazily.

over21=sqlContext.sql("SELECT name, age FROM users WHERE age >21")
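It helps to reason about which rows this WHERE clause keeps before running it on a cluster. Michael has no age, so his age is NULL, and NULL > 21 is not true in SQL. The sketch below runs the same query text against the same three records using Python's built-in sqlite3 module. SQLite is only a stand-in here to illustrate the SQL semantics; it is not Spark SQL, and the table is created by hand rather than inferred from JSON.

```python
import sqlite3

# In-memory SQLite database standing in for the Spark temp table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users (name, age) VALUES (?, ?)",
    [("Michael", None), ("Andy", 30), ("Justin", 19)],  # None is stored as NULL
)

# Same query text as above; NULL > 21 is not true, so Michael is filtered out,
# and 19 > 21 is false, so Justin is filtered out as well.
matches = conn.execute("SELECT name, age FROM users WHERE age >21").fetchall()
conn.close()
print(matches)
```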

Collect the over21 results. This triggers the actual computation and returns one person: Andy, age 30.
over21.collect()
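The calls above belong to the Spark 1.x API. jsonFile was deprecated in Spark 1.4 in favor of read.json, and registerTempTable was deprecated in Spark 2.0 in favor of createOrReplaceTempView. On newer releases the same steps are usually written against SparkSession. A rough equivalent, assuming a local PySpark 2.0+ installation and the people.json file described earlier, might look like this. The function name over21_users, the local master setting, and the application name are placeholders chosen for the example.

```python
def over21_users(json_path="people.json"):
    """Load line-delimited JSON and return (name, age) pairs for age > 21."""
    # Imported here so this module still loads where PySpark is not installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[*]")          # assumption: run locally on all cores
        .appName("people-over-21")   # placeholder application name
        .getOrCreate()
    )
    try:
        users = spark.read.json(json_path)       # newer form of sqlContext.jsonFile
        users.createOrReplaceTempView("users")   # newer form of registerTempTable
        # Building the query is lazy; no job runs on this line.
        over21 = spark.sql("SELECT name, age FROM users WHERE age > 21")
        rows = over21.collect()                  # action: this runs the job
        return [(row["name"], row["age"]) for row in rows]
    finally:
        spark.stop()
```

With the sample file, calling over21_users() should return the single pair for Andy.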

Spark Web UI

http://localhost:4040

[Screenshot: Apache Spark Web UI at localhost:4040, jobs list]

The web UI is similar to the job-tracking interface of traditional MapReduce. The list of jobs is shown here. By drilling into a job you get progressively more detail, including how long each node spent executing.
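Besides the HTML pages, Spark 1.4 and later expose the same job information as JSON under /api/v1 on the UI port. The sketch below builds those URLs and fetches them with the standard library. It assumes the default address http://localhost:4040 and a currently running application; the helper names are illustrative only.

```python
import json
from urllib.request import urlopen

# Default UI address of the first application started on this machine (assumption).
BASE_URL = "http://localhost:4040"

def applications_url(base_url=BASE_URL):
    """URL that lists the applications served by this UI."""
    return base_url.rstrip("/") + "/api/v1/applications"

def jobs_url(app_id, base_url=BASE_URL):
    """URL that lists the jobs of one application."""
    return applications_url(base_url) + "/" + app_id + "/jobs"

def fetch_json(url, timeout=5):
    """Fetch and decode a JSON document; raises an error if the UI is not reachable."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

While a Spark shell is running, fetch_json(applications_url()) should return a list of applications, and passing one application's id to jobs_url gives the address of its jobs list.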