To create a sqlContext, we first need to import SQLContext:
from pyspark.sql import SQLContext
Create the SQL context from the SparkContext; from here on we are working in the SQL domain.
sqlContext=SQLContext(sc)
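Here sc is the SparkContext. In the interactive pyspark shell it already exists; in a standalone script you create it yourself. A minimal sketch assuming Spark 1.x and a local master (the app name is just illustrative):
from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext
# Only needed in a standalone script; the pyspark shell provides sc for you.
conf = SparkConf().setAppName("people-sql-demo").setMaster("local[*]")
sc = SparkContext(conf=conf)
sqlContext = SQLContext(sc)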
Load a JSON file describing a bunch of people.
The people.json file looks like this:
{"name":"Michael"}
{"name":"Andy", "age":30}
{"name":"Justin", "age":19}
users=sqlContext.jsonFile("people.json")
Register users as a temporary table so it can be queried with SQL.
users.registerTempTable("users")
Select the name and age of users who are over 21. Nothing happens yet because evaluation is lazy.
over21=sqlContext.sql("SELECT name, age FROM users WHERE age >21")
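To confirm that nothing has run yet, you can inspect the query plan instead of executing it; a sketch assuming Spark 1.3+, where sql() returns a DataFrame with an explain() method:
# Prints the logical and physical plan without launching a job.
over21.explain()
# Only an action such as collect(), count(), or show() triggers execution.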
Collect the over21 results. It returns a single person: Andy, age 30.
over21.collect()
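The same query can also be written with the DataFrame API instead of a SQL string; a sketch assuming Spark 1.3+ (in older releases jsonFile returns a SchemaRDD and the syntax differs slightly):
# Equivalent to the SQL query above; collect() returns a list of Row objects,
# here a single row for Andy, age 30.
over21_df = users.filter(users.age > 21).select("name", "age")
over21_df.collect()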
Spark Web UI
http://localhost:4040

The user interface is similar to the one for traditional MapReduce. The list of jobs appears here. By drilling in, you get progressively more detail; for example, you can see how long each node takes to execute its work.