To use a SQLContext, first import the SQLContext class:
from pyspark.sql import SQLContext
Create the SQL context. From here on we are working in the SQL domain.
Load a JSON file describing a bunch of people.
The people JSON file looks like this.
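The file contents are not reproduced in these notes. As a sketch, assuming the file mirrors the people.json sample that ships with the Spark examples (one JSON object per line), it could be created and checked with plain Python as follows. The path people.json and the helpers write_people_file and read_people_file are placeholders introduced here for illustration.

```python
import json

# Sample records, assumed to mirror the people.json file from the Spark examples.
PEOPLE = [
    {"name": "Michael"},            # no age field, so age is null in Spark SQL
    {"name": "Andy", "age": 30},
    {"name": "Justin", "age": 19},
]


def write_people_file(path="people.json"):
    """Write one JSON object per line (JSON Lines layout)."""
    with open(path, "w") as f:
        for person in PEOPLE:
            f.write(json.dumps(person) + "\n")
    return path


def read_people_file(path="people.json"):
    """Read the file back, parsing each non-empty line as a JSON object."""
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]
```

Each record sits on its own line rather than inside one enclosing JSON array, which is the layout Spark's JSON reader expects by default.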
Register the loaded data as a table named users.
Select the name and age of everyone in the users table who is over 21. Nothing happens yet, because the query is evaluated lazily.
over21 = sqlContext.sql("SELECT name, age FROM users WHERE age > 21")
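The lazy behaviour can be pictured with a plain-Python generator. This is only an analogy and not how Spark works internally; the three sample records are assumed to match the people.json sample from the Spark examples, and is_over_21 and the evaluated list are names made up for this sketch.

```python
people = [
    {"name": "Michael"},
    {"name": "Andy", "age": 30},
    {"name": "Justin", "age": 19},
]

evaluated = []  # records which rows the filter has actually looked at


def is_over_21(person):
    evaluated.append(person["name"])
    age = person.get("age")
    # Like SQL, treat a missing (null) age as not matching the condition.
    return age is not None and age > 21


# Building the generator runs no filtering yet, much like defining the query.
over21_lazy = ((p["name"], p["age"]) for p in people if is_over_21(p))
rows_seen_before = list(evaluated)  # snapshot taken before any row is consumed

# Consuming the generator forces the work, much like an action in Spark.
result = list(over21_lazy)
```

Here rows_seen_before stays empty because the filter has not run when the snapshot is taken. The record without an age is excluded rather than causing an error, which mirrors how a null age fails the WHERE condition.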
Collect the over21 rows. The result contains one person: Andy, age 30.
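Put together, the steps in this walkthrough might look like the sketch below. It assumes sql_context is a pyspark.sql.SQLContext built from an existing SparkContext, and that json_path points at the people file; both names, and the function find_over21 itself, are placeholders rather than code from the original notes. It uses read.json (available from Spark 1.4) and registerTempTable (deprecated since Spark 2.0 in favour of createOrReplaceTempView), so adjust to the Spark version in use.

```python
def find_over21(sql_context, json_path):
    """Run the over-21 query against a JSON file of people.

    sql_context is expected to be a pyspark.sql.SQLContext;
    json_path is a placeholder for wherever the people file lives.
    """
    # Load the JSON file (one object per line) into a DataFrame.
    users = sql_context.read.json(json_path)
    # Expose the DataFrame to SQL under the table name "users".
    users.registerTempTable("users")
    # Defining the query is lazy: no job runs yet.
    over21 = sql_context.sql("SELECT name, age FROM users WHERE age > 21")
    # collect() triggers the job and returns the rows to the driver.
    return over21.collect()
```

With a running SparkContext sc, a call might look like find_over21(SQLContext(sc), "people.json"), where the file name is again only an example.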
Spark Web UI
The web UI is similar to the job interface of traditional MapReduce. The list of jobs appears here. Drilling in gives progressively more detail, including how long each node spends executing.