Spark Notes

PySpark

Steps:

  • bin/pyspark calls org.apache.spark.launcher.Main, which builds the command that launches Python. It also sets shell.py as the Python startup script, so the interactive session starts Spark when it opens.

  • context.py calls java_gateway.py, which launches the JVM side of Spark by invoking bin/spark-submit.

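  • spark-submit launches PythonGatewayServer (org.apache.spark.api.python.PythonGatewayServer), which starts the Py4J gateway server inside the JVM.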

  • The Python side creates SparkConf and SparkContext through the gateway.
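
The way the Python side starts the JVM can be sketched as below. This is a simplified illustration, not the actual code in java_gateway.py: the helper name build_gateway_command is hypothetical, and the sketch assumes the gateway is started by running bin/spark-submit with --conf pairs followed by the pyspark-shell resource (or whatever PYSPARK_SUBMIT_ARGS supplies).

```python
import os
import shlex


def build_gateway_command(spark_home, conf=None, env=None):
    """Hypothetical helper: assemble the spark-submit command line
    that would start the JVM side of a PySpark session."""
    env = os.environ if env is None else env

    # The launcher script lives under SPARK_HOME/bin.
    command = [os.path.join(spark_home, "bin", "spark-submit")]

    # Each configuration entry is passed as a "--conf key=value" pair.
    for key, value in (conf or {}).items():
        command += ["--conf", f"{key}={value}"]

    # Assumption: "pyspark-shell" is used when PYSPARK_SUBMIT_ARGS is unset.
    submit_args = env.get("PYSPARK_SUBMIT_ARGS", "pyspark-shell")
    return command + shlex.split(submit_args)


if __name__ == "__main__":
    # "/opt/spark" is a placeholder path used only for illustration.
    print(build_gateway_command("/opt/spark", {"spark.app.name": "demo"}, env={}))
```

The sketch only builds the argument list; a real launcher would pass it to subprocess.Popen and then connect to the port reported by the JVM.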

bin/pyspark

export PYTHONSTARTUP="${SPARK_HOME}/python/pyspark/shell.py"
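
To illustrate the effect of that export, here is a minimal sketch in Python. The helper name pyspark_shell_env is hypothetical; it assumes only what the line above shows, namely that PYTHONSTARTUP points at python/pyspark/shell.py under SPARK_HOME. Python executes the file named by PYTHONSTARTUP only when it starts in interactive mode, which is why this affects the shell and not scripts.

```python
import os


def pyspark_shell_env(spark_home, base_env=None):
    """Hypothetical helper: return a copy of the environment with the
    variables an interactive PySpark shell would rely on."""
    env = dict(os.environ if base_env is None else base_env)
    env["SPARK_HOME"] = spark_home
    # An interactive Python interpreter runs this file before the first prompt.
    env["PYTHONSTARTUP"] = os.path.join(spark_home, "python", "pyspark", "shell.py")
    return env


if __name__ == "__main__":
    # "/opt/spark" is a placeholder path used only for illustration.
    print(pyspark_shell_env("/opt/spark", base_env={})["PYTHONSTARTUP"])
```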

Last updated 3 years ago