Description
Apache Spark is a fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs. It also supports a rich set of higher-level tools including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Spark Streaming.
Version | Module name |
---|---|
3.2.0 | spark/3.2.0 |
...
SPARK_MASTER=$(grep "Starting Spark master" "${SPARK_LOG_DIR}/master.err" | cut -d " " -f 9)
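To illustrate what this extraction does, here is a minimal sketch that runs the same grep and cut pipeline on a sample log line. The temporary directory, the timestamp, and the host name `node001` are made up for illustration. The exact layout of `master.err` depends on the Spark logging configuration, so the field number may need adjusting on a real system.

```shell
# Hypothetical log directory and log line, for illustration only.
SPARK_LOG_DIR=$(mktemp -d)
echo "21/11/02 10:15:30 INFO Master: Starting Spark master at spark://node001:7077" \
  > "${SPARK_LOG_DIR}/master.err"

# Same pipeline as above: with this layout the master URL is the
# 9th space-separated field of the matching line.
SPARK_MASTER=$(grep "Starting Spark master" "${SPARK_LOG_DIR}/master.err" | cut -d " " -f 9)
echo "${SPARK_MASTER}"   # prints spark://node001:7077
```

If the master has not finished starting yet, grep finds no match and `SPARK_MASTER` ends up empty, so it is worth checking the value before using it.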
Connect to the master using the Spark interactive shell in Scala:
spark-shell --master ${SPARK_MASTER}
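Because `SPARK_MASTER` is parsed from a log file, it may be empty or malformed if the master is not up yet. The following sketch shows one possible guard before connecting. The helper name `is_spark_url` is hypothetical and not part of Spark, and the actual `spark-shell` call is commented out so the snippet runs without Spark installed.

```shell
# Hypothetical helper: succeeds only if the argument looks like a
# standalone master URL of the form spark://HOST:PORT.
is_spark_url() {
  case "$1" in
    spark://?*:[0-9]*) return 0 ;;
    *) return 1 ;;
  esac
}

if is_spark_url "${SPARK_MASTER:-}"; then
  echo "Connecting to ${SPARK_MASTER}"
  # spark-shell --master "${SPARK_MASTER}"
else
  echo "SPARK_MASTER is empty or malformed; check the master log" >&2
fi
```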
...