
Spark Installation

get spark

wget https://dlcdn.apache.org/spark/spark-3.2.0/spark-3.2.0-bin-hadoop3.2.tgz
tar xzvf spark-3.2.0-bin-hadoop3.2.tgz
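
Note: dlcdn.apache.org generally only serves current releases, so if the 3.2.0 link above stops working, the same tarball should still be available from the Apache archive at https://archive.apache.org/dist/spark/.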

JDK

sudo apt install -y openjdk-8-jdk
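
To confirm the JDK is installed and on the path, check the version (the exact build string will differ on your machine):

java -version
# expect output starting with: openjdk version "1.8.0_..."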

environment variables

vi ~/.bashrc

Paste the following content:

~/.bashrc
# Spark
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
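
The snippet above only sets JAVA_HOME. A common addition (not part of the original steps; adjust SPARK_HOME to wherever you extracted the tarball) is to export SPARK_HOME and put its bin/sbin directories on the PATH, then reload the shell config:

# assumed extraction location under the home directory
export SPARK_HOME=$HOME/spark-3.2.0-bin-hadoop3.2
export PATH=$SPARK_HOME/bin:$SPARK_HOME/sbin:$PATH

source ~/.bashrc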

start master

cd spark-3.2.0-bin-hadoop3.2
./sbin/start-master.sh
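
With default settings, the master's web UI should be reachable at http://localhost:8080, and its log under ./logs/ prints a URL of the form spark://<hostname>:7077. A worker can then be attached to it (replace <hostname> with your machine's name; ports assume the defaults):

./sbin/start-worker.sh spark://<hostname>:7077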

install pyspark

pip install pyspark
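
A quick way to check that the pip package works and can reach the standalone master (spark://<hostname>:7077 is assumed from the previous step; drop --master to run locally instead):

python -c "import pyspark; print(pyspark.__version__)"
pyspark --master spark://<hostname>:7077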

references

https://spark.apache.org/docs/latest/
https://spark.apache.org/downloads.html
https://spark.apache.org/docs/latest/spark-standalone.html