The problem here is that PySpark requires Java 8 for some functions. Spark 2.2.1 had problems with Java 9 and beyond, so the recommended solution is to install Java 8.
You can install Java 8 specifically, set it as your default Java, and try again.
To install Java 8:
sudo apt install openjdk-8-jdk
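If you only need Java 8 for PySpark and would rather not change the system-wide default yet, one option is to point just the current shell at it. This is a minimal sketch, assuming the usual OpenJDK 8 install path on amd64 (the same path used later in this answer):
# use Java 8 in this shell only; the system default stays untouched
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH="$JAVA_HOME/bin:$PATH"
java -version    # should now report 1.8.x in this shell
pyspark          # PySpark picks up the JVM through JAVA_HOME
If that works, you can make the change permanent with the steps below.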
To change the default Java version, first list all the available Java versions with:
update-java-alternatives --list
Then set the one you want as the default by running:
sudo update-alternatives --config java
and enter the number of the Java version you want from the list it shows. Then check your Java version with java -version
and it should be updated. You also need to set the JAVA_HOME variable.
To set JAVA_HOME, you must find the folder of the specific Java version. Follow this SO discussion to get a full idea of setting the JAVA_HOME variable. Since we are going to use Java 8, our folder path is /usr/lib/jvm/java-8-openjdk-amd64/
. Just go to the /usr/lib/jvm
folder and check which folders are available there. Use ls -l
to see the folders and their soft links, since some of these folders can be shortcuts to particular Java versions.
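If you are not sure which folder the default java command really points to, you can also resolve the whole soft-link chain in one go. A small sketch, assuming GNU readlink is available:
# resolve every symlink down to the real java binary
readlink -f "$(which java)"
# for OpenJDK 8 this typically ends in .../java-8-openjdk-amd64/jre/bin/java;
# the folder before /jre/bin (or /bin) is what JAVA_HOME should point to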
Then go to your home directory and edit the .bashrc file:
cd ~
gedit .bashrc
Then add the below lines to the file, save, and exit:
## SETTING JAVA HOME
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=$PATH:$JAVA_HOME/bin
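If you are on a machine without a GUI, so gedit is not available, you can append the same lines from the terminal instead; a sketch:
# append the JAVA_HOME block to .bashrc (quoted EOF keeps $PATH literal)
cat >> ~/.bashrc <<'EOF'
## SETTING JAVA HOME
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=$PATH:$JAVA_HOME/bin
EOF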
After that, for your changes to take effect, run the following in the terminal:
source ~/.bashrc
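To confirm everything is wired up, you can do a quick check before starting Spark again; a sketch:
echo "$JAVA_HOME"    # should print /usr/lib/jvm/java-8-openjdk-amd64
java -version        # should report version 1.8.x
pyspark              # the shell should now start without the Java 9+ errors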