Setting up Java, Python and Spark dev environment on Ubuntu
Introduction
This is a step-by-step guide to setting up a development environment for Java and Python, and installing the supporting tools, on Ubuntu 22.04.1 LTS.
Objective:
We need to set up and install the following:
1. Visual Studio Code
2. Docker Desktop
3. Java8 and Java17 (OpenJDK)
4. JEnv
5. IntelliJ IDEA Community Edition
6. Spark 3.2.0 and Spark 3.3.2, with SPARK_HOME set in the bash profile
7. PyEnv
Pre-requisites
Python 3 comes pre-installed with Ubuntu, and Git may already be present. If Git is not installed, install it from the terminal by running the below command:
sudo apt install git
Verify the installation by checking the Git version:
git --version
Python will be pre-installed by default. Check the version of python in the terminal by running the below command:
python3 --version
To set an alias for python3 as python in your bash profile, follow the below steps:
Open a terminal: Press Ctrl + Alt + T to open a new terminal window.
Navigate to your home directory (optional but recommended): You can skip this step if you are already in your home directory.
cd ~
Open your bash profile file: This file is usually named .bashrc or .bash_aliases.
nano .bashrc
If you prefer to use a different text editor, replace nano with your preferred editor (e.g., vim, gedit, etc.).
Add the alias: Scroll to the end of the file and add the following line:
alias python=python3
Save the changes: Press Ctrl + X, then press Y to confirm saving, and finally press Enter to exit the text editor.
Apply the changes: To make the changes take effect, you can either close and reopen the terminal or run the following command:
source ~/.bashrc
Now the alias python should point to python3, and you can use python to execute Python 3.x scripts in the terminal.
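If you would rather add the alias from the command line than through an editor, the sketch below appends a line to a file only when that exact line is not already there, so running it twice does not duplicate the alias. The helper name add_line_once is hypothetical, and the demonstration writes to a temporary file instead of your real ~/.bashrc.

```shell
# Append a line to a file only if that exact line is not already present.
# add_line_once is a hypothetical helper name, not a standard command.
add_line_once() {
    target_file="$1"
    wanted_line="$2"
    touch "$target_file"
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -Fxq -- "$wanted_line" "$target_file"; then
        printf '%s\n' "$wanted_line" >> "$target_file"
    fi
}

# Demonstration against a throwaway file rather than the real ~/.bashrc.
demo_rc="$(mktemp)"
add_line_once "$demo_rc" 'alias python=python3'
add_line_once "$demo_rc" 'alias python=python3'   # second call changes nothing
alias_count="$(grep -c 'alias python=python3' "$demo_rc")"
echo "alias lines in demo file: $alias_count"
```

To apply it for real, call add_line_once ~/.bashrc 'alias python=python3' and then run source ~/.bashrc. Keep in mind that an alias only applies to interactive shells; scripts that invoke python directly do not see it.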
Visual Studio Code Set Up
Open Ubuntu Software, search for Visual Studio Code, and install it.
Install the extensions you need for Python and Java IntelliSense.
Visual Studio Code is now set up.
Docker Desktop Set Up
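Download the DEB package from the below link: https://docs.docker.com/desktop/install/ubuntu/
Install Docker Engine on Ubuntu using the apt repository by following the instructions on the below page:
https://docs.docker.com/engine/install/ubuntu/
After setting up the repository, install Docker Engine using the install command given on the same page.
Install Docker Desktop from the terminal. Run the commands from the directory that contains the downloaded DEB package, replacing the file name with that of your package:
sudo apt-get update
sudo apt-get install ./docker-desktop-<version>-<arch>.deb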
Open docker desktop from applications menu.
If Docker Desktop fails to open from the applications menu at any point during development, run the below command in a terminal to start it:
systemctl --user force-reload docker-desktop
Docker desktop is now set up.
Install Java8 and Java17
Update the repositories:
sudo apt-get update
Install OpenJDK 8:
sudo apt-get install openjdk-8-jdk
Install OpenJDK 17:
sudo apt-get install openjdk-17-jdk
Install JEnv
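Clone the jenv repository into ~/.jenv:
git clone https://github.com/jenv/jenv.git ~/.jenv
Add jenv to your bash profile. The rest of this guide edits ~/.bashrc, which is the file that Ubuntu terminal windows read, so append the lines there:
echo 'export PATH="$HOME/.jenv/bin:$PATH"' >> ~/.bashrc
echo 'eval "$(jenv init -)"' >> ~/.bashrc
Reload the profile so that the jenv command becomes available:
source ~/.bashrc
Register both JDKs with jenv, replacing the paths below with the actual locations of Java 8 and Java 17 on your system (on Ubuntu, apt installs JDKs under /usr/lib/jvm):
jenv add /Insert/Path/To/Java8/
jenv add /Insert/Path/To/Java17/
List the managed JDKs:
jenv versions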
Configure global version:
jenv global 17
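Finding the JDK paths by hand is the fiddly part of the jenv step. On Ubuntu the apt-installed JDKs normally live under /usr/lib/jvm, so a small helper can list them and pass each one to jenv add. This is a sketch under that assumption; list_jdk_homes is a hypothetical name, and the loop only prints what it would do when jenv is not on the PATH.

```shell
# Print every real directory under a JVM root that contains bin/java.
# list_jdk_homes is a hypothetical helper name.
list_jdk_homes() {
    jvm_root="$1"
    for candidate in "$jvm_root"/*; do
        # Skip symlinks so a JDK reachable under two names is listed once.
        if [ -d "$candidate" ] && [ ! -L "$candidate" ] && [ -x "$candidate/bin/java" ]; then
            printf '%s\n' "$candidate"
        fi
    done
}

# Register each JDK found under /usr/lib/jvm (assumes paths without spaces).
for jdk_home in $(list_jdk_homes /usr/lib/jvm); do
    if command -v jenv >/dev/null 2>&1; then
        jenv add "$jdk_home"
    else
        echo "jenv not on PATH; would run: jenv add $jdk_home"
    fi
done
```

After the loop, jenv versions should list an entry for each JDK that was added.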
Install IntelliJ Idea Community Edition
Search for the available IntelliJ snap packages:
snap find "intellij"
Install the latest community version:
sudo snap install intellij-idea-community --classic
IntelliJ IDEA Community Edition is now installed.
Install Spark 3.2.0 and Spark 3.3.2
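Create a directory named spark in the /opt folder on your computer. Writing to /opt requires root privileges, so use sudo.
Download the Spark 3.2.0 and Spark 3.3.2 tarballs into /opt/spark from the Apache archive: https://archive.apache.org/dist/spark/
Extract each tarball in the same location. The SPARK_HOME paths used later in this guide are /opt/spark/spark-3.2.0 and /opt/spark/spark-3.3.2, so rename the extracted directories to match if they carry a longer name.
Optional: delete the tarball files or move them out of /opt/spark.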
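The download and extraction can also be done from the terminal. The sketch below builds the archive URL for a release and defines an install function that unpacks it straight into /opt/spark/spark-<version>. The function names are hypothetical, and the Hadoop suffixes (hadoop3.2 for 3.2.0, hadoop3 for 3.3.2) are assumptions about the archive file names, so check them against the archive listing before downloading. Only the URL helper runs when the block is executed; install_spark needs wget, network access and sudo, and is left for you to call.

```shell
# Build the archive.apache.org URL for a Spark release.
# $1 = Spark version, $2 = Hadoop suffix used in the archive file name (assumed).
spark_archive_url() {
    printf 'https://archive.apache.org/dist/spark/spark-%s/spark-%s-bin-%s.tgz\n' "$1" "$1" "$2"
}

# Download one release and unpack it as /opt/spark/spark-<version>.
# Defined here but not called: it needs wget, network access and sudo.
install_spark() {
    spark_version="$1"
    spark_url="$(spark_archive_url "$1" "$2")"
    sudo mkdir -p "/opt/spark/spark-$spark_version"
    wget -O "/tmp/spark-$spark_version.tgz" "$spark_url" || return 1
    # --strip-components=1 drops the top-level directory inside the archive,
    # so the files land directly in /opt/spark/spark-<version>.
    sudo tar -xzf "/tmp/spark-$spark_version.tgz" \
        -C "/opt/spark/spark-$spark_version" --strip-components=1
}

# Show the URLs that would be downloaded.
spark_archive_url 3.2.0 hadoop3.2
spark_archive_url 3.3.2 hadoop3
```

Calling install_spark 3.2.0 hadoop3.2 followed by install_spark 3.3.2 hadoop3 would then leave both versions under /opt/spark with short directory names, and the tarballs in /tmp rather than in /opt.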
Set spark home in bash profile
Open a terminal and navigate to your home directory using the cd command:
cd ~
Open the .bashrc file using a text editor like nano or vim:
nano .bashrc
Add the following lines to the end of the .bashrc file:
# Set Spark home for version 3.2.0
export SPARK_HOME=/opt/spark/spark-3.2.0
export PATH=$PATH:$SPARK_HOME/bin
# To use version 3.3.2 instead, comment out the two export lines above and uncomment these
# export SPARK_HOME=/opt/spark/spark-3.3.2
# export PATH=$PATH:$SPARK_HOME/bin
Save the changes and exit the text editor.
To apply the changes, either close and reopen the terminal, or run the following command to reload the .bashrc file:
source ~/.bashrc
You can now verify that Spark home is set correctly by checking the value of the SPARK_HOME environment variable:
echo $SPARK_HOME
This should display the path to your desired Spark version (e.g., /opt/spark/spark-3.2.0 or /opt/spark/spark-3.3.2).
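Editing .bashrc every time you want the other Spark version gets tedious. One possible alternative is a small shell function that repoints SPARK_HOME and swaps the matching bin directory on the PATH. This is a sketch, assuming both versions sit under /opt/spark as spark-<version>; use_spark is a hypothetical name.

```shell
# Point SPARK_HOME at /opt/spark/spark-<version> and put its bin on PATH.
# use_spark is a hypothetical helper; the /opt/spark layout is assumed.
use_spark() {
    new_spark_home="/opt/spark/spark-$1"
    cleaned_path=""
    saved_ifs="$IFS"
    IFS=":"
    for path_entry in $PATH; do
        case "$path_entry" in
            /opt/spark/spark-*/bin) ;;   # drop any earlier Spark bin entry
            *) cleaned_path="${cleaned_path:+$cleaned_path:}$path_entry" ;;
        esac
    done
    IFS="$saved_ifs"
    export SPARK_HOME="$new_spark_home"
    export PATH="${cleaned_path:+$cleaned_path:}$SPARK_HOME/bin"
}

use_spark 3.2.0
use_spark 3.3.2
echo "$SPARK_HOME"   # prints /opt/spark/spark-3.3.2
```

If you put the function in .bashrc, you can switch versions in any terminal with use_spark 3.2.0 or use_spark 3.3.2, and only one Spark bin directory is ever on the PATH at a time.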
Install PyEnv
Navigate to the following GitHub repository - https://github.com/zaemiel/ubuntu-pyenv-installer
Install curl from terminal:
sudo apt install curl
To install Pyenv on your Ubuntu-based distro just execute this command in shell:
bash <(curl -sSL https://raw.githubusercontent.com/zaemiel/ubuntu-pyenv-installer/master/ubuntu-pyenv-installer.sh)
When the installer shows its menu, type 3 and press Enter.
Run pyenv in the terminal to verify the installation:
pyenv
Run pyenv versions to list the Python versions that pyenv manages:
pyenv versions
Verify the Python version that is now active:
python --version
PyEnv is now set up in your system.
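With every tool installed, a quick check that each command is reachable can catch a missed step. The sketch below reports where each command resolves, or that it is missing. report_tool is a hypothetical helper, and the command names in the list are assumptions about how each tool is exposed on the PATH (for example code for Visual Studio Code and spark-submit for Spark).

```shell
# Report where a command resolves, or that it is missing (non-zero status).
# report_tool is a hypothetical helper name.
report_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        printf '%s: %s\n' "$1" "$(command -v "$1")"
    else
        printf '%s: not found\n' "$1"
        return 1
    fi
}

# Command names below are assumed; adjust them to match your installation.
missing_count=0
for tool in git python3 code docker java jenv intellij-idea-community spark-submit pyenv; do
    report_tool "$tool" || missing_count=$((missing_count + 1))
done
echo "tools not found: $missing_count"
```

Paste this into an interactive terminal rather than running it as a script, because jenv, pyenv and the Spark bin directory are set up in ~/.bashrc, which non-interactive scripts do not read.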