How to Use Docker for a Data Science Project
Step 1: How to install Docker on Windows, Mac, and Linux
Before you can use Docker as the environment for your data science project, you need to install it.
How you install Docker depends on your operating system.
If you are on Mac or Linux, follow these links to install Docker.
If you are on Windows 10 Pro, your computer needs to meet specific requirements before you can install Docker, and you need to enable some settings, such as Hyper-V. Don’t worry, I’ve got you covered. Follow these instructions on how to install Docker on Windows 10 Pro.
If you are using Windows 10 Home, the requirements are different from those for Windows 10 Pro. You will need to install Windows Subsystem for Linux 2 (WSL 2) and enable some Windows settings, such as the Virtual Machine Platform feature, before you can download and use Docker. Once again, don’t worry, I’ve got you covered. Go here to see my tutorial on how to install Docker on Windows 10 Home.
After installing Docker, you are ready to start using it on your local computer for data science projects.
The first time you open the Docker app, it walks you through some introductory instructions from the Docker documentation. Feel free to use them to help you get started.
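If you want a quick way to confirm the install worked, here is a minimal sketch in Python. It only checks that an executable is on your PATH and asks it for its version. The function name `tool_version` is made up for this example and is not part of Docker or the template repo.

```python
import shutil
import subprocess


def tool_version(executable="docker"):
    """Return the tool's version line, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    # `docker --version` prints one line and does not need the daemon running
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip() or None


if __name__ == "__main__":
    version = tool_version("docker")
    print(version if version else "docker was not found on PATH")
```

You can pass "docker-compose" instead of "docker" to check the compose tool the same way.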
Step 2: Clone an Existing GitHub Repo
To make this process easy for you and me, I went ahead and created a TEMPLATE repo. This repo has the Docker files to help you get started. You can just replace the APP folder with your own app as you build. Here is the template repo link again: https://github.com/EvidenceN/docker-tutorial
There are 3 ways for you to get this Template Repo
- You can click the “Use This Template” button to create your own repo from the template, then clone the new repo to your local computer.
- You can fork the repo, then clone your fork to your local computer. As a reminder, FORK the repo BEFORE cloning it, then run:
git clone https://github.com/YOUR-GITHUB-USERNAME/YOUR-REPO-NAME.git
cd YOUR-REPO-NAME
- As an alternative to cloning, you can download the zip file containing the template.
After getting your copy of the template repo, you are ready to start building your first docker image.
Step 3: Build docker image using docker compose
IMPORTANT: Before you run any docker command on your terminal, make sure you open the docker app and wait for the docker icon to stabilize. If you run any docker command below without the docker app running first, then it will throw an error.
Make sure you navigate to the folder where your repo is located before following the instructions below.
Build the image from the Dockerfile in the template repo with the following command.
docker-compose build
You won’t need to run this command again unless you update the requirements.txt file or the Dockerfile.
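One way to keep track of whether a rebuild is due is to fingerprint those two files. The sketch below is only an illustration: the marker file name `.last-build-fingerprint` and the function names are assumptions of mine, not something the template repo or Docker provides.

```python
import hashlib
from pathlib import Path

# The two files whose changes call for a fresh `docker-compose build`
BUILD_INPUTS = ("requirements.txt", "Dockerfile")
STAMP_NAME = ".last-build-fingerprint"  # hypothetical marker file


def build_fingerprint(repo_dir):
    """Hash the names and contents of the build inputs found in repo_dir."""
    digest = hashlib.sha256()
    for name in BUILD_INPUTS:
        path = Path(repo_dir) / name
        digest.update(name.encode())
        if path.exists():
            digest.update(path.read_bytes())
    return digest.hexdigest()


def needs_rebuild(repo_dir):
    """Return True if the inputs changed since the last recorded build."""
    stamp = Path(repo_dir) / STAMP_NAME
    if not stamp.exists():
        return True
    return stamp.read_text().strip() != build_fingerprint(repo_dir)


def record_build(repo_dir):
    """Call this after a successful build to remember the current inputs."""
    (Path(repo_dir) / STAMP_NAME).write_text(build_fingerprint(repo_dir))
```

The idea is to call `needs_rebuild(".")` before starting work, run the build if it returns True, and then call `record_build(".")`.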
Also, as I said earlier, the Docker app provides some introductory lessons after you install it. You may want to take advantage of them.
After building your docker image, you are ready to run your app for the first time.
Step 4: Run docker Image
So, when you clone the repo, you will see that I already have an APP built in. This is just a starter app. Feel free to delete it and add in your own app.
Running the Docker image is how you get the app up and running on a localhost port.
To run the Docker image, type:
docker-compose up
Then go to localhost:8000. If it’s not showing up, try localhost on its own and it might be there. You should see the starter app’s page.
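If you would rather check from a script than from the browser, something like the following could work. It is a sketch that assumes the app answers plain HTTP on one of the two addresses mentioned above. The names `is_up` and `first_reachable` are made up for this example.

```python
from http.client import HTTPException
from urllib.request import urlopen

# Addresses from the step above: port 8000 first, then plain localhost
CANDIDATES = ("http://localhost:8000", "http://localhost")


def is_up(url, timeout=3):
    """Return True if the URL answers with a non-error HTTP response."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (OSError, HTTPException):
        # Connection refused, timeouts, and HTTP error codes all land here
        return False


def first_reachable(urls=CANDIDATES, probe=is_up):
    """Return the first URL for which probe(url) is true, else None."""
    for url in urls:
        if probe(url):
            return url
    return None


if __name__ == "__main__":
    found = first_reachable()
    print(f"App is answering at {found}" if found else "No response yet")
```

Run it in a second terminal while docker-compose up is still running in the first one.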

Step 5: How to add new libraries to docker environment
That was just the starter tutorial to get you up and running with docker. Now, what if you want to add new python packages/libraries to your docker environment, how do you do that?
To add new packages like scipy, scikit-learn, or numpy, go to the requirements.txt file, which is included in this template repo.
Just add the libraries you want, along with the versions you want. The version number is optional; you can list a library without specifying a version.
Each line of a requirements.txt file names one library, optionally followed by == and a version number.
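To make the pinned and unpinned forms concrete, here is a small Python sketch that reads a few lines in that format. The sample contents and version numbers are placeholders I picked for illustration, not the actual contents of the template repo, and the parser only handles the plain `name` and `name==version` forms.

```python
def parse_requirement(line):
    """Split one requirements.txt line into (package, version or None)."""
    text = line.split("#", 1)[0].strip()  # drop comments and whitespace
    if not text:
        return None
    name, sep, version = text.partition("==")
    return name.strip(), (version.strip() if sep else None)


# Hypothetical file contents; the version numbers are placeholders
example = """
numpy
scipy==1.5.2
scikit-learn==0.23.2  # pinned to an exact version
"""

for raw in example.splitlines():
    parsed = parse_requirement(raw)
    if parsed:
        package, version = parsed
        print(package, "->", version or "no version pinned")
```

A line without a version lets pip choose one at build time, so pinning is the safer choice if you need the same environment every time you rebuild.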

After adding new libraries to your requirements.txt file, you need to rebuild your image for the changes to take effect. Run:
docker-compose build
Then
docker-compose up
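If you find yourself repeating those two commands a lot, you could wrap them in a tiny Python script. This is a sketch, assuming docker-compose is on your PATH and you run it from the repo folder. The function name `rebuild_and_run` is just a name I chose.

```python
import subprocess


def rebuild_and_run(run=subprocess.run):
    """Run `docker-compose build`, then `docker-compose up` if it succeeded."""
    steps = (["docker-compose", "build"], ["docker-compose", "up"])
    for argv in steps:
        # check=True raises CalledProcessError, so `up` is skipped
        # whenever the build step fails
        run(argv, check=True)


if __name__ == "__main__":
    try:
        rebuild_and_run()
    except (OSError, subprocess.CalledProcessError) as error:
        print(f"Stopped: {error}")
```

The `run` parameter exists only so the function can be tried out without Docker, by passing in a stand-in that records the commands.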
Step 6: Build Your APP & Deploy to Amazon Web Services (AWS)
After you have your Docker environment up and running, you can focus your time on building your data science app. As always, I won’t leave you hanging. Follow this guide: How to build a data science API using FastAPI
After building your app, or even before you start building it, you may want to deploy it to AWS to host it online. I’ve got you covered here too. Follow these instructions to learn how to Deploy to AWS Elastic Beanstalk using Docker.
Step 7: Docker commands to accomplish different tasks.
If you want to improve your docker skills, these are a few docker commands that will help you out tremendously.
To clean up your docker containers and images, follow the instructions on this blog post on How to Prune unused Docker objects
General Docker commands and their meanings. To use these commands, type docker command, where “command” is the command you want executed.
Docker Commands:
Docker Commands | Command Meaning |
---|---|
attach | Attach local standard input, output, and error streams to a running container |
build | Build an image from a Dockerfile |
commit | Create a new image from a container’s changes |
cp | Copy files/folders between a container and the local filesystem |
create | Create a new container |
diff | Inspect changes to files or directories on a container’s filesystem |
events | Get real time events from the server |
exec | Run a command in a running container |
export | Export a container’s filesystem as a tar archive |
history | Show the history of an image |
images | List images |
import | Import the contents from a tarball to create a filesystem image |
info | Display system-wide information |
inspect | Return low-level information on Docker objects |
kill | Kill one or more running containers |
load | Load an image from a tar archive or STDIN |
login | Log in to a Docker registry |
logout | Log out from a Docker registry |
logs | Fetch the logs of a container |
pause | Pause all processes within one or more containers |
port | List port mappings or a specific mapping for the container |
ps | List containers |
pull | Pull an image or a repository from a registry |
push | Push an image or a repository to a registry |
rename | Rename a container |
restart | Restart one or more containers |
rm | Remove one or more containers |
rmi | Remove one or more images |
run | Run a command in a new container |
save | Save one or more images to a tar archive (streamed to STDOUT by default) |
search | Search the Docker Hub for images |
start | Start one or more stopped containers |
stats | Display a live stream of container(s) resource usage statistics |
stop | Stop one or more running containers |
tag | Create a tag TARGET_IMAGE that refers to SOURCE_IMAGE |
top | Display the running processes of a container |
unpause | Unpause all processes within one or more containers |
update | Update configuration of one or more containers |
version | Show the Docker version information |
wait | Block until one or more containers stop, then print their exit codes |
To get the same list as above in your command line, just type docker help.
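The "docker command" pattern also carries over if you want to call Docker from Python. Here is a minimal sketch. The helper names `docker_argv` and `run_docker` are my own, and the wrapper assumes docker is on your PATH and the Docker app is running.

```python
import subprocess


def docker_argv(command, *args):
    """Build the argument list for `docker <command> [args...]`."""
    return ["docker", command, *args]


def run_docker(command, *args):
    """Run a docker subcommand and return (exit code, captured output)."""
    result = subprocess.run(
        docker_argv(command, *args),
        capture_output=True,
        text=True,
        check=False,  # let the caller decide what a non-zero exit means
    )
    return result.returncode, result.stdout


if __name__ == "__main__":
    # Same as typing `docker ps -a`; the helper only assembles it as a list
    print(docker_argv("ps", "-a"))
```

For example, `run_docker("images")` does the same thing as typing docker images in your terminal.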
In the above tutorial, we used docker compose quite a bit.
The commands below are a few additional docker-compose commands. Again, to use them, type docker-compose command, where “command” is the command you want executed.
Docker-compose Commands:
Docker-compose Command | What the command means |
---|---|
build | Build or rebuild services |
config | Validate and view the Compose file |
create | Create services |
down | Stop and remove containers and networks (plus images with --rmi and volumes with -v) |
events | Receive real time events from containers |
exec | Execute a command in a running container |
help | Get help on a command |
images | List images |
kill | Kill containers |
logs | View output from containers |
pause | Pause services |
port | Print the public port for a port binding |
ps | List containers |
pull | Pull service images |
push | Push service images |
restart | Restart services |
rm | Remove stopped containers |
run | Run a one-off command |
scale | Set number of containers for a service |
start | Start services |
stop | Stop services |
top | Display the running processes |
unpause | Unpause services |
up | Create and start containers |
version | Show version information and quit |
To get more docker-compose help, just type docker-compose --help and you will get the same list as above in your command line.
I hope you liked this tutorial and it helped you.