In this article, we will learn how to install deep learning frameworks such as TensorFlow and PyTorch on a machine with an NVIDIA graphics card.
If you have a brand new computer with a graphics card and you don't know which libraries to install to start your deep learning journey, this article will help you.
We will install CUDA, cuDNN, Python 3, TensorFlow, PyTorch, OpenCV, and Dlib, along with other Python machine learning libraries, step by step. Note that if you would like to use TensorFlow with Keras support, there is no need to install the Keras package separately, since from TensorFlow 2.0 onwards Keras ships as the tensorflow.keras submodule.
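As a quick illustration, Keras can be imported straight from the TensorFlow package once TensorFlow 2.x is installed (a minimal check; it assumes you already have a TensorFlow 2.x environment):
# Verify that Keras ships with TensorFlow 2.x -- no separate Keras install needed
python -c "import tensorflow as tf; from tensorflow import keras; print(tf.__version__)"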
We have tested the instructions on a system with the following configuration:
Motherboard : Gigabyte X99P – SLI
RAM : 32 GB
Graphics Card : Zotac GeForce GTX 1080 Ti with 11 GB RAM
We assume a fresh Ubuntu 16.04 installation, i.e. nothing has been installed on the system earlier.
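You can confirm the Ubuntu release you are running before starting:
# Print the distribution and release (should show Ubuntu 16.04)
lsb_release -a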
Step 1 : Install Prerequisites
Before installing anything, let us first update the information about the packages stored on the computer and upgrade the already installed packages to their latest versions.
sudo apt-get update
sudo apt-get upgrade
Next, we will install some basic packages which we might need during the installation process as well as in the future. We also remove packages that are no longer needed.
sudo apt-get install -y build-essential cmake gfortran git pkg-config
sudo apt-get install -y python-dev software-properties-common wget vim
sudo apt-get autoremove
Step 2 : Install CUDA
CUDA (Compute Unified Device Architecture) is a parallel computing platform and API developed by NVIDIA that exposes the parallel computing capabilities of GPUs. In order to use the graphics card, we need to have the CUDA driver installed on our system.
If you do not have an NVIDIA CUDA-supported graphics card, you can skip this step and go directly to Step 4.
Download the CUDA installer from the official NVIDIA website. We recommend downloading the deb (local) version under Installer Type, as shown in the screenshot below.
After downloading the file, go to the folder where you have downloaded the file and run the following commands from the terminal to install the CUDA drivers.
Please make sure that the filename used in the command below is the same as the downloaded file.
sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb
sudo apt-get update
sudo apt-get install -y cuda-8.0
Check whether the driver has been installed successfully by running NVIDIA's System Management Interface (nvidia-smi), a tool used for monitoring the state of the GPU.
nvidia-smi
You should get an output as shown below.
As a side note, I found that apart from getting better display resolution options, installing the CUDA driver lowered the power consumption of the graphics card from 71 W to 16 W for an NVIDIA GTX 1080 Ti GPU attached via PCIe x16.
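If you also want to confirm which version of the CUDA toolkit was installed, you can query the CUDA compiler directly. The toolkit is typically placed under /usr/local/cuda-8.0 with a /usr/local/cuda symlink; since we add its bin directory to the PATH only in the next step, use the full path for now.
# Print the CUDA compiler (nvcc) version
/usr/local/cuda/bin/nvcc --version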
Step 3 : Install cuDNN
CUDA Deep Neural Network (cuDNN) is a library of optimized primitives for deep neural network computations, built on top of the CUDA API.
Go to the official cuDNN website and fill out the form to download the cuDNN library. After you get to the download page (sample shown below), download the "cuDNN v6.0 Library for Linux" from the options.
Now, go to the folder where you have downloaded the “.tgz” file and from the command line execute the following.
tar xvf cudnn-8.0-linux-x64-v6.0.tgz
sudo cp -P cuda/lib64/* /usr/local/cuda/lib64/
sudo cp cuda/include/* /usr/local/cuda/include/
Next, update the paths for CUDA library and executables:
echo 'export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"' >> ~/.bashrc
echo 'export CUDA_HOME=/usr/local/cuda' >> ~/.bashrc
echo 'export PATH="/usr/local/cuda/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc
This should get everything sorted out with respect to CUDA and cuDNN.
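As a quick sanity check, you can read the cuDNN version back from the header file we just copied into the CUDA include directory:
# Print the cuDNN version macros (should report major version 6)
cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2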
Step 4 : Install requirements for DL Frameworks
Install dependencies of Deep Learning Frameworks:
sudo apt-get update
sudo apt-get install -y libprotobuf-dev libleveldb-dev libsnappy-dev libhdf5-serial-dev protobuf-compiler libopencv-dev
NOTE : If you get a warning saying the following:
/usr/lib/nvidia-375/libEGL.so.1 not a symbolic link
Then execute the below commands:
sudo mv /usr/lib/nvidia-375/libEGL.so.1 /usr/lib/nvidia-375/libEGL.so.1.org
sudo mv /usr/lib32/nvidia-375/libEGL.so.1 /usr/lib32/nvidia-375/libEGL.so.1.org
sudo ln -s /usr/lib/nvidia-375/libEGL.so.375.82 /usr/lib/nvidia-375/libEGL.so.1
sudo ln -s /usr/lib32/nvidia-375/libEGL.so.375.82 /usr/lib32/nvidia-375/libEGL.so.1
Next, we install Python 3 along with other important packages like Boost, LMDB, glog, BLAS, etc.
sudo apt-get install -y --no-install-recommends libboost-all-dev doxygen
sudo apt-get install -y libgflags-dev libgoogle-glog-dev liblmdb-dev libblas-dev
sudo apt-get install -y libatlas-base-dev libopenblas-dev libgphoto2-dev libeigen3-dev libhdf5-dev
sudo apt-get install -y python3-dev python3-pip python3-nose python3-numpy python3-scipy
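At this point it is worth confirming that the Python 3 interpreter and pip are in place:
# Both commands should print a version number
python3 --version
pip3 --version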
Step 5 : Enable Virtual Environments
Most of us work on different projects and like to keep the settings for these projects separate too. This can be done using virtual environments in Python. In a virtual environment, you can install any Python library without affecting the global installation or other virtual environments. This way, even if you damage the libraries in one virtual environment, the rest of your projects remain safe. It is highly recommended to use virtual environments.
Install the virtual environment wrapper, which enables us to create and work on virtual environments in Python. Basic usage is summarized after the setup commands below.
sudo pip3 install virtualenv virtualenvwrapper
echo "# Virtual Environment Wrapper" >> ~/.bashrc
echo "source /usr/local/bin/virtualenvwrapper.sh" >> ~/.bashrc
source ~/.bashrc
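Once the wrapper is sourced, a handful of commands cover everyday usage. These are standard virtualenvwrapper commands; the environment name test-env below is just an example:
mkvirtualenv test-env -p python3   # create a new Python 3 environment
workon test-env                    # activate it
lsvirtualenv -b                    # list all environments (brief output)
deactivate                         # leave the active environment
rmvirtualenv test-env              # delete it when no longer needed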
Step 6 : Install Deep Learning frameworks
Now, we install TensorFlow, PyTorch, and Dlib along with other standard Python ML libraries like NumPy, SciPy, scikit-learn, etc.
We will create virtual environments and install all the deep learning frameworks inside them. We create a separate environment for Python 3:
# create a virtual environment for python 3
mkvirtualenv virtual-py3 -p python3
# Activate the virtual environment
workon virtual-py3
pip install numpy scipy matplotlib scikit-image scikit-learn ipython protobuf jupyter
# If you do not have CUDA installed
pip install tensorflow
# If you have CUDA installed
pip install tensorflow-gpu
pip install torch
pip install dlib
deactivate
Check Installation of Frameworks
workon virtual-py3
python
import numpy
numpy.__version__
import tensorflow
tensorflow.__version__
import torch
torch.__version__
import cv2
cv2.__version__
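You can also confirm from the same Python session that both frameworks actually see the GPU. These calls exist in the TensorFlow 1.x and PyTorch releases current at the time of writing; skip them on a CPU-only install:
# Both calls should return True on a working CUDA setup
import tensorflow
tensorflow.test.is_gpu_available()
import torch
torch.cuda.is_available()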
If you want to install OpenCV 3.3, follow along.
Step 7 : Install OpenCV 3.3
First we will install the dependencies:
sudo apt-get remove x264 libx264-dev
sudo apt-get install -y checkinstall yasm
sudo apt-get install -y libjpeg8-dev libjasper-dev libpng12-dev
# If you are using Ubuntu 16.04
sudo apt-get install -y libtiff5-dev
sudo apt-get install -y libavcodec-dev libavformat-dev libswscale-dev libdc1394-22-dev
sudo apt-get install -y libxine2-dev libv4l-dev
sudo apt-get install -y libgstreamer0.10-dev libgstreamer-plugins-base0.10-dev
sudo apt-get install -y libqt4-dev libgtk2.0-dev libtbb-dev
sudo apt-get install -y libfaac-dev libmp3lame-dev libtheora-dev
sudo apt-get install -y libvorbis-dev libxvidcore-dev
sudo apt-get install -y libopencore-amrnb-dev libopencore-amrwb-dev
sudo apt-get install -y x264 v4l-utils
Download OpenCV and OpenCV-contrib
git clone https://github.com/opencv/opencv.git
cd opencv
git checkout 3.3.0
cd ..
git clone https://github.com/opencv/opencv_contrib.git
cd opencv_contrib
git checkout 3.3.0
cd ..
Configure and generate the Makefile
cd opencv
mkdir build
cd build
# Remove the line WITH_CUDA=ON if you don't have CUDA on your system
cmake -D CMAKE_BUILD_TYPE=RELEASE \
-D CMAKE_INSTALL_PREFIX=/usr/local \
-D INSTALL_C_EXAMPLES=ON \
-D INSTALL_PYTHON_EXAMPLES=ON \
-D WITH_TBB=ON \
-D WITH_V4L=ON \
-D WITH_QT=ON \
-D WITH_OPENGL=ON \
-D WITH_CUDA=ON \
-D OPENCV_EXTRA_MODULES_PATH=../../opencv_contrib/modules \
-D BUILD_EXAMPLES=ON ..
Compile and Install
NOTE : The make operation takes quite a long time, almost an hour using 12 cores on an i7 processor. It might also appear stuck for a while at some places, but don't worry unless it makes no progress for more than an hour.
make -j4
sudo make install
sudo sh -c 'echo "/usr/local/lib" >> /etc/ld.so.conf.d/opencv.conf'
sudo ldconfig
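If the build and install went through, pkg-config should now be able to report the installed OpenCV version (OpenCV 3.x installs an opencv.pc file under the prefix we used above):
# Should print 3.3.0
pkg-config --modversion opencv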
Link OpenCV to your virtual environments
# Check the opencv .so was created
find /usr/local/lib/ -type f -name "cv2*.so"
It should give an output similar to the one shown below:
/usr/local/lib/python3.5/dist-packages/cv2.cpython-35m-x86_64-linux-gnu.so
This is the location of OpenCV's Python runtime library file (cv2.so). We need to create a symlink to this file from our virtual environment in order to use OpenCV inside it without reinstalling OpenCV.
Note the exact path of the cv2.so file. On my system it is located in dist-packages, but on most systems it is located in the site-packages directory.
The creation of symlinks is done as follows:
cd ~/.virtualenvs/virtual-py3/lib/python3.5/site-packages
ln -s /usr/local/lib/python3.5/dist-packages/cv2.cpython-35m-x86_64-linux-gnu.so cv2.so
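It is worth verifying that the symlink resolves to the actual library file; if the target shows up as broken, recreate the link using the exact path printed by the find command above.
# The link target should be the full path to the cv2.cpython-35m-...so file
ls -l cv2.so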
Check OpenCV Installation
workon virtual-py3
python
import cv2
cv2.__version__
As output, you should get the corresponding OpenCV version (3.3.0).
What next?
Check out our next posts on Keras Basics and Feedforward Neural Networks Basics. More posts on Deep Learning to follow. Stay Tuned!
Comments
Great post! In case this is useful, here are two gists that describe a very similar install process: https://gist.github.com/filitchp/5645d5eebfefe374218fa2cbf89189aa#file-opencv-3-1-ubuntu-16-04-cuda-8-md
https://gist.github.com/filitchp/a578fefc23e38db436389dd4932d6239#file-tensorflow-1-0-ubuntu-16-04-cuda-8-0-cudnn-5-1-md
I ran into some driver and motherboard configuration issues that I mention in my guides.
Thanks for the links, Paul.
This looks great! Can you please suggest some Linux laptops which would be appropriate for this? I've been googling around and it isn't absolutely clear which laptops are best for CUDA.
The answer is : It depends on your budget!
However, I do not recommend using a laptop for deep learning, for the following reasons:
1. Training takes a lot of time and you cannot travel while training is going on. This takes away the mobility advantage of a laptop.
2. Laptop GPUs are inferior to their desktop counterparts.
3. The GPU in a laptop is not upgradable.
4. Laptops don't handle the heat produced while training very well.
You might want to read this
https://www.quora.com/What-laptop-is-best-for-deep-learning-experiments
It might be better to use GCP or AWS if you plan to do deep learning training only once in a while, or to buy a desktop which you can customize to your budget.
Still, if you really want to invest in a laptop with a GeForce GPU, the cost will depend on the RAM requirements and also on the country you live in. I would suggest buying a laptop with at least 6 GB of GPU RAM and 12 GB of system RAM. The other parts will have to be compatible with these two requirements.
Here are some links which you can take as reference :
https://www.amazon.in/Omen-HP-249TX-17-3-inch-Graphics/dp/B06XBJ9MRW/ref=pd_sbs_147_10?_encoding=UTF8&psc=1&refRID=7J6BJFQ3N6SEG6WHVPFB
https://www.amazon.in/MSI-GE62VR-7RF-Apache-Pro/dp/B01NAKTV53/ref=sr_1_2?s=computers&ie=UTF8&qid=1505208548&sr=1-2
Thanks. It does sound like desktop or cloud is the way to go. So let's assume AWS cloud; then AWS P2 instances would seem appropriate. Sounds right?
What would be the development process, though? Ideally I would develop on a laptop and run/deploy on the cloud. What do people do? Are there CUDA emulators which would run on laptops and facilitate the development process?
We develop on AWS and deploy on AWS as well. There are no CUDA emulators that I know of, but the code you write in any framework will run just fine on a CPU. It will be very slow for training, though. Sometimes, when the application does not need a fast response, training is done on the GPU and inference is done on the CPU (more cost effective).
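For example, with TensorFlow you can hide the GPU from a process so that inference runs on the CPU (the script name below is just a placeholder):
# Hide all GPUs from this process; TensorFlow then falls back to the CPU
CUDA_VISIBLE_DEVICES="" python my_inference_script.py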
Thanks.
I have installed OpenCV as shown above, but when I try to import it I get an error:
ImportError: No module named 'cv2'
I can help if you can give some details. Please share exactly which commands you typed and what output you got after installing OpenCV.
After installing, I entered the following commands: https://uploads.disquscdn.com/images/21372d37d51e948f560804adcc32ef2d7998a105143eb92056485cb7de62f9aa.png
Did you perform these two steps?
cd ~/.virtualenvs/virtual-py3/lib/python3.5/site-packages
ln -s /usr/local/lib/python3.5/dist-packages/cv2.cpython-35m-x86_64-linux-gnu.so cv2.so
Also, what output are you getting for
find /usr/local/lib/ -type f -name "cv2*.so"
I already performed those two steps and the output is https://uploads.disquscdn.com/images/cfc9928c0dea976ef468b2c8bf5a273d9853ff9d4a293cb069ca43a23d777021.png
I hope you had executed this command
ln -s /usr/local/lib/python3.5/dist-packages/cv2.cpython-35m-x86_64-linux-gnu.so cv2.so
with site-packages instead of dist-packages.
Also, can you show the output of this
cd ~/.virtualenvs/virtual-py3/lib/python3.5/site-packages
ls -l cv2.so
This is the output I got: https://uploads.disquscdn.com/images/b0fec92daa53edd1b82d0a7e3bb593b74d17d501f5eca47f44f5c37ec1302289.png
There lies the problem!
You can see the part highlighted in RED. It is pointing to the dist-packages directory.
Delete the cv2.so file in the virtual-py3/lib/python3.5/site-packages directory and execute the command:
ln -s /usr/local/lib/python3.5/site-packages/cv2.cpython-35m-x86_64-linux-gnu.so cv2.so
This should solve the problem
I deleted that file and executed the command, but I'm still getting the error.
Here's the output: https://uploads.disquscdn.com/images/69954e4c5afe797e8dee2a5d7473189e4420e873c67154d54bb1ad3e1e747364.png
Do not copy-paste the line as it is:
ln -s /usr/local/lib/python3.5/site-packages/cv2.cpython-35m-x86_64-linu… cv2.so
It is not taking the full path, as you can see from the text highlighted in RED; cv2.so is pointing to the wrong file.
It should point to the file cv2.cpython-35m-x86_64-linux-gnu.so in the /usr/local/lib/python3.5/site-packages directory.
Thank you, Vikas, for your help.
I just renamed cv2.cpython-35m-x86_64-linux-gnu.so to cv2.so,
as directed in the PyImageSearch blog article, and created a symlink.
Now it's working fine.
Glad you could make it work! The renaming isn't the trick, though; the linking is.
Anyway, thank you, Vikas, for spending your time and helping me.
I get this error when I import torch:
ImportError: dlopen: cannot load any more object with static TLS
I installed it using these commands:
pip install http://download.pytorch.org/whl/cu75/torch-0.2.0.post3-cp27-cp27mu-manylinux1_x86_64.whl
pip install torchvision
That looks like a one-off error. Are you still getting it? People have reported that this type of error occurs randomly; restarting the Python kernel or re-running the command solves the problem for them.
It actually works when I restart the PC. Thanks a lot!
Nice article, but I need steps to install TensorFlow and Keras on Windows 10. Can you help me with this?
We do not have a Windows installation tutorial for TensorFlow and Keras. However, you can follow these steps.
1. First of all, TensorFlow requires Python 3 on Windows. So, if you don't have that, install Python 3 first. You can either install it using Anaconda (which includes libraries like NumPy, SciPy, scikit-learn, etc.) or simply the plain Python distribution. It's your choice.
2. After you have installed Python, install TensorFlow and Keras using:
pip3 install --upgrade tensorflow
pip3 install keras
It is advisable to work on Linux-based systems for deep learning projects, as the support is much better for Linux. You can use a VirtualBox VM with Ubuntu on it if you cannot install Linux on your system.
I have installed TensorFlow in a virtual environment; how can I install OpenCV in that same environment?
You do not need to install OpenCV inside the virtual environment. You only need to link the OpenCV installation to the Python environment. This is done in the last section, named
"Link OpenCV to your virtual environments".
Thanks a lot, man! I have been trying to install Keras for three days now, but this guide at last made it possible.
One thing though: in this command
sudo apt-get install -y cuda
you should change 'cuda' to 'cuda-8.0', otherwise it just installs CUDA 9.0 (the latest version), which I think is not compatible with TF.
Best regards.
Thanks Hash! Will update the post.
Thanks for the article! Very well made.
I have a doubt. When I install packages in the virtual environments, pip re-downloads them. Is this normal? Does that mean it stores a different copy of each package for every virtual environment?
Yes, that's how it is done. Otherwise, what would be the point of having different environments?
Thanks for the article! After installing TensorFlow, I opened the terminal, typed "python", then typed "import tensorflow as tf" and got the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: No module named tensorflow
Any idea what’s the problem and how to solve it?
As far as I know, pip is missing some important files from tensorflow that conda has, and conda is missing some other important files that pip has. That's at least the case at the moment I am writing this post.
If you don’t have conda, you can download the anaconda version that suits your system from here: https://repo.continuum.io/archive/
Mine is Anaconda3-5.1.0-Linux-x86_64.sh for example. Install it and then you can use the conda package manager as well.
Combining conda and pip is usually bad practice and should be avoided, but sadly in this case it might be the only workaround. You can try installing TensorFlow with either of them (first pip uninstall tensorflow-gpu and then conda install tensorflow-gpu, or vice versa), but I think that in the end you will have to use both. That's at least what I did.
Keras isn't working because its default backend is TensorFlow. If you can't manage to install TensorFlow no matter what you try, then you should change the Keras backend from TensorFlow to Theano and it should work with no problem. Hopefully…
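For reference, Keras reads its backend setting from ~/.keras/keras.json, and the KERAS_BACKEND environment variable overrides it for a single session. A minimal sketch, assuming both Keras and Theano are installed:
# Run one session with the Theano backend and print which backend Keras picked up
KERAS_BACKEND=theano python -c "from keras import backend; print(backend.backend())"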
No, I didn't need it, and everything runs fine. You can try installing it and check whether it works.
Did you install using virtual environments or directly? If you followed everything without any error, then you have it installed in a virtual environment. You have to activate the virtual environment (workon virtual-py3) before running python.
Nice article. However, it does not work for me on Ubuntu 16.04 with the latest CUDA version 9.1. The installer automatically updated my GPU driver to the proprietary 387.26, which does not work well with Ubuntu yet.
Yes, I also faced this issue after some days. TensorFlow needs CUDA version 9.0 as of now.
Do you have a compatible AMI on AWS somewhere? If not, do you know which AWS GPU AMI is equivalent, or similar, to the Ubuntu 16.04 system you are describing above? The idea is to install and try all this somewhere on AWS.
Thanks.