Problem with setting up Tensorflow GPU support

Question

I am trying to install support for Tensorflow GPU using the following guide:

https://www.tensorflow.org/install/gpu

I am on Ubuntu (20.04 LTS)

I've followed the instructions for the latest Ubuntu below (CUDA 11):

# Add NVIDIA package repositories
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"
sudo apt-get update

wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb

sudo apt install ./nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt-get update

wget https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libnvinfer7_7.1.3-1+cuda11.0_amd64.deb
sudo apt install ./libnvinfer7_7.1.3-1+cuda11.0_amd64.deb
sudo apt-get update

# Install development and runtime libraries (~4GB)
sudo apt-get install --no-install-recommends \
    cuda-11-0 \
    libcudnn8=8.0.4.30-1+cuda11.0  \
    libcudnn8-dev=8.0.4.30-1+cuda11.0

# Reboot. Check that GPUs are visible using the command: nvidia-smi

# Install TensorRT. Requires that libcudnn8 is installed above.
sudo apt-get install -y --no-install-recommends libnvinfer7=7.1.3-1+cuda11.0 \
    libnvinfer-dev=7.1.3-1+cuda11.0 \
    libnvinfer-plugin7=7.1.3-1+cuda11.0

After running this and rebooting, I have CUDA 11 and cuDNN 8.

After this I installed TensorFlow with a plain pip install tensorflow; as I understand it, newer versions of TensorFlow no longer need tensorflow-gpu to be installed explicitly.

This is what I'm getting after trying to import tensorflow and check physical devices:

import tensorflow as tf

Result:

2021-06-02 16:04:03.347039: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0

tf.config.list_physical_devices('GPU')

Result:

2021-06-02 16:11:19.035743: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-06-02 16:11:19.067500: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-02 16:11:19.067753: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 1060 6GB computeCapability: 6.1
coreClock: 1.759GHz coreCount: 10 deviceMemorySize: 5.93GiB deviceMemoryBandwidth: 178.99GiB/s
2021-06-02 16:11:19.067771: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-06-02 16:11:19.069485: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-06-02 16:11:19.069529: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-06-02 16:11:19.069625: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcufft.so.10'; dlerror: libcufft.so.10: cannot open shared object file: No such file or directory
2021-06-02 16:11:19.069689: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcurand.so.10'; dlerror: libcurand.so.10: cannot open shared object file: No such file or directory
2021-06-02 16:11:19.069736: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusolver.so.11'; dlerror: libcusolver.so.11: cannot open shared object file: No such file or directory
2021-06-02 16:11:19.069796: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusparse.so.11'; dlerror: libcusparse.so.11: cannot open shared object file: No such file or directory
2021-06-02 16:11:19.069930: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-06-02 16:11:19.069938: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1766] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
[]
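With a long log like the one above, it can help to pull out just the library names that failed to load. The following is a minimal sketch, assuming the warning format stays as shown in this log; the function name missing_libraries and the SAMPLE string are illustrative only, with the messages copied from the output above.

```python
import re

# Pattern for the dso_loader warning text as it appears in the log above.
MISSING_LIB = re.compile(r"Could not load dynamic library '([^']+)'")

def missing_libraries(log_text):
    """Return the library names that failed to load, in order, without duplicates."""
    seen = []
    for name in MISSING_LIB.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen

# Two warning messages copied from the log in the question.
SAMPLE = (
    "Could not load dynamic library 'libcufft.so.10'; dlerror: libcufft.so.10: "
    "cannot open shared object file: No such file or directory\n"
    "Could not load dynamic library 'libcurand.so.10'; dlerror: libcurand.so.10: "
    "cannot open shared object file: No such file or directory\n"
)

print(missing_libraries(SAMPLE))
```

Feeding the whole captured log to missing_libraries instead of SAMPLE should yield the same list that is written out by hand below.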

It seems like TensorFlow is complaining about 4 missing files (.so libraries):

libcufft.so.10
libcurand.so.10
libcusolver.so.11
libcusparse.so.11

I've tried to look for these on my system using the locate command on Ubuntu; they do not exist anywhere.
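Because locate relies on a prebuilt index that may be out of date, a second check is to ask the dynamic loader directly. This is a rough sketch using Python's standard ctypes module; the helper name can_load is hypothetical, and the library names are the four from the warnings above. It only reports whether each library can be opened from the current environment, not why it is missing.

```python
import ctypes

# Library names copied from the warnings in the question.
LIBS = [
    "libcufft.so.10",
    "libcurand.so.10",
    "libcusolver.so.11",
    "libcusparse.so.11",
]

def can_load(name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        # Raised when the library cannot be found or opened.
        return False

for lib in LIBS:
    print(lib, "OK" if can_load(lib) else "NOT FOUND")
```

The loader reads LD_LIBRARY_PATH when the process starts, so the script should be run from a shell in which any changes to that variable have already been applied.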

I haven't added anything to my .bashrc since I was not sure what the LD_LIBRARY_PATH must be.

Dinesh KS · Accepted Answer · 2021-06-02 13:34:23Z

1

Add these lines to your ~/.bashrc, run source ~/.bashrc, and then try again:

export PATH=/usr/local/cuda-11.0/bin:${PATH}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:${LD_LIBRARY_PATH}
export LD_LIBRARY_PATH=/usr/local/cuda-11.0/lib64:${LD_LIBRARY_PATH}
export CUDA_HOME=/usr/local/cuda

Please change your path according to your setup.
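To see whether the directories on LD_LIBRARY_PATH actually contain the libraries in question, something like the following sketch may help. The helper find_library is a hypothetical name, and the check is a plain file lookup, not a reimplementation of the loader's full search rules (it ignores the ldconfig cache, for instance).

```python
import os

def find_library(name, search_path):
    """Return the directories in a colon-separated search path that contain `name`."""
    hits = []
    for directory in search_path.split(":"):
        # Skip empty entries produced by leading or trailing colons.
        if directory and os.path.isfile(os.path.join(directory, name)):
            hits.append(directory)
    return hits

search_path = os.environ.get("LD_LIBRARY_PATH", "")
for lib in ("libcufft.so.10", "libcurand.so.10", "libcusolver.so.11", "libcusparse.so.11"):
    print(lib, find_library(lib, search_path) or "not on LD_LIBRARY_PATH")
```

If a library is reported as missing from every directory, changing the path alone will not help; the file itself has to be installed first.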

answered Jun 2, 2021 at 13:34

Dinesh KS


8 Comments

orie Over a year ago

I have the directory /usr/local/cuda-11.0/, and I also have one at /usr/lib/cuda/, which I installed using sudo apt-get nvidia-cuda-toolkit (do I even need that?). After adding this to .bashrc, 2 of the files were resolved, but there are still 2 left.

orie Over a year ago

2021-06-02 16:50:20: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusolver.so.11'; dlerror: libcusolver.so.11: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.0/lib64:/usr/local/cuda/lib64:
2021-06-02 16:50:20: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusparse.so.11'; dlerror: libcusparse.so.11: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.0/lib64:/usr/local/cuda/lib64:

orie Over a year ago

It seems like some files are missing from /usr/local/cuda-11.0/lib64, specifically libcusolver.so.11 and libcusparse.so.11. Have I not installed something?

Robert Crovella Over a year ago

You have a corrupted install. The instructions you posted in your question were correct. However you should not also have done sudo apt-get nvidia-cuda-toolkit. That command doesn't appear anywhere in the instructions you posted in your question. My suggestion would be to start over with a fresh install of Ubuntu, and follow the instructions you have posted in your question.

orie Over a year ago

It was a bit confusing because they do ask you to install CUPTI and set the environment variable before these commands, and they mention that CUPTI is included in the CUDA Toolkit.


achini · Accepted Answer · 2022-06-09 10:30:02Z

1

Try installing the missing libraries, then check again:

apt-get install -y cuda-command-line-tools-11-4 libcublas-11-4 libcufft-11-4 libcurand-11-4 libcusolver-11-4 libcusparse-11-4

Note that the package suffix must match your CUDA version, with the dot replaced by a hyphen. In bash, with the version held in a variable named CUDA, that is:

${CUDA/./-}

For example, if your CUDA version is 11.2, the package is libcufft-11-2.
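The version-to-package-name mapping can be sketched in a few lines. This is an illustration only: the function name cuda_packages is hypothetical, the package base names are taken from the apt-get command in this answer, and the sketch keeps only the major and minor parts of the version, whereas the bash expansion replaces just the first dot.

```python
def cuda_packages(cuda_version):
    """Build apt package names for a CUDA version string such as "11.2"."""
    major, minor = cuda_version.split(".")[:2]
    suffix = f"{major}-{minor}"
    # Base names taken from the apt-get command in this answer.
    names = [
        "cuda-command-line-tools",
        "libcublas",
        "libcufft",
        "libcurand",
        "libcusolver",
        "libcusparse",
    ]
    return [f"{name}-{suffix}" for name in names]

print("apt-get install -y " + " ".join(cuda_packages("11.2")))
```

Whether packages with these exact names exist for a given version depends on the NVIDIA repository configured on the machine, so it is worth checking with apt-cache search before installing.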

edited Jun 9, 2022 at 10:30

answered Jun 9, 2022 at 10:00

achini
