I am trying to run Keras on my GPU.

My setup:

  • NVIDIA GeForce RTX 3070
  • Ubuntu 22.04
  • Python: 3.10

I installed the NVIDIA driver via sudo ubuntu-drivers install. Under Software & Updates / Additional Drivers it shows nvidia-driver-535 in use, so a driver is installed.

I then installed the CUDA toolkit via sudo apt-get install nvidia-cuda-dev nvidia-cuda-toolkit, cuDNN via sudo apt install nvidia-cudnn, and TensorFlow via pip install tensorflow, which already includes Keras.

But when I list the physical devices through TensorFlow itself, only the CPU shows up.

import tensorflow as tf
print(tf.config.list_physical_devices())
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]

The following is printed when importing tensorflow:

2024-06-26 23:15:15.129300: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-06-26 23:15:15.131933: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-06-26 23:15:15.170793: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-06-26 23:15:15.699070: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-06-26 23:15:16.077326: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-06-26 23:15:16.081814: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2251] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...

It seems the CUDA drivers are not found, and TensorRT is missing as well.

This is a fresh Ubuntu install; I have not installed any other Python packages yet.

What can I do to make this work?

  • No package versions are mentioned. Is your CUDA, cuDNN, and tensorflow version combination listed here: tensorflow.org/install/source#gpu ? Commented Jun 26, 2024 at 23:06
  • Oh, I did not know about these combination issues. My versions (CUDA 11.5, cuDNN 8.2, tensorflow 2.16.1, Python 3.10) indeed do not seem to match any of the listed GPU support combinations. Commented Jun 27, 2024 at 6:41

1 Answer

After a few tries, I got one version combination from the list linked above to work. For anyone interested, here is the exact installation process I followed to get TensorFlow GPU support running:

Prerequisites:

REQUIRED: Ubuntu 20.04 or Ubuntu 22.04

"make" tools to build code later:

sudo apt-get install build-essential

Required package versions

https://www.tensorflow.org/install/source#gpu

In my case:

  • tensorflow 2.15.0
  • Python 3.9-3.11
  • Clang 16.0.0
  • Bazel 6.1.0
  • cuDNN 8.9
  • CUDA 12.2
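To make the version matching concrete, here is a minimal sketch that compares installed versions against the one row I ended up using. The names TESTED_BUILDS, major_minor and mismatches are made up for this illustration, and the table holds only the tensorflow 2.15.0 row from the list above; any other release would have to be copied from the linked page.

```python
# Row copied from the list above; other TensorFlow releases are deliberately
# left out and would have to be taken from the linked compatibility table.
TESTED_BUILDS = {
    "2.15.0": {"cuda": "12.2", "cudnn": "8.9"},
}


def major_minor(version):
    """Reduce '8.9.5.30' to (8, 9) so patch levels do not affect the match."""
    parts = version.split(".")
    minor = int(parts[1]) if len(parts) > 1 else 0
    return int(parts[0]), minor


def mismatches(tf_version, installed):
    """Return {component: (installed, expected)} for every component that differs.

    `installed` maps a component name to a version string,
    e.g. {"cuda": "11.5", "cudnn": "8.2"}.
    """
    expected = TESTED_BUILDS.get(tf_version)
    if expected is None:
        raise KeyError(f"no table row recorded for tensorflow {tf_version}")
    return {
        name: (installed.get(name), wanted)
        for name, wanted in expected.items()
        if installed.get(name) is None
        or major_minor(installed[name]) != major_minor(wanted)
    }


if __name__ == "__main__":
    # Versions the apt packages from the question provided.
    print(mismatches("2.15.0", {"cuda": "11.5", "cudnn": "8.2"}))
    # Versions installed by the steps in this answer.
    print(mismatches("2.15.0", {"cuda": "12.2", "cudnn": "8.9.5.30"}))
```

The first call reports both components as mismatched and the second returns an empty dict. If TensorFlow is already installed, tf.sysconfig.get_build_info() should report the cuda_version and cudnn_version the wheel was built against, which can be fed into the same comparison.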

STEP1: NVIDIA drivers

sudo ubuntu-drivers list
sudo ubuntu-drivers install

STEP2: Reboot for graphics drivers to take effect
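After the reboot it is worth confirming that the driver is actually loaded before going on. This is a rough sketch, assuming nvidia-smi was installed together with the driver and is on PATH; query_driver is a name invented for this example.

```python
import shutil
import subprocess


def query_driver():
    """Return a list of 'name, driver_version' strings, one per GPU, or None.

    None means nvidia-smi is missing or could not talk to the driver.
    """
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name,driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = query_driver()
    if gpus is None:
        print("nvidia-smi unavailable or driver not loaded")
    else:
        for gpu in gpus:
            print(gpu)
```

If this prints nothing useful, the later steps will not help; the driver has to work first.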

STEP3: CUDA 12.2

https://developer.nvidia.com/cuda-12-2-0-download-archive?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=22.04&target_type=deb_local

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.2.0/local_installers/cuda-repo-ubuntu2204-12-2-local_12.2.0-535.54.03-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-12-2-local_12.2.0-535.54.03-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-12-2-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda
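To check which toolkit release ended up installed, something like the following can help. It is only a sketch: it assumes nvcc is either on PATH or under /usr/local/cuda/bin (the usual location for the deb packages, but adjust if your layout differs), and parse_release and toolkit_release are names invented here.

```python
import os
import re
import shutil
import subprocess

# Assumption: the deb packages install under /usr/local/cuda, a symlink to the
# versioned directory. Change this path if your system differs.
DEFAULT_NVCC = "/usr/local/cuda/bin/nvcc"


def parse_release(nvcc_output):
    """Extract 'major.minor' from the 'release X.Y' text in nvcc's version output."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def toolkit_release():
    """Return the toolkit release reported by nvcc, or None if nvcc is not found."""
    exe = shutil.which("nvcc")
    if exe is None and os.path.isfile(DEFAULT_NVCC):
        exe = DEFAULT_NVCC
    if exe is None:
        return None
    try:
        result = subprocess.run(
            [exe, "--version"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return parse_release(result.stdout)


if __name__ == "__main__":
    print("CUDA toolkit release:", toolkit_release())
```

For the install above this should report 12.2. A result of None only means nvcc was not found in the two places checked, not necessarily that the toolkit is missing.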

STEP4: cuDNN 8.9.5 for CUDA 12.x

https://docs.nvidia.com/deeplearning/cudnn/archives/cudnn-895/install-guide/index.html
https://docs.nvidia.com/deeplearning/cudnn/archives/cudnn-895/install-guide/index.html#installlinux-deb

Download:

https://developer.nvidia.com/rdp/cudnn-archive

  • "Download cuDNN v8.9.5 (October 27th, 2023), for CUDA 12.x"
  • "Local Installer for Ubuntu22.04 x86_64 (Deb)"

Install:

sudo dpkg -i cudnn-local-repo-ubuntu2204-8.9.5.30_1.0-1_amd64.deb
sudo cp /var/cudnn-local-repo-ubuntu2204-8.9.5.30/cudnn-local-FB167084-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get install libcudnn8=8.9.5.30-1+cuda12.2
sudo apt-get install libcudnn8-dev=8.9.5.30-1+cuda12.2
sudo apt-get install libcudnn8-samples=8.9.5.30-1+cuda12.2

The "verify install" section in the docs did not work as described for me. However, GPU support works now, so I did not investigate further.
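As a lighter check than building the samples, the cuDNN shared library can be loaded directly and asked for its version. This is a sketch under assumptions: the soname libcudnn.so.8 matches cuDNN 8.x, the integer encoding shown applies to cuDNN 8.x, and cudnn_version and split_version are names invented for this example.

```python
import ctypes


def cudnn_version(soname="libcudnn.so.8"):
    """Return cuDNN's version as an integer, or None if the library cannot be used."""
    try:
        lib = ctypes.CDLL(soname)
        get_version = lib.cudnnGetVersion
    except (OSError, AttributeError):
        # Library not on the loader path, or the symbol is not exported.
        return None
    get_version.restype = ctypes.c_size_t
    return int(get_version())


def split_version(number):
    """Decode the cuDNN 8.x encoding: major * 1000 + minor * 100 + patch."""
    return number // 1000, (number % 1000) // 100, number % 100


if __name__ == "__main__":
    version = cudnn_version()
    if version is None:
        print("libcudnn.so.8 could not be loaded")
    else:
        print("cuDNN version:", split_version(version))
```

A None result usually points to the same problem TensorFlow reports as "Cannot dlopen some GPU libraries": the library is not where the dynamic loader looks for it.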

STEP5: Tensorflow

pip install tensorflow==2.15.0

The pip install worked fine, so the build tools Bazel and Clang were not needed after all.

STEP6: Validate GPU-support in python

import tensorflow as tf

print(tf.config.list_physical_devices(device_type=None))
# [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'), PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
# Num GPUs Available:  1
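Listing the device only shows that TensorFlow registered the GPU. To see that it can also run work there, a small matrix multiplication can be placed on it explicitly. This is a sketch; gpu_smoke_test is a name invented here, and it returns None instead of failing when TensorFlow or a GPU is absent.

```python
def gpu_smoke_test():
    """Run a small matmul on the first GPU and return the device it ran on.

    Returns None if TensorFlow is not installed or no GPU is registered.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    if not tf.config.list_physical_devices("GPU"):
        return None
    with tf.device("/GPU:0"):
        a = tf.random.uniform((256, 256))
        b = tf.matmul(a, a)
    # In eager mode the result tensor records the device that produced it.
    return b.device


if __name__ == "__main__":
    print("matmul ran on:", gpu_smoke_test())
```

On a working setup the printed device string should end in GPU:0.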

DONE
