Question
How can I programmatically check if llama-cpp-python is installed with support for a CUDA-capable GPU?
Context
In my program, I want to warn developers when their system is not configured to let llama-cpp-python run LLMs with GPU acceleration. For example, they may have installed the library with a plain pip install llama-cpp-python, without setting the environment variables needed for CUDA acceleration, or the CUDA Toolkit may be missing from their operating system.
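To make the goal concrete, this is roughly the helper I want to provide; the function names are placeholders of my own, and is_gpu_available is exactly the check I am asking how to implement:

import warnings

def is_gpu_available() -> bool:
    # Placeholder: this is the detection logic this question is about.
    raise NotImplementedError

def warn_if_no_gpu() -> None:
    # Warn developers early when llama-cpp-python cannot offload to a CUDA GPU,
    # so they notice a CPU-only install before running slow inference.
    if not is_gpu_available():
        warnings.warn(
            "llama-cpp-python appears to be installed without CUDA support; "
            "LLM inference will fall back to the CPU.",
            RuntimeWarning,
        )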
What I Have Tried
In earlier versions of the library, I could detect this reliably: my check reported a GPU if and only if the library had been installed with the appropriate environment variables, and in that case I also saw fast responses and high GPU utilization.
Initially, I checked GPU availability with:
from llama_cpp.llama_cpp import GGML_USE_CUBLAS

def is_gpu_available_v1() -> bool:
    return GGML_USE_CUBLAS
Later, GGML_USE_CUBLAS was removed. For some time, I used the following alternative:
from llama_cpp.llama_cpp import _load_shared_library

def is_gpu_available_v2() -> bool:
    lib = _load_shared_library('llama')
    return hasattr(lib, 'ggml_init_cublas')
With newer versions of the library, this second approach consistently returns False, even when inference is actually running on the GPU.
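For reference, here is a sketch of the kind of replacement check I am considering. It assumes the installed version exposes llama_supports_gpu_offload (which recent releases appear to bind) and that, failing that, a CUDA backend symbol such as ggml_backend_cuda_init would be visible in the shared library returned by _load_shared_library; I have not confirmed either assumption across versions:

import llama_cpp
from llama_cpp.llama_cpp import _load_shared_library

def is_gpu_available_v3() -> bool:
    # Preferred: ask the bindings directly whether the build supports GPU offload.
    # llama_supports_gpu_offload may not exist in every version, hence the getattr.
    supports_gpu_offload = getattr(llama_cpp, "llama_supports_gpu_offload", None)
    if supports_gpu_offload is not None:
        return bool(supports_gpu_offload())
    # Fallback: probe the shared library for a CUDA backend symbol.
    # ggml_init_cublas is gone in newer builds; ggml_backend_cuda_init is my guess
    # at its successor and may live in a separate backend library instead.
    lib = _load_shared_library("llama")
    return hasattr(lib, "ggml_backend_cuda_init")

Is this a reasonable direction, or is there a supported way to query CUDA support programmatically?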