
Question

How can I programmatically check if llama-cpp-python is installed with support for a CUDA-capable GPU?

Context

In my program, I am trying to warn the developers when they fail to configure their system in a way that allows the llama-cpp-python LLMs to leverage GPU acceleration. For example, they may have installed the library using pip install llama-cpp-python without setting appropriate environment variables for CUDA acceleration, or the CUDA Toolkit may be missing from their operating system.
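As a rough, library-independent preflight check for the second failure mode (a missing CUDA Toolkit), something like the sketch below can help; the helper name is mine, and the presence of nvidia-smi/nvcc on PATH is only a heuristic, not proof that llama-cpp-python itself was built with CUDA support:

import shutil

def cuda_toolchain_on_path() -> bool:
    # Heuristic only: checks whether the NVIDIA driver utility (nvidia-smi)
    # and the CUDA compiler (nvcc) are discoverable on PATH. It says nothing
    # about how llama-cpp-python was compiled.
    return shutil.which('nvidia-smi') is not None and shutil.which('nvcc') is not None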

What I Have Tried

With earlier versions of the library, I could reliably detect whether a GPU was available: the check returned True if and only if the library had been installed with the appropriate environment variables, in which case I also saw fast responses and high GPU utilization.

Initially, I checked GPU availability with:

from llama_cpp.llama_cpp import GGML_USE_CUBLAS

def is_gpu_available_v1() -> bool:
    return GGML_USE_CUBLAS

Later, GGML_USE_CUBLAS was removed from the library. For some time, I used the following alternative:

from llama_cpp.llama_cpp import _load_shared_library

def is_gpu_available_v2() -> bool:
    lib = _load_shared_library('llama')
    return hasattr(lib, 'ggml_init_cublas')

With newer versions of the library, the latter approach consistently returns False, even when inference is actually running on the GPU.


2 Answers


Examining the source code of the library, I found a new approach for checking GPU availability:

from llama_cpp.llama_cpp import _load_shared_library

def is_gpu_available_v3() -> bool:
    lib = _load_shared_library('llama')
    return bool(lib.llama_supports_gpu_offload())

This works with llama_cpp_python==0.2.64.
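
Building on this, a minimal warning hook for the original use case might look as follows; the wrapper name, warning text, and exception handling are my own sketch, not part of the library:

import warnings

from llama_cpp.llama_cpp import _load_shared_library

def warn_if_no_gpu_offload() -> None:
    # Warn developers when the installed llama-cpp-python build cannot
    # offload layers to a GPU (e.g. it was built without CUDA support).
    try:
        gpu_available = bool(_load_shared_library('llama').llama_supports_gpu_offload())
    except (AttributeError, OSError):
        gpu_available = False
    if not gpu_available:
        warnings.warn(
            'llama-cpp-python appears to be installed without GPU offload '
            'support; inference will fall back to the CPU.'
        )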


For llama-cpp-python versions 0.3.1 and 0.3.2, use the following code instead:

from llama_cpp.llama_cpp import load_shared_library
import pathlib

def is_gpu_available_v3() -> bool:
    # In 0.3.x, load_shared_library() takes the base name of the library and
    # the directory containing the bundled shared libraries; adjust the path
    # below to your installation's llama_cpp/lib directory.
    lib = load_shared_library('llama', pathlib.Path('/usr/local/lib/python3.10/dist-packages/llama_cpp/lib'))
    return bool(lib.llama_supports_gpu_offload())

print(is_gpu_available_v3())
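
If you want one helper that covers both the 0.2.x and 0.3.x layouts, a version-agnostic sketch could look like the following; the hasattr dispatch and the assumption that the bundled libraries live in a lib directory next to the llama_cpp package (as in the hard-coded path above) are mine:

import pathlib

from llama_cpp import llama_cpp as bindings

def is_gpu_available() -> bool:
    if hasattr(bindings, 'load_shared_library'):
        # 0.3.x: load_shared_library(name, base_path), with the shared
        # libraries bundled in <package dir>/lib (assumed layout).
        lib_dir = pathlib.Path(bindings.__file__).parent / 'lib'
        lib = bindings.load_shared_library('llama', lib_dir)
    else:
        # 0.2.x: _load_shared_library(name)
        lib = bindings._load_shared_library('llama')
    return bool(lib.llama_supports_gpu_offload())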

