Frequent 'nvidia' Questions

359 votes

8 answers

351k views

Different CUDA versions shown by nvcc and NVIDIA-smi

I am very confused by the different CUDA versions shown by running which nvcc and nvidia-smi. I have both cuda9.2 and cuda10 installed on my ubuntu 16.04. Now I set the PATH to point to cuda9.2. So ...

yuqli

5,419

asked Nov 22, 2018 at 0:44

150 votes

3 answers

162k views

How do I choose grid and block dimensions for CUDA kernels?

This is a question about how to determine the CUDA grid, block and thread sizes. This is an additional question to the one posted here. Following this link, the answer from talonmies contains a code ...

user1292251

1,785

asked Apr 3, 2012 at 1:14

58 votes

8 answers

38k views

Swing rendering appears broken in JDK 1.8, correct in JDK 1.7

I have installed IntelliJ IDEA (13.1.1 #IC-135.480) and JDK 1.8.0 (x64) and I generated some GUI with the GUI Form designer. Then I ran the code and realized that something is not alright. Here is ...

duffy356

3,716

asked Mar 29, 2014 at 22:08

69 votes

6 answers

278k views

Error Message : Cannot find or open the PDB file

I tried running the sample programs provided on NVIDIA's official site. Most of the programs ran smoothly, except a few where I get similar error messages. How can I fix that? Here's a sample of error ...

KNU

2,534

asked Apr 10, 2013 at 22:34

33 votes

2 answers

119k views

What is the correct version of CUDA for my nvidia driver?

I am using ubuntu 14.04. I want to install CUDA. But I don't know which version is good for my laptop. I trace my driver that is: $cat /proc/driver/nvidia/version NVRM version: NVIDIA UNIX x86_64 ...

Jame

3,884

asked Jun 13, 2015 at 15:42

5 votes

1 answer

22k views

How to create NVIDIA OpenCL project

I want to write an application in NVIDIA OpenCL in Visual Studio 2017 but don't know how to create a project for this purpose. I have GPU from NVIDIA (GeForce 940M) and Intel (HD Graphics 5500) and ...

Zekhire

115

asked Jul 2, 2019 at 18:24

28 votes

2 answers

15k views

How is CUDA memory managed?

When I run my CUDA program, which allocates only a small amount of global memory (below 20 M), I get an "out of memory" error. (From other people's posts, I think the problem is related to memory ...

xhe8

439

asked Dec 30, 2011 at 22:42

184 votes

2 answers

86k views

How do CUDA blocks/warps/threads map onto CUDA cores?

I have been using CUDA for a few weeks, but I have some doubts about the allocation of blocks/warps/threads. I am studying the architecture from a didactic point of view (university project), so ...

Daedalus

1,841

asked May 5, 2012 at 9:58

25 votes

3 answers

20k views

How to measure the inner kernel time in NVIDIA CUDA?

I want to measure time inside a GPU kernel. How can I measure it in NVIDIA CUDA? e.g. __global__ void kernelSample() { some code here get start time some code here get stop time some ...

Amin

381

asked May 14, 2012 at 15:06

13 votes

1 answer

31k views

How do I use Nvidia Multi-process Service (MPS) to run multiple non-MPI CUDA applications?

Can I run non-MPI CUDA applications concurrently on NVIDIA Kepler GPUs with MPS? I'd like to do this because my applications cannot fully utilize the GPU, so I want them to co-run together. Is there ...

dalibocai

2,407

asked Jan 10, 2016 at 19:18

11 votes

3 answers

6k views

Why is NVIDIA Pascal GPUs slow on running CUDA Kernels when using cudaMallocManaged

I was testing the new CUDA 8 along with the Pascal Titan X GPU and was expecting a speedup for my code, but for some reason it ends up being slower. I am on Ubuntu 16.04. Here is the minimum code that ...

user3667089

3,408

asked Sep 30, 2016 at 2:28

187 votes

8 answers

622k views

How do I select which GPU to run a job on?

In a multi-GPU computer, how do I designate which GPU a CUDA job should run on? As an example, when installing CUDA, I opted to install the NVIDIA_CUDA-<#.#>_Samples then ran several instances ...

Steven C. Howell

18.9k

asked Sep 22, 2016 at 21:23

180 votes

2 answers

178k views

Understanding CUDA grid dimensions, block dimensions and threads organization (simple explanation) [closed]

How are threads organized to be executed by a GPU?

cibercitizen1

21.6k

asked Mar 6, 2010 at 11:08

130 votes

5 answers

69k views

What is a bank conflict? (Doing Cuda/OpenCL programming)

I have been reading the programming guide for CUDA and OpenCL, and I cannot figure out what a bank conflict is. They just sort of dive into how to solve the problem without elaborating on the subject ...

smuggledPancakes

10.4k

asked Oct 1, 2010 at 18:04

118 votes

2 answers

94k views

nvidia-smi Volatile GPU-Utilization explanation?

I know that nvidia-smi -l 1 will give the GPU usage every one second (similarly to the following). However, I would appreciate an explanation on what Volatile GPU-Util really means. Is that the number ...

user3813674

2,693

asked Dec 2, 2016 at 17:31

19 votes

4 answers

19k views

128 bit integer on cuda?

I just managed to install my cuda SDK under Linux Ubuntu 10.04. My graphic card is an NVIDIA geForce GT 425M, and I'd like to use it for some heavy computational problem. What I wonder is: is there ...

Matteo Monti

9,110

asked May 28, 2011 at 14:10

85 votes

9 answers

53k views

Horrible redraw performance of the DataGridView on one of my two screens

I've actually solved this, but I'm posting it for posterity. I ran into a very odd issue with the DataGridView on my dual-monitor system. The issue manifests itself as an EXTREMELY slow repaint of ...

Corey Ross

2,015

asked Sep 23, 2008 at 1:01

4 votes

1 answer

5k views

Cuda kernel returning vectors

I have a list of words, my goal is to match each word in a very very long phrase. I'm having no problem in matching each word, my only problem is to return a vector of structures containing ...

bukk530

1,915

asked Feb 14, 2014 at 18:07

639 votes

21 answers

1.5m views

How do I check if PyTorch is using the GPU?

How do I check if PyTorch is using the GPU? The nvidia-smi command can detect GPU activity, but I want to check it directly from inside a Python script.

vvvvv

32.9k

asked Jan 8, 2018 at 14:50

11 votes

3 answers

26k views

What can I do against 'CUDA driver version is insufficient for CUDA runtime version'?

When I go to /usr/local/cuda/samples/1_Utilities/deviceQuery and execute moose@pc09 /usr/local/cuda/samples/1_Utilities/deviceQuery $ sudo make clean rm -f deviceQuery deviceQuery.o rm -rf ../../bin/...

Martin Thoma

138k

asked Nov 12, 2015 at 10:26

45 votes

5 answers

59k views

How does CUDA assign device IDs to GPUs?

When a computer has multiple CUDA-capable GPUs, each GPU is assigned a device ID. By default, CUDA kernels execute on device ID 0. You can use cudaSetDevice(int device) to select a different device. ...

solvingPuzzles

8,929

asked Dec 8, 2012 at 20:42

44 votes

5 answers

35k views

OpenGL without X.org in linux

I'd like to open an OpenGL context without X in Linux. Is there any way at all to do it? I know it's possible for integrated Intel graphics card hardware, though most people have Nvidia cards in ...

Cheery

25.6k

asked Jul 24, 2010 at 19:53

39 votes

3 answers

25k views

How can I make tensorflow run on a GPU with capability 2.x?

I've successfully installed tensorflow (GPU) on Linux Ubuntu 16.04 and made some small changes in order to make it work with the new Ubuntu LTS release. However, I thought (who knows why) that my GPU ...

mickkk

1,192

asked Jul 23, 2016 at 14:17

4 votes

2 answers

9k views

CUDA program causes nvidia driver to crash

My monte carlo pi calculation CUDA program is causing my nvidia driver to crash when I exceed around 500 trials and 256 full blocks. It seems to be happening in the monteCarlo kernel function. Any help ...

zetatr

179

asked May 31, 2011 at 1:35

71 votes

5 answers

103k views

CUDA determining threads per block, blocks per grid

I'm new to the CUDA paradigm. My question is in determining the number of threads per block, and blocks per grid. Does a bit of art and trial play into this? What I've found is that many examples have ...

dnbwise

1,102

asked Dec 8, 2010 at 18:58

65 votes

9 answers

117k views

Error compiling CUDA from Command Prompt

I'm trying to compile a CUDA test program on Windows 7 via Command Prompt. I'm using this command: nvcc test.cu But all I get is this error: nvcc fatal : Cannot find compiler 'cl.exe' in PATH What may ...

GennSev

1,666

asked Nov 14, 2011 at 17:49

33 votes

3 answers

26k views

Are cuda kernel calls synchronous or asynchronous

I read that one can use kernel launches to synchronize different blocks i.e., If i want all blocks to complete operation 1 before they go on to operation 2, I should place operation 1 in one kernel ...

Programmer

6,783

asked Dec 12, 2011 at 11:31

28 votes

2 answers

16k views

Forcing NVIDIA GPU programmatically in Optimus laptops

I'm programming a DirectX game, and when I run it on an Optimus laptop the Intel GPU is used, resulting in horrible performance. If I force the NVIDIA GPU using the context menu or by renaming my ...

Smohn Jith

325

asked May 10, 2012 at 14:16

7 votes

2 answers

3k views

Force system with nVidia Optimus to use the real GPU for my application?

I want my application to always run using the real gpu on nVidia Optimus laptops. From "Enabling High Performance Graphics Rendering on Optimus Systems", (http://developer.download.nvidia.com/devzone/...

DelphiDabber

305

asked Mar 12, 2013 at 21:52

4 votes

2 answers

4k views

CUFFT error handling

I'm using the following macro for CUFFT error handling: #define cufftSafeCall(err) __cufftSafeCall(err, __FILE__, __LINE__) inline void __cufftSafeCall(cufftResult err, const char *file, const ...

Vitality

21.7k

asked Apr 28, 2013 at 19:53

2 votes

1 answer

3k views

cuda 11 kernel doesn't run

here is a demo.cu aiming to printf from the GPU device: #include "cuda_runtime.h" #include "device_launch_parameters.h" #include <stdio.h> __global__ void hello_cuda() { ...

govordovsky

409

asked Aug 31, 2020 at 16:56

0 votes

1 answer

4k views

GPU is not detected in Tensorflow

I am using Tensorflow on Windows, and I am trying to use my GPU. But Tensorflow seems unable to detect my GPU. I created a Python virtual environment and installed Python (3.8) and TensorFlow. My ...

Bhakti

31

asked Mar 26, 2024 at 17:51

599 votes

19 answers

992k views

Nvidia NVML Driver/library version mismatch [closed]

When I run nvidia-smi, I get the following message: Failed to initialize NVML: Driver/library version mismatch An hour ago I received the same message and uninstalled my CUDA library and I was able ...

etal

15k

asked Mar 25, 2017 at 22:47

138 votes

10 answers

333k views

Is it possible to run CUDA on AMD GPUs?

I'd like to extend my skill set into GPU computing. I am familiar with raytracing and realtime graphics(OpenGL), but the next generation of graphics and high performance computing seems to be in GPU ...

Lee Jacobs

1,775

asked Oct 10, 2012 at 21:02

112 votes

4 answers

91k views

Streaming multiprocessors, Blocks and Threads (CUDA)

What is the relationship between a CUDA core, a streaming multiprocessor and the CUDA model of blocks and threads? What gets mapped to what and what is parallelized and how? and what is more ...

user400055

asked Aug 19, 2010 at 7:21

52 votes

10 answers

215k views

How do I run nvidia-smi on Windows?

nvidia-smi executed in a Command Prompt (CMD) in Windows returns the following error C:\Users>nvidia-smi 'nvidia-smi' is not recognized as an internal or external command, operable program or batch ...

dward4

2,042

asked Jul 18, 2019 at 17:41

51 votes

8 answers

99k views

Tensorflow not running on GPU

I have already spent a considerable amount of time digging around on Stack Overflow and elsewhere looking for the answer, but couldn't find anything. Hi all, I am running Tensorflow with Keras on top. I am 90% ...

valegians

950

asked Jun 29, 2017 at 15:17

32 votes

4 answers

39k views

How can I get number of Cores in cuda device?

I am looking for a function that counts the number of cores of my CUDA device. I know each microprocessor has specific cores, and my CUDA device has 2 microprocessors. I searched a lot to find a property ...

Alsphere

543

asked Sep 11, 2015 at 19:12

15 votes

6 answers

7k views

Forcing hardware accelerated rendering

I have an OpenGL library written in c++ that is used from a C# application using C++/CLI adapters. My problem is that if the application is used on laptops with Nvidia Optimus technology the ...

JohanR

151

asked Jun 24, 2013 at 7:49

13 votes

1 answer

16k views

why do we need cudaDeviceSynchronize(); in kernels with device-printf?

__global__ void helloCUDA(float f) { printf("Hello thread %d, f=%f\n", threadIdx.x, f); } int main() { helloCUDA<<<1, 5>>>(1.2345f); cudaDeviceSynchronize(); return ...

gpuguy

4,707

asked Oct 5, 2013 at 2:52

3 votes

1 answer

3k views

Calculation on GPU leads to driver error "stopped responding"

I have this little nonsense script here which I am executing in MATLAB R2013b: clear all; n = 2000; times = 50; i = 0; tCPU = tic; disp 'CPU::' A = rand(n, n); B = rand(n, n); disp '::Go' for i = ...

Stefan Falk

25.8k

asked Feb 23, 2014 at 17:11

2 votes

1 answer

6k views

Cuda Random Number Generation

I was wondering what was the best way to generate one pseudo random number between 0 and 49k that would be the same for each thread, by using curand or something else. I prefer to generate the ...

Anoracx

458

asked Mar 6, 2013 at 12:35

99 votes

6 answers

297k views

GPU-accelerated video processing with ffmpeg [closed]

I want to use ffmpeg to accelerate video encode and decode with an NVIDIA GPU. From NVIDIA's website: NVIDIA GPUs contain one or more hardware-based decoder and encoder(s) (separate from the CUDA ...

Wang Hai

1,021

asked Jun 13, 2017 at 0:52

59 votes

3 answers

30k views

Running more than one CUDA applications on one GPU

The CUDA documentation does not specify how many CUDA processes can share one GPU. For example, if I launch more than one CUDA program as the same user with only one GPU card installed in the system, what is ...

cache

1,329

asked Jul 27, 2015 at 0:55

19 votes

5 answers

18k views

How to run CUDA without a GPU using a software implementation?

My laptop doesn't have an NVIDIA graphics card, and I want to work on CUDA. The website says that CUDA can be used in emulation mode on non-CUDA hardware too. But when I tried installing CUDA drivers ...

emkrish

221

asked Nov 18, 2009 at 5:00

15 votes

2 answers

22k views

Using constants with CUDA

Which is the best way of using constants in CUDA? One way is to define constants in constant memory, like: // CUDA global constants __constant__ int M; int main(void) { ... ...

jrsm

1,715

asked Apr 20, 2013 at 11:41

15 votes

4 answers

23k views

Compile cuda code for CPU

I'm studying CUDA 5.5 but I don't have any NVIDIA GPU. Old versions of nvcc had a flag --multicore to compile CUDA code for the CPU. In the new version of nvcc, what is the option? I'm working on ...

F.N.B

1,639

asked Feb 21, 2014 at 22:45

15 votes

2 answers

11k views

How to interrupt or cancel a CUDA kernel from host code

I am working with CUDA and I am trying to stop my kernel's work (i.e. terminate all running threads) after a certain if block is hit. How can I do that? I am really stuck here.

MD Kamal Hossain Shajal

199

asked Jan 25, 2016 at 9:49

8 votes

2 answers

12k views

C# Performance Counter Help, Nvidia GPU

So I've been experimenting with the performance counter class in C# and have had great success probing the CPU counters and almost everything I can find in the windows performance monitor. HOWEVER, I ...

Alex E

269

asked Apr 3, 2016 at 19:13

8 votes

1 answer

19k views

CUDA5 Examples: Has anyone translated some cutil definitions to CUDA5?

Has anyone started to work with the CUDA5 SDK? I have an old project that uses some cutil functions, but they've been abandoned in the new one. The solution was that most functions can be translated ...

Manuel

185

asked Sep 18, 2012 at 9:32
