359 votes
8 answers
351k views

Different CUDA versions shown by nvcc and NVIDIA-smi

I am very confused by the different CUDA versions shown by running which nvcc and nvidia-smi. I have both cuda9.2 and cuda10 installed on my ubuntu 16.04. Now I set the PATH to point to cuda9.2. So ...
yuqli's user avatar
  • 5,419
150 votes
3 answers
162k views

How do I choose grid and block dimensions for CUDA kernels?

This is a question about how to determine the CUDA grid, block and thread sizes. This is an additional question to the one posted here. Following this link, the answer from talonmies contains a code ...
user1292251's user avatar
  • 1,785
58 votes
8 answers
38k views

Swing rendering appears broken in JDK 1.8, correct in JDK 1.7

I have installed IntelliJ IDEA (13.1.1 #IC-135.480) and JDK 1.8.0 (x64) and I generated some GUI with the GUI Form designer. Then I ran the code and realized that something is not alright. Here is ...
duffy356's user avatar
  • 3,716
69 votes
6 answers
278k views

Error Message : Cannot find or open the PDB file

I tried running the sample programs provided on NVIDIA's official site. Most of the programs ran smoothly, except a few where I get similar error messages. How can I fix that? Here's a sample of error ...
KNU's user avatar
  • 2,534
33 votes
2 answers
119k views

What is the correct version of CUDA for my nvidia driver?

I am using ubuntu 14.04. I want to install CUDA. But I don't know which version is good for my laptop. I trace my driver that is: $cat /proc/driver/nvidia/version NVRM version: NVIDIA UNIX x86_64 ...
Jame's user avatar
  • 3,884
5 votes
1 answer
22k views

How to create NVIDIA OpenCL project

I want to write an application using NVIDIA OpenCL in Visual Studio 2017, but I don't know how to create a project for this purpose. I have GPUs from NVIDIA (GeForce 940M) and Intel (HD Graphics 5500) and ...
Zekhire's user avatar
  • 115
28 votes
2 answers
15k views

How is CUDA memory managed?

When I run my CUDA program, which allocates only a small amount of global memory (below 20 M), I get an "out of memory" error. (From other people's posts, I think the problem is related to memory ...
xhe8's user avatar
  • 439
184 votes
2 answers
86k views

How do CUDA blocks/warps/threads map onto CUDA cores?

I have been using CUDA for a few weeks, but I have some doubts about the allocation of blocks/warps/threads. I am studying the architecture from a didactic point of view (university project), so ...
Daedalus's user avatar
  • 1,841
25 votes
3 answers
20k views

How to measure the inner kernel time in NVIDIA CUDA?

I want to measure elapsed time inside a GPU kernel. How can I measure it in NVIDIA CUDA? E.g. __global__ void kernelSample() { some code here get start time some code here get stop time some ...
Amin's user avatar
  • 381
13 votes
1 answer
31k views

How do I use Nvidia Multi-process Service (MPS) to run multiple non-MPI CUDA applications?

Can I run non-MPI CUDA applications concurrently on NVIDIA Kepler GPUs with MPS? I'd like to do this because my applications cannot fully utilize the GPU, so I want them to co-run together. Is there ...
dalibocai's user avatar
  • 2,407
11 votes
3 answers
6k views

Why is NVIDIA Pascal GPUs slow on running CUDA Kernels when using cudaMallocManaged

I was testing the new CUDA 8 along with the Pascal Titan X GPU and was expecting a speedup for my code, but for some reason it ends up being slower. I am on Ubuntu 16.04. Here is the minimum code that ...
user3667089's user avatar
  • 3,408
187 votes
8 answers
622k views

How do I select which GPU to run a job on?

In a multi-GPU computer, how do I designate which GPU a CUDA job should run on? As an example, when installing CUDA, I opted to install the NVIDIA_CUDA-<#.#>_Samples then ran several instances ...
Steven C. Howell's user avatar
180 votes
2 answers
178k views

Understanding CUDA grid dimensions, block dimensions and threads organization (simple explanation) [closed]

How are threads organized to be executed by a GPU?
cibercitizen1's user avatar
130 votes
5 answers
69k views

What is a bank conflict? (Doing Cuda/OpenCL programming)

I have been reading the programming guide for CUDA and OpenCL, and I cannot figure out what a bank conflict is. They just sort of dive into how to solve the problem without elaborating on the subject ...
smuggledPancakes's user avatar
118 votes
2 answers
94k views

nvidia-smi Volatile GPU-Utilization explanation?

I know that nvidia-smi -l 1 will give the GPU usage every one second (similarly to the following). However, I would appreciate an explanation on what Volatile GPU-Util really means. Is that the number ...
user3813674's user avatar
  • 2,693
19 votes
4 answers
19k views

128 bit integer on cuda?

I just managed to install my CUDA SDK under Linux Ubuntu 10.04. My graphics card is an NVIDIA GeForce GT 425M, and I'd like to use it for some heavy computational problem. What I wonder is: is there ...
Matteo Monti's user avatar
  • 9,110
85 votes
9 answers
53k views

Horrible redraw performance of the DataGridView on one of my two screens

I've actually solved this, but I'm posting it for posterity. I ran into a very odd issue with the DataGridView on my dual-monitor system. The issue manifests itself as an EXTREMELY slow repaint of ...
Corey Ross's user avatar
  • 2,015
4 votes
1 answer
5k views

Cuda kernel returning vectors

I have a list of words; my goal is to match each word in a very long phrase. I have no problem matching each word; my only problem is returning a vector of structures containing ...
bukk530's user avatar
  • 1,915
639 votes
21 answers
1.5m views

How do I check if PyTorch is using the GPU?

How do I check if PyTorch is using the GPU? The nvidia-smi command can detect GPU activity, but I want to check it directly from inside a Python script.
vvvvv's user avatar
  • 32.9k
11 votes
3 answers
26k views

What can I do against 'CUDA driver version is insufficient for CUDA runtime version'?

When I go to /usr/local/cuda/samples/1_Utilities/deviceQuery and execute moose@pc09 /usr/local/cuda/samples/1_Utilities/deviceQuery $ sudo make clean rm -f deviceQuery deviceQuery.o rm -rf ../../bin/...
Martin Thoma's user avatar
45 votes
5 answers
59k views

How does CUDA assign device IDs to GPUs?

When a computer has multiple CUDA-capable GPUs, each GPU is assigned a device ID. By default, CUDA kernels execute on device ID 0. You can use cudaSetDevice(int device) to select a different device. ...
solvingPuzzles's user avatar
44 votes
5 answers
35k views

OpenGL without X.org in linux

I'd like to open an OpenGL context without X in Linux. Is there any way at all to do it? I know it's possible for integrated Intel graphics card hardware, though most people have Nvidia cards in ...
Cheery's user avatar
  • 25.6k
39 votes
3 answers
25k views

How can I make tensorflow run on a GPU with capability 2.x?

I've successfully installed tensorflow (GPU) on Linux Ubuntu 16.04 and made some small changes in order to make it work with the new Ubuntu LTS release. However, I thought (who knows why) that my GPU ...
mickkk's user avatar
  • 1,192
4 votes
2 answers
9k views

CUDA program causes nvidia driver to crash

My Monte Carlo pi calculation CUDA program is causing my NVIDIA driver to crash when I exceed around 500 trials and 256 full blocks. It seems to be happening in the monteCarlo kernel function. Any help ...
zetatr's user avatar
  • 179
71 votes
5 answers
103k views

CUDA determining threads per block, blocks per grid

I'm new to the CUDA paradigm. My question is in determining the number of threads per block, and blocks per grid. Does a bit of art and trial play into this? What I've found is that many examples have ...
dnbwise's user avatar
  • 1,102
65 votes
9 answers
117k views

Error compiling CUDA from Command Prompt

I'm trying to compile a CUDA test program on Windows 7 via Command Prompt. I'm using this command: nvcc test.cu But all I get is this error: nvcc fatal : Cannot find compiler 'cl.exe' in PATH What may ...
GennSev's user avatar
  • 1,666
33 votes
3 answers
26k views

Are cuda kernel calls synchronous or asynchronous

I read that one can use kernel launches to synchronize different blocks, i.e., if I want all blocks to complete operation 1 before they go on to operation 2, I should place operation 1 in one kernel ...
Programmer's user avatar
  • 6,783
28 votes
2 answers
16k views

Forcing NVIDIA GPU programmatically in Optimus laptops

I'm programming a DirectX game, and when I run it on an Optimus laptop the Intel GPU is used, resulting in horrible performance. If I force the NVIDIA GPU using the context menu or by renaming my ...
Smohn Jith's user avatar
7 votes
2 answers
3k views

Force system with nVidia Optimus to use the real GPU for my application?

I want my application to always run using the real gpu on nVidia Optimus laptops. From "Enabling High Performance Graphics Rendering on Optimus Systems", (http://developer.download.nvidia.com/devzone/...
DelphiDabber's user avatar
4 votes
2 answers
4k views

CUFFT error handling

I'm using the following macro for CUFFT error handling: #define cufftSafeCall(err) __cufftSafeCall(err, __FILE__, __LINE__) inline void __cufftSafeCall(cufftResult err, const char *file, const ...
Vitality's user avatar
  • 21.7k
2 votes
1 answer
3k views

cuda 11 kernel doesn't run

here is a demo.cu aiming to printf from the GPU device: #include "cuda_runtime.h" #include "device_launch_parameters.h" #include <stdio.h> __global__ void hello_cuda() { ...
govordovsky's user avatar
0 votes
1 answer
4k views

GPU is not detected in Tensorflow

I am using Tensorflow on Windows, and I am trying to use my GPU. But Tensorflow seems unable to detect my GPU. I created a Python virtual environment and installed Python (3.8) and TensorFlow. My ...
Bhakti's user avatar
  • 31
599 votes
19 answers
992k views

Nvidia NVML Driver/library version mismatch [closed]

When I run nvidia-smi, I get the following message: Failed to initialize NVML: Driver/library version mismatch An hour ago I received the same message and uninstalled my CUDA library and I was able ...
etal's user avatar
  • 15k
138 votes
10 answers
333k views

Is it possible to run CUDA on AMD GPUs?

I'd like to extend my skill set into GPU computing. I am familiar with raytracing and realtime graphics (OpenGL), but the next generation of graphics and high performance computing seems to be in GPU ...
Lee Jacobs's user avatar
  • 1,775
112 votes
4 answers
91k views

Streaming multiprocessors, Blocks and Threads (CUDA)

What is the relationship between a CUDA core, a streaming multiprocessor and the CUDA model of blocks and threads? What gets mapped to what, and what is parallelized and how? And what is more ...
52 votes
10 answers
215k views

How do I run nvidia-smi on Windows?

nvidia-smi executed in a Command Prompt (CMD) in Windows returns the following error C:\Users>nvidia-smi 'nvidia-smi' is not recognized as an internal or external command, operable program or batch ...
dward4's user avatar
  • 2,042
51 votes
8 answers
99k views

Tensorflow not running on GPU

I have already spent a considerable amount of time digging around on Stack Overflow and elsewhere looking for the answer, but couldn't find anything. Hi all, I am running Tensorflow with Keras on top. I am 90% ...
valegians's user avatar
  • 950
32 votes
4 answers
39k views

How can I get number of Cores in cuda device?

I am looking for a function that counts the number of cores of my CUDA device. I know each microprocessor has specific cores, and my CUDA device has 2 microprocessors. I searched a lot to find a property ...
Alsphere's user avatar
  • 543
15 votes
6 answers
7k views

Forcing hardware accelerated rendering

I have an OpenGL library written in C++ that is used from a C# application using C++/CLI adapters. My problem is that if the application is used on laptops with Nvidia Optimus technology the ...
JohanR's user avatar
  • 151
13 votes
1 answer
16k views

why do we need cudaDeviceSynchronize(); in kernels with device-printf?

__global__ void helloCUDA(float f) { printf("Hello thread %d, f=%f\n", threadIdx.x, f); } int main() { helloCUDA<<<1, 5>>>(1.2345f); cudaDeviceSynchronize(); return ...
gpuguy's user avatar
  • 4,707
3 votes
1 answer
3k views

Calculation on GPU leads to driver error "stopped responding"

I have this little nonsense script here which I am executing in MATLAB R2013b: clear all; n = 2000; times = 50; i = 0; tCPU = tic; disp 'CPU::' A = rand(n, n); B = rand(n, n); disp '::Go' for i = ...
Stefan Falk's user avatar
  • 25.8k
2 votes
1 answer
6k views

Cuda Random Number Generation

I was wondering what was the best way to generate one pseudo random number between 0 and 49k that would be the same for each thread, by using curand or something else. I prefer to generate the ...
Anoracx's user avatar
  • 458
99 votes
6 answers
297k views

GPU-accelerated video processing with ffmpeg [closed]

I want to use ffmpeg to accelerate video encode and decode with an NVIDIA GPU. From NVIDIA's website: NVIDIA GPUs contain one or more hardware-based decoder and encoder(s) (separate from the CUDA ...
Wang Hai's user avatar
  • 1,021
59 votes
3 answers
30k views

Running more than one CUDA applications on one GPU

The CUDA documentation does not specify how many CUDA processes can share one GPU. For example, if I launch more than one CUDA program as the same user with only one GPU card installed in the system, what is ...
cache's user avatar
  • 1,329
19 votes
5 answers
18k views

How to run CUDA without a GPU using a software implementation?

My laptop doesn't have an NVIDIA graphics card, and I want to work on CUDA. The website says that CUDA can be used in emulation mode on non-CUDA hardware too. But when I tried installing CUDA drivers ...
emkrish's user avatar
  • 221
15 votes
2 answers
22k views

Using constants with CUDA

What is the best way to use constants in CUDA? One way is to define constants in constant memory, like: // CUDA global constants __constant__ int M; int main(void) { ... ...
jrsm's user avatar
  • 1,715
15 votes
4 answers
23k views

Compile cuda code for CPU

I'm studying CUDA 5.5 but I don't have an NVIDIA GPU. Old versions of nvcc had a --multicore flag to compile CUDA code for the CPU. What is the equivalent option in the new version of nvcc? I'm working on ...
F.N.B's user avatar
  • 1,639
15 votes
2 answers
11k views

How to interrupt or cancel a CUDA kernel from host code

I am working with CUDA and I am trying to stop my kernel's work (i.e. terminate all running threads) after a certain if block is hit. How can I do that? I am really stuck here.
MD Kamal Hossain Shajal's user avatar
8 votes
2 answers
12k views

C# Performance Counter Help, Nvidia GPU

So I've been experimenting with the performance counter class in C# and have had great success probing the CPU counters and almost everything I can find in the windows performance monitor. HOWEVER, I ...
Alex E's user avatar
  • 269
8 votes
1 answer
19k views

CUDA5 Examples: Has anyone translated some cutil definitions to CUDA5?

Has anyone started to work with the CUDA5 SDK? I have an old project that uses some cutil functions, but they've been abandoned in the new one. The solution was that most functions can be translated ...
Manuel's user avatar
  • 185
