Relative Content

Tag Archive for cuda

cuda device class data modification fails for large number of threads [duplicate]

This question already has answers here: "Illegal Memory Access on cudaDeviceSynchronize"; "CUDA: How to fill a vector of dynamic size on device and return its contents to another device function?"; "Program hit cudaErrorIllegalAdress without cuda-memcheck error when running program with a large dataset".

I instantiate […]

Can you help me find the reason why my CUDA coded MLP will not learn?

I wanted to write an MLP in CUDA without any dependencies. I apologise in advance for my messy code. Could you please examine my CUDA functions to see whether there is an obvious mistake that explains why the network does not solve the simple XOR problem, as it should? We should see the error decrease, but instead it just produces a random error. I tried to write my own CUDA RNG, but I am using rand() instead. I have […]

Performance Issue with Custom Kernels in CUDA Compared to cuSOLVER

I’ve implemented a QR factorization algorithm in CUDA tailored to my specific needs. While testing, I noticed that the execution time of my custom kernel appears to increase exponentially as the matrix size grows, whereas the execution time of NVIDIA’s cuSOLVER grows far more gradually.
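One way to test the "exponential" impression is to fit the timings on a log-log scale. A dense QR factorization of an n-by-n matrix performs on the order of n^3 floating-point operations, so polynomial growth with an exponent near 3 would be expected, and that can look explosive without being exponential. The sketch below is plain host-side C++ rather than CUDA, and `estimate_growth_exponent` is a hypothetical helper name, not part of CUDA or cuSOLVER. It assumes the caller supplies at least two distinct, positive matrix sizes with their measured times.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of CUDA or cuSOLVER).
// Returns the least-squares slope of log(time) against log(size).
// If time ~ c * n^p, the slope approximates p.
// Assumes: equal-length inputs, at least two distinct sizes, all values > 0.
double estimate_growth_exponent(const std::vector<double>& sizes,
                                const std::vector<double>& times) {
    assert(sizes.size() == times.size() && sizes.size() >= 2);

    const double count = static_cast<double>(sizes.size());
    double sum_x = 0.0, sum_y = 0.0, sum_xx = 0.0, sum_xy = 0.0;

    for (std::size_t i = 0; i < sizes.size(); ++i) {
        const double x = std::log(sizes[i]);  // log of matrix dimension
        const double y = std::log(times[i]);  // log of measured time
        sum_x += x;
        sum_y += y;
        sum_xx += x * x;
        sum_xy += x * y;
    }

    // Standard closed-form slope of a simple linear regression.
    return (count * sum_xy - sum_x * sum_y) /
           (count * sum_xx - sum_x * sum_x);
}
```

Running this on the matrix sizes and measured times of both the custom kernel and cuSOLVER gives two exponents to compare. A value near 3 is consistent with the expected cubic cost. An estimate that keeps rising as larger sizes are added may point to something other than arithmetic, such as memory traffic or launch configuration.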

Passing array of struct from device to array

I am trying to copy the variable d_output->list from the device to the host using cudaMemcpy, but I get "Segmentation fault (core dumped)". Could you please let me know why?