PyCUDA

PyCUDA exposes the CUDA driver API to Python, giving more low-level control than higher-level array frameworks. It is useful when you need fine-grained control over GPU memory and kernel execution.

Key Points:

  • Importing pycuda.autoinit initializes the CUDA driver and creates a context; do this before any GPU allocation

  • Use gpuarray.to_gpu() to transfer NumPy arrays to the GPU

  • A PyCUDA GPUArray can be converted to a CV-CUDA tensor with cvcuda.as_tensor()

  • Converting back requires manually constructing a GPUArray with the shared GPU pointer

Required Imports:

import numpy as np
import pycuda.autoinit  # noqa: F401
import pycuda.gpuarray as gpuarray
import cvcuda

PyCUDA to CV-CUDA:

numpy_array = np.random.randn(10, 10).astype(np.float32)
pycuda_array = gpuarray.to_gpu(numpy_array)
cvcuda_tensor = cvcuda.as_tensor(pycuda_array)

CV-CUDA to PyCUDA:

new_pycuda_array = gpuarray.GPUArray(
    shape=cvcuda_tensor.shape,
    dtype=cvcuda_tensor.dtype,
    gpudata=cvcuda_tensor.cuda().__cuda_array_interface__["data"][0],
)

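Converting from CV-CUDA to PyCUDA requires reading the device pointer from the CUDA Array Interface (the first element of its "data" entry) and constructing a GPUArray manually. The gpudata parameter takes that pointer directly, so the new GPUArray shares memory with the CV-CUDA tensor instead of copying it. Keep the CV-CUDA tensor alive for as long as the GPUArray is in use.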
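To make the pointer extraction concrete, here is a minimal sketch of how the fields of the CUDA Array Interface map onto the GPUArray constructor arguments. FakeDeviceArray and gpuarray_kwargs are hypothetical names introduced only for illustration, and the pointer value is made up; the stand-in class lets the snippet run without a GPU. With a real tensor you would pass cvcuda_tensor.cuda() instead.

```python
from typing import Any, Dict, Tuple


class FakeDeviceArray:
    """Hypothetical stand-in that exposes the CUDA Array Interface (no GPU needed)."""

    def __init__(self, ptr: int, shape: Tuple[int, ...], typestr: str) -> None:
        self._ptr = ptr
        self._shape = shape
        self._typestr = typestr

    @property
    def __cuda_array_interface__(self) -> Dict[str, Any]:
        # "data" is a (device pointer, read_only flag) pair.
        return {
            "shape": self._shape,
            "typestr": self._typestr,
            "data": (self._ptr, False),
            "version": 3,
        }


def gpuarray_kwargs(obj: Any) -> Dict[str, Any]:
    """Collect shape, dtype string, and device pointer from the interface dict."""
    iface = obj.__cuda_array_interface__
    ptr, _read_only = iface["data"]
    return {
        "shape": tuple(iface["shape"]),
        "dtype": iface["typestr"],  # e.g. "<f4" for little-endian float32
        "gpudata": ptr,             # raw device address as an int
    }


# Made-up address, used only to show the data flow.
fake = FakeDeviceArray(ptr=0x7F0000000000, shape=(10, 10), typestr="<f4")
kwargs = gpuarray_kwargs(fake)
print(kwargs)
```

On a machine with a GPU, the resulting dictionary should be usable as gpuarray.GPUArray(**kwargs), assuming the dtype string is accepted the same way a NumPy dtype would be.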

Complete Example: See samples/interoperability/pycuda_interop.py