Hello World Tutorial
This tutorial will guide you through creating a simple CV-CUDA application that performs basic image processing operations. This is a Python script that will demonstrate the following:
Load a batch of images into CV-CUDA
Resize the images
Apply a Gaussian blur
Save the results
Visualize the results
Prerequisites
NVIDIA GPU with compute capability 5.2 or newer.
Ubuntu 20.04, 22.04 or 24.04
CUDA 12 runtime with compatible NVIDIA driver.
Python 3.10
Python packages from samples/hello_world/python/requirements.txt
To run this tutorial, install the required Python packages, preferably in a virtual environment. This tutorial was written for Python 3.10, but newer versions of Python 3 may work.
Install the required pip packages listed in the file samples/hello_world/python/requirements.txt:
pip3 install -r requirements.txt
Writing the Hello World App
Find the complete source code for the tutorial in the file samples/hello_world/python/hello_world.py
First, let’s import the necessary modules:
import argparse
import cvcuda
import cupy as cp
from nvidia import nvimgcodec
from matplotlib import pyplot as plt

Module argparse is used to implement the command line argument parsing.
Module cvcuda imports the CV-CUDA API.
Module cupy is used to get access to numpy interfaces with support for a CUDA backend.
Module nvimgcodec is used to load (decode) images from files and store (encode) images to files.
Module pyplot is used to display images.
The main() function contains the logic for the app.
We start by loading all the images from files and stack them as a batch into a CV-CUDA tensor.
# Create the nvimgcodec decoder to load images.
decoder = nvimgcodec.Decoder()

print("Loading images...")

cv_tensors: cvcuda.Tensor = None
img_shape = None
for input_filename in inputs:

    # Open the input file and decode it into an image.
    # nvimgcodec supports jpeg, jpeg2000, tiff, bmp, png, pnm, webp image file formats.
    print(f"Loading image from {input_filename}")
    with open(input_filename, "rb") as in_file:
        data = in_file.read()
        # Decode the loaded image and store in the default CUDA device.
        # nvimgcodec decodes images into RGB uint8 HWC format.
        nv_gpu_img: nvimgcodec.Image = decoder.decode(data).cuda()

    # Wrap an existing CUDA buffer in a CVCUDA tensor.
    # CVCUDA supports (N)HWC image layout only.
    cv_tensor = cvcuda.as_tensor(nv_gpu_img, "HWC")

    # Add loaded image to batch:

    # Check that image sizes are the same.
    if img_shape:
        if img_shape != cv_tensor.shape:
            raise RuntimeError(
                f"All images in input must be of the same size: {img_shape} != {cv_tensor.shape}"
            )
    else:
        img_shape = cv_tensor.shape
    # Pack the loaded tensor into a batch (NHWC).
    cv_tensor = cv_tensor.reshape((1, *cv_tensor.shape), "NHWC")
    cv_tensors = (
        cvcuda.stack([cv_tensors, cv_tensor])
        if cv_tensors
        else cvcuda.stack([cv_tensor])
    )

Here, we use nvimgcodec.Decoder.decode() to decode each image read from the list of input files into RGB uint8 HWC format, loading it into the default CUDA device.
We convert the loaded nvimgcodec.Image into a cvcuda.Tensor, bringing the data into CV-CUDA.
For each input image, we stack it into a batch in a cvcuda.Tensor, converting it from HWC to NHWC, where N is the batch size. In this tutorial, we require the images to be all of the same size (width and height) to fit them into a single cvcuda.Tensor with the NHWC layout.
Note that we can perform the CV-CUDA operations directly on each cvcuda.Tensor as we obtain it without having to batch them. Batching here is used to illustrate how to operate more efficiently on batches of images.
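The same-size requirement and the HWC-to-NHWC packing can be illustrated on shapes alone, without a GPU. The sketch below is plain Python and does not call CV-CUDA. The helper name batch_shape is hypothetical, and shapes are modeled as plain tuples rather than real tensors.

```python
def batch_shape(image_shapes):
    """Return the NHWC shape of a batch built from same-sized HWC images.

    Hypothetical helper for illustration only: it works on shape tuples,
    not on real tensors, and is not part of CV-CUDA.
    """
    if not image_shapes:
        raise ValueError("At least one image is required")

    first = tuple(image_shapes[0])
    for shape in image_shapes[1:]:
        # Mirror the tutorial's check: every image must match the first one.
        if tuple(shape) != first:
            raise RuntimeError(
                f"All images in input must be of the same size: {first} != {tuple(shape)}"
            )

    # Prepend the batch dimension N to (H, W, C).
    return (len(image_shapes), *first)


print(batch_shape([(480, 640, 3), (480, 640, 3)]))  # (2, 480, 640, 3)
```

Two 480 x 640 RGB images give a batch of shape (2, 480, 640, 3), while a mismatched pair raises the same kind of RuntimeError as the tutorial code.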
Next we perform the image processing.
# The resulting cv_tensors has the NHWC layout with N = len(inputs).
assert cv_tensors.shape[0] == len(inputs)
print(cv_tensors.shape)

# Manipulate the tensor data in CVCUDA.

# Resize the tensors.
cv_tensors_result = cvcuda.resize(
    cv_tensors,
    (cv_tensors.shape[0], 224, 224, cv_tensors.shape[-1]),  # N, H, W, C
    interp=cvcuda.Interp.LINEAR,
)

# Apply a gaussian blur.
kernel_size = (3, 3)
gaussian_sigma = (1, 1)
cv_tensors_result = cvcuda.gaussian(
    cv_tensors_result, kernel_size, gaussian_sigma, cvcuda.Border.CONSTANT
)

Once the data is in a cvcuda.Tensor, we perform a resize to 224 x 224, followed by a Gaussian blur with a 3 x 3 kernel and a sigma of 1.
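To give a feel for what a 3 x 3 kernel with a sigma of 1 means, the sketch below builds such a kernel in plain Python. It assumes the textbook construction (sample the Gaussian at integer offsets, normalize, take the outer product of the two axes). CV-CUDA may compute its weights differently, and the helper names here are hypothetical.

```python
import math


def gaussian_kernel_1d(size, sigma):
    """Sample a Gaussian at integer offsets around the center and normalize."""
    half = size // 2
    weights = [
        math.exp(-(x * x) / (2.0 * sigma * sigma)) for x in range(-half, half + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_kernel_2d(kernel_size, sigma):
    """Build a 2D kernel as the outer product of two 1D kernels.

    kernel_size and sigma are 2-tuples, one entry per image axis.
    """
    first_axis = gaussian_kernel_1d(kernel_size[0], sigma[0])
    second_axis = gaussian_kernel_1d(kernel_size[1], sigma[1])
    return [[a * b for b in second_axis] for a in first_axis]


# Same parameters as the tutorial: 3 x 3 kernel, sigma of 1 on both axes.
kernel = gaussian_kernel_2d((3, 3), (1, 1))
for row in kernel:
    print([round(w, 4) for w in row])
```

The weights sum to 1, so the blur preserves overall brightness, and the center pixel contributes the most. Increasing sigma flattens the weights and strengthens the blur.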
Then, we retrieve the results from CV-CUDA and store them to the specified output files.
print("Storing images...")

# Create the nvimgcodec encoder to store images.
encoder = nvimgcodec.Encoder()

# Use cupy to separate the tensor batch.
# cvcuda.Tensor.cuda() returns the buffer with __cuda_array_interface__.
cp_array_result = cp.asarray(cv_tensors_result.cuda())
# Write each image to storage.
encoder.write(outputs, [cp_arr for cp_arr in cp_array_result])

We start by wrapping the cvcuda.Tensor into a cupy.array. The cvcuda.Tensor object is opaque for performance purposes. This step grants us the flexibility to access the data contained in each resulting image to store it.
Then, we save the images to the specified files using the nvimgcodec.Encoder.write() method.
Finally, once we have the resulting images wrapped in a cupy.array, we can use pyplot to display them. We display the first image in the batch as an example.
# Use pyplot to display the first result.
print("Displaying the first result...")
plt.imshow(cp_array_result[0].get())
plt.show()
Running the Sample
To run the hello world example, first make sure the prerequisites are satisfied, then run:
python3 hello_world.py -i /path/to/image1.jpg /path/to/image2.jpg -o output1.jpg output2.jpg
This will:
Load your input images.
Apply image processing (resize and Gaussian blur).
Save results to output files (existing files will be overwritten).
Display the result of the first image.
Command Line Interface
--inputs, -i is used to input a list of image files to load into the app. These must all be of the same size (width and height). Only images in these formats are supported: jpeg, jpeg2000, tiff, bmp, png, pnm, or webp.
--outputs, -o is used to specify the names of the files where the resulting images will be stored. The number of output files must be the same as the number of input files.
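A command line interface with these two options could be sketched with argparse as follows. This is an illustrative assumption, not the parser from hello_world.py: the help strings, the use of required=True, and the count check via parser.error() are choices made for this sketch, and the sample itself may define defaults or validate differently.

```python
import argparse


def parse_args(argv=None):
    """Parse -i/--inputs and -o/--outputs, each taking one or more file names."""
    parser = argparse.ArgumentParser(description="CV-CUDA hello world (sketch)")
    parser.add_argument(
        "-i", "--inputs", nargs="+", required=True,
        help="Input image files, all of the same width and height.",
    )
    parser.add_argument(
        "-o", "--outputs", nargs="+", required=True,
        help="Output image files, one per input file.",
    )
    args = parser.parse_args(argv)

    # The tutorial requires as many output files as input files.
    if len(args.inputs) != len(args.outputs):
        parser.error("The number of output files must match the number of input files")
    return args


# Example invocation with an explicit argument list instead of sys.argv.
args = parse_args(["-i", "a.jpg", "b.jpg", "-o", "out_a.jpg", "out_b.jpg"])
print(args.inputs, args.outputs)
```

Using nargs="+" lets each option collect a list of file names, which matches the way the run command above passes several paths after -i and -o.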
Next Steps
Now that you’ve completed the hello world tutorial, you can:
Try modifying the size values or the Gaussian parameters.
Add more image processing operations.
Explore other CV-CUDA operators.
Check out the more advanced samples in the Samples section.