Image Decoding using nvImageCodec

The image batch decoder is responsible for parsing the input expression, then reading and decoding the image data. Decoding is done in batches with the nvImageCodec library. Although it is used in the semantic segmentation sample, this image decoder is generic enough to be reused in other applications. The code for this class is in the samples/common/python/nvcodec_utils.py file.

Before the data can be read or decoded, we must parse the input (i.e., figure out what kind of data it is). Depending on the value of input_path, we either read a single image and repeat it in a list to simulate a batch, or read all JPEG images from a directory.

self.logger = logging.getLogger(__name__)
self.batch_size = batch_size
self.input_path = input_path
self.device_id = device_id
self.total_decoded = 0
self.batch_idx = 0
self.cuda_ctx = cuda_ctx
self.cuda_stream = cuda_stream
self.cvcuda_perf = cvcuda_perf
self.decoder = nvimgcodec.Decoder(device_id=device_id)

# docs_tag: begin_parse_imagebatchdecoder_nvimagecodec
if os.path.isfile(self.input_path):
    if os.path.splitext(self.input_path)[1] == ".jpg":
        # Read the input image file.
        self.file_names = [self.input_path] * self.batch_size
        # We will use the nvImageCodec based decoder on the GPU in case of images.
        # This will be allocated once during the first run or whenever a batch
        # size change happens.
    else:
        raise ValueError("Unable to read file %s as image." % self.input_path)

elif os.path.isdir(self.input_path):
    # It is a directory. Grab file names of all JPG images.
    self.file_names = glob.glob(os.path.join(self.input_path, "*.jpg"))
    self.logger.info("Found a total of %d JPEG images." % len(self.file_names))

else:
    raise ValueError(
        "Unknown expression given as input_path: %s." % self.input_path
    )
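
The parsing logic above can be tried outside the class with a small standalone sketch. The function name collect_image_files is hypothetical and is not part of nvcodec_utils.py. The sketch also sorts the directory listing, which the sample does not do, because glob.glob returns names in arbitrary order.

```python
import glob
import os


def collect_image_files(input_path, batch_size):
    """Hypothetical helper mirroring the input parsing shown above."""
    if os.path.isfile(input_path):
        if os.path.splitext(input_path)[1] == ".jpg":
            # Single image: repeat it to simulate a full batch.
            return [input_path] * batch_size
        raise ValueError("Unable to read file %s as image." % input_path)

    if os.path.isdir(input_path):
        # Directory: collect all .jpg files. Sorting is an addition made in
        # this sketch so that the batch contents are reproducible.
        return sorted(glob.glob(os.path.join(input_path, "*.jpg")))

    raise ValueError("Unknown expression given as input_path: %s." % input_path)
```

Note that, like the sample code, this sketch only matches the lowercase .jpg extension, so files named .JPG or .jpeg would be skipped.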

Once we have a list of image file names that we can read, we split them into batches based on the batch size.

self.file_name_batches = [
    self.file_names[i : i + self.batch_size]  # noqa: E203
    for i in range(0, len(self.file_names), self.batch_size)
]
# docs_tag: end_batch_imagebatchdecoder_nvimagecodec

self.max_image_size = 1024 * 1024 * 3  # Maximum possible image size.

self.logger.info(
    "Using nvImageCodec decoder version: %s" % nvimgcodec.__version__
)
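
To see what the slicing produces, here is a minimal standalone sketch of the same batching step. The function name split_into_batches and the file names are illustrative only. It shows that the last batch can be smaller than batch_size when the number of files is not a multiple of it.

```python
def split_into_batches(file_names, batch_size):
    """Split a list into consecutive chunks of at most batch_size items."""
    return [
        file_names[i : i + batch_size]
        for i in range(0, len(file_names), batch_size)
    ]


# Five illustrative file names with a batch size of 2.
batches = split_into_batches(
    ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"], 2
)
print(batches)
# [['a.jpg', 'b.jpg'], ['c.jpg', 'd.jpg'], ['e.jpg']]
```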

That is all we need to do for the initialization. As soon as the decoder is called, it starts reading and decoding the data. This begins with reading the data bytes one batch at a time, returning None when there is no data left to read.
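
The read step described above can be sketched as follows. This is a simplified stand-in under stated assumptions, not the sample's actual code: the class name BatchFileReader is hypothetical, and it reads each file as plain Python bytes, whereas the sample may load the encoded data differently before passing it to the decoder.

```python
class BatchFileReader:
    """Hypothetical reader: returns one batch of encoded bytes per call."""

    def __init__(self, file_name_batches):
        self.file_name_batches = file_name_batches
        self.batch_idx = 0

    def __call__(self):
        # Signal the end of the data by returning None.
        if self.batch_idx >= len(self.file_name_batches):
            return None

        file_name_batch = self.file_name_batches[self.batch_idx]

        # Read the raw (still encoded) bytes of every file in this batch.
        data_batch = []
        for file_name in file_name_batch:
            with open(file_name, "rb") as f:
                data_batch.append(f.read())

        self.batch_idx += 1
        return file_name_batch, data_batch
```

A caller would typically loop until the reader returns None, handing each data_batch to the decoder in turn.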

Once the data has been read, we use nvImageCodec to decode it into a list of image tensors. The nvImageCodec instance is allocated either on its first use or whenever the batch size changes (e.g., for the last batch). Since what we get at this point is a list of images (i.e., a Python list of 3D tensors), we need to convert them into a single 4D tensor by stacking them along the first dimension.

tensor_list = []
image_list = self.decoder.decode(data_batch, cuda_stream=self.cuda_stream)

# Convert the decoded images to nvcv tensors in a list.
for i in range(len(image_list)):
    tensor_list.append(cvcuda.as_tensor(image_list[i], "HWC"))

# Stack the list of tensors to a single NHWC tensor.
cvcuda_decoded_tensor = cvcuda.stack(tensor_list)
self.total_decoded += len(tensor_list)
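
The shape change from a list of HWC images to one NHWC tensor can be illustrated on the CPU with NumPy. This is only an analogy for the layout: it uses np.stack rather than cvcuda.stack, and the image sizes are made up for the example.

```python
import numpy as np

# Two illustrative "decoded images", each in HWC layout (height 4, width 6,
# 3 channels). In the sample these would be GPU tensors, not NumPy arrays.
image_list = [np.zeros((4, 6, 3), dtype=np.uint8) for _ in range(2)]

# Stack along a new first dimension to get a single NHWC array.
stacked = np.stack(image_list, axis=0)
print(stacked.shape)
# (2, 4, 6, 3)
```

np.stack requires every array to have the same shape, so this sketch assumes that all images in a batch share the same height and width.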

The final step is to pack all of this data into a special CV-CUDA samples object called Batch. The Batch object keeps track of the data associated with the batch, the index of the batch, and optionally any file name information one wants to attach (i.e., which files the data came from).

batch = Batch(
    batch_idx=self.batch_idx,
    data=cvcuda_decoded_tensor,
    fileinfo=file_name_batch,
)
self.batch_idx += 1
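
For readers who want to experiment without the samples package, a container with the same three fields can be sketched as a dataclass. SimpleBatch is a hypothetical stand-in written for this illustration. It is not the Batch class from the CV-CUDA samples, whose actual definition may differ.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class SimpleBatch:
    """Hypothetical stand-in for the samples' Batch object."""

    batch_idx: int  # Index of this batch in the sequence of batches.
    data: Any  # The batched data, e.g. an NHWC tensor.
    fileinfo: Optional[List[str]] = None  # Files the data came from.


# Illustrative usage with placeholder data instead of a real tensor.
example = SimpleBatch(batch_idx=0, data=[[0, 0, 0]], fileinfo=["a.jpg"])
print(example.batch_idx, example.fileinfo)
# 0 ['a.jpg']
```

Keeping the file names next to the data makes it possible for later stages to associate each result with the image it came from.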