..
  # SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
  # SPDX-License-Identifier: Apache-2.0
  #
  # Licensed under the Apache License, Version 2.0 (the "License");
  # you may not use this file except in compliance with the License.
  # You may obtain a copy of the License at
  #
  # http://www.apache.org/licenses/LICENSE-2.0
  #
  # Unless required by applicable law or agreed to in writing, software
  # distributed under the License is distributed on an "AS IS" BASIS,
  # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  # See the License for the specific language governing permissions and
  # limitations under the License.

.. _sample_common:

Utilities
---------

The ``common.py`` module provides utilities for:

* **Image I/O** - GPU-accelerated reading and writing
* **CUDA Memory** - Host-device memory transfers
* **TensorRT** - Model inference wrapper
* **Model Export** - PyTorch to ONNX to TensorRT

All samples import from this module to avoid code duplication.

Module Location
---------------

File: ``samples/common.py``

.. code-block:: python

   from common import (
       read_image,
       write_image,
       TRT,
       cuda_memcpy_h2d,
       cuda_memcpy_d2h,
       zero_copy_split,
       parse_image_args,
       get_cache_dir,
       engine_from_onnx,
       export_classifier_onnx,
       export_retinanet_onnx,
       export_segmentation_onnx,
   )

Image I/O Functions
-------------------

.. _common_read_image:

read_image()
^^^^^^^^^^^^

.. code-block:: python

   def read_image(path: Path) -> cvcuda.Tensor
   # path: Path to input image file (JPG, PNG, etc.)
   # Returns: CV-CUDA tensor in HWC layout, uint8 data type

Load an image from disk directly into GPU memory, using nvImageCodec for GPU-accelerated decoding.

**Example:**

.. code-block:: python

   from pathlib import Path

   from common import read_image

   image = read_image(Path("input.jpg"))
   print(image.shape)  # (H, W, 3)
   print(image.dtype)  # uint8

.. _common_write_image:

write_image()
^^^^^^^^^^^^^

.. code-block:: python

   def write_image(tensor: cvcuda.Tensor, path: Path) -> None
   # tensor: CV-CUDA tensor in HWC layout
   # path: Output file path (format determined by extension: .jpg, .png, etc.)

Save a CV-CUDA tensor as an image file using nvImageCodec.

**Example:**

.. code-block:: python

   from pathlib import Path

   from common import write_image

   # processed_image is a CV-CUDA tensor produced earlier in the pipeline
   write_image(processed_image, Path("output.jpg"))

CUDA Memory Operations
-----------------------

.. _common_cuda_memcpy_h2d:

cuda_memcpy_h2d()
^^^^^^^^^^^^^^^^^

.. code-block:: python

   def cuda_memcpy_h2d(
       host_array: np.ndarray,            # NumPy array on CPU
       device_array: int | dict | object  # GPU pointer or CV-CUDA tensor
   ) -> None

Copy data from CPU (host) memory to GPU (device) memory.

**Example:**

.. code-block:: python

   # Upload normalization parameters
   mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
   mean_tensor = cvcuda.Tensor((3,), np.float32)
   cuda_memcpy_h2d(mean, mean_tensor.cuda())

.. _common_cuda_memcpy_d2h:

cuda_memcpy_d2h()
^^^^^^^^^^^^^^^^^

.. code-block:: python

   def cuda_memcpy_d2h(
       device_array: int | dict | object,  # GPU pointer or CV-CUDA tensor
       host_array: np.ndarray              # NumPy array on CPU (pre-allocated)
   ) -> None

Copy data from GPU (device) memory to CPU (host) memory. The destination array must be allocated by the caller before the copy.

**Example:**

.. code-block:: python

   # Download inference results
   output = np.zeros((1, 1000), dtype=np.float32)
   cuda_memcpy_d2h(output_tensor.cuda(), output)

   # Now process on CPU
   top_classes = np.argsort(output[0])[::-1][:5]

Tensor Utilities
----------------

.. _common_zero_copy_split:

zero_copy_split()
^^^^^^^^^^^^^^^^^

.. code-block:: python

   def zero_copy_split(batch: cvcuda.Tensor) -> list[cvcuda.Tensor]
   # batch: Batched tensor with shape (N, ...) where N is batch size
   # Returns: List of N tensors, each representing one item from the batch

Split a batched tensor into individual tensors without copying data. Each returned tensor is a view into the original batch memory.

**Example:**

.. code-block:: python

   # Stack images
   batch = cvcuda.stack([img1, img2, img3])  # Shape: (3, H, W, C)

   # Process batch
   processed = cvcuda.gaussian(batch, (5, 5), (1.0, 1.0))

   # Split back to individual images
   images = zero_copy_split(processed)  # List of 3 tensors
   for img in images:
       print(img.shape)  # (H, W, C)

Argument Parsing
----------------

.. _common_parse_image_args:

parse_image_args()
^^^^^^^^^^^^^^^^^^

.. code-block:: python

   def parse_image_args(default_output: str = "output.jpg") -> argparse.Namespace
   # default_output: Default output filename
   # Returns: Namespace with input, output, width, height attributes

Parse command-line arguments for image processing samples (``--input``, ``--output``, ``--width``, ``--height``).

**Example:**

.. code-block:: python

   args = parse_image_args("processed.jpg")
   input_image = read_image(args.input)
   # ... process ...
   write_image(result, args.output)

TensorRT Integration
--------------------

.. _common_trt:

TRT Class
^^^^^^^^^

.. code-block:: python

   class TRT:
       def __init__(self, engine_path: Path)
       # engine_path: Path to serialized TensorRT engine file (.trtmodel)

       def __call__(self, inputs: list[cvcuda.Tensor]) -> list[cvcuda.Tensor]
       # inputs: List of CV-CUDA tensors matching engine's expected inputs
       # Returns: List of CV-CUDA tensors containing inference results

Wrapper class for TensorRT engine inference with CV-CUDA tensor support via ``__cuda_array_interface__``.

**Example:**

.. code-block:: python

   # Load TensorRT engine
   from common import TRT, get_cache_dir

   model = TRT(get_cache_dir() / "resnet50.trtmodel")

   # Run inference
   input_tensors = [preprocessed_image]
   output_tensors = model(input_tensors)

   # Access results
   logits = output_tensors[0]

.. _common_engine_from_onnx:

engine_from_onnx()
^^^^^^^^^^^^^^^^^^

.. code-block:: python

   def engine_from_onnx(
       onnx_path: Path,         # Path to ONNX model file
       engine_path: Path,       # Path where TensorRT engine will be saved
       use_fp16: bool = True,   # Enable FP16 precision
       max_batch_size: int = 1  # Maximum batch size to support
   ) -> None

Build a TensorRT engine from an ONNX model with optimizations (FP16, layer fusion, etc.).

**Example:**

.. code-block:: python

   engine_from_onnx(
       Path("model.onnx"),
       Path("model.trtmodel"),
       use_fp16=True
   )

Model Export Functions
----------------------

.. _common_export_classifier_onnx:

export_classifier_onnx()
^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: python

   def export_classifier_onnx(
       model: torch.nn.Module,             # PyTorch model
       output_path: Path,                  # Where to save ONNX file
       input_shape: tuple[int, int, int],  # Model input shape (C, H, W)
       verbose: bool = False               # Print export details
   ) -> None

Export a PyTorch classification model to ONNX format.

**Example:**

.. code-block:: python

   from pathlib import Path

   import torchvision

   model = torchvision.models.resnet50(weights='DEFAULT')
   export_classifier_onnx(
       model,
       Path("resnet50.onnx"),
       (3, 224, 224)
   )

.. _common_export_retinanet_onnx:

export_retinanet_onnx()
^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: python

   def export_retinanet_onnx(
       model: torch.nn.Module,             # PyTorch RetinaNet model
       output_path: Path,                  # Output ONNX path
       input_shape: tuple[int, int, int],  # Input shape (C, H, W)
       score_threshold: float = 0.5,       # Confidence threshold for detections
       iou_threshold: float = 0.5,         # IoU threshold for NMS
       max_detections: int = 100,          # Maximum boxes to return
       verbose: bool = False               # Print export details
   ) -> None

Export a RetinaNet detection model to ONNX with the TensorRT EfficientNMS plugin, so that non-maximum suppression (NMS) runs on the GPU.

.. _common_export_segmentation_onnx:

export_segmentation_onnx()
^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: python

   def export_segmentation_onnx(
       model: torch.nn.Module,             # PyTorch segmentation model
       output_path: Path,                  # Output ONNX path
       input_shape: tuple[int, int, int],  # Input shape (C, H, W)
       verbose: bool = False               # Print export details
   ) -> None

Export a segmentation model (FCN, DeepLab, etc.) to ONNX.

**Example:**

.. code-block:: python

   from pathlib import Path

   import torchvision

   fcn = torchvision.models.segmentation.fcn_resnet101(weights='DEFAULT')
   export_segmentation_onnx(
       fcn,
       Path("fcn.onnx"),
       (3, 224, 224)
   )

Dependencies
------------

The common module requires:

* **cvcuda** - CV-CUDA
* **numpy** - Array operations
* **tensorrt** - TensorRT inference
* **torch** - PyTorch for model export
* **nvimgcodec** - Image I/O
* **cuda-python** - CUDA runtime bindings

See Also
--------

* :ref:`Hello World Sample ` - Uses image I/O functions
* :ref:`Classification Sample ` - Uses TensorRT utilities
* :ref:`Applications ` - End-to-end pipelines
* :ref:`Operators ` - Individual operators
* :ref:`Python API ` - Core API reference
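
Appendix: Zero-Copy Views
-------------------------

The description of ``zero_copy_split()`` says the returned tensors are views into the original batch memory rather than copies. The sketch below illustrates that idea on the CPU with Python's built-in ``memoryview``. It is not how ``common.py`` implements the function, and ``split_views`` is a hypothetical helper introduced only for this illustration.

.. code-block:: python

   # Illustration only: uses the standard-library memoryview, not CV-CUDA.
   # split_views is a hypothetical helper and is not part of common.py.

   def split_views(buffer: bytearray, n: int) -> list[memoryview]:
       """Split a flat buffer holding n equally sized items into n views."""
       if n <= 0 or len(buffer) % n != 0:
           raise ValueError("buffer length must be a positive multiple of n")
       item_size = len(buffer) // n
       whole = memoryview(buffer)
       # Slicing a memoryview yields another view over the same memory;
       # no bytes are copied.
       return [whole[i * item_size:(i + 1) * item_size] for i in range(n)]

   batch = bytearray(range(12))  # stand-in for a batch of 3 items, 4 bytes each
   items = split_views(batch, 3)

   items[1][0] = 255             # write through the second view
   print(batch[4])               # the write is visible in the original buffer

Because each item aliases the batch, a write through a view changes the batch as well, and the batch must stay alive for as long as any view is in use. The same caution plausibly applies to tensors returned by ``zero_copy_split()``, although this page does not spell out their lifetime rules.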