..
  # SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
  # SPDX-License-Identifier: Apache-2.0
  #
  # Licensed under the Apache License, Version 2.0 (the "License");
  # you may not use this file except in compliance with the License.
  # You may obtain a copy of the License at
  #
  # http://www.apache.org/licenses/LICENSE-2.0
  #
  # Unless required by applicable law or agreed to in writing, software
  # distributed under the License is distributed on an "AS IS" BASIS,
  # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  # See the License for the specific language governing permissions and
  # limitations under the License.

.. _v0.10.0-beta:

v0.10.0-beta
============

Release Highlights
------------------

CV-CUDA v0.10.0 includes a critical bug fix (cache growth management) alongside the following changes:

* **New Features**:

  * Added mechanism to limit and manage cache memory consumption (includes new "Best Practices" documentation) [1]_.
  * Performance improvements of color conversion operators (e.g., 2x faster RGB2YUV).
  * Refactored codebase to allow independent build of NVCV library (data structures).

* **Bug Fixes**:

  * Fixed unbounded cache memory consumption issue [1]_.
  * Improved management of Python-created object lifetimes, decoupled from cache management [1]_.
  * Fixed potential crash in Resize operator's linear and nearest neighbor interpolation from non-aligned vectorized writes.
  * Fixed Python CvtColor operator to correctly handle NV12 and NV21 outputs.
  * Fixed Resize and RandomResizedCrop linear interpolation weight for border rows and columns.
  * Fixed missing parameter in C API for fused ResizeCropConvertReformat.
  * Fixed several minor documentation and error output issues.
  * Fixed minor compiler warning while building Resize operator.

Compatibility and Known Limitations
-----------------------------------

* **New limitations**:

  * Cache/resource management introduced in v0.10 add micro-second-level overhead to Python operator calls. Based on the performance analysis of our Python samples, we expect the production- and pipeline-level impact to be negligible. CUDA kernel and C++ call performance is not affected. We aim to investigate and reduce this overhead further in a future release.​
  * Sporadic Pybind11-deallocation crashes have been reported in long-lasting multi-threaded Python pipelines with externally allocated memory (eg wrapped Pytorch buffers). We are evaluating an upgrade of Pybind11 (currently using 2.10) as a potential fix in an upcoming release.

For the full list, see main README on `CV-CUDA GitHub <https://github.com/CVCUDA/CV-CUDA>`_.

License
-------

CV-CUDA is licensed under the `Apache 2.0 <https://github.com/CVCUDA/CV-CUDA/blob/main/LICENSE.md>`_ license.

Resources
---------

1. `CV-CUDA GitHub <https://github.com/CVCUDA/CV-CUDA>`_
2. `CV-CUDA Increasing Throughput and Reducing Costs for AI-Based Computer Vision with CV-CUDA <https://developer.nvidia.com/blog/increasing-throughput-and-reducing-costs-for-computer-vision-with-cv-cuda/>`_
3. `NVIDIA Announces Microsoft, Tencent, Baidu Adopting CV-CUDA for Computer Vision AI <https://blogs.nvidia.com/blog/2023/03/21/cv-cuda-ai-computer-vision/>`_
4. `CV-CUDA helps Tencent Cloud audio and video PaaS platform achieve full-process GPU acceleration for video enhancement AI <https://developer.nvidia.com/zh-cn/blog/cv-cuda-high-performance-image-processing/>`_

Acknowledgements
----------------

CV-CUDA is developed jointly by NVIDIA and the ByteDance Machine Learning team.

.. [1] These fixes and features add micro-second-level overhead to Python operator calls. Based on the performance analysis of our Python samples, we expect the production- and pipeline-level impact to be negligible. CUDA kernel and C++ call performance is not affected. We aim to investigate and reduce this overhead further in a future release.​