Math operators

group NVCV_CPP_CUDATOOLS_MATHOPERATORS

Operators on CUDA compound types that behave like the corresponding operators on regular C types.

This group defines a set of arithmetic and bitwise operators on CUDA compound types. They work the same way as the corresponding operators on regular C types. For instance, given three int3 values a, b and c, the operation a += b * c is accepted (see example below). Furthermore, the operators accept mixed CUDA compound and regular C operands, e.g. given two int3 values a and b and one int c, the operation a += b * c is also accepted: the scalar c is propagated to all components of b in the multiplication, and the resulting int3 is added to a in the assignment.

using DataType = ...;
DataType pix = ...;
float kernel = ...;
ConvertBaseTypeTo<float, DataType> res = {0};
res += kernel * pix;
Template Parameters:
  • T – Type of the first CUDA compound or regular C type operand.

  • U – Type of the second CUDA compound or regular C type operand.

Parameters:
  • a – [in] First operand.

  • b – [in] Second operand.

Returns:

Result of applying the operator on a and b.

Defines

NVCV_CUDA_UNARY_OPERATOR(OPERATOR, REQUIREMENT)
NVCV_CUDA_BINARY_OPERATOR(OPERATOR, REQUIREMENT)

Functions

template<typename T, class = nvcv::cuda::Require< nvcv::cuda::IsCompound <T>>> inline __host__ __device__ auto operator- (T a)
template<typename T, class = nvcv::cuda::Require< nvcv::cuda::IsCompound <T>>> inline __host__ __device__ auto operator+ (T a)
template<typename T, class = nvcv::cuda::Require< nvcv::cuda::detail::IsIntegralCompound <T>>> inline __host__ __device__ auto operator~ (T a)
template<typename T, typename U, class = nvcv::cuda::Require< nvcv::cuda::detail::OneIsCompound <T, U>>> inline __host__ __device__ auto operator- (T a, U b)
template<typename T, typename U, class = nvcv::cuda::Require<nvcv::cuda::IsCompound<T>>> inline __host__ __device__ T & operator-= (T &a, U b)
template<typename T, typename U, class = nvcv::cuda::Require< nvcv::cuda::detail::OneIsCompound <T, U>>> inline __host__ __device__ auto operator+ (T a, U b)
template<typename T, typename U, class = nvcv::cuda::Require<nvcv::cuda::IsCompound<T>>> inline __host__ __device__ T & operator+= (T &a, U b)
template<typename T, typename U, class = nvcv::cuda::Require< nvcv::cuda::detail::OneIsCompound <T, U>>> inline __host__ __device__ auto operator* (T a, U b)
template<typename T, typename U, class = nvcv::cuda::Require<nvcv::cuda::IsCompound<T>>> inline __host__ __device__ T & operator*= (T &a, U b)
template<typename T, typename U, class = nvcv::cuda::Require< nvcv::cuda::detail::OneIsCompound <T, U>>> inline __host__ __device__ auto operator/ (T a, U b)
template<typename T, typename U, class = nvcv::cuda::Require<nvcv::cuda::IsCompound<T>>> inline __host__ __device__ T & operator/= (T &a, U b)
template<typename T, typename U, class = nvcv::cuda::Require< nvcv::cuda::detail::OneIsCompoundAndBothAreIntegral <T, U>>> inline __host__ __device__ auto operator% (T a, U b)
template<typename T, typename U, class = nvcv::cuda::Require<nvcv::cuda::IsCompound<T>>> inline __host__ __device__ T & operator%= (T &a, U b)
template<typename T, typename U, class = nvcv::cuda::Require< nvcv::cuda::detail::OneIsCompoundAndBothAreIntegral <T, U>>> inline __host__ __device__ auto operator& (T a, U b)
template<typename T, typename U, class = nvcv::cuda::Require<nvcv::cuda::IsCompound<T>>> inline __host__ __device__ T & operator&= (T &a, U b)
template<typename T, typename U, class = nvcv::cuda::Require< nvcv::cuda::detail::OneIsCompoundAndBothAreIntegral <T, U>>> inline __host__ __device__ auto operator| (T a, U b)
template<typename T, typename U, class = nvcv::cuda::Require<nvcv::cuda::IsCompound<T>>> inline __host__ __device__ T & operator|= (T &a, U b)
template<typename T, typename U, class = nvcv::cuda::Require< nvcv::cuda::detail::OneIsCompoundAndBothAreIntegral <T, U>>> inline __host__ __device__ auto operator^ (T a, U b)
template<typename T, typename U, class = nvcv::cuda::Require<nvcv::cuda::IsCompound<T>>> inline __host__ __device__ T & operator^= (T &a, U b)
template<typename T, typename U, class = nvcv::cuda::Require< nvcv::cuda::detail::OneIsCompoundAndBothAreIntegral <T, U>>> inline __host__ __device__ auto operator<< (T a, U b)
template<typename T, typename U, class = nvcv::cuda::Require<nvcv::cuda::IsCompound<T>>> inline __host__ __device__ T & operator<<= (T &a, U b)
template<typename T, typename U, class = nvcv::cuda::Require< nvcv::cuda::detail::OneIsCompoundAndBothAreIntegral <T, U>>> inline __host__ __device__ auto operator>> (T a, U b)
template<typename T, typename U, class = nvcv::cuda::Require<nvcv::cuda::IsCompound<T>>> inline __host__ __device__ T & operator>>= (T &a, U b)
template<typename T, typename U, class = nvcv::cuda::Require<nvcv::cuda::detail::IsSameCompound<T, U>>> inline __host__ __device__ bool operator== (T a, U b)
template<typename T, typename U, class = nvcv::cuda::Require<nvcv::cuda::detail::IsSameCompound<T, U>>> inline __host__ __device__ bool operator!= (T a, U b)
namespace nvcv

Typedefs

using CustomHostMemAllocator = CustomMemAllocator<HostMemAllocator>
using CustomHostPinnedMemAllocator = CustomMemAllocator<HostPinnedMemAllocator>
using CustomCudaMemAllocator = CustomMemAllocator<CudaMemAllocator>
using ArrayDataCleanupFunc = void(const ArrayData&)
using ArrayDataCleanupCallback = CleanupCallback<ArrayDataCleanupFunc, detail::RemovePointer_t<NVCVArrayDataCleanupFunc>, TranslateArrayDataCleanup>
using ArrayWrapHandle = NonOwningResource<Array>
template<class T>
using HandleTypeOf = typename std::remove_pointer<T>::type::HandleType

Helper type alias to deduce the handle type of a given class T.

This type alias uses std::remove_pointer to strip the pointer off of T (if any), and then accesses the HandleType nested type within T.

Template Parameters:

T – The class type from which to deduce the handle type.
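
As an illustrative sketch (assuming, as suggested by the handle specializations below, that nvcv::Image declares NVCVImageHandle as its HandleType):

// HandleTypeOf strips a possible pointer from T and yields T::HandleType.
using ImgHandle = nvcv::HandleTypeOf<nvcv::Image>; // assumed to be NVCVImageHandle
static_assert(std::is_same<ImgHandle, nvcv::HandleTypeOf<nvcv::Image *>>::value,
              "the pointer is stripped before HandleType is accessed");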

using ExtraChannelInfo = NVCVExtraChannelInfo
template<typename CppFuncType, typename CFuncType = detail::AddContext_t<CppFuncType>, typename TranslateCall = detail::NoTranslation>
using CleanupCallback = Callback<CppFuncType, CFuncType, TranslateCall, true>
using ImageDataCleanupFunc = void(const ImageData&)
typedef CleanupCallback<ImageDataCleanupFunc, detail::RemovePointer_t<NVCVImageDataCleanupFunc>, TranslateImageDataCleanup> ImageDataCleanupCallback
using ImageWrapHandle = NonOwningResource<Image>
using ImageBatchWrapHandle = NonOwningResource<ImageBatch>
using ImageBatchVarShapeWrapHandle = NonOwningResource<ImageBatchVarShape>
using OptionalImageBatchVarShapeConstRef = nvcv::Optional<std::reference_wrapper<const nvcv::ImageBatchVarShape>>
using ImageBufferStrided = NVCVImageBufferStrided
using ImagePlaneStrided = NVCVImagePlaneStrided
using TensorDataCleanupFunc = void(const TensorData&)
using TensorDataCleanupCallback = CleanupCallback<TensorDataCleanupFunc, detail::RemovePointer_t<NVCVTensorDataCleanupFunc>, TranslateTensorDataCleanup>
using TensorWrapHandle = NonOwningResource<Tensor>
using OptionalTensorConstRef = nvcv::Optional<std::reference_wrapper<const nvcv::Tensor>>
using TensorBatchWrapHandle = NonOwningResource<TensorBatch>

Enums

enum class ColorModel : int8_t

Values:

enumerator UNDEFINED
enumerator YCbCr
enumerator RGB
enumerator RAW
enumerator XYZ
enum class ColorSpace : int8_t

Values:

enumerator BT601
enumerator BT709
enumerator BT2020
enumerator DCIP3
enum class WhitePoint : int8_t

Values:

enumerator D65
enum class YCbCrEncoding : int8_t

Values:

enumerator UNDEFINED
enumerator BT601
enumerator BT709
enumerator BT2020
enumerator BT2020c
enumerator SMPTE240M
enum class ColorTransferFunction : int8_t

Values:

enumerator LINEAR
enumerator sRGB
enumerator sYCC
enumerator PQ
enumerator BT709
enumerator BT2020
enumerator SMPTE240M
enum class ColorRange : int8_t

Values:

enumerator FULL
enumerator LIMITED
enum class ChromaLocation : int8_t

Values:

enumerator EVEN
enumerator CENTER
enumerator ODD
enumerator BOTH
enum class RawPattern : uint8_t

Values:

enumerator BAYER_RGGB
enumerator BAYER_BGGR
enumerator BAYER_GRBG
enumerator BAYER_GBRG
enumerator BAYER_RCCB
enumerator BAYER_BCCR
enumerator BAYER_CRBC
enumerator BAYER_CBRC
enumerator BAYER_RCCC
enumerator BAYER_CRCC
enumerator BAYER_CCRC
enumerator BAYER_CCCR
enumerator BAYER_CCCC
enum class ChromaSubsampling : int8_t

Values:

enumerator NONE
enumerator CSS_444
enumerator CSS_422
enumerator CSS_422R
enumerator CSS_411
enumerator CSS_411R
enumerator CSS_420
enumerator CSS_440
enumerator CSS_410
enumerator CSS_410R
enum class Packing : int32_t

Values:

enumerator NONE
enumerator X1

One 1-bit channel.

enumerator X2

One 2-bit channel.

enumerator X4

One 4-bit channel.

enumerator X8

One 8-bit channel.

enumerator X4Y4

Two 4-bit channels in one word.

enumerator X3Y3Z2

Three 3-, 3- and 2-bit channels in one 8-bit word.

enumerator X16

One 16-bit channel.

enumerator b6X10

One LSB 10-bit channel in one 16-bit word.

enumerator X10b6

One MSB 10-bit channel in one 16-bit word.

enumerator b4X12

One LSB 12-bit channel in one 16-bit word.

enumerator X12b4

One MSB 12-bit channel in one 16-bit word.

enumerator b2X14

One LSB 14-bit channel in one 16-bit word.

enumerator X8_Y8

Two 8-bit channels in two 8-bit words.

enumerator X5Y5Z6

Three 5-, 5- and 6-bit channels in one 16-bit word.

enumerator X5Y6Z5

Three 5-, 6- and 5-bit channels in one 16-bit word.

enumerator X6Y5Z5

Three 6-, 5- and 5-bit channels in one 16-bit word.

enumerator b4X4Y4Z4

Three 4-bit channels in one 16-bit word.

enumerator b1X5Y5Z5

Three 5-bit channels in one 16-bit word.

enumerator X5Y5b1Z5

Three 5-bit channels in one 16-bit word.

enumerator X1Y5Z5W5

Four 1-, 5-, 5- and 5-bit channels in one 16-bit word.

enumerator X4Y4Z4W4

Four 4-bit channels in one 16-bit word.

enumerator X5Y1Z5W5

Four 5-, 1-, 5- and 5-bit channels in one 16-bit word.

enumerator X5Y5Z1W5

Four 5-, 5-, 1- and 5-bit channels in one 16-bit word.

enumerator X5Y5Z5W1

Four 5-, 5-, 5- and 1-bit channels in one 16-bit word.

enumerator X8_Y8__X8_Z8

2 pixels of 2 8-bit channels each, totalling 4 8-bit words.

enumerator Y8_X8__Z8_X8

2 pixels of 2 swapped 8-bit channels each, totalling 4 8-bit words.

enumerator X24

One 24-bit channel.

enumerator X8_Y8_Z8

Three 8-bit channels in three 8-bit words.

enumerator X32

One 32-bit channel.

enumerator b12X20

One LSB 20-bit channel in one 32-bit word.

enumerator X16_Y16

Two 16-bit channels in two 16-bit words.

enumerator X10b6_Y10b6

Two MSB 10-bit channels in two 16-bit words.

enumerator X12b4_Y12b4

Two MSB 12-bit channels in two 16-bit words.

enumerator X10Y11Z11

Three 10-, 11- and 11-bit channels in one 32-bit word.

enumerator X11Y11Z10

Three 11-, 11- and 10-bit channels in one 32-bit word.

enumerator X8_Y8_Z8_W8

Four 8-bit channels in one 32-bit word.

enumerator X2Y10Z10W10

Four 2-, 10-, 10- and 10-bit channels in one 32-bit word.

enumerator X10Y10Z10W2

Four 10-, 10-, 10- and 2-bit channels in one 32-bit word.

enumerator X48

One 48-bit channel.

enumerator X16_Y16_Z16

Three 16-bit channels in three 16-bit words.

enumerator X64

One 64-bit channel.

enumerator X32_Y32

Two 32-bit channels in two 32-bit words.

enumerator X16_Y16_Z16_W16

Four 16-bit channels in one 64-bit word.

enumerator X96

One 96-bit channel.

enumerator X32_Y32_Z32

Three 32-bit channels in three 32-bit words.

enumerator X128

One 128-bit channel.

enumerator X64_Y64

Two 64-bit channels in two 64-bit words.

enumerator X32_Y32_Z32_W32

Four 32-bit channels in four 32-bit words.

enumerator X192

One 192-bit channel.

enumerator X64_Y64_Z64

Three 64-bit channels in three 64-bit words.

enumerator X256

One 256-bit channel.

enumerator X128_Y128

Two 128-bit channels in two 128-bit words.

enumerator X64_Y64_Z64_W64

Four 64-bit channels in four 64-bit words.

enum class DataKind : int8_t

Values:

enumerator UNSPECIFIED
enumerator UNSIGNED
enumerator SIGNED
enumerator FLOAT
enumerator COMPLEX
enum class MemLayout : int8_t

Values:

enumerator PITCH_LINEAR
enumerator BLOCK1_LINEAR
enumerator BLOCK2_LINEAR
enumerator BLOCK4_LINEAR
enumerator BLOCK8_LINEAR
enumerator BLOCK16_LINEAR
enumerator BLOCK32_LINEAR
enumerator BLOCK_LINEAR
enumerator PL
enumerator BL
enum class Channel : int8_t

Values:

enumerator NONE
enumerator X
enumerator Y
enumerator Z
enumerator W
enumerator MAX
enum class ExtraChannel : int8_t

Values:

enumerator U
enumerator D
enumerator POS3D
enum class AlphaType : int8_t

Values:

enumerator ASSOCIATED
enumerator UNASSOCIATED
enum class Swizzle : int32_t

Values:

enumerator S_0000
enumerator S_1000
enumerator S_0001
enumerator S_XYZW
enumerator S_ZYXW
enumerator S_WXYZ
enumerator S_WZYX
enumerator S_YZWX
enumerator S_XYZ1
enumerator S_XYZ0
enumerator S_YZW1
enumerator S_XXX1
enumerator S_XZY1
enumerator S_ZYX1
enumerator S_ZYX0
enumerator S_WZY1
enumerator S_X000
enumerator S_0X00
enumerator S_00X0
enumerator S_000X
enumerator S_Y000
enumerator S_0Y00
enumerator S_00Y0
enumerator S_000Y
enumerator S_0XY0
enumerator S_XXXY
enumerator S_YYYX
enumerator S_0YX0
enumerator S_X00Y
enumerator S_Y00X
enumerator S_X001
enumerator S_XY01
enumerator S_XY00
enumerator S_0XZ0
enumerator S_0ZX0
enumerator S_XZY0
enumerator S_YZX1
enumerator S_ZYW1
enumerator S_0YX1
enumerator S_XYXZ
enumerator S_YXZX
enumerator S_XZ00
enumerator S_WYXZ
enumerator S_YX00
enumerator S_YX01
enumerator S_00YX
enumerator S_00XY
enumerator S_0XY1
enumerator S_0X01
enumerator S_YZXW
enumerator S_YW00
enumerator S_XYW0
enumerator S_YZW0
enumerator S_YZ00
enumerator S_00X1
enumerator S_0ZXY
enumerator S_UNSUPPORTED
enum class ByteOrder : int8_t

Values:

enumerator LSB
enumerator MSB
enum class Byte : uint8_t

Byte type, similar to C++17’s std::byte

Values:

enum class Status : int8_t

Values:

enumerator SUCCESS
enumerator ERROR_NOT_IMPLEMENTED
enumerator ERROR_INVALID_ARGUMENT
enumerator ERROR_INVALID_IMAGE_FORMAT
enumerator ERROR_INVALID_OPERATION
enumerator ERROR_DEVICE
enumerator ERROR_NOT_READY
enumerator ERROR_OUT_OF_MEMORY
enumerator ERROR_INTERNAL
enumerator ERROR_NOT_COMPATIBLE
enumerator ERROR_OVERFLOW
enumerator ERROR_UNDERFLOW
enum TensorLabel

Values:

enumerator LABEL_BATCH
enumerator LABEL_CHANNEL
enumerator LABEL_FRAME
enumerator LABEL_DEPTH
enumerator LABEL_HEIGHT
enumerator LABEL_WIDTH

Functions

template<>
inline int HandleDecRef<NVCVAllocatorHandle>(NVCVAllocatorHandle h)
template<>
inline int HandleIncRef<NVCVAllocatorHandle>(NVCVAllocatorHandle h)
template<>
inline int HandleRefCount<NVCVAllocatorHandle>(NVCVAllocatorHandle h)
template<typename ...ResourceAllocators>
CustomAllocator<ResourceAllocators...> CreateCustomAllocator(ResourceAllocators&&... allocators)

Constructs a CustomAllocator from a set of resource allocators.
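
A minimal usage sketch follows; the allocation/deallocation functor signatures are assumptions based on the CustomMemAllocator description below, so consult the Allocator.hpp header for the exact types:

// Sketch: custom allocator that overrides host memory allocation only.
auto myAlloc = nvcv::CreateCustomAllocator(
    nvcv::CustomHostMemAllocator{
        [](int64_t size, int32_t align) { return std::aligned_alloc(align, size); }, // assumed signature
        [](void *ptr, int64_t size, int32_t align) { std::free(ptr); }               // assumed signature
    });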

inline int64_t CalcTotalSizeBytes(const Requirements::ConstMemory &mem)
template<>
inline int HandleDecRef<NVCVArrayHandle>(NVCVArrayHandle h)
template<>
inline int HandleIncRef<NVCVArrayHandle>(NVCVArrayHandle h)
template<>
inline int HandleRefCount<NVCVArrayHandle>(NVCVArrayHandle h)
inline Array ArrayWrapData(const ArrayData &data, ArrayDataCleanupCallback &&cleanup = {})
template<class T>
auto StaticCast(HandleTypeOf<T> h) -> decltype(detail::StaticCast<T>::cast(h))

A templated function to perform a static cast on the handle type of a given class T.

This function uses the StaticCast structure in the detail namespace to perform the static cast. The decltype keyword is used to deduce the return type of the function.

Template Parameters:

T – The class type to use for the static cast.

Parameters:

h – The handle type instance to be cast.

Returns:

Returns the result of the static cast operation.

template<class T>
auto DynamicCast(HandleTypeOf<T> h) -> decltype(detail::StaticCast<T>::cast(h))

A templated function to perform a dynamic cast on the handle type of a given class T.

This function uses the DynamicCast structure in the detail namespace to perform the dynamic cast. The decltype keyword is used to deduce the return type of the function.

Template Parameters:

T – The class type to use for the dynamic cast.

Parameters:

h – The handle type instance to be cast.

Returns:

Returns the result of the dynamic cast operation.

inline ChromaSubsampling MakeChromaSubsampling(int samplesHoriz, int samplesVert)
inline int GetSamplesHoriz(ChromaSubsampling css)
inline int GetSamplesVert(ChromaSubsampling css)
inline bool NeedsColorspec(ColorModel cmodel)
inline std::ostream &operator<<(std::ostream &out, ColorModel colorModel)
inline std::ostream &operator<<(std::ostream &out, ColorSpec cspec)
inline std::ostream &operator<<(std::ostream &out, ChromaSubsampling chromaSub)
inline std::ostream &operator<<(std::ostream &out, ColorTransferFunction xferFunc)
inline std::ostream &operator<<(std::ostream &out, YCbCrEncoding enc)
inline std::ostream &operator<<(std::ostream &out, ColorRange range)
inline std::ostream &operator<<(std::ostream &out, WhitePoint whitePoint)
inline std::ostream &operator<<(std::ostream &out, ColorSpace color_space)
inline std::ostream &operator<<(std::ostream &out, ChromaLocation loc)
inline std::ostream &operator<<(std::ostream &out, RawPattern raw)
inline Swizzle MakeSwizzle(Channel x, Channel y, Channel z, Channel w)
inline std::array<Channel, 4> GetChannels(Swizzle swizzle)
inline int32_t GetNumChannels(Swizzle swizzle)
inline Packing MakePacking(const PackingParams &params)
inline PackingParams GetParams(Packing packing)
inline int32_t GetNumComponents(Packing packing)
inline std::array<int32_t, 4> GetBitsPerComponent(Packing packing)
inline int32_t GetBitsPerPixel(Packing packing)
inline int32_t GetAlignment(Packing packing)
inline std::ostream &operator<<(std::ostream &out, DataKind dataKind)
inline std::ostream &operator<<(std::ostream &out, Packing packing)
inline std::ostream &operator<<(std::ostream &out, MemLayout memLayout)
inline std::ostream &operator<<(std::ostream &out, Channel swizzleChannel)
inline std::ostream &operator<<(std::ostream &out, Swizzle swizzle)
inline std::ostream &operator<<(std::ostream &out, ByteOrder byteOrder)
inline std::ostream &operator<<(std::ostream &out, DataType type)
inline ImageBatchVarShape::Iterator operator+(ImageBatchVarShape::Iterator::difference_type diff, const ImageBatchVarShape::Iterator &it)
inline Image ImageWrapData(const ImageData &data, ImageDataCleanupCallback &&cleanup)
inline Tensor TensorWrapData(const TensorData &data, TensorDataCleanupCallback &&cleanup = {})

Wraps tensor data into a tensor object.

Parameters:
  • data – Tensor data to be wrapped.

  • cleanup – Cleanup callback to manage the tensor data’s lifecycle.

Returns:

A tensor object wrapping the given data.

inline Tensor TensorWrapImage(const Image &img)

Wraps an image into a tensor object.

Parameters:

img – Image to be wrapped.

Returns:

A tensor object wrapping the given image.

inline void SetThreadError(std::exception_ptr e)

Sets the thread’s error status based on a captured exception.

This function tries to rethrow the given exception and based on its type, it sets the appropriate error status for the current thread using the nvcvSetThreadStatus function.

Parameters:

e – The captured exception to be rethrown and processed.

template<class F>
NVCVStatus ProtectCall(F &&fn)

Safely executes a function, capturing and setting any exceptions that arise.

This function acts as a wrapper to safely execute a given function or lambda (fn). If the function throws any exception, the exception is captured, and the error status is set for the current thread using the SetThreadError function.

Parameters:

fn – The function or lambda to be executed.

Template Parameters:

F – The type of the function or lambda.

Returns:

NVCV_SUCCESS if fn executed without exceptions, otherwise the error code from the caught exception.
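
A hedged sketch of the intended pattern, exporting a C entry point that forwards to C++ code (the entry point and helper names are illustrative, not part of the library):

extern "C" NVCVStatus myCEntryPoint() // hypothetical C API function
{
    // Any exception thrown inside the lambda is caught, recorded via
    // SetThreadError and converted into the returned NVCVStatus.
    return nvcv::ProtectCall([&]
    {
        DoWorkThatMayThrow(); // hypothetical C++ helper
    });
}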

template<typename HandleType>
int HandleDecRef(HandleType handle)
template<typename HandleType>
int HandleIncRef(HandleType handle)
template<typename HandleType>
int HandleRefCount(HandleType handle)
template<typename HandleType>
void HandleDestroy(HandleType handle)
template<typename HandleType>
constexpr HandleType NullHandle()
template<typename HandleType>
constexpr bool HandleIsNull(HandleType h)
template<>
inline int HandleDecRef<NVCVImageHandle>(NVCVImageHandle h)
template<>
inline int HandleIncRef<NVCVImageHandle>(NVCVImageHandle h)
template<>
inline int HandleRefCount<NVCVImageHandle>(NVCVImageHandle h)
template<>
inline int HandleDecRef<NVCVImageBatchHandle>(NVCVImageBatchHandle h)
template<>
inline int HandleIncRef<NVCVImageBatchHandle>(NVCVImageBatchHandle h)
template<>
inline int HandleRefCount<NVCVImageBatchHandle>(NVCVImageBatchHandle h)
inline std::ostream &operator<<(std::ostream &out, ImageFormat fmt)
inline bool HasSameDataLayout(ImageFormat a, ImageFormat b)
template<class T>
bool operator==(const Optional<T> &a, const Optional<T> &b)
template<class T>
bool operator==(const Optional<T> &a, NullOptT)
template<class T>
bool operator==(NullOptT, const Optional<T> &b)
template<class T>
bool operator==(const Optional<T> &a, std::nullptr_t)
template<class T>
bool operator==(std::nullptr_t, const Optional<T> &b)
template<class T>
bool operator==(const Optional<T> &a, const T &b)
template<class T>
bool operator==(const T &a, const Optional<T> &b)
template<class T>
bool operator!=(const Optional<T> &a, const Optional<T> &b)
template<class T>
bool operator!=(const Optional<T> &a, std::nullptr_t)
template<class T>
bool operator!=(std::nullptr_t, const Optional<T> &b)
template<class T>
bool operator!=(const Optional<T> &a, const T &b)
template<class T>
bool operator!=(const T &a, const Optional<T> &b)
template<class T, int N>
std::ostream &operator<<(std::ostream &out, const Shape<T, N> &shape)
constexpr Size2D MaxSize(const Size2D &a, const Size2D &b)

Computes the maximum size in each dimension.

Parameters:
  • a – First size to compare.

  • b – Second size to compare.

Returns:

The size with w and h computed as a maximum of the respective fields in a and b.
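
For example (assuming Size2D can be brace-initialized with {w, h}):

nvcv::Size2D a{640, 480}, b{800, 240};
nvcv::Size2D c = nvcv::MaxSize(a, b); // c.w == 800, c.h == 480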

inline std::ostream &operator<<(std::ostream &out, const nvcv::Size2D &size)

Overloads the stream insertion operator for Size2D.

This allows for easy printing of Size2D structures in the format “width x height”.

Parameters:
  • out – Output stream to which the size string will be written.

  • size – Size2D structure to be output.

Returns:

Reference to the modified output stream.

inline const char *GetName(Status status)

Retrieves the name (string representation) of the given status.

Parameters:

status – Status code whose name is to be retrieved.

Returns:

String representation of the status.

inline std::ostream &operator<<(std::ostream &out, Status status)

Overloads the stream insertion operator for Status enum.

Parameters:
  • out – Output stream to which the status string will be written.

  • status – Status code to be output.

Returns:

Reference to the modified output stream.

inline std::ostream &operator<<(std::ostream &out, NVCVStatus status)

Overloads the stream insertion operator for NVCVStatus.

Parameters:
  • out – Output stream to which the status string will be written.

  • status – NVCVStatus code to be output.

Returns:

Reference to the modified output stream.

template<>
inline int HandleDecRef<NVCVTensorHandle>(NVCVTensorHandle h)
template<>
inline int HandleIncRef<NVCVTensorHandle>(NVCVTensorHandle h)
template<>
inline int HandleRefCount<NVCVTensorHandle>(NVCVTensorHandle h)
template<>
inline int HandleDecRef<NVCVTensorBatchHandle>(NVCVTensorBatchHandle h)
template<>
inline int HandleIncRef<NVCVTensorBatchHandle>(NVCVTensorBatchHandle h)
template<>
inline int HandleRefCount<NVCVTensorBatchHandle>(NVCVTensorBatchHandle h)
constexpr const TensorLayout &GetImplicitTensorLayout(int rank)

Retrieves the default tensor layout based on the rank (number of dimensions).

This function maps commonly used tensor ranks to their typical tensor layouts. For example, a rank of 4 typically corresponds to the NCHW layout (Batch, Channel, Height, Width).

Parameters:

rank – The rank (number of dimensions) of the tensor.

Returns:

The corresponding default tensor layout. Returns TENSOR_NONE for unsupported ranks.
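
Illustrative use, following the rank-4 mapping described above:

// Rank 4 maps to the NCHW layout, as described above.
const nvcv::TensorLayout &layout = nvcv::GetImplicitTensorLayout(4);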

inline bool operator==(const TensorLayout &a, const TensorLayout &b)
inline std::ostream &operator<<(std::ostream &out, const TensorLayout &that)

Overloads the stream insertion operator for TensorLayout.

Parameters:
  • out – The output stream.

  • that – The TensorLayout to output.

Returns:

The output stream.

inline bool operator==(const TensorLayout &lhs, const NVCVTensorLayout &rhs)
inline bool operator!=(const TensorLayout &lhs, const NVCVTensorLayout &rhs)
inline bool operator<(const TensorLayout &lhs, const NVCVTensorLayout &rhs)
inline TensorShape Permute(const TensorShape &src, TensorLayout dstLayout)

Function to permute the dimensions of a tensor to a new layout.

This function rearranges the dimensions of the tensor according to a new layout. It can be used to change the order of dimensions, for example, from NHWC (channel-last) to NCHW (channel-first) and vice versa.

Parameters:
  • src – The original tensor shape.

  • dstLayout – The desired layout after permutation.

Returns:

The new TensorShape with permuted dimensions according to the desired layout.
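
A hedged sketch of an NHWC-to-NCHW permutation (assuming TensorShape can be built from a shape list plus a layout, and using the predefined TENSOR_NHWC/TENSOR_NCHW layout constants):

nvcv::TensorShape nhwc({1, 480, 640, 3}, nvcv::TENSOR_NHWC);
nvcv::TensorShape nchw = nvcv::Permute(nhwc, nvcv::TENSOR_NCHW); // shape becomes {1, 3, 480, 640}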

Variables

constexpr NullOptT NullOpt
constexpr const TensorLayout TENSOR_NONE = {NVCV_TENSOR_NONE}
class Allocator : public nvcv::CoreResource<NVCVAllocatorHandle, Allocator>
#include <Allocator.hpp>

Represents a reference to an allocator object.

The allocator object defines functions for allocating various resources, including different kinds of memory.

A custom allocator can be created via nvcv::CustomAllocator helper class.

Subclassed by nvcv::CustomAllocator< ResourceAllocators >

class Array : public nvcv::CoreResource<NVCVArrayHandle, Array>
#include <Array.hpp>
class ArrayData
#include <ArrayData.hpp>

Subclassed by nvcv::ArrayDataCuda, nvcv::ArrayDataHost, nvcv::ArrayDataHostPinned

class ArrayDataAccess : public nvcv::detail::ArrayDataAccessImpl<ArrayData>
#include <ArrayDataAccess.hpp>
class ArrayDataAccessCuda : public nvcv::detail::ArrayDataAccessImpl<ArrayDataCuda>
#include <ArrayDataAccess.hpp>
class ArrayDataAccessHost : public nvcv::detail::ArrayDataAccessImpl<ArrayDataHost>
#include <ArrayDataAccess.hpp>
class ArrayDataAccessHostPinned : public nvcv::detail::ArrayDataAccessImpl<ArrayDataHostPinned>
#include <ArrayDataAccess.hpp>
class ArrayDataCuda : public nvcv::ArrayData
#include <ArrayData.hpp>
class ArrayDataHost : public nvcv::ArrayData
#include <ArrayData.hpp>
class ArrayDataHostPinned : public nvcv::ArrayData
#include <ArrayData.hpp>
template<typename CppFuncType, typename CFuncType = detail::AddContext_t<CppFuncType>, typename TranslateCall = detail::NoTranslation, bool SingleUse = false>
class Callback
#include <Callback.hpp>
template<typename Ret, typename ...Args, typename CRet, typename CtxArg, typename ...CArgs, typename TranslateCall, bool SingleUse>
class Callback<Ret(Args...), CRet(CtxArg, CArgs...), TranslateCall, SingleUse>
#include <Callback.hpp>

Manages an object callable with Args and returning Ret

In many ways, this object is similar to std::function, but aimed at providing a C interface. The C interface consists of the following triple:

  • a “call” function with a context prepended to the argument list

  • a context blob

  • a context cleanup function

The user may also add an extra marshalling step by providing a (stateless) adapter between the C++ and C callbacks. This is useful for keeping the C API from leaking into C++ callback signatures while still allowing them to be passed with little state.

In cases where the C consumer cannot handle the destruction of a callback, the presence of such a callback can be detected at run time.

Template Parameters:
  • Ret – The return type of the C++ callback

  • Args – Argument types of the C++ callback

  • CtxArg – Type of the context argument in C callback; must be void*, used only to improve diagnostics

  • CRet – The return type of the C callback (see TranslateCall)

  • CArgs – Argument types of the C callback (see TranslateCall)

  • TranslateCall – A type of a stateless functor that translates C arguments to C++, invokes the C++ callback target and translates the return value (or exceptions) to C return value.

  • SingleUse – If true, the callable object can only be called once and, upon invocation, the callable object wrapped by this Callback will be destroyed.

class ColorSpec
#include <ColorSpec.hpp>

Class for color specification.

This class encapsulates various properties related to color space and encoding.

template<typename Handle, typename Actual>
class CoreResource : private nvcv::SharedHandle<Handle>
#include <CoreResource.hpp>

CRTP base for resources based on reference-counting handles

This class adds constructors and assignment with the Actual type, so that Actual doesn’t need to tediously reimplement them, but can simply reexpose them with using.

It also exposes the handle with a get and re-exposes reset and release functions.

Template Parameters:
  • Handle – The handle type, e.g. NVCVImageHandle

  • Actual – The actual class, e.g. Image

class CudaMemAllocator : public nvcv::detail::MemAllocatorWithKind<NVCV_RESOURCE_MEM_CUDA>
#include <Allocator.hpp>

Encapsulates a CUDA memory allocator descriptor

template<typename ...ResourceAllocators>
class CustomAllocator : public nvcv::Allocator
#include <Allocator.hpp>

A helper class for defining custom allocators.

This class aggregates custom resource allocators.

Note

Direct use of this class is recommended only in C++17 and newer. For older standards, use the CreateCustomAllocator function instead.

Template Parameters:

ResourceAllocators

template<typename AllocatorType>
class CustomMemAllocator
#include <Allocator.hpp>

Marshals a set of allocation/deallocation functions as NVCVResourceAllocator

A CustomMemAllocator is passed as a constructor argument to CustomAllocator.

Note

This class should not be used directly. Use one of the following typedefs:

  • CustomHostMemAllocator

  • CustomHostPinnedMemAllocator

  • CustomCudaMemAllocator

Template Parameters:

AllocatorType – the type of the allocator (one of: HostMemAllocator, HostPinnedMemAllocator, CudaMemAllocator)

class DataType
#include <DataType.hpp>
class Exception : public exception
#include <Exception.hpp>

Custom exception class to represent errors specific to this application.

This class extends the standard exception class and is designed to encapsulate error codes and messages specific to this application’s context.

class HostMemAllocator : public nvcv::detail::MemAllocatorWithKind<NVCV_RESOURCE_MEM_HOST>
#include <Allocator.hpp>

Encapsulates a host memory allocator descriptor

class HostPinnedMemAllocator : public nvcv::detail::MemAllocatorWithKind<NVCV_RESOURCE_MEM_HOST_PINNED>
#include <Allocator.hpp>

Encapsulates a host pinned memory allocator descriptor

class Image : public nvcv::CoreResource<NVCVImageHandle, Image>
#include <Image.hpp>

Represents an image resource managed by NVCV.

This class wraps the NVCVImageHandle and provides a high-level interface to manage images, including their allocation, deallocation, and querying of various properties like size and format.

class ImageBatch : public nvcv::CoreResource<NVCVImageBatchHandle, ImageBatch>
#include <ImageBatch.hpp>

Represents a batch of images managed by NVCV.

This class wraps the NVCVImageBatchHandle and provides a high-level interface to manage batches of images, including querying of various properties like capacity and number of images.

Subclassed by nvcv::ImageBatchVarShape

class ImageBatchData
#include <ImageBatchData.hpp>

Represents the underlying data of an image batch.

This class provides an interface to access and manipulate the data associated with a batch of images. It also allows casting the data to a derived type.

Subclassed by nvcv::ImageBatchVarShapeData

class ImageBatchVarShape : public nvcv::ImageBatch
#include <ImageBatch.hpp>

Represents a batch of images with variable shapes.

Extends the functionality provided by the ImageBatch class to support batches of images where each image might have a different shape or size.

class ImageBatchVarShapeData : public nvcv::ImageBatchData
#include <ImageBatchData.hpp>

Represents the data of a variable shaped image batch.

This class provides an interface to access and manipulate the data associated with a batch of variable shaped images. It extends the base ImageBatchData class.

Subclassed by nvcv::ImageBatchVarShapeDataStrided

class ImageBatchVarShapeDataStrided : public nvcv::ImageBatchVarShapeData
#include <ImageBatchData.hpp>

Represents the strided data of a variable shaped image batch.

This class extends ImageBatchVarShapeData to provide access to strided data.

Subclassed by nvcv::ImageBatchVarShapeDataStridedCuda

class ImageBatchVarShapeDataStridedCuda : public nvcv::ImageBatchVarShapeDataStrided
#include <ImageBatchData.hpp>

Represents the strided data of a variable shaped image batch in CUDA.

This class extends ImageBatchVarShapeDataStrided to provide access to CUDA-specific strided data.

class ImageData
#include <ImageData.hpp>

Represents image data encapsulated in a convenient interface.

This class provides methods to access and manipulate image data. It abstracts the underlying image data representation and provides an interface for higher-level operations.

Subclassed by nvcv::ImageDataCudaArray, nvcv::ImageDataStrided

class ImageDataCudaArray : public nvcv::ImageData
#include <ImageData.hpp>

Represents image data stored in a CUDA array format.

This class extends the ImageData class, providing additional methods and attributes specific to the CUDA array format.

class ImageDataStrided : public nvcv::ImageData
#include <ImageData.hpp>

Represents strided image data.

This class extends the ImageData class, providing additional methods and attributes specific to the strided image format.

Subclassed by nvcv::ImageDataStridedCuda, nvcv::ImageDataStridedHost

class ImageDataStridedCuda : public nvcv::ImageDataStrided
#include <ImageData.hpp>

Represents strided image data specifically for CUDA.

This class extends the ImageDataStrided class, offering methods and functionalities tailored for CUDA.

class ImageDataStridedHost : public nvcv::ImageDataStrided
#include <ImageData.hpp>

Represents strided image data specifically for host.

This class extends the ImageDataStrided class, offering methods and functionalities tailored for the host environment.

class ImageFormat
#include <ImageFormat.hpp>
class MemAlignment
#include <Allocator.hpp>
class MemAllocator : public nvcv::ResourceAllocator
#include <Allocator.hpp>

Encapsulates a memory allocator (NVCV_RESOURCE_MEM_*)

Subclassed by nvcv::detail::MemAllocatorWithKind< NVCV_RESOURCE_MEM_CUDA >, nvcv::detail::MemAllocatorWithKind< NVCV_RESOURCE_MEM_HOST >, nvcv::detail::MemAllocatorWithKind< NVCV_RESOURCE_MEM_HOST_PINNED >, nvcv::detail::MemAllocatorWithKind< KIND >

template<typename Resource>
class NonOwningResource
#include <CoreResource.hpp>

A non-owning wrapper around a handle which can be trivially converted to a reference-counting wrapper

Motivation: When implementing functions that take handles as arguments, but do not take ownership of the object passed by handle, it’s beneficial to have some way of wrapping the handle into a CoreResource but avoid the calls to incRef/decRef. This class bypasses these calls in construction/destruction. Internally this object stores the actual resource and can return a reference to it, so it can be seamlessly used with C++ APIs that operate on the resource class reference. The original resource’s interface is not (fully) reexposed.

Example:

void bar(const Image &img)  // takes a reference to the Image shared handle wrapper
{
    doStuff(img);
}

void foo(NVCVImageHandle handle)
{
    NonOwningResource<Image> img(handle);    // no incRef on construction
    bar(img);                                // no incRef/decRef when converting to Image
}                                            // no decRef on destruction
struct NullOptT
#include <Optional.hpp>
template<class T>
class Optional
#include <Optional.hpp>

A container object that may or may not contain a value of a given type.

This is a simplified version of the std::optional type introduced in C++17. It provides a mechanism to represent non-value states without resorting to pointers, dynamic allocation, or custom ‘null’ values.

Template Parameters:

T – The type of the value to be stored.
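
A small sketch, assuming the usual optional-like interface (emptiness test and dereference):

nvcv::Optional<int> maybeCount;   // starts empty, i.e. holding no value
maybeCount = 42;                  // now holds a value
if (maybeCount)                   // assumed emptiness test, as in std::optional
{
    UseCount(*maybeCount);        // hypothetical consumer of the stored value
}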

struct PackingParams
#include <DataLayout.hpp>
class Requirements
#include <Requirements.hpp>
class ResourceAllocator
#include <Allocator.hpp>

A base class that encapsulates an NVCVResourceAllocator struct

This class is a convenience wrapper around NVCVResourceAllocator. The derived classes expose additional functionality, specific to the resource type being allocated.

Warning

ResourceAllocator does not own the context object pointed to by cdata().ctx. The destruction of ResourceAllocator does not call the cleanup function.

Subclassed by nvcv::MemAllocator

template<class T, int N>
class Shape
#include <Shape.hpp>

Template class representing an N-dimensional shape.

This class is designed to encapsulate the shape of an N-dimensional tensor, where the size in each dimension is of type T.

Template Parameters:
  • T – The type of the size in each dimension (e.g., int, size_t).

  • N – The maximum number of dimensions this shape can represent.

template<typename HandleType, typename HandleOps = detail::SharedHandleOps<HandleType>>
class SharedHandle
#include <HandleWrapper.hpp>

A handle wrapper that behaves similarly to shared_ptr.

Copying a SharedHandle increments the reference count. Destroying a SharedHandle decrements the reference count. Swap (and, to some extent, move) are very simple operations on the handle value.

Template Parameters:
  • HandleType – The type of the managed handle

  • HandleOps – The set of handle operations - can be customized e.g. to add extra operation tracking

struct Size2D : public NVCVSize2D
#include <Size.hpp>
class Tensor : public nvcv::CoreResource<NVCVTensorHandle, Tensor>
#include <Tensor.hpp>

Represents a tensor as a core resource in the system.

The Tensor class is built upon the CoreResource utility class, which handles the resource management of the tensor. This class provides various interfaces to access and manage tensor properties such as rank, shape, data type, and layout.

class TensorBatch : public nvcv::CoreResource<NVCVTensorBatchHandle, TensorBatch>
#include <TensorBatch.hpp>

Handle to a tensor batch object.

Tensor batch is a container type that can hold a list of non-uniformly shaped tensors. Rank, data type and layout must be consistent between the tensors.

class TensorBatchData
#include <TensorBatchData.hpp>

General type representing data of any tensor batch.

Subclassed by nvcv::TensorBatchDataStrided

class TensorBatchDataStrided : public nvcv::TensorBatchData
#include <TensorBatchData.hpp>

Data of batches of tensors with strides.

Subclassed by nvcv::TensorBatchDataStridedCuda

class TensorBatchDataStridedCuda : public nvcv::TensorBatchDataStrided
#include <TensorBatchData.hpp>

Data of batches of CUDA tensors with strides.

class TensorData
#include <TensorData.hpp>

Represents data for a tensor in the system.

The TensorData class provides an interface to access and manage the underlying data of a tensor. It offers functionalities to retrieve tensor shape, layout, data type, and other properties. The tensor’s data is encapsulated in an NVCVTensorData object.

Subclassed by nvcv::TensorDataStrided

class TensorDataAccessStrided : public nvcv::detail::TensorDataAccessStridedImpl<TensorShapeInfo>
#include <TensorDataAccess.hpp>

Provides access to tensor data with a strided memory layout.

This class is an interface for accessing tensor data that is stored in a strided memory layout. It provides utilities for checking compatibility and creating instances of the class.

class TensorDataAccessStridedImage : public nvcv::detail::TensorDataAccessStridedImageImpl<TensorShapeInfoImage>
#include <TensorDataAccess.hpp>

Provides access to image tensor data with a strided memory layout.

This class extends TensorDataAccessStrided and provides specialized utilities for accessing image tensor data.

class TensorDataAccessStridedImagePlanar : public nvcv::detail::TensorDataAccessStridedImagePlanarImpl<TensorShapeInfoImagePlanar>
#include <TensorDataAccess.hpp>

Provides access to planar image tensor data with a strided memory layout.

This class extends TensorDataAccessStridedImage and offers specific utilities for accessing the data in planar image tensors using a strided memory layout.

class TensorDataStrided : public nvcv::TensorData
#include <TensorData.hpp>

Represents strided tensor data in the system.

The TensorDataStrided class extends TensorData to handle tensor data that is stored in a strided manner. Strided tensor data allows non-contiguous storage in memory, where each dimension can have its own stride; the stride is the number of bytes to advance from one element to the next in that dimension.

Subclassed by nvcv::TensorDataStridedCuda

class TensorDataStridedCuda : public nvcv::TensorDataStrided
#include <TensorData.hpp>

Represents strided tensor data specifically for CUDA.

The TensorDataStridedCuda class extends TensorDataStrided to handle tensor data stored in a strided manner on CUDA devices. It provides methods specific to CUDA strided tensor data.

class TensorLayout
#include <TensorLayout.hpp>

Represents the layout of a tensor.

This class wraps around the NVCVTensorLayout structure and provides additional functionality for handling and manipulating tensor layouts. The class allows for easy construction from character descriptions or from other tensor layout structures. It also supports a range of operations including subsetting and checking for prefixes or suffixes.

class TensorLayoutInfo
#include <TensorLayoutInfo.hpp>

Provides information and utility functions related to tensor layouts.

The TensorLayoutInfo class provides a series of utility functions to inspect and work with tensor layouts. It allows checking the compatibility of a given layout, creating instances from a layout, and querying specific properties of the layout, such as whether it represents a batch or an image.

Subclassed by nvcv::TensorLayoutInfoImage

class TensorLayoutInfoImage : public nvcv::TensorLayoutInfo
#include <TensorLayoutInfo.hpp>

This class provides more information about tensor layout for image tensors.

The class inherits from TensorLayoutInfo and adds functions specific to image tensors. It provides detailed information about the tensor layout such as the number of spatial dimensions, the index of various dimensions (channel, width, height, depth), and whether the layout is row-major. It also provides functions to check whether the channel is in the first or last position.

class TensorShape
#include <TensorShape.hpp>

The TensorShape class represents the shape and layout of a tensor.

class TensorShapeInfo : public nvcv::detail::TensorShapeInfoImpl<TensorLayoutInfo>
#include <TensorShapeInfo.hpp>

This class provides information about the shape of a tensor.

It inherits from TensorShapeInfoImpl and is specialized for the base tensor layout type, providing functions to retrieve the shape, layout, and whether the tensor is batched or corresponds to an image.

class TensorShapeInfoImage : public nvcv::detail::TensorShapeInfoImpl<TensorLayoutInfoImage>
#include <TensorShapeInfo.hpp>

This class provides detailed information about the shape of an image tensor.

It inherits from TensorShapeInfoImpl and is specialized for the image tensor layout type, offering additional functionality tailored to image tensors, such as retrieving the number of channels, rows, columns, and the overall size.

Subclassed by nvcv::TensorShapeInfoImagePlanar

class TensorShapeInfoImagePlanar : public nvcv::TensorShapeInfoImage
#include <TensorShapeInfo.hpp>

This class provides information about the shape of a planar image tensor.

It inherits from TensorShapeInfoImage and is specialized for planar image tensors. The class provides functions to check the compatibility of a given tensor shape and to retrieve the number of planes in the tensor.

struct TranslateArrayDataCleanup
#include <Array.hpp>
struct TranslateImageDataCleanup
#include <Image.hpp>
struct TranslateImageToHandle
#include <ImageBatch.hpp>
struct TranslateTensorDataCleanup
#include <Tensor.hpp>
template<typename HandleType, typename HandleOps = detail::UniqueHandleOps<HandleType>>
class UniqueHandle
#include <HandleWrapper.hpp>

A handle wrapper that behaves like a unique_ptr.

Template Parameters:
  • HandleType – The type of the managed handle

  • HandleOps – The set of handle operations - can be customized e.g. to add extra tracking or suppress object deletion.

namespace cfg

Functions

inline void SetMaxImageCount(int32_t maxCount)

Sets the maximum number of image handles that can be created.

Parameters:

maxCount – The maximum number of image handles. If negative, dynamic allocation is used and no hard limit is defined.

Throws:

An exception is thrown if the nvcvConfigSetMaxImageCount function fails.

inline void SetMaxImageBatchCount(int32_t maxCount)

Sets the maximum number of image batch handles that can be created.

Parameters:

maxCount – The maximum number of image batch handles. If negative, dynamic allocation is used and no hard limit is defined.

Throws:

An exception is thrown if the nvcvConfigSetMaxImageBatchCount function fails.

inline void SetMaxTensorCount(int32_t maxCount)

Sets the maximum number of tensor handles that can be created.

Parameters:

maxCount – The maximum number of tensor handles. If negative, dynamic allocation is used and no hard limit is defined.

Throws:

An exception is thrown if the nvcvConfigSetMaxTensorCount function fails.

inline void SetMaxArrayCount(int32_t maxCount)

Sets the maximum number of array handles that can be created.

Parameters:

maxCount – The maximum number of array handles. If negative, dynamic allocation is used and no hard limit is defined.

Throws:

An exception is thrown if the nvcvConfigSetMaxArrayCount function fails.

inline void SetMaxAllocatorCount(int32_t maxCount)

Sets the maximum number of allocator handles that can be created.

Parameters:

maxCount – The maximum number of allocator handles. If negative, dynamic allocation is used and no hard limit is defined.

Throws:

An exception is thrown if the nvcvConfigSetMaxAllocatorCount function fails.

namespace cuda

Typedefs

template<typename T, int64_t... Strides>
using TensorBatchWrap = TensorBatchWrapT<T, int64_t, Strides...>
template<typename T, int32_t... Strides>
using TensorBatchWrap32 = TensorBatchWrapT<T, int32_t, Strides...>
template<typename T, typename StrideType = int64_t>
using TensorBatch1DWrap = TensorBatchWrapT<T, StrideType, sizeof(T)>
template<typename T, typename StrideType = int64_t>
using TensorBatch2DWrap = TensorBatchWrapT<T, StrideType, -1, sizeof(T)>
template<typename T, typename StrideType = int64_t>
using TensorBatch3DWrap = TensorBatchWrapT<T, StrideType, -1, -1, sizeof(T)>
template<typename T, typename StrideType = int64_t>
using TensorBatch4DWrap = TensorBatchWrapT<T, StrideType, -1, -1, -1, sizeof(T)>
template<typename T, typename StrideType = int64_t>
using TensorBatch5DWrap = TensorBatchWrapT<T, StrideType, -1, -1, -1, -1, sizeof(T)>
template<typename T, int N, typename StrideType = int64_t>
using TensorBatchNDWrap = std::conditional_t<N == 1, TensorBatch1DWrap<T, StrideType>, std::conditional_t<N == 2, TensorBatch2DWrap<T, StrideType>, std::conditional_t<N == 3, TensorBatch3DWrap<T, StrideType>, std::conditional_t<N == 4, TensorBatch4DWrap<T, StrideType>, std::conditional_t<N == 5, TensorBatch5DWrap<T, StrideType>, void>>>>>
template<typename T, int64_t... Strides>
using TensorWrap = TensorWrapT<T, int64_t, Strides...>
template<typename T, int32_t... Strides>
using TensorWrap32 = TensorWrapT<T, int32_t, Strides...>
template<typename T, typename StrideType = int64_t>
using Tensor1DWrap = TensorWrapT<T, StrideType, sizeof(T)>
template<typename T, typename StrideType = int64_t>
using Tensor2DWrap = TensorWrapT<T, StrideType, -1, sizeof(T)>
template<typename T, typename StrideType = int64_t>
using Tensor3DWrap = TensorWrapT<T, StrideType, -1, -1, sizeof(T)>
template<typename T, typename StrideType = int64_t>
using Tensor4DWrap = TensorWrapT<T, StrideType, -1, -1, -1, sizeof(T)>
template<typename T, typename StrideType = int64_t>
using Tensor5DWrap = TensorWrapT<T, StrideType, -1, -1, -1, -1, sizeof(T)>
template<typename T, int N, typename StrideType = int64_t>
using TensorNDWrap = std::conditional_t<N == 1, Tensor1DWrap<T, StrideType>, std::conditional_t<N == 2, Tensor2DWrap<T, StrideType>, std::conditional_t<N == 3, Tensor3DWrap<T, StrideType>, std::conditional_t<N == 4, Tensor4DWrap<T, StrideType>, std::conditional_t<N == 5, Tensor5DWrap<T, StrideType>, void>>>>>
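
For instance, a brief sketch of how the rank-dispatching alias resolves:

// TensorNDWrap<float, 3> resolves to Tensor3DWrap<float>, whose innermost (W)
// stride is known at compile time to be sizeof(float).
using Float3DWrap = nvcv::cuda::TensorNDWrap<float, 3>;
static_assert(std::is_same_v<Float3DWrap, nvcv::cuda::Tensor3DWrap<float>>, "");
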
template<bool B>
using Require = std::enable_if_t<B>
template<class T, class = Require<HasTypeTraits<T>>>
using BaseType = typename TypeTraits<T>::base_type

Metatype to get the base type of a CUDA compound type.

using DataType = ...;
using ChannelType = nvcv::cuda::BaseType<DataType>;

Note

This is identity for regular C types.

Template Parameters:

T – Type to get the base type from.

template<class T, int C, class = Require<HasTypeTraits<T>>>
using MakeType = detail::MakeType_t<T, C>

Metatype to make a type from a base type and number of components.

When the number of components is zero, it yields the identity (regular C) type, and when it is between 1 and 4 it yields the corresponding CUDA compound type.

using RGB8Type = MakeType<unsigned char, 3>; // yields uchar3

Note

Note that T=char might yield uchar1..4 types when char is equivalent to unsigned char, i.e. CHAR_MIN == 0.

Template Parameters:
  • T – Base type to make the type from.

  • C – Number of components to make the type.

template<class BT, class T, class = Require<HasTypeTraits<BT, T>>>
using ConvertBaseTypeTo = detail::ConvertBaseTypeTo_t<BT, T>

Metatype to convert the base type of a type.

The base type of target type T is replaced to be BT.

using DataType = ...;
using FloatDataType = ConvertBaseTypeTo<float, DataType>; // yields float1..4
Template Parameters:
  • BT – Base type to use in the conversion.

  • T – Target type to convert its base type.

Enums

enum class RoundMode : int

Values:

enumerator NEAREST
enumerator DOWN
enumerator UP
enumerator ZERO
enumerator DEFAULT

Functions

template<typename T, class OP, class = Require<std::is_floating_point_v<T>>> __device__ void AtomicOp (T *address, T val, OP op)

Metafunction to do a generic atomic operation on floating-point types.

Template Parameters:
  • T – Type of the values used in the atomic operation.

  • OP – Operation class that defines the operator call to be used as atomics.

Parameters:
  • address[inout] First value to be used in the atomic operation.

  • val[in] Second value to be used.

  • op[in] Operation to be used.

template<typename T> inline __device__ void AtomicMin (T &a, T b)

Metafunction to do an atomic minimum operation that accepts floating-point types.

Template Parameters:

T – Type of the values used in the atomic operation.

Parameters:
  • a[inout] First value to be used in the atomic operation.

  • b[in] Second value to be used.

template<typename T> inline __device__ void AtomicMax (T &a, T b)

Metafunction to do an atomic maximum operation that accepts floating-point types.

Template Parameters:

T – Type of the values used in the atomic operation.

Parameters:
  • a[inout] First value to be used in the atomic operation.

  • b[in] Second value to be used.

template<bool Active = true, typename T> inline constexpr bool __host__ __device__ IsOutside (T c, T s)

Function to check if a given coordinate is outside the range defined by a given size.

Template Parameters:
  • Active – Flag to turn this function active.

  • T – Type of the values given to this function.

Parameters:
  • c[in] Coordinate to check if it is outside the range [0, s).

  • s[in] Size that defines the inside range [0, s).

Returns:

True if given coordinate is outside given size.

template<NVCVBorderType B, bool Active = true, typename T> inline constexpr T __host__ __device__ GetIndexWithBorder (T c, T s)

Function to get a border-aware index considering the range defined by given size.

Note

This function does not work for NVCV_BORDER_CONSTANT.

Template Parameters:
  • B – It is a NVCVBorderType indicating the border to be used.

  • Active – Flag to turn this function active.

  • T – Type of the values given to this function.

Parameters:
  • c[in] Coordinate (input index) to put back inside valid range [0, s).

  • s[in] Size that defines the valid range [0, s).

template<typename T, NVCVBorderType B, typename StrideType = int64_t, class = Require<HasTypeTraits<T>>>
__host__ auto CreateBorderWrapNHW(const TensorDataStridedCuda &tensor, T borderValue = {})

Factory function to create an NHW border wrap given a tensor data.

The output BorderWrap wraps an NHW 3D tensor, allowing access to data per batch (N), per row (H) and per column (W) of the input tensor, border aware in rows (or height H) and columns (or width W). The input tensor data must have either NHWC or HWC layout, where the channel C is inside the given template type T, e.g. T=uchar4 for RGBA8. The active dimensions are H (second) and W (third).

Template Parameters:
  • T – Type of the values to be accessed in the border wrap.

  • B – Border extension to be used when accessing H and W, one of NVCVBorderType

  • StrideType – Type of the strides used in the underlying TensorWrap.

Parameters:
  • tensor[in] Reference to the tensor that will be wrapped.

  • borderValue[in] Border value to be used when accessing outside elements in constant border type

Returns:

Border wrap useful to access tensor data border aware in H and W in CUDA kernels.
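
A hedged host-side sketch, assuming tensorData is an nvcv::TensorDataStridedCuda describing an NHWC (or HWC) RGBA8 tensor:

auto borderWrap = nvcv::cuda::CreateBorderWrapNHW<uchar4, NVCV_BORDER_CONSTANT>(
    tensorData, uchar4{0, 0, 0, 255}); // border value returned outside the valid H and W range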

template<typename T, NVCVBorderType B, typename StrideType = int64_t, class = Require<HasTypeTraits<T>>>
__host__ auto CreateBorderWrapNHWC(const TensorDataStridedCuda &tensor, T borderValue = {})

Factory function to create an NHWC border wrap given a tensor data.

The output BorderWrap wraps an NHWC 4D tensor, allowing access to data per batch (N), per row (H), per column (W) and per channel (C) of the input tensor, border aware in rows (or height H) and columns (or width W). The input tensor data must have either NHWC or HWC layout, where the channel C is of type T, e.g. T=uchar for each channel of either RGB8 or RGBA8. The active dimensions are H (second) and W (third).

Template Parameters:
  • T – Type of the values to be accessed in the border wrap.

  • B – Border extension to be used when accessing H and W, one of NVCVBorderType

  • StrideType – Type of the strides used in the underlying TensorWrap.

Parameters:
  • tensor[in] Reference to the tensor that will be wrapped.

  • borderValue[in] Border value to be used when accessing outside elements in constant border type

Returns:

Border wrap useful to access tensor data border aware in H and W in CUDA kernels.

template<int N, typename T, class = Require<HasEnoughComponents<T, N>>> __host__ __device__ auto DropCast (T v)

Metafunction to drop components of a compound value.

The template parameter N defines the number of components to cast the CUDA compound type T passed as function argument v. This is done by dropping the last components after N from v. For instance, an uint3 can have its z component dropped by passing it as function argument to DropCast and the number 2 as template argument (see example below). The type T is not needed as it is inferred from the argument v. It is a requirement of the DropCast function that the type T has at least N components.

uint2 dstIdx = DropCast<2>(blockIdx * blockDim + threadIdx);
Template Parameters:

N – Number of components to return.

Parameters:

v[in] Value to drop components from.

Returns:

The compound value with N components dropping the last, extra components.

template<NVCVInterpolationType I, int Position = 1, typename IndexType = int64_t> inline constexpr IndexType __host__ __device__ GetIndexForInterpolation (float c)

Function to get an integer index from a float coordinate for interpolation purposes.

Template Parameters:
  • I – Interpolation type, one of NVCVInterpolationType.

  • Position – Interpolation position, 1 for the first index and 2 for the second index.

  • IndexType – Type of the returned value

Parameters:

c[in] Coordinate in floating-point to convert to index in integer.

Returns:

Index in integer suitable for interpolation computation.

inline void __host__ __device__ GetCubicCoeffs (float delta, float &w0, float &w1, float &w2, float &w3)
template<typename T, NVCVBorderType B, NVCVInterpolationType I, typename StrideType = int64_t, class = Require<HasTypeTraits<T>>>
__host__ auto CreateInterpolationWrapNHW(const TensorDataStridedCuda &tensor, T borderValue = {}, float scaleX = {}, float scaleY = {})

Factory function to create an NHW interpolation wrap given a tensor data.

The output InterpolationWrap wraps an NHW 3D tensor, allowing access to data per batch (N), per row (H) and per column (W) of the input tensor in an interpolation- and border-aware manner on rows (or height H) and columns (or width W). The input tensor data must have either NHWC or HWC layout, where the channel C is inside the given template type T, e.g. T=uchar4 for RGBA8. The active dimensions are H (second) and W (third).

Template Parameters:
  • T – Type of the values to be accessed in the interpolation wrap.

  • B – Border extension to be used when accessing H and W, one of NVCVBorderType

  • I – Interpolation to be used when accessing H and W, one of NVCVInterpolationType

  • StrideType – Stride type used when accessing underlying tensor data

Parameters:
  • tensor[in] Reference to the tensor that will be wrapped.

  • borderValue[in] Border value to be used when accessing outside elements in constant border type

  • scaleX[in] Scale X value to be used when interpolating elements with area interpolation type

  • scaleY[in] Scale Y value to be used when interpolating elements with area interpolation type

Returns:

Interpolation wrap useful to access tensor data interpolation-border aware in H and W in CUDA kernels.
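
A minimal usage sketch for illustration (srcData, border and interpolation choices are assumptions, not part of this reference):

nvcv::TensorDataStridedCuda srcData = ...;
auto src = nvcv::cuda::CreateInterpolationWrapNHW<const uchar4, NVCV_BORDER_REPLICATE, NVCV_INTERP_LINEAR>(srcData);
// In a CUDA kernel, src[float3{fx, fy, sample}] is expected to return the value bilinearly
// interpolated at the (possibly fractional) coordinate (fx, fy) of the given sample.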

template<typename T, NVCVBorderType B, NVCVInterpolationType I, typename StrideType = int64_t, class = Require<HasTypeTraits<T>>>
__host__ auto CreateInterpolationWrapNHWC(const TensorDataStridedCuda &tensor, T borderValue = {}, float scaleX = {}, float scaleY = {})

Factory function to create an NHWC interpolation wrap given a tensor data.

The output InterpolationWrap wraps an NHWC 4D tensor, allowing access to data per batch (N), per row (H), per column (W) and per channel (C) of the input tensor in an interpolation- and border-aware manner on rows (or height H) and columns (or width W). The input tensor data must have either NHWC or HWC layout, where the channel C is of type T, e.g. T=uchar for each channel of either RGB8 or RGBA8. The active dimensions are H (second) and W (third).

Template Parameters:
  • T – Type of the values to be accessed in the interpolation wrap.

  • B – Border extension to be used when accessing H and W, one of NVCVBorderType

  • I – Interpolation to be used when accessing H and W, one of NVCVInterpolationType

  • StrideType – Stride type used when accessing underlying tensor data

Parameters:
  • tensor[in] Reference to the tensor that will be wrapped.

  • borderValue[in] Border value to be used when accessing outside elements in constant border type

  • scaleX[in] Scale X value to be used when interpolating elements with area interpolation type

  • scaleY[in] Scale Y value to be used when interpolating elements with area interpolation type

Returns:

Interpolation wrap useful to access tensor data interpolation-border aware in H and W in CUDA kernels.

template<typename T, typename U, class = nvcv::cuda::Require<nvcv::cuda::detail::IsSameCompound<T, U>>> inline __host__ __device__ auto dot (T a, U b)
template<RoundMode RM, typename T, typename U, class = Require<(!std::is_same_v<T, U>) && ((NumComponents<T> == NumComponents<U>) || (NumComponents<T> == 0 && HasTypeTraits<U>))>> inline __host__ __device__ auto round (U u)

Metafunction to round all elements of the input.

This function rounds all elements of the input and returns the result with the same type as the input. Optionally, the base type of the result may be specified by the template argument type T. For instance, a float4 can have its 4 elements rounded into a float4 result, or to a different result type, such as T=int or T=int4, where the result will be int4 with the rounded results (see example below). Also optionally, the round mode RM can be specified, as one of RoundMode, e.g. RoundMode::DOWN. It is a requirement of round that the input source type has type traits and the optional result type T is either a regular C type or has the same number of components as the input type.

using FloatType = MakeType<float, 4>;
FloatType res = ...;
FloatType float_rounded = round(res);
ConvertBaseTypeTo<int, FloatType> int_rounded = round<int>(res);
Template Parameters:
  • RM – Optional round mode to be used, cf. RoundMode.

  • U – Type of the source value (with 1 to 4 elements) passed as (and inferred from) argument u.

  • T – Optional type that defines the result of the round.

Parameters:

u[in] Source value to round all elements of; the result has the same type as u or base type T.

Returns:

The value with all elements rounded.

template<typename T, typename U, class = Require<(!std::is_same_v<T, U>) && ((NumComponents<T> == NumComponents<U>) || (NumComponents<T> == 0 && HasTypeTraits<U>))>> inline __host__ __device__ auto round (U u)

Overload of round function.

It specifies the target round type T and uses the default round mode RoundMode::DEFAULT.

template<RoundMode RM, typename U> inline __host__ __device__ auto round (U u)

Overload of round function.

It does not specify target round type T, using input source type U instead.

template<typename U> inline __host__ __device__ auto round (U u)

Overload of round function.

It does not specify target round type T, using input source type U instead, and uses default round mode RoundMode::DEFAULT.

template<typename U, class = Require<HasTypeTraits<U>>> inline __host__ __device__ U min (U a, U b)

Metafunction to compute the minimum of two inputs per element.

This function finds the minimum of two inputs per element and returns the result with the same type as the input. For instance, two int4 inputs {1, 2, 3, 4} and {4, 3, 2, 1} yield the minimum {1, 2, 2, 1} as int4 as well (see example below). It is a requirement of min that the input source type has type traits.

using IntType = MakeType<int, 4>;
IntType a = {1, 2, 3, 4}, b = {4, 3, 2, 1};
IntType ab_min = min(a, b); // = {1, 2, 2, 1}
Template Parameters:

U – Type of the two source arguments and the return type.

Parameters:

a, b[in] Input values to compute \( min(x_a, x_b) \) where \( x_a \) ( \( x_b \)) is each element of \( a \) ( \( b \)).

Returns:

The return value with one minimum per element.

template<typename U, class = Require<HasTypeTraits<U>>> inline __host__ __device__ U max (U a, U b)

Metafunction to compute the maximum of two inputs per element.

This function finds the maximum of two inputs per element and returns the result with the same type as the input. For instance, two int4 inputs {1, 2, 3, 4} and {4, 3, 2, 1} yield the maximum {4, 3, 3, 4} as int4 as well (see example below). It is a requirement of max that the input source type has type traits.

using IntType = MakeType<int, 4>;
IntType a = {1, 2, 3, 4}, b = {4, 3, 2, 1};
IntType ab_max = max(a, b); // = {4, 3, 3, 4}
Template Parameters:

U – Type of the two source arguments and the return type.

Parameters:

a, b[in] Input values to compute \( max(x_a, x_b) \) where \( x_a \) ( \( x_b \)) is each element of \( a \) ( \( b \)).

Returns:

The return value with one maximum per element.

template<typename U, typename S, class = Require<(NumComponents<U> == NumComponents<S>) || (HasTypeTraits<U> && NumComponents<S> == 0)>> inline __host__ __device__ U pow (U x, S y)

Metafunction to compute the power of all elements of the input.

This function computes the power of all elements of the input x and returns the result with the same type as the input. It is a requirement of pow that the input x has the same number of components as the power y, or that y is a scalar (and the type of x has type traits).

Template Parameters:
  • U – Type of the source argument x and the return type.

  • S – Type of the source argument y power (use a regular C type for scalar).

Parameters:
  • x[in] Input value to compute \( x^y \).

  • y[in] Input power to compute \( x^y \).

Returns:

The return value with all elements as the result of the power.
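
For illustration, with a scalar exponent applied per element:

using FloatType = MakeType<float, 3>;
FloatType x = {1.f, 2.f, 3.f};
FloatType x_pow = pow(x, 2.f); // = {1.f, 4.f, 9.f}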

template<typename U, class = Require<HasTypeTraits<U>>> inline __host__ __device__ U exp (U u)

Metafunction to compute the natural (base e) exponential of all elements of the input.

This function computes the natural (base e) exponential of all elements of the input and returns the result with the same type as the input. It is a requirement of exp that the input source type has type traits.

Template Parameters:

U – Type of the source argument and the return type.

Parameters:

u[in] Input value to compute \( e^x \) where \( x \) is each element of \( u \).

Returns:

The return value with all elements as the result of the natural (base e) exponential.
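
For illustration:

using FloatType = MakeType<float, 2>;
FloatType u = {0.f, 1.f};
FloatType u_exp = exp(u); // approximately {1.f, 2.718282f}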

template<typename U, class = Require<HasTypeTraits<U>>> inline __host__ __device__ U sqrt (U u)

Metafunction to compute the square root of all elements of the input.

This function computes the square root of all elements of the input and returns the result with the same type as the input. It is a requirement of sqrt that the input source type has type traits.

Template Parameters:

U – Type of the source argument and the return type.

Parameters:

u[in] Input value to compute \( \sqrt{x} \) where \( x \) is each element of \( u \).

Returns:

The return value with all elements as the result of the square root.
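
For illustration:

using FloatType = MakeType<float, 3>;
FloatType u = {1.f, 4.f, 9.f};
FloatType u_sqrt = sqrt(u); // = {1.f, 2.f, 3.f}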

template<typename U, class = Require<HasTypeTraits<U>>> inline __host__ __device__ U abs (U u)

Metafunction to compute the absolute value of all elements of the input.

This function computes the absolute value of all elements of the input and returns the result with the same type as the input. For instance, an int4 input {-1, 2, -3, 4} yields the absolute {1, 2, 3, 4} as int4 as well (see example below). It is a requirement of abs that the input source type has type traits.

using IntType = MakeType<int, 4>;
IntType a = {-1, 2, -3, 4};
IntType a_abs = abs(a); // = {1, 2, 3, 4}
Template Parameters:

U – Type of the source argument and the return type.

Parameters:

u[in] Input value to compute \( |x| \) where \( x \) is each element of \( u \).

Returns:

The return value with the absolute of all elements.

template<typename U, typename S, class = Require<(NumComponents<U> == NumComponents<S>) || (HasTypeTraits<U> && NumComponents<S> == 0)>> inline __host__ __device__ U clamp (U u, S lo, S hi)

Metafunction to clamp all elements of the input.

This function clamps all elements of the input u between lo and hi and returns the result with the same type as the input. It is a requirement of clamp that the input u has the same number of components as the range values lo and hi, or that both are scalars (and the type of u has type traits).

Template Parameters:
  • U – Type of the source argument u and the return type.

  • S – Type of the source argument lo and hi (use a regular C type for scalar).

Parameters:
  • u[in] Input value to clamp.

  • lo[in] Input clamp range low value.

  • hi[in] Input clamp range high value.

Returns:

The return value with all elements clamped.
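
For illustration, with scalar range values applied per element:

using IntType = MakeType<int, 4>;
IntType u = {-5, 0, 5, 10};
IntType u_clamped = clamp(u, 0, 7); // = {0, 0, 5, 7}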

template<typename T, typename U, class = Require<HasTypeTraits<T, U> && !IsCompound<T>>> __host__ __device__ auto RangeCast (U u)

Metafunction to range cast (scale) all elements to a target range.

This function range casts (that is, scales) all elements to the range defined by the template argument type T. For instance, a float4 with all elements between 0 and 1 can be cast to a uchar4 with each element scaled to be between 0 and 255 (see example below). It is a requirement of RangeCast that both types have type traits and type T must be a regular C type. Several examples of possible target ranges, given a source range and depending on the limits of regular C types, are as follows:

Source type U     Target type T     Source range       Target range
signed char       float             [-128, 127]        [-1, 1]
float             unsigned char     [0, 1]             [0, 255]
short             unsigned int      [-32768, 32767]    [0, 4294967295]
double            int               [-1, 1]            [-2147483648, 2147483647]
unsigned short    double            [0, 65535]         [0, 1]

using DataType = MakeType<uchar, 4>;
using FloatDataType = ConvertBaseTypeTo<float, DataType>;
FloatDataType res = ...; // res component values are in [0, 1]
DataType pix = RangeCast<BaseType<DataType>>(res); // pix are in [0, 255]
Template Parameters:
  • T – Type that defines the target range to cast.

  • U – Type of the source value (with 1 to 4 elements) passed as argument.

Parameters:

u[in] Source value to cast all elements to range of type T.

Returns:

The value with all elements scaled.

template<typename T, typename U, class = Require<(NumComponents<T> == NumComponents<U>) || (NumComponents<T> == 0 && HasTypeTraits<U>)>> __host__ __device__ auto SaturateCast (U u)

Metafunction to saturate cast all elements to a target type.

This function saturate casts (clamping with potential rounding) all elements to the range defined by the template argument type T. For instance, a float4 with any values (possibly below 0 or above 255) can be cast to a uchar4 by rounding and then saturating each value to be between 0 and 255 (see example below). It is a requirement of SaturateCast that both types have the same number of components or T is a regular C type.

using DataType = MakeType<uchar, 4>;
using FloatDataType = ConvertBaseTypeTo<float, DataType>;
FloatDataType res = ...; // res component values are in [0, 1]
DataType pix = SaturateCast<DataType>(res); // pix are in [0, 255]
Template Parameters:
  • T – Type that defines the target range to cast.

  • U – Type of the source value (with 1 to 4 elements) passed as argument.

Parameters:

u[in] Source value to cast all elements to range of base type of T

Returns:

The value with all elements clamped and potentially rounded.

template<typename T, typename U, class = Require<HasTypeTraits<T, U> && !IsCompound<T>>> __host__ __device__ auto StaticCast (U u)

Metafunction to static cast all values of a compound to a target type.

The template parameter T defines the base type (regular C type) to which all components of the CUDA compound type U, passed as function argument u, are cast. The static cast return type has the base type T and the same number of components as the compound type U. For instance, a uint3 can be cast to an int3 by passing it as function argument of StaticCast and the type int as template argument (see example below). The type U is not needed as it is inferred from the argument u. It is a requirement of the StaticCast function that the type T is a regular C type and the type U is a CUDA compound type.

int3 idx = StaticCast<int>(blockIdx * blockDim + threadIdx);
Template Parameters:

T – Type to do static cast on each component of u.

Parameters:

u[in] Compound value to static cast each of its components to target type T.

Returns:

The compound value with all components static casted to type T.

template<typename T, typename StrideType = int64_t, class = Require<HasTypeTraits<T> && IsStrideType<StrideType>>>
__host__ auto CreateTensorWrapNHW(const TensorDataStridedCuda &tensor)

Factory function to create an NHW tensor wrap given a tensor data.

The output TensorWrap is an NHW 3D tensor that allows access to data per batch (N), per row (H) and per column (W) of the input tensor. The input tensor data must have either NHWC or HWC layout, where the channel C is inside T, e.g. T=uchar3 for RGB8.

Template Parameters:
  • T – Type of the values to be accessed in the tensor wrap.

  • StrideType – Type of the stride used in the tensor wrap.

Parameters:

tensor[in] Reference to the tensor that will be wrapped.

Returns:

Tensor wrap useful to access tensor data in CUDA kernels.
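
A minimal usage sketch for illustration (dstData is an assumed tensor data object, not part of this reference):

nvcv::TensorDataStridedCuda dstData = ...;
auto dst = nvcv::cuda::CreateTensorWrapNHW<uchar3>(dstData);
// In a CUDA kernel, dst[int3{x, y, sample}] accesses the pixel at column x (W) and row y (H)
// of the given sample (N).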

template<typename T, typename StrideType = int64_t, class = Require<HasTypeTraits<T> && IsStrideType<StrideType>>>
__host__ auto CreateTensorWrapNHWC(const TensorDataStridedCuda &tensor)

Factory function to create an NHWC tensor wrap given a tensor data.

The output TensorWrap is an NHWC 4D tensor that allows access to data per batch (N), per row (H), per column (W) and per channel (C) of the input tensor. The input tensor data must have either NHWC or HWC layout, where the channel C is of type T, e.g. T=uchar for each channel of either RGB8 or RGBA8.

Template Parameters:
  • T – Type of the values to be accessed in the tensor wrap.

  • StrideType – Type of the stride used in the tensor wrap.

Parameters:

tensor[in] Reference to the tensor that will be wrapped.

Returns:

Tensor wrap useful to access tensor data in CUDA kernels.

template<typename T, typename StrideType = int64_t, class = Require<HasTypeTraits<T> && IsStrideType<StrideType>>>
__host__ auto CreateTensorWrapNCHW(const TensorDataStridedCuda &tensor)

Factory function to create an NCHW tensor wrap given a tensor data.

The output TensorWrap is an NCHW 4D tensor that allows access to data per batch (N), per channel (C), per row (H), and per column (W) of the input tensor. The input tensor data must have either NCHW or CHW layout, where the channel C is of type T, e.g. T=uchar for each channel of either RGB8 or RGBA8.

Template Parameters:
  • T – Type of the values to be accessed in the tensor wrap.

  • StrideType – Type of the stride used in the tensor wrap.

Parameters:

tensor[in] Reference to the tensor that will be wrapped.

Returns:

Tensor wrap useful to access tensor data in CUDA kernels.

template<typename T, typename RT = detail::CopyConstness_t<T, std::conditional_t<IsCompound<T>, BaseType<T>, T>>, class = Require<HasTypeTraits<T>>> __host__ __device__ RT & GetElement (T &v, int eidx)

Metafunction to get an element by reference from a given value reference.

The value may be of CUDA compound type with 1 to 4 elements, where the corresponding element index is 0 to 3, and the return is a reference to the element with the base type of the compound type, copying the constness (that is the return reference is constant if the input value is constant). The value may be a regular C type, in which case the element index is ignored and the identity is returned. It is a requirement of the GetElement function that the type T has type traits.

using PixelRGB8Type = MakeType<unsigned char, 3>;
PixelRGB8Type pix = ...;
auto green = GetElement(pix, 1); // yields unsigned char
Template Parameters:

T – Type of the value to get the element from.

Parameters:
  • v[in] Value of type T to get an element from.

  • eidx[in] Element index in [0, 3] inside the compound value to get the reference from. This element index is ignored in case the value is not of a CUDA compound type.

Returns:

The reference of the value’s element.

template<int EIDX, typename T, typename RT = detail::CopyConstness_t<T, std::conditional_t<IsCompound<T>, BaseType<T>, T>>, class = Require<HasTypeTraits<T>>> __host__ __device__ RT & GetElement (T &v)
template<typename T, class = Require<HasTypeTraits<T>>> __host__ __device__ T SetAll (BaseType< T > x)

Metafunction to set all elements to the same value.

Set all elements to the value x passed as argument. For instance, an int3 can have all its elements set to zero by calling SetAll and passing int3 as template argument and zero as argument (see example below). Another way to set all elements to a value is by using the type of the argument as base type and passing the number of channels of the return type (see example below).

auto idx = SetAll<int3>(0); // sets to zero all elements of an int3 index idx: {0, 0, 0}
unsigned char ch = 127;
auto pix = SetAll<4>(ch); // sets all elements of an uchar4 pixel pix: {127, 127, 127, 127}
Template Parameters:
  • T – Type to be returned with all elements set to the given value x.

  • N – Number of components as a second option instead of passing the type T.

Parameters:

x[in] Value to set all elements to.

Returns:

The object of type T with all elements set to x.

template<int N, typename BT, typename RT = MakeType<BT, N>, class = Require<HasTypeTraits<BT>>> __host__ __device__ RT SetAll (BT x)
template<class T, class = Require<HasTypeTraits<T>>> const __host__ char * GetTypeName ()

Metafunction to get the name of a type.

Unfortunately typeid().name() in C/C++ typeinfo yields different names depending on the platform. This function returns the name of the type resembling the CUDA compound type, that may be useful for debug printing.

std::cout << GetTypeName<DataType>();
Template Parameters:

T – Type to get the name from.

Returns:

String with the name of the type.

Variables

template<typename ...Ts>
constexpr bool HasTypeTraits = (detail::HasTypeTraits_t<Ts>::value && ...)
template<class T, class = Require<HasTypeTraits<T>>>
constexpr bool IsCompound = TypeTraits<T>::components >= 1
template<typename T, int N, class = Require<HasTypeTraits<T>>>
constexpr bool HasEnoughComponents = N <= TypeTraits<T>::components
template<typename T>
constexpr bool IsStrideType = std::is_same_v<T, int32_t> || std::is_same_v<T, int64_t>
template<typename T, typename StrideType>
constexpr bool IsIndexType = std::is_integral_v<T> && (TypeTraits<T>::max <= TypeTraits<StrideType>::max)
template<class T, class = Require<HasTypeTraits<T>>>
constexpr int NumComponents = TypeTraits<T>::components

Metavariable to get the number of components of a type.

using DataType = ...;
int nc = nvcv::cuda::NumComponents<DataType>;

Note

This is zero for regular C types.

Template Parameters:

T – Type to get the number of components from.

template<class T, class = Require<HasTypeTraits<T>>>
constexpr int NumElements = TypeTraits<T>::elements

Metavariable to get the number of elements of a type.

using DataType = ...;
for (int e = 0; e < nvcv::cuda::NumElements<DataType>; ++e)
    // ...

Note

This is one for regular C types and one to four for CUDA compound types.

Template Parameters:

T – Type to get the number of elements from.

template<typename T, class = Require<HasTypeTraits<T>>>
constexpr BaseType<T> Lowest = std::is_floating_point_v<BaseType<T>> ? -TypeTraits<T>::max : TypeTraits<T>::min
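
For illustration, following the definition above:

static_assert(nvcv::cuda::Lowest<unsigned char> == 0);
static_assert(nvcv::cuda::Lowest<float> == -nvcv::cuda::TypeTraits<float>::max);
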
template<typename ValueType>
class ArrayWrap
#include <ArrayWrap.hpp>
template<typename T, NVCVBorderType B>
class BorderVarShapeWrap : public nvcv::cuda::detail::BorderIWImpl<T, B>
#include <BorderVarShapeWrap.hpp>

Border var-shape wrapper class used to wrap an ImageBatchVarShapeWrap adding border handling to it.

This class wraps an ImageBatchVarShapeWrap to add border handling functionality. It provides the methods ptr and operator[] to do the same semantic access (pointer or reference) in the wrapped ImageBatchVarShapeWrap but border aware on width and height as active dimensions.

using PixelType = ...;
using ImageBatchWrap = ImageBatchVarShapeWrap<PixelType>;
using BorderVarShape = BorderVarShapeWrap<PixelType, NVCV_BORDER_REPLICATE>;
ImageBatchWrap dst(...);
ImageBatchWrap srcImageBatch(...);
BorderVarShape src(srcImageBatch);
dim3 grid{...}, block{...};
int2 fillBorderSize{2, 2};
FillBorder<<<grid, block>>>(dst, src, src.numImages(), fillBorderSize);

template<typename T, NVCVBorderType B>
__global__ void FillBorder(ImageBatchVarShapeWrap<T> dst, BorderVarShapeWrap<T, B> src, int ns, int2 bs)
{
    int3 dstCoord = StaticCast<int>(blockIdx * blockDim + threadIdx);
    if (dstCoord.x >= dst.width(dstCoord.z) || dstCoord.y >= dst.height(dstCoord.z) || dstCoord.z >= ns)
        return;
    int3 srcCoord = {dstCoord.x - bs.x, dstCoord.y - bs.y, dstCoord.z};
    dst[dstCoord] = src[srcCoord];
}
Template Parameters:
  • T – Type (it can be const) of each element inside the image batch var-shape wrapper.

  • B – It is a NVCVBorderType indicating the border to be used.

template<typename T>
class BorderVarShapeWrap<T, NVCV_BORDER_CONSTANT> : public nvcv::cuda::detail::BorderIWImpl<T, NVCV_BORDER_CONSTANT>
#include <BorderVarShapeWrap.hpp>

Border var-shape wrapper class specialized for NVCV_BORDER_CONSTANT.

Template Parameters:

T – Type (it can be const) of each element inside the image batch var-shape wrapper.

template<typename T, NVCVBorderType B>
class BorderVarShapeWrapNHWC : public nvcv::cuda::detail::BorderIWImpl<T, B>
#include <BorderVarShapeWrap.hpp>
template<typename T>
class BorderVarShapeWrapNHWC<T, NVCV_BORDER_CONSTANT> : public nvcv::cuda::detail::BorderIWImpl<T, NVCV_BORDER_CONSTANT>
#include <BorderVarShapeWrap.hpp>

Border var-shape wrapper class specialized for NVCV_BORDER_CONSTANT.

Template Parameters:

T – Type (it can be const) of each element inside the image batch var-shape wrapper.

template<class TW, NVCVBorderType B, bool... ActiveDimensions>
class BorderWrap : public nvcv::cuda::detail::BorderWrapImpl<TW, B, ActiveDimensions...>
#include <BorderWrap.hpp>

Border wrapper class used to wrap a TensorWrap adding border handling to it.

This class wraps a TensorWrap to add border handling functionality. It provides the methods ptr and operator[] to do the same semantic access, pointer or reference respectively, in the wrapped TensorWrap but border aware. It also provides a compile-time set of boolean flags to inform active border-aware dimensions. Active dimensions participate in border handling, storing the corresponding dimension shape. Inactive dimensions are not checked, the dimension shape is not stored, and thus core dump (or segmentation fault) might happen if accessing outside boundaries of inactive dimensions.

using DataType = ...;
using TensorWrap2D = TensorWrap<-1, -1, DataType>;
using BorderWrap2D = BorderWrap<TensorWrap2D, NVCV_BORDER_REFLECT, true, true>;
TensorWrap2D tensorWrap(...);
int2 tensorShape = ...;
BorderWrap2D borderAwareTensor(tensorWrap, tensorShape.x, tensorShape.y);
// Now use borderAwareTensor instead of tensorWrap to access elements inside or outside the tensor,
// outside elements use reflect border, that is the outside index is reflected back inside the tensor

See also

NVCV_CPP_CUDATOOLS_BORDERWRAPS

Template Parameters:
  • TW – It is a TensorWrap class with any dimension and type.

  • B – It is a NVCVBorderType indicating the border to be used.

  • ActiveDimensions – Flags to inform active (true) or inactive (false) dimensions.

template<class TW, bool... ActiveDimensions>
class BorderWrap<TW, NVCV_BORDER_CONSTANT, ActiveDimensions...> : public nvcv::cuda::detail::BorderWrapImpl<TW, NVCV_BORDER_CONSTANT, ActiveDimensions...>
#include <BorderWrap.hpp>

Border wrapper class specialized for NVCV_BORDER_CONSTANT.

Template Parameters:
  • TW – It is a TensorWrap class with any dimension and type.

  • ActiveDimensions – Flags to inform active (true) or inactive (false) dimensions.

template<typename T, int N>
class FullTensorWrap
#include <FullTensorWrap.hpp>

FullTensorWrap class is a non-owning wrap of a N-D tensor used for easy access of its elements in CUDA device.

FullTensorWrap is a wrapper of a multi-dimensional tensor that holds all information related to it, i.e. N strides and N shapes, where N is its number of dimensions.

Template arguments:

  • T type of the values inside the tensor

  • N dimensions

FullTensor wrapper class specialized for non-constant value type.

Template Parameters:
  • T – Type (it can be const) of each element (or value) inside the tensor wrapper.

  • N – dimensions.

  • T – Type (non-const) of each element inside the tensor wrapper.

  • N – Number of dimensions.

template<typename T, int N>
class FullTensorWrap<const T, N>
#include <FullTensorWrap.hpp>
template<typename T>
class ImageBatchVarShapeWrap
#include <ImageBatchVarShapeWrap.hpp>

Image batch var-shape wrapper class to wrap ImageBatchVarShapeDataStridedCuda.

ImageBatchVarShapeWrap is a wrapper of an image batch (or a list of images) of variable shapes. The template parameter T is the type of each element inside the wrapper, and it can be compound type to represent a pixel type, e.g. uchar4 for RGBA images.

cudaStream_t stream;
cudaStreamCreate(&stream);
nvcv::ImageBatchVarShape imageBatch(samples);
auto *imageBatchData = imageBatch.exportData<nvcv::ImageBatchVarShapeDataStridedCuda>(stream);
nvcv::cuda::ImageBatchVarShapeWrap<uchar4> wrap(*imageBatchData);
// Now wrap can be used in device code to access elements of the image batch via operator[] or the ptr method.

Image batch var-shape wrapper class to wrap ImageBatchVarShapeDataStridedCuda.

This class is specialized for non-constant value type.

Template Parameters:
  • T – Type (it can be const) of each element inside the image batch var-shape wrapper.

  • T – Type (non-const) of each element inside the image batch var-shape wrapper.

Subclassed by nvcv::cuda::ImageBatchVarShapeWrapNHWC< T >

template<typename T>
class ImageBatchVarShapeWrap<const T>
#include <ImageBatchVarShapeWrap.hpp>
template<typename T>
class ImageBatchVarShapeWrapNHWC : private nvcv::cuda::ImageBatchVarShapeWrap<T>
#include <ImageBatchVarShapeWrap.hpp>

Image batch var-shape wrapper NHWC class to wrap ImageBatchVarShapeDataStridedCuda and number of channels.

This class handles the number of channels as a separate run-time parameter instead of it being built into T. It assumes interleaved channels, which appear in a packed sequence at the last dimension (thus NHWC). It also assumes that each image in the batch has a single plane.

Note

The class ImageBatchVarShapeWrap can be used with its template parameter type T as a compound type, where its number of elements yields the number of channels.

Template Parameters:

T – Type (it can be const) of each element inside this wrapper.
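
A construction sketch for illustration, assuming (as suggested by the description above) that the wrapper is built from the exported var-shape batch data plus a run-time channel count:

int numChannels = 3;
nvcv::cuda::ImageBatchVarShapeWrapNHWC<unsigned char> wrap(*imageBatchData, numChannels);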

template<typename T, NVCVBorderType B, NVCVInterpolationType I>
class InterpolationVarShapeWrap : public nvcv::cuda::detail::InterpolationVarShapeWrapImpl<T, B, I>
#include <InterpolationVarShapeWrap.hpp>

Interpolation var-shape wrapper class used to wrap a BorderVarShapeWrap adding interpolation handling to it.

This class wraps a BorderVarShapeWrap to add interpolation handling functionality. It provides the operator[] to do the same semantic value access in the wrapped BorderVarShapeWrap but interpolation aware.

See also

NVCV_CPP_CUDATOOLS_INTERPOLATIONVARSHAPEWRAPS

Note

Each interpolation wrap class below is specialized for one interpolation type.

Template Parameters:
  • T – Type (it can be const) of each element inside the border var-shape wrapper.

  • I – It is a NVCVInterpolationType defining the interpolation type to be used.

template<typename T, NVCVBorderType B>
class InterpolationVarShapeWrap<T, B, NVCV_INTERP_AREA> : public nvcv::cuda::detail::InterpolationVarShapeWrapImpl<T, B, NVCV_INTERP_AREA>
#include <InterpolationVarShapeWrap.hpp>

Interpolation var-shape wrapper class specialized for NVCV_INTERP_AREA.

Template Parameters:

T – Type (it can be const) of each element inside the border var-shape wrapper.

template<typename T, NVCVBorderType B>
class InterpolationVarShapeWrap<T, B, NVCV_INTERP_CUBIC> : public nvcv::cuda::detail::InterpolationVarShapeWrapImpl<T, B, NVCV_INTERP_CUBIC>
#include <InterpolationVarShapeWrap.hpp>

Interpolation var-shape wrapper class specialized for NVCV_INTERP_CUBIC.

Template Parameters:

T – Type (it can be const) of each element inside the border var-shape wrapper.

template<typename T, NVCVBorderType B>
class InterpolationVarShapeWrap<T, B, NVCV_INTERP_LINEAR> : public nvcv::cuda::detail::InterpolationVarShapeWrapImpl<T, B, NVCV_INTERP_LINEAR>
#include <InterpolationVarShapeWrap.hpp>

Interpolation var-shape wrapper class specialized for NVCV_INTERP_LINEAR.

Template Parameters:

T – Type (it can be const) of each element inside the border var-shape wrapper.

template<typename T, NVCVBorderType B>
class InterpolationVarShapeWrap<T, B, NVCV_INTERP_NEAREST> : public nvcv::cuda::detail::InterpolationVarShapeWrapImpl<T, B, NVCV_INTERP_NEAREST>
#include <InterpolationVarShapeWrap.hpp>

Interpolation var-shape wrapper class specialized for NVCV_INTERP_NEAREST.

Template Parameters:

T – Type (it can be const) of each element inside the border var-shape wrapper.

template<class BW, NVCVInterpolationType I>
class InterpolationWrap : public nvcv::cuda::detail::InterpolationWrapImpl<BW, I>
#include <InterpolationWrap.hpp>

Interpolation wrapper class used to wrap a BorderWrap adding interpolation handling to it.

This class wraps a BorderWrap to add interpolation handling functionality. It provides the operator[] to do the same semantic value access in the wrapped BorderWrap but interpolation aware.

using DataType = ...;
using TensorWrap2D = TensorWrap<-1, -1, DataType>;
using BorderWrap2D = BorderWrap<TensorWrap2D, NVCV_BORDER_REFLECT, true, true>;
using InterpWrap2D = InterpolationWrap<BorderWrap2D, NVCV_INTERP_CUBIC>;
TensorWrap2D tensorWrap(...);
BorderWrap2D borderWrap(...);
InterpWrap2D interpolationAwareTensor(borderWrap);
// Now use interpolationAwareTensor instead of borderWrap or tensorWrap to access in-between elements with
// on-the-fly interpolation, in this example grid-unaligned pixels use bi-cubic interpolation, grid-aligned
// pixels that fall outside the tensor use reflect border extension and inside is the tensor value itself

See also

NVCV_CPP_CUDATOOLS_INTERPOLATIONWRAPS

Note

Each interpolation wrap class below is specialized for one interpolation type.

Template Parameters:
  • BW – It is a BorderWrap class with any dimension and type.

  • I – It is a NVCVInterpolationType defining the interpolation type to be used.

template<class BW>
class InterpolationWrap<BW, NVCV_INTERP_AREA> : public nvcv::cuda::detail::InterpolationWrapImpl<BW, NVCV_INTERP_AREA>
#include <InterpolationWrap.hpp>

Interpolation wrapper class specialized for NVCV_INTERP_AREA.

Template Parameters:

BW – It is a BorderWrap class with any dimension and type.

template<class BW>
class InterpolationWrap<BW, NVCV_INTERP_CUBIC> : public nvcv::cuda::detail::InterpolationWrapImpl<BW, NVCV_INTERP_CUBIC>
#include <InterpolationWrap.hpp>

Interpolation wrapper class specialized for NVCV_INTERP_CUBIC.

Template Parameters:

BW – It is a BorderWrap class with any dimension and type.

template<class BW>
class InterpolationWrap<BW, NVCV_INTERP_LINEAR> : public nvcv::cuda::detail::InterpolationWrapImpl<BW, NVCV_INTERP_LINEAR>
#include <InterpolationWrap.hpp>

Interpolation wrapper class specialized for NVCV_INTERP_LINEAR.

Template Parameters:

BW – It is a BorderWrap class with any dimension and type.

template<class BW>
class InterpolationWrap<BW, NVCV_INTERP_NEAREST> : public nvcv::cuda::detail::InterpolationWrapImpl<BW, NVCV_INTERP_NEAREST>
#include <InterpolationWrap.hpp>

Interpolation wrapper class specialized for NVCV_INTERP_NEAREST.

Template Parameters:

BW – It is a BorderWrap class with any dimension and type.

template<typename T, typename StrideT, StrideT... Strides>
class TensorBatchWrapT
#include <TensorBatchWrap.hpp>

TensorBatchWrap class is a non-owning wrap of a batch of N-D tensors used for easy access of its elements in CUDA device.

TensorBatchWrap is a wrapper of a batch of multi-dimensional tensors that can have one or more of its N dimension strides, or pitches, defined either at compile-time or at run-time. Each pitch in Strides represents the offset in bytes as a compile-time template parameter that will be applied from the first (slowest changing) dimension to the last (fastest changing) dimension of the tensor, in that order. Each dimension with run-time pitch is specified as -1 in the Strides template parameter.

Template arguments:

  • T type of the values inside the tensors

  • Strides sequence of compile- or run-time pitches (-1 indicates run-time)

    • Y compile-time pitches

    • X run-time pitches

    • N dimensions, where N = X + Y

For example, in the code below a wrap is defined for a batch of HWC 3D tensors where each row in H has a run-time row pitch (second -1), a pixel in W has a compile-time constant pitch as the size of the pixel type and a channel in C has also a compile-time constant pitch as the size of the channel type.

using DataType = ...;
using ChannelType = BaseType<DataType>;
using TensorBatchWrap = TensorBatchWrap<ChannelType, -1, sizeof(DataType), sizeof(ChannelType)>;
TensorBatch tensorBatch = ...;
TensorBatchWrap tensorBatchWrap(tensorBatch.data());
// Elements may be accessed via operator[] using an int4 argument.  They can also be accessed via pointer using
// the ptr method with up to 4 integer arguments or by accessing each TensorWrap separately with tensor(...) method.

TensorBatch wrapper class specialized for non-constant value type.

Template Parameters:
  • T – Type (it can be const) of each element inside the tensor wrapper.

  • Strides – Each compile-time (use -1 for run-time) pitch in bytes from first to last dimension.

  • T – Type (non-const) of each element inside the tensor batch wrapper.

  • Strides – Each compile-time (use -1 for run-time) pitch in bytes from first to last dimension.

template<typename T, typename StrideT, StrideT... Strides>
class TensorBatchWrapT<const T, StrideT, Strides...>
#include <TensorBatchWrap.hpp>
template<typename T, typename StrideT, StrideT... Strides>
class TensorWrapT
#include <TensorWrap.hpp>

TensorWrap class is a non-owning wrap of a N-D tensor used for easy access of its elements in CUDA device.

TensorWrap is a wrapper of a multi-dimensional tensor that can have one or more of its N dimension strides, or pitches, defined either at compile-time or at run-time. Each pitch in Strides represents the offset in bytes as a compile-time template parameter that will be applied from the first (slowest changing) dimension to the last (fastest changing) dimension of the tensor, in that order. Each dimension with run-time pitch is specified as -1 in the Strides template parameter.

Template arguments:

  • T type of the values inside the tensor

  • StrideT type of the stride used in the byte offset calculation

  • Strides sequence of compile- or run-time pitches (-1 indicates run-time)

    • Y compile-time pitches

    • X run-time pitches

    • N dimensions, where N = X + Y

For example, in the code below a wrap is defined for an NHWC 4D tensor where each sample image in N has a run-time image pitch (first -1 in template argument), and each row in H has a run-time row pitch (second -1), a pixel in W has a compile-time constant pitch as the size of the pixel type and a channel in C has also a compile-time constant pitch as the size of the channel type.

using DataType = ...;
using ChannelType = BaseType<DataType>;
using TensorWrap = TensorWrap<ChannelType, -1, -1, sizeof(DataType), sizeof(ChannelType)>;
std::byte *imageData = ...;
int imgStride = ...;
int rowStride = ...;
TensorWrap tensorWrap(imageData, imgStride, rowStride);
// Elements may be accessed via operator[] using an int4 argument.  They can also be accessed via pointer using
// the ptr method with up to 4 integer arguments.

Tensor wrapper class specialized for non-constant value type.

Template Parameters:
  • T – Type (it can be const) of each element inside the tensor wrapper.

  • Strides – Each compile-time (use -1 for run-time) pitch in bytes from first to last dimension.

  • T – Type (non-const) of each element inside the tensor wrapper.

  • Strides – Each compile-time (use -1 for run-time) pitch in bytes from first to last dimension.

template<typename T, typename StrideT, StrideT... Strides>
class TensorWrapT<const T, StrideT, Strides...>
#include <TensorWrap.hpp>
namespace detail

Typedefs

template<typename T, NVCVBorderType B>
using BorderVarShapeWrapImpl = BorderIWImpl<ImageBatchVarShapeWrap<T>, B>
template<typename T, NVCVBorderType B>
using BorderVarShapeWrapNHWCImpl = BorderIWImpl<ImageBatchVarShapeWrapNHWC<T>, B>
template<class FROM, class TO>
using CopyConstness_t = typename CopyConstness<FROM, TO>::type
template<class T, int C>
using MakeType_t = typename detail::MakeType<T, C>::type
template<class BT, class T>
using ConvertBaseTypeTo_t = typename ConvertBaseTypeTo<BT, T>::type

Functions

template<typename U> inline __host__ U RoundEvenImpl (U u)
template<typename T, typename U, int RM = FE_TONEAREST> inline __host__ __device__ T RoundImpl (U u)
template<typename U> inline __host__ __device__ U MinImpl (U a, U b)
template<typename U> inline __host__ __device__ U MaxImpl (U a, U b)
template<typename U, typename S> inline __host__ __device__ U PowImpl (U x, S y)
template<typename U> inline __host__ __device__ U ExpImpl (U u)
template<typename U> inline __host__ __device__ U SqrtImpl (U u)
template<typename U> inline __host__ __device__ U AbsImpl (U u)
template<typename U, typename S> inline __host__ __device__ U ClampImpl (U u, S lo, S hi)
template<typename T, typename U> inline __host__ __device__ T RangeCastImpl (U u)
template<typename T, typename U> inline __host__ __device__ T BaseSaturateCastImpl (U u)
template<typename T, typename U> inline __host__ __device__ T SaturateCastImpl (U u)
template<> __host__ __device__ __forceinline__ unsigned char SaturateCastImpl< unsigned char, signed char > (signed char u)
template<> __host__ __device__ __forceinline__ unsigned char SaturateCastImpl< unsigned char, short > (short u)
template<> __host__ __device__ __forceinline__ unsigned char SaturateCastImpl< unsigned char, unsigned short > (unsigned short u)
template<> __host__ __device__ __forceinline__ unsigned char SaturateCastImpl< unsigned char, int > (int u)
template<> __host__ __device__ __forceinline__ unsigned char SaturateCastImpl< unsigned char, unsigned int > (unsigned int u)
template<> __host__ __device__ __forceinline__ unsigned char SaturateCastImpl< unsigned char, float > (float u)
template<> __host__ __device__ __forceinline__ unsigned char SaturateCastImpl< unsigned char, double > (double u)
template<> __host__ __device__ __forceinline__ signed char SaturateCastImpl< signed char, unsigned char > (unsigned char u)
template<> __host__ __device__ __forceinline__ signed char SaturateCastImpl< signed char, short > (short u)
template<> __host__ __device__ __forceinline__ signed char SaturateCastImpl< signed char, unsigned short > (unsigned short u)
template<> __host__ __device__ __forceinline__ signed char SaturateCastImpl< signed char, int > (int u)
template<> __host__ __device__ __forceinline__ signed char SaturateCastImpl< signed char, unsigned int > (unsigned int u)
template<> __host__ __device__ __forceinline__ signed char SaturateCastImpl< signed char, float > (float u)
template<> __host__ __device__ __forceinline__ signed char SaturateCastImpl< signed char, double > (double u)
template<> __host__ __device__ __forceinline__ unsigned short SaturateCastImpl< unsigned short, signed char > (signed char u)
template<> __host__ __device__ __forceinline__ unsigned short SaturateCastImpl< unsigned short, short > (short u)
template<> __host__ __device__ __forceinline__ unsigned short SaturateCastImpl< unsigned short, int > (int u)
template<> __host__ __device__ __forceinline__ unsigned short SaturateCastImpl< unsigned short, unsigned int > (unsigned int u)
template<> __host__ __device__ __forceinline__ unsigned short SaturateCastImpl< unsigned short, float > (float u)
template<> __host__ __device__ __forceinline__ unsigned short SaturateCastImpl< unsigned short, double > (double u)
template<> __host__ __device__ __forceinline__ short SaturateCastImpl< short, unsigned short > (unsigned short u)
template<> __host__ __device__ __forceinline__ short SaturateCastImpl< short, int > (int u)
template<> __host__ __device__ __forceinline__ short SaturateCastImpl< short, unsigned int > (unsigned int u)
template<> __host__ __device__ __forceinline__ short SaturateCastImpl< short, float > (float u)
template<> __host__ __device__ __forceinline__ short SaturateCastImpl< short, double > (double u)
template<> __host__ __device__ __forceinline__ int SaturateCastImpl< int, unsigned int > (unsigned int u)
template<> __host__ __device__ __forceinline__ int SaturateCastImpl< int, float > (float u)
template<> __host__ __device__ __forceinline__ int SaturateCastImpl< int, double > (double u)
template<> __host__ __device__ __forceinline__ unsigned int SaturateCastImpl< unsigned int, signed char > (signed char u)
template<> __host__ __device__ __forceinline__ unsigned int SaturateCastImpl< unsigned int, short > (short u)
template<> __host__ __device__ __forceinline__ unsigned int SaturateCastImpl< unsigned int, int > (int u)
template<> __host__ __device__ __forceinline__ unsigned int SaturateCastImpl< unsigned int, float > (float u)
template<> __host__ __device__ __forceinline__ unsigned int SaturateCastImpl< unsigned int, double > (double u)
template<typename T, typename U, typename RT, RoundMode RM> inline __host__ __device__ RT RoundImpl (U u)

Variables

template<class T, class U, class = Require<HasTypeTraits<T, U>>>
constexpr bool IsSameCompound = IsCompound<T> && TypeTraits<T>::components == TypeTraits<U>::components
template<typename T, typename U, class = Require<HasTypeTraits<T, U>>>
constexpr bool OneIsCompound = (TypeTraits<T>::components == 0 && TypeTraits<U>::components >= 1) || (TypeTraits<T>::components >= 1 && TypeTraits<U>::components == 0) || IsSameCompound<T, U>
template<typename T, class = Require<HasTypeTraits<T>>>
constexpr bool IsIntegral = std::is_integral_v<typename TypeTraits<T>::base_type>
template<typename T, typename U, class = Require<HasTypeTraits<T, U>>>
constexpr bool OneIsCompoundAndBothAreIntegral = OneIsCompound<T, U> && IsIntegral<T> && IsIntegral<U>
template<typename T, class = Require<HasTypeTraits<T>>>
constexpr bool IsIntegralCompound = IsIntegral<T> && IsCompound<T>
template<class IW, NVCVBorderType B>
class BorderIWImpl
#include <BorderVarShapeWrap.hpp>

Subclassed by nvcv::cuda::BorderVarShapeWrap< T, B >, nvcv::cuda::BorderVarShapeWrap< T, NVCV_BORDER_CONSTANT >, nvcv::cuda::BorderVarShapeWrapNHWC< T, B >, nvcv::cuda::BorderVarShapeWrapNHWC< T, NVCV_BORDER_CONSTANT >

template<class TW, NVCVBorderType B, bool... ActiveDimensions>
class BorderWrapImpl
#include <BorderWrap.hpp>
template<class BT, class T>
struct ConvertBaseTypeTo
#include <Metaprogramming.hpp>
template<class BT, class T>
struct ConvertBaseTypeTo<BT, const T>
#include <Metaprogramming.hpp>
template<class BT, class T>
struct ConvertBaseTypeTo<BT, volatile const T>
#include <Metaprogramming.hpp>
template<class BT, class T>
struct ConvertBaseTypeTo<BT, volatile T>
#include <Metaprogramming.hpp>
template<class FROM, class TO>
struct CopyConstness
#include <Metaprogramming.hpp>
template<class FROM, class TO>
struct CopyConstness<const FROM, TO>
#include <Metaprogramming.hpp>
template<typename T, typename = void>
struct HasTypeTraits_t : public false_type
#include <Metaprogramming.hpp>
template<typename T>
struct HasTypeTraits_t<T, std::void_t<typename TypeTraits<T>::base_type>> : public true_type
#include <Metaprogramming.hpp>
template<typename T, NVCVBorderType B, NVCVInterpolationType I>
class InterpolationVarShapeWrapImpl
#include <InterpolationVarShapeWrap.hpp>

Subclassed by nvcv::cuda::InterpolationVarShapeWrap< T, B, I >

template<class BW, NVCVInterpolationType I>
class InterpolationWrapImpl
#include <InterpolationWrap.hpp>

Subclassed by nvcv::cuda::InterpolationWrap< BW, I >

template<class T, int C>
struct MakeType
#include <Metaprogramming.hpp>
template<>
struct MakeType<char, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<char, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<char, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<char, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<char, 4>
#include <Metaprogramming.hpp>
template<class T, int C>
struct MakeType<const T, C>
#include <Metaprogramming.hpp>
template<class T, int C>
struct MakeType<volatile const T, C>
#include <Metaprogramming.hpp>
template<>
struct MakeType<double, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<double, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<double, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<double, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<double, 4>
#include <Metaprogramming.hpp>
template<>
struct MakeType<float, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<float, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<float, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<float, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<float, 4>
#include <Metaprogramming.hpp>
template<>
struct MakeType<int, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<int, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<int, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<int, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<int, 4>
#include <Metaprogramming.hpp>
template<>
struct MakeType<long long, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<long long, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<long long, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<long long, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<long long, 4>
#include <Metaprogramming.hpp>
template<>
struct MakeType<long, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<long, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<long, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<long, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<long, 4>
#include <Metaprogramming.hpp>
template<>
struct MakeType<short, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<short, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<short, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<short, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<short, 4>
#include <Metaprogramming.hpp>
template<>
struct MakeType<signed char, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<signed char, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<signed char, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<signed char, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<signed char, 4>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned char, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned char, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned char, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned char, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned char, 4>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned int, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned int, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned int, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned int, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned int, 4>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned long long, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned long long, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned long long, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned long long, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned long long, 4>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned long, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned long, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned long, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned long, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned long, 4>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned short, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned short, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned short, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned short, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned short, 4>
#include <Metaprogramming.hpp>
template<class T, int C>
struct MakeType<volatile T, C>
#include <Metaprogramming.hpp>
template<class T>
struct TypeTraits
#include <Metaprogramming.hpp>

Subclassed by nvcv::cuda::detail::TypeTraits< const T >, nvcv::cuda::detail::TypeTraits< const volatile T >, nvcv::cuda::detail::TypeTraits< volatile T >

template<>
struct TypeTraits<char>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<char1>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<char2>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<char3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<char4>
#include <Metaprogramming.hpp>
template<class T>
struct TypeTraits<const T> : public nvcv::cuda::detail::TypeTraits<T>
#include <Metaprogramming.hpp>
template<class T>
struct TypeTraits<volatile const T> : public nvcv::cuda::detail::TypeTraits<T>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<dim3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<double>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<double1>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<double2>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<double3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<double4>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<float>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<float1>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<float2>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<float3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<float4>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<int>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<int1>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<int2>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<int3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<int4>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<long>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<long long>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<long1>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<long2>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<long3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<long4>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<longlong1>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<longlong2>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<longlong3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<longlong4>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<short>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<short1>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<short2>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<short3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<short4>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<signed char>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<uchar1>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<uchar2>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<uchar3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<uchar4>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<uint1>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<uint2>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<uint3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<uint4>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<ulong1>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<ulong2>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<ulong3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<ulong4>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<ulonglong1>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<ulonglong2>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<ulonglong3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<ulonglong4>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<unsigned char>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<unsigned int>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<unsigned long>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<unsigned long long>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<unsigned short>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<ushort1>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<ushort2>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<ushort3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<ushort4>
#include <Metaprogramming.hpp>
template<class T>
struct TypeTraits<volatile T> : public nvcv::cuda::detail::TypeTraits<T>
#include <Metaprogramming.hpp>
namespace math

Functions

template<class T, int N> constexpr __host__ __device__ Vector< T, N > & operator+= (Vector< T, N > &lhs, const Vector< T, N > &rhs)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator+ (const Vector< T, N > &a, const Vector< T, N > &b)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > & operator+= (Vector< T, N > &lhs, T rhs)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator+ (const Vector< T, N > &a, T b)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator+ (T a, const Vector< T, N > &b)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > & operator-= (Vector< T, N > &lhs, const Vector< T, N > &rhs)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator- (const Vector< T, N > &a, const Vector< T, N > &b)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > & operator-= (Vector< T, N > &lhs, T rhs)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator- (const Vector< T, N > &a, T b)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator- (T a, const Vector< T, N > &b)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > & operator*= (Vector< T, N > &lhs, const T &rhs)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator* (const Vector< T, N > &a, const T &b)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator* (const T &a, const Vector< T, N > &b)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > & operator*= (Vector< T, N > &lhs, const Vector< T, N > &rhs)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator* (const Vector< T, N > &a, const Vector< T, N > &b)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > & operator/= (Vector< T, N > &lhs, const T &rhs)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator/ (const Vector< T, N > &a, const T &b)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator/ (T a, const Vector< T, N > &b)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > & operator/= (Vector< T, N > &lhs, const Vector< T, N > &rhs)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator/ (const Vector< T, N > &a, const Vector< T, N > &b)
template<class T, int N>
std::ostream &operator<<(std::ostream &out, const Vector<T, N> &v)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator- (const Vector< T, N > &v)
template<class T, int N> constexpr __host__ __device__ bool operator== (const Vector< T, N > &a, const Vector< T, N > &b)
template<class T, int N> constexpr __host__ __device__ bool operator== (const T &a, const Vector< T, N > &b)
template<class T, int N> constexpr __host__ __device__ bool operator== (const Vector< T, N > &a, const T &b)
template<class T, int N> constexpr __host__ __device__ bool operator< (const Vector< T, N > &a, const Vector< T, N > &b)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > operator* (const Matrix< T, M, N > &m, T val)
template<class T, int M, int N>
std::ostream &operator<<(std::ostream &out, const Matrix<T, M, N> &m)
template<class T, int M, int N, int P> constexpr __host__ __device__ Matrix< T, M, P > operator* (const Matrix< T, M, N > &a, const Matrix< T, N, P > &b)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > & operator*= (Matrix< T, M, N > &lhs, T rhs)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > operator* (T val, const Matrix< T, M, N > &m)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > & operator+= (Matrix< T, M, N > &lhs, const Matrix< T, M, N > &rhs)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > operator+ (const Matrix< T, M, N > &lhs, const Matrix< T, M, N > &rhs)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > & operator-= (Matrix< T, M, N > &lhs, const Matrix< T, M, N > &rhs)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > operator- (const Matrix< T, M, N > &a, const Matrix< T, M, N > &b)
template<class T, int M, int N> constexpr __host__ __device__ Vector< T, N > operator* (const Vector< T, M > &v, const Matrix< T, M, N > &m)
template<class T, int M, int N> constexpr __host__ __device__ Vector< T, M > operator* (const Matrix< T, M, N > &m, const Vector< T, N > &v)
template<class T, int M, int N, class = cuda::Require<(M == N && N > 1)>> constexpr __host__ __device__ Matrix< T, M, N > operator* (const Matrix< T, M, 1 > &m, const Vector< T, N > &v)
template<class T, int M, int N> constexpr __host__ __device__ Vector< T, N > & operator*= (Vector< T, M > &v, const Matrix< T, M, M > &m)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > & operator*= (Matrix< T, M, N > &lhs, const Matrix< T, N, N > &rhs)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > operator- (const Matrix< T, M, N > &m)
template<class T, int M, int N> constexpr __host__ __device__ bool operator== (const Matrix< T, M, N > &a, const Matrix< T, M, N > &b)
template<class T, int M, int N> constexpr __host__ __device__ bool operator== (const T &a, const Matrix< T, M, N > &b)
template<class T, int M, int N> constexpr __host__ __device__ bool operator== (const Matrix< T, M, N > &a, const T &b)
template<class T, int M, int N> constexpr __host__ __device__ bool operator< (const Matrix< T, M, N > &a, const Matrix< T, M, N > &b)
template<typename T, int N, int M>
constexpr Matrix<T, N, M> as_matrix(const T (&values)[N][M])
template<class T, int N> constexpr __host__ __device__ Vector< T, N > zeros ()
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > zeros ()
template<class T, int N> constexpr __host__ __device__ Vector< T, N > ones ()
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > ones ()
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > identity ()
template<class T, int M> constexpr __host__ __device__ Matrix< T, M, M > vander (const Vector< T, M > &v)
template<class T, int R> constexpr __host__ __device__ Matrix< T, R, R > compan (const Vector< T, R > &a)
template<class T, int M> constexpr __host__ __device__ Matrix< T, M, M > diag (const Vector< T, M > &v)
template<class T, int N> constexpr __host__ __device__ T dot (const Vector< T, N > &a, const Vector< T, N > &b)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > reverse (const Vector< T, N > &a)
template<class T, int M> constexpr __host__ __device__ Matrix< T, M, M > & transp_inplace (Matrix< T, M, M > &m)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, N, M > transp (const Matrix< T, M, N > &m)
template<class T, int N> constexpr __host__ __device__ Matrix< T, N, 1 > transp (const Vector< T, N > &v)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > flip_rows (const Matrix< T, M, N > &m)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > flip_cols (const Matrix< T, M, N > &m)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > flip (const Matrix< T, M, N > &m)
template<class T, int M, int N = M>
class Matrix
#include <LinAlg.hpp>

Matrix class to represent small matrices.

It uses the Vector class to store each row, keeping elements in row-major order, i.e. it has M row vectors where each vector has N elements.

Template Parameters:
  • T – Matrix value type.

  • M – Number of rows.

  • N – Number of columns. Default is M (a square matrix).

template<class T, int N>
class Vector
#include <LinAlg.hpp>

Vector class to represent small vectors.

Template Parameters:
  • T – Vector value type.

  • N – Number of elements.
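
A minimal sketch (assumed usage, not part of this listing) combining the operators and helper factories listed above:

auto v = nvcv::cuda::math::ones<float, 3>();        // Vector<float, 3> = {1, 1, 1}
auto m = nvcv::cuda::math::identity<float, 3, 3>(); // 3x3 identity Matrix<float, 3, 3>
v = m * v + v;                                      // matrix-vector product plus element-wise sum = {2, 2, 2}
float d = nvcv::cuda::math::dot(v, v);              // dot product = 12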

namespace detail

Functions

template<class T> constexpr __host__ __device__ void swap (T &a, T &b)
namespace detail

Typedefs

template<typename Func>
using AddContext_t = typename AddContext<Func>::type
template<class IT>
using IsRandomAccessIterator = typename std::enable_if<std::is_same<typename std::iterator_traits<IT>::iterator_category, std::random_access_iterator_tag>::value>::type
template<std::size_t N>
using MakeIndexSequence = typename MakeIndexSequenceImpl<N>::type
template<bool Cond, typename T = void>
using EnableIf_t = typename std::enable_if<Cond, T>::type
template<bool Cond, typename If, typename Else>
using Conditional_t = typename std::conditional<Cond, If, Else>::type
template<typename T>
using AddPointer_t = typename std::add_pointer<T>::type
template<typename T>
using AddLRef_t = typename std::add_lvalue_reference<T>::type
template<typename T>
using AddRRef_t = typename std::add_rvalue_reference<T>::type
template<typename T>
using RemovePointer_t = typename std::remove_pointer<T>::type
template<typename T>
using RemoveRef_t = typename std::remove_reference<T>::type
template<typename T>
using RemoveCV_t = typename std::remove_cv<T>::type
template<typename T>
using RemoveCVRef_t = RemoveCV_t<RemoveRef_t<T>>

Functions

template<typename T>
constexpr T AlignDown(T value, T alignment_pow2)

Aligns the value down to a multiple of alignment_pow2.

The function operates by masking the least significant bits of the value. If the alignment is not a power of two, the behavior is undefined.

Remark

Negative values are aligned down, not towards zero.

Template Parameters:

T – an integral type

Parameters:
  • value – a value to align

  • alignment_pow2 – the alignment, must be a positive power of 2

Returns:

constexpr T the value aligned down to a multiple of alignment_pow2

template<typename T>
constexpr T AlignUp(T value, T alignment_pow2)

Aligns the value up to a multiple of alignment_pow2.

The function operates by adding alignment-1 to the value and masking the least significant bits. If the alignment is not a power of two, the behavior is undefined.

Remark

Negative values are aligned up, that is, towards zero.

Template Parameters:

T – an integral type

Parameters:
  • value – a value to align

  • alignment_pow2 – the alignment, must be a positive power of 2

Returns:

constexpr T the value aligned up to a multiple of alignment_pow2

template<typename T>
constexpr bool IsAligned(T value, T alignment_pow2)

Checks if the value is a multiple of alignment.

Template Parameters:

T – an integral type

Parameters:
  • value – the value whose alignment is checked

  • alignment_pow2 – the alignment, must be a power of 2

Returns:

true if value is a multiple of alignment_pow2

Returns:

false otherwise
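
For concreteness, a few worked values illustrating the behavior described above (a sketch; unqualified names refer to the functions in this detail namespace):

static_assert(AlignDown(13, 8) == 8);    // previous multiple of 8
static_assert(AlignUp(13, 8) == 16);     // next multiple of 8
static_assert(AlignDown(-13, 8) == -16); // negative values align down, away from zero
static_assert(AlignUp(-13, 8) == -8);    // negative values align up, towards zero
static_assert(IsAligned(16, 8) && !IsAligned(13, 8));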

inline bool IsAligned(const void *ptr, uintptr_t alignment_pow2)

Checks if a pointer is aligned to a multiple of alignment_pow2 bytes.

Parameters:
  • ptr – the pointer whose alignment is checked

  • alignment_pow2 – the alignment, must be a power of 2

Returns:

true if the pointer address is a multiple of alignment_pow2

Returns:

false otherwise

template<class IFACE, class H>
void SetObjectAssociation(NVCVStatus (*setUserPointer)(H, void*), IFACE *obj, H handle)
template<class IFACE, class H>
IFACE *CastImpl(NVCVStatus (*getUserPointer)(H, void**), NVCVStatus (*setUserPointer)(H, void*), H handle)
inline void ThrowException(NVCVStatus status)
inline void CheckThrow(NVCVStatus status)
inline NVCVImageHandle GetImageHandleForPushBack(Image img)
inline NVCVImageHandle GetImageHandleForPushBack(std::reference_wrapper<Image> img)
inline NVCVImageHandle GetImageHandleForPushBack(NVCVImageHandle imgHandle)
template<typename R, typename... Args, typename F> std::is_convertible< decltype(std::declval< F >()(std::declval< Args >()...)), R > IsInvocableRF (F *)
template<typename R, typename ...Args>
std::false_type IsInvocableRF(...)
template<typename ...Args, typename F, typename = decltype(std::declval<F>()(std::declval<Args>()...))>
std::true_type IsInvocableF(F*)
template<typename ...Args>
std::false_type IsInvocableF(...)
std::false_type IsStdFunctionF(...)
template<typename T>
std::true_type IsStdFunctionF(const std::function<T>*)
std::false_type IsRefWrapperF(...)
template<typename T>
std::true_type IsRefWrapperF(const std::reference_wrapper<T>*)

Variables

constexpr InPlaceT InPlace
template<typename Func>
struct AddContext
#include <Callback.hpp>
template<typename Ret, typename ...Args>
struct AddContext<Ret(Args...)>
#include <Callback.hpp>
template<typename ArrayDataType, typename = typename std::enable_if<std::is_base_of<ArrayData, ArrayDataType>::value>::type>
class ArrayDataAccessImpl
#include <ArrayDataAccess.hpp>
template<class T, int ID = 0>
class BaseFromMember
#include <BaseFromMember.hpp>
template<class T, int ID>
class BaseFromMember<T&, ID>
#include <BaseFromMember.hpp>
template<typename...>
struct Conjunction : public true_type
#include <TypeTraits.hpp>
template<typename T, typename ...Ts>
struct Conjunction<T, Ts...> : public nvcv::detail::ConjunctionImpl<T::value, Ts...>
#include <TypeTraits.hpp>
template<bool, typename...>
struct ConjunctionImpl
#include <TypeTraits.hpp>
template<typename ...Ts>
struct ConjunctionImpl<false, Ts...> : public false_type
#include <TypeTraits.hpp>
template<typename ...Ts>
struct ConjunctionImpl<true, Ts...> : public nvcv::detail::Conjunction<Ts...>
#include <TypeTraits.hpp>
template<typename...>
struct Disjunction : public false_type
#include <TypeTraits.hpp>
template<typename T, typename ...Ts>
struct Disjunction<T, Ts...> : public nvcv::detail::DisjunctionImpl<T::value, Ts...>
#include <TypeTraits.hpp>
template<bool, typename...>
struct DisjunctionImpl
#include <TypeTraits.hpp>
template<typename ...Ts>
struct DisjunctionImpl<false, Ts...> : public nvcv::detail::Disjunction<Ts...>
#include <TypeTraits.hpp>
template<typename ...Ts>
struct DisjunctionImpl<true, Ts...> : public true_type
#include <TypeTraits.hpp>
template<class T>
struct DynamicCast
#include <CastsImpl.hpp>
template<class T>
struct DynamicCast<T*>
#include <CastsImpl.hpp>
template<std::size_t... II>
struct IndexSequence
#include <IndexSequence.hpp>
struct InPlaceT
#include <InPlace.hpp>
template<typename TypeExpression>
struct invoke_result : public std::result_of<TypeExpression>
#include <ArrayDataAccess.hpp>
template<typename X>
struct IsCallback : public false_type
#include <Callback.hpp>
template<typename Cpp, typename C, typename Tr, bool SingleUse>
struct IsCallback<Callback<Cpp, C, Tr, SingleUse>> : public true_type
#include <Callback.hpp>
template<typename Callable, typename ...Args>
struct IsInvocable : public decltype(IsInvocableF<Args...>(std::declval<AddPointer_t<Callable>>()))
#include <TypeTraits.hpp>
template<typename R, typename Callable, typename ...Args>
struct IsInvocableR : public decltype(IsInvocableRF<R, Args...>(std::declval<AddPointer_t<Callable>>()))
#include <TypeTraits.hpp>
template<typename X>
struct IsRefWrapper : public decltype(IsRefWrapperF(std::declval<X*>()))
#include <TypeTraits.hpp>
template<typename X>
struct IsStdFunction : public decltype(IsStdFunctionF(std::declval<X*>()))
#include <TypeTraits.hpp>
template<std::size_t N, std::size_t... II>
struct MakeIndexSequenceImpl
#include <IndexSequence.hpp>
template<std::size_t... II>
struct MakeIndexSequenceImpl<0, II...>
#include <IndexSequence.hpp>
template<NVCVResourceType KIND>
class MemAllocatorWithKind : public nvcv::MemAllocator
#include <Allocator.hpp>

Provides a common implementation for different memory allocator wrappers

struct NoTranslation
#include <Callback.hpp>
template<typename HandleType>
struct SharedHandleOps
#include <HandleWrapper.hpp>
template<class T>
struct StaticCast
#include <CastsImpl.hpp>
template<class T>
struct StaticCast<T*>
#include <CastsImpl.hpp>
template<typename ShapeInfo>
class TensorDataAccessStridedImageImpl : public nvcv::detail::TensorDataAccessStridedImpl<ShapeInfo>
#include <TensorDataAccess.hpp>

Provides specialized access methods for strided tensor data representing images.

This class is an extension of TensorDataAccessStridedImpl and offers specific utilities for accessing the data in image tensors using a strided memory layout. It provides methods to retrieve the number of columns, rows, channels, and other image-specific properties. Furthermore, it provides utility methods to compute strides and access specific rows, channels, etc.

Template Parameters:

ShapeInfo – The type that contains shape information for the image tensor.

Subclassed by nvcv::detail::TensorDataAccessStridedImagePlanarImpl< ShapeInfo >

template<typename ShapeInfo>
class TensorDataAccessStridedImagePlanarImpl : public nvcv::detail::TensorDataAccessStridedImageImpl<ShapeInfo>
#include <TensorDataAccess.hpp>

Provides specialized access methods for strided tensor data representing planar images.

This class is an extension of TensorDataAccessStridedImageImpl and offers specific utilities for accessing the data in planar image tensors using a strided memory layout. It provides methods to retrieve the number of planes, compute the stride for the plane dimension, and access specific planes of the image tensor.

Template Parameters:

ShapeInfo – The type that contains shape information for the planar image tensor.

template<typename ShapeInfo, typename LayoutInfo = typename ShapeInfo::LayoutInfo>
class TensorDataAccessStridedImpl
#include <TensorDataAccess.hpp>

Provides access to strided tensor data, allowing for more efficient memory access patterns.

This class offers utilities for accessing the data in a tensor using a strided memory layout. It provides functions to retrieve the number of samples, data type, layout, and shape of the tensor. It also contains utilities for computing strides and accessing specific samples.

Template Parameters:
  • ShapeInfo – The type that contains shape information for the tensor.

  • LayoutInfo – The type that contains layout information for the tensor. By default, it is derived from ShapeInfo.

template<typename LAYOUT_INFO>
class TensorShapeInfoImpl
#include <TensorShapeInfo.hpp>

This class provides detailed information about the shape of a tensor.

The class is templated on the layout information type, which allows it to be adapted to various tensor layout schemes. It provides functions to retrieve the shape, layout, and additional metadata about the tensor.

Template Parameters:

LAYOUT_INFO – The type that contains layout information for the tensor.

template<typename HandleType>
struct UniqueHandleOps
#include <HandleWrapper.hpp>
template<class IFACE>
class WrapHandle : public IFACE
#include <CastsImpl.hpp>
namespace cuda

Typedefs

template<typename T, int64_t... Strides>
using TensorBatchWrap = TensorBatchWrapT<T, int64_t, Strides...>
template<typename T, int32_t... Strides>
using TensorBatchWrap32 = TensorBatchWrapT<T, int32_t, Strides...>
template<typename T, typename StrideType = int64_t>
using TensorBatch1DWrap = TensorBatchWrapT<T, StrideType, sizeof(T)>
template<typename T, typename StrideType = int64_t>
using TensorBatch2DWrap = TensorBatchWrapT<T, StrideType, -1, sizeof(T)>
template<typename T, typename StrideType = int64_t>
using TensorBatch3DWrap = TensorBatchWrapT<T, StrideType, -1, -1, sizeof(T)>
template<typename T, typename StrideType = int64_t>
using TensorBatch4DWrap = TensorBatchWrapT<T, StrideType, -1, -1, -1, sizeof(T)>
template<typename T, typename StrideType = int64_t>
using TensorBatch5DWrap = TensorBatchWrapT<T, StrideType, -1, -1, -1, -1, sizeof(T)>
template<typename T, int N, typename StrideType = int64_t>
using TensorBatchNDWrap = std::conditional_t<N == 1, TensorBatch1DWrap<T, StrideType>, std::conditional_t<N == 2, TensorBatch2DWrap<T, StrideType>, std::conditional_t<N == 3, TensorBatch3DWrap<T, StrideType>, std::conditional_t<N == 4, TensorBatch4DWrap<T, StrideType>, std::conditional_t<N == 5, TensorBatch5DWrap<T, StrideType>, void>>>>>
template<typename T, int64_t... Strides>
using TensorWrap = TensorWrapT<T, int64_t, Strides...>
template<typename T, int32_t... Strides>
using TensorWrap32 = TensorWrapT<T, int32_t, Strides...>
template<typename T, typename StrideType = int64_t>
using Tensor1DWrap = TensorWrapT<T, StrideType, sizeof(T)>
template<typename T, typename StrideType = int64_t>
using Tensor2DWrap = TensorWrapT<T, StrideType, -1, sizeof(T)>
template<typename T, typename StrideType = int64_t>
using Tensor3DWrap = TensorWrapT<T, StrideType, -1, -1, sizeof(T)>
template<typename T, typename StrideType = int64_t>
using Tensor4DWrap = TensorWrapT<T, StrideType, -1, -1, -1, sizeof(T)>
template<typename T, typename StrideType = int64_t>
using Tensor5DWrap = TensorWrapT<T, StrideType, -1, -1, -1, -1, sizeof(T)>
template<typename T, int N, typename StrideType = int64_t>
using TensorNDWrap = std::conditional_t<N == 1, Tensor1DWrap<T, StrideType>, std::conditional_t<N == 2, Tensor2DWrap<T, StrideType>, std::conditional_t<N == 3, Tensor3DWrap<T, StrideType>, std::conditional_t<N == 4, Tensor4DWrap<T, StrideType>, std::conditional_t<N == 5, Tensor5DWrap<T, StrideType>, void>>>>>
template<bool B>
using Require = std::enable_if_t<B>
template<class T, class = Require<HasTypeTraits<T>>>
using BaseType = typename TypeTraits<T>::base_type

Metatype to get the base type of a CUDA compound type.

using DataType = ...;
using ChannelType = nvcv::cuda::BaseType<DataType>;

Note

This is identity for regular C types.

Template Parameters:

T – Type to get the base type from.

template<class T, int C, class = Require<HasTypeTraits<T>>>
using MakeType = detail::MakeType_t<T, C>

Metatype to make a type from a base type and number of components.

When the number of components is zero, it yields the identity (regular C) type, and when it is between 1 and 4 it yields the corresponding CUDA compound type.

using RGB8Type = MakeType<unsigned char, 3>; // yields uchar3

Note

Note that T=char might yield uchar1..4 types when char is equivalent to unsigned char, i.e. CHAR_MIN == 0.

Template Parameters:
  • T – Base type to make the type from.

  • C – Number of components to make the type.

template<class BT, class T, class = Require<HasTypeTraits<BT, T>>>
using ConvertBaseTypeTo = detail::ConvertBaseTypeTo_t<BT, T>

Metatype to convert the base type of a type.

The base type of target type T is replaced to be BT.

using DataType = ...;
using FloatDataType = ConvertBaseTypeTo<float, DataType>; // yields float1..4
Template Parameters:
  • BT – Base type to use in the conversion.

  • T – Target type to convert its base type.

Enums

enum class RoundMode : int

Values:

enumerator NEAREST
enumerator DOWN
enumerator UP
enumerator ZERO
enumerator DEFAULT

Functions

template<typename T, class OP, class = Require<std::is_floating_point_v<T>>> __device__ void AtomicOp (T *address, T val, OP op)

Metafunction to do a generic atomic operation on floating-point types.

Template Parameters:
  • T – Type of the values used in the atomic operation.

  • OP – Operation class that defines the operator call to be used as atomics.

Parameters:
  • address[inout] First value to be used in the atomic operation.

  • val[in] Second value to be used.

  • op[in] Operation to be used.

template<typename T> inline __device__ void AtomicMin (T &a, T b)

Metafunction to do an atomic minimum operation that accepts floating-point types.

Template Parameters:

T – Type of the values used in the atomic operation.

Parameters:
  • a[inout] First value to be used in the atomic operation.

  • b[in] Second value to be used.

template<typename T> inline __device__ void AtomicMax (T &a, T b)

Metafunction to do an atomic maximum operation that accepts floating-point types.

Template Parameters:

T – Type of the values used in the atomic operation.

Parameters:
  • a[inout] First value to be used in the atomic operation.

  • b[in] Second value to be used.
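
A hedged sketch of how these atomics might be used to reduce a global minimum and maximum over a float buffer (the kernel and variable names below are illustrative, not part of the API; globalMin/globalMax are assumed pre-initialized, e.g. to +FLT_MAX and -FLT_MAX):

__global__ void MinMaxKernel(const float *src, int n, float *globalMin, float *globalMax)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= n)
        return;
    nvcv::cuda::AtomicMin(*globalMin, src[idx]); // keeps the smallest value seen so far
    nvcv::cuda::AtomicMax(*globalMax, src[idx]); // keeps the largest value seen so far
}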

template<bool Active = true, typename T> inline constexpr bool __host__ __device__ IsOutside (T c, T s)

Function to check if a given coordinate is outside the range defined by a given size.

Template Parameters:
  • Active – Flag to turn this function active.

  • T – Type of the values given to this function.

Parameters:
  • c[in] Coordinate to check if it is outside the range [0, s).

  • s[in] Size that defines the inside range [0, s).

Returns:

True if the given coordinate is outside the given size.

template<NVCVBorderType B, bool Active = true, typename T> inline constexpr T __host__ __device__ GetIndexWithBorder (T c, T s)

Function to get a border-aware index considering the range defined by given size.

Note

This function does not work for NVCV_BORDER_CONSTANT.

Template Parameters:
  • B – It is a NVCVBorderType indicating the border to be used.

  • Active – Flag to turn this function active.

  • T – Type of the values given to this function.

Parameters:
  • c[in] Coordinate (input index) to put back inside valid range [0, s).

  • s[in] Size that defines the valid range [0, s).
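
A small sketch combining the two functions above to put a possibly-outside coordinate back into a W x H image using the replicate border rule (variable names are illustrative):

int2 coord = ...; // coordinate that may be outside the image
int2 size = {width, height};
if (nvcv::cuda::IsOutside(coord.x, size.x) || nvcv::cuda::IsOutside(coord.y, size.y))
{
    coord.x = nvcv::cuda::GetIndexWithBorder<NVCV_BORDER_REPLICATE>(coord.x, size.x);
    coord.y = nvcv::cuda::GetIndexWithBorder<NVCV_BORDER_REPLICATE>(coord.y, size.y);
}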

template<typename T, NVCVBorderType B, typename StrideType = int64_t, class = Require<HasTypeTraits<T>>>
__host__ auto CreateBorderWrapNHW(const TensorDataStridedCuda &tensor, T borderValue = {})

Factory function to create an NHW border wrap given a tensor data.

The output BorderWrap wraps an NHW 3D tensor, allowing access to data per batch (N), per row (H) and per column (W) of the input tensor, border-aware in rows (or height H) and columns (or width W). The input tensor data must have either NHWC or HWC layout, where the channel C is inside the given template type T, e.g. T=uchar4 for RGBA8. The active dimensions are H (second) and W (third).

Template Parameters:
  • T – Type of the values to be accessed in the border wrap.

  • B – Border extension to be used when accessing H and W, one of NVCVBorderType

  • StrideType – Type of the strides used in the underlying TensorWrap.

Parameters:
  • tensor[in] Reference to the tensor that will be wrapped.

  • borderValue[in] Border value to be used when accessing outside elements in constant border type

Returns:

Border wrap useful to access tensor data border aware in H and W in CUDA kernels.
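
A minimal creation sketch, assuming an RGBA8 tensor in NHWC (or HWC) layout already exported as strided CUDA data (the kernel usage comment is an assumption, not part of this listing):

const nvcv::TensorDataStridedCuda &tensorData = ...;
auto src = nvcv::cuda::CreateBorderWrapNHW<const uchar4, NVCV_BORDER_CONSTANT>(tensorData, uchar4{0, 0, 0, 255});
// Pass src by value to a CUDA kernel and access it border-aware, e.g. via operator[] with an int3 coordinate.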

template<typename T, NVCVBorderType B, typename StrideType = int64_t, class = Require<HasTypeTraits<T>>>
__host__ auto CreateBorderWrapNHWC(const TensorDataStridedCuda &tensor, T borderValue = {})

Factory function to create an NHWC border wrap given a tensor data.

The output BorderWrap wraps an NHWC 4D tensor, allowing access to data per batch (N), per row (H), per column (W) and per channel (C) of the input tensor, border-aware in rows (or height H) and columns (or width W). The input tensor data must have either NHWC or HWC layout, where the channel C is of type T, e.g. T=uchar for each channel of either RGB8 or RGBA8. The active dimensions are H (second) and W (third).

Template Parameters:
  • T – Type of the values to be accessed in the border wrap.

  • B – Border extension to be used when accessing H and W, one of NVCVBorderType

  • StrideType – Type of the strides used in the underlying TensorWrap.

Parameters:
  • tensor[in] Reference to the tensor that will be wrapped.

  • borderValue[in] Border value to be used when accessing outside elements in constant border type

Returns:

Border wrap useful to access tensor data border aware in H and W in CUDA kernels.

template<int N, typename T, class = Require<HasEnoughComponents<T, N>>> __host__ __device__ auto DropCast (T v)

Metafunction to drop components of a compound value.

The template parameter N defines the number of components to keep from the CUDA compound type T passed as function argument v. This is done by dropping the components after the first N from v. For instance, a uint3 can have its z component dropped by passing it as function argument to DropCast and the number 2 as template argument (see example below). The type T is not needed as it is inferred from the argument v. It is a requirement of the DropCast function that the type T has at least N components.

uint2 dstIdx = DropCast<2>(blockIdx * blockDim + threadIdx);
Template Parameters:

N – Number of components to return.

Parameters:

v[in] Value to drop components from.

Returns:

The compound value with N components dropping the last, extra components.

template<NVCVInterpolationType I, int Position = 1, typename IndexType = int64_t> inline constexpr IndexType __host__ __device__ GetIndexForInterpolation (float c)

Function to get an integer index from a float coordinate for interpolation purpose.

Template Parameters:
  • I – Interpolation type, one of NVCVInterpolationType.

  • Position – Interpolation position, 1 for the first index and 2 for the second index.

  • IndexType – Type of the returned value

Parameters:

c[in] Coordinate in floating-point to convert to index in integer.

Returns:

Index in integer suitable for interpolation computation.

inline void __host__ __device__ GetCubicCoeffs (float delta, float &w0, float &w1, float &w2, float &w3)
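
A hedged sketch of computing the two neighboring integer indices for a float coordinate and the cubic weights for its fractional part (values in the comments are illustrative):

float fc = 2.3f;
auto i0 = nvcv::cuda::GetIndexForInterpolation<NVCV_INTERP_LINEAR, 1>(fc); // first index, e.g. 2
auto i1 = nvcv::cuda::GetIndexForInterpolation<NVCV_INTERP_LINEAR, 2>(fc); // second index, e.g. 3
float w0, w1, w2, w3;
nvcv::cuda::GetCubicCoeffs(fc - floorf(fc), w0, w1, w2, w3); // cubic weights for the fractional part (0.3 here)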
template<typename T, NVCVBorderType B, NVCVInterpolationType I, typename StrideType = int64_t, class = Require<HasTypeTraits<T>>>
__host__ auto CreateInterpolationWrapNHW(const TensorDataStridedCuda &tensor, T borderValue = {}, float scaleX = {}, float scaleY = {})

Factory function to create an NHW interpolation wrap given a tensor data.

The output InterpolationWrap wraps an NHW 3D tensor, allowing access to data per batch (N), per row (H) and per column (W) of the input tensor, interpolation- and border-aware in rows (or height H) and columns (or width W). The input tensor data must have either NHWC or HWC layout, where the channel C is inside the given template type T, e.g. T=uchar4 for RGBA8. The active dimensions are H (second) and W (third).

Template Parameters:
  • T – Type of the values to be accessed in the interpolation wrap.

  • B – Border extension to be used when accessing H and W, one of NVCVBorderType

  • I – Interpolation to be used when accessing H and W, one of NVCVInterpolationType

  • StrideType – Stride type used when accessing underlying tensor data

Parameters:
  • tensor[in] Reference to the tensor that will be wrapped.

  • borderValue[in] Border value to be used when accessing outside elements in constant border type

  • scaleX[in] Scale X value to be used when interpolating elements with area type

  • scaleY[in] Scale Y value to be used when interpolating elements with area type

Returns:

Interpolation wrap useful to access tensor data interpolation-border aware in H and W in CUDA kernels.
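
A minimal creation sketch, assuming an RGBA8 tensor in NHWC (or HWC) layout with linear interpolation and replicate border (the access comment is an assumption about the coordinate convention, not part of this listing):

const nvcv::TensorDataStridedCuda &tensorData = ...;
auto src = nvcv::cuda::CreateInterpolationWrapNHW<const uchar4, NVCV_BORDER_REPLICATE, NVCV_INTERP_LINEAR>(tensorData);
// Inside a kernel, indexing src with a floating-point (x, y) coordinate and a sample index
// yields the linearly interpolated value at that position.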

template<typename T, NVCVBorderType B, NVCVInterpolationType I, typename StrideType = int64_t, class = Require<HasTypeTraits<T>>>
__host__ auto CreateInterpolationWrapNHWC(const TensorDataStridedCuda &tensor, T borderValue = {}, float scaleX = {}, float scaleY = {})

Factory function to create an NHWC interpolation wrap given a tensor data.

The output InterpolationWrap wraps an NHWC 4D tensor, allowing access to data per batch (N), per row (H), per column (W) and per channel (C) of the input tensor, interpolation- and border-aware in rows (or height H) and columns (or width W). The input tensor data must have either NHWC or HWC layout, where the channel C is of type T, e.g. T=uchar for each channel of either RGB8 or RGBA8. The active dimensions are H (second) and W (third).

Template Parameters:
  • T – Type of the values to be accessed in the interpolation wrap.

  • B – Border extension to be used when accessing H and W, one of NVCVBorderType

  • I – Interpolation to be used when accessing H and W, one of NVCVInterpolationType

  • StrideType – Stride type used when accessing underlying tensor data

Parameters:
  • tensor[in] Reference to the tensor that will be wrapped.

  • borderValue[in] Border value to be used when accessing outside elements in constant border type

  • scaleX[in] Scale X value to be used when interpolating elements with area type

  • scaleY[in] Scale Y value to be used when interpolating elements with area type

Returns:

Interpolation wrap useful to access tensor data interpolation-border aware in H and W in CUDA kernels.

template<typename T, typename U, class = nvcv::cuda::Require<nvcv::cuda::detail::IsSameCompound<T, U>>> inline __host__ __device__ auto dot (T a, U b)
template<RoundMode RM, typename T, typename U, class = Require<(!std::is_same_v<T, U>)&&((NumComponents<T> == NumComponents<U>)                                                   || (NumComponents<T> == 0 && HasTypeTraits<U>))>> inline __host__ __device__ auto round (U u)

Metafunction to round all elements of the input.

This function rounds all elements of the input and returns the result with the same type as the input. Optionally, the base type of the result may be specified by the template argument type T. For instance, a float4 can have its 4 elements rounded into a float4 result, or to a different result type, such as T=int or T=int4, where the result will be int4 with the rounded results (see example below). Also optionally, the round mode RM can be specified, as one of RoundMode, e.g. RoundMode::DOWN. It is a requirement of round that the input source type has type traits and the optional result type T is either a regular C type or has the same number of components as the input type.

using FloatType = MakeType<float, 4>;
FloatType res = ...;
FloatType float_rounded = round(res);
ConvertBaseTypeTo<int, FloatType> int_rounded = round<int>(res);
Template Parameters:
  • RM – Optional round mode to be used, cf. RoundMode.

  • U – Type of the source value (with 1 to 4 elements) passed as (and inferred from) argument u.

  • T – Optional type that defines the result of the round.

Parameters:

u[in] Source value to round all elements with its same type or T.

Returns:

The value with all elements rounded.

template<typename T, typename U, class = Require<(!std::is_same_v<T, U>)&&((NumComponents<T> == NumComponents<U>)                                                   || (NumComponents<T> == 0 && HasTypeTraits<U>))>> inline __host__ __device__ auto round (U u)

Overload of round function.

It specifies the target round type T and uses the default round mode RoundMode::DEFAULT.

template<RoundMode RM, typename U> inline __host__ __device__ auto round (U u)

Overload of round function.

It does not specify target round type T, using input source type U instead.

template<typename U> inline __host__ __device__ auto round (U u)

Overload of round function.

It does not specify target round type T, using input source type U instead, and uses default round mode RoundMode::DEFAULT.

template<typename U, class = Require<HasTypeTraits<U>>> inline __host__ __device__ U min (U a, U b)

Metafunction to compute the minimum of two inputs per element.

This function finds the minimum of two inputs per element and returns the result with the same type as the input. For instance, two int4 inputs {1, 2, 3, 4} and {4, 3, 2, 1} yield the minimum {1, 2, 2, 1} as int4 as well (see example below). It is a requirement of min that the input source type has type traits.

using IntType = MakeType<int, 4>;
IntType a = {1, 2, 3, 4}, b = {4, 3, 2, 1};
IntType ab_min = min(a, b); // = {1, 2, 2, 1}
Template Parameters:

U – Type of the two source arguments and the return type.

Parameters:

a, b[in] Input values to compute \( \min(x_a, x_b) \) where \( x_a \) ( \( x_b \)) is each element of \( a \) ( \( b \)).

Returns:

The return value with one minimum per element.

template<typename U, class = Require<HasTypeTraits<U>>> inline __host__ __device__ U max (U a, U b)

Metafunction to compute the maximum of two inputs per element.

This function finds the maximum of two inputs per element and returns the result with the same type as the input. For instance, two int4 inputs {1, 2, 3, 4} and {4, 3, 2, 1} yield the maximum {4, 3, 3, 4} as int4 as well (see example below). It is a requirement of max that the input source type has type traits.

using IntType = MakeType<int, 4>;
IntType a = {1, 2, 3, 4}, b = {4, 3, 2, 1};
IntType ab_max = max(a, b); // = {4, 3, 3, 4}
Template Parameters:

U – Type of the two source arguments and the return type.

Parameters:

a, b[in] Input values to compute \( \max(x_a, x_b) \) where \( x_a \) ( \( x_b \)) is each element of \( a \) ( \( b \)).

Returns:

The return value with maximums per element.

template<typename U, typename S, class = Require<(NumComponents<U> == NumComponents<S>) || (HasTypeTraits<U> && NumComponents<S> == 0)>> inline __host__ __device__ U pow (U x, S y)

Metafunction to compute the power of all elements of the input.

This function computes the power of all elements of the input x and returns the result with the same type as the input. It is a requirement of pow that the input x has the same number of components as the power y, or y is a scalar (and the type of x has type traits).

Template Parameters:
  • U – Type of the source argument x and the return type.

  • S – Type of the source argument y power (use a regular C type for scalar).

Parameters:
  • x[in] Input value to compute \( x^y \).

  • y[in] Input power to compute \( x^y \).

Returns:

The return value with all elements as the result of the power.
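
A small usage sketch in the style of the examples above (a scalar power applied to every element):

using FloatType = MakeType<float, 3>;
FloatType x = {2.f, 3.f, 4.f};
FloatType x2 = pow(x, 2.f); // = {4, 9, 16}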

template<typename U, class = Require<HasTypeTraits<U>>> inline __host__ __device__ U exp (U u)

Metafunction to compute the natural (base e) exponential of all elements of the input.

This function computes the natural (base e) exponential of all elements of the input and returns the result with the same type as the input. It is a requirement of exp that the input source type has type traits.

Template Parameters:

U – Type of the source argument and the return type.

Parameters:

u[in] Input value to compute \( e^x \) where \( x \) is each element of \( u \).

Returns:

The return value with all elements as the result of the natural (base e) exponential.

template<typename U, class = Require<HasTypeTraits<U>>> inline __host__ __device__ U sqrt (U u)

Metafunction to compute the square root of all elements of the input.

This function computes the square root of all elements of the input and returns the result with the same type as the input. It is a requirement of sqrt that the input source type has type traits.

Template Parameters:

U – Type of the source argument and the return type.

Parameters:

u[in] Input value to compute \( \sqrt{x} \) where \( x \) is each element of \( u \).

Returns:

The return value with all elements as the result of the square root.
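
A small usage sketch in the style of the examples above, applying both functions element-wise:

using FloatType = MakeType<float, 4>;
FloatType v = {0.f, 1.f, 4.f, 9.f};
FloatType e = exp(v);  // = {1, e^1, e^4, e^9}
FloatType s = sqrt(v); // = {0, 1, 2, 3}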

template<typename U, class = Require<HasTypeTraits<U>>> inline __host__ __device__ U abs (U u)

Metafunction to compute the absolute value of all elements of the input.

This function computes the absolute value of all elements of the input and returns the result with the same type as the input. For instance, an int4 input {-1, 2, -3, 4} yields the absolute {1, 2, 3, 4} as int4 as well (see example below). It is a requirement of abs that the input source type has type traits.

using IntType = MakeType<int, 4>;
IntType a = {-1, 2, -3, 4};
IntType a_abs = abs(a); // = {1, 2, 3, 4}
Template Parameters:

U – Type of the source argument and the return type.

Parameters:

u[in] Input value to compute \( |x| \) where \( x \) is each element of \( u \).

Returns:

The return value with the absolute of all elements.

template<typename U, typename S, class = Require<(NumComponents<U> == NumComponents<S>) || (HasTypeTraits<U> && NumComponents<S> == 0)>> inline __host__ __device__ U clamp (U u, S lo, S hi)

Metafunction to clamp all elements of the input.

This function clamps all elements of the input u between lo and hi and returns the result with the same type as the input. It is a requirement of clamp that the input u has the same number of components as the range values lo and hi, or both are scalars (and the type of u has type traits).

Template Parameters:
  • U – Type of the source argument u and the return type.

  • S – Type of the source argument lo and hi (use a regular C type for scalar).

Parameters:
  • u[in] Input value to clamp.

  • lo[in] Input clamp range low value.

  • hi[in] Input clamp range high value.

Returns:

The return value with all elements clamped.
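
A small usage sketch in the style of the examples above (a scalar range applied to every element):

using IntType = MakeType<int, 4>;
IntType u = {-5, 0, 128, 300};
IntType c = clamp(u, 0, 255); // = {0, 0, 128, 255}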

template<typename T, typename U, class = Require<HasTypeTraits<T, U> && !IsCompound<T>>> __host__ __device__ auto RangeCast (U u)

Metafunction to range cast (scale) all elements to a target range.

This function range casts (that is, scales) all elements to the range defined by the template argument type T. For instance, a float4 with all elements between 0 and 1 can be cast to a uchar4, scaling each element to be between 0 and 255 (see example below). It is a requirement of RangeCast that both types have type traits and type T must be a regular C type. Several examples of possible target ranges given a source range, depending on the limits of regular C types, are shown in the table below:

Source type U    | Target type T    | Source range      | Target range
signed char      | float            | [-128, 127]       | [-1, 1]
float            | unsigned char    | [0, 1]            | [0, 255]
short            | unsigned int     | [-32768, 32767]   | [0, 4294967295]
double           | int              | [-1, 1]           | [-2147483648, 2147483647]
unsigned short   | double           | [0, 65535]        | [0, 1]

using DataType = MakeType<uchar, 4>;
using FloatDataType = ConvertBaseTypeTo<float, DataType>;
FloatDataType res = ...; // res component values are in [0, 1]
DataType pix = RangeCast<BaseType<DataType>>(res); // pix are in [0, 255]
Template Parameters:
  • T – Type that defines the target range to cast.

  • U – Type of the source value (with 1 to 4 elements) passed as argument.

Parameters:

u[in] Source value to cast all elements to range of type T.

Returns:

The value with all elements scaled.

template<typename T, typename U, class = Require<(NumComponents<T> == NumComponents<U>) || (NumComponents<T> == 0 && HasTypeTraits<U>)>> __host__ __device__ auto SaturateCast (U u)

Metafunction to saturate cast all elements to a target type.

This function saturate casts (clamping with potential rounding) all elements to the range defined by the template argument type T. For instance, a float4 with any values (they can be below 0 and above 255) can be cast to a uchar4, rounding then saturating each value to be between 0 and 255 (see example below). It is a requirement of SaturateCast that both types have the same number of components or T is a regular C type.

using DataType = MakeType<uchar, 4>;
using FloatDataType = ConvertBaseTypeTo<float, DataType>;
FloatDataType res = ...; // res component values are in [0, 1]
DataType pix = SaturateCast<DataType>(res); // pix are in [0, 255]
Template Parameters:
  • T – Type that defines the target range to cast.

  • U – Type of the source value (with 1 to 4 elements) passed as argument.

Parameters:

u[in] Source value to cast all elements to range of base type of T

Returns:

The value with all elements clamped and potentially rounded.

template<typename T, typename U, class = Require<HasTypeTraits<T, U> && !IsCompound<T>>> __host__ __device__ auto StaticCast (U u)

Metafunction to static cast all values of a compound to a target type.

The template parameter T defines the base type (regular C type) to which all components of the CUDA compound type U, passed as function argument u, are cast. The static cast return type has the base type T and the same number of components as the compound type U. For instance, a uint3 can be cast to an int3 by passing it as function argument of StaticCast and the type int as template argument (see example below). The type U is not needed as it is inferred from the argument u. It is a requirement of the StaticCast function that the type T is a regular C type and the type U is a CUDA compound type.

int3 idx = StaticCast<int>(blockIdx * blockDim + threadIdx);
Template Parameters:

T – Type to do static cast on each component of u.

Parameters:

u[in] Compound value to static cast each of its components to target type T.

Returns:

The compound value with all components static casted to type T.

template<typename T, typename StrideType = int64_t, class = Require<HasTypeTraits<T> && IsStrideType<StrideType>>>
__host__ auto CreateTensorWrapNHW(const TensorDataStridedCuda &tensor)

Factory function to create an NHW tensor wrap given a tensor data.

The output TensorWrap is an NHW 3D tensor wrap, allowing access to data per batch (N), per row (H) and per column (W) of the input tensor. The input tensor data must have either NHWC or HWC layout, where the channel C is inside T, e.g. T=uchar3 for RGB8.

Template Parameters:
  • T – Type of the values to be accessed in the tensor wrap.

  • StrideType – Type of the stride used in the tensor wrap.

Parameters:

tensor[in] Reference to the tensor that will be wrapped.

Returns:

Tensor wrap useful to access tensor data in CUDA kernels.
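
A minimal creation sketch, assuming an RGB8 tensor in NHWC (or HWC) layout already exported as strided CUDA data (the access comment is an assumption about the dimension order, not part of this listing):

const nvcv::TensorDataStridedCuda &tensorData = ...;
auto wrap = nvcv::cuda::CreateTensorWrapNHW<uchar3>(tensorData);
// Pass wrap by value to a CUDA kernel; *wrap.ptr(n, y, x) would access the pixel at
// column x of row y of sample n.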

template<typename T, typename StrideType = int64_t, class = Require<HasTypeTraits<T> && IsStrideType<StrideType>>>
__host__ auto CreateTensorWrapNHWC(const TensorDataStridedCuda &tensor)

Factory function to create an NHWC tensor wrap given a tensor data.

The output TensorWrap is an NHWC 4D tensor wrap, allowing access to data per batch (N), per row (H), per column (W) and per channel (C) of the input tensor. The input tensor data must have either NHWC or HWC layout, where the channel C is of type T, e.g. T=uchar for each channel of either RGB8 or RGBA8.

Template Parameters:
  • T – Type of the values to be accessed in the tensor wrap.

  • StrideType – Type of the stride used in the tensor wrap.

Parameters:

tensor[in] Reference to the tensor that will be wrapped.

Returns:

Tensor wrap useful to access tensor data in CUDA kernels.

template<typename T, typename StrideType = int64_t, class = Require<HasTypeTraits<T> && IsStrideType<StrideType>>>
__host__ auto CreateTensorWrapNCHW(const TensorDataStridedCuda &tensor)

Factory function to create an NCHW tensor wrap given a tensor data.

The output TensorWrap is an NCHW 4D tensor wrap, allowing access to data per batch (N), per channel (C), per row (H) and per column (W) of the input tensor. The input tensor data must have either NCHW or CHW layout, where the channel C is of type T, e.g. T=uchar for each channel of either RGB8 or RGBA8.

Template Parameters:
  • T – Type of the values to be accessed in the tensor wrap.

  • StrideType – Type of the stride used in the tensor wrap.

Parameters:

tensor[in] Reference to the tensor that will be wrapped.

Returns:

Tensor wrap useful to access tensor data in CUDA kernels.

template<typename T, typename RT = detail::CopyConstness_t<T, std::conditional_t<IsCompound<T>, BaseType<T>, T>>, class = Require<HasTypeTraits<T>>> __host__ __device__ RT & GetElement (T &v, int eidx)

Metafunction to get an element by reference from a given value reference.

The value may be of CUDA compound type with 1 to 4 elements, where the corresponding element index is 0 to 3, and the return is a reference to the element with the base type of the compound type, copying the constness (that is the return reference is constant if the input value is constant). The value may be a regular C type, in which case the element index is ignored and the identity is returned. It is a requirement of the GetElement function that the type T has type traits.

using PixelRGB8Type = MakeType<unsigned char, 3>;
PixelRGB8Type pix = ...;
auto green = GetElement(pix, 1); // yields unsigned char
Template Parameters:

T – Type of the value to get the element from.

Parameters:
  • v[in] Value of type T to get an element from.

  • eidx[in] Element index in [0, 3] inside the compound value to get the reference from. This element index is ignored in case the value is not of a CUDA compound type.

Returns:

The reference of the value’s element.

template<int EIDX, typename T, typename RT = detail::CopyConstness_t<T, std::conditional_t<IsCompound<T>, BaseType<T>, T>>, class = Require<HasTypeTraits<T>>> __host__ __device__ RT & GetElement (T &v)
template<typename T, class = Require<HasTypeTraits<T>>> __host__ __device__ T SetAll (BaseType< T > x)

Metafunction to set all elements to the same value.

Set all elements to the value x passed as argument. For instance, an int3 can have all its elements set to zero by calling SetAll and passing int3 as template argument and zero as argument (see example below). Another way to set all elements to a value is by using the type of the argument as base type and passing the number of channels of the return type (see example below).

auto idx = SetAll<int3>(0); // sets to zero all elements of an int3 index idx: {0, 0, 0}
unsigned char ch = 127;
auto pix = SetAll<4>(ch); // sets all elements of an uchar4 pixel pix: {127, 127, 127, 127}
Template Parameters:
  • T – Type to be returned with all elements set to the given value x.

  • N – Number of components as a second option instead of passing the type T.

Parameters:

x[in] Value to set all elements to.

Returns:

The object of type T with all elements set to x.

template<int N, typename BT, typename RT = MakeType<BT, N>, class = Require<HasTypeTraits<BT>>> __host__ __device__ RT SetAll (BT x)
template<class T, class = Require<HasTypeTraits<T>>> const __host__ char * GetTypeName ()

Metafunction to get the name of a type.

Unfortunately typeid().name() in C/C++ typeinfo yields different names depending on the platform. This function returns the name of the type resembling the CUDA compound type, that may be useful for debug printing.

std::cout << GetTypeName<DataType>();
Template Parameters:

T – Type to get the name from.

Returns:

String with the name of the type.

Variables

template<typename ...Ts>
constexpr bool HasTypeTraits = (detail::HasTypeTraits_t<Ts>::value && ...)
template<class T, class = Require<HasTypeTraits<T>>>
constexpr bool IsCompound = TypeTraits<T>::components >= 1
template<typename T, int N, class = Require<HasTypeTraits<T>>>
constexpr bool HasEnoughComponents = N <= TypeTraits<T>::components
template<typename T>
constexpr bool IsStrideType = std::is_same_v<T, int32_t> || std::is_same_v<T, int64_t>
template<typename T, typename StrideType>
constexpr bool IsIndexType = std::is_integral_v<T> && (TypeTraits<T>::max <= TypeTraits<StrideType>::max)
template<class T, class = Require<HasTypeTraits<T>>>
constexpr int NumComponents = TypeTraits<T>::components

Metavariable to get the number of components of a type.

using DataType = ...;
int nc = nvcv::cuda::NumComponents<DataType>;

Note

This is zero for regular C types.

Template Parameters:

T – Type to get the number of components from.

template<class T, class = Require<HasTypeTraits<T>>>
constexpr int NumElements = TypeTraits<T>::elements

Metavariable to get the number of elements of a type.

using DataType = ...;
for (int e = 0; e < nvcv::cuda::NumElements<DataType>; ++e)
    // ...

Note

This is one for regular C types and one to four for CUDA compound types.

Template Parameters:

T – Type to get the number of elements from.

template<typename T, class = Require<HasTypeTraits<T>>>
constexpr BaseType<T> Lowest = std::is_floating_point_v<BaseType<T>> ? -TypeTraits<T>::max : TypeTraits<T>::min
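
For instance (assumed values following the formula above):

float lowestFloat = nvcv::cuda::Lowest<float>; // == -FLT_MAX for floating-point base types
int lowestInt = nvcv::cuda::Lowest<int>;       // == INT_MIN for integral base types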
template<typename ValueType>
class ArrayWrap
#include <ArrayWrap.hpp>
template<typename T, NVCVBorderType B>
class BorderVarShapeWrap : public nvcv::cuda::detail::BorderIWImpl<T, B>
#include <BorderVarShapeWrap.hpp>

Border var-shape wrapper class used to wrap an ImageBatchVarShapeWrap adding border handling to it.

This class wraps an ImageBatchVarShapeWrap to add border handling functionality. It provides the methods ptr and operator[] to do the same semantic access (pointer or reference) in the wrapped ImageBatchVarShapeWrap but border aware on width and height as active dimensions.

using PixelType = ...;
using ImageBatchWrap = ImageBatchVarShapeWrap<PixelType>;
using BorderVarShape = BorderVarShapeWrap<PixelType, NVCV_BORDER_REPLICATE>;
ImageBatchWrap dst(...);
ImageBatchWrap srcImageBatch(...);
BorderVarShape src(srcImageBatch);
dim3 grid{...}, block{...};
int2 fillBorderSize{2, 2};
FillBorder<<<grid, block>>>(dst, src, src.numImages(), fillBorderSize);

template<typename T, NVCVBorderType B>
__global__ void FillBorder(ImageBatchVarShapeWrap<T> dst, BorderVarShapeWrap<T, B> src, int ns, int2 bs)
{
    int3 dstCoord = StaticCast<int>(blockIdx * blockDim + threadIdx);
    if (dstCoord.x >= dst.width(dstCoord.z) || dstCoord.y >= dst.height(dstCoord.z) || dstCoord.z >= ns)
        return;
    int3 srcCoord = {dstCoord.x - bs.x, dstCoord.y - bs.y, dstCoord.z};
    dst[dstCoord] = src[srcCoord];
}
Template Parameters:
  • T – Type (it can be const) of each element inside the image batch var-shape wrapper.

  • B – It is a NVCVBorderType indicating the border to be used.

template<typename T>
class BorderVarShapeWrap<T, NVCV_BORDER_CONSTANT> : public nvcv::cuda::detail::BorderIWImpl<T, NVCV_BORDER_CONSTANT>
#include <BorderVarShapeWrap.hpp>

Border var-shape wrapper class specialized for NVCV_BORDER_CONSTANT.

Template Parameters:

T – Type (it can be const) of each element inside the image batch var-shape wrapper.

template<typename T, NVCVBorderType B>
class BorderVarShapeWrapNHWC : public nvcv::cuda::detail::BorderIWImpl<T, B>
#include <BorderVarShapeWrap.hpp>
template<typename T>
class BorderVarShapeWrapNHWC<T, NVCV_BORDER_CONSTANT> : public nvcv::cuda::detail::BorderIWImpl<T, NVCV_BORDER_CONSTANT>
#include <BorderVarShapeWrap.hpp>

Border var-shape wrapper class specialized for NVCV_BORDER_CONSTANT.

Template Parameters:

T – Type (it can be const) of each element inside the image batch var-shape wrapper.

template<class TW, NVCVBorderType B, bool... ActiveDimensions>
class BorderWrap : public nvcv::cuda::detail::BorderWrapImpl<TW, B, ActiveDimensions...>
#include <BorderWrap.hpp>

Border wrapper class used to wrap a TensorWrap adding border handling to it.

This class wraps a TensorWrap to add border handling functionality. It provides the methods ptr and operator[] to do the same semantic access, pointer or reference respectively, in the wrapped TensorWrap but border aware. It also provides a compile-time set of boolean flags to inform active border-aware dimensions. Active dimensions participate in border handling, storing the corresponding dimension shape. Inactive dimensions are not checked and their dimension shape is not stored, thus a core dump (or segmentation fault) might happen when accessing outside the boundaries of inactive dimensions.

using DataType = ...;
using TensorWrap2D = TensorWrap<DataType, -1, -1>;
using BorderWrap2D = BorderWrap<TensorWrap2D, NVCV_BORDER_REFLECT, true, true>;
TensorWrap2D tensorWrap(...);
int2 tensorShape = ...;
BorderWrap2D borderAwareTensor(tensorWrap, tensorShape.x, tensorShape.y);
// Now use borderAwareTensor instead of tensorWrap to access elements inside or outside the tensor,
// outside elements use reflect border, that is the outside index is reflected back inside the tensor

See also

NVCV_CPP_CUDATOOLS_BORDERWRAPS

Template Parameters:
  • TW – It is a TensorWrap class with any dimension and type.

  • B – It is a NVCVBorderType indicating the border to be used.

  • ActiveDimensions – Flags to inform active (true) or inactive (false) dimensions.

template<class TW, bool... ActiveDimensions>
class BorderWrap<TW, NVCV_BORDER_CONSTANT, ActiveDimensions...> : public nvcv::cuda::detail::BorderWrapImpl<TW, NVCV_BORDER_CONSTANT, ActiveDimensions...>
#include <BorderWrap.hpp>

Border wrapper class specialized for NVCV_BORDER_CONSTANT.

Template Parameters:
  • TW – It is a TensorWrap class with any dimension and type.

  • ActiveDimensions – Flags to inform active (true) or inactive (false) dimensions.

template<typename T, int N>
class FullTensorWrap
#include <FullTensorWrap.hpp>

FullTensorWrap class is a non-owning wrap of an N-D tensor used for easy access of its elements in CUDA device code.

FullTensorWrap is a wrapper of a multi-dimensional tensor that holds all information related to it, i.e. N strides and N shapes, where N is its number of dimensions.

This wrapper class is specialized for non-constant value type (see the const T specialization below).

Template Parameters:
  • T – Type (it can be const) of each element (or value) inside the tensor wrapper.

  • N – Number of dimensions.

template<typename T, int N>
class FullTensorWrap<const T, N>
#include <FullTensorWrap.hpp>
template<typename T>
class ImageBatchVarShapeWrap
#include <ImageBatchVarShapeWrap.hpp>

Image batch var-shape wrapper class to wrap ImageBatchVarShapeDataStridedCuda.

ImageBatchVarShapeWrap is a wrapper of an image batch (or a list of images) of variable shapes. The template parameter T is the type of each element inside the wrapper, and it can be a compound type to represent a pixel type, e.g. uchar4 for RGBA images.

cudaStream_t stream;
cudaStreamCreate(&stream);
nvcv::ImageBatchVarShape imageBatch(samples);
auto *imageBatchData = imageBatch.exportData<nvcv::ImageBatchVarShapeDataStridedCuda>(stream);
nvcv::cuda::ImageBatchVarShapeWrap<uchar4> wrap(*imageBatchData);
// Now wrap can be used in device code to access elements of the image batch via operator[] or the ptr method.

Image batch var-shape wrapper class to wrap ImageBatchVarShapeDataStridedCuda.

This class is specialized for non-constant value type.

Template Parameters:
  • T – Type (it can be const) of each element inside the image batch var-shape wrapper.

  • T – Type (non-const) of each element inside the image batch var-shape wrapper.

Subclassed by nvcv::cuda::ImageBatchVarShapeWrapNHWC< T >

template<typename T>
class ImageBatchVarShapeWrap<const T>
#include <ImageBatchVarShapeWrap.hpp>
template<typename T>
class ImageBatchVarShapeWrapNHWC : private nvcv::cuda::ImageBatchVarShapeWrap<T>
#include <ImageBatchVarShapeWrap.hpp>

Image batch var-shape wrapper NHWC class to wrap ImageBatchVarShapeDataStridedCuda and number of channels.

This class handles the number of channels as a separate run-time parameter instead of being built into T. It assumes interleaved channels, i.e. they appear as a packed sequence in the last dimension (thus NHWC), and that each image in the batch has a single plane.

Note

The class ImageBatchVarShapeWrap can be used with a compound type as its template parameter T, where the number of compound elements yields the number of channels.

Template Parameters:

T – Type (it can be const) of each element inside this wrapper.
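
A minimal construction sketch, assuming (for illustration) that the NHWC wrapper takes the number of channels as an extra constructor argument next to the exported batch data from the example above:

int numChannels = 3; // e.g. interleaved RGB
nvcv::cuda::ImageBatchVarShapeWrapNHWC<unsigned char> wrapNHWC(*imageBatchData, numChannels);
// Channels are then addressed explicitly in device code, e.g. via an assumed ptr(sample, y, x, c)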

template<typename T, NVCVBorderType B, NVCVInterpolationType I>
class InterpolationVarShapeWrap : public nvcv::cuda::detail::InterpolationVarShapeWrapImpl<T, B, I>
#include <InterpolationVarShapeWrap.hpp>

Interpolation var-shape wrapper class used to wrap a BorderVarShapeWrap adding interpolation handling to it.

This class wraps a BorderVarShapeWrap to add interpolation handling functionality. It provides operator[] for the same semantic value access as the wrapped BorderVarShapeWrap, but interpolation aware.

See also

NVCV_CPP_CUDATOOLS_INTERPOLATIONVARSHAPEWRAPS

Note

Each interpolation wrap class below is specialized for one interpolation type.

Template Parameters:
  • T – Type (it can be const) of each element inside the border var-shape wrapper.

  • B – It is a NVCVBorderType indicating the border to be used.

  • I – It is a NVCVInterpolationType defining the interpolation type to be used.
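
A minimal construction sketch, under the assumption that this wrapper is built directly from a BorderVarShapeWrap with matching template arguments:

using PixelType = const uchar4;
using BorderVarShape = BorderVarShapeWrap<PixelType, NVCV_BORDER_REFLECT>;
using InterpVarShape = InterpolationVarShapeWrap<PixelType, NVCV_BORDER_REFLECT, NVCV_INTERP_LINEAR>;
BorderVarShape borderWrap(...);
InterpVarShape interpWrap(borderWrap);
// interpWrap is then indexed in device code with floating-point coordinates to get bilinearly
// interpolated values, with reflect border handling outside each image in the batch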

template<typename T, NVCVBorderType B>
class InterpolationVarShapeWrap<T, B, NVCV_INTERP_AREA> : public nvcv::cuda::detail::InterpolationVarShapeWrapImpl<T, B, NVCV_INTERP_AREA>
#include <InterpolationVarShapeWrap.hpp>

Interpolation var-shape wrapper class specialized for NVCV_INTERP_AREA.

Template Parameters:

T – Type (it can be const) of each element inside the border var-shape wrapper.

template<typename T, NVCVBorderType B>
class InterpolationVarShapeWrap<T, B, NVCV_INTERP_CUBIC> : public nvcv::cuda::detail::InterpolationVarShapeWrapImpl<T, B, NVCV_INTERP_CUBIC>
#include <InterpolationVarShapeWrap.hpp>

Interpolation var-shape wrapper class specialized for NVCV_INTERP_CUBIC.

Template Parameters:

T – Type (it can be const) of each element inside the border var-shape wrapper.

template<typename T, NVCVBorderType B>
class InterpolationVarShapeWrap<T, B, NVCV_INTERP_LINEAR> : public nvcv::cuda::detail::InterpolationVarShapeWrapImpl<T, B, NVCV_INTERP_LINEAR>
#include <InterpolationVarShapeWrap.hpp>

Interpolation var-shape wrapper class specialized for NVCV_INTERP_LINEAR.

Template Parameters:

T – Type (it can be const) of each element inside the border var-shape wrapper.

template<typename T, NVCVBorderType B>
class InterpolationVarShapeWrap<T, B, NVCV_INTERP_NEAREST> : public nvcv::cuda::detail::InterpolationVarShapeWrapImpl<T, B, NVCV_INTERP_NEAREST>
#include <InterpolationVarShapeWrap.hpp>

Interpolation var-shape wrapper class specialized for NVCV_INTERP_NEAREST.

Template Parameters:

T – Type (it can be const) of each element inside the border var-shape wrapper.

template<class BW, NVCVInterpolationType I>
class InterpolationWrap : public nvcv::cuda::detail::InterpolationWrapImpl<BW, I>
#include <InterpolationWrap.hpp>

Interpolation wrapper class used to wrap a BorderWrap adding interpolation handling to it.

This class wraps a BorderWrap to add interpolation handling functionality. It provides operator[] for the same semantic value access as the wrapped BorderWrap, but interpolation aware.

using DataType = ...;
using TensorWrap2D = TensorWrap<-1, -1, DataType>;
using BorderWrap2D = BorderWrap<TensorWrap2D, NVCV_BORDER_REFLECT, true, true>;
using InterpWrap2D = InterpolationWrap<BorderWrap2D, NVCV_INTERP_CUBIC>;
TensorWrap2D tensorWrap(...);
BorderWrap2D borderWrap(...);
InterpWrap2D interpolationAwareTensor(borderWrap);
// Now use interpolationAwareTensor instead of borderWrap or tensorWrap to access in-between elements with
// on-the-fly interpolation: in this example, grid-unaligned positions use bi-cubic interpolation, grid-aligned
// positions that fall outside the tensor use reflect border extension, and those inside yield the tensor value itself
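
For instance, a device-side access sketch; the float2 coordinate layout (x first, then y) for this 2D wrap is an assumption for illustration:

float2 coord{7.25f, 12.5f}; // fractional (x, y) position inside the tensor
DataType value = interpolationAwareTensor[coord]; // value interpolated on the fly (bi-cubic here)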

See also

NVCV_CPP_CUDATOOLS_INTERPOLATIONWRAPS

Note

Each interpolation wrap class below is specialized for one interpolation type.

Template Parameters:
  • BW – It is a BorderWrap class with any dimension and type.

  • I – It is a NVCVInterpolationType defining the interpolation type to be used.

template<class BW>
class InterpolationWrap<BW, NVCV_INTERP_AREA> : public nvcv::cuda::detail::InterpolationWrapImpl<BW, NVCV_INTERP_AREA>
#include <InterpolationWrap.hpp>

Interpolation wrapper class specialized for NVCV_INTERP_AREA.

Template Parameters:

BW – It is a BorderWrap class with any dimension and type.

template<class BW>
class InterpolationWrap<BW, NVCV_INTERP_CUBIC> : public nvcv::cuda::detail::InterpolationWrapImpl<BW, NVCV_INTERP_CUBIC>
#include <InterpolationWrap.hpp>

Interpolation wrapper class specialized for NVCV_INTERP_CUBIC.

Template Parameters:

BW – It is a BorderWrap class with any dimension and type.

template<class BW>
class InterpolationWrap<BW, NVCV_INTERP_LINEAR> : public nvcv::cuda::detail::InterpolationWrapImpl<BW, NVCV_INTERP_LINEAR>
#include <InterpolationWrap.hpp>

Interpolation wrapper class specialized for NVCV_INTERP_LINEAR.

Template Parameters:

BW – It is a BorderWrap class with any dimension and type.

template<class BW>
class InterpolationWrap<BW, NVCV_INTERP_NEAREST> : public nvcv::cuda::detail::InterpolationWrapImpl<BW, NVCV_INTERP_NEAREST>
#include <InterpolationWrap.hpp>

Interpolation wrapper class specialized for NVCV_INTERP_NEAREST.

Template Parameters:

BW – It is a BorderWrap class with any dimension and type.

template<typename T, typename StrideT, StrideT... Strides>
class TensorBatchWrapT
#include <TensorBatchWrap.hpp>

TensorBatchWrap class is a non-owning wrap of a batch of N-D tensors used for easy access of its elements in CUDA device.

TensorBatchWrap is a wrapper of a batch of multi-dimensional tensors that can have one or more of its N dimension strides, or pitches, defined either at compile-time or at run-time. Each pitch in Strides represents the offset in bytes as a compile-time template parameter that will be applied from the first (slowest changing) dimension to the last (fastest changing) dimension of the tensor, in that order. Each dimension with run-time pitch is specified as -1 in the Strides template parameter.

Template arguments:

  • T type of the values inside the tensors

  • StrideT type of the stride used in the byte offset calculation

  • Strides sequence of compile- or run-time pitches (-1 indicates run-time)

    • Y compile-time pitches

    • X run-time pitches

    • N dimensions, where N = X + Y

For example, in the code below a wrap is defined for a batch of HWC 3D tensors where each row in H has a run-time row pitch (the -1 template argument), each pixel in W has a compile-time constant pitch equal to the size of the pixel type, and each channel in C has a compile-time constant pitch equal to the size of the channel type.

using DataType = ...;
using ChannelType = BaseType<DataType>;
using TensorBatchWrap = TensorBatchWrap<ChannelType, -1, sizeof(DataType), sizeof(ChannelType)>;
TensorBatch tensorBatch = ...;
TensorBatchWrap tensorBatchWrap(tensorBatch.data());
// Elements may be accessed via operator[] using an int4 argument. They can also be accessed via pointer using
// the ptr method with up to 4 integer arguments, or by accessing each TensorWrap separately with the tensor(...) method.
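
A device-side access sketch for the wrap defined above; the (sample, y, x, c) ordering of ptr and the exact spelling of the per-sample tensor(...) accessor are assumptions for illustration:

__global__ void ZeroFirstChannel(TensorBatchWrap batch, int sample, int y, int x)
{
    *batch.ptr(sample, y, x, 0) = ChannelType{0}; // write through the batch-wide accessor
    auto sampleWrap = batch.tensor(sample);       // or work with a per-sample TensorWrap view
    (void)sampleWrap;
}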

TensorBatch wrapper class specialized for non-constant value type.

Template Parameters:
  • T – Type (it can be const) of each element inside the tensor wrapper.

  • Strides – Each compile-time (use -1 for run-time) pitch in bytes from first to last dimension.

  • T – Type (non-const) of each element inside the tensor batch wrapper.

  • Strides – Each compile-time (use -1 for run-time) pitch in bytes from first to last dimension.

template<typename T, typename StrideT, StrideT... Strides>
class TensorBatchWrapT<const T, StrideT, Strides...>
#include <TensorBatchWrap.hpp>
template<typename T, typename StrideT, StrideT... Strides>
class TensorWrapT
#include <TensorWrap.hpp>

TensorWrap class is a non-owning wrap of a N-D tensor used for easy access of its elements in CUDA device.

TensorWrap is a wrapper of a multi-dimensional tensor that can have one or more of its N dimension strides, or pitches, defined either at compile-time or at run-time. Each pitch in Strides represents the offset in bytes as a compile-time template parameter that will be applied from the first (slowest changing) dimension to the last (fastest changing) dimension of the tensor, in that order. Each dimension with run-time pitch is specified as -1 in the Strides template parameter.

Template arguments:

  • T type of the values inside the tensor

  • StrideT type of the stride used in the byte offset calculation

  • Strides sequence of compile- or run-time pitches (-1 indicates run-time)

    • Y compile-time pitches

    • X run-time pitches

    • N dimensions, where N = X + Y

For example, in the code below a wrap is defined for an NHWC 4D tensor where each sample image in N has a run-time image pitch (the first -1 template argument), each row in H has a run-time row pitch (the second -1), each pixel in W has a compile-time constant pitch equal to the size of the pixel type, and each channel in C has a compile-time constant pitch equal to the size of the channel type.

using DataType = ...;
using ChannelType = BaseType<DataType>;
using TensorWrap = TensorWrap<ChannelType, -1, -1, sizeof(DataType), sizeof(ChannelType)>;
std::byte *imageData = ...;
int imgStride = ...;
int rowStride = ...;
TensorWrap tensorWrap(imageData, imgStride, rowStride);
// Elements may be accessed via operator[] using an int4 argument.  They can also be accessed via pointer using
// the ptr method with up to 4 integer arguments.
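
A device-side access sketch for the wrap defined above; the ptr method is assumed to take one coordinate per dimension, ordered from the slowest (N) to the fastest (C) changing dimension:

__global__ void ZeroFirstChannel(TensorWrap wrap, int n, int y, int x)
{
    *wrap.ptr(n, y, x, 0) = ChannelType{0}; // write element (n, y, x, channel 0)
}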

Tensor wrapper class specialized for non-constant value type.

Template Parameters:
  • T – Type (it can be const) of each element inside the tensor wrapper.

  • Strides – Each compile-time (use -1 for run-time) pitch in bytes from first to last dimension.

  • T – Type (non-const) of each element inside the tensor wrapper.

  • Strides – Each compile-time (use -1 for run-time) pitch in bytes from first to last dimension.

template<typename T, typename StrideT, StrideT... Strides>
class TensorWrapT<const T, StrideT, Strides...>
#include <TensorWrap.hpp>
namespace detail

Typedefs

template<typename T, NVCVBorderType B>
using BorderVarShapeWrapImpl = BorderIWImpl<ImageBatchVarShapeWrap<T>, B>
template<typename T, NVCVBorderType B>
using BorderVarShapeWrapNHWCImpl = BorderIWImpl<ImageBatchVarShapeWrapNHWC<T>, B>
template<class FROM, class TO>
using CopyConstness_t = typename CopyConstness<FROM, TO>::type
template<class T, int C>
using MakeType_t = typename detail::MakeType<T, C>::type
template<class BT, class T>
using ConvertBaseTypeTo_t = typename ConvertBaseTypeTo<BT, T>::type
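
A few illustrative compile-time consequences of these helper aliases, assuming the meaning suggested by their names and by the public traits built on top of them (<type_traits> and the NVCV CUDA tools headers are assumed included):

using namespace nvcv::cuda::detail;

static_assert(std::is_same_v<MakeType_t<int, 3>, int3>);                   // base type + 3 components
static_assert(std::is_same_v<ConvertBaseTypeTo_t<float, uchar4>, float4>); // same rank, new base type
static_assert(std::is_same_v<CopyConstness_t<const int, float>, const float>);
static_assert(std::is_same_v<CopyConstness_t<int, float>, float>);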

Functions

template<typename U> inline __host__ U RoundEvenImpl (U u)
template<typename T, typename U, int RM = FE_TONEAREST> inline __host__ __device__ T RoundImpl (U u)
template<typename U> inline __host__ __device__ U MinImpl (U a, U b)
template<typename U> inline __host__ __device__ U MaxImpl (U a, U b)
template<typename U, typename S> inline __host__ __device__ U PowImpl (U x, S y)
template<typename U> inline __host__ __device__ U ExpImpl (U u)
template<typename U> inline __host__ __device__ U SqrtImpl (U u)
template<typename U> inline __host__ __device__ U AbsImpl (U u)
template<typename U, typename S> inline __host__ __device__ U ClampImpl (U u, S lo, S hi)
template<typename T, typename U> inline __host__ __device__ T RangeCastImpl (U u)
template<typename T, typename U> inline __host__ __device__ T BaseSaturateCastImpl (U u)
template<typename T, typename U> inline __host__ __device__ T SaturateCastImpl (U u)
template<> __host__ __device__ __forceinline__ unsigned char SaturateCastImpl< unsigned char, signed char > (signed char u)
template<> __host__ __device__ __forceinline__ unsigned char SaturateCastImpl< unsigned char, short > (short u)
template<> __host__ __device__ __forceinline__ unsigned char SaturateCastImpl< unsigned char, unsigned short > (unsigned short u)
template<> __host__ __device__ __forceinline__ unsigned char SaturateCastImpl< unsigned char, int > (int u)
template<> __host__ __device__ __forceinline__ unsigned char SaturateCastImpl< unsigned char, unsigned int > (unsigned int u)
template<> __host__ __device__ __forceinline__ unsigned char SaturateCastImpl< unsigned char, float > (float u)
template<> __host__ __device__ __forceinline__ unsigned char SaturateCastImpl< unsigned char, double > (double u)
template<> __host__ __device__ __forceinline__ signed char SaturateCastImpl< signed char, unsigned char > (unsigned char u)
template<> __host__ __device__ __forceinline__ signed char SaturateCastImpl< signed char, short > (short u)
template<> __host__ __device__ __forceinline__ signed char SaturateCastImpl< signed char, unsigned short > (unsigned short u)
template<> __host__ __device__ __forceinline__ signed char SaturateCastImpl< signed char, int > (int u)
template<> __host__ __device__ __forceinline__ signed char SaturateCastImpl< signed char, unsigned int > (unsigned int u)
template<> __host__ __device__ __forceinline__ signed char SaturateCastImpl< signed char, float > (float u)
template<> __host__ __device__ __forceinline__ signed char SaturateCastImpl< signed char, double > (double u)
template<> __host__ __device__ __forceinline__ unsigned short SaturateCastImpl< unsigned short, signed char > (signed char u)
template<> __host__ __device__ __forceinline__ unsigned short SaturateCastImpl< unsigned short, short > (short u)
template<> __host__ __device__ __forceinline__ unsigned short SaturateCastImpl< unsigned short, int > (int u)
template<> __host__ __device__ __forceinline__ unsigned short SaturateCastImpl< unsigned short, unsigned int > (unsigned int u)
template<> __host__ __device__ __forceinline__ unsigned short SaturateCastImpl< unsigned short, float > (float u)
template<> __host__ __device__ __forceinline__ unsigned short SaturateCastImpl< unsigned short, double > (double u)
template<> __host__ __device__ __forceinline__ short SaturateCastImpl< short, unsigned short > (unsigned short u)
template<> __host__ __device__ __forceinline__ short SaturateCastImpl< short, int > (int u)
template<> __host__ __device__ __forceinline__ short SaturateCastImpl< short, unsigned int > (unsigned int u)
template<> __host__ __device__ __forceinline__ short SaturateCastImpl< short, float > (float u)
template<> __host__ __device__ __forceinline__ short SaturateCastImpl< short, double > (double u)
template<> __host__ __device__ __forceinline__ int SaturateCastImpl< int, unsigned int > (unsigned int u)
template<> __host__ __device__ __forceinline__ int SaturateCastImpl< int, float > (float u)
template<> __host__ __device__ __forceinline__ int SaturateCastImpl< int, double > (double u)
template<> __host__ __device__ __forceinline__ unsigned int SaturateCastImpl< unsigned int, signed char > (signed char u)
template<> __host__ __device__ __forceinline__ unsigned int SaturateCastImpl< unsigned int, short > (short u)
template<> __host__ __device__ __forceinline__ unsigned int SaturateCastImpl< unsigned int, int > (int u)
template<> __host__ __device__ __forceinline__ unsigned int SaturateCastImpl< unsigned int, float > (float u)
template<> __host__ __device__ __forceinline__ unsigned int SaturateCastImpl< unsigned int, double > (double u)
template<typename T, typename U, typename RT, RoundMode RM> inline __host__ __device__ RT RoundImpl (U u)
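
These helpers are the implementation details behind the public saturate cast: values outside the target range are clamped to its limits. Illustratively (calling the detail functions directly here only for the sake of the sketch):

using nvcv::cuda::detail::SaturateCastImpl;

unsigned char a = SaturateCastImpl<unsigned char, int>(300); // 255: clamped to the maximum
unsigned char b = SaturateCastImpl<unsigned char, int>(-42); // 0: clamped to the minimum
short         c = SaturateCastImpl<short, float>(1.0e6f);    // 32767: clamped to the maximum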

Variables

template<class T, class U, class = Require<HasTypeTraits<T, U>>>
constexpr bool IsSameCompound = IsCompound<T> && TypeTraits<T>::components == TypeTraits<U>::components
template<typename T, typename U, class = Require<HasTypeTraits<T, U>>>
constexpr bool OneIsCompound = (TypeTraits<T>::components == 0 && TypeTraits<U>::components >= 1) || (TypeTraits<T>::components >= 1 && TypeTraits<U>::components == 0) || IsSameCompound<T, U>
template<typename T, class = Require<HasTypeTraits<T>>>
constexpr bool IsIntegral = std::is_integral_v<typename TypeTraits<T>::base_type>
template<typename T, typename U, class = Require<HasTypeTraits<T, U>>>
constexpr bool OneIsCompoundAndBothAreIntegral = OneIsCompound<T, U> && IsIntegral<T> && IsIntegral<U>
template<typename T, class = Require<HasTypeTraits<T>>>
constexpr bool IsIntegralCompound = IsIntegral<T> && IsCompound<T>
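
A few static_asserts that follow directly from the definitions above:

using namespace nvcv::cuda::detail;

static_assert(OneIsCompound<int3, int>);     // compound + scalar
static_assert(OneIsCompound<float, float4>); // scalar + compound
static_assert(!OneIsCompound<int, int>);     // two scalars do not qualify
static_assert(OneIsCompoundAndBothAreIntegral<uint2, int>);
static_assert(!OneIsCompoundAndBothAreIntegral<float3, int>); // float3 is not integral
static_assert(IsIntegralCompound<short4>);
static_assert(!IsIntegralCompound<int>); // integral but not compound
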
template<class IW, NVCVBorderType B>
class BorderIWImpl
#include <BorderVarShapeWrap.hpp>

Subclassed by nvcv::cuda::BorderVarShapeWrap< T, B >, nvcv::cuda::BorderVarShapeWrap< T, NVCV_BORDER_CONSTANT >, nvcv::cuda::BorderVarShapeWrapNHWC< T, B >, nvcv::cuda::BorderVarShapeWrapNHWC< T, NVCV_BORDER_CONSTANT >

template<class TW, NVCVBorderType B, bool... ActiveDimensions>
class BorderWrapImpl
#include <BorderWrap.hpp>
template<class BT, class T>
struct ConvertBaseTypeTo
#include <Metaprogramming.hpp>
template<class BT, class T>
struct ConvertBaseTypeTo<BT, const T>
#include <Metaprogramming.hpp>
template<class BT, class T>
struct ConvertBaseTypeTo<BT, volatile const T>
#include <Metaprogramming.hpp>
template<class BT, class T>
struct ConvertBaseTypeTo<BT, volatile T>
#include <Metaprogramming.hpp>
template<class FROM, class TO>
struct CopyConstness
#include <Metaprogramming.hpp>
template<class FROM, class TO>
struct CopyConstness<const FROM, TO>
#include <Metaprogramming.hpp>
template<typename T, typename = void>
struct HasTypeTraits_t : public false_type
#include <Metaprogramming.hpp>
template<typename T>
struct HasTypeTraits_t<T, std::void_t<typename TypeTraits<T>::base_type>> : public true_type
#include <Metaprogramming.hpp>
template<typename T, NVCVBorderType B, NVCVInterpolationType I>
class InterpolationVarShapeWrapImpl
#include <InterpolationVarShapeWrap.hpp>

Subclassed by nvcv::cuda::InterpolationVarShapeWrap< T, B, I >

template<class BW, NVCVInterpolationType I>
class InterpolationWrapImpl
#include <InterpolationWrap.hpp>

Subclassed by nvcv::cuda::InterpolationWrap< BW, I >

template<class T, int C>
struct MakeType
#include <Metaprogramming.hpp>
template<>
struct MakeType<char, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<char, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<char, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<char, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<char, 4>
#include <Metaprogramming.hpp>
template<class T, int C>
struct MakeType<const T, C>
#include <Metaprogramming.hpp>
template<class T, int C>
struct MakeType<volatile const T, C>
#include <Metaprogramming.hpp>
template<>
struct MakeType<double, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<double, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<double, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<double, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<double, 4>
#include <Metaprogramming.hpp>
template<>
struct MakeType<float, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<float, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<float, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<float, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<float, 4>
#include <Metaprogramming.hpp>
template<>
struct MakeType<int, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<int, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<int, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<int, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<int, 4>
#include <Metaprogramming.hpp>
template<>
struct MakeType<long long, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<long long, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<long long, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<long long, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<long long, 4>
#include <Metaprogramming.hpp>
template<>
struct MakeType<long, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<long, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<long, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<long, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<long, 4>
#include <Metaprogramming.hpp>
template<>
struct MakeType<short, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<short, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<short, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<short, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<short, 4>
#include <Metaprogramming.hpp>
template<>
struct MakeType<signed char, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<signed char, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<signed char, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<signed char, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<signed char, 4>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned char, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned char, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned char, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned char, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned char, 4>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned int, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned int, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned int, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned int, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned int, 4>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned long long, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned long long, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned long long, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned long long, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned long long, 4>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned long, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned long, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned long, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned long, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned long, 4>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned short, 0>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned short, 1>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned short, 2>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned short, 3>
#include <Metaprogramming.hpp>
template<>
struct MakeType<unsigned short, 4>
#include <Metaprogramming.hpp>
template<class T, int C>
struct MakeType<volatile T, C>
#include <Metaprogramming.hpp>
template<class T>
struct TypeTraits
#include <Metaprogramming.hpp>

Subclassed by nvcv::cuda::detail::TypeTraits< const T >, nvcv::cuda::detail::TypeTraits< const volatile T >, nvcv::cuda::detail::TypeTraits< volatile T >

template<>
struct TypeTraits<char>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<char1>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<char2>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<char3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<char4>
#include <Metaprogramming.hpp>
template<class T>
struct TypeTraits<const T> : public nvcv::cuda::detail::TypeTraits<T>
#include <Metaprogramming.hpp>
template<class T>
struct TypeTraits<volatile const T> : public nvcv::cuda::detail::TypeTraits<T>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<dim3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<double>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<double1>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<double2>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<double3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<double4>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<float>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<float1>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<float2>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<float3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<float4>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<int>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<int1>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<int2>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<int3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<int4>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<long>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<long long>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<long1>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<long2>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<long3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<long4>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<longlong1>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<longlong2>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<longlong3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<longlong4>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<short>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<short1>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<short2>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<short3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<short4>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<signed char>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<uchar1>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<uchar2>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<uchar3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<uchar4>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<uint1>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<uint2>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<uint3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<uint4>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<ulong1>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<ulong2>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<ulong3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<ulong4>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<ulonglong1>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<ulonglong2>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<ulonglong3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<ulonglong4>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<unsigned char>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<unsigned int>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<unsigned long>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<unsigned long long>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<unsigned short>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<ushort1>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<ushort2>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<ushort3>
#include <Metaprogramming.hpp>
template<>
struct TypeTraits<ushort4>
#include <Metaprogramming.hpp>
template<class T>
struct TypeTraits<volatile T> : public nvcv::cuda::detail::TypeTraits<T>
#include <Metaprogramming.hpp>
namespace math

Functions

template<class T, int N> constexpr __host__ __device__ Vector< T, N > & operator+= (Vector< T, N > &lhs, const Vector< T, N > &rhs)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator+ (const Vector< T, N > &a, const Vector< T, N > &b)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > & operator+= (Vector< T, N > &lhs, T rhs)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator+ (const Vector< T, N > &a, T b)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator+ (T a, const Vector< T, N > &b)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > & operator-= (Vector< T, N > &lhs, const Vector< T, N > &rhs)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator- (const Vector< T, N > &a, const Vector< T, N > &b)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > & operator-= (Vector< T, N > &lhs, T rhs)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator- (const Vector< T, N > &a, T b)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator- (T a, const Vector< T, N > &b)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > & operator*= (Vector< T, N > &lhs, const T &rhs)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator* (const Vector< T, N > &a, const T &b)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator* (const T &a, const Vector< T, N > &b)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > & operator*= (Vector< T, N > &lhs, const Vector< T, N > &rhs)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator* (const Vector< T, N > &a, const Vector< T, N > &b)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > & operator/= (Vector< T, N > &lhs, const T &rhs)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator/ (const Vector< T, N > &a, const T &b)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator/ (T a, const Vector< T, N > &b)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > & operator/= (Vector< T, N > &lhs, const Vector< T, N > &rhs)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator/ (const Vector< T, N > &a, const Vector< T, N > &b)
template<class T, int N>
std::ostream &operator<<(std::ostream &out, const Vector<T, N> &v)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > operator- (const Vector< T, N > &v)
template<class T, int N> constexpr __host__ __device__ bool operator== (const Vector< T, N > &a, const Vector< T, N > &b)
template<class T, int N> constexpr __host__ __device__ bool operator== (const T &a, const Vector< T, N > &b)
template<class T, int N> constexpr __host__ __device__ bool operator== (const Vector< T, N > &a, const T &b)
template<class T, int N> constexpr __host__ __device__ bool operator< (const Vector< T, N > &a, const Vector< T, N > &b)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > operator* (const Matrix< T, M, N > &m, T val)
template<class T, int M, int N>
std::ostream &operator<<(std::ostream &out, const Matrix<T, M, N> &m)
template<class T, int M, int N, int P> constexpr __host__ __device__ Matrix< T, M, P > operator* (const Matrix< T, M, N > &a, const Matrix< T, N, P > &b)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > & operator*= (Matrix< T, M, N > &lhs, T rhs)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > operator* (T val, const Matrix< T, M, N > &m)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > & operator+= (Matrix< T, M, N > &lhs, const Matrix< T, M, N > &rhs)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > operator+ (const Matrix< T, M, N > &lhs, const Matrix< T, M, N > &rhs)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > & operator-= (Matrix< T, M, N > &lhs, const Matrix< T, M, N > &rhs)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > operator- (const Matrix< T, M, N > &a, const Matrix< T, M, N > &b)
template<class T, int M, int N> constexpr __host__ __device__ Vector< T, N > operator* (const Vector< T, M > &v, const Matrix< T, M, N > &m)
template<class T, int M, int N> constexpr __host__ __device__ Vector< T, M > operator* (const Matrix< T, M, N > &m, const Vector< T, N > &v)
template<class T, int M, int N, class = cuda::Require<(M == N && N > 1)>> constexpr __host__ __device__ Matrix< T, M, N > operator* (const Matrix< T, M, 1 > &m, const Vector< T, N > &v)
template<class T, int M, int N> constexpr __host__ __device__ Vector< T, N > & operator*= (Vector< T, M > &v, const Matrix< T, M, M > &m)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > & operator*= (Matrix< T, M, N > &lhs, const Matrix< T, N, N > &rhs)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > operator- (const Matrix< T, M, N > &m)
template<class T, int M, int N> constexpr __host__ __device__ bool operator== (const Matrix< T, M, N > &a, const Matrix< T, M, N > &b)
template<class T, int M, int N> constexpr __host__ __device__ bool operator== (const T &a, const Matrix< T, M, N > &b)
template<class T, int M, int N> constexpr __host__ __device__ bool operator== (const Matrix< T, M, N > &a, const T &b)
template<class T, int M, int N> constexpr __host__ __device__ bool operator< (const Matrix< T, M, N > &a, const Matrix< T, M, N > &b)
template<typename T, int N, int M>
constexpr Matrix<T, N, M> as_matrix(const T (&values)[N][M])
template<class T, int N> constexpr __host__ __device__ Vector< T, N > zeros ()
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > zeros ()
template<class T, int N> constexpr __host__ __device__ Vector< T, N > ones ()
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > ones ()
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > identity ()
template<class T, int M> constexpr __host__ __device__ Matrix< T, M, M > vander (const Vector< T, M > &v)
template<class T, int R> constexpr __host__ __device__ Matrix< T, R, R > compan (const Vector< T, R > &a)
template<class T, int M> constexpr __host__ __device__ Matrix< T, M, M > diag (const Vector< T, M > &v)
template<class T, int N> constexpr __host__ __device__ T dot (const Vector< T, N > &a, const Vector< T, N > &b)
template<class T, int N> constexpr __host__ __device__ Vector< T, N > reverse (const Vector< T, N > &a)
template<class T, int M> constexpr __host__ __device__ Matrix< T, M, M > & transp_inplace (Matrix< T, M, M > &m)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, N, M > transp (const Matrix< T, M, N > &m)
template<class T, int N> constexpr __host__ __device__ Matrix< T, N, 1 > transp (const Vector< T, N > &v)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > flip_rows (const Matrix< T, M, N > &m)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > flip_cols (const Matrix< T, M, N > &m)
template<class T, int M, int N> constexpr __host__ __device__ Matrix< T, M, N > flip (const Matrix< T, M, N > &m)
template<class T, int M, int N = M>
class Matrix
#include <LinAlg.hpp>

Matrix class to represent small matrices.

It uses the Vector class to store each row, keeping elements in row-major order, i.e. it has M row vectors where each vector has N elements.

Template Parameters:
  • T – Matrix value type.

  • M – Number of rows.

  • N – Number of columns. Default is M (a square matrix).

template<class T, int N>
class Vector
#include <LinAlg.hpp>

Vector class to represent small vectors.

Template Parameters:
  • T – Vector value type.

  • N – Number of elements.
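
To tie the pieces together, a small host-side usage sketch; it assumes these types and functions live in the nvcv::cuda::math namespace and that Vector and Matrix expose element access via operator[]:

using nvcv::cuda::math::Matrix;
using nvcv::cuda::math::Vector;

Vector<float, 3> v = nvcv::cuda::math::ones<float, 3>(); // {1, 1, 1}
v[0] = 2.f;                                              // assumed element access
Matrix<float, 3, 3> m = nvcv::cuda::math::identity<float, 3, 3>();
Vector<float, 3> r = m * v;            // matrix-vector product; r == v since m is the identity
float d = nvcv::cuda::math::dot(v, v); // 2*2 + 1*1 + 1*1 = 6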

namespace detail

Functions

template<class T> constexpr __host__ __device__ void swap (T &a, T &b)