java.lang.Object
com.ibm.cuda.CudaFunction
The CudaFunction class represents a kernel entry point found in a specific CudaModule loaded on a CUDA-capable device.
Field Summary
Fields

| Modifier and Type | Field | Description |
|---|---|---|
| static final int | ATTRIBUTE_BINARY_VERSION | The binary architecture version for which the function was compiled. |
| static final int | ATTRIBUTE_CONST_SIZE_BYTES | The size in bytes of user-allocated constant memory required by this function. |
| static final int | ATTRIBUTE_LOCAL_SIZE_BYTES | The size in bytes of local memory used by each thread of this function. |
| static final int | ATTRIBUTE_MAX_THREADS_PER_BLOCK | The maximum number of threads per block, beyond which a launch of the function would fail. |
| static final int | ATTRIBUTE_NUM_REGS | The number of registers used by each thread of this function. |
| static final int | ATTRIBUTE_PTX_VERSION | The PTX virtual architecture version for which the function was compiled. |
| static final int | ATTRIBUTE_SHARED_SIZE_BYTES | The size in bytes of statically-allocated shared memory required by this function. |
Method Summary
| Modifier and Type | Method | Description |
|---|---|---|
| int | getAttribute(int attribute) | Returns the value of the specified attribute. |
| void | setCacheConfig(CudaDevice.CacheConfig config) | Configures the cache for this function. |
| void | | Configures the shared memory of this function. |
Field Details
ATTRIBUTE_BINARY_VERSION
public static final int ATTRIBUTE_BINARY_VERSION
The binary architecture version for which the function was compiled. The attribute value is the major binary version * 10 + the minor binary version, so a function compiled for binary version 1.3 yields the value 13. Note that the value is 10 for legacy cubins that do not have a properly encoded binary architecture version.
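The "major * 10 + minor" encoding described for ATTRIBUTE_BINARY_VERSION can be unpacked with integer division and remainder. The following is a minimal sketch; `ArchVersion` and its members are hypothetical helper names, not part of com.ibm.cuda, and the encoded value is assumed to have been obtained already (for example from getAttribute).

```java
/** Hypothetical helper that decodes the "major * 10 + minor" encoding. */
final class ArchVersion {
    final int major;
    final int minor;

    private ArchVersion(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    /** Splits an encoded value such as 13 into major 1 and minor 3. */
    static ArchVersion decode(int encoded) {
        if (encoded < 0) {
            throw new IllegalArgumentException("encoded version must be non-negative");
        }
        return new ArchVersion(encoded / 10, encoded % 10);
    }

    @Override
    public String toString() {
        return major + "." + minor;
    }

    public static void main(String[] args) {
        System.out.println(ArchVersion.decode(13)); // prints 1.3
        System.out.println(ArchVersion.decode(10)); // prints 1.0 (the legacy cubin value)
    }
}
```

The same arithmetic applies to any value that uses this encoding; a caller may still want to treat special values separately before decoding.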
ATTRIBUTE_CONST_SIZE_BYTES
public static final int ATTRIBUTE_CONST_SIZE_BYTES
The size in bytes of user-allocated constant memory required by this function.
ATTRIBUTE_LOCAL_SIZE_BYTES
public static final int ATTRIBUTE_LOCAL_SIZE_BYTES
The size in bytes of local memory used by each thread of this function.
ATTRIBUTE_MAX_THREADS_PER_BLOCK
public static final int ATTRIBUTE_MAX_THREADS_PER_BLOCK
The maximum number of threads per block, beyond which a launch of the function would fail. This number depends on both the function and the device on which the function is currently loaded.
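Because a launch fails when the block exceeds the ATTRIBUTE_MAX_THREADS_PER_BLOCK limit, one common precaution is to clamp the requested block size to that limit and derive the grid size from it. The sketch below assumes a one-dimensional launch and takes the limit as a plain int; `LaunchGeometry` and its methods are hypothetical names, not part of com.ibm.cuda.

```java
/** Hypothetical helper for sizing a one-dimensional launch. */
final class LaunchGeometry {
    /** Returns the requested block size, reduced to the per-function limit if needed. */
    static int blockSize(int requested, int maxThreadsPerBlock) {
        if (requested <= 0 || maxThreadsPerBlock <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return Math.min(requested, maxThreadsPerBlock);
    }

    /** Returns the number of blocks needed to cover elementCount (ceiling division). */
    static int gridSize(int elementCount, int blockSize) {
        if (elementCount < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("invalid element count or block size");
        }
        // Widen to long so the intermediate sum cannot overflow.
        return (int) (((long) elementCount + blockSize - 1) / blockSize);
    }

    public static void main(String[] args) {
        int limit = 768; // illustrative value; real code would query the attribute
        int block = blockSize(1024, limit);
        int grid = gridSize(1_000_000, block);
        System.out.println(block + " threads x " + grid + " blocks"); // prints 768 threads x 1303 blocks
    }
}
```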
ATTRIBUTE_NUM_REGS
public static final int ATTRIBUTE_NUM_REGS
The number of registers used by each thread of this function.
ATTRIBUTE_PTX_VERSION
public static final int ATTRIBUTE_PTX_VERSION
The PTX virtual architecture version for which the function was compiled. The attribute value is the major PTX version * 10 + the minor PTX version, so a function compiled for PTX version 1.3 yields the value 13. Note that the value may be the undefined value 0 for cubins compiled prior to CUDA 3.0.
ATTRIBUTE_SHARED_SIZE_BYTES
public static final int ATTRIBUTE_SHARED_SIZE_BYTES
The size in bytes of statically-allocated shared memory required by this function. This does not include dynamically-allocated shared memory requested by the user at runtime.
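Since ATTRIBUTE_SHARED_SIZE_BYTES covers only the statically-allocated portion, the shared memory a block actually needs is that value plus whatever is requested dynamically at launch. A minimal sketch of that sum follows; `SharedMemoryBudget` is a hypothetical helper, and the per-block limit is an assumed input (the 48 KiB figure is only illustrative, not a value stated by this API).

```java
/** Hypothetical helper that adds static and dynamic shared memory per block. */
final class SharedMemoryBudget {
    /** Returns static plus dynamic shared memory, in bytes. */
    static long totalBytes(int staticBytes, int dynamicBytes) {
        if (staticBytes < 0 || dynamicBytes < 0) {
            throw new IllegalArgumentException("sizes must be non-negative");
        }
        return (long) staticBytes + dynamicBytes;
    }

    /** Returns true if the combined request stays within the given per-block limit. */
    static boolean fits(int staticBytes, int dynamicBytes, long perBlockLimitBytes) {
        return totalBytes(staticBytes, dynamicBytes) <= perBlockLimitBytes;
    }

    public static void main(String[] args) {
        long limit = 48L * 1024; // illustrative limit; an assumption, not from this API
        System.out.println(fits(4096, 16384, limit)); // prints true  (20480 <= 49152)
        System.out.println(fits(4096, 49152, limit)); // prints false (53248 > 49152)
    }
}
```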
Method Details
getAttribute
Returns the value of the specified attribute.
Parameters:
attribute - the attribute to be queried (see ATTRIBUTE_XXX)
Returns:
the attribute value
Throws:
CudaException - if a CUDA exception occurs
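As an illustration of querying several attributes in one pass, the sketch below collects values through a small functional interface so that it runs without a GPU or the com.ibm.cuda classes. `AttributeReport` and `AttributeQuery` are hypothetical names, and the integer ids in `main` are placeholders, not the real ATTRIBUTE_XXX constant values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that gathers several attribute values into a map. */
final class AttributeReport {
    /** Stand-in for a call shaped like getAttribute(int), which may throw. */
    @FunctionalInterface
    interface AttributeQuery {
        int get(int attribute) throws Exception;
    }

    /** Queries each id and returns label-to-value pairs in insertion order. */
    static Map<String, Integer> collect(AttributeQuery query, Map<String, Integer> ids) {
        Map<String, Integer> values = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : ids.entrySet()) {
            try {
                values.put(entry.getKey(), query.get(entry.getValue()));
            } catch (Exception e) {
                // Rethrow unchecked so callers need not declare checked exceptions.
                throw new IllegalStateException("query failed for " + entry.getKey(), e);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // Placeholder ids; real code would use the ATTRIBUTE_XXX constants instead.
        Map<String, Integer> ids = new LinkedHashMap<>();
        ids.put("NUM_REGS", 1);
        ids.put("LOCAL_SIZE_BYTES", 2);
        // Fake query returning id * 100, standing in for a real function.
        AttributeQuery fake = id -> id * 100;
        System.out.println(collect(fake, ids)); // prints {NUM_REGS=100, LOCAL_SIZE_BYTES=200}
    }
}
```

With a real CudaFunction instance (here assumed to be named `function`), a method reference such as `function::getAttribute` could presumably be passed as the query, with the map values taken from the ATTRIBUTE_XXX constants.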
setCacheConfig
Configures the cache for this function.
Parameters:
config - the desired cache configuration
Throws:
CudaException - if a CUDA exception occurs