Does anyone know if AMD OpenCL can support 6GB of physical device memory?
Specifically, on a Tahiti 7970 single-GPU card with 6GB of GDDR5 memory.
One concern is that OpenCL uses 32-bit pointers. There is a 48-bit base address in the
128-bit memory descriptors, but is it used? How does it work with 32-bit pointers?
The other concern is that the performance of the upper memory might be degraded by page swapping
or some other system mechanism.
(Yes, I need more memory!)
Many thanks for any input.
The AMD OpenCL compiler and runtime require a single linear address space for buffer allocations. To support more than 4GB of buffer allocations, the kernel binary (ISA) therefore needs 64-bit memory access and 64-bit arithmetic for address calculation. Setting the environment variable GPU_FORCE_64BIT_PTR=1 (e.g. "set GPU_FORCE_64BIT_PTR=1" in a Windows command prompt) forces the compiler and runtime to generate ISA that supports 64-bit address calculation and memory access. That in turn allows the OpenCL runtime to report all 6GB of the card's local (on-board) memory. The main performance cost comes from the 64-bit address calculation, not from the memory access itself.
The 48-bit address in the memory descriptor is mainly used for images. Images are accessed per texel by
coordinates and don't really require allocations in a single linear address space. When the OpenCL runtime
reports the supported memory size, it doesn't know whether the application is going to allocate 6GB of buffers or images.
Hi German. Thank you for the explanation, but it's still not quite clear what exactly the variable does, especially when kernels are compiled offline. Does this variable affect compilation, runtime, or both? What happens if the variable is set at compile time but not at run time (or the other way around)? What happens if the host application is 32-bit (so sizeof(size_t) on the CPU is 4)?