8 Replies Latest reply: Apr 4, 2013 10:47 PM by himanshu.gautam

GPU_MAX_ALLOC_PERCENT and 13.1 drivers failure

liwoog Newbie

My code ran well in production with GPU_MAX_ALLOC_PERCENT set as high as 100 under the 12.4 drivers, but it fails with CL_OUT_OF_RESOURCES under the 13.1 drivers (the code allocates up to 90% of memory). Lowering the setting from 100 to 80 did not help.

 

Only being able to use 2GB of the 3GB on the card would render it useless for my next project. I need every bit of memory I can use.

 

Is there a workaround?

 

Machine:

4x HD 7970

Catalyst 13.1 driver on CentOS 6.3


Operating System Version (name), Linux version 2.6.32-279.19.1.el6.centos.plus.x86_64 (mockbuild@c6b7.bsys.dev.centos.org) (gcc version 4.4.6 20120305 (Red Hat 4.4.6-4) (GCC) ) #1 SMP Wed Dec 19 06:20:23 UTC 2012

 

Operating System Version (number), 2.6.32

Number Of Processors, 32

System Type, Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz

Total Physical Memory, 64392 MB

Available Physical Memory, 62184 MB

Total Virtual Memory, 33554431 MB

Available Virtual Memory, 33519322 MB

Total Page Files, 8191 MB

Available Page Files, 8191 MB

 

Platform ID, 1, 1, 1, 1, 1

Device Type, GPU, GPU, GPU, GPU, CPU

Device Name, Tahiti, Tahiti, Tahiti, Tahiti, Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz

Vendor, Advanced Micro Devices, Inc., Advanced Micro Devices, Inc., Advanced Micro Devices, Inc., Advanced Micro Devices, Inc., GenuineIntel

Command Queue Properties, Queue profiling, Queue profiling, Queue profiling, Queue profiling, Queue profiling

Is Available, Yes, Yes, Yes, Yes, Yes

Is Compiler Available, Yes, Yes, Yes, Yes, Yes

Is Little Endian, Yes, Yes, Yes, Yes, Yes

Error Correction Support, No, No, No, No, No

Execution Capabilities, Kernel Execution, Kernel Execution, Kernel Execution, Kernel Execution, Kernel Execution, Native Kernel Execution

Global Memory Cache Size, 16 KB, 16 KB, 16 KB, 16 KB, 32 KB

Memory Cache Type, Read Write, Read Write, Read Write, Read Write, Read Write

Global Memory Cache Line Size, 64 bytes, 64 bytes, 64 bytes, 64 bytes, 64 bytes

Global Memory Size, 2,048 MB, 2,048 MB, 2,048 MB, 2,048 MB, 64,393 MB

Host Unified Memory, No, No, No, No, Yes

Are Images Supported, Yes, Yes, Yes, Yes, Yes

Max Image 2D Dimensions, (256w, 256h), (256w, 256h), (256w, 256h), (256w, 256h), (1024w, 1024h)

Max Image 3D Dimensions, (256w, 256h, 256d), (256w, 256h, 256d), (256w, 256h, 256d), (256w, 256h, 256d), (1024w, 1024h, 1024d)

Local Memory Size, 32 KB, 32 KB, 32 KB, 32 KB, 32 KB

Local Memory Type, Local, Local, Local, Local, Global

Max Clock Frequency, 1050, 1050, 1050, 1050, 1200

Max Compute Units, 32, 32, 32, 32, 32

Max Constant Arguments, 8, 8, 8, 8, 8

Max Constant Buffer Size, 64 KB, 64 KB, 64 KB, 64 KB, 64 KB

Max Memory Allocation Size, 512 MB, 512 MB, 512 MB, 512 MB, 16,099 MB

Max Parameter Size, 1,024 bytes, 1,024 bytes, 1,024 bytes, 1,024 bytes, 4 KB

Read Image Arguments, 128, 128, 128, 128, 128

Max Samplers, 16, 16, 16, 16, 16

Max Workgroup Size, 256, 256, 256, 256, 1024

Max Work Item Dimensions, 3, 3, 3, 3, 3

Max Work Item Sizes, (256,256,256), (256,256,256), (256,256,256), (256,256,256), (1024,1024,1024)

Max Write Image Arguments, 8, 8, 8, 8, 8

Memory Base Address Alignment, 2048, 2048, 2048, 2048, 1024

Minimal Data Type Alignment Size, 128 bytes, 128 bytes, 128 bytes, 128 bytes, 128 bytes

OpenCL C Version, OpenCL C 1.2 , OpenCL C 1.2 , OpenCL C 1.2 , OpenCL C 1.2 , OpenCL C 1.2

Native Char Vector Width, 4, 4, 4, 4, 16

Native Short Vector Width, 2, 2, 2, 2, 8

Native Int Vector Width, 1, 1, 1, 1, 4

Native Long Vector Width, 1, 1, 1, 1, 2

Native Float Vector Width, 1, 1, 1, 1, 8

Native Double Vector Width, 1, 1, 1, 1, 4

Native Half Vector Width, 1, 1, 1, 1, 4

Preferred Char Vector Width, 4, 4, 4, 4, 16

Preferred Short Vector Width, 2, 2, 2, 2, 8

Preferred Int Vector Width, 1, 1, 1, 1, 4

Preferred Long Vector Width, 1, 1, 1, 1, 2

Preferred Float Vector Width, 1, 1, 1, 1, 8

Preferred Double Vector Width, 1, 1, 1, 1, 4

Preferred Half Vector Width, 1, 1, 1, 1, 4

Profile, FULL_PROFILE, FULL_PROFILE, FULL_PROFILE, FULL_PROFILE, FULL_PROFILE

Profiling Timer Resolution, 1, 1, 1, 1, 1

Vendor ID, OpenCL 1.2 AMD-APP (1113.2), OpenCL 1.2 AMD-APP (1113.2), OpenCL 1.2 AMD-APP (1113.2), OpenCL 1.2 AMD-APP (1113.2), OpenCL 1.2 AMD-APP (1113.2)

  • Re: GPU_MAX_ALLOC_PERCENT and 13.1 drivers failure
    himanshu.gautam Master

    The output posted above looks like a modified form of clinfo output. Can you share where it came from? It may help others, as clinfo has an issue when some platforms are OpenCL 1.1 compliant and others are OpenCL 1.2 compliant.

    I will ask the runtime team and let you know if there is a way to enable the full memory. Can you also check with the 12.10 driver (and the 13.2 beta)? Do you still get only 2 GB of the 3 GB on your Tahiti cards? Thanks for reporting it.

    Regards

    Himanshu , Bruhaspati

    --------------------------------

    The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors. Links to third party sites are for convenience only, and no endorsement is implied

    • Re: GPU_MAX_ALLOC_PERCENT and 13.1 drivers failure
      liwoog Newbie

      That was a copy/paste from CodeXL's system information.

      The clinfo output is below.

       

      Number of platforms: 1
        Platform Profile: FULL_PROFILE
        Platform Version: OpenCL 1.2 AMD-APP (1113.2)
        Platform Name: AMD Accelerated Parallel Processing
        Platform Vendor: Advanced Micro Devices, Inc.
        Platform Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices

        Platform Name: AMD Accelerated Parallel Processing
      Number of devices: 5
        Device Type: CL_DEVICE_TYPE_GPU
        Device ID: 4098
        Board name: AMD Radeon HD 7900 Series
        Device Topology: PCI[ B#2, D#0, F#0 ]
        Max compute units: 32
        Max work items dimensions: 3
          Max work items[0]: 256
          Max work items[1]: 256
          Max work items[2]: 256
        Max work group size: 256
        Preferred vector width char: 4
        Preferred vector width short: 2
        Preferred vector width int: 1
        Preferred vector width long: 1
        Preferred vector width float: 1
        Preferred vector width double: 1
        Native vector width char: 4
        Native vector width short: 2
        Native vector width int: 1
        Native vector width long: 1
        Native vector width float: 1
        Native vector width double: 1
        Max clock frequency: 1050Mhz
        Address bits: 32
        Max memory allocation: 536870912
        Image support: Yes
        Max number of images read arguments: 128
        Max number of images write arguments: 8
        Max image 2D width: 16384
        Max image 2D height: 16384
        Max image 3D width: 2048
        Max image 3D height: 2048
        Max image 3D depth: 2048
        Max samplers within kernel: 16
        Max size of kernel argument: 1024
        Alignment (bits) of base address: 2048
        Minimum alignment (bytes) for any datatype: 128
        Single precision floating point capability
          Denorms: No
          Quiet NaNs: Yes
          Round to nearest even: Yes
          Round to zero: Yes
          Round to +ve and infinity: Yes
          IEEE754-2008 fused multiply-add: Yes
        Cache type: Read/Write
        Cache line size: 64
        Cache size: 16384
        Global memory size: 2147483648
        Constant buffer size: 65536
        Max number of constant args: 8
        Local memory type: Scratchpad
        Local memory size: 32768
        Kernel Preferred work group size multiple: 64
        Error correction support: 0
        Unified memory for Host and Device: 0
        Profiling timer resolution: 1
        Device endianess: Little
        Available: Yes
        Compiler available: Yes
        Execution capabilities:
          Execute OpenCL kernels: Yes
          Execute native function: No
        Queue properties:
          Out-of-Order: No
          Profiling : Yes
        Platform ID: 0x00007ffab08f64e0
        Name: Tahiti
        Vendor: Advanced Micro Devices, Inc.
        Device OpenCL C version: OpenCL C 1.2
        Driver version: 1113.2 (VM)
        Profile: FULL_PROFILE
        Version: OpenCL 1.2 AMD-APP (1113.2)
        Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt cl_amd_c1x_atomics

       

       

       

       

        Device Type: CL_DEVICE_TYPE_GPU
        Device ID: 4098
        Board name: AMD Radeon HD 7900 Series
        Device Topology: PCI[ B#3, D#0, F#0 ]
        Max compute units: 32
        Max work items dimensions: 3
          Max work items[0]: 256
          Max work items[1]: 256
          Max work items[2]: 256
        Max work group size: 256
        Preferred vector width char: 4
        Preferred vector width short: 2
        Preferred vector width int: 1
        Preferred vector width long: 1
        Preferred vector width float: 1
        Preferred vector width double: 1
        Native vector width char: 4
        Native vector width short: 2
        Native vector width int: 1
        Native vector width long: 1
        Native vector width float: 1
        Native vector width double: 1
        Max clock frequency: 1050Mhz
        Address bits: 32
        Max memory allocation: 536870912
        Image support: Yes
        Max number of images read arguments: 128
        Max number of images write arguments: 8
        Max image 2D width: 16384
        Max image 2D height: 16384
        Max image 3D width: 2048
        Max image 3D height: 2048
        Max image 3D depth: 2048
        Max samplers within kernel: 16
        Max size of kernel argument: 1024
        Alignment (bits) of base address: 2048
        Minimum alignment (bytes) for any datatype: 128
        Single precision floating point capability
          Denorms: No
          Quiet NaNs: Yes
          Round to nearest even: Yes
          Round to zero: Yes
          Round to +ve and infinity: Yes
          IEEE754-2008 fused multiply-add: Yes
        Cache type: Read/Write
        Cache line size: 64
        Cache size: 16384
        Global memory size: 2147483648
        Constant buffer size: 65536
        Max number of constant args: 8
        Local memory type: Scratchpad
        Local memory size: 32768
        Kernel Preferred work group size multiple: 64
        Error correction support: 0
        Unified memory for Host and Device: 0
        Profiling timer resolution: 1
        Device endianess: Little
        Available: Yes
        Compiler available: Yes
        Execution capabilities:
          Execute OpenCL kernels: Yes
          Execute native function: No
        Queue properties:
          Out-of-Order: No
          Profiling : Yes
        Platform ID: 0x00007ffab08f64e0
        Name: Tahiti
        Vendor: Advanced Micro Devices, Inc.
        Device OpenCL C version: OpenCL C 1.2
        Driver version: 1113.2 (VM)
        Profile: FULL_PROFILE
        Version: OpenCL 1.2 AMD-APP (1113.2)
        Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt cl_amd_c1x_atomics

       

       

       

       

        Device Type: CL_DEVICE_TYPE_GPU
        Device ID: 4098
        Board name: AMD Radeon HD 7900 Series
        Device Topology: PCI[ B#-125, D#0, F#0 ]
        Max compute units: 32
        Max work items dimensions: 3
          Max work items[0]: 256
          Max work items[1]: 256
          Max work items[2]: 256
        Max work group size: 256
        Preferred vector width char: 4
        Preferred vector width short: 2
        Preferred vector width int: 1
        Preferred vector width long: 1
        Preferred vector width float: 1
        Preferred vector width double: 1
        Native vector width char: 4
        Native vector width short: 2
        Native vector width int: 1
        Native vector width long: 1
        Native vector width float: 1
        Native vector width double: 1
        Max clock frequency: 1050Mhz
        Address bits: 32
        Max memory allocation: 536870912
        Image support: Yes
        Max number of images read arguments: 128
        Max number of images write arguments: 8
        Max image 2D width: 16384
        Max image 2D height: 16384
        Max image 3D width: 2048
        Max image 3D height: 2048
        Max image 3D depth: 2048
        Max samplers within kernel: 16
        Max size of kernel argument: 1024
        Alignment (bits) of base address: 2048
        Minimum alignment (bytes) for any datatype: 128
        Single precision floating point capability
          Denorms: No
          Quiet NaNs: Yes
          Round to nearest even: Yes
          Round to zero: Yes
          Round to +ve and infinity: Yes
          IEEE754-2008 fused multiply-add: Yes
        Cache type: Read/Write
        Cache line size: 64
        Cache size: 16384
        Global memory size: 2147483648
        Constant buffer size: 65536
        Max number of constant args: 8
        Local memory type: Scratchpad
        Local memory size: 32768
        Kernel Preferred work group size multiple: 64
        Error correction support: 0
        Unified memory for Host and Device: 0
        Profiling timer resolution: 1
        Device endianess: Little
        Available: Yes
        Compiler available: Yes
        Execution capabilities:
          Execute OpenCL kernels: Yes
          Execute native function: No
        Queue properties:
          Out-of-Order: No
          Profiling : Yes
        Platform ID: 0x00007ffab08f64e0
        Name: Tahiti
        Vendor: Advanced Micro Devices, Inc.
        Device OpenCL C version: OpenCL C 1.2
        Driver version: 1113.2 (VM)
        Profile: FULL_PROFILE
        Version: OpenCL 1.2 AMD-APP (1113.2)
        Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt cl_amd_c1x_atomics

       

       

       

       

        Device Type: CL_DEVICE_TYPE_GPU
        Device ID: 4098
        Board name: AMD Radeon HD 7900 Series
        Device Topology: PCI[ B#-124, D#0, F#0 ]
        Max compute units: 32
        Max work items dimensions: 3
          Max work items[0]: 256
          Max work items[1]: 256
          Max work items[2]: 256
        Max work group size: 256
        Preferred vector width char: 4
        Preferred vector width short: 2
        Preferred vector width int: 1
        Preferred vector width long: 1
        Preferred vector width float: 1
        Preferred vector width double: 1
        Native vector width char: 4
        Native vector width short: 2
        Native vector width int: 1
        Native vector width long: 1
        Native vector width float: 1
        Native vector width double: 1
        Max clock frequency: 1050Mhz
        Address bits: 32
        Max memory allocation: 536870912
        Image support: Yes
        Max number of images read arguments: 128
        Max number of images write arguments: 8
        Max image 2D width: 16384
        Max image 2D height: 16384
        Max image 3D width: 2048
        Max image 3D height: 2048
        Max image 3D depth: 2048
        Max samplers within kernel: 16
        Max size of kernel argument: 1024
        Alignment (bits) of base address: 2048
        Minimum alignment (bytes) for any datatype: 128
        Single precision floating point capability
          Denorms: No
          Quiet NaNs: Yes
          Round to nearest even: Yes
          Round to zero: Yes
          Round to +ve and infinity: Yes
          IEEE754-2008 fused multiply-add: Yes
        Cache type: Read/Write
        Cache line size: 64
        Cache size: 16384
        Global memory size: 2147483648
        Constant buffer size: 65536
        Max number of constant args: 8
        Local memory type: Scratchpad
        Local memory size: 32768
        Kernel Preferred work group size multiple: 64
        Error correction support: 0
        Unified memory for Host and Device: 0
        Profiling timer resolution: 1
        Device endianess: Little
        Available: Yes
        Compiler available: Yes
        Execution capabilities:
          Execute OpenCL kernels: Yes
          Execute native function: No
        Queue properties:
          Out-of-Order: No
          Profiling : Yes
        Platform ID: 0x00007ffab08f64e0
        Name: Tahiti
        Vendor: Advanced Micro Devices, Inc.
        Device OpenCL C version: OpenCL C 1.2
        Driver version: 1113.2 (VM)
        Profile: FULL_PROFILE
        Version: OpenCL 1.2 AMD-APP (1113.2)
        Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt cl_amd_c1x_atomics

       

       

       

       

        Device Type: CL_DEVICE_TYPE_CPU
        Device ID: 4098
        Board name:
        Max compute units: 32
        Max work items dimensions: 3
          Max work items[0]: 1024
          Max work items[1]: 1024
          Max work items[2]: 1024
        Max work group size: 1024
        Preferred vector width char: 16
        Preferred vector width short: 8
        Preferred vector width int: 4
        Preferred vector width long: 2
        Preferred vector width float: 8
        Preferred vector width double: 4
        Native vector width char: 16
        Native vector width short: 8
        Native vector width int: 4
        Native vector width long: 2
        Native vector width float: 8
        Native vector width double: 4
        Max clock frequency: 2601Mhz
        Address bits: 64
        Max memory allocation: 16880146432
        Image support: Yes
        Max number of images read arguments: 128
        Max number of images write arguments: 8
        Max image 2D width: 8192
        Max image 2D height: 8192
        Max image 3D width: 2048
        Max image 3D height: 2048
        Max image 3D depth: 2048
        Max samplers within kernel: 16
        Max size of kernel argument: 4096
        Alignment (bits) of base address: 1024
        Minimum alignment (bytes) for any datatype: 128
        Single precision floating point capability
          Denorms: Yes
          Quiet NaNs: Yes
          Round to nearest even: Yes
          Round to zero: Yes
          Round to +ve and infinity: Yes
          IEEE754-2008 fused multiply-add: Yes
        Cache type: Read/Write
        Cache line size: 64
        Cache size: 32768
        Global memory size: 67520585728
        Constant buffer size: 65536
        Max number of constant args: 8
        Local memory type: Global
        Local memory size: 32768
        Kernel Preferred work group size multiple: 1
        Error correction support: 0
        Unified memory for Host and Device: 1
        Profiling timer resolution: 1
        Device endianess: Little
        Available: Yes
        Compiler available: Yes
        Execution capabilities:
          Execute OpenCL kernels: Yes
          Execute native function: Yes
        Queue properties:
          Out-of-Order: No
          Profiling : Yes
        Platform ID: 0x00007ffab08f64e0
        Name: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
        Vendor: GenuineIntel
        Device OpenCL C version: OpenCL C 1.2
        Driver version: 1113.2 (sse2,avx)
        Profile: FULL_PROFILE
        Version: OpenCL 1.2 AMD-APP (1113.2)
        Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt
    • Re: GPU_MAX_ALLOC_PERCENT and 13.1 drivers failure
      liwoog Newbie

      With 12.8 and later, my code runs with GPU_MAX_ALLOC_PERCENT set as high as 45. Thankfully, that setting also raises the global memory size to 3074424832, which is what matters.

      • Re: GPU_MAX_ALLOC_PERCENT and 13.1 drivers failure
        himanshu.gautam Master

        hi liwoog,

        Can you explain how you are seeing 3074424832 (2.86 GB) of the 3 GB with GPU_MAX_ALLOC_PERCENT set to 45?

        Shouldn't setting it to 45 enable 45% of the GPU memory?

        Regards

        Himanshu , Bruhaspati


        • Re: GPU_MAX_ALLOC_PERCENT and 13.1 drivers failure
          liwoog Newbie

          Simple..

           

                    if (CL_SUCCESS != (err = clGetDeviceInfo(devices[0], CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(cl_long), &maxGlobalMem, NULL)))

           

          returns 3074424832.

           

          The total memory seems to be dependent on the largest allocatable block:

          setenv GPU_MAX_ALLOC_PERCENT 25

          maxMemAllocSize(733741056), globalMemSize(2934964224)


          setenv GPU_MAX_ALLOC_PERCENT 26

          maxMemAllocSize(763090698), globalMemSize(3052362792)

           

          setenv GPU_MAX_ALLOC_PERCENT 27

          maxMemAllocSize(792440340), globalMemSize(3073376256)

          • Re: GPU_MAX_ALLOC_PERCENT and 13.1 drivers failure
            himanshu.gautam Master

            Hi liwoog,

            That is interesting to learn.


             

            GPU_MAX_ALLOC_PERCENT changes the maximum buffer size. If you want more GPU memory, you could try setting GPU_MAX_HEAP_SIZE to a value close to 100 (say, 95).

            Regards

            Himanshu , Bruhaspati


            • Re: GPU_MAX_ALLOC_PERCENT and 13.1 drivers failure
              liwoog Newbie

              GPU_MAX_HEAP_SIZE on its own does not seem to impact the maximum amount of usable memory.

              • Re: GPU_MAX_ALLOC_PERCENT and 13.1 drivers failure
                himanshu.gautam Master

                I also tried the GPU_MAX_HEAP_SIZE flag, and it did not work for me on Linux with the 13.1 driver. As I understand it, these flags are experimental, and AMD reserves the right to disable them in the future.

                Anyway, I do understand your problem, and I have asked someone for a workaround. I will let you know if I hear from him.

                Regards

                Himanshu , Bruhaspati

