15 Replies. Latest reply: Apr 17, 2013 3:53 AM by jpsollie

OpenCL kernel recycling makes Xorg crash

jpsollie Newbie

System:

CPU: Turion Ultra ZM-86 / 4 GB

GPU: Radeon HD 4570 / 512 MB

 

Software:

Xorg 1.12.4

Catalyst 13.1 for legacy GPU

GCC 4.6.3

Linux kernel 3.2.43

 

I have a very weird problem when iterating over an OpenCL kernel:

When the program iterates over the kernel more than twice, it makes the Xorg server crash.

I can run the program more than twice in a row when it does not iterate.

I can run the program in iteration mode when launching it from the console: the system is perfectly stable when Xorg is not running.

 

One minor remark: during kernel execution, Xorg also 'locks up'. If you put a clock in the background before launching the program in non-iterative mode, it keeps showing the launch time until the program finishes, and only then is the screen redrawn.

 

Is there any way I can instruct the program to be less 'aggressive' with its resources? I already tried a clFinish at the end of each iteration to make sure I did not forget any reads / writes in the command queue, but that didn't help either.

  • Re: OpenCL kernel recycling makes Xorg crash
    himanshu.gautam Master

    1. How long does your kernel run?

    2. IMHO, this seems more like a bug in your code. Please post your code.

    Regards

    Himanshu , Bruhaspati

    --------------------------------

    The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors. Links to third party sites are for convenience only, and no endorsement is implied

    • Re: OpenCL kernel recycling makes Xorg crash
      jpsollie Newbie

      himanshu.gautam wrote:

       

      1. How long does your kernel run?

      2. IMHO, this seems more like a bug in your code. Please post your code.

      1. About a minute.

      2. I really hoped I wouldn't have to do so :s The code is still in the 'under investigation' stage and as such is not documented at all.

      anyway, the C source file is up here: http://83.101.10.32:/debugCL.c

       

      the 'iteration' is in the function 'continue_work', line 473

      the device setup is in the function 'progress_task_proposal_advertisement', line 556

      • Re: OpenCL kernel recycling makes Xorg crash
        himanshu.gautam Master

        If you hog the GPU for 1 minute, it can potentially trigger a watchdog timer, which can attempt a forceful recovery.

        I know this used to be the case with Windows a long time ago... Not too sure what happens on Linux.

        But it is only natural to expect a 1-minute-long kernel on a display card to invite some watchdogs/monitors.

        Regards

        Himanshu , Bruhaspati


  • Re: OpenCL kernel recycling makes Xorg crash
    jpsollie Newbie

    even more interesting:

     

    the problem may not be related to Xorg itself:

    the 'iteration' is currently running on my PC at tty0, and if I use Xfce4 instead of KDE 4.8, the system is completely operational.

     

    Do I have to verify resources that may be requested by an OpenGL engine?

    • Re: OpenCL kernel recycling makes Xorg crash
      himanshu.gautam Master

      On Linux, OpenCL on the GPU is currently tied to X.

      AMD is working to de-couple the two so that you can run your programs even without the X-server running.

       

      Hogging the GPU for 1 minute will certainly invite trouble.

      I will ask around to see what can be done...

       

      But your best bet would be to break down your kernel into multiple kernel launches, each running for perhaps a second or two.
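
One way to picture the suggestion above is the host-side slicing loop sketched below. It is only a sketch under assumptions: `run_in_chunks`, `launch_fn` and `count_items` are hypothetical names that do not come from debugCL.c, and the callback stands in for the real OpenCL calls. On an OpenCL 1.0 device the slice offset would typically have to be passed as a kernel argument, since `global_work_offset` must be NULL there.

```c
#include <stddef.h>

/* Hypothetical callback: launch one slice of the work and wait for it.
 * In a real OpenCL host program this is where clSetKernelArg (to pass
 * `offset` as a kernel argument), clEnqueueNDRangeKernel (with `count`
 * as the global size) and clFinish would be called. */
typedef int (*launch_fn)(size_t offset, size_t count, void *user);

/* Split `total` work-items into slices of at most `chunk` items and
 * launch them one after another. Returns the number of launches made,
 * or -1 if a launch fails or `chunk` is zero. */
static int run_in_chunks(size_t total, size_t chunk, launch_fn launch, void *user)
{
    if (chunk == 0)
        return -1;
    int launches = 0;
    for (size_t offset = 0; offset < total; offset += chunk) {
        size_t remaining = total - offset;
        size_t count = remaining < chunk ? remaining : chunk;
        if (launch(offset, count, user) != 0)
            return -1;
        ++launches;
    }
    return launches;
}

/* Stand-in launcher for demonstration: it only adds up the slice sizes. */
static int count_items(size_t offset, size_t count, void *user)
{
    (void)offset;
    *(size_t *)user += count;
    return 0;
}
```

Calling `run_in_chunks(total, chunk, count_items, &seen)` with a `size_t seen = 0` shows how the range is cut up. The slice size would have to be tuned by measurement so that each launch stays in the range of a second or two, and the last slice may be shorter than the others, so a real kernel would need a bounds check or a padded range.
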

      Regards

      Himanshu , Bruhaspati


      • Re: OpenCL kernel recycling makes Xorg crash
        jpsollie Newbie

        Then I believe I found an unexpected feature (or a very serious program malfunction):

        The program runs without X running.

        Even if X has not been loaded at all (so at startup, before I load X), the program runs.

         

        The only dependency is that the ATI kernel module must be loaded, which the kernel does by itself.

         

        BTW: I don't know if I should change the thread subject, as running the program inside a minimalistic environment (no OpenGL, DirectFB, ...) works fine, with no screen freezes at all! So the problem is not with X; it might be Qt or another rendering engine that messes things up.

        • Re: OpenCL kernel recycling makes Xorg crash
          himanshu.gautam Master

          :-)

          Well, mmm.... I am not sure what is going on....

           

          Are you saying that you are able to run OpenCL without the X server being loaded at startup?

          Regards

          Himanshu , Bruhaspati


  • Re: OpenCL kernel recycling makes Xorg crash
    jpsollie Newbie

    Is there any way to detect if a device is currently connected to an active screen?  That would be the last resort, except moving away from this aging HD 4570 and optimizing the kernel

  • Re: OpenCL kernel recycling makes Xorg crash
    jpsollie Newbie

    I optimized my kernel a bit (replaced % with & where possible, decreased memory usage, etc.) and gained a 15% performance increase, so that's nice.  But the problem has not gone away, so I guess I'll just have to find myself a GPU which is better suited to this stuff (the bottleneck is memory bandwidth).  The case can be closed.
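
The `%` to `&` replacement mentioned above only holds when the divisor is a power of two and the operands are unsigned. A minimal sketch of that identity follows; `is_pow2` and `fast_mod` are illustrative names, not taken from the poster's kernel.

```c
#include <stdint.h>

/* Returns non-zero when n is a power of two (and n > 0). */
static int is_pow2(uint32_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Computes x % n, using a bit mask when n is a power of two.
 * The caller must make sure n is not zero. */
static uint32_t fast_mod(uint32_t x, uint32_t n)
{
    return is_pow2(n) ? (x & (n - 1)) : (x % n);
}
```

For signed values the two operators can disagree (in C, `-1 % 8` is `-1` while `-1 & 7` is `7`), so the substitution is only safe where the operand is known to be non-negative. Many compilers already make this substitution themselves when the divisor is a compile-time constant, so any gain is worth measuring.
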

  • Re: OpenCL kernel recycling makes Xorg crash
    jpsollie Newbie

    I have a question concerning this topic:

    I have a PC which is capable of running 3 VGA cards, and used this one to experiment with this program.

    Currently, this PC is populated with NVIDIA 8800/9500 cards (bought way before the actual Phenom X6), and some signs of a solution seem to appear.  Of course, the limited GPU capabilities (8800 cores were actually never designed to be OpenCL 1.0 compatible) will not fix the problem.

    A card replacement may let me continue my research, but I have one question that bothers me a bit:

    In the OpenCL benchmarks, AMD cards seem to perform way better than NVIDIA cards at integer operations (the ones I am performing) for the same price, so that would be the obvious choice.  However, the work-group size limit is only 256, compared to 1024 with NVIDIA cards.  Is this a driver software limit which AMD might change in one of its next releases? It would be useful if I could put a few more items in the same work-group.

    • Re: OpenCL kernel recycling makes Xorg crash
      himanshu.gautam Master

      jpsollie wrote:

       

      However, the work-group size limit is only 256, compared to 1024 with NVIDIA cards.  Is this a driver software limit which AMD might change in one of its next releases? It would be useful if I could put a few more items in the same work-group.

      I do not think that limit will change any time soon. And I do not see a big reason to support 1024 work-items in a compute unit anyhow. The intent of GPU computing is to use the available GPU resources to their maximum. You can always break down your work to fit within 256 threads instead of 1024 by assigning 4 times more work to each thread.
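
The "4 times more work per thread" idea can be sketched as a strided loop inside each work-item. The code below simulates one 256-item work-group on the host in plain C. `WG_SIZE`, `ITEMS_PER_THREAD`, `work_item` and `simulate_group` are assumed names, and the doubling is only a placeholder for the real per-item work.

```c
#include <stddef.h>

enum { WG_SIZE = 256, ITEMS_PER_THREAD = 4 };

/* What one work-item would do: handle ITEMS_PER_THREAD logical items,
 * strided by WG_SIZE so that neighbouring work-items touch neighbouring
 * elements on every step. */
static void work_item(size_t local_id, const int *in, int *out)
{
    for (size_t k = 0; k < ITEMS_PER_THREAD; ++k) {
        size_t i = local_id + k * WG_SIZE;
        out[i] = in[i] * 2;   /* placeholder for the real per-item work */
    }
}

/* Host-side simulation of one 256-item work-group covering 1024 elements. */
static void simulate_group(const int *in, int *out)
{
    for (size_t lid = 0; lid < WG_SIZE; ++lid)
        work_item(lid, in, out);
}
```

In an actual kernel, `local_id` would come from `get_local_id(0)` and the loop body would live in the kernel itself. No extra kernel launches are needed for this pattern, only a longer loop per work-item.
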

      Regards

      Himanshu , Bruhaspati


      • Re: OpenCL kernel recycling makes Xorg crash
        jpsollie Newbie

        True, but that could mean I have to enqueue a few extra kernel executions.  The process is mainly coordinated by get_local_id(i) and get_group_id(i), and a work-group size of 1024 would let me use a second work-group dimension from time to time (32*32). Isn't this an overhead worth thinking about? (I have absolutely no clue, so if it's a stupid question, just tell me.)
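
As a rough illustration of the 32*32 case raised here, a logical 32x32 tile can be covered by a 256-item work-group if each work-item derives its own (row, col) pairs from a flat index over four passes. This is only a sketch under assumptions: `TILE`, `WG`, `cell` and `tile_cell` are hypothetical names, and it says nothing about how the cost compares with a native 1024-item work-group.

```c
#include <stddef.h>

enum { TILE = 32, WG = 256 };

typedef struct { size_t row, col; } cell;

/* Cell of the logical 32x32 tile handled by work-item `local_id`
 * on its `pass`-th step (pass = 0..3). */
static cell tile_cell(size_t local_id, size_t pass)
{
    size_t flat = pass * WG + local_id;   /* 0..1023 */
    cell c = { flat / TILE, flat % TILE };
    return c;
}
```

With this mapping, each pass covers eight complete rows of the tile, so the four passes together visit every one of the 1024 cells exactly once. In a kernel, `local_id` would be `get_local_id(0)` of a one-dimensional 256-item work-group.
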
