3 Replies Latest reply: Jan 2, 2013 1:59 PM by rc556677

Illegal instruction: ACML 5.3.0+Open64 4.5.2

rc556677 Newbie

This is on Fedora 18 Beta:

 

Fortran examples are failing with "Illegal instruction" using Open64 4.5.2 and ACML 5.3.0.

The target CPU is Phenom II X4 965

All C and C++ examples pass. Only the Fortran examples fail.

It looks like an illegal instruction in the Fortran support library.

I wonder whether this is an Open64 or ACML problem?

 

The combination of  GCC 4.7.2 and ACML 5.3.0 passes all examples

 

[178502.799741] traps: sgetrf_example.[8443] trap invalid opcode ip:7f87dac66780 sp:7fffd45fe468 error:0 in libffio.so[7f87dac14000+84000]
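
The trap line above carries enough information to locate the faulting instruction inside libffio.so: the instruction pointer (`ip`) minus the mapping base gives an offset into the library. A minimal sketch follows; the function name `fault_offset` is hypothetical, and treating the result as an address for objdump assumes the mapping starts at the beginning of the library file.

```shell
# Offset of the faulting instruction inside the mapped library:
# ip minus the mapping base, both copied from the kernel trap line.
fault_offset() {
  # $1 = instruction pointer (hex, without 0x)
  # $2 = mapping base address (hex, without 0x)
  printf '0x%x\n' "$(( 0x$1 - 0x$2 ))"
}

fault_offset 7f87dac66780 7f87dac14000   # prints 0x52780

# The offset could then be inspected with objdump, for example:
#   objdump -d --start-address=0x52780 --stop-address=0x52790 \
#       /opt/acml5.3.0/open64_64/lib/libffio.so
```

The result, 0x52780, lies within the 0x84000-byte mapping reported in the trap line.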

 

$ /opt/acml5.3.0/util/cpuid.exe

Chip manufacturer: AuthenticAMD

AuthenticAMD family 15 extended family 1 model 4

Model Name: AMD Phenom(tm) II X4 965 Processor

Chip supports SSE

Chip supports SSE2

Chip supports SSE3

Chip does not support AVX

Chip does not support FMA3

Chip does not support FMA4
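
The same AVX check can be made without the ACML utility by reading the `flags` line of /proc/cpuinfo on Linux. This is only a sketch; the helper name `cpu_has_flag` is hypothetical. Note that Linux reports SSE3 under the flag name `pni`.

```shell
# Return success if the named CPU feature flag is listed in cpuinfo.
cpu_has_flag() {
  # $1 = flag name (e.g. avx), $2 = cpuinfo file (defaults to /proc/cpuinfo)
  grep -m1 '^flags' "${2:-/proc/cpuinfo}" \
    | tr ' \t' '\n\n' \
    | grep -qxF "$1"
}

if cpu_has_flag avx; then
  echo "AVX supported"
else
  echo "AVX not supported"
fi
```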

 

Compiling program acmlinfo.f:

openf95 -c -O1 acmlinfo.f -o acmlinfo.o

Linking program acmlinfo.exe:

openf95  acmlinfo.o  /opt/acml5.3.0/open64_64/lib/libacml.a -lrt -ldl -o acmlinfo.exe

Running program acmlinfo.exe:

(export LD_LIBRARY_PATH='/opt/acml5.3.0/open64_64/lib:'; ./acmlinfo.exe > acmlinfo.res 2>&1)

ACML (AMD Core Math Library) version 5.3.0.67  (Tue Dec 11 04:15:54 CST 2012)

Copyright AMD,NAG 2012

Build system: Linux 3.0.13-0.27-default x86_64 acml-build-lin2

Built using Fortran compiler: openf95 Open64 Compiler Suite: Version 4.5.2

   with flags:  -OPT:vcast_complex=OFF -Wall -fPIC -fno-second-underscore -DUSE_ACMLMALLOCFAST -m64 -DIS_64BIT -march=opteron -msse -msse2 -O2

and C compiler: gcc (GCC) 4.7.1

   with flags: -L/opt/x86_open64-4.5.2/lib/gcc-lib/x86_64-open64-linux/4.5.2 -Wall -W -Wno-unused-parameter -Wstrict-prototypes -Wwrite-strings -D_GNU_SOURCE -D_ISOC99_SOURCE -fPIC -DUSE_ACMLMALLOCFAST -m64 -DIS_64BIT -march=opteron -msse -msse2 -O3

 

 

Compiling program sgetrf_example.f:

openf95 -c -O1 sgetrf_example.f -o sgetrf_example.o

Linking program sgetrf_example.exe:

openf95  sgetrf_example.o  /opt/acml5.3.0/open64_64/lib/libacml.a -lrt -ldl -o sgetrf_example.exe

Running program sgetrf_example.exe:

(export LD_LIBRARY_PATH='/opt/acml5.3.0/open64_64/lib:'; ./sgetrf_example.exe > sgetrf_example.res 2>&1)

/bin/sh: line 1:  8443 Illegal instruction     (core dumped) ./sgetrf_example.exe > sgetrf_example.res 2>&1

make: *** [sgetrf_example.res] Error 132

  • Re: Illegal instruction: ACML 5.3.0+Open64 4.5.2
    Chip Freitag Moderator

    I can duplicate this issue.  I ran under gdb, and the illegal instruction is a vzeroupper instruction, which is an AVX opcode, not supported by the part you are using. 

    This instruction is at the start of the main program, which means that the open64 compiler is putting it in by default.  I even added -march=opteron -msse -msse2 (the flags used for the ACML library build), and that did not change the problem.  I then added -mno-avx to the command line, and the problem no longer occurs in MAIN but instead occurs in the ACML library.

     

    Unfortunately we did not build the ACML library with -mno-avx, so vzeroupper occurs in many places.

     

    We'll have to rebuild the open64 version to resolve this problem. I'm not sure when we will be able to post it.

    The information presented in this reply is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors. Links to third party sites are for convenience only, and no endorsement is implied.
    • Re: Illegal instruction: ACML 5.3.0+Open64 4.5.2
      Chip Freitag Moderator

      My previous post is slightly in error.  The next illegal instruction is in the libfortran.a runtime supplied by the open64 compiler.

      However I was able to solve the problem!

       

      On the open64 compiler page you should find two recent 4.5.2-1 builds.  The first set are for "Piledriver core" devices.

      The second set are for any x86_64 parts. 

      http://developer.amd.com/tools/cpu-development/x86-open64-compiler-suite/

      The file name is: x86_open64-4.5.2-1.rhel5_sles10.x86_64.tar.bz2

       

      I downloaded this second version and installed it, making sure that LD_LIBRARY_PATH points to the runtime libraries from this new version.  I added -mno-avx to the FLAGS definitions in the example GNUmakefile.

      After these changes, the examples all built and ran correctly.

       

      Apparently this alternate build of the open64 compiler has the runtimes built without AVX instructions.

      The good news is there is no need for a new ACML version.
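
One way to check whether a given runtime library contains the offending instruction is to disassemble it and count the `vzeroupper` lines. The sketch below assumes GNU objdump is installed; the helper name `count_vzeroupper` is hypothetical, and the library path in the usage comment is a placeholder.

```shell
# Read a disassembly on stdin and print how many lines mention vzeroupper.
count_vzeroupper() {
  # grep -c prints the count but exits non-zero when the count is 0,
  # so the exit status is normalised with "|| true".
  grep -c -w 'vzeroupper' || true
}

# Usage (path is a placeholder for the runtime being checked):
#   objdump -d /path/to/libffio.so | count_vzeroupper
```

A count of 0 suggests that the library does not use this particular AVX instruction; it does not rule out other AVX opcodes.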

      • Re: Illegal instruction: ACML 5.3.0+Open64 4.5.2
        rc556677 Newbie

        Thank you for investigating this - it has solved my problem.

         

        Following your instructions I installed x86_open64-4.5.2-1.rhel5_sles10.x86_64.rpm

        on Fedora 18/Phenom II X4 965. I checked that the runtimes libfortran.so libacml_mv.so libffio.so

        are from this build. Now Open64/ACML 5.3.0 examples all build and run correctly.

         

        Thanks again.

         

        Richard
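
The check described above, that libfortran.so, libacml_mv.so and libffio.so come from the new build, could be done by filtering `ldd` output. This is a sketch only; the helper name `runtime_paths` is hypothetical, and it lists only the libraries the executable links dynamically.

```shell
# Read ldd output on stdin and print "name path" for the Open64/ACML runtimes.
runtime_paths() {
  awk '/libfortran|libffio|libacml_mv/ && $2 == "=>" { print $1, $3 }'
}

# Usage, with LD_LIBRARY_PATH set as it is when the examples are run:
#   ldd ./sgetrf_example.exe | runtime_paths
```

Each printed path should point into the directory of the newly installed compiler.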
