15 Replies Latest reply: Aug 23, 2012 2:08 AM by santosh.zanjurne

Problems building openmpi 1.6 with AMD open64 compiler

mithion Newbie

With the new release of AMD Open64 4.5.2, I wanted to recompile openmpi to support the new version. However, I've been having problems building it. I've tried several variations; for example, I ran the following configure command:

 

# ./configure --prefix=/usr/local/openmpi CC=opencc CXX=openCC F77=openf90 FC=openf90 CFLAGS=-m64 CXXFLAGS=-m64 FFLAGS=-m64 FCFLAGS=-m64

 

This results in the following error:

 

checking Fortran 90 kind of MPI_INTEGER_KIND (selected_int_kind(9))... ./configure: line 53651: 16695 Illegal instruction     ./conftest 1>&5 2>&1

configure: error: Could not determine kind of selected_int_kind(MPI_INTEGER_KIND)

 

I've also tried omitting the F77 variable from the configure line. This allows the configuration to proceed without errors, and I can subsequently build and install openmpi. However, the resulting installation reports that openmpi was not built with Fortran 90 support, so the mpif90 command does not work. Is anybody else seeing these kinds of errors?

 

Philippe

  • Re: Problems building openmpi 1.6 with AMD open64 compiler
    santosh.zanjurne Moderator

    Hi Philippe,

    Can you give me details on the machine: processor, OS name and version, and the gcc, glibc, and binutils versions? Also let me know how you obtained the Open64 compiler, i.e. whether you built it from source or which binary package you downloaded.

     

    I tried to reproduce this on RHEL-6.2 and SLES11.sp2, but I could not reproduce this issue.

     

    Regards,

    Santosh

    -------------------------
    The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors. Links to third party sites are for convenience only, and no endorsement is implied.
    • Re: Problems building openmpi 1.6 with AMD open64 compiler
      mithion Newbie

      We are using this on a cluster, and I'm currently trying to build openmpi 1.6 (the latest version at this time) with open64 support on the head node of the cluster. Our head node has an Intel Core 2 Quad Q8400. However, the compute nodes run Opteron 6272s, which is why we wish to use open64. I've downloaded openmpi 1.6 and extracted the compressed tar file. I then cd into the directory and run the ./configure command I posted above.

       

      I've tried different combinations of options (i.e. FC only, FC + F77, FC + CC + CXX, FC + F77 + CXX + CC, etc.). However, the openmpi FAQ recommends building openmpi with a consistent compiler suite for best results. If I understand correctly, GCC is used wherever a specific compiler isn't explicitly specified. I only get the above error (the one about selected_int_kind) when I specify both the FC and F77 compilers. If I omit F77, ./configure completes without error, but I get another problem down the road.

       

      So to give you some information about our setup:

      We are running Rocks Cluster Linux 6.0 (which is built from CentOS 6.2)

      # rpm -q gcc glibc binutils

      gcc-4.4.6-3.el6.x86_64

      glibc-2.12-1.47.el6_2.9.x86_64

      glibc-2.12-1.47.el6_2.9.i686

      binutils-2.20.51.0.2-5.28.el6.x86_64

       

      I also wanted to add that we've been successfully using open64 4.5.1 on this system for a few months with the same version of openmpi.

    • Re: Problems building openmpi 1.6 with AMD open64 compiler
      mithion Newbie

      So I went ahead and switched my environment back to open64 4.5.1, and I was able to successfully run the ./configure command from the original post. So something changed with 4.5.2.

      • Re: Problems building openmpi 1.6 with AMD open64 compiler
        santosh.zanjurne Moderator

        Since you are running the application on Opteron machines, you should add "-march=bdver1" to all of the flag variables (CFLAGS, CXXFLAGS, FFLAGS, FCFLAGS). Doing this helps the compiler generate optimized code for the target architecture and should help you get the best performance. I shall still try to reproduce the issue you reported. Let me know if this helps.
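
As a rough sketch of the suggestion above, the configure line from the original post could be extended as follows. The function name `openmpi_configure_cmd` is hypothetical, and the snippet only prints the command instead of running it. One caveat, stated as an assumption: configure compiles and runs small test programs on the build host, so on a non-Bulldozer head node code generated with "-march=bdver1" might itself fail there.

```shell
#!/bin/sh
# Sketch: print a configure command with -march=bdver1 added to every
# FLAGS variable. Nothing is executed; copy the printed line into the
# openmpi-1.6 source directory to use it.
openmpi_configure_cmd() {
    arch_flags="-m64 -march=bdver1"
    echo ./configure --prefix=/usr/local/openmpi \
        CC=opencc CXX=openCC F77=openf90 FC=openf90 \
        "CFLAGS='$arch_flags'" "CXXFLAGS='$arch_flags'" \
        "FFLAGS='$arch_flags'" "FCFLAGS='$arch_flags'"
}

openmpi_configure_cmd
```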

         

        Regards,

        Santosh

        • Re: Problems building openmpi 1.6 with AMD open64 compiler
          santosh.zanjurne Moderator

          Philippe,

          Can you send the console output of the attached program, as well as the files generated by the compiler? The test program and the command are in the attached file.

          • Re: Problems building openmpi 1.6 with AMD open64 compiler
            mithion Newbie

            So here is the requested information. I've also included the output of the compilation and of the test program in the file output.txt. It appears to result in an illegal operation.

            • Re: Problems building openmpi 1.6 with AMD open64 compiler
              craas Newbie

              I can reproduce this issue (both the OpenMPI build failure and the illegal instruction in the test program).

               

              $ strace ./a.out >strace.out 2>&1

              Illegal instruction

               

              strace.out is attached

               

              $ uname -a

              Linux x 2.6.32-279.1.1.el6.x86_64 #1 SMP Tue Jul 10 11:24:23 CDT 2012 x86_64 x86_64 x86_64 GNU/Linux

               

              $ cat /etc/redhat-release

              Scientific Linux release 6.2 (Carbon)

               

              # rpm -q gcc glibc binutils

              gcc-4.4.6-3.el6.x86_64

              glibc-2.12-1.80.el6_3.3.x86_64

              glibc-2.12-1.80.el6_3.3.i686

              binutils-2.20.51.0.2-5.28.el6.x86_64 

              • Re: Problems building openmpi 1.6 with AMD open64 compiler
                santosh.zanjurne Moderator

                I think both of you are facing this problem because you have downloaded the compiler binaries that are meant to run on Bulldozer machines, i.e. the package listed under "SLES 11, RHEL 6".

                Can you please confirm?

                 

                http://developer.amd.com/tools/open64/pages/default.aspx#four

                 

                On the above link we have two different compiler binaries to download, with an rpm and a tar version of each.

                 

                A. SLES 11, RHEL 6 -

                   Since SLES 11 and RHEL 6 come by default with a recent binutils package that supports the Bulldozer architecture, these binaries are built on a Bulldozer machine with the bdver1 flag. The compiler binaries and libraries therefore contain Bulldozer instructions and will not run on a non-Bulldozer machine.

                 

                B. SLES 10 SP2, SLES 10 SP3, RHEL 5.5 -

                   These are built with the older binutils, which has no support for Bulldozer instructions. Since no Bulldozer instructions are used in the compiler binaries, the binaries in this category should run on Bulldozer as well as non-Bulldozer machines.

                 

                On a non-Bulldozer machine one should use the binaries listed under 'B' above.
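
The A/B choice described above could be scripted roughly as follows. This is only a sketch: it assumes that a Bulldozer-class host reports vendor `AuthenticAMD` and `cpu family : 21` (15h written in decimal) in /proc/cpuinfo, the function names `pick_open64_package` and `cpuinfo_field` are hypothetical, and the two file names are the tarball names quoted elsewhere in this thread.

```shell
#!/bin/sh
# Sketch: choose between the two Open64 4.5.2 binary packages based on the
# CPU of the machine that will RUN the compiler (not the compute nodes).
# Assumption: Bulldozer (family 15h) appears as "cpu family : 21".

pick_open64_package() {
    # $1 = vendor_id, $2 = cpu family (decimal)
    if [ "$1" = "AuthenticAMD" ] && [ "$2" = "21" ]; then
        # Group A: built with bdver1, needs a Bulldozer host
        echo "x86_open64-4.5.2-1.x86_64.tar.bz2"
    else
        # Group B: contains no Bulldozer instructions
        echo "x86_open64-4.5.2-1.rhel5_sles10.x86_64.tar.bz2"
    fi
}

cpuinfo_field() {
    # $1 = field name, $2 = cpuinfo-style file (defaults to /proc/cpuinfo)
    # Prints the value of the first matching "name : value" line.
    awk -F: -v key="$1" '
        { k = $1; gsub(/[ \t]+$/, "", k) }
        k == key { v = $2; gsub(/^[ \t]+/, "", v); print v; exit }
    ' "${2:-/proc/cpuinfo}"
}

if [ -r /proc/cpuinfo ]; then
    pick_open64_package "$(cpuinfo_field vendor_id)" "$(cpuinfo_field 'cpu family')"
fi
```

On the Opteron 6128 reported later in this thread (cpu family 16), this sketch would select the group B tarball.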

                 

                Let me know if this helps.

                 

                Regards,

                Santosh

                • Re: Problems building openmpi 1.6 with AMD open64 compiler
                  craas Newbie

                  Thanks, Santosh!

                   

                  Confirmed: We have

                  • RHEL 6
                  • binutils > 2.20.0-0.7.9, and
                  • AMD Opteron Family 16, i.e. 10h (instead of AMD Opteron Family 15h)

                  and I chose the RHEL 6 binary package. After switching to the RHEL 5 package (without "intrinsic" bdver1 optimization) the compiler works again. Sorry for this, I should have taken the "Bulldozer architecture" literally.

                   

                  Two proposals:

                  http://developer.amd.com/tools/open64/pages/default.aspx#four

                  http://developer.amd.com/tools/open64/assets/ReleaseNotes.txt

                  a) Maybe one should replace the "x86 Open64 4.5.2-1 Compilers for Linux with older GlibC/assembler" by "x86 Open64 4.5.2-1 Compilers for Linux with older GlibC/assembler or for non-Bulldozer architectures" on the web page?

                  b) In addition, the binutils remarks are somewhat confusing, as two different versions are referred to. Only the "x86 Open64 4.5.2 Release Notes" give the full list of working binutils+distro combinations. Maybe referring to the Release Notes only is less confusing?

                   

                  Full summary:

                   

                  $ rpm -q binutils

                  binutils-2.20.51.0.2-5.28.el6.x86_64

                   

                  $ cat /etc/redhat-release

                  Scientific Linux release 6.2 (Carbon) -> aka community RHEL 6.2

                   

                  $ cat /proc/cpuinfo|head -25|egrep '^vendor_id|^cpu family|^model|^flags'|perl -pwe 's/[\t ]+/ /gm'

                  vendor_id : AuthenticAMD

                  cpu family : 16

                  model : 9

                  model name : AMD Opteron(tm) Processor 6128

                  flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid amd_dcm pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt nodeid_msr npt lbrv svm_lock nrip_save pausefilter

                   

                  -> Magny-Cours OS6128WKT8EGO

                  http://developer.amd.com/Assets/CompilerOptQuickRef-61004100.pdf

                   

                  Santosh's test case:

                  x86_open64-4.5.2-1.x86_64.tar.bz2 for RHEL 6 yields illegal instruction.

                  x86_open64-4.5.2-1.rhel5_sles10.x86_64.tar.bz2 for RHEL 5 works.

                   

                  Thanks again!

                • Re: Problems building openmpi 1.6 with AMD open64 compiler
                  mithion Newbie

                  I still think something is wrong with version 4.5.2. I've been using version 4.5.1 and compiling my software with "-march=bdver1", which means I've been using option A for the last 4 months. Option A should work on non-Bulldozer architectures. We use an old Core 2 Quad for our frontend for the simple reason that it seemed a waste of resources to put an expensive high-end Opteron in a frontend that realistically does very little work. In the end, I was still able to build openmpi 1.6 with 4.5.1 option A regardless of the frontend architecture. Is there a reason why the 4.5.2 compiler itself was compiled with bdver1, thus limiting its portability?

  • Re: Problems building openmpi 1.6 with AMD open64 compiler
    mithion Newbie

    I was able to fix the problem by installing the compiler from the RPM package instead of the tarball version. So to summarize, the RHEL 6 version of open64 4.5.2 does work on non-bulldozer machines, but I wasn't able to get the tarball to work, only the RPM. Hope this helps others.

     

    EDIT: I went and tested the small test program santosh provided and it still results in an illegal operation. But openmpi was compiled correctly and our own internal code compiled and is currently running on the cluster with the new compiler version. There's something still iffy about selected_int_kind...

    • Re: Problems building openmpi 1.6 with AMD open64 compiler
      santosh.zanjurne Moderator

      If the test program fails with the RPM version of the binaries, then the openmpi build should also fail at the 'configure' stage, since the test program is taken from the 'configure' script in the openmpi sources.

       

      You can search for the string "checking Fortran 90 kind of MPI_INTEGER_KIND (selected_int_kind(9))" in the config.log file in the directory where you build the sources, to see whether the test program that failed with the RPM binaries was run and passed during the 'configure' stage of openmpi.
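
A minimal sketch of that search, assuming GNU or BSD grep (for the `-A` option) and a config.log in the current build directory; the function name `show_kind_check` and the choice of 15 context lines are arbitrary:

```shell
#!/bin/sh
# Sketch: print the MPI_INTEGER_KIND check from config.log together with
# the lines that follow it, where configure records what the test did.
show_kind_check() {
    # $1 = path to config.log (defaults to ./config.log)
    grep -n -A 15 'checking Fortran 90 kind of MPI_INTEGER_KIND' "${1:-config.log}"
}

if [ -f config.log ]; then
    show_kind_check config.log
fi
```

If the function prints nothing, the check was never reached in that configure run.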

       

      Make sure you have a clean sandbox beforehand; execute 'make clean distclean'.

       

      If you still see it passing, then please attach your config.log file here.

       

      To verify that the binaries in the tar and rpm packages are the same for the SLES10/RHEL5 or SLES11/RHEL6 group, you can run opencc from the respective folder with the -v option and check whether the build dates match. They must be the same within each group.
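
The build-date comparison could be sketched like this. It assumes the banner contains a "Built on:" line, as in the `--version` output shown later in this thread, and it uses `--version` rather than `-v` for that reason; the function names and the example paths are placeholders.

```shell
#!/bin/sh
# Sketch: extract the "Built on:" timestamp from the compiler banner so the
# tarball and RPM installs can be compared.
build_date() {
    # Reads a version banner on stdin and prints only the timestamp.
    sed -n 's/^[[:space:]]*Built on:[[:space:]]*//p'
}

same_build() {
    # $1, $2 = paths to two opencc binaries (hypothetical install locations)
    a=$("$1" --version 2>&1 | build_date)
    b=$("$2" --version 2>&1 | build_date)
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example (adjust both placeholder paths to the actual install locations):
# if same_build /path/to/rpm-install/bin/opencc /path/to/tarball/bin/opencc; then
#     echo "same build"
# else
#     echo "different builds"
# fi
```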

      • Re: Problems building openmpi 1.6 with AMD open64 compiler
        mithion Newbie

        You are correct: after running ./configure on a clean tree, it fails again at the same place. However, I'm totally confused as to what happened yesterday and why I was able to get it to work. As proof, here is what I get when I run the following:

         

        [root@argo openmpi-1.6]# mpif90 --version

        Open64 Compiler Suite: Version 4.5.2

        Built on: 2012-08-03 01:26:59 -0700

        Thread model: posix

        GNU gcc version 4.2.0 (Open64 4.5.2 driver)

         

        [root@argo openmpi-1.6]# rpm -qa | grep open64

        x86_open64-4.5.2-1.x86_64

         

        [root@argo openmpi-1.6]# openf90 --version

        Open64 Compiler Suite: Version 4.5.2

        Built on: 2012-08-03 01:26:59 -0700

        Thread model: posix

        GNU gcc version 4.2.0 (Open64 4.5.2 driver)

         

        As you can see, I'm clearly using the openf90 4.5.2.

         

        • Re: Problems building openmpi 1.6 with AMD open64 compiler
          santosh.zanjurne Moderator

          Executing 'configure' successfully generates a lot of files, including libtool, to assist in a smooth build process. 'libtool' is hardcoded with the paths in which to search for the dependent shared objects.

          Calling 'configure' again will overwrite the existing build files, but if 'configure' fails, then calling 'make' will use the old build files. I guess this is what happened.
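
One way to avoid that situation is to chain the steps so that 'make' never runs after a failed 'configure'. The following is a sketch only; the function name `clean_rebuild` and the `CONFIGURE_CMD` / `MAKE_CMD` override variables are hypothetical and exist mainly so the steps can be swapped out.

```shell
#!/bin/sh
# Sketch: rebuild from a clean tree and stop at the first failing step, so
# a failed configure is never followed by make reusing stale build files.
clean_rebuild() {
    configure_cmd=${CONFIGURE_CMD:-./configure}
    make_cmd=${MAKE_CMD:-make}

    # Only clean if a previous configure run left a Makefile behind.
    if [ -f Makefile ]; then
        "$make_cmd" distclean || true
    fi

    "$configure_cmd" "$@" || {
        echo "configure failed; not running make" >&2
        return 1
    }
    "$make_cmd" all || {
        echo "make failed" >&2
        return 1
    }
}

# Example, run inside the openmpi-1.6 source directory with the options
# from the original post:
# clean_rebuild --prefix=/usr/local/openmpi CC=opencc CXX=openCC \
#     F77=openf90 FC=openf90 CFLAGS=-m64 CXXFLAGS=-m64 FFLAGS=-m64 FCFLAGS=-m64
```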

           

          I am working on removing any ambiguity on the download page. In the meantime, let me know if you have any issues using the non-AVX compiler binaries for your current purposes. Make sure you always specify the "-march=x" flag to get the best performance for your targeted architecture.

