MPI Support in Shifter: MPICH ABI
=================================

MPICH and its many variants agreed in 2014 to retain ABI compatibility [1]_ to
help improve development practices.  This ABI compatibility also provides a
clear path for almost transparently supporting MPI within Shifter containers.
The basic idea is that the container developer uses a fairly vanilla build of
MPICH and dynamically links their application against it.  The Shifter-hosting
site then configures Shifter to inject its site-specific version of MPICH
(perhaps a Cray, Intel, or IBM variant) linked against the interconnect and
workload-manager driver libraries.  The site-specific libmpi.so overrides the
version in the container, and the application automatically uses it instead of
the generic version originally included in the image.

Container Developer Instructions
--------------------------------

Here is an example Dockerfile::

    FROM ubuntu:14.04
    RUN apt-get update && apt-get install -y autoconf automake gcc g++ make gfortran
    ADD http://www.mpich.org/static/downloads/3.2/mpich-3.2.tar.gz /usr/local/src/
    RUN cd /usr/local/src/ && \
        tar xf mpich-3.2.tar.gz && \
        cd mpich-3.2 && \
        ./configure && \
        make && make install && \
        cd /usr/local/src && \
        rm -rf mpich-3.2

    ADD helloworld.c /
    RUN mkdir /app && mpicc helloworld.c -o /app/hello

    ENV PATH=/usr/bin:/bin:/app

Going through the above:

1. Base from a common distribution, e.g., ubuntu:14.04.
2. Install compiler tools to get a minimal dev environment.
3. Download, build, and install MPICH 3.2.
4. Add and compile your application.
5. Set up the environment to easily access your application.

To construct the above container, one would do something like::

    docker build -t dmjacobsen/mpitest:latest .

(setting your tag appropriately, of course)

SLURM User Instructions
-----------------------

If the MPICH-ABI environment is configured correctly (see below), it should be
very easy to run the application.  Building from the example above::

    dmj@cori11:~> shifterimg pull dmjacobsen/mpitest:latest
    2016-08-05T01:14:59 Pulling Image: docker:dmjacobsen/mpitest:latest, status: READY
    dmj@cori11:~> salloc --image=dmjacobsen/mpitest:latest -N 4 --exclusive
    salloc: Granted job allocation 2813140
    salloc: Waiting for resource configuration
    salloc: Nodes nid0[2256-2259] are ready for job
    dmj@nid02256:~> srun shifter hello
    hello from 2 of 4 on nid02258
    hello from 0 of 4 on nid02256
    hello from 1 of 4 on nid02257
    hello from 3 of 4 on nid02259
    dmj@nid02256:~> srun -n 128 shifter hello
    hello from 32 of 128 on nid02257
    hello from 46 of 128 on nid02257
    hello from 48 of 128 on nid02257
    hello from 55 of 128 on nid02257
    hello from 57 of 128 on nid02257
    ...
    ...
    hello from 26 of 128 on nid02256
    hello from 27 of 128 on nid02256
    hello from 28 of 128 on nid02256
    hello from 29 of 128 on nid02256
    hello from 30 of 128 on nid02256
    hello from 31 of 128 on nid02256
    dmj@nid02256:~> exit
    salloc: Relinquishing job allocation 2813140
    salloc: Job allocation 2813140 has been revoked.
    dmj@cori11:~>

System Administrator Instructions: Configuring Shifter
-------------------------------------------------------

The basic plan is to gather the libmpi.so* libraries and their symlinks and
copy them into the container at runtime.  This may require some dependencies
to also be copied, but hopefully only the most limited set possible.
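To get a sense of what needs to be gathered, it can help to inspect the
site-provided MPI library and its dependency closure before configuring
anything.  The snippet below is only a rough sketch; the library path is a
hypothetical placeholder and will differ by system and MPI vendor, and the
helper scripts described below automate this for supported systems::

    # Hypothetical location of the site MPI library; adjust for your system.
    SITE_MPI_LIB=/path/to/site/mpich/lib/libmpi.so

    # libmpi.so and its version symlinks: the files to inject into containers
    ls -l "$(dirname "$SITE_MPI_LIB")"/libmpi.so*

    # shared-object dependencies that may also need to be copied
    ldd "$SITE_MPI_LIB" | awk '$3 ~ /^\// {print $3}'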
The current recommendation is to copy these libraries into
/opt/udiImage//lib64, and all of their dependencies into
/opt/udiImage//lib64/dep.  We then use patchelf to rewrite the rpath of all
copied libraries to point to /opt/udiImage//lib64/dep.

The source libraries must be prepared ahead of time using one of the helper
scripts provided in the extras directory, or a variant of same.  As we get
access to different types of systems, we will post more helper scripts and
system-type-specific instructions.

Finally, we need to force LD_LIBRARY_PATH in the container to include
/opt/udiImage//lib64.

Cray
++++

Run the :code:`prep_cray_mpi_libs.py` script to prepare the libraries::

    login$ python /path/to/shifterSource/extra/prep_cray_mpi_libs.py /tmp/craylibs

Note: in CLE5.2 this should be done on an internal login node; in CLE6 either
an internal or external login node should work.  You will need to install
patchelf into your PATH prior to running it (https://nixos.org/patchelf.html).

Next, copy /tmp/craylibs to your Shifter module path (see Modules) under
mpich/lib64, e.g., :code:`/usr/lib/shifter/modules/mpich/lib64`.

Finally, a few modifications need to be made to udiRoot.conf (a consolidated
sketch of these settings appears at the end of this page):

1. add "module_mpich_siteEnvPrepend = LD_LIBRARY_PATH=/opt/udiImage/modules/mpich/lib64"
2. add "module_mpich_copyPath = /usr/lib/shifter/modules/mpich"
3. add "/var/opt/cray/alps:/var/opt/cray/alps:rec" to siteFs
4. if CLE6, add "/etc/opt/cray/wlm_detect:/etc/opt/cray/wlm_detect" to siteFs
5. add "defaultModules = mpich" to load cray-mpich support by default in all
   containers

Note: you may need to modify your sitePreMountHook script to create
/var/opt/cray and /etc/opt/cray prior to the mounts.

Instead of setting up module_mpich_copyPath, you could use siteFs to
bind-mount the content into the container, which may have performance benefits
in some environments, e.g., set
module_mpich_siteFs = /usr/lib/shifter/modules/mpich:/shifter/mpich.  In that
case you will need to adjust the module_mpich_siteEnvPrepend paths and
pre-create the /shifter directory using the sitePreMountHook.

------

Other MPICH variants/vendors are coming soon.  If you have something not
listed here, please contact shifter-hpc@googlegroups.com!

.. [1] https://www.mpich.org/abi/
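For reference, below is a hedged sketch of how the udiRoot.conf additions for
the Cray case described above might look when consolidated.  The module path
follows the example above; comment syntax, separators, and how entries are
appended to an existing siteFs setting should follow the conventions already
used in your local udiRoot.conf::

    # Cray MPICH support via the mpich module (paths follow the example above)
    module_mpich_siteEnvPrepend = LD_LIBRARY_PATH=/opt/udiImage/modules/mpich/lib64
    module_mpich_copyPath = /usr/lib/shifter/modules/mpich

    # load the mpich module in all containers by default
    defaultModules = mpich

    # entries to append to the existing siteFs setting:
    #   /var/opt/cray/alps:/var/opt/cray/alps:rec
    #   /etc/opt/cray/wlm_detect:/etc/opt/cray/wlm_detect   (CLE6 only)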