Open Science Pool Containers
In order to share compute resources with the Open Science pool, sites can launch pilot jobs directly by starting an OSG-provided Docker container. The container includes a simple worker node environment and an embedded pilot; when combined with an OSG-provided authentication token (not included in the container), the pilot can connect to the Open Science pool and start executing jobs.
This technique is useful to implement backfill at a site: contributing computing resources when they would otherwise be idle. It does not allow the site to share resources between multiple pools and, if there are no matching idle jobs in the Open Science pool, the pilots may remain idle.
In order to configure the container, you will need:
- A registered resource in OSG Topology; resource registration allows OSG to do proper usage accounting and maintain contacts in case of security incidents.
- An authentication token from the OSG. Please contact OSG support to request a token for your site.
- An HTTP caching proxy ("squid server") at or near your site.
Running the Container with Docker
The Docker image is hosted on Docker Hub and requires several environment variables to be set in order to function correctly:
- Set the TOKEN environment variable to the authentication token you received from OSG.
- Set the GLIDEIN_Site and GLIDEIN_ResourceName environment variables to match the site name and resource name you registered in Topology, respectively.
- Set the OSG_SQUID_LOCATION environment variable to the HTTP address of your preferred squid instance.
- Optional: Some sites prefer that job I/O is done in a specific temporary directory on the host instead of inside the container. To do this, map the appropriate host directory to /pilot inside the container. If you are using Docker to launch the container, this is done with the -v command line flag, in the form -v <host directory>:/pilot.
- Optional: Set the GLIDEIN_Start_Extra environment variable to an expression that is appended to the HTCondor START expression; this limits the pilot to running only certain jobs.
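To make the two optional settings above more concrete, here is a minimal sketch of how they might be turned into docker run arguments. The helper name optional_pilot_args, the host directory /scratch/osg-pilot, and the 4096 MiB limit are all hypothetical choices made for illustration. RequestMemory is the standard HTCondor job attribute, but exactly how the pilot combines the expression with its own START policy is not described here, so treat the expression as an assumption to verify.

```shell
#!/bin/sh
# Sketch: print the extra `docker run` arguments for the two optional settings.
# Arguments (both illustrative, not taken from the OSG documentation):
#   $1  host directory to use for job I/O (mapped to /pilot in the container)
#   $2  largest memory request, in MiB, of jobs the pilot should accept
optional_pilot_args() {
    host_dir=$1
    max_memory_mib=$2
    # -v maps the host directory onto /pilot inside the container.
    printf '%s\n' "-v" "${host_dir}:/pilot"
    # GLIDEIN_Start_Extra carries a ClassAd expression; TARGET refers to the job.
    printf '%s\n' "-e" "GLIDEIN_Start_Extra=TARGET.RequestMemory <= ${max_memory_mib}"
}

# Print one argument per line for a hypothetical scratch directory and limit.
optional_pilot_args /scratch/osg-pilot 4096
```

Each printed line is one argument that could be added to the docker run command line alongside the required settings.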
Here is an example invocation using docker run by hand:
    docker run -it --rm --user osg \
        --cap-add=DAC_OVERRIDE --cap-add=SETUID --cap-add=SETGID \
        --cap-add=SYS_ADMIN --cap-add=SYS_CHROOT --cap-add=SYS_PTRACE \
        --cap-add=CAP_DAC_READ_SEARCH \
        -v /cvmfs:/cvmfs:shared \
        -e TOKEN="..." \
        -e GLIDEIN_Site="..." \
        -e GLIDEIN_ResourceName="..." \
        -e GLIDEIN_Start_Extra="True" \
        -e OSG_SQUID_LOCATION="..." \
        opensciencegrid/osgvo-docker-pilot:release
Note that the additional capabilities requested in the docker run command above allow the pilot container to launch containers for user jobs; this lets users run their jobs in a container that is different from the pilot's.
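Since the container needs the required settings described earlier in order to function, it can be convenient to check them before launching. The following is a minimal sketch of such a check, assuming the settings are held in shell variables of the same names; check_pilot_env is a hypothetical helper and is not part of the OSG-provided image.

```shell
#!/bin/sh
# Sketch: report which required pilot settings are missing from the environment.
# check_pilot_env is a hypothetical helper, not something shipped by OSG.
check_pilot_env() {
    missing=""
    for name in TOKEN GLIDEIN_Site GLIDEIN_ResourceName OSG_SQUID_LOCATION; do
        # Look up the variable whose name is stored in $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing required settings:$missing" >&2
        return 1
    fi
    return 0
}

# Only report success when every required setting is non-empty.
if check_pilot_env; then
    echo "all required settings present"
fi
```

A site launch script could call check_pilot_env first and only proceed to the docker run invocation when it succeeds.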