Worker Node Overview

The Worker Node Client is a collection of useful software components that is expected to be on every OSG worker node. In addition, a job running on a worker node can access a handful of environment variables that can be used to locate resources.

This page describes how to initialize the environment of your job to correctly access the execution and data areas from the worker node.

The OSG provides no scientific software dependencies or software build tools on the worker node; you are expected to bring along all application-level dependencies yourself (preferred; most portable) or utilize CVMFS. Sites are not required to provide any specific tools (gcc, lapack, blas, etc.) beyond the ones in the OSG worker node client and the base OS.
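
For example, a job that relies on CVMFS might verify that the repository it needs is mounted before doing any work. This is a minimal sketch; the repository name is only an example and should be replaced with the one your software actually uses:

# Sketch: abort early if the CVMFS repository this job expects is not mounted
if [ ! -d /cvmfs/oasis.opensciencegrid.org ]; then
    echo "Required CVMFS repository is not available on this worker node" >&2
    exit 1
fi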

If you would like to test the minimal OS environment that jobs can expect, you can test out your scientific software in the OSG Docker image.
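
For example, assuming the image is published on Docker Hub as opensciencegrid/osg-wn (check the OSG documentation for the current image name and tag), you could start an interactive shell in it:

user@host $ docker run --rm -it opensciencegrid/osg-wn:3.6-el8 /bin/bash   # image tag is an assumption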

Common Software Available on Worker Nodes

The OSG worker node environment contains the following software (a brief usage sketch follows this list):

  • The supported set of CA certificates (located in $X509_CERT_DIR after the environment is set up)
  • Proxy management tools:
    • Create proxies: voms-proxy-init and grid-proxy-init
    • Show proxy info: voms-proxy-info and grid-proxy-info
    • Destroy the current proxy: voms-proxy-destroy and grid-proxy-destroy
  • Data transfer tools:
    • HTTP/plain FTP protocol tools (via system dependencies):
      • wget and curl: standard tools for downloading files with HTTP and FTP
    • Transfer clients:
      • GFAL-based client (gfal-copy and others). GFAL supports SRM, GridFTP, and HTTP protocols.
      • Globus GridFTP client (globus-url-copy)
  • MyProxy client tools
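
As an illustration, a typical sequence using the tools listed above might look like the following. The VO name and destination URL are placeholders only:

user@host $ voms-proxy-init -voms myvo          # create a VOMS proxy (myvo is a placeholder VO)
user@host $ voms-proxy-info -all                # inspect the current proxy
user@host $ gfal-copy file:///tmp/output.dat https://storage.example.org/path/output.dat
user@host $ voms-proxy-destroy                  # remove the proxy when finished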

At some sites, these tools may not be available when the pilot starts. To set up the environment, run the following:

user@host $ source $OSG_GRID/setup.sh

This should be done by a pilot job, not by the end-user payload.
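
A pilot might guard this step so that it is a no-op at sites where the setup script is absent (a minimal sketch):

# Sketch: only source the setup script if the site actually provides it
if [ -n "$OSG_GRID" ] && [ -e "$OSG_GRID/setup.sh" ]; then
    source "$OSG_GRID/setup.sh"
fi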

The Worker Node Environment

The following list describes the important directories and locations in the worker node environment. A job running on an OSG worker node can refer to each of them using the corresponding environment variable. Several of them are defined as options in your OSG-Configure .ini files in /etc/osg/config.d; custom variables and those without a listed option may be defined in the Local Settings section.

  • $OSG_GRID (OSG-Configure option: Storage/grid_dir)
    Location of additional environment variables. Pilots should source $OSG_GRID/setup.sh to guarantee that the worker node binaries are in $PATH.
  • $OSG_SQUID_LOCATION (OSG-Configure option: Squid/location)
    Location of an HTTP caching proxy server. Use this service to download files via HTTP in cache-friendly workflows.
  • $OSG_WN_TMP (OSG-Configure option: Storage/worker_node_temp)
    Temporary storage area in which your job(s) run. Local to each worker node (recommended size: 10 GB per job). See the OSG_WN_TMP section below for details.
  • $X509_CERT_DIR (no OSG-Configure option)
    Location of the CA certificates. If not defined, defaults to /etc/grid-security/certificates.
  • $_CONDOR_SCRATCH_DIR (no OSG-Configure option)
    Suggested temporary storage for glideinWMS-based payloads. Users should prefer this variable over $OSG_WN_TMP when running inside glideinWMS.
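
For example, a payload script might pick its working area by preferring $_CONDOR_SCRATCH_DIR, then $TMPDIR, then $OSG_WN_TMP. This is only a sketch; the fallback order is a reasonable convention, and your VO may prescribe a different one:

# Sketch: prefer the glideinWMS scratch directory, then $TMPDIR, then $OSG_WN_TMP
scratch="${_CONDOR_SCRATCH_DIR:-${TMPDIR:-$OSG_WN_TMP}}"
cd "$scratch"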

OSG_WN_TMP

Site administrators are responsible for ensuring that $OSG_WN_TMP is cleaned up. We recommend one of the following solutions:

  • Use common batch-system capabilities to create a temporary, per-job directory that is cleaned up after each job is run.
  • Periodically purge the directory (e.g. tmpwatch).
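
For example, a site could install a cron job along these lines. This is only a sketch; the retention period and path are illustrative and should match your local batch-system layout and policy:

# /etc/cron.daily/clean-osg-wn-tmp (illustrative)
# Remove files under the worker node temp area not modified in the last 240 hours
tmpwatch --mtime 240 /var/scratch/osg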

For VO managers

Note

The following advice applies to VO managers or maintainers of pilot software; end-users should contact their VO for the proper locations to stage temporary work (often, this will be either $TMPDIR or $_CONDOR_SCRATCH_DIR).

Be careful when using $OSG_WN_TMP; at some sites, this directory may be shared with other VOs. We recommend creating a new sub-directory as a precaution:

mkdir -p "$OSG_WN_TMP/MYVO"
# Create a unique per-job directory inside the VO's sub-directory
export mydir=$(mktemp -d "$OSG_WN_TMP/MYVO/job.XXXXXX")
cd "$mydir"
# Run the rest of your application
cd "$OSG_WN_TMP"
rm -rf "$mydir"

The pilot should utilize $TMPDIR to communicate the location of temporary storage to payloads.
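
A minimal pilot-side sketch (the payload launcher name is hypothetical):

export TMPDIR="$mydir"    # advertise the per-job directory to payloads
./run_payload.sh          # hypothetical payload launcher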

Many sites use the batch system to create a separate directory for each user job and change $OSG_WN_TMP on the fly to point to this directory.

There is no way to know in advance how much scratch disk space any given worker node has available; keep in mind that the available disk space may be shared among several job slots.
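
A payload can therefore check at runtime how much space is actually free in its working directory. The following sketch uses the 10 GB per-job recommendation above as its threshold:

# Sketch: fail early if less than ~10 GB is free in the current scratch directory
avail_kb=$(df -Pk . | awk 'NR==2 {print $4}')
if [ "$avail_kb" -lt 10485760 ]; then
    echo "Insufficient scratch space: only ${avail_kb} KB available" >&2
    exit 1
fi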