Detailed Overview

This document outlines the overall installation process for an OSG site and links to detailed installation, configuration, and troubleshooting pages. If you do not see software-related technical documentation listed here, try the search bar at the top or contact us at [email protected].

Plan the Site

If you have not done so already, plan the overall architecture of your OSG site. It is recommended that your plan be sufficiently detailed to include the OSG hosts that are needed and the main software components for each host. Be sure to consider the operating systems that OSG supports. For example, a basic site might include:

Purpose | Host | Major Software
Compute Element (CE) | | OSG CE, HTCondor Central Manager, etc. (osg-ce-condor)
Worker Nodes | | OSG worker node client (osg-wn-client)
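
A site plan like the one above can also be kept as a small data structure that scripts or configuration management can read. The following is a minimal sketch, not an OSG tool: the class name, the function name, and the hostnames under example.edu are hypothetical placeholders, and only the package names come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlannedHost:
    purpose: str
    hostname: str    # hypothetical placeholder, not taken from the OSG docs
    packages: tuple  # top-level packages planned for this role


# Hostnames are illustrative only; the package names come from the table above.
SITE_PLAN = [
    PlannedHost("Compute Element (CE)", "ce.example.edu", ("osg-ce-condor",)),
    PlannedHost("Worker Nodes", "wn001.example.edu", ("osg-wn-client",)),
]


def packages_for(hostname, plan=SITE_PLAN):
    """Return the packages planned for a host; raise KeyError if it is unplanned."""
    for host in plan:
        if host.hostname == hostname:
            return host.packages
    raise KeyError(hostname)
```

Calling packages_for("ce.example.edu") returns the packages planned for the CE host, and an unplanned hostname raises KeyError, which surfaces hosts that were provisioned without being added to the plan.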

Prepare the Batch System

These instructions assume that your site already has a batch system. Currently, we support the HTCondor, LSF, PBS and Torque, SGE, and Slurm batch systems.

For smaller sites (fewer than 50 worker nodes), the most common way to add a site to OSG is to install the OSG Compute Element (CE) on the central host of your batch system. At such a site, especially if you have minimal time to maintain a CE, you may want to contact us to ask about using an OSG-hosted CE instead of running your own. Before proceeding with an install, be sure that you can submit and successfully run a job from your OSG CE host into your batch system.
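
As one way to check that a job can run from the CE host, the sketch below generates a minimal HTCondor submit description for a trivial test job. It assumes that your batch system is HTCondor and that /bin/hostname exists on the worker nodes; the function name and the file name tag are hypothetical. Sites using LSF, PBS, Torque, SGE, or Slurm would use that system's own submission format instead.

```python
def htcondor_test_submit(executable="/bin/hostname", tag="osg-smoke"):
    """Return the text of a minimal HTCondor submit description.

    The job runs an executable that is already present on the worker
    node, so nothing needs to be transferred with the job.
    """
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        "transfer_executable = false",
        f"output = {tag}.out",  # stdout of the job
        f"error = {tag}.err",   # stderr of the job
        f"log = {tag}.log",     # HTCondor's event log for the job
        "queue 1",
    ]
    return "\n".join(lines) + "\n"
```

Writing the returned text to a file and passing that file to condor_submit on the CE host queues a single job. If the output file later contains a worker node's hostname, the path from the CE host into the batch system works.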

Add OSG Software

If necessary, provision all OSG hosts that are in your site plan that do not exist yet. The general steps to installing an OSG site are:

  1. Install OSG Yum Repos and the Compute Element software on your CE host.
  2. Install the Worker Node client on your worker nodes.
  3. Install optional software to increase the capabilities of your site.


For sites with more than a handful of worker nodes, we recommend using a configuration management tool to install, configure, and maintain your site. Explaining how to select and use such a tool is beyond the scope of OSG’s documentation, but popular choices include Puppet, Chef, Ansible, and CFEngine.

General Installation Instructions

Installing and Managing Certificates for Site Security

Installing and Configuring the Compute Element

Adding OSG Software to Worker Nodes

Installing and Configuring Other Services

All of these node types and their services are optional, although OSG requires an HTTP caching service if you have installed CVMFS on your worker nodes.

Verify OSG Software

Before receiving real OSG work, your site needs to successfully run test jobs from our GlideinWMS factory and report usage to the GRACC.

If you haven't already, register any publicly facing resources with OSG software installed, including HTCondor-CE, Frontier Squid, GridFTP, and/or XRootD.

Test locally

It is useful to test manual submission of jobs from inside and outside of your site through your CE to your batch system. If this process does not work manually, it will probably not work for the GlideinWMS pilot factory either.
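
One way to script such a manual test is sketched below. It assumes that your CE is an HTCondor-CE and that the condor_ce_trace tool is installed on the host you submit from; the Python function names and the example hostname are hypothetical. Other CE or batch system setups would need a different test command.

```python
import shutil
import subprocess


def trace_command(ce_fqdn, debug=True):
    """Build the argument list for an end-to-end test job through the CE."""
    argv = ["condor_ce_trace"]
    if debug:
        argv.append("--debug")
    argv.append(ce_fqdn)
    return argv


def run_trace(ce_fqdn):
    """Run the trace if the tool is installed; return True on exit status 0."""
    if shutil.which("condor_ce_trace") is None:
        raise FileNotFoundError("condor_ce_trace not found on PATH")
    return subprocess.run(trace_command(ce_fqdn)).returncode == 0
```

Running run_trace("ce.example.edu") once from a host inside the site and once from a host outside it covers both directions mentioned above. Valid credentials for the CE are assumed to be in place already.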

Get test jobs

To begin running pilots at your site, e-mail the GlideinWMS factory team and ask for test pilots. Please provide them with the following information:

  • The fully qualified domain name of the CE
  • Resource name
  • Supported OS version of your worker nodes (e.g., EL6, EL7, or both)
  • Support for multicore jobs
  • Maximum job walltime
  • Maximum job memory usage
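
The items listed above can be collected in a small record before writing the request, which makes it harder to forget one. This is only an illustrative sketch: the class, its field names, and the units (hours and MB) are assumptions, not a format that the factory team requires.

```python
from dataclasses import dataclass


@dataclass
class PilotRequest:
    ce_fqdn: str
    resource_name: str
    os_versions: list        # e.g. ["EL7"] or ["EL6", "EL7"]
    multicore: bool
    max_walltime_hours: int  # unit is an assumption for this sketch
    max_memory_mb: int       # unit is an assumption for this sketch

    def as_email_body(self):
        """Render the request as plain text, one item per line."""
        return "\n".join([
            f"CE FQDN: {self.ce_fqdn}",
            f"Resource name: {self.resource_name}",
            f"Worker node OS: {', '.join(self.os_versions)}",
            f"Multicore jobs supported: {'yes' if self.multicore else 'no'}",
            f"Maximum job walltime: {self.max_walltime_hours} hours",
            f"Maximum job memory: {self.max_memory_mb} MB",
        ])
```

For example, PilotRequest("ce.example.edu", "EXAMPLE_CE", ["EL7"], True, 48, 2048).as_email_body() produces six lines, one per item in the list, using placeholder values.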

Once the factory team has enough information, they will start submitting pilots from the test factory to your CE. Initially, this will be one pilot at a time but once the factory verifies that pilot jobs are running successfully, that number will be ramped up to 10, then 100.

Verify reporting and monitoring

To verify that your site is correctly reporting to the OSG, check OSG's Accounting Portal for records of your site reports (select your site from the drop-down box). If you have enabled the OSG VO, you can also check

Scale Up to Full Production

After your site has successfully run all of the pilot jobs submitted by the test factory and you have verified your site reports, the site will be deemed production ready. No action is required on your end; factory operations will start submitting pilot jobs from the production factory.

Maintain the Site

To avoid potential issues with OSG job submissions, please notify us of major changes to your site, including:

  • Major OS version changes on the worker nodes (e.g., upgrading from EL6 to EL7)
  • Adding or removing container support
  • Policy changes regarding maximum walltime or memory usage
  • Scheduled or unscheduled downtimes
  • Site topology changes such as additions, modifications, or retirements
  • Changes to site contacts, such as administrative or security staff

It is also important to keep your software and data (e.g., CA and VO client) up-to-date with the latest OSG release. To stay abreast of software releases, we recommend subscribing to the mailing list.

Get Help

If you need help with your site, or need to report a security incident, follow the contact instructions.