Installing and Maintaining HTCondor-CE¶
The HTCondor-CE software is a job gateway for an OSG Compute Entrypoint (CE). As such, the OSG submits resource allocation requests (RARs) to your HTCondor-CE, which handles authorization and delegation of the RARs to your local batch system. In OSG today, RARs are sent to CEs as pilot jobs from a factory; once running, these pilots accept and run end-user jobs. See the upstream documentation for a more detailed introduction.
Use this page to learn how to install, configure, run, test, and troubleshoot an OSG HTCondor-CE.
OSG Hosted CE
Unless you plan on running more than 10k concurrent RARs or making frequent configuration changes, we suggest requesting an OSG Hosted CE.
If you are installing an HTCondor-CE for use outside of the OSG, consult the upstream documentation instead.
- User IDs: If they do not exist already, the installation will create the Linux users condor (UID 4716) and
- SSL certificate: The HTCondor-CE service uses a host certificate at /etc/grid-security/hostcert.pem and an accompanying key at
- DNS entries: Forward and reverse DNS must resolve for the HTCondor-CE host
- Network ports: The pilot factories must be able to contact your HTCondor-CE service on port 9619 (TCP)
- Access point/login node: HTCondor-CE should be installed on a host that already has the ability to submit jobs into your local cluster
- File Systems: Non-HTCondor batch systems require a shared file system between the HTCondor-CE host and the batch system worker nodes.
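Several of the prerequisites above can be spot-checked from a shell before installing. The following is a minimal sketch, not an official OSG tool: the function names are hypothetical, and it assumes `bash`, `getent`, `awk`, and `timeout` are available. Only the certificate path and the port number are taken from the list above.

```shell
#!/bin/bash
# Hypothetical helper functions for checking HTCondor-CE prerequisites.

# Succeeds if the given file (e.g. the host certificate) exists and is readable.
check_cert() {
    [ -r "$1" ]
}

# Succeeds if forward DNS for the given host name returns an address
# and reverse DNS for that address returns an entry.
check_dns() {
    local name="$1" addr
    addr=$(getent hosts "$name" | awk '{print $1; exit}')
    [ -n "$addr" ] && getent hosts "$addr" >/dev/null
}

# Succeeds if a TCP connection to host:port can be opened within 5 seconds.
# Uses bash's built-in /dev/tcp redirection, so no extra tools are needed.
check_port() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Example usage (run manually):
#   check_cert /etc/grid-security/hostcert.pem || echo "host certificate missing"
#   check_dns "$(hostname -f)"                 || echo "DNS does not resolve both ways"
#   check_port ce.example.org 9619             || echo "port 9619 is not reachable"
```

The host name `ce.example.org` is a placeholder. The port check is only meaningful when run from a machine outside your site's firewall and after the CE service has been started; the key file can be checked with `check_cert` in the same way as the certificate.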
As with all OSG software installations, there are some one-time (per host) steps to prepare in advance:
- Ensure the host has a supported operating system
- Obtain root access to the host
- Install CA certificates
Choosing the OSG Yum Repository¶
Before considering OSG 3.6…
Due to potentially disruptive changes in protocols, contact your virtual organization(s) (VOs) to verify that they support token-based authentication and/or HTTP-based data transfer before considering an upgrade to OSG 3.6. If your VO(s) don't support these new protocols or you don't know which protocols your VO(s) support, install or remain on the OSG 3.5 release series.
The OSG distributes different versions of HTCondor-CE and HTCondor in separate YUM repositories. Most notably, the repository that you choose will determine the types of credentials that your CE is able to accept. Use the following table to decide which OSG YUM repository to install HTCondor-CE from:
|YUM Repository|Bearer Tokens|GSI and VOMS|
|---|---|---|
|OSG 3.5 upcoming (recommended): HTCondor-CE 5, HTCondor 9.0|✅|✅|
|OSG 3.5 release: HTCondor-CE 4, HTCondor 8.8| |✅|
|OSG 3.6 release: HTCondor-CE 5, HTCondor 9.0|✅| |
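The repository-to-credential mapping in the table can be written as a small lookup, which may be handy in site provisioning scripts. This is only an illustrative sketch: the function name and the repository labels (`3.5-upcoming`, `3.5-release`, `3.6-release`) are made up for this example and are not names used by yum or OSG tooling.

```shell
#!/bin/bash
# Hypothetical helper: print the credential types a CE can accept,
# given a label for the OSG repository it was installed from.
accepted_credentials() {
    case "$1" in
        3.5-upcoming) echo "bearer-tokens gsi-voms" ;;
        3.5-release)  echo "gsi-voms" ;;
        3.6-release)  echo "bearer-tokens" ;;
        *) echo "unknown repository: $1" >&2; return 1 ;;
    esac
}

# Example usage (run manually):
#   accepted_credentials 3.5-upcoming
```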
An HTCondor-CE installation consists of the job gateway (i.e., the HTCondor-CE job router) and other support software (e.g., osg-configure, a Gratia probe for OSG accounting).
To simplify installation, OSG provides convenience RPMs that install all required software.
Clean yum cache:
root@host # yum clean all --enablerepo=*
root@host # yum update
This command will update all packages.
(Optional) If your batch system is already installed via non-RPM means and is in the following list, install the appropriate 'empty' RPM. Otherwise, skip to the next step.
|If your batch system is…|Then run the following command…|
|---|---|
|HTCondor|yum install empty-condor --enablerepo=osg-empty|
|Slurm|yum install empty-slurm --enablerepo=osg-empty|
(Optional) If your HTCondor batch system is already installed via non-OSG RPM means, add the line below to /etc/yum.repos.d/osg.repo. Otherwise, skip to the next step.
Select the appropriate convenience RPM:
If your batch system is... Then use the following package... HTCondor
Install the CE software, where <PACKAGE> is the package you selected in the step above:
There are a few required configuration steps to connect HTCondor-CE with your batch system and authentication method. For more advanced configuration, see the section on optional configurations.
Configuring the local batch system¶
To configure HTCondor-CE to integrate with your local batch system, please refer to the upstream documentation based on your installed version of HTCondor-CE:
Depending on the OSG repository from which you have installed HTCondor-CE, you can allow pilot job submission to your CE based on X.509 proxies (i.e., GSI and VOMS), bearer tokens, or both.
GSI and VOMS (OSG 3.5 only)¶
To configure which VOs and users are authorized to submit pilot jobs to your HTCondor-CE, follow the instructions in the LCMAPS VOMS plugin document.
Bearer Tokens (OSG 3.5 upcoming, OSG 3.6)¶
To configure which VOs are authorized to submit pilot jobs to your HTCondor-CE, consult the "SciTokens" section of the upstream documentation.
The OSG CE metapackage brings along a configuration tool, osg-configure, that is designed to automatically configure the different pieces of software required for an OSG HTCondor-CE:
Enable your batch system in the HTCondor-CE configuration by editing the enabled field in /etc/osg/config.d/20-<YOUR BATCH SYSTEM>.ini:
enabled = True
Read through the other .ini files in the
Validate the configuration settings
root@host # osg-configure -v
Fix any errors (at least) that
Once the validation command succeeds without errors, apply the configuration settings:
root@host # osg-configure -c
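The enable, validate, and apply steps above could be scripted roughly as follows. This is a sketch under stated assumptions rather than a supported workflow: the function names are hypothetical, GNU `sed` is assumed, and the `.ini` file is assumed to already contain a line of the form `enabled = ...`. Only the `-v` and `-c` options of `osg-configure` shown above are used.

```shell
#!/bin/bash
# Hypothetical helpers wrapping the osg-configure steps.

# Rewrite every "enabled = ..." line in the given .ini file to "enabled = True".
# Review the file first if it has more than one section with an "enabled" key.
enable_batch_system() {
    sed -i -E 's/^[[:space:]]*enabled[[:space:]]*=.*/enabled = True/' "$1"
}

# Validate the configuration and apply it only if validation succeeds.
# The tool name is a parameter only so that the function can be dry-run
# with a stand-in command; it defaults to osg-configure.
validate_and_apply() {
    local tool="${1:-osg-configure}"
    "$tool" -v && "$tool" -c
}

# Example usage (run manually as root):
#   enable_batch_system "/etc/osg/config.d/20-<YOUR BATCH SYSTEM>.ini"
#   validate_and_apply
```

Chaining the two commands with `&&` means the configuration is never applied while validation is still failing, which matches the order of the steps above.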
In addition to the configurations above, you may need to further configure how pilot jobs are filtered and transformed before they are submitted to your local batch system or otherwise change the behavior of your CE. For detailed instructions, please refer to the upstream documentation based on your installed version of HTCondor-CE:
- HTCondor-CE 5
- HTCondor-CE 4
Accounting with multiple CEs or local user jobs¶
For non-HTCondor batch systems only
If your site has multiple CEs or you have non-grid users submitting to the same local batch system, the OSG accounting software needs to be configured so that it doesn't over-report the number of jobs.
Determine which file you need to modify
For OSG 3.5 installations, use the following table:
If your batch system is… Then edit the following file on each of your CE(s)… LSF
For OSG 3.6 installations, you'll need to modify
Edit the value of SuppressNoDNRecords on each of your CEs so that it reads:
Starting and Validating HTCondor-CE¶
For information on how to start and validate the core HTCondor-CE services, please refer to the upstream documentation based on your installed version of HTCondor-CE:
Enabling OSG accounting (OSG 3.5 only)¶
In addition to the core HTCondor-CE services, an OSG 3.5 HTCondor-CE must also start and enable the accounting service, gratia-probes-cron:
root@host # systemctl start gratia-probes-cron
root@host # systemctl enable gratia-probes-cron
In OSG 3.6, OSG accounting is managed directly by HTCondor-CE (see the update instructions for more details).
For information on how to troubleshoot your HTCondor-CE, please refer to the upstream documentation based on your installed version of HTCondor-CE:
- HTCondor-CE 5
- HTCondor-CE 4
Registering the CE¶
To contribute to the OSG Production Grid, your CE must be registered with the OSG. To register your resource:
Identify the facility, site, and resource group where your HTCondor-CE is hosted. For example, the Center for High Throughput Computing at the University of Wisconsin-Madison uses the following information:
Facility: University of Wisconsin
Site: CHTC
Resource Group: CHTC
To get assistance, please use this page.