Frequently Asked Questions
This page provides details on troubleshooting perfSONAR installations for OSG and WLCG, along with some additional configuration options and a FAQ.
A good overview of the tools provided by the perfSONAR toolkit, with examples of how to use them to identify and isolate network problems, can be found at https://fasterdata.es.net/performance-testing/troubleshooting/network-troubleshooting-quick-reference-guide/
We are maintaining a Network Troubleshooting page to guide users in identifying and following up on network problems.
- I suspect there is a network performance issue impacting my site
For OSG sites, please open a ticket with GOC. Otherwise, please open a GGUS ticket (or assign an existing one) to the WLCG Network Throughput support unit.
- I got an email after registering with lots of information in it... what do I do?
This is part of the process. If you are a new site you will need to attend the next OSG operations meeting. If you are an existing site and have just registered perfSONAR instances you don't have to do anything but feel free to attend the next operations meeting if you have questions or concerns.
- Once I registered, new tickets were opened concerning perfSONAR... what do I do?
This is standard operating procedure; the tickets ensure that OSG operations registers your new perfSONAR instances properly. You don't have to do anything, and OSG operations staff will close the tickets.
Infrastructure Monitoring (check_mk metrics)
- perfSONAR services: versions metric is failing.
This metric checks whether your sonar is running the most recent version. Please check that automatic yum updates are enabled; this is strongly recommended because of security issues we have seen in the past. If you are still running an older version (3.3-3.5), please update and reconfigure as soon as possible, following the Installation Guide.
- perfSONAR configuration: contacts or location metrics are failing
Please check if you have added the administrative information as detailed here
- perfSONAR services: bwctl/owamp/pscheduler metrics are failing
This means that we are unable to connect to the controller ports of the respective services. Please ensure that your firewall settings are correct (in particular, that the whitelisted subnets are allowed), as described in the Installation Guide. This can also indicate that the service daemons have failed; please check http://docs.perfsonar.net/FAQ.html for additional details.
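As a first check of the controller ports, the following minimal Python sketch tests whether a TCP connection can be opened to each service. The port numbers in `DEFAULT_PORTS` are assumptions based on commonly used defaults (OWAMP control on 861, BWCTL on 4823, pScheduler over HTTPS on 443); confirm them against your own configuration. The function names are illustrative only and are not part of perfSONAR.

```python
import socket

# Assumed default control ports; adjust to match your own configuration.
DEFAULT_PORTS = {"owamp": 861, "bwctl": 4823, "pscheduler": 443}


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_services(host, ports=DEFAULT_PORTS, timeout=5.0):
    """Map each service name to True/False depending on port reachability."""
    return {name: port_is_open(host, port, timeout) for name, port in ports.items()}
```

Calling `check_services("ps.example.org")` (a placeholder hostname) returns a dictionary such as `{"owamp": ..., "bwctl": ..., "pscheduler": ...}`. A check run on the toolkit host itself only shows that the daemons are listening; to exercise the firewall rules, run it from a machine outside your site.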
- perfSONAR services: esmond metric is failing
This means that your measurement archive is not accessible or is failing. There are many possible causes (disk full, httpd not running or inaccessible, etc.). You can ask for help by opening a GGUS ticket to WLCG perfSONAR support.
- perfSONAR json summary is failing
This means the toolkit's homepage is inaccessible. The homepage is required to check many additional services, so all the other metrics will likely be in an unknown or critical state as well. Please check for the usual causes (disk full, httpd not running or blocked); we need to be able to access your homepage via HTTP or HTTPS.
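Two of the usual causes, a full disk and an unreachable web server, can be checked quickly with a short script. The sketch below is only an illustration using the Python standard library; the function names are hypothetical, and the URL you pass in is whatever address your toolkit homepage is served at.

```python
import shutil
import urllib.error
import urllib.request


def disk_percent_used(path="/"):
    """Return the percentage of the filesystem containing path that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def homepage_status(url, timeout=10.0):
    """Return the HTTP status code for url, or None if no response was received."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (e.g. 403 or 500).
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or TLS verification failure.
        return None
```

A result of `None` from `homepage_status` means that no HTTP response arrived at all, which points to httpd being down or blocked. Note that an HTTPS URL with a self-signed certificate also yields `None` here, because certificate verification fails; trying the plain HTTP URL as well helps to tell the two cases apart.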
- perfSONAR configuration: meshes metric is failing
This indicates that you are missing the recommended mesh configuration. Please follow the mesh configuration steps detailed in the installation guide. Also, please REMOVE any old mesh configuration: this metric will also fail if you have both the new mesh config and the old mesh URLs.
- perfSONAR services: ntp is failing
This indicates that the NTP service is not running correctly on your toolkit instance. Please note that NTP is a critical service. Some things to check include your perfSONAR NTP configuration. If NTP is correctly configured, you may have a firewall issue: UDP port 123 must be open. NTP debugging information is available online (e.g., https://support.ntp.org/bin/view/Support/TroubleshootingNTP). If you still have problems, please open a support ticket (see below).
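To test whether NTP traffic on UDP port 123 gets through the firewall, one option is to send a single SNTP client request and see whether a reply arrives. The sketch below is a minimal illustration and not a replacement for the NTP daemon's own diagnostics; the function names are hypothetical, it uses IPv4 only, and the server you query should be one of those listed in your own NTP configuration.

```python
import socket
import struct

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_OFFSET = 2208988800


def build_request():
    """Build a 48-byte SNTP client request (LI=0, version 3, mode 3)."""
    return b"\x1b" + 47 * b"\x00"


def parse_transmit_time(packet):
    """Extract the server transmit timestamp as Unix time (float seconds)."""
    if len(packet) < 48:
        raise ValueError("NTP packet shorter than 48 bytes")
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def query_ntp(server, timeout=5.0):
    """Return the server's time as Unix seconds, or None if no valid reply came."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_request(), (server, 123))
            data, _ = sock.recvfrom(512)
            return parse_transmit_time(data)
        except (OSError, ValueError):
            # Timeout, unreachable host, DNS failure, or malformed reply.
            return None
```

If `query_ntp` returns `None` for every configured server, UDP port 123 is probably blocked somewhere along the path. If it returns a timestamp close to `time.time()`, the network path is fine and the problem is more likely in the local NTP configuration.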
- perfSONAR services: regular testing/pscheduler is failing
This indicates that pscheduler is not working correctly. As this is the core daemon, please contact the WLCG perfSONAR support unit for help.
- There are many tests failing for a given sonar; where should I start?
Please update and reconfigure your sonar following the Installation Guide. Please ensure that your firewall doesn't block access from the whitelisted subnets that are required for the infrastructure monitoring to work.
- Where can I get support on managing WLCG perfSONAR?
You can open a ticket in GGUS to the WLCG perfSONAR support unit or contact wlcg-perfsonar-support (at cern.ch) directly.
- perfSONAR esmond freshness Latency/Bandwidth Direct is failing or gives warning
This metric checks the freshness of the local measurement archive; in particular, it checks whether the archive contains fresh results for all the configured tests. This metric is needed to determine whether we are able to consistently get results from perfSONAR boxes in WLCG. It is currently a non-critical test, so you can ignore it.
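The idea behind a freshness check can be illustrated with a few lines of Python. This is only a conceptual sketch, not the logic the monitoring actually runs: the function name, the input format (a mapping from test name to the Unix time of its most recent result), and the age threshold are all assumptions made for illustration.

```python
import time


def stale_tests(last_result_times, max_age_seconds, now=None):
    """Return the sorted names of tests whose latest result is missing or too old.

    last_result_times maps a test name to the Unix time of its most recent
    result, or to None if the archive holds no result for that test.
    """
    now = time.time() if now is None else now
    return sorted(
        name
        for name, ts in last_result_times.items()
        if ts is None or now - ts > max_age_seconds
    )
```

For example, with a one-hour threshold, a test whose last result is two hours old, or one with no results at all, would be reported as stale, while a test that produced a result ten minutes ago would not.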
- perfSONAR services ndt/npad is failing
Both metrics check whether you have disabled NDT and NPAD. As both NDT and NPAD have been dropped starting with 4.0, these metrics should stay green in most cases.
- perfSONAR hardware check is failing
Please consult the minimum and recommended hardware requirements.