Tuesday Exercise 3.3: Using Wrapper Scripts to Submit Jobs
In this exercise, you will create a wrapper script to run the same program (blastx) as in the previous exercise.
Wrapper scripts are a useful tool for running software that can't be compiled into a single file, that needs to be installed with every job, or that needs extra steps run around it. A wrapper script can either install the software from source code or use software that has already been built (as in this exercise). This portability technique works with almost any kind of software that can be installed locally, and it gives you a great deal of control and flexibility over what happens within your job. Once you can write a script to handle your software (and often your data as well), you can submit a large variety of workflows to a distributed computing system like the Open Science Grid.
Wrapper Script, part 1
Our wrapper script will be a bash script that runs several commands.
In the same directory as the last exercise (still logged into training.osgconnect.net), make a file called run_blast.sh.
The first line we'll place in the script is the basic command for running blast. Based on our previous submit file, what command needs to go into the script? Once you have an idea, check against the example below:
#!/bin/bash
ncbi-blast-2.9.0+/bin/blastx -db pdbaa/pdbaa -query mouse.fa -out results.txt
The "header" of #!/bin/bash will tell the computer that this is a bash shell script and can be run in the same way that you would run individual commands on the command line.
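To see what the shebang line does in practice, the sketch below runs a tiny stand-in script two ways. The script name hello_wrapper.sh and its message are made up for illustration and are not part of the exercise; the sketch works in a scratch directory so your exercise files are untouched.

```shell
# Work in a scratch directory so nothing in the exercise directory changes.
workdir=$(mktemp -d)
cd "$workdir"

# Write a two-line stand-in script: the shebang, then one ordinary command.
# (hello_wrapper.sh is a hypothetical name used only for this sketch.)
cat > hello_wrapper.sh <<'EOF'
#!/bin/bash
echo "running inside the wrapper"
EOF

# Option 1: hand the script to bash explicitly. No execute permission needed.
out_via_bash=$(bash hello_wrapper.sh)

# Option 2: mark it executable and run it directly. The system reads the
# shebang line to decide which interpreter should run the file.
chmod u+x hello_wrapper.sh
out_direct=$(./hello_wrapper.sh)

echo "$out_via_bash"
echo "$out_direct"
```

Both invocations run the same commands; the shebang only matters for the second one, where no interpreter is named on the command line.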
Submit File Changes
We now need to make some changes to our submit file.
Make a copy of your previous submit file and open it.
Since we are now using a wrapper script, that will be our job's executable. Replace the original blastx executable with the name of our wrapper script and comment out the arguments line.
executable = run_blast.sh
#arguments =
Note that since the blastx program is no longer listed as the executable, it will need to be included in transfer_input_files. Instead of transferring just that program, we will transfer the original downloaded tar.gz file. For efficiency, we'll also transfer the pdbaa database as the original tar.gz file instead of as the unzipped folder:
transfer_input_files = pdbaa.tar.gz, mouse.fa, ncbi-blast-2.9.0+-x64-linux.tar.gz
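If you prefer to script these edits rather than make them by hand, they can be sketched with sed. Everything about the starting file here is an assumption: the names blast.sub and blast_wrapper.sub are hypothetical, and the stand-in contents only approximate what a previous submit file might contain. Only the three resulting lines come from this exercise.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for the previous exercise's submit file.
# ASSUMPTION: the file name and these contents are illustrative only.
cat > blast.sub <<'EOF'
executable = ncbi-blast-2.9.0+/bin/blastx
arguments = -db pdbaa/pdbaa -query mouse.fa -out results.txt
transfer_input_files = pdbaa, mouse.fa
queue
EOF

# Write an edited copy; the original file is left untouched.
#  1. point the executable at the wrapper script
#  2. comment out the arguments line
#  3. transfer the two tar.gz files plus the query file
sed \
    -e 's|^executable *=.*|executable = run_blast.sh|' \
    -e 's|^arguments *=|#arguments =|' \
    -e 's|^transfer_input_files *=.*|transfer_input_files = pdbaa.tar.gz, mouse.fa, ncbi-blast-2.9.0+-x64-linux.tar.gz|' \
    blast.sub > blast_wrapper.sub

cat blast_wrapper.sub
```

Writing to a new file rather than editing in place keeps the earlier submit file available for comparison.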
If you really want to be on top of things, look at the log file for the last exercise, and update your memory and disk requests to be just slightly above the actual "Usage" values in the log.
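As a rough sketch of "just slightly above", the snippet below adds about 20% headroom to a usage figure and rounds up. The 20% margin and both usage numbers are placeholders chosen for illustration, not values from this exercise; substitute the "Usage" values from your own log file.

```shell
# PLACEHOLDER usage figures: replace with the numbers from your own log.
memory_usage_mb=100
disk_usage_kb=200000

# Add roughly 20% headroom with integer arithmetic, rounding up.
# (The 20% margin is an assumption; pick whatever margin you are comfortable with.)
headroom() {
    echo $(( ($1 * 120 + 99) / 100 ))
}

request_memory_mb=$(headroom "$memory_usage_mb")
request_disk_kb=$(headroom "$disk_usage_kb")

# Print lines in the form used in a submit file.
echo "request_memory = ${request_memory_mb}MB"
echo "request_disk = ${request_disk_kb}KB"
```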
Before submitting, make sure to make the additional changes to the wrapper script described below!
Wrapper Script, part 2
Now that our database and BLAST software are being transferred to the job as tar.gz files, our script needs to accommodate this.
In the run_blast.sh script, add two commands at the start (after the #!/bin/bash line) to un-tar the BLAST and pdbaa tar.gz files. See the previous exercise if you're not sure what this command looks like.
In order to distinguish this job from our previous job, change the output file name to something besides results.txt.
The completed script run_blast.sh should look like this:
#!/bin/bash
tar -xzf ncbi-blast-2.9.0+-x64-linux.tar.gz
tar -xzf pdbaa.tar.gz
ncbi-blast-2.9.0+/bin/blastx -db pdbaa/pdbaa -query mouse.fa -out results2.txt
While not strictly necessary, it's a good idea to enable executable permissions on the wrapper script, like so:
$ chmod u+x run_blast.sh
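Before submitting, it can help to rehearse the unpack-then-run pattern locally. The sketch below does this with a tiny stand-in tarball instead of the real BLAST and pdbaa files, so it runs anywhere. The names demo-tool, run_demo.sh, and results_demo.txt are hypothetical, and the `set -e` line is an optional addition that is not part of the exercise script.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Build a stand-in "software" tarball: a directory holding one small program.
mkdir -p demo-tool/bin
cat > demo-tool/bin/demo <<'EOF'
#!/bin/bash
# Stand-in program. Usage: demo -out FILE
echo "stand-in output" > "$2"
EOF
chmod u+x demo-tool/bin/demo
tar -czf demo-tool.tar.gz demo-tool

# Remove the unpacked copy so only the tarball remains, mimicking a job
# that receives the tar.gz file through transfer_input_files.
rm -r demo-tool

# The wrapper follows the same shape as run_blast.sh: unpack, then run.
cat > run_demo.sh <<'EOF'
#!/bin/bash
# Optional: stop at the first failing command so a bad unpack is noticed.
set -e
tar -xzf demo-tool.tar.gz
demo-tool/bin/demo -out results_demo.txt
EOF
chmod u+x run_demo.sh

# Run the wrapper and look at what it produced.
./run_demo.sh
cat results_demo.txt
```

If the wrapper works from a directory that contains only the tarball and the script, it is more likely to work in the job's working directory on the execute machine, which starts out in the same state.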
Your job is now ready to submit. Submit it using
condor_submit and monitor using