OSG User School 2017 Final Assignment
The School focused on using high-throughput computing (HTC) to support and transform scientific inquiry. Your final assignment is to apply your new knowledge to a challenge in your scientific domain that requires significant computation. The assignment is useful because it:
- Reinforces and consolidates what you learned
- Prepares you to take real action on your large-scale computational challenge(s)
- Demonstrates the value of the School to our funding agencies and to your advisor, colleagues, etc.
- Guides us as we try to improve the School
First, choose a challenge or project to present. An ideal topic:
- Is important to you and your advisor, team, department, or field
- Represents work that is in progress or is planned to start soon
- Requires significant computational resources
Next, think about your topic and how to apply what you learned during the School. Think about how to approach the computational needs of the project using local HTC resources or the Open Science Grid (OSG). We are not asking you to implement the system! Just imagine how you would do it. One approach is enough, but you may describe more than one. Imagine that you will run on the resources available to you at your own institution. If your institution does not have an HTC system available, then think about what kind of resources you would want or how you could get access to resources via the OSG.
Your final assignment will be a written report (see below for more detail). Please address all of the following questions in your submission:
- The science challenge (about 1/3 of the assignment), described in a way that smart people outside of your field can understand
  - What science do you work on?
  - What specific challenge do you (want to) work on?
  - Why does that work require significant computing resources to solve?
- The computational plan (about 2/3 of the assignment)
  - Summarize your approach and explain why you think it is good
  - Estimate the resources (CPUs, time, memory, disk, network, etc.) that you need
  - Describe in some detail your plan or proposal to use computing tools to work on your challenge (more than one plan is OK)
  - Refer back to specific lectures and exercises in the School, when appropriate
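As a rough illustration of the resource estimate asked for above, the sketch below does the back-of-the-envelope arithmetic for a workload made of many independent jobs. The function name, the parameter names, and all numbers are hypothetical placeholders, not values from the School. Substitute measurements from your own test runs. It also assumes every job takes about the same time and that all slots stay available, which is rarely true in practice.

```python
import math

def estimate_resources(n_jobs, hours_per_job, slots, output_gb_per_job):
    """Back-of-the-envelope estimate for a batch of independent jobs.

    All inputs are assumptions you supply from your own test runs.
    """
    total_cpu_hours = n_jobs * hours_per_job
    # Jobs run in "waves": each wave fills at most `slots` job slots at once.
    waves = math.ceil(n_jobs / slots)
    wall_hours = waves * hours_per_job
    total_output_gb = n_jobs * output_gb_per_job
    return {
        "total_cpu_hours": total_cpu_hours,
        "wall_hours": wall_hours,
        "total_output_gb": total_output_gb,
    }

# Hypothetical workload: 10,000 two-hour jobs on 500 slots, 0.05 GB of output each.
estimate = estimate_resources(
    n_jobs=10_000, hours_per_job=2.0, slots=500, output_gb_per_job=0.05
)
print(estimate)
```

For these made-up numbers the total is 20,000 CPU-hours, which would take about 40 hours of wall-clock time on 500 slots. An estimate like this, even if crude, helps show whether local resources are enough or whether remote resources are needed.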
There are many possible questions your paper could address. Below are some suggestions — feel free to answer some (or all) of them, or create and answer your own interesting questions:
- What local resources do you have access to?
- Would you use just local resources or do you need remote resources, too?
- How would you turn your project into actual jobs?
- What are the resource needs of the jobs themselves?
- What sort of workflow, if any, would you use? Are there manual steps in your overall workflow? Could they be automated (e.g., with DAGMan)?
- How much data do you need to move around? Which type of data situation do you have? What is your plan for data management?
- Do you think your project is better suited for HTC or HPC? Why?
- What security or privacy concerns do you have with your project? Do you need to do anything special regarding security?
- How would your science be transformed by increasing the amount of computation you can use?
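For the questions about turning a project into jobs and automating a workflow, one possible shape is a set of independent analysis jobs followed by a single merge step. The sketch below generates DAGMan input-file text for that pattern. The node names, the `chunk` macro, and the submit file names `analyze.sub` and `merge.sub` are hypothetical and would need to match your own files. Your project may well need a different structure.

```python
def build_dag(n_chunks, analyze_submit="analyze.sub", merge_submit="merge.sub"):
    """Return DAGMan input-file text: n_chunks analysis nodes, then one merge node.

    The submit file names are hypothetical; they must exist for a real run.
    """
    lines = []
    node_names = []
    for i in range(n_chunks):
        name = f"analyze_{i}"
        node_names.append(name)
        lines.append(f"JOB {name} {analyze_submit}")
        # Pass the chunk number to the submit file as a macro named `chunk`.
        lines.append(f'VARS {name} chunk="{i}"')
    lines.append(f"JOB merge {merge_submit}")
    # The merge node runs only after every analysis node has finished successfully.
    lines.append(f"PARENT {' '.join(node_names)} CHILD merge")
    return "\n".join(lines) + "\n"

dag_text = build_dag(3)
print(dag_text)
```

Saving this text to a file and passing it to `condor_submit_dag` would run the merge step automatically once all analysis jobs finish, replacing a manual "wait, then merge" step.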
You will write a short paper. You can think of the format as an informal whitepaper, research paper, or proposal, if that helps organize your paper. Please represent yourself and your work well! Your final assignment may be posted on the School website and hence be available to the public.
- There are no precise length requirements, but 1000–1500 words is a good length to aim for
- Pictures, charts, and diagrams are good, if they are appropriate and clear
- The paper does not need to be journal-ready, but should be good quality and ready for public display
Submit your paper as a PDF; no Word or LaTeX documents, please!
- If you are not sure how to make a PDF, consult your department or campus IT staff for help
- Email the PDF to the email@example.com list
- If the PDF is really huge, contact us and we will find another way to transfer the file (we should be able to manage large data, right??)
The paper is due 31 August 2017. We will consider individual requests for a time extension, but you need a good reason. Talk to us about the deadline, if it seems like a problem.
If you have any questions or comments about the assignment, please contact us at the firstname.lastname@example.org mailing list.