Metasch Differ From Resource Broker
Faculty of Engineering
Summer 2006 Work Term Report
Patrick Armstrong
0532342
Computer Science
patricka@uvic.ca
Prerequisites
The reader should have some knowledge and understanding of general
computer-related topics. Most of the grid related terminology and jargon
are explained prior to their use; however, this document does assume that
the reader is comfortable with computers, writing software, and Unix-like
systems.
Purpose
The purpose of this document is to familiarize the reader with grid com-
puting in general, as well as outline some of the work done for the grid
computing projects collaborated on by UVic and NRC, specifically the
Metascheduling options examined by the team.
Contents
1 Introduction: Grid Computing
1.1 The Globus Toolkit and the Grid
1.2 Grid Resource Allocation and Management
1.3 Web Services
1.4 CANARIE Project
3 Metascheduler Deployments
3.1 GCGate01: Production GT2 Metascheduler
3.2 Ugdev06: GT4 Condor Metascheduler
3.3 Ugdev08: GridWay Metascheduler
5 Conclusion
1 Introduction: Grid Computing
In the last few years, many tasks that were seen as computationally
infeasible, such as protein folding, the simulation of particles studied
in high energy physics, and breaking cryptographic ciphers, have become
feasible by applying the principles of distributed computing. Generally,
distributed computing is accomplished by breaking one large task into
smaller tasks that can easily be distributed to machines dedicated to
accomplishing these tasks.
Grid computing is a specific type of distributed computing that draws on
a wide variety of distributed, heterogeneous resources to solve problems
that are too computationally intensive for any one supercomputer or
cluster of computers. A useful analogy is the power grid. Users do not
need to think about which generator will produce the power that runs
their appliance, and they assume that power from the grid will always be
available. In a computational grid, one would simply submit a job to
“The Grid”; the job would be completed on one of the grid’s resources,
and the results would be sent back to the client. Ideally, any user could
have as much computational power as he needs from the grid, in the same
way any client can get as much electrical power as he requires from the
power grid.
1.2 Grid Resource Allocation and Management
The Grid Resource Allocation and Management (GRAM) interface is the Globus
component for handling the “initiation, monitoring, management,
scheduling, and/or coordination of remote computations[2].” GRAM allows
users to create and control their jobs through a standard API, making
development of grid-enabled applications simple.
The name GRAM can refer either to the Web services implementation or to
the pre-Web services version, which is implemented in both GT 2 and GT 4.
The current version of GRAM does all of its communication through Web
services, making it very simple for applications to talk to GRAM and
leverage its capabilities.
CANARIE, Canada’s advanced Internet development organization, and
is a joint project between the University of Victoria and the National Re-
search Council.
controlling machine. The two types of LRMS currently in use in the Cana-
dian GridX1 grid are called the Portable Batch System (PBS) and Condor.
Condor The Condor Cycle Scavenger works similarly to PBS: jobs are
submitted to a controlling central node, called a Condor Collector, which
then picks a machine under its control to run the job. While Condor can
use dedicated compute nodes to run its jobs, its main attraction is that it
can make use of idle computing resources to run its jobs, much like the
Folding@Home protein folding software. The Condor Collector maintains
a pool of resources, which make themselves known after they sit idle. A
job can then be submitted and run on the machine, but execution will halt
if a key is pressed, a mouse is moved, or there is significant non-Condor
CPU usage. In this way, an organization can make use of the many idle
hours of CPU time from its workstations, lab machines and under-utilized
servers, with minimal additional infrastructure cost.
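The availability and eviction rule described above can be sketched as two small predicates. This is an illustrative sketch only, not Condor's policy language or source code: the MachineState fields, the function names, and both thresholds are hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class MachineState:
    """Hypothetical snapshot of a workstation; not a Condor data structure."""
    seconds_since_input: float   # time since the last key press or mouse movement
    non_condor_cpu_load: float   # CPU fraction (0.0 to 1.0) used by other processes


# Assumed thresholds, chosen only for illustration.
IDLE_SECONDS_REQUIRED = 15 * 60
MAX_OWNER_CPU_LOAD = 0.3


def is_available(state: MachineState) -> bool:
    """A machine advertises itself to the pool once it has sat idle long enough."""
    return (state.seconds_since_input >= IDLE_SECONDS_REQUIRED
            and state.non_condor_cpu_load < MAX_OWNER_CPU_LOAD)


def must_halt_job(state: MachineState, poll_interval: float = 5.0) -> bool:
    """A running job halts if the owner touched the machine since the last
    poll, or if non-Condor processes are using significant CPU."""
    owner_active = state.seconds_since_input < poll_interval
    return owner_active or state.non_condor_cpu_load >= MAX_OWNER_CPU_LOAD
```

A machine that has been idle for an hour with little load would be available, while a single key press is enough for must_halt_job to return True on the next poll.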
2.2 GridWay
The GridWay Metascheduler is a new Globus “incubator project” that is
able to accomplish many of the tasks you would expect of a metasched-
uler, such as submitting and monitoring jobs and matching using rank
and requirements. GridWay also has a number of interesting features. For
example, GridWay is able to dynamically acquire and monitor resources
using the standard Globus MDS4 Index Service, which means that a grid
resource administrator could easily add his resource to a grid system by
forwarding his local registry to the central grid registry, or a registry ad-
ministrator can add a resource by pulling information downstream from
a grid site. GridWay also allows for the use of advanced scheduling poli-
cies through the use of “Scheduling Drivers”, which can be written in any
language using any algorithm. The scheduler simply needs to be able to
output its commands as text, which the central GridWay daemon will ex-
ecute.
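The idea of a scheduling policy that only has to emit text can be sketched as follows. This is a minimal sketch under stated assumptions: the greedy rank-based policy, the function names, and especially the "SCHEDULE <job> <host>" line format are hypothetical and are not GridWay's actual command syntax.

```python
import sys
from typing import Dict, Iterable, List, TextIO


def schedule(pending_jobs: Iterable[int], host_ranks: Dict[str, float]) -> List[str]:
    """Greedy policy: pair each pending job with the best-ranked unused host."""
    hosts_by_rank = sorted(host_ranks, key=host_ranks.get, reverse=True)
    commands = []
    for job_id, host in zip(pending_jobs, hosts_by_rank):
        # Hypothetical text command; the real GridWay syntax may differ.
        commands.append(f"SCHEDULE {job_id} {host}")
    return commands


def main(out: TextIO = sys.stdout) -> None:
    # Example inputs are invented; a real driver would read them from the daemon.
    for line in schedule([101, 102], {"hostA": 2.0, "hostB": 5.0, "hostC": 1.0}):
        out.write(line + "\n")


if __name__ == "__main__":
    main()
```

Because the interface is plain text on standard output, the same policy could be rewritten in any language without changing the daemon that consumes it.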
3 Metascheduler Deployments
Currently, there are a number of metaschedulers that have been deployed
in both production and development environments. This section will
highlight a few of them to explore a couple of the solutions to the
metascheduling problem.
There are a few differences in implementation, however. First, rather
than using MyProxy to manage credentials on the metascheduler, they
decided to use Globus’s credential delegation facility, the DelegationFac-
toryService. This allows a grid user to simply delegate his proxy to the
metascheduler, which that machine can use for any grid operations it does
for that user. This way, there is no need to set up a MyProxy server for the
metascheduler to use.
Globus job description (XML):

<job>
  <executable>/bin/uname</executable>
  <argument>-a</argument>
  <directory>/tmp</directory>
  <stdout>/tmp/stdout</stdout>
  <stderr>/tmp/stderr</stderr>
  <fileStageIn>
    <transfer>
      <sourceUrl>gsiftp://machine.ca/tmp/dat.dat</sourceUrl>
      <destinationUrl>file:///tmp/dat.dat</destinationUrl>
    </transfer>
  </fileStageIn>
  <fileStageOut>
    <transfer>
      <sourceUrl>file:///tmp/out.dat</sourceUrl>
      <destinationUrl>gsiftp://machine.ca/tmp/out.dat</destinationUrl>
    </transfer>
  </fileStageOut>
</job>

GridWay job template:

EXECUTABLE=/bin/uname
ARGUMENTS=-a
INPUT_FILES=gsiftp://machine.ca/tmp/dat.dat dat.dat
OUTPUT_FILES=out.dat gsiftp://machine.ca/tmp/out.dat
STDIN_FILE=/dev/null
STDOUT_FILE=file:///tmp/stdout
STDERR_FILE=file:///tmp/stderr
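The correspondence between the Globus job description fields and the GridWay template fields shown above can be sketched as a small translation function. This is an illustration rather than the adapter's actual code: the function names are hypothetical, the default of /dev/null for STDIN_FILE and the comma used to join multiple file pairs are assumptions, and the directory element is ignored.

```python
import xml.etree.ElementTree as ET

JOB_XML = """
<job>
  <executable>/bin/uname</executable>
  <argument>-a</argument>
  <stdout>/tmp/stdout</stdout>
  <stderr>/tmp/stderr</stderr>
  <fileStageIn>
    <transfer>
      <sourceUrl>gsiftp://machine.ca/tmp/dat.dat</sourceUrl>
      <destinationUrl>file:///tmp/dat.dat</destinationUrl>
    </transfer>
  </fileStageIn>
  <fileStageOut>
    <transfer>
      <sourceUrl>file:///tmp/out.dat</sourceUrl>
      <destinationUrl>gsiftp://machine.ca/tmp/out.dat</destinationUrl>
    </transfer>
  </fileStageOut>
</job>
"""


def basename(url: str) -> str:
    """Last path component of a URL, e.g. file:///tmp/dat.dat -> dat.dat."""
    return url.strip().rsplit("/", 1)[-1]


def to_template(job_xml: str) -> str:
    """Translate a job description into GridWay-style KEY=value lines."""
    job = ET.fromstring(job_xml.strip())
    args = " ".join((a.text or "").strip() for a in job.findall("argument"))
    stage_in = [
        f"{t.findtext('sourceUrl', '').strip()} {basename(t.findtext('destinationUrl', ''))}"
        for t in job.findall("fileStageIn/transfer")
    ]
    stage_out = [
        f"{basename(t.findtext('sourceUrl', ''))} {t.findtext('destinationUrl', '').strip()}"
        for t in job.findall("fileStageOut/transfer")
    ]
    lines = [
        "EXECUTABLE=" + job.findtext("executable", "").strip(),
        "ARGUMENTS=" + args,
        "INPUT_FILES=" + ", ".join(stage_in),    # comma separator is an assumption
        "OUTPUT_FILES=" + ", ".join(stage_out),
        "STDIN_FILE=" + job.findtext("stdin", "/dev/null").strip(),
        "STDOUT_FILE=file://" + job.findtext("stdout", "").strip(),
        "STDERR_FILE=file://" + job.findtext("stderr", "").strip(),
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_template(JOB_XML))
```

Staged-in files are named by their destination's base name and staged-out files by their source's base name, which is how the pairs in the template line up with the transfer elements.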
status command, for example the condor_q command, with the job ID
returned by the submit method.
Finally, the cancel method is used to cancel a job on the local scheduler
using the job ID from the submit method. This is done by running the
local scheduler’s job cancel command, for example the PBS qdel command.
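The way each adapter method maps onto a local scheduler command can be sketched as a lookup table. This is a simplified illustration, not the Globus scheduler adapter itself: build_command and the COMMANDS table are hypothetical names, and only the command-line tools themselves (qsub, qstat, qdel, condor_submit, condor_q, condor_rm) are the standard PBS and Condor clients.

```python
from typing import Dict, List

# Standard client tools for each local resource management system.
COMMANDS: Dict[str, Dict[str, str]] = {
    "pbs": {"submit": "qsub", "status": "qstat", "cancel": "qdel"},
    "condor": {"submit": "condor_submit", "status": "condor_q", "cancel": "condor_rm"},
}


def build_command(lrms: str, method: str, target: str) -> List[str]:
    """Return the argument vector for an adapter method.

    target is a job description file for "submit" and the job ID returned
    by submit for "status" and "cancel".
    """
    try:
        tool = COMMANDS[lrms][method]
    except KeyError:
        raise ValueError(f"unsupported LRMS or method: {lrms}/{method}") from None
    return [tool, target]
```

The returned list could be handed to subprocess.run; the sketch stops short of executing anything so that it runs on a machine without PBS or Condor installed.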
Figure 3: Two-hop file staging
Figure 4: One-hop file staging
Since Globus job description files already have a syntax for describing
a file transfer to a directory on the execution host, it makes sense to re-use
this syntax when submitting a job. Essentially, this means that we should
place our file staging elements inside an extensions tag. As a side effect,
this also means that Globus will not stage files in to the metascheduler, and
if we also place fileStageOut elements in the extensions element, it will
not stage files out. This happens to be useful, as we can then stage files in
and out from the client machine to the metascheduler in one hop, rather
than two.
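Moving the staging elements into an extensions element can be sketched as a small XML rewrite. This is a sketch under stated assumptions: the element names fileStageIn, fileStageOut, and extensions come from the text, but the function name is hypothetical and the exact placement required by the Globus job description schema is not verified here.

```python
import xml.etree.ElementTree as ET

STAGING_TAGS = ("fileStageIn", "fileStageOut")


def move_staging_to_extensions(job_xml: str) -> str:
    """Relocate top-level staging elements under <extensions>, so that
    Globus leaves them alone and the metascheduler can act on them."""
    job = ET.fromstring(job_xml.strip())
    extensions = job.find("extensions")
    if extensions is None:
        extensions = ET.SubElement(job, "extensions")
    for tag in STAGING_TAGS:
        # findall with a bare tag matches direct children only and returns
        # a list, so removing elements while looping is safe.
        for element in job.findall(tag):
            job.remove(element)
            extensions.append(element)
    return ET.tostring(job, encoding="unicode")
```

After the rewrite, the job element has no direct fileStageIn or fileStageOut children, and the original transfer descriptions are preserved unchanged inside extensions.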
resource, and from resource to client. This is called one-hop file staging.
One-hop file staging eliminates the problems of two-hop staging by
transferring file staging metadata to the metascheduler, which is then passed
on to the execution host. The metascheduler then initiates a third party file
transfer from the client machine to the execution host. While one-hop file
staging is an ideal solution, it can sometimes be difficult to implement.
Condor-G, for example, requires the files it will stage be present on the
metascheduler before it can transfer the files to another machine running
GridFTP. There is no way to do third-party transfers.
Fortunately, GridWay is able to do file transfers from a remote machine
to an execution host using a gsiftp:// URL. This means that only metadata
describing the file transfer must be sent to the GridWay metascheduler,
not the actual data itself. This results in faster data transfers, as
there is only one file transfer operation per file rather than two, as
well as less load on the metascheduler itself.
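The saving from one-hop staging is simple arithmetic, which the following sketch makes explicit. The function name and the returned keys are hypothetical, and the model ignores protocol overhead and assumes every file takes the same route.

```python
from typing import Dict, Sequence


def staging_cost(file_sizes: Sequence[int], one_hop: bool) -> Dict[str, int]:
    """Count transfers and bytes moved for a set of staged files.

    Two-hop staging copies each file to the metascheduler and then on to
    the execution host; one-hop staging copies it once, directly.
    """
    hops = 1 if one_hop else 2
    total_bytes = sum(file_sizes)
    return {
        "transfers": hops * len(file_sizes),
        "bytes_on_network": hops * total_bytes,
        "bytes_stored_on_metascheduler": 0 if one_hop else total_bytes,
    }
```

For two files of 100 and 200 bytes, two-hop staging performs four transfers and moves 600 bytes, while one-hop staging performs two transfers, moves 300 bytes, and stores nothing on the metascheduler.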
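GridWay job template:

EXECUTABLE=gwgramwrapper
ARGUMENTS=--fastmath --roundfloats
INPUT_FILES=gsiftp://sci.ca/tmp/science science

Wrapper script:

#!/bin/sh
# Wrapper script for gridway
# Make the staged-in binary executable, then run it with the job's arguments.
chmod 744 science
./science "$@"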
4.3.1 Proxy Management
The stock GridWay scheduler adapter requires that a user manually ini-
tialize a proxy on the metascheduling machine. A simple workaround
was to copy a delegated proxy to the default location for a user proxy,
/tmp, which GridWay could then use. There is a minor security issue
with this solution when more than one grid user is mapped to the same
user account, but this should not be done anyway. For now, a GridWay
administrator should ensure that no user account is mapped to more
than one grid user.
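The workaround can be sketched as a small helper that copies a delegated proxy to the conventional Globus location, /tmp/x509up_u<uid>. This is a sketch, not the code used on the metascheduler: the function names and the tmp_dir parameter are hypothetical, and the source path of the delegated proxy is whatever the delegation service produced.

```python
import os


def default_proxy_path(uid: int, tmp_dir: str = "/tmp") -> str:
    """Conventional Globus user proxy location: <tmp_dir>/x509up_u<uid>."""
    return os.path.join(tmp_dir, f"x509up_u{uid}")


def install_delegated_proxy(delegated_path: str, uid: int, tmp_dir: str = "/tmp") -> str:
    """Copy a delegated proxy to the default location with owner-only access."""
    target = default_proxy_path(uid, tmp_dir)
    with open(delegated_path, "rb") as src:
        data = src.read()
    # Create the file with mode 0600 so the credential is never world-readable.
    fd = os.open(target, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as dst:
        dst.write(data)
    os.chmod(target, 0o600)  # enforce the mode even if the file already existed
    return target
```

Because the target path depends only on the local user ID, two grid users mapped to the same account would overwrite each other's proxy, which is the security issue noted above.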
Schema[9]. Adding an information driver that can interpret the GridX1
Schema should not be particularly difficult, as there is already example
source code included with GridWay for gathering monitoring information
from stock MDS4 Registries and the LDAP mapping of the GLUE schema
(MDS2) as well.
5 Conclusion
Choosing the proper metascheduling solution for a Web services com-
putational grid will be a difficult task. There are a number of benefits
and disadvantages to both the Condor-G metascheduling system and the
GridWay-based system. Overcoming the disadvantages to both solutions
is one of the tasks that need completion in the next few months of the CA-
NARIE project. The author is confident that overcoming these obstacles
will be accomplished without much difficulty by the team at
References
[1] D. Booth, H. Haas, F. McCabe, E. Newcomer, M. Champion, C. Ferris,
D. Orchard. “Web Services Architecture.” 11 February 2004.
[7] “GridWay 5 Documentation: User Guide”.