Grid Computing
DEPARTMENT OF
presentation on
ABSTRACT
Grid computing is becoming the most important research area in high performance computing. Under this concept, job scheduling in Grid computing faces more complicated problems: discovering a diversity of available resources, selecting the appropriate applications and mapping them to suitable resources. However, the major problem is optimal job scheduling, in which Grid nodes need to allocate the appropriate resources for each job. In this paper, jobs are classified using the Fuzzy C-Mean algorithm and mapped to the appropriate resources using the Min-Min algorithm. The Fuzzy C-Mean algorithm classifies jobs into Low, Medium and High groups based on job execution time, and resources are classified into Low, Medium and High groups based on processor speed.
INTRODUCTION
The last decade has seen a substantial increase in commodity computer and network performance, mainly as a result of faster hardware and more sophisticated software. Nevertheless, there are still problems in the fields of science, engineering, and business which cannot be effectively dealt with using the current generation of supercomputers. In fact, due to their size and complexity, these problems are often resource (computational and data) intensive and consequently entail the use of a variety of heterogeneous resources that are not available in a single organization.
The ubiquity of the Internet, as well as the availability of powerful computers and high-speed network technologies as low-cost commodity components, is rapidly changing the computing landscape and society. These technology opportunities have led to the possibility of using wide-area distributed computers for solving large-scale problems, leading to what is popularly known as Grid computing. The term Grid is chosen as an analogy to the electric power Grid, which provides consistent, pervasive, dependable, transparent access to electricity irrespective of its source. Such an approach to network computing is known by several names: metacomputing, scalable computing, global computing, Internet computing and, more recently, Peer-to-Peer (P2P) computing.
Grids enable the sharing, selection, and aggregation of a wide variety of resources, including supercomputers, storage systems, data sources, and specialized devices, that are geographically distributed and owned by different organizations, for solving large-scale computational and data-intensive problems in science, engineering, and commerce.
Fig. 1 A Simple Grid
GRID COMPUTING
A Grid is a type of parallel and distributed system that enables the sharing, selection, and aggregation of geographically distributed "autonomous" resources dynamically at runtime depending on their availability, capability, performance, cost, and users' quality-of-service requirements.
Grid applications are a special class of distributed applications that have high computing and resource requirements and are often collaborative in nature. Networks connect resources on the Grid, the most prevalent of which are computers with their associated data storage. Although the computational resources can be of any level of power and capability, some of the most interesting Grids for scientists involve nodes that are themselves high performance parallel machines or clusters. Such high performance Grid nodes provide major resources for simulation, analysis, data mining and other compute-intensive activities.
Grid Computing can be defined as applying resources from many computers in a network at the same time to a single problem, usually a problem that requires a large number of processing cycles or access to large amounts of data. At its core, Grid Computing enables devices, regardless of their operating characteristics, to be virtually shared, managed and accessed across an enterprise, industry or workgroup. This virtualization of resources places all of the necessary access, data and processing power at the fingertips of those who need to rapidly solve complex, compute-intensive business problems. The grid enables the virtualization of resources such as network and storage capacity to create a single system image, granting users and applications seamless access to vast IT capabilities. Just as an Internet user views a unified instance of content via the Web, a Grid user essentially sees a single, large virtual computer.
TYPES OF GRID
Computational grids, in which machines with set-aside resources stand by to crunch data or provide coverage for other intensive workloads.
Scavenging grids, commonly used to scavenge CPU cycles from idle servers and desktops for resource-intensive tasks.
Data grids, which provide a unified interface for all data.
SCHEDULING
Efficient scheduling software should minimize idle processing time. No single algorithm can achieve optimal performance on all possible job execution time distributions. However, when we know more information, such as which distribution dominates the job spectrum, a scheduling system can choose which policy will be near optimal. Scheduling is said to be static when the processors on which the jobs will run are assigned at compile time or before execution. Dynamic scheduling, or load balancing, is performed at run time.
STAGES OF SCHEDULING
Resource Discovery
Resource discovery involves the user selecting a set of resources to investigate in more detail in phase two, information gathering. At the beginning of this phase, the potential set of resources is the empty set; at the end of this phase, it is some set that has passed a minimal feasibility requirement.
System Selection
Given a group of possible resources (or a group of possible resource sets), all of which meet the minimum requirements for the job, a single resource (or single resource set) must be selected on which to schedule the job. This is generally done in two steps: gathering information and making a decision.
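The two phases described so far can be sketched in Python. This is only an illustration under assumptions: the `Resource` fields, the feasibility thresholds and the "pick the fastest" decision rule are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    speed_mips: float     # hypothetical attribute: processor speed
    free_memory_mb: int   # hypothetical attribute: memory currently free


def discover(resources, min_speed_mips, min_memory_mb):
    """Resource discovery: start from the empty set and keep only the
    resources that pass the minimal feasibility requirement."""
    candidates = []
    for r in resources:
        if r.speed_mips >= min_speed_mips and r.free_memory_mb >= min_memory_mb:
            candidates.append(r)
    return candidates


def select(candidates):
    """System selection: gather information (here only the speed) and
    decide on a single resource. Returns None if nothing is feasible."""
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.speed_mips)


pool = [
    Resource("node-a", 500.0, 2048),
    Resource("node-b", 1200.0, 512),
    Resource("node-c", 900.0, 4096),
]
feasible = discover(pool, min_speed_mips=400.0, min_memory_mb=1024)
chosen = select(feasible)
print([r.name for r in feasible], chosen.name)
```

Here `node-b` is the fastest machine but fails the memory requirement, so it never reaches the selection step.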
Run Job
The third phase of scheduling is running a job. This involves a number of steps, few of which have been defined in a uniform way between resources. They include making an advance reservation (optional), submitting the job to the resources, preparing tasks, monitoring progress, and completing tasks.
PROBLEM DESCRIPTION
Problem Definition
Scheduling algorithms are responsible for mapping jobs to resources. In the existing scheduling algorithms, the nature of the job is not considered while allocating it to a resource. Hence, if a heavy job is allocated to a resource with low processing capability, it will decrease the overall efficiency of the scheduling.
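A small worked example may make the cost of a mismatched allocation concrete. It assumes, purely for illustration, that execution time equals job size (in million instructions) divided by processor speed (in MIPS). The job sizes and speeds below are made-up numbers, not measurements from this work.

```python
# Hypothetical job sizes (million instructions) and processor speeds (MIPS).
job_size = {"light": 1000.0, "heavy": 9000.0}
speed = {"slow": 100.0, "fast": 900.0}


def makespan(mapping):
    """Time at which the last job finishes when each resource runs one job.
    Assumes execution time = job size / processor speed."""
    return max(job_size[job] / speed[res] for job, res in mapping.items())


# Heavy job on the slow processor: it alone takes 9000 / 100 = 90 time units.
mismatched = makespan({"heavy": "slow", "light": "fast"})
# Heavy job on the fast processor: both jobs take 10 time units.
matched = makespan({"heavy": "fast", "light": "slow"})

print(mismatched, matched)
```

Under these assumptions the mismatched mapping finishes at 90 time units and the matched one at 10, which is the kind of loss the problem definition refers to.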
Problem Solution
We propose a scheduling algorithm that combines the Fuzzy C-Mean and Min-Min algorithms. The Fuzzy C-Mean algorithm classifies jobs and resources into Low, Medium and Heavy groups based on execution time and processing speed respectively. The Min-Min algorithm then maps the low jobs to low processors, medium jobs to medium processors and heavy jobs to heavy processors, hence reducing the makespan and the process wait delay time. The following diagram depicts our model:
Fig. Our Model
ALGORITHMS
Fuzzy C-Mean Classification
The Fuzzy C-Mean (FCM) algorithm is used to classify jobs and resources into Low, Medium and Heavy groups. For job classification, FCM takes execution time as input. The first three of the n jobs (n being the total number of jobs) are assigned to the Low, Medium and Heavy groups based on their execution time. Each subsequent incoming job is compared with the mean value (cluster center) of each group and is assigned to one of the groups by the minimum distance principle. The cluster center is then recalculated for the group to which the job was added. If the cluster center changes, all the jobs are reclassified according to the new cluster centers. This process is repeated until all the incoming jobs are classified. For resource classification, we take processor speed as input and the classification is done in the same way as job classification.
Min-Min Scheduling
The Min-Min algorithm is used to schedule jobs to resources. In our project, the Min-Min algorithm schedules jobs in the Low group to resources in the Low group, Medium jobs to the Medium resource group and Heavy jobs to the Heavy resource group. The Min-Min algorithm first chooses the resource that executes the given job in minimum time. It similarly chooses the resource for all the jobs. Then it schedules the job with the minimum execution time from the list of all jobs to the appropriate resource and removes it from the job list. The above process is continued until all the jobs are scheduled.
CONCLUSION
This algorithm is designed for distributed grid environments. Fuzzy C-Mean and Min-Min algorithms are used for developing this system. This system is mainly designed to classify the jobs and resources,
hence improving the efficiency of the scheduler. This system is designed as a GUI based application, so any user can easily access and analyze the performance of the job scheduling algorithm. Clients running on different operating systems can be integrated and perform the scheduling process.
SCOPE FOR FUTURE DEVELOPMENT
The system is developed for scheduling jobs in a grid environment. The system can be improved with the following enhancements. Currently the tool is developed as a simulation tool for the grid environment. In future development of the system, the tool can be converted into a real time tool. Currently the tool is tested in Windows and Linux environments. In future the tool can
REFERENCES
1. Siriluck Lorpunmanee, Mohd Noor Md Sap, Abdul Hanan Abdullah (2006) A static jobs scheduling for independent jobs in Grid Environment by using Fuzzy C-Mean and Genetic algorithms, Proceedings of the Postgraduate Annual Research Seminar.
2. H. Topcuoglu, S. Hariri, and M.Y. Wu (2002) Performance-Effective and Low-Complexity Task Scheduling for Heterogeneous Computing, IEEE Trans. Parallel and
3. Zhang, H (2000) Segmented min-min: a static mapping algorithm for meta-tasks on heterogeneous computing systems, Heterogeneous Computing Workshop, 2000. (HCW 2000) Proceedings. Vol. 9, pp. 375-385.
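The classification and scheduling steps described under ALGORITHMS can be sketched in Python as follows. This is a rough illustration under several assumptions, not the implementation used in this work. The classifier follows the minimum distance procedure as written in the text (hard assignment to the nearest cluster center) and does not compute fuzzy membership degrees. Job size is used as a stand-in for execution time, execution time is modeled as job size divided by processor speed, and Min-Min picks the smallest completion time (resource ready time plus execution time), which is the usual formulation. All function and variable names are hypothetical.

```python
GROUPS = ("Low", "Medium", "Heavy")


def nearest(value, centers):
    """Index of the cluster center closest to value (minimum distance)."""
    return min(range(len(centers)), key=lambda g: abs(value - centers[g]))


def classify(values):
    """Label each value Low/Medium/Heavy, processing values in arrival order."""
    if len(values) < 3:
        raise ValueError("need at least three values to seed the three groups")
    # Seed: the first three values, ranked by size, start one group each.
    order = sorted(range(3), key=lambda i: values[i])
    labels = [0, 0, 0]
    for rank, i in enumerate(order):
        labels[i] = rank
    centers = [float(values[i]) for i in order]
    seen = list(values[:3])
    for v in values[3:]:
        seen.append(v)
        labels.append(nearest(v, centers))
        for _ in range(100):  # safety cap on reclassification rounds
            for g in range(3):  # recompute centers; keep old one if group is empty
                members = [x for x, lab in zip(seen, labels) if lab == g]
                if members:
                    centers[g] = sum(members) / len(members)
            new_labels = [nearest(x, centers) for x in seen]
            if new_labels == labels:
                break
            labels = new_labels
    return [GROUPS[g] for g in labels]


def min_min(job_sizes, speeds):
    """Min-Min within one group. Returns (list of (job, resource), makespan).
    Assumes speeds is non-empty whenever job_sizes is non-empty."""
    ready = {r: 0.0 for r in speeds}
    pending = dict(job_sizes)
    schedule = []
    while pending:
        # Smallest completion time over all pending jobs and all resources.
        finish, job, res = min(
            (ready[r] + size / speeds[r], j, r)
            for j, size in pending.items()
            for r in speeds
        )
        ready[res] = finish
        schedule.append((job, res))
        del pending[job]
    return schedule, max(ready.values(), default=0.0)


def schedule_by_group(job_sizes, speeds):
    """Classify jobs and resources, then run Min-Min inside each group."""
    jobs, resources = list(job_sizes), list(speeds)
    job_labels = classify([job_sizes[j] for j in jobs])
    res_labels = classify([speeds[r] for r in resources])
    plan = {}
    for g in GROUPS:
        jobs_g = {j: job_sizes[j] for j, lab in zip(jobs, job_labels) if lab == g}
        res_g = {r: speeds[r] for r, lab in zip(resources, res_labels) if lab == g}
        plan[g] = min_min(jobs_g, res_g)
    return plan


# Made-up example data: job sizes and processor speeds.
example_jobs = {"j1": 100, "j2": 5000, "j3": 900, "j4": 120, "j5": 4800, "j6": 1000}
example_speeds = {"r1": 100, "r2": 1000, "r3": 500, "r4": 110, "r5": 950, "r6": 520}
plan = schedule_by_group(example_jobs, example_speeds)
for group in GROUPS:
    print(group, plan[group])
```

Because each group is scheduled separately, a heavy job can never land on a resource from the Low group. The overall makespan is the largest of the three per-group makespans.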