Comsol Parallel
COMSOL Multiphysics
Master student
Department of Electric Power Engineering
Norwegian University of Science and Technology
Trondheim, Norway
Email: solmajab@stud.ntnu.no
Abstract—In this thesis the parallel capabilities of COMSOL Multiphysics are investigated. A description of how one can run COMSOL on a Linux cluster is presented. The speedup was found to be poor for medium-sized simulations with less than 10 million degrees of freedom. The speedup for parametric sweeps was found to be excellent. Particle swarm optimization (PSO) was implemented using LiveLink for Matlab, and run on the supercomputer at NTNU. It was found to perform very well without any tuning of the algorithm.

I. INTRODUCTION

Motivation

This thesis was motivated by the electrical power department's wish to run larger FEM simulations.
Research objectives are often restricted by the computational resources available. There is a trade-off between the accuracy of a simulation and the simulation time, and simulations will be created with time and memory constraints in mind. When it comes to FEM software, models are usually simplified in order to achieve a reasonable simulation time, for example by using linear models instead of more complex ones.
NTNU has a supercomputer (Vilje) on campus which is available for students and PhD candidates, but it is being utilized by master students to a very small degree. Some reasons that Vilje has not been utilized more are:
• No awareness: students do not know it exists, or that it is available to them
• Limited knowledge: students do not know how to use it
• Limited need: master projects seldom require heavy computations
There is a gap between the electrical power department and the world of cluster computing. The goal of this thesis is to bridge that gap, so that future students are not restricted by the computational resources of a laptop.
COMSOL Multiphysics is the FEM software chosen by NTNU's electrical power engineering department. It is a powerful and versatile simulation software with cluster capabilities.

Scope of Work

The main objective at the start of this thesis was to gain experience in running COMSOL Multiphysics on the supercomputer Vilje. The most common types of jobs that require a great deal of computational resources are
• Large 3D simulations with millions of degrees of freedom
• Time dependent simulations
• Optimization methods that run hundreds or thousands of simulations
A general description of how one can run COMSOL models on a cluster will be presented.
In master projects, time dependent simulations and optimization are perhaps the most common jobs. Of these, optimization is the type of job that will benefit the most from parallel computing. Different optimization methods are reviewed, and a global optimization method is selected for implementation on Vilje. The goal is to 1) describe a method for how one can optimize COMSOL models on Vilje, and 2) use the method to optimize a model from fellow student Charlie Bjørk.
This thesis is partially meant to be a "how to"-manual for future students. Hopefully this work will make it easier for future master students to utilize high performance computing in their COMSOL projects.

II. SCIENTIFIC AND PARALLEL COMPUTING

Scientific computing can be defined as "the study of how to use computers to solve mathematical models in science and engineering". Scientific computing emerged in 1938 when Konrad Zuse built the first programmable computer in order to solve systems of linear equations [1]. Before that, scientists had to make many simplifications to be able to solve a problem by hand, and only simple PDEs could be solved accurately. Since then, scientific computing has become an increasingly important enterprise for researchers and engineers. Most industrial sectors rely on scientific computing when developing new designs and products. The development is driven by the need to solve larger and more complex problems.
Scientific computing combines mathematical models and numerical analysis to solve complex problems. The first step is to use understanding of the physical problem to set up a suitable mathematical model. The model will in most cases consist of differential equations and a number of initial and boundary conditions [2]. Numerical methods and computer science can then be used to solve the system of equations.
• The distributed memory model
When running COMSOL on a Linux or Windows cluster consisting of several computers, the distributed memory model is used. The nodes in a cluster do not share the same memory, and the speed of communication between nodes will depend on the physical distance between them. COMSOL uses MPI for node-to-node communication on a cluster. MPI is the dominant message passing protocol for parallel computing.
Both memory models are combined to give the best performance. When you start a COMSOL job on a cluster, COMSOL will use the shared memory model within each node, and MPI to communicate between nodes. It is also possible to use MPI (distributed mode) to communicate between the cores within a CPU, but this will be slower than using the shared memory mode.
An advantage of COMSOL Multiphysics is that it will set up and manage the communication between nodes automatically. Little knowledge about parallel computing is required to use it, but in order to get the most out of distributed mode, the model should be set up with parallelism in mind. Choosing the right solver is important in order to best utilize a computing cluster.

A. Solvers

There are two main methods to solve systems of linear equations: direct and iterative methods. Direct methods require a fixed, deterministic number of steps to produce a solution. Gaussian elimination and LU factorization are some examples of direct methods. Iterative solvers improve on an initial guess for each iteration. The process can be repeated until the residual Ax − b is sufficiently close to 0.
Direct solvers are very robust and will work on most problems, but iterative methods are usually more efficient both in time and memory consumption. Because of this, the default solver uses a direct method only for 2D problems, and for smaller 3D problems.
The initial sparseness of the system matrix will not be maintained when using a direct solver. Many of the zero terms will become non-zero during the solution process [10]. This is called "fill-in", and it is undesirable because it increases the memory requirements and the number of arithmetic operations. Different strategies can be used to reduce the fill-in.
Iterative methods can be challenging to set up for complex multi-physics models, as they are less robust. There is a trade-off between the robustness of a solver and its time and memory requirements. The fastest iterative solvers are the least robust, and do not work for all cases.
There are many iterative solvers available in COMSOL, but they are all similar to the conjugate gradient method. Iterative methods rely on good preconditioners to be efficient. A widely used preconditioner is the geometric multigrid (GMG) technique, which can handle a large class of problems [?]. Preconditioners can use more time and memory than the iterative solver itself [11].
There are many iterative solvers to choose between in COMSOL, but two of the most widely used techniques are multigrid methods and domain decomposition.

Fig. 4: A schematic description of the full multigrid algorithm [12]

Multigrid Solvers
Basic iterative methods remove the high frequency errors quickly, but use a long time to remove the low frequency errors. The multigrid method is based on the observation that if a problem is transferred to a coarser grid, the highest frequency errors disappear and the low frequency errors turn into higher frequency errors. Transferring the problem to a coarser grid is called restriction.
The first step in multigrid methods is to remove the high frequency errors with a basic iterative method, giving the solution x_l. The residual

r = b − A · x_l    (4)

of the solution is computed, and the system is restricted from a mesh with size h to a mesh with size 2 · h. The solution on the mesh n · h is then prolongated up to the mesh (n/2) · h by interpolation.
In order to solve the residual on the coarser mesh, the system can again be transferred to a coarser grid. This can be repeated recursively until the system is small enough to solve with direct solvers. This recursive scheme is called the multigrid V-cycle.
Multigrid methods are popular because of their rapid convergence rate. The computational work increases linearly with the number of unknowns.
Domain Decomposition
Domain decomposition works by dividing model domains into sub-domains, and solving the problem for each sub-domain. The total solution is then found by iterating between the computed solutions for each sub-domain, using the current neighboring sub-domain solutions as boundary conditions [11].
To solve each sub-domain, a "Domain solver" is used. In COMSOL, MUMPS is the default domain solver, but there is a wide range of solvers to choose between. The idea is to divide the domains into small enough pieces that direct solvers are efficient. Domain decomposition is combined with a global coarse solver to accelerate convergence [13]. Figure 5 shows the domain decomposition tree in COMSOL, with the coarse solver and the domain solver nodes.
Fig. 6: (a) A coil before partitioning; (b) the coil after partitioning into 2 domains; (c) the coil after partitioning into 20 domains.
B. Meshing in Parallel

The free mesher in COMSOL runs in parallel both in shared memory mode and in distributed mode.
The free mesher starts by meshing the faces in a model, and then moves on to the interior volumes of the domains. After the faces between two domains are meshed, the domains can be meshed independently of each other, and the two jobs can be distributed on the available cores.
The free mesher will distribute the domains automatically, but it cannot divide a domain into sub-domains in order to parallelize the job further. If there is only one domain in the model, which can be the case for some imported CAD-models, there will be limited speedup from using more processors.

Fig. 7: Meshing time for the model "Box" as a function of the number of domains in the model ((b) shows the small scale).
Reducing the meshing time by partitioning
In order to parallelize the meshing, the domains can be partitioned manually. To do this, add a work plane to the model. A "Partition Objects" geometry operation can then be used to partition the domain with the work plane. If you only want the partition to affect the meshing, and not the geometry, the "Virtual Operations" geometry operation is useful. It allows you to partition a model only for meshing purposes, without influencing the physics settings of the model.
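As a rough illustration, the same steps can be scripted through LiveLink for Matlab. The tags ('geom1', 'blk1') are hypothetical and the exact property names may differ between COMSOL versions, so treat this only as an outline of the workflow described above.

geom = model.geom('geom1');                  % hypothetical geometry tag
wp = geom.create('wp1', 'WorkPlane');        % add a work plane
wp.set('quickplane', 'xy');                  % orient it...
wp.set('quickz', 0.05);                      % ...and offset it along z
par = geom.create('par1', 'Partition');      % "Partition Objects" operation
par.selection('input').set({'blk1'});        % the object to split (hypothetical tag)
par.set('partitionwith', 'workplane');
par.set('workplane', 'wp1');
geom.run;                                    % rebuild the geometry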
To test the effects of partitioning on meshing time, two simple models were created.
1) Model "Square" consists of a simple 3D-box geometry. The mesh was set to extremely small.
2) Model "Coil" consists of a simple coil with 10 windings. The mesh was set to "Free tetrahedral", and the size to extremely small. See figure 6.
The models were meshed with a different number of partitions to investigate the speedup. The simulations were run on my laptop, which has an Intel Core i7-3720QM CPU with 4 physical cores, and 8 GB of memory. The results are plotted in figures 7 and 8.
In figure 7 b) it is possible to see that a parallel job is most efficient if the number of jobs is divisible by the number of cores. The lines for 4 cores (blue line) and 3 cores
Fig. 10: Two different ways to distribute a parametric sweep. In figure a), 4 simulations in the sweep are started simultaneously, with each simulation running on 8 cores.

TABLE III: Simulation times for the model "Electronic Enclosure Cooling"

Fig. 12: Simulation times for the model "Box", for different solvers and mesh sizes
In 1996 and 1997 David Wolpert and William Macready presented the "No Free Lunch" theorem. They observed that the performance of all search algorithms, averaged over all possible problems, is exactly the same [18]. This includes blind guessing. No algorithm is universally better than another on average. This means that there is no search algorithm that always performs better than others on any class of problems. Instead of looking for the best general optimization algorithm, one must look for the best optimization algorithm for the specific problem at hand. Much research has been done over the last decades on finding the strengths and weaknesses of different optimization algorithms, and on which problems they excel.
Hybrid algorithms take advantage of both local and global optimization methods. They use a global method to find the area of the global optimum, and then switch to a gradient-based method to find the optimum with fast convergence.

A. Metaheuristics

Metaheuristic methods cannot guarantee that the solution found is the best solution, but they are useful as they make few assumptions about the problem, and can handle very large search spaces. Metaheuristics can also solve a wide range of "hard" optimization problems without needing to be adapted to each problem [19]. A metaheuristic is useful when a problem cannot be solved with an exact method within a reasonable time, and metaheuristics are used in many different areas, from engineering to finance. Metaheuristics can solve problems with several control variables. Metaheuristic methods are very often inspired by nature, and often use random variables.
Metaheuristic methods have gained more and more attention over the last 30 years.
"The considerable development of metaheuristics can be explained by the significant increase in the processing power of the computers, and by the development of massively parallel architectures. These hardware improvements relativize the CPU time costly nature of metaheuristics." (Boussaid, Lepagnot, Siarry) [19].
Two important processes decide the success of a metaheuristic: exploration and exploitation. Exploration is a global sampling of the search space, with the purpose of finding areas of interest. Exploitation is the local sampling in these areas in order to close in on the optimal solution. The main difference between metaheuristics is how they balance exploration and exploitation [19].
Metaheuristic methods can be divided into single-solution and population based methods. Single-solution based metaheuristics start with a single point in the search space and move around; they are also called "trajectory methods". The most popular trajectory method is Simulated Annealing, proposed by Kirkpatrick et al. in 1982 and inspired by annealing in metallurgy. The objective function is thought of as the temperature of a metal, which is then lowered to a state of minimal energy.
Population based metaheuristics initialize a set of solutions, and move these through the search space. One main advantage of population-based methods over trajectory methods is that they can be implemented in parallel. Evolutionary Algorithms (EAs) and Swarm Intelligence (SI) are the most studied groups of population based methods. EAs are based on Darwin's evolutionary theory, and contain, among others, Evolutionary Programming and Genetic Algorithms (GA). Swarm Intelligence algorithms are methods inspired by the social behavior of insects and birds. Typically the individuals will be simple, and need to cooperate to find a good solution [19]. Together the individuals can perform complex tasks, similar to ants and bees. The most studied SI methods are Particle Swarm Optimization (PSO) and Ant Colony Optimization. Some other examples are Bacterial Foraging Optimization, Artificial Immune Systems and Bee Colony Optimization.
In these algorithms the solutions are imagined to be points in an n-dimensional space where each control variable represents one dimension. The solutions (called particles or insects depending on the algorithm) can then fly or crawl through this multidimensional space, often called the search space.

Fig. 14: Overview of basic PSO

1) PSO: PSO was introduced by Kennedy and Eberhart in 1995 as a way to simulate the social behavior of a flock of birds or a school of fish. An overview of the generic PSO algorithm can be found in figure 14. PSO is being used on many different problems, from reactive power and voltage control to human tremor analysis [18].
The algorithm works by first initializing a random group of solutions, called particles. The values of the decision variables make up the position of the particle. Each particle will have a velocity and a position in the search space. For each iteration (called a time step) the particles will move to a new position, and evaluate the fitness of the solution. How a particle moves depends on its own memory of the best found position, and the memory of the neighborhood particles. The neighborhood size can vary from a few to all of the particles in the swarm, depending on the implementation.
The best known position found by all the particles is called the global best, and the best position found by a single particle its local best.
The particles will move based on their own experience, and the experience of their neighbors.
As a particle is drawn towards another, it might discover new, better regions, and will start to attract other particles, and so on. The particles will be drawn towards and influence each other, similar to individuals in a social group. Interestingly, if looked on as social individuals, one of the driving factors of the behavior of the particles is confirmation bias. From [?]: "...individuals seek to confirm not only their own hypotheses but also those of their neighbors".
Two equations are updated for each particle and time step: the velocity and the position. The position is given by:

x_p^(i+1) = x_p^i + v_p    (5)

where x_p^i is the position vector for the previous time step, and v_p is a velocity vector. The velocity is given by:

v_p = v_p + c_1 r_1 (p_l − x_p) + c_2 r_2 (p_g − x_p)    (6)
    = v_current + v_local + v_global    (7)

The new velocity is the sum of three vectors: the current velocity, the new velocity towards the global best position and the new velocity towards the local best position. r_1 and r_2 are uniformly distributed random numbers, r ∼ U(0, 1). The local and global best positions are given by p_l and p_g. Often a v_max parameter is defined, such that if v_p > v_max then v_p = v_max. This is to limit the velocity.
c_1 and c_2 are called the cognitive and social scaling parameters [20]. These parameters determine the balance between exploration and exploitation. c_1 and c_2 determine how much the movement of the particle will be affected by the local and global best found positions respectively. A large c_2/c_1 ratio will restrict the search space quicker and shift the balance towards exploitation, as the particles will be pulled fast towards the global best known position. A small ratio means that the balance will shift towards exploration. The values chosen for c_1 and c_2 will affect the convergence significantly.
A parameter that can improve convergence was introduced by Shi and Eberhart [21] in 1998. Adding an inertia weight to the particles gives increased control of the velocity. With inertia weight the expression for the velocity becomes:

v_p = w · v_p + c_1 r_1 (p_l − x_p) + c_2 r_2 (p_g − x_p)    (8)

where w ∈ [0, 1] is the weight. w determines how much the particle accelerates or decelerates. A large inertia weight will give the particles higher speed and result in a more global search, while a small w makes them search locally [?]. The velocity needs to be large enough that particles can escape local optima. The benefit of adding inertia weight is that the velocity can be changed dynamically, controlling the transition from exploration to exploitation.
Many different dynamic inertia weight strategies have been introduced. Linear Decreasing Weight will simply decrease the inertia weight for each time step. As the particles have explored the search space, and are closing in on the global maximum, the speed is reduced, moving from a global search to a local search. Random Inertia Weight will simply speed up or slow down the particle randomly. [20] found in experiments that random inertia weight was the most efficient strategy (fewest iterations) for a handful of benchmark problems, while linear decreasing weight gave near optimum results compared to other methods, but required a very high number of iterations. Random inertia weight is given by

w = 0.5 + rand()/2    (9)

where rand() is a uniformly distributed random number in [0, 1].
There is no known best population size to initialize a particle swarm [21]. It is important to have enough particles for the algorithm to be effective. The particles need to sample the available solutions early, because the search space will become more and more restricted as the particles move towards the global best position [22]. However, having a large number of particles will limit the number of time steps owing to limited computational resources. [?] recommends, based on experience, using between 10 and 50 particles.
To achieve quick convergence and a small error, it is necessary to tune the parameters w, c_1 and c_2. Small changes to the parameters can give large changes in the behavior of the particles [?]. Choosing the parameters is an optimization problem in itself, and the optimal values will be different for each problem. A common choice is c_1 = c_2 = 2, with c_1 + c_2 ≤ 4 and w ∈ [0.4, 0.9]. Pedersen [23] recommends that c_2 > c_1.
The size of the neighborhood is often set to be the size of the swarm, that is, all the particles communicate with each other. This is called the star structure, and has been found to perform well [18]. The main disadvantage of using only one neighborhood is that the particles cannot explore more than one optimum at a time; they are all drawn towards the same location. How many optima the objective function has should be considered when deciding the size of the neighborhood. For optimization of electrical motors, it seems intuitive that there are several optima, and so using several neighborhoods might be the best solution.
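To tie equations (5)-(9) together, here is a minimal, self-contained Matlab sketch of the basic PSO loop with a star topology and random inertia weight. The swarm size, bounds and toy objective are illustrative assumptions, not the settings used in this thesis.

S = 30; D = 2; T = 100;                        % swarm size, dimensions, time steps (assumed)
c1 = 2; c2 = 2; vmax = 0.5;                    % scaling parameters and velocity limit
lb = -5*ones(1,D); ub = 5*ones(1,D);           % search space barriers (toy values)
f  = @(x) -sum(x.^2, 2);                       % toy fitness function (maximized)

x  = lb + rand(S,D).*(ub - lb);                % random initial positions
v  = zeros(S,D);
pl = x; fl = f(x);                             % local best positions and fitness
[fg, i] = max(fl); pg = pl(i,:);               % global best (star topology)

for t = 1:T
    w  = 0.5 + rand()/2;                       % random inertia weight, equation (9)
    r1 = rand(S,D); r2 = rand(S,D);            % r ~ U(0,1)
    v  = w*v + c1*r1.*(pl - x) + c2*r2.*(pg - x);   % velocity update, equation (8)
    v  = max(min(v, vmax), -vmax);             % clamp the velocity to v_max
    x  = x + v;                                % position update, equation (5)
    fx = f(x);                                 % evaluate the fitness of each particle
    better = fx > fl;                          % update local bests
    pl(better,:) = x(better,:);
    fl(better)   = fx(better);
    [fbest, i] = max(fl);                      % update the global best
    if fbest > fg, fg = fbest; pg = pl(i,:); end
end

In the thesis the fitness evaluation f(x) is replaced by a COMSOL parametric sweep over the particle positions, as described in the implementation section below.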
B. Optimization in COMSOL

There are several ways to optimize a COMSOL model:
• Using understanding of the model and intuition to tweak the parameters by hand. One can also set up parametric sweeps manually.
• Using the built-in optimization capabilities of COMSOL.
• Using LiveLink for Matlab to implement your own optimization algorithm.
In this thesis the last approach was chosen: PSO is population based, so in each iteration the whole swarm can be evaluated in parallel with a parametric sweep, and LiveLink for Matlab makes it possible to implement such a custom algorithm around a COMSOL model.

C. LiveLink for Matlab

LiveLink for Matlab is a very useful product from COMSOL that allows you to create, edit and run COMSOL models from Matlab scripts. LiveLink for Matlab is one of COMSOL's "interactive" products, a group of products that allow you to work on COMSOL models through other programs.
Fig. 15: The model "box.mph" before and after running a Matlab script. The figure shows the COMSOL desktop and Matlab side by side, connected to the same server. COMSOL automatically updates the graphics window when you change the model in Matlab.

for port = 2036:2055               % try a range of ports until one answers
    try
        mphstart(port);
        break;
    catch
        s = ['tried port number ', num2str(port), ' unsuccessfully'];
        disp(s)
    end
end

Fig. 16: A suggestion of how to connect Matlab to the COMSOL mphserver on a cluster.

module load comsol/5.2
module load matlab

COMSOL_MATLAB_INI='matlab -nosplash -nodisplay -nojvm'

comsol -nn 6 -np 8 -clustersimple -mpiarg -rmk -mpiarg pbs mphserver -silent -tmpdir $w &
sleep 8
matlab -nodisplay -nosplash -r "addpath $COMSOL_INC/mli, yourScript"

Fig. 19: An example job script for running the COMSOL mphserver.

The best way found to work with a model from Matlab was not to save the model as a .m file and edit it, but to use the "mphopen" function, which asks the mphserver to open an existing model. Figure 18 gives an example of the "mphopen" function. Here the model "Box.mph" is loaded by the mphserver. The data in the table with tag "tbl1" is then extracted and returned to Matlab with the "mphtable" function.
When you are working with a model, it is important to use the right tags for the object you want to change. The command "mphnavigate" will open a window in Matlab with all the objects and their tags, but I found it a lot easier to just open the model in COMSOL, and find the tags there. You can find the tags of a study, geometry object, mesh and so on, by clicking the object and going to the "Properties" window.
If you write a Matlab script that interacts with a COMSOL model, make sure that the tags you are using in the script still refer to the right objects if you later change the model using the COMSOL desktop.
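As a sketch of the "mphopen" workflow just described (cf. figure 18), where the port number is mphstart's default:

mphstart(2036);                   % connect to a running mphserver (default port)
model = mphopen('Box.mph');       % ask the server to open an existing model
tbl = mphtable(model, 'tbl1');    % extract the table with tag "tbl1"
data = tbl.data;                  % numeric contents, one row per table line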
Subject to the constraints

2 < P_i < 30
12 < N_i < 150
P_i + 10 < N_i ≤ 8 · P_i
0.01 < HSR_iron_thickness < 0.10
0.01 < Slot_thickness < 0.10
⋮
0.01 < AG_i < 0.1

Each parameter represents one dimension in the search space, a 12-dimensional space where the lengths of the "walls" are determined by the parameter bounds. The bounds on each parameter are called the barriers of the search space. All of the dimensions, except N_i, had static barriers. N_i was handled by first calculating the new particle positions in dimension P_i, and then setting the barriers for N_i to [P_i + 10, P_i · 8] in every iteration.
The barriers were handled with "bouncing walls": if a particle hits a wall, the velocity will change direction.
The integers were handled by simply rounding off the values.
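A minimal sketch of this barrier handling, assuming hypothetical S-by-D matrices x (positions) and v (velocities), 1-by-D barriers lb and ub, and intdims holding the indices of the integer variables:

for d = 1:size(x, 2)
    hit = x(:, d) < lb(d) | x(:, d) > ub(d);   % particles that hit a wall
    v(hit, d) = -v(hit, d);                    % "bouncing walls": reverse the velocity
    x(:, d) = min(max(x(:, d), lb(d)), ub(d)); % keep the position inside the barriers
end
x(:, intdims) = round(x(:, intdims));          % integers handled by rounding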
Sweep" and "Study 1 -> Job Configurations -> Parametric
Sweep", or COMSOL gives you an error. It might be because
A. Implementing PSO with Matlab
the "Job Configurations" node does not update automatically.
model.result.table('tbl10').clearTableData;                   % clear the old probe table
model.study('std1').feature('param').set('plistarr', positions);
model.batch('p1').set('plistarr', positions);                 % must be set in both places

model.sol('sol1').runAll;
model.batch('p1').run;                                        % run the sweep

Fig. 21: Matlab code showing how you can change a parametric sweep list, and run the sweep using the mphserver.
model.batch('p1').set('plistarr', positions);   % update the sweep list
mphsave(model, 'tempfile.mph');                 % save the model to disk

% run the sweep as a distributed batch job and load the result
status = system('comsol -nn 200 -np 4 -verbose -clustersimple -f comsol_nodefile -batchlog log.txt batch -tmpdir $w -inputfile tempfile.mph -outputfile outfile.mph');

model = mphopen('outfile.mph');

Fig. 22: Matlab code showing how the sweep was run using the system command.

Fig. 24: Particle paths for a run with S = 30, showing the paths of each particle for the control variables "Stat_PM_thick" (thickness of the stator permanent magnets) and "AG_i" (inner air gap length). Panel (a): "AG_i"; panel (b): "Stat_PM_thick".
If one of the simulations in the parametric sweep fails, there will be a missing line in the probe table. To handle this, a function was created to find the missing lines in the matrix, and insert a fitness value of 0.
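A sketch of how such a repair function might look. The layout assumptions are hypothetical: the first column of the probe table is taken to hold the parameter-case index and the last column the objective value, with S the swarm size:

fitness = zeros(S, 1);              % failed simulations keep a fitness of 0
tbl = mphtable(model, 'tbl10');     % probe table after the sweep (tag from figure 21)
idx = round(tbl.data(:, 1));        % case indices of the runs that completed
fitness(idx) = tbl.data(:, end);    % fill in the fitness of the completed runs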
The maximum velocity was set by testing the algorithm and looking at the paths taken by the particles. It was given a value that gave a nice-looking path without any large jumps. The chosen PSO parameters can be found in table V.

B. Results

PSO was run several times with different numbers of particles, and the results can be found in figure 23. As the figure shows, it looks like a higher number of iterations would have been beneficial. Looking at the dark blue line, it rises rapidly towards the end of the optimization.
Figure 24 shows the path taken by the particles in two of the dimensions, for one of the runs (the dark blue line). In figure 24 b), the particles converge early, at around iteration 30-40. Studying the particle paths for all the dimensions showed that the particles converged early in all the dimensions, except for the one shown in figure 24 a). If the particles had been given enough iterations to converge in all the dimensions of the search space, a better objective function might have been achieved.
The algorithm could also have been tested with several neighborhoods, and with different scaling parameters. The optimization algorithm was not tuned further, however, because the results were found to be excellent by Bjørk, and because of time restrictions.
A large parametric sweep of the same model was also run for Bjørk on Vilje, with close to 2000 simulations. The results were used to select good parameter values in his iterative optimization approach. PSO delivered a slightly better objective function than his approach. PSO needed to run more simulations in total to achieve a good result (about 10 000 simulations), but it was still a relatively small amount of computational resources.
It is surprising that PSO performed so well without any tweaking. The chosen scaling parameters most likely fit the problem very well.

VI. CONCLUSION

Running COMSOL jobs on Vilje requires some technical know-how, but it should be relatively easy for most students to learn. Very little knowledge about parallel computing is required, as COMSOL will set up communication between nodes automatically.
The COMSOL batch job and the COMSOL mphserver were both run on Vilje. The batch job was found to be the most stable option. The COMSOL mphserver was not run successfully on more than one node.
The speedup of medium-sized single stationary studies was found to be very limited, and according to COMSOL [14] the maximum possible speedup for large models is in the range 8-13. The speedup is generally greater for models with tens to hundreds of millions of degrees of freedom.
In some cases, speedup can be improved by partitioning the mesh. The most important thing is to select the right solver. Iterative solvers were found to be much faster for a selected model, but the speedup was in the same range.
The best possible speedup when running COMSOL jobs on Vilje is achieved when parametric sweeps are distributed on the cluster. Population based optimization algorithms will benefit from this, as in each iteration the population individuals can be evaluated by a parametric sweep.
PSO was implemented, and gave good results when applied to a machine optimization case. It is surprising that PSO was able to deliver such good results without any tweaking of the algorithm, as most literature suggests that this is important in order to achieve good convergence. This suggests that PSO might be more powerful than first assumed, or that the chosen parameters fit the problem very well. PSO was found to be a very useful tool for design optimization.
There were many problems with running LiveLink for Matlab distributed on a cluster, and a great deal of time was spent trying to make it work. In the end, an alternative method was found for running PSO on Vilje.
Vilje will be replaced by the supercomputer "Fram" in the near future, but the methods and tips presented in this thesis should be valid and relevant for Fram as well.

VII. ACKNOWLEDGMENTS

I would like to thank:
Supervisor Robert Nielssen, for help and encouragement
Boyfriend Andreas Roessland, for moral support
Friend Muhammed Usman Hassan, for his inputs and for creating a positive working environment
Charlie Bjørk, for a pleasant collaboration

REFERENCES

[1] W. Gander, M. J. Gander, and F. Kwok, Scientific Computing: An Introduction using Maple and MATLAB. Switzerland: Springer, 2014.
[2] G. H. Golub and J. M. Ortega, Scientific Computing and Differential Equations: An Introduction to Numerical Methods. Academic Press, 1992.
[3] W. Frei, "Meshing your geometry: When to use the various element types," https://www.comsol.com/blogs/meshing-your-geometry-various-element-types/, accessed: 2017-07-03.
[4] COMSOL Multiphysics Cyclopedia, "The finite element method (FEM)," https://www.comsol.no/multiphysics/finite-element-method, accessed: 2017-07-01.
[5] S. J. Salon, Finite Element Analysis of Electrical Machines. Kluwer Academic Publishers, 1995.
[6] R. Shonkwiler and L. Lefton, An Introduction to Parallel and Vector Scientific Computing. Cambridge University Press, 2006.
[7] E. Rønquist, A. M. Kvarving, and E. Fonn, Introduction to Supercomputing, TMA4280, 2016.
[8] "Top500 list - June 2017," https://www.top500.org/list/2017/06/?page=2, accessed: 2017-07-06.
[9] D. Padua, Encyclopedia of Parallel Computing. Springer, 2011.
[10] J. Akin, Finite Element Analysis with Error Estimates: An Introduction to the FEM and Adaptive Error Analysis for Engineering Students. Elsevier Butterworth-Heinemann, 2005.
[11] COMSOL, "COMSOL Multiphysics reference manual," 2016.
[12] I. Yavneh, "Why multigrid methods are so efficient," 2006.
[13] J.-P. Weiss, "Using the domain decomposition solver in COMSOL Multiphysics," https://www.comsol.com/blogs/using-the-domain-decomposition-solver-in-comsol-multiphysics/, accessed: 2017-07-09.
WinSCP allows you to move files from your computer to Vilje by simply dragging your files from one folder to the other. The left window is your own computer; navigate to the right folder and drag your COMSOL file across.

(a) PuTTY start screen. (b) How PuTTY looks when you are logged in to Vilje.

PuTTY allows you to log in to Vilje from another computer. To log on to Vilje, write "vilje.hpc.ntnu.no" under "Host name", and press Open. In the future, this will change as Vilje is replaced by "Fram". You will be prompted to give your username and password.
Vilje runs Unix, and has a command-line interface. To navigate and submit your job, you need to learn a few Linux commands. As a minimum, learn how to navigate.
• cd Change directory, e.g. "cd myfolder"
• cd .. Move to the parent directory.
• ls List the contents of the directory
• cat myfile Print the contents of "myfile" to screen
• vi myfile Edit the file "myfile" with Vi
To summarize: with WinSCP you move your files to Vilje, with PuTTY you can tell Vilje what to do with the files.
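Jobs on Vilje are submitted through a queueing system; the job scripts in this thesis target PBS (note the "-mpiarg pbs" flag in figure 19). Assuming a standard PBS setup, a few commands are likely to be useful:
• qsub myjobscript Submit the job script "myjobscript" to the queue
• qstat -u myusername List the status of your own queued jobs
• qdel jobid Delete the job with the given id from the queue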
It is also very useful to learn how to use a text editing program like Vi/Vim, so you can easily change a job script. Vi
has its own set of commands, but you only need a few in order to edit a text file.
• i Insert mode will move you from the Vi command line and "into" the text, allowing you to edit the text.
• Esc Use escape to leave "insert mode" and move back to the Vi command line.
• :q Quit the program without writing to file.
• :w Write to file
• :wq Write to file, and quit Vi.
• gg and shift+g In "command line mode", these commands will move you to the top or bottom of the text. Useful if
you are reading a large log file.