Palladio
DIFFICULTY CALIBRATION
As we have mentioned, deciding the correct difficulty of a task is a challenge in itself. A difficulty that is too low or too high would push the player to stop playing, out of either boredom or frustration. This concept is easy to explain but difficult to define.

Drawing ideas from (?), we designed a questionnaire to help us understand the range of improvement that we should expect from a player. The questionnaire consisted of nine graphs of past performance. Each graph covered a single action recorded in the game over the last six weeks, in chronological order. For example, the first graph, shown in Figure 1a, pertains to the total number of kilometres that a user recorded by walking in each of the last six weeks (the last point is the most recent).

The nine graphs were taken from real data and chosen to represent the whole range of possible performance progressions, as shown in Figure ??. The first row shows a rising pattern (the user is improving their performance); the second row a stable pattern (the user is maintaining their performance); the third row a declining pattern (the user's performance is worsening). The second and third columns exhibit a sudden change in performance during one week. Each user received a questionnaire in which the graphs were randomly shuffled.

We selected 20 users from [explain where and why].

We asked each user to write on the graph what he/she thought would have been the correct challenge to issue to that particular player, taking into account only the short window of past performance. Users could also write down a short description of their thought process, as shown in Figure 1b.

We collected the answers and recorded the proposed goal for each graph. An example of the aggregated responses is given in Figure 1c, while the complete responses are shown in Figure ??.

Figure 1: Example of a graph of the questionnaire, pertaining to the total number of kilometres recorded walking during the previous six weeks. Panels: (b) example of answer; (c) example of aggregated response.

The next step was to find a function able to approximate the intentions expressed in the questionnaire. We wanted a function that was easy to interpret. By definition, a challenge requires an increase in the player's effort, so we chose to model the challenge prediction as:

    C = P × I    (1)

where C is the goal of the challenge, P is the prediction of the player's effort, and I is the improvement factor. We chose a linear model; future work could assess other functions.

For the estimation of P, we considered the following extrapolation functions: Linear, Polynomial, Conic, Moving Average, and Weighted Moving Average. The rationale was to approximate
the intentions expressed in the questionnaire, as all of the respondents' reasoning took into consideration the past performance shown in the graph. In particular, no user indicated that their reasoning was based only on the last performance. Thus, all methods were tested with a number of previous points in the range [2, 5] (that is, basing the prediction on the performance of the past two, three, four, or five weeks).

The functions were evaluated by comparing their output with the mean response of the questionnaire, using the mean absolute percentage error (MAPE) as the error function, since the graphs pertained to different performance indicators with different ranges.

[...] the user would automatically choose one challenge for them at twelve o'clock on Friday. We thus measured the percentage of players who chose their own challenge among all the players who were given a choice (a set of different available challenges). We call this percentage the choice rate. We use it to measure user engagement: a user who makes the choice can reasonably be expected to still be interested in the game and actively playing, unlike a user who passively accepts the choice made for them by the system. For each week we considered only the users who recorded at least one activity during that week; inactive users were excluded. The results are shown in Figure 3.
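The calibration procedure described in this section (estimate P from the last k weeks, scale it by I as in Equation (1), and score the result against the mean questionnaire response with MAPE) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the text does not give the weighting scheme of the weighted moving average, so linearly increasing weights are assumed here; the candidate improvement factors passed to `best_setting` and all function names and data values are hypothetical.

```python
def weighted_moving_average(history, k):
    """Predict the next value from the last k points of `history`.

    Assumption: linearly increasing weights 1..k, so the most recent
    week counts the most. The weighting scheme is not stated in the text.
    """
    if not 1 <= k <= len(history):
        raise ValueError("k must be between 1 and len(history)")
    window = history[-k:]
    weights = range(1, k + 1)
    return sum(w * x for w, x in zip(weights, window)) / sum(weights)


def challenge_goal(history, k, improvement):
    """Equation (1): C = P * I, with P estimated by the WMA over k weeks."""
    return weighted_moving_average(history, k) * improvement


def mape(predicted, actual):
    """Mean absolute percentage error, expressed in percent."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = (abs((a - p) / a) for p, a in zip(predicted, actual))
    return 100.0 * sum(errors) / len(actual)


def best_setting(graphs, mean_responses, ks=range(2, 6),
                 improvements=(1.2, 1.3, 1.4)):
    """Grid search over window size k in [2, 5] and improvement factor I.

    The candidate improvement factors are an assumption for illustration.
    Returns (k, improvement, mape) for the lowest-error combination.
    """
    scored = []
    for k in ks:
        for improvement in improvements:
            goals = [challenge_goal(g, k, improvement) for g in graphs]
            scored.append((mape(goals, mean_responses), k, improvement))
    error, k, improvement = min(scored)
    return k, improvement, error


# Hypothetical six weeks of walking distance (km), oldest first.
walking_km = [10, 12, 11, 13, 14, 15]
goal = challenge_goal(walking_km, k=5, improvement=1.3)  # about 17.68 km
```

With these made-up numbers, the weighted average over the last five weeks is 13.6 km, so a 30% improvement yields a goal of about 17.68 km. Using a percentage error keeps indicators with very different ranges (kilometres, trip counts, and so on) comparable when they are averaged together.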
In the end, the lowest MAPE was found with the combination of WMA-5 as the prediction function (weighted moving average over the last 5 recorded levels of performance) and I = 1.3, corresponding to an increase of 30% over the predicted user performance.

EVALUATION

In the first weeks the choice rate was high, indicating high engagement, most probably due to the novelty of the game, but it soon started to decrease, most probably because players lost interest. After the introduction of the novel approach at week 11, the choice rate showed an increase that was sustained over time. We can conclude that the novel approach had a direct effect of improving the user's engagement within the gamified system.
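The weekly choice rate discussed here (players who picked their own challenge, out of the active players who were offered a choice) could be computed along the following lines. This is a minimal sketch: the record layout, field names, and sample data are hypothetical and do not reflect the game's actual data model.

```python
from collections import defaultdict
from typing import NamedTuple


class WeekRecord(NamedTuple):
    """One player's status in one week (hypothetical layout)."""
    week: int
    user: str
    activities: int       # activities recorded during that week
    offered_choice: bool  # whether a set of challenges was offered
    chose: bool           # True if the player picked the challenge


def choice_rate_by_week(records):
    """Return {week: fraction of active, offered players who chose}."""
    offered = defaultdict(int)
    chosen = defaultdict(int)
    for r in records:
        # Inactive players and players without a choice are excluded.
        if r.activities < 1 or not r.offered_choice:
            continue
        offered[r.week] += 1
        if r.chose:
            chosen[r.week] += 1
    return {week: chosen[week] / offered[week] for week in sorted(offered)}


records = [
    WeekRecord(1, "a", 3, True, True),
    WeekRecord(1, "b", 1, True, False),
    WeekRecord(1, "c", 0, True, False),  # inactive, ignored
    WeekRecord(2, "a", 2, True, True),
    WeekRecord(2, "b", 4, True, True),
]
rates = choice_rate_by_week(records)
```

Filtering out inactive players before counting matters: otherwise players who have already left the game would lower the rate and mask the behaviour of those still playing.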
Figure 2: Mean difficulty of the challenges proposed in the serious game (x-axis: week; y-axis: mean difficulty).

Figure 3: Choice ratio of the challenges proposed in the serious game (x-axis: week; y-axis: choice ratio).
We employed the proposed approach during the course of the serious game. For the first part of the game, challenges were generated with the naïve approach: the goal was computed directly from the performance observed during the last week and a fixed improvement factor in {1.2, 1.3, 1.4}. Our novel approach was introduced starting from the eleventh week. This allows us to compare its effects directly with those of the naïve approach.

In Figure 2 we compare the mean difficulty of the challenges proposed. The difficulty was computed as the ratio of the requested performance and the performance observed during [...]

We also observed the effect of the novel approach on the completion rate of the challenges, that is, the percentage of users who completed their challenges. Again, for each week we considered only users who were active during that week. The results are shown in Figure 4. During the first week the completion