Player Retention In League of Legends: A Study Using Survival Analysis

Simon Demediuk, RMIT University, Melbourne, Australia, simon.demediuk@rmit.edu.au
Alexandra Murrin, Macquarie University, Sydney, Australia, alexandra.murrin@students.mq.edu.au
David Bulger, Macquarie University, Sydney, Australia, david.bulger@mq.edu.au
Michael Hitchens, Macquarie University, Sydney, Australia, michael.hitchens@mq.edu.au
Anders Drachen, DC Labs, University of York, York, United Kingdom, anders.drachen@york.ac.uk
William L. Raffe, University of Technology (UTS), Sydney, Australia, william.raffe@uts.edu.au
Marco Tamassia, RMIT University, Melbourne, Australia, marco.tamassia@rmit.edu.au

ABSTRACT
Multi-player online esports games are designed for extended durations of play, requiring substantial experience to master. Furthermore, esports game revenues are increasingly driven by in-game purchases. For esports companies, the trends in players leaving their games therefore not only provide information about potential problems in the user experience, but also affect revenue. Being able to predict when players are about to leave the game - churn prediction - is therefore an important solution for companies in the rapidly growing esports sector, as this allows them to take action to remedy churn problems.

The objective of the work presented here is to understand the impact of specific behavioral characteristics on the likelihood of a player continuing to play the esports title League of Legends. Here, a solution to the problem is presented based on the application of survival analysis, using Mixed Effects Cox Regression, to predict player churn. Survival Analysis forms a useful approach for the churn prediction problem as it provides rates as well as an assessment of the characteristics of players who are at risk of leaving the game. Hazard rates are also presented for the leading indicators, with results showing that duration between matches played is a strong indicator of potential churn.

CCS CONCEPTS
• Applied computing → Computer games; • Information systems → Data mining; • Mathematics of computing → Survival analysis;

KEYWORDS
League of Legends, churn prediction, game analytics, esports, prediction, business intelligence, churn

ACM Reference format:
Simon Demediuk, Alexandra Murrin, David Bulger, Michael Hitchens, Anders Drachen, William L. Raffe, and Marco Tamassia. 2018. Player Retention In League of Legends: A Study Using Survival Analysis. In Proceedings of Australasian Computer Science Week 2018, Brisbane, QLD, Australia, January 29-February 2, 2018 (ACSW 2018), 9 pages.
https://doi.org/10.1145/3167918.3167937

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
ACSW 2018, January 29-February 2, 2018, Brisbane, QLD, Australia
© 2018 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-5436-3/18/01.
https://doi.org/10.1145/3167918.3167937

1 INTRODUCTION
Electronic sports (esports) has in the past decade emerged as a popular format for players as well as spectators, fostering a substantial industry and a developing field of research [19, 21, 25, 28, 37, 38]. While it is difficult to estimate the size of the esports market, Superdata Research predicted that the market will be worth $1.1 billion in 2017 and that there will be 330 million spectators by 2019, making esports an important research and development field across game academia and industry.

While there is no official definition, Schubert et al. [25] proposed that esports is any digital game played in a competitive context with an audience. Within esports, Multiplayer Online Battle Arena (“MOBA”) games are an increasingly common form, with League of Legends (“LoL”) being the most popular example. LoL possesses an international player base of approximately 100 million monthly players [15]. Like other MOBAs, LoL involves two teams of five players, each competing to destroy the opposing team’s “Nexus”, a physical structure located at the teams’ bases. Each player is termed a “summoner” and controls a “champion”, which is the player avatar for the battle. There are, at the time of writing, over 120 champions for players to select from. In addition to the opposing five team members, players must also battle computer controlled monsters. Defeating enemies gains the player experience and gold, the former allowing for more powerful abilities to be unlocked in the current
match while the latter can be spent to buy items that increase strength and performance.

LoL is free-to-play, with revenue predominantly driven by micro-transactions, the price of which can range from US$2 to hundreds of dollars, which allow players to purchase items such as champions and champion skins. Given that these items have essentially zero marginal cost, LoL is able to be highly successful and profitable even with only a relatively small number of players making these optional purchases. In 2016, revenue for LoL reached US$1.7bn, the highest grossing of free-to-play online games [30], despite recording the lowest average spend per player of US$18.88, relative to other MOBAs [29].

Given this, the MOBA business model relies on player retention so that there continues to be a body of players making these purchases. Even if only a fraction of the player base engages in micro-transactions, the more such players there are, and the longer they play the game, the higher the income to Riot Games, the publisher of LoL. In addition, there is evidence that in freemium games, the duration of player engagement appears correlated with the chance of purchases occurring [17, 39]. This adds further weight to the interest in fostering a player experience that leads to long-term retention.

The company’s reliance on in-game purchases implies that an understanding of the predictors of the cessation of play is integral to the continued financial success of LoL and similar games. The ability to identify behaviour that is characteristic of a player close to leaving the game can assist a company with knowing when to strategically increase its services or cater more specifically to these individuals to prevent them from leaving [9, 23].

Thus, the contribution of this paper is to provide an initial investigation into key predictors of player disengagement (or “churn”) in LoL. This is accomplished by considering the playing of a match of LoL as a significant event and using survival analysis to predict how much time will pass between an epoch (one game) and that event (another game) given a set of independent variables. Survival analysis is commonly used to analyse customer churn in a variety of industries but its use within games analytics is much rarer. Specifically, the work in this paper extends previous work done in playtime measurement and survival analysis on mobile games [34, 36] and applies it to the more complex MOBA domain. To achieve this, three survival analysis techniques are applied and the results they produce compared: the Kaplan-Meier estimator, the standard Cox regression model, and the mixed effects Cox regression model. These are used with temporal player behaviour features that include recent average time of matches, recent average time between matches, and highest achieved season tier. These features are predominantly agnostic to this specific game, making the approach to churn analysis presented here easily transferable to the wider MOBA genre.

The remainder of the paper is as follows: Section 2 provides more details on customer churn as it applies across multiple industries, as well as the use of survival analysis to identify key indicators of churn. Section 3 describes the survival analysis models that are used in this paper, while Section 4 introduces the data set used and the primary features that are used within the survival analysis models. Section 5 provides the results of the analysis with a more in-depth discussion of the interpretation of the results and avenues for future work provided in Section 6.

2 RELATED WORK
This research builds on a longer chain of investigation in games research, network science and machine learning, which originates in the efforts to manage network and server loads for Massively Multi-player Online Games (MMOGs). Representative of early work, Chambers et al. [4] initially investigated server loads in online games via mining the client-server data streams. Tarng et al. [32] and others expanded this area by investigating why people were leaving games and how it related to playtime within a game. The early work investigating temporal patterns in MMOGs and other online games, usually a game or two at a time, led to more recent large-scale investigations such as Sifa et al. [26] in which patterns were found between playtime and player churn across several thousand games. The discovery of patterns in playtime, including the importance of metrics such as inter-session intervals, session lengths, total playtime, etc., matched the increased adoption of Freemium business models in the games industry and introduced the idea of using behavioral telemetry to predict player behavior. This in turn has recently introduced the idea of performing predictive analysis on player behaviour in games [9, 17, 23, 27, 35, 39], including recently survival analysis [34, 36], and the insights that might be gained through this type of investigation. In the literature on these topics, player departure has been termed “churn” and departing players “churners” [9], adopting terminology from the telecommunications industry.

In this section we first discuss recent work that has been done on analysing and predicting churn in video games, highlighting the difference in available data and player behaviour between game genres and the variety of techniques used to address this. The use of survival analysis techniques in churn prediction within other industries is then presented to demonstrate its successful application in non-games contexts, while also noting its limited use for the player churn problem.

2.1 Churn Analysis in Games
The literature indicates that both behavioural and environmental factors are key components in determining the likelihood of churn. Research on player churn in online games goes back at least a decade, for example to Feng et al. [7], who studied the issue in Eve Online, a science fiction Massively Multi-player Online Role-Playing (MMORPG) game, using traffic analysis to examine data from the early period of the game, 2003-2006. Amongst their conclusions were that player churn increases over time and that the time between play sessions was a reliable means of “identifying players that are about to quit” (i.e., churn). Kawale et al. [14] examined the impact of social influence of other players on churn. Studied in the context of the MMORPG Everquest II, it was found that a significant improvement in the accuracy of churn prediction was achieved through an analysis that combined a player’s session length (behavioural) and network influence (environmental), compared to analysing either factor in isolation. A player’s network influence was modelled using a vector quantity of two components, one being negative influence and the other positive
influence, reflective of how much the player is inclined towards playing the game. They found that a Modified Diffusion Model was superior to both a Simple Diffusion Model and a classification approach based on Network and Engagement features. However, even their best variant had a precision of just over 50%, indicating that considerable improvements could be made to their approach. The importance of a player’s social network was further explored in research conducted by SuperData [3]. That survey found that gamers tend to abandon games in groups, with 34% of churned players indicating that they had left a game because their “friends stopped playing”.

Borbora et al. [1] took an approach based on both data analysis and player-motivation theory to predict players likely to churn. Various game play features (such as rate of quest participation) were used in training a decision function as to whether a player would churn. They found the theory-driven approach to be almost as accurate as the data-driven alternative and claim that the former is more comprehensible by domain experts. They also found that a single classification algorithm may not be able to identify all likely churners. Runge et al. [23] looked specifically at high value players of social games, where a high value player is defined as one who is within the top 10% of all paying players. They evaluated various approaches to churn prediction, including using a hidden Markov model and a neural network, using data sets from two games, Diamond Dash and Monster World. They found that a single hidden layer neural network, with some modifications, had the best performance in terms of predicting players likely to churn. They then used this, in the context of Monster World, to identify players likely to churn and applied strategies to dissuade them from leaving the game. This met with some measurable success.

2.2 Churn in Other Industries
Churn is not just an issue for online games. Customer retention is widely recognised as less costly than recovering churned customers [8] with Reichheld [20] stating that within the financial services industry “a 5% increase in customer retention produces more than a 25% increase in profit”. Understandably, extensive research and analysis has been conducted on the topic of churn prediction and factors affecting customer retention. Coussement and De Bock [5] analysed both behavioural and demographic factors in their study of the online gambling industry, using both a Random Forest model and a Generalised Additive Model (GAM). A total of 30 drivers (27 behavioural and 3 demographic) were ranked according to predictive power on churn, with the top 3 variables being number of days since last bet, number of days since last net loss and number of betting sessions relative to the length of relationship.

Statistical models used to analyse customer attrition in areas such as the telecommunications industry and credit card provision include logistic regression and decision tree analysis, typically when encountering cross-sectional data [2], [11], [18], [33]. In the case of logistic regression, an arbitrary threshold (specific to the context) is typically selected as the point of churn, which results in a binary response variable indicative of a subject having churned. Independent variables are then used to predict the probability of the binary outcome. For decision tree analysis, historical data is organised into a hierarchical structure according to a set of conditions with a probability assigned to each node. Nie et al. [18] compared the usage of these two techniques in predicting churn using credit card data collected from a Chinese bank. The data analysed included customer, card and risk information, as well as transaction activity. It was found that superior performance was achieved using the logistic regression approach over the decision tree algorithm. Further, the different classes of statistical models used for customer retention modelling are often split between what is deemed ‘static’ and ‘dynamic’ depending on the type of data. Static models are applied to cross-sectional data and generally include logistic regression, linear regression and neural networks, whereas dynamic models tend to capture longitudinal data and include methods such as Bayesian and survival analysis [40].

2.3 Survival Analysis for Churn Prediction
When modelling longitudinal data, survival analysis is a common approach with usage prevalent across many industries. Lu [16] applied survival analysis techniques in his study on the “fiercely competitive” telecommunications industry. Specifically, the study applied a parametric regression model for the estimation of the survival and hazard function to provide information on customer churn rates, as well as for the identification of customers at high risk of churn. Kaminski and Geisler [12] used survival analysis to understand the retention of science and engineering associate professors at multiple US universities through analysing the time from original hire to departure. Further, applying a Cox Proportional Hazards Model to retail banking customer data, Zhang [40] found that increases in customers’ service usage, cross-buying, tenure experience and complicated product usage led to longer customer retention.

The application of Mixed Effects Cox Regression for churn analysis in the gaming industry appears somewhat less conventional, with little, if any, publicly available evidence of this being conducted. The objective of the current paper is to understand the impact of certain behavioural characteristics on the likelihood of a player continuing to play LoL. Specifically, through analysing factors affecting the rate of time until a subsequent match is played, behavioural characteristics associated with longer durations until the next match can be highlighted as leading indicators of potential churn. Through providing greater insight into the effects of match duration, time between consecutive matches and player skill on the hazard rate, esports companies will become better informed on how to introduce targeted strategies for players who exhibit these characteristics and are consequently at risk of churn. Given the dependency of the business models of Riot Games and other esports companies on revenue generated via in-game purchases, these characteristics must be identified early so that the necessary actions can be pursued.

3 SURVIVAL ANALYSIS MODELS
As noted above, survival analysis is a common approach for modelling longitudinal data, with usage prevalent across many industries. It is used to predict the amount of time that will pass before an event occurs, basing prediction on potential influencing features. In a traditional
medical sense, this event is typically a reaction, remission, or death. The analogous negative event in our context may seem to be the churning of a player. However, there is no specific time that a “churn event” occurs and so for our purposes the analysis is inverted; the event is that of a player playing a match of LoL. This means that a player is said to have survived if they have not played a game between the epoch and the current time interval and so survival analysis is used to predict how long it will be until the player plays another match. Thus, if a player is predicted to “survive” for a long time, then it is likely that they have churned and are not returning or at least are disengaged with the game and will not return for a while.

Survival analysis calculates the survival function, which gives the probability of a subject surviving (not playing a game) past a certain time t: $S(t) = P(T \ge t)$. Inversely, the hazard function is used to give the probability of the event (playing the game) occurring at a specific time step given that it has not already happened, also known as the instantaneous failure rate: $h(t) = -\frac{\partial}{\partial t} \log S(t)$. The two quantities can be used to derive each other, and hence they are equivalent. However, this paper focuses on the hazard function, as a decrease in the hazard rate over time implies a decreasing probability of the player returning to the game. It can also be seen as the decreasing probability of a player returning to the game of their own volition and therefore likely requiring incentives from the developer (Riot Games) to return.

We first use a Kaplan-Meier estimator to model the survival function at the population level. Then, we study the impact of behavioural variables using a Standard Cox Regression Model and a Mixed Effects Cox Regression Model. All the models used are introduced in the next sections.

3.1 Kaplan-Meier Estimator
Survival analysis is often faced with difficulties related to the data. For example, some individuals do not experience the “event” during the study, so it is unknown after how long they experience it or if they experience it at all. Furthermore, some individuals may decide to quit the study before the end. These types of unknown data are termed censored data. The simplest method to calculate the survival over time despite these difficulties is the Kaplan-Meier estimator [13]. The survival probability is estimated according to the number of observations surviving past time t, divided by the total number of observations in the risk set for a given interval of time. Hosmer Jr and Lemeshow [10] summarise this with the following equation:

$$S(t) = \prod_{t_i \le t} \frac{n_i - d_i}{n_i}, \tag{1}$$

where $n_i$ represents the number of observations at risk of the event at time $t_i$ and $d_i$ represents the number of observations that experienced the event at time $t_i$.

The main limitation of the Kaplan-Meier estimate is that it models survival at the population level. It is desirable, instead, to model survival as dependent on some features (e.g. behavioural features). The Standard Cox Regression Model, introduced in the next section, addresses this need.

3.2 Standard Cox Regression Model
The Standard Cox Regression Model [6] can be used to explore the relationship between specific features and the rate of experiencing some defined event. The model assumes a functional form for the features (parametric), whereas no distribution assumption is required for the survival times (non-parametric). This makes this model advantageous over alternative statistical models, which require distributional conditions on the response variable.

The model assumes a hazard function of the following form:

$$h(t) = h_0(t) e^{\vec{\beta}^{T} \vec{x}}, \tag{2}$$

where $h_0(t)$ represents the baseline hazard function at time t (i.e., the hazard function when all explanatory variables are zero), $\vec{x}$ is the vector of explanatory variables and $\vec{\beta}$ is the vector of the coefficients. This model enables one to examine the effects of several independent variables on survival.

The key assumption for the Cox Regression Model is the proportional hazards assumption, which assumes that the hazard function for each individual inter-match observation is a multiple of the hazard function of any other inter-match time. That is, all players will have several hazard functions which are assumed to possess the same proportional shape, resulting in the features exerting a constant effect on the hazard rate over time.

Given the structure of the data we analyse, whereby consecutive matches are recorded per player, there is potential for the independence condition to be violated, as a result of the inherent correlation amongst an individual player’s consecutive matches. Fitting a Cox Regression Model without accounting for this possible dependency within players may result in inaccurate and misleading results.

Cox introduced [6] a way of estimating the model parameters in the standard Cox Regression Model via maximisation of the partial likelihood function with respect to $\vec{\beta}$. We do not cover this as it is beyond the scope of the paper.

3.3 Mixed Effects Cox Regression Model
An extension to the aforementioned standard Cox Regression Model is the Mixed Effects Cox Regression Model [6], which gives the hazard rate for the j-th observation in the i-th cluster:

$$\lambda_{ij}(t) = \lambda_0(t) e^{\beta x_{ij} + b_i z_{ij}}, \tag{3}$$

where $\lambda_0(t)$ is the baseline hazard function, $x_{ij}$ is a vector containing the fixed effects variables, $\beta$ is a vector containing the fixed effects coefficients, $z_{ij}$ is a vector containing the random effects variables and $b_i$ is the random effect for the i-th cluster from a vector containing the random effects and is assumed to follow a normal distribution with mean 0 and covariance matrix $\Sigma$.

The model accounts for dependence amongst the player times via the addition of a random effect component. The use of the word “mixed” in the model refers to the combination of both fixed and random effects. This allows for heterogeneity in the population, where there are dependencies amongst clustered event times for a given individual. By introducing this random term, individual players who have a higher sensitivity will have an increased (or decreased) hazard rate.

The Mixed Effects Cox Regression Model mandates an additional assumption through the requirement that each subject belongs to only one cluster. The assumption is satisfied in our data whereby each time until the next match (subject) is unique to an individual player (cluster).

Ripatti and Palmgren introduced [22] a generalised approach to parameter estimation for the Mixed Effects Cox Regression Model using a penalised partial likelihood technique. We do not cover this as it is beyond the scope of the paper.¹

Figure 1: Histogram of the explanatory variable Recent Average Match Duration

Figure 2: Histogram of the explanatory variable Recent Average Time Between Matches
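As an illustration of the product-limit form in Equation (1), the following is a minimal Python sketch rather than the implementation used in the study. The function name `kaplan_meier`, the input layout and the sample values are assumptions made for this example; censored observations leave the risk set without contributing an event.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(t_i, S(t_i))] for each distinct time t_i with at least one event.

    durations: time until the event or until censoring, one per observation.
    observed:  True if the event was seen, False if the observation was censored.
    """
    events = Counter(t for t, seen in zip(durations, observed) if seen)
    exits = Counter(durations)      # events and censorings both leave the risk set
    at_risk = len(durations)        # n_i: observations still at risk
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)        # d_i: events observed at time t
        if d:
            survival *= (at_risk - d) / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical times until the next match, in days; False marks a censored gap.
times = [0.05, 0.05, 0.10, 0.50, 2.00, 30.0]
seen = [True, True, True, False, True, False]
for t, s in kaplan_meier(times, seen):
    print(f"S({t:g}) = {s:.3f}")
```

The censored observation at 0.5 days does not change the estimate at that time, but it shrinks the risk set used for the next event time.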

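The proportional hazards assumption of Section 3.2 can be made concrete with a small sketch. It evaluates the hazard form of Equation (2) for two hypothetical observations and shows that their hazard ratio does not depend on the baseline hazard. The coefficients, feature values and function names below are illustrative assumptions, not fitted values from the paper.

```python
import math


def cox_hazard(baseline_hazard, beta, x):
    """Hazard h(t) = h0(t) * exp(beta . x) for one observation at one time point."""
    linear_predictor = sum(b * xi for b, xi in zip(beta, x))
    return baseline_hazard * math.exp(linear_predictor)


def hazard_ratio(beta, x_a, x_b):
    """Ratio h_a(t) / h_b(t); the baseline h0(t) cancels, so it is constant in t."""
    return math.exp(sum(b * (a - c) for b, a, c in zip(beta, x_a, x_b)))


# Hypothetical coefficients for two continuous features (illustrative only).
beta = [-0.03, -0.002]
obs_a = [40.0, 10.0]   # e.g. 40-minute average match, 10 days between matches
obs_b = [25.0, 10.0]   # same gap between matches, shorter average match

print(f"hazard ratio a vs b: {hazard_ratio(beta, obs_a, obs_b):.3f}")
for h0 in (0.5, 0.1, 0.01):   # arbitrary baseline hazards at three time points
    ratio = cox_hazard(h0, beta, obs_a) / cox_hazard(h0, beta, obs_b)
    print(f"h0 = {h0}: ratio = {ratio:.3f}")
```

Whatever value the baseline hazard takes, the printed ratio stays the same, which is the sense in which features exert a constant effect on the hazard rate over time.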
4 DATA SET AND FEATURES
We analyse historical data to model and predict the probability of a player quitting the game. Our analysis is based on behavioural characteristics of players, namely Recent Average Match Duration (RAMD) and Recent Average Time Between Matches (RATBM), both being calculated over a 90-day period prior to the last event. We also use Highest Achieved Season Tier (HAST) as a proxy for the skill level of the player. The response variable we are interested in is Time Until Next Match (TUNM).

Our analysis is performed on data collected from the League of Legends API². First, we randomly sampled 1000 players from the Oceania region among the 42,006 who participated in a public event in 2015. For each player, we collected match data between May 2014 and September 2016. Due to data restrictions, we could only download the data from 201 players, for a total of 7,842 matches. This, however, still gives a large number of data points.

The distribution of the variables of interest is shown in Figures 1, 2 (explanatory variables), and 3 (response variable).

The variable Recent Average Match Duration is approximately normally distributed. On the other hand, the variable Recent Average Time Between Matches is severely skewed, so we applied a logarithm transformation to mitigate the problem. Figure 4 shows the distribution of the transformed variable. The response variable is heavily skewed (lower quartile is ~45m, median is ~1h38m and upper quartile is almost 2 years) and has some extreme outliers. We excluded 2 observations where the players did not play for 12 months and then played a single game. Despite removing these outliers, the data still exhibits severe right-skewness. However, for survival analysis, no distributional assumptions are enforced upon the response variable, therefore no further data points will be removed.

Figure 3: Histogram of the response variable Time Until Next Match

¹ This is implemented in the coxme function in R (provided by the coxme package, which builds on the survival package).
² https://developer.riotgames.com
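The feature definitions above can be sketched in code. The following is a minimal Python illustration under stated assumptions, not the authors' pipeline: each match is a hypothetical `(start, duration_minutes)` pair, the 90-day window is taken relative to the start of the current match and includes it, TUNM is measured between consecutive match start times, and the logarithm is the natural one. The function name `build_rows` is invented for this example.

```python
import math
from datetime import datetime, timedelta

WINDOW = timedelta(days=90)   # look-back period for the "recent" averages
SECONDS_PER_DAY = 86400


def build_rows(matches):
    """Build one feature row per match that has a successor.

    matches: list of (start: datetime, duration_minutes: float), sorted by start.
    """
    rows = []
    for i in range(len(matches) - 1):
        start_i = matches[i][0]
        # Matches played within the 90 days up to and including match i.
        window = [m for m in matches[: i + 1] if start_i - m[0] <= WINDOW]
        ramd = sum(duration for _, duration in window) / len(window)
        gaps = [(b[0] - a[0]).total_seconds() / SECONDS_PER_DAY
                for a, b in zip(window, window[1:])]
        ratbm = sum(gaps) / len(gaps) if gaps else None   # undefined for a single match
        tunm = (matches[i + 1][0] - start_i).total_seconds() / SECONDS_PER_DAY
        rows.append({
            "RAMD": ramd,                                   # minutes
            "RATBM": ratbm,                                 # days
            "log_RATBM": math.log(ratbm) if ratbm else None,
            "TUNM": tunm,                                   # days
        })
    return rows


# Hypothetical match history for one player.
t0 = datetime(2016, 1, 1, 12, 0)
history = [(t0, 30.0), (t0 + timedelta(days=1), 40.0),
           (t0 + timedelta(days=3), 20.0), (t0 + timedelta(days=200), 35.0)]
for row in build_rows(history):
    print(row)
```

The final match in a history has no successor, so it produces no row here; in a survival setting such a gap would instead be treated as a censored observation.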
Figure 4: Histogram of the explanatory variable Recent Average Time Between Matches after a log transformation.

Figure 5: Kaplan-Meier Survival Function for Time Until Next Match with 95% confidence intervals.

The median values and sample sizes for Time Until Next Match per Highest Achieved Season Tier are provided in Table 1. There does not appear to be a clear association or trend between the ordered tiers and time. However, sample sizes across the categories vary substantially with a much lower sample size for higher-tiered players, which may influence the reliability of results.

Table 1: Median time until the next match and tier sample sizes. HAST is short for Highest Achieved Season Tier and TUNM is short for Time Until Next Match.

    HAST         Median of TUNM (days)   Sample Size
    Challenger   0.030                   2
    Master       0.583                   5
    Diamond      0.169                   7
    Platinum     0.066                   10
    Gold         0.051                   12
    Silver       0.060                   44
    Bronze       0.074                   89
    Unranked     0.080                   32

5 RESULTS
In this section we present the results of the analysis performed on the League of Legends data. The results of the three models, Kaplan-Meier and Standard/Mixed Effects Cox Regression Models, are reported in the next sections.

5.1 Kaplan-Meier
An estimated Kaplan-Meier survival function for the response variable including 95% confidence intervals (dotted lines) is shown in Figure 5. The Kaplan-Meier estimator is a non-parametric approach to estimating the probability of surviving beyond a given time without the event of interest, and it accounts for censored time-to-event data. Figure 5 depicts a decreasing rate of decline in survival probability over time after the initial severe drop. This indicates that as the time since the last match increases, the probability of another consecutive match being played decreases. The curve begins to plateau around day 100, indicating that players with Time Until Next Match greater than 100 days have a very low probability of returning to the game.

Given the sharp initial decline, the survival function is again plotted in Figure 6 over a period of 6 hours (0.25 days), given that 58% of observations are below this threshold. At this scale it is easier to observe that the probability of the next match not yet having been played declines to around 50% after approximately 1 hour and 12 minutes (0.05 of a day). The initial plateau seen in this plot is representative of the time when all players are still playing the match associated with the previous event.

5.2 Standard Cox Regression Model
We fit a Standard Cox Regression Model which includes the two continuous variables and a categorical variable containing seven tiers with hazard ratios provided relative to the Challenger tier. The model treats all observations individually without accounting for the clustering of matches by player.

As introduced earlier, the proportional hazards assumption requires that the hazard functions across all features be proportional over time. We compute the Pearson product-moment correlation between the scaled Schoenfeld [24] residuals and time to test whether the assumption holds in the data. We found small p-values for Recent Average Match Duration (p < 0.001) and Recent Average Time Between Matches (p = 0.030), providing evidence of a violation of the proportional hazards assumption. The result of the test for the categorical variable Highest Achieved Season Tier is not significant (0.627 ≤ p ≤ 0.899),
Table 3: Mixed Effects Cox Regression Model output. RAMD is short for Recent Average Match Duration, RATBM is short for Recent Average Time Between Matches and HAST is short for Highest Achieved Season Tier.

    Independent Variable   Parameter Estimate   Hazard Ratio   p-value
    RAMD                   -0.027               0.973          <0.001
    RATBM                  -0.001               0.999          0.037
    HAST  Challenger       N/A                  1.000          N/A
          Master           -1.314               0.269          0.290
          Diamond          -1.561               0.210          0.120
          Platinum         -1.571               0.208          0.120
          Gold             -1.499               0.223          0.140
          Silver           -1.474               0.229          0.140
          Bronze           -1.492               0.225          0.140
          Unranked         -1.580               0.206          0.120

Hazard ratios that are less than one are associated with longer
time durations until the next match is played. The output reveals
Figure 6: Kaplan-Meier Survival Function for Time Until
a strongly significant effect for Recent Average Match Duration
Next Match with 95% confidence intervals over a quarter-day
(p < 0.001). A hazard ratio of 0.973 indicates that a one-minute
period.
increase in recent average match duration decreases the hazard rate
by 2.7% (1 − e 1∗−0.027 ≈ 0.027). Equivalently, a 15-minute increase
Table 2: Standard Cox Regression Model output. RAMD is in recent average match duration is equal to a 33% decrease in the
short for Recent Average Match Duration, RATBM is short rate of time until the next match is played (1 − e 15∗−0.027 ≈ 0.333).
for Recent Average Time Between Matches and HAST is short As such, as the duration of matches tends to increase for a player,
for Highest Achieved Season Tier. the time until the next match decreases.
A significant effect was also found for Recent Average Time Be-
Independent Parameter Hazard tween Matches (p = 0.037). Whilst this feature does not have as
p-value
Variable Estimates Ratio strongly significant of an effect as the match duration feature, the
RAMD -0.024 0.976 <0.001 hazard ratio indicates that a one day increase in recent average
RATBM -0.002 0.998 <0.001 time between consecutive matches, decreases the hazard rate by
HAST Challenger N/A 1.000 N/A 0.1% (1 − e 1∗−0.001 ≈ 0.001).
Master -1.134 0.322 0.355 In terms of the highest achieved season tier, all tiers are insignif-
Diamond -1.509 0.221 0.133 icant at the 5% level (all p ≥ 0.12).
Platinum -1.484 0.227 0.138
Gold -1.431 0.239 0.153 5.4 Comparison of Models
Silver -1.404 0.246 0.161 A comparison of the hazard ratios from the Standard Cox Regression
Bronze -1.443 0.237 0.149 and the Mixed Effects Cox Regression Model are shown in Table
Unranked -1.563 0.209 0.118 4. It can be seen that the ratios, with and without accounting for a
random effect, are relatively stable. This suggests that there is little
impact in fitting the random term on the overall model.
hence satisfying the proportional hazards assumption. The global
test has a large chi-squared value, indicating that the overall model 6 DISCUSSION
violates the proportional hazards assumption (p < 0.001). A Mixed Effects Cox Regression Model was used in this work to
Keeping in mind that some of the assumptions about the data quantify the influence that player behaviour and skill exert on the
are violated, the output of the Standard Cox Regression Model rate of time until a subsequent match. It was found that as the
is reported in Table 2. Hazard ratios that are less than one are length of play tends to increase, the rate of time until occurrence
associated with longer time durations until the next match is played. of the next match was found to decline. This may suggest that the
player may have experienced a more challenging game. Without
5.3 Mixed Effects Cox Regression Model variables indicative of whether a player has won or lost a particular
We fit a Mixed Effects Cox Regression Model to the data with the match, it is difficult to attribute longer match durations with a skill
player ID as the random effect variable to accounts for within cluster discrepancy between players, or whether a long ”lose” or long ”win”
dependency; i.e., the dependency between a player’s various times. impacts on this decline. Over time, players may become discouraged
The output of the regression is presented in Table 3. or feel somewhat defeated and as a result, are less motivated to
immediately recommence playing. As such, the availability of a variable indicating "win" or "lose" may have improved both model fit and interpretation of the results.

Table 4: Comparison of hazard ratios. RAMD is short for Recent Average Match Duration, RATBM is short for Recent Average Time Between Matches and HAST is short for Highest Achieved Season Tier.

Independent Variable    Standard Cox Model    Mixed Effects Cox Model
RAMD                    0.976                 0.973
RATBM                   0.998                 0.999
HAST  Challenger        1.000                 1.000
      Master            0.322                 0.269
      Diamond           0.221                 0.210
      Platinum          0.227                 0.208
      Gold              0.239                 0.223
      Silver            0.246                 0.229
      Bronze            0.237                 0.225
      Unranked          0.209                 0.206

Similarly, increasing average time between sequential matches was found to be associated with a longer time period until the next match. Typically, one would expect that individuals who are losing interest in the game would tend to leave longer gaps between subsequent matches. Again, it is difficult to draw conclusions without further information on aspects including the age of a player or a player's external environment. These factors are likely to provide more information around the particular circumstances of an individual's behaviour.
A player's highest previously achieved season tier (included as a proxy for player skill level) was not found to be associated with rate of match play. This may be due to the fact that Riot Games has implemented targeted rules to discourage inactivity. For example, a Challenger-tiered player (highest ranking) can be demoted after 10 days of inactivity. A further incentive, League Points, sees points lost for Platinum, Diamond, Master and Challenger-tiered players after an inactive period of 28 days. It was anticipated that player skill level would exert some influence on the time until the next match. However, this variable did not significantly contribute to the likelihood of match play. It should be noted that this conclusion may be biased by the small sample sizes prevalent across the higher tiers, as highlighted in Section 4.
The selected observation period includes matches from May 2014 until September 2016. Variations in both the length of the time period and the selected observation period may influence results. Riot Games constantly attempts to improve the game by making changes to game-play and characters. Performing survival analysis at different periods in time, for example after a major patch or character rework, may provide deeper insight into the effects of such changes on player churn.
Furthermore, given that this study concerns an online product, changes to technology may influence the user experience. It would be best practice to continue using the most up-to-date LoL player data in order to keep the results relevant. New games are constantly being released, which presents a risk that a game could be released in the future that replaces LoL and causes a mass migration away from this MOBA game.
Improvement to the analysis may also be achieved through incorporation of data such as level of player spend and frequency of player purchases. Coussement et al. [5] incorporated monetary factors in their study of churn in the online gambling industry. Factors including number of bets placed during the preceding week, number of bets during the preceding month and total monetary amount of stakes were found to be significant drivers of churn. Incorporating these factors into the retention analysis would better facilitate Riot Games' ability to target players close to churn through analysing behaviours of players with a higher propensity to make in-game purchases. Note that these data are currently not available within the API and would need to be sourced from elsewhere.

7 CONCLUSIONS AND FUTURE WORK
As introduced in Section 1, and discussed in Section 2, the motivation for this analysis is driven by the business model adopted by Riot Games, which relies heavily on customer retention, and further contributes to an extensive topic in games research which focuses on behavioural patterns in player activity. LoL is unlike online games that require a subscription fee or an up-front cost for download. As such, the rationale for undertaking this study was underpinned by the importance of player loyalty in a game that generates the majority of its revenue from in-game purchases.
The findings on behavioural factors in this paper are consistent with prior research. As highlighted in Section 2, recent studies have found evidence supporting the association between player behaviour and churn [5], [14]. Player behavioural characteristics, including average match duration and average time between subsequent matches, were found to be significant factors affecting the rate of time until the next match. The present study provides evidence of this association between player behaviour and the likelihood of match play. More specifically, the current study applies survival analysis techniques including a Mixed Effects Cox regression model, with the results showing evidence of an increasing rate of time until a player's subsequent match with shorter average match durations. Similarly, the results indicate that longer durations of time between subsequent match plays are associated with a decrease in the rate of time until a player's next match. This analysis, along with increasingly sophisticated methods, will continue to contribute insights into indicators of player churn. Given the reliance on ongoing player purchases as the key source of revenue, these insights will become increasingly valuable.
In future work, we plan to extend this study to different regions in which LoL servers are hosted. This will remove a degree of population bias as a result of the inclusion of solely Oceania-region players. Additionally, extending this work into different regions will allow for comparisons of different hazard rates across regions, which may provide useful insight into location-based influences on player churn. We also plan to perform survival analysis at different snapshots in time (e.g., after a patch or character rework), to investigate how large changes affected player churn. This analysis could provide insight as a predictor for player churn after changes or be used to direct future additions and changes to the game.
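As a worked check on the hazard-ratio interpretations given in Section 5.3, the percentage changes quoted there follow directly from the parameter estimates in Table 3. The following is a minimal Python sketch of that arithmetic only; the function name `hazard_change` and the variable names are illustrative assumptions and are not part of the original analysis, which does not specify its software.

```python
import math


def hazard_change(beta: float, delta: float = 1.0) -> float:
    """Fractional decrease in the hazard rate for a `delta`-unit increase
    in a covariate whose Cox regression coefficient is `beta`.

    Computed as 1 - exp(beta * delta); a negative result would mean the
    hazard rate increases instead.
    """
    return 1.0 - math.exp(beta * delta)


# Parameter estimates copied from Table 3 (Mixed Effects Cox Regression Model).
ramd_beta = -0.027   # Recent Average Match Duration, per minute
ratbm_beta = -0.001  # Recent Average Time Between Matches, per day

# One-minute and 15-minute increases in recent average match duration.
print(round(hazard_change(ramd_beta, 1), 3))    # 0.027
print(round(hazard_change(ramd_beta, 15), 3))   # 0.333

# One-day increase in recent average time between matches.
print(round(hazard_change(ratbm_beta, 1), 3))   # 0.001
```

Because the effect of a covariate on the hazard is multiplicative, the 15-minute figure is obtained by scaling the coefficient before exponentiating rather than by multiplying the one-minute percentage by 15.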
ACKNOWLEDGMENTS
The authors acknowledge support from ARC LP130100743. Additionally, this work is supported by the Digital Creativity Labs jointly funded by EPSRC/AHRC/InnovateUK grant EP/M023265/1.

REFERENCES
[1] Zoheb Borbora, Jaideep Srivastava, Kuo-Wei Hsu, and Dmitri Williams. 2011. Churn prediction in mmorpgs using player motivation theories and an ensemble approach. In Privacy, Security, Risk and Trust (PASSAT) and 2011 IEEE Third International Conference on Social Computing (SocialCom), 2011 IEEE Third International Conference on. IEEE, 157–164.
[2] Ionut Brandusoiu and Gavril Toderean. 2013. Churn prediction modeling in mobile telecommunications industry using decision trees. Journal of Computer Science and Control Systems 6, 1 (2013), 14.
[3] James Brightman. 2016. League of Legends generates 150m a month - SuperData. http://www.gamesindustry.biz/articles/2016-06-09-league-of-legends-generates-usd150m-a-month-superdata. (2016).
[4] Chris Chambers, Wu-chang Feng, Sambit Sahu, and Debanjan Saha. 2005. Measurement-based characterization of a collection of on-line games. In Proceedings of the 5th ACM SIGCOMM conference on Internet Measurement. USENIX Association, 1–1.
[5] Kristof Coussement and Koen W De Bock. 2013. Customer churn prediction in the online gambling industry: The beneficial effect of ensemble learning. Journal of Business Research 66, 9 (2013), 1629–1636.
[6] David R Cox. 1992. Regression models and life-tables. In Breakthroughs in statistics. Springer, 527–541.
[7] Wu-chang Feng, David Brandt, and Debanjan Saha. 2007. A long-term study of a popular MMORPG. In Proceedings of the 6th ACM SIGCOMM Workshop on Network and System Support for Games. ACM, 19–24.
[8] John Hadden, Ashutosh Tiwari, Rajkumar Roy, and Dymitr Ruta. 2007. Computer assisted customer churn management: State-of-the-art and future trends. Computers & Operations Research 34, 10 (2007), 2902–2917.
[9] Fabian Hadiji, Rafet Sifa, Anders Drachen, Christian Thurau, Kristian Kersting, and Christian Bauckhage. 2014. Predicting player churn in the wild. In Computational intelligence and games (CIG), 2014 IEEE conference on. IEEE, 1–8.
[10] David W Hosmer Jr and Stanley Lemeshow. 1999. Applied survival analysis: regression modelling of time to event data (1999). Eur Orthodontic Soc (1999), 561–2.
[11] Shin-Yuan Hung, David C Yen, and Hsiu-Yu Wang. 2006. Applying data mining to telecom churn management. Expert Systems with Applications 31, 3 (2006), 515–524.
[12] Deborah Kaminski and Cheryl Geisler. 2012. Survival analysis of faculty retention in science and engineering by gender. Science 335, 6070 (2012), 864–866.
[13] Edward L Kaplan and Paul Meier. 1958. Nonparametric estimation from incomplete observations. Journal of the American statistical association 53, 282 (1958), 457–481.
[14] Jaya Kawale, Aditya Pal, and Jaideep Srivastava. 2009. Churn prediction in MMORPGs: A social influence based approach. In Computational Science and Engineering, 2009. CSE'09. International Conference on, Vol. 4. IEEE, 423–428.
[15] P Kollar. 2017. The Past, Present and Future of League of Legends Studio Riot Games, Polygon Platform, 2016. http://www.polygon.com/2016/9/13/12891656/the-past-present-and-future-of-league-of-legends-studio-riot-games. (2017).
[16] Junxiang Lu. 2002. Predicting customer churn in the telecommunications industry -- An application of survival analysis modeling using SAS. SAS User Group International (SUGI27) Online Proceedings (2002), 114–27.
[17] M. Milosevic, N. Zivic, and I. Andjelkovic. 2017. Early churn prediction with personalized targeting in mobile social games. Expert Systems with Applications (2017).
[18] Guangli Nie, Wei Rowe, Lingling Zhang, Yingjie Tian, and Yong Shi. 2011. Credit card churn forecasting by logistic regression and decision tree. Expert Systems with Applications 38, 12 (2011), 15273–15285.
[19] Noppon Prakannoppakun and Sukree Sinthupinyo. 2016. Skill rating method in multiplayer online battle arena. In Electronics, Computers and Artificial Intelligence (ECAI), 2016 8th International Conference on. IEEE, 1–6.
[20] Fred Reichheld. 2001. Prescription for cutting costs. Bain & Company. Boston: Harvard Business School Publishing (2001). http://www.bain.com/IMages/BB_Prescription_cutting_costs.pdf
[21] François Rioult, Jean-Philippe Métivier, Boris Helleu, Nicolas Scelles, and Christophe Durand. 2014. Mining tracks of competitive video games. AASRI Procedia 8 (2014), 82–87.
[22] Samuli Ripatti and Juni Palmgren. 2000. Estimation of multivariate frailty models using penalized partial likelihood. Biometrics 56, 4 (2000), 1016–1022.
[23] Julian Runge, Peng Gao, Florent Garcin, and Boi Faltings. 2014. Churn prediction for high-value players in casual social games. In Computational Intelligence and Games (CIG), 2014 IEEE Conference on. IEEE, 1–8.
[24] David Schoenfeld. 1982. Partial residuals for the proportional hazards regression model. Biometrika 69, 1 (1982), 239–241.
[25] Matthias Schubert, Anders Drachen, and Tobias Mahlmann. 2016. Esports analytics through encounter detection. In Proceedings of the MIT Sloan Sports Analytics Conference.
[26] Rafet Sifa, Christian Bauckhage, and Anders Drachen. 2014. The Playtime Principle: Large-scale cross-games interest modeling. In Computational Intelligence and Games (CIG), 2014 IEEE Conference on. IEEE, 1–8.
[27] R. Sifa, F. Hadiji, J. Runge, A. Drachen, K. Kersting, and C. Bauckhage. 2015. Predicting Purchase Decisions in Mobile Free-to-Play Games. In Proc. of AAAI AIIDE.
[28] Adam Summerville, Michael Cook, and Ben Steenhuisen. 2016. Draft-Analysis of the Ancients: Predicting Draft Picks in DotA 2 using Machine Learning. In Twelfth Artificial Intelligence and Interactive Digital Entertainment Conference.
[29] Superdata. 2016. 2016 MMO and MOBA Games Market. (2016). https://www.superdataresearch.com/market-data/mmo-market/
[30] Superdata. 2016. Year in Review, December 2016. (2016). https://www.superdataresearch.com/market-data/market-brief-year-in-review/
[31] Superdata Research. 2017. European Esports Conference Brief (http://strivesponsorship.com/wp-content/uploads/2017/04/Superdata-2017-esports-market-brief.pdf). (2017). http://strivesponsorship.com/wp-content/uploads/2017/04/Superdata-2017-esports-market-brief.pdf
[32] Pin-Yun Tarng, Kuan-Ta Chen, and Polly Huang. 2009. On prophesying online gamer departure. In Proceedings of the 8th Annual Workshop on Network and Systems Support for Games. IEEE Press, 16.
[33] Thanasis Vafeiadis, Konstantinos I Diamantaras, George Sarigiannidis, and K Ch Chatzisavvas. 2015. A comparison of machine learning techniques for customer churn prediction. Simulation Modelling Practice and Theory 55 (2015), 1–9.
[34] Markus Viljanen, Antti Airola, Jukka Heikkonen, and Tapio Pahikkala. 2017. Playtime Measurement with Survival Analysis. arXiv preprint arXiv:1701.02359 (2017).
[35] Markus Viljanen, Antti Airola, Anne-Maarit Majanoja, Jukka Heikkonen, and Tapio Pahikkala. 2017. Measuring Player Retention and Monetization using the Mean Cumulative Function. arXiv preprint arXiv:1709.06737 (2017).
[36] Markus Viljanen, Antti Airola, Tapio Pahikkala, and Jukka Heikkonen. 2016. Modelling user retention in mobile games. In Computational Intelligence and Games (CIG), 2016 IEEE Conference on. IEEE, 1–8.
[37] Huiwen Wang, Bang Xia, and Zhe Chen. 2015. Cultural Difference on Team Performance Between Chinese and Americans in Multiplayer Online Battle Arena Games. Springer International Publishing, Cham, 374–383. https://doi.org/10.1007/978-3-319-20934-0_35
[38] M. Wu, S. Xiong, and H. Iida. 2016. Fairness mechanism in multiplayer online battle arena games. In 2016 3rd International Conference on Systems and Informatics (ICSAI). 387–392. https://doi.org/10.1109/ICSAI.2016.7810986
[39] Hanting Xie, Sam Devlin, Daniel Kudenko, and Peter Cowling. 2015. Predicting Player Disengagement and First Purchase with Event-frequency Based Data Representation. In Proc. of CIG.
[40] Hong Zhang. 2008. Customer retention in the financial industry: An application of survival analysis. Ph.D. Dissertation. Purdue University.
