Digital Media: Massively Multiplayer Online Gaming (MMOG) : Technical Guide
Solution Provider: The Butterfly.net Grid
Solution Focus Area: Subscription Media Services, MMOG
Game developers call the art of optimizing code for a particular game platform
"getting close to the metal." Butterfly.net has created a powerful architecture for
online [...] a high-performance way.
growing pains, and online gaming is no exception. The Internet has brought to the gamer
what standalone systems never could: millions of users on a packet network investing
enormous amounts of time, energy, and money in the pursuit of adventure, status,
power, fellowship, and fun in persistent-state, massively multiplayer games (MMGs).
For example, EverQuest*, the first 3D-hardware-required MMG, claims a half-million
subscribers. NCSoft estimates that each month 1.2 million people pay to play
Lineage2, its South Korean MMG. But as engrossing and compelling as current MMGs
are for some players, they leave most gamers cold, for several reasons:
MMGs are often unavailable for hours at a time.
MMGs suffer from lag. By contrast, standalone games offer rapid response times.
MMGs offer a minimal set of interactions. Standalone games host intelligent adversaries
and allies in thought-provoking conflicts.
One reason that MMGs do not currently operate as well as standalone games is that
MMG developers do not have a development platform on which to write and test their
games. For example, standalone game developers, writing for personal computers,
can optimize their code for DirectX*. Developers creating games for console devices,
such as the Sony* PlayStation* 2 (PS2), can write to the Emotion Engine*. In these
cases, developers have the opportunity to code to the metal of a particular device
or platform, and the peripherals that each platform supports, to optimize game
performance through tight integration of hardware and software.
MMGs also suffer from conventional network infrastructure problems that have
grown as quickly as the number of online gamers. Legacy servers based on inflexible,
monolithic architectures are the source of the problems. To overcome server limitations,
game designers must divide games into shards that provide copies of each game
world on separate servers. Generally, each shard supports a maximum of around 4,000
gamers. As a result, MMG developers struggle with complex network and software
balance issues, which distract them from developing the best games possible.
Game publishers, in turn, must manage and support homegrown technologies for each
game. This limits their ability to build effective, repeatable, and reliable infrastructures
that support multiple MMGs and titles. They are forced to pay a high price in poor
reliability and support costs, even as valuable revenue diminishes because server
maintenance and reconfiguration shut down or slow down entire games.
1 Intel Business Center Case Study, Butterfly.net Uses Intel Architecture to Build a Global Gaming Grid, Pg.1, Copyright 2003, Intel Corporation
2 The Butterfly Grid: Technical Architecture Overview, Pg.7, Copyright 2002 Butterfly.net, Inc.
T E C H N I C A L S O L U T I O N G U I D E
Performance and reliability issues severely impact an online gamer's experience and
compromise their ability to interact with online friends. As the number of gamers per
shard increases, so does latency within traditional MMGs. To avoid long delays to
attach to popular game servers or the need to log in and out of MMG levels, developers
must restrict the size of game worlds per server, which limits the player's experience
and satisfaction level in another way.
To allow online games to evolve into a profitable and mature market, a cost-effective,
scalable, high-performance, and reliable infrastructure is desperately needed.
Butterfly.net, faced with the architectural challenge that the MMG market presented,
turned to Intel Architecture and grid computing.
The grid that Butterfly.net developed provides a way to use both technology and
operational investments on the same platform. When fully deployed, the Butterfly
grid architecture can support over one million simultaneous players without
compromising performance.
High performance
Maximally efficient
Exceptionally entertaining
Utterly Reliable
To create a truly reliable system, the servers themselves must be hot swappable. If a
server is taken offline, another server must be ready and able to take its place. If a hosting
center or a section of the Internet is not available, connections to the game must be
re-routed to a new set of resources. This must be unnoticeable to the gamer. Reliability
dictates that the game architecture be fully distributed, and that the data that comprises
a persistent state be instantiated from many different databases into many different
game servers and written back to redundant data-stores as game play progresses.
Reliability also dictates that the processes by which game play is distributed among
servers be automated. System administrators cannot be reacting to changes in game play
and looking for resources to acquire. A truly autonomic computing infrastructure is
required. These requirements led to the selection of hot-swappable blade technology
based on Intel Xeon processor architecture.
Infinitely Scalable
Today's games are not built to scale easily. Most game publishers simply replicate worlds
on multiple systems to create scale. This inevitably leads to problems because MMOG
players are then bound to specific game servers, with each server able to support only
a finite number of players at any one time. The games themselves do not truly scale;
the systems merely do.
Currently, game operators have the choice of building large-scale infrastructures to
support anticipated player numbers. Building infrastructure represents a large initial
investment that is at risk if the game does not attract the expected audience. Worse,
if the game attracts more players than anticipated, inadequate systems may not be able
to handle peak loads or over-subscription periods, which could result in poor game
performance or, worse, a halt altogether.
A fully distributed game infrastructure, such as the Butterfly grid, scales by allowing
the game to acquire more computing resources as needed, and by flexibly dividing up
game play, to best take advantage of the systems available. On the Butterfly grid, the
game world is not replicated on the servers, but portions of the world are placed on
different servers so that the game itself scales.
The server infrastructure itself can be heterogeneous. Many 1U rack-mounted Intel
Xeon processor servers can inexpensively be added to the grid, as well as powerful
computer clusters or high-end servers based on Intel Xeon processors using blade
technology. The scaling model that makes the most sense for the operations team
can be accommodated. While a developer can still segment players and games into
shards, it should not be mandatory because of server constraints. A single shard can
run across many servers so that resources can be allocated to the area that requires the
most processing power.
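As a rough illustration of the difference between replicating a world and dividing it, the sketch below places portions of one game world on servers according to load. The region names, the load figures, and the greedy least-loaded placement are illustrative assumptions; the guide does not describe the grid's actual allocation algorithm.

```python
import heapq

def assign_regions(region_load, servers):
    """Greedily place each world region on the currently least-loaded server.

    region_load: dict mapping a region name to an estimated load (hypothetical units).
    servers: list of server names.
    Returns a dict mapping each server name to the list of regions it hosts.
    """
    # Min-heap of (total load assigned so far, server name).
    heap = [(0, name) for name in servers]
    heapq.heapify(heap)
    placement = {name: [] for name in servers}
    # Place the heaviest regions first so they spread across servers.
    for region, load in sorted(region_load.items(), key=lambda kv: -kv[1]):
        total, name = heapq.heappop(heap)
        placement[name].append(region)
        heapq.heappush(heap, (total + load, name))
    return placement

# One world, split into regions; the busy capital gets a server to itself.
world = {"capital_city": 900, "harbor": 300, "forest": 250, "desert": 100}
placement = assign_regions(world, ["server-a", "server-b"])
```

In this sketch the crowded region ends up alone on one server while the quieter regions share the other, which is the sense in which the game, rather than just the hardware, scales.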
Absolutely Secure
Security involves three things:
Authentication
Authorization
Accounting
The Butterfly grid uses several security protocols, including a 128-bit session message
digest that can only be generated by an authentic client (one with both the session key
Industry Standard
The game industry is a long way from standardization, but the Butterfly grid is a step
in the right direction. Butterfly.net uses a set of supportable industry standards and
has architected a distributed computing system specifically for games that takes
advantage of standard components, so that games are designed and optimized for that
system. The distributed infrastructure becomes the platform that the game developer
writes to, rather than writing to the specifications of the edge device. Butterfly.net is
working with the Global Grid Forum and uses Open Grid Services Architecture (OGSA)
components to ensure that any game that conforms to open standards and publicly
available specifications operates over the Butterfly grid.
High Performance
The Butterfly grid is optimized for online gaming in several ways. The dead reckoning
systems ensure that communications among clients and servers only occur when a
model is not synchronized with others. The game developer controls the level of
tolerance that triggers communication and resynchronization. In a first-person shooter,
racing, or other action game, the objects can update each other frequently, and the game
developer limits the number of objects within an area of interest. In an epic adventure,
role-playing, or strategy game where thousands of players may be in one area, the dead
reckoning models can be set to support an appropriate level of tolerance.
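The tolerance-triggered update described above can be sketched as follows. This is a minimal illustration that assumes a constant-velocity prediction model in two dimensions; the class name, function name, and tolerance values are hypothetical and are not taken from the Butterfly grid.

```python
from dataclasses import dataclass

@dataclass
class Motion:
    """Last state sent to other machines: position plus velocity."""
    x: float
    y: float
    vx: float
    vy: float

    def predict(self, dt):
        """Extrapolate the position dt seconds later, assuming constant velocity."""
        return (self.x + self.vx * dt, self.y + self.vy * dt)

def needs_update(last_sent, true_pos, dt, tolerance):
    """Return True when the remote prediction has drifted beyond the tolerance."""
    px, py = last_sent.predict(dt)
    error = ((true_pos[0] - px) ** 2 + (true_pos[1] - py) ** 2) ** 0.5
    return error > tolerance

last_sent = Motion(x=0.0, y=0.0, vx=1.0, vy=0.0)
true_pos = (2.0, 0.5)  # where the object really is after 2 seconds

# A tight tolerance (action game) forces a resynchronization message...
action_game = needs_update(last_sent, true_pos, dt=2.0, tolerance=0.1)
# ...while a loose tolerance (crowded role-playing area) sends nothing.
epic_game = needs_update(last_sent, true_pos, dt=2.0, tolerance=1.0)
```

Here the prediction is off by 0.5 units, so only the tighter tolerance triggers communication; raising the tolerance trades positional accuracy for fewer messages.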
Dedicated servers within the grid run artificial intelligence as well. Games with lots of
intelligent creatures can dedicate resources to running those creatures without bogging
down the game servers. In addition, the network protocol stack itself is based on UDP
with a thin reliability layer for optimal performance.
UDP sends each datagram in the Butterfly grid as an individual unit with no connection
to other units that it sends. In contrast, TCP sends a stream of data in which each
segment is sequenced relative to the preceding and following segments. When a segment
is lost, TCP resends it after a delay and holds back the data that follows until the gap is
filled, thus ensuring that the entire message gets through in order.
While this is an effective model in a normal data-driven Internet protocol-based
network, it can cause problems in a distributed game environment. TCP ensures
that data loss does not occur, but at the price of a heavy communications overhead,
because any transmission loss necessitates the retransmission of the data. This inevitably
leads to real-time delays in the transmission, which in gaming scenarios appear to
the player as jitter.
UDP largely circumvents the connection setup process, flow control, and retransmission
problem. UDP lets the sender specify both the source and destination port numbers for
each message. Coupled with a checksum calculated over both the data and the header,
this allows the receiving application to identify the sender and detect corrupted
messages without the overhead TCP imposes. In the Butterfly grid architecture, the
player is a known, proxied entity, and so managing source and destination information
can be efficiently handled using UDP. Transmissions between the player and grid
components incur less delay, and UDP is thus preferable.
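One way to picture a thin reliability layer on top of UDP is sketched below. The guide does not publish the Butterfly wire format, so the five-byte header, the flag values, and the resend timeout are all assumptions made for illustration. The sketch builds datagrams in memory; in practice they would be handed to a UDP socket.

```python
import struct

# Hypothetical header: 32-bit sequence number plus one flags byte, network byte order.
HEADER = struct.Struct("!IB")
FLAG_RELIABLE = 0x01
FLAG_ACK = 0x02

class ThinReliableSender:
    """Tracks only the datagrams marked reliable; everything else is fire-and-forget."""

    def __init__(self, resend_after=0.2):
        self.next_seq = 0
        self.resend_after = resend_after
        self.unacked = {}  # sequence number -> (datagram, time last sent)

    def make_datagram(self, payload, reliable, now):
        seq = self.next_seq
        self.next_seq += 1
        flags = FLAG_RELIABLE if reliable else 0
        datagram = HEADER.pack(seq, flags) + payload
        if reliable:
            self.unacked[seq] = (datagram, now)
        return datagram

    def on_ack(self, datagram):
        """Stop tracking a datagram once the peer acknowledges its sequence number."""
        seq, flags = HEADER.unpack_from(datagram)
        if flags & FLAG_ACK:
            self.unacked.pop(seq, None)

    def due_for_resend(self, now):
        """Return reliable datagrams that have waited at least resend_after seconds."""
        due = []
        for seq, (datagram, sent_at) in sorted(self.unacked.items()):
            if now - sent_at >= self.resend_after:
                self.unacked[seq] = (datagram, now)
                due.append(datagram)
        return due

def make_ack(datagram):
    """Build the acknowledgment a receiver would send back for a datagram."""
    seq, _ = HEADER.unpack_from(datagram)
    return HEADER.pack(seq, FLAG_ACK)

sender = ThinReliableSender()
move = sender.make_datagram(b"pos 10 20", reliable=False, now=0.0)     # may be lost
trade = sender.make_datagram(b"trade sword", reliable=True, now=0.0)   # must arrive
resent = sender.due_for_resend(now=0.3)      # only the trade is retried
sender.on_ack(make_ack(trade))
after_ack = sender.due_for_resend(now=1.0)   # nothing left to retry
```

The design choice this illustrates is selectivity: frequent position updates are simply superseded by the next update, so only messages that must not be lost pay for acknowledgment and retransmission.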
Maximally Efficient
The Butterfly grid is exceptionally efficient, allowing the game to allocate resources
to the areas that need them the most. All server code is highly optimized C and C++
for Linux, and client code is hand-tooled to the various edge devices (such as PS2,
Microsoft* Windows*-based computers, and Microsoft PocketPC-based devices).
In an MMG, it's not enough to optimize the network protocol stack, the client code,
and the server code; all three systems must be synchronized and work together for
the best overall effect. That effect also depends on the game itself. With the Butterfly
grid, the game developer controls the clients, the server, the gateway, the artificial
intelligence engines, and the database management system, and can make the most
intelligent trade-offs to optimize the gaming experience, while using the fewest
resources (client processor, bandwidth, and server capacity) overall.
Exceptionally Entertaining
Architecture requirements led Butterfly.net first to a fully distributed architecture, then
to grid computing, and finally, to the Butterfly grid. With this platform, developers
could finally optimize code for the Internet itself, creating games that incite,
confound, baffle, delight, and amuse gamers.
The Butterfly* grid
Butterfly.net developed a middleware application server core that could support
multiple functions within the software architecture. Business functions and grid
interfaces are wrapped in Java*, with a Java-based messaging system for process
management. Butterfly.net also incorporated a Java 2 Enterprise Edition (J2EE*)
application server, which provided a sophisticated engine to handle real-time
transaction volumes as users traversed worlds and shards.
Each mesh is connected over grid-compliant protocols to add services and computing
capacity on demand. The Globus Toolkit*, a reference implementation of OGSA,
provides the following services:
Security
Authorization
Authentication
File transfer
Access to secondary storage
Web-services bindings
Meta-computing directory service
Other standard utility services
The Butterfly.net Game Configuration Specification is an Extensible Markup Language
remote procedure call (XML-RPC) Web Services Description Language (WSDL)
binding, which is used by the toolkit to configure resources to support specific games.
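The guide does not reproduce the Game Configuration Specification itself, so the following is only a sketch of what an XML-RPC encoded configuration request looks like in general. The method name `configureGame` and every field in the dictionary are invented for illustration, and the WSDL side of the binding is not shown. The sketch uses Python's standard `xmlrpc.client` module to encode and decode the call.

```python
import xmlrpc.client

# Hypothetical configuration: field names and values are illustrative only.
game_config = {
    "game": "example-game",
    "gateways": 2,
    "game_servers": 8,
    "daemon_controllers": 1,
}

# Encode the call as an XML-RPC request body (a <methodCall> document).
request_xml = xmlrpc.client.dumps((game_config,), methodname="configureGame")

# A receiver decodes the same document back into parameters and a method name.
params, method = xmlrpc.client.loads(request_xml)
```

Because the request is plain XML, any component that understands XML-RPC can read the resource requirements without sharing code with the sender.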
Figure: Fully-Meshed Architecture of the Butterfly Grid
registry within the grid.
MDS-2 provides a uniform framework for discovering and accessing system
configuration and status information such as:
Compute server configuration
Network status
Locations of replicated datasets within the grid hierarchy
MDS-2 uses a soft-state protocol for lifetime management of published information.
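In a soft-state protocol, published information carries a lifetime and disappears unless the publisher refreshes it. The sketch below shows that idea in isolation; it is not the MDS-2 interface, and the class, method names, and time-to-live values are assumptions made for illustration.

```python
class SoftStateRegistry:
    """Directory whose entries expire unless their publisher refreshes them."""

    def __init__(self):
        self._entries = {}  # name -> (info, expiry time)

    def publish(self, name, info, ttl, now):
        """Register or refresh an entry that stays valid for ttl seconds."""
        self._entries[name] = (info, now + ttl)

    def lookup(self, name, now):
        """Return the entry's info, or None if it was never published or has expired."""
        entry = self._entries.get(name)
        if entry is None or now >= entry[1]:
            self._entries.pop(name, None)  # drop stale state lazily
            return None
        return entry[0]

registry = SoftStateRegistry()
registry.publish("game-server-7", {"cpu_load": 0.4}, ttl=30, now=0)
fresh = registry.lookup("game-server-7", now=10)    # still within its lifetime
registry.publish("game-server-7", {"cpu_load": 0.6}, ttl=30, now=25)  # refresh
refreshed = registry.lookup("game-server-7", now=40)
expired = registry.lookup("game-server-7", now=60)  # no refresh arrived in time
```

The appeal of this approach for a grid is that a crashed server needs no explicit deregistration: its entry simply ages out.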
The public key-based grid security infrastructure (GSI) protocol provides MMOG
players with single sign-on authentication, communication protection, and some initial
support for restricted delegation. Single sign-on authentication allows a user to establish
who they are once. The grid architecture then creates a proxy credential that it uses to
establish the user's identity with any remote service on the user's behalf.
This is especially useful as gamers traverse shards and worlds within the grid itself,
which can be physically distributed across a mesh of systems in differing geographies.
Credential delegation allows for the creation and communication to a remote service of
delegated proxy credentials that the remote service can use to act on the user's behalf,
perhaps with various restrictions; this capability is important for nested operations.
The Globus Toolkit is the standard for grid-based sharing of online resources across
organizations. It allows Butterfly.net to monitor servers and distribute the processing
needs of more popular games and populated areas to idle computing resources within
the data center. The key network protocol stack (NPS) is a thin reliability layer on UDP
that connects edge devices to the gateways, which transparently relay information to
the correct server.
Butterfly also uses multicast communications to all servers subscribed in the grid to
create a fully meshed, multicast network during real-time game play. Additionally,
each mesh is connected over grid-compliant protocols to add services and computing
capacity, as needed, using OGSA components.
connects personal computers, notebooks, Pocket PCs, Palm*-compatible handhelds,
and next-generation 128-bit consoles in one seamless world.
computers, and Microsoft PocketPC-based devices). It also handles and translates the
interactions, changes, and actions of these game objects to communications protocols
that are understood by the end user's client platform. On a sufficiently complex or
powerful platform, this layer can be thin. On more modest platforms, this layer may
be complex and could involve different translations, where certain data elements are
parsed out and not transmitted to the end client.
The back-end tier is the data store and server. This layer provides the embodiment of
the game as it is defined by the game rules and logic, the objects and attributes that
comprise the game environment, and objects that represent the players themselves as
they are involved in game play.
Going in the other direction, game software provides natural interfaces for performing
actions. These game actions are captured by the OMS, which in turn sends the
information to the middle tier. The middle tier translates the information and
communicates with the back-end tier, and re-distributes information to other client
platforms, as appropriate to the game design.
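The per-platform translation described above, in which some data elements are parsed out before transmission to a modest client, might look like the following sketch. The platform names, the attribute names, and the filtering rule are hypothetical; the guide does not specify which elements are dropped for which device.

```python
# Hypothetical client profiles: the object attributes each platform receives.
CLIENT_FIELDS = {
    "pc": {"id", "position", "animation", "texture", "particles"},
    "handheld": {"id", "position"},
}

def translate_for_client(game_object, platform):
    """Return a copy of the object containing only the attributes the platform gets."""
    allowed = CLIENT_FIELDS[platform]
    return {key: value for key, value in game_object.items() if key in allowed}

dragon = {
    "id": 42,
    "position": (10, 20),
    "animation": "breathe_fire",
    "texture": "red_scales",
    "particles": "smoke",
}

pc_view = translate_for_client(dragon, "pc")              # thin layer: nothing removed
handheld_view = translate_for_client(dragon, "handheld")  # heavier translation
```

Filtering in the middle tier keeps the back-end representation of the object the same for every device while saving bandwidth and processing on the smaller clients.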
The gateway servers translate the data objects and communications protocols to forms
that are understood by the user's platform and route player connections to game servers.
The daemon controllers are dedicated artificial intelligence servers that drive the
activities of non-player characters (NPCs). NPCs are game elements not directly
controlled by player actions. The daemon controllers are essentially privileged clients
that act like players and interact directly with the grid's gateway servers.
The game servers are responsible for running games within the grid. They manage the
game as it is defined by the game rules and logic, the objects and attributes that comprise
the game environment, and objects that represent the players themselves as they are
involved in game play. The intelligence that determines when players are shifted to new
servers resides on both the game and gateway servers. When a game server becomes
overused or fails, it sends a controlling message to the gateway servers. The gateway
servers are ultimately responsible for redirecting players to a new game server.
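The division of responsibility described above, where a game server reports trouble and the gateway does the redirecting, can be sketched as follows. The message statuses ("overused", "failed"), the `shed` parameter, and the least-loaded selection rule are assumptions made for illustration, not the Butterfly grid's actual control protocol.

```python
class Gateway:
    """Routes players to game servers and reroutes them on a controlling message."""

    def __init__(self, servers):
        self.load = {name: 0 for name in servers}  # server -> routed player count
        self.route = {}                            # player -> server

    def _least_loaded(self, exclude=None):
        candidates = [s for s in self.load if s != exclude]
        # Break ties by name so the choice is deterministic.
        return min(candidates, key=lambda s: (self.load[s], s))

    def connect(self, player):
        server = self._least_loaded()
        self.route[player] = server
        self.load[server] += 1
        return server

    def on_control_message(self, sender, status, shed=()):
        """Handle a hypothetical 'overused' or 'failed' message from a game server.

        'failed' moves every player off the sender and retires it;
        'overused' moves only the players the sender asks to shed.
        """
        if status == "failed":
            movers = [p for p, s in self.route.items() if s == sender]
        else:
            movers = list(shed)
        for player in movers:
            target = self._least_loaded(exclude=sender)
            self.load[sender] -= 1
            self.load[target] += 1
            self.route[player] = target
        if status == "failed":
            del self.load[sender]

gateway = Gateway(["gs-1", "gs-2", "gs-3"])
for name in ["ann", "bob", "cy", "dee"]:
    gateway.connect(name)

# gs-1 reports failure; the gateway spreads its players over the remaining servers.
gateway.on_control_message("gs-1", "failed")
```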
[Figure labels: Internet; Firewall; Administration Workstation; Intel Xeon Processor-Based Servers; Game Servers]
When client software connects to the system, it can connect to any gateway that is in
service. After authentication and authorization, the gateway translator acts as a proxy
for the client to the back-end server. Multiple servers, each responsible for managing a
segment of the environment, can be used.
If, in the course of using the system, the participant's state changes in such a way that
they need to be served from a different server, a move request is transmitted to the
gateway (a request that is generated by the server), at which point the gateway begins
its proxy communications with the new server.
What's important is that as player state changes, and the gateway proxies the player
to a different server, that server could physically be located in an entirely different
geography or locale. Because the grid is a mesh, and that mesh is logically connected
to other systems within the grid architecture, players can access grid components that
are not physically located in the same place. Of course, this process is transparent to
the end client device or user. A player does not even know it is happening.
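To make the transparency of the move concrete, the sketch below models a gateway-side proxy session whose back-end server can be swapped while the client keeps using the same session. All class and method names, and the server locales, are hypothetical.

```python
class GameServer:
    """Stand-in for a back-end server managing one segment of the environment."""

    def __init__(self, name, locale):
        self.name = name
        self.locale = locale
        self.received = []

    def handle(self, player, action):
        self.received.append((player, action))
        return f"{action}: ok"

class ProxySession:
    """Gateway-side proxy for one authenticated client."""

    def __init__(self, player, server):
        self.player = player
        self._server = server

    def send(self, action):
        # The client calls only this method; it never learns which server answers.
        return self._server.handle(self.player, action)

    def move(self, new_server):
        # Applied when the current server issues a move request to the gateway.
        self._server = new_server

east = GameServer("gs-east", locale="New York")
west = GameServer("gs-west", locale="Los Angeles")

session = ProxySession("ann", east)
before = session.send("walk north")
session.move(west)              # server-generated move request
after = session.send("walk north")
```

The client sees identical replies before and after the move, even though the second action was served from a different server in a different locale.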
The grid allocates resources based on usage, liberating game aficionados from server-
bound game play. Once adopted on a global scale, the Butterfly grid will replace the
far less efficient current online infrastructure and provide an open, scalable,
high-performance environment that delivers responsive, reliable gaming action.
3 Intel Business Center Case Study, Butterfly.net Uses Intel Architecture to Build a Global Gaming Grid, Pg.1, Copyright 2003, Intel Corporation
www.intel.com/go/digitalmedia
Intel, the Intel and Intel Inside logos, Pentium, XScale, Centrino and Xeon are trademarks or registered trademarks of Intel Corporation or its
subsidiaries in the United States and other countries.
Copyright 2003 Intel Corporation. All Rights Reserved.
* Other names and brands may be claimed as the property of others. Information regarding third party products is provided solely for educational
purposes. Intel is not responsible for the performance or support of third party products and does not make any representations or warranties
whatsoever regarding quality, reliability, functionality, or compatibility of these devices or products. 251853001