This is a summary from our Netgames DiveReal whitepaper.
In his classic 1992 science fiction novel Snow Crash, Neal Stephenson imagined a collective virtual reality called the Metaverse, where user-controlled avatars hung out in 3D venues. It is a virtual world intended for meeting people, doing business, and socializing. Since then, despite several attempts to realize Stephenson’s dream, social virtual worlds still attract orders of magnitude fewer users than other popular online social services.
Issues with Rooms
In virtual worlds, the common way to bring people together is to connect them to the same server. As servers have bounded resources, this limits group size to a few hundred at most, and usually far fewer. This is perfectly acceptable for games whose rules are compatible with this constraint. However, a limit on the number of people who can be together ruins the social experience, first by restricting whom a user can interact with and, even worse, by leaving many users mostly alone for extended periods of time.
This has consequences from the user’s perspective. Let’s start first with some vocabulary:
- Scene: The ‘scene’ is what users see. Here, for the sake of simplicity and because it suffices for basic socialization, we will consider only scenes with avatars moving in a static landscape.
- Decor: The static elements of a scene.
- Room: The portion of virtual space corresponding to a given server. Anyone connected to the server has her avatar in the room. Note that a ‘room’ does not need to look like a real room and may have any size or shape.
- Region: A virtual space can be divided into rooms, each with a different decor. These rooms are called ‘regions’ of the overall space.
- Shard: When many rooms have the same decor, they are called ‘shards’ of the same place.
Shards and regions can be combined and multi-region shards are not uncommon.
In social virtual worlds, whose primary purpose is to be with others, rooms have unwanted and awkward effects. When a user arrives in an empty room, since her goal is to socialize, she does not stay long. Hence, empty rooms tend to remain empty.
One common solution to this issue is to organize a social event in the room at a given time so that many users arrive simultaneously. But then another problem arises: if the event succeeds in attracting people, the room soon reaches its maximum capacity and starts refusing newcomers.
In region-based worlds, like Second Life, the result is that most regions are empty or nearly empty while a few regions are too crowded to accept new users. So, even if many users are connected at a given moment, the typical experience for a newly arrived user is to be alone or with at most one or two others.
The other way to get people apparently in the same place is to have enough shards to host them all. However, users assigned to a given shard can interact with users in the same shard but not with users in other shards. This restriction can be especially annoying for a user who wants to be with someone in particular. Moreover, the fluctuation of users arriving and leaving the system eventually puts some shards below the threshold needed to sustain social activity, further accelerating their depletion.
Note that in the happy case of a steady stream of users joining the system, once all shards are at maximum capacity a new, empty shard has to be made available. But then the next users arrive in an empty room, so they may change their minds and decide not to stay.
Dividing a social virtual world into many rooms, either by sharding or by partitioning into regions, not only isolates users, lessening the social value of the service, but also repels new users by often putting them in empty rooms. This is hardly a design choice; it is an unwanted consequence of the server-based architecture. It is therefore admirable that some social virtual worlds have succeeded at all, though they reach orders of magnitude fewer users than other popular social services.
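The empty-shard effect can be illustrated with a toy model. The sketch below assumes a simple "fill the first shard with free space" policy; that policy and the names `assign_to_shard` and `occupancy_on_arrival` are illustrative assumptions, not a description of how any particular virtual world allocates shards.

```python
from typing import List, Tuple


def assign_to_shard(shards: List[int], capacity: int) -> Tuple[int, int]:
    """Place one arriving user.

    Returns (shard index, number of users already present on arrival).
    Assumed policy: use the first shard with free space, else open a new one.
    """
    for index, occupancy in enumerate(shards):
        if occupancy < capacity:
            shards[index] += 1
            return index, occupancy
    # Every shard is full: open a new shard, which is empty on arrival.
    shards.append(1)
    return len(shards) - 1, 0


def occupancy_on_arrival(n_users: int, capacity: int) -> List[int]:
    """For a steady stream of users, list how many others each one finds."""
    shards: List[int] = []
    return [assign_to_shard(shards, capacity)[1] for _ in range(n_users)]


if __name__ == "__main__":
    # With capacity 3, every third arrival lands in an empty shard.
    print(occupancy_on_arrival(7, capacity=3))
```

Under this assumed policy, one user out of every `capacity` arrivals is greeted by an empty room, however many users are online overall.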
Using exaQuark
To address these issues with rooms we have designed and implemented exaQuark, an overlay aimed at replacing the servers running today’s classical virtual worlds. Unlike a server, the exaQuark overlay can grow at will – by adding more machines – and accommodate an unlimited number of users in a continuous space.
In DiveReal, the first step in connecting to the exaQuark overlay is to ask an allocator for an entry point. The allocator is called only once, but the connection to the entry point remains open for the whole session. Each time the user moves her avatar, the coordinates of the new position are sent to the entry point. Whenever an avatar moves within, leaves, or arrives in the neighborhood, exaQuark sends a notification, so at every moment the user has an up-to-date list of her neighbors and their positions. The decor is handled outside exaQuark by a spatial database that provides the static elements of the world at the avatar’s position. For the first version of DiveReal we use Google Earth and Street View data and servers, so our avatars move in a mirror world, a digital copy of our planet.
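The session flow described above can be sketched as a small in-memory model. Everything here is hypothetical: the class names `ToyEntryPoint` and `Client`, the notification kinds `"arrive"`, `"move"` and `"leave"`, and the fixed-radius neighborhood are assumptions made for illustration, not the exaQuark API or its actual neighborhood rule.

```python
import math
from typing import Callable, Dict, Optional, Tuple

Position = Tuple[float, float]
Notify = Callable[[str, str, Optional[Position]], None]


class ToyEntryPoint:
    """In-memory stand-in for an entry point (assumption: neighborhood = fixed radius)."""

    def __init__(self, radius: float = 10.0) -> None:
        self.radius = radius
        self.positions: Dict[str, Position] = {}
        self.listeners: Dict[str, Notify] = {}

    def connect(self, avatar_id: str, notify: Notify) -> None:
        self.listeners[avatar_id] = notify

    def update_position(self, avatar_id: str, pos: Position) -> None:
        old = self.positions.get(avatar_id)
        self.positions[avatar_id] = pos
        for other, other_pos in self.positions.items():
            if other == avatar_id:
                continue
            was_near = old is not None and math.dist(old, other_pos) <= self.radius
            is_near = math.dist(pos, other_pos) <= self.radius
            if is_near:
                # The other avatar learns about the move (or the arrival).
                self.listeners[other]("move" if was_near else "arrive", avatar_id, pos)
                if not was_near:
                    self.listeners[avatar_id]("arrive", other, other_pos)
            elif was_near:
                self.listeners[other]("leave", avatar_id, None)
                self.listeners[avatar_id]("leave", other, None)


class Client:
    """Keeps an up-to-date neighbor list from entry-point notifications."""

    def __init__(self, avatar_id: str, allocator: Callable[[], ToyEntryPoint]) -> None:
        self.avatar_id = avatar_id
        self.neighbors: Dict[str, Position] = {}
        self.entry_point = allocator()  # the allocator is asked only once per session
        self.entry_point.connect(avatar_id, self._on_notification)

    def move_to(self, pos: Position) -> None:
        # Each move sends the new coordinates to the entry point.
        self.entry_point.update_position(self.avatar_id, pos)

    def _on_notification(self, kind: str, avatar_id: str, pos: Optional[Position]) -> None:
        if kind == "leave":
            self.neighbors.pop(avatar_id, None)
        else:  # "arrive" or "move"
            self.neighbors[avatar_id] = pos


if __name__ == "__main__":
    entry_point = ToyEntryPoint(radius=10.0)
    alice = Client("alice", lambda: entry_point)
    bob = Client("bob", lambda: entry_point)
    alice.move_to((0.0, 0.0))
    bob.move_to((3.0, 4.0))
    print(alice.neighbors, bob.neighbors)
```

The point of the sketch is the division of labor: the client only pushes its own position and consumes notifications, so its neighbor list stays current without ever polling.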
Distributed System and Consistency
Usually, virtual world servers do more than just ensure communication among connected users. In the standard implementation the server is a simulator that computes the state of the world and broadcasts it to all. When a user wants her avatar to perform some action such as “move forward” or “shoot monster”, the request is forwarded to the server, which computes the resulting effects on the world. For example, the order “move forward” would yield an effect such as “avatar moved 20 cm before hitting a wall”, while the ensuing order “shoot monster” could end up as “too late, the monster just ate your avatar”.
Having a server in the middle that receives all data and computes simulations is clearly a bottleneck and can incur annoying delays, but because the state of the world has a unique source, consistency is naturally ensured, with all participants eventually receiving the same data about the scene. In contrast, when using exaQuark, there is no central point where the state of the world is computed. Instead, avatars’ actions are broadcast to the neighbors, and a simulation of the immediate surroundings of the user’s avatar is performed on each user’s computer. Hence the simulation is distributed, which implies that at some point two users might have divergent views of the world.
However, what matters is not absolute consistency but consistency good enough that gaming or social activities are not disrupted. DiveReal’s strategy for dealing with discrepancies in replicated states is to allow some divergence in the short term, with a guarantee that the replicas will eventually converge within a delay compatible with the application. This strategy, called optimistic replication, is greatly simplified when the set of allowed operations is limited in order to avoid conflicts among replicas. As DiveReal is intended for social interactions, we have, at first, restricted the set of allowed actions to a bare minimum: a user can only move her avatar, and she is the only one allowed to do so. When an avatar moves, not all simulators learn of it simultaneously, but, as shown in virtual world behavioral studies, most avatar movements are followed by long periods of immobility, giving the replicas time to synchronize their states. When relying on a central server, consistency might arguably be better, but at the cost of an increased delay between the user’s action and its visible effects. This latency, known as lag, is induced by the round trip to the distant server and becomes even worse when the server approaches its maximum capacity and falls behind in computing and transmitting the next state of the world.
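To see why the single-writer restriction simplifies optimistic replication, consider the following minimal sketch. It assumes each owner stamps her moves with an increasing sequence number and that replicas keep only the highest-numbered move per avatar; the sequence numbers and the names `MoveUpdate` and `Replica` are assumptions for illustration, since the summary does not say how DiveReal orders updates.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Position = Tuple[float, float]


@dataclass(frozen=True)
class MoveUpdate:
    avatar_id: str   # only the avatar's owner ever issues updates for it
    seq: int         # assumed per-avatar counter, incremented by the owner
    pos: Position


class Replica:
    """One user's local copy of the positions of nearby avatars."""

    def __init__(self) -> None:
        self.latest: Dict[str, MoveUpdate] = {}

    def apply(self, update: MoveUpdate) -> bool:
        """Apply an update; stale or duplicate updates are ignored."""
        current = self.latest.get(update.avatar_id)
        if current is not None and current.seq >= update.seq:
            return False
        self.latest[update.avatar_id] = update
        return True

    def position_of(self, avatar_id: str) -> Optional[Position]:
        update = self.latest.get(avatar_id)
        return update.pos if update else None


if __name__ == "__main__":
    moves = [
        MoveUpdate("alice", 1, (0.0, 0.0)),
        MoveUpdate("alice", 2, (1.0, 0.0)),
        MoveUpdate("bob", 1, (5.0, 5.0)),
    ]
    first, second = Replica(), Replica()
    for update in moves:            # delivered in order
        first.apply(update)
    for update in reversed(moves):  # delivered out of order
        second.apply(update)
    print(first.latest == second.latest)
```

Because each avatar has exactly one writer, two updates for the same avatar never conflict; they are simply ordered by the owner's counter. Replicas may disagree while messages are in flight, but once both have received the same set of updates they hold the same state, whatever the delivery order.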
Next steps
The DiveReal architecture around the exaQuark overlay enables a virtual world to host thousands, and even millions, of users in a seamless space. To reach such scalability, we have had to rethink how virtual worlds are implemented and to separate functions that are often assumed to belong together.
In the DiveReal architecture the decor comes from a spatial database, the communication is peer-to-peer when possible, the simulation is distributed, and connections are established by proximity through a Delaunay graph computed by a distributed algorithm. However, major issues remain to be solved. First, the simulation of the world is overly simplistic. Second, and more complex to tackle, the social rules for a virtual world with crowds have yet to be invented.
Servers acting as referees for a zone might allow playing FPS games and, more generally, innovative architectures for complex physics using Distributed Scene Graphs. However, designing human-computer interactions for crowds in a social virtual world is a messy process involving trial and error with humans. Our first experiments with users show, for example, that teleportation tends to disrupt conversations and that assembling people with enough things in common to enact meaningful interactions has no easy recipe.
To progress on this front, further experiments should involve growing numbers of users, both real and simulated. We can speculate that, eventually, a social virtual world will start attracting users while having an increasing value as more people join it, hopefully with no “room” limit to prevent further growth.
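As an illustration of what "connections by proximity through a Delaunay graph" means, the sketch below computes Delaunay neighbors for a handful of avatars. It is a centralized, brute-force construction (every triangle whose circumcircle contains no other point is kept), written only to show the resulting neighbor relation. It is not the distributed algorithm used by exaQuark, and the function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, Set, Tuple

Position = Tuple[float, float]


def _orientation(a: Position, b: Position, c: Position) -> float:
    """Positive if a, b, c turn counter-clockwise, zero if collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _in_circumcircle(a: Position, b: Position, c: Position, d: Position) -> bool:
    """True if d lies strictly inside the circle through a, b, c (given counter-clockwise)."""
    ax, ay = a[0] - d[0], a[1] - d[1]
    bx, by = b[0] - d[0], b[1] - d[1]
    cx, cy = c[0] - d[0], c[1] - d[1]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    return det > 0


def delaunay_neighbors(points: Dict[str, Position]) -> Dict[str, Set[str]]:
    """Brute-force Delaunay graph; fine for a few points, far too slow for crowds."""
    neighbors: Dict[str, Set[str]] = {name: set() for name in points}
    for a, b, c in combinations(points, 3):
        pa, pb, pc = points[a], points[b], points[c]
        turn = _orientation(pa, pb, pc)
        if turn == 0:
            continue  # collinear points do not form a triangle
        if turn < 0:
            pb, pc = pc, pb  # make the triangle counter-clockwise
        others = (points[d] for d in points if d not in (a, b, c))
        if any(_in_circumcircle(pa, pb, pc, p) for p in others):
            continue  # some avatar is inside the circumcircle: not a Delaunay triangle
        for u, v in ((a, b), (b, c), (a, c)):
            neighbors[u].add(v)
            neighbors[v].add(u)
    return neighbors


if __name__ == "__main__":
    avatars = {"west": (-2.0, 0.0), "east": (2.0, 0.0),
               "north": (0.0, 1.0), "south": (0.0, -1.0)}
    for name, linked in delaunay_neighbors(avatars).items():
        print(name, sorted(linked))
```

In this example "west" and "east" are not linked, because "north" and "south" sit between them. Each avatar is connected only to the avatars that surround it, which is what lets the number of connections per user stay small even in a crowd.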