Test servers are, let’s face it, kind of a terrible idea that just keeps getting trotted out there in the MMO space. And the older I get and the more I understand about the industry, the more I see them as unnecessary at best and actively harmful to the testing process at worst.
This was on my mind during the most recent World of Warcraft interview roundup, in which it was mentioned that some of the problems with Torghast weren’t found by the people in the test. Now, the point here isn’t to drag that particular statement (which sure sounds like it’s patently false if you were paying attention to the actual beta feedback, but whatever) but rather to look at some of the philosophy of testing that went into this and how it might have contributed to the problem.
Real talk: Part of the problem here comes down to relying on players testing the system to provide useful feedback. And players have different priorities than actual testers, and you can’t just unleash a bunch of them and act like you’ve gotten a good testing session taken care of.
Let’s take a step back. What, exactly, is testing supposed to accomplish? In broad strokes, three things fall under the header of “testing,” and they tend to all be conflated as one job:
- Functional testing: This is the nuts-and-bolts, kick-the-tires sort of testing. Does the server hold up under load? Are there glaring bugs with clicking on something? Can you get locked in infinite loops, good or bad? Do the systems perform like they’re supposed to? This is the process of locating bugs and existing problems and making sure that they’re addressed and corrected.
- Conceptual testing: Is this fun to do? Is the gameplay loop satisfying? Do the quest parts lead into one another and have all of the appropriate connective tissue? Are the rewards commensurate with the time put in? The goal here is to look beyond the immediate elements of whatever is being tested and place it into a larger context, especially when referencing design documents.
- Executive testing: Do the various parts of this content load consistently? Is everything working properly and giving you an idea of how to play it? How do these systems interact with others? This is the process of making sure that the systems and content are explained clearly and are approachable for players.
You might see the problem just from looking at these categories, but let me make this explicit. Most of these are things that players don’t actually care about. No one who is playing on a test server cares about where this new content slots in on a design document, just whether or not it’s fun to play on its own.
In fact, a lot of these things are actively antithetical to the way that players on test servers are looking into all of this. Yes, players want to find bugs and make sure that they’re not there, but they also don’t always know if something is a bug or actually the way it’s supposed to work. Players can tell you whether something is fun or not, but they can’t tell you if there’s a reason something isn’t fun. Heck, most people on test servers are there primarily to try out new content, not really to test it.
Mind you, I’m not faulting the players here. If you’re not getting paid to test this stuff, it is by definition a hobby rather than a job, and there’s no way to fault players for treating a hobby as something to do for fun. But that’s going to naturally affect the quality of testing, and it’s also going to affect how well studios communicate back and forth with the players.
For example, let’s say that hypothetically a new patch for a game adds boat-building. Let’s just have fun and claim it’s, oh, Star Wars: The Old Republic. There’s a whole system for building your own boats in there now. Why? Who cares. The point is, you’re building boats.
Immediately, players start coming back and explaining that boat building isn’t fun, it takes too long, and there aren’t enough places to use the boats. There are several possibilities for what could be happening here. It could be, for example, that the boat building mechanics aren’t explained well enough, so players don’t know how to build boats properly. It could be that boat content isn’t ready for testing yet, so boats are still limited in their usage. It could be that boat building is supposed to be slow because you’re supposed to be done after building one boat.
But absent any of that information, players are going to take away that boat building isn’t fun and isn’t useful. And without having an understanding of why players are finding this system unfun – because players don’t have a full picture of how boats are supposed to work – it’s very easy to fall back on just reassuring players that it will be fun once they have the full picture, even if maybe the problems are very real and need to be actually fixed.
Now, to be fair, there are some things that are best tested with masses of players. Server stability, for example, is much easier to test when you have a whole lot of people stampeding onto your servers with a variety of connection speeds and locations. Similarly, opening up the game into the wild is often the best way to ensure that players at different hardware levels can run the game decently. That’s just basic logic.
The trick is differentiation. Players are useful for testing these things when the game has already undergone most of the other testing. You’re pretty sure that the systems work and are stable. The testers are finding things fun. The game is playing the way it’s supposed to. This is what you hire testers for, people whose job is actually to play the game and provide useful feedback.
And here’s where test servers start to be actively malicious. Because the more the function of tester is moved over to players, the more tempting it is to skimp on the QA staff who can actually do the important work of testing the game in its totality, getting into the areas that players tend to neglect, actually taking a full and comprehensive look at the game and testing it in all the ways it needs testing.
Not putting investment into that, of course, leads to a cycle of hurting. Updates come out with more bugs, less polish, and less understanding of what players want. The blame gets foisted on the people who actually were doing the testing, and then you cycle back to the next update, and…
Yeah, you get the idea.
Test servers are a neat idea. But they’re a neat idea that doesn’t really work as a free battery of professional QA testers and cannot replace them. And the more a company tries to replace proper testing with player efforts, the worse things are going to go as a whole.