ArcheAge Unchained’s Belstrom EU server is down for maintenance thanks to an auction house issue

    

The hits just keep on coming for ArcheAge Unchained players, specifically those who play on the EU server Belstrom. As of this writing, the server continues to stay dark for extended maintenance thanks to an issue with the auction house.

Problems started earlier this week when the Belstrom auction house was not delivering gold to users as intended. A few hours later, the devs announced that Belstrom would be taken offline to address the matter, which transformed into a far bigger issue once the team took a look under the hood.

“When we initially diagnosed the data, it was apparent that there were other contributing factors that needed to be addressed immediately. The team has worked around the clock on these issues and we’re making progress at both restoring functionality and cleaning up the incorrect data. It’s imperative that the server remains inaccessible while the restoration occurs.

“Some accounts took advantage of this situation and will be disciplined accordingly prior to the server being opened again.”

The most recent update on the matter explains that there’s still no ETA on when Belstrom will be back online. The EU servers have even been updated today with the exception of Belstrom as the team attempts to work through the problems.

“We understand that you all have been waiting for over 48 hours to gain access to your server again, and that a downtime like this without a solid ETA on when the server will be back up is not ideal,” explains the most recent forum update, “but we need to make sure that the health of the server going forward is maintained, and that any potential fraudulent activity that happened during the auction house issues are cleaned up before we can bring the server online again.”

Update
Aaaaand it’s back.

Reader
Castagere Shaikura

So what happens to the other ArcheAge game now that this one is out? Is it a dead game now?

Reader
Arnold Hendrick

The Achilles’ heel of traditional MMORPG architectures is the database system that manages everything from player wealth, inventory and stats to the constantly changing contents of the auction house. If the DB system “gets behind” the rest of the game, all sorts of havoc can occur. Eventually some “error flags” communicate this, and the server system must be halted before more damage occurs.

Fixing the DB means discovering all the little operations that are taking too much time – the accumulation of delays that caused the problem. Finding the root cause(s) can be difficult, and creating test conditions to duplicate operation “under heavy load” can be equally difficult.
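As a rough illustration of hunting for "the little operations that are taking too much time", the sketch below accumulates wall-clock time per named operation and ranks the totals. Everything in it is hypothetical: `OpTimer`, the operation names `post_listing` and `settle_sale`, and the `time.sleep` calls standing in for real database work. It is not how the ArcheAge Unchained team diagnosed anything.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class OpTimer:
    """Accumulates elapsed time per named operation (hypothetical helper)."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock                  # injectable so it can be faked in tests
        self.totals = defaultdict(float)    # name -> total seconds
        self.counts = defaultdict(int)      # name -> number of calls

    @contextmanager
    def measure(self, name):
        start = self.clock()
        try:
            yield
        finally:
            # Record the time even if the wrapped operation raised.
            self.totals[name] += self.clock() - start
            self.counts[name] += 1

    def slowest(self, n=3):
        """Return up to n (name, total_seconds, calls) tuples, largest total first."""
        ranked = sorted(self.totals.items(), key=lambda kv: kv[1], reverse=True)
        return [(name, total, self.counts[name]) for name, total in ranked[:n]]


timer = OpTimer()
for _ in range(5):
    with timer.measure("post_listing"):   # stand-in for a cheap write
        time.sleep(0)
    with timer.measure("settle_sale"):    # stand-in for a slow transaction
        time.sleep(0.02)

for name, total, calls in timer.slowest():
    print(f"{name}: {total:.3f}s over {calls} calls")
```

Ranking by total time rather than per-call time matches the comment's point: a fast operation called constantly can cost more than a slow one called rarely.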

One solution is throwing hardware at the problem. Gigantic solid state drives (SSDs) and massive RAM buffering can do wonders, but add $50K to $100K (or more) in hardware costs for each server – which may be beyond the means of this developer/operator. The other way is optimizing each little operation, then testing the results, until you get a system that runs quickly using the hardware in place.

Ultimately, the real root cause is that an MMORPG may use a database system poorly designed for the constant read-write activities of the game. Relational database software is familiar and cheap, but performs much slower than a more modern “flat” database system. My guess is that insufficient attention was paid to database architecture. Now both they and their players are paying the price.

Reader
Sorata

For 4(!!!) days and counting…

Apparently this is the only server affected… the code on all the other servers is fine… smells fishy…

Reader
Arnold Hendrick

Probably something about the DB hardware or player usage patterns on that EU server has revealed the problem. If it is the DB, they should migrate any code fixes to other servers with similar hardware.

The problem was revealed to users in the afternoon hours of 11/4/19 (EU time), and the server went offline a few hours later. Today is 11/7/19, so they’re on the fourth day of repairs. It feels like an eternity to a player, but to developers who are probably putting in 10-12+ hour days to fix the crisis, it’s easily possible. You would be shocked at how much time it takes to dig through code (that you probably didn’t write), try a fix, debug it, and then test how much it speeds things up – all before you dare apply it to a test server. Having the live server offline helps – you can use the live hardware for testing as well.

In this case, if I’m not mistaken, the developer is in South Korea, but the operator is in Europe (Germany?). That means multiple levels of bosses at both companies are frequently involved in all communications, plus a language barrier. This is a perfect situation for blaming the other guy, which never makes solutions faster.

I have had producer responsibilities for an MMORPG during development and then in live operation. The most feared problem was a DB failing, since the root cause was frequently in the server-side game code that was feeding stuff into the DB, as well as the efficiency within the DB. We invested extra development time to test and tune the DB, including a week to just build a test rig (in code) that would hammer the DB under typical heavy live use. The upfront investment was successful – our DB managed to keep up with the game. (Although there were ‘near the red line’ problems when the DB had to both operate, and simultaneously spin off a DB copy to a separate machine as a data backup – kind of like you backing up stuff on your PC, only more complex.)

However, as I remember it, ArcheAge Unchained was under heavy pressure for a release as soon as possible. When you are rushing to go live, it’s easy to take shortcuts. Sometimes important stuff gets cut short. The result can be a failure like this.

Remember, I’m not associated with either the developers or operators. My guesses could be totally wrong.

Reader
Sorata

Yeah, I know, I code a bit myself. I’m not that fit in databases, though.
I know it could take a lot of time. But it would be strange if they had coded different stuff on the different servers. Shouldn’t the same problem show up on the other servers otherwise?