Elitist Jerks
Old 02/21/07, 9:56 AM   #16
Diabolic
Glass Joe
 
Undead Priest
 
Ursin
Originally Posted by Shlomi View Post
Backend server software is likely to be custom C or C++, because those are still the language of choice for games programming, and Blizzard doubtless needs to wring every last ounce of performance out of their kit. They also want people who have 5 years C++: http://www.blizzard.com/jobopp/core-...ogrammer.shtml

The credits inside the manual in the TBC expansion credit Brian Fitzpatrick as the network programming lead (which actually sounds like a super fun job, they have lots of stuff going on to cram a 40-man MC raid into 2.5KB/sec downstream). This might be the same Fitz who designed the Subversion repository architecture, but then again he might still be at Google.
As a programmer with rather extensive experience myself, I would say that C++ is all but a certainty for the language. Virtually any other choice would be unable to meet the performance needs.

As for the network programming, I'm also fairly confident that Blizz reduces everything that it possibly can to a bit or byte(s), with minimal overhead for anything outside of TCP. I have done very similar work in the past and although it is not easy, it is indeed fun.
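To give a feel for what "reducing everything to a bit or byte" can look like, here is a minimal Python sketch. The field layout (a 32-bit entity id, 16-bit quantized x/y, one byte of flags), the world size, and the flag names are all made up for the example. Nothing here is known about Blizzard's actual protocol.

```python
import struct

# Hypothetical wire format, invented for illustration:
#   uint32 entity id, uint16 x, uint16 y, uint8 flags -> 9 bytes, big-endian
UPDATE_FORMAT = ">IHHB"
WORLD_SIZE = 20000.0  # assumed map width in game units

FLAG_MOUNTED = 0x01
FLAG_IN_COMBAT = 0x02
FLAG_STEALTHED = 0x04


def quantize(coord: float) -> int:
    """Map a coordinate in [0, WORLD_SIZE] onto a 16-bit integer."""
    clamped = min(max(coord, 0.0), WORLD_SIZE)
    return round(clamped / WORLD_SIZE * 0xFFFF)


def dequantize(value: int) -> float:
    """Inverse of quantize, accurate to within one quantization step."""
    return value / 0xFFFF * WORLD_SIZE


def pack_update(entity_id: int, x: float, y: float, flags: int) -> bytes:
    """Pack one position update into 9 bytes."""
    return struct.pack(UPDATE_FORMAT, entity_id, quantize(x), quantize(y), flags)


def unpack_update(payload: bytes):
    """Unpack a 9-byte update back into (entity_id, x, y, flags)."""
    entity_id, qx, qy, flags = struct.unpack(UPDATE_FORMAT, payload)
    return entity_id, dequantize(qx), dequantize(qy), flags
```

With this made-up layout each update costs 9 bytes, so the 2.5KB/sec figure quoted above would cover roughly 280 of them per second before any header overhead. Three booleans ride along in a single byte, and the coordinates lose a little precision in exchange for halving their size compared to 32-bit floats.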

Lastly, could you elaborate briefly on what the Subversion Repository Architecture is? I would appreciate it.

Old 02/21/07, 10:04 AM   #17
Mercuria
Glass Joe
 
Night Elf Druid
 
Skywall
Originally Posted by Diabolic View Post
Lastly, could you elaborate briefly on what the Subversion Repository Architecture is? I would appreciate it.

http://subversion.tigris.org/

It's an open-source configuration management tool, like cvs and such.

Old 02/21/07, 10:06 AM   #18
spinal
Glass Joe
 
Tauren Shaman
 
Turalyon (EU)
Originally Posted by Kerruul View Post
It's unclear from the SAN engineer posting what storage system they use, but I'd bet on either EMC, IBM or maybe Hitachi based arrays with probably a Brocade fabric. (Those being the most common I've run across.) Possible NetApp, but that's more of a NAS solution and most serious DB people prefer to use FC-AL over iSCSI or NAS-based solutions.
I do recall seeing EVA experience explicitly listed in the storage engineer requirements back when WoW launched. Unfortunately I don't remember whether it was just listed among others, but it stuck in my mind because we were testing some of these ourselves at the time.

Old 02/21/07, 10:07 AM   #19
Foxery
Von Kaiser
 
Tauren Druid
 
Auchindoun
Also worth noting, for those who haven't thought about it: It's believed that each "realm" physically consists of 4-5 actual computers. Kalimdor, Eastern Kingdoms, Outlands, Instances.

edit: Ever since Battle Groups were implemented, PvP must also be a separate machine -- a pretty beefy one that hosts all of the realms in a particular hosting site.

Hence why performance problems on one don't affect the others, and why a single continent can crash without the whole realm dying.

Old 02/21/07, 12:04 PM   #20
Magunsson
Von Kaiser
 
Gnome Mage
 
Argent Dawn
Originally Posted by Foxery View Post
Also worth noting, for those who haven't thought about it: It's believed that each "realm" physically consists of 4-5 actual computers. Kalimdor, Eastern Kingdoms, Outlands, Instances.

edit: Ever since Battle Groups were implemented, PvP must also be a separate machine -- a pretty beefy one that hosts all of the realms in a particular hosting site.
Interesting thread. The chat server is almost certainly separate as well, as often the realms can die but chat will still be working until they shut down the realm properly. There will be a number of "management" servers as well, which monitor the status of the other servers and allow remote reboots etc. These may be per realm, or maybe for groups of realms, or maybe there is just one (with some redundancy) per continent. Speculation here, obviously.

Realms are also grouped into physical sites; we know this because it has been said. Who is in your Battlegroup depends on who is in your physical site.

The authentication server architecture seems to be something that has caused them a lot of bother. It seems like they have only one Authentication server per Continent, which causes problems when mass numbers of realms are restarted as the Authentication server bottlenecks. I suspect now they have some kind of distributed architecture for the authentication server as this has been improved in recent times, but the login-spikes are sufficiently huge sometimes to still cause problems.

(edit: typos)

Last edited by Magunsson : 02/21/07 at 12:11 PM.

Old 02/21/07, 12:09 PM   #21
Magunsson
Von Kaiser
 
Gnome Mage
 
Argent Dawn
Originally Posted by Shlomi View Post
Backend server software is likely to be custom C or C++, because those are still the language of choice for games programming, and Blizzard doubtless needs to wring every last ounce of performance out of their kit. They also want people who have 5 years C++: http://www.blizzard.com/jobopp/core-...ogrammer.shtml
Other pages, including the "how to submit a resume" page, mention C++. This ad also mentions STL and Boost, which suggests they are from the Herb Sutter / Scott Meyers school of hard knocks.

From a project management point of view, it talks about Scrum and other Agile development practices, which makes a lot of sense. WoW especially seems to release things with a "timeboxing" concept, where the release date is (semi) fixed at the start, and they cram in features until that release date, moving whatever features they can't fit into future patches.

Old 02/21/07, 12:14 PM   #22
Magunsson
Von Kaiser
 
Gnome Mage
 
Argent Dawn
Originally Posted by Mercuria View Post
http://subversion.tigris.org/

It's an open-source configuration management tool, like cvs and such.
That they have an ex-Googler is interesting, although I suspect like most of the games industry, they would use Alien Brain for config control.

Old 02/21/07, 2:45 PM   #23
Meep
Glass Joe
 
Dwarf Priest
 
Aszune (EU)
Originally Posted by Magunsson View Post
That they have an ex-Googler is interesting, although I suspect like most of the games industry, they would use Alien Brain for config control.
For art assets, perhaps. I've never heard of a games company using Alienbrain for code assets, and Avid seem rather reluctant to name any on their web site. Perforce and Subversion would be more likely choices.

Old 02/21/07, 3:08 PM   #24
Snowcrasher
Custom User Title
 
Orc Hunter
 
Mal'Ganis
Originally Posted by Foxery View Post
Also worth noting, for those who haven't thought about it: It's believed that each "realm" physically consists of 4-5 actual computers. Kalimdor, Eastern Kingdoms, Outlands, Instances.

edit: Ever since Battle Groups were implemented, PvP must also be a separate machine -- a pretty beefy one that hosts all of the realms in a particular hosting site.
I personally think this is the more interesting aspect of their operations to learn about. Which server platform/processor/router brand etc. doesn't really say too much without an understanding of exactly how many servers comprise a realm and how they are partitioned.

Old 02/21/07, 3:43 PM   #25
Ukerric
Don Flamenco
 
Dwarf Priest
 
Dalaran (EU)
Originally Posted by Haldane View Post
If by "Itanium Blades", you mean "HP Opteron-based servers", you'd be correct.
A friend of my manager was a customer sales manager at HP and was involved in processing the dozens of hundreds or so DL580s that Blizzard purchased in August 2004 in preparation for the original launch.

I used to joke that they probably got free shipping with their order.

Old 02/21/07, 4:08 PM   #26
Shlomi
Tank Wannabe
 
Night Elf Warrior
 
Baelgun
Originally Posted by Magunsson View Post
The authentication server architecture seems to be something that has caused them a lot of bother. It seems like they have only one Authentication server per Continent, which causes problems when mass numbers of realms are restarted as the Authentication server bottlenecks. I suspect now they have some kind of distributed architecture for the authentication server as this has been improved in recent times, but the login-spikes are sufficiently huge sometimes to still cause problems.
Blizzard originally went live with JAAS, a Java-based auth service. Sadly I can't find the original reference for that info. It didn't scale wonderfully and I believe they're now on their own custom solution.

The problem is pretty hard. "Given a million logged in North American users on hundreds of realms spread over multiple geographical locations, ensure no-one can log in more than once, expired users can't log in, etc etc". Again, fun stuff to work on.
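The two rules named there (no double logins, no expired accounts) are easy to state in a single process. The sketch below does only that: a toy in Python with hypothetical names (`SessionRegistry`, `LoginRejected`), not a description of any real auth service. The hard part of the real problem is getting the same guarantees across many machines and sites, which this deliberately ignores.

```python
import time


class LoginRejected(Exception):
    """Raised when a login attempt violates one of the rules."""


class SessionRegistry:
    """Toy, single-process stand-in for a region-wide auth service."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._paid_until = {}   # account -> subscription expiry (epoch seconds)
        self._active = {}       # account -> realm currently logged in to

    def set_subscription(self, account: str, paid_until: float) -> None:
        self._paid_until[account] = paid_until

    def login(self, account: str, realm: str) -> None:
        expiry = self._paid_until.get(account)
        if expiry is None or expiry <= self._clock():
            raise LoginRejected("subscription expired or unknown account")
        if account in self._active:
            raise LoginRejected(f"already logged in on {self._active[account]}")
        self._active[account] = realm

    def logout(self, account: str) -> None:
        self._active.pop(account, None)
```

Every login touches the one shared `_active` table, which is exactly the kind of central state that turns into a bottleneck when a whole batch of realms comes back up at once.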

Old 02/21/07, 5:01 PM   #27
Foxery
Von Kaiser
 
Tauren Druid
 
Auchindoun
Originally Posted by Snowcrasher View Post
I personally think this is the more interesting aspect of their operations to learn about. Which server platform/processor/router brand etc. doesn't really say too much without an understanding of exactly how many servers comprise a realm and how they are partitioned.
Someone mentioned Chat as being separated from continent servers... Which makes me think that perhaps there is some sort of "Master" machine (or process on one of the others, at least) which controls Chat, the Character Selection screen, and the item database. This one would keep track of which continent your character is supposed to be on and when you zone to another. It would also help explain why you're able to stay connected when other continents crash. [NOTE: this is only an educated guess based on observed behavior over the years.]
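As a rough sketch of that "Master" idea, here is a toy Python directory that remembers which continent each character is on and lets one continent go down without touching the others. The class and method names are hypothetical and the whole thing is a guess at the shape of the behavior, not at how it is really built.

```python
class RealmDirectory:
    """Toy 'master' process: knows which continent each character is on."""

    def __init__(self, continents):
        self._online = {name: True for name in continents}
        self._location = {}  # character -> continent name

    def enter_world(self, character: str, continent: str) -> None:
        self._require_online(continent)
        self._location[character] = continent

    def zone_to(self, character: str, continent: str) -> None:
        """Hand an already-connected character to another continent server."""
        if character not in self._location:
            raise KeyError(character)
        self._require_online(continent)
        self._location[character] = continent

    def crash(self, continent: str) -> list:
        """Mark one continent down; return the characters that get dropped."""
        self._online[continent] = False
        dropped = sorted(c for c, where in self._location.items() if where == continent)
        for character in dropped:
            del self._location[character]
        return dropped

    def connected(self, character: str) -> bool:
        return character in self._location

    def _require_online(self, continent: str) -> None:
        if not self._online.get(continent, False):
            raise RuntimeError(f"{continent} server is unavailable")
```

In this model a crash only disconnects characters whose last recorded continent is the one that died, which matches the observed behavior of staying online while another continent goes down.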

Realizing these aspects of server structure makes it all the more amusing when the General forums whine about "fix my server" or "upgrade my slow realm"... It's a lot more involved than sitting down at one keyboard and pressing Control-Alt-Delete. Maintenance on "a realm" involves checking the software configuration and wiring for 4-5 boxes... It's no wonder upgrading the hardware last year required 24-hour downtimes - a swarm of technicians had to reorganize something like 40 machines per hosting site, ensuring that each one was configured for the correct function and wired to its sister boxes!

Back to hardware specifics, the processor types and OS don't seem nearly as interesting as wondering what sort of RAM and storage requirements it really takes. Keep in mind that the server doesn't need ANY graphics processing whatsoever, so these figures are probably strikingly small compared to what professional servers are capable of having in them. (I don't know where to start guessing, though.)

All a server really needs to keep in memory is a standard data structure listing characters, mobs, resources, and the coordinates of each. The item database is... well, just an ordinary database. I am oversimplifying here - of course the server has to track interactions between players and mobs, but the math happening inside the CPU is a lot simpler than the sexy 3D graphics at home make it seem.
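A deliberately oversimplified Python sketch of that "standard data structure" might look like the following. `Entity` and `ZoneState` are hypothetical names, and the linear scan in `within` is an assumption made for brevity; a server handling thousands of entities would presumably use some kind of spatial index instead.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entity:
    entity_id: int
    kind: str      # "player", "mob" or "resource"
    x: float
    y: float


class ZoneState:
    """Everything the toy server 'knows' about one zone: ids and coordinates."""

    def __init__(self):
        self._entities = {}

    def add(self, entity: Entity) -> None:
        self._entities[entity.entity_id] = entity

    def move(self, entity_id: int, x: float, y: float) -> None:
        entity = self._entities[entity_id]
        entity.x, entity.y = x, y

    def within(self, x: float, y: float, radius: float, kind: Optional[str] = None):
        """Return entities within `radius` of (x, y), optionally of one kind."""
        return [
            e for e in self._entities.values()
            if (kind is None or e.kind == kind)
            and math.hypot(e.x - x, e.y - y) <= radius
        ]
```

A query like `zone.within(px, py, 40.0, kind="mob")` is the sort of thing the server would need for aggro checks or for deciding which updates to send to which client.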

Food for thought.

Old 02/21/07, 5:27 PM   #28
Fex
Piston Honda
 
Human Mage
 
Azgalor
My educated guess on what the application actually "does" is pretty simple. There's a back-end database which contains your character data, as well as all the other data in the world: mob types, individual mob data, items, saved raid instances, etc. Connecting to that is a "simple" (relatively speaking) application which tracks character and mob movement. It interfaces directly to the clients which run on our desktops. Figuring out exactly which interactions are server-side and which are client-side I believe is mostly a guessing game, though there are some predictable actions that you can assign to each one. Anything which affects the entire world (i.e. things which multiple people can interact with) is server-side. Anything that you control or manipulate is triggered from the client. For instance: Two characters ride up to an herb spawn, and dismount, one gets the tap, the other doesn't. What happens? The server reports to the client that there's an herb at x,y on the map. Your herb radar pops up on your client based on this information which is sent to you (once you're in a certain radius.) The other player gets the same information. Network latency and system load determine who gets the "blip" first. You both run towards it -- this is pure client control. Does the server need to know you're moving? Not really. When you both get there, and dismount, you click the node. Client tells server, "hey, player is at x,y trying to pick up the weed." Server determines if you're in range of it, sends client "OK" and you begin harvesting. Based again on network latency which we all know and love, when the actual event is processed by the server, and sent to the client, is variable. Your bar fills up, client tells server "done picking the weed" and whoever got their "finished" packet back to the server first gets the herbs.
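The herb example boils down to two server-side checks: a range test when the client says it clicked the node, and "first finished packet wins" at the end. A minimal Python sketch, with a made-up `GATHER_RANGE` and hypothetical names throughout, could be:

```python
import math

GATHER_RANGE = 5.0  # assumed interaction radius, in game units


class HerbNode:
    """Server-side view of one herb spawn (all names are hypothetical)."""

    def __init__(self, x: float, y: float):
        self.x, self.y = x, y
        self.gathering = set()   # players whose cast bar is currently running
        self.looted_by = None    # winner, once someone finishes

    def start_gather(self, player: str, px: float, py: float) -> bool:
        """Client says 'player is at (px, py) and clicked the node'."""
        if self.looted_by is not None:
            return False
        if math.hypot(px - self.x, py - self.y) > GATHER_RANGE:
            return False         # out of range: no "OK" sent back
        self.gathering.add(player)
        return True

    def finish_gather(self, player: str) -> bool:
        """Client says 'done picking'. The first packet processed wins."""
        if player not in self.gathering or self.looted_by is not None:
            return False
        self.looted_by = player
        self.gathering.clear()
        return True
```

Note that the server never decides who "deserved" the herb. It only processes `finish_gather` calls in the order they arrive, so latency picks the winner.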

What this all really means is that system load in different parts of the systems affect our experience in different ways. Mobs warping around is usually a network latency problem, or "server" load. (I use "server" to mean the app that interfaces with our desktop clients on the front end, and the database on the back end.) I imagine some component of mob movement is server and database related. If the server can't update positions to the client often enough, they warp. My guess about how much interaction takes place for mob movement is that the server sends a position, a direction, and a speed every once in a while to the client, based on mobs in some arbitrary range. It then lets the client render the movement of that mob until it gets a newer update on where it should be going. This is what causes warping when there is network latency, or server lag. Your client thinks a mob is in a certain spot, but it's really not. So you either aggro it from where you don't expect, or you can't enter combat because you get out of range errors. Loot lag is a database problem. Most likely, the server process made a call to update your character data based on clicking an item in a loot window, and the success response was not forthcoming. Why does it wait? Well, you want to make SURE that when the user feedback indicates the item is looted (the window closes, or some other visual indication) that it ACTUALLY has been looted and the database is updated and saved. Otherwise, you might have lost items. ("HAI GUYZ I LOOT EPIX AND ITZ GONE NOW!!1?!?")
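The mob movement guess (position, direction and speed sent every so often, with the client filling in the gaps) is usually called dead reckoning. Here is a small Python sketch of it, assuming a hypothetical `MobUpdate` message with those fields plus a timestamp. The message format is invented for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class MobUpdate:
    """What the server is assumed to send: position, heading, speed, timestamp."""
    x: float
    y: float
    heading: float   # radians, 0 = +x axis
    speed: float     # game units per second
    sent_at: float   # server time in seconds


def predict(update: MobUpdate, now: float):
    """Client-side guess of where the mob is `now`, from the last update."""
    dt = now - update.sent_at
    return (
        update.x + math.cos(update.heading) * update.speed * dt,
        update.y + math.sin(update.heading) * update.speed * dt,
    )


def warp_distance(old: MobUpdate, new: MobUpdate) -> float:
    """How far the mob visibly jumps when a late update replaces the guess."""
    px, py = predict(old, new.sent_at)
    return math.hypot(new.x - px, new.y - py)
```

The longer the gap between updates, the further `predict` can drift from where the server really has the mob, and the bigger the jump when the next update finally lands. That jump is the warp.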

A lot of this seems pretty obvious to me because I work with applications like this every day, just not nearly on this scale. I hope it's given what I can assume to be a more educated audience than the WoW forums crowd at least a little insight, though it really is only an educated guess after all.

Wall of text crits you for 45212.
You die.

Old 02/21/07, 5:31 PM   #29
Kerruul
Piston Honda
 
Troll Mage
 
Mug'thol
Originally Posted by Snowcrasher View Post
I personally think this is the more interesting aspect of their operations to learn about. Which server platform/processor/router brand etc. doesn't really say too much without an understanding of exactly how many servers comprise a realm and how they are partitioned.
As is often the case, the more interesting thing is the harder thing to suss out. There are some things we can say with some confidence based on behavior when things go wrong:
- Instances and battlegrounds are on a separate (set of) server(s) from the major zones in a battlegroup. It's possible that there are numerous machines acting as instance/battleground servers in a battlegroup (probably depending on population/demand).
- Authentication is a separate service, probably battlegroup or region-wide. It is probably some sort of Kerberos-like token passing system.
- Chat is also a separate service (though what hardware hosts it is hard to say)
- Each major zone (EK, Kalimdor, Outland) is a separate instance server process, and possibly separate hardware.
- It does not appear that they do failover clustering for instances, including Kalimdor et al. (i.e. if they did, you could have a situation where half your friends in Kalimdor would get dropped but you'd stay online with no significant issue while you're questing in, say, Tanaris. I've never observed this behavior during server crashes.)
- Battlegroups are probably some sort of super-cluster. If they don't share database/storage resources they can probably cross-connect to some degree. (This would be necessary for x-realm BGs, and is probably one of the significant technical issues they had to hurdle to make that feature possible.)
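To make the "token passing" guess in the authentication bullet concrete, here is a very stripped-down Python sketch: the auth service signs a short-lived ticket, and a realm server checks the signature and expiry without calling back. This is not Kerberos (no ticket-granting tickets, no per-service keys, no encryption), and the ticket layout, the shared key and the function names are all assumptions for illustration. It also assumes account and realm names never contain `|`.

```python
import hashlib
import hmac

SHARED_KEY = b"example-key-shared-by-auth-and-realm"  # placeholder secret


def issue_ticket(account: str, realm: str, expires_at: int, key: bytes = SHARED_KEY) -> str:
    """Auth service: sign 'account|realm|expiry' so realms can trust it."""
    body = f"{account}|{realm}|{expires_at}"
    signature = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}|{signature}"


def verify_ticket(ticket: str, realm: str, now: int, key: bytes = SHARED_KEY):
    """Realm server: return the account name if the ticket is valid, else None."""
    try:
        account, ticket_realm, expires_at, signature = ticket.split("|")
        expiry = int(expires_at)
    except ValueError:
        return None              # wrong number of fields or non-numeric expiry
    body = f"{account}|{ticket_realm}|{expires_at}"
    expected = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None              # tampered with, or signed with another key
    if ticket_realm != realm or now >= expiry:
        return None              # issued for a different realm, or expired
    return account
```

The appeal of a scheme like this is that the realm servers only need the key, not a live connection to the auth service, for every check. That would fit with authentication being a separate service that can struggle on its own while already-connected players carry on.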

I'm not sure what else you can say with any degree of confidence about the deployment architecture (short of having insider info).

Old 02/21/07, 5:57 PM   #30
Meddler
Piston Honda
 
Tauren Druid
 
Blackrock
Originally Posted by Kerruul View Post
(i.e. if they did, you could have a situation where half your friends in Kalimdor would get dropped but you'd stay online with no significant issue while you're questing in, say, Tanaris. I've never observed this behavior during server crashes.)
During the first week or two of TBC we did actually see this happening quite a lot on Blackrock. Outland, as expected, was horribly overcrowded and regularly dropping people. Some of the time it was Outland as a whole; a lot of the time, however, it was just Hellfire Peninsula, with those of us who had escaped to Terokkar (especially) or Zangarmarsh (to some degree) able to watch others fall offline on a regular basis.
