Elitist Jerks
Old 06/04/07, 5:48 PM   #1
Disquette
doop doop de doooo
 
 
Human Rogue
 
Sargeras
Survey of Theorycrafters - are you a good tester?

I'm curious, after all the churn we've had in the Shaman threads: how rigorously are your spreadsheets, simulators, etc. tested?

In the Shaman Enhancement thread, there are some things that have seemed so obvious in theory, but have turned out to be wrong with even rudimentary testing. Basic damage formulas are being redone in at least one case.

So, my questions to you are:

1) How closely do tests match predicted results for your class/spec?
2) How do you go about conducting your tests?
3) Are tests done by multiple people to ensure independently good results?
4) Do you have any general purpose tricks for testing that make runs go smoothly (I realize this is very open ended, but I know I've picked up a couple, and am wondering if others have some as well).

vv fluff question, I realize, but I'm curious
5) How confident are you, if you were to start a new class, that the people doing theory crafting in the Elitist Jerks forum have "figured it all out", at least for the most part, and you could rely on that calculator?

http://us.battle.net/wow/en/forum/to...6766?page=3#41
Let me map a priority list out for you so that you can refer to it in the future:
1. Money 2. Money 3. PvE 4. Mages 5. Companion pets 6. PvP

United States Offline
Old 06/04/07, 6:06 PM   #2
slant
Don Flamenco
 
Draenei Shaman
 
Drenden
I learned long ago not to expect MMO mechanics to behave logically or as promised. I was actually one of the very first theorycrafters, back when applying tests and math to game mechanics was an entirely new concept. Most notably, I was the guy who proved that stats didn't do anything in EverQuest, rendering most (EverQuest) shaman buffs worthless. Back then I was just as hardcore as most of the hardcore progression guilds are today. That was a long time ago, and I don't get into the games so much these days, but I still find the intricate mechanics pretty compelling.

Valid testing methodology, scientific method, applied mathematics, all of those things are pretty well understood these days. Not on the Blizzard forums, certainly, but in this forum. If a theory is found invalid, you throw it out and start banging on a new one. You guys are doing it right.

United States Offline
Old 07/01/07, 7:47 PM   #3
Hadd
Glass Joe
 
Draenei Shaman
 
Wildhammer
I just started doing some testing on different weapons, and weapon enchants. I know all that sort of stuff has been figured out already, but I needed to farm signets anyway.

For my test I've been fighting the same 3 types of mobs and running the combat log and a damage meter over 1000 melee hits. So far the damage and crit rates have been fairly consistent over that span. I haven't done the math to determine standard deviations or confidence intervals yet, and I still have some tests I'd like to run, mostly to answer one question: how small a sample size is still significant? The early results look like over 500 hits is enough to smooth out most of the spikes in crits and weapon procs.

Has anyone else looked into the sample size needed to smooth out outliers? Taking a different approach, has anyone tried using Student's t-test to reduce the sample size needed?
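
For what it's worth, the standard deviation and confidence interval mentioned above can be worked out from a parsed combat log in a few lines. The sketch below is illustrative only: the damage values are made up, the function name is hypothetical, and it assumes a normal approximation (z = 1.96 for roughly 95% confidence), which is reasonable for hundreds of hits but not for a handful.

```python
import math
import statistics


def mean_confidence_interval(hits, z=1.96):
    """Return (mean, sample std dev, CI half-width) for a list of damage values.

    Uses the normal approximation: half-width = z * stdev / sqrt(n).
    statistics.stdev needs at least two values and raises otherwise.
    """
    n = len(hits)
    mean = statistics.mean(hits)
    stdev = statistics.stdev(hits)  # sample standard deviation (n - 1)
    half_width = z * stdev / math.sqrt(n)
    return mean, stdev, half_width


# Made-up melee hits for illustration; the two large values stand in for crits.
hits = [412, 388, 805, 401, 395, 420, 790, 399, 407, 415]
mean, stdev, half_width = mean_confidence_interval(hits)
print(f"mean {mean:.1f}, stdev {stdev:.1f}, 95% CI +/- {half_width:.1f}")
```

Because the half-width shrinks with the square root of the number of hits, going from 500 to 1000 hits narrows the interval by about 29%, not by half (assuming the spread stays the same). A t-distribution gives slightly wider intervals for small samples; on its own it does not reduce the number of hits needed.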

Offline
Old 07/01/07, 10:09 PM   #4
Xerophyte
King Hippo
 
 
Awnh
Tauren Warrior
 
No WoW Account (EU)
1) How closely do tests match predicted results for your class/spec?
Well, for the most part. With warriors I've only really looked at the TPS numbers and general tank theory (avoidance stacking, combat table, taunt, etc) and not so much our dps, but they match my observations.

2) How do you go about conducting your tests?
Not with as great an amount of rigour as I should. I can't really be arsed to be precise for WoW and so I tend to opt out of doing any detailed analysis and just do basic "is this value/hypothesis/proposed mechanic reasonable?" tests. Which basically lets me determine if a certain hypothesis is flat-out wrong, but not if it's a correct mechanic or just a decent approximation in some/most situations.

For a specific example: I did some testing on Taunt back before anyone was sure how threat worked, which basically consisted of me running around with a Warlock friend in Burning Steppes and taunting mobs off the Warlock after he'd done certain amounts of damage, as measured by DM. I was able to conclusively determine that Taunt had a specific 4 second focusing effect and that it did something with your threat in addition. That debunked a good number of the either/or hypotheses (one effect but not the other) floating around the EU forums at the time, but didn't really say exactly what it did either. My working theory at the time - that Taunt added 2k-3k damage worth of threat - was completely false, and that I thought it reasonable at all was a direct result of me being too lazy to test on mobs with a large enough amount of health to allow for extended testing.

3) Are tests done by multiple people to ensure independently good results?
Mine are, yes. I've got good friends who play the same classes as me and share an interest in knowing the innards of the game, so it's usually painless enough to have my tests independently verified.

4) Do you have any general purpose tricks for testing that make runs go smoothly (I realize this is very open ended, but I know I've picked up a couple, and am wondering if others have some as well).
I'd generally advise everyone to do enough rough theorycrafting to have an idea of what to expect before trying to test any mechanic or hypothesis. Take a grab-bag of boundary conditions and assumptions, work out roughly what region your results should end up with according to current theorycraft and do some quick & dirty tests. If they're in the right ballpark, feel free to do a rigorous analysis and thorough test so you can nail down the specific questions and variables left within the theory that your initial tests said fit. If those initial tests don't match at all, however, then you just managed to catch a fundamental error in theory or in your testing methodology that would've made a mess of things if you'd tried doing detailed testing.

5) How confident are you, if you were to start a new class, that the people doing theory crafting in the Elitist Jerks forum have "figured it all out", at least for the most part, and you could rely on that calculator?
Very confident, really. Most of the serious theorycrafters I've seen for the warrior & druid bits are keen on exposing their data and reasoning to peer review, which tends to keep the larger threads, spreadsheets and simulators well free of major inaccuracies. I don't expect any of it to be 100% accurate and I'm sure there are plenty of smaller errors, but the results are generally sufficient to determine when a change in strategy, talents or gear is significant enough to overcome the benefits of personal preference and when it is not. Which is about all I need.

Sweden Offline
Old 07/02/07, 10:49 AM   #5
Evolve
Von Kaiser
 
 
Worgen Mage
 
Argent Dawn (EU)
I've started theorycrafting paladin threat recently, and I have been quite amazed at how little theorycrafting has been done on it previously (there isn't even a correct formula to be found for SoR damage per swing, and mine still isn't correct), but that's probably because palatank is one of the least played specs of my class :p

Belgium Offline
Old 07/03/07, 12:47 PM   #6
Nightshroud
Glass Joe
 
 
Night Elf Priest
 
Alleria
The thing I find most dissatisfying in my testing is that I don't understand the math for knowing how many random trials are needed to be confident my results are correct to a given precision.


Offline
Old 07/03/07, 1:06 PM   #7
Teez
Piston Honda
 
Draenei Shaman
 
Kel'Thuzad
Originally Posted by Nightshroud
The thing I find most dissatisfying in my testing is that I don't understand the math for knowing how many random trials are needed to be confident my results are correct to a given precision.
I presume you're talking about determining sample sizes here (or, alternatively, how many sets of tests with a given number of samples you need to run). While this doesn't even scratch the surface of the whole topic, the article concerning Sample Size on Wikipedia should give you some insight. You may also want to skim through the definition of Standard Deviation, as this is highly pertinent to the topic and is something that should be documented with any statistical analysis (unfortunately, some of the samples here are somewhat rudimentary, so it's not always supplied). Last but not least, as linked on the Sample Size page I mentioned above, the NIST offers a nice breakdown of how to select sample sizes here.

I hope this helps a bit - don't expect too much out of just those sources though, it becomes a lot deeper if you read on further into the topic (and a lot more interesting too).


Offline
Old 07/04/07, 12:15 PM   #8
Disquette
doop doop de doooo
 
 
Human Rogue
 
Sargeras
Personally I've been trying to do sample sizes of 1000+ of whatever I'm testing if it's a binary data point. So if I'm measuring crit rate on white swings, I want at least 1000 white swings. This stems from the fact that political pollsters seem to agree that 1000 is a fair sample size for saying "how many people like vanilla ice cream". It's a binary result of "yes" or "no". Similarly, crits are either "my white swing crit" or "it didn't".
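
To put a number on how much a 1000-swing crit test actually pins down, here is a rough sketch that turns a crit count into a confidence interval. It swaps in the Wilson score interval, which is one common choice for yes/no data and not something anyone in this thread specified. The function name and the 250-out-of-1000 figures are made up for illustration, and z = 1.96 is assumed for roughly 95% confidence.

```python
import math


def wilson_interval(crits, swings, z=1.96):
    """Wilson score interval for a binary rate (e.g. crit chance).

    crits  -- number of swings that crit
    swings -- total number of swings observed
    z      -- normal quantile; 1.96 is roughly 95% confidence
    """
    if swings <= 0:
        raise ValueError("need at least one swing")
    p = crits / swings
    z2 = z * z
    denom = 1 + z2 / swings
    center = (p + z2 / (2 * swings)) / denom
    half = z * math.sqrt(p * (1 - p) / swings + z2 / (4 * swings * swings)) / denom
    return center - half, center + half


# Hypothetical result: 250 crits in 1000 white swings.
low, high = wilson_interval(250, 1000)
print(f"observed 25.0%, plausible range {low:.1%} to {high:.1%}")
```

With those made-up numbers the range comes out a bit under three percentage points either side of the observed rate, so 1000 swings can tell 25% crit from 30%, but not 25% from 26%.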


United States Offline
Old 07/04/07, 1:16 PM   #9
Erongg
Great Tiger
 
 
Lorentz
Troll Shaman
 
No WoW Account
Well the sample size is a function of how confident you want to be in your estimate. If you want to measure the crit rate to +/- .1%, and you want to be 99% confident in your result, you'll need a larger sample size than if you need it within +/- 1% with 95% confidence. Measuring it to within +/- 10% with 70% confidence would require only a small sample size.

The formulas are all pretty simple, but I just googled and found this website that will do the calculations for you: http://www.surveysystem.com/sscalc.htm

With a sample size of 1000 with an infinite population (and the theoretical number of swings you could poll is indeed infinite, so that works in this case), you can say with 95% confidence that your crit rate is within +/- 3% of X.
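
For anyone who would rather see the arithmetic than use a web calculator, the usual textbook formulas for a yes/no measurement can be sketched as below. This is a sketch under stated assumptions, not a description of what the linked calculator does internally: it assumes an effectively infinite population, a normal approximation, and the worst case p = 0.5 unless you pass in an expected crit rate. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def z_score(confidence):
    """Two-sided normal quantile, e.g. 0.95 -> about 1.96."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def required_swings(margin, confidence=0.95, p=0.5):
    """Swings needed so the rate is within +/- margin at the given confidence.

    n = z^2 * p * (1 - p) / margin^2, rounded up.
    p = 0.5 is the worst case; a known rough crit rate gives a smaller n.
    """
    z = z_score(confidence)
    return math.ceil(z * z * p * (1 - p) / (margin * margin))


def margin_of_error(swings, confidence=0.95, p=0.5):
    """Margin of error (as a fraction) for a given number of swings."""
    z = z_score(confidence)
    return z * math.sqrt(p * (1 - p) / swings)


print(f"1000 swings, 95%: +/- {margin_of_error(1000):.1%}")
print(f"+/- 1% at 95% needs {required_swings(0.01)} swings")
print(f"+/- 0.1% at 99% needs {required_swings(0.001, 0.99)} swings")
```

The margin scales with one over the square root of the sample size, so each extra decimal place of precision costs roughly 100 times as many swings.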


Offline
Old 07/04/07, 1:17 PM   #10
Disquette
doop doop de doooo
 
 
Human Rogue
 
Sargeras
Thanks much for the link!


United States Offline
Old 07/04/07, 6:15 PM   #11
The Iron Colonel
Don Flamenco
 
 
Dwarf Hunter
 
Mug'thol
Although it's somewhat academic, I recommend reading Wikipedia's write-up on Confidence Intervals. Essentially, this is what we (as theory testers) are primarily concerned with.
http://en.wikipedia.org/wiki/Confidence_interval
Specifically, read through
http://en.wikipedia.org/wiki/Confide...in_measurement
Hope this helps (although, as I said, it's somewhat academic and dry, per usual with academic style sources).

Offline
Old 07/05/07, 2:09 PM   #12
CasT
Piston Honda
 
 
Night Elf Druid
 
Outland (EU)
The times I have tried to check numbers, I have only tested whether they are close or far off. I am well aware that a formula is most often far from the truth and therefore does not fit all cases, but as long as it fits my current (and near-future) situation, I feel that it is good enough.

[edit]
This is also the reason, I think, that we find the need to readjust these formulas: they were made during one period of time, or should we say one Tier set of the game. Often it is only when the formula is applied a couple of tiers higher that the numbers differ enough to spark curiosity or the question of why.

Do not matter how much you play, you will never get the carrot.

Offline
 
