It is mid-June and the Oakland Athletics are the best team in baseball. They are also, somehow, one of the most underachieving.
They're scoring 5.09 runs a game (first in the majors) and allowing only 3.12 (also first), which at their current pace works out to a full season of 824 runs scored, 505 runs allowed and an astounding +319 run differential. They are in the top five of just about any team offensive number or metric you'd care to look at except batting average and stolen bases. But somehow, the expected win/loss metrics say they should be better. The Athletics are already sitting on a run differential of +130, but with "only" the second-best record in baseball at 40-26, they're running behind their expected W/L record by about seven wins. If the 2014 Athletics keep on scoring runs, allowing runs and winning games at the same exact rates they are now through the end of the year, they will own one of the biggest positive single-season run differentials in the modern era -- as well as the biggest discrepancy between how many games they actually win and how many games their Pythagorean expectation said they should win. This seems absolutely nonsensical on the face of it: It should not be possible to be one of the best teams in baseball and be one of the biggest underachievers in league history in the same season.
Yet here we are. The Athletics' current .606 winning percentage would leave them with 98 wins at the end of the year -- an outstandingly successful season by any measure, almost certainly good enough to win the AL West and perhaps even the American League outright. But a team that scores 824 runs while only allowing 505 over a 162-game schedule has an expected Pythagorean winning percentage of .710. And a .710 winning percentage over 162 games is a 115-win season. There is only one team in the last 60 years that has won more than 115 games in a single year: the 2001 Seattle Mariners, who went 116-46 on their way to a 4-1 loss to the New York Yankees in the ALCS. And that Mariners team outperformed their projected record by seven wins.
At this point it might be useful to briefly touch on what "Pythagorean winning percentage" actually is, because tossing around the names of old Greek mathematicians has the tendency to make people's minds jump to algebra or geometry, not statistics. Worry not, there is no proof-writing to be found here; the "Pythagorean Theory of Baseball" was one of the early metrics put together by Bill James back in his Baseball Abstract days, and it looks like this:
(Team Runs Scored)^2 divided by [(Team Runs Scored)^2 + (Team Runs Allowed)^2] = expected winning percentage
It's a simple formula, but a strange one, and the only real resemblance to Pythagoras' actual theorem (a^2+b^2=c^2) is that it involves squaring variables. This is probably a point in its favor, given how little Euclidean geometry has to say about how many games of baseball the Oakland Athletics should win this year.
Ignore the exponents, however, and it becomes quite clear what "Pythag" actually is: It's the formula for winning percentage, replacing wins with runs scored and losses with runs allowed, and then pairing those values with an exponential coefficient so that the output more closely resembles a team winning percentage within standard statistical norms. The exponents in James' formula represent not geometric truth but statistical precision -- all in all, a very nifty exercise in branding.
You also may have noticed that plugging 824 and 505 into their respective places in that formula spits out .727 after rounding, not .710. It turns out that choosing your exponent for purposes of name-checking a famous dead guy can lead to some small degree of imprecision; current versions of the formula call for an exponent of 1.81 or 1.83 instead of 2. The precise value of the exponent is calculated from the current league scoring environment, measured in total runs scored. (In fairness to James, he was an amateur statistician in a field he was essentially developing by himself, and in the end his selection of 2 still turned out to be pretty close. James would go on to introduce some more rigorous logarithmic methods of calculating expected winning percentage.)
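For anyone who wants to check the arithmetic, here is a minimal sketch of the formula with an adjustable exponent. The function name `pythagorean_pct` and the constant names are made up for illustration; the run totals are the full-season pace figures quoted above, and 1.83 is simply one of the two modern exponents mentioned, not a value fitted to the 2014 scoring environment.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed.

    exponent=2.0 is the original Bill James version; 1.83 is one of the
    commonly used modern values mentioned in the article.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


SEASON_GAMES = 162
PACE_RUNS_SCORED = 824   # full-season pace quoted in the article
PACE_RUNS_ALLOWED = 505  # full-season pace quoted in the article

for exponent in (2.0, 1.83):
    pct = pythagorean_pct(PACE_RUNS_SCORED, PACE_RUNS_ALLOWED, exponent)
    wins = round(pct * SEASON_GAMES)
    print(f"exponent {exponent}: {pct:.3f} expected pct, about {wins} wins")
```

With the 1.83 exponent this lands on the .710 and 115-win figures used earlier; the gap between the two exponents is worth a few wins over a full season.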
All of this is not to say that the underlying theory behind Pythagorean winning percentage is wrong: It is eminently just and proper to believe that, over any reasonably significant period of time, a team that scores more runs than it allows will win more games than a team for which the reverse is true. But as much as that resonates, Major League Baseball (as currently constructed) doesn't particularly care how many runs you score. It cares about how many times your team scored more runs than it allowed over 162 discrete contests, with no extra credit or demerits given for blowouts.
The variation in how runs are distributed by game can play havoc with how well any expected Win-Loss estimator works -- the 2012 Baltimore Orioles will remain a poster child for that particular truism for years to come. As you might expect, with the season roughly 40 percent gone, Oakland leads the league with 14 games in which they've scored nine or more runs, with only the Colorado Rockies (with 12) anywhere close. The A's, who are 14-0 in those contests, have scored 149 runs and allowed 36 in them. There are a number of routs in the 11-1 range in there, which means that nearly half of the runs that Oakland has plated in 2014 came in only 21 percent of the games they played. Compare that to the only team in the league with a better record than Oakland, the San Francisco Giants: The Giants have only five games with nine or more runs, all wins, for a total of 51 runs scored; but they also have 36 runs allowed, the same as the Athletics in roughly one-third the games. Pythagorean win expectancy favors curbstomps such as Oakland's, with the expectation that they will balance out across a large enough sample. The Giants, despite being 38-9 this year when they score three or more runs, do not routinely run their opposition out of the stadium, which is why they're actually beating their Pythag by three wins at present. (The Athletics are no slouches themselves at 39-12 in that situation.)
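To put a rough number on how much the blowouts tilt the estimate, here is a back-of-the-envelope sketch that runs the formula once on Oakland's pooled totals and once on the nine-plus-run games and everything else separately. The season totals of 336 runs scored and 206 allowed are assumptions, reconstructed from 5.09 and 3.12 runs per game over 66 games; the 1.83 exponent and all of the names in the code are likewise illustrative choices, not anything official.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=1.83):
    """Expected winning percentage; 1.83 is an assumed modern exponent."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# ASSUMPTION: totals through 66 games, reconstructed from per-game averages.
TOTAL_GAMES, TOTAL_RS, TOTAL_RA = 66, 336, 206
# Quoted in the article: games in which Oakland scored nine or more runs.
BLOWOUT_GAMES, BLOWOUT_RS, BLOWOUT_RA = 14, 149, 36

# Everything that is left once the blowouts are set aside.
rest_games = TOTAL_GAMES - BLOWOUT_GAMES
rest_rs = TOTAL_RS - BLOWOUT_RS
rest_ra = TOTAL_RA - BLOWOUT_RA

# Expected wins when all runs are pooled into one estimate.
pooled_wins = pythagorean_pct(TOTAL_RS, TOTAL_RA) * TOTAL_GAMES

# Expected wins when each group of games gets its own estimate.
split_wins = (
    pythagorean_pct(BLOWOUT_RS, BLOWOUT_RA) * BLOWOUT_GAMES
    + pythagorean_pct(rest_rs, rest_ra) * rest_games
)

print(f"share of runs scored in blowouts: {BLOWOUT_RS / TOTAL_RS:.0%}")
print(f"pooled expected wins: {pooled_wins:.1f}")
print(f"split expected wins:  {split_wins:.1f}")
```

Under these assumed totals the pooled estimate comes out several wins higher than the split one, which sits much closer to the 40 games Oakland has actually won. Splitting the season this way is not a standard estimator, just a quick way to see that runs piled onto an 11-1 rout buy fewer wins than the pooled formula gives them credit for.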
So how will this play out through the end of the season for Oakland? I personally still won't believe that Scott Kazmir's going to throw 200 innings of 2.20 ERA baseball until I see it in the books, and even then I'll probably still have my doubts. But outside of Kazmir and the overachieving catcher platoon of Norris/Jaso, there's not a whole lot of weirdness on the team. Josh Donaldson is an early MVP candidate for the second straight season, and Brandon Moss has quietly been a top-five first baseman for about two years now, while Yoenis Cespedes, Coco Crisp and Jed Lowrie are all very nice complements to the Moss/Donaldson whirlwind doom squad. Sonny Gray is a special young pitcher who is capable of not only sustaining what's been a strong sophomore season so far, but of building on it as well. The bullpen has been its usual great self outside of the swiftly-remedied Jim Johnson disaster. (It is worth noting that bullpen performance also sabotages Pythag outcomes -- the Athletics would likely have a couple more wins right now if not for Johnson's early-season meltdowns.)
The biggest concern the A's have right now is what to do if Scott Kazmir and Jesse Chavez both come back to Earth, at which point the rotation once again becomes an issue for the team. It's not inconceivable to think that the A's might address that with some sort of deadline deal for a rental, which would firm things up considerably.
All in all, however, the A's on paper look a lot more like "just" the first-place team they are right now than the transcendently, generationally-dominant club their "Pythag" predicts they should be. And there's good reason to believe that, in this instance, the projections will come down to meet reality as the season goes on, instead of the other way around.