Tuesday, October 7, 2008

Can 1 run games and blowout wins and losses explain why a team's Pythagorean record is off from their official record?

Question: Can one-run games and blowout wins and losses explain why a team's Pythagorean record is different from their official record?

Note: The formula for the expected Pythagorean winning percentage is:

Win% = (Runs Scored^2) / (Runs Scored^2 + Runs Allowed^2)
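As a quick check of the formula, here is a minimal Python sketch. The function name pythagorean_record and the rounding to the nearest whole win are assumptions made for illustration; the inputs are the Diamondbacks' run totals from 2007 (712 scored, 732 allowed) and 2005 (696 scored, 856 allowed).

```python
def pythagorean_record(runs_scored, runs_allowed, games=162):
    """Return the (wins, losses) expected from the exponent-2 Pythagorean formula."""
    rs2 = runs_scored ** 2
    ra2 = runs_allowed ** 2
    win_pct = rs2 / (rs2 + ra2)
    # Assumption: expected wins are rounded to the nearest whole game.
    wins = round(win_pct * games)
    return wins, games - wins


print(pythagorean_record(712, 732))  # 2007 Diamondbacks -> (79, 83)
print(pythagorean_record(696, 856))  # 2005 Diamondbacks -> (64, 98)
```

Both results agree with the Pythagorean records quoted in this post.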

Why I asked the question: Last season there was much discussion about how the Diamondbacks were overachieving by winning more games than their Pythagorean record said they would.

In the 2007 season the Diamondbacks went 90-72, but according to the Pythagorean formula (712 runs scored and 732 runs allowed), they should have been 79-83, a difference of 11 games. I read many an article on this, and what I found even more interesting was that two years before, this same team did the same thing when they went 77-85 with 696 runs scored and 856 runs allowed. Those run totals give a Pythagorean record of 64-98. In the more than 100-year history of baseball, the Bob Melvin-managed Diamondbacks hold the second (2007) and seventh (2005) best percentage of games won compared to the number of games they were expected to win. Most teams average a difference of

Searching the Internet, I found quite a few articles on the subject. Many of the authors were not able to put their finger on exactly what was going on, but a couple of people did make pretty compelling arguments. First, Dan Rosenheck of the New York Times (http://www.nytimes.com/2007/09/23/sports/baseball/23score.html?_r=4&ref=sports&oref=slogin&oref=slogin&oref=slogin&oref=slogin) showed that Arizona’s relievers either had close duty or mop-up duty. Chris Jaffe of the Hardball Times (http://www.hardballtimes.com/main/article/no-mirage-in-arizona/) stated the same thing. A common observation from several articles was that Arizona was really good in one-run games and really bad in blowouts (games decided by 5 runs or more), as seen in Table 1. I decided to isolate those two factors and see if they really did matter in the difference between the records.

Analysis:

From 2002 to 2007 there were 180 team seasons of baseball played, and during that time 18 teams (10% of the teams) over- or underachieved by 7 games or more.

Table 1

| Team | Year | Actual Record | RS | RA | Pythag. Record | Difference (actual – pythag.) | Record in 1 run games | Record in Blowouts |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Cleveland | 2006 | 78-84 | 870 | 782 | 90-72 | -12 | 18-26 | 33-20 |
| Toronto | 2005 | 80-82 | 775 | 705 | 89-73 | -9 | 16-31 | 25-14 |
| Boston | 2002 | 93-69 | 859 | 665 | 101-61 | -8 | 13-23 | 34-19 |
| Chicago Cubs | 2002 | 67-95 | 706 | 759 | 75-87 | -8 | 18-36 | 16-18 |
| Houston | 2003 | 87-75 | 805 | 677 | 95-67 | -8 | 19-21 | 28-14 |
| Detroit | 2004 | 72-90 | 827 | 844 | 79-83 | -7 | 12-27 | 23-22 |
| Boston | 2007 | 96-66 | 867 | 657 | 103-59 | -7 | 22-28 | 36-17 |
| NY Mets | 2005 | 83-79 | 722 | 648 | 90-72 | -7 | 21-24 | 25-17 |
| Chicago Sox | 2005 | 99-63 | 741 | 645 | 92-70 | +7 | 35-19 | 21-16 |
| Minnesota | 2002 | 94-67 | 768 | 712 | 87-75 | +7 | 29-16 | 23-20 |
| Oakland | 2006 | 93-69 | 771 | 727 | 86-76 | +7 | 32-22 | 21-22 |
| Cincinnati | 2003 | 69-93 | 694 | 885 | 62-100 | +7 | 30-21 | 9-29 |
| St. Louis | 2007 | 78-84 | 725 | 829 | 70-92 | +8 | 16-20 | 25-38 |
| Seattle | 2007 | 88-74 | 794 | 813 | 79-83 | +9 | 28-20 | 24-29 |
| Cincinnati | 2004 | 76-86 | 750 | 907 | 66-96 | +10 | 25-20 | 11-35 |
| Arizona | 2007 | 90-72 | 712 | 732 | 79-83 | +11 | 32-20 | 20-26 |
| NY Yankees | 2004 | 101-61 | 897 | 808 | 89-73 | +12 | 24-16 | 27-28 |
| Arizona | 2005 | 77-85 | 696 | 856 | 64-98 | +13 | 28-18 | 18-35 |

Given this list of teams, I needed to estimate what their W/L records should have been in one-run games and in blowouts, in order to see whether those games could help explain the difference.

1 run games - I adjusted these W/L records using the average of the actual and Pythagorean winning percentages. I figured each team's ideal winning percentage was somewhere between the actual and Pythagorean values, and the average gave me that value. For example, the 2007 Diamondbacks went 32-20 in one-run games, so here is the formula for their estimated winning percentage:

Estimated Winning % = ((90/162)+(79/162))/2 = 52.16%

52.16% of 52 is 27, so their estimated record would be 27-25.

So over the 2007 season the Diamondbacks won 5 more one-run games than they should have won. This change in wins was subtracted from or added to the team's W/L total.
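The one-run adjustment can be sketched in a few lines of Python. The function name adjust_one_run and the rounding of estimated wins to the nearest whole game are assumptions for illustration; the inputs are the 2007 Diamondbacks numbers from Table 1.

```python
def adjust_one_run(actual_wins, pythag_wins, one_run_wins, one_run_losses, games=162):
    """Estimate a 'normal' one-run record and the resulting adjusted win total."""
    # Estimated winning percentage: average of the actual and Pythagorean percentages.
    est_pct = (actual_wins / games + pythag_wins / games) / 2
    one_run_games = one_run_wins + one_run_losses
    # Assumption: estimated one-run wins are rounded to the nearest whole game.
    est_wins = round(est_pct * one_run_games)
    surplus = one_run_wins - est_wins      # positive = won more than expected
    adjusted_wins = actual_wins - surplus  # remove the surplus from the season total
    return est_pct, est_wins, surplus, adjusted_wins


est_pct, est_wins, surplus, adjusted_wins = adjust_one_run(90, 79, 32, 20)
print(f"{est_pct:.2%}")                        # 52.16%
print(est_wins, 52 - est_wins)                 # 27 25
print(surplus)                                 # 5
print(adjusted_wins, 162 - adjusted_wins)      # 85 77
```

The adjusted 85-77 record matches the Arizona 2007 row in Table 2.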

Blowouts - The same estimated winning percentage was also needed to estimate how many blowouts the team should have won and lost. The actual W/L total was not adjusted; instead, the runs allowed (for teams being blown out too often) or runs scored (for teams administering too many blowouts) were adjusted. Averaged over all the blowout games, the run difference was 7.5 runs. Since a 4-run game was not considered a blowout, 3.5 runs per game (7.5 - 4) would be added to or subtracted from Runs Scored or Runs Allowed.

For example, using the 2007 Diamondbacks again, they should have had a record in blowouts of 24-22 instead of 20-26. They were probably blown out 4 too many times, so their Runs Allowed was decreased by 14 runs (4 games * 3.5 runs/game).

The following shows what the teams' actual and Pythagorean records would be after being adjusted for a more normalized record in 1 run games and blowouts.
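Here is a minimal Python sketch of the blowout adjustment for the 2007 Diamondbacks. The function name adjust_blowouts, the constant name, and the rounding to whole games are assumptions for illustration; the 3.5 runs per game and the records come from the text and Table 1.

```python
RUNS_PER_EXCESS_BLOWOUT = 3.5  # 7.5-run average blowout margin minus the 4-run threshold


def adjust_blowouts(est_pct, blowout_wins, blowout_losses):
    """Return (expected wins, expected losses, excess losses, run adjustment)."""
    games = blowout_wins + blowout_losses
    # Assumption: expected blowout wins are rounded to the nearest whole game.
    est_wins = round(est_pct * games)
    est_losses = games - est_wins
    excess_losses = blowout_losses - est_losses  # positive = blown out too often
    return est_wins, est_losses, excess_losses, excess_losses * RUNS_PER_EXCESS_BLOWOUT


# Estimated winning percentage: average of actual (90/162) and Pythagorean (79/162).
est_pct = (90 / 162 + 79 / 162) / 2
est_wins, est_losses, excess, runs = adjust_blowouts(est_pct, 20, 26)
print(est_wins, est_losses)  # 24 22
print(excess, runs)          # 4 14.0

# Lower Runs Allowed by that amount and recompute the Pythagorean record.
rs, ra = 712, 732 - runs
adj_wins = round(162 * rs**2 / (rs**2 + ra**2))
print(adj_wins, 162 - adj_wins)  # 80 82
```

With Runs Allowed lowered from 732 to 718, the Pythagorean record moves from 79-83 to 80-82.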

Table 2

| Team | Year | Actual Record (adjusted) | RS (adjusted) | RA (adjusted) | Pythag. Record (adjusted) | Difference (actual – pythag.) (adjusted) |
| --- | --- | --- | --- | --- | --- | --- |
| Cleveland | 2006 | 83-79 | 850 | 782 | 88-74 | -5 |
| Toronto | 2005 | 88-74 | 759 | 705 | 87-75 | 1 |
| Boston | 2002 | 101-61 | 850 | 665 | 100-62 | 1 |
| Chicago Cubs | 2002 | 72-90 | 702 | 759 | 75-87 | -3 |
| Houston | 2003 | 90-72 | 789 | 677 | 93-69 | -3 |
| Detroit | 2004 | 78-84 | 819 | 844 | 79-83 | -1 |
| Boston | 2007 | 105-57 | 855 | 657 | 102-60 | 3 |
| NY Mets | 2005 | 86-76 | 713 | 648 | 89-73 | -3 |
| Chicago Sox | 2005 | 96-66 | 744 | 645 | 92-70 | 4 |
| Minnesota | 2002 | 91-71 | 772 | 712 | 87-75 | 4 |
| Oakland | 2006 | 91-71 | 781 | 727 | 87-75 | 4 |
| Cincinnati | 2003 | 59-103 | 716 | 885 | 64-98 | -5 |
| St. Louis | 2007 | 78-84 | 738 | 829 | 72-90 | 6 |
| Seattle | 2007 | 85-77 | 806 | 813 | 80-82 | 5 |
| Cincinnati | 2004 | 71-91 | 782 | 907 | 69-93 | 2 |
| Arizona | 2007 | 85-77 | 782 | 907 | 80-82 | 5 |
| NY Yankees | 2004 | 100-62 | 913 | 808 | 91-71 | 9 |
| Arizona | 2005 | 69-93 | 714 | 856 | 66-96 | 3 |


Initially the actual and Pythagorean records of these 18 teams differed by an average of 8.72 wins. After being adjusted they differed by only 3.72 wins (the average difference for all 180 team seasons was 3.29 wins). There was only one case where a team maintained a difference of 7 games or more: the 2004 Yankees only changed from winning 12 more games than they were supposed to, to winning 9 more. With all this information, one-run games and blowouts can, in most cases, explain why a team's actual and Pythagorean records don't match.
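The two averages can be reproduced from the Difference columns of Table 1 and Table 2. This is a small sketch that assumes the averages were taken over the absolute values of the differences; the helper name mean_abs is made up for illustration.

```python
# Actual-minus-Pythagorean win differences, copied from Table 1 (before adjustment).
before = [-12, -9, -8, -8, -8, -7, -7, -7, 7, 7, 7, 7, 8, 9, 10, 11, 12, 13]
# The same 18 teams after adjustment, copied from Table 2.
after = [-5, 1, 1, -3, -3, -1, 3, -3, 4, 4, 4, -5, 6, 5, 2, 5, 9, 3]


def mean_abs(values):
    """Average size of the differences, ignoring their sign."""
    return sum(abs(v) for v in values) / len(values)


print(round(mean_abs(before), 2))  # 8.72
print(round(mean_abs(after), 2))   # 3.72

# Teams still off by 7 or more games after the adjustment.
still_off = [d for d in after if abs(d) >= 7]
print(still_off)  # [9]
```

The single remaining value of 9 is the 2004 Yankees row.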

I read about several reasons why some teams might be better than others in 1-run games and blowouts. Besides the articles by Dan Rosenheck and Chris Jaffe, not many gave good explanations, but here are some of the theories (and the reasons I do or don't believe them).

Good in one run games.

  • Good bullpens allowing the team to win close games. The bullpens of the overachieving teams had an ERA of 4.11, while the ERA of the underachievers was 4.18. This idea didn't hold much water.

  • Clutch hitting. People have been trying to measure clutch hitting for years, and if you Google “clutch hitting statistics”, 768,000 articles will be available for reading, but I am not going to begin to tackle the subject.

Bad in one run games

  • Bad bullpens. I actually looked to see if blown saves and the record in 1-run games were correlated. They weren't significantly correlated (teams that overachieved averaged 18.8 blown saves, while the underachievers averaged 20.0). There were some teams for which this could definitely be the case, though, such as Detroit in 2004, which blew 28 saves.

More blowout wins

  • Really good offensive team. These teams would jump out to an early lead, and the other team would throw out its dreg pitchers for the high-powered offense to tee off on. There is some truth to this, in that the underachievers scored about 50 more runs per season than the overachievers (804 runs versus 754 runs per season).

More blowout losses

  • Bi-polar starting and/or relief staffs (pitching staffs whose pitchers are either really good or really bad, with no middle-of-the-road pitchers). The bullpen aspect was looked at in the two previous articles by Rosenheck and Jaffe. This could also be the case with a team's starting rotation. A team could have 3 aces while the other starters are horrible, thereby increasing the number of blowouts.

I might look into expanding this topic in the future; for now my question has been answered.

Other articles on difference in Pythagorean record and actual record:

Pondering Pythagoras

by David Gassko

http://www.hardballtimes.com/main/article/pondering-pythagoras/

Managers and the Pythagorean Theorem

By Pizza Cutter

http://mvn.com/mlb-stats/2007/12/15/managers-and-the-pythagorean-theorem/

Pythagoras solved?: An R-squared of 97.8 percent

By Pizza Cutter

http://mvn.com/mlb-stats/2007/11/05/pythagoras-solved-an-r-squared-of-978-percent/

Update for 2008 season: Again 10% of the teams (3) had a difference of more than 7 games between their actual and Pythagorean records. Here are those teams, followed by their records when adjusting for 1 run games and blowouts:


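| Team | Actual Record | RS | RA | Pythag. Record | Difference (actual – pythag.) | Record in 1 run games | Record in Blowouts |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Toronto | 86-76 | 714 | 610 | 94-68 | -8 | 24-32 | 24-10 |
| Houston | 86-75 | 765 | 697 | 77-84 | 9 | 21-21 | 18-24 |
| LA Angels | 100-62 | 765 | 697 | 92-70 | 11 | 31-21 | 20-20 |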


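| Team | Wins | Losses | RS (adjusted) | RA (adjusted) | Wins (Pythag adjusted) | Losses (Pythag adjusted) | Difference (actual – adjusted pythag) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Toronto | 86 | 76 | 696 | 610 | 92 | 70 | 1 |
| Houston | 86 | 75 | 723 | 743 | 78 | 83 | 7 |
| LA Angels | 100 | 62 | 808 | 697 | 93 | 69 | 7 |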

