Why I asked the question: In 2001, Jeff Suppan went all season without giving up more than five earned runs in any of his starts, and he shut out the competition once while posting an ERA of 4.37. I always thought he was a really consistent pitcher, especially that season, but I had no way to measure consistency.
Analysis:
Approach
To determine variance between starts, I measured how much a pitcher's earned runs varied from one start to the next. Because starts differ in length, the earned runs for each game were extrapolated to a nine-inning game. The following formula was used to get the per game ERA:
Per Game ERA = Earned Runs / (Innings Pitched / 9)
Here is an example using the first six starts of Jeff Suppan’s 2001 season:
2001 Suppan
Date  Opponent  IP  ER  Runs  Per Game ERA
04-02-2001  at Yankees  5  5  5  9.000
04-07-2001  vs Twins  6.33  3  3  4.265
04-13-2001  at Blue Jays  6  2  2  3.000
04-18-2001  at Twins  6.67  1  1  1.349
04-25-2001  at Devil Rays  6  5  5  7.500
04-30-2001  at Blue Jays  7.33  3  3  3.683
Season ERA  4.37
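The per game ERA formula can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet actually used for the study; the function name `per_game_era` is made up here, and innings are entered as decimals (6.33 for 6 1/3) the way the table shows them.

```python
def per_game_era(earned_runs, innings_pitched):
    """Scale one start's earned runs to a nine-inning game."""
    return earned_runs / (innings_pitched / 9)


# The six Suppan starts from the table above, as (IP, ER) pairs
suppan_starts = [(5, 5), (6.33, 3), (6, 2), (6.67, 1), (6, 5), (7.33, 3)]

per_game_eras = [round(per_game_era(er, ip), 3) for ip, er in suppan_starts]
print(per_game_eras)  # [9.0, 4.265, 3.0, 1.349, 7.5, 3.683]
```

The printed values line up with the Per Game ERA column of the table.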
The Standard Deviation of the Per Game ERAs was used. (Note: Standard Deviation measures how spread out a set of values is around the mean of that set. For roughly normally distributed data, about 68% of the values fall within one Standard Deviation of the mean. The following website gives a good summary: http://www.techbookreport.com/tutorials/stddev30secs.html.) In this study, it describes how much a pitcher's per game ERAs vary from one another.
ASSUMPTIONS
The pitcher was given credit for 1/3 of an inning pitched when he had an “infinity ERA,” which happens when he is removed from the game before making one out.
Only innings and runs from games started counted toward the per game ERA. In some cases a pitcher had a lot of innings in a relief role, thereby causing him to miss his next start. I decided not to count these innings as starts, but that could be part of another analysis.
The per game ERA was limited to 27. One bad outing can really skew the values of a fairly decent season. I wanted to find inconsistency, but some games skewed the data too much. For example, Luke Hudson pitched a game on August 13, 2006 where he gave up 10 earned runs in 1/3 of an inning, leading to a per game ERA of 272.72 (10 / (0.33 / 9)). This game was a loss, but this value really skews his data. See below:

Luke Hudson  Standard Deviation
Includes Game  69.45
Excludes Game  2.53
Including this game doesn’t make sense, because one Standard Deviation (about 68% of his starts) would then imply a range of +/- 69 runs per game. He allowed nine runs or fewer per start for the rest of the season. I decided on a per game limit of 27 (an ERA equal to one run allowed per out recorded). Twenty-seven runs was high enough to punish the pitcher’s standard deviation for a bad outing, but not so high that it ruined the rest of the season’s results. In the case of Luke Hudson, his values became:

Limit  Standard Deviation
27 runs per game maximum  6.44
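The two rules above (crediting 1/3 of an inning when no out is recorded, and capping the per game ERA at 27) can be sketched as follows. The function names are hypothetical, and using the sample standard deviation from Python's `statistics` module is an assumption about how the spread was calculated. Hudson's full game log is not reproduced here, so the sketch only checks the capped value for his August 13 start.

```python
import statistics

ERA_CAP = 27.0        # one run allowed per out recorded
MIN_INNINGS = 1 / 3   # credit given when the pitcher records no outs


def capped_per_game_era(earned_runs, innings_pitched, cap=ERA_CAP):
    """Per game ERA with the "infinity ERA" and 27-run rules applied."""
    innings = innings_pitched if innings_pitched > 0 else MIN_INNINGS
    return min(earned_runs / (innings / 9), cap)


def spread_of_starts(starts, cap=ERA_CAP):
    """Sample standard deviation of the capped per game ERAs.

    `starts` is a list of (earned_runs, innings_pitched) pairs.
    """
    return statistics.stdev([capped_per_game_era(er, ip, cap) for er, ip in starts])


# Luke Hudson, August 13, 2006: 10 earned runs in 1/3 of an inning
print(capped_per_game_era(10, 1 / 3))   # 27.0 after the cap
print(capped_per_game_era(10, 0))       # no outs recorded: also capped at 27.0
```

Passing a larger `cap` shows how much a single outing like this one would otherwise dominate the standard deviation.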
Even though the impetus for this analysis was Jeff Suppan’s 2001 season, I chose 2007 for the analysis year for two reasons:
My memory is better about last year’s pitching than pitching that occurred seven years ago.
I want to start keeping a yearly total, beginning with the current year and working backwards. After each season ends, I can compute the new values.
Pitchers with 25 or more starts were used. A common rule of thumb calls for at least 30 observations (here, games) before statistics are considered useful. Twenty-five was close to 30, and it allowed quite a few more pitchers to be examined.
RESULTS
Given the above assumptions, I calculated the Per Start Variation (or PSV) for pitchers from the 2007 season that made 25 or more starts. There were 97 pitchers that made the cut and here are the results:
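As a rough illustration of the ranking step, the sketch below filters pitchers by number of starts, computes each PSV as the sample standard deviation of the per game ERAs, and sorts from most to least consistent. The pitcher names and numbers are made up for the example, and `rank_by_psv` is a hypothetical helper, not the spreadsheet used for the study.

```python
import statistics


def rank_by_psv(per_game_eras_by_pitcher, min_starts=25):
    """Return (pitcher, PSV) pairs sorted from lowest to highest PSV."""
    psvs = {
        name: statistics.stdev(eras)
        for name, eras in per_game_eras_by_pitcher.items()
        if len(eras) >= min_starts
    }
    return sorted(psvs.items(), key=lambda item: item[1])


# Made-up per game ERAs; min_starts is lowered so the toy data qualifies.
toy_data = {
    "Pitcher A": [3.0, 4.0, 3.5, 4.5],
    "Pitcher B": [0.0, 9.0, 1.5, 12.0],
    "Pitcher C": [2.0, 2.5],  # too few starts, dropped from the ranking
}
ranking = rank_by_psv(toy_data, min_starts=4)
median_psv = statistics.median([psv for _, psv in ranking])

for rank, (name, psv) in enumerate(ranking, start=1):
    print(rank, name, round(psv, 2))
```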
Top five and bottom five pitchers by PSV for 2007
Rank  Player  Team  PSV 
1  Dan Haren  OAK  2.29 
2  Johan Santana  MIN  2.43 
3  Javier Vazquez  CHW  2.49 
4  Josh Beckett  BOS  2.52 
5  Scott Kazmir  TAM  2.68 
93  Edwin Jackson  TAM  7.30 
94  Ervin Santana  LAA  7.38 
95  Nate Robertson  DET  7.60 
96  Jeff Weaver  SEA  7.93 
97  Tom Glavine  NYM  8.43 
Median Value = 4.92
Pitchers with lower ERAs generally have a lower PSV
Not much of a surprise here: pitchers who give up fewer runs are more consistent. This doesn't necessarily have to be the case, though, because two pitchers could have the same PSV (1.15) resulting from per game ERAs of 2, 2, 4, 4 and per game ERAs of 4, 4, 6, 6, leading to total ERAs of 3 and 5, respectively (assuming equal innings in each start).
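The 1.15 figure can be checked quickly. It matches the sample standard deviation (dividing by n - 1), which is what Python's `statistics.stdev` computes; the population version (`statistics.pstdev`) gives 1.0 instead. Using the plain average of the four per game ERAs as the season ERA only works if every start lasted the same number of innings.

```python
import statistics

pitcher_a = [2, 2, 4, 4]
pitcher_b = [4, 4, 6, 6]

for eras in (pitcher_a, pitcher_b):
    print(
        eras,
        "mean:", statistics.mean(eras),
        "sample SD:", round(statistics.stdev(eras), 2),
        "population SD:", statistics.pstdev(eras),
    )
```

Both lists show the same spread even though their averages are two runs apart.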
PSV vs ERA:
I calculated the linear trend line and R-squared value for each graphed relationship. The linear trend shows that ERA and Standard Deviation are somewhat directly correlated. The R-squared value shows how strongly one parameter is correlated with another (here is a website that further explains R-squared: http://www.dummies.com/WileyCDA/DummiesArticle/UnderstandingCorrelationinExcelSalesForecasting.id3125.html ). A value of 1 means that the parameters are directly correlated, and a value of 0 means the parameters are not correlated at all. As a general rule:
R-squared = 0.0 to 0.3  Parameters are not correlated
R-squared = 0.3 to 0.7  Parameters are somewhat correlated
R-squared = 0.7 to 1.0  Parameters are very correlated
In this case, the values are somewhat correlated, meaning that pitchers with lower ERAs show some level of consistency.
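For readers who want to reproduce the R-squared step outside a spreadsheet, here is a minimal sketch. For a straight-line fit with one predictor, R-squared equals the squared Pearson correlation, which is what `r_squared` computes. The function names are hypothetical, and sending the boundary values 0.3 and 0.7 to the higher band is an assumption, since the rule of thumb above lists them in both bands.

```python
def r_squared(xs, ys):
    """R-squared of a straight-line fit of ys on xs (squared Pearson r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)


def describe(r2):
    """Apply the rule of thumb; 0.3 and 0.7 go to the higher band (assumption)."""
    if r2 < 0.3:
        return "not correlated"
    if r2 < 0.7:
        return "somewhat correlated"
    return "very correlated"


# Toy data lying exactly on a line, so the fit is perfect
print(r_squared([1, 2, 3], [2, 4, 6]), describe(1.0))
```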
Quality Starts and PSV are not related.
A Quality Start is when a pitcher completes at least six innings and permits no more than three earned runs. I used Quality Start percent (QS%) because this would show how often a pitcher had a decent start.
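The Quality Start definition translates directly into code. The sketch below uses hypothetical function names and applies the rule to the six Suppan starts from the table earlier in the post.

```python
def is_quality_start(innings_pitched, earned_runs):
    """At least six innings and no more than three earned runs."""
    return innings_pitched >= 6 and earned_runs <= 3


def quality_start_pct(starts):
    """Share of starts that were Quality Starts; starts are (IP, ER) pairs."""
    quality_starts = sum(is_quality_start(ip, er) for ip, er in starts)
    return quality_starts / len(starts)


suppan_starts = [(5, 5), (6.33, 3), (6, 2), (6.67, 1), (6, 5), (7.33, 3)]
print(round(quality_start_pct(suppan_starts), 3))  # 4 of 6 starts qualify
```

Because the test is pass/fail, a start of 6 innings and 3 earned runs counts the same as a shutout, and a start of 9 innings and 4 earned runs counts the same as a blowup.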
Here is the graph of the comparison:
The data are not significantly related. For example, 68% of Tom Glavine’s starts were Quality Starts (#13 in QS, #90 in PSV), but he definitely ranks as the worst when it comes to PSV. QS% has been used as a measure of consistency, but its all-or-nothing requirement (a start either is a QS or is not) doesn't account for how far a pitcher strayed from his norm.
A lower PSV is somewhat correlated to the pitcher's winning percentage, especially if the offensive production of the pitcher's team is taken into account.
The whole concept of a consistent pitcher is that each night he pitches, he will keep his team in the game by limiting the other team’s runs. So if the pitcher continues to limit the opponent’s runs to a consistent number, the pitcher's team knows how many runs it needs to win the game. The more runs a team scores, the better chance the pitcher has to win. I expected to see consistent pitchers with good run support win more than pitchers with bad run support. I was also expecting pitchers with high PSVs to win half of their games no matter what because they would be throwing either lights out or batting practice. This was not the case.
There was not enough data to really compare subsets of the data (i.e. pitcher's winning percent with low PSV and high run support), so only comparisons of the whole group were published. I did run some comparisons on smaller sets, but nothing stood out.
For 2007, an average pitcher's winning percentage changed by +/- 1% for each 14.5 runs his team scored above or below that year's league average. Incorporating this change into the winning percentage, I plotted winning percentage vs PSV and here is the result:
As you can see, PSV and winning percentage are somewhat correlated (R-squared = 0.300), and the following equation can be used to predict a pitcher’s winning percentage:
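Winning Percentage = (-0.049 * PSV) + (0.768 + ((Run Difference from league average) / 1450))
Here the Run Difference is the number of runs the pitcher's team scored above (positive) or below (negative) the league average, so 14.5 runs moves the estimate by 1%.
Using that equation, the standard deviation of the difference between predicted and actual winning percentage was 14%, which gives a rough estimate of a pitcher's winning percentage based on his PSV and how many runs his team scores.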
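The prediction equation can be written as a small function. Two things here are assumptions rather than something the post spells out: the PSV coefficient is taken as negative (a higher PSV lowers the estimate, which agrees with the trend described for the graph), and the run difference is taken as team runs minus league average, so extra run support raises the estimate. The function name is hypothetical.

```python
def predicted_win_pct(psv, run_difference):
    """Estimated winning percentage, as a fraction between 0 and 1.

    psv: the pitcher's Per Start Variation
    run_difference: team runs scored minus league average (assumed sign)
    """
    return -0.049 * psv + 0.768 + run_difference / 1450


# A pitcher at the 2007 median PSV with league-average run support
print(round(predicted_win_pct(4.92, 0), 3))    # 0.527

# The same pitcher with 145 runs of extra support gains about 10 points
print(round(predicted_win_pct(4.92, 145), 3))  # 0.627
```

With a standard deviation of 14% between predicted and actual values, these numbers are rough guides at best.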
I also grouped the pitchers into PSV bins 0.5 wide (2.00 to 2.49, marked as 2 in the graph; 2.50 to 2.99; and so on, up to values > 7) and calculated the average winning percentage of each bin in order to take some of the clutter out of the graph.
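The grouping step might look like the sketch below. The PSVs are taken from the top and bottom of the ranking table, but the winning percentages paired with them are made up purely for illustration, and putting a PSV of exactly 7.00 into the last group is an assumption. The helper names are hypothetical.

```python
import math
from collections import defaultdict


def psv_bin(psv):
    """Lower edge of the 0.5-wide PSV group; 7 and above are lumped together."""
    return min(math.floor(psv * 2) / 2, 7.0)


def average_win_pct_by_bin(pitchers):
    """pitchers: iterable of (psv, winning_pct) pairs."""
    groups = defaultdict(list)
    for psv, win_pct in pitchers:
        groups[psv_bin(psv)].append(win_pct)
    return {edge: sum(vals) / len(vals) for edge, vals in sorted(groups.items())}


# Real PSVs from the ranking table, paired with made-up winning percentages
toy = [(2.29, 0.60), (2.43, 0.64), (2.68, 0.58), (7.30, 0.40), (8.43, 0.44)]
binned = average_win_pct_by_bin(toy)
for edge, avg in binned.items():
    print(edge, round(avg, 3))
```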
This graph shows that the winning percentage really begins to decrease with a PSV > 5.5.
PSV seems to barely correlate to winning percentage, except at larger values. More data is needed to help clarify if and by how much PSV can help determine the pitcher's winning percentage.
A team is not affected in the W/L column by pitchers who have a large per game ERA (> 27)
Another claim about consistent pitchers is that their team knows what it will get from them in terms of innings pitched. Since they won't blow up as often, the bullpen won't be stressed as much. I wanted to determine if a team's bullpen would be stressed by a bad start, but with the small sample size I used (all the games where the pitcher's single game ERA was greater than 27) there seems to be no correlation.
With the small sample size of 29 instances, teams actually performed better after the pitching blowup than before.
Finally, how does Jeff Suppan’s 2001 PSV stack up against the PSVs of the 2007 pitchers?
Jeff Suppan’s numbers during 2001 were:
Name  ERA  PSV 
Suppan  4.37  2.56 
Comparing Suppan’s 2001 numbers to the 97 pitchers from 2007 he would rank as follows:
Category Rank
ERA #57
PSV #5
So, my hunch was correct in that Jeff Suppan had a very consistent season for the Royals in 2001, even though his ERA wasn't one of the league's best.
CONCLUSIONS
A pitcher's consistency can be measured better with PSV than with Quality Start percentage, but PSV barely correlates with the pitcher's winning percentage, and a team’s win/loss record does not appear to suffer after a pitcher's blowup start. I was able to answer my personal question about Jeff Suppan being consistent, but found that it didn't really matter for the team’s record. For future research, I plan to expand the analysis to more years’ worth of data to see if these conclusions hold true. Any feedback would be greatly appreciated.
2 comments:
wouldnt pqs be a better consistency check? the runs a pitcher allows in a single game can be pretty random.
by the way, i like your work here.
I don't like PQS because it is all-or-nothing: a start either is a PQS or it isn't. Going 9 innings and giving up 4 runs is not a QS, but going 6 innings and giving up 3 is a QS, even though the first is a better start. I was looking for a way to measure each start.