So here’s the thing. I get lots of ideas in my head, and most of them I either discard or start working on, then give up halfway through and move on. The SONAR project, something I started years ago, is somewhere in category 2: I have a lot of ideas for it, but it’s incredibly in depth, it takes a ton of time, and it’s imperfect. Until I figure out how to remove most of the imperfections, it will remain dead. But in the meantime, I had an idea that kind of comes out of that semi-failed project. I was thinking it would be fun to have a way to take a quick and dirty look at how our prospects are performing, but without doing a ton of heavy lifting and adjusting. I figured once I built the spreadsheet, I could do a lot of copy/paste work, and it would basically update itself. And I think I was right. Which means I should be able to maintain this with little effort, which means I won’t get bored with it and discard it. Maybe. Anyway, I figured I’d unveil it here, and we’ll see how it goes. Full explanation below the fold, as well as the inaugural rankings.
I figure the best way to introduce this is to answer a few basic W questions.
What: What is this? It’s just a toy. It’s not meant to be statistically significant, because it just takes raw data and runs it through a formula. I came up with the weightings, which I’ll get to in a minute, and they’re based on how I evaluate prospects. The whole formula will be outlined; I’m not hiding anything or trying to rig the system to produce results. You shouldn’t take the results as some sort of written-in-stone reading on a prospect. That’s why it’s called a toy. It’s fun.
Why: Well, why not? I think that’s easy enough.
When: I plan to update this every 2 weeks. Hopefully. Remind me if I forget.
Where: On this site, silly. I’m going to create a page to keep all of the results. It will be at the top. It’s not there now, so don’t panic.
So I don’t lose most of you with the math (even though it’s basic, because I suck at math), I’m going to outline this in really simple terms.
The actual score
The score is represented as a percentage. The higher the percentage, the higher the score. It will make sense when you see it.
The study population
I made very simple cutoffs in terms of who to include. The following minor leaguers were omitted from the results:
3A: Age 27 and older players
2A: Age 26 and older players
A+: Age 25 and older players
A: Age 24 and older players
If you’re still at Lakewood and you’re 24, your prospect ship has probably sailed. Probably.
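For anyone who would rather see the cutoffs as code, here is a minimal Python sketch. The names `CUTOFF_AGE` and `is_eligible` are made up for illustration; the real thing lives in a spreadsheet.

```python
# Age at which a player is dropped from the study, per level (from the list above).
CUTOFF_AGE = {"3A": 27, "2A": 26, "A+": 25, "A": 24}


def is_eligible(level: str, age: int) -> bool:
    """Return True if the player is young enough for his level to be included."""
    return age < CUTOFF_AGE[level]


print(is_eligible("A", 24))   # a 24 year old in A ball is left out
print(is_eligible("2A", 25))  # a 25 year old in 2A still counts
```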
As I mentioned in the intro, the point is for this to be quick and dirty. For hitters, I chose 2 stats that are very easy to calculate:
Secondary Average – This stat does a lovely job of incorporating a player’s power, his ability to draw walks, and his stolen base capability.
Strikeout Rate – Or, better expressed, contact rate. Guys who rack up huge strikeout numbers against bad pitching in the low minors are generally going to struggle to make contact as they climb to tougher levels. This is important. It’s calculated by dividing strikeouts by plate appearances.
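Here is a rough Python sketch of the two hitter stats. The secondary average formula is the usual Bill James definition, (TB - H + BB + SB - CS) / AB; the post doesn’t say whether the spreadsheet uses exactly that version, so treat it as an assumption. The function names and the stat line are made up.

```python
def secondary_average(ab, h, tb, bb, sb, cs):
    """Standard secondary average: extra bases, walks, and net steals per at-bat.

    Assumption: this is the common Bill James definition, which may differ
    from what the spreadsheet actually uses.
    """
    return (tb - h + bb + sb - cs) / ab


def strikeout_rate(so, pa):
    """Strikeouts divided by plate appearances, as described above."""
    return so / pa


# Made-up stat line, for illustration only.
print(round(secondary_average(ab=180, h=50, tb=85, bb=20, sb=6, cs=2), 3))
print(round(strikeout_rate(so=40, pa=205), 3))
```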
For pitchers, I used 3 metrics:
Strikeout Rate – The ability to miss bats is hugely important for a pitcher.
Walk Rate – Wildness may scare hitters, but they’ll just take their base. Walking guys isn’t great.
Home Run Rate – Pitchers have some control over this, less than the first two, but it is an important indicator.
These are the “three true outcomes” for a pitcher. They will be weighted differently; I’ll get to that in a minute.
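The three pitcher rates are all “per 9 innings” numbers (K/9, BB/9, HR/9). A minimal Python sketch follows, with made-up helper names. One thing to watch: box-score innings such as 45.1 mean 45 and one third, not 45.1, so the sketch converts that notation first.

```python
def innings_to_float(ip_text):
    """Convert box-score innings ("45.1" = 45 and 1/3, "45.2" = 45 and 2/3)."""
    whole, _, outs = ip_text.partition(".")
    return int(whole) + (int(outs) if outs else 0) / 3


def per_nine(events, innings):
    """Rate per nine innings: works for strikeouts, walks, and home runs."""
    return 9 * events / innings


# Made-up pitching line, for illustration only.
ip = innings_to_float("45.1")
print(round(per_nine(50, ip), 2))  # K/9
print(round(per_nine(15, ip), 2))  # BB/9
print(round(per_nine(3, ip), 2))   # HR/9
```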
Hitters and pitchers also have two more adjustments made to their performance, and they are:
Playing time – If you just adjust a player’s performance relative to league average (which I’m doing, more on that in a minute) and don’t consider the sample size, you’ll get some really wild results.
Age – I harp on this constantly, but it’s here as well. A .300/.400/.500 line in A ball means something completely different if an 18 year old does it, compared to, say, a 23 year old. Context, baby, it matters!
“The Weight” is a great song by The Band, but in this case, I’m talking about the weights given to each statistic, and the adjustments made for playing time and age. Here they are:
For hitters, SecA and K%:
SecA = 75%
K% = 25%
Simply put, SecA factors in power, walks, and speed, while K% factors in just contact skills. Makes sense, I think. Again, quick and easy.
For pitchers, K/9, BB/9, HR/9:
K/9 = 50%
BB/9 = 35%
HR/9 = 15%
Strikeouts are really important. If you can’t miss bats in A ball, the odds of you missing bats as you climb the ladder aren’t great. A guy with sub-par control but great swing and miss stuff might be able to learn some control, but a guy with only fringy stuff, even if he’s a strike thrower, is gonna struggle against better hitters. HR rate is the weakest of the indicators, because there is the normal batted ball luck, but generally, if you’re giving up a lot of home runs, either your stuff just isn’t great, or more likely, your command isn’t good and you’re leaving pitches in the fat part of the plate. Again, it’s less reliable as an indicator, but it still has at least some analytical value, more so than hits/9. And it’s easy to calculate. Again, quick and dirty.
Age adjustments are tough, but this is the scale I use to weigh a player’s score:
3A:
Age 21-22 = 1.5
Age 23-24 = 1.0
Age 25-26 = 0.5
2A:
Age 20-21 = 1.5
Age 22-23 = 1.0
Age 24-25 = 0.5
A+:
Age 19-20 = 1.5
Age 21-22 = 1.0
Age 23-24 = 0.5
A:
Age 18-19 = 1.5
Age 20-21 = 1.0
Age 22-23 = 0.5
I hope this makes sense. The “average” legit prospect age in A ball is 20-21, so that is the baseline, at 1.0. If you are younger, you get a bump (1.5), and if you are older, you get docked (0.5). If a player has a “negative” score, his age factor is flipped. For example, if a player is 22 in A ball with a positive raw score of 10, his age adjusted score is 5 (10 * 0.5 = 5). On the flip side, if a 22 year old player has a raw score of -10, his adjusted score is -15 (-10 * 1.5 = -15). Your age either helps you or hurts you. If you are old for the level, it hurts you either way. If you are young for the level, it helps you either way. It makes sense to me, because I’ve been working on versions of this formula for a long time.
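The flip is easier to see in code. Here is a minimal Python sketch using only the A ball bands from the scale above. Writing the flip as `2.0 - factor` is one convenient way to swap 1.5 and 0.5 while leaving 1.0 alone; it is not necessarily how the spreadsheet does it, and the names are made up.

```python
# A ball age factors from the scale above (20-21 is the 1.0 baseline).
A_BALL_FACTORS = {18: 1.5, 19: 1.5, 20: 1.0, 21: 1.0, 22: 0.5, 23: 0.5}


def age_adjust(raw_score, age, factors=A_BALL_FACTORS):
    """Scale a raw score by the age factor, flipping the factor for negative scores."""
    factor = factors[age]
    if raw_score < 0:
        # Flip so age still helps the young and hurts the old: 1.5 <-> 0.5.
        factor = 2.0 - factor
    return raw_score * factor


print(age_adjust(10, 22))   # 5.0, the positive example from the text
print(age_adjust(-10, 22))  # -15.0, the negative example from the text
print(age_adjust(-10, 18))  # -5.0, being young softens a bad score
```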
Playing time is trickier. To adjust for position players, I took the total number of games the team has played, and multiplied it by 3, 3 being the assumed number of plate appearances per game. So, if a team has played 50 games, I multiply by 3 and get 150 PA, which would be the baseline. If you have fewer than 150 PA, your weighted score is reduced. If you have more than 150 PA, your weighted score gets a boost. For pitchers, I assumed a baseline of 120 innings pitched in a minor league season. That’s essentially 24 innings per month, for 5 months. If a starter begins the year in the rotation and stays there all year, he’ll surpass that. If not, he probably won’t. But you have to “punish” relievers, because they pitch fewer innings, and their ratios are subject to more noise than a starter’s, and well, I value relievers less than starters as a general philosophy. Again, quick and dirty, not really scientifically significant, but it’s a toy!
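Here is a rough Python sketch of the hitter side of the playing time adjustment. The 3 PA per game baseline comes straight from the paragraph above, but the straight ratio of actual PA to baseline PA is an assumption; the exact shape of the curve isn’t spelled out. For pitchers, how the 120 inning baseline is applied partway through a season isn’t stated either, so the baseline is simply left as a parameter. All names are made up.

```python
PA_PER_GAME = 3  # assumed plate appearances per team game, from the text


def hitter_baseline_pa(team_games):
    """Baseline plate appearances for a hitter, given games his team has played."""
    return team_games * PA_PER_GAME


def playing_time_factor(actual, baseline):
    """Assumed linear ratio: 1.0 means the player sits exactly on the baseline."""
    return actual / baseline


baseline = hitter_baseline_pa(50)
print(baseline)                                      # 150
print(round(playing_time_factor(120, baseline), 2))  # below baseline, score shrinks
print(round(playing_time_factor(180, baseline), 2))  # above baseline, score grows
```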
Edit: I forgot the formula! Each category (SecA, K%, K/9, BB/9, HR/9) is calculated relative to the league average. So, if the player is in AAA, it’s his numbers against the league average for all players in the International League. The same applies to all leagues. The hitter’s performance, after being adjusted relative to the league average, is adjusted based on age, and then finally the playing time adjustment is made. The playing time adjustment is based on a 100% scale. If the baseline is 150 PA, and the hitter has exactly 150 PA, then his playing time adjustment is 100%. If the player has more than 150 PA, his PT adjustment will be greater than 100%; if he has fewer, it will be less than 100%. This is to keep it a positive number. For pitchers, the weighted sub-total against the league average is likewise multiplied by the playing time and age adjustments.
The result, for both pitchers and hitters, is a score that is scalable and comparable. 0% represents a perfectly league average performance, adjusted for age and playing time. A positive percent represents above league average performance, adjusted for age and playing time. A negative percentage represents a below league average performance, adjusted for age and playing time. Because of the weights (adding up to 100%) for both pitchers and hitters, the scores are comparable. So 30% for a hitter is the same as 30% for a pitcher.
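Putting the pieces together, here is a hedged Python sketch of how the whole score could be computed. The weights are the ones listed earlier. Everything else is an assumption made for illustration: “relative to league average” is taken to mean the percent difference from the league number, stats where lower is better (K% for hitters, BB/9 and HR/9 for pitchers) have their sign flipped, and the age and playing time factors are passed in as plain multipliers. The names and sample numbers are made up.

```python
HITTER_WEIGHTS = {"seca": 0.75, "k_pct": 0.25}
PITCHER_WEIGHTS = {"k9": 0.50, "bb9": 0.35, "hr9": 0.15}
LOWER_IS_BETTER = {"k_pct", "bb9", "hr9"}


def relative_to_league(stat, player_value, league_value):
    """Assumed: percent difference from league average, signed so that good is positive."""
    diff = (player_value - league_value) / league_value
    return -diff if stat in LOWER_IS_BETTER else diff


def score(player, league, weights, age_factor, pt_factor):
    """Weighted league-relative subtotal, then age and playing time, as a percentage."""
    subtotal = sum(
        weight * relative_to_league(stat, player[stat], league[stat])
        for stat, weight in weights.items()
    )
    if subtotal < 0:
        age_factor = 2.0 - age_factor  # flip 1.5 <-> 0.5 for negative scores
    return 100 * subtotal * age_factor * pt_factor


# Made-up numbers: a hitter 20% better than league in SecA, league average in K%.
league = {"seca": 0.25, "k_pct": 0.20}
hitter = {"seca": 0.30, "k_pct": 0.20}
print(round(score(hitter, league, HITTER_WEIGHTS, age_factor=1.0, pt_factor=1.0), 1))
```

A perfectly league average player comes out at 0 in this sketch, which matches the 0% description above.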
What is missing?
Well, a lot, but for the 10th time, it’s a toy!
For position players, defense is not counted. For pitchers, nothing outside of the 3 true outcomes is counted. Also, none of the numbers are park adjusted, because minor league park factors are really wonky, and to be honest, I don’t trust them, even the 3 year weighted numbers. And that would take a lot more time to compile. Lakewood is a pitcher friendly park and Reading is a hitter friendly park, so take Lakewood’s pitchers and Reading’s hitters with a pinch of salt, and conversely, give Lakewood’s hitters and Reading’s pitchers a little extra credit in your head. Since all players are evaluated based on their league averages, the performances don’t have to be further adjusted to take league strength into account.
The initial standings!
I’m going to post these as image files, because Google Documents is kind of a pain, and it’s easier to just post the image so you don’t have to click a bunch of links. As I mentioned, I’m going to incorporate this into the site at the top, but I haven’t done it yet. There are a number of things I want to do with regard to the site layout/navigation, but it’s gonna take some time. I’m going to give you the full chart of hitters, the full chart of pitchers, and then the consolidated rankings with just their PPR score. You can’t see the formulas, because it’s just Excel values, but I explained all of the weights above. Players highlighted in YELLOW have played at multiple levels, so I had to create a separate sheet to calculate their totals, which is why their weights are blacked out. I’m not hiding them, I just have to calculate them on a separate worksheet. All stats are current as of May 30th and taken from baseball-reference.com. It’s late, so if something looks wrong, tell me so I can correct it later. Discuss/Enjoy!
(click the images if they are too small, they should open in a new tab!)
This is not a re-ranking of my top 30 and isn’t meant to be. This is simply evaluating the statistical performance of the player, in the context of his league, with considerations given to his age and his playing time sample size. That’s it. It’s a toy. It’s fun. Don’t get carried away!
Edit 1 —> I knew something didn’t look right. I forgot to add one of the cells in the formula. Fixed!